diff mbox

Possible bug in ide_cd_queue_pc() or ide_wait_stat() ?

Message ID 4C500A82.2040805@citrix.com
State RFC
Headers show

Commit Message

Dominic Curran July 28, 2010, 10:46 a.m. UTC
I have some TEAC DV-28E-V CDROM drives that after polling for status 
starts to timeout requests (the time it takes to do this varies between 
5mins and 24hrs).

I believe only an APATPI reset gets them out of this timeout behaviour.


I assume this is down to bad firmware/hardware (the latest firmware has 
been applied).

I happens on multiply drives of this version.


BUT...

The problem is that even when the requests timeout the ioctl 
CDROM_DRIVE_STATUS will receive a status of CDS_DISC_OK.

This is obviously not the correct status (particularly when there is no 
CD disc in the drive).

 

I have added some instrumentation to the code and this is the general 
stack flow I see:

 

In the interrupt handler:
do_ide_request()
  ide_do_request()
    start_request()                   returns ide_stopped
      ide_wait_stat()                 returns -EBUSY
        ide_error()                   returns ide_stopped  +  stat=0xD0 {Busy}
          ide_dump_status()           returns 0 (so err=0)
	  rq->errors = 1;
	  ide_end_drive_cmd(err)      b/c err is 0, then sets rq->errors=0


In the ioctl:
ide_cdrom_drive_status() Returns CDS_DISC_OK
cdrom_check_status()     Returns 0.
  ide_cd_queue_pc()      Ignores the return from blk_execute_rq(). Checks for the flag REQ_FAILED in rq->cmd_flags (which is not set). Returns 0.
    blk_execute_rq()     Since rq->errors==1 and then returns 0
      wait_for_completion()

The problem seems to be that in ide_cd_queue_pc():
1) the error return from blk_execute_rq() is ignored
2) the REQ_FAILED in rq->cmd_flags is not set in the interrupt handler (which is what ide_cd_queue_pc() seems to be concerned about)

Anyone have any comments:
1) why in ide_dump_status() a stat of BUSY_STAT does not translate to an error return ?

2) Is setting rq->cmd_flags |= REQ_FAILED in ide_wait_stat() an acceptable way to solve issue. 
   Patch something like this...




I appreciate any pointers you can give
Thanks
dom
 

--
To unsubscribe from this list: send the line "unsubscribe linux-ide" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Comments

Borislav Petkov July 28, 2010, 11:51 a.m. UTC | #1
From: Dominic Curran <dominic.curran@citrix.com>
Date: Wed, Jul 28, 2010 at 11:46:26AM +0100

Hi Dominik,

> I have some TEAC DV-28E-V CDROM drives that after polling for status
> starts to timeout requests (the time it takes to do this varies
> between 5mins and 24hrs).

Which kernel is that? Can we see a complete dmesg? Which tool does the
polling (udev... )?

> The problem is that even when the requests timeout the ioctl
> CDROM_DRIVE_STATUS will receive a status of CDS_DISC_OK.

Can you add some more instrumentation code to ide_cdrom_drive_status()
to see which "return CDS_DISC_OK" is being hit since there are a couple
of them in that function.

Thanks.
Dominic Curran July 28, 2010, 3:23 p.m. UTC | #2
Borislav Petkov wrote:
>> I have some TEAC DV-28E-V CDROM drives that after polling for status
>> starts to timeout requests (the time it takes to do this varies
>> between 5mins and 24hrs).
>>     
>
> Which kernel is that? Can we see a complete dmesg? Which tool does the
> polling (udev... )?
>
>   
We have cooked our own usermode tool (we dont use udev).
Tool is very simple:
1) polls cdrom drive using ioctl CDROM_DRIVE_STATUS
2) Looks for a CDS_DISC_OK return
3) Once it sees that it tries to mount CD

The problem is that for this specific drive it starts timing out STATUS 
requests, and then returns CDS_DISC_OK.
This cause the tools to try to mount (non existent disc), which then 
eats 100% cpu time.

So even if the drive goes bad, I'd like to get back a error status from 
the ioctl rather than CDS_DISC_OK.

Kernel is 2.6.27.  Based on CentOS I believe.
>> The problem is that even when the requests timeout the ioctl
>> CDROM_DRIVE_STATUS will receive a status of CDS_DISC_OK.
>>     
>
> Can you add some more instrumentation code to ide_cdrom_drive_status()
> to see which "return CDS_DISC_OK" is being hit since there are a couple
> of them in that function.
>
>   
In ide_cdrom_drive_status() the return happens in the following if:

    stat = cdrom_check_status(drive, &sense);
printk("ide_cdrom_drive_status(stat=%i sense_key=%u)\n",stat, 
sense.sense_key);
    if (!stat || sense.sense_key == UNIT_ATTENTION) {
        printk("ide_cdrom_drive_status(#1) ret=CDS_DICS_OK)\n");
        return CDS_DISC_OK;
    }



As for dmesg, it is filled with my own instrumentation:
This is a ioctl that worked correctly....

Jul 27 16:48:37 xsdubianh-srdeaor131 kernel: ======= cdrom_ioctl() =======
Jul 27 16:48:37 xsdubianh-srdeaor131 kernel: cdrom: entering 
CDROM_DRIVE_STATUS
Jul 27 16:48:37 xsdubianh-srdeaor131 kernel: ide_cdrom_drive_status()
Jul 27 16:48:37 xsdubianh-srdeaor131 kernel: ENTER: cdrom_check_status()
Jul 27 16:48:37 xsdubianh-srdeaor131 kernel: 
ide_cd_queue_pc(timeout=0,cf=0x00000000)
Jul 27 16:48:37 xsdubianh-srdeaor131 kernel: About to call 
blk_execute_rq() rq->cmd_flags=0x00082000 retries=10
Jul 27 16:48:37 xsdubianh-srdeaor131 kernel: ide_wait_stat()
Jul 27 16:48:37 xsdubianh-srdeaor131 kernel: 
ide_wait_stat(rq->cmd_flags=00082640)
Jul 27 16:48:37 xsdubianh-srdeaor131 kernel: ide_wait_stat() ret=0
Jul 27 16:48:37 xsdubianh-srdeaor131 kernel: cdrom_transfer_packet_command()
Jul 27 16:48:37 xsdubianh-srdeaor131 kernel: ide_wait_stat()
Jul 27 16:48:37 xsdubianh-srdeaor131 kernel: 
ide_wait_stat(rq->cmd_flags=00082640)
Jul 27 16:48:37 xsdubianh-srdeaor131 kernel: ide_wait_stat() ret=0
Jul 27 16:48:37 xsdubianh-srdeaor131 kernel: 
cdrom_transfer_packet_command(#2)
Jul 27 16:48:37 xsdubianh-srdeaor131 kernel: ide_wait_stat()
Jul 27 16:48:37 xsdubianh-srdeaor131 kernel: 
ide_wait_stat(rq->cmd_flags=00010440)
Jul 27 16:48:37 xsdubianh-srdeaor131 kernel: ide_wait_stat() ret=0
Jul 27 16:48:37 xsdubianh-srdeaor131 kernel: cdrom_transfer_packet_command()
Jul 27 16:48:37 xsdubianh-srdeaor131 kernel: ide_wait_stat()
Jul 27 16:48:37 xsdubianh-srdeaor131 kernel: 
ide_wait_stat(rq->cmd_flags=00010440)
Jul 27 16:48:37 xsdubianh-srdeaor131 kernel: ide_wait_stat() ret=0
Jul 27 16:48:37 xsdubianh-srdeaor131 kernel: 
cdrom_transfer_packet_command(#2)
Jul 27 16:48:37 xsdubianh-srdeaor131 kernel: blk_execute_rq(rq=EDEBDE00 
rq->errors=0 err=0)
Jul 27 16:48:37 xsdubianh-srdeaor131 kernel: blk_execute_rq() error=0 
rq->cmd_flags=0x00086640
Jul 27 16:48:37 xsdubianh-srdeaor131 kernel: ide_cd_queue_pc() Other. 
retries=0
Jul 27 16:48:37 xsdubianh-srdeaor131 kernel: ide_cd_queue_pc() retries=-1
Jul 27 16:48:37 xsdubianh-srdeaor131 kernel: ide_cd_queue_pc(ret=-5)
Jul 27 16:48:37 xsdubianh-srdeaor131 kernel: EXIT: 
cdrom_check_status(ret=-5)
Jul 27 16:48:37 xsdubianh-srdeaor131 kernel: 
ide_cdrom_drive_status(stat=-5 sense_key=2)
Jul 27 16:48:37 xsdubianh-srdeaor131 kernel: 
ide_cd_queue_pc(timeout=700,cf=0x00008000)
Jul 27 16:48:37 xsdubianh-srdeaor131 kernel: About to call 
blk_execute_rq() rq->cmd_flags=0x0008A000 retries=10
Jul 27 16:48:37 xsdubianh-srdeaor131 kernel: ide_wait_stat()
Jul 27 16:48:37 xsdubianh-srdeaor131 kernel: 
ide_wait_stat(rq->cmd_flags=0008A640)
Jul 27 16:48:37 xsdubianh-srdeaor131 kernel: ide_wait_stat() ret=0
Jul 27 16:48:37 xsdubianh-srdeaor131 kernel: cdrom_transfer_packet_command()
Jul 27 16:48:37 xsdubianh-srdeaor131 kernel: ide_wait_stat()
Jul 27 16:48:37 xsdubianh-srdeaor131 kernel: 
ide_wait_stat(rq->cmd_flags=0008A640)
Jul 27 16:48:37 xsdubianh-srdeaor131 kernel: ide_wait_stat() ret=0
Jul 27 16:48:37 xsdubianh-srdeaor131 kernel: 
cdrom_transfer_packet_command(#2)
Jul 27 16:48:37 xsdubianh-srdeaor131 kernel: blk_execute_rq(rq=EDEBDE00 
rq->errors=0 err=0)
Jul 27 16:48:37 xsdubianh-srdeaor131 kernel: blk_execute_rq() error=0 
rq->cmd_flags=0x0008A640
Jul 27 16:48:37 xsdubianh-srdeaor131 kernel: ide_cd_queue_pc(ret=0)
Jul 27 16:48:37 xsdubianh-srdeaor131 kernel: cdrom: 
cdrom_ioctl_drive_status (ret=1)
Jul 27 16:48:37 xsdubianh-srdeaor131 kernel: cdrom: entering cdrom_release
Jul 27 16:48:37 xsdubianh-srdeaor131 kernel: cdrom: Use count for 
"/dev/hda" now zero
Jul 27 16:48:37 xsdubianh-srdeaor131 kernel: cdrom: hda: No DVD+RW
Jul 27 16:48:37 xsdubianh-srdeaor131 kernel: cdrom: Unlocking door!
Jul 27 16:48:37 xsdubianh-srdeaor131 kernel: 
ide_cd_queue_pc(timeout=0,cf=0x00000000)
Jul 27 16:48:37 xsdubianh-srdeaor131 kernel: About to call 
blk_execute_rq() rq->cmd_flags=0x00082000 retries=10
Jul 27 16:48:37 xsdubianh-srdeaor131 kernel: ide_wait_stat()
Jul 27 16:48:37 xsdubianh-srdeaor131 kernel: 
ide_wait_stat(rq->cmd_flags=00082640)
Jul 27 16:48:37 xsdubianh-srdeaor131 kernel: ide_wait_stat() ret=0
Jul 27 16:48:37 xsdubianh-srdeaor131 kernel: cdrom_transfer_packet_command()
Jul 27 16:48:37 xsdubianh-srdeaor131 kernel: ide_wait_stat()
Jul 27 16:48:37 xsdubianh-srdeaor131 kernel: 
ide_wait_stat(rq->cmd_flags=00082640)
Jul 27 16:48:37 xsdubianh-srdeaor131 kernel: ide_wait_stat() ret=0
Jul 27 16:48:37 xsdubianh-srdeaor131 kernel: 
cdrom_transfer_packet_command(#2)
Jul 27 16:48:37 xsdubianh-srdeaor131 kernel: blk_execute_rq(rq=EDEBDE00 
rq->errors=0 err=0)
Jul 27 16:48:37 xsdubianh-srdeaor131 kernel: blk_execute_rq() error=0 
rq->cmd_flags=0x00082640
Jul 27 16:48:37 xsdubianh-srdeaor131 kernel: ide_cd_queue_pc(ret=0)
Jul 27 16:48:39 xsdubianh-srdeaor131 kernel: cdrom: entering cdrom_open
Jul 27 16:48:39 xsdubianh-srdeaor131 kernel: cdrom: Use count for 
"/dev/hda" now 1
Jul 27 16:48:39 xsdubianh-srdeaor131 kernel: ENTER: cdrom_check_status()
Jul 27 16:48:39 xsdubianh-srdeaor131 kernel: 
ide_cd_queue_pc(timeout=0,cf=0x00000000)
Jul 27 16:48:39 xsdubianh-srdeaor131 kernel: About to call 
blk_execute_rq() rq->cmd_flags=0x00082000 retries=10
Jul 27 16:48:39 xsdubianh-srdeaor131 kernel: ide_wait_stat()
Jul 27 16:48:39 xsdubianh-srdeaor131 kernel: 
ide_wait_stat(rq->cmd_flags=00082640)
Jul 27 16:48:39 xsdubianh-srdeaor131 kernel: ide_wait_stat() ret=0
Jul 27 16:48:39 xsdubianh-srdeaor131 kernel: cdrom_transfer_packet_command()
Jul 27 16:48:39 xsdubianh-srdeaor131 kernel: ide_wait_stat()
Jul 27 16:48:39 xsdubianh-srdeaor131 kernel: 
ide_wait_stat(rq->cmd_flags=00082640)
Jul 27 16:48:39 xsdubianh-srdeaor131 kernel: ide_wait_stat() ret=0
Jul 27 16:48:39 xsdubianh-srdeaor131 kernel: 
cdrom_transfer_packet_command(#2)
Jul 27 16:48:39 xsdubianh-srdeaor131 kernel: ide_wait_stat()
Jul 27 16:48:39 xsdubianh-srdeaor131 kernel: 
ide_wait_stat(rq->cmd_flags=00010440)
Jul 27 16:48:39 xsdubianh-srdeaor131 kernel: ide_wait_stat() ret=0
Jul 27 16:48:39 xsdubianh-srdeaor131 kernel: cdrom_transfer_packet_command()
Jul 27 16:48:39 xsdubianh-srdeaor131 kernel: ide_wait_stat()
Jul 27 16:48:39 xsdubianh-srdeaor131 kernel: 
ide_wait_stat(rq->cmd_flags=00010440)
Jul 27 16:48:39 xsdubianh-srdeaor131 kernel: ide_wait_stat() ret=0
Jul 27 16:48:39 xsdubianh-srdeaor131 kernel: 
cdrom_transfer_packet_command(#2)
Jul 27 16:48:39 xsdubianh-srdeaor131 kernel: blk_execute_rq(rq=EDEBDE00 
rq->errors=0 err=0)
Jul 27 16:48:39 xsdubianh-srdeaor131 kernel: blk_execute_rq() error=0 
rq->cmd_flags=0x00086640
Jul 27 16:48:39 xsdubianh-srdeaor131 kernel: ide_cd_queue_pc() Other. 
retries=0
Jul 27 16:48:39 xsdubianh-srdeaor131 kernel: ide_cd_queue_pc() retries=-1
Jul 27 16:48:39 xsdubianh-srdeaor131 kernel: ide_cd_queue_pc(ret=-5)
Jul 27 16:48:39 xsdubianh-srdeaor131 kernel: EXIT: 
cdrom_check_status(ret=-5)
Jul 27 16:48:39 xsdubianh-srdeaor131 kernel: ENTER: cdrom_check_status()
Jul 27 16:48:39 xsdubianh-srdeaor131 kernel: 
ide_cd_queue_pc(timeout=0,cf=0x00000000)
Jul 27 16:48:39 xsdubianh-srdeaor131 kernel: About to call 
blk_execute_rq() rq->cmd_flags=0x00082000 retries=10
Jul 27 16:48:39 xsdubianh-srdeaor131 kernel: ide_wait_stat()
Jul 27 16:48:39 xsdubianh-srdeaor131 kernel: 
ide_wait_stat(rq->cmd_flags=00082640)
Jul 27 16:48:39 xsdubianh-srdeaor131 kernel: ide_wait_stat() ret=0
Jul 27 16:48:39 xsdubianh-srdeaor131 kernel: cdrom_transfer_packet_command()
Jul 27 16:48:39 xsdubianh-srdeaor131 kernel: ide_wait_stat()
Jul 27 16:48:39 xsdubianh-srdeaor131 kernel: 
ide_wait_stat(rq->cmd_flags=00082640)
Jul 27 16:48:39 xsdubianh-srdeaor131 kernel: ide_wait_stat() ret=0
Jul 27 16:48:39 xsdubianh-srdeaor131 kernel: 
cdrom_transfer_packet_command(#2)
Jul 27 16:48:39 xsdubianh-srdeaor131 kernel: ide_wait_stat()
Jul 27 16:48:39 xsdubianh-srdeaor131 kernel: 
ide_wait_stat(rq->cmd_flags=00010440)
Jul 27 16:48:39 xsdubianh-srdeaor131 kernel: ide_wait_stat() ret=0
Jul 27 16:48:39 xsdubianh-srdeaor131 kernel: cdrom_transfer_packet_command()
Jul 27 16:48:39 xsdubianh-srdeaor131 kernel: ide_wait_stat()
Jul 27 16:48:39 xsdubianh-srdeaor131 kernel: 
ide_wait_stat(rq->cmd_flags=00010440)
Jul 27 16:48:39 xsdubianh-srdeaor131 kernel: ide_wait_stat() ret=0
Jul 27 16:48:39 xsdubianh-srdeaor131 kernel: 
cdrom_transfer_packet_command(#2)
Jul 27 16:48:39 xsdubianh-srdeaor131 kernel: blk_execute_rq(rq=EDEBDE00 
rq->errors=0 err=0)
Jul 27 16:48:39 xsdubianh-srdeaor131 kernel: blk_execute_rq() error=0 
rq->cmd_flags=0x00086640
Jul 27 16:48:39 xsdubianh-srdeaor131 kernel: ide_cd_queue_pc() Other. 
retries=0
Jul 27 16:48:39 xsdubianh-srdeaor131 kernel: ide_cd_queue_pc() retries=-1
Jul 27 16:48:39 xsdubianh-srdeaor131 kernel: ide_cd_queue_pc(ret=-5)
Jul 27 16:48:39 xsdubianh-srdeaor131 kernel: EXIT: 
cdrom_check_status(ret=-5)
Jul 27 16:48:39 xsdubianh-srdeaor131 kernel: 
ide_cd_queue_pc(timeout=0,cf=0x00008000)
Jul 27 16:48:39 xsdubianh-srdeaor131 kernel: About to call 
blk_execute_rq() rq->cmd_flags=0x0008A000 retries=10
Jul 27 16:48:39 xsdubianh-srdeaor131 kernel: ide_wait_stat()
Jul 27 16:48:39 xsdubianh-srdeaor131 kernel: 
ide_wait_stat(rq->cmd_flags=0008A640)
Jul 27 16:48:39 xsdubianh-srdeaor131 kernel: ide_wait_stat() ret=0
Jul 27 16:48:39 xsdubianh-srdeaor131 kernel: cdrom_transfer_packet_command()
Jul 27 16:48:39 xsdubianh-srdeaor131 kernel: ide_wait_stat()
Jul 27 16:48:39 xsdubianh-srdeaor131 kernel: 
ide_wait_stat(rq->cmd_flags=0008A640)
Jul 27 16:48:39 xsdubianh-srdeaor131 kernel: ide_wait_stat() ret=0
Jul 27 16:48:39 xsdubianh-srdeaor131 kernel: 
cdrom_transfer_packet_command(#2)
Jul 27 16:48:39 xsdubianh-srdeaor131 kernel: ide_wait_stat()
Jul 27 16:48:39 xsdubianh-srdeaor131 kernel: 
ide_wait_stat(rq->cmd_flags=00010440)
Jul 27 16:48:39 xsdubianh-srdeaor131 kernel: ide_wait_stat() ret=0
Jul 27 16:48:39 xsdubianh-srdeaor131 kernel: cdrom_transfer_packet_command()
Jul 27 16:48:39 xsdubianh-srdeaor131 kernel: ide_wait_stat()
Jul 27 16:48:39 xsdubianh-srdeaor131 kernel: 
ide_wait_stat(rq->cmd_flags=00010440)
Jul 27 16:48:39 xsdubianh-srdeaor131 kernel: ide_wait_stat() ret=0
Jul 27 16:48:39 xsdubianh-srdeaor131 kernel: 
cdrom_transfer_packet_command(#2)
Jul 27 16:48:39 xsdubianh-srdeaor131 kernel: blk_execute_rq(rq=EDEBDE00 
rq->errors=0 err=0)
Jul 27 16:48:39 xsdubianh-srdeaor131 kernel: blk_execute_rq() error=0 
rq->cmd_flags=0x0008E640
Jul 27 16:48:39 xsdubianh-srdeaor131 kernel: ide_cd_queue_pc() Other. 
retries=0
Jul 27 16:48:39 xsdubianh-srdeaor131 kernel: ide_cd_queue_pc() retries=-1
Jul 27 16:48:39 xsdubianh-srdeaor131 kernel: ide_cd_queue_pc(ret=-5)
Jul 27 16:48:39 xsdubianh-srdeaor131 kernel: 
ide_cd_queue_pc(timeout=0,cf=0x00008000)
Jul 27 16:48:39 xsdubianh-srdeaor131 kernel: About to call 
blk_execute_rq() rq->cmd_flags=0x0008A000 retries=10
Jul 27 16:48:39 xsdubianh-srdeaor131 kernel: ide_wait_stat()
Jul 27 16:48:39 xsdubianh-srdeaor131 kernel: 
ide_wait_stat(rq->cmd_flags=0008A640)
Jul 27 16:48:39 xsdubianh-srdeaor131 kernel: ide_wait_stat() ret=0
Jul 27 16:48:39 xsdubianh-srdeaor131 kernel: cdrom_transfer_packet_command()
Jul 27 16:48:39 xsdubianh-srdeaor131 kernel: ide_wait_stat()
Jul 27 16:48:39 xsdubianh-srdeaor131 kernel: 
ide_wait_stat(rq->cmd_flags=0008A640)
Jul 27 16:48:39 xsdubianh-srdeaor131 kernel: ide_wait_stat() ret=0
Jul 27 16:48:39 xsdubianh-srdeaor131 kernel: 
cdrom_transfer_packet_command(#2)
Jul 27 16:48:39 xsdubianh-srdeaor131 kernel: ide_wait_stat()
Jul 27 16:48:39 xsdubianh-srdeaor131 kernel: 
ide_wait_stat(rq->cmd_flags=00010440)
Jul 27 16:48:39 xsdubianh-srdeaor131 kernel: ide_wait_stat() ret=0
Jul 27 16:48:39 xsdubianh-srdeaor131 kernel: cdrom_transfer_packet_command()
Jul 27 16:48:39 xsdubianh-srdeaor131 kernel: ide_wait_stat()
Jul 27 16:48:39 xsdubianh-srdeaor131 kernel: 
ide_wait_stat(rq->cmd_flags=00010440)
Jul 27 16:48:39 xsdubianh-srdeaor131 kernel: ide_wait_stat() ret=0
Jul 27 16:48:39 xsdubianh-srdeaor131 kernel: 
cdrom_transfer_packet_command(#2)
Jul 27 16:48:39 xsdubianh-srdeaor131 kernel: blk_execute_rq(rq=EDEBDE00 
rq->errors=0 err=0)
Jul 27 16:48:39 xsdubianh-srdeaor131 kernel: blk_execute_rq() error=0 
rq->cmd_flags=0x0008E640
Jul 27 16:48:39 xsdubianh-srdeaor131 kernel: ide_cd_queue_pc() Other. 
retries=0
Jul 27 16:48:39 xsdubianh-srdeaor131 kernel: ide_cd_queue_pc() retries=-1
Jul 27 16:48:39 xsdubianh-srdeaor131 kernel: ide_cd_queue_pc(ret=-5)


Note the line early on that reads...

EXIT: cdrom_check_status(ret=-5)
ide_cdrom_drive_status(stat=-5 sense_key=2)


A sense_key=2 means NOT_READY. Thats good.

Here is the spew when requests have started to timeout...

Jul 27 16:51:27 xsdubianh-srdeaor131 kernel: ======= cdrom_ioctl() =======
Jul 27 16:51:27 xsdubianh-srdeaor131 kernel: cdrom: entering 
CDROM_DRIVE_STATUS
Jul 27 16:51:27 xsdubianh-srdeaor131 kernel: ide_cdrom_drive_status()
Jul 27 16:51:27 xsdubianh-srdeaor131 kernel: ENTER: cdrom_check_status()
Jul 27 16:51:27 xsdubianh-srdeaor131 kernel: 
ide_cd_queue_pc(timeout=0,cf=0x00000000)
Jul 27 16:51:27 xsdubianh-srdeaor131 kernel: About to call 
blk_execute_rq() rq->cmd_flags=0x00082000 retries=10
Jul 27 16:51:27 xsdubianh-srdeaor131 kernel: ide_wait_stat()
Jul 27 16:51:27 xsdubianh-srdeaor131 kernel: 
ide_wait_stat(rq->cmd_flags=00082640)
Jul 27 16:51:27 xsdubianh-srdeaor131 kernel: ide_error(stat=0xD0)
Jul 27 16:51:27 xsdubianh-srdeaor131 kernel: hda: status timeout: 
status=0xd0 { Busy }
Jul 27 16:51:27 xsdubianh-srdeaor131 kernel: ide: failed opcode was: unknown
Jul 27 16:51:27 xsdubianh-srdeaor131 kernel: ide_error(#2)[rq=EDEBD380 
errors=0 cmd_flags=00082640]
Jul 27 16:51:27 xsdubianh-srdeaor131 kernel: ide_wait_stat() ret=-16
Jul 27 16:51:27 xsdubianh-srdeaor131 kernel: hda: drive not ready for 
command
Jul 27 16:51:32 xsdubianh-srdeaor131 kernel: blk_execute_rq(rq=EDEBD380 
rq->errors=0 err=0)
Jul 27 16:51:32 xsdubianh-srdeaor131 kernel: blk_execute_rq() error=0 
rq->cmd_flags=0x00082640
Jul 27 16:51:32 xsdubianh-srdeaor131 kernel: ide_cd_queue_pc(ret=0)
Jul 27 16:51:32 xsdubianh-srdeaor131 kernel: EXIT: cdrom_check_status(ret=0)
Jul 27 16:51:32 xsdubianh-srdeaor131 kernel: 
ide_cdrom_drive_status(stat=0 sense_key=0)
Jul 27 16:51:32 xsdubianh-srdeaor131 kernel: ide_cdrom_drive_status(#1) 
ret=CDS_DICS_OK)
Jul 27 16:51:32 xsdubianh-srdeaor131 kernel: cdrom: 
cdrom_ioctl_drive_status (ret=4)

Note that CDS_DISC_OK is being returned.
Of course this is my own instrumentation so I doubt it will mean that 
much to you.


--
To unsubscribe from this list: send the line "unsubscribe linux-ide" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Borislav Petkov July 29, 2010, 8:12 a.m. UTC | #3
From: Dominic Curran <dominic.curran@citrix.com>
Date: Wed, Jul 28, 2010 at 04:23:13PM +0100

> Borislav Petkov wrote:
> >>I have some TEAC DV-28E-V CDROM drives that after polling for status
> >>starts to timeout requests (the time it takes to do this varies
> >>between 5mins and 24hrs).
> >
> >Which kernel is that? Can we see a complete dmesg? Which tool does the
> >polling (udev... )?
> >
> We have cooked our own usermode tool (we dont use udev).
> Tool is very simple:
> 1) polls cdrom drive using ioctl CDROM_DRIVE_STATUS
> 2) Looks for a CDS_DISC_OK return
> 3) Once it sees that it tries to mount CD
> 
> The problem is that for this specific drive it starts timing out
> STATUS requests, and then returns CDS_DISC_OK.
> This cause the tools to try to mount (non existent disc), which then
> eats 100% cpu time.
> 
> So even if the drive goes bad, I'd like to get back a error status
> from the ioctl rather than CDS_DISC_OK.

Right, I think I see the issue after staring at your instrumentation
below and the code, however...

> Kernel is 2.6.27.  Based on CentOS I believe.

this is an awfully old kernel and Bart and I broke^W (cough) changed a
lot (cough) :) in ide since then so I'd like to verify your issue with a
recent kernel first (current 35-rc6 is stable enough, it seems) before
we get on with debugging it further.

Thanks.
diff mbox

Patch

Index: linux/drivers/ide/ide-iops.c                                                                     
===================================================================                                               
--- linux.orig/drivers/ide/ide-iops.c 2010-07-20 15:37:01.871300665 +0100                               
+++ linux/drivers/ide/ide-iops.c      2010-07-28 11:43:21.386752993 +0100                               
@@ -652,6 +657,9 @@  int ide_wait_stat(ide_startstop_t *start
        if (err) {
                char *s = (err == -EBUSY) ? "status timeout" : "status error";
                *startstop = ide_error(drive, s, stat);
+
+               if (err == -EBUSY)
+                       rq->cmd_flags |= REQ_FAILED;
        }

        return err;