Patchwork fix scsi-generic

login
register
mail settings
Submitter adq
Date Aug. 7, 2010, 12:55 a.m.
Message ID <AANLkTinVH89VQzNPB6PRuxnvzkYCtTG_-P1HBkQuTohK@mail.gmail.com>
Download mbox | patch
Permalink /patch/61163/
State New
Headers show

Comments

adq - Aug. 7, 2010, 12:55 a.m.
Hi, I've been tracking down why scsi generic devices (using SG_IO)
don't work any more. After adding debug, I can see that it actually
submits the scsi CDB in hw/scsi-generic.c/execute_command(), but that
the hw/scsi-generic.c/scsi_read_complete() callback is never called.

This is because these are done with ioctls, and the posix async ioctl
code is, I think, broken right now. Some more debugging, led me to
posix-aio-compat.c/posix_aio_process_queue():

            if (acb->async_context_id != async_context_id) {

The async_context_ids don't match, so the request is never handled.
This is because the acb->async_context_id field is not initialised in
posix-aio-compat.c/paio_ioctl() (compare with
posix-aio-compat.c/paio_submit()). The attached patch adds the missing
line in.

This seems to fix the problem. Of course, /now/ I'm getting other
weird problems (as I'm trying to see if I can get slysoft anydvd
working in a KVM winXP vm), but they need further investigation and
likely other fixes.

Please forgive me if I'm mistaken in this, I've only just started
looking at the qemu code.
adq - Aug. 7, 2010, 1:50 a.m.
On 7 August 2010 01:55, adq <adq@lidskialf.net> wrote:
> Hi, I've been tracking down why scsi generic devices (using SG_IO)
> don't work any more. After adding debug, I can see that it actually
> submits the scsi CDB in hw/scsi-generic.c/execute_command(), but that
> the hw/scsi-generic.c/scsi_read_complete() callback is never called.
>
> This is because these are done with ioctls, and the posix async ioctl
> code is, I think, broken right now. Some more debugging, led me to
> posix-aio-compat.c/posix_aio_process_queue():
>
>            if (acb->async_context_id != async_context_id) {
>
> The async_context_ids don't match, so the request is never handled.
> This is because the acb->async_context_id field is not initialised in
> posix-aio-compat.c/paio_ioctl() (compare with
> posix-aio-compat.c/paio_submit()). The attached patch adds the missing
> line in.
>
> This seems to fix the problem. Of course, /now/ I'm getting other
> weird problems (as I'm trying to see if I can get slysoft anydvd
> working in a KVM winXP vm), but they need further investigation and
> likely other fixes.
>
> Please forgive me if I'm mistaken in this, I've only just started
> looking at the qemu code.

Hi, after further investigation, I can generate the "weird problem"
just from vanilla winxp. I'm using an lsi53c895a virtual SCSI device
BTW.

You can generate it by setting up a scsi generic cdrom, right clicking
on the cd/dvd device in winxp, click properties, hardware, select the
device from the list, and then choose Properties. Qemu will crash
instantly with the following debug from the emulated LSI code:

[snip]
lsi_scsi: Compare phase 1 == 7
lsi_scsi: Control condition failed
lsi_scsi: SCRIPTS dsp=f2022658 opcode 80880000 arg 00000414
lsi_scsi: Jump to 0xf2022a74
lsi_scsi: SCRIPTS dsp=f2022a74 opcode 58000008 arg 00000000
lsi_scsi: Set ATN
lsi_scsi: SCRIPTS dsp=f2022a7c opcode 60000040 arg 00000000
lsi_scsi: Clear ACK
lsi_scsi: SCRIPTS dsp=f2022a84 opcode 9e030000 arg 0000ff08
lsi_scsi: Compare phase 6 != 6
lsi_scsi: Control condition failed
lsi_scsi: SCRIPTS dsp=f2022a8c opcode 7c027f00 arg 00000000
lsi_scsi: Read-Modify-Write reg 0x2 AND data8=0x7f sfbr=0x00
lsi_scsi: Read reg 2
lsi_scsi: Write reg 2 = 00
lsi_scsi: SCRIPTS dsp=f2022a94 opcode 0e000001 arg f2022cc0
lsi_scsi: MSG out len=1
lsi_scsi: error: Unimplemented message 0x0c

At this point it crashes.

0xc seems to be the SCSI BUS DEVICE RESET message.. can anyone shed
any light on what is going wrong here?
adq - Aug. 7, 2010, 2:03 a.m.
On 7 August 2010 02:50, adq <adq@lidskialf.net> wrote:
> On 7 August 2010 01:55, adq <adq@lidskialf.net> wrote:
>> Hi, I've been tracking down why scsi generic devices (using SG_IO)
>> don't work any more. After adding debug, I can see that it actually
>> submits the scsi CDB in hw/scsi-generic.c/execute_command(), but that
>> the hw/scsi-generic.c/scsi_read_complete() callback is never called.
>>
>> This is because these are done with ioctls, and the posix async ioctl
>> code is, I think, broken right now. Some more debugging, led me to
>> posix-aio-compat.c/posix_aio_process_queue():
>>
>>            if (acb->async_context_id != async_context_id) {
>>
>> The async_context_ids don't match, so the request is never handled.
>> This is because the acb->async_context_id field is not initialised in
>> posix-aio-compat.c/paio_ioctl() (compare with
>> posix-aio-compat.c/paio_submit()). The attached patch adds the missing
>> line in.
>>
>> This seems to fix the problem. Of course, /now/ I'm getting other
>> weird problems (as I'm trying to see if I can get slysoft anydvd
>> working in a KVM winXP vm), but they need further investigation and
>> likely other fixes.
>>
>> Please forgive me if I'm mistaken in this, I've only just started
>> looking at the qemu code.
>
> Hi, after further investigation, I can generate the "weird problem"
> just from vanilla winxp. I'm using an lsi53c895a virtual SCSI device
> BTW.
>
> You can generate it by setting up a scsi generic cdrom, right clicking
> on the cd/dvd device in winxp, click properties, hardware, select the
> device from the list, and then choose Properties. Qemu will crash
> instantly with the following debug from the emulated LSI code:
>
> [snip]
> lsi_scsi: Compare phase 1 == 7
> lsi_scsi: Control condition failed
> lsi_scsi: SCRIPTS dsp=f2022658 opcode 80880000 arg 00000414
> lsi_scsi: Jump to 0xf2022a74
> lsi_scsi: SCRIPTS dsp=f2022a74 opcode 58000008 arg 00000000
> lsi_scsi: Set ATN
> lsi_scsi: SCRIPTS dsp=f2022a7c opcode 60000040 arg 00000000
> lsi_scsi: Clear ACK
> lsi_scsi: SCRIPTS dsp=f2022a84 opcode 9e030000 arg 0000ff08
> lsi_scsi: Compare phase 6 != 6
> lsi_scsi: Control condition failed
> lsi_scsi: SCRIPTS dsp=f2022a8c opcode 7c027f00 arg 00000000
> lsi_scsi: Read-Modify-Write reg 0x2 AND data8=0x7f sfbr=0x00
> lsi_scsi: Read reg 2
> lsi_scsi: Write reg 2 = 00
> lsi_scsi: SCRIPTS dsp=f2022a94 opcode 0e000001 arg f2022cc0
> lsi_scsi: MSG out len=1
> lsi_scsi: error: Unimplemented message 0x0c
>
> At this point it crashes.
>
> 0xc seems to be the SCSI BUS DEVICE RESET message.. can anyone shed
> any light on what is going wrong here?
>

Ah, just spotted that when you enable LSI_DEBUG, the BADF() macro has
an implicit exit() in it. Ok, disabling that, gives me the real cause
of the crash I'm seeing:

lsi_scsi: SCRIPTS dsp=f20229bc opcode 60000048 arg 00000000
lsi_scsi: Clear ATN ACK
lsi_scsi: SCRIPTS dsp=f20229c4 opcode 818b0000 arg fffffc4c
lsi_scsi: Compare phase 2 == 1
lsi_scsi: Control condition failed
lsi_scsi: SCRIPTS dsp=f20229cc opcode 808a0000 arg fffffc44
lsi_scsi: Compare phase 2 == 0
lsi_scsi: Control condition failed
lsi_scsi: SCRIPTS dsp=f20229d4 opcode 828a0000 arg fffffc0c
lsi_scsi: Compare phase 2 == 2
lsi_scsi: Jump to 0xf20225e8
lsi_scsi: SCRIPTS dsp=f20225e8 opcode 1a000000 arg 00000020
lsi_scsi: Send command len=524288
qemu: /root/qemu-kvm/src/qemu-kvm-0.12.5/hw/lsi53c895a.c:713:
lsi_do_command: Assertion `s->current == ((void *)0)' failed.
Kevin Wolf - Aug. 8, 2010, 1:11 p.m.
Am 07.08.2010 02:55, schrieb adq:
> Hi, I've been tracking down why scsi generic devices (using SG_IO)
> don't work any more. After adding debug, I can see that it actually
> submits the scsi CDB in hw/scsi-generic.c/execute_command(), but that
> the hw/scsi-generic.c/scsi_read_complete() callback is never called.
> 
> This is because these are done with ioctls, and the posix async ioctl
> code is, I think, broken right now. Some more debugging, led me to
> posix-aio-compat.c/posix_aio_process_queue():
> 
>             if (acb->async_context_id != async_context_id) {
> 
> The async_context_ids don't match, so the request is never handled.
> This is because the acb->async_context_id field is not initialised in
> posix-aio-compat.c/paio_ioctl() (compare with
> posix-aio-compat.c/paio_submit()). The attached patch adds the missing
> line in.
> 
> This seems to fix the problem. Of course, /now/ I'm getting other
> weird problems (as I'm trying to see if I can get slysoft anydvd
> working in a KVM winXP vm), but they need further investigation and
> likely other fixes.
> 
> Please forgive me if I'm mistaken in this, I've only just started
> looking at the qemu code.

The patch looks correct to me.

Please use git format-patch to generate the patch, so that it contains a
decent commit message and I can apply it with git am. Also, please don't
forget the Signed-off-by line, otherwise we can't accept it.

Kevin

Patch

diff -Naur qemu-kvm-0.12.5.orig//posix-aio-compat.c qemu-kvm-0.12.5/posix-aio-compat.c
--- qemu-kvm-0.12.5.orig//posix-aio-compat.c	2010-07-27 01:43:53.000000000 +0100
+++ qemu-kvm-0.12.5/posix-aio-compat.c	2010-08-07 01:49:07.051265778 +0100
@@ -597,6 +597,7 @@ 
     acb->aio_type = QEMU_AIO_IOCTL;
     acb->aio_fildes = fd;
     acb->ev_signo = SIGUSR2;
+    acb->async_context_id = get_async_context_id();
     acb->aio_offset = 0;
     acb->aio_ioctl_buf = buf;
     acb->aio_ioctl_cmd = req;