Patchwork Qemu-KVM 0.12.3 and Multipath -> Assertion

login
register
mail settings
Submitter Kevin Wolf
Date May 4, 2010, 12:20 p.m.
Message ID <4BE01120.30608@redhat.com>
Download mbox | patch
Permalink /patch/51601/
State New
Headers show

Comments

Kevin Wolf - May 4, 2010, 12:20 p.m.
Am 04.05.2010 13:38, schrieb Peter Lieven:
> hi kevin,
> 
> i set a breakpint at bmdma_active_if. the first 2 breaks encountered 
> when the last path in the multipath
> failed, but the assertion was not true.
> when i kicked one path back in the breakpoint was reached again, this 
> time leading to an assert.
> the stacktrace is from the point shortly before.
> 
> hope this helps.

Hm, looks like there's something wrong with cancelling requests -
bdrv_aio_cancel might decide that it completes a request (and
consequently calls the callback for it) whereas the IDE emulation
decides that it's done with the request before calling bdrv_aio_cancel.

I haven't looked in much detail what this could break, but does
something like this help?

Kevin
Peter Lieven - May 4, 2010, 1:42 p.m.
hi kevin,

you did it *g*

looks promising. applied this patched and was not able to reproduce yet :-)

secure way to reproduce was to shut down all multipath paths, then 
initiate i/o
in the vm (e.g. start an application). of course, everything hangs at 
this point.

after reenabling one path, vm crashed. now it seems to behave correctly and
just report an DMA timeout and continues normally afterwards.

can you imagine of any way preventing the vm to consume 100% cpu in
that waiting state?
my current approach is to run all vms with nice 1, which helped to keep the
machine responsible if all vms (in my test case 64 on a box) have hanging
i/o at the same time.

br,
peter



Kevin Wolf wrote:
> Am 04.05.2010 13:38, schrieb Peter Lieven:
>   
>> hi kevin,
>>
>> i set a breakpint at bmdma_active_if. the first 2 breaks encountered 
>> when the last path in the multipath
>> failed, but the assertion was not true.
>> when i kicked one path back in the breakpoint was reached again, this 
>> time leading to an assert.
>> the stacktrace is from the point shortly before.
>>
>> hope this helps.
>>     
>
> Hm, looks like there's something wrong with cancelling requests -
> bdrv_aio_cancel might decide that it completes a request (and
> consequently calls the callback for it) whereas the IDE emulation
> decides that it's done with the request before calling bdrv_aio_cancel.
>
> I haven't looked in much detail what this could break, but does
> something like this help?
>
> diff --git a/hw/ide/core.c b/hw/ide/core.c
> index 0757528..3cd55e3 100644
> --- a/hw/ide/core.c
> +++ b/hw/ide/core.c
> @@ -2838,10 +2838,6 @@ static void ide_dma_restart(IDEState *s, int is_read)
>  void ide_dma_cancel(BMDMAState *bm)
>  {
>      if (bm->status & BM_STATUS_DMAING) {
> -        bm->status &= ~BM_STATUS_DMAING;
> -        /* cancel DMA request */
> -        bm->unit = -1;
> -        bm->dma_cb = NULL;
>          if (bm->aiocb) {
>  #ifdef DEBUG_AIO
>              printf("aio_cancel\n");
> @@ -2849,6 +2845,10 @@ void ide_dma_cancel(BMDMAState *bm)
>              bdrv_aio_cancel(bm->aiocb);
>              bm->aiocb = NULL;
>          }
> +        bm->status &= ~BM_STATUS_DMAING;
> +        /* cancel DMA request */
> +        bm->unit = -1;
> +        bm->dma_cb = NULL;
>      }
>  }
>
> Kevin
>
>
Kevin Wolf - May 4, 2010, 2:01 p.m.
Am 04.05.2010 15:42, schrieb Peter Lieven:
> hi kevin,
> 
> you did it *g*
> 
> looks promising. applied this patched and was not able to reproduce yet :-)
> 
> secure way to reproduce was to shut down all multipath paths, then 
> initiate i/o
> in the vm (e.g. start an application). of course, everything hangs at 
> this point.
> 
> after reenabling one path, vm crashed. now it seems to behave correctly and
> just report an DMA timeout and continues normally afterwards.

Great, I'm going to submit it as a proper patch then.

Christoph, by now I'm pretty sure it's right, but can you have another
look if this is correct, anyway?

> can you imagine of any way preventing the vm to consume 100% cpu in
> that waiting state?
> my current approach is to run all vms with nice 1, which helped to keep the
> machine responsible if all vms (in my test case 64 on a box) have hanging
> i/o at the same time.

I don't have anything particular in mind, but you could just attach gdb
and get another backtrace while it consumes 100% CPU (you'll need to use
"thread apply all bt" to catch everything). Then we should see where
it's hanging.

Kevin
Christoph Hellwig - May 4, 2010, 5:07 p.m.
On Tue, May 04, 2010 at 04:01:35PM +0200, Kevin Wolf wrote:
> Great, I'm going to submit it as a proper patch then.
> 
> Christoph, by now I'm pretty sure it's right, but can you have another
> look if this is correct, anyway?

It looks correct to me - we really shouldn't update the the fields
until bdrv_aio_cancel has returned.  In fact we cannot cancel a request
more often than we can, so there's a fairly high chance it will
complete.


Reviewed-by: Christoph Hellwig <hch@lst.de>
André Weidemann - May 8, 2010, 9:53 a.m.
Hi Kevin,
On 04.05.2010 14:20, Kevin Wolf wrote:

> Am 04.05.2010 13:38, schrieb Peter Lieven:
>> hi kevin,
>>
>> i set a breakpint at bmdma_active_if. the first 2 breaks encountered
>> when the last path in the multipath
>> failed, but the assertion was not true.
>> when i kicked one path back in the breakpoint was reached again, this
>> time leading to an assert.
>> the stacktrace is from the point shortly before.
>>
>> hope this helps.
>
> Hm, looks like there's something wrong with cancelling requests -
> bdrv_aio_cancel might decide that it completes a request (and
> consequently calls the callback for it) whereas the IDE emulation
> decides that it's done with the request before calling bdrv_aio_cancel.
>
> I haven't looked in much detail what this could break, but does
> something like this help?

Your attached patch fixes the problem I had as well. I ran 3 consecutive 
tests tonight, which all finished without crashing the VM.
I reported my "assertion failed" error on March 14th while doing disk 
perfomance tests using iozone in an Ubuntu 9.10 VM with qemu-kvm 0.12.3.

Thank you very much.
  André
Peter Lieven - May 12, 2010, 2:01 p.m.
Hi Kevin,

here we go. I created a blocking multipath device (interrupted all 
paths). qemu-kvm hangs with 100% cpu.
also monitor is not responding.

If I restore at least one path, the vm is continueing.

BR,
Peter


^C
Program received signal SIGINT, Interrupt.
0x00007fd8a6aaea94 in __lll_lock_wait () from /lib/libpthread.so.0
(gdb) bt
#0  0x00007fd8a6aaea94 in __lll_lock_wait () from /lib/libpthread.so.0
#1  0x00007fd8a6aaa190 in _L_lock_102 () from /lib/libpthread.so.0
#2  0x00007fd8a6aa9a7e in pthread_mutex_lock () from /lib/libpthread.so.0
#3  0x000000000042e739 in kvm_mutex_lock () at 
/usr/src/qemu-kvm-0.12.4/qemu-kvm.c:2524
#4  0x000000000042e76e in qemu_mutex_lock_iothread () at 
/usr/src/qemu-kvm-0.12.4/qemu-kvm.c:2537
#5  0x000000000040c262 in main_loop_wait (timeout=1000) at 
/usr/src/qemu-kvm-0.12.4/vl.c:3995
#6  0x000000000042dcf1 in kvm_main_loop () at 
/usr/src/qemu-kvm-0.12.4/qemu-kvm.c:2126
#7  0x000000000040c98c in main_loop () at /usr/src/qemu-kvm-0.12.4/vl.c:4212
#8  0x000000000041054b in main (argc=30, argv=0x7fff266a77e8, 
envp=0x7fff266a78e0) at /usr/src/qemu-kvm-0.12.4/vl.c:6252
(gdb) bt full
#0  0x00007fd8a6aaea94 in __lll_lock_wait () from /lib/libpthread.so.0
No symbol table info available.
#1  0x00007fd8a6aaa190 in _L_lock_102 () from /lib/libpthread.so.0
No symbol table info available.
#2  0x00007fd8a6aa9a7e in pthread_mutex_lock () from /lib/libpthread.so.0
No symbol table info available.
#3  0x000000000042e739 in kvm_mutex_lock () at 
/usr/src/qemu-kvm-0.12.4/qemu-kvm.c:2524
No locals.
#4  0x000000000042e76e in qemu_mutex_lock_iothread () at 
/usr/src/qemu-kvm-0.12.4/qemu-kvm.c:2537
No locals.
#5  0x000000000040c262 in main_loop_wait (timeout=1000) at 
/usr/src/qemu-kvm-0.12.4/vl.c:3995
    ioh = (IOHandlerRecord *) 0x0
    rfds = {fds_bits = {1048576, 0 <repeats 15 times>}}
    wfds = {fds_bits = {0 <repeats 16 times>}}
    xfds = {fds_bits = {0 <repeats 16 times>}}
    ret = 1
    nfds = 21
    tv = {tv_sec = 0, tv_usec = 999761}
#6  0x000000000042dcf1 in kvm_main_loop () at 
/usr/src/qemu-kvm-0.12.4/qemu-kvm.c:2126
    fds = {18, 19}
    mask = {__val = {268443712, 0 <repeats 15 times>}}
    sigfd = 20
#7  0x000000000040c98c in main_loop () at /usr/src/qemu-kvm-0.12.4/vl.c:4212
    r = 0
#8  0x000000000041054b in main (argc=30, argv=0x7fff266a77e8, 
envp=0x7fff266a78e0) at /usr/src/qemu-kvm-0.12.4/vl.c:6252
    gdbstub_dev = 0x0
    boot_devices_bitmap = 12
    i = 0
    snapshot = 0
    linux_boot = 0
    initrd_filename = 0x0
    kernel_filename = 0x0
    kernel_cmdline = 0x588fac ""
    boot_devices = "dc", '\0' <repeats 30 times>
    ds = (DisplayState *) 0x198bf00
    dcl = (DisplayChangeListener *) 0x0
    cyls = 0
    heads = 0
    secs = 0
    translation = 0
    hda_opts = (QemuOpts *) 0x0
    opts = (QemuOpts *) 0x1957390
    optind = 30
---Type <return> to continue, or q <return> to quit---
    r = 0x7fff266a8a23 "-usbdevice"
    optarg = 0x7fff266a8a2e "tablet"
    loadvm = 0x0
    machine = (QEMUMachine *) 0x861720
    cpu_model = 0x7fff266a8917 "qemu64,model_id=Intel(R) Xeon(R) CPU", ' 
' <repeats 11 times>, "E5520  @ 2.27GHz"
    fds = {644511720, 32767}
    tb_size = 0
    pid_file = 0x7fff266a89bb "/var/run/qemu/vm-150.pid"
    incoming = 0x0
    fd = 0
    pwd = (struct passwd *) 0x0
    chroot_dir = 0x0
    run_as = 0x0
    env = (struct CPUX86State *) 0x0
    show_vnc_port = 0
    params = {0x58cc76 "order", 0x58cc7c "once", 0x58cc81 "menu", 0x0}

Kevin Wolf wrote:
> Am 04.05.2010 15:42, schrieb Peter Lieven:
>   
>> hi kevin,
>>
>> you did it *g*
>>
>> looks promising. applied this patched and was not able to reproduce yet :-)
>>
>> secure way to reproduce was to shut down all multipath paths, then 
>> initiate i/o
>> in the vm (e.g. start an application). of course, everything hangs at 
>> this point.
>>
>> after reenabling one path, vm crashed. now it seems to behave correctly and
>> just report an DMA timeout and continues normally afterwards.
>>     
>
> Great, I'm going to submit it as a proper patch then.
>
> Christoph, by now I'm pretty sure it's right, but can you have another
> look if this is correct, anyway?
>
>   
>> can you imagine of any way preventing the vm to consume 100% cpu in
>> that waiting state?
>> my current approach is to run all vms with nice 1, which helped to keep the
>> machine responsible if all vms (in my test case 64 on a box) have hanging
>> i/o at the same time.
>>     
>
> I don't have anything particular in mind, but you could just attach gdb
> and get another backtrace while it consumes 100% CPU (you'll need to use
> "thread apply all bt" to catch everything). Then we should see where
> it's hanging.
>
> Kevin
>
>
>
>
Kevin Wolf - May 14, 2010, 9:26 a.m.
Hi Peter,

Am 12.05.2010 16:01, schrieb Peter Lieven:
> Hi Kevin,
> 
> here we go. I created a blocking multipath device (interrupted all 
> paths). qemu-kvm hangs with 100% cpu.
> also monitor is not responding.
> 
> If I restore at least one path, the vm is continueing.
> 
> BR,
> Peter

This seems to be the backtrace of only one thread, and likely not the
interesting one. Can you please use "threads all apply bt" to get the
backtrace of all threads?

Kevin

> 
> 
> ^C
> Program received signal SIGINT, Interrupt.
> 0x00007fd8a6aaea94 in __lll_lock_wait () from /lib/libpthread.so.0
> (gdb) bt
> #0  0x00007fd8a6aaea94 in __lll_lock_wait () from /lib/libpthread.so.0
> #1  0x00007fd8a6aaa190 in _L_lock_102 () from /lib/libpthread.so.0
> #2  0x00007fd8a6aa9a7e in pthread_mutex_lock () from /lib/libpthread.so.0
> #3  0x000000000042e739 in kvm_mutex_lock () at 
> /usr/src/qemu-kvm-0.12.4/qemu-kvm.c:2524
> #4  0x000000000042e76e in qemu_mutex_lock_iothread () at 
> /usr/src/qemu-kvm-0.12.4/qemu-kvm.c:2537
> #5  0x000000000040c262 in main_loop_wait (timeout=1000) at 
> /usr/src/qemu-kvm-0.12.4/vl.c:3995
> #6  0x000000000042dcf1 in kvm_main_loop () at 
> /usr/src/qemu-kvm-0.12.4/qemu-kvm.c:2126
> #7  0x000000000040c98c in main_loop () at /usr/src/qemu-kvm-0.12.4/vl.c:4212
> #8  0x000000000041054b in main (argc=30, argv=0x7fff266a77e8, 
> envp=0x7fff266a78e0) at /usr/src/qemu-kvm-0.12.4/vl.c:6252
> (gdb) bt full
> #0  0x00007fd8a6aaea94 in __lll_lock_wait () from /lib/libpthread.so.0
> No symbol table info available.
> #1  0x00007fd8a6aaa190 in _L_lock_102 () from /lib/libpthread.so.0
> No symbol table info available.
> #2  0x00007fd8a6aa9a7e in pthread_mutex_lock () from /lib/libpthread.so.0
> No symbol table info available.
> #3  0x000000000042e739 in kvm_mutex_lock () at 
> /usr/src/qemu-kvm-0.12.4/qemu-kvm.c:2524
> No locals.
> #4  0x000000000042e76e in qemu_mutex_lock_iothread () at 
> /usr/src/qemu-kvm-0.12.4/qemu-kvm.c:2537
> No locals.
> #5  0x000000000040c262 in main_loop_wait (timeout=1000) at 
> /usr/src/qemu-kvm-0.12.4/vl.c:3995
>     ioh = (IOHandlerRecord *) 0x0
>     rfds = {fds_bits = {1048576, 0 <repeats 15 times>}}
>     wfds = {fds_bits = {0 <repeats 16 times>}}
>     xfds = {fds_bits = {0 <repeats 16 times>}}
>     ret = 1
>     nfds = 21
>     tv = {tv_sec = 0, tv_usec = 999761}
> #6  0x000000000042dcf1 in kvm_main_loop () at 
> /usr/src/qemu-kvm-0.12.4/qemu-kvm.c:2126
>     fds = {18, 19}
>     mask = {__val = {268443712, 0 <repeats 15 times>}}
>     sigfd = 20
> #7  0x000000000040c98c in main_loop () at /usr/src/qemu-kvm-0.12.4/vl.c:4212
>     r = 0
> #8  0x000000000041054b in main (argc=30, argv=0x7fff266a77e8, 
> envp=0x7fff266a78e0) at /usr/src/qemu-kvm-0.12.4/vl.c:6252
>     gdbstub_dev = 0x0
>     boot_devices_bitmap = 12
>     i = 0
>     snapshot = 0
>     linux_boot = 0
>     initrd_filename = 0x0
>     kernel_filename = 0x0
>     kernel_cmdline = 0x588fac ""
>     boot_devices = "dc", '\0' <repeats 30 times>
>     ds = (DisplayState *) 0x198bf00
>     dcl = (DisplayChangeListener *) 0x0
>     cyls = 0
>     heads = 0
>     secs = 0
>     translation = 0
>     hda_opts = (QemuOpts *) 0x0
>     opts = (QemuOpts *) 0x1957390
>     optind = 30
> ---Type <return> to continue, or q <return> to quit---
>     r = 0x7fff266a8a23 "-usbdevice"
>     optarg = 0x7fff266a8a2e "tablet"
>     loadvm = 0x0
>     machine = (QEMUMachine *) 0x861720
>     cpu_model = 0x7fff266a8917 "qemu64,model_id=Intel(R) Xeon(R) CPU", ' 
> ' <repeats 11 times>, "E5520  @ 2.27GHz"
>     fds = {644511720, 32767}
>     tb_size = 0
>     pid_file = 0x7fff266a89bb "/var/run/qemu/vm-150.pid"
>     incoming = 0x0
>     fd = 0
>     pwd = (struct passwd *) 0x0
>     chroot_dir = 0x0
>     run_as = 0x0
>     env = (struct CPUX86State *) 0x0
>     show_vnc_port = 0
>     params = {0x58cc76 "order", 0x58cc7c "once", 0x58cc81 "menu", 0x0}
> 
> Kevin Wolf wrote:
>> Am 04.05.2010 15:42, schrieb Peter Lieven:
>>   
>>> hi kevin,
>>>
>>> you did it *g*
>>>
>>> looks promising. applied this patched and was not able to reproduce yet :-)
>>>
>>> secure way to reproduce was to shut down all multipath paths, then 
>>> initiate i/o
>>> in the vm (e.g. start an application). of course, everything hangs at 
>>> this point.
>>>
>>> after reenabling one path, vm crashed. now it seems to behave correctly and
>>> just report an DMA timeout and continues normally afterwards.
>>>     
>>
>> Great, I'm going to submit it as a proper patch then.
>>
>> Christoph, by now I'm pretty sure it's right, but can you have another
>> look if this is correct, anyway?
>>
>>   
>>> can you imagine of any way preventing the vm to consume 100% cpu in
>>> that waiting state?
>>> my current approach is to run all vms with nice 1, which helped to keep the
>>> machine responsible if all vms (in my test case 64 on a box) have hanging
>>> i/o at the same time.
>>>     
>>
>> I don't have anything particular in mind, but you could just attach gdb
>> and get another backtrace while it consumes 100% CPU (you'll need to use
>> "thread apply all bt" to catch everything). Then we should see where
>> it's hanging.
>>
>> Kevin
>>
>>
>>
>>   
> 
>
Peter Lieven - May 18, 2010, 11:10 a.m.
hi kevin,

here is the backtrace of (hopefully) all threads:

^C
Program received signal SIGINT, Interrupt.
[Switching to Thread 0x7f39b72656f0 (LWP 10695)]
0x00007f39b6c3ea94 in __lll_lock_wait () from /lib/libpthread.so.0

(gdb) thread apply all bt

Thread 2 (Thread 0x7f39b57b8950 (LWP 10698)):
#0  0x00007f39b6c3eedb in read () from /lib/libpthread.so.0
#1  0x000000000049e723 in qemu_laio_completion_cb (opaque=0x22b4010) at 
linux-aio.c:125
#2  0x000000000049e8ad in laio_cancel (blockacb=0x22ba310) at 
linux-aio.c:184
#3  0x000000000049a309 in bdrv_aio_cancel (acb=0x22ba310) at block.c:1800
#4  0x0000000000587a52 in dma_aio_cancel (acb=0x22ba170) at 
/usr/src/qemu-kvm-0.12.4/dma-helpers.c:138
#5  0x000000000049a309 in bdrv_aio_cancel (acb=0x22ba170) at block.c:1800
#6  0x0000000000444aac in ide_dma_cancel (bm=0x2800fd8) at 
/usr/src/qemu-kvm-0.12.4/hw/ide/core.c:2834
#7  0x0000000000445001 in bmdma_cmd_writeb (opaque=0x2800fd8, 
addr=49152, val=8) at /usr/src/qemu-kvm-0.12.4/hw/ide/pci.c:44
#8  0x00000000004c85f0 in ioport_write (index=0, address=49152, data=8) 
at ioport.c:80
#9  0x00000000004c8977 in cpu_outb (addr=49152, val=8 '\b') at ioport.c:198
#10 0x0000000000429731 in kvm_handle_io (port=49152, 
data=0x7f39b7263000, direction=1, size=1, count=1)
    at /usr/src/qemu-kvm-0.12.4/kvm-all.c:535
#11 0x000000000042bb8b in kvm_run (env=0x22ba5d0) at 
/usr/src/qemu-kvm-0.12.4/qemu-kvm.c:968
#12 0x000000000042cea2 in kvm_cpu_exec (env=0x22ba5d0) at 
/usr/src/qemu-kvm-0.12.4/qemu-kvm.c:1651
#13 0x000000000042d62c in kvm_main_loop_cpu (env=0x22ba5d0) at 
/usr/src/qemu-kvm-0.12.4/qemu-kvm.c:1893
#14 0x000000000042d76d in ap_main_loop (_env=0x22ba5d0) at 
/usr/src/qemu-kvm-0.12.4/qemu-kvm.c:1943
#15 0x00007f39b6c383ba in start_thread () from /lib/libpthread.so.0
#16 0x00007f39b5cbafcd in clone () from /lib/libc.so.6
#17 0x0000000000000000 in ?? ()

Thread 1 (Thread 0x7f39b72656f0 (LWP 10695)):
#0  0x00007f39b6c3ea94 in __lll_lock_wait () from /lib/libpthread.so.0
#1  0x00007f39b6c3a190 in _L_lock_102 () from /lib/libpthread.so.0
#2  0x00007f39b6c39a7e in pthread_mutex_lock () from /lib/libpthread.so.0
#3  0x000000000042e739 in kvm_mutex_lock () at 
/usr/src/qemu-kvm-0.12.4/qemu-kvm.c:2524
#4  0x000000000042e76e in qemu_mutex_lock_iothread () at 
/usr/src/qemu-kvm-0.12.4/qemu-kvm.c:2537
#5  0x000000000040c262 in main_loop_wait (timeout=1000) at 
/usr/src/qemu-kvm-0.12.4/vl.c:3995
#6  0x000000000042dcf1 in kvm_main_loop () at 
/usr/src/qemu-kvm-0.12.4/qemu-kvm.c:2126
#7  0x000000000040c98c in main_loop () at /usr/src/qemu-kvm-0.12.4/vl.c:4212
#8  0x000000000041054b in main (argc=30, argv=0x7fff019f1ca8, 
envp=0x7fff019f1da0) at /usr/src/qemu-kvm-0.12.4/vl.c:6252
Peter Lieven - May 18, 2010, 11:13 a.m.
hi,

will this patch make it into 0.12.4.1 ?

br,
peter

Christoph Hellwig wrote:
> On Tue, May 04, 2010 at 04:01:35PM +0200, Kevin Wolf wrote:
>   
>> Great, I'm going to submit it as a proper patch then.
>>
>> Christoph, by now I'm pretty sure it's right, but can you have another
>> look if this is correct, anyway?
>>     
>
> It looks correct to me - we really shouldn't update the the fields
> until bdrv_aio_cancel has returned.  In fact we cannot cancel a request
> more often than we can, so there's a fairly high chance it will
> complete.
>
>
> Reviewed-by: Christoph Hellwig <hch@lst.de>
>
>
Kevin Wolf - May 18, 2010, 12:14 p.m.
Am 18.05.2010 13:13, schrieb Peter Lieven:
> hi,
> 
> will this patch make it into 0.12.4.1 ?
> 
> br,
> peter

Anthony, can you please cherry-pick commit 38d8dfa1 into stable-0.12?

Kevin

> 
> Christoph Hellwig wrote:
>> On Tue, May 04, 2010 at 04:01:35PM +0200, Kevin Wolf wrote:
>>   
>>> Great, I'm going to submit it as a proper patch then.
>>>
>>> Christoph, by now I'm pretty sure it's right, but can you have another
>>> look if this is correct, anyway?
>>>     
>>
>> It looks correct to me - we really shouldn't update the the fields
>> until bdrv_aio_cancel has returned.  In fact we cannot cancel a request
>> more often than we can, so there's a fairly high chance it will
>> complete.
>>
>>
>> Reviewed-by: Christoph Hellwig <hch@lst.de>
Kevin Wolf - May 18, 2010, 1:22 p.m.
Am 18.05.2010 13:10, schrieb Peter Lieven:
> hi kevin,
> 
> here is the backtrace of (hopefully) all threads:
> 
> ^C
> Program received signal SIGINT, Interrupt.
> [Switching to Thread 0x7f39b72656f0 (LWP 10695)]
> 0x00007f39b6c3ea94 in __lll_lock_wait () from /lib/libpthread.so.0
> 
> (gdb) thread apply all bt
> 
> Thread 2 (Thread 0x7f39b57b8950 (LWP 10698)):
> #0  0x00007f39b6c3eedb in read () from /lib/libpthread.so.0
> #1  0x000000000049e723 in qemu_laio_completion_cb (opaque=0x22b4010) at 
> linux-aio.c:125
> #2  0x000000000049e8ad in laio_cancel (blockacb=0x22ba310) at 
> linux-aio.c:184

I think it's stuck here in an endless loop:

    while (laiocb->ret == -EINPROGRESS)
        qemu_laio_completion_cb(laiocb->ctx);

Can you verify this by single-stepping one or two loop iterations? ret
and errno after the read call could be interesting, too.

We'll be stuck in an endless loop if the request doesn't complete, which
might well happen in your scenario. Not sure what the right thing to do
is. We probably need to fail the bdrv_aio_cancel to avoid blocking the
whole program, but I have no idea what device emulations should do on
that condition.

As long as we can't handle that condition correctly, leaving the hang in
place is probably the best option. Maybe add some sleep to avoid 100%
CPU consumption.

Kevin
Christoph Hellwig - May 19, 2010, 7:29 a.m.
On Tue, May 18, 2010 at 03:22:36PM +0200, Kevin Wolf wrote:
> I think it's stuck here in an endless loop:
> 
>     while (laiocb->ret == -EINPROGRESS)
>         qemu_laio_completion_cb(laiocb->ctx);
> 
> Can you verify this by single-stepping one or two loop iterations? ret
> and errno after the read call could be interesting, too.

Maybe the compiler is just too smart.  Without some form of barrier
it could just optimize the loop away as laiocb->ret couldn't change
in a normal single-threaded environment.
Kevin Wolf - May 19, 2010, 7:48 a.m.
Am 19.05.2010 09:29, schrieb Christoph Hellwig:
> On Tue, May 18, 2010 at 03:22:36PM +0200, Kevin Wolf wrote:
>> I think it's stuck here in an endless loop:
>>
>>     while (laiocb->ret == -EINPROGRESS)
>>         qemu_laio_completion_cb(laiocb->ctx);
>>
>> Can you verify this by single-stepping one or two loop iterations? ret
>> and errno after the read call could be interesting, too.
> 
> Maybe the compiler is just too smart.  Without some form of barrier
> it could just optimize the loop away as laiocb->ret couldn't change
> in a normal single-threaded environment.

It probably could in theory, but in practice we're in a read() inside
qemu_laio_completion, so it didn't do it here.

Kevin
Peter Lieven - May 19, 2010, 8:18 a.m.
Kevin Wolf wrote:
> Am 19.05.2010 09:29, schrieb Christoph Hellwig:
>   
>> On Tue, May 18, 2010 at 03:22:36PM +0200, Kevin Wolf wrote:
>>     
>>> I think it's stuck here in an endless loop:
>>>
>>>     while (laiocb->ret == -EINPROGRESS)
>>>         qemu_laio_completion_cb(laiocb->ctx);
>>>
>>> Can you verify this by single-stepping one or two loop iterations? ret
>>> and errno after the read call could be interesting, too.
>>>       
>> Maybe the compiler is just too smart.  Without some form of barrier
>> it could just optimize the loop away as laiocb->ret couldn't change
>> in a normal single-threaded environment.
>>     
>
> It probably could in theory, but in practice we're in a read() inside
> qemu_laio_completion, so it didn't do it here.
>   
if you supply a patch that will add some usleeps at the point in
question i'm willing to test if it solves the 100% cpu problem.
> Kevin
>
>
Peter Lieven - May 23, 2010, 10:30 a.m.
Am 19.05.2010 um 10:18 schrieb Peter Lieven:

> Kevin Wolf wrote:
>> Am 19.05.2010 09:29, schrieb Christoph Hellwig:
>>  
>>> On Tue, May 18, 2010 at 03:22:36PM +0200, Kevin Wolf wrote:
>>>    
>>>> I think it's stuck here in an endless loop:
>>>> 
>>>>    while (laiocb->ret == -EINPROGRESS)
>>>>        qemu_laio_completion_cb(laiocb->ctx);
>>>> 
>>>> Can you verify this by single-stepping one or two loop iterations? ret
>>>> and errno after the read call could be interesting, too.
>>>>      
>>> Maybe the compiler is just too smart.  Without some form of barrier
>>> it could just optimize the loop away as laiocb->ret couldn't change
>>> in a normal single-threaded environment.
>>>    
>> 
>> It probably could in theory, but in practice we're in a read() inside
>> qemu_laio_completion, so it didn't do it here.
>>  
> if you supply a patch that will add some usleeps at the point in
> question i'm willing to test if it solves the 100% cpu problem.

can someone help here? what would be the best option to add some
usleeps?

>> Kevin
>> 
>>  
> 
> 
>

Patch

diff --git a/hw/ide/core.c b/hw/ide/core.c
index 0757528..3cd55e3 100644
--- a/hw/ide/core.c
+++ b/hw/ide/core.c
@@ -2838,10 +2838,6 @@  static void ide_dma_restart(IDEState *s, int is_read)
 void ide_dma_cancel(BMDMAState *bm)
 {
     if (bm->status & BM_STATUS_DMAING) {
-        bm->status &= ~BM_STATUS_DMAING;
-        /* cancel DMA request */
-        bm->unit = -1;
-        bm->dma_cb = NULL;
         if (bm->aiocb) {
 #ifdef DEBUG_AIO
             printf("aio_cancel\n");
@@ -2849,6 +2845,10 @@  void ide_dma_cancel(BMDMAState *bm)
             bdrv_aio_cancel(bm->aiocb);
             bm->aiocb = NULL;
         }
+        bm->status &= ~BM_STATUS_DMAING;
+        /* cancel DMA request */
+        bm->unit = -1;
+        bm->dma_cb = NULL;
     }
 }