"iothread: release iothread around aio_poll" causes random hangs at startup
diff mbox

Message ID 20150610093408.GC11648@ad.nay.redhat.com
State New
Headers show

Commit Message

Fam Zheng June 10, 2015, 9:34 a.m. UTC
On Wed, 06/10 11:18, Christian Borntraeger wrote:
> Am 10.06.2015 um 04:12 schrieb Fam Zheng:
> > On Tue, 06/09 11:01, Christian Borntraeger wrote:
> >> Am 09.06.2015 um 04:28 schrieb Fam Zheng:
> >>> On Tue, 06/02 16:36, Christian Borntraeger wrote:
> >>>> Paolo,
> >>>>
> >>>> I bisected 
> >>>> commit a0710f7995f914e3044e5899bd8ff6c43c62f916
> >>>> Author:     Paolo Bonzini <pbonzini@redhat.com>
> >>>> AuthorDate: Fri Feb 20 17:26:52 2015 +0100
> >>>> Commit:     Kevin Wolf <kwolf@redhat.com>
> >>>> CommitDate: Tue Apr 28 15:36:08 2015 +0200
> >>>>
> >>>>     iothread: release iothread around aio_poll
> >>>>
> >>>> to cause a problem with hanging guests.
> >>>>
> >>>> Having many guests all with a kernel/ramdisk (via -kernel) and
> >>>> several null block devices will result in hangs. All hanging 
> >>>> guests are in partition detection code waiting for an I/O to return
> >>>> so very early maybe even the first I/O.
> >>>>
> >>>> Reverting that commit "fixes" the hangs.
> >>>> Any ideas?
> >>>
> >>> Christian, I can't reproduce this on my x86 box with virtio-blk-pci. Do you
> >>> have a reproducer for x86? Or could you collect backtraces for all the threads
> >>> in QEMU when it hangs?
> >>>
> >>> My long shot is that the main loop is blocked at aio_context_acquire(ctx),
> >>> while the iothread of that ctx is blocked at aio_poll(ctx, blocking).
> >>
> >> Here is a backtrace on s390. I need 2 or more disks, (one is not enough).
> > 
> > It shows iothreads and main loop are all waiting for events, and the vcpu
> > threads are running guest code.
> > 
> > It could be the requests being leaked. Do you see this problem with a regular
> > file based image or null-co driver? Maybe we're missing something about the
> > AioContext in block/null.c.
> 
> It seems to run with normal file based images. As soon as I have two or more null-aio
> devices it hangs pretty soon when doing a reboot loop.
> 

Ahh! If it's a reboot loop, the device reset thing may get fishy. I suspect the
completion BH used by null-aio may be messed up, that's why I wonder whether
null-co:// would work for you. Could you test that?

Also, could you try below patch with null-aio://, too?

Thanks,
Fam

---

Comments

Christian Borntraeger June 10, 2015, 10:31 a.m. UTC | #1
Am 10.06.2015 um 11:34 schrieb Fam Zheng:
> On Wed, 06/10 11:18, Christian Borntraeger wrote:
>> Am 10.06.2015 um 04:12 schrieb Fam Zheng:
>>> On Tue, 06/09 11:01, Christian Borntraeger wrote:
>>>> Am 09.06.2015 um 04:28 schrieb Fam Zheng:
>>>>> On Tue, 06/02 16:36, Christian Borntraeger wrote:
>>>>>> Paolo,
>>>>>>
>>>>>> I bisected 
>>>>>> commit a0710f7995f914e3044e5899bd8ff6c43c62f916
>>>>>> Author:     Paolo Bonzini <pbonzini@redhat.com>
>>>>>> AuthorDate: Fri Feb 20 17:26:52 2015 +0100
>>>>>> Commit:     Kevin Wolf <kwolf@redhat.com>
>>>>>> CommitDate: Tue Apr 28 15:36:08 2015 +0200
>>>>>>
>>>>>>     iothread: release iothread around aio_poll
>>>>>>
>>>>>> to cause a problem with hanging guests.
>>>>>>
>>>>>> Having many guests all with a kernel/ramdisk (via -kernel) and
>>>>>> several null block devices will result in hangs. All hanging 
>>>>>> guests are in partition detection code waiting for an I/O to return
>>>>>> so very early maybe even the first I/O.
>>>>>>
>>>>>> Reverting that commit "fixes" the hangs.
>>>>>> Any ideas?
>>>>>
>>>>> Christian, I can't reproduce this on my x86 box with virtio-blk-pci. Do you
>>>>> have a reproducer for x86? Or could you collect backtraces for all the threads
>>>>> in QEMU when it hangs?
>>>>>
>>>>> My long shot is that the main loop is blocked at aio_context_acquire(ctx),
>>>>> while the iothread of that ctx is blocked at aio_poll(ctx, blocking).
>>>>
>>>> Here is a backtrace on s390. I need 2 or more disks, (one is not enough).
>>>
>>> It shows iothreads and main loop are all waiting for events, and the vcpu
>>> threads are running guest code.
>>>
>>> It could be the requests being leaked. Do you see this problem with a regular
>>> file based image or null-co driver? Maybe we're missing something about the
>>> AioContext in block/null.c.
>>
>> It seems to run with normal file based images. As soon as I have two or more null-aio
>> devices it hangs pretty soon when doing a reboot loop.
>>
> 
> Ahh! If it's a reboot loop, the device reset thing may get fishy. I suspect the
> completion BH used by null-aio may be messed up, that's why I wonder whether
> null-co:// would work for you. Could you test that?

null-co also fails.

> 
> Also, could you try below patch with null-aio://, too?

The same. Guests still get stuck.


> 
> Thanks,
> Fam
> 
> ---
> 
> diff --git a/hw/block/virtio-blk.c b/hw/block/virtio-blk.c
> index cd539aa..c87b444 100644
> --- a/hw/block/virtio-blk.c
> +++ b/hw/block/virtio-blk.c
> @@ -652,15 +652,11 @@ static void virtio_blk_reset(VirtIODevice *vdev)
>  {
>      VirtIOBlock *s = VIRTIO_BLK(vdev);
> 
> -    if (s->dataplane) {
> -        virtio_blk_data_plane_stop(s->dataplane);
> -    }
> -
> -    /*
> -     * This should cancel pending requests, but can't do nicely until there
> -     * are per-device request lists.
> -     */
>      blk_drain_all();
> +    if (s->dataplane) {
> +        virtio_blk_data_plane_stop(s->dataplane);
> +    }
> +
>      blk_set_enable_write_cache(s->blk, s->original_wce);
>  }
>
Christian Borntraeger July 16, 2015, 11:03 a.m. UTC | #2
Am 10.06.2015 um 11:34 schrieb Fam Zheng:
> On Wed, 06/10 11:18, Christian Borntraeger wrote:
>> Am 10.06.2015 um 04:12 schrieb Fam Zheng:
>>> On Tue, 06/09 11:01, Christian Borntraeger wrote:
>>>> Am 09.06.2015 um 04:28 schrieb Fam Zheng:
>>>>> On Tue, 06/02 16:36, Christian Borntraeger wrote:
>>>>>> Paolo,
>>>>>>
>>>>>> I bisected 
>>>>>> commit a0710f7995f914e3044e5899bd8ff6c43c62f916
>>>>>> Author:     Paolo Bonzini <pbonzini@redhat.com>
>>>>>> AuthorDate: Fri Feb 20 17:26:52 2015 +0100
>>>>>> Commit:     Kevin Wolf <kwolf@redhat.com>
>>>>>> CommitDate: Tue Apr 28 15:36:08 2015 +0200
>>>>>>
>>>>>>     iothread: release iothread around aio_poll
>>>>>>
>>>>>> to cause a problem with hanging guests.
>>>>>>
>>>>>> Having many guests all with a kernel/ramdisk (via -kernel) and
>>>>>> several null block devices will result in hangs. All hanging 
>>>>>> guests are in partition detection code waiting for an I/O to return
>>>>>> so very early maybe even the first I/O.
>>>>>>
>>>>>> Reverting that commit "fixes" the hangs.
>>>>>> Any ideas?


For what its worth, I can no longer reproduce the issue on
current master + cherry-pick of a0710f7995f (iothread: release iothread around aio_poll)

bisect tells me that

commit 53ec73e264f481b79b52efcadc9ceb8f8996975c
Author:     Fam Zheng <famz@redhat.com>
AuthorDate: Fri May 29 18:53:14 2015 +0800
Commit:     Stefan Hajnoczi <stefanha@redhat.com>
CommitDate: Tue Jul 7 14:27:14 2015 +0100

    block: Use bdrv_drain to replace uncessary bdrv_drain_all

made the problem will blk-null go away. I still dont understand why.

Christian
Paolo Bonzini July 16, 2015, 11:20 a.m. UTC | #3
On 16/07/2015 13:03, Christian Borntraeger wrote:
> For what its worth, I can no longer reproduce the issue on
> current master + cherry-pick of a0710f7995f (iothread: release iothread around aio_poll)
> 
> bisect tells me that
> 
> commit 53ec73e264f481b79b52efcadc9ceb8f8996975c
> Author:     Fam Zheng <famz@redhat.com>
> AuthorDate: Fri May 29 18:53:14 2015 +0800
> Commit:     Stefan Hajnoczi <stefanha@redhat.com>
> CommitDate: Tue Jul 7 14:27:14 2015 +0100
> 
>     block: Use bdrv_drain to replace uncessary bdrv_drain_all
> 
> made the problem will blk-null go away. I still dont understand why.

It could be related to the AioContext problem that I'm fixing these
days, too.  Good news, we'll requeue the patch for 2.5.

Paolo
Christian Borntraeger July 16, 2015, 11:24 a.m. UTC | #4
Am 16.07.2015 um 13:20 schrieb Paolo Bonzini:
> 
> 
> On 16/07/2015 13:03, Christian Borntraeger wrote:
>> For what its worth, I can no longer reproduce the issue on
>> current master + cherry-pick of a0710f7995f (iothread: release iothread around aio_poll)
>>
>> bisect tells me that
>>
>> commit 53ec73e264f481b79b52efcadc9ceb8f8996975c
>> Author:     Fam Zheng <famz@redhat.com>
>> AuthorDate: Fri May 29 18:53:14 2015 +0800
>> Commit:     Stefan Hajnoczi <stefanha@redhat.com>
>> CommitDate: Tue Jul 7 14:27:14 2015 +0100
>>
>>     block: Use bdrv_drain to replace uncessary bdrv_drain_all
>>
>> made the problem will blk-null go away. I still dont understand why.
> 
> It could be related to the AioContext problem that I'm fixing these
> days, too.  Good news, we'll requeue the patch for 2.5.

That was also something that I had in mind (in fact I retested this to check
the ctx patch). master + cherry-pick of a0710f7995f + revert of 53ec73e26 + this fix
still fails, so it was (is?) a different issue. The interesting part is that this
problem required 2 or more disk (and we replace drain_all with single drains) so
it somewhat sounds plausible.

Christian
Paolo Bonzini July 16, 2015, 11:37 a.m. UTC | #5
On 16/07/2015 13:24, Christian Borntraeger wrote:
> The interesting part is that this problem required 2 or more disk
> (and we replace drain_all with single drains) so it somewhat sounds
> plausible.

Yes, indeed.  It is very plausible.  I wanted to reproduce it these
days, so thanks for saving me a lot of time!  I'll test your exact setup
(master + AioContext fix + cherry-pick of a0710f7995f + revert of
53ec73e26).

Thanks,

Paolo

Patch
diff mbox

diff --git a/hw/block/virtio-blk.c b/hw/block/virtio-blk.c
index cd539aa..c87b444 100644
--- a/hw/block/virtio-blk.c
+++ b/hw/block/virtio-blk.c
@@ -652,15 +652,11 @@  static void virtio_blk_reset(VirtIODevice *vdev)
 {
     VirtIOBlock *s = VIRTIO_BLK(vdev);
 
-    if (s->dataplane) {
-        virtio_blk_data_plane_stop(s->dataplane);
-    }
-
-    /*
-     * This should cancel pending requests, but can't do nicely until there
-     * are per-device request lists.
-     */
     blk_drain_all();
+    if (s->dataplane) {
+        virtio_blk_data_plane_stop(s->dataplane);
+    }
+
     blk_set_enable_write_cache(s->blk, s->original_wce);
 }