
thread-pool: Notify AIO context upon completion

Message ID 1435818839-5376-1-git-send-email-famz@redhat.com
State New

Commit Message

Fam Zheng July 2, 2015, 6:33 a.m. UTC
bdrv_flush() uses a loop like

    while (rwco.ret == NOT_DONE) {
        aio_poll(aio_context, true);
    }

to wait for the thread pool, but the waiting thread may not get notified
about the scheduled BH right away if there is no new event that wakes up a
blocking qemu_poll_ns(). In this case, it may even hang permanently.

Wake the main thread up by writing to the event notifier fd.

Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Fam Zheng <famz@redhat.com>

---

I suspect this may relate to

[Qemu-devel] "iothread: release iothread around aio_poll" causes random
hangs at startup

[http://lists.nongnu.org/archive/html/qemu-devel/2015-06/msg00623.html]

reported by Christian Borntraeger. Because there is rarely any fd activity
in the iothread, the blocking aio_poll() may block forever if it misses
the BH schedule.

Christian, could you test this patch against your reproducer?
---
 thread-pool.c | 1 +
 1 file changed, 1 insertion(+)
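For reference (not part of the patch): the one-line fix relies on the fact that
a write to the context's event notifier fd makes a blocking poll return. Below
is a minimal standalone sketch of that behaviour using a plain Linux eventfd;
the helper names are illustrative, not QEMU's APIs.

```c
#include <poll.h>
#include <stdint.h>
#include <sys/eventfd.h>
#include <unistd.h>

/* Without a write to the eventfd, a blocking poll() sees no event.
 * This simulates the "missed notification" case, with a 100 ms
 * timeout standing in for an indefinite hang. Returns 1 on the
 * expected outcome. */
static int missed_wakeup_times_out(void)
{
    int fd = eventfd(0, EFD_CLOEXEC);
    struct pollfd pfd = { .fd = fd, .events = POLLIN };
    int n = poll(&pfd, 1, 100);     /* 100 ms stands in for "forever" */
    close(fd);
    return n == 0;                  /* timed out: nobody woke us up */
}

/* Writing an 8-byte counter increment to the eventfd (roughly what
 * setting the event notifier boils down to on Linux) makes the same
 * poll() return immediately with POLLIN set. */
static int write_wakes_poll(void)
{
    int fd = eventfd(0, EFD_CLOEXEC);
    uint64_t one = 1;
    ssize_t r = write(fd, &one, sizeof(one));  /* the wakeup */
    struct pollfd pfd = { .fd = fd, .events = POLLIN };
    int n = poll(&pfd, 1, 100);
    close(fd);
    return r == sizeof(one) && n == 1 && (pfd.revents & POLLIN) != 0;
}
```

The first helper is the situation the commit message describes; the second shows
that a single write is enough to wake the poller.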

Comments

Christian Borntraeger July 2, 2015, 6:45 a.m. UTC | #1
On 02.07.2015 08:33, Fam Zheng wrote:
> bdrv_flush() uses a loop like
> 
>     while (rwco.ret == NOT_DONE) {
>         aio_poll(aio_context, true);
>     }
> 
> to wait for the thread pool, but the waiting thread may not get notified
> about the scheduled BH right away if there is no new event that wakes up a
> blocking qemu_poll_ns(). In this case, it may even hang permanently.
> 
> Wake the main thread up by writing to the event notifier fd.
> 
> Cc: Paolo Bonzini <pbonzini@redhat.com>
> Cc: Christian Borntraeger <borntraeger@de.ibm.com>
> Signed-off-by: Fam Zheng <famz@redhat.com>
> 
> ---
> 
> I suspect this may relate to
> 
> [Qemu-devel] "iothread: release iothread around aio_poll" causes random
> hangs at startup
> 
> [http://lists.nongnu.org/archive/html/qemu-devel/2015-06/msg00623.html]
> 
> reported by Christian Borntraeger. Because there is rarely any fd activity
> in the iothread, the blocking aio_poll() may block forever if it misses
> the BH schedule.
> 
> Christian, could you test this patch against your reproducer?

Still does not work. It really seems to be triggered by the null device
(and there must be >= 2).


> ---
>  thread-pool.c | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/thread-pool.c b/thread-pool.c
> index ac909f4..9b9c065 100644
> --- a/thread-pool.c
> +++ b/thread-pool.c
> @@ -112,6 +112,7 @@ static void *worker_thread(void *opaque)
>          qemu_mutex_lock(&pool->lock);
> 
>          qemu_bh_schedule(pool->completion_bh);
> +        aio_notify(pool->ctx);
>      }
> 
>      pool->cur_threads--;
>
Paolo Bonzini July 2, 2015, 7:11 a.m. UTC | #2
On 02/07/2015 08:33, Fam Zheng wrote:
> bdrv_flush() uses a loop like
> 
>     while (rwco.ret == NOT_DONE) {
>         aio_poll(aio_context, true);
>     }
> 
> to wait for the thread pool, but the waiting thread may not get notified
> about the scheduled BH right away if there is no new event that wakes up a
> blocking qemu_poll_ns().

That translates to "the dispatching optimization does not work". :)  I
do not think that is the problem.

Paolo
Fam Zheng July 2, 2015, 7:17 a.m. UTC | #3
On Thu, 07/02 09:11, Paolo Bonzini wrote:
> 
> 
> On 02/07/2015 08:33, Fam Zheng wrote:
> > bdrv_flush() uses a loop like
> > 
> >     while (rwco.ret == NOT_DONE) {
> >         aio_poll(aio_context, true);
> >     }
> > 
> > to wait for the thread pool, but the waiting thread may not get notified
> > about the scheduled BH right away if there is no new event that wakes up a
> > blocking qemu_poll_ns().
> 
> That translates to "the dispatching optimization does not work". :)  I
> do not think that is the problem.

I must be missing something. I see a hang locally with some AioContext patches
I'm testing, and this does fix it.

I traced that qemu_bh_schedule does call aio_notify and event_notifier_set, so
it's curious. Still looking.

Fam
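
For reference (not from the thread): the ordering Fam traced, where
qemu_bh_schedule first marks the BH scheduled and then notifies the context,
is what keeps the wakeup from being lost. A toy model of that schedule-then-notify
pattern follows; all names below are made up for illustration and are not QEMU's
implementation.

```c
#include <poll.h>
#include <stdatomic.h>
#include <stdint.h>
#include <sys/eventfd.h>
#include <unistd.h>

/* Hypothetical miniature of a bottom half plus its context notifier. */
struct toy_bh {
    atomic_int scheduled;
    int notifier_fd;        /* stands in for the AioContext event notifier */
};

static void toy_bh_schedule(struct toy_bh *bh)
{
    atomic_store(&bh->scheduled, 1);            /* publish the work first... */
    uint64_t one = 1;
    ssize_t r = write(bh->notifier_fd, &one, sizeof(one)); /* ...then wake */
    (void)r;
}

/* One blocking wait: returns 1 if the wakeup arrived and the BH was
 * found scheduled, i.e. nothing was lost. */
static int toy_aio_poll_once(struct toy_bh *bh)
{
    struct pollfd pfd = { .fd = bh->notifier_fd, .events = POLLIN };
    if (poll(&pfd, 1, 100) != 1) {
        return 0;                               /* missed wakeup: would hang */
    }
    uint64_t cnt;
    ssize_t r = read(bh->notifier_fd, &cnt, sizeof(cnt)); /* drain notifier */
    (void)r;
    return atomic_exchange(&bh->scheduled, 0);  /* consume the scheduled flag */
}
```

If the write in toy_bh_schedule() were skipped, toy_aio_poll_once() would time
out with the work still pending, which is exactly the hang described above.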

Patch

diff --git a/thread-pool.c b/thread-pool.c
index ac909f4..9b9c065 100644
--- a/thread-pool.c
+++ b/thread-pool.c
@@ -112,6 +112,7 @@  static void *worker_thread(void *opaque)
         qemu_mutex_lock(&pool->lock);
 
         qemu_bh_schedule(pool->completion_bh);
+        aio_notify(pool->ctx);
     }
 
     pool->cur_threads--;