Message ID | 20191003100103.331-1-stefanha@redhat.com |
---|---|
State | New |
Series | test-bdrv-drain: fix iothread_join() hang |
On 03/10/19 12:01, Stefan Hajnoczi wrote:
> [...]

Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>

Thanks!

Paolo

On Thu, Oct 03, 2019 at 11:01:03AM +0100, Stefan Hajnoczi wrote:
> [...]

Thanks, applied to my block tree:
https://github.com/stefanha/qemu/commits/block

Stefan

diff --git a/tests/iothread.c b/tests/iothread.c
index 777d9eea46..13c9fdcd8d 100644
--- a/tests/iothread.c
+++ b/tests/iothread.c
@@ -55,10 +55,16 @@ static void *iothread_run(void *opaque)
     return NULL;
 }
 
-void iothread_join(IOThread *iothread)
+static void iothread_stop_bh(void *opaque)
 {
+    IOThread *iothread = opaque;
+
     iothread->stopping = true;
-    aio_notify(iothread->ctx);
+}
+
+void iothread_join(IOThread *iothread)
+{
+    aio_bh_schedule_oneshot(iothread->ctx, iothread_stop_bh, iothread);
     qemu_thread_join(&iothread->thread);
     qemu_cond_destroy(&iothread->init_done_cond);
     qemu_mutex_destroy(&iothread->init_done_lock);

tests/test-bdrv-drain can hang in tests/iothread.c:iothread_run():

    while (!atomic_read(&iothread->stopping)) {
        aio_poll(iothread->ctx, true);
    }

The iothread_join() function works as follows:

    void iothread_join(IOThread *iothread)
    {
        iothread->stopping = true;
        aio_notify(iothread->ctx);
        qemu_thread_join(&iothread->thread);

If iothread_run() checks iothread->stopping before the iothread_join()
thread sets stopping to true, then aio_notify() may be optimized away
and iothread_run() hangs forever in aio_poll().

The correct way to change iothread->stopping is from a BH that executes
within iothread_run(). This ensures that iothread->stopping is checked
after we set it to true.

This was already fixed for ./iothread.c (note this is a different source
file!) by commit 2362a28ea11c145e1a13ae79342d76dc118a72a6 ("iothread:
fix iothread_stop() race condition"), but not for tests/iothread.c.

Fixes: 0c330a734b51c177ab8488932ac3b0c4d63a718a
       ("aio: introduce aio_co_schedule and aio_co_wake")
Reported-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
---
 tests/iothread.c | 10 ++++++++--
 1 file changed, 8 insertions(+), 2 deletions(-)
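
The pattern the commit message describes (set the stop flag from a callback that runs inside the event loop thread, instead of writing it from the joining thread) can be illustrated outside QEMU. The following is only a rough pthreads analogue, not QEMU code: `Loop`, `Task`, `loop_schedule()`, `loop_stop_task()`, `loop_join()` and `loop_demo()` are hypothetical names invented for this sketch, and a mutex/condvar task queue stands in for the AioContext and its bottom halves.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-in for a bottom half: a callback queued to the loop. */
typedef struct Task {
    void (*fn)(void *opaque);
    void *opaque;
    struct Task *next;
} Task;

typedef struct Loop {
    pthread_t thread;
    pthread_mutex_t lock;
    pthread_cond_t cond;
    Task *head, *tail;   /* FIFO queue of pending tasks, guarded by lock */
    bool stopping;       /* written only by the loop thread once it runs */
} Loop;

/* Queue a callback and wake the loop thread (safe from any thread). */
static void loop_schedule(Loop *l, void (*fn)(void *), void *opaque)
{
    Task *t = malloc(sizeof(*t));
    if (!t) {
        abort();
    }
    t->fn = fn;
    t->opaque = opaque;
    t->next = NULL;

    pthread_mutex_lock(&l->lock);
    if (l->tail) {
        l->tail->next = t;
    } else {
        l->head = t;
    }
    l->tail = t;
    pthread_cond_signal(&l->cond);
    pthread_mutex_unlock(&l->lock);
}

/* Loop thread: block until a task arrives, run it, then re-check stopping. */
static void *loop_run(void *opaque)
{
    Loop *l = opaque;

    while (!l->stopping) {
        pthread_mutex_lock(&l->lock);
        while (!l->head) {
            pthread_cond_wait(&l->cond, &l->lock);
        }
        Task *t = l->head;
        l->head = t->next;
        if (!l->head) {
            l->tail = NULL;
        }
        pthread_mutex_unlock(&l->lock);

        t->fn(t->opaque);
        free(t);
    }
    return NULL;
}

/* Runs inside the loop thread, so the flag is always seen on the next check. */
static void loop_stop_task(void *opaque)
{
    Loop *l = opaque;
    l->stopping = true;
}

/* Analogue of the patched iothread_join(): schedule the stop, then join. */
static void loop_join(Loop *l)
{
    loop_schedule(l, loop_stop_task, l);
    pthread_join(l->thread, NULL);
    assert(l->stopping);
    pthread_cond_destroy(&l->cond);
    pthread_mutex_destroy(&l->lock);
}

static void count_task(void *opaque)
{
    int *n = opaque;
    (*n)++;
}

/* Starts a loop, queues three tasks, stops it; returns how many tasks ran. */
int loop_demo(void)
{
    Loop l = { .head = NULL, .tail = NULL, .stopping = false };
    int ran = 0;

    pthread_mutex_init(&l.lock, NULL);
    pthread_cond_init(&l.cond, NULL);
    if (pthread_create(&l.thread, NULL, loop_run, &l) != 0) {
        abort();
    }
    for (int i = 0; i < 3; i++) {
        loop_schedule(&l, count_task, &ran);
    }
    loop_join(&l);
    return ran;
}
```

Calling `loop_demo()` from a `main()` should return 3: the queue is FIFO, so the three counting tasks run before the stop task. Because the stop request travels through the same queue that wakes the loop, the wake-up and the flag update cannot be observed in the wrong order, which is the property the patch relies on when it switches to `aio_bh_schedule_oneshot()`.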