[1/9] blockdev: hold AioContext for bdrv_unref() in external_snapshot_clean()

Message ID 20171205104141.28882-2-stefanha@redhat.com
State New
Series
  • blockdev: fix QMP 'transaction' with IOThreads

Commit Message

Stefan Hajnoczi Dec. 5, 2017, 10:41 a.m.
bdrv_unref() requires the AioContext lock because bdrv_flush() uses
BDRV_POLL_WHILE(), which assumes the AioContext is currently held.  If
BDRV_POLL_WHILE() runs without the AioContext held, the
pthread_mutex_unlock() call in aio_context_release() fails.

This patch moves bdrv_unref() into the AioContext locked region to solve
the following pthread_mutex_unlock() failure:

  #0  0x00007f566181969b in raise () at /lib64/libc.so.6
  #1  0x00007f566181b3b1 in abort () at /lib64/libc.so.6
  #2  0x00005592cd590458 in error_exit (err=<optimized out>, msg=msg@entry=0x5592cdaf6d60 <__func__.23977> "qemu_mutex_unlock") at util/qemu-thread-posix.c:36
  #3  0x00005592cd96e738 in qemu_mutex_unlock (mutex=mutex@entry=0x5592ce9505e0) at util/qemu-thread-posix.c:96
  #4  0x00005592cd969b69 in aio_context_release (ctx=ctx@entry=0x5592ce950580) at util/async.c:507
  #5  0x00005592cd8ead78 in bdrv_flush (bs=bs@entry=0x5592cfa87210) at block/io.c:2478
  #6  0x00005592cd89df30 in bdrv_close (bs=0x5592cfa87210) at block.c:3207
  #7  0x00005592cd89df30 in bdrv_delete (bs=0x5592cfa87210) at block.c:3395
  #8  0x00005592cd89df30 in bdrv_unref (bs=0x5592cfa87210) at block.c:4418
  #9  0x00005592cd6b7f86 in qmp_transaction (dev_list=<optimized out>, has_props=<optimized out>, props=<optimized out>, errp=errp@entry=0x7ffe4a1fc9d8) at blockdev.c:2308

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
---
 blockdev.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

Comments

Eric Blake Dec. 5, 2017, 2:36 p.m. | #1
Reviewed-by: Eric Blake <eblake@redhat.com>

I know the series is too big/late for 2.11, but is this one patch worth
having?

Stefan Hajnoczi Dec. 6, 2017, 11:31 a.m. | #2
On Tue, Dec 05, 2017 at 08:36:32AM -0600, Eric Blake wrote:
> Reviewed-by: Eric Blake <eblake@redhat.com>
> 
> I know the series is too big/late for 2.11, but is this one patch worth
> having?

This fix is not critical for 2.11.  The bug shouldn't be triggered under
normal circumstances.

The bug was exposed while developing qemu-iotests 202, which uses
unconventional commands (blockdev-add to create BDSes with no -drive or
BlockBackend).

Patch

diff --git a/blockdev.c b/blockdev.c
index 56a6b24a0b..3c8d994ced 100644
--- a/blockdev.c
+++ b/blockdev.c
@@ -1812,8 +1812,8 @@  static void external_snapshot_clean(BlkActionState *common)
                              DO_UPCAST(ExternalSnapshotState, common, common);
     if (state->aio_context) {
         bdrv_drained_end(state->old_bs);
-        aio_context_release(state->aio_context);
         bdrv_unref(state->new_bs);
+        aio_context_release(state->aio_context);
     }
 }