aio-posix: honor is_external in AioContext polling

Message ID: 20170124095350.16679-1-stefanha@redhat.com
State: New

Commit Message

Stefan Hajnoczi Jan. 24, 2017, 9:53 a.m. UTC
AioHandlers marked ->is_external must be skipped when aio_node_check()
fails.  bdrv_drained_begin() needs this to prevent dataplane from
submitting new I/O requests while another thread accesses the device and
relies on it being quiesced.
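
For reference, the gate this patch relies on works roughly as follows (a
simplified sketch of the aio helpers, not a verbatim copy of the tree):

    /* Simplified sketch: a handler marked is_external is only eligible to
     * run while no aio_disable_external() section is active on its
     * AioContext; bdrv_drained_begin() opens such a section. */
    static inline void aio_disable_external(AioContext *ctx)
    {
        atomic_inc(&ctx->external_disable_cnt);
    }

    static inline void aio_enable_external(AioContext *ctx)
    {
        assert(ctx->external_disable_cnt > 0);
        atomic_dec(&ctx->external_disable_cnt);
    }

    static inline bool aio_node_check(AioContext *ctx, bool is_external)
    {
        return !is_external || !atomic_read(&ctx->external_disable_cnt);
    }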

This patch fixes the following segfault:

  Program terminated with signal SIGSEGV, Segmentation fault.
  #0  0x00005577f6127dad in bdrv_io_plug (bs=0x5577f7ae52f0) at qemu/block/io.c:2650
  2650            bdrv_io_plug(child->bs);
  [Current thread is 1 (Thread 0x7ff5c4bd1c80 (LWP 10917))]
  (gdb) bt
  #0  0x00005577f6127dad in bdrv_io_plug (bs=0x5577f7ae52f0) at qemu/block/io.c:2650
  #1  0x00005577f6114363 in blk_io_plug (blk=0x5577f7b8ba20) at qemu/block/block-backend.c:1561
  #2  0x00005577f5d4091d in virtio_blk_handle_vq (s=0x5577f9ada030, vq=0x5577f9b3d2a0) at qemu/hw/block/virtio-blk.c:589
  #3  0x00005577f5d4240d in virtio_blk_data_plane_handle_output (vdev=0x5577f9ada030, vq=0x5577f9b3d2a0) at qemu/hw/block/dataplane/virtio-blk.c:158
  #4  0x00005577f5d88acd in virtio_queue_notify_aio_vq (vq=0x5577f9b3d2a0) at qemu/hw/virtio/virtio.c:1304
  #5  0x00005577f5d8aaaf in virtio_queue_host_notifier_aio_poll (opaque=0x5577f9b3d308) at qemu/hw/virtio/virtio.c:2134
  #6  0x00005577f60ca077 in run_poll_handlers_once (ctx=0x5577f79ddbb0) at qemu/aio-posix.c:493
  #7  0x00005577f60ca268 in try_poll_mode (ctx=0x5577f79ddbb0, blocking=true) at qemu/aio-posix.c:569
  #8  0x00005577f60ca331 in aio_poll (ctx=0x5577f79ddbb0, blocking=true) at qemu/aio-posix.c:601
  #9  0x00005577f612722a in bdrv_flush (bs=0x5577f7c20970) at qemu/block/io.c:2403
  #10 0x00005577f60c1b2d in bdrv_close (bs=0x5577f7c20970) at qemu/block.c:2322
  #11 0x00005577f60c20e7 in bdrv_delete (bs=0x5577f7c20970) at qemu/block.c:2465
  #12 0x00005577f60c3ecf in bdrv_unref (bs=0x5577f7c20970) at qemu/block.c:3425
  #13 0x00005577f60bf951 in bdrv_root_unref_child (child=0x5577f7a2de70) at qemu/block.c:1361
  #14 0x00005577f6112162 in blk_remove_bs (blk=0x5577f7b8ba20) at qemu/block/block-backend.c:491
  #15 0x00005577f6111b1b in blk_remove_all_bs () at qemu/block/block-backend.c:245
  #16 0x00005577f60c1db6 in bdrv_close_all () at qemu/block.c:2382
  #17 0x00005577f5e60cca in main (argc=20, argv=0x7ffea6eb8398, envp=0x7ffea6eb8440) at qemu/vl.c:4684

The key thing is that bdrv_close() uses bdrv_drained_begin(), so
virtio_queue_host_notifier_aio_poll() must not be called.
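
The drained-section contract being relied on here is, in abridged form (a
sketch, not the exact block/io.c code):

    /* Abridged sketch: entering a drained section first disables external
     * handlers (ioeventfd) on the BDS's AioContext, then waits for
     * in-flight requests, so the dataplane handler must stay unreachable
     * until bdrv_drained_end() re-enables it. */
    void bdrv_drained_begin(BlockDriverState *bs)
    {
        aio_disable_external(bdrv_get_aio_context(bs));
        bdrv_drain(bs);
    }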

Thanks to Fam Zheng <famz@redhat.com> for identifying the root cause of
this crash.

Reported-by: Alberto Garcia <berto@igalia.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
---
 aio-posix.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

Comments

Fam Zheng Jan. 24, 2017, 12:04 p.m. UTC | #1
On Tue, 01/24 09:53, Stefan Hajnoczi wrote:
> AioHandlers marked ->is_external must be skipped when aio_node_check()
> fails.  bdrv_drained_begin() needs this to prevent dataplane from
> submitting new I/O requests while another thread accesses the device and
> relies on it being quiesced.
> 
> This patch fixes the following segfault:
> 
>   Program terminated with signal SIGSEGV, Segmentation fault.
>   #0  0x00005577f6127dad in bdrv_io_plug (bs=0x5577f7ae52f0) at qemu/block/io.c:2650
>   2650            bdrv_io_plug(child->bs);
>   [Current thread is 1 (Thread 0x7ff5c4bd1c80 (LWP 10917))]
>   (gdb) bt
>   #0  0x00005577f6127dad in bdrv_io_plug (bs=0x5577f7ae52f0) at qemu/block/io.c:2650
>   #1  0x00005577f6114363 in blk_io_plug (blk=0x5577f7b8ba20) at qemu/block/block-backend.c:1561
>   #2  0x00005577f5d4091d in virtio_blk_handle_vq (s=0x5577f9ada030, vq=0x5577f9b3d2a0) at qemu/hw/block/virtio-blk.c:589
>   #3  0x00005577f5d4240d in virtio_blk_data_plane_handle_output (vdev=0x5577f9ada030, vq=0x5577f9b3d2a0) at qemu/hw/block/dataplane/virtio-blk.c:158
>   #4  0x00005577f5d88acd in virtio_queue_notify_aio_vq (vq=0x5577f9b3d2a0) at qemu/hw/virtio/virtio.c:1304
>   #5  0x00005577f5d8aaaf in virtio_queue_host_notifier_aio_poll (opaque=0x5577f9b3d308) at qemu/hw/virtio/virtio.c:2134
>   #6  0x00005577f60ca077 in run_poll_handlers_once (ctx=0x5577f79ddbb0) at qemu/aio-posix.c:493
>   #7  0x00005577f60ca268 in try_poll_mode (ctx=0x5577f79ddbb0, blocking=true) at qemu/aio-posix.c:569
>   #8  0x00005577f60ca331 in aio_poll (ctx=0x5577f79ddbb0, blocking=true) at qemu/aio-posix.c:601
>   #9  0x00005577f612722a in bdrv_flush (bs=0x5577f7c20970) at qemu/block/io.c:2403
>   #10 0x00005577f60c1b2d in bdrv_close (bs=0x5577f7c20970) at qemu/block.c:2322
>   #11 0x00005577f60c20e7 in bdrv_delete (bs=0x5577f7c20970) at qemu/block.c:2465
>   #12 0x00005577f60c3ecf in bdrv_unref (bs=0x5577f7c20970) at qemu/block.c:3425
>   #13 0x00005577f60bf951 in bdrv_root_unref_child (child=0x5577f7a2de70) at qemu/block.c:1361
>   #14 0x00005577f6112162 in blk_remove_bs (blk=0x5577f7b8ba20) at qemu/block/block-backend.c:491
>   #15 0x00005577f6111b1b in blk_remove_all_bs () at qemu/block/block-backend.c:245
>   #16 0x00005577f60c1db6 in bdrv_close_all () at qemu/block.c:2382
>   #17 0x00005577f5e60cca in main (argc=20, argv=0x7ffea6eb8398, envp=0x7ffea6eb8440) at qemu/vl.c:4684
> 
> The key thing is that bdrv_close() uses bdrv_drained_begin() and
> virtio_queue_host_notifier_aio_poll() must not be called.
> 
> Thanks to Fam Zheng <famz@redhat.com> for identifying the root cause of
> this crash.
> 
> Reported-by: Alberto Garcia <berto@igalia.com>
> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
> ---
>  aio-posix.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/aio-posix.c b/aio-posix.c
> index 9453d83..a8d7090 100644
> --- a/aio-posix.c
> +++ b/aio-posix.c
> @@ -508,7 +508,8 @@ static bool run_poll_handlers_once(AioContext *ctx)
>  
>      QLIST_FOREACH_RCU(node, &ctx->aio_handlers, node) {
>          if (!node->deleted && node->io_poll &&
> -                node->io_poll(node->opaque)) {
> +            aio_node_check(ctx, node->is_external) &&
> +            node->io_poll(node->opaque)) {
>              progress = true;
>          }
>  
> -- 
> 2.9.3
> 
> 

The patch is not wrong and I believe it is enough to fix this crash; however,
it is not sufficient on its own...

All in all I think we should skip external handlers regardless of
aio_disable_external(), or even skip try_poll_mode() entirely, in nested
aio_poll()s.  The reasons are: 1) many nested aio_poll()s are not wrapped in
bdrv_drained_begin(), so this check alone is not sufficient; 2) aio_poll() on
qemu_aio_context did not look at ioeventfds before, but that changed with the
addition of try_poll_mode(), which is not quite correct.

These two factors combined make it possible for bdrv_flush() etc. to spin
longer than necessary, if not forever, when the guest keeps submitting more
requests through ioeventfd.

Fam
Paolo Bonzini Jan. 24, 2017, 12:15 p.m. UTC | #2
On 24/01/2017 13:04, Fam Zheng wrote:
> 
> All in all I think we should skip external handlers regardless of
> aio_disable_external(), or even skip try_poll_mode, in nested aio_poll()'s. The
> reasons are 1) many nested aio_poll()'s don't have bdrv_drained_begin, so this
> check is not sufficient [...] bdrv_flush()
> spin longer than necessary, if not forever, when the guest keeps submitting more
> requests with ioeventfd.


I'm not sure I understand why this is related.  aio_poll() only tries
poll mode once, so bdrv_flush would only spin until the fsync is complete.

Nested aio_poll()s don't have bdrv_drained_begin() because draining matters
over the whole section where you need atomicity (e.g. taking a live
snapshot).  It doesn't matter for a single I/O operation.
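
To illustrate the distinction (a hypothetical caller, not code from the tree):
draining brackets a multi-step operation that must not race with new guest
I/O, while a single flush simply runs to completion inside one aio_poll() loop.

    bdrv_drained_begin(bs);   /* external (ioeventfd) handlers held off */
    /* ... e.g. install a snapshot overlay on top of bs atomically ... */
    bdrv_drained_end(bs);     /* dataplane may submit requests again */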

Paolo
Fam Zheng Jan. 24, 2017, 12:47 p.m. UTC | #3
On Tue, 01/24 13:15, Paolo Bonzini wrote:
> 
> 
> On 24/01/2017 13:04, Fam Zheng wrote:
> > 
> > All in all I think we should skip external handlers regardless of
> > aio_disable_external(), or even skip try_poll_mode, in nested aio_poll()'s. The
> > reasons are 1) many nested aio_poll()'s don't have bdrv_drained_begin, so this
> > check is not sufficient [...] bdrv_flush()
> > spin longer than necessary, if not forever, when the guest keeps submitting more
> > requests with ioeventfd.
> 
> 
> I'm not sure I understand why this is related.  aio_poll() only tries
> poll mode once, so bdrv_flush would only spin until the fsync is complete.

Right, I was confused.  The problematic ones are "drain"-style loops that track
an in-flight counter.  The only suspicious one is in v9fs_reset(); otherwise we
are safe!
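
The pattern referred to above looks roughly like this (an illustrative sketch
with hypothetical fields, not the actual v9fs code):

    /* A "drain"-style wait loop: if nested aio_poll() keeps dispatching
     * external handlers that start new requests, reqs_in_flight may never
     * reach zero and the loop can spin indefinitely. */
    while (s->reqs_in_flight > 0) {
        aio_poll(qemu_get_aio_context(), true);
    }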

Fam
Paolo Bonzini Jan. 24, 2017, 12:51 p.m. UTC | #4
On 24/01/2017 13:47, Fam Zheng wrote:
>> I'm not sure I understand why this is related.  aio_poll() only tries
>> poll mode once, so bdrv_flush would only spin until the fsync is complete.
>
> Right, I was confused.  The problematic ones are "drain"-style loops that track
> an in-flight counter.  The only suspicious one is in v9fs_reset(); otherwise we
> are safe!

And v9fs_reset() in turn is fine because it doesn't use
virtio_queue_aio_set_host_notifier_handler() (so it goes to
event_notifier_set_handler() and then iohandler_ctx, not
qemu_get_aio_context()).

Paolo
Fam Zheng Jan. 24, 2017, 1:40 p.m. UTC | #5
On Tue, 01/24 09:53, Stefan Hajnoczi wrote:
> AioHandlers marked ->is_external must be skipped when aio_node_check()
> fails.  bdrv_drained_begin() needs this to prevent dataplane from
> submitting new I/O requests while another thread accesses the device and
> relies on it being quiesced.
> 
> This patch fixes the following segfault:
> 
>   Program terminated with signal SIGSEGV, Segmentation fault.
>   #0  0x00005577f6127dad in bdrv_io_plug (bs=0x5577f7ae52f0) at qemu/block/io.c:2650
>   2650            bdrv_io_plug(child->bs);
>   [Current thread is 1 (Thread 0x7ff5c4bd1c80 (LWP 10917))]
>   (gdb) bt
>   #0  0x00005577f6127dad in bdrv_io_plug (bs=0x5577f7ae52f0) at qemu/block/io.c:2650
>   #1  0x00005577f6114363 in blk_io_plug (blk=0x5577f7b8ba20) at qemu/block/block-backend.c:1561
>   #2  0x00005577f5d4091d in virtio_blk_handle_vq (s=0x5577f9ada030, vq=0x5577f9b3d2a0) at qemu/hw/block/virtio-blk.c:589
>   #3  0x00005577f5d4240d in virtio_blk_data_plane_handle_output (vdev=0x5577f9ada030, vq=0x5577f9b3d2a0) at qemu/hw/block/dataplane/virtio-blk.c:158
>   #4  0x00005577f5d88acd in virtio_queue_notify_aio_vq (vq=0x5577f9b3d2a0) at qemu/hw/virtio/virtio.c:1304
>   #5  0x00005577f5d8aaaf in virtio_queue_host_notifier_aio_poll (opaque=0x5577f9b3d308) at qemu/hw/virtio/virtio.c:2134
>   #6  0x00005577f60ca077 in run_poll_handlers_once (ctx=0x5577f79ddbb0) at qemu/aio-posix.c:493
>   #7  0x00005577f60ca268 in try_poll_mode (ctx=0x5577f79ddbb0, blocking=true) at qemu/aio-posix.c:569
>   #8  0x00005577f60ca331 in aio_poll (ctx=0x5577f79ddbb0, blocking=true) at qemu/aio-posix.c:601
>   #9  0x00005577f612722a in bdrv_flush (bs=0x5577f7c20970) at qemu/block/io.c:2403
>   #10 0x00005577f60c1b2d in bdrv_close (bs=0x5577f7c20970) at qemu/block.c:2322
>   #11 0x00005577f60c20e7 in bdrv_delete (bs=0x5577f7c20970) at qemu/block.c:2465
>   #12 0x00005577f60c3ecf in bdrv_unref (bs=0x5577f7c20970) at qemu/block.c:3425
>   #13 0x00005577f60bf951 in bdrv_root_unref_child (child=0x5577f7a2de70) at qemu/block.c:1361
>   #14 0x00005577f6112162 in blk_remove_bs (blk=0x5577f7b8ba20) at qemu/block/block-backend.c:491
>   #15 0x00005577f6111b1b in blk_remove_all_bs () at qemu/block/block-backend.c:245
>   #16 0x00005577f60c1db6 in bdrv_close_all () at qemu/block.c:2382
>   #17 0x00005577f5e60cca in main (argc=20, argv=0x7ffea6eb8398, envp=0x7ffea6eb8440) at qemu/vl.c:4684
> 
> The key thing is that bdrv_close() uses bdrv_drained_begin() and
> virtio_queue_host_notifier_aio_poll() must not be called.
> 
> Thanks to Fam Zheng <famz@redhat.com> for identifying the root cause of
> this crash.
> 
> Reported-by: Alberto Garcia <berto@igalia.com>
> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>

Paolo has addressed my concern, so this looks good to me now!

Reviewed-by: Fam Zheng <famz@redhat.com>
Alberto Garcia Jan. 24, 2017, 2:31 p.m. UTC | #6
On Tue 24 Jan 2017 10:53:50 AM CET, Stefan Hajnoczi wrote:
> AioHandlers marked ->is_external must be skipped when aio_node_check()
> fails.  bdrv_drained_begin() needs this to prevent dataplane from
> submitting new I/O requests while another thread accesses the device and
> relies on it being quiesced.
>
> Thanks to Fam Zheng <famz@redhat.com> for identifying the root cause of
> this crash.
>
> Reported-by: Alberto Garcia <berto@igalia.com>
> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>

Tested-by: Alberto Garcia <berto@igalia.com>

Berto
Stefan Hajnoczi Jan. 25, 2017, 1:16 p.m. UTC | #7
On Tue, Jan 24, 2017 at 09:53:50AM +0000, Stefan Hajnoczi wrote:
> AioHandlers marked ->is_external must be skipped when aio_node_check()
> fails.  bdrv_drained_begin() needs this to prevent dataplane from
> submitting new I/O requests while another thread accesses the device and
> relies on it being quiesced.
> 
> This patch fixes the following segfault:
> 
>   Program terminated with signal SIGSEGV, Segmentation fault.
>   #0  0x00005577f6127dad in bdrv_io_plug (bs=0x5577f7ae52f0) at qemu/block/io.c:2650
>   2650            bdrv_io_plug(child->bs);
>   [Current thread is 1 (Thread 0x7ff5c4bd1c80 (LWP 10917))]
>   (gdb) bt
>   #0  0x00005577f6127dad in bdrv_io_plug (bs=0x5577f7ae52f0) at qemu/block/io.c:2650
>   #1  0x00005577f6114363 in blk_io_plug (blk=0x5577f7b8ba20) at qemu/block/block-backend.c:1561
>   #2  0x00005577f5d4091d in virtio_blk_handle_vq (s=0x5577f9ada030, vq=0x5577f9b3d2a0) at qemu/hw/block/virtio-blk.c:589
>   #3  0x00005577f5d4240d in virtio_blk_data_plane_handle_output (vdev=0x5577f9ada030, vq=0x5577f9b3d2a0) at qemu/hw/block/dataplane/virtio-blk.c:158
>   #4  0x00005577f5d88acd in virtio_queue_notify_aio_vq (vq=0x5577f9b3d2a0) at qemu/hw/virtio/virtio.c:1304
>   #5  0x00005577f5d8aaaf in virtio_queue_host_notifier_aio_poll (opaque=0x5577f9b3d308) at qemu/hw/virtio/virtio.c:2134
>   #6  0x00005577f60ca077 in run_poll_handlers_once (ctx=0x5577f79ddbb0) at qemu/aio-posix.c:493
>   #7  0x00005577f60ca268 in try_poll_mode (ctx=0x5577f79ddbb0, blocking=true) at qemu/aio-posix.c:569
>   #8  0x00005577f60ca331 in aio_poll (ctx=0x5577f79ddbb0, blocking=true) at qemu/aio-posix.c:601
>   #9  0x00005577f612722a in bdrv_flush (bs=0x5577f7c20970) at qemu/block/io.c:2403
>   #10 0x00005577f60c1b2d in bdrv_close (bs=0x5577f7c20970) at qemu/block.c:2322
>   #11 0x00005577f60c20e7 in bdrv_delete (bs=0x5577f7c20970) at qemu/block.c:2465
>   #12 0x00005577f60c3ecf in bdrv_unref (bs=0x5577f7c20970) at qemu/block.c:3425
>   #13 0x00005577f60bf951 in bdrv_root_unref_child (child=0x5577f7a2de70) at qemu/block.c:1361
>   #14 0x00005577f6112162 in blk_remove_bs (blk=0x5577f7b8ba20) at qemu/block/block-backend.c:491
>   #15 0x00005577f6111b1b in blk_remove_all_bs () at qemu/block/block-backend.c:245
>   #16 0x00005577f60c1db6 in bdrv_close_all () at qemu/block.c:2382
>   #17 0x00005577f5e60cca in main (argc=20, argv=0x7ffea6eb8398, envp=0x7ffea6eb8440) at qemu/vl.c:4684
> 
> The key thing is that bdrv_close() uses bdrv_drained_begin() and
> virtio_queue_host_notifier_aio_poll() must not be called.
> 
> Thanks to Fam Zheng <famz@redhat.com> for identifying the root cause of
> this crash.
> 
> Reported-by: Alberto Garcia <berto@igalia.com>
> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
> ---
>  aio-posix.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)

Thanks, applied to my block tree:
https://github.com/stefanha/qemu/commits/block

Stefan

Patch

diff --git a/aio-posix.c b/aio-posix.c
index 9453d83..a8d7090 100644
--- a/aio-posix.c
+++ b/aio-posix.c
@@ -508,7 +508,8 @@ static bool run_poll_handlers_once(AioContext *ctx)
 
     QLIST_FOREACH_RCU(node, &ctx->aio_handlers, node) {
         if (!node->deleted && node->io_poll &&
-                node->io_poll(node->opaque)) {
+            aio_node_check(ctx, node->is_external) &&
+            node->io_poll(node->opaque)) {
             progress = true;
         }