Message ID: 20180208171807.24267-2-stefanha@redhat.com
State:      New
Series:     block: fix blk_aio_*() segfault when blk->root == NULL
On 02/08/2018 11:18 AM, Stefan Hajnoczi wrote:
> BlockBackend currently relies on BlockDriverState->in_flight to track
> requests for blk_drain(). There is a corner case where
> BlockDriverState->in_flight cannot be used though: blk->root can be NULL
> when there is no medium. This results in a segfault when the NULL
> pointer is dereferenced.
>
> Introduce a BlockBackend->in_flight counter for aio requests so it works
> even when blk->root == NULL.
>
> Based on a patch by Kevin Wolf <kwolf@redhat.com>.
>
> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
> ---
>  block.c               |  2 +-
>  block/block-backend.c | 59 +++++++++++++++++++++++++++++++++++++++++++++------
>  2 files changed, 53 insertions(+), 8 deletions(-)

Reviewed-by: Eric Blake <eblake@redhat.com>
On 08/02/2018 18:18, Stefan Hajnoczi wrote:
> +    BlockDriverState *bs = blk_bs(blk);
> +
> +    if (bs) {
> +        bdrv_drained_begin(bs);
> +    }
> +
> +    /* We may have aio requests like -ENOMEDIUM in flight */
> +    while (atomic_mb_read(&blk->in_flight) > 0) {
> +        aio_poll(blk_get_aio_context(blk), true);
> +    }
> +
> +    if (bs) {
> +        bdrv_drained_end(bs);

This only works if you are in the default AioContext; otherwise you can
run handlers for one AioContext in two threads.  bdrv_dec_in_flight uses
bdrv_wakeup and BDRV_POLL_WHILE to ensure that this doesn't happen, so
there would be more code that you have to copy into block-backend.c.

Paolo
Am 09.02.2018 um 17:28 hat Paolo Bonzini geschrieben:
> On 08/02/2018 18:18, Stefan Hajnoczi wrote:
> > +    BlockDriverState *bs = blk_bs(blk);
> > +
> > +    if (bs) {
> > +        bdrv_drained_begin(bs);
> > +    }
> > +
> > +    /* We may have aio requests like -ENOMEDIUM in flight */
> > +    while (atomic_mb_read(&blk->in_flight) > 0) {
> > +        aio_poll(blk_get_aio_context(blk), true);
> > +    }
> > +
> > +    if (bs) {
> > +        bdrv_drained_end(bs);
>
> This only works if you are in the default AioContext, otherwise you can
> run handlers for one AioContexts in two threads.

Should aio_poll() assert that it's called in the right thread to make
sure we obey the rules?

> bdrv_dec_in_flight uses bdrv_wakeup and BDRV_POLL_WHILE to ensure that
> this doesn't happen, so there would be more code that you have to copy
> into block-backend.c.

Instead of copying, can't we generalise it into a POLL_WHILE(ctx,
wakeup, cond) and make BDRV_POLL_WHILE() a wrapper for that?

Kevin
On 09/02/2018 18:23, Kevin Wolf wrote:
> Am 09.02.2018 um 17:28 hat Paolo Bonzini geschrieben:
>> On 08/02/2018 18:18, Stefan Hajnoczi wrote:
>>> +    BlockDriverState *bs = blk_bs(blk);
>>> +
>>> +    if (bs) {
>>> +        bdrv_drained_begin(bs);
>>> +    }
>>> +
>>> +    /* We may have aio requests like -ENOMEDIUM in flight */
>>> +    while (atomic_mb_read(&blk->in_flight) > 0) {
>>> +        aio_poll(blk_get_aio_context(blk), true);
>>> +    }
>>> +
>>> +    if (bs) {
>>> +        bdrv_drained_end(bs);
>>
>> This only works if you are in the default AioContext, otherwise you can
>> run handlers for one AioContexts in two threads.
>
> Should aio_poll() assert that it's called in the right thread to make
> sure we obey the rules?
>
>> bdrv_dec_in_flight uses bdrv_wakeup and BDRV_POLL_WHILE to ensure that
>> this doesn't happen, so there would be more code that you have to copy
>> into block-backend.c.
>
> Instead of copying, can't we generalise it into a POLL_WHILE(ctx,
> wakeup, cond) and make BDRV_POLL_WHILE() a wrapper for that?

Yes, or even move bdrv_wakeup to AioContext would do.  We already have
block layer-specific fields such as the linux-aio state(*).

(*) though now that linux-aio has been improved to do something
    like epoll, there may be a better reason to place linux-aio
    state in AioContext.

Paolo
On Fri, Feb 09, 2018 at 06:26:41PM +0100, Paolo Bonzini wrote:
> On 09/02/2018 18:23, Kevin Wolf wrote:
> > Am 09.02.2018 um 17:28 hat Paolo Bonzini geschrieben:
> >> On 08/02/2018 18:18, Stefan Hajnoczi wrote:
> >>> +    BlockDriverState *bs = blk_bs(blk);
> >>> +
> >>> +    if (bs) {
> >>> +        bdrv_drained_begin(bs);
> >>> +    }
> >>> +
> >>> +    /* We may have aio requests like -ENOMEDIUM in flight */
> >>> +    while (atomic_mb_read(&blk->in_flight) > 0) {
> >>> +        aio_poll(blk_get_aio_context(blk), true);
> >>> +    }
> >>> +
> >>> +    if (bs) {
> >>> +        bdrv_drained_end(bs);
> >>
> >> This only works if you are in the default AioContext, otherwise you can
> >> run handlers for one AioContexts in two threads.
> >
> > Should aio_poll() assert that it's called in the right thread to make
> > sure we obey the rules?
> >
> >> bdrv_dec_in_flight uses bdrv_wakeup and BDRV_POLL_WHILE to ensure that
> >> this doesn't happen, so there would be more code that you have to copy
> >> into block-backend.c.
> >
> > Instead of copying, can't we generalise it into a POLL_WHILE(ctx,
> > wakeup, cond) and make BDRV_POLL_WHILE() a wrapper for that?
>
> Yes, or even move bdrv_wakeup to AioContext would do.  We already have
> block layer-specific fields such as the linux-aio state(*).
>
> (*) though now that linux-aio has been improved to do something
>     like epoll, there may be a better reason to place linux-aio
>     state in AioContext.

Thanks for the ideas, will fix in v2.

Stefan
diff --git a/block.c b/block.c
index a8da4f2b25..140f7919f8 100644
--- a/block.c
+++ b/block.c
@@ -4716,7 +4716,7 @@ out:
 
 AioContext *bdrv_get_aio_context(BlockDriverState *bs)
 {
-    return bs->aio_context;
+    return bs ? bs->aio_context : qemu_get_aio_context();
 }
 
 void bdrv_coroutine_enter(BlockDriverState *bs, Coroutine *co)
diff --git a/block/block-backend.c b/block/block-backend.c
index baef8e7abc..4811d8bc55 100644
--- a/block/block-backend.c
+++ b/block/block-backend.c
@@ -71,6 +71,13 @@ struct BlockBackend {
     int quiesce_counter;
     VMChangeStateEntry *vmsh;
     bool force_allow_inactivate;
+
+    /* Number of in-flight aio requests.  BlockDriverState also counts
+     * in-flight requests but aio requests can exist even when blk->root is
+     * NULL, so we cannot rely on its counter for that case.
+     * Accessed with atomic ops.
+     */
+    unsigned int in_flight;
 };
 
 typedef struct BlockBackendAIOCB {
@@ -1223,11 +1230,21 @@ int blk_make_zero(BlockBackend *blk, BdrvRequestFlags flags)
     return bdrv_make_zero(blk->root, flags);
 }
 
+static void blk_inc_in_flight(BlockBackend *blk)
+{
+    atomic_inc(&blk->in_flight);
+}
+
+static void blk_dec_in_flight(BlockBackend *blk)
+{
+    atomic_dec(&blk->in_flight);
+}
+
 static void error_callback_bh(void *opaque)
 {
     struct BlockBackendAIOCB *acb = opaque;
 
-    bdrv_dec_in_flight(acb->common.bs);
+    blk_dec_in_flight(acb->blk);
     acb->common.cb(acb->common.opaque, acb->ret);
     qemu_aio_unref(acb);
 }
@@ -1238,7 +1255,7 @@ BlockAIOCB *blk_abort_aio_request(BlockBackend *blk,
 {
     struct BlockBackendAIOCB *acb;
 
-    bdrv_inc_in_flight(blk_bs(blk));
+    blk_inc_in_flight(blk);
     acb = blk_aio_get(&block_backend_aiocb_info, blk, cb, opaque);
     acb->blk = blk;
     acb->ret = ret;
@@ -1261,7 +1278,7 @@ static const AIOCBInfo blk_aio_em_aiocb_info = {
 static void blk_aio_complete(BlkAioEmAIOCB *acb)
 {
     if (acb->has_returned) {
-        bdrv_dec_in_flight(acb->common.bs);
+        blk_dec_in_flight(acb->rwco.blk);
         acb->common.cb(acb->common.opaque, acb->rwco.ret);
         qemu_aio_unref(acb);
     }
@@ -1282,7 +1299,7 @@ static BlockAIOCB *blk_aio_prwv(BlockBackend *blk, int64_t offset, int bytes,
     BlkAioEmAIOCB *acb;
     Coroutine *co;
 
-    bdrv_inc_in_flight(blk_bs(blk));
+    blk_inc_in_flight(blk);
     acb = blk_aio_get(&blk_aio_em_aiocb_info, blk, cb, opaque);
     acb->rwco = (BlkRwCo) {
         .blk    = blk,
@@ -1519,14 +1536,42 @@ int blk_flush(BlockBackend *blk)
 
 void blk_drain(BlockBackend *blk)
 {
-    if (blk_bs(blk)) {
-        bdrv_drain(blk_bs(blk));
+    BlockDriverState *bs = blk_bs(blk);
+
+    if (bs) {
+        bdrv_drained_begin(bs);
+    }
+
+    /* We may have aio requests like -ENOMEDIUM in flight */
+    while (atomic_mb_read(&blk->in_flight) > 0) {
+        aio_poll(blk_get_aio_context(blk), true);
+    }
+
+    if (bs) {
+        bdrv_drained_end(bs);
     }
 }
 
 void blk_drain_all(void)
 {
-    bdrv_drain_all();
+    BlockBackend *blk = NULL;
+
+    bdrv_drain_all_begin();
+
+    while ((blk = blk_all_next(blk)) != NULL) {
+        AioContext *ctx = blk_get_aio_context(blk);
+
+        aio_context_acquire(ctx);
+
+        /* We may have aio requests like -ENOMEDIUM in flight */
+        while (atomic_mb_read(&blk->in_flight) > 0) {
+            aio_poll(blk_get_aio_context(blk), true);
+        }
+
+        aio_context_release(ctx);
+    }
+
+    bdrv_drain_all_end();
 }
 
 void blk_set_on_error(BlockBackend *blk, BlockdevOnError on_read_error,