Message ID | 20190712173600.14554-2-mreitz@redhat.com |
---|---|
State | New |
Headers | show |
Series | block: Generic file creation fallback | expand |
On Fri, 2019-07-12 at 19:35 +0200, Max Reitz wrote: > When nbd_close() is called from a coroutine, the connection_co never > gets to run, and thus nbd_teardown_connection() hangs. > > This is because aio_co_enter() only puts the connection_co into the main > coroutine's wake-up queue, so this main coroutine needs to yield and > wait for connection_co to terminate. After diving into NBD's co-routines (this is 2nd time I do this) and speaking about this with Max Reitz on IRC, could I suggest to extend the explanation a bit, something like that: When nbd_close() is called from a coroutine, the connection_co never gets to run, and thus nbd_teardown_connection() hangs. This happens because the connection_co is woken up by nbd_teardown_connection() by closing the socket, which wakes up the IO channel on which the connection_co is blocked. However connection_co is waken up by aio_co_wake, which has an assumption that if the caller is already in a coroutine, the caller doesn't switch immediately to the woken coroutine, but rather it adds the coroutine to list of coroutines to wake immediately after the current co-routine yields/terminates (self->co_queue_wakeup) Since we instead do aio_poll, that never happens. The patch itself looks fine, so Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com> Best regards, Maxim Levitsky > > Suggested-by: Kevin Wolf <kwolf@redhat.com> > Signed-off-by: Max Reitz <mreitz@redhat.com> > --- > block/nbd.c | 14 +++++++++++++- > 1 file changed, 13 insertions(+), 1 deletion(-) > > diff --git a/block/nbd.c b/block/nbd.c > index 81edabbf35..8f5ee86842 100644 > --- a/block/nbd.c > +++ b/block/nbd.c > @@ -61,6 +61,7 @@ typedef struct BDRVNBDState { > CoMutex send_mutex; > CoQueue free_sema; > Coroutine *connection_co; > + Coroutine *teardown_co; > int in_flight; > > NBDClientRequest requests[MAX_NBD_REQUESTS]; > @@ -135,7 +136,15 @@ static void nbd_teardown_connection(BlockDriverState *bs) > qio_channel_shutdown(s->ioc, > QIO_CHANNEL_SHUTDOWN_BOTH, > NULL); > - BDRV_POLL_WHILE(bs, s->connection_co); > + if (qemu_in_coroutine()) { > + s->teardown_co = qemu_coroutine_self(); > + /* connection_co resumes us when it terminates */ > + qemu_coroutine_yield(); > + s->teardown_co = NULL; > + } else { > + BDRV_POLL_WHILE(bs, s->connection_co); > + } > + assert(!s->connection_co); > > nbd_client_detach_aio_context(bs); > object_unref(OBJECT(s->sioc)); > @@ -207,6 +216,9 @@ static coroutine_fn void nbd_connection_entry(void *opaque) > bdrv_dec_in_flight(s->bs); > > s->connection_co = NULL; > + if (s->teardown_co) { > + aio_co_wake(s->teardown_co); > + } > aio_wait_kick(); > } >
diff --git a/block/nbd.c b/block/nbd.c index 81edabbf35..8f5ee86842 100644 --- a/block/nbd.c +++ b/block/nbd.c @@ -61,6 +61,7 @@ typedef struct BDRVNBDState { CoMutex send_mutex; CoQueue free_sema; Coroutine *connection_co; + Coroutine *teardown_co; int in_flight; NBDClientRequest requests[MAX_NBD_REQUESTS]; @@ -135,7 +136,15 @@ static void nbd_teardown_connection(BlockDriverState *bs) qio_channel_shutdown(s->ioc, QIO_CHANNEL_SHUTDOWN_BOTH, NULL); - BDRV_POLL_WHILE(bs, s->connection_co); + if (qemu_in_coroutine()) { + s->teardown_co = qemu_coroutine_self(); + /* connection_co resumes us when it terminates */ + qemu_coroutine_yield(); + s->teardown_co = NULL; + } else { + BDRV_POLL_WHILE(bs, s->connection_co); + } + assert(!s->connection_co); nbd_client_detach_aio_context(bs); object_unref(OBJECT(s->sioc)); @@ -207,6 +216,9 @@ static coroutine_fn void nbd_connection_entry(void *opaque) bdrv_dec_in_flight(s->bs); s->connection_co = NULL; + if (s->teardown_co) { + aio_co_wake(s->teardown_co); + } aio_wait_kick(); }
When nbd_close() is called from a coroutine, the connection_co never gets to run, and thus nbd_teardown_connection() hangs. This is because aio_co_enter() only puts the connection_co into the main coroutine's wake-up queue, so this main coroutine needs to yield and wait for connection_co to terminate. Suggested-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Max Reitz <mreitz@redhat.com> --- block/nbd.c | 14 +++++++++++++- 1 file changed, 13 insertions(+), 1 deletion(-)