Message ID | f88b9c1e-6331-8c7f-b748-62dc11a50c1e@redhat.com |
---|---|
State | New |
Headers | show |
Series | coroutine question, for NBD debugging | expand |
On 11/03/2017 03:03 PM, Eric Blake wrote: > In include/qemu/coroutine.h, we have: > > /** > * Yield the coroutine for a given duration > * > * Behaves similarly to co_sleep_ns(), but the sleeping coroutine will be > * resumed when using aio_poll(). > */ > void coroutine_fn co_aio_sleep_ns(AioContext *ctx, QEMUClockType type, > > but there is no co_sleep_ns() anywhere. What should the documentation > actually state? This part still needs fixing; but partially answering myself: > > Meanwhile, I'm trying to figure out if there is an easy way to hack > qemu-nbd to send two structured replies interleaved (as in A.1, B.1, > A.2, B.2) to ensure the NBD client coroutines properly handle > interleaved responses, but adding a plain sleep() between A.1 and A.2 > does not work, and I'm not sure whether co_aio_sleep_ns() would work for > the hack instead. > > The hack I'm currently playing with splits reads and structured errors > into two parts, attempting to use sleeps to force the server to send > replies out of order: > > + sleep(2); Using this instead of sleep(): + co_aio_sleep_ns(client->exp->ctx, QEMU_CLOCK_REALTIME, 2000000000); > coupled with a client that does: > > $ ./qemu-io -f raw nbd://localhost:10809/foo --trace='nbd_*' -c > 'aio_write -P 3 1 1' -c 'aio_read -P 1 1 1' > produced the trace I was looking for: 2837@1509745520.815687:nbd_send_request Sending request to server: { .from = 1, .len = 1, .handle = 93924431318768, .flags = 0x1, .type = 1 (write) } 2837@1509745520.815764:nbd_send_request Sending request to server: { .from = 1, .len = 1, .handle = 93924431318769, .flags = 0x0, .type = 0 (read) } 2837@1509745520.815913:nbd_receive_structured_reply_chunk Got structured reply chunk: { flags = 0x0, type = 32769, handle = 93924431318768, length = 25 } 2837@1509745520.856401:nbd_receive_structured_reply_chunk Got structured reply chunk: { flags = 0x0, type = 1, handle = 93924431318769, length = 9 } 2837@1509745521.817279:nbd_receive_structured_reply_chunk Got structured reply chunk: { flags = 0x1, type = 0, handle = 93924431318768, length = 0 } aio_write failed: Operation not permitted 2837@1509745522.817474:nbd_receive_structured_reply_chunk Got structured reply chunk: { flags = 0x1, type = 0, handle = 93924431318769, length = 0 } read 1/1 bytes at offset 1 1 bytes, 1 ops; 0:00:02.00 (0.499560 bytes/sec and 0.4996 ops/sec) which means I'm fairly confident that our client implementation is doing the right thing with regards to complex out-of-order/interleaved structured replies! Always a nice thing to demonstrate.
On 03/11/2017 21:03, Eric Blake wrote: > In include/qemu/coroutine.h, we have: > > /** > * Yield the coroutine for a given duration > * > * Behaves similarly to co_sleep_ns(), but the sleeping coroutine will be > * resumed when using aio_poll(). > */ > void coroutine_fn co_aio_sleep_ns(AioContext *ctx, QEMUClockType type, > > but there is no co_sleep_ns() anywhere. What should the documentation > actually state? The old doc comment for co_sleep_ns was: /** * Yield the coroutine for a given duration * * Note this function uses timers and hence only works when a main loop is in * use. See main-loop.h and do not use from qemu-tool programs. */ So probably: This function uses timers and hence needs to know the event loop (#AioContext) to place the timer on. In any case, co_aio_sleep_ns() does not affect the #AioContext where the current coroutine is running, as the coroutine will restart on the same #AioContext that it is running on. Thanks, Paolo > > Meanwhile, I'm trying to figure out if there is an easy way to hack > qemu-nbd to send two structured replies interleaved (as in A.1, B.1, > A.2, B.2) to ensure the NBD client coroutines properly handle > interleaved responses, but adding a plain sleep() between A.1 and A.2 > does not work, and I'm not sure whether co_aio_sleep_ns() would work for > the hack instead. > > The hack I'm currently playing with splits reads and structured errors > into two parts, attempting to use sleeps to force the server to send > replies out of order: > > diff --git c/nbd/server.c w/nbd/server.c > index bcf0cdb47c..4b642127bb 100644 > --- c/nbd/server.c > +++ w/nbd/server.c > @@ -1285,13 +1285,24 @@ static int coroutine_fn > nbd_co_send_structured_read(NBDClient *client, > {.iov_base = &chunk, .iov_len = sizeof(chunk)}, > {.iov_base = data, .iov_len = size} > }; > + int ret; > > trace_nbd_co_send_structured_read(handle, offset, data, size); > - set_be_chunk(&chunk.h, NBD_REPLY_FLAG_DONE, NBD_REPLY_TYPE_OFFSET_DATA, > + set_be_chunk(&chunk.h, 0, NBD_REPLY_TYPE_OFFSET_DATA, > handle, sizeof(chunk) - sizeof(chunk.h) + size); > stq_be_p(&chunk.offset, offset); > > - return nbd_co_send_iov(client, iov, 2, errp); > + ret = nbd_co_send_iov(client, iov, 2, errp); > + if (ret < 0) { > + return ret; > + } > + > + sleep(2); > + > + set_be_chunk(&chunk.h, NBD_REPLY_FLAG_DONE, NBD_REPLY_TYPE_NONE, > + handle, 0); > + iov[0].iov_len = sizeof(chunk.h); > + return nbd_co_send_iov(client, iov, 1, errp); > } > > static int coroutine_fn nbd_co_send_structured_error(NBDClient *client, > @@ -1306,16 +1317,27 @@ static int coroutine_fn > nbd_co_send_structured_error(NBDClient *client, > {.iov_base = &chunk, .iov_len = sizeof(chunk)}, > {.iov_base = (char *)msg, .iov_len = msg ? strlen(msg) : 0}, > }; > + int ret; > > assert(nbd_err); > trace_nbd_co_send_structured_error(handle, nbd_err, > nbd_err_lookup(nbd_err), msg ? > msg : ""); > - set_be_chunk(&chunk.h, NBD_REPLY_FLAG_DONE, NBD_REPLY_TYPE_ERROR, > handle, > + set_be_chunk(&chunk.h, 0, NBD_REPLY_TYPE_ERROR, handle, > sizeof(chunk) - sizeof(chunk.h) + iov[1].iov_len); > stl_be_p(&chunk.error, nbd_err); > stw_be_p(&chunk.message_length, iov[1].iov_len); > > - return nbd_co_send_iov(client, iov, 1 + !!iov[1].iov_len, errp); > + ret = nbd_co_send_iov(client, iov, 1 + !!iov[1].iov_len, errp); > + if (ret < 0) { > + return ret; > + } > + > + sleep(1); > + > + set_be_chunk(&chunk.h, NBD_REPLY_FLAG_DONE, NBD_REPLY_TYPE_NONE, > + handle, 0); > + iov[0].iov_len = sizeof(chunk.h); > + return nbd_co_send_iov(client, iov, 1, errp); > } > > /* nbd_co_receive_request > > > coupled with a client that does: > > $ ./qemu-io -f raw nbd://localhost:10809/foo --trace='nbd_*' -c > 'aio_write -P 3 1 1' -c 'aio_read -P 1 1 1' > > but the trace shows that the client does not receive B.1 until after it > has blocked waiting for A.2, which is not what I was hoping to see. >
diff --git c/nbd/server.c w/nbd/server.c index bcf0cdb47c..4b642127bb 100644 --- c/nbd/server.c +++ w/nbd/server.c @@ -1285,13 +1285,24 @@ static int coroutine_fn nbd_co_send_structured_read(NBDClient *client, {.iov_base = &chunk, .iov_len = sizeof(chunk)}, {.iov_base = data, .iov_len = size} }; + int ret; trace_nbd_co_send_structured_read(handle, offset, data, size); - set_be_chunk(&chunk.h, NBD_REPLY_FLAG_DONE, NBD_REPLY_TYPE_OFFSET_DATA, + set_be_chunk(&chunk.h, 0, NBD_REPLY_TYPE_OFFSET_DATA, handle, sizeof(chunk) - sizeof(chunk.h) + size); stq_be_p(&chunk.offset, offset); - return nbd_co_send_iov(client, iov, 2, errp); + ret = nbd_co_send_iov(client, iov, 2, errp); + if (ret < 0) { + return ret; + } + + sleep(2); + + set_be_chunk(&chunk.h, NBD_REPLY_FLAG_DONE, NBD_REPLY_TYPE_NONE, + handle, 0); + iov[0].iov_len = sizeof(chunk.h); + return nbd_co_send_iov(client, iov, 1, errp); }