linux-aio: fix submit aio as a batch

Message ID 1404834310-12856-1-git-send-email-ming.lei@canonical.com
State New

Commit Message

Ming Lei July 8, 2014, 3:45 p.m. UTC
In the enqueue path, we can't complete requests, otherwise
a "Co-routine re-entered recursively" error may be triggered, so this
patch fixes the issue with the following ideas:

	- for -EAGAIN, retry the submission in a newly introduced event handler
	- for partial submission, just update the io queue, since it is
	still making progress
	- for any other failure, return the error if in the enqueue path;
	otherwise, abort all queued I/O

Signed-off-by: Ming Lei <ming.lei@canonical.com>
---
 block/linux-aio.c |   85 +++++++++++++++++++++++++++++++++++++++--------------
 1 file changed, 63 insertions(+), 22 deletions(-)

Comments

Paolo Bonzini July 8, 2014, 5:41 p.m. UTC | #1
On 08/07/2014 17:45, Ming Lei wrote:
>
> -    /* empty io queue */
> -    s->io_q.idx = 0;
> +    ret = io_submit(s->ctx, len, s->io_q.iocbs);
> +    if (ret == -EAGAIN) {
> +        event_notifier_set(&s->retry);
> +        return 0;

You can use a bottom half instead of this event notifier.

Paolo
Ming Lei July 9, 2014, 1:33 a.m. UTC | #2
On Wed, Jul 9, 2014 at 1:41 AM, Paolo Bonzini <pbonzini@redhat.com> wrote:
> On 08/07/2014 17:45, Ming Lei wrote:
>
>>
>> -    /* empty io queue */
>> -    s->io_q.idx = 0;
>> +    ret = io_submit(s->ctx, len, s->io_q.iocbs);
>> +    if (ret == -EAGAIN) {
>> +        event_notifier_set(&s->retry);
>> +        return 0;
>
>
> You can use a bottom half instead of this event notifier.

It is an intentional use of the event notifier, and from my observation,
after one cycle of write() and ppoll(), the next io_submit() is more
likely to succeed than with a BH.

Given that it doesn't happen often, it should be OK to use an event notifier.


Thanks,
Stefan Hajnoczi July 9, 2014, 8:29 a.m. UTC | #3
On Tue, Jul 08, 2014 at 11:45:10PM +0800, Ming Lei wrote:
> In the enqueue path, we can't complete requests, otherwise
> a "Co-routine re-entered recursively" error may be triggered, so this
> patch fixes the issue with the following ideas:

This probably happens when the caller is in coroutine context and its
completion function invokes qemu_coroutine_enter() on itself.  The
solution is to invoke completions from a BH (other places in the block
layer do this too).

> 	- for -EAGAIN, retry the submission in a newly introduced event handler

I agree with Paolo that a BH is appropriate.

> 	- for partial submission, just update the io queue, since it is
> 	still making progress

If we do this then we need to guarantee that io_submit() will be called
at some point soon.  Otherwise requests could get stuck if the guest
doesn't submit any more I/O requests to push the queue.

Please split this into separate patches.  You're trying to do too much.

Overall, I would prefer it if we avoid the extra complexity of deferring
io_submit() on EAGAIN and partial submission.  Do you understand why the
kernel is producing this behavior?  Can we set the right capacity in
io_setup() so it doesn't happen?

> +        if (enqueue)
> +            return ret;

Please set up a git hook to run checkpatch.pl.  It will alert you when
you violate QEMU coding style:
http://blog.vmsplice.net/2011/03/how-to-automatically-run-checkpatchpl.html

I already mentioned coding style in previous patches, using a git hook
will avoid it happening again.
Eric Blake July 9, 2014, 12:42 p.m. UTC | #4
On 07/09/2014 02:29 AM, Stefan Hajnoczi wrote:

>> +        if (enqueue)
>> +            return ret;
> 
> Please set up a git hook to run checkpatch.pl.  It will alert you when
> you violate QEMU coding style:
> http://blog.vmsplice.net/2011/03/how-to-automatically-run-checkpatchpl.html
> 
> I already mentioned coding style in previous patches, using a git hook
> will avoid it happening again.

Nice trick; I've added it to http://wiki.qemu.org/Contribute/SubmitAPatch
Ming Lei July 11, 2014, 3:43 p.m. UTC | #5
On Wed, Jul 9, 2014 at 4:29 PM, Stefan Hajnoczi <stefanha@redhat.com> wrote:
> On Tue, Jul 08, 2014 at 11:45:10PM +0800, Ming Lei wrote:
>> In the enqueue path, we can't complete requests, otherwise
>> a "Co-routine re-entered recursively" error may be triggered, so this
>> patch fixes the issue with the following ideas:
>
> This probably happens when the caller is in coroutine context and its
> completion function invokes qemu_coroutine_enter() on itself.  The
> solution is to invoke completions from a BH (other places in the block
> layer do this too).

Yes, a BH is a solution, but in this case I prefer the approach taken in
the patch, because the completion shouldn't be called if the submission
fails, which keeps the behavior consistent with before.

>
>>       - for -EAGAIN, retry the submission in a newly introduced event handler
>
> I agree with Paolo that a BH is appropriate.

BH vs. event notifier isn't a big deal; I chose the event notifier just
to avoid resubmitting too quickly.

>
>>       - for partial submission, just update the io queue, since it is
>>       still making progress
>
> If we do this then we need to guarantee that io_submit() will be called
> at some point soon.  Otherwise requests could get stuck if the guest
> doesn't submit any more I/O requests to push the queue.

Good point, I will let the BH or event handler resubmit the remainder.

>
> Please split this into separate patches.  You're trying to do too much.
>
> Overall, I would prefer it if we avoid the extra complexity of deferring
> io_submit() on EAGAIN and partial submission.  Do you understand why the

It might be difficult to avoid the problem completely with a fixed/static
max events, especially once multi-virtqueue support for virtio-blk is
introduced.

> kernel is producing this behavior?  Can we set the right capacity in

It is caused by a shortage of in-kernel request resources.

> io_setup() so it doesn't happen?

Yes, it can help, but it won't be easy to avoid -EAGAIN completely.

>
>> +        if (enqueue)
>> +            return ret;
>
> Please set up a git hook to run checkpatch.pl.  It will alert you when
> you violate QEMU coding style:
> http://blog.vmsplice.net/2011/03/how-to-automatically-run-checkpatchpl.html
>
> I already mentioned coding style in previous patches, using a git hook
> will avoid it happening again.

Sorry for missing that again.

Thanks,

Patch

diff --git a/block/linux-aio.c b/block/linux-aio.c
index 4867369..6f7dd51 100644
--- a/block/linux-aio.c
+++ b/block/linux-aio.c
@@ -51,6 +51,7 @@  struct qemu_laio_state {
 
     /* io queue for submit at batch */
     LaioQueue io_q;
+    EventNotifier retry;      /* handle -EAGAIN */
 };
 
 static inline ssize_t io_event_ret(struct io_event *ev)
@@ -154,45 +155,78 @@  static void ioq_init(LaioQueue *io_q)
     io_q->plugged = 0;
 }
 
-static int ioq_submit(struct qemu_laio_state *s)
+static void abort_queue(struct qemu_laio_state *s)
+{
+    int i;
+    for (i = 0; i < s->io_q.idx; i++) {
+        struct qemu_laiocb *laiocb = container_of(s->io_q.iocbs[i],
+                                                  struct qemu_laiocb,
+                                                  iocb);
+        laiocb->ret = -EIO;
+        qemu_laio_process_completion(s, laiocb);
+    }
+}
+
+static int ioq_submit(struct qemu_laio_state *s, bool enqueue)
 {
     int ret, i = 0;
     int len = s->io_q.idx;
+    int j = 0;
 
-    do {
-        ret = io_submit(s->ctx, len, s->io_q.iocbs);
-    } while (i++ < 3 && ret == -EAGAIN);
+    if (!len) {
+        return 0;
+    }
 
-    /* empty io queue */
-    s->io_q.idx = 0;
+    ret = io_submit(s->ctx, len, s->io_q.iocbs);
+    if (ret == -EAGAIN) {
+        event_notifier_set(&s->retry);
+        return 0;
+    } else if (ret < 0) {
+        if (enqueue)
+            return ret;
+
+        /* in non-queue path, all IOs have to be completed */
+        abort_queue(s);
+        ret = len;
+    } else if (ret == 0) {
+        goto out;
+    }
 
-    if (ret < 0) {
-        i = 0;
-    } else {
-        i = ret;
+    for (i = ret; i < len; i++) {
+        s->io_q.iocbs[j++] = s->io_q.iocbs[i];
     }
 
-    for (; i < len; i++) {
-        struct qemu_laiocb *laiocb =
-            container_of(s->io_q.iocbs[i], struct qemu_laiocb, iocb);
+ out:
+    /* update io queue */
+    s->io_q.idx -= ret;
 
-        laiocb->ret = (ret < 0) ? ret : -EIO;
-        qemu_laio_process_completion(s, laiocb);
-    }
     return ret;
 }
 
-static void ioq_enqueue(struct qemu_laio_state *s, struct iocb *iocb)
+static void ioq_submit_retry(EventNotifier *e)
+{
+    struct qemu_laio_state *s = container_of(e, struct qemu_laio_state, retry);
+
+    event_notifier_test_and_clear(e);
+    ioq_submit(s, false);
+}
+
+static int ioq_enqueue(struct qemu_laio_state *s, struct iocb *iocb)
 {
     unsigned int idx = s->io_q.idx;
 
+    if (unlikely(idx == s->io_q.size))
+        return -1;
+
     s->io_q.iocbs[idx++] = iocb;
     s->io_q.idx = idx;
 
-    /* submit immediately if queue is full */
-    if (idx == s->io_q.size) {
-        ioq_submit(s);
+    /* submit immediately if queue depth is above 2/3 */
+    if (idx > s->io_q.size * 2 / 3) {
+        return ioq_submit(s, true);
     }
+
+    return 0;
 }
 
 void laio_io_plug(BlockDriverState *bs, void *aio_ctx)
@@ -214,7 +248,7 @@  int laio_io_unplug(BlockDriverState *bs, void *aio_ctx, bool unplug)
     }
 
     if (s->io_q.idx > 0) {
-        ret = ioq_submit(s);
+        ret = ioq_submit(s, false);
     }
 
     return ret;
@@ -258,7 +292,8 @@  BlockDriverAIOCB *laio_submit(BlockDriverState *bs, void *aio_ctx, int fd,
             goto out_free_aiocb;
         }
     } else {
-        ioq_enqueue(s, iocbs);
+        if (ioq_enqueue(s, iocbs) < 0)
+            goto out_free_aiocb;
     }
     return &laiocb->common;
 
@@ -272,6 +307,7 @@  void laio_detach_aio_context(void *s_, AioContext *old_context)
     struct qemu_laio_state *s = s_;
 
     aio_set_event_notifier(old_context, &s->e, NULL);
+    aio_set_event_notifier(old_context, &s->retry, NULL);
 }
 
 void laio_attach_aio_context(void *s_, AioContext *new_context)
@@ -279,6 +315,7 @@  void laio_attach_aio_context(void *s_, AioContext *new_context)
     struct qemu_laio_state *s = s_;
 
     aio_set_event_notifier(new_context, &s->e, qemu_laio_completion_cb);
+    aio_set_event_notifier(new_context, &s->retry, ioq_submit_retry);
 }
 
 void *laio_init(void)
@@ -295,6 +332,9 @@  void *laio_init(void)
     }
 
     ioq_init(&s->io_q);
+    if (event_notifier_init(&s->retry, false) < 0) {
+        goto out_close_efd;
+    }
 
     return s;
 
@@ -310,5 +350,6 @@  void laio_cleanup(void *s_)
     struct qemu_laio_state *s = s_;
 
     event_notifier_cleanup(&s->e);
+    event_notifier_cleanup(&s->retry);
     g_free(s);
 }