[for-2.11,3/4] coroutine: Cancel aio_co_schedule() on direct entry

Message ID: 20171128154350.21504-4-kwolf@redhat.com
State: New
Series: Fix qemu-iotests failures

Commit Message

Kevin Wolf Nov. 28, 2017, 3:43 p.m. UTC
If a coroutine can be reentered from multiple possible sources, we need
to be careful in the case that two sources try to reenter it at the same
time.

There are two different cases where this can happen:

1. A coroutine spawns multiple asynchronous jobs and waits for all of
   them to complete. In this case, multiple reentries are expected and
   the coroutine is probably looping around a yield, so entering it
   twice is generally fine (but entering it just once after all jobs
   have completed would be enough, too).

   Exception: When the loop condition becomes false and the first
   reenter already leaves the loop, we must not do a second reenter.

2. A coroutine yields and provides multiple alternative ways to be
   reentered. In this case, we must always only process the first
   reenter.

Direct entry is not a problem. It requires that the AioContext locks for
the coroutine context are held, which means that the coroutine has
enough time to set some state that simply prevents the second caller
from reentering the coroutine, too.

Things get more interesting with aio_co_schedule() because the coroutine
may be scheduled before it is (directly) entered from a second place.
Then changing some state inside the coroutine doesn't help because it is
already scheduled and the state won't be checked again.

For this case, we'll cancel any pending aio_co_schedule() when the
coroutine is actually entered. Reentering it once is enough for all
cases explained above, and a requirement for part of them.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
---
 include/qemu/coroutine_int.h |  1 +
 util/async.c                 | 15 ++++++++++++++-
 util/qemu-coroutine.c        | 17 +++++++++++++++++
 3 files changed, 32 insertions(+), 1 deletion(-)
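
To make case 1 concrete, here is a minimal sketch of the pattern the
commit message describes. The MultiJobState type, job_done_cb() and
wait_for_all() are hypothetical illustrations, not part of this patch;
only aio_co_wake() and qemu_coroutine_yield() are real QEMU API.

    /* Hypothetical helper state, for illustration only. */
    typedef struct MultiJobState {
        int remaining;      /* jobs still outstanding */
        Coroutine *co;      /* the coroutine waiting in wait_for_all() */
    } MultiJobState;

    /* Completion callback; several jobs may fire it, so the coroutine
     * can be reentered multiple times. */
    static void job_done_cb(void *opaque)
    {
        MultiJobState *s = opaque;

        s->remaining--;
        aio_co_wake(s->co);
    }

    static void coroutine_fn wait_for_all(MultiJobState *s)
    {
        while (s->remaining > 0) {
            /* Entering this coroutine twice is fine here: the loop
             * simply re-checks its condition. The dangerous reentry is
             * one that arrives after the loop has been left. */
            qemu_coroutine_yield();
        }
    }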

Comments

Jeff Cody Nov. 28, 2017, 4:09 p.m. UTC | #1
On Tue, Nov 28, 2017 at 04:43:49PM +0100, Kevin Wolf wrote:
> If a coroutine can be reentered from multiple possible sources, we need
> to be careful in the case that two sources try to reenter it at the same
> time.
> 
> There are two different cases where this can happen:
> 
> 1. A coroutine spawns multiple asynchronous jobs and waits for all of
>    them to complete. In this case, multiple reentries are expected and
>    the coroutine is probably looping around a yield, so entering it
>    twice is generally fine (but entering it just once after all jobs
>    have completed would be enough, too).
> 
>    Exception: When the loop condition becomes false and the first
>    reenter already leaves the loop, we must not do a second reenter.
> 
> 2. A coroutine yields and provides multiple alternative ways to be
>    reentered. In this case, we must always only process the first
>    reenter.
> 
> Direct entry is not a problem. It requires that the AioContext locks for
> the coroutine context are held, which means that the coroutine has
> enough time to set some state that simply prevents the second caller
> from reentering the coroutine, too.
> 
> Things get more interesting with aio_co_schedule() because the coroutine
> may be scheduled before it is (directly) entered from a second place.
> Then changing some state inside the coroutine doesn't help because it is
> already scheduled and the state won't be checked again.
> 
> For this case, we'll cancel any pending aio_co_schedule() when the
> coroutine is actually entered. Reentering it once is enough for all
> cases explained above, and a requirement for part of them.
> 
> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
> ---
>  include/qemu/coroutine_int.h |  1 +
>  util/async.c                 | 15 ++++++++++++++-
>  util/qemu-coroutine.c        | 17 +++++++++++++++++
>  3 files changed, 32 insertions(+), 1 deletion(-)
> 
> diff --git a/include/qemu/coroutine_int.h b/include/qemu/coroutine_int.h
> index cb98892bba..73fc35e54b 100644
> --- a/include/qemu/coroutine_int.h
> +++ b/include/qemu/coroutine_int.h
> @@ -40,6 +40,7 @@ struct Coroutine {
>      CoroutineEntry *entry;
>      void *entry_arg;
>      Coroutine *caller;
> +    AioContext *scheduled;
>  
>      /* Only used when the coroutine has terminated.  */
>      QSLIST_ENTRY(Coroutine) pool_next;
> diff --git a/util/async.c b/util/async.c
> index 0e1bd8780a..dc249e9404 100644
> --- a/util/async.c
> +++ b/util/async.c
> @@ -372,6 +372,7 @@ static bool event_notifier_poll(void *opaque)
>  static void co_schedule_bh_cb(void *opaque)
>  {
>      AioContext *ctx = opaque;
> +    AioContext *scheduled_ctx;
>      QSLIST_HEAD(, Coroutine) straight, reversed;
>  
>      QSLIST_MOVE_ATOMIC(&reversed, &ctx->scheduled_coroutines);
> @@ -388,7 +389,16 @@ static void co_schedule_bh_cb(void *opaque)
>          QSLIST_REMOVE_HEAD(&straight, co_scheduled_next);
>          trace_aio_co_schedule_bh_cb(ctx, co);
>          aio_context_acquire(ctx);
> -        qemu_coroutine_enter(co);
> +        scheduled_ctx = atomic_mb_read(&co->scheduled);
> +        if (scheduled_ctx == ctx) {
> +            qemu_coroutine_enter(co);
> +        } else {
> +            /* This must be a cancelled aio_co_schedule() because the coroutine
> +             * was already entered before this BH had a chance to run. If a
> +             * coroutine is scheduled for two different AioContexts, something
> +             * is very wrong. */
> +            assert(scheduled_ctx == NULL);
> +        }
>          aio_context_release(ctx);
>      }
>  }
> @@ -438,6 +448,9 @@ fail:
>  void aio_co_schedule(AioContext *ctx, Coroutine *co)
>  {
>      trace_aio_co_schedule(ctx, co);
> +

Do we want to assert(!co->scheduled || co->scheduled == ctx) here?


> +    /* Set co->scheduled before the coroutine becomes visible in the list */
> +    atomic_mb_set(&co->scheduled, ctx);


>      QSLIST_INSERT_HEAD_ATOMIC(&ctx->scheduled_coroutines,
>                                co, co_scheduled_next);
>      qemu_bh_schedule(ctx->co_schedule_bh);
> diff --git a/util/qemu-coroutine.c b/util/qemu-coroutine.c
> index d6095c1d5a..c515b3cb4a 100644
> --- a/util/qemu-coroutine.c
> +++ b/util/qemu-coroutine.c
> @@ -122,6 +122,23 @@ void qemu_aio_coroutine_enter(AioContext *ctx, Coroutine *co)
>       */
>      smp_wmb();
>  
> +    /* Make sure that a coroutine that can alternatively reentered from two
> +     * different sources isn't reentered more than once when the first caller
> +     * uses aio_co_schedule() and the other one enters to coroutine directly.
> +     * This is achieved by cancelling the pending aio_co_schedule().
> +     *
> +     * The other way round, if aio_co_schedule() would be called after this
> +     * point, this would be a problem, too, but in practice it doesn't happen
> +     * because we're holding the AioContext lock here and aio_co_schedule()
> +     * callers must do the same. This means that the coroutine just needs to
> +     * prevent other callers from calling aio_co_schedule() before it yields
> +     * (e.g. block job coroutines by setting job->busy = true).
> +     *
> +     * We still want to ensure that the second case doesn't happen, so reset
> +     * co->scheduled only after setting co->caller to make the above check
> +     * effective for the co_schedule_bh_cb() case. */

Nice comment

> +    atomic_set(&co->scheduled, NULL);
> +
>      ret = qemu_coroutine_switch(self, co, COROUTINE_ENTER);
>  
>      qemu_co_queue_run_restart(co);
> -- 
> 2.13.6
>
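
As a sketch, the assertion suggested above would sit at the top of
aio_co_schedule() like this (the assert() line is the proposal; the
rest is taken from the hunk in the patch):

    void aio_co_schedule(AioContext *ctx, Coroutine *co)
    {
        trace_aio_co_schedule(ctx, co);

        /* Proposed: a coroutine must never be pending on two different
         * AioContexts at the same time. */
        assert(!co->scheduled || co->scheduled == ctx);

        /* Set co->scheduled before the coroutine becomes visible in the list */
        atomic_mb_set(&co->scheduled, ctx);
        QSLIST_INSERT_HEAD_ATOMIC(&ctx->scheduled_coroutines,
                                  co, co_scheduled_next);
        qemu_bh_schedule(ctx->co_schedule_bh);
    }
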
Paolo Bonzini Nov. 28, 2017, 4:14 p.m. UTC | #2
On 28/11/2017 16:43, Kevin Wolf wrote:
> +    /* Make sure that a coroutine that can alternatively reentered from two
> +     * different sources isn't reentered more than once when the first caller
> +     * uses aio_co_schedule() and the other one enters to coroutine directly.
> +     * This is achieved by cancelling the pending aio_co_schedule().
> +     *
> +     * The other way round, if aio_co_schedule() would be called after this
> +     * point, this would be a problem, too, but in practice it doesn't happen
> +     * because we're holding the AioContext lock here and aio_co_schedule()
> +     * callers must do the same.

No, this is not true.  aio_co_schedule is thread-safe.

> This means that the coroutine just needs to
> +     * prevent other callers from calling aio_co_schedule() before it yields
> +     * (e.g. block job coroutines by setting job->busy = true).
> +     *
> +     * We still want to ensure that the second case doesn't happen, so reset
> +     * co->scheduled only after setting co->caller to make the above check
> +     * effective for the co_schedule_bh_cb() case. */
> +    atomic_set(&co->scheduled, NULL);

This doesn't work.  The coroutine is still in the list, and if someone
calls aio_co_schedule again now, any coroutines linked from "co" via
co_scheduled_next are lost.

(Also, the AioContext lock is by design not protecting any state in
AioContext itself; the AioContext lock is only protecting things that
run in an AioContext but do not have their own lock).

Paolo

> +
>      ret = qemu_coroutine_switch(self, co, COROUTINE_ENTER);
>  
>      qemu_co_queue_run_restart(co);
>
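
The hazard Paolo describes can be demonstrated with a stand-alone
stand-in for QSLIST (plain C, no QEMU headers; all names are
illustrative): because the link field lives inside the node itself, a
node that is still on one list loses its old tail when it is inserted
into another list.

    #include <stddef.h>
    #include <stdio.h>

    typedef struct Node {
        const char *name;
        struct Node *next;  /* analogous to co_scheduled_next */
    } Node;

    static void insert_head(Node **head, Node *n)
    {
        n->next = *head;    /* overwrites n's previous link! */
        *head = n;
    }

    int main(void)
    {
        Node co3 = { "co3", NULL }, co2 = { "co2", NULL };
        Node co = { "co", NULL };
        Node *ctx_list = NULL, *other_list = NULL;

        /* ctx->scheduled_coroutines: co -> co2 -> co3 */
        insert_head(&ctx_list, &co3);
        insert_head(&ctx_list, &co2);
        insert_head(&ctx_list, &co);

        /* co is rescheduled elsewhere before the BH consumed it: */
        insert_head(&other_list, &co);

        /* co->next was overwritten, so co2 and co3 can no longer be
         * reached from ctx_list. Prints "co->next is (null)". */
        printf("co->next is %s\n", co.next ? co.next->name : "(null)");
        return 0;
    }
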
Kevin Wolf Nov. 28, 2017, 4:28 p.m. UTC | #3
On 28.11.2017 at 17:14, Paolo Bonzini wrote:
> On 28/11/2017 16:43, Kevin Wolf wrote:
> > +    /* Make sure that a coroutine that can alternatively reentered from two
> > +     * different sources isn't reentered more than once when the first caller
> > +     * uses aio_co_schedule() and the other one enters to coroutine directly.
> > +     * This is achieved by cancelling the pending aio_co_schedule().
> > +     *
> > +     * The other way round, if aio_co_schedule() would be called after this
> > +     * point, this would be a problem, too, but in practice it doesn't happen
> > +     * because we're holding the AioContext lock here and aio_co_schedule()
> > +     * callers must do the same.
> 
> No, this is not true.  aio_co_schedule is thread-safe.

Hm... With the reproducer we were specifically looking at
qmp_block_job_cancel(), which does take the AioContext locks. But it
might not be as universal as I thought.

To be honest, I just wasn't sure what to do with this case anyway. It
means that the coroutine is already running when someone else schedules
it. We don't really know whether we have to enter it a second time or
not.

So if it can indeed happen in practice, we need to think a bit more
about this.

> > This means that the coroutine just needs to
> > +     * prevent other callers from calling aio_co_schedule() before it yields
> > +     * (e.g. block job coroutines by setting job->busy = true).
> > +     *
> > +     * We still want to ensure that the second case doesn't happen, so reset
> > +     * co->scheduled only after setting co->caller to make the above check
> > +     * effective for the co_schedule_bh_cb() case. */
> > +    atomic_set(&co->scheduled, NULL);
> 
> This doesn't work.  The coroutine is still in the list, and if someone
> calls aio_co_schedule again now, any coroutines linked from "co" via
> co_scheduled_next are lost.

Why would they? We still iterate the whole list in co_schedule_bh_cb(),
we just skip the single qemu_coroutine_enter().

> (Also, the AioContext lock is by design not protecting any state in
> AioContext itself; the AioContext lock is only protecting things that
> run in an AioContext but do not have their own lock).

Such as the coroutine we want to enter, no?

Kevin
Fam Zheng Nov. 28, 2017, 4:30 p.m. UTC | #4
On Tue, 11/28 16:43, Kevin Wolf wrote:
> If a coroutine can be reentered from multiple possible sources, we need
> to be careful in the case that two sources try to reenter it at the same
> time.
> 
> There are two different cases where this can happen:
> 
> 1. A coroutine spawns multiple asynchronous jobs and waits for all of
>    them to complete. In this case, multiple reentries are expected and
>    the coroutine is probably looping around a yield, so entering it
>    twice is generally fine (but entering it just once after all jobs
>    have completed would be enough, too).
> 
>    Exception: When the loop condition becomes false and the first
>    reenter already leaves the loop, we must not do a second reenter.
> 
> 2. A coroutine yields and provides multiple alternative ways to be
>    reentered. In this case, we must always only process the first
>    reenter.
> 
> Direct entry is not a problem. It requires that the AioContext locks for
> the coroutine context are held, which means that the coroutine has
> enough time to set some state that simply prevents the second caller
> from reentering the coroutine, too.
> 
> Things get more interesting with aio_co_schedule() because the coroutine
> may be scheduled before it is (directly) entered from a second place.
> Then changing some state inside the coroutine doesn't help because it is
> already scheduled and the state won't be checked again.
> 
> For this case, we'll cancel any pending aio_co_schedule() when the
> coroutine is actually entered. Reentering it once is enough for all
> cases explained above, and a requirement for part of them.
> 
> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
> ---
>  include/qemu/coroutine_int.h |  1 +
>  util/async.c                 | 15 ++++++++++++++-
>  util/qemu-coroutine.c        | 17 +++++++++++++++++
>  3 files changed, 32 insertions(+), 1 deletion(-)
> 
> diff --git a/include/qemu/coroutine_int.h b/include/qemu/coroutine_int.h
> index cb98892bba..73fc35e54b 100644
> --- a/include/qemu/coroutine_int.h
> +++ b/include/qemu/coroutine_int.h
> @@ -40,6 +40,7 @@ struct Coroutine {
>      CoroutineEntry *entry;
>      void *entry_arg;
>      Coroutine *caller;
> +    AioContext *scheduled;
>  
>      /* Only used when the coroutine has terminated.  */
>      QSLIST_ENTRY(Coroutine) pool_next;
> diff --git a/util/async.c b/util/async.c
> index 0e1bd8780a..dc249e9404 100644
> --- a/util/async.c
> +++ b/util/async.c
> @@ -372,6 +372,7 @@ static bool event_notifier_poll(void *opaque)
>  static void co_schedule_bh_cb(void *opaque)
>  {
>      AioContext *ctx = opaque;
> +    AioContext *scheduled_ctx;
>      QSLIST_HEAD(, Coroutine) straight, reversed;
>  
>      QSLIST_MOVE_ATOMIC(&reversed, &ctx->scheduled_coroutines);
> @@ -388,7 +389,16 @@ static void co_schedule_bh_cb(void *opaque)
>          QSLIST_REMOVE_HEAD(&straight, co_scheduled_next);
>          trace_aio_co_schedule_bh_cb(ctx, co);
>          aio_context_acquire(ctx);
> -        qemu_coroutine_enter(co);
> +        scheduled_ctx = atomic_mb_read(&co->scheduled);
> +        if (scheduled_ctx == ctx) {
> +            qemu_coroutine_enter(co);
> +        } else {
> +            /* This must be a cancelled aio_co_schedule() because the coroutine
> +             * was already entered before this BH had a chance to run. If a
> +             * coroutine is scheduled for two different AioContexts, something
> +             * is very wrong. */
> +            assert(scheduled_ctx == NULL);
> +        }
>          aio_context_release(ctx);
>      }
>  }
> @@ -438,6 +448,9 @@ fail:
>  void aio_co_schedule(AioContext *ctx, Coroutine *co)
>  {
>      trace_aio_co_schedule(ctx, co);
> +
> +    /* Set co->scheduled before the coroutine becomes visible in the list */
> +    atomic_mb_set(&co->scheduled, ctx);
>      QSLIST_INSERT_HEAD_ATOMIC(&ctx->scheduled_coroutines,
>                                co, co_scheduled_next);
>      qemu_bh_schedule(ctx->co_schedule_bh);
> diff --git a/util/qemu-coroutine.c b/util/qemu-coroutine.c
> index d6095c1d5a..c515b3cb4a 100644
> --- a/util/qemu-coroutine.c
> +++ b/util/qemu-coroutine.c
> @@ -122,6 +122,23 @@ void qemu_aio_coroutine_enter(AioContext *ctx, Coroutine *co)
>       */
>      smp_wmb();
>  
> +    /* Make sure that a coroutine that can alternatively reentered from two

"that can be"?

> +     * different sources isn't reentered more than once when the first caller
> +     * uses aio_co_schedule() and the other one enters to coroutine directly.
> +     * This is achieved by cancelling the pending aio_co_schedule().
> +     *
> +     * The other way round, if aio_co_schedule() would be called after this
> +     * point, this would be a problem, too, but in practice it doesn't happen
> +     * because we're holding the AioContext lock here and aio_co_schedule()
> +     * callers must do the same. This means that the coroutine just needs to
> +     * prevent other callers from calling aio_co_schedule() before it yields
> +     * (e.g. block job coroutines by setting job->busy = true).
> +     *
> +     * We still want to ensure that the second case doesn't happen, so reset
> +     * co->scheduled only after setting co->caller to make the above check
> +     * effective for the co_schedule_bh_cb() case. */
> +    atomic_set(&co->scheduled, NULL);

Looks like accesses to co->scheduled are fully protected by the AioContext
lock, so why are atomic ops necessary? To catch bugs?

> +
>      ret = qemu_coroutine_switch(self, co, COROUTINE_ENTER);
>  
>      qemu_co_queue_run_restart(co);
> -- 
> 2.13.6
> 

Fam
Jeff Cody Nov. 28, 2017, 4:42 p.m. UTC | #5
On Tue, Nov 28, 2017 at 05:28:50PM +0100, Kevin Wolf wrote:
> On 28.11.2017 at 17:14, Paolo Bonzini wrote:
> > On 28/11/2017 16:43, Kevin Wolf wrote:
> > > +    /* Make sure that a coroutine that can alternatively reentered from two
> > > +     * different sources isn't reentered more than once when the first caller
> > > +     * uses aio_co_schedule() and the other one enters to coroutine directly.
> > > +     * This is achieved by cancelling the pending aio_co_schedule().
> > > +     *
> > > +     * The other way round, if aio_co_schedule() would be called after this
> > > +     * point, this would be a problem, too, but in practice it doesn't happen
> > > +     * because we're holding the AioContext lock here and aio_co_schedule()
> > > +     * callers must do the same.
> > 
> > No, this is not true.  aio_co_schedule is thread-safe.
> 
> Hm... With the reproducer we were specfically looking at
> qmp_block_job_cancel(), which does take the AioContext locks. But it
> might not be as universal as I thought.
> 
> To be honest, I just wasn't sure what to do with this case anyway. It
> means that the coroutine is already running when someone else schedules
> it. We don't really know whether we have to enter it a second time or
> not.
> 
> So if it can indeed happen in practice, we need to think a bit more
> about this.

It would be nice if, on coroutine termination, we could unschedule all
pending executions for that coroutine.  I think use-after-free is the main
concern for someone else calling aio_co_schedule() while the coroutine is
currently running.

> 
> > > This means that the coroutine just needs to
> > > +     * prevent other callers from calling aio_co_schedule() before it yields
> > > +     * (e.g. block job coroutines by setting job->busy = true).
> > > +     *
> > > +     * We still want to ensure that the second case doesn't happen, so reset
> > > +     * co->scheduled only after setting co->caller to make the above check
> > > +     * effective for the co_schedule_bh_cb() case. */
> > > +    atomic_set(&co->scheduled, NULL);
> > 
> > This doesn't work.  The coroutine is still in the list, and if someone
> > calls aio_co_schedule again now, any coroutines linked from "co" via
> > co_scheduled_next are lost.
> 
> Why would they? We still iterate the whole list in co_schedule_bh_cb(),
> we just skip the single qemu_coroutine_enter().
> 
> > (Also, the AioContext lock is by design not protecting any state in
> > AioContext itself; the AioContext lock is only protecting things that
> > run in an AioContext but do not have their own lock).
> 
> Such as the coroutine we want to enter, no?
> 
> Kevin
Paolo Bonzini Nov. 28, 2017, 4:45 p.m. UTC | #6
On 28/11/2017 17:28, Kevin Wolf wrote:
> To be honest, I just wasn't sure what to do with this case anyway. It
> means that the coroutine is already running when someone else schedules
> it. We don't really know whether we have to enter it a second time or
> not.
>
> So if it can indeed happen in practice, we need to think a bit more
> about this.

>>> This means that the coroutine just needs to
>>> +     * prevent other callers from calling aio_co_schedule() before it yields
>>> +     * (e.g. block job coroutines by setting job->busy = true).
>>> +     *
>>> +     * We still want to ensure that the second case doesn't happen, so reset
>>> +     * co->scheduled only after setting co->caller to make the above check
>>> +     * effective for the co_schedule_bh_cb() case. */
>>> +    atomic_set(&co->scheduled, NULL);
>>
>> This doesn't work.  The coroutine is still in the list, and if someone
>> calls aio_co_schedule again now, any coroutines linked from "co" via
>> co_scheduled_next are lost.
> 
> Why would they? We still iterate the whole list in co_schedule_bh_cb(),
> we just skip the single qemu_coroutine_enter().

Because you modify co_scheduled_next (in QSLIST_INSERT_HEAD_ATOMIC)
before it has been consumed.

>> (Also, the AioContext lock is by design not protecting any state in
>> AioContext itself; the AioContext lock is only protecting things that
>> run in an AioContext but do not have their own lock).
> 
> Such as the coroutine we want to enter, no?

If you mean the Coroutine object then no, coroutines are entirely
thread-safe.  All you need to do is: 1) re-enter them with aio_co_wake
or aio_co_schedule after the first time you've entered them; 2) make
sure you don't enter/schedule them twice.

Paolo
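
A minimal sketch of these two rules (my_co_fn, completion_cb() and
start() are hypothetical; the create, enter and wake calls are real
QEMU API):

    /* Later reentries, from any thread: aio_co_wake() is thread-safe. */
    static void completion_cb(void *opaque)
    {
        Coroutine *co = opaque;

        aio_co_wake(co);
    }

    static void start(CoroutineEntry *my_co_fn, void *opaque)
    {
        Coroutine *co = qemu_coroutine_create(my_co_fn, opaque);

        /* The first entry may be direct... */
        qemu_coroutine_enter(co);

        /* ...but from here on the code must guarantee that at most one
         * completion_cb() is pending per yield, i.e. the coroutine is
         * never entered or scheduled twice at the same time. */
    }
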
Eric Blake Nov. 28, 2017, 4:46 p.m. UTC | #7
On 11/28/2017 09:43 AM, Kevin Wolf wrote:
> If a coroutine can be reentered from multiple possible sources, we need
> to be careful in the case that two sources try to reenter it at the same
> time.
> 

> For this case, we'll cancel any pending aio_co_schedule() when the
> coroutine is actually entered. Reentering it once is enough for all
> cases explained above, and a requirement for part of them.
> 
> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
> ---
>   include/qemu/coroutine_int.h |  1 +

> +++ b/util/qemu-coroutine.c
> @@ -122,6 +122,23 @@ void qemu_aio_coroutine_enter(AioContext *ctx, Coroutine *co)
>        */
>       smp_wmb();
>   
> +    /* Make sure that a coroutine that can alternatively reentered from two

s/reentered/be reentered/

> +     * different sources isn't reentered more than once when the first caller
> +     * uses aio_co_schedule() and the other one enters to coroutine directly.

s/enters to/enters the/

> +     * This is achieved by cancelling the pending aio_co_schedule().
> +     *
> +     * The other way round, if aio_co_schedule() would be called after this
> +     * point, this would be a problem, too, but in practice it doesn't happen

s/this //

> +     * because we're holding the AioContext lock here and aio_co_schedule()
> +     * callers must do the same. This means that the coroutine just needs to
> +     * prevent other callers from calling aio_co_schedule() before it yields
> +     * (e.g. block job coroutines by setting job->busy = true).

Is 'block job coroutines' the set of coroutines owned by 'block jobs', 
or is it the verb that we are blocking all 'job coroutines'?  I don't 
know if a wording tweak would help avoid ambiguity here.

> +     *
> +     * We still want to ensure that the second case doesn't happen, so reset
> +     * co->scheduled only after setting co->caller to make the above check
> +     * effective for the co_schedule_bh_cb() case. */
> +    atomic_set(&co->scheduled, NULL);
> +
>       ret = qemu_coroutine_switch(self, co, COROUTINE_ENTER);
>   
>       qemu_co_queue_run_restart(co);
>
Paolo Bonzini Nov. 28, 2017, 4:51 p.m. UTC | #8
On 28/11/2017 17:42, Jeff Cody wrote:
> On Tue, Nov 28, 2017 at 05:28:50PM +0100, Kevin Wolf wrote:
>> On 28.11.2017 at 17:14, Paolo Bonzini wrote:
>>> On 28/11/2017 16:43, Kevin Wolf wrote:
>>>> +    /* Make sure that a coroutine that can alternatively reentered from two
>>>> +     * different sources isn't reentered more than once when the first caller
>>>> +     * uses aio_co_schedule() and the other one enters to coroutine directly.
>>>> +     * This is achieved by cancelling the pending aio_co_schedule().
>>>> +     *
>>>> +     * The other way round, if aio_co_schedule() would be called after this
>>>> +     * point, this would be a problem, too, but in practice it doesn't happen
>>>> +     * because we're holding the AioContext lock here and aio_co_schedule()
>>>> +     * callers must do the same.
>>>
>>> No, this is not true.  aio_co_schedule is thread-safe.
>>
> >> Hm... With the reproducer we were specifically looking at
>> qmp_block_job_cancel(), which does take the AioContext locks. But it
>> might not be as universal as I thought.
>>
>> To be honest, I just wasn't sure what to do with this case anyway. It
>> means that the coroutine is already running when someone else schedules
>> it. We don't really know whether we have to enter it a second time or
>> not.
>>
>> So if it can indeed happen in practice, we need to think a bit more
>> about this.
> 
> It would be nice if, on coroutine termination, we could unschedule all
> pending executions for that coroutine.  I think use-after-free is the main
> concern for someone else calling aio_co_schedule() while the coroutine is
> currently running.

Yes, terminating a scheduled coroutine is a bug; same for scheduling a
terminated coroutine, both orders are wrong. However, "unscheduling" is
not the solution; you would just be papering over the issue.

aio_co_schedule() on a running coroutine can only happen when the
coroutine is going to yield soon.

Paolo
Kevin Wolf Nov. 28, 2017, 5:03 p.m. UTC | #9
On 28.11.2017 at 17:42, Jeff Cody wrote:
> On Tue, Nov 28, 2017 at 05:28:50PM +0100, Kevin Wolf wrote:
> > On 28.11.2017 at 17:14, Paolo Bonzini wrote:
> > > On 28/11/2017 16:43, Kevin Wolf wrote:
> > > > +    /* Make sure that a coroutine that can alternatively reentered from two
> > > > +     * different sources isn't reentered more than once when the first caller
> > > > +     * uses aio_co_schedule() and the other one enters to coroutine directly.
> > > > +     * This is achieved by cancelling the pending aio_co_schedule().
> > > > +     *
> > > > +     * The other way round, if aio_co_schedule() would be called after this
> > > > +     * point, this would be a problem, too, but in practice it doesn't happen
> > > > +     * because we're holding the AioContext lock here and aio_co_schedule()
> > > > +     * callers must do the same.
> > > 
> > > No, this is not true.  aio_co_schedule is thread-safe.
> > 
> > Hm... With the reproducer we were specifically looking at
> > qmp_block_job_cancel(), which does take the AioContext locks. But it
> > might not be as universal as I thought.
> > 
> > To be honest, I just wasn't sure what to do with this case anyway. It
> > means that the coroutine is already running when someone else schedules
> > it. We don't really know whether we have to enter it a second time or
> > not.
> > 
> > So if it can indeed happen in practice, we need to think a bit more
> > about this.
> 
> It would be nice if, on coroutine termination, we could unschedule all
> pending executions for that coroutine.  I think use-after-free is the main
> concern for someone else calling aio_co_schedule() while the coroutine is
> currently running.

I'd rather assert that this never happens.

Because if you have the same situation, but the coroutine doesn't
terminate and instead just yields for another reason, then reentering
it with the assumption that it's still in the previous yield is a bad
bug that you definitely want to catch.

Kevin
Jeff Cody Nov. 28, 2017, 5:09 p.m. UTC | #10
On Tue, Nov 28, 2017 at 05:51:21PM +0100, Paolo Bonzini wrote:
> On 28/11/2017 17:42, Jeff Cody wrote:
> > On Tue, Nov 28, 2017 at 05:28:50PM +0100, Kevin Wolf wrote:
> >> On 28.11.2017 at 17:14, Paolo Bonzini wrote:
> >>> On 28/11/2017 16:43, Kevin Wolf wrote:
> >>>> +    /* Make sure that a coroutine that can alternatively reentered from two
> >>>> +     * different sources isn't reentered more than once when the first caller
> >>>> +     * uses aio_co_schedule() and the other one enters to coroutine directly.
> >>>> +     * This is achieved by cancelling the pending aio_co_schedule().
> >>>> +     *
> >>>> +     * The other way round, if aio_co_schedule() would be called after this
> >>>> +     * point, this would be a problem, too, but in practice it doesn't happen
> >>>> +     * because we're holding the AioContext lock here and aio_co_schedule()
> >>>> +     * callers must do the same.
> >>>
> >>> No, this is not true.  aio_co_schedule is thread-safe.
> >>
> >> Hm... With the reproducer we were specfically looking at
> >> qmp_block_job_cancel(), which does take the AioContext locks. But it
> >> might not be as universal as I thought.
> >>
> >> To be honest, I just wasn't sure what to do with this case anyway. It
> >> means that the coroutine is already running when someone else schedules
> >> it. We don't really know whether we have to enter it a second time or
> >> not.
> >>
> >> So if it can indeed happen in practice, we need to think a bit more
> >> about this.
> > 
> > It would be nice if, on coroutine termination, we could unschedule all
> > pending executions for that coroutine.  I think use-after-free is the main
> > concern for someone else calling aio_co_schedule() while the coroutine is
> > currently running.
> 
> Yes, terminating a scheduled coroutine is a bug; same for scheduling a
> terminated coroutine, both orders are wrong. However, "unscheduling" is
> not the solution; you would just be papering over the issue.
> 

Maybe we should at least add an abort on coroutine termination if there are
still outstanding schedules, as that is preferable to operating in the
weeds.


> aio_co_schedule() on a running coroutine can only happen when the
> coroutine is going to yield soon.
> 

That is a bit vague.  What is "soon", and how does an external caller know
if a coroutine is going to yield in this timeframe?


Jeff
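
A sketch of the abort proposed here (the placement in coroutine_delete()
in util/qemu-coroutine.c is illustrative, and the body shown is
simplified; the real function also recycles coroutines through a pool):

    static void coroutine_delete(Coroutine *co)
    {
        /* Proposed: a terminated coroutine must not still be scheduled
         * anywhere, or the pending BH would later enter freed memory. */
        assert(atomic_read(&co->scheduled) == NULL);

        co->caller = NULL;
        qemu_coroutine_delete(co);  /* backend-specific release */
    }
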
Paolo Bonzini Nov. 28, 2017, 5:14 p.m. UTC | #11
On 28/11/2017 18:09, Jeff Cody wrote:
>> Yes, terminating a scheduled coroutine is a bug; same for scheduling a
>> terminated coroutine, both orders are wrong. However, "unscheduling" is
>> not the solution; you would just be papering over the issue.
>
> Maybe we should at least add an abort on coroutine termination if there are
> still outstanding schedules, as that is preferable to operating in the
> weeds.

Sure, why not.  I'm all for adding more assertions (not less :)).

>> aio_co_schedule() on a running coroutine can only happen when the
>> coroutine is going to yield soon.
>>
> That is a bit vague.  What is "soon", and how does an external caller know
> if a coroutine is going to yield in this timeframe?

Soon really means "eventually"; basically if you do

	f();
	qemu_coroutine_yield();

then f() can call aio_co_wake() or aio_co_schedule() and knows that it
will be entirely thread-safe.

However, remember that only one aio_co_schedule() can be pending at a
single time; so if you have

	f();
	g();
	qemu_coroutine_yield();

either f() or g() can wake you up but not both.

Thanks,

Paolo
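
Expressed as code, the constraint looks like this (f() and g() are the
hypothetical helpers from the example above, declared but not defined):

    /* f() arranges exactly one future aio_co_wake() on the current
     * coroutine; g() must arrange none. */
    static void f(void);
    static void g(void);

    static void coroutine_fn example(void *opaque)
    {
        f();    /* may wake us after the yield below: thread-safe */
        g();    /* must NOT schedule a second wakeup: only one
                 * aio_co_schedule()/aio_co_wake() may be pending */
        qemu_coroutine_yield();     /* resumed by f()'s single wakeup */
    }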

Patch

diff --git a/include/qemu/coroutine_int.h b/include/qemu/coroutine_int.h
index cb98892bba..73fc35e54b 100644
--- a/include/qemu/coroutine_int.h
+++ b/include/qemu/coroutine_int.h
@@ -40,6 +40,7 @@  struct Coroutine {
     CoroutineEntry *entry;
     void *entry_arg;
     Coroutine *caller;
+    AioContext *scheduled;
 
     /* Only used when the coroutine has terminated.  */
     QSLIST_ENTRY(Coroutine) pool_next;
diff --git a/util/async.c b/util/async.c
index 0e1bd8780a..dc249e9404 100644
--- a/util/async.c
+++ b/util/async.c
@@ -372,6 +372,7 @@  static bool event_notifier_poll(void *opaque)
 static void co_schedule_bh_cb(void *opaque)
 {
     AioContext *ctx = opaque;
+    AioContext *scheduled_ctx;
     QSLIST_HEAD(, Coroutine) straight, reversed;
 
     QSLIST_MOVE_ATOMIC(&reversed, &ctx->scheduled_coroutines);
@@ -388,7 +389,16 @@  static void co_schedule_bh_cb(void *opaque)
         QSLIST_REMOVE_HEAD(&straight, co_scheduled_next);
         trace_aio_co_schedule_bh_cb(ctx, co);
         aio_context_acquire(ctx);
-        qemu_coroutine_enter(co);
+        scheduled_ctx = atomic_mb_read(&co->scheduled);
+        if (scheduled_ctx == ctx) {
+            qemu_coroutine_enter(co);
+        } else {
+            /* This must be a cancelled aio_co_schedule() because the coroutine
+             * was already entered before this BH had a chance to run. If a
+             * coroutine is scheduled for two different AioContexts, something
+             * is very wrong. */
+            assert(scheduled_ctx == NULL);
+        }
         aio_context_release(ctx);
     }
 }
@@ -438,6 +448,9 @@  fail:
 void aio_co_schedule(AioContext *ctx, Coroutine *co)
 {
     trace_aio_co_schedule(ctx, co);
+
+    /* Set co->scheduled before the coroutine becomes visible in the list */
+    atomic_mb_set(&co->scheduled, ctx);
     QSLIST_INSERT_HEAD_ATOMIC(&ctx->scheduled_coroutines,
                               co, co_scheduled_next);
     qemu_bh_schedule(ctx->co_schedule_bh);
diff --git a/util/qemu-coroutine.c b/util/qemu-coroutine.c
index d6095c1d5a..c515b3cb4a 100644
--- a/util/qemu-coroutine.c
+++ b/util/qemu-coroutine.c
@@ -122,6 +122,23 @@  void qemu_aio_coroutine_enter(AioContext *ctx, Coroutine *co)
      */
     smp_wmb();
 
+    /* Make sure that a coroutine that can alternatively reentered from two
+     * different sources isn't reentered more than once when the first caller
+     * uses aio_co_schedule() and the other one enters to coroutine directly.
+     * This is achieved by cancelling the pending aio_co_schedule().
+     *
+     * The other way round, if aio_co_schedule() would be called after this
+     * point, this would be a problem, too, but in practice it doesn't happen
+     * because we're holding the AioContext lock here and aio_co_schedule()
+     * callers must do the same. This means that the coroutine just needs to
+     * prevent other callers from calling aio_co_schedule() before it yields
+     * (e.g. block job coroutines by setting job->busy = true).
+     *
+     * We still want to ensure that the second case doesn't happen, so reset
+     * co->scheduled only after setting co->caller to make the above check
+     * effective for the co_schedule_bh_cb() case. */
+    atomic_set(&co->scheduled, NULL);
+
     ret = qemu_coroutine_switch(self, co, COROUTINE_ENTER);
 
     qemu_co_queue_run_restart(co);