
[RFC] bdrv_flush: only use fast path when in owned AioContext

Message ID 20200511165032.11384-1-s.reiter@proxmox.com
State New
Series [RFC] bdrv_flush: only use fast path when in owned AioContext

Commit Message

Stefan Reiter May 11, 2020, 4:50 p.m. UTC
Just because we're in a coroutine doesn't imply ownership of the context
of the flushed drive. In such a case use the slow path which explicitly
enters bdrv_flush_co_entry in the correct AioContext.

Signed-off-by: Stefan Reiter <s.reiter@proxmox.com>
---

We've experienced some lockups in this codepath when taking snapshots of VMs
with drives that have IO-Threads enabled (we have an async 'savevm'
implementation running from a coroutine).

I could not yet find a reproducer against upstream, but in our testing this
patch fixes all the issues we're seeing, and I think the logic checks out.

The fast path pattern is repeated a few times in this file, so if this change
makes sense, it's probably worth evaluating the other occurrences as well.

 block/io.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

Comments

Kevin Wolf May 12, 2020, 10:57 a.m. UTC | #1
Am 11.05.2020 um 18:50 hat Stefan Reiter geschrieben:
> Just because we're in a coroutine doesn't imply ownership of the context
> of the flushed drive. In such a case use the slow path which explicitly
> enters bdrv_flush_co_entry in the correct AioContext.
> 
> Signed-off-by: Stefan Reiter <s.reiter@proxmox.com>
> ---
> 
> We've experienced some lockups in this codepath when taking snapshots of VMs
> with drives that have IO-Threads enabled (we have an async 'savevm'
> implementation running from a coroutine).
> 
> I could not yet find a reproducer against upstream, but in our testing this
> patch fixes all the issues we're seeing, and I think the logic checks out.
> 
> The fast path pattern is repeated a few times in this file, so if this change
> makes sense, it's probably worth evaluating the other occurrences as well.

What do you mean by "owning" the context? If it's about taking the
AioContext lock, isn't the problem more with calling bdrv_flush() from
code that doesn't take the locks?

Though I think we have some code that doesn't only rely on holding the
AioContext locks, but that actually depends on running in the right
thread, so the change looks right anyway.

Kevin

>  block/io.c | 5 +++--
>  1 file changed, 3 insertions(+), 2 deletions(-)
> 
> diff --git a/block/io.c b/block/io.c
> index aba67f66b9..ee7310fa13 100644
> --- a/block/io.c
> +++ b/block/io.c
> @@ -2895,8 +2895,9 @@ int bdrv_flush(BlockDriverState *bs)
>          .ret = NOT_DONE,
>      };
>  
> -    if (qemu_in_coroutine()) {
> -        /* Fast-path if already in coroutine context */
> +    if (qemu_in_coroutine() &&
> +        bdrv_get_aio_context(bs) == qemu_get_current_aio_context()) {
> +        /* Fast-path if already in coroutine and we own the drive's context */
>          bdrv_flush_co_entry(&flush_co);
>      } else {
>          co = qemu_coroutine_create(bdrv_flush_co_entry, &flush_co);
> -- 
> 2.20.1
> 
>
Kevin Wolf May 12, 2020, 11:32 a.m. UTC | #2
Am 12.05.2020 um 12:57 hat Kevin Wolf geschrieben:
> Am 11.05.2020 um 18:50 hat Stefan Reiter geschrieben:
> > Just because we're in a coroutine doesn't imply ownership of the context
> > of the flushed drive. In such a case use the slow path which explicitly
> > enters bdrv_flush_co_entry in the correct AioContext.
> > 
> > Signed-off-by: Stefan Reiter <s.reiter@proxmox.com>
> > ---
> > 
> > We've experienced some lockups in this codepath when taking snapshots of VMs
> > with drives that have IO-Threads enabled (we have an async 'savevm'
> > implementation running from a coroutine).
> > 
> > I could not yet find a reproducer against upstream, but in our testing this
> > patch fixes all the issues we're seeing, and I think the logic checks out.
> > 
> > The fast path pattern is repeated a few times in this file, so if this change
> > makes sense, it's probably worth evaluating the other occurrences as well.
> 
> What do you mean by "owning" the context? If it's about taking the
> AioContext lock, isn't the problem more with calling bdrv_flush() from
> code that doesn't take the locks?
> 
> Though I think we have some code that doesn't only rely on holding the
> AioContext locks, but that actually depends on running in the right
> thread, so the change looks right anyway.

Well, the idea is right, but the change itself isn't, of course. If
we're already in coroutine context, we must not busy wait with
BDRV_POLL_WHILE(). I'll see if I can put something together after lunch.

Kevin
Stefan Reiter May 12, 2020, 12:22 p.m. UTC | #3
On 5/12/20 1:32 PM, Kevin Wolf wrote:
> Am 12.05.2020 um 12:57 hat Kevin Wolf geschrieben:
>> Am 11.05.2020 um 18:50 hat Stefan Reiter geschrieben:
>>> Just because we're in a coroutine doesn't imply ownership of the context
>>> of the flushed drive. In such a case use the slow path which explicitly
>>> enters bdrv_flush_co_entry in the correct AioContext.
>>>
>>> Signed-off-by: Stefan Reiter <s.reiter@proxmox.com>
>>> ---
>>>
>>> We've experienced some lockups in this codepath when taking snapshots of VMs
>>> with drives that have IO-Threads enabled (we have an async 'savevm'
>>> implementation running from a coroutine).
>>>
>>> I could not yet find a reproducer against upstream, but in our testing this
>>> patch fixes all the issues we're seeing, and I think the logic checks out.
>>>
>>> The fast path pattern is repeated a few times in this file, so if this change
>>> makes sense, it's probably worth evaluating the other occurrences as well.
>>
>> What do you mean by "owning" the context? If it's about taking the
>> AioContext lock, isn't the problem more with calling bdrv_flush() from
>> code that doesn't take the locks?
>>
>> Though I think we have some code that doesn't only rely on holding the
>> AioContext locks, but that actually depends on running in the right
>> thread, so the change looks right anyway.

"Owning" as in it only works (doesn't hang) when bdrv_flush_co_entry 
runs on the same AioContext that the BlockDriverState it's flushing 
belongs to.

We hold the locks for all AioContexts we want to flush in our code (in 
this case called from do_vm_stop/bdrv_flush_all so we're even in a 
drained section).

> 
> Well, the idea is right, but the change itself isn't, of course. If
> we're already in coroutine context, we must not busy wait with
> BDRV_POLL_WHILE(). I'll see if I can put something together after lunch.
> 
> Kevin
> 
> 

Thanks for taking a look!

Patch

diff --git a/block/io.c b/block/io.c
index aba67f66b9..ee7310fa13 100644
--- a/block/io.c
+++ b/block/io.c
@@ -2895,8 +2895,9 @@ int bdrv_flush(BlockDriverState *bs)
         .ret = NOT_DONE,
     };
 
-    if (qemu_in_coroutine()) {
-        /* Fast-path if already in coroutine context */
+    if (qemu_in_coroutine() &&
+        bdrv_get_aio_context(bs) == qemu_get_current_aio_context()) {
+        /* Fast-path if already in coroutine and we own the drive's context */
         bdrv_flush_co_entry(&flush_co);
     } else {
         co = qemu_coroutine_create(bdrv_flush_co_entry, &flush_co);