diff mbox

[for-2.9,4/5] block: Drain BH in bdrv_drained_begin

Message ID 20170406142527.25835-5-famz@redhat.com
State New
Headers show

Commit Message

Fam Zheng April 6, 2017, 2:25 p.m. UTC
During block job completion, nothing is preventing
block_job_defer_to_main_loop_bh from being called in a nested
aio_poll(), which is a trouble, such as in this code path:

    qmp_block_commit
      commit_active_start
        bdrv_reopen
          bdrv_reopen_multiple
            bdrv_reopen_prepare
              bdrv_flush
                aio_poll
                  aio_bh_poll
                    aio_bh_call
                      block_job_defer_to_main_loop_bh
                        stream_complete
                          bdrv_reopen

block_job_defer_to_main_loop_bh is the last step of the stream job,
which should have been "paused" by the bdrv_drained_begin/end in
bdrv_reopen_multiple, but it is not done because it's in the form of a
main loop BH.

Similar to why block jobs should be paused between drained_begin and
drained_end, BHs they schedule must be excluded as well.  To achieve
this, this patch forces draining the BH before leaving bdrv_drained_begin().

Signed-off-by: Fam Zheng <famz@redhat.com>
---
 block/io.c | 7 ++++++-
 1 file changed, 6 insertions(+), 1 deletion(-)

Comments

Eric Blake April 6, 2017, 3:10 p.m. UTC | #1
On 04/06/2017 09:25 AM, Fam Zheng wrote:
> During block job completion, nothing is preventing
> block_job_defer_to_main_loop_bh from being called in a nested
> aio_poll(), which is a trouble, such as in this code path:
> 
>     qmp_block_commit
>       commit_active_start
>         bdrv_reopen
>           bdrv_reopen_multiple
>             bdrv_reopen_prepare
>               bdrv_flush
>                 aio_poll
>                   aio_bh_poll
>                     aio_bh_call
>                       block_job_defer_to_main_loop_bh
>                         stream_complete
>                           bdrv_reopen
> 
> block_job_defer_to_main_loop_bh is the last step of the stream job,
> which should have been "paused" by the bdrv_drained_begin/end in
> bdrv_reopen_multiple, but it is not done because it's in the form of a
> main loop BH.
> 
> Similar to why block jobs should be paused between drained_begin and
> drained_end, BHs they schedule must be excluded as well.  To achieve
> this, this patch forces draining the BH before leaving bdrv_drained_begin().
> 
> Signed-off-by: Fam Zheng <famz@redhat.com>
> ---
>  block/io.c | 7 ++++++-
>  1 file changed, 6 insertions(+), 1 deletion(-)
> 

Nice writeup.

Reviewed-by: Eric Blake <eblake@redhat.com>
Kevin Wolf April 6, 2017, 3:37 p.m. UTC | #2
Am 06.04.2017 um 16:25 hat Fam Zheng geschrieben:
> During block job completion, nothing is preventing
> block_job_defer_to_main_loop_bh from being called in a nested
> aio_poll(), which is a trouble, such as in this code path:
> 
>     qmp_block_commit
>       commit_active_start
>         bdrv_reopen
>           bdrv_reopen_multiple
>             bdrv_reopen_prepare
>               bdrv_flush
>                 aio_poll
>                   aio_bh_poll
>                     aio_bh_call
>                       block_job_defer_to_main_loop_bh
>                         stream_complete
>                           bdrv_reopen
> 
> block_job_defer_to_main_loop_bh is the last step of the stream job,
> which should have been "paused" by the bdrv_drained_begin/end in
> bdrv_reopen_multiple, but it is not done because it's in the form of a
> main loop BH.
> 
> Similar to why block jobs should be paused between drained_begin and
> drained_end, BHs they schedule must be excluded as well.  To achieve
> this, this patch forces draining the BH before leaving bdrv_drained_begin().
> 
> Signed-off-by: Fam Zheng <famz@redhat.com>

We used to poll BHs in earlier times. Commit 99723548 ('block: add BDS
field to count in-flight requests') changed this, without an explanation
in the commit message.

Paolo, is this simply a bug in that commit, or do you rely on it
somewhere? I remember that you wanted to get rid of some aio_poll()
calls a while ago.

>  block/io.c | 7 ++++++-
>  1 file changed, 6 insertions(+), 1 deletion(-)
> 
> diff --git a/block/io.c b/block/io.c
> index 2709a70..b9cfd18 100644
> --- a/block/io.c
> +++ b/block/io.c
> @@ -228,7 +228,12 @@ void bdrv_drained_begin(BlockDriverState *bs)
>          bdrv_parent_drained_begin(bs);
>      }
>  
> -    bdrv_drain_recurse(bs);
> +    while (true) {
> +        if (!bdrv_drain_recurse(bs) &&
> +            !aio_poll(bdrv_get_aio_context(bs), false)) {
> +                break;
> +            }

The indentation is off here.

> +    }
>  }

The old code had this in what is now the BDRV_POLL_WHILE() call in
bdrv_drain_recurse(). I think it makes more sense there, saves you a
loop here and fixes bdrv_drain_all_begin() at the same time.

Kevin
Paolo Bonzini April 14, 2017, 9:15 a.m. UTC | #3
On 06/04/2017 23:37, Kevin Wolf wrote:
> We used to poll BHs in earlier times. Commit 99723548 ('block: add BDS
> field to count in-flight requests') changed this, without an explanation
> in the commit message.
> 
> Paolo, is this simply a bug in that commit, or do you rely on it
> somewhere? I remember that you wanted to get rid of some aio_poll()
> calls a while ago.

It was replaced by the bdrv_inc_in_flight/bdrv_dec_in_flight calls in
bdrv_aio_prw and bdrv_co_complete.

These let bdrv_drain complete the invocation of the callbacks.  But
block_job_defer_to_main_loop is different because it schedules the
bottom half in another QEMU AioContext (the main thread's).

Since he's here with me, I'll ask Fam exactly what's happening.  Stefan
also found something not too convincing in his review.

Paolo

>>  block/io.c | 7 ++++++-
>>  1 file changed, 6 insertions(+), 1 deletion(-)
>>
>> diff --git a/block/io.c b/block/io.c
>> index 2709a70..b9cfd18 100644
>> --- a/block/io.c
>> +++ b/block/io.c
>> @@ -228,7 +228,12 @@ void bdrv_drained_begin(BlockDriverState *bs)
>>          bdrv_parent_drained_begin(bs);
>>      }
>>  
>> -    bdrv_drain_recurse(bs);
>> +    while (true) {
>> +        if (!bdrv_drain_recurse(bs) &&
>> +            !aio_poll(bdrv_get_aio_context(bs), false)) {
>> +                break;
>> +            }
> 
> The indentation is off here.
> 
>> +    }
>>  }
> 
> The old code had this in what is now the BDRV_POLL_WHILE() call in
> bdrv_drain_recurse(). I think it makes more sense there, saves you a
> loop here and fixes bdrv_drain_all_begin() at the same time.
> 
> Kevin
> 
>
diff mbox

Patch

diff --git a/block/io.c b/block/io.c
index 2709a70..b9cfd18 100644
--- a/block/io.c
+++ b/block/io.c
@@ -228,7 +228,12 @@  void bdrv_drained_begin(BlockDriverState *bs)
         bdrv_parent_drained_begin(bs);
     }
 
-    bdrv_drain_recurse(bs);
+    while (true) {
+        if (!bdrv_drain_recurse(bs) &&
+            !aio_poll(bdrv_get_aio_context(bs), false)) {
+                break;
+            }
+    }
 }
 
 void bdrv_drained_end(BlockDriverState *bs)