
[4/4] migration: add missed aio_context_acquire into HMP snapshot code

Message ID 1446044465-19312-5-git-send-email-den@openvz.org
State New

Commit Message

Denis V. Lunev Oct. 28, 2015, 3:01 p.m. UTC
aio_context should be locked in a similar way to what is done in QMP
snapshot creation; otherwise a lot of problems are possible if native
AIO mode is enabled for the disk.

- the command can hang (HMP thread) with missed wakeup (the operation is
  actually complete)
    io_submit
    ioq_submit
    laio_submit
    raw_aio_submit
    raw_aio_readv
    bdrv_co_io_em
    bdrv_co_readv_em
    bdrv_aligned_preadv
    bdrv_co_do_preadv
    bdrv_co_do_readv
    bdrv_co_readv
    qcow2_co_readv
    bdrv_aligned_preadv
    bdrv_co_do_pwritev
    bdrv_rw_co_entry

- QEMU can assert in coroutine re-enter
    __GI_abort
    qemu_coroutine_enter
    bdrv_co_io_em_complete
    qemu_laio_process_completion
    qemu_laio_completion_bh
    aio_bh_poll
    aio_dispatch
    aio_poll
    iothread_run

qemu_fopen_bdrv and bdrv_fclose are used only in real snapshot operations,
together with block drivers, so this change should affect only HMP snapshot
operations.

The AioContext lock is recursive, thus nested locking should not be a problem.

Signed-off-by: Denis V. Lunev <den@openvz.org>
CC: Stefan Hajnoczi <stefanha@redhat.com>
CC: Paolo Bonzini <pbonzini@redhat.com>
CC: Juan Quintela <quintela@redhat.com>
CC: Amit Shah <amit.shah@redhat.com>
---
 block/snapshot.c   |  5 +++++
 migration/savevm.c | 18 +++++++++++++++---
 2 files changed, 20 insertions(+), 3 deletions(-)
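For context, the effective lock scope of the HMP savevm path after this patch
is sketched below. This is a simplified illustration, not the actual code: the
sketch function name is made up, error handling is omitted and the state-writing
call is elided. It only shows that the AioContext acquired in qemu_fopen_bdrv()
stays held until qemu_fclose() reaches bdrv_fclose().

/* Simplified sketch, assuming the usual QEMU block/migration declarations. */
static void savevm_lock_scope_sketch(BlockDriverState *bs)
{
    QEMUFile *f = qemu_fopen_bdrv(bs, 1);  /* acquires the AioContext of bs */

    if (f == NULL) {
        return;
    }

    /* qemu_savevm_state(f, ...) writes the VM state here, with the
     * AioContext still held across every bdrv_* call it makes.
     */

    qemu_fclose(f);                        /* bdrv_fclose() flushes and then
                                              releases the AioContext */
}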

Comments

Juan Quintela Oct. 28, 2015, 3:33 p.m. UTC | #1
"Denis V. Lunev" <den@openvz.org> wrote:
> aio_context should be locked in a similar way to what is done in QMP
> snapshot creation; otherwise a lot of problems are possible if native
> AIO mode is enabled for the disk.
>
> - the command can hang (HMP thread) with missed wakeup (the operation is
>   actually complete)
>     io_submit
>     ioq_submit
>     laio_submit
>     raw_aio_submit
>     raw_aio_readv
>     bdrv_co_io_em
>     bdrv_co_readv_em
>     bdrv_aligned_preadv
>     bdrv_co_do_preadv
>     bdrv_co_do_readv
>     bdrv_co_readv
>     qcow2_co_readv
>     bdrv_aligned_preadv
>     bdrv_co_do_pwritev
>     bdrv_rw_co_entry
>
> - QEMU can assert in coroutine re-enter
>     __GI_abort
>     qemu_coroutine_enter
>     bdrv_co_io_em_complete
>     qemu_laio_process_completion
>     qemu_laio_completion_bh
>     aio_bh_poll
>     aio_dispatch
>     aio_poll
>     iothread_run
>
> qemu_fopen_bdrv and bdrv_fclose are used only in real snapshot operations,
> together with block drivers, so this change should affect only HMP snapshot
> operations.
>
> The AioContext lock is recursive, thus nested locking should not be a problem.
>
> Signed-off-by: Denis V. Lunev <den@openvz.org>
> CC: Stefan Hajnoczi <stefanha@redhat.com>
> CC: Paolo Bonzini <pbonzini@redhat.com>
> CC: Juan Quintela <quintela@redhat.com>
> CC: Amit Shah <amit.shah@redhat.com>

Reviewed-by: Juan Quintela <quintela@redhat.com>

Should this one go through the block layer?  I guess it should go
through the block layer, but otherwise I will take it.


> -    return bdrv_flush(opaque);

> +    BlockDriverState *bs = (BlockDriverState *)opaque;

Cast not needed.

BlockDriverState *bs = opaque;

is even better.

Thanks, Juan.
Denis V. Lunev Oct. 28, 2015, 3:57 p.m. UTC | #2
On 10/28/2015 06:33 PM, Juan Quintela wrote:
> "Denis V. Lunev" <den@openvz.org> wrote:
>> aio_context should be locked in a similar way to what is done in QMP
>> snapshot creation; otherwise a lot of problems are possible if native
>> AIO mode is enabled for the disk.
>>
>> - the command can hang (HMP thread) with missed wakeup (the operation is
>>    actually complete)
>>      io_submit
>>      ioq_submit
>>      laio_submit
>>      raw_aio_submit
>>      raw_aio_readv
>>      bdrv_co_io_em
>>      bdrv_co_readv_em
>>      bdrv_aligned_preadv
>>      bdrv_co_do_preadv
>>      bdrv_co_do_readv
>>      bdrv_co_readv
>>      qcow2_co_readv
>>      bdrv_aligned_preadv
>>      bdrv_co_do_pwritev
>>      bdrv_rw_co_entry
>>
>> - QEMU can assert in coroutine re-enter
>>      __GI_abort
>>      qemu_coroutine_enter
>>      bdrv_co_io_em_complete
>>      qemu_laio_process_completion
>>      qemu_laio_completion_bh
>>      aio_bh_poll
>>      aio_dispatch
>>      aio_poll
>>      iothread_run
>>
>> qemu_fopen_bdrv and bdrv_fclose are used only in real snapshot operations,
>> together with block drivers, so this change should affect only HMP snapshot
>> operations.
>>
>> The AioContext lock is recursive, thus nested locking should not be a problem.
>>
>> Signed-off-by: Denis V. Lunev <den@openvz.org>
>> CC: Stefan Hajnoczi <stefanha@redhat.com>
>> CC: Paolo Bonzini <pbonzini@redhat.com>
>> CC: Juan Quintela <quintela@redhat.com>
>> CC: Amit Shah <amit.shah@redhat.com>
> Reviewed-by: Juan Quintela <quintela@redhat.com>
>
> Should this one go through the block layer?  I guess it should go
> through the block layer, but otherwise I will take it.

let's wait for Stefan's opinion :)

Either way would be fine with me, but I want to have
the previous patches from the set committed too. They
should definitely flow through the block tree, thus
the block tree would be better.

Anyway, I can retry the process with patches 1-3
if you get patch 4 through your queue.

Den
Stefan Hajnoczi Oct. 30, 2015, 3:52 p.m. UTC | #3
On Wed, Oct 28, 2015 at 06:01:05PM +0300, Denis V. Lunev wrote:
> diff --git a/block/snapshot.c b/block/snapshot.c
> index 89500f2..f6fa17a 100644
> --- a/block/snapshot.c
> +++ b/block/snapshot.c
> @@ -259,6 +259,9 @@ void bdrv_snapshot_delete_by_id_or_name(BlockDriverState *bs,
>  {
>      int ret;
>      Error *local_err = NULL;
> +    AioContext *aio_context = bdrv_get_aio_context(bs);
> +
> +    aio_context_acquire(aio_context);
>  
>      ret = bdrv_snapshot_delete(bs, id_or_name, NULL, &local_err);
>      if (ret == -ENOENT || ret == -EINVAL) {
> @@ -267,6 +270,8 @@ void bdrv_snapshot_delete_by_id_or_name(BlockDriverState *bs,
>          ret = bdrv_snapshot_delete(bs, NULL, id_or_name, &local_err);
>      }
>  
> +    aio_context_release(aio_context);
> +
>      if (ret < 0) {
>          error_propagate(errp, local_err);
>      }

Please make the caller acquire the AioContext instead of modifying
bdrv_snapshot_delete_id_or_name() because no other functions in this
file acquire AioContext and the API should be consistent.

There's no harm in recursive locking but it is hard to write correct
code if related functions differ in whether or not they acquire the
AioContext.  Either all of them should acquire AioContext or none of
them.
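
Concretely, the caller-acquired variant could be a small helper on the
migration side, along these lines (sketch only; the helper name is just
for illustration and error handling is omitted):

static void snapshot_delete_locked(BlockDriverState *bs, const char *name,
                                   Error **errp)
{
    AioContext *ctx = bdrv_get_aio_context(bs);

    /* take the lock at the call site, keep block/snapshot.c lock-free */
    aio_context_acquire(ctx);
    bdrv_snapshot_delete_by_id_or_name(bs, name, errp);
    aio_context_release(ctx);
}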
Juan Quintela Nov. 3, 2015, 2:48 p.m. UTC | #4
Stefan Hajnoczi <stefanha@redhat.com> wrote:
> On Wed, Oct 28, 2015 at 06:01:05PM +0300, Denis V. Lunev wrote:
>> diff --git a/block/snapshot.c b/block/snapshot.c
>> index 89500f2..f6fa17a 100644
>> --- a/block/snapshot.c
>> +++ b/block/snapshot.c
>> @@ -259,6 +259,9 @@ void bdrv_snapshot_delete_by_id_or_name(BlockDriverState *bs,
>>  {
>>      int ret;
>>      Error *local_err = NULL;
>> +    AioContext *aio_context = bdrv_get_aio_context(bs);
>> +
>> +    aio_context_acquire(aio_context);
>>  
>>      ret = bdrv_snapshot_delete(bs, id_or_name, NULL, &local_err);
>>      if (ret == -ENOENT || ret == -EINVAL) {
>> @@ -267,6 +270,8 @@ void bdrv_snapshot_delete_by_id_or_name(BlockDriverState *bs,
>>          ret = bdrv_snapshot_delete(bs, NULL, id_or_name, &local_err);
>>      }
>>  
>> +    aio_context_release(aio_context);
>> +
>>      if (ret < 0) {
>>          error_propagate(errp, local_err);
>>      }
>
> Please make the caller acquire the AioContext instead of modifying
> bdrv_snapshot_delete_id_or_name() because no other functions in this
> file acquire AioContext and the API should be consistent.

That is wrong (TM).  No other functions in migration/* know what an
aiocontext is, and they are fine, thanks O:-)

So, I guess we would have to get some other function exported from the
block layer, with the aiocontext taken?

Code ends up being like this:


     while ((bs = bdrv_next(bs))) {
         if (bdrv_can_snapshot(bs) &&
             bdrv_snapshot_find(bs, snapshot, name) >= 0) {
             AioContext *ctx = bdrv_get_aio_context(bs);

             aio_context_acquire(ctx);
             bdrv_snapshot_delete_by_id_or_name(bs, name, &err);
             aio_context_release(ctx);
             .... some error handling here ...
         }
     }


As discussed on irc, we need to get some function exported from the
block layer that does this.

I am sure that I don't understand the differences between hmp_delvm()
and del_existing_snapshots().

>
> There's no harm in recursive locking but it is hard to write correct
> code if related functions differ in whether or not they acquire the
> AioContext.  Either all of them should acquire AioContext or none of
> them.

I don't like recursive locking, but that is a different question,
altogether.

Denis, on IRC Stefan says that the new locking is not valid either, so
let's work from there.

Thanks, Juan.
Stefan Hajnoczi Nov. 3, 2015, 3:30 p.m. UTC | #5
On Tue, Nov 03, 2015 at 03:48:07PM +0100, Juan Quintela wrote:
> Stefan Hajnoczi <stefanha@redhat.com> wrote:
> > On Wed, Oct 28, 2015 at 06:01:05PM +0300, Denis V. Lunev wrote:
> >> diff --git a/block/snapshot.c b/block/snapshot.c
> >> index 89500f2..f6fa17a 100644
> >> --- a/block/snapshot.c
> >> +++ b/block/snapshot.c
> >> @@ -259,6 +259,9 @@ void bdrv_snapshot_delete_by_id_or_name(BlockDriverState *bs,
> >>  {
> >>      int ret;
> >>      Error *local_err = NULL;
> >> +    AioContext *aio_context = bdrv_get_aio_context(bs);
> >> +
> >> +    aio_context_acquire(aio_context);
> >>  
> >>      ret = bdrv_snapshot_delete(bs, id_or_name, NULL, &local_err);
> >>      if (ret == -ENOENT || ret == -EINVAL) {
> >> @@ -267,6 +270,8 @@ void bdrv_snapshot_delete_by_id_or_name(BlockDriverState *bs,
> >>          ret = bdrv_snapshot_delete(bs, NULL, id_or_name, &local_err);
> >>      }
> >>  
> >> +    aio_context_release(aio_context);
> >> +
> >>      if (ret < 0) {
> >>          error_propagate(errp, local_err);
> >>      }
> >
> > Please make the caller acquire the AioContext instead of modifying
> > bdrv_snapshot_delete_id_or_name() because no other functions in this
> > file acquire AioContext and the API should be consistent.
> 
> That is wrong (TM).  No other functions in migration/* know what an
> aiocontext is, and they are fine, thanks O:-)

To clarify my comment:

APIs should have a consistent locking strategy.  Either all of the
block/snapshot.c public functions should take the lock or none of them
should.

With an inconsistent locking strategy it's really hard to review code
and ensure it is correct because you need to look up for each function
whether or not it takes the lock internally.

> So, I guess we would have to get some other function exported from the
> block layer, with the aiocontext taken?
> 
> Code ends up being like this:
> 
> 
>      while ((bs = bdrv_next(bs))) {
>          if (bdrv_can_snapshot(bs) &&
>              bdrv_snapshot_find(bs, snapshot, name) >= 0) {
>              AioContext *ctx = bdrv_get_aio_context(bs);
> 
>              aio_context_acquire(ctx);
>              bdrv_snapshot_delete_by_id_or_name(bs, name, &err);
>             aio_context_release(ctx);
>          .... some error handling here ...
>     }
> 
> 
> As discussed on irc, we need to get some function exported from the
> block layer that does this.
> 
> I am sure that I don't understand the differences between hmp_delvm()
> and del_existing_snapshots().

On IRC I commented when you posted this code because there's a bug:

bdrv_can_snapshot() and bdrv_snapshot_find() must be called with
AioContext acquired.  So the function should actually be:

while ((bs = bdrv_next(bs))) {
    AioContext *ctx = bdrv_get_aio_context(bs);

    aio_context_acquire(ctx);

    if (bdrv_can_snapshot(bs) &&
        ...

    aio_context_release(ctx);
}

Stefan
Denis V. Lunev Nov. 3, 2015, 3:36 p.m. UTC | #6
On 11/03/2015 06:30 PM, Stefan Hajnoczi wrote:
> On Tue, Nov 03, 2015 at 03:48:07PM +0100, Juan Quintela wrote:
>> Stefan Hajnoczi <stefanha@redhat.com> wrote:
>>> On Wed, Oct 28, 2015 at 06:01:05PM +0300, Denis V. Lunev wrote:
>>>> diff --git a/block/snapshot.c b/block/snapshot.c
>>>> index 89500f2..f6fa17a 100644
>>>> --- a/block/snapshot.c
>>>> +++ b/block/snapshot.c
>>>> @@ -259,6 +259,9 @@ void bdrv_snapshot_delete_by_id_or_name(BlockDriverState *bs,
>>>>   {
>>>>       int ret;
>>>>       Error *local_err = NULL;
>>>> +    AioContext *aio_context = bdrv_get_aio_context(bs);
>>>> +
>>>> +    aio_context_acquire(aio_context);
>>>>   
>>>>       ret = bdrv_snapshot_delete(bs, id_or_name, NULL, &local_err);
>>>>       if (ret == -ENOENT || ret == -EINVAL) {
>>>> @@ -267,6 +270,8 @@ void bdrv_snapshot_delete_by_id_or_name(BlockDriverState *bs,
>>>>           ret = bdrv_snapshot_delete(bs, NULL, id_or_name, &local_err);
>>>>       }
>>>>   
>>>> +    aio_context_release(aio_context);
>>>> +
>>>>       if (ret < 0) {
>>>>           error_propagate(errp, local_err);
>>>>       }
>>> Please make the caller acquire the AioContext instead of modifying
>>> bdrv_snapshot_delete_id_or_name() because no other functions in this
>>> file acquire AioContext and the API should be consistent.
>> That is wrong (TM).  No other functions in migration/* know what an
>> aiocontext is, and they are fine, thanks O:-)
> To clarify my comment:
>
> APIs should have a consistent locking strategy.  Either all of the
> block/snapshot.c public functions should take the lock or none of them
> should.
>
> With an inconsistent locking strategy it's really hard to review code
> and ensure it is correct because you need to look up for each function
> whether or not it takes the lock internally.
>
>> So, I guess we would have to get some other function exported from the
>> block layer, with the aiocontext taken?
>>
>> Code ends up being like this:
>>
>>
>>       while ((bs = bdrv_next(bs))) {
>>           if (bdrv_can_snapshot(bs) &&
>>               bdrv_snapshot_find(bs, snapshot, name) >= 0) {
>>               AioContext *ctx = bdrv_get_aio_context(bs);
>>
>>               aio_context_acquire(ctx);
>>               bdrv_snapshot_delete_by_id_or_name(bs, name, &err);
>>              aio_context_release(ctx);
>>           .... some error handling here ...
>>      }
>>
>>
>> As discussed on irc, we need to get some function exported from the
>> block layer that does this.
>>
>> I am sure that I don't understand the differences between hmp_delvm()
>> and del_existing_snapshots().
> On IRC I commented when you posted this code because there's a bug:
>
> bdrv_can_snapshot() and bdrv_snapshot_find() must be called with
> AioContext acquired.  So the function should actually be:
>
> while ((bs = bdrv_next(bs))) {
>      AioContext *ctx = bdrv_get_aio_context(bs);
>
>      aio_context_acquire(ctx);
>
>      if (bdrv_can_snapshot(bs) &&
>          ...
>
>      aio_context_release(ctx);
> }
this is not that necessary after patch 6, which adds
a guard to the direct AIO submission code.

Anyway, I would tend to agree that locking
MUST be consistent and it MUST be ENFORCED
in the proper places.

Truly speaking, we must place assert(aio_context_is_owner)
at the beginning of each externally visible function in
the block layer, though this is too late for 2.5.
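
Roughly, that enforcement could look like this at the top of every
externally visible entry point (sketch only; aio_context_is_owner()
is hypothetical, no such helper exists in the tree yet):

/* Hypothetical sketch: aio_context_is_owner() would have to check that
 * the calling thread currently holds the context's recursive lock.
 */
void bdrv_snapshot_delete_by_id_or_name(BlockDriverState *bs,
                                        const char *id_or_name,
                                        Error **errp)
{
    assert(aio_context_is_owner(bdrv_get_aio_context(bs)));

    /* ... existing body unchanged ... */
}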

Den

Patch

diff --git a/block/snapshot.c b/block/snapshot.c
index 89500f2..f6fa17a 100644
--- a/block/snapshot.c
+++ b/block/snapshot.c
@@ -259,6 +259,9 @@  void bdrv_snapshot_delete_by_id_or_name(BlockDriverState *bs,
 {
     int ret;
     Error *local_err = NULL;
+    AioContext *aio_context = bdrv_get_aio_context(bs);
+
+    aio_context_acquire(aio_context);
 
     ret = bdrv_snapshot_delete(bs, id_or_name, NULL, &local_err);
     if (ret == -ENOENT || ret == -EINVAL) {
@@ -267,6 +270,8 @@  void bdrv_snapshot_delete_by_id_or_name(BlockDriverState *bs,
         ret = bdrv_snapshot_delete(bs, NULL, id_or_name, &local_err);
     }
 
+    aio_context_release(aio_context);
+
     if (ret < 0) {
         error_propagate(errp, local_err);
     }
diff --git a/migration/savevm.c b/migration/savevm.c
index dbcc39a..1653f56 100644
--- a/migration/savevm.c
+++ b/migration/savevm.c
@@ -153,7 +153,11 @@  static ssize_t block_get_buffer(void *opaque, uint8_t *buf, int64_t pos,
 
 static int bdrv_fclose(void *opaque)
 {
-    return bdrv_flush(opaque);
+    BlockDriverState *bs = (BlockDriverState *)opaque;
+    int ret = bdrv_flush(bs);
+
+    aio_context_release(bdrv_get_aio_context(bs));
+    return ret;
 }
 
 static const QEMUFileOps bdrv_read_ops = {
@@ -169,10 +173,18 @@  static const QEMUFileOps bdrv_write_ops = {
 
 static QEMUFile *qemu_fopen_bdrv(BlockDriverState *bs, int is_writable)
 {
+    QEMUFile *file;
+
     if (is_writable) {
-        return qemu_fopen_ops(bs, &bdrv_write_ops);
+        file = qemu_fopen_ops(bs, &bdrv_write_ops);
+    } else {
+        file = qemu_fopen_ops(bs, &bdrv_read_ops);
+    }
+
+    if (file != NULL) {
+        aio_context_acquire(bdrv_get_aio_context(bs));
     }
-    return qemu_fopen_ops(bs, &bdrv_read_ops);
+    return file;
 }