Message ID | d13857a6196c4bc8bbc6e3e290fc81fe@h3c.com |
---|---|
State | New |
Series | [v2] migration: Don't allow migration if vm is in POSTMIGRATE state |
07.12.2020 10:44, Tuguoyi wrote:
> The following steps will cause qemu assertion failure:
> - pause vm by executing 'virsh suspend'
> - create external snapshot of memory and disk using 'virsh snapshot-create-as'
> - doing the above operation again will cause qemu crash
>
> The backtrace looks like:
> #0  0x00007fbf958c5c37 in raise () from /lib/x86_64-linux-gnu/libc.so.6
> #1  0x00007fbf958c9028 in abort () from /lib/x86_64-linux-gnu/libc.so.6
> #2  0x00007fbf958bebf6 in ?? () from /lib/x86_64-linux-gnu/libc.so.6
> #3  0x00007fbf958beca2 in __assert_fail () from /lib/x86_64-linux-gnu/libc.so.6
> #4  0x000055ca8decd39d in bdrv_inactivate_recurse (bs=0x55ca90c80400) at /build/qemu-5.0/block.c:5724
> #5  0x000055ca8dece967 in bdrv_inactivate_all () at /build//qemu-5.0/block.c:5792
> #6  0x000055ca8de5539d in qemu_savevm_state_complete_precopy_non_iterable (inactivate_disks=true, in_postcopy=false, f=0x55ca907044b0)
>     at /build/qemu-5.0/migration/savevm.c:1401
> #7  qemu_savevm_state_complete_precopy (f=0x55ca907044b0, iterable_only=iterable_only@entry=false, inactivate_disks=inactivate_disks@entry=true)
>     at /build/qemu-5.0/migration/savevm.c:1453
> #8  0x000055ca8de4f581 in migration_completion (s=0x55ca8f64d9f0) at /build/qemu-5.0/migration/migration.c:2941
> #9  migration_iteration_run (s=0x55ca8f64d9f0) at /build/qemu-5.0/migration/migration.c:3295
> #10 migration_thread (opaque=opaque@entry=0x55ca8f64d9f0) at /build/qemu-5.0/migration/migration.c:3459
> #11 0x000055ca8dfc6716 in qemu_thread_start (args=<optimized out>) at /build/qemu-5.0/util/qemu-thread-posix.c:519
> #12 0x00007fbf95c5f184 in start_thread () from /lib/x86_64-linux-gnu/libpthread.so.0
> #13 0x00007fbf9598cbed in clone () from /lib/x86_64-linux-gnu/libc.so.6
>
> When the first migration completes, bs->open_flags will have the
> BDRV_O_INACTIVE flag set by bdrv_inactivate_all(), and during the second
> migration bdrv_inactivate_recurse() asserts because bs->open_flags already
> has BDRV_O_INACTIVE set, which causes the crash.
>
> As Vladimir suggested, this patch just makes the migration job error out with a
> message in migrate_fd_connect() if the vm is in RUN_STATE_POSTMIGRATE state.
>
> Signed-off-by: Tuguoyi <tu.guoyi@h3c.com>
> ---
>  migration/migration.c | 7 +++++++
>  1 file changed, 7 insertions(+)
>
> diff --git a/migration/migration.c b/migration/migration.c
> index 87a9b59..4091678 100644
> --- a/migration/migration.c
> +++ b/migration/migration.c
> @@ -3622,6 +3622,13 @@ void migrate_fd_connect(MigrationState *s, Error *error_in)
>          return;
>      }
>
> +    if (runstate_check(RUN_STATE_POSTMIGRATE)) {
> +        error_report("Can't migrate the vm that is in POSTMIGRATE state");
> +        migrate_set_state(&s->state, s->state, MIGRATION_STATUS_FAILED);
> +        migrate_fd_cleanup(s);
> +        return;
> +    }
> +
>      if (resume) {
>          /* This is a resumed migration */
>          rate_limit = s->parameters.max_postcopy_bandwidth /
>

I think the correct place for the check is migrate_prepare(), as it is called
for any kind of migration, not only fd_. And in it we already have a check for
a wrong state:

    if (runstate_check(RUN_STATE_INMIGRATE)) {
        error_setg(errp, "Guest is waiting for an incoming migration");
        return false;
    }

and no additional state change or cleanup is needed.
On December 07, 2020 6:06 PM Vladimir Sementsov-Ogievskiy wrote:
> 07.12.2020 10:44, Tuguoyi wrote:
> > The following steps will cause qemu assertion failure:
> > - pause vm by executing 'virsh suspend'
> > - create external snapshot of memory and disk using 'virsh snapshot-create-as'
> > - doing the above operation again will cause qemu crash
> >
> > The backtrace looks like:
> > #0  0x00007fbf958c5c37 in raise () from /lib/x86_64-linux-gnu/libc.so.6
> > #1  0x00007fbf958c9028 in abort () from /lib/x86_64-linux-gnu/libc.so.6
> > #2  0x00007fbf958bebf6 in ?? () from /lib/x86_64-linux-gnu/libc.so.6
> > #3  0x00007fbf958beca2 in __assert_fail () from /lib/x86_64-linux-gnu/libc.so.6
> > #4  0x000055ca8decd39d in bdrv_inactivate_recurse (bs=0x55ca90c80400) at /build/qemu-5.0/block.c:5724
> > #5  0x000055ca8dece967 in bdrv_inactivate_all () at /build//qemu-5.0/block.c:5792
> > #6  0x000055ca8de5539d in qemu_savevm_state_complete_precopy_non_iterable (inactivate_disks=true, in_postcopy=false, f=0x55ca907044b0)
> >     at /build/qemu-5.0/migration/savevm.c:1401
> > #7  qemu_savevm_state_complete_precopy (f=0x55ca907044b0, iterable_only=iterable_only@entry=false, inactivate_disks=inactivate_disks@entry=true)
> >     at /build/qemu-5.0/migration/savevm.c:1453
> > #8  0x000055ca8de4f581 in migration_completion (s=0x55ca8f64d9f0) at /build/qemu-5.0/migration/migration.c:2941
> > #9  migration_iteration_run (s=0x55ca8f64d9f0) at /build/qemu-5.0/migration/migration.c:3295
> > #10 migration_thread (opaque=opaque@entry=0x55ca8f64d9f0) at /build/qemu-5.0/migration/migration.c:3459
> > #11 0x000055ca8dfc6716 in qemu_thread_start (args=<optimized out>) at /build/qemu-5.0/util/qemu-thread-posix.c:519
> > #12 0x00007fbf95c5f184 in start_thread () from /lib/x86_64-linux-gnu/libpthread.so.0
> > #13 0x00007fbf9598cbed in clone () from /lib/x86_64-linux-gnu/libc.so.6
> >
> > When the first migration completes, bs->open_flags will have the
> > BDRV_O_INACTIVE flag set by bdrv_inactivate_all(), and during the second
> > migration bdrv_inactivate_recurse() asserts because bs->open_flags already
> > has BDRV_O_INACTIVE set, which causes the crash.
> >
> > As Vladimir suggested, this patch just makes the migration job error out with a
> > message in migrate_fd_connect() if the vm is in RUN_STATE_POSTMIGRATE state.
> >
> > Signed-off-by: Tuguoyi <tu.guoyi@h3c.com>
> > ---
> >  migration/migration.c | 7 +++++++
> >  1 file changed, 7 insertions(+)
> >
> > diff --git a/migration/migration.c b/migration/migration.c
> > index 87a9b59..4091678 100644
> > --- a/migration/migration.c
> > +++ b/migration/migration.c
> > @@ -3622,6 +3622,13 @@ void migrate_fd_connect(MigrationState *s, Error *error_in)
> >          return;
> >      }
> >
> > +    if (runstate_check(RUN_STATE_POSTMIGRATE)) {
> > +        error_report("Can't migrate the vm that is in POSTMIGRATE state");
> > +        migrate_set_state(&s->state, s->state, MIGRATION_STATUS_FAILED);
> > +        migrate_fd_cleanup(s);
> > +        return;
> > +    }
> > +
> >      if (resume) {
> >          /* This is a resumed migration */
> >          rate_limit = s->parameters.max_postcopy_bandwidth /
> >
>
> I think the correct place for the check is migrate_prepare(), as it is called
> for any kind of migration, not only fd_. And in it we already have a check for
> a wrong state:
>
>     if (runstate_check(RUN_STATE_INMIGRATE)) {
>         error_setg(errp, "Guest is waiting for an incoming migration");
>         return false;
>     }
>
> and no additional state change or cleanup is needed.

Thank you for your advice, I'll send another patch.

> --
> Best regards,
> Vladimir
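The review suggestion above (reject the request in the common preparation step, before any migration state is touched) can be illustrated with a small standalone model. This is only a sketch and not QEMU code: `DemoRunState` and `demo_migrate_prepare()` are hypothetical stand-ins, and a plain `const char **` replaces QEMU's `Error **` mechanism.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for the run states discussed in the thread. */
typedef enum {
    DEMO_RUN_STATE_RUNNING,
    DEMO_RUN_STATE_PAUSED,
    DEMO_RUN_STATE_INMIGRATE,
    DEMO_RUN_STATE_POSTMIGRATE
} DemoRunState;

/*
 * Hypothetical early validation step. It follows the pattern quoted in the
 * review: report an error and return false, so the caller never starts a
 * migration and has no state to roll back or clean up.
 */
bool demo_migrate_prepare(DemoRunState state, const char **errp)
{
    assert(errp != NULL);

    if (state == DEMO_RUN_STATE_INMIGRATE) {
        *errp = "Guest is waiting for an incoming migration";
        return false;
    }
    if (state == DEMO_RUN_STATE_POSTMIGRATE) {
        /* Message text taken from the patch under review. */
        *errp = "Can't migrate the vm that is in POSTMIGRATE state";
        return false;
    }

    *errp = NULL;
    return true;
}
```

Because the rejection happens before anything is set up, the failure path needs no equivalent of the `migrate_set_state()` and `migrate_fd_cleanup()` calls that the v2 patch adds in `migrate_fd_connect()`.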
diff --git a/migration/migration.c b/migration/migration.c
index 87a9b59..4091678 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -3622,6 +3622,13 @@ void migrate_fd_connect(MigrationState *s, Error *error_in)
         return;
     }
 
+    if (runstate_check(RUN_STATE_POSTMIGRATE)) {
+        error_report("Can't migrate the vm that is in POSTMIGRATE state");
+        migrate_set_state(&s->state, s->state, MIGRATION_STATUS_FAILED);
+        migrate_fd_cleanup(s);
+        return;
+    }
+
     if (resume) {
         /* This is a resumed migration */
         rate_limit = s->parameters.max_postcopy_bandwidth /
The following steps will cause qemu assertion failure:
- pause vm by executing 'virsh suspend'
- create external snapshot of memory and disk using 'virsh snapshot-create-as'
- doing the above operation again will cause qemu crash

The backtrace looks like:
#0  0x00007fbf958c5c37 in raise () from /lib/x86_64-linux-gnu/libc.so.6
#1  0x00007fbf958c9028 in abort () from /lib/x86_64-linux-gnu/libc.so.6
#2  0x00007fbf958bebf6 in ?? () from /lib/x86_64-linux-gnu/libc.so.6
#3  0x00007fbf958beca2 in __assert_fail () from /lib/x86_64-linux-gnu/libc.so.6
#4  0x000055ca8decd39d in bdrv_inactivate_recurse (bs=0x55ca90c80400) at /build/qemu-5.0/block.c:5724
#5  0x000055ca8dece967 in bdrv_inactivate_all () at /build//qemu-5.0/block.c:5792
#6  0x000055ca8de5539d in qemu_savevm_state_complete_precopy_non_iterable (inactivate_disks=true, in_postcopy=false, f=0x55ca907044b0)
    at /build/qemu-5.0/migration/savevm.c:1401
#7  qemu_savevm_state_complete_precopy (f=0x55ca907044b0, iterable_only=iterable_only@entry=false, inactivate_disks=inactivate_disks@entry=true)
    at /build/qemu-5.0/migration/savevm.c:1453
#8  0x000055ca8de4f581 in migration_completion (s=0x55ca8f64d9f0) at /build/qemu-5.0/migration/migration.c:2941
#9  migration_iteration_run (s=0x55ca8f64d9f0) at /build/qemu-5.0/migration/migration.c:3295
#10 migration_thread (opaque=opaque@entry=0x55ca8f64d9f0) at /build/qemu-5.0/migration/migration.c:3459
#11 0x000055ca8dfc6716 in qemu_thread_start (args=<optimized out>) at /build/qemu-5.0/util/qemu-thread-posix.c:519
#12 0x00007fbf95c5f184 in start_thread () from /lib/x86_64-linux-gnu/libpthread.so.0
#13 0x00007fbf9598cbed in clone () from /lib/x86_64-linux-gnu/libc.so.6

When the first migration completes, bs->open_flags will have the
BDRV_O_INACTIVE flag set by bdrv_inactivate_all(), and during the second
migration bdrv_inactivate_recurse() asserts because bs->open_flags already
has BDRV_O_INACTIVE set, which causes the crash.

As Vladimir suggested, this patch just makes the migration job error out with a
message in migrate_fd_connect() if the vm is in RUN_STATE_POSTMIGRATE state.

Signed-off-by: Tuguoyi <tu.guoyi@h3c.com>
---
 migration/migration.c | 7 +++++++
 1 file changed, 7 insertions(+)
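
The failure mode described in the commit message (inactivating block state that is already inactive) can be reproduced in miniature. The following is a hedged, standalone model rather than QEMU's block layer: `DemoBlockState`, `DEMO_O_INACTIVE` and `demo_inactivate()` are hypothetical names, the flag value is arbitrary, and the function returns false where the backtrace shows an assertion failure, so that the second call can be observed instead of aborting.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Illustrative flag bit only; not the value QEMU uses for BDRV_O_INACTIVE. */
#define DEMO_O_INACTIVE 0x0800

/* Hypothetical stand-in for a block node that carries open flags. */
typedef struct {
    int open_flags;
} DemoBlockState;

/*
 * Mark the node inactive. A second call on the same node is rejected;
 * in the report above, the equivalent situation trips an assertion and
 * aborts the process instead of returning an error.
 */
bool demo_inactivate(DemoBlockState *bs)
{
    assert(bs != NULL);

    if (bs->open_flags & DEMO_O_INACTIVE) {
        return false; /* already inactive: the second "migration" lands here */
    }
    bs->open_flags |= DEMO_O_INACTIVE;
    return true;
}
```

In this model the first call succeeds and the second does not, which mirrors the two snapshot attempts in the reproduction steps: the first migration leaves the flag set, and nothing clears it before the second one starts.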