Message ID: 1404495717-4239-17-git-send-email-dgilbert@redhat.com
State: New
On 07/04/2014 11:41 AM, Dr. David Alan Gilbert (git) wrote:
> From: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
>
> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
> ---
>  include/migration/migration.h | 1 +
>  migration.c                   | 9 +++++++++
>  qapi-schema.json              | 6 +++++-
>  3 files changed, 15 insertions(+), 1 deletion(-)
>
> +++ b/qapi-schema.json
> @@ -491,10 +491,14 @@
>  # @auto-converge: If enabled, QEMU will automatically throttle down the guest
>  #                 to speed up convergence of RAM migration. (since 1.6)
>  #
> +# @x-postcopy-ram: Start executing on the migration target before all of RAM has been
> +#                  migrated, pulling the remaining pages along as needed. NOTE: If the
> +#                  migration fails during postcopy the VM will fail. (since 2.2)

How does this work with libvirt's current insistence that it manually
resumes the guest on the destination in order to give feedback to the
source on whether it was successful?  I'm not sure if enabling this bool
is the right thing to do, or if we just need more visibility (such as
events rather than the current state of polling) for libvirt to know
that it is time to resume the destination and start the post-copy phase.
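Eric's polling-vs-events concern can be made concrete. The sketch below (illustrative Python, not libvirt's actual code) shows what "the current state of polling" means for management software: it repeatedly issues the QMP "query-migrate" command against the source and only sends "cont" to the destination once the reported status is "completed". The `qmp_cmd` and `should_resume_destination` helper names are assumptions for this example; "query-migrate" and "cont" are real QMP commands.

```python
import json

def qmp_cmd(name, **args):
    """Build a QMP command dict, ready for json.dumps() onto the monitor socket."""
    cmd = {"execute": name}
    if args:
        cmd["arguments"] = args
    return cmd

def should_resume_destination(migrate_info):
    """Polling model: resume ("cont") the destination only after the source's
    "query-migrate" reply reports the migration as completed."""
    return migrate_info.get("status") == "completed"

# The two commands a polling management loop would serialize:
poll_json = json.dumps(qmp_cmd("query-migrate"))
resume_json = json.dumps(qmp_cmd("cont"))
```

With postcopy, "completed" on the source no longer marks the point at which the destination should start running, which is why the thread turns to event-based notification instead.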
* Eric Blake (eblake@redhat.com) wrote:
> On 07/04/2014 11:41 AM, Dr. David Alan Gilbert (git) wrote:
> > From: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
> >
> > Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
> > ---
> >  include/migration/migration.h | 1 +
> >  migration.c                   | 9 +++++++++
> >  qapi-schema.json              | 6 +++++-
> >  3 files changed, 15 insertions(+), 1 deletion(-)
> >
> > +++ b/qapi-schema.json
> > @@ -491,10 +491,14 @@
> >  # @auto-converge: If enabled, QEMU will automatically throttle down the guest
> >  #                 to speed up convergence of RAM migration. (since 1.6)
> >  #
> > +# @x-postcopy-ram: Start executing on the migration target before all of RAM has been
> > +#                  migrated, pulling the remaining pages along as needed. NOTE: If the
> > +#                  migration fails during postcopy the VM will fail. (since 2.2)
>
> How does this work with libvirt's current insistence that it manually
> resumes the guest on the destination in order to give feedback to the
> source on whether it was successful?  I'm not sure if enabling this bool
> is the right thing to do, or if we just need more visibility (such as
> events rather than the current state of polling) for libvirt to know
> that it is time to resume the destination and start the post-copy phase.

That's an interesting overlap with Paolo's question (and it has approximately
the same answer).

I think what I need to do for that is:
  1) As for precopy, add the option not to start the destination CPU on entry
     to postcopy; I think that's OK, because we can carry on in postcopy mode
     even if the destination CPU isn't running, we just won't generate page
     requests.
  2) Finally fix up the old request libvirt has for events based on migration
     state.

Admittedly I don't quite understand how (1) is supposed to interact with
device state.

Dave

> --
> Eric Blake   eblake redhat com   +1-919-301-3266
> Libvirt virtualization library http://libvirt.org

--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK
On 07/07/2014 22:23, Dr. David Alan Gilbert wrote:
> I think what I need to do for that is:
>   1) As for precopy, add the option not to start the destination CPU on entry
>      to postcopy; I think that's OK, because we can carry on in postcopy mode
>      even if the destination CPU isn't running, we just won't generate page
>      requests.
>
> Admittedly I don't quite understand how (1) is supposed to interact with
> device state.

This is just passing "-S" on the destination side.  Device state is treated
the same as without "-S" and can still generate page requests.  The only
difference is whether you have a vm_start() or not.

I think it should be possible to restart the VM on the source side after
postcopy migration, as long as migration has failed or has been canceled.
Whether that makes sense or causes dire disk corruption depends on the
particular scenario, but then the same holds for precopy, and we don't try
at all to prevent "cont" at the end of migration.  It makes it much easier
for libvirt to restart the source if it cannot continue on the destination.

Paolo
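Paolo's point about "-S" can be summarized as a tiny decision sketch (illustrative Python modelling the behaviour he describes, not QEMU's actual code): the incoming side always loads device state, and "-S" only suppresses the automatic vm_start(), leaving management to trigger it later with "cont". The function name and action strings are assumptions for this example.

```python
def incoming_migration_finished(autostart):
    """Model of what happens when incoming migration completes.

    autostart=True  -> QEMU started without "-S": CPUs run immediately.
    autostart=False -> QEMU started with "-S": device state is still loaded
                       (and in postcopy it can still generate page requests),
                       but vm_start() waits for a later "cont" command.
    """
    actions = ["load_device_state"]   # happens with or without "-S"
    if autostart:
        actions.append("vm_start")    # the only thing "-S" removes
    return actions
```

The design point is that "-S" is orthogonal to postcopy: postcopy mode continues even while the destination CPUs are stopped, it just generates no page-fault-driven requests.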
* Paolo Bonzini (pbonzini@redhat.com) wrote:
> On 07/07/2014 22:23, Dr. David Alan Gilbert wrote:
> > I think what I need to do for that is:
> >   1) As for precopy, add the option not to start the destination CPU on
> >      entry to postcopy; I think that's OK, because we can carry on in
> >      postcopy mode even if the destination CPU isn't running, we just
> >      won't generate page requests.
> >
> > Admittedly I don't quite understand how (1) is supposed to interact with
> > device state.
>
> This is just passing "-S" on the destination side.  Device state is treated
> the same as without "-S" and can still generate page requests.  The only
> difference is whether you have a vm_start() or not.

Good, that sounds easy enough.

> I think it should be possible to restart the VM on the source side after
> postcopy migration, as long as migration has failed or has been canceled.
> Whether that makes sense or causes dire disk corruption depends on the
> particular scenario, but then the same holds for precopy, and we don't try
> at all to prevent "cont" at the end of migration.  It makes it much easier
> for libvirt to restart the source if it cannot continue on the destination.

Interesting; Andrea fell into accidentally starting his source and was
somewhat surprised.  I was just going to add the RUN_STATE_MEMORY_STALE
that Lei Li added in the exec-migration patchset.

Dave

> Paolo

--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK
diff --git a/include/migration/migration.h b/include/migration/migration.h
index a1ed7a3..35ad1f6 100644
--- a/include/migration/migration.h
+++ b/include/migration/migration.h
@@ -171,6 +171,7 @@ void migrate_add_blocker(Error *reason);
  */
 void migrate_del_blocker(Error *reason);
 
+bool migrate_postcopy_ram(void);
 bool migrate_rdma_pin_all(void);
 bool migrate_zero_blocks(void);
 
diff --git a/migration.c b/migration.c
index e69a49e..67cdfd6 100644
--- a/migration.c
+++ b/migration.c
@@ -612,6 +612,15 @@ bool migrate_rdma_pin_all(void)
     return s->enabled_capabilities[MIGRATION_CAPABILITY_RDMA_PIN_ALL];
 }
 
+bool migrate_postcopy_ram(void)
+{
+    MigrationState *s;
+
+    s = migrate_get_current();
+
+    return s->enabled_capabilities[MIGRATION_CAPABILITY_X_POSTCOPY_RAM];
+}
+
 bool migrate_auto_converge(void)
 {
     MigrationState *s;
diff --git a/qapi-schema.json b/qapi-schema.json
index b11aad2..eac3739 100644
--- a/qapi-schema.json
+++ b/qapi-schema.json
@@ -491,10 +491,14 @@
 # @auto-converge: If enabled, QEMU will automatically throttle down the guest
 #                 to speed up convergence of RAM migration. (since 1.6)
 #
+# @x-postcopy-ram: Start executing on the migration target before all of RAM has been
+#                  migrated, pulling the remaining pages along as needed. NOTE: If the
+#                  migration fails during postcopy the VM will fail. (since 2.2)
+#
 # Since: 1.2
 ##
 { 'enum': 'MigrationCapability',
-  'data': ['xbzrle', 'rdma-pin-all', 'auto-converge', 'zero-blocks'] }
+  'data': ['xbzrle', 'rdma-pin-all', 'auto-converge', 'zero-blocks', 'x-postcopy-ram'] }
 
 ##
 # @MigrationCapabilityStatus
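For completeness, here is how the capability added by this patch would be enabled from management software (a sketch in illustrative Python, not libvirt code): build the real QMP "migrate-set-capabilities" command carrying the "x-postcopy-ram" name from the qapi-schema.json hunk above. The `set_capability_cmd` helper is an assumption for this example.

```python
import json

def set_capability_cmd(name, enabled=True):
    """Build the QMP "migrate-set-capabilities" command enabling one capability."""
    return {
        "execute": "migrate-set-capabilities",
        "arguments": {
            "capabilities": [{"capability": name, "state": enabled}],
        },
    }

# Enable the experimental postcopy capability before issuing "migrate":
cmd = set_capability_cmd("x-postcopy-ram")
wire = json.dumps(cmd)
```

The "x-" prefix marks the capability as experimental, signalling that the name may change before the interface is considered stable.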