diff mbox

[[RFC,WIP] v2] Keeping the Source side alive incase of network failure (Migration recovery from network failure)

Message ID 1465409584-16308-1-git-send-email-haris.phnx@gmail.com
State New
Headers show

Commit Message

Md Haris Iqbal June 8, 2016, 6:13 p.m. UTC
---
 include/migration/migration.h |  1 +
 migration/migration.c         | 76 ++++++++++++++++++++++++++++++++++++++++---
 qapi-schema.json              | 11 +++++--
 vl.c                          |  4 +++
 4 files changed, 85 insertions(+), 7 deletions(-)

Comments

Eric Blake June 8, 2016, 8:41 p.m. UTC | #1
On 06/08/2016 12:13 PM, Md Haris Iqbal wrote:

The subject line is long, and has a typo (s/incase/in case/).  Also, the
mailing list automatically prepends [Qemu-devel], so you shouldn't
repeat it manually.  Better might have been a short subject line then a
longer commit body:

migration: keep source alive on network failure

Details about what was failing, and why this code improves it

Missing a Signed-off-by: attribution; without that, we can't take it.

> ---

You marked this patch as v2, but in the same minute sent another email
with subject line v1, and didn't say what changed to need a v2. Here
after the --- divider is a good place for that.

>  include/migration/migration.h |  1 +
>  migration/migration.c         | 76 ++++++++++++++++++++++++++++++++++++++++---
>  qapi-schema.json              | 11 +++++--
>  vl.c                          |  4 +++
>  4 files changed, 85 insertions(+), 7 deletions(-)
> 

> @@ -1726,11 +1755,32 @@ static void *migration_thread(void *opaque)
>              }
>          }
>  
> -        if (qemu_file_get_error(s->to_dst_file)) {
> -            migrate_set_state(&s->state, current_active_state,
> -                              MIGRATION_STATUS_FAILED);
> -            trace_migration_thread_file_err();
> -            break;
> +        if ((ret = qemu_file_get_error(s->to_dst_file))) {
> +            fprintf(stderr, "1 : Error %s %d\n", strerror(-ret), -ret);

fprintf() is rather awkward for errors; can we use qemu's Error mechanism?

> +
> +            /*  This check is based on how the error is set during the network
> +             *  recv(). When recv() returns 0 (i.e. no data to read), the error
> +             *  is set to -EIO. For all other network errors, it is set
> +             *  according to the return value received.
> +             */
> +            if (ret != -EIO && s->state == MIGRATION_STATUS_POSTCOPY_ACTIVE) {
> +                /* Network Failure during postcopy */
> +
> +                current_active_state = MIGRATION_STATUS_POSTCOPY_RECOVERY;
> +                runstate_set(RUN_STATE_POSTMIGRATE_RECOVERY);
> +                fprintf(stderr, "1.1 : Error %s %d\n", strerror(-ret), -ret);

Does the end user really need to see "1.1 :"


> +++ b/qapi-schema.json
> @@ -154,12 +154,14 @@
>  # @watchdog: the watchdog action is configured to pause and has been triggered
>  #
>  # @guest-panicked: guest has been panicked as a result of guest OS panic
> +#
> +# @postmigrate-recovery: guest is paused for recovery after a network failure

Not your fault that the overall enum is missing an overall line:

# Since: 1.4

nor that guest-panicked is missing a "(since 1.5)" hint, but at least
your addition should have a "(since 2.7)" hint.

>  ##
>  { 'enum': 'RunState',
>    'data': [ 'debug', 'inmigrate', 'internal-error', 'io-error', 'paused',
>              'postmigrate', 'prelaunch', 'finish-migrate', 'restore-vm',
>              'running', 'save-vm', 'shutdown', 'suspended', 'watchdog',
> -            'guest-panicked' ] }
> +            'guest-panicked', 'postmigrate-recovery' ] }

Adding new enums can cause existing clients like libvirt to do weird
things if they aren't expecting the new state. Are we sure we want to do
it?  Is it a state that cannot be entered by default, but only in
response to a client request that proves the client is new enough to
expect the new state?

>  
>  ##
>  # @StatusInfo:
> @@ -434,12 +436,15 @@
>  #
>  # @failed: some error occurred during migration process.
>  #
> +# @postcopy-recovery: in recovery mode, after a network failure.
> +#

Missing a "(since 2.7)" hint.

>  # Since: 2.3
>  #
>  ##
>  { 'enum': 'MigrationStatus',
>    'data': [ 'none', 'setup', 'cancelling', 'cancelled',
> -            'active', 'postcopy-active', 'completed', 'failed' ] }
> +            'active', 'postcopy-active', 'completed', 'failed',
> +            'postcopy-recovery' ] }
>  
>  ##
>  # @MigrationInfo
> @@ -2058,6 +2063,8 @@
>  #
>  # @uri: the Uniform Resource Identifier of the destination VM
>  #
> +# @recover: #optional recover from a broken migration
> +#

I don't see any 'recover' parameter added to the 'migrate' command to
match this added documentation.

>  # @blk: #optional do block migration (full disk copy)
>  #
>  # @inc: #optional incremental disk copy migration
> diff --git a/vl.c b/vl.c
> index 5fd22cb..c237140 100644
> --- a/vl.c
> +++ b/vl.c
> @@ -618,6 +618,10 @@ static const RunStateTransition runstate_transitions_def[] = {
>      { RUN_STATE_FINISH_MIGRATE, RUN_STATE_RUNNING },
>      { RUN_STATE_FINISH_MIGRATE, RUN_STATE_POSTMIGRATE },
>      { RUN_STATE_FINISH_MIGRATE, RUN_STATE_PRELAUNCH },
> +    { RUN_STATE_FINISH_MIGRATE, RUN_STATE_POSTMIGRATE_RECOVERY },
> +
> +    { RUN_STATE_POSTMIGRATE_RECOVERY, RUN_STATE_FINISH_MIGRATE },
> +    { RUN_STATE_POSTMIGRATE_RECOVERY, RUN_STATE_SHUTDOWN },
>  
>      { RUN_STATE_RESTORE_VM, RUN_STATE_RUNNING },
>      { RUN_STATE_RESTORE_VM, RUN_STATE_PRELAUNCH },
>
Md Haris Iqbal June 13, 2016, 6:38 a.m. UTC | #2
On Thu, Jun 9, 2016 at 2:11 AM, Eric Blake <eblake@redhat.com> wrote:
> On 06/08/2016 12:13 PM, Md Haris Iqbal wrote:
>
> The subject line is long, and has a typo (s/incase/in case/).  Also, the
> mailing list automatically prepends [Qemu-devel], so you shouldn't
> repeat it manually.  Better might have been a short subject line then a
> longer commit body:
>
> migration: keep source alive on network failure
>
> Details about what was failing, and why this code improves it

Yes, I will add details and will take care while writing commit messages.

>
> Missing a Signed-off-by: attribution; without that, we can't take it.

I will add it from next time.

>
>> ---
>
> You marked this patch as v2, but in the same minute sent another email
> with subject line v1, and didn't say what changed to need a v2. Here
> after the --- divider is a good place for that.

The other patch is different then the one I posted with v1. There was
another patch which I posted few days back, which wa v1 of this patch.
I should have pointed out what has changed though.

>
>>  include/migration/migration.h |  1 +
>>  migration/migration.c         | 76 ++++++++++++++++++++++++++++++++++++++++---
>>  qapi-schema.json              | 11 +++++--
>>  vl.c                          |  4 +++
>>  4 files changed, 85 insertions(+), 7 deletions(-)
>>
>
>> @@ -1726,11 +1755,32 @@ static void *migration_thread(void *opaque)
>>              }
>>          }
>>
>> -        if (qemu_file_get_error(s->to_dst_file)) {
>> -            migrate_set_state(&s->state, current_active_state,
>> -                              MIGRATION_STATUS_FAILED);
>> -            trace_migration_thread_file_err();
>> -            break;
>> +        if ((ret = qemu_file_get_error(s->to_dst_file))) {
>> +            fprintf(stderr, "1 : Error %s %d\n", strerror(-ret), -ret);
>
> fprintf() is rather awkward for errors; can we use qemu's Error mechanism?

Just something I am using to debug. It will be absent in the final
version of the patch.

>
>> +
>> +            /*  This check is based on how the error is set during the network
>> +             *  recv(). When recv() returns 0 (i.e. no data to read), the error
>> +             *  is set to -EIO. For all other network errors, it is set
>> +             *  according to the return value received.
>> +             */
>> +            if (ret != -EIO && s->state == MIGRATION_STATUS_POSTCOPY_ACTIVE) {
>> +                /* Network Failure during postcopy */
>> +
>> +                current_active_state = MIGRATION_STATUS_POSTCOPY_RECOVERY;
>> +                runstate_set(RUN_STATE_POSTMIGRATE_RECOVERY);
>> +                fprintf(stderr, "1.1 : Error %s %d\n", strerror(-ret), -ret);
>
> Does the end user really need to see "1.1 :"

Just a debugging output.

>
>
>> +++ b/qapi-schema.json
>> @@ -154,12 +154,14 @@
>>  # @watchdog: the watchdog action is configured to pause and has been triggered
>>  #
>>  # @guest-panicked: guest has been panicked as a result of guest OS panic
>> +#
>> +# @postmigrate-recovery: guest is paused for recovery after a network failure
>
> Not your fault that the overall enum is missing an overall line:
>
> # Since: 1.4
>
> nor that guest-panicked is missing a "(since 1.5)" hint, but at least
> your addition should have a "(since 2.7)" hint.

Added.

>
>>  ##
>>  { 'enum': 'RunState',
>>    'data': [ 'debug', 'inmigrate', 'internal-error', 'io-error', 'paused',
>>              'postmigrate', 'prelaunch', 'finish-migrate', 'restore-vm',
>>              'running', 'save-vm', 'shutdown', 'suspended', 'watchdog',
>> -            'guest-panicked' ] }
>> +            'guest-panicked', 'postmigrate-recovery' ] }
>
> Adding new enums can cause existing clients like libvirt to do weird
> things if they aren't expecting the new state. Are we sure we want to do
> it?
I think so. If we do not have a new state, then one would not know
that the VM is in recovery.

> Is it a state that cannot be entered by default, but only in
> response to a client request that proves the client is new enough to
> expect the new state?

I did not quite understand what you are trying to say.

>
>>
>>  ##
>>  # @StatusInfo:
>> @@ -434,12 +436,15 @@
>>  #
>>  # @failed: some error occurred during migration process.
>>  #
>> +# @postcopy-recovery: in recovery mode, after a network failure.
>> +#
>
> Missing a "(since 2.7)" hint.

Added.

>
>>  # Since: 2.3
>>  #
>>  ##
>>  { 'enum': 'MigrationStatus',
>>    'data': [ 'none', 'setup', 'cancelling', 'cancelled',
>> -            'active', 'postcopy-active', 'completed', 'failed' ] }
>> +            'active', 'postcopy-active', 'completed', 'failed',
>> +            'postcopy-recovery' ] }
>>
>>  ##
>>  # @MigrationInfo
>> @@ -2058,6 +2063,8 @@
>>  #
>>  # @uri: the Uniform Resource Identifier of the destination VM
>>  #
>> +# @recover: #optional recover from a broken migration
>> +#
>
> I don't see any 'recover' parameter added to the 'migrate' command to
> match this added documentation.

That was a mistake. This was suppose to go to the other patch I posted
this same minute. Just missed splitting it out.

>
>>  # @blk: #optional do block migration (full disk copy)
>>  #
>>  # @inc: #optional incremental disk copy migration
>> diff --git a/vl.c b/vl.c
>> index 5fd22cb..c237140 100644
>> --- a/vl.c
>> +++ b/vl.c
>> @@ -618,6 +618,10 @@ static const RunStateTransition runstate_transitions_def[] = {
>>      { RUN_STATE_FINISH_MIGRATE, RUN_STATE_RUNNING },
>>      { RUN_STATE_FINISH_MIGRATE, RUN_STATE_POSTMIGRATE },
>>      { RUN_STATE_FINISH_MIGRATE, RUN_STATE_PRELAUNCH },
>> +    { RUN_STATE_FINISH_MIGRATE, RUN_STATE_POSTMIGRATE_RECOVERY },
>> +
>> +    { RUN_STATE_POSTMIGRATE_RECOVERY, RUN_STATE_FINISH_MIGRATE },
>> +    { RUN_STATE_POSTMIGRATE_RECOVERY, RUN_STATE_SHUTDOWN },
>>
>>      { RUN_STATE_RESTORE_VM, RUN_STATE_RUNNING },
>>      { RUN_STATE_RESTORE_VM, RUN_STATE_PRELAUNCH },
>>
>
> --
> Eric Blake   eblake redhat com    +1-919-301-3266
> Libvirt virtualization library http://libvirt.org
>
Eric Blake June 13, 2016, 3:38 p.m. UTC | #3
On 06/13/2016 12:38 AM, haris iqbal wrote:

>>>  ##
>>>  { 'enum': 'RunState',
>>>    'data': [ 'debug', 'inmigrate', 'internal-error', 'io-error', 'paused',
>>>              'postmigrate', 'prelaunch', 'finish-migrate', 'restore-vm',
>>>              'running', 'save-vm', 'shutdown', 'suspended', 'watchdog',
>>> -            'guest-panicked' ] }
>>> +            'guest-panicked', 'postmigrate-recovery' ] }
>>
>> Adding new enums can cause existing clients like libvirt to do weird
>> things if they aren't expecting the new state. Are we sure we want to do
>> it?
> I think so. If we do not have a new state, then one would not know
> that the VM is in recovery.
> 
>> Is it a state that cannot be entered by default, but only in
>> response to a client request that proves the client is new enough to
>> expect the new state?
> 
> I did not quite understand what you are trying to say.
> 

A client that is not expecting the new 'postmigrate-recovery' state may
mishandle a VM that is in that state.  So I'm suggesting that we may
want to special case this state, and make it possible to enter the state
only if the client has done something first to inform qemu that it
understands what it means for a VM to be in that state.
Dr. David Alan Gilbert June 15, 2016, 1:02 p.m. UTC | #4
* Md Haris Iqbal (haris.phnx@gmail.com) wrote:
> ---
>  include/migration/migration.h |  1 +
>  migration/migration.c         | 76 ++++++++++++++++++++++++++++++++++++++++---
>  qapi-schema.json              | 11 +++++--
>  vl.c                          |  4 +++
>  4 files changed, 85 insertions(+), 7 deletions(-)

You need to pull this pair of patches together as a set of patches rather
than two individual patches; it just gets too confusing (like with the
version numbers).
Pull them together in git, and then we can also get the patches split
up into smaller parts and you can keep upto date with current qemu.

> diff --git a/include/migration/migration.h b/include/migration/migration.h
> index 4a3201b..59e26e6 100644
> --- a/include/migration/migration.h
> +++ b/include/migration/migration.h
> @@ -326,6 +326,7 @@ void global_state_store_running(void);
>  void flush_page_queue(MigrationState *ms);
>  int ram_save_queue_pages(MigrationState *ms, const char *rbname,
>                           ram_addr_t start, ram_addr_t len);
> +int qemu_migrate_postcopy_outgoing_recovery(MigrationState *ms);
>  
>  PostcopyState postcopy_state_get(void);
>  /* Set the state and return the old state */
> diff --git a/migration/migration.c b/migration/migration.c
> index a77f62e..41c28e1 100644
> --- a/migration/migration.c
> +++ b/migration/migration.c
> @@ -676,6 +676,33 @@ MigrationInfo *qmp_query_migrate(Error **errp)
>      case MIGRATION_STATUS_CANCELLED:
>          info->has_status = true;
>          break;
> +    case MIGRATION_STATUS_POSTCOPY_RECOVERY:
> +        info->has_status = true;
> +        info->has_total_time = true;
> +        info->total_time = qemu_clock_get_ms(QEMU_CLOCK_REALTIME);
> +
> +        info->has_ram = true;
> +        info->ram = g_malloc0(sizeof(*info->ram));
> +        info->ram->transferred = ram_bytes_transferred();
> +        info->ram->remaining = ram_bytes_remaining();
> +        info->ram->total = ram_bytes_total();
> +        info->ram->duplicate = dup_mig_pages_transferred();
> +        info->ram->skipped = skipped_mig_pages_transferred();
> +        info->ram->normal = norm_mig_pages_transferred();
> +        info->ram->normal_bytes = norm_mig_bytes_transferred();
> +        info->ram->dirty_pages_rate = s->dirty_pages_rate;
> +        info->ram->mbps = s->mbps;
> +        info->ram->dirty_sync_count = s->dirty_sync_count;

If you update to the current qemu git you'll see there's a
populate_ram_info function that avoids a lot of this duplication.

> +        if (blk_mig_active()) {
> +            info->has_disk = true;
> +            info->disk = g_malloc0(sizeof(*info->disk));
> +            info->disk->transferred = blk_mig_bytes_transferred();
> +            info->disk->remaining = blk_mig_bytes_remaining();
> +            info->disk->total = blk_mig_bytes_total();
> +        }
> +
> +        get_xbzrle_cache_stats(info);
>      }
>      info->status = s->state;
>  
> @@ -1660,6 +1687,8 @@ static void *migration_thread(void *opaque)
>      /* The active state we expect to be in; ACTIVE or POSTCOPY_ACTIVE */
>      enum MigrationStatus current_active_state = MIGRATION_STATUS_ACTIVE;
>  
> +    int ret;
> +
>      rcu_register_thread();
>  
>      qemu_savevm_state_header(s->to_dst_file);
> @@ -1726,11 +1755,32 @@ static void *migration_thread(void *opaque)
>              }
>          }
>  
> -        if (qemu_file_get_error(s->to_dst_file)) {
> -            migrate_set_state(&s->state, current_active_state,
> -                              MIGRATION_STATUS_FAILED);
> -            trace_migration_thread_file_err();
> -            break;
> +        if ((ret = qemu_file_get_error(s->to_dst_file))) {
> +            fprintf(stderr, "1 : Error %s %d\n", strerror(-ret), -ret);
> +
> +            /*  This check is based on how the error is set during the network
> +             *  recv(). When recv() returns 0 (i.e. no data to read), the error
> +             *  is set to -EIO. For all other network errors, it is set
> +             *  according to the return value received.
> +             */
> +            if (ret != -EIO && s->state == MIGRATION_STATUS_POSTCOPY_ACTIVE) {

shouldn't that be   ret == -EIO ?

> +                /* Network Failure during postcopy */
> +
> +                current_active_state = MIGRATION_STATUS_POSTCOPY_RECOVERY;
> +                runstate_set(RUN_STATE_POSTMIGRATE_RECOVERY);
> +                fprintf(stderr, "1.1 : Error %s %d\n", strerror(-ret), -ret);

All the fprintf's need to come out for the final version; use traces instaed.

> +                ret = qemu_migrate_postcopy_outgoing_recovery(s);
> +                if(ret < 0) {
> +                    break;
> +                }
> +
> +            } else {
> +                migrate_set_state(&s->state, current_active_state,
> +                                             MIGRATION_STATUS_FAILED);
> +                fprintf(stderr, "1.2 : Error %s %d\n", strerror(-ret), -ret);
> +                trace_migration_thread_file_err();
> +                break;
> +            }
>          }
>          current_time = qemu_clock_get_ms(QEMU_CLOCK_REALTIME);
>          if (current_time >= initial_time + BUFFER_DELAY) {
> @@ -1831,6 +1881,22 @@ void migrate_fd_connect(MigrationState *s)
>      s->migration_thread_running = true;
>  }
>  
> +int qemu_migrate_postcopy_outgoing_recovery(MigrationState* ms)
> +{
> +    migrate_set_state(&ms->state, MIGRATION_STATUS_POSTCOPY_ACTIVE,
> +                                  MIGRATION_STATUS_POSTCOPY_RECOVERY);
> +
> +    ms->in_recovery = true;
> +    /* Code for network recovery to be added here */
> +    while(atomic_mb_read(&ms->in_recovery) == true) {
> +        fprintf(stderr, "Not letting it fail %p\n", ms->to_dst_file);
> +        sleep(5);
> +    }
> +
> +    return -1;
> +
> +}
> +
>  PostcopyState  postcopy_state_get(void)
>  {
>      return atomic_mb_read(&incoming_postcopy_state);
> diff --git a/qapi-schema.json b/qapi-schema.json
> index 1b7b1e1..613f8d2 100644
> --- a/qapi-schema.json
> +++ b/qapi-schema.json
> @@ -154,12 +154,14 @@
>  # @watchdog: the watchdog action is configured to pause and has been triggered
>  #
>  # @guest-panicked: guest has been panicked as a result of guest OS panic
> +#
> +# @postmigrate-recovery: guest is paused for recovery after a network failure
>  ##
>  { 'enum': 'RunState',
>    'data': [ 'debug', 'inmigrate', 'internal-error', 'io-error', 'paused',
>              'postmigrate', 'prelaunch', 'finish-migrate', 'restore-vm',
>              'running', 'save-vm', 'shutdown', 'suspended', 'watchdog',
> -            'guest-panicked' ] }
> +            'guest-panicked', 'postmigrate-recovery' ] }

Would it be possible to have that as postcopy-recovery rather than postmigrate?

>  ##
>  # @StatusInfo:
> @@ -434,12 +436,15 @@
>  #
>  # @failed: some error occurred during migration process.
>  #
> +# @postcopy-recovery: in recovery mode, after a network failure.
> +#
>  # Since: 2.3
>  #
>  ##
>  { 'enum': 'MigrationStatus',
>    'data': [ 'none', 'setup', 'cancelling', 'cancelled',
> -            'active', 'postcopy-active', 'completed', 'failed' ] }
> +            'active', 'postcopy-active', 'completed', 'failed',
> +            'postcopy-recovery' ] }
>  
>  ##
>  # @MigrationInfo
> @@ -2058,6 +2063,8 @@
>  #
>  # @uri: the Uniform Resource Identifier of the destination VM
>  #
> +# @recover: #optional recover from a broken migration
> +#

That looks like it belongs in a different patch.

>  # @blk: #optional do block migration (full disk copy)
>  #
>  # @inc: #optional incremental disk copy migration
> diff --git a/vl.c b/vl.c
> index 5fd22cb..c237140 100644
> --- a/vl.c
> +++ b/vl.c
> @@ -618,6 +618,10 @@ static const RunStateTransition runstate_transitions_def[] = {
>      { RUN_STATE_FINISH_MIGRATE, RUN_STATE_RUNNING },
>      { RUN_STATE_FINISH_MIGRATE, RUN_STATE_POSTMIGRATE },
>      { RUN_STATE_FINISH_MIGRATE, RUN_STATE_PRELAUNCH },
> +    { RUN_STATE_FINISH_MIGRATE, RUN_STATE_POSTMIGRATE_RECOVERY },
> +
> +    { RUN_STATE_POSTMIGRATE_RECOVERY, RUN_STATE_FINISH_MIGRATE },
> +    { RUN_STATE_POSTMIGRATE_RECOVERY, RUN_STATE_SHUTDOWN },
>  
>      { RUN_STATE_RESTORE_VM, RUN_STATE_RUNNING },
>      { RUN_STATE_RESTORE_VM, RUN_STATE_PRELAUNCH },
> -- 
> 2.7.4
> 

Dave

--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK
Dr. David Alan Gilbert June 15, 2016, 1:03 p.m. UTC | #5
* Eric Blake (eblake@redhat.com) wrote:
> On 06/13/2016 12:38 AM, haris iqbal wrote:
> 
> >>>  ##
> >>>  { 'enum': 'RunState',
> >>>    'data': [ 'debug', 'inmigrate', 'internal-error', 'io-error', 'paused',
> >>>              'postmigrate', 'prelaunch', 'finish-migrate', 'restore-vm',
> >>>              'running', 'save-vm', 'shutdown', 'suspended', 'watchdog',
> >>> -            'guest-panicked' ] }
> >>> +            'guest-panicked', 'postmigrate-recovery' ] }
> >>
> >> Adding new enums can cause existing clients like libvirt to do weird
> >> things if they aren't expecting the new state. Are we sure we want to do
> >> it?
> > I think so. If we do not have a new state, then one would not know
> > that the VM is in recovery.
> > 
> >> Is it a state that cannot be entered by default, but only in
> >> response to a client request that proves the client is new enough to
> >> expect the new state?
> > 
> > I did not quite understand what you are trying to say.
> > 
> 
> A client that is not expecting the new 'postmigrate-recovery' state may
> mishandle a VM that is in that state.  So I'm suggesting that we may
> want to special case this state, and make it possible to enter the state
> only if the client has done something first to inform qemu that it
> understands what it means for a VM to be in that state.

Do you mean another migration capability?

Dave

> 
> -- 
> Eric Blake   eblake redhat com    +1-919-301-3266
> Libvirt virtualization library http://libvirt.org
> 


--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK
Eric Blake June 15, 2016, 1:10 p.m. UTC | #6
On 06/15/2016 07:03 AM, Dr. David Alan Gilbert wrote:
> * Eric Blake (eblake@redhat.com) wrote:
>> On 06/13/2016 12:38 AM, haris iqbal wrote:
>>
>>>>>  ##
>>>>>  { 'enum': 'RunState',
>>>>>    'data': [ 'debug', 'inmigrate', 'internal-error', 'io-error', 'paused',
>>>>>              'postmigrate', 'prelaunch', 'finish-migrate', 'restore-vm',
>>>>>              'running', 'save-vm', 'shutdown', 'suspended', 'watchdog',
>>>>> -            'guest-panicked' ] }
>>>>> +            'guest-panicked', 'postmigrate-recovery' ] }
>>>>
>>>> Adding new enums can cause existing clients like libvirt to do weird
>>>> things if they aren't expecting the new state. Are we sure we want to do
>>>> it?
>>> I think so. If we do not have a new state, then one would not know
>>> that the VM is in recovery.
>>>
>>>> Is it a state that cannot be entered by default, but only in
>>>> response to a client request that proves the client is new enough to
>>>> expect the new state?
>>>
>>> I did not quite understand what you are trying to say.
>>>
>>
>> A client that is not expecting the new 'postmigrate-recovery' state may
>> mishandle a VM that is in that state.  So I'm suggesting that we may
>> want to special case this state, and make it possible to enter the state
>> only if the client has done something first to inform qemu that it
>> understands what it means for a VM to be in that state.
> 
> Do you mean another migration capability?
> 

Sure, that would be an introspectible way - if left off, the state
cannot be reached, but if the client turns the capability on, then the
state is useful.
diff mbox

Patch

diff --git a/include/migration/migration.h b/include/migration/migration.h
index 4a3201b..59e26e6 100644
--- a/include/migration/migration.h
+++ b/include/migration/migration.h
@@ -326,6 +326,7 @@  void global_state_store_running(void);
 void flush_page_queue(MigrationState *ms);
 int ram_save_queue_pages(MigrationState *ms, const char *rbname,
                          ram_addr_t start, ram_addr_t len);
+int qemu_migrate_postcopy_outgoing_recovery(MigrationState *ms);
 
 PostcopyState postcopy_state_get(void);
 /* Set the state and return the old state */
diff --git a/migration/migration.c b/migration/migration.c
index a77f62e..41c28e1 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -676,6 +676,33 @@  MigrationInfo *qmp_query_migrate(Error **errp)
     case MIGRATION_STATUS_CANCELLED:
         info->has_status = true;
         break;
+    case MIGRATION_STATUS_POSTCOPY_RECOVERY:
+        info->has_status = true;
+        info->has_total_time = true;
+        info->total_time = qemu_clock_get_ms(QEMU_CLOCK_REALTIME);
+
+        info->has_ram = true;
+        info->ram = g_malloc0(sizeof(*info->ram));
+        info->ram->transferred = ram_bytes_transferred();
+        info->ram->remaining = ram_bytes_remaining();
+        info->ram->total = ram_bytes_total();
+        info->ram->duplicate = dup_mig_pages_transferred();
+        info->ram->skipped = skipped_mig_pages_transferred();
+        info->ram->normal = norm_mig_pages_transferred();
+        info->ram->normal_bytes = norm_mig_bytes_transferred();
+        info->ram->dirty_pages_rate = s->dirty_pages_rate;
+        info->ram->mbps = s->mbps;
+        info->ram->dirty_sync_count = s->dirty_sync_count;
+
+        if (blk_mig_active()) {
+            info->has_disk = true;
+            info->disk = g_malloc0(sizeof(*info->disk));
+            info->disk->transferred = blk_mig_bytes_transferred();
+            info->disk->remaining = blk_mig_bytes_remaining();
+            info->disk->total = blk_mig_bytes_total();
+        }
+
+        get_xbzrle_cache_stats(info);
     }
     info->status = s->state;
 
@@ -1660,6 +1687,8 @@  static void *migration_thread(void *opaque)
     /* The active state we expect to be in; ACTIVE or POSTCOPY_ACTIVE */
     enum MigrationStatus current_active_state = MIGRATION_STATUS_ACTIVE;
 
+    int ret;
+
     rcu_register_thread();
 
     qemu_savevm_state_header(s->to_dst_file);
@@ -1726,11 +1755,32 @@  static void *migration_thread(void *opaque)
             }
         }
 
-        if (qemu_file_get_error(s->to_dst_file)) {
-            migrate_set_state(&s->state, current_active_state,
-                              MIGRATION_STATUS_FAILED);
-            trace_migration_thread_file_err();
-            break;
+        if ((ret = qemu_file_get_error(s->to_dst_file))) {
+            fprintf(stderr, "1 : Error %s %d\n", strerror(-ret), -ret);
+
+            /*  This check is based on how the error is set during the network
+             *  recv(). When recv() returns 0 (i.e. no data to read), the error
+             *  is set to -EIO. For all other network errors, it is set
+             *  according to the return value received.
+             */
+            if (ret != -EIO && s->state == MIGRATION_STATUS_POSTCOPY_ACTIVE) {
+                /* Network Failure during postcopy */
+
+                current_active_state = MIGRATION_STATUS_POSTCOPY_RECOVERY;
+                runstate_set(RUN_STATE_POSTMIGRATE_RECOVERY);
+                fprintf(stderr, "1.1 : Error %s %d\n", strerror(-ret), -ret);
+                ret = qemu_migrate_postcopy_outgoing_recovery(s);
+                if(ret < 0) {
+                    break;
+                }
+
+            } else {
+                migrate_set_state(&s->state, current_active_state,
+                                             MIGRATION_STATUS_FAILED);
+                fprintf(stderr, "1.2 : Error %s %d\n", strerror(-ret), -ret);
+                trace_migration_thread_file_err();
+                break;
+            }
         }
         current_time = qemu_clock_get_ms(QEMU_CLOCK_REALTIME);
         if (current_time >= initial_time + BUFFER_DELAY) {
@@ -1831,6 +1881,22 @@  void migrate_fd_connect(MigrationState *s)
     s->migration_thread_running = true;
 }
 
+int qemu_migrate_postcopy_outgoing_recovery(MigrationState* ms)
+{
+    migrate_set_state(&ms->state, MIGRATION_STATUS_POSTCOPY_ACTIVE,
+                                  MIGRATION_STATUS_POSTCOPY_RECOVERY);
+
+    ms->in_recovery = true;
+    /* Code for network recovery to be added here */
+    while(atomic_mb_read(&ms->in_recovery) == true) {
+        fprintf(stderr, "Not letting it fail %p\n", ms->to_dst_file);
+        sleep(5);
+    }
+
+    return -1;
+
+}
+
 PostcopyState  postcopy_state_get(void)
 {
     return atomic_mb_read(&incoming_postcopy_state);
diff --git a/qapi-schema.json b/qapi-schema.json
index 1b7b1e1..613f8d2 100644
--- a/qapi-schema.json
+++ b/qapi-schema.json
@@ -154,12 +154,14 @@ 
 # @watchdog: the watchdog action is configured to pause and has been triggered
 #
 # @guest-panicked: guest has been panicked as a result of guest OS panic
+#
+# @postmigrate-recovery: guest is paused for recovery after a network failure
 ##
 { 'enum': 'RunState',
   'data': [ 'debug', 'inmigrate', 'internal-error', 'io-error', 'paused',
             'postmigrate', 'prelaunch', 'finish-migrate', 'restore-vm',
             'running', 'save-vm', 'shutdown', 'suspended', 'watchdog',
-            'guest-panicked' ] }
+            'guest-panicked', 'postmigrate-recovery' ] }
 
 ##
 # @StatusInfo:
@@ -434,12 +436,15 @@ 
 #
 # @failed: some error occurred during migration process.
 #
+# @postcopy-recovery: in recovery mode, after a network failure.
+#
 # Since: 2.3
 #
 ##
 { 'enum': 'MigrationStatus',
   'data': [ 'none', 'setup', 'cancelling', 'cancelled',
-            'active', 'postcopy-active', 'completed', 'failed' ] }
+            'active', 'postcopy-active', 'completed', 'failed',
+            'postcopy-recovery' ] }
 
 ##
 # @MigrationInfo
@@ -2058,6 +2063,8 @@ 
 #
 # @uri: the Uniform Resource Identifier of the destination VM
 #
+# @recover: #optional recover from a broken migration
+#
 # @blk: #optional do block migration (full disk copy)
 #
 # @inc: #optional incremental disk copy migration
diff --git a/vl.c b/vl.c
index 5fd22cb..c237140 100644
--- a/vl.c
+++ b/vl.c
@@ -618,6 +618,10 @@  static const RunStateTransition runstate_transitions_def[] = {
     { RUN_STATE_FINISH_MIGRATE, RUN_STATE_RUNNING },
     { RUN_STATE_FINISH_MIGRATE, RUN_STATE_POSTMIGRATE },
     { RUN_STATE_FINISH_MIGRATE, RUN_STATE_PRELAUNCH },
+    { RUN_STATE_FINISH_MIGRATE, RUN_STATE_POSTMIGRATE_RECOVERY },
+
+    { RUN_STATE_POSTMIGRATE_RECOVERY, RUN_STATE_FINISH_MIGRATE },
+    { RUN_STATE_POSTMIGRATE_RECOVERY, RUN_STATE_SHUTDOWN },
 
     { RUN_STATE_RESTORE_VM, RUN_STATE_RUNNING },
     { RUN_STATE_RESTORE_VM, RUN_STATE_PRELAUNCH },