diff mbox series

[V3,6/7] Migration/colo.c: Add the necessary checks for colo_do_failover

Message ID 20190303145021.2962-7-chen.zhang@intel.com
State New
Series Migration/colo: Fix upstream bugs when occur failover

Commit Message

Zhang, Chen March 3, 2019, 2:50 p.m. UTC
From: Zhang Chen <chen.zhang@intel.com>

Signed-off-by: Zhang Chen <chen.zhang@intel.com>
---
 migration/colo.c | 10 ++++++++--
 1 file changed, 8 insertions(+), 2 deletions(-)

Comments

Dr. David Alan Gilbert March 8, 2019, 5:41 p.m. UTC | #1
* Zhang Chen (chen.zhang@intel.com) wrote:
> From: Zhang Chen <chen.zhang@intel.com>
> 
> Signed-off-by: Zhang Chen <chen.zhang@intel.com>

OK, we should make that properly return an error.
(Actually we should make the failover command be one of the new OOB
commands; so that it can work even if the main loop is blocked- that
would make COLO robust properly against network failures)


Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>

> ---
>  migration/colo.c | 10 ++++++++--
>  1 file changed, 8 insertions(+), 2 deletions(-)
> 
> diff --git a/migration/colo.c b/migration/colo.c
> index dbe2b88807..d1ae2e6d11 100644
> --- a/migration/colo.c
> +++ b/migration/colo.c
> @@ -197,10 +197,16 @@ void colo_do_failover(MigrationState *s)
>          vm_stop_force_state(RUN_STATE_COLO);
>      }
>  
> -    if (get_colo_mode() == COLO_MODE_PRIMARY) {
> +    switch (get_colo_mode()) {
> +    case COLO_MODE_PRIMARY:
>          primary_vm_do_failover();
> -    } else {
> +        break;
> +    case COLO_MODE_SECONDARY:
>          secondary_vm_do_failover();
> +        break;
> +    default:
> +        error_report("colo_do_failover failed because the colo mode"
> +                     " could not be obtained");
>      }
>  }
>  
> -- 
> 2.17.GIT
> 
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK
Zhang, Chen March 8, 2019, 6:36 p.m. UTC | #2
-----Original Message-----
From: Dr. David Alan Gilbert [mailto:dgilbert@redhat.com] 
Sent: Saturday, March 9, 2019 1:41 AM
To: Zhang, Chen <chen.zhang@intel.com>
Cc: Li Zhijian <lizhijian@cn.fujitsu.com>; Zhang Chen <zhangckid@gmail.com>; Juan Quintela <quintela@redhat.com>; zhanghailiang <zhang.zhanghailiang@huawei.com>; Markus Armbruster <armbru@redhat.com>; Eric Blake <eblake@redhat.com>; qemu-dev <qemu-devel@nongnu.org>
Subject: Re: [PATCH V3 6/7] Migration/colo.c: Add the necessary checks for colo_do_failover

> * Zhang Chen (chen.zhang@intel.com) wrote:
> > From: Zhang Chen <chen.zhang@intel.com>
> > 
> > Signed-off-by: Zhang Chen <chen.zhang@intel.com>
> 
> OK, we should make that properly return an error.
> (Actually we should make the failover command be one of the new OOB
> commands; so that it can work even if the main loop is blocked- that
> would make COLO robust properly against network failures)

Good idea~ which OOB command can be the demo in current Qemu?
Maybe I can do this job in the future.

Thanks
Zhang Chen

> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>

Dr. David Alan Gilbert March 8, 2019, 7:02 p.m. UTC | #3
* Zhang, Chen (chen.zhang@intel.com) wrote:
> 
> 
> -----Original Message-----
> From: Dr. David Alan Gilbert [mailto:dgilbert@redhat.com] 
> Sent: Saturday, March 9, 2019 1:41 AM
> To: Zhang, Chen <chen.zhang@intel.com>
> Cc: Li Zhijian <lizhijian@cn.fujitsu.com>; Zhang Chen <zhangckid@gmail.com>; Juan Quintela <quintela@redhat.com>; zhanghailiang <zhang.zhanghailiang@huawei.com>; Markus Armbruster <armbru@redhat.com>; Eric Blake <eblake@redhat.com>; qemu-dev <qemu-devel@nongnu.org>
> Subject: Re: [PATCH V3 6/7] Migration/colo.c: Add the necessary checks for colo_do_failover
> 
> * Zhang Chen (chen.zhang@intel.com) wrote:
> > From: Zhang Chen <chen.zhang@intel.com>
> > 
> > Signed-off-by: Zhang Chen <chen.zhang@intel.com>
> 
> > OK, we should make that properly return an error.
> > (Actually we should make the failover command be one of the new OOB
> > commands; so that it can work even if the main loop is blocked- that
> > would make COLO robust properly against network failures)
> 
> Good idea~ which OOB command can be the demo in current Qemu?
> Maybe I can do this job in the future.

The 'migration-pause' command I think is the only real one so far.
Remember the important thing is the code that runs as part of it must
not take any lock that could be blocked by the main loop or any
part of the replication that could block based on the other host
having died.

Dave

--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK
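
Editorial note on comment #3: the constraint described there (code run
by an OOB command must not take a lock that the main loop or a stalled
replication path could be holding) can be illustrated with a lock-free
request flag. In QEMU's QMP schema the command mentioned there is
spelled `migrate-pause`. The sketch below is only an illustration using
C11 atomics; FailoverState, failover_state, failover_request() and
failover_claim() are hypothetical names, not QEMU's actual failover
code.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical failover states for this sketch (not QEMU's enum). */
typedef enum {
    FAILOVER_NONE,
    FAILOVER_REQUESTED,
    FAILOVER_ACTIVE,
} FailoverState;

static _Atomic int failover_state = FAILOVER_NONE;

/*
 * Would be called from the (hypothetical) OOB command handler.
 * A single compare-and-swap: it takes no mutex, so it cannot be
 * blocked by a stuck main loop. Returns false if a request is
 * already pending or being handled.
 */
static bool failover_request(void)
{
    int expected = FAILOVER_NONE;
    return atomic_compare_exchange_strong(&failover_state, &expected,
                                          FAILOVER_REQUESTED);
}

/*
 * Would be called from the thread that actually performs the failover.
 * Claims a pending request exactly once; returns false if there is
 * nothing to claim.
 */
static bool failover_claim(void)
{
    int expected = FAILOVER_REQUESTED;
    return atomic_compare_exchange_strong(&failover_state, &expected,
                                          FAILOVER_ACTIVE);
}
```

The handler only records the request; the potentially blocking work
stays in the thread that claims it, which is one way to keep the OOB
path free of locks.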

Patch

diff --git a/migration/colo.c b/migration/colo.c
index dbe2b88807..d1ae2e6d11 100644
--- a/migration/colo.c
+++ b/migration/colo.c
@@ -197,10 +197,16 @@  void colo_do_failover(MigrationState *s)
         vm_stop_force_state(RUN_STATE_COLO);
     }
 
-    if (get_colo_mode() == COLO_MODE_PRIMARY) {
+    switch (get_colo_mode()) {
+    case COLO_MODE_PRIMARY:
         primary_vm_do_failover();
-    } else {
+        break;
+    case COLO_MODE_SECONDARY:
         secondary_vm_do_failover();
+        break;
+    default:
+        error_report("colo_do_failover failed because the colo mode"
+                     " could not be obtained");
     }
 }
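
Editorial note: the review comment "we should make that properly return
an error" could be sketched roughly as follows. This is not part of the
posted patch. The int return value, the name colo_do_failover_checked(),
the ColoMode enum, the mode parameter (the patch calls get_colo_mode()
instead) and the stub failover functions are all assumptions made so the
example compiles on its own; fprintf() stands in for QEMU's
error_report().

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>

/* Hypothetical stand-in for QEMU's COLO mode values. */
typedef enum {
    COLO_MODE_NONE,
    COLO_MODE_PRIMARY,
    COLO_MODE_SECONDARY,
} ColoMode;

/* Stubs: counters record which path ran; the real functions do the work. */
static int primary_calls;
static int secondary_calls;

static void primary_vm_do_failover(void)   { primary_calls++; }
static void secondary_vm_do_failover(void) { secondary_calls++; }

/*
 * Same dispatch as the patch, but the caller is told when nothing
 * happened: 0 on success, -EINVAL if the mode is neither primary
 * nor secondary.
 */
static int colo_do_failover_checked(ColoMode mode)
{
    switch (mode) {
    case COLO_MODE_PRIMARY:
        primary_vm_do_failover();
        return 0;
    case COLO_MODE_SECONDARY:
        secondary_vm_do_failover();
        return 0;
    default:
        fprintf(stderr, "colo_do_failover failed because the colo mode"
                        " could not be obtained\n");
        return -EINVAL;
    }
}
```

A caller such as a QMP command handler could then turn the negative
return value into an error reply instead of relying on the log message
alone.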