[v2,2/2] migration/postcopy: check PostcopyState before setting to POSTCOPY_INCOMING_RUNNING
diff mbox series

Message ID 20191010011316.31363-3-richardw.yang@linux.intel.com
State New
Headers show
Series
  • migration/postcopy: cleanup related to postcopy
Related show

Commit Message

Wei Yang Oct. 10, 2019, 1:13 a.m. UTC
Currently, we set PostcopyState blindly to RUNNING, even we found the
previous state is not LISTENING. This will lead to a corner case.

First let's look at the code flow:

qemu_loadvm_state_main()
    ret = loadvm_process_command()
        loadvm_postcopy_handle_run()
            return -1;
    if (ret < 0) {
        if (postcopy_state_get() == POSTCOPY_INCOMING_RUNNING)
            ...
    }

>From above snippet, the corner case is loadvm_postcopy_handle_run()
always sets state to RUNNING. And then it checks the previous state. If
the previous state is not LISTENING, it will return -1. But at this
moment, PostcopyState is already been set to RUNNING.

Then ret is checked in qemu_loadvm_state_main(), when it is -1
PostcopyState is checked. Current logic would pause postcopy and retry
if PostcopyState is RUNNING. This is not what we expect, because
postcopy is not active yet.

This patch makes sure state is set to RUNNING only previous state is
LISTENING by checking the state first.

Signed-off-by: Wei Yang <richardw.yang@linux.intel.com>
Suggested by: Peter Xu <peterx@redhat.com>
---
 migration/savevm.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

Comments

Dr. David Alan Gilbert Oct. 11, 2019, 10:12 a.m. UTC | #1
* Wei Yang (richardw.yang@linux.intel.com) wrote:
> Currently, we set PostcopyState blindly to RUNNING, even we found the
> previous state is not LISTENING. This will lead to a corner case.
> 
> First let's look at the code flow:
> 
> qemu_loadvm_state_main()
>     ret = loadvm_process_command()
>         loadvm_postcopy_handle_run()
>             return -1;
>     if (ret < 0) {
>         if (postcopy_state_get() == POSTCOPY_INCOMING_RUNNING)
>             ...
>     }
> 
> >From above snippet, the corner case is loadvm_postcopy_handle_run()
> always sets state to RUNNING. And then it checks the previous state. If
> the previous state is not LISTENING, it will return -1. But at this
> moment, PostcopyState is already been set to RUNNING.
> 
> Then ret is checked in qemu_loadvm_state_main(), when it is -1
> PostcopyState is checked. Current logic would pause postcopy and retry
> if PostcopyState is RUNNING. This is not what we expect, because
> postcopy is not active yet.
> 
> This patch makes sure state is set to RUNNING only previous state is
> LISTENING by checking the state first.
> 
> Signed-off-by: Wei Yang <richardw.yang@linux.intel.com>
> Suggested by: Peter Xu <peterx@redhat.com>

Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>

> ---
>  migration/savevm.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/migration/savevm.c b/migration/savevm.c
> index 78c2965ca4..b9f30a7090 100644
> --- a/migration/savevm.c
> +++ b/migration/savevm.c
> @@ -1934,7 +1934,7 @@ static void loadvm_postcopy_handle_run_bh(void *opaque)
>  /* After all discards we can start running and asking for pages */
>  static int loadvm_postcopy_handle_run(MigrationIncomingState *mis)
>  {
> -    PostcopyState ps = postcopy_state_set(POSTCOPY_INCOMING_RUNNING);
> +    PostcopyState ps = postcopy_state_get();
>  
>      trace_loadvm_postcopy_handle_run();
>      if (ps != POSTCOPY_INCOMING_LISTENING) {
> @@ -1942,6 +1942,7 @@ static int loadvm_postcopy_handle_run(MigrationIncomingState *mis)
>          return -1;
>      }
>  
> +    postcopy_state_set(POSTCOPY_INCOMING_RUNNING);
>      mis->bh = qemu_bh_new(loadvm_postcopy_handle_run_bh, mis);
>      qemu_bh_schedule(mis->bh);
>  
> -- 
> 2.17.1
> 
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK

Patch
diff mbox series

diff --git a/migration/savevm.c b/migration/savevm.c
index 78c2965ca4..b9f30a7090 100644
--- a/migration/savevm.c
+++ b/migration/savevm.c
@@ -1934,7 +1934,7 @@  static void loadvm_postcopy_handle_run_bh(void *opaque)
 /* After all discards we can start running and asking for pages */
 static int loadvm_postcopy_handle_run(MigrationIncomingState *mis)
 {
-    PostcopyState ps = postcopy_state_set(POSTCOPY_INCOMING_RUNNING);
+    PostcopyState ps = postcopy_state_get();
 
     trace_loadvm_postcopy_handle_run();
     if (ps != POSTCOPY_INCOMING_LISTENING) {
@@ -1942,6 +1942,7 @@  static int loadvm_postcopy_handle_run(MigrationIncomingState *mis)
         return -1;
     }
 
+    postcopy_state_set(POSTCOPY_INCOMING_RUNNING);
     mis->bh = qemu_bh_new(loadvm_postcopy_handle_run_bh, mis);
     qemu_bh_schedule(mis->bh);