
[4/5] migration: fix qemu crash during RDMA live migration

Message ID 1523089594-1422-5-git-send-email-lidongchen@tencent.com
State New
Series Enable postcopy RDMA live migration

Commit Message

858585 jemmy April 7, 2018, 8:26 a.m. UTC
After postcopy starts, the destination QEMU runs in a dedicated
thread, so only invoke yield_until_fd_readable before postcopy
migration.

Signed-off-by: Lidong Chen <lidongchen@tencent.com>
---
 migration/rdma.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)
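
The condition this patch introduces can be modelled outside QEMU as a small pure function. The sketch below is illustrative only: the enum values and `choose_wait_mode` are hypothetical stand-ins, not QEMU's definitions, and it assumes the destination's status leaves the active value once postcopy begins, which is what the patch relies on.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the migration status values (not QEMU's enum). */
typedef enum {
    SKETCH_STATUS_NONE,
    SKETCH_STATUS_ACTIVE,          /* assumed: precopy phase on the destination */
    SKETCH_STATUS_POSTCOPY_ACTIVE, /* assumed: postcopy phase */
    SKETCH_STATUS_FAILED
} SketchStatus;

typedef enum { WAIT_YIELD, WAIT_POLL } WaitMode;

/*
 * Mirror of the patched condition: yield only when the incoming migration
 * has started on the destination AND the status is still "active".
 * Every other combination falls through to the polling branch.
 */
WaitMode choose_wait_mode(bool started_on_destination, SketchStatus state)
{
    if (started_on_destination && state == SKETCH_STATUS_ACTIVE) {
        return WAIT_YIELD;
    }
    return WAIT_POLL;
}

/* Usage: the cases that matter for this patch. */
void check_wait_modes(void)
{
    /* Destination during precopy: inside the coroutine, so yield. */
    assert(choose_wait_mode(true, SKETCH_STATUS_ACTIVE) == WAIT_YIELD);
    /* Destination during postcopy: dedicated thread, so poll. */
    assert(choose_wait_mode(true, SKETCH_STATUS_POSTCOPY_ACTIVE) == WAIT_POLL);
    /* Source side, or destination before the coroutine starts: poll. */
    assert(choose_wait_mode(false, SKETCH_STATUS_ACTIVE) == WAIT_POLL);
    /* Any non-active status, such as a failure, also takes the poll path. */
    assert(choose_wait_mode(true, SKETCH_STATUS_FAILED) == WAIT_POLL);
}
```

Under this model, a failure status on the destination selects the polling branch rather than the yield.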

Comments

Dr. David Alan Gilbert April 11, 2018, 4:43 p.m. UTC | #1
* Lidong Chen (jemmy858585@gmail.com) wrote:
> After postcopy starts, the destination QEMU runs in a dedicated
> thread, so only invoke yield_until_fd_readable before postcopy
> migration.

The subject line needs to be more descriptive:
   migration: Stop rdma yielding during incoming postcopy

I think.
(Also please check the subject spellings)

> Signed-off-by: Lidong Chen <lidongchen@tencent.com>
> ---
>  migration/rdma.c | 4 +++-
>  1 file changed, 3 insertions(+), 1 deletion(-)
> 
> diff --git a/migration/rdma.c b/migration/rdma.c
> index 53773c7..81be482 100644
> --- a/migration/rdma.c
> +++ b/migration/rdma.c
> @@ -1489,11 +1489,13 @@ static int qemu_rdma_wait_comp_channel(RDMAContext *rdma)
>       * Coroutine doesn't start until migration_fd_process_incoming()
>       * so don't yield unless we know we're running inside of a coroutine.
>       */
> -    if (rdma->migration_started_on_destination) {
> +    if (rdma->migration_started_on_destination &&
> +        migration_incoming_get_current()->state == MIGRATION_STATUS_ACTIVE) {

OK, that's a bit delicate; watch if it ever gets called in a failure
case or similar - and also watch out if we make more use of the status
on the destination, but otherwise, and with a fix for the subject;


Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>

>          yield_until_fd_readable(rdma->comp_channel->fd);
>      } else {
>          /* This is the source side, we're in a separate thread
>           * or destination prior to migration_fd_process_incoming()
> +         * After postcopy, the destination is also in a separate thread.
>           * we can't yield; so we have to poll the fd.
>           * But we need to be able to handle 'cancel' or an error
>           * without hanging forever.
> -- 
> 1.8.3.1
> 
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK
858585 jemmy April 12, 2018, 9:40 a.m. UTC | #2
On Thu, Apr 12, 2018 at 12:43 AM, Dr. David Alan Gilbert
<dgilbert@redhat.com> wrote:
> * Lidong Chen (jemmy858585@gmail.com) wrote:
>> After postcopy starts, the destination QEMU runs in a dedicated
>> thread, so only invoke yield_until_fd_readable before postcopy
>> migration.
>
> The subject line needs to be more descriptive:
>    migration: Stop rdma yielding during incoming postcopy
>
> I think.
> (Also please check the subject spellings)
>
>> Signed-off-by: Lidong Chen <lidongchen@tencent.com>
>> ---
>>  migration/rdma.c | 4 +++-
>>  1 file changed, 3 insertions(+), 1 deletion(-)
>>
>> diff --git a/migration/rdma.c b/migration/rdma.c
>> index 53773c7..81be482 100644
>> --- a/migration/rdma.c
>> +++ b/migration/rdma.c
>> @@ -1489,11 +1489,13 @@ static int qemu_rdma_wait_comp_channel(RDMAContext *rdma)
>>       * Coroutine doesn't start until migration_fd_process_incoming()
>>       * so don't yield unless we know we're running inside of a coroutine.
>>       */
>> -    if (rdma->migration_started_on_destination) {
>> +    if (rdma->migration_started_on_destination &&
>> +        migration_incoming_get_current()->state == MIGRATION_STATUS_ACTIVE) {
>
> OK, that's a bit delicate; watch if it ever gets called in a failure
> case or similar - and also watch out if we make more use of the status
> on the destination, but otherwise, and with a fix for the subject;

How about using migration_incoming_get_current()->have_listen_thread?

    if (rdma->migration_started_on_destination &&
        migration_incoming_get_current()->have_listen_thread == false) {
        yield_until_fd_readable(rdma->comp_channel->fd);
    }

>
>
> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
>
>>          yield_until_fd_readable(rdma->comp_channel->fd);
>>      } else {
>>          /* This is the source side, we're in a separate thread
>>           * or destination prior to migration_fd_process_incoming()
>> +         * After postcopy, the destination is also in a separate thread.
>>           * we can't yield; so we have to poll the fd.
>>           * But we need to be able to handle 'cancel' or an error
>>           * without hanging forever.
>> --
>> 1.8.3.1
>>
> --
> Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK
Dr. David Alan Gilbert April 12, 2018, 6:58 p.m. UTC | #3
* 858585 jemmy (jemmy858585@gmail.com) wrote:
> On Thu, Apr 12, 2018 at 12:43 AM, Dr. David Alan Gilbert
> <dgilbert@redhat.com> wrote:
> > * Lidong Chen (jemmy858585@gmail.com) wrote:
> >> After postcopy starts, the destination QEMU runs in a dedicated
> >> thread, so only invoke yield_until_fd_readable before postcopy
> >> migration.
> >
> > The subject line needs to be more descriptive:
> >    migration: Stop rdma yielding during incoming postcopy
> >
> > I think.
> > (Also please check the subject spellings)
> >
> >> Signed-off-by: Lidong Chen <lidongchen@tencent.com>
> >> ---
> >>  migration/rdma.c | 4 +++-
> >>  1 file changed, 3 insertions(+), 1 deletion(-)
> >>
> >> diff --git a/migration/rdma.c b/migration/rdma.c
> >> index 53773c7..81be482 100644
> >> --- a/migration/rdma.c
> >> +++ b/migration/rdma.c
> >> @@ -1489,11 +1489,13 @@ static int qemu_rdma_wait_comp_channel(RDMAContext *rdma)
> >>       * Coroutine doesn't start until migration_fd_process_incoming()
> >>       * so don't yield unless we know we're running inside of a coroutine.
> >>       */
> >> -    if (rdma->migration_started_on_destination) {
> >> +    if (rdma->migration_started_on_destination &&
> >> +        migration_incoming_get_current()->state == MIGRATION_STATUS_ACTIVE) {
> >
> > OK, that's a bit delicate; watch if it ever gets called in a failure
> > case or similar - and also watch out if we make more use of the status
> > on the destination, but otherwise, and with a fix for the subject;
> 
> How about using migration_incoming_get_current()->have_listen_thread?

That's supposed to be pretty internal to the postcopy code, so I prefer
the status check.

Dave

>     if (rdma->migration_started_on_destination &&
>         migration_incoming_get_current()->have_listen_thread == false) {
>         yield_until_fd_readable(rdma->comp_channel->fd);
>     }
> 
> >
> >
> > Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
> >
> >>          yield_until_fd_readable(rdma->comp_channel->fd);
> >>      } else {
> >>          /* This is the source side, we're in a separate thread
> >>           * or destination prior to migration_fd_process_incoming()
> >> +         * After postcopy, the destination is also in a separate thread.
> >>           * we can't yield; so we have to poll the fd.
> >>           * But we need to be able to handle 'cancel' or an error
> >>           * without hanging forever.
> >> --
> >> 1.8.3.1
> >>
> > --
> > Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK

Patch

diff --git a/migration/rdma.c b/migration/rdma.c
index 53773c7..81be482 100644
--- a/migration/rdma.c
+++ b/migration/rdma.c
@@ -1489,11 +1489,13 @@  static int qemu_rdma_wait_comp_channel(RDMAContext *rdma)
      * Coroutine doesn't start until migration_fd_process_incoming()
      * so don't yield unless we know we're running inside of a coroutine.
      */
-    if (rdma->migration_started_on_destination) {
+    if (rdma->migration_started_on_destination &&
+        migration_incoming_get_current()->state == MIGRATION_STATUS_ACTIVE) {
         yield_until_fd_readable(rdma->comp_channel->fd);
     } else {
         /* This is the source side, we're in a separate thread
          * or destination prior to migration_fd_process_incoming()
+         * After postcopy, the destination is also in a separate thread.
          * we can't yield; so we have to poll the fd.
          * But we need to be able to handle 'cancel' or an error
          * without hanging forever.
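
The polling branch described in the comment above has to wait on the fd without blocking indefinitely. A minimal sketch of that idea follows, using POSIX poll() with a short timeout and a cancel flag that is re-checked on every wake-up. The function `wait_fd_readable`, its return codes and the flag are hypothetical names for illustration; this is not the code in migration/rdma.c.

```c
#include <assert.h>
#include <errno.h>
#include <poll.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical return codes for this sketch. */
enum { WAIT_READY = 0, WAIT_CANCELLED = -1, WAIT_ERROR = -2 };

/*
 * Block until fd is readable, but wake at least every slice_ms
 * milliseconds to re-check *cancelled, so that a cancel request
 * cannot leave the caller waiting forever.
 */
int wait_fd_readable(int fd, atomic_bool *cancelled, int slice_ms)
{
    assert(fd >= 0 && slice_ms > 0);

    for (;;) {
        struct pollfd pfd = { .fd = fd, .events = POLLIN, .revents = 0 };
        int rc;

        /* Check the flag before every wait, including the first one. */
        if (atomic_load(cancelled)) {
            return WAIT_CANCELLED;
        }

        rc = poll(&pfd, 1, slice_ms);
        if (rc > 0) {
            /* Data available, or an error/hang-up condition on the fd. */
            return (pfd.revents & POLLIN) ? WAIT_READY : WAIT_ERROR;
        }
        if (rc < 0 && errno != EINTR) {
            return WAIT_ERROR;
        }
        /* Timeout or EINTR: loop and re-check the cancel flag. */
    }
}
```

A shorter `slice_ms` reacts to cancellation faster at the cost of more wake-ups; the value is a tuning choice in this sketch, not something the patch specifies.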