diff mbox

[3/5] migration/rdma: Allow cancelling while waiting for wrid

Message ID 20170704184915.31586-4-dgilbert@redhat.com
State New
Headers show

Commit Message

Dr. David Alan Gilbert July 4, 2017, 6:49 p.m. UTC
From: "Dr. David Alan Gilbert" <dgilbert@redhat.com>

When waiting for a WRID, if the other side dies we end up waiting
for ever with no way to cancel the migration.
Cure this by poll()ing the fd first with a timeout and checking
error flags and migration state.

Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
---
 migration/rdma.c | 54 ++++++++++++++++++++++++++++++++++++++++++++++++------
 1 file changed, 48 insertions(+), 6 deletions(-)

Comments

Peter Xu July 12, 2017, 9:32 a.m. UTC | #1
On Tue, Jul 04, 2017 at 07:49:13PM +0100, Dr. David Alan Gilbert (git) wrote:
> From: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
> 
> When waiting for a WRID, if the other side dies we end up waiting
> for ever with no way to cancel the migration.
> Cure this by poll()ing the fd first with a timeout and checking
> error flags and migration state.
> 
> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
> ---
>  migration/rdma.c | 54 ++++++++++++++++++++++++++++++++++++++++++++++++------
>  1 file changed, 48 insertions(+), 6 deletions(-)
> 
> diff --git a/migration/rdma.c b/migration/rdma.c
> index 6111e10c70..7273ae9929 100644
> --- a/migration/rdma.c
> +++ b/migration/rdma.c
> @@ -1466,6 +1466,52 @@ static uint64_t qemu_rdma_poll(RDMAContext *rdma, uint64_t *wr_id_out,
>      return  0;
>  }
>  
> +/* Wait for activity on the completion channel.
> + * Returns 0 on success, none-0 on error.
> + */
> +static int qemu_rdma_wait_comp_channel(RDMAContext *rdma)
> +{
> +    /*
> +     * Coroutine doesn't start until migration_fd_process_incoming()
> +     * so don't yield unless we know we're running inside of a coroutine.
> +     */
> +    if (rdma->migration_started_on_destination) {
> +        yield_until_fd_readable(rdma->comp_channel->fd);
> +    } else {
> +        /* This is the source side, we're in a separate thread
> +         * or destination prior to migration_fd_process_incoming()
> +         * we can't yield; so we have to poll the fd.
> +         * But we need to be able to handle 'cancel' or an error
> +         * without hanging forever.
> +         */
> +        while (!rdma->error_state && !rdma->error_reported &&
> +               !rdma->received_error) {

Should we just check rdma->error_state? Since iiuc error_reported only
means whether error is reported (set to 1 if reported), and
received_error means whether we got explicit error from peer (I think
only via a RDMA_CONTROL_ERROR message), and it's 1 if received. IIUC
either error_reported or received_error will mean that error_state be
set already.

> +            GPollFD pfds[1];
> +            pfds[0].fd = rdma->comp_channel->fd;
> +            pfds[0].events = G_IO_IN | G_IO_HUP | G_IO_ERR;
> +            /* 0.5s timeout, should be fine for a 'cancel' */
> +            switch (qemu_poll_ns(pfds, 1, 500 * 1000 * 1000)) {

(I would be more aggresive, say, 100ms, or even less, for better UI
 response with merely nothing lost. But 500ms is okay as well to me)

> +            case 1: /* fd active */
> +                return 0;
> +
> +            case 0: /* Timeout, go around again */
> +                break;
> +
> +            default: /* Error of some type */
> +                return -1;
> +            }
> +
> +            if (migrate_get_current()->state == MIGRATION_STATUS_CANCELLING) {
> +                /* Bail out and let the cancellation happen */
> +                return -EPIPE;
> +            }
> +        }
> +    }
> +
> +    return rdma->error_state || rdma->error_reported ||
> +           rdma->received_error;

Similar question here. Maybe just return rdma->error_state?

Thanks,
Dr. David Alan Gilbert July 12, 2017, 12:36 p.m. UTC | #2
* Peter Xu (peterx@redhat.com) wrote:
> On Tue, Jul 04, 2017 at 07:49:13PM +0100, Dr. David Alan Gilbert (git) wrote:
> > From: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
> > 
> > When waiting for a WRID, if the other side dies we end up waiting
> > for ever with no way to cancel the migration.
> > Cure this by poll()ing the fd first with a timeout and checking
> > error flags and migration state.
> > 
> > Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
> > ---
> >  migration/rdma.c | 54 ++++++++++++++++++++++++++++++++++++++++++++++++------
> >  1 file changed, 48 insertions(+), 6 deletions(-)
> > 
> > diff --git a/migration/rdma.c b/migration/rdma.c
> > index 6111e10c70..7273ae9929 100644
> > --- a/migration/rdma.c
> > +++ b/migration/rdma.c
> > @@ -1466,6 +1466,52 @@ static uint64_t qemu_rdma_poll(RDMAContext *rdma, uint64_t *wr_id_out,
> >      return  0;
> >  }
> >  
> > +/* Wait for activity on the completion channel.
> > + * Returns 0 on success, none-0 on error.
> > + */
> > +static int qemu_rdma_wait_comp_channel(RDMAContext *rdma)
> > +{
> > +    /*
> > +     * Coroutine doesn't start until migration_fd_process_incoming()
> > +     * so don't yield unless we know we're running inside of a coroutine.
> > +     */
> > +    if (rdma->migration_started_on_destination) {
> > +        yield_until_fd_readable(rdma->comp_channel->fd);
> > +    } else {
> > +        /* This is the source side, we're in a separate thread
> > +         * or destination prior to migration_fd_process_incoming()
> > +         * we can't yield; so we have to poll the fd.
> > +         * But we need to be able to handle 'cancel' or an error
> > +         * without hanging forever.
> > +         */
> > +        while (!rdma->error_state && !rdma->error_reported &&
> > +               !rdma->received_error) {
> 
> Should we just check rdma->error_state? Since iiuc error_reported only
> means whether error is reported (set to 1 if reported), and
> received_error means whether we got explicit error from peer (I think
> only via a RDMA_CONTROL_ERROR message), and it's 1 if received. IIUC
> either error_reported or received_error will mean that error_state be
> set already.

I think I agree we don't need to check both error_state and
error_reported; I think it's best to check received_error since if
the other end fails we want this end to stop waiting for anything else.
I've not convinced myself that every path from receiving that error
till this point would have set the error_state (although it probably
has);  so I suggest:
   while (!rdma->error_state && !rdma->received_error) {

> > +            GPollFD pfds[1];
> > +            pfds[0].fd = rdma->comp_channel->fd;
> > +            pfds[0].events = G_IO_IN | G_IO_HUP | G_IO_ERR;
> > +            /* 0.5s timeout, should be fine for a 'cancel' */
> > +            switch (qemu_poll_ns(pfds, 1, 500 * 1000 * 1000)) {
> 
> (I would be more aggresive, say, 100ms, or even less, for better UI
>  response with merely nothing lost. But 500ms is okay as well to me)

Yes, it felt fairly arbitrary here.

> > +            case 1: /* fd active */
> > +                return 0;
> > +
> > +            case 0: /* Timeout, go around again */
> > +                break;
> > +
> > +            default: /* Error of some type */
> > +                return -1;
> > +            }
> > +
> > +            if (migrate_get_current()->state == MIGRATION_STATUS_CANCELLING) {
> > +                /* Bail out and let the cancellation happen */
> > +                return -EPIPE;
> > +            }
> > +        }
> > +    }
> > +
> > +    return rdma->error_state || rdma->error_reported ||
> > +           rdma->received_error;
> 
> Similar question here. Maybe just return rdma->error_state?

As above, I could just drop the error_reported part.

Dave

> Thanks,
> 
> -- 
> Peter Xu
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK
Peter Xu July 14, 2017, 2:57 a.m. UTC | #3
On Wed, Jul 12, 2017 at 01:36:07PM +0100, Dr. David Alan Gilbert wrote:
> * Peter Xu (peterx@redhat.com) wrote:
> > On Tue, Jul 04, 2017 at 07:49:13PM +0100, Dr. David Alan Gilbert (git) wrote:
> > > From: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
> > > 
> > > When waiting for a WRID, if the other side dies we end up waiting
> > > for ever with no way to cancel the migration.
> > > Cure this by poll()ing the fd first with a timeout and checking
> > > error flags and migration state.
> > > 
> > > Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
> > > ---
> > >  migration/rdma.c | 54 ++++++++++++++++++++++++++++++++++++++++++++++++------
> > >  1 file changed, 48 insertions(+), 6 deletions(-)
> > > 
> > > diff --git a/migration/rdma.c b/migration/rdma.c
> > > index 6111e10c70..7273ae9929 100644
> > > --- a/migration/rdma.c
> > > +++ b/migration/rdma.c
> > > @@ -1466,6 +1466,52 @@ static uint64_t qemu_rdma_poll(RDMAContext *rdma, uint64_t *wr_id_out,
> > >      return  0;
> > >  }
> > >  
> > > +/* Wait for activity on the completion channel.
> > > + * Returns 0 on success, none-0 on error.
> > > + */
> > > +static int qemu_rdma_wait_comp_channel(RDMAContext *rdma)
> > > +{
> > > +    /*
> > > +     * Coroutine doesn't start until migration_fd_process_incoming()
> > > +     * so don't yield unless we know we're running inside of a coroutine.
> > > +     */
> > > +    if (rdma->migration_started_on_destination) {
> > > +        yield_until_fd_readable(rdma->comp_channel->fd);
> > > +    } else {
> > > +        /* This is the source side, we're in a separate thread
> > > +         * or destination prior to migration_fd_process_incoming()
> > > +         * we can't yield; so we have to poll the fd.
> > > +         * But we need to be able to handle 'cancel' or an error
> > > +         * without hanging forever.
> > > +         */
> > > +        while (!rdma->error_state && !rdma->error_reported &&
> > > +               !rdma->received_error) {
> > 
> > Should we just check rdma->error_state? Since iiuc error_reported only
> > means whether error is reported (set to 1 if reported), and
> > received_error means whether we got explicit error from peer (I think
> > only via a RDMA_CONTROL_ERROR message), and it's 1 if received. IIUC
> > either error_reported or received_error will mean that error_state be
> > set already.
> 
> I think I agree we don't need to check both error_state and
> error_reported; I think it's best to check received_error since if
> the other end fails we want this end to stop waiting for anything else.
> I've not convinced myself that every path from receiving that error
> till this point would have set the error_state (although it probably
> has);  so I suggest:
>    while (!rdma->error_state && !rdma->received_error) {

qemu_rdma_exchange_get_response() is the only place I see to set the
received_error. And when that is triggered, the stack would be:

   qemu_rdma_exchange_get_response() set received_error=true, ret -EIO
     <-- qemu_rdma_exchange_recv() return the -EIO
       <-- qio_channel_rdma_readv() pass the -EIO to error_state, or
       <-- qemu_rdma_registration_handle() pass the -EIO to error_state

But even so, I'm totally fine that we check both.

> 
> > > +            GPollFD pfds[1];
> > > +            pfds[0].fd = rdma->comp_channel->fd;
> > > +            pfds[0].events = G_IO_IN | G_IO_HUP | G_IO_ERR;
> > > +            /* 0.5s timeout, should be fine for a 'cancel' */
> > > +            switch (qemu_poll_ns(pfds, 1, 500 * 1000 * 1000)) {
> > 
> > (I would be more aggresive, say, 100ms, or even less, for better UI
> >  response with merely nothing lost. But 500ms is okay as well to me)
> 
> Yes, it felt fairly arbitrary here.
> 
> > > +            case 1: /* fd active */
> > > +                return 0;
> > > +
> > > +            case 0: /* Timeout, go around again */
> > > +                break;
> > > +
> > > +            default: /* Error of some type */
> > > +                return -1;
> > > +            }
> > > +
> > > +            if (migrate_get_current()->state == MIGRATION_STATUS_CANCELLING) {
> > > +                /* Bail out and let the cancellation happen */
> > > +                return -EPIPE;
> > > +            }
> > > +        }
> > > +    }
> > > +
> > > +    return rdma->error_state || rdma->error_reported ||
> > > +           rdma->received_error;
> > 
> > Similar question here. Maybe just return rdma->error_state?
> 
> As above, I could just drop the error_reported part.

Sure. Will reply on the new version.
diff mbox

Patch

diff --git a/migration/rdma.c b/migration/rdma.c
index 6111e10c70..7273ae9929 100644
--- a/migration/rdma.c
+++ b/migration/rdma.c
@@ -1466,6 +1466,52 @@  static uint64_t qemu_rdma_poll(RDMAContext *rdma, uint64_t *wr_id_out,
     return  0;
 }
 
+/* Wait for activity on the completion channel.
+ * Returns 0 on success, none-0 on error.
+ */
+static int qemu_rdma_wait_comp_channel(RDMAContext *rdma)
+{
+    /*
+     * Coroutine doesn't start until migration_fd_process_incoming()
+     * so don't yield unless we know we're running inside of a coroutine.
+     */
+    if (rdma->migration_started_on_destination) {
+        yield_until_fd_readable(rdma->comp_channel->fd);
+    } else {
+        /* This is the source side, we're in a separate thread
+         * or destination prior to migration_fd_process_incoming()
+         * we can't yield; so we have to poll the fd.
+         * But we need to be able to handle 'cancel' or an error
+         * without hanging forever.
+         */
+        while (!rdma->error_state && !rdma->error_reported &&
+               !rdma->received_error) {
+            GPollFD pfds[1];
+            pfds[0].fd = rdma->comp_channel->fd;
+            pfds[0].events = G_IO_IN | G_IO_HUP | G_IO_ERR;
+            /* 0.5s timeout, should be fine for a 'cancel' */
+            switch (qemu_poll_ns(pfds, 1, 500 * 1000 * 1000)) {
+            case 1: /* fd active */
+                return 0;
+
+            case 0: /* Timeout, go around again */
+                break;
+
+            default: /* Error of some type */
+                return -1;
+            }
+
+            if (migrate_get_current()->state == MIGRATION_STATUS_CANCELLING) {
+                /* Bail out and let the cancellation happen */
+                return -EPIPE;
+            }
+        }
+    }
+
+    return rdma->error_state || rdma->error_reported ||
+           rdma->received_error;
+}
+
 /*
  * Block until the next work request has completed.
  *
@@ -1513,12 +1559,8 @@  static int qemu_rdma_block_for_wrid(RDMAContext *rdma, int wrid_requested,
     }
 
     while (1) {
-        /*
-         * Coroutine doesn't start until migration_fd_process_incoming()
-         * so don't yield unless we know we're running inside of a coroutine.
-         */
-        if (rdma->migration_started_on_destination) {
-            yield_until_fd_readable(rdma->comp_channel->fd);
+        if (qemu_rdma_wait_comp_channel(rdma)) {
+            goto err_block_for_wrid;
         }
 
         if (ibv_get_cq_event(rdma->comp_channel, &cq, &cq_ctx)) {