diff mbox

[v7,30/42] Page request: Add MIG_RP_MSG_REQ_PAGES reverse command

Message ID 1434450415-11339-31-git-send-email-dgilbert@redhat.com
State New
Headers show

Commit Message

Dr. David Alan Gilbert June 16, 2015, 10:26 a.m. UTC
From: "Dr. David Alan Gilbert" <dgilbert@redhat.com>

Add MIG_RP_MSG_REQ_PAGES command on Return path for the postcopy
destination to request a page from the source.

Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
---
 include/migration/migration.h |  4 +++
 migration/migration.c         | 70 +++++++++++++++++++++++++++++++++++++++++++
 trace-events                  |  1 +
 3 files changed, 75 insertions(+)

Comments

Juan Quintela July 13, 2015, 1:24 p.m. UTC | #1
"Dr. David Alan Gilbert (git)" <dgilbert@redhat.com> wrote:
> From: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
>
> Add MIG_RP_MSG_REQ_PAGES command on Return path for the postcopy
> destination to request a page from the source.
>
> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
> ---
>  include/migration/migration.h |  4 +++
>  migration/migration.c         | 70 +++++++++++++++++++++++++++++++++++++++++++
>  trace-events                  |  1 +
>  3 files changed, 75 insertions(+)
>
> diff --git a/include/migration/migration.h b/include/migration/migration.h
> index 68a1731..8742d53 100644
> --- a/include/migration/migration.h
> +++ b/include/migration/migration.h
> @@ -47,6 +47,8 @@ enum mig_rp_message_type {
>      MIG_RP_MSG_INVALID = 0,  /* Must be 0 */
>      MIG_RP_MSG_SHUT,         /* sibling will not send any more RP messages */
>      MIG_RP_MSG_PONG,         /* Response to a PING; data (seq: be32 ) */
> +
> +    MIG_RP_MSG_REQ_PAGES,    /* data (start: be64, len: be64) */

Not that I really care, buht I think that leng could be 32bits.  I am
not seing networking getting good at multigigabytes transfers soon O:-)


> +void migrate_send_rp_req_pages(MigrationIncomingState *mis, const char* rbname,
> +                              ram_addr_t start, ram_addr_t len);

Shouldn't len be a size_t?
(yes, I know that migration code is not really consistent about that)

>  void ram_control_before_iterate(QEMUFile *f, uint64_t flags);
>  void ram_control_after_iterate(QEMUFile *f, uint64_t flags);
> diff --git a/migration/migration.c b/migration/migration.c
> index 3e5a7c8..0373b77 100644
> --- a/migration/migration.c
> +++ b/migration/migration.c
> @@ -113,6 +113,36 @@ static void deferred_incoming_migration(Error **errp)
>      deferred_incoming = true;
>  }
>  
> +/* Request a range of pages from the source VM at the given
> + * start address.
> + *   rbname: Name of the RAMBlock to request the page in, if NULL it's the same
> + *           as the last request (a name must have been given previously)
> + *   Start: Address offset within the RB
> + *   Len: Length in bytes required - must be a multiple of pagesize
> + */
> +void migrate_send_rp_req_pages(MigrationIncomingState *mis, const char *rbname,
> +                               ram_addr_t start, ram_addr_t len)
> +{
> +    uint8_t bufc[16+1+255]; /* start (8 byte), len (8 byte), rbname upto 256 */
> +    uint64_t *buf64 = (uint64_t *)bufc;
> +    size_t msglen = 16; /* start + len */
> +
> +    assert(!(len & 1));

ohhhh, why can't we get a real flags field?

Scratch that.  Seeing the rest of the code, can't we have two commands:

MIG_RP_MSG_REQ_PAGES
MIG_RP_MSG_REQ_PAGES_WITH_ID

I am not really sure that it makes sense getting a command that can be
of two different lengths only for that?


I am not sure, but having a command with two different payloads look
strange.

Later, Juan.
Amit Shah July 23, 2015, 6:50 a.m. UTC | #2
On (Tue) 16 Jun 2015 [11:26:43], Dr. David Alan Gilbert (git) wrote:
> From: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
> 
> Add MIG_RP_MSG_REQ_PAGES command on Return path for the postcopy
> destination to request a page from the source.
> 
> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>

> --- a/migration/migration.c
> +++ b/migration/migration.c
> @@ -113,6 +113,36 @@ static void deferred_incoming_migration(Error **errp)
>      deferred_incoming = true;
>  }
>  
> +/* Request a range of pages from the source VM at the given
> + * start address.
> + *   rbname: Name of the RAMBlock to request the page in, if NULL it's the same
> + *           as the last request (a name must have been given previously)

Why not just send the name all the time?

> @@ -1010,6 +1058,28 @@ static void *source_return_path_thread(void *opaque)
>              trace_source_return_path_thread_pong(tmp32);
>              break;
>  
> +        case MIG_RP_MSG_REQ_PAGES:
> +            start = be64_to_cpup((uint64_t *)buf);
> +            len = be64_to_cpup(((uint64_t *)buf)+1);
> +            tmpstr = NULL;
> +            if (len & 1) {
> +                len -= 1; /* Remove the flag */
> +                /* Now we expect an idstr */
> +                tmp32 = buf[16]; /* Length of the following idstr */
> +                tmpstr = (char *)&buf[17];
> +                buf[17+tmp32] = '\0';
> +                expected_len = 16+1+tmp32;

Whitespace missing around operators

> +            } else {
> +                expected_len = 16;
> +            }

This else can be removed if expected_len is set before the if


		Amit
Dr. David Alan Gilbert Aug. 6, 2015, 2:15 p.m. UTC | #3
* Juan Quintela (quintela@redhat.com) wrote:
> "Dr. David Alan Gilbert (git)" <dgilbert@redhat.com> wrote:
> > From: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
> >
> > Add MIG_RP_MSG_REQ_PAGES command on Return path for the postcopy
> > destination to request a page from the source.
> >
> > Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
> > ---
> >  include/migration/migration.h |  4 +++
> >  migration/migration.c         | 70 +++++++++++++++++++++++++++++++++++++++++++
> >  trace-events                  |  1 +
> >  3 files changed, 75 insertions(+)
> >
> > diff --git a/include/migration/migration.h b/include/migration/migration.h
> > index 68a1731..8742d53 100644
> > --- a/include/migration/migration.h
> > +++ b/include/migration/migration.h
> > @@ -47,6 +47,8 @@ enum mig_rp_message_type {
> >      MIG_RP_MSG_INVALID = 0,  /* Must be 0 */
> >      MIG_RP_MSG_SHUT,         /* sibling will not send any more RP messages */
> >      MIG_RP_MSG_PONG,         /* Response to a PING; data (seq: be32 ) */
> > +
> > +    MIG_RP_MSG_REQ_PAGES,    /* data (start: be64, len: be64) */
> 
> Not that I really care, buht I think that leng could be 32bits.  I am
> not seing networking getting good at multigigabytes transfers soon O:-)

Done.

> > +void migrate_send_rp_req_pages(MigrationIncomingState *mis, const char* rbname,
> > +                              ram_addr_t start, ram_addr_t len);
> 
> Shouldn't len be a size_t?
> (yes, I know that migration code is not really consistent about that)

Done (fun combination with the change above, but still)

> >  void ram_control_before_iterate(QEMUFile *f, uint64_t flags);
> >  void ram_control_after_iterate(QEMUFile *f, uint64_t flags);
> > diff --git a/migration/migration.c b/migration/migration.c
> > index 3e5a7c8..0373b77 100644
> > --- a/migration/migration.c
> > +++ b/migration/migration.c
> > @@ -113,6 +113,36 @@ static void deferred_incoming_migration(Error **errp)
> >      deferred_incoming = true;
> >  }
> >  
> > +/* Request a range of pages from the source VM at the given
> > + * start address.
> > + *   rbname: Name of the RAMBlock to request the page in, if NULL it's the same
> > + *           as the last request (a name must have been given previously)
> > + *   Start: Address offset within the RB
> > + *   Len: Length in bytes required - must be a multiple of pagesize
> > + */
> > +void migrate_send_rp_req_pages(MigrationIncomingState *mis, const char *rbname,
> > +                               ram_addr_t start, ram_addr_t len)
> > +{
> > +    uint8_t bufc[16+1+255]; /* start (8 byte), len (8 byte), rbname upto 256 */
> > +    uint64_t *buf64 = (uint64_t *)bufc;
> > +    size_t msglen = 16; /* start + len */
> > +
> > +    assert(!(len & 1));
> 
> ohhhh, why can't we get a real flags field?
> 
> Scratch that.  Seeing the rest of the code, can't we have two commands:
> 
> MIG_RP_MSG_REQ_PAGES
> MIG_RP_MSG_REQ_PAGES_WITH_ID
> 
> I am not really sure that it makes sense getting a command that can be
> of two different lengths only for that?
> 
> I am not sure, but having a command with two different payloads look
> strange.

Done (I made it _ID rather than WITH_ID - it was getting a bit long).

Dave

> 
> Later, Juan.
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK
Dr. David Alan Gilbert Aug. 6, 2015, 2:21 p.m. UTC | #4
* Amit Shah (amit.shah@redhat.com) wrote:
> On (Tue) 16 Jun 2015 [11:26:43], Dr. David Alan Gilbert (git) wrote:
> > From: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
> > 
> > Add MIG_RP_MSG_REQ_PAGES command on Return path for the postcopy
> > destination to request a page from the source.
> > 
> > Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
> 
> > --- a/migration/migration.c
> > +++ b/migration/migration.c
> > @@ -113,6 +113,36 @@ static void deferred_incoming_migration(Error **errp)
> >      deferred_incoming = true;
> >  }
> >  
> > +/* Request a range of pages from the source VM at the given
> > + * start address.
> > + *   rbname: Name of the RAMBlock to request the page in, if NULL it's the same
> > + *           as the last request (a name must have been given previously)
> 
> Why not just send the name all the time?

It does shrink the messages quite a bit, and you have to do a name lookup
on them when you receive it, rather than just knowing it's the same one.

> > @@ -1010,6 +1058,28 @@ static void *source_return_path_thread(void *opaque)
> >              trace_source_return_path_thread_pong(tmp32);
> >              break;
> >  
> > +        case MIG_RP_MSG_REQ_PAGES:
> > +            start = be64_to_cpup((uint64_t *)buf);
> > +            len = be64_to_cpup(((uint64_t *)buf)+1);
> > +            tmpstr = NULL;
> > +            if (len & 1) {
> > +                len -= 1; /* Remove the flag */
> > +                /* Now we expect an idstr */
> > +                tmp32 = buf[16]; /* Length of the following idstr */
> > +                tmpstr = (char *)&buf[17];
> > +                buf[17+tmp32] = '\0';
> > +                expected_len = 16+1+tmp32;
> 
> Whitespace missing around operators

Done.

> > +            } else {
> > +                expected_len = 16;
> > +            }
> 
> This else can be removed if expected_len is set before the if

That's simplified out with the change Juan suggested.

Dave

> 
> 		Amit
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK
diff mbox

Patch

diff --git a/include/migration/migration.h b/include/migration/migration.h
index 68a1731..8742d53 100644
--- a/include/migration/migration.h
+++ b/include/migration/migration.h
@@ -47,6 +47,8 @@  enum mig_rp_message_type {
     MIG_RP_MSG_INVALID = 0,  /* Must be 0 */
     MIG_RP_MSG_SHUT,         /* sibling will not send any more RP messages */
     MIG_RP_MSG_PONG,         /* Response to a PING; data (seq: be32 ) */
+
+    MIG_RP_MSG_REQ_PAGES,    /* data (start: be64, len: be64) */
 };
 
 typedef QLIST_HEAD(, LoadStateEntry) LoadStateEntry_Head;
@@ -246,6 +248,8 @@  void migrate_send_rp_shut(MigrationIncomingState *mis,
                           uint32_t value);
 void migrate_send_rp_pong(MigrationIncomingState *mis,
                           uint32_t value);
+void migrate_send_rp_req_pages(MigrationIncomingState *mis, const char* rbname,
+                              ram_addr_t start, ram_addr_t len);
 
 void ram_control_before_iterate(QEMUFile *f, uint64_t flags);
 void ram_control_after_iterate(QEMUFile *f, uint64_t flags);
diff --git a/migration/migration.c b/migration/migration.c
index 3e5a7c8..0373b77 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -113,6 +113,36 @@  static void deferred_incoming_migration(Error **errp)
     deferred_incoming = true;
 }
 
+/* Request a range of pages from the source VM at the given
+ * start address.
+ *   rbname: Name of the RAMBlock to request the page in, if NULL it's the same
+ *           as the last request (a name must have been given previously)
+ *   Start: Address offset within the RB
+ *   Len: Length in bytes required - must be a multiple of pagesize
+ */
+void migrate_send_rp_req_pages(MigrationIncomingState *mis, const char *rbname,
+                               ram_addr_t start, ram_addr_t len)
+{
+    uint8_t bufc[16+1+255]; /* start (8 byte), len (8 byte), rbname upto 256 */
+    uint64_t *buf64 = (uint64_t *)bufc;
+    size_t msglen = 16; /* start + len */
+
+    assert(!(len & 1));
+    if (rbname) {
+        int rbname_len = strlen(rbname);
+        assert(rbname_len < 256);
+
+        len |= 1; /* Flag to say we've got a name */
+        bufc[msglen++] = rbname_len;
+        memcpy(bufc + msglen, rbname, rbname_len);
+        msglen += rbname_len;
+    }
+
+    buf64[0] = cpu_to_be64((uint64_t)start);
+    buf64[1] = cpu_to_be64((uint64_t)len);
+    migrate_send_rp_message(mis, MIG_RP_MSG_REQ_PAGES, msglen, bufc);
+}
+
 void qemu_start_incoming_migration(const char *uri, Error **errp)
 {
     const char *p;
@@ -939,6 +969,17 @@  static void source_return_path_bad(MigrationState *s)
 }
 
 /*
+ * Process a request for pages received on the return path,
+ * We're allowed to send more than requested (e.g. to round to our page size)
+ * and we don't need to send pages that have already been sent.
+ */
+static void migrate_handle_rp_req_pages(MigrationState *ms, const char* rbname,
+                                       ram_addr_t start, ram_addr_t len)
+{
+    trace_migrate_handle_rp_req_pages(rbname, start, len);
+}
+
+/*
  * Handles messages sent on the return path towards the source VM
  *
  */
@@ -950,6 +991,8 @@  static void *source_return_path_thread(void *opaque)
     const int max_len = 512;
     uint8_t buf[max_len];
     uint32_t tmp32;
+    ram_addr_t start, len;
+    char *tmpstr;
     int res;
 
     trace_source_return_path_thread_entry();
@@ -965,6 +1008,11 @@  static void *source_return_path_thread(void *opaque)
             expected_len = 4;
             break;
 
+        case MIG_RP_MSG_REQ_PAGES:
+            /* 16 byte start/len _possibly_ plus an id str */
+            expected_len = 16 + 256;
+            break;
+
         default:
             error_report("RP: Received invalid message 0x%04x length 0x%04x",
                     header_type, header_len);
@@ -1010,6 +1058,28 @@  static void *source_return_path_thread(void *opaque)
             trace_source_return_path_thread_pong(tmp32);
             break;
 
+        case MIG_RP_MSG_REQ_PAGES:
+            start = be64_to_cpup((uint64_t *)buf);
+            len = be64_to_cpup(((uint64_t *)buf)+1);
+            tmpstr = NULL;
+            if (len & 1) {
+                len -= 1; /* Remove the flag */
+                /* Now we expect an idstr */
+                tmp32 = buf[16]; /* Length of the following idstr */
+                tmpstr = (char *)&buf[17];
+                buf[17+tmp32] = '\0';
+                expected_len = 16+1+tmp32;
+            } else {
+                expected_len = 16;
+            }
+            if (header_len != expected_len) {
+                error_report("RP: Req_Page with length %d expecting %d",
+                        header_len, expected_len);
+                source_return_path_bad(ms);
+            }
+            migrate_handle_rp_req_pages(ms, tmpstr, start, len);
+            break;
+
         default:
             break;
         }
diff --git a/trace-events b/trace-events
index 528d5a3..328d64c 100644
--- a/trace-events
+++ b/trace-events
@@ -1422,6 +1422,7 @@  migrate_fd_error(void) ""
 migrate_fd_cancel(void) ""
 migrate_pending(uint64_t size, uint64_t max, uint64_t post, uint64_t nonpost) "pending size %" PRIu64 " max %" PRIu64 " (post=%" PRIu64 " nonpost=%" PRIu64 ")"
 migrate_send_rp_message(int msg_type, uint16_t len) "%d: len %d"
+migrate_handle_rp_req_pages(const char *rbname, size_t start, size_t len) "in %s at %zx len %zx"
 migration_thread_after_loop(void) ""
 migration_thread_file_err(void) ""
 migration_thread_setup_complete(void) ""