Patchwork [v5,10/12] rdma: introduce capability x-rdma-pin-all

Submitter mrhines@linux.vnet.ibm.com
Date April 21, 2013, 9:17 p.m.
Message ID <1366579081-6857-11-git-send-email-mrhines@linux.vnet.ibm.com>
Permalink /patch/238262/
State New

Comments

mrhines@linux.vnet.ibm.com - April 21, 2013, 9:17 p.m.
From: "Michael R. Hines" <mrhines@us.ibm.com>

This capability allows you to disable dynamic chunk registration
for better throughput on high-performance links.

For example, migrating an 8GB RAM virtual machine with all 8GB of memory in
active use, while the VM itself is otherwise idle, over a 40 Gbps InfiniBand link:

1. x-pin-all disabled total time: approximately 7.5 seconds @ 9.5 Gbps
2. x-pin-all enabled total time: approximately 4 seconds @ 26 Gbps

These numbers would of course scale up to whatever size virtual machine
you have to migrate using RDMA.

Enabling this feature does *not* have any measurable effect on
migration *downtime*. This is because, even without this feature, all of the
memory will already have been registered in advance during
the bulk round and does not need to be re-registered during the successive
iteration rounds.

Signed-off-by: Michael R. Hines <mrhines@us.ibm.com>
---
 include/migration/migration.h |    2 ++
 migration.c                   |    9 +++++++++
 qapi-schema.json              |    7 ++++++-
 3 files changed, 17 insertions(+), 1 deletion(-)
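For reference, the new capability would be toggled through the existing migrate-set-capabilities QMP command before migration starts. A minimal sketch of that message (illustrative only, not part of this patch):

```python
import json

def pin_all_command(enabled):
    """Build the QMP migrate-set-capabilities message as a JSON string.

    Sketch only: the command name and argument shape follow the existing
    MigrationCapability QMP interface that this patch extends.
    """
    return json.dumps({
        "execute": "migrate-set-capabilities",
        "arguments": {
            "capabilities": [
                # x-rdma-pin-all defaults to disabled; set state=True to
                # pre-register and mlock() all guest RAM up front.
                {"capability": "x-rdma-pin-all", "state": enabled},
            ],
        },
    })

print(pin_all_command(True))
```

The same toggle is available from the HMP monitor via migrate_set_capability.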
Eric Blake - April 22, 2013, 8:24 p.m.
On 04/21/2013 03:17 PM, mrhines@linux.vnet.ibm.com wrote:
> From: "Michael R. Hines" <mrhines@us.ibm.com>
> 
> This capability allows you to disable dynamic chunk registration
> for better throughput on high-performance links.
> 
> For example, migrating an 8GB RAM virtual machine with all 8GB of memory in
> active use, while the VM itself is otherwise idle, over a 40 Gbps InfiniBand link:
> 
> 1. x-pin-all disabled total time: approximately 7.5 seconds @ 9.5 Gbps
> 2. x-pin-all enabled total time: approximately 4 seconds @ 26 Gbps

Naming here doesn't match the actual bit name; but it is obvious enough
to know what you meant.

Thanks for doing this, by the way - the default-to-disabled is a bit
nicer to manage from libvirt's perspective.

> 
> These numbers would of course scale up to whatever size virtual machine
> you have to migrate using RDMA.
> 
> Enabling this feature does *not* have any measurable effect on
> migration *downtime*. This is because, even without this feature, all of the
> memory will already have been registered in advance during
> the bulk round and does not need to be re-registered during the successive
> iteration rounds.
> 
> Signed-off-by: Michael R. Hines <mrhines@us.ibm.com>
> ---
>  include/migration/migration.h |    2 ++
>  migration.c                   |    9 +++++++++
>  qapi-schema.json              |    7 ++++++-
>  3 files changed, 17 insertions(+), 1 deletion(-)


> +++ b/qapi-schema.json
> @@ -602,10 +602,15 @@
>  #          This feature allows us to minimize migration traffic for certain work
>  #          loads, by sending compressed difference of the pages
>  #
> +# @x-rdma-pin-all: (since 1.5) Controls whether or not the entire VM memory footprint is 

Trailing whitespace, and line longer than 80 columns.  You ought to
rewrap this, and make sure it passes checkpatch.pl.

But since that is whitespace-only, feel free to add:

Reviewed-by: Eric Blake <eblake@redhat.com>
mrhines@linux.vnet.ibm.com - April 22, 2013, 8:59 p.m.
On 04/22/2013 04:24 PM, Eric Blake wrote:
> On 04/21/2013 03:17 PM, mrhines@linux.vnet.ibm.com wrote:
>> From: "Michael R. Hines" <mrhines@us.ibm.com>
>>
>> This capability allows you to disable dynamic chunk registration
>> for better throughput on high-performance links.
>>
>> For example, migrating an 8GB RAM virtual machine with all 8GB of memory in
>> active use, while the VM itself is otherwise idle, over a 40 Gbps InfiniBand link:
>>
>> 1. x-pin-all disabled total time: approximately 7.5 seconds @ 9.5 Gbps
>> 2. x-pin-all enabled total time: approximately 4 seconds @ 26 Gbps
> Naming here doesn't match the actual bit name; but it is obvious enough
> to know what you meant.
>
> Thanks for doing this, by the way - the default-to-disabled is a bit
> nicer to manage from libvirt's perspective.
>
>> These numbers would of course scale up to whatever size virtual machine
>> you have to migrate using RDMA.
>>
>> Enabling this feature does *not* have any measurable effect on
>> migration *downtime*. This is because, even without this feature, all of the
>> memory will already have been registered in advance during
>> the bulk round and does not need to be re-registered during the successive
>> iteration rounds.
>>
>> Signed-off-by: Michael R. Hines <mrhines@us.ibm.com>
>> ---
>>   include/migration/migration.h |    2 ++
>>   migration.c                   |    9 +++++++++
>>   qapi-schema.json              |    7 ++++++-
>>   3 files changed, 17 insertions(+), 1 deletion(-)
>
>> +++ b/qapi-schema.json
>> @@ -602,10 +602,15 @@
>>   #          This feature allows us to minimize migration traffic for certain work
>>   #          loads, by sending compressed difference of the pages
>>   #
>> +# @x-rdma-pin-all: (since 1.5) Controls whether or not the entire VM memory footprint is
> Trailing whitespace, and line longer than 80 columns.  You ought to
> rewrap this, and make sure it passes checkpatch.pl.
>
> But since that is whitespace-only, feel free to add:
>
> Reviewed-by: Eric Blake <eblake@redhat.com>
>

Acknowledged.

Patch

diff --git a/include/migration/migration.h b/include/migration/migration.h
index d173bd9..3b4d5e9 100644
--- a/include/migration/migration.h
+++ b/include/migration/migration.h
@@ -122,6 +122,8 @@  void migrate_add_blocker(Error *reason);
  */
 void migrate_del_blocker(Error *reason);
 
+bool migrate_rdma_pin_all(void);
+
 int xbzrle_encode_buffer(uint8_t *old_buf, uint8_t *new_buf, int slen,
                          uint8_t *dst, int dlen);
 int xbzrle_decode_buffer(uint8_t *src, int slen, uint8_t *dst, int dlen);
diff --git a/migration.c b/migration.c
index 48b5174..b13fa66 100644
--- a/migration.c
+++ b/migration.c
@@ -476,6 +476,15 @@  void qmp_migrate_set_downtime(double value, Error **errp)
     max_downtime = (uint64_t)value;
 }
 
+bool migrate_rdma_pin_all(void)
+{
+    MigrationState *s;
+
+    s = migrate_get_current();
+
+    return s->enabled_capabilities[MIGRATION_CAPABILITY_X_RDMA_PIN_ALL];
+}
+
 int migrate_use_xbzrle(void)
 {
     MigrationState *s;
diff --git a/qapi-schema.json b/qapi-schema.json
index cc846c3..b73e30a 100644
--- a/qapi-schema.json
+++ b/qapi-schema.json
@@ -602,10 +602,15 @@ 
 #          This feature allows us to minimize migration traffic for certain work
 #          loads, by sending compressed difference of the pages
 #
+# @x-rdma-pin-all: (since 1.5) Controls whether or not the entire VM memory footprint is 
+#          mlock()'d on demand or all at once. Refer to docs/rdma.txt for advice on usage.
+#          Disabled by default. Experimental: may (or may not) be renamed after
+#          further testing is complete.
+#
 # Since: 1.2
 ##
 { 'enum': 'MigrationCapability',
-  'data': ['xbzrle'] }
+  'data': ['xbzrle', 'x-rdma-pin-all'] }
 
 ##
 # @MigrationCapabilityStatus