[V13,10/13] NUMA: add qmp command set-mem-policy to set memory policy for NUMA node

Message ID 1379387785-14554-11-git-send-email-gaowanlong@cn.fujitsu.com
State New
Commit Message

Wanlong Gao Sept. 17, 2013, 3:16 a.m. UTC
This QMP command allows the user to set a guest node's memory policy
through the QMP protocol. The equivalent qmp-shell command looks like:
    set-mem-policy nodeid=0 policy=membind relative=true host-nodes=0-1

Reviewed-by: Luiz Capitulino <lcapitulino@redhat.com>
Signed-off-by: Wanlong Gao <gaowanlong@cn.fujitsu.com>
---
 numa.c           | 66 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++
 qapi-schema.json | 21 ++++++++++++++++++
 qmp-commands.hx  | 41 +++++++++++++++++++++++++++++++++++
 3 files changed, 128 insertions(+)
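
For readers driving QMP without qmp-shell, the request above maps to a JSON object like the one in the qmp-commands.hx example further down. The following is a minimal sketch of building that request in C. The helper name `format_set_mem_policy` is hypothetical and not part of the patch, and it assumes the qmp-shell range `0-1` expands to the array `[0, 1]` as shown in that example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper (not part of the patch): format a set-mem-policy
 * request for the contiguous host node range first..last into buf.
 * The policy string is copied as-is and is not JSON-escaped.
 * Returns the number of characters written, or -1 if buf is too small.
 */
int format_set_mem_policy(char *buf, size_t len, unsigned nodeid,
                          const char *policy, bool relative,
                          unsigned first, unsigned last)
{
    assert(buf != NULL && policy != NULL && first <= last);

    int n = snprintf(buf, len,
                     "{ \"execute\": \"set-mem-policy\", \"arguments\": { "
                     "\"nodeid\": %u, \"policy\": \"%s\", \"relative\": %s, "
                     "\"host-nodes\": [",
                     nodeid, policy, relative ? "true" : "false");
    if (n < 0 || (size_t)n >= len) {
        return -1;
    }
    size_t off = (size_t)n;

    /* Emit the node list; the explicit break avoids wrapping at UINT_MAX. */
    for (unsigned node = first; ; node++) {
        n = snprintf(buf + off, len - off, "%s%u",
                     node == first ? "" : ", ", node);
        if (n < 0 || (size_t)n >= len - off) {
            return -1;
        }
        off += (size_t)n;
        if (node == last) {
            break;
        }
    }

    n = snprintf(buf + off, len - off, "] } }");
    if (n < 0 || (size_t)n >= len - off) {
        return -1;
    }
    return (int)(off + (size_t)n);
}
```

Calling it with nodeid 0, policy "membind", relative true and the range 0..1 produces the same request as the documented example.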

Comments

Marcelo Tosatti Oct. 3, 2013, 2:13 a.m. UTC | #1
On Tue, Sep 17, 2013 at 11:16:22AM +0800, Wanlong Gao wrote:
> This QMP command allows user set guest node's memory policy
> through the QMP protocol. The qmp-shell command is like:
>     set-mem-policy nodeid=0 policy=membind relative=true host-nodes=0-1
> 
> Reviewed-by: Luiz Capitulino <lcapitulino@redhat.com>
> Signed-off-by: Wanlong Gao <gaowanlong@cn.fujitsu.com>

Wanlong Gao,

1)

Exposing mbind via QMP/HMP on a live guest is interesting because of
this passage in the mbind manpage:

"By default, mbind() only has an effect for new allocations;
if the pages inside the range have been already touched before
setting the policy, then the policy has no effect. This default
behavior may be overridden by the MPOL_MF_MOVE and
MPOL_MF_MOVE_ALL flags described below."

This means that executing set-mem-policy on a live guest is
unpredictable: it depends on which pages have been faulted in already.

Should the command be restricted to offline guests?

2)

Have you tested the patchset with hugetlbfs (-mem-path) backing?
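
To illustrate point 1, here is a minimal Linux-only sketch of the difference between a plain mbind() and one that also migrates pages that were already touched. The names `node_mask_range` and `bind_range` are hypothetical and do not appear in the patch. The sketch assumes a single-word node mask and that the caller passes a page-aligned address.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stddef.h>
#include <sys/syscall.h>
#include <unistd.h>
#include <linux/mempolicy.h>

/* Build a one-word node mask covering host nodes first..last (inclusive). */
unsigned long node_mask_range(unsigned first, unsigned last)
{
    unsigned long mask = 0;

    assert(first <= last && last < 8 * sizeof(unsigned long));
    for (unsigned n = first; n <= last; n++) {
        mask |= 1UL << n;
    }
    return mask;
}

/*
 * Bind [addr, addr + len) to the nodes in mask with MPOL_BIND.
 * With migrate == 0 only future allocations follow the policy, which is
 * the default behavior quoted from the manpage. With migrate != 0 the
 * kernel is also asked to move already-faulted pages (MPOL_MF_MOVE).
 * Returns 0 on success, -1 with errno set on failure.
 */
long bind_range(void *addr, size_t len, unsigned long mask, int migrate)
{
    unsigned flags = migrate ? MPOL_MF_MOVE : 0;

    /* maxnode is passed as bit count + 1, a common idiom for this syscall. */
    return syscall(SYS_mbind, addr, len, MPOL_BIND, &mask,
                   8 * sizeof(mask) + 1, flags);
}
```

Whether migrating pages is acceptable for a running guest, in particular once memory is pinned for device assignment, is exactly the open question here; the sketch only shows where the flag would go.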
Marcelo Tosatti Oct. 4, 2013, 12:04 a.m. UTC | #2
On Wed, Oct 02, 2013 at 11:13:29PM -0300, Marcelo Tosatti wrote:
> On Tue, Sep 17, 2013 at 11:16:22AM +0800, Wanlong Gao wrote:
> > This QMP command allows user set guest node's memory policy
> > through the QMP protocol. The qmp-shell command is like:
> >     set-mem-policy nodeid=0 policy=membind relative=true host-nodes=0-1
> > 
> > Reviewed-by: Luiz Capitulino <lcapitulino@redhat.com>
> > Signed-off-by: Wanlong Gao <gaowanlong@cn.fujitsu.com>
> 
> Wanlong Gao,
> 
> 1)
> 
> Exposing mbind via QMP/HMP on a live guest is interesting because,
> see mbind manpage: 
> 
> "By  default,  mbind() only has an effect for new allocations;
> if the pages inside the range have been already touched before
> setting the policy, then the policy has no effect.  This  default
> behavior  may  be  overridden  by  the  MPOL_MF_MOVE  and
> MPOL_MF_MOVE_ALL flags described below."
> 
> This means that executing set-mem-policy on a live guest is
> unpredictable: it depends on which pages have been faulted in already.
> 
> Should the command be restricted to offline guests?

In fact, unless I am missing something, it should be removed: to solve
the device assignment case (memory pinning), mbind must be executed before
the memory regions are registered.

> 2)
> 
> Have you tested the patchset with hugetlbfs (-mem-path) backing ?
>
Paolo Bonzini Oct. 4, 2013, 8:13 a.m. UTC | #3
On 04/10/2013 02:04, Marcelo Tosatti wrote:
>>> This QMP command allows user set guest node's memory policy
>>> through the QMP protocol. The qmp-shell command is like:
>>>     set-mem-policy nodeid=0 policy=membind relative=true host-nodes=0-1
>>>
>>> Reviewed-by: Luiz Capitulino <lcapitulino@redhat.com>
>>> Signed-off-by: Wanlong Gao <gaowanlong@cn.fujitsu.com>
>>
>> Wanlong Gao,
>>
>> 1)
>>
>> Exposing mbind via QMP/HMP on a live guest is interesting because,
>> see mbind manpage:
>>
>> "By  default,  mbind() only has an effect for new allocations;
>> if the pages inside the range have been already touched before
>> setting the policy, then the policy has no effect.  This  default
>> behavior  may  be  overridden  by  the  MPOL_MF_MOVE  and
>> MPOL_MF_MOVE_ALL flags described below."
>>
>> This means that executing set-mem-policy on a live guest is
>> unpredictable: it depends on which pages have been faulted in already.
>>
>> Should the command be restricted to offline guests?
> In fact, unless there is a missing point, it should be removed: to solve
> the device assignment case (memory pinning), mbind must be executed before
> the memory regions are registered.
> 

Right.  We can add the command back later as memory-add, together with
memory hotplug.

Paolo
Wanlong Gao Oct. 7, 2013, 1:28 a.m. UTC | #4
On 10/04/2013 04:13 PM, Paolo Bonzini wrote:
> On 04/10/2013 02:04, Marcelo Tosatti wrote:
>>>>>> This QMP command allows user set guest node's memory policy
>>>>>> through the QMP protocol. The qmp-shell command is like:
>>>>>>     set-mem-policy nodeid=0 policy=membind relative=true host-nodes=0-1
>>>>>>
>>>>>> Reviewed-by: Luiz Capitulino <lcapitulino@redhat.com>
>>>>>> Signed-off-by: Wanlong Gao <gaowanlong@cn.fujitsu.com>
>>>>
>>>> Wanlong Gao,
>>>>
>>>> 1)
>>>>
>>>> Exposing mbind via QMP/HMP on a live guest is interesting because,
>>>> see mbind manpage: 
>>>>
>>>> "By  default,  mbind() only has an effect for new allocations;
>>>> if the pages inside the range have been already touched before
>>>> setting the policy, then the policy has no effect.  This  default
>>>> behavior  may  be  overridden  by  the  MPOL_MF_MOVE  and
>>>> MPOL_MF_MOVE_ALL flags described below."
>>>>
>>>> This means that executing set-mem-policy on a live guest is
>>>> unpredictable: it depends on which pages have been faulted in already.
>>>>
>>>> Should the command be restricted to offline guests?
>> In fact, unless there is a missing point, it should be removed: to solve
>> the device assignment case (memory pinning), mbind must be executed before
>> the memory regions are registered.
>>
> 
> Right.  We can add the command back later as memory-add, together with
> memory hotplug.

OK, I will remove the command from this patch set.

Thanks,
Wanlong Gao

> 
> Paolo
>
Patch

diff --git a/numa.c b/numa.c
index 915a67a..19ee7f7 100644
--- a/numa.c
+++ b/numa.c
@@ -28,6 +28,7 @@ 
 #include "qapi/opts-visitor.h"
 #include "qapi/dealloc-visitor.h"
 #include "exec/memory.h"
+#include "qmp-commands.h"
 
 #ifdef __linux__
 #include <sys/syscall.h>
@@ -327,3 +328,68 @@  void set_numa_modes(void)
         }
     }
 }
+
+void qmp_set_mem_policy(uint16_t nodeid, bool has_policy, NumaNodePolicy policy,
+                        bool has_relative, bool relative,
+                        bool has_host_nodes, uint16List *host_nodes,
+                        Error **errp)
+{
+    NumaNodePolicy old_policy;
+    bool old_relative;
+    DECLARE_BITMAP(host_mem, MAX_NODES);
+    uint16List *nodes;
+
+    if (nodeid >= nb_numa_nodes) {
+        error_setg(errp, "Only %d NUMA nodes are configured", nb_numa_nodes);
+        return;
+    }
+
+    bitmap_copy(host_mem, numa_info[nodeid].host_mem, MAX_NODES);
+    old_policy = numa_info[nodeid].policy;
+    old_relative = numa_info[nodeid].relative;
+
+    numa_info[nodeid].policy = NUMA_NODE_POLICY_DEFAULT;
+    numa_info[nodeid].relative = false;
+    bitmap_zero(numa_info[nodeid].host_mem, MAX_NODES);
+
+    if (!has_policy) {
+        if (set_node_mem_policy(nodeid) == -1) {
+            error_setg(errp, "Failed to set memory policy for node%" PRIu16,
+                       nodeid);
+            goto error;
+        }
+        return;
+    }
+
+    numa_info[nodeid].policy = policy;
+
+    if (has_relative) {
+        numa_info[nodeid].relative = relative;
+    }
+
+    if (!has_host_nodes) {
+        bitmap_zero(numa_info[nodeid].host_mem, MAX_NODES);
+        bitmap_set(numa_info[nodeid].host_mem, 0, 1);
+    }
+
+    for (nodes = host_nodes; nodes; nodes = nodes->next) {
+        if (nodes->value >= MAX_NODES) {
+            continue;
+        }
+        bitmap_set(numa_info[nodeid].host_mem, nodes->value, 1);
+    }
+
+    if (set_node_mem_policy(nodeid) == -1) {
+        error_setg(errp, "Failed to set memory policy for node%" PRIu16,
+                   nodeid);
+        goto error;
+    }
+
+    return;
+
+error:
+    bitmap_copy(numa_info[nodeid].host_mem, host_mem, MAX_NODES);
+    numa_info[nodeid].policy = old_policy;
+    numa_info[nodeid].relative = old_relative;
+    return;
+}
diff --git a/qapi-schema.json b/qapi-schema.json
index dbe7088..914c0c0 100644
--- a/qapi-schema.json
+++ b/qapi-schema.json
@@ -3914,3 +3914,24 @@ 
    '*policy':     'NumaNodePolicy',
    '*relative':   'bool',
    '*host-nodes': ['uint16'] }}
+
+##
+# @set-mem-policy:
+#
+# Set the host memory binding policy for a guest NUMA node.
+#
+# @nodeid: The node ID of the guest NUMA node to set the memory policy for.
+#
+# @policy: #optional The memory policy to be set (default 'default').
+#
+# @relative: #optional Whether the host nodes are relative (default 'false')
+#
+# @host-nodes: #optional The host nodes range for memory policy.
+#
+# Returns: Nothing on success
+#
+# Since: 1.7
+##
+{ 'command': 'set-mem-policy',
+  'data': {'nodeid': 'uint16', '*policy': 'NumaNodePolicy',
+           '*relative': 'bool', '*host-nodes': ['uint16'] } }
diff --git a/qmp-commands.hx b/qmp-commands.hx
index 008cad9..fc7b804 100644
--- a/qmp-commands.hx
+++ b/qmp-commands.hx
@@ -3089,6 +3089,7 @@  Example:
 <- { "return": {} }
 
 EQMP
+
     {
         .name       = "query-rx-filter",
         .args_type  = "name:s?",
@@ -3152,3 +3153,43 @@  Example:
    }
 
 EQMP
+
+    {
+        .name      = "set-mem-policy",
+        .args_type = "nodeid:i,policy:s?,relative:b?,host-nodes:q?",
+        .help      = "Set the host memory binding policy for guest NUMA node",
+        .mhandler.cmd_new = qmp_marshal_input_set_mem_policy,
+    },
+
+SQMP
+set-mem-policy
+--------------
+
+Set the host memory binding policy for guest NUMA node
+
+Arguments:
+
+- "nodeid": The ID of the guest NUMA node whose memory policy is set.
+            (json-int)
+- "policy": The memory policy to set.
+            (json-string, optional)
+- "relative": Whether the specified host nodes are relative.
+              (json-bool, optional)
+- "host-nodes": The host nodes this memory policy applies to.
+                (a json-array of int, optional)
+
+Example:
+
+-> { "execute": "set-mem-policy", "arguments": { "nodeid": 0,
+                                                 "policy": "membind",
+                                                 "relative": true,
+                                                 "host-nodes": [0, 1] } }
+<- { "return": {} }
+
+Notes:
+    1. If "policy" is not set, the memory policy of this "nodeid" will be set
+       to "default".
+    2. If "host-nodes" is not set, the node mask of this "policy" will be set
+       to host node 0.
+
+EQMP
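
The host-node handling in qmp_set_mem_policy (save the old mask, rebuild it from the list, restore it on failure) can be sketched on its own as follows. This is an illustration under stated assumptions, not the patch's code: `struct node_policy`, `set_host_nodes` and the value 64 for `MAX_NODES` are made up for the example, and unlike the patch an out-of-range node ID is reported as an error rather than silently skipped.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_NODES 64            /* assumed value for this sketch */

struct node_policy {
    uint64_t host_mem;          /* one bit per host node */
};

/*
 * Replace np->host_mem with the nodes listed in nodes[0..count-1].
 * An empty list selects host node 0, matching note 2 of the command
 * documentation. Valid bit indexes are 0..MAX_NODES-1, hence the >=
 * comparison. On an out-of-range ID the previous mask is restored.
 * Returns 0 on success, -1 on error.
 */
int set_host_nodes(struct node_policy *np, const uint16_t *nodes, size_t count)
{
    assert(np != NULL);

    uint64_t saved = np->host_mem;

    np->host_mem = 0;
    if (count == 0) {
        np->host_mem = UINT64_C(1);
        return 0;
    }

    for (size_t i = 0; i < count; i++) {
        if (nodes[i] >= MAX_NODES) {
            np->host_mem = saved;   /* roll back to the old mask */
            return -1;
        }
        np->host_mem |= UINT64_C(1) << nodes[i];
    }
    return 0;
}
```

Rejecting a bad ID outright keeps the caller from believing a binding was applied to a node that was dropped from the mask; whether the command should behave that way is a design choice for the patch author.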