
[for-2.10,07/23] pc: add node-id property to CPU

Message ID 1490189568-167621-8-git-send-email-imammedo@redhat.com
State New

Commit Message

Igor Mammedov March 22, 2017, 1:32 p.m. UTC
It will allow switching from cpu_index to property-based
NUMA mapping in follow-up patches.

Signed-off-by: Igor Mammedov <imammedo@redhat.com>
---
 hw/i386/pc.c      | 17 +++++++++++++++++
 target/i386/cpu.c |  1 +
 2 files changed, 18 insertions(+)

Comments

Eduardo Habkost April 12, 2017, 9:02 p.m. UTC | #1
On Wed, Mar 22, 2017 at 02:32:32PM +0100, Igor Mammedov wrote:
> it will allow switching from cpu_index to property based
> numa mapping in follow up patches.

I am not sure I understand all the consequences of this, so I
will give it a try:

"node-id" is an existing field in CpuInstanceProperties.
CpuInstanceProperties is used on both query-hotpluggable-cpus
output and in MachineState::possible_cpus.

We will start using MachineState::possible_cpus to keep track of
NUMA CPU affinity, and that means query-hotpluggable-cpus will
start reporting a "node-id" property when a NUMA mapping is
configured.

To allow query-hotpluggable-cpus to report "node-id", the CPU
objects must have a "node-id" property that can be set. This
patch adds the "node-id" property to X86CPU.

Is this description accurate? Is the presence of "node-id" in
query-hotpluggable-cpus the only reason we really need this
patch, or is there something else that requires the "node-id"
property?

Why exactly do we need to change the output of
query-hotpluggable-cpus for all machines to include "node-id", to
make "-numa cpu" work?  Did you consider saving node_id inside
CPUArchId and outside CpuInstanceProperties, so
query-hotplugabble-cpus output won't be affected by "-numa cpu"?

I'm asking this because I believe we will eventually need a
mechanism that lets management check what are the valid arguments
for "-numa cpu" for a given machine, and it looks like
query-hotpluggable-cpus is already the right mechanism for that.
But we can't make query-hotpluggable-cpus output depend on "-numa
cpu" input, if the "-numa cpu" input will also depend on
query-hotpluggable-cpus output.
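
For reference, the query-hotpluggable-cpus reply under discussion would
look roughly like this once a NUMA mapping is configured (values and
field set are illustrative):

```json
-> { "execute": "query-hotpluggable-cpus" }
<- { "return": [
     { "type": "qemu64-x86_64-cpu", "vcpus-count": 1,
       "props": { "socket-id": 1, "core-id": 0, "thread-id": 0,
                  "node-id": 1 } },
     { "type": "qemu64-x86_64-cpu", "vcpus-count": 1,
       "qom-path": "/machine/unattached/device[0]",
       "props": { "socket-id": 0, "core-id": 0, "thread-id": 0,
                  "node-id": 0 } }
   ] }
```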

> 
> Signed-off-by: Igor Mammedov <imammedo@redhat.com>
> ---
>  hw/i386/pc.c      | 17 +++++++++++++++++
>  target/i386/cpu.c |  1 +
>  2 files changed, 18 insertions(+)
> 
> diff --git a/hw/i386/pc.c b/hw/i386/pc.c
> index 7031100..873bbfa 100644
> --- a/hw/i386/pc.c
> +++ b/hw/i386/pc.c
> @@ -1895,6 +1895,7 @@ static void pc_cpu_pre_plug(HotplugHandler *hotplug_dev,
>                              DeviceState *dev, Error **errp)
>  {
>      int idx;
> +    int node_id;
>      CPUState *cs;
>      CPUArchId *cpu_slot;
>      X86CPUTopoInfo topo;
> @@ -1984,6 +1985,22 @@ static void pc_cpu_pre_plug(HotplugHandler *hotplug_dev,
>  
>      cs = CPU(cpu);
>      cs->cpu_index = idx;
> +
> +    node_id = numa_get_node_for_cpu(cs->cpu_index);
> +    if (node_id == nb_numa_nodes) {
> +        /* by default CPUState::numa_node was 0 if it's not set via CLI
> +         * keep it this way for now but in future we probably should
> +         * refuse to start up with incomplete numa mapping */
> +        node_id = 0;
> +    }
> +    if (cs->numa_node == CPU_UNSET_NUMA_NODE_ID) {
> +        cs->numa_node = node_id;
> +    } else if (cs->numa_node != node_id) {
> +            error_setg(errp, "node-id %d must match numa node specified"
> +                "with -numa option for cpu-index %d",
> +                cs->numa_node, cs->cpu_index);
> +            return;
> +    }
>  }
>  
>  static void pc_machine_device_pre_plug_cb(HotplugHandler *hotplug_dev,
> diff --git a/target/i386/cpu.c b/target/i386/cpu.c
> index 7aa7622..d690244 100644
> --- a/target/i386/cpu.c
> +++ b/target/i386/cpu.c
> @@ -3974,6 +3974,7 @@ static Property x86_cpu_properties[] = {
>      DEFINE_PROP_INT32("core-id", X86CPU, core_id, -1),
>      DEFINE_PROP_INT32("socket-id", X86CPU, socket_id, -1),
>  #endif
> +    DEFINE_PROP_INT32("node-id", CPUState, numa_node, CPU_UNSET_NUMA_NODE_ID),
>      DEFINE_PROP_BOOL("pmu", X86CPU, enable_pmu, false),
>      { .name  = "hv-spinlocks", .info  = &qdev_prop_spinlocks },
>      DEFINE_PROP_BOOL("hv-relaxed", X86CPU, hyperv_relaxed_timing, false),
> -- 
> 2.7.4
> 
>
Igor Mammedov April 19, 2017, 11:14 a.m. UTC | #2
On Wed, 12 Apr 2017 18:02:39 -0300
Eduardo Habkost <ehabkost@redhat.com> wrote:

> On Wed, Mar 22, 2017 at 02:32:32PM +0100, Igor Mammedov wrote:
> > it will allow switching from cpu_index to property based
> > numa mapping in follow up patches.  
> 
> I am not sure I understand all the consequences of this, so I
> will give it a try:
> 
> "node-id" is an existing field in CpuInstanceProperties.
> CpuInstanceProperties is used on both query-hotpluggable-cpus
> output and in MachineState::possible_cpus.
> 
> We will start using MachineState::possible_cpus to keep track of
> NUMA CPU affinity, and that means query-hotpluggable-cpus will
> start reporting a "node-id" property when a NUMA mapping is
> configured.
> 
> To allow query-hotpluggable-cpus to report "node-id", the CPU
> objects must have a "node-id" property that can be set. This
> patch adds the "node-id" property to X86CPU.
> 
> Is this description accurate? Is the presence of "node-id" in
> query-hotpluggable-cpus the only reason we really need this
> patch, or is there something else that requires the "node-id"
> property?
That's an accurate description; node-id is in the same 'address'
properties category as socket/core/thread-id. So if you have a
NUMA-enabled machine, you'd see the node-id property in
query-hotpluggable-cpus.


> Why exactly do we need to change the output of
> query-hotpluggable-cpus for all machines to include "node-id", to
> make "-numa cpu" work?
It's for introspection as well as for consolidating topology data
in a single place; it complements the already-output
socket/core/thread-id address properties with the NUMA node-id.
That way one doesn't need yet another command for introspecting
the NUMA mapping of CPUs and can use the existing
query-hotpluggable-cpus for a full topology description.

>  Did you consider saving node_id inside
> CPUArchId and outside CpuInstanceProperties, so
> query-hotplugabble-cpus output won't be affected by "-numa cpu"?
nope, the intent was to make node-id visible if NUMA is enabled, and
I think that intent was there from the very beginning, when
query-hotpluggable-cpus was introduced with CpuInstanceProperties
having a node_id field that went unused because it was out of scope
of CPU hotplug.


> I'm asking this because I believe we will eventually need a
> mechanism that lets management check what are the valid arguments
> for "-numa cpu" for a given machine, and it looks like
> query-hotpluggable-cpus is already the right mechanism for that.
it's a problem similar to the one with -device cpu_foo,...

> But we can't make query-hotpluggable-cpus output depend on "-numa
> cpu" input, if the "-numa cpu" input will also depend on
> query-hotpluggable-cpus output.
I don't think that query-hotpluggable-cpus must be independent of
the '-numa' option.

query-hotpluggable-cpus is a function of -smp and the machine type;
its output is dynamic and can change during runtime, so we've never
made a promise to keep it static. I think it's OK to make it depend
on -numa as an extra input argument when present.

It bothers me as well that the '-numa cpu' and '-device cpu_foo'
options depend on query-hotpluggable-cpus. When we considered
generic '-device cpu' support, we thought that initially
query-hotpluggable-cpus could be used to get the list of CPUs
for a given -smp/machine combination, and then it could be used
for composing a proper CLI. That makes mgmt start QEMU twice
when creating a configuration for the first time, but the resulting
CLI can be reused without repeating the query step, provided the
topology/machine stays the same. The same applies to '-numa cpu'.

In the future, to avoid starting QEMU twice, we were thinking about
configuring QEMU from QMP at runtime; that's where a preconfigure
approach could help:

  1. introduce pause before machine_init CLI option to allow
     preconfig machine from qmp/monitor
  2. make query-hotpluggable-cpus usable at preconfig time
  3. start qemu with the needed number of numa nodes and a default mapping:
         #qemu -smp ... -numa node,nodeid=0 -numa node,nodeid=1
  4. get possible cpus list
  5. add qmp/monitor command variant for '-numa cpu' to set numa mapping
  6. optionally, set new numa mapping and get updated
     possible cpus list with query-hotpluggable-cpus
  7. optionally, add extra cpus with device_add using updated
     cpus list and get updated cpus list as it's been changed again.
  8. unpause preconfig stage and let qemu continue to execute
     machine_init and the rest.

Since we would need to implement QMP configuration for '-device cpu',
we might as well reuse it for custom numa mapping.
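
On the wire, steps 3 to 6 of that flow might look roughly like this
(the set-numa-node command name and arguments are hypothetical; no
such QMP command existed at this point):

```json
-> { "execute": "query-hotpluggable-cpus" }
<- { "return": [ { "type": "qemu64-x86_64-cpu", "vcpus-count": 1,
                   "props": { "socket-id": 0, "core-id": 0,
                              "thread-id": 0, "node-id": 0 } } ] }
-> { "execute": "set-numa-node",
     "arguments": { "type": "cpu", "node-id": 1,
                    "socket-id": 0, "core-id": 0, "thread-id": 0 } }
<- { "return": {} }
-> { "execute": "query-hotpluggable-cpus" }
<- { "return": [ { "type": "qemu64-x86_64-cpu", "vcpus-count": 1,
                   "props": { "socket-id": 0, "core-id": 0,
                              "thread-id": 0, "node-id": 1 } } ] }
```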
 
> > Signed-off-by: Igor Mammedov <imammedo@redhat.com>
> > ---
> >  hw/i386/pc.c      | 17 +++++++++++++++++
> >  target/i386/cpu.c |  1 +
> >  2 files changed, 18 insertions(+)
> > 
> > diff --git a/hw/i386/pc.c b/hw/i386/pc.c
> > index 7031100..873bbfa 100644
> > --- a/hw/i386/pc.c
> > +++ b/hw/i386/pc.c
> > @@ -1895,6 +1895,7 @@ static void pc_cpu_pre_plug(HotplugHandler *hotplug_dev,
> >                              DeviceState *dev, Error **errp)
> >  {
> >      int idx;
> > +    int node_id;
> >      CPUState *cs;
> >      CPUArchId *cpu_slot;
> >      X86CPUTopoInfo topo;
> > @@ -1984,6 +1985,22 @@ static void pc_cpu_pre_plug(HotplugHandler *hotplug_dev,
> >  
> >      cs = CPU(cpu);
> >      cs->cpu_index = idx;
> > +
> > +    node_id = numa_get_node_for_cpu(cs->cpu_index);
> > +    if (node_id == nb_numa_nodes) {
> > +        /* by default CPUState::numa_node was 0 if it's not set via CLI
> > +         * keep it this way for now but in future we probably should
> > +         * refuse to start up with incomplete numa mapping */
> > +        node_id = 0;
> > +    }
> > +    if (cs->numa_node == CPU_UNSET_NUMA_NODE_ID) {
> > +        cs->numa_node = node_id;
> > +    } else if (cs->numa_node != node_id) {
> > +            error_setg(errp, "node-id %d must match numa node specified"
> > +                "with -numa option for cpu-index %d",
> > +                cs->numa_node, cs->cpu_index);
> > +            return;
> > +    }
> >  }
> >  
> >  static void pc_machine_device_pre_plug_cb(HotplugHandler *hotplug_dev,
> > diff --git a/target/i386/cpu.c b/target/i386/cpu.c
> > index 7aa7622..d690244 100644
> > --- a/target/i386/cpu.c
> > +++ b/target/i386/cpu.c
> > @@ -3974,6 +3974,7 @@ static Property x86_cpu_properties[] = {
> >      DEFINE_PROP_INT32("core-id", X86CPU, core_id, -1),
> >      DEFINE_PROP_INT32("socket-id", X86CPU, socket_id, -1),
> >  #endif
> > +    DEFINE_PROP_INT32("node-id", CPUState, numa_node, CPU_UNSET_NUMA_NODE_ID),
> >      DEFINE_PROP_BOOL("pmu", X86CPU, enable_pmu, false),
> >      { .name  = "hv-spinlocks", .info  = &qdev_prop_spinlocks },
> >      DEFINE_PROP_BOOL("hv-relaxed", X86CPU, hyperv_relaxed_timing, false),
> > -- 
> > 2.7.4
> > 
> >   
>
Eduardo Habkost April 26, 2017, 12:21 p.m. UTC | #3
(Sorry for taking so long to continue the discussion. I thought I
was convinced setting node-id on HotpluggableCPU.props was
required, but after thinking more about it, it looks like it's
still not required.)

On Wed, Apr 19, 2017 at 01:14:58PM +0200, Igor Mammedov wrote:
> On Wed, 12 Apr 2017 18:02:39 -0300
> Eduardo Habkost <ehabkost@redhat.com> wrote:
> 
> > On Wed, Mar 22, 2017 at 02:32:32PM +0100, Igor Mammedov wrote:
> > > it will allow switching from cpu_index to property based
> > > numa mapping in follow up patches.  
> > 
> > I am not sure I understand all the consequences of this, so I
> > will give it a try:
> > 
> > "node-id" is an existing field in CpuInstanceProperties.
> > CpuInstanceProperties is used on both query-hotpluggable-cpus
> > output and in MachineState::possible_cpus.
> > 
> > We will start using MachineState::possible_cpus to keep track of
> > NUMA CPU affinity, and that means query-hotpluggable-cpus will
> > start reporting a "node-id" property when a NUMA mapping is
> > configured.
> > 
> > To allow query-hotpluggable-cpus to report "node-id", the CPU
> > objects must have a "node-id" property that can be set. This
> > patch adds the "node-id" property to X86CPU.
> > 
> > Is this description accurate? Is the presence of "node-id" in
> > query-hotpluggable-cpus the only reason we really need this
> > patch, or is there something else that requires the "node-id"
> > property?
> That accurate description, node-id is in the same 'address'
> properties category as socket/core/thread-id. So if you have
> numa enabled machine you'd see node-id property in
> query-hotpluggable-cpus.

I agree that we can make -numa cpu affect query-hotpluggable-cpus
output (i.e. affect some field on HotpluggableCPU.props).

But it looks like we disagree about the purpose of
HotpluggableCPU.props:

I believe HotpluggableCPU.props is just an opaque identifier for
the location we want to plug the CPU, and the only requirement is
that it should be unique and have all the information device_add
needs. As socket IDs are already unique on our existing machines,
and socket<=>node mapping is already configured using -numa cpu,
node-id doesn't need to be in HotpluggableCPU.props. (see example
below)

I don't think clients should assume topology information in
HotpluggableCPU.props is always present, because the field has a
different purpose: letting clients know what are the required
device_add arguments. If we need introspection of CPU topology,
we can add new fields to HotpluggableCPU, outside 'props'.

> 
> 
> > Why exactly do we need to change the output of
> > query-hotpluggable-cpus for all machines to include "node-id", to
> > make "-numa cpu" work?
> It's for introspection as well as for consolidating topology data
> in a single place purposes and complements already outputed
> socket/core/thread-id address properties with numa node-id.
> That way one doesn't need yet another command for introspecting
> numa mapping for cpus and use existing query-hotpluggable-cpus
> for full topology description.

I don't disagree about including node-id in HotpluggableCPU
struct, I am just unsure about including it in
HotpluggableCPU.props.

My question is: if we use:

 -numa cpu,socket=2,core=1,thread=0,node-id=3

and then:

 device_add ...,socket=2,core=1,thread=0
 (omitting node-id on device_add)

won't it work exactly the same, and place the new CPU on NUMA
node 3?

In this case, we don't need a node-id argument on device_add,
so node-id doesn't need to be in HotpluggableCPU.props.

> 
> >  Did you consider saving node_id inside
> > CPUArchId and outside CpuInstanceProperties, so
> > query-hotplugabble-cpus output won't be affected by "-numa cpu"?
> nope, intent was to make node-id visible if numa is enabled and
> I think that intent was there from the very begging when
> query-hotplugabble-cpus was introduced with CpuInstanceProperties
> having node_id field but unused since it has been out of scope
> of cpu hotplug.
> 
> 
> > I'm asking this because I believe we will eventually need a
> > mechanism that lets management check what are the valid arguments
> > for "-numa cpu" for a given machine, and it looks like
> > query-hotpluggable-cpus is already the right mechanism for that.
> it's problem similar with -device cpu_foo,...

True.

> 
> > But we can't make query-hotpluggable-cpus output depend on "-numa
> > cpu" input, if the "-numa cpu" input will also depend on
> > query-hotpluggable-cpus output.
> I don't think that query-hotpluggable-cpus must be independent of
> '-numa' option.
> 
> query-hotpluggable-cpus is a function of -smp and machine type and
> it's output is dynamic and can change during runtime so we've never
> made promise to make it static. I think it's ok to make it depend
> on -numa as extra input argument when present.

OK, I agree that we can make -numa cpu affect
query-hotpluggable-cpus output.

I also think it might be OK to make -numa affect
HotpluggableCPU.props, as clients should be prepared for that. I
just want to understand if we really _have_ to make it so.
Because not including it would help us avoid surprises, and even
simplify the code (making this series shorter).

> 
> It bothers me as well, that '-numa cpu' as well as '-device cpu_foo'
> options depend on query-hotpluggable-cpus and when we considered
> generic '-device cpu' support, we though that initially
> query-hotpluggable-cpus could be used to get list of CPUs
> for given -smp/machine combination and then it could be used
> for composing proper CLI. That makes mgmt to start QEMU twice
> when creating configuration for the 1st time, but end result CLI
> could be reused without repeating query step again provided
> topology/machine stays the same. The same applies to '-numa cpu'.
> 
> In future to avoid starting QEMU twice we were thinking about
> configuring QEMU from QMP at runtime, that's where preconfigure
> approach could be used to help solving it in the future:
> 
>   1. introduce pause before machine_init CLI option to allow
>      preconfig machine from qmp/monitor
>   2. make query-hotpluggable-cpus usable at preconfig time
>   3. start qemu with needed number of numa nodes and default mapping:
>          #qemu -smp ... -numa node,nodeid=0 -node node,nodeid=1
>   4. get possible cpus list

This is where things can get tricky: if we have the default
mapping set, step 4 would return "node-id" already set on all
possible CPUs.

>   5. add qmp/monitor command variant for '-numa cpu' to set numa mapping

This is where I think we would make things simpler: if node-id
isn't present on 'props', we can simply document the arguments
that identify the CPU for the numa-cpu command as "just use the
properties you get on query-hotpluggable-cpus.props". Clients
would be able to treat CpuInstanceProperties as an opaque CPU
slot identifier.

i.e. I think this would be a better way to define and document
the interface:

##
# @NumaCpuOptions:
#
# Mapping of a given CPU (or a set of CPUs) to a NUMA node.
#
# @cpu: Properties identifying the CPU(s). Use the 'props' field of
#       query-hotpluggable-cpus for possible values for this
#       field.
#       TODO: describe what happens when 'cpu' matches
#       multiple slots.
# @node-id: NUMA node where the CPUs are going to be located.
##
{ 'struct': 'NumaCpuOptions',
  'data': {
   'cpu': 'CpuInstanceProperties',
   'node-id': 'int' } }

This separates "what identifies the CPU slot(s) we are
configuring" from "what identifies the node ID we are binding
to".

In case we have trouble making this struct work with QemuOpts, we
could do this (temporarily?):

##
# @NumaCpuOptions:
#
# Mapping of a given CPU (or a set of CPUs) to a NUMA node.
#
# @cpu: Properties identifying the CPU(s). Use the 'props' field of
#       query-hotpluggable-cpus for possible values for this
#       field.
#       TODO: describe what happens when 'cpu' matches
#       multiple slots.
# @node-id: NUMA node where the CPUs are going to be located.
#
# @socket-id: Shortcut for cpu.socket-id, to make this struct
#             friendly to QemuOpts.
# @core-id: Shortcut for cpu.core-id, to make this struct
#           friendly to QemuOpts.
# @thread-id: Shortcut for cpu.thread-id, to make this struct
#             friendly to QemuOpts.
##
{ 'struct': 'NumaCpuOptions',
  'data': {
   '*cpu': 'CpuInstanceProperties',
   '*socket-id': 'int',
   '*core-id': 'int',
   '*thread-id': 'int',
   'node-id': 'int' } }
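
With these flattened shortcuts, the option could be spelled flat for
QemuOpts instead of requiring nested keys like cpu.socket-id=2
(syntax illustrative):

```
-numa cpu,socket-id=2,core-id=1,thread-id=0,node-id=3
```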

>   6. optionally, set new numa mapping and get updated
>      possible cpus list with query-hotpluggable-cpus
>   7. optionally, add extra cpus with device_add using updated
>      cpus list and get updated cpus list as it's been changed again.
>   8. unpause preconfig stage and let qemu continue to execute
>      machine_init and the rest.
> 
> Since we would need to implement QMP configuration for '-device cpu',
> we as well might reuse it for custom numa mapping.
> 
>  [...]
Igor Mammedov April 27, 2017, 1:14 p.m. UTC | #4
On Wed, 26 Apr 2017 09:21:38 -0300
Eduardo Habkost <ehabkost@redhat.com> wrote:

adding Peter to CC list

[...]

> On Wed, Apr 19, 2017 at 01:14:58PM +0200, Igor Mammedov wrote:
> > On Wed, 12 Apr 2017 18:02:39 -0300
> > Eduardo Habkost <ehabkost@redhat.com> wrote:
> >   
> > > On Wed, Mar 22, 2017 at 02:32:32PM +0100, Igor Mammedov wrote:  
> > > > it will allow switching from cpu_index to property based
> > > > numa mapping in follow up patches.    
> > > 
> > > I am not sure I understand all the consequences of this, so I
> > > will give it a try:
> > > 
> > > "node-id" is an existing field in CpuInstanceProperties.
> > > CpuInstanceProperties is used on both query-hotpluggable-cpus
> > > output and in MachineState::possible_cpus.
> > > 
> > > We will start using MachineState::possible_cpus to keep track of
> > > NUMA CPU affinity, and that means query-hotpluggable-cpus will
> > > start reporting a "node-id" property when a NUMA mapping is
> > > configured.
> > > 
> > > To allow query-hotpluggable-cpus to report "node-id", the CPU
> > > objects must have a "node-id" property that can be set. This
> > > patch adds the "node-id" property to X86CPU.
> > > 
> > > Is this description accurate? Is the presence of "node-id" in
> > > query-hotpluggable-cpus the only reason we really need this
> > > patch, or is there something else that requires the "node-id"
> > > property?  
> > That accurate description, node-id is in the same 'address'
> > properties category as socket/core/thread-id. So if you have
> > numa enabled machine you'd see node-id property in
> > query-hotpluggable-cpus.  
> 
> I agree that we can make -numa cpu affect query-hotpluggable-cpus
> output (i.e. affect some field on HotpluggableCPU.props).
> 
> But it looks like we disagree about the purpose of
> HotpluggableCPU.props:
> 
> I believe HotpluggableCPU.props is just an opaque identifier for
> the location we want to plug the CPU, and the only requirement is
> that it should be unique and have all the information device_add
> needs. As socket IDs are already unique on our existing machines,
> and socket<=>node mapping is already configured using -numa cpu,
> node-id doesn't need to be in HotpluggableCPU.props. (see example
> below)
node-id is also a location property which logically complements
the socket/core/thread properties.  Also, a socket is not necessarily
a unique id that maps 1:1 to a node-id from a generic point of view.
BTW, -numa cpu[s] is not the only way to specify the mapping;
it could be specified like we do with pc-dimm:
   device_add pc-dimm,node=x

Looking at it more generically, there could be the same
socket-ids on different nodes; then we would have to add
node-id to props anyway and would end up with 2 node-ids, one in
props and another in the parent struct.


> I don't think clients should assume topology information in
> HotpluggableCPU.props is always present, because the field has a
> different purpose: letting clients know what are the required
> device_add arguments. If we need introspection of CPU topology,
> we can add new fields to HotpluggableCPU, outside 'props'.
looking at the existing client (libvirt), it doesn't treat 'props'
as an opaque set, but parses it into topology information (my guess
is because it's the sole source of such info from QEMU).
Actually, we never forbade this; the only requirement for
props was that mgmt should provide those properties to create
a cpu. Property names were designed in topology/location-friendly
terms so that clients could make sense of them.

So I wouldn't try now to reduce the meaning of 'props' to
an opaque set as you suggest.

[..]
> My question is: if we use:
> 
>  -numa cpu,socket=2,core=1,thread=0,node-id=3
> 
> and then:
> 
>  device_add ...,socket=2,core=1,thread=0
>  (omitting node-id on device_add)
> 
> won't it work exactly the same, and place the new CPU on NUMA
> node 3?
yep, it's allowed for compat reasons:
  1: to allow the DST to start with an old CLI variant that didn't
     have node-id (migration)
  2: to let old libvirt hotplug CPUs; it doesn't treat 'props'
     as an opaque set that is just replayed to device_add,
     instead it composes the command from topo info it got
     from QEMU, and unfortunately node-id is read but not
     emitted when device_add is composed
     (I consider this a bug, but it's out in the wild so we have
     to deal with it)

we can't enforce presence in these cases, or at least we have to
keep it relaxed for old machine types.
 
> In this case, if we don't need a node-id argument on device_add,
> so node-id doesn't need to be in HotpluggableCPU.props.
I'd say we currently don't have to (for the above reasons), but
it doesn't hurt, and it actually allows using the pc-dimm way of
mapping CPUs to nodes, as David noted, i.e.:
  -device cpu-foo,node-id=x,...
without any of the -numa cpu[s] options on the CLI.
It's currently explicitly disabled, but it should work if one
doesn't care about hotplug, or if the target doesn't care about
the mapping at startup (sPAPR); it also might work for x86
using the _PXM method in ACPI.
(But that's out of scope of this series and needs more
testing, as some guest OSes might expect a populated SRAT
to work correctly.)

[...]
> > 
> > In future to avoid starting QEMU twice we were thinking about
> > configuring QEMU from QMP at runtime, that's where preconfigure
> > approach could be used to help solving it in the future:
> > 
> >   1. introduce pause before machine_init CLI option to allow
> >      preconfig machine from qmp/monitor
> >   2. make query-hotpluggable-cpus usable at preconfig time
> >   3. start qemu with needed number of numa nodes and default mapping:
> >          #qemu -smp ... -numa node,nodeid=0 -node node,nodeid=1
> >   4. get possible cpus list  
> 
> This is where things can get tricky: if we have the default
> mapping set, step 4 would return "node-id" already set on all
> possible CPUs.
that would depend on the impl.:
 - display node-id with default preset values to override
 - do not set defaults and force the user to do the mapping

> >   5. add qmp/monitor command variant for '-numa cpu' to set numa mapping  
> 
> This is where I think we would make things simpler: if node-id
> isn't present on 'props', we can simply document the arguments
> that identify the CPU for the numa-cpu command as "just use the
> properties you get on query-hotpluggable-cpus.props". Clients
> would be able to treat CpuInstanceProperties as an opaque CPU
> slot identifier.
> 
> i.e. I think this would be a better way to define and document
> the interface:
> 
> ##
> # @NumaCpuOptions:
> #
> # Mapping of a given CPU (or a set of CPUs) to a NUMA node.
> #
> # @cpu: Properties identifying the CPU(s). Use the 'props' field of
> #       query-hotpluggable-cpus for possible values for this
> #       field.
> #       TODO: describe what happens when 'cpu' matches
> #       multiple slots.
> # @node-id: NUMA node where the CPUs are going to be located.
> ##
> { 'struct': 'NumaCpuOptions',
>   'data': {
>    'cpu': 'CpuInstanceProperties',
>    'node-id': 'int' } }
> 
> This separates "what identifies the CPU slot(s) we are
> configuring" from "what identifies the node ID we are binding
> to".
Doesn't look any simpler to me; I'd rather document node-id usage
in props, like:

 #
 # A discriminated record of NUMA options. (for OptsVisitor)
 #
+# For 'cpu' type as arguments use a set of cpu properties returned
+# by query-hotpluggable-cpus[].props, where node-id could be used
+# to override default node mapping. Since: 2.10
+#
 # Since: 2.1
 ##
 { 'union': 'NumaOptions',
   'base': { 'type': 'NumaOptionsType' },
   'discriminator': 'type',
   'data': {
-    'node': 'NumaNodeOptions' }}
+    'node': 'NumaNodeOptions',
+    'cpu' : 'CpuInstanceProperties' }}
 
 ##
 # @NumaNodeOptions:
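
With the union extended this way, '-numa cpu' would take the
CpuInstanceProperties fields directly on the command line, e.g.
(values illustrative):

```
-numa cpu,node-id=1,socket-id=0,core-id=1,thread-id=0
```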


> In case we have trouble making this struct work with QemuOpts, we
> could do this (temporarily?):
> 
> ##
> # @NumaCpuOptions:
> #
> # Mapping of a given CPU (or a set of CPUs) to a NUMA node.
> #
> # @cpu: Properties identifying the CPU(s). Use the 'props' field of
> #       query-hotpluggable-cpus for possible values for this
> #       field.
> #       TODO: describe what happens when 'cpu' matches
> #       multiple slots.
> # @node-id: NUMA node where the CPUs are going to be located.
> #
> # @socket-id: Shortcut for cpu.socket-id, to make this struct
> #             friendly to QemuOpts.
> # @core-id: Shortcut for cpu.core-id, to make this struct
> #           friendly to QemuOpts.
> # @thread-id: Shortcut for cpu.thread-id, to make this struct
> #             friendly to QemuOpts.
> ##
> { 'struct': 'NumaCpuOptions',
>   'data': {
>    '*cpu': 'CpuInstanceProperties',
>    '*socket-id': 'int',
>    '*core-id': 'int',
>    '*thread-id': 'int',
>    'node-id': 'int' } }
> 
> >   6. optionally, set new numa mapping and get updated
> >      possible cpus list with query-hotpluggable-cpus
> >   7. optionally, add extra cpus with device_add using updated
> >      cpus list and get updated cpus list as it's been changed again.
> >   8. unpause preconfig stage and let qemu continue to execute
> >      machine_init and the rest.
> > 
> > Since we would need to implement QMP configuration for '-device cpu',
> > we as well might reuse it for custom numa mapping.
> > 
> >  [...]  
>
Eduardo Habkost April 27, 2017, 4:32 p.m. UTC | #5
On Thu, Apr 27, 2017 at 03:14:06PM +0200, Igor Mammedov wrote:
> On Wed, 26 Apr 2017 09:21:38 -0300
> Eduardo Habkost <ehabkost@redhat.com> wrote:
> 
> adding Peter to CC list
> 
> [...]
> 
> > On Wed, Apr 19, 2017 at 01:14:58PM +0200, Igor Mammedov wrote:
> > > On Wed, 12 Apr 2017 18:02:39 -0300
> > > Eduardo Habkost <ehabkost@redhat.com> wrote:
> > >   
> > > > On Wed, Mar 22, 2017 at 02:32:32PM +0100, Igor Mammedov wrote:  
> > > > > it will allow switching from cpu_index to property based
> > > > > numa mapping in follow up patches.    
> > > > 
> > > > I am not sure I understand all the consequences of this, so I
> > > > will give it a try:
> > > > 
> > > > "node-id" is an existing field in CpuInstanceProperties.
> > > > CpuInstanceProperties is used on both query-hotpluggable-cpus
> > > > output and in MachineState::possible_cpus.
> > > > 
> > > > We will start using MachineState::possible_cpus to keep track of
> > > > NUMA CPU affinity, and that means query-hotpluggable-cpus will
> > > > start reporting a "node-id" property when a NUMA mapping is
> > > > configured.
> > > > 
> > > > To allow query-hotpluggable-cpus to report "node-id", the CPU
> > > > objects must have a "node-id" property that can be set. This
> > > > patch adds the "node-id" property to X86CPU.
> > > > 
> > > > Is this description accurate? Is the presence of "node-id" in
> > > > query-hotpluggable-cpus the only reason we really need this
> > > > patch, or is there something else that requires the "node-id"
> > > > property?  
> > > That accurate description, node-id is in the same 'address'
> > > properties category as socket/core/thread-id. So if you have
> > > numa enabled machine you'd see node-id property in
> > > query-hotpluggable-cpus.  
> > 
> > I agree that we can make -numa cpu affect query-hotpluggable-cpus
> > output (i.e. affect some field on HotpluggableCPU.props).
> > 
> > But it looks like we disagree about the purpose of
> > HotpluggableCPU.props:
> > 
> > I believe HotpluggableCPU.props is just an opaque identifier for
> > the location we want to plug the CPU, and the only requirement is
> > that it should be unique and have all the information device_add
> > needs. As socket IDs are already unique on our existing machines,
> > and socket<=>node mapping is already configured using -numa cpu,
> > node-id doesn't need to be in HotpluggableCPU.props. (see example
> > below)
> node-id is also a location property which logically complements
> the socket/core/thread properties.  Also, socket is not necessarily
> a unique id that maps 1:1 to node-id from a generic pov.
> BTW -numa cpu[s] is not the only way to specify the mapping,
> it could be specified like we do with pc-dimm:
>    device_add pc-dimm,node=x
> 
> Looking at it more generically, there could be the same
> socket-ids for different nodes; then we would have to add
> node-id to props anyway and end up with two node-ids, one in props
> and another in the parent struct.

This is where my expectations are different: I think
HotpluggableCPU.props is just an identifier property for CPU
slots that is used for device_add (and will be used for -numa
cpu), and isn't supposed to be interpreted by clients.

The problem I see is that the property has two completely
different purposes: identifying a given CPU slot for device_add
(and -numa cpu), and introspection of topology information about
the CPU slot. Today we are lucky and those goals don't conflict
with each other, but I worry this might cause trouble in the
future.

> 
> 
> > I don't think clients should assume topology information in
> > HotpluggableCPU.props is always present, because the field has a
> > different purpose: letting clients know what are the required
> > device_add arguments. If we need introspection of CPU topology,
> > we can add new fields to HotpluggableCPU, outside 'props'.
> looking at existing clients (libvirt), it doesn't treat 'props'
> as an opaque set, but parses it into topology information (my guess
> is that's because it's the sole source of such info from QEMU).
> Actually we never forbade this; the only requirement for
> props was that mgmt should provide those properties to create
> a cpu. Property names were designed in topology/location
> friendly terms so that clients could make sense of them.
> 
> So I wouldn't now try to reduce the meaning of 'props' to an
> opaque set as you suggest.

I see. This means my expectation is not met even today. I am not
thrilled about it, but that's OK.

> 
> [..]
> > My question is: if we use:
> > 
> >  -numa cpu,socket=2,core=1,thread=0,node-id=3
> > 
> > and then:
> > 
> >  device_add ...,socket=2,core=1,thread=0
> >  (omitting node-id on device_add)
> > 
> > won't it work exactly the same, and place the new CPU on NUMA
> > node 3?
> yep, it's allowed for compat reasons:
>   1: to allow the destination to start with an old CLI variant
>      that didn't have node-id (migration)
>   2: to let old libvirt hotplug CPUs; it doesn't treat 'props'
>      as an opaque set that is just replayed to device_add,
>      instead it composes the command from the topology info it got
>      from QEMU, and unfortunately node-id is only read but is
>      not emitted when device_add is composed
>      (I consider this a bug, but it's out in the wild so we have to deal with it)
> 
> we can't enforce its presence in these cases, or at least we have to
> keep it relaxed for old machine types.

I see.
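[Editorial note: the compat behaviour described above (device_add may omit node-id and still inherit the mapping set via "-numa cpu") can be sketched as a small model. This is an illustration only, not QEMU's implementation; the slot table and matching rules below are assumptions made for the sketch.]

```python
# Hypothetical model: "-numa cpu" records a node-id on the matching CPU
# slot, and a later device_add that omits node-id inherits it from that
# slot, while a conflicting node-id is rejected.

possible_cpus = [
    {"socket-id": 2, "core-id": 1, "thread-id": 0, "node-id": None},
]

def find_slot(props):
    """Match a slot by the topology ids present in props."""
    for slot in possible_cpus:
        ids = [k for k in ("socket-id", "core-id", "thread-id") if k in props]
        if all(slot[k] == props[k] for k in ids):
            return slot
    raise LookupError("no such CPU slot")

def numa_cpu(props):
    """Model of '-numa cpu,...,node-id=N': store the mapping in the slot."""
    find_slot(props)["node-id"] = props["node-id"]

def device_add(props):
    """Model of device_add: a missing node-id defaults to the slot's value."""
    slot = find_slot(props)
    props = dict(props)
    if "node-id" not in props:
        props["node-id"] = slot["node-id"]   # compat: inherit CLI mapping
    elif props["node-id"] != slot["node-id"]:
        raise ValueError("node-id must match the -numa cpu mapping")
    return props

numa_cpu({"socket-id": 2, "core-id": 1, "thread-id": 0, "node-id": 3})
cpu = device_add({"socket-id": 2, "core-id": 1, "thread-id": 0})
print(cpu["node-id"])  # 3: placed on NUMA node 3, as discussed above
```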

>  
> > In this case, if we don't need a node-id argument on device_add,
> > so node-id doesn't need to be in HotpluggableCPU.props.
> I'd say we currently don't have to (for the above reasons), but
> it doesn't hurt and actually allows using the pc-dimm way of
> mapping CPUs to nodes, as David noted, i.e.:
>   -device cpu-foo,node-id=x,...
> without any of the -numa cpu[s] options on the CLI.
> It's currently explicitly disabled but should work if one
> doesn't care about hotplug or if the target doesn't care about
> the mapping at startup (sPAPR); it also might work for x86 as
> well using the _PXM method in ACPI.
> (But that's out of scope for this series and needs more
> testing, as some guest OSes might expect a populated SRAT
> to work correctly.)

Yep. I understand that setting node-id is useful, I just didn't
expect it to be mandatory and included on HotpluggableCPU.props.

> 
> [...]
> > > 
> > > In future to avoid starting QEMU twice we were thinking about
> > > configuring QEMU from QMP at runtime, that's where preconfigure
> > > approach could be used to help solving it in the future:
> > > 
> > >   1. introduce pause before machine_init CLI option to allow
> > >      preconfig machine from qmp/monitor
> > >   2. make query-hotpluggable-cpus usable at preconfig time
> > >   3. start qemu with needed number of numa nodes and default mapping:
> > >          #qemu -smp ... -numa node,nodeid=0 -node node,nodeid=1
> > >   4. get possible cpus list  
> > 
> > This is where things can get tricky: if we have the default
> > mapping set, step 4 would return "node-id" already set on all
> > possible CPUs.
> that would depend on the implementation:
>  - display node-id with default preset values to override
>  - do not set defaults and force the user to do the mapping

Right. We could choose to initialize default values much later,
and leave it uninitialized.

> 
> > >   5. add qmp/monitor command variant for '-numa cpu' to set numa mapping  
> > 
> > This is where I think we would make things simpler: if node-id
> > isn't present on 'props', we can simply document the arguments
> > that identify the CPU for the numa-cpu command as "just use the
> > properties you get on query-hotpluggable-cpus.props". Clients
> > would be able to treat CpuInstanceProperties as an opaque CPU
> > slot identifier.
> > 
> > i.e. I think this would be a better way to define and document
> > the interface:
> > 
> > ##
> > # @NumaCpuOptions:
> > #
> > # Mapping of a given CPU (or a set of CPUs) to a NUMA node.
> > #
> > # @cpu: Properties identifying the CPU(s). Use the 'props' field of
> > #       query-hotpluggable-cpus for possible values for this
> > #       field.
> > #       TODO: describe what happens when 'cpu' matches
> > #       multiple slots.
> > # @node-id: NUMA node where the CPUs are going to be located.
> > ##
> > { 'struct': 'NumaCpuOptions',
> >   'data': {
> >    'cpu': 'CpuInstanceProperties',
> >    'node-id': 'int' } }
> > 
> > This separates "what identifies the CPU slot(s) we are
> > configuring" from "what identifies the node ID we are binding
> > to".
> Doesn't look any simpler to me; I'd rather document node-id usage
> in props, like
> 

Well, it still looks simpler and more intuitive to me, but just
because it matches my initial expectations about the semantics of
query-hotpluggable-cpus and CpuInstanceProperties. If your
alternative is very clearly documented (like below), it is not a
problem to me.

>  #
>  # A discriminated record of NUMA options. (for OptsVisitor)
>  #
> +# For 'cpu' type as arguments use a set of cpu properties returned
> +# by query-hotpluggable-cpus[].props, where node-id could be used
> +# to override default node mapping. Since: 2.10
> +#
>  # Since: 2.1
>  ##
>  { 'union': 'NumaOptions',
>    'base': { 'type': 'NumaOptionsType' },
>    'discriminator': 'type',
>    'data': {
> -    'node': 'NumaNodeOptions' }}
> +    'node': 'NumaNodeOptions',
> +    'cpu' : 'CpuInstanceProperties' }}

I worry about not being able to add extra options to "-numa cpu"
in the future without affecting HotpluggableCPU.props too. Being
able to document the semantics of -numa cpu inside a dedicated
NumaCpuOptions struct would be nice too.

I believe this can be addressed by defining "NumaCpuOptions" with
"CpuInstanceProperties" as base:

 { 'union': 'NumaOptions',
   'base': { 'type': 'NumaOptionsType' },
   'discriminator': 'type',
   'data': {
     'node': 'NumaNodeOptions',
     'cpu' : 'NumaCpuOptions' }}

##
# Options for -numa cpu,...
#
# "-numa cpu" accepts the same set of cpu properties returned by
# query-hotpluggable-cpus[].props, where node-id could be used to
# override default node mapping.
#
# Since: 2.10
##
{ 'struct': 'NumaCpuOptions',
  'base': 'CpuInstanceProperties' }

>  


>  ##
>  # @NumaNodeOptions:
> 
> 
> > In case we have trouble making this struct work with QemuOpts, we
> > could do this (temporarily?):
> > 
> > ##
> > # @NumaCpuOptions:
> > #
> > # Mapping of a given CPU (or a set of CPUs) to a NUMA node.
> > #
> > # @cpu: Properties identifying the CPU(s). Use the 'props' field of
> > #       query-hotpluggable-cpus for possible values for this
> > #       field.
> > #       TODO: describe what happens when 'cpu' matches
> > #       multiple slots.
> > # @node-id: NUMA node where the CPUs are going to be located.
> > #
> > # @socket-id: Shortcut for cpu.socket-id, to make this struct
> > #             friendly to QemuOpts.
> > # @core-id: Shortcut for cpu.core-id, to make this struct
> > #           friendly to QemuOpts.
> > # @thread-id: Shortcut for cpu.thread-id, to make this struct
> > #             friendly to QemuOpts.
> > ##
> > { 'struct': 'NumaCpuOptions',
> >   'data': {
> >    '*cpu': 'CpuInstanceProperties',
> >    '*socket-id': 'int',
> >    '*core-id': 'int',
> >    '*thread-id': 'int',
> >    'node-id': 'int' } }
> > 
> > >   6. optionally, set new numa mapping and get updated
> > >      possible cpus list with query-hotpluggable-cpus
> > >   7. optionally, add extra cpus with device_add using updated
> > >      cpus list and get updated cpus list as it's been changed again.
> > >   8. unpause preconfig stage and let qemu continue to execute
> > >      machine_init and the rest.
> > > 
> > > Since we would need to implement QMP configuration for '-device cpu',
> > > we as well might reuse it for custom numa mapping.
> > > 
> > >  [...]  
> > 
>
Igor Mammedov April 27, 2017, 5:25 p.m. UTC | #6
On Thu, 27 Apr 2017 13:32:25 -0300
Eduardo Habkost <ehabkost@redhat.com> wrote:

> On Thu, Apr 27, 2017 at 03:14:06PM +0200, Igor Mammedov wrote:
> > On Wed, 26 Apr 2017 09:21:38 -0300
> > Eduardo Habkost <ehabkost@redhat.com> wrote:
> > 
> > adding Peter to CC list
> > 
> > [...]
> > 
> > > On Wed, Apr 19, 2017 at 01:14:58PM +0200, Igor Mammedov wrote:
> > > > On Wed, 12 Apr 2017 18:02:39 -0300
> > > > Eduardo Habkost <ehabkost@redhat.com> wrote:
> > > >   
> > > > > On Wed, Mar 22, 2017 at 02:32:32PM +0100, Igor Mammedov wrote:  
> > > > > > it will allow switching from cpu_index to property based
> > > > > > numa mapping in follow up patches.    
> > > > > 
> > > > > I am not sure I understand all the consequences of this, so I
> > > > > will give it a try:
> > > > > 
> > > > > "node-id" is an existing field in CpuInstanceProperties.
> > > > > CpuInstanceProperties is used on both query-hotpluggable-cpus
> > > > > output and in MachineState::possible_cpus.
> > > > > 
> > > > > We will start using MachineState::possible_cpus to keep track of
> > > > > NUMA CPU affinity, and that means query-hotpluggable-cpus will
> > > > > start reporting a "node-id" property when a NUMA mapping is
> > > > > configured.
> > > > > 
> > > > > To allow query-hotpluggable-cpus to report "node-id", the CPU
> > > > > objects must have a "node-id" property that can be set. This
> > > > > patch adds the "node-id" property to X86CPU.
> > > > > 
> > > > > Is this description accurate? Is the presence of "node-id" in
> > > > > query-hotpluggable-cpus the only reason we really need this
> > > > > patch, or is there something else that requires the "node-id"
> > > > > property?  
> > > > That's an accurate description; node-id is in the same 'address'
> > > > properties category as socket/core/thread-id. So if you have a
> > > > numa-enabled machine you'd see the node-id property in
> > > > query-hotpluggable-cpus.
> > > 
> > > I agree that we can make -numa cpu affect query-hotpluggable-cpus
> > > output (i.e. affect some field on HotpluggableCPU.props).
> > > 
> > > But it looks like we disagree about the purpose of
> > > HotpluggableCPU.props:
> > > 
> > > I believe HotpluggableCPU.props is just an opaque identifier for
> > > the location we want to plug the CPU, and the only requirement is
> > > that it should be unique and have all the information device_add
> > > needs. As socket IDs are already unique on our existing machines,
> > > and socket<=>node mapping is already configured using -numa cpu,
> > > node-id doesn't need to be in HotpluggableCPU.props. (see example
> > > below)
> > node-id is also a location property which logically complements
> > the socket/core/thread properties.  Also, socket is not necessarily
> > a unique id that maps 1:1 to node-id from a generic pov.
> > BTW -numa cpu[s] is not the only way to specify the mapping,
> > it could be specified like we do with pc-dimm:
> >    device_add pc-dimm,node=x
> > 
> > Looking at it more generically, there could be the same
> > socket-ids for different nodes; then we would have to add
> > node-id to props anyway and end up with two node-ids, one in props
> > and another in the parent struct.
> 
> This is where my expectations are different: I think
> HotpluggableCPU.props is just an identifier property for CPU
> slots that is used for device_add (and will be used for -numa
> cpu), and isn't supposed to be interpreted by clients.
> 
> The problem I see is that the property has two completely
> different purposes: identifying a given CPU slot for device_add
> (and -numa cpu), and introspection of topology information about
> the CPU slot. Today we are lucky and those goals don't conflict
> with each other, but I worry this might cause trouble in the
> future.
> 
> > 
> > 
> > > I don't think clients should assume topology information in
> > > HotpluggableCPU.props is always present, because the field has a
> > > different purpose: letting clients know what are the required
> > > device_add arguments. If we need introspection of CPU topology,
> > > we can add new fields to HotpluggableCPU, outside 'props'.
> > looking at existing clients (libvirt), it doesn't treat 'props'
> > as an opaque set, but parses it into topology information (my guess
> > is that's because it's the sole source of such info from QEMU).
> > Actually we never forbade this; the only requirement for
> > props was that mgmt should provide those properties to create
> > a cpu. Property names were designed in topology/location
> > friendly terms so that clients could make sense of them.
> > 
> > So I wouldn't now try to reduce the meaning of 'props' to an
> > opaque set as you suggest.
> 
> I see. This means my expectation is not met even today. I am not
> thrilled about it, but that's OK.
> 
> > 
> > [..]
> > > My question is: if we use:
> > > 
> > >  -numa cpu,socket=2,core=1,thread=0,node-id=3
> > > 
> > > and then:
> > > 
> > >  device_add ...,socket=2,core=1,thread=0
> > >  (omitting node-id on device_add)
> > > 
> > > won't it work exactly the same, and place the new CPU on NUMA
> > > node 3?
> > yep, it's allowed for compat reasons:
> >   1: to allow the destination to start with an old CLI variant
> >      that didn't have node-id (migration)
> >   2: to let old libvirt hotplug CPUs; it doesn't treat 'props'
> >      as an opaque set that is just replayed to device_add,
> >      instead it composes the command from the topology info it got
> >      from QEMU, and unfortunately node-id is only read but is
> >      not emitted when device_add is composed
> >      (I consider this a bug, but it's out in the wild so we have to deal with it)
> > 
> > we can't enforce its presence in these cases, or at least we have to
> > keep it relaxed for old machine types.
> 
> I see.
> 
> >  
> > > In this case, if we don't need a node-id argument on device_add,
> > > so node-id doesn't need to be in HotpluggableCPU.props.
> > I'd say we currently don't have to (for the above reasons), but
> > it doesn't hurt and actually allows using the pc-dimm way of
> > mapping CPUs to nodes, as David noted, i.e.:
> >   -device cpu-foo,node-id=x,...
> > without any of the -numa cpu[s] options on the CLI.
> > It's currently explicitly disabled but should work if one
> > doesn't care about hotplug or if the target doesn't care about
> > the mapping at startup (sPAPR); it also might work for x86 as
> > well using the _PXM method in ACPI.
> > (But that's out of scope for this series and needs more
> > testing, as some guest OSes might expect a populated SRAT
> > to work correctly.)
> 
> Yep. I understand that setting node-id is useful, I just didn't
> expect it to be mandatory and included on HotpluggableCPU.props.
> 
> > 
> > [...]
> > > > 
> > > > In future to avoid starting QEMU twice we were thinking about
> > > > configuring QEMU from QMP at runtime, that's where preconfigure
> > > > approach could be used to help solving it in the future:
> > > > 
> > > >   1. introduce pause before machine_init CLI option to allow
> > > >      preconfig machine from qmp/monitor
> > > >   2. make query-hotpluggable-cpus usable at preconfig time
> > > >   3. start qemu with needed number of numa nodes and default mapping:
> > > >          #qemu -smp ... -numa node,nodeid=0 -node node,nodeid=1
> > > >   4. get possible cpus list  
> > > 
> > > This is where things can get tricky: if we have the default
> > > mapping set, step 4 would return "node-id" already set on all
> > > possible CPUs.
> > that would depend on the implementation:
> >  - display node-id with default preset values to override
> >  - do not set defaults and force the user to do the mapping
> 
> Right. We could choose to initialize default values much later,
> and leave it uninitialized.
> 
> > 
> > > >   5. add qmp/monitor command variant for '-numa cpu' to set numa mapping  
> > > 
> > > This is where I think we would make things simpler: if node-id
> > > isn't present on 'props', we can simply document the arguments
> > > that identify the CPU for the numa-cpu command as "just use the
> > > properties you get on query-hotpluggable-cpus.props". Clients
> > > would be able to treat CpuInstanceProperties as an opaque CPU
> > > slot identifier.
> > > 
> > > i.e. I think this would be a better way to define and document
> > > the interface:
> > > 
> > > ##
> > > # @NumaCpuOptions:
> > > #
> > > # Mapping of a given CPU (or a set of CPUs) to a NUMA node.
> > > #
> > > # @cpu: Properties identifying the CPU(s). Use the 'props' field of
> > > #       query-hotpluggable-cpus for possible values for this
> > > #       field.
> > > #       TODO: describe what happens when 'cpu' matches
> > > #       multiple slots.
> > > # @node-id: NUMA node where the CPUs are going to be located.
> > > ##
> > > { 'struct': 'NumaCpuOptions',
> > >   'data': {
> > >    'cpu': 'CpuInstanceProperties',
> > >    'node-id': 'int' } }
> > > 
> > > This separates "what identifies the CPU slot(s) we are
> > > configuring" from "what identifies the node ID we are binding
> > > to".
> > Doesn't look any simpler to me; I'd rather document node-id usage
> > in props, like
> > 
> 
> Well, it still looks simpler and more intuitive to me, but just
> because it matches my initial expectations about the semantics of
> query-hotpluggable-cpus and CpuInstanceProperties. If your
> alternative is very clearly documented (like below), it is not a
> problem to me.
> 
> >  #
> >  # A discriminated record of NUMA options. (for OptsVisitor)
> >  #
> > +# For 'cpu' type as arguments use a set of cpu properties returned
> > +# by query-hotpluggable-cpus[].props, where node-id could be used
> > +# to override default node mapping. Since: 2.10
> > +#
> >  # Since: 2.1
> >  ##
> >  { 'union': 'NumaOptions',
> >    'base': { 'type': 'NumaOptionsType' },
> >    'discriminator': 'type',
> >    'data': {
> > -    'node': 'NumaNodeOptions' }}
> > +    'node': 'NumaNodeOptions',
> > +    'cpu' : 'CpuInstanceProperties' }}
> 
> I worry about not being able to add extra options to "-numa cpu"
> in the future without affecting HotpluggableCPU.props too. Being
> able to document the semantics of -numa cpu inside a dedicated
> NumaCpuOptions struct would be nice too.
> 
> I believe this can be addressed by defining "NumaCpuOptions" with
> "CpuInstanceProperties" as base:
> 
>  { 'union': 'NumaOptions',
>    'base': { 'type': 'NumaOptionsType' },
>    'discriminator': 'type',
>    'data': {
>      'node': 'NumaNodeOptions',
>      'cpu' : 'NumaCpuOptions' }}
> 
> ##
> # Options for -numa cpu,...
> #
> # "-numa cpu" accepts the same set of cpu properties returned by
> # query-hotpluggable-cpus[].props, where node-id could be used to
> # override default node mapping.
> #
> # Since: 2.10
> ##
> { 'struct': 'NumaCpuOptions',
>   'base': 'CpuInstanceProperties' }
Is it inheritance or encapsulation?
If it's encapsulation it wouldn't look nice, but we can
duplicate the fields from CpuInstanceProperties in NumaCpuOptions,
like you proposed below, and marshal them into CpuInstanceProperties
inside parse_numa() where needed.

> 
> >  
> 
> 
> >  ##
> >  # @NumaNodeOptions:
> > 
> > 
> > > In case we have trouble making this struct work with QemuOpts, we
> > > could do this (temporarily?):
> > > 
> > > ##
> > > # @NumaCpuOptions:
> > > #
> > > # Mapping of a given CPU (or a set of CPUs) to a NUMA node.
> > > #
> > > # @cpu: Properties identifying the CPU(s). Use the 'props' field of
> > > #       query-hotpluggable-cpus for possible values for this
> > > #       field.
> > > #       TODO: describe what happens when 'cpu' matches
> > > #       multiple slots.
> > > # @node-id: NUMA node where the CPUs are going to be located.
> > > #
> > > # @socket-id: Shortcut for cpu.socket-id, to make this struct
> > > #             friendly to QemuOpts.
> > > # @core-id: Shortcut for cpu.core-id, to make this struct
> > > #           friendly to QemuOpts.
> > > # @thread-id: Shortcut for cpu.thread-id, to make this struct
> > > #             friendly to QemuOpts.
> > > ##
> > > { 'struct': 'NumaCpuOptions',
> > >   'data': {
> > >    '*cpu': 'CpuInstanceProperties',
> > >    '*socket-id': 'int',
> > >    '*core-id': 'int',
> > >    '*thread-id': 'int',
> > >    'node-id': 'int' } }
> > > 
> > > >   6. optionally, set new numa mapping and get updated
> > > >      possible cpus list with query-hotpluggable-cpus
> > > >   7. optionally, add extra cpus with device_add using updated
> > > >      cpus list and get updated cpus list as it's been changed again.
> > > >   8. unpause preconfig stage and let qemu continue to execute
> > > >      machine_init and the rest.
> > > > 
> > > > Since we would need to implement QMP configuration for '-device cpu',
> > > > we as well might reuse it for custom numa mapping.
> > > > 
> > > >  [...]  
> > > 
> > 
>
Eduardo Habkost April 27, 2017, 5:32 p.m. UTC | #7
On Thu, Apr 27, 2017 at 07:25:23PM +0200, Igor Mammedov wrote:
[...]
> > >  #
> > >  # A discriminated record of NUMA options. (for OptsVisitor)
> > >  #
> > > +# For 'cpu' type as arguments use a set of cpu properties returned
> > > +# by query-hotpluggable-cpus[].props, where node-id could be used
> > > +# to override default node mapping. Since: 2.10
> > > +#
> > >  # Since: 2.1
> > >  ##
> > >  { 'union': 'NumaOptions',
> > >    'base': { 'type': 'NumaOptionsType' },
> > >    'discriminator': 'type',
> > >    'data': {
> > > -    'node': 'NumaNodeOptions' }}
> > > +    'node': 'NumaNodeOptions',
> > > +    'cpu' : 'CpuInstanceProperties' }}
> > 
> > I worry about not being able to add extra options to "-numa cpu"
> > in the future without affecting HotpluggableCPU.props too. Being
> > able to document the semantics of -numa cpu inside a dedicated
> > NumaCpuOptions struct would be nice too.
> > 
> > I believe this can be addressed by defining "NumaCpuOptions" with
> > "CpuInstanceProperties" as base:
> > 
> >  { 'union': 'NumaOptions',
> >    'base': { 'type': 'NumaOptionsType' },
> >    'discriminator': 'type',
> >    'data': {
> >      'node': 'NumaNodeOptions',
> >      'cpu' : 'NumaCpuOptions' }}
> > 
> > ##
> > # Options for -numa cpu,...
> > #
> > # "-numa cpu" accepts the same set of cpu properties returned by
> > # query-hotpluggable-cpus[].props, where node-id could be used to
> > # override default node mapping.
> > #
> > # Since: 2.10
> > ##
> > { 'struct': 'NumaCpuOptions',
> >   'base': 'CpuInstanceProperties' }
> is it inheritance or encapsulation?

If I understood the docs correctly, it's inheritance. I didn't
test it, though.

> if it's encapsulation, wouldn't look nice, but we can
> duplicate fields from CpuInstanceProperties in NumaCpuOptions
> like you proposed below and marshal them into CpuInstanceProperties
> inside of parse_numa() where needed.

I think inheritance will work. But if it doesn't, I don't mind
either: we can duplicate the fields like you suggest, or use
CpuInstanceProperties directly like you did above.
David Gibson May 2, 2017, 4:27 a.m. UTC | #8
On Thu, Apr 27, 2017 at 01:32:25PM -0300, Eduardo Habkost wrote:
> On Thu, Apr 27, 2017 at 03:14:06PM +0200, Igor Mammedov wrote:
> > On Wed, 26 Apr 2017 09:21:38 -0300
> > Eduardo Habkost <ehabkost@redhat.com> wrote:
> > 
> > adding Peter to CC list
> > 
> > [...]
> > 
> > > On Wed, Apr 19, 2017 at 01:14:58PM +0200, Igor Mammedov wrote:
> > > > On Wed, 12 Apr 2017 18:02:39 -0300
> > > > Eduardo Habkost <ehabkost@redhat.com> wrote:
> > > >   
> > > > > On Wed, Mar 22, 2017 at 02:32:32PM +0100, Igor Mammedov wrote:  
> > > > > > it will allow switching from cpu_index to property based
> > > > > > numa mapping in follow up patches.    
> > > > > 
> > > > > I am not sure I understand all the consequences of this, so I
> > > > > will give it a try:
> > > > > 
> > > > > "node-id" is an existing field in CpuInstanceProperties.
> > > > > CpuInstanceProperties is used on both query-hotpluggable-cpus
> > > > > output and in MachineState::possible_cpus.
> > > > > 
> > > > > We will start using MachineState::possible_cpus to keep track of
> > > > > NUMA CPU affinity, and that means query-hotpluggable-cpus will
> > > > > start reporting a "node-id" property when a NUMA mapping is
> > > > > configured.
> > > > > 
> > > > > To allow query-hotpluggable-cpus to report "node-id", the CPU
> > > > > objects must have a "node-id" property that can be set. This
> > > > > patch adds the "node-id" property to X86CPU.
> > > > > 
> > > > > Is this description accurate? Is the presence of "node-id" in
> > > > > query-hotpluggable-cpus the only reason we really need this
> > > > > patch, or is there something else that requires the "node-id"
> > > > > property?  
> > > > That's an accurate description; node-id is in the same 'address'
> > > > properties category as socket/core/thread-id. So if you have a
> > > > numa-enabled machine you'd see the node-id property in
> > > > query-hotpluggable-cpus.
> > > 
> > > I agree that we can make -numa cpu affect query-hotpluggable-cpus
> > > output (i.e. affect some field on HotpluggableCPU.props).
> > > 
> > > But it looks like we disagree about the purpose of
> > > HotpluggableCPU.props:
> > > 
> > > I believe HotpluggableCPU.props is just an opaque identifier for
> > > the location we want to plug the CPU, and the only requirement is
> > > that it should be unique and have all the information device_add
> > > needs. As socket IDs are already unique on our existing machines,
> > > and socket<=>node mapping is already configured using -numa cpu,
> > > node-id doesn't need to be in HotpluggableCPU.props. (see example
> > > below)
> > node-id is also a location property which logically complements
> > the socket/core/thread properties.  Also, socket is not necessarily
> > a unique id that maps 1:1 to node-id from a generic pov.
> > BTW -numa cpu[s] is not the only way to specify the mapping,
> > it could be specified like we do with pc-dimm:
> >    device_add pc-dimm,node=x
> > 
> > Looking at it more generically, there could be the same
> > socket-ids for different nodes; then we would have to add
> > node-id to props anyway and end up with two node-ids, one in props
> > and another in the parent struct.
> 
> This is where my expectations are different: I think
> HotpluggableCPU.props is just an identifier property for CPU
> slots that is used for device_add (and will be used for -numa
> cpu), and isn't supposed to be interpreted by clients.
> 
> The problem I see is that the property has two completely
> different purposes: identifying a given CPU slot for device_add
> (and -numa cpu), and introspection of topology information about
> the CPU slot. Today we are lucky and those goals don't conflict
> with each other, but I worry this might cause trouble in the
> future.

Yeah, I share your concern.  And even if we allow that the topology
information may be read by the user, at the moment the
socket/core/thread values are "read only" in the sense that the client
should do nothing but read them from the query (possibly look at them
for its own interest) and echo them back verbatim to device_add.

node id is different because it's something the user/management might
want to actually choose.  So it seems dubious to me that it's in the
same structure.
Igor Mammedov May 2, 2017, 8:28 a.m. UTC | #9
On Tue, 2 May 2017 14:27:12 +1000
David Gibson <david@gibson.dropbear.id.au> wrote:

> On Thu, Apr 27, 2017 at 01:32:25PM -0300, Eduardo Habkost wrote:
> > On Thu, Apr 27, 2017 at 03:14:06PM +0200, Igor Mammedov wrote:  
> > > On Wed, 26 Apr 2017 09:21:38 -0300
> > > Eduardo Habkost <ehabkost@redhat.com> wrote:
> > > 
> > > adding Peter to CC list
> > > 
> > > [...]
> > >   
> > > > On Wed, Apr 19, 2017 at 01:14:58PM +0200, Igor Mammedov wrote:  
> > > > > On Wed, 12 Apr 2017 18:02:39 -0300
> > > > > Eduardo Habkost <ehabkost@redhat.com> wrote:
> > > > >     
> > > > > > On Wed, Mar 22, 2017 at 02:32:32PM +0100, Igor Mammedov wrote:    
> > > > > > > it will allow switching from cpu_index to property based
> > > > > > > numa mapping in follow up patches.      
> > > > > > 
> > > > > > I am not sure I understand all the consequences of this, so I
> > > > > > will give it a try:
> > > > > > 
> > > > > > "node-id" is an existing field in CpuInstanceProperties.
> > > > > > CpuInstanceProperties is used on both query-hotpluggable-cpus
> > > > > > output and in MachineState::possible_cpus.
> > > > > > 
> > > > > > We will start using MachineState::possible_cpus to keep track of
> > > > > > NUMA CPU affinity, and that means query-hotpluggable-cpus will
> > > > > > start reporting a "node-id" property when a NUMA mapping is
> > > > > > configured.
> > > > > > 
> > > > > > To allow query-hotpluggable-cpus to report "node-id", the CPU
> > > > > > objects must have a "node-id" property that can be set. This
> > > > > > patch adds the "node-id" property to X86CPU.
> > > > > > 
> > > > > > Is this description accurate? Is the presence of "node-id" in
> > > > > > query-hotpluggable-cpus the only reason we really need this
> > > > > > patch, or is there something else that requires the "node-id"
> > > > > > property?    
> > > > > That's an accurate description; node-id is in the same 'address'
> > > > > properties category as socket/core/thread-id. So if you have a
> > > > > numa-enabled machine you'd see the node-id property in
> > > > > query-hotpluggable-cpus.
> > > > 
> > > > I agree that we can make -numa cpu affect query-hotpluggable-cpus
> > > > output (i.e. affect some field on HotpluggableCPU.props).
> > > > 
> > > > But it looks like we disagree about the purpose of
> > > > HotpluggableCPU.props:
> > > > 
> > > > I believe HotpluggableCPU.props is just an opaque identifier for
> > > > the location we want to plug the CPU, and the only requirement is
> > > > that it should be unique and have all the information device_add
> > > > needs. As socket IDs are already unique on our existing machines,
> > > > and socket<=>node mapping is already configured using -numa cpu,
> > > > node-id doesn't need to be in HotpluggableCPU.props. (see example
> > > > below)  
> > > node-id is also a location property which logically complements
> > > the socket/core/thread properties.  Also, socket is not necessarily
> > > a unique id that maps 1:1 to node-id from a generic pov.
> > > BTW -numa cpu[s] is not the only way to specify the mapping,
> > > it could be specified like we do with pc-dimm:
> > >    device_add pc-dimm,node=x
> > > 
> > > Looking at it more generically, there could be the same
> > > socket-ids for different nodes; then we would have to add
> > > node-id to props anyway and end up with two node-ids, one in props
> > > and another in the parent struct.
> > 
> > This is where my expectations are different: I think
> > HotpluggableCPU.props is just an identifier property for CPU
> > slots that is used for device_add (and will be used for -numa
> > cpu), and isn't supposed to be interpreted by clients.
> > 
> > The problem I see is that the property has two completely
> > different purposes: identifying a given CPU slot for device_add
> > (and -numa cpu), and introspection of topology information about
> > the CPU slot. Today we are lucky and those goals don't conflict
> > with each other, but I worry this might cause trouble in the
> > future.  
> 
> Yeah, I share your concern.  And even if we allow that the topology
> information may be read by the user, at the moment the
> socket/core/thread values are "read only" in the sense that the client
> should do nothing but read them from the query (possibly look at them
> for its own interest) and echo them back verbatim to device_add.
> 
> node id is different because it's something the user/management might
> want to actually choose.  So it seems dubious to me that it's in the
> same structure.
So far, node-id is 'read only' when it comes to device_add as well.

diff --git a/hw/i386/pc.c b/hw/i386/pc.c
index 7031100..873bbfa 100644
--- a/hw/i386/pc.c
+++ b/hw/i386/pc.c
@@ -1895,6 +1895,7 @@  static void pc_cpu_pre_plug(HotplugHandler *hotplug_dev,
                             DeviceState *dev, Error **errp)
 {
     int idx;
+    int node_id;
     CPUState *cs;
     CPUArchId *cpu_slot;
     X86CPUTopoInfo topo;
@@ -1984,6 +1985,22 @@  static void pc_cpu_pre_plug(HotplugHandler *hotplug_dev,
 
     cs = CPU(cpu);
     cs->cpu_index = idx;
+
+    node_id = numa_get_node_for_cpu(cs->cpu_index);
+    if (node_id == nb_numa_nodes) {
+        /* by default CPUState::numa_node was 0 if it wasn't set via CLI;
+         * keep it this way for now, but in the future we should probably
+         * refuse to start up with an incomplete numa mapping */
+        node_id = 0;
+    }
+    if (cs->numa_node == CPU_UNSET_NUMA_NODE_ID) {
+        cs->numa_node = node_id;
+    } else if (cs->numa_node != node_id) {
+        error_setg(errp, "node-id %d must match numa node specified "
+                   "with -numa option for cpu-index %d",
+                   cs->numa_node, cs->cpu_index);
+        return;
+    }
 }
 
 static void pc_machine_device_pre_plug_cb(HotplugHandler *hotplug_dev,
diff --git a/target/i386/cpu.c b/target/i386/cpu.c
index 7aa7622..d690244 100644
--- a/target/i386/cpu.c
+++ b/target/i386/cpu.c
@@ -3974,6 +3974,7 @@  static Property x86_cpu_properties[] = {
     DEFINE_PROP_INT32("core-id", X86CPU, core_id, -1),
     DEFINE_PROP_INT32("socket-id", X86CPU, socket_id, -1),
 #endif
+    DEFINE_PROP_INT32("node-id", CPUState, numa_node, CPU_UNSET_NUMA_NODE_ID),
     DEFINE_PROP_BOOL("pmu", X86CPU, enable_pmu, false),
     { .name  = "hv-spinlocks", .info  = &qdev_prop_spinlocks },
     DEFINE_PROP_BOOL("hv-relaxed", X86CPU, hyperv_relaxed_timing, false),
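[With the property in place, the mapping configured on the CLI is what an explicit node-id on device_add is validated against. A hypothetical invocation — the machine type, topology values, and CPU type are illustrative only, and -numa node,cpus= is the pre-"-numa cpu" syntax this patch works with:]

```
# Illustrative sketch: two nodes, cpu_indexes 0-3 on node 0, 4-7 on node 1
qemu-system-x86_64 -machine pc -smp 1,sockets=8,maxcpus=8 \
    -numa node,nodeid=0,cpus=0-3 -numa node,nodeid=1,cpus=4-7 \
    -monitor stdio

# Later, in the monitor: if node-id is given, it must agree with the
# -numa mapping for that cpu_index, or pre_plug fails with the new error
# (qemu) device_add qemu64-x86_64-cpu,socket-id=4,core-id=0,thread-id=0,node-id=1
```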