diff mbox

[for-2.10,05/23] numa: move source of default CPUs to NUMA node mapping into boards

Message ID 1490189568-167621-6-git-send-email-imammedo@redhat.com
State New
Headers show

Commit Message

Igor Mammedov March 22, 2017, 1:32 p.m. UTC
Originally CPU threads were by default assigned in
round-robin fashion. However it was causing issues in
guest since CPU threads from the same socket/core could
be placed on different NUMA nodes.
Commit fb43b73b (pc: fix default VCPU to NUMA node mapping)
fixed it by grouping threads within a socket on the same node
introducing cpu_index_to_socket_id() callback and commit
20bb648d (spapr: Fix default NUMA node allocation for threads)
reused callback to fix similar issues for SPAPR machine
even though socket doesn't make much sense there.

As result QEMU ended up having 3 default distribution rules
used by 3 targets /virt-arm, spapr, pc/.

In effort of moving NUMA mapping for CPUs into possible_cpus,
generalize default mapping in numa.c by making boards decide
on default mapping and let them explicitly tell generic
numa code to which node a CPU thread belongs to by replacing
cpu_index_to_socket_id() with @cpu_index_to_instance_props()
which provides default node_id assigned by board to specified
cpu_index.

Signed-off-by: Igor Mammedov <imammedo@redhat.com>
---
Patch only moves source of default mapping to possible_cpus[]
and leaves the rest of NUMA handling to numa_info[node_id].node_cpu
bitmaps. It's up to follow up patches to replace bitmaps
with possible_cpus[] internally.
---
 include/hw/boards.h   |  8 ++++++--
 include/sysemu/numa.h |  2 +-
 hw/arm/virt.c         | 19 +++++++++++++++++--
 hw/i386/pc.c          | 22 ++++++++++++++++------
 hw/ppc/spapr.c        | 27 ++++++++++++++++++++-------
 numa.c                | 15 +++++++++------
 vl.c                  |  2 +-
 7 files changed, 70 insertions(+), 25 deletions(-)

Comments

Bharata B Rao March 23, 2017, 6:10 a.m. UTC | #1
On Wed, Mar 22, 2017 at 7:02 PM, Igor Mammedov <imammedo@redhat.com> wrote:

> diff --git a/numa.c b/numa.c
> index e01cb54..b6e71bc 100644
> --- a/numa.c
> +++ b/numa.c
> @@ -294,9 +294,10 @@ static void validate_numa_cpus(void)
>      g_free(seen_cpus);
>  }
>
> -void parse_numa_opts(MachineClass *mc)
> +void parse_numa_opts(MachineState *ms)
>  {
>      int i;
> +    MachineClass *mc = MACHINE_GET_CLASS(ms);
>
>      for (i = 0; i < MAX_NODES; i++) {
>          numa_info[i].node_cpu = bitmap_new(max_cpus);
> @@ -378,14 +379,16 @@ void parse_numa_opts(MachineClass *mc)
>           * rule grouping VCPUs by socket so that VCPUs from the same
> socket
>           * would be on the same node.
>           */
> +        if (!mc->cpu_index_to_instance_props) {
> +            error_report("default CPUs to NUMA node mapping isn't
> supported");
> +            exit(1);
> +        }
>

Just trying to understand the impact of the above enforcement. So targets
and machine types that don't define ->cpu_index_to_instance_props() are
expected not to boot ? Shouldn't they have a default to fall back upon ?

Regards,
Bharata.
Igor Mammedov March 23, 2017, 8:48 a.m. UTC | #2
On Thu, 23 Mar 2017 11:40:29 +0530
Bharata B Rao <bharata.rao@gmail.com> wrote:

> On Wed, Mar 22, 2017 at 7:02 PM, Igor Mammedov <imammedo@redhat.com> wrote:
> 
> > diff --git a/numa.c b/numa.c
> > index e01cb54..b6e71bc 100644
> > --- a/numa.c
> > +++ b/numa.c
> > @@ -294,9 +294,10 @@ static void validate_numa_cpus(void)
> >      g_free(seen_cpus);
> >  }
> >
> > -void parse_numa_opts(MachineClass *mc)
> > +void parse_numa_opts(MachineState *ms)
> >  {
> >      int i;
> > +    MachineClass *mc = MACHINE_GET_CLASS(ms);
> >
> >      for (i = 0; i < MAX_NODES; i++) {
> >          numa_info[i].node_cpu = bitmap_new(max_cpus);
> > @@ -378,14 +379,16 @@ void parse_numa_opts(MachineClass *mc)
> >           * rule grouping VCPUs by socket so that VCPUs from the same
> > socket
> >           * would be on the same node.
> >           */
> > +        if (!mc->cpu_index_to_instance_props) {
> > +            error_report("default CPUs to NUMA node mapping isn't
> > supported");
> > +            exit(1);
> > +        }
> >  
> 
> Just trying to understand the impact of the above enforcement. So targets
> and machine types that don't define ->cpu_index_to_instance_props() are
> expected not to boot ? Shouldn't they have a default to fall back upon ?
Currently there are 3 boards that support numa and with this series
they all implement cpu_index_to_instance_props callback,
so boards that has supported numa shouldn't be affected.

But if someone used '-numa' with a board that doesn't support numa,
it would stop booting with error message instead of silently parsing
not supported option or falling back bogus defaults (which aren't
used anyway).


> Regards,
> Bharata.
David Gibson March 28, 2017, 4:19 a.m. UTC | #3
On Wed, Mar 22, 2017 at 02:32:30PM +0100, Igor Mammedov wrote:
> Originally CPU threads were by default assigned in
> round-robin fashion. However it was causing issues in
> guest since CPU threads from the same socket/core could
> be placed on different NUMA nodes.
> Commit fb43b73b (pc: fix default VCPU to NUMA node mapping)
> fixed it by grouping threads within a socket on the same node
> introducing cpu_index_to_socket_id() callback and commit
> 20bb648d (spapr: Fix default NUMA node allocation for threads)
> reused callback to fix similar issues for SPAPR machine
> even though socket doesn't make much sense there.
> 
> As result QEMU ended up having 3 default distribution rules
> used by 3 targets /virt-arm, spapr, pc/.
> 
> In effort of moving NUMA mapping for CPUs into possible_cpus,
> generalize default mapping in numa.c by making boards decide
> on default mapping and let them explicitly tell generic
> numa code to which node a CPU thread belongs to by replacing
> cpu_index_to_socket_id() with @cpu_index_to_instance_props()
> which provides default node_id assigned by board to specified
> cpu_index.
> 
> Signed-off-by: Igor Mammedov <imammedo@redhat.com>
> ---
> Patch only moves source of default mapping to possible_cpus[]
> and leaves the rest of NUMA handling to numa_info[node_id].node_cpu
> bitmaps. It's up to follow up patches to replace bitmaps
> with possible_cpus[] internally.
> ---
>  include/hw/boards.h   |  8 ++++++--
>  include/sysemu/numa.h |  2 +-
>  hw/arm/virt.c         | 19 +++++++++++++++++--
>  hw/i386/pc.c          | 22 ++++++++++++++++------
>  hw/ppc/spapr.c        | 27 ++++++++++++++++++++-------
>  numa.c                | 15 +++++++++------
>  vl.c                  |  2 +-
>  7 files changed, 70 insertions(+), 25 deletions(-)
> 
> diff --git a/include/hw/boards.h b/include/hw/boards.h
> index 269d0ba..1dd0fde 100644
> --- a/include/hw/boards.h
> +++ b/include/hw/boards.h
> @@ -74,7 +74,10 @@ typedef struct {
>   *    of HotplugHandler object, which handles hotplug operation
>   *    for a given @dev. It may return NULL if @dev doesn't require
>   *    any actions to be performed by hotplug handler.
> - * @cpu_index_to_socket_id:
> + * @cpu_index_to_instance_props:
> + *    used to provide @cpu_index to socket/core/thread number mapping, allowing
> + *    legacy code to perform maping from cpu_index to topology properties
> + *    Returns: tuple of socket/core/thread ids given cpu_index belongs to.
>   *    used to provide @cpu_index to socket number mapping, allowing
>   *    a machine to group CPU threads belonging to the same socket/package
>   *    Returns: socket number given cpu_index belongs to.
> @@ -138,7 +141,8 @@ struct MachineClass {
>  
>      HotplugHandler *(*get_hotplug_handler)(MachineState *machine,
>                                             DeviceState *dev);
> -    unsigned (*cpu_index_to_socket_id)(unsigned cpu_index);
> +    CpuInstanceProperties (*cpu_index_to_instance_props)(MachineState *machine,
> +                                                         unsigned cpu_index);
>      const CPUArchIdList *(*possible_cpu_arch_ids)(MachineState *machine);
>  };
>  
> diff --git a/include/sysemu/numa.h b/include/sysemu/numa.h
> index 8f09dcf..46ea6c7 100644
> --- a/include/sysemu/numa.h
> +++ b/include/sysemu/numa.h
> @@ -24,7 +24,7 @@ typedef struct node_info {
>  } NodeInfo;
>  
>  extern NodeInfo numa_info[MAX_NODES];
> -void parse_numa_opts(MachineClass *mc);
> +void parse_numa_opts(MachineState *ms);
>  void numa_post_machine_init(void);
>  void query_numa_node_mem(uint64_t node_mem[]);
>  extern QemuOptsList qemu_numa_opts;
> diff --git a/hw/arm/virt.c b/hw/arm/virt.c
> index 0cbcbc1..8748d25 100644
> --- a/hw/arm/virt.c
> +++ b/hw/arm/virt.c
> @@ -1554,6 +1554,16 @@ static void virt_set_gic_version(Object *obj, const char *value, Error **errp)
>      }
>  }
>  
> +static CpuInstanceProperties
> +virt_cpu_index_to_props(MachineState *ms, unsigned cpu_index)
> +{
> +    MachineClass *mc = MACHINE_GET_CLASS(ms);
> +    const CPUArchIdList *possible_cpus = mc->possible_cpu_arch_ids(ms);
> +
> +    assert(cpu_index < possible_cpus->len);
> +    return possible_cpus->cpus[cpu_index].props;;
> +}
> +

It seems a bit weird to have a machine specific hook to pull the
property information when one way or another it's coming from the
possible_cpus table, which is already constructed by a machine
specific hook.  Could we add a range or list of cpu_index values to
each possible_cpus entry instead, and have a generic lookup of the
right entry based on that?


>  static const CPUArchIdList *virt_possible_cpu_arch_ids(MachineState *ms)
>  {
>      int n;
> @@ -1573,8 +1583,12 @@ static const CPUArchIdList *virt_possible_cpu_arch_ids(MachineState *ms)
>          ms->possible_cpus->cpus[n].props.has_thread_id = true;
>          ms->possible_cpus->cpus[n].props.thread_id = n;
>  
> -        /* TODO: add 'has_node/node' here to describe
> -           to which node core belongs */
> +        /* default distribution of CPUs over NUMA nodes */
> +        if (nb_numa_nodes) {
> +            /* preset values but do not enable them i.e. 'has_node_id = false',
> +             * board will enable them if manual mapping wasn't present on CLI */

I'm a little confused by this comment, since I don't see any board
code altering has_node_id.

> +            ms->possible_cpus->cpus[n].props.node_id = n % nb_numa_nodes;;
> +        }
>      }
>      return ms->possible_cpus;
>  }
> @@ -1596,6 +1610,7 @@ static void virt_machine_class_init(ObjectClass *oc, void *data)
>      /* We know we will never create a pre-ARMv7 CPU which needs 1K pages */
>      mc->minimum_page_bits = 12;
>      mc->possible_cpu_arch_ids = virt_possible_cpu_arch_ids;
> +    mc->cpu_index_to_instance_props = virt_cpu_index_to_props;
>  }
>  
>  static const TypeInfo virt_machine_info = {
> diff --git a/hw/i386/pc.c b/hw/i386/pc.c
> index d24388e..7031100 100644
> --- a/hw/i386/pc.c
> +++ b/hw/i386/pc.c
> @@ -2245,12 +2245,14 @@ static void pc_machine_reset(void)
>      }
>  }
>  
> -static unsigned pc_cpu_index_to_socket_id(unsigned cpu_index)
> +static CpuInstanceProperties
> +pc_cpu_index_to_props(MachineState *ms, unsigned cpu_index)
>  {
> -    X86CPUTopoInfo topo;
> -    x86_topo_ids_from_idx(smp_cores, smp_threads, cpu_index,
> -                          &topo);
> -    return topo.pkg_id;
> +    MachineClass *mc = MACHINE_GET_CLASS(ms);
> +    const CPUArchIdList *possible_cpus = mc->possible_cpu_arch_ids(ms);
> +
> +    assert(cpu_index < possible_cpus->len);
> +    return possible_cpus->cpus[cpu_index].props;;

Since the pc and arm version of this are basically identical, I wonder
if that should actually be the default implementation.  If we need it
at all.

>  }
>  
>  static const CPUArchIdList *pc_possible_cpu_arch_ids(MachineState *ms)
> @@ -2282,6 +2284,14 @@ static const CPUArchIdList *pc_possible_cpu_arch_ids(MachineState *ms)
>          ms->possible_cpus->cpus[i].props.core_id = topo.core_id;
>          ms->possible_cpus->cpus[i].props.has_thread_id = true;
>          ms->possible_cpus->cpus[i].props.thread_id = topo.smt_id;
> +
> +        /* default distribution of CPUs over NUMA nodes */
> +        if (nb_numa_nodes) {
> +            /* preset values but do not enable them i.e. 'has_node_id = false',
> +             * board will enable them if manual mapping wasn't present on CLI */
> +            ms->possible_cpus->cpus[i].props.node_id =
> +                topo.pkg_id % nb_numa_nodes;
> +        }
>      }
>      return ms->possible_cpus;
>  }
> @@ -2324,7 +2334,7 @@ static void pc_machine_class_init(ObjectClass *oc, void *data)
>      pcmc->acpi_data_size = 0x20000 + 0x8000;
>      pcmc->save_tsc_khz = true;
>      mc->get_hotplug_handler = pc_get_hotpug_handler;
> -    mc->cpu_index_to_socket_id = pc_cpu_index_to_socket_id;
> +    mc->cpu_index_to_instance_props = pc_cpu_index_to_props;
>      mc->possible_cpu_arch_ids = pc_possible_cpu_arch_ids;
>      mc->has_hotpluggable_cpus = true;
>      mc->default_boot_order = "cad";
> diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
> index 6ee566d..9dcbbcc 100644
> --- a/hw/ppc/spapr.c
> +++ b/hw/ppc/spapr.c
> @@ -2921,11 +2921,18 @@ static HotplugHandler *spapr_get_hotplug_handler(MachineState *machine,
>      return NULL;
>  }
>  
> -static unsigned spapr_cpu_index_to_socket_id(unsigned cpu_index)
> +static CpuInstanceProperties
> +spapr_cpu_index_to_props(MachineState *machine, unsigned cpu_index)
>  {
> -    /* Allocate to NUMA nodes on a "socket" basis (not that concept of
> -     * socket means much for the paravirtualized PAPR platform) */
> -    return cpu_index / smp_threads / smp_cores;
> +    CPUArchId *core_slot;
> +    MachineClass *mc = MACHINE_GET_CLASS(machine);
> +    int core_id = cpu_index / smp_threads * smp_threads;

I don't think you need this.  AIUI the purpose of
spapr_find_cpu_slot() is that it already finds the right CPU slot from
a cpu_index, so you can just pass the cpu_index directly.

> +
> +    /* make sure possible_cpu are intialized */
> +    mc->possible_cpu_arch_ids(machine);
> +    core_slot = spapr_find_cpu_slot(machine, core_id, NULL);
> +    assert(core_slot);
> +    return core_slot->props;
>  }
>  
>  static const CPUArchIdList *spapr_possible_cpu_arch_ids(MachineState *machine)
> @@ -2952,8 +2959,14 @@ static const CPUArchIdList *spapr_possible_cpu_arch_ids(MachineState *machine)
>          machine->possible_cpus->cpus[i].arch_id = core_id;
>          machine->possible_cpus->cpus[i].props.has_core_id = true;
>          machine->possible_cpus->cpus[i].props.core_id = core_id;
> -        /* TODO: add 'has_node/node' here to describe
> -           to which node core belongs */
> +
> +        /* default distribution of CPUs over NUMA nodes */
> +        if (nb_numa_nodes) {
> +            /* preset values but do not enable them i.e. 'has_node_id = false',
> +             * board will enable them if manual mapping wasn't present on CLI */
> +            machine->possible_cpus->cpus[i].props.node_id =
> +                core_id / smp_threads / smp_cores % nb_numa_nodes;
> +        }
>      }
>      return machine->possible_cpus;
>  }
> @@ -3076,7 +3089,7 @@ static void spapr_machine_class_init(ObjectClass *oc, void *data)
>      hc->pre_plug = spapr_machine_device_pre_plug;
>      hc->plug = spapr_machine_device_plug;
>      hc->unplug = spapr_machine_device_unplug;
> -    mc->cpu_index_to_socket_id = spapr_cpu_index_to_socket_id;
> +    mc->cpu_index_to_instance_props = spapr_cpu_index_to_props;
>      mc->possible_cpu_arch_ids = spapr_possible_cpu_arch_ids;
>      hc->unplug_request = spapr_machine_device_unplug_request;
>  
> diff --git a/numa.c b/numa.c
> index e01cb54..b6e71bc 100644
> --- a/numa.c
> +++ b/numa.c
> @@ -294,9 +294,10 @@ static void validate_numa_cpus(void)
>      g_free(seen_cpus);
>  }
>  
> -void parse_numa_opts(MachineClass *mc)
> +void parse_numa_opts(MachineState *ms)
>  {
>      int i;
> +    MachineClass *mc = MACHINE_GET_CLASS(ms);
>  
>      for (i = 0; i < MAX_NODES; i++) {
>          numa_info[i].node_cpu = bitmap_new(max_cpus);
> @@ -378,14 +379,16 @@ void parse_numa_opts(MachineClass *mc)
>           * rule grouping VCPUs by socket so that VCPUs from the same socket
>           * would be on the same node.
>           */
> +        if (!mc->cpu_index_to_instance_props) {
> +            error_report("default CPUs to NUMA node mapping isn't supported");
> +            exit(1);
> +        }
>          if (i == nb_numa_nodes) {
>              for (i = 0; i < max_cpus; i++) {
> -                unsigned node_id = i % nb_numa_nodes;
> -                if (mc->cpu_index_to_socket_id) {
> -                    node_id = mc->cpu_index_to_socket_id(i) % nb_numa_nodes;
> -                }
> +                CpuInstanceProperties props;
> +                props = mc->cpu_index_to_instance_props(ms, i);
>  
> -                set_bit(i, numa_info[node_id].node_cpu);
> +                set_bit(i, numa_info[props.node_id].node_cpu);
>              }
>          }
>  
> diff --git a/vl.c b/vl.c
> index 0b4ed52..5ffb9c3 100644
> --- a/vl.c
> +++ b/vl.c
> @@ -4498,7 +4498,7 @@ int main(int argc, char **argv, char **envp)
>      default_drive(default_floppy, snapshot, IF_FLOPPY, 0, FD_OPTS);
>      default_drive(default_sdcard, snapshot, IF_SD, 0, SD_OPTS);
>  
> -    parse_numa_opts(machine_class);
> +    parse_numa_opts(current_machine);
>  
>      if (qemu_opts_foreach(qemu_find_opts("mon"),
>                            mon_init_func, NULL, NULL)) {
Igor Mammedov March 28, 2017, 10:53 a.m. UTC | #4
On Tue, 28 Mar 2017 15:19:20 +1100
David Gibson <david@gibson.dropbear.id.au> wrote:

> On Wed, Mar 22, 2017 at 02:32:30PM +0100, Igor Mammedov wrote:
> > Originally CPU threads were by default assigned in
> > round-robin fashion. However it was causing issues in
> > guest since CPU threads from the same socket/core could
> > be placed on different NUMA nodes.
> > Commit fb43b73b (pc: fix default VCPU to NUMA node mapping)
> > fixed it by grouping threads within a socket on the same node
> > introducing cpu_index_to_socket_id() callback and commit
> > 20bb648d (spapr: Fix default NUMA node allocation for threads)
> > reused callback to fix similar issues for SPAPR machine
> > even though socket doesn't make much sense there.
> > 
> > As result QEMU ended up having 3 default distribution rules
> > used by 3 targets /virt-arm, spapr, pc/.
> > 
> > In effort of moving NUMA mapping for CPUs into possible_cpus,
> > generalize default mapping in numa.c by making boards decide
> > on default mapping and let them explicitly tell generic
> > numa code to which node a CPU thread belongs to by replacing
> > cpu_index_to_socket_id() with @cpu_index_to_instance_props()
> > which provides default node_id assigned by board to specified
> > cpu_index.
> > 
> > Signed-off-by: Igor Mammedov <imammedo@redhat.com>
> > ---
> > Patch only moves source of default mapping to possible_cpus[]
> > and leaves the rest of NUMA handling to numa_info[node_id].node_cpu
> > bitmaps. It's up to follow up patches to replace bitmaps
> > with possible_cpus[] internally.
> > ---
> >  include/hw/boards.h   |  8 ++++++--
> >  include/sysemu/numa.h |  2 +-
> >  hw/arm/virt.c         | 19 +++++++++++++++++--
> >  hw/i386/pc.c          | 22 ++++++++++++++++------
> >  hw/ppc/spapr.c        | 27 ++++++++++++++++++++-------
> >  numa.c                | 15 +++++++++------
> >  vl.c                  |  2 +-
> >  7 files changed, 70 insertions(+), 25 deletions(-)
> > 
> > diff --git a/include/hw/boards.h b/include/hw/boards.h
> > index 269d0ba..1dd0fde 100644
> > --- a/include/hw/boards.h
> > +++ b/include/hw/boards.h
> > @@ -74,7 +74,10 @@ typedef struct {
> >   *    of HotplugHandler object, which handles hotplug operation
> >   *    for a given @dev. It may return NULL if @dev doesn't require
> >   *    any actions to be performed by hotplug handler.
> > - * @cpu_index_to_socket_id:
> > + * @cpu_index_to_instance_props:
> > + *    used to provide @cpu_index to socket/core/thread number mapping, allowing
> > + *    legacy code to perform maping from cpu_index to topology properties
> > + *    Returns: tuple of socket/core/thread ids given cpu_index belongs to.
> >   *    used to provide @cpu_index to socket number mapping, allowing
> >   *    a machine to group CPU threads belonging to the same socket/package
> >   *    Returns: socket number given cpu_index belongs to.
> > @@ -138,7 +141,8 @@ struct MachineClass {
> >  
> >      HotplugHandler *(*get_hotplug_handler)(MachineState *machine,
> >                                             DeviceState *dev);
> > -    unsigned (*cpu_index_to_socket_id)(unsigned cpu_index);
> > +    CpuInstanceProperties (*cpu_index_to_instance_props)(MachineState *machine,
> > +                                                         unsigned cpu_index);
> >      const CPUArchIdList *(*possible_cpu_arch_ids)(MachineState *machine);
> >  };
> >  
> > diff --git a/include/sysemu/numa.h b/include/sysemu/numa.h
> > index 8f09dcf..46ea6c7 100644
> > --- a/include/sysemu/numa.h
> > +++ b/include/sysemu/numa.h
> > @@ -24,7 +24,7 @@ typedef struct node_info {
> >  } NodeInfo;
> >  
> >  extern NodeInfo numa_info[MAX_NODES];
> > -void parse_numa_opts(MachineClass *mc);
> > +void parse_numa_opts(MachineState *ms);
> >  void numa_post_machine_init(void);
> >  void query_numa_node_mem(uint64_t node_mem[]);
> >  extern QemuOptsList qemu_numa_opts;
> > diff --git a/hw/arm/virt.c b/hw/arm/virt.c
> > index 0cbcbc1..8748d25 100644
> > --- a/hw/arm/virt.c
> > +++ b/hw/arm/virt.c
> > @@ -1554,6 +1554,16 @@ static void virt_set_gic_version(Object *obj, const char *value, Error **errp)
> >      }
> >  }
> >  
> > +static CpuInstanceProperties
> > +virt_cpu_index_to_props(MachineState *ms, unsigned cpu_index)
> > +{
> > +    MachineClass *mc = MACHINE_GET_CLASS(ms);
> > +    const CPUArchIdList *possible_cpus = mc->possible_cpu_arch_ids(ms);
> > +
> > +    assert(cpu_index < possible_cpus->len);
> > +    return possible_cpus->cpus[cpu_index].props;;
> > +}
> > +
> 
> It seems a bit weird to have a machine specific hook to pull the
> property information when one way or another it's coming from the
> possible_cpus table, which is already constructed by a machine
> specific hook.  Could we add a range or list of cpu_index values to
> each possible_cpus entry instead, and have a generic lookup of the
> right entry based on that?
> 
> 
> >  static const CPUArchIdList *virt_possible_cpu_arch_ids(MachineState *ms)
> >  {
> >      int n;
> > @@ -1573,8 +1583,12 @@ static const CPUArchIdList *virt_possible_cpu_arch_ids(MachineState *ms)
> >          ms->possible_cpus->cpus[n].props.has_thread_id = true;
> >          ms->possible_cpus->cpus[n].props.thread_id = n;
> >  
> > -        /* TODO: add 'has_node/node' here to describe
> > -           to which node core belongs */
> > +        /* default distribution of CPUs over NUMA nodes */
> > +        if (nb_numa_nodes) {
> > +            /* preset values but do not enable them i.e. 'has_node_id = false',
> > +             * board will enable them if manual mapping wasn't present on CLI */
> 
> I'm a little confused by this comment, since I don't see any board
> code altering has_node_id.
> 
> > +            ms->possible_cpus->cpus[n].props.node_id = n % nb_numa_nodes;;
> > +        }
> >      }
> >      return ms->possible_cpus;
> >  }
> > @@ -1596,6 +1610,7 @@ static void virt_machine_class_init(ObjectClass *oc, void *data)
> >      /* We know we will never create a pre-ARMv7 CPU which needs 1K pages */
> >      mc->minimum_page_bits = 12;
> >      mc->possible_cpu_arch_ids = virt_possible_cpu_arch_ids;
> > +    mc->cpu_index_to_instance_props = virt_cpu_index_to_props;
> >  }
> >  
> >  static const TypeInfo virt_machine_info = {
> > diff --git a/hw/i386/pc.c b/hw/i386/pc.c
> > index d24388e..7031100 100644
> > --- a/hw/i386/pc.c
> > +++ b/hw/i386/pc.c
> > @@ -2245,12 +2245,14 @@ static void pc_machine_reset(void)
> >      }
> >  }
> >  
> > -static unsigned pc_cpu_index_to_socket_id(unsigned cpu_index)
> > +static CpuInstanceProperties
> > +pc_cpu_index_to_props(MachineState *ms, unsigned cpu_index)
> >  {
> > -    X86CPUTopoInfo topo;
> > -    x86_topo_ids_from_idx(smp_cores, smp_threads, cpu_index,
> > -                          &topo);
> > -    return topo.pkg_id;
> > +    MachineClass *mc = MACHINE_GET_CLASS(ms);
> > +    const CPUArchIdList *possible_cpus = mc->possible_cpu_arch_ids(ms);
> > +
> > +    assert(cpu_index < possible_cpus->len);
> > +    return possible_cpus->cpus[cpu_index].props;;
> 
> Since the pc and arm version of this are basically identical, I wonder
> if that should actually be the default implementation.  If we need it
> at all.
ARM is still moving target and props are not really defined for it yet,
so I'd like to keep it separate for now and when it stabilizes we can think
about generalizing it.

> 
> >  }
> >  
> >  static const CPUArchIdList *pc_possible_cpu_arch_ids(MachineState *ms)
> > @@ -2282,6 +2284,14 @@ static const CPUArchIdList *pc_possible_cpu_arch_ids(MachineState *ms)
> >          ms->possible_cpus->cpus[i].props.core_id = topo.core_id;
> >          ms->possible_cpus->cpus[i].props.has_thread_id = true;
> >          ms->possible_cpus->cpus[i].props.thread_id = topo.smt_id;
> > +
> > +        /* default distribution of CPUs over NUMA nodes */
> > +        if (nb_numa_nodes) {
> > +            /* preset values but do not enable them i.e. 'has_node_id = false',
> > +             * board will enable them if manual mapping wasn't present on CLI */
> > +            ms->possible_cpus->cpus[i].props.node_id =
> > +                topo.pkg_id % nb_numa_nodes;
> > +        }
> >      }
> >      return ms->possible_cpus;
> >  }
> > @@ -2324,7 +2334,7 @@ static void pc_machine_class_init(ObjectClass *oc, void *data)
> >      pcmc->acpi_data_size = 0x20000 + 0x8000;
> >      pcmc->save_tsc_khz = true;
> >      mc->get_hotplug_handler = pc_get_hotpug_handler;
> > -    mc->cpu_index_to_socket_id = pc_cpu_index_to_socket_id;
> > +    mc->cpu_index_to_instance_props = pc_cpu_index_to_props;
> >      mc->possible_cpu_arch_ids = pc_possible_cpu_arch_ids;
> >      mc->has_hotpluggable_cpus = true;
> >      mc->default_boot_order = "cad";
> > diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
> > index 6ee566d..9dcbbcc 100644
> > --- a/hw/ppc/spapr.c
> > +++ b/hw/ppc/spapr.c
> > @@ -2921,11 +2921,18 @@ static HotplugHandler *spapr_get_hotplug_handler(MachineState *machine,
> >      return NULL;
> >  }
> >  
> > -static unsigned spapr_cpu_index_to_socket_id(unsigned cpu_index)
> > +static CpuInstanceProperties
> > +spapr_cpu_index_to_props(MachineState *machine, unsigned cpu_index)
> >  {
> > -    /* Allocate to NUMA nodes on a "socket" basis (not that concept of
> > -     * socket means much for the paravirtualized PAPR platform) */
> > -    return cpu_index / smp_threads / smp_cores;
> > +    CPUArchId *core_slot;
> > +    MachineClass *mc = MACHINE_GET_CLASS(machine);
> > +    int core_id = cpu_index / smp_threads * smp_threads;
> 
> I don't think you need this.  AIUI the purpose of
> spapr_find_cpu_slot() is that it already finds the right CPU slot from
> a cpu_index, so you can just pass the cpu_index directly.
ok, will do in v2

> 
> > +
> > +    /* make sure possible_cpu are intialized */
> > +    mc->possible_cpu_arch_ids(machine);
> > +    core_slot = spapr_find_cpu_slot(machine, core_id, NULL);
> > +    assert(core_slot);
> > +    return core_slot->props;
> >  }
> >  
> >  static const CPUArchIdList *spapr_possible_cpu_arch_ids(MachineState *machine)
> > @@ -2952,8 +2959,14 @@ static const CPUArchIdList *spapr_possible_cpu_arch_ids(MachineState *machine)
> >          machine->possible_cpus->cpus[i].arch_id = core_id;
> >          machine->possible_cpus->cpus[i].props.has_core_id = true;
> >          machine->possible_cpus->cpus[i].props.core_id = core_id;
> > -        /* TODO: add 'has_node/node' here to describe
> > -           to which node core belongs */
> > +
> > +        /* default distribution of CPUs over NUMA nodes */
> > +        if (nb_numa_nodes) {
> > +            /* preset values but do not enable them i.e. 'has_node_id = false',
> > +             * board will enable them if manual mapping wasn't present on CLI */
> > +            machine->possible_cpus->cpus[i].props.node_id =
> > +                core_id / smp_threads / smp_cores % nb_numa_nodes;
> > +        }
> >      }
> >      return machine->possible_cpus;
> >  }
> > @@ -3076,7 +3089,7 @@ static void spapr_machine_class_init(ObjectClass *oc, void *data)
> >      hc->pre_plug = spapr_machine_device_pre_plug;
> >      hc->plug = spapr_machine_device_plug;
> >      hc->unplug = spapr_machine_device_unplug;
> > -    mc->cpu_index_to_socket_id = spapr_cpu_index_to_socket_id;
> > +    mc->cpu_index_to_instance_props = spapr_cpu_index_to_props;
> >      mc->possible_cpu_arch_ids = spapr_possible_cpu_arch_ids;
> >      hc->unplug_request = spapr_machine_device_unplug_request;
> >  
> > diff --git a/numa.c b/numa.c
> > index e01cb54..b6e71bc 100644
> > --- a/numa.c
> > +++ b/numa.c
> > @@ -294,9 +294,10 @@ static void validate_numa_cpus(void)
> >      g_free(seen_cpus);
> >  }
> >  
> > -void parse_numa_opts(MachineClass *mc)
> > +void parse_numa_opts(MachineState *ms)
> >  {
> >      int i;
> > +    MachineClass *mc = MACHINE_GET_CLASS(ms);
> >  
> >      for (i = 0; i < MAX_NODES; i++) {
> >          numa_info[i].node_cpu = bitmap_new(max_cpus);
> > @@ -378,14 +379,16 @@ void parse_numa_opts(MachineClass *mc)
> >           * rule grouping VCPUs by socket so that VCPUs from the same socket
> >           * would be on the same node.
> >           */
> > +        if (!mc->cpu_index_to_instance_props) {
> > +            error_report("default CPUs to NUMA node mapping isn't supported");
> > +            exit(1);
> > +        }
> >          if (i == nb_numa_nodes) {
> >              for (i = 0; i < max_cpus; i++) {
> > -                unsigned node_id = i % nb_numa_nodes;
> > -                if (mc->cpu_index_to_socket_id) {
> > -                    node_id = mc->cpu_index_to_socket_id(i) % nb_numa_nodes;
> > -                }
> > +                CpuInstanceProperties props;
> > +                props = mc->cpu_index_to_instance_props(ms, i);
> >  
> > -                set_bit(i, numa_info[node_id].node_cpu);
> > +                set_bit(i, numa_info[props.node_id].node_cpu);
> >              }
> >          }
> >  
> > diff --git a/vl.c b/vl.c
> > index 0b4ed52..5ffb9c3 100644
> > --- a/vl.c
> > +++ b/vl.c
> > @@ -4498,7 +4498,7 @@ int main(int argc, char **argv, char **envp)
> >      default_drive(default_floppy, snapshot, IF_FLOPPY, 0, FD_OPTS);
> >      default_drive(default_sdcard, snapshot, IF_SD, 0, SD_OPTS);
> >  
> > -    parse_numa_opts(machine_class);
> > +    parse_numa_opts(current_machine);
> >  
> >      if (qemu_opts_foreach(qemu_find_opts("mon"),
> >                            mon_init_func, NULL, NULL)) {
>
David Gibson March 29, 2017, 2:24 a.m. UTC | #5
On Tue, Mar 28, 2017 at 12:53:10PM +0200, Igor Mammedov wrote:
> On Tue, 28 Mar 2017 15:19:20 +1100
> David Gibson <david@gibson.dropbear.id.au> wrote:
> 
> > On Wed, Mar 22, 2017 at 02:32:30PM +0100, Igor Mammedov wrote:
> > > Originally CPU threads were by default assigned in
> > > round-robin fashion. However it was causing issues in
> > > guest since CPU threads from the same socket/core could
> > > be placed on different NUMA nodes.
> > > Commit fb43b73b (pc: fix default VCPU to NUMA node mapping)
> > > fixed it by grouping threads within a socket on the same node
> > > introducing cpu_index_to_socket_id() callback and commit
> > > 20bb648d (spapr: Fix default NUMA node allocation for threads)
> > > reused callback to fix similar issues for SPAPR machine
> > > even though socket doesn't make much sense there.
> > > 
> > > As result QEMU ended up having 3 default distribution rules
> > > used by 3 targets /virt-arm, spapr, pc/.
> > > 
> > > In effort of moving NUMA mapping for CPUs into possible_cpus,
> > > generalize default mapping in numa.c by making boards decide
> > > on default mapping and let them explicitly tell generic
> > > numa code to which node a CPU thread belongs to by replacing
> > > cpu_index_to_socket_id() with @cpu_index_to_instance_props()
> > > which provides default node_id assigned by board to specified
> > > cpu_index.
> > > 
> > > Signed-off-by: Igor Mammedov <imammedo@redhat.com>
[snip]
> > > +static CpuInstanceProperties
> > > +virt_cpu_index_to_props(MachineState *ms, unsigned cpu_index)
> > > +{
> > > +    MachineClass *mc = MACHINE_GET_CLASS(ms);
> > > +    const CPUArchIdList *possible_cpus = mc->possible_cpu_arch_ids(ms);
> > > +
> > > +    assert(cpu_index < possible_cpus->len);
> > > +    return possible_cpus->cpus[cpu_index].props;;
> > > +}
> > > +
> > 
> > It seems a bit weird to have a machine specific hook to pull the
> > property information when one way or another it's coming from the
> > possible_cpus table, which is already constructed by a machine
> > specific hook.  Could we add a range or list of cpu_index values to
> > each possible_cpus entry instead, and have a generic lookup of the
> > right entry based on that?

[snip]
> > > -static unsigned pc_cpu_index_to_socket_id(unsigned cpu_index)
> > > +static CpuInstanceProperties
> > > +pc_cpu_index_to_props(MachineState *ms, unsigned cpu_index)
> > >  {
> > > -    X86CPUTopoInfo topo;
> > > -    x86_topo_ids_from_idx(smp_cores, smp_threads, cpu_index,
> > > -                          &topo);
> > > -    return topo.pkg_id;
> > > +    MachineClass *mc = MACHINE_GET_CLASS(ms);
> > > +    const CPUArchIdList *possible_cpus = mc->possible_cpu_arch_ids(ms);
> > > +
> > > +    assert(cpu_index < possible_cpus->len);
> > > +    return possible_cpus->cpus[cpu_index].props;;
> > 
> > Since the pc and arm version of this are basically identical, I wonder
> > if that should actually be the default implementation.  If we need it
> > at all.
> ARM is still moving target and props are not really defined for it yet,
> so I'd like to keep it separate for now and when it stabilizes we can think
> about generalizing it.

Fair enough.

Any thoughts on my more general query above
Igor Mammedov March 29, 2017, 11:48 a.m. UTC | #6
On Wed, 29 Mar 2017 13:24:49 +1100
David Gibson <david@gibson.dropbear.id.au> wrote:

> On Tue, Mar 28, 2017 at 12:53:10PM +0200, Igor Mammedov wrote:
> > On Tue, 28 Mar 2017 15:19:20 +1100
> > David Gibson <david@gibson.dropbear.id.au> wrote:
> >   
> > > On Wed, Mar 22, 2017 at 02:32:30PM +0100, Igor Mammedov wrote:  
> > > > Originally CPU threads were by default assigned in
> > > > round-robin fashion. However it was causing issues in
> > > > guest since CPU threads from the same socket/core could
> > > > be placed on different NUMA nodes.
> > > > Commit fb43b73b (pc: fix default VCPU to NUMA node mapping)
> > > > fixed it by grouping threads within a socket on the same node
> > > > introducing cpu_index_to_socket_id() callback and commit
> > > > 20bb648d (spapr: Fix default NUMA node allocation for threads)
> > > > reused callback to fix similar issues for SPAPR machine
> > > > even though socket doesn't make much sense there.
> > > > 
> > > > As result QEMU ended up having 3 default distribution rules
> > > > used by 3 targets /virt-arm, spapr, pc/.
> > > > 
> > > > In effort of moving NUMA mapping for CPUs into possible_cpus,
> > > > generalize default mapping in numa.c by making boards decide
> > > > on default mapping and let them explicitly tell generic
> > > > numa code to which node a CPU thread belongs to by replacing
> > > > cpu_index_to_socket_id() with @cpu_index_to_instance_props()
> > > > which provides default node_id assigned by board to specified
> > > > cpu_index.
> > > > 
> > > > Signed-off-by: Igor Mammedov <imammedo@redhat.com>  
> [snip]
> > > > +static CpuInstanceProperties
> > > > +virt_cpu_index_to_props(MachineState *ms, unsigned cpu_index)
> > > > +{
> > > > +    MachineClass *mc = MACHINE_GET_CLASS(ms);
> > > > +    const CPUArchIdList *possible_cpus = mc->possible_cpu_arch_ids(ms);
> > > > +
> > > > +    assert(cpu_index < possible_cpus->len);
> > > > +    return possible_cpus->cpus[cpu_index].props;;
> > > > +}
> > > > +  
> > > 
> > > It seems a bit weird to have a machine specific hook to pull the
> > > property information when one way or another it's coming from the
> > > possible_cpus table, which is already constructed by a machine
> > > specific hook.  Could we add a range or list of cpu_index values to
> > > each possible_cpus entry instead, and have a generic lookup of the
> > > right entry based on that?  
> 
> [snip]
> > > > -static unsigned pc_cpu_index_to_socket_id(unsigned cpu_index)
> > > > +static CpuInstanceProperties
> > > > +pc_cpu_index_to_props(MachineState *ms, unsigned cpu_index)
> > > >  {
> > > > -    X86CPUTopoInfo topo;
> > > > -    x86_topo_ids_from_idx(smp_cores, smp_threads, cpu_index,
> > > > -                          &topo);
> > > > -    return topo.pkg_id;
> > > > +    MachineClass *mc = MACHINE_GET_CLASS(ms);
> > > > +    const CPUArchIdList *possible_cpus = mc->possible_cpu_arch_ids(ms);
> > > > +
> > > > +    assert(cpu_index < possible_cpus->len);
> > > > +    return possible_cpus->cpus[cpu_index].props;;  
> > > 
> > > Since the pc and arm version of this are basically identical, I wonder
> > > if that should actually be the default implementation.  If we need it
> > > at all.  
> > ARM is still moving target and props are not really defined for it yet,
> > so I'd like to keep it separate for now and when it stabilizes we can think
> > about generalizing it.  
> 
> Fair enough.
> 
> Any thoughts on my more general query above
None so far.
Igor Mammedov April 20, 2017, 2:29 p.m. UTC | #7
On Tue, 28 Mar 2017 15:19:20 +1100
David Gibson <david@gibson.dropbear.id.au> wrote:

> On Wed, Mar 22, 2017 at 02:32:30PM +0100, Igor Mammedov wrote:
[...]
answering to questions that I forgot to answer before

> > @@ -1554,6 +1554,16 @@ static void virt_set_gic_version(Object *obj, const char *value, Error **errp)
> >      }
> >  }
> >  
> > +static CpuInstanceProperties
> > +virt_cpu_index_to_props(MachineState *ms, unsigned cpu_index)
> > +{
> > +    MachineClass *mc = MACHINE_GET_CLASS(ms);
> > +    const CPUArchIdList *possible_cpus = mc->possible_cpu_arch_ids(ms);
> > +
> > +    assert(cpu_index < possible_cpus->len);
> > +    return possible_cpus->cpus[cpu_index].props;;
> > +}
> > +  
> 
> It seems a bit weird to have a machine specific hook to pull the
> property information when one way or another it's coming from the
> possible_cpus table, which is already constructed by a machine
> specific hook.  Could we add a range or list of cpu_index values to
> each possible_cpus entry instead, and have a generic lookup of the
> right entry based on that?
Mainly I dislike the idea because it adds duplicate data to manage
that could be computed from already stored there CpuInstanceProperties.

And secondly if it were just 1 number then generic lookup would be trivial
but with list it becomes cumbersome to manage and implementation
larger then 3 *_cpu_index_to_props() hooks combined, it's not worth it
in foreseeable future.
 
> >  static const CPUArchIdList *virt_possible_cpu_arch_ids(MachineState *ms)
> >  {
> >      int n;
> > @@ -1573,8 +1583,12 @@ static const CPUArchIdList *virt_possible_cpu_arch_ids(MachineState *ms)
> >          ms->possible_cpus->cpus[n].props.has_thread_id = true;
> >          ms->possible_cpus->cpus[n].props.thread_id = n;
> >  
> > -        /* TODO: add 'has_node/node' here to describe
> > -           to which node core belongs */
> > +        /* default distribution of CPUs over NUMA nodes */
> > +        if (nb_numa_nodes) {
> > +            /* preset values but do not enable them i.e. 'has_node_id = false',
> > +             * board will enable them if manual mapping wasn't present on CLI */  
> 
> I'm a little confused by this comment, since I don't see any board
> code altering has_node_id.
it happens in the last 2 hunks of patch 10/23
may be I should write it like this:

+            /* preset values but do not enable them i.e. 'has_node_id = false',
+             * numa initialization code will enable them later if manual mapping
+             * wasn't present on CLI */

> 
> > +            ms->possible_cpus->cpus[n].props.node_id = n % nb_numa_nodes;;
> > +        }
> >      }
> >      return ms->possible_cpus;
> >  }
[...]
Andrew Jones April 25, 2017, 2:48 p.m. UTC | #8
On Wed, Mar 22, 2017 at 02:32:30PM +0100, Igor Mammedov wrote:
> diff --git a/hw/arm/virt.c b/hw/arm/virt.c
> index 0cbcbc1..8748d25 100644
> --- a/hw/arm/virt.c
> +++ b/hw/arm/virt.c
> @@ -1554,6 +1554,16 @@ static void virt_set_gic_version(Object *obj, const char *value, Error **errp)
>      }
>  }
>  
> +static CpuInstanceProperties
> +virt_cpu_index_to_props(MachineState *ms, unsigned cpu_index)
> +{
> +    MachineClass *mc = MACHINE_GET_CLASS(ms);
> +    const CPUArchIdList *possible_cpus = mc->possible_cpu_arch_ids(ms);
> +
> +    assert(cpu_index < possible_cpus->len);
> +    return possible_cpus->cpus[cpu_index].props;;
> +}
> +
>  static const CPUArchIdList *virt_possible_cpu_arch_ids(MachineState *ms)
>  {
>      int n;
> @@ -1573,8 +1583,12 @@ static const CPUArchIdList *virt_possible_cpu_arch_ids(MachineState *ms)
>          ms->possible_cpus->cpus[n].props.has_thread_id = true;
>          ms->possible_cpus->cpus[n].props.thread_id = n;
>  
> -        /* TODO: add 'has_node/node' here to describe
> -           to which node core belongs */
> +        /* default distribution of CPUs over NUMA nodes */
> +        if (nb_numa_nodes) {
> +            /* preset values but do not enable them i.e. 'has_node_id = false',
> +             * board will enable them if manual mapping wasn't present on CLI */
> +            ms->possible_cpus->cpus[n].props.node_id = n % nb_numa_nodes;;

extra ;

drew
Igor Mammedov April 25, 2017, 3:07 p.m. UTC | #9
On Tue, 25 Apr 2017 16:48:30 +0200
Andrew Jones <drjones@redhat.com> wrote:

> On Wed, Mar 22, 2017 at 02:32:30PM +0100, Igor Mammedov wrote:
> > diff --git a/hw/arm/virt.c b/hw/arm/virt.c
> > index 0cbcbc1..8748d25 100644
> > --- a/hw/arm/virt.c
> > +++ b/hw/arm/virt.c
> > @@ -1554,6 +1554,16 @@ static void virt_set_gic_version(Object *obj, const char *value, Error **errp)
> >      }
> >  }
> >  
> > +static CpuInstanceProperties
> > +virt_cpu_index_to_props(MachineState *ms, unsigned cpu_index)
> > +{
> > +    MachineClass *mc = MACHINE_GET_CLASS(ms);
> > +    const CPUArchIdList *possible_cpus = mc->possible_cpu_arch_ids(ms);
> > +
> > +    assert(cpu_index < possible_cpus->len);
> > +    return possible_cpus->cpus[cpu_index].props;;
> > +}
> > +
> >  static const CPUArchIdList *virt_possible_cpu_arch_ids(MachineState *ms)
> >  {
> >      int n;
> > @@ -1573,8 +1583,12 @@ static const CPUArchIdList *virt_possible_cpu_arch_ids(MachineState *ms)
> >          ms->possible_cpus->cpus[n].props.has_thread_id = true;
> >          ms->possible_cpus->cpus[n].props.thread_id = n;
> >  
> > -        /* TODO: add 'has_node/node' here to describe
> > -           to which node core belongs */
> > +        /* default distribution of CPUs over NUMA nodes */
> > +        if (nb_numa_nodes) {
> > +            /* preset values but do not enable them i.e. 'has_node_id = false',
> > +             * board will enable them if manual mapping wasn't present on CLI */
> > +            ms->possible_cpus->cpus[n].props.node_id = n % nb_numa_nodes;;
> 
> extra ;
fixed in v2

> 
> drew
diff mbox

Patch

diff --git a/include/hw/boards.h b/include/hw/boards.h
index 269d0ba..1dd0fde 100644
--- a/include/hw/boards.h
+++ b/include/hw/boards.h
@@ -74,7 +74,10 @@  typedef struct {
  *    of HotplugHandler object, which handles hotplug operation
  *    for a given @dev. It may return NULL if @dev doesn't require
  *    any actions to be performed by hotplug handler.
- * @cpu_index_to_socket_id:
+ * @cpu_index_to_instance_props:
+ *    used to provide @cpu_index to socket/core/thread number mapping, allowing
+ *    legacy code to perform maping from cpu_index to topology properties
+ *    Returns: tuple of socket/core/thread ids given cpu_index belongs to.
  *    used to provide @cpu_index to socket number mapping, allowing
  *    a machine to group CPU threads belonging to the same socket/package
  *    Returns: socket number given cpu_index belongs to.
@@ -138,7 +141,8 @@  struct MachineClass {
 
     HotplugHandler *(*get_hotplug_handler)(MachineState *machine,
                                            DeviceState *dev);
-    unsigned (*cpu_index_to_socket_id)(unsigned cpu_index);
+    CpuInstanceProperties (*cpu_index_to_instance_props)(MachineState *machine,
+                                                         unsigned cpu_index);
     const CPUArchIdList *(*possible_cpu_arch_ids)(MachineState *machine);
 };
 
diff --git a/include/sysemu/numa.h b/include/sysemu/numa.h
index 8f09dcf..46ea6c7 100644
--- a/include/sysemu/numa.h
+++ b/include/sysemu/numa.h
@@ -24,7 +24,7 @@  typedef struct node_info {
 } NodeInfo;
 
 extern NodeInfo numa_info[MAX_NODES];
-void parse_numa_opts(MachineClass *mc);
+void parse_numa_opts(MachineState *ms);
 void numa_post_machine_init(void);
 void query_numa_node_mem(uint64_t node_mem[]);
 extern QemuOptsList qemu_numa_opts;
diff --git a/hw/arm/virt.c b/hw/arm/virt.c
index 0cbcbc1..8748d25 100644
--- a/hw/arm/virt.c
+++ b/hw/arm/virt.c
@@ -1554,6 +1554,16 @@  static void virt_set_gic_version(Object *obj, const char *value, Error **errp)
     }
 }
 
+static CpuInstanceProperties
+virt_cpu_index_to_props(MachineState *ms, unsigned cpu_index)
+{
+    MachineClass *mc = MACHINE_GET_CLASS(ms);
+    const CPUArchIdList *possible_cpus = mc->possible_cpu_arch_ids(ms);
+
+    assert(cpu_index < possible_cpus->len);
+    return possible_cpus->cpus[cpu_index].props;;
+}
+
 static const CPUArchIdList *virt_possible_cpu_arch_ids(MachineState *ms)
 {
     int n;
@@ -1573,8 +1583,12 @@  static const CPUArchIdList *virt_possible_cpu_arch_ids(MachineState *ms)
         ms->possible_cpus->cpus[n].props.has_thread_id = true;
         ms->possible_cpus->cpus[n].props.thread_id = n;
 
-        /* TODO: add 'has_node/node' here to describe
-           to which node core belongs */
+        /* default distribution of CPUs over NUMA nodes */
+        if (nb_numa_nodes) {
+            /* preset values but do not enable them i.e. 'has_node_id = false',
+             * board will enable them if manual mapping wasn't present on CLI */
+            ms->possible_cpus->cpus[n].props.node_id = n % nb_numa_nodes;;
+        }
     }
     return ms->possible_cpus;
 }
@@ -1596,6 +1610,7 @@  static void virt_machine_class_init(ObjectClass *oc, void *data)
     /* We know we will never create a pre-ARMv7 CPU which needs 1K pages */
     mc->minimum_page_bits = 12;
     mc->possible_cpu_arch_ids = virt_possible_cpu_arch_ids;
+    mc->cpu_index_to_instance_props = virt_cpu_index_to_props;
 }
 
 static const TypeInfo virt_machine_info = {
diff --git a/hw/i386/pc.c b/hw/i386/pc.c
index d24388e..7031100 100644
--- a/hw/i386/pc.c
+++ b/hw/i386/pc.c
@@ -2245,12 +2245,14 @@  static void pc_machine_reset(void)
     }
 }
 
-static unsigned pc_cpu_index_to_socket_id(unsigned cpu_index)
+static CpuInstanceProperties
+pc_cpu_index_to_props(MachineState *ms, unsigned cpu_index)
 {
-    X86CPUTopoInfo topo;
-    x86_topo_ids_from_idx(smp_cores, smp_threads, cpu_index,
-                          &topo);
-    return topo.pkg_id;
+    MachineClass *mc = MACHINE_GET_CLASS(ms);
+    const CPUArchIdList *possible_cpus = mc->possible_cpu_arch_ids(ms);
+
+    assert(cpu_index < possible_cpus->len);
+    return possible_cpus->cpus[cpu_index].props;;
 }
 
 static const CPUArchIdList *pc_possible_cpu_arch_ids(MachineState *ms)
@@ -2282,6 +2284,14 @@  static const CPUArchIdList *pc_possible_cpu_arch_ids(MachineState *ms)
         ms->possible_cpus->cpus[i].props.core_id = topo.core_id;
         ms->possible_cpus->cpus[i].props.has_thread_id = true;
         ms->possible_cpus->cpus[i].props.thread_id = topo.smt_id;
+
+        /* default distribution of CPUs over NUMA nodes */
+        if (nb_numa_nodes) {
+            /* preset values but do not enable them i.e. 'has_node_id = false',
+             * board will enable them if manual mapping wasn't present on CLI */
+            ms->possible_cpus->cpus[i].props.node_id =
+                topo.pkg_id % nb_numa_nodes;
+        }
     }
     return ms->possible_cpus;
 }
@@ -2324,7 +2334,7 @@  static void pc_machine_class_init(ObjectClass *oc, void *data)
     pcmc->acpi_data_size = 0x20000 + 0x8000;
     pcmc->save_tsc_khz = true;
     mc->get_hotplug_handler = pc_get_hotpug_handler;
-    mc->cpu_index_to_socket_id = pc_cpu_index_to_socket_id;
+    mc->cpu_index_to_instance_props = pc_cpu_index_to_props;
     mc->possible_cpu_arch_ids = pc_possible_cpu_arch_ids;
     mc->has_hotpluggable_cpus = true;
     mc->default_boot_order = "cad";
diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
index 6ee566d..9dcbbcc 100644
--- a/hw/ppc/spapr.c
+++ b/hw/ppc/spapr.c
@@ -2921,11 +2921,18 @@  static HotplugHandler *spapr_get_hotplug_handler(MachineState *machine,
     return NULL;
 }
 
-static unsigned spapr_cpu_index_to_socket_id(unsigned cpu_index)
+static CpuInstanceProperties
+spapr_cpu_index_to_props(MachineState *machine, unsigned cpu_index)
 {
-    /* Allocate to NUMA nodes on a "socket" basis (not that concept of
-     * socket means much for the paravirtualized PAPR platform) */
-    return cpu_index / smp_threads / smp_cores;
+    CPUArchId *core_slot;
+    MachineClass *mc = MACHINE_GET_CLASS(machine);
+    int core_id = cpu_index / smp_threads * smp_threads;
+
+    /* make sure possible_cpu are intialized */
+    mc->possible_cpu_arch_ids(machine);
+    core_slot = spapr_find_cpu_slot(machine, core_id, NULL);
+    assert(core_slot);
+    return core_slot->props;
 }
 
 static const CPUArchIdList *spapr_possible_cpu_arch_ids(MachineState *machine)
@@ -2952,8 +2959,14 @@  static const CPUArchIdList *spapr_possible_cpu_arch_ids(MachineState *machine)
         machine->possible_cpus->cpus[i].arch_id = core_id;
         machine->possible_cpus->cpus[i].props.has_core_id = true;
         machine->possible_cpus->cpus[i].props.core_id = core_id;
-        /* TODO: add 'has_node/node' here to describe
-           to which node core belongs */
+
+        /* default distribution of CPUs over NUMA nodes */
+        if (nb_numa_nodes) {
+            /* preset values but do not enable them i.e. 'has_node_id = false',
+             * board will enable them if manual mapping wasn't present on CLI */
+            machine->possible_cpus->cpus[i].props.node_id =
+                core_id / smp_threads / smp_cores % nb_numa_nodes;
+        }
     }
     return machine->possible_cpus;
 }
@@ -3076,7 +3089,7 @@  static void spapr_machine_class_init(ObjectClass *oc, void *data)
     hc->pre_plug = spapr_machine_device_pre_plug;
     hc->plug = spapr_machine_device_plug;
     hc->unplug = spapr_machine_device_unplug;
-    mc->cpu_index_to_socket_id = spapr_cpu_index_to_socket_id;
+    mc->cpu_index_to_instance_props = spapr_cpu_index_to_props;
     mc->possible_cpu_arch_ids = spapr_possible_cpu_arch_ids;
     hc->unplug_request = spapr_machine_device_unplug_request;
 
diff --git a/numa.c b/numa.c
index e01cb54..b6e71bc 100644
--- a/numa.c
+++ b/numa.c
@@ -294,9 +294,10 @@  static void validate_numa_cpus(void)
     g_free(seen_cpus);
 }
 
-void parse_numa_opts(MachineClass *mc)
+void parse_numa_opts(MachineState *ms)
 {
     int i;
+    MachineClass *mc = MACHINE_GET_CLASS(ms);
 
     for (i = 0; i < MAX_NODES; i++) {
         numa_info[i].node_cpu = bitmap_new(max_cpus);
@@ -378,14 +379,16 @@  void parse_numa_opts(MachineClass *mc)
          * rule grouping VCPUs by socket so that VCPUs from the same socket
          * would be on the same node.
          */
+        if (!mc->cpu_index_to_instance_props) {
+            error_report("default CPUs to NUMA node mapping isn't supported");
+            exit(1);
+        }
         if (i == nb_numa_nodes) {
             for (i = 0; i < max_cpus; i++) {
-                unsigned node_id = i % nb_numa_nodes;
-                if (mc->cpu_index_to_socket_id) {
-                    node_id = mc->cpu_index_to_socket_id(i) % nb_numa_nodes;
-                }
+                CpuInstanceProperties props;
+                props = mc->cpu_index_to_instance_props(ms, i);
 
-                set_bit(i, numa_info[node_id].node_cpu);
+                set_bit(i, numa_info[props.node_id].node_cpu);
             }
         }
 
diff --git a/vl.c b/vl.c
index 0b4ed52..5ffb9c3 100644
--- a/vl.c
+++ b/vl.c
@@ -4498,7 +4498,7 @@  int main(int argc, char **argv, char **envp)
     default_drive(default_floppy, snapshot, IF_FLOPPY, 0, FD_OPTS);
     default_drive(default_sdcard, snapshot, IF_SD, 0, SD_OPTS);
 
-    parse_numa_opts(machine_class);
+    parse_numa_opts(current_machine);
 
     if (qemu_opts_foreach(qemu_find_opts("mon"),
                           mon_init_func, NULL, NULL)) {