[RFC,v3,05/24] spapr: Reorganize CPU dt generation code

Message ID 1429858066-12088-6-git-send-email-bharata@linux.vnet.ibm.com
State New

Commit Message

Bharata B Rao April 24, 2015, 6:47 a.m. UTC
Reorganize CPU device tree generation code so that it can be reused from
the hotplug path. CPU dt entries are now generated from spapr_finalize_fdt()
instead of spapr_create_fdt_skel().

Note: This is how the split-up looks now:

Boot path
---------
spapr_finalize_fdt
 spapr_populate_cpus_dt_node
  spapr_populate_cpu_dt
   spapr_fixup_cpu_numa_dt
   spapr_fixup_cpu_smt_dt

Hotplug path
------------
spapr_cpu_plug
 spapr_populate_hotplug_cpu_dt
  spapr_populate_cpu_dt
   spapr_fixup_cpu_numa_dt
   spapr_fixup_cpu_smt_dt

ibm,cas path
------------
spapr_h_cas_compose_response
 spapr_fixup_cpu_dt
  spapr_fixup_cpu_numa_dt
  spapr_fixup_cpu_smt_dt

Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com>
---
 hw/ppc/spapr.c | 278 +++++++++++++++++++++++++++++++--------------------------
 1 file changed, 153 insertions(+), 125 deletions(-)

Comments

Bharata B Rao April 26, 2015, 11:47 a.m. UTC | #1
On Fri, Apr 24, 2015 at 12:17:27PM +0530, Bharata B Rao wrote:
> Reorganize CPU device tree generation code so that it can be reused from
> the hotplug path. CPU dt entries are now generated from spapr_finalize_fdt()
> instead of spapr_create_fdt_skel().

Creating CPU DT entries from spapr_finalize_fdt() instead of
spapr_create_fdt_skel() has an interesting side effect.

Before this patch, when I boot an SMP guest with the following configuration:

-smp 4 -numa node,cpus=0-1,mem=4G,nodeid=0 -numa node,cpus=2-3,mem=4G,nodeid=1

the guest CPUs come up in the following fashion:

[root@localhost ~]# cat /proc/cpuinfo
processor	: 0
cpu dt id	: 0

processor	: 1
cpu dt id	: 8

processor	: 2
cpu dt id	: 16

processor	: 3
cpu dt id	: 24

In the above /proc/cpuinfo output, only the relevant fields are retained,
and the newly added "cpu dt id" field is essentially obtained from
arch/powerpc/include/asm/smp.h:get_hard_smp_processor_id() in the kernel.

[root@localhost ~]# lscpu
CPU(s):                4
On-line CPU(s) list:   0-3
Thread(s) per core:    1
Core(s) per socket:    1
Socket(s):             4
NUMA node(s):          2
NUMA node0 CPU(s):     0,1
NUMA node1 CPU(s):     2,3

Here CPUs 0,1 are in node0 and 2,3 are in node1 as specified. The same is
reported by the QEMU monitor below.

(qemu) info numa
2 nodes
node 0 cpus: 0 1
node 0 size: 4096 MB
node 1 cpus: 2 3
node 1 size: 4096 MB

After this patch, where CPU DT entries are built entirely in
spapr_finalize_fdt(), the CPU enumeration done by the guest kernel gets
reversed, since the guest kernel discovers the CPU DT nodes in reverse
order in arch/powerpc/kernel/setup-common.c:smp_setup_cpu_maps(). With
this, the resulting guest SMP configuration looks like this:

[root@localhost ~]# cat /proc/cpuinfo 
processor	: 0
cpu dt id	: 24  <--- was 0 previously

processor	: 1
cpu dt id	: 16  <--- was 8 previously

processor	: 2
cpu dt id	: 8   <--- was 16 previously

processor	: 3
cpu dt id	: 0   <--- was 24 previously

[root@localhost ~]# lscpu
CPU(s):                4
On-line CPU(s) list:   0-3
Thread(s) per core:    1
Core(s) per socket:    1
Socket(s):             4
NUMA node(s):          2
NUMA node0 CPU(s):     2,3 <--- node0 was supposed to have 0,1
NUMA node1 CPU(s):     0,1 <--- node1 was supposed to have 2,3

(qemu) info numa
2 nodes
node 0 cpus: 0 1
node 0 size: 4096 MB
node 1 cpus: 2 3
node 1 size: 4096 MB

This is not wrong per se, because the CPUs with the correct DT ids ended
up on the right NUMA nodes; it is just that the CPU numbers assigned by
the guest got reversed.

Is this acceptable, or will this break some userspace?

In both cases, I am adding the CPU DT nodes from QEMU in the same order,
but I am not sure why the guest kernel discovers them in different orders
in each case.

> +static void spapr_populate_cpus_dt_node(void *fdt, sPAPREnvironment *spapr)
> +{
> +    CPUState *cs;
> +    int cpus_offset;
> +    char *nodename;
> +    int smt = kvmppc_smt_threads();
> +
> +    cpus_offset = fdt_add_subnode(fdt, 0, "cpus");
> +    _FDT(cpus_offset);
> +    _FDT((fdt_setprop_cell(fdt, cpus_offset, "#address-cells", 0x1)));
> +    _FDT((fdt_setprop_cell(fdt, cpus_offset, "#size-cells", 0x0)));
> +
> +    CPU_FOREACH(cs) {
> +        PowerPCCPU *cpu = POWERPC_CPU(cs);
> +        int index = ppc_get_vcpu_dt_id(cpu);
> +        DeviceClass *dc = DEVICE_GET_CLASS(cs);
> +        int offset;
> +
> +        if ((index % smt) != 0) {
> +            continue;
> +        }
> +
> +        nodename = g_strdup_printf("%s@%x", dc->fw_name, index);
> +        offset = fdt_add_subnode(fdt, cpus_offset, nodename);
> +        g_free(nodename);
> +        _FDT(offset);
> +        spapr_populate_cpu_dt(cs, fdt, offset);
> +    }

I can simply fix this by walking the CPUs in reverse order in the above
code, which makes the guest kernel discover the CPU DT nodes in the
right order.

s/CPU_FOREACH(cs)/CPU_FOREACH_REVERSE(cs)/ will solve this problem. Would
this be the right approach, or should we just leave it to the guest kernel
to discover and enumerate CPUs in whatever order it finds the DT nodes in
the FDT?

Regards,
Bharata.
Bharata B Rao April 27, 2015, 5:36 a.m. UTC | #2
On Sun, Apr 26, 2015 at 05:17:48PM +0530, Bharata B Rao wrote:
> On Fri, Apr 24, 2015 at 12:17:27PM +0530, Bharata B Rao wrote:
> > Reorganize CPU device tree generation code so that it can be reused from
> > the hotplug path. CPU dt entries are now generated from spapr_finalize_fdt()
> > instead of spapr_create_fdt_skel().
> 
> Creating CPU DT entries from spapr_finalize_fdt() instead of
> spapr_create_fdt_skel() has an interesting side effect.
> 
> <snip> 
> 
> In both cases, I am adding the CPU DT nodes from QEMU in the same order,
> but I am not sure why the guest kernel discovers them in different orders
> in each case.

Nikunj and I tracked this down to a difference in the device tree APIs
that we are using in the two cases.

When the CPU DT nodes are created from spapr_create_fdt_skel(), we use
the fdt_begin_node() API, which does a sequential write, and hence the
CPU DT nodes end up in the same order in which they are created.

However, in my patch, when I create the CPU DT entries in
spapr_finalize_fdt(), I use fdt_add_subnode(), which ends up writing each
CPU DT node at the same parent offset for all the CPUs. This results in
the CPU DT nodes being generated in reverse order in the FDT.
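
The behaviour is easy to reproduce with a minimal standalone libfdt
program (a sketch for illustration only, not part of the patch; it
assumes a libfdt recent enough to provide fdt_create_empty_tree() and
fdt_first_subnode()/fdt_next_subnode()):

#include <stdio.h>
#include <stdint.h>
#include <libfdt.h>

int main(void)
{
    /* Keep the buffer 8-byte aligned, as libfdt expects. */
    static uint64_t storage[512];
    void *fdt = storage;
    int parent, node;

    if (fdt_create_empty_tree(fdt, sizeof(storage)) < 0)
        return 1;

    /* Create two subnodes in ascending order: @0 first, then @8. */
    parent = fdt_add_subnode(fdt, 0, "cpus");
    fdt_add_subnode(fdt, parent, "PowerPC,POWER8@0");
    fdt_add_subnode(fdt, parent, "PowerPC,POWER8@8");

    /*
     * fdt_add_subnode() places each new node right after the parent's
     * properties, i.e. before any previously added children, so this
     * walk prints @8 first and then @0: reverse creation order.
     */
    for (node = fdt_first_subnode(fdt, parent); node >= 0;
         node = fdt_next_subnode(fdt, node)) {
        printf("%s\n", fdt_get_name(fdt, node, NULL));
    }

    return 0;
}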

> 
> > +static void spapr_populate_cpus_dt_node(void *fdt, sPAPREnvironment *spapr)
> > +{
> > +    CPUState *cs;
> > +    int cpus_offset;
> > +    char *nodename;
> > +    int smt = kvmppc_smt_threads();
> > +
> > +    cpus_offset = fdt_add_subnode(fdt, 0, "cpus");
> > +    _FDT(cpus_offset);
> > +    _FDT((fdt_setprop_cell(fdt, cpus_offset, "#address-cells", 0x1)));
> > +    _FDT((fdt_setprop_cell(fdt, cpus_offset, "#size-cells", 0x0)));
> > +
> > +    CPU_FOREACH(cs) {
> > +        PowerPCCPU *cpu = POWERPC_CPU(cs);
> > +        int index = ppc_get_vcpu_dt_id(cpu);
> > +        DeviceClass *dc = DEVICE_GET_CLASS(cs);
> > +        int offset;
> > +
> > +        if ((index % smt) != 0) {
> > +            continue;
> > +        }
> > +
> > +        nodename = g_strdup_printf("%s@%x", dc->fw_name, index);
> > +        offset = fdt_add_subnode(fdt, cpus_offset, nodename);
> > +        g_free(nodename);
> > +        _FDT(offset);
> > +        spapr_populate_cpu_dt(cs, fdt, offset);
> > +    }
> 
> I can simply fix this by walking the CPUs in reverse order in the above
> code, which makes the guest kernel discover the CPU DT nodes in the
> right order.
> 
> s/CPU_FOREACH(cs)/CPU_FOREACH_REVERSE(cs)/ will solve this problem. Would
> this be the right approach, or should we just leave it to the guest kernel
> to discover and enumerate CPUs in whatever order it finds the DT nodes in
> the FDT?

So using CPU_FOREACH_REVERSE(cs) appears to be the right way to handle this.
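
For concreteness, the loop in spapr_populate_cpus_dt_node() would then
read as below (the body is the same as in the patch; only the iteration
macro changes):

    CPU_FOREACH_REVERSE(cs) {
        PowerPCCPU *cpu = POWERPC_CPU(cs);
        int index = ppc_get_vcpu_dt_id(cpu);
        DeviceClass *dc = DEVICE_GET_CLASS(cs);
        int offset;

        if ((index % smt) != 0) {
            continue;
        }

        /* fdt_add_subnode() prepends new children, so walking the
         * vCPUs in reverse yields DT nodes in ascending order again. */
        nodename = g_strdup_printf("%s@%x", dc->fw_name, index);
        offset = fdt_add_subnode(fdt, cpus_offset, nodename);
        g_free(nodename);
        _FDT(offset);
        spapr_populate_cpu_dt(cs, fdt, offset);
    }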

Regards,
Bharata.
David Gibson May 4, 2015, 11:59 a.m. UTC | #3
On Fri, Apr 24, 2015 at 12:17:27PM +0530, Bharata B Rao wrote:
> Reorganize CPU device tree generation code so that it can be reused from
> the hotplug path. CPU dt entries are now generated from spapr_finalize_fdt()
> instead of spapr_create_fdt_skel().
> 
> Note: This is how the split-up looks now:
> 
> Boot path
> ---------
> spapr_finalize_fdt
>  spapr_populate_cpus_dt_node
>   spapr_populate_cpu_dt
>    spapr_fixup_cpu_numa_dt
>    spapr_fixup_cpu_smt_dt
> 
> Hotplug path
> ------------
> spapr_cpu_plug
>  spapr_populate_hotplug_cpu_dt
>   spapr_populate_cpu_dt
>    spapr_fixup_cpu_numa_dt
>    spapr_fixup_cpu_smt_dt
> 
> ibm,cas path
> ------------
> spapr_h_cas_compose_response
>  spapr_fixup_cpu_dt
>   spapr_fixup_cpu_numa_dt
>   spapr_fixup_cpu_smt_dt
> 
> Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com>

Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
David Gibson May 4, 2015, 12:01 p.m. UTC | #4
On Mon, Apr 27, 2015 at 11:06:07AM +0530, Bharata B Rao wrote:
> On Sun, Apr 26, 2015 at 05:17:48PM +0530, Bharata B Rao wrote:
> > On Fri, Apr 24, 2015 at 12:17:27PM +0530, Bharata B Rao wrote:
> > > Reorganize CPU device tree generation code so that it can be reused from
> > > the hotplug path. CPU dt entries are now generated from spapr_finalize_fdt()
> > > instead of spapr_create_fdt_skel().
> > 
> > Creating CPU DT entries from spapr_finalize_fdt() instead of
> > spapr_create_fdt_skel() has an interesting side effect.
> > 
> > <snip> 
> > 
> > In both cases, I am adding the CPU DT nodes from QEMU in the same order,
> > but I am not sure why the guest kernel discovers them in different orders
> > in each case.
> 
> Nikunj and I tracked this down to a difference in the device tree APIs
> that we are using in the two cases.
> 
> When the CPU DT nodes are created from spapr_create_fdt_skel(), we use
> the fdt_begin_node() API, which does a sequential write, and hence the
> CPU DT nodes end up in the same order in which they are created.
> 
> However, in my patch, when I create the CPU DT entries in
> spapr_finalize_fdt(), I use fdt_add_subnode(), which ends up writing each
> CPU DT node at the same parent offset for all the CPUs. This results in
> the CPU DT nodes being generated in reverse order in the FDT.
> 
> > 
> > > +static void spapr_populate_cpus_dt_node(void *fdt, sPAPREnvironment *spapr)
> > > +{
> > > +    CPUState *cs;
> > > +    int cpus_offset;
> > > +    char *nodename;
> > > +    int smt = kvmppc_smt_threads();
> > > +
> > > +    cpus_offset = fdt_add_subnode(fdt, 0, "cpus");
> > > +    _FDT(cpus_offset);
> > > +    _FDT((fdt_setprop_cell(fdt, cpus_offset, "#address-cells", 0x1)));
> > > +    _FDT((fdt_setprop_cell(fdt, cpus_offset, "#size-cells", 0x0)));
> > > +
> > > +    CPU_FOREACH(cs) {
> > > +        PowerPCCPU *cpu = POWERPC_CPU(cs);
> > > +        int index = ppc_get_vcpu_dt_id(cpu);
> > > +        DeviceClass *dc = DEVICE_GET_CLASS(cs);
> > > +        int offset;
> > > +
> > > +        if ((index % smt) != 0) {
> > > +            continue;
> > > +        }
> > > +
> > > +        nodename = g_strdup_printf("%s@%x", dc->fw_name, index);
> > > +        offset = fdt_add_subnode(fdt, cpus_offset, nodename);
> > > +        g_free(nodename);
> > > +        _FDT(offset);
> > > +        spapr_populate_cpu_dt(cs, fdt, offset);
> > > +    }
> > 
> > I can simply fix this by walking the CPUs in reverse order in the above
> > code, which makes the guest kernel discover the CPU DT nodes in the
> > right order.
> > 
> > s/CPU_FOREACH(cs)/CPU_FOREACH_REVERSE(cs)/ will solve this problem. Would
> > this be the right approach, or should we just leave it to the guest kernel
> > to discover and enumerate CPUs in whatever order it finds the DT nodes in
> > the FDT?
> 
> So using CPU_FOREACH_REVERSE(cs) appears to be the right way to handle this.

Yes, I think so.  In theory it shouldn't matter, but I think it's
safer to retain the device tree order.
Patch

diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
index 4e72f26..a56f9a1 100644
--- a/hw/ppc/spapr.c
+++ b/hw/ppc/spapr.c
@@ -206,6 +206,27 @@  static int spapr_fixup_cpu_smt_dt(void *fdt, int offset, PowerPCCPU *cpu,
     return ret;
 }
 
+static int spapr_fixup_cpu_numa_dt(void *fdt, int offset, CPUState *cs)
+{
+    int ret = 0;
+    PowerPCCPU *cpu = POWERPC_CPU(cs);
+    int index = ppc_get_vcpu_dt_id(cpu);
+    uint32_t associativity[] = {cpu_to_be32(0x5),
+                                cpu_to_be32(0x0),
+                                cpu_to_be32(0x0),
+                                cpu_to_be32(0x0),
+                                cpu_to_be32(cs->numa_node),
+                                cpu_to_be32(index)};
+
+    /* Advertise NUMA via ibm,associativity */
+    if (nb_numa_nodes > 1) {
+        ret = fdt_setprop(fdt, offset, "ibm,associativity", associativity,
+                          sizeof(associativity));
+    }
+
+    return ret;
+}
+
 static int spapr_fixup_cpu_dt(void *fdt, sPAPREnvironment *spapr)
 {
     int ret = 0, offset, cpus_offset;
@@ -218,12 +239,6 @@  static int spapr_fixup_cpu_dt(void *fdt, sPAPREnvironment *spapr)
         PowerPCCPU *cpu = POWERPC_CPU(cs);
         DeviceClass *dc = DEVICE_GET_CLASS(cs);
         int index = ppc_get_vcpu_dt_id(cpu);
-        uint32_t associativity[] = {cpu_to_be32(0x5),
-                                    cpu_to_be32(0x0),
-                                    cpu_to_be32(0x0),
-                                    cpu_to_be32(0x0),
-                                    cpu_to_be32(cs->numa_node),
-                                    cpu_to_be32(index)};
 
         if ((index % smt) != 0) {
             continue;
@@ -247,16 +262,13 @@  static int spapr_fixup_cpu_dt(void *fdt, sPAPREnvironment *spapr)
             }
         }
 
-        if (nb_numa_nodes > 1) {
-            ret = fdt_setprop(fdt, offset, "ibm,associativity", associativity,
-                              sizeof(associativity));
-            if (ret < 0) {
-                return ret;
-            }
+        ret = fdt_setprop(fdt, offset, "ibm,pft-size",
+                      pft_size_prop, sizeof(pft_size_prop));
+        if (ret < 0) {
+            return ret;
         }
 
-        ret = fdt_setprop(fdt, offset, "ibm,pft-size",
-                          pft_size_prop, sizeof(pft_size_prop));
+        ret = spapr_fixup_cpu_numa_dt(fdt, offset, cs);
         if (ret < 0) {
             return ret;
         }
@@ -341,18 +353,13 @@  static void *spapr_create_fdt_skel(hwaddr initrd_base,
                                    uint32_t epow_irq)
 {
     void *fdt;
-    CPUState *cs;
     uint32_t start_prop = cpu_to_be32(initrd_base);
     uint32_t end_prop = cpu_to_be32(initrd_base + initrd_size);
     GString *hypertas = g_string_sized_new(256);
     GString *qemu_hypertas = g_string_sized_new(256);
     uint32_t refpoints[] = {cpu_to_be32(0x4), cpu_to_be32(0x4)};
     uint32_t interrupt_server_ranges_prop[] = {0, cpu_to_be32(max_cpus)};
-    int smt = kvmppc_smt_threads();
     unsigned char vec5[] = {0x0, 0x0, 0x0, 0x0, 0x0, 0x80};
-    QemuOpts *opts = qemu_opts_find(qemu_find_opts("smp-opts"), NULL);
-    unsigned sockets = opts ? qemu_opt_get_number(opts, "sockets", 0) : 0;
-    uint32_t cpus_per_socket = sockets ? (smp_cpus / sockets) : 1;
     char *buf;
 
     add_str(hypertas, "hcall-pft");
@@ -441,107 +448,6 @@  static void *spapr_create_fdt_skel(hwaddr initrd_base,
 
     _FDT((fdt_end_node(fdt)));
 
-    /* cpus */
-    _FDT((fdt_begin_node(fdt, "cpus")));
-
-    _FDT((fdt_property_cell(fdt, "#address-cells", 0x1)));
-    _FDT((fdt_property_cell(fdt, "#size-cells", 0x0)));
-
-    CPU_FOREACH(cs) {
-        PowerPCCPU *cpu = POWERPC_CPU(cs);
-        CPUPPCState *env = &cpu->env;
-        DeviceClass *dc = DEVICE_GET_CLASS(cs);
-        PowerPCCPUClass *pcc = POWERPC_CPU_GET_CLASS(cs);
-        int index = ppc_get_vcpu_dt_id(cpu);
-        char *nodename;
-        uint32_t segs[] = {cpu_to_be32(28), cpu_to_be32(40),
-                           0xffffffff, 0xffffffff};
-        uint32_t tbfreq = kvm_enabled() ? kvmppc_get_tbfreq() : TIMEBASE_FREQ;
-        uint32_t cpufreq = kvm_enabled() ? kvmppc_get_clockfreq() : 1000000000;
-        uint32_t page_sizes_prop[64];
-        size_t page_sizes_prop_size;
-
-        if ((index % smt) != 0) {
-            continue;
-        }
-
-        nodename = g_strdup_printf("%s@%x", dc->fw_name, index);
-
-        _FDT((fdt_begin_node(fdt, nodename)));
-
-        g_free(nodename);
-
-        _FDT((fdt_property_cell(fdt, "reg", index)));
-        _FDT((fdt_property_string(fdt, "device_type", "cpu")));
-
-        _FDT((fdt_property_cell(fdt, "cpu-version", env->spr[SPR_PVR])));
-        _FDT((fdt_property_cell(fdt, "d-cache-block-size",
-                                env->dcache_line_size)));
-        _FDT((fdt_property_cell(fdt, "d-cache-line-size",
-                                env->dcache_line_size)));
-        _FDT((fdt_property_cell(fdt, "i-cache-block-size",
-                                env->icache_line_size)));
-        _FDT((fdt_property_cell(fdt, "i-cache-line-size",
-                                env->icache_line_size)));
-
-        if (pcc->l1_dcache_size) {
-            _FDT((fdt_property_cell(fdt, "d-cache-size", pcc->l1_dcache_size)));
-        } else {
-            fprintf(stderr, "Warning: Unknown L1 dcache size for cpu\n");
-        }
-        if (pcc->l1_icache_size) {
-            _FDT((fdt_property_cell(fdt, "i-cache-size", pcc->l1_icache_size)));
-        } else {
-            fprintf(stderr, "Warning: Unknown L1 icache size for cpu\n");
-        }
-
-        _FDT((fdt_property_cell(fdt, "timebase-frequency", tbfreq)));
-        _FDT((fdt_property_cell(fdt, "clock-frequency", cpufreq)));
-        _FDT((fdt_property_cell(fdt, "ibm,slb-size", env->slb_nr)));
-        _FDT((fdt_property_string(fdt, "status", "okay")));
-        _FDT((fdt_property(fdt, "64-bit", NULL, 0)));
-
-        if (env->spr_cb[SPR_PURR].oea_read) {
-            _FDT((fdt_property(fdt, "ibm,purr", NULL, 0)));
-        }
-
-        if (env->mmu_model & POWERPC_MMU_1TSEG) {
-            _FDT((fdt_property(fdt, "ibm,processor-segment-sizes",
-                               segs, sizeof(segs))));
-        }
-
-        /* Advertise VMX/VSX (vector extensions) if available
-         *   0 / no property == no vector extensions
-         *   1               == VMX / Altivec available
-         *   2               == VSX available */
-        if (env->insns_flags & PPC_ALTIVEC) {
-            uint32_t vmx = (env->insns_flags2 & PPC2_VSX) ? 2 : 1;
-
-            _FDT((fdt_property_cell(fdt, "ibm,vmx", vmx)));
-        }
-
-        /* Advertise DFP (Decimal Floating Point) if available
-         *   0 / no property == no DFP
-         *   1               == DFP available */
-        if (env->insns_flags2 & PPC2_DFP) {
-            _FDT((fdt_property_cell(fdt, "ibm,dfp", 1)));
-        }
-
-        page_sizes_prop_size = create_page_sizes_prop(env, page_sizes_prop,
-                                                      sizeof(page_sizes_prop));
-        if (page_sizes_prop_size) {
-            _FDT((fdt_property(fdt, "ibm,segment-page-sizes",
-                               page_sizes_prop, page_sizes_prop_size)));
-        }
-
-        _FDT((fdt_property_cell(fdt, "ibm,chip-id",
-                                cs->cpu_index / cpus_per_socket)));
-
-        _FDT((fdt_end_node(fdt)));
-    }
-
-    _FDT((fdt_end_node(fdt)));
-
     /* RTAS */
     _FDT((fdt_begin_node(fdt, "rtas")));
 
@@ -739,6 +645,131 @@  static int spapr_populate_memory(sPAPREnvironment *spapr, void *fdt)
     return 0;
 }
 
+static void spapr_populate_cpu_dt(CPUState *cs, void *fdt, int offset)
+{
+    PowerPCCPU *cpu = POWERPC_CPU(cs);
+    CPUPPCState *env = &cpu->env;
+    PowerPCCPUClass *pcc = POWERPC_CPU_GET_CLASS(cs);
+    int index = ppc_get_vcpu_dt_id(cpu);
+    uint32_t segs[] = {cpu_to_be32(28), cpu_to_be32(40),
+                       0xffffffff, 0xffffffff};
+    uint32_t tbfreq = kvm_enabled() ? kvmppc_get_tbfreq() : TIMEBASE_FREQ;
+    uint32_t cpufreq = kvm_enabled() ? kvmppc_get_clockfreq() : 1000000000;
+    uint32_t page_sizes_prop[64];
+    size_t page_sizes_prop_size;
+    QemuOpts *opts = qemu_opts_find(qemu_find_opts("smp-opts"), NULL);
+    unsigned sockets = opts ? qemu_opt_get_number(opts, "sockets", 0) : 0;
+    uint32_t cpus_per_socket = sockets ? (smp_cpus / sockets) : 1;
+    uint32_t pft_size_prop[] = {0, cpu_to_be32(spapr->htab_shift)};
+
+    _FDT((fdt_setprop_cell(fdt, offset, "reg", index)));
+    _FDT((fdt_setprop_string(fdt, offset, "device_type", "cpu")));
+
+    _FDT((fdt_setprop_cell(fdt, offset, "cpu-version", env->spr[SPR_PVR])));
+    _FDT((fdt_setprop_cell(fdt, offset, "d-cache-block-size",
+                            env->dcache_line_size)));
+    _FDT((fdt_setprop_cell(fdt, offset, "d-cache-line-size",
+                            env->dcache_line_size)));
+    _FDT((fdt_setprop_cell(fdt, offset, "i-cache-block-size",
+                            env->icache_line_size)));
+    _FDT((fdt_setprop_cell(fdt, offset, "i-cache-line-size",
+                            env->icache_line_size)));
+
+    if (pcc->l1_dcache_size) {
+        _FDT((fdt_setprop_cell(fdt, offset, "d-cache-size",
+                             pcc->l1_dcache_size)));
+    } else {
+        fprintf(stderr, "Warning: Unknown L1 dcache size for cpu\n");
+    }
+    if (pcc->l1_icache_size) {
+        _FDT((fdt_setprop_cell(fdt, offset, "i-cache-size",
+                             pcc->l1_icache_size)));
+    } else {
+        fprintf(stderr, "Warning: Unknown L1 icache size for cpu\n");
+    }
+
+    _FDT((fdt_setprop_cell(fdt, offset, "timebase-frequency", tbfreq)));
+    _FDT((fdt_setprop_cell(fdt, offset, "clock-frequency", cpufreq)));
+    _FDT((fdt_setprop_cell(fdt, offset, "ibm,slb-size", env->slb_nr)));
+    _FDT((fdt_setprop_string(fdt, offset, "status", "okay")));
+    _FDT((fdt_setprop(fdt, offset, "64-bit", NULL, 0)));
+
+    if (env->spr_cb[SPR_PURR].oea_read) {
+        _FDT((fdt_setprop(fdt, offset, "ibm,purr", NULL, 0)));
+    }
+
+    if (env->mmu_model & POWERPC_MMU_1TSEG) {
+        _FDT((fdt_setprop(fdt, offset, "ibm,processor-segment-sizes",
+                           segs, sizeof(segs))));
+    }
+
+    /* Advertise VMX/VSX (vector extensions) if available
+     *   0 / no property == no vector extensions
+     *   1               == VMX / Altivec available
+     *   2               == VSX available */
+    if (env->insns_flags & PPC_ALTIVEC) {
+        uint32_t vmx = (env->insns_flags2 & PPC2_VSX) ? 2 : 1;
+
+        _FDT((fdt_setprop_cell(fdt, offset, "ibm,vmx", vmx)));
+    }
+
+    /* Advertise DFP (Decimal Floating Point) if available
+     *   0 / no property == no DFP
+     *   1               == DFP available */
+    if (env->insns_flags2 & PPC2_DFP) {
+        _FDT((fdt_setprop_cell(fdt, offset, "ibm,dfp", 1)));
+    }
+
+    page_sizes_prop_size = create_page_sizes_prop(env, page_sizes_prop,
+                                                  sizeof(page_sizes_prop));
+    if (page_sizes_prop_size) {
+        _FDT((fdt_setprop(fdt, offset, "ibm,segment-page-sizes",
+                           page_sizes_prop, page_sizes_prop_size)));
+    }
+
+    _FDT((fdt_setprop_cell(fdt, offset, "ibm,chip-id",
+                            cs->cpu_index / cpus_per_socket)));
+
+    _FDT((fdt_setprop(fdt, offset, "ibm,pft-size",
+                      pft_size_prop, sizeof(pft_size_prop))));
+
+    _FDT(spapr_fixup_cpu_numa_dt(fdt, offset, cs));
+
+    _FDT(spapr_fixup_cpu_smt_dt(fdt, offset, cpu,
+                                 ppc_get_compat_smt_threads(cpu)));
+}
+
+static void spapr_populate_cpus_dt_node(void *fdt, sPAPREnvironment *spapr)
+{
+    CPUState *cs;
+    int cpus_offset;
+    char *nodename;
+    int smt = kvmppc_smt_threads();
+
+    cpus_offset = fdt_add_subnode(fdt, 0, "cpus");
+    _FDT(cpus_offset);
+    _FDT((fdt_setprop_cell(fdt, cpus_offset, "#address-cells", 0x1)));
+    _FDT((fdt_setprop_cell(fdt, cpus_offset, "#size-cells", 0x0)));
+
+    CPU_FOREACH(cs) {
+        PowerPCCPU *cpu = POWERPC_CPU(cs);
+        int index = ppc_get_vcpu_dt_id(cpu);
+        DeviceClass *dc = DEVICE_GET_CLASS(cs);
+        int offset;
+
+        if ((index % smt) != 0) {
+            continue;
+        }
+
+        nodename = g_strdup_printf("%s@%x", dc->fw_name, index);
+        offset = fdt_add_subnode(fdt, cpus_offset, nodename);
+        g_free(nodename);
+        _FDT(offset);
+        spapr_populate_cpu_dt(cs, fdt, offset);
+    }
+
+}
+
 static void spapr_finalize_fdt(sPAPREnvironment *spapr,
                                hwaddr fdt_addr,
                                hwaddr rtas_addr,
@@ -782,11 +813,8 @@  static void spapr_finalize_fdt(sPAPREnvironment *spapr,
         fprintf(stderr, "Couldn't set up RTAS device tree properties\n");
     }
 
-    /* Advertise NUMA via ibm,associativity */
-    ret = spapr_fixup_cpu_dt(fdt, spapr);
-    if (ret < 0) {
-        fprintf(stderr, "Couldn't finalize CPU device tree properties\n");
-    }
+    /* cpus */
+    spapr_populate_cpus_dt_node(fdt, spapr);
 
     bootlist = get_boot_devices_list(&cb, true);
     if (cb && bootlist) {