[RFC] ppc/spapr: support sparse NUMA node numbering

Message ID 20140624010833.GF4323@linux.vnet.ibm.com
State New

Commit Message

Nishanth Aravamudan June 24, 2014, 1:08 a.m. UTC
With generic sparse NUMA node parsing in place ("numa: enable sparse
node numbering"), ppc can be updated to iterate over only the
user-specified nodes.

qemu-system-ppc64 -machine pseries,accel=kvm,usb=off -m 4096
-realtime mlock=off -numa node,nodeid=3 -numa node,nodeid=2 -smp 4

Before:

info numa:
node 0 cpus: 0 2
node 0 size: 2048 MB
node 1 cpus: 1 3
node 1 size: 2048 MB

numactl --hardware:
available: 2 nodes (0-1)
node 0 cpus: 0 2
node 0 size: 2027 MB
node 0 free: 1875 MB
node 1 cpus: 1 3
node 1 size: 2045 MB
node 1 free: 1980 MB
node distances:
node   0   1
  0:  10  40
  1:  40  10

After:

info numa:
node 2 cpus: 0 2
node 2 size: 2048 MB
node 3 cpus: 1 3
node 3 size: 2048 MB

numactl --hardware:
available: 3 nodes (0,2-3)
node 0 cpus:
node 0 size: 0 MB
node 0 free: 0 MB
node 2 cpus: 0 2
node 2 size: 2027 MB
node 2 free: 1943 MB
node 3 cpus: 1 3
node 3 size: 2045 MB
node 3 free: 1904 MB
node distances:
node   0   2   3
  0:  10  40  40
  2:  40  10  40
  3:  40  40  10

Note: the empty node 0 comes from the guest's Linux kernel, not from
QEMU; "info numa" above lists only nodes 2 and 3.

Signed-off-by: Nishanth Aravamudan <nacc@linux.vnet.ibm.com>
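
For illustration, the "iterate over only the user-specified nodes" idea can be sketched in standalone C as below. This is a simplified sketch, not the QEMU code: `NodeInfo`, `layout_sparse_nodes`, and the `limit` parameter are hypothetical names, and `limit` is assumed to be one past the highest node ID the user specified.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for per-node NUMA state. */
typedef struct NodeInfo {
    bool present;       /* true if the user specified this node ID */
    uint64_t node_mem;  /* memory assigned to the node */
} NodeInfo;

/*
 * Assign a start address to every present node with an ID in [1, limit),
 * packing the nodes back to back after node 0's memory. Absent IDs are
 * skipped and their entries in starts[] are left untouched.
 * Returns the first address past the last present node.
 */
static uint64_t layout_sparse_nodes(const NodeInfo *info, int limit,
                                    uint64_t node0_size, uint64_t starts[])
{
    uint64_t mem_start = node0_size;

    assert(info != NULL && starts != NULL);

    for (int i = 1; i < limit; i++) {
        if (!info[i].present) {
            continue;   /* hole in the node ID space: nothing to lay out */
        }
        starts[i] = mem_start;
        mem_start += info[i].node_mem;
    }
    return mem_start;
}
```

With nodes 2 and 3 present (as in the command line above) and `limit` set to 4, node 1 is skipped and nodes 2 and 3 are laid out contiguously. Note that the loop bound has to cover the highest node ID, not just the count of nodes, or the last node would be missed.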

Comments

Nishanth Aravamudan June 24, 2014, 3:34 a.m. UTC | #1
Hi David,

On 23.06.2014 [18:08:33 -0700], Nishanth Aravamudan wrote:
> With generic sparse NUMA node parsing in place ("numa: enable sparse
> node numbering"), ppc can be updated to iterate over only the
> user-specified nodes.
> 
> qemu-system-ppc64 -machine pseries,accel=kvm,usb=off -m 4096
> -realtime mlock=off -numa node,nodeid=3 -numa node,nodeid=2 -smp 4

I'd like to do something similar for x86 as a follow-on patch, but I
don't really know what the SRAT should look like, or what the APIC
mapping should be, to have sparse NUMA numbering. Some cursory googling
indicates you've run into this in practice? Possibly with a bad APIC?

Thanks,
Nish

Patch

diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
index 82f183f..d07857a 100644
--- a/hw/ppc/spapr.c
+++ b/hw/ppc/spapr.c
@@ -707,7 +707,10 @@  static int spapr_populate_memory(sPAPREnvironment *spapr, void *fdt)
 
     /* RAM: Node 1 and beyond */
     mem_start = node0_size;
-    for (i = 1; i < nb_numa_nodes; i++) {
+    for (i = 1; i < max_numa_node; i++) {
+        if (!numa_info[i].present) {
+            continue;
+        }
         mem_reg_property[0] = cpu_to_be64(mem_start);
         if (mem_start >= ram_size) {
             node_size = 0;