@@ -896,10 +896,19 @@ static void __init find_possible_nodes(void)
if (!rtas)
return;
- if (of_property_read_u32_index(rtas,
- "ibm,max-associativity-domains",
+ if (of_property_read_u32_index(rtas, "ibm,current-associativity-domains",
+ min_common_depth, &numnodes)) {
+ /*
+ * ibm,current-associativity-domains is a fairly recent
+ * property. If it doesn't exist, then fallback on
+ * ibm,max-associativity-domains. Current denotes what the
+ * platform can support compared to max which denotes what the
+ * Hypervisor can support.
+ */
+ if (of_property_read_u32_index(rtas, "ibm,max-associativity-domains",
min_common_depth, &numnodes))
- goto out;
+ goto out;
+ }
for (i = 0; i < numnodes; i++) {
if (!node_possible(i))
As per PAPR, there are 2 device tree property ibm,max-associativity-domains (which defines the maximum number of domains that the firmware i.e PowerVM can support) and ibm,current-associativity-domains (which defines the maximum number of domains that the platform can support). Value of ibm,max-associativity-domains property is always greater than or equal to ibm,current-associativity-domains property. Powerpc currently uses ibm,max-associativity-domains property while setting the possible number of nodes. This is currently set at 32. However the possible number of nodes for a platform may be significantly less. Hence set the possible number of nodes based on ibm,current-associativity-domains property. $ lsprop /proc/device-tree/rtas/ibm,*associ*-domains /proc/device-tree/rtas/ibm,current-associativity-domains 00000005 00000001 00000002 00000002 00000002 00000010 /proc/device-tree/rtas/ibm,max-associativity-domains 00000005 00000001 00000008 00000020 00000020 00000100 $ cat /sys/devices/system/node/possible ##Before patch 0-31 $ cat /sys/devices/system/node/possible ##After patch 0-1 Note the maximum nodes this platform can support is only 2 but the possible nodes is set to 32. This is important because lot of kernel and user space code allocate structures for all possible nodes leading to a lot of memory that is allocated but not used. I ran a simple experiment to create and destroy 100 memory cgroups on boot on a 8 node machine (Power8 Alpine). Before patch free -k at boot total used free shared buff/cache available Mem: 523498176 4106816 518820608 22272 570752 516606720 Swap: 4194240 0 4194240 free -k after creating 100 memory cgroups total used free shared buff/cache available Mem: 523498176 4628416 518246464 22336 623296 516058688 Swap: 4194240 0 4194240 free -k after destroying 100 memory cgroups total used free shared buff/cache available Mem: 523498176 4697408 518173760 22400 627008 515987904 Swap: 4194240 0 4194240 After patch free -k at boot total used free shared buff/cache available Mem: 523498176 3969472 518933888 22272 594816 516731776 Swap: 4194240 0 4194240 free -k after creating 100 memory cgroups total used free shared buff/cache available Mem: 523498176 4181888 518676096 22208 640192 516496448 Swap: 4194240 0 4194240 free -k after destroying 100 memory cgroups total used free shared buff/cache available Mem: 523498176 4232320 518619904 22272 645952 516443264 Swap: 4194240 0 4194240 Observations: Fixed kernel takes 137344 kb (4106816-3969472) less to boot. Fixed kernel takes 309184 kb (4628416-4181888-137344) less to create 100 memcgs. Cc: Nathan Lynch <nathanl@linux.ibm.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: linuxppc-dev@lists.ozlabs.org Cc: Anton Blanchard <anton@ozlabs.org> Cc: Bharata B Rao <bharata@linux.ibm.com> Cc: Tyrel Datwyler <tyreld@linux.ibm.com> Signed-off-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com> --- v1: https://lore.kernel.org/linuxppc-dev/20200706064002.14848-1-srikar@linux.vnet.ibm.com/t/#u Changelog v1->v2: Fallback to ibm,max-associativity-domains if ibm,current-associativity is not available. Suggested by Tyrel Datwyler. arch/powerpc/mm/numa.c | 15 ++++++++++++--- 1 file changed, 12 insertions(+), 3 deletions(-)