Patchwork [BUG] powerpc: fix numa distance for form0 device tree

Submitter Vaidyanathan Srinivasan
Date March 22, 2013, 3:49 p.m.
Message ID <20130322154935.GA26283@dirshya.in.ibm.com>
Permalink /patch/230103/
State Accepted
Commit 7122beeee7bc1757682049780179d7c216dd1c83
Delegated to: Michael Ellerman

Comments

Vaidyanathan Srinivasan - March 22, 2013, 3:49 p.m.
powerpc: fix numa distance for form0 device tree
    
    The following commit breaks NUMA distance setup for old powerpc
    systems that use form0 encoding in the device tree.
    
        commit 41eab6f88f24124df89e38067b3766b7bef06ddb
        powerpc/numa: Use form 1 affinity to setup node distance
    
    The device tree node /rtas/ibm,associativity-reference-points
    indexes into /cpus/PowerPCxxxx/ibm,associativity according to the
    form0 or form1 encoding, which is detected from the
    ibm,architecture-vec-5 property.
    
    All modern systems use form1, and the current kernel code is correct
    for them.  However, on older systems with form0 encoding, the NUMA
    distance gets hard coded as LOCAL_DISTANCE for all nodes.  This
    causes a task scheduling anomaly, since the scheduler skips building
    the NUMA level domain (the topmost domain with all cpus) if all NUMA
    distances are the same.  (The value of 'level' in sched_init_numa()
    remains 0.)
    
    Prior to the above commit:
    #define node_distance(from,to) \
    	((from) == (to) ? LOCAL_DISTANCE : REMOTE_DISTANCE)
    
    This patch restores compatible behavior for old powerpc systems
    whose device tree encodes NUMA distances as form0.

    Signed-off-by: Vaidyanathan Srinivasan <svaidy@linux.vnet.ibm.com>
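
The effect described in the commit message can be illustrated with a
small standalone sketch.  It is not kernel code: the function names
(form0_distance_before, form0_distance_after, count_remote_levels) are
hypothetical, the values 10 and 20 are assumed to match LOCAL_DISTANCE
and REMOTE_DISTANCE as reported by numactl in this thread, and the
counting loop is only a simplified model of how sched_init_numa()
derives 'level' from the distinct distances it sees.

```c
#include <assert.h>

/* Assumed values; they match the 10/20 reported by numactl -H. */
#define LOCAL_DISTANCE  10
#define REMOTE_DISTANCE 20

/* form0 path before the fix: every pair of nodes looks local. */
int form0_distance_before(int a, int b)
{
	(void)a;
	(void)b;
	return LOCAL_DISTANCE;
}

/* form0 path after the fix: only a node compared with itself is local. */
int form0_distance_after(int a, int b)
{
	return (a == b) ? LOCAL_DISTANCE : REMOTE_DISTANCE;
}

/*
 * Simplified model (an assumption, not the kernel algorithm verbatim):
 * count the distinct distances that are larger than the local distance.
 * A result of 0 means no NUMA level would be built.
 */
int count_remote_levels(int nr_nodes, int (*dist)(int, int))
{
	int curr, levels = 0;

	assert(nr_nodes > 0);
	curr = dist(0, 0);

	for (;;) {
		int next = curr;

		/* Find the smallest distance strictly greater than curr. */
		for (int i = 0; i < nr_nodes; i++) {
			for (int j = 0; j < nr_nodes; j++) {
				int d = dist(i, j);

				if (d > curr && (next == curr || d < next))
					next = d;
			}
		}
		if (next == curr)
			break;	/* no larger distance left */
		levels++;
		curr = next;
	}
	return levels;
}
```

Calling count_remote_levels(4, form0_distance_before) in this model
gives 0 levels, while passing form0_distance_after gives 1, which
corresponds to the single NUMA level expected on a form0 system.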
Vaidyanathan Srinivasan - March 22, 2013, 3:56 p.m.
* Vaidyanathan Srinivasan <svaidy@linux.vnet.ibm.com> [2013-03-22 21:19:35]:

[snip]

>     Prior to the above commit:
>     #define node_distance(from,to)
>     	((from) == (to) ? LOCAL_DISTANCE : REMOTE_DISTANCE)
>     
>     Restoring compatible behavior with this patch for old powerpc systems
>     with device tree where numa distance are encoded as form0.

This patch on v3.9-rc3 has been tested on multi-node POWER7 with
different device tree combinations.

With the patch, numactl -H shows local distance '10' for the same node
and remote distance '20' for other nodes.  This ensures that the NUMA
level sched domain gets built and that load balancing can work across
such configurations.

--Vaidy
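
As a rough check of the tested result, the distance matrix expected on
a form0 system can be built in a few lines.  This is a sketch under
stated assumptions: MAX_NODES, fill_form0_distances and
has_remote_distance are hypothetical names chosen for illustration, and
10 and 20 are the values quoted above from numactl -H.

```c
#include <assert.h>

#define LOCAL_DISTANCE  10
#define REMOTE_DISTANCE 20
#define MAX_NODES       8	/* hypothetical limit for this sketch */

/* Fill m[i][j] with the distances the patched form0 path returns. */
void fill_form0_distances(int nr_nodes, int m[MAX_NODES][MAX_NODES])
{
	assert(nr_nodes > 0 && nr_nodes <= MAX_NODES);

	for (int i = 0; i < nr_nodes; i++)
		for (int j = 0; j < nr_nodes; j++)
			m[i][j] = (i == j) ? LOCAL_DISTANCE : REMOTE_DISTANCE;
}

/* Return 1 if any pair of nodes is farther apart than LOCAL_DISTANCE. */
int has_remote_distance(int nr_nodes, int m[MAX_NODES][MAX_NODES])
{
	assert(nr_nodes > 0 && nr_nodes <= MAX_NODES);

	for (int i = 0; i < nr_nodes; i++)
		for (int j = 0; j < nr_nodes; j++)
			if (m[i][j] > LOCAL_DISTANCE)
				return 1;
	return 0;
}
```

For two nodes the matrix has 10 on the diagonal and 20 elsewhere, and
has_remote_distance() returns 1.  For a single node it returns 0, since
there is no remote node to be distant from.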

Patch

diff --git a/arch/powerpc/mm/numa.c b/arch/powerpc/mm/numa.c
index bba87ca..6a252c4 100644
--- a/arch/powerpc/mm/numa.c
+++ b/arch/powerpc/mm/numa.c
@@ -201,7 +201,7 @@  int __node_distance(int a, int b)
 	int distance = LOCAL_DISTANCE;
 
 	if (!form1_affinity)
-		return distance;
+		return ((a == b) ? LOCAL_DISTANCE : REMOTE_DISTANCE);
 
 	for (i = 0; i < distance_ref_points_depth; i++) {
 		if (distance_lookup_table[a][i] == distance_lookup_table[b][i])