[RFC,RESEND] core/affinity: Fix NUMA node associativity on P8 and P8NVL

Submitted by Gavin Shan on Feb. 10, 2017, 6:04 a.m.

Details

Message ID: 1486706683-16657-1-git-send-email-gwshan@linux.vnet.ibm.com
State: New

Commit Message

Gavin Shan Feb. 10, 2017, 6:04 a.m.
We have variable distances from Node-A to Node-B on different machines
as below:

   Machine    CPU        A <-> A     A <-> B
   ------------------------------------------
   Tuleta     POWER8E    10          20
   Firestone  POWER8     10          40
   Garrison   POWER8NVL  10          40

This sets the distance between Node-A and Node-B to 20 on all POWER8
platforms, which is the value defined by the Linux kernel. After the
patch is applied, we have:

   Machine    CPU        A <-> A     A <-> B
   ------------------------------------------
   Tuleta     POWER8E    10          20
   Firestone  POWER8     10          20
   Garrison   POWER8NVL  10          20

Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
---
 core/affinity.c | 8 ++++++--
 1 file changed, 6 insertions(+), 2 deletions(-)
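
The 10/20/40 values in the tables above are consistent with a distance
that starts at 10 and doubles once for every reference-point level at
which two nodes sit in different domains. Below is a minimal sketch of
that rule, assuming this is roughly how Linux derives distances from
"ibm,associativity-reference-points". It is a simplified model, not the
kernel source; model_node_distance(), ASSOC_CELLS and the 4-cell layout
are hypothetical names chosen for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define LOCAL_DISTANCE 10   /* distance from a node to itself */
#define ASSOC_CELLS    4    /* domains per node in this model (assumption) */

/*
 * Hypothetical helper: distance between two nodes, each described by an
 * associativity array of ASSOC_CELLS domain IDs. ref_points holds 1-based
 * indices into those arrays, first NUMA level first.
 */
static int model_node_distance(const uint32_t a[ASSOC_CELLS],
			       const uint32_t b[ASSOC_CELLS],
			       const int *ref_points, size_t depth)
{
	int distance = LOCAL_DISTANCE;
	size_t i;

	for (i = 0; i < depth; i++) {
		int idx = ref_points[i];

		assert(idx >= 1 && idx <= ASSOC_CELLS);

		/* Same domain at this level: stop, nothing more to add. */
		if (a[idx - 1] == b[idx - 1])
			break;

		/* Different domain at this level: double the distance. */
		distance *= 2;
	}

	return distance;
}
```

In this model, two chips that differ only in cell 4 come out at 40 with
reference points {0x4, 0x4} and at 20 with {0x4, 0x3}, provided cell 3
holds the same value on both chips. The second result therefore depends
on what cell 3 actually contains on a given machine.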

Comments

Benjamin Herrenschmidt Feb. 10, 2017, 7:04 a.m.
On Fri, 2017-02-10 at 17:04 +1100, Gavin Shan wrote:
> We have variable distances from Node-A to Node-B on different machines
> as below:
> 
>    Machine    CPU        A <-> A     A <-> B
>    ------------------------------------------
>    Tuleta     POWER8E    10          20
>    Firestone  POWER8     10          40
>    Garrison   POWER8NVL  10          40
> 
> This fixes the distance between Node-A and Node-B to 20 for all POWER8
> platforms, which is the value defined by Linux kernel. After the patch
> is applied, we have:
> 
>    Machine    CPU        A <-> A     A <-> B
>    ------------------------------------------
>    Tuleta     POWER8E    10          20
>    Firestone  POWER8     10          20
>    Garrison   POWER8NVL  10          20

I need to understand how the kernel calculates this...

Your patch makes it effectively use a property (module ID) that, I
think, we don't have in the DT on these machines, or do we?

There's more problems than that btw, Google's been using a hack on Rhesus
too and P9 is weird.

I don't completely understand how Linux does it either, so we should
discuss next week and come up with a proper fix that works for P9 too.

Cheers,
Ben.

> Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
> ---
>  core/affinity.c | 8 ++++++--
>  1 file changed, 6 insertions(+), 2 deletions(-)
> 
> diff --git a/core/affinity.c b/core/affinity.c
> index 9f489d3..64c1fda 100644
> --- a/core/affinity.c
> +++ b/core/affinity.c
> @@ -67,6 +67,7 @@ static uint32_t get_chip_node_id(struct proc_chip *chip)
>  
>  void add_associativity_ref_point(void)
>  {
> +	unsigned long pvr = mfspr(SPR_PVR);
>  	int ref2 = 0x4;
>  
>  	/*
> @@ -82,9 +83,12 @@ void add_associativity_ref_point(void)
>  	 * as a second level of NUMA.
>  	 *
>  	 * If there is a way to obtain this information from the FSP
> -	 * that would be ideal, but for now hardwire our POWER8E setting.
> +	 * that would be ideal, but for now hardwire our POWER8E, POWER8
> +	 * and POWER8NVL setting.
>  	 */
> -	if (PVR_TYPE(mfspr(SPR_PVR)) == PVR_TYPE_P8E)
> +	if (PVR_TYPE(pvr) == PVR_TYPE_P8E ||
> +	    PVR_TYPE(pvr) == PVR_TYPE_P8  ||
> +	    PVR_TYPE(pvr) == PVR_TYPE_P8NVL)
>  		ref2 = 0x3;
>  
>  	dt_add_property_cells(opal_node, "ibm,associativity-reference-points",

Patch

diff --git a/core/affinity.c b/core/affinity.c
index 9f489d3..64c1fda 100644
--- a/core/affinity.c
+++ b/core/affinity.c
@@ -67,6 +67,7 @@  static uint32_t get_chip_node_id(struct proc_chip *chip)
 
 void add_associativity_ref_point(void)
 {
+	unsigned long pvr = mfspr(SPR_PVR);
 	int ref2 = 0x4;
 
 	/*
@@ -82,9 +83,12 @@  void add_associativity_ref_point(void)
 	 * as a second level of NUMA.
 	 *
 	 * If there is a way to obtain this information from the FSP
-	 * that would be ideal, but for now hardwire our POWER8E setting.
+	 * that would be ideal, but for now hardwire our POWER8E, POWER8
+	 * and POWER8NVL setting.
 	 */
-	if (PVR_TYPE(mfspr(SPR_PVR)) == PVR_TYPE_P8E)
+	if (PVR_TYPE(pvr) == PVR_TYPE_P8E ||
+	    PVR_TYPE(pvr) == PVR_TYPE_P8  ||
+	    PVR_TYPE(pvr) == PVR_TYPE_P8NVL)
 		ref2 = 0x3;
 
 	dt_add_property_cells(opal_node, "ibm,associativity-reference-points",
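
The selection logic in this hunk reads mfspr(SPR_PVR) directly, so it
cannot be exercised off-target. One way to check the three-way
condition on a host is to factor it into a pure function that takes the
PVR as an argument. The sketch below does that under stated
assumptions: pick_ref2() is a hypothetical helper that is not part of
the patch, and the PVR_TYPE() definition and PVR_TYPE_* values are
local stand-ins that I believe match skiboot's headers but that should
be checked against the tree.

```c
#include <stdint.h>

/* Assumed definitions; verify against skiboot's headers before reuse. */
#define PVR_TYPE(pvr)    (((pvr) >> 16) & 0xffff)
#define PVR_TYPE_P8E     0x004b
#define PVR_TYPE_P8NVL   0x004c
#define PVR_TYPE_P8      0x004d

/*
 * Hypothetical pure helper: return the second reference point that the
 * patched add_associativity_ref_point() would choose for a given PVR.
 */
static int pick_ref2(unsigned long pvr)
{
	switch (PVR_TYPE(pvr)) {
	case PVR_TYPE_P8E:
	case PVR_TYPE_P8:
	case PVR_TYPE_P8NVL:
		return 0x3;	/* value hardwired by the patch for POWER8 */
	default:
		return 0x4;	/* unchanged default for everything else */
	}
}
```

A switch keeps the three POWER8 variants visibly grouped and leaves a
single default path for other processors, which is the behaviour the
hunk above preserves.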