Patchwork [2/2] cpc925_edac: support single-processor configurations

Submitter Dmitry Eremin-Solenikov
Date May 21, 2011, 11:33 a.m.
Message ID <1305977629-26648-1-git-send-email-dbaryshkov@gmail.com>
Permalink /patch/96692/
State Superseded

Comments

Dmitry Eremin-Solenikov - May 21, 2011, 11:33 a.m.
If the second CPU is not enabled, the CPC925 EDAC driver will spill out warnings
about errors on the second Processor Interface. Support masking that out
by detecting at runtime which CPUs are present in the device tree.

Signed-off-by: Dmitry Eremin-Solenikov <dbaryshkov@gmail.com>
Cc: Harry Ciao <qingtao.cao@windriver.com>
Cc: Doug Thompson <dougthompson@xmission.com>
---

Oops, please use this one instead; the previous version contained one extra debug line.

 drivers/edac/cpc925_edac.c |   44 ++++++++++++++++++++++++++++++++++++++++++++
 1 files changed, 44 insertions(+), 0 deletions(-)
Segher Boessenkool - May 21, 2011, 1:45 p.m.
> If the second CPU is not enabled, the CPC925 EDAC driver will spill out warnings
> about errors on the second Processor Interface. Support masking that out
> by detecting at runtime which CPUs are present in the device tree.

That doesn't quite work, there can be multiple CPUs per processor interface.
You should be able to see which interfaces are enabled in some CPC925
register, but maybe both _are_ enabled on your system (although one is not
connected), which is causing the errors?


Segher
Dmitry Eremin-Solenikov - May 21, 2011, 6:15 p.m.
On 5/21/11, Segher Boessenkool <segher@kernel.crashing.org> wrote:
>> If the second CPU is not enabled, the CPC925 EDAC driver will spill out warnings
>> about errors on the second Processor Interface. Support masking that out
>> by detecting at runtime which CPUs are present in the device tree.
>
> That doesn't quite work, there can be multiple CPUs per processor interface.

Are you sure that there can be multiple CPUs on one PI with the CPC925?
(The CPC945 isn't supported by this driver anyway, IIUC.)

> You should be able to see which interfaces are enabled in some CPC925
> register, but maybe both _are_ enabled on your system (although one is not
> connected), which is causing the errors?

Hmm, I don't think this is the case: I'm using a MapleD board with two CPUs
connected to separate PIs. However, I can tell the service processor
to enable only one CPU by selecting the right bootscript. In this case the
bootscript correctly enables only APIMASK_ADI0. However, as cpc925_edac
checks APIEXCP itself, it sees the APIEXCP_ADI1 bit set and spills
regular warnings about it (see below).
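To make the filtering step concrete, here is a minimal user-space sketch of the decision the patch adds to cpc925_cpu_check(): ignore a fault when every set APIEXCP bit belongs to an interface with no CPU behind it. The CPC925_BIT definition (bit 0 as the most significant bit), the APIEXCP_ADI(n) macro and the helper name pi_fault_is_reportable() are assumptions made for illustration, not code taken from the driver.

```c
#include <stdint.h>

/* Assumption: IBM-style numbering, bit 0 is the MSB of the 32-bit register. */
#define CPC925_BIT(nr)	(UINT32_C(1) << (31 - (nr)))
#define APIMASK_ADI(n)	CPC925_BIT((n) + 1)	/* as defined in the patch */
#define APIEXCP_ADI(n)	CPC925_BIT((n) + 1)	/* assumed to share bit positions */

/*
 * Hypothetical helper: returns 1 if at least one exception bit remains
 * after removing the bits of interfaces that have no CPU attached.
 */
static int pi_fault_is_reportable(uint32_t apiexcp, uint32_t absent_mask)
{
	return (apiexcp & ~absent_mask) != 0;
}
```

With only the first CPU present, absent_mask would be APIMASK_ADI(1), so an APIEXCP value with only the ADI1 bit set is dropped, while a fault on ADI0 is still reported.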

If you'd prefer I can add a check for APIMASK at cpc925_cpu_init() time,
but I think that this will be less robust.
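One possible reading of the init-time alternative is sketched below: take a snapshot of APIMASK before the driver changes it, and treat any ADI bit that firmware left cleared as an interface to ignore. This is only an interpretation of the suggestion; CPC925_BIT (bit 0 as the most significant bit) and the helper name ignore_mask_from_apimask() are assumptions for illustration.

```c
#include <stdint.h>

/* Assumption: IBM-style numbering, bit 0 is the MSB of the 32-bit register. */
#define CPC925_BIT(nr)	(UINT32_C(1) << (31 - (nr)))
#define APIMASK_ADI(n)	CPC925_BIT((n) + 1)	/* as defined in the patch */

/*
 * Hypothetical helper: given the APIMASK value found at init time,
 * return the ADI bits that were NOT enabled, i.e. the interfaces
 * whose exceptions would be ignored later.
 */
static uint32_t ignore_mask_from_apimask(uint32_t apimask_at_init)
{
	const uint32_t all_adi = APIMASK_ADI(0) | APIMASK_ADI(1);

	return all_adi & ~apimask_at_init;
}
```

This variant trusts whatever firmware programmed into APIMASK, and the snapshot has to be taken before the driver sets its own enable bits. Whichever source the mask comes from, clearing a bit only affects the hardware once the new value is written back to the register.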
Segher Boessenkool - May 21, 2011, 8:04 p.m.
>>> If the second CPU is not enabled, the CPC925 EDAC driver will spill out warnings
>>> about errors on the second Processor Interface. Support masking that out
>>> by detecting at runtime which CPUs are present in the device tree.
>>
>> That doesn't quite work, there can be multiple CPUs per processor interface.
>
> Are you sure that there can be multiple CPUs on one PI with CPC925
> (CPC945 isn't supported by this driver anyway, IIUC).

I do not know any board that actually uses this.  And, hrm, you cannot
use 970MP with CPC925 if I remember correctly.

It's still better to look at which processor interfaces are working correctly,
though.  But given that this is essentially a dead platform, I'm okay with
this hack, if it works ;-)

>> You should be able to see which interfaces are enabled in some CPC925
>> register, but maybe both _are_ enabled on your system (although one is
>> not connected), which is causing the errors?
>
> Hmm, I don't think this is the case: I'm using a MapleD board with two CPUs
> connected to separate PIs. However, I can tell the service processor
> to enable only one CPU by selecting the right bootscript. In this case the
> bootscript correctly enables only APIMASK_ADI0. However, as cpc925_edac
> checks APIEXCP itself, it sees the APIEXCP_ADI1 bit set and spills
> regular warnings about it (see below).

(no below :-) )

I think the service processor left that processor interface enabled (the
interface itself, not the exception stuff), so the exception thing will
signal exceptions any time the CPC925 sends snoops to that second
processor.  This also might reduce performance.

Or maybe it is normal for the exception thing to signal errors on disabled
interfaces.

> If you'd prefer I can add a check for APIMASK at cpc925_cpu_init() time,
> but I think that this will be less robust.

Yeah that's less robust, for sure.

Just keep what you have, but add a big fat comment that you are assuming
the processor interface id is identical to the MPIC processor id :-)
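The assumption called out here can be shown in a small user-space sketch of the mask computation, with the "reg" values of the CPU nodes passed in as a plain array instead of being read from the device tree. CPC925_BIT (bit 0 as the most significant bit), NUM_PI, the helper absent_pi_mask() and the example arrays are all hypothetical names for illustration; the range check is an addition that the posted patch does not have.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumption: IBM-style numbering, bit 0 is the MSB of the 32-bit register. */
#define CPC925_BIT(nr)	(UINT32_C(1) << (31 - (nr)))
#define APIMASK_ADI(n)	CPC925_BIT((n) + 1)	/* as defined in the patch */
#define NUM_PI		2u			/* processor interfaces considered here */

/* Example inputs: "reg" values as they might appear under /cpus. */
static const uint32_t example_one_cpu[] = { 0 };
static const uint32_t example_two_cpus[] = { 0, 1 };

/*
 * ASSUMPTION: every CPU sits on its own processor interface, and the
 * "reg" value of a CPU node (its MPIC processor id) equals the index
 * of that interface. If that does not hold, the mask is wrong.
 *
 * Returns the APIMASK_ADI bits of interfaces with no CPU present.
 */
static uint32_t absent_pi_mask(const uint32_t *regs, size_t count)
{
	uint32_t mask = APIMASK_ADI(0) | APIMASK_ADI(1);
	size_t i;

	for (i = 0; i < count; i++) {
		if (regs[i] >= NUM_PI)
			continue;	/* id cannot map to a known interface */
		mask &= ~APIMASK_ADI(regs[i]);
	}

	return mask;
}
```

When both CPUs are present the result is 0, which is also the value the patch's cache check treats as "not computed yet", so in that case the lookup would simply run again on every call.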

Did you test disabling physical CPU #0 as well?


Segher
Dmitry Eremin-Solenikov - May 22, 2011, 9:12 a.m.
On Sun, May 22, 2011 at 12:04 AM, Segher Boessenkool
<segher@kernel.crashing.org> wrote:
>>> You should be able to see which interfaces are enabled in some CPC925
>>> register,
>>> but maybe both _are_ enabled on your system (although one is not
>>> connected),
>>> which is causing the errors?
>>
>> Hmm, I don't think this is the case: I'm using a MapleD board with two CPUs
>> connected to separate PIs. However, I can tell the service processor
>> to enable only one CPU by selecting the right bootscript. In this case the
>> bootscript correctly enables only APIMASK_ADI0. However, as cpc925_edac
>> checks APIEXCP itself, it sees the APIEXCP_ADI1 bit set and spills
>> regular warnings about it (see below).
>
> (no below :-) )

Sorry, here it goes:

EDAC CPC925: Processor Interface Fault
Processor Interface register dump:
EDAC CPC925: APIMASK            0xdea00000
EDAC CPC925: APIEXCP            0x20000000
EDAC DEVICE0: INTERNAL ERROR: instance 0 'block' out of range (0 >= 0)
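The two register values in this dump can be decoded with a short user-space sketch. It assumes that CPC925_BIT numbers bits from the most significant end (bit 0 is the MSB) and that the APIEXCP ADI bits sit at the same positions as the APIMASK ones; both assumptions are consistent with the values shown, but the macros and helper names below are written for illustration and are not copied from the driver.

```c
#include <stdint.h>
#include <stdio.h>

/* Assumption: IBM-style numbering, bit 0 is the MSB of the 32-bit register. */
#define CPC925_BIT(nr)	(UINT32_C(1) << (31 - (nr)))
#define APIMASK_ADI(n)	CPC925_BIT((n) + 1)	/* as defined in the patch */
#define APIEXCP_ADI(n)	CPC925_BIT((n) + 1)	/* assumed to mirror APIMASK */

/* Returns 1 if exceptions for processor interface n are enabled in APIMASK. */
static int pi_enabled(uint32_t apimask, unsigned int n)
{
	return (apimask & APIMASK_ADI(n)) != 0;
}

/* Returns 1 if APIEXCP reports an exception on processor interface n. */
static int pi_faulted(uint32_t apiexcp, unsigned int n)
{
	return (apiexcp & APIEXCP_ADI(n)) != 0;
}

/* Print a one-line summary per processor interface. */
static void decode_dump(uint32_t apimask, uint32_t apiexcp)
{
	unsigned int n;

	for (n = 0; n < 2; n++)
		printf("PI%u: mask %s, exception %s\n", n,
		       pi_enabled(apimask, n) ? "enabled" : "disabled",
		       pi_faulted(apiexcp, n) ? "set" : "clear");
}
```

Under these assumptions, calling decode_dump(0xdea00000, 0x20000000) reports PI0 as enabled with no exception and PI1 as disabled with its exception bit set, which matches the situation described in the thread.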

> I think the service processor left that processor interface enabled (the
> interface itself, not the exception stuff), so the exception thing will
> signal exceptions any time the CPC925 sends snoops to that second
> processor.  This also might reduce performance.
>
> Or maybe it is normal for the exception thing to signal errors on disabled
> interfaces.

I only have the U4 manual, so I can't be sure about U3H. And the U4 manual is
also unclear about the ADI1 exception.

>> If you'd prefer I can add a check for APIMASK at cpc925_cpu_init() time,
>> but I think that this will be less robust.
>
> Yeah that's less robust, for sure.
>
> Just keep what you have, but add a big fat comment that you are assuming
> the processor interface id is identical to the MPIC processor id :-)

sure

> Did you test disabling physical CPU #0 as well?

No. I still don't have _that_ level of understanding of PIBS boot scripts.

Patch

diff --git a/drivers/edac/cpc925_edac.c b/drivers/edac/cpc925_edac.c
index 837ad8f..5bbe766 100644
--- a/drivers/edac/cpc925_edac.c
+++ b/drivers/edac/cpc925_edac.c
@@ -90,6 +90,7 @@  enum apimask_bits {
 	ECC_MASK_ENABLE = (APIMASK_ECC_UE_H | APIMASK_ECC_CE_H |
 			   APIMASK_ECC_UE_L | APIMASK_ECC_CE_L),
 };
+#define APIMASK_ADI(n)		CPC925_BIT(((n)+1))
 
 /************************************************************
  *	Processor Interface Exception Register (APIEXCP)
@@ -581,16 +582,56 @@  static void cpc925_mc_check(struct mem_ctl_info *mci)
 }
 
 /******************** CPU err device********************************/
+static u32 cpc925_cpu_getmask(void)
+{
+	struct device_node *cpus;
+	struct device_node *cpunode;
+	static u32 mask = 0;
+
+	if (mask != 0)
+		return mask;
+
+	mask = APIMASK_ADI0 | APIMASK_ADI1;
+
+	cpus = of_find_node_by_path("/cpus");
+	if (cpus == NULL) {
+		cpc925_printk(KERN_DEBUG, "No /cpus node !\n");
+		return 0;
+	}
+
+	/* Visit every child of /cpus; clear the bit of each CPU found */
+	for (cpunode = NULL;
+	     (cpunode = of_get_next_child(cpus, cpunode)) != NULL;) {
+		const u32 *reg = of_get_property(cpunode, "reg", NULL);
+
+		if (!strcmp(cpunode->type, "cpu") && reg != NULL)
+			mask &= ~APIMASK_ADI(*reg);
+	}
+
+	of_node_put(cpunode);
+	of_node_put(cpus);
+
+	return mask;
+}
+
 /* Enable CPU Errors detection */
 static void cpc925_cpu_init(struct cpc925_dev_info *dev_info)
 {
 	u32 apimask;
+	u32 cpumask;
 
 	apimask = __raw_readl(dev_info->vbase + REG_APIMASK_OFFSET);
 	if ((apimask & CPU_MASK_ENABLE) == 0) {
 		apimask |= CPU_MASK_ENABLE;
 		__raw_writel(apimask, dev_info->vbase + REG_APIMASK_OFFSET);
 	}
+
+	cpumask = cpc925_cpu_getmask();
+	if (apimask & cpumask) {
+		cpc925_printk(KERN_WARNING, "CPU(s) not present, "
+				"but enabled in APIMASK, disabling\n");
+		apimask &= ~cpumask;
+	}
 }
 
 /* Disable CPU Errors detection */
@@ -622,6 +663,9 @@  static void cpc925_cpu_check(struct edac_device_ctl_info *edac_dev)
 	if ((apiexcp & CPU_EXCP_DETECTED) == 0)
 		return;
 
+	if ((apiexcp & ~cpc925_cpu_getmask()) == 0)
+		return;
+
 	apimask = __raw_readl(dev_info->vbase + REG_APIMASK_OFFSET);
 	cpc925_printk(KERN_INFO, "Processor Interface Fault\n"
 				 "Processor Interface register dump:\n");