diff mbox series

[RFC] phb4: Make verbose EEH output less verbose

Message ID 20180731084854.27802-1-oohall@gmail.com
State RFC
Headers show
Series [RFC] phb4: Make verbose EEH output less verbose | expand

Checks

Context Check Description
snowpatch_ozlabs/apply_patch success master/apply_patch Successfully applied
snowpatch_ozlabs/make_check success Test make_check on branch master

Commit Message

Oliver O'Halloran July 31, 2018, 8:48 a.m. UTC
Each PHB4 component has a error status register and a set of
component-specific registers that describe that error in more detail.
If the status register of a component has no bits set there's not much
point in dumping the error detail registers.

This patch modifies the verbose EEH dumping code so that it only
dumps the error registers of a component if there are bits set in the
error status register of that component. This limits the output to the
registers that are actually relevant to the error that occured which
makes interpreting the dumps slightly easier.

Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
---
I've been using this in my debug builds for a while, but I'm not sure
if it's a great idea since sometimes other people want to full dump.
It's probably fine...
---
 hw/phb4.c | 88 ++++++++++++++++++++++++++++++++++++++++-----------------------
 1 file changed, 56 insertions(+), 32 deletions(-)

Comments

Stewart Smith Aug. 1, 2018, 7:20 a.m. UTC | #1
"Oliver O'Halloran" <oohall@gmail.com> writes:
> Each PHB4 component has a error status register and a set of
> component-specific registers that describe that error in more detail.
> If the status register of a component has no bits set there's not much
> point in dumping the error detail registers.
>
> This patch modifies the verbose EEH dumping code so that it only
> dumps the error registers of a component if there are bits set in the
> error status register of that component. This limits the output to the
> registers that are actually relevant to the error that occured which
> makes interpreting the dumps slightly easier.
>
> Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
> ---
> I've been using this in my debug builds for a while, but I'm not sure
> if it's a great idea since sometimes other people want to full dump.
> It's probably fine...

It does look probably fine. Maybe Ben or Russell has thoughts?
Russell Currey Aug. 2, 2018, 2:04 a.m. UTC | #2
On Tue, 2018-07-31 at 18:48 +1000, Oliver O'Halloran wrote:
> Each PHB4 component has a error status register and a set of
> component-specific registers that describe that error in more detail.
> If the status register of a component has no bits set there's not
> much
> point in dumping the error detail registers.
> 
> This patch modifies the verbose EEH dumping code so that it only
> dumps the error registers of a component if there are bits set in the
> error status register of that component. This limits the output to
> the
> registers that are actually relevant to the error that occured which
> makes interpreting the dumps slightly easier.
> 
> Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
> ---
> I've been using this in my debug builds for a while, but I'm not sure
> if it's a great idea since sometimes other people want to full dump.
> It's probably fine...

This is the same thing the kernel does, I don't see any reason why we
wouldn't do it here.  The kernel also puts multiple registers on the
same line to save space, this does require looking at the code to
remember what order the regs are in but if you're decoding this stuff
that's the least of your problems so *shrug*

In any case,

Acked-by: Russell Currey <ruscur@russell.cc>
Stewart Smith Aug. 3, 2018, 3:15 a.m. UTC | #3
Russell Currey <ruscur@russell.cc> writes:
> On Tue, 2018-07-31 at 18:48 +1000, Oliver O'Halloran wrote:
>> Each PHB4 component has a error status register and a set of
>> component-specific registers that describe that error in more detail.
>> If the status register of a component has no bits set there's not
>> much
>> point in dumping the error detail registers.
>> 
>> This patch modifies the verbose EEH dumping code so that it only
>> dumps the error registers of a component if there are bits set in the
>> error status register of that component. This limits the output to
>> the
>> registers that are actually relevant to the error that occured which
>> makes interpreting the dumps slightly easier.
>> 
>> Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
>> ---
>> I've been using this in my debug builds for a while, but I'm not sure
>> if it's a great idea since sometimes other people want to full dump.
>> It's probably fine...
>
> This is the same thing the kernel does, I don't see any reason why we
> wouldn't do it here.  The kernel also puts multiple registers on the
> same line to save space, this does require looking at the code to
> remember what order the regs are in but if you're decoding this stuff
> that's the least of your problems so *shrug*
>
> In any case,
>
> Acked-by: Russell Currey <ruscur@russell.cc>

I think this means I can make good on my joke^W promise to merge some of
Oliver's RFC patches at 4:59pm on a friday before he takes some time
off? :)
Russell Currey Aug. 3, 2018, 3:17 a.m. UTC | #4
On Fri, 2018-08-03 at 13:15 +1000, Stewart Smith wrote:
> Russell Currey <ruscur@russell.cc> writes:
> > On Tue, 2018-07-31 at 18:48 +1000, Oliver O'Halloran wrote:
> > > Each PHB4 component has a error status register and a set of
> > > component-specific registers that describe that error in more
> > > detail.
> > > If the status register of a component has no bits set there's not
> > > much
> > > point in dumping the error detail registers.
> > > 
> > > This patch modifies the verbose EEH dumping code so that it only
> > > dumps the error registers of a component if there are bits set in
> > > the
> > > error status register of that component. This limits the output
> > > to
> > > the
> > > registers that are actually relevant to the error that occured
> > > which
> > > makes interpreting the dumps slightly easier.
> > > 
> > > Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
> > > ---
> > > I've been using this in my debug builds for a while, but I'm not
> > > sure
> > > if it's a great idea since sometimes other people want to full
> > > dump.
> > > It's probably fine...
> > 
> > This is the same thing the kernel does, I don't see any reason why
> > we
> > wouldn't do it here.  The kernel also puts multiple registers on
> > the
> > same line to save space, this does require looking at the code to
> > remember what order the regs are in but if you're decoding this
> > stuff
> > that's the least of your problems so *shrug*
> > 
> > In any case,
> > 
> > Acked-by: Russell Currey <ruscur@russell.cc>
> 
> I think this means I can make good on my joke^W promise to merge some
> of
> Oliver's RFC patches at 4:59pm on a friday before he takes some time
> off? :)
> 
To be fair, this one is probably the least of your problems...
Oliver O'Halloran Aug. 3, 2018, 3:33 a.m. UTC | #5
On Fri, Aug 3, 2018 at 1:15 PM, Stewart Smith <stewart@linux.ibm.com> wrote:
> Russell Currey <ruscur@russell.cc> writes:
>> On Tue, 2018-07-31 at 18:48 +1000, Oliver O'Halloran wrote:
>>> Each PHB4 component has a error status register and a set of
>>> component-specific registers that describe that error in more detail.
>>> If the status register of a component has no bits set there's not
>>> much
>>> point in dumping the error detail registers.
>>>
>>> This patch modifies the verbose EEH dumping code so that it only
>>> dumps the error registers of a component if there are bits set in the
>>> error status register of that component. This limits the output to
>>> the
>>> registers that are actually relevant to the error that occured which
>>> makes interpreting the dumps slightly easier.
>>>
>>> Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
>>> ---
>>> I've been using this in my debug builds for a while, but I'm not sure
>>> if it's a great idea since sometimes other people want to full dump.
>>> It's probably fine...
>>
>> This is the same thing the kernel does, I don't see any reason why we
>> wouldn't do it here.  The kernel also puts multiple registers on the
>> same line to save space, this does require looking at the code to
>> remember what order the regs are in but if you're decoding this stuff
>> that's the least of your problems so *shrug*
>>
>> In any case,
>>
>> Acked-by: Russell Currey <ruscur@russell.cc>
>
> I think this means I can make good on my joke^W promise to merge some of
> Oliver's RFC patches at 4:59pm on a friday before he takes some time
> off? :)

I think tradition requires you to be drunk when merging RFC patches,
so 4:59 might be a bit early.

>
> --
> Stewart Smith
> OPAL Architect, IBM.
>
diff mbox series

Patch

diff --git a/hw/phb4.c b/hw/phb4.c
index f5953a78a051..bb382395026e 100644
--- a/hw/phb4.c
+++ b/hw/phb4.c
@@ -2158,38 +2158,62 @@  static void phb4_eeh_dump_regs(struct phb4 *p)
 	PHBERR(p, "                  lemFir = %016llx\n", s->lemFir);
 	PHBERR(p, "            lemErrorMask = %016llx\n", s->lemErrorMask);
 	PHBERR(p, "                  lemWOF = %016llx\n", s->lemWOF);
-	PHBERR(p, "          phbErrorStatus = %016llx\n", s->phbErrorStatus);
-	PHBERR(p, "     phbFirstErrorStatus = %016llx\n", s->phbFirstErrorStatus);
-	PHBERR(p, "            phbErrorLog0 = %016llx\n", s->phbErrorLog0);
-	PHBERR(p, "            phbErrorLog1 = %016llx\n", s->phbErrorLog1);
-	PHBERR(p, "       phbTxeErrorStatus = %016llx\n", s->phbTxeErrorStatus);
-	PHBERR(p, "  phbTxeFirstErrorStatus = %016llx\n", s->phbTxeFirstErrorStatus);
-	PHBERR(p, "         phbTxeErrorLog0 = %016llx\n", s->phbTxeErrorLog0);
-	PHBERR(p, "         phbTxeErrorLog1 = %016llx\n", s->phbTxeErrorLog1);
-	PHBERR(p, "    phbRxeArbErrorStatus = %016llx\n", s->phbRxeArbErrorStatus);
-	PHBERR(p, "phbRxeArbFrstErrorStatus = %016llx\n", s->phbRxeArbFirstErrorStatus);
-	PHBERR(p, "      phbRxeArbErrorLog0 = %016llx\n", s->phbRxeArbErrorLog0);
-	PHBERR(p, "      phbRxeArbErrorLog1 = %016llx\n", s->phbRxeArbErrorLog1);
-	PHBERR(p, "    phbRxeMrgErrorStatus = %016llx\n", s->phbRxeMrgErrorStatus);
-	PHBERR(p, "phbRxeMrgFrstErrorStatus = %016llx\n", s->phbRxeMrgFirstErrorStatus);
-	PHBERR(p, "      phbRxeMrgErrorLog0 = %016llx\n", s->phbRxeMrgErrorLog0);
-	PHBERR(p, "      phbRxeMrgErrorLog1 = %016llx\n", s->phbRxeMrgErrorLog1);
-	PHBERR(p, "    phbRxeTceErrorStatus = %016llx\n", s->phbRxeTceErrorStatus);
-	PHBERR(p, "phbRxeTceFrstErrorStatus = %016llx\n", s->phbRxeTceFirstErrorStatus);
-	PHBERR(p, "      phbRxeTceErrorLog0 = %016llx\n", s->phbRxeTceErrorLog0);
-	PHBERR(p, "      phbRxeTceErrorLog1 = %016llx\n", s->phbRxeTceErrorLog1);
-	PHBERR(p, "       phbPblErrorStatus = %016llx\n", s->phbPblErrorStatus);
-	PHBERR(p, "  phbPblFirstErrorStatus = %016llx\n", s->phbPblFirstErrorStatus);
-	PHBERR(p, "         phbPblErrorLog0 = %016llx\n", s->phbPblErrorLog0);
-	PHBERR(p, "         phbPblErrorLog1 = %016llx\n", s->phbPblErrorLog1);
-	PHBERR(p, "     phbPcieDlpErrorLog1 = %016llx\n", s->phbPcieDlpErrorLog1);
-	PHBERR(p, "     phbPcieDlpErrorLog2 = %016llx\n", s->phbPcieDlpErrorLog2);
-	PHBERR(p, "   phbPcieDlpErrorStatus = %016llx\n", s->phbPcieDlpErrorStatus);
-
-	PHBERR(p, "      phbRegbErrorStatus = %016llx\n", s->phbRegbErrorStatus);
-	PHBERR(p, " phbRegbFirstErrorStatus = %016llx\n", s->phbRegbFirstErrorStatus);
-	PHBERR(p, "        phbRegbErrorLog0 = %016llx\n", s->phbRegbErrorLog0);
-	PHBERR(p, "        phbRegbErrorLog1 = %016llx\n", s->phbRegbErrorLog1);
+
+	/* Only dump the detailed regs if the error reg is non-zero */
+	if (s->phbErrorStatus) {
+		PHBERR(p, "          phbErrorStatus = %016llx\n", s->phbErrorStatus);
+		PHBERR(p, "     phbFirstErrorStatus = %016llx\n", s->phbFirstErrorStatus);
+		PHBERR(p, "            phbErrorLog0 = %016llx\n", s->phbErrorLog0);
+		PHBERR(p, "            phbErrorLog1 = %016llx\n", s->phbErrorLog1);
+	}
+
+	if (s->phbTxeErrorStatus) {
+		PHBERR(p, "       phbTxeErrorStatus = %016llx\n", s->phbTxeErrorStatus);
+		PHBERR(p, "  phbTxeFirstErrorStatus = %016llx\n", s->phbTxeFirstErrorStatus);
+		PHBERR(p, "         phbTxeErrorLog0 = %016llx\n", s->phbTxeErrorLog0);
+		PHBERR(p, "         phbTxeErrorLog1 = %016llx\n", s->phbTxeErrorLog1);
+	}
+
+	if (s->phbRxeArbErrorStatus) {
+		PHBERR(p, "    phbRxeArbErrorStatus = %016llx\n", s->phbRxeArbErrorStatus);
+		PHBERR(p, "phbRxeArbFrstErrorStatus = %016llx\n", s->phbRxeArbFirstErrorStatus);
+		PHBERR(p, "      phbRxeArbErrorLog0 = %016llx\n", s->phbRxeArbErrorLog0);
+		PHBERR(p, "      phbRxeArbErrorLog1 = %016llx\n", s->phbRxeArbErrorLog1);
+	}
+
+	if (s->phbRxeMrgErrorStatus) {
+		PHBERR(p, "    phbRxeMrgErrorStatus = %016llx\n", s->phbRxeMrgErrorStatus);
+		PHBERR(p, "phbRxeMrgFrstErrorStatus = %016llx\n", s->phbRxeMrgFirstErrorStatus);
+		PHBERR(p, "      phbRxeMrgErrorLog0 = %016llx\n", s->phbRxeMrgErrorLog0);
+		PHBERR(p, "      phbRxeMrgErrorLog1 = %016llx\n", s->phbRxeMrgErrorLog1);
+	}
+
+	if (s->phbRxeTceErrorStatus) {
+		PHBERR(p, "    phbRxeTceErrorStatus = %016llx\n", s->phbRxeTceErrorStatus);
+		PHBERR(p, "phbRxeTceFrstErrorStatus = %016llx\n", s->phbRxeTceFirstErrorStatus);
+		PHBERR(p, "      phbRxeTceErrorLog0 = %016llx\n", s->phbRxeTceErrorLog0);
+		PHBERR(p, "      phbRxeTceErrorLog1 = %016llx\n", s->phbRxeTceErrorLog1);
+	}
+
+	if (s->phbPblErrorStatus) {
+		PHBERR(p, "       phbPblErrorStatus = %016llx\n", s->phbPblErrorStatus);
+		PHBERR(p, "  phbPblFirstErrorStatus = %016llx\n", s->phbPblFirstErrorStatus);
+		PHBERR(p, "         phbPblErrorLog0 = %016llx\n", s->phbPblErrorLog0);
+		PHBERR(p, "         phbPblErrorLog1 = %016llx\n", s->phbPblErrorLog1);
+	}
+
+	if (s->phbPcieDlpErrorStatus) {
+		PHBERR(p, "   phbPcieDlpErrorStatus = %016llx\n", s->phbPcieDlpErrorStatus);
+		PHBERR(p, "     phbPcieDlpErrorLog1 = %016llx\n", s->phbPcieDlpErrorLog1);
+		PHBERR(p, "     phbPcieDlpErrorLog2 = %016llx\n", s->phbPcieDlpErrorLog2);
+	}
+
+	if (s->phbRegbErrorStatus) {
+		PHBERR(p, "      phbRegbErrorStatus = %016llx\n", s->phbRegbErrorStatus);
+		PHBERR(p, " phbRegbFirstErrorStatus = %016llx\n", s->phbRegbFirstErrorStatus);
+		PHBERR(p, "        phbRegbErrorLog0 = %016llx\n", s->phbRegbErrorLog0);
+		PHBERR(p, "        phbRegbErrorLog1 = %016llx\n", s->phbRegbErrorLog1);
+	}
 
 	for (i = 0; i < OPAL_PHB4_NUM_PEST_REGS; i++) {
 		if (!s->pestA[i] && !s->pestB[i])