Message ID | 20180731084854.27802-1-oohall@gmail.com |
---|---|
State | RFC |
Headers | show |
Series | [RFC] phb4: Make verbose EEH output less verbose | expand |
Context | Check | Description |
---|---|---|
snowpatch_ozlabs/apply_patch | success | master/apply_patch Successfully applied |
snowpatch_ozlabs/make_check | success | Test make_check on branch master |
"Oliver O'Halloran" <oohall@gmail.com> writes: > Each PHB4 component has a error status register and a set of > component-specific registers that describe that error in more detail. > If the status register of a component has no bits set there's not much > point in dumping the error detail registers. > > This patch modifies the verbose EEH dumping code so that it only > dumps the error registers of a component if there are bits set in the > error status register of that component. This limits the output to the > registers that are actually relevant to the error that occured which > makes interpreting the dumps slightly easier. > > Signed-off-by: Oliver O'Halloran <oohall@gmail.com> > --- > I've been using this in my debug builds for a while, but I'm not sure > if it's a great idea since sometimes other people want to full dump. > It's probably fine... It does look probably fine. Maybe Ben or Russell has thoughts?
On Tue, 2018-07-31 at 18:48 +1000, Oliver O'Halloran wrote: > Each PHB4 component has a error status register and a set of > component-specific registers that describe that error in more detail. > If the status register of a component has no bits set there's not > much > point in dumping the error detail registers. > > This patch modifies the verbose EEH dumping code so that it only > dumps the error registers of a component if there are bits set in the > error status register of that component. This limits the output to > the > registers that are actually relevant to the error that occured which > makes interpreting the dumps slightly easier. > > Signed-off-by: Oliver O'Halloran <oohall@gmail.com> > --- > I've been using this in my debug builds for a while, but I'm not sure > if it's a great idea since sometimes other people want to full dump. > It's probably fine... This is the same thing the kernel does, I don't see any reason why we wouldn't do it here. The kernel also puts multiple registers on the same line to save space, this does require looking at the code to remember what order the regs are in but if you're decoding this stuff that's the least of your problems so *shrug* In any case, Acked-by: Russell Currey <ruscur@russell.cc>
Russell Currey <ruscur@russell.cc> writes: > On Tue, 2018-07-31 at 18:48 +1000, Oliver O'Halloran wrote: >> Each PHB4 component has a error status register and a set of >> component-specific registers that describe that error in more detail. >> If the status register of a component has no bits set there's not >> much >> point in dumping the error detail registers. >> >> This patch modifies the verbose EEH dumping code so that it only >> dumps the error registers of a component if there are bits set in the >> error status register of that component. This limits the output to >> the >> registers that are actually relevant to the error that occured which >> makes interpreting the dumps slightly easier. >> >> Signed-off-by: Oliver O'Halloran <oohall@gmail.com> >> --- >> I've been using this in my debug builds for a while, but I'm not sure >> if it's a great idea since sometimes other people want to full dump. >> It's probably fine... > > This is the same thing the kernel does, I don't see any reason why we > wouldn't do it here. The kernel also puts multiple registers on the > same line to save space, this does require looking at the code to > remember what order the regs are in but if you're decoding this stuff > that's the least of your problems so *shrug* > > In any case, > > Acked-by: Russell Currey <ruscur@russell.cc> I think this means I can make good on my joke^W promise to merge some of Oliver's RFC patches at 4:59pm on a friday before he takes some time off? :)
On Fri, 2018-08-03 at 13:15 +1000, Stewart Smith wrote: > Russell Currey <ruscur@russell.cc> writes: > > On Tue, 2018-07-31 at 18:48 +1000, Oliver O'Halloran wrote: > > > Each PHB4 component has a error status register and a set of > > > component-specific registers that describe that error in more > > > detail. > > > If the status register of a component has no bits set there's not > > > much > > > point in dumping the error detail registers. > > > > > > This patch modifies the verbose EEH dumping code so that it only > > > dumps the error registers of a component if there are bits set in > > > the > > > error status register of that component. This limits the output > > > to > > > the > > > registers that are actually relevant to the error that occured > > > which > > > makes interpreting the dumps slightly easier. > > > > > > Signed-off-by: Oliver O'Halloran <oohall@gmail.com> > > > --- > > > I've been using this in my debug builds for a while, but I'm not > > > sure > > > if it's a great idea since sometimes other people want to full > > > dump. > > > It's probably fine... > > > > This is the same thing the kernel does, I don't see any reason why > > we > > wouldn't do it here. The kernel also puts multiple registers on > > the > > same line to save space, this does require looking at the code to > > remember what order the regs are in but if you're decoding this > > stuff > > that's the least of your problems so *shrug* > > > > In any case, > > > > Acked-by: Russell Currey <ruscur@russell.cc> > > I think this means I can make good on my joke^W promise to merge some > of > Oliver's RFC patches at 4:59pm on a friday before he takes some time > off? :) > To be fair, this one is probably the least of your problems...
On Fri, Aug 3, 2018 at 1:15 PM, Stewart Smith <stewart@linux.ibm.com> wrote: > Russell Currey <ruscur@russell.cc> writes: >> On Tue, 2018-07-31 at 18:48 +1000, Oliver O'Halloran wrote: >>> Each PHB4 component has a error status register and a set of >>> component-specific registers that describe that error in more detail. >>> If the status register of a component has no bits set there's not >>> much >>> point in dumping the error detail registers. >>> >>> This patch modifies the verbose EEH dumping code so that it only >>> dumps the error registers of a component if there are bits set in the >>> error status register of that component. This limits the output to >>> the >>> registers that are actually relevant to the error that occured which >>> makes interpreting the dumps slightly easier. >>> >>> Signed-off-by: Oliver O'Halloran <oohall@gmail.com> >>> --- >>> I've been using this in my debug builds for a while, but I'm not sure >>> if it's a great idea since sometimes other people want to full dump. >>> It's probably fine... >> >> This is the same thing the kernel does, I don't see any reason why we >> wouldn't do it here. The kernel also puts multiple registers on the >> same line to save space, this does require looking at the code to >> remember what order the regs are in but if you're decoding this stuff >> that's the least of your problems so *shrug* >> >> In any case, >> >> Acked-by: Russell Currey <ruscur@russell.cc> > > I think this means I can make good on my joke^W promise to merge some of > Oliver's RFC patches at 4:59pm on a friday before he takes some time > off? :) I think tradition requires you to be drunk when merging RFC patches, so 4:59 might be a bit early. > > -- > Stewart Smith > OPAL Architect, IBM. >
diff --git a/hw/phb4.c b/hw/phb4.c index f5953a78a051..bb382395026e 100644 --- a/hw/phb4.c +++ b/hw/phb4.c @@ -2158,38 +2158,62 @@ static void phb4_eeh_dump_regs(struct phb4 *p) PHBERR(p, " lemFir = %016llx\n", s->lemFir); PHBERR(p, " lemErrorMask = %016llx\n", s->lemErrorMask); PHBERR(p, " lemWOF = %016llx\n", s->lemWOF); - PHBERR(p, " phbErrorStatus = %016llx\n", s->phbErrorStatus); - PHBERR(p, " phbFirstErrorStatus = %016llx\n", s->phbFirstErrorStatus); - PHBERR(p, " phbErrorLog0 = %016llx\n", s->phbErrorLog0); - PHBERR(p, " phbErrorLog1 = %016llx\n", s->phbErrorLog1); - PHBERR(p, " phbTxeErrorStatus = %016llx\n", s->phbTxeErrorStatus); - PHBERR(p, " phbTxeFirstErrorStatus = %016llx\n", s->phbTxeFirstErrorStatus); - PHBERR(p, " phbTxeErrorLog0 = %016llx\n", s->phbTxeErrorLog0); - PHBERR(p, " phbTxeErrorLog1 = %016llx\n", s->phbTxeErrorLog1); - PHBERR(p, " phbRxeArbErrorStatus = %016llx\n", s->phbRxeArbErrorStatus); - PHBERR(p, "phbRxeArbFrstErrorStatus = %016llx\n", s->phbRxeArbFirstErrorStatus); - PHBERR(p, " phbRxeArbErrorLog0 = %016llx\n", s->phbRxeArbErrorLog0); - PHBERR(p, " phbRxeArbErrorLog1 = %016llx\n", s->phbRxeArbErrorLog1); - PHBERR(p, " phbRxeMrgErrorStatus = %016llx\n", s->phbRxeMrgErrorStatus); - PHBERR(p, "phbRxeMrgFrstErrorStatus = %016llx\n", s->phbRxeMrgFirstErrorStatus); - PHBERR(p, " phbRxeMrgErrorLog0 = %016llx\n", s->phbRxeMrgErrorLog0); - PHBERR(p, " phbRxeMrgErrorLog1 = %016llx\n", s->phbRxeMrgErrorLog1); - PHBERR(p, " phbRxeTceErrorStatus = %016llx\n", s->phbRxeTceErrorStatus); - PHBERR(p, "phbRxeTceFrstErrorStatus = %016llx\n", s->phbRxeTceFirstErrorStatus); - PHBERR(p, " phbRxeTceErrorLog0 = %016llx\n", s->phbRxeTceErrorLog0); - PHBERR(p, " phbRxeTceErrorLog1 = %016llx\n", s->phbRxeTceErrorLog1); - PHBERR(p, " phbPblErrorStatus = %016llx\n", s->phbPblErrorStatus); - PHBERR(p, " phbPblFirstErrorStatus = %016llx\n", s->phbPblFirstErrorStatus); - PHBERR(p, " phbPblErrorLog0 = %016llx\n", s->phbPblErrorLog0); - PHBERR(p, " phbPblErrorLog1 = %016llx\n", s->phbPblErrorLog1); - PHBERR(p, " phbPcieDlpErrorLog1 = %016llx\n", s->phbPcieDlpErrorLog1); - PHBERR(p, " phbPcieDlpErrorLog2 = %016llx\n", s->phbPcieDlpErrorLog2); - PHBERR(p, " phbPcieDlpErrorStatus = %016llx\n", s->phbPcieDlpErrorStatus); - - PHBERR(p, " phbRegbErrorStatus = %016llx\n", s->phbRegbErrorStatus); - PHBERR(p, " phbRegbFirstErrorStatus = %016llx\n", s->phbRegbFirstErrorStatus); - PHBERR(p, " phbRegbErrorLog0 = %016llx\n", s->phbRegbErrorLog0); - PHBERR(p, " phbRegbErrorLog1 = %016llx\n", s->phbRegbErrorLog1); + + /* Only dump the detailed regs if the error reg is non-zero */ + if (s->phbErrorStatus) { + PHBERR(p, " phbErrorStatus = %016llx\n", s->phbErrorStatus); + PHBERR(p, " phbFirstErrorStatus = %016llx\n", s->phbFirstErrorStatus); + PHBERR(p, " phbErrorLog0 = %016llx\n", s->phbErrorLog0); + PHBERR(p, " phbErrorLog1 = %016llx\n", s->phbErrorLog1); + } + + if (s->phbTxeErrorStatus) { + PHBERR(p, " phbTxeErrorStatus = %016llx\n", s->phbTxeErrorStatus); + PHBERR(p, " phbTxeFirstErrorStatus = %016llx\n", s->phbTxeFirstErrorStatus); + PHBERR(p, " phbTxeErrorLog0 = %016llx\n", s->phbTxeErrorLog0); + PHBERR(p, " phbTxeErrorLog1 = %016llx\n", s->phbTxeErrorLog1); + } + + if (s->phbRxeArbErrorStatus) { + PHBERR(p, " phbRxeArbErrorStatus = %016llx\n", s->phbRxeArbErrorStatus); + PHBERR(p, "phbRxeArbFrstErrorStatus = %016llx\n", s->phbRxeArbFirstErrorStatus); + PHBERR(p, " phbRxeArbErrorLog0 = %016llx\n", s->phbRxeArbErrorLog0); + PHBERR(p, " phbRxeArbErrorLog1 = %016llx\n", s->phbRxeArbErrorLog1); + } + + if (s->phbRxeMrgErrorStatus) { + PHBERR(p, " phbRxeMrgErrorStatus = %016llx\n", s->phbRxeMrgErrorStatus); + PHBERR(p, "phbRxeMrgFrstErrorStatus = %016llx\n", s->phbRxeMrgFirstErrorStatus); + PHBERR(p, " phbRxeMrgErrorLog0 = %016llx\n", s->phbRxeMrgErrorLog0); + PHBERR(p, " phbRxeMrgErrorLog1 = %016llx\n", s->phbRxeMrgErrorLog1); + } + + if (s->phbRxeTceErrorStatus) { + PHBERR(p, " phbRxeTceErrorStatus = %016llx\n", s->phbRxeTceErrorStatus); + PHBERR(p, "phbRxeTceFrstErrorStatus = %016llx\n", s->phbRxeTceFirstErrorStatus); + PHBERR(p, " phbRxeTceErrorLog0 = %016llx\n", s->phbRxeTceErrorLog0); + PHBERR(p, " phbRxeTceErrorLog1 = %016llx\n", s->phbRxeTceErrorLog1); + } + + if (s->phbPblErrorStatus) { + PHBERR(p, " phbPblErrorStatus = %016llx\n", s->phbPblErrorStatus); + PHBERR(p, " phbPblFirstErrorStatus = %016llx\n", s->phbPblFirstErrorStatus); + PHBERR(p, " phbPblErrorLog0 = %016llx\n", s->phbPblErrorLog0); + PHBERR(p, " phbPblErrorLog1 = %016llx\n", s->phbPblErrorLog1); + } + + if (s->phbPcieDlpErrorStatus) { + PHBERR(p, " phbPcieDlpErrorStatus = %016llx\n", s->phbPcieDlpErrorStatus); + PHBERR(p, " phbPcieDlpErrorLog1 = %016llx\n", s->phbPcieDlpErrorLog1); + PHBERR(p, " phbPcieDlpErrorLog2 = %016llx\n", s->phbPcieDlpErrorLog2); + } + + if (s->phbRegbErrorStatus) { + PHBERR(p, " phbRegbErrorStatus = %016llx\n", s->phbRegbErrorStatus); + PHBERR(p, " phbRegbFirstErrorStatus = %016llx\n", s->phbRegbFirstErrorStatus); + PHBERR(p, " phbRegbErrorLog0 = %016llx\n", s->phbRegbErrorLog0); + PHBERR(p, " phbRegbErrorLog1 = %016llx\n", s->phbRegbErrorLog1); + } for (i = 0; i < OPAL_PHB4_NUM_PEST_REGS; i++) { if (!s->pestA[i] && !s->pestB[i])
Each PHB4 component has a error status register and a set of component-specific registers that describe that error in more detail. If the status register of a component has no bits set there's not much point in dumping the error detail registers. This patch modifies the verbose EEH dumping code so that it only dumps the error registers of a component if there are bits set in the error status register of that component. This limits the output to the registers that are actually relevant to the error that occured which makes interpreting the dumps slightly easier. Signed-off-by: Oliver O'Halloran <oohall@gmail.com> --- I've been using this in my debug builds for a while, but I'm not sure if it's a great idea since sometimes other people want to full dump. It's probably fine... --- hw/phb4.c | 88 ++++++++++++++++++++++++++++++++++++++++----------------------- 1 file changed, 56 insertions(+), 32 deletions(-)