Message ID | 20180917072550.4255-1-vaibhav@linux.ibm.com |
---|---|
State | Accepted |
Series | [v2] phb4: Reset pfir and nfir if new errors reported during ETU reset |
Context | Check | Description |
---|---|---|
snowpatch_ozlabs/apply_patch | success | master/apply_patch Successfully applied |
snowpatch_ozlabs/make_check | success | Test make_check on branch master |
On 09/17/2018 12:55 PM, Vaibhav Jain wrote:
> During fast-reboot new PEC errors can be latched even after ETU-Reset
> is asserted. This will result in values of variables nfir_cache and
> pfir_cache to be out of sync.
>
> During step-2 of CRESET nfir_cache and pfir_cache values are used to
> bring the PHB out of reset state. However if these variables are out
> as noted above of date the nfir/pfir registers are never reset
> completely and ETU still remains frozen.
>
> Hence this patch updates step-2 of phb4_creset to re-read the values of
> nfir/pfir registers to check if any new errors were reported after
> ETU-reset was asserted, report these new errors and reset the
> nfir/pfir registers. This should bring the ETU out of reset
> successfully.
>
> Signed-off-by: Vaibhav Jain <vaibhav@linux.ibm.com>

Tested-By: Vasant Hegde <hegdevasant@linux.vnet.ibm.com>

-Vasant
On Mon, Sep 17, 2018 at 5:25 PM, Vaibhav Jain <vaibhav@linux.ibm.com> wrote:
> During fast-reboot new PEC errors can be latched even after ETU-Reset
> is asserted. This will result in values of variables nfir_cache and
> pfir_cache to be out of sync.
>
> During step-2 of CRESET nfir_cache and pfir_cache values are used to
> bring the PHB out of reset state. However if these variables are out
> as noted above of date the nfir/pfir registers are never reset
> completely and ETU still remains frozen.
>
> Hence this patch updates step-2 of phb4_creset to re-read the values of
> nfir/pfir registers to check if any new errors were reported after
> ETU-reset was asserted, report these new errors and reset the
> nfir/pfir registers. This should bring the ETU out of reset
> successfully.
>
> Signed-off-by: Vaibhav Jain <vaibhav@linux.ibm.com>
> ---
> Change-log:
> v2 -> Rebased the patch to
> http://patchwork.ozlabs.org/patch/970408/ to dump all pec
> error registers.

looks good to me

Reviewed-by: Oliver O'Halloran <oohall@gmail.com>

> ---
>  hw/phb4.c | 19 +++++++++++++++++++
>  1 file changed, 19 insertions(+)
>
> diff --git a/hw/phb4.c b/hw/phb4.c
> index cf3d0f84..3b1a755c 100644
> --- a/hw/phb4.c
> +++ b/hw/phb4.c
> @@ -3160,6 +3160,25 @@ static int64_t phb4_creset(struct pci_slot *slot)
>  		xscom_write(p->chip_id, p->pe_stk_xscom + 0x1,
>  			    ~p->nfir_cache);
>
> +		/* Re-read errors in PFIR and NFIR and reset any new
> +		 * error reported.
> +		 */
> +		xscom_read(p->chip_id, p->pci_stk_xscom +
> +			   XPEC_PCI_STK_PCI_FIR, &p->pfir_cache);
> +		xscom_read(p->chip_id, p->pe_stk_xscom +
> +			   XPEC_NEST_STK_PCI_NFIR, &p->nfir_cache);
> +
> +		if (p->pfir_cache || p->nfir_cache) {
> +			PHBERR(p, "CRESET: PHB still fenced !!\n");
> +			phb4_dump_pec_err_regs(p);
> +
> +			/* Reset the PHB errors */
> +			xscom_write(p->chip_id, p->pci_stk_xscom +
> +				    XPEC_PCI_STK_PCI_FIR, 0);
> +			xscom_write(p->chip_id, p->pe_stk_xscom +
> +				    XPEC_NEST_STK_PCI_NFIR, 0);
> +		}
> +
>  		/* Clear PHB from reset */
>  		xscom_write(p->chip_id,
>  			    p->pci_stk_xscom + XPEC_PCI_STK_ETU_RESET, 0x0);
> --
> 2.17.1
>
Vaibhav Jain <vaibhav@linux.ibm.com> writes:
> During fast-reboot new PEC errors can be latched even after ETU-Reset
> is asserted. This will result in values of variables nfir_cache and
> pfir_cache to be out of sync.
>
> During step-2 of CRESET nfir_cache and pfir_cache values are used to
> bring the PHB out of reset state. However if these variables are out
> as noted above of date the nfir/pfir registers are never reset
> completely and ETU still remains frozen.
>
> Hence this patch updates step-2 of phb4_creset to re-read the values of
> nfir/pfir registers to check if any new errors were reported after
> ETU-reset was asserted, report these new errors and reset the
> nfir/pfir registers. This should bring the ETU out of reset
> successfully.
>
> Signed-off-by: Vaibhav Jain <vaibhav@linux.ibm.com>

Thanks, merged to master as of c7be508115db0c903f7054c68beb0349b09592e0
diff --git a/hw/phb4.c b/hw/phb4.c
index cf3d0f84..3b1a755c 100644
--- a/hw/phb4.c
+++ b/hw/phb4.c
@@ -3160,6 +3160,25 @@ static int64_t phb4_creset(struct pci_slot *slot)
 		xscom_write(p->chip_id, p->pe_stk_xscom + 0x1,
 			    ~p->nfir_cache);

+		/* Re-read errors in PFIR and NFIR and reset any new
+		 * error reported.
+		 */
+		xscom_read(p->chip_id, p->pci_stk_xscom +
+			   XPEC_PCI_STK_PCI_FIR, &p->pfir_cache);
+		xscom_read(p->chip_id, p->pe_stk_xscom +
+			   XPEC_NEST_STK_PCI_NFIR, &p->nfir_cache);
+
+		if (p->pfir_cache || p->nfir_cache) {
+			PHBERR(p, "CRESET: PHB still fenced !!\n");
+			phb4_dump_pec_err_regs(p);
+
+			/* Reset the PHB errors */
+			xscom_write(p->chip_id, p->pci_stk_xscom +
+				    XPEC_PCI_STK_PCI_FIR, 0);
+			xscom_write(p->chip_id, p->pe_stk_xscom +
+				    XPEC_NEST_STK_PCI_NFIR, 0);
+		}
+
 		/* Clear PHB from reset */
 		xscom_write(p->chip_id,
 			    p->pci_stk_xscom + XPEC_PCI_STK_ETU_RESET, 0x0);
During fast-reboot, new PEC errors can be latched even after ETU-Reset
is asserted. This leaves the values of the variables nfir_cache and
pfir_cache out of sync with the nfir/pfir registers.

During step-2 of CRESET, the nfir_cache and pfir_cache values are used to
bring the PHB out of reset state. However, if these variables are out of
date, as noted above, the nfir/pfir registers are never reset completely
and the ETU remains frozen.

Hence this patch updates step-2 of phb4_creset to re-read the values of
the nfir/pfir registers to check if any new errors were reported after
ETU-reset was asserted, report these new errors and reset the
nfir/pfir registers. This should bring the ETU out of reset
successfully.

Signed-off-by: Vaibhav Jain <vaibhav@linux.ibm.com>
---
Change-log:
v2 -> Rebased the patch to
      http://patchwork.ozlabs.org/patch/970408/ to dump all pec
      error registers.
---
 hw/phb4.c | 19 +++++++++++++++++++
 1 file changed, 19 insertions(+)
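
The stale-cache problem described in the commit message can be illustrated with a small stand-alone model. This is only a sketch under stated assumptions, not skiboot code: `struct mock_fir`, `fir_write_and()`, `fir_write()`, `fir_read()`, `creset_step2_old()` and `creset_step2_new()` are all hypothetical names, and the model assumes that the existing write of `~nfir_cache` behaves as an AND mask that clears only the bits recorded in the cache.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for one FIR register; not skiboot's xscom API. */
struct mock_fir {
	uint64_t bits;	/* error bits currently latched in "hardware" */
};

/* Assumed AND-mask write: only bits set in 'mask' survive. */
static void fir_write_and(struct mock_fir *fir, uint64_t mask)
{
	fir->bits &= mask;
}

/* Direct write, overwriting the whole register. */
static void fir_write(struct mock_fir *fir, uint64_t val)
{
	fir->bits = val;
}

static uint64_t fir_read(const struct mock_fir *fir)
{
	return fir->bits;
}

/*
 * Step 2 before the patch: clear only the bits remembered in the cache.
 * Returns true if the register ends up clean.
 */
static bool creset_step2_old(struct mock_fir *fir, uint64_t cache)
{
	fir_write_and(fir, ~cache);
	return fir_read(fir) == 0;
}

/*
 * Step 2 after the patch: clear the cached bits, re-read the register
 * into the cache, and zero the register if anything is still latched.
 * Returns true if the register ends up clean.
 */
static bool creset_step2_new(struct mock_fir *fir, uint64_t *cache)
{
	fir_write_and(fir, ~*cache);

	*cache = fir_read(fir);		/* pick up errors latched after caching */
	if (*cache)
		fir_write(fir, 0);	/* reset the newly reported errors */

	return fir_read(fir) == 0;
}
```

In this model, if the cache holds 0x1 but the register has since latched 0x5, the old step leaves 0x4 set, which corresponds to the ETU staying frozen, while the new step observes the leftover bit on re-read and clears it.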