Message ID | 1449088437-25624-1-git-send-email-tim.gardner@canonical.com |
---|---|
State | New |
Headers | show |
On Wed, Dec 02, 2015 at 01:33:57PM -0700, tim.gardner@canonical.com wrote: > From: Gavin Shan <gwshan@linux.vnet.ibm.com> > > BugLink: http://bugs.launchpad.net/bugs/1522071 > > The config space of some PCI devices can't be accessed when their > PEs are in frozen state. Otherwise, fenced PHB might be seen. > Those PEs are identified with flag EEH_PE_CFG_RESTRICTED, meaing > EEH_PE_CFG_BLOCKED is set automatically when the PE is put to > frozen state (EEH_PE_ISOLATED). eeh_slot_error_detail() restores > PCI device BARs with eeh_pe_restore_bars(), which then calls > eeh_ops->restore_config() to reinitialize the PCI device in > (OPAL) firmware. eeh_ops->restore_config() produces PCI config > access that causes fenced PHB. The problem was reported on below > adapter: > > 0001:01:00.0 0200: 14e4:168e (rev 10) > 0001:01:00.0 Ethernet controller: Broadcom Corporation \ > NetXtreme II BCM57810 10 Gigabit Ethernet (rev 10) > > This fixes the issue by skipping eeh_pe_restore_bars() in > eeh_slot_error_detail() when EEH_PE_CFG_BLOCKED is set for the PE. > > Fixes: b6541db1 ("powerpc/eeh: Block PCI config access upon frozen PE") > Cc: stable@vger.kernel.org # v4.0+ > Reported-by: Manvanthara B. Puttashankar <mputtash@in.ibm.com> > Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com> > Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> > (cherry picked from commit 259800135c654a098d9f0adfdd3d1f20eef1f231) > Signed-off-by: Tim Gardner <tim.gardner@canonical.com> > --- > arch/powerpc/kernel/eeh.c | 21 ++++++++++++++++++--- > 1 file changed, 18 insertions(+), 3 deletions(-) > > diff --git a/arch/powerpc/kernel/eeh.c b/arch/powerpc/kernel/eeh.c > index 3b2252e..79150db 100644 > --- a/arch/powerpc/kernel/eeh.c > +++ b/arch/powerpc/kernel/eeh.c > @@ -306,11 +306,26 @@ void eeh_slot_error_detail(struct eeh_pe *pe, int severity) > if (!(pe->type & EEH_PE_PHB)) { > if (eeh_has_flag(EEH_ENABLE_IO_FOR_LOG)) > eeh_pci_enable(pe, EEH_OPT_THAW_MMIO); > + > + /* > + * The config space of some PCI devices can't be accessed > + * when their PEs are in frozen state. Otherwise, fenced > + * PHB might be seen. Those PEs are identified with flag > + * EEH_PE_CFG_RESTRICTED, indicating EEH_PE_CFG_BLOCKED > + * is set automatically when the PE is put to EEH_PE_ISOLATED. > + * > + * Restoring BARs possibly triggers PCI config access in > + * (OPAL) firmware and then causes fenced PHB. If the > + * PCI config is blocked with flag EEH_PE_CFG_BLOCKED, it's > + * pointless to restore BARs and dump config space. > + */ > eeh_ops->configure_bridge(pe); > - eeh_pe_restore_bars(pe); > + if (!(pe->state & EEH_PE_CFG_BLOCKED)) { > + eeh_pe_restore_bars(pe); > > - pci_regs_buf[0] = 0; > - eeh_pe_traverse(pe, eeh_dump_pe_log, &loglen); > + pci_regs_buf[0] = 0; > + eeh_pe_traverse(pe, eeh_dump_pe_log, &loglen); > + } > } > > eeh_ops->get_log(pe, severity, pci_regs_buf, loglen); Looks to do what is claimed. Acked-by: Andy Whitcroft <apw@canonical.com> -apw
cherry pick though upstream confusingly tagged it as only 4.0+, but looks sane enough and if it applies and compiles...
Applied to vivid master-next. -apw
On Tue, Dec 08, 2015 at 12:08:34PM +0100, Stefan Bader wrote: > cherry pick though upstream confusingly tagged it as only 4.0+, but looks sane > enough and if it applies and compiles... Tim, This does apply cleanly and builds, but it might be worth circling with IBM on this and confirming we are not missing a trick by applying it before 4.0. -apw
On Tue, 2015-12-08 at 11:27 +0000, Andy Whitcroft wrote: > On Tue, Dec 08, 2015 at 12:08:34PM +0100, Stefan Bader wrote: > > cherry pick though upstream confusingly tagged it as only 4.0+, but looks sane > > enough and if it applies and compiles... > > Tim, > > This does apply cleanly and builds, but it might be worth circling with > IBM on this and confirming we are not missing a trick by applying it > before 4.0. It looks like Gavin added the 4.0+ annotation himself: https://patchwork.ozlabs.org/patch/511744/ I don't remember discussing it with him, and I can't find any chat logs or anything. I suspect it's probably just wrongly tagged, and should go back to at least match the Fixes line, which is v3.18. So I think you should probably take it. cheers
On Tue, Dec 08, 2015 at 10:39:21PM +1100, Michael Ellerman wrote: > On Tue, 2015-12-08 at 11:27 +0000, Andy Whitcroft wrote: > > On Tue, Dec 08, 2015 at 12:08:34PM +0100, Stefan Bader wrote: > > > cherry pick though upstream confusingly tagged it as only 4.0+, but looks sane > > > enough and if it applies and compiles... > > > > Tim, > > > > This does apply cleanly and builds, but it might be worth circling with > > IBM on this and confirming we are not missing a trick by applying it > > before 4.0. > > It looks like Gavin added the 4.0+ annotation himself: > > https://patchwork.ozlabs.org/patch/511744/ > > I don't remember discussing it with him, and I can't find any chat logs or > anything. I suspect it's probably just wrongly tagged, and should go back to at > least match the Fixes line, which is v3.18. > > So I think you should probably take it. Thanks Michael for the clarification. I have applied this for the next upload. -apw
diff --git a/arch/powerpc/kernel/eeh.c b/arch/powerpc/kernel/eeh.c index 3b2252e..79150db 100644 --- a/arch/powerpc/kernel/eeh.c +++ b/arch/powerpc/kernel/eeh.c @@ -306,11 +306,26 @@ void eeh_slot_error_detail(struct eeh_pe *pe, int severity) if (!(pe->type & EEH_PE_PHB)) { if (eeh_has_flag(EEH_ENABLE_IO_FOR_LOG)) eeh_pci_enable(pe, EEH_OPT_THAW_MMIO); + + /* + * The config space of some PCI devices can't be accessed + * when their PEs are in frozen state. Otherwise, fenced + * PHB might be seen. Those PEs are identified with flag + * EEH_PE_CFG_RESTRICTED, indicating EEH_PE_CFG_BLOCKED + * is set automatically when the PE is put to EEH_PE_ISOLATED. + * + * Restoring BARs possibly triggers PCI config access in + * (OPAL) firmware and then causes fenced PHB. If the + * PCI config is blocked with flag EEH_PE_CFG_BLOCKED, it's + * pointless to restore BARs and dump config space. + */ eeh_ops->configure_bridge(pe); - eeh_pe_restore_bars(pe); + if (!(pe->state & EEH_PE_CFG_BLOCKED)) { + eeh_pe_restore_bars(pe); - pci_regs_buf[0] = 0; - eeh_pe_traverse(pe, eeh_dump_pe_log, &loglen); + pci_regs_buf[0] = 0; + eeh_pe_traverse(pe, eeh_dump_pe_log, &loglen); + } } eeh_ops->get_log(pe, severity, pci_regs_buf, loglen);