PHB3: Fix unexpected ER (all) on errinjct by PCI config

Message ID 1443768122-13365-1-git-send-email-gwshan@linux.vnet.ibm.com
State Accepted

Commit Message

Gavin Shan Oct. 2, 2015, 6:42 a.m. UTC
This issue was first found on SRIOV VFs, and I then checked it
with Chad Larson, who put a lot of effort into sorting it out.
Further experiments showed that the issue isn't limited to SRIOV
VFs; it can be seen on non-SRIOV adapters as well. To reproduce:
first, ensure that the outbound request discard interrupt
(bit#12) is enabled in the PCI Express Port Interrupt Enable
Register (offset: 0x558). Then inject an error into the root
complex through the PAPR Error Injection Registers with a PCI
config read. As a result, all (256) PEs are frozen. With the bit
cleared, only the target PE#0 is frozen, as expected. As Chad
pointed out, the "outbound request discard" interrupt is always
raised during the error injection and is translated into the
UTL's primary interrupt, which freezes all (256) PEs.

Drop bit#12 from the PCI Express Port Interrupt Enable Register
so that outbound request discard no longer raises the UTL's
primary interrupt, which avoids freezing all (256) PEs during
error injection via PCI config read.

Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
---
 hw/phb3.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)
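
For readers checking the constants: the sketch below shows how bit#12
maps onto the register values in the patch. It assumes bit#12 uses
IBM big-endian numbering (bit 0 is the most significant bit of the
64-bit register), which is consistent with the old and new values in
the diff. IBM_BIT(), OUTB_REQ_DISCARD_IRQ, and the function names are
hypothetical, for illustration only; they are not taken from hw/phb3.c.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: mask for bit n when bit 0 is the MSB (assumed numbering). */
#define IBM_BIT(n)            (UINT64_C(0x8000000000000000) >> (n))

/* "Outbound request discard" enable bit, bit#12 per the commit message. */
#define OUTB_REQ_DISCARD_IRQ  IBM_BIT(12)

/* UTL_PCIE_PORT_IRQ_EN values copied from the patch. */
#define IRQ_EN_OLD            UINT64_C(0xad5a800000000000)
#define IRQ_EN_NEW            UINT64_C(0xad52800000000000)
#define IRQ_EN_NO_LINK        UINT64_C(0xad42800000000000)

/* Clear the outbound request discard enable bit in an IRQ enable value. */
static uint64_t drop_outb_req_discard(uint64_t irq_en)
{
	return irq_en & ~OUTB_REQ_DISCARD_IRQ;
}

/* Sanity checks tying the commit message to the constants in the diff. */
static void check_patch_constants(void)
{
	/* Bit#12 (MSB-0) is bit 51 counted from the least significant end. */
	assert(OUTB_REQ_DISCARD_IRQ == UINT64_C(0x0008000000000000));

	/* The old and new link-up values differ only in that bit. */
	assert((IRQ_EN_OLD ^ IRQ_EN_NEW) == OUTB_REQ_DISCARD_IRQ);
	assert(drop_outb_req_discard(IRQ_EN_OLD) == IRQ_EN_NEW);

	/* The no-link value already had the bit clear, so it is unchanged. */
	assert((IRQ_EN_NO_LINK & OUTB_REQ_DISCARD_IRQ) == 0);
}
```

Calling check_patch_constants() from a small test program should pass
without tripping an assertion if the numbering assumption holds.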

Comments

Stewart Smith Oct. 7, 2015, 7:05 a.m. UTC | #1
Gavin Shan <gwshan@linux.vnet.ibm.com> writes:
> This issue was first found on SRIOV VFs, and I then checked it
> with Chad Larson, who put a lot of effort into sorting it out.
> Further experiments showed that the issue isn't limited to SRIOV
> VFs; it can be seen on non-SRIOV adapters as well. To reproduce:
> first, ensure that the outbound request discard interrupt
> (bit#12) is enabled in the PCI Express Port Interrupt Enable
> Register (offset: 0x558). Then inject an error into the root
> complex through the PAPR Error Injection Registers with a PCI
> config read. As a result, all (256) PEs are frozen. With the bit
> cleared, only the target PE#0 is frozen, as expected. As Chad
> pointed out, the "outbound request discard" interrupt is always
> raised during the error injection and is translated into the
> UTL's primary interrupt, which freezes all (256) PEs.
>
> Drop bit#12 from the PCI Express Port Interrupt Enable Register
> so that outbound request discard no longer raises the UTL's
> primary interrupt, which avoids freezing all (256) PEs during
> error injection via PCI config read.

Thanks,

merged at 2bb9c4bb257fe67f00a578cdd1bc41b6270ea27d into master.

This is certainly a candidate for stable, I think. I'd like to wait
for the outcome of other testing on (insert IBM internal bug number
here) so we can get a more complete explanation of what's being
fixed, and then we can cherry-pick.

cheers,
stewart

Patch

diff --git a/hw/phb3.c b/hw/phb3.c
index 71c64be..d57cbd9 100644
--- a/hw/phb3.c
+++ b/hw/phb3.c
@@ -2031,7 +2031,7 @@  static void phb3_setup_for_link_up(struct phb3 *p)
 
 	/* Clear spurrious errors and enable PCIE port interrupts */
 	out_be64(p->regs + UTL_PCIE_PORT_STATUS, 0xffdfffffffffffff);
-	out_be64(p->regs + UTL_PCIE_PORT_IRQ_EN, 0xad5a800000000000);
+	out_be64(p->regs + UTL_PCIE_PORT_IRQ_EN, 0xad52800000000000);
 
 	/* Mark link up */
 	p->has_link = true;
@@ -3834,7 +3834,7 @@  static void phb3_init_utl(struct phb3 *p)
 	out_be64(p->regs + UTL_PCIE_PORT_ERROR_SEV,        0x5039000000000000);
 
 	if (p->has_link)
-		out_be64(p->regs + UTL_PCIE_PORT_IRQ_EN,   0xad5a800000000000);
+		out_be64(p->regs + UTL_PCIE_PORT_IRQ_EN,   0xad52800000000000);
 	else
 		out_be64(p->regs + UTL_PCIE_PORT_IRQ_EN,   0xad42800000000000);