Message ID | 1443768122-13365-1-git-send-email-gwshan@linux.vnet.ibm.com |
---|---|
State | Accepted |
Gavin Shan <gwshan@linux.vnet.ibm.com> writes:

> This issue was initially found on SRIOV VFs, and I then checked
> with Chad Larson, who put a lot of effort into sorting it out.
> After more experiments, I found that the issue isn't limited to
> SRIOV VFs; it can be seen on non-SRIOV adapters as well. First,
> I ensured that the outbound request discard interrupt (bit#12)
> was enabled in the PCI Express Port Interrupt Enable Register
> (offset: 0x558). Then I injected an error to the root complex
> through the PAPR Error Injection Registers with a PCI config
> read. Eventually, all (256) PEs were frozen. After clearing the
> bit, only the target PE#0 is frozen, as expected. As Chad pointed
> out, the interrupt ("outbound request discard") is always raised
> during the error injection and is translated to the UTL's primary
> interrupt, which freezes all (256) PEs.
>
> This drops bit#12 of the PCI Express Port Interrupt Enable
> Register to avoid the UTL's primary interrupt caused by outbound
> request discard, so that error injection via PCI config read no
> longer freezes all (256) PEs.

Thanks, merged at 2bb9c4bb257fe67f00a578cdd1bc41b6270ea27d into master.

This is certainly a candidate for stable, I think. I'd like to wait for the outcome of other testing on (insert IBM internal bug number here) so we can get a more complete explanation of what's being fixed, and then we can cherry-pick.

cheers,
stewart
diff --git a/hw/phb3.c b/hw/phb3.c
index 71c64be..d57cbd9 100644
--- a/hw/phb3.c
+++ b/hw/phb3.c
@@ -2031,7 +2031,7 @@ static void phb3_setup_for_link_up(struct phb3 *p)
 	/* Clear spurrious errors and enable PCIE port interrupts */
 	out_be64(p->regs + UTL_PCIE_PORT_STATUS, 0xffdfffffffffffff);
-	out_be64(p->regs + UTL_PCIE_PORT_IRQ_EN, 0xad5a800000000000);
+	out_be64(p->regs + UTL_PCIE_PORT_IRQ_EN, 0xad52800000000000);
 	/* Mark link up */
 	p->has_link = true;
@@ -3834,7 +3834,7 @@ static void phb3_init_utl(struct phb3 *p)
 	out_be64(p->regs + UTL_PCIE_PORT_ERROR_SEV, 0x5039000000000000);
 	if (p->has_link)
-		out_be64(p->regs + UTL_PCIE_PORT_IRQ_EN, 0xad5a800000000000);
+		out_be64(p->regs + UTL_PCIE_PORT_IRQ_EN, 0xad52800000000000);
 	else
 		out_be64(p->regs + UTL_PCIE_PORT_IRQ_EN, 0xad42800000000000);
This issue was initially found on SRIOV VFs, and I then checked with Chad Larson, who put a lot of effort into sorting it out. After more experiments, I found that the issue isn't limited to SRIOV VFs; it can be seen on non-SRIOV adapters as well. First, I ensured that the outbound request discard interrupt (bit#12) was enabled in the PCI Express Port Interrupt Enable Register (offset: 0x558). Then I injected an error to the root complex through the PAPR Error Injection Registers with a PCI config read. Eventually, all (256) PEs were frozen. After clearing the bit, only the target PE#0 is frozen, as expected. As Chad pointed out, the interrupt ("outbound request discard") is always raised during the error injection and is translated to the UTL's primary interrupt, which freezes all (256) PEs.

This drops bit#12 of the PCI Express Port Interrupt Enable Register to avoid the UTL's primary interrupt caused by outbound request discard, so that error injection via PCI config read no longer freezes all (256) PEs.

Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
---
 hw/phb3.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)
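
As a sanity check on the constants, the old and new link-up values written to UTL_PCIE_PORT_IRQ_EN appear to differ in exactly one bit, and that bit is bit#12 if bits are numbered from the most significant end (bit 0 is the MSB of the 64-bit register). The sketch below assumes that numbering; the `BE_BIT64` macro, the `IRQ_EN_*` names, and both helper functions are illustrative and are not part of the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative helper (not from the patch): mask for bit n of a 64-bit
 * register, ASSUMING bit 0 is the most significant bit. */
#define BE_BIT64(n) (UINT64_C(0x8000000000000000) >> (n))

/* Values copied from the diff for UTL_PCIE_PORT_IRQ_EN. The macro names
 * themselves are hypothetical. */
#define IRQ_EN_LINK_UP_OLD UINT64_C(0xad5a800000000000)
#define IRQ_EN_LINK_UP_NEW UINT64_C(0xad52800000000000)
#define IRQ_EN_NO_LINK     UINT64_C(0xad42800000000000)

/* Bit number quoted in the commit message ("outbound request discard"). */
#define OUTBOUND_REQ_DISCARD_BIT 12

/* Return val with the given MSB-0-numbered bit cleared. */
static inline uint64_t clear_be_bit(uint64_t val, unsigned int bit)
{
	return val & ~BE_BIT64(bit);
}

/* Check that the patch changes only bit#12 of the link-up value. */
static inline void check_irq_en_values(void)
{
	/* The old and new values differ in exactly the bit#12 mask. */
	assert((IRQ_EN_LINK_UP_OLD ^ IRQ_EN_LINK_UP_NEW) ==
	       BE_BIT64(OUTBOUND_REQ_DISCARD_BIT));

	/* Clearing bit#12 in the old value yields the new value. */
	assert(clear_be_bit(IRQ_EN_LINK_UP_OLD, OUTBOUND_REQ_DISCARD_BIT) ==
	       IRQ_EN_LINK_UP_NEW);

	/* The no-link value, which the patch leaves alone, already has
	 * bit#12 clear. */
	assert((IRQ_EN_NO_LINK & BE_BIT64(OUTBOUND_REQ_DISCARD_BIT)) == 0);
}
```

Calling `check_irq_en_values()` from any test harness exercises the three assertions. Under the stated bit-numbering assumption, this matches the commit message's description of the change as dropping bit#12 only.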