Message ID | 1374737524-8410-1-git-send-email-shangw@linux.vnet.ibm.com |
---|---|
State | Rejected, archived |
Delegated to: | David Miller |
From: Gavin Shan <shangw@linux.vnet.ibm.com>
Date: Thu, 25 Jul 2013 15:32:04 +0800

> While we have EEH errors happened on specific PCI device, the PE
> (Partitionable Endpoint) which includes the PCI device is expected
> to be reset. During the reset, the PCI device should have been
> marked as "offline" and it's not safe to access MMIO of that PCI
> device with "offline" state. That might cause the failure to do
> EEH recovery and then the PCI device is removed from the system for
> ever before manual recovery.
>
> The patch avoids access to MMIO while the PCI device has been marked
> "offline" so that to avoid additional EEH errors during reset and make
> sure that the EEH recovery can be done successfully.
>
> Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com>

Making this check on every single register read or write is not
reasonable.

If this is how register access in every driver has to be done in order
to support EEH errors properly, it is simply not acceptable.
diff --git a/drivers/net/ethernet/chelsio/cxgb3/adapter.h b/drivers/net/ethernet/chelsio/cxgb3/adapter.h
index 8b395b5..5b20c5d 100644
--- a/drivers/net/ethernet/chelsio/cxgb3/adapter.h
+++ b/drivers/net/ethernet/chelsio/cxgb3/adapter.h
@@ -270,7 +270,10 @@ struct adapter {
 static inline u32 t3_read_reg(struct adapter *adapter, u32 reg_addr)
 {
-	u32 val = readl(adapter->regs + reg_addr);
+	u32 val = 0xFFFFFFFF;
+
+	if (!pci_channel_offline(adapter->pdev))
+		val = readl(adapter->regs + reg_addr);
 	CH_DBG(adapter, MMIO, "read register 0x%x value 0x%x\n", reg_addr, val);
 	return val;
@@ -279,7 +282,8 @@ static inline u32 t3_read_reg(struct adapter *adapter, u32 reg_addr)
 static inline void t3_write_reg(struct adapter *adapter, u32 reg_addr, u32 val)
 {
 	CH_DBG(adapter, MMIO, "setting register 0x%x to 0x%x\n", reg_addr, val);
-	writel(val, adapter->regs + reg_addr);
+	if (!pci_channel_offline(adapter->pdev))
+		writel(val, adapter->regs + reg_addr);
 }
 static inline struct port_info *adap2pinfo(struct adapter *adap, int idx)
While we have EEH errors happened on specific PCI device, the PE
(Partitionable Endpoint) which includes the PCI device is expected
to be reset. During the reset, the PCI device should have been
marked as "offline" and it's not safe to access MMIO of that PCI
device with "offline" state. That might cause the failure to do
EEH recovery and then the PCI device is removed from the system for
ever before manual recovery.

The patch avoids access to MMIO while the PCI device has been marked
"offline" so that to avoid additional EEH errors during reset and make
sure that the EEH recovery can be done successfully.

Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com>

---
 drivers/net/ethernet/chelsio/cxgb3/adapter.h | 8 ++++++--
 1 files changed, 6 insertions(+), 2 deletions(-)
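
For readers who want to see the effect of the proposed accessors without a kernel tree, the following is a minimal userspace sketch of the same guard. It is only a model under stated assumptions: `struct model_adapter`, `model_channel_offline()`, `model_read_reg()`, `model_write_reg()` and `model_example()` are hypothetical names invented for illustration, a plain array stands in for the MMIO window, and a boolean flag stands in for the result of `pci_channel_offline(adapter->pdev)`. It is not the cxgb3 driver code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define NUM_REGS 16

/* Hypothetical stand-in for the driver's struct adapter. */
struct model_adapter {
	bool offline;               /* models pci_channel_offline(adapter->pdev) */
	uint32_t regs[NUM_REGS];    /* models the MMIO register window */
	unsigned int mmio_accesses; /* accesses that actually reached the "device" */
};

/* Hypothetical helper: reports whether the modelled channel is offline. */
bool model_channel_offline(const struct model_adapter *a)
{
	return a->offline;
}

/* Mirrors the patched t3_read_reg(): all-ones is returned while offline. */
uint32_t model_read_reg(struct model_adapter *a, uint32_t idx)
{
	uint32_t val = 0xFFFFFFFF;

	if (!model_channel_offline(a)) {
		val = a->regs[idx % NUM_REGS];
		a->mmio_accesses++;
	}
	return val;
}

/* Mirrors the patched t3_write_reg(): the write is dropped while offline. */
void model_write_reg(struct model_adapter *a, uint32_t idx, uint32_t val)
{
	if (!model_channel_offline(a)) {
		a->regs[idx % NUM_REGS] = val;
		a->mmio_accesses++;
	}
}

/* Walks through online, offline and recovered states of the model. */
void model_example(void)
{
	struct model_adapter a = { 0 };

	model_write_reg(&a, 3, 0x1234);
	assert(model_read_reg(&a, 3) == 0x1234);

	a.offline = true;                 /* device marked "offline" during reset */
	model_write_reg(&a, 3, 0x5678);   /* dropped, never touches the registers */
	assert(model_read_reg(&a, 3) == 0xFFFFFFFF);
	assert(a.mmio_accesses == 2);     /* only the two online accesses counted */

	a.offline = false;                /* recovery finished */
	assert(model_read_reg(&a, 3) == 0x1234);
}
```

In this model the offline test runs on every call to `model_read_reg()` and `model_write_reg()`, which is the per-access check the reply objects to.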