Patchwork net/CXGB3: Avoid access MMIO on offlined PCI dev

Submitter Gavin Shan
Date July 25, 2013, 7:32 a.m.
Message ID <1374737524-8410-1-git-send-email-shangw@linux.vnet.ibm.com>
Permalink /patch/261602/
State Rejected
Delegated to: David Miller

Comments

Gavin Shan - July 25, 2013, 7:32 a.m.
When an EEH error occurs on a PCI device, the PE (Partitionable
Endpoint) that contains the device is expected to be reset. During
the reset the PCI device is marked "offline", and it is not safe to
access the MMIO space of a device in that state. Doing so might cause
EEH recovery to fail, after which the PCI device stays removed from
the system until it is recovered manually.

This patch avoids MMIO access while the PCI device is marked
"offline", which prevents additional EEH errors during the reset and
lets EEH recovery complete successfully.

Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com>
---
 drivers/net/ethernet/chelsio/cxgb3/adapter.h |    8 ++++++--
 1 files changed, 6 insertions(+), 2 deletions(-)
David Miller - July 28, 2013, 3:14 a.m.
From: Gavin Shan <shangw@linux.vnet.ibm.com>
Date: Thu, 25 Jul 2013 15:32:04 +0800

> When an EEH error occurs on a PCI device, the PE (Partitionable
> Endpoint) that contains the device is expected to be reset. During
> the reset the PCI device is marked "offline", and it is not safe to
> access the MMIO space of a device in that state. Doing so might cause
> EEH recovery to fail, after which the PCI device stays removed from
> the system until it is recovered manually.
> 
> This patch avoids MMIO access while the PCI device is marked
> "offline", which prevents additional EEH errors during the reset and
> lets EEH recovery complete successfully.
> 
> Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com>

Making this check on every single register read or write is not
reasonable.

If this is how register access in every driver has to be done in order
to support EEH errors properly, it is simply not acceptable.

Patch

diff --git a/drivers/net/ethernet/chelsio/cxgb3/adapter.h b/drivers/net/ethernet/chelsio/cxgb3/adapter.h
index 8b395b5..5b20c5d 100644
--- a/drivers/net/ethernet/chelsio/cxgb3/adapter.h
+++ b/drivers/net/ethernet/chelsio/cxgb3/adapter.h
@@ -270,7 +270,10 @@  struct adapter {
 
 static inline u32 t3_read_reg(struct adapter *adapter, u32 reg_addr)
 {
-	u32 val = readl(adapter->regs + reg_addr);
+	u32 val = 0xFFFFFFFF;
+
+	if (!pci_channel_offline(adapter->pdev))
+		val = readl(adapter->regs + reg_addr);
 
 	CH_DBG(adapter, MMIO, "read register 0x%x value 0x%x\n", reg_addr, val);
 	return val;
@@ -279,7 +282,8 @@  static inline u32 t3_read_reg(struct adapter *adapter, u32 reg_addr)
 static inline void t3_write_reg(struct adapter *adapter, u32 reg_addr, u32 val)
 {
 	CH_DBG(adapter, MMIO, "setting register 0x%x to 0x%x\n", reg_addr, val);
-	writel(val, adapter->regs + reg_addr);
+	if (!pci_channel_offline(adapter->pdev))
+		writel(val, adapter->regs + reg_addr);
 }
 
 static inline struct port_info *adap2pinfo(struct adapter *adap, int idx)
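
The behaviour of the two patched accessors can be tried outside the kernel with a small user-space sketch. Everything here is a hypothetical stand-in and not cxgb3 code: `struct mock_adapter`, `mock_read_reg()` and `mock_write_reg()` are invented names, a plain array replaces the MMIO window, and the `offline` flag takes the place of `pci_channel_offline(adapter->pdev)`. It only illustrates the guard pattern from the diff above, under those assumptions.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for struct adapter; not the cxgb3 definition. */
struct mock_adapter {
	uint32_t regs[4];	/* fake register file, indexed by reg_addr / 4 */
	bool offline;		/* stands in for pci_channel_offline(adapter->pdev) */
};

/* Mirrors the patched t3_read_reg(): return all-ones while offline. */
static inline uint32_t mock_read_reg(const struct mock_adapter *adapter,
				     uint32_t reg_addr)
{
	uint32_t val = 0xFFFFFFFFu;

	if (!adapter->offline)
		val = adapter->regs[reg_addr / 4];

	return val;
}

/* Mirrors the patched t3_write_reg(): silently drop writes while offline. */
static inline void mock_write_reg(struct mock_adapter *adapter,
				  uint32_t reg_addr, uint32_t val)
{
	if (!adapter->offline)
		adapter->regs[reg_addr / 4] = val;
}
```

With `offline` set, a read yields 0xFFFFFFFF and a write leaves the register file untouched. Once the flag is cleared, the value stored before the reset is visible again. Note that the branch runs on every call, which is the per-access cost the review comment above objects to.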