diff mbox

[trusty,1/2] bnx2x: Fix kernel crash and data miscompare after EEH recovery

Message ID 1408621171-20164-2-git-send-email-apw@canonical.com
State New
Headers show

Commit Message

Andy Whitcroft Aug. 21, 2014, 11:39 a.m. UTC
From: "wenxiong@linux.vnet.ibm.com" <wenxiong@linux.vnet.ibm.com>

A rmb() is required to ensure that the CQE is not read before it
is written by the adapter DMA.  PCI ordering rules will make sure
the other fields are written before the marker at the end of struct
eth_fast_path_rx_cqe but without rmb() a weakly ordered processor can
process stale data.

Without the barrier we have observed various crashes including
bnx2x_tpa_start being called on queues not stopped (resulting in message
start of bin not in stop) and NULL pointer exceptions from bnx2x_rx_int.

Signed-off-by: Milton Miller <miltonm@us.ibm.com>
Signed-off-by: Wen Xiong <wenxiong@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

(cherry picked from commit 9aaae044abe95de182d09004cc3fa181bf22e6e0)
BugLink: http://bugs.launchpad.net/bugs/1353105
Signed-off-by: Andy Whitcroft <apw@canonical.com>
---
 drivers/net/ethernet/broadcom/bnx2x/bnx2x_cmn.c | 12 ++++++++++++
 1 file changed, 12 insertions(+)
diff mbox

Patch

diff --git a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_cmn.c b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_cmn.c
index 4265df2..74e6040 100644
--- a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_cmn.c
+++ b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_cmn.c
@@ -862,6 +862,18 @@  int bnx2x_rx_int(struct bnx2x_fastpath *fp, int budget)
 		bd_prod = RX_BD(bd_prod);
 		bd_cons = RX_BD(bd_cons);
 
+		/* A rmb() is required to ensure that the CQE is not read
+		 * before it is written by the adapter DMA.  PCI ordering
+		 * rules will make sure the other fields are written before
+		 * the marker at the end of struct eth_fast_path_rx_cqe
+		 * but without rmb() a weakly ordered processor can process
+		 * stale data.  Without the barrier TPA state-machine might
+		 * enter inconsistent state and kernel stack might be
+		 * provided with incorrect packet description - these lead
+		 * to various kernel crashed.
+		 */
+		rmb();
+
 		cqe_fp_flags = cqe_fp->type_error_flags;
 		cqe_fp_type = cqe_fp_flags & ETH_FAST_PATH_RX_CQE_TYPE;