[net] ibmvnic: Skip fatal error reset after passive init

Message ID 20200430182211.24211-1-julietk@linux.vnet.ibm.com
State Not Applicable
Series [net] ibmvnic: Skip fatal error reset after passive init

Checks

Context Check Description
snowpatch_ozlabs/apply_patch success Successfully applied on branch powerpc/merge (54dc28ff5e0b3585224d49a31b53e030342ca5c3)
snowpatch_ozlabs/build-ppc64le success Build succeeded
snowpatch_ozlabs/build-ppc64be success Build succeeded
snowpatch_ozlabs/build-ppc64e success Build succeeded
snowpatch_ozlabs/build-pmac32 warning Upstream build failed, couldn't test patch
snowpatch_ozlabs/checkpatch success total: 0 errors, 0 warnings, 0 checks, 9 lines checked
snowpatch_ozlabs/needsstable success Patch has no Fixes tags

Commit Message

Juliet Kim April 30, 2020, 6:22 p.m. UTC
During an MTU change, the following sequence of events may occur.
Client-driven CRQ initialization fails because the partner’s CRQ is
closed, causing the client to enqueue a reset task for FATAL_ERROR.
Then passive (server-driven) CRQ initialization succeeds, causing the
client to release the CRQ and enqueue a reset task for failover. If
the passive CRQ initialization occurs before the FATAL reset task is
processed, the FATAL error reset task tries to access a CRQ message
queue that has already been freed, causing an oops. The problem
appears most likely to occur during a DLPAR add of a vNIC with a
non-default MTU, because the DLPAR process automatically issues a
change MTU request.

Fix this by skipping the fatal error reset if the CRQ is passively
initialized after client-driven CRQ initialization fails.

Signed-off-by: Juliet Kim <julietk@linux.vnet.ibm.com>
---
 drivers/net/ethernet/ibm/ibmvnic.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

Comments

David Miller April 30, 2020, 8:28 p.m. UTC | #1
From: Juliet Kim <julietk@linux.vnet.ibm.com>
Date: Thu, 30 Apr 2020 13:22:11 -0500

> During an MTU change, the following sequence of events may occur.
> Client-driven CRQ initialization fails because the partner’s CRQ is
> closed, causing the client to enqueue a reset task for FATAL_ERROR.
> Then passive (server-driven) CRQ initialization succeeds, causing the
> client to release the CRQ and enqueue a reset task for failover. If
> the passive CRQ initialization occurs before the FATAL reset task is
> processed, the FATAL error reset task tries to access a CRQ message
> queue that has already been freed, causing an oops. The problem
> appears most likely to occur during a DLPAR add of a vNIC with a
> non-default MTU, because the DLPAR process automatically issues a
> change MTU request.
> 
> Fix this by skipping the fatal error reset if the CRQ is passively
> initialized after client-driven CRQ initialization fails.
> 
> Signed-off-by: Juliet Kim <julietk@linux.vnet.ibm.com>

Applied, thanks.

Patch

diff --git a/drivers/net/ethernet/ibm/ibmvnic.c b/drivers/net/ethernet/ibm/ibmvnic.c
index 4bd33245bad6..3de549c6c693 100644
--- a/drivers/net/ethernet/ibm/ibmvnic.c
+++ b/drivers/net/ethernet/ibm/ibmvnic.c
@@ -2189,7 +2189,8 @@  static void __ibmvnic_reset(struct work_struct *work)
 				rc = do_hard_reset(adapter, rwi, reset_state);
 				rtnl_unlock();
 			}
-		} else {
+		} else if (!(rwi->reset_reason == VNIC_RESET_FATAL &&
+				adapter->from_passive_init)) {
 			rc = do_reset(adapter, rwi, reset_state);
 		}
 		kfree(rwi);
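
For readers who want to check the new condition in isolation, the following is a minimal user-space sketch of the decision made by the else-if branch in the hunk above. It is not driver code: the struct and enum names ending in _model, the helper should_call_do_reset(), and the numeric enum values are assumptions made for illustration. Only VNIC_RESET_FATAL, reset_reason and from_passive_init are taken from the patch, and VNIC_RESET_FAILOVER stands in for the failover reset described in the commit message.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Simplified stand-ins for the driver's types (hypothetical names).
 * Only the fields read by the patched condition are modelled. */
enum reset_reason_model {
	VNIC_RESET_FAILOVER = 1,	/* illustrative value, not from the patch */
	VNIC_RESET_FATAL,
};

struct rwi_model {
	enum reset_reason_model reset_reason;
};

struct adapter_model {
	bool from_passive_init;
};

/* Mirrors the condition of the new else-if branch: the regular reset
 * path is skipped only when the queued work item is a FATAL reset AND
 * the CRQ has since been initialized passively. */
static bool should_call_do_reset(const struct adapter_model *adapter,
				 const struct rwi_model *rwi)
{
	assert(adapter != NULL && rwi != NULL);

	return !(rwi->reset_reason == VNIC_RESET_FATAL &&
		 adapter->from_passive_init);
}
```

Under these assumptions, a FATAL work item is dropped once from_passive_init is set, while a failover work item, or a FATAL one without passive init, still reaches the reset path. In the patch the skipped item is still freed by the kfree(rwi) that follows the branch.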