[net,v2] i40e: Fix erroneous adapter reinitialization during recovery process

Message ID: 20220622094654.1379722-1-jan.sokolowski@intel.com
State: Accepted
Delegated to: Anthony Nguyen

Commit Message

Jan Sokolowski June 22, 2022, 9:46 a.m. UTC
From: Dawid Lukwinski <dawid.lukwinski@intel.com>

Fix an issue where the driver incorrectly detects the state
of the recovery process and erroneously reinitializes interrupts,
which results in a kernel error and a call trace.

The issue was caused by a combination of two factors:
1. Assuming that the EMP reset issued after firmware recovery
completes means the whole recovery process is complete.
2. Erroneous reinitialization of the interrupt vector after detecting
the above-mentioned EMP reset.

Fix (1) by changing how a change of recovery state is detected
and (2) by adjusting the conditional expression so that the proper
interrupt reinitialization method is used, depending on the situation.

Fixes: 4ff0ee1af016 ("i40e: Introduce recovery mode support")
Signed-off-by: Dawid Lukwinski <dawid.lukwinski@intel.com>
Signed-off-by: Jan Sokolowski <jan.sokolowski@intel.com>
---
v2: Change author to Dawid, and remove signed-off-by from Alice
---
---
 drivers/net/ethernet/intel/i40e/i40e_main.c | 13 +++++--------
 1 file changed, 5 insertions(+), 8 deletions(-)

Comments

Jankowski, Konrad0 July 5, 2022, 9:59 a.m. UTC | #1
> -----Original Message-----
> From: Intel-wired-lan <intel-wired-lan-bounces@osuosl.org> On Behalf Of
> Jan Sokolowski
> Sent: Wednesday, June 22, 2022 11:47 AM
> To: intel-wired-lan@lists.osuosl.org
> Cc: Lukwinski, Dawid <dawid.lukwinski@intel.com>
> Subject: [Intel-wired-lan] [PATCH net v2] i40e: Fix erroneous adapter
> reinitialization during recovery process
> 
> From: Dawid Lukwinski <dawid.lukwinski@intel.com>
> 
> Fix an issue, when driver incorrectly detects state of recovery process and
> erroneously reinitializes interrupts, which results in a kernel error and call
> trace message.
> 
> The issue was caused by a combination of two factors:
> 1. Assuming the EMP reset issued after completing firmware recovery means
> the whole recovery process is complete.
> 2. Erroneous reinitialization of interrupt vector after detecting the
> abovementioned EMP reset.
> 
> Fixes (1) by changing how recovery state change is detected and (2) by
> adjusting the conditional expression to ensure using proper interrupt
> reinitialization method, depending on the situation.
> 
> Fixes: 4ff0ee1af016 ("i40e: Introduce recovery mode support")
> Signed-off-by: Dawid Lukwinski <dawid.lukwinski@intel.com>
> Signed-off-by: Jan Sokolowski <jan.sokolowski@intel.com>
> ---
> v2: Change author to Dawid, and remove signed-off-by from Alice
> ---
> ---
>  drivers/net/ethernet/intel/i40e/i40e_main.c | 13 +++++--------
>  1 file changed, 5 insertions(+), 8 deletions(-)
> 
> diff --git a/drivers/net/ethernet/intel/i40e/i40e_main.c
> b/drivers/net/ethernet/intel/i40e/i40e_main.c
> index d59b9a08f5b3..685556e968f2 100644
> --- a/drivers/net/ethernet/intel/i40e/i40e_main.c
> +++ b/drivers/net/ethernet/intel/i40e/i40e_main.c
> @@ -10654,7 +10654,7 @@ static int i40e_reset(struct i40e_pf *pf)

Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com>

Patch

diff --git a/drivers/net/ethernet/intel/i40e/i40e_main.c b/drivers/net/ethernet/intel/i40e/i40e_main.c
index d59b9a08f5b3..685556e968f2 100644
--- a/drivers/net/ethernet/intel/i40e/i40e_main.c
+++ b/drivers/net/ethernet/intel/i40e/i40e_main.c
@@ -10654,7 +10654,7 @@  static int i40e_reset(struct i40e_pf *pf)
  **/
 static void i40e_rebuild(struct i40e_pf *pf, bool reinit, bool lock_acquired)
 {
-	int old_recovery_mode_bit = test_bit(__I40E_RECOVERY_MODE, pf->state);
+	const bool is_recovery_mode_reported = i40e_check_recovery_mode(pf);
 	struct i40e_vsi *vsi = pf->vsi[pf->lan_vsi];
 	struct i40e_hw *hw = &pf->hw;
 	i40e_status ret;
@@ -10662,13 +10662,11 @@  static void i40e_rebuild(struct i40e_pf *pf, bool reinit, bool lock_acquired)
 	int v;
 
 	if (test_bit(__I40E_EMP_RESET_INTR_RECEIVED, pf->state) &&
-	    i40e_check_recovery_mode(pf)) {
+	    is_recovery_mode_reported)
 		i40e_set_ethtool_ops(pf->vsi[pf->lan_vsi]->netdev);
-	}
 
 	if (test_bit(__I40E_DOWN, pf->state) &&
-	    !test_bit(__I40E_RECOVERY_MODE, pf->state) &&
-	    !old_recovery_mode_bit)
+	    !test_bit(__I40E_RECOVERY_MODE, pf->state))
 		goto clear_recovery;
 	dev_dbg(&pf->pdev->dev, "Rebuilding internal switch\n");
 
@@ -10695,13 +10693,12 @@  static void i40e_rebuild(struct i40e_pf *pf, bool reinit, bool lock_acquired)
 	 * accordingly with regard to resources initialization
 	 * and deinitialization
 	 */
-	if (test_bit(__I40E_RECOVERY_MODE, pf->state) ||
-	    old_recovery_mode_bit) {
+	if (test_bit(__I40E_RECOVERY_MODE, pf->state)) {
 		if (i40e_get_capabilities(pf,
 					  i40e_aqc_opc_list_func_capabilities))
 			goto end_unlock;
 
-		if (test_bit(__I40E_RECOVERY_MODE, pf->state)) {
+		if (is_recovery_mode_reported) {
 			/* we're staying in recovery mode so we'll reinitialize
 			 * misc vector here
 			 */