diff mbox series

[SRU,FOCAL,1/1] net/mlx5: Fix a race when moving command interface to polling mode

Message ID 20210119160015.37992-2-william.gray@canonical.com
State New
Series net/mlx5: Fix a race when moving command interface to polling mode

Commit Message

William Breathitt Gray Jan. 19, 2021, 4 p.m. UTC
From: Eran Ben Elisha <eranbe@mellanox.com>

BugLink: https://bugs.launchpad.net/bugs/1905574

As part of driver unload, the driver destroys the commands EQ (via a FW
command). Once the commands EQ is destroyed, FW will not generate EQEs for
any command that the driver sends afterwards, so the driver must poll for
the status of later commands.

Driver commands mode metadata is updated before the commands EQ is
actually destroyed. This can lead to double completion handling by the
driver (polling and interrupt) if a command is executed and completed by
FW after the mode was changed but before the EQ was destroyed.

Fix that by using the mlx5_cmd_allowed_opcode mechanism to guarantee
that only the DESTROY_EQ command can be executed during this time period.

Fixes: e126ba97dba9 ("mlx5: Add driver for Mellanox Connect-IB adapters")
Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com>
Reviewed-by: Moshe Shemesh <moshe@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
(cherry-picked from commit 432161ea26d6d5e5c3f7306d9407d26ed1e1953e)
Signed-off-by: William Breathitt Gray <william.gray@canonical.com>
---
 drivers/net/ethernet/mellanox/mlx5/core/eq.c | 2 ++
 1 file changed, 2 insertions(+)

Comments

William Breathitt Gray Jan. 19, 2021, 4:04 p.m. UTC | #1
On Tue, Jan 19, 2021 at 11:00:15AM -0500, William Breathitt Gray wrote:
> From: Eran Ben Elisha <eranbe@mellanox.com>
> 
> BugLink: https://bugs.launchpad.net/bugs/1905574
> 
> As part of driver unload, the driver destroys the commands EQ (via a FW
> command). Once the commands EQ is destroyed, FW will not generate EQEs for
> any command that the driver sends afterwards, so the driver must poll for
> the status of later commands.
> 
> Driver commands mode metadata is updated before the commands EQ is
> actually destroyed. This can lead to double completion handling by the
> driver (polling and interrupt) if a command is executed and completed by
> FW after the mode was changed but before the EQ was destroyed.
> 
> Fix that by using the mlx5_cmd_allowed_opcode mechanism to guarantee
> that only the DESTROY_EQ command can be executed during this time period.
> 
> Fixes: e126ba97dba9 ("mlx5: Add driver for Mellanox Connect-IB adapters")
> Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com>
> Reviewed-by: Moshe Shemesh <moshe@mellanox.com>
> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
> (cherry-picked from commit 432161ea26d6d5e5c3f7306d9407d26ed1e1953e)

Cherry pick line has a typo.

Nacked-by: William Breathitt Gray <william.gray@canonical.com>

> Signed-off-by: William Breathitt Gray <william.gray@canonical.com>
> ---
>  drivers/net/ethernet/mellanox/mlx5/core/eq.c | 2 ++
>  1 file changed, 2 insertions(+)
> 
> diff --git a/drivers/net/ethernet/mellanox/mlx5/core/eq.c b/drivers/net/ethernet/mellanox/mlx5/core/eq.c
> index 0a20938b4aad..938c4a46f9de 100644
> --- a/drivers/net/ethernet/mellanox/mlx5/core/eq.c
> +++ b/drivers/net/ethernet/mellanox/mlx5/core/eq.c
> @@ -695,8 +695,10 @@ static void destroy_async_eqs(struct mlx5_core_dev *dev)
>  
>  	cleanup_async_eq(dev, &table->pages_eq, "pages");
>  	cleanup_async_eq(dev, &table->async_eq, "async");
> +	mlx5_cmd_allowed_opcode(dev, MLX5_CMD_OP_DESTROY_EQ);
>  	mlx5_cmd_use_polling(dev);
>  	cleanup_async_eq(dev, &table->cmd_eq, "cmd");
> +	mlx5_cmd_allowed_opcode(dev, CMD_ALLOWED_OPCODE_ALL);
>  	mlx5_eq_notifier_unregister(dev, &table->cq_err_nb);
>  }
>  
> -- 
> 2.27.0
>

Patch

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/eq.c b/drivers/net/ethernet/mellanox/mlx5/core/eq.c
index 0a20938b4aad..938c4a46f9de 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/eq.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/eq.c
@@ -695,8 +695,10 @@  static void destroy_async_eqs(struct mlx5_core_dev *dev)
 
 	cleanup_async_eq(dev, &table->pages_eq, "pages");
 	cleanup_async_eq(dev, &table->async_eq, "async");
+	mlx5_cmd_allowed_opcode(dev, MLX5_CMD_OP_DESTROY_EQ);
 	mlx5_cmd_use_polling(dev);
 	cleanup_async_eq(dev, &table->cmd_eq, "cmd");
+	mlx5_cmd_allowed_opcode(dev, CMD_ALLOWED_OPCODE_ALL);
 	mlx5_eq_notifier_unregister(dev, &table->cq_err_nb);
 }
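
For readers who want to see why the ordering in the hunk above matters, here is a minimal user-space sketch of the race window. It is not driver code: every model_* name, the MODEL_* constants and their values, and the way a "double completion" is counted are assumptions made up for illustration. It only models the idea from the commit message, namely that a command accepted after the switch to polling but before the commands EQ is destroyed could be completed by both paths, and that restricting the allowed opcode to DESTROY_EQ for that window keeps such a command out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for driver constants; values are illustrative only. */
enum { MODEL_OP_DESTROY_EQ = 1, MODEL_OP_OTHER = 2 };
#define MODEL_ALLOWED_OPCODE_ALL 0xffff

enum model_cmd_mode { MODEL_MODE_EVENTS, MODEL_MODE_POLLING };

/* Toy device state: just the pieces the commit message talks about. */
struct model_dev {
	uint16_t allowed_opcode;   /* gate: a single opcode, or "all" */
	enum model_cmd_mode mode;  /* how the driver expects completions */
	bool cmd_eq_alive;         /* FW still generates EQEs while true */
	int double_completions;    /* commands completed by both paths */
};

static void model_dev_init(struct model_dev *dev)
{
	dev->allowed_opcode = MODEL_ALLOWED_OPCODE_ALL;
	dev->mode = MODEL_MODE_EVENTS;
	dev->cmd_eq_alive = true;
	dev->double_completions = 0;
}

/* Returns 0 if the command was accepted, -1 if the gate rejected it. */
static int model_cmd_exec(struct model_dev *dev, uint16_t opcode)
{
	assert(dev != NULL);

	if (dev->allowed_opcode != MODEL_ALLOWED_OPCODE_ALL &&
	    opcode != dev->allowed_opcode)
		return -1;

	/*
	 * Assumption of this model: a non-DESTROY_EQ command issued in
	 * polling mode while the EQ still exists is seen by both the
	 * polling path and the interrupt path.
	 */
	if (dev->mode == MODEL_MODE_POLLING && dev->cmd_eq_alive &&
	    opcode != MODEL_OP_DESTROY_EQ)
		dev->double_completions++;

	if (opcode == MODEL_OP_DESTROY_EQ)
		dev->cmd_eq_alive = false;

	return 0;
}

/*
 * Models the tail of the unload sequence with one racing command.
 * Returns the number of double completions observed.
 */
static int model_unload(struct model_dev *dev, bool use_gate)
{
	if (use_gate)
		dev->allowed_opcode = MODEL_OP_DESTROY_EQ;

	dev->mode = MODEL_MODE_POLLING;

	/* A concurrent caller slips in before the EQ is destroyed. */
	(void)model_cmd_exec(dev, MODEL_OP_OTHER);

	/* The driver itself destroys the commands EQ. */
	(void)model_cmd_exec(dev, MODEL_OP_DESTROY_EQ);

	/* Reopen the gate once the window is closed. */
	dev->allowed_opcode = MODEL_ALLOWED_OPCODE_ALL;

	return dev->double_completions;
}
```

Calling model_unload() on a freshly initialised model_dev with use_gate set to false lets the racing command through while the EQ is still alive; with use_gate set to true the same command is rejected and only the DESTROY_EQ step runs inside the window. The real driver's locking and completion handling are considerably more involved than this model.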