diff mbox

[V2,net,06/20] net/ena: fix NULL dereference when removing the driver after device reset faild

Message ID 1480857578-5065-7-git-send-email-netanel@annapurnalabs.com
State Changes Requested, archived
Delegated to: David Miller
Headers show

Commit Message

Netanel Belgazal Dec. 4, 2016, 1:19 p.m. UTC
If for some reason the device stop responding and the device reset failed
to recover the device, the mmio register read datastructure will not be
reinitialized.
On driver removal, the driver will also tries to reset the device
but this time the mmio data structure will be NULL.

To solve this issue perform the device reset in the remove function only if
the device is runnig.

Signed-off-by: Netanel Belgazal <netanel@annapurnalabs.com>
---
 drivers/net/ethernet/amazon/ena/ena_netdev.c | 6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)

Comments

Matt Wilson Dec. 5, 2016, 4:29 a.m. UTC | #1
On Sun, Dec 04, 2016 at 03:19:24PM +0200, Netanel Belgazal wrote:
> If for some reason the device stop responding and the device reset failed
> to recover the device, the mmio register read datastructure will not be
> reinitialized.

If for some reason the device stops responding, and the device reset
fails to recover the device, the MMIO register read data structure
will not be reinitialized.

> On driver removal, the driver will also tries to reset the device
> but this time the mmio data structure will be NULL.

On driver removal, the driver will also try to reset the device, but
this time the MMIO data structure will be NULL.

> To solve this issue perform the device reset in the remove function only if
> the device is runnig.

To solve this issue, perform the device reset in the remove function
only if the device is running.

Do you have an example of the NULL pointer dereference that you can
paste in? It can be helpful for those searching for a fix for a bug
they've experienced.

--msw

> Signed-off-by: Netanel Belgazal <netanel@annapurnalabs.com>
> ---
>  drivers/net/ethernet/amazon/ena/ena_netdev.c | 6 +++++-
>  1 file changed, 5 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/net/ethernet/amazon/ena/ena_netdev.c b/drivers/net/ethernet/amazon/ena/ena_netdev.c
> index 224302c..ad5f78f 100644
> --- a/drivers/net/ethernet/amazon/ena/ena_netdev.c
> +++ b/drivers/net/ethernet/amazon/ena/ena_netdev.c
> @@ -2516,6 +2516,8 @@ static void ena_fw_reset_device(struct work_struct *work)
>  err:
>  	rtnl_unlock();
>  
> +	clear_bit(ENA_FLAG_DEVICE_RUNNING, &adapter->flags);
> +
>  	dev_err(&pdev->dev,
>  		"Reset attempt failed. Can not reset the device\n");
>  }
> @@ -3126,7 +3128,9 @@ static void ena_remove(struct pci_dev *pdev)
>  
>  	cancel_work_sync(&adapter->resume_io_task);
>  
> -	ena_com_dev_reset(ena_dev);
> +	/* Reset the device only if the device is running. */
> +	if (test_bit(ENA_FLAG_DEVICE_RUNNING, &adapter->flags))
> +		ena_com_dev_reset(ena_dev);
>  
>  	ena_free_mgmnt_irq(adapter);
>
Netanel Belgazal Dec. 5, 2016, 6:30 p.m. UTC | #2
On 12/05/2016 06:29 AM, Matt Wilson wrote:
> On Sun, Dec 04, 2016 at 03:19:24PM +0200, Netanel Belgazal wrote:
>> If for some reason the device stop responding and the device reset failed
>> to recover the device, the mmio register read datastructure will not be
>> reinitialized.
> If for some reason the device stops responding, and the device reset
> fails to recover the device, the MMIO register read data structure
> will not be reinitialized.

OK
>
>> On driver removal, the driver will also tries to reset the device
>> but this time the mmio data structure will be NULL.
> On driver removal, the driver will also try to reset the device, but
> this time the MMIO data structure will be NULL.
OK
>> To solve this issue perform the device reset in the remove function only if
>> the device is runnig.
> To solve this issue, perform the device reset in the remove function
> only if the device is running.
>
> Do you have an example of the NULL pointer dereference that you can
> paste in? It can be helpful for those searching for a fix for a bug
> they've experienced.
I'll add a crash dump.

> --msw
>
>> Signed-off-by: Netanel Belgazal <netanel@annapurnalabs.com>
>> ---
>>   drivers/net/ethernet/amazon/ena/ena_netdev.c | 6 +++++-
>>   1 file changed, 5 insertions(+), 1 deletion(-)
>>
>> diff --git a/drivers/net/ethernet/amazon/ena/ena_netdev.c b/drivers/net/ethernet/amazon/ena/ena_netdev.c
>> index 224302c..ad5f78f 100644
>> --- a/drivers/net/ethernet/amazon/ena/ena_netdev.c
>> +++ b/drivers/net/ethernet/amazon/ena/ena_netdev.c
>> @@ -2516,6 +2516,8 @@ static void ena_fw_reset_device(struct work_struct *work)
>>   err:
>>   	rtnl_unlock();
>>   
>> +	clear_bit(ENA_FLAG_DEVICE_RUNNING, &adapter->flags);
>> +
>>   	dev_err(&pdev->dev,
>>   		"Reset attempt failed. Can not reset the device\n");
>>   }
>> @@ -3126,7 +3128,9 @@ static void ena_remove(struct pci_dev *pdev)
>>   
>>   	cancel_work_sync(&adapter->resume_io_task);
>>   
>> -	ena_com_dev_reset(ena_dev);
>> +	/* Reset the device only if the device is running. */
>> +	if (test_bit(ENA_FLAG_DEVICE_RUNNING, &adapter->flags))
>> +		ena_com_dev_reset(ena_dev);
>>   
>>   	ena_free_mgmnt_irq(adapter);
>>
diff mbox

Patch

diff --git a/drivers/net/ethernet/amazon/ena/ena_netdev.c b/drivers/net/ethernet/amazon/ena/ena_netdev.c
index 224302c..ad5f78f 100644
--- a/drivers/net/ethernet/amazon/ena/ena_netdev.c
+++ b/drivers/net/ethernet/amazon/ena/ena_netdev.c
@@ -2516,6 +2516,8 @@  static void ena_fw_reset_device(struct work_struct *work)
 err:
 	rtnl_unlock();
 
+	clear_bit(ENA_FLAG_DEVICE_RUNNING, &adapter->flags);
+
 	dev_err(&pdev->dev,
 		"Reset attempt failed. Can not reset the device\n");
 }
@@ -3126,7 +3128,9 @@  static void ena_remove(struct pci_dev *pdev)
 
 	cancel_work_sync(&adapter->resume_io_task);
 
-	ena_com_dev_reset(ena_dev);
+	/* Reset the device only if the device is running. */
+	if (test_bit(ENA_FLAG_DEVICE_RUNNING, &adapter->flags))
+		ena_com_dev_reset(ena_dev);
 
 	ena_free_mgmnt_irq(adapter);