diff mbox series

[V1,net] net: ena: don't wake up tx queue when down

Message ID 20190915142944.7572-1-sameehj@amazon.com
State Accepted
Delegated to: David Miller
Headers show
Series [V1,net] net: ena: don't wake up tx queue when down | expand

Commit Message

Jubran, Samih Sept. 15, 2019, 2:29 p.m. UTC
From: Sameeh Jubran <sameehj@amazon.com>

There is a race condition that can occur when calling ena_down().
The ena_clean_tx_irq() - which is a part of the napi handler -
function might wake up the tx queue when the queue is supposed
to be down (during recovery or changing the size of the queues
for example) This causes the ena_start_xmit() function to trigger
and possibly try to access the destroyed queues.

The race is illustrated below:

Flow A:                                       Flow B(napi handler)
ena_down()
   netif_carrier_off()
   netif_tx_disable()
                                                      ena_clean_tx_irq()
                                                         netif_tx_wake_queue()
   ena_napi_disable_all()
   ena_destroy_all_io_queues()

After these flows the tx queue is active and ena_start_xmit() accesses
the destroyed queue which leads to a kernel panic.

fixes: 1738cd3ed342 (net: ena: Add a driver for Amazon Elastic Network Adapters (ENA))

Signed-off-by: Sameeh Jubran <sameehj@amazon.com>
---
 drivers/net/ethernet/amazon/ena/ena_netdev.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

Comments

David Miller Sept. 16, 2019, 7:40 p.m. UTC | #1
From: <sameehj@amazon.com>
Date: Sun, 15 Sep 2019 17:29:44 +0300

> From: Sameeh Jubran <sameehj@amazon.com>
> 
> There is a race condition that can occur when calling ena_down().
> The ena_clean_tx_irq() - which is a part of the napi handler -
> function might wake up the tx queue when the queue is supposed
> to be down (during recovery or changing the size of the queues
> for example) This causes the ena_start_xmit() function to trigger
> and possibly try to access the destroyed queues.
> 
> The race is illustrated below:
> 
> Flow A:                                       Flow B(napi handler)
> ena_down()
>    netif_carrier_off()
>    netif_tx_disable()
>                                                       ena_clean_tx_irq()
>                                                          netif_tx_wake_queue()
>    ena_napi_disable_all()
>    ena_destroy_all_io_queues()
> 
> After these flows the tx queue is active and ena_start_xmit() accesses
> the destroyed queue which leads to a kernel panic.
> 
> fixes: 1738cd3ed342 (net: ena: Add a driver for Amazon Elastic Network Adapters (ENA))
> 
> Signed-off-by: Sameeh Jubran <sameehj@amazon.com>

Applied.
diff mbox series

Patch

diff --git a/drivers/net/ethernet/amazon/ena/ena_netdev.c b/drivers/net/ethernet/amazon/ena/ena_netdev.c
index 664e3ed97..d118ed4c5 100644
--- a/drivers/net/ethernet/amazon/ena/ena_netdev.c
+++ b/drivers/net/ethernet/amazon/ena/ena_netdev.c
@@ -823,7 +823,8 @@  static int ena_clean_tx_irq(struct ena_ring *tx_ring, u32 budget)
 		above_thresh =
 			ena_com_sq_have_enough_space(tx_ring->ena_com_io_sq,
 						     ENA_TX_WAKEUP_THRESH);
-		if (netif_tx_queue_stopped(txq) && above_thresh) {
+		if (netif_tx_queue_stopped(txq) && above_thresh &&
+		    test_bit(ENA_FLAG_DEV_UP, &tx_ring->adapter->flags)) {
 			netif_tx_wake_queue(txq);
 			u64_stats_update_begin(&tx_ring->syncp);
 			tx_ring->tx_stats.queue_wakeup++;