Patchwork [net-next,V1,5/6] net/mlx4_en: Fix a race when closing TX queue

login
register
mail settings
Submitter Amir Vadai
Date Jan. 17, 2013, 11:47 a.m.
Message ID <1358423241-2452-6-git-send-email-amirv@mellanox.com>
Download mbox | patch
Permalink /patch/213222/
State Changes Requested
Delegated to: David Miller
Headers show

Comments

Amir Vadai - Jan. 17, 2013, 11:47 a.m.
There is a possible race where the TX completion handler can clean the
entire TX queue between the decision that the queue is full and actually
closing it. To avoid this situation, check again if the queue is really
full, if not, reopen the transmit and continue with sending the packet.

Signed-off-by: Yevgeny Petrilin <yevgenyp@mellanox.com>
Signed-off-by: Eugenia Emantayev <eugenia@mellanox.com>
Signed-off-by: Amir Vadai <amirv@mellanox.com>
---
 drivers/net/ethernet/mellanox/mlx4/en_tx.c |    9 ++++++++-
 1 files changed, 8 insertions(+), 1 deletions(-)
Eric Dumazet - Jan. 17, 2013, 2:22 p.m.
On Thu, 2013-01-17 at 13:47 +0200, Amir Vadai wrote:
> There is a possible race where the TX completion handler can clean the
> entire TX queue between the decision that the queue is full and actually
> closing it. To avoid this situation, check again if the queue is really
> full, if not, reopen the transmit and continue with sending the packet.
> 
> Signed-off-by: Yevgeny Petrilin <yevgenyp@mellanox.com>
> Signed-off-by: Eugenia Emantayev <eugenia@mellanox.com>
> Signed-off-by: Amir Vadai <amirv@mellanox.com>
> ---
>  drivers/net/ethernet/mellanox/mlx4/en_tx.c |    9 ++++++++-
>  1 files changed, 8 insertions(+), 1 deletions(-)
> 
> diff --git a/drivers/net/ethernet/mellanox/mlx4/en_tx.c b/drivers/net/ethernet/mellanox/mlx4/en_tx.c
> index 2b799f4..1d17f5f 100644
> --- a/drivers/net/ethernet/mellanox/mlx4/en_tx.c
> +++ b/drivers/net/ethernet/mellanox/mlx4/en_tx.c
> @@ -592,7 +592,14 @@ netdev_tx_t mlx4_en_xmit(struct sk_buff *skb, struct net_device *dev)
>  		netif_tx_stop_queue(ring->tx_queue);
>  		priv->port_stats.queue_stopped++;
>  
> -		return NETDEV_TX_BUSY;
> +		/* Check again whether the queue was cleaned */
> +		if (unlikely(((int)(ring->prod - ring->cons)) <=
> +				ring->size - HEADROOM - MAX_DESC_TXBBS)) {
> +			netif_tx_wake_queue(ring->tx_queue);
> +			priv->port_stats.wake_queue++;
> +		} else {
> +			return NETDEV_TX_BUSY;
> +		}
>  	}
>  
>  	/* Track current inflight packets for performance analysis */

This looks racy to me. You probably want explicit memory barriers ?



--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
David Miller - Jan. 17, 2013, 8:44 p.m.
From: Eric Dumazet <eric.dumazet@gmail.com>
Date: Thu, 17 Jan 2013 06:22:18 -0800

> On Thu, 2013-01-17 at 13:47 +0200, Amir Vadai wrote:
>> There is a possible race where the TX completion handler can clean the
>> entire TX queue between the decision that the queue is full and actually
>> closing it. To avoid this situation, check again if the queue is really
>> full, if not, reopen the transmit and continue with sending the packet.
>> 
>> Signed-off-by: Yevgeny Petrilin <yevgenyp@mellanox.com>
>> Signed-off-by: Eugenia Emantayev <eugenia@mellanox.com>
>> Signed-off-by: Amir Vadai <amirv@mellanox.com>
>> ---
>>  drivers/net/ethernet/mellanox/mlx4/en_tx.c |    9 ++++++++-
>>  1 files changed, 8 insertions(+), 1 deletions(-)
>> 
>> diff --git a/drivers/net/ethernet/mellanox/mlx4/en_tx.c b/drivers/net/ethernet/mellanox/mlx4/en_tx.c
>> index 2b799f4..1d17f5f 100644
>> --- a/drivers/net/ethernet/mellanox/mlx4/en_tx.c
>> +++ b/drivers/net/ethernet/mellanox/mlx4/en_tx.c
>> @@ -592,7 +592,14 @@ netdev_tx_t mlx4_en_xmit(struct sk_buff *skb, struct net_device *dev)
>>  		netif_tx_stop_queue(ring->tx_queue);
>>  		priv->port_stats.queue_stopped++;
>>  
>> -		return NETDEV_TX_BUSY;
>> +		/* Check again whether the queue was cleaned */
>> +		if (unlikely(((int)(ring->prod - ring->cons)) <=
>> +				ring->size - HEADROOM - MAX_DESC_TXBBS)) {
>> +			netif_tx_wake_queue(ring->tx_queue);
>> +			priv->port_stats.wake_queue++;
>> +		} else {
>> +			return NETDEV_TX_BUSY;
>> +		}
>>  	}
>>  
>>  	/* Track current inflight packets for performance analysis */
> 
> This looks racy to me. You probably want explicit memory barriers ?

The conditional is also not formatted correctly.
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Patch

diff --git a/drivers/net/ethernet/mellanox/mlx4/en_tx.c b/drivers/net/ethernet/mellanox/mlx4/en_tx.c
index 2b799f4..1d17f5f 100644
--- a/drivers/net/ethernet/mellanox/mlx4/en_tx.c
+++ b/drivers/net/ethernet/mellanox/mlx4/en_tx.c
@@ -592,7 +592,14 @@  netdev_tx_t mlx4_en_xmit(struct sk_buff *skb, struct net_device *dev)
 		netif_tx_stop_queue(ring->tx_queue);
 		priv->port_stats.queue_stopped++;
 
-		return NETDEV_TX_BUSY;
+		/* Check again whether the queue was cleaned */
+		if (unlikely(((int)(ring->prod - ring->cons)) <=
+				ring->size - HEADROOM - MAX_DESC_TXBBS)) {
+			netif_tx_wake_queue(ring->tx_queue);
+			priv->port_stats.wake_queue++;
+		} else {
+			return NETDEV_TX_BUSY;
+		}
 	}
 
 	/* Track current inflight packets for performance analysis */