[net] net: fix race on decreasing number of TX queues

Message ID 20180213053531.13080-1-jakub.kicinski@netronome.com
State Accepted
Delegated to: David Miller

Commit Message

Jakub Kicinski Feb. 13, 2018, 5:35 a.m.
netif_set_real_num_tx_queues() can be called while the netdev is up.
That usually happens when the user requests a change in the number
of channels/rings with ethtool -L.  The procedure for changing
the number of queues involves resetting the qdiscs and setting
dev->real_num_tx_queues to the new value.  When the new value is
lower than the old one, extra care has to be taken to order
accesses to the number of queues against the qdisc reset.

Currently the queues are reset before the new
dev->real_num_tx_queues is assigned, leaving a window in which
packets can be enqueued onto the queues going down.  That is
likely to crash the driver, since most drivers don't check
whether TX skbs are assigned to an active queue.

Fix this by assigning the new value first and calling
synchronize_net() before resetting the queues that are going away.

Fixes: e6484930d7c7 ("net: allocate tx queues in register_netdevice")
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
---
Also reported: http://lists.openwall.net/netdev/2017/04/26/211,
GSO just made it more likely.
---
 net/core/dev.c | 11 +++++++++--
 1 file changed, 9 insertions(+), 2 deletions(-)

Comments

David Miller Feb. 14, 2018, 7:16 p.m. | #1
From: Jakub Kicinski <jakub.kicinski@netronome.com>
Date: Mon, 12 Feb 2018 21:35:31 -0800

Looks good, applied and queued up for -stable, thanks!

Patch

diff --git a/net/core/dev.c b/net/core/dev.c
index dda9d7b9a840..d4362befe7e2 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -2382,8 +2382,11 @@  EXPORT_SYMBOL(netdev_set_num_tc);
  */
 int netif_set_real_num_tx_queues(struct net_device *dev, unsigned int txq)
 {
+	bool disabling;
 	int rc;
 
+	disabling = txq < dev->real_num_tx_queues;
+
 	if (txq < 1 || txq > dev->num_tx_queues)
 		return -EINVAL;
 
@@ -2399,15 +2402,19 @@  int netif_set_real_num_tx_queues(struct net_device *dev, unsigned int txq)
 		if (dev->num_tc)
 			netif_setup_tc(dev, txq);
 
-		if (txq < dev->real_num_tx_queues) {
+		dev->real_num_tx_queues = txq;
+
+		if (disabling) {
+			synchronize_net();
 			qdisc_reset_all_tx_gt(dev, txq);
 #ifdef CONFIG_XPS
 			netif_reset_xps_queues_gt(dev, txq);
 #endif
 		}
+	} else {
+		dev->real_num_tx_queues = txq;
 	}
 
-	dev->real_num_tx_queues = txq;
 	return 0;
 }
 EXPORT_SYMBOL(netif_set_real_num_tx_queues);
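
The ordering change in the patch can be illustrated with a small userspace model. This is only a sketch under stated assumptions, not kernel code: `struct toy_dev`, `toy_xmit()`, `toy_reset_gt()`, `shrink_old()`, `shrink_new()` and `stale_packets()` are hypothetical names, and the `racing_hash` argument stands in for a sender that happens to run in the window between the two steps. The model is single-threaded, so the wait that synchronize_net() provides in the real function appears only as a comment.

```c
#include <assert.h>

#define MAX_TXQ 8

/* Hypothetical toy device: a published queue count plus per-queue backlog. */
struct toy_dev {
	unsigned int real_num_tx_queues;
	unsigned int backlog[MAX_TXQ];
};

/* Transmit path: pick a queue from whatever count is currently published. */
static unsigned int toy_xmit(struct toy_dev *dev, unsigned int hash)
{
	unsigned int q;

	assert(dev->real_num_tx_queues > 0);
	q = hash % dev->real_num_tx_queues;
	dev->backlog[q]++;
	return q;
}

/* Drop everything queued on queues with index >= txq. */
static void toy_reset_gt(struct toy_dev *dev, unsigned int txq)
{
	for (unsigned int i = txq; i < MAX_TXQ; i++)
		dev->backlog[i] = 0;
}

/* Old order: reset first, publish the new count second. */
static void shrink_old(struct toy_dev *dev, unsigned int txq,
		       unsigned int racing_hash)
{
	toy_reset_gt(dev, txq);
	toy_xmit(dev, racing_hash);	/* sender still sees the old count */
	dev->real_num_tx_queues = txq;
}

/* New order: publish the new count first, reset second. */
static void shrink_new(struct toy_dev *dev, unsigned int txq,
		       unsigned int racing_hash)
{
	dev->real_num_tx_queues = txq;
	/*
	 * The patch calls synchronize_net() at this point so that senders
	 * which sampled the old count have finished before the reset.
	 * The single-threaded model has no such senders.
	 */
	toy_xmit(dev, racing_hash);	/* sender already sees the new count */
	toy_reset_gt(dev, txq);
}

/* Packets left on queues that are no longer active. */
static unsigned int stale_packets(const struct toy_dev *dev)
{
	unsigned int n = 0;

	for (unsigned int i = dev->real_num_tx_queues; i < MAX_TXQ; i++)
		n += dev->backlog[i];
	return n;
}
```

Starting from four queues and shrinking to two with a racing hash of 3, `shrink_old()` leaves one packet on the disabled queue 3, whereas `shrink_new()` steers the same packet to queue 1 and leaves nothing on the disabled queues.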