Patchwork [net-next,v4] netpoll: fix a rtnl lock assertion failure

Submitter Amerigo Wang
Date Jan. 15, 2013, 9:34 a.m.
Message ID <1358242446-4273-1-git-send-email-amwang@redhat.com>
Permalink /patch/212057/
State Accepted
Delegated to: David Miller

Comments

Amerigo Wang - Jan. 15, 2013, 9:34 a.m.
From: Cong Wang <amwang@redhat.com>

v4: hold rtnl lock for the whole netpoll_setup()
v3: remove the comment
v2: use RCU read lock

This patch fixes the following warning:

[   72.013864] RTNL: assertion failed at net/core/dev.c (4955)
[   72.017758] Pid: 668, comm: netpoll-prep-v6 Not tainted 3.8.0-rc1+ #474
[   72.019582] Call Trace:
[   72.020295]  [<ffffffff8176653d>] netdev_master_upper_dev_get+0x35/0x58
[   72.022545]  [<ffffffff81784edd>] netpoll_setup+0x61/0x340
[   72.024846]  [<ffffffff815d837e>] store_enabled+0x82/0xc3
[   72.027466]  [<ffffffff815d7e51>] netconsole_target_attr_store+0x35/0x37
[   72.029348]  [<ffffffff811c3479>] configfs_write_file+0xe2/0x10c
[   72.030959]  [<ffffffff8115d239>] vfs_write+0xaf/0xf6
[   72.032359]  [<ffffffff81978a05>] ? sysret_check+0x22/0x5d
[   72.033824]  [<ffffffff8115d453>] sys_write+0x5c/0x84
[   72.035328]  [<ffffffff819789d9>] system_call_fastpath+0x16/0x1b

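netdev_master_upper_dev_get() asserts that the rtnl lock is held, but
netpoll_setup() calls it without taking the lock. To guard against other
races as well, hold the rtnl lock for the entire netpoll_setup() function.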

Cc: Eric Dumazet <eric.dumazet@gmail.com>
Cc: Jiri Pirko <jiri@resnulli.us>
Cc: David S. Miller <davem@davemloft.net>
Signed-off-by: Cong Wang <amwang@redhat.com>
---
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
David Miller - Jan. 16, 2013, 8:27 p.m.
From: Cong Wang <amwang@redhat.com>
Date: Tue, 15 Jan 2013 17:34:06 +0800


Applied.
Eric Dumazet - Jan. 17, 2013, 1:24 a.m.
On Tue, 2013-01-15 at 17:34 +0800, Cong Wang wrote:
> ---
> diff --git a/net/core/netpoll.c b/net/core/netpoll.c

...

>  	if (np->dev_name)
> -		ndev = dev_get_by_name(&init_net, np->dev_name);
> +		ndev = __dev_get_by_name(&init_net, np->dev_name);

This change brings interesting bugs.

All the "goto put;" are basically wrong, and the section waiting for the
carrier and releasing/getting rtnl is buggy.

Please revert this part.
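
The refcount problem described above can be modeled in userspace. This is only a rough sketch under stated assumptions: every `mock_*` name is hypothetical and merely stands in for the kernel helper named in its comment, and the "device" is reduced to a name and a counter. It is not the kernel implementation.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for struct net_device: a name and a refcount only. */
struct mock_dev {
	const char *name;
	int refcnt;
};

/* Models dev_hold(): take one reference. */
static void mock_dev_hold(struct mock_dev *dev)
{
	dev->refcnt++;
}

/* Models dev_put(): drop one reference. */
static void mock_dev_put(struct mock_dev *dev)
{
	dev->refcnt--;
}

/* Models __dev_get_by_name(): plain lookup, no reference taken. */
static struct mock_dev *mock_lookup_noref(struct mock_dev *dev, const char *name)
{
	return strcmp(dev->name, name) == 0 ? dev : NULL;
}

/* Models dev_get_by_name(): lookup that also takes a reference. */
static struct mock_dev *mock_lookup_ref(struct mock_dev *dev, const char *name)
{
	struct mock_dev *found = mock_lookup_noref(dev, name);

	if (found)
		mock_dev_hold(found);
	return found;
}

/*
 * Models a setup attempt that fails after the lookup and leaves through
 * a "put" label. Returns the net change in the device refcount.
 */
static int mock_failed_setup(struct mock_dev *dev, int use_ref_lookup)
{
	int before = dev->refcnt;
	struct mock_dev *ndev;

	ndev = use_ref_lookup ? mock_lookup_ref(dev, dev->name)
			      : mock_lookup_noref(dev, dev->name);
	if (ndev)
		mock_dev_put(ndev);	/* what "goto put" ends up doing */

	return dev->refcnt - before;
}
```

With the reference-taking lookup the failed attempt leaves the count unchanged. With the plain lookup the same error path drops a reference that was never taken, which is the imbalance behind the "goto put" objection.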


David Miller - Jan. 17, 2013, 3:52 a.m.
From: Eric Dumazet <eric.dumazet@gmail.com>
Date: Wed, 16 Jan 2013 17:24:45 -0800

>>  	if (np->dev_name)
>> -		ndev = dev_get_by_name(&init_net, np->dev_name);
>> +		ndev = __dev_get_by_name(&init_net, np->dev_name);
 ...
> Please revert this part.

You mean just revert that hunk above that made it use the
non-refcounting version of dev_get_by_name()?
Amerigo Wang - Jan. 17, 2013, 4 a.m.
On Wed, 2013-01-16 at 22:52 -0500, David Miller wrote:
> From: Eric Dumazet <eric.dumazet@gmail.com>
> Date: Wed, 16 Jan 2013 17:24:45 -0800
> 
> >>  	if (np->dev_name)
> >> -		ndev = dev_get_by_name(&init_net, np->dev_name);
> >> +		ndev = __dev_get_by_name(&init_net, np->dev_name);
>  ...
> > Please revert this part.
> 
> You mean just revert that hunk above that made it use the
> non-refcounting version of dev_get_by_name()?

But there is no reason to take both the rtnl lock and the RCU read lock,
although doing so would be harmless.

I think just adding dev_hold() is enough.
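
As a rough illustration of that suggestion, the control flow could be modeled in userspace as follows. This is a sketch under stated assumptions, not the actual follow-up patch: the `mock_*` names, the `has_addr` field, and the single-flag "lock" are all hypothetical simplifications of netpoll_setup() and the rtnl mutex.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for struct net_device. */
struct mock_dev {
	int refcnt;	/* models the device reference count */
	int has_addr;	/* models "a local IP address is configured" */
};

/* Models the rtnl mutex as a flag: 1 while held, 0 otherwise. */
static int mock_rtnl_held;

static void mock_rtnl_lock(void)
{
	assert(mock_rtnl_held == 0);
	mock_rtnl_held = 1;
}

static void mock_rtnl_unlock(void)
{
	assert(mock_rtnl_held == 1);
	mock_rtnl_held = 0;
}

/*
 * Models netpoll_setup() with the lock held throughout and an explicit
 * hold right after the non-refcounting lookup. A NULL dev models a
 * failed lookup.
 */
static int mock_netpoll_setup(struct mock_dev *dev)
{
	int err;

	mock_rtnl_lock();
	if (!dev) {
		err = -ENODEV;
		goto unlock;		/* nothing held yet, so skip "put" */
	}
	dev->refcnt++;			/* the explicit dev_hold() step */

	if (!dev->has_addr) {
		err = -EDESTADDRREQ;
		goto put;		/* balanced by the hold above */
	}

	mock_rtnl_unlock();
	return 0;			/* reference is kept on success */

put:
	dev->refcnt--;
unlock:
	mock_rtnl_unlock();
	return err;
}
```

In this model every path through "put" is preceded by the hold, so error paths leave the count where it started, and the success path keeps one reference for the caller's later cleanup.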


Patch

diff --git a/net/core/netpoll.c b/net/core/netpoll.c
index 9f05067..a5ad1c1 100644
--- a/net/core/netpoll.c
+++ b/net/core/netpoll.c
@@ -1048,11 +1048,13 @@  int netpoll_setup(struct netpoll *np)
 	struct in_device *in_dev;
 	int err;
 
+	rtnl_lock();
 	if (np->dev_name)
-		ndev = dev_get_by_name(&init_net, np->dev_name);
+		ndev = __dev_get_by_name(&init_net, np->dev_name);
 	if (!ndev) {
 		np_err(np, "%s doesn't exist, aborting\n", np->dev_name);
-		return -ENODEV;
+		err = -ENODEV;
+		goto unlock;
 	}
 
 	if (netdev_master_upper_dev_get(ndev)) {
@@ -1066,15 +1068,14 @@  int netpoll_setup(struct netpoll *np)
 
 		np_info(np, "device %s not up yet, forcing it\n", np->dev_name);
 
-		rtnl_lock();
 		err = dev_open(ndev);
-		rtnl_unlock();
 
 		if (err) {
 			np_err(np, "failed to open %s\n", ndev->name);
 			goto put;
 		}
 
+		rtnl_unlock();
 		atleast = jiffies + HZ/10;
 		atmost = jiffies + carrier_timeout * HZ;
 		while (!netif_carrier_ok(ndev)) {
@@ -1094,16 +1095,14 @@  int netpoll_setup(struct netpoll *np)
 			np_notice(np, "carrier detect appears untrustworthy, waiting 4 seconds\n");
 			msleep(4000);
 		}
+		rtnl_lock();
 	}
 
 	if (!np->local_ip.ip) {
 		if (!np->ipv6) {
-			rcu_read_lock();
-			in_dev = __in_dev_get_rcu(ndev);
-
+			in_dev = __in_dev_get_rtnl(ndev);
 
 			if (!in_dev || !in_dev->ifa_list) {
-				rcu_read_unlock();
 				np_err(np, "no IP address for %s, aborting\n",
 				       np->dev_name);
 				err = -EDESTADDRREQ;
@@ -1111,14 +1110,12 @@  int netpoll_setup(struct netpoll *np)
 			}
 
 			np->local_ip.ip = in_dev->ifa_list->ifa_local;
-			rcu_read_unlock();
 			np_info(np, "local IP %pI4\n", &np->local_ip.ip);
 		} else {
 #if IS_ENABLED(CONFIG_IPV6)
 			struct inet6_dev *idev;
 
 			err = -EDESTADDRREQ;
-			rcu_read_lock();
 			idev = __in6_dev_get(ndev);
 			if (idev) {
 				struct inet6_ifaddr *ifp;
@@ -1133,7 +1130,6 @@  int netpoll_setup(struct netpoll *np)
 				}
 				read_unlock_bh(&idev->lock);
 			}
-			rcu_read_unlock();
 			if (err) {
 				np_err(np, "no IPv6 address for %s, aborting\n",
 				       np->dev_name);
@@ -1151,17 +1147,17 @@  int netpoll_setup(struct netpoll *np)
 	/* fill up the skb queue */
 	refill_skbs();
 
-	rtnl_lock();
 	err = __netpoll_setup(np, ndev, GFP_KERNEL);
-	rtnl_unlock();
-
 	if (err)
 		goto put;
 
+	rtnl_unlock();
 	return 0;
 
 put:
 	dev_put(ndev);
+unlock:
+	rtnl_unlock();
 	return err;
 }
 EXPORT_SYMBOL(netpoll_setup);