[04/10] AOE: use rcu to find network device

Submitted by stephen hemminger on Nov. 10, 2009, 5:54 p.m.

Details

Message ID 20091110175647.409162953@vyatta.com
State Rejected
Delegated to: David Miller
Headers show

Commit Message

stephen hemminger Nov. 10, 2009, 5:54 p.m.
This gets rid of another use of read_lock(&dev_base_lock) by using
RCU. Also, it only increments the reference count of the device actually
used rather than holding and releasing every device

Compile tested only.

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>

Comments

Eric Dumazet Nov. 10, 2009, 6:23 p.m.
Stephen Hemminger a écrit :
> This gets rid of another use of read_lock(&dev_base_lock) by using
> RCU. Also, it only increments the reference count of the device actually
> used rather than holding and releasing every device
> 
> Compile tested only.
> 
> Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
> 
> 
Acked-by: Eric Dumazet <eric.dumazet@gmail.com>

--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Ed Cashin Nov. 10, 2009, 8:01 p.m.
On Tue Nov 10 13:07:37 EST 2009, shemminger@vyatta.com wrote:
> This gets rid of another use of read_lock(&dev_base_lock) by using
> RCU. Also, it only increments the reference count of the device actually
> used rather than holding and releasing every device
> 
> Compile tested only.

This function runs once a minute when the aoe driver is loaded,
if you'd like to test it a bit more.

It looks like there's no dev_put corresponding to the dev_hold
after the changes.
stephen hemminger Nov. 10, 2009, 11:06 p.m.
On Tue, 10 Nov 2009 15:01:49 -0500
Ed Cashin <ecashin@coraid.com> wrote:

> On Tue Nov 10 13:07:37 EST 2009, shemminger@vyatta.com wrote:
> > This gets rid of another use of read_lock(&dev_base_lock) by using
> > RCU. Also, it only increments the reference count of the device actually
> > used rather than holding and releasing every device
> > 
> > Compile tested only.
> 
> This function runs once a minute when the aoe driver is loaded,
> if you'd like to test it a bit more.
> 
> It looks like there's no dev_put corresponding to the dev_hold
> after the changes.
> 

Hmm, looks like AOE actually is not ref counting the network device.
So my patch is incorrect. 

As it stands (before my patch), it is UNSAFE. It can decide to queue
packets to a device that is removed out from underneath it causing
reference to freed memory.

Moving the rcu_read_lock up to aoecmd_cfg() would solve that but the
whole driver appears to be unsafe about device refcounting and handling
device removal properly.  

It needs to:

1. Get a device ref count when it remembers a device: (ie addif)
2. Install a notifier that looks for device removal events
3. In notifier, remove interface, including flushing all pending
   skb's for that device.

This obviously is beyond the scope of the RCU stuff.
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Ed Cashin Nov. 11, 2009, 2:22 p.m.
On Tue Nov 10 18:06:41 EST 2009, shemminger@vyatta.com wrote:
...
> Hmm, looks like AOE actually is not ref counting the network device.
> So my patch is incorrect. 
> 
> As it stands (before my patch), it is UNSAFE. It can decide to queue
> packets to a device that is removed out from underneath it causing
> reference to freed memory.

Yes, that's right.  I will look at the patch you sent in the follow
up.  Thanks.

Patch hide | download patch | download mbox

--- a/drivers/block/aoe/aoecmd.c	2009-11-09 22:19:06.082480836 -0800
+++ b/drivers/block/aoe/aoecmd.c	2009-11-10 09:28:38.222438732 -0800
@@ -296,17 +296,18 @@  aoecmd_cfg_pkts(ushort aoemajor, unsigne
 	struct sk_buff *skb;
 	struct net_device *ifp;
 
-	read_lock(&dev_base_lock);
-	for_each_netdev(&init_net, ifp) {
-		dev_hold(ifp);
+	rcu_read_lock();
+	for_each_netdev_rcu(&init_net, ifp) {
 		if (!is_aoe_netif(ifp))
-			goto cont;
+			continue;
 
 		skb = new_skb(sizeof *h + sizeof *ch);
 		if (skb == NULL) {
 			printk(KERN_INFO "aoe: skb alloc failure\n");
-			goto cont;
+			continue;
 		}
+
+		dev_hold(ifp);
 		skb_put(skb, sizeof *h + sizeof *ch);
 		skb->dev = ifp;
 		__skb_queue_tail(queue, skb);
@@ -320,11 +321,8 @@  aoecmd_cfg_pkts(ushort aoemajor, unsigne
 		h->major = cpu_to_be16(aoemajor);
 		h->minor = aoeminor;
 		h->cmd = AOECMD_CFG;
-
-cont:
-		dev_put(ifp);
 	}
-	read_unlock(&dev_base_lock);
+	rcu_read_unlock();
 }
 
 static void