diff mbox

[1/2] ipv6: do not hold route table lock when send ndisc probe

Message ID 1345520643.12468.6.camel@cr0
State Changes Requested, archived
Delegated to: David Miller
Headers show

Commit Message

Amerigo Wang Aug. 21, 2012, 3:44 a.m. UTC
Hi, Debabrata,

Could you help to test the attached patch below?

Thanks!

Comments

Debabrata Banerjee Aug. 22, 2012, 4:04 p.m. UTC | #1
Thanks for the patch. We're discussing how to reach to this code properly
in the lab now. Although we will probably have to modify so it's compliant
with the RFC, by checking if a ndisc_send_ns() has already been queued
within rt->rt6i_idev->cnf.rtr_probe_interval, otherwise it could flood the
network with neighbor discoveries.

-Debabrata

On 8/20/12 11:44 PM, "Cong Wang" <amwang@redhat.com> wrote:

>Hi, Debabrata,
>
>Could you help to test the attached patch below?
>
>Thanks!
>

--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Amerigo Wang Aug. 23, 2012, 9:11 a.m. UTC | #2
On Wed, 2012-08-22 at 12:04 -0400, Banerjee, Debabrata wrote:
> Thanks for the patch. We're discussing how to reach to this code properly
> in the lab now. Although we will probably have to modify so it's compliant
> with the RFC, by checking if a ndisc_send_ns() has already been queued
> within rt->rt6i_idev->cnf.rtr_probe_interval, otherwise it could flood the
> network with neighbor discoveries.
> 

Thanks for testing, I can't reproduce the deadlock you reported here.

Note that the original code didn't check rtr_probe_interval either.


--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Eric Dumazet Aug. 23, 2012, 4:10 p.m. UTC | #3
On Tue, 2012-08-21 at 11:44 +0800, Cong Wang wrote:
> Hi, Debabrata,
> 
> Could you help to test the attached patch below?
> 
> Thanks!
> 

Hard to comment on your patch since its not inlined.

+               nw = kmalloc(sizeof(*nw), GFP_ATOMIC);
+               if (nw) {
+                       memcpy(&nw->target, &neigh->primary_key, sizeof(struct in6_addr));
+                       addrconf_addr_solict_mult(&nw->target, &nw->mcaddr);
+                       nw->dev = rt->dst.dev;
+                       INIT_WORK(&nw->work, queue_ndisc);
+                       schedule_work(&nw->work);
+               }

You cant do that without taking extra reference on dev,
and release it in queue_ndisc()

This also will add interesting side effects at device dismantle.



--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Amerigo Wang Aug. 24, 2012, 9:15 a.m. UTC | #4
On Thu, 2012-08-23 at 18:10 +0200, Eric Dumazet wrote:
> On Tue, 2012-08-21 at 11:44 +0800, Cong Wang wrote:
> > Hi, Debabrata,
> > 
> > Could you help to test the attached patch below?
> > 
> > Thanks!
> > 
> 
> Hard to comment on your patch since its not inlined.
> 
> +               nw = kmalloc(sizeof(*nw), GFP_ATOMIC);
> +               if (nw) {
> +                       memcpy(&nw->target, &neigh->primary_key, sizeof(struct in6_addr));
> +                       addrconf_addr_solict_mult(&nw->target, &nw->mcaddr);
> +                       nw->dev = rt->dst.dev;
> +                       INIT_WORK(&nw->work, queue_ndisc);
> +                       schedule_work(&nw->work);
> +               }
> 
> You cant do that without taking extra reference on dev,
> and release it in queue_ndisc()
> 
> This also will add interesting side effects at device dismantle.
> 

Right... If we call dev_hold() in the work, it is possible that the work
is still not scheduled to running when we unregister the device. What's
more, we can't flush work here as we are holding a read lock.

Hmm.


--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Eric Dumazet Aug. 24, 2012, 9:44 a.m. UTC | #5
On Fri, 2012-08-24 at 17:15 +0800, Cong Wang wrote:

> 
> Right... If we call dev_hold() in the work, it is possible that the work
> is still not scheduled to running when we unregister the device. What's
> more, we can't flush work here as we are holding a read lock.

You need a global list (and a single work, not one per req), so that a
notifier can flush it at demand.


--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Amerigo Wang Aug. 24, 2012, 10:45 a.m. UTC | #6
On Fri, 2012-08-24 at 11:44 +0200, Eric Dumazet wrote:
> On Fri, 2012-08-24 at 17:15 +0800, Cong Wang wrote:
> 
> > 
> > Right... If we call dev_hold() in the work, it is possible that the work
> > is still not scheduled to running when we unregister the device. What's
> > more, we can't flush work here as we are holding a read lock.
> 
> You need a global list (and a single work, not one per req), so that a
> notifier can flush it at demand.
> 
> 

Agreed. I will make a new patch.

Thanks!

--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
diff mbox

Patch

diff --git a/net/ipv6/route.c b/net/ipv6/route.c
index eb3f1e4..cd141a5 100644
--- a/net/ipv6/route.c
+++ b/net/ipv6/route.c
@@ -440,9 +440,26 @@  out:
 }
 
 #ifdef CONFIG_IPV6_ROUTER_PREF
+struct ndisc_work {
+	struct work_struct work;
+	struct in6_addr mcaddr;
+	struct in6_addr target;
+	struct net_device *dev;
+};
+
+static void queue_ndisc(struct work_struct *work)
+{
+	struct ndisc_work *nw =
+		container_of(work, struct ndisc_work, work);
+	ndisc_send_ns(nw->dev, NULL, &nw->target, &nw->mcaddr, NULL);
+	kfree(nw);
+}
+
 static void rt6_probe(struct rt6_info *rt)
 {
 	struct neighbour *neigh;
+	struct ndisc_work *nw;
+
 	/*
 	 * Okay, this does not seem to be appropriate
 	 * for now, however, we need to check if it
@@ -457,15 +474,18 @@  static void rt6_probe(struct rt6_info *rt)
 	read_lock_bh(&neigh->lock);
 	if (!(neigh->nud_state & NUD_VALID) &&
 	    time_after(jiffies, neigh->updated + rt->rt6i_idev->cnf.rtr_probe_interval)) {
-		struct in6_addr mcaddr;
-		struct in6_addr *target;
 
 		neigh->updated = jiffies;
 		read_unlock_bh(&neigh->lock);
 
-		target = (struct in6_addr *)&neigh->primary_key;
-		addrconf_addr_solict_mult(target, &mcaddr);
-		ndisc_send_ns(rt->dst.dev, NULL, target, &mcaddr, NULL);
+		nw = kmalloc(sizeof(*nw), GFP_ATOMIC);
+		if (nw) {
+			memcpy(&nw->target, &neigh->primary_key, sizeof(struct in6_addr));
+			addrconf_addr_solict_mult(&nw->target, &nw->mcaddr);
+			nw->dev = rt->dst.dev;
+			INIT_WORK(&nw->work, queue_ndisc);
+			schedule_work(&nw->work);
+		}
 	} else {
 		read_unlock_bh(&neigh->lock);
 	}