Message ID | 20081001141432.470106980@theryb.frec.bull.fr |
---|---|
State | Superseded, archived |
Delegated to: | David Miller |
Headers | show |
On Wed, Oct 01, 2008 at 04:14:35PM +0200, Benjamin Thery wrote: > This is the second version of a patch aimed at fixing a deadlock that > can occur when unregistering net devices. > > This new version of the patch ensures the garbage collector > dst_gc_task() is run when waiting in netdev_wait_allrefs(), by calling > it in the netdevice notifier dst_dev_event() when receiving a > NETDEV_UNREGISTER event. > > Thanks to Jarek Poplawski for proposing this fix. Not at all! Actually, I'm sorry for this mess... (Read below.) > (The previous version proposed to replace in linkwatch_event() > the call to rntl_unlock() by __rtnl_lock()) ... > --- net-next-2.6.orig/net/core/dst.c > +++ net-next-2.6/net/core/dst.c > @@ -328,6 +328,10 @@ static int dst_dev_event(struct notifier > dst_ifdown(dst, dev, event != NETDEV_DOWN); > } > mutex_unlock(&dst_gc_mutex); > + > + if (event == NETDEV_UNREGISTER && > + cancel_delayed_work(&dst_gc_work)) > + dst_gc_task(&dst_gc_work.work); Hmm... It seems this shouldn't work yet: cancel_delayed_work() can only kill this while on timer, but not when queued and maybe blocked already. Probably cancel_delayed_work_sync() should be considered instead, but since this needs more checking, and David is waiting for this, I think it's safer to use this previous (__rtnl_unlock) patch for now (especially for -stable). Thanks, Jarek P. -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Jarek Poplawski wrote: > On Wed, Oct 01, 2008 at 04:14:35PM +0200, Benjamin Thery wrote: >> This is the second version of a patch aimed at fixing a deadlock that >> can occur when unregistering net devices. >> >> This new version of the patch ensures the garbage collector >> dst_gc_task() is run when waiting in netdev_wait_allrefs(), by calling >> it in the netdevice notifier dst_dev_event() when receiving a >> NETDEV_UNREGISTER event. >> >> Thanks to Jarek Poplawski for proposing this fix. > > Not at all! Actually, I'm sorry for this mess... (Read below.) > >> (The previous version proposed to replace in linkwatch_event() >> the call to rntl_unlock() by __rtnl_lock()) > ... >> --- net-next-2.6.orig/net/core/dst.c >> +++ net-next-2.6/net/core/dst.c >> @@ -328,6 +328,10 @@ static int dst_dev_event(struct notifier >> dst_ifdown(dst, dev, event != NETDEV_DOWN); >> } >> mutex_unlock(&dst_gc_mutex); >> + >> + if (event == NETDEV_UNREGISTER && >> + cancel_delayed_work(&dst_gc_work)) >> + dst_gc_task(&dst_gc_work.work); > > Hmm... It seems this shouldn't work yet: cancel_delayed_work() can only > kill this while on timer, but not when queued and maybe blocked already. Perhaps, I am misunderstanding but, dst_gc_work is always called with schedule_delayed_work. If the task is not running, we can cancel it and if the task is running, no need to trigger the gc, no ? > Probably cancel_delayed_work_sync() should be considered instead, but > since this needs more checking, and David is waiting for this, I think > it's safer to use this previous (__rtnl_unlock) patch for now > (especially for -stable). -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Wed, Oct 01, 2008 at 11:06:22PM +0200, Daniel Lezcano wrote: > Jarek Poplawski wrote: >> On Wed, Oct 01, 2008 at 04:14:35PM +0200, Benjamin Thery wrote: ... >>> --- net-next-2.6.orig/net/core/dst.c >>> +++ net-next-2.6/net/core/dst.c >>> @@ -328,6 +328,10 @@ static int dst_dev_event(struct notifier >>> dst_ifdown(dst, dev, event != NETDEV_DOWN); >>> } >>> mutex_unlock(&dst_gc_mutex); >>> + >>> + if (event == NETDEV_UNREGISTER && >>> + cancel_delayed_work(&dst_gc_work)) >>> + dst_gc_task(&dst_gc_work.work); >> >> Hmm... It seems this shouldn't work yet: cancel_delayed_work() can only >> kill this while on timer, but not when queued and maybe blocked already. > > Perhaps, I am misunderstanding but, dst_gc_work is always called with > schedule_delayed_work. If the task is not running, we can cancel it and > if the task is running, no need to trigger the gc, no ? Right, but cancel_delayed_work() can't cancel it reliably even if the task is not running (but e.g. queued already just after us - linkwatch). Of course, this should be OK most of the time, especially with such long schedule delays, but, on the other hand, not as sure as the first patch with __rtnl_unlock. So doing this right could take more time. Jarek P. -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Wed, Oct 01, 2008 at 11:06:22PM +0200, Daniel Lezcano wrote: > Jarek Poplawski wrote: >> On Wed, Oct 01, 2008 at 04:14:35PM +0200, Benjamin Thery wrote: ... >>> --- net-next-2.6.orig/net/core/dst.c >>> +++ net-next-2.6/net/core/dst.c >>> @@ -328,6 +328,10 @@ static int dst_dev_event(struct notifier >>> dst_ifdown(dst, dev, event != NETDEV_DOWN); >>> } >>> mutex_unlock(&dst_gc_mutex); >>> + >>> + if (event == NETDEV_UNREGISTER && >>> + cancel_delayed_work(&dst_gc_work)) >>> + dst_gc_task(&dst_gc_work.work); >> >> Hmm... It seems this shouldn't work yet: cancel_delayed_work() can only >> kill this while on timer, but not when queued and maybe blocked already. Hmm#2... Then maybe something like this?: if (event == NETDEV_UNREGISTER && (cancel_delayed_work(&dst_gc_work) || delayed_work_pending(&dst_gc_work))) dst_gc_task(&dst_gc_work.work); Jarek P. -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Jarek Poplawski wrote: > On Wed, Oct 01, 2008 at 11:06:22PM +0200, Daniel Lezcano wrote: >> Jarek Poplawski wrote: >>> On Wed, Oct 01, 2008 at 04:14:35PM +0200, Benjamin Thery wrote: > ... >>>> --- net-next-2.6.orig/net/core/dst.c >>>> +++ net-next-2.6/net/core/dst.c >>>> @@ -328,6 +328,10 @@ static int dst_dev_event(struct notifier >>>> dst_ifdown(dst, dev, event != NETDEV_DOWN); >>>> } >>>> mutex_unlock(&dst_gc_mutex); >>>> + >>>> + if (event == NETDEV_UNREGISTER && >>>> + cancel_delayed_work(&dst_gc_work)) >>>> + dst_gc_task(&dst_gc_work.work); >>> Hmm... It seems this shouldn't work yet: cancel_delayed_work() can only >>> kill this while on timer, but not when queued and maybe blocked already. You're right. > > Hmm#2... Then maybe something like this?: > > if (event == NETDEV_UNREGISTER && > (cancel_delayed_work(&dst_gc_work) || > delayed_work_pending(&dst_gc_work))) > dst_gc_task(&dst_gc_work.work); Hmmm... I'm not sure I understand what this change do? OK, I see this ensure we will run dst_gc_task() even if cancel_delayed_work() failed and the work is still pending (ie. the timer has expired and dst_gc_work is already in the queue). But what if the work was not pending at all at beginning? We still need to run dst_gc_task(). Is something like this better? (code expanded to be more readable) if (event == NETDEV_UNREGISTER) { if (!delayed_work_pending(&dst_gc_work)) /* work is not scheduled (no timer, not in queue) */ dst_gc_task(&dst_gc_work.work); else if (cancel_delayed_work(&dst_gc_work) || delayed_work_pending(&dst_gc_work))) /* work was scheduled (and may be blocked) */ dst_gc_task(&dst_gc_work.work); else /* dst_gc_task() is running, do nothing */ } Benjamin > > > Jarek P. > >
On Thu, Oct 02, 2008 at 05:23:30PM +0200, Benjamin Thery wrote: > Jarek Poplawski wrote: >> On Wed, Oct 01, 2008 at 11:06:22PM +0200, Daniel Lezcano wrote: >>> Jarek Poplawski wrote: >>>> On Wed, Oct 01, 2008 at 04:14:35PM +0200, Benjamin Thery wrote: >> ... >>>>> --- net-next-2.6.orig/net/core/dst.c >>>>> +++ net-next-2.6/net/core/dst.c >>>>> @@ -328,6 +328,10 @@ static int dst_dev_event(struct notifier >>>>> dst_ifdown(dst, dev, event != NETDEV_DOWN); >>>>> } >>>>> mutex_unlock(&dst_gc_mutex); >>>>> + >>>>> + if (event == NETDEV_UNREGISTER && >>>>> + cancel_delayed_work(&dst_gc_work)) >>>>> + dst_gc_task(&dst_gc_work.work); >>>> Hmm... It seems this shouldn't work yet: cancel_delayed_work() can only >>>> kill this while on timer, but not when queued and maybe blocked already. > > You're right. > >> >> Hmm#2... Then maybe something like this?: >> >> if (event == NETDEV_UNREGISTER && >> (cancel_delayed_work(&dst_gc_work) || >> delayed_work_pending(&dst_gc_work))) >> dst_gc_task(&dst_gc_work.work); > > Hmmm... I'm not sure I understand what this change do? > > OK, I see this ensure we will run dst_gc_task() even if > cancel_delayed_work() failed and the work is still pending (ie. the > timer has expired and dst_gc_work is already in the queue). I think, this covers exactly the case of blocking you described, plus more... (the work is queued but not blocked). > > But what if the work was not pending at all at beginning? > We still need to run dst_gc_task(). Maybe I miss something, but if this is really needed here then it looks like we are fixing more than "the blocking" bug BTW. > > Is something like this better? > (code expanded to be more readable) > > if (event == NETDEV_UNREGISTER) { > if (!delayed_work_pending(&dst_gc_work)) > /* work is not scheduled (no timer, not in queue) */ /* may be running too */ > dst_gc_task(&dst_gc_work.work); > else if (cancel_delayed_work(&dst_gc_work) || > delayed_work_pending(&dst_gc_work))) > /* work was scheduled (and may be blocked) */ /* actually could be both running and pending here: * if it's after rearming */ > dst_gc_task(&dst_gc_work.work); > else > /* dst_gc_task() is running, do nothing */ So again !delayed_work_pending() - there could be the change of state while checking - but then looks a bit inconsistent. I think this should be OK too. As a matter of fact I've thought about something even simpler, which probably should help for all above concerns too: if (event == NETDEV_UNREGISTER) dst_gc_task(&dst_gc_work.work); dst_gc_task() locking allows for this, and running this two times in a row could be even faster than trying to cancel the unnecessary run. Jarek P. -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Quoting Jarek Poplawski <jarkao2@gmail.com>: > On Thu, Oct 02, 2008 at 05:23:30PM +0200, Benjamin Thery wrote: >> Jarek Poplawski wrote: >>> On Wed, Oct 01, 2008 at 11:06:22PM +0200, Daniel Lezcano wrote: >>>> Jarek Poplawski wrote: >>>>> On Wed, Oct 01, 2008 at 04:14:35PM +0200, Benjamin Thery wrote: >>> ... >>>>>> --- net-next-2.6.orig/net/core/dst.c >>>>>> +++ net-next-2.6/net/core/dst.c >>>>>> @@ -328,6 +328,10 @@ static int dst_dev_event(struct notifier >>>>>> dst_ifdown(dst, dev, event != NETDEV_DOWN); >>>>>> } >>>>>> mutex_unlock(&dst_gc_mutex); >>>>>> + >>>>>> + if (event == NETDEV_UNREGISTER && >>>>>> + cancel_delayed_work(&dst_gc_work)) >>>>>> + dst_gc_task(&dst_gc_work.work); >>>>> Hmm... It seems this shouldn't work yet: cancel_delayed_work() can only >>>>> kill this while on timer, but not when queued and maybe blocked already. >> >> You're right. >> >>> >>> Hmm#2... Then maybe something like this?: >>> >>> if (event == NETDEV_UNREGISTER && >>> (cancel_delayed_work(&dst_gc_work) || >>> delayed_work_pending(&dst_gc_work))) >>> dst_gc_task(&dst_gc_work.work); >> >> Hmmm... I'm not sure I understand what this change do? >> >> OK, I see this ensure we will run dst_gc_task() even if >> cancel_delayed_work() failed and the work is still pending (ie. the >> timer has expired and dst_gc_work is already in the queue). > > I think, this covers exactly the case of blocking you described, plus > more... (the work is queued but not blocked). > >> >> But what if the work was not pending at all at beginning? >> We still need to run dst_gc_task(). > > Maybe I miss something, but if this is really needed here then it > looks like we are fixing more than "the blocking" bug BTW. > >> >> Is something like this better? >> (code expanded to be more readable) >> >> if (event == NETDEV_UNREGISTER) { >> if (!delayed_work_pending(&dst_gc_work)) >> /* work is not scheduled (no timer, not in queue) */ > > /* may be running too */ > >> dst_gc_task(&dst_gc_work.work); >> else if (cancel_delayed_work(&dst_gc_work) || >> delayed_work_pending(&dst_gc_work))) >> /* work was scheduled (and may be blocked) */ > > /* actually could be both running and pending here: > * if it's after rearming > */ > >> dst_gc_task(&dst_gc_work.work); >> else >> /* dst_gc_task() is running, do nothing */ > > So again !delayed_work_pending() - there could be the change of state > while checking - but then looks a bit inconsistent. I think this should > be OK too. > > As a matter of fact I've thought about something even simpler, which > probably should help for all above concerns too: > > if (event == NETDEV_UNREGISTER) > dst_gc_task(&dst_gc_work.work); > > dst_gc_task() locking allows for this, and running this two times in > a row could be even faster than trying to cancel the unnecessary run. I've thought a bit more about my last proposal and come to the same conclusion as you, hmm, almost. I thought we could call cancel_delayed_work() unconditionally and then dst_gc_task(). if (event == NETDEV_UNREGISTER) { cancel_delayed_work(&dst_gc_work); dst_gc_task(&dst_gc_work.work); } But, you're right, calling only dst_gc_task() seems fine to me. I'll run some tests tomorrow and send a new patch. Do we agree that this fix (calling dst_gc_task() in dst_dev_event()) is better/more appropriate than the first one (replacing rtnl_unlock() by the non-blocking __rtnl_unlock() in linkwatch_event())? Thanks, Benjamin > > Jarek P. > > ---------------------------------------------------------------- This message was sent using IMP, the Internet Messaging Program. -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Thu, Oct 02, 2008 at 09:55:02PM +0200, Benjamin Thery wrote: > Quoting Jarek Poplawski <jarkao2@gmail.com>: ... >> As a matter of fact I've thought about something even simpler, which >> probably should help for all above concerns too: >> >> if (event == NETDEV_UNREGISTER) >> dst_gc_task(&dst_gc_work.work); >> >> dst_gc_task() locking allows for this, and running this two times in >> a row could be even faster than trying to cancel the unnecessary run. > > I've thought a bit more about my last proposal and come to the same > conclusion as you, hmm, almost. I thought we could call > cancel_delayed_work() unconditionally and then dst_gc_task(). > > if (event == NETDEV_UNREGISTER) { > cancel_delayed_work(&dst_gc_work); > dst_gc_task(&dst_gc_work.work); > } > > But, you're right, calling only dst_gc_task() seems fine to me. I'm OK with any of these versions. > > I'll run some tests tomorrow and send a new patch. > > Do we agree that this fix (calling dst_gc_task() in dst_dev_event()) > is better/more appropriate than the first one (replacing rtnl_unlock() > by the non-blocking __rtnl_unlock() in linkwatch_event())? I agree it's more appropriate, because: a) the job is done according to the rules (in the comment) during the notification and not in some future, b) it removes some hidden dependencies between processes/works, which, even if they currently don't exist except this one in the linkwatch, could be extremly hard to diagnose, if added accidentally in the future. On the other hand, until we are not sure of the reasons, why something like this (the full destroying) was avoided in the past, and until it's heavily tested (with lockdep), your first patch looks to me much safer if applying to any -stable is considered. Thanks, Jarek P. -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Thu, Oct 02, 2008 at 10:34:35PM +0200, Jarek Poplawski wrote: > On Thu, Oct 02, 2008 at 09:55:02PM +0200, Benjamin Thery wrote: > > Quoting Jarek Poplawski <jarkao2@gmail.com>: > ... > >> As a matter of fact I've thought about something even simpler, which > >> probably should help for all above concerns too: > >> > >> if (event == NETDEV_UNREGISTER) > >> dst_gc_task(&dst_gc_work.work); > >> > >> dst_gc_task() locking allows for this, and running this two times in > >> a row could be even faster than trying to cancel the unnecessary run. > > > > I've thought a bit more about my last proposal and come to the same > > conclusion as you, hmm, almost. I thought we could call > > cancel_delayed_work() unconditionally and then dst_gc_task(). > > > > if (event == NETDEV_UNREGISTER) { > > cancel_delayed_work(&dst_gc_work); > > dst_gc_task(&dst_gc_work.work); > > } > > > > But, you're right, calling only dst_gc_task() seems fine to me. > > I'm OK with any of these versions. > > > > > I'll run some tests tomorrow and send a new patch. > > > > Do we agree that this fix (calling dst_gc_task() in dst_dev_event()) > > is better/more appropriate than the first one (replacing rtnl_unlock() > > by the non-blocking __rtnl_unlock() in linkwatch_event())? Hmm... You can kill me, but after more looking at this I've changed my mind. > I agree it's more appropriate, because: > a) the job is done according to the rules (in the comment) during > the notification and not in some future, But, since the rules are wrong or protocols buggy, the solution isn't effective: there should be no deadlock after this, but it's also hard to predict how many dst_gc_task() runs are needed to free all refs. During this all other works would be unnecessarily blocked by the linkwatch, what looks a bit too hoggish. My current idea is we should move linkwatch or dst_gc_task() (or both) to it's own workqueue. > b) it removes some hidden dependencies between processes/works, which, > even if they currently don't exist except this one in the linkwatch, > could be extremly hard to diagnose, if added accidentally in the > future. > > On the other hand, until we are not sure of the reasons, why something BTW, sorry, of course: "On the other hand, until we are sure of the reasons, why something" > like this (the full destroying) was avoided in the past, and until it's > heavily tested (with lockdep), your first patch looks to me much safer > if applying to any -stable is considered. And this first patch looks better and better each time... Jarek P. -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Sat, Oct 04, 2008 at 09:42:48AM +0200, Jarek Poplawski wrote: ... > During this all other works would be unnecessarily blocked by the > linkwatch, what looks a bit too hoggish. Or even more: can't some of these blockeds task hold our refs too? Jarek P. -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Index: net-next-2.6/net/core/dst.c =================================================================== --- net-next-2.6.orig/net/core/dst.c +++ net-next-2.6/net/core/dst.c @@ -328,6 +328,10 @@ static int dst_dev_event(struct notifier dst_ifdown(dst, dev, event != NETDEV_DOWN); } mutex_unlock(&dst_gc_mutex); + + if (event == NETDEV_UNREGISTER && + cancel_delayed_work(&dst_gc_work)) + dst_gc_task(&dst_gc_work.work); break; } return NOTIFY_DONE;