Message ID | 1439831928.32680.11.camel@edumazet-glaptop2.roam.corp.google.com |
---|---|
State | Not Applicable, archived |
Delegated to: | David Miller |
On 2015-08-17 19:18, Eric Dumazet wrote:
> From: Eric Dumazet <edumazet@google.com>
>
> On Mon, 2015-08-17 at 16:25 +0200, Sander Eikelenboom wrote:
>> Monday, August 17, 2015, 4:21:47 PM, you wrote:
>>
>> > On Mon, 2015-08-17 at 09:02 -0500, Jon Christopherson wrote:
>> >> This is very similar to the behavior I am seeing in this bug:
>> >>
>> >> https://bugzilla.kernel.org/show_bug.cgi?id=102911
>>
>> > OK, but have you applied the fix ?
>>
>> > http://git.kernel.org/cgit/linux/kernel/git/davem/net.git/commit/?id=83fccfc3940c4a2db90fd7e7079f5b465cd8c6af
>>
>> > It will be part of net iteration from David Miller to Linus Torvald.
>>
>>
>> I did have that patch in for my last report.
>> But i don't think he had (looking at the second part of his oops).
>>
>
> Then can you try following fix as well ?
>
> Thanks !

Running now :)

>
> [PATCH] timer: fix a race in __mod_timer()
>
> lock_timer_base() can not catch following :
>
> CPU1 ( in __mod_timer()
> timer->flags |= TIMER_MIGRATING;
> spin_unlock(&base->lock);
> base = new_base;
> spin_lock(&base->lock);
> timer->flags &= ~TIMER_BASEMASK;
>                                   CPU2 (in lock_timer_base())
>                                   see timer base is cpu0 base
>                                   spin_lock_irqsave(&base->lock, *flags);
>                                   if (timer->flags == tf)
>                                        return base; // oops, wrong base
> timer->flags |= base->cpu // too late
>
> We must write timer->flags in one go, otherwise we can fool other cpus.
>
> Fixes: bc7a34b8b9eb ("timer: Reduce timer migration overhead if disabled")
> Signed-off-by: Eric Dumazet <edumazet@google.com>
> Cc: Thomas Gleixner <tglx@linutronix.de>
> ---
>  kernel/time/timer.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/kernel/time/timer.c b/kernel/time/timer.c
> index 5e097fa9faf7..84190f02b521 100644
> --- a/kernel/time/timer.c
> +++ b/kernel/time/timer.c
> @@ -807,8 +807,8 @@ __mod_timer(struct timer_list *timer, unsigned long expires,
>  			spin_unlock(&base->lock);
>  			base = new_base;
>  			spin_lock(&base->lock);
> -			timer->flags &= ~TIMER_BASEMASK;
> -			timer->flags |= base->cpu;
> +			WRITE_ONCE(timer->flags,
> +				   (timer->flags & ~TIMER_BASEMASK) | base->cpu);
>  		}
>  	}

--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
On Mon, 17 Aug 2015, Eric Dumazet wrote:
> [PATCH] timer: fix a race in __mod_timer()
>
> lock_timer_base() can not catch following :
>
> CPU1 ( in __mod_timer()
> timer->flags |= TIMER_MIGRATING;
> spin_unlock(&base->lock);
> base = new_base;
> spin_lock(&base->lock);
> timer->flags &= ~TIMER_BASEMASK;
>                                   CPU2 (in lock_timer_base())
>                                   see timer base is cpu0 base
>                                   spin_lock_irqsave(&base->lock, *flags);
>                                   if (timer->flags == tf)
>                                        return base; // oops, wrong base
> timer->flags |= base->cpu // too late
>
> We must write timer->flags in one go, otherwise we can fool other cpus.
>
> Fixes: bc7a34b8b9eb ("timer: Reduce timer migration overhead if disabled")
> Signed-off-by: Eric Dumazet <edumazet@google.com>
> Cc: Thomas Gleixner <tglx@linutronix.de>
> ---
>  kernel/time/timer.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/kernel/time/timer.c b/kernel/time/timer.c
> index 5e097fa9faf7..84190f02b521 100644
> --- a/kernel/time/timer.c
> +++ b/kernel/time/timer.c
> @@ -807,8 +807,8 @@ __mod_timer(struct timer_list *timer, unsigned long expires,
>  			spin_unlock(&base->lock);
>  			base = new_base;
>  			spin_lock(&base->lock);
> -			timer->flags &= ~TIMER_BASEMASK;
> -			timer->flags |= base->cpu;
> +			WRITE_ONCE(timer->flags,
> +				   (timer->flags & ~TIMER_BASEMASK) | base->cpu);

Duh, yes. Picking it up for timers/urgent.

Thanks for spotting it.

	tglx
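The interleaving in the changelog can be replayed in userspace to make the intermediate state visible. The following is only a single-threaded sketch, not kernel code: the flag values are assumed to match the 4.2-era layout, and `observed_cpu()`, `midpoint_two_step()` and `update_one_store()` are hypothetical helpers that do not exist in kernel/time/timer.c. The lock acquisition and the `timer->flags == tf` re-check in lock_timer_base() are left out; the sketch only shows which value a reader could sample.

```c
#include <assert.h>

/* Assumed flag layout (illustrative; check include/linux/timer.h for the real one). */
#define TIMER_CPUMASK   0x0007FFFFu
#define TIMER_MIGRATING 0x00080000u
#define TIMER_BASEMASK  (TIMER_CPUMASK | TIMER_MIGRATING)

/*
 * Hypothetical helper: what a lock_timer_base()-style reader concludes
 * from one snapshot of the flags word. Returns -1 for "migrating, retry",
 * otherwise the CPU whose base it would go and lock.
 */
static int observed_cpu(unsigned int flags)
{
	if (flags & TIMER_MIGRATING)
		return -1;
	return (int)(flags & TIMER_CPUMASK);
}

/*
 * Pre-patch update: two separate stores. Returns what a reader would
 * conclude if it sampled the flags between the two stores.
 */
static int midpoint_two_step(unsigned int *flags, unsigned int new_cpu)
{
	int seen;

	assert((new_cpu & ~TIMER_CPUMASK) == 0);
	*flags &= ~TIMER_BASEMASK;   /* CPU field is 0 and MIGRATING is gone */
	seen = observed_cpu(*flags); /* CPU2 samples here: looks like cpu0 */
	*flags |= new_cpu;           /* too late for that reader */
	return seen;
}

/* Patched update: the new value is computed first and stored once. */
static void update_one_store(unsigned int *flags, unsigned int new_cpu)
{
	assert((new_cpu & ~TIMER_CPUMASK) == 0);
	*flags = (*flags & ~TIMER_BASEMASK) | new_cpu;
}
```

Starting from `2u | TIMER_MIGRATING` and moving to CPU 3, `midpoint_two_step()` reports that the reader saw CPU 0, a base the timer never belonged to. With `update_one_store()` the only values a reader can sample are the migrating one and the final one.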
On 08/17/2015 12:18 PM, Eric Dumazet wrote:
> From: Eric Dumazet <edumazet@google.com>
<snip>
>
> Then can you try following fix as well ?
>
> Thanks !
>
> [PATCH] timer: fix a race in __mod_timer()
>
<snip>

I have been running the latest code from git with the 2 patches in this thread applied. No issues so far.

-Jon
diff --git a/kernel/time/timer.c b/kernel/time/timer.c
index 5e097fa9faf7..84190f02b521 100644
--- a/kernel/time/timer.c
+++ b/kernel/time/timer.c
@@ -807,8 +807,8 @@ __mod_timer(struct timer_list *timer, unsigned long expires,
 			spin_unlock(&base->lock);
 			base = new_base;
 			spin_lock(&base->lock);
-			timer->flags &= ~TIMER_BASEMASK;
-			timer->flags |= base->cpu;
+			WRITE_ONCE(timer->flags,
+				   (timer->flags & ~TIMER_BASEMASK) | base->cpu);
 		}
 	}
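
For readers outside the kernel tree, the patched statement can be approximated in userspace as below. This is a rough sketch under stated assumptions: `WRITE_ONCE_SKETCH()` is a simplified stand-in for the kernel's WRITE_ONCE() (a plain volatile store, relying on the GCC/Clang `__typeof__` extension), `struct timer_sketch` and `set_base_cpu()` are hypothetical names, and the flag values are assumed rather than taken from this thread.

```c
#include <assert.h>

/* Assumed flag layout (illustrative values, not taken from the patch). */
#define TIMER_CPUMASK    0x0007FFFFu
#define TIMER_MIGRATING  0x00080000u
#define TIMER_BASEMASK   (TIMER_CPUMASK | TIMER_MIGRATING)
#define TIMER_DEFERRABLE 0x00100000u

/*
 * Simplified stand-in for WRITE_ONCE(): one volatile store, so the
 * compiler may not split it into several partial writes.
 */
#define WRITE_ONCE_SKETCH(x, val) \
	(*(volatile __typeof__(x) *)&(x) = (val))

/* Hypothetical miniature of struct timer_list: only the flags word. */
struct timer_sketch {
	unsigned int flags;
};

/*
 * Mirrors the patched statement: clear the base bits (CPU index and
 * MIGRATING), OR in the new CPU, and publish the result with one store.
 * Bits outside TIMER_BASEMASK are carried over untouched.
 */
static void set_base_cpu(struct timer_sketch *t, unsigned int cpu)
{
	assert((cpu & ~TIMER_CPUMASK) == 0);
	WRITE_ONCE_SKETCH(t->flags, (t->flags & ~TIMER_BASEMASK) | cpu);
}
```

The right-hand side is fully evaluated before the store, so a concurrent reader never observes the cleared CPU field on its own.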