Patchwork deadlocks if use htb

login
register
mail settings
Submitter Jarek Poplawski
Date Dec. 15, 2008, 11:13 a.m.
Message ID <20081215111308.GA5853@ff.dom.local>
Download mbox | patch
Permalink /patch/14004/
State RFC
Delegated to: David Miller
Headers show

Comments

Jarek Poplawski - Dec. 15, 2008, 11:13 a.m.
On Thu, Dec 11, 2008 at 08:46:06AM +0000, Jarek Poplawski wrote:
> On Wed, Dec 10, 2008 at 06:14:28PM +0300, Badalian Vyacheslav wrote:
> > Hello again! Sorry for long away.
> 
> Hi!
> 
> > I was go away from this work for long time.
> > 
> > May we return to this bug?
> > Servers at last stable kernel 2.6.27.8
> > HZ=1000, HR=off, DynamicTicks=off, hysteresis=1
> > Sorry - no patched, update do not i. Do you have fresh patches or ideas
> > for tests?
> 
> Not much, but I can have if you only are willing to test them...
> I attach below a patch which combines 2 patches I sent yesterday to
> netdev (PATCH 7/6 and 8/6) vs. 2.6.27.7 (named testing patch #3 here).
> 
> You can still try the testing patch #2 I sent previously (quoted below)
> with or without this new #3 patch.
> 

Here is another idea worth checking (instead of patch #2).

Jarek P.

--- (testing patch #4)

--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Badalian Vyacheslav - Dec. 16, 2008, 7:37 a.m.
Now i test patches 2+3. 5 pc work without oops 5 days
> On Thu, Dec 11, 2008 at 08:46:06AM +0000, Jarek Poplawski wrote:
>   
>> On Wed, Dec 10, 2008 at 06:14:28PM +0300, Badalian Vyacheslav wrote:
>>     
>>> Hello again! Sorry for long away.
>>>       
>> Hi!
>>
>>     
>>> I was go away from this work for long time.
>>>
>>> May we return to this bug?
>>> Servers at last stable kernel 2.6.27.8
>>> HZ=1000, HR=off, DynamicTicks=off, hysteresis=1
>>> Sorry - no patched, update do not i. Do you have fresh patches or ideas
>>> for tests?
>>>       
>> Not much, but I can have if you only are willing to test them...
>> I attach below a patch which combines 2 patches I sent yesterday to
>> netdev (PATCH 7/6 and 8/6) vs. 2.6.27.7 (named testing patch #3 here).
>>
>> You can still try the testing patch #2 I sent previously (quoted below)
>> with or without this new #3 patch.
>>
>>     
>
> Here is another idea worth checking (instead of patch #2).
>
> Jarek P.
>
> --- (testing patch #4)
>
> diff -Nurp a2.6.27.7/net/sched/sch_htb.c b2.6.27.7/net/sched/sch_htb.c
> --- a2.6.27.7/net/sched/sch_htb.c	2008-12-11 08:16:16.000000000 +0000
> +++ b2.6.27.7/net/sched/sch_htb.c	2008-12-15 10:44:32.000000000 +0000
> @@ -924,6 +924,7 @@ static struct sk_buff *htb_dequeue(struc
>  		}
>  	}
>  	sch->qstats.overlimits++;
> +	qdisc_watchdog_cancel(&q->watchdog);
>  	qdisc_watchdog_schedule(&q->watchdog, next_event);
>  fin:
>  	return skb;
>
>   

--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Badalian Vyacheslav - Dec. 18, 2008, 6:43 a.m.
Hello
result: Patch 2+3 = uptime 7 days without crashes.
May i revert patches and try single new patch?

Thanks!

> On Thu, Dec 11, 2008 at 08:46:06AM +0000, Jarek Poplawski wrote:
>   
>> On Wed, Dec 10, 2008 at 06:14:28PM +0300, Badalian Vyacheslav wrote:
>>     
>>> Hello again! Sorry for long away.
>>>       
>> Hi!
>>
>>     
>>> I was go away from this work for long time.
>>>
>>> May we return to this bug?
>>> Servers at last stable kernel 2.6.27.8
>>> HZ=1000, HR=off, DynamicTicks=off, hysteresis=1
>>> Sorry - no patched, update do not i. Do you have fresh patches or ideas
>>> for tests?
>>>       
>> Not much, but I can have if you only are willing to test them...
>> I attach below a patch which combines 2 patches I sent yesterday to
>> netdev (PATCH 7/6 and 8/6) vs. 2.6.27.7 (named testing patch #3 here).
>>
>> You can still try the testing patch #2 I sent previously (quoted below)
>> with or without this new #3 patch.
>>
>>     
>
> Here is another idea worth checking (instead of patch #2).
>
> Jarek P.
>
> --- (testing patch #4)
>
> diff -Nurp a2.6.27.7/net/sched/sch_htb.c b2.6.27.7/net/sched/sch_htb.c
> --- a2.6.27.7/net/sched/sch_htb.c	2008-12-11 08:16:16.000000000 +0000
> +++ b2.6.27.7/net/sched/sch_htb.c	2008-12-15 10:44:32.000000000 +0000
> @@ -924,6 +924,7 @@ static struct sk_buff *htb_dequeue(struc
>  		}
>  	}
>  	sch->qstats.overlimits++;
> +	qdisc_watchdog_cancel(&q->watchdog);
>  	qdisc_watchdog_schedule(&q->watchdog, next_event);
>  fin:
>  	return skb;
>
>
>   

--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Jarek Poplawski - Dec. 18, 2008, 8:17 a.m.
On Thu, Dec 18, 2008 at 09:43:51AM +0300, Badalian Vyacheslav wrote:
> Hello
> result: Patch 2+3 = uptime 7 days without crashes.
> May i revert patches and try single new patch?

Here is my current opinion on this bug:

1) I'm almost sure it's not a htb, but hrtimers bug (some race),

2) the htb patches you've tested are not "the proper" way of fixing
   it; I see substantial changes in hrtimers code in the "-tip" tree
   (probably for 2.6.29), which, probably, you'll be advised by
   hrtimers maintainers to try, and I guess, it's not easy on a
   production system,

So, it's up to you:

1) since these patches work for you, you can stop with testing and
   wait with these patched kernels until 2.6.29 (I can propose this
   #2 patch as a temporary fix then),

2) for curiosity you could try this patch #4 alone on one box first
   (after reverting at least patch #2), but again: if it works, it
   could be only treated as a temporary hack, and alternative of #2.

Thanks,
Jarek P.

> 
> > On Thu, Dec 11, 2008 at 08:46:06AM +0000, Jarek Poplawski wrote:
> >   
> >> On Wed, Dec 10, 2008 at 06:14:28PM +0300, Badalian Vyacheslav wrote:
> >>     
> >>> Hello again! Sorry for long away.
> >>>       
> >> Hi!
> >>
> >>     
> >>> I was go away from this work for long time.
> >>>
> >>> May we return to this bug?
> >>> Servers at last stable kernel 2.6.27.8
> >>> HZ=1000, HR=off, DynamicTicks=off, hysteresis=1
> >>> Sorry - no patched, update do not i. Do you have fresh patches or ideas
> >>> for tests?
> >>>       
> >> Not much, but I can have if you only are willing to test them...
> >> I attach below a patch which combines 2 patches I sent yesterday to
> >> netdev (PATCH 7/6 and 8/6) vs. 2.6.27.7 (named testing patch #3 here).
> >>
> >> You can still try the testing patch #2 I sent previously (quoted below)
> >> with or without this new #3 patch.
> >>
> >>     
> >
> > Here is another idea worth checking (instead of patch #2).
> >
> > Jarek P.
> >
> > --- (testing patch #4)
> >
> > diff -Nurp a2.6.27.7/net/sched/sch_htb.c b2.6.27.7/net/sched/sch_htb.c
> > --- a2.6.27.7/net/sched/sch_htb.c	2008-12-11 08:16:16.000000000 +0000
> > +++ b2.6.27.7/net/sched/sch_htb.c	2008-12-15 10:44:32.000000000 +0000
> > @@ -924,6 +924,7 @@ static struct sk_buff *htb_dequeue(struc
> >  		}
> >  	}
> >  	sch->qstats.overlimits++;
> > +	qdisc_watchdog_cancel(&q->watchdog);
> >  	qdisc_watchdog_schedule(&q->watchdog, next_event);
> >  fin:
> >  	return skb;
> >
> >
> >   
> 
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Badalian Vyacheslav - Dec. 18, 2008, 11:23 a.m.
Thanks for all Jarek!

Vyacheslav Badalian

> On Thu, Dec 18, 2008 at 09:43:51AM +0300, Badalian Vyacheslav wrote:
>   
>> Hello
>> result: Patch 2+3 = uptime 7 days without crashes.
>> May i revert patches and try single new patch?
>>     
>
> Here is my current opinion on this bug:
>
> 1) I'm almost sure it's not a htb, but hrtimers bug (some race),
>
> 2) the htb patches you've tested are not "the proper" way of fixing
>    it; I see substantial changes in hrtimers code in the "-tip" tree
>    (probably for 2.6.29), which, probably, you'll be advised by
>    hrtimers maintainers to try, and I guess, it's not easy on a
>    production system,
>
> So, it's up to you:
>
> 1) since these patches work for you, you can stop with testing and
>    wait with these patched kernels until 2.6.29 (I can propose this
>    #2 patch as a temporary fix then),
>
> 2) for curiosity you could try this patch #4 alone on one box first
>    (after reverting at least patch #2), but again: if it works, it
>    could be only treated as a temporary hack, and alternative of #2.
>
> Thanks,
> Jarek P.
>
>   
>>> On Thu, Dec 11, 2008 at 08:46:06AM +0000, Jarek Poplawski wrote:
>>>   
>>>       
>>>> On Wed, Dec 10, 2008 at 06:14:28PM +0300, Badalian Vyacheslav wrote:
>>>>     
>>>>         
>>>>> Hello again! Sorry for long away.
>>>>>       
>>>>>           
>>>> Hi!
>>>>
>>>>     
>>>>         
>>>>> I was go away from this work for long time.
>>>>>
>>>>> May we return to this bug?
>>>>> Servers at last stable kernel 2.6.27.8
>>>>> HZ=1000, HR=off, DynamicTicks=off, hysteresis=1
>>>>> Sorry - no patched, update do not i. Do you have fresh patches or ideas
>>>>> for tests?
>>>>>       
>>>>>           
>>>> Not much, but I can have if you only are willing to test them...
>>>> I attach below a patch which combines 2 patches I sent yesterday to
>>>> netdev (PATCH 7/6 and 8/6) vs. 2.6.27.7 (named testing patch #3 here).
>>>>
>>>> You can still try the testing patch #2 I sent previously (quoted below)
>>>> with or without this new #3 patch.
>>>>
>>>>     
>>>>         
>>> Here is another idea worth checking (instead of patch #2).
>>>
>>> Jarek P.
>>>
>>> --- (testing patch #4)
>>>
>>> diff -Nurp a2.6.27.7/net/sched/sch_htb.c b2.6.27.7/net/sched/sch_htb.c
>>> --- a2.6.27.7/net/sched/sch_htb.c	2008-12-11 08:16:16.000000000 +0000
>>> +++ b2.6.27.7/net/sched/sch_htb.c	2008-12-15 10:44:32.000000000 +0000
>>> @@ -924,6 +924,7 @@ static struct sk_buff *htb_dequeue(struc
>>>  		}
>>>  	}
>>>  	sch->qstats.overlimits++;
>>> +	qdisc_watchdog_cancel(&q->watchdog);
>>>  	qdisc_watchdog_schedule(&q->watchdog, next_event);
>>>  fin:
>>>  	return skb;
>>>
>>>
>>>   
>>>       
>
>
>   

--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Jarek Poplawski - Dec. 18, 2008, 11:37 a.m.
On Thu, Dec 18, 2008 at 02:23:33PM +0300, Badalian Vyacheslav wrote:
> Thanks for all Jarek!

The same to you for very valuable reporting and testing Vyacheslav!

Jarek P.
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Chris Caputo - Jan. 14, 2009, 2:43 a.m.
On Thu, 18 Dec 2008, Jarek Poplawski wrote:
> On Thu, Dec 18, 2008 at 09:43:51AM +0300, Badalian Vyacheslav wrote:
> > Hello
> > result: Patch 2+3 = uptime 7 days without crashes.
> > May i revert patches and try single new patch?
> 
> Here is my current opinion on this bug:
> 
> 1) I'm almost sure it's not a htb, but hrtimers bug (some race),
> 
> 2) the htb patches you've tested are not "the proper" way of fixing
>    it; I see substantial changes in hrtimers code in the "-tip" tree
>    (probably for 2.6.29), which, probably, you'll be advised by
>    hrtimers maintainers to try, and I guess, it's not easy on a
>    production system,
> 
> So, it's up to you:
> 
> 1) since these patches work for you, you can stop with testing and
>    wait with these patched kernels until 2.6.29 (I can propose this
>    #2 patch as a temporary fix then),
> 
> 2) for curiosity you could try this patch #4 alone on one box first
>    (after reverting at least patch #2), but again: if it works, it
>    could be only treated as a temporary hack, and alternative of #2.

I think I am hitting the same issue discussed in this thread and:

 http://bugzilla.kernel.org/show_bug.cgi?id=11718

and wanted to share my data.

My system:

 2x Xeon @ 3.06ghz
 CONFIG_HZ=250
 CONFIG_HIGH_RES_TIMERS=y
 CONFIG_NO_HZ=y
 (please let me know if more details are useful)

With 2.6.27.10 after around 30 minutes or an hour, server tends to hang 
with:

 Pid: 0, comm: swapper Not tainted (2.6.27.10 #1)
 EIP: 0060:[<c023199f>] EFLAGS: 00000286 CPU: 0
 EIP is at run_hrtimer_pending+0x31/0xb8
 EAX: f6c1d468 EBX: c039b2ae ECX: 00000002 EDX: 00000001
 ESI: f6c1d468 EDI: c1807e20 EBP: c0587f28 ESP: c0587f18
  DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0068
 CR0: 8005003b CR2: 09a36358 CR3: 36d64000 CR4: 000006d0
 DR0: 00000000 DR1: 00000000 DR2: 00000000 DR3: 00000000
 DR6: ffff0ff0 DR7: 00000400
  [<c0231a3c>] run_hrtimer_softirq+0x16/0x18
  [<c0223e6d>] __do_softirq+0x64/0xcd
  [<c0223f0b>] do_softirq+0x35/0x3a
  [<c0224199>] irq_exit+0x38/0x6d
  [<c020eefb>] smp_apic_timer_interrupt+0x71/0x82
  [<c0203640>] apic_timer_interrupt+0x28/0x30
  [<c0207f9d>] ? default_idle+0x2d/0x42
  [<c0201946>] cpu_idle+0xca/0xea
  [<c045194e>] rest_init+0x4e/0x50

One time with 2.6.27.10:

 BUG: unable to handle kernel NULL pointer dereference at 000000
  [<c0239b6b>] smp_call_function+0x12/0x14
  [<c020df20>] native_smp_send_stop+0x1b/0x28
  [<c0220304>] panic+0x47/0xdf
  [<c0203ac2>] oops_end+0x5d/0x71
  [<c0203f9f>] die+0x57/0x5f
  [<c02129a7>] do_page_fault+0x474/0x529
  [<c0212533>] ? do_page_fault+0x0/0x529
  [<c045eb3a>] error_code+0x72/0x78
  [<c024007b>] ? cgroup_get_sb+0x20/0x2b4
  [<c02f223c>] ? rb_erase+0x118/0x241
  [<c023177d>] __remove_hrtimer+0x5f/0x67
  [<c039b2ae>] ? qdisc_watchdog+0x0/0x1c
  [<c023199a>] run_hrtimer_pending+0x2c/0xb8
  [<c0231a3c>] run_hrtimer_softirq+0x16/0x18
  [<c0223e6d>] __do_softirq+0x64/0xcd
  [<c0223f0b>] do_softirq+0x35/0x3a
  [<c0224199>] irq_exit+0x38/0x6d
  [<c020eefb>] smp_apic_timer_interrupt+0x71/0x82
  [<c0203640>] apic_timer_interrupt+0x28/0x30
  [<c0207f9d>] ? default_idle+0x2d/0x42
  [<c0201946>] cpu_idle+0xca/0xea
  [<c045194e>] rest_init+0x4e/0x50
  =======================
 ---[ end trace 818de8d3237477a5 ]---
 Rebooting in 10 seconds..

With 2.6.27.10 plus what is being called patches #2 and #3 in this thread, 
the server ran for a day or so before hanging again:

 Pid: 0, comm: swapper Not tainted (2.6.27.10 #3)
 EIP: 0060:[<c023199f>] EFLAGS: 00000206 CPU: 0
 EIP is at run_hrtimer_pending+0x31/0xb8
 EAX: f73f8470 EBX: c039b312 ECX: 00000002 EDX: 00000001
 ESI: f73f8470 EDI: c1807e20 EBP: c0589f20 ESP: c0589f10
  DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0068
 CR0: 8005003b CR2: 08ca3358 CR3: 005ca000 CR4: 000006d0
 DR0: 00000000 DR1: 00000000 DR2: 00000000 DR3: 00000000
 DR6: ffff0ff0 DR7: 00000400
  [<c0231a3c>] run_hrtimer_softirq+0x16/0x18
  [<c0223e6d>] __do_softirq+0x64/0xcd
  [<c0223f0b>] do_softirq+0x35/0x3a
  [<c0224199>] irq_exit+0x38/0x6d
  [<c02052b3>] do_IRQ+0x5c/0x75
  [<c0203553>] common_interrupt+0x23/0x28
  [<c0207f9d>] ? default_idle+0x2d/0x42
  [<c0201946>] cpu_idle+0xca/0xea
  [<c045225e>] rest_init+0x4e/0x50

With 2.6.28 (without the patches from this thread) after about an hour the 
system hung again.  "Show Regs" indicated:

 Pid: 0, comm: swapper Not tainted (2.6.28 #1)
 EIP: 0060:[<c03977bd>] EFLAGS: 00000247 CPU: 0
 EIP is at __netif_schedule+0xc/0x39
 EAX: f6934600 EBX: 00000000 ECX: f6934600 EDX: f6864000
 ESI: f6864488 EDI: c1807480 EBP: c05c1f08 ESP: c05c1f04
  DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0068
 CR0: 8005003b CR2: 09968358 CR3: 365cd000 CR4: 000006d0
 DR0: 00000000 DR1: 00000000 DR2: 00000000 DR3: 00000000
 DR6: ffff0ff0 DR7: 00000400
 Call Trace:
  [<c03a669d>] qdisc_watchdog+0x18/0x1c
  [<c02335c3>] run_hrtimer_pending+0x56/0xda
  [<c03a6685>] ? qdisc_watchdog+0x0/0x1c
  [<c023365d>] run_hrtimer_softirq+0x16/0x18
  [<c0225b1a>] __do_softirq+0x7a/0x11c
  [<c0225bf1>] do_softirq+0x35/0x3a
  [<c0225eb5>] irq_exit+0x38/0x6d
  [<c020f719>] smp_apic_timer_interrupt+0x71/0x7f
  [<c0203858>] apic_timer_interrupt+0x28/0x30
  [<c0208360>] ? default_idle+0x2d/0x42
  [<c0201938>] cpu_idle+0x6b/0x84
  [<c046b292>] rest_init+0x4e/0x50

Per Jarek's suggestion in bugzilla, I ran 2.6.28 plus Peter Zijlstra's 
"hrtimer: removing all ur callback modes" patches dated 2008-11-25, 
2008-12-04 and 2008-12-08.  Uptime was 2 days 22 hours before I hit what 
appears to be an unrelated bug related to the IPv6 FIB.  (Separately 
reported with subject 'panic with 2.6.28 while doing "ip -6 route"'.)

Chris
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Jarek Poplawski - Jan. 14, 2009, 6:39 a.m.
On Wed, Jan 14, 2009 at 02:43:13AM +0000, Chris Caputo wrote:
...
> Per Jarek's suggestion in bugzilla, I ran 2.6.28 plus Peter Zijlstra's 
> "hrtimer: removing all ur callback modes" patches dated 2008-11-25, 
> 2008-12-04 and 2008-12-08.  Uptime was 2 days 22 hours before I hit what 
> appears to be an unrelated bug related to the IPv6 FIB.  (Separately 
> reported with subject 'panic with 2.6.28 while doing "ip -6 route"'.)

Chris, this is a very good news, thanks for testing this!

Peter, great work! IMHO these patches should go to the stable (at
least 2.6.28).

Thanks,
Jarek P.
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Denys Fedoryshchenko - Jan. 14, 2009, 12:17 p.m.
I will try that patches too, when i got this message, after 3 minutes i got 
crash of my router :-) after working around 17 hours :-(
There is 3 of them by the way, 2 fixes also.

hrtimer: removing all ur callback modes
hrtimer: removing all ur callback modes, fix hotplug
hrtimer: removing all ur callback modes, fix

On Wednesday 14 January 2009 08:39:09 Jarek Poplawski wrote:
> On Wed, Jan 14, 2009 at 02:43:13AM +0000, Chris Caputo wrote:
> ...
>
> > Per Jarek's suggestion in bugzilla, I ran 2.6.28 plus Peter Zijlstra's
> > "hrtimer: removing all ur callback modes" patches dated 2008-11-25,
> > 2008-12-04 and 2008-12-08.  Uptime was 2 days 22 hours before I hit what
> > appears to be an unrelated bug related to the IPv6 FIB.  (Separately
> > reported with subject 'panic with 2.6.28 while doing "ip -6 route"'.)
>
> Chris, this is a very good news, thanks for testing this!
>
> Peter, great work! IMHO these patches should go to the stable (at
> least 2.6.28).
>
> Thanks,
> Jarek P.


--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Jarek Poplawski - Jan. 14, 2009, 12:36 p.m.
On Wed, Jan 14, 2009 at 02:17:58PM +0200, Denys Fedoryschenko wrote:
> I will try that patches too, when i got this message, after 3 minutes i got 
> crash of my router :-) after working around 17 hours :-(
> There is 3 of them by the way, 2 fixes also.
> 
> hrtimer: removing all ur callback modes
> hrtimer: removing all ur callback modes, fix hotplug
> hrtimer: removing all ur callback modes, fix

Yes, it's a very nasty bug which, BTW you reported the first AFAIK,
but I hope it's really gone with these patches.

Thanks,
Jarek P.
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Denys Fedoryshchenko - Jan. 14, 2009, 12:41 p.m.
On Wednesday 14 January 2009 14:36:23 Jarek Poplawski wrote:
> On Wed, Jan 14, 2009 at 02:17:58PM +0200, Denys Fedoryschenko wrote:
> > I will try that patches too, when i got this message, after 3 minutes i
> > got crash of my router :-) after working around 17 hours :-(
> > There is 3 of them by the way, 2 fixes also.
> >
> > hrtimer: removing all ur callback modes
> > hrtimer: removing all ur callback modes, fix hotplug
> > hrtimer: removing all ur callback modes, fix
>
> Yes, it's a very nasty bug which, BTW you reported the first AFAIK,
> but I hope it's really gone with these patches.
>
If it is really fix, it must go to mainline ASAP.
Because really many people hitting this bug... and complaining, that "Linux 
with htb not stable".
But i think it is not just fix :-(
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Peter Zijlstra - Jan. 14, 2009, 12:50 p.m.
On Wed, 2009-01-14 at 14:17 +0200, Denys Fedoryschenko wrote:
> I will try that patches too, when i got this message, after 3 minutes i got 
> crash of my router :-) after working around 17 hours :-(
> There is 3 of them by the way, 2 fixes also.
> 
> hrtimer: removing all ur callback modes
> hrtimer: removing all ur callback modes, fix hotplug
> hrtimer: removing all ur callback modes, fix

I'm afraid its a bit more than that:

ca109491f612aab5c8152207631c0444f63da97f
37810659ea7d9572c5ac284ade272f806ef8f788
a0a99b227da57f81319dd239bc4de811b0f530ec
b2e3c0adec918ea22b6c9d7c76193dd3aaba9bd4
8bdec955b0da2ffbd10eb9b200651dd1f9e366f2
d5fd43c4ae04523e1dcd7794f9c511b289851350
731a55ba0f17064f85903b7bf8e24849ec6cfa20
a6037b61c2f5fc99c57c15b26d7cfa58bbb34008
e3f1d883740b09e5116d4d4e30a6a6987264a83c
82c5b7b527ccc4b5d3cf832437e842f9d2920a79

and

6e5c172cf7ca1ab878cc6a6a4c1d52fef60f3ee0

Also, all this is rather invasive and large, so I'm not sure it will
meet the -stable criteria.

--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Jarek Poplawski - Jan. 14, 2009, 1:04 p.m.
On Wed, Jan 14, 2009 at 01:50:04PM +0100, Peter Zijlstra wrote:
> On Wed, 2009-01-14 at 14:17 +0200, Denys Fedoryschenko wrote:
> > I will try that patches too, when i got this message, after 3 minutes i got 
> > crash of my router :-) after working around 17 hours :-(
> > There is 3 of them by the way, 2 fixes also.
> > 
> > hrtimer: removing all ur callback modes
> > hrtimer: removing all ur callback modes, fix hotplug
> > hrtimer: removing all ur callback modes, fix
> 
> I'm afraid its a bit more than that:
> 
> ca109491f612aab5c8152207631c0444f63da97f
> 37810659ea7d9572c5ac284ade272f806ef8f788
> a0a99b227da57f81319dd239bc4de811b0f530ec
> b2e3c0adec918ea22b6c9d7c76193dd3aaba9bd4
> 8bdec955b0da2ffbd10eb9b200651dd1f9e366f2
> d5fd43c4ae04523e1dcd7794f9c511b289851350
> 731a55ba0f17064f85903b7bf8e24849ec6cfa20
> a6037b61c2f5fc99c57c15b26d7cfa58bbb34008
> e3f1d883740b09e5116d4d4e30a6a6987264a83c
> 82c5b7b527ccc4b5d3cf832437e842f9d2920a79
> 
> and
> 
> 6e5c172cf7ca1ab878cc6a6a4c1d52fef60f3ee0
> 
> Also, all this is rather invasive and large, so I'm not sure it will
> meet the -stable criteria.
> 

Earlier I thought it's very hard to hit, but after Chris reported it
triggers within hours, IMHO some minimal fix is necessary.

Jarek P.
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Denys Fedoryshchenko - Jan. 14, 2009, 1:05 p.m.
> I'm afraid its a bit more than that:
>
> ca109491f612aab5c8152207631c0444f63da97f
> 37810659ea7d9572c5ac284ade272f806ef8f788
> a0a99b227da57f81319dd239bc4de811b0f530ec
> b2e3c0adec918ea22b6c9d7c76193dd3aaba9bd4
> 8bdec955b0da2ffbd10eb9b200651dd1f9e366f2
> d5fd43c4ae04523e1dcd7794f9c511b289851350
> 731a55ba0f17064f85903b7bf8e24849ec6cfa20
> a6037b61c2f5fc99c57c15b26d7cfa58bbb34008
> e3f1d883740b09e5116d4d4e30a6a6987264a83c
> 82c5b7b527ccc4b5d3cf832437e842f9d2920a79
>
> and
>
> 6e5c172cf7ca1ab878cc6a6a4c1d52fef60f3ee0
>
> Also, all this is rather invasive and large, so I'm not sure it will
> meet the -stable criteria.

Then i will try to run 2.6.29-rc1-git4. Kind of risky to run rc1 on loaded 
router, but i am hitting this bug with hrtimers hardly, so no choice for me 
seems. And i cannot disable hrtimers because of bad shaper precision.
Thanks a lot for info and patches :-)

--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Jarek Poplawski - Jan. 14, 2009, 1:12 p.m.
On Wed, Jan 14, 2009 at 03:05:46PM +0200, Denys Fedoryschenko wrote:
> > I'm afraid its a bit more than that:
> >
> > ca109491f612aab5c8152207631c0444f63da97f
> > 37810659ea7d9572c5ac284ade272f806ef8f788
> > a0a99b227da57f81319dd239bc4de811b0f530ec
> > b2e3c0adec918ea22b6c9d7c76193dd3aaba9bd4
> > 8bdec955b0da2ffbd10eb9b200651dd1f9e366f2
> > d5fd43c4ae04523e1dcd7794f9c511b289851350
> > 731a55ba0f17064f85903b7bf8e24849ec6cfa20
> > a6037b61c2f5fc99c57c15b26d7cfa58bbb34008
> > e3f1d883740b09e5116d4d4e30a6a6987264a83c
> > 82c5b7b527ccc4b5d3cf832437e842f9d2920a79
> >
> > and
> >
> > 6e5c172cf7ca1ab878cc6a6a4c1d52fef60f3ee0
> >
> > Also, all this is rather invasive and large, so I'm not sure it will
> > meet the -stable criteria.
> 
> Then i will try to run 2.6.29-rc1-git4. Kind of risky to run rc1 on loaded 
> router, but i am hitting this bug with hrtimers hardly, so no choice for me 
> seems. And i cannot disable hrtimers because of bad shaper precision.
> Thanks a lot for info and patches :-)
> 

Denys, since these three patches worked for Chris, I guess it's safer
to stay with 2.6.28 anyway (or my patches mentioned earlier in this
thread). rc1 is really not stable...

Jarek P.
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Peter Zijlstra - Jan. 14, 2009, 1:15 p.m.
On Wed, 2009-01-14 at 13:12 +0000, Jarek Poplawski wrote:
> On Wed, Jan 14, 2009 at 03:05:46PM +0200, Denys Fedoryschenko wrote:
> > > I'm afraid its a bit more than that:
> > >
> > > ca109491f612aab5c8152207631c0444f63da97f
> > > 37810659ea7d9572c5ac284ade272f806ef8f788
> > > a0a99b227da57f81319dd239bc4de811b0f530ec
> > > b2e3c0adec918ea22b6c9d7c76193dd3aaba9bd4
> > > 8bdec955b0da2ffbd10eb9b200651dd1f9e366f2
> > > d5fd43c4ae04523e1dcd7794f9c511b289851350
> > > 731a55ba0f17064f85903b7bf8e24849ec6cfa20
> > > a6037b61c2f5fc99c57c15b26d7cfa58bbb34008
> > > e3f1d883740b09e5116d4d4e30a6a6987264a83c
> > > 82c5b7b527ccc4b5d3cf832437e842f9d2920a79
> > >
> > > and
> > >
> > > 6e5c172cf7ca1ab878cc6a6a4c1d52fef60f3ee0
> > >
> > > Also, all this is rather invasive and large, so I'm not sure it will
> > > meet the -stable criteria.
> > 
> > Then i will try to run 2.6.29-rc1-git4. Kind of risky to run rc1 on loaded 
> > router, but i am hitting this bug with hrtimers hardly, so no choice for me 
> > seems. And i cannot disable hrtimers because of bad shaper precision.
> > Thanks a lot for info and patches :-)
> > 
> 
> Denys, since these three patches worked for Chris, I guess it's safer
> to stay with 2.6.28 anyway (or my patches mentioned earlier in this
> thread). rc1 is really not stable...

Yeah, but those three patches have some holes in them still, so I cannot
recommend just patching in those three.

--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Denys Fedoryshchenko - Jan. 14, 2009, 1:19 p.m.
On Wednesday 14 January 2009 15:15:29 Peter Zijlstra wrote:
> >
> > Denys, since these three patches worked for Chris, I guess it's safer
> > to stay with 2.6.28 anyway (or my patches mentioned earlier in this
> > thread). rc1 is really not stable...
>
> Yeah, but those three patches have some holes in them still, so I cannot
> recommend just patching in those three.
I will test rc1-git4 on few units which is less important and have redundancy, 
and then will decide. Already i did this 3 patches, but i prefer complete 
solution. If i will hit any bug, at least i will do my best to report about 
this bugs and such way to help fix them :-)
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Jarek Poplawski - Jan. 14, 2009, 1:26 p.m.
On Wed, Jan 14, 2009 at 02:15:29PM +0100, Peter Zijlstra wrote:
> On Wed, 2009-01-14 at 13:12 +0000, Jarek Poplawski wrote:
> > On Wed, Jan 14, 2009 at 03:05:46PM +0200, Denys Fedoryschenko wrote:
> > > > I'm afraid its a bit more than that:
> > > >
> > > > ca109491f612aab5c8152207631c0444f63da97f
> > > > 37810659ea7d9572c5ac284ade272f806ef8f788
> > > > a0a99b227da57f81319dd239bc4de811b0f530ec
> > > > b2e3c0adec918ea22b6c9d7c76193dd3aaba9bd4
> > > > 8bdec955b0da2ffbd10eb9b200651dd1f9e366f2
> > > > d5fd43c4ae04523e1dcd7794f9c511b289851350
> > > > 731a55ba0f17064f85903b7bf8e24849ec6cfa20
> > > > a6037b61c2f5fc99c57c15b26d7cfa58bbb34008
> > > > e3f1d883740b09e5116d4d4e30a6a6987264a83c
> > > > 82c5b7b527ccc4b5d3cf832437e842f9d2920a79
> > > >
> > > > and
> > > >
> > > > 6e5c172cf7ca1ab878cc6a6a4c1d52fef60f3ee0
> > > >
> > > > Also, all this is rather invasive and large, so I'm not sure it will
> > > > meet the -stable criteria.
> > > 
> > > Then i will try to run 2.6.29-rc1-git4. Kind of risky to run rc1 on loaded 
> > > router, but i am hitting this bug with hrtimers hardly, so no choice for me 
> > > seems. And i cannot disable hrtimers because of bad shaper precision.
> > > Thanks a lot for info and patches :-)
> > > 
> > 
> > Denys, since these three patches worked for Chris, I guess it's safer
> > to stay with 2.6.28 anyway (or my patches mentioned earlier in this
> > thread). rc1 is really not stable...
> 
> Yeah, but those three patches have some holes in them still, so I cannot
> recommend just patching in those three.
> 

OK, I hope Denys can apply more, but what about others? Without any
patches the hole seems to be much bigger.

Jarek P.
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Peter Zijlstra - Jan. 14, 2009, 1:32 p.m.
On Wed, 2009-01-14 at 13:26 +0000, Jarek Poplawski wrote:

> OK, I hope Denys can apply more, but what about others? Without any
> patches the hole seems to be much bigger.

OK, I read most of this thread on netdev, but didn't find a clear clue
on the specific hrtimer insertion race.

Do you have any clear ideas or should I poke at the htb/hrtimer code a
little?

--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Jarek Poplawski - Jan. 14, 2009, 1:57 p.m.
On Wed, Jan 14, 2009 at 02:32:26PM +0100, Peter Zijlstra wrote:
> On Wed, 2009-01-14 at 13:26 +0000, Jarek Poplawski wrote:
> 
> > OK, I hope Denys can apply more, but what about others? Without any
> > patches the hole seems to be much bigger.
> 
> OK, I read most of this thread on netdev, but didn't find a clear clue
> on the specific hrtimer insertion race.
> 
> Do you have any clear ideas or should I poke at the htb/hrtimer code a
> little?
> 

I didn't diagnose this entirely, and there were "a few" days since I
wrote my opinion on this, so currently I forgot details, but I guess
the most suspicious place is adding a hrtimer, when it's on the list,
maybe even two times in a row. I guess it's not removed from the
rbtree in some case. (I stopped searching after seeing you removed
this list or something...)

Jarek P.
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Jarek Poplawski - Jan. 14, 2009, 2:13 p.m.
On Wed, Jan 14, 2009 at 02:32:26PM +0100, Peter Zijlstra wrote:
> On Wed, 2009-01-14 at 13:26 +0000, Jarek Poplawski wrote:
> 
> > OK, I hope Denys can apply more, but what about others? Without any
> > patches the hole seems to be much bigger.
> 
> OK, I read most of this thread on netdev, but didn't find a clear clue
> on the specific hrtimer insertion race.

There is something at the beginning of this thread, plus earlier
threads mostly with Denys as sender, and "htb bug" in the subject.

> 
> Do you have any clear ideas or should I poke at the htb/hrtimer code a
> little?
> 

...And htb code is htb_dequeue(): qdisc_watchdog_schedule().

Jarek P.
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Peter Zijlstra - Jan. 14, 2009, 2:28 p.m.
On Wed, 2009-01-14 at 14:13 +0000, Jarek Poplawski wrote:
> On Wed, Jan 14, 2009 at 02:32:26PM +0100, Peter Zijlstra wrote:
> > On Wed, 2009-01-14 at 13:26 +0000, Jarek Poplawski wrote:
> > 
> > > OK, I hope Denys can apply more, but what about others? Without any
> > > patches the hole seems to be much bigger.
> > 
> > OK, I read most of this thread on netdev, but didn't find a clear clue
> > on the specific hrtimer insertion race.
> 
> There is something at the beginning of this thread, plus earlier
> threads mostly with Denys as sender, and "htb bug" in the subject.
> 
> > 
> > Do you have any clear ideas or should I poke at the htb/hrtimer code a
> > little?
> > 
> 
> ....And htb code is htb_dequeue(): qdisc_watchdog_schedule().

Right, found all that...

Can't spot anything obviously wrong though.. hrtimer_start*() does
remove_hrtimer() which checks STATE_ENQUEUED, STATE_PENDING and pulls it
off the relevant list before it continues the enqueue.

However a loop in enqueue_hrtimer() would suggest a corrupted RB-tree,
but I cannot find an RB-op that doesn't hold base-lock.

Hohumm..

--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Jarek Poplawski - Jan. 14, 2009, 2:39 p.m.
On Wed, Jan 14, 2009 at 03:28:03PM +0100, Peter Zijlstra wrote:
> On Wed, 2009-01-14 at 14:13 +0000, Jarek Poplawski wrote:
> > On Wed, Jan 14, 2009 at 02:32:26PM +0100, Peter Zijlstra wrote:
> > > On Wed, 2009-01-14 at 13:26 +0000, Jarek Poplawski wrote:
> > > 
> > > > OK, I hope Denys can apply more, but what about others? Without any
> > > > patches the hole seems to be much bigger.
> > > 
> > > OK, I read most of this thread on netdev, but didn't find a clear clue
> > > on the specific hrtimer insertion race.
> > 
> > There is something at the beginning of this thread, plus earlier
> > threads mostly with Denys as sender, and "htb bug" in the subject.
> > 
> > > 
> > > Do you have any clear ideas or should I poke at the htb/hrtimer code a
> > > little?
> > > 
> > 
> > ....And htb code is htb_dequeue(): qdisc_watchdog_schedule().
> 
> Right, found all that...
> 
> Can't spot anything obviously wrong though.. hrtimer_start*() does
> remove_hrtimer() which checks STATE_ENQUEUED, STATE_PENDING and pulls it
> off the relevant list before it continues the enqueue.
> 
> However a loop in enqueue_hrtimer() would suggest a corrupted RB-tree,
> but I cannot find an RB-op that doesn't hold base-lock.
> 

Anyway I think some trace in rbtree seemed to show the parent is the
same with the node, if I didn't mess anything.

Jarek P.
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Chris Caputo - Jan. 14, 2009, 6:02 p.m.
On Wed, 14 Jan 2009, Peter Zijlstra wrote:
> On Wed, 2009-01-14 at 14:17 +0200, Denys Fedoryschenko wrote:
> > I will try that patches too, when i got this message, after 3 minutes i got 
> > crash of my router :-) after working around 17 hours :-(
> > There is 3 of them by the way, 2 fixes also.
> > 
> > hrtimer: removing all ur callback modes
> > hrtimer: removing all ur callback modes, fix hotplug
> > hrtimer: removing all ur callback modes, fix
> 
> I'm afraid its a bit more than that:
> 
> ca109491f612aab5c8152207631c0444f63da97f
> 37810659ea7d9572c5ac284ade272f806ef8f788
> a0a99b227da57f81319dd239bc4de811b0f530ec
> b2e3c0adec918ea22b6c9d7c76193dd3aaba9bd4
> 8bdec955b0da2ffbd10eb9b200651dd1f9e366f2
> d5fd43c4ae04523e1dcd7794f9c511b289851350
> 731a55ba0f17064f85903b7bf8e24849ec6cfa20
> a6037b61c2f5fc99c57c15b26d7cfa58bbb34008
> e3f1d883740b09e5116d4d4e30a6a6987264a83c
> 82c5b7b527ccc4b5d3cf832437e842f9d2920a79
> 
> and
> 
> 6e5c172cf7ca1ab878cc6a6a4c1d52fef60f3ee0
> 
> Also, all this is rather invasive and large, so I'm not sure it will
> meet the -stable criteria.

In my testing (another 12 hours of uptime, BTW), the three patches I am 
using are:

  ca109491f612aab5c8152207631c0444f63da97f
  37810659ea7d9572c5ac284ade272f806ef8f788
  a0a99b227da57f81319dd239bc4de811b0f530ec

I have not tried the other patches.

That said, I would not recommend just the three for -stable unless they 
get a much wider amount of testing, on multiple platforms.  I don't see 
that as likely to happen, plus Peter says they are incomplete, so maybe it 
is just best to recommend that 2.6.28 users getting crashes while using 
HTB try these specific patches at first, and then the rest of the patches 
if they do not work.

Thanks,
Chris
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Jarek Poplawski - Jan. 15, 2009, 9:01 a.m.
On Wed, Jan 14, 2009 at 03:28:03PM +0100, Peter Zijlstra wrote:
...
> Right, found all that...
> 
> Can't spot anything obviously wrong though.. hrtimer_start*() does
> remove_hrtimer() which checks STATE_ENQUEUED, STATE_PENDING and pulls it
> off the relevant list before it continues the enqueue.
> 
> However a loop in enqueue_hrtimer() would suggest a corrupted RB-tree,
> but I cannot find an RB-op that doesn't hold base-lock.
> 

I've revisited it yesterday, and if I don't miss something, there is
possible a scenario similar to this:

cpu1:				cpu2:

run_hrtimer_pending
spin_unlock
restart = fn(timer)

				hrtimer_start
				enqueue_hrtimer

				hrtimer_start
				remove_hrtimer
				(the HRTIMER_STATE_CALLBACK is removed)

				switch_hrtimer_base
spin_lock
(not this hrtimer's anymore)
__remove_hrtimer
list_add_tail			enqueue_hrtimer


Jarek P.
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Peter Zijlstra - Jan. 15, 2009, 10:46 a.m.
On Thu, 2009-01-15 at 09:01 +0000, Jarek Poplawski wrote:
> On Wed, Jan 14, 2009 at 03:28:03PM +0100, Peter Zijlstra wrote:
> ....
> > Right, found all that...
> > 
> > Can't spot anything obviously wrong though.. hrtimer_start*() does
> > remove_hrtimer() which checks STATE_ENQUEUED, STATE_PENDING and pulls it
> > off the relevant list before it continues the enqueue.
> > 
> > However a loop in enqueue_hrtimer() would suggest a corrupted RB-tree,
> > but I cannot find an RB-op that doesn't hold base-lock.
> > 
> 
> I've revisited it yesterday, and if I don't miss something, there is
> possible a scenario similar to this:
> 
> cpu1:				cpu2:
> 
> run_hrtimer_pending
> spin_unlock
> restart = fn(timer)
> 
> 				hrtimer_start
> 				enqueue_hrtimer
> 
> 				hrtimer_start
> 				remove_hrtimer
> 				(the HRTIMER_STATE_CALLBACK is removed)
> 
> 				switch_hrtimer_base
> spin_lock
> (not this hrtimer's anymore)
> __remove_hrtimer
> list_add_tail			enqueue_hrtimer
> 

(looking at .28 code)

run_hrtimer_pending() reads like:

while (pending timers) {
  __remove_hrtimer(timer, HRTIMER_STATE_CALLBACK);
  spin_unlock(&cpu_base->lock);

  fn(timer);

  spin_lock(&cpu_base->lock);
  timer->state &= ~HRTIMER_STATE_CALLBACK; // _should_ result in HRTIMER_STATE_INACTIVE
  if (HRTIMER_RESTART)
    re-queue
  else if (timer->state != INACTIVE) {
    // so another cpu re-queued this timer _while_ we were executing it.
    if (timer is first && !reprogramm) {
      __remove_hrtimer(timer, HRTIMER_STATE_PENDING);
      list_add_tail(timer, &cb_pending);
    }
  } 
}

So in the window where we drop the lock, one can, as you said, have
another cpu requeue the timer, but the rb_entry and list_entry are free,
so it should not cause the data corruption we're seeing.



--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Jarek Poplawski - Jan. 15, 2009, 10:54 a.m.
On Thu, Jan 15, 2009 at 11:46:48AM +0100, Peter Zijlstra wrote:
> On Thu, 2009-01-15 at 09:01 +0000, Jarek Poplawski wrote:
...
> > spin_lock
> > (not this hrtimer's anymore)
> > __remove_hrtimer
> > list_add_tail			enqueue_hrtimer
> > 
> 
> (looking at .28 code)
> 
> run_hrtimer_pending() reads like:
> 
> while (pending timers) {
>   __remove_hrtimer(timer, HRTIMER_STATE_CALLBACK);
>   spin_unlock(&cpu_base->lock);
> 
>   fn(timer);
> 
>   spin_lock(&cpu_base->lock);
>   timer->state &= ~HRTIMER_STATE_CALLBACK; // _should_ result in HRTIMER_STATE_INACTIVE
>   if (HRTIMER_RESTART)
>     re-queue
>   else if (timer->state != INACTIVE) {
>     // so another cpu re-queued this timer _while_ we were executing it.
>     if (timer is first && !reprogramm) {
>       __remove_hrtimer(timer, HRTIMER_STATE_PENDING);
>       list_add_tail(timer, &cb_pending);
>     }
>   } 
> }
> 
> So in the window where we drop the lock, one can, as you said, have
> another cpu requeue the timer, but the rb_entry and list_entry are free,
> so it should not cause the data corruption we're seeing.
> 

Can't they be enqueued to the list (without a lock) and rbtree at the
same time? Then removing is done for the list only?

Jarek P. 
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Patch

diff -Nurp a2.6.27.7/net/sched/sch_htb.c b2.6.27.7/net/sched/sch_htb.c
--- a2.6.27.7/net/sched/sch_htb.c	2008-12-11 08:16:16.000000000 +0000
+++ b2.6.27.7/net/sched/sch_htb.c	2008-12-15 10:44:32.000000000 +0000
@@ -924,6 +924,7 @@  static struct sk_buff *htb_dequeue(struc
 		}
 	}
 	sch->qstats.overlimits++;
+	qdisc_watchdog_cancel(&q->watchdog);
 	qdisc_watchdog_schedule(&q->watchdog, next_event);
 fin:
 	return skb;