Patchwork deadlocks if use htb

Submitter Jarek Poplawski
Date Oct. 10, 2008, 7:56 a.m.
Message ID <20081010075640.GA5204@ff.dom.local>
Permalink /patch/3696/
State RFC
Delegated to: David Miller

Comments

Jarek Poplawski - Oct. 10, 2008, 7:56 a.m.
On 10-10-2008 07:44, Badalian Vyacheslav wrote:
> Hello all!

Hello Slavon,

> 
> Please take a look if you have time:
> http://bugzilla.kernel.org/show_bug.cgi?id=11718
> 
> We get deadlocks on a few PCs about once a week.
> I can test any patches to detect and fix the problem.
> I am now testing a 2.6.27-rc kernel on one PC.

A similar bug was reported by Denys Fedoryshchenko, but it wasn't fully
diagnosed. Anyway, it looks hardware dependent. The patch below
can sometimes help. 2.6.27 may have this fixed too (in some other way).

Jarek P.

(some offsets are OK when patching 2.6.26)
---

 net/sched/sch_htb.c |    8 +++++++-
 1 files changed, 7 insertions(+), 1 deletions(-)

--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Badalian Vyacheslav - Oct. 10, 2008, 8:46 a.m.
Jarek Poplawski wrote:
> On 10-10-2008 07:44, Badalian Vyacheslav wrote:
>   
>> Hello all!
>>     
>
> Hello Slavon,
>
>   
>> Please look to if you have time:
>> http://bugzilla.kernel.org/show_bug.cgi?id=11718
>>
>> We have deadlocks at few PC one times in week.
>> I can test any patches to detect and fix problem.
>> Now i test 2.6.27-rc kernel at one PC.
>>     
>
> A similar bug was reported by Denys Fedoryshchenko but it wasn't fully
> diagnosed. Anyway it looks like hardware dependent. The patch below
> can sometimes help. 2.6.27 may have this fixed too (some other way).
>
>   
2.6.27 - I got this just now:


[ 6951.841662] BUG: NMI Watchdog detected LOCKUP on CPU3, ip c01fde4c, registers:
[ 6951.841662] Modules linked in: sch_sfq sch_htb netconsole e1000 i2c_i801 e1000e i2c_core
[ 6951.841662]
[ 6951.841662] Pid: 0, comm: swapper Not tainted (2.6.27-fw #1)
[ 6951.841662] EIP: 0060:[<c01fde4c>] EFLAGS: 00000092 CPU: 3
[ 6951.841662] EIP is at __rb_rotate_right+0xc/0x70
[ 6951.841662] EAX: f70c3c68 EBX: f70c3c68 ECX: f70c3c68 EDX: c202c134
[ 6951.841662] ESI: f70c3c68 EDI: f70c3c68 EBP: c202c134 ESP: f785fc2c
[ 6951.841662]  DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0068
[ 6951.841662] Process swapper (pid: 0, ti=f785e000 task=f7832940 task.ti=f785e000)
[ 6951.841662] Stack: f70c3c68 f70c3c68 f70c3c68 c01fdf41 f70c3c68 00000000 c202c12c c202c134
[ 6951.841662]        c013a91f f70c3c68 c202c12c c202212c c045b100 c013ae0a 00000000 c013d63d
[ 6951.841662]        9a011800 00000652 00000001 00000282 00000652 f70c3c68 00000000 00000000
[ 6951.841662] Call Trace:
[ 6951.841662]  [<c01fdf41>] rb_insert_color+0x91/0xc0
[ 6951.841662]  [<c013a91f>] enqueue_hrtimer+0x5f/0x80
[ 6951.841662]  [<c013ae0a>] hrtimer_start+0xaa/0x130
[ 6951.841662]  [<c013d63d>] getnstimeofday+0x3d/0xe0
[ 6951.841662]  [<c02de83d>] qdisc_watchdog_schedule+0x3d/0x50
[ 6951.841662]  [<f88ac343>] htb_dequeue+0x683/0x7b0 [sch_htb]
[ 6951.841662]  [<c02ce692>] dev_hard_start_xmit+0x1d2/0x2c0
[ 6951.841662]  [<c02dc87a>] __qdisc_run+0x13a/0x1d0
[ 6951.841662]  [<c02d0ed7>] dev_queue_xmit+0x227/0x4f0
[ 6951.841662]  [<c02f29ff>] ip_finish_output+0x11f/0x280
[ 6951.841662]  [<c02f00e0>] ip_forward+0x290/0x310
[ 6951.841662]  [<c02efe35>] ip_forward_finish+0x25/0x40
[ 6951.841662]  [<c02ee9a2>] ip_rcv_finish+0x122/0x360
[ 6951.841662]  [<c02c8cc6>] __alloc_skb+0x36/0x120
[ 6951.841662]  [<c02c9d02>] __netdev_alloc_skb+0x22/0x50
[ 6951.841662]  [<c02eee20>] ip_rcv+0x0/0x290
[ 6951.841662]  [<c02ce064>] netif_receive_skb+0x274/0x4d0
[ 6951.841662]  [<c0108b1a>] nommu_map_single+0x2a/0x60
[ 6951.841662]  [<f883be39>] e1000_receive_skb+0x49/0x80 [e1000e]
[ 6951.841662]  [<f883e84c>] e1000_clean_rx_irq+0x23c/0x300 [e1000e]
[ 6951.841662]  [<f883b3ad>] e1000_clean+0x1bd/0x570 [e1000e]
[ 6951.841662]  [<c02d03bc>] net_rx_action+0x13c/0x200
[ 6951.841662]  [<c0129b72>] __do_softirq+0x82/0x100
[ 6951.841662]  [<c0129c27>] do_softirq+0x37/0x40
[ 6951.841662]  [<c0106060>] do_IRQ+0x40/0x80
[ 6951.841662]  [<c01134c7>] smp_apic_timer_interrupt+0x57/0x90
[ 6951.841662]  [<c010457f>] common_interrupt+0x23/0x28
[ 6951.841662]  [<c0109aa2>] mwait_idle+0x32/0x40
[ 6951.841662]  [<c01026c8>] cpu_idle+0x48/0xe0
[ 6951.841662]  =======================
[ 6951.841662] Code: 24 08 83 e0 03 09 d0 89 03 8b 1c 24 83 c4 0c c3 89 56 08 eb e3 8d 76 00 8d bc 27 00 00 00 00 83 ec 0c 89 1c 24 89 c3 89 7c 24 08 <89> d7 89 74 24 04 8b 50 08 8b 30 8b 4a 04 83 e6 fc 85 c9 89 48





Badalian Vyacheslav - Oct. 10, 2008, 8:52 a.m.
Oh... sorry. I misunderstood you... I will patch 2.6.27 with this patch
and test it...


Jarek Poplawski - Oct. 10, 2008, 9:04 a.m.
On Fri, Oct 10, 2008 at 12:52:58PM +0400, Badalian Vyacheslav wrote:
> Oh... sorry. I Wrong you understand... i patch 2.6.27 with this patch
> and will test it...

No, you understood it right. But it seems the 2.6.27 fix doesn't work for
you. So, yes, try this patch with 2.6.26 or 2.6.27.

Jarek P.


Jarek Poplawski - Oct. 10, 2008, 9:51 a.m.
On Fri, Oct 10, 2008 at 09:04:26AM +0000, Jarek Poplawski wrote:
> On Fri, Oct 10, 2008 at 12:52:58PM +0400, Badalian Vyacheslav wrote:
> > Oh... sorry. I Wrong you understand... i patch 2.6.27 with this patch
> > and will test it...
> 
> No, you understood it right. But it seems 2.6.27 fix doesn't work for
> you. So, yes, try this patch with 2.6.26 or 2.6.27.

There is also a parameter you could try (without this patch):

modprobe sch_htb htb_hysteresis=1

Jarek P.

Patrick McHardy - Oct. 10, 2008, 12:32 p.m.
Jarek Poplawski wrote:
> On 10-10-2008 07:44, Badalian Vyacheslav wrote:
>> Hello all!
> 
> Hello Slavon,
> 
>> Please look to if you have time:
>> http://bugzilla.kernel.org/show_bug.cgi?id=11718
>>
>> We have deadlocks at few PC one times in week.
>> I can test any patches to detect and fix problem.
>> Now i test 2.6.27-rc kernel at one PC.
> 
> A similar bug was reported by Denys Fedoryshchenko but it wasn't fully
> diagnosed. Anyway it looks like hardware dependent. The patch below
> can sometimes help. 2.6.27 may have this fixed too (some other way).

I doubt it's hardware related. What's happening is that the hrtimer
insertion gets into an endless loop because the rb tree (or node)
apparently has a loop. I went through the qdiscs' use of hrtimers
again, but can't spot any error there.

Denys, did your systems also have CONFIG_HIGH_RES_TIMERS=n?
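
The diagnosis above (an rb tree that contains a loop, so that insertion never terminates) can be illustrated with a small user-space sketch. This is not kernel code: `struct node` and `walk_to_root()` are hypothetical names, and the node is reduced to a bare parent pointer. It only shows why a walk over parent links, such as the one rb_insert_color() performs while rebalancing, cannot finish once a node has become its own ancestor, and how a step bound would expose that.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified node. The kernel's struct rb_node also
 * carries a colour and two child pointers; only the parent link
 * matters for this illustration. */
struct node {
	struct node *parent;
};

/* Follow parent links toward the root, giving up after max_steps.
 * Returns the number of links followed, or -1 if the bound was
 * exceeded. For a tree known to be shallower than max_steps, -1
 * means the parent chain contains a cycle. */
static int walk_to_root(const struct node *n, int max_steps)
{
	int steps = 0;

	assert(n != NULL);
	while (n->parent) {
		if (++steps > max_steps)
			return -1;
		n = n->parent;
	}
	return steps;
}
```

With a healthy chain b -> a -> root the walk takes two steps. If a.parent is then overwritten to point back at b, an unbounded walk would spin forever, which is the kind of behaviour the NMI watchdog reports as a lockup; the bounded version returns -1 instead.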
Patrick McHardy - Oct. 10, 2008, 12:34 p.m.
Patrick McHardy wrote:
> Jarek Poplawski wrote:
>> On 10-10-2008 07:44, Badalian Vyacheslav wrote:
>>> Hello all!
>>
>> Hello Slavon,
>>
>>> Please look to if you have time:
>>> http://bugzilla.kernel.org/show_bug.cgi?id=11718
>>>
>>> We have deadlocks at few PC one times in week.
>>> I can test any patches to detect and fix problem.
>>> Now i test 2.6.27-rc kernel at one PC.
>>
>> A similar bug was reported by Denys Fedoryshchenko but it wasn't fully
>> diagnosed. Anyway it looks like hardware dependent. The patch below
>> can sometimes help. 2.6.27 may have this fixed too (some other way).
> 
> I doubt its hardware related, whats happening is that the hrtimer
> insertion gets into an endless loop because the rb tree (or node)
> apparently has a loop. I went through the qdiscs' use of hrtimers
> again, but can't spot any error there.
> 
> Denys, did your systems also have CONFIG_HIGH_RES_TIMERS=n?

Badalian, please try enabling CONFIG_DEBUG_OBJECTS and post the
results, if any.
Badalian Vyacheslav - Oct. 10, 2008, 12:54 p.m.
Patrick McHardy wrote:
> Patrick McHardy wrote:
>> Jarek Poplawski wrote:
>>> On 10-10-2008 07:44, Badalian Vyacheslav wrote:
>>>> Hello all!
>>>
>>> Hello Slavon,
>>>
>>>> Please look to if you have time:
>>>> http://bugzilla.kernel.org/show_bug.cgi?id=11718
>>>>
>>>> We have deadlocks at few PC one times in week.
>>>> I can test any patches to detect and fix problem.
>>>> Now i test 2.6.27-rc kernel at one PC.
>>>
>>> A similar bug was reported by Denys Fedoryshchenko but it wasn't fully
>>> diagnosed. Anyway it looks like hardware dependent. The patch below
>>> can sometimes help. 2.6.27 may have this fixed too (some other way).
>>
>> I doubt its hardware related, whats happening is that the hrtimer
>> insertion gets into an endless loop because the rb tree (or node)
>> apparently has a loop. I went through the qdiscs' use of hrtimers
>> again, but can't spot any error there.
>>
>> Denys, did your systems also have CONFIG_HIGH_RES_TIMERS=n?
>
> Badalian, please try enabling CONFIG_DEBUG_OBJECTS and post the
> results, if any.

I have some results with both CONFIG_HIGH_RES_TIMERS=n and
CONFIG_HIGH_RES_TIMERS=y.
OK... I have recompiled the kernel and now simply wait for a crash to reboot =)
Currently I have these PCs:

1. 2.6.27 with the patch
2. 2.6.26.6 with htb_hysteresis=1 and CONFIG_DEBUG_OBJECTS=n
3. 2.6.26.6 with htb_hysteresis=1 and CONFIG_DEBUG_OBJECTS=y (waiting for
a crash to reboot)
4. 2 servers that deadlocked and did not reboot after the panic (2.6.26.5
kernel)... I need to drive to them to reboot them...
5. 4 PCs with 2.6.24-rc7-git2 that do the same shaping but have no
crashes (they are on other hardware)

Badalian Vyacheslav - Oct. 16, 2008, 8:28 a.m.
Jarek Poplawski wrote:
> On Fri, Oct 10, 2008 at 09:04:26AM +0000, Jarek Poplawski wrote:
>   
>> On Fri, Oct 10, 2008 at 12:52:58PM +0400, Badalian Vyacheslav wrote:
>>     
>>> Oh... sorry. I Wrong you understand... i patch 2.6.27 with this patch
>>> and will test it...
>>>       
>> No, you understood it right. But it seems 2.6.27 fix doesn't work for
>> you. So, yes, try this patch with 2.6.26 or 2.6.27.
>>     
>
> There is also a parameter you could try (without this patch):
>
> modprobe sch_htb htb_hysteresis=1
>
> Jarek P.
>
>   
Sorry for the late answer.

We had power trouble in our server room. That is resolved now, and I will
test all of this again.

With the patch + htb_hysteresis=0, and with htb_hysteresis=1 without the
patch, all PCs ran fine for 2 days and 18 hours. After that we had the
power failure... =(

Thanks
Jarek Poplawski - Oct. 16, 2008, 8:40 a.m.
On Thu, Oct 16, 2008 at 12:28:46PM +0400, Badalian Vyacheslav wrote:
...
> Sorry for long answer.
> 
> We have troubles with power in our server place. Now its gone and i will
> test again all this.
> 
> With patch +  htb_hysteresis=0 and htb_hysteresis=1 without patch all PC
> work done 2 days and 18 hours. After this we have power crash.... =(

No need to hurry: you've written that it doesn't happen every day. Better
try to make sure there is really a difference after any of these changes.

Thanks,
Jarek P.
Badalian Vyacheslav - Oct. 22, 2008, 6:06 a.m.
Hello!
I have more information.

Status of the PCs:
1. 2.6.26.6, Dynamic Timer, HiResTimer, 1000HZ, htb_hysteresis=0 -
crashed 1d 18h ago
2. 2.6.26.5, HZ300, no Dynamic Timer, no HiResTimer, htb_hysteresis=0 -
uptime 5d 17h (no crashes so far, but it crashed some time ago with
htb_hysteresis=1)
3. 2.6.27, 1000HZ, no Dynamic Timer, no HiResTimer, htb_hysteresis=0 +
PATCH - uptime 5d 17h (no crashes so far, but it crashed some time ago
without the patch)

I also attach the crash log from the last crash of PC 1:

[10610.110729] BUG: NMI Watchdog detected LOCKUP on CPU1, ip c01fd939, registers:
[10610.110729] Modules linked in: netconsole e1000e i2c_i801 i2c_core e1000
[10610.110729]
[10610.110729] Pid: 0, comm: swapper Not tainted (2.6.26.6-fw #1)
[10610.110729] EIP: 0060:[<c01fd939>] EFLAGS: 00000082 CPU: 1
[10610.110729] EIP is at rb_insert_color+0x19/0xc0
[10610.110729] EAX: f6c23ca4 EBX: f6c23ca4 ECX: 00000000 EDX: f6c23ca4
[10610.110729] ESI: f6c23ca4 EDI: f6c23ca4 EBP: c20190e0 ESP: f7c4dc98
[10610.110729]  DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0068
[10610.110729] Process swapper (pid: 0, ti=f7c4c000 task=f7c314a0 task.ti=f7c4c000)
[10610.110729] Stack: f6c23ca8 f6c23ca4 f6c23ca4 00000000 c013b672 c20190e0 00000001 c20190d8
[10610.110729]        c20190d8 f6c23ca4 c202d0d8 c04470a0 c013bd4d 00000000 f7c4dcf4 7c491000
[10610.110729]        000009a5 00000001 00000286 f6c23800 ffffffff 00000000 00000000 c02d407e
[10610.110729] Call Trace:
[10610.110729]  [<c013b672>] enqueue_hrtimer+0x72/0xf0
[10610.110729]  [<c013bd4d>] hrtimer_start+0xad/0x150
[10610.110729]  [<c02d407e>] qdisc_watchdog_schedule+0x1e/0x30
[10610.110729]  [<c02d9826>] htb_dequeue+0x6a6/0x810
[10610.110729]  [<c02d3f72>] tc_classify+0x42/0x90
[10610.110729]  [<c02dab22>] sfq_enqueue+0x22/0x230
[10610.110729]  [<c02d9c40>] htb_enqueue+0x0/0x1e0
[10610.110729]  [<c02d2efc>] __qdisc_run+0x19c/0x1d0
[10610.110729]  [<c02d9c40>] htb_enqueue+0x0/0x1e0
[10610.110729]  [<c02c7737>] dev_queue_xmit+0x267/0x380
[10610.110729]  [<c02e8ab0>] ip_forward_finish+0x0/0x40
[10610.110729]  [<c02eb65f>] ip_finish_output+0x11f/0x280
[10610.110729]  [<c02e8d7f>] ip_forward+0x28f/0x2d0
[10610.110729]  [<c02e8ad5>] ip_forward_finish+0x25/0x40
[10610.110729]  [<c02e7612>] ip_rcv_finish+0x122/0x360
[10610.110729]  [<c02bfa87>] __alloc_skb+0x57/0x120
[10610.110729]  [<c0109c8a>] nommu_map_single+0x2a/0x60
[10610.110729]  [<c02e7a90>] ip_rcv+0x0/0x290
[10610.110729]  [<c02c45cb>] netif_receive_skb+0x26b/0x470
[10610.110729]  [<f886c75d>] e1000_receive_skb+0x4d/0x1b0 [e1000e]
[10610.110729]  [<f886f9cc>] e1000_clean_rx_irq+0x23c/0x300 [e1000e]
[10610.110729]  [<f886bf69>] e1000_clean+0x49/0x1f0 [e1000e]
[10610.110729]  [<c02c69d8>] net_rx_action+0xf8/0x1b0
[10610.110729]  [<c012a922>] __do_softirq+0x82/0x100
[10610.110729]  [<c012a9d7>] do_softirq+0x37/0x40
[10610.110729]  [<c012ad27>] irq_exit+0x57/0x80
[10610.110729]  [<c0107120>] do_IRQ+0x40/0x80
[10610.110729]  [<c0114097>] smp_apic_timer_interrupt+0x57/0x90
[10610.110729]  [<c01055a3>] common_interrupt+0x23/0x28
[10610.110729]  [<c010a602>] mwait_idle+0x32/0x40
[10610.110729]  [<c010a5d0>] mwait_idle+0x0/0x40
[10610.110729]  [<c01036f3>] cpu_idle+0x53/0xc0
[10610.110729]  =======================
[10610.110729] Code: c4 0c c3 89 56 04 eb e3 8d 76 00 8d bc 27 00 00 00 00 55 89 d5 57 89 c7 56 53 90 8d b4 26 00 00 00 00 8b 1f 83 e3 fc 74 32 8b 03 <89> d9 a8 01 75 2a 89 c6 83 e6 fc 8b 56 08 39 d3 74 45 85 d2 74

Thanks!
Best regards, Badalian Vyacheslav



Patch

diff --git a/net/sched/sch_htb.c b/net/sched/sch_htb.c
index 30c999c..ff9e965 100644
--- a/net/sched/sch_htb.c
+++ b/net/sched/sch_htb.c
@@ -162,6 +162,7 @@  struct htb_sched {
 
 	int rate2quantum;	/* quant = rate / rate2quantum */
 	psched_time_t now;	/* cached dequeue time */
+	psched_time_t next_watchdog;
 	struct qdisc_watchdog watchdog;
 
 	/* non shaped skbs; let them go directly thru */
@@ -920,7 +921,11 @@  static struct sk_buff *htb_dequeue(struct Qdisc *sch)
 		}
 	}
 	sch->qstats.overlimits++;
-	qdisc_watchdog_schedule(&q->watchdog, next_event);
+	if (q->next_watchdog < q->now || next_event <=
+	     q->next_watchdog - PSCHED_TICKS_PER_SEC / HZ) {
+		qdisc_watchdog_schedule(&q->watchdog, next_event);
+		q->next_watchdog = next_event;
+	}
 fin:
 	return skb;
 }
@@ -973,6 +978,7 @@  static void htb_reset(struct Qdisc *sch)
 		}
 	}
 	qdisc_watchdog_cancel(&q->watchdog);
+	q->next_watchdog = 0;
 	__skb_queue_purge(&q->direct_queue);
 	sch->q.qlen = 0;
 	memset(q->row, 0, sizeof(q->row));
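
As a reading aid, the condition the patch adds to htb_dequeue() can be modelled in user space. The sketch below is an interpretation of the diff, not kernel code: `struct sketch_sched`, `should_rearm()`, `maybe_rearm()` and the `SKETCH_*` constants are hypothetical, and the tick rate and HZ values are assumptions chosen only to make the arithmetic concrete. As the diff reads, the watchdog is re-armed only if the previously recorded deadline has already passed, or if the new event is at least one jiffy (PSCHED_TICKS_PER_SEC / HZ ticks) earlier than that deadline, so a pending hrtimer is restarted less often.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for the kernel's psched_time_t (assumed unsigned 64-bit). */
typedef uint64_t psched_time_t;

/* Assumed values for illustration only; the patch uses
 * PSCHED_TICKS_PER_SEC / HZ from the kernel configuration. */
#define SKETCH_TICKS_PER_SEC   1000000ULL
#define SKETCH_HZ              1000ULL
#define SKETCH_TICKS_PER_JIFFY (SKETCH_TICKS_PER_SEC / SKETCH_HZ)

/* Only the two fields of struct htb_sched that the condition reads. */
struct sketch_sched {
	psched_time_t now;           /* cached dequeue time */
	psched_time_t next_watchdog; /* deadline of the last armed watchdog */
};

/* Mirrors the patched condition: true if the old deadline has passed,
 * or if next_event is at least one jiffy before it. The unsigned
 * subtraction is kept exactly as in the diff. */
static bool should_rearm(const struct sketch_sched *q, psched_time_t next_event)
{
	assert(q != NULL);
	if (q->next_watchdog < q->now)
		return true;
	return next_event <= q->next_watchdog - SKETCH_TICKS_PER_JIFFY;
}

/* Where the kernel would call qdisc_watchdog_schedule(), this model
 * only records the new deadline. Returns whether it re-armed. */
static bool maybe_rearm(struct sketch_sched *q, psched_time_t next_event)
{
	if (!should_rearm(q, next_event))
		return false;
	q->next_watchdog = next_event;
	return true;
}
```

With these assumed constants one jiffy is 1000 ticks. Starting from next_watchdog = 0 and now = 5000, an event at 10000 re-arms the timer; with now = 6000, a later request for 9500 is skipped because it is less than one jiffy earlier than the pending deadline, while a request for 9000 re-arms again. Resetting next_watchdog to 0 in htb_reset(), as the patch does, makes the first condition true on the next dequeue once q->now is non-zero.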