
Soft-Lockup/Race in networking in 2.6.31-rc1+195 (possibly caused by netem?)

Message ID 20090707065014.GA3296@ami.dom.local
State RFC, archived
Delegated to: David Miller
Headers show

Commit Message

Jarek Poplawski July 7, 2009, 6:50 a.m. UTC
On Mon, Jul 06, 2009 at 07:26:43PM +0200, Andres Freund wrote:
> On Monday 06 July 2009 19:23:18 Joao Correia wrote:
> > Hello
> >
> > Since i already had the kernel compiled and ready to boot when i read
> > this, i gave it a go anyway :-).
> >
> > I can reproduce the freeze with those 4 patches applied, so i can
> > confirm that its, at least, related to, or exposed by, those patches.
> > There must be something else too, or its just too much fuzziness, but
> > the freeze takes a bit more time (approximately five minutes, give or
> > take) compared to the instant freeze before, but its there with the
> > patches, and without them, no freeze.
> >
> > I assume there isnt a "safe" way to get them out of current .31-rc's,
> > right?
> `echo 0 > /proc/sys/kernel/timer_migration` should mitigate the problem.

I guess it should fix it entirely. Btw., here is a patch disabling the
timers' part, so as to make it hrtimers-only. Could you try?
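For readers hitting the same lockup, the mitigation quoted above can be applied at runtime without a rebuild (a sketch; the `timer_migration` sysctl only exists on kernels built with CONFIG_NO_HZ and CONFIG_SMP, which is exactly the code the patch below compiles out):

```shell
# Toggle timer migration off so timers stay on the CPU that armed them,
# restoring the pre-2.6.31 enqueueing behaviour.
SYSCTL=/proc/sys/kernel/timer_migration
if [ -w "$SYSCTL" ]; then
    echo 0 > "$SYSCTL"
    printf 'timer_migration is now %s\n' "$(cat "$SYSCTL")"
else
    echo "timer_migration sysctl not available on this kernel/config"
fi
```

The setting should persist only until reboot; a `kernel.timer_migration = 0` line in /etc/sysctl.conf would make it permanent.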

Thanks,
Jarek P.
---

--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Comments

Joao Correia July 7, 2009, 10:40 a.m. UTC | #1
I have now been running 2.6.31-rc2 for a couple of hours, with no freeze.

Let me know what I can help with, if anything, in tracking down the
original source of the problem.

Thank you very much for your time,
Joao Correia

On Tue, Jul 7, 2009 at 7:50 AM, Jarek Poplawski<jarkao2@gmail.com> wrote:
> On Mon, Jul 06, 2009 at 07:26:43PM +0200, Andres Freund wrote:
>> On Monday 06 July 2009 19:23:18 Joao Correia wrote:
>> > Hello
>> >
>> > Since i already had the kernel compiled and ready to boot when i read
>> > this, i gave it a go anyway :-).
>> >
>> > I can reproduce the freeze with those 4 patches applied, so i can
>> > confirm that its, at least, related to, or exposed by, those patches.
>> > There must be something else too, or its just too much fuzziness, but
>> > the freeze takes a bit more time (approximately five minutes, give or
>> > take) compared to the instant freeze before, but its there with the
>> > patches, and without them, no freeze.
>> >
>> > I assume there isnt a "safe" way to get them out of current .31-rc's,
>> > right?
>> `echo 0 > /proc/sys/kernel/timer_migration` should mitigate the problem.
>
> I guess it should fix it entirely. Btw., here is a patch disabling the
> timers' part, so to make it hrtimers only. Could you try?
>
> Thanks,
> Jarek P.
> ---
>
> diff --git a/kernel/timer.c b/kernel/timer.c
> index 0b36b9e..011429c 100644
> --- a/kernel/timer.c
> +++ b/kernel/timer.c
> @@ -634,7 +634,7 @@ __mod_timer(struct timer_list *timer, unsigned long expires,
>
>        cpu = smp_processor_id();
>
> -#if defined(CONFIG_NO_HZ) && defined(CONFIG_SMP)
> +#if 0
>        if (!pinned && get_sysctl_timer_migration() && idle_cpu(cpu)) {
>                int preferred_cpu = get_nohz_load_balancer();
>
>
Andres Freund July 7, 2009, 10:47 a.m. UTC | #2
On Tuesday 07 July 2009 12:40:16 Joao Correia wrote:
> I am now running 2.6.31-rc2 for a couple of hours, no freeze.
>
> Let me know what/if i can help with tracking down the original source
> of the problem.
You don't see the problem anymore with the `echo 0 > 
/proc/sys/kernel/timer_migration` change (or, equivalently, with the patch from 
Jarek), or has the problem vanished completely?

Andres
Joao Correia July 7, 2009, 11:05 a.m. UTC | #3
CC'ing the list and other listeners who were left out of the previous email.

Joao Correia


---------- Forwarded message ----------
From: Joao Correia <joaomiguelcorreia@gmail.com>
Date: Tue, Jul 7, 2009 at 12:03 PM
Subject: Re: Soft-Lockup/Race in networking in 2.6.31-rc1+195 (possibly caused by netem?)
To: Andres Freund <andres@anarazel.de>


I don't see the problem with the patch from Jarek (sorry for not being
clear about this).

With vanilla 31-rc2 it is still there.

Joao Correia

On Tue, Jul 7, 2009 at 11:47 AM, Andres Freund<andres@anarazel.de> wrote:
> On Tuesday 07 July 2009 12:40:16 Joao Correia wrote:
>> I am now running 2.6.31-rc2 for a couple of hours, no freeze.
>>
>> Let me know what/if i can help with tracking down the original source
>> of the problem.
> You dont see the problem anymore with the `echo 0 >
> /proc/sys/kernel/timer_migration`  change (or equivalently with the patch from
> Jarek) or has the problem vanished completely?
>
> Andres
>
Jarek Poplawski July 7, 2009, 1:18 p.m. UTC | #4
On Tue, Jul 07, 2009 at 11:40:16AM +0100, Joao Correia wrote:
> I am now running 2.6.31-rc2 for a couple of hours, no freeze.
> 
> Let me know what/if i can help with tracking down the original source
> of the problem.

OK, so we know it's only about the timers. Here is another tiny patch
(the previous one should be removed), which could tell us (with an oops)
if something goes wrong while migrating. Anyway, the bug should be back :-(

Thanks,
Jarek P.
Andres Freund July 7, 2009, 1:22 p.m. UTC | #5
On Tuesday 07 July 2009 15:18:03 Jarek Poplawski wrote:
> On Tue, Jul 07, 2009 at 11:40:16AM +0100, Joao Correia wrote:
> > I am now running 2.6.31-rc2 for a couple of hours, no freeze.
> > Let me know what/if i can help with tracking down the original source
> > of the problem.
> OK, so we know it's only about timers. Here is another tiny patch
> (the previous one should be removed), which could tell (with oops) if
> there's something while migrating. Anyway, the bug should be back :-(
How do we know this? It could still be a race uncovered by timer migration, 
right?

Andres

PS: You forgot the patch ;-)
Jarek Poplawski July 7, 2009, 1:29 p.m. UTC | #6
On Tue, Jul 07, 2009 at 03:22:06PM +0200, Andres Freund wrote:
> On Tuesday 07 July 2009 15:18:03 Jarek Poplawski wrote:
> > On Tue, Jul 07, 2009 at 11:40:16AM +0100, Joao Correia wrote:
> > > I am now running 2.6.31-rc2 for a couple of hours, no freeze.
> > > Let me know what/if i can help with tracking down the original source
> > > of the problem.
> > OK, so we know it's only about timers. Here is another tiny patch
> > (the previous one should be removed), which could tell (with oops) if
> > there's something while migrating. Anyway, the bug should be back :-(
> How do we know this? It still could be a race uncovered by timer migration, 
> right?

Right. But (rather) not by or in hrtimers.

> PS: You forgot the patch ;-)

Yes, I hope you got it already ;-)

Thanks,
Jarek P.
Andres Freund July 7, 2009, 1:34 p.m. UTC | #7
On Tuesday 07 July 2009 15:29:37 Jarek Poplawski wrote:
> On Tue, Jul 07, 2009 at 03:22:06PM +0200, Andres Freund wrote:
> > On Tuesday 07 July 2009 15:18:03 Jarek Poplawski wrote:
> > > On Tue, Jul 07, 2009 at 11:40:16AM +0100, Joao Correia wrote:
> > > > I am now running 2.6.31-rc2 for a couple of hours, no freeze.
> > > > Let me know what/if i can help with tracking down the original source
> > > > of the problem.
> > >
> > > OK, so we know it's only about timers. Here is another tiny patch
> > > (the previous one should be removed), which could tell (with oops) if
> > > there's something while migrating. Anyway, the bug should be back :-(
> > PS: You forgot the patch ;-)
> Yes, I hope you got it already ;-)
Yes. Can't reboot that machine right now, will test later.

Testing whether it's triggerable inside a VM might be interesting...

Andres
Jarek Poplawski July 7, 2009, 1:57 p.m. UTC | #8
On Tue, Jul 07, 2009 at 03:34:07PM +0200, Andres Freund wrote:
...
> Testing wether its triggerable inside a vm might be interesting...

Probably similarly to testing without this patch, or even less. Maybe
I should've warned you, but this type of bug in -rc kernels, with possible
memory or stack overwrites, might be fatal for your data (at least).

Jarek P.
Andres Freund July 7, 2009, 4:11 p.m. UTC | #9
On Tuesday 07 July 2009 15:57:42 Jarek Poplawski wrote:
> On Tue, Jul 07, 2009 at 03:34:07PM +0200, Andres Freund wrote:
> ...
> > Testing wether its triggerable inside a vm might be interesting...
> Probably similarly to testing without this patch or even less. Maybe
> I should've warned you but this type of bugs in -rc with possible
> memory or stack overwrites might be fatal for your data (at least).
Fortunately all the data on that machine should either be replaceable or 
regularly backed up.

Will test later today whether that patch triggers the BUG.

Andres
Jarek Poplawski July 8, 2009, 8:08 a.m. UTC | #10
On Tue, Jul 07, 2009 at 06:11:27PM +0200, Andres Freund wrote:
> On Tuesday 07 July 2009 15:57:42 Jarek Poplawski wrote:
> > On Tue, Jul 07, 2009 at 03:34:07PM +0200, Andres Freund wrote:
> > ...
> > > Testing wether its triggerable inside a vm might be interesting...
> > Probably similarly to testing without this patch or even less. Maybe
> > I should've warned you but this type of bugs in -rc with possible
> > memory or stack overwrites might be fatal for your data (at least).
> Fortunately all the data on that machine should either be replaceable or 
> regularly backuped.
> 
> Will test later today if that patch bugs.

If you haven't started yet, it would be nice to use this, btw:

CONFIG_HOTPLUG_CPU = N
CONFIG_DEBUG_OBJECTS = Y
CONFIG_DEBUG_OBJECTS_TIMERS = Y

Thanks,
Jarek P.
Andres Freund July 8, 2009, 8:29 a.m. UTC | #11
On Wednesday 08 July 2009 10:08:52 Jarek Poplawski wrote:
> On Tue, Jul 07, 2009 at 06:11:27PM +0200, Andres Freund wrote:
> > On Tuesday 07 July 2009 15:57:42 Jarek Poplawski wrote:
> > > On Tue, Jul 07, 2009 at 03:34:07PM +0200, Andres Freund wrote:
> > > ...
> > >
> > > > Testing wether its triggerable inside a vm might be interesting...
> > >
> > > Probably similarly to testing without this patch or even less. Maybe
> > > I should've warned you but this type of bugs in -rc with possible
> > > memory or stack overwrites might be fatal for your data (at least).
> >
> > Fortunately all the data on that machine should either be replaceable or
> > regularly backuped.
> >
> > Will test later today if that patch bugs.
>
> If you didn't start yet, it would be nice to use this, btw:
>
> CONFIG_HOTPLUG_CPU = N
> CONFIG_DEBUG_OBJECTS = Y
> CONFIG_DEBUG_OBJECTS_TIMERS = Y
So I should test with a single cpu? Or is there a config where HOTPLUG_CPU does 
not imply !SMP?

Andres
Jarek Poplawski July 8, 2009, 9:13 a.m. UTC | #12
On Wed, Jul 08, 2009 at 10:29:34AM +0200, Andres Freund wrote:
> On Wednesday 08 July 2009 10:08:52 Jarek Poplawski wrote:
> > On Tue, Jul 07, 2009 at 06:11:27PM +0200, Andres Freund wrote:
> > > On Tuesday 07 July 2009 15:57:42 Jarek Poplawski wrote:
> > > > On Tue, Jul 07, 2009 at 03:34:07PM +0200, Andres Freund wrote:
> > > > ...
> > > >
> > > > > Testing wether its triggerable inside a vm might be interesting...
> > > >
> > > > Probably similarly to testing without this patch or even less. Maybe
> > > > I should've warned you but this type of bugs in -rc with possible
> > > > memory or stack overwrites might be fatal for your data (at least).
> > >
> > > Fortunately all the data on that machine should either be replaceable or
> > > regularly backuped.
> > >
> > > Will test later today if that patch bugs.
> >
> > If you didn't start yet, it would be nice to use this, btw:
> >
> > CONFIG_HOTPLUG_CPU = N
> > CONFIG_DEBUG_OBJECTS = Y
> > CONFIG_DEBUG_OBJECTS_TIMERS = Y
> So I should test with a single cpu? Or is there a config where HOTPLUG_CPU does 
> not imply !SMP?

No, my single cpu should be enough ;-) There is something wrong here, I guess.
I can see in my menuconfig:

SMP [=y]
...
HOTPLUG [=n]
...
HOTPLUG_CPU [=y]
...
Depends on SMP && HOTPLUG

So, let it be HOTPLUG_CPU = Y for now...

Jarek P.
Andres Freund July 8, 2009, 10:23 p.m. UTC | #13
On Wednesday 08 July 2009 10:08:52 Jarek Poplawski wrote:
> On Tue, Jul 07, 2009 at 06:11:27PM +0200, Andres Freund wrote:
> > On Tuesday 07 July 2009 15:57:42 Jarek Poplawski wrote:
> > > On Tue, Jul 07, 2009 at 03:34:07PM +0200, Andres Freund wrote:
> > > ...
> > >
> > > > Testing wether its triggerable inside a vm might be interesting...
> > >
> > > Probably similarly to testing without this patch or even less. Maybe
> > > I should've warned you but this type of bugs in -rc with possible
> > > memory or stack overwrites might be fatal for your data (at least).
> >
> > Fortunately all the data on that machine should either be replaceable or
> > regularly backuped.
> >
> > Will test later today if that patch bugs.
>
> If you didn't start yet, it would be nice to use this, btw:
> CONFIG_HOTPLUG_CPU = N
> CONFIG_DEBUG_OBJECTS = Y
> CONFIG_DEBUG_OBJECTS_TIMERS = Y
Unfortunately this just yields the same backtraces during the soft lockup,
and nothing earlier.
I did not test without lockdep yet, but that should not have stopped the BUG 
from appearing, right?


Andres
[  207.233011] BUG: soft lockup - CPU#0 stuck for 61s! [openvpn:4232]
[  207.233011] Modules linked in: sch_netem sbs sbshc snd_hda_codec_conexant pcmcia snd_hda_intel snd_hda_codec iwlagn thinkpad_acpi yenta_socket rsrc_nonstatic pcmcia_core btusb snd_hwdep ehci_hcd uhci_hcd
[  207.233011] irq event stamp: 158057
[  207.233011] hardirqs last  enabled at (158056): [<ffffffff81036a10>] restore_args+0x0/0x30
[  207.233011] hardirqs last disabled at (158057): [<ffffffff81035d3a>] save_args+0x6a/0x70
[  207.233011] softirqs last  enabled at (27750): [<ffffffff8155837d>] lock_sock_nested+0x8d/0x130
[  207.233011] softirqs last disabled at (27756): [<ffffffff81568278>] dev_queue_xmit+0x58/0x4b0
[  207.233011] CPU 0:
[  207.233011] Modules linked in: sch_netem sbs sbshc snd_hda_codec_conexant pcmcia snd_hda_intel snd_hda_codec iwlagn thinkpad_acpi yenta_socket rsrc_nonstatic pcmcia_core btusb snd_hwdep ehci_hcd uhci_hcd
[  207.233011] Pid: 4232, comm: openvpn Not tainted 2.6.31-rc2-andres-00151-gf3060b0-dirty #83 208252G
[  207.233011] RIP: 0010:[<ffffffff812a9eb1>]  [<ffffffff812a9eb1>] delay_tsc+0x51/0x80
[  207.233011] RSP: 0018:ffff88012984f938  EFLAGS: 00000202
[  207.233011] RAX: 000000007086c4e9 RBX: ffff88012984f958 RCX: 000000007086c4e9
[  207.233011] RDX: 000000007086c4e9 RSI: 0000000000006238 RDI: 0000000000000001
[  207.233011] RBP: ffffffff81036b6e R08: ffffffff82189460 R09: 0000000000000002
[  207.233011] R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000003fda
[  207.233011] R13: ffff88002ee00000 R14: ffff88012984e000 R15: 0000000000000000
[  207.233011] FS:  00007f518d51a6f0(0000) GS:ffff88002ee00000(0000) knlGS:0000000000000000
[  207.233011] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  207.233011] CR2: 00007f46fb78600c CR3: 000000012bc8f000 CR4: 00000000000026f0
[  207.233011] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[  207.233011] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[  207.233011] Call Trace:
[  207.233011]  [<ffffffff812a9eaa>] ? delay_tsc+0x4a/0x80
[  207.233011]  [<ffffffff812a9d9a>] ? __delay+0xa/0x10
[  207.233011]  [<ffffffff812ae578>] ? _raw_spin_lock+0xd8/0x150
[  207.233011]  [<ffffffff816f0431>] ? _spin_lock+0x51/0x70
[  207.233011]  [<ffffffff81568306>] ? dev_queue_xmit+0xe6/0x4b0
[  207.233011]  [<ffffffff81568306>] ? dev_queue_xmit+0xe6/0x4b0
[  207.233011]  [<ffffffff81568273>] ? dev_queue_xmit+0x53/0x4b0
[  207.233011]  [<ffffffff8159a67c>] ? ip_finish_output+0x13c/0x320
[  207.233011]  [<ffffffff8159a8db>] ? ip_output+0x7b/0xd0
[  207.233011]  [<ffffffff81598b98>] ? ip_generic_getfrag+0x88/0xa0
[  207.233011]  [<ffffffff815996c0>] ? ip_local_out+0x20/0x30
[  207.233011]  [<ffffffff81599957>] ? ip_push_pending_frames+0x287/0x410
[  207.233011]  [<ffffffff815bae18>] ? udp_push_pending_frames+0x168/0x3d0
[  207.233011]  [<ffffffff815bcd07>] ? udp_sendmsg+0x457/0x760
[  207.233011]  [<ffffffff815c4144>] ? inet_sendmsg+0x24/0x60
[  207.233011]  [<ffffffff81555556>] ? sock_sendmsg+0x126/0x140
[  207.233011]  [<ffffffff81097f60>] ? autoremove_wake_function+0x0/0x40
[  207.233011]  [<ffffffff810ab6e7>] ? mark_held_locks+0x67/0x90
[  207.233011]  [<ffffffff816f01fb>] ? _spin_unlock_irqrestore+0x3b/0x70
[  207.233011]  [<ffffffff810ab9fd>] ? trace_hardirqs_on_caller+0x14d/0x190
[  207.233011]  [<ffffffff81556490>] ? sys_sendto+0xf0/0x130
[  207.233011]  [<ffffffff810aba4d>] ? trace_hardirqs_on+0xd/0x10
[  207.233011]  [<ffffffff810a21f7>] ? getnstimeofday+0x57/0xe0
[  207.233011]  [<ffffffff8109c1f1>] ? ktime_get_ts+0x51/0x70
[  207.233011]  [<ffffffff81035ec2>] ? system_call_fastpath+0x16/0x1b
Jarek Poplawski July 8, 2009, 10:48 p.m. UTC | #14
On Thu, Jul 09, 2009 at 12:23:17AM +0200, Andres Freund wrote:
...
> Unfortunately this just yields the same backtraces during softlockup and not 
> earlier.
> I did not test without lockdep yet, but that should not have stopped the BUG 
> from appearing, right?

Since it looks like hrtimers now, these changes in timers shouldn't
matter. Let's wait for new ideas.

Thanks for testing anyway,
Jarek P.
Thomas Gleixner July 9, 2009, 10:31 a.m. UTC | #15
On Thu, 9 Jul 2009, Jarek Poplawski wrote:
> On Thu, Jul 09, 2009 at 12:23:17AM +0200, Andres Freund wrote:
> ...
> > Unfortunately this just yields the same backtraces during softlockup and not 
> > earlier.
> > I did not test without lockdep yet, but that should not have stopped the BUG 
> > from appearing, right?
> 
> Since it looks like hrtimers now, these changes in timers shouldn't
> matter. Let's wait for new ideas.

Some background:

Up to 2.6.30, hrtimer_start() and add_timer() enqueue (hr)timers on the
CPU on which the functions are called. There is one exception: when the
timer callback is currently running on another CPU, the timer is enqueued
on that other CPU.

The migration patches change that behaviour and enqueue the timer on
the nohz.idle_balancer CPU when parts of the system are idle.

With the migration code disabled (via sysctl or the #if 0 patch) the
timer is always enqueued on the same CPU, i.e. you get the 2.6.30
behaviour back.

As you found out, it is probably related to hrtimers. Checking the
network code, the only hrtimer users are in net/sched/sch_api.c and
net/sched/sch_cbq.c. There is some in net/can as well, but that's
probably irrelevant for the problem at hand.

I'm not familiar with that code, so I have no clue which problems
might pop up due to enqueueing the timer on another CPU, but there is
one pretty suspicious code sequence in cbq_ovl_delay():

	expires = ktime_set(0, 0);
	expires = ktime_add_ns(expires, PSCHED_US2NS(sched));
	if (hrtimer_try_to_cancel(&q->delay_timer) &&
	    ktime_to_ns(ktime_sub(
			hrtimer_get_expires(&q->delay_timer),
			expires)) > 0)
		hrtimer_set_expires(&q->delay_timer, expires);
	hrtimer_restart(&q->delay_timer);

So we set the expiry value of the timer only when the timer was active
(hrtimer_try_to_cancel() returned != 0) and the new expiry time is
before the expiry time that was in the active timer. If the timer was
inactive, we restart the timer with its last expiry time, which is
probably already in the past.

I'm quite sure that this is not causing the migration problem, because
we do not enqueue it on a different CPU when the timer is already
expired.

For completeness: hrtimer_try_to_cancel() can return -1 when the timer
callback is running. So in that case we also fiddle with the expiry
value and restart the timer while the callback code itself might do
the same. There is no serialization between that code and the callback,
it seems. The watchdog timer callback in sch_api.c is not serialized
either.

There is another oddity in cbq_undelay() which is the hrtimer callback
function:

	if (delay) {
		ktime_t time;

		time = ktime_set(0, 0);
		time = ktime_add_ns(time, PSCHED_TICKS2NS(now + delay));
		hrtimer_start(&q->delay_timer, time, HRTIMER_MODE_ABS);

The canonical way to restart a hrtimer from the callback function is to
set the expiry value and return HRTIMER_RESTART.

	}

	sch->flags &= ~TCQ_F_THROTTLED;
	__netif_schedule(qdisc_root(sch));
	return HRTIMER_NORESTART;

Again, this should not cause the timer to be enqueued on another CPU
as we do not enqueue on a different CPU when the callback is running,
but see above ...

I have the feeling that the code relies on some implicit CPU
boundness, which is no longer guaranteed with the timer migration
changes, but that's a question for the network experts.

Thanks,

	tglx
Jarek Poplawski July 9, 2009, 10:44 a.m. UTC | #16
On Thu, Jul 09, 2009 at 12:31:53PM +0200, Thomas Gleixner wrote:
> On Thu, 9 Jul 2009, Jarek Poplawski wrote:
> > On Thu, Jul 09, 2009 at 12:23:17AM +0200, Andres Freund wrote:
> > ...
> > > Unfortunately this just yields the same backtraces during softlockup and not 
> > > earlier.
> > > I did not test without lockdep yet, but that should not have stopped the BUG 
> > > from appearing, right?
> > 
> > Since it looks like hrtimers now, these changes in timers shouldn't
> > matter. Let's wait for new ideas.
> 
> Some background:
...
> There is another oddity in cbq_undelay() which is the hrtimer callback
> function:
> 
> 	if (delay) {
> 		ktime_t time;
> 
> 		time = ktime_set(0, 0);
> 		time = ktime_add_ns(time, PSCHED_TICKS2NS(now + delay));
> 		hrtimer_start(&q->delay_timer, time, HRTIMER_MODE_ABS);
> 
> The canocial way to restart a hrtimer from the callback function is to
> set the expiry value and return HRTIMER_RESTART.

OK, that's for later, because we weren't using cbq here.

> 
> 	}
> 
> 	sch->flags &= ~TCQ_F_THROTTLED;
> 	__netif_schedule(qdisc_root(sch));
> 	return HRTIMER_NORESTART;
> 
> Again, this should not cause the timer to be enqueued on another CPU
> as we do not enqueue on a different CPU when the callback is running,
> but see above ...
> 
> I have the feeling that the code relies on some implicit cpu
> boundness, which is not longer guaranteed with the timer migration
> changes, but that's a question for the network experts.

As a matter of fact, I've just looked at this __netif_schedule(),
which really is cpu bound, so you might be 100% right.

Thanks for your help,
Jarek P.

Patch

diff --git a/kernel/timer.c b/kernel/timer.c
index 0b36b9e..011429c 100644
--- a/kernel/timer.c
+++ b/kernel/timer.c
@@ -634,7 +634,7 @@  __mod_timer(struct timer_list *timer, unsigned long expires,
 
 	cpu = smp_processor_id();
 
-#if defined(CONFIG_NO_HZ) && defined(CONFIG_SMP)
+#if 0
 	if (!pinned && get_sysctl_timer_migration() && idle_cpu(cpu)) {
 		int preferred_cpu = get_nohz_load_balancer();