[crash] kernel BUG at net/core/pktgen.c:3503!

Message ID 20090915185112.GA17587@lenovo
State RFC, archived
Delegated to: David Miller

Commit Message

Cyrill Gorcunov Sept. 15, 2009, 6:51 p.m. UTC
[Ingo Molnar - Tue, Sep 15, 2009 at 08:36:47PM +0200]
| 
| not sure which merge caused this, but i got this boot crash with latest 
| -git:
| 
| calling  flow_cache_init+0x0/0x1b9 @ 1
| initcall flow_cache_init+0x0/0x1b9 returned 0 after 64 usecs
| calling  pg_init+0x0/0x37c @ 1
| pktgen 2.72: Packet Generator for packet performance testing.
| ------------[ cut here ]------------
| kernel BUG at net/core/pktgen.c:3503!
| invalid opcode: 0000 [#1] PREEMPT SMP DEBUG_PAGEALLOC
| last sysfs file: 
| 

Hi Ingo,

just curious, will the following patch fix the problem?
I've been fixing a problem with similar symptoms on a
system with a custom virtual cpu implementation, so
it may not help in mainline, but anyway :)

	-- Cyrill
---
 net/core/pktgen.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Comments

David Miller Sept. 17, 2009, 5:29 p.m. UTC | #1
From: Cyrill Gorcunov <gorcunov@gmail.com>
Date: Tue, 15 Sep 2009 22:51:12 +0400

> [Ingo Molnar - Tue, Sep 15, 2009 at 08:36:47PM +0200]
> | 
> | not sure which merge caused this, but i got this boot crash with latest 
> | -git:
> | 
> | calling  flow_cache_init+0x0/0x1b9 @ 1
> | initcall flow_cache_init+0x0/0x1b9 returned 0 after 64 usecs
> | calling  pg_init+0x0/0x37c @ 1
> | pktgen 2.72: Packet Generator for packet performance testing.
> | ------------[ cut here ]------------
> | kernel BUG at net/core/pktgen.c:3503!
> | invalid opcode: 0000 [#1] PREEMPT SMP DEBUG_PAGEALLOC
> | last sysfs file: 
> | 
> 
> Hi Ingo,
> 
> just curious, will the following patch fix the problem?
> I've been fixing a problem with similar symptoms on a
> system with a custom virtual cpu implementation, so
> it may not help in mainline, but anyway :)

Ingo, does Cyrill's patch help?
Ingo Molnar Sept. 17, 2009, 5:44 p.m. UTC | #2
* David Miller <davem@davemloft.net> wrote:

> From: Cyrill Gorcunov <gorcunov@gmail.com>
> Date: Tue, 15 Sep 2009 22:51:12 +0400
> 
> > [Ingo Molnar - Tue, Sep 15, 2009 at 08:36:47PM +0200]
> > | 
> > | not sure which merge caused this, but i got this boot crash with latest 
> > | -git:
> > | 
> > | calling  flow_cache_init+0x0/0x1b9 @ 1
> > | initcall flow_cache_init+0x0/0x1b9 returned 0 after 64 usecs
> > | calling  pg_init+0x0/0x37c @ 1
> > | pktgen 2.72: Packet Generator for packet performance testing.
> > | ------------[ cut here ]------------
> > | kernel BUG at net/core/pktgen.c:3503!
> > | invalid opcode: 0000 [#1] PREEMPT SMP DEBUG_PAGEALLOC
> > | last sysfs file: 
> > | 
> > 
> > Hi Ingo,
> > 
> > just curious, will the following patch fix the problem?
> > I've been fixing a problem with similar symptoms on a
> > system with a custom virtual cpu implementation, so
> > it may not help in mainline, but anyway :)
> 
> Ingo, does Cyrill's patch help?

For now i've turned pktgen off in my tests. Will check it again once 
things have calmed down somewhat.

Also, i just tried to reproduce the pktgen crash with latest -git and 
the config i sent - no luck, so i cannot test Cyrill's patch either.

Btw., we are seeing some other preempt count and task related 
weirdnesses as well in other code, maybe it's related. No good pattern 
yet to act upon.

Anyway - please disregard this bugreport until i've investigated it 
closer.

	Ingo
David Miller Sept. 17, 2009, 5:49 p.m. UTC | #3
From: Ingo Molnar <mingo@elte.hu>
Date: Thu, 17 Sep 2009 19:44:48 +0200

> Anyway - please disregard this bugreport until i've investigated it 
> closer.

Ok, thanks for the status update.
Cyrill Gorcunov Sept. 17, 2009, 5:51 p.m. UTC | #4
[Ingo Molnar - Thu, Sep 17, 2009 at 07:44:48PM +0200]
...
| 
| > 
| > Ingo, does Cyrill's patch help?
| 
| For now i've turned pktgen off in my tests. Will check it again once 
| things have calmed down somewhat.
| 
| Also, i just tried to reproduce the pktgen crash with latest -git and 
| the config i sent - no luck, so i cannot test Cyrill's patch either.
| 
| Btw., we are seeing some other preempt count and task related 
| weirdnesses as well in other code, maybe it's related. No good pattern 
| yet to act upon.
| 
| Anyway - please disregard this bugreport until i've investigated it 
| closer.
| 
| 	Ingo
| 

I'm unable to reproduce this issue either. I tried
many ways (under kvm) -- no bug triggered. Though on
the system for which I wrote this patch in the first place,
the bug was hitting all the time (but it contains
custom vcpu management code, which is not our case here).

	-- Cyrill
Ingo Molnar Sept. 21, 2009, 4:45 p.m. UTC | #5
* Ingo Molnar <mingo@elte.hu> wrote:

> 
> * David Miller <davem@davemloft.net> wrote:
> 
> > From: Cyrill Gorcunov <gorcunov@gmail.com>
> > Date: Tue, 15 Sep 2009 22:51:12 +0400
> > 
> > > [Ingo Molnar - Tue, Sep 15, 2009 at 08:36:47PM +0200]
> > > | 
> > > | not sure which merge caused this, but i got this boot crash with latest 
> > > | -git:
> > > | 
> > > | calling  flow_cache_init+0x0/0x1b9 @ 1
> > > | initcall flow_cache_init+0x0/0x1b9 returned 0 after 64 usecs
> > > | calling  pg_init+0x0/0x37c @ 1
> > > | pktgen 2.72: Packet Generator for packet performance testing.
> > > | ------------[ cut here ]------------
> > > | kernel BUG at net/core/pktgen.c:3503!
> > > | invalid opcode: 0000 [#1] PREEMPT SMP DEBUG_PAGEALLOC
> > > | last sysfs file: 
> > > | 
> > > 
> > > Hi Ingo,
> > > 
> > > just curious, will the following patch fix the problem?
> > > I've been fixing a problem with similar symptoms on a
> > > system with a custom virtual cpu implementation, so
> > > it may not help in mainline, but anyway :)
> > 
> > Ingo, does Cyrill's patch help?
> 
> For now i've turned pktgen off in my tests. Will check it again once 
> things have calmed down somewhat.
> 
> Also, i just tried to reproduce the pktgen crash with latest -git and 
> the config i sent - no luck, so i cannot test Cyrill's patch either.
> 
> Btw., we are seeing some other preempt count and task related 
> weirdnesses as well in other code, maybe it's related. No good pattern 
> yet to act upon.
> 
> Anyway - please disregard this bugreport until i've investigated it 
> closer.

Update: i've further investigated it and this bug was caused by a 
scheduler bug introduced in this merge window, which got fixed in:

  3f04e8c: sched: Re-add lost cpu_allowed check to sched_fair.c::select_task_rq_fair()

This bug caused CPU affinities to not work in essence - breaking kthread 
per-cpu assumptions in net/core/pktgen.c.
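
For illustration only, here is a userspace sketch of that per-cpu
assumption. It is not kernel code and not taken from pktgen: struct
worker, worker_fn(), first_allowed_cpu() and run_pinned_worker() are
hypothetical names, and pthread affinity merely stands in for a kthread
bound to one cpu. The creator pins the worker before it starts, and the
worker records the cpu it actually runs on. With working affinities the
two values match; a scheduler that ignores the allowed-cpu mask would
let them differ, which is the condition the BUG_ON() in pktgen caught.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <pthread.h>
#include <sched.h>

/* Hypothetical per-worker record: one worker is meant to live on one cpu. */
struct worker {
	int cpu;      /* cpu the creator bound this worker to */
	int seen_cpu; /* cpu the worker observed itself running on */
};

/* Worker body: record where we really run (the check pktgen does with BUG_ON). */
static void *worker_fn(void *arg)
{
	struct worker *w = arg;

	w->seen_cpu = sched_getcpu();
	return NULL;
}

/* Return the first cpu in the caller's affinity mask, or -1 on failure. */
static int first_allowed_cpu(void)
{
	cpu_set_t set;
	int cpu;

	CPU_ZERO(&set);
	if (sched_getaffinity(0, sizeof(set), &set) != 0)
		return -1;
	for (cpu = 0; cpu < CPU_SETSIZE; cpu++)
		if (CPU_ISSET(cpu, &set))
			return cpu;
	return -1;
}

/*
 * Create a worker whose affinity mask contains only `cpu`, wait for it to
 * finish, and leave the observed cpu in w->seen_cpu.
 * Returns 0 on success or a pthread error number.
 */
static int run_pinned_worker(struct worker *w, int cpu)
{
	pthread_attr_t attr;
	pthread_t tid;
	cpu_set_t set;
	int err;

	assert(cpu >= 0 && cpu < CPU_SETSIZE);
	w->cpu = cpu;
	w->seen_cpu = -1;

	CPU_ZERO(&set);
	CPU_SET(cpu, &set);

	err = pthread_attr_init(&attr);
	if (err)
		return err;
	/* Bind before the thread ever runs, so it never starts elsewhere. */
	err = pthread_attr_setaffinity_np(&attr, sizeof(set), &set);
	if (!err)
		err = pthread_create(&tid, &attr, worker_fn, w);
	pthread_attr_destroy(&attr);
	if (err)
		return err;
	return pthread_join(tid, NULL);
}
```

A caller would pick a cpu with first_allowed_cpu(), pass it to
run_pinned_worker(), and then compare w.seen_cpu against w.cpu.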

I've confirmed this by re-enabling pktgen in my tests and the crash has 
not reappeared.

Thanks,

	Ingo

Patch

Index: linux-2.6.git/net/core/pktgen.c
=====================================================================
--- linux-2.6.git.orig/net/core/pktgen.c
+++ linux-2.6.git/net/core/pktgen.c
@@ -3511,7 +3511,7 @@  static int pktgen_thread_worker(void *ar
 	struct pktgen_dev *pkt_dev = NULL;
 	int cpu = t->cpu;
 
-	BUG_ON(smp_processor_id() != cpu);
+	BUG_ON(task_cpu(current) != cpu);
 
 	init_waitqueue_head(&t->queue);
 	complete(&t->start_done);