NIU driver: Sun x8 Express Quad Gigabit Ethernet Adapter

Message ID 20081112.041143.11487260.davem@davemloft.net
State Accepted, archived
Delegated to: David Miller

Commit Message

David Miller Nov. 12, 2008, 12:11 p.m. UTC
From: David Miller <davem@davemloft.net>
Date: Wed, 12 Nov 2008 03:52:40 -0800 (PST)

> Ok, Jesper, please try two things for me, leave the debugging patch
> in there for all the tests:
> 
> 1) Retrigger the problem (with or without MSI, doesn't matter) but
>    add back in that test I asked you to try last week.  The one
>    where the "if (++rp->mark_counter == rp->mark_freq)" condition
>    test in niu_start_xmit() is commented out, so that the
>    "mrk |= TX_DESC_MARK;" statement always runs.
> 
>    Get me the log dump produced by that scenario.
> 
> 2) Next, simply comment out the:
> 
> 	if (unlikely(!(cs & (TX_CS_MK | TX_CS_MMK))))
> 		goto out;
> 
>    lines in niu_tx_work().
> 
> Let's see what new info we can get out of this.
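
For readers following along, test 1 above concerns the periodic marking of
TX descriptors. A minimal sketch of that pattern follows, assuming the
counter is reset each time a mark is issued. The struct, the function name
and SKETCH_DESC_MARK are hypothetical stand-ins, not the driver's own
definitions.

```c
#include <stdint.h>

#define SKETCH_DESC_MARK 0x1u	/* hypothetical flag value, not the driver's */

struct sketch_ring {
	unsigned int mark_counter;	/* descriptors queued since last mark */
	unsigned int mark_freq;		/* mark every mark_freq-th descriptor */
};

/* Return the mark flag for the next descriptor, or 0 if it is unmarked. */
static uint32_t sketch_next_mark(struct sketch_ring *rp)
{
	uint32_t mrk = 0;

	if (++rp->mark_counter == rp->mark_freq) {
		rp->mark_counter = 0;	/* assumption: counter restarts here */
		mrk |= SKETCH_DESC_MARK;
	}
	return mrk;
}
```

Commenting out the condition, as requested in test 1, amounts to setting
the mark flag on every descriptor.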

These tests are still useful for me, so please perform them,
but I think I've found the bug.

I am guessing you're running a 32-bit x86 kernel.

In such a case the driver has to define a local readq()
and writeq() implementation.

What I provide for NIU right now reads the upper 32-bits
then the lower 32-bits of the register.

Guess what that does?  The packet counters live in the upper
32-bits and the MARK bits live in the lower 32-bits of the
TX_CS register.

So it first reads the packet counters, and as a side effect that
clears the MARK bits in the TX_CS register.  So when we read the lower
32-bits the MARK bits are always seen as zero.

BzzaaarT!

So the following patch should fix this bug.  writeq() should
be OK as-is, so doesn't need a similar change.
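
The effect described above can be reproduced in a small userspace model.
This is only a sketch under stated assumptions: sim_reg, sim_readl() and
the SIM_MK bit position are hypothetical, and the model simply assumes
that reading the upper half clears the mark bit in the lower half.

```c
#include <stdint.h>

#define SIM_MK 0x00008000u	/* illustrative mark bit, position assumed */

/* Simulated 64-bit register exposed as two 32-bit halves. */
struct sim_reg {
	uint32_t lo;	/* holds the mark bits */
	uint32_t hi;	/* holds the packet counters */
};

/* Stand-in for readl(): offset 0 is the lower half, offset 4 the upper. */
static uint32_t sim_readl(struct sim_reg *r, unsigned int off)
{
	if (off == 4) {
		uint32_t v = r->hi;

		r->lo &= ~SIM_MK;	/* modelled read-to-clear side effect */
		return v;
	}
	return r->lo;
}

/* Old order: upper half first, so the mark bit is gone before it is read. */
static uint64_t sim_readq_hi_first(struct sim_reg *r)
{
	uint64_t hi = sim_readl(r, 4);
	uint64_t lo = sim_readl(r, 0);

	return (hi << 32) | lo;
}

/* Fixed order: lower half first, with each read in its own statement. */
static uint64_t sim_readq_lo_first(struct sim_reg *r)
{
	uint64_t lo = sim_readl(r, 0);
	uint64_t hi = sim_readl(r, 4);

	return lo | (hi << 32);
}
```

With the mark bit set beforehand, the upper-first variant returns a value
whose lower half is zero, while the lower-first variant still reports the
mark.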

--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Comments

Jesper Dangaard Brouer Nov. 12, 2008, 12:49 p.m. UTC | #1
On Wed, 2008-11-12 at 04:11 -0800, David Miller wrote:
> From: David Miller <davem@davemloft.net>
> Date: Wed, 12 Nov 2008 03:52:40 -0800 (PST)
>
> These tests are still useful for me, so please perform them,

As thanks for your work, and for being allowed to operate your espresso
machine, I'll be happy to perform the tests even though the bug has been
found.

> but I think I've found the bug.

Yes! you have found the bug! :-)

(This is on the non-SMP, non-MSI kernel.  A first pktgen test says
I can route 319 kpps using a single CPU, which is promising, as I got
160 kpps using the Sun nxge driver.)

Tested-by: Jesper Dangaard Brouer <jdb@comx.dk>
Ben Hutchings Nov. 12, 2008, 12:54 p.m. UTC | #2
On Wed, 2008-11-12 at 04:11 -0800, David Miller wrote:
[...]
> So the following patch should fix this bug.  writeq() should
> be OK as-is, so doesn't need a similar change.
> 
> diff --git a/drivers/net/niu.c b/drivers/net/niu.c
> index 9acb5d7..d8463b1 100644
> --- a/drivers/net/niu.c
> +++ b/drivers/net/niu.c
> @@ -51,8 +51,7 @@ MODULE_VERSION(DRV_MODULE_VERSION);
>  #ifndef readq
>  static u64 readq(void __iomem *reg)
>  {
> -	return (((u64)readl(reg + 0x4UL) << 32) |
> -		(u64)readl(reg));
> +	return ((u64) readl(reg)) | (((u64) readl(reg + 4UL)) << 32);
>  }

Since there's no sequence point between the reads, there's no guarantee
that the reads happen in the order written (regardless of barriers
inside readl()).  This needs to be split into two statements.

Ben.
Jesper Dangaard Brouer Nov. 12, 2008, 1:21 p.m. UTC | #3
On Wed, 2008-11-12 at 12:54 +0000, Ben Hutchings wrote:
> On Wed, 2008-11-12 at 04:11 -0800, David Miller wrote:
> [...]
> > So the following patch should fix this bug.  writeq() should
> > be OK as-is, so doesn't need a similar change.
> > 
> > diff --git a/drivers/net/niu.c b/drivers/net/niu.c
> > index 9acb5d7..d8463b1 100644
> > --- a/drivers/net/niu.c
> > +++ b/drivers/net/niu.c
> > @@ -51,8 +51,7 @@ MODULE_VERSION(DRV_MODULE_VERSION);
> >  #ifndef readq
> >  static u64 readq(void __iomem *reg)
> >  {
> > -	return (((u64)readl(reg + 0x4UL) << 32) |
> > -		(u64)readl(reg));
> > +	return ((u64) readl(reg)) | (((u64) readl(reg + 4UL)) << 32);
> >  }
> 
> Since there's no sequence point between the reads, there's no guarantee
> that the reads happen in the order written (regardless of barriers
> inside readl()).  This needs to be split into two statements.

The nxge driver does this:

#ifndef readq
static inline uint64_t readq(void *addr)
{
	uint32_t val32 = readl(addr);
	uint64_t val64 = (uint64_t) readl(addr + 4);
	return (val32 | (val64 << 32));
}
#endif

#ifndef writeq
static inline void writeq(uint64_t val64, void *addr)
{
	writel((uint32_t)(val64), addr);
	writel((uint32_t)(val64 >> 32), (addr + 4));
}
#endif
Jesper Krogh Nov. 12, 2008, 5:56 p.m. UTC | #4
David Miller wrote:
> I am guessing you're running a 32-bit x86 kernel.
> 
> In such a case the driver has to define a local readq()
> and writeq() implementation.
> 
> What I provide for NIU right now reads the upper 32-bits
> then the lower 32-bits of the register.
> 
> Guess what that does?  The packet counters live in the upper
> 32-bits and the MARK bits live in the lower 32-bits of the
> TX_CS register.
> 
> So it first reads the packet counters, and as a side effect that
> clears the MARK bits in the TX_CS register.  So when we read the lower
> 32-bits the MARK bits are always seen as zero.
> 
> BzzaaarT!
> 
> So the following patch should fix this bug.  writeq() should
> be OK as-is, so doesn't need a similar change.
> 
> diff --git a/drivers/net/niu.c b/drivers/net/niu.c
> index 9acb5d7..d8463b1 100644
> --- a/drivers/net/niu.c
> +++ b/drivers/net/niu.c
> @@ -51,8 +51,7 @@ MODULE_VERSION(DRV_MODULE_VERSION);
>  #ifndef readq
>  static u64 readq(void __iomem *reg)
>  {
> -	return (((u64)readl(reg + 0x4UL) << 32) |
> -		(u64)readl(reg));
> +	return ((u64) readl(reg)) | (((u64) readl(reg + 4UL)) << 32);
>  }
>  
>  static void writeq(u64 val, void __iomem *reg)

On my system, I'm not in a position where I can just pull down the
server and test, but if it seems plausible that the above is the same
bug I hit using the 10GBitE card, then I'll definitely try to test it out.

I sort-of reliably hit the problem after a few days of production on a 16
core, amd64 system running NFS-server.

Does it seem likely to be the same problem?

Thanks
Jesper Dangaard Brouer Nov. 12, 2008, 9:31 p.m. UTC | #5
Hi Google,

On Wed, 12 Nov 2008, David Miller wrote:

> Guess what that does?  The packet counters live in the upper
> 32-bits and the MARK bits live in the lower 32-bits of the
> TX_CS register.
>
> So it first reads the packet counters, and as a side effect that
> clears the MARK bits in the TX_CS register.  So when we read the lower
> 32-bits the MARK bits are always seen as zero.

For the thorough reader, the TX_CS Transmit Control and Status register is 
described in table 26-15 page 761-762 in the PDF document titled: 
"UltraSPARC T2 supplement to UltraSPARC architecture 2007", downloadable 
from: 
http://opensparc-t2.sunsource.net/specs/UST2-UASuppl-current-draft-HP-EXT.pdf

Cheers,
   Jesper Brouer

--
-------------------------------------------------------------------
MSc. Master of Computer Science
Dept. of Computer Science, University of Copenhagen
Author of http://www.adsl-optimizer.dk
-------------------------------------------------------------------
David Miller Nov. 12, 2008, 9:43 p.m. UTC | #6
From: Jesper Krogh <jesper@krogh.cc>
Date: Wed, 12 Nov 2008 18:56:48 +0100

> I sort-of reliably hit the problem after a few days of production on
> a 16 core, amd64 system running NFS-server.
>
> Does it seem likely to be the same problem?

Not really, it sounds like you're using a 64-bit kernel (this only
affects 32-bit ones), and the problem triggers after the first 256
packets are sent to the send destination, so it should happen quickly.
David Miller Nov. 12, 2008, 9:46 p.m. UTC | #7
From: Ben Hutchings <bhutchings@solarflare.com>
Date: Wed, 12 Nov 2008 12:54:53 +0000

> On Wed, 2008-11-12 at 04:11 -0800, David Miller wrote:
> [...]
> > So the following patch should fix this bug.  writeq() should
> > be OK as-is, so doesn't need a similar change.
> > 
> > diff --git a/drivers/net/niu.c b/drivers/net/niu.c
> > index 9acb5d7..d8463b1 100644
> > --- a/drivers/net/niu.c
> > +++ b/drivers/net/niu.c
> > @@ -51,8 +51,7 @@ MODULE_VERSION(DRV_MODULE_VERSION);
> >  #ifndef readq
> >  static u64 readq(void __iomem *reg)
> >  {
> > -	return (((u64)readl(reg + 0x4UL) << 32) |
> > -		(u64)readl(reg));
> > +	return ((u64) readl(reg)) | (((u64) readl(reg + 4UL)) << 32);
> >  }
> 
> Since there's no sequence point between the reads, there's no guarantee
> that the reads happen in the order written (regardless of barriers
> inside readl()).  This needs to be split into two statements.

What version of the C language are you using?

I personally think it's safe.  If the compiler sees "A | B" it's going
to emit the code to compute A, then the code to compute B, and finally
the "|" operation.

Everything I've always seen says that for "|" the expressions are
evaluated left to right.
Ben Hutchings Nov. 12, 2008, 9:50 p.m. UTC | #8
On Wed, 2008-11-12 at 13:46 -0800, David Miller wrote:
> From: Ben Hutchings <bhutchings@solarflare.com>
> Date: Wed, 12 Nov 2008 12:54:53 +0000
> 
> > On Wed, 2008-11-12 at 04:11 -0800, David Miller wrote:
> > [...]
> > > So the following patch should fix this bug.  writeq() should
> > > be OK as-is, so doesn't need a similar change.
> > > 
> > > diff --git a/drivers/net/niu.c b/drivers/net/niu.c
> > > index 9acb5d7..d8463b1 100644
> > > --- a/drivers/net/niu.c
> > > +++ b/drivers/net/niu.c
> > > @@ -51,8 +51,7 @@ MODULE_VERSION(DRV_MODULE_VERSION);
> > >  #ifndef readq
> > >  static u64 readq(void __iomem *reg)
> > >  {
> > > -	return (((u64)readl(reg + 0x4UL) << 32) |
> > > -		(u64)readl(reg));
> > > +	return ((u64) readl(reg)) | (((u64) readl(reg + 4UL)) << 32);
> > >  }
> > 
> > Since there's no sequence point between the reads, there's no guarantee
> > that the reads happen in the order written (regardless of barriers
> > inside readl()).  This needs to be split into two statements.
> 
> What version of the C language are you using?

Any version will do.

> I personally think it's safe.  If the compiler sees "A | B" it's going
> to emit the code to compute A, then the code to compute B, and finally
> the "|" operation.
> 
> Everything I've always seen says that for "|" the expressions are
> evaluated left to right.

I think you're confusing it with "||" which does have this sequencing
rule.

See <http://c-faq.com/expr/seqpoints.html> if you're not convinced.

Ben.
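
The distinction between the two operators can be sketched in plain C.
Standard C guarantees that the left operand of "||" is evaluated before
the right one, with a sequence point in between, but leaves the order of
the operands of "|" unspecified. The helper names below are hypothetical
and only record the order in which they are called.

```c
/* Records the order in which first() and second() are called. */
static int call_log[2];
static int n_calls;

static int first(void)  { call_log[n_calls++] = 1; return 0; }
static int second(void) { call_log[n_calls++] = 2; return 1; }

/* "||" sequences its operands: first() always runs before second(). */
static int logical_or(void)
{
	n_calls = 0;
	return first() || second();
}

/*
 * "first() | second()" in one expression could call the operands in
 * either order.  Two separate statements pin the order down.
 */
static int bitwise_or_ordered(void)
{
	int a, b;

	n_calls = 0;
	a = first();
	b = second();
	return a | b;
}
```

A compiler may well happen to evaluate "A | B" left to right, but only
the two-statement form makes that order a property of the source code.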
David Miller Nov. 12, 2008, 10:26 p.m. UTC | #9
From: Ben Hutchings <bhutchings@solarflare.com>
Date: Wed, 12 Nov 2008 21:50:57 +0000

> See <http://c-faq.com/expr/seqpoints.html> if you're not convinced.

I don't think that has any implications for the piece of
code we are talking about.

Just google "C order of evaluation" and you will get hundreds
of tables, and all of them will have an entry for "|" (not
just "||") which says that operands are evaluated left to
right.

And since these MMIO reads are volatile operations, there is
no way the compiler can execute them out of order.

And the plain truth is that no compiler does, and that is what
matters in the end.
Matheos Worku Nov. 12, 2008, 11:10 p.m. UTC | #10
Jesper Dangaard Brouer wrote:

>
> Hi Google,
>
> On Wed, 12 Nov 2008, David Miller wrote:
>
>> Guess what that does?  The packet counters live in the upper
>> 32-bits and the MARK bits live in the lower 32-bits of the
>> TX_CS register.
>>
>> So it first reads the packet counters, and as a side effect that
>> clears the MARK bits in the TX_CS register.  So when we read the lower
>> 32-bits the MARK bits are always seen as zero.
>
>
> For the thorough reader, the TX_CS Transmit Control and Status 
> register is described in table 26-15 page 761-762 in the PDF document 
> titled: "UltraSPARC T2 supplement to UltraSPARC architecture 2007", 
> downloadable from: 
> http://opensparc-t2.sunsource.net/specs/UST2-UASuppl-current-draft-HP-EXT.pdf 
>
>
> Cheers,
>   Jesper Brouer
>
> -- 
> -------------------------------------------------------------------
> MSc. Master of Computer Science
> Dept. of Computer Science, University of Copenhagen
> Author of http://www.adsl-optimizer.dk
> -------------------------------------------------------------------

The niu/neptune HW puts some requirements on 32-bit reads of 64-bit
registers. You need to read the lower 32 bits first and then the upper
32 bits.  The same ordering applies to writes as well.
On some 64-bit platforms, the 64-bit reads are split into two 32-bit
reads as well, regardless of the OS.

Regards
Matheos

Jesper Dangaard Brouer Nov. 13, 2008, 8:50 a.m. UTC | #11
Another bug... while unloading the niu module.

During my testing I'm unloading/loading the niu module.  I usually take
down the interfaces _before_ unloading the module, but I forgot one
time, and got the following BUG in the kern log.


niu: niu_put_parent: port[3]
niu 0000:0b:00.3: PCI INT D disabled
niu: niu_put_parent: port[2]
niu 0000:0b:00.2: PCI INT C disabled
niu: niu_put_parent: port[1]
niu 0000:0b:00.1: PCI INT B disabled
------------[ cut here ]------------
kernel BUG at drivers/pci/msi.c:630!
invalid opcode: 0000 [#1] PREEMPT SMP 
last sysfs file: /sys/class/net/lo/operstate
Modules linked in: hpilo serio_raw bnx2 zlib_inflate ipmi_si
ipmi_msghandler hpwdt rng_core ehci_hcd uhci_hcd niu(-) sr_mod cdrom

Pid: 3307, comm: rmmod Tainted: G        W  (2.6.28-rc4-davem #17)
ProLiant DL380 G5
EIP: 0060:[<c02314fc>] EFLAGS: 00010282 CPU: 0
EIP is at msi_free_irqs+0xdc/0xe0
EAX: f60ad420 EBX: 00000030 ECX: f664ff14 EDX: c04a5680
ESI: f71d1000 EDI: f71d146c EBP: f6305eb4 ESP: f6305ea8
 DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0068
Process rmmod (pid: 3307, ti=f6304000 task=f6aaa570 task.ti=f6304000)
Stack:
 f71d1000 f62b4540 f71d1000 f6305ebc c0231508 f6305ec8 c0231791 f62b4000
 f6305edc f81777f8 f71d1000 f817c5d4 f817c5d4 f6305ee8 c022c3e9 f71d1058
 f6305ef8 c0281609 f71d1058 f71d1184 f6305f0c c02816dd f817c5a0 f817c5d4
Call Trace:
 [<c0231508>] ? msix_free_all_irqs+0x8/0x10
 [<c0231791>] ? pci_disable_msix+0x31/0x40
 [<f81777f8>] ? niu_pci_remove_one+0x88/0x8a [niu]
 [<c022c3e9>] ? pci_device_remove+0x19/0x40
 [<c0281609>] ? __device_release_driver+0x59/0x90
 [<c02816dd>] ? driver_detach+0x9d/0xb0
 [<c0280975>] ? bus_remove_driver+0x75/0xa0
 [<c0281b89>] ? driver_unregister+0x39/0x40
 [<c022c641>] ? pci_unregister_driver+0x21/0x80
 [<f817443d>] ? niu_exit+0xd/0x10 [niu]
 [<c014d646>] ? sys_delete_module+0x116/0x1f0
 [<c01744e0>] ? do_munmap+0x1f0/0x250
 [<c01755f6>] ? sys_munmap+0x46/0x60
 [<c0103231>] ? sysenter_do_call+0x12/0x2c
Code: b7 43 08 8b 53 1c c1 e0 04 01 d0 ba 01 00 00 00 83 c0 0c 89 10 3b
7b 14 75 aa 8b 43 1c e8 bd 6f ee ff eb a0 5b 31 c0 5e 5f 5d c3 <0f> 0b
eb fe 55 89 e5 e8 18 ff ff ff 5d c3 8d b6 00 00 00 00 55 
EIP: [<c02314fc>] msi_free_irqs+0xdc/0xe0 SS:ESP 0068:f6305ea8
---[ end trace 6594bbb8d1cf29ee ]---
Jesper Dangaard Brouer Nov. 13, 2008, 9:10 a.m. UTC | #12
On Wed, 2008-11-12 at 04:11 -0800, David Miller wrote:
> From: David Miller <davem@davemloft.net>
> Date: Wed, 12 Nov 2008 03:52:40 -0800 (PST)
> 
> > Ok, Jesper, please try two things for me, leave the debugging patch
> > in there for all the tests:
> > 
> > 1) Retrigger the problem (with or without MSI, doesn't matter) but
> >    add back in that test I asked you to try last week.  The one
> >    where the "if (++rp->mark_counter == rp->mark_freq)" condition
> >    test in niu_start_xmit() is commented out, so that the
> >    "mrk |= TX_DESC_MARK;" statement always runs.
> > 
> >    Get me the log dump produced by that scenario.

------------[ cut here ]------------
WARNING: at net/sched/sch_generic.c:226 dev_watchdog+0x21e/0x230()
NETDEV WATCHDOG: eth2 (niu): transmit timed out
Modules linked in: niu ipmi_si hpwdt serio_raw bnx2 zlib_inflate rng_core ipmi_msghandler hpilo ehci_hcd uhci_hcd sr_mod cdrom
Pid: 0, comm: swapper Not tainted 2.6.28-rc4-davem #17
Call Trace:
 [<c0125823>] warn_slowpath+0x63/0x80
 [<c011f03e>] ? __enqueue_entity+0x8e/0xb0
 [<c010888c>] ? native_sched_clock+0x1c/0x80
 [<c01453c4>] ? __lock_acquire+0x104/0x8e0
 [<c01453c4>] ? __lock_acquire+0x104/0x8e0
 [<c010888c>] ? native_sched_clock+0x1c/0x80
 [<c013f19b>] ? getnstimeofday+0x3b/0xe0
 [<c0144b09>] ? lock_release_holdtime+0x79/0xc0
 [<c021fd2e>] ? strlcpy+0x1e/0x60
 [<c031f4be>] dev_watchdog+0x21e/0x230
 [<c0144b09>] ? lock_release_holdtime+0x79/0xc0
 [<c012e55d>] ? run_timer_softirq+0x10d/0x190
 [<c012e56f>] run_timer_softirq+0x11f/0x190
 [<c014362c>] ? tick_dev_program_event+0x3c/0xc0
 [<c031f2a0>] ? dev_watchdog+0x0/0x230
 [<c012a204>] __do_softirq+0x94/0x160
 [<c013c7c0>] ? hrtimer_interrupt+0x150/0x180
 [<c013c651>] ? ktime_get+0x11/0x30
 [<c012a30b>] do_softirq+0x3b/0x50
 [<c012a515>] irq_exit+0x75/0x90
 [<c011364a>] smp_apic_timer_interrupt+0x5a/0x90
 [<c013c5ca>] ? hrtimer_start+0x1a/0x20
 [<c0103f0c>] apic_timer_interrupt+0x28/0x30
 [<c01090d5>] ? mwait_idle+0x35/0x40
 [<c0101c1e>] cpu_idle+0x4e/0xa0
---[ end trace 3045c940a424568f ]---
niu 0000:0b:00.0: niu: eth2: Transmit timed out, resetting
niu 0000:0b:00.0: niu: eth2: LDG[idx(0):num(0)] V0[sw(0x0)hw(0x0)] V1[sw(0x0)hw(0x0)] V2[sw(0x0)hw(0x0)]
niu 0000:0b:00.0: niu: eth2: LDG[idx(1):num(1)] V0[sw(0x0)hw(0x0)] V1[sw(0x0)hw(0x0)] V2[sw(0x0)hw(0x0)]
niu 0000:0b:00.0: niu: eth2: LDG[idx(2):num(2)] V0[sw(0x2000000000)hw(0x0)] V1[sw(0x0)hw(0x0)] V2[sw(0x0)hw(0x0)]
niu 0000:0b:00.0: niu: eth2: LDG[idx(3):num(3)] V0[sw(0x1)hw(0x0)] V1[sw(0x0)hw(0x0)] V2[sw(0x0)hw(0x0)]
niu 0000:0b:00.0: niu: eth2: LDG[idx(4):num(4)] V0[sw(0x0)hw(0x0)] V1[sw(0x0)hw(0x0)] V2[sw(0x0)hw(0x0)]
niu 0000:0b:00.0: niu: eth2: LDG[idx(5):num(5)] V0[sw(0x0)hw(0x0)] V1[sw(0x0)hw(0x0)] V2[sw(0x0)hw(0x0)]
niu 0000:0b:00.0: niu: eth2: LDG[idx(6):num(6)] V0[sw(0x0)hw(0x0)] V1[sw(0x0)hw(0x0)] V2[sw(0x0)hw(0x0)]
niu 0000:0b:00.0: niu: eth2: LDG[idx(7):num(7)] V0[sw(0x100000000)hw(0x0)] V1[sw(0x0)hw(0x0)] V2[sw(0x0)hw(0x0)]
niu 0000:0b:00.0: niu: eth2: LDG[idx(8):num(8)] V0[sw(0x0)hw(0x0)] V1[sw(0x0)hw(0x0)] V2[sw(0x0)hw(0x0)]
niu 0000:0b:00.0: niu: eth2: LDG[idx(9):num(9)] V0[sw(0x0)hw(0x0)] V1[sw(0x0)hw(0x0)] V2[sw(0x0)hw(0x0)]
niu 0000:0b:00.0: niu: eth2: Dumping transmitter state.
niu 0000:0b:00.0: niu: eth2: TX_RING[ 0] CHANNEL 0 LDN 32
niu 0000:0b:00.0: niu: eth2: TX_RING[ 0] parent->lgd_map[ldn] 7
niu 0000:0b:00.0: niu: eth2: TX_RING[ 0] Num pending TX SKBs: 2
niu 0000:0b:00.0: niu: eth2: TX_RING[ 0] TX_CS sw[0002000100000000] hw[0002000100000000]
niu 0000:0b:00.0: niu: eth2: TX_RING[ 1] CHANNEL 1 LDN 33
niu 0000:0b:00.0: niu: eth2: TX_RING[ 1] parent->lgd_map[ldn] 8
niu 0000:0b:00.0: niu: eth2: TX_RING[ 1] Num pending TX SKBs: 0
niu 0000:0b:00.0: niu: eth2: TX_RING[ 1] TX_CS sw[0000000000000000] hw[0000000000000000]
niu 0000:0b:00.0: niu: eth2: TX_RING[ 2] CHANNEL 2 LDN 34
niu 0000:0b:00.0: niu: eth2: TX_RING[ 2] parent->lgd_map[ldn] 9
niu 0000:0b:00.0: niu: eth2: TX_RING[ 2] Num pending TX SKBs: 0
niu 0000:0b:00.0: niu: eth2: TX_RING[ 2] TX_CS sw[0000000000000000] hw[0000000000000000]
niu 0000:0b:00.0: niu: eth2: TX_RING[ 3] CHANNEL 3 LDN 35
niu 0000:0b:00.0: niu: eth2: TX_RING[ 3] parent->lgd_map[ldn] 0
niu 0000:0b:00.0: niu: eth2: TX_RING[ 3] Num pending TX SKBs: 0
niu 0000:0b:00.0: niu: eth2: TX_RING[ 3] TX_CS sw[0000000000000000] hw[0000000000000000]
niu 0000:0b:00.0: niu: eth2: TX_RING[ 4] CHANNEL 4 LDN 36
niu 0000:0b:00.0: niu: eth2: TX_RING[ 4] parent->lgd_map[ldn] 1
niu 0000:0b:00.0: niu: eth2: TX_RING[ 4] Num pending TX SKBs: 0
niu 0000:0b:00.0: niu: eth2: TX_RING[ 4] TX_CS sw[0000000000000000] hw[0000000000000000]
niu 0000:0b:00.0: niu: eth2: TX_RING[ 5] CHANNEL 5 LDN 37
niu 0000:0b:00.0: niu: eth2: TX_RING[ 5] parent->lgd_map[ldn] 2
niu 0000:0b:00.0: niu: eth2: TX_RING[ 5] Num pending TX SKBs: 237
niu 0000:0b:00.0: niu: eth2: TX_RING[ 5] TX_CS sw[00ed00ec00000000] hw[00ed00ec00000000]
Jesper Dangaard Brouer Nov. 13, 2008, 10:29 a.m. UTC | #13
On Wed, 2008-11-12 at 04:11 -0800, David Miller wrote:
> From: David Miller <davem@davemloft.net>
> Date: Wed, 12 Nov 2008 03:52:40 -0800 (PST)
> 
> > Ok, Jesper, please try two things for me, leave the debugging patch
> > in there for all the tests:
> > 
> > 1) Retrigger the problem (with or without MSI, doesn't matter) but
> >    add back in that test I asked you to try last week.  The one
> >    where the "if (++rp->mark_counter == rp->mark_freq)" condition
> >    test in niu_start_xmit() is commented out, so that the
> >    "mrk |= TX_DESC_MARK;" statement always runs.
> > 
> >    Get me the log dump produced by that scenario.
> > 
> > 2) Next, simply comment out the:
> > 
> >       if (unlikely(!(cs & (TX_CS_MK | TX_CS_MMK))))
> >               goto out;
> > 
> >    lines in niu_tx_work().
> > 
> > Let's see what new info we can get out of this.

Applying both test#1 and test#2.

After applying test#2, I cannot get it to do a TX transmit timed out.
And everything seems to work... which after the known bug fix was kind
of the expected behaviour...

Although I'm not happy about the new perf numbers, as I now on an SMP
system can only route approx 290 kpps; remember I could route 319 kpps
using a single CPU nosmp kernel. (Even more annoying is that oprofile is
broken.)
David Miller Nov. 13, 2008, 10:15 p.m. UTC | #14
From: Jesper Dangaard Brouer <jdb@comx.dk>
Date: Thu, 13 Nov 2008 11:29:31 +0100

> Although I'm not happy about the new perf numbers, as I now on a SMP
> system only can route approx 290 kpps, remember I could route 319 kpps
> using a single CPU nosmp kernel.

That unfortunately (can be) the cost of SMP :-/

With multi-flow tests, Robert Olsson is getting 4.2 mpps rates with
NIU and pktgen.  That's what this card is designed for, good
multi-flow workload performance, rather than striving for maximum
single-flow performance.

> (even more annoying is that oprofile is broken)

Yes, people on lkml are trying to figure out what is causing
that regression on x86.
David Miller Nov. 13, 2008, 10:19 p.m. UTC | #15
From: Jesper Dangaard Brouer <jdb@comx.dk>
Date: Thu, 13 Nov 2008 10:10:12 +0100

> On Wed, 2008-11-12 at 04:11 -0800, David Miller wrote:
> > From: David Miller <davem@davemloft.net>
> > Date: Wed, 12 Nov 2008 03:52:40 -0800 (PST)
> > 
> > > Ok, Jesper, please try two things for me, leave the debugging patch
> > > in there for all the tests:
> > > 
> > > 1) Retrigger the problem (with or without MSI, doesn't matter) but
> > >    add back in that test I asked you to try last week.  The one
> > >    where the "if (++rp->mark_counter == rp->mark_freq)" condition
> > >    test in niu_start_xmit() is commented out, so that the
> > >    "mrk |= TX_DESC_MARK;" statement always runs.
> > > 
> > >    Get me the log dump produced by that scenario.
> 
> ------------[ cut here ]------------
> WARNING: at net/sched/sch_generic.c:226 dev_watchdog+0x21e/0x230()
> NETDEV WATCHDOG: eth2 (niu): transmit timed out
> Modules linked in: niu ipmi_si hpwdt serio_raw bnx2 zlib_inflate rng_core ipmi_msghandler hpilo ehci_hcd uhci_hcd sr_mod cdrom
> Pid: 0, comm: swapper Not tainted 2.6.28-rc4-davem #17
> Call Trace:

Thanks a lot for making this test Jesper, even though the bug
is fixed.

> niu 0000:0b:00.0: niu: eth2: TX_RING[ 5] CHANNEL 5 LDN 37
> niu 0000:0b:00.0: niu: eth2: TX_RING[ 5] parent->lgd_map[ldn] 2
> niu 0000:0b:00.0: niu: eth2: TX_RING[ 5] Num pending TX SKBs: 237
> niu 0000:0b:00.0: niu: eth2: TX_RING[ 5] TX_CS sw[00ed00ec00000000] hw[00ed00ec00000000]

Same signature, counters advancing yet no mark bits are set.
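
To make the signature concrete, the dumped TX_CS value can be split by
hand. The sketch below only relies on what was stated earlier in the
thread (counters in the upper 32 bits, mark bits in the lower 32 bits).
The further split of the upper half into two 16-bit fields is an
assumption for illustration, and the helper names are hypothetical.

```c
#include <stdint.h>

/* Split a 64-bit TX_CS dump value into its 32-bit halves. */
static uint32_t tx_cs_upper(uint64_t cs) { return (uint32_t)(cs >> 32); }
static uint32_t tx_cs_lower(uint64_t cs) { return (uint32_t)cs; }

/* Assumed layout: two 16-bit counter fields inside the upper half. */
static uint16_t upper_hi16(uint64_t cs) { return (uint16_t)(cs >> 48); }
static uint16_t upper_lo16(uint64_t cs) { return (uint16_t)(cs >> 32); }
```

For 0x00ed00ec00000000 the lower half is zero, so no mark bit is visible,
while the upper half holds 0xed (237) and 0xec (236). The value 237 equals
the "Num pending TX SKBs" count reported for the same ring.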

Now if we can fix that MSIX BUG() and start analyzing your
pps performance with oprofile, we'll be in good shape :)
Jesper Dangaard Brouer Nov. 19, 2008, 10:58 p.m. UTC | #16
On Thu, 13 Nov 2008, David Miller wrote:

> From: Jesper Dangaard Brouer <jdb@comx.dk>
> Date: Thu, 13 Nov 2008 11:29:31 +0100
>
>> Although I'm not happy about the new perf numbers, as I now on a SMP
>> system only can route approx 290 kpps, remember I could route 319 kpps
>> using a single CPU nosmp kernel.
>
> That unfortunately (can be) the cost of SMP :-/

[Regression]

Well that was not the real cause of the performance loss, because on
kernel 2.6.27 I get really good performance (900-1200kpps) compared to
2.6.28 (git net-2.6).

The cause of this problem (tracked down together with Robert Olsson) is
that on 2.6.28 I have a lot fewer IRQs available.  It seems max 34 IRQs.

Due to the reduced number of IRQs the NIU driver cannot get enough IRQs for
the interfaces, and starts to use "IO-APIC" based IRQs.

On kernel 2.6.28:

  My eth2 is using 10 IRQs all "PCI-MSI-edge".

  BUT my eth3 is using a single IRQ using "IO-APIC-fasteoi" and shared
  with the usb driver...

I think that must be my performance problem on 2.6.28.


> With multi-flow tests, Robert Olsson is getting 4.2 mpps rates with
> NIU and pktgen.  That's what this card is designed for, good
> multi-flow workload performance, rather than striving for maximum
> single-flow performance.

[Packet performance]

Yes, I know, I do use pktgen and multi-flows (rand dest IP+port).

For the two drivers, NIU and Sun's NXGE, my packets-per-second performance
on 2.6.27 (with backported NIU fixes) is now:

   With NIU driver I can route 900 kpps.

   With NXGE driver (and enqueue=NULL hack) I can route 1200 kpps.

Actually I think I can go higher, because I'm limited by my packet rate 
generator. I use pktgen (with rand dst IP+port) and can only generate 1200 
kpps.

(I have actually ordered some new hardware, so I can get a faster pktgen 
machine and perhaps test it as a router too.  Also ordered the hardware 
because I want to test PCI-express v.2.0. I have a prototype 12-port 
gigabit NIC (from hotlava systems) that support PCIe v.2.0 and has 6x 
82575 chips (4RX/4TX queues))


Regards
   Jesper Brouer

--
-------------------------------------------------------------------
MSc. Master of Computer Science
Dept. of Computer Science, University of Copenhagen
Author of http://www.adsl-optimizer.dk
-------------------------------------------------------------------
David Miller Nov. 19, 2008, 11:11 p.m. UTC | #17
From: Jesper Dangaard Brouer <hawk@diku.dk>
Date: Wed, 19 Nov 2008 23:58:12 +0100 (CET)

> Well that was not the real cause of the performance loss.  Because
> on kernel 2.6.27 I get really good performance (900-1200kpps)
> compared to 2.6.28 (git net-2.6).
>
> The cause of this problem (tracked down together with Robert Olsson)
> is that on 2.6.28 I have a lot less IRQs available.  It seems max 34
> IRQs.
>
> Due the reduced number of IRQs the NIU driver cannot get enough IRQs
> to the interfaces, and starts to use "IO-APIC" based IRQs.

This is almost certainly related to the driver unload bug.

I know you ran into unbuildable/unbootable kernels during a bisect,
but you really need to track down this regression.

There were a lot of IRQ changes, especially on x86.  The sequence is
something like:

1) dyn irqs
2) APIC/IO_APIC handling integration
3) by-hand REVERT of dyn irqs, it was done by hand in order to not
   lose the #2 changes
4) interrupt remapping support

Jesper Dangaard Brouer Nov. 20, 2008, 7:48 p.m. UTC | #18
Hi Thomas Gleixner,

I have bisected a regression to your commit
3235e936c0cc3589309280b6f59e5096779adae3,
"x86: remove sparse irq from Kconfig".

It's actually not necessarily your fault, as your commit simply removes
the config option HAVE_SPARSE_IRQ.  This reveals the bug / regression
I'm exposed to.

Guess I should bisect again to find the exact faulty commit, but I'm
rather sick of bisecting at the moment, and thought you might have a
better idea what's going wrong.  I would rather spend my time
performance tuning the multiqueue routing code...

[The regression]:

During my testing of the Sun Neptune based NICs, on kernel 2.6.27 I
get really good performance (900-1200kpps) compared to 2.6.28 (davem
git net-2.6).

The cause of this problem (tracked down together with Robert Olsson)
is that on 2.6.28 I have a lot fewer IRQs available.  It seems max 34
IRQs.  Due to the reduced number of IRQs the NIU driver cannot get
enough IRQs for the interfaces, and starts to use "IO-APIC" based
IRQs.

On kernel 2.6.28: My eth2 is using 10 IRQs, all "PCI-MSI-edge".  BUT
my eth3 is using a single IRQ using "IO-APIC-fasteoi", shared with
the usb driver.  That is my performance problem on 2.6.28.

[Other related bugs]:
  Unloading the "niu" driver gives a kernel BUG during
  deallocation of MSI interrupts. (See dmesg output below if interested)

(I have attached full bisect history)

Cheers,
   Jesper Brouer

--
-------------------------------------------------------------------
MSc. Master of Computer Science
Dept. of Computer Science, University of Copenhagen
Author of http://www.adsl-optimizer.dk
-------------------------------------------------------------------


On Wed, 19 Nov 2008, David Miller wrote:
> From: Jesper Dangaard Brouer <hawk@diku.dk>
> Date: Wed, 19 Nov 2008 23:58:12 +0100 (CET)
>
>> Well that was not the real cause of the performance loss.  Because
>> on kernel 2.6.27 I get really good performance (900-1200kpps)
>> compared to 2.6.28 (git net-2.6).
>>
>> The cause of this problem (tracked down together with Robert Olsson)
>> is that on 2.6.28 I have a lot less IRQs available.  It seems max 34
>> IRQs.
>>
>> Due the reduced number of IRQs the NIU driver cannot get enough IRQs
>> to the interfaces, and starts to use "IO-APIC" based IRQs.
>
> This is almost certainly related to the driver unload bug.
>
> I know you ran into unbuildable/unbootable kernels during a bisect,
> but you really need to track down this regression.


------------[ cut here ]------------
kernel BUG at drivers/pci/msi.c:632!
invalid opcode: 0000 [#1] PREEMPT SMP
Modules linked in: ehci_hcd bnx2 uhci_hcd zlib_inflate serio_raw hpilo 
niu(-)

Pid: 3036, comm: rmmod Not tainted (2.6.27-bisect #5) ProLiant DL380 G5
EIP: 0060:[<c021ecac>] EFLAGS: 00010286 CPU: 2
EIP is at msi_free_irqs+0xdc/0xe0
EAX: f6b8f860 EBX: 00000030 ECX: f7156ba8 EDX: c0420500
ESI: f7156800 EDI: f7156ba8 EBP: f6431eb4 ESP: f6431ea8
  DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0068
Process rmmod (pid: 3036, ti=f6430000 task=f70f9b20 task.ti=f6430000)
Stack:
  f7156800 f670c400 f7156800 f6431ebc c021ecb8 f6431ec8 c021ef41 f670c000
  f6431edc f809d3f8 f7156800 f80a1ed4 f80a1ed4 f6431ee8 c0219c29 f7156858
  f6431ef8 c026b0d4 f7156858 f7156914 f6431f0c c026b197 f80a1ea0 f80a1ed4
Call Trace:
  [<c021ecb8>] ? msix_free_all_irqs+0x8/0x10
  [<c021ef41>] ? pci_disable_msix+0x31/0x40
  [<f809d3f8>] ? niu_pci_remove_one+0x88/0x8a [niu]
  [<c0219c29>] ? pci_device_remove+0x19/0x40
  [<c026b0d4>] ? __device_release_driver+0x54/0x80
  [<c026b197>] ? driver_detach+0x97/0xa0
  [<c026a475>] ? bus_remove_driver+0x75/0xa0
  [<c026b609>] ? driver_unregister+0x39/0x40
  [<c0219e51>] ? pci_unregister_driver+0x21/0x80
  [<f809a0ad>] ? niu_exit+0xd/0x10 [niu]
  [<c0145d74>] ? sys_delete_module+0x114/0x1d0
  [<c016810a>] ? remove_vma+0x3a/0x50
  [<c0168c29>] ? do_munmap+0x189/0x1e0
  [<c0103229>] ? sysenter_do_call+0x12/0x21
  [<c0330000>] ? quirk_disable_msi+0x30/0x50
Code: b7 43 08 8b 53 1c c1 e0 04 01 d0 ba 01 00 00 00 83 c0 0c 89 10 3b 7b 14 75 aa 8b 43 1c e8 3d 92 ef ff eb a0 5b 31 c0 5e 5f 5d c3 <0f> 0b eb fe 55 89 e5 e8 18 ff ff ff 5d c3 8d b6 00 00 00 00 55
EIP: [<c021ecac>] msi_free_irqs+0xdc/0xe0 SS:ESP 0068:f6431ea8
---[ end trace f72de2e283920207 ]---
~~ -*-text-*-

       -------------------------------------------------------
			 Bisecting IRQ change:
		      What change reduced the IRQs
       -------------------------------------------------------
		 Jesper Dangaard Brouer (jdb@comx.dk)
       -------------------------------------------------------
	$LastChangedRevision: 786 $
	$Date: 2008-11-20 20:44:51 +0100 (Thu, 20 Nov 2008) $
       -------------------------------------------------------

git clone
~~~~~~~~~

+---------
 cd /var/kernels/git/davem
 git clone git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6.git net-2.6-bisect-irqs
+---------

Description / Reason to find
~~~~~~~~~~~~~~~~~~~~~~~~~~~~

 This came up during my testing of the Sun Neptune based NICs.

 On kernel 2.6.27 I get really good performance (900-1200kpps)
 compared to 2.6.28 (git net-2.6).

 The cause of this problem (tracked down together with Robert Olsson)
 is that on 2.6.28 I have a lot less IRQs available.  It seems max 34
 IRQs.

 Due to the reduced number of IRQs the NIU driver cannot get enough IRQs
 to the interfaces, and starts to use "IO-APIC" based IRQs.

 On kernel 2.6.28:

  My eth2 is using 10 IRQs all "PCI-MSI-edge".

  BUT my eth3 is using a single IRQ using "IO-APIC-fasteoi" and shared
  with the usb driver...

 I think that must be my performance problem on 2.6.28.
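
 The per-interface picture described above (how many IRQs each ethN
 got, and of which type) can be read off /proc/interrupts with a small
 helper.  This is only a sketch: the function name irq_summary is made
 up for these notes, and it assumes interface names of the form ethN
 and type columns starting with IO-APIC or PCI-MSI, as in the dumps in
 this log.

```shell
# Print "<interface> <irq-type> <count>" for every ethN found in an
# interrupts listing (default: the live /proc/interrupts).
irq_summary() {
    awk '
    {
        # Locate the interrupt-type column (e.g. "PCI-MSI-edge").
        type = ""
        for (i = 2; i <= NF; i++)
            if ($i ~ /^(IO-APIC|PCI-MSI)/) { type = $i; start = i + 1; break }
        if (type == "") next    # header line or NMI/LOC/... summary rows

        # Everything after the type column is a comma-separated device list.
        for (i = start; i <= NF; i++) {
            dev = $i
            sub(/,$/, "", dev)
            if (dev ~ /^eth[0-9]+$/) count[dev " " type]++
        }
    }
    END { for (k in count) print k, count[k] }
    ' "${1:-/proc/interrupts}" | sort
}
```

 Run it without arguments on the booted kernel, or pass a saved dump as
 the first argument.  Any ethN reported with an IO-APIC type is in the
 state described as bad here.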

Known: Good and bad
~~~~~~~~~~~~~~~~~~~

 GOOD:
  git bisect good v2.6.27

 BAD:
  git bisect bad 92b29b86fe2e183d44eb467e5e74a5f718ef2e43

 [92b29b86fe2e183d44eb467e5e74a5f718ef2e43] #Merge branch
 'tracing-v28-for-linus' of
 git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip


History:
~~~~~~~~

+--------
 cd /var/kernels/git/davem/net-2.6-bisect-irqs/
 git bisect start
 git bisect good v2.6.27
+--------

+--------------
git bisect bad 92b29b86fe2e183d44eb467e5e74a5f718ef2e43
Bisecting: 3220 revisions left to test after this
[af5c2bd16ac2e5688c3bf46ea1f95112d696d294] x86: fix virt_addr_valid() with CONFIG_DEBUG_VIRTUAL=y, v2
+--------------

 CONFIG_LOCALVERSION="-bisect"

+-------------
cp ../net-2.6-bisect/.config .
script make_oldconfig_01
make oldconfig
exit
#Script done, file is make_oldconfig_01
+-------------

+----------------
time make -j6 bzImage modules
#
#real    9m22.739s
#user    16m56.776s
#sys     1m4.672s
+----------------

 Booted kernel: GOOD: irqs and (niu rmmod good)

+----------------
git bisect good
Bisecting: 1614 revisions left to test after this
[36ac1d2f323f8bf8bc10c25b88f617657720e241] Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input
+----------------

 Compiling:

+----------------
time make -j6 bzImage modules
+----------------

 Booted kernel: GOOD: irqs and (niu rmmod good)

+-----------
git bisect good
Bisecting: 807 revisions left to test after this
[1aece34833721d64eb33fc15cd923c727296d3d3] container freezer: rename check_if_frozen()
+-----------

 Compiling...

+----------------
time make -j6 bzImage modules
#real    10m1.561s
#user    17m23.293s
#sys     1m5.744s
+----------------

 Installing...

 Booted kernel:

+----
dcu-router-ng:~# uname -a
Linux dcu-router-ng 2.6.27-bisect #3 SMP PREEMPT Thu Nov 20 12:33:02 CET 2008 i686 GNU/Linux
+----

 Results: GOOD: irqs and (niu rmmod good)

+------
git bisect good
Bisecting: 403 revisions left to test after this
[1d9a8a47d659f053abeca9ece45651b4d94780c8] Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse
+------

 Compiling...

+----------------
time make -j6 bzImage modules
#real    10m9.371s
#user    17m21.781s
#sys     1m6.052s
+----------------

 Installing...

 Booting ...

+-------
dcu-router-ng:~# uname -a
Linux dcu-router-ng 2.6.27-bisect #4 SMP PREEMPT Thu Nov 20 12:50:39 CET 2008 i686 GNU/Linux
+-------

 Results: GOOD: irqs and (niu rmmod good)


+-------
 git-bisect good
Bisecting: 223 revisions left to test after this
[dd3a1db900f2a215a7d7dd71b836e149a6cf5fed] genirq: improve include files
+-------

+----------------
time make -j6 bzImage modules
+----------------

 Booting ...

+--------
Linux dcu-router-ng 2.6.27-bisect #5 SMP PREEMPT Thu Nov 20 13:58:34 CET 2008 i686 GNU/Linux
+--------

 Results: BAD: irqs and (niu rmmod also BAD)

+-------
cat /proc/interrupts
           CPU0       CPU1       CPU2       CPU3       
  0:        125          0          0          0   IO-APIC-edge      timer
  1:          0          0          1          1   IO-APIC-edge      i8042
  3:          2          1          2          2   IO-APIC-edge      serial
  8:          0          2          0          0   IO-APIC-edge      rtc
  9:          0          0          0          0   IO-APIC-fasteoi   acpi
 12:          1          2          1          0   IO-APIC-edge      i8042
 16:        103        108        108        112   IO-APIC-fasteoi   uhci_hcd:usb1, ehci_hcd:usb6, eth0
 17:          0          0          0          0   IO-APIC-fasteoi   uhci_hcd:usb2
 18:          0          0          0          0   IO-APIC-fasteoi   uhci_hcd:usb3
 19:          0          0          0          0   IO-APIC-fasteoi   uhci_hcd:usb4, eth3
 20:          0          0          0          0   PCI-MSI-edge      eth2
 21:          0          0          0          0   PCI-MSI-edge      eth2
 22:         24         23         23         23   IO-APIC-fasteoi   uhci_hcd:usb5, eth2
 23:          0          0          0          0   PCI-MSI-edge      eth2
 24:          0          0          0          0   PCI-MSI-edge      eth2
 25:          0          0          0          0   PCI-MSI-edge      eth2
 26:          0          0          0          0   PCI-MSI-edge      eth2
 27:          0          0          0          0   PCI-MSI-edge      eth2
 28:          0          0          0          0   PCI-MSI-edge      eth2
 29:          0          0          0          0   PCI-MSI-edge      eth2
 30:          0          0          0          0   PCI-MSI-edge      eth2
 31:          0          0          0          0   PCI-MSI-edge      eth2
 32:          0          0          0          0   PCI-MSI-edge      eth2
 34:        271        268        268        264   PCI-MSI-edge      cciss0
NMI:          0          0          0          0   Non-maskable interrupts
LOC:       3301       2970       2594       2389   Local timer interrupts
RES:         28        560          6         13   Rescheduling interrupts
CAL:         50        104         99         62   Function call interrupts
TLB:        241        224        287        279   TLB shootdowns
TRM:          0          0          0          0   Thermal event interrupts
SPU:          0          0          0          0   Spurious interrupts
ERR:          0
MIS:          0
+-------

 OUTPUT "rmmod niu" (gives segfault) and "dmesg"

+-------
------------[ cut here ]------------
kernel BUG at drivers/pci/msi.c:632!
invalid opcode: 0000 [#1] PREEMPT SMP 
Modules linked in: ehci_hcd bnx2 uhci_hcd zlib_inflate serio_raw hpilo niu(-)

Pid: 3036, comm: rmmod Not tainted (2.6.27-bisect #5) ProLiant DL380 G5
EIP: 0060:[<c021ecac>] EFLAGS: 00010286 CPU: 2
EIP is at msi_free_irqs+0xdc/0xe0
EAX: f6b8f860 EBX: 00000030 ECX: f7156ba8 EDX: c0420500
ESI: f7156800 EDI: f7156ba8 EBP: f6431eb4 ESP: f6431ea8
 DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0068
Process rmmod (pid: 3036, ti=f6430000 task=f70f9b20 task.ti=f6430000)
Stack:
 f7156800 f670c400 f7156800 f6431ebc c021ecb8 f6431ec8 c021ef41 f670c000
 f6431edc f809d3f8 f7156800 f80a1ed4 f80a1ed4 f6431ee8 c0219c29 f7156858
 f6431ef8 c026b0d4 f7156858 f7156914 f6431f0c c026b197 f80a1ea0 f80a1ed4
Call Trace:
 [<c021ecb8>] ? msix_free_all_irqs+0x8/0x10
 [<c021ef41>] ? pci_disable_msix+0x31/0x40
 [<f809d3f8>] ? niu_pci_remove_one+0x88/0x8a [niu]
 [<c0219c29>] ? pci_device_remove+0x19/0x40
 [<c026b0d4>] ? __device_release_driver+0x54/0x80
 [<c026b197>] ? driver_detach+0x97/0xa0
 [<c026a475>] ? bus_remove_driver+0x75/0xa0
 [<c026b609>] ? driver_unregister+0x39/0x40
 [<c0219e51>] ? pci_unregister_driver+0x21/0x80
 [<f809a0ad>] ? niu_exit+0xd/0x10 [niu]
 [<c0145d74>] ? sys_delete_module+0x114/0x1d0
 [<c016810a>] ? remove_vma+0x3a/0x50
 [<c0168c29>] ? do_munmap+0x189/0x1e0
 [<c0103229>] ? sysenter_do_call+0x12/0x21
 [<c0330000>] ? quirk_disable_msi+0x30/0x50
Code: b7 43 08 8b 53 1c c1 e0 04 01 d0 ba 01 00 00 00 83 c0 0c 89 10 3b 7b 14 75 aa 8b 43 1c e8 3d 92 ef ff eb a0 5b 31 c0 5e 5f 5d c3 <0f> 0b eb fe 55 89 e5 e8 18 ff ff ff 5d c3 8d b6 00 00 00 00 55 
EIP: [<c021ecac>] msi_free_irqs+0xdc/0xe0 SS:ESP 0068:f6431ea8
---[ end trace f72de2e283920207 ]---
+-------

+------
git-bisect bad
Bisecting: 89 revisions left to test after this
[db4b5525caafd846ec20f95afbc6403c792e22cf] x86: apic_64.c - setup_APIC_timer has to be __cpuinit function
+------

 Related config change? (make oldconfig)

+------
 script make_oldconfig_02
 make oldconfig
 Script done, file is make_oldconfig_02
+------

+------
Support sparse irq numbering (HAVE_SPARSE_IRQ) [Y/n/?] (NEW) ? ?Y

This enables support for sparse irq, esp for msi/msi-x. the irq
number will be bus/dev/fn + 12bit. You may need if you have lots of
cards supports msi-x installed.

If you don't know what to do here, say Y.
+------

 Compiling...

+----------------
 time make -j6 bzImage modules
#
#real    9m29.556s
#user    17m10.396s
#sys     1m5.056s
+----------------

 Booting ...

+-------
Linux dcu-router-ng 2.6.27-bisect #6 SMP PREEMPT Thu Nov 20 14:25:40 CET 2008 i686 GNU/Linux
+-------

 The output from /proc/interrupts changed, very weird!
 BUT eth3 does use a "PCI-MSI-edge" interrupt.

 Guess this is a GOOD state even though it looks weird.

 Unloading NIU driver also GOOD.

+---------
cat /proc/interrupts 
           CPU0       CPU1       CPU2       CPU3       
0x0:        124          1          0          0   IO-APIC-edge      timer
0x1:          1          0          0          1   IO-APIC-edge      i8042
0x3:          2          1          2          2   IO-APIC-edge      serial
0x8:          1          0          0          1   IO-APIC-edge      rtc
0x9:          0          0          0          0   IO-APIC-fasteoi   acpi
0xc:          0          1          2          1   IO-APIC-edge      i8042
0x10:          0          0          0          0   IO-APIC-fasteoi   uhci_hcd:usb1, ehci_hcd:usb6
0x11:          0          0          0          0   IO-APIC-fasteoi   uhci_hcd:usb2
0x12:          0          0          0          0   IO-APIC-fasteoi   uhci_hcd:usb3
0x6000fe:        288        289        290        290   PCI-MSI-edge      cciss0
0x16:         23         24         23         23   IO-APIC-fasteoi   uhci_hcd:usb5
0xb00100:          0          0          0          0   PCI-MSI-edge      eth2
0xb000ff:          0          0          0          0   PCI-MSI-edge      eth2
0xb000fe:          0          0          0          0   PCI-MSI-edge      eth2
0xb000fd:          0          0          0          0   PCI-MSI-edge      eth2
0xb000fc:          0          0          0          0   PCI-MSI-edge      eth2
0xb000fb:          0          0          0          0   PCI-MSI-edge      eth2
0xb000fa:          0          0          0          0   PCI-MSI-edge      eth2
0xb000f9:          0          0          0          0   PCI-MSI-edge      eth2
0xb000f8:          0          0          0          0   PCI-MSI-edge      eth2
0xb000f7:          0          0          0          0   PCI-MSI-edge      eth2
0xb000f6:          0          0          0          0   PCI-MSI-edge      eth2
0xb000f5:          0          0          0          0   PCI-MSI-edge      eth2
0xb000f4:          0          0          0          0   PCI-MSI-edge      eth2
0xb01100:          0          0          0          0   PCI-MSI-edge      eth3
0xb010ff:          0          0          0          0   PCI-MSI-edge      eth3
0xb010fe:          0          0          0          0   PCI-MSI-edge      eth3
0xb010fd:          0          0          0          0   PCI-MSI-edge      eth3
0xb010fc:          0          0          0          0   PCI-MSI-edge      eth3
0xb010fb:          0          0          0          0   PCI-MSI-edge      eth3
0xb010fa:          0          0          0          0   PCI-MSI-edge      eth3
0xb010f9:          0          0          0          0   PCI-MSI-edge      eth3
0xb010f8:          0          0          0          0   PCI-MSI-edge      eth3
0xb010f7:          0          0          0          0   PCI-MSI-edge      eth3
0xb010f6:          0          0          0          0   PCI-MSI-edge      eth3
0x13:          0          0          0          0   IO-APIC-fasteoi   uhci_hcd:usb4
0x300100:        210        210        210        208   PCI-MSI-edge      eth0
NMI:          0          0          0          0   Non-maskable interrupts
LOC:       3630       3265       3103       2711   Local timer interrupts
RES:         34        226         12        417   Rescheduling interrupts
CAL:         89         55         90         78   Function call interrupts
TLB:        253        205        311        267   TLB shootdowns
TRM:          0          0          0          0   Thermal event interrupts
SPU:          0          0          0          0   Spurious interrupts
ERR:          0
MIS:          0
+---------

 Guess it's a GOOD situation...

+------
 git-bisect good
Bisecting: 44 revisions left to test after this
[ba374c9baef910fbc5373541d98c50f15e82c3f8] x86: fix HPET compiler error when not using CONFIG_PCI_MSI
+------

 Compiling ...

+--------
 time make -j6 bzImage modules
#real    9m28.062s
#user    17m7.492s
#sys     1m4.248s
+--------

 Installing ...

 Booting ...

+------
Linux dcu-router-ng 2.6.27-bisect #7 SMP PREEMPT Thu Nov 20 14:52:45 CET 2008 i686 GNU/Linux
+------

 Still looks GOOD (/proc/interrupts still looks weird).

 And rmmod NIU driver GOOD.

+------
git-bisect good
Bisecting: 22 revisions left to test after this
[922402f15a85f7a064926eb1db68cc52bc4d4a91] x86: Add UV partition call v4
+------

 Compiling ...

+--------
 time make -j6 bzImage modules
#real    0m34.622s
#user    0m41.139s
#sys     0m5.812s
+--------

 Install ...

 Booting ...

+-----
Linux dcu-router-ng 2.6.27-bisect #8 SMP PREEMPT Thu Nov 20 15:04:11 CET 2008 i686 GNU/Linux
+-----

 Looks GOOD, and /proc/interrupts changed again! Now the interrupts
 are not in HEX anymore, but in decimal, but still strange/large
 numbers for MSI.
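
 Converting the decimal numbers back to hex shows they are the same
 values as in the earlier hex listing (11534592 is 0xb00100).  The
 sketch below splits such a number into fields.  The layout is an
 assumption, taken from the Kconfig help text quoted earlier
 ("bus/dev/fn + 12bit") and from the values seen here: PCI bus in bits
 20-27, device in bits 15-19, function in bits 12-14, and the low 12
 bits left over.  The function name decode_sparse_irq is hypothetical.

```shell
# Split a sparse IRQ number (decimal or 0x-prefixed hex) into the
# assumed bus/dev/fn fields plus the remaining low 12 bits.
decode_sparse_irq() {
    irq=$(( $1 ))
    bus=$(( (irq >> 20) & 0xff ))   # assumed: PCI bus number
    dev=$(( (irq >> 15) & 0x1f ))   # assumed: PCI device (slot)
    fn=$((  (irq >> 12) & 0x7  ))   # assumed: PCI function
    low=$(( irq & 0xfff ))          # remaining 12 bits
    printf '0x%x bus=%02x dev=%02x fn=%x low12=0x%03x\n' \
        "$irq" "$bus" "$dev" "$fn" "$low"
}
```

 For example, "decode_sparse_irq 11534592" prints
 "0xb00100 bus=0b dev=00 fn=0 low12=0x100", and 11538688 (the first
 eth3 entry) differs only in fn=1.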

 Unloading NIU driver GOOD.

+------
 cat /proc/interrupts 
           CPU0       CPU1       CPU2       CPU3       
  0:        124          0          0          0   IO-APIC-edge      timer
  1:          0          0          1          1   IO-APIC-edge      i8042
  3:          2          2          1          2   IO-APIC-edge      serial
  8:          0          0          1          1   IO-APIC-edge      rtc
  9:          0          0          0          0   IO-APIC-fasteoi   acpi
 12:          1          2          1          0   IO-APIC-edge      i8042
 16:          0          0          0          0   IO-APIC-fasteoi   uhci_hcd:usb1, ehci_hcd:usb6
 17:          0          0          0          0   IO-APIC-fasteoi   uhci_hcd:usb2
 18:          0          0          0          0   IO-APIC-fasteoi   uhci_hcd:usb3
6291710:        828        821        828        823   PCI-MSI-edge      cciss0
 22:         23         24         24         22   IO-APIC-fasteoi   uhci_hcd:usb5
11534592:          0          0          0          0   PCI-MSI-edge      eth2
11534591:          0          0          0          0   PCI-MSI-edge      eth2
11534590:          0          0          0          0   PCI-MSI-edge      eth2
11534589:          0          0          0          0   PCI-MSI-edge      eth2
11534588:          0          0          0          0   PCI-MSI-edge      eth2
11534587:          0          0          0          0   PCI-MSI-edge      eth2
11534586:          0          0          0          0   PCI-MSI-edge      eth2
11534585:          0          0          0          0   PCI-MSI-edge      eth2
11534584:          0          0          0          0   PCI-MSI-edge      eth2
11534583:          0          0          0          0   PCI-MSI-edge      eth2
11534582:          0          0          0          0   PCI-MSI-edge      eth2
11534581:          0          0          0          0   PCI-MSI-edge      eth2
11534580:          0          0          0          0   PCI-MSI-edge      eth2
11538688:          0          0          0          0   PCI-MSI-edge      eth3
11538687:          0          0          0          0   PCI-MSI-edge      eth3
11538686:          0          0          0          0   PCI-MSI-edge      eth3
11538685:          0          0          0          0   PCI-MSI-edge      eth3
11538684:          0          0          0          0   PCI-MSI-edge      eth3
11538683:          0          0          0          0   PCI-MSI-edge      eth3
11538682:          0          0          0          0   PCI-MSI-edge      eth3
11538681:          0          0          0          0   PCI-MSI-edge      eth3
11538680:          0          0          0          0   PCI-MSI-edge      eth3
11538679:          0          0          0          0   PCI-MSI-edge      eth3
11538678:          0          0          0          0   PCI-MSI-edge      eth3
 19:          0          0          0          0   IO-APIC-fasteoi   uhci_hcd:usb4
3145984:       9993       9994       9987       9993   PCI-MSI-edge      eth0
NMI:          0          0          0          0   Non-maskable interrupts
LOC:      10075      11448       8001       8787   Local timer interrupts
RES:        297         17        349         26   Rescheduling interrupts
CAL:        173        189         95        173   Function call interrupts
TLB:        299        259        330        345   TLB shootdowns
TRM:          0          0          0          0   Thermal event interrupts
SPU:          0          0          0          0   Spurious interrupts
ERR:          0
MIS:          0
+------

+-------
 git-bisect good
Bisecting: 11 revisions left to test after this
[a1aca5de08a0cb840a90fb3f729a5940f8d21185] genirq: remove artifacts from sparseirq removal
+-------

 Compiling

+--------
 time make -j6 bzImage modules
#real    9m30.767s
#user    17m10.808s
#sys     1m6.388s
+--------

 Installing ...

 Booting ...

+-----
Linux dcu-router-ng 2.6.27-bisect #9 SMP PREEMPT Thu Nov 20 15:28:17 CET 2008 i686 GNU/Linux
+-----

 BAD kernel version, max IRQ is 34.  And eth3 got assigned an
 IO-APIC-fasteoi IRQ shared with uhci_hcd:usb2.

 Also BAD unloading of NIU driver.

 BUG is somewhere in between:

  git log 922402f15a85f7a064926eb1db68cc52bc4d4a91..a1aca5de08a0cb840a90fb3f729a5940f8d21185 | grep ^commit | wc -l

 11 commits


+-------
git-bisect bad
Bisecting: 5 revisions left to test after this
[3235e936c0cc3589309280b6f59e5096779adae3] x86: remove sparse irq from Kconfig
+-------

 Compiling...

+--------
 time make -j6 bzImage modules
+--------

 Install ...

 Booting

+------
Linux dcu-router-ng 2.6.27-bisect #10 SMP PREEMPT Thu Nov 20 15:56:10 CET 2008 i686 GNU/Linux
+------

 BAD kernel.

 BAD rmmod NIU driver.

+---------
git bisect bad
Bisecting: 2 revisions left to test after this
[4c66a73f0796dacc2ff0d4af75794ec843ceb3d1] x86: sparse_irq: fix typo in debug print out
+---------

 Compiling...

+------
 time make -j6 bzImage modules
#real    7m23.814s
#user    12m15.718s
#sys     0m42.183s
+------

 Config change prompting:

+-----
 Support sparse irq numbering (HAVE_SPARSE_IRQ) [Y/n/?] (NEW) Y
+-----
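
 Since the good/bad result turns out to follow this option, it may
 help to record for each bisect step whether it ended up enabled in
 the build's .config.  A minimal sketch, assuming the option appears
 as CONFIG_HAVE_SPARSE_IRQ (the symbol shown in the prompt above); the
 helper name config_enabled is hypothetical.

```shell
# Succeed if CONFIG_<symbol> is set to "y" in the given .config
# (default: ./.config in the current build tree).
config_enabled() {
    grep -q "^CONFIG_$1=y\$" "${2:-.config}"
}

# Print a one-line note to keep next to each bisect result.
sparse_irq_note() {
    if config_enabled HAVE_SPARSE_IRQ "${1:-.config}"; then
        echo "sparse irq: y"
    else
        echo "sparse irq: not set"
    fi
}
```

 Calling sparse_irq_note after "make oldconfig" at each step would
 make it visible when the option silently disappears from the build.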

 Installing ...

 Booting ...

+-------
Linux dcu-router-ng 2.6.27-bisect #11 SMP PREEMPT Thu Nov 20 16:19:29 CET 2008 i686 GNU/Linux
+-------

 GOOD!!!

+--------
 cat /proc/interrupts 
           CPU0       CPU1       CPU2       CPU3       
  0:        124          0          0          0   IO-APIC-edge      timer
  1:          0          0          1          1   IO-APIC-edge      i8042
  3:          2          2          2          2   IO-APIC-edge      serial
  8:          1          0          0          1   IO-APIC-edge      rtc
  9:          0          0          0          0   IO-APIC-fasteoi   acpi
 12:          1          2          1          0   IO-APIC-edge      i8042
 16:          0          0          0          0   IO-APIC-fasteoi   uhci_hcd:usb1, ehci_hcd:usb6
 17:          0          0          0          0   IO-APIC-fasteoi   uhci_hcd:usb2
 18:          0          0          0          0   IO-APIC-fasteoi   uhci_hcd:usb3
6291710:        285        288        293        287   PCI-MSI-edge      cciss0
11534592:          0          0          0          0   PCI-MSI-edge      eth2
11534591:          0          0          0          0   PCI-MSI-edge      eth2
11534590:          0          0          0          0   PCI-MSI-edge      eth2
11534589:          0          0          0          0   PCI-MSI-edge      eth2
11534588:          0          0          0          0   PCI-MSI-edge      eth2
11534587:          0          0          0          0   PCI-MSI-edge      eth2
11534586:          0          0          0          0   PCI-MSI-edge      eth2
11534585:          0          0          0          0   PCI-MSI-edge      eth2
11534584:          0          0          0          0   PCI-MSI-edge      eth2
11534583:          0          0          0          0   PCI-MSI-edge      eth2
11534582:          0          0          0          0   PCI-MSI-edge      eth2
11534581:          0          0          0          0   PCI-MSI-edge      eth2
11534580:          0          0          0          0   PCI-MSI-edge      eth2
 22:         23         24         23         23   IO-APIC-fasteoi   uhci_hcd:usb5
11538688:          0          0          0          0   PCI-MSI-edge      eth3
11538687:          0          0          0          0   PCI-MSI-edge      eth3
11538686:          0          0          0          0   PCI-MSI-edge      eth3
11538685:          0          0          0          0   PCI-MSI-edge      eth3
11538684:          0          0          0          0   PCI-MSI-edge      eth3
11538683:          0          0          0          0   PCI-MSI-edge      eth3
11538682:          0          0          0          0   PCI-MSI-edge      eth3
11538681:          0          0          0          0   PCI-MSI-edge      eth3
11538680:          0          0          0          0   PCI-MSI-edge      eth3
11538679:          0          0          0          0   PCI-MSI-edge      eth3
11538678:          0          0          0          0   PCI-MSI-edge      eth3
 19:          0          0          0          0   IO-APIC-fasteoi   uhci_hcd:usb4
3145984:        244        242        238        241   PCI-MSI-edge      eth0
NMI:          0          0          0          0   Non-maskable interrupts
LOC:       3715       3104       2853       2542   Local timer interrupts
RES:         88         52        280        258   Rescheduling interrupts
CAL:         76         75         93         59   Function call interrupts
TLB:        245        241        312        283   TLB shootdowns
TRM:          0          0          0          0   Thermal event interrupts
SPU:          0          0          0          0   Spurious interrupts
ERR:          0
MIS:          0
+---------


+---------
 git-bisect good
Bisecting: 1 revisions left to test after this
[7ef0c30dbf96a8d9a234e90c248eb19df3c031be] genirq: define nr_irqs for architectures with GENERIC_HARDIRQS=n
+----------

 Compiling ...

+------
 time make -j6 bzImage modules
+------

 Install...

 Boot ...

+-------
Linux dcu-router-ng 2.6.27-bisect #12 SMP PREEMPT Thu Nov 20 16:33:11 CET 2008 i686 GNU/Linux
+-------

+--------
 cat /proc/interrupts 
           CPU0       CPU1       CPU2       CPU3       
  0:        124          0          0          0   IO-APIC-edge      timer
  1:          0          0          1          1   IO-APIC-edge      i8042
  3:          1          2          2          2   IO-APIC-edge      serial
  8:          2          0          0          0   IO-APIC-edge      rtc
  9:          0          0          0          0   IO-APIC-fasteoi   acpi
 12:          1          2          1          0   IO-APIC-edge      i8042
 16:          0          0          0          0   IO-APIC-fasteoi   uhci_hcd:usb1, ehci_hcd:usb6
 17:          0          0          0          0   IO-APIC-fasteoi   uhci_hcd:usb2
 18:          0          0          0          0   IO-APIC-fasteoi   uhci_hcd:usb3
6291710:        268        269        268        267   PCI-MSI-edge      cciss0
11534592:          0          0          0          0   PCI-MSI-edge      eth2
11534591:          0          0          0          0   PCI-MSI-edge      eth2
11534590:          0          0          0          0   PCI-MSI-edge      eth2
11534589:          0          0          0          0   PCI-MSI-edge      eth2
11534588:          0          0          0          0   PCI-MSI-edge      eth2
11534587:          0          0          0          0   PCI-MSI-edge      eth2
11534586:          0          0          0          0   PCI-MSI-edge      eth2
11534585:          0          0          0          0   PCI-MSI-edge      eth2
11534584:          0          0          0          0   PCI-MSI-edge      eth2
11534583:          0          0          0          0   PCI-MSI-edge      eth2
11534582:          0          0          0          0   PCI-MSI-edge      eth2
11534581:          0          0          0          0   PCI-MSI-edge      eth2
11534580:          0          0          0          0   PCI-MSI-edge      eth2
11538688:          0          0          0          0   PCI-MSI-edge      eth3
11538687:          0          0          0          0   PCI-MSI-edge      eth3
11538686:          0          0          0          0   PCI-MSI-edge      eth3
11538685:          0          0          0          0   PCI-MSI-edge      eth3
11538684:          0          0          0          0   PCI-MSI-edge      eth3
11538683:          0          0          0          0   PCI-MSI-edge      eth3
11538682:          0          0          0          0   PCI-MSI-edge      eth3
11538681:          0          0          0          0   PCI-MSI-edge      eth3
11538680:          0          0          0          0   PCI-MSI-edge      eth3
11538679:          0          0          0          0   PCI-MSI-edge      eth3
11538678:          0          0          0          0   PCI-MSI-edge      eth3
 19:          0          0          0          0   IO-APIC-fasteoi   uhci_hcd:usb4
 22:         25         23         24         25   IO-APIC-fasteoi   uhci_hcd:usb5
3145984:        175        174        176        178   PCI-MSI-edge      eth0
NMI:          0          0          0          0   Non-maskable interrupts
LOC:       3508       2902       2765       2489   Local timer interrupts
RES:        238         35        461          6   Rescheduling interrupts
CAL:         61         90         59         81   Function call interrupts
TLB:        257        220        299        300   TLB shootdowns
TRM:          0          0          0          0   Thermal event interrupts
SPU:          0          0          0          0   Spurious interrupts
ERR:          0
MIS:          0
+--------

 GOOD.

+----------
git-bisect good
3235e936c0cc3589309280b6f59e5096779adae3 is first bad commit
commit 3235e936c0cc3589309280b6f59e5096779adae3
Author: Thomas Gleixner <tglx@linutronix.de>
Date:   Wed Oct 15 13:16:00 2008 +0200

    x86: remove sparse irq from Kconfig
    
    This code is not ready yet.
    
    Signed-off-by: Thomas Gleixner <tglx@linutronix.de>


:040000 040000 6043e32465556e828de0fbb6aa497b277239af01 2dd75ba207990d83a3a4c7b7b16abccfe2d5e10d M    arch
+--------

 Found bad commit:
   3235e936c0cc3589309280b6f59e5096779adae3



Git bisect LOG
~~~~~~~~~~~~~~

+-------
git-bisect log
git-bisect start
# good: [3fa8749e584b55f1180411ab1b51117190bac1e5] Linux 2.6.27
git-bisect good 3fa8749e584b55f1180411ab1b51117190bac1e5
# bad: [92b29b86fe2e183d44eb467e5e74a5f718ef2e43] Merge branch 'tracing-v28-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
git-bisect bad 92b29b86fe2e183d44eb467e5e74a5f718ef2e43
# good: [af5c2bd16ac2e5688c3bf46ea1f95112d696d294] x86: fix virt_addr_valid() with CONFIG_DEBUG_VIRTUAL=y, v2
git-bisect good af5c2bd16ac2e5688c3bf46ea1f95112d696d294
# good: [36ac1d2f323f8bf8bc10c25b88f617657720e241] Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input
git-bisect good 36ac1d2f323f8bf8bc10c25b88f617657720e241
# good: [1aece34833721d64eb33fc15cd923c727296d3d3] container freezer: rename check_if_frozen()
git-bisect good 1aece34833721d64eb33fc15cd923c727296d3d3
# good: [1d9a8a47d659f053abeca9ece45651b4d94780c8] Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse
git-bisect good 1d9a8a47d659f053abeca9ece45651b4d94780c8
# bad: [dd3a1db900f2a215a7d7dd71b836e149a6cf5fed] genirq: improve include files
git-bisect bad dd3a1db900f2a215a7d7dd71b836e149a6cf5fed
# good: [db4b5525caafd846ec20f95afbc6403c792e22cf] x86: apic_64.c - setup_APIC_timer has to be __cpuinit function
git-bisect good db4b5525caafd846ec20f95afbc6403c792e22cf
# good: [ba374c9baef910fbc5373541d98c50f15e82c3f8] x86: fix HPET compiler error when not using CONFIG_PCI_MSI
git-bisect good ba374c9baef910fbc5373541d98c50f15e82c3f8
# good: [922402f15a85f7a064926eb1db68cc52bc4d4a91] x86: Add UV partition call v4
git-bisect good 922402f15a85f7a064926eb1db68cc52bc4d4a91
# bad: [a1aca5de08a0cb840a90fb3f729a5940f8d21185] genirq: remove artifacts from sparseirq removal
git-bisect bad a1aca5de08a0cb840a90fb3f729a5940f8d21185
# bad: [3235e936c0cc3589309280b6f59e5096779adae3] x86: remove sparse irq from Kconfig
git-bisect bad 3235e936c0cc3589309280b6f59e5096779adae3
# good: [4c66a73f0796dacc2ff0d4af75794ec843ceb3d1] x86: sparse_irq: fix typo in debug print out
git-bisect good 4c66a73f0796dacc2ff0d4af75794ec843ceb3d1
# good: [7ef0c30dbf96a8d9a234e90c248eb19df3c031be] genirq: define nr_irqs for architectures with GENERIC_HARDIRQS=n
git-bisect good 7ef0c30dbf96a8d9a234e90c248eb19df3c031be
+-------

Email
~~~~~

 To:
  Thomas Gleixner <tglx@linutronix.de>
  David Miller <davem@davemloft.net>, 
  Jesper Dangaard Brouer <jdb@comx.dk>, 
  netdev <netdev@vger.kernel.org>, 
  linux-kernel@vger.kernel.org, 
  Robert Olsson <Robert.Olsson@data.slu.se>

 Subj.:
  Regression: Bisected, IRQ and MSI allocations screwed without sparse irq


 Hi Thomas Gleixner,

 I have bisected a regression to your commit
 3235e936c0cc3589309280b6f59e5096779adae3, "x86: remove sparse irq
 from Kconfig".

 It's actually not necessarily your fault, as your commit simply removes
 the config option HAVE_SPARSE_IRQ.  This reveals the bug / regression
 I'm exposed to.

 Guess I should bisect again to find the exact faulty commit, but I'm
 rather sick of bisecting at the moment, and thought you might have a
 better idea what's going wrong.  I would rather spend my time
 performance tuning the multiqueue routing code...

 [The regression]:

 This came up during my testing of the Sun Neptune based NICs.  On
 kernel 2.6.27 I get really good performance (900-1200kpps) compared
 to 2.6.28 (davem git net-2.6).

 The cause of this problem (tracked down together with Robert Olsson)
 is that on 2.6.28 I have a lot less IRQs available.  It seems max 34
 IRQs.  Due to the reduced number of IRQs the NIU driver cannot get
 enough IRQs to the interfaces, and starts to use "IO-APIC" based
 IRQs.

 On kernel 2.6.28: My eth2 is using 10 IRQs all "PCI-MSI-edge".  BUT
 my eth3 is using a single IRQ using "IO-APIC-fasteoi" and shared with
 the usb driver.  That is my performance problem on 2.6.28.

 [Other related bugs]:
 Unloading the "niu" driver gives a kernel BUG.

Patch

diff --git a/drivers/net/niu.c b/drivers/net/niu.c
index 9acb5d7..d8463b1 100644
--- a/drivers/net/niu.c
+++ b/drivers/net/niu.c
@@ -51,8 +51,7 @@  MODULE_VERSION(DRV_MODULE_VERSION);
 #ifndef readq
 static u64 readq(void __iomem *reg)
 {
-	return (((u64)readl(reg + 0x4UL) << 32) |
-		(u64)readl(reg));
+	return ((u64) readl(reg)) | (((u64) readl(reg + 4UL)) << 32);
 }
 
 static void writeq(u64 val, void __iomem *reg)