
SunFire V240 hangs

Message ID 20090408.152626.220030913.davem@davemloft.net
State RFC
Delegated to: David Miller

Commit Message

David Miller April 8, 2009, 10:26 p.m. UTC
From: Sam Ravnborg <sam@ravnborg.org>
Date: Wed, 8 Apr 2009 23:45:51 +0200

> The easiest is to comment out the line:
> 
>     arch_initcall(pcr_arch_init);
> 
> In the file: arch/sparc/kernel/pcr.c

It won't build since this makes pcr_arch_init() unused
and sparc builds with -Werror.

The patch below might work better.

But even if the NMI isn't working, the NMI test should just time out
and fail if you wait for 10 or 20 seconds.

--
To unsubscribe from this list: send the line "unsubscribe sparclinux" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Comments

seb@frankengul.org April 8, 2009, 10:44 p.m. UTC | #1
On Wed, Apr 08, 2009 at 03:26:26PM -0700, David Miller wrote:
> +#endif
>  
>  	switch (tlb_type) {
>  	case hypervisor:

Strange, with arch_initcall(pcr_arch_init); commented out, the kernel builds fine.

However, the good news is that this kernel boots fine.
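
A possible explanation for the clean build: the patch context in this thread shows the function declared as "int __init pcr_arch_init(void)", with no "static". GCC's -Wunused-function only fires for functions with internal linkage, so an uncalled non-static function gives -Werror nothing to promote. A minimal sketch of the distinction, with hypothetical demo_* names that are not taken from the kernel:

```c
/* Sketch only: demo_* names are hypothetical, not kernel symbols. */

/* External linkage: another translation unit might call this, so GCC
 * does not emit -Wunused-function even if nothing in this file does. */
int demo_arch_init(void)
{
	return 0;
}

/* Internal linkage: if nothing in this file calls it, GCC warns with
 * -Wunused-function under -Wall, and -Werror turns that into a build
 * failure. The attribute silences the warning so this sketch builds
 * either way. */
static int __attribute__((unused)) demo_static_init(void)
{
	return 0;
}
```

Compiling this file with "gcc -Wall -Werror -c" and then removing the attribute from demo_static_init() is one way to see the difference.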

	Seb

David Miller April 8, 2009, 10:45 p.m. UTC | #2
From: seb@frankengul.org
Date: Thu, 9 Apr 2009 00:44:19 +0200

> On Wed, Apr 08, 2009 at 03:26:26PM -0700, David Miller wrote:
>> +#endif
>>  
>>  	switch (tlb_type) {
>>  	case hypervisor:
> 
> Strange, with arch_initcall(pcr_arch_init); commented, the kernel builds fine;
> 
> However, the good news is that this kernel boots fine.

I suspect it will boot fine if you wait for the NMI tester to
time out too.
seb@frankengul.org April 8, 2009, 11:01 p.m. UTC | #3
On Wed, Apr 08, 2009 at 03:45:53PM -0700, David Miller wrote:
> > Strange, with arch_initcall(pcr_arch_init); commented, the kernel builds fine;
> > 
> > However, the good news is that this kernel boots fine.
> 
> I suspect it will boot fine if you wait for the NMI tester to
> timeout too.

Hmm, how long does it take to time out?

	Seb
David Miller April 8, 2009, 11:02 p.m. UTC | #4
From: seb@frankengul.org
Date: Thu, 9 Apr 2009 01:01:01 +0200

> On Wed, Apr 08, 2009 at 03:45:53PM -0700, David Miller wrote:
>> > Strange, with arch_initcall(pcr_arch_init); commented, the kernel builds fine;
>> > 
>> > However, the good news is that this kernel boots fine.
>> 
>> I suspect it will boot fine if you wait for the NMI tester to
>> timeout too.
> 
> Hum, how long does it takes to timeout ?

It can take from 20 to 30 seconds, for each processor.
seb@frankengul.org April 8, 2009, 11:25 p.m. UTC | #5
On Wed, Apr 08, 2009 at 04:02:08PM -0700, David Miller wrote:
> From: seb@frankengul.org
> Date: Thu, 9 Apr 2009 01:01:01 +0200
> 
> > On Wed, Apr 08, 2009 at 03:45:53PM -0700, David Miller wrote:
> >> > Strange, with arch_initcall(pcr_arch_init); commented, the kernel builds fine;
> >> > 
> >> > However, the good news is that this kernel boots fine.
> >> 
> >> I suspect it will boot fine if you wait for the NMI tester to
> >> timeout too.
> > 
> > Hum, how long does it takes to timeout ?
> 
> It can take from 20 to 30 seconds, for each processor.

If I remember correctly, I left the machine on for several hours the last time I tried, and the kernel never
recovered.
This time, the machine froze completely. Five minutes later, there is still no sign of life on the console.

Here's the last known output of the kernel.

[   49.357679] initcall backlight_class_init+0x0/0x74 returned 0 after 0 usecs
[   49.357685] calling  tty_class_init+0x0/0x38 @ 1
[   49.357700] initcall tty_class_init+0x0/0x38 returned 0 after 0 usecs
[   49.357706] calling  vtconsole_class_init+0x0/0xe8 @ 1
[   49.357749] initcall vtconsole_class_init+0x0/0xe8 returned 0 after 0 usecs
[   49.357757] calling  register_node_type+0x0/0x9c @ 1
[   49.357777] initcall register_node_type+0x0/0x9c returned 0 after 0 usecs
[   49.357783] calling  spi_init+0x0/0x7c @ 1
[   49.357813] initcall spi_init+0x0/0x7c returned 0 after 0 usecs
[   49.357818] calling  i2c_init+0x0/0x70 @ 1
[   49.357866] initcall i2c_init+0x0/0x70 returned 0 after 0 usecs
[   49.357872] calling  cpu_type_probe+0x0/0x274 @ 1
[   49.357879] initcall cpu_type_probe+0x0/0x274 returned 0 after 0 usecs
[   49.357885] calling  pcr_arch_init+0x0/0x14c @ 1
[   55.


	Seb
David Miller April 8, 2009, 11:27 p.m. UTC | #6
From: seb@frankengul.org
Date: Thu, 9 Apr 2009 01:25:42 +0200

> Here's the last known output of the kernel.
...
> [   49.357879] initcall cpu_type_probe+0x0/0x274 returned 0 after 0 usecs
> [   49.357885] calling  pcr_arch_init+0x0/0x14c @ 1
> [   55.

It looks like it's trying to print the timeout message but
hangs doing so :-)
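
The quoted log stops right after a "calling" line that has no matching "initcall ... returned" line. That check can be automated. A minimal sketch in C, assuming the initcall_debug line format seen in this thread; demo_find_hung_initcall() is a hypothetical helper, not an existing kernel or distribution tool:

```c
#include <stdio.h>
#include <string.h>

/*
 * Scan initcall_debug-style log text and report the last initcall that
 * was announced with "calling" but never reported as "returned".
 * Returns 1 and copies the name into out if one is found, else 0.
 * Assumes initcalls run one at a time, as in the log above.
 */
static int demo_find_hung_initcall(const char *log, char *out, size_t outsz)
{
	char line[256], name[128], pending[128] = "";

	while (*log) {
		/* Copy one line (truncated if too long) into a local buffer. */
		size_t len = strcspn(log, "\n");
		size_t n = len < sizeof(line) - 1 ? len : sizeof(line) - 1;
		const char *p;

		memcpy(line, log, n);
		line[n] = '\0';
		log += len;
		if (*log == '\n')
			log++;

		if ((p = strstr(line, "calling ")) &&
		    sscanf(p, "calling %127s", name) == 1) {
			/* A new initcall started; remember it. */
			strcpy(pending, name);
		} else if ((p = strstr(line, "initcall ")) &&
			   strstr(p, " returned ") &&
			   sscanf(p, "initcall %127s", name) == 1 &&
			   strcmp(name, pending) == 0) {
			/* The pending initcall completed. */
			pending[0] = '\0';
		}
	}

	if (!pending[0] || outsz == 0)
		return 0;
	snprintf(out, outsz, "%s", pending);
	return 1;
}
```

Fed the last few lines of the log above, the helper should report pcr_arch_init+0x0/0x14c as the call that never returned.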

This really should work, especially with the cpus you have
in that machine.

I'll try to find some time to put together some debugging
patches, but that's not going to happen any time soon as
I'm busy with several other tasks at the moment.
seb@frankengul.org April 8, 2009, 11:32 p.m. UTC | #7
On Wed, Apr 08, 2009 at 04:27:49PM -0700, David Miller wrote:
> From: seb@frankengul.org
> Date: Thu, 9 Apr 2009 01:25:42 +0200
> 
> > Here's the last known output of the kernel.
> ...
> > [   49.357879] initcall cpu_type_probe+0x0/0x274 returned 0 after 0 usecs
> > [   49.357885] calling  pcr_arch_init+0x0/0x14c @ 1
> > [   55.
> 
> It looks like it's trying to print the timeout message but
> hangs doing so :-)
> 
> This really should work, especially with the cpus you have
> in that machine.
> 
> I'll try to find some time to put together some debugging
> patches, but that's not going to happen any time soon as
> I'm busy with several other tasks at the moment.

No problem, feel free to send me anything you think is useful when you have time.

	Seb

David Miller April 24, 2009, 3:44 p.m. UTC | #8
I notice in your kernel logs that you're using an Ubuntu gcc version
4.3.x.

How in the world is that possible?

Ubuntu sparc support stopped after Ubuntu version 7.10 and that
shipped with gcc-4.1.x as the default compiler.  That's what I use
for most of my testing FWIW.  On my debian boxes I use the
default gcc-4.3.2 based compiler.

If you're using a custom built compiler, we're going to have to do
some tests to rule that out as the cause of your problems I think.
Frans van Berckel April 24, 2009, 7 p.m. UTC | #9
If I am well informed, there's a repository for the next Ubuntu release, called
Jaunty. The release after that, called Karmic, is also available online now.

http://ports.ubuntu.com/dists/jaunty/

deb http://ports.ubuntu.com/ubuntu-ports/ jaunty main restricted
deb http://ports.ubuntu.com/ubuntu-ports/ jaunty universe multiverse

http://ports.ubuntu.com/dists/karmic/

deb http://ports.ubuntu.com/ubuntu-ports/ karmic main restricted
deb http://ports.ubuntu.com/ubuntu-ports/ karmic universe multiverse

Thanks,


Frans van Berckel

On Fri, 2009-04-24 at 08:44 -0700, David Miller wrote:
> I notice in your kernel logs that you're using an Ubuntu gcc version
> 4.3.x.
> 
> How in the world is that possible?
> 
> Ubuntu sparc support stopped after Ubuntu version 7.10 and that
> shipped with gcc-4.1.x as the default compiler.  That's what I use
> for most of my testing FWIW.  On my debian boxes I use the
> default gcc-4.3.2 based compiler.
> 
> If you're using a custom built compiler, we're going to have to do
> some tests to rule that out as the cause of your problems I think.
David Miller April 25, 2009, 5:30 a.m. UTC | #10
From: Frans van Berckel <fberckel@xs4all.nl>
Date: Fri, 24 Apr 2009 21:00:23 +0200

> If I am well informed, there's a repository for the next Ubuntu release, called
> Jaunty. The release after that, called Karmic, is also available online now.
> 
> http://ports.ubuntu.com/dists/jaunty/
> 
> deb http://ports.ubuntu.com/ubuntu-ports/ jaunty main restricted
> deb http://ports.ubuntu.com/ubuntu-ports/ jaunty universe multiverse

I'm going to have to ask you to compile the kernel with something
more widely tested than this; I didn't even know it existed.

I have a good feeling that some of these weird startup hangs and
such are GCC miscompiles.
seb@frankengul.org April 25, 2009, 11:30 p.m. UTC | #11
David Miller wrote:
> I notice in your kernel logs that you're using an Ubuntu gcc version
> 4.3.x.
>
> How in the world is that possible?
>
> Ubuntu sparc support stopped after Ubuntu version 7.10 and that
> shipped with gcc-4.1.x as the default compiler.  That's what I use
> for most of my testing FWIW.  On my debian boxes I use the
> default gcc-4.3.2 based compiler.
>
> If you're using a custom built compiler, we're going to have to do
> some tests to rule that out as the cause of your problems I think.
>   
The sparc port is unofficial but available on ports.ubuntu.com for all 
versions up to 9.04.
PPC was recently demoted to ports as well, since Apple dropped that arch 
in favor of x86.
The compiler is not custom. It came from the Ubuntu archive.
I don't think the compiler is the culprit in this case because all of 
the kernels built before 2.6.29 are working OK
(I could be wrong, but it's unlikely).
I have a Debian sid sparc install available. I'll try to build a kernel with 
that compiler and see if it changes anything.

    Seb
David Miller April 26, 2009, 1:12 a.m. UTC | #12
From: Sébastien Bernard <seb@frankengul.org>
Date: Sun, 26 Apr 2009 01:30:35 +0200

> The sparc port is unofficial but available on ports.ubuntu.com for all
> versions up to 9.04.
> PPC was recently demoted to ports as well, since Apple dropped that
> arch in favor of x86.
> The compiler is not custom. It came from the ubuntu archive.
> I don't think the compiler is the culprit in this case because all of
> the kernels built before 2.6.29 are working OK
> (I could be wrong, but it's unlikely).

I'm worried about this particular bug because the NMI code added
in 2.6.29, where this bootup is hanging, has a loop that waits for a
variable to change, with a memory barrier in there.  It's exactly the kind
of code a compiler bug can break, and has broken in the past.
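
To illustrate the pattern being described, here is a minimal sketch of a bounded wait loop that polls a counter with a compiler barrier between reads. It is not the kernel's NMI test code: every demo_* name is hypothetical, the barrier uses GCC-style inline asm, and a plain callback stands in for the interrupt that would bump the counter on real hardware.

```c
#include <stdbool.h>
#include <stddef.h>

/* GCC/Clang compiler barrier: the compiler may not cache memory values
 * across this point, so the counter is re-read on every iteration. */
#define demo_barrier() __asm__ __volatile__("" ::: "memory")

/* Hypothetical counter; in a real system an interrupt handler would
 * increment it behind the loop's back. */
static unsigned long demo_tick_count;

/* Stand-in for the interrupt handler in this single-threaded sketch. */
static void demo_fake_interrupt(void)
{
	demo_tick_count++;
}

/*
 * Poll until the counter changes or max_spins iterations pass.
 * poke, if non-NULL, is called once per iteration to simulate the
 * interrupt. Returns true if a change was seen, false on timeout.
 */
static bool demo_wait_for_tick(unsigned long max_spins, void (*poke)(void))
{
	unsigned long start = demo_tick_count;
	unsigned long i;

	for (i = 0; i < max_spins; i++) {
		if (poke)
			poke();
		demo_barrier();
		if (demo_tick_count != start)
			return true;
	}
	return false;	/* timed out: the counter never moved */
}
```

If a compiler wrongly hoisted the read of the counter out of such a loop, the exit condition could never be observed, which is the kind of failure a miscompile would produce here. The bounded spin count is what lets a test like this fail cleanly instead of hanging.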

> I have a sid debian sparc available. I'll try to build a kernel with
> that compiler and try if it changes anything.

Thanks.
seb@frankengul.org April 26, 2009, 9:59 p.m. UTC | #13
David Miller wrote:
> From: Sébastien Bernard <seb@frankengul.org>
> Date: Sun, 26 Apr 2009 01:30:35 +0200
>
>   
>> The sparc port is unofficial but available on ports.ubuntu.com for all
>> versions up to 9.04.
>> PPC was recently demoted to ports as well, since Apple dropped that
>> arch in favor of x86.
>> The compiler is not custom. It came from the ubuntu archive.
>> I don't think the compiler is the culprit in this case because all of
>> the kernels built before 2.6.29 are working OK
>> (I could be wrong, but it's unlikely).
>>     
>
> I'm worried about this particular bug because the NMI code added
> in 2.6.29, where this bootup is hanging, has a loop that waits for a
> variable to change, with a memory barrier in there.  It's exactly the kind
> of code a compiler bug can break, and has broken in the past.
>   
>> I have a sid debian sparc available. I'll try to build a kernel with
>> that compiler and try if it changes anything
Well, it still hangs at the same place.
The strange thing is that the initcall_debug parameter does not seem to work. No 
initcall output is printed. Really weird.
Below is the transcript from the logs:
--------------------------------------
boot: LinuxOLD-rescue initcall_debug=1 ignore_loglevel
Allocated 64 Megs of memory at 0x40000000 for kernel
Uncompressing image...
Loaded kernel version 2.6.29
Loading initial ramdisk (8545117 bytes at 0x103F000000 phys, 0x40C00000 
virt)...
[    0.000000] PROMLIB: Sun IEEE Boot Prom 'OBP 4.22.33 2007/06/18 12:45'
[    0.000000] PROMLIB: Root node compatible:
[    0.000000] Initializing cgroup subsys cpuset
[    0.000000] Initializing cgroup subsys cpu
[    0.000000] Linux version 2.6.29.1 (seb@calypso) (gcc version 4.3.3 
(Debian 4.3.3-8) ) #1 SMP Sun Apr 26 04:00:36 CEST 2009
[    0.000000] console [earlyprom0] enabled
[    0.000000] ARCH: SUN4U
[    0.000000] Ethernet address: 00:03:ba:83:9d:e5
[    0.000000] Kernel: Using 2 locked TLB entries for main kernel image.
[    0.000000] Remapping the kernel... done.
[    0.000000] OF stdout device is: /pci@1e,600000/isa@7/serial@0,3f8
[    0.000000] PROM: Built device tree with 90020 bytes of memory.
[    0.000000] [0000000200000000-fffff80000400000] page_structs=131072 
node=0 entry=0/0
[    0.000000] [0000000200000000-fffff80000800000] page_structs=131072 
node=0 entry=1/0
[    0.000000] [0000000204000000-fffff80000c00000] page_structs=131072 
node=0 entry=16/0
[    0.000000] [0000000204000000-fffff80001000000] page_structs=131072 
node=0 entry=17/0
[    0.000000] [0000000206000000-fffff80001400000] page_structs=131072 
node=0 entry=24/0
[    0.000000] [0000000206000000-fffff80001800000] page_structs=131072 
node=0 entry=25/0
[    0.000000] [0000000220000000-fffff81000800000] page_structs=131072 
node=1 entry=128/0
[    0.000000] [0000000220000000-fffff81000c00000] page_structs=131072 
node=1 entry=129/0
[    0.000000] Zone PFN ranges:
[    0.000000]   Normal   0x00000000 -> 0x0081ff74
[    0.000000] Movable zone start PFN for each node
[    0.000000] early_node_map[8] active PFN ranges
[    0.000000]     0: 0x00000000 -> 0x00020000
[    0.000000]     0: 0x00100000 -> 0x00120000
[    0.000000]     0: 0x00180000 -> 0x001a0000
[    0.000000]     1: 0x00800000 -> 0x0081f7ff
[    0.000000]     1: 0x0081f800 -> 0x0081ff30
[    0.000000]     1: 0x0081ff38 -> 0x0081ff40
[    0.000000]     1: 0x0081ff48 -> 0x0081ff49
[    0.000000]     1: 0x0081ff70 -> 0x0081ff74
[    0.000000] Booting Linux...
[    0.000000] Built 2 zonelists in Node order, mobility grouping on.  
Total pages: 509757
[    0.000000] Policy zone: Normal
[    0.000000] PID hash table entries: 4096 (order: 12, 32768 bytes)
[    0.000000] clocksource: mult[535555] shift[16]
[    0.000000] clockevent: mult[3126e97] shift[32]
[   69.011963] Console: colour dummy device 80x25
[   69.070377] console handover: boot [earlyprom0] -> real [tty0]
--------------------------------------
And that's all. The screen clears, the cursor moves a few lines down, and the machine hangs.
So, should I try another gcc? 4.2 maybe?

Seb
David Miller April 27, 2009, 6:15 a.m. UTC | #14
From: Sébastien Bernard <seb@frankengul.org>
Date: Sun, 26 Apr 2009 23:59:26 +0200

> [ 0.000000] Linux version 2.6.29.1 (seb@calypso) (gcc version 4.3.3
> (Debian 4.3.3-8) ) #1 SMP Sun Apr 26 04:00:36 CEST 2009

I have all updates installed on my debian system and my compiler
version is 4.3.2, not 4.3.3

This isn't helping to eliminate variables. :-/

seb@frankengul.org April 27, 2009, 2:40 p.m. UTC | #15
On Sun, Apr 26, 2009 at 11:15:16PM -0700, David Miller wrote:
> From: Sébastien Bernard <seb@frankengul.org>
> Date: Sun, 26 Apr 2009 23:59:26 +0200
> 
> > [ 0.000000] Linux version 2.6.29.1 (seb@calypso) (gcc version 4.3.3
> > (Debian 4.3.3-8) ) #1 SMP Sun Apr 26 04:00:36 CEST 2009
> 
> I have all updates installed on my debian system and my compiler
> version is 4.3.2, not 4.3.3
> 
> This isn't helping to eliminate variables. :-/

The package search gives:
http://packages.debian.org/search?keywords=gcc-4.3&searchon=names&suite=all&section=all

4.3.2-1 in stable
4.3.3-3 in testing
4.3.3-8 in unstable (the one I have)

I already mentioned it was sid, a.k.a. unstable.
I'm willing to downgrade to check. Which one should I pick: testing or stable?

Seb
David Miller April 28, 2009, 11:36 a.m. UTC | #16
From: seb@frankengul.org
Date: Mon, 27 Apr 2009 16:40:42 +0200

> I'm willing to downgrade for testing to check. Which one should I
> pick ? testing or stable ?

I built a kernel using tools that work for me, using your
provided config, and double checked that it boots on
my ultra45.  Give it a spin:

	http://vger.kernel.org/~davem/v240.image

It of course doesn't have the modules, and therefore won't
be able to mount root.  But we will see if it passes the
NMI test, which is where you see problems.
Sébastien Bernard April 28, 2009, 12:52 p.m. UTC | #17
David Miller wrote:
> From: seb@frankengul.org
> Date: Mon, 27 Apr 2009 16:40:42 +0200
>
>   
>> I'm willing to downgrade for testing to check. Which one should I
>> pick ? testing or stable ?
>>     
>
> I built a kernel using tools that work for me, using your
> provided config, and double checked that it boots on
> my ultra45.  Give it a spin:
>
> 	http://vger.kernel.org/~davem/v240.image
>
> It of course doesn't have the modules, and therefore won't
> be able to mount root.  But we will see if it passes the
> NMI test, which is where you see problems.
Unfortunately, this kernel hangs in the same way as the others.
Same place, same symptom.
I'm afraid the compiler has nothing to do with that problem.

    Seb

Here is the log:
Probing system devices
Probing memory
Probing I/O buses

Sun Fire V240, No Keyboard
Copyright 2007 Sun Microsystems, Inc.  All rights reserved.
OpenBoot 4.22.33, 4096 MB memory installed, Serial #58957285.
Ethernet address 0:3:ba:83:9d:e5, Host ID: 83839de5.



Rebooting with command: boot
Boot device: disk1  File and args:
SILO Version 1.4.14
boot:
Linux                    LinuxOLD                 Linux-rescue            
LinuxOLD-rescue          Linux2630                test                    
boot: test
Allocated 64 Megs of memory at 0x40000000 for kernel
Loaded kernel version 2.6.29
[    0.000000] PROMLIB: Sun IEEE Boot Prom 'OBP 4.22.33 2007/06/18 12:45'
[    0.000000] PROMLIB: Root node compatible:
[    0.000000] Initializing cgroup subsys cpuset
[    0.000000] Initializing cgroup subsys cpu
[    0.000000] Linux version 2.6.29.2 (davem@ultra45) (gcc version 4.3.2 
(Debian 4.3.2-1.1) ) #1 SMP Tue Apr 28 03:39:16 PDT 2009
[    0.000000] debug: ignoring loglevel setting.
[    0.000000] console [earlyprom0] enabled
[    0.000000] ARCH: SUN4U
[    0.000000] Ethernet address: 00:03:ba:83:9d:e5
[    0.000000] Kernel: Using 2 locked TLB entries for main kernel image.
[    0.000000] Remapping the kernel... done.
[    0.000000] OF stdout device is: /pci@1e,600000/isa@7/serial@0,3f8
[    0.000000] PROM: Built device tree with 89996 bytes of memory.
[    0.000000] [0000000200000000-fffff80000400000] page_structs=131072 
node=0 entry=0/0
[    0.000000] [0000000200000000-fffff80000800000] page_structs=131072 
node=0 entry=1/0
[    0.000000] [0000000204000000-fffff80000c00000] page_structs=131072 
node=0 entry=16/0
[    0.000000] [0000000204000000-fffff80001000000] page_structs=131072 
node=0 entry=17/0
[    0.000000] [0000000206000000-fffff80001400000] page_structs=131072 
node=0 entry=24/0
[    0.000000] [0000000206000000-fffff80001800000] page_structs=131072 
node=0 entry=25/0
[    0.000000] [0000000220000000-fffff81000800000] page_structs=131072 
node=1 entry=128/0
[    0.000000] [0000000220000000-fffff81000c00000] page_structs=131072 
node=1 entry=129/0
[    0.000000] Zone PFN ranges:
[    0.000000]   Normal   0x00000000 -> 0x0081ff74
[    0.000000] Movable zone start PFN for each node
[    0.000000] early_node_map[8] active PFN ranges
[    0.000000]     0: 0x00000000 -> 0x00020000
[    0.000000]     0: 0x00100000 -> 0x00120000
[    0.000000]     0: 0x00180000 -> 0x001a0000
[    0.000000]     1: 0x00800000 -> 0x0081f7ff
[    0.000000]     1: 0x0081f800 -> 0x0081ff30
[    0.000000]     1: 0x0081ff38 -> 0x0081ff40
[    0.000000]     1: 0x0081ff48 -> 0x0081ff49
[    0.000000]     1: 0x0081ff70 -> 0x0081ff74
[    0.000000] On node 0 totalpages: 393216
[    0.000000]   Normal zone: 13312 pages used for memmap
[    0.000000]   Normal zone: 0 pages reserved
[    0.000000]   Normal zone: 379904 pages, LIFO batch:15
[    0.000000] On node 1 totalpages: 130876
[    0.000000]   Normal zone: 1023 pages used for memmap
[    0.000000]   Normal zone: 0 pages reserved
[    0.000000]   Normal zone: 129853 pages, LIFO batch:15
[    0.000000] Booting Linux...
[    0.000000] Built 2 zonelists in Node order, mobility grouping on.  
Total pages: 509757
[    0.000000] Policy zone: Normal
[    0.000000] Kernel command line: ro single initcall_debug=1 
ignore_loglevel
[    0.000000] PID hash table entries: 4096 (order: 12, 32768 bytes)
[    0.000000] clocksource: mult[535555] shift[16]
[    0.000000] clockevent: mult[3126e97] shift[32]
[   36.103573] Console: colour dummy device 80x25
[   36.161984] console handover: boot [earlyprom0] -> real [tty0]
[terminal escape sequences: cursor home and clear screen, then cursor movement; no further kernel output]

David Miller April 28, 2009, 1:29 p.m. UTC | #18
From: Sébastien Bernard <seb@sfrdev.fr>
Date: Tue, 28 Apr 2009 14:52:54 +0200

> Unfortunately, this kernel hangs in the same way as the others.
> Same place, same symptom.
> I'm afraid the compiler has nothing to do with that problem.

Great, now we can move on :-)

I'll put together some debugging images for you to test.
seb@frankengul.org Sept. 6, 2009, 9:48 p.m. UTC | #19
David Miller wrote:
> From: Sébastien Bernard <seb@sfrdev.fr>
> Date: Tue, 28 Apr 2009 14:52:54 +0200
>
>   
>> Unfortunately, this kernel hangs in the same way as the others.
>> Same place, same symptom.
>> I'm afraid the compiler has nothing to do with that problem.
>>     
>
> Great, now we can move on :-)
>
> I'll put together some debugging images for you to test.

Hi, the problem is still present in the 2.6.30 kernel and, since 
distributions are pushing new kernels, I'm now unable to boot any kernel 
released by either Debian or Ubuntu.

To make matters worse, I think someone has been bitten by this bug 
on another sparc machine (Sun Blade 1000).
See http://lists.debian.org/debian-sparc/2009/08/msg00031.html .

Since a few months have passed since the last mail, I would like to nail 
down the problem, but I lack the technical expertise on sparc myself. 
How can I help?


    Seb
David Miller Sept. 7, 2009, 11:51 p.m. UTC | #20
From: Hermann Lauer <Hermann.Lauer@iwr.uni-heidelberg.de>
Date: Mon, 7 Sep 2009 16:19:22 +0200

> On Sun, Sep 06, 2009 at 11:48:05PM +0200, Sébastien Bernard wrote:
>> David Miller wrote:
>> >From: Sébastien Bernard <seb@sfrdev.fr>
>> >Date: Tue, 28 Apr 2009 14:52:54 +0200
>> >
>> >  
>> >>Unfortunately, this kernel hangs in the same way as the others.
>> >>Same place, same symptom.
> ...
>> Hi, the problem is still present in the 2.6.30 kernel and, since 
>> distributions are
>> pushing new kernels, I'm now unable to boot any kernel released either 
>> by debian or Ubuntu.
> 
> tried 2.6.30.5 on my Sun Fire 880 (uses qla2xxx) with the nmi patches 
> (see below), but the kernel still hangs after the famous 
> "NET: Registered protocol family 16" line.
> 
> Any other patches I can try (or what else) ?

So turning the NMI watchdog off entirely doesn't solve it.

Oh well.  I'm going to be honest with you guys: until I
can reproduce the problem myself here locally, it's too heavy
a bug for me to fix just by going back and forth with someone
with debug patches.
Hermann Lauer Sept. 9, 2009, 1:04 p.m. UTC | #21
On Mon, Sep 07, 2009 at 04:51:21PM -0700, David Miller wrote:
> > Any other patches I can try (or what else) ?
> 
> So turning NMI watchdog off entirely doesn't solve it.
> 
> Oh well.  I'm going to be honest with you guys, that until I
> can reproduce that problem myself here locally it's too heavy
> of a bug for me to fix just going back and forth with someone
> with debug patches.

Would it help you to have rsc (remote serial console) access to a
SunFire 480 here for remote debugging and restarting of the machine ?

If you are interested, I'll try to find out how to set up that
beast for you (any pointers from others how to do that are welcome).

Thanks,
  Hermann
David Miller Sept. 9, 2009, 1:32 p.m. UTC | #22
From: Hermann Lauer <Hermann.Lauer@iwr.uni-heidelberg.de>
Date: Wed, 9 Sep 2009 15:04:03 +0200

> On Mon, Sep 07, 2009 at 04:51:21PM -0700, David Miller wrote:
>> > Any other patches I can try (or what else) ?
>> 
>> So turning NMI watchdog off entirely doesn't solve it.
>> 
>> Oh well.  I'm going to be honest with you guys, that until I
>> can reproduce that problem myself here locally it's too heavy
>> of a bug for me to fix just going back and forth with someone
>> with debug patches.
> 
> Would it help you to have rsc (remote serial console) access to a
> SunFire 480 here for remote debugging and restarting of the machine ?
> 
> If you are interested, I'll try to find out how to set up that
> beast for you (any pointers from others how to do that are welcome).

Sure, if you can get that to work I'll give it a go.

Patch

diff --git a/arch/sparc/kernel/pcr.c b/arch/sparc/kernel/pcr.c
index 1ae8cdd..62b2b9c 100644
--- a/arch/sparc/kernel/pcr.c
+++ b/arch/sparc/kernel/pcr.c
@@ -123,6 +123,9 @@  int __init pcr_arch_init(void)
 
 	if (err)
 		return err;
+#if 1
+	return -ENODEV;
+#endif
 
 	switch (tlb_type) {
 	case hypervisor:
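
The effect of the hunk above can be sketched in isolation: the init function stays referenced by its initcall, so no unused-function question arises, but it bails out with -ENODEV before reaching the setup code. This is an illustration under assumptions, not the kernel source; the demo_* names and the DEMO_SKIP_NMI_SETUP switch are hypothetical.

```c
#include <errno.h>

/* Hypothetical switch mirroring the "#if 1" in the patch. */
#define DEMO_SKIP_NMI_SETUP 1

/* Counts how often the (stand-in) NMI setup path was reached. */
static int demo_nmi_setup_calls;

/*
 * Stand-in for an arch init function. err models the result of an
 * earlier setup step, as in the "if (err) return err;" context lines.
 */
static int demo_pcr_init(int err)
{
	if (err)
		return err;
#if DEMO_SKIP_NMI_SETUP
	/* Debug shortcut: report "no such device" and skip the rest. */
	return -ENODEV;
#endif
	demo_nmi_setup_calls++;	/* the real setup work would go here */
	return 0;
}
```

With the switch set to 1, the setup path is never reached and the caller sees -ENODEV; setting it to 0 restores the normal path.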