Patchwork Sunfire V880 and 480R 2.6.27.x startup hangs

Submitter David Miller
Date Jan. 31, 2009, midnight
Message ID <20090130.160010.211744559.davem@davemloft.net>
Permalink /patch/21293/
State RFC
Delegated to: David Miller

Comments

David Miller - Jan. 31, 2009, midnight
From: Hermann Lauer <Hermann.Lauer@iwr.uni-heidelberg.de>
Date: Wed, 28 Jan 2009 09:45:18 +0100

> It's hanging in kset_register. Does this ring a bell to you ?
> Will move the full output to the usual place. Thanks.
 ...
> [   47.553935] calling  of_bus_driver_init+0x0/0x12c
> [   47.610180] Setting up of bus
> [   47.645596] In bus_register().
> [   47.682056] Doing kobject_set_name()
> [   47.724764] kset_register()

I suspect it's hanging in uevent generation, let's verify that.
Something really weird is going on in your box, I wonder if the bug is
surfacing because of all of the non-standard options you have enabled
in your build such as cgroups and stuff like that.

Anyways, add this patch on top of your tree and please send the tail
of the new output.

One thing you might want to try to do when it hangs is go:

1) Send a 'break' over the console then immediately type '8'.
   This will increase the kernel log level.

2) Send a 'break' then 'p', this will dump the current cpu's
   registers.

3) Send a 'break' then 'y', this will give a brief backtrace
   on all cpus.

4) Send a 'break' then 't', this will dump the state of all
   processes on the system.

Unfortunately, none of those will work if the cpu handling console
interrupts has cpu interrupts disabled for whatever reason :-/ But it
is definitely worth a try.
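
For reference, the key sequences listed above can be kept as a small
lookup table. This is only a sketch: the struct and function names are
hypothetical, and the descriptions simply restate the list above rather
than anything taken from the kernel's SysRq code.

```c
#include <stddef.h>

/* One entry per key named in the list above (sent after a 'break'). */
struct sysrq_key {
	char key;
	const char *effect;
};

static const struct sysrq_key sysrq_keys[] = {
	{ '8', "raise the kernel log level" },
	{ 'p', "dump the current cpu's registers" },
	{ 'y', "brief backtrace on all cpus" },
	{ 't', "dump the state of all processes" },
};

/* Return the description for a key, or NULL if it is not in the table. */
static const char *sysrq_effect(char key)
{
	size_t i;

	for (i = 0; i < sizeof(sysrq_keys) / sizeof(sysrq_keys[0]); i++)
		if (sysrq_keys[i].key == key)
			return sysrq_keys[i].effect;
	return NULL;
}
```

Calling sysrq_effect('y') returns the "brief backtrace on all cpus"
entry; a key that is not in the list returns NULL.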

Thanks.

Hermann Lauer - Feb. 2, 2009, 2:27 p.m.
On Fri, Jan 30, 2009 at 04:00:10PM -0800, David Miller wrote:
> > [   47.553935] calling  of_bus_driver_init+0x0/0x12c
> > [   47.610180] Setting up of bus
> > [   47.645596] In bus_register().
> > [   47.682056] Doing kobject_set_name()
> > [   47.724764] kset_register()
> 
> I suspect it's hanging in uevent generation, let's verify that.
> Something really weird is going on in your box, I wonder if the bug is
> surfacing because of all of the non-standard options you have enabled
> in your build such as cgroups and stuff like that.

I am CCing the Debian sparc people, as the config is derived from
their default sparc kernel config. If I remember correctly, I only used
"make oldconfig" to get to the newer kernel. Maybe one of them can
comment on the sparc configuration choices.

I was curious, so I took the vanilla 2.6.27.13 kernel from the net and
ran "make menuconfig", saving the config without changing anything.
I will put this config at:
http://www.iwr.uni-heidelberg.de/ftp/linux/sparc-boot/config-2.6.27.13-20090202.txt/config-2.6.27.13-20090202.txt

> Anyways, add this patch on top of your tree and please send the tail
> of the new output.

Here is the output with all patches and that default build of 2.6.27.13:

In bus_register().
Doing kobject_set_name()
kset_register()
kset_register: kset_init()
kset_register: kset_add_internal()
kset_register: kobject_uevent()
[halt sent]
[halt sent]
[halt sent]
[halt sent]

> One thing you might want to try to do when it hangs is go:
> 
> 1) Send a 'break' over the console then immediately type '8'.
>    This will increase the kernel log level.
> 
> 2) Send a 'break' then 'p', this will dump the current cpu's
>    registers.
> 
> 3) Send a 'break' then 'y', this will give a brief backtrace
>    on all cpus.
> 
> 4) Send a 'break' then 't', this will dump the state of all
>    processes on the system.
> 
> Unfortunately, none of those will work if the cpu handling console
> interrupts has cpu interrupts disabled for whatever reason :-/ But it
> is definitely worth a try.

Tried, but as you feared, no output was produced (see above).
Any further ideas?
If time permits, I should probably start compiling all kernels from
2.6.26.5 onwards to find the first non-working kernel.

Thanks, Hermann

Patch

diff --git a/lib/kobject.c b/lib/kobject.c
index fbf0ae2..4553903 100644
--- a/lib/kobject.c
+++ b/lib/kobject.c
@@ -708,11 +708,17 @@  int kset_register(struct kset *k)
 	if (!k)
 		return -EINVAL;
 
+	printk(KERN_ERR "kset_register: kset_init()\n");
 	kset_init(k);
+	printk(KERN_ERR "kset_register: kset_add_internal()\n");
 	err = kobject_add_internal(&k->kobj);
-	if (err)
+	if (err) {
+		printk(KERN_ERR "kset_register: Got error %d\n", err);
 		return err;
+	}
+	printk(KERN_ERR "kset_register: kobject_uevent()\n");
 	kobject_uevent(&k->kobj, KOBJ_ADD);
+	printk(KERN_ERR "kset_register: Done\n");
 	return 0;
 }
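
The patch brackets each call in kset_register() with a printk(), so the
last line on the console names the call that never returned. Below is a
minimal user-space sketch of the same idea. Everything in it is a
hypothetical stand-in (step_ok, step_fail, run_traced, last_announced);
none of it is kernel code.

```c
#include <stdio.h>
#include <stddef.h>

/* A named step; run() returns 0 on success or a negative error code. */
struct traced_step {
	const char *name;
	int (*run)(void);
};

/* Name of the most recently announced step. If a step never returns,
 * this still holds the name of the step that hung. */
static const char *last_announced;

/* Hypothetical stand-ins for the real calls. */
static int step_ok(void)   { return 0; }
static int step_fail(void) { return -22; }

/* Announce each step before running it, as the patch does with printk(). */
static int run_traced(const struct traced_step *steps, size_t n)
{
	size_t i;
	int err;

	for (i = 0; i < n; i++) {
		last_announced = steps[i].name;
		fprintf(stderr, "trace: %s\n", steps[i].name);
		err = steps[i].run();
		if (err) {
			fprintf(stderr, "trace: Got error %d\n", err);
			return err;
		}
	}
	last_announced = "Done";
	fprintf(stderr, "trace: Done\n");
	return 0;
}
```

In the output reported in this thread, the last line printed is
"kset_register: kobject_uevent()" and "kset_register: Done" never
appears, which points at kobject_uevent() as the call that does not
return.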