Sunfire V880 and 480R 2.6.27.x startup hangs

Message ID 20090202.125038.141240910.davem@davemloft.net
State RFC
Delegated to: David Miller

Commit Message

David Miller Feb. 2, 2009, 8:50 p.m. UTC
From: Hermann Lauer <Hermann.Lauer@iwr.uni-heidelberg.de>
Date: Mon, 2 Feb 2009 15:27:57 +0100

> On Fri, Jan 30, 2009 at 04:00:10PM -0800, David Miller wrote:
> > > [   47.553935] calling  of_bus_driver_init+0x0/0x12c
> > > [   47.610180] Setting up of bus
> > > [   47.645596] In bus_register().
> > > [   47.682056] Doing kobject_set_name()
> > > [   47.724764] kset_register()
> > 
> > I suspect it's hanging in uevent generation, let's verify that.
> > Something really weird is going on in your box, I wonder if the bug is
> > surfacing because of all of the non-standard options you have enabled
> > in your build such as cgroups and stuff like that.
> 
> I CC this to the debian sparc people, as the config is derived from
> their default sparc kernel config. If I remember correctly, I only used
> "make oldconfig" to get to the newer kernel. Maybe one of those guys can
> comment on the sparc configuration choices.

I'm not saying the configuration choice is wrong, not at all.

I'm saying that since it's something most active kernel hackers
don't enable, it may be a reason why I and others have never
seen this problem.

> Here is the output with all patches and that default build of 2.6.27.13:
> 
> In bus_register().
> Doing kobject_set_name()
> kset_register()
> kset_register: kset_init()
> kset_register: kset_add_internal()
> kset_register: kobject_uevent()
> [halt sent]
> [halt sent]
> [halt sent]
> [halt sent]

I'm pretty certain it's call_usermodehelper() that's hanging.

Perhaps something with forking kernel threads or invoking exec
is failing on sparc64 on your machine for some reason.
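
As a rough userspace analogy for the steps suspected above (fork a task, exec the helper, wait for it), the following sketch may help when reasoning about where such a call can stall. It is not the kernel's call_usermodehelper() implementation; the function name run_helper, its timeout parameter, and its return codes are hypothetical and exist only for illustration.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

/*
 * Hypothetical helper (not a kernel API): fork, exec `path` with the
 * given argv/envp, then poll for completion for up to timeout_sec
 * seconds.
 *
 * Returns the child's exit status (127 if the exec itself failed),
 * -1 on a fork/wait error or abnormal termination, and -2 if the
 * child was still running at the timeout and had to be killed.
 */
static int run_helper(const char *path, char *const argv[],
		      char *const envp[], int timeout_sec)
{
	int status = 0;
	int waited_ms = 0;
	pid_t pid = fork();

	if (pid < 0)
		return -1;		/* fork failed */

	if (pid == 0) {
		execve(path, argv, envp);
		_exit(127);		/* only reached if exec failed */
	}

	for (;;) {
		struct timespec ts = { 0, 10 * 1000 * 1000 };	/* 10 ms */
		pid_t r = waitpid(pid, &status, WNOHANG);

		if (r == pid)
			break;		/* child finished */
		if (r < 0 && errno != EINTR)
			return -1;	/* wait error */

		if (waited_ms >= timeout_sec * 1000) {
			/* Treat this as a hang: kill and reap the child. */
			kill(pid, SIGKILL);
			waitpid(pid, &status, 0);
			return -2;
		}
		nanosleep(&ts, NULL);
		waited_ms += 10;
	}

	return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Each of the three stages (fork, exec, wait) maps to a distinct return path here, which is the same distinction the debugging printk() calls try to draw on the kernel side.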

New patch:

--
To unsubscribe from this list: send the line "unsubscribe sparclinux" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Comments

Hermann Lauer Feb. 3, 2009, 9:26 p.m. UTC | #1
On Mon, Feb 02, 2009 at 12:50:38PM -0800, David Miller wrote:
> I'm pretty certain it's call_usermodehelper() that's hanging.
> 
> Perhaps something with forking kernel threads or invoking exec
> is failing on sparc64 on your machine for some reason.

Looks like you are right:

> tail console20090203.txt
Doing kobject_set_name()
kset_register()
kset_register: kset_init()
kset_register: kset_add_internal()
kset_register: kobject_uevent()
kobject_uevent_env: [of] fffff8a1fe00c6d8
kobject_uevent_env: Checking uevent_ops->filter
kobject_uevent_env: Allocating and filling env buffer.
kobject_uevent_env: Checking uevent_ops->uevent
kobject_uevent_env: Invoking uevent_helper[/sbin/hotplug]

Please also look at the full console output at
http://www.iwr.uni-heidelberg.de/ftp/linux/sparc-boot/console20090203.txt
as there are some "attempted to send uevent without kset!" and similar messages.

Maybe it matters that this is a two-CPU machine (a similar 6-CPU machine seems to work,
see the other report on the list)?
Thanks.
David Miller Feb. 3, 2009, 11:19 p.m. UTC | #2
From: Hermann Lauer <Hermann.Lauer@iwr.uni-heidelberg.de>
Date: Tue, 3 Feb 2009 22:26:32 +0100

> Maybe it's important that this is a two cpu machine (as a similar 6
> cpu machine seems to work, see other report on the list) ?

I don't think it matters, to be honest.  Memory size and layout
may have more to do with it.

I'll look at your logs and post some new debugging patches later,
thanks.
Hermann Lauer Feb. 6, 2009, 10:28 a.m. UTC | #3
On Tue, Feb 03, 2009 at 03:19:18PM -0800, David Miller wrote:
> From: Hermann Lauer <Hermann.Lauer@iwr.uni-heidelberg.de>
> Date: Tue, 3 Feb 2009 22:26:32 +0100
> 
> > Maybe it's important that this is a two cpu machine (as a similar 6
> > cpu machine seems to work, see other report on the list) ?
> 
> I don't think it matters, to be honest.  Memory size and layout
> may have more to do with it.

Meanwhile I bisected across the released versions from the 2.6.26 series to the 2.6.27 series:

<2.6.26.8	boots
 2.6.27-rc1	compile fails (see below)
>2.6.27-rc2	hangs at boot

Is there any chance that one of the recent memory-related patches will fix
this problem? Thanks.

  CC      arch/sparc64/kernel/iommu.o
In file included from arch/sparc64/kernel/iommu.c:21:
arch/sparc64/kernel/iommu_common.h:40: error: static declaration of iommu_num_pages follows non-static declaration
include/linux/iommu-helper.h:11: error: previous declaration of iommu_num_pages was here
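
The compile error above is the usual C linkage conflict: one header declares iommu_num_pages with external linkage, and a header included later defines the same name as static. A minimal sketch of the situation and of one possible way around it (giving the local helper a distinct name) follows. The name sparc_iommu_num_pages, the 8 KB page geometry, and the function body are assumptions made for illustration; this is not necessarily how the kernel tree resolved it.

```c
/* Stand-in for the declaration in include/linux/iommu-helper.h:
 * a prototype with external linkage. */
extern unsigned long iommu_num_pages(unsigned long addr, unsigned long len);

/* Assumed IOMMU page geometry (8 KB pages), for illustration only. */
#define IO_PAGE_SHIFT	13
#define IO_PAGE_SIZE	(1UL << IO_PAGE_SHIFT)
#define IO_PAGE_MASK	(~(IO_PAGE_SIZE - 1))
#define IO_PAGE_ALIGN(a) (((a) + IO_PAGE_SIZE - 1) & IO_PAGE_MASK)

/*
 * Defining this helper as
 *     static inline unsigned long iommu_num_pages(...)
 * after the extern prototype above would produce
 *     "static declaration of 'iommu_num_pages' follows non-static declaration".
 * Using a different (hypothetical) name avoids the clash.
 */
static inline unsigned long sparc_iommu_num_pages(unsigned long vaddr,
						  unsigned long slen)
{
	/* Number of IO pages touched by the range [vaddr, vaddr + slen). */
	unsigned long npages = IO_PAGE_ALIGN(vaddr + slen) -
			       (vaddr & IO_PAGE_MASK);

	return npages >> IO_PAGE_SHIFT;
}
```

Another option would be to drop the private copy and use the shared declaration, provided both versions compute the same thing for the architecture's IO page size.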
Patch

diff --git a/lib/kobject_uevent.c b/lib/kobject_uevent.c
index 3f91472..2c30f6a 100644
--- a/lib/kobject_uevent.c
+++ b/lib/kobject_uevent.c
@@ -100,6 +100,9 @@  int kobject_uevent_env(struct kobject *kobj, enum kobject_action action,
 	int i = 0;
 	int retval = 0;
 
+	printk(KERN_ERR "kobject_uevent_env: [%s] %p\n",
+	       kobject_name(kobj), kobj);
+
 	pr_debug("kobject: '%s' (%p): %s\n",
 		 kobject_name(kobj), kobj, __func__);
 
@@ -109,8 +112,8 @@  int kobject_uevent_env(struct kobject *kobj, enum kobject_action action,
 		top_kobj = top_kobj->parent;
 
 	if (!top_kobj->kset) {
-		pr_debug("kobject: '%s' (%p): %s: attempted to send uevent "
-			 "without kset!\n", kobject_name(kobj), kobj,
+		printk(KERN_ERR "kobject: '%s' (%p): %s: attempted to send uevent "
+		       "without kset!\n", kobject_name(kobj), kobj,
 			 __func__);
 		return -EINVAL;
 	}
@@ -118,12 +121,14 @@  int kobject_uevent_env(struct kobject *kobj, enum kobject_action action,
 	kset = top_kobj->kset;
 	uevent_ops = kset->uevent_ops;
 
+	printk(KERN_ERR "kobject_uevent_env: Checking uevent_ops->filter\n");
+
 	/* skip the event, if the filter returns zero. */
 	if (uevent_ops && uevent_ops->filter)
 		if (!uevent_ops->filter(kset, kobj)) {
-			pr_debug("kobject: '%s' (%p): %s: filter function "
-				 "caused the event to drop!\n",
-				 kobject_name(kobj), kobj, __func__);
+			printk(KERN_ERR "kobject: '%s' (%p): %s: filter function "
+			       "caused the event to drop!\n",
+			       kobject_name(kobj), kobj, __func__);
 			return 0;
 		}
 
@@ -133,16 +138,20 @@  int kobject_uevent_env(struct kobject *kobj, enum kobject_action action,
 	else
 		subsystem = kobject_name(&kset->kobj);
 	if (!subsystem) {
-		pr_debug("kobject: '%s' (%p): %s: unset subsystem caused the "
-			 "event to drop!\n", kobject_name(kobj), kobj,
-			 __func__);
+		printk(KERN_ERR "kobject: '%s' (%p): %s: unset subsystem caused the "
+		       "event to drop!\n", kobject_name(kobj), kobj,
+		       __func__);
 		return 0;
 	}
 
+	printk(KERN_ERR "kobject_uevent_env: Allocating and filling env buffer.\n");
+
 	/* environment buffer */
 	env = kzalloc(sizeof(struct kobj_uevent_env), GFP_KERNEL);
-	if (!env)
+	if (!env) {
+		printk(KERN_ERR "kobject_uevent_env: env kzalloc() failed\n");
 		return -ENOMEM;
+	}
 
 	/* complete object path */
 	devpath = kobject_get_path(kobj, GFP_KERNEL);
@@ -171,6 +180,8 @@  int kobject_uevent_env(struct kobject *kobj, enum kobject_action action,
 		}
 	}
 
+	printk(KERN_ERR "kobject_uevent_env: Checking uevent_ops->uevent\n");
+
 	/* let the kset specific function add its stuff */
 	if (uevent_ops && uevent_ops->uevent) {
 		retval = uevent_ops->uevent(kset, kobj, env);
@@ -207,6 +218,8 @@  int kobject_uevent_env(struct kobject *kobj, enum kobject_action action,
 		struct sk_buff *skb;
 		size_t len;
 
+		printk(KERN_ERR "kobject_uevent_env: Sending netlink msg\n");
+
 		/* allocate message with the maximum possible size */
 		len = strlen(action_string) + strlen(devpath) + 2;
 		skb = alloc_skb(len + env->buflen, GFP_KERNEL);
@@ -234,6 +247,9 @@  int kobject_uevent_env(struct kobject *kobj, enum kobject_action action,
 	if (uevent_helper[0]) {
 		char *argv [3];
 
+		printk(KERN_ERR "kobject_uevent_env: Invoking uevent_helper[%s]\n",
+		       uevent_helper);
+
 		argv [0] = uevent_helper;
 		argv [1] = (char *)subsystem;
 		argv [2] = NULL;
@@ -250,6 +266,7 @@  int kobject_uevent_env(struct kobject *kobj, enum kobject_action action,
 	}
 
 exit:
+	printk(KERN_ERR "kobject_uevent_env: At 'exit', retval=%d\n", retval);
 	kfree(devpath);
 	kfree(env);
 	return retval;