OF-related boot crash in 3.3.0-rc3-00188-g3ec1e88

Message ID 20120227194341.GA1448@merkur.ravnborg.org
State Not Applicable
Delegated to: David Miller

Commit Message

Sam Ravnborg Feb. 27, 2012, 7:43 p.m. UTC
On Mon, Feb 27, 2012 at 07:17:42PM +0200, Meelis Roos wrote:
> > > > > Can you please try the following patch?  If it still fails to boot,
> > > > > please attach the failing log.  Thank you.
> > > > 
> > > > It works on E3500! Will try other machines tomorrow.
> > > 
> > > Once confirmed, I'll push the patch through tip.  It just hides the
> > > underlying problem but we should be in no worse shape than before,
> > > it's a two-line change, so reproducing the problem again for proper
> > > diagnosis isn't difficult, and we're getting a bit late in the release
> > > cycle already.
> > 
> > It cured the V210 too but I could not test V100 since it's offline until 
> > monday.
> 
> Tested V100 too, success!

Hi Meelis.

I have tried to cook up a small patch that verifies the length of what
we read against the original length.

Could you give this a quick spin and see if something
turns up? If you have time, it would be good to try it on a box
that worked before and on one that was fixed by the patch from Tejun.

I have not looked much at the OF code, but this looked like the right
place to start.

I have no way to try it out myself...

	Sam

--
To unsubscribe from this list: send the line "unsubscribe sparclinux" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Comments

Meelis Roos Feb. 27, 2012, 9:25 p.m. UTC | #1
> Could you try to give this a quick spin and see if something
> turns up. If you have time it would be good to try on a box
> that worked before and one that was fixed by the patch from Tejun.

Neither of the machines (the already-working one and the one "fixed with
the rounding patch") emits any prop: messages.

David Miller Feb. 27, 2012, 9:30 p.m. UTC | #2
From: Meelis Roos <mroos@linux.ee>
Date: Mon, 27 Feb 2012 23:25:11 +0200 (EET)

>> Could you try to give this a quick spin and see if something
>> turns up. If you have time it would be good to try on a box
>> that worked before and one that was fixed by the patch from Tejun.
> 
> Neither of the machines (the already-working one and the one "fixed with
> the rounding patch") emits any prop: messages.

I think the issue is that OF writes past the end of the buffer even
though the length it reports is smaller than what it writes.

That's why we really need to fill the memblock memory with magic
numbers and scan every allocation for free memory with corrupted
magic values.
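
As a rough user-space illustration of the magic-number idea above, the
sketch below pads a buffer with poison bytes and counts how many were
overwritten after a call that reports a shorter length than it writes.
It is not the memblock code: guarded_alloc(), guard_corrupted(),
fake_getproperty(), GUARD_BYTE and GUARD_LEN are all hypothetical names
made up for this example, and fake_getproperty() only stands in for a
misbehaving firmware call.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

#define GUARD_BYTE 0xA5	/* hypothetical poison value */
#define GUARD_LEN  16	/* hypothetical guard size in bytes */

/* Allocate len usable bytes followed by GUARD_LEN poison bytes. */
unsigned char *guarded_alloc(size_t len)
{
	unsigned char *buf = malloc(len + GUARD_LEN);

	assert(buf != NULL);
	memset(buf, 0, len);
	memset(buf + len, GUARD_BYTE, GUARD_LEN);
	return buf;
}

/* Count the guard bytes that no longer hold the poison value. */
size_t guard_corrupted(const unsigned char *buf, size_t len)
{
	size_t i, bad = 0;

	for (i = 0; i < GUARD_LEN; i++)
		if (buf[len + i] != GUARD_BYTE)
			bad++;
	return bad;
}

/*
 * Hypothetical stand-in for a firmware getproperty call: it reports
 * len bytes but actually writes len + extra bytes into buf.
 * The caller must keep extra <= GUARD_LEN.
 */
int fake_getproperty(unsigned char *buf, int len, int extra)
{
	memset(buf, 0x42, (size_t)len + (size_t)extra);
	return len;
}
```

With a buffer from guarded_alloc(8), fake_getproperty(buf, 8, 3) returns
8, so a comparison of the returned length against the requested length
sees nothing wrong, while guard_corrupted(buf, 8) reports 3 overwritten
bytes.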

Patch

diff --git a/drivers/of/pdt.c b/drivers/of/pdt.c
index 07cc1d6..826204a 100644
--- a/drivers/of/pdt.c
+++ b/drivers/of/pdt.c
@@ -128,6 +128,10 @@  static struct property * __init of_pdt_build_one_prop(phandle node, char *prev,
 			p->value = prom_early_alloc(p->length + 1);
 			len = of_pdt_prom_ops->getproperty(node, p->name,
 					p->value, p->length);
+
+			if (len != p->length)
+				pr_err("prop: %s %d => %d\n", p->name, p->length, len);
+
 			if (len <= 0)
 				p->length = 0;
 			((unsigned char *)p->value)[p->length] = '\0';
@@ -161,8 +165,13 @@  static char * __init of_pdt_get_one_property(phandle node, const char *name)
 
 	len = of_pdt_prom_ops->getproplen(node, name);
 	if (len > 0) {
+		int proplen;
 		buf = prom_early_alloc(len);
-		len = of_pdt_prom_ops->getproperty(node, name, buf, len);
+		proplen = of_pdt_prom_ops->getproperty(node, name, buf, len);
+
+		if (proplen != len)
+			pr_err("prop: %s %d => %d\n", name, len, proplen);
+
 	}
 
 	return buf;