
[BUG] fault while using perf callchains in sparc64

Message ID 20100329.130931.149161813.davem@davemloft.net
State Accepted
Delegated to: David Miller

Commit Message

David Miller March 29, 2010, 8:09 p.m. UTC
From: Frederic Weisbecker <fweisbec@gmail.com>
Date: Sun, 28 Mar 2010 06:34:49 +0200

> I get kernel crashes each time I use perf with callchains
> on sparc 64.
> 
> It triggers with a simple:
> 
> 	perf record -a -f -g sleep 1

This should fix it, thanks again.

sparc64: Properly truncate pt_regs framepointer in perf callback.

For 32-bit processes, we save the full 64-bits of the regs in pt_regs.

But unlike when the userspace actually does load and store
instructions, the top 32-bits don't get automatically truncated by the
cpu in kernel mode (because the kernel doesn't execute with PSTATE_AM
address masking enabled).

So we have to do it by hand.

Reported-by: Frederic Weisbecker <fweisbec@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
---
 arch/sparc/kernel/perf_event.c |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)
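
As a rough illustration of the truncation described in the commit message,
here is a minimal userspace sketch. The helper name compat_truncate() is
hypothetical and is not a kernel function; it only mirrors the
"& 0xffffffffUL" mask the patch applies to regs->u_regs[UREG_I6].

```c
#include <stdint.h>

/*
 * Hypothetical helper (not part of the kernel): keep only the low
 * 32 bits of a register value that was saved as a full 64-bit word.
 * A 32-bit task only ever sees the low half, so any stale upper bits
 * have to be dropped before the value is used as a user address.
 */
static inline uint64_t compat_truncate(uint64_t saved_reg)
{
	return saved_reg & 0xffffffffULL;
}
```

For example, a saved value of 0xdeadbeefffcbf008 (a made-up number with
garbage in the upper half) becomes 0xffcbf008, while a value that already
fits in 32 bits is returned unchanged.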

Comments

Frédéric Weisbecker March 29, 2010, 8:49 p.m. UTC | #1
On Mon, Mar 29, 2010 at 01:09:31PM -0700, David Miller wrote:
> From: Frederic Weisbecker <fweisbec@gmail.com>
> Date: Sun, 28 Mar 2010 06:34:49 +0200
> 
> > I get kernel crashes each time I use perf with callchains
> > on sparc 64.
> > 
> > It triggers with a simple:
> > 
> > 	perf record -a -f -g sleep 1
> 
> This should fix it, thanks again.


I merged your tree on latest -git and it works well.

Thanks!

Sorry, I have another bug report.

While building perf tools, or the kernel, or whatever, I often
get the following error in the middle:

	gcc: Internal error: Segmentation fault (program as)

And this in the logs:

	[ 1429.477049] as[2658]: segfault at 4054dfa8 ip 0000000000020690 (rpc 00000000700adcf4) sp 00000000ffcbf008 error 30001 in as[10000+40000]

My gcc, as, and everything else in userspace is 32-bit, but the kernel is 64-bit.

My config is the same as before.

Again, tell me whatever you need to help debug this.

Thanks.

--
To unsubscribe from this list: send the line "unsubscribe sparclinux" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
David Miller March 29, 2010, 9:01 p.m. UTC | #2
From: Frederic Weisbecker <fweisbec@gmail.com>
Date: Mon, 29 Mar 2010 22:49:33 +0200

> While building perf tools, or the kernel, or whatever, I often
> get the following error in the middle:
> 
> 	gcc: Internal error: Segmentation fault (program as)
> 
> And this in the logs:
> 
> 	[ 1429.477049] as[2658]: segfault at 4054dfa8 ip 0000000000020690 (rpc 00000000700adcf4) sp 00000000ffcbf008 
> error 30001 in as[10000+40000]

What distribution and binutils are you using?
Frédéric Weisbecker March 29, 2010, 9:11 p.m. UTC | #3
On Mon, Mar 29, 2010 at 02:01:31PM -0700, David Miller wrote:
> From: Frederic Weisbecker <fweisbec@gmail.com>
> Date: Mon, 29 Mar 2010 22:49:33 +0200
> 
> > While building perf tools, or the kernel, or whatever, I often
> > get the following error in the middle:
> > 
> > 	gcc: Internal error: Segmentation fault (program as)
> > 
> > And this in the logs:
> > 
> > 	[ 1429.477049] as[2658]: segfault at 4054dfa8 ip 0000000000020690 (rpc 00000000700adcf4) sp 00000000ffcbf008 
> > error 30001 in as[10000+40000]
> 
> What distribution and binutils are you using?


It's a debian lenny, with binutils 2.18.1~cvs20080103-7.

David Miller March 29, 2010, 9:19 p.m. UTC | #4
From: Frederic Weisbecker <fweisbec@gmail.com>
Date: Mon, 29 Mar 2010 23:11:50 +0200

> On Mon, Mar 29, 2010 at 02:01:31PM -0700, David Miller wrote:
>> From: Frederic Weisbecker <fweisbec@gmail.com>
>> Date: Mon, 29 Mar 2010 22:49:33 +0200
>> 
>> > While building perf tools, or the kernel, or whatever, I often
>> > get the following error in the middle:
>> > 
>> > 	gcc: Internal error: Segmentation fault (program as)
>> > 
>> > And this in the logs:
>> > 
>> > 	[ 1429.477049] as[2658]: segfault at 4054dfa8 ip 0000000000020690 (rpc 00000000700adcf4) sp 00000000ffcbf008 
>> > error 30001 in as[10000+40000]
>> 
>> What distribution and binutils are you using?
> 
> It's a debian lenny, with binutils 2.18.1~cvs20080103-7.

I'm using the same here on some boxes, what kind of machine is this?
Frédéric Weisbecker March 29, 2010, 9:28 p.m. UTC | #5
On Mon, Mar 29, 2010 at 02:19:20PM -0700, David Miller wrote:
> From: Frederic Weisbecker <fweisbec@gmail.com>
> Date: Mon, 29 Mar 2010 23:11:50 +0200
> 
> > On Mon, Mar 29, 2010 at 02:01:31PM -0700, David Miller wrote:
> >> From: Frederic Weisbecker <fweisbec@gmail.com>
> >> Date: Mon, 29 Mar 2010 22:49:33 +0200
> >> 
> >> > While building perf tools, or the kernel, or whatever, I often
> >> > get the following error in the middle:
> >> > 
> >> > 	gcc: Internal error: Segmentation fault (program as)
> >> > 
> >> > And this in the logs:
> >> > 
> >> > 	[ 1429.477049] as[2658]: segfault at 4054dfa8 ip 0000000000020690 (rpc 00000000700adcf4) sp 00000000ffcbf008 
> >> > error 30001 in as[10000+40000]
> >> 
> >> What distribution and binutils are you using?
> > 
> > It's a debian lenny, with binutils 2.18.1~cvs20080103-7.
> 
> I'm using the same here on some boxes, what kind of machine is this?


It's a Niagara 2 based one.

David Miller March 29, 2010, 10:02 p.m. UTC | #6
From: Frederic Weisbecker <fweisbec@gmail.com>
Date: Mon, 29 Mar 2010 23:28:42 +0200

> It's a Niagara 2 based one.

Strange, that's what I do all of my main sparc64 kernel
work on too.  I've never seen these spurious 'as' crashes.

Hmmmm, what does "ldd /usr/bin/as" give you?

Thanks.
Frédéric Weisbecker March 29, 2010, 10:21 p.m. UTC | #7
On Mon, Mar 29, 2010 at 03:02:53PM -0700, David Miller wrote:
> From: Frederic Weisbecker <fweisbec@gmail.com>
> Date: Mon, 29 Mar 2010 23:28:42 +0200
> 
> > It's a Niagara 2 based one.
> 
> Strange, that's what I do all of my main sparc64 kernel
> work on too.  I've never seen these spurious 'as' crashes.
> 
> Hmmmm, what does "ldd /usr/bin/as" give you?
> 
> Thanks.


$ ldd /usr/bin/as
	libopcodes-2.18.0.20080103.so => /usr/lib/libopcodes-2.18.0.20080103.so (0xf7ec4000)
	libbfd-2.18.0.20080103.so => /usr/lib/libbfd-2.18.0.20080103.so (0xf7e14000)
	libc.so.6 => /lib/libc.so.6 (0xf7ca0000)
	/lib/ld-linux.so.2 (0xf7efc000)

The last kernel I know of that doesn't have such problems is 2.6.31-rc6.
Maybe I should bisect?

David Miller March 29, 2010, 10:32 p.m. UTC | #8
From: Frederic Weisbecker <fweisbec@gmail.com>
Date: Tue, 30 Mar 2010 00:21:34 +0200

> $ ldd /usr/bin/as
> 	libopcodes-2.18.0.20080103.so => /usr/lib/libopcodes-2.18.0.20080103.so (0xf7ec4000)
> 	libbfd-2.18.0.20080103.so => /usr/lib/libbfd-2.18.0.20080103.so (0xf7e14000)
> 	libc.so.6 => /lib/libc.so.6 (0xf7ca0000)
> 	/lib/ld-linux.so.2 (0xf7efc000)

Ok, same here.

> The last kernel I know of that doesn't have such problems is 2.6.31-rc6.
> Maybe I should bisect?

Hmmm, since you know a good and bad point, yes a bisect
might be the best way to proceed here.

It might be quicker if you first test 2.6.32 and 2.6.33
and then use the results of that to guide your bisect.

Anyways, if you narrow it down to a commit I should be
able to fix this quickly.

Thanks!
David Miller April 1, 2010, 8:09 a.m. UTC | #9
From: Frederic Weisbecker <fweisbec@gmail.com>
Date: Thu, 1 Apr 2010 11:06:11 +0200

> I actually can't. It works well on a backup 2.6.31-rc6 kernel
> but when I build a new one of this same version, the problem
> happens again. And I don't have the config of the one that works
> (and no /proc/config.gz as well).
> 
> So I suspect this is something that happens with some specific
> configs only.
> 
> Anyway, once I get more clues about this, I'll tell you.

I was going to ask you if any of your compiler tools changed
recently...

Check the gcc version printed by the working kernel at the top of the
dmesg logs and compare to what you end up using now.
David Miller April 1, 2010, 9:02 a.m. UTC | #10
From: Frederic Weisbecker <fweisbec@gmail.com>
Date: Thu, 1 Apr 2010 11:38:33 +0200

> It seems to happen with ld as well btw (not sure this is related
> though):
> 
> [ 3366.005962] ld[19041]: segfault at 10 ip 000000007010248c (rpc 00000000701023f8) sp 00000000ffda87c8 error 30001 
> in libbfd-2.18.0.20080103.so[700d8000+a0000]

It's data corruption coming either from the kernel or something
malfunctioning in libc is my guess, more likely the kernel.

Frédéric Weisbecker April 1, 2010, 9:06 a.m. UTC | #11
On Mon, Mar 29, 2010 at 03:32:08PM -0700, David Miller wrote:
> From: Frederic Weisbecker <fweisbec@gmail.com>
> Date: Tue, 30 Mar 2010 00:21:34 +0200
> 
> > $ ldd /usr/bin/as
> > 	libopcodes-2.18.0.20080103.so => /usr/lib/libopcodes-2.18.0.20080103.so (0xf7ec4000)
> > 	libbfd-2.18.0.20080103.so => /usr/lib/libbfd-2.18.0.20080103.so (0xf7e14000)
> > 	libc.so.6 => /lib/libc.so.6 (0xf7ca0000)
> > 	/lib/ld-linux.so.2 (0xf7efc000)
> 
> Ok, same here.
> 
> > The last kernel I know of that doesn't have such problems is 2.6.31-rc6.
> > Maybe I should bisect?
> 
> Hmmm, since you know a good and bad point, yes a bisect
> might be the best way to proceed here.
> 
> It might be quicker if you first test 2.6.32 and 2.6.33
> and then use the results of that to guide your bisect.
> 
> Anyways, if you narrow it down to a commit I should be
> able to fix this quickly.
> 
> Thanks!


I actually can't. It works well on a backup 2.6.31-rc6 kernel
but when I build a new one of this same version, the problem
happens again. And I don't have the config of the one that works
(and no /proc/config.gz as well).

So I suspect this is something that happens with some specific
configs only.

Anyway, once I get more clues about this, I'll tell you.

Thanks.

Frédéric Weisbecker April 1, 2010, 9:38 a.m. UTC | #12
On Thu, Apr 01, 2010 at 01:09:03AM -0700, David Miller wrote:
> From: Frederic Weisbecker <fweisbec@gmail.com>
> Date: Thu, 1 Apr 2010 11:06:11 +0200
> 
> > I actually can't. It works well on a backup 2.6.31-rc6 kernel
> > but when I build a new one of this same version, the problem
> > happens again. And I don't have the config of the one that works
> > (and no /proc/config.gz as well).
> > 
> > So I suspect this is something that happens with some specific
> > configs only.
> > 
> > Anyway, once I get more clues about this, I'll tell you.
> 
> I was going to ask you if any of your compiler tools changed
> recently...
> 
> Check the gcc version printed by the working kernel at the top of the
> dmesg logs and compare to what you end up using now.


They are exactly the same :)

gcc version 4.3.2 (Debian 4.3.2-1.1)

Really, I think I need to dig further, as I don't have any useful
clues to provide yet. I need to check whether the segfault always
happens in the same place, etc.

It seems to happen with ld as well btw (not sure this is related
though):

[ 3366.005962] ld[19041]: segfault at 10 ip 000000007010248c (rpc 00000000701023f8) sp 00000000ffda87c8 error 30001 in libbfd-2.18.0.20080103.so[700d8000+a0000]
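
To check whether the faults always land in the same place, one option is to
compare the faulting ip against the mapping reported at the end of each log
line. The sketch below assumes the trailing "name[base+size]" field gives the
start address and length of the mapping that contains ip; the helper name
segv_ip_offset() is made up for illustration.

```c
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: given one segfault log line (already joined
 * onto a single line), compute ip - base, where base comes from the
 * trailing "name[base+size]" field.
 * Returns 0 on success and stores the offset, or -1 if the line does
 * not parse or ip falls outside the reported mapping.
 */
static int segv_ip_offset(const char *line, unsigned long long *offset)
{
	unsigned long long ip, base, size;
	const char *ip_field = strstr(line, " ip ");
	const char *map_field = strrchr(line, '[');

	if (ip_field == NULL || map_field == NULL)
		return -1;
	if (sscanf(ip_field, " ip %llx", &ip) != 1)
		return -1;
	if (sscanf(map_field, "[%llx+%llx]", &base, &size) != 2)
		return -1;
	if (ip < base || ip - base >= size)
		return -1;

	*offset = ip - base;
	return 0;
}
```

Under that assumption, the 'as' line above gives an offset of 0x10690 into
as, and the 'ld' line gives 0x2a48c into libbfd-2.18.0.20080103.so. Seeing
the same offsets across repeated crashes would suggest a consistent fault
site rather than random corruption.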


Patch

diff --git a/arch/sparc/kernel/perf_event.c b/arch/sparc/kernel/perf_event.c
index 9f2b2ba..610112e 100644
--- a/arch/sparc/kernel/perf_event.c
+++ b/arch/sparc/kernel/perf_event.c
@@ -1337,7 +1337,7 @@  static void perf_callchain_user_32(struct pt_regs *regs,
 	callchain_store(entry, PERF_CONTEXT_USER);
 	callchain_store(entry, regs->tpc);
 
-	ufp = regs->u_regs[UREG_I6];
+	ufp = regs->u_regs[UREG_I6] & 0xffffffffUL;
 	do {
 		struct sparc_stackf32 *usf, sf;
 		unsigned long pc;