Message ID | 1410937624-25140-3-git-send-email-anton@samba.org (mailing list archive) |
---|---|
State | Accepted |
Commit | b3c18725a0eb7ea2458e9ae3b7e5a477f52e361f |
Delegated to: | Michael Ellerman |
Headers | show |
On Wed, 17 Sep 2014 17:07:04 +1000 Anton Blanchard <anton@samba.org> wrote: > Instead of passing in the stack address of the link register > to be modified, just pass in the old value and return the > new value and rely on ftrace_graph_caller to do the > modification. > > This removes the exception handling around the stack update - > it isn't needed and we weren't consistent about it. Later on > we would do an unprotected modification: > > if (!ftrace_graph_entry(&trace)) { > *parent = old; > First I'll say this is something I've been wanting to do with x86 for some time. That said... With this patch, things move much further in my tests. The stress test passes again. But then it fails on my stack trace test. Which is because this is what I have in the stack traces: sleep-3557 [000] d... 100.206808: <stack trace> => 0 => 0 => 0 => 0 => 0 => 0 => 0 => 0 Where without the patches I have something like this: sleep-3641 [001] d... 304.023550: <stack trace> => .ftrace_raw_event_sched_switch => .__schedule => .schedule => .do_nanosleep => .hrtimer_nanosleep => .compat_SyS_nanosleep => syscall_exit => 0 This could be broken from the earlier patches, I haven't run just this test. I probably should on them. I've attached the test. -- Steve
On Tue, 23 Sep 2014 19:46:04 -0400 Steven Rostedt <rostedt@goodmis.org> wrote: > This could be broken from the earlier patches, I haven't run just this > test. I probably should on them. I went back and tested, and it breaks under the first patch. -- Steve
Hi Steve, > > This could be broken from the earlier patches, I haven't run just > > this test. I probably should on them. > > I went back and tested, and it breaks under the first patch. Thanks for testing. It looks like some toolchains have issues other than the -fno-no-omit-frame-pointer one, and -mno-sched-epilog works around it. I'll drop that patch and respin. Anton
On Wed, 2014-09-24 at 12:22 +1000, Anton Blanchard wrote: > Hi Steve, > > > > This could be broken from the earlier patches, I haven't run just > > > this test. I probably should on them. > > > > I went back and tested, and it breaks under the first patch. > > Thanks for testing. It looks like some toolchains have issues > other than the -fno-no-omit-frame-pointer one, and -mno-sched-epilog > works around it. > > I'll drop that patch and respin. Or maybe do a toolchain check / or enable it in LE ? Ben.
Hi Ben, > > I'll drop that patch and respin. > > Or maybe do a toolchain check / or enable it in LE ? We are scratching our heads trying to remember details of the issue right now. In retrospect we should have linked the gcc bugzilla or gcc commit details in the kernel commit message :) Steve: what gcc version are you building with? Anton
On Wed, 24 Sep 2014 12:33:07 +1000 Anton Blanchard <anton@samba.org> wrote: > Hi Ben, > > > > I'll drop that patch and respin. > > > > Or maybe do a toolchain check / or enable it in LE ? > > We are scratching our heads trying to remember details of the issue > right now. In retrospect we should have linked the gcc bugzilla or > gcc commit details in the kernel commit message :) > > Steve: what gcc version are you building with? > powerpc64-linux-gcc (GCC) 4.6.3 I got it from here: https://www.kernel.org/pub/tools/crosstool/files/bin/x86_64/4.6.3/ -- Steve
On Wed, Sep 24, 2014 at 12:33:07PM +1000, Anton Blanchard wrote: > We are scratching our heads trying to remember details of the issue > right now. In retrospect we should have linked the gcc bugzilla or > gcc commit details in the kernel commit message :) There have been many GCC bugs in this area. 30282 (for 32-bit) 44199 (for 64-bit) 52828 (for everything, and this one should finally handle things for good) Also a bunch of duplicates, and I'm sure I've missed some more. The original issue as far as I remember: when using a frame pointer, GCC would sometimes schedule the epilogue to do the stack adjust before restoring all regs from the stack. Then an interrupt comes in, those saved regs are clobbered, kaboom. We cannot disable the frame pointer because -pg forces it (although PowerPC does not need it). The -mno-sched-epilog flag is a workaround: the epilogue (and prologue) will not be reordered by instruction scheduling. Slow code is better than blowing up fast ;-) Segher
Hi Segher, > On Wed, Sep 24, 2014 at 12:33:07PM +1000, Anton Blanchard wrote: > > We are scratching our heads trying to remember details of the issue > > right now. In retrospect we should have linked the gcc bugzilla or > > gcc commit details in the kernel commit message :) > > There have been many GCC bugs in this area. > > 30282 (for 32-bit) > 44199 (for 64-bit) > 52828 (for everything, and this one should finally handle things for > good) Also a bunch of duplicates, and I'm sure I've missed some more. > > The original issue as far as I remember: when using a frame pointer, > GCC would sometimes schedule the epilogue to do the stack adjust > before restoring all regs from the stack. Then an interrupt comes > in, those saved regs are clobbered, kaboom. We cannot disable the > frame pointer because -pg forces it (although PowerPC does not need > it). The -mno-sched-epilog flag is a workaround: the epilogue (and > prologue) will not be reordered by instruction scheduling. Slow code > is better than blowing up fast ;-) Thanks for explaining it! It does look like the last issue wasn't fixed until gcc 4.8. We'll drop that patch. Anton
diff --git a/arch/powerpc/kernel/entry_32.S b/arch/powerpc/kernel/entry_32.S index 22b45a4..ad837d8 100644 --- a/arch/powerpc/kernel/entry_32.S +++ b/arch/powerpc/kernel/entry_32.S @@ -1424,12 +1424,18 @@ _GLOBAL(ftrace_graph_caller) lwz r4, 44(r1) subi r4, r4, MCOUNT_INSN_SIZE - /* get the parent address */ - addi r3, r1, 52 + /* Grab the LR out of the caller stack frame */ + lwz r3,52(r1) bl prepare_ftrace_return nop + /* + * prepare_ftrace_return gives us the address we divert to. + * Change the LR in the callers stack frame to this. + */ + stw r3,52(r1) + MCOUNT_RESTORE_FRAME /* old link register ends up in ctr reg */ bctr diff --git a/arch/powerpc/kernel/entry_64.S b/arch/powerpc/kernel/entry_64.S index 955d509..9caab69 100644 --- a/arch/powerpc/kernel/entry_64.S +++ b/arch/powerpc/kernel/entry_64.S @@ -1221,13 +1221,20 @@ _GLOBAL(ftrace_graph_caller) ld r4, 128(r1) subi r4, r4, MCOUNT_INSN_SIZE - /* get the parent address */ + /* Grab the LR out of the caller stack frame */ ld r11, 112(r1) - addi r3, r11, 16 + ld r3, 16(r11) bl prepare_ftrace_return nop + /* + * prepare_ftrace_return gives us the address we divert to. + * Change the LR in the callers stack frame to this. + */ + ld r11, 112(r1) + std r3, 16(r11) + ld r0, 128(r1) mtlr r0 addi r1, r1, 112 diff --git a/arch/powerpc/kernel/ftrace.c b/arch/powerpc/kernel/ftrace.c index abf7921..d795031 100644 --- a/arch/powerpc/kernel/ftrace.c +++ b/arch/powerpc/kernel/ftrace.c @@ -512,67 +512,34 @@ int ftrace_disable_ftrace_graph_caller(void) /* * Hook the return address and push it in the stack of return addrs - * in current thread info. + * in current thread info. Return the address we want to divert to. */ -void prepare_ftrace_return(unsigned long *parent, unsigned long self_addr) +unsigned long prepare_ftrace_return(unsigned long parent, unsigned long ip) { - unsigned long old; - int faulted; struct ftrace_graph_ent trace; unsigned long return_hooker; if (unlikely(ftrace_graph_is_dead())) - return; + goto out; if (unlikely(atomic_read(¤t->tracing_graph_pause))) - return; + goto out; return_hooker = ppc_function_entry(return_to_handler); - /* - * Protect against fault, even if it shouldn't - * happen. This tool is too much intrusive to - * ignore such a protection. - */ - asm volatile( - "1: " PPC_LL "%[old], 0(%[parent])\n" - "2: " PPC_STL "%[return_hooker], 0(%[parent])\n" - " li %[faulted], 0\n" - "3:\n" - - ".section .fixup, \"ax\"\n" - "4: li %[faulted], 1\n" - " b 3b\n" - ".previous\n" - - ".section __ex_table,\"a\"\n" - PPC_LONG_ALIGN "\n" - PPC_LONG "1b,4b\n" - PPC_LONG "2b,4b\n" - ".previous" - - : [old] "=&r" (old), [faulted] "=r" (faulted) - : [parent] "r" (parent), [return_hooker] "r" (return_hooker) - : "memory" - ); - - if (unlikely(faulted)) { - ftrace_graph_stop(); - WARN_ON(1); - return; - } - - trace.func = self_addr; + trace.func = ip; trace.depth = current->curr_ret_stack + 1; /* Only trace if the calling function expects to */ - if (!ftrace_graph_entry(&trace)) { - *parent = old; - return; - } + if (!ftrace_graph_entry(&trace)) + goto out; + + if (ftrace_push_return_trace(parent, ip, &trace.depth, 0) == -EBUSY) + goto out; - if (ftrace_push_return_trace(old, self_addr, &trace.depth, 0) == -EBUSY) - *parent = old; + parent = return_hooker; +out: + return parent; } #endif /* CONFIG_FUNCTION_GRAPH_TRACER */
Instead of passing in the stack address of the link register to be modified, just pass in the old value and return the new value and rely on ftrace_graph_caller to do the modification. This removes the exception handling around the stack update - it isn't needed and we weren't consistent about it. Later on we would do an unprotected modification: if (!ftrace_graph_entry(&trace)) { *parent = old; Signed-off-by: Anton Blanchard <anton@samba.org> --- arch/powerpc/kernel/entry_32.S | 10 +++++-- arch/powerpc/kernel/entry_64.S | 11 ++++++-- arch/powerpc/kernel/ftrace.c | 59 ++++++++++-------------------------------- 3 files changed, 30 insertions(+), 50 deletions(-)