Message ID | 1429614216-30538-1-git-send-email-naveen.n.rao@linux.vnet.ibm.com (mailing list archive) |
---|---|
State | Superseded |
Headers | show |
On Tue, 21 Apr 2015 16:33:36 +0530 "Naveen N. Rao" <naveen.n.rao@linux.vnet.ibm.com> wrote: > Add a new powerpc-specific trace clock using the timebase register, > similar to x86-tsc. This gives us a fast, monotonic, cross-cpu clock > for trace entries and can be used to correlate events across cpus as > well as across hypervisor and guest (assuming it is not a migrated guest > with a non-zero tb_offset). > > Signed-off-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com> > --- > I have followed the approach used by x86-tsc here, but we could get rid of > trace_clock.c if we directly use get_tb() with perhaps the notrace annotation. > Would that be preferable? > Probably. But all clocks used by tracing should be marked by notrace. Don't just wrap it with a notrace. But looking at the code, it seems that get_tb() is a static inline, which wont work as a pointer. Seems you still need the indirect function call. Note, all "inline" functions are notrace by default, so you do not need to add any notrace annotation to an inlined function. -- Steve
On 2015/04/21 09:25AM, Steven Rostedt wrote: > On Tue, 21 Apr 2015 16:33:36 +0530 > "Naveen N. Rao" <naveen.n.rao@linux.vnet.ibm.com> wrote: > > > Add a new powerpc-specific trace clock using the timebase register, > > similar to x86-tsc. This gives us a fast, monotonic, cross-cpu clock > > for trace entries and can be used to correlate events across cpus as > > well as across hypervisor and guest (assuming it is not a migrated guest > > with a non-zero tb_offset). > > > > Signed-off-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com> > > --- > > I have followed the approach used by x86-tsc here, but we could get rid of > > trace_clock.c if we directly use get_tb() with perhaps the notrace annotation. > > Would that be preferable? > > > > Probably. But all clocks used by tracing should be marked by notrace. > Don't just wrap it with a notrace. But looking at the code, it seems > that get_tb() is a static inline, which wont work as a pointer. Seems > you still need the indirect function call. > > Note, all "inline" functions are notrace by default, so you do not need > to add any notrace annotation to an inlined function. Steve, Thanks for the clarification - the current approach is better in that case. Paul, Mike, Can you please let me know your thoughts on this? - Naveen
On Wed, 2015-04-22 at 10:48 +0530, Naveen N. Rao wrote: > On 2015/04/21 09:25AM, Steven Rostedt wrote: > > On Tue, 21 Apr 2015 16:33:36 +0530 > > "Naveen N. Rao" <naveen.n.rao@linux.vnet.ibm.com> wrote: > > > > > Add a new powerpc-specific trace clock using the timebase register, > > > similar to x86-tsc. This gives us a fast, monotonic, cross-cpu clock > > > for trace entries and can be used to correlate events across cpus as > > > well as across hypervisor and guest (assuming it is not a migrated guest > > > with a non-zero tb_offset). > > > > > > Signed-off-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com> > > > --- > > > I have followed the approach used by x86-tsc here, but we could get rid of > > > trace_clock.c if we directly use get_tb() with perhaps the notrace annotation. > > > Would that be preferable? > > > > > > > Probably. But all clocks used by tracing should be marked by notrace. > > Don't just wrap it with a notrace. But looking at the code, it seems > > that get_tb() is a static inline, which wont work as a pointer. Seems > > you still need the indirect function call. > > > > Note, all "inline" functions are notrace by default, so you do not need > > to add any notrace annotation to an inlined function. > > Steve, > Thanks for the clarification - the current approach is better in that > case. > > Paul, Mike, > Can you please let me know your thoughts on this? What is the value in adding a powerpc specific clock, which requires educating people to use it, vs just using the global clock? cheers
On 2015/04/23 01:14PM, Michael Ellerman wrote: > On Wed, 2015-04-22 at 10:48 +0530, Naveen N. Rao wrote: > > On 2015/04/21 09:25AM, Steven Rostedt wrote: > > > On Tue, 21 Apr 2015 16:33:36 +0530 > > > "Naveen N. Rao" <naveen.n.rao@linux.vnet.ibm.com> wrote: > > > > > > > Add a new powerpc-specific trace clock using the timebase register, > > > > similar to x86-tsc. This gives us a fast, monotonic, cross-cpu clock > > > > for trace entries and can be used to correlate events across cpus as > > > > well as across hypervisor and guest (assuming it is not a migrated guest > > > > with a non-zero tb_offset). > > > > > > > > Signed-off-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com> > > > > --- > > > > I have followed the approach used by x86-tsc here, but we could get rid of > > > > trace_clock.c if we directly use get_tb() with perhaps the notrace annotation. > > > > Would that be preferable? > > > > > > > > > > Probably. But all clocks used by tracing should be marked by notrace. > > > Don't just wrap it with a notrace. But looking at the code, it seems > > > that get_tb() is a static inline, which wont work as a pointer. Seems > > > you still need the indirect function call. > > > > > > Note, all "inline" functions are notrace by default, so you do not need > > > to add any notrace annotation to an inlined function. > > > > Steve, > > Thanks for the clarification - the current approach is better in that > > case. > > > > Paul, Mike, > > Can you please let me know your thoughts on this? > > What is the value in adding a powerpc specific clock, which requires educating > people to use it, vs just using the global clock? Hi Michael, There are two reasons: - One, using the timebase register gives us a very fast (just one instruction) hardware clock source for trace entries compared to the global clock. - Two, using the timebase values allows us to correlate trace events across hypervisor and guest, which is not possible using the other trace clocks. [With migrated guests, there could be a non-zero tb_offset which we may also want to make available to userspace in some manner. Any suggestions?] This will show up as an additional trace clock and can be selected if one wishes to use it. As an example: [root@rhel7-img ~]# cd /sys/kernel/debug/tracing [root@rhel7-img tracing]# cat trace_clock [local] global counter uptime perf mono ppc-tb [root@rhel7-img tracing]# head -30 trace # tracer: nop # # entries-in-buffer/entries-written: 534/534 #P:4 # # _-----=> irqs-off # / _----=> need-resched # | / _---=> hardirq/softirq # || / _--=> preempt-depth # ||| / delay # TASK-PID CPU# |||| TIMESTAMP FUNCTION # | | | |||| | | tuned-1080 [003] .... 1108.294606: sys_select -> 0x0 tuned-1080 [003] .... 1108.294655: sys_select(n: 0, inp: 0, outp: 0, exp: 0, tvp: 3fff9b7fcdf0) bash-2732 [001] .... 1108.336317: sys_write -> 0x6 bash-2732 [001] .... 1108.336321: sys_dup2(oldfd: a, newfd: 1) bash-2732 [001] .... 1108.336323: sys_dup2 -> 0x1 bash-2732 [001] .... 1108.336327: sys_fcntl(fd: a, cmd: 1, arg: 0) bash-2732 [001] .... 1108.336328: sys_fcntl -> 0x1 bash-2732 [001] .... 1108.336330: sys_close(fd: a) bash-2732 [001] .... 1108.336330: sys_close -> 0x0 bash-2732 [001] .... 1108.336340: sys_rt_sigprocmask(how: 0, nset: 0, oset: 10146e48, sigsetsize: 8) bash-2732 [001] .... 1108.336341: sys_rt_sigprocmask -> 0x0 bash-2732 [001] .... 1108.336342: sys_rt_sigaction(sig: 2, act: 3fffc17d8e20, oact: 3fffc17d8d80, sigsetsize: 8) bash-2732 [001] .... 1108.336343: sys_rt_sigaction -> 0x0 bash-2732 [001] .... 1108.336352: sys_rt_sigprocmask(how: 0, nset: 0, oset: 10146e48, sigsetsize: 8) bash-2732 [001] .... 1108.336352: sys_rt_sigprocmask -> 0x0 bash-2732 [001] .... 1108.336476: sys_write(fd: 1, buf: 3fff7f370000, count: 2d) bash-2732 [001] .... 1108.336484: sys_write -> 0x2d bash-2732 [001] .... 1108.336512: sys_rt_sigprocmask(how: 0, nset: 3fffc17d7c40, oset: 3fffc17d7cc0, sigsetsize: 8) bash-2732 [001] .... 1108.336513: sys_rt_sigprocmask -> 0x0 [root@rhel7-img tracing]# echo ppc-tb > trace_clock [root@rhel7-img tracing]# cat trace_clock local global counter uptime perf mono [ppc-tb] [root@rhel7-img tracing]# echo 0 > trace [root@rhel7-img tracing]# head -30 trace # tracer: nop # # entries-in-buffer/entries-written: 396/396 #P:4 # # _-----=> irqs-off # / _----=> need-resched # | / _---=> hardirq/softirq # || / _--=> preempt-depth # ||| / delay # TASK-PID CPU# |||| TIMESTAMP FUNCTION # | | | |||| | | bash-2732 [002] .... 34405585249004: sys_write -> 0x7 bash-2732 [002] .... 34405585251162: sys_dup2(oldfd: a, newfd: 1) bash-2732 [002] .... 34405585252104: sys_dup2 -> 0x1 bash-2732 [002] .... 34405585253774: sys_fcntl(fd: a, cmd: 1, arg: 0) bash-2732 [002] .... 34405585254385: sys_fcntl -> 0x1 bash-2732 [002] .... 34405585255051: sys_close(fd: a) bash-2732 [002] .... 34405585255589: sys_close -> 0x0 bash-2732 [002] .... 34405585260898: sys_rt_sigprocmask(how: 0, nset: 0, oset: 10146e48, sigsetsize: 8) bash-2732 [002] .... 34405585261482: sys_rt_sigprocmask -> 0x0 bash-2732 [002] .... 34405585262339: sys_rt_sigaction(sig: 2, act: 3fffc17d8e20, oact: 3fffc17d8d80, sigsetsize: 8) bash-2732 [002] .... 34405585262927: sys_rt_sigaction -> 0x0 bash-2732 [002] .... 34405585267215: sys_rt_sigprocmask(how: 0, nset: 0, oset: 10146e48, sigsetsize: 8) bash-2732 [002] .... 34405585267690: sys_rt_sigprocmask -> 0x0 bash-2732 [002] .... 34405585345924: sys_write(fd: 1, buf: 3fff7f370000, count: 2d) bash-2732 [002] .n.. 34405585350359: sys_write -> 0x2d bash-2732 [002] .... 34405585428096: sys_rt_sigprocmask(how: 0, nset: 3fffc17d7c40, oset: 3fffc17d7cc0, sigsetsize: 8) bash-2732 [002] .... 34405585447011: sys_rt_sigprocmask -> 0x0 bash-2732 [002] .... 34405585447956: sys_ioctl(fd: ff, cmd: 80047476, arg: 3fffc17d7c08) bash-2732 [002] .... 34405585449451: sys_ioctl -> 0x0 - Naveen
diff --git a/Documentation/trace/ftrace.txt b/Documentation/trace/ftrace.txt index 572ca92..689f61a 100644 --- a/Documentation/trace/ftrace.txt +++ b/Documentation/trace/ftrace.txt @@ -346,6 +346,11 @@ of ftrace. Here is a list of some of the key files: x86-tsc: Architectures may define their own clocks. For example, x86 uses its own TSC cycle clock here. + ppc-tb: This uses the powerpc timebase register value. + This is in sync across CPUs and can also be used + to correlate events across hypervisor/guest if + tb_offset is known. + To set a clock, simply echo the clock name into this file. echo global > trace_clock diff --git a/arch/powerpc/include/asm/Kbuild b/arch/powerpc/include/asm/Kbuild index 382b28e..5041c66 100644 --- a/arch/powerpc/include/asm/Kbuild +++ b/arch/powerpc/include/asm/Kbuild @@ -5,5 +5,4 @@ generic-y += mcs_spinlock.h generic-y += preempt.h generic-y += rwsem.h generic-y += scatterlist.h -generic-y += trace_clock.h generic-y += vtime.h diff --git a/arch/powerpc/include/asm/trace_clock.h b/arch/powerpc/include/asm/trace_clock.h new file mode 100644 index 0000000..0b0d094 --- /dev/null +++ b/arch/powerpc/include/asm/trace_clock.h @@ -0,0 +1,27 @@ +/* + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License, version 2, as + * published by the Free Software Foundation. + * + * Copyright (C) 2015 Naveen N. Rao, IBM Corporation + */ + +#ifndef _ASM_PPC_TRACE_CLOCK_H +#define _ASM_PPC_TRACE_CLOCK_H + +#include <linux/compiler.h> +#include <linux/types.h> + +#ifdef CONFIG_TRACE_CLOCK + +extern u64 notrace trace_clock_ppc_tb(void); + +#define ARCH_TRACE_CLOCKS { trace_clock_ppc_tb, "ppc-tb", 0 }, + +#else + +#define ARCH_TRACE_CLOCKS + +#endif /* CONFIG_TRACE_CLOCK */ + +#endif /* _ASM_PPC_TRACE_CLOCK_H */ diff --git a/arch/powerpc/kernel/Makefile b/arch/powerpc/kernel/Makefile index 502cf69..f9936f3 100644 --- a/arch/powerpc/kernel/Makefile +++ b/arch/powerpc/kernel/Makefile @@ -118,6 +118,7 @@ obj-$(CONFIG_PPC_IO_WORKAROUNDS) += io-workarounds.o obj-$(CONFIG_DYNAMIC_FTRACE) += ftrace.o obj-$(CONFIG_FUNCTION_GRAPH_TRACER) += ftrace.o obj-$(CONFIG_FTRACE_SYSCALLS) += ftrace.o +obj-$(CONFIG_TRACE_CLOCK) += trace_clock.o ifneq ($(CONFIG_PPC_INDIRECT_PIO),y) obj-y += iomap.o diff --git a/arch/powerpc/kernel/trace_clock.c b/arch/powerpc/kernel/trace_clock.c new file mode 100644 index 0000000..4917069 --- /dev/null +++ b/arch/powerpc/kernel/trace_clock.c @@ -0,0 +1,15 @@ +/* + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License, version 2, as + * published by the Free Software Foundation. + * + * Copyright (C) 2015 Naveen N. Rao, IBM Corporation + */ + +#include <asm/trace_clock.h> +#include <asm/time.h> + +u64 notrace trace_clock_ppc_tb(void) +{ + return get_tb(); +}
Add a new powerpc-specific trace clock using the timebase register, similar to x86-tsc. This gives us a fast, monotonic, cross-cpu clock for trace entries and can be used to correlate events across cpus as well as across hypervisor and guest (assuming it is not a migrated guest with a non-zero tb_offset). Signed-off-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com> --- I have followed the approach used by x86-tsc here, but we could get rid of trace_clock.c if we directly use get_tb() with perhaps the notrace annotation. Would that be preferable? Thanks, Naveen Documentation/trace/ftrace.txt | 5 +++++ arch/powerpc/include/asm/Kbuild | 1 - arch/powerpc/include/asm/trace_clock.h | 27 +++++++++++++++++++++++++++ arch/powerpc/kernel/Makefile | 1 + arch/powerpc/kernel/trace_clock.c | 15 +++++++++++++++ 5 files changed, 48 insertions(+), 1 deletion(-) create mode 100644 arch/powerpc/include/asm/trace_clock.h create mode 100644 arch/powerpc/kernel/trace_clock.c