Message ID | 1421753750-17135-1-git-send-email-aneesh.kumar@linux.vnet.ibm.com (mailing list archive) |
---|---|
State | Changes Requested |
Delegated to: | Michael Ellerman |
Headers | show |
On Tue, 2015-01-20 at 17:05 +0530, Aneesh Kumar K.V wrote: > This enables us to understand how many hash fault we are taking > when running benchmarks. > > For ex: > -bash-4.2# ./perf stat -e powerpc:hash_fault -e page-faults /tmp/ebizzy.ppc64 -S 30 -P -n 1000 > ... > > Performance counter stats for '/tmp/ebizzy.ppc64 -S 30 -P -n 1000': > > 1,10,04,075 powerpc:hash_fault > 1,10,03,429 page-faults > > 30.865978991 seconds time elapsed Looks good. Can you attach some test results that show it's not hurting performance when it's disabled. cheers
Michael Ellerman <mpe@ellerman.id.au> writes: > On Tue, 2015-01-20 at 17:05 +0530, Aneesh Kumar K.V wrote: >> This enables us to understand how many hash fault we are taking >> when running benchmarks. >> >> For ex: >> -bash-4.2# ./perf stat -e powerpc:hash_fault -e page-faults /tmp/ebizzy.ppc64 -S 30 -P -n 1000 >> ... >> >> Performance counter stats for '/tmp/ebizzy.ppc64 -S 30 -P -n 1000': >> >> 1,10,04,075 powerpc:hash_fault >> 1,10,03,429 page-faults >> >> 30.865978991 seconds time elapsed > > Looks good. > > Can you attach some test results that show it's not hurting performance when > it's disabled. > ebizzy with -S 30 -t 1 -P gave 13627 records/s -> Without patch 13546 records/s -> With patch with tracepoint disabled 12408 records/s -> With patch with tracepoint enabled. perf stat gave the below data for the above run. 22,38,284 page-faults 22,38,291 powerpc:hash_fault I also used random_access_bench that Anton wrote, it actually create lots of hash fault. A simple run gives. (random_access_bench -o load -g -i -t 10 16G) 1,888 page-faults 2,64,283 powerpc:hash_fault random_access_bench gave: 1435.979 MB/s -> Without patch 1435.29 MB/s -> With patch with tracepoint disabled 1434.75 MB/s -> With patch with tracepoint enabled. -aneesh
On Wed, 2015-01-21 at 14:15 +0530, Aneesh Kumar K.V wrote: > Michael Ellerman <mpe@ellerman.id.au> writes: > > > On Tue, 2015-01-20 at 17:05 +0530, Aneesh Kumar K.V wrote: > >> This enables us to understand how many hash fault we are taking > >> when running benchmarks. > >> > >> For ex: > >> -bash-4.2# ./perf stat -e powerpc:hash_fault -e page-faults /tmp/ebizzy.ppc64 -S 30 -P -n 1000 > >> ... > >> > >> Performance counter stats for '/tmp/ebizzy.ppc64 -S 30 -P -n 1000': > >> > >> 1,10,04,075 powerpc:hash_fault > >> 1,10,03,429 page-faults > >> > >> 30.865978991 seconds time elapsed > > > > Looks good. > > > > Can you attach some test results that show it's not hurting performance when > > it's disabled. > > ebizzy with -S 30 -t 1 -P gave > 13627 records/s -> Without patch > 13546 records/s -> With patch with tracepoint disabled OK. So that's about -0.6%. Are we happy with that? I'm not sure. Can you do a few more runs and see if that's a stable result. > random_access_bench gave: > 1435.979 MB/s -> Without patch > 1435.29 MB/s -> With patch with tracepoint disabled That's more like -0.05% which is in the noise. cheers
Hi, > > ebizzy with -S 30 -t 1 -P gave > > 13627 records/s -> Without patch > > 13546 records/s -> With patch with tracepoint disabled > > OK. So that's about -0.6%. Are we happy with that? I'm not sure. > > Can you do a few more runs and see if that's a stable result. Surprisingly large. Is CONFIG_JUMP_LABEL enabled? That should reduce the tracepoint to just a nop. Anton
Michael Ellerman <mpe@ellerman.id.au> writes: > On Wed, 2015-01-21 at 14:15 +0530, Aneesh Kumar K.V wrote: >> Michael Ellerman <mpe@ellerman.id.au> writes: >> >> > On Tue, 2015-01-20 at 17:05 +0530, Aneesh Kumar K.V wrote: >> >> This enables us to understand how many hash fault we are taking >> >> when running benchmarks. >> >> >> >> For ex: >> >> -bash-4.2# ./perf stat -e powerpc:hash_fault -e page-faults /tmp/ebizzy.ppc64 -S 30 -P -n 1000 >> >> ... >> >> >> >> Performance counter stats for '/tmp/ebizzy.ppc64 -S 30 -P -n 1000': >> >> >> >> 1,10,04,075 powerpc:hash_fault >> >> 1,10,03,429 page-faults >> >> >> >> 30.865978991 seconds time elapsed >> > >> > Looks good. >> > >> > Can you attach some test results that show it's not hurting performance when >> > it's disabled. >> >> ebizzy with -S 30 -t 1 -P gave >> 13627 records/s -> Without patch >> 13546 records/s -> With patch with tracepoint disabled > > OK. So that's about -0.6%. Are we happy with that? I'm not sure. > > Can you do a few more runs and see if that's a stable result. That is within the run variance for that test. Infact I found it difficult to get a stable records/s with ebizzy run, even after fixing the random state variable and forcing single thread. I ended up doing a micro benchmark that allocate a large region and touch one byte per page. That resulted in time perf stat -e page-faults -e powerpc:hash_fault ./a.out Performance counter stats for './a.out': 10,00,062 page-faults 10,00,068 powerpc:hash_fault 12.414350121 seconds time elapsed real 0m12.558s user 0m0.577s sys 0m11.932s and with that test we have an average system time for 10 run Without patch sys: 0m11.2425 With patch: sys: 0m11.3258 ie, a -0.7% impact If that impact is high we could possibly put that tracepoint within #ifdef CONFIG_DEBUG_VM ? -aneesh
Anton Blanchard <anton@samba.org> writes: > Hi, > >> > ebizzy with -S 30 -t 1 -P gave >> > 13627 records/s -> Without patch >> > 13546 records/s -> With patch with tracepoint disabled >> >> OK. So that's about -0.6%. Are we happy with that? I'm not sure. >> >> Can you do a few more runs and see if that's a stable result. > > Surprisingly large. Is CONFIG_JUMP_LABEL enabled? That should reduce the > tracepoint to just a nop. yes. We do use jump label. I also verified that looking at .s #APP # 23 "./arch/powerpc/include/asm/jump_label.h" 1 1: nop .pushsection __jump_table, "aw" .llong 1b, .L201, __tracepoint_hash_fault+8 #, .popsection # 0 "" 2 ..... -aneesh
Hi Aneesh, > yes. We do use jump label. I also verified that looking at .s > > #APP > # 23 "./arch/powerpc/include/asm/jump_label.h" 1 > 1: > nop > .pushsection __jump_table, "aw" > .llong 1b, .L201, __tracepoint_hash_fault+8 #, > .popsection > > # 0 "" 2 So we insert a single nop, and the slow path is in another section. I'd be surprised if we could measure this, unless the nop causes a branch target alignment issue the slow path caused some hot path icache layout issues. > Without patch > sys: 0m11.2425 > > With patch: > sys: 0m11.3258 > > ie, a -0.7% impact > > If that impact is high we could possibly put that tracepoint within > #ifdef CONFIG_DEBUG_VM ? Did the real time change? I'd be careful about comparing based on system time. My feeling is we should not hide it behind CONFIG_DEBUG_VM. Anton
Anton Blanchard <anton@samba.org> writes: > Hi Aneesh, > >> yes. We do use jump label. I also verified that looking at .s >> >> #APP >> # 23 "./arch/powerpc/include/asm/jump_label.h" 1 >> 1: >> nop >> .pushsection __jump_table, "aw" >> .llong 1b, .L201, __tracepoint_hash_fault+8 #, >> .popsection >> >> # 0 "" 2 > > So we insert a single nop, and the slow path is in another section. I'd > be surprised if we could measure this, unless the nop causes a branch > target alignment issue the slow path caused some hot path icache layout > issues. > >> Without patch >> sys: 0m11.2425 >> >> With patch: >> sys: 0m11.3258 >> >> ie, a -0.7% impact >> >> If that impact is high we could possibly put that tracepoint within >> #ifdef CONFIG_DEBUG_VM ? > > Did the real time change? I'd be careful about comparing based on > system time. Yes it did. 11.8769 with patch 11.7924 without patch I will look at the perf stat data difference between the both runs and update. -aneesh
"Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com> writes: > Michael Ellerman <mpe@ellerman.id.au> writes: > .... .... > With patch: > sys: 0m11.3258 > > ie, a -0.7% impact > > If that impact is high we could possibly put that tracepoint within #ifdef > CONFIG_DEBUG_VM ? Since the ebizzy runs results were not stable, I did a micro benchmark to measure this and I noticed that results observed are within the run variance of the test. I made sure we don't have context-switches between the runs. If I try to get large number of page-faults, we end up with context switches. for ex: We get without patch -------------------------------- root@qemu-pr-host trace-fault]# bash run Performance counter stats for './a.out 3000 300': 643 page-faults # 0.089 M/sec 7.236562 task-clock (msec) # 0.928 CPUs utilized 2,179,213 stalled-cycles-frontend # 0.00% frontend cycles idle 17,174,367 stalled-cycles-backend # 0.00% backend cycles idle 0 context-switches # 0.000 K/sec 0.007794658 seconds time elapsed [root@qemu-pr-host trace-fault]# And with-patch: --------------- [root@qemu-pr-host trace-fault]# bash run Performance counter stats for './a.out 3000 300': 643 page-faults # 0.089 M/sec 7.233746 task-clock (msec) # 0.921 CPUs utilized 0 context-switches # 0.000 K/sec 0.007854876 seconds time elapsed Performance counter stats for './a.out 3000 300': 643 page-faults # 0.087 M/sec 649 powerpc:hash_fault # 0.087 M/sec 7.430376 task-clock (msec) # 0.938 CPUs utilized 2,347,174 stalled-cycles-frontend # 0.00% frontend cycles idle 17,524,282 stalled-cycles-backend # 0.00% backend cycles idle 0 context-switches # 0.000 K/sec 0.007920284 seconds time elapsed [root@qemu-pr-host trace-fault]#
diff --git a/arch/powerpc/include/asm/trace.h b/arch/powerpc/include/asm/trace.h index c15da6073cb8..ded56b183888 100644 --- a/arch/powerpc/include/asm/trace.h +++ b/arch/powerpc/include/asm/trace.h @@ -142,8 +142,29 @@ TRACE_EVENT_FN(opal_exit, opal_tracepoint_regfunc, opal_tracepoint_unregfunc ); + #endif +TRACE_EVENT(hash_fault, + + TP_PROTO(unsigned long addr, unsigned long access, unsigned long trap), + TP_ARGS(addr, access, trap), + TP_STRUCT__entry( + __field(unsigned long, addr) + __field(unsigned long, access) + __field(unsigned long, trap) + ), + + TP_fast_assign( + __entry->addr = addr; + __entry->access = access; + __entry->trap = trap; + ), + + TP_printk("hash fault with addr 0x%lx and access = 0x%lx trap = 0x%lx", + __entry->addr, __entry->access, __entry->trap) +); + #endif /* _TRACE_POWERPC_H */ #undef TRACE_INCLUDE_PATH diff --git a/arch/powerpc/mm/hash_utils_64.c b/arch/powerpc/mm/hash_utils_64.c index 2c2022d16059..7e88470a876f 100644 --- a/arch/powerpc/mm/hash_utils_64.c +++ b/arch/powerpc/mm/hash_utils_64.c @@ -57,6 +57,7 @@ #include <asm/fadump.h> #include <asm/firmware.h> #include <asm/tm.h> +#include <asm/trace.h> #ifdef DEBUG #define DBG(fmt...) udbg_printf(fmt) @@ -1004,6 +1005,7 @@ int hash_page_mm(struct mm_struct *mm, unsigned long ea, DBG_LOW("hash_page(ea=%016lx, access=%lx, trap=%lx\n", ea, access, trap); + trace_hash_fault(ea, access, trap); /* Get region & vsid */ switch (REGION_ID(ea)) {
This enables us to understand how many hash fault we are taking when running benchmarks. For ex: -bash-4.2# ./perf stat -e powerpc:hash_fault -e page-faults /tmp/ebizzy.ppc64 -S 30 -P -n 1000 ... Performance counter stats for '/tmp/ebizzy.ppc64 -S 30 -P -n 1000': 1,10,04,075 powerpc:hash_fault 1,10,03,429 page-faults 30.865978991 seconds time elapsed Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> --- arch/powerpc/include/asm/trace.h | 21 +++++++++++++++++++++ arch/powerpc/mm/hash_utils_64.c | 2 ++ 2 files changed, 23 insertions(+)