Message ID | 20090714065425.555222245@samba.org (mailing list archive) |
---|---|
State | Accepted, archived |
Commit | de4376c2846bb5a8fc6fe8dbd0e4ff30905493e6 |
Delegated to: | Benjamin Herrenschmidt |
On Tue, 2009-07-14 at 16:53 +1000, Anton Blanchard wrote:
> plain text document attachment (preload_0x10000000)
> TASK_UNMAPPED_BASE is not used with the new top down mmap layout. We can
> reuse this preload slot by loading in the segment at 0x10000000, where almost
> all PowerPC binaries are linked at.
>
> On a microbenchmark that bounces a token between two 64bit processes over pipes
> and calls gettimeofday each iteration (to access the VDSO), both the 32bit and
> 64bit context switch rate improves (tested on a 4GHz POWER6):
>
> 32bit: 273k/sec -> 283k/sec
> 64bit: 277k/sec -> 284k/sec

Any chance you can put that little test program online somewhere ?

Cheers,
Ben.

Hi Ben,
> Any chance you can put that little test program online somewhere ?
Sure, it's here:
http://ozlabs.org/~anton/junkcode/context_switch.c
Anton
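
For readers who only want the shape of the test, here is a minimal sketch of a token-bouncing benchmark of the kind described in the quoted message. It is not the context_switch.c linked above: the function name token_bounce_rate, the one-byte token and the absence of CPU pinning are all assumptions made for illustration, and the value returned is round trips per second, which is not necessarily the metric quoted in the patch description.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/time.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Illustrative only, not context_switch.c.
 * Bounce a one-byte token between a parent and a child over two pipes,
 * calling gettimeofday() on every iteration as the description says.
 * Returns completed round trips per second, or -1.0 on error.
 */
double token_bounce_rate(long iterations)
{
	int to_child[2], to_parent[2];
	struct timeval start, end, scratch;
	char token = 'x';
	double secs;
	pid_t pid;
	long i;

	assert(iterations > 0);

	if (pipe(to_child) < 0 || pipe(to_parent) < 0)
		return -1.0;

	pid = fork();
	if (pid < 0)
		return -1.0;

	if (pid == 0) {
		/* Child: echo each token back until the parent closes its end. */
		close(to_child[1]);
		close(to_parent[0]);
		while (read(to_child[0], &token, 1) == 1) {
			gettimeofday(&scratch, NULL);
			if (write(to_parent[1], &token, 1) != 1)
				break;
		}
		_exit(0);
	}

	/* Parent: send the token, wait for it to come back, repeat. */
	close(to_child[0]);
	close(to_parent[1]);

	gettimeofday(&start, NULL);
	for (i = 0; i < iterations; i++) {
		gettimeofday(&scratch, NULL);
		if (write(to_child[1], &token, 1) != 1 ||
		    read(to_parent[0], &token, 1) != 1)
			break;
	}
	gettimeofday(&end, NULL);

	close(to_child[1]);	/* EOF on the pipe makes the child exit */
	close(to_parent[0]);
	waitpid(pid, NULL, 0);

	secs = (end.tv_sec - start.tv_sec) +
	       (end.tv_usec - start.tv_usec) / 1e6;
	return secs > 0.0 ? i / secs : -1.0;
}
```

Calling token_bounce_rate(1000000) from a small main() and printing the result gives a rough figure. Without pinning both processes to one CPU the two ends may run on different CPUs, so the number may not reflect the context switch path the patch targets.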
Index: linux.trees.git/arch/powerpc/mm/slb.c
===================================================================
--- linux.trees.git.orig/arch/powerpc/mm/slb.c	2009-07-14 15:09:39.000000000 +1000
+++ linux.trees.git/arch/powerpc/mm/slb.c	2009-07-14 15:12:42.000000000 +1000
@@ -191,7 +191,7 @@
 	unsigned long slbie_data = 0;
 	unsigned long pc = KSTK_EIP(tsk);
 	unsigned long stack = KSTK_ESP(tsk);
-	unsigned long unmapped_base;
+	unsigned long exec_base;
 
 	if (!cpu_has_feature(CPU_FTR_NO_SLBIE_B) &&
 	    offset <= SLB_CACHE_ENTRIES) {
@@ -219,14 +219,13 @@
 
 	/*
 	 * preload some userspace segments into the SLB.
+	 * Almost all 32 and 64bit PowerPC executables are linked at
+	 * 0x10000000 so it makes sense to preload this segment.
 	 */
-	if (test_tsk_thread_flag(tsk, TIF_32BIT))
-		unmapped_base = TASK_UNMAPPED_BASE_USER32;
-	else
-		unmapped_base = TASK_UNMAPPED_BASE_USER64;
+	exec_base = 0x10000000;
 
 	if (is_kernel_addr(pc) || is_kernel_addr(stack) ||
-	    is_kernel_addr(unmapped_base))
+	    is_kernel_addr(exec_base))
 		return;
 
 	slb_allocate(pc);
@@ -234,9 +233,9 @@
 	if (!esids_match(pc, stack))
 		slb_allocate(stack);
 
-	if (!esids_match(pc, unmapped_base) &&
-	    !esids_match(stack, unmapped_base))
-		slb_allocate(unmapped_base);
+	if (!esids_match(pc, exec_base) &&
+	    !esids_match(stack, exec_base))
+		slb_allocate(exec_base);
 }
 
 static inline void patch_slb_encoding(unsigned int *insn_addr,

TASK_UNMAPPED_BASE is not used with the new top down mmap layout. We can
reuse this preload slot by loading in the segment at 0x10000000, where almost
all PowerPC binaries are linked at.

On a microbenchmark that bounces a token between two 64bit processes over pipes
and calls gettimeofday each iteration (to access the VDSO), both the 32bit and
64bit context switch rate improves (tested on a 4GHz POWER6):

32bit: 273k/sec -> 283k/sec
64bit: 277k/sec -> 284k/sec

Signed-off-by: Anton Blanchard <anton@samba.org>
---
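
The preload decision in the patch can be modelled in userspace as a rough sketch. The names esid_of, same_segment and preload_list are hypothetical and stand in for the kernel's esids_match() and slb_allocate(). The sketch assumes 256MB segments only (segment number = address >> 28) and leaves out the is_kernel_addr() checks, so it is a simplification rather than the kernel's logic.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define SEG_SHIFT 28	/* assumption: 256MB segments only */

/* Segment number (ESID) of an address under the 256MB assumption. */
static unsigned long esid_of(unsigned long addr)
{
	return addr >> SEG_SHIFT;
}

/* Simplified stand-in for esids_match(): true if both lie in one segment. */
static bool same_segment(unsigned long a, unsigned long b)
{
	return esid_of(a) == esid_of(b);
}

/*
 * Fill out[] with the addresses whose segments would be preloaded, in the
 * order the patch uses (pc, stack, exec_base), skipping any segment that
 * is already covered. Returns the number of entries written (1 to 3).
 */
int preload_list(unsigned long pc, unsigned long stack, unsigned long out[3])
{
	const unsigned long exec_base = 0x10000000UL;
	int n = 0;

	assert(out != NULL);

	out[n++] = pc;
	if (!same_segment(pc, stack))
		out[n++] = stack;
	if (!same_segment(pc, exec_base) && !same_segment(stack, exec_base))
		out[n++] = exec_base;
	return n;
}
```

Under these assumptions the third slot only matters when the task was switched out while running outside the executable's own segment, for example in a shared library or the VDSO. If pc already lies in the segment at 0x10000000, the final check finds a match and no extra entry is loaded.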