Message ID | 20171011083502.11648-1-santosh@fossix.org (mailing list archive) |
---|---|
State | Superseded |
Series | [v5] powerpc/vdso64: Add support for CLOCK_{REALTIME/MONOTONIC}_COARSE |
On 2017/10/11 08:35AM, Santosh Sivaraj wrote:
> Current vDSO64 implementation does not have support for coarse clocks
> (CLOCK_MONOTONIC_COARSE, CLOCK_REALTIME_COARSE), for which it falls back
> to system call, increasing the response time, vDSO implementation reduces
> the cycle time. Below is a benchmark of the difference in execution times.
>
> (Non-coarse clocks are also included just for completion)
>
> clock-gettime-realtime: syscall: 172 nsec/call
> clock-gettime-realtime: libc: 28 nsec/call
> clock-gettime-realtime: vdso: 22 nsec/call
> clock-gettime-monotonic: syscall: 171 nsec/call
> clock-gettime-monotonic: libc: 30 nsec/call
> clock-gettime-monotonic: vdso: 25 nsec/call
> clock-gettime-realtime-coarse: syscall: 153 nsec/call
> clock-gettime-realtime-coarse: libc: 16 nsec/call
> clock-gettime-realtime-coarse: vdso: 10 nsec/call
> clock-gettime-monotonic-coarse: syscall: 167 nsec/call
> clock-gettime-monotonic-coarse: libc: 17 nsec/call
> clock-gettime-monotonic-coarse: vdso: 11 nsec/call
>
> CC: Benjamin Herrenschmidt <benh@kernel.crashing.org>
> Signed-off-by: Santosh Sivaraj <santosh@fossix.org>
> ---
> Changelog:
> v1:
> - gettimeofday was moved from asm to C
> - Coarse timer support addition
> v2:
> - Moved Syscall fallback from inline assembly back to its original place in
>   __kernel_clock_gettime
> v3:
> - Based on Ben's input, coarse timer support was added in assembly itself
> - Dropped idea of conversion to C due to the vdso update_count variable
>   being optimized out in C
> v4:
> - Based on Naveen's comments restructured code to avoid a duplicate code
>   block
> v5:
> - Skip creating dependency for registers that are not used for
>   CLOCK_REALTIME_COARSE (Naveen)
> - Reorder instructions to get proper dependency setup (Naveen)
>
>  arch/powerpc/kernel/asm-offsets.c         |  2 +
>  arch/powerpc/kernel/vdso64/gettimeofday.S | 68 ++++++++++++++++++++++++++-----
>  2 files changed, 59 insertions(+), 11 deletions(-)

Looks good to me.

Reviewed-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Hi,

On Wed, Oct 11, 2017 at 02:05:02PM +0530, Santosh Sivaraj wrote:
> +70:	ld	r8,CFG_TB_UPDATE_COUNT(r3)
> +	andi.	r0,r8,1		/* pending update ? loop */
> +	bne-	70b
> +	xor	r0,r8,r8	/* create dependency */

r0 already is 0 here, and already depends on r8. Or is this trying
to do something else?

> +	add	r3,r3,r0

Segher
* Segher Boessenkool <segher@kernel.crashing.org> wrote (on 2017-10-11 17:02:16 +0000):

Hi Segher,

> Hi,
>
> On Wed, Oct 11, 2017 at 02:05:02PM +0530, Santosh Sivaraj wrote:
> > +70:	ld	r8,CFG_TB_UPDATE_COUNT(r3)
> > +	andi.	r0,r8,1		/* pending update ? loop */
> > +	bne-	70b
> > +	xor	r0,r8,r8	/* create dependency */
>
> r0 already is 0 here, and already depends on r8. Or is this trying
> to do something else?

The function from which this piece was borrowed from had a similar
implementation, didn't figure out why the extra dependency, kept it the way it
was. See __do_get_tspec function in the same file
(arch/powerpc/kernel/vdso64/gettimeofday.S).

>
> > +	add	r3,r3,r0
>
>
> Segher
Hi!

On Thu, Oct 12, 2017 at 10:03:17AM +0530, Santosh Sivaraj wrote:
> * Segher Boessenkool <segher@kernel.crashing.org> wrote (on 2017-10-11 17:02:16 +0000):
> > On Wed, Oct 11, 2017 at 02:05:02PM +0530, Santosh Sivaraj wrote:
> > > +70:	ld	r8,CFG_TB_UPDATE_COUNT(r3)
> > > +	andi.	r0,r8,1		/* pending update ? loop */
> > > +	bne-	70b
> > > +	xor	r0,r8,r8	/* create dependency */
> >
> > r0 already is 0 here, and already depends on r8. Or is this trying
> > to do something else?
>
> The function from which this piece was borrowed from had a similar
> implementation, didn't figure out why the extra dependency, kept it the way it
> was. See __do_get_tspec function in the same file
> (arch/powerpc/kernel/vdso64/gettimeofday.S).

The xor there is superfluous, too.

Segher
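For readers following the thread, the retry loop under discussion (load the update count, spin while it is odd, read the time fields, re-read the count and retry on a mismatch) can be sketched in C. This is only an illustration and not the kernel's code: `struct coarse_page`, `struct coarse_time` and `read_coarse` are hypothetical names, and the sketch orders the loads with a C11 acquire fence where the patch uses an artificial data dependency (`xor`/`add`) on the data page pointer.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the vDSO data page; field names are illustrative. */
struct coarse_page {
	_Atomic uint64_t update_count;	/* odd while the writer is mid-update */
	_Atomic int64_t xtime_sec;
	_Atomic int64_t xtime_nsec;
};

struct coarse_time {
	int64_t sec;
	int64_t nsec;
};

/* Read a consistent (sec, nsec) pair, retrying if the writer intervened. */
static struct coarse_time read_coarse(struct coarse_page *page)
{
	struct coarse_time t;
	uint64_t before, after;

	assert(page != NULL);
	do {
		/* Spin while an update is pending (count is odd). */
		do {
			before = atomic_load_explicit(&page->update_count,
						      memory_order_acquire);
		} while (before & 1);

		t.sec = atomic_load_explicit(&page->xtime_sec,
					     memory_order_relaxed);
		t.nsec = atomic_load_explicit(&page->xtime_nsec,
					      memory_order_relaxed);

		/* Keep the data loads ahead of the counter re-read. */
		atomic_thread_fence(memory_order_acquire);
		after = atomic_load_explicit(&page->update_count,
					     memory_order_relaxed);
	} while (after != before);	/* count changed: data may be torn */

	return t;
}
```

Using atomic loads also keeps a compiler from folding the second read of the counter into the first, which is one way to avoid the kind of problem the v3 changelog entry describes for the C conversion.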
diff --git a/arch/powerpc/kernel/asm-offsets.c b/arch/powerpc/kernel/asm-offsets.c
index 8cfb20e38cfe..b55c68c54dc1 100644
--- a/arch/powerpc/kernel/asm-offsets.c
+++ b/arch/powerpc/kernel/asm-offsets.c
@@ -396,6 +396,8 @@ int main(void)
 	/* Other bits used by the vdso */
 	DEFINE(CLOCK_REALTIME, CLOCK_REALTIME);
 	DEFINE(CLOCK_MONOTONIC, CLOCK_MONOTONIC);
+	DEFINE(CLOCK_REALTIME_COARSE, CLOCK_REALTIME_COARSE);
+	DEFINE(CLOCK_MONOTONIC_COARSE, CLOCK_MONOTONIC_COARSE);
 	DEFINE(NSEC_PER_SEC, NSEC_PER_SEC);
 	DEFINE(CLOCK_REALTIME_RES, MONOTONIC_RES_NSEC);

diff --git a/arch/powerpc/kernel/vdso64/gettimeofday.S b/arch/powerpc/kernel/vdso64/gettimeofday.S
index 382021324883..058e7c24f670 100644
--- a/arch/powerpc/kernel/vdso64/gettimeofday.S
+++ b/arch/powerpc/kernel/vdso64/gettimeofday.S
@@ -64,6 +64,12 @@ V_FUNCTION_BEGIN(__kernel_clock_gettime)
 	cmpwi	cr0,r3,CLOCK_REALTIME
 	cmpwi	cr1,r3,CLOCK_MONOTONIC
 	cror	cr0*4+eq,cr0*4+eq,cr1*4+eq
+
+	cmpwi	cr5,r3,CLOCK_REALTIME_COARSE
+	cmpwi	cr6,r3,CLOCK_MONOTONIC_COARSE
+	cror	cr5*4+eq,cr5*4+eq,cr6*4+eq
+
+	cror	cr0*4+eq,cr0*4+eq,cr5*4+eq
 	bne	cr0,99f

 	mflr	r12			/* r12 saves lr */
@@ -72,6 +78,7 @@ V_FUNCTION_BEGIN(__kernel_clock_gettime)
 	bl	V_LOCAL_FUNC(__get_datapage)	/* get data page */
 	lis	r7,NSEC_PER_SEC@h	/* want nanoseconds */
 	ori	r7,r7,NSEC_PER_SEC@l
+	beq	cr5,70f
 50:	bl	V_LOCAL_FUNC(__do_get_tspec)	/* get time from tb & kernel */
 	bne	cr1,80f		/* if not monotonic, all done */

@@ -97,19 +104,58 @@ V_FUNCTION_BEGIN(__kernel_clock_gettime)
 	ld	r0,CFG_TB_UPDATE_COUNT(r3)
 	cmpld	cr0,r0,r8		/* check if updated */
 	bne-	50b
+	b	78f

-	/* Add wall->monotonic offset and check for overflow or underflow.
+	/*
+	 * For coarse clocks we get data directly from the vdso data page, so
+	 * we don't need to call __do_get_tspec, but we still need to do the
+	 * counter trick.
 	 */
-	add	r4,r4,r6
-	add	r5,r5,r9
-	cmpd	cr0,r5,r7
-	cmpdi	cr1,r5,0
-	blt	1f
-	subf	r5,r7,r5
-	addi	r4,r4,1
-1:	bge	cr1,80f
-	addi	r4,r4,-1
-	add	r5,r5,r7
+70:	ld	r8,CFG_TB_UPDATE_COUNT(r3)
+	andi.	r0,r8,1			/* pending update ? loop */
+	bne-	70b
+	xor	r0,r8,r8		/* create dependency */
+	add	r3,r3,r0
+
+	/*
+	 * CLOCK_REALTIME_COARSE, below values are needed for MONOTONIC_COARSE
+	 * too
+	 */
+	ld	r4,STAMP_XTIME+TSPC64_TV_SEC(r3)
+	ld	r5,STAMP_XTIME+TSPC64_TV_NSEC(r3)
+	bne	cr6,75f
+
+	/* CLOCK_MONOTONIC_COARSE */
+	lwa	r6,WTOM_CLOCK_SEC(r3)
+	lwa	r9,WTOM_CLOCK_NSEC(r3)
+
+	/* check if counter has updated */
+	or	r0,r6,r9
+75:	or	r0,r0,r4
+	or	r0,r0,r5
+	xor	r0,r0,r0
+	add	r3,r3,r0
+	ld	r0,CFG_TB_UPDATE_COUNT(r3)
+	cmpld	cr0,r0,r8		/* check if updated */
+	bne-	70b
+
+	/* Counter has not updated, so continue calculating proper values for
+	 * sec and nsec if monotonic coarse, or just return with the proper
+	 * values for realtime.
+	 */
+	bne	cr6,80f
+
+	/* Add wall->monotonic offset and check for overflow or underflow */
+78:	add	r4,r4,r6
+	add	r5,r5,r9
+	cmpd	cr0,r5,r7
+	cmpdi	cr1,r5,0
+	blt	79f
+	subf	r5,r7,r5
+	addi	r4,r4,1
+79:	bge	cr1,80f
+	addi	r4,r4,-1
+	add	r5,r5,r7

 80:	std	r4,TSPC64_TV_SEC(r11)
 	std	r5,TSPC64_TV_NSEC(r11)
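The block at label 78 in the diff adds the wall-to-monotonic offset and then brings the nanosecond field back into range. A rough C equivalent is given below to make the overflow and underflow cases easier to follow; `struct ts64` and `add_wall_to_monotonic` are hypothetical names used only for this sketch, and it assumes the incoming wall-clock nanoseconds are already normalized.

```c
#include <assert.h>
#include <stdint.h>

#define NSEC_PER_SEC 1000000000LL

/* Hypothetical seconds/nanoseconds pair, mirroring r4/r5 in the assembly. */
struct ts64 {
	int64_t sec;
	int64_t nsec;
};

/*
 * Add the wall->monotonic offset (r6/r9 in the assembly) to a wall-clock
 * time and renormalize nsec into [0, NSEC_PER_SEC).
 */
static struct ts64 add_wall_to_monotonic(struct ts64 wall,
					 int64_t wtom_sec, int64_t wtom_nsec)
{
	struct ts64 mono;

	assert(wall.nsec >= 0 && wall.nsec < NSEC_PER_SEC);

	mono.sec = wall.sec + wtom_sec;		/* add r4,r4,r6 */
	mono.nsec = wall.nsec + wtom_nsec;	/* add r5,r5,r9 */

	if (mono.nsec >= NSEC_PER_SEC) {	/* overflow: carry into sec */
		mono.nsec -= NSEC_PER_SEC;
		mono.sec += 1;
	} else if (mono.nsec < 0) {		/* underflow: borrow from sec */
		mono.nsec += NSEC_PER_SEC;
		mono.sec -= 1;
	}
	return mono;
}
```

The assembly performs both comparisons before either adjustment (cr0 against NSEC_PER_SEC, cr1 against zero), which is why the C version can use a plain `if`/`else if`: the two cases cannot both apply to the same sum.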
Current vDSO64 implementation does not have support for coarse clocks
(CLOCK_MONOTONIC_COARSE, CLOCK_REALTIME_COARSE), for which it falls back
to system call, increasing the response time, vDSO implementation reduces
the cycle time. Below is a benchmark of the difference in execution times.

(Non-coarse clocks are also included just for completion)

clock-gettime-realtime: syscall: 172 nsec/call
clock-gettime-realtime: libc: 28 nsec/call
clock-gettime-realtime: vdso: 22 nsec/call
clock-gettime-monotonic: syscall: 171 nsec/call
clock-gettime-monotonic: libc: 30 nsec/call
clock-gettime-monotonic: vdso: 25 nsec/call
clock-gettime-realtime-coarse: syscall: 153 nsec/call
clock-gettime-realtime-coarse: libc: 16 nsec/call
clock-gettime-realtime-coarse: vdso: 10 nsec/call
clock-gettime-monotonic-coarse: syscall: 167 nsec/call
clock-gettime-monotonic-coarse: libc: 17 nsec/call
clock-gettime-monotonic-coarse: vdso: 11 nsec/call

CC: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Santosh Sivaraj <santosh@fossix.org>
---
Changelog:
v1:
- gettimeofday was moved from asm to C
- Coarse timer support addition
v2:
- Moved Syscall fallback from inline assembly back to its original place in
  __kernel_clock_gettime
v3:
- Based on Ben's input, coarse timer support was added in assembly itself
- Dropped idea of conversion to C due to the vdso update_count variable
  being optimized out in C
v4:
- Based on Naveen's comments restructured code to avoid a duplicate code
  block
v5:
- Skip creating dependency for registers that are not used for
  CLOCK_REALTIME_COARSE (Naveen)
- Reorder instructions to get proper dependency setup (Naveen)

 arch/powerpc/kernel/asm-offsets.c         |  2 +
 arch/powerpc/kernel/vdso64/gettimeofday.S | 68 ++++++++++++++++++++++++++-----
 2 files changed, 59 insertions(+), 11 deletions(-)
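The commit message does not say which tool produced the per-call timings. As a minimal sketch of how a comparable "libc" versus "syscall" measurement could be taken on Linux, the helpers below time a loop of `clock_gettime()` calls and a loop of direct `syscall(SYS_clock_gettime, ...)` calls. The function names are hypothetical, the sketch does not call the vDSO symbol directly, and it is not the benchmark used for the numbers above.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stdint.h>
#include <sys/syscall.h>
#include <time.h>
#include <unistd.h>

/* Nanoseconds elapsed between two CLOCK_MONOTONIC samples. */
static int64_t elapsed_ns(const struct timespec *a, const struct timespec *b)
{
	return (int64_t)(b->tv_sec - a->tv_sec) * 1000000000LL +
	       (int64_t)(b->tv_nsec - a->tv_nsec);
}

/* Average cost of clock_gettime() through the C library (vDSO if present). */
static double libc_ns_per_call(clockid_t clk, long iterations)
{
	struct timespec start, end, ts;

	assert(iterations > 0);
	clock_gettime(CLOCK_MONOTONIC, &start);
	for (long i = 0; i < iterations; i++)
		clock_gettime(clk, &ts);
	clock_gettime(CLOCK_MONOTONIC, &end);
	return (double)elapsed_ns(&start, &end) / (double)iterations;
}

/* Average cost when every call is forced through the system call. */
static double syscall_ns_per_call(clockid_t clk, long iterations)
{
	struct timespec start, end, ts;

	assert(iterations > 0);
	clock_gettime(CLOCK_MONOTONIC, &start);
	for (long i = 0; i < iterations; i++)
		syscall(SYS_clock_gettime, clk, &ts);
	clock_gettime(CLOCK_MONOTONIC, &end);
	return (double)elapsed_ns(&start, &end) / (double)iterations;
}
```

Calling `libc_ns_per_call(CLOCK_REALTIME_COARSE, 1000000)` and `syscall_ns_per_call(CLOCK_REALTIME_COARSE, 1000000)` on kernels with and without the patch would be one way to look for the kind of gap reported in the commit message; absolute values depend on the machine.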