From patchwork Thu Jul 20 09:58:34 2017
X-Patchwork-Submitter: Santosh Sivaraj
X-Patchwork-Id: 791546
From: Santosh Sivaraj
To: linuxppc-dev
Cc: Srikar Dronamraju
Subject: [PATCH] powerpc/vdso64: Add support for CLOCK_{REALTIME/MONOTONIC}_COARSE
Date: Thu, 20 Jul 2017 15:28:34 +0530
Message-Id: <20170720095834.15153-1-santosh@fossix.org>
X-Mailer: git-send-email 2.9.4
List-Id: Linux on PowerPC Developers Mail List

The current vDSO64 implementation does not support the coarse clocks
(CLOCK_MONOTONIC_COARSE, CLOCK_REALTIME_COARSE); requests for them fall
back to the system call. The benchmark below compares execution time with
and without vDSO support for these clocks. (The non-coarse clocks are
included for completeness.)

Without vDSO support:
--------------------
clock-gettime-realtime: syscall: 1547 nsec/call
clock-gettime-realtime: libc: 258 nsec/call
clock-gettime-realtime: vdso: 180 nsec/call
clock-gettime-monotonic: syscall: 1399 nsec/call
clock-gettime-monotonic: libc: 317 nsec/call
clock-gettime-monotonic: vdso: 249 nsec/call
clock-gettime-realtime-coarse: syscall: 1228 nsec/call
clock-gettime-realtime-coarse: libc: 1320 nsec/call
clock-gettime-realtime-coarse: vdso: 1330 nsec/call
clock-gettime-monotonic-coarse: syscall: 1263 nsec/call
clock-gettime-monotonic-coarse: libc: 1368 nsec/call
clock-gettime-monotonic-coarse: vdso: 1258 nsec/call

With vDSO support:
------------------
clock-gettime-realtime: syscall: 1660 nsec/call
clock-gettime-realtime: libc: 251 nsec/call
clock-gettime-realtime: vdso: 180 nsec/call
clock-gettime-monotonic: syscall: 1514 nsec/call
clock-gettime-monotonic: libc: 309 nsec/call
clock-gettime-monotonic: vdso: 239 nsec/call
clock-gettime-realtime-coarse: syscall: 1228 nsec/call
clock-gettime-realtime-coarse: libc: 172 nsec/call
clock-gettime-realtime-coarse: vdso: 101 nsec/call
clock-gettime-monotonic-coarse: syscall: 1347 nsec/call
clock-gettime-monotonic-coarse: libc: 187 nsec/call
clock-gettime-monotonic-coarse: vdso: 125 nsec/call

The benchmarks were run with https://github.com/nlynch-mentor/vdsotest.git.
CC: Benjamin Herrenschmidt
Signed-off-by: Santosh Sivaraj
---
 arch/powerpc/include/asm/vdso.h           |   1 +
 arch/powerpc/kernel/vdso64/Makefile       |   2 +-
 arch/powerpc/kernel/vdso64/gettime.c      | 162 ++++++++++++++++++++++++++++++
 arch/powerpc/kernel/vdso64/gettimeofday.S |  82 ++-------------
 4 files changed, 174 insertions(+), 73 deletions(-)
 create mode 100644 arch/powerpc/kernel/vdso64/gettime.c

diff --git a/arch/powerpc/include/asm/vdso.h b/arch/powerpc/include/asm/vdso.h
index c53f5f6..721e4cf 100644
--- a/arch/powerpc/include/asm/vdso.h
+++ b/arch/powerpc/include/asm/vdso.h
@@ -23,6 +23,7 @@ extern unsigned long vdso32_sigtramp;
 extern unsigned long vdso32_rt_sigtramp;
 
 int vdso_getcpu_init(void);
+struct vdso_data *__get_datapage(void);
 
 #else /* __ASSEMBLY__ */
 
diff --git a/arch/powerpc/kernel/vdso64/Makefile b/arch/powerpc/kernel/vdso64/Makefile
index 31107bf..8958d87 100644
--- a/arch/powerpc/kernel/vdso64/Makefile
+++ b/arch/powerpc/kernel/vdso64/Makefile
@@ -1,6 +1,6 @@
 # List of files in the vdso, has to be asm only for now
 
-obj-vdso64 = sigtramp.o gettimeofday.o datapage.o cacheflush.o note.o getcpu.o
+obj-vdso64 = sigtramp.o gettimeofday.o datapage.o cacheflush.o note.o getcpu.o gettime.o
 
 # Build rules
 
diff --git a/arch/powerpc/kernel/vdso64/gettime.c b/arch/powerpc/kernel/vdso64/gettime.c
new file mode 100644
index 0000000..01f411f
--- /dev/null
+++ b/arch/powerpc/kernel/vdso64/gettime.c
@@ -0,0 +1,162 @@
+/*
+ * Userland implementation of gettimeofday() for 64 bits processes in a
+ * ppc64 kernel for use in the vDSO
+ *
+ * Copyright (C) 2017 Santosh Sivaraj (santosh@fossix.org), IBM.
+ *
+ * Originally implemented in assembly by:
+ *   Benjamin Herrenschmidt (benh@kernel.crashing.org),
+ *   IBM Corp.
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License
+ * as published by the Free Software Foundation; either version
+ * 2 of the License, or (at your option) any later version.
+ */
+
+#include
+#include
+#include
+#include
+#include
+#include
+
+static notrace void kernel_get_tspec(struct timespec *tp,
+				     struct vdso_data *vdata, u32 *wtom_sec,
+				     u32 *wtom_nsec)
+{
+	u64 tb;
+	u32 update_count;
+
+	do {
+		/* check for update count & load values */
+		update_count = vdata->tb_update_count;
+
+		/* Get TB, offset it and scale result */
+		tb = mulhdu((get_tb() - vdata->tb_orig_stamp) << 12,
+			    vdata->tb_to_xs) + vdata->stamp_sec_fraction;
+		tp->tv_sec = vdata->stamp_xtime.tv_sec;
+		if (wtom_sec)
+			*wtom_sec = vdata->wtom_clock_sec;
+		if (wtom_nsec)
+			*wtom_nsec = vdata->wtom_clock_nsec;
+	} while (update_count != vdata->tb_update_count);
+
+	tp->tv_nsec = ((u64)mulhwu(tb, NSEC_PER_SEC) << 32) >> 32;
+	tp->tv_sec += (tb >> 32);
+}
+
+static notrace int clock_get_realtime(struct timespec *tp,
+				      struct vdso_data *vdata)
+{
+	kernel_get_tspec(tp, vdata, NULL, NULL);
+
+	return 0;
+}
+
+static notrace int clock_get_monotonic(struct timespec *tp,
+				       struct vdso_data *vdata)
+{
+	__s32 wtom_sec, wtom_nsec;
+	u64 nsec;
+
+	kernel_get_tspec(tp, vdata, &wtom_sec, &wtom_nsec);
+
+	tp->tv_sec += wtom_sec;
+
+	nsec = tp->tv_nsec;
+	tp->tv_nsec = 0;
+	timespec_add_ns(tp, nsec + wtom_nsec);
+
+	return 0;
+}
+
+static notrace int clock_realtime_coarse(struct timespec *tp,
+					 struct vdso_data *vdata)
+{
+	u32 update_count;
+
+	do {
+		/* check for update count & load values */
+		update_count = vdata->tb_update_count;
+
+		tp->tv_sec = vdata->stamp_xtime.tv_sec;
+		tp->tv_nsec = vdata->stamp_xtime.tv_nsec;
+	} while (update_count != vdata->tb_update_count);
+
+	return 0;
+}
+
+static notrace int clock_monotonic_coarse(struct timespec *tp,
+					  struct vdso_data *vdata)
+{
+	__s32 wtom_sec, wtom_nsec;
+	u64 nsec;
+	u32 update_count;
+
+	do {
+		/* check for update count & load values */
+		update_count = vdata->tb_update_count;
+
+		tp->tv_sec = vdata->stamp_xtime.tv_sec;
+		tp->tv_nsec = vdata->stamp_xtime.tv_nsec;
+		wtom_sec = vdata->wtom_clock_sec;
+		wtom_nsec = vdata->wtom_clock_nsec;
+	} while (update_count != vdata->tb_update_count);
+
+	tp->tv_sec += wtom_sec;
+	nsec = tp->tv_nsec;
+	tp->tv_nsec = 0;
+	timespec_add_ns(tp, nsec + wtom_nsec);
+
+	return 0;
+}
+
+static notrace int gettime_syscall_fallback(clockid_t clk_id,
+					    struct timespec *tp)
+{
+	register clockid_t id asm("r3") = clk_id;
+	register struct timespec *t asm("r4") = tp;
+	register int nr asm("r0") = __NR_clock_gettime;
+	register int ret asm("r3");
+
+	asm volatile("sc"
+		     : "=r" (ret)
+		     : "r"(nr), "r"(id), "r"(t)
+		     : "memory");
+
+	return ret;
+}
+
+notrace int kernel_clock_gettime(clockid_t clk_id, struct timespec *tp)
+{
+	int ret;
+	struct vdso_data *vdata = __get_datapage();
+
+	if (!tp || !vdata)
+		return -EBADR;
+
+	switch (clk_id) {
+	case CLOCK_REALTIME:
+		ret = clock_get_realtime(tp, vdata);
+		break;
+	case CLOCK_MONOTONIC:
+		ret = clock_get_monotonic(tp, vdata);
+		break;
+	case CLOCK_REALTIME_COARSE:
+		ret = clock_realtime_coarse(tp, vdata);
+		break;
+	case CLOCK_MONOTONIC_COARSE:
+		ret = clock_monotonic_coarse(tp, vdata);
+		break;
+	default:
+		/* fallback to syscall */
+		ret = -1;
+		break;
+	}
+
+	if (ret)
+		ret = gettime_syscall_fallback(clk_id, tp);
+
+	return ret;
+}
diff --git a/arch/powerpc/kernel/vdso64/gettimeofday.S b/arch/powerpc/kernel/vdso64/gettimeofday.S
index 3820213..1258009 100644
--- a/arch/powerpc/kernel/vdso64/gettimeofday.S
+++ b/arch/powerpc/kernel/vdso64/gettimeofday.S
@@ -16,6 +16,8 @@
 #include
 #include
 
+.global	kernel_clock_gettime
+
 	.text
 /*
  * Exact prototype of gettimeofday
@@ -51,85 +53,21 @@ V_FUNCTION_BEGIN(__kernel_gettimeofday)
   .cfi_endproc
 V_FUNCTION_END(__kernel_gettimeofday)
 
-
-/*
- * Exact prototype of clock_gettime()
- *
- * int __kernel_clock_gettime(clockid_t clock_id, struct timespec *tp);
- *
- */
 V_FUNCTION_BEGIN(__kernel_clock_gettime)
   .cfi_startproc
-	/* Check for supported clock IDs */
-	cmpwi	cr0,r3,CLOCK_REALTIME
-	cmpwi	cr1,r3,CLOCK_MONOTONIC
-	cror	cr0*4+eq,cr0*4+eq,cr1*4+eq
-	bne	cr0,99f
-
-	mflr	r12			/* r12 saves lr */
-  .cfi_register lr,r12
-	mr	r11,r4			/* r11 saves tp */
-	bl	V_LOCAL_FUNC(__get_datapage)	/* get data page */
-	lis	r7,NSEC_PER_SEC@h	/* want nanoseconds */
-	ori	r7,r7,NSEC_PER_SEC@l
-50:	bl	V_LOCAL_FUNC(__do_get_tspec)	/* get time from tb & kernel */
-	bne	cr1,80f			/* if not monotonic, all done */
-
-	/*
-	 * CLOCK_MONOTONIC
-	 */
-
-	/* now we must fixup using wall to monotonic. We need to snapshot
-	 * that value and do the counter trick again. Fortunately, we still
-	 * have the counter value in r8 that was returned by __do_get_tspec.
-	 * At this point, r4,r5 contain our sec/nsec values.
-	 */
-
-	lwa	r6,WTOM_CLOCK_SEC(r3)
-	lwa	r9,WTOM_CLOCK_NSEC(r3)
-
-	/* We now have our result in r6,r9. We create a fake dependency
-	 * on that result and re-check the counter
-	 */
-	or	r0,r6,r9
-	xor	r0,r0,r0
-	add	r3,r3,r0
-	ld	r0,CFG_TB_UPDATE_COUNT(r3)
-	cmpld	cr0,r0,r8		/* check if updated */
-	bne-	50b
-
-	/* Add wall->monotonic offset and check for overflow or underflow.
-	 */
-	add	r4,r4,r6
-	add	r5,r5,r9
-	cmpd	cr0,r5,r7
-	cmpdi	cr1,r5,0
-	blt	1f
-	subf	r5,r7,r5
-	addi	r4,r4,1
-1:	bge	cr1,80f
-	addi	r4,r4,-1
-	add	r5,r5,r7
-
-80:	std	r4,TSPC64_TV_SEC(r11)
-	std	r5,TSPC64_TV_NSEC(r11)
-
-	mtlr	r12
+	mflr	r6			/* r6 saves lr */
+	stwu	r1,-112(r1)
+  .cfi_register lr,r6
+	std	r6,24(r1)
+	bl	V_LOCAL_FUNC(kernel_clock_gettime)
 	crclr	cr0*4+so
-	li	r3,0
-	blr
-
-	/*
-	 * syscall fallback
-	 */
-99:
-	li	r0,__NR_clock_gettime
-	sc
+	ld	r6,24(r1)
+	addi	r1,r1,112
+	mtlr	r6
 	blr
   .cfi_endproc
 V_FUNCTION_END(__kernel_clock_gettime)
 
-
 /*
  * Exact prototype of clock_getres()
  *