From patchwork Wed Jun 11 14:59:41 2014
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Wilco
X-Patchwork-Id: 358734
Delivered-To: mailing list libc-alpha@sourceware.org
From: "Wilco"
To: "GNU C Library"
Subject: [PATCH] AArch64: Add libc_feholdsetround_noex_aarch64_ctx
Date: Wed, 11 Jun 2014 15:59:41 +0100
Message-ID: <004301cf8585$c6248360$526d8a20$@com>

Hi,

This patch adds a new function, libc_feholdsetround_noex_aarch64_ctx,
enabling further optimization. libc_feholdsetround_aarch64_ctx now only
needs to read the FPCR in the typical case, avoiding a redundant FPSR
read. Performance results show a good improvement (5-10% on sin()) on
cores with expensive FPCR/FPSR instructions.

OK for commit?

Wilco

ChangeLog:

2014-06-11  Wilco

	* sysdeps/aarch64/fpu/math_private.h
	(libc_feholdsetround_noex_aarch64_ctx): New function.
---
 sysdeps/aarch64/fpu/math_private.h | 30 +++++++++++++++++++++++++++---
 1 file changed, 27 insertions(+), 3 deletions(-)

diff --git a/sysdeps/aarch64/fpu/math_private.h b/sysdeps/aarch64/fpu/math_private.h
index 023c9d0..b13c030 100644
--- a/sysdeps/aarch64/fpu/math_private.h
+++ b/sysdeps/aarch64/fpu/math_private.h
@@ -228,12 +228,9 @@ static __always_inline void
 libc_feholdsetround_aarch64_ctx (struct rm_ctx *ctx, int r)
 {
   fpu_control_t fpcr;
-  fpu_fpsr_t fpsr;
   int round;
 
   _FPU_GETCW (fpcr);
-  _FPU_GETFPSR (fpsr);
-  ctx->env.__fpsr = fpsr;
 
   /* Check whether rounding modes are different. */
   round = (fpcr ^ r) & _FPU_FPCR_RM_MASK;
@@ -264,6 +261,33 @@ libc_feresetround_aarch64_ctx (struct rm_ctx *ctx)
 #define libc_feresetroundl_ctx libc_feresetround_aarch64_ctx
 
 static __always_inline void
+libc_feholdsetround_noex_aarch64_ctx (struct rm_ctx *ctx, int r)
+{
+  fpu_control_t fpcr;
+  fpu_fpsr_t fpsr;
+  int round;
+
+  _FPU_GETCW (fpcr);
+  _FPU_GETFPSR (fpsr);
+  ctx->env.__fpsr = fpsr;
+
+  /* Check whether rounding modes are different. */
+  round = (fpcr ^ r) & _FPU_FPCR_RM_MASK;
+  ctx->updated_status = round != 0;
+
+  /* Set the rounding mode if changed. */
+  if (__glibc_unlikely (round != 0))
+    {
+      ctx->env.__fpcr = fpcr;
+      _FPU_SETCW (fpcr ^ round);
+    }
+}
+
+#define libc_feholdsetround_noex_ctx libc_feholdsetround_noex_aarch64_ctx
+#define libc_feholdsetround_noexf_ctx libc_feholdsetround_noex_aarch64_ctx
+#define libc_feholdsetround_noexl_ctx libc_feholdsetround_noex_aarch64_ctx
+
+static __always_inline void
 libc_feresetround_noex_aarch64_ctx (struct rm_ctx *ctx)
 {
   /* Restore the rounding mode if updated. */
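
Illustration (not part of the patch): the XOR/mask step in the functions
above can be tried outside glibc with a small simulation. This is only a
sketch under stated assumptions. The sim_* names, struct sim_fpu and
struct sim_ctx are hypothetical stand-ins for the real FPCR accessors and
struct rm_ctx, and SIM_RM_MASK assumes the AArch64 rounding mode field
sits in FPCR bits 23:22. The read/write counters stand in for the
expensive system register accesses the patch tries to avoid.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumption: rounding mode field (RMode) occupies FPCR bits 23:22.  */
#define SIM_RM_MASK 0xc00000u

/* Hypothetical stand-in for the control register, with access counters.  */
struct sim_fpu
{
  uint32_t fpcr;
  unsigned reads;
  unsigned writes;
};

/* Hypothetical, simplified counterpart of struct rm_ctx.  */
struct sim_ctx
{
  uint32_t saved_fpcr;
  bool updated_status;
};

static uint32_t
sim_getcw (struct sim_fpu *fpu)
{
  fpu->reads++;
  return fpu->fpcr;
}

static void
sim_setcw (struct sim_fpu *fpu, uint32_t value)
{
  fpu->writes++;
  fpu->fpcr = value;
}

/* Same shape as the hold/set step in the patch: one register read, and a
   write only when the requested mode differs from the current one.  */
static void
sim_holdsetround (struct sim_fpu *fpu, struct sim_ctx *ctx, uint32_t r)
{
  /* R must carry rounding mode bits only.  */
  assert ((r & ~SIM_RM_MASK) == 0);

  uint32_t fpcr = sim_getcw (fpu);

  /* Nonzero exactly when the rounding mode bits differ.  */
  uint32_t round = (fpcr ^ r) & SIM_RM_MASK;
  ctx->updated_status = round != 0;

  if (round != 0)
    {
      ctx->saved_fpcr = fpcr;
      /* Flipping the differing bits yields R's mode and leaves all other
         control bits untouched.  */
      sim_setcw (fpu, fpcr ^ round);
    }
}

/* Restore the saved control word only if it was changed.  */
static void
sim_resetround (struct sim_fpu *fpu, const struct sim_ctx *ctx)
{
  if (ctx->updated_status)
    sim_setcw (fpu, ctx->saved_fpcr);
}
```

When the requested mode already matches, the simulated register is read
once and never written, which is the common case the patch optimizes for.
When it differs, the unrelated control bits survive both the set and the
restore.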