From patchwork Tue Sep 10 18:15:48 2019
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: "Paul A. Clarke"
X-Patchwork-Id: 1160512
From: "Paul A. Clarke"
To: libc-alpha@sourceware.org
Cc: tuliom@ascii.art.br, murphyp@linux.ibm.com
Subject: [PATCH] [powerpc] SET_RESTORE_ROUND optimizations and bug fix
Date: Tue, 10 Sep 2019 13:15:48 -0500
Message-Id: <1568139348-32434-1-git-send-email-pc@us.ibm.com>

From: "Paul A. Clarke"

SET_RESTORE_ROUND brackets a block of code, temporarily setting and
restoring the rounding mode and letting everything else, including
exceptions generated within the block, pass through.

On powerpc, the current code clears the exception enables, which will hide
exceptions generated within the block.  This issue was introduced by me
in commit e905212627350d54b58426214b5a54ddc852b0c9.

Fix this by not clearing exception enable bits in the prologue.

Also, since we are no longer changing the enable bits in either the
prologue or the epilogue, there is no need to test for entering/exiting
non-stop mode.

Also, optimize the prologue get/save/set rounding mode operations for
POWER9 and later by using 'mffscrn' when possible.

Fixes: e905212627350d54b58426214b5a54ddc852b0c9

2019-09-10  Paul A. Clarke

	* sysdeps/powerpc/fpu/fenv_libc.h (fegetenv_and_set_rn): New.
	(__fe_mffscrn): New.
	* sysdeps/powerpc/fpu/fenv_private.h (libc_feholdsetround_ppc_ctx): Do
	not clear enable bits, remove obsolete code, use
	fegetenv_and_set_rn.
	(libc_feresetround_ppc): Remove obsolete code, use
	fegetenv_and_set_rn.
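
The difference between the old and new prologue can be illustrated with a
small model that is not part of the patch.  It is a minimal sketch under
stated assumptions: the FPSCR is represented by an ordinary integer, the
mask values are assumed from the usual FPSCR layout (RN in the two lowest
bits, the five enables VE/OE/UE/ZE/XE in 0xf8), and every sim_* name is
hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed bit layout of the low-order FPSCR byte.  */
#define SIM_RN_MASK      0x03ULL   /* rounding mode */
#define SIM_ENABLES_MASK 0xf8ULL   /* VE, OE, UE, ZE, XE */

/* Hypothetical stand-in for the hardware register.  */
static uint64_t sim_fpscr;

/* Prologue before the patch: clears the enables together with RN.  */
static uint64_t
sim_hold_old (uint64_t rn)
{
  uint64_t old = sim_fpscr;
  sim_fpscr = (old & ~(SIM_ENABLES_MASK | SIM_RN_MASK)) | rn;
  return old;
}

/* Prologue after the patch: replaces RN and leaves the enables alone.  */
static uint64_t
sim_hold_new (uint64_t rn)
{
  uint64_t old = sim_fpscr;
  sim_fpscr = (old & ~SIM_RN_MASK) | (rn & SIM_RN_MASK);
  return old;
}

/* Compares both prologues starting from "VE enabled, RN = 1".  */
static void
sim_self_check (void)
{
  sim_fpscr = 0x80ULL | 0x01ULL;
  sim_hold_old (2);
  assert ((sim_fpscr & SIM_ENABLES_MASK) == 0);        /* enables lost */

  sim_fpscr = 0x80ULL | 0x01ULL;
  sim_hold_new (2);
  assert ((sim_fpscr & SIM_ENABLES_MASK) == 0x80ULL);  /* enables kept */
  assert ((sim_fpscr & SIM_RN_MASK) == 2);
}
```

In this model, code that runs between the old prologue and the epilogue
sees all enables cleared, which corresponds to the hidden exceptions
described above.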
---
 sysdeps/powerpc/fpu/fenv_libc.h    | 32 ++++++++++++++++++++++++++++++++
 sysdeps/powerpc/fpu/fenv_private.h | 23 ++++-------------------
 2 files changed, 36 insertions(+), 19 deletions(-)

diff --git a/sysdeps/powerpc/fpu/fenv_libc.h b/sysdeps/powerpc/fpu/fenv_libc.h
index 0aad897..3173bc2 100644
--- a/sysdeps/powerpc/fpu/fenv_libc.h
+++ b/sysdeps/powerpc/fpu/fenv_libc.h
@@ -48,6 +48,38 @@ extern const fenv_t *__fe_mask_env (void) attribute_hidden;
     __fr;								\
   })
 
+#define __fe_mffscrn(rn)						\
+  ({register fenv_union_t __fr;						\
+    if (__builtin_constant_p (rn))					\
+      __asm__ __volatile__ (						\
+        ".machine push; .machine \"power9\"; mffscrni %0,%1; .machine pop" \
+        : "=f" (__fr.fenv) : "i" (rn));					\
+    else								\
+    {									\
+      __fr.l = (rn);							\
+      __asm__ __volatile__ (						\
+        ".machine push; .machine \"power9\"; mffscrn %0,%1; .machine pop" \
+        : "=f" (__fr.fenv) : "f" (__fr.fenv));				\
+    }									\
+    __fr.fenv;								\
+  })
+
+/* Like fegetenv_status, but also sets the rounding mode.  */
+#ifdef _ARCH_PWR9
+#define fegetenv_and_set_rn(rn) __fe_mffscrn (rn)
+#else
+/* 'mffscrn' will decode to 'mffs' on ARCH < 3_00, which is still necessary
+   but not sufficient, because it does not set the rounding mode.
+   Explicitly set the rounding mode when 'mffscrn' actually doesn't.  */
+#define fegetenv_and_set_rn(rn)						\
+  ({register fenv_union_t __fr;						\
+    __fr.fenv = __fe_mffscrn (rn);					\
+    if (__glibc_unlikely (!(GLRO(dl_hwcap2) & PPC_FEATURE2_ARCH_3_00)))	\
+      __fesetround_inline (rn);						\
+    __fr.fenv;								\
+  })
+#endif
+
 /* Equivalent to fesetenv, but takes a fenv_t instead of a pointer.  */
 #define fesetenv_register(env) \
 	do { \
diff --git a/sysdeps/powerpc/fpu/fenv_private.h b/sysdeps/powerpc/fpu/fenv_private.h
index 3286b4e..504f7b8 100644
--- a/sysdeps/powerpc/fpu/fenv_private.h
+++ b/sysdeps/powerpc/fpu/fenv_private.h
@@ -101,11 +101,7 @@ static __always_inline void
 libc_feresetround_ppc (fenv_t *envp)
 {
   fenv_union_t new = { .fenv = *envp };
-
-  __TEST_AND_EXIT_NON_STOP (-1ULL, new.l);
-
-  /* Atomically enable and raise (if appropriate) exceptions set in `new'.  */
-  fesetenv_mode (new.fenv);
+  fegetenv_and_set_rn (new.l & FPSCR_RN_MASK);
 }
 
 static __always_inline int
@@ -147,21 +143,10 @@ libc_feupdateenv_ppc (fenv_t *e)
 static __always_inline void
 libc_feholdsetround_ppc_ctx (struct rm_ctx *ctx, int r)
 {
-  fenv_union_t old, new;
-
-  old.fenv = fegetenv_status ();
+  fenv_union_t old;
 
-  new.l = (old.l & ~(FPSCR_ENABLES_MASK|FPSCR_RN_MASK)) | r;
-
-  ctx->env = old.fenv;
-  if (__glibc_unlikely (new.l != old.l))
-    {
-      __TEST_AND_ENTER_NON_STOP (old.l, 0ULL);
-      fesetenv_mode (new.fenv);
-      ctx->updated_status = true;
-    }
-  else
-    ctx->updated_status = false;
+  ctx->env = old.fenv = fegetenv_and_set_rn (r);
+  ctx->updated_status = (r != (old.l & FPSCR_RN_MASK));
 }
 
 static __always_inline void
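
The control flow of fegetenv_and_set_rn and the two functions that now use
it can be followed in a small model that is not part of the patch.  This is
a minimal sketch under stated assumptions: the FPSCR is a plain integer, RN
is assumed to occupy its two lowest bits, a boolean stands in for the
dl_hwcap2 test, and every sim_* name is hypothetical.  For simplicity the
modelled instruction returns the whole register; the exact set of bits the
real 'mffscrn' returns is not modelled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SIM_RN_MASK 0x03ULL      /* assumed: RN is the two lowest bits */

static uint64_t sim_fpscr;       /* hypothetical stand-in for the FPSCR */
static bool sim_has_isa_3_00;    /* hypothetical stand-in for the hwcap2 test */

/* Models the instruction: always returns the old value, but writes RN
   only when the CPU implements ISA 3.0 (otherwise it acts like 'mffs').  */
static uint64_t
sim_mffscrn (uint64_t rn)
{
  uint64_t old = sim_fpscr;
  if (sim_has_isa_3_00)
    sim_fpscr = (old & ~SIM_RN_MASK) | (rn & SIM_RN_MASK);
  return old;
}

/* Models the non-POWER9 build of fegetenv_and_set_rn: issue the
   instruction, then set RN explicitly if the instruction did not.  */
static uint64_t
sim_get_and_set_rn (uint64_t rn)
{
  uint64_t old = sim_mffscrn (rn);
  if (!sim_has_isa_3_00)
    sim_fpscr = (sim_fpscr & ~SIM_RN_MASK) | (rn & SIM_RN_MASK);
  return old;
}

struct sim_ctx
{
  uint64_t env;
  bool updated_status;
};

/* Models the new prologue: save the old value, switch RN only.  */
static void
sim_hold (struct sim_ctx *ctx, uint64_t r)
{
  uint64_t old = sim_get_and_set_rn (r);
  ctx->env = old;
  ctx->updated_status = (r != (old & SIM_RN_MASK));
}

/* Models the new epilogue: restore RN only.  */
static void
sim_reset (const struct sim_ctx *ctx)
{
  sim_get_and_set_rn (ctx->env & SIM_RN_MASK);
}

/* Runs prologue and epilogue around a change to a non-RN bit.  */
static void
sim_self_check (bool isa_3_00)
{
  struct sim_ctx ctx;
  sim_has_isa_3_00 = isa_3_00;
  sim_fpscr = 0x80ULL | 0x01ULL;   /* one enable bit set, RN = 1 */
  sim_hold (&ctx, 2);
  assert (ctx.updated_status);
  assert (sim_fpscr == 0x82ULL);   /* enable bit kept, RN = 2 */
  sim_fpscr |= 0x10ULL;            /* the bracketed code sets another bit */
  sim_reset (&ctx);
  assert (sim_fpscr == 0x91ULL);   /* RN restored, other bits pass through */
}
```

Calling sim_self_check with both true and false exercises the two paths;
in this model they leave the register in the same state, which is the
property the fallback after 'mffscrn' is meant to provide.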