From patchwork Fri Jul 6 08:55:17 2018
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Szabolcs Nagy
X-Patchwork-Id: 940335
Cc: nd@arm.com, Wilco Dijkstra, Joseph Myers
Subject: [PATCH 01/10] Clean up converttoint handling and document the semantics
From: Szabolcs Nagy
To: GNU C Library
Date: Fri, 6 Jul 2018 09:55:17 +0100

This patch currently only affects aarch64.

The roundtoint and converttoint internal functions are only called with
small values, so a 32-bit result is enough for converttoint. It is a
signed int conversion, so the return type is changed to int32_t.

The original idea was to help the compiler by keeping the result in
uint64_t: then it is clear that no sign extension is needed and there is
no accidental undefined or implementation-defined signed int arithmetic.
But it turns out gcc does a good job with inlining, so changing the type
has no overhead and the semantics of the conversion are less surprising
this way. Since we want to allow the asuint64 (x + 0x1.8p52) style
conversion, the top bits were never usable, and the existing code ensures
that only the bottom 32 bits of the conversion result are used.

On aarch64 the neon intrinsics (which round ties to even) are replaced by
round and lround (which round ties away from zero). This does not affect
the results in a significant way, but it is more portable (it relies on
round and lround being inlined, which works with -fno-math-errno).

The TOINT_SHIFT and TOINT_RINT macros are removed; separate code paths
are kept only for TOINT_INTRINSICS and !TOINT_INTRINSICS.

2018-07-06  Wilco Dijkstra
	    Szabolcs Nagy

	* sysdeps/aarch64/fpu/math_private.h (roundtoint): Use round.
	(converttoint): Use lround.
	* sysdeps/ieee754/flt-32/math_config.h (roundtoint): Declare and
	document the semantics when TOINT_INTRINSICS is set.
	(converttoint): Likewise.
	(TOINT_RINT): Remove.
	(TOINT_SHIFT): Remove.
	* sysdeps/ieee754/flt-32/e_expf.c (__expf): Remove the TOINT_RINT
	code path.
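
For illustration only (this is not part of the patch), here is a minimal
standalone sketch of the asuint64 (x + 0x1.8p52) style conversion and of
why only the bottom 32 bits of its result are usable. The names
EXAMPLE_SHIFT, shift_round and shift_round_examples are hypothetical, the
asuint64 copy is written for this example, and the sketch assumes the
default round-to-nearest mode and |x| < 2^31.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Standalone copy of the bit-cast idea behind asuint64, written for
   this example.  */
static inline uint64_t
asuint64 (double f)
{
  uint64_t u;
  memcpy (&u, &f, sizeof u);
  return u;
}

/* Illustrative constant, not a name used by the patch.  */
#define EXAMPLE_SHIFT 0x1.8p52

/* Hypothetical helper: return x rounded to an integer as a double and
   store the matching integer in *ki.  Assumes round-to-nearest mode
   and |x| < 2^31.  */
static inline double
shift_round (double x, int32_t *ki)
{
  /* volatile forces the sum to be stored as a double on targets with
     excess precision (the patch uses math_narrow_eval for this).  */
  volatile double kd = x + EXAMPLE_SHIFT;
  /* Only the bottom 32 bits carry the integer; the top bits hold the
     exponent and the high part of the shift constant.  The unsigned
     to signed conversion wraps on GCC and Clang (it is
     implementation-defined in ISO C before C23).  */
  *ki = (int32_t) (uint32_t) asuint64 (kd);
  return kd - EXAMPLE_SHIFT;
}

void
shift_round_examples (void)
{
  int32_t ki;
  assert (shift_round (3.7, &ki) == 4.0 && ki == 4);
  assert (shift_round (-1.25, &ki) == -1.0 && ki == -1);
  /* A tie follows the current rounding mode: to even by default.  */
  assert (shift_round (2.5, &ki) == 2.0 && ki == 2);
  /* The top 32 bits are not part of the integer result.  */
  assert ((asuint64 (1.0 + EXAMPLE_SHIFT) >> 32) == 0x43380000);
}
```

Here the double result and the integer come from the same sum, so they
always agree. For the tie 2.5 this path gives 2, while a round and lround
pair would give 3; each pair is self-consistent, which is the property
the callers rely on.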
---
 sysdeps/aarch64/fpu/math_private.h   | 17 +++++++----------
 sysdeps/ieee754/flt-32/e_expf.c      |  5 +----
 sysdeps/ieee754/flt-32/math_config.h | 20 +++++++++++++++-----
 3 files changed, 23 insertions(+), 19 deletions(-)

diff --git a/sysdeps/aarch64/fpu/math_private.h b/sysdeps/aarch64/fpu/math_private.h
index fcd02c0654..d2e0abc0b2 100644
--- a/sysdeps/aarch64/fpu/math_private.h
+++ b/sysdeps/aarch64/fpu/math_private.h
@@ -21,6 +21,8 @@
 
 #include
 #include
+#include
+#include
 
 static __always_inline void
 libc_feholdexcept_aarch64 (fenv_t *envp)
@@ -298,25 +300,20 @@ libc_feresetround_noex_aarch64_ctx (struct rm_ctx *ctx)
 #define libc_feresetround_noexf_ctx libc_feresetround_noex_aarch64_ctx
 #define libc_feresetround_noexl_ctx libc_feresetround_noex_aarch64_ctx
 
-/* Hack: only include the large arm_neon.h when needed. */
-#ifdef _MATH_CONFIG_H
-# include
-
-/* ACLE intrinsics for frintn and fcvtns instructions. */
-# define TOINT_INTRINSICS 1
+/* Use inline round and lround instructions. */
+#define TOINT_INTRINSICS 1
 
 static inline double_t
 roundtoint (double_t x)
 {
-  return vget_lane_f64 (vrndn_f64 (vld1_f64 (&x)), 0);
+  return round (x);
 }
 
-static inline uint64_t
+static inline int32_t
 converttoint (double_t x)
 {
-  return vcvtnd_s64_f64 (x);
+  return lround (x);
 }
-#endif
 
 #include_next
 
diff --git a/sysdeps/ieee754/flt-32/e_expf.c b/sysdeps/ieee754/flt-32/e_expf.c
index f2238bfd74..384a586172 100644
--- a/sysdeps/ieee754/flt-32/e_expf.c
+++ b/sysdeps/ieee754/flt-32/e_expf.c
@@ -85,10 +85,7 @@ __expf (float x)
 #if TOINT_INTRINSICS
   kd = roundtoint (z);
   ki = converttoint (z);
-#elif TOINT_RINT
-  kd = rint (z);
-  ki = (long) kd;
-#elif TOINT_SHIFT
+#else
 # define SHIFT __exp2f_data.shift
   kd = math_narrow_eval ((double) (z + SHIFT)); /* Needs to be double. */
   ki = asuint64 (kd);
diff --git a/sysdeps/ieee754/flt-32/math_config.h b/sysdeps/ieee754/flt-32/math_config.h
index 9c4ef30173..8ca7532686 100644
--- a/sysdeps/ieee754/flt-32/math_config.h
+++ b/sysdeps/ieee754/flt-32/math_config.h
@@ -38,13 +38,23 @@
 #endif
 
 #ifndef TOINT_INTRINSICS
+/* When set, the roundtoint and converttoint functions are provided with
+   the semantics documented below. */
 # define TOINT_INTRINSICS 0
 #endif
-#ifndef TOINT_RINT
-# define TOINT_RINT 0
-#endif
-#ifndef TOINT_SHIFT
-# define TOINT_SHIFT 1
+
+#if TOINT_INTRINSICS
+/* Round x to nearest int in all rounding modes, ties have to be rounded
+   consistently with converttoint so the results match. If the result
+   would be outside of [-2^31, 2^31-1] then the semantics is unspecified. */
+static inline double_t
+roundtoint (double_t x);
+
+/* Convert x to nearest int in all rounding modes, ties have to be rounded
+   consistently with roundtoint. If the result is not representible in an
+   int32_t then the semantics is unspecified. */
+static inline int32_t
+converttoint (double_t x);
 #endif
 
 static inline uint32_t