From patchwork Wed Mar 21 17:50:07 2018
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Wilco Dijkstra
X-Patchwork-Id: 888975
From: Wilco Dijkstra
To: "libc-alpha@sourceware.org"
CC: nd
Subject: [PATCH 1/7] sin/cos slow paths: avoid slow paths for small inputs
Date: Wed, 21 Mar 2018 17:50:07 +0000

This series of patches removes the slow paths from sin, cos and sincos.
Besides greatly simplifying the implementation, the new version is also much
faster for inputs up to PI (41% faster) and for large inputs needing range
reduction (27% faster).

The worst-case error is ~0.55 ULP, with no errors found after testing
1.6 billion inputs across most of the range with mpsin and mpcos. The number
of incorrectly rounded results (i.e. ULP > 0.5) is at most ~2750 per million
inputs between 0.125 and 0.5; the average is ~850 per million between 0 and PI.

Tested on AArch64 and x86_64 with no regressions.

The first patch removes the slow paths for the cases where the input is small
and doesn't require range reduction. Update ULP tables for sin, cos and sincos
on AArch64 and x86_64.

ChangeLog:
2018-03-20  Wilco Dijkstra

    * sysdeps/aarch64/libm-test-ulps: Update ULP for sin, cos, sincos.
    * sysdeps/ieee754/dbl-64/s_sin.c (__sin): Remove slow paths for small
    inputs.
    (__cos): Likewise.
    * sysdeps/x86_64/fpu/libm-test-ulps: Update ULP for sin, cos, sincos.

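For readers unfamiliar with the checks being removed, here is a minimal
standalone sketch of the old fast-path pattern for small inputs. It is not
glibc code: `sin_small`, `rounding_test_passes` and the short Taylor
polynomial are hypothetical stand-ins (the real code uses the `POLYNOMIAL`
macro), and only the `res`/`cor` arithmetic and the 1.07 margin are taken
from the removed lines.

```c
#include <stdbool.h>

/* Hypothetical helper, not a glibc function.  Returns true when adding the
   inflated correction COR does not change RES, i.e. when the old code would
   have accepted RES without calling a slow path.  */
static bool
rounding_test_passes (double res, double cor, double margin)
{
  return res == res + margin * cor;
}

/* Hypothetical stand-in for the |x| < 0.25 branch of __sin before the patch.
   Sets *NEEDED_SLOW_PATH when the old code would have called slow (x).  */
static double
sin_small (double x, bool *needed_slow_path)
{
  double xx = x * x;
  /* Assumption: a degree-7 Taylor tail, -x^3/6 + x^5/120 - x^7/5040,
     standing in for POLYNOMIAL (xx) * (xx * x).  */
  double t = (xx * x) * (-1.0 / 6 + xx * (1.0 / 120 + xx * (-1.0 / 5040)));
  double res = x + t;
  /* For small x, |t| is far below |x|, so this recovers the rounding
     error of the addition above.  */
  double cor = (x - res) + t;
  *needed_slow_path = !rounding_test_passes (res, cor, 1.07);
  return res;
}
```

With this patch the equivalent of `sin_small` simply returns `x + t`: the
rounding test and the fallback both disappear, at the cost of results that
are occasionally off by slightly more than 0.5 ULP.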
diff --git a/sysdeps/aarch64/libm-test-ulps b/sysdeps/aarch64/libm-test-ulps
index 1f469803be59bb4813370d95c6d091de901e6129..be06085154db24c8fd6cf1bce417028a959aaa27 100644
--- a/sysdeps/aarch64/libm-test-ulps
+++ b/sysdeps/aarch64/libm-test-ulps
@@ -1012,7 +1012,9 @@ ildouble: 2
 ldouble: 2
 
 Function: "cos":
+double: 1
 float: 1
+idouble: 1
 ifloat: 1
 ildouble: 1
 ldouble: 1
@@ -1970,7 +1972,9 @@ ildouble: 2
 ldouble: 2
 
 Function: "sin":
+double: 1
 float: 1
+idouble: 1
 ifloat: 1
 ildouble: 1
 ldouble: 1
@@ -2000,7 +2004,9 @@ ildouble: 3
 ldouble: 3
 
 Function: "sincos":
+double: 1
 float: 1
+idouble: 1
 ifloat: 1
 ildouble: 1
 ldouble: 1
diff --git a/sysdeps/ieee754/dbl-64/s_sin.c b/sysdeps/ieee754/dbl-64/s_sin.c
index 8c589cbd4ab7451a5889e9a474bf4bd36c49d498..0c16b728df127ad54039da3eec376e5f1fe4c852 100644
--- a/sysdeps/ieee754/dbl-64/s_sin.c
+++ b/sysdeps/ieee754/dbl-64/s_sin.c
@@ -448,7 +448,7 @@ SECTION
 #endif
 __sin (double x)
 {
-  double xx, res, t, cor;
+  double xx, t, cor;
   mynumber u;
   int4 k, m;
   double retval = 0;
@@ -471,26 +471,22 @@ __sin (double x)
       xx = x * x;
       /* Taylor series. */
       t = POLYNOMIAL (xx) * (xx * x);
-      res = x + t;
-      cor = (x - res) + t;
-      retval = (res == res + 1.07 * cor) ? res : slow (x);
+      /* Max ULP of x + t is 0.535. */
+      retval = x + t;
     } /* else if (k < 0x3fd00000) */
 /*---------------------------- 0.25<|x|< 0.855469---------------------- */
   else if (k < 0x3feb6000)
     {
-      res = do_sin (x, 0, &cor);
-      retval = (res == res + 1.096 * cor) ? res : slow1 (x);
-      retval = __copysign (retval, x);
+      /* Max ULP is 0.548. */
+      retval = __copysign (do_sin (x, 0, &cor), x);
     } /* else if (k < 0x3feb6000) */
 
 /*----------------------- 0.855469 <|x|<2.426265 ----------------------*/
   else if (k < 0x400368fd)
     {
-
       t = hp0 - fabs (x);
-      res = do_cos (t, hp1, &cor);
-      retval = (res == res + 1.020 * cor) ? res : slow2 (x);
-      retval = __copysign (retval, x);
+      /* Max ULP is 0.51. */
+      retval = __copysign (do_cos (t, hp1, &cor), x);
     } /* else if (k < 0x400368fd) */
 
 #ifndef IN_SINCOS
@@ -541,7 +537,7 @@ SECTION
 #endif
 __cos (double x)
 {
-  double y, xx, res, cor, a, da;
+  double y, xx, cor, a, da;
   mynumber u;
   int4 k, m;
 
@@ -561,8 +557,8 @@ __cos (double x)
 
   else if (k < 0x3feb6000)
     { /* 2^-27 < |x| < 0.855469 */
-      res = do_cos (x, 0, &cor);
-      retval = (res == res + 1.020 * cor) ? res : cslow2 (x);
+      /* Max ULP is 0.51. */
+      retval = do_cos (x, 0, &cor);
     } /* else if (k < 0x3feb6000) */
 
   else if (k < 0x400368fd)
@@ -571,20 +567,12 @@ __cos (double x)
       a = y + hp1;
       da = (y - a) + hp1;
       xx = a * a;
+      /* Max ULP is 0.501 if xx < 0.01588 or 0.518 otherwise.
+         Range reduction uses 106 bits here which is sufficient. */
       if (xx < 0.01588)
-        {
-          res = TAYLOR_SIN (xx, a, da, cor);
-          cor = 1.02 * cor + __copysign (1.0e-31, cor);
-          retval = (res == res + cor) ? res : sloww (a, da, x, true);
-        }
+        retval = TAYLOR_SIN (xx, a, da, cor);
       else
-        {
-          res = do_sin (a, da, &cor);
-          cor = 1.035 * cor + __copysign (1.0e-31, cor);
-          retval = ((res == res + cor) ? __copysign (res, a)
-                    : sloww1 (a, da, x, true));
-        }
-
+        retval = __copysign (do_sin (a, da, &cor), a);
     } /* else if (k < 0x400368fd) */
 
 
diff --git a/sysdeps/x86_64/fpu/libm-test-ulps b/sysdeps/x86_64/fpu/libm-test-ulps
index 48e53f7ef2cf814d71d5d0c9f2bb907f594aa7ef..bbb8a4d0754dbe6665682cd8a7f51f7319a14014 100644
--- a/sysdeps/x86_64/fpu/libm-test-ulps
+++ b/sysdeps/x86_64/fpu/libm-test-ulps
@@ -1262,7 +1262,9 @@ ildouble: 1
 ldouble: 1
 
 Function: "cos":
+double: 1
 float128: 1
+idouble: 1
 ifloat128: 1
 ildouble: 1
 ldouble: 1
@@ -2528,7 +2530,9 @@ Function: "pow_vlen8_avx2":
 float: 3
 
 Function: "sin":
+double: 1
 float128: 1
+idouble: 1
 ifloat128: 1
 ildouble: 1
 ldouble: 1
@@ -2578,7 +2582,9 @@ Function: "sin_vlen8_avx2":
 float: 1
 
 Function: "sincos":
+double: 1
 float128: 1
+idouble: 1
 ifloat128: 1
 ildouble: 1
 ldouble: 1

From patchwork Wed Mar 21 17:51:24 2018
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Wilco Dijkstra
X-Patchwork-Id: 888978
From: Wilco Dijkstra
To: "libc-alpha@sourceware.org"
CC: nd
Subject: [PATCH 2/7] sin/cos slow paths: remove large range reduction
Date: Wed, 21 Mar 2018 17:51:24 +0000

This patch removes the large range reduction code and defers to the huge range
reduction code. The first level range reducer supports inputs up to 2^27,
which is way too large given that inputs for sin/cos are typically small
(< 10), and optimizing for a smaller range would give a significant speedup.

Input values above 2^27 are practically never used, so there is no reason for
supporting range reduction between 2^27 and 2^48. Removing it significantly
simplifies the code and enables further speedups. There is about a 2.3x
slowdown in this range due to __branred being extremely slow (a better
algorithm could easily more than double performance).

ChangeLog:
2018-03-20  Wilco Dijkstra

    * sysdeps/ieee754/dbl-64/s_sin.c (reduce_sincos_2): Remove function.
    (do_sincos_2): Likewise.
    (__sin): Remove middle range reduction case.
    (__cos): Likewise.
    * sysdeps/ieee754/dbl-64/s_sincos.c (__sincos): Remove middle range
    reduction case.

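The effect on dispatch can be sketched as follows. This is an illustrative
standalone program, not glibc code: `select_path`, `high_word_abs` and the
`sincos_path` labels are hypothetical names, and the sketch assumes that `k`
is the high 32 bits of `x` with the sign bit cleared, as the hexadecimal
thresholds in the diff suggest. The threshold constants themselves are taken
from the patch.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical labels for the branches of __sin/__cos; not glibc names.  */
enum sincos_path
{
  PATH_NO_REDUCTION,   /* |x| < 2.426265: no range reduction call */
  PATH_FIRST_LEVEL,    /* |x| < 105414350: reduce_sincos_1 */
  PATH_MIDDLE,         /* |x| < 2^48: reduce_sincos_2 (removed here) */
  PATH_HUGE,           /* finite and larger: reduce_and_compute */
  PATH_NOT_FINITE      /* Inf or NaN */
};

/* Assumption: k is the upper half of the IEEE double with the sign masked.  */
static uint32_t
high_word_abs (double x)
{
  uint64_t bits;
  memcpy (&bits, &x, sizeof bits);
  return (uint32_t) (bits >> 32) & 0x7fffffff;
}

/* MIDDLE_REMOVED == 0 models the code before this patch, nonzero after.  */
static enum sincos_path
select_path (double x, int middle_removed)
{
  uint32_t k = high_word_abs (x);

  if (k < 0x400368fd)
    return PATH_NO_REDUCTION;
  if (k < 0x419921FB)
    return PATH_FIRST_LEVEL;
  if (k < 0x42F00000)
    return middle_removed ? PATH_HUGE : PATH_MIDDLE;
  if (k < 0x7ff00000)
    return PATH_HUGE;
  return PATH_NOT_FINITE;
}
```

For example, `select_path (1e9, 0)` yields `PATH_MIDDLE` while
`select_path (1e9, 1)` yields `PATH_HUGE`, which is the range where the
commit message reports the 2.3x slowdown.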
diff --git a/sysdeps/ieee754/dbl-64/s_sin.c b/sysdeps/ieee754/dbl-64/s_sin.c
index 0c16b728df127ad54039da3eec376e5f1fe4c852..c86fb9f2aa9f18418defc522830a7b8f85c1dfae 100644
--- a/sysdeps/ieee754/dbl-64/s_sin.c
+++ b/sysdeps/ieee754/dbl-64/s_sin.c
@@ -362,80 +362,6 @@ do_sincos_1 (double a, double da, double x, int4 n, bool shift_quadrant)
   return retval;
 }
 
-static inline int4
-__always_inline
-reduce_sincos_2 (double x, double *a, double *da)
-{
-  mynumber v;
-
-  double t = (x * hpinv + toint);
-  double xn = t - toint;
-  v.x = t;
-  double xn1 = (xn + 8.0e22) - 8.0e22;
-  double xn2 = xn - xn1;
-  double y = ((((x - xn1 * mp1) - xn1 * mp2) - xn2 * mp1) - xn2 * mp2);
-  int4 n = v.i[LOW_HALF] & 3;
-  double db = xn1 * pp3;
-  t = y - db;
-  db = (y - t) - db;
-  db = (db - xn2 * pp3) - xn * pp4;
-  double b = t + db;
-  db = (t - b) + db;
-
-  *a = b;
-  *da = db;
-
-  return n;
-}
-
-/* Compute sin (A + DA). cos can be computed by passing SHIFT_QUADRANT as
-   true, which results in shifting the quadrant N clockwise. */
-static double
-__always_inline
-do_sincos_2 (double a, double da, double x, int4 n, bool shift_quadrant)
-{
-  double res, retval, cor, xx;
-
-  double eps = 1.0e-24;
-
-  int4 k = (n + shift_quadrant) & 3;
-
-  switch (k)
-    {
-    case 2:
-      a = -a;
-      da = -da;
-      /* Fall through. */
-    case 0:
-      xx = a * a;
-      if (xx < 0.01588)
-        {
-          /* Taylor series. */
-          res = TAYLOR_SIN (xx, a, da, cor);
-          cor = 1.02 * cor + __copysign (eps, cor);
-          retval = (res == res + cor) ? res : bsloww (a, da, x, n);
-        }
-      else
-        {
-          res = do_sin (a, da, &cor);
-          cor = 1.035 * cor + __copysign (eps, cor);
-          retval = ((res == res + cor) ? __copysign (res, a)
-                    : bsloww1 (a, da, x, n));
-        }
-      break;
-
-    case 1:
-    case 3:
-      res = do_cos (a, da, &cor);
-      cor = 1.025 * cor + __copysign (eps, cor);
-      retval = ((res == res + cor) ? ((n & 2) ? -res : res)
-                : bsloww2 (a, da, x, n));
-      break;
-    }
-
-  return retval;
-}
-
 /*******************************************************************/
 /* An ultimate sin routine. Given an IEEE double machine number x */
 /* it computes the correctly rounded (to nearest) value of sin(x) */
@@ -498,16 +424,7 @@ __sin (double x)
       retval = do_sincos_1 (a, da, x, n, false);
     } /* else if (k < 0x419921FB ) */
 
-/*---------------------105414350 <|x|< 281474976710656 --------------------*/
-  else if (k < 0x42F00000)
-    {
-      double a, da;
-
-      int4 n = reduce_sincos_2 (x, &a, &da);
-      retval = do_sincos_2 (a, da, x, n, false);
-    } /* else if (k < 0x42F00000 ) */
-
-/* -----------------281474976710656 <|x| <2^1024----------------------------*/
+/* --------------------105414350 <|x| <2^1024------------------------------*/
   else if (k < 0x7ff00000)
     retval = reduce_and_compute (x, false);
 
@@ -584,15 +501,7 @@ __cos (double x)
       retval = do_sincos_1 (a, da, x, n, true);
     } /* else if (k < 0x419921FB ) */
 
-  else if (k < 0x42F00000)
-    {
-      double a, da;
-
-      int4 n = reduce_sincos_2 (x, &a, &da);
-      retval = do_sincos_2 (a, da, x, n, true);
-    } /* else if (k < 0x42F00000 ) */
-
-  /* 281474976710656 <|x| <2^1024 */
+  /* 105414350 <|x| <2^1024 */
   else if (k < 0x7ff00000)
     retval = reduce_and_compute (x, true);
 
diff --git a/sysdeps/ieee754/dbl-64/s_sincos.c b/sysdeps/ieee754/dbl-64/s_sincos.c
index e1977ea7e93c32cca5369677f23e68f8f797a9f4..a9af8ce526bfe78c06cfafa65de0815ec69585c5 100644
--- a/sysdeps/ieee754/dbl-64/s_sincos.c
+++ b/sysdeps/ieee754/dbl-64/s_sincos.c
@@ -86,16 +86,6 @@ __sincos (double x, double *sinx, double *cosx)
 
       return;
     }
-  if (k < 0x42F00000)
-    {
-      double a, da;
-      int4 n = reduce_sincos_2 (x, &a, &da);
-
-      *sinx = do_sincos_2 (a, da, x, n, false);
-      *cosx = do_sincos_2 (a, da, x, n, true);
-
-      return;
-    }
   if (k < 0x7ff00000)
     {
       reduce_and_compute_sincos (x, sinx, cosx);

From patchwork Wed Mar 21 17:52:26 2018
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Wilco Dijkstra
X-Patchwork-Id: 888979
From: Wilco Dijkstra
To: "libc-alpha@sourceware.org"
CC: nd
Subject: [PATCH 3/7] sin/cos slow paths: remove slow paths from small range reduction
Date: Wed, 21 Mar 2018 17:52:26 +0000

This patch improves the accuracy of the range reduction. When the input is
large (2^27) and very close to a multiple of PI/2, using 110 bits of PI is not
enough. Improve range reduction accuracy to 136 bits. As a result the special
checks for results close to zero can be removed. The error of the polynomials
is at worst 0.55 ULP, so the slow functions are no longer needed and can be
removed.

ChangeLog:
2018-03-20  Wilco Dijkstra

    * sysdeps/ieee754/dbl-64/s_sin.c (reduce_sincos_1): Rename to
    reduce_sincos, improve accuracy to 136 bits.
    (do_sincos_1): Rename to do_sincos, remove fallbacks to slow functions.
    (__sin): Use improved reduction and simplified do_sincos calculation.
    (__cos): Likewise.
    * sysdeps/ieee754/dbl-64/s_sincos.c (__sincos): Likewise.

diff --git a/sysdeps/ieee754/dbl-64/s_sin.c b/sysdeps/ieee754/dbl-64/s_sin.c
index c86fb9f2aa9f18418defc522830a7b8f85c1dfae..b8c366a6f05ef6b6632302fac96cd19af518f1fe 100644
--- a/sysdeps/ieee754/dbl-64/s_sin.c
+++ b/sysdeps/ieee754/dbl-64/s_sin.c
@@ -295,9 +295,13 @@ reduce_and_compute (double x, bool shift_quadrant)
   return retval;
 }
 
+/* Reduce range of x to within PI/2 with abs (x) < 105414350. The high part
+   is written to *a, the low part to *da. Range reduction is accurate to 136
+   bits so that when x is large and *a very close to zero, all 53 bits of *a
+   are correct. */
 static inline int4
 __always_inline
-reduce_sincos_1 (double x, double *a, double *da)
+reduce_sincos (double x, double *a, double *da)
 {
   mynumber v;
 
@@ -306,62 +310,45 @@ reduce_sincos_1 (double x, double *a, double *da)
   v.x = t;
   double y = (x - xn * mp1) - xn * mp2;
   int4 n = v.i[LOW_HALF] & 3;
-  double db = xn * mp3;
-  double b = y - db;
-  db = (y - b) - db;
+
+  double b, db, t1, t2;
+  t1 = xn * pp3;
+  t2 = y - t1;
+  db = (y - t2) - t1;
+
+  t1 = xn * pp4;
+  b = t2 - t1;
+  db += (t2 - b) - t1;
 
   *a = b;
   *da = db;
-
   return n;
 }
 
-/* Compute sin (A + DA). cos can be computed by passing SHIFT_QUADRANT as
-   true, which results in shifting the quadrant N clockwise. */
+/* Compute sin or cos (A + DA) for the given quadrant N. */
 static double
 __always_inline
-do_sincos_1 (double a, double da, double x, int4 n, bool shift_quadrant)
+do_sincos (double a, double da, int4 n)
 {
-  double xx, retval, res, cor;
-  double eps = fabs (x) * 1.2e-30;
+  double retval, cor;
 
-  int k1 = (n + shift_quadrant) & 3;
-  switch (k1)
-    { /* quarter of unit circle */
-    case 2:
-      a = -a;
-      da = -da;
-      /* Fall through. */
-    case 0:
-      xx = a * a;
+  if (n & 1)
+    /* Max ULP is 0.513. */
+    retval = do_cos (a, da, &cor);
+  else
+    {
+      double xx = a * a;
+      /* Max ULP is 0.501 if xx < 0.01588, otherwise ULP is 0.518. */
       if (xx < 0.01588)
-        {
-          /* Taylor series. */
-          res = TAYLOR_SIN (xx, a, da, cor);
-          cor = 1.02 * cor + __copysign (eps, cor);
-          retval = (res == res + cor) ? res : sloww (a, da, x, shift_quadrant);
-        }
+        retval = TAYLOR_SIN (xx, a, da, cor);
       else
-        {
-          res = do_sin (a, da, &cor);
-          cor = 1.035 * cor + __copysign (eps, cor);
-          retval = ((res == res + cor) ? __copysign (res, a)
-                    : sloww1 (a, da, x, shift_quadrant));
-        }
-      break;
-
-    case 1:
-    case 3:
-      res = do_cos (a, da, &cor);
-      cor = 1.025 * cor + __copysign (eps, cor);
-      retval = ((res == res + cor) ? ((n & 2) ? -res : res)
-                : sloww2 (a, da, x, n));
-      break;
+        retval = __copysign (do_sin (a, da, &cor), a);
     }
 
-  return retval;
+  return (n & 2) ? -retval : retval;
 }
 
+
 /*******************************************************************/
 /* An ultimate sin routine. Given an IEEE double machine number x */
 /* it computes the correctly rounded (to nearest) value of sin(x) */
@@ -374,13 +361,18 @@ SECTION
 #endif
 __sin (double x)
 {
-  double xx, t, cor;
+#ifndef IN_SINCOS
+  double xx, t, a, da, cor;
   mynumber u;
-  int4 k, m;
+  int4 k, m, n;
   double retval = 0;
 
-#ifndef IN_SINCOS
   SET_RESTORE_ROUND_53BIT (FE_TONEAREST);
+#else
+  double xx, t, cor;
+  mynumber u;
+  int4 k, m;
+  double retval = 0;
 #endif
 
   u.x = x;
@@ -419,9 +411,8 @@ __sin (double x)
 /*-------------------------- 2.426265<|x|< 105414350 ----------------------*/
   else if (k < 0x419921FB)
     {
-      double a, da;
-      int4 n = reduce_sincos_1 (x, &a, &da);
-      retval = do_sincos_1 (a, da, x, n, false);
+      n = reduce_sincos (x, &a, &da);
+      retval = do_sincos (a, da, n);
     } /* else if (k < 0x419921FB ) */
 
 /* --------------------105414350 <|x| <2^1024------------------------------*/
@@ -456,7 +447,11 @@ __cos (double x)
 {
   double y, xx, cor, a, da;
   mynumber u;
+#ifndef IN_SINCOS
+  int4 k, m, n;
+#else
   int4 k, m;
+#endif
 
   double retval = 0;
 
@@ -496,9 +491,8 @@ __cos (double x)
 #ifndef IN_SINCOS
   else if (k < 0x419921FB)
     { /* 2.426265<|x|< 105414350 */
-      double a, da;
-      int4 n = reduce_sincos_1 (x, &a, &da);
-      retval = do_sincos_1 (a, da, x, n, true);
+      n = reduce_sincos (x, &a, &da);
+      retval = do_sincos (a, da, n + 1);
     } /* else if (k < 0x419921FB ) */
 
   /* 105414350 <|x| <2^1024 */
diff --git a/sysdeps/ieee754/dbl-64/s_sincos.c b/sysdeps/ieee754/dbl-64/s_sincos.c
index a9af8ce526bfe78c06cfafa65de0815ec69585c5..4f032d2e42593ccde22169b374728386dd8fca8e 100644
--- a/sysdeps/ieee754/dbl-64/s_sincos.c
+++ b/sysdeps/ieee754/dbl-64/s_sincos.c
@@ -79,10 +79,10 @@ __sincos (double x, double *sinx, double *cosx)
   if (k < 0x419921FB)
     {
       double a, da;
-      int4 n = reduce_sincos_1 (x, &a, &da);
+      int4 n = reduce_sincos (x, &a, &da);
 
-      *sinx = do_sincos_1 (a, da, x, n, false);
-      *cosx = do_sincos_1 (a, da, x, n, true);
+      *sinx = do_sincos (a, da, n);
+      *cosx = do_sincos (a, da, n + 1);
 
       return;
     }

From patchwork Wed Mar 21 17:53:31 2018
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Wilco Dijkstra
X-Patchwork-Id: 888980
From: Wilco Dijkstra
To: "libc-alpha@sourceware.org"
CC: nd
Subject: [PATCH 4/7] sin/cos slow paths: remove slow paths from huge range reduction
Date: Wed, 21 Mar 2018 17:53:31 +0000

For huge inputs use the improved do_sincos function as well.  Now no cases use
the correction factor returned by do_sin, do_cos and TAYLOR_SIN, so remove it.

ChangeLog:
2018-03-20  Wilco Dijkstra

        * sysdeps/ieee754/dbl-64/s_sin.c (TAYLOR_SIN): Remove cor parameter.
        (do_cos): Remove corp parameter and calculations.
        (do_sin): Likewise.
        (do_sincos): Remove cor variable.
        (__sin): Use do_sincos for huge inputs.
        (__cos): Likewise.
        * sysdeps/ieee754/dbl-64/s_sincos.c (__sincos): Likewise.
        (reduce_and_compute_sincos): Remove unused function.

diff --git a/sysdeps/ieee754/dbl-64/s_sin.c b/sysdeps/ieee754/dbl-64/s_sin.c
index b8c366a6f05ef6b6632302fac96cd19af518f1fe..099a8a128f9883d1e683436a9f09720922e923ce 100644
--- a/sysdeps/ieee754/dbl-64/s_sin.c
+++ b/sysdeps/ieee754/dbl-64/s_sin.c
@@ -67,11 +67,10 @@

    The constants s1, s2, s3, etc. are pre-computed values of 1/3!, 1/5! and so
    on.  The result is returned to LHS and correction in COR.  */
-#define TAYLOR_SIN(xx, a, da, cor) \
+#define TAYLOR_SIN(xx, a, da) \
 ({ \
   double t = ((POLYNOMIAL (xx) * (a) - 0.5 * (da)) * (xx) + (da)); \
   double res = (a) + t; \
-  (cor) = ((a) - res) + t; \
   res; \
 })

@@ -145,10 +144,10 @@ static double cslow2 (double x);
 /* Given a number partitioned into X and DX, this function computes the cosine
    of the number by combining the sin and cos of X (as computed by a variation
    of the Taylor series) with the values looked up from the sin/cos table to
-   get the result in RES and a correction value in COR.  */
+   get the result.  */
 static inline double
 __always_inline
-do_cos (double x, double dx, double *corp)
+do_cos (double x, double dx)
 {
   mynumber u;

@@ -158,16 +157,13 @@ do_cos (double x, double dx, double *corp)
   u.x = big + fabs (x);
   x = fabs (x) - (u.x - big) + dx;

-  double xx, s, sn, ssn, c, cs, ccs, res, cor;
+  double xx, s, sn, ssn, c, cs, ccs, cor;
   xx = x * x;
   s = x + x * xx * (sn3 + xx * sn5);
   c = xx * (cs2 + xx * (cs4 + xx * cs6));
   SINCOS_TABLE_LOOKUP (u, sn, ssn, cs, ccs);
   cor = (ccs - s * ssn - cs * c) - sn * s;
-  res = cs + cor;
-  cor = (cs - res) + cor;
-  *corp = cor;
-  return res;
+  return cs + cor;
 }

 /* A more precise variant of DO_COS.  EPS is the adjustment to the correction
@@ -207,10 +203,10 @@ do_cos_slow (double x, double dx, double eps, double *corp)
 /* Given a number partitioned into X and DX, this function computes the sine of
    the number by combining the sin and cos of X (as computed by a variation of
    the Taylor series) with the values looked up from the sin/cos table to get
-   the result in RES and a correction value in COR.  */
+   the result.  */
 static inline double
 __always_inline
-do_sin (double x, double dx, double *corp)
+do_sin (double x, double dx)
 {
   mynumber u;

@@ -219,16 +215,13 @@ do_sin (double x, double dx, double *corp)
   u.x = big + fabs (x);
   x = fabs (x) - (u.x - big);

-  double xx, s, sn, ssn, c, cs, ccs, cor, res;
+  double xx, s, sn, ssn, c, cs, ccs, cor;
   xx = x * x;
   s = x + (dx + x * xx * (sn3 + xx * sn5));
   c = x * dx + xx * (cs2 + xx * (cs4 + xx * cs6));
   SINCOS_TABLE_LOOKUP (u, sn, ssn, cs, ccs);
   cor = (ssn + s * ccs - sn * c) + cs * s;
-  res = sn + cor;
-  cor = (sn - res) + cor;
-  *corp = cor;
-  return res;
+  return sn + cor;
 }

 /* A more precise variant of DO_SIN.  EPS is the adjustment to the correction
@@ -330,19 +323,19 @@ static double
 __always_inline
 do_sincos (double a, double da, int4 n)
 {
-  double retval, cor;
+  double retval;

   if (n & 1)
     /* Max ULP is 0.513.  */
-    retval = do_cos (a, da, &cor);
+    retval = do_cos (a, da);
   else
     {
       double xx = a * a;
       /* Max ULP is 0.501 if xx < 0.01588, otherwise ULP is 0.518.  */
       if (xx < 0.01588)
-        retval = TAYLOR_SIN (xx, a, da, cor);
+        retval = TAYLOR_SIN (xx, a, da);
       else
-        retval = __copysign (do_sin (a, da, &cor), a);
+        retval = __copysign (do_sin (a, da), a);
     }

   return (n & 2) ? -retval : retval;
@@ -362,7 +355,7 @@ SECTION
 __sin (double x)
 {
 #ifndef IN_SINCOS
-  double xx, t, a, da, cor;
+  double xx, t, a, da;
   mynumber u;
   int4 k, m, n;
   double retval = 0;
@@ -396,7 +389,7 @@ __sin (double x)
   else if (k < 0x3feb6000)
     {
       /* Max ULP is 0.548.  */
-      retval = __copysign (do_sin (x, 0, &cor), x);
+      retval = __copysign (do_sin (x, 0), x);
     } /* else if (k < 0x3feb6000) */

 /*----------------------- 0.855469 <|x|<2.426265 ----------------------*/
@@ -404,7 +397,7 @@ __sin (double x)
     {
       t = hp0 - fabs (x);
       /* Max ULP is 0.51.  */
-      retval = __copysign (do_cos (t, hp1, &cor), x);
+      retval = __copysign (do_cos (t, hp1), x);
     } /* else if (k < 0x400368fd) */

 #ifndef IN_SINCOS
@@ -417,8 +410,10 @@ __sin (double x)

 /* --------------------105414350 <|x| <2^1024------------------------------*/
   else if (k < 0x7ff00000)
-    retval = reduce_and_compute (x, false);
-
+    {
+      n = __branred (x, &a, &da);
+      retval = do_sincos (a, da, n);
+    }
 /*--------------------- |x| > 2^1024 ----------------------------------*/
   else
     {
@@ -445,7 +440,7 @@ SECTION
 #endif
 __cos (double x)
 {
-  double y, xx, cor, a, da;
+  double y, xx, a, da;
   mynumber u;
 #ifndef IN_SINCOS
   int4 k, m, n;
@@ -470,7 +465,7 @@ __cos (double x)
   else if (k < 0x3feb6000)
     { /* 2^-27 < |x| < 0.855469 */
       /* Max ULP is 0.51.  */
-      retval = do_cos (x, 0, &cor);
+      retval = do_cos (x, 0);
     } /* else if (k < 0x3feb6000) */

   else if (k < 0x400368fd)
@@ -482,9 +477,9 @@ __cos (double x)
       /* Max ULP is 0.501 if xx < 0.01588 or 0.518 otherwise.
          Range reduction uses 106 bits here which is sufficient.  */
       if (xx < 0.01588)
-        retval = TAYLOR_SIN (xx, a, da, cor);
+        retval = TAYLOR_SIN (xx, a, da);
       else
-        retval = __copysign (do_sin (a, da, &cor), a);
+        retval = __copysign (do_sin (a, da), a);
     } /* else if (k < 0x400368fd) */


@@ -497,7 +492,10 @@ __cos (double x)

   /* 105414350 <|x| <2^1024 */
   else if (k < 0x7ff00000)
-    retval = reduce_and_compute (x, true);
+    {
+      n = __branred (x, &a, &da);
+      retval = do_sincos (a, da, n + 1);
+    }

   else
     {
diff --git a/sysdeps/ieee754/dbl-64/s_sincos.c b/sysdeps/ieee754/dbl-64/s_sincos.c
index 4f032d2e42593ccde22169b374728386dd8fca8e..4335ecbba3c9894e61c087ac970b392fa73abfab 100644
--- a/sysdeps/ieee754/dbl-64/s_sincos.c
+++ b/sysdeps/ieee754/dbl-64/s_sincos.c
@@ -28,37 +28,6 @@
 #define IN_SINCOS 1
 #include "s_sin.c"

-/* Consolidated version of reduce_and_compute in s_sin.c that does range
-   reduction only once and computes sin and cos together.  */
-static inline void
-__always_inline
-reduce_and_compute_sincos (double x, double *sinx, double *cosx)
-{
-  double a, da;
-  unsigned int n = __branred (x, &a, &da);
-
-  n = n & 3;
-
-  if (n == 1 || n == 2)
-    {
-      a = -a;
-      da = -da;
-    }
-
-  if (n & 1)
-    {
-      double *temp = cosx;
-      cosx = sinx;
-      sinx = temp;
-    }
-
-  if (a * a < 0.01588)
-    *sinx = bsloww (a, da, x, n);
-  else
-    *sinx = bsloww1 (a, da, x, n);
-  *cosx = bsloww2 (a, da, x, n);
-}
-
 void
 __sincos (double x, double *sinx, double *cosx)
 {
@@ -88,8 +57,11 @@ __sincos (double x, double *sinx, double *cosx)
     }
   if (k < 0x7ff00000)
     {
-      reduce_and_compute_sincos (x, sinx, cosx);
-      return;
+      double a, da;
+      int4 n = __branred (x, &a, &da);
+
+      *sinx = do_sincos (a, da, n);
+      *cosx = do_sincos (a, da, n + 1);
     }

   if (isinf (x))

From patchwork Wed Mar 21 17:54:22 2018
X-Patchwork-Submitter: Wilco Dijkstra
X-Patchwork-Id: 888981
From: Wilco Dijkstra
To: "libc-alpha@sourceware.org"
CC: nd
Subject: [PATCH 5/7] sin/cos slow paths: remove unused slowpath functions
Date: Wed, 21 Mar 2018 17:54:22 +0000

Remove all unused slowpath functions.

ChangeLog:
2018-03-20  Wilco Dijkstra

        * sysdeps/ieee754/dbl-64/s_sin.c (TAYLOR_SLOW): Remove.
        (do_cos_slow): Likewise.
        (do_sin_slow): Likewise.
        (reduce_and_compute): Likewise.
        (slow): Likewise.
        (slow1): Likewise.
        (slow2): Likewise.
        (sloww): Likewise.
        (sloww1): Likewise.
        (sloww2): Likewise.
        (bslow): Likewise.
        (bslow1): Likewise.
        (bslow2): Likewise.
        (cslow2): Likewise.

diff --git a/sysdeps/ieee754/dbl-64/s_sin.c b/sysdeps/ieee754/dbl-64/s_sin.c
index 099a8a128f9883d1e683436a9f09720922e923ce..7a55636889f186849f638c4c510ee29dd007d655 100644
--- a/sysdeps/ieee754/dbl-64/s_sin.c
+++ b/sysdeps/ieee754/dbl-64/s_sin.c
@@ -22,22 +22,11 @@
 /* */
 /* FUNCTIONS: usin */
 /* ucos */
-/* slow */
-/* slow1 */
-/* slow2 */
-/* sloww */
-/* sloww1 */
-/* sloww2 */
-/* bsloww */
-/* bsloww1 */
-/* bsloww2 */
-/* cslow2 */
 /* FILES NEEDED: dla.h endian.h mpa.h mydefs.h usncs.h */
-/* branred.c sincos32.c dosincos.c mpa.c */
-/* sincos.tbl */
+/* branred.c sincos.tbl */
 /* */
-/* An ultimate sin and routine. Given an IEEE double machine number x */
-/* it computes the correctly rounded (to nearest) value of sin(x) or cos(x) */
+/* An ultimate sin and cos routine. Given an IEEE double machine number x */
+/* it computes sin(x) or cos(x) with ~0.55 ULP. */
 /* Assumption: Machine arithmetic operations are performed in */
 /* round to nearest mode of IEEE 754 standard. */
 /* */
@@ -74,29 +63,6 @@
   res; \
 })

-/* This is again a variation of the Taylor series expansion with the term
-   x^3/3! expanded into the following for better accuracy:
-
-   bb * x ^ 3 + 3 * aa * x * x1 * x2 + aa * x1 ^ 3 + aa * x2 ^ 3
-
-   The correction term is dx and bb + aa = -1/3!
-   */
-#define TAYLOR_SLOW(x0, dx, cor) \
-({ \
-  static const double th2_36 = 206158430208.0; /* 1.5*2**37 */ \
-  double xx = (x0) * (x0); \
-  double x1 = ((x0) + th2_36) - th2_36; \
-  double y = aa * x1 * x1 * x1; \
-  double r = (x0) + y; \
-  double x2 = ((x0) - x1) + (dx); \
-  double t = (((POLYNOMIAL2 (xx) + bb) * xx + 3.0 * aa * x1 * x2) \
-              * (x0) + aa * x2 * x2 * x2 + (dx)); \
-  t = (((x0) - r) + y) + t; \
-  double res = r + t; \
-  (cor) = (r - res) + t; \
-  res; \
-})
-
 #define SINCOS_TABLE_LOOKUP(u, sn, ssn, cs, ccs) \
 ({ \
   int4 k = u.i[LOW_HALF] << 2; \
@@ -123,23 +89,7 @@ static const double
   cs4 = -4.16666666666664434524222570944589E-02,
   cs6 = 1.38888874007937613028114285595617E-03;

-static const double t22 = 0x1.8p22;
-
-void __dubsin (double x, double dx, double w[]);
-void __docos (double x, double dx, double w[]);
-double __mpsin (double x, double dx, bool reduce_range);
-double __mpcos (double x, double dx, bool reduce_range);
-static double slow (double x);
-static double slow1 (double x);
-static double slow2 (double x);
-static double sloww (double x, double dx, double orig, bool shift_quadrant);
-static double sloww1 (double x, double dx, double orig, bool shift_quadrant);
-static double sloww2 (double x, double dx, double orig, int n);
-static double bsloww (double x, double dx, double orig, int n);
-static double bsloww1 (double x, double dx, double orig, int n);
-static double bsloww2 (double x, double dx, double orig, int n);
 int __branred (double x, double *a, double *aa);
-static double cslow2 (double x);

 /* Given a number partitioned into X and DX, this function computes the cosine
    of the number by combining the sin and cos of X (as computed by a variation
@@ -166,40 +116,6 @@ do_cos (double x, double dx)
   return cs + cor;
 }

-/* A more precise variant of DO_COS.  EPS is the adjustment to the correction
-   COR.  */
-static inline double
-__always_inline
-do_cos_slow (double x, double dx, double eps, double *corp)
-{
-  mynumber u;
-
-  if (x <= 0)
-    dx = -dx;
-
-  u.x = big + fabs (x);
-  x = fabs (x) - (u.x - big);
-
-  double xx, y, x1, x2, e1, e2, res, cor;
-  double s, sn, ssn, c, cs, ccs;
-  xx = x * x;
-  s = x * xx * (sn3 + xx * sn5);
-  c = x * dx + xx * (cs2 + xx * (cs4 + xx * cs6));
-  SINCOS_TABLE_LOOKUP (u, sn, ssn, cs, ccs);
-  x1 = (x + t22) - t22;
-  x2 = (x - x1) + dx;
-  e1 = (sn + t22) - t22;
-  e2 = (sn - e1) + ssn;
-  cor = (ccs - cs * c - e1 * x2 - e2 * x) - sn * s;
-  y = cs - e1 * x1;
-  cor = cor + ((cs - y) - e1 * x1);
-  res = y + cor;
-  cor = (y - res) + cor;
-  cor = 1.0005 * cor + __copysign (eps, cor);
-  *corp = cor;
-  return res;
-}
-
 /* Given a number partitioned into X and DX, this function computes the sine of
    the number by combining the sin and cos of X (as computed by a variation of
    the Taylor series) with the values looked up from the sin/cos table to get
@@ -224,70 +140,6 @@ do_sin (double x, double dx)
   return sn + cor;
 }

-/* A more precise variant of DO_SIN.  EPS is the adjustment to the correction
-   COR.  */
-static inline double
-__always_inline
-do_sin_slow (double x, double dx, double eps, double *corp)
-{
-  mynumber u;
-
-  if (x <= 0)
-    dx = -dx;
-  u.x = big + fabs (x);
-  x = fabs (x) - (u.x - big);
-
-  double xx, y, x1, x2, c1, c2, res, cor;
-  double s, sn, ssn, c, cs, ccs;
-  xx = x * x;
-  s = x * xx * (sn3 + xx * sn5);
-  c = xx * (cs2 + xx * (cs4 + xx * cs6));
-  SINCOS_TABLE_LOOKUP (u, sn, ssn, cs, ccs);
-  x1 = (x + t22) - t22;
-  x2 = (x - x1) + dx;
-  c1 = (cs + t22) - t22;
-  c2 = (cs - c1) + ccs;
-  cor = (ssn + s * ccs + cs * s + c2 * x + c1 * x2 - sn * x * dx) - sn * c;
-  y = sn + c1 * x1;
-  cor = cor + ((sn - y) + c1 * x1);
-  res = y + cor;
-  cor = (y - res) + cor;
-  cor = 1.0005 * cor + __copysign (eps, cor);
-  *corp = cor;
-  return res;
-}
-
-/* Reduce range of X and compute sin of a + da.  When SHIFT_QUADRANT is true,
-   the routine returns the cosine of a + da by rotating the quadrant once and
-   computing the sine of the result.  */
-static inline double
-__always_inline
-reduce_and_compute (double x, bool shift_quadrant)
-{
-  double retval = 0, a, da;
-  unsigned int n = __branred (x, &a, &da);
-  int4 k = (n + shift_quadrant) % 4;
-  switch (k)
-    {
-    case 2:
-      a = -a;
-      da = -da;
-      /* Fall through.  */
-    case 0:
-      if (a * a < 0.01588)
-        retval = bsloww (a, da, x, n);
-      else
-        retval = bsloww1 (a, da, x, n);
-      break;
-
-    case 1:
-    case 3:
-      retval = bsloww2 (a, da, x, n);
-      break;
-    }
-  return retval;
-}
-
 /* Reduce range of x to within PI/2 with abs (x) < 105414350.  The high part
    is written to *a, the low part to *da.  Range reduction is accurate to 136
    bits so that when x is large and *a very close to zero, all 53 bits of *a
@@ -508,299 +360,6 @@ __cos (double x)
   return retval;
 }

-/************************************************************************/
-/* Routine compute sin(x) for 2^-26 < |x|< 0.25 by Taylor with more */
-/* precision and if still doesn't accurate enough by mpsin or dubsin */
-/************************************************************************/
-
-static inline double
-__always_inline
-slow (double x)
-{
-  double res, cor, w[2];
-  res = TAYLOR_SLOW (x, 0, cor);
-  if (res == res + 1.0007 * cor)
-    return res;
-
-  __dubsin (fabs (x), 0, w);
-  if (w[0] == w[0] + 1.000000001 * w[1])
-    return __copysign (w[0], x);
-
-  return __copysign (__mpsin (fabs (x), 0, false), x);
-}
-
-/*******************************************************************************/
-/* Routine compute sin(x) for 0.25<|x|< 0.855469 by __sincostab.tbl and Taylor */
-/* and if result still doesn't accurate enough by mpsin or dubsin */
-/*******************************************************************************/
-
-static inline double
-__always_inline
-slow1 (double x)
-{
-  double w[2], cor, res;
-
-  res = do_sin_slow (x, 0, 0, &cor);
-  if (res == res + cor)
-    return res;
-
-  __dubsin (fabs (x), 0, w);
-  if (w[0] == w[0] + 1.000000005 * w[1])
-    return w[0];
-
-  return __mpsin (fabs (x), 0, false);
-}
-
-/**************************************************************************/
-/* Routine compute sin(x) for 0.855469 <|x|<2.426265 by __sincostab.tbl */
-/* and if result still doesn't accurate enough by mpsin or dubsin */
-/**************************************************************************/
-static inline double
-__always_inline
-slow2 (double x)
-{
-  double w[2], y, y1, y2, cor, res;
-
-  double t = hp0 - fabs (x);
-  res = do_cos_slow (t, hp1, 0, &cor);
-  if (res == res + cor)
-    return res;
-
-  y = fabs (x) - hp0;
-  y1 = y - hp1;
-  y2 = (y - y1) - hp1;
-  __docos (y1, y2, w);
-  if (w[0] == w[0] + 1.000000005 * w[1])
-    return w[0];
-
-  return __mpsin (fabs (x), 0, false);
-}
-
-/* Compute sin(x + dx) where X is small enough to use Taylor series around zero
-   and (x + dx) in the first or third quarter of the unit circle.  ORIG is the
-   original value of X for computing error of the result.  If the result is not
-   accurate enough, the routine calls mpsin or dubsin.  SHIFT_QUADRANT rotates
-   the unit circle by 1 to compute the cosine instead of sine.  */
-static inline double
-__always_inline
-sloww (double x, double dx, double orig, bool shift_quadrant)
-{
-  double y, t, res, cor, w[2], a, da, xn;
-  mynumber v;
-  int4 n;
-  res = TAYLOR_SLOW (x, dx, cor);
-
-  double eps = fabs (orig) * 3.1e-30;
-
-  cor = 1.0005 * cor + __copysign (eps, cor);
-
-  if (res == res + cor)
-    return res;
-
-  a = fabs (x);
-  da = (x > 0) ? dx : -dx;
-  __dubsin (a, da, w);
-  eps = fabs (orig) * 1.1e-30;
-  cor = 1.000000001 * w[1] + __copysign (eps, w[1]);
-
-  if (w[0] == w[0] + cor)
-    return __copysign (w[0], x);
-
-  t = (orig * hpinv + toint);
-  xn = t - toint;
-  v.x = t;
-  y = (orig - xn * mp1) - xn * mp2;
-  n = (v.i[LOW_HALF] + shift_quadrant) & 3;
-  da = xn * pp3;
-  t = y - da;
-  da = (y - t) - da;
-  y = xn * pp4;
-  a = t - y;
-  da = ((t - a) - y) + da;
-
-  if (n & 2)
-    {
-      a = -a;
-      da = -da;
-    }
-  x = fabs (a);
-  dx = (a > 0) ? da : -da;
-  __dubsin (x, dx, w);
-  eps = fabs (orig) * 1.1e-40;
-  cor = 1.000000001 * w[1] + __copysign (eps, w[1]);
-
-  if (w[0] == w[0] + cor)
-    return __copysign (w[0], a);
-
-  return shift_quadrant ? __mpcos (orig, 0, true) : __mpsin (orig, 0, true);
-}
-
-/* Compute sin(x + dx) where X is in the first or third quarter of the unit
-   circle.  ORIG is the original value of X for computing error of the result.
-   If the result is not accurate enough, the routine calls mpsin or dubsin.
-   SHIFT_QUADRANT rotates the unit circle by 1 to compute the cosine instead of
-   sine.  */
-static inline double
-__always_inline
-sloww1 (double x, double dx, double orig, bool shift_quadrant)
-{
-  double w[2], cor, res;
-
-  res = do_sin_slow (x, dx, 3.1e-30 * fabs (orig), &cor);
-
-  if (res == res + cor)
-    return __copysign (res, x);
-
-  dx = (x > 0 ? dx : -dx);
-  __dubsin (fabs (x), dx, w);
-
-  double eps = 1.1e-30 * fabs (orig);
-  cor = 1.000000005 * w[1] + __copysign (eps, w[1]);
-
-  if (w[0] == w[0] + cor)
-    return __copysign (w[0], x);
-
-  return shift_quadrant ? __mpcos (orig, 0, true) : __mpsin (orig, 0, true);
-}
-
-/***************************************************************************/
-/* Routine compute sin(x+dx) (Double-Length number) where x in second or */
-/* fourth quarter of unit circle.Routine receive also the original value */
-/* and quarter(n= 1or 3)of x for computing error of result.And if result not*/
-/* accurate enough routine calls mpsin1 or dubsin */
-/***************************************************************************/
-
-static inline double
-__always_inline
-sloww2 (double x, double dx, double orig, int n)
-{
-  double w[2], cor, res;
-
-  res = do_cos_slow (x, dx, 3.1e-30 * fabs (orig), &cor);
-
-  if (res == res + cor)
-    return (n & 2) ? -res : res;
-
-  dx = x > 0 ? dx : -dx;
-  __docos (fabs (x), dx, w);
-
-  double eps = 1.1e-30 * fabs (orig);
-  cor = 1.000000005 * w[1] + __copysign (eps, w[1]);
-
-  if (w[0] == w[0] + cor)
-    return (n & 2) ? -w[0] : w[0];
-
-  return (n & 1) ? __mpsin (orig, 0, true) : __mpcos (orig, 0, true);
-}
-
-/***************************************************************************/
-/* Routine compute sin(x+dx) or cos(x+dx) (Double-Length number) where x */
-/* is small enough to use Taylor series around zero and (x+dx) */
-/* in first or third quarter of unit circle.Routine receive also */
-/* (right argument) the original value of x for computing error of */
-/* result.And if result not accurate enough routine calls other routines */
-/***************************************************************************/
-
-static inline double
-__always_inline
-bsloww (double x, double dx, double orig, int n)
-{
-  double res, cor, w[2], a, da;
-
-  res = TAYLOR_SLOW (x, dx, cor);
-  cor = 1.0005 * cor + __copysign (1.1e-24, cor);
-  if (res == res + cor)
-    return res;
-
-  a = fabs (x);
-  da = (x > 0) ? dx : -dx;
-  __dubsin (a, da, w);
-  cor = 1.000000001 * w[1] + __copysign (1.1e-24, w[1]);
-
-  if (w[0] == w[0] + cor)
-    return __copysign (w[0], x);
-
-  return (n & 1) ? __mpcos (orig, 0, true) : __mpsin (orig, 0, true);
-}
-
-/***************************************************************************/
-/* Routine compute sin(x+dx) or cos(x+dx) (Double-Length number) where x */
-/* in first or third quarter of unit circle.Routine receive also */
-/* (right argument) the original value of x for computing error of result.*/
-/* And if result not accurate enough routine calls other routines */
-/***************************************************************************/
-
-static inline double
-__always_inline
-bsloww1 (double x, double dx, double orig, int n)
-{
-  double w[2], cor, res;
-
-  res = do_sin_slow (x, dx, 1.1e-24, &cor);
-  if (res == res + cor)
-    return (x > 0) ? res : -res;
-
-  dx = (x > 0) ? dx : -dx;
-  __dubsin (fabs (x), dx, w);
-
-  cor = 1.000000005 * w[1] + __copysign (1.1e-24, w[1]);
-
-  if (w[0] == w[0] + cor)
-    return __copysign (w[0], x);
-
-  return (n & 1) ? __mpcos (orig, 0, true) : __mpsin (orig, 0, true);
-}
-
-/***************************************************************************/
-/* Routine compute sin(x+dx) or cos(x+dx) (Double-Length number) where x */
-/* in second or fourth quarter of unit circle.Routine receive also the */
-/* original value and quarter(n= 1or 3)of x for computing error of result. */
-/* And if result not accurate enough routine calls other routines */
-/***************************************************************************/
-
-static inline double
-__always_inline
-bsloww2 (double x, double dx, double orig, int n)
-{
-  double w[2], cor, res;
-
-  res = do_cos_slow (x, dx, 1.1e-24, &cor);
-  if (res == res + cor)
-    return (n & 2) ? -res : res;
-
-  dx = (x > 0) ? dx : -dx;
-  __docos (fabs (x), dx, w);
-
-  cor = 1.000000005 * w[1] + __copysign (1.1e-24, w[1]);
-
-  if (w[0] == w[0] + cor)
-    return (n & 2) ? -w[0] : w[0];
-
-  return (n & 1) ? __mpsin (orig, 0, true) : __mpcos (orig, 0, true);
-}
-
-/************************************************************************/
-/* Routine compute cos(x) for 2^-27 < |x|< 0.25 by Taylor with more */
-/* precision and if still doesn't accurate enough by mpcos or docos */
-/************************************************************************/
-
-static inline double
-__always_inline
-cslow2 (double x)
-{
-  double w[2], cor, res;
-
-  res = do_cos_slow (x, 0, 0, &cor);
-  if (res == res + cor)
-    return res;
-
-  __docos (fabs (x), 0, w);
-  if (w[0] == w[0] + 1.000000005 * w[1])
-    return w[0];
-
-  return __mpcos (x, 0, false);
-}
-
 #ifndef __cos
 libm_alias_double (__cos, cos)
 #endif

From patchwork Wed Mar 21 17:55:28 2018
X-Patchwork-Submitter: Wilco Dijkstra
X-Patchwork-Id: 888982
From: Wilco Dijkstra
To: "libc-alpha@sourceware.org"
CC: nd
Subject: [PATCH 6/7] sin/cos slow paths: refactor duplicated code into dosin
Date: Wed, 21 Mar 2018 17:55:28 +0000

Refactor duplicated code into do_sin. Since all calls to do_sin use copysign
to set the sign of the result, move it inside do_sin. Small inputs use a
separate polynomial, so move this into do_sin as well (the check is based on
the more conservative case when doing large range reduction, but could be
relaxed).

ChangeLog:
2018-03-20  Wilco Dijkstra

    * sysdeps/ieee754/dbl-64/s_sin.c (do_sin): Use TAYLOR_SIN for small
    inputs. Return correct sign.
    (do_sincos): Remove small input check before do_sin, let do_sin set
    the sign.
    (__sin): Likewise.
    (__cos): Likewise.

diff --git a/sysdeps/ieee754/dbl-64/s_sin.c b/sysdeps/ieee754/dbl-64/s_sin.c
index 7a55636889f186849f638c4c510ee29dd007d655..e4a2153bb8d010d72d898c0d08e9253f4173f51d 100644
--- a/sysdeps/ieee754/dbl-64/s_sin.c
+++ b/sysdeps/ieee754/dbl-64/s_sin.c
@@ -124,6 +124,11 @@ static inline double
 __always_inline
 do_sin (double x, double dx)
 {
+  double xold = x;
+  /* Max ULP is 0.501 if |x| < 0.126, otherwise ULP is 0.518. */
+  if (fabs (x) < 0.126)
+    return TAYLOR_SIN (x * x, x, dx);
+
   mynumber u;
 
   if (x <= 0)
@@ -137,7 +142,7 @@ do_sin (double x, double dx)
   c = x * dx + xx * (cs2 + xx * (cs4 + xx * cs6));
   SINCOS_TABLE_LOOKUP (u, sn, ssn, cs, ccs);
   cor = (ssn + s * ccs - sn * c) + cs * s;
-  return sn + cor;
+  return __copysign (sn + cor, xold);
 }
 
 /* Reduce range of x to within PI/2 with abs (x) < 105414350. The high part
@@ -181,14 +186,8 @@ do_sincos (double a, double da, int4 n)
     /* Max ULP is 0.513. */
     retval = do_cos (a, da);
   else
-    {
-      double xx = a * a;
-      /* Max ULP is 0.501 if xx < 0.01588, otherwise ULP is 0.518. */
-      if (xx < 0.01588)
-        retval = TAYLOR_SIN (xx, a, da);
-      else
-        retval = __copysign (do_sin (a, da), a);
-    }
+    /* Max ULP is 0.501 if xx < 0.01588, otherwise ULP is 0.518. */
+    retval = do_sin (a, da);
 
   return (n & 2) ? -retval : retval;
 }
@@ -207,7 +206,7 @@ SECTION
 __sin (double x)
 {
 #ifndef IN_SINCOS
-  double xx, t, a, da;
+  double t, a, da;
   mynumber u;
   int4 k, m, n;
   double retval = 0;
@@ -228,20 +227,11 @@ __sin (double x)
       math_check_force_underflow (x);
       retval = x;
     }
-  /*---------------------------- 2^-26 < |x|< 0.25 ----------------------*/
-  else if (k < 0x3fd00000)
-    {
-      xx = x * x;
-      /* Taylor series. */
-      t = POLYNOMIAL (xx) * (xx * x);
-      /* Max ULP of x + t is 0.535. */
-      retval = x + t;
-    } /* else if (k < 0x3fd00000) */
-/*---------------------------- 0.25<|x|< 0.855469---------------------- */
+/*--------------------------- 2^-26<|x|< 0.855469---------------------- */
   else if (k < 0x3feb6000)
     {
       /* Max ULP is 0.548. */
-      retval = __copysign (do_sin (x, 0), x);
+      retval = do_sin (x, 0);
     } /* else if (k < 0x3feb6000) */
 
 /*----------------------- 0.855469 <|x|<2.426265 ----------------------*/
@@ -292,7 +282,7 @@ SECTION
 #endif
 __cos (double x)
 {
-  double y, xx, a, da;
+  double y, a, da;
   mynumber u;
 #ifndef IN_SINCOS
   int4 k, m, n;
@@ -325,13 +315,9 @@ __cos (double x)
       y = hp0 - fabs (x);
       a = y + hp1;
       da = (y - a) + hp1;
-      xx = a * a;
       /* Max ULP is 0.501 if xx < 0.01588 or 0.518 otherwise.
          Range reduction uses 106 bits here which is sufficient. */
-      if (xx < 0.01588)
-        retval = TAYLOR_SIN (xx, a, da);
-      else
-        retval = __copysign (do_sin (a, da), a);
+      retval = do_sin (a, da);
     } /* else if (k < 0x400368fd) */

From patchwork Wed Mar 21 17:57:29 2018
X-Patchwork-Submitter: Wilco Dijkstra
X-Patchwork-Id: 888984
From: Wilco Dijkstra
To: "libc-alpha@sourceware.org"
CC: nd
Subject: [PATCH 7/7] sin/cos slow paths: refactor sincos implementation
Date: Wed, 21 Mar 2018 17:57:29 +0000

Refactor the sincos implementation - rather than rely on odd partial inlining
of preprocessed portions from sin and cos, explicitly write out the cases.
This makes sincos much easier to maintain and provides an additional 16-20%
speedup between 0 and 2^27. The overall speedup of sincos is 48% over this
range. Between 0 and PI it is 66% faster.

ChangeLog:
2018-03-20  Wilco Dijkstra

    * sysdeps/ieee754/dbl-64/s_sin.c (__sin): Cleanup ifdefs.
    (__cos): Likewise.
    * sysdeps/ieee754/dbl-64/s_sincos.c (__sincos): Refactor using the same
    logic as sin and cos.

diff --git a/sysdeps/ieee754/dbl-64/s_sin.c b/sysdeps/ieee754/dbl-64/s_sin.c
index e4a2153bb8d010d72d898c0d08e9253f4173f51d..2fde7713ee340aa8e3ce143db7254d0b57f1ab5d 100644
--- a/sysdeps/ieee754/dbl-64/s_sin.c
+++ b/sysdeps/ieee754/dbl-64/s_sin.c
@@ -197,27 +197,17 @@ do_sincos (double a, double da, int4 n)
 /* An ultimate sin routine. Given an IEEE double machine number x */
 /* it computes the correctly rounded (to nearest) value of sin(x) */
 /*******************************************************************/
-#ifdef IN_SINCOS
-static double
-#else
+#ifndef IN_SINCOS
 double
 SECTION
-#endif
 __sin (double x)
 {
-#ifndef IN_SINCOS
   double t, a, da;
   mynumber u;
   int4 k, m, n;
   double retval = 0;
 
   SET_RESTORE_ROUND_53BIT (FE_TONEAREST);
-#else
-  double xx, t, cor;
-  mynumber u;
-  int4 k, m;
-  double retval = 0;
-#endif
 
   u.x = x;
   m = u.i[HIGH_HALF];
@@ -242,7 +232,6 @@ __sin (double x)
       retval = __copysign (do_cos (t, hp1), x);
     } /* else if (k < 0x400368fd) */
 
-#ifndef IN_SINCOS
 /*-------------------------- 2.426265<|x|< 105414350 ----------------------*/
   else if (k < 0x419921FB)
     {
@@ -263,7 +252,6 @@ __sin (double x)
         __set_errno (EDOM);
       retval = x / x;
     }
-#endif
 
   return retval;
 }
@@ -274,27 +262,17 @@ __sin (double x)
 /* it computes the correctly rounded (to nearest) value of cos(x) */
 /*******************************************************************/
 
-#ifdef IN_SINCOS
-static double
-#else
 double
 SECTION
-#endif
 __cos (double x)
 {
   double y, a, da;
   mynumber u;
-#ifndef IN_SINCOS
   int4 k, m, n;
-#else
-  int4 k, m;
-#endif
 
   double retval = 0;
 
-#ifndef IN_SINCOS
   SET_RESTORE_ROUND_53BIT (FE_TONEAREST);
-#endif
 
   u.x = x;
   m = u.i[HIGH_HALF];
@@ -320,8 +298,6 @@ __cos (double x)
       retval = do_sin (a, da);
     } /* else if (k < 0x400368fd) */
 
-
-#ifndef IN_SINCOS
   else if (k < 0x419921FB)
     { /* 2.426265<|x|< 105414350 */
       n = reduce_sincos (x, &a, &da);
@@ -341,7 +317,6 @@ __cos (double x)
         __set_errno (EDOM);
       retval = x / x; /* |x| > 2^1024 */
     }
-#endif
 
   return retval;
 }
@@ -352,3 +327,5 @@ libm_alias_double (__cos, cos)
 #ifndef __sin
 libm_alias_double (__sin, sin)
 #endif
+
+#endif
diff --git a/sysdeps/ieee754/dbl-64/s_sincos.c b/sysdeps/ieee754/dbl-64/s_sincos.c
index 4335ecbba3c9894e61c087ac970b392fa73abfab..c7460371e44a02c99522f265efa7e5e66a121b1e 100644
--- a/sysdeps/ieee754/dbl-64/s_sincos.c
+++ b/sysdeps/ieee754/dbl-64/s_sincos.c
@@ -23,9 +23,7 @@
 #include
 #include
 
-#define __sin __sin_local
-#define __cos __cos_local
-#define IN_SINCOS 1
+#define IN_SINCOS
 #include "s_sin.c"
 
 void
@@ -37,31 +35,63 @@ __sincos (double x, double *sinx, double *cosx)
   SET_RESTORE_ROUND_53BIT (FE_TONEAREST);
 
   u.x = x;
-  k = 0x7fffffff & u.i[HIGH_HALF];
+  k = u.i[HIGH_HALF] & 0x7fffffff;
 
   if (k < 0x400368fd)
     {
-      *sinx = __sin_local (x);
-      *cosx = __cos_local (x);
-      return;
-    }
-  if (k < 0x419921FB)
-    {
-      double a, da;
-      int4 n = reduce_sincos (x, &a, &da);
-
-      *sinx = do_sincos (a, da, n);
-      *cosx = do_sincos (a, da, n + 1);
+      double a, da, y;
+      /* |x| < 2^-27 => cos (x) = 1, sin (x) = x. */
+      if (k < 0x3e400000)
+        {
+          if (k < 0x3e500000)
+            math_check_force_underflow (x);
+          *sinx = x;
+          *cosx = 1.0;
+          return;
+        }
+      /* |x| < 0.855469. */
+      else if (k < 0x3feb6000)
+        {
+          *sinx = do_sin (x, 0);
+          *cosx = do_cos (x, 0);
+          return;
+        }
+      /* |x| < 2.426265. */
+      y = hp0 - fabs (x);
+      a = y + hp1;
+      da = (y - a) + hp1;
+      *sinx = __copysign (do_cos (a, da), x);
+      *cosx = do_sin (a, da);
       return;
     }
+  /* |x| < 2^1024. */
   if (k < 0x7ff00000)
     {
-      double a, da;
-      int4 n = __branred (x, &a, &da);
+      double a, da, xx;
+      unsigned int n;
 
-      *sinx = do_sincos (a, da, n);
-      *cosx = do_sincos (a, da, n + 1);
+      /* If |x| < 105414350 use simple range reduction. */
+      n = k < 0x419921FB ? reduce_sincos (x, &a, &da) : __branred (x, &a, &da);
+      n = n & 3;
+
+      if (n == 1 || n == 2)
+        {
+          a = -a;
+          da = -da;
+        }
+
+      if (n & 1)
+        {
+          double *temp = cosx;
+          cosx = sinx;
+          sinx = temp;
+        }
+
+      *sinx = do_sin (a, da);
+      xx = do_cos (a, da);
+      *cosx = (n & 2) ? -xx : xx;
+      return;
     }
 
   if (isinf (x))