From patchwork Fri Mar 9 15:44:49 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Wilco Dijkstra X-Patchwork-Id: 883717 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (mailfrom) smtp.mailfrom=sourceware.org (client-ip=209.132.180.131; helo=sourceware.org; envelope-from=libc-alpha-return-90922-incoming=patchwork.ozlabs.org@sourceware.org; receiver=) Authentication-Results: ozlabs.org; dmarc=none (p=none dis=none) header.from=arm.com Authentication-Results: ozlabs.org; dkim=pass (1024-bit key; secure) header.d=sourceware.org header.i=@sourceware.org header.b="ua4EZ9ks"; dkim-atps=neutral Received: from sourceware.org (server1.sourceware.org [209.132.180.131]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 3zyWsp4G2hz9sbw for ; Sat, 10 Mar 2018 02:45:01 +1100 (AEDT) DomainKey-Signature: a=rsa-sha1; c=nofws; d=sourceware.org; h=list-id :list-unsubscribe:list-subscribe:list-archive:list-post :list-help:sender:from:to:cc:subject:date:message-id :content-type:content-transfer-encoding:mime-version; q=dns; s= default; b=aaVnhjK5B/tS8rs9nu91Xyi9Agv6DbDdUIX/c24kXJ2dAmTZJawPO nZgNcNF8s5XhsKio8gS0Ofw0Jx3F5/0ucN6FYzvKGF4Ta7bNS0h7JII2qyWXzoqy Ib/xz0o+h/x1uNgosWhlwU28q0RWRbylvxqQHDAkOYaJSJsh59gpGw= DKIM-Signature: v=1; a=rsa-sha1; c=relaxed; d=sourceware.org; h=list-id :list-unsubscribe:list-subscribe:list-archive:list-post :list-help:sender:from:to:cc:subject:date:message-id :content-type:content-transfer-encoding:mime-version; s=default; bh=qk0hQ74SBouvXUaGJZB7ODKmyDs=; b=ua4EZ9ksaTy4kqfZqUuytmKCpzvq NJ6WOTIi8J7xdFBTxaff6fO/0HdzuvkWBlwOLomOkQcKT7KXwEi3TU0dYO6sYy/C WwB8vlugSbRR6c2ebEItpC19H0V7wDz55aivcvXhvQZucNk6+Umr+aq3kFfpNc0y wvTC9kFlgBtxE1A= Received: (qmail 7418 invoked by alias); 9 Mar 2018 15:44:55 
-0000 Mailing-List: contact libc-alpha-help@sourceware.org; run by ezmlm Precedence: bulk List-Id: List-Unsubscribe: List-Subscribe: List-Archive: List-Post: List-Help: , Sender: libc-alpha-owner@sourceware.org Delivered-To: mailing list libc-alpha@sourceware.org Received: (qmail 7405 invoked by uid 89); 9 Mar 2018 15:44:54 -0000 Authentication-Results: sourceware.org; auth=none X-Virus-Found: No X-Spam-SWARE-Status: No, score=-24.3 required=5.0 tests=AWL, BAYES_00, GIT_PATCH_0, GIT_PATCH_1, GIT_PATCH_2, GIT_PATCH_3, KAM_ASCII_DIVIDERS, KAM_LOTSOFHASH, RCVD_IN_DNSWL_NONE, SPF_HELO_PASS, SPF_PASS autolearn=ham version=3.3.2 spammy=hp0, million, 19707, 0535 X-HELO: EUR01-HE1-obe.outbound.protection.outlook.com From: Wilco Dijkstra To: "libc-alpha@sourceware.org" CC: nd Subject: [PATCH 1/6] Remove slow paths from sin/cos Date: Fri, 9 Mar 2018 15:44:49 +0000 Message-ID: authentication-results: spf=none (sender IP is ) smtp.mailfrom=Wilco.Dijkstra@arm.com; x-ms-publictraffictype: Email x-microsoft-exchange-diagnostics: 1; DB6PR0801MB2085; 6:r+83dTfhEw+jFJ/+67ZT7Bm0f4jtY8Fi5X9j+pE3jrOsZZzi2wFVQIYT5CW0JbcBT2xJqpYEh12NwvbSqyI/hgrIaizghX4H1TnOT/0Q1zgIOC9gue2WumlRpNcRcM2yQZ9l4OGAMgvOJT3mi4fM02wdkKYTN93fjIj9jBT6bnhiHaTFbCCLkUOvmyY021a1p0ZUB1Wva9nRjSK6blxiX7QzkYZijwPTLz3kSSBk2mJT3UgVzW1fXth9GJJbcSFldRw7lvrDqN2VNWcWxH+Y3vhcbnbsqBCKkIBKcZqeUrlLp7tI1f1LbqF52yrET+eP2/33kTYu2aVx5u+8v2giMOrHNZiV14Bi98zQogwqbFTsRB6wD5blKSQWCRn44uVK; 5:XfQaWHlYR8PCV/xDH16UH7aahUQ5gJojHHkC8UpqaHFdEjwGNYk38TaJwAXIuifmsuBF9fzZFX1NGUiCWAm68V0tRbFIRDqYpgCe313z3SPm79g9K6l38kCFzUX1oN923+y9FWF9GU+05/3TfitxtL2lI+7zn9hJPfABu9ZGfTE=; 24:GgwJvmJyRsRb0OwdR0iAMjh6/1fbmUVdb68bm+HZffz0KWyBqxe7skt1J1dczniVa9PfSpgflB4fyY+t5fgAz2WzsJvlbJ647XuB2WIyP9U=; 7:JFFQIB7/ByjYQQIzXlMhG4o+5jlFZSeomPRUcj21qLTUdjjqp4LEikIGLz3vMhHeo+Xu6ZT1+i9NFhGpNCMWIBqn2mg6XQc6tS0TrqMHGk2mqPdglfRHQf5ek8MHc/u92BnpYnnktDKIClSWwBKa89FWeYRAk2/T/T4r0zLRPhkivDhs5mHxS90LmWwauNKLySicjvTUjezyylx3PniJ8seA5hFGHWeJbn5IKI131Nc0KGxUuUvWv/c/gky/8ZGh 

This series of patches removes the slow paths from sin, cos and sincos.
Besides greatly simplifying the implementation, the new version is also much
faster for inputs up to PI (41% faster) and for large inputs needing range
reduction (27% faster).

The maximum error is ~0.55 ULP, with no errors found after testing 1.6 billion
inputs across most of the range with mpsin and mpcos.  The number of
incorrectly rounded results (i.e. ULP > 0.5) is at most ~2750 per million
inputs between 0.125 and 0.5; the average is ~850 per million between 0 and PI.

Tested on AArch64 and x86_64 with no regressions.

The first patch removes the slow paths for the cases where the input is small
and doesn't require range reduction.  Update ULP tables for sin, cos and sincos
on AArch64 and x86_64.

ChangeLog:
2018-03-09  Wilco Dijkstra

	* sysdeps/aarch64/libm-test-ulps: Update ULP for sin, cos, sincos.
	* sysdeps/ieee754/dbl-64/s_sin.c (__sin): Remove slow paths for small
	inputs.
	(__cos): Likewise.
	* sysdeps/x86_64/fpu/libm-test-ulps: Update ULP for sin, cos, sincos.

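
The ULP figures quoted above can be reproduced in spirit with a small harness.
The following is only a sketch under stated assumptions: it uses a long double
reference (e.g. sinl) instead of the mpsin/mpcos multiprecision routines used
for the actual testing, and ulp_of and ulp_error are hypothetical helper names
that do not exist in glibc.  On targets where long double is no wider than
double the reference is not more precise than the result, and subnormal
references are not handled.

```c
#include <assert.h>
#include <math.h>

/* Spacing of adjacent doubles around REF (one unit in the last place).
   Assumes IEEE 754 binary64 doubles and a normal, non-zero REF.  */
static long double
ulp_of (long double ref)
{
  int e;
  /* |ref| = m * 2^e with 0.5 <= m < 1; only the exponent is needed.  */
  (void) frexpl (fabsl (ref), &e);
  /* Doubles in [2^(e-1), 2^e) are spaced 2^(e-53) apart.  */
  return ldexpl (1.0L, e - 53);
}

/* Error of GOT relative to the reference WANT, measured in ULPs.
   A result is incorrectly rounded when this exceeds 0.5.  */
static long double
ulp_error (double got, long double want)
{
  assert (isfinite (want) && want != 0.0L);
  return fabsl ((long double) got - want) / ulp_of (want);
}
```

For example, ulp_error (sin (x), sinl (x)) evaluated over many x and compared
against 0.5 gives a rough count of incorrectly rounded results, limited by the
precision of the long double reference.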
diff --git a/sysdeps/aarch64/libm-test-ulps b/sysdeps/aarch64/libm-test-ulps index 1f469803be59bb4813370d95c6d091de901e6129..be06085154db24c8fd6cf1bce417028a959aaa27 100644 --- a/sysdeps/aarch64/libm-test-ulps +++ b/sysdeps/aarch64/libm-test-ulps @@ -1012,7 +1012,9 @@ ildouble: 2 ldouble: 2 Function: "cos": +double: 1 float: 1 +idouble: 1 ifloat: 1 ildouble: 1 ldouble: 1 @@ -1970,7 +1972,9 @@ ildouble: 2 ldouble: 2 Function: "sin": +double: 1 float: 1 +idouble: 1 ifloat: 1 ildouble: 1 ldouble: 1 @@ -2000,7 +2004,9 @@ ildouble: 3 ldouble: 3 Function: "sincos": +double: 1 float: 1 +idouble: 1 ifloat: 1 ildouble: 1 ldouble: 1 diff --git a/sysdeps/ieee754/dbl-64/s_sin.c b/sysdeps/ieee754/dbl-64/s_sin.c index 8c589cbd4ab7451a5889e9a474bf4bd36c49d498..9673a461ac592fc2bf3babc755dae336312e4c56 100644 --- a/sysdeps/ieee754/dbl-64/s_sin.c +++ b/sysdeps/ieee754/dbl-64/s_sin.c @@ -448,7 +448,7 @@ SECTION #endif __sin (double x) { - double xx, res, t, cor; + double xx, t, cor; mynumber u; int4 k, m; double retval = 0; @@ -471,26 +471,22 @@ __sin (double x) xx = x * x; /* Taylor series. */ t = POLYNOMIAL (xx) * (xx * x); - res = x + t; - cor = (x - res) + t; - retval = (res == res + 1.07 * cor) ? res : slow (x); + /* Max ULP of x + t is 0.535. */ + retval = x + t; } /* else if (k < 0x3fd00000) */ /*---------------------------- 0.25<|x|< 0.855469---------------------- */ else if (k < 0x3feb6000) { - res = do_sin (x, 0, &cor); - retval = (res == res + 1.096 * cor) ? res : slow1 (x); - retval = __copysign (retval, x); + /* Max ULP is 0.548. */ + retval = __copysign (do_sin (x, 0, &cor), x); } /* else if (k < 0x3feb6000) */ /*----------------------- 0.855469 <|x|<2.426265 ----------------------*/ else if (k < 0x400368fd) { - t = hp0 - fabs (x); - res = do_cos (t, hp1, &cor); - retval = (res == res + 1.020 * cor) ? res : slow2 (x); - retval = __copysign (retval, x); + /* Max ULP is 0.51. 
*/ + retval = __copysign (do_cos (t, hp1, &cor), x); } /* else if (k < 0x400368fd) */ #ifndef IN_SINCOS @@ -541,7 +537,7 @@ SECTION #endif __cos (double x) { - double y, xx, res, cor, a, da; + double y, xx, cor, a, da; mynumber u; int4 k, m; @@ -561,8 +557,8 @@ __cos (double x) else if (k < 0x3feb6000) { /* 2^-27 < |x| < 0.855469 */ - res = do_cos (x, 0, &cor); - retval = (res == res + 1.020 * cor) ? res : cslow2 (x); + /* Max ULP is 0.51. */ + retval = do_cos (x, 0, &cor); } /* else if (k < 0x3feb6000) */ else if (k < 0x400368fd) @@ -571,20 +567,12 @@ __cos (double x) a = y + hp1; da = (y - a) + hp1; xx = a * a; + /* Max ULP is 0.501 if xx < 0.01588 or 0.518 otherwise. + Range reduction uses 106 bits here which is sufficient. */ if (xx < 0.01588) - { - res = TAYLOR_SIN (xx, a, da, cor); - cor = 1.02 * cor + __copysign (1.0e-31, cor); - retval = (res == res + cor) ? res : sloww (a, da, x, true); - } + retval = TAYLOR_SIN (xx, a, da, cor); else - { - res = do_sin (a, da, &cor); - cor = 1.035 * cor + __copysign (1.0e-31, cor); - retval = ((res == res + cor) ? 
__copysign (res, a) - : sloww1 (a, da, x, true)); - } - + retval = __copysign (do_sin (a, da, &cor), a); } /* else if (k < 0x400368fd) */ diff --git a/sysdeps/x86_64/fpu/libm-test-ulps b/sysdeps/x86_64/fpu/libm-test-ulps index 48e53f7ef2cf814d71d5d0c9f2bb907f594aa7ef..bbb8a4d0754dbe6665682cd8a7f51f7319a14014 100644 --- a/sysdeps/x86_64/fpu/libm-test-ulps +++ b/sysdeps/x86_64/fpu/libm-test-ulps @@ -1262,7 +1262,9 @@ ildouble: 1 ldouble: 1 Function: "cos": +double: 1 float128: 1 +idouble: 1 ifloat128: 1 ildouble: 1 ldouble: 1 @@ -2528,7 +2530,9 @@ Function: "pow_vlen8_avx2": float: 3 Function: "sin": +double: 1 float128: 1 +idouble: 1 ifloat128: 1 ildouble: 1 ldouble: 1 @@ -2578,7 +2582,9 @@ Function: "sin_vlen8_avx2": float: 1 Function: "sincos": +double: 1 float128: 1 +idouble: 1 ifloat128: 1 ildouble: 1 ldouble: 1 From patchwork Fri Mar 9 15:46:31 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Wilco Dijkstra X-Patchwork-Id: 883718 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (mailfrom) smtp.mailfrom=sourceware.org (client-ip=209.132.180.131; helo=sourceware.org; envelope-from=libc-alpha-return-90923-incoming=patchwork.ozlabs.org@sourceware.org; receiver=) Authentication-Results: ozlabs.org; dmarc=none (p=none dis=none) header.from=arm.com Authentication-Results: ozlabs.org; dkim=pass (1024-bit key; secure) header.d=sourceware.org header.i=@sourceware.org header.b="FmmoKWKl"; dkim-atps=neutral Received: from sourceware.org (server1.sourceware.org [209.132.180.131]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 3zyWvk4DYBz9sbw for ; Sat, 10 Mar 2018 02:46:42 +1100 (AEDT) DomainKey-Signature: a=rsa-sha1; c=nofws; d=sourceware.org; h=list-id 
:list-unsubscribe:list-subscribe:list-archive:list-post :list-help:sender:from:to:cc:subject:date:message-id :content-type:content-transfer-encoding:mime-version; q=dns; s= default; b=Q89SvnqeVwqVTKRvmuI7Q49/++jmqXvTPSujV9lE+89YACLePI1PG GkCKoXjUXUapaAdhScRxdmDgbv2saNjBujfJ1yx7ugKou/u2f9c1TcICDaXVVaOc xb/d5MGpX3QJt5Di1Buet77nGeEsWndI4V8EHRTzUzS4/h2bvb0e34= DKIM-Signature: v=1; a=rsa-sha1; c=relaxed; d=sourceware.org; h=list-id :list-unsubscribe:list-subscribe:list-archive:list-post :list-help:sender:from:to:cc:subject:date:message-id :content-type:content-transfer-encoding:mime-version; s=default; bh=buiO8DQzuLGXX7lYYhTtispD3yg=; b=FmmoKWKl239jV9srsjii5nS82Uk3 GIm/XvYh5G9aLwwx8RopZKn4B1ta1l5jYJ3uAhvBF89vfPyiLs1lNt/wNe2MbOdy 3WFfBsj8iZDqBhjFoIgttyoYnylbrTOExHFk2EaPuYPP+uwldpGGovZLauzMdbF3 +cjwKwOLPUpjK2c= Received: (qmail 11348 invoked by alias); 9 Mar 2018 15:46:36 -0000 Mailing-List: contact libc-alpha-help@sourceware.org; run by ezmlm Precedence: bulk List-Id: List-Unsubscribe: List-Subscribe: List-Archive: List-Post: List-Help: , Sender: libc-alpha-owner@sourceware.org Delivered-To: mailing list libc-alpha@sourceware.org Received: (qmail 11323 invoked by uid 89); 9 Mar 2018 15:46:35 -0000 Authentication-Results: sourceware.org; auth=none X-Virus-Found: No X-Spam-SWARE-Status: No, score=-24.5 required=5.0 tests=AWL, BAYES_00, GIT_PATCH_0, GIT_PATCH_1, GIT_PATCH_2, GIT_PATCH_3, KAM_ASCII_DIVIDERS, RCVD_IN_DNSWL_NONE, SPF_HELO_PASS, SPF_PASS autolearn=ham version=3.3.2 spammy=vi X-HELO: EUR01-DB5-obe.outbound.protection.outlook.com From: Wilco Dijkstra To: "libc-alpha@sourceware.org" CC: nd Subject: [PATCH 2/6] Remove slow paths from sin/cos Date: Fri, 9 Mar 2018 15:46:31 +0000 Message-ID: authentication-results: spf=none (sender IP is ) smtp.mailfrom=Wilco.Dijkstra@arm.com; x-ms-publictraffictype: Email x-microsoft-exchange-diagnostics: 1; DB6PR0801MB2085; 

This patch removes the second of the three range reduction cases and defers to
the final one.  Input values above 2^27 are extremely rare, so this case
doesn't need to be as optimized as smaller inputs.

ChangeLog:
2018-03-09  Wilco Dijkstra

	* sysdeps/ieee754/dbl-64/s_sin.c (reduce_sincos_2): Remove function.
	(do_sincos_2): Likewise.
	(__sin): Remove middle range reduction case.
	(__cos): Likewise.
	* sysdeps/ieee754/dbl-64/s_sincos.c (__sincos): Remove middle range
	reduction case.

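
The effect on the range dispatch can be sketched as follows.  This is an
illustration only, not the glibc code: the thresholds are the high-word
constants that appear in s_sin.c, but the enum, abs_high_word and classify are
hypothetical names, and the bit extraction uses memcpy rather than the
mynumber union.  It assumes IEEE 754 binary64 doubles.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

static_assert (sizeof (double) == sizeof (uint64_t),
               "binary64 double expected");

enum sincos_range
{
  RANGE_SMALL,     /* |x| < 2.426265: no range reduction needed.  */
  RANGE_MEDIUM,    /* |x| < 105414350: fast range reduction.  */
  RANGE_LARGE,     /* Finite |x| above that: final reduction case.  */
  RANGE_NONFINITE  /* Inf or NaN.  */
};

/* High 32 bits of X with the sign bit cleared.  */
static uint32_t
abs_high_word (double x)
{
  uint64_t bits;
  memcpy (&bits, &x, sizeof bits);
  return (uint32_t) (bits >> 32) & 0x7fffffff;
}

/* With the middle case (k < 0x42F00000) gone, every finite input at or
   above the 0x419921FB threshold goes to the final reduction case.  */
static enum sincos_range
classify (double x)
{
  uint32_t k = abs_high_word (x);
  if (k < 0x400368fd)
    return RANGE_SMALL;
  if (k < 0x419921FB)
    return RANGE_MEDIUM;
  if (k < 0x7ff00000)
    return RANGE_LARGE;
  return RANGE_NONFINITE;
}
```

An input such as 2^48 (high word 0x42F00000), which marked the upper bound of
the removed middle case, is classified as RANGE_LARGE here.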
diff --git a/sysdeps/ieee754/dbl-64/s_sin.c b/sysdeps/ieee754/dbl-64/s_sin.c index 9673a461ac592fc2bf3babc755dae336312e4c56..1f98e29278183d1fccd7c2b3fd467d6b16c245ed 100644 --- a/sysdeps/ieee754/dbl-64/s_sin.c +++ b/sysdeps/ieee754/dbl-64/s_sin.c @@ -362,80 +362,6 @@ do_sincos_1 (double a, double da, double x, int4 n, bool shift_quadrant) return retval; } -static inline int4 -__always_inline -reduce_sincos_2 (double x, double *a, double *da) -{ - mynumber v; - - double t = (x * hpinv + toint); - double xn = t - toint; - v.x = t; - double xn1 = (xn + 8.0e22) - 8.0e22; - double xn2 = xn - xn1; - double y = ((((x - xn1 * mp1) - xn1 * mp2) - xn2 * mp1) - xn2 * mp2); - int4 n = v.i[LOW_HALF] & 3; - double db = xn1 * pp3; - t = y - db; - db = (y - t) - db; - db = (db - xn2 * pp3) - xn * pp4; - double b = t + db; - db = (t - b) + db; - - *a = b; - *da = db; - - return n; -} - -/* Compute sin (A + DA). cos can be computed by passing SHIFT_QUADRANT as - true, which results in shifting the quadrant N clockwise. */ -static double -__always_inline -do_sincos_2 (double a, double da, double x, int4 n, bool shift_quadrant) -{ - double res, retval, cor, xx; - - double eps = 1.0e-24; - - int4 k = (n + shift_quadrant) & 3; - - switch (k) - { - case 2: - a = -a; - da = -da; - /* Fall through. */ - case 0: - xx = a * a; - if (xx < 0.01588) - { - /* Taylor series. */ - res = TAYLOR_SIN (xx, a, da, cor); - cor = 1.02 * cor + __copysign (eps, cor); - retval = (res == res + cor) ? res : bsloww (a, da, x, n); - } - else - { - res = do_sin (a, da, &cor); - cor = 1.035 * cor + __copysign (eps, cor); - retval = ((res == res + cor) ? __copysign (res, a) - : bsloww1 (a, da, x, n)); - } - break; - - case 1: - case 3: - res = do_cos (a, da, &cor); - cor = 1.025 * cor + __copysign (eps, cor); - retval = ((res == res + cor) ? ((n & 2) ? 
-res : res) - : bsloww2 (a, da, x, n)); - break; - } - - return retval; -} - /*******************************************************************/ /* An ultimate sin routine. Given an IEEE double machine number x */ /* it computes the correctly rounded (to nearest) value of sin(x) */ @@ -498,16 +424,7 @@ __sin (double x) retval = do_sincos_1 (a, da, x, n, false); } /* else if (k < 0x419921FB ) */ -/*---------------------105414350 <|x|< 281474976710656 --------------------*/ - else if (k < 0x42F00000) - { - double a, da; - - int4 n = reduce_sincos_2 (x, &a, &da); - retval = do_sincos_2 (a, da, x, n, false); - } /* else if (k < 0x42F00000 ) */ - -/* -----------------281474976710656 <|x| <2^1024----------------------------*/ +/* --------------------105414350 <|x| <2^1024------------------------------*/ else if (k < 0x7ff00000) retval = reduce_and_compute (x, false); @@ -584,15 +501,7 @@ __cos (double x) retval = do_sincos_1 (a, da, x, n, true); } /* else if (k < 0x419921FB ) */ - else if (k < 0x42F00000) - { - double a, da; - - int4 n = reduce_sincos_2 (x, &a, &da); - retval = do_sincos_2 (a, da, x, n, true); - } /* else if (k < 0x42F00000 ) */ - - /* 281474976710656 <|x| <2^1024 */ + /* 105414350 <|x| <2^1024 */ else if (k < 0x7ff00000) retval = reduce_and_compute (x, true); diff --git a/sysdeps/ieee754/dbl-64/s_sincos.c b/sysdeps/ieee754/dbl-64/s_sincos.c index e1977ea7e93c32cca5369677f23e68f8f797a9f4..a9af8ce526bfe78c06cfafa65de0815ec69585c5 100644 --- a/sysdeps/ieee754/dbl-64/s_sincos.c +++ b/sysdeps/ieee754/dbl-64/s_sincos.c @@ -86,16 +86,6 @@ __sincos (double x, double *sinx, double *cosx) return; } - if (k < 0x42F00000) - { - double a, da; - int4 n = reduce_sincos_2 (x, &a, &da); - - *sinx = do_sincos_2 (a, da, x, n, false); - *cosx = do_sincos_2 (a, da, x, n, true); - - return; - } if (k < 0x7ff00000) { reduce_and_compute_sincos (x, sinx, cosx); From patchwork Fri Mar 9 15:48:47 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 
Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Wilco Dijkstra X-Patchwork-Id: 883719 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (mailfrom) smtp.mailfrom=sourceware.org (client-ip=209.132.180.131; helo=sourceware.org; envelope-from=libc-alpha-return-90924-incoming=patchwork.ozlabs.org@sourceware.org; receiver=) Authentication-Results: ozlabs.org; dmarc=none (p=none dis=none) header.from=arm.com Authentication-Results: ozlabs.org; dkim=pass (1024-bit key; secure) header.d=sourceware.org header.i=@sourceware.org header.b="hd1x1DXC"; dkim-atps=neutral Received: from sourceware.org (server1.sourceware.org [209.132.180.131]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 3zyWyM6CDlz9sbw for ; Sat, 10 Mar 2018 02:48:59 +1100 (AEDT) DomainKey-Signature: a=rsa-sha1; c=nofws; d=sourceware.org; h=list-id :list-unsubscribe:list-subscribe:list-archive:list-post :list-help:sender:from:to:cc:subject:date:message-id :content-type:content-transfer-encoding:mime-version; q=dns; s= default; b=LH32fP5GjY5gx0Il602qFIqlNb6htsOvLSEq8/If5viCENGccbqWr YAlmlxSAe4LH381fiYoyIxqmnijGtlcodiwCi1kL3+Ghw/WNjWNol58sy+2fEi+U nvBZrGD7k4KJzY5JcvjH8ASP5kr86H/1BzuskbVzonOyfrtdCDlXqo= DKIM-Signature: v=1; a=rsa-sha1; c=relaxed; d=sourceware.org; h=list-id :list-unsubscribe:list-subscribe:list-archive:list-post :list-help:sender:from:to:cc:subject:date:message-id :content-type:content-transfer-encoding:mime-version; s=default; bh=lq4bQoQzCnFvMxcDo5BVH1OUXDE=; b=hd1x1DXCVfKFe0PP5/p4fpz/QMiF k4wW2JVMxTQbN1QOwsD493ZxlTEgNSX4Gt/X4xjQooYsB9ogBC0tS2D8UXSc1HtC a+sSye44m59+osDKZL/mu9QGMixV7KrhJm2iLK5/xLjIFSav33Rm1IOchhOZycOj 3pfdpDeV8QG21iM= Received: (qmail 16198 invoked by alias); 9 Mar 2018 15:48:54 -0000 Mailing-List: contact libc-alpha-help@sourceware.org; run by ezmlm Precedence: bulk List-Id: 
List-Unsubscribe: List-Subscribe: List-Archive: List-Post: List-Help: , Sender: libc-alpha-owner@sourceware.org Delivered-To: mailing list libc-alpha@sourceware.org Received: (qmail 16186 invoked by uid 89); 9 Mar 2018 15:48:53 -0000 Authentication-Results: sourceware.org; auth=none X-Virus-Found: No X-Spam-SWARE-Status: No, score=-24.5 required=5.0 tests=AWL, BAYES_00, GIT_PATCH_0, GIT_PATCH_1, GIT_PATCH_2, GIT_PATCH_3, KAM_ASCII_DIVIDERS, RCVD_IN_DNSWL_NONE, SPF_HELO_PASS, SPF_PASS autolearn=ham version=3.3.2 spammy= X-HELO: EUR02-VE1-obe.outbound.protection.outlook.com From: Wilco Dijkstra To: "libc-alpha@sourceware.org" CC: nd Subject: [PATCH 3/6] Remove slow paths from sin/cos Date: Fri, 9 Mar 2018 15:48:47 +0000 Message-ID: authentication-results: spf=none (sender IP is ) smtp.mailfrom=Wilco.Dijkstra@arm.com; x-ms-publictraffictype: Email x-microsoft-exchange-diagnostics: 1; DB6PR0801MB1878; 7:9EfTK3wsE08XnnATeYhyLLn6Ejgy00fBP1sUvZTjuG2WpOq/qsaeq0kBDGYBihhlhbDfZhCmmIFemDjKCYbTHmDaOuyFNbpMgegyFe5+hYfSYwad9ZnwNKkTjZmmQcOm3b2m9szWmS9qDehqwiXkHhm18YTuUIJYmoFprSJFKxIulUwwP1fkxBT6wx4AzGUyGrWzNK5ek5dc/lKQfvcWp3lbBsJRQtRauy77nVoA0bQGwxMIGaiXQw/xfy4i7R7b x-ms-exchange-antispam-srfa-diagnostics: SSOS; x-ms-office365-filtering-ht: Tenant x-ms-office365-filtering-correlation-id: 740aa337-df39-42c5-c6dc-08d585d53f57 x-microsoft-antispam: UriScan:; BCL:0; PCL:0; RULEID:(7020095)(4652020)(48565401081)(5600026)(4604075)(3008032)(2017052603328)(7153060)(7193020); SRVR:DB6PR0801MB1878; x-ms-traffictypediagnostic: DB6PR0801MB1878: nodisclaimer: True x-microsoft-antispam-prvs: x-exchange-antispam-report-test: UriScan:(180628864354917); x-exchange-antispam-report-cfa-test: BCL:0; PCL:0; 

This patch improves the accuracy of the range reduction.  When the input is
large (2^27) and very close to a multiple of PI/2, using 110 bits of PI is not
enough.  Improve range reduction accuracy to 136 bits.  As a result the special
checks for results close to zero can be removed.  The error of the polynomials
is at worst 0.55 ULP, so there is no reason for the slow functions, and they
can be removed.

ChangeLog:
2018-03-09  Wilco Dijkstra

	* sysdeps/ieee754/dbl-64/s_sin.c (reduce_sincos_1): Rename to
	reduce_sincos, improve accuracy to 136 bits.
	(do_sincos_1): Rename to do_sincos, remove fallbacks to slow functions.
	(__sin): Use improved reduction and simplified do_sincos calculation.
	(__cos): Likewise.
	* sysdeps/ieee754/dbl-64/s_sincos.c (__sincos): Likewise.

diff --git a/sysdeps/ieee754/dbl-64/s_sin.c b/sysdeps/ieee754/dbl-64/s_sin.c index 1f98e29278183d1fccd7c2b3fd467d6b16c245ed..b48b8627a7a801dfafecc920062aaaac51969a8a 100644 --- a/sysdeps/ieee754/dbl-64/s_sin.c +++ b/sysdeps/ieee754/dbl-64/s_sin.c @@ -295,9 +295,13 @@ reduce_and_compute (double x, bool shift_quadrant) return retval; } +/* Reduce range of x to within PI/2 with abs (x) < 105414350. The high part + is written to *a, the low part to *da. Range reduction is accurate to 136 + bits so that when x is large and *a very close to zero, all 53 bits of *a + are correct. */ static inline int4 __always_inline -reduce_sincos_1 (double x, double *a, double *da) +reduce_sincos (double x, double *a, double *da) { mynumber v; @@ -306,62 +310,45 @@ reduce_sincos_1 (double x, double *a, double *da) v.x = t; double y = (x - xn * mp1) - xn * mp2; int4 n = v.i[LOW_HALF] & 3; - double db = xn * mp3; - double b = y - db; - db = (y - b) - db; + + double b, db, t1, t2; + t1 = xn * pp3; + t2 = y - t1; + db = (y - t2) - t1; + + t1 = xn * pp4; + b = t2 - t1; + db += (t2 - b) - t1; *a = b; *da = db; - return n; } -/* Compute sin (A + DA). 
cos can be computed by passing SHIFT_QUADRANT as - true, which results in shifting the quadrant N clockwise. */ +/* Compute sin or cos (A + DA) for the given quadrant N. */ static double __always_inline -do_sincos_1 (double a, double da, double x, int4 n, bool shift_quadrant) +do_sincos (double a, double da, int4 n) { - double xx, retval, res, cor; - double eps = fabs (x) * 1.2e-30; + double retval, cor; - int k1 = (n + shift_quadrant) & 3; - switch (k1) - { /* quarter of unit circle */ - case 2: - a = -a; - da = -da; - /* Fall through. */ - case 0: - xx = a * a; + if (n & 1) + /* Max ULP is 0.513. */ + retval = do_cos (a, da, &cor); + else + { + double xx = a * a; + /* Max ULP is 0.501 if xx < 0.01588, otherwise ULP is 0.518. */ if (xx < 0.01588) - { - /* Taylor series. */ - res = TAYLOR_SIN (xx, a, da, cor); - cor = 1.02 * cor + __copysign (eps, cor); - retval = (res == res + cor) ? res : sloww (a, da, x, shift_quadrant); - } + retval = TAYLOR_SIN (xx, a, da, cor); else - { - res = do_sin (a, da, &cor); - cor = 1.035 * cor + __copysign (eps, cor); - retval = ((res == res + cor) ? __copysign (res, a) - : sloww1 (a, da, x, shift_quadrant)); - } - break; - - case 1: - case 3: - res = do_cos (a, da, &cor); - cor = 1.025 * cor + __copysign (eps, cor); - retval = ((res == res + cor) ? ((n & 2) ? -res : res) - : sloww2 (a, da, x, n)); - break; + retval = __copysign (do_sin (a, da, &cor), a); } - return retval; + return (n & 2) ? -retval : retval; } + /*******************************************************************/ /* An ultimate sin routine. 
Given an IEEE double machine number x */ /* it computes the correctly rounded (to nearest) value of sin(x) */ @@ -374,9 +361,9 @@ SECTION #endif __sin (double x) { - double xx, t, cor; + double xx, t, a, da, cor; mynumber u; - int4 k, m; + int4 k, m, n; double retval = 0; #ifndef IN_SINCOS @@ -419,9 +406,8 @@ __sin (double x) /*-------------------------- 2.426265<|x|< 105414350 ----------------------*/ else if (k < 0x419921FB) { - double a, da; - int4 n = reduce_sincos_1 (x, &a, &da); - retval = do_sincos_1 (a, da, x, n, false); + n = reduce_sincos (x, &a, &da); + retval = do_sincos (a, da, n); } /* else if (k < 0x419921FB ) */ /* --------------------105414350 <|x| <2^1024------------------------------*/ @@ -435,6 +421,11 @@ __sin (double x) __set_errno (EDOM); retval = x / x; } +#else + /* Disable warning... */ + n = 0, n = n; + a = 0, a = a; + da = 0, da = da; #endif return retval; @@ -456,7 +447,7 @@ __cos (double x) { double y, xx, cor, a, da; mynumber u; - int4 k, m; + int4 k, m, n; double retval = 0; @@ -496,9 +487,8 @@ __cos (double x) #ifndef IN_SINCOS else if (k < 0x419921FB) { /* 2.426265<|x|< 105414350 */ - double a, da; - int4 n = reduce_sincos_1 (x, &a, &da); - retval = do_sincos_1 (a, da, x, n, true); + n = reduce_sincos (x, &a, &da); + retval = do_sincos (a, da, n + 1); } /* else if (k < 0x419921FB ) */ /* 105414350 <|x| <2^1024 */ @@ -511,6 +501,9 @@ __cos (double x) __set_errno (EDOM); retval = x / x; /* |x| > 2^1024 */ } +#else + /* Disable warning... 
*/ + n = 0, n = n; #endif return retval; diff --git a/sysdeps/ieee754/dbl-64/s_sincos.c b/sysdeps/ieee754/dbl-64/s_sincos.c index a9af8ce526bfe78c06cfafa65de0815ec69585c5..4f032d2e42593ccde22169b374728386dd8fca8e 100644 --- a/sysdeps/ieee754/dbl-64/s_sincos.c +++ b/sysdeps/ieee754/dbl-64/s_sincos.c @@ -79,10 +79,10 @@ __sincos (double x, double *sinx, double *cosx) if (k < 0x419921FB) { double a, da; - int4 n = reduce_sincos_1 (x, &a, &da); + int4 n = reduce_sincos (x, &a, &da); - *sinx = do_sincos_1 (a, da, x, n, false); - *cosx = do_sincos_1 (a, da, x, n, true); + *sinx = do_sincos (a, da, n); + *cosx = do_sincos (a, da, n + 1); return; }
From patchwork Fri Mar 9 15:50:18 2018
X-Patchwork-Submitter: Wilco Dijkstra
X-Patchwork-Id: 883720
From: Wilco Dijkstra
To: "libc-alpha@sourceware.org"
CC: nd
Subject: [PATCH 4/6] Remove slow paths from sin/cos
Date: Fri, 9 Mar 2018 15:50:18 +0000

For huge inputs use the improved do_sincos function as well. Now no cases use the correction factor returned by do_sin, do_cos and TAYLOR_SIN, so remove it.

ChangeLog:
2018-03-09 Wilco Dijkstra
* sysdeps/ieee754/dbl-64/s_sin.c (TAYLOR_SIN): Remove cor parameter.
(do_cos): Remove corp parameter and calculations.
(do_sin): Likewise.
(do_sincos): Remove cor variable.
(__sin): Use do_sincos for huge inputs.
(__cos): Likewise. * sysdeps/ieee754/dbl-64/s_sincos.c (__sincos): Likewise. diff --git a/sysdeps/ieee754/dbl-64/s_sin.c b/sysdeps/ieee754/dbl-64/s_sin.c index 3b748821f6e5f817dc234ec7f96d951910299e21..5966282db60224528fea2bf55a05dd4120ab12a9 100644 --- a/sysdeps/ieee754/dbl-64/s_sin.c +++ b/sysdeps/ieee754/dbl-64/s_sin.c @@ -67,11 +67,10 @@ The constants s1, s2, s3, etc. are pre-computed values of 1/3!, 1/5! and so on. The result is returned to LHS and correction in COR. */ -#define TAYLOR_SIN(xx, a, da, cor) \ +#define TAYLOR_SIN(xx, a, da) \ ({ \ double t = ((POLYNOMIAL (xx) * (a) - 0.5 * (da)) * (xx) + (da)); \ double res = (a) + t; \ - (cor) = ((a) - res) + t; \ res; \ }) @@ -145,10 +144,10 @@ static double cslow2 (double x); /* Given a number partitioned into X and DX, this function computes the cosine of the number by combining the sin and cos of X (as computed by a variation of the Taylor series) with the values looked up from the sin/cos table to - get the result in RES and a correction value in COR. */ + get the result. */ static inline double __always_inline -do_cos (double x, double dx, double *corp) +do_cos (double x, double dx) { mynumber u; @@ -158,16 +157,13 @@ do_cos (double x, double dx, double *corp) u.x = big + fabs (x); x = fabs (x) - (u.x - big) + dx; - double xx, s, sn, ssn, c, cs, ccs, res, cor; + double xx, s, sn, ssn, c, cs, ccs, cor; xx = x * x; s = x + x * xx * (sn3 + xx * sn5); c = xx * (cs2 + xx * (cs4 + xx * cs6)); SINCOS_TABLE_LOOKUP (u, sn, ssn, cs, ccs); cor = (ccs - s * ssn - cs * c) - sn * s; - res = cs + cor; - cor = (cs - res) + cor; - *corp = cor; - return res; + return cs + cor; } /* A more precise variant of DO_COS. 
EPS is the adjustment to the correction @@ -207,10 +203,10 @@ do_cos_slow (double x, double dx, double eps, double *corp) /* Given a number partitioned into X and DX, this function computes the sine of the number by combining the sin and cos of X (as computed by a variation of the Taylor series) with the values looked up from the sin/cos table to get - the result in RES and a correction value in COR. */ + the result. */ static inline double __always_inline -do_sin (double x, double dx, double *corp) +do_sin (double x, double dx) { mynumber u; @@ -219,16 +215,13 @@ do_sin (double x, double dx, double *corp) u.x = big + fabs (x); x = fabs (x) - (u.x - big); - double xx, s, sn, ssn, c, cs, ccs, cor, res; + double xx, s, sn, ssn, c, cs, ccs, cor; xx = x * x; s = x + (dx + x * xx * (sn3 + xx * sn5)); c = x * dx + xx * (cs2 + xx * (cs4 + xx * cs6)); SINCOS_TABLE_LOOKUP (u, sn, ssn, cs, ccs); cor = (ssn + s * ccs - sn * c) + cs * s; - res = sn + cor; - cor = (sn - res) + cor; - *corp = cor; - return res; + return sn + cor; } /* A more precise variant of DO_SIN. EPS is the adjustment to the correction @@ -340,19 +333,19 @@ static double __always_inline do_sincos (double a, double da, int4 n) { - double retval, cor; + double retval; if (n & 1) /* Max ULP is 0.513. */ - retval = do_cos (a, da, &cor); + retval = do_cos (a, da); else { double xx = a * a; /* Max ULP is 0.501 if xx < 0.01588, otherwise ULP is 0.518. */ if (xx < 0.01588) - retval = TAYLOR_SIN (xx, a, da, cor); + retval = TAYLOR_SIN (xx, a, da); else - retval = __copysign (do_sin (a, da, &cor), a); + retval = __copysign (do_sin (a, da), a); } return (n & 2) ? -retval : retval; @@ -371,7 +364,7 @@ SECTION #endif __sin (double x) { - double xx, t, a, da, cor; + double xx, t, a, da; mynumber u; int4 k, m, n; double retval = 0; @@ -401,7 +394,7 @@ __sin (double x) else if (k < 0x3feb6000) { /* Max ULP is 0.548. 
*/ - retval = __copysign (do_sin (x, 0, &cor), x); + retval = __copysign (do_sin (x, 0), x); } /* else if (k < 0x3feb6000) */ /*----------------------- 0.855469 <|x|<2.426265 ----------------------*/ @@ -409,7 +402,7 @@ __sin (double x) { t = hp0 - fabs (x); /* Max ULP is 0.51. */ - retval = __copysign (do_cos (t, hp1, &cor), x); + retval = __copysign (do_cos (t, hp1), x); } /* else if (k < 0x400368fd) */ #ifndef IN_SINCOS @@ -422,8 +415,10 @@ __sin (double x) /* --------------------105414350 <|x| <2^1024------------------------------*/ else if (k < 0x7ff00000) - retval = reduce_and_compute (x, false); - + { + n = __branred (x, &a, &da); + retval = do_sincos (a, da, n); + } /*--------------------- |x| > 2^1024 ----------------------------------*/ else { @@ -455,7 +450,7 @@ SECTION #endif __cos (double x) { - double y, xx, cor, a, da; + double y, xx, a, da; mynumber u; int4 k, m, n; @@ -476,7 +471,7 @@ __cos (double x) else if (k < 0x3feb6000) { /* 2^-27 < |x| < 0.855469 */ /* Max ULP is 0.51. */ - retval = do_cos (x, 0, &cor); + retval = do_cos (x, 0); } /* else if (k < 0x3feb6000) */ else if (k < 0x400368fd) @@ -488,9 +483,9 @@ __cos (double x) /* Max ULP is 0.501 if xx < 0.01588 or 0.518 otherwise. Range reduction uses 106 bits here which is sufficient. 
*/ if (xx < 0.01588) - retval = TAYLOR_SIN (xx, a, da, cor); + retval = TAYLOR_SIN (xx, a, da); else - retval = __copysign (do_sin (a, da, &cor), a); + retval = __copysign (do_sin (a, da), a); } /* else if (k < 0x400368fd) */ @@ -503,7 +498,10 @@ __cos (double x) /* 105414350 <|x| <2^1024 */ else if (k < 0x7ff00000) - retval = reduce_and_compute (x, true); + { + n = __branred (x, &a, &da); + retval = do_sincos (a, da, n + 1); + } else { diff --git a/sysdeps/ieee754/dbl-64/s_sincos.c b/sysdeps/ieee754/dbl-64/s_sincos.c index 4f032d2e42593ccde22169b374728386dd8fca8e..4335ecbba3c9894e61c087ac970b392fa73abfab 100644 --- a/sysdeps/ieee754/dbl-64/s_sincos.c +++ b/sysdeps/ieee754/dbl-64/s_sincos.c @@ -28,37 +28,6 @@ #define IN_SINCOS 1 #include "s_sin.c" -/* Consolidated version of reduce_and_compute in s_sin.c that does range - reduction only once and computes sin and cos together. */ -static inline void -__always_inline -reduce_and_compute_sincos (double x, double *sinx, double *cosx) -{ - double a, da; - unsigned int n = __branred (x, &a, &da); - - n = n & 3; - - if (n == 1 || n == 2) - { - a = -a; - da = -da; - } - - if (n & 1) - { - double *temp = cosx; - cosx = sinx; - sinx = temp; - } - - if (a * a < 0.01588) - *sinx = bsloww (a, da, x, n); - else - *sinx = bsloww1 (a, da, x, n); - *cosx = bsloww2 (a, da, x, n); -} - void __sincos (double x, double *sinx, double *cosx) { @@ -88,8 +57,11 @@ __sincos (double x, double *sinx, double *cosx) } if (k < 0x7ff00000) { - reduce_and_compute_sincos (x, sinx, cosx); - return; + double a, da; + int4 n = __branred (x, &a, &da); + + *sinx = do_sincos (a, da, n);
From patchwork Fri Mar 9 15:51:26 2018
X-Patchwork-Submitter: Wilco Dijkstra
X-Patchwork-Id: 883722
From: Wilco Dijkstra
To: "libc-alpha@sourceware.org"
CC: nd
Subject: [PATCH 5/6] Remove slow paths from sin/cos
Date: Fri, 9 Mar 2018 15:51:26 +0000

Remove all unused functions.

ChangeLog:
2018-03-09 Wilco Dijkstra
* sysdeps/ieee754/dbl-64/s_sin.c (TAYLOR_SLOW): Remove.
(do_cos_slow): Likewise.
(do_sin_slow): Likewise.
(reduce_and_compute): Likewise.
(slow): Likewise.
(slow1): Likewise.
(slow2): Likewise.
(sloww): Likewise.
(sloww1): Likewise.
(sloww2): Likewise.
(bslow): Likewise.
(bslow1): Likewise.
(bslow2): Likewise.
(cslow2): Likewise. diff --git a/sysdeps/ieee754/dbl-64/s_sin.c b/sysdeps/ieee754/dbl-64/s_sin.c index 5966282db60224528fea2bf55a05dd4120ab12a9..b847a45cb3d3bd9d7c79d63e1577aec48eb0e8b1 100644 --- a/sysdeps/ieee754/dbl-64/s_sin.c +++ b/sysdeps/ieee754/dbl-64/s_sin.c @@ -22,22 +22,11 @@ /* */ /* FUNCTIONS: usin */ /* ucos */ -/* slow */ -/* slow1 */ -/* slow2 */ -/* sloww */ -/* sloww1 */ -/* sloww2 */ -/* bsloww */ -/* bsloww1 */ -/* bsloww2 */ -/* cslow2 */ /* FILES NEEDED: dla.h endian.h mpa.h mydefs.h usncs.h */ -/* branred.c sincos32.c dosincos.c mpa.c */ -/* sincos.tbl */ +/* branred.c sincos.tbl */ /* */ -/* An ultimate sin and routine. Given an IEEE double machine number x */ -/* it computes the correctly rounded (to nearest) value of sin(x) or cos(x) */ +/* An ultimate sin and cos routine. Given an IEEE double machine number x */ +/* it computes sin(x) or cos(x) with ~0.55 ULP. */ /* Assumption: Machine arithmetic operations are performed in */ /* round to nearest mode of IEEE 754 standard. */ /* */ @@ -74,29 +63,6 @@ res; \ }) -/* This is again a variation of the Taylor series expansion with the term - x^3/3! expanded into the following for better accuracy: - - bb * x ^ 3 + 3 * aa * x * x1 * x2 + aa * x1 ^ 3 + aa * x2 ^ 3 - - The correction term is dx and bb + aa = -1/3! 
- */ -#define TAYLOR_SLOW(x0, dx, cor) \ -({ \ - static const double th2_36 = 206158430208.0; /* 1.5*2**37 */ \ - double xx = (x0) * (x0); \ - double x1 = ((x0) + th2_36) - th2_36; \ - double y = aa * x1 * x1 * x1; \ - double r = (x0) + y; \ - double x2 = ((x0) - x1) + (dx); \ - double t = (((POLYNOMIAL2 (xx) + bb) * xx + 3.0 * aa * x1 * x2) \ - * (x0) + aa * x2 * x2 * x2 + (dx)); \ - t = (((x0) - r) + y) + t; \ - double res = r + t; \ - (cor) = (r - res) + t; \ - res; \ -}) - #define SINCOS_TABLE_LOOKUP(u, sn, ssn, cs, ccs) \ ({ \ int4 k = u.i[LOW_HALF] << 2; \ @@ -123,23 +89,7 @@ static const double cs4 = -4.16666666666664434524222570944589E-02, cs6 = 1.38888874007937613028114285595617E-03; -static const double t22 = 0x1.8p22; - -void __dubsin (double x, double dx, double w[]); -void __docos (double x, double dx, double w[]); -double __mpsin (double x, double dx, bool reduce_range); -double __mpcos (double x, double dx, bool reduce_range); -static double slow (double x); -static double slow1 (double x); -static double slow2 (double x); -static double sloww (double x, double dx, double orig, bool shift_quadrant); -static double sloww1 (double x, double dx, double orig, bool shift_quadrant); -static double sloww2 (double x, double dx, double orig, int n); -static double bsloww (double x, double dx, double orig, int n); -static double bsloww1 (double x, double dx, double orig, int n); -static double bsloww2 (double x, double dx, double orig, int n); int __branred (double x, double *a, double *aa); -static double cslow2 (double x); /* Given a number partitioned into X and DX, this function computes the cosine of the number by combining the sin and cos of X (as computed by a variation @@ -166,40 +116,6 @@ do_cos (double x, double dx) return cs + cor; } -/* A more precise variant of DO_COS. EPS is the adjustment to the correction - COR. 
*/ -static inline double -__always_inline -do_cos_slow (double x, double dx, double eps, double *corp) -{ - mynumber u; - - if (x <= 0) - dx = -dx; - - u.x = big + fabs (x); - x = fabs (x) - (u.x - big); - - double xx, y, x1, x2, e1, e2, res, cor; - double s, sn, ssn, c, cs, ccs; - xx = x * x; - s = x * xx * (sn3 + xx * sn5); - c = x * dx + xx * (cs2 + xx * (cs4 + xx * cs6)); - SINCOS_TABLE_LOOKUP (u, sn, ssn, cs, ccs); - x1 = (x + t22) - t22; - x2 = (x - x1) + dx; - e1 = (sn + t22) - t22; - e2 = (sn - e1) + ssn; - cor = (ccs - cs * c - e1 * x2 - e2 * x) - sn * s; - y = cs - e1 * x1; - cor = cor + ((cs - y) - e1 * x1); - res = y + cor; - cor = (y - res) + cor; - cor = 1.0005 * cor + __copysign (eps, cor); - *corp = cor; - return res; -} - /* Given a number partitioned into X and DX, this function computes the sine of the number by combining the sin and cos of X (as computed by a variation of the Taylor series) with the values looked up from the sin/cos table to get @@ -224,70 +140,6 @@ do_sin (double x, double dx) return sn + cor; } -/* A more precise variant of DO_SIN. EPS is the adjustment to the correction - COR. */ -static inline double -__always_inline -do_sin_slow (double x, double dx, double eps, double *corp) -{ - mynumber u; - - if (x <= 0) - dx = -dx; - u.x = big + fabs (x); - x = fabs (x) - (u.x - big); - - double xx, y, x1, x2, c1, c2, res, cor; - double s, sn, ssn, c, cs, ccs; - xx = x * x; - s = x * xx * (sn3 + xx * sn5); - c = xx * (cs2 + xx * (cs4 + xx * cs6)); - SINCOS_TABLE_LOOKUP (u, sn, ssn, cs, ccs); - x1 = (x + t22) - t22; - x2 = (x - x1) + dx; - c1 = (cs + t22) - t22; - c2 = (cs - c1) + ccs; - cor = (ssn + s * ccs + cs * s + c2 * x + c1 * x2 - sn * x * dx) - sn * c; - y = sn + c1 * x1; - cor = cor + ((sn - y) + c1 * x1); - res = y + cor; - cor = (y - res) + cor; - cor = 1.0005 * cor + __copysign (eps, cor); - *corp = cor; - return res; -} - -/* Reduce range of X and compute sin of a + da. 
When SHIFT_QUADRANT is true, - the routine returns the cosine of a + da by rotating the quadrant once and - computing the sine of the result. */ -static inline double -__always_inline -reduce_and_compute (double x, bool shift_quadrant) -{ - double retval = 0, a, da; - unsigned int n = __branred (x, &a, &da); - int4 k = (n + shift_quadrant) % 4; - switch (k) - { - case 2: - a = -a; - da = -da; - /* Fall through. */ - case 0: - if (a * a < 0.01588) - retval = bsloww (a, da, x, n); - else - retval = bsloww1 (a, da, x, n); - break; - - case 1: - case 3: - retval = bsloww2 (a, da, x, n); - break; - } - return retval; -} - static inline int4 __always_inline reduce_sincos (double x, double *a, double *da) @@ -517,299 +369,6 @@ __cos (double x) return retval; } -/************************************************************************/ -/* Routine compute sin(x) for 2^-26 < |x|< 0.25 by Taylor with more */ -/* precision and if still doesn't accurate enough by mpsin or dubsin */ -/************************************************************************/ - -static inline double -__always_inline -slow (double x) -{ - double res, cor, w[2]; - res = TAYLOR_SLOW (x, 0, cor); - if (res == res + 1.0007 * cor) - return res; - - __dubsin (fabs (x), 0, w); - if (w[0] == w[0] + 1.000000001 * w[1]) - return __copysign (w[0], x); - - return __copysign (__mpsin (fabs (x), 0, false), x); -} - -/*******************************************************************************/ -/* Routine compute sin(x) for 0.25<|x|< 0.855469 by __sincostab.tbl and Taylor */ -/* and if result still doesn't accurate enough by mpsin or dubsin */ -/*******************************************************************************/ - -static inline double -__always_inline -slow1 (double x) -{ - double w[2], cor, res; - - res = do_sin_slow (x, 0, 0, &cor); - if (res == res + cor) - return res; - - __dubsin (fabs (x), 0, w); - if (w[0] == w[0] + 1.000000005 * w[1]) - return w[0]; - - return __mpsin (fabs (x), 0, 
false); -} - -/**************************************************************************/ -/* Routine compute sin(x) for 0.855469 <|x|<2.426265 by __sincostab.tbl */ -/* and if result still doesn't accurate enough by mpsin or dubsin */ -/**************************************************************************/ -static inline double -__always_inline -slow2 (double x) -{ - double w[2], y, y1, y2, cor, res; - - double t = hp0 - fabs (x); - res = do_cos_slow (t, hp1, 0, &cor); - if (res == res + cor) - return res; - - y = fabs (x) - hp0; - y1 = y - hp1; - y2 = (y - y1) - hp1; - __docos (y1, y2, w); - if (w[0] == w[0] + 1.000000005 * w[1]) - return w[0]; - - return __mpsin (fabs (x), 0, false); -} - -/* Compute sin(x + dx) where X is small enough to use Taylor series around zero - and (x + dx) in the first or third quarter of the unit circle. ORIG is the - original value of X for computing error of the result. If the result is not - accurate enough, the routine calls mpsin or dubsin. SHIFT_QUADRANT rotates - the unit circle by 1 to compute the cosine instead of sine. */ -static inline double -__always_inline -sloww (double x, double dx, double orig, bool shift_quadrant) -{ - double y, t, res, cor, w[2], a, da, xn; - mynumber v; - int4 n; - res = TAYLOR_SLOW (x, dx, cor); - - double eps = fabs (orig) * 3.1e-30; - - cor = 1.0005 * cor + __copysign (eps, cor); - - if (res == res + cor) - return res; - - a = fabs (x); - da = (x > 0) ? dx : -dx; - __dubsin (a, da, w); - eps = fabs (orig) * 1.1e-30; - cor = 1.000000001 * w[1] + __copysign (eps, w[1]); - - if (w[0] == w[0] + cor) - return __copysign (w[0], x); - - t = (orig * hpinv + toint); - xn = t - toint; - v.x = t; - y = (orig - xn * mp1) - xn * mp2; - n = (v.i[LOW_HALF] + shift_quadrant) & 3; - da = xn * pp3; - t = y - da; - da = (y - t) - da; - y = xn * pp4; - a = t - y; - da = ((t - a) - y) + da; - - if (n & 2) - { - a = -a; - da = -da; - } - x = fabs (a); - dx = (a > 0) ? 
da : -da; - __dubsin (x, dx, w); - eps = fabs (orig) * 1.1e-40; - cor = 1.000000001 * w[1] + __copysign (eps, w[1]); - - if (w[0] == w[0] + cor) - return __copysign (w[0], a); - - return shift_quadrant ? __mpcos (orig, 0, true) : __mpsin (orig, 0, true); -} - -/* Compute sin(x + dx) where X is in the first or third quarter of the unit - circle. ORIG is the original value of X for computing error of the result. - If the result is not accurate enough, the routine calls mpsin or dubsin. - SHIFT_QUADRANT rotates the unit circle by 1 to compute the cosine instead of - sine. */ -static inline double -__always_inline -sloww1 (double x, double dx, double orig, bool shift_quadrant) -{ - double w[2], cor, res; - - res = do_sin_slow (x, dx, 3.1e-30 * fabs (orig), &cor); - - if (res == res + cor) - return __copysign (res, x); - - dx = (x > 0 ? dx : -dx); - __dubsin (fabs (x), dx, w); - - double eps = 1.1e-30 * fabs (orig); - cor = 1.000000005 * w[1] + __copysign (eps, w[1]); - - if (w[0] == w[0] + cor) - return __copysign (w[0], x); - - return shift_quadrant ? __mpcos (orig, 0, true) : __mpsin (orig, 0, true); -} - -/***************************************************************************/ -/* Routine compute sin(x+dx) (Double-Length number) where x in second or */ -/* fourth quarter of unit circle.Routine receive also the original value */ -/* and quarter(n= 1or 3)of x for computing error of result.And if result not*/ -/* accurate enough routine calls mpsin1 or dubsin */ -/***************************************************************************/ - -static inline double -__always_inline -sloww2 (double x, double dx, double orig, int n) -{ - double w[2], cor, res; - - res = do_cos_slow (x, dx, 3.1e-30 * fabs (orig), &cor); - - if (res == res + cor) - return (n & 2) ? -res : res; - - dx = x > 0 ? dx : -dx; - __docos (fabs (x), dx, w); - - double eps = 1.1e-30 * fabs (orig); - cor = 1.000000005 * w[1] + __copysign (eps, w[1]); - - if (w[0] == w[0] + cor) - return (n & 2) ? 
-w[0] : w[0]; - - return (n & 1) ? __mpsin (orig, 0, true) : __mpcos (orig, 0, true); -} - -/***************************************************************************/ -/* Routine compute sin(x+dx) or cos(x+dx) (Double-Length number) where x */ -/* is small enough to use Taylor series around zero and (x+dx) */ -/* in first or third quarter of unit circle.Routine receive also */ -/* (right argument) the original value of x for computing error of */ -/* result.And if result not accurate enough routine calls other routines */ -/***************************************************************************/ - -static inline double -__always_inline -bsloww (double x, double dx, double orig, int n) -{ - double res, cor, w[2], a, da; - - res = TAYLOR_SLOW (x, dx, cor); - cor = 1.0005 * cor + __copysign (1.1e-24, cor); - if (res == res + cor) - return res; - - a = fabs (x); - da = (x > 0) ? dx : -dx; - __dubsin (a, da, w); - cor = 1.000000001 * w[1] + __copysign (1.1e-24, w[1]); - - if (w[0] == w[0] + cor) - return __copysign (w[0], x); - - return (n & 1) ? __mpcos (orig, 0, true) : __mpsin (orig, 0, true); -} - -/***************************************************************************/ -/* Routine compute sin(x+dx) or cos(x+dx) (Double-Length number) where x */ -/* in first or third quarter of unit circle.Routine receive also */ -/* (right argument) the original value of x for computing error of result.*/ -/* And if result not accurate enough routine calls other routines */ -/***************************************************************************/ - -static inline double -__always_inline -bsloww1 (double x, double dx, double orig, int n) -{ - double w[2], cor, res; - - res = do_sin_slow (x, dx, 1.1e-24, &cor); - if (res == res + cor) - return (x > 0) ? res : -res; - - dx = (x > 0) ? dx : -dx; - __dubsin (fabs (x), dx, w); - - cor = 1.000000005 * w[1] + __copysign (1.1e-24, w[1]); - - if (w[0] == w[0] + cor) - return __copysign (w[0], x); - - return (n & 1) ? 
__mpcos (orig, 0, true) : __mpsin (orig, 0, true);
-}
-
-/***************************************************************************/
-/* Routine compute sin(x+dx) or cos(x+dx) (Double-Length number) where x */
-/* in second or fourth quarter of unit circle.Routine receive also the */
-/* original value and quarter(n= 1or 3)of x for computing error of result. */
-/* And if result not accurate enough routine calls other routines */
-/***************************************************************************/
-
-static inline double
-__always_inline
-bsloww2 (double x, double dx, double orig, int n)
-{
-  double w[2], cor, res;
-
-  res = do_cos_slow (x, dx, 1.1e-24, &cor);
-  if (res == res + cor)
-    return (n & 2) ? -res : res;
-
-  dx = (x > 0) ? dx : -dx;
-  __docos (fabs (x), dx, w);
-
-  cor = 1.000000005 * w[1] + __copysign (1.1e-24, w[1]);
-
-  if (w[0] == w[0] + cor)
-    return (n & 2) ? -w[0] : w[0];
-
-  return (n & 1) ? __mpsin (orig, 0, true) : __mpcos (orig, 0, true);
-}
-
-/************************************************************************/
-/* Routine compute cos(x) for 2^-27 < |x|< 0.25 by Taylor with more */
-/* precision and if still doesn't accurate enough by mpcos or docos */
-/************************************************************************/
-
-static inline double
-__always_inline
-cslow2 (double x)
-{
-  double w[2], cor, res;
-
-  res = do_cos_slow (x, 0, 0, &cor);
-  if (res == res + cor)
-    return res;
-
-  __docos (fabs (x), 0, w);
-  if (w[0] == w[0] + 1.000000005 * w[1])
-    return w[0];
-
-  return __mpcos (x, 0, false);
-}
-
 #ifndef __cos
 libm_alias_double (__cos, cos)
 #endif

From patchwork Fri Mar 9 15:52:55 2018
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Wilco Dijkstra
X-Patchwork-Id: 883724
From: Wilco Dijkstra
To: "libc-alpha@sourceware.org"
CC: nd
Subject: [PATCH 6/6] Remove slow paths from sin/cos
Date: Fri, 9 Mar 2018 15:52:55 +0000

Restructure the sincos implementation: rather than rely on odd partial
inlining of preprocessed portions of sin and cos, explicitly write out the
cases.

ChangeLog:
2018-03-09  Wilco Dijkstra

	* sysdeps/ieee754/dbl-64/s_sin.c (__sin): Clean up ifdefs.
	(__cos): Likewise.
	* sysdeps/ieee754/dbl-64/s_sincos.c (__sincos): Reimplement using the
	same logic as sin and cos.

diff --git a/sysdeps/ieee754/dbl-64/s_sin.c b/sysdeps/ieee754/dbl-64/s_sin.c
index 91b0abc9c3ac21dae0e673576940ef97bfd20c23..8f804a42e6d94652a62f81b2e0b053135cf9f03a 100644
--- a/sysdeps/ieee754/dbl-64/s_sin.c
+++ b/sysdeps/ieee754/dbl-64/s_sin.c
@@ -208,12 +208,9 @@ do_sincos (double a, double da, int4 n)
 /* An ultimate sin routine. Given an IEEE double machine number x */
 /* it computes the correctly rounded (to nearest) value of sin(x) */
 /*******************************************************************/
-#ifdef IN_SINCOS
-static double
-#else
+#ifndef IN_SINCOS
 double
 SECTION
-#endif
 __sin (double x)
 {
   double xx, t, a, da;
@@ -221,9 +218,7 @@ __sin (double x)
   int4 k, m, n;
   double retval = 0;

-#ifndef IN_SINCOS
   SET_RESTORE_ROUND_53BIT (FE_TONEAREST);
-#endif

   u.x = x;
   m = u.i[HIGH_HALF];
@@ -257,7 +252,6 @@ __sin (double x)
       retval = __copysign (do_cos (t, hp1), x);
     } /* else if (k < 0x400368fd) */

-#ifndef IN_SINCOS
 /*-------------------------- 2.426265<|x|< 105414350 ----------------------*/
   else if (k < 0x419921FB)
     {
@@ -278,12 +272,6 @@ __sin (double x)
       __set_errno (EDOM);
       retval = x / x;
     }
-#else
-  /* Disable warning... */
-  n = 0, n = n;
-  a = 0, a = a;
-  da = 0, da = da;
-#endif

   return retval;
 }
@@ -294,12 +282,8 @@ __sin (double x)
 /* it computes the correctly rounded (to nearest) value of cos(x) */
 /*******************************************************************/

-#ifdef IN_SINCOS
-static double
-#else
 double
 SECTION
-#endif
 __cos (double x)
 {
   double y, xx, a, da;
@@ -308,9 +292,7 @@ __cos (double x)

   double retval = 0;

-#ifndef IN_SINCOS
   SET_RESTORE_ROUND_53BIT (FE_TONEAREST);
-#endif

   u.x = x;
   m = u.i[HIGH_HALF];
@@ -340,8 +322,6 @@ __cos (double x)
       retval = __copysign (do_sin (a, da), a);

     } /* else if (k < 0x400368fd) */
-
-#ifndef IN_SINCOS
   else if (k < 0x419921FB)
     { /* 2.426265<|x|< 105414350 */
       n = reduce_sincos (x, &a, &da);
@@ -361,10 +341,6 @@ __cos (double x)
       __set_errno (EDOM);
       retval = x / x; /* |x| > 2^1024 */
     }
-#else
-  /* Disable warning... */
-  n = 0, n = n;
-#endif

   return retval;
 }
@@ -375,3 +351,5 @@ libm_alias_double (__cos, cos)
 #ifndef __sin
 libm_alias_double (__sin, sin)
 #endif
+
+#endif
diff --git a/sysdeps/ieee754/dbl-64/s_sincos.c b/sysdeps/ieee754/dbl-64/s_sincos.c
index 4335ecbba3c9894e61c087ac970b392fa73abfab..c04972707b284e37b15e82933a00250cda959985 100644
--- a/sysdeps/ieee754/dbl-64/s_sincos.c
+++ b/sysdeps/ieee754/dbl-64/s_sincos.c
@@ -23,9 +23,7 @@
 #include
 #include

-#define __sin __sin_local
-#define __cos __cos_local
-#define IN_SINCOS 1
+#define IN_SINCOS
 #include "s_sin.c"

 void
@@ -37,31 +35,79 @@ __sincos (double x, double *sinx, double *cosx)
   SET_RESTORE_ROUND_53BIT (FE_TONEAREST);

   u.x = x;
-  k = 0x7fffffff & u.i[HIGH_HALF];
+  k = u.i[HIGH_HALF] & 0x7fffffff;

   if (k < 0x400368fd)
     {
-      *sinx = __sin_local (x);
-      *cosx = __cos_local (x);
-      return;
-    }
-  if (k < 0x419921FB)
-    {
-      double a, da;
-      int4 n = reduce_sincos (x, &a, &da);
-
-      *sinx = do_sincos (a, da, n);
-      *cosx = do_sincos (a, da, n + 1);
+      double t, xx, a, da, y;
+      /* |x| < 2^-27 => cos (x) = 1, sin (x) = x. */
+      if (k < 0x3e400000)
+        {
+          if (k < 0x3e500000)
+            math_check_force_underflow (x);
+          *sinx = x;
+          *cosx = 1.0;
+          return;
+        }
+      /* |x| < 0.855469. */
+      else if (k < 0x3feb6000)
+        {
+          /* |x| < 0.25. */
+          if (k < 0x3fd00000)
+            {
+              xx = x * x;
+              t = POLYNOMIAL (xx) * (xx * x);
+              *sinx = x + t;
+            }
+          else
+            *sinx = __copysign (do_sin (x, 0), x);
+          *cosx = do_cos (x, 0);
+          return;
+        }
+      /* |x| < 2.426265. */
+      y = hp0 - fabs (x);
+      a = y + hp1;
+      da = (y - a) + hp1;
+      *sinx = __copysign (do_cos (a, da), x);
+      xx = a * a;
+      if (xx < 0.01588)
+        *cosx = TAYLOR_SIN (xx, a, da);
+      else
+        *cosx = __copysign (do_sin (a, da), a);
       return;
     }
+  /* |x| < 2^1024. */
   if (k < 0x7ff00000)
     {
-      double a, da;
-      int4 n = __branred (x, &a, &da);
+      double a, da, xx;
+      unsigned int n;

-      *sinx = do_sincos (a, da, n);
-      *cosx = do_sincos (a, da, n + 1);
+      /* If |x| < 105414350 use simple range reduction. */
+      n = k < 0x419921FB ? reduce_sincos (x, &a, &da) : __branred (x, &a, &da);
+      n = n & 3;
+
+      if (n == 1 || n == 2)
+        {
+          a = -a;
+          da = -da;
+        }
+
+      if (n & 1)
+        {
+          double *temp = cosx;
+          cosx = sinx;
+          sinx = temp;
+        }
+
+      xx = a * a;
+      if (xx < 0.01588)
+        *sinx = TAYLOR_SIN (xx, a, da);
+      else
+        *sinx = __copysign (do_sin (a, da), a);
+      xx = do_cos (a, da);
+      *cosx = (n & 2) ? -xx : xx;
+      return;
     }
   if (isinf (x))