From patchwork Fri Nov 3 12:12:19 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Joe Ramsay
X-Patchwork-Id: 1858882
From: Joe Ramsay
CC: Joe Ramsay
Subject: [PATCH 1/6] aarch64: Add vector implementations of asin routines
Date: Fri, 3 Nov 2023 12:12:19 +0000
Message-ID: <20231103121224.16835-1-Joe.Ramsay@arm.com>
X-Mailer: git-send-email 2.27.0
List-Id: Libc-alpha mailing list

---
Thanks,
Joe
 sysdeps/aarch64/fpu/Makefile                  |   3 +-
 sysdeps/aarch64/fpu/Versions                  |   4 +
 sysdeps/aarch64/fpu/asin_advsimd.c            | 113 ++++++++++++++++++
 sysdeps/aarch64/fpu/asin_sve.c                |  86 +++++++++++++
 sysdeps/aarch64/fpu/asinf_advsimd.c           | 104 ++++++++++++++++
 sysdeps/aarch64/fpu/asinf_sve.c               |  78 ++++++++++++
 sysdeps/aarch64/fpu/bits/math-vector.h        |   4 +
 .../fpu/test-double-advsimd-wrappers.c        |   1 +
 .../aarch64/fpu/test-double-sve-wrappers.c    |   1 +
 .../aarch64/fpu/test-float-advsimd-wrappers.c |   1 +
 sysdeps/aarch64/fpu/test-float-sve-wrappers.c |   1 +
 sysdeps/aarch64/libm-test-ulps                |   8 ++
 .../unix/sysv/linux/aarch64/libmvec.abilist   |   4 +
 13 files changed, 407 insertions(+), 1 deletion(-)
 create mode 100644 sysdeps/aarch64/fpu/asin_advsimd.c
 create mode 100644 sysdeps/aarch64/fpu/asin_sve.c
 create mode 100644 sysdeps/aarch64/fpu/asinf_advsimd.c
 create mode 100644 sysdeps/aarch64/fpu/asinf_sve.c

Reviewed-by: Szabolcs Nagy

diff --git a/sysdeps/aarch64/fpu/Makefile b/sysdeps/aarch64/fpu/Makefile
index 1f1ac2a2b8..d7c0bd2ed5 100644
--- a/sysdeps/aarch64/fpu/Makefile
+++ b/sysdeps/aarch64/fpu/Makefile
@@ -1,4 +1,5 @@
-libmvec-supported-funcs = cos \
+libmvec-supported-funcs = asin \
+                          cos \
                           exp \
                           exp10 \
                           exp2 \
diff --git a/sysdeps/aarch64/fpu/Versions b/sysdeps/aarch64/fpu/Versions
index eb5ad50017..0f365a1e2e 100644
--- a/sysdeps/aarch64/fpu/Versions
+++ b/sysdeps/aarch64/fpu/Versions
@@ -18,6 +18,10 @@ libmvec {
     _ZGVsMxv_sinf;
   }
   GLIBC_2.39 {
+    _ZGVnN4v_asinf;
+    _ZGVnN2v_asin;
+    _ZGVsMxv_asinf;
+    _ZGVsMxv_asin;
     _ZGVnN4v_exp10f;
     _ZGVnN2v_exp10;
     _ZGVsMxv_exp10f;
diff --git a/sysdeps/aarch64/fpu/asin_advsimd.c b/sysdeps/aarch64/fpu/asin_advsimd.c
new file mode 100644
index 0000000000..d2adbc0d87
--- /dev/null
+++ b/sysdeps/aarch64/fpu/asin_advsimd.c
@@ -0,0 +1,113 @@
+/* Double-precision AdvSIMD inverse sin
+
+   Copyright (C) 2023 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   .  */
+
+#include "v_math.h"
+#include "poly_advsimd_f64.h"
+
+static const struct data
+{
+  float64x2_t poly[12];
+  float64x2_t pi_over_2;
+  uint64x2_t abs_mask;
+} data = {
+  /* Polynomial approximation of (asin(sqrt(x)) - sqrt(x)) / (x * sqrt(x))
+     on [ 0x1p-106, 0x1p-2 ], relative error: 0x1.c3d8e169p-57.  */
+  .poly = { V2 (0x1.555555555554ep-3), V2 (0x1.3333333337233p-4),
+            V2 (0x1.6db6db67f6d9fp-5), V2 (0x1.f1c71fbd29fbbp-6),
+            V2 (0x1.6e8b264d467d6p-6), V2 (0x1.1c5997c357e9dp-6),
+            V2 (0x1.c86a22cd9389dp-7), V2 (0x1.856073c22ebbep-7),
+            V2 (0x1.fd1151acb6bedp-8), V2 (0x1.087182f799c1dp-6),
+            V2 (-0x1.6602748120927p-7), V2 (0x1.cfa0dd1f9478p-6), },
+  .pi_over_2 = V2 (0x1.921fb54442d18p+0),
+  .abs_mask = V2 (0x7fffffffffffffff),
+};
+
+#define AllMask v_u64 (0xffffffffffffffff)
+#define One (0x3ff0000000000000)
+#define Small (0x3e50000000000000) /* 2^-26.  */
+
+#if WANT_SIMD_EXCEPT
+static float64x2_t VPCS_ATTR NOINLINE
+special_case (float64x2_t x, float64x2_t y, uint64x2_t special)
+{
+  return v_call_f64 (asin, x, y, special);
+}
+#endif
+
+/* Double-precision implementation of vector asin(x).
+
+   For |x| < Small, approximate asin(x) by x. Small = 2^-26 for correct
+   rounding. If WANT_SIMD_EXCEPT = 0, Small = 0 and we proceed with the
+   following approximation.
+
+   For |x| in [Small, 0.5], use an order 11 polynomial P such that the final
+   approximation is an odd polynomial: asin(x) ~ x + x^3 P(x^2).
+
+   The largest observed error in this region is 1.01 ulps,
+   _ZGVnN2v_asin (0x1.da9735b5a9277p-2) got 0x1.ed78525a927efp-2
+                                       want 0x1.ed78525a927eep-2.
+
+   For |x| in [0.5, 1.0], use the same approximation with a change of variable
+
+     asin(x) = pi/2 - 2 * (y + y * z * P(z)), with z = (1-x)/2 and y = sqrt(z).
+
+   The largest observed error in this region is 2.69 ulps,
+   _ZGVnN2v_asin (0x1.044ac9819f573p-1) got 0x1.110d7e85fdd5p-1
+                                       want 0x1.110d7e85fdd53p-1.  */
+float64x2_t VPCS_ATTR V_NAME_D1 (asin) (float64x2_t x)
+{
+  const struct data *d = ptr_barrier (&data);
+
+  float64x2_t ax = vabsq_f64 (x);
+
+#if WANT_SIMD_EXCEPT
+  /* Special values need to be computed with scalar fallbacks so
+     that appropriate exceptions are raised.  */
+  uint64x2_t special
+      = vcgtq_u64 (vsubq_u64 (vreinterpretq_u64_f64 (ax), v_u64 (Small)),
+                   v_u64 (One - Small));
+  if (__glibc_unlikely (v_any_u64 (special)))
+    return special_case (x, x, AllMask);
+#endif
+
+  uint64x2_t a_lt_half = vcltq_f64 (ax, v_f64 (0.5));
+
+  /* Evaluate polynomial Q(x) = y + y * z * P(z) with
+     z = x ^ 2 and y = |x|            , if |x| < 0.5
+     z = (1 - |x|) / 2 and y = sqrt(z), if |x| >= 0.5.  */
+  float64x2_t z2 = vbslq_f64 (a_lt_half, vmulq_f64 (x, x),
+                              vfmsq_n_f64 (v_f64 (0.5), ax, 0.5));
+  float64x2_t z = vbslq_f64 (a_lt_half, ax, vsqrtq_f64 (z2));
+
+  /* Use a single polynomial approximation P for both intervals.  */
+  float64x2_t z4 = vmulq_f64 (z2, z2);
+  float64x2_t z8 = vmulq_f64 (z4, z4);
+  float64x2_t z16 = vmulq_f64 (z8, z8);
+  float64x2_t p = v_estrin_11_f64 (z2, z4, z8, z16, d->poly);
+
+  /* Finalize polynomial: z + z * z2 * P(z2).  */
+  p = vfmaq_f64 (z, vmulq_f64 (z, z2), p);
+
+  /* asin(|x|) = Q(|x|)         , for |x| < 0.5
+               = pi/2 - 2 Q(|x|), for |x| >= 0.5.  */
+  float64x2_t y = vbslq_f64 (a_lt_half, p, vfmsq_n_f64 (d->pi_over_2, p, 2.0));
+
+  /* Copy sign.  */
+  return vbslq_f64 (d->abs_mask, y, x);
+}
diff --git a/sysdeps/aarch64/fpu/asin_sve.c b/sysdeps/aarch64/fpu/asin_sve.c
new file mode 100644
index 0000000000..fa04d7fca6
--- /dev/null
+++ b/sysdeps/aarch64/fpu/asin_sve.c
@@ -0,0 +1,86 @@
+/* Double-precision SVE inverse sin
+
+   Copyright (C) 2023 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   .  */
+
+#include "sv_math.h"
+#include "poly_sve_f64.h"
+
+static const struct data
+{
+  float64_t poly[12];
+  float64_t pi_over_2f;
+} data = {
+  /* Polynomial approximation of (asin(sqrt(x)) - sqrt(x)) / (x * sqrt(x))
+     on [ 0x1p-106, 0x1p-2 ], relative error: 0x1.c3d8e169p-57.  */
+  .poly = { 0x1.555555555554ep-3, 0x1.3333333337233p-4,
+            0x1.6db6db67f6d9fp-5, 0x1.f1c71fbd29fbbp-6,
+            0x1.6e8b264d467d6p-6, 0x1.1c5997c357e9dp-6,
+            0x1.c86a22cd9389dp-7, 0x1.856073c22ebbep-7,
+            0x1.fd1151acb6bedp-8, 0x1.087182f799c1dp-6,
+            -0x1.6602748120927p-7, 0x1.cfa0dd1f9478p-6, },
+  .pi_over_2f = 0x1.921fb54442d18p+0,
+};
+
+#define P(i) sv_f64 (d->poly[i])
+
+/* Double-precision SVE implementation of vector asin(x).
+
+   For |x| in [0, 0.5], use an order 11 polynomial P such that the final
+   approximation is an odd polynomial: asin(x) ~ x + x^3 P(x^2).
+
+   The largest observed error in this region is 0.52 ulps,
+   _ZGVsMxv_asin(0x1.d95ae04998b6cp-2) got 0x1.ec13757305f27p-2
+                                      want 0x1.ec13757305f26p-2.
+
+   For |x| in [0.5, 1.0], use the same approximation with a change of variable
+
+     asin(x) = pi/2 - 2 * (y + y * z * P(z)), with z = (1-x)/2 and y = sqrt(z).
+
+   The largest observed error in this region is 2.69 ulps,
+   _ZGVsMxv_asin(0x1.044ac9819f573p-1) got 0x1.110d7e85fdd5p-1
+                                      want 0x1.110d7e85fdd53p-1.  */
+svfloat64_t SV_NAME_D1 (asin) (svfloat64_t x, const svbool_t pg)
+{
+  const struct data *d = ptr_barrier (&data);
+
+  svuint64_t sign = svand_x (pg, svreinterpret_u64 (x), 0x8000000000000000);
+  svfloat64_t ax = svabs_x (pg, x);
+  svbool_t a_ge_half = svacge (pg, x, 0.5);
+
+  /* Evaluate polynomial Q(x) = y + y * z * P(z) with
+     z = x ^ 2 and y = |x|            , if |x| < 0.5
+     z = (1 - |x|) / 2 and y = sqrt(z), if |x| >= 0.5.  */
+  svfloat64_t z2 = svsel (a_ge_half, svmls_x (pg, sv_f64 (0.5), ax, 0.5),
+                          svmul_x (pg, x, x));
+  svfloat64_t z = svsqrt_m (ax, a_ge_half, z2);
+
+  /* Use a single polynomial approximation P for both intervals.  */
+  svfloat64_t z4 = svmul_x (pg, z2, z2);
+  svfloat64_t z8 = svmul_x (pg, z4, z4);
+  svfloat64_t z16 = svmul_x (pg, z8, z8);
+  svfloat64_t p = sv_estrin_11_f64_x (pg, z2, z4, z8, z16, d->poly);
+  /* Finalize polynomial: z + z * z2 * P(z2).  */
+  p = svmla_x (pg, z, svmul_x (pg, z, z2), p);
+
+  /* asin(|x|) = Q(|x|)         , for |x| < 0.5
+               = pi/2 - 2 Q(|x|), for |x| >= 0.5.  */
+  svfloat64_t y = svmad_m (a_ge_half, p, sv_f64 (-2.0), d->pi_over_2f);
+
+  /* Copy sign.  */
+  return svreinterpret_f64 (svorr_x (pg, svreinterpret_u64 (y), sign));
+}
diff --git a/sysdeps/aarch64/fpu/asinf_advsimd.c b/sysdeps/aarch64/fpu/asinf_advsimd.c
new file mode 100644
index 0000000000..3180ae7c8e
--- /dev/null
+++ b/sysdeps/aarch64/fpu/asinf_advsimd.c
@@ -0,0 +1,104 @@
+/* Single-precision AdvSIMD inverse sin
+
+   Copyright (C) 2023 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   .  */
+
+#include "v_math.h"
+#include "poly_advsimd_f32.h"
+
+static const struct data
+{
+  float32x4_t poly[5];
+  float32x4_t pi_over_2f;
+} data = {
+  /* Polynomial approximation of (asin(sqrt(x)) - sqrt(x)) / (x * sqrt(x)) on
+     [ 0x1p-24 0x1p-2 ] order = 4 rel error: 0x1.00a23bbp-29 .  */
+  .poly = { V4 (0x1.55555ep-3), V4 (0x1.33261ap-4), V4 (0x1.70d7dcp-5),
+            V4 (0x1.b059dp-6), V4 (0x1.3af7d8p-5) },
+  .pi_over_2f = V4 (0x1.921fb6p+0f),
+};
+
+#define AbsMask 0x7fffffff
+#define Half 0x3f000000
+#define One 0x3f800000
+#define Small 0x39800000 /* 2^-12.  */
+
+#if WANT_SIMD_EXCEPT
+static float32x4_t VPCS_ATTR NOINLINE
+special_case (float32x4_t x, float32x4_t y, uint32x4_t special)
+{
+  return v_call_f32 (asinf, x, y, special);
+}
+#endif
+
+/* Single-precision implementation of vector asin(x).
+
+   For |x| < Small, approximate asin(x) by x. Small = 2^-12 for correct
+   rounding. If WANT_SIMD_EXCEPT = 0, Small = 0 and we proceed with the
+   following approximation.
+
+   For |x| in [Small, 0.5], use order 4 polynomial P such that the final
+   approximation is an odd polynomial: asin(x) ~ x + x^3 P(x^2).
+
+   The largest observed error in this region is 0.83 ulps,
+   _ZGVnN4v_asinf (0x1.ea00f4p-2) got 0x1.fef15ep-2 want 0x1.fef15cp-2.
+
+   For |x| in [0.5, 1.0], use the same approximation with a change of variable
+
+     asin(x) = pi/2 - 2 * (y + y * z * P(z)), with z = (1-x)/2 and y = sqrt(z).
+
+   The largest observed error in this region is 2.41 ulps,
+   _ZGVnN4v_asinf (0x1.00203ep-1) got 0x1.0c3a64p-1 want 0x1.0c3a6p-1.  */
+float32x4_t VPCS_ATTR V_NAME_F1 (asin) (float32x4_t x)
+{
+  const struct data *d = ptr_barrier (&data);
+
+  uint32x4_t ix = vreinterpretq_u32_f32 (x);
+  uint32x4_t ia = vandq_u32 (ix, v_u32 (AbsMask));
+
+#if WANT_SIMD_EXCEPT
+  /* Special values need to be computed with scalar fallbacks so
+     that appropriate fp exceptions are raised.  */
+  uint32x4_t special
+      = vcgtq_u32 (vsubq_u32 (ia, v_u32 (Small)), v_u32 (One - Small));
+  if (__glibc_unlikely (v_any_u32 (special)))
+    return special_case (x, x, v_u32 (0xffffffff));
+#endif
+
+  float32x4_t ax = vreinterpretq_f32_u32 (ia);
+  uint32x4_t a_lt_half = vcltq_u32 (ia, v_u32 (Half));
+
+  /* Evaluate polynomial Q(x) = y + y * z * P(z) with
+     z = x ^ 2 and y = |x|            , if |x| < 0.5
+     z = (1 - |x|) / 2 and y = sqrt(z), if |x| >= 0.5.  */
+  float32x4_t z2 = vbslq_f32 (a_lt_half, vmulq_f32 (x, x),
+                              vfmsq_n_f32 (v_f32 (0.5), ax, 0.5));
+  float32x4_t z = vbslq_f32 (a_lt_half, ax, vsqrtq_f32 (z2));
+
+  /* Use a single polynomial approximation P for both intervals.  */
+  float32x4_t p = v_horner_4_f32 (z2, d->poly);
+  /* Finalize polynomial: z + z * z2 * P(z2).  */
+  p = vfmaq_f32 (z, vmulq_f32 (z, z2), p);
+
+  /* asin(|x|) = Q(|x|)         , for |x| < 0.5
+               = pi/2 - 2 Q(|x|), for |x| >= 0.5.  */
+  float32x4_t y
+      = vbslq_f32 (a_lt_half, p, vfmsq_n_f32 (d->pi_over_2f, p, 2.0));
+
+  /* Copy sign.  */
+  return vbslq_f32 (v_u32 (AbsMask), y, x);
+}
diff --git a/sysdeps/aarch64/fpu/asinf_sve.c b/sysdeps/aarch64/fpu/asinf_sve.c
new file mode 100644
index 0000000000..5abe710b5a
--- /dev/null
+++ b/sysdeps/aarch64/fpu/asinf_sve.c
@@ -0,0 +1,78 @@
+/* Single-precision SVE inverse sin
+
+   Copyright (C) 2023 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   .  */
+
+#include "sv_math.h"
+#include "poly_sve_f32.h"
+
+static const struct data
+{
+  float32_t poly[5];
+  float32_t pi_over_2f;
+} data = {
+  /* Polynomial approximation of (asin(sqrt(x)) - sqrt(x)) / (x * sqrt(x)) on
+     [ 0x1p-24 0x1p-2 ] order = 4 rel error: 0x1.00a23bbp-29 .  */
+  .poly = { 0x1.55555ep-3, 0x1.33261ap-4, 0x1.70d7dcp-5, 0x1.b059dp-6,
+            0x1.3af7d8p-5, },
+  .pi_over_2f = 0x1.921fb6p+0f,
+};
+
+/* Single-precision SVE implementation of vector asin(x).
+
+   For |x| in [0, 0.5], use order 4 polynomial P such that the final
+   approximation is an odd polynomial: asin(x) ~ x + x^3 P(x^2).
+
+   The largest observed error in this region is 0.83 ulps,
+   _ZGVsMxv_asinf (0x1.ea00f4p-2) got 0x1.fef15ep-2
+                                 want 0x1.fef15cp-2.
+
+   For |x| in [0.5, 1.0], use the same approximation with a change of variable
+
+     asin(x) = pi/2 - 2 * (y + y * z * P(z)), with z = (1-x)/2 and y = sqrt(z).
+
+   The largest observed error in this region is 2.41 ulps,
+   _ZGVsMxv_asinf (-0x1.00203ep-1) got -0x1.0c3a64p-1
+                                  want -0x1.0c3a6p-1.  */
+svfloat32_t SV_NAME_F1 (asin) (svfloat32_t x, const svbool_t pg)
+{
+  const struct data *d = ptr_barrier (&data);
+
+  svuint32_t sign = svand_x (pg, svreinterpret_u32 (x), 0x80000000);
+
+  svfloat32_t ax = svabs_x (pg, x);
+  svbool_t a_ge_half = svacge (pg, x, 0.5);
+
+  /* Evaluate polynomial Q(x) = y + y * z * P(z) with
+     z = x ^ 2 and y = |x|            , if |x| < 0.5
+     z = (1 - |x|) / 2 and y = sqrt(z), if |x| >= 0.5.  */
+  svfloat32_t z2 = svsel (a_ge_half, svmls_x (pg, sv_f32 (0.5), ax, 0.5),
+                          svmul_x (pg, x, x));
+  svfloat32_t z = svsqrt_m (ax, a_ge_half, z2);
+
+  /* Use a single polynomial approximation P for both intervals.  */
+  svfloat32_t p = sv_horner_4_f32_x (pg, z2, d->poly);
+  /* Finalize polynomial: z + z * z2 * P(z2).  */
+  p = svmla_x (pg, z, svmul_x (pg, z, z2), p);
+
+  /* asin(|x|) = Q(|x|)         , for |x| < 0.5
+               = pi/2 - 2 Q(|x|), for |x| >= 0.5.  */
+  svfloat32_t y = svmad_m (a_ge_half, p, sv_f32 (-2.0), d->pi_over_2f);
+
+  /* Copy sign.  */
+  return svreinterpret_f32 (svorr_x (pg, svreinterpret_u32 (y), sign));
+}
diff --git a/sysdeps/aarch64/fpu/bits/math-vector.h b/sysdeps/aarch64/fpu/bits/math-vector.h
index 06587ffa91..03778faf96 100644
--- a/sysdeps/aarch64/fpu/bits/math-vector.h
+++ b/sysdeps/aarch64/fpu/bits/math-vector.h
@@ -49,6 +49,7 @@ typedef __SVBool_t __sv_bool_t;
 
 #  define __vpcs __attribute__ ((__aarch64_vector_pcs__))
 
+__vpcs __f32x4_t _ZGVnN4v_asinf (__f32x4_t);
 __vpcs __f32x4_t _ZGVnN4v_cosf (__f32x4_t);
 __vpcs __f32x4_t _ZGVnN4v_expf (__f32x4_t);
 __vpcs __f32x4_t _ZGVnN4v_exp10f (__f32x4_t);
@@ -59,6 +60,7 @@ __vpcs __f32x4_t _ZGVnN4v_log2f (__f32x4_t);
 __vpcs __f32x4_t _ZGVnN4v_sinf (__f32x4_t);
 __vpcs __f32x4_t _ZGVnN4v_tanf (__f32x4_t);
 
+__vpcs __f64x2_t _ZGVnN2v_asin (__f64x2_t);
 __vpcs __f64x2_t _ZGVnN2v_cos (__f64x2_t);
 __vpcs __f64x2_t _ZGVnN2v_exp (__f64x2_t);
 __vpcs __f64x2_t _ZGVnN2v_exp10 (__f64x2_t);
@@ -74,6 +76,7 @@ __vpcs __f64x2_t _ZGVnN2v_tan (__f64x2_t);
 
 #ifdef __SVE_VEC_MATH_SUPPORTED
 
+__sv_f32_t _ZGVsMxv_asinf (__sv_f32_t, __sv_bool_t);
 __sv_f32_t _ZGVsMxv_cosf (__sv_f32_t, __sv_bool_t);
 __sv_f32_t _ZGVsMxv_expf (__sv_f32_t, __sv_bool_t);
 __sv_f32_t _ZGVsMxv_exp10f (__sv_f32_t, __sv_bool_t);
@@ -84,6 +87,7 @@ __sv_f32_t _ZGVsMxv_log2f (__sv_f32_t, __sv_bool_t);
 __sv_f32_t _ZGVsMxv_sinf (__sv_f32_t, __sv_bool_t);
 __sv_f32_t _ZGVsMxv_tanf (__sv_f32_t, __sv_bool_t);
 
+__sv_f64_t _ZGVsMxv_asin (__sv_f64_t, __sv_bool_t);
 __sv_f64_t _ZGVsMxv_cos (__sv_f64_t, __sv_bool_t);
 __sv_f64_t _ZGVsMxv_exp (__sv_f64_t, __sv_bool_t);
 __sv_f64_t _ZGVsMxv_exp10 (__sv_f64_t, __sv_bool_t);
diff --git a/sysdeps/aarch64/fpu/test-double-advsimd-wrappers.c b/sysdeps/aarch64/fpu/test-double-advsimd-wrappers.c
index 26d5ecf66f..b5ccd6b1cc 100644
--- a/sysdeps/aarch64/fpu/test-double-advsimd-wrappers.c
+++ b/sysdeps/aarch64/fpu/test-double-advsimd-wrappers.c
@@ -23,6 +23,7 @@
 
 #define VEC_TYPE float64x2_t
 
+VPCS_VECTOR_WRAPPER (asin_advsimd, _ZGVnN2v_asin)
 VPCS_VECTOR_WRAPPER (cos_advsimd, _ZGVnN2v_cos)
 VPCS_VECTOR_WRAPPER (exp_advsimd, _ZGVnN2v_exp)
 VPCS_VECTOR_WRAPPER (exp10_advsimd, _ZGVnN2v_exp10)
diff --git a/sysdeps/aarch64/fpu/test-double-sve-wrappers.c b/sysdeps/aarch64/fpu/test-double-sve-wrappers.c
index 86efd60779..fc3b20f421 100644
--- a/sysdeps/aarch64/fpu/test-double-sve-wrappers.c
+++ b/sysdeps/aarch64/fpu/test-double-sve-wrappers.c
@@ -32,6 +32,7 @@
     return svlastb_f64 (svptrue_b64 (), mr); \
   }
 
+SVE_VECTOR_WRAPPER (asin_sve, _ZGVsMxv_asin)
 SVE_VECTOR_WRAPPER (cos_sve, _ZGVsMxv_cos)
 SVE_VECTOR_WRAPPER (exp_sve, _ZGVsMxv_exp)
 SVE_VECTOR_WRAPPER (exp10_sve, _ZGVsMxv_exp10)
diff --git a/sysdeps/aarch64/fpu/test-float-advsimd-wrappers.c b/sysdeps/aarch64/fpu/test-float-advsimd-wrappers.c
index 8f7ebea1ac..0a36aa91f5 100644
--- a/sysdeps/aarch64/fpu/test-float-advsimd-wrappers.c
+++ b/sysdeps/aarch64/fpu/test-float-advsimd-wrappers.c
@@ -23,6 +23,7 @@
 
 #define VEC_TYPE float32x4_t
 
+VPCS_VECTOR_WRAPPER (asinf_advsimd, _ZGVnN4v_asinf)
 VPCS_VECTOR_WRAPPER (cosf_advsimd, _ZGVnN4v_cosf)
 VPCS_VECTOR_WRAPPER (expf_advsimd, _ZGVnN4v_expf)
 VPCS_VECTOR_WRAPPER (exp10f_advsimd, _ZGVnN4v_exp10f)
diff --git a/sysdeps/aarch64/fpu/test-float-sve-wrappers.c b/sysdeps/aarch64/fpu/test-float-sve-wrappers.c
index 885e58ac39..f7e4882c7a 100644
--- a/sysdeps/aarch64/fpu/test-float-sve-wrappers.c
+++ b/sysdeps/aarch64/fpu/test-float-sve-wrappers.c
@@ -32,6 +32,7 @@
     return svlastb_f32 (svptrue_b32 (), mr); \
   }
 
+SVE_VECTOR_WRAPPER (asinf_sve, _ZGVsMxv_asinf)
 SVE_VECTOR_WRAPPER (cosf_sve, _ZGVsMxv_cosf)
 SVE_VECTOR_WRAPPER (expf_sve, _ZGVsMxv_expf)
 SVE_VECTOR_WRAPPER (exp10f_sve, _ZGVsMxv_exp10f)
diff --git a/sysdeps/aarch64/libm-test-ulps b/sysdeps/aarch64/libm-test-ulps
index d117209c06..1edc0fc343 100644
--- a/sysdeps/aarch64/libm-test-ulps
+++ b/sysdeps/aarch64/libm-test-ulps
@@ -46,11 +46,19 @@ double: 1
 float: 1
 ldouble: 1
 
+Function: "asin_advsimd":
+double: 2
+float: 2
+
 Function: "asin_downward":
 double: 1
 float: 1
 ldouble: 2
 
+Function: "asin_sve":
+double: 2
+float: 2
+
 Function: "asin_towardzero":
 double: 1
 float: 1
diff --git a/sysdeps/unix/sysv/linux/aarch64/libmvec.abilist b/sysdeps/unix/sysv/linux/aarch64/libmvec.abilist
index cad774521a..6431c3fe65 100644
--- a/sysdeps/unix/sysv/linux/aarch64/libmvec.abilist
+++ b/sysdeps/unix/sysv/linux/aarch64/libmvec.abilist
@@ -14,16 +14,20 @@ GLIBC_2.38 _ZGVsMxv_log F
 GLIBC_2.38 _ZGVsMxv_logf F
 GLIBC_2.38 _ZGVsMxv_sin F
 GLIBC_2.38 _ZGVsMxv_sinf F
+GLIBC_2.39 _ZGVnN2v_asin F
 GLIBC_2.39 _ZGVnN2v_exp10 F
 GLIBC_2.39 _ZGVnN2v_exp2 F
 GLIBC_2.39 _ZGVnN2v_log10 F
 GLIBC_2.39 _ZGVnN2v_log2 F
 GLIBC_2.39 _ZGVnN2v_tan F
+GLIBC_2.39 _ZGVnN4v_asinf F
 GLIBC_2.39 _ZGVnN4v_exp10f F
 GLIBC_2.39 _ZGVnN4v_exp2f F
 GLIBC_2.39 _ZGVnN4v_log10f F
 GLIBC_2.39 _ZGVnN4v_log2f F
 GLIBC_2.39 _ZGVnN4v_tanf F
+GLIBC_2.39 _ZGVsMxv_asin F
+GLIBC_2.39 _ZGVsMxv_asinf F
 GLIBC_2.39 _ZGVsMxv_exp10 F
 GLIBC_2.39 _ZGVsMxv_exp10f F
 GLIBC_2.39 _ZGVsMxv_exp2 F