From patchwork Fri Jan 29 12:55:45 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: Jonathan Wright X-Patchwork-Id: 1433302 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=gcc.gnu.org (client-ip=2620:52:3:1:0:246e:9693:128c; helo=sourceware.org; envelope-from=gcc-patches-bounces@gcc.gnu.org; receiver=) Authentication-Results: ozlabs.org; dkim=pass (1024-bit key; unprotected) header.d=gcc.gnu.org header.i=@gcc.gnu.org header.a=rsa-sha256 header.s=default header.b=iys5AzZT; dkim-atps=neutral Received: from sourceware.org (server2.sourceware.org [IPv6:2620:52:3:1:0:246e:9693:128c]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest SHA256) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 4DRy511prPz9sVF for ; Fri, 29 Jan 2021 23:56:04 +1100 (AEDT) Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id 6AE383854821; Fri, 29 Jan 2021 12:56:01 +0000 (GMT) DKIM-Filter: OpenDKIM Filter v2.11.0 sourceware.org 6AE383854821 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gcc.gnu.org; s=default; t=1611924961; bh=/2RoWKX2fid5w3L305dNryYmKFwFJEJqf6r0ktcA4Yw=; h=To:Subject:Date:List-Id:List-Unsubscribe:List-Archive:List-Post: List-Help:List-Subscribe:From:Reply-To:From; b=iys5AzZT/HnK204CqiusWZjomY0xYrSkbH1KY55emncZLTNsh7lohZSnxy79wo958 rNGkp/aGa9iHb0KeLMcgawHveIYeD7LR4HIhwQt+sXSTBdihTxKLBZ7hWh9ElzXHgp ogqzzQEz5INP8VCQG3frOZ9Tj/GpMf8yTXOyGvBI= X-Original-To: gcc-patches@gcc.gnu.org Delivered-To: gcc-patches@gcc.gnu.org Received: from FRA01-MR2-obe.outbound.protection.outlook.com (mail-eopbgr90052.outbound.protection.outlook.com [40.107.9.52]) by sourceware.org (Postfix) with ESMTPS id BBFA23858004 for ; Fri, 29 Jan 2021 12:55:56 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.3.2 sourceware.org BBFA23858004 Received: from MR2P264CA0036.FRAP264.PROD.OUTLOOK.COM (2603:10a6:500::24) by PR2PR08MB4633.eurprd08.prod.outlook.com (2603:10a6:101:1c::17) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.3784.11; Fri, 29 Jan 2021 12:55:54 +0000 Received: from VE1EUR03FT042.eop-EUR03.prod.protection.outlook.com (2603:10a6:500:0:cafe::7) by MR2P264CA0036.outlook.office365.com (2603:10a6:500::24) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.3805.19 via Frontend Transport; Fri, 29 Jan 2021 12:55:54 +0000 X-MS-Exchange-Authentication-Results: spf=pass (sender IP is 63.35.35.123) smtp.mailfrom=arm.com; gcc.gnu.org; dkim=pass (signature was verified) header.d=armh.onmicrosoft.com;gcc.gnu.org; dmarc=pass action=none header.from=arm.com; Received-SPF: Pass (protection.outlook.com: domain of arm.com designates 63.35.35.123 as permitted sender) receiver=protection.outlook.com; client-ip=63.35.35.123; helo=64aa7808-outbound-1.mta.getcheckrecipient.com; Received: from 64aa7808-outbound-1.mta.getcheckrecipient.com (63.35.35.123) by VE1EUR03FT042.mail.protection.outlook.com (10.152.19.62) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.3784.11 via Frontend Transport; Fri, 29 Jan 2021 12:55:53 +0000 Received: ("Tessian outbound 28c96a6c9d2e:v71"); Fri, 29 Jan 2021 12:55:53 +0000 X-CheckRecipientChecked: true X-CR-MTA-CID: 3f86beb14d54e9ad X-CR-MTA-TID: 64aa7808 Received: from 4d3052902594.2 by 64aa7808-outbound-1.mta.getcheckrecipient.com id 1F011BC1-3F76-48B3-8B1F-D7017612A731.1; Fri, 29 Jan 2021 12:55:48 +0000 Received: from EUR02-VE1-obe.outbound.protection.outlook.com by 64aa7808-outbound-1.mta.getcheckrecipient.com with ESMTPS id 4d3052902594.2 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384); Fri, 29 Jan 2021 12:55:48 +0000 ARC-Seal: i=1; a=rsa-sha256; s=arcselector9901; d=microsoft.com; cv=none; b=WW0v+A1ehuxSnyiLJoOOP9cPi0Km7wqTQAm2ysvmxYLvI8CEs3+j9j7ZRJQPpFPolsnwFQ/XMhbMacGUQS4kYUB5hFbxwDPrlkjWn/1GQkTKBBb2XvaV/hQoYAlD0wxcuWMm9NUmjU3/n86TvVieMMh43cVFeFfpr7D0V2FY3YjjZltWcUNregannXbYgmzPwbToIiqYfRmNsGTtWSAEYPepOA/GAgbA/JTb0QW9AWg1bFAX7ndMouKvY37ZHeO/F4van1mILhdxe9IXa1P/p1QqzgDFUCMa5jxdmjMOXzNnepKF40P5HvhnGF4DB9AoB9GnRqcc6VRNVyIhTQBCHw== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com; s=arcselector9901; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-SenderADCheck; bh=/2RoWKX2fid5w3L305dNryYmKFwFJEJqf6r0ktcA4Yw=; b=OTBg4xQ0+q0uyWVaz0gM+DQD0cck3HwcM6FWIDRamvriPg2OC4um3/c2wJogjw1cBNZeyy5g36FakP41E9d+/8AJ1d4SUJk9BFhZ4qed2iHj125VGuRrl0t8uEwmbq378BTPQl3ADs5CbkyrMqHhbEyLBh+w3EI2uc1G5XzqQcJU7Nhw8g5EX1LOUAR2fsilWjocErIlmu/Rn7ff9BCTTW7hIRPPkAR0DZliBAgxGzg8OA8PceaaOPE/F6YuwTJq8w2g9tt6r7QyABWwOWYXMPoXhHf7N0DAPweZbkF8fQN4lbcKhasWQ5dtqs4TpcAB5mBZ/fvyx7eMcXNGMgjqfg== ARC-Authentication-Results: i=1; mx.microsoft.com 1; spf=pass smtp.mailfrom=arm.com; dmarc=pass action=none header.from=arm.com; dkim=pass header.d=arm.com; arc=none Received: from DBBPR08MB4758.eurprd08.prod.outlook.com (2603:10a6:10:da::16) by DBBPR08MB4856.eurprd08.prod.outlook.com (2603:10a6:10:f6::13) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.3784.12; Fri, 29 Jan 2021 12:55:45 +0000 Received: from DBBPR08MB4758.eurprd08.prod.outlook.com ([fe80::4407:4457:8f6b:eee5]) by DBBPR08MB4758.eurprd08.prod.outlook.com ([fe80::4407:4457:8f6b:eee5%3]) with mapi id 15.20.3784.019; Fri, 29 Jan 2021 12:55:45 +0000 To: "gcc-patches@gcc.gnu.org" Subject: [PATCH] aarch64: Use RTL builtins for [su]mlsl_lane[q] intrinsics Thread-Topic: [PATCH] aarch64: Use RTL builtins for [su]mlsl_lane[q] intrinsics Thread-Index: AQHW9j2bxJ1yP9I8I0CxYD8Vwe+SWQ== Date: Fri, 29 Jan 2021 12:55:45 +0000 Message-ID: Accept-Language: en-GB, en-US Content-Language: en-GB X-MS-Has-Attach: yes X-MS-TNEF-Correlator: Authentication-Results-Original: gcc.gnu.org; dkim=none (message not signed) header.d=none;gcc.gnu.org; dmarc=none action=none header.from=arm.com; x-originating-ip: [217.140.99.251] x-ms-publictraffictype: Email X-MS-Office365-Filtering-HT: Tenant X-MS-Office365-Filtering-Correlation-Id: 515a386d-1802-4664-9908-08d8c45536a0 x-ms-traffictypediagnostic: DBBPR08MB4856:|PR2PR08MB4633: x-ms-exchange-transport-forked: True X-Microsoft-Antispam-PRVS: x-checkrecipientrouted: true nodisclaimer: true x-ms-oob-tlc-oobclassifiers: OLM:80;OLM:80; X-MS-Exchange-SenderADCheck: 1 X-Microsoft-Antispam-Untrusted: BCL:0; X-Microsoft-Antispam-Message-Info-Original: E6UHNrVNj9x7QJv20yke+vWIUH6u6o6nUIikyqD4inIbLLoUqNEoSLwxASUnd7oPloYlvhyuxNHJk0fa12voLujmduGfPTO8rfxlmQepiuVk2HJ0eicaCv/E1rcZVWUGeTyPQvVtFM8rBR1RhmwiEOUbW2kavX6iLIkNwQ+vTvk830x7p8BTtoxcjhKHhvUoBjXlmV0O7K8QNTGz2b7HuDO/mJ+BfYCgbxdptSsQgrKTPmDnPi7b0iT6bmiL8YiMAytROtACJ7UIl50G+J1cWqfs9aFglEpTl8t3I2kx1KmoCmAa2ouJVrxv7yYEinV9NipMmGUs0ccLgD7n3qTAwBBLErP1V2owTT+hJqtAMeaUOtrskW3/81Dx1TM5b9+7MoFnPMAeBfbs38E5N7r/W/YVzaAs/1ZSy2WDubVNCVOYxxJdaPfdflDCORFmyRN49gmH3eSY1NYlNY6iixxmcKrNY+261r1my6eE65Qa6W28i63kk929lJrAI0/BecNFowTsiDo+k+89ukweQLOmpQ== X-Forefront-Antispam-Report-Untrusted: CIP:255.255.255.255; CTRY:; LANG:en; SCL:1; SRV:; IPV:NLI; SFV:NSPM; H:DBBPR08MB4758.eurprd08.prod.outlook.com; PTR:; CAT:NONE; SFS:(4636009)(346002)(396003)(136003)(39860400002)(366004)(376002)(99936003)(8936002)(5660300002)(52536014)(8676002)(33656002)(66446008)(26005)(6506007)(71200400001)(86362001)(91956017)(478600001)(9686003)(2906002)(7696005)(316002)(76116006)(4326008)(4744005)(55016002)(186003)(66946007)(66556008)(6916009)(64756008)(66476007)(66616009); DIR:OUT; SFP:1101; x-ms-exchange-antispam-messagedata: =?iso-8859-1?q?9iapOowWm91SI10QmpoDH2OLA?= =?iso-8859-1?q?HVydfNnPzD1MzMNGW4iV55b8BDOi9sg1GfztQKj5/p5SLysEQo7W+jMNPw3I?= =?iso-8859-1?q?e2PM7+eTqmAQg9RDRe570aId1G6x+DGXkwffMoogqdhrjhxF3kCbOvlprpB2?= =?iso-8859-1?q?x+EdQprL9SyPAwAzdy20QZdqUk07ZbtSJE2yUeVNQgPSuJgzf6ze2KmEId20?= =?iso-8859-1?q?e93ICdO4dzhMBpZvEE8Xj54frm+yz1SrxegwdwQsVEnUSERLXOw3KDJmWSol?= =?iso-8859-1?q?KVR6Bn0XK/8ds63M0KZpiP0VYylYWLG70AnlRtA47AFCOQJ1yHwUwUHw9KdS?= =?iso-8859-1?q?DhesL0CjV4UaX2VRJUxPWKMOemyZrx94kdUVPT6WwhHbGImLdXEr7qDAD9qz?= =?iso-8859-1?q?Vz63l1y6UQvj4jOYgmmuGIONmPN5gNt8zJqJsKi29S2/PAa/Rf1eyzIWm7I4?= =?iso-8859-1?q?0eks2Kzts4Mgev/LALKxtWpD3SMd+5N+8eXUZt15POikHgsyZtGk8orB+NHN?= =?iso-8859-1?q?/BIZpyaWVWfd2Ldl4//wBwkGTOnHyT43wzn9fdZL2P3sG/iOGfpGnr1KeXF9?= =?iso-8859-1?q?hpnaPETHf/vAClOIEa7/r1pmQaI4MK3gtRZ75W+uwgiRzA1SLCbi4kIWDAC8?= =?iso-8859-1?q?5rHMp34zvY2zxc3uQBly9m2vb/YlU7hygno7DA4WqnXrCafk4L1vySkf+Loa?= =?iso-8859-1?q?Kv2p4aA+g7V3eqVSpA6+DomExpxyRdFKSWkDjJWcIQnJ/Xep0r1FG7BdF1vz?= =?iso-8859-1?q?9H+DmBanwzwLn8l4PsNj2sVidafpoYcJSPk8Uf1x6468m51s1lb+3ZG51cyA?= =?iso-8859-1?q?cxZ3c/b/1w7R1OaN5S+JfM33JtLJBtQ7MYgYIYDYocVTp4T0Vs5YK5Oz1SsJ?= =?iso-8859-1?q?TZ11okmflvVbXShGwgkcTd7qq5GnISOsMLLoJLP+9TUTJ0NCpDjM1phsDusn?= =?iso-8859-1?q?RNpMOaZJLTd7ub0KND7d2UYbqQl1DsvgpcqC6FBl8CqXKv5hJcvlXeHoamMz?= =?iso-8859-1?q?u3hwCcneFrDPtvZJYA92TRsFLV1fywz8wB9w7adNIyOBqLHkOZRkYUZ3Xc78?= =?iso-8859-1?q?7tW/wN03EmxYtrnMou6677MaXW4Ke9ek0i5cTAGI2JJ?= MIME-Version: 1.0 X-MS-Exchange-Transport-CrossTenantHeadersStamped: DBBPR08MB4856 Original-Authentication-Results: gcc.gnu.org; dkim=none (message not signed) header.d=none;gcc.gnu.org; dmarc=none action=none header.from=arm.com; X-EOPAttributedMessage: 0 X-MS-Exchange-Transport-CrossTenantHeadersStripped: VE1EUR03FT042.eop-EUR03.prod.protection.outlook.com X-MS-Office365-Filtering-Correlation-Id-Prvs: 4dc03891-7e4a-4f82-e662-08d8c45531ba X-Microsoft-Antispam: BCL:0; X-Microsoft-Antispam-Message-Info: VvZ9ecWyIVRK3pS7hIYU1e2Tacw6l0CWjb4DSWhWCeOz0bE4cFpHM0ROXE/oIW9KOVPcsyC3nvW+d3LJ2brWJtYDojTRKU5mZ7F42ljdHz536URCilwiXsYZvovOo2QpH9RYne5DHE0NeUWnpUd2KGj2lvIbR2eT6vM7dXN/HC4tFoHzCQJ4XNcxSd1UtNV2x0iMoL0HdCqY2Bd6Uia4HDQYVCOskVcYXTfrqOATmLlvHuV5g5SgMStUN8Kem8SS8SPF28Pp3m9fR7ZFNoxdOGQXaX9Fa0vKaPvI1o6UJpknBbiKiG8dK2xPyfhSMjIkmtyI9kl2ENnN//lUXlXk26ljmm9bjmG+WdcnGxll6OWNXjNEYan1pYNftN/AP8UwSkjNHE86OorYMWjxPz+EYxDX6vvhc2XuWqgaWJoyanXaPBZFocMjmVUfBcS8R+Vcl971NMlFa6rZlC+lF6AHnqPWDcQLbzDkmaqfB0HHHPWH4Uguf/mM//NK9ZaEMDiMmIYZAdo2qFTRPF4c2vkt6RDZYorLdljN89nb5trcZkMFcKdcq6GYkfcCKIXS0IMBaVeimsBnKufKHO6CvJpZcd7c+l+lT7Wq5YWbvqt0f4s= X-Forefront-Antispam-Report: CIP:63.35.35.123; CTRY:IE; LANG:en; SCL:1; SRV:; IPV:CAL; SFV:NSPM; H:64aa7808-outbound-1.mta.getcheckrecipient.com; PTR:ec2-63-35-35-123.eu-west-1.compute.amazonaws.com; CAT:NONE; SFS:(4636009)(136003)(39860400002)(376002)(396003)(346002)(46966006)(336012)(55016002)(235185007)(81166007)(99936003)(186003)(8676002)(82740400003)(478600001)(6916009)(356005)(52536014)(2906002)(9686003)(33656002)(8936002)(86362001)(70206006)(82310400003)(66616009)(47076005)(6506007)(316002)(7696005)(26005)(4326008)(5660300002)(70586007); DIR:OUT; SFP:1101; X-OriginatorOrg: arm.com X-MS-Exchange-CrossTenant-OriginalArrivalTime: 29 Jan 2021 12:55:53.9006 (UTC) X-MS-Exchange-CrossTenant-Network-Message-Id: 515a386d-1802-4664-9908-08d8c45536a0 X-MS-Exchange-CrossTenant-Id: f34e5979-57d9-4aaa-ad4d-b122a662184d X-MS-Exchange-CrossTenant-OriginalAttributedTenantConnectingIp: TenantId=f34e5979-57d9-4aaa-ad4d-b122a662184d; Ip=[63.35.35.123]; Helo=[64aa7808-outbound-1.mta.getcheckrecipient.com] X-MS-Exchange-CrossTenant-AuthSource: VE1EUR03FT042.eop-EUR03.prod.protection.outlook.com X-MS-Exchange-CrossTenant-AuthAs: Anonymous X-MS-Exchange-CrossTenant-FromEntityHeader: HybridOnPrem X-MS-Exchange-Transport-CrossTenantHeadersStamped: PR2PR08MB4633 X-Spam-Status: No, score=-15.8 required=5.0 tests=BAYES_00, DKIM_SIGNED, DKIM_VALID, GIT_PATCH_0, KAM_LOTSOFHASH, RCVD_IN_DNSWL_LOW, RCVD_IN_MSPIKE_H2, SPF_HELO_PASS, SPF_PASS, TXREP, UNPARSEABLE_RELAY autolearn=ham autolearn_force=no version=3.4.2 X-Spam-Checker-Version: SpamAssassin 3.4.2 (2018-09-13) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-Patchwork-Original-From: Jonathan Wright via Gcc-patches From: Jonathan Wright Reply-To: Jonathan Wright Errors-To: gcc-patches-bounces@gcc.gnu.org Sender: "Gcc-patches" Hi, As subject, this patch rewrites [su]mlsl_lane[q] Neon intrinsics to use RTL builtins rather than inline assembly code, allowing for better scheduling and optimization. Regression tested and bootstrapped on aarch64-none-linux-gnu and aarch64_be-none-elf - no issues. Ok for master? Thanks, Jonathan gcc/ChangeLog: 2021-01-28  Jonathan Wright   * config/aarch64/aarch64-simd-builtins.def: Add [su]mlsl_lane[q] builtin generator macros. * config/aarch64/aarch64-simd.md (aarch64_vec_mlsl_lane): Define. * config/aarch64/arm_neon.h (vmlsl_lane_s16): Use RTL builtin instead of inline asm. (vmlsl_lane_s32): Likewise. (vmlsl_lane_u16): Likewise. (vmlsl_lane_u32): Likewise. (vmlsl_laneq_s16): Likewise. (vmlsl_laneq_s32): Likewise. (vmlsl_laneq_u16): Likewise. (vmlsl_laneq_u32): Likewise. diff --git a/gcc/config/aarch64/aarch64-simd-builtins.def b/gcc/config/aarch64/aarch64-simd-builtins.def index cb79c08ba66df817e289d891b206ea7f66c81527..4913231ea55260fea1c7511a28a436e1e1e2ab20 100644 --- a/gcc/config/aarch64/aarch64-simd-builtins.def +++ b/gcc/config/aarch64/aarch64-simd-builtins.def @@ -264,6 +264,11 @@ BUILTIN_VD_HSI (TERNOPU_LANE, vec_umult_laneq_, 0, ALL) BUILTIN_VD_HSI (QUADOPU_LANE, vec_umlal_laneq_, 0, ALL) + BUILTIN_VD_HSI (QUADOP_LANE, vec_smlsl_lane_, 0, NONE) + BUILTIN_VD_HSI (QUADOP_LANE, vec_smlsl_laneq_, 0, NONE) + BUILTIN_VD_HSI (QUADOPU_LANE, vec_umlsl_lane_, 0, NONE) + BUILTIN_VD_HSI (QUADOPU_LANE, vec_umlsl_laneq_, 0, NONE) + BUILTIN_VSD_HSI (BINOP, sqdmull, 0, NONE) BUILTIN_VSD_HSI (TERNOP_LANE, sqdmull_lane, 0, NONE) BUILTIN_VSD_HSI (TERNOP_LANE, sqdmull_laneq, 0, NONE) diff --git a/gcc/config/aarch64/aarch64-simd.md b/gcc/config/aarch64/aarch64-simd.md index 919d0b03998d893232331d6f4da5c93ae6bf41b8..adeec028d49f06156a5e84ce4dd83dbd6f151474 100644 --- a/gcc/config/aarch64/aarch64-simd.md +++ b/gcc/config/aarch64/aarch64-simd.md @@ -2082,6 +2082,26 @@ [(set_attr "type" "neon_mla__scalar_long")] ) +(define_insn "aarch64_vec_mlsl_lane" + [(set (match_operand: 0 "register_operand" "=w") + (minus: + (match_operand: 1 "register_operand" "0") + (mult: + (ANY_EXTEND: + (match_operand: 2 "register_operand" "w")) + (ANY_EXTEND: + (vec_duplicate: + (vec_select: + (match_operand:VDQHS 3 "register_operand" "") + (parallel [(match_operand:SI 4 "immediate_operand" "i")])))))))] + "TARGET_SIMD" + { + operands[4] = aarch64_endian_lane_rtx (mode, INTVAL (operands[4])); + return "mlsl\\t%0., %2., %3.[%4]"; + } + [(set_attr "type" "neon_mla__scalar_long")] +) + ;; FP vector operations. ;; AArch64 AdvSIMD supports single-precision (32-bit) and ;; double-precision (64-bit) floating-point data types and arithmetic as diff --git a/gcc/config/aarch64/arm_neon.h b/gcc/config/aarch64/arm_neon.h index b56ab68aad57afb97447c9f5d24f392f6e2b618b..2a71ca9aa3c8c4095e99aa08c48e583f037a41ed 100644 --- a/gcc/config/aarch64/arm_neon.h +++ b/gcc/config/aarch64/arm_neon.h @@ -8068,117 +8068,65 @@ vmlsl_high_u32 (uint64x2_t __a, uint32x4_t __b, uint32x4_t __c) return __builtin_aarch64_umlsl_hiv4si_uuuu (__a, __b, __c); } -#define vmlsl_lane_s16(a, b, c, d) \ - __extension__ \ - ({ \ - int16x4_t c_ = (c); \ - int16x4_t b_ = (b); \ - int32x4_t a_ = (a); \ - int32x4_t result; \ - __asm__ ("smlsl %0.4s, %2.4h, %3.h[%4]" \ - : "=w"(result) \ - : "0"(a_), "w"(b_), "x"(c_), "i"(d) \ - : /* No clobbers */); \ - result; \ - }) +__extension__ extern __inline int32x4_t +__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) +vmlsl_lane_s16 (int32x4_t __a, int16x4_t __b, int16x4_t __v, const int __lane) +{ + return __builtin_aarch64_vec_smlsl_lane_v4hi (__a, __b, __v, __lane); +} -#define vmlsl_lane_s32(a, b, c, d) \ - __extension__ \ - ({ \ - int32x2_t c_ = (c); \ - int32x2_t b_ = (b); \ - int64x2_t a_ = (a); \ - int64x2_t result; \ - __asm__ ("smlsl %0.2d, %2.2s, %3.s[%4]" \ - : "=w"(result) \ - : "0"(a_), "w"(b_), "w"(c_), "i"(d) \ - : /* No clobbers */); \ - result; \ - }) +__extension__ extern __inline int64x2_t +__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) +vmlsl_lane_s32 (int64x2_t __a, int32x2_t __b, int32x2_t __v, const int __lane) +{ + return __builtin_aarch64_vec_smlsl_lane_v2si (__a, __b, __v, __lane); +} -#define vmlsl_lane_u16(a, b, c, d) \ - __extension__ \ - ({ \ - uint16x4_t c_ = (c); \ - uint16x4_t b_ = (b); \ - uint32x4_t a_ = (a); \ - uint32x4_t result; \ - __asm__ ("umlsl %0.4s, %2.4h, %3.h[%4]" \ - : "=w"(result) \ - : "0"(a_), "w"(b_), "x"(c_), "i"(d) \ - : /* No clobbers */); \ - result; \ - }) +__extension__ extern __inline uint32x4_t +__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) +vmlsl_lane_u16 (uint32x4_t __a, uint16x4_t __b, uint16x4_t __v, + const int __lane) +{ + return __builtin_aarch64_vec_umlsl_lane_v4hi_uuuus (__a, __b, __v, __lane); +} -#define vmlsl_lane_u32(a, b, c, d) \ - __extension__ \ - ({ \ - uint32x2_t c_ = (c); \ - uint32x2_t b_ = (b); \ - uint64x2_t a_ = (a); \ - uint64x2_t result; \ - __asm__ ("umlsl %0.2d, %2.2s, %3.s[%4]" \ - : "=w"(result) \ - : "0"(a_), "w"(b_), "w"(c_), "i"(d) \ - : /* No clobbers */); \ - result; \ - }) +__extension__ extern __inline uint64x2_t +__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) +vmlsl_lane_u32 (uint64x2_t __a, uint32x2_t __b, uint32x2_t __v, + const int __lane) +{ + return __builtin_aarch64_vec_umlsl_lane_v2si_uuuus (__a, __b, __v, __lane); +} -#define vmlsl_laneq_s16(a, b, c, d) \ - __extension__ \ - ({ \ - int16x8_t c_ = (c); \ - int16x4_t b_ = (b); \ - int32x4_t a_ = (a); \ - int32x4_t result; \ - __asm__ ("smlsl %0.4s, %2.4h, %3.h[%4]" \ - : "=w"(result) \ - : "0"(a_), "w"(b_), "x"(c_), "i"(d) \ - : /* No clobbers */); \ - result; \ - }) +__extension__ extern __inline int32x4_t +__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) +vmlsl_laneq_s16 (int32x4_t __a, int16x4_t __b, int16x8_t __v, const int __lane) +{ + return __builtin_aarch64_vec_smlsl_laneq_v4hi (__a, __b, __v, __lane); +} -#define vmlsl_laneq_s32(a, b, c, d) \ - __extension__ \ - ({ \ - int32x4_t c_ = (c); \ - int32x2_t b_ = (b); \ - int64x2_t a_ = (a); \ - int64x2_t result; \ - __asm__ ("smlsl %0.2d, %2.2s, %3.s[%4]" \ - : "=w"(result) \ - : "0"(a_), "w"(b_), "w"(c_), "i"(d) \ - : /* No clobbers */); \ - result; \ - }) +__extension__ extern __inline int64x2_t +__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) +vmlsl_laneq_s32 (int64x2_t __a, int32x2_t __b, int32x4_t __v, const int __lane) +{ + return __builtin_aarch64_vec_smlsl_laneq_v2si (__a, __b, __v, __lane); +} -#define vmlsl_laneq_u16(a, b, c, d) \ - __extension__ \ - ({ \ - uint16x8_t c_ = (c); \ - uint16x4_t b_ = (b); \ - uint32x4_t a_ = (a); \ - uint32x4_t result; \ - __asm__ ("umlsl %0.4s, %2.4h, %3.h[%4]" \ - : "=w"(result) \ - : "0"(a_), "w"(b_), "x"(c_), "i"(d) \ - : /* No clobbers */); \ - result; \ - }) +__extension__ extern __inline uint32x4_t +__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) +vmlsl_laneq_u16 (uint32x4_t __a, uint16x4_t __b, uint16x8_t __v, + const int __lane) +{ + return __builtin_aarch64_vec_umlsl_laneq_v4hi_uuuus (__a, __b, __v, __lane); +} -#define vmlsl_laneq_u32(a, b, c, d) \ - __extension__ \ - ({ \ - uint32x4_t c_ = (c); \ - uint32x2_t b_ = (b); \ - uint64x2_t a_ = (a); \ - uint64x2_t result; \ - __asm__ ("umlsl %0.2d, %2.2s, %3.s[%4]" \ - : "=w"(result) \ - : "0"(a_), "w"(b_), "w"(c_), "i"(d) \ - : /* No clobbers */); \ - result; \ - }) +__extension__ extern __inline uint64x2_t +__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) +vmlsl_laneq_u32 (uint64x2_t __a, uint32x2_t __b, uint32x4_t __v, + const int __lane) +{ + return __builtin_aarch64_vec_umlsl_laneq_v2si_uuuus (__a, __b, __v, __lane); +} __extension__ extern __inline int32x4_t __attribute__ ((__always_inline__, __gnu_inline__, __artificial__))