From patchwork Thu Nov 14 19:13:33 2019
X-Patchwork-Submitter: Srinath Parvathaneni
X-Patchwork-Id: 1195150
From: Srinath Parvathaneni
To: gcc-patches@gcc.gnu.org
CC: Richard Earnshaw, Kyrylo Tkachov
Subject: [PATCH][ARM][GCC][10x]: MVE ACLE intrinsics "add with carry across beats" and "beat-wise subtract".
Date: Thu, 14 Nov 2019 19:13:33 +0000
In-Reply-To: <157375666998.31400.16652205595246718910.scripted-patch-series@arm.com>

Hello,

This patch supports the following MVE ACLE "add with carry across beats" and "beat-wise subtract" intrinsics:
vadciq_s32, vadciq_u32, vadciq_m_s32, vadciq_m_u32, vadcq_s32, vadcq_u32,
vadcq_m_s32, vadcq_m_u32, vsbciq_s32, vsbciq_u32, vsbciq_m_s32, vsbciq_m_u32,
vsbcq_s32, vsbcq_u32, vsbcq_m_s32, vsbcq_m_u32.

Please refer to the M-profile Vector Extension (MVE) intrinsics [1] for more
details. A short usage sketch of the new intrinsics is included after the
ChangeLog entries below.

[1] https://developer.arm.com/architectures/instruction-sets/simd-isas/helium/mve-intrinsics

Regression tested on arm-none-eabi and found no regressions.

Ok for trunk? Thanks, Srinath.

gcc/ChangeLog:

2019-11-08  Andre Vieira
            Mihail Ionescu
            Srinath Parvathaneni

        * config/arm/arm_mve.h (vadciq_s32): Define macro.
        (vadciq_u32): Likewise.
        (vadciq_m_s32): Likewise.
        (vadciq_m_u32): Likewise.
        (vadcq_s32): Likewise.
        (vadcq_u32): Likewise.
        (vadcq_m_s32): Likewise.
        (vadcq_m_u32): Likewise.
        (vsbciq_s32): Likewise.
        (vsbciq_u32): Likewise.
        (vsbciq_m_s32): Likewise.
        (vsbciq_m_u32): Likewise.
        (vsbcq_s32): Likewise.
        (vsbcq_u32): Likewise.
        (vsbcq_m_s32): Likewise.
        (vsbcq_m_u32): Likewise.
        (__arm_vadciq_s32): Define intrinsic.
        (__arm_vadciq_u32): Likewise.
        (__arm_vadciq_m_s32): Likewise.
        (__arm_vadciq_m_u32): Likewise.
        (__arm_vadcq_s32): Likewise.
        (__arm_vadcq_u32): Likewise.
        (__arm_vadcq_m_s32): Likewise.
        (__arm_vadcq_m_u32): Likewise.
        (__arm_vsbciq_s32): Likewise.
        (__arm_vsbciq_u32): Likewise.
        (__arm_vsbciq_m_s32): Likewise.
        (__arm_vsbciq_m_u32): Likewise.
        (__arm_vsbcq_s32): Likewise.
        (__arm_vsbcq_u32): Likewise.
        (__arm_vsbcq_m_s32): Likewise.
        (__arm_vsbcq_m_u32): Likewise.
        (vadciq_m): Define polymorphic variant.
        (vadciq): Likewise.
        (vadcq_m): Likewise.
        (vadcq): Likewise.
        (vsbciq_m): Likewise.
        (vsbciq): Likewise.
        (vsbcq_m): Likewise.
        (vsbcq): Likewise.
        * config/arm/arm_mve_builtins.def (BINOP_NONE_NONE_NONE): Use builtin
        qualifier.
        (BINOP_UNONE_UNONE_UNONE): Likewise.
        (QUADOP_NONE_NONE_NONE_NONE_UNONE): Likewise.
        (QUADOP_UNONE_UNONE_UNONE_UNONE_UNONE): Likewise.
        * config/arm/mve.md (VADCIQ): Define iterator.
        (VADCIQ_M): Likewise.
        (VSBCQ): Likewise.
        (VSBCQ_M): Likewise.
        (VSBCIQ): Likewise.
        (VSBCIQ_M): Likewise.
        (VADCQ): Likewise.
        (VADCQ_M): Likewise.
        (mve_vadciq_m_v4si): Define RTL pattern.
        (mve_vadciq_v4si): Likewise.
        (mve_vadcq_m_v4si): Likewise.
        (mve_vadcq_v4si): Likewise.
        (mve_vsbciq_m_v4si): Likewise.
        (mve_vsbciq_v4si): Likewise.
        (mve_vsbcq_m_v4si): Likewise.
        (mve_vsbcq_v4si): Likewise.

gcc/testsuite/ChangeLog:

2019-11-08  Andre Vieira
            Mihail Ionescu
            Srinath Parvathaneni

        * gcc.target/arm/mve/intrinsics/vadciq_m_s32.c: New test.
        * gcc.target/arm/mve/intrinsics/vadciq_m_u32.c: Likewise.
        * gcc.target/arm/mve/intrinsics/vadciq_s32.c: Likewise.
        * gcc.target/arm/mve/intrinsics/vadciq_u32.c: Likewise.
        * gcc.target/arm/mve/intrinsics/vadcq_m_s32.c: Likewise.
        * gcc.target/arm/mve/intrinsics/vadcq_m_u32.c: Likewise.
        * gcc.target/arm/mve/intrinsics/vadcq_s32.c: Likewise.
        * gcc.target/arm/mve/intrinsics/vadcq_u32.c: Likewise.
        * gcc.target/arm/mve/intrinsics/vsbciq_m_s32.c: Likewise.
        * gcc.target/arm/mve/intrinsics/vsbciq_m_u32.c: Likewise.
        * gcc.target/arm/mve/intrinsics/vsbciq_s32.c: Likewise.
        * gcc.target/arm/mve/intrinsics/vsbciq_u32.c: Likewise.
        * gcc.target/arm/mve/intrinsics/vsbcq_m_s32.c: Likewise.
        * gcc.target/arm/mve/intrinsics/vsbcq_m_u32.c: Likewise.
        * gcc.target/arm/mve/intrinsics/vsbcq_s32.c: Likewise.
        * gcc.target/arm/mve/intrinsics/vsbcq_u32.c: Likewise.
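For convenience, here is a minimal usage sketch (not part of the patch) showing how
the unpredicated intrinsics are meant to be chained for multi-precision arithmetic.
It assumes vadciq starts its beat-wise carry chain with a carry-in of zero, while
vadcq takes its carry-in from, and returns its carry-out through, the carry argument,
as the inline implementations in the patch suggest. The function and variable names
(add_u256, a_lo, and so on) are illustrative only, and the snippet is intended to be
built with the same options as the new tests (-march=armv8.1-m.main+mve
-mfloat-abi=hard -O2).

#include "arm_mve.h"

/* Add two 256-bit integers, each held as a pair of uint32x4_t values with
   element 0 assumed to be the least-significant 32-bit limb.  vadciq_u32
   chains the carry across the four beats of the low half and reports the
   final carry through CARRY; vadcq_u32 then consumes that carry for the
   high half and reports the carry-out of the whole 256-bit sum.  */
void
add_u256 (uint32x4_t a_lo, uint32x4_t a_hi, uint32x4_t b_lo, uint32x4_t b_hi,
          uint32x4_t *res_lo, uint32x4_t *res_hi, unsigned *carry_out)
{
  unsigned carry;
  *res_lo = vadciq_u32 (a_lo, b_lo, &carry);  /* Carry-in assumed to be zero.  */
  *res_hi = vadcq_u32 (a_hi, b_hi, &carry);   /* Carry-in read from CARRY.  */
  *carry_out = carry;                         /* Carry-out of the full sum.  */
}

The vsbciq/vsbcq pair is intended to chain beat-wise subtraction in the same
fashion, and the _m variants additionally take an inactive vector and a
predicate, as exercised by the new tests below.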
############### Attachment also inlined for ease of reply ############### diff --git a/gcc/config/arm/arm_mve.h b/gcc/config/arm/arm_mve.h index 31ad3fc5cddfedede02b10e194a426a98bd13024..1704b622c5d6e0abcf814ae1d439bb732f0bd76e 100644 --- a/gcc/config/arm/arm_mve.h +++ b/gcc/config/arm/arm_mve.h @@ -2450,6 +2450,22 @@ typedef struct { uint8x16_t val[4]; } uint8x16x4_t; #define vrev32q_x_f16(__a, __p) __arm_vrev32q_x_f16(__a, __p) #define vrev64q_x_f16(__a, __p) __arm_vrev64q_x_f16(__a, __p) #define vrev64q_x_f32(__a, __p) __arm_vrev64q_x_f32(__a, __p) +#define vadciq_s32(__a, __b, __carry_out) __arm_vadciq_s32(__a, __b, __carry_out) +#define vadciq_u32(__a, __b, __carry_out) __arm_vadciq_u32(__a, __b, __carry_out) +#define vadciq_m_s32(__inactive, __a, __b, __carry_out, __p) __arm_vadciq_m_s32(__inactive, __a, __b, __carry_out, __p) +#define vadciq_m_u32(__inactive, __a, __b, __carry_out, __p) __arm_vadciq_m_u32(__inactive, __a, __b, __carry_out, __p) +#define vadcq_s32(__a, __b, __carry) __arm_vadcq_s32(__a, __b, __carry) +#define vadcq_u32(__a, __b, __carry) __arm_vadcq_u32(__a, __b, __carry) +#define vadcq_m_s32(__inactive, __a, __b, __carry, __p) __arm_vadcq_m_s32(__inactive, __a, __b, __carry, __p) +#define vadcq_m_u32(__inactive, __a, __b, __carry, __p) __arm_vadcq_m_u32(__inactive, __a, __b, __carry, __p) +#define vsbciq_s32(__a, __b, __carry_out) __arm_vsbciq_s32(__a, __b, __carry_out) +#define vsbciq_u32(__a, __b, __carry_out) __arm_vsbciq_u32(__a, __b, __carry_out) +#define vsbciq_m_s32(__inactive, __a, __b, __carry_out, __p) __arm_vsbciq_m_s32(__inactive, __a, __b, __carry_out, __p) +#define vsbciq_m_u32(__inactive, __a, __b, __carry_out, __p) __arm_vsbciq_m_u32(__inactive, __a, __b, __carry_out, __p) +#define vsbcq_s32(__a, __b, __carry) __arm_vsbcq_s32(__a, __b, __carry) +#define vsbcq_u32(__a, __b, __carry) __arm_vsbcq_u32(__a, __b, __carry) +#define vsbcq_m_s32(__inactive, __a, __b, __carry, __p) __arm_vsbcq_m_s32(__inactive, __a, __b, __carry, __p) +#define vsbcq_m_u32(__inactive, __a, __b, __carry, __p) __arm_vsbcq_m_u32(__inactive, __a, __b, __carry, __p) #endif __extension__ extern __inline void @@ -15917,6 +15933,158 @@ __arm_vshrq_x_n_u32 (uint32x4_t __a, const int __imm, mve_pred16_t __p) return __builtin_mve_vshrq_m_n_uv4si (vuninitializedq_u32 (), __a, __imm, __p); } +__extension__ extern __inline int32x4_t +__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) +__arm_vadciq_s32 (int32x4_t __a, int32x4_t __b, unsigned * __carry_out) +{ + int32x4_t __res = __builtin_mve_vadciq_sv4si (__a, __b); + *__carry_out = (__builtin_arm_get_fpscr () >> 29) & 0x1u; + return __res; +} + +__extension__ extern __inline uint32x4_t +__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) +__arm_vadciq_u32 (uint32x4_t __a, uint32x4_t __b, unsigned * __carry_out) +{ + uint32x4_t __res = __builtin_mve_vadciq_uv4si (__a, __b); + *__carry_out = (__builtin_arm_get_fpscr () >> 29) & 0x1u; + return __res; +} + +__extension__ extern __inline int32x4_t +__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) +__arm_vadciq_m_s32 (int32x4_t __inactive, int32x4_t __a, int32x4_t __b, unsigned * __carry_out, mve_pred16_t __p) +{ + int32x4_t __res = __builtin_mve_vadciq_m_sv4si (__inactive, __a, __b, __p); + *__carry_out = (__builtin_arm_get_fpscr () >> 29) & 0x1u; + return __res; +} + +__extension__ extern __inline uint32x4_t +__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) +__arm_vadciq_m_u32 (uint32x4_t __inactive, uint32x4_t __a, 
uint32x4_t __b, unsigned * __carry_out, mve_pred16_t __p) +{ + uint32x4_t __res = __builtin_mve_vadciq_m_uv4si (__inactive, __a, __b, __p); + *__carry_out = (__builtin_arm_get_fpscr () >> 29) & 0x1u; + return __res; +} + +__extension__ extern __inline int32x4_t +__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) +__arm_vadcq_s32 (int32x4_t __a, int32x4_t __b, unsigned * __carry) +{ + __builtin_arm_set_fpscr((__builtin_arm_get_fpscr () & ~0x20000000u) | (*__carry << 29)); + int32x4_t __res = __builtin_mve_vadcq_sv4si (__a, __b); + *__carry = (__builtin_arm_get_fpscr () >> 29) & 0x1u; + return __res; +} + +__extension__ extern __inline uint32x4_t +__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) +__arm_vadcq_u32 (uint32x4_t __a, uint32x4_t __b, unsigned * __carry) +{ + __builtin_arm_set_fpscr((__builtin_arm_get_fpscr () & ~0x20000000u) | (*__carry << 29)); + uint32x4_t __res = __builtin_mve_vadcq_uv4si (__a, __b); + *__carry = (__builtin_arm_get_fpscr () >> 29) & 0x1u; + return __res; +} + +__extension__ extern __inline int32x4_t +__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) +__arm_vadcq_m_s32 (int32x4_t __inactive, int32x4_t __a, int32x4_t __b, unsigned * __carry, mve_pred16_t __p) +{ + __builtin_arm_set_fpscr((__builtin_arm_get_fpscr () & ~0x20000000u) | (*__carry << 29)); + int32x4_t __res = __builtin_mve_vadcq_m_sv4si (__inactive, __a, __b, __p); + *__carry = (__builtin_arm_get_fpscr () >> 29) & 0x1u; + return __res; +} + +__extension__ extern __inline uint32x4_t +__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) +__arm_vadcq_m_u32 (uint32x4_t __inactive, uint32x4_t __a, uint32x4_t __b, unsigned * __carry, mve_pred16_t __p) +{ + __builtin_arm_set_fpscr((__builtin_arm_get_fpscr () & ~0x20000000u) | (*__carry << 29)); + uint32x4_t __res = __builtin_mve_vadcq_m_uv4si (__inactive, __a, __b, __p); + *__carry = (__builtin_arm_get_fpscr () >> 29) & 0x1u; + return __res; +} + +__extension__ extern __inline int32x4_t +__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) +__arm_vsbciq_s32 (int32x4_t __a, int32x4_t __b, unsigned * __carry_out) +{ + int32x4_t __res = __builtin_mve_vsbciq_sv4si (__a, __b); + *__carry_out = (__builtin_arm_get_fpscr () >> 29) & 0x1u; + return __res; +} + +__extension__ extern __inline uint32x4_t +__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) +__arm_vsbciq_u32 (uint32x4_t __a, uint32x4_t __b, unsigned * __carry_out) +{ + uint32x4_t __res = __builtin_mve_vsbciq_uv4si (__a, __b); + *__carry_out = (__builtin_arm_get_fpscr () >> 29) & 0x1u; + return __res; +} + +__extension__ extern __inline int32x4_t +__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) +__arm_vsbciq_m_s32 (int32x4_t __inactive, int32x4_t __a, int32x4_t __b, unsigned * __carry_out, mve_pred16_t __p) +{ + int32x4_t __res = __builtin_mve_vsbciq_m_sv4si (__inactive, __a, __b, __p); + *__carry_out = (__builtin_arm_get_fpscr () >> 29) & 0x1u; + return __res; +} + +__extension__ extern __inline uint32x4_t +__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) +__arm_vsbciq_m_u32 (uint32x4_t __inactive, uint32x4_t __a, uint32x4_t __b, unsigned * __carry_out, mve_pred16_t __p) +{ + uint32x4_t __res = __builtin_mve_vsbciq_m_uv4si (__inactive, __a, __b, __p); + *__carry_out = (__builtin_arm_get_fpscr () >> 29) & 0x1u; + return __res; +} + +__extension__ extern __inline int32x4_t +__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) +__arm_vsbcq_s32 (int32x4_t 
__a, int32x4_t __b, unsigned * __carry) +{ + __builtin_arm_set_fpscr((__builtin_arm_get_fpscr () & ~0x20000000u) | (*__carry << 29)); + int32x4_t __res = __builtin_mve_vsbcq_sv4si (__a, __b); + *__carry = (__builtin_arm_get_fpscr () >> 29) & 0x1u; + return __res; +} + +__extension__ extern __inline uint32x4_t +__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) +__arm_vsbcq_u32 (uint32x4_t __a, uint32x4_t __b, unsigned * __carry) +{ + __builtin_arm_set_fpscr((__builtin_arm_get_fpscr () & ~0x20000000u) | (*__carry << 29)); + uint32x4_t __res = __builtin_mve_vsbcq_uv4si (__a, __b); + *__carry = (__builtin_arm_get_fpscr () >> 29) & 0x1u; + return __res; +} + +__extension__ extern __inline int32x4_t +__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) +__arm_vsbcq_m_s32 (int32x4_t __inactive, int32x4_t __a, int32x4_t __b, unsigned * __carry, mve_pred16_t __p) +{ + __builtin_arm_set_fpscr((__builtin_arm_get_fpscr () & ~0x20000000u) | (*__carry << 29)); + int32x4_t __res = __builtin_mve_vsbcq_m_sv4si (__inactive, __a, __b, __p); + *__carry = (__builtin_arm_get_fpscr () >> 29) & 0x1u; + return __res; +} + +__extension__ extern __inline uint32x4_t +__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) +__arm_vsbcq_m_u32 (uint32x4_t __inactive, uint32x4_t __a, uint32x4_t __b, unsigned * __carry, mve_pred16_t __p) +{ + __builtin_arm_set_fpscr((__builtin_arm_get_fpscr () & ~0x20000000u) | (*__carry << 29)); + uint32x4_t __res = __builtin_mve_vsbcq_m_uv4si (__inactive, __a, __b, __p); + *__carry = (__builtin_arm_get_fpscr () >> 29) & 0x1u; + return __res; +} + #if (__ARM_FEATURE_MVE & 2) /* MVE Floating point. */ __extension__ extern __inline void @@ -25552,6 +25720,65 @@ extern void *__ARM_undef; int (*)[__ARM_mve_type_uint16x8_t]: __arm_vshrq_x_n_u16 (__ARM_mve_coerce(__p1, uint16x8_t), p2, p3), \ int (*)[__ARM_mve_type_uint32x4_t]: __arm_vshrq_x_n_u32 (__ARM_mve_coerce(__p1, uint32x4_t), p2, p3));}) +#define vadciq_m(p0,p1,p2,p3,p4) __arm_vadciq_m(p0,p1,p2,p3,p4) +#define __arm_vadciq_m(p0,p1,p2,p3,p4) ({ __typeof(p0) __p0 = (p0); \ + __typeof(p1) __p1 = (p1); \ + __typeof(p2) __p2 = (p2); \ + _Generic( (int (*)[__ARM_mve_typeid(__p0)][__ARM_mve_typeid(__p1)][__ARM_mve_typeid(__p2)])0, \ + int (*)[__ARM_mve_type_int32x4_t][__ARM_mve_type_int32x4_t][__ARM_mve_type_int32x4_t]: __arm_vadciq_m_s32 (__ARM_mve_coerce(__p0, int32x4_t), __ARM_mve_coerce(__p1, int32x4_t), __ARM_mve_coerce(__p2, int32x4_t), p3, p4), \ + int (*)[__ARM_mve_type_uint32x4_t][__ARM_mve_type_uint32x4_t][__ARM_mve_type_uint32x4_t]: __arm_vadciq_m_u32 (__ARM_mve_coerce(__p0, uint32x4_t), __ARM_mve_coerce(__p1, uint32x4_t), __ARM_mve_coerce(__p2, uint32x4_t), p3, p4));}) + +#define vadciq(p0,p1,p2) __arm_vadciq(p0,p1,p2) +#define __arm_vadciq(p0,p1,p2) ({ __typeof(p0) __p0 = (p0); \ + __typeof(p1) __p1 = (p1); \ + _Generic( (int (*)[__ARM_mve_typeid(__p0)][__ARM_mve_typeid(__p1)])0, \ + int (*)[__ARM_mve_type_int32x4_t][__ARM_mve_type_int32x4_t]: __arm_vadciq_s32 (__ARM_mve_coerce(__p0, int32x4_t), __ARM_mve_coerce(__p1, int32x4_t), p2), \ + int (*)[__ARM_mve_type_uint32x4_t][__ARM_mve_type_uint32x4_t]: __arm_vadciq_u32 (__ARM_mve_coerce(__p0, uint32x4_t), __ARM_mve_coerce(__p1, uint32x4_t), p2));}) + +#define vadcq_m(p0,p1,p2,p3,p4) __arm_vadcq_m(p0,p1,p2,p3,p4) +#define __arm_vadcq_m(p0,p1,p2,p3,p4) ({ __typeof(p0) __p0 = (p0); \ + __typeof(p1) __p1 = (p1); \ + __typeof(p2) __p2 = (p2); \ + _Generic( (int (*)[__ARM_mve_typeid(__p0)][__ARM_mve_typeid(__p1)][__ARM_mve_typeid(__p2)])0, \ + 
int (*)[__ARM_mve_type_int32x4_t][__ARM_mve_type_int32x4_t][__ARM_mve_type_int32x4_t]: __arm_vadcq_m_s32 (__ARM_mve_coerce(__p0, int32x4_t), __ARM_mve_coerce(__p1, int32x4_t), __ARM_mve_coerce(__p2, int32x4_t), p3, p4), \ + int (*)[__ARM_mve_type_uint32x4_t][__ARM_mve_type_uint32x4_t][__ARM_mve_type_uint32x4_t]: __arm_vadcq_m_u32 (__ARM_mve_coerce(__p0, uint32x4_t), __ARM_mve_coerce(__p1, uint32x4_t), __ARM_mve_coerce(__p2, uint32x4_t), p3, p4));}) + +#define vadcq(p0,p1,p2) __arm_vadcq(p0,p1,p2) +#define __arm_vadcq(p0,p1,p2) ({ __typeof(p0) __p0 = (p0); \ + __typeof(p1) __p1 = (p1); \ + _Generic( (int (*)[__ARM_mve_typeid(__p0)][__ARM_mve_typeid(__p1)])0, \ + int (*)[__ARM_mve_type_int32x4_t][__ARM_mve_type_int32x4_t]: __arm_vadcq_s32 (__ARM_mve_coerce(__p0, int32x4_t), __ARM_mve_coerce(__p1, int32x4_t), p2), \ + int (*)[__ARM_mve_type_uint32x4_t][__ARM_mve_type_uint32x4_t]: __arm_vadcq_u32 (__ARM_mve_coerce(__p0, uint32x4_t), __ARM_mve_coerce(__p1, uint32x4_t), p2));}) + +#define vsbciq_m(p0,p1,p2,p3,p4) __arm_vsbciq_m(p0,p1,p2,p3,p4) +#define __arm_vsbciq_m(p0,p1,p2,p3,p4) ({ __typeof(p0) __p0 = (p0); \ + __typeof(p1) __p1 = (p1); \ + __typeof(p2) __p2 = (p2); \ + _Generic( (int (*)[__ARM_mve_typeid(__p0)][__ARM_mve_typeid(__p1)][__ARM_mve_typeid(__p2)])0, \ + int (*)[__ARM_mve_type_int32x4_t][__ARM_mve_type_int32x4_t][__ARM_mve_type_int32x4_t]: __arm_vsbciq_m_s32 (__ARM_mve_coerce(__p0, int32x4_t), __ARM_mve_coerce(__p1, int32x4_t), __ARM_mve_coerce(__p2, int32x4_t), p3, p4), \ + int (*)[__ARM_mve_type_uint32x4_t][__ARM_mve_type_uint32x4_t][__ARM_mve_type_uint32x4_t]: __arm_vsbciq_m_u32 (__ARM_mve_coerce(__p0, uint32x4_t), __ARM_mve_coerce(__p1, uint32x4_t), __ARM_mve_coerce(__p2, uint32x4_t), p3, p4));}) + +#define vsbciq(p0,p1,p2) __arm_vsbciq(p0,p1,p2) +#define __arm_vsbciq(p0,p1,p2) ({ __typeof(p0) __p0 = (p0); \ + __typeof(p1) __p1 = (p1); \ + _Generic( (int (*)[__ARM_mve_typeid(__p0)][__ARM_mve_typeid(__p1)])0, \ + int (*)[__ARM_mve_type_int32x4_t][__ARM_mve_type_int32x4_t]: __arm_vsbciq_s32 (__ARM_mve_coerce(__p0, int32x4_t), __ARM_mve_coerce(__p1, int32x4_t), p2), \ + int (*)[__ARM_mve_type_uint32x4_t][__ARM_mve_type_uint32x4_t]: __arm_vsbciq_u32 (__ARM_mve_coerce(__p0, uint32x4_t), __ARM_mve_coerce(__p1, uint32x4_t), p2));}) + +#define vsbcq_m(p0,p1,p2,p3,p4) __arm_vsbcq_m(p0,p1,p2,p3,p4) +#define __arm_vsbcq_m(p0,p1,p2,p3,p4) ({ __typeof(p0) __p0 = (p0); \ + __typeof(p1) __p1 = (p1); \ + __typeof(p2) __p2 = (p2); \ + _Generic( (int (*)[__ARM_mve_typeid(__p0)][__ARM_mve_typeid(__p1)][__ARM_mve_typeid(__p2)])0, \ + int (*)[__ARM_mve_type_int32x4_t][__ARM_mve_type_int32x4_t][__ARM_mve_type_int32x4_t]: __arm_vsbcq_m_s32 (__ARM_mve_coerce(__p0, int32x4_t), __ARM_mve_coerce(__p1, int32x4_t), __ARM_mve_coerce(__p2, int32x4_t), p3, p4), \ + int (*)[__ARM_mve_type_uint32x4_t][__ARM_mve_type_uint32x4_t][__ARM_mve_type_uint32x4_t]: __arm_vsbcq_m_u32 (__ARM_mve_coerce(__p0, uint32x4_t), __ARM_mve_coerce(__p1, uint32x4_t), __ARM_mve_coerce(__p2, uint32x4_t), p3, p4));}) + +#define vsbcq(p0,p1,p2) __arm_vsbcq(p0,p1,p2) +#define __arm_vsbcq(p0,p1,p2) ({ __typeof(p0) __p0 = (p0); \ + __typeof(p1) __p1 = (p1); \ + _Generic( (int (*)[__ARM_mve_typeid(__p0)][__ARM_mve_typeid(__p1)])0, \ + int (*)[__ARM_mve_type_int32x4_t][__ARM_mve_type_int32x4_t]: __arm_vsbcq_s32 (__ARM_mve_coerce(__p0, int32x4_t), __ARM_mve_coerce(__p1, int32x4_t), p2), \ + int (*)[__ARM_mve_type_uint32x4_t][__ARM_mve_type_uint32x4_t]: __arm_vsbcq_u32 (__ARM_mve_coerce(__p0, uint32x4_t), __ARM_mve_coerce(__p1, uint32x4_t), 
p2));}) #endif /* MVE Floating point. */ diff --git a/gcc/config/arm/arm_mve_builtins.def b/gcc/config/arm/arm_mve_builtins.def index b77335cff133872558b48b5574dccc0f17df9ed1..a413b38676f2f102c16fdf2147f3b8a4d8ec47b4 100644 --- a/gcc/config/arm/arm_mve_builtins.def +++ b/gcc/config/arm/arm_mve_builtins.def @@ -857,3 +857,19 @@ VAR1 (LDRGBWBS_Z, vldrdq_gather_base_wb_z_s, v2di) VAR1 (LDRGBWBS, vldrwq_gather_base_wb_s, v4si) VAR1 (LDRGBWBS, vldrwq_gather_base_wb_f, v4sf) VAR1 (LDRGBWBS, vldrdq_gather_base_wb_s, v2di) +VAR1 (BINOP_NONE_NONE_NONE, vadciq_s, v4si) +VAR1 (BINOP_UNONE_UNONE_UNONE, vadciq_u, v4si) +VAR1 (BINOP_NONE_NONE_NONE, vadcq_s, v4si) +VAR1 (BINOP_UNONE_UNONE_UNONE, vadcq_u, v4si) +VAR1 (BINOP_NONE_NONE_NONE, vsbciq_s, v4si) +VAR1 (BINOP_UNONE_UNONE_UNONE, vsbciq_u, v4si) +VAR1 (BINOP_NONE_NONE_NONE, vsbcq_s, v4si) +VAR1 (BINOP_UNONE_UNONE_UNONE, vsbcq_u, v4si) +VAR1 (QUADOP_NONE_NONE_NONE_NONE_UNONE, vadciq_m_s, v4si) +VAR1 (QUADOP_UNONE_UNONE_UNONE_UNONE_UNONE, vadciq_m_u, v4si) +VAR1 (QUADOP_NONE_NONE_NONE_NONE_UNONE, vadcq_m_s, v4si) +VAR1 (QUADOP_UNONE_UNONE_UNONE_UNONE_UNONE, vadcq_m_u, v4si) +VAR1 (QUADOP_NONE_NONE_NONE_NONE_UNONE, vsbciq_m_s, v4si) +VAR1 (QUADOP_UNONE_UNONE_UNONE_UNONE_UNONE, vsbciq_m_u, v4si) +VAR1 (QUADOP_NONE_NONE_NONE_NONE_UNONE, vsbcq_m_s, v4si) +VAR1 (QUADOP_UNONE_UNONE_UNONE_UNONE_UNONE, vsbcq_m_u, v4si) diff --git a/gcc/config/arm/mve.md b/gcc/config/arm/mve.md index a938e0922f8dc6749dc7192961ae2091d666c6e7..8ff69094378396830ef31d9e2ca9db71c58aefab 100644 --- a/gcc/config/arm/mve.md +++ b/gcc/config/arm/mve.md @@ -211,7 +211,10 @@ VDWDUPQ_M VIDUPQ VIDUPQ_M VIWDUPQ VIWDUPQ_M VSTRWQSBWB_S VSTRWQSBWB_U VLDRWQGBWB_S VLDRWQGBWB_U VSTRWQSBWB_F VLDRWQGBWB_F VSTRDQSBWB_S VSTRDQSBWB_U - VLDRDQGBWB_S VLDRDQGBWB_U]) + VLDRDQGBWB_S VLDRDQGBWB_U VADCQ_U VADCQ_M_U VADCQ_S + VADCQ_M_S VSBCIQ_U VSBCIQ_S VSBCIQ_M_U VSBCIQ_M_S + VSBCQ_U VSBCQ_S VSBCQ_M_U VSBCQ_M_S VADCIQ_U VADCIQ_M_U + VADCIQ_S VADCIQ_M_S]) (define_mode_attr MVE_CNVT [(V8HI "V8HF") (V4SI "V4SF") (V8HF "V8HI") (V4SF "V4SI")]) @@ -382,8 +385,13 @@ (VSTRWQSO_U "u") (VSTRWQSO_S "s") (VSTRWQSSO_U "u") (VSTRWQSSO_S "s") (VSTRWQSBWB_S "s") (VSTRWQSBWB_U "u") (VLDRWQGBWB_S "s") (VLDRWQGBWB_U "u") (VLDRDQGBWB_S "s") - (VLDRDQGBWB_U "u") (VSTRDQSBWB_S "s") - (VSTRDQSBWB_U "u")]) + (VLDRDQGBWB_U "u") (VSTRDQSBWB_S "s") (VADCQ_M_S "s") + (VSTRDQSBWB_U "u") (VSBCQ_U "u") (VSBCQ_M_U "u") + (VSBCQ_S "s") (VSBCQ_M_S "s") (VSBCIQ_U "u") + (VSBCIQ_M_U "u") (VSBCIQ_S "s") (VSBCIQ_M_S "s") + (VADCQ_U "u") (VADCQ_M_U "u") (VADCQ_S "s") + (VADCIQ_U "u") (VADCIQ_M_U "u") (VADCIQ_S "s") + (VADCIQ_M_S "s")]) (define_int_attr mode1 [(VCTP8Q "8") (VCTP16Q "16") (VCTP32Q "32") (VCTP64Q "64") (VCTP8Q_M "8") (VCTP16Q_M "16") @@ -636,6 +644,15 @@ (define_int_iterator VLDRWGBWBQ [VLDRWQGBWB_S VLDRWQGBWB_U]) (define_int_iterator VSTRDSBWBQ [VSTRDQSBWB_S VSTRDQSBWB_U]) (define_int_iterator VLDRDGBWBQ [VLDRDQGBWB_S VLDRDQGBWB_U]) +(define_int_iterator VADCIQ [VADCIQ_U VADCIQ_S]) +(define_int_iterator VADCIQ_M [VADCIQ_M_U VADCIQ_M_S]) +(define_int_iterator VSBCQ [VSBCQ_U VSBCQ_S]) +(define_int_iterator VSBCQ_M [VSBCQ_M_U VSBCQ_M_S]) +(define_int_iterator VSBCIQ [VSBCIQ_U VSBCIQ_S]) +(define_int_iterator VSBCIQ_M [VSBCIQ_M_U VSBCIQ_M_S]) +(define_int_iterator VADCQ [VADCQ_U VADCQ_S]) +(define_int_iterator VADCQ_M [VADCQ_M_U VADCQ_M_S]) + (define_insn "*mve_mov" [(set (match_operand:MVE_types 0 "s_register_operand" "=w,w,r,w,w,r,w") @@ -10614,3 +10631,147 @@ return ""; } [(set_attr "length" "8")]) +;; +;; [vadciq_m_s, vadciq_m_u]) 
+;; +(define_insn "mve_vadciq_m_v4si" + [(set (match_operand:V4SI 0 "s_register_operand" "=w") + (unspec:V4SI [(match_operand:V4SI 1 "s_register_operand" "0") + (match_operand:V4SI 2 "s_register_operand" "w") + (match_operand:V4SI 3 "s_register_operand" "w") + (match_operand:HI 4 "vpr_register_operand" "Up")] + VADCIQ_M)) + (set (reg:SI VFPCC_REGNUM) + (unspec:SI [(const_int 0)] + VADCIQ_M)) + ] + "TARGET_HAVE_MVE" + "vpst\;vadcit.i32\t%q0, %q2, %q3" + [(set_attr "type" "mve_move") + (set_attr "length" "8")]) + +;; +;; [vadciq_u, vadciq_s]) +;; +(define_insn "mve_vadciq_v4si" + [(set (match_operand:V4SI 0 "s_register_operand" "=w") + (unspec:V4SI [(match_operand:V4SI 1 "s_register_operand" "w") + (match_operand:V4SI 2 "s_register_operand" "w")] + VADCIQ)) + (set (reg:SI VFPCC_REGNUM) + (unspec:SI [(const_int 0)] + VADCIQ)) + ] + "TARGET_HAVE_MVE" + "vadci.i32\t%q0, %q1, %q2" + [(set_attr "type" "mve_move") + (set_attr "length" "4")]) + +;; +;; [vadcq_m_s, vadcq_m_u]) +;; +(define_insn "mve_vadcq_m_v4si" + [(set (match_operand:V4SI 0 "s_register_operand" "=w") + (unspec:V4SI [(match_operand:V4SI 1 "s_register_operand" "0") + (match_operand:V4SI 2 "s_register_operand" "w") + (match_operand:V4SI 3 "s_register_operand" "w") + (match_operand:HI 4 "vpr_register_operand" "Up")] + VADCQ_M)) + (set (reg:SI VFPCC_REGNUM) + (unspec:SI [(reg:SI VFPCC_REGNUM)] + VADCQ_M)) + ] + "TARGET_HAVE_MVE" + "vpst\;vadct.i32\t%q0, %q2, %q3" + [(set_attr "type" "mve_move") + (set_attr "length" "8")]) + +;; +;; [vadcq_u, vadcq_s]) +;; +(define_insn "mve_vadcq_v4si" + [(set (match_operand:V4SI 0 "s_register_operand" "=w") + (unspec:V4SI [(match_operand:V4SI 1 "s_register_operand" "w") + (match_operand:V4SI 2 "s_register_operand" "w")] + VADCQ)) + (set (reg:SI VFPCC_REGNUM) + (unspec:SI [(reg:SI VFPCC_REGNUM)] + VADCQ)) + ] + "TARGET_HAVE_MVE" + "vadc.i32\t%q0, %q1, %q2" + [(set_attr "type" "mve_move") + (set_attr "length" "4") + (set_attr "conds" "set")]) + +;; +;; [vsbciq_m_u, vsbciq_m_s]) +;; +(define_insn "mve_vsbciq_m_v4si" + [(set (match_operand:V4SI 0 "s_register_operand" "=w") + (unspec:V4SI [(match_operand:V4SI 1 "s_register_operand" "w") + (match_operand:V4SI 2 "s_register_operand" "w") + (match_operand:V4SI 3 "s_register_operand" "w") + (match_operand:HI 4 "vpr_register_operand" "Up")] + VSBCIQ_M)) + (set (reg:SI VFPCC_REGNUM) + (unspec:SI [(const_int 0)] + VSBCIQ_M)) + ] + "TARGET_HAVE_MVE" + "vpst\;vsbcit.i32\t%q0, %q2, %q3" + [(set_attr "type" "mve_move") + (set_attr "length" "8")]) + +;; +;; [vsbciq_s, vsbciq_u]) +;; +(define_insn "mve_vsbciq_v4si" + [(set (match_operand:V4SI 0 "s_register_operand" "=w") + (unspec:V4SI [(match_operand:V4SI 1 "s_register_operand" "w") + (match_operand:V4SI 2 "s_register_operand" "w")] + VSBCIQ)) + (set (reg:SI VFPCC_REGNUM) + (unspec:SI [(const_int 0)] + VSBCIQ)) + ] + "TARGET_HAVE_MVE" + "vsbci.i32\t%q0, %q1, %q2" + [(set_attr "type" "mve_move") + (set_attr "length" "4")]) + +;; +;; [vsbcq_m_u, vsbcq_m_s]) +;; +(define_insn "mve_vsbcq_m_v4si" + [(set (match_operand:V4SI 0 "s_register_operand" "=w") + (unspec:V4SI [(match_operand:V4SI 1 "s_register_operand" "w") + (match_operand:V4SI 2 "s_register_operand" "w") + (match_operand:V4SI 3 "s_register_operand" "w") + (match_operand:HI 4 "vpr_register_operand" "Up")] + VSBCQ_M)) + (set (reg:SI VFPCC_REGNUM) + (unspec:SI [(reg:SI VFPCC_REGNUM)] + VSBCQ_M)) + ] + "TARGET_HAVE_MVE" + "vpst\;vsbct.i32\t%q0, %q2, %q3" + [(set_attr "type" "mve_move") + (set_attr "length" "8")]) + +;; +;; [vsbcq_s, vsbcq_u]) +;; +(define_insn 
"mve_vsbcq_v4si" + [(set (match_operand:V4SI 0 "s_register_operand" "=w") + (unspec:V4SI [(match_operand:V4SI 1 "s_register_operand" "w") + (match_operand:V4SI 2 "s_register_operand" "w")] + VSBCQ)) + (set (reg:SI VFPCC_REGNUM) + (unspec:SI [(reg:SI VFPCC_REGNUM)] + VSBCQ)) + ] + "TARGET_HAVE_MVE" + "vsbc.i32\t%q0, %q1, %q2" + [(set_attr "type" "mve_move") + (set_attr "length" "4")]) diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vadciq_m_s32.c b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vadciq_m_s32.c new file mode 100644 index 0000000000000000000000000000000000000000..51cedf3c5421241b0a2a1d8473e07172669a2043 --- /dev/null +++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vadciq_m_s32.c @@ -0,0 +1,26 @@ +/* { dg-do compile } */ +/* { dg-additional-options "-march=armv8.1-m.main+mve -mfloat-abi=hard -O2" } */ +/* { dg-skip-if "Skip if not auto" {*-*-*} {"-mfpu=*"} {"-mfpu=auto"} } */ + +#include "arm_mve.h" + +int32x4_t +foo (int32x4_t inactive, int32x4_t a, int32x4_t b, unsigned * carry_out, mve_pred16_t p) +{ + return vadciq_m_s32 (inactive, a, b, carry_out, p); +} + +/* { dg-final { scan-assembler "vpst" } } */ +/* { dg-final { scan-assembler "vadcit.i32" } } */ +/* { dg-final { scan-assembler "vpst" } } */ + +int32x4_t +foo1 (int32x4_t inactive, int32x4_t a, int32x4_t b, unsigned * carry_out, mve_pred16_t p) +{ + return vadciq_m (inactive, a, b, carry_out, p); +} + +/* { dg-final { scan-assembler "vpst" } } */ +/* { dg-final { scan-assembler "vadcit.i32" } } */ +/* { dg-final { scan-assembler "vpst" } } */ +/* { dg-final { scan-assembler-times "mrc" 2 } } */ diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vadciq_m_u32.c b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vadciq_m_u32.c new file mode 100644 index 0000000000000000000000000000000000000000..b46f0a7d44d8bf77105ffcfc59cbb24aa4eaa658 --- /dev/null +++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vadciq_m_u32.c @@ -0,0 +1,26 @@ +/* { dg-do compile } */ +/* { dg-additional-options "-march=armv8.1-m.main+mve -mfloat-abi=hard -O2" } */ +/* { dg-skip-if "Skip if not auto" {*-*-*} {"-mfpu=*"} {"-mfpu=auto"} } */ + +#include "arm_mve.h" + +uint32x4_t +foo (uint32x4_t inactive, uint32x4_t a, uint32x4_t b, unsigned * carry_out, mve_pred16_t p) +{ + return vadciq_m_u32 (inactive, a, b, carry_out, p); +} + +/* { dg-final { scan-assembler "vpst" } } */ +/* { dg-final { scan-assembler "vadcit.i32" } } */ +/* { dg-final { scan-assembler "vpst" } } */ + +uint32x4_t +foo1 (uint32x4_t inactive, uint32x4_t a, uint32x4_t b, unsigned * carry_out, mve_pred16_t p) +{ + return vadciq_m (inactive, a, b, carry_out, p); +} + +/* { dg-final { scan-assembler "vpst" } } */ +/* { dg-final { scan-assembler "vadcit.i32" } } */ +/* { dg-final { scan-assembler "vpst" } } */ +/* { dg-final { scan-assembler-times "mrc" 2 } } */ diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vadciq_s32.c b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vadciq_s32.c new file mode 100644 index 0000000000000000000000000000000000000000..124bff6dbb225edd156390411db2b4e18643f256 --- /dev/null +++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vadciq_s32.c @@ -0,0 +1,22 @@ +/* { dg-do compile } */ +/* { dg-additional-options "-march=armv8.1-m.main+mve -mfloat-abi=hard -O2" } */ +/* { dg-skip-if "Skip if not auto" {*-*-*} {"-mfpu=*"} {"-mfpu=auto"} } */ + +#include "arm_mve.h" + +int32x4_t +foo (int32x4_t a, int32x4_t b, unsigned * carry_out) +{ + return vadciq_s32 (a, b, carry_out); +} + +/* { dg-final { scan-assembler "vadci.i32" } } */ + +int32x4_t +foo1 
(int32x4_t a, int32x4_t b, unsigned * carry_out) +{ + return vadciq (a, b, carry_out); +} + +/* { dg-final { scan-assembler "vadci.i32" } } */ +/* { dg-final { scan-assembler-times "mrc" 2 } } */ diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vadciq_u32.c b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vadciq_u32.c new file mode 100644 index 0000000000000000000000000000000000000000..0718570130caa4cb88085f8a11ab92a72522cec7 --- /dev/null +++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vadciq_u32.c @@ -0,0 +1,22 @@ +/* { dg-do compile } */ +/* { dg-additional-options "-march=armv8.1-m.main+mve -mfloat-abi=hard -O2" } */ +/* { dg-skip-if "Skip if not auto" {*-*-*} {"-mfpu=*"} {"-mfpu=auto"} } */ + +#include "arm_mve.h" + +uint32x4_t +foo (uint32x4_t a, uint32x4_t b, unsigned * carry_out) +{ + return vadciq_u32 (a, b, carry_out); +} + +/* { dg-final { scan-assembler "vadci.i32" } } */ + +uint32x4_t +foo1 (uint32x4_t a, uint32x4_t b, unsigned * carry_out) +{ + return vadciq (a, b, carry_out); +} + +/* { dg-final { scan-assembler "vadci.i32" } } */ +/* { dg-final { scan-assembler-times "mrc" 2 } } */ diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vadcq_m_s32.c b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vadcq_m_s32.c new file mode 100644 index 0000000000000000000000000000000000000000..1c2a928d52ff04c7341f043c15123386453c66d5 --- /dev/null +++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vadcq_m_s32.c @@ -0,0 +1,28 @@ +/* { dg-do compile } */ +/* { dg-additional-options "-march=armv8.1-m.main+mve -mfloat-abi=hard -O2" } */ +/* { dg-skip-if "Skip if not auto" {*-*-*} {"-mfpu=*"} {"-mfpu=auto"} } */ + +#include "arm_mve.h" + +int32x4_t +foo (int32x4_t inactive, int32x4_t a, int32x4_t b, unsigned * carry, mve_pred16_t p) +{ + return vadcq_m_s32 (inactive, a, b, carry, p); +} + +/* { dg-final { scan-assembler "vpst" } } */ +/* { dg-final { scan-assembler "vadct.i32" } } */ +/* { dg-final { scan-assembler "vpst" } } */ + +int32x4_t +foo1 (int32x4_t inactive, int32x4_t a, int32x4_t b, unsigned * carry, mve_pred16_t p) +{ + return vadcq_m (inactive, a, b, carry, p); +} + +/* { dg-final { scan-assembler "vpst" } } */ +/* { dg-final { scan-assembler "vadct.i32" } } */ +/* { dg-final { scan-assembler "vpst" } } */ +/* { dg-final { scan-assembler-times "mrc" 4 } } */ +/* { dg-final { scan-assembler "vpst" } } */ +/* { dg-final { scan-assembler-times "mcr" 2 } } */ diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vadcq_m_u32.c b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vadcq_m_u32.c new file mode 100644 index 0000000000000000000000000000000000000000..af38e01b7c8fd180311afbb6ee05c758ec2e636e --- /dev/null +++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vadcq_m_u32.c @@ -0,0 +1,29 @@ +/* { dg-do compile } */ +/* { dg-additional-options "-march=armv8.1-m.main+mve -mfloat-abi=hard -O2" } */ +/* { dg-skip-if "Skip if not auto" {*-*-*} {"-mfpu=*"} {"-mfpu=auto"} } */ + +#include "arm_mve.h" + +uint32x4_t +foo (uint32x4_t inactive, uint32x4_t a, uint32x4_t b, unsigned * carry, mve_pred16_t p) +{ + return vadcq_m_u32 (inactive, a, b, carry, p); +} + +/* { dg-final { scan-assembler "vpst" } } */ +/* { dg-final { scan-assembler "vadct.i32" } } */ +/* { dg-final { scan-assembler "vpst" } } */ +/* { dg-final { scan-assembler "vpst" } } */ + +uint32x4_t +foo1 (uint32x4_t inactive, uint32x4_t a, uint32x4_t b, unsigned * carry, mve_pred16_t p) +{ + return vadcq_m (inactive, a, b, carry, p); +} + +/* { dg-final { scan-assembler "vpst" } } */ +/* { dg-final { scan-assembler "vadct.i32" } } 
*/ +/* { dg-final { scan-assembler "vpst" } } */ +/* { dg-final { scan-assembler-times "mrc" 4 } } */ +/* { dg-final { scan-assembler "vpst" } } */ +/* { dg-final { scan-assembler-times "mcr" 2 } } */ diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vadcq_s32.c b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vadcq_s32.c new file mode 100644 index 0000000000000000000000000000000000000000..35be2d6aa2e31628a162bd9456be0844c861bb6a --- /dev/null +++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vadcq_s32.c @@ -0,0 +1,23 @@ +/* { dg-do compile } */ +/* { dg-additional-options "-march=armv8.1-m.main+mve -mfloat-abi=hard -O2" } */ +/* { dg-skip-if "Skip if not auto" {*-*-*} {"-mfpu=*"} {"-mfpu=auto"} } */ + +#include "arm_mve.h" + +int32x4_t +foo (int32x4_t a, int32x4_t b, unsigned * carry) +{ + return vadcq_s32 (a, b, carry); +} + +/* { dg-final { scan-assembler "vadc.i32" } } */ + +int32x4_t +foo1 (int32x4_t a, int32x4_t b, unsigned * carry) +{ + return vadcq (a, b, carry); +} + +/* { dg-final { scan-assembler "vadc.i32" } } */ +/* { dg-final { scan-assembler-times "mrc" 4 } } */ +/* { dg-final { scan-assembler-times "mcr" 2 } } */ diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vadcq_u32.c b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vadcq_u32.c new file mode 100644 index 0000000000000000000000000000000000000000..9a8246318b6fad157cbb9b74379e2f760fe71ff6 --- /dev/null +++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vadcq_u32.c @@ -0,0 +1,23 @@ +/* { dg-do compile } */ +/* { dg-additional-options "-march=armv8.1-m.main+mve -mfloat-abi=hard -O2" } */ +/* { dg-skip-if "Skip if not auto" {*-*-*} {"-mfpu=*"} {"-mfpu=auto"} } */ + +#include "arm_mve.h" + +uint32x4_t +foo (uint32x4_t a, uint32x4_t b, unsigned * carry) +{ + return vadcq_u32 (a, b, carry); +} + +/* { dg-final { scan-assembler "vadc.i32" } } */ + +uint32x4_t +foo1 (uint32x4_t a, uint32x4_t b, unsigned * carry) +{ + return vadcq (a, b, carry); +} + +/* { dg-final { scan-assembler "vadc.i32" } } */ +/* { dg-final { scan-assembler-times "mrc" 4 } } */ +/* { dg-final { scan-assembler-times "mcr" 2 } } */ diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vsbciq_m_s32.c b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vsbciq_m_s32.c new file mode 100644 index 0000000000000000000000000000000000000000..c353a51080d9a15f033562636ffde3cb81e39b36 --- /dev/null +++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vsbciq_m_s32.c @@ -0,0 +1,26 @@ +/* { dg-do compile } */ +/* { dg-additional-options "-march=armv8.1-m.main+mve -mfloat-abi=hard -O2" } */ +/* { dg-skip-if "Skip if not auto" {*-*-*} {"-mfpu=*"} {"-mfpu=auto"} } */ + +#include "arm_mve.h" + +int32x4_t +foo (int32x4_t inactive, int32x4_t a, int32x4_t b, unsigned * carry_out, mve_pred16_t p) +{ + return vsbciq_m_s32 (inactive, a, b, carry_out, p); +} + +/* { dg-final { scan-assembler "vpst" } } */ +/* { dg-final { scan-assembler "vsbcit.i32" } } */ +/* { dg-final { scan-assembler "vpst" } } */ + +int32x4_t +foo1 (int32x4_t inactive, int32x4_t a, int32x4_t b, unsigned * carry_out, mve_pred16_t p) +{ + return vsbciq_m (inactive, a, b, carry_out, p); +} + +/* { dg-final { scan-assembler "vpst" } } */ +/* { dg-final { scan-assembler "vsbcit.i32" } } */ +/* { dg-final { scan-assembler "vpst" } } */ +/* { dg-final { scan-assembler-times "mrc" 2 } } */ diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vsbciq_m_u32.c b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vsbciq_m_u32.c new file mode 100644 index 
0000000000000000000000000000000000000000..e2bddb737c866c86a0ba0ac6fb5de1eeb7206c69 --- /dev/null +++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vsbciq_m_u32.c @@ -0,0 +1,27 @@ +/* { dg-do compile } */ +/* { dg-additional-options "-march=armv8.1-m.main+mve -mfloat-abi=hard -O2" } */ +/* { dg-skip-if "Skip if not auto" {*-*-*} {"-mfpu=*"} {"-mfpu=auto"} } */ + +#include "arm_mve.h" + +uint32x4_t +foo (uint32x4_t inactive, uint32x4_t a, uint32x4_t b, unsigned * carry_out, mve_pred16_t p) +{ + return vsbciq_m_u32 (inactive, a, b, carry_out, p); +} + +/* { dg-final { scan-assembler "vpst" } } */ +/* { dg-final { scan-assembler "vsbcit.i32" } } */ +/* { dg-final { scan-assembler "vpst" } } */ + +uint32x4_t +foo1 (uint32x4_t inactive, uint32x4_t a, uint32x4_t b, unsigned * carry_out, mve_pred16_t p) +{ + return vsbciq_m (inactive, a, b, carry_out, p); +} + +/* { dg-final { scan-assembler "vpst" } } */ +/* { dg-final { scan-assembler "vsbcit.i32" } } */ +/* { dg-final { scan-assembler "vpst" } } */ +/* { dg-final { scan-assembler-times "mrc" 2 } } */ + diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vsbciq_s32.c b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vsbciq_s32.c new file mode 100644 index 0000000000000000000000000000000000000000..db32b9cef100c1ec8b8fa75d946e9423e7dc6669 --- /dev/null +++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vsbciq_s32.c @@ -0,0 +1,22 @@ +/* { dg-do compile } */ +/* { dg-additional-options "-march=armv8.1-m.main+mve -mfloat-abi=hard -O2" } */ +/* { dg-skip-if "Skip if not auto" {*-*-*} {"-mfpu=*"} {"-mfpu=auto"} } */ + +#include "arm_mve.h" + +int32x4_t +foo (int32x4_t a, int32x4_t b, unsigned * carry_out) +{ + return vsbciq_s32 (a, b, carry_out); +} + +/* { dg-final { scan-assembler "vsbci.i32" } } */ + +int32x4_t +foo1 (int32x4_t a, int32x4_t b, unsigned * carry_out) +{ + return vsbciq_s32 (a, b, carry_out); +} + +/* { dg-final { scan-assembler "vsbci.i32" } } */ +/* { dg-final { scan-assembler-times "mrc" 2 } } */ diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vsbciq_u32.c b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vsbciq_u32.c new file mode 100644 index 0000000000000000000000000000000000000000..60f213d9a631d7d12f8ab7dfe58e3c251b9e8854 --- /dev/null +++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vsbciq_u32.c @@ -0,0 +1,22 @@ +/* { dg-do compile } */ +/* { dg-additional-options "-march=armv8.1-m.main+mve -mfloat-abi=hard -O2" } */ +/* { dg-skip-if "Skip if not auto" {*-*-*} {"-mfpu=*"} {"-mfpu=auto"} } */ + +#include "arm_mve.h" + +uint32x4_t +foo (uint32x4_t a, uint32x4_t b, unsigned * carry_out) +{ + return vsbciq_u32 (a, b, carry_out); +} + +/* { dg-final { scan-assembler "vsbci.i32" } } */ + +uint32x4_t +foo1 (uint32x4_t a, uint32x4_t b, unsigned * carry_out) +{ + return vsbciq_u32 (a, b, carry_out); +} + +/* { dg-final { scan-assembler "vsbci.i32" } } */ +/* { dg-final { scan-assembler-times "mrc" 2 } } */ diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vsbcq_m_s32.c b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vsbcq_m_s32.c new file mode 100644 index 0000000000000000000000000000000000000000..4ab7e2f1f69ade2189c44bc9bcd6243dc3cd9d68 --- /dev/null +++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vsbcq_m_s32.c @@ -0,0 +1,19 @@ +/* { dg-do compile } */ +/* { dg-additional-options "-march=armv8.1-m.main+mve -mfloat-abi=hard -O2" } */ +/* { dg-skip-if "Skip if not auto" {*-*-*} {"-mfpu=*"} {"-mfpu=auto"} } */ + +#include "arm_mve.h" + +int32x4_t +foo (int32x4_t inactive, int32x4_t a, int32x4_t b, unsigned * carry, 
mve_pred16_t p) +{ + return vsbcq_m_s32 (inactive, a, b, carry, p); +} + +/* { dg-final { scan-assembler "vpst" } } */ +/* { dg-final { scan-assembler "vsbct.i32" } } */ +/* { dg-final { scan-assembler "vpst" } } */ +/* { dg-final { scan-assembler-times "mrc" 2 } } */ +/* { dg-final { scan-assembler "vpst" } } */ +/* { dg-final { scan-assembler-times "mcr" 1 } } */ + diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vsbcq_m_u32.c b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vsbcq_m_u32.c new file mode 100644 index 0000000000000000000000000000000000000000..da2edac3d95d5405226bd0b85a9dce42ad62fd9b --- /dev/null +++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vsbcq_m_u32.c @@ -0,0 +1,19 @@ +/* { dg-do compile } */ +/* { dg-additional-options "-march=armv8.1-m.main+mve -mfloat-abi=hard -O2" } */ +/* { dg-skip-if "Skip if not auto" {*-*-*} {"-mfpu=*"} {"-mfpu=auto"} } */ + +#include "arm_mve.h" + +uint32x4_t +foo (uint32x4_t inactive, uint32x4_t a, uint32x4_t b, unsigned * carry, mve_pred16_t p) +{ + return vsbcq_m_u32 (inactive, a, b, carry, p); +} + +/* { dg-final { scan-assembler "vpst" } } */ +/* { dg-final { scan-assembler "vsbct.i32" } } */ +/* { dg-final { scan-assembler "vpst" } } */ +/* { dg-final { scan-assembler-times "mrc" 2 } } */ +/* { dg-final { scan-assembler "vpst" } } */ +/* { dg-final { scan-assembler-times "mcr" 1 } } */ + diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vsbcq_s32.c b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vsbcq_s32.c new file mode 100644 index 0000000000000000000000000000000000000000..9c3e6e99f5938d523d5dee127300031ddc09ee33 --- /dev/null +++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vsbcq_s32.c @@ -0,0 +1,24 @@ +/* { dg-do compile } */ +/* { dg-additional-options "-march=armv8.1-m.main+mve -mfloat-abi=hard -O2" } */ +/* { dg-skip-if "Skip if not auto" {*-*-*} {"-mfpu=*"} {"-mfpu=auto"} } */ + +#include "arm_mve.h" + +int32x4_t +foo (int32x4_t a, int32x4_t b, unsigned * carry) +{ + return vsbcq_s32 (a, b, carry); +} + +/* { dg-final { scan-assembler "vsbc.i32" } } */ + +int32x4_t +foo1 (int32x4_t a, int32x4_t b, unsigned * carry) +{ + return vsbcq (a, b, carry); +} + +/* { dg-final { scan-assembler "vsbc.i32" } } */ +/* { dg-final { scan-assembler-times "mrc" 4 } } */ +/* { dg-final { scan-assembler-times "mcr" 2 } } */ + diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vsbcq_u32.c b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vsbcq_u32.c new file mode 100644 index 0000000000000000000000000000000000000000..122b20c41b34c906c00b9914fe152195a3739118 --- /dev/null +++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vsbcq_u32.c @@ -0,0 +1,23 @@ +/* { dg-do compile } */ +/* { dg-additional-options "-march=armv8.1-m.main+mve -mfloat-abi=hard -O2" } */ +/* { dg-skip-if "Skip if not auto" {*-*-*} {"-mfpu=*"} {"-mfpu=auto"} } */ + +#include "arm_mve.h" + +uint32x4_t +foo (uint32x4_t a, uint32x4_t b, unsigned * carry) +{ + return vsbcq_u32 (a, b, carry); +} + +/* { dg-final { scan-assembler "vsbc.i32" } } */ + +uint32x4_t +foo1 (uint32x4_t a, uint32x4_t b, unsigned * carry) +{ + return vsbcq (a, b, carry); +} + +/* { dg-final { scan-assembler "vsbc.i32" } } */ +/* { dg-final { scan-assembler-times "mrc" 4 } } */ +/* { dg-final { scan-assembler-times "mcr" 2 } } */ diff --git a/gcc/config/arm/arm_mve.h b/gcc/config/arm/arm_mve.h index 31ad3fc5cddfedede02b10e194a426a98bd13024..1704b622c5d6e0abcf814ae1d439bb732f0bd76e 100644 --- a/gcc/config/arm/arm_mve.h +++ b/gcc/config/arm/arm_mve.h @@ -2450,6 +2450,22 @@ typedef struct { uint8x16_t 
val[4]; } uint8x16x4_t; #define vrev32q_x_f16(__a, __p) __arm_vrev32q_x_f16(__a, __p) #define vrev64q_x_f16(__a, __p) __arm_vrev64q_x_f16(__a, __p) #define vrev64q_x_f32(__a, __p) __arm_vrev64q_x_f32(__a, __p) +#define vadciq_s32(__a, __b, __carry_out) __arm_vadciq_s32(__a, __b, __carry_out) +#define vadciq_u32(__a, __b, __carry_out) __arm_vadciq_u32(__a, __b, __carry_out) +#define vadciq_m_s32(__inactive, __a, __b, __carry_out, __p) __arm_vadciq_m_s32(__inactive, __a, __b, __carry_out, __p) +#define vadciq_m_u32(__inactive, __a, __b, __carry_out, __p) __arm_vadciq_m_u32(__inactive, __a, __b, __carry_out, __p) +#define vadcq_s32(__a, __b, __carry) __arm_vadcq_s32(__a, __b, __carry) +#define vadcq_u32(__a, __b, __carry) __arm_vadcq_u32(__a, __b, __carry) +#define vadcq_m_s32(__inactive, __a, __b, __carry, __p) __arm_vadcq_m_s32(__inactive, __a, __b, __carry, __p) +#define vadcq_m_u32(__inactive, __a, __b, __carry, __p) __arm_vadcq_m_u32(__inactive, __a, __b, __carry, __p) +#define vsbciq_s32(__a, __b, __carry_out) __arm_vsbciq_s32(__a, __b, __carry_out) +#define vsbciq_u32(__a, __b, __carry_out) __arm_vsbciq_u32(__a, __b, __carry_out) +#define vsbciq_m_s32(__inactive, __a, __b, __carry_out, __p) __arm_vsbciq_m_s32(__inactive, __a, __b, __carry_out, __p) +#define vsbciq_m_u32(__inactive, __a, __b, __carry_out, __p) __arm_vsbciq_m_u32(__inactive, __a, __b, __carry_out, __p) +#define vsbcq_s32(__a, __b, __carry) __arm_vsbcq_s32(__a, __b, __carry) +#define vsbcq_u32(__a, __b, __carry) __arm_vsbcq_u32(__a, __b, __carry) +#define vsbcq_m_s32(__inactive, __a, __b, __carry, __p) __arm_vsbcq_m_s32(__inactive, __a, __b, __carry, __p) +#define vsbcq_m_u32(__inactive, __a, __b, __carry, __p) __arm_vsbcq_m_u32(__inactive, __a, __b, __carry, __p) #endif __extension__ extern __inline void @@ -15917,6 +15933,158 @@ __arm_vshrq_x_n_u32 (uint32x4_t __a, const int __imm, mve_pred16_t __p) return __builtin_mve_vshrq_m_n_uv4si (vuninitializedq_u32 (), __a, __imm, __p); } +__extension__ extern __inline int32x4_t +__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) +__arm_vadciq_s32 (int32x4_t __a, int32x4_t __b, unsigned * __carry_out) +{ + int32x4_t __res = __builtin_mve_vadciq_sv4si (__a, __b); + *__carry_out = (__builtin_arm_get_fpscr () >> 29) & 0x1u; + return __res; +} + +__extension__ extern __inline uint32x4_t +__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) +__arm_vadciq_u32 (uint32x4_t __a, uint32x4_t __b, unsigned * __carry_out) +{ + uint32x4_t __res = __builtin_mve_vadciq_uv4si (__a, __b); + *__carry_out = (__builtin_arm_get_fpscr () >> 29) & 0x1u; + return __res; +} + +__extension__ extern __inline int32x4_t +__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) +__arm_vadciq_m_s32 (int32x4_t __inactive, int32x4_t __a, int32x4_t __b, unsigned * __carry_out, mve_pred16_t __p) +{ + int32x4_t __res = __builtin_mve_vadciq_m_sv4si (__inactive, __a, __b, __p); + *__carry_out = (__builtin_arm_get_fpscr () >> 29) & 0x1u; + return __res; +} + +__extension__ extern __inline uint32x4_t +__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) +__arm_vadciq_m_u32 (uint32x4_t __inactive, uint32x4_t __a, uint32x4_t __b, unsigned * __carry_out, mve_pred16_t __p) +{ + uint32x4_t __res = __builtin_mve_vadciq_m_uv4si (__inactive, __a, __b, __p); + *__carry_out = (__builtin_arm_get_fpscr () >> 29) & 0x1u; + return __res; +} + +__extension__ extern __inline int32x4_t +__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) +__arm_vadcq_s32 
(int32x4_t __a, int32x4_t __b, unsigned * __carry) +{ + __builtin_arm_set_fpscr((__builtin_arm_get_fpscr () & ~0x20000000u) | (*__carry << 29)); + int32x4_t __res = __builtin_mve_vadcq_sv4si (__a, __b); + *__carry = (__builtin_arm_get_fpscr () >> 29) & 0x1u; + return __res; +} + +__extension__ extern __inline uint32x4_t +__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) +__arm_vadcq_u32 (uint32x4_t __a, uint32x4_t __b, unsigned * __carry) +{ + __builtin_arm_set_fpscr((__builtin_arm_get_fpscr () & ~0x20000000u) | (*__carry << 29)); + uint32x4_t __res = __builtin_mve_vadcq_uv4si (__a, __b); + *__carry = (__builtin_arm_get_fpscr () >> 29) & 0x1u; + return __res; +} + +__extension__ extern __inline int32x4_t +__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) +__arm_vadcq_m_s32 (int32x4_t __inactive, int32x4_t __a, int32x4_t __b, unsigned * __carry, mve_pred16_t __p) +{ + __builtin_arm_set_fpscr((__builtin_arm_get_fpscr () & ~0x20000000u) | (*__carry << 29)); + int32x4_t __res = __builtin_mve_vadcq_m_sv4si (__inactive, __a, __b, __p); + *__carry = (__builtin_arm_get_fpscr () >> 29) & 0x1u; + return __res; +} + +__extension__ extern __inline uint32x4_t +__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) +__arm_vadcq_m_u32 (uint32x4_t __inactive, uint32x4_t __a, uint32x4_t __b, unsigned * __carry, mve_pred16_t __p) +{ + __builtin_arm_set_fpscr((__builtin_arm_get_fpscr () & ~0x20000000u) | (*__carry << 29)); + uint32x4_t __res = __builtin_mve_vadcq_m_uv4si (__inactive, __a, __b, __p); + *__carry = (__builtin_arm_get_fpscr () >> 29) & 0x1u; + return __res; +} + +__extension__ extern __inline int32x4_t +__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) +__arm_vsbciq_s32 (int32x4_t __a, int32x4_t __b, unsigned * __carry_out) +{ + int32x4_t __res = __builtin_mve_vsbciq_sv4si (__a, __b); + *__carry_out = (__builtin_arm_get_fpscr () >> 29) & 0x1u; + return __res; +} + +__extension__ extern __inline uint32x4_t +__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) +__arm_vsbciq_u32 (uint32x4_t __a, uint32x4_t __b, unsigned * __carry_out) +{ + uint32x4_t __res = __builtin_mve_vsbciq_uv4si (__a, __b); + *__carry_out = (__builtin_arm_get_fpscr () >> 29) & 0x1u; + return __res; +} + +__extension__ extern __inline int32x4_t +__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) +__arm_vsbciq_m_s32 (int32x4_t __inactive, int32x4_t __a, int32x4_t __b, unsigned * __carry_out, mve_pred16_t __p) +{ + int32x4_t __res = __builtin_mve_vsbciq_m_sv4si (__inactive, __a, __b, __p); + *__carry_out = (__builtin_arm_get_fpscr () >> 29) & 0x1u; + return __res; +} + +__extension__ extern __inline uint32x4_t +__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) +__arm_vsbciq_m_u32 (uint32x4_t __inactive, uint32x4_t __a, uint32x4_t __b, unsigned * __carry_out, mve_pred16_t __p) +{ + uint32x4_t __res = __builtin_mve_vsbciq_m_uv4si (__inactive, __a, __b, __p); + *__carry_out = (__builtin_arm_get_fpscr () >> 29) & 0x1u; + return __res; +} + +__extension__ extern __inline int32x4_t +__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) +__arm_vsbcq_s32 (int32x4_t __a, int32x4_t __b, unsigned * __carry) +{ + __builtin_arm_set_fpscr((__builtin_arm_get_fpscr () & ~0x20000000u) | (*__carry << 29)); + int32x4_t __res = __builtin_mve_vsbcq_sv4si (__a, __b); + *__carry = (__builtin_arm_get_fpscr () >> 29) & 0x1u; + return __res; +} + +__extension__ extern __inline uint32x4_t +__attribute__ ((__always_inline__, 
__gnu_inline__, __artificial__)) +__arm_vsbcq_u32 (uint32x4_t __a, uint32x4_t __b, unsigned * __carry) +{ + __builtin_arm_set_fpscr((__builtin_arm_get_fpscr () & ~0x20000000u) | (*__carry << 29)); + uint32x4_t __res = __builtin_mve_vsbcq_uv4si (__a, __b); + *__carry = (__builtin_arm_get_fpscr () >> 29) & 0x1u; + return __res; +} + +__extension__ extern __inline int32x4_t +__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) +__arm_vsbcq_m_s32 (int32x4_t __inactive, int32x4_t __a, int32x4_t __b, unsigned * __carry, mve_pred16_t __p) +{ + __builtin_arm_set_fpscr((__builtin_arm_get_fpscr () & ~0x20000000u) | (*__carry << 29)); + int32x4_t __res = __builtin_mve_vsbcq_m_sv4si (__inactive, __a, __b, __p); + *__carry = (__builtin_arm_get_fpscr () >> 29) & 0x1u; + return __res; +} + +__extension__ extern __inline uint32x4_t +__attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) +__arm_vsbcq_m_u32 (uint32x4_t __inactive, uint32x4_t __a, uint32x4_t __b, unsigned * __carry, mve_pred16_t __p) +{ + __builtin_arm_set_fpscr((__builtin_arm_get_fpscr () & ~0x20000000u) | (*__carry << 29)); + uint32x4_t __res = __builtin_mve_vsbcq_m_uv4si (__inactive, __a, __b, __p); + *__carry = (__builtin_arm_get_fpscr () >> 29) & 0x1u; + return __res; +} + #if (__ARM_FEATURE_MVE & 2) /* MVE Floating point. */ __extension__ extern __inline void @@ -25552,6 +25720,65 @@ extern void *__ARM_undef; int (*)[__ARM_mve_type_uint16x8_t]: __arm_vshrq_x_n_u16 (__ARM_mve_coerce(__p1, uint16x8_t), p2, p3), \ int (*)[__ARM_mve_type_uint32x4_t]: __arm_vshrq_x_n_u32 (__ARM_mve_coerce(__p1, uint32x4_t), p2, p3));}) +#define vadciq_m(p0,p1,p2,p3,p4) __arm_vadciq_m(p0,p1,p2,p3,p4) +#define __arm_vadciq_m(p0,p1,p2,p3,p4) ({ __typeof(p0) __p0 = (p0); \ + __typeof(p1) __p1 = (p1); \ + __typeof(p2) __p2 = (p2); \ + _Generic( (int (*)[__ARM_mve_typeid(__p0)][__ARM_mve_typeid(__p1)][__ARM_mve_typeid(__p2)])0, \ + int (*)[__ARM_mve_type_int32x4_t][__ARM_mve_type_int32x4_t][__ARM_mve_type_int32x4_t]: __arm_vadciq_m_s32 (__ARM_mve_coerce(__p0, int32x4_t), __ARM_mve_coerce(__p1, int32x4_t), __ARM_mve_coerce(__p2, int32x4_t), p3, p4), \ + int (*)[__ARM_mve_type_uint32x4_t][__ARM_mve_type_uint32x4_t][__ARM_mve_type_uint32x4_t]: __arm_vadciq_m_u32 (__ARM_mve_coerce(__p0, uint32x4_t), __ARM_mve_coerce(__p1, uint32x4_t), __ARM_mve_coerce(__p2, uint32x4_t), p3, p4));}) + +#define vadciq(p0,p1,p2) __arm_vadciq(p0,p1,p2) +#define __arm_vadciq(p0,p1,p2) ({ __typeof(p0) __p0 = (p0); \ + __typeof(p1) __p1 = (p1); \ + _Generic( (int (*)[__ARM_mve_typeid(__p0)][__ARM_mve_typeid(__p1)])0, \ + int (*)[__ARM_mve_type_int32x4_t][__ARM_mve_type_int32x4_t]: __arm_vadciq_s32 (__ARM_mve_coerce(__p0, int32x4_t), __ARM_mve_coerce(__p1, int32x4_t), p2), \ + int (*)[__ARM_mve_type_uint32x4_t][__ARM_mve_type_uint32x4_t]: __arm_vadciq_u32 (__ARM_mve_coerce(__p0, uint32x4_t), __ARM_mve_coerce(__p1, uint32x4_t), p2));}) + +#define vadcq_m(p0,p1,p2,p3,p4) __arm_vadcq_m(p0,p1,p2,p3,p4) +#define __arm_vadcq_m(p0,p1,p2,p3,p4) ({ __typeof(p0) __p0 = (p0); \ + __typeof(p1) __p1 = (p1); \ + __typeof(p2) __p2 = (p2); \ + _Generic( (int (*)[__ARM_mve_typeid(__p0)][__ARM_mve_typeid(__p1)][__ARM_mve_typeid(__p2)])0, \ + int (*)[__ARM_mve_type_int32x4_t][__ARM_mve_type_int32x4_t][__ARM_mve_type_int32x4_t]: __arm_vadcq_m_s32 (__ARM_mve_coerce(__p0, int32x4_t), __ARM_mve_coerce(__p1, int32x4_t), __ARM_mve_coerce(__p2, int32x4_t), p3, p4), \ + int (*)[__ARM_mve_type_uint32x4_t][__ARM_mve_type_uint32x4_t][__ARM_mve_type_uint32x4_t]: __arm_vadcq_m_u32 
(__ARM_mve_coerce(__p0, uint32x4_t), __ARM_mve_coerce(__p1, uint32x4_t), __ARM_mve_coerce(__p2, uint32x4_t), p3, p4));})
+
+#define vadcq(p0,p1,p2) __arm_vadcq(p0,p1,p2)
+#define __arm_vadcq(p0,p1,p2) ({ __typeof(p0) __p0 = (p0); \
+  __typeof(p1) __p1 = (p1); \
+  _Generic( (int (*)[__ARM_mve_typeid(__p0)][__ARM_mve_typeid(__p1)])0, \
+  int (*)[__ARM_mve_type_int32x4_t][__ARM_mve_type_int32x4_t]: __arm_vadcq_s32 (__ARM_mve_coerce(__p0, int32x4_t), __ARM_mve_coerce(__p1, int32x4_t), p2), \
+  int (*)[__ARM_mve_type_uint32x4_t][__ARM_mve_type_uint32x4_t]: __arm_vadcq_u32 (__ARM_mve_coerce(__p0, uint32x4_t), __ARM_mve_coerce(__p1, uint32x4_t), p2));})
+
+#define vsbciq_m(p0,p1,p2,p3,p4) __arm_vsbciq_m(p0,p1,p2,p3,p4)
+#define __arm_vsbciq_m(p0,p1,p2,p3,p4) ({ __typeof(p0) __p0 = (p0); \
+  __typeof(p1) __p1 = (p1); \
+  __typeof(p2) __p2 = (p2); \
+  _Generic( (int (*)[__ARM_mve_typeid(__p0)][__ARM_mve_typeid(__p1)][__ARM_mve_typeid(__p2)])0, \
+  int (*)[__ARM_mve_type_int32x4_t][__ARM_mve_type_int32x4_t][__ARM_mve_type_int32x4_t]: __arm_vsbciq_m_s32 (__ARM_mve_coerce(__p0, int32x4_t), __ARM_mve_coerce(__p1, int32x4_t), __ARM_mve_coerce(__p2, int32x4_t), p3, p4), \
+  int (*)[__ARM_mve_type_uint32x4_t][__ARM_mve_type_uint32x4_t][__ARM_mve_type_uint32x4_t]: __arm_vsbciq_m_u32 (__ARM_mve_coerce(__p0, uint32x4_t), __ARM_mve_coerce(__p1, uint32x4_t), __ARM_mve_coerce(__p2, uint32x4_t), p3, p4));})
+
+#define vsbciq(p0,p1,p2) __arm_vsbciq(p0,p1,p2)
+#define __arm_vsbciq(p0,p1,p2) ({ __typeof(p0) __p0 = (p0); \
+  __typeof(p1) __p1 = (p1); \
+  _Generic( (int (*)[__ARM_mve_typeid(__p0)][__ARM_mve_typeid(__p1)])0, \
+  int (*)[__ARM_mve_type_int32x4_t][__ARM_mve_type_int32x4_t]: __arm_vsbciq_s32 (__ARM_mve_coerce(__p0, int32x4_t), __ARM_mve_coerce(__p1, int32x4_t), p2), \
+  int (*)[__ARM_mve_type_uint32x4_t][__ARM_mve_type_uint32x4_t]: __arm_vsbciq_u32 (__ARM_mve_coerce(__p0, uint32x4_t), __ARM_mve_coerce(__p1, uint32x4_t), p2));})
+
+#define vsbcq_m(p0,p1,p2,p3,p4) __arm_vsbcq_m(p0,p1,p2,p3,p4)
+#define __arm_vsbcq_m(p0,p1,p2,p3,p4) ({ __typeof(p0) __p0 = (p0); \
+  __typeof(p1) __p1 = (p1); \
+  __typeof(p2) __p2 = (p2); \
+  _Generic( (int (*)[__ARM_mve_typeid(__p0)][__ARM_mve_typeid(__p1)][__ARM_mve_typeid(__p2)])0, \
+  int (*)[__ARM_mve_type_int32x4_t][__ARM_mve_type_int32x4_t][__ARM_mve_type_int32x4_t]: __arm_vsbcq_m_s32 (__ARM_mve_coerce(__p0, int32x4_t), __ARM_mve_coerce(__p1, int32x4_t), __ARM_mve_coerce(__p2, int32x4_t), p3, p4), \
+  int (*)[__ARM_mve_type_uint32x4_t][__ARM_mve_type_uint32x4_t][__ARM_mve_type_uint32x4_t]: __arm_vsbcq_m_u32 (__ARM_mve_coerce(__p0, uint32x4_t), __ARM_mve_coerce(__p1, uint32x4_t), __ARM_mve_coerce(__p2, uint32x4_t), p3, p4));})
+
+#define vsbcq(p0,p1,p2) __arm_vsbcq(p0,p1,p2)
+#define __arm_vsbcq(p0,p1,p2) ({ __typeof(p0) __p0 = (p0); \
+  __typeof(p1) __p1 = (p1); \
+  _Generic( (int (*)[__ARM_mve_typeid(__p0)][__ARM_mve_typeid(__p1)])0, \
+  int (*)[__ARM_mve_type_int32x4_t][__ARM_mve_type_int32x4_t]: __arm_vsbcq_s32 (__ARM_mve_coerce(__p0, int32x4_t), __ARM_mve_coerce(__p1, int32x4_t), p2), \
+  int (*)[__ARM_mve_type_uint32x4_t][__ARM_mve_type_uint32x4_t]: __arm_vsbcq_u32 (__ARM_mve_coerce(__p0, uint32x4_t), __ARM_mve_coerce(__p1, uint32x4_t), p2));})
 #endif /* MVE Floating point.  */
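[Editorial illustration -- not part of the patch.]  Since the new wrappers only surface the carry through an "unsigned *" argument, it may help to show how they are meant to be chained.  The sketch below assumes the usual VADC/VADCI behaviour (the carry ripples through the four 32-bit lanes, with a carry-in of 0 for the vadciq form and the final carry written back through the pointer); the u256 type and add_u256 helper are invented for the example, and it is only expected to build with the same -march=armv8.1-m.main+mve options used by the new tests.

  #include "arm_mve.h"

  /* 256-bit unsigned addition built from the new intrinsics.  */
  typedef struct { uint32x4_t lo; uint32x4_t hi; } u256;

  static inline u256
  add_u256 (u256 x, u256 y, unsigned *carry_out)
  {
    u256 r;
    unsigned carry;

    /* Low 128 bits: vadciq starts the chain with a carry-in of 0 and
       stores the carry out of the top lane through the pointer.  */
    r.lo = vadciq_u32 (x.lo, y.lo, &carry);

    /* High 128 bits: vadcq consumes that carry (the wrapper above copies
       it into FPSCR.C, bit 29, before the VADC) and updates it again.  */
    r.hi = vadcq_u32 (x.hi, y.hi, &carry);

    *carry_out = carry;  /* carry out of bit 255 */
    return r;
  }

The same pattern should extend to any number of 128-bit limbs: one vadciq for the least significant limb, then vadcq for each remaining limb, reusing the same carry variable.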
diff --git a/gcc/config/arm/arm_mve_builtins.def b/gcc/config/arm/arm_mve_builtins.def
index b77335cff133872558b48b5574dccc0f17df9ed1..a413b38676f2f102c16fdf2147f3b8a4d8ec47b4 100644
--- a/gcc/config/arm/arm_mve_builtins.def
+++ b/gcc/config/arm/arm_mve_builtins.def
@@ -857,3 +857,19 @@ VAR1 (LDRGBWBS_Z, vldrdq_gather_base_wb_z_s, v2di)
 VAR1 (LDRGBWBS, vldrwq_gather_base_wb_s, v4si)
 VAR1 (LDRGBWBS, vldrwq_gather_base_wb_f, v4sf)
 VAR1 (LDRGBWBS, vldrdq_gather_base_wb_s, v2di)
+VAR1 (BINOP_NONE_NONE_NONE, vadciq_s, v4si)
+VAR1 (BINOP_UNONE_UNONE_UNONE, vadciq_u, v4si)
+VAR1 (BINOP_NONE_NONE_NONE, vadcq_s, v4si)
+VAR1 (BINOP_UNONE_UNONE_UNONE, vadcq_u, v4si)
+VAR1 (BINOP_NONE_NONE_NONE, vsbciq_s, v4si)
+VAR1 (BINOP_UNONE_UNONE_UNONE, vsbciq_u, v4si)
+VAR1 (BINOP_NONE_NONE_NONE, vsbcq_s, v4si)
+VAR1 (BINOP_UNONE_UNONE_UNONE, vsbcq_u, v4si)
+VAR1 (QUADOP_NONE_NONE_NONE_NONE_UNONE, vadciq_m_s, v4si)
+VAR1 (QUADOP_UNONE_UNONE_UNONE_UNONE_UNONE, vadciq_m_u, v4si)
+VAR1 (QUADOP_NONE_NONE_NONE_NONE_UNONE, vadcq_m_s, v4si)
+VAR1 (QUADOP_UNONE_UNONE_UNONE_UNONE_UNONE, vadcq_m_u, v4si)
+VAR1 (QUADOP_NONE_NONE_NONE_NONE_UNONE, vsbciq_m_s, v4si)
+VAR1 (QUADOP_UNONE_UNONE_UNONE_UNONE_UNONE, vsbciq_m_u, v4si)
+VAR1 (QUADOP_NONE_NONE_NONE_NONE_UNONE, vsbcq_m_s, v4si)
+VAR1 (QUADOP_UNONE_UNONE_UNONE_UNONE_UNONE, vsbcq_m_u, v4si)
diff --git a/gcc/config/arm/mve.md b/gcc/config/arm/mve.md
index a938e0922f8dc6749dc7192961ae2091d666c6e7..8ff69094378396830ef31d9e2ca9db71c58aefab 100644
--- a/gcc/config/arm/mve.md
+++ b/gcc/config/arm/mve.md
@@ -211,7 +211,10 @@
 			 VDWDUPQ_M VIDUPQ VIDUPQ_M VIWDUPQ VIWDUPQ_M
 			 VSTRWQSBWB_S VSTRWQSBWB_U VLDRWQGBWB_S VLDRWQGBWB_U
 			 VSTRWQSBWB_F VLDRWQGBWB_F VSTRDQSBWB_S VSTRDQSBWB_U
-			 VLDRDQGBWB_S VLDRDQGBWB_U])
+			 VLDRDQGBWB_S VLDRDQGBWB_U VADCQ_U VADCQ_M_U VADCQ_S
+			 VADCQ_M_S VSBCIQ_U VSBCIQ_S VSBCIQ_M_U VSBCIQ_M_S
+			 VSBCQ_U VSBCQ_S VSBCQ_M_U VSBCQ_M_S VADCIQ_U VADCIQ_M_U
+			 VADCIQ_S VADCIQ_M_S])
 
 (define_mode_attr MVE_CNVT [(V8HI "V8HF") (V4SI "V4SF")
 			    (V8HF "V8HI") (V4SF "V4SI")])
@@ -382,8 +385,13 @@
 		       (VSTRWQSO_U "u") (VSTRWQSO_S "s") (VSTRWQSSO_U "u")
 		       (VSTRWQSSO_S "s") (VSTRWQSBWB_S "s") (VSTRWQSBWB_U "u")
 		       (VLDRWQGBWB_S "s") (VLDRWQGBWB_U "u") (VLDRDQGBWB_S "s")
-		       (VLDRDQGBWB_U "u") (VSTRDQSBWB_S "s")
-		       (VSTRDQSBWB_U "u")])
+		       (VLDRDQGBWB_U "u") (VSTRDQSBWB_S "s") (VADCQ_M_S "s")
+		       (VSTRDQSBWB_U "u") (VSBCQ_U "u") (VSBCQ_M_U "u")
+		       (VSBCQ_S "s") (VSBCQ_M_S "s") (VSBCIQ_U "u")
+		       (VSBCIQ_M_U "u") (VSBCIQ_S "s") (VSBCIQ_M_S "s")
+		       (VADCQ_U "u") (VADCQ_M_U "u") (VADCQ_S "s")
+		       (VADCIQ_U "u") (VADCIQ_M_U "u") (VADCIQ_S "s")
+		       (VADCIQ_M_S "s")])
 
 (define_int_attr mode1 [(VCTP8Q "8") (VCTP16Q "16") (VCTP32Q "32")
 			(VCTP64Q "64") (VCTP8Q_M "8") (VCTP16Q_M "16")
@@ -636,6 +644,15 @@
 (define_int_iterator VLDRWGBWBQ [VLDRWQGBWB_S VLDRWQGBWB_U])
 (define_int_iterator VSTRDSBWBQ [VSTRDQSBWB_S VSTRDQSBWB_U])
 (define_int_iterator VLDRDGBWBQ [VLDRDQGBWB_S VLDRDQGBWB_U])
+(define_int_iterator VADCIQ [VADCIQ_U VADCIQ_S])
+(define_int_iterator VADCIQ_M [VADCIQ_M_U VADCIQ_M_S])
+(define_int_iterator VSBCQ [VSBCQ_U VSBCQ_S])
+(define_int_iterator VSBCQ_M [VSBCQ_M_U VSBCQ_M_S])
+(define_int_iterator VSBCIQ [VSBCIQ_U VSBCIQ_S])
+(define_int_iterator VSBCIQ_M [VSBCIQ_M_U VSBCIQ_M_S])
+(define_int_iterator VADCQ [VADCQ_U VADCQ_S])
+(define_int_iterator VADCQ_M [VADCQ_M_U VADCQ_M_S])
+
 
 (define_insn "*mve_mov<mode>"
   [(set (match_operand:MVE_types 0 "s_register_operand" "=w,w,r,w,w,r,w")
@@ -10614,3 +10631,147 @@
   return "";
 }
   [(set_attr "length" "8")])
+;;
+;; [vadciq_m_s, vadciq_m_u])
+;;
+(define_insn "mve_vadciq_m_<supf>v4si"
+ [(set (match_operand:V4SI 0 "s_register_operand" "=w") + (unspec:V4SI [(match_operand:V4SI 1 "s_register_operand" "0") + (match_operand:V4SI 2 "s_register_operand" "w") + (match_operand:V4SI 3 "s_register_operand" "w") + (match_operand:HI 4 "vpr_register_operand" "Up")] + VADCIQ_M)) + (set (reg:SI VFPCC_REGNUM) + (unspec:SI [(const_int 0)] + VADCIQ_M)) + ] + "TARGET_HAVE_MVE" + "vpst\;vadcit.i32\t%q0, %q2, %q3" + [(set_attr "type" "mve_move") + (set_attr "length" "8")]) + +;; +;; [vadciq_u, vadciq_s]) +;; +(define_insn "mve_vadciq_v4si" + [(set (match_operand:V4SI 0 "s_register_operand" "=w") + (unspec:V4SI [(match_operand:V4SI 1 "s_register_operand" "w") + (match_operand:V4SI 2 "s_register_operand" "w")] + VADCIQ)) + (set (reg:SI VFPCC_REGNUM) + (unspec:SI [(const_int 0)] + VADCIQ)) + ] + "TARGET_HAVE_MVE" + "vadci.i32\t%q0, %q1, %q2" + [(set_attr "type" "mve_move") + (set_attr "length" "4")]) + +;; +;; [vadcq_m_s, vadcq_m_u]) +;; +(define_insn "mve_vadcq_m_v4si" + [(set (match_operand:V4SI 0 "s_register_operand" "=w") + (unspec:V4SI [(match_operand:V4SI 1 "s_register_operand" "0") + (match_operand:V4SI 2 "s_register_operand" "w") + (match_operand:V4SI 3 "s_register_operand" "w") + (match_operand:HI 4 "vpr_register_operand" "Up")] + VADCQ_M)) + (set (reg:SI VFPCC_REGNUM) + (unspec:SI [(reg:SI VFPCC_REGNUM)] + VADCQ_M)) + ] + "TARGET_HAVE_MVE" + "vpst\;vadct.i32\t%q0, %q2, %q3" + [(set_attr "type" "mve_move") + (set_attr "length" "8")]) + +;; +;; [vadcq_u, vadcq_s]) +;; +(define_insn "mve_vadcq_v4si" + [(set (match_operand:V4SI 0 "s_register_operand" "=w") + (unspec:V4SI [(match_operand:V4SI 1 "s_register_operand" "w") + (match_operand:V4SI 2 "s_register_operand" "w")] + VADCQ)) + (set (reg:SI VFPCC_REGNUM) + (unspec:SI [(reg:SI VFPCC_REGNUM)] + VADCQ)) + ] + "TARGET_HAVE_MVE" + "vadc.i32\t%q0, %q1, %q2" + [(set_attr "type" "mve_move") + (set_attr "length" "4") + (set_attr "conds" "set")]) + +;; +;; [vsbciq_m_u, vsbciq_m_s]) +;; +(define_insn "mve_vsbciq_m_v4si" + [(set (match_operand:V4SI 0 "s_register_operand" "=w") + (unspec:V4SI [(match_operand:V4SI 1 "s_register_operand" "w") + (match_operand:V4SI 2 "s_register_operand" "w") + (match_operand:V4SI 3 "s_register_operand" "w") + (match_operand:HI 4 "vpr_register_operand" "Up")] + VSBCIQ_M)) + (set (reg:SI VFPCC_REGNUM) + (unspec:SI [(const_int 0)] + VSBCIQ_M)) + ] + "TARGET_HAVE_MVE" + "vpst\;vsbcit.i32\t%q0, %q2, %q3" + [(set_attr "type" "mve_move") + (set_attr "length" "8")]) + +;; +;; [vsbciq_s, vsbciq_u]) +;; +(define_insn "mve_vsbciq_v4si" + [(set (match_operand:V4SI 0 "s_register_operand" "=w") + (unspec:V4SI [(match_operand:V4SI 1 "s_register_operand" "w") + (match_operand:V4SI 2 "s_register_operand" "w")] + VSBCIQ)) + (set (reg:SI VFPCC_REGNUM) + (unspec:SI [(const_int 0)] + VSBCIQ)) + ] + "TARGET_HAVE_MVE" + "vsbci.i32\t%q0, %q1, %q2" + [(set_attr "type" "mve_move") + (set_attr "length" "4")]) + +;; +;; [vsbcq_m_u, vsbcq_m_s]) +;; +(define_insn "mve_vsbcq_m_v4si" + [(set (match_operand:V4SI 0 "s_register_operand" "=w") + (unspec:V4SI [(match_operand:V4SI 1 "s_register_operand" "w") + (match_operand:V4SI 2 "s_register_operand" "w") + (match_operand:V4SI 3 "s_register_operand" "w") + (match_operand:HI 4 "vpr_register_operand" "Up")] + VSBCQ_M)) + (set (reg:SI VFPCC_REGNUM) + (unspec:SI [(reg:SI VFPCC_REGNUM)] + VSBCQ_M)) + ] + "TARGET_HAVE_MVE" + "vpst\;vsbct.i32\t%q0, %q2, %q3" + [(set_attr "type" "mve_move") + (set_attr "length" "8")]) + +;; +;; [vsbcq_s, vsbcq_u]) +;; +(define_insn "mve_vsbcq_v4si" + [(set (match_operand:V4SI 
0 "s_register_operand" "=w") + (unspec:V4SI [(match_operand:V4SI 1 "s_register_operand" "w") + (match_operand:V4SI 2 "s_register_operand" "w")] + VSBCQ)) + (set (reg:SI VFPCC_REGNUM) + (unspec:SI [(reg:SI VFPCC_REGNUM)] + VSBCQ)) + ] + "TARGET_HAVE_MVE" + "vsbc.i32\t%q0, %q1, %q2" + [(set_attr "type" "mve_move") + (set_attr "length" "4")]) diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vadciq_m_s32.c b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vadciq_m_s32.c new file mode 100644 index 0000000000000000000000000000000000000000..51cedf3c5421241b0a2a1d8473e07172669a2043 --- /dev/null +++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vadciq_m_s32.c @@ -0,0 +1,26 @@ +/* { dg-do compile } */ +/* { dg-additional-options "-march=armv8.1-m.main+mve -mfloat-abi=hard -O2" } */ +/* { dg-skip-if "Skip if not auto" {*-*-*} {"-mfpu=*"} {"-mfpu=auto"} } */ + +#include "arm_mve.h" + +int32x4_t +foo (int32x4_t inactive, int32x4_t a, int32x4_t b, unsigned * carry_out, mve_pred16_t p) +{ + return vadciq_m_s32 (inactive, a, b, carry_out, p); +} + +/* { dg-final { scan-assembler "vpst" } } */ +/* { dg-final { scan-assembler "vadcit.i32" } } */ +/* { dg-final { scan-assembler "vpst" } } */ + +int32x4_t +foo1 (int32x4_t inactive, int32x4_t a, int32x4_t b, unsigned * carry_out, mve_pred16_t p) +{ + return vadciq_m (inactive, a, b, carry_out, p); +} + +/* { dg-final { scan-assembler "vpst" } } */ +/* { dg-final { scan-assembler "vadcit.i32" } } */ +/* { dg-final { scan-assembler "vpst" } } */ +/* { dg-final { scan-assembler-times "mrc" 2 } } */ diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vadciq_m_u32.c b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vadciq_m_u32.c new file mode 100644 index 0000000000000000000000000000000000000000..b46f0a7d44d8bf77105ffcfc59cbb24aa4eaa658 --- /dev/null +++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vadciq_m_u32.c @@ -0,0 +1,26 @@ +/* { dg-do compile } */ +/* { dg-additional-options "-march=armv8.1-m.main+mve -mfloat-abi=hard -O2" } */ +/* { dg-skip-if "Skip if not auto" {*-*-*} {"-mfpu=*"} {"-mfpu=auto"} } */ + +#include "arm_mve.h" + +uint32x4_t +foo (uint32x4_t inactive, uint32x4_t a, uint32x4_t b, unsigned * carry_out, mve_pred16_t p) +{ + return vadciq_m_u32 (inactive, a, b, carry_out, p); +} + +/* { dg-final { scan-assembler "vpst" } } */ +/* { dg-final { scan-assembler "vadcit.i32" } } */ +/* { dg-final { scan-assembler "vpst" } } */ + +uint32x4_t +foo1 (uint32x4_t inactive, uint32x4_t a, uint32x4_t b, unsigned * carry_out, mve_pred16_t p) +{ + return vadciq_m (inactive, a, b, carry_out, p); +} + +/* { dg-final { scan-assembler "vpst" } } */ +/* { dg-final { scan-assembler "vadcit.i32" } } */ +/* { dg-final { scan-assembler "vpst" } } */ +/* { dg-final { scan-assembler-times "mrc" 2 } } */ diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vadciq_s32.c b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vadciq_s32.c new file mode 100644 index 0000000000000000000000000000000000000000..124bff6dbb225edd156390411db2b4e18643f256 --- /dev/null +++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vadciq_s32.c @@ -0,0 +1,22 @@ +/* { dg-do compile } */ +/* { dg-additional-options "-march=armv8.1-m.main+mve -mfloat-abi=hard -O2" } */ +/* { dg-skip-if "Skip if not auto" {*-*-*} {"-mfpu=*"} {"-mfpu=auto"} } */ + +#include "arm_mve.h" + +int32x4_t +foo (int32x4_t a, int32x4_t b, unsigned * carry_out) +{ + return vadciq_s32 (a, b, carry_out); +} + +/* { dg-final { scan-assembler "vadci.i32" } } */ + +int32x4_t +foo1 (int32x4_t a, int32x4_t b, unsigned * carry_out) +{ 
+ return vadciq (a, b, carry_out); +} + +/* { dg-final { scan-assembler "vadci.i32" } } */ +/* { dg-final { scan-assembler-times "mrc" 2 } } */ diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vadciq_u32.c b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vadciq_u32.c new file mode 100644 index 0000000000000000000000000000000000000000..0718570130caa4cb88085f8a11ab92a72522cec7 --- /dev/null +++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vadciq_u32.c @@ -0,0 +1,22 @@ +/* { dg-do compile } */ +/* { dg-additional-options "-march=armv8.1-m.main+mve -mfloat-abi=hard -O2" } */ +/* { dg-skip-if "Skip if not auto" {*-*-*} {"-mfpu=*"} {"-mfpu=auto"} } */ + +#include "arm_mve.h" + +uint32x4_t +foo (uint32x4_t a, uint32x4_t b, unsigned * carry_out) +{ + return vadciq_u32 (a, b, carry_out); +} + +/* { dg-final { scan-assembler "vadci.i32" } } */ + +uint32x4_t +foo1 (uint32x4_t a, uint32x4_t b, unsigned * carry_out) +{ + return vadciq (a, b, carry_out); +} + +/* { dg-final { scan-assembler "vadci.i32" } } */ +/* { dg-final { scan-assembler-times "mrc" 2 } } */ diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vadcq_m_s32.c b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vadcq_m_s32.c new file mode 100644 index 0000000000000000000000000000000000000000..1c2a928d52ff04c7341f043c15123386453c66d5 --- /dev/null +++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vadcq_m_s32.c @@ -0,0 +1,28 @@ +/* { dg-do compile } */ +/* { dg-additional-options "-march=armv8.1-m.main+mve -mfloat-abi=hard -O2" } */ +/* { dg-skip-if "Skip if not auto" {*-*-*} {"-mfpu=*"} {"-mfpu=auto"} } */ + +#include "arm_mve.h" + +int32x4_t +foo (int32x4_t inactive, int32x4_t a, int32x4_t b, unsigned * carry, mve_pred16_t p) +{ + return vadcq_m_s32 (inactive, a, b, carry, p); +} + +/* { dg-final { scan-assembler "vpst" } } */ +/* { dg-final { scan-assembler "vadct.i32" } } */ +/* { dg-final { scan-assembler "vpst" } } */ + +int32x4_t +foo1 (int32x4_t inactive, int32x4_t a, int32x4_t b, unsigned * carry, mve_pred16_t p) +{ + return vadcq_m (inactive, a, b, carry, p); +} + +/* { dg-final { scan-assembler "vpst" } } */ +/* { dg-final { scan-assembler "vadct.i32" } } */ +/* { dg-final { scan-assembler "vpst" } } */ +/* { dg-final { scan-assembler-times "mrc" 4 } } */ +/* { dg-final { scan-assembler "vpst" } } */ +/* { dg-final { scan-assembler-times "mcr" 2 } } */ diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vadcq_m_u32.c b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vadcq_m_u32.c new file mode 100644 index 0000000000000000000000000000000000000000..af38e01b7c8fd180311afbb6ee05c758ec2e636e --- /dev/null +++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vadcq_m_u32.c @@ -0,0 +1,29 @@ +/* { dg-do compile } */ +/* { dg-additional-options "-march=armv8.1-m.main+mve -mfloat-abi=hard -O2" } */ +/* { dg-skip-if "Skip if not auto" {*-*-*} {"-mfpu=*"} {"-mfpu=auto"} } */ + +#include "arm_mve.h" + +uint32x4_t +foo (uint32x4_t inactive, uint32x4_t a, uint32x4_t b, unsigned * carry, mve_pred16_t p) +{ + return vadcq_m_u32 (inactive, a, b, carry, p); +} + +/* { dg-final { scan-assembler "vpst" } } */ +/* { dg-final { scan-assembler "vadct.i32" } } */ +/* { dg-final { scan-assembler "vpst" } } */ +/* { dg-final { scan-assembler "vpst" } } */ + +uint32x4_t +foo1 (uint32x4_t inactive, uint32x4_t a, uint32x4_t b, unsigned * carry, mve_pred16_t p) +{ + return vadcq_m (inactive, a, b, carry, p); +} + +/* { dg-final { scan-assembler "vpst" } } */ +/* { dg-final { scan-assembler "vadct.i32" } } */ +/* { dg-final { scan-assembler "vpst" } } */ 
+/* { dg-final { scan-assembler-times "mrc" 4 } } */ +/* { dg-final { scan-assembler "vpst" } } */ +/* { dg-final { scan-assembler-times "mcr" 2 } } */ diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vadcq_s32.c b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vadcq_s32.c new file mode 100644 index 0000000000000000000000000000000000000000..35be2d6aa2e31628a162bd9456be0844c861bb6a --- /dev/null +++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vadcq_s32.c @@ -0,0 +1,23 @@ +/* { dg-do compile } */ +/* { dg-additional-options "-march=armv8.1-m.main+mve -mfloat-abi=hard -O2" } */ +/* { dg-skip-if "Skip if not auto" {*-*-*} {"-mfpu=*"} {"-mfpu=auto"} } */ + +#include "arm_mve.h" + +int32x4_t +foo (int32x4_t a, int32x4_t b, unsigned * carry) +{ + return vadcq_s32 (a, b, carry); +} + +/* { dg-final { scan-assembler "vadc.i32" } } */ + +int32x4_t +foo1 (int32x4_t a, int32x4_t b, unsigned * carry) +{ + return vadcq (a, b, carry); +} + +/* { dg-final { scan-assembler "vadc.i32" } } */ +/* { dg-final { scan-assembler-times "mrc" 4 } } */ +/* { dg-final { scan-assembler-times "mcr" 2 } } */ diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vadcq_u32.c b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vadcq_u32.c new file mode 100644 index 0000000000000000000000000000000000000000..9a8246318b6fad157cbb9b74379e2f760fe71ff6 --- /dev/null +++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vadcq_u32.c @@ -0,0 +1,23 @@ +/* { dg-do compile } */ +/* { dg-additional-options "-march=armv8.1-m.main+mve -mfloat-abi=hard -O2" } */ +/* { dg-skip-if "Skip if not auto" {*-*-*} {"-mfpu=*"} {"-mfpu=auto"} } */ + +#include "arm_mve.h" + +uint32x4_t +foo (uint32x4_t a, uint32x4_t b, unsigned * carry) +{ + return vadcq_u32 (a, b, carry); +} + +/* { dg-final { scan-assembler "vadc.i32" } } */ + +uint32x4_t +foo1 (uint32x4_t a, uint32x4_t b, unsigned * carry) +{ + return vadcq (a, b, carry); +} + +/* { dg-final { scan-assembler "vadc.i32" } } */ +/* { dg-final { scan-assembler-times "mrc" 4 } } */ +/* { dg-final { scan-assembler-times "mcr" 2 } } */ diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vsbciq_m_s32.c b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vsbciq_m_s32.c new file mode 100644 index 0000000000000000000000000000000000000000..c353a51080d9a15f033562636ffde3cb81e39b36 --- /dev/null +++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vsbciq_m_s32.c @@ -0,0 +1,26 @@ +/* { dg-do compile } */ +/* { dg-additional-options "-march=armv8.1-m.main+mve -mfloat-abi=hard -O2" } */ +/* { dg-skip-if "Skip if not auto" {*-*-*} {"-mfpu=*"} {"-mfpu=auto"} } */ + +#include "arm_mve.h" + +int32x4_t +foo (int32x4_t inactive, int32x4_t a, int32x4_t b, unsigned * carry_out, mve_pred16_t p) +{ + return vsbciq_m_s32 (inactive, a, b, carry_out, p); +} + +/* { dg-final { scan-assembler "vpst" } } */ +/* { dg-final { scan-assembler "vsbcit.i32" } } */ +/* { dg-final { scan-assembler "vpst" } } */ + +int32x4_t +foo1 (int32x4_t inactive, int32x4_t a, int32x4_t b, unsigned * carry_out, mve_pred16_t p) +{ + return vsbciq_m (inactive, a, b, carry_out, p); +} + +/* { dg-final { scan-assembler "vpst" } } */ +/* { dg-final { scan-assembler "vsbcit.i32" } } */ +/* { dg-final { scan-assembler "vpst" } } */ +/* { dg-final { scan-assembler-times "mrc" 2 } } */ diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vsbciq_m_u32.c b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vsbciq_m_u32.c new file mode 100644 index 0000000000000000000000000000000000000000..e2bddb737c866c86a0ba0ac6fb5de1eeb7206c69 --- /dev/null +++ 
b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vsbciq_m_u32.c @@ -0,0 +1,27 @@ +/* { dg-do compile } */ +/* { dg-additional-options "-march=armv8.1-m.main+mve -mfloat-abi=hard -O2" } */ +/* { dg-skip-if "Skip if not auto" {*-*-*} {"-mfpu=*"} {"-mfpu=auto"} } */ + +#include "arm_mve.h" + +uint32x4_t +foo (uint32x4_t inactive, uint32x4_t a, uint32x4_t b, unsigned * carry_out, mve_pred16_t p) +{ + return vsbciq_m_u32 (inactive, a, b, carry_out, p); +} + +/* { dg-final { scan-assembler "vpst" } } */ +/* { dg-final { scan-assembler "vsbcit.i32" } } */ +/* { dg-final { scan-assembler "vpst" } } */ + +uint32x4_t +foo1 (uint32x4_t inactive, uint32x4_t a, uint32x4_t b, unsigned * carry_out, mve_pred16_t p) +{ + return vsbciq_m (inactive, a, b, carry_out, p); +} + +/* { dg-final { scan-assembler "vpst" } } */ +/* { dg-final { scan-assembler "vsbcit.i32" } } */ +/* { dg-final { scan-assembler "vpst" } } */ +/* { dg-final { scan-assembler-times "mrc" 2 } } */ + diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vsbciq_s32.c b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vsbciq_s32.c new file mode 100644 index 0000000000000000000000000000000000000000..db32b9cef100c1ec8b8fa75d946e9423e7dc6669 --- /dev/null +++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vsbciq_s32.c @@ -0,0 +1,22 @@ +/* { dg-do compile } */ +/* { dg-additional-options "-march=armv8.1-m.main+mve -mfloat-abi=hard -O2" } */ +/* { dg-skip-if "Skip if not auto" {*-*-*} {"-mfpu=*"} {"-mfpu=auto"} } */ + +#include "arm_mve.h" + +int32x4_t +foo (int32x4_t a, int32x4_t b, unsigned * carry_out) +{ + return vsbciq_s32 (a, b, carry_out); +} + +/* { dg-final { scan-assembler "vsbci.i32" } } */ + +int32x4_t +foo1 (int32x4_t a, int32x4_t b, unsigned * carry_out) +{ + return vsbciq_s32 (a, b, carry_out); +} + +/* { dg-final { scan-assembler "vsbci.i32" } } */ +/* { dg-final { scan-assembler-times "mrc" 2 } } */ diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vsbciq_u32.c b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vsbciq_u32.c new file mode 100644 index 0000000000000000000000000000000000000000..60f213d9a631d7d12f8ab7dfe58e3c251b9e8854 --- /dev/null +++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vsbciq_u32.c @@ -0,0 +1,22 @@ +/* { dg-do compile } */ +/* { dg-additional-options "-march=armv8.1-m.main+mve -mfloat-abi=hard -O2" } */ +/* { dg-skip-if "Skip if not auto" {*-*-*} {"-mfpu=*"} {"-mfpu=auto"} } */ + +#include "arm_mve.h" + +uint32x4_t +foo (uint32x4_t a, uint32x4_t b, unsigned * carry_out) +{ + return vsbciq_u32 (a, b, carry_out); +} + +/* { dg-final { scan-assembler "vsbci.i32" } } */ + +uint32x4_t +foo1 (uint32x4_t a, uint32x4_t b, unsigned * carry_out) +{ + return vsbciq_u32 (a, b, carry_out); +} + +/* { dg-final { scan-assembler "vsbci.i32" } } */ +/* { dg-final { scan-assembler-times "mrc" 2 } } */ diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vsbcq_m_s32.c b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vsbcq_m_s32.c new file mode 100644 index 0000000000000000000000000000000000000000..4ab7e2f1f69ade2189c44bc9bcd6243dc3cd9d68 --- /dev/null +++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vsbcq_m_s32.c @@ -0,0 +1,19 @@ +/* { dg-do compile } */ +/* { dg-additional-options "-march=armv8.1-m.main+mve -mfloat-abi=hard -O2" } */ +/* { dg-skip-if "Skip if not auto" {*-*-*} {"-mfpu=*"} {"-mfpu=auto"} } */ + +#include "arm_mve.h" + +int32x4_t +foo (int32x4_t inactive, int32x4_t a, int32x4_t b, unsigned * carry, mve_pred16_t p) +{ + return vsbcq_m_s32 (inactive, a, b, carry, p); +} + +/* { dg-final { scan-assembler "vpst" 
} } */ +/* { dg-final { scan-assembler "vsbct.i32" } } */ +/* { dg-final { scan-assembler "vpst" } } */ +/* { dg-final { scan-assembler-times "mrc" 2 } } */ +/* { dg-final { scan-assembler "vpst" } } */ +/* { dg-final { scan-assembler-times "mcr" 1 } } */ + diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vsbcq_m_u32.c b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vsbcq_m_u32.c new file mode 100644 index 0000000000000000000000000000000000000000..da2edac3d95d5405226bd0b85a9dce42ad62fd9b --- /dev/null +++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vsbcq_m_u32.c @@ -0,0 +1,19 @@ +/* { dg-do compile } */ +/* { dg-additional-options "-march=armv8.1-m.main+mve -mfloat-abi=hard -O2" } */ +/* { dg-skip-if "Skip if not auto" {*-*-*} {"-mfpu=*"} {"-mfpu=auto"} } */ + +#include "arm_mve.h" + +uint32x4_t +foo (uint32x4_t inactive, uint32x4_t a, uint32x4_t b, unsigned * carry, mve_pred16_t p) +{ + return vsbcq_m_u32 (inactive, a, b, carry, p); +} + +/* { dg-final { scan-assembler "vpst" } } */ +/* { dg-final { scan-assembler "vsbct.i32" } } */ +/* { dg-final { scan-assembler "vpst" } } */ +/* { dg-final { scan-assembler-times "mrc" 2 } } */ +/* { dg-final { scan-assembler "vpst" } } */ +/* { dg-final { scan-assembler-times "mcr" 1 } } */ + diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vsbcq_s32.c b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vsbcq_s32.c new file mode 100644 index 0000000000000000000000000000000000000000..9c3e6e99f5938d523d5dee127300031ddc09ee33 --- /dev/null +++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vsbcq_s32.c @@ -0,0 +1,24 @@ +/* { dg-do compile } */ +/* { dg-additional-options "-march=armv8.1-m.main+mve -mfloat-abi=hard -O2" } */ +/* { dg-skip-if "Skip if not auto" {*-*-*} {"-mfpu=*"} {"-mfpu=auto"} } */ + +#include "arm_mve.h" + +int32x4_t +foo (int32x4_t a, int32x4_t b, unsigned * carry) +{ + return vsbcq_s32 (a, b, carry); +} + +/* { dg-final { scan-assembler "vsbc.i32" } } */ + +int32x4_t +foo1 (int32x4_t a, int32x4_t b, unsigned * carry) +{ + return vsbcq (a, b, carry); +} + +/* { dg-final { scan-assembler "vsbc.i32" } } */ +/* { dg-final { scan-assembler-times "mrc" 4 } } */ +/* { dg-final { scan-assembler-times "mcr" 2 } } */ + diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vsbcq_u32.c b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vsbcq_u32.c new file mode 100644 index 0000000000000000000000000000000000000000..122b20c41b34c906c00b9914fe152195a3739118 --- /dev/null +++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vsbcq_u32.c @@ -0,0 +1,23 @@ +/* { dg-do compile } */ +/* { dg-additional-options "-march=armv8.1-m.main+mve -mfloat-abi=hard -O2" } */ +/* { dg-skip-if "Skip if not auto" {*-*-*} {"-mfpu=*"} {"-mfpu=auto"} } */ + +#include "arm_mve.h" + +uint32x4_t +foo (uint32x4_t a, uint32x4_t b, unsigned * carry) +{ + return vsbcq_u32 (a, b, carry); +} + +/* { dg-final { scan-assembler "vsbc.i32" } } */ + +uint32x4_t +foo1 (uint32x4_t a, uint32x4_t b, unsigned * carry) +{ + return vsbcq (a, b, carry); +} + +/* { dg-final { scan-assembler "vsbc.i32" } } */ +/* { dg-final { scan-assembler-times "mrc" 4 } } */ +/* { dg-final { scan-assembler-times "mcr" 2 } } */
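[Editorial illustration -- not part of the patch.]  For completeness, the subtract-with-carry half can be chained in the same way as the addition sketch earlier.  This assumes the usual Arm borrow convention (carry == 1 means "no borrow", so the vsbciq form starts with an implicit borrow-in of 0); u256 and sub_u256 are invented names, and the code is only expected to compile with the options used in the tests above (-march=armv8.1-m.main+mve -mfloat-abi=hard -O2).

  #include "arm_mve.h"

  typedef struct { uint32x4_t lo; uint32x4_t hi; } u256;

  /* 256-bit unsigned subtraction: x - y.  */
  static inline u256
  sub_u256 (u256 x, u256 y, unsigned *borrow_out)
  {
    u256 r;
    unsigned carry;

    r.lo = vsbciq_u32 (x.lo, y.lo, &carry);   /* low limbs, borrow-in = 0 */
    r.hi = vsbcq_u32 (x.hi, y.hi, &carry);    /* high limbs, consume the carry */

    *borrow_out = carry ? 0u : 1u;            /* carry clear means x < y */
    return r;
  }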