From patchwork Tue Mar 31 16:13:22 2020
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Srinath Parvathaneni
X-Patchwork-Id: 1264840
From: Srinath Parvathaneni
Date: Tue, 31 Mar 2020 17:13:22 +0100
To: gcc-patches@gcc.gnu.org
Cc: Richard.Earnshaw@arm.com
Subject: [GCC][PATCH][ARM]: Fix for MVE ACLE intrinsics with writeback (PR94317).

Hello,

The following MVE ACLE intrinsics have an issue with writeback to the base
address: the updated base address vector is never stored back through the
__addr pointer argument.

vldrdq_gather_base_wb_s64, vldrdq_gather_base_wb_u64,
vldrdq_gather_base_wb_z_s64, vldrdq_gather_base_wb_z_u64,
vldrwq_gather_base_wb_s32, vldrwq_gather_base_wb_u32,
vldrwq_gather_base_wb_z_s32, vldrwq_gather_base_wb_z_u32,
vldrwq_gather_base_wb_f32, vldrwq_gather_base_wb_z_f32.

This patch fixes the bug reported in PR94317 by adding separate builtin calls
for the above intrinsics: one to produce the result and one to write back the
updated base address.

Please refer to M-profile Vector Extension (MVE) intrinsics [1] for more details.
[1] https://developer.arm.com/architectures/instruction-sets/simd-isas/helium/mve-intrinsics

Regression tested on arm-none-eabi and found no regressions.

Ok for trunk?

Thanks,
Srinath.

gcc/ChangeLog:

2020-03-31  Srinath Parvathaneni

	PR target/94317
	* config/arm/arm-builtins.c (LDRGBWBXU_QUALIFIERS): Define.
	(LDRGBWBXU_Z_QUALIFIERS): Likewise.
	* config/arm/arm_mve.h (__arm_vldrdq_gather_base_wb_s64): Modify
	intrinsic definition by adding a new builtin call to write back into
	the base address.
	(__arm_vldrdq_gather_base_wb_u64): Likewise.
	(__arm_vldrdq_gather_base_wb_z_s64): Likewise.
	(__arm_vldrdq_gather_base_wb_z_u64): Likewise.
	(__arm_vldrwq_gather_base_wb_s32): Likewise.
	(__arm_vldrwq_gather_base_wb_u32): Likewise.
	(__arm_vldrwq_gather_base_wb_z_s32): Likewise.
	(__arm_vldrwq_gather_base_wb_z_u32): Likewise.
	(__arm_vldrwq_gather_base_wb_f32): Likewise.
	(__arm_vldrwq_gather_base_wb_z_f32): Likewise.
	* config/arm/arm_mve_builtins.def (vldrwq_gather_base_wb_z_u): Modify
	builtin's qualifier.
	(vldrdq_gather_base_wb_z_u): Likewise.
	(vldrwq_gather_base_wb_u): Likewise.
	(vldrdq_gather_base_wb_u): Likewise.
	(vldrwq_gather_base_wb_z_s): Likewise.
	(vldrwq_gather_base_wb_z_f): Likewise.
	(vldrdq_gather_base_wb_z_s): Likewise.
	(vldrwq_gather_base_wb_s): Likewise.
	(vldrwq_gather_base_wb_f): Likewise.
	(vldrdq_gather_base_wb_s): Likewise.
	(vldrwq_gather_base_nowb_z_u): Define builtin.
	(vldrdq_gather_base_nowb_z_u): Likewise.
	(vldrwq_gather_base_nowb_u): Likewise.
	(vldrdq_gather_base_nowb_u): Likewise.
	(vldrwq_gather_base_nowb_z_s): Likewise.
	(vldrwq_gather_base_nowb_z_f): Likewise.
	(vldrdq_gather_base_nowb_z_s): Likewise.
	(vldrwq_gather_base_nowb_s): Likewise.
	(vldrwq_gather_base_nowb_f): Likewise.
	(vldrdq_gather_base_nowb_s): Likewise.
	* config/arm/mve.md (mve_vldrwq_gather_base_nowb_v4si): Define RTL
	pattern.
	(mve_vldrwq_gather_base_wb_v4si): Modify RTL pattern.
	(mve_vldrwq_gather_base_nowb_z_v4si): Define RTL pattern.
	(mve_vldrwq_gather_base_wb_z_v4si): Modify RTL pattern.
	(mve_vldrwq_gather_base_wb_fv4sf): Modify RTL pattern.
	(mve_vldrwq_gather_base_nowb_fv4sf): Define RTL pattern.
	(mve_vldrwq_gather_base_wb_z_fv4sf): Modify RTL pattern.
	(mve_vldrwq_gather_base_nowb_z_fv4sf): Define RTL pattern.
	(mve_vldrdq_gather_base_nowb_v2di): Define RTL pattern.
	(mve_vldrdq_gather_base_wb_v2di): Modify RTL pattern.
	(mve_vldrdq_gather_base_nowb_z_v2di): Define RTL pattern.
	(mve_vldrdq_gather_base_wb_z_v2di): Modify RTL pattern.

gcc/testsuite/ChangeLog:

2020-03-31  Srinath Parvathaneni

	PR target/94317
	* gcc.target/arm/mve/intrinsics/vldrdq_gather_base_wb_s64.c: Modify.
	* gcc.target/arm/mve/intrinsics/vldrdq_gather_base_wb_u64.c: Likewise.
	* gcc.target/arm/mve/intrinsics/vldrdq_gather_base_wb_z_s64.c: Likewise.
	* gcc.target/arm/mve/intrinsics/vldrdq_gather_base_wb_z_u64.c: Likewise.
	* gcc.target/arm/mve/intrinsics/vldrwq_gather_base_wb_f32.c: Likewise.
	* gcc.target/arm/mve/intrinsics/vldrwq_gather_base_wb_s32.c: Likewise.
	* gcc.target/arm/mve/intrinsics/vldrwq_gather_base_wb_u32.c: Likewise.
	* gcc.target/arm/mve/intrinsics/vldrwq_gather_base_wb_z_f32.c: Likewise.
	* gcc.target/arm/mve/intrinsics/vldrwq_gather_base_wb_z_s32.c: Likewise.
	* gcc.target/arm/mve/intrinsics/vldrwq_gather_base_wb_z_u32.c: Likewise.

###############     Attachment also inlined for ease of reply    ###############

diff --git a/gcc/config/arm/arm-builtins.c b/gcc/config/arm/arm-builtins.c
index 56f0db21ea95dcd738877daba27f1cb60f0d5a32..832b9107424fd9a4a0ee272b773b3d0929172370 100644
--- a/gcc/config/arm/arm-builtins.c
+++ b/gcc/config/arm/arm-builtins.c
@@ -719,6 +719,17 @@ arm_quinop_unone_unone_unone_unone_imm_unone_qualifiers[SIMD_MAX_BUILTIN_ARGS]
   (arm_quinop_unone_unone_unone_unone_imm_unone_qualifiers)
 
 static enum arm_type_qualifiers
+arm_ldrgbwbxu_qualifiers[SIMD_MAX_BUILTIN_ARGS]
+  = { qualifier_unsigned, qualifier_unsigned, qualifier_immediate};
+#define LDRGBWBXU_QUALIFIERS (arm_ldrgbwbxu_qualifiers)
+
+static enum arm_type_qualifiers
+arm_ldrgbwbxu_z_qualifiers[SIMD_MAX_BUILTIN_ARGS]
+  = { qualifier_unsigned, qualifier_unsigned, qualifier_immediate,
+      qualifier_unsigned};
+#define LDRGBWBXU_Z_QUALIFIERS (arm_ldrgbwbxu_z_qualifiers)
+
+static enum arm_type_qualifiers
 arm_ldrgbwbs_qualifiers[SIMD_MAX_BUILTIN_ARGS]
   = { qualifier_none, qualifier_unsigned, qualifier_immediate};
 #define LDRGBWBS_QUALIFIERS (arm_ldrgbwbs_qualifiers)
diff --git a/gcc/config/arm/arm_mve.h b/gcc/config/arm/arm_mve.h
index f1dcdc2153217e796c58526ba0e5be11be642234..47a6268e0800958f49d46238fe34ec749d243929 100644
--- a/gcc/config/arm/arm_mve.h
+++ b/gcc/config/arm/arm_mve.h
@@ -13903,8 +13903,8 @@ __attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
 __arm_vldrdq_gather_base_wb_s64 (uint64x2_t * __addr, const int __offset)
 {
   int64x2_t
-  result = __builtin_mve_vldrdq_gather_base_wb_sv2di (*__addr, __offset);
-  __addr += __offset;
+  result = __builtin_mve_vldrdq_gather_base_nowb_sv2di (*__addr, __offset);
+  *__addr = __builtin_mve_vldrdq_gather_base_wb_sv2di (*__addr, __offset);
   return result;
 }
 
@@ -13913,8 +13913,8 @@ __attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
 __arm_vldrdq_gather_base_wb_u64 (uint64x2_t * __addr, const int __offset)
 {
   uint64x2_t
-  result = __builtin_mve_vldrdq_gather_base_wb_uv2di (*__addr, __offset);
-  __addr += __offset;
+  result = __builtin_mve_vldrdq_gather_base_nowb_uv2di (*__addr, __offset);
+  *__addr = __builtin_mve_vldrdq_gather_base_wb_uv2di (*__addr, __offset);
   return result;
 }
 
@@ -13923,8 +13923,8 @@ __attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
 __arm_vldrdq_gather_base_wb_z_s64 (uint64x2_t * __addr, const int __offset, mve_pred16_t __p)
 {
   int64x2_t
-  result = __builtin_mve_vldrdq_gather_base_wb_z_sv2di (*__addr, __offset, __p);
-  __addr += __offset;
+  result = __builtin_mve_vldrdq_gather_base_nowb_z_sv2di (*__addr, __offset, __p);
+  *__addr = __builtin_mve_vldrdq_gather_base_wb_z_sv2di (*__addr, __offset, __p);
   return result;
 }
 
@@ -13933,8 +13933,8 @@ __attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
 __arm_vldrdq_gather_base_wb_z_u64 (uint64x2_t * __addr, const int __offset, mve_pred16_t __p)
 {
   uint64x2_t
-  result = __builtin_mve_vldrdq_gather_base_wb_z_uv2di (*__addr, __offset, __p);
-  __addr += __offset;
+  result = __builtin_mve_vldrdq_gather_base_nowb_z_uv2di (*__addr, __offset, __p);
+  *__addr = __builtin_mve_vldrdq_gather_base_wb_z_uv2di (*__addr, __offset, __p);
   return result;
 }
 
@@ -13943,8 +13943,8 @@ __attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
 __arm_vldrwq_gather_base_wb_s32 (uint32x4_t * __addr, const int __offset)
 {
   int32x4_t
-  result = __builtin_mve_vldrwq_gather_base_wb_sv4si (*__addr, __offset);
-  __addr += __offset;
+  result = __builtin_mve_vldrwq_gather_base_nowb_sv4si (*__addr, __offset);
+  *__addr = __builtin_mve_vldrwq_gather_base_wb_sv4si (*__addr, __offset);
   return result;
 }
 
@@ -13953,8 +13953,8 @@ __attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
 __arm_vldrwq_gather_base_wb_u32 (uint32x4_t * __addr, const int __offset)
 {
   uint32x4_t
-  result = __builtin_mve_vldrwq_gather_base_wb_uv4si (*__addr, __offset);
-  __addr += __offset;
+  result = __builtin_mve_vldrwq_gather_base_nowb_uv4si (*__addr, __offset);
+  *__addr = __builtin_mve_vldrwq_gather_base_wb_uv4si (*__addr, __offset);
   return result;
 }
 
@@ -13963,8 +13963,8 @@ __attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
 __arm_vldrwq_gather_base_wb_z_s32 (uint32x4_t * __addr, const int __offset, mve_pred16_t __p)
 {
   int32x4_t
-  result = __builtin_mve_vldrwq_gather_base_wb_z_sv4si (*__addr, __offset, __p);
-  __addr += __offset;
+  result = __builtin_mve_vldrwq_gather_base_nowb_z_sv4si (*__addr, __offset, __p);
+  *__addr = __builtin_mve_vldrwq_gather_base_wb_z_sv4si (*__addr, __offset, __p);
   return result;
 }
 
@@ -13973,8 +13973,8 @@ __attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
 __arm_vldrwq_gather_base_wb_z_u32 (uint32x4_t * __addr, const int __offset, mve_pred16_t __p)
 {
   uint32x4_t
-  result = __builtin_mve_vldrwq_gather_base_wb_z_uv4si (*__addr, __offset, __p);
-  __addr += __offset;
+  result = __builtin_mve_vldrwq_gather_base_nowb_z_uv4si (*__addr, __offset, __p);
+  *__addr = __builtin_mve_vldrwq_gather_base_wb_z_uv4si (*__addr, __offset, __p);
   return result;
 }
 
@@ -19372,8 +19372,8 @@ __attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
 __arm_vldrwq_gather_base_wb_f32 (uint32x4_t * __addr, const int __offset)
 {
   float32x4_t
-  result = __builtin_mve_vldrwq_gather_base_wb_fv4sf (*__addr, __offset);
-  __addr += __offset;
+  result = __builtin_mve_vldrwq_gather_base_nowb_fv4sf (*__addr, __offset);
+  *__addr = __builtin_mve_vldrwq_gather_base_wb_fv4sf (*__addr, __offset);
   return result;
 }
 
@@ -19382,8 +19382,8 @@ __attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
 __arm_vldrwq_gather_base_wb_z_f32 (uint32x4_t * __addr, const int __offset, mve_pred16_t __p)
 {
   float32x4_t
-  result = __builtin_mve_vldrwq_gather_base_wb_z_fv4sf (*__addr, __offset, __p);
-  __addr += __offset;
+  result = __builtin_mve_vldrwq_gather_base_nowb_z_fv4sf (*__addr, __offset, __p);
+  *__addr = __builtin_mve_vldrwq_gather_base_wb_z_fv4sf (*__addr, __offset, __p);
   return result;
 }
 
diff --git a/gcc/config/arm/arm_mve_builtins.def b/gcc/config/arm/arm_mve_builtins.def
index 2fb975944b9fdac9de4b5a1bec3962be410637f1..753e40a951d071c1ab77476a1cc4779e91689178 100644
--- a/gcc/config/arm/arm_mve_builtins.def
+++ b/gcc/config/arm/arm_mve_builtins.def
@@ -847,16 +847,26 @@ VAR1 (STRSBWBS, vstrdq_scatter_base_wb_s, v2di)
 VAR1 (STRSBWBS_P, vstrwq_scatter_base_wb_p_s, v4si)
 VAR1 (STRSBWBS_P, vstrwq_scatter_base_wb_p_f, v4sf)
 VAR1 (STRSBWBS_P, vstrdq_scatter_base_wb_p_s, v2di)
-VAR1 (LDRGBWBU_Z, vldrwq_gather_base_wb_z_u, v4si)
-VAR1 (LDRGBWBU_Z, vldrdq_gather_base_wb_z_u, v2di)
-VAR1 (LDRGBWBU, vldrwq_gather_base_wb_u, v4si)
-VAR1 (LDRGBWBU, vldrdq_gather_base_wb_u, v2di)
-VAR1 (LDRGBWBS_Z, vldrwq_gather_base_wb_z_s, v4si)
-VAR1 (LDRGBWBS_Z, vldrwq_gather_base_wb_z_f, v4sf)
-VAR1 (LDRGBWBS_Z, vldrdq_gather_base_wb_z_s, v2di)
-VAR1 (LDRGBWBS, vldrwq_gather_base_wb_s, v4si)
-VAR1 (LDRGBWBS, vldrwq_gather_base_wb_f, v4sf)
-VAR1 (LDRGBWBS, vldrdq_gather_base_wb_s, v2di)
+VAR1 (LDRGBWBU_Z, vldrwq_gather_base_nowb_z_u, v4si)
+VAR1 (LDRGBWBU_Z, vldrdq_gather_base_nowb_z_u, v2di)
+VAR1 (LDRGBWBU, vldrwq_gather_base_nowb_u, v4si)
+VAR1 (LDRGBWBU, vldrdq_gather_base_nowb_u, v2di)
+VAR1 (LDRGBWBS_Z, vldrwq_gather_base_nowb_z_s, v4si)
+VAR1 (LDRGBWBS_Z, vldrwq_gather_base_nowb_z_f, v4sf)
+VAR1 (LDRGBWBS_Z, vldrdq_gather_base_nowb_z_s, v2di)
+VAR1 (LDRGBWBS, vldrwq_gather_base_nowb_s, v4si)
+VAR1 (LDRGBWBS, vldrwq_gather_base_nowb_f, v4sf)
+VAR1 (LDRGBWBS, vldrdq_gather_base_nowb_s, v2di)
+VAR1 (LDRGBWBXU_Z, vldrdq_gather_base_wb_z_s, v2di)
+VAR1 (LDRGBWBXU_Z, vldrdq_gather_base_wb_z_u, v2di)
+VAR1 (LDRGBWBXU, vldrdq_gather_base_wb_s, v2di)
+VAR1 (LDRGBWBXU, vldrdq_gather_base_wb_u, v2di)
+VAR1 (LDRGBWBXU_Z, vldrwq_gather_base_wb_z_s, v4si)
+VAR1 (LDRGBWBXU_Z, vldrwq_gather_base_wb_z_f, v4sf)
+VAR1 (LDRGBWBXU_Z, vldrwq_gather_base_wb_z_u, v4si)
+VAR1 (LDRGBWBXU, vldrwq_gather_base_wb_s, v4si)
+VAR1 (LDRGBWBXU, vldrwq_gather_base_wb_f, v4sf)
+VAR1 (LDRGBWBXU, vldrwq_gather_base_wb_u, v4si)
 VAR1 (BINOP_NONE_NONE_NONE, vadciq_s, v4si)
 VAR1 (BINOP_UNONE_UNONE_UNONE, vadciq_u, v4si)
 VAR1 (BINOP_NONE_NONE_NONE, vadcq_s, v4si)
diff --git a/gcc/config/arm/mve.md b/gcc/config/arm/mve.md
index df602b07840bb4ccb9aa2a9b10992ba7078452ba..d1028f4542b4972b4080e46544c86d625d77383a 100644
--- a/gcc/config/arm/mve.md
+++ b/gcc/config/arm/mve.md
@@ -10420,6 +10420,20 @@
    (unspec:V4SI [(const_int 0)] VLDRWGBWBQ)]
   "TARGET_HAVE_MVE"
 {
+  rtx ignore_result = gen_reg_rtx (V4SImode);
+  emit_insn (
+  gen_mve_vldrwq_gather_base_wb_v4si_insn (ignore_result, operands[0],
+                                           operands[1], operands[2]));
+  DONE;
+})
+
+(define_expand "mve_vldrwq_gather_base_nowb_v4si"
+  [(match_operand:V4SI 0 "s_register_operand")
+   (match_operand:V4SI 1 "s_register_operand")
+   (match_operand:SI 2 "mve_vldrd_immediate")
+   (unspec:V4SI [(const_int 0)] VLDRWGBWBQ)]
+  "TARGET_HAVE_MVE"
+{
   rtx ignore_wb = gen_reg_rtx (V4SImode);
   emit_insn (
   gen_mve_vldrwq_gather_base_wb_v4si_insn (operands[0], ignore_wb,
@@ -10459,6 +10473,21 @@
    (unspec:V4SI [(const_int 0)] VLDRWGBWBQ)]
   "TARGET_HAVE_MVE"
 {
+  rtx ignore_result = gen_reg_rtx (V4SImode);
+  emit_insn (
+  gen_mve_vldrwq_gather_base_wb_z_v4si_insn (ignore_result, operands[0],
+                                             operands[1], operands[2],
+                                             operands[3]));
+  DONE;
+})
+(define_expand "mve_vldrwq_gather_base_nowb_z_v4si"
+  [(match_operand:V4SI 0 "s_register_operand")
+   (match_operand:V4SI 1 "s_register_operand")
+   (match_operand:SI 2 "mve_vldrd_immediate")
+   (match_operand:HI 3 "vpr_register_operand")
+   (unspec:V4SI [(const_int 0)] VLDRWGBWBQ)]
+  "TARGET_HAVE_MVE"
+{
   rtx ignore_wb = gen_reg_rtx (V4SImode);
   emit_insn (
   gen_mve_vldrwq_gather_base_wb_z_v4si_insn (operands[0], ignore_wb,
@@ -10487,12 +10516,26 @@
    ops[0] = operands[0];
    ops[1] = operands[2];
    ops[2] = operands[3];
-   output_asm_insn ("vpst\;\tvldrwt.u32\t%q0, [%q1, %2]!",ops);
+   output_asm_insn ("vpst\;vldrwt.u32\t%q0, [%q1, %2]!",ops);
    return "";
 }
   [(set_attr "length" "8")])
 
 (define_expand "mve_vldrwq_gather_base_wb_fv4sf"
+  [(match_operand:V4SI 0 "s_register_operand")
+   (match_operand:V4SI 1 "s_register_operand")
+   (match_operand:SI 2 "mve_vldrd_immediate")
+   (unspec:V4SI [(const_int 0)] VLDRWQGBWB_F)]
+  "TARGET_HAVE_MVE && TARGET_HAVE_MVE_FLOAT"
+{
+  rtx ignore_result = gen_reg_rtx (V4SFmode);
+  emit_insn (
+  gen_mve_vldrwq_gather_base_wb_fv4sf_insn (ignore_result, operands[0],
+                                            operands[1], operands[2]));
+  DONE;
+})
+
+(define_expand "mve_vldrwq_gather_base_nowb_fv4sf"
   [(match_operand:V4SF 0 "s_register_operand")
    (match_operand:V4SI 1 "s_register_operand")
    (match_operand:SI 2 "mve_vldrd_immediate")
@@ -10531,6 +10574,22 @@
   [(set_attr "length" "4")])
 
 (define_expand "mve_vldrwq_gather_base_wb_z_fv4sf"
+  [(match_operand:V4SI 0 "s_register_operand")
+   (match_operand:V4SI 1 "s_register_operand")
+   (match_operand:SI 2 "mve_vldrd_immediate")
+   (match_operand:HI 3 "vpr_register_operand")
+   (unspec:V4SI [(const_int 0)] VLDRWQGBWB_F)]
+  "TARGET_HAVE_MVE && TARGET_HAVE_MVE_FLOAT"
+{
+  rtx ignore_result = gen_reg_rtx (V4SFmode);
+  emit_insn (
+  gen_mve_vldrwq_gather_base_wb_z_fv4sf_insn (ignore_result, operands[0],
+                                              operands[1], operands[2],
+                                              operands[3]));
+  DONE;
+})
+
+(define_expand "mve_vldrwq_gather_base_nowb_z_fv4sf"
   [(match_operand:V4SF 0 "s_register_operand")
    (match_operand:V4SI 1 "s_register_operand")
    (match_operand:SI 2 "mve_vldrd_immediate")
@@ -10566,7 +10625,7 @@
    ops[0] = operands[0];
    ops[1] = operands[2];
    ops[2] = operands[3];
-   output_asm_insn ("vpst\;\tvldrwt.u32\t%q0, [%q1, %2]!",ops);
+   output_asm_insn ("vpst\;vldrwt.u32\t%q0, [%q1, %2]!",ops);
    return "";
 }
   [(set_attr "length" "8")])
@@ -10578,6 +10637,20 @@
    (unspec:V2DI [(const_int 0)] VLDRDGBWBQ)]
   "TARGET_HAVE_MVE"
 {
+  rtx ignore_result = gen_reg_rtx (V2DImode);
+  emit_insn (
+  gen_mve_vldrdq_gather_base_wb_v2di_insn (ignore_result, operands[0],
+                                           operands[1], operands[2]));
+  DONE;
+})
+
+(define_expand "mve_vldrdq_gather_base_nowb_v2di"
+  [(match_operand:V2DI 0 "s_register_operand")
+   (match_operand:V2DI 1 "s_register_operand")
+   (match_operand:SI 2 "mve_vldrd_immediate")
+   (unspec:V2DI [(const_int 0)] VLDRDGBWBQ)]
+  "TARGET_HAVE_MVE"
+{
   rtx ignore_wb = gen_reg_rtx (V2DImode);
   emit_insn (
   gen_mve_vldrdq_gather_base_wb_v2di_insn (operands[0], ignore_wb,
@@ -10585,6 +10658,7 @@
   DONE;
 })
 
+
 ;;
 ;; [vldrdq_gather_base_wb_s vldrdq_gather_base_wb_u]
 ;;
@@ -10617,6 +10691,22 @@
    (unspec:V2DI [(const_int 0)] VLDRDGBWBQ)]
   "TARGET_HAVE_MVE"
 {
+  rtx ignore_result = gen_reg_rtx (V2DImode);
+  emit_insn (
+  gen_mve_vldrdq_gather_base_wb_z_v2di_insn (ignore_result, operands[0],
+                                             operands[1], operands[2],
+                                             operands[3]));
+  DONE;
+})
+
+(define_expand "mve_vldrdq_gather_base_nowb_z_v2di"
+  [(match_operand:V2DI 0 "s_register_operand")
+   (match_operand:V2DI 1 "s_register_operand")
+   (match_operand:SI 2 "mve_vldrd_immediate")
+   (match_operand:HI 3 "vpr_register_operand")
+   (unspec:V2DI [(const_int 0)] VLDRDGBWBQ)]
+  "TARGET_HAVE_MVE"
+{
   rtx ignore_wb = gen_reg_rtx (V2DImode);
   emit_insn (
   gen_mve_vldrdq_gather_base_wb_z_v2di_insn (operands[0], ignore_wb,
@@ -10660,7 +10750,7 @@
    ops[0] = operands[0];
    ops[1] = operands[2];
    ops[2] = operands[3];
-   output_asm_insn ("vpst\;\tvldrdt.u64\t%q0, [%q1, %2]!",ops);
+   output_asm_insn ("vpst\;vldrdt.u64\t%q0, [%q1, %2]!",ops);
    return "";
 }
   [(set_attr "length" "8")])
diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vldrdq_gather_base_wb_s64.c b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vldrdq_gather_base_wb_s64.c
index a5c5a61345cb0a46abc7796ceff195698cabe804..0d1ee769ec64b55c7559ce9dc14f8a6ae2e43e34 100644
--- a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vldrdq_gather_base_wb_s64.c
+++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vldrdq_gather_base_wb_s64.c
@@ -10,4 +10,6 @@ foo (uint64x2_t * addr)
   return vldrdq_gather_base_wb_s64 (addr, 8);
 }
 
-/* { dg-final { scan-assembler "vldrd.64" } } */
+/* { dg-final { scan-assembler "vldrb.8 q\[0-9\]+, \\\[r\[0-9\]+\\\]" } } */
+/* { dg-final { scan-assembler "vldrd.64\tq\[0-9\]+, \\\[q\[0-9\]+, #\[0-9\]+\\\]!" } } */
+/* { dg-final { scan-assembler "vstrb.8 q\[0-9\]+, \\\[r\[0-9\]+\\\]" } } */
diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vldrdq_gather_base_wb_u64.c b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vldrdq_gather_base_wb_u64.c
index 442bca92a43c05124717bf6ea0c44672941091f0..cb2a41bdcd32b553a93d3bcc4787d506f1b54f74 100644
--- a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vldrdq_gather_base_wb_u64.c
+++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vldrdq_gather_base_wb_u64.c
@@ -10,4 +10,6 @@ foo (uint64x2_t * addr)
   return vldrdq_gather_base_wb_u64 (addr, 8);
 }
 
-/* { dg-final { scan-assembler "vldrd.64" } } */
+/* { dg-final { scan-assembler "vldrb.8 q\[0-9\]+, \\\[r\[0-9\]+\\\]" } } */
+/* { dg-final { scan-assembler "vldrd.64\tq\[0-9\]+, \\\[q\[0-9\]+, #\[0-9\]+\\\]!" } } */
+/* { dg-final { scan-assembler "vstrb.8 q\[0-9\]+, \\\[r\[0-9\]+\\\]" } } */
diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vldrdq_gather_base_wb_z_s64.c b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vldrdq_gather_base_wb_z_s64.c
index 1863d0835e12328b7b7bb824f59e3d441042f56d..243fbeacc3429025202da2ff157ade38a472e123 100644
--- a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vldrdq_gather_base_wb_z_s64.c
+++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vldrdq_gather_base_wb_z_s64.c
@@ -8,4 +8,8 @@ int64x2_t foo (uint64x2_t * addr, mve_pred16_t p)
   return vldrdq_gather_base_wb_z_s64 (addr, 1016, p);
 }
 
-/* { dg-final { scan-assembler "vldrdt.u64" } } */
+/* { dg-final { scan-assembler "vldrb.8 q\[0-9\]+, \\\[r\[0-9\]+\\\]" } } */
+/* { dg-final { scan-assembler "vmsr\t P0, r\[0-9\]+.*$" } } */
+/* { dg-final { scan-assembler "vpst" } } */
+/* { dg-final { scan-assembler "vldrdt.u64\tq\[0-9\]+, \\\[q\[0-9\]+, #\[0-9\]+\\\]!" } } */
+/* { dg-final { scan-assembler "vstrb.8 q\[0-9\]+, \\\[r\[0-9\]+\\\]" } } */
diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vldrdq_gather_base_wb_z_u64.c b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vldrdq_gather_base_wb_z_u64.c
index 7ba272a112607b0e57a3d4659e5b4033044af83c..10ba42405fe8fde9d4f8993b20e41a59c7bb2e77 100644
--- a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vldrdq_gather_base_wb_z_u64.c
+++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vldrdq_gather_base_wb_z_u64.c
@@ -8,4 +8,8 @@ uint64x2_t foo (uint64x2_t * addr, mve_pred16_t p)
   return vldrdq_gather_base_wb_z_u64 (addr, 8, p);
 }
 
-/* { dg-final { scan-assembler "vldrdt.u64" } } */
+/* { dg-final { scan-assembler "vldrb.8 q\[0-9\]+, \\\[r\[0-9\]+\\\]" } } */
+/* { dg-final { scan-assembler "vmsr\t P0, r\[0-9\]+.*" } } */
+/* { dg-final { scan-assembler "vpst" } } */
+/* { dg-final { scan-assembler "vldrdt.u64\tq\[0-9\]+, \\\[q\[0-9\]+, #\[0-9\]+\\\]!" } } */
+/* { dg-final { scan-assembler "vstrb.8 q\[0-9\]+, \\\[r\[0-9\]+\\\]" } } */
diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vldrwq_gather_base_wb_f32.c b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vldrwq_gather_base_wb_f32.c
index 6b496873f173e30414ffcddf50513758bc8ca770..db8108e37325c4e1fafd2293d48eba0c33309073 100644
--- a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vldrwq_gather_base_wb_f32.c
+++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vldrwq_gather_base_wb_f32.c
@@ -10,4 +10,6 @@ foo (uint32x4_t * addr)
   return vldrwq_gather_base_wb_f32 (addr, 8);
 }
 
-/* { dg-final { scan-assembler "vldrw.u32" } } */
+/* { dg-final { scan-assembler "vldrb.8 q\[0-9\]+, \\\[r\[0-9\]+\\\]" } } */
+/* { dg-final { scan-assembler "vldrw.u32\tq\[0-9\]+, \\\[q\[0-9\]+, #\[0-9\]+\\\]!" } } */
+/* { dg-final { scan-assembler "vstrb.8 q\[0-9\]+, \\\[r\[0-9\]+\\\]" } } */
diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vldrwq_gather_base_wb_s32.c b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vldrwq_gather_base_wb_s32.c
index 9bbbd0d701546b5ec224129aef49e632addea550..3da64e218e2c0789e996be551650033567eba4e5 100644
--- a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vldrwq_gather_base_wb_s32.c
+++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vldrwq_gather_base_wb_s32.c
@@ -10,4 +10,6 @@ foo (uint32x4_t * addr)
   return vldrwq_gather_base_wb_s32 (addr, 8);
 }
 
-/* { dg-final { scan-assembler "vldrw.u32" } } */
+/* { dg-final { scan-assembler "vldrb.8 q\[0-9\]+, \\\[r\[0-9\]+\\\]" } } */
+/* { dg-final { scan-assembler "vldrw.u32\tq\[0-9\]+, \\\[q\[0-9\]+, #\[0-9\]+\\\]!" } } */
+/* { dg-final { scan-assembler "vstrb.8 q\[0-9\]+, \\\[r\[0-9\]+\\\]" } } */
diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vldrwq_gather_base_wb_u32.c b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vldrwq_gather_base_wb_u32.c
index 774230b290367a7d28f0c8579be26fc9c75db1cb..2597ee11608bfe21d697f2250bee7e69c0cc7aec 100644
--- a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vldrwq_gather_base_wb_u32.c
+++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vldrwq_gather_base_wb_u32.c
@@ -10,4 +10,6 @@ foo (uint32x4_t * addr)
   return vldrwq_gather_base_wb_u32 (addr, 8);
 }
 
-/* { dg-final { scan-assembler "vldrw.u32" } } */
+/* { dg-final { scan-assembler "vldrb.8 q\[0-9\]+, \\\[r\[0-9\]+\\\]" } } */
+/* { dg-final { scan-assembler "vldrw.u32\tq\[0-9\]+, \\\[q\[0-9\]+, #\[0-9\]+\\\]!" } } */
+/* { dg-final { scan-assembler "vstrb.8 q\[0-9\]+, \\\[r\[0-9\]+\\\]" } } */
diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vldrwq_gather_base_wb_z_f32.c b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vldrwq_gather_base_wb_z_f32.c
index 6400f014a88ccf34fef15effff65f9b1267dbd5f..f1ba63855be254d96806c163177e32856294c106 100644
--- a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vldrwq_gather_base_wb_z_f32.c
+++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vldrwq_gather_base_wb_z_f32.c
@@ -10,4 +10,8 @@ foo (uint32x4_t * addr, mve_pred16_t p)
   return vldrwq_gather_base_wb_z_f32 (addr, 8, p);
 }
 
-/* { dg-final { scan-assembler "vldrwt.u32" } } */
+/* { dg-final { scan-assembler "vldrb.8 q\[0-9\]+, \\\[r\[0-9\]+\\\]" } } */
+/* { dg-final { scan-assembler "vmsr\tP0, r\[0-9\]+.*" } } */
+/* { dg-final { scan-assembler "vpst" } } */
+/* { dg-final { scan-assembler "vldrwt.u32\tq\[0-9\]+, \\\[q\[0-9\]+, #\[0-9\]+\\\]!" 
} } */ +/* { dg-final { scan-assembler "vstrb.8 q\[0-9\]+, \\\[r\[0-9\]+\\\]" } } */ diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vldrwq_gather_base_wb_z_s32.c b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vldrwq_gather_base_wb_z_s32.c index de7006c51f17665b80b83fd5ea034477b7a7e778..56da5a46c64d2946ceade8689105048e19efdc6a 100644 --- a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vldrwq_gather_base_wb_z_s32.c +++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vldrwq_gather_base_wb_z_s32.c @@ -10,4 +10,8 @@ foo (uint32x4_t * addr, mve_pred16_t p) return vldrwq_gather_base_wb_z_s32 (addr, 8, p); } -/* { dg-final { scan-assembler "vldrwt.u32" } } */ +/* { dg-final { scan-assembler "vldrb.8 q\[0-9\]+, \\\[r\[0-9\]+\\\]" } } */ +/* { dg-final { scan-assembler "vmsr\t P0, r\[0-9\]+.*" } } */ +/* { dg-final { scan-assembler "vpst" } } */ +/* { dg-final { scan-assembler "vldrwt.u32\tq\[0-9\]+, \\\[q\[0-9\]+, #\[0-9\]+\\\]!" } } */ +/* { dg-final { scan-assembler "vstrb.8 q\[0-9\]+, \\\[r\[0-9\]+\\\]" } } */ diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vldrwq_gather_base_wb_z_u32.c b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vldrwq_gather_base_wb_z_u32.c index 6c9608f07ba966876804f56403a4352a51a0e0c4..63165d97c1a7b4120be036348a09b73afddd36d1 100644 --- a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vldrwq_gather_base_wb_z_u32.c +++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vldrwq_gather_base_wb_z_u32.c @@ -10,4 +10,8 @@ foo (uint32x4_t * addr, mve_pred16_t p) return vldrwq_gather_base_wb_z_u32 (addr, 8, p); } -/* { dg-final { scan-assembler "vldrwt.u32" } } */ +/* { dg-final { scan-assembler "vldrb.8 q\[0-9\]+, \\\[r\[0-9\]+\\\]" } } */ +/* { dg-final { scan-assembler "vmsr\t P0, r\[0-9\]+.*" } } */ +/* { dg-final { scan-assembler "vpst" } } */ +/* { dg-final { scan-assembler "vldrwt.u32\tq\[0-9\]+, \\\[q\[0-9\]+, #\[0-9\]+\\\]!" 
} } */ +/* { dg-final { scan-assembler "vstrb.8 q\[0-9\]+, \\\[r\[0-9\]+\\\]" } } */ diff --git a/gcc/config/arm/arm-builtins.c b/gcc/config/arm/arm-builtins.c index 56f0db21ea95dcd738877daba27f1cb60f0d5a32..832b9107424fd9a4a0ee272b773b3d0929172370 100644 --- a/gcc/config/arm/arm-builtins.c +++ b/gcc/config/arm/arm-builtins.c @@ -719,6 +719,17 @@ arm_quinop_unone_unone_unone_unone_imm_unone_qualifiers[SIMD_MAX_BUILTIN_ARGS] (arm_quinop_unone_unone_unone_unone_imm_unone_qualifiers) static enum arm_type_qualifiers +arm_ldrgbwbxu_qualifiers[SIMD_MAX_BUILTIN_ARGS] + = { qualifier_unsigned, qualifier_unsigned, qualifier_immediate}; +#define LDRGBWBXU_QUALIFIERS (arm_ldrgbwbxu_qualifiers) + +static enum arm_type_qualifiers +arm_ldrgbwbxu_z_qualifiers[SIMD_MAX_BUILTIN_ARGS] + = { qualifier_unsigned, qualifier_unsigned, qualifier_immediate, + qualifier_unsigned}; +#define LDRGBWBXU_Z_QUALIFIERS (arm_ldrgbwbxu_z_qualifiers) + +static enum arm_type_qualifiers arm_ldrgbwbs_qualifiers[SIMD_MAX_BUILTIN_ARGS] = { qualifier_none, qualifier_unsigned, qualifier_immediate}; #define LDRGBWBS_QUALIFIERS (arm_ldrgbwbs_qualifiers) diff --git a/gcc/config/arm/arm_mve.h b/gcc/config/arm/arm_mve.h index f1dcdc2153217e796c58526ba0e5be11be642234..47a6268e0800958f49d46238fe34ec749d243929 100644 --- a/gcc/config/arm/arm_mve.h +++ b/gcc/config/arm/arm_mve.h @@ -13903,8 +13903,8 @@ __attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) __arm_vldrdq_gather_base_wb_s64 (uint64x2_t * __addr, const int __offset) { int64x2_t - result = __builtin_mve_vldrdq_gather_base_wb_sv2di (*__addr, __offset); - __addr += __offset; + result = __builtin_mve_vldrdq_gather_base_nowb_sv2di (*__addr, __offset); + *__addr = __builtin_mve_vldrdq_gather_base_wb_sv2di (*__addr, __offset); return result; } @@ -13913,8 +13913,8 @@ __attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) __arm_vldrdq_gather_base_wb_u64 (uint64x2_t * __addr, const int __offset) { uint64x2_t - result = 
__builtin_mve_vldrdq_gather_base_wb_uv2di (*__addr, __offset); - __addr += __offset; + result = __builtin_mve_vldrdq_gather_base_nowb_uv2di (*__addr, __offset); + *__addr = __builtin_mve_vldrdq_gather_base_wb_uv2di (*__addr, __offset); return result; } @@ -13923,8 +13923,8 @@ __attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) __arm_vldrdq_gather_base_wb_z_s64 (uint64x2_t * __addr, const int __offset, mve_pred16_t __p) { int64x2_t - result = __builtin_mve_vldrdq_gather_base_wb_z_sv2di (*__addr, __offset, __p); - __addr += __offset; + result = __builtin_mve_vldrdq_gather_base_nowb_z_sv2di (*__addr, __offset, __p); + *__addr = __builtin_mve_vldrdq_gather_base_wb_z_sv2di (*__addr, __offset, __p); return result; } @@ -13933,8 +13933,8 @@ __attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) __arm_vldrdq_gather_base_wb_z_u64 (uint64x2_t * __addr, const int __offset, mve_pred16_t __p) { uint64x2_t - result = __builtin_mve_vldrdq_gather_base_wb_z_uv2di (*__addr, __offset, __p); - __addr += __offset; + result = __builtin_mve_vldrdq_gather_base_nowb_z_uv2di (*__addr, __offset, __p); + *__addr = __builtin_mve_vldrdq_gather_base_wb_z_uv2di (*__addr, __offset, __p); return result; } @@ -13943,8 +13943,8 @@ __attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) __arm_vldrwq_gather_base_wb_s32 (uint32x4_t * __addr, const int __offset) { int32x4_t - result = __builtin_mve_vldrwq_gather_base_wb_sv4si (*__addr, __offset); - __addr += __offset; + result = __builtin_mve_vldrwq_gather_base_nowb_sv4si (*__addr, __offset); + *__addr = __builtin_mve_vldrwq_gather_base_wb_sv4si (*__addr, __offset); return result; } @@ -13953,8 +13953,8 @@ __attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) __arm_vldrwq_gather_base_wb_u32 (uint32x4_t * __addr, const int __offset) { uint32x4_t - result = __builtin_mve_vldrwq_gather_base_wb_uv4si (*__addr, __offset); - __addr += __offset; + result = __builtin_mve_vldrwq_gather_base_nowb_uv4si 
(*__addr, __offset); + *__addr = __builtin_mve_vldrwq_gather_base_wb_uv4si (*__addr, __offset); return result; } @@ -13963,8 +13963,8 @@ __attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) __arm_vldrwq_gather_base_wb_z_s32 (uint32x4_t * __addr, const int __offset, mve_pred16_t __p) { int32x4_t - result = __builtin_mve_vldrwq_gather_base_wb_z_sv4si (*__addr, __offset, __p); - __addr += __offset; + result = __builtin_mve_vldrwq_gather_base_nowb_z_sv4si (*__addr, __offset, __p); + *__addr = __builtin_mve_vldrwq_gather_base_wb_z_sv4si (*__addr, __offset, __p); return result; } @@ -13973,8 +13973,8 @@ __attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) __arm_vldrwq_gather_base_wb_z_u32 (uint32x4_t * __addr, const int __offset, mve_pred16_t __p) { uint32x4_t - result = __builtin_mve_vldrwq_gather_base_wb_z_uv4si (*__addr, __offset, __p); - __addr += __offset; + result = __builtin_mve_vldrwq_gather_base_nowb_z_uv4si (*__addr, __offset, __p); + *__addr = __builtin_mve_vldrwq_gather_base_wb_z_uv4si (*__addr, __offset, __p); return result; } @@ -19372,8 +19372,8 @@ __attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) __arm_vldrwq_gather_base_wb_f32 (uint32x4_t * __addr, const int __offset) { float32x4_t - result = __builtin_mve_vldrwq_gather_base_wb_fv4sf (*__addr, __offset); - __addr += __offset; + result = __builtin_mve_vldrwq_gather_base_nowb_fv4sf (*__addr, __offset); + *__addr = __builtin_mve_vldrwq_gather_base_wb_fv4sf (*__addr, __offset); return result; } @@ -19382,8 +19382,8 @@ __attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) __arm_vldrwq_gather_base_wb_z_f32 (uint32x4_t * __addr, const int __offset, mve_pred16_t __p) { float32x4_t - result = __builtin_mve_vldrwq_gather_base_wb_z_fv4sf (*__addr, __offset, __p); - __addr += __offset; + result = __builtin_mve_vldrwq_gather_base_nowb_z_fv4sf (*__addr, __offset, __p); + *__addr = __builtin_mve_vldrwq_gather_base_wb_z_fv4sf (*__addr, __offset, __p); return 
result; } diff --git a/gcc/config/arm/arm_mve_builtins.def b/gcc/config/arm/arm_mve_builtins.def index 2fb975944b9fdac9de4b5a1bec3962be410637f1..753e40a951d071c1ab77476a1cc4779e91689178 100644 --- a/gcc/config/arm/arm_mve_builtins.def +++ b/gcc/config/arm/arm_mve_builtins.def @@ -847,16 +847,26 @@ VAR1 (STRSBWBS, vstrdq_scatter_base_wb_s, v2di) VAR1 (STRSBWBS_P, vstrwq_scatter_base_wb_p_s, v4si) VAR1 (STRSBWBS_P, vstrwq_scatter_base_wb_p_f, v4sf) VAR1 (STRSBWBS_P, vstrdq_scatter_base_wb_p_s, v2di) -VAR1 (LDRGBWBU_Z, vldrwq_gather_base_wb_z_u, v4si) -VAR1 (LDRGBWBU_Z, vldrdq_gather_base_wb_z_u, v2di) -VAR1 (LDRGBWBU, vldrwq_gather_base_wb_u, v4si) -VAR1 (LDRGBWBU, vldrdq_gather_base_wb_u, v2di) -VAR1 (LDRGBWBS_Z, vldrwq_gather_base_wb_z_s, v4si) -VAR1 (LDRGBWBS_Z, vldrwq_gather_base_wb_z_f, v4sf) -VAR1 (LDRGBWBS_Z, vldrdq_gather_base_wb_z_s, v2di) -VAR1 (LDRGBWBS, vldrwq_gather_base_wb_s, v4si) -VAR1 (LDRGBWBS, vldrwq_gather_base_wb_f, v4sf) -VAR1 (LDRGBWBS, vldrdq_gather_base_wb_s, v2di) +VAR1 (LDRGBWBU_Z, vldrwq_gather_base_nowb_z_u, v4si) +VAR1 (LDRGBWBU_Z, vldrdq_gather_base_nowb_z_u, v2di) +VAR1 (LDRGBWBU, vldrwq_gather_base_nowb_u, v4si) +VAR1 (LDRGBWBU, vldrdq_gather_base_nowb_u, v2di) +VAR1 (LDRGBWBS_Z, vldrwq_gather_base_nowb_z_s, v4si) +VAR1 (LDRGBWBS_Z, vldrwq_gather_base_nowb_z_f, v4sf) +VAR1 (LDRGBWBS_Z, vldrdq_gather_base_nowb_z_s, v2di) +VAR1 (LDRGBWBS, vldrwq_gather_base_nowb_s, v4si) +VAR1 (LDRGBWBS, vldrwq_gather_base_nowb_f, v4sf) +VAR1 (LDRGBWBS, vldrdq_gather_base_nowb_s, v2di) +VAR1 (LDRGBWBXU_Z, vldrdq_gather_base_wb_z_s, v2di) +VAR1 (LDRGBWBXU_Z, vldrdq_gather_base_wb_z_u, v2di) +VAR1 (LDRGBWBXU, vldrdq_gather_base_wb_s, v2di) +VAR1 (LDRGBWBXU, vldrdq_gather_base_wb_u, v2di) +VAR1 (LDRGBWBXU_Z, vldrwq_gather_base_wb_z_s, v4si) +VAR1 (LDRGBWBXU_Z, vldrwq_gather_base_wb_z_f, v4sf) +VAR1 (LDRGBWBXU_Z, vldrwq_gather_base_wb_z_u, v4si) +VAR1 (LDRGBWBXU, vldrwq_gather_base_wb_s, v4si) +VAR1 (LDRGBWBXU, vldrwq_gather_base_wb_f, v4sf) +VAR1 
(LDRGBWBXU, vldrwq_gather_base_wb_u, v4si) VAR1 (BINOP_NONE_NONE_NONE, vadciq_s, v4si) VAR1 (BINOP_UNONE_UNONE_UNONE, vadciq_u, v4si) VAR1 (BINOP_NONE_NONE_NONE, vadcq_s, v4si) diff --git a/gcc/config/arm/mve.md b/gcc/config/arm/mve.md index df602b07840bb4ccb9aa2a9b10992ba7078452ba..d1028f4542b4972b4080e46544c86d625d77383a 100644 --- a/gcc/config/arm/mve.md +++ b/gcc/config/arm/mve.md @@ -10420,6 +10420,20 @@ (unspec:V4SI [(const_int 0)] VLDRWGBWBQ)] "TARGET_HAVE_MVE" { + rtx ignore_result = gen_reg_rtx (V4SImode); + emit_insn ( + gen_mve_vldrwq_gather_base_wb_v4si_insn (ignore_result, operands[0], + operands[1], operands[2])); + DONE; +}) + +(define_expand "mve_vldrwq_gather_base_nowb_v4si" + [(match_operand:V4SI 0 "s_register_operand") + (match_operand:V4SI 1 "s_register_operand") + (match_operand:SI 2 "mve_vldrd_immediate") + (unspec:V4SI [(const_int 0)] VLDRWGBWBQ)] + "TARGET_HAVE_MVE" +{ rtx ignore_wb = gen_reg_rtx (V4SImode); emit_insn ( gen_mve_vldrwq_gather_base_wb_v4si_insn (operands[0], ignore_wb, @@ -10459,6 +10473,21 @@ (unspec:V4SI [(const_int 0)] VLDRWGBWBQ)] "TARGET_HAVE_MVE" { + rtx ignore_result = gen_reg_rtx (V4SImode); + emit_insn ( + gen_mve_vldrwq_gather_base_wb_z_v4si_insn (ignore_result, operands[0], + operands[1], operands[2], + operands[3])); + DONE; +}) +(define_expand "mve_vldrwq_gather_base_nowb_z_v4si" + [(match_operand:V4SI 0 "s_register_operand") + (match_operand:V4SI 1 "s_register_operand") + (match_operand:SI 2 "mve_vldrd_immediate") + (match_operand:HI 3 "vpr_register_operand") + (unspec:V4SI [(const_int 0)] VLDRWGBWBQ)] + "TARGET_HAVE_MVE" +{ rtx ignore_wb = gen_reg_rtx (V4SImode); emit_insn ( gen_mve_vldrwq_gather_base_wb_z_v4si_insn (operands[0], ignore_wb, @@ -10487,12 +10516,26 @@ ops[0] = operands[0]; ops[1] = operands[2]; ops[2] = operands[3]; - output_asm_insn ("vpst\;\tvldrwt.u32\t%q0, [%q1, %2]!",ops); + output_asm_insn ("vpst\;vldrwt.u32\t%q0, [%q1, %2]!",ops); return ""; } [(set_attr "length" "8")]) (define_expand 
"mve_vldrwq_gather_base_wb_fv4sf" + [(match_operand:V4SI 0 "s_register_operand") + (match_operand:V4SI 1 "s_register_operand") + (match_operand:SI 2 "mve_vldrd_immediate") + (unspec:V4SI [(const_int 0)] VLDRWQGBWB_F)] + "TARGET_HAVE_MVE && TARGET_HAVE_MVE_FLOAT" +{ + rtx ignore_result = gen_reg_rtx (V4SFmode); + emit_insn ( + gen_mve_vldrwq_gather_base_wb_fv4sf_insn (ignore_result, operands[0], + operands[1], operands[2])); + DONE; +}) + +(define_expand "mve_vldrwq_gather_base_nowb_fv4sf" [(match_operand:V4SF 0 "s_register_operand") (match_operand:V4SI 1 "s_register_operand") (match_operand:SI 2 "mve_vldrd_immediate") @@ -10531,6 +10574,22 @@ [(set_attr "length" "4")]) (define_expand "mve_vldrwq_gather_base_wb_z_fv4sf" + [(match_operand:V4SI 0 "s_register_operand") + (match_operand:V4SI 1 "s_register_operand") + (match_operand:SI 2 "mve_vldrd_immediate") + (match_operand:HI 3 "vpr_register_operand") + (unspec:V4SI [(const_int 0)] VLDRWQGBWB_F)] + "TARGET_HAVE_MVE && TARGET_HAVE_MVE_FLOAT" +{ + rtx ignore_result = gen_reg_rtx (V4SFmode); + emit_insn ( + gen_mve_vldrwq_gather_base_wb_z_fv4sf_insn (ignore_result, operands[0], + operands[1], operands[2], + operands[3])); + DONE; +}) + +(define_expand "mve_vldrwq_gather_base_nowb_z_fv4sf" [(match_operand:V4SF 0 "s_register_operand") (match_operand:V4SI 1 "s_register_operand") (match_operand:SI 2 "mve_vldrd_immediate") @@ -10566,7 +10625,7 @@ ops[0] = operands[0]; ops[1] = operands[2]; ops[2] = operands[3]; - output_asm_insn ("vpst\;\tvldrwt.u32\t%q0, [%q1, %2]!",ops); + output_asm_insn ("vpst\;vldrwt.u32\t%q0, [%q1, %2]!",ops); return ""; } [(set_attr "length" "8")]) @@ -10578,6 +10637,20 @@ (unspec:V2DI [(const_int 0)] VLDRDGBWBQ)] "TARGET_HAVE_MVE" { + rtx ignore_result = gen_reg_rtx (V2DImode); + emit_insn ( + gen_mve_vldrdq_gather_base_wb_v2di_insn (ignore_result, operands[0], + operands[1], operands[2])); + DONE; +}) + +(define_expand "mve_vldrdq_gather_base_nowb_v2di" + [(match_operand:V2DI 0 "s_register_operand") 
+ (match_operand:V2DI 1 "s_register_operand") + (match_operand:SI 2 "mve_vldrd_immediate") + (unspec:V2DI [(const_int 0)] VLDRDGBWBQ)] + "TARGET_HAVE_MVE" +{ rtx ignore_wb = gen_reg_rtx (V2DImode); emit_insn ( gen_mve_vldrdq_gather_base_wb_v2di_insn (operands[0], ignore_wb, @@ -10585,6 +10658,7 @@ DONE; }) + ;; ;; [vldrdq_gather_base_wb_s vldrdq_gather_base_wb_u] ;; @@ -10617,6 +10691,22 @@ (unspec:V2DI [(const_int 0)] VLDRDGBWBQ)] "TARGET_HAVE_MVE" { + rtx ignore_result = gen_reg_rtx (V2DImode); + emit_insn ( + gen_mve_vldrdq_gather_base_wb_z_v2di_insn (ignore_result, operands[0], + operands[1], operands[2], + operands[3])); + DONE; +}) + +(define_expand "mve_vldrdq_gather_base_nowb_z_v2di" + [(match_operand:V2DI 0 "s_register_operand") + (match_operand:V2DI 1 "s_register_operand") + (match_operand:SI 2 "mve_vldrd_immediate") + (match_operand:HI 3 "vpr_register_operand") + (unspec:V2DI [(const_int 0)] VLDRDGBWBQ)] + "TARGET_HAVE_MVE" +{ rtx ignore_wb = gen_reg_rtx (V2DImode); emit_insn ( gen_mve_vldrdq_gather_base_wb_z_v2di_insn (operands[0], ignore_wb, @@ -10660,7 +10750,7 @@ ops[0] = operands[0]; ops[1] = operands[2]; ops[2] = operands[3]; - output_asm_insn ("vpst\;\tvldrdt.u64\t%q0, [%q1, %2]!",ops); + output_asm_insn ("vpst\;vldrdt.u64\t%q0, [%q1, %2]!",ops); return ""; } [(set_attr "length" "8")])
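For readers following the arm_mve.h hunks, here is a small host-side sketch of what the change appears to do. The old wrappers executed `__addr += __offset;`, which only moves the wrapper's local pointer, so the caller's vector of base addresses was never updated. The new wrappers store the value returned by the `_wb_` builtin back through `*__addr`. Everything below is a hypothetical model and not the real intrinsics: `vec_u32x4`, `memory`, `model_gather_nowb`, `model_gather_wb`, `wrapper_before`, `wrapper_after` and `run_demo` are invented names, and addresses are modeled as byte offsets into a toy array so the code builds without MVE hardware.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for uint32x4_t; NOT the arm_mve.h type. */
typedef struct { uint32_t lane[4]; } vec_u32x4;

/* Toy flat memory: an "address" is a byte offset into this array. */
static uint8_t memory[64];

static uint32_t load_word(uint32_t address)
{
    uint32_t value;
    memcpy(&value, &memory[address], sizeof value);
    return value;
}

/* Models the data result: each lane loads from (base lane + offset). */
static vec_u32x4 model_gather_nowb(vec_u32x4 base, int offset)
{
    vec_u32x4 result;
    for (int i = 0; i < 4; i++)
        result.lane[i] = load_word(base.lane[i] + (uint32_t)offset);
    return result;
}

/* Models the writeback result: the base vector advanced by offset. */
static vec_u32x4 model_gather_wb(vec_u32x4 base, int offset)
{
    vec_u32x4 updated = base;
    for (int i = 0; i < 4; i++)
        updated.lane[i] += (uint32_t)offset;
    return updated;
}

/* Shape of the wrapper before the patch. The original also executed
   `__addr += __offset;` here; that statement only changes the local
   pointer, so it is left out of the model. */
static vec_u32x4 wrapper_before(vec_u32x4 *addr, int offset)
{
    return model_gather_nowb(*addr, offset);
}

/* Shape of the wrapper after the patch: load, then write the updated
   base addresses back through the pointer. */
static vec_u32x4 wrapper_after(vec_u32x4 *addr, int offset)
{
    vec_u32x4 result = model_gather_nowb(*addr, offset);
    *addr = model_gather_wb(*addr, offset);
    return result;
}

/* Runs both wrapper shapes on the same input and checks the difference. */
int run_demo(void)
{
    /* Store the value (address + 100) in every word of the toy memory. */
    for (uint32_t a = 0; a + 4 <= sizeof memory; a += 4) {
        uint32_t v = a + 100;
        memcpy(&memory[a], &v, sizeof v);
    }

    vec_u32x4 old_base = {{0, 4, 8, 12}};
    vec_u32x4 new_base = old_base;
    vec_u32x4 r_old = wrapper_before(&old_base, 8);
    vec_u32x4 r_new = wrapper_after(&new_base, 8);

    for (uint32_t i = 0; i < 4; i++) {
        assert(r_old.lane[i] == r_new.lane[i]);   /* same loaded data */
        assert(old_base.lane[i] == 4 * i);        /* caller's bases unchanged */
        assert(new_base.lane[i] == 4 * i + 8);    /* bases written back */
    }
    return 0;
}
```

Calling `run_demo()` from a `main` exercises both shapes. If this reading is right, it also suggests why the updated tests scan for a `vldrb.8` before and a `vstrb.8` after the gather load: the base vector is read from memory and the written-back vector is stored again.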