From patchwork Tue Feb 18 14:05:44 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: James Greenhalgh X-Patchwork-Id: 1240066 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=gcc.gnu.org (client-ip=209.132.180.131; helo=sourceware.org; envelope-from=gcc-patches-return-519734-incoming=patchwork.ozlabs.org@gcc.gnu.org; receiver=) Authentication-Results: ozlabs.org; dmarc=none (p=none dis=none) header.from=arm.com Authentication-Results: ozlabs.org; dkim=pass (1024-bit key; unprotected) header.d=gcc.gnu.org header.i=@gcc.gnu.org header.a=rsa-sha1 header.s=default header.b=RNmlQ3x4; dkim=fail reason="signature verification failed" (1024-bit key; unprotected) header.d=armh.onmicrosoft.com header.i=@armh.onmicrosoft.com header.a=rsa-sha256 header.s=selector2-armh-onmicrosoft-com header.b=L0LacAic; dkim=fail reason="signature verification failed" (1024-bit key) header.d=armh.onmicrosoft.com header.i=@armh.onmicrosoft.com header.a=rsa-sha256 header.s=selector2-armh-onmicrosoft-com header.b=L0LacAic; dkim-atps=neutral Received: from sourceware.org (server1.sourceware.org [209.132.180.131]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 48MN1t0Q3jz9sRk for ; Wed, 19 Feb 2020 01:06:23 +1100 (AEDT) DomainKey-Signature: a=rsa-sha1; c=nofws; d=gcc.gnu.org; h=list-id :list-unsubscribe:list-archive:list-post:list-help:sender:from :to:cc:subject:date:message-id:mime-version:content-type :content-transfer-encoding; q=dns; s=default; b=LWqswQg1O5GB5Yea +APt6AeWEnIWfX4bszcUFxhMIoWQfUrT104IYQY0jfRTNPGL8iNOzr3DdZV1XWoh LfV2B2c7Ufysdhza8dt+MeS0ftuLFFve/sLPhqSJZhl7QyQfv2ZEJ5pdf63ngpYR v2Bj6fuOE20lO1BHfWNBozDqmck= DKIM-Signature: v=1; a=rsa-sha1; c=relaxed; d=gcc.gnu.org; h=list-id :list-unsubscribe:list-archive:list-post:list-help:sender:from :to:cc:subject:date:message-id:mime-version:content-type :content-transfer-encoding; s=default; bh=IAozohEz0Gkn4vzJWbeUOO s8dnI=; b=RNmlQ3x4xTIGlkmqYoFWtcBZdWoNFElbwRT6HQkiJMlHWJPCpvKVjH 0Tjx85OyIXPXGogLufhiVr429ddzCGKLm6F+nlWrpE7LMflf6gBXdiZAABW2D8l8 r5juQOIut/C/gI56RBpyvRiU3qGhnPlvHBoB+rKNSp8ytYAToy/9U= Received: (qmail 67155 invoked by alias); 18 Feb 2020 14:06:11 -0000 Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm Precedence: bulk List-Id: List-Unsubscribe: List-Archive: List-Post: List-Help: Sender: gcc-patches-owner@gcc.gnu.org Delivered-To: mailing list gcc-patches@gcc.gnu.org Received: (qmail 67038 invoked by uid 89); 18 Feb 2020 14:06:11 -0000 Authentication-Results: sourceware.org; auth=none X-Spam-SWARE-Status: No, score=-26.9 required=5.0 tests=AWL, BAYES_00, GIT_PATCH_0, GIT_PATCH_1, GIT_PATCH_2, GIT_PATCH_3, RCVD_IN_DNSWL_NONE, SPF_HELO_PASS, SPF_PASS, UNPARSEABLE_RELAY autolearn=ham version=3.3.1 spammy= X-HELO: EUR03-AM5-obe.outbound.protection.outlook.com Received: from mail-eopbgr30080.outbound.protection.outlook.com (HELO EUR03-AM5-obe.outbound.protection.outlook.com) (40.107.3.80) by sourceware.org (qpsmtpd/0.93/v0.84-503-g423c35a) with ESMTP; Tue, 18 Feb 2020 14:06:08 +0000 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=armh.onmicrosoft.com; s=selector2-armh-onmicrosoft-com; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-SenderADCheck; bh=NiLQuWMQ0ZU2WtALy3b9BwWf+o4285GXExW1GFsLC9w=; b=L0LacAicb06QA5CTKtbXMe6WJd5Tuv6F5yqGQa7+l6j9jUeFzweUnRi2Hl6I2KcgckjIzBpUCy+epOrZthpZSj5+prWANQ3mQhT2xr+6xS6Bcrbxc73mzVlT/+ZZHYQ1dlR6VTSq0w5rP1UpceMJQTe79yZ7S5g+qfhcN3p9ehI= Received: from VI1PR08CA0132.eurprd08.prod.outlook.com (2603:10a6:800:d4::34) by AM0PR08MB3266.eurprd08.prod.outlook.com (2603:10a6:208:66::25) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.2729.22; Tue, 18 Feb 2020 14:06:04 +0000 Received: from DB5EUR03FT024.eop-EUR03.prod.protection.outlook.com (2a01:111:f400:7e0a::201) by VI1PR08CA0132.outlook.office365.com (2603:10a6:800:d4::34) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.2729.25 via Frontend Transport; Tue, 18 Feb 2020 14:06:04 +0000 Authentication-Results: spf=pass (sender IP is 63.35.35.123) smtp.mailfrom=arm.com; gcc.gnu.org; dkim=pass (signature was verified) header.d=armh.onmicrosoft.com; gcc.gnu.org; dmarc=bestguesspass action=none header.from=arm.com; Received-SPF: Pass (protection.outlook.com: domain of arm.com designates 63.35.35.123 as permitted sender) receiver=protection.outlook.com; client-ip=63.35.35.123; helo=64aa7808-outbound-1.mta.getcheckrecipient.com; Received: from 64aa7808-outbound-1.mta.getcheckrecipient.com (63.35.35.123) by DB5EUR03FT024.mail.protection.outlook.com (10.152.20.67) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.2750.17 via Frontend Transport; Tue, 18 Feb 2020 14:06:04 +0000 Received: ("Tessian outbound 846b976b3941:v42"); Tue, 18 Feb 2020 14:06:04 +0000 X-CheckRecipientChecked: true X-CR-MTA-CID: 3c89b6c1ef6d14d3 X-CR-MTA-TID: 64aa7808 Received: from 8824ca078165.1 by 64aa7808-outbound-1.mta.getcheckrecipient.com id CE5F614A-F91C-4B35-9228-824A187F25E3.1; Tue, 18 Feb 2020 14:05:58 +0000 Received: from FRA01-PR2-obe.outbound.protection.outlook.com by 64aa7808-outbound-1.mta.getcheckrecipient.com with ESMTPS id 8824ca078165.1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384); Tue, 18 Feb 2020 14:05:58 +0000 ARC-Seal: i=1; a=rsa-sha256; s=arcselector9901; d=microsoft.com; cv=none; b=FdTzUxrMOcvFv5WJh9CrCzG6fC4j+CVMuTCKNgGbiwn3YnuF5MOffliI263FAYPe+DrTCZ8ZpErwAPvJmXkX/K/uDVN5iaU6ME+y1dZT0V6jollTZTbNJ5aQ7J2UheyPOpDvUk4rmqGok6VwbCnjOFD85ltL9+BxBBxhOPIkznrL2Uow0iMMZ+NEP8fBRa+56eaqF/hm/dZYAKvJ/p5h8+saMAE+PbGhlGM5P9Gpz1oHIigURCJNVMPjS5OxaRZoIMAsyyzy7G/y0BPGFTEhEYpoyhVieSzFAGRhn58u5op7haEGx6SeSkZ/voxeJt0SRC8RCLiPGYeqYupxXOdxag== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com; s=arcselector9901; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-SenderADCheck; bh=NiLQuWMQ0ZU2WtALy3b9BwWf+o4285GXExW1GFsLC9w=; b=WCOhcTgBijwShAcJ8Uhk0qUrMPTaUtTh06K8dMkCsJjqIO9kTfTTqJhzTalu+xQfE4fJqMv7Egobh9KGswGlFaQcwtSYB/sw504uGGS5+/BgkpoK7yM2luqh79yF2bqTt/9qoBa+QiJZSFbTI/qiYa7mgJ9KiMhC6LSpsBJvjY3bKFfPDSUTTWb+in0Ql2R+Y4FEXS8+9imMRdoE6Q2LSo1NzvcXklLgHNcN72yNWoTSNCEOJqCbkRmKZRKVD65SZOaAcPESI472LOG0TyzaXU2KcJY29ugx4dlnbvCvMF0Ic9A95RFxqIQYlsSYzoqJQEU3bFMe3K0ijLqRxLTQTQ== ARC-Authentication-Results: i=1; mx.microsoft.com 1; spf=pass (sender ip is 40.67.248.234) smtp.rcpttodomain=gcc.gnu.org smtp.mailfrom=arm.com; dmarc=bestguesspass action=none header.from=arm.com; dkim=none (message not signed); arc=none DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=armh.onmicrosoft.com; s=selector2-armh-onmicrosoft-com; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-SenderADCheck; bh=NiLQuWMQ0ZU2WtALy3b9BwWf+o4285GXExW1GFsLC9w=; b=L0LacAicb06QA5CTKtbXMe6WJd5Tuv6F5yqGQa7+l6j9jUeFzweUnRi2Hl6I2KcgckjIzBpUCy+epOrZthpZSj5+prWANQ3mQhT2xr+6xS6Bcrbxc73mzVlT/+ZZHYQ1dlR6VTSq0w5rP1UpceMJQTe79yZ7S5g+qfhcN3p9ehI= Received: from VE1PR08CA0017.eurprd08.prod.outlook.com (2603:10a6:803:104::30) by PR2PR08MB4889.eurprd08.prod.outlook.com (2603:10a6:101:1d::23) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.2729.27; Tue, 18 Feb 2020 14:05:55 +0000 Received: from AM5EUR03FT027.eop-EUR03.prod.protection.outlook.com (2a01:111:f400:7e08::209) by VE1PR08CA0017.outlook.office365.com (2603:10a6:803:104::30) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.2729.24 via Frontend Transport; Tue, 18 Feb 2020 14:05:55 +0000 Authentication-Results-Original: spf=pass (sender IP is 40.67.248.234) smtp.mailfrom=arm.com; gcc.gnu.org; dkim=none (message not signed) header.d=none; gcc.gnu.org; dmarc=bestguesspass action=none header.from=arm.com; Received-SPF: Pass (protection.outlook.com: domain of arm.com designates 40.67.248.234 as permitted sender) receiver=protection.outlook.com; client-ip=40.67.248.234; helo=nebula.arm.com; Received: from nebula.arm.com (40.67.248.234) by AM5EUR03FT027.mail.protection.outlook.com (10.152.16.138) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256) id 15.20.2750.17 via Frontend Transport; Tue, 18 Feb 2020 14:05:54 +0000 Received: from AZ-NEU-EX04.Arm.com (10.251.24.32) by AZ-NEU-EX04.Arm.com (10.251.24.32) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256) id 15.1.1415.2; Tue, 18 Feb 2020 14:05:52 +0000 Received: from e107456-lin.cambridge.arm.com (10.2.79.28) by mail.arm.com (10.251.24.32) with Microsoft SMTP Server id 15.1.1415.2 via Frontend Transport; Tue, 18 Feb 2020 14:05:51 +0000 From: James Greenhalgh To: CC: , , Subject: [AArch64] Move vmull_* to intrinsics Date: Tue, 18 Feb 2020 14:05:44 +0000 Message-ID: <20200218140544.4335-1-james.greenhalgh@arm.com> MIME-Version: 1.0 X-Forefront-Antispam-Report-Untrusted: CIP:40.67.248.234; IPV:; CTRY:IE; EFV:NLI; SFV:NSPM; SFS:(10009020)(4636009)(39860400002)(346002)(136003)(376002)(396003)(189003)(199004)(26005)(6916009)(478600001)(4326008)(86362001)(6666004)(7696005)(2616005)(426003)(356004)(1076003)(33964004)(186003)(44832011)(8936002)(70206006)(81166006)(81156014)(5660300002)(235185007)(2906002)(8676002)(70586007)(66616009)(36756003)(336012)(316002)(54906003); DIR:OUT; SFP:1101; SCL:1; SRVR:PR2PR08MB4889; H:nebula.arm.com; FPR:; SPF:Pass; LANG:en; PTR:InfoDomainNonexistent; MX:1; A:1; x-checkrecipientrouted: true X-MS-Oob-TLC-OOBClassifiers: OLM:364;OLM:364; X-MS-Exchange-SenderADCheck: 1 X-Microsoft-Antispam-Untrusted: BCL:0; X-Microsoft-Antispam-Message-Info-Original: wp4HjiPNuRoaj6jM4I6TUEwXtCKi0IrWU4fy9X/6A13YlFu4ERSl5+3n83SugfzEIqR4BXF3aGtk7wNyDAnlK+WRSS4g8itiRXngpF3w4i6b9xYJQKLDEjopyRD4hGHLhsYjnwdeLGwRbYOUfGy6sknVP088V44hlNujn9rZ1xIyS/RZGJ6hhHfqfEVCPfNhWi8l0LP8UaaNrPHBpcRRBuQdJyZQYy3mDFhlWWf1Xb9pSJBH20fBdGVKgfgXT0nFuTuS1c6Z+eREn/GGjGelXbRvjnn5m2PIjFkURyyZhqr4UWcvLinhNn/uD7b7gQhLSQDI1BhUAyTACImio72TSUn8ETrWg8OpUQm188A88EkzR2e/5eO0Bm0JSzdt6+37hYcMBnmUiGzon91nWTGkVMySXOEQl32OcoIgcNWPTG+ptUE7mJRL0s3QuqtFwBjA Original-Authentication-Results: spf=pass (sender IP is 40.67.248.234) smtp.mailfrom=arm.com; gcc.gnu.org; dkim=none (message not signed) header.d=none; gcc.gnu.org; dmarc=bestguesspass action=none header.from=arm.com; X-MS-Exchange-Transport-CrossTenantHeadersStripped: DB5EUR03FT024.eop-EUR03.prod.protection.outlook.com X-MS-Office365-Filtering-Correlation-Id-Prvs: 5ce2cd59-7ae8-4665-7e58-08d7b47baba8 X-IsSubscribed: yes Hi, As title, move some arm_neon.h functions which currently use assembly over to intrinsics. Bootstrapped and tested on aarch64-none-linux-gnu. OK, if so can someone please apply on my behalf? Thanks, James --- gcc/ 2020-02-18 James Greenhalgh * config/aarch64/aarch64-simd-builtins.def (intrinsic_vec_smult_lo_): New. (intrinsic_vec_umult_lo_): Likewise. (vec_widen_smult_hi_): Likewise. (vec_widen_umult_hi_): Likewise. * config/aarch64/aarch64-simd.md (aarch64_intrinsic_vec_mult_lo_): New. * config/aarch64/arm_neon.h (vmull_high_s8): Use intrinsics. (vmull_high_s16): Likewise. (vmull_high_s32): Likewise. (vmull_high_u8): Likewise. (vmull_high_u16): Likewise. (vmull_high_u32): Likewise. (vmull_s8): Likewise. (vmull_s16): Likewise. (vmull_s32): Likewise. (vmull_u8): Likewise. (vmull_u16): Likewise. (vmull_u32): Likewise. gcc/testsuite/ 2020-02-18 James Greenhalgh * gcc.target/aarch64/vmull_high.c: New. diff --git a/gcc/config/aarch64/aarch64-simd-builtins.def b/gcc/config/aarch64/aarch64-simd-builtins.def index 57fc5933b43..f86866b9e78 100644 --- a/gcc/config/aarch64/aarch64-simd-builtins.def +++ b/gcc/config/aarch64/aarch64-simd-builtins.def @@ -185,6 +185,12 @@ BUILTIN_VQ_HSI (TERNOP, sqdmlal2_n, 0) BUILTIN_VQ_HSI (TERNOP, sqdmlsl2_n, 0) + BUILTIN_VD_BHSI (BINOP, intrinsic_vec_smult_lo_, 0) + BUILTIN_VD_BHSI (BINOPU, intrinsic_vec_umult_lo_, 0) + + BUILTIN_VQW (BINOP, vec_widen_smult_hi_, 10) + BUILTIN_VQW (BINOPU, vec_widen_umult_hi_, 10) + BUILTIN_VSD_HSI (BINOP, sqdmull, 0) BUILTIN_VSD_HSI (TERNOP_LANE, sqdmull_lane, 0) BUILTIN_VSD_HSI (TERNOP_LANE, sqdmull_laneq, 0) diff --git a/gcc/config/aarch64/aarch64-simd.md b/gcc/config/aarch64/aarch64-simd.md index 4e28cf97516..281b9ce93b9 100644 --- a/gcc/config/aarch64/aarch64-simd.md +++ b/gcc/config/aarch64/aarch64-simd.md @@ -1791,6 +1791,17 @@ [(set_attr "type" "neon_mul__long")] ) +(define_insn "aarch64_intrinsic_vec_mult_lo_" + [(set (match_operand: 0 "register_operand" "=w") + (mult: (ANY_EXTEND: + (match_operand:VD_BHSI 1 "register_operand" "w")) + (ANY_EXTEND: + (match_operand:VD_BHSI 2 "register_operand" "w"))))] + "TARGET_SIMD" + "mull\\t%0., %1., %2." + [(set_attr "type" "neon_mul__long")] +) + (define_expand "vec_widen_mult_lo_" [(match_operand: 0 "register_operand") (ANY_EXTEND: (match_operand:VQW 1 "register_operand")) diff --git a/gcc/config/aarch64/arm_neon.h b/gcc/config/aarch64/arm_neon.h index c7425346b86..0b11d670837 100644 --- a/gcc/config/aarch64/arm_neon.h +++ b/gcc/config/aarch64/arm_neon.h @@ -9218,72 +9218,42 @@ __extension__ extern __inline int16x8_t __attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) vmull_high_s8 (int8x16_t __a, int8x16_t __b) { - int16x8_t __result; - __asm__ ("smull2 %0.8h,%1.16b,%2.16b" - : "=w"(__result) - : "w"(__a), "w"(__b) - : /* No clobbers */); - return __result; + return __builtin_aarch64_vec_widen_smult_hi_v16qi (__a, __b); } __extension__ extern __inline int32x4_t __attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) vmull_high_s16 (int16x8_t __a, int16x8_t __b) { - int32x4_t __result; - __asm__ ("smull2 %0.4s,%1.8h,%2.8h" - : "=w"(__result) - : "w"(__a), "w"(__b) - : /* No clobbers */); - return __result; + return __builtin_aarch64_vec_widen_smult_hi_v8hi (__a, __b); } __extension__ extern __inline int64x2_t __attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) vmull_high_s32 (int32x4_t __a, int32x4_t __b) { - int64x2_t __result; - __asm__ ("smull2 %0.2d,%1.4s,%2.4s" - : "=w"(__result) - : "w"(__a), "w"(__b) - : /* No clobbers */); - return __result; + return __builtin_aarch64_vec_widen_smult_hi_v4si (__a, __b); } __extension__ extern __inline uint16x8_t __attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) vmull_high_u8 (uint8x16_t __a, uint8x16_t __b) { - uint16x8_t __result; - __asm__ ("umull2 %0.8h,%1.16b,%2.16b" - : "=w"(__result) - : "w"(__a), "w"(__b) - : /* No clobbers */); - return __result; + return __builtin_aarch64_vec_widen_umult_hi_v16qi_uuu (__a, __b); } __extension__ extern __inline uint32x4_t __attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) vmull_high_u16 (uint16x8_t __a, uint16x8_t __b) { - uint32x4_t __result; - __asm__ ("umull2 %0.4s,%1.8h,%2.8h" - : "=w"(__result) - : "w"(__a), "w"(__b) - : /* No clobbers */); - return __result; + return __builtin_aarch64_vec_widen_umult_hi_v8hi_uuu (__a, __b); } __extension__ extern __inline uint64x2_t __attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) vmull_high_u32 (uint32x4_t __a, uint32x4_t __b) { - uint64x2_t __result; - __asm__ ("umull2 %0.2d,%1.4s,%2.4s" - : "=w"(__result) - : "w"(__a), "w"(__b) - : /* No clobbers */); - return __result; + return __builtin_aarch64_vec_widen_umult_hi_v4si_uuu (__a, __b); } #define vmull_lane_s16(a, b, c) \ @@ -9454,72 +9424,42 @@ __extension__ extern __inline int16x8_t __attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) vmull_s8 (int8x8_t __a, int8x8_t __b) { - int16x8_t __result; - __asm__ ("smull %0.8h, %1.8b, %2.8b" - : "=w"(__result) - : "w"(__a), "w"(__b) - : /* No clobbers */); - return __result; + return __builtin_aarch64_intrinsic_vec_smult_lo_v8qi (__a, __b); } __extension__ extern __inline int32x4_t __attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) vmull_s16 (int16x4_t __a, int16x4_t __b) { - int32x4_t __result; - __asm__ ("smull %0.4s, %1.4h, %2.4h" - : "=w"(__result) - : "w"(__a), "w"(__b) - : /* No clobbers */); - return __result; + return __builtin_aarch64_intrinsic_vec_smult_lo_v4hi (__a, __b); } __extension__ extern __inline int64x2_t __attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) vmull_s32 (int32x2_t __a, int32x2_t __b) { - int64x2_t __result; - __asm__ ("smull %0.2d, %1.2s, %2.2s" - : "=w"(__result) - : "w"(__a), "w"(__b) - : /* No clobbers */); - return __result; + return __builtin_aarch64_intrinsic_vec_smult_lo_v2si (__a, __b); } __extension__ extern __inline uint16x8_t __attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) vmull_u8 (uint8x8_t __a, uint8x8_t __b) { - uint16x8_t __result; - __asm__ ("umull %0.8h, %1.8b, %2.8b" - : "=w"(__result) - : "w"(__a), "w"(__b) - : /* No clobbers */); - return __result; + return __builtin_aarch64_intrinsic_vec_umult_lo_v8qi_uuu (__a, __b); } __extension__ extern __inline uint32x4_t __attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) vmull_u16 (uint16x4_t __a, uint16x4_t __b) { - uint32x4_t __result; - __asm__ ("umull %0.4s, %1.4h, %2.4h" - : "=w"(__result) - : "w"(__a), "w"(__b) - : /* No clobbers */); - return __result; + return __builtin_aarch64_intrinsic_vec_umult_lo_v4hi_uuu (__a, __b); } __extension__ extern __inline uint64x2_t __attribute__ ((__always_inline__, __gnu_inline__, __artificial__)) vmull_u32 (uint32x2_t __a, uint32x2_t __b) { - uint64x2_t __result; - __asm__ ("umull %0.2d, %1.2s, %2.2s" - : "=w"(__result) - : "w"(__a), "w"(__b) - : /* No clobbers */); - return __result; + return __builtin_aarch64_intrinsic_vec_umult_lo_v2si_uuu (__a, __b); } __extension__ extern __inline int16x4_t diff --git a/gcc/testsuite/gcc.target/aarch64/vmull_high.c b/gcc/testsuite/gcc.target/aarch64/vmull_high.c new file mode 100644 index 00000000000..cddb7e7a96a --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/vmull_high.c @@ -0,0 +1,23 @@ +/* { dg-do compile } */ +/* { dg-additional-options "-O3" } */ + +#include + +int64x2_t +doit (int8x16_t a) +{ + int16x8_t b = vmull_high_s8 (a, a); + int32x4_t c = vmull_high_s16 (b, b); + return vmull_high_s32 (c, c); +} + +uint64x2_t +douit (uint8x16_t a) +{ + uint16x8_t b = vmull_high_u8 (a, a); + uint32x4_t c = vmull_high_u16 (b, b); + return vmull_high_u32 (c, c); +} + +/* { dg-final { scan-assembler-times "smull2\[ |\t\]*v" 3} } */ +/* { dg-final { scan-assembler-times "umull2\[ |\t\]*v" 3} } */