From patchwork Thu Jul 7 16:16:01 2016
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Jiong Wang
X-Patchwork-Id: 645954
Delivered-To: mailing list gcc-patches@gcc.gnu.org
From: Jiong Wang
Subject: [AArch64][4/14] ARMv8.2-A FP16 three operands vector intrinsics
To: GCC Patches
References: <67f7b93f-0a92-de8f-8c50-5b4b573fed3a@foss.arm.com> <99eb95e3-5e9c-c6c9-b85f-e67d15f4859a@foss.arm.com> <21c3c64f-95ad-c127-3f8a-4afd236aae33@foss.arm.com> <938d13c1-39be-5fe3-9997-e55942bbd163@foss.arm.com>
Date: Thu, 7 Jul 2016 17:16:01 +0100
In-Reply-To: <938d13c1-39be-5fe3-9997-e55942bbd163@foss.arm.com>

This patch adds the ARMv8.2-A FP16 three-operand vector intrinsics.
The three-operand intrinsics comprise only fma and fms.

2016-07-07  Jiong Wang

gcc/
	* config/aarch64/aarch64-simd-builtins.def: Register new builtins.
	* config/aarch64/aarch64-simd.md (fma<mode>4): Extend to HF modes.
	(fnma<mode>4): Likewise.
	* config/aarch64/arm_neon.h (vfma_f16): New.
	(vfmaq_f16): Likewise.
	(vfms_f16): Likewise.
	(vfmsq_f16): Likewise.

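As a reading aid, the operand order used by the arm_neon.h wrappers in this patch can be modelled in portable C. This is only a sketch under stated assumptions: ref_vfma and ref_vfms are hypothetical helper names, float lanes stand in for float16 lanes, and the plain multiply-add does not reproduce the single rounding of the fused fmla/fmls instructions.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical reference model, not part of the patch.
   vfma_f16 (a, b, c) passes (b, c, a) to the fma builtin, so each
   lane computes a + b * c.  Unfused here; the real instruction rounds once.  */
void
ref_vfma (float *r, const float *a, const float *b, const float *c, size_t n)
{
  assert (r && a && b && c);
  for (size_t i = 0; i < n; i++)
    r[i] = a[i] + b[i] * c[i];
}

/* vfms_f16 (a, b, c) passes (b, c, a) to the fnma builtin.  The fnma
   pattern negates its second multiplicand, so each lane computes
   a - b * c.  */
void
ref_vfms (float *r, const float *a, const float *b, const float *c, size_t n)
{
  assert (r && a && b && c);
  for (size_t i = 0; i < n; i++)
    r[i] = a[i] + b[i] * -c[i];
}
```

With every lane of a, b and c set to 1, 2 and 3 respectively, the model yields 7 for ref_vfma and -5 for ref_vfms.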
From dc2121d586b759b864d9653e188a14d1f7296f25 Mon Sep 17 00:00:00 2001
From: Jiong Wang
Date: Wed, 8 Jun 2016 10:21:25 +0100
Subject: [PATCH 04/14] [4/14] ARMv8.2 FP16 three operands vector intrinsics

---
 gcc/config/aarch64/aarch64-simd-builtins.def |  4 +++-
 gcc/config/aarch64/aarch64-simd.md           | 28 ++++++++++++++--------------
 gcc/config/aarch64/arm_neon.h                | 26 ++++++++++++++++++++++++++
 3 files changed, 43 insertions(+), 15 deletions(-)

diff --git a/gcc/config/aarch64/aarch64-simd-builtins.def b/gcc/config/aarch64/aarch64-simd-builtins.def
index fe17298..6ff5063 100644
--- a/gcc/config/aarch64/aarch64-simd-builtins.def
+++ b/gcc/config/aarch64/aarch64-simd-builtins.def
@@ -405,7 +405,9 @@
   BUILTIN_VALL_F16 (STORE1, st1, 0)
 
   /* Implemented by fma<mode>4.  */
-  BUILTIN_VDQF (TERNOP, fma, 4)
+  BUILTIN_VHSDF (TERNOP, fma, 4)
+  /* Implemented by fnma<mode>4.  */
+  BUILTIN_VHSDF (TERNOP, fnma, 4)
 
   /* Implemented by aarch64_simd_bsl<mode>.  */
   BUILTIN_VDQQH (BSL_P, simd_bsl, 0)
diff --git a/gcc/config/aarch64/aarch64-simd.md b/gcc/config/aarch64/aarch64-simd.md
index 0a80adb..576ad3c 100644
--- a/gcc/config/aarch64/aarch64-simd.md
+++ b/gcc/config/aarch64/aarch64-simd.md
@@ -1526,13 +1526,13 @@
 )
 
 (define_insn "fma<mode>4"
-  [(set (match_operand:VDQF 0 "register_operand" "=w")
-       (fma:VDQF (match_operand:VDQF 1 "register_operand" "w")
-                (match_operand:VDQF 2 "register_operand" "w")
-                (match_operand:VDQF 3 "register_operand" "0")))]
+  [(set (match_operand:VHSDF 0 "register_operand" "=w")
+       (fma:VHSDF (match_operand:VHSDF 1 "register_operand" "w")
+                  (match_operand:VHSDF 2 "register_operand" "w")
+                  (match_operand:VHSDF 3 "register_operand" "0")))]
   "TARGET_SIMD"
  "fmla\\t%0.<Vtype>, %1.<Vtype>, %2.<Vtype>"
-  [(set_attr "type" "neon_fp_mla_<Vetype><q>")]
+  [(set_attr "type" "neon_fp_mla_<stype><q>")]
 )
 
 (define_insn "*aarch64_fma4_elt<mode>"
@@ -1599,15 +1599,15 @@
 )
 
 (define_insn "fnma<mode>4"
-  [(set (match_operand:VDQF 0 "register_operand" "=w")
-       (fma:VDQF
-         (match_operand:VDQF 1 "register_operand" "w")
-         (neg:VDQF
-           (match_operand:VDQF 2 "register_operand" "w"))
-         (match_operand:VDQF 3 "register_operand" "0")))]
-  "TARGET_SIMD"
- "fmls\\t%0.<Vtype>, %1.<Vtype>, %2.<Vtype>"
-  [(set_attr "type" "neon_fp_mla_<Vetype><q>")]
+  [(set (match_operand:VHSDF 0 "register_operand" "=w")
+       (fma:VHSDF
+         (match_operand:VHSDF 1 "register_operand" "w")
+         (neg:VHSDF
+           (match_operand:VHSDF 2 "register_operand" "w"))
+         (match_operand:VHSDF 3 "register_operand" "0")))]
+  "TARGET_SIMD"
+  "fmls\\t%0.<Vtype>, %1.<Vtype>, %2.<Vtype>"
+  [(set_attr "type" "neon_fp_mla_<stype><q>")]
 )
 
 (define_insn "*aarch64_fnma4_elt<mode>"
diff --git a/gcc/config/aarch64/arm_neon.h b/gcc/config/aarch64/arm_neon.h
index e78ff43..ad5b6fa 100644
--- a/gcc/config/aarch64/arm_neon.h
+++ b/gcc/config/aarch64/arm_neon.h
@@ -26458,6 +26458,32 @@ vsubq_f16 (float16x8_t __a, float16x8_t __b)
   return __a - __b;
 }
 
+/* ARMv8.2-A FP16 three operands vector intrinsics.  */
+
+__extension__ static __inline float16x4_t __attribute__ ((__always_inline__))
+vfma_f16 (float16x4_t __a, float16x4_t __b, float16x4_t __c)
+{
+  return __builtin_aarch64_fmav4hf (__b, __c, __a);
+}
+
+__extension__ static __inline float16x8_t __attribute__ ((__always_inline__))
+vfmaq_f16 (float16x8_t __a, float16x8_t __b, float16x8_t __c)
+{
+  return __builtin_aarch64_fmav8hf (__b, __c, __a);
+}
+
+__extension__ static __inline float16x4_t __attribute__ ((__always_inline__))
+vfms_f16 (float16x4_t __a, float16x4_t __b, float16x4_t __c)
+{
+  return __builtin_aarch64_fnmav4hf (__b, __c, __a);
+}
+
+__extension__ static __inline float16x8_t __attribute__ ((__always_inline__))
+vfmsq_f16 (float16x8_t __a, float16x8_t __b, float16x8_t __c)
+{
+  return __builtin_aarch64_fnmav8hf (__b, __c, __a);
+}
+
 #pragma GCC pop_options
 
 #undef __aarch64_vget_lane_any
-- 
2.5.0
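
For context, a caller might use the new vfmaq_f16 intrinsic roughly as follows. This is a sketch, not part of the patch: mul_accumulate and elem_t are hypothetical names, the guard relies on the ACLE feature macro __ARM_FEATURE_FP16_VECTOR_ARITHMETIC, and the float fallback is there only so the example also builds on targets without FP16 vector arithmetic.

```c
#include <assert.h>
#include <stddef.h>

#if defined(__ARM_FEATURE_FP16_VECTOR_ARITHMETIC)
#include <arm_neon.h>
typedef float16_t elem_t;
#else
typedef float elem_t;   /* Assumed fallback element type for other targets.  */
#endif

/* Hypothetical helper: acc[i] += x[i] * y[i] for n elements.  */
void
mul_accumulate (elem_t *acc, const elem_t *x, const elem_t *y, size_t n)
{
  size_t i = 0;

  assert (acc && x && y);

#if defined(__ARM_FEATURE_FP16_VECTOR_ARITHMETIC)
  /* Eight half-precision lanes per iteration; the accumulator is the
     first argument of vfmaq_f16.  */
  for (; i + 8 <= n; i += 8)
    vst1q_f16 (acc + i,
               vfmaq_f16 (vld1q_f16 (acc + i),
                          vld1q_f16 (x + i),
                          vld1q_f16 (y + i)));
#endif

  /* Scalar tail (and full fallback); this path is not fused.  */
  for (; i < n; i++)
    acc[i] = (elem_t) (acc[i] + x[i] * y[i]);
}
```

On AArch64 the vector path would need something like -march=armv8.2-a+fp16; elsewhere only the scalar loop is compiled.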