diff mbox

[11/17,ARM] Add builtins for VFP FP16 intrinsics.

Message ID 577A6EA9.3090407@foss.arm.com
State New
Headers show

Commit Message

Matthew Wahab July 4, 2016, 2:11 p.m. UTC
On 17/05/16 15:41, Matthew Wahab wrote:
 > The ACLE intrinsics introduced to support the ARMv8.2 FP16 extensions
 > require that intrinsics for scalar floating pointer (VFP) instructions
 > are available under different conditions from those for the NEON
 > intrinsics.
 >
 > This patch adds the support code and builtins data for the new VFP
 > intrinsics. Because of the similarities between the scalar and NEON
 > builtins, the support code for the scalar builtins follows the code for
 > the NEON builtins. The declarations for the VFP builtins are also added
 > in this patch since the support code expects non-empty tables.

Updated the patch to drop the builtins for vneg, vadd, vsub, vmul and
vdiv, which are no longer needed.

Tested the series for arm-none-linux-gnueabihf with native bootstrap and
make check and for arm-none-eabi and armeb-none-eabi with make check on
an ARMv8.2-A emulator.

Ok for trunk?
Matthew

2016-07-04  Matthew Wahab  <matthew.wahab@arm.com>

	* config/arm/arm-builtins.c (hf_UP): New.
	(si_UP): New.
	(vfp_builtin_data): New.  Update comment.
	(enum arm_builtins): Include "arm_vfp_builtins.def".
	(ARM_BUILTIN_VFP_PATTERN_START): New.
	(arm_init_vfp_builtins): New.
	(arm_init_builtins): Add arm_init_vfp_builtins.
	(arm_expand_vfp_builtin): New.
	(arm_expand_builtins): Update for arm_expand_vfp_builtin.  Fix
	long line.
	* config/arm/arm_vfp_builtins.def: New file.
	* config/arm/t-arm (arm.o): Add arm_vfp_builtins.def.
	(arm-builtins.o): Likewise.

Comments

Ramana Radhakrishnan July 28, 2016, 11:55 a.m. UTC | #1
On Mon, Jul 4, 2016 at 3:11 PM, Matthew Wahab
<matthew.wahab@foss.arm.com> wrote:
> On 17/05/16 15:41, Matthew Wahab wrote:
>> The ACLE intrinsics introduced to support the ARMv8.2 FP16 extensions
>> require that intrinsics for scalar floating pointer (VFP) instructions
>> are available under different conditions from those for the NEON
>> intrinsics.
>>
>> This patch adds the support code and builtins data for the new VFP
>> intrinsics. Because of the similarities between the scalar and NEON
>> builtins, the support code for the scalar builtins follows the code for
>> the NEON builtins. The declarations for the VFP builtins are also added
>> in this patch since the support code expects non-empty tables.
>
> Updated the patch to drop the builtins for vneg, vadd, vsub, vmul and
> vdiv, which are no longer needed.
>
> Tested the series for arm-none-linux-gnueabihf with native bootstrap and
> make check and for arm-none-eabi and armeb-none-eabi with make check on
> an ARMv8.2-A emulator.
>
> Ok for trunk?
> Matthew
>
> 2016-07-04  Matthew Wahab  <matthew.wahab@arm.com>
>
>         * config/arm/arm-builtins.c (hf_UP): New.
>         (si_UP): New.
>         (vfp_builtin_data): New.  Update comment.
>         (enum arm_builtins): Include "arm_vfp_builtins.def".
>         (ARM_BUILTIN_VFP_PATTERN_START): New.
>         (arm_init_vfp_builtins): New.
>         (arm_init_builtins): Add arm_init_vfp_builtins.
>         (arm_expand_vfp_builtin): New.
>         (arm_expand_builtins): Update for arm_expand_vfp_builtin.  Fix
>         long line.
>         * config/arm/arm_vfp_builtins.def: New file.
>
>         * config/arm/t-arm (arm.o): Add arm_vfp_builtins.def.
>         (arm-builtins.o): Likewise.
>


OK.

Ramana
diff mbox

Patch

From 04896868ba0af25b31e9d23c3af5d3a88e70a564 Mon Sep 17 00:00:00 2001
From: Matthew Wahab <matthew.wahab@arm.com>
Date: Thu, 7 Apr 2016 15:33:14 +0100
Subject: [PATCH 11/17] [PATCH 11/17][ARM] Add builtins for VFP FP16
 intrinsics.

2016-07-04  Matthew Wahab  <matthew.wahab@arm.com>

	* config/arm/arm-builtins.c (hf_UP): New.
	(si_UP): New.
	(vfp_builtin_data): New.  Update comment.
	(enum arm_builtins): Include "arm_vfp_builtins.def".
	(ARM_BUILTIN_VFP_PATTERN_START): New.
	(arm_init_vfp_builtins): New.
	(arm_init_builtins): Add arm_init_vfp_builtins.
	(arm_expand_vfp_builtin): New.
	(arm_expand_builtins): Update for arm_expand_vfp_builtin.  Fix
	long line.
	* config/arm/arm_vfp_builtins.def: New file.
	* config/arm/t-arm (arm.o): Add arm_vfp_builtins.def.
	(arm-builtins.o): Likewise.
---
 gcc/config/arm/arm-builtins.c       | 75 +++++++++++++++++++++++++++++++++----
 gcc/config/arm/arm_vfp_builtins.def | 51 +++++++++++++++++++++++++
 gcc/config/arm/t-arm                |  4 +-
 3 files changed, 121 insertions(+), 9 deletions(-)
 create mode 100644 gcc/config/arm/arm_vfp_builtins.def

diff --git a/gcc/config/arm/arm-builtins.c b/gcc/config/arm/arm-builtins.c
index 5dd81b1..70bcc07 100644
--- a/gcc/config/arm/arm-builtins.c
+++ b/gcc/config/arm/arm-builtins.c
@@ -190,6 +190,8 @@  arm_storestruct_lane_qualifiers[SIMD_MAX_BUILTIN_ARGS]
 #define ti_UP	 TImode
 #define ei_UP	 EImode
 #define oi_UP	 OImode
+#define hf_UP	 HFmode
+#define si_UP	 SImode
 
 #define UP(X) X##_UP
 
@@ -239,12 +241,22 @@  typedef struct {
   VAR11 (T, N, A, B, C, D, E, F, G, H, I, J, K) \
   VAR1 (T, N, L)
 
-/* The NEON builtin data can be found in arm_neon_builtins.def.
-   The mode entries in the following table correspond to the "key" type of the
-   instruction variant, i.e. equivalent to that which would be specified after
-   the assembler mnemonic, which usually refers to the last vector operand.
-   The modes listed per instruction should be the same as those defined for
-   that instruction's pattern in neon.md.  */
+/* The NEON builtin data can be found in arm_neon_builtins.def and
+   arm_vfp_builtins.def.  The entries in arm_neon_builtins.def require
+   TARGET_NEON to be true.  The entries in arm_vfp_builtins.def require
+   TARGET_VFP to be true.  The feature tests are checked when the builtins are
+   expanded.
+
+   The mode entries in the following table correspond to
+   the "key" type of the instruction variant, i.e. equivalent to that which
+   would be specified after the assembler mnemonic, which usually refers to the
+   last vector operand.  The modes listed per instruction should be the same as
+   those defined for that instruction's pattern in neon.md.  */
+
+static neon_builtin_datum vfp_builtin_data[] =
+{
+#include "arm_vfp_builtins.def"
+};
 
 static neon_builtin_datum neon_builtin_data[] =
 {
@@ -534,6 +546,10 @@  enum arm_builtins
 #undef CRYPTO2
 #undef CRYPTO3
 
+  ARM_BUILTIN_VFP_BASE,
+
+#include "arm_vfp_builtins.def"
+
   ARM_BUILTIN_NEON_BASE,
   ARM_BUILTIN_NEON_LANE_CHECK = ARM_BUILTIN_NEON_BASE,
 
@@ -542,6 +558,9 @@  enum arm_builtins
   ARM_BUILTIN_MAX
 };
 
+#define ARM_BUILTIN_VFP_PATTERN_START \
+  (ARM_BUILTIN_VFP_BASE + 1)
+
 #define ARM_BUILTIN_NEON_PATTERN_START \
   (ARM_BUILTIN_NEON_BASE + 1)
 
@@ -1033,6 +1052,20 @@  arm_init_neon_builtins (void)
     }
 }
 
+/* Set up all the scalar floating point builtins.  */
+
+static void
+arm_init_vfp_builtins (void)
+{
+  unsigned int i, fcode = ARM_BUILTIN_VFP_PATTERN_START;
+
+  for (i = 0; i < ARRAY_SIZE (vfp_builtin_data); i++, fcode++)
+    {
+      neon_builtin_datum *d = &vfp_builtin_data[i];
+      arm_init_neon_builtin (fcode, d);
+    }
+}
+
 static void
 arm_init_crypto_builtins (void)
 {
@@ -1777,7 +1810,7 @@  arm_init_builtins (void)
   if (TARGET_HARD_FLOAT)
     {
       arm_init_neon_builtins ();
-
+      arm_init_vfp_builtins ();
       arm_init_crypto_builtins ();
     }
 
@@ -2324,6 +2357,27 @@  arm_expand_neon_builtin (int fcode, tree exp, rtx target)
   return arm_expand_neon_builtin_1 (fcode, exp, target, d);
 }
 
+/* Expand a VFP builtin, if TARGET_VFP is true.  These builtins are treated like
+   neon builtins except that the data is looked up in table
+   VFP_BUILTIN_DATA.  */
+
+static rtx
+arm_expand_vfp_builtin (int fcode, tree exp, rtx target)
+{
+  if (fcode >= ARM_BUILTIN_VFP_BASE && ! TARGET_VFP)
+    {
+      fatal_error (input_location,
+		   "You must enable VFP instructions"
+		   " to use these intrinsics.");
+      return const0_rtx;
+    }
+
+  neon_builtin_datum *d
+    = &vfp_builtin_data[fcode - ARM_BUILTIN_VFP_PATTERN_START];
+
+  return arm_expand_neon_builtin_1 (fcode, exp, target, d);
+}
+
 /* Expand an expression EXP that calls a built-in function,
    with result going to TARGET if that's convenient
    (and in mode MODE if that's convenient).
@@ -2361,13 +2415,18 @@  arm_expand_builtin (tree exp,
   if (fcode >= ARM_BUILTIN_NEON_BASE)
     return arm_expand_neon_builtin (fcode, exp, target);
 
+  if (fcode >= ARM_BUILTIN_VFP_BASE)
+    return arm_expand_vfp_builtin (fcode, exp, target);
+
   /* Check in the context of the function making the call whether the
      builtin is supported.  */
   if (fcode >= ARM_BUILTIN_CRYPTO_BASE
       && (!TARGET_CRYPTO || !TARGET_HARD_FLOAT))
     {
       fatal_error (input_location,
-		   "You must enable crypto intrinsics (e.g. include -mfloat-abi=softfp -mfpu=crypto-neon...) to use these intrinsics.");
+		   "You must enable crypto instructions"
+		   " (e.g. include -mfloat-abi=softfp -mfpu=crypto-neon...)"
+		   " to use these intrinsics.");
       return const0_rtx;
     }
 
diff --git a/gcc/config/arm/arm_vfp_builtins.def b/gcc/config/arm/arm_vfp_builtins.def
new file mode 100644
index 0000000..5abfcdd
--- /dev/null
+++ b/gcc/config/arm/arm_vfp_builtins.def
@@ -0,0 +1,51 @@ 
+/* VFP instruction builtin definitions.
+   Copyright (C) 2016 Free Software Foundation, Inc.
+   Contributed by ARM Ltd.
+   This file is part of GCC.
+
+   GCC is free software; you can redistribute it and/or modify it
+   under the terms of the GNU General Public License as published
+   by the Free Software Foundation; either version 3, or (at your
+   option) any later version.
+
+   GCC is distributed in the hope that it will be useful, but WITHOUT
+   ANY WARRANTY; without even the implied warranty of MERCHANTABILITY
+   or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public
+   License for more details.
+
+   You should have received a copy of the GNU General Public License
+   along with GCC; see the file COPYING3.  If not see
+   <http://www.gnu.org/licenses/>.  */
+
+/* This file lists the builtins that may be available when VFP is enabled but
+   not NEON is enabled.  The entries otherwise have the same requirements and
+   generate the same structures as those in the arm_neon_builtins.def.  */
+
+/* FP16 Arithmetic instructions.  */
+VAR1 (UNOP, vabs, hf)
+VAR2 (UNOP, vcvths, hf, si)
+VAR2 (UNOP, vcvthu, hf, si)
+VAR1 (UNOP, vcvtahs, si)
+VAR1 (UNOP, vcvtahu, si)
+VAR1 (UNOP, vcvtmhs, si)
+VAR1 (UNOP, vcvtmhu, si)
+VAR1 (UNOP, vcvtnhs, si)
+VAR1 (UNOP, vcvtnhu, si)
+VAR1 (UNOP, vcvtphs, si)
+VAR1 (UNOP, vcvtphu, si)
+VAR1 (UNOP, vrnd, hf)
+VAR1 (UNOP, vrnda, hf)
+VAR1 (UNOP, vrndi, hf)
+VAR1 (UNOP, vrndm, hf)
+VAR1 (UNOP, vrndn, hf)
+VAR1 (UNOP, vrndp, hf)
+VAR1 (UNOP, vrndx, hf)
+VAR1 (UNOP, vsqrt, hf)
+
+VAR2 (BINOP, vcvths_n, hf, si)
+VAR2 (BINOP, vcvthu_n, hf, si)
+VAR1 (BINOP, vmaxnm, hf)
+VAR1 (BINOP, vminnm, hf)
+
+VAR1 (TERNOP, vfma, hf)
+VAR1 (TERNOP, vfms, hf)
diff --git a/gcc/config/arm/t-arm b/gcc/config/arm/t-arm
index 749a58d..803baa2 100644
--- a/gcc/config/arm/t-arm
+++ b/gcc/config/arm/t-arm
@@ -95,7 +95,8 @@  arm.o: $(srcdir)/config/arm/arm.c $(CONFIG_H) $(SYSTEM_H) coretypes.h $(TM_H) \
   $(srcdir)/config/arm/arm-cores.def \
   $(srcdir)/config/arm/arm-arches.def $(srcdir)/config/arm/arm-fpus.def \
   $(srcdir)/config/arm/arm-protos.h \
-  $(srcdir)/config/arm/arm_neon_builtins.def
+  $(srcdir)/config/arm/arm_neon_builtins.def \
+  $(srcdir)/config/arm/arm_vfp_builtins.def
 
 arm-builtins.o: $(srcdir)/config/arm/arm-builtins.c $(CONFIG_H) \
   $(SYSTEM_H) coretypes.h $(TM_H) \
@@ -103,6 +104,7 @@  arm-builtins.o: $(srcdir)/config/arm/arm-builtins.c $(CONFIG_H) \
   $(DIAGNOSTIC_CORE_H) $(OPTABS_H) \
   $(srcdir)/config/arm/arm-protos.h \
   $(srcdir)/config/arm/arm_neon_builtins.def \
+  $(srcdir)/config/arm/arm_vfp_builtins.def \
   $(srcdir)/config/arm/arm-simd-builtin-types.def
 	$(COMPILER) -c $(ALL_COMPILERFLAGS) $(ALL_CPPFLAGS) $(INCLUDES) \
 		$(srcdir)/config/arm/arm-builtins.c
-- 
2.1.4