From patchwork Mon Jul 4 14:11:53 2016
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Matthew Wahab
X-Patchwork-Id: 644200
Delivered-To: mailing list gcc-patches@gcc.gnu.org
Subject: Re: [PATCH 11/17][ARM] Add builtins for VFP FP16 intrinsics.
To: gcc-patches@gcc.gnu.org
References: <573B28A3.9030603@foss.arm.com> <573B2D9E.2000202@foss.arm.com>
From: Matthew Wahab
Message-ID: <577A6EA9.3090407@foss.arm.com>
Date: Mon, 4 Jul 2016 15:11:53 +0100
In-Reply-To: <573B2D9E.2000202@foss.arm.com>

On 17/05/16 15:41, Matthew Wahab wrote:
> The ACLE intrinsics introduced to support the ARMv8.2 FP16 extensions
> require that intrinsics for scalar floating point (VFP) instructions
> are available under different conditions from those for the NEON
> intrinsics.
>
> This patch adds the support code and builtins data for the new VFP
> intrinsics. Because of the similarities between the scalar and NEON
> builtins, the support code for the scalar builtins follows the code for
> the NEON builtins. The declarations for the VFP builtins are also added
> in this patch since the support code expects non-empty tables.

Updated the patch to drop the builtins for vneg, vadd, vsub, vmul and
vdiv, which are no longer needed.

Tested the series for arm-none-linux-gnueabihf with native bootstrap and
make check, and for arm-none-eabi and armeb-none-eabi with make check on
an ARMv8.2-A emulator.

Ok for trunk?
Matthew

2016-07-04  Matthew Wahab

	* config/arm/arm-builtins.c (hf_UP): New.
	(si_UP): New.
	(vfp_builtin_data): New.  Update comment.
	(enum arm_builtins): Include "arm_vfp_builtins.def".
	(ARM_BUILTIN_VFP_PATTERN_START): New.
	(arm_init_vfp_builtins): New.
	(arm_init_builtins): Add arm_init_vfp_builtins.
	(arm_expand_vfp_builtin): New.
	(arm_expand_builtin): Update for arm_expand_vfp_builtin.  Fix
	long line.
	* config/arm/arm_vfp_builtins.def: New file.
	* config/arm/t-arm (arm.o): Add arm_vfp_builtins.def.
	(arm-builtins.o): Likewise.

From 04896868ba0af25b31e9d23c3af5d3a88e70a564 Mon Sep 17 00:00:00 2001
From: Matthew Wahab
Date: Thu, 7 Apr 2016 15:33:14 +0100
Subject: [PATCH 11/17] [PATCH 11/17][ARM] Add builtins for VFP FP16 intrinsics.

2016-07-04  Matthew Wahab

	* config/arm/arm-builtins.c (hf_UP): New.
	(si_UP): New.
	(vfp_builtin_data): New.  Update comment.
	(enum arm_builtins): Include "arm_vfp_builtins.def".
	(ARM_BUILTIN_VFP_PATTERN_START): New.
	(arm_init_vfp_builtins): New.
	(arm_init_builtins): Add arm_init_vfp_builtins.
	(arm_expand_vfp_builtin): New.
	(arm_expand_builtin): Update for arm_expand_vfp_builtin.  Fix
	long line.
	* config/arm/arm_vfp_builtins.def: New file.
	* config/arm/t-arm (arm.o): Add arm_vfp_builtins.def.
	(arm-builtins.o): Likewise.
---
 gcc/config/arm/arm-builtins.c       | 75 +++++++++++++++++++++++++++++++++----
 gcc/config/arm/arm_vfp_builtins.def | 51 +++++++++++++++++++++++++
 gcc/config/arm/t-arm                |  4 +-
 3 files changed, 121 insertions(+), 9 deletions(-)
 create mode 100644 gcc/config/arm/arm_vfp_builtins.def

diff --git a/gcc/config/arm/arm-builtins.c b/gcc/config/arm/arm-builtins.c
index 5dd81b1..70bcc07 100644
--- a/gcc/config/arm/arm-builtins.c
+++ b/gcc/config/arm/arm-builtins.c
@@ -190,6 +190,8 @@ arm_storestruct_lane_qualifiers[SIMD_MAX_BUILTIN_ARGS]
 #define ti_UP TImode
 #define ei_UP EImode
 #define oi_UP OImode
+#define hf_UP HFmode
+#define si_UP SImode
 
 #define UP(X) X##_UP
 
@@ -239,12 +241,22 @@ typedef struct {
   VAR11 (T, N, A, B, C, D, E, F, G, H, I, J, K) \
   VAR1 (T, N, L)
 
-/* The NEON builtin data can be found in arm_neon_builtins.def.
-   The mode entries in the following table correspond to the "key" type of the
-   instruction variant, i.e. equivalent to that which would be specified after
-   the assembler mnemonic, which usually refers to the last vector operand.
-   The modes listed per instruction should be the same as those defined for
-   that instruction's pattern in neon.md.  */
+/* The NEON builtin data can be found in arm_neon_builtins.def and
+   arm_vfp_builtins.def.  The entries in arm_neon_builtins.def require
+   TARGET_NEON to be true.  The entries in arm_vfp_builtins.def require
+   TARGET_VFP to be true.  The feature tests are checked when the builtins are
+   expanded.
+
+   The mode entries in the following table correspond to
+   the "key" type of the instruction variant, i.e. equivalent to that which
+   would be specified after the assembler mnemonic, which usually refers to the
+   last vector operand.  The modes listed per instruction should be the same as
+   those defined for that instruction's pattern in neon.md.  */
+
+static neon_builtin_datum vfp_builtin_data[] =
+{
+#include "arm_vfp_builtins.def"
+};
 
 static neon_builtin_datum neon_builtin_data[] =
 {
@@ -534,6 +546,10 @@ enum arm_builtins
 #undef CRYPTO2
 #undef CRYPTO3
 
+  ARM_BUILTIN_VFP_BASE,
+
+#include "arm_vfp_builtins.def"
+
   ARM_BUILTIN_NEON_BASE,
   ARM_BUILTIN_NEON_LANE_CHECK = ARM_BUILTIN_NEON_BASE,
 
@@ -542,6 +558,9 @@ enum arm_builtins
   ARM_BUILTIN_MAX
 };
 
+#define ARM_BUILTIN_VFP_PATTERN_START \
+  (ARM_BUILTIN_VFP_BASE + 1)
+
 #define ARM_BUILTIN_NEON_PATTERN_START \
   (ARM_BUILTIN_NEON_BASE + 1)
 
@@ -1033,6 +1052,20 @@ arm_init_neon_builtins (void)
     }
 }
 
+/* Set up all the scalar floating point builtins.  */
+
+static void
+arm_init_vfp_builtins (void)
+{
+  unsigned int i, fcode = ARM_BUILTIN_VFP_PATTERN_START;
+
+  for (i = 0; i < ARRAY_SIZE (vfp_builtin_data); i++, fcode++)
+    {
+      neon_builtin_datum *d = &vfp_builtin_data[i];
+      arm_init_neon_builtin (fcode, d);
+    }
+}
+
 static void
 arm_init_crypto_builtins (void)
 {
@@ -1777,7 +1810,7 @@ arm_init_builtins (void)
   if (TARGET_HARD_FLOAT)
     {
       arm_init_neon_builtins ();
-
+      arm_init_vfp_builtins ();
       arm_init_crypto_builtins ();
     }
 
@@ -2324,6 +2357,27 @@ arm_expand_neon_builtin (int fcode, tree exp, rtx target)
   return arm_expand_neon_builtin_1 (fcode, exp, target, d);
 }
 
+/* Expand a VFP builtin, if TARGET_VFP is true.  These builtins are treated like
+   neon builtins except that the data is looked up in table
+   VFP_BUILTIN_DATA.  */
+
+static rtx
+arm_expand_vfp_builtin (int fcode, tree exp, rtx target)
+{
+  if (fcode >= ARM_BUILTIN_VFP_BASE && ! TARGET_VFP)
+    {
+      fatal_error (input_location,
+                   "You must enable VFP instructions"
+                   " to use these intrinsics.");
+      return const0_rtx;
+    }
+
+  neon_builtin_datum *d
+    = &vfp_builtin_data[fcode - ARM_BUILTIN_VFP_PATTERN_START];
+
+  return arm_expand_neon_builtin_1 (fcode, exp, target, d);
+}
+
 /* Expand an expression EXP that calls a built-in function,
    with result going to TARGET if that's convenient
    (and in mode MODE if that's convenient).
@@ -2361,13 +2415,18 @@ arm_expand_builtin (tree exp,
   if (fcode >= ARM_BUILTIN_NEON_BASE)
     return arm_expand_neon_builtin (fcode, exp, target);
 
+  if (fcode >= ARM_BUILTIN_VFP_BASE)
+    return arm_expand_vfp_builtin (fcode, exp, target);
+
   /* Check in the context of the function making the call whether the
      builtin is supported.  */
   if (fcode >= ARM_BUILTIN_CRYPTO_BASE
       && (!TARGET_CRYPTO || !TARGET_HARD_FLOAT))
     {
       fatal_error (input_location,
-                   "You must enable crypto intrinsics (e.g. include -mfloat-abi=softfp -mfpu=crypto-neon...) to use these intrinsics.");
+                   "You must enable crypto instructions"
+                   " (e.g. include -mfloat-abi=softfp -mfpu=crypto-neon...)"
+                   " to use these intrinsics.");
       return const0_rtx;
     }
 
diff --git a/gcc/config/arm/arm_vfp_builtins.def b/gcc/config/arm/arm_vfp_builtins.def
new file mode 100644
index 0000000..5abfcdd
--- /dev/null
+++ b/gcc/config/arm/arm_vfp_builtins.def
@@ -0,0 +1,51 @@
+/* VFP instruction builtin definitions.
+   Copyright (C) 2016 Free Software Foundation, Inc.
+   Contributed by ARM Ltd.
+   This file is part of GCC.
+
+   GCC is free software; you can redistribute it and/or modify it
+   under the terms of the GNU General Public License as published
+   by the Free Software Foundation; either version 3, or (at your
+   option) any later version.
+
+   GCC is distributed in the hope that it will be useful, but WITHOUT
+   ANY WARRANTY; without even the implied warranty of MERCHANTABILITY
+   or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public
+   License for more details.
+
+   You should have received a copy of the GNU General Public License
+   along with GCC; see the file COPYING3.  If not see
+   <http://www.gnu.org/licenses/>.  */
+
+/* This file lists the builtins that may be available when VFP is enabled but
+   NEON is not enabled.  The entries otherwise have the same requirements and
+   generate the same structures as those in the arm_neon_builtins.def.  */
+
+/* FP16 Arithmetic instructions.  */
+VAR1 (UNOP, vabs, hf)
+VAR2 (UNOP, vcvths, hf, si)
+VAR2 (UNOP, vcvthu, hf, si)
+VAR1 (UNOP, vcvtahs, si)
+VAR1 (UNOP, vcvtahu, si)
+VAR1 (UNOP, vcvtmhs, si)
+VAR1 (UNOP, vcvtmhu, si)
+VAR1 (UNOP, vcvtnhs, si)
+VAR1 (UNOP, vcvtnhu, si)
+VAR1 (UNOP, vcvtphs, si)
+VAR1 (UNOP, vcvtphu, si)
+VAR1 (UNOP, vrnd, hf)
+VAR1 (UNOP, vrnda, hf)
+VAR1 (UNOP, vrndi, hf)
+VAR1 (UNOP, vrndm, hf)
+VAR1 (UNOP, vrndn, hf)
+VAR1 (UNOP, vrndp, hf)
+VAR1 (UNOP, vrndx, hf)
+VAR1 (UNOP, vsqrt, hf)
+
+VAR2 (BINOP, vcvths_n, hf, si)
+VAR2 (BINOP, vcvthu_n, hf, si)
+VAR1 (BINOP, vmaxnm, hf)
+VAR1 (BINOP, vminnm, hf)
+
+VAR1 (TERNOP, vfma, hf)
+VAR1 (TERNOP, vfms, hf)
diff --git a/gcc/config/arm/t-arm b/gcc/config/arm/t-arm
index 749a58d..803baa2 100644
--- a/gcc/config/arm/t-arm
+++ b/gcc/config/arm/t-arm
@@ -95,7 +95,8 @@ arm.o: $(srcdir)/config/arm/arm.c $(CONFIG_H) $(SYSTEM_H) coretypes.h $(TM_H) \
   $(srcdir)/config/arm/arm-cores.def \
   $(srcdir)/config/arm/arm-arches.def $(srcdir)/config/arm/arm-fpus.def \
   $(srcdir)/config/arm/arm-protos.h \
-  $(srcdir)/config/arm/arm_neon_builtins.def
+  $(srcdir)/config/arm/arm_neon_builtins.def \
+  $(srcdir)/config/arm/arm_vfp_builtins.def
 
 arm-builtins.o: $(srcdir)/config/arm/arm-builtins.c $(CONFIG_H) \
   $(SYSTEM_H) coretypes.h $(TM_H) \
@@ -103,6 +104,7 @@ arm-builtins.o: $(srcdir)/config/arm/arm-builtins.c $(CONFIG_H) \
   $(DIAGNOSTIC_CORE_H) $(OPTABS_H) \
   $(srcdir)/config/arm/arm-protos.h \
   $(srcdir)/config/arm/arm_neon_builtins.def \
+  $(srcdir)/config/arm/arm_vfp_builtins.def \
   $(srcdir)/config/arm/arm-simd-builtin-types.def
 	$(COMPILER) -c $(ALL_COMPILERFLAGS) $(ALL_CPPFLAGS) $(INCLUDES) \
 		$(srcdir)/config/arm/arm-builtins.c
-- 
2.1.4
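
For readers unfamiliar with the table layout the patch relies on, here is a
minimal standalone sketch of the idea. It is not GCC code. Every demo_ and
DEMO_ name is hypothetical, and a list macro stands in for the .def file only
so that the sketch fits in one file (the patch itself #includes
arm_vfp_builtins.def twice). One list is expanded once into enumerators that
follow a BASE marker and once into a data table in the same order, so
"fcode - PATTERN_START" indexes the table directly. Function codes are then
classified by range, testing the highest base first, in the same way that
arm_expand_builtin tests ARM_BUILTIN_NEON_BASE before ARM_BUILTIN_VFP_BASE.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a .def file: one list, expanded twice below.
   Each entry gives a builtin name and its operand count.  */
#define DEMO_VFP_LIST(X) \
  X (vabs, 1)            \
  X (vmaxnm, 2)          \
  X (vfma, 3)

/* First expansion: one enumerator per entry, placed after a BASE marker.  */
#define DEMO_ENUM_ENTRY(NAME, NARGS) DEMO_BUILTIN_##NAME,
enum demo_builtins
{
  DEMO_BUILTIN_OTHER,
  DEMO_BUILTIN_VFP_BASE,
  DEMO_VFP_LIST (DEMO_ENUM_ENTRY)
  DEMO_BUILTIN_NEON_BASE,
  DEMO_BUILTIN_MAX
};

#define DEMO_BUILTIN_VFP_PATTERN_START (DEMO_BUILTIN_VFP_BASE + 1)

struct demo_datum
{
  const char *name;
  int nargs;
};

/* Second expansion: the data table, in the same order as the enumerators.  */
#define DEMO_DATA_ENTRY(NAME, NARGS) { #NAME, NARGS },
static const struct demo_datum demo_vfp_data[] =
{
  DEMO_VFP_LIST (DEMO_DATA_ENTRY)
};

/* Return the table entry for FCODE, or NULL if FCODE is not one of the
   enumerators generated from the list.  */
const struct demo_datum *
demo_vfp_lookup (int fcode)
{
  if (fcode < DEMO_BUILTIN_VFP_PATTERN_START
      || fcode >= DEMO_BUILTIN_NEON_BASE)
    return NULL;

  size_t index = (size_t) (fcode - DEMO_BUILTIN_VFP_PATTERN_START);
  assert (index < sizeof demo_vfp_data / sizeof demo_vfp_data[0]);
  return &demo_vfp_data[index];
}

/* Classify FCODE by range.  The NEON base is numerically above the VFP
   base, so it has to be tested first.  */
enum demo_builtins
demo_classify (int fcode)
{
  if (fcode >= DEMO_BUILTIN_NEON_BASE)
    return DEMO_BUILTIN_NEON_BASE;
  if (fcode >= DEMO_BUILTIN_VFP_BASE)
    return DEMO_BUILTIN_VFP_BASE;
  return DEMO_BUILTIN_OTHER;
}
```

In this sketch demo_vfp_lookup (DEMO_BUILTIN_vfma) yields the entry with three
operands, and adding a line to DEMO_VFP_LIST extends the enum and the table
together, which is the property the VFP and NEON tables depend on.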