From patchwork Mon Dec 5 16:54:37 2016
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Andre Vieira
X-Patchwork-Id: 702790
Delivered-To: mailing list gcc-patches@gcc.gnu.org
Subject: [arm-embedded][committed][PATCH 3/6] ARM ACLE Coprocessor CDP intrinsics
To: gcc-patches@gcc.gnu.org
References: <5822F3CB.3040202@arm.com> <5822F652.208@arm.com>
From: "Andre Vieira (lists)"
Message-ID: <58459BCD.7010706@arm.com>
Date: Mon, 5 Dec 2016 16:54:37 +0000
In-Reply-To: <5822F652.208@arm.com>

On 09/11/16 10:11, Andre Vieira (lists) wrote:
> Hi,
>
> This patch implements support for the ARM ACLE Coprocessor CDP intrinsics.
> See below a table mapping the intrinsics to their respective instructions:
>
> +----------------------------------------------------+--------------------------------------+
> | Intrinsic signature                                | Instruction pattern                  |
> +----------------------------------------------------+--------------------------------------+
> |void __arm_cdp(coproc, opc1, CRd, CRn, CRm, opc2)   |CDP coproc, opc1, CRd, CRn, CRm, opc2 |
> +----------------------------------------------------+--------------------------------------+
> |void __arm_cdp2(coproc, opc1, CRd, CRn, CRm, opc2)  |CDP2 coproc, opc1, CRd, CRn, CRm, opc2|
> +----------------------------------------------------+--------------------------------------+
> Note that any untyped variable in the intrinsic signature is required to
> be a compile-time constant and has the type 'unsigned int'.  We do some
> boundary checks for coproc:[0-15], opc1:[0-15], CR*:[0-31], opc2:[0-7].
> If any of these requirements is not met, a diagnostic is issued.
>
> I renamed neon_const_bounds in this patch, to arm_const_bounds, simply
> because it is also used in the Coprocessor intrinsics.  The patch also
> requires expanding the builtin framework so that it accepts 'void'
> modes and intrinsics with 6 arguments.
>
> I also changed acle.exp to run tests for multiple options, where all lto
> option sets are appended with -ffat-lto-objects to allow for assembly scans.
>
> Is this OK for trunk?
>
> Regards,
> Andre
>
> gcc/ChangeLog:
> 2016-11-09 Andre Vieira
>
>     * config/arm/arm.md (<cdp>): New.
>     * config/arm/arm.c (neon_const_bounds): Rename this ...
>     (arm_const_bounds): ... this.
>     (arm_coproc_builtin_available): New.
>     * config/arm/arm-builtins.c (SIMD_MAX_BUILTIN_ARGS): Increase.
>     (arm_type_qualifiers): Add 'qualifier_unsigned_immediate'.
>     (CDP_QUALIFIERS): Define to...
>     (arm_cdp_qualifiers): ... this. New.
>     (void_UP): Define.
>     (arm_expand_builtin_args): Add case for 6 arguments.
>     * config/arm/arm-protos.h (neon_const_bounds): Rename this ...
>     (arm_const_bounds): ... this.
>     (arm_coproc_builtin_available): New.
>     * config/arm/arm_acle.h (__arm_cdp): New.
>     (__arm_cdp2): New.
>     * config/arm/arm_acle_builtins.def (cdp): New.
>     (cdp2): New.
>     * config/arm/iterators.md (CDPI,CDP,cdp): New.
>     * config/arm/neon.md: Rename all 'neon_const_bounds' to
>     'arm_const_bounds'.
>     * config/arm/types.md (coproc): New.
>     * config/arm/unspecs.md (VUNSPEC_CDP, VUNSPEC_CDP2): New.
>     * gcc/doc/extend.texi (ACLE): Add a mention of Coprocessor intrinsics.
>
> gcc/testsuite/ChangeLog:
> 2016-11-09 Andre Vieira
>
>     * gcc.target/arm/acle/acle.exp: Run tests for different options
>     and make sure fat-lto-objects is used such that we can still do
>     assemble scans.
>     * gcc.target/arm/acle/cdp.c: New.
>     * gcc.target/arm/acle/cdp2.c: New.
>     * lib/target-supports.exp (check_effective_target_arm_coproc1_ok): New.
>     (check_effective_target_arm_coproc1_ok_nocache): New.
>     (check_effective_target_arm_coproc2_ok): New.
>     (check_effective_target_arm_coproc2_ok_nocache): New.
>     (check_effective_target_arm_coproc3_ok): New.
>     (check_effective_target_arm_coproc3_ok_nocache): New.
>

Hi,

I committed this patch to the embedded-6-branch in revision r243261.

Cheers,
Andre

gcc/ChangeLog.arm:
2016-11-09 Andre Vieira

    * config/arm/arm.md (<cdp>): New.
    * config/arm/arm.c (neon_const_bounds): Rename this ...
    (arm_const_bounds): ... this.
    (arm_coproc_builtin_available): New.
    * config/arm/arm-builtins.c (SIMD_MAX_BUILTIN_ARGS): Increase.
    (arm_type_qualifiers): Add 'qualifier_unsigned_immediate'.
    (CDP_QUALIFIERS): Define to...
    (arm_cdp_qualifiers): ... this. New.
    (void_UP): Define.
    (arm_expand_builtin_args): Add case for 6 arguments.
    * config/arm/arm-protos.h (neon_const_bounds): Rename this ...
    (arm_const_bounds): ... this.
    (arm_coproc_builtin_available): New.
    * config/arm/arm_acle.h (__arm_cdp): New.
    (__arm_cdp2): New.
    * config/arm/arm_acle_builtins.def (cdp): New.
    (cdp2): New.
    * config/arm/iterators.md (CDPI,CDP,cdp): New.
    * config/arm/neon.md: Rename all 'neon_const_bounds' to
    'arm_const_bounds'.
    * config/arm/types.md (coproc): New.
    * config/arm/unspecs.md (VUNSPEC_CDP, VUNSPEC_CDP2): New.
    * gcc/doc/extend.texi (ACLE): Add a mention of Coprocessor intrinsics.
    * gcc/doc/sourcebuild.texi (arm_coproc1_ok, arm_coproc2_ok,
    arm_coproc3_ok): New.

gcc/testsuite/ChangeLog.arm:
2016-12-05 Andre Vieira

    * gcc.target/arm/acle/acle.exp: Run tests for different options
    and make sure fat-lto-objects is used such that we can still do
    assemble scans.
    * gcc.target/arm/acle/cdp.c: New.
    * gcc.target/arm/acle/cdp2.c: New.
    * lib/target-supports.exp (check_effective_target_arm_coproc1_ok): New.
    (check_effective_target_arm_coproc1_ok_nocache): New.
    (check_effective_target_arm_coproc2_ok): New.
    (check_effective_target_arm_coproc2_ok_nocache): New.
    (check_effective_target_arm_coproc3_ok): New.
    (check_effective_target_arm_coproc3_ok_nocache): New.

diff --git a/gcc/ChangeLog.arm b/gcc/ChangeLog.arm
index 9b0763dc96469053a64b312eba8f8519cd5667ad..ecb7c33a708d70a4e0834285e127b2caa849941b 100644
--- a/gcc/ChangeLog.arm
+++ b/gcc/ChangeLog.arm
@@ -1,5 +1,33 @@
 2016-12-05 Andre Vieira
 
+    * config/arm/arm.md (<cdp>): New.
+    * config/arm/arm.c (neon_const_bounds): Rename this ...
+    (arm_const_bounds): ... this.
+    (arm_coproc_builtin_available): New.
+    * config/arm/arm-builtins.c (SIMD_MAX_BUILTIN_ARGS): Increase.
+    (arm_type_qualifiers): Add 'qualifier_unsigned_immediate'.
+    (CDP_QUALIFIERS): Define to...
+    (arm_cdp_qualifiers): ... this. New.
+    (void_UP): Define.
+    (arm_expand_builtin_args): Add case for 6 arguments.
+    * config/arm/arm-protos.h (neon_const_bounds): Rename this ...
+    (arm_const_bounds): ... this.
+    (arm_coproc_builtin_available): New.
+    * config/arm/arm_acle.h (__arm_cdp): New.
+    (__arm_cdp2): New.
+    * config/arm/arm_acle_builtins.def (cdp): New.
+    (cdp2): New.
+    * config/arm/iterators.md (CDPI,CDP,cdp): New.
+    * config/arm/neon.md: Rename all 'neon_const_bounds' to
+    'arm_const_bounds'.
+    * config/arm/types.md (coproc): New.
+    * config/arm/unspecs.md (VUNSPEC_CDP, VUNSPEC_CDP2): New.
+    * gcc/doc/extend.texi (ACLE): Add a mention of Coprocessor intrinsics.
+    * gcc/doc/sourcebuild.texi
+    (arm_coproc1_ok, arm_coproc2_ok, arm_coproc3_ok): New.
+
+2016-12-05 Andre Vieira
+
     * config/arm/arm-builtins.c (arm_unsigned_binop_qualifiers): New.
     (UBINOP_QUALIFIERS): New.
     (si_UP): Define.
diff --git a/gcc/config/arm/arm-builtins.c b/gcc/config/arm/arm-builtins.c
index 1fb41c91efc845fb72b9412b5ff3d7fd219fb210..9edb1ddc79af09dd94001f58d009f877bc107a65 100644
--- a/gcc/config/arm/arm-builtins.c
+++ b/gcc/config/arm/arm-builtins.c
@@ -37,7 +37,7 @@
 #include "langhooks.h"
 #include "case-cfn-macros.h"
 
-#define SIMD_MAX_BUILTIN_ARGS 5
+#define SIMD_MAX_BUILTIN_ARGS 7
 
 enum arm_type_qualifiers
 {
@@ -52,6 +52,7 @@ enum arm_type_qualifiers
   /* Used when expanding arguments if an operand could
      be an immediate.  */
   qualifier_immediate = 0x8, /* 1 << 3  */
+  qualifier_unsigned_immediate = 0x9,
   qualifier_maybe_immediate = 0x10, /* 1 << 4  */
   /* void foo (...).  */
   qualifier_void = 0x20, /* 1 << 5  */
@@ -163,6 +164,18 @@ arm_unsigned_binop_qualifiers[SIMD_MAX_BUILTIN_ARGS]
       qualifier_unsigned };
 #define UBINOP_QUALIFIERS (arm_unsigned_binop_qualifiers)
 
+/* void (unsigned immediate, unsigned immediate, unsigned immediate,
+         unsigned immediate, unsigned immediate, unsigned immediate).  */
+static enum arm_type_qualifiers
+arm_cdp_qualifiers[SIMD_MAX_BUILTIN_ARGS]
+  = { qualifier_void, qualifier_unsigned_immediate,
+      qualifier_unsigned_immediate,
+      qualifier_unsigned_immediate,
+      qualifier_unsigned_immediate,
+      qualifier_unsigned_immediate,
+      qualifier_unsigned_immediate };
+#define CDP_QUALIFIERS \
+  (arm_cdp_qualifiers)
 /* The first argument (return type) of a store should be void type,
    which we represent with qualifier_void.
Their first operand will be
    a DImode pointer to the location to store to, so we must use
@@ -198,6 +211,7 @@ arm_storestruct_lane_qualifiers[SIMD_MAX_BUILTIN_ARGS]
 #define ei_UP EImode
 #define oi_UP OImode
 #define si_UP SImode
+#define void_UP VOIDmode
 
 #define UP(X) X##_UP
 
@@ -2192,6 +2206,10 @@ constant_arg:
             pat = GEN_FCN (icode) (target, op[0], op[1], op[2], op[3], op[4]);
             break;
 
+          case 6:
+            pat = GEN_FCN (icode) (target, op[0], op[1], op[2], op[3], op[4], op[5]);
+            break;
+
           default:
             gcc_unreachable ();
           }
@@ -2218,6 +2236,10 @@ constant_arg:
             pat = GEN_FCN (icode) (op[0], op[1], op[2], op[3], op[4]);
             break;
 
+          case 6:
+            pat = GEN_FCN (icode) (op[0], op[1], op[2], op[3], op[4], op[5]);
+            break;
+
           default:
             gcc_unreachable ();
           }
diff --git a/gcc/config/arm/arm-protos.h b/gcc/config/arm/arm-protos.h
index 05bd4aaa17794fdb236d78b51d817c2c782dbec5..c60e3c6d4168bfa67d2e11a5c1e454576293e1b3 100644
--- a/gcc/config/arm/arm-protos.h
+++ b/gcc/config/arm/arm-protos.h
@@ -90,7 +90,7 @@ extern rtx neon_make_constant (rtx);
 extern tree arm_builtin_vectorized_function (unsigned int, tree, tree);
 extern void neon_expand_vector_init (rtx, rtx);
 extern void neon_lane_bounds (rtx, HOST_WIDE_INT, HOST_WIDE_INT, const_tree);
-extern void neon_const_bounds (rtx, HOST_WIDE_INT, HOST_WIDE_INT);
+extern void arm_const_bounds (rtx, HOST_WIDE_INT, HOST_WIDE_INT);
 extern HOST_WIDE_INT neon_element_bits (machine_mode);
 extern void neon_emit_pair_result_insn (machine_mode,
                                         rtx (*) (rtx, rtx, rtx, rtx),
@@ -169,6 +169,7 @@ extern void arm_expand_compare_and_swap (rtx op[]);
 extern void arm_split_compare_and_swap (rtx op[]);
 extern void arm_split_atomic_op (enum rtx_code, rtx, rtx, rtx, rtx, rtx, rtx);
 extern rtx arm_load_tp (rtx);
+extern bool arm_coproc_builtin_available (enum unspecv);
 
 #if defined TREE_CODE
 extern void arm_init_cumulative_args (CUMULATIVE_ARGS *, tree, rtx, tree);
diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
index
7d37aae072f6e334f6bc73accf66047c23f28d0d..dee8389a2c1e766db530bb3316aeaa951906ce67 100644
--- a/gcc/config/arm/arm.c
+++ b/gcc/config/arm/arm.c
@@ -13190,7 +13190,7 @@ neon_lane_bounds (rtx operand, HOST_WIDE_INT low, HOST_WIDE_INT high,
 /* Bounds-check constants.  */
 
 void
-neon_const_bounds (rtx operand, HOST_WIDE_INT low, HOST_WIDE_INT high)
+arm_const_bounds (rtx operand, HOST_WIDE_INT low, HOST_WIDE_INT high)
 {
   bounds_check (operand, low, high, NULL_TREE, "constant");
 }
@@ -31500,4 +31500,33 @@ arm_elf_section_type_flags (tree decl, const char *name, int reloc)
   return flags;
 }
 
+/* This function checks for the availability of the coprocessor builtin passed
+   in BUILTIN for the current target.  Returns true if it is available and
+   false otherwise.  If a BUILTIN is passed for which this function has not
+   been implemented it will cause an exception.  */
+
+bool arm_coproc_builtin_available (enum unspecv builtin)
+{
+  /* None of these builtins are available in Thumb mode if the target only
+     supports Thumb-1.  */
+  if (TARGET_THUMB1)
+    return false;
+
+  switch (builtin)
+    {
+      case VUNSPEC_CDP:
+        if (arm_arch4)
+          return true;
+        break;
+      case VUNSPEC_CDP2:
+        /* Only present in ARMv5*, ARMv6 (but not ARMv6-M), ARMv7* and
+           ARMv8-{A,M}.
*/
+        if (arm_arch5)
+          return true;
+        break;
+      default:
+        gcc_unreachable ();
+    }
+  return false;
+}
 #include "gt-arm.h"
diff --git a/gcc/config/arm/arm.md b/gcc/config/arm/arm.md
index c8dc184cd154922b4fc881d138058f68dc9e8569..590c7f40bb2fb554eff454ab93a1a8baf1cfcae4 100644
--- a/gcc/config/arm/arm.md
+++ b/gcc/config/arm/arm.md
@@ -11498,6 +11498,26 @@
   DONE;
 })
 
+(define_insn "<cdp>"
+  [(unspec_volatile [(match_operand:SI 0 "immediate_operand")
+                     (match_operand:SI 1 "immediate_operand")
+                     (match_operand:SI 2 "immediate_operand")
+                     (match_operand:SI 3 "immediate_operand")
+                     (match_operand:SI 4 "immediate_operand")
+                     (match_operand:SI 5 "immediate_operand")] CDPI)]
+  "arm_coproc_builtin_available (VUNSPEC_<CDP>)"
+{
+  arm_const_bounds (operands[0], 0, 16);
+  arm_const_bounds (operands[1], 0, 16);
+  arm_const_bounds (operands[2], 0, (1 << 5));
+  arm_const_bounds (operands[3], 0, (1 << 5));
+  arm_const_bounds (operands[4], 0, (1 << 5));
+  arm_const_bounds (operands[5], 0, 8);
+  return "<cdp>\\tp%c0, %1, CR%c2, CR%c3, CR%c4, %5";
+}
+  [(set_attr "length" "4")
+   (set_attr "type" "coproc")])
+
 ;; Vector bits common to IWMMXT and Neon
 (include "vec-common.md")
 ;; Load the Intel Wireless Multimedia Extension patterns
diff --git a/gcc/config/arm/arm_acle.h b/gcc/config/arm/arm_acle.h
index 5d937168e10499d7a926495d668efd2bc4a72f79..747a07ced6c7aa2dbf606ca48d4637847add12c6 100644
--- a/gcc/config/arm/arm_acle.h
+++ b/gcc/config/arm/arm_acle.h
@@ -32,6 +32,26 @@
 extern "C" {
 #endif
 
+#if (!__thumb__ || __thumb2__) && __ARM_ARCH >= 4
+__extension__ static __inline void __attribute__ ((__always_inline__))
+__arm_cdp (const unsigned int __coproc, const unsigned int __opc1,
+           const unsigned int __CRd, const unsigned int __CRn,
+           const unsigned int __CRm, const unsigned int __opc2)
+{
+  return __builtin_arm_cdp (__coproc, __opc1, __CRd, __CRn, __CRm, __opc2);
+}
+
+#if __ARM_ARCH >= 5
+__extension__ static __inline void __attribute__ ((__always_inline__))
+__arm_cdp2 (const unsigned int __coproc,
const unsigned int __opc1,
+            const unsigned int __CRd, const unsigned int __CRn,
+            const unsigned int __CRm, const unsigned int __opc2)
+{
+  return __builtin_arm_cdp2 (__coproc, __opc1, __CRd, __CRn, __CRm, __opc2);
+}
+#endif /* __ARM_ARCH >= 5.  */
+#endif /* (!__thumb__ || __thumb2__) && __ARM_ARCH >= 4.  */
+
 #ifdef __ARM_FEATURE_CRC32
 __extension__ static __inline uint32_t __attribute__ ((__always_inline__))
 __crc32b (uint32_t __a, uint8_t __b)
diff --git a/gcc/config/arm/arm_acle_builtins.def b/gcc/config/arm/arm_acle_builtins.def
index 81ab7720971ba042a5d64c22b6bd19710147e602..03b5bf88ef2632bceedba1e64c0f83bc50337364 100644
--- a/gcc/config/arm/arm_acle_builtins.def
+++ b/gcc/config/arm/arm_acle_builtins.def
@@ -24,3 +24,5 @@ VAR1 (UBINOP, crc32w, si)
 VAR1 (UBINOP, crc32cb, si)
 VAR1 (UBINOP, crc32ch, si)
 VAR1 (UBINOP, crc32cw, si)
+VAR1 (CDP, cdp, void)
+VAR1 (CDP, cdp2, void)
diff --git a/gcc/config/arm/iterators.md b/gcc/config/arm/iterators.md
index aba1023cdd0d9feb5396222a27c1897ca844937e..e48d89a1f43b49493f8e5b986fd2c35bb7610d4d 100644
--- a/gcc/config/arm/iterators.md
+++ b/gcc/config/arm/iterators.md
@@ -847,3 +847,8 @@
 
 ;; Attributes for VQRDMLAH/VQRDMLSH
 (define_int_attr neon_rdma_as [(UNSPEC_VQRDMLAH "a") (UNSPEC_VQRDMLSH "s")])
+
+;; An iterator for the CDP coprocessor instructions
+(define_int_iterator CDPI [VUNSPEC_CDP VUNSPEC_CDP2])
+(define_int_attr cdp [(VUNSPEC_CDP "cdp") (VUNSPEC_CDP2 "cdp2")])
+(define_int_attr CDP [(VUNSPEC_CDP "CDP") (VUNSPEC_CDP2 "CDP2")])
diff --git a/gcc/config/arm/neon.md b/gcc/config/arm/neon.md
index 879c07c13b6aa20c46828d08f5a4f413a5722eca..14c1ac5f607fdbec2120dc5c89d0e21baae54742 100644
--- a/gcc/config/arm/neon.md
+++ b/gcc/config/arm/neon.md
@@ -3100,7 +3100,7 @@ if (BYTES_BIG_ENDIAN)
                    VCVT_US_N))]
   "TARGET_NEON"
 {
-  neon_const_bounds (operands[2], 1, 33);
+  arm_const_bounds (operands[2], 1, 33);
   return "vcvt.%#32.f32\t%0, %1, %2";
 }
   [(set_attr "type" "neon_fp_to_int_")]
@@ -3113,7 +3113,7 @@ if (BYTES_BIG_ENDIAN)
                   VCVT_US_N))]
   "TARGET_NEON"
 {
-  neon_const_bounds (operands[2], 1, 33);
+  arm_const_bounds (operands[2], 1, 33);
   return "vcvt.f32.%#32\t%0, %1, %2";
 }
   [(set_attr "type" "neon_int_to_fp_")]
@@ -3682,7 +3682,7 @@ if (BYTES_BIG_ENDIAN)
                    UNSPEC_VEXT))]
   "TARGET_NEON"
 {
-  neon_const_bounds (operands[3], 0, GET_MODE_NUNITS (mode));
+  arm_const_bounds (operands[3], 0, GET_MODE_NUNITS (mode));
   return "vext.\t%0, %1, %2, %3";
 }
   [(set_attr "type" "neon_ext")]
@@ -3779,7 +3779,7 @@ if (BYTES_BIG_ENDIAN)
                    VSHR_N))]
   "TARGET_NEON"
 {
-  neon_const_bounds (operands[2], 1, neon_element_bits (mode) + 1);
+  arm_const_bounds (operands[2], 1, neon_element_bits (mode) + 1);
   return "v.%#\t%0, %1, %2";
 }
   [(set_attr "type" "neon_shift_imm")]
@@ -3793,7 +3793,7 @@ if (BYTES_BIG_ENDIAN)
                    VSHRN_N))]
   "TARGET_NEON"
 {
-  neon_const_bounds (operands[2], 1, neon_element_bits (mode) / 2 + 1);
+  arm_const_bounds (operands[2], 1, neon_element_bits (mode) / 2 + 1);
   return "v.\t%P0, %q1, %2";
 }
   [(set_attr "type" "neon_shift_imm_narrow_q")]
@@ -3807,7 +3807,7 @@ if (BYTES_BIG_ENDIAN)
                    VQSHRN_N))]
   "TARGET_NEON"
 {
-  neon_const_bounds (operands[2], 1, neon_element_bits (mode) / 2 + 1);
+  arm_const_bounds (operands[2], 1, neon_element_bits (mode) / 2 + 1);
   return "v.%#\t%P0, %q1, %2";
 }
   [(set_attr "type" "neon_sat_shift_imm_narrow_q")]
@@ -3821,7 +3821,7 @@ if (BYTES_BIG_ENDIAN)
                    VQSHRUN_N))]
   "TARGET_NEON"
 {
-  neon_const_bounds (operands[2], 1, neon_element_bits (mode) / 2 + 1);
+  arm_const_bounds (operands[2], 1, neon_element_bits (mode) / 2 + 1);
   return "v.\t%P0, %q1, %2";
 }
   [(set_attr "type" "neon_sat_shift_imm_narrow_q")]
@@ -3834,7 +3834,7 @@ if (BYTES_BIG_ENDIAN)
                    UNSPEC_VSHL_N))]
   "TARGET_NEON"
 {
-  neon_const_bounds (operands[2], 0, neon_element_bits (mode));
+  arm_const_bounds (operands[2], 0, neon_element_bits (mode));
   return "vshl.\t%0, %1, %2";
 }
   [(set_attr "type" "neon_shift_imm")]
@@ -3847,7 +3847,7 @@ if (BYTES_BIG_ENDIAN)
                    VQSHL_N))]
   "TARGET_NEON"
 {
-  neon_const_bounds (operands[2], 0, neon_element_bits
(mode));
+  arm_const_bounds (operands[2], 0, neon_element_bits (mode));
   return "vqshl.%#\t%0, %1, %2";
 }
   [(set_attr "type" "neon_sat_shift_imm")]
@@ -3860,7 +3860,7 @@ if (BYTES_BIG_ENDIAN)
                    UNSPEC_VQSHLU_N))]
   "TARGET_NEON"
 {
-  neon_const_bounds (operands[2], 0, neon_element_bits (mode));
+  arm_const_bounds (operands[2], 0, neon_element_bits (mode));
   return "vqshlu.\t%0, %1, %2";
 }
   [(set_attr "type" "neon_sat_shift_imm")]
@@ -3874,7 +3874,7 @@ if (BYTES_BIG_ENDIAN)
   "TARGET_NEON"
 {
   /* The boundaries are: 0 < imm <= size.  */
-  neon_const_bounds (operands[2], 0, neon_element_bits (mode) + 1);
+  arm_const_bounds (operands[2], 0, neon_element_bits (mode) + 1);
   return "vshll.%#\t%q0, %P1, %2";
 }
   [(set_attr "type" "neon_shift_imm_long")]
@@ -3889,7 +3889,7 @@ if (BYTES_BIG_ENDIAN)
                    VSRA_N))]
   "TARGET_NEON"
 {
-  neon_const_bounds (operands[3], 1, neon_element_bits (mode) + 1);
+  arm_const_bounds (operands[3], 1, neon_element_bits (mode) + 1);
   return "v.%#\t%0, %2, %3";
 }
   [(set_attr "type" "neon_shift_acc")]
@@ -3903,7 +3903,7 @@ if (BYTES_BIG_ENDIAN)
                    UNSPEC_VSRI))]
   "TARGET_NEON"
 {
-  neon_const_bounds (operands[3], 1, neon_element_bits (mode) + 1);
+  arm_const_bounds (operands[3], 1, neon_element_bits (mode) + 1);
   return "vsri.\t%0, %2, %3";
 }
   [(set_attr "type" "neon_shift_reg")]
@@ -3917,7 +3917,7 @@ if (BYTES_BIG_ENDIAN)
                    UNSPEC_VSLI))]
   "TARGET_NEON"
 {
-  neon_const_bounds (operands[3], 0, neon_element_bits (mode));
+  arm_const_bounds (operands[3], 0, neon_element_bits (mode));
   return "vsli.\t%0, %2, %3";
 }
   [(set_attr "type" "neon_shift_reg")]
diff --git a/gcc/config/arm/types.md b/gcc/config/arm/types.md
index 25f79b4d010ae24c14d97d9fead93db1eff42f32..14d6429ceabbad07ed80c4166a6e77cc9f1a062e 100644
--- a/gcc/config/arm/types.md
+++ b/gcc/config/arm/types.md
@@ -538,6 +538,10 @@
 ; crypto_sha1_slow
 ; crypto_sha256_fast
 ; crypto_sha256_slow
+;
+; The classification below is for coprocessor instructions
+;
+; coproc
 
 (define_attr "type"
  "adc_imm,\
@@ -1071,7 +1075,8 @@
   crypto_sha1_fast,\
   crypto_sha1_slow,\
   crypto_sha256_fast,\
-  crypto_sha256_slow"
+  crypto_sha256_slow,\
+  coproc"
    (const_string "untyped"))
 
 ; Is this an (integer side) multiply with a 32-bit (or smaller) result?
diff --git a/gcc/config/arm/unspecs.md b/gcc/config/arm/unspecs.md
index c2f5e5b52cd84380c009af4077bd6effda923701..b62c952b631b7700bbb01171a24a752b50367be1 100644
--- a/gcc/config/arm/unspecs.md
+++ b/gcc/config/arm/unspecs.md
@@ -150,6 +150,8 @@
   VUNSPEC_GET_FPSCR        ; Represent fetch of FPSCR content.
   VUNSPEC_SET_FPSCR        ; Represent assign of FPSCR content.
   VUNSPEC_PROBE_STACK_RANGE ; Represent stack range probing.
+  VUNSPEC_CDP              ; Represent the coprocessor cdp instruction.
+  VUNSPEC_CDP2             ; Represent the coprocessor cdp2 instruction.
 ])
 
 ;; Enumerators for NEON unspecs.
diff --git a/gcc/doc/extend.texi b/gcc/doc/extend.texi
index 44d9b48fcd2ffd8a3b127261be8088d1ab67002e..ceb7f201961dedc073e5d0519a12c6481c417465 100644
--- a/gcc/doc/extend.texi
+++ b/gcc/doc/extend.texi
@@ -12242,8 +12242,9 @@ The built-in intrinsics for the Advanced SIMD extension are available when
 NEON is enabled.
 
 Currently, ARM and AArch64 back ends do not support ACLE 2.0 fully.  Both
-back ends support CRC32 intrinsics from @file{arm_acle.h}.  The ARM back end's
-16-bit floating-point Advanced SIMD intrinsics currently comply to ACLE v1.1.
+back ends support CRC32 intrinsics and the ARM back end supports the
+Coprocessor intrinsics, all from @file{arm_acle.h}.  The ARM back end's 16-bit
+floating-point Advanced SIMD intrinsics currently comply to ACLE v1.1.
 AArch64's back end does not have support for 16-bit floating point Advanced SIMD
 intrinsics yet.
 
diff --git a/gcc/doc/sourcebuild.texi b/gcc/doc/sourcebuild.texi
index 7f74d87b144be530a1f012819aed63519d7e2cbf..0d4023e5056eac56c572197dc0fc12d454be0f85 100644
--- a/gcc/doc/sourcebuild.texi
+++ b/gcc/doc/sourcebuild.texi
@@ -1613,6 +1613,21 @@ ARM target generates Thumb-1 code for @code{-mthumb} with
 ARM target supports ARMv8-M Security Extensions, enabled by the @code{-mcmse}
 option.
 
+@item arm_coproc1_ok
+@anchor{arm_coproc1_ok}
+ARM target supports the following coprocessor instructions: @code{CDP},
+@code{LDC}, @code{STC}, @code{MCR} and @code{MRC}.
+
+@item arm_coproc2_ok
+@anchor{arm_coproc2_ok}
+ARM target supports all the coprocessor instructions also listed as
+supported in @ref{arm_coproc1_ok} and the following: @code{CDP2}, @code{LDC2},
+@code{LDC2l}, @code{STC2}, @code{STC2l}, @code{MCR2} and @code{MRC2}.
+
+@item arm_coproc3_ok
+ARM target supports all the coprocessor instructions also listed as
+supported in @ref{arm_coproc2_ok} and the following: @code{MCRR}, @code{MCRR2},
+@code{MRRC}, and @code{MRRC2}.
 @end table
 
 @subsubsection AArch64-specific attributes
diff --git a/gcc/testsuite/ChangeLog.arm b/gcc/testsuite/ChangeLog.arm
index 17dbfe0675bc9c8d01b3af4360ee7327bd719eda..11233dbe636fa2ffb1d8d4209e3fee2dfcffc695 100644
--- a/gcc/testsuite/ChangeLog.arm
+++ b/gcc/testsuite/ChangeLog.arm
@@ -1,5 +1,19 @@
 2016-12-05 Andre Vieira
 
+    * gcc.target/arm/acle/acle.exp: Run tests for different options
+    and make sure fat-lto-objects is used such that we can still do
+    assemble scans.
+    * gcc.target/arm/acle/cdp.c: New.
+    * gcc.target/arm/acle/cdp2.c: New.
+    * lib/target-supports.exp (check_effective_target_arm_coproc1_ok): New.
+    (check_effective_target_arm_coproc1_ok_nocache): New.
+    (check_effective_target_arm_coproc2_ok): New.
+    (check_effective_target_arm_coproc2_ok_nocache): New.
+    (check_effective_target_arm_coproc3_ok): New.
+    (check_effective_target_arm_coproc3_ok_nocache): New.
+
+2016-12-05 Andre Vieira
+
     Backport from mainline
     2016-12-02 Andre Vieira
                Thomas Preud'homme
diff --git a/gcc/testsuite/gcc.target/arm/acle/acle.exp b/gcc/testsuite/gcc.target/arm/acle/acle.exp
index 91954bdff2f8fbb140bef44edbb5f040c68b92ca..f431da677940996e031ad0693427fbd3c8a211c3 100644
--- a/gcc/testsuite/gcc.target/arm/acle/acle.exp
+++ b/gcc/testsuite/gcc.target/arm/acle/acle.exp
@@ -27,9 +27,26 @@ load_lib gcc-dg.exp
 # Initialize `dg'.
 dg-init
 
+set saved-dg-do-what-default ${dg-do-what-default}
+set dg-do-what-default "assemble"
+
+set saved-lto_torture_options ${LTO_TORTURE_OPTIONS}
+
+# Add -ffat-lto-objects option to all LTO options such that we can do assembly
+# scans.
+proc add_fat_objects { list } {
+    set res {}
+    foreach el $list {set res [lappend res [concat $el " -ffat-lto-objects"]]}
+    return $res
+};
+set LTO_TORTURE_OPTIONS [add_fat_objects ${LTO_TORTURE_OPTIONS}]
+
 # Main loop.
-dg-runtest [lsort [glob -nocomplain $srcdir/$subdir/*.\[cCS\]]] \
+gcc-dg-runtest [lsort [glob -nocomplain $srcdir/$subdir/*.\[cCS\]]] \
         "" ""
+# Restore globals
+set dg-do-what-default ${saved-dg-do-what-default}
+set LTO_TORTURE_OPTIONS ${saved-lto_torture_options}
 
 # All done.
 dg-finish
diff --git a/gcc/testsuite/gcc.target/arm/acle/cdp.c b/gcc/testsuite/gcc.target/arm/acle/cdp.c
new file mode 100644
index 0000000000000000000000000000000000000000..28b218e7cfcdb7d6ce1381feb4c6dea3ff08a620
--- /dev/null
+++ b/gcc/testsuite/gcc.target/arm/acle/cdp.c
@@ -0,0 +1,14 @@
+/* Test the cdp ACLE intrinsic.
*/
+
+/* { dg-do assemble } */
+/* { dg-options "-save-temps" } */
+/* { dg-require-effective-target arm_coproc1_ok } */
+
+#include "arm_acle.h"
+
+void test_cdp (void)
+{
+  __arm_cdp (10, 1, 2, 3, 4, 5);
+}
+
+/* { dg-final { scan-assembler "cdp\tp10, #1, CR2, CR3, CR4, #5\n" } } */
diff --git a/gcc/testsuite/gcc.target/arm/acle/cdp2.c b/gcc/testsuite/gcc.target/arm/acle/cdp2.c
new file mode 100644
index 0000000000000000000000000000000000000000..00bcd502b563cfe6df1e5d4c2e53f8034063d47e
--- /dev/null
+++ b/gcc/testsuite/gcc.target/arm/acle/cdp2.c
@@ -0,0 +1,14 @@
+/* Test the cdp2 ACLE intrinsic.  */
+
+/* { dg-do assemble } */
+/* { dg-options "-save-temps" } */
+/* { dg-require-effective-target arm_coproc2_ok } */
+
+#include "arm_acle.h"
+
+void test_cdp2 (void)
+{
+  __arm_cdp2 (10, 4, 3, 2, 1, 0);
+}
+
+/* { dg-final { scan-assembler "cdp2\tp10, #4, CR3, CR2, CR1, #0\n" } } */
diff --git a/gcc/testsuite/lib/target-supports.exp b/gcc/testsuite/lib/target-supports.exp
index a1d786b04466574b4f7ee19d4d6fa13464917d91..cf3e80a367ac1aac772b48fa74435c55427dae55 100644
--- a/gcc/testsuite/lib/target-supports.exp
+++ b/gcc/testsuite/lib/target-supports.exp
@@ -7065,3 +7065,59 @@ proc check_effective_target_offload_hsa { } {
         int main () {return 0;}
     } "-foffload=hsa" ]
 }
+
+# Return 1 if the target supports coprocessor instructions: cdp, ldc, stc, mcr and
+# mrc.
+proc check_effective_target_arm_coproc1_ok_nocache { } {
+    if { ![istarget arm*-*-*] } {
+        return 0
+    }
+    return [check_no_compiler_messages_nocache arm_coproc1_ok assembly {
+        #if (__thumb__ && !__thumb2__) || __ARM_ARCH < 4
+        #error FOO
+        #endif
+    }]
+}
+
+proc check_effective_target_arm_coproc1_ok { } {
+    return [check_cached_effective_target arm_coproc1_ok \
+                check_effective_target_arm_coproc1_ok_nocache]
+}
+
+# Return 1 if the target supports all coprocessor instructions checked by
+# check_effective_target_arm_coproc1_ok and the following: cdp2, ldc2, ldc2l,
+# stc2, stc2l, mcr2 and mrc2.
+proc check_effective_target_arm_coproc2_ok_nocache { } {
+    if { ![check_effective_target_arm_coproc1_ok] } {
+        return 0
+    }
+    return [check_no_compiler_messages_nocache arm_coproc2_ok assembly {
+        #if __ARM_ARCH < 5
+        #error FOO
+        #endif
+    }]
+}
+
+proc check_effective_target_arm_coproc2_ok { } {
+    return [check_cached_effective_target arm_coproc2_ok \
+                check_effective_target_arm_coproc2_ok_nocache]
+}
+
+# Return 1 if the target supports all coprocessor instructions checked by
+# check_effective_target_arm_coproc2_ok and the following: mcrr, mcrr2, mrrc
+# and mrrc2.
+proc check_effective_target_arm_coproc3_ok_nocache { } {
+    if { ![check_effective_target_arm_coproc2_ok] } {
+        return 0
+    }
+    return [check_no_compiler_messages_nocache arm_coproc3_ok assembly {
+        #if __ARM_ARCH < 6 && !defined (__ARM_ARCH_5TE)
+        #error FOO
+        #endif
+    }]
+}
+
+proc check_effective_target_arm_coproc3_ok { } {
+    return [check_cached_effective_target arm_coproc3_ok \
+                check_effective_target_arm_coproc3_ok_nocache]
+}
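
For illustration only, and not part of the committed patch: the operand ranges given in the quoted message (coproc:[0-15], opc1:[0-15], CR*:[0-31], opc2:[0-7]) can be sketched as a small portable C check. The names in_bounds, cdp_operands_in_range and cdp_self_check are hypothetical. The sketch assumes, based on those ranges and on the values passed in the arm.md pattern, that the upper bound given to arm_const_bounds is exclusive. In GCC itself the check runs on the insn operands at compile time and issues a diagnostic rather than returning a value.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical helper, not part of GCC: true when LOW <= VALUE < HIGH.
   Assumption: this mirrors how the patch uses arm_const_bounds, with an
   inclusive lower bound and an exclusive upper bound.  */
static bool
in_bounds (unsigned int value, unsigned int low, unsigned int high)
{
  return value >= low && value < high;
}

/* Hypothetical helper: checks the six CDP/CDP2 operands against the ranges
   stated in the message, using the same bounds as the arm.md pattern.  */
bool
cdp_operands_in_range (unsigned int coproc, unsigned int opc1,
                       unsigned int crd, unsigned int crn,
                       unsigned int crm, unsigned int opc2)
{
  return in_bounds (coproc, 0, 16)
         && in_bounds (opc1, 0, 16)
         && in_bounds (crd, 0, 1 << 5)
         && in_bounds (crn, 0, 1 << 5)
         && in_bounds (crm, 0, 1 << 5)
         && in_bounds (opc2, 0, 8);
}

/* The operand sets used by cdp.c and cdp2.c fall inside the ranges;
   a coprocessor number of 16 does not.  */
void
cdp_self_check (void)
{
  assert (cdp_operands_in_range (10, 1, 2, 3, 4, 5));
  assert (cdp_operands_in_range (10, 4, 3, 2, 1, 0));
  assert (!cdp_operands_in_range (16, 1, 2, 3, 4, 5));
}
```

On a target where arm_coproc1_ok holds, the corresponding real call is the one from cdp.c, __arm_cdp (10, 1, 2, 3, 4, 5), which the test expects to assemble to "cdp p10, #1, CR2, CR3, CR4, #5".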