From patchwork Tue Feb 25 17:02:13 2014
X-Patchwork-Submitter: Alex Velenko
X-Patchwork-Id: 324032
Delivered-To: mailing list gcc-patches@gcc.gnu.org
Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm
Message-ID: <530CCC95.80603@arm.com>
Date: Tue, 25 Feb 2014 17:02:13 +0000
From: Alex Velenko
To: "gcc-patches@gcc.gnu.org"
CC: Marcus Shawcroft
Subject: [AArch64] 64-bit float vreinterpret implementation

Hi,

This patch introduces vreinterpret implementations for the 64-bit float
vector intrinsics and adds a testcase for them. The patch was tested on
both LE and BE with no regressions.

Is this patch OK for stage 1?

Thanks,
Alex

gcc/

2014-02-14  Alex Velenko

	* config/aarch64/aarch64-builtins.c
	(aarch64_types_unop_su_qualifiers): Qualifier added.
	(aarch64_types_unop_sp_qualifiers): Likewise.
	(aarch64_types_unop_us_qualifiers): Likewise.
	(aarch64_types_unop_ps_qualifiers): Likewise.
	(TYPES_REINTERP_SS): Type macro added.
	(TYPES_REINTERP_SU): Likewise.
	(TYPES_REINTERP_SP): Likewise.
	(TYPES_REINTERP_US): Likewise.
	(TYPES_REINTERP_PS): Likewise.
	* config/aarch64/aarch64-simd-builtins.def (REINTERP):
	Declarations removed.
	(REINTERP_SS): Declarations added.
	(REINTERP_US): Likewise.
	(REINTERP_PS): Likewise.
	(REINTERP_SU): Likewise.
	(REINTERP_SP): Likewise.
	* config/aarch64/aarch64-simd.md (aarch64_reinterpretdf):
	New define_expand.
	* config/aarch64/arm_neon.h (vreinterpret_p8_f64): Implemented.
	(vreinterpretq_p8_f64): Likewise.
	(vreinterpret_p16_f64): Likewise.
	(vreinterpretq_p16_f64): Likewise.
	(vreinterpret_f32_f64): Likewise.
	(vreinterpretq_f32_f64): Likewise.
	(vreinterpret_f64_f32): Likewise.
	(vreinterpret_f64_p8): Likewise.
	(vreinterpret_f64_p16): Likewise.
	(vreinterpret_f64_s8): Likewise.
	(vreinterpret_f64_s16): Likewise.
	(vreinterpret_f64_s32): Likewise.
	(vreinterpret_f64_s64): Likewise.
	(vreinterpret_f64_u8): Likewise.
	(vreinterpret_f64_u16): Likewise.
	(vreinterpret_f64_u32): Likewise.
	(vreinterpret_f64_u64): Likewise.
	(vreinterpretq_f64_f32): Likewise.
	(vreinterpretq_f64_p8): Likewise.
	(vreinterpretq_f64_p16): Likewise.
	(vreinterpretq_f64_s8): Likewise.
	(vreinterpretq_f64_s16): Likewise.
	(vreinterpretq_f64_s32): Likewise.
	(vreinterpretq_f64_s64): Likewise.
	(vreinterpretq_f64_u8): Likewise.
	(vreinterpretq_f64_u16): Likewise.
	(vreinterpretq_f64_u32): Likewise.
	(vreinterpretq_f64_u64): Likewise.
	(vreinterpret_s64_f64): Likewise.
	(vreinterpretq_s64_f64): Likewise.
	(vreinterpret_u64_f64): Likewise.
	(vreinterpretq_u64_f64): Likewise.
	(vreinterpret_s8_f64): Likewise.
	(vreinterpretq_s8_f64): Likewise.
	(vreinterpret_s16_f64): Likewise.
	(vreinterpretq_s16_f64): Likewise.
	(vreinterpret_s32_f64): Likewise.
	(vreinterpretq_s32_f64): Likewise.
	(vreinterpret_u8_f64): Likewise.
	(vreinterpretq_u8_f64): Likewise.
	(vreinterpret_u16_f64): Likewise.
	(vreinterpretq_u16_f64): Likewise.
	(vreinterpret_u32_f64): Likewise.
	(vreinterpretq_u32_f64): Likewise.
gcc/testsuite/

2014-02-14  Alex Velenko

	* gcc.target/aarch64/vreinterpret_f64_1.c: New testcase.

diff --git a/gcc/config/aarch64/aarch64-builtins.c b/gcc/config/aarch64/aarch64-builtins.c
index 5e0e9b94653deb1530955d62d9842c39da95058a..0485447d266fd7542d66f01f2d4d4cbc37177079 100644
--- a/gcc/config/aarch64/aarch64-builtins.c
+++ b/gcc/config/aarch64/aarch64-builtins.c
@@ -147,6 +147,23 @@
 aarch64_types_unopu_qualifiers[SIMD_MAX_BUILTIN_ARGS]
   = { qualifier_unsigned, qualifier_unsigned };
 #define TYPES_UNOPU (aarch64_types_unopu_qualifiers)
 #define TYPES_CREATE (aarch64_types_unop_qualifiers)
+#define TYPES_REINTERP_SS (aarch64_types_unop_qualifiers)
+static enum aarch64_type_qualifiers
+aarch64_types_unop_su_qualifiers[SIMD_MAX_BUILTIN_ARGS]
+  = { qualifier_none, qualifier_unsigned };
+#define TYPES_REINTERP_SU (aarch64_types_unop_su_qualifiers)
+static enum aarch64_type_qualifiers
+aarch64_types_unop_sp_qualifiers[SIMD_MAX_BUILTIN_ARGS]
+  = { qualifier_none, qualifier_poly };
+#define TYPES_REINTERP_SP (aarch64_types_unop_sp_qualifiers)
+static enum aarch64_type_qualifiers
+aarch64_types_unop_us_qualifiers[SIMD_MAX_BUILTIN_ARGS]
+  = { qualifier_unsigned, qualifier_none };
+#define TYPES_REINTERP_US (aarch64_types_unop_us_qualifiers)
+static enum aarch64_type_qualifiers
+aarch64_types_unop_ps_qualifiers[SIMD_MAX_BUILTIN_ARGS]
+  = { qualifier_poly, qualifier_none };
+#define TYPES_REINTERP_PS (aarch64_types_unop_ps_qualifiers)
 static enum aarch64_type_qualifiers
 aarch64_types_binop_qualifiers[SIMD_MAX_BUILTIN_ARGS]
   = { qualifier_none, qualifier_none, qualifier_maybe_immediate };
diff --git a/gcc/config/aarch64/aarch64-simd-builtins.def b/gcc/config/aarch64/aarch64-simd-builtins.def
index 8a3d7ecbbfc7743310da3f46a03f42a524302c9f..82aceedb4ec3c639df504aaeff9a54a174b6acf8 100644
--- a/gcc/config/aarch64/aarch64-simd-builtins.def
+++ b/gcc/config/aarch64/aarch64-simd-builtins.def
@@ -51,6 +51,28 @@
   VAR1 (GETLANE, get_lane, 0, di)
   BUILTIN_VALL (GETLANE, be_checked_get_lane,
0) + VAR1 (REINTERP_SS, reinterpretdi, 0, df) + VAR1 (REINTERP_SS, reinterpretv8qi, 0, df) + VAR1 (REINTERP_SS, reinterpretv4hi, 0, df) + VAR1 (REINTERP_SS, reinterpretv2si, 0, df) + VAR1 (REINTERP_SS, reinterpretv2sf, 0, df) + BUILTIN_VD (REINTERP_SS, reinterpretdf, 0) + + BUILTIN_VD (REINTERP_SU, reinterpretdf, 0) + + VAR1 (REINTERP_US, reinterpretdi, 0, df) + VAR1 (REINTERP_US, reinterpretv8qi, 0, df) + VAR1 (REINTERP_US, reinterpretv4hi, 0, df) + VAR1 (REINTERP_US, reinterpretv2si, 0, df) + VAR1 (REINTERP_US, reinterpretv2sf, 0, df) + + BUILTIN_VD (REINTERP_SP, reinterpretdf, 0) + + VAR1 (REINTERP_PS, reinterpretdi, 0, df) + VAR1 (REINTERP_PS, reinterpretv8qi, 0, df) + VAR1 (REINTERP_PS, reinterpretv4hi, 0, df) + VAR1 (REINTERP_PS, reinterpretv2si, 0, df) + VAR1 (REINTERP_PS, reinterpretv2sf, 0, df) BUILTIN_VDQ_I (BINOP, dup_lane, 0) /* Implemented by aarch64_qshl. */ diff --git a/gcc/config/aarch64/aarch64-simd.md b/gcc/config/aarch64/aarch64-simd.md index 4dffb59e856aeaafb79007255d3b91a73ef1ef13..cfcbd5117450cbbd7a9d297a0fbdcd687799c7e0 100644 --- a/gcc/config/aarch64/aarch64-simd.md +++ b/gcc/config/aarch64/aarch64-simd.md @@ -2234,6 +2234,15 @@ DONE; }) +(define_expand "aarch64_reinterpretdf" + [(match_operand:DF 0 "register_operand" "") + (match_operand:VD_RE 1 "register_operand" "")] + "TARGET_SIMD" +{ + aarch64_simd_reinterpret (operands[0], operands[1]); + DONE; +}) + (define_expand "aarch64_reinterpretv16qi" [(match_operand:V16QI 0 "register_operand" "") (match_operand:VQ 1 "register_operand" "")] diff --git a/gcc/config/aarch64/arm_neon.h b/gcc/config/aarch64/arm_neon.h index 47fee84d1b1791596ad0c38b3d008dad2b035063..a6f4dd0854ca35c01ccc06c57db93d5e46feb983 100644 --- a/gcc/config/aarch64/arm_neon.h +++ b/gcc/config/aarch64/arm_neon.h @@ -2643,6 +2643,12 @@ vgetq_lane_u64 (uint64x2_t __a, const int __b) /* vreinterpret */ __extension__ static __inline poly8x8_t __attribute__ ((__always_inline__)) +vreinterpret_p8_f64 (float64x1_t __a) +{ + return 
__builtin_aarch64_reinterpretv8qidf_ps (__a); +} + +__extension__ static __inline poly8x8_t __attribute__ ((__always_inline__)) vreinterpret_p8_s8 (int8x8_t __a) { return (poly8x8_t) __a; @@ -2703,6 +2709,12 @@ vreinterpret_p8_p16 (poly16x4_t __a) } __extension__ static __inline poly8x16_t __attribute__ ((__always_inline__)) +vreinterpretq_p8_f64 (float64x2_t __a) +{ + return (poly8x16_t) __a; +} + +__extension__ static __inline poly8x16_t __attribute__ ((__always_inline__)) vreinterpretq_p8_s8 (int8x16_t __a) { return (poly8x16_t) __a; @@ -2763,6 +2775,12 @@ vreinterpretq_p8_p16 (poly16x8_t __a) } __extension__ static __inline poly16x4_t __attribute__ ((__always_inline__)) +vreinterpret_p16_f64 (float64x1_t __a) +{ + return __builtin_aarch64_reinterpretv4hidf_ps (__a); +} + +__extension__ static __inline poly16x4_t __attribute__ ((__always_inline__)) vreinterpret_p16_s8 (int8x8_t __a) { return (poly16x4_t) __a; @@ -2823,6 +2841,12 @@ vreinterpret_p16_p8 (poly8x8_t __a) } __extension__ static __inline poly16x8_t __attribute__ ((__always_inline__)) +vreinterpretq_p16_f64 (float64x2_t __a) +{ + return (poly16x8_t) __a; +} + +__extension__ static __inline poly16x8_t __attribute__ ((__always_inline__)) vreinterpretq_p16_s8 (int8x16_t __a) { return (poly16x8_t) __a; @@ -2883,6 +2907,12 @@ vreinterpretq_p16_p8 (poly8x16_t __a) } __extension__ static __inline float32x2_t __attribute__ ((__always_inline__)) +vreinterpret_f32_f64 (float64x1_t __a) +{ + return __builtin_aarch64_reinterpretv2sfdf (__a); +} + +__extension__ static __inline float32x2_t __attribute__ ((__always_inline__)) vreinterpret_f32_s8 (int8x8_t __a) { return (float32x2_t) __a; @@ -2943,6 +2973,12 @@ vreinterpret_f32_p16 (poly16x4_t __a) } __extension__ static __inline float32x4_t __attribute__ ((__always_inline__)) +vreinterpretq_f32_f64 (float64x2_t __a) +{ + return (float32x4_t) __a; +} + +__extension__ static __inline float32x4_t __attribute__ ((__always_inline__)) vreinterpretq_f32_s8 (int8x16_t __a) 
{ return (float32x4_t) __a; @@ -3002,6 +3038,144 @@ vreinterpretq_f32_p16 (poly16x8_t __a) return (float32x4_t) __a; } +__extension__ static __inline float64x1_t __attribute__((__always_inline__)) +vreinterpret_f64_f32 (float32x2_t __a) +{ + return __builtin_aarch64_reinterpretdfv2sf (__a); +} + +__extension__ static __inline float64x1_t __attribute__((__always_inline__)) +vreinterpret_f64_p8 (poly8x8_t __a) +{ + return __builtin_aarch64_reinterpretdfv8qi_sp (__a); +} + +__extension__ static __inline float64x1_t __attribute__((__always_inline__)) +vreinterpret_f64_p16 (poly16x4_t __a) +{ + return __builtin_aarch64_reinterpretdfv4hi_sp (__a); +} + +__extension__ static __inline float64x1_t __attribute__((__always_inline__)) +vreinterpret_f64_s8 (int8x8_t __a) +{ + return __builtin_aarch64_reinterpretdfv8qi (__a); +} + +__extension__ static __inline float64x1_t __attribute__((__always_inline__)) +vreinterpret_f64_s16 (int16x4_t __a) +{ + return __builtin_aarch64_reinterpretdfv4hi (__a); +} + +__extension__ static __inline float64x1_t __attribute__((__always_inline__)) +vreinterpret_f64_s32 (int32x2_t __a) +{ + return __builtin_aarch64_reinterpretdfv2si (__a); +} + +__extension__ static __inline float64x1_t __attribute__((__always_inline__)) +vreinterpret_f64_s64 (int64x1_t __a) +{ + return __builtin_aarch64_createdf ((uint64_t) vget_lane_s64 (__a, 0)); +} + +__extension__ static __inline float64x1_t __attribute__((__always_inline__)) +vreinterpret_f64_u8 (uint8x8_t __a) +{ + return __builtin_aarch64_reinterpretdfv8qi_su (__a); +} + +__extension__ static __inline float64x1_t __attribute__((__always_inline__)) +vreinterpret_f64_u16 (uint16x4_t __a) +{ + return __builtin_aarch64_reinterpretdfv4hi_su (__a); +} + +__extension__ static __inline float64x1_t __attribute__((__always_inline__)) +vreinterpret_f64_u32 (uint32x2_t __a) +{ + return __builtin_aarch64_reinterpretdfv2si_su (__a); +} + +__extension__ static __inline float64x1_t __attribute__((__always_inline__)) 
+vreinterpret_f64_u64 (uint64x1_t __a) +{ + return __builtin_aarch64_createdf (vget_lane_u64 (__a, 0)); +} + +__extension__ static __inline float64x2_t __attribute__((__always_inline__)) +vreinterpretq_f64_f32 (float32x4_t __a) +{ + return (float64x2_t) __a; +} + +__extension__ static __inline float64x2_t __attribute__((__always_inline__)) +vreinterpretq_f64_p8 (poly8x16_t __a) +{ + return (float64x2_t) __a; +} + +__extension__ static __inline float64x2_t __attribute__((__always_inline__)) +vreinterpretq_f64_p16 (poly16x8_t __a) +{ + return (float64x2_t) __a; +} + +__extension__ static __inline float64x2_t __attribute__((__always_inline__)) +vreinterpretq_f64_s8 (int8x16_t __a) +{ + return (float64x2_t) __a; +} + +__extension__ static __inline float64x2_t __attribute__((__always_inline__)) +vreinterpretq_f64_s16 (int16x8_t __a) +{ + return (float64x2_t) __a; +} + +__extension__ static __inline float64x2_t __attribute__((__always_inline__)) +vreinterpretq_f64_s32 (int32x4_t __a) +{ + return (float64x2_t) __a; +} + +__extension__ static __inline float64x2_t __attribute__((__always_inline__)) +vreinterpretq_f64_s64 (int64x2_t __a) +{ + return (float64x2_t) __a; +} + +__extension__ static __inline float64x2_t __attribute__((__always_inline__)) +vreinterpretq_f64_u8 (uint8x16_t __a) +{ + return (float64x2_t) __a; +} + +__extension__ static __inline float64x2_t __attribute__((__always_inline__)) +vreinterpretq_f64_u16 (uint16x8_t __a) +{ + return (float64x2_t) __a; +} + +__extension__ static __inline float64x2_t __attribute__((__always_inline__)) +vreinterpretq_f64_u32 (uint32x4_t __a) +{ + return (float64x2_t) __a; +} + +__extension__ static __inline float64x2_t __attribute__((__always_inline__)) +vreinterpretq_f64_u64 (uint64x2_t __a) +{ + return (float64x2_t) __a; +} + +__extension__ static __inline int64x1_t __attribute__ ((__always_inline__)) +vreinterpret_s64_f64 (float64x1_t __a) +{ + return __builtin_aarch64_reinterpretdidf (__a); +} + __extension__ static 
__inline int64x1_t __attribute__ ((__always_inline__)) vreinterpret_s64_s8 (int8x8_t __a) { @@ -3063,6 +3237,12 @@ vreinterpret_s64_p16 (poly16x4_t __a) } __extension__ static __inline int64x2_t __attribute__ ((__always_inline__)) +vreinterpretq_s64_f64 (float64x2_t __a) +{ + return (int64x2_t) __a; +} + +__extension__ static __inline int64x2_t __attribute__ ((__always_inline__)) vreinterpretq_s64_s8 (int8x16_t __a) { return (int64x2_t) __a; @@ -3123,6 +3303,12 @@ vreinterpretq_s64_p16 (poly16x8_t __a) } __extension__ static __inline uint64x1_t __attribute__ ((__always_inline__)) +vreinterpret_u64_f64 (float64x1_t __a) +{ + return __builtin_aarch64_reinterpretdidf_us (__a); +} + +__extension__ static __inline uint64x1_t __attribute__ ((__always_inline__)) vreinterpret_u64_s8 (int8x8_t __a) { return (uint64x1_t) __a; @@ -3183,6 +3369,12 @@ vreinterpret_u64_p16 (poly16x4_t __a) } __extension__ static __inline uint64x2_t __attribute__ ((__always_inline__)) +vreinterpretq_u64_f64 (float64x2_t __a) +{ + return (uint64x2_t) __a; +} + +__extension__ static __inline uint64x2_t __attribute__ ((__always_inline__)) vreinterpretq_u64_s8 (int8x16_t __a) { return (uint64x2_t) __a; @@ -3243,6 +3435,12 @@ vreinterpretq_u64_p16 (poly16x8_t __a) } __extension__ static __inline int8x8_t __attribute__ ((__always_inline__)) +vreinterpret_s8_f64 (float64x1_t __a) +{ + return __builtin_aarch64_reinterpretv8qidf (__a); +} + +__extension__ static __inline int8x8_t __attribute__ ((__always_inline__)) vreinterpret_s8_s16 (int16x4_t __a) { return (int8x8_t) __a; @@ -3303,6 +3501,12 @@ vreinterpret_s8_p16 (poly16x4_t __a) } __extension__ static __inline int8x16_t __attribute__ ((__always_inline__)) +vreinterpretq_s8_f64 (float64x2_t __a) +{ + return (int8x16_t) __a; +} + +__extension__ static __inline int8x16_t __attribute__ ((__always_inline__)) vreinterpretq_s8_s16 (int16x8_t __a) { return (int8x16_t) __a; @@ -3363,6 +3567,12 @@ vreinterpretq_s8_p16 (poly16x8_t __a) } __extension__ static 
__inline int16x4_t __attribute__ ((__always_inline__)) +vreinterpret_s16_f64 (float64x1_t __a) +{ + return __builtin_aarch64_reinterpretv4hidf (__a); +} + +__extension__ static __inline int16x4_t __attribute__ ((__always_inline__)) vreinterpret_s16_s8 (int8x8_t __a) { return (int16x4_t) __a; @@ -3423,6 +3633,12 @@ vreinterpret_s16_p16 (poly16x4_t __a) } __extension__ static __inline int16x8_t __attribute__ ((__always_inline__)) +vreinterpretq_s16_f64 (float64x2_t __a) +{ + return (int16x8_t) __a; +} + +__extension__ static __inline int16x8_t __attribute__ ((__always_inline__)) vreinterpretq_s16_s8 (int8x16_t __a) { return (int16x8_t) __a; @@ -3483,6 +3699,12 @@ vreinterpretq_s16_p16 (poly16x8_t __a) } __extension__ static __inline int32x2_t __attribute__ ((__always_inline__)) +vreinterpret_s32_f64 (float64x1_t __a) +{ + return __builtin_aarch64_reinterpretv2sidf (__a); +} + +__extension__ static __inline int32x2_t __attribute__ ((__always_inline__)) vreinterpret_s32_s8 (int8x8_t __a) { return (int32x2_t) __a; @@ -3543,6 +3765,12 @@ vreinterpret_s32_p16 (poly16x4_t __a) } __extension__ static __inline int32x4_t __attribute__ ((__always_inline__)) +vreinterpretq_s32_f64 (float64x2_t __a) +{ + return (int32x4_t) __a; +} + +__extension__ static __inline int32x4_t __attribute__ ((__always_inline__)) vreinterpretq_s32_s8 (int8x16_t __a) { return (int32x4_t) __a; @@ -3603,6 +3831,12 @@ vreinterpretq_s32_p16 (poly16x8_t __a) } __extension__ static __inline uint8x8_t __attribute__ ((__always_inline__)) +vreinterpret_u8_f64 (float64x1_t __a) +{ + return __builtin_aarch64_reinterpretv8qidf_us (__a); +} + +__extension__ static __inline uint8x8_t __attribute__ ((__always_inline__)) vreinterpret_u8_s8 (int8x8_t __a) { return (uint8x8_t) __a; @@ -3663,6 +3897,12 @@ vreinterpret_u8_p16 (poly16x4_t __a) } __extension__ static __inline uint8x16_t __attribute__ ((__always_inline__)) +vreinterpretq_u8_f64 (float64x2_t __a) +{ + return (uint8x16_t) __a; +} + +__extension__ static 
__inline uint8x16_t __attribute__ ((__always_inline__)) vreinterpretq_u8_s8 (int8x16_t __a) { return (uint8x16_t) __a; @@ -3723,6 +3963,12 @@ vreinterpretq_u8_p16 (poly16x8_t __a) } __extension__ static __inline uint16x4_t __attribute__ ((__always_inline__)) +vreinterpret_u16_f64 (float64x1_t __a) +{ + return __builtin_aarch64_reinterpretv4hidf_us (__a); +} + +__extension__ static __inline uint16x4_t __attribute__ ((__always_inline__)) vreinterpret_u16_s8 (int8x8_t __a) { return (uint16x4_t) __a; @@ -3783,6 +4029,12 @@ vreinterpret_u16_p16 (poly16x4_t __a) } __extension__ static __inline uint16x8_t __attribute__ ((__always_inline__)) +vreinterpretq_u16_f64 (float64x2_t __a) +{ + return (uint16x8_t) __a; +} + +__extension__ static __inline uint16x8_t __attribute__ ((__always_inline__)) vreinterpretq_u16_s8 (int8x16_t __a) { return (uint16x8_t) __a; @@ -3843,6 +4095,12 @@ vreinterpretq_u16_p16 (poly16x8_t __a) } __extension__ static __inline uint32x2_t __attribute__ ((__always_inline__)) +vreinterpret_u32_f64 (float64x1_t __a) +{ + return __builtin_aarch64_reinterpretv2sidf_us (__a); +} + +__extension__ static __inline uint32x2_t __attribute__ ((__always_inline__)) vreinterpret_u32_s8 (int8x8_t __a) { return (uint32x2_t) __a; @@ -3903,6 +4161,12 @@ vreinterpret_u32_p16 (poly16x4_t __a) } __extension__ static __inline uint32x4_t __attribute__ ((__always_inline__)) +vreinterpretq_u32_f64 (float64x2_t __a) +{ + return (uint32x4_t) __a; +} + +__extension__ static __inline uint32x4_t __attribute__ ((__always_inline__)) vreinterpretq_u32_s8 (int8x16_t __a) { return (uint32x4_t) __a; diff --git a/gcc/testsuite/gcc.target/aarch64/vreinterpret_f64_1.c b/gcc/testsuite/gcc.target/aarch64/vreinterpret_f64_1.c new file mode 100644 index 0000000000000000000000000000000000000000..08bd1bfbd49a1bb65d16d2f3c0a934f5380e1b33 --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/vreinterpret_f64_1.c @@ -0,0 +1,596 @@ +/* Test vreinterpret_f64_* and vreinterpret_*_f64 intrinsics work 
correctly.  */
+/* { dg-do run } */
+/* { dg-options "-O3" } */
+
+#include <arm_neon.h>
+
+extern void abort (void);
+
+#define ABS(a) __builtin_fabs (a)
+#define ISNAN(a) __builtin_isnan (a)
+
+#define DOUBLE_EQUALS(a, b, epsilon) \
+( \
+  ((a) == (b)) \
+  || (ISNAN (a) && ISNAN (b)) \
+  || (ABS (a - b) < epsilon) \
+)
+
+/* Pi accurate up to 16 digits.
+   Further digits are a closest binary approximation.  */
+#define PI_F64 3.14159265358979311599796346854
+/* Hex representation in Double (IEEE754 Double precision 64-bit) is:
+   0x400921FB54442D18.  */
+
+/* E accurate up to 16 digits.
+   Further digits are a closest binary approximation.  */
+#define E_F64 2.71828182845904509079559829843
+/* Hex representation in Double (IEEE754 Double precision 64-bit) is:
+   0x4005BF0A8B145769.  */
+
+float32x2_t __attribute__ ((noinline))
+wrap_vreinterpret_f32_f64 (float64x1_t __a)
+{
+  return vreinterpret_f32_f64 (__a);
+}
+
+int __attribute__ ((noinline))
+test_vreinterpret_f32_f64 ()
+{
+  float64x1_t a;
+  float32x2_t b;
+  float64_t c[1] = { PI_F64 };
+  /* Values { 0x54442D18, 0x400921FB } reinterpreted as f32.
*/ + float32_t d[2] = { 3.3702805504E12, 2.1426990032196044921875E0 }; + float32_t e[2]; + int i; + + a = vld1_f64 (c); + b = wrap_vreinterpret_f32_f64 (a); + vst1_f32 (e, b); + for (i = 0; i < 2; i++) + if (!DOUBLE_EQUALS (d[i], e[i], __FLT_EPSILON__)) + return 1; + return 0; +}; + +int8x8_t __attribute__ ((noinline)) +wrap_vreinterpret_s8_f64 (float64x1_t __a) +{ + return vreinterpret_s8_f64 (__a); +} + +int __attribute__ ((noinline)) +test_vreinterpret_s8_f64 () +{ + float64x1_t a; + int8x8_t b; + float64_t c[1] = { PI_F64 }; + int8_t d[8] = { 0x18, 0x2D, 0x44, 0x54, 0xFB, 0x21, 0x09, 0x40 }; + int8_t e[8]; + int i; + + a = vld1_f64 (c); + b = wrap_vreinterpret_s8_f64 (a); + vst1_s8 (e, b); + for (i = 0; i < 8; i++) + if (d[i] != e[i]) + return 1; + return 0; +}; + +int16x4_t __attribute__ ((noinline)) +wrap_vreinterpret_s16_f64 (float64x1_t __a) +{ + return vreinterpret_s16_f64 (__a); +} + +int __attribute__ ((noinline)) +test_vreinterpret_s16_f64 () +{ + float64x1_t a; + int16x4_t b; + float64_t c[1] = { PI_F64 }; + int16_t d[4] = { 0x2D18, 0x5444, 0x21FB, 0x4009 }; + int16_t e[4]; + int i; + + a = vld1_f64 (c); + b = wrap_vreinterpret_s16_f64 (a); + vst1_s16 (e, b); + for (i = 0; i < 4; i++) + if (d[i] != e[i]) + return 1; + return 0; +}; + +int32x2_t __attribute__ ((noinline)) +wrap_vreinterpret_s32_f64 (float64x1_t __a) +{ + return vreinterpret_s32_f64 (__a); +} + +int __attribute__ ((noinline)) +test_vreinterpret_s32_f64 () +{ + float64x1_t a; + int32x2_t b; + float64_t c[1] = { PI_F64 }; + int32_t d[2] = { 0x54442D18, 0x400921FB }; + int32_t e[2]; + int i; + + a = vld1_f64 (c); + b = wrap_vreinterpret_s32_f64 (a); + vst1_s32 (e, b); + for (i = 0; i < 2; i++) + if (d[i] != e[i]) + return 1; + return 0; +}; + +int64x1_t __attribute__ ((noinline)) +wrap_vreinterpret_s64_f64 (float64x1_t __a) +{ + return vreinterpret_s64_f64 (__a); +} + +int __attribute__ ((noinline)) +test_vreinterpret_s64_f64 () +{ + float64x1_t a; + int64x1_t b; + float64_t c[1] = { PI_F64 
}; + int64_t d[1] = { 0x400921FB54442D18 }; + int64_t e[1]; + int i; + + a = vld1_f64 (c); + b = wrap_vreinterpret_s64_f64 (a); + vst1_s64 (e, b); + if (d[0] != e[0]) + return 1; + return 0; +}; + +float32x4_t __attribute__ ((noinline)) +wrap_vreinterpretq_f32_f64 (float64x2_t __a) +{ + return vreinterpretq_f32_f64 (__a); +} + +int __attribute__ ((noinline)) +test_vreinterpretq_f32_f64 () +{ + float64x2_t a; + float32x4_t b; + float64_t c[2] = { PI_F64, E_F64 }; + + /* Values corresponding to f32 reinterpret of + { 0x54442D18, 0x400921FB, 0x8B145769, 0x4005BF0A }. */ + float32_t d[4] = { 3.3702805504E12, + 2.1426990032196044921875E0, + -2.8569523269651966444143014594E-32, + 2.089785099029541015625E0 }; + float32_t e[4]; + int i; + + a = vld1q_f64 (c); + b = wrap_vreinterpretq_f32_f64 (a); + vst1q_f32 (e, b); + for (i = 0; i < 4; i++) + { + if (!DOUBLE_EQUALS (d[i], e[i], __FLT_EPSILON__)) + return 1; + } + return 0; +}; + +int8x16_t __attribute__ ((noinline)) +wrap_vreinterpretq_s8_f64 (float64x2_t __a) +{ + return vreinterpretq_s8_f64 (__a); +} + +int __attribute__ ((noinline)) +test_vreinterpretq_s8_f64 () +{ + float64x2_t a; + int8x16_t b; + float64_t c[2] = { PI_F64, E_F64 }; + int8_t d[16] = { 0x18, 0x2D, 0x44, 0x54, 0xFB, 0x21, 0x09, 0x40, + 0x69, 0x57, 0x14, 0x8B, 0x0A, 0xBF, 0x05, 0x40 }; + int8_t e[16]; + int i; + + a = vld1q_f64 (c); + b = wrap_vreinterpretq_s8_f64 (a); + vst1q_s8 (e, b); + for (i = 0; i < 16; i++) + if (d[i] != e[i]) + return 1; + return 0; +}; + +int16x8_t __attribute__ ((noinline)) +wrap_vreinterpretq_s16_f64 (float64x2_t __a) +{ + return vreinterpretq_s16_f64 (__a); +} + +int __attribute__ ((noinline)) +test_vreinterpretq_s16_f64 () +{ + float64x2_t a; + int16x8_t b; + float64_t c[2] = { PI_F64, E_F64 }; + int16_t d[8] = { 0x2D18, 0x5444, 0x21FB, 0x4009, + 0x5769, 0x8B14, 0xBF0A, 0x4005 }; + int16_t e[8]; + int i; + + a = vld1q_f64 (c); + b = wrap_vreinterpretq_s16_f64 (a); + vst1q_s16 (e, b); + for (i = 0; i < 8; i++) + if (d[i] != 
e[i]) + return 1; + return 0; +}; + +int32x4_t __attribute__ ((noinline)) +wrap_vreinterpretq_s32_f64 (float64x2_t __a) +{ + return vreinterpretq_s32_f64 (__a); +} + +int __attribute__ ((noinline)) +test_vreinterpretq_s32_f64 () +{ + float64x2_t a; + int32x4_t b; + float64_t c[2] = { PI_F64, E_F64 }; + int32_t d[4] = { 0x54442D18, 0x400921FB, 0x8B145769, 0x4005BF0A }; + int32_t e[4]; + int i; + + a = vld1q_f64 (c); + b = wrap_vreinterpretq_s32_f64 (a); + vst1q_s32 (e, b); + for (i = 0; i < 4; i++) + if (d[i] != e[i]) + return 1; + return 0; +}; + +int64x2_t __attribute__ ((noinline)) +wrap_vreinterpretq_s64_f64 (float64x2_t __a) +{ + return vreinterpretq_s64_f64 (__a); +} + +int __attribute__ ((noinline)) +test_vreinterpretq_s64_f64 () +{ + float64x2_t a; + int64x2_t b; + float64_t c[2] = { PI_F64, E_F64 }; + int64_t d[2] = { 0x400921FB54442D18, 0x4005BF0A8B145769 }; + int64_t e[2]; + int i; + + a = vld1q_f64 (c); + b = wrap_vreinterpretq_s64_f64 (a); + vst1q_s64 (e, b); + for (i = 0; i < 2; i++) + if (d[i] != e[i]) + return 1; + return 0; +}; + +float64x1_t __attribute__ ((noinline)) +wrap_vreinterpret_f64_f32 (float32x2_t __a) +{ + return vreinterpret_f64_f32 (__a); +} + +int __attribute__ ((noinline)) +test_vreinterpret_f64_f32 () +{ + float32x2_t a; + float64x1_t b; + /* Values { 0x54442D18, 0x400921FB } reinterpreted as f32. 
*/ + float32_t c[2] = { 3.3702805504E12, 2.1426990032196044921875E0 }; + float64_t d[1] = { PI_F64 }; + float64_t e[1]; + int i; + + a = vld1_f32 (c); + b = wrap_vreinterpret_f64_f32 (a); + vst1_f64 (e, b); + if (!DOUBLE_EQUALS (d[0], e[0], __DBL_EPSILON__)) + return 1; + return 0; +}; + +float64x1_t __attribute__ ((noinline)) +wrap_vreinterpret_f64_s8 (int8x8_t __a) +{ + return vreinterpret_f64_s8 (__a); +} + +int __attribute__ ((noinline)) +test_vreinterpret_f64_s8 () +{ + int8x8_t a; + float64x1_t b; + int8_t c[8] = { 0x18, 0x2D, 0x44, 0x54, 0xFB, 0x21, 0x09, 0x40 }; + float64_t d[1] = { PI_F64 }; + float64_t e[1]; + int i; + + a = vld1_s8 (c); + b = wrap_vreinterpret_f64_s8 (a); + vst1_f64 (e, b); + if (!DOUBLE_EQUALS (d[0], e[0], __DBL_EPSILON__)) + return 1; + return 0; +}; + +float64x1_t __attribute__ ((noinline)) +wrap_vreinterpret_f64_s16 (int16x4_t __a) +{ + return vreinterpret_f64_s16 (__a); +} + +int __attribute__ ((noinline)) +test_vreinterpret_f64_s16 () +{ + int16x4_t a; + float64x1_t b; + int16_t c[4] = { 0x2D18, 0x5444, 0x21FB, 0x4009 }; + float64_t d[1] = { PI_F64 }; + float64_t e[1]; + int i; + + a = vld1_s16 (c); + b = wrap_vreinterpret_f64_s16 (a); + vst1_f64 (e, b); + if (!DOUBLE_EQUALS (d[0], e[0], __DBL_EPSILON__)) + return 1; + return 0; +}; + +float64x1_t __attribute__ ((noinline)) +wrap_vreinterpret_f64_s32 (int32x2_t __a) +{ + return vreinterpret_f64_s32 (__a); +} + +int __attribute__ ((noinline)) +test_vreinterpret_f64_s32 () +{ + int32x2_t a; + float64x1_t b; + int32_t c[2] = { 0x54442D18, 0x400921FB }; + float64_t d[1] = { PI_F64 }; + float64_t e[1]; + int i; + + a = vld1_s32 (c); + b = wrap_vreinterpret_f64_s32 (a); + vst1_f64 (e, b); + if (!DOUBLE_EQUALS (d[0], e[0], __DBL_EPSILON__)) + return 1; + return 0; +}; + +float64x1_t __attribute__ ((noinline)) +wrap_vreinterpret_f64_s64 (int64x1_t __a) +{ + return vreinterpret_f64_s64 (__a); +} + +int __attribute__ ((noinline)) +test_vreinterpret_f64_s64 () +{ + int64x1_t a; + float64x1_t 
b; + int64_t c[1] = { 0x400921FB54442D18 }; + float64_t d[1] = { PI_F64 }; + float64_t e[1]; + + a = vld1_s64 (c); + b = wrap_vreinterpret_f64_s64 (a); + vst1_f64 (e, b); + if (!DOUBLE_EQUALS (d[0], e[0], __DBL_EPSILON__)) + return 1; + return 0; +}; + +float64x2_t __attribute__ ((noinline)) +wrap_vreinterpretq_f64_f32 (float32x4_t __a) +{ + return vreinterpretq_f64_f32 (__a); +} + +int __attribute__ ((noinline)) +test_vreinterpretq_f64_f32 () +{ + float32x4_t a; + float64x2_t b; + /* Values corresponding to f32 reinterpret of + { 0x54442D18, 0x400921FB, 0x8B145769, 0x4005BF0A }. */ + float32_t c[4] = { 3.3702805504E12, + 2.1426990032196044921875E0, + -2.8569523269651966444143014594E-32, + 2.089785099029541015625E0 }; + + float64_t d[2] = { PI_F64, E_F64 }; + float64_t e[2]; + int i; + + a = vld1q_f32 (c); + b = wrap_vreinterpretq_f64_f32 (a); + vst1q_f64 (e, b); + for (i = 0; i < 2; i++) + if (!DOUBLE_EQUALS (d[i], e[i], __DBL_EPSILON__)) + return 1; + return 0; +}; + +float64x2_t __attribute__ ((noinline)) +wrap_vreinterpretq_f64_s8 (int8x16_t __a) +{ + return vreinterpretq_f64_s8 (__a); +} + +int __attribute__ ((noinline)) +test_vreinterpretq_f64_s8 () +{ + int8x16_t a; + float64x2_t b; + int8_t c[16] = { 0x18, 0x2D, 0x44, 0x54, 0xFB, 0x21, 0x09, 0x40, + 0x69, 0x57, 0x14, 0x8B, 0x0A, 0xBF, 0x05, 0x40 }; + float64_t d[2] = { PI_F64, E_F64 }; + float64_t e[2]; + int i; + + a = vld1q_s8 (c); + b = wrap_vreinterpretq_f64_s8 (a); + vst1q_f64 (e, b); + for (i = 0; i < 2; i++) + if (!DOUBLE_EQUALS (d[i], e[i], __DBL_EPSILON__)) + return 1; + return 0; +}; + +float64x2_t __attribute__ ((noinline)) +wrap_vreinterpretq_f64_s16 (int16x8_t __a) +{ + return vreinterpretq_f64_s16 (__a); +} + +int __attribute__ ((noinline)) +test_vreinterpretq_f64_s16 () +{ + int16x8_t a; + float64x2_t b; + int16_t c[8] = { 0x2D18, 0x5444, 0x21FB, 0x4009, + 0x5769, 0x8B14, 0xBF0A, 0x4005 }; + float64_t d[2] = { PI_F64, E_F64 }; + float64_t e[2]; + int i; + + a = vld1q_s16 (c); + b = 
wrap_vreinterpretq_f64_s16 (a); + vst1q_f64 (e, b); + for (i = 0; i < 2; i++) + if (!DOUBLE_EQUALS (d[i], e[i], __DBL_EPSILON__)) + return 1; + return 0; +}; + +float64x2_t __attribute__ ((noinline)) +wrap_vreinterpretq_f64_s32 (int32x4_t __a) +{ + return vreinterpretq_f64_s32 (__a); +} + +int __attribute__ ((noinline)) +test_vreinterpretq_f64_s32 () +{ + int32x4_t a; + float64x2_t b; + int32_t c[4] = { 0x54442D18, 0x400921FB, 0x8B145769, 0x4005BF0A }; + float64_t d[2] = { PI_F64, E_F64 }; + float64_t e[2]; + int i; + + a = vld1q_s32 (c); + b = wrap_vreinterpretq_f64_s32 (a); + vst1q_f64 (e, b); + for (i = 0; i < 2; i++) + if (!DOUBLE_EQUALS (d[i], e[i], __DBL_EPSILON__)) + return 1; + return 0; +}; + +float64x2_t __attribute__ ((noinline)) +wrap_vreinterpretq_f64_s64 (int64x2_t __a) +{ + return vreinterpretq_f64_s64 (__a); +} + +int __attribute__ ((noinline)) +test_vreinterpretq_f64_s64 () +{ + int64x2_t a; + float64x2_t b; + int64_t c[2] = { 0x400921FB54442D18, 0x4005BF0A8B145769 }; + float64_t d[2] = { PI_F64, E_F64 }; + float64_t e[2]; + int i; + + a = vld1q_s64 (c); + b = wrap_vreinterpretq_f64_s64 (a); + vst1q_f64 (e, b); + for (i = 0; i < 2; i++) + if (!DOUBLE_EQUALS (d[i], e[i], __DBL_EPSILON__)) + return 1; + return 0; +}; + +int +main (int argc, char **argv) +{ + if (test_vreinterpret_f32_f64 ()) + abort (); + + if (test_vreinterpret_s8_f64 ()) + abort (); + if (test_vreinterpret_s16_f64 ()) + abort (); + if (test_vreinterpret_s32_f64 ()) + abort (); + if (test_vreinterpret_s64_f64 ()) + abort (); + + if (test_vreinterpretq_f32_f64 ()) + abort (); + + if (test_vreinterpretq_s8_f64 ()) + abort (); + if (test_vreinterpretq_s16_f64 ()) + abort (); + if (test_vreinterpretq_s32_f64 ()) + abort (); + if (test_vreinterpretq_s64_f64 ()) + abort (); + + if (test_vreinterpret_f64_f32 ()) + abort (); + + if (test_vreinterpret_f64_s8 ()) + abort (); + if (test_vreinterpret_f64_s16 ()) + abort (); + if (test_vreinterpret_f64_s32 ()) + abort (); + if 
(test_vreinterpret_f64_s64 ()) + abort (); + + if (test_vreinterpretq_f64_f32 ()) + abort (); + + if (test_vreinterpretq_f64_s8 ()) + abort (); + if (test_vreinterpretq_f64_s16 ()) + abort (); + if (test_vreinterpretq_f64_s32 ()) + abort (); + if (test_vreinterpretq_f64_s64 ()) + abort (); + + return 0; +}