From patchwork Thu Mar 14 12:49:29 2013 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Tejas Belagod X-Patchwork-Id: 227650 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Received: from sourceware.org (server1.sourceware.org [209.132.180.131]) by ozlabs.org (Postfix) with SMTP id 679AE2C0094 for ; Thu, 14 Mar 2013 23:50:18 +1100 (EST) Comment: DKIM? See http://www.dkim.org DKIM-Signature: v=1; a=rsa-sha1; c=relaxed/relaxed; d=gcc.gnu.org; s=default; x=1363870220; h=Comment: DomainKey-Signature:Received:Received:Received:Received:Received: Message-ID:Date:From:User-Agent:MIME-Version:To:CC:Subject: Content-Type:Mailing-List:Precedence:List-Id:List-Unsubscribe: List-Archive:List-Post:List-Help:Sender:Delivered-To; bh=8g+rEg+ fX5y3G/2cgifv7RgrruE=; b=EIKM+qF2mp5xxDyGSR50Qcl2k8ETLhp1sWzsh4k 6PubDLNcSvI30sYBKUx62WOjFr8kk5cJUJE5lfcf7Zhrxua+pRVnEcv5qABNMqjl Kqm6hvkiB7OxWjdKB+vu00FghF/M8O3Sl6LA0/ciJdFWx3es3m6rMq8AGQfZOFJi JW/s= Comment: DomainKeys? See http://antispam.yahoo.com/domainkeys DomainKey-Signature: a=rsa-sha1; q=dns; c=nofws; s=default; d=gcc.gnu.org; h=Received:Received:X-SWARE-Spam-Status:X-Spam-Check-By:Received:Received:Received:Message-ID:Date:From:User-Agent:MIME-Version:To:CC:Subject:X-MC-Unique:Content-Type:X-IsSubscribed:Mailing-List:Precedence:List-Id:List-Unsubscribe:List-Archive:List-Post:List-Help:Sender:Delivered-To; b=hdTTk+ADYM3O7Q37Gbx4iTsyKDaKYXnsFbCkBoAD6TZcP8rxvdkFMLOv0zEjpP AYZeCdurvIEnA2SRThYUfoMPQaET5D64mnZn57I2k5xApYsdkvILGzSWK4CSq800 p1zyXmVk893uTF7DYCeF9YiMST6+D0bkZYjcOo9Kvxx8U=; Received: (qmail 22928 invoked by alias); 14 Mar 2013 12:50:13 -0000 Received: (qmail 22912 invoked by uid 22791); 14 Mar 2013 12:50:12 -0000 X-SWARE-Spam-Status: No, hits=-2.3 required=5.0 tests=AWL, BAYES_00, KHOP_RCVD_UNTRUST, KHOP_SPAMHAUS_DROP, RCVD_IN_DNSWL_LOW, TW_DV X-Spam-Check-By: sourceware.org Received: from service87.mimecast.com (HELO service87.mimecast.com) (91.220.42.44) by sourceware.org (qpsmtpd/0.43rc1) with ESMTP; Thu, 14 Mar 2013 12:49:39 +0000 Received: from cam-owa1.Emea.Arm.com (fw-tnat.cambridge.arm.com [217.140.96.21]) by service87.mimecast.com; Thu, 14 Mar 2013 12:49:31 +0000 Received: from [10.1.79.66] ([10.1.255.212]) by cam-owa1.Emea.Arm.com with Microsoft SMTPSVC(6.0.3790.0); Thu, 14 Mar 2013 12:49:30 +0000 Message-ID: <5141C759.8090707@arm.com> Date: Thu, 14 Mar 2013 12:49:29 +0000 From: Tejas Belagod User-Agent: Thunderbird 2.0.0.18 (X11/20081120) MIME-Version: 1.0 To: "gcc-patches@gcc.gnu.org" CC: Marcus Shawcroft Subject: [Patch, AArch64] Implement framework for Tree/Gimple Implementation of NEON intrinsics. X-MC-Unique: 113031412493106901 X-IsSubscribed: yes Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm Precedence: bulk List-Id: List-Unsubscribe: List-Archive: List-Post: List-Help: Sender: gcc-patches-owner@gcc.gnu.org Delivered-To: mailing list gcc-patches@gcc.gnu.org Hi, Attached is a patch that implements the framework necessary for implementing NEON Intrinsics' builtins in Tree/Gimple rather than RTL. For this it uses the target hook TARGET_FOLD_BUILTIN and folds all the builtins for NEON Intrinsics into equivalent trees. This framework is accompanied by an example implementation of vaddv_f<32, 64> intrinsics using the framework. Regression tested on aarch64-none-elf. OK for trunk? Thanks, Tejas Belagod ARM. Changelog: 2013-03-14 Tejas Belagod gcc/ * config/aarch64/aarch64-builtins.c (aarch64_fold_builtin): New. * config/aarch64/aarch64-protos.h (aarch64_fold_builtin): Declare. * config/aarch64/aarch64-simd-builtins.def: New entry for reduc_splus. * config/aarch64/aarch64.c (TARGET_FOLD_BUILTIN): Define. * config/aarch64/arm_neon.h (vaddv_f32, vaddvq_f32, vaddvq_f64): New. testsuite/ * gcc.target/aarch64/vaddv-intrinsic-compile.c: New. * gcc.target/aarch64/vaddv-intrinsic.c: New. diff --git a/gcc/config/aarch64/aarch64-builtins.c b/gcc/config/aarch64/aarch64-builtins.c index 35475ba..a1bd032 100644 --- a/gcc/config/aarch64/aarch64-builtins.c +++ b/gcc/config/aarch64/aarch64-builtins.c @@ -1254,6 +1254,31 @@ aarch64_builtin_vectorized_function (tree fndecl, tree type_out, tree type_in) return NULL_TREE; } + +#undef VAR1 +#define VAR1(T, N, MAP, A) \ + case AARCH64_SIMD_BUILTIN_##N##A: + +tree +aarch64_fold_builtin (tree fndecl, int n_args ATTRIBUTE_UNUSED, tree *args, + bool ignore ATTRIBUTE_UNUSED) +{ + int fcode = DECL_FUNCTION_CODE (fndecl); + tree type = TREE_TYPE (TREE_TYPE (fndecl)); + + switch (fcode) + { + BUILTIN_VDQF (UNOP, reduc_splus_, 10) + return fold_build1 (REDUC_PLUS_EXPR, type, args[0]); + break; + + default: + break; + } + + return NULL_TREE; +} + #undef AARCH64_CHECK_BUILTIN_MODE #undef AARCH64_FIND_FRINT_VARIANT #undef BUILTIN_DX diff --git a/gcc/config/aarch64/aarch64-protos.h b/gcc/config/aarch64/aarch64-protos.h index 5d0072f..1bb33e8 100644 --- a/gcc/config/aarch64/aarch64-protos.h +++ b/gcc/config/aarch64/aarch64-protos.h @@ -177,6 +177,7 @@ rtx aarch64_simd_gen_const_vector_dup (enum machine_mode, int); bool aarch64_simd_mem_operand_p (rtx); rtx aarch64_simd_vect_par_cnst_half (enum machine_mode, bool); rtx aarch64_tls_get_addr (void); +tree aarch64_fold_builtin (tree, int, tree *, bool); unsigned aarch64_dbx_register_number (unsigned); unsigned aarch64_trampoline_size (void); void aarch64_asm_output_labelref (FILE *, const char *); diff --git a/gcc/config/aarch64/aarch64-simd-builtins.def b/gcc/config/aarch64/aarch64-simd-builtins.def index e18e3f3..1dd4ad6 100644 --- a/gcc/config/aarch64/aarch64-simd-builtins.def +++ b/gcc/config/aarch64/aarch64-simd-builtins.def @@ -238,6 +238,9 @@ BUILTIN_VDQF (BINOP, fmax, 0) BUILTIN_VDQF (BINOP, fmin, 0) + /* Implemented by reduc_splus_. */ + BUILTIN_VDQF (UNOP, reduc_splus_, 10) + /* Implemented by 3. */ BUILTIN_VDQ_BHSI (BINOP, smax, 3) BUILTIN_VDQ_BHSI (BINOP, smin, 3) diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c index 45c4106..156c20e 100644 --- a/gcc/config/aarch64/aarch64.c +++ b/gcc/config/aarch64/aarch64.c @@ -7829,6 +7829,9 @@ aarch64_vectorize_vec_perm_const_ok (enum machine_mode vmode, #undef TARGET_EXPAND_BUILTIN_VA_START #define TARGET_EXPAND_BUILTIN_VA_START aarch64_expand_builtin_va_start +#undef TARGET_FOLD_BUILTIN +#define TARGET_FOLD_BUILTIN aarch64_fold_builtin + #undef TARGET_FUNCTION_ARG #define TARGET_FUNCTION_ARG aarch64_function_arg diff --git a/gcc/config/aarch64/arm_neon.h b/gcc/config/aarch64/arm_neon.h index 5e25c77..6198f99 100644 --- a/gcc/config/aarch64/arm_neon.h +++ b/gcc/config/aarch64/arm_neon.h @@ -19731,6 +19731,29 @@ vaddd_u64 (uint64x1_t __a, uint64x1_t __b) return __a + __b; } +/* vaddv */ + +__extension__ static __inline float32_t __attribute__ ((__always_inline__)) +vaddv_f32 (float32x2_t __a) +{ + float32x2_t t = __builtin_aarch64_reduc_splus_v2sf (__a); + return vget_lane_f32 (t, 0); +} + +__extension__ static __inline float32_t __attribute__ ((__always_inline__)) +vaddvq_f32 (float32x4_t __a) +{ + float32x4_t t = __builtin_aarch64_reduc_splus_v4sf (__a); + return vgetq_lane_f32 (t, 0); +} + +__extension__ static __inline float64_t __attribute__ ((__always_inline__)) +vaddvq_f64 (float64x2_t __a) +{ + float64x2_t t = __builtin_aarch64_reduc_splus_v2df (__a); + return vgetq_lane_f64 (t, 0); +} + /* vceq */ __extension__ static __inline uint8x8_t __attribute__ ((__always_inline__)) diff --git a/gcc/testsuite/gcc.target/aarch64/vaddv-intrinsic-compile.c b/gcc/testsuite/gcc.target/aarch64/vaddv-intrinsic-compile.c new file mode 100644 index 0000000..c736c0d --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/vaddv-intrinsic-compile.c @@ -0,0 +1,36 @@ + +/* { dg-do compile } */ +/* { dg-options "-O3" } */ + +#include "arm_neon.h" + +float32_t +test_vaddv_v2sf (const float32_t *pool) +{ + float32x2_t val; + + val = vld1_f32 (pool); + return vaddv_f32 (val); +} + +float32_t +test_vaddv_v4sf (const float32_t *pool) +{ + float32x4_t val; + + val = vld1q_f32 (pool); + return vaddvq_f32 (val); +} + +float64_t +test_vaddv_v2df (const float64_t *pool) +{ + float64x2_t val; + + val = vld1q_f64 (pool); + return vaddvq_f64 (val); +} + +/* { dg-final { scan-assembler "faddp\\ts\[0-9\]+"} } */ +/* { dg-final { scan-assembler-times "faddp\\tv\[0-9\]+\.4s" 2} } */ +/* { dg-final { scan-assembler "faddp\\td\[0-9\]+"} } */ diff --git a/gcc/testsuite/gcc.target/aarch64/vaddv-intrinsic.c b/gcc/testsuite/gcc.target/aarch64/vaddv-intrinsic.c new file mode 100644 index 0000000..d324333 --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/vaddv-intrinsic.c @@ -0,0 +1,53 @@ + +/* { dg-do run } */ +/* { dg-options "-O3" } */ + +#include "arm_neon.h" + +extern void abort (void); + +float32_t +test_vaddv_v2sf (const float32_t *pool) +{ + float32x2_t val; + + val = vld1_f32 (pool); + return vaddv_f32 (val); +} + +float32_t +test_vaddv_v4sf (const float32_t *pool) +{ + float32x4_t val; + + val = vld1q_f32 (pool); + return vaddvq_f32 (val); +} + +float64_t +test_vaddv_v2df (const float64_t *pool) +{ + float64x2_t val; + + val = vld1q_f64 (pool); + return vaddvq_f64 (val); +} + +int +main (void) +{ + const float32_t pool_v2sf[] = {4.0f, 9.0f}; + const float32_t pool_v4sf[] = {4.0f, 9.0f, 16.0f, 25.0f}; + const float64_t pool_v2df[] = {4.0, 9.0}; + + if (test_vaddv_v2sf (pool_v2sf) != 13.0f) + abort (); + + if (test_vaddv_v4sf (pool_v4sf) != 54.0f) + abort (); + + if (test_vaddv_v2df (pool_v2df) != 13.0) + abort (); + + return 0; +}