From patchwork Fri Nov 30 16:34:41 2012
X-Patchwork-Submitter: Christophe Lyon
X-Patchwork-Id: 202987
Date: Fri, 30 Nov 2012 17:34:41 +0100
Subject: Re: [ARM] Turning off 64bits ops in Neon and gfortran/modulo-scheduling problem
From: Christophe Lyon
To: "Joseph S. Myers"
Cc: "gcc-patches@gcc.gnu.org"

On 29 November 2012 21:59, Joseph S. Myers wrote:
> On Thu, 29 Nov 2012, Christophe Lyon wrote:
>
>> 2012-11-28 Christophe Lyon
>>
>>     gcc/
>>     * config/arm/arm-protos.h (tune_params): Add
>>     prefer_neon_for_64bits field.
>>     * config/arm/arm.c (prefer_neon_for_64bits): New variable.
>>     (arm_slowmul_tune): Default prefer_neon_for_64bits to false.
>>     (arm_fastmul_tune, arm_strongarm_tune, arm_xscale_tune): Ditto.
>>     (arm_9e_tune, arm_v6t2_tune, arm_cortex_tune): Ditto.
>>     (arm_cortex_a5_tune, arm_cortex_a15_tune): Ditto.
>>     (arm_cortex_a9_tune, arm_fa726te_tune): Ditto.
>>     (arm_option_override): Handle -mneon-for-64bits new option.
>>     * config/arm/arm.h (TARGET_PREFER_NEON_64BITS): New macro.
>>     (prefer_neon_for_64bits): Declare new variable.
>>     * config/arm/arm.md (arch): Rename neon_onlya8 and neon_nota8 to
>>     avoid_neon_for_64bits and neon_for_64bits.
>>     (arch_enabled): Handle new arch types.
>>     (one_cmpldi2): Use new arch names.
>>     * config/arm/neon.md (adddi3_neon, subdi3_neon, iordi3_neon)
>>     (anddi3_neon, xordi3_neon, ashldi3_neon, di3_neon): Use
>>     neon_for_64bits instead of nota8 and avoid_neon_for_64bits instead
>>     of onlya8.
>
> This ChangeLog entry doesn't appear to mention the arm.opt change.
> Furthermore, the patch seems to be missing any .texi change to document
> the option; any new option needs documentation.
> You are also missing testcases for the testsuite to verify that both
> enabled and disabled states of the option work properly.
>

Indeed, I forgot about the documentation; here is an updated patch.

Regarding the testcases, as this patch disables transformations
recently introduced, I would have appreciated it if testcases had been
associated with them in the first place. This requirement should be
enforced :-)

Tested with qemu on target arm-none-linux-gnueabi.

2012-11-30 Christophe Lyon

    gcc/
    * config/arm/arm-protos.h (tune_params): Add
    prefer_neon_for_64bits field.
    * config/arm/arm.c (prefer_neon_for_64bits): New variable.
    (arm_slowmul_tune): Default prefer_neon_for_64bits to false.
    (arm_fastmul_tune, arm_strongarm_tune, arm_xscale_tune): Ditto.
    (arm_9e_tune, arm_v6t2_tune, arm_cortex_tune): Ditto.
    (arm_cortex_a5_tune, arm_cortex_a15_tune): Ditto.
    (arm_cortex_a9_tune, arm_fa726te_tune): Ditto.
    (arm_option_override): Handle -mneon-for-64bits new option.
    * config/arm/arm.h (TARGET_PREFER_NEON_64BITS): New macro.
    (prefer_neon_for_64bits): Declare new variable.
    * config/arm/arm.md (arch): Rename neon_onlya8 and neon_nota8 to
    avoid_neon_for_64bits and neon_for_64bits.
    (arch_enabled): Handle new arch types.
    (one_cmpldi2): Use new arch names.
    * config/arm/arm.opt (mneon-for-64bits): Add option.
    * config/arm/neon.md (adddi3_neon, subdi3_neon, iordi3_neon)
    (anddi3_neon, xordi3_neon, ashldi3_neon, di3_neon): Use
    neon_for_64bits instead of nota8 and avoid_neon_for_64bits instead
    of onlya8.
    * doc/invoke.texi (-mneon-for-64bits): Document.

    gcc/testsuite/
    * gcc.target/arm/neon-for-64bits-1.c: New tests.
    * gcc.target/arm/neon-for-64bits-2.c: Likewise.

diff --git a/gcc/config/arm/arm-protos.h b/gcc/config/arm/arm-protos.h
index d942c5b..c92f055 100644
--- a/gcc/config/arm/arm-protos.h
+++ b/gcc/config/arm/arm-protos.h
@@ -247,6 +247,8 @@ struct tune_params
      performance. The first element covers Thumb state and the second one
      is for ARM state. */
   bool logical_op_non_short_circuit[2];
+  /* Prefer Neon for 64-bit bitops. */
+  bool prefer_neon_for_64bits;
 };
 
 extern const struct tune_params *current_tune;
diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
index 286a6c5..9efd215 100644
--- a/gcc/config/arm/arm.c
+++ b/gcc/config/arm/arm.c
@@ -816,6 +816,10 @@ int arm_arch_thumb2;
 int arm_arch_arm_hwdiv;
 int arm_arch_thumb_hwdiv;
 
+/* Nonzero if we should use Neon to handle 64-bits operations rather
+   than core registers. */
+int prefer_neon_for_64bits = 0;
+
 /* In case of a PRE_INC, POST_INC, PRE_DEC, POST_DEC memory reference,
    we must report the mode of the memory reference from
    TARGET_PRINT_OPERAND to TARGET_PRINT_OPERAND_ADDRESS. */
@@ -895,6 +899,7 @@ const struct tune_params arm_slowmul_tune =
   arm_default_branch_cost,
   false,                /* Prefer LDRD/STRD. */
   {true, true},         /* Prefer non short circuit. */
+  false                 /* Prefer Neon for 64-bits bitops. */
 };
 
 const struct tune_params arm_fastmul_tune =
@@ -908,6 +913,7 @@ const struct tune_params arm_fastmul_tune =
   arm_default_branch_cost,
   false,                /* Prefer LDRD/STRD. */
   {true, true},         /* Prefer non short circuit. */
+  false                 /* Prefer Neon for 64-bits bitops. */
 };
 
 /* StrongARM has early execution of branches, so a sequence that is worth
@@ -924,6 +930,7 @@ const struct tune_params arm_strongarm_tune =
   arm_default_branch_cost,
   false,                /* Prefer LDRD/STRD. */
   {true, true},         /* Prefer non short circuit. */
+  false                 /* Prefer Neon for 64-bits bitops. */
 };
 
 const struct tune_params arm_xscale_tune =
@@ -937,6 +944,7 @@ const struct tune_params arm_xscale_tune =
   arm_default_branch_cost,
   false,                /* Prefer LDRD/STRD. */
   {true, true},         /* Prefer non short circuit. */
+  false                 /* Prefer Neon for 64-bits bitops. */
 };
 
 const struct tune_params arm_9e_tune =
@@ -950,6 +958,7 @@ const struct tune_params arm_9e_tune =
   arm_default_branch_cost,
   false,                /* Prefer LDRD/STRD. */
   {true, true},         /* Prefer non short circuit. */
+  false                 /* Prefer Neon for 64-bits bitops. */
 };
 
 const struct tune_params arm_v6t2_tune =
@@ -963,6 +972,7 @@ const struct tune_params arm_v6t2_tune =
   arm_default_branch_cost,
   false,                /* Prefer LDRD/STRD. */
   {true, true},         /* Prefer non short circuit. */
+  false                 /* Prefer Neon for 64-bits bitops. */
 };
 
 /* Generic Cortex tuning. Use more specific tunings if appropriate. */
@@ -977,6 +987,7 @@ const struct tune_params arm_cortex_tune =
   arm_default_branch_cost,
   false,                /* Prefer LDRD/STRD. */
   {true, true},         /* Prefer non short circuit. */
+  false                 /* Prefer Neon for 64-bits bitops. */
 };
 
 const struct tune_params arm_cortex_a15_tune =
@@ -990,6 +1001,7 @@ const struct tune_params arm_cortex_a15_tune =
   arm_default_branch_cost,
   true,                 /* Prefer LDRD/STRD. */
   {true, true},         /* Prefer non short circuit. */
+  false                 /* Prefer Neon for 64-bits bitops. */
 };
 
 /* Branches can be dual-issued on Cortex-A5, so conditional execution is
@@ -1006,6 +1018,7 @@ const struct tune_params arm_cortex_a5_tune =
   arm_cortex_a5_branch_cost,
   false,                /* Prefer LDRD/STRD. */
   {false, false},       /* Prefer non short circuit. */
+  false                 /* Prefer Neon for 64-bits bitops. */
 };
 
 const struct tune_params arm_cortex_a9_tune =
@@ -1019,6 +1032,7 @@ const struct tune_params arm_cortex_a9_tune =
   arm_default_branch_cost,
   false,                /* Prefer LDRD/STRD. */
   {true, true},         /* Prefer non short circuit. */
+  false                 /* Prefer Neon for 64-bits bitops. */
 };
 
 /* The arm_v6m_tune is duplicated from arm_cortex_tune, rather than
@@ -1034,6 +1048,7 @@ const struct tune_params arm_v6m_tune =
   arm_default_branch_cost,
   false,                /* Prefer LDRD/STRD. */
   {false, false},       /* Prefer non short circuit. */
+  false                 /* Prefer Neon for 64-bits bitops. */
 };
 
 const struct tune_params arm_fa726te_tune =
@@ -1047,6 +1062,7 @@ const struct tune_params arm_fa726te_tune =
   arm_default_branch_cost,
   false,                /* Prefer LDRD/STRD. */
   {true, true},         /* Prefer non short circuit. */
+  false                 /* Prefer Neon for 64-bits bitops. */
 };
 
 
@@ -2077,6 +2093,12 @@ arm_option_override (void)
                          global_options.x_param_values,
                          global_options_set.x_param_values);
 
+  /* Use Neon to perform 64-bits operations rather than core
+     registers. */
+  prefer_neon_for_64bits = current_tune->prefer_neon_for_64bits;
+  if (use_neon_for_64bits == 1)
+     prefer_neon_for_64bits = true;
+
   /* Use the alternative scheduling-pressure algorithm by default. */
   maybe_set_param_value (PARAM_SCHED_PRESSURE_ALGORITHM, 2,
                          global_options.x_param_values,
diff --git a/gcc/config/arm/arm.h b/gcc/config/arm/arm.h
index f520cc7..c71d85f 100644
--- a/gcc/config/arm/arm.h
+++ b/gcc/config/arm/arm.h
@@ -356,6 +356,9 @@ extern void (*arm_lang_output_object_attributes_hook)(void);
 #define TARGET_IDIV ((TARGET_ARM && arm_arch_arm_hwdiv) \
                      || (TARGET_THUMB2 && arm_arch_thumb_hwdiv))
 
+/* Should NEON be used for 64-bits bitops. */
+#define TARGET_PREFER_NEON_64BITS (prefer_neon_for_64bits)
+
 /* True iff the full BPABI is being used. If TARGET_BPABI is true,
    then TARGET_AAPCS_BASED must be true -- but the converse does not
    hold. TARGET_BPABI implies the use of the BPABI runtime library,
@@ -541,6 +544,10 @@ extern int arm_arch_arm_hwdiv;
 /* Nonzero if chip supports integer division instruction in Thumb mode. */
 extern int arm_arch_thumb_hwdiv;
 
+/* Nonzero if we should use Neon to handle 64-bits operations rather
+   than core registers. */
+extern int prefer_neon_for_64bits;
+
 #ifndef TARGET_DEFAULT
 #define TARGET_DEFAULT (MASK_APCS_FRAME)
 #endif
diff --git a/gcc/config/arm/arm.md b/gcc/config/arm/arm.md
index ac507ef..afde613 100644
--- a/gcc/config/arm/arm.md
+++ b/gcc/config/arm/arm.md
@@ -202,7 +202,7 @@
 ; for ARM or Thumb-2 with arm_arch6, and nov6 for ARM without
 ; arm_arch6. This attribute is used to compute attribute "enabled",
 ; use type "any" to enable an alternative in all cases.
-(define_attr "arch" "any,a,t,32,t1,t2,v6,nov6,onlya8,neon_onlya8,nota8,neon_nota8,iwmmxt,iwmmxt2"
+(define_attr "arch" "any,a,t,32,t1,t2,v6,nov6,onlya8,nota8,neon_for_64bits,avoid_neon_for_64bits,iwmmxt,iwmmxt2"
   (const_string "any"))
 
 (define_attr "arch_enabled" "no,yes"
@@ -241,18 +241,18 @@
               (eq_attr "tune" "cortexa8"))
          (const_string "yes")
 
-         (and (eq_attr "arch" "neon_onlya8")
-              (eq_attr "tune" "cortexa8")
-              (match_test "TARGET_NEON"))
+         (and (eq_attr "arch" "avoid_neon_for_64bits")
+              (match_test "TARGET_NEON")
+              (not (match_test "TARGET_PREFER_NEON_64BITS")))
          (const_string "yes")
 
          (and (eq_attr "arch" "nota8")
               (not (eq_attr "tune" "cortexa8")))
          (const_string "yes")
 
-         (and (eq_attr "arch" "neon_nota8")
-              (not (eq_attr "tune" "cortexa8"))
-              (match_test "TARGET_NEON"))
+         (and (eq_attr "arch" "neon_for_64bits")
+              (match_test "TARGET_NEON")
+              (match_test "TARGET_PREFER_NEON_64BITS"))
          (const_string "yes")
 
          (and (eq_attr "arch" "iwmmxt2")
@@ -4370,7 +4370,7 @@
   [(set_attr "length" "*,8,8,*")
    (set_attr "predicable" "no,yes,yes,no")
    (set_attr "neon_type" "neon_int_1,*,*,neon_int_1")
-   (set_attr "arch" "neon_nota8,*,*,neon_onlya8")]
+   (set_attr "arch" "neon_for_64bits,*,*,avoid_neon_for_64bits")]
 )
 
 (define_expand "one_cmplsi2"
diff --git a/gcc/config/arm/arm.opt b/gcc/config/arm/arm.opt
index fb12c55..83b6002 100644
--- a/gcc/config/arm/arm.opt
+++ b/gcc/config/arm/arm.opt
@@ -251,3 +251,7 @@ that may trigger Cortex-M3 errata.
 munaligned-access
 Target Report Var(unaligned_access) Init(2)
 Enable unaligned word and halfword accesses to packed data.
+
+mneon-for-64bits
+Target Report RejectNegative Var(use_neon_for_64bits) Init(0)
+Use Neon to perform 64-bits operations rather than core registers.
diff --git a/gcc/config/arm/neon.md b/gcc/config/arm/neon.md
index 2103580..8b0e877 100644
--- a/gcc/config/arm/neon.md
+++ b/gcc/config/arm/neon.md
@@ -617,7 +617,7 @@
   [(set_attr "neon_type" "neon_int_1,*,*,neon_int_1,*,*,*")
    (set_attr "conds" "*,clob,clob,*,clob,clob,clob")
    (set_attr "length" "*,8,8,*,8,8,8")
-   (set_attr "arch" "nota8,*,*,onlya8,*,*,*")]
+   (set_attr "arch" "neon_for_64bits,*,*,avoid_neon_for_64bits,*,*,*")]
 )
 
 (define_insn "*sub3_neon"
@@ -654,7 +654,7 @@
   [(set_attr "neon_type" "neon_int_2,*,*,*,neon_int_2")
    (set_attr "conds" "*,clob,clob,clob,*")
    (set_attr "length" "*,8,8,8,*")
-   (set_attr "arch" "nota8,*,*,*,onlya8")]
+   (set_attr "arch" "neon_for_64bits,*,*,*,avoid_neon_for_64bits")]
 )
 
 (define_insn "*mul3_neon"
@@ -816,7 +816,7 @@
 }
   [(set_attr "neon_type" "neon_int_1,neon_int_1,*,*,neon_int_1,neon_int_1")
    (set_attr "length" "*,*,8,8,*,*")
-   (set_attr "arch" "nota8,nota8,*,*,onlya8,onlya8")]
+   (set_attr "arch" "neon_for_64bits,neon_for_64bits,*,*,avoid_neon_for_64bits,avoid_neon_for_64bits")]
 )
 
 ;; The concrete forms of the Neon immediate-logic instructions are vbic and
@@ -861,7 +861,7 @@
 }
   [(set_attr "neon_type" "neon_int_1,neon_int_1,*,*,neon_int_1,neon_int_1")
    (set_attr "length" "*,*,8,8,*,*")
-   (set_attr "arch" "nota8,nota8,*,*,onlya8,onlya8")]
+   (set_attr "arch" "neon_for_64bits,neon_for_64bits,*,*,avoid_neon_for_64bits,avoid_neon_for_64bits")]
 )
 
 (define_insn "orn3_neon"
@@ -957,7 +957,7 @@
    veor\t%P0, %P1, %P2"
   [(set_attr "neon_type" "neon_int_1,*,*,neon_int_1")
    (set_attr "length" "*,8,8,*")
-   (set_attr "arch" "nota8,*,*,onlya8")]
+   (set_attr "arch" "neon_for_64bits,*,*,avoid_neon_for_64bits")]
 )
 
 (define_insn "one_cmpl2"
@@ -1279,7 +1279,7 @@
       }
     DONE;
   }"
-  [(set_attr "arch" "nota8,nota8,*,*,onlya8,onlya8")
+  [(set_attr "arch" "neon_for_64bits,neon_for_64bits,*,*,avoid_neon_for_64bits,avoid_neon_for_64bits")
    (set_attr "opt" "*,*,speed,speed,*,*")]
 )
 
@@ -1380,7 +1380,7 @@
 
     DONE;
   }"
-  [(set_attr "arch" "nota8,nota8,*,*,onlya8,onlya8")
+  [(set_attr "arch" "neon_for_64bits,neon_for_64bits,*,*,avoid_neon_for_64bits,avoid_neon_for_64bits")
    (set_attr "opt" "*,*,speed,speed,*,*")]
 )
 
diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index 51b6e85..3918b1d 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -514,7 +514,8 @@ Objective-C and Objective-C++ Dialects}.
 -mtp=@var{name} -mtls-dialect=@var{dialect} @gol
 -mword-relocations @gol
 -mfix-cortex-m3-ldrd @gol
--munaligned-access}
+-munaligned-access @gol
+-mneon-for-64bits}
 
 @emph{AVR Options}
 @gccoptlist{-mmcu=@var{mcu} -maccumulate-args -mbranch-cost=@var{cost} @gol
@@ -11521,6 +11522,11 @@ setting of this option. If unaligned access is enabled then the
 preprocessor symbol @code{__ARM_FEATURE_UNALIGNED} will also be
 defined.
 
+@item -mneon-for-64bits
+@opindex mneon-for-64bits
+Enables using Neon to handle scalar 64-bits operations. This is
+disabled by default since the cost of moving data from core registers
+to Neon is high.
 @end table
 
 @node AVR Options
diff --git a/gcc/testsuite/gcc.target/arm/neon-for-64bits-1.c b/gcc/testsuite/gcc.target/arm/neon-for-64bits-1.c
new file mode 100644
index 0000000..a2a4103
--- /dev/null
+++ b/gcc/testsuite/gcc.target/arm/neon-for-64bits-1.c
@@ -0,0 +1,54 @@
+/* Check that Neon is *not* used by default to handle 64-bits scalar
+   operations. */
+
+/* { dg-do compile } */
+/* { dg-require-effective-target arm_neon_ok } */
+/* { dg-options "-O2" } */
+/* { dg-add-options arm_neon } */
+
+typedef long long i64;
+typedef unsigned long long u64;
+typedef unsigned int u32;
+typedef int i32;
+
+/* Unary operators */
+#define UNARY_OP(name, op) \
+  void unary_##name(u64 *a, u64 *b) { *a = op (*b + 0x1234567812345678ULL) ; }
+
+/* Binary operators */
+#define BINARY_OP(name, op) \
+  void binary_##name(u64 *a, u64 *b, u64 *c) { *a = *b op *c ; }
+
+/* Unsigned shift */
+#define SHIFT_U(name, op, amount) \
+  void ushift_##name(u64 *a, u64 *b, int c) { *a = *b op amount; }
+
+/* Signed shift */
+#define SHIFT_S(name, op, amount) \
+  void sshift_##name(i64 *a, i64 *b, int c) { *a = *b op amount; }
+
+UNARY_OP(not, ~)
+
+BINARY_OP(add, +)
+BINARY_OP(sub, -)
+BINARY_OP(and, &)
+BINARY_OP(or, |)
+BINARY_OP(xor, ^)
+
+SHIFT_U(right1, >>, 1)
+SHIFT_U(right2, >>, 2)
+SHIFT_U(right5, >>, 5)
+SHIFT_U(rightn, >>, c)
+
+SHIFT_S(right1, >>, 1)
+SHIFT_S(right2, >>, 2)
+SHIFT_S(right5, >>, 5)
+SHIFT_S(rightn, >>, c)
+
+/* { dg-final {scan-assembler-times "vmvn" 0} } */
+/* { dg-final {scan-assembler-times "vadd" 0} } */
+/* { dg-final {scan-assembler-times "vsub" 0} } */
+/* { dg-final {scan-assembler-times "vand" 0} } */
+/* { dg-final {scan-assembler-times "vorr" 0} } */
+/* { dg-final {scan-assembler-times "veor" 0} } */
+/* { dg-final {scan-assembler-times "vshr" 0} } */
diff --git a/gcc/testsuite/gcc.target/arm/neon-for-64bits-2.c b/gcc/testsuite/gcc.target/arm/neon-for-64bits-2.c
new file mode 100644
index 0000000..035bfb7
--- /dev/null
+++ b/gcc/testsuite/gcc.target/arm/neon-for-64bits-2.c
@@ -0,0 +1,57 @@
+/* Check that Neon is used to handle 64-bits scalar operations. */
+
+/* { dg-do compile } */
+/* { dg-require-effective-target arm_neon_ok } */
+/* { dg-options "-O2 -mneon-for-64bits" } */
+/* { dg-add-options arm_neon } */
+
+typedef long long i64;
+typedef unsigned long long u64;
+typedef unsigned int u32;
+typedef int i32;
+
+/* Unary operators */
+#define UNARY_OP(name, op) \
+  void unary_##name(u64 *a, u64 *b) { *a = op (*b + 0x1234567812345678ULL) ; }
+
+/* Binary operators */
+#define BINARY_OP(name, op) \
+  void binary_##name(u64 *a, u64 *b, u64 *c) { *a = *b op *c ; }
+
+/* Unsigned shift */
+#define SHIFT_U(name, op, amount) \
+  void ushift_##name(u64 *a, u64 *b, int c) { *a = *b op amount; }
+
+/* Signed shift */
+#define SHIFT_S(name, op, amount) \
+  void sshift_##name(i64 *a, i64 *b, int c) { *a = *b op amount; }
+
+UNARY_OP(not, ~)
+
+BINARY_OP(add, +)
+BINARY_OP(sub, -)
+BINARY_OP(and, &)
+BINARY_OP(or, |)
+BINARY_OP(xor, ^)
+
+SHIFT_U(right1, >>, 1)
+SHIFT_U(right2, >>, 2)
+SHIFT_U(right5, >>, 5)
+SHIFT_U(rightn, >>, c)
+
+SHIFT_S(right1, >>, 1)
+SHIFT_S(right2, >>, 2)
+SHIFT_S(right5, >>, 5)
+SHIFT_S(rightn, >>, c)
+
+/* { dg-final {scan-assembler-times "vmvn" 1} } */
+/* Two vadd: 1 in unary_not, 1 in binary_add */
+/* { dg-final {scan-assembler-times "vadd" 2} } */
+/* { dg-final {scan-assembler-times "vsub" 1} } */
+/* { dg-final {scan-assembler-times "vand" 1} } */
+/* { dg-final {scan-assembler-times "vorr" 1} } */
+/* { dg-final {scan-assembler-times "veor" 1} } */
+/* 6 vshr for right shifts by constant, and variable right shift uses
+   vshl with a negative amount in register. */
+/* { dg-final {scan-assembler-times "vshr" 6} } */
+/* { dg-final {scan-assembler-times "vshl" 2} } */
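
For reviewers who want the selection logic in one place, here is a
minimal standalone sketch of how I read the patch: the tuning default
is taken first, -mneon-for-64bits (use_neon_for_64bits == 1) can only
turn the preference on, and the two new "arch" values are enabled on
opposite sides of that preference when TARGET_NEON holds. It is only a
model, not GCC code: tune_model, arch_kind, resolve_prefer_neon and
arch_alternative_enabled are hypothetical names made up for this
illustration.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the single tune_params field added by the
   patch; the real structure has many more members.  */
struct tune_model
{
  bool prefer_neon_for_64bits;
};

/* Models the lines added to arm_option_override: start from the tuning
   default, then let -mneon-for-64bits force the preference on.  The
   option is RejectNegative, so it can never force it off.  */
bool
resolve_prefer_neon (const struct tune_model *tune, int use_neon_for_64bits)
{
  bool prefer = tune->prefer_neon_for_64bits;
  if (use_neon_for_64bits == 1)
    prefer = true;
  return prefer;
}

/* Hypothetical enum for the two "arch" attribute values the patch
   introduces in arm.md.  */
enum arch_kind
{
  ARCH_NEON_FOR_64BITS,
  ARCH_AVOID_NEON_FOR_64BITS
};

/* Models the two new arch_enabled clauses: both require TARGET_NEON and
   they are mutually exclusive on TARGET_PREFER_NEON_64BITS.  */
bool
arch_alternative_enabled (enum arch_kind arch, bool target_neon,
                          bool prefer_neon)
{
  switch (arch)
    {
    case ARCH_NEON_FOR_64BITS:
      return target_neon && prefer_neon;
    case ARCH_AVOID_NEON_FOR_64BITS:
      return target_neon && !prefer_neon;
    }
  return false;
}

/* Small self-check of the default and option-enabled cases.  */
void
check_examples (void)
{
  struct tune_model all_false = { false };
  assert (!resolve_prefer_neon (&all_false, 0));
  assert (resolve_prefer_neon (&all_false, 1));
}
```

Since every tuning in this patch defaults the field to false, the
neon_for_64bits alternatives are only reachable with the option in this
model, which is what the two new testcases check for.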