From patchwork Fri Jul 2 23:23:21 2010 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Sandra Loosemore X-Patchwork-Id: 57788 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Received: from sourceware.org (server1.sourceware.org [209.132.180.131]) by ozlabs.org (Postfix) with SMTP id F30A71007D2 for ; Sat, 3 Jul 2010 09:23:40 +1000 (EST) Received: (qmail 17771 invoked by alias); 2 Jul 2010 23:23:38 -0000 Received: (qmail 17759 invoked by uid 22791); 2 Jul 2010 23:23:34 -0000 X-SWARE-Spam-Status: No, hits=-1.5 required=5.0 tests=AWL, BAYES_00, TW_VB, TW_VC, T_RP_MATCHES_RCVD X-Spam-Check-By: sourceware.org Received: from mail.codesourcery.com (HELO mail.codesourcery.com) (38.113.113.100) by sourceware.org (qpsmtpd/0.43rc1) with ESMTP; Fri, 02 Jul 2010 23:23:26 +0000 Received: (qmail 30538 invoked from network); 2 Jul 2010 23:23:24 -0000 Received: from unknown (HELO ?192.168.2.2?) (sandra@127.0.0.2) by mail.codesourcery.com with ESMTPA; 2 Jul 2010 23:23:24 -0000 Message-ID: <4C2E74E9.7060508@codesourcery.com> Date: Fri, 02 Jul 2010 19:23:21 -0400 From: Sandra Loosemore User-Agent: Thunderbird 2.0.0.23 (X11/20090817) MIME-Version: 1.0 To: Richard Earnshaw CC: gcc-patches@gcc.gnu.org Subject: Re: [PATCH, ARM]: rewrite NEON bitwise operations without UNSPECs References: <4C20095B.3090106@codesourcery.com> <1277917254.3358.75.camel@e102346-lin.cambridge.arm.com> In-Reply-To: <1277917254.3358.75.camel@e102346-lin.cambridge.arm.com> Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm Precedence: bulk List-Id: List-Unsubscribe: List-Archive: List-Post: List-Help: Sender: gcc-patches-owner@gcc.gnu.org Delivered-To: mailing list gcc-patches@gcc.gnu.org Richard Earnshaw wrote: > > Shouldn't there be support in orndi_neon for the thumb2 ORN instruction? Ah, good catch -- I missed making that pattern know it can split, like the others. For the record, this is the version of the patch I checked in. -Sandra 2010-07-02 Sandra Loosemore gcc/ * config/arm/neon.md (UNSPEC_VAND): Delete. (UNSPEC_VBIC): Delete. (UNSPEC_VCLZ): Delete. (UNSPEC_VCNT): Delete. (UNSPEC_VEOR): Delete. (UNSPEC_VORN): Delete. (UNSPEC_VORR): Delete. (iordi3_neon): Rewrite RTL without unspec. Add alternatives to handle core registers too. (anddi3_neon): Likewise. (orndi3_neon): Likewise. (bicdi3_neon): Likewise. (xordi3_neon): Likewise. (neon_vclz): Rewrite as define_expand and clz2 to get rid of unspec and handle unused operand. (neon_vcnt): Similarly, with popcount2. * config/arm/predicates.md (imm_for_neon_logic_operand): Require TARGET_NEON. (imm_for_neon_inv_logic_operand): Likewise. * config/arm/arm.md (define_split for logical_binary_operator): Disable for NEON registers. (anddi3): Add new define_expand, and rename the insn. Disable this insn for NEON, where anddi3_neon now applies. (*anddi_notdi_di): Disable for TARGET_NEON, where bicdi3_neon applies. (iordi3): As for anddi3. (xordi3): Likewise. * config/arm/neon.ml (Vand): Split DImode variants and mark them as No_op to disable testing for exact instruction match. (Vorr): Likewise. (Veor): Likewise. (Vbic): Likewise. (Vorn): Likewise. * config/arm/arm_neon.h: Regenerated. * doc/arm-neon-intrinsics.texi: Regenerated. gcc/testsuite/ * gcc.target/arm/neon-vands64.c: New. * gcc.target/arm/neon-vandu64.c: New. * gcc.target/arm/neon-vbics64.c: New. * gcc.target/arm/neon-vbicu64.c: New. * gcc.target/arm/neon-veors64.c: New. * gcc.target/arm/neon-veoru64.c: New. * gcc.target/arm/neon-vorns64.c: New. * gcc.target/arm/neon-vornu64.c: New. * gcc.target/arm/neon-vorrs64.c: New. * gcc.target/arm/neon-vorru64.c: New. * gcc.target/arm/neon/vands64.c: Regenerated. * gcc.target/arm/neon/vandu64.c: Regenerated. * gcc.target/arm/neon/vbics64.c: Regenerated. * gcc.target/arm/neon/vbicu64.c: Regenerated. * gcc.target/arm/neon/veors64.c: Regenerated. * gcc.target/arm/neon/veoru64.c: Regenerated. * gcc.target/arm/neon/vorns64.c: Regenerated. * gcc.target/arm/neon/vornu64.c: Regenerated. * gcc.target/arm/neon/vorrs64.c: Regenerated. * gcc.target/arm/neon/vorru64.c: Regenerated. Index: gcc/config/arm/neon.md =================================================================== --- gcc/config/arm/neon.md (revision 161753) +++ gcc/config/arm/neon.md (working copy) @@ -31,8 +31,6 @@ (UNSPEC_VADDHN 73) (UNSPEC_VADDL 74) (UNSPEC_VADDW 75) - (UNSPEC_VAND 76) - (UNSPEC_VBIC 77) (UNSPEC_VBSL 78) (UNSPEC_VCAGE 79) (UNSPEC_VCAGT 80) @@ -40,11 +38,8 @@ (UNSPEC_VCGE 82) (UNSPEC_VCGT 83) (UNSPEC_VCLS 84) - (UNSPEC_VCLZ 85) - (UNSPEC_VCNT 86) (UNSPEC_VCVT 88) (UNSPEC_VCVT_N 89) - (UNSPEC_VEOR 92) (UNSPEC_VEXT 93) (UNSPEC_VHADD 97) (UNSPEC_VHSUB 98) @@ -81,8 +76,6 @@ (UNSPEC_VMUL_LANE 129) (UNSPEC_VMULL_LANE 130) (UNSPEC_VMUL_N 131) - (UNSPEC_VORN 133) - (UNSPEC_VORR 134) (UNSPEC_VPADAL 135) (UNSPEC_VPADD 136) (UNSPEC_VPADDL 137) @@ -940,10 +933,9 @@ ) (define_insn "iordi3_neon" - [(set (match_operand:DI 0 "s_register_operand" "=w,w") - (unspec:DI [(match_operand:DI 1 "s_register_operand" "w,0") - (match_operand:DI 2 "neon_logic_op2" "w,Dl")] - UNSPEC_VORR))] + [(set (match_operand:DI 0 "s_register_operand" "=w,w,?&r,?&r") + (ior:DI (match_operand:DI 1 "s_register_operand" "%w,0,0,r") + (match_operand:DI 2 "neon_logic_op2" "w,Dl,r,r")))] "TARGET_NEON" { switch (which_alternative) @@ -951,10 +943,13 @@ case 0: return "vorr\t%P0, %P1, %P2"; case 1: return neon_output_logic_immediate ("vorr", &operands[2], DImode, 0, VALID_NEON_QREG_MODE (DImode)); + case 2: return "#"; + case 3: return "#"; default: gcc_unreachable (); } } - [(set_attr "neon_type" "neon_int_1")] + [(set_attr "neon_type" "neon_int_1,neon_int_1,*,*") + (set_attr "length" "*,*,8,8")] ) ;; The concrete forms of the Neon immediate-logic instructions are vbic and @@ -980,10 +975,9 @@ ) (define_insn "anddi3_neon" - [(set (match_operand:DI 0 "s_register_operand" "=w,w") - (unspec:DI [(match_operand:DI 1 "s_register_operand" "w,0") - (match_operand:DI 2 "neon_inv_logic_op2" "w,DL")] - UNSPEC_VAND))] + [(set (match_operand:DI 0 "s_register_operand" "=w,w,?&r,?&r") + (and:DI (match_operand:DI 1 "s_register_operand" "%w,0,0,r") + (match_operand:DI 2 "neon_inv_logic_op2" "w,DL,r,r")))] "TARGET_NEON" { switch (which_alternative) @@ -991,10 +985,13 @@ case 0: return "vand\t%P0, %P1, %P2"; case 1: return neon_output_logic_immediate ("vand", &operands[2], DImode, 1, VALID_NEON_QREG_MODE (DImode)); + case 2: return "#"; + case 3: return "#"; default: gcc_unreachable (); } } - [(set_attr "neon_type" "neon_int_1")] + [(set_attr "neon_type" "neon_int_1,neon_int_1,*,*") + (set_attr "length" "*,*,8,8")] ) (define_insn "orn3_neon" @@ -1007,13 +1004,16 @@ ) (define_insn "orndi3_neon" - [(set (match_operand:DI 0 "s_register_operand" "=w") - (unspec:DI [(match_operand:DI 1 "s_register_operand" "w") - (match_operand:DI 2 "s_register_operand" "w")] - UNSPEC_VORN))] + [(set (match_operand:DI 0 "s_register_operand" "=w,?=&r,?&r") + (ior:DI (match_operand:DI 1 "s_register_operand" "w,r,0") + (not:DI (match_operand:DI 2 "s_register_operand" "w,0,r"))))] "TARGET_NEON" - "vorn\t%P0, %P1, %P2" - [(set_attr "neon_type" "neon_int_1")] + "@ + vorn\t%P0, %P1, %P2 + # + #" + [(set_attr "neon_type" "neon_int_1,*,*") + (set_attr "length" "*,8,8")] ) (define_insn "bic3_neon" @@ -1025,14 +1025,18 @@ [(set_attr "neon_type" "neon_int_1")] ) +;; Compare to *anddi_notdi_di. (define_insn "bicdi3_neon" - [(set (match_operand:DI 0 "s_register_operand" "=w") - (unspec:DI [(match_operand:DI 1 "s_register_operand" "w") - (match_operand:DI 2 "s_register_operand" "w")] - UNSPEC_VBIC))] + [(set (match_operand:DI 0 "s_register_operand" "=w,?=&r,?&r") + (and:DI (not:DI (match_operand:DI 2 "s_register_operand" "w,r,0")) + (match_operand:DI 1 "s_register_operand" "w,0,r")))] "TARGET_NEON" - "vbic\t%P0, %P1, %P2" - [(set_attr "neon_type" "neon_int_1")] + "@ + vbic\t%P0, %P1, %P2 + # + #" + [(set_attr "neon_type" "neon_int_1,*,*") + (set_attr "length" "*,8,8")] ) (define_insn "xor3" @@ -1045,13 +1049,16 @@ ) (define_insn "xordi3_neon" - [(set (match_operand:DI 0 "s_register_operand" "=w") - (unspec:DI [(match_operand:DI 1 "s_register_operand" "w") - (match_operand:DI 2 "s_register_operand" "w")] - UNSPEC_VEOR))] + [(set (match_operand:DI 0 "s_register_operand" "=w,?&r,?&r") + (xor:DI (match_operand:DI 1 "s_register_operand" "%w,0,r") + (match_operand:DI 2 "s_register_operand" "w,r,r")))] "TARGET_NEON" - "veor\t%P0, %P1, %P2" - [(set_attr "neon_type" "neon_int_1")] + "@ + veor\t%P0, %P1, %P2 + # + #" + [(set_attr "neon_type" "neon_int_1,*,*") + (set_attr "length" "*,8,8")] ) (define_insn "one_cmpl2" @@ -2359,26 +2366,42 @@ [(set_attr "neon_type" "neon_int_1")] ) -(define_insn "neon_vclz" +(define_insn "clz2" [(set (match_operand:VDQIW 0 "s_register_operand" "=w") - (unspec:VDQIW [(match_operand:VDQIW 1 "s_register_operand" "w") - (match_operand:SI 2 "immediate_operand" "i")] - UNSPEC_VCLZ))] + (clz:VDQIW (match_operand:VDQIW 1 "s_register_operand" "w")))] "TARGET_NEON" "vclz.\t%0, %1" [(set_attr "neon_type" "neon_int_1")] ) -(define_insn "neon_vcnt" +(define_expand "neon_vclz" + [(match_operand:VDQIW 0 "s_register_operand" "") + (match_operand:VDQIW 1 "s_register_operand" "") + (match_operand:SI 2 "immediate_operand" "")] + "TARGET_NEON" +{ + emit_insn (gen_clz2 (operands[0], operands[1])); + DONE; +}) + +(define_insn "popcount2" [(set (match_operand:VE 0 "s_register_operand" "=w") - (unspec:VE [(match_operand:VE 1 "s_register_operand" "w") - (match_operand:SI 2 "immediate_operand" "i")] - UNSPEC_VCNT))] + (popcount:VE (match_operand:VE 1 "s_register_operand" "w")))] "TARGET_NEON" "vcnt.\t%0, %1" [(set_attr "neon_type" "neon_int_1")] ) +(define_expand "neon_vcnt" + [(match_operand:VE 0 "s_register_operand" "=w") + (match_operand:VE 1 "s_register_operand" "w") + (match_operand:SI 2 "immediate_operand" "i")] + "TARGET_NEON" +{ + emit_insn (gen_popcount2 (operands[0], operands[1])); + DONE; +}) + (define_insn "neon_vrecpe" [(set (match_operand:V32 0 "s_register_operand" "=w") (unspec:V32 [(match_operand:V32 1 "s_register_operand" "w") Index: gcc/config/arm/neon.ml =================================================================== --- gcc/config/arm/neon.ml (revision 161753) +++ gcc/config/arm/neon.ml (working copy) @@ -1619,23 +1619,28 @@ let ops = store_3, [P16; F32; U16; U32; S16; S32]; (* Logical operations. And. *) - Vand, [], All (3, Dreg), "vand", notype_2, su_8_64; + Vand, [], All (3, Dreg), "vand", notype_2, su_8_32; + Vand, [No_op], All (3, Dreg), "vand", notype_2, [S64; U64]; Vand, [], All (3, Qreg), "vandQ", notype_2, su_8_64; (* Or. *) - Vorr, [], All (3, Dreg), "vorr", notype_2, su_8_64; + Vorr, [], All (3, Dreg), "vorr", notype_2, su_8_32; + Vorr, [No_op], All (3, Dreg), "vorr", notype_2, [S64; U64]; Vorr, [], All (3, Qreg), "vorrQ", notype_2, su_8_64; (* Eor. *) - Veor, [], All (3, Dreg), "veor", notype_2, su_8_64; + Veor, [], All (3, Dreg), "veor", notype_2, su_8_32; + Veor, [No_op], All (3, Dreg), "veor", notype_2, [S64; U64]; Veor, [], All (3, Qreg), "veorQ", notype_2, su_8_64; (* Bic (And-not). *) - Vbic, [], All (3, Dreg), "vbic", notype_2, su_8_64; + Vbic, [], All (3, Dreg), "vbic", notype_2, su_8_32; + Vbic, [No_op], All (3, Dreg), "vbic", notype_2, [S64; U64]; Vbic, [], All (3, Qreg), "vbicQ", notype_2, su_8_64; (* Or-not. *) - Vorn, [], All (3, Dreg), "vorn", notype_2, su_8_64; + Vorn, [], All (3, Dreg), "vorn", notype_2, su_8_32; + Vorn, [No_op], All (3, Dreg), "vorn", notype_2, [S64; U64]; Vorn, [], All (3, Qreg), "vornQ", notype_2, su_8_64; ] Index: gcc/config/arm/predicates.md =================================================================== --- gcc/config/arm/predicates.md (revision 161753) +++ gcc/config/arm/predicates.md (working copy) @@ -506,13 +506,15 @@ (define_predicate "imm_for_neon_logic_operand" (match_code "const_vector") { - return neon_immediate_valid_for_logic (op, mode, 0, NULL, NULL); + return (TARGET_NEON + && neon_immediate_valid_for_logic (op, mode, 0, NULL, NULL)); }) (define_predicate "imm_for_neon_inv_logic_operand" (match_code "const_vector") { - return neon_immediate_valid_for_logic (op, mode, 1, NULL, NULL); + return (TARGET_NEON + && neon_immediate_valid_for_logic (op, mode, 1, NULL, NULL)); }) (define_predicate "neon_logic_op2" Index: gcc/config/arm/arm_neon.h =================================================================== --- gcc/config/arm/arm_neon.h (revision 161753) +++ gcc/config/arm/arm_neon.h (working copy) @@ -5808,12 +5808,6 @@ vget_low_s32 (int32x4_t __a) return (int32x2_t)__builtin_neon_vget_lowv4si (__a); } -__extension__ static __inline int64x1_t __attribute__ ((__always_inline__)) -vget_low_s64 (int64x2_t __a) -{ - return (int64x1_t)__builtin_neon_vget_lowv2di (__a); -} - __extension__ static __inline float32x2_t __attribute__ ((__always_inline__)) vget_low_f32 (float32x4_t __a) { @@ -5838,12 +5832,6 @@ vget_low_u32 (uint32x4_t __a) return (uint32x2_t)__builtin_neon_vget_lowv4si ((int32x4_t) __a); } -__extension__ static __inline uint64x1_t __attribute__ ((__always_inline__)) -vget_low_u64 (uint64x2_t __a) -{ - return (uint64x1_t)__builtin_neon_vget_lowv2di ((int64x2_t) __a); -} - __extension__ static __inline poly8x8_t __attribute__ ((__always_inline__)) vget_low_p8 (poly8x16_t __a) { @@ -5856,6 +5844,18 @@ vget_low_p16 (poly16x8_t __a) return (poly16x4_t)__builtin_neon_vget_lowv8hi ((int16x8_t) __a); } +__extension__ static __inline int64x1_t __attribute__ ((__always_inline__)) +vget_low_s64 (int64x2_t __a) +{ + return (int64x1_t)__builtin_neon_vget_lowv2di (__a); +} + +__extension__ static __inline uint64x1_t __attribute__ ((__always_inline__)) +vget_low_u64 (uint64x2_t __a) +{ + return (uint64x1_t)__builtin_neon_vget_lowv2di ((int64x2_t) __a); +} + __extension__ static __inline int32x2_t __attribute__ ((__always_inline__)) vcvt_s32_f32 (float32x2_t __a) { @@ -10386,12 +10386,6 @@ vand_s32 (int32x2_t __a, int32x2_t __b) return (int32x2_t)__builtin_neon_vandv2si (__a, __b, 1); } -__extension__ static __inline int64x1_t __attribute__ ((__always_inline__)) -vand_s64 (int64x1_t __a, int64x1_t __b) -{ - return (int64x1_t)__builtin_neon_vanddi (__a, __b, 1); -} - __extension__ static __inline uint8x8_t __attribute__ ((__always_inline__)) vand_u8 (uint8x8_t __a, uint8x8_t __b) { @@ -10410,6 +10404,12 @@ vand_u32 (uint32x2_t __a, uint32x2_t __b return (uint32x2_t)__builtin_neon_vandv2si ((int32x2_t) __a, (int32x2_t) __b, 0); } +__extension__ static __inline int64x1_t __attribute__ ((__always_inline__)) +vand_s64 (int64x1_t __a, int64x1_t __b) +{ + return (int64x1_t)__builtin_neon_vanddi (__a, __b, 1); +} + __extension__ static __inline uint64x1_t __attribute__ ((__always_inline__)) vand_u64 (uint64x1_t __a, uint64x1_t __b) { @@ -10482,12 +10482,6 @@ vorr_s32 (int32x2_t __a, int32x2_t __b) return (int32x2_t)__builtin_neon_vorrv2si (__a, __b, 1); } -__extension__ static __inline int64x1_t __attribute__ ((__always_inline__)) -vorr_s64 (int64x1_t __a, int64x1_t __b) -{ - return (int64x1_t)__builtin_neon_vorrdi (__a, __b, 1); -} - __extension__ static __inline uint8x8_t __attribute__ ((__always_inline__)) vorr_u8 (uint8x8_t __a, uint8x8_t __b) { @@ -10506,6 +10500,12 @@ vorr_u32 (uint32x2_t __a, uint32x2_t __b return (uint32x2_t)__builtin_neon_vorrv2si ((int32x2_t) __a, (int32x2_t) __b, 0); } +__extension__ static __inline int64x1_t __attribute__ ((__always_inline__)) +vorr_s64 (int64x1_t __a, int64x1_t __b) +{ + return (int64x1_t)__builtin_neon_vorrdi (__a, __b, 1); +} + __extension__ static __inline uint64x1_t __attribute__ ((__always_inline__)) vorr_u64 (uint64x1_t __a, uint64x1_t __b) { @@ -10578,12 +10578,6 @@ veor_s32 (int32x2_t __a, int32x2_t __b) return (int32x2_t)__builtin_neon_veorv2si (__a, __b, 1); } -__extension__ static __inline int64x1_t __attribute__ ((__always_inline__)) -veor_s64 (int64x1_t __a, int64x1_t __b) -{ - return (int64x1_t)__builtin_neon_veordi (__a, __b, 1); -} - __extension__ static __inline uint8x8_t __attribute__ ((__always_inline__)) veor_u8 (uint8x8_t __a, uint8x8_t __b) { @@ -10602,6 +10596,12 @@ veor_u32 (uint32x2_t __a, uint32x2_t __b return (uint32x2_t)__builtin_neon_veorv2si ((int32x2_t) __a, (int32x2_t) __b, 0); } +__extension__ static __inline int64x1_t __attribute__ ((__always_inline__)) +veor_s64 (int64x1_t __a, int64x1_t __b) +{ + return (int64x1_t)__builtin_neon_veordi (__a, __b, 1); +} + __extension__ static __inline uint64x1_t __attribute__ ((__always_inline__)) veor_u64 (uint64x1_t __a, uint64x1_t __b) { @@ -10674,12 +10674,6 @@ vbic_s32 (int32x2_t __a, int32x2_t __b) return (int32x2_t)__builtin_neon_vbicv2si (__a, __b, 1); } -__extension__ static __inline int64x1_t __attribute__ ((__always_inline__)) -vbic_s64 (int64x1_t __a, int64x1_t __b) -{ - return (int64x1_t)__builtin_neon_vbicdi (__a, __b, 1); -} - __extension__ static __inline uint8x8_t __attribute__ ((__always_inline__)) vbic_u8 (uint8x8_t __a, uint8x8_t __b) { @@ -10698,6 +10692,12 @@ vbic_u32 (uint32x2_t __a, uint32x2_t __b return (uint32x2_t)__builtin_neon_vbicv2si ((int32x2_t) __a, (int32x2_t) __b, 0); } +__extension__ static __inline int64x1_t __attribute__ ((__always_inline__)) +vbic_s64 (int64x1_t __a, int64x1_t __b) +{ + return (int64x1_t)__builtin_neon_vbicdi (__a, __b, 1); +} + __extension__ static __inline uint64x1_t __attribute__ ((__always_inline__)) vbic_u64 (uint64x1_t __a, uint64x1_t __b) { @@ -10770,12 +10770,6 @@ vorn_s32 (int32x2_t __a, int32x2_t __b) return (int32x2_t)__builtin_neon_vornv2si (__a, __b, 1); } -__extension__ static __inline int64x1_t __attribute__ ((__always_inline__)) -vorn_s64 (int64x1_t __a, int64x1_t __b) -{ - return (int64x1_t)__builtin_neon_vorndi (__a, __b, 1); -} - __extension__ static __inline uint8x8_t __attribute__ ((__always_inline__)) vorn_u8 (uint8x8_t __a, uint8x8_t __b) { @@ -10794,6 +10788,12 @@ vorn_u32 (uint32x2_t __a, uint32x2_t __b return (uint32x2_t)__builtin_neon_vornv2si ((int32x2_t) __a, (int32x2_t) __b, 0); } +__extension__ static __inline int64x1_t __attribute__ ((__always_inline__)) +vorn_s64 (int64x1_t __a, int64x1_t __b) +{ + return (int64x1_t)__builtin_neon_vorndi (__a, __b, 1); +} + __extension__ static __inline uint64x1_t __attribute__ ((__always_inline__)) vorn_u64 (uint64x1_t __a, uint64x1_t __b) { Index: gcc/config/arm/arm.md =================================================================== --- gcc/config/arm/arm.md (revision 161753) +++ gcc/config/arm/arm.md (working copy) @@ -1810,6 +1810,7 @@ [(match_operand:DI 1 "s_register_operand" "") (match_operand:DI 2 "s_register_operand" "")]))] "TARGET_32BIT && reload_completed + && ! (TARGET_NEON && IS_VFP_REGNUM (REGNO (operands[0]))) && ! IS_IWMMXT_REGNUM (REGNO (operands[0]))" [(set (match_dup 0) (match_op_dup:SI 6 [(match_dup 1) (match_dup 2)])) (set (match_dup 3) (match_op_dup:SI 6 [(match_dup 4) (match_dup 5)]))] @@ -1883,11 +1884,19 @@ }" ) -(define_insn "anddi3" +(define_expand "anddi3" + [(set (match_operand:DI 0 "s_register_operand" "") + (and:DI (match_operand:DI 1 "s_register_operand" "") + (match_operand:DI 2 "neon_inv_logic_op2" "")))] + "TARGET_32BIT" + "" +) + +(define_insn "*anddi3_insn" [(set (match_operand:DI 0 "s_register_operand" "=&r,&r") (and:DI (match_operand:DI 1 "s_register_operand" "%0,r") (match_operand:DI 2 "s_register_operand" "r,r")))] - "TARGET_32BIT && ! TARGET_IWMMXT" + "TARGET_32BIT && !TARGET_IWMMXT && !TARGET_NEON" "#" [(set_attr "length" "8")] ) @@ -2487,7 +2496,9 @@ (match_operand:DI 2 "s_register_operand" "r,0")))] "TARGET_32BIT" "#" - "TARGET_32BIT && reload_completed && ! IS_IWMMXT_REGNUM (REGNO (operands[0]))" + "TARGET_32BIT && reload_completed + && ! (TARGET_NEON && IS_VFP_REGNUM (REGNO (operands[0]))) + && ! IS_IWMMXT_REGNUM (REGNO (operands[0]))" [(set (match_dup 0) (and:SI (not:SI (match_dup 1)) (match_dup 2))) (set (match_dup 3) (and:SI (not:SI (match_dup 4)) (match_dup 5)))] " @@ -2611,11 +2622,19 @@ [(set_attr "conds" "set")] ) -(define_insn "iordi3" +(define_expand "iordi3" + [(set (match_operand:DI 0 "s_register_operand" "") + (ior:DI (match_operand:DI 1 "s_register_operand" "") + (match_operand:DI 2 "neon_logic_op2" "")))] + "TARGET_32BIT" + "" +) + +(define_insn "*iordi3_insn" [(set (match_operand:DI 0 "s_register_operand" "=&r,&r") (ior:DI (match_operand:DI 1 "s_register_operand" "%0,r") (match_operand:DI 2 "s_register_operand" "r,r")))] - "TARGET_32BIT && ! TARGET_IWMMXT" + "TARGET_32BIT && !TARGET_IWMMXT && !TARGET_NEON" "#" [(set_attr "length" "8") (set_attr "predicable" "yes")] @@ -2741,11 +2760,19 @@ [(set_attr "conds" "set")] ) -(define_insn "xordi3" +(define_expand "xordi3" + [(set (match_operand:DI 0 "s_register_operand" "") + (xor:DI (match_operand:DI 1 "s_register_operand" "") + (match_operand:DI 2 "s_register_operand" "")))] + "TARGET_32BIT" + "" +) + +(define_insn "*xordi3_insn" [(set (match_operand:DI 0 "s_register_operand" "=&r,&r") (xor:DI (match_operand:DI 1 "s_register_operand" "%0,r") (match_operand:DI 2 "s_register_operand" "r,r")))] - "TARGET_32BIT && !TARGET_IWMMXT" + "TARGET_32BIT && !TARGET_IWMMXT && !TARGET_NEON" "#" [(set_attr "length" "8") (set_attr "predicable" "yes")] Index: gcc/doc/arm-neon-intrinsics.texi =================================================================== --- gcc/doc/arm-neon-intrinsics.texi (revision 161753) +++ gcc/doc/arm-neon-intrinsics.texi (working copy) @@ -9713,13 +9713,11 @@ @itemize @bullet @item uint64x1_t vand_u64 (uint64x1_t, uint64x1_t) -@*@emph{Form of expected instruction(s):} @code{vand @var{d0}, @var{d0}, @var{d0}} @end itemize @itemize @bullet @item int64x1_t vand_s64 (int64x1_t, int64x1_t) -@*@emph{Form of expected instruction(s):} @code{vand @var{d0}, @var{d0}, @var{d0}} @end itemize @@ -9813,13 +9811,11 @@ @itemize @bullet @item uint64x1_t vorr_u64 (uint64x1_t, uint64x1_t) -@*@emph{Form of expected instruction(s):} @code{vorr @var{d0}, @var{d0}, @var{d0}} @end itemize @itemize @bullet @item int64x1_t vorr_s64 (int64x1_t, int64x1_t) -@*@emph{Form of expected instruction(s):} @code{vorr @var{d0}, @var{d0}, @var{d0}} @end itemize @@ -9913,13 +9909,11 @@ @itemize @bullet @item uint64x1_t veor_u64 (uint64x1_t, uint64x1_t) -@*@emph{Form of expected instruction(s):} @code{veor @var{d0}, @var{d0}, @var{d0}} @end itemize @itemize @bullet @item int64x1_t veor_s64 (int64x1_t, int64x1_t) -@*@emph{Form of expected instruction(s):} @code{veor @var{d0}, @var{d0}, @var{d0}} @end itemize @@ -10013,13 +10007,11 @@ @itemize @bullet @item uint64x1_t vbic_u64 (uint64x1_t, uint64x1_t) -@*@emph{Form of expected instruction(s):} @code{vbic @var{d0}, @var{d0}, @var{d0}} @end itemize @itemize @bullet @item int64x1_t vbic_s64 (int64x1_t, int64x1_t) -@*@emph{Form of expected instruction(s):} @code{vbic @var{d0}, @var{d0}, @var{d0}} @end itemize @@ -10113,13 +10105,11 @@ @itemize @bullet @item uint64x1_t vorn_u64 (uint64x1_t, uint64x1_t) -@*@emph{Form of expected instruction(s):} @code{vorn @var{d0}, @var{d0}, @var{d0}} @end itemize @itemize @bullet @item int64x1_t vorn_s64 (int64x1_t, int64x1_t) -@*@emph{Form of expected instruction(s):} @code{vorn @var{d0}, @var{d0}, @var{d0}} @end itemize Index: gcc/testsuite/gcc.target/arm/neon/vbicu64.c =================================================================== --- gcc/testsuite/gcc.target/arm/neon/vbicu64.c (revision 161753) +++ gcc/testsuite/gcc.target/arm/neon/vbicu64.c (working copy) @@ -17,5 +17,4 @@ void test_vbicu64 (void) out_uint64x1_t = vbic_u64 (arg0_uint64x1_t, arg1_uint64x1_t); } -/* { dg-final { scan-assembler "vbic\[ \]+\[dD\]\[0-9\]+, \[dD\]\[0-9\]+, \[dD\]\[0-9\]+!?\(\[ \]+@\[a-zA-Z0-9 \]+\)?\n" } } */ /* { dg-final { cleanup-saved-temps } } */ Index: gcc/testsuite/gcc.target/arm/neon/vorns64.c =================================================================== --- gcc/testsuite/gcc.target/arm/neon/vorns64.c (revision 161753) +++ gcc/testsuite/gcc.target/arm/neon/vorns64.c (working copy) @@ -17,5 +17,4 @@ void test_vorns64 (void) out_int64x1_t = vorn_s64 (arg0_int64x1_t, arg1_int64x1_t); } -/* { dg-final { scan-assembler "vorn\[ \]+\[dD\]\[0-9\]+, \[dD\]\[0-9\]+, \[dD\]\[0-9\]+!?\(\[ \]+@\[a-zA-Z0-9 \]+\)?\n" } } */ /* { dg-final { cleanup-saved-temps } } */ Index: gcc/testsuite/gcc.target/arm/neon/vornu64.c =================================================================== --- gcc/testsuite/gcc.target/arm/neon/vornu64.c (revision 161753) +++ gcc/testsuite/gcc.target/arm/neon/vornu64.c (working copy) @@ -17,5 +17,4 @@ void test_vornu64 (void) out_uint64x1_t = vorn_u64 (arg0_uint64x1_t, arg1_uint64x1_t); } -/* { dg-final { scan-assembler "vorn\[ \]+\[dD\]\[0-9\]+, \[dD\]\[0-9\]+, \[dD\]\[0-9\]+!?\(\[ \]+@\[a-zA-Z0-9 \]+\)?\n" } } */ /* { dg-final { cleanup-saved-temps } } */ Index: gcc/testsuite/gcc.target/arm/neon/vands64.c =================================================================== --- gcc/testsuite/gcc.target/arm/neon/vands64.c (revision 161753) +++ gcc/testsuite/gcc.target/arm/neon/vands64.c (working copy) @@ -17,5 +17,4 @@ void test_vands64 (void) out_int64x1_t = vand_s64 (arg0_int64x1_t, arg1_int64x1_t); } -/* { dg-final { scan-assembler "vand\[ \]+\[dD\]\[0-9\]+, \[dD\]\[0-9\]+, \[dD\]\[0-9\]+!?\(\[ \]+@\[a-zA-Z0-9 \]+\)?\n" } } */ /* { dg-final { cleanup-saved-temps } } */ Index: gcc/testsuite/gcc.target/arm/neon/vorrs64.c =================================================================== --- gcc/testsuite/gcc.target/arm/neon/vorrs64.c (revision 161753) +++ gcc/testsuite/gcc.target/arm/neon/vorrs64.c (working copy) @@ -17,5 +17,4 @@ void test_vorrs64 (void) out_int64x1_t = vorr_s64 (arg0_int64x1_t, arg1_int64x1_t); } -/* { dg-final { scan-assembler "vorr\[ \]+\[dD\]\[0-9\]+, \[dD\]\[0-9\]+, \[dD\]\[0-9\]+!?\(\[ \]+@\[a-zA-Z0-9 \]+\)?\n" } } */ /* { dg-final { cleanup-saved-temps } } */ Index: gcc/testsuite/gcc.target/arm/neon/vandu64.c =================================================================== --- gcc/testsuite/gcc.target/arm/neon/vandu64.c (revision 161753) +++ gcc/testsuite/gcc.target/arm/neon/vandu64.c (working copy) @@ -17,5 +17,4 @@ void test_vandu64 (void) out_uint64x1_t = vand_u64 (arg0_uint64x1_t, arg1_uint64x1_t); } -/* { dg-final { scan-assembler "vand\[ \]+\[dD\]\[0-9\]+, \[dD\]\[0-9\]+, \[dD\]\[0-9\]+!?\(\[ \]+@\[a-zA-Z0-9 \]+\)?\n" } } */ /* { dg-final { cleanup-saved-temps } } */ Index: gcc/testsuite/gcc.target/arm/neon/veors64.c =================================================================== --- gcc/testsuite/gcc.target/arm/neon/veors64.c (revision 161753) +++ gcc/testsuite/gcc.target/arm/neon/veors64.c (working copy) @@ -17,5 +17,4 @@ void test_veors64 (void) out_int64x1_t = veor_s64 (arg0_int64x1_t, arg1_int64x1_t); } -/* { dg-final { scan-assembler "veor\[ \]+\[dD\]\[0-9\]+, \[dD\]\[0-9\]+, \[dD\]\[0-9\]+!?\(\[ \]+@\[a-zA-Z0-9 \]+\)?\n" } } */ /* { dg-final { cleanup-saved-temps } } */ Index: gcc/testsuite/gcc.target/arm/neon/vorru64.c =================================================================== --- gcc/testsuite/gcc.target/arm/neon/vorru64.c (revision 161753) +++ gcc/testsuite/gcc.target/arm/neon/vorru64.c (working copy) @@ -17,5 +17,4 @@ void test_vorru64 (void) out_uint64x1_t = vorr_u64 (arg0_uint64x1_t, arg1_uint64x1_t); } -/* { dg-final { scan-assembler "vorr\[ \]+\[dD\]\[0-9\]+, \[dD\]\[0-9\]+, \[dD\]\[0-9\]+!?\(\[ \]+@\[a-zA-Z0-9 \]+\)?\n" } } */ /* { dg-final { cleanup-saved-temps } } */ Index: gcc/testsuite/gcc.target/arm/neon/veoru64.c =================================================================== --- gcc/testsuite/gcc.target/arm/neon/veoru64.c (revision 161753) +++ gcc/testsuite/gcc.target/arm/neon/veoru64.c (working copy) @@ -17,5 +17,4 @@ void test_veoru64 (void) out_uint64x1_t = veor_u64 (arg0_uint64x1_t, arg1_uint64x1_t); } -/* { dg-final { scan-assembler "veor\[ \]+\[dD\]\[0-9\]+, \[dD\]\[0-9\]+, \[dD\]\[0-9\]+!?\(\[ \]+@\[a-zA-Z0-9 \]+\)?\n" } } */ /* { dg-final { cleanup-saved-temps } } */ Index: gcc/testsuite/gcc.target/arm/neon/vbics64.c =================================================================== --- gcc/testsuite/gcc.target/arm/neon/vbics64.c (revision 161753) +++ gcc/testsuite/gcc.target/arm/neon/vbics64.c (working copy) @@ -17,5 +17,4 @@ void test_vbics64 (void) out_int64x1_t = vbic_s64 (arg0_int64x1_t, arg1_int64x1_t); } -/* { dg-final { scan-assembler "vbic\[ \]+\[dD\]\[0-9\]+, \[dD\]\[0-9\]+, \[dD\]\[0-9\]+!?\(\[ \]+@\[a-zA-Z0-9 \]+\)?\n" } } */ /* { dg-final { cleanup-saved-temps } } */ Index: gcc/testsuite/gcc.target/arm/neon-vands64.c =================================================================== --- gcc/testsuite/gcc.target/arm/neon-vands64.c (revision 0) +++ gcc/testsuite/gcc.target/arm/neon-vands64.c (revision 0) @@ -0,0 +1,21 @@ +/* Test the `vand_s64' ARM Neon intrinsic. */ + +/* { dg-do run } */ +/* { dg-require-effective-target arm_neon_hw } */ +/* { dg-options "-O0" } */ +/* { dg-add-options arm_neon } */ + +#include "arm_neon.h" +#include + +int main (void) +{ + int64x1_t out_int64x1_t = 0; + int64x1_t arg0_int64x1_t = (int64x1_t)0xdeadbeef00000000LL; + int64x1_t arg1_int64x1_t = (int64x1_t)0xdead00000000beefLL; + + out_int64x1_t = vand_s64 (arg0_int64x1_t, arg1_int64x1_t); + if (out_int64x1_t != (int64x1_t)0xdead000000000000LL) + abort(); + return 0; +} Index: gcc/testsuite/gcc.target/arm/neon-vandu64.c =================================================================== --- gcc/testsuite/gcc.target/arm/neon-vandu64.c (revision 0) +++ gcc/testsuite/gcc.target/arm/neon-vandu64.c (revision 0) @@ -0,0 +1,21 @@ +/* Test the `vand_u64' ARM Neon intrinsic. */ + +/* { dg-do run } */ +/* { dg-require-effective-target arm_neon_hw } */ +/* { dg-options "-O0" } */ +/* { dg-add-options arm_neon } */ + +#include "arm_neon.h" +#include + +int main (void) +{ + uint64x1_t out_uint64x1_t = 0; + uint64x1_t arg0_uint64x1_t = (uint64x1_t)0xdeadbeef00000000LL; + uint64x1_t arg1_uint64x1_t = (uint64x1_t)0xdead00000000beefLL; + + out_uint64x1_t = vand_u64 (arg0_uint64x1_t, arg1_uint64x1_t); + if (out_uint64x1_t != (uint64x1_t)0xdead000000000000LL) + abort(); + return 0; +} Index: gcc/testsuite/gcc.target/arm/neon-veors64.c =================================================================== --- gcc/testsuite/gcc.target/arm/neon-veors64.c (revision 0) +++ gcc/testsuite/gcc.target/arm/neon-veors64.c (revision 0) @@ -0,0 +1,21 @@ +/* Test the `veor_s64' ARM Neon intrinsic. */ + +/* { dg-do run } */ +/* { dg-require-effective-target arm_neon_hw } */ +/* { dg-options "-O0" } */ +/* { dg-add-options arm_neon } */ + +#include "arm_neon.h" +#include + +int main (void) +{ + int64x1_t out_int64x1_t = 0; + int64x1_t arg0_int64x1_t = (int64x1_t)0xdeadbeef00000000LL; + int64x1_t arg1_int64x1_t = (int64x1_t)0xdead00000000beefLL; + + out_int64x1_t = veor_s64 (arg0_int64x1_t, arg1_int64x1_t); + if (out_int64x1_t != (int64x1_t)0x0000beef0000beefLL) + abort(); + return 0; +} Index: gcc/testsuite/gcc.target/arm/neon-veoru64.c =================================================================== --- gcc/testsuite/gcc.target/arm/neon-veoru64.c (revision 0) +++ gcc/testsuite/gcc.target/arm/neon-veoru64.c (revision 0) @@ -0,0 +1,21 @@ +/* Test the `veor_u64' ARM Neon intrinsic. */ + +/* { dg-do run } */ +/* { dg-require-effective-target arm_neon_hw } */ +/* { dg-options "-O0" } */ +/* { dg-add-options arm_neon } */ + +#include "arm_neon.h" +#include + +int main (void) +{ + uint64x1_t out_uint64x1_t = 0; + uint64x1_t arg0_uint64x1_t = (uint64x1_t)0xdeadbeef00000000LL; + uint64x1_t arg1_uint64x1_t = (uint64x1_t)0xdead00000000beefLL; + + out_uint64x1_t = veor_u64 (arg0_uint64x1_t, arg1_uint64x1_t); + if (out_uint64x1_t != (uint64x1_t)0x0000beef0000beefLL) + abort(); + return 0; +} Index: gcc/testsuite/gcc.target/arm/neon-vorrs64.c =================================================================== --- gcc/testsuite/gcc.target/arm/neon-vorrs64.c (revision 0) +++ gcc/testsuite/gcc.target/arm/neon-vorrs64.c (revision 0) @@ -0,0 +1,21 @@ +/* Test the `vorr_s64' ARM Neon intrinsic. */ + +/* { dg-do run } */ +/* { dg-require-effective-target arm_neon_hw } */ +/* { dg-options "-O0" } */ +/* { dg-add-options arm_neon } */ + +#include "arm_neon.h" +#include + +int main (void) +{ + int64x1_t out_int64x1_t = 0; + int64x1_t arg0_int64x1_t = (int64x1_t)0xdeadbeef00000000LL; + int64x1_t arg1_int64x1_t = (int64x1_t)0xdead00000000beefLL; + + out_int64x1_t = vorr_s64 (arg0_int64x1_t, arg1_int64x1_t); + if (out_int64x1_t != (int64x1_t)0xdeadbeef0000beefLL) + abort(); + return 0; +} Index: gcc/testsuite/gcc.target/arm/neon-vorru64.c =================================================================== --- gcc/testsuite/gcc.target/arm/neon-vorru64.c (revision 0) +++ gcc/testsuite/gcc.target/arm/neon-vorru64.c (revision 0) @@ -0,0 +1,21 @@ +/* Test the `vorr_u64' ARM Neon intrinsic. */ + +/* { dg-do run } */ +/* { dg-require-effective-target arm_neon_hw } */ +/* { dg-options "-O0" } */ +/* { dg-add-options arm_neon } */ + +#include "arm_neon.h" +#include + +int main (void) +{ + uint64x1_t out_uint64x1_t = 0; + uint64x1_t arg0_uint64x1_t = (uint64x1_t)0xdeadbeef00000000LL; + uint64x1_t arg1_uint64x1_t = (uint64x1_t)0xdead00000000beefLL; + + out_uint64x1_t = vorr_u64 (arg0_uint64x1_t, arg1_uint64x1_t); + if (out_uint64x1_t != (uint64x1_t)0xdeadbeef0000beefLL) + abort(); + return 0; +} Index: gcc/testsuite/gcc.target/arm/neon-vbics64.c =================================================================== --- gcc/testsuite/gcc.target/arm/neon-vbics64.c (revision 0) +++ gcc/testsuite/gcc.target/arm/neon-vbics64.c (revision 0) @@ -0,0 +1,21 @@ +/* Test the `vbic_s64' ARM Neon intrinsic. */ + +/* { dg-do run } */ +/* { dg-require-effective-target arm_neon_hw } */ +/* { dg-options "-O0" } */ +/* { dg-add-options arm_neon } */ + +#include "arm_neon.h" +#include + +int main (void) +{ + int64x1_t out_int64x1_t = 0; + int64x1_t arg0_int64x1_t = (int64x1_t)0xdeadbeef00000000LL; + int64x1_t arg1_int64x1_t = (int64x1_t)(~0xdead00000000beefLL); + + out_int64x1_t = vbic_s64 (arg0_int64x1_t, arg1_int64x1_t); + if (out_int64x1_t != (int64x1_t)0xdead000000000000LL) + abort(); + return 0; +} Index: gcc/testsuite/gcc.target/arm/neon-vbicu64.c =================================================================== --- gcc/testsuite/gcc.target/arm/neon-vbicu64.c (revision 0) +++ gcc/testsuite/gcc.target/arm/neon-vbicu64.c (revision 0) @@ -0,0 +1,21 @@ +/* Test the `vbic_u64' ARM Neon intrinsic. */ + +/* { dg-do run } */ +/* { dg-require-effective-target arm_neon_hw } */ +/* { dg-options "-O0" } */ +/* { dg-add-options arm_neon } */ + +#include "arm_neon.h" +#include + +int main (void) +{ + uint64x1_t out_uint64x1_t = 0; + uint64x1_t arg0_uint64x1_t = (uint64x1_t)0xdeadbeef00000000LL; + uint64x1_t arg1_uint64x1_t = (uint64x1_t)(~0xdead00000000beefLL); + + out_uint64x1_t = vbic_u64 (arg0_uint64x1_t, arg1_uint64x1_t); + if (out_uint64x1_t != (uint64x1_t)0xdead000000000000LL) + abort(); + return 0; +} Index: gcc/testsuite/gcc.target/arm/neon-vorns64.c =================================================================== --- gcc/testsuite/gcc.target/arm/neon-vorns64.c (revision 0) +++ gcc/testsuite/gcc.target/arm/neon-vorns64.c (revision 0) @@ -0,0 +1,21 @@ +/* Test the `vorn_s64' ARM Neon intrinsic. */ + +/* { dg-do run } */ +/* { dg-require-effective-target arm_neon_hw } */ +/* { dg-options "-O0" } */ +/* { dg-add-options arm_neon } */ + +#include "arm_neon.h" +#include + +int main (void) +{ + int64x1_t out_int64x1_t = 0; + int64x1_t arg0_int64x1_t = (int64x1_t)0xdeadbeef00000000LL; + int64x1_t arg1_int64x1_t = (int64x1_t)(~0xdead00000000beefLL); + + out_int64x1_t = vorn_s64 (arg0_int64x1_t, arg1_int64x1_t); + if (out_int64x1_t != (int64x1_t)0xdeadbeef0000beefLL) + abort(); + return 0; +} Index: gcc/testsuite/gcc.target/arm/neon-vornu64.c =================================================================== --- gcc/testsuite/gcc.target/arm/neon-vornu64.c (revision 0) +++ gcc/testsuite/gcc.target/arm/neon-vornu64.c (revision 0) @@ -0,0 +1,21 @@ +/* Test the `vorn_u64' ARM Neon intrinsic. */ + +/* { dg-do run } */ +/* { dg-require-effective-target arm_neon_hw } */ +/* { dg-options "-O0" } */ +/* { dg-add-options arm_neon } */ + +#include "arm_neon.h" +#include + +int main (void) +{ + uint64x1_t out_uint64x1_t = 0; + uint64x1_t arg0_uint64x1_t = (uint64x1_t)0xdeadbeef00000000LL; + uint64x1_t arg1_uint64x1_t = (uint64x1_t)(~0xdead00000000beefLL); + + out_uint64x1_t = vorn_u64 (arg0_uint64x1_t, arg1_uint64x1_t); + if (out_uint64x1_t != (uint64x1_t)0xdeadbeef0000beefLL) + abort(); + return 0; +}