From: Richard Henderson
Date: Sat, 13 Nov 2010 16:16:07 -0800
To: Michael Meissner, GCC Patches, dje.gcc@gmail.com, richard.guenther@gmail.com
Subject: Re: [patch 4/N][rs6000, cft] -mfused-add cleanup
Message-ID: <4CDF2A47.7050702@redhat.com>
In-Reply-To: <4CDF29CE.4010405@redhat.com>
X-Patchwork-Id: 71078

On 11/13/2010 04:14 PM, Richard Henderson wrote:
> On 11/11/2010 12:06 PM, Michael Meissner wrote:
>> gcc.dg/var-expand3.c is a function using Altivec fused multiply/add
>> instructions, and the loop unroller would unroll loops with (plus (mult (..)))
>> but not (fma ...).  The code is analyze_insn_to_expand_var in loop-unroll.c
>> where it doesn't realize fma is a loop accumulator.  I imagine it is probably
>> simple to fix this function.
>
> Fixed by
>   http://gcc.gnu.org/ml/gcc-patches/2010-11/msg01248.html
> as yet unreviewed.
>
>> gcc.target/powerpc/ppc-fma-{2,4}.c are failing because they use
>> -mno-fused-madd, and the compiler gives a new warning that isn't accounted
>> for.  Obviously we just need to switch to use -ffp-contract=off for these two
>> tests (or add the warning message).
>
> Fixed here.
>
>> gfortran.fortran-torture/execute/in-pack.f90 is getting an internal compiler
>> error when -maltivec is used:
>>
>> /home/meissner/fsf-src/rth/gcc/testsuite/gfortran.fortran-torture/execute/in-pack.f90: In function 'csub4':
>> /home/meissner/fsf-src/rth/gcc/testsuite/gfortran.fortran-torture/execute/in-pack.f90:59:0: error: unrecognizable insn:
>> (insn 395 394 396 45 (set (reg:V4SF 592)
>>         (fma:V4SF (reg:V4SF 363 [ vect_var_.731 ])
>>             (reg:V4SF 593)
>>             (reg:V4SI 594))) /home/meissner/fsf-src/rth/gcc/testsuite/gfortran.fortran-torture/execute/in-pack.f90:55 -1
>>      (expr_list:REG_EQUAL (mult:V4SF (reg:V4SF 363 [ vect_var_.731 ])
>>             (reg:V4SF 593))
>>         (nil)))
>
> Fixed here.  A silly mistake converting the altivec mulv4sf pattern.

Dammit.

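
For readers following the var-expand3.c item above, here is a minimal C sketch of what accumulator expansion does to a multiply-add reduction. It is illustrative only and not part of the patch: dot_single and dot_expanded are hypothetical names, and the second function is a hand-written approximation of the unroller's output, assuming an unroll factor of 2. On a target with fused multiply-add and contraction enabled, each `acc += a[i] * b[i]` may become a single (fma ...) with the accumulator in the ADD position, which is the only FMA form the loop-unroll.c change below accepts.

```c
#include <assert.h>

/* One accumulator: every iteration depends on the previous one. */
double dot_single(const double *a, const double *b, int n)
{
    assert(n >= 0);
    double acc = 0.0;
    for (int i = 0; i < n; i++)
        acc += a[i] * b[i];          /* candidate for (fma a b acc) */
    return acc;
}

/* Unrolled by 2 with the accumulator expanded: the extra copy starts
   at zero and the partial sums are added together after the loop. */
double dot_expanded(const double *a, const double *b, int n)
{
    assert(n >= 0);
    double acc0 = 0.0, acc1 = 0.0;
    int i;
    for (i = 0; i + 1 < n; i += 2) {
        acc0 += a[i] * b[i];
        acc1 += a[i + 1] * b[i + 1];
    }
    if (i < n)                       /* leftover element when n is odd */
        acc0 += a[i] * b[i];
    return acc0 + acc1;
}
```

The two functions add the products in a different order, so for general inputs they can differ in the last bits. That is why the patch only performs the expansion for floating-point modes under flag_associative_math, and additionally requires flag_unsafe_math_optimizations for FMA.
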
Index: gcc/optabs.c
===================================================================
--- gcc/optabs.c (revision 166635)
+++ gcc/optabs.c (working copy)
@@ -6180,6 +6180,10 @@
   init_optab (umax_optab, UMAX);
   init_optab (pow_optab, UNKNOWN);
   init_optab (atan2_optab, UNKNOWN);
+  init_optab (fma_optab, FMA);
+  init_optab (fms_optab, UNKNOWN);
+  init_optab (fnma_optab, UNKNOWN);
+  init_optab (fnms_optab, UNKNOWN);
 
   /* These three have codes assigned exclusively for the sake of
      have_insn_for.  */
Index: gcc/testsuite/gcc.target/powerpc/ppc-fma-2.c
===================================================================
--- gcc/testsuite/gcc.target/powerpc/ppc-fma-2.c (revision 166635)
+++ gcc/testsuite/gcc.target/powerpc/ppc-fma-2.c (working copy)
@@ -1,7 +1,7 @@
 /* { dg-do compile { target { powerpc*-*-* } } } */
 /* { dg-skip-if "" { powerpc*-*-darwin* } { "*" } { "" } } */
 /* { dg-require-effective-target powerpc_vsx_ok } */
-/* { dg-options "-O3 -ftree-vectorize -mcpu=power7 -ffast-math -mno-fused-madd" } */
+/* { dg-options "-O3 -ftree-vectorize -mcpu=power7 -ffast-math -ffp-contract=off" } */
 /* { dg-final { scan-assembler-times "xvmadd" 2 } } */
 /* { dg-final { scan-assembler-times "xsmadd" 1 } } */
 /* { dg-final { scan-assembler-times "fmadds" 1 } } */
Index: gcc/testsuite/gcc.target/powerpc/ppc-fma-4.c
===================================================================
--- gcc/testsuite/gcc.target/powerpc/ppc-fma-4.c (revision 166635)
+++ gcc/testsuite/gcc.target/powerpc/ppc-fma-4.c (working copy)
@@ -1,7 +1,7 @@
 /* { dg-do compile { target { powerpc*-*-* } } } */
 /* { dg-skip-if "" { powerpc*-*-darwin* } { "*" } { "" } } */
 /* { dg-require-effective-target powerpc_altivec_ok } */
-/* { dg-options "-O3 -ftree-vectorize -mcpu=power6 -maltivec -ffast-math -mno-fused-madd" } */
+/* { dg-options "-O3 -ftree-vectorize -mcpu=power6 -maltivec -ffast-math -ffp-contract=off" } */
 /* { dg-final { scan-assembler-times "vmaddfp" 1 } } */
 /* { dg-final { scan-assembler-times "fmadd " 1 } } */
 /* { dg-final { scan-assembler-times "fmadds" 1 } } */
Index: gcc/testsuite/gcc.target/i386/sse-24.c
===================================================================
--- gcc/testsuite/gcc.target/i386/sse-24.c (revision 166635)
+++ gcc/testsuite/gcc.target/i386/sse-24.c (working copy)
@@ -1,5 +1,5 @@
 /* PR target/44338 */
 /* { dg-do compile } */
-/* { dg-options "-O2 -Werror-implicit-function-declaration -march=k8 -mno-fused-madd" } */
+/* { dg-options "-O2 -Werror-implicit-function-declaration -march=k8 -ffp-contract=off" } */
 
 #include "sse-23.c"
Index: gcc/testsuite/gcc.target/ia64/mno-fused-madd-vect.c
===================================================================
--- gcc/testsuite/gcc.target/ia64/mno-fused-madd-vect.c (revision 166635)
+++ gcc/testsuite/gcc.target/ia64/mno-fused-madd-vect.c (working copy)
@@ -1,5 +1,5 @@
 /* { dg-do compile */
-/* { dg-options "-O2 -mno-fused-madd -ftree-vectorize" } */
+/* { dg-options "-O2 -ffp-contract=off -ftree-vectorize" } */
 /* { dg-final { scan-assembler "fpmpy" } } */
 
 /* fpma and fpms will show in either way because there are no
Index: gcc/testsuite/gcc.target/ia64/mno-fused-madd.c
===================================================================
--- gcc/testsuite/gcc.target/ia64/mno-fused-madd.c (revision 166635)
+++ gcc/testsuite/gcc.target/ia64/mno-fused-madd.c (working copy)
@@ -1,5 +1,5 @@
 /* { dg-do compile */
-/* { dg-options "-O2 -mno-fused-madd" } */
+/* { dg-options "-O2 -ffp-contract=off" } */
 /* { dg-final { scan-assembler-not "fma" } } */
 /* { dg-final { scan-assembler-not "fms" } } */
 /* { dg-final { scan-assembler-not "fnma" } } */
Index: gcc/loop-unroll.c
===================================================================
--- gcc/loop-unroll.c (revision 166635)
+++ gcc/loop-unroll.c (working copy)
@@ -1616,10 +1616,10 @@
 static struct var_to_expand *
 analyze_insn_to_expand_var (struct loop *loop, rtx insn)
 {
-  rtx set, dest, src, op1, op2, something;
+  rtx set, dest, src;
   struct var_to_expand *ves;
-  enum machine_mode mode1, mode2;
   unsigned accum_pos;
+  enum rtx_code code;
   int debug_uses = 0;
 
   set = single_set (insn);
@@ -1628,12 +1628,20 @@
 
   dest = SET_DEST (set);
   src = SET_SRC (set);
+  code = GET_CODE (src);
 
-  if (GET_CODE (src) != PLUS
-      && GET_CODE (src) != MINUS
-      && GET_CODE (src) != MULT)
+  if (code != PLUS && code != MINUS && code != MULT && code != FMA)
     return NULL;
 
+  if (FLOAT_MODE_P (GET_MODE (dest)))
+    {
+      if (!flag_associative_math)
+        return NULL;
+      /* In the case of FMA, we're also changing the rounding.  */
+      if (code == FMA && !flag_unsafe_math_optimizations)
+        return NULL;
+    }
+
   /* Hmm, this is a bit paradoxical.  We know that INSN is a valid insn
      in MD.  But if there is no optab to generate the insn, we can not
      perform the variable expansion.  This can happen if an MD provides
@@ -1643,54 +1651,57 @@
      So we check have_insn_for which looks for an optab for the operation
      in SRC.  If it doesn't exist, we can't perform the expansion even
      though INSN is valid.  */
-  if (!have_insn_for (GET_CODE (src), GET_MODE (src)))
+  if (!have_insn_for (code, GET_MODE (src)))
     return NULL;
 
-  op1 = XEXP (src, 0);
-  op2 = XEXP (src, 1);
-
   if (!REG_P (dest)
       && !(GET_CODE (dest) == SUBREG
            && REG_P (SUBREG_REG (dest))))
     return NULL;
 
-  if (rtx_equal_p (dest, op1))
+  /* Find the accumulator use within the operation.  */
+  if (code == FMA)
+    {
+      /* We only support accumulation via FMA in the ADD position.  */
+      if (!rtx_equal_p (dest, XEXP (src, 2)))
+        return NULL;
+      accum_pos = 2;
+    }
+  else if (rtx_equal_p (dest, XEXP (src, 0)))
     accum_pos = 0;
-  else if (rtx_equal_p (dest, op2))
-    accum_pos = 1;
+  else if (rtx_equal_p (dest, XEXP (src, 1)))
+    {
+      /* The method of expansion that we are using; which includes the
+         initialization of the expansions with zero and the summation of
+         the expansions at the end of the computation will yield wrong
+         results for (x = something - x) thus avoid using it in that case.  */
+      if (code == MINUS)
+        return NULL;
+      accum_pos = 1;
+    }
   else
     return NULL;
 
-  /* The method of expansion that we are using; which includes
-     the initialization of the expansions with zero and the summation of
-     the expansions at the end of the computation will yield wrong results
-     for (x = something - x) thus avoid using it in that case.  */
-  if (accum_pos == 1
-      && GET_CODE (src) == MINUS)
-   return NULL;
-
-  something = (accum_pos == 0) ? op2 : op1;
-
-  if (rtx_referenced_p (dest, something))
+  /* It must not otherwise be used.  */
+  if (code == FMA)
+    {
+      if (rtx_referenced_p (dest, XEXP (src, 0))
+          || rtx_referenced_p (dest, XEXP (src, 1)))
+        return NULL;
+    }
+  else if (rtx_referenced_p (dest, XEXP (src, 1 - accum_pos)))
     return NULL;
 
+  /* It must be used in exactly one insn.  */
   if (!referenced_in_one_insn_in_loop_p (loop, dest, &debug_uses))
     return NULL;
 
-  mode1 = GET_MODE (dest);
-  mode2 = GET_MODE (something);
-  if ((FLOAT_MODE_P (mode1)
-       || FLOAT_MODE_P (mode2))
-      && !flag_associative_math)
-    return NULL;
-
   if (dump_file)
-  {
-    fprintf (dump_file,
-             "\n;; Expanding Accumulator ");
-    print_rtl (dump_file, dest);
-    fprintf (dump_file, "\n");
-  }
+    {
+      fprintf (dump_file, "\n;; Expanding Accumulator ");
+      print_rtl (dump_file, dest);
+      fprintf (dump_file, "\n");
+    }
 
   if (debug_uses)
     /* Instead of resetting the debug insns, we could replace each
@@ -2123,23 +2134,34 @@
     return;
 
   start_sequence ();
-  if (ve->op == PLUS || ve->op == MINUS)
-    FOR_EACH_VEC_ELT (rtx, ve->var_expansions, i, var)
-      {
-        if (honor_signed_zero_p)
-          zero_init = simplify_gen_unary (NEG, mode, CONST0_RTX (mode), mode);
-        else
-          zero_init = CONST0_RTX (mode);
+  switch (ve->op)
+    {
+    case FMA:
+      /* Note that we only accumulate FMA via the ADD operand.  */
+    case PLUS:
+    case MINUS:
+      FOR_EACH_VEC_ELT (rtx, ve->var_expansions, i, var)
+        {
+          if (honor_signed_zero_p)
+            zero_init = simplify_gen_unary (NEG, mode, CONST0_RTX (mode), mode);
+          else
+            zero_init = CONST0_RTX (mode);
+          emit_move_insn (var, zero_init);
+        }
+      break;
 
-        emit_move_insn (var, zero_init);
-      }
-  else if (ve->op == MULT)
-    FOR_EACH_VEC_ELT (rtx, ve->var_expansions, i, var)
-      {
-        zero_init = CONST1_RTX (GET_MODE (var));
-        emit_move_insn (var, zero_init);
-      }
+    case MULT:
+      FOR_EACH_VEC_ELT (rtx, ve->var_expansions, i, var)
+        {
+          zero_init = CONST1_RTX (GET_MODE (var));
+          emit_move_insn (var, zero_init);
+        }
+      break;
 
+    default:
+      gcc_unreachable ();
+    }
+
   seq = get_insns ();
   end_sequence ();
 
@@ -2165,19 +2187,25 @@
     return;
 
   start_sequence ();
-  if (ve->op == PLUS || ve->op == MINUS)
-    FOR_EACH_VEC_ELT (rtx, ve->var_expansions, i, var)
-      {
-        sum = simplify_gen_binary (PLUS, GET_MODE (ve->reg),
-                                   var, sum);
-      }
-  else if (ve->op == MULT)
-    FOR_EACH_VEC_ELT (rtx, ve->var_expansions, i, var)
-      {
-        sum = simplify_gen_binary (MULT, GET_MODE (ve->reg),
-                                   var, sum);
-      }
+  switch (ve->op)
+    {
+    case FMA:
+      /* Note that we only accumulate FMA via the ADD operand.  */
+    case PLUS:
+    case MINUS:
+      FOR_EACH_VEC_ELT (rtx, ve->var_expansions, i, var)
+        sum = simplify_gen_binary (PLUS, GET_MODE (ve->reg), var, sum);
+      break;
 
+    case MULT:
+      FOR_EACH_VEC_ELT (rtx, ve->var_expansions, i, var)
+        sum = simplify_gen_binary (MULT, GET_MODE (ve->reg), var, sum);
+      break;
+
+    default:
+      gcc_unreachable ();
+    }
+
   expr = force_operand (sum, ve->reg);
   if (expr != ve->reg)
     emit_move_insn (ve->reg, expr);
Index: gcc/config.gcc
===================================================================
--- gcc/config.gcc (revision 166635)
+++ gcc/config.gcc (working copy)
@@ -311,6 +311,7 @@
         cpu_type=i386
         c_target_objs="i386-c.o"
         cxx_target_objs="i386-c.o"
+        extra_options="${extra_options} fused-madd.opt"
         extra_headers="cpuid.h mmintrin.h mm3dnow.h xmmintrin.h emmintrin.h
                        pmmintrin.h tmmintrin.h ammintrin.h smmintrin.h
                        nmmintrin.h bmmintrin.h fma4intrin.h wmmintrin.h
@@ -322,6 +323,7 @@
         cpu_type=i386
         c_target_objs="i386-c.o"
         cxx_target_objs="i386-c.o"
+        extra_options="${extra_options} fused-madd.opt"
         extra_headers="cpuid.h mmintrin.h mm3dnow.h xmmintrin.h emmintrin.h
                        pmmintrin.h tmmintrin.h ammintrin.h smmintrin.h
                        nmmintrin.h bmmintrin.h fma4intrin.h wmmintrin.h
@@ -333,7 +335,7 @@
 ia64-*-*)
         extra_headers=ia64intrin.h
         need_64bit_hwint=yes
-        extra_options="${extra_options} g.opt"
+        extra_options="${extra_options} g.opt fused-madd.opt"
         ;;
 hppa*-*-*)
         cpu_type=pa
@@ -370,11 +372,11 @@
                 cpu_is_64bit=yes
                 ;;
         esac
-        extra_options="${extra_options} g.opt"
+        extra_options="${extra_options} g.opt fused-madd.opt"
         ;;
 rs6000*-*-*)
         need_64bit_hwint=yes
-        extra_options="${extra_options} g.opt"
+        extra_options="${extra_options} g.opt fused-madd.opt"
         ;;
 score*-*-*)
         cpu_type=score
Index: gcc/config/i386/sse.md
===================================================================
--- gcc/config/i386/sse.md (revision 166635)
+++ gcc/config/i386/sse.md (working copy)
@@ -1856,6 +1856,10 @@
 ;;   (set (reg1) (mem (addr1)))
 ;;   (set (reg2) (mult (reg1) (mem (addr2))))
 ;;   (set (reg3) (plus (reg2) (mem (addr3))))
+;;
+;; ??? This is historic, pre-dating the gimple fma transformation.
+;; We could now properly represent that only one memory operand is
+;; allowed and not be penalized during optimization.
 
 ;; Intrinsic FMA operations.
 
@@ -2180,100 +2184,6 @@
 
 ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
 ;;
-;; Non-intrinsic versions, matched when fused-multiply-add is allowed.
-;;
-;; ??? If fused-madd were a generic flag, combine could do this without
-;; needing splitters here in the backend.  Irritatingly, combine won't
-;; recognize many of these with mere splits, since only 3 or more insns
-;; are allowed to split during combine.  Thankfully, there's always a
-;; split_all_insns pass that runs before reload.
-;;
-;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
-
-(define_insn_and_split "*split_fma"
-  [(set (match_operand:FMAMODE 0 "register_operand")
-        (plus:FMAMODE
-          (mult:FMAMODE
-            (match_operand:FMAMODE 1 "nonimmediate_operand")
-            (match_operand:FMAMODE 2 "nonimmediate_operand"))
-          (match_operand:FMAMODE 3 "nonimmediate_operand")))]
-  "TARGET_SSE_MATH && TARGET_FUSED_MADD
-   && (TARGET_FMA || TARGET_FMA4)
-   && !(reload_in_progress || reload_completed)"
-  { gcc_unreachable (); }
-  "&& 1"
-  [(set (match_dup 0)
-        (fma:FMAMODE
-          (match_dup 1)
-          (match_dup 2)
-          (match_dup 3)))]
-  "")
-
-;; Floating multiply and subtract.
-(define_insn_and_split "*split_fms"
-  [(set (match_operand:FMAMODE 0 "register_operand")
-        (minus:FMAMODE
-          (mult:FMAMODE
-            (match_operand:FMAMODE 1 "nonimmediate_operand")
-            (match_operand:FMAMODE 2 "nonimmediate_operand"))
-          (match_operand:FMAMODE 3 "nonimmediate_operand")))]
-  "TARGET_SSE_MATH && TARGET_FUSED_MADD
-   && (TARGET_FMA || TARGET_FMA4)
-   && !(reload_in_progress || reload_completed)"
-  { gcc_unreachable (); }
-  "&& 1"
-  [(set (match_dup 0)
-        (fma:FMAMODE
-          (match_dup 1)
-          (match_dup 2)
-          (neg:FMAMODE (match_dup 3))))]
-  "")
-
-;; Floating point negative multiply and add.
-;; Recognize (-a * b + c) via the canonical form: c - (a * b).
-(define_insn_and_split "*split_fnma"
-  [(set (match_operand:FMAMODE 0 "register_operand")
-        (minus:FMAMODE
-          (match_operand:FMAMODE 3 "nonimmediate_operand")
-          (mult:FMAMODE
-            (match_operand:FMAMODE 1 "nonimmediate_operand")
-            (match_operand:FMAMODE 2 "nonimmediate_operand"))))]
-  "TARGET_SSE_MATH && TARGET_FUSED_MADD
-   && (TARGET_FMA || TARGET_FMA4)
-   && !(reload_in_progress || reload_completed)"
-  { gcc_unreachable (); }
-  "&& 1"
-  [(set (match_dup 0)
-        (fma:FMAMODE
-          (neg:FMAMODE (match_dup 1))
-          (match_dup 2)
-          (match_dup 3)))]
-  "")
-
-;; Floating point negative multiply and subtract.
-;; Recognize (-a * b - c) via the canonical form: c - (-a * b).
-(define_insn_and_split "*split_fnms"
-  [(set (match_operand:FMAMODE 0 "register_operand")
-        (minus:FMAMODE
-          (mult:FMAMODE
-            (neg:FMAMODE
-              (match_operand:FMAMODE 1 "nonimmediate_operand"))
-            (match_operand:FMAMODE 2 "nonimmediate_operand"))
-          (match_operand:FMAMODE 3 "nonimmediate_operand")))]
-  "TARGET_SSE_MATH && TARGET_FUSED_MADD
-   && (TARGET_FMA || TARGET_FMA4)
-   && !(reload_in_progress || reload_completed)"
-  { gcc_unreachable (); }
-  "&& 1"
-  [(set (match_dup 0)
-        (fma:FMAMODE
-          (neg:FMAMODE (match_dup 1))
-          (match_dup 2)
-          (neg:FMAMODE (match_dup 3))))]
-  "")
-
-;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
-;;
 ;; Parallel single-precision floating point conversion operations
 ;;
 ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
Index: gcc/config/i386/i386.opt
===================================================================
--- gcc/config/i386/i386.opt (revision 166635)
+++ gcc/config/i386/i386.opt (working copy)
@@ -261,12 +261,6 @@
 Generate vzeroupper instruction before a transfer of control flow out of
 the function.
 
-mfused-madd
-Target Report Mask(FUSED_MADD) Save
-Enable automatic generation of fused floating point multiply-add instructions
-if the ISA supports such instructions.  The -mfused-madd option is on by
-default.
-
 mdispatch-scheduler
 Target RejectNegative Var(flag_dispatch_scheduler)
 Do dispatch scheduling if processor is bdver1 and Haifa scheduling
Index: gcc/config/i386/i386.c
===================================================================
--- gcc/config/i386/i386.c (revision 166635)
+++ gcc/config/i386/i386.c (working copy)
@@ -28587,6 +28587,31 @@
         }
       return false;
 
+    case FMA:
+      {
+        rtx sub;
+
+        gcc_assert (FLOAT_MODE_P (mode));
+        gcc_assert (TARGET_FMA || TARGET_FMA4);
+
+        /* ??? SSE scalar/vector cost should be used here.  */
+        /* ??? Bald assumption that fma has the same cost as fmul.  */
+        *total = cost->fmul;
+        *total += rtx_cost (XEXP (x, 1), FMA, speed);
+
+        /* Negate in op0 or op2 is free: FMS, FNMA, FNMS.  */
+        sub = XEXP (x, 0);
+        if (GET_CODE (sub) == NEG)
+          sub = XEXP (sub, 0);
+        *total += rtx_cost (sub, FMA, speed);
+
+        sub = XEXP (x, 2);
+        if (GET_CODE (sub) == NEG)
+          sub = XEXP (sub, 0);
+        *total += rtx_cost (sub, FMA, speed);
+        return true;
+      }
+
     case MULT:
       if (SSE_FLOAT_MODE_P (mode) && TARGET_SSE_MATH)
         {
@@ -34483,8 +34508,7 @@
 #define TARGET_DEFAULT_TARGET_FLAGS \
   (TARGET_DEFAULT \
    | TARGET_SUBTARGET_DEFAULT \
-   | TARGET_TLS_DIRECT_SEG_REFS_DEFAULT \
-   | MASK_FUSED_MADD)
+   | TARGET_TLS_DIRECT_SEG_REFS_DEFAULT)
 
 #undef TARGET_HANDLE_OPTION
 #define TARGET_HANDLE_OPTION ix86_handle_option
Index: gcc/config/ia64/vms.h
===================================================================
--- gcc/config/ia64/vms.h (revision 166635)
+++ gcc/config/ia64/vms.h (working copy)
@@ -45,7 +45,7 @@
 
 /* Need .debug_line info generated from gcc and gas.  */
 #undef TARGET_DEFAULT
-#define TARGET_DEFAULT (MASK_DWARF2_ASM | MASK_GNU_AS | MASK_FUSED_MADD)
+#define TARGET_DEFAULT (MASK_DWARF2_ASM | MASK_GNU_AS)
 
 #define VMS_DEBUG_MAIN_POINTER "TRANSFER$BREAK$GO"
 
Index: gcc/config/ia64/ia64.opt
===================================================================
--- gcc/config/ia64/ia64.opt (revision 166635)
+++ gcc/config/ia64/ia64.opt (working copy)
@@ -178,8 +178,4 @@
 Target Report Var(mflag_sel_sched_dont_check_control_spec) Init(0)
 Don't generate checks for control speculation in selective scheduling
 
-mfused-madd
-Target Report Mask(FUSED_MADD)
-Enable fused multiply/add and multiply/subtract instructions
-
 ; This comment is to ensure we retain the blank line above.
Index: gcc/config/ia64/ia64.h
===================================================================
--- gcc/config/ia64/ia64.h (revision 166635)
+++ gcc/config/ia64/ia64.h (working copy)
@@ -96,7 +96,7 @@
 /* Default target_flags if no switches are specified  */
 
 #ifndef TARGET_DEFAULT
-#define TARGET_DEFAULT (MASK_DWARF2_ASM | MASK_FUSED_MADD)
+#define TARGET_DEFAULT (MASK_DWARF2_ASM)
 #endif
 
 #ifndef TARGET_CPU_DEFAULT
Index: gcc/config/ia64/vect.md
===================================================================
--- gcc/config/ia64/vect.md (revision 166635)
+++ gcc/config/ia64/vect.md (working copy)
@@ -903,106 +903,29 @@
   "fpnegabs %0 = %1"
   [(set_attr "itanium_class" "fmisc")])
 
-;; In order to convince combine to merge plus and mult to a useful fpma,
-;; we need a couple of extra patterns.
 (define_expand "addv2sf3"
-  [(parallel
-    [(set (match_operand:V2SF 0 "fr_register_operand" "")
-          (plus:V2SF (match_operand:V2SF 1 "fr_register_operand" "")
-                     (match_operand:V2SF 2 "fr_register_operand" "")))
-     (use (match_dup 3))])]
+  [(set (match_operand:V2SF 0 "fr_register_operand" "")
+        (fma:V2SF (match_operand:V2SF 1 "fr_register_operand" "")
+                  (match_dup 3)
+                  (match_operand:V2SF 2 "fr_register_operand" "")))]
   ""
 {
   rtvec v = gen_rtvec (2, CONST1_RTX (SFmode), CONST1_RTX (SFmode));
   operands[3] = force_reg (V2SFmode, gen_rtx_CONST_VECTOR (V2SFmode, v));
-  if (!TARGET_FUSED_MADD)
-    {
-      emit_insn (gen_fpma (operands[0], operands[1], operands[3], operands[2]));
-      DONE;
-    }
 })
 
-;; The split condition here could be combine_completed, if we had such.
-(define_insn_and_split "*addv2sf3_1"
-  [(set (match_operand:V2SF 0 "fr_register_operand" "=f")
-        (plus:V2SF (match_operand:V2SF 1 "fr_register_operand" "f")
-                   (match_operand:V2SF 2 "fr_register_operand" "f")))
-   (use (match_operand:V2SF 3 "fr_register_operand" "f"))]
-  ""
-  "#"
-  "reload_completed"
-  [(set (match_dup 0)
-        (plus:V2SF
-          (mult:V2SF (match_dup 1) (match_dup 3))
-          (match_dup 2)))]
-  "")
-
-(define_insn_and_split "*addv2sf3_2"
-  [(set (match_operand:V2SF 0 "fr_register_operand" "=f")
-        (plus:V2SF
-          (mult:V2SF (match_operand:V2SF 1 "fr_register_operand" "f")
-                     (match_operand:V2SF 2 "fr_register_operand" "f"))
-          (match_operand:V2SF 3 "fr_register_operand" "f")))
-   (use (match_operand:V2SF 4 "" "X"))]
-  ""
-  "#"
-  ""
-  [(set (match_dup 0)
-        (plus:V2SF
-          (mult:V2SF (match_dup 1) (match_dup 2))
-          (match_dup 3)))]
-  "")
-
-;; In order to convince combine to merge minus and mult to a useful fpms,
-;; we need a couple of extra patterns.
 (define_expand "subv2sf3"
-  [(parallel
-    [(set (match_operand:V2SF 0 "fr_register_operand" "")
-          (minus:V2SF (match_operand:V2SF 1 "fr_register_operand" "")
-                      (match_operand:V2SF 2 "fr_register_operand" "")))
-     (use (match_dup 3))])]
+  [(set (match_operand:V2SF 0 "fr_register_operand" "")
+        (fma:V2SF
+          (match_operand:V2SF 1 "fr_register_operand" "")
+          (match_dup 3)
+          (neg:V2SF (match_operand:V2SF 2 "fr_register_operand" ""))))]
   ""
 {
   rtvec v = gen_rtvec (2, CONST1_RTX (SFmode), CONST1_RTX (SFmode));
   operands[3] = force_reg (V2SFmode, gen_rtx_CONST_VECTOR (V2SFmode, v));
-  if (!TARGET_FUSED_MADD)
-    {
-      emit_insn (gen_fpms (operands[0], operands[1], operands[3], operands[2]));
-      DONE;
-    }
 })
 
-;; The split condition here could be combine_completed, if we had such.
-(define_insn_and_split "*subv2sf3_1"
-  [(set (match_operand:V2SF 0 "fr_register_operand" "=f")
-        (minus:V2SF (match_operand:V2SF 1 "fr_register_operand" "f")
-                    (match_operand:V2SF 2 "fr_register_operand" "f")))
-   (use (match_operand:V2SF 3 "fr_register_operand" "f"))]
-  ""
-  "#"
-  "reload_completed"
-  [(set (match_dup 0)
-        (minus:V2SF
-          (mult:V2SF (match_dup 1) (match_dup 3))
-          (match_dup 2)))]
-  "")
-
-(define_insn_and_split "*subv2sf3_2"
-  [(set (match_operand:V2SF 0 "fr_register_operand" "=f")
-        (minus:V2SF
-          (mult:V2SF (match_operand:V2SF 1 "fr_register_operand" "f")
-                     (match_operand:V2SF 2 "fr_register_operand" "f"))
-          (match_operand:V2SF 3 "fr_register_operand" "f")))
-   (use (match_operand:V2SF 4 "" "X"))]
-  ""
-  "#"
-  ""
-  [(set (match_dup 0)
-        (minus:V2SF
-          (mult:V2SF (match_dup 1) (match_dup 2))
-          (match_dup 3)))]
-  "")
-
 (define_insn "mulv2sf3"
   [(set (match_operand:V2SF 0 "fr_register_operand" "=f")
         (mult:V2SF (match_operand:V2SF 1 "fr_register_operand" "f")
@@ -1011,22 +934,22 @@
   "fpmpy %0 = %1, %2"
   [(set_attr "itanium_class" "fmac")])
 
-(define_insn "fpma"
+(define_insn "fmav2sf4"
   [(set (match_operand:V2SF 0 "fr_register_operand" "=f")
-        (plus:V2SF
-          (mult:V2SF (match_operand:V2SF 1 "fr_register_operand" "f")
-                     (match_operand:V2SF 2 "fr_register_operand" "f"))
+        (fma:V2SF
+          (match_operand:V2SF 1 "fr_register_operand" "f")
+          (match_operand:V2SF 2 "fr_register_operand" "f")
           (match_operand:V2SF 3 "fr_register_operand" "f")))]
   ""
   "fpma %0 = %1, %2, %3"
   [(set_attr "itanium_class" "fmac")])
 
-(define_insn "fpms"
+(define_insn "fmsv2sf4"
   [(set (match_operand:V2SF 0 "fr_register_operand" "=f")
-        (minus:V2SF
-          (mult:V2SF (match_operand:V2SF 1 "fr_register_operand" "f")
-                     (match_operand:V2SF 2 "fr_register_operand" "f"))
-          (match_operand:V2SF 3 "fr_register_operand" "f")))]
+        (fma:V2SF
+          (match_operand:V2SF 1 "fr_register_operand" "f")
+          (match_operand:V2SF 2 "fr_register_operand" "f")
+          (neg:V2SF (match_operand:V2SF 3 "fr_register_operand" "f"))))]
   ""
   "fpms %0 = %1, %2, %3"
   [(set_attr "itanium_class" "fmac")])
@@ -1040,12 +963,11 @@
   "fpnmpy %0 = %1, %2"
   [(set_attr "itanium_class" "fmac")])
 
-(define_insn "*fpnma"
+(define_insn "fnmav2sf4"
   [(set (match_operand:V2SF 0 "fr_register_operand" "=f")
-        (plus:V2SF
-          (neg:V2SF
-            (mult:V2SF (match_operand:V2SF 1 "fr_register_operand" "f")
-                       (match_operand:V2SF 2 "fr_register_operand" "f")))
+        (fma:V2SF
+          (neg:V2SF (match_operand:V2SF 1 "fr_register_operand" "f"))
+          (match_operand:V2SF 2 "fr_register_operand" "f")
           (match_operand:V2SF 3 "fr_register_operand" "f")))]
   ""
   "fpnma %0 = %1, %2, %3"
Index: gcc/config/ia64/ia64.md
===================================================================
--- gcc/config/ia64/ia64.md (revision 166635)
+++ gcc/config/ia64/ia64.md (working copy)
@@ -2757,24 +2757,6 @@
   "fmax %0 = %F1, %F2"
   [(set_attr "itanium_class" "fmisc")])
 
-(define_insn "*maddsf4"
-  [(set (match_operand:SF 0 "fr_register_operand" "=f")
-        (plus:SF (mult:SF (match_operand:SF 1 "fr_reg_or_fp01_operand" "fG")
-                          (match_operand:SF 2 "fr_reg_or_fp01_operand" "fG"))
-                 (match_operand:SF 3 "fr_reg_or_signed_fp01_operand" "fZ")))]
-  "TARGET_FUSED_MADD"
-  "fma.s %0 = %F1, %F2, %F3"
-  [(set_attr "itanium_class" "fmac")])
-
-(define_insn "*msubsf4"
-  [(set (match_operand:SF 0 "fr_register_operand" "=f")
-        (minus:SF (mult:SF (match_operand:SF 1 "fr_reg_or_fp01_operand" "fG")
-                           (match_operand:SF 2 "fr_reg_or_fp01_operand" "fG"))
-                  (match_operand:SF 3 "fr_reg_or_signed_fp01_operand" "fZ")))]
-  "TARGET_FUSED_MADD"
-  "fms.s %0 = %F1, %F2, %F3"
-  [(set_attr "itanium_class" "fmac")])
-
 (define_insn "*nmulsf3"
   [(set (match_operand:SF 0 "fr_register_operand" "=f")
         (neg:SF (mult:SF (match_operand:SF 1 "fr_reg_or_fp01_operand" "fG")
@@ -2783,16 +2765,6 @@
   "fnmpy.s %0 = %F1, %F2"
   [(set_attr "itanium_class" "fmac")])
 
-(define_insn "*nmaddsf4"
-  [(set (match_operand:SF 0 "fr_register_operand" "=f")
-        (minus:SF (match_operand:SF 3 "fr_reg_or_fp01_operand" "fG")
-                  (mult:SF (match_operand:SF 1 "fr_reg_or_fp01_operand" "fG")
-                           (match_operand:SF 2 "fr_reg_or_fp01_operand" "fG"))))]
-  "TARGET_FUSED_MADD"
-  "fnma.s %0 = %F1, %F2, %F3"
-  [(set_attr "itanium_class" "fmac")])
-
-;; Official C99 versions of the fmaf family of operations.
 (define_insn "fmasf4"
   [(set (match_operand:SF 0 "fr_register_operand" "=f")
         (fma:SF (match_operand:SF 1 "fr_reg_or_fp01_operand" "fG")
@@ -2802,7 +2774,7 @@
   "fma.s %0 = %F1, %F2, %F3"
   [(set_attr "itanium_class" "fmac")])
 
-(define_insn "*fmssf4"
+(define_insn "fmssf4"
   [(set (match_operand:SF 0 "fr_register_operand" "=f")
         (fma:SF (match_operand:SF 1 "fr_reg_or_fp01_operand" "fG")
                 (match_operand:SF 2 "fr_reg_or_fp01_operand" "fG")
@@ -2812,8 +2784,7 @@
   "fms.s %0 = %F1, %F2, %F3"
   [(set_attr "itanium_class" "fmac")])
 
-;; This insn is officially "-(a * b) + c" which is "(-a * b) + c".
-(define_insn "*nfmasf4"
+(define_insn "fnmasf4"
   [(set (match_operand:SF 0 "fr_register_operand" "=f")
         (fma:SF (neg:SF (match_operand:SF 1 "fr_reg_or_fp01_operand" "fG"))
                 (match_operand:SF 2 "fr_reg_or_fp01_operand" "fG")
@@ -2934,44 +2905,6 @@
   "fmax %0 = %F1, %F2"
   [(set_attr "itanium_class" "fmisc")])
 
-(define_insn "*madddf4"
-  [(set (match_operand:DF 0 "fr_register_operand" "=f")
-        (plus:DF (mult:DF (match_operand:DF 1 "fr_reg_or_fp01_operand" "fG")
-                          (match_operand:DF 2 "fr_reg_or_fp01_operand" "fG"))
-                 (match_operand:DF 3 "fr_reg_or_signed_fp01_operand" "fZ")))]
-  "TARGET_FUSED_MADD"
-  "fma.d %0 = %F1, %F2, %F3"
-  [(set_attr "itanium_class" "fmac")])
-
-(define_insn "*madddf4_trunc"
-  [(set (match_operand:SF 0 "fr_register_operand" "=f")
-        (float_truncate:SF
-          (plus:DF (mult:DF (match_operand:DF 1 "fr_reg_or_fp01_operand" "fG")
-                            (match_operand:DF 2 "fr_reg_or_fp01_operand" "fG"))
-                   (match_operand:DF 3 "fr_reg_or_signed_fp01_operand" "fZ"))))]
-  "TARGET_FUSED_MADD"
-  "fma.s %0 = %F1, %F2, %F3"
-  [(set_attr "itanium_class" "fmac")])
-
-(define_insn "*msubdf4"
-  [(set (match_operand:DF 0 "fr_register_operand" "=f")
-        (minus:DF (mult:DF (match_operand:DF 1 "fr_reg_or_fp01_operand" "fG")
-                           (match_operand:DF 2 "fr_reg_or_fp01_operand" "fG"))
-                  (match_operand:DF 3 "fr_reg_or_signed_fp01_operand" "fZ")))]
-  "TARGET_FUSED_MADD"
-  "fms.d %0 = %F1, %F2, %F3"
-  [(set_attr "itanium_class" "fmac")])
-
-(define_insn "*msubdf4_trunc"
-  [(set (match_operand:SF 0 "fr_register_operand" "=f")
-        (float_truncate:SF
-          (minus:DF (mult:DF (match_operand:DF 1 "fr_reg_or_fp01_operand" "fG")
-                             (match_operand:DF 2 "fr_reg_or_fp01_operand" "fG"))
-                    (match_operand:DF 3 "fr_reg_or_signed_fp01_operand" "fZ"))))]
-  "TARGET_FUSED_MADD"
-  "fms.s %0 = %F1, %F2, %F3"
-  [(set_attr "itanium_class" "fmac")])
-
 (define_insn "*nmuldf3"
   [(set (match_operand:DF 0 "fr_register_operand" "=f")
         (neg:DF (mult:DF (match_operand:DF 1 "fr_reg_or_fp01_operand" "fG")
@@ -2989,26 +2922,6 @@
   "fnmpy.s %0 = %F1, %F2"
   [(set_attr "itanium_class" "fmac")])
 
-(define_insn "*nmadddf4"
-  [(set (match_operand:DF 0 "fr_register_operand" "=f")
-        (minus:DF (match_operand:DF 3 "fr_reg_or_fp01_operand" "fG")
-                  (mult:DF (match_operand:DF 1 "fr_reg_or_fp01_operand" "fG")
-                           (match_operand:DF 2 "fr_reg_or_fp01_operand" "fG"))))]
-  "TARGET_FUSED_MADD"
-  "fnma.d %0 = %F1, %F2, %F3"
-  [(set_attr "itanium_class" "fmac")])
-
-(define_insn "*nmadddf4_truncsf"
-  [(set (match_operand:SF 0 "fr_register_operand" "=f")
-        (float_truncate:SF
-          (minus:DF (match_operand:DF 3 "fr_reg_or_fp01_operand" "fG")
-                    (mult:DF (match_operand:DF 1 "fr_reg_or_fp01_operand" "fG")
-                             (match_operand:DF 2 "fr_reg_or_fp01_operand" "fG")))))]
-  "TARGET_FUSED_MADD"
-  "fnma.s %0 = %F1, %F2, %F3"
-  [(set_attr "itanium_class" "fmac")])
-
-;; Official C99 versions of the fma family of operations.
 (define_insn "fmadf4"
   [(set (match_operand:DF 0 "fr_register_operand" "=f")
         (fma:DF (match_operand:DF 1 "fr_reg_or_fp01_operand" "fG")
@@ -3018,7 +2931,7 @@
   "fma.d %0 = %F1, %F2, %F3"
   [(set_attr "itanium_class" "fmac")])
 
-(define_insn "*fmsdf4"
+(define_insn "fmsdf4"
   [(set (match_operand:DF 0 "fr_register_operand" "=f")
         (fma:DF (match_operand:DF 1 "fr_reg_or_fp01_operand" "fG")
                 (match_operand:DF 2 "fr_reg_or_fp01_operand" "fG")
@@ -3028,8 +2941,7 @@
   "fms.d %0 = %F1, %F2, %F3"
   [(set_attr "itanium_class" "fmac")])
 
-;; See comment for nfmasf4.
-(define_insn "*nfmadf4" +(define_insn "fnmadf4" [(set (match_operand:DF 0 "fr_register_operand" "=f") (fma:DF (neg:DF (match_operand:DF 1 "fr_reg_or_fp01_operand" "fG")) (match_operand:DF 2 "fr_reg_or_fp01_operand" "fG") @@ -3177,64 +3089,6 @@ "fmax %0 = %F1, %F2" [(set_attr "itanium_class" "fmisc")]) -(define_insn "*maddxf4" - [(set (match_operand:XF 0 "fr_register_operand" "=f") - (plus:XF (mult:XF (match_operand:XF 1 "xfreg_or_fp01_operand" "fG") - (match_operand:XF 2 "xfreg_or_fp01_operand" "fG")) - (match_operand:XF 3 "xfreg_or_signed_fp01_operand" "fZ")))] - "TARGET_FUSED_MADD" - "fma %0 = %F1, %F2, %F3" - [(set_attr "itanium_class" "fmac")]) - -(define_insn "*maddxf4_truncsf" - [(set (match_operand:SF 0 "fr_register_operand" "=f") - (float_truncate:SF - (plus:XF (mult:XF (match_operand:XF 1 "xfreg_or_fp01_operand" "fG") - (match_operand:XF 2 "xfreg_or_fp01_operand" "fG")) - (match_operand:XF 3 "xfreg_or_signed_fp01_operand" "fZ"))))] - "TARGET_FUSED_MADD" - "fma.s %0 = %F1, %F2, %F3" - [(set_attr "itanium_class" "fmac")]) - -(define_insn "*maddxf4_truncdf" - [(set (match_operand:DF 0 "fr_register_operand" "=f") - (float_truncate:DF - (plus:XF (mult:XF (match_operand:XF 1 "xfreg_or_fp01_operand" "fG") - (match_operand:XF 2 "xfreg_or_fp01_operand" "fG")) - (match_operand:XF 3 "xfreg_or_signed_fp01_operand" "fZ"))))] - "TARGET_FUSED_MADD" - "fma.d %0 = %F1, %F2, %F3" - [(set_attr "itanium_class" "fmac")]) - -(define_insn "*msubxf4" - [(set (match_operand:XF 0 "fr_register_operand" "=f") - (minus:XF (mult:XF (match_operand:XF 1 "xfreg_or_fp01_operand" "fG") - (match_operand:XF 2 "xfreg_or_fp01_operand" "fG")) - (match_operand:XF 3 "xfreg_or_signed_fp01_operand" "fZ")))] - "TARGET_FUSED_MADD" - "fms %0 = %F1, %F2, %F3" - [(set_attr "itanium_class" "fmac")]) - -(define_insn "*msubxf4_truncsf" - [(set (match_operand:SF 0 "fr_register_operand" "=f") - (float_truncate:SF - (minus:XF (mult:XF (match_operand:XF 1 "xfreg_or_fp01_operand" "fG") - (match_operand:XF 2 
"xfreg_or_fp01_operand" "fG")) - (match_operand:XF 3 "xfreg_or_signed_fp01_operand" "fZ"))))] - "TARGET_FUSED_MADD" - "fms.s %0 = %F1, %F2, %F3" - [(set_attr "itanium_class" "fmac")]) - -(define_insn "*msubxf4_truncdf" - [(set (match_operand:DF 0 "fr_register_operand" "=f") - (float_truncate:DF - (minus:XF (mult:XF (match_operand:XF 1 "xfreg_or_fp01_operand" "fG") - (match_operand:XF 2 "xfreg_or_fp01_operand" "fG")) - (match_operand:XF 3 "xfreg_or_signed_fp01_operand" "fZ"))))] - "TARGET_FUSED_MADD" - "fms.d %0 = %F1, %F2, %F3" - [(set_attr "itanium_class" "fmac")]) - (define_insn "*nmulxf3" [(set (match_operand:XF 0 "fr_register_operand" "=f") (neg:XF (mult:XF (match_operand:XF 1 "xfreg_or_fp01_operand" "fG") @@ -3263,39 +3117,6 @@ "fnmpy.d %0 = %F1, %F2" [(set_attr "itanium_class" "fmac")]) -(define_insn "*nmaddxf4" - [(set (match_operand:XF 0 "fr_register_operand" "=f") - (minus:XF (match_operand:XF 3 "xfreg_or_fp01_operand" "fG") - (mult:XF (match_operand:XF 1 "xfreg_or_fp01_operand" "fG") - (match_operand:XF 2 "xfreg_or_fp01_operand" "fG") - )))] - "TARGET_FUSED_MADD" - "fnma %0 = %F1, %F2, %F3" - [(set_attr "itanium_class" "fmac")]) - -(define_insn "*nmaddxf4_truncsf" - [(set (match_operand:SF 0 "fr_register_operand" "=f") - (float_truncate:SF - (minus:XF (match_operand:XF 3 "xfreg_or_fp01_operand" "fG") - (mult:XF (match_operand:XF 1 "xfreg_or_fp01_operand" "fG") - (match_operand:XF 2 "xfreg_or_fp01_operand" "fG") - ))))] - "TARGET_FUSED_MADD" - "fnma.s %0 = %F1, %F2, %F3" - [(set_attr "itanium_class" "fmac")]) - -(define_insn "*nmaddxf4_truncdf" - [(set (match_operand:DF 0 "fr_register_operand" "=f") - (float_truncate:DF - (minus:XF (match_operand:XF 3 "xfreg_or_fp01_operand" "fG") - (mult:XF (match_operand:XF 1 "xfreg_or_fp01_operand" "fG") - (match_operand:XF 2 "xfreg_or_fp01_operand" "fG") - ))))] - "TARGET_FUSED_MADD" - "fnma.d %0 = %F1, %F2, %F3" - [(set_attr "itanium_class" "fmac")]) - -;; Official C99 versions of the fmal family of operations. 
(define_insn "fmaxf4" [(set (match_operand:XF 0 "fr_register_operand" "=f") (fma:XF (match_operand:XF 1 "fr_reg_or_fp01_operand" "fG") @@ -3305,7 +3126,7 @@ "fma %0 = %F1, %F2, %F3" [(set_attr "itanium_class" "fmac")]) -(define_insn "*fmsxf4" +(define_insn "fmsxf4" [(set (match_operand:XF 0 "fr_register_operand" "=f") (fma:XF (match_operand:XF 1 "fr_reg_or_fp01_operand" "fG") (match_operand:XF 2 "fr_reg_or_fp01_operand" "fG") @@ -3315,8 +3136,7 @@ "fms %0 = %F1, %F2, %F3" [(set_attr "itanium_class" "fmac")]) -;; See comment for nfmasf4. -(define_insn "*nfmaxf4" +(define_insn "fnmaxf4" [(set (match_operand:XF 0 "fr_register_operand" "=f") (fma:XF (neg:XF (match_operand:XF 1 "fr_reg_or_fp01_operand" "fG")) (match_operand:XF 2 "fr_reg_or_fp01_operand" "fG") Index: gcc/config/ia64/vms64.h =================================================================== --- gcc/config/ia64/vms64.h (revision 166635) +++ gcc/config/ia64/vms64.h (working copy) @@ -36,6 +36,6 @@ #define POINTER_SIZE 64 #undef TARGET_DEFAULT -#define TARGET_DEFAULT (MASK_DWARF2_ASM | MASK_GNU_AS | MASK_FUSED_MADD | MASK_MALLOC64) +#define TARGET_DEFAULT (MASK_DWARF2_ASM | MASK_GNU_AS | MASK_MALLOC64) #include "config/vms/vms-crtl-64.h" Index: gcc/config/ia64/hpux.h =================================================================== --- gcc/config/ia64/hpux.h (revision 166635) +++ gcc/config/ia64/hpux.h (working copy) @@ -106,7 +106,7 @@ #undef TARGET_DEFAULT #define TARGET_DEFAULT \ - (MASK_DWARF2_ASM | MASK_BIG_ENDIAN | MASK_ILP32 | MASK_FUSED_MADD) + (MASK_DWARF2_ASM | MASK_BIG_ENDIAN | MASK_ILP32) /* ??? Might not be needed anymore. 
*/ #define MEMBER_TYPE_FORCES_BLK(FIELD, MODE) ((MODE) == TFmode) Index: gcc/config/rs6000/vector.md =================================================================== --- gcc/config/rs6000/vector.md (revision 166635) +++ gcc/config/rs6000/vector.md (working copy) @@ -202,16 +202,14 @@ [(set (match_operand:VEC_F 0 "vfloat_operand" "") (mult:VEC_F (match_operand:VEC_F 1 "vfloat_operand" "") (match_operand:VEC_F 2 "vfloat_operand" "")))] - "(VECTOR_UNIT_VSX_P (mode) - || (VECTOR_UNIT_ALTIVEC_P (mode) && TARGET_FUSED_MADD))" - " + "VECTOR_UNIT_VSX_P (mode) || VECTOR_UNIT_ALTIVEC_P (mode)" { if (mode == V4SFmode && VECTOR_UNIT_ALTIVEC_P (mode)) { emit_insn (gen_altivec_mulv4sf3 (operands[0], operands[1], operands[2])); DONE; } -}") +}) (define_expand "div3" [(set (match_operand:VEC_F 0 "vfloat_operand" "") Index: gcc/config/rs6000/paired.md =================================================================== --- gcc/config/rs6000/paired.md (revision 166635) +++ gcc/config/rs6000/paired.md (working copy) @@ -96,77 +96,85 @@ (define_insn "paired_madds0" [(set (match_operand:V2SF 0 "gpc_reg_operand" "=f") - (vec_concat:V2SF - (plus:SF (mult:SF (vec_select:SF (match_operand:V2SF 1 "gpc_reg_operand" "f") - (parallel [(const_int 0)])) - (vec_select:SF (match_operand:V2SF 2 "gpc_reg_operand" "f") - (parallel [(const_int 0)]))) - (vec_select:SF (match_operand:V2SF 3 "gpc_reg_operand" "f") - (parallel [(const_int 0)]))) - (plus:SF (mult:SF (vec_select:SF (match_dup 1) - (parallel [(const_int 1)])) - (vec_select:SF (match_dup 2) - (parallel [(const_int 0)]))) - (vec_select:SF (match_dup 3) - (parallel [(const_int 1)])))))] - "TARGET_PAIRED_FLOAT && TARGET_FUSED_MADD" + (vec_concat:V2SF + (fma:SF + (vec_select:SF (match_operand:V2SF 1 "gpc_reg_operand" "f") + (parallel [(const_int 0)])) + (vec_select:SF (match_operand:V2SF 2 "gpc_reg_operand" "f") + (parallel [(const_int 0)])) + (vec_select:SF (match_operand:V2SF 3 "gpc_reg_operand" "f") + (parallel [(const_int 0)]))) + (fma:SF + 
(vec_select:SF (match_dup 1) + (parallel [(const_int 1)])) + (vec_select:SF (match_dup 2) + (parallel [(const_int 0)])) + (vec_select:SF (match_dup 3) + (parallel [(const_int 1)])))))] + "TARGET_PAIRED_FLOAT" "ps_madds0 %0,%1,%2,%3" [(set_attr "type" "fp")]) (define_insn "paired_madds1" [(set (match_operand:V2SF 0 "gpc_reg_operand" "=f") - (vec_concat:V2SF - (plus:SF (mult:SF (vec_select:SF (match_operand:V2SF 1 "gpc_reg_operand" "f") - (parallel [(const_int 0)])) - (vec_select:SF (match_operand:V2SF 2 "gpc_reg_operand" "f") - (parallel [(const_int 1)]))) - (vec_select:SF (match_operand:V2SF 3 "gpc_reg_operand" "f") - (parallel [(const_int 0)]))) - (plus:SF (mult:SF (vec_select:SF (match_dup 1) - (parallel [(const_int 1)])) - (vec_select:SF (match_dup 2) - (parallel [(const_int 1)]))) - (vec_select:SF (match_dup 3) - (parallel [(const_int 1)])))))] - "TARGET_PAIRED_FLOAT && TARGET_FUSED_MADD" + (vec_concat:V2SF + (fma:SF + (vec_select:SF (match_operand:V2SF 1 "gpc_reg_operand" "f") + (parallel [(const_int 0)])) + (vec_select:SF (match_operand:V2SF 2 "gpc_reg_operand" "f") + (parallel [(const_int 1)])) + (vec_select:SF (match_operand:V2SF 3 "gpc_reg_operand" "f") + (parallel [(const_int 0)]))) + (fma:SF + (vec_select:SF (match_dup 1) + (parallel [(const_int 1)])) + (vec_select:SF (match_dup 2) + (parallel [(const_int 1)])) + (vec_select:SF (match_dup 3) + (parallel [(const_int 1)])))))] + "TARGET_PAIRED_FLOAT" "ps_madds1 %0,%1,%2,%3" [(set_attr "type" "fp")]) -(define_insn "paired_madd" +(define_insn "*paired_madd" [(set (match_operand:V2SF 0 "gpc_reg_operand" "=f") - (plus:V2SF (mult:V2SF (match_operand:V2SF 1 "gpc_reg_operand" "%f") - (match_operand:V2SF 2 "gpc_reg_operand" "f")) - (match_operand:V2SF 3 "gpc_reg_operand" "f")))] - "TARGET_PAIRED_FLOAT && TARGET_FUSED_MADD" + (fma:V2SF + (match_operand:V2SF 1 "gpc_reg_operand" "f") + (match_operand:V2SF 2 "gpc_reg_operand" "f") + (match_operand:V2SF 3 "gpc_reg_operand" "f")))] + "TARGET_PAIRED_FLOAT" "ps_madd 
%0,%1,%2,%3" [(set_attr "type" "fp")]) -(define_insn "paired_msub" +(define_insn "*paired_msub" [(set (match_operand:V2SF 0 "gpc_reg_operand" "=f") - (minus:V2SF (mult:V2SF (match_operand:V2SF 1 "gpc_reg_operand" "%f") - (match_operand:V2SF 2 "gpc_reg_operand" "f")) - (match_operand:V2SF 3 "gpc_reg_operand" "f")))] - "TARGET_PAIRED_FLOAT && TARGET_FUSED_MADD" + (fma:V2SF + (match_operand:V2SF 1 "gpc_reg_operand" "f") + (match_operand:V2SF 2 "gpc_reg_operand" "f") + (neg:V2SF (match_operand:V2SF 3 "gpc_reg_operand" "f"))))] + "TARGET_PAIRED_FLOAT" "ps_msub %0,%1,%2,%3" [(set_attr "type" "fp")]) -(define_insn "paired_nmadd" +(define_insn "*paired_nmadd" [(set (match_operand:V2SF 0 "gpc_reg_operand" "=f") - (neg:V2SF (plus:V2SF (mult:V2SF (match_operand:V2SF 1 "gpc_reg_operand" "%f") - (match_operand:V2SF 2 "gpc_reg_operand" "f")) - (match_operand:V2SF 3 "gpc_reg_operand" "f"))))] - "TARGET_PAIRED_FLOAT && TARGET_FUSED_MADD - && HONOR_SIGNED_ZEROS (SFmode)" + (neg:V2SF + (fma:V2SF + (match_operand:V2SF 1 "gpc_reg_operand" "f") + (match_operand:V2SF 2 "gpc_reg_operand" "f") + (match_operand:V2SF 3 "gpc_reg_operand" "f"))))] + "TARGET_PAIRED_FLOAT" "ps_nmadd %0,%1,%2,%3" [(set_attr "type" "fp")]) -(define_insn "paired_nmsub" +(define_insn "*paired_nmsub" [(set (match_operand:V2SF 0 "gpc_reg_operand" "=f") - (neg:V2SF (minus:V2SF (mult:V2SF (match_operand:V2SF 1 "gpc_reg_operand" "%f") - (match_operand:V2SF 2 "gpc_reg_operand" "f")) - (match_operand:V2SF 3 "gpc_reg_operand" "f"))))] - "TARGET_PAIRED_FLOAT && TARGET_FUSED_MADD - && HONOR_SIGNED_ZEROS (DFmode)" + (neg:V2SF + (fma:V2SF + (match_operand:V2SF 1 "gpc_reg_operand" "f") + (match_operand:V2SF 2 "gpc_reg_operand" "f") + (neg:V2SF (match_operand:V2SF 3 "gpc_reg_operand" "f")))))] + "TARGET_PAIRED_FLOAT" "ps_nmsub %0,%1,%2,%3" [(set_attr "type" "dmul")]) Index: gcc/config/rs6000/rs6000.opt =================================================================== --- gcc/config/rs6000/rs6000.opt (revision 166635) +++ 
gcc/config/rs6000/rs6000.opt (working copy) @@ -176,10 +176,6 @@ Target Report Var(TARGET_AVOID_XFORM) Init(-1) Avoid generation of indexed load/store instructions when possible -mfused-madd -Target Report Var(TARGET_FUSED_MADD) Init(1) -Generate fused multiply/add instructions - mtls-markers Target Report Var(tls_markers) Init(1) Mark __tls_get_addr calls with argument info Index: gcc/config/rs6000/rs6000.c =================================================================== --- gcc/config/rs6000/rs6000.c (revision 166635) +++ gcc/config/rs6000/rs6000.c (working copy) @@ -2284,16 +2284,13 @@ if (rs6000_recip_control) { - if (!TARGET_FUSED_MADD) - warning (0, "-mrecip requires -mfused-madd"); if (!flag_finite_math_only) warning (0, "-mrecip requires -ffinite-math or -ffast-math"); if (flag_trapping_math) warning (0, "-mrecip requires -fno-trapping-math or -ffast-math"); if (!flag_reciprocal_math) warning (0, "-mrecip requires -freciprocal-math or -ffast-math"); - if (TARGET_FUSED_MADD && flag_finite_math_only && !flag_trapping_math - && flag_reciprocal_math) + if (flag_finite_math_only && !flag_trapping_math && flag_reciprocal_math) { if (RS6000_RECIP_HAVE_RE_P (SFmode) && (rs6000_recip_control & RECIP_SF_DIV) != 0) @@ -9684,7 +9681,7 @@ static const struct builtin_description bdesc_3arg[] = { - { MASK_ALTIVEC, CODE_FOR_altivec_vmaddfp, "__builtin_altivec_vmaddfp", ALTIVEC_BUILTIN_VMADDFP }, + { MASK_ALTIVEC, CODE_FOR_fmav4sf4, "__builtin_altivec_vmaddfp", ALTIVEC_BUILTIN_VMADDFP }, { MASK_ALTIVEC, CODE_FOR_altivec_vmhaddshs, "__builtin_altivec_vmhaddshs", ALTIVEC_BUILTIN_VMHADDSHS }, { MASK_ALTIVEC, CODE_FOR_altivec_vmhraddshs, "__builtin_altivec_vmhraddshs", ALTIVEC_BUILTIN_VMHRADDSHS }, { MASK_ALTIVEC, CODE_FOR_altivec_vmladduhm, "__builtin_altivec_vmladduhm", ALTIVEC_BUILTIN_VMLADDUHM}, @@ -9694,7 +9691,7 @@ { MASK_ALTIVEC, CODE_FOR_altivec_vmsumshm, "__builtin_altivec_vmsumshm", ALTIVEC_BUILTIN_VMSUMSHM }, { MASK_ALTIVEC, CODE_FOR_altivec_vmsumuhs, 
"__builtin_altivec_vmsumuhs", ALTIVEC_BUILTIN_VMSUMUHS }, { MASK_ALTIVEC, CODE_FOR_altivec_vmsumshs, "__builtin_altivec_vmsumshs", ALTIVEC_BUILTIN_VMSUMSHS }, - { MASK_ALTIVEC, CODE_FOR_altivec_vnmsubfp, "__builtin_altivec_vnmsubfp", ALTIVEC_BUILTIN_VNMSUBFP }, + { MASK_ALTIVEC, CODE_FOR_nfmsv4sf4, "__builtin_altivec_vnmsubfp", ALTIVEC_BUILTIN_VNMSUBFP }, { MASK_ALTIVEC, CODE_FOR_altivec_vperm_v2df, "__builtin_altivec_vperm_2df", ALTIVEC_BUILTIN_VPERM_2DF }, { MASK_ALTIVEC, CODE_FOR_altivec_vperm_v2di, "__builtin_altivec_vperm_2di", ALTIVEC_BUILTIN_VPERM_2DI }, { MASK_ALTIVEC, CODE_FOR_altivec_vperm_v4sf, "__builtin_altivec_vperm_4sf", ALTIVEC_BUILTIN_VPERM_4SF }, @@ -9736,15 +9733,15 @@ { MASK_ALTIVEC, CODE_FOR_nothing, "__builtin_vec_perm", ALTIVEC_BUILTIN_VEC_PERM }, { MASK_ALTIVEC, CODE_FOR_nothing, "__builtin_vec_sel", ALTIVEC_BUILTIN_VEC_SEL }, - { MASK_VSX, CODE_FOR_vsx_fmaddv2df4, "__builtin_vsx_xvmadddp", VSX_BUILTIN_XVMADDDP }, - { MASK_VSX, CODE_FOR_vsx_fmsubv2df4, "__builtin_vsx_xvmsubdp", VSX_BUILTIN_XVMSUBDP }, - { MASK_VSX, CODE_FOR_vsx_fnmaddv2df4, "__builtin_vsx_xvnmadddp", VSX_BUILTIN_XVNMADDDP }, - { MASK_VSX, CODE_FOR_vsx_fnmsubv2df4, "__builtin_vsx_xvnmsubdp", VSX_BUILTIN_XVNMSUBDP }, + { MASK_VSX, CODE_FOR_fmav2df4, "__builtin_vsx_xvmadddp", VSX_BUILTIN_XVMADDDP }, + { MASK_VSX, CODE_FOR_fmsv2df4, "__builtin_vsx_xvmsubdp", VSX_BUILTIN_XVMSUBDP }, + { MASK_VSX, CODE_FOR_nfmav2df4, "__builtin_vsx_xvnmadddp", VSX_BUILTIN_XVNMADDDP }, + { MASK_VSX, CODE_FOR_nfmsv2df4, "__builtin_vsx_xvnmsubdp", VSX_BUILTIN_XVNMSUBDP }, - { MASK_VSX, CODE_FOR_vsx_fmaddv4sf4, "__builtin_vsx_xvmaddsp", VSX_BUILTIN_XVMADDSP }, - { MASK_VSX, CODE_FOR_vsx_fmsubv4sf4, "__builtin_vsx_xvmsubsp", VSX_BUILTIN_XVMSUBSP }, - { MASK_VSX, CODE_FOR_vsx_fnmaddv4sf4, "__builtin_vsx_xvnmaddsp", VSX_BUILTIN_XVNMADDSP }, - { MASK_VSX, CODE_FOR_vsx_fnmsubv4sf4, "__builtin_vsx_xvnmsubsp", VSX_BUILTIN_XVNMSUBSP }, + { MASK_VSX, CODE_FOR_fmav4sf4, "__builtin_vsx_xvmaddsp", 
VSX_BUILTIN_XVMADDSP }, + { MASK_VSX, CODE_FOR_fmsv4sf4, "__builtin_vsx_xvmsubsp", VSX_BUILTIN_XVMSUBSP }, + { MASK_VSX, CODE_FOR_nfmav4sf4, "__builtin_vsx_xvnmaddsp", VSX_BUILTIN_XVNMADDSP }, + { MASK_VSX, CODE_FOR_nfmsv4sf4, "__builtin_vsx_xvnmsubsp", VSX_BUILTIN_XVNMSUBSP }, { MASK_ALTIVEC, CODE_FOR_nothing, "__builtin_vec_msub", VSX_BUILTIN_VEC_MSUB }, { MASK_ALTIVEC, CODE_FOR_nothing, "__builtin_vec_nmadd", VSX_BUILTIN_VEC_NMADD }, @@ -9789,12 +9786,12 @@ { MASK_VSX, CODE_FOR_vsx_xxsldwi_v16qi, "__builtin_vsx_xxsldwi_16qi", VSX_BUILTIN_XXSLDWI_16QI }, { MASK_VSX, CODE_FOR_nothing, "__builtin_vsx_xxsldwi", VSX_BUILTIN_VEC_XXSLDWI }, - { 0, CODE_FOR_paired_msub, "__builtin_paired_msub", PAIRED_BUILTIN_MSUB }, - { 0, CODE_FOR_paired_madd, "__builtin_paired_madd", PAIRED_BUILTIN_MADD }, + { 0, CODE_FOR_fmsv2sf4, "__builtin_paired_msub", PAIRED_BUILTIN_MSUB }, + { 0, CODE_FOR_fmav2sf4, "__builtin_paired_madd", PAIRED_BUILTIN_MADD }, { 0, CODE_FOR_paired_madds0, "__builtin_paired_madds0", PAIRED_BUILTIN_MADDS0 }, { 0, CODE_FOR_paired_madds1, "__builtin_paired_madds1", PAIRED_BUILTIN_MADDS1 }, - { 0, CODE_FOR_paired_nmsub, "__builtin_paired_nmsub", PAIRED_BUILTIN_NMSUB }, - { 0, CODE_FOR_paired_nmadd, "__builtin_paired_nmadd", PAIRED_BUILTIN_NMADD }, + { 0, CODE_FOR_nfmsv2sf4, "__builtin_paired_nmsub", PAIRED_BUILTIN_NMSUB }, + { 0, CODE_FOR_nfmav2sf4, "__builtin_paired_nmadd", PAIRED_BUILTIN_NMADD }, { 0, CODE_FOR_paired_sum0, "__builtin_paired_sum0", PAIRED_BUILTIN_SUM0 }, { 0, CODE_FOR_paired_sum1, "__builtin_paired_sum1", PAIRED_BUILTIN_SUM1 }, { 0, CODE_FOR_selv2sf4, "__builtin_paired_selv2sf4", PAIRED_BUILTIN_SELV2SF4 }, @@ -26390,112 +26387,65 @@ return reg; } -/* Generate a FMADD instruction: - dst = (m1 * m2) + a +/* Generate an FMA instruction. */ - generating different RTL based on the fused multiply/add switch. 
*/ - static void -rs6000_emit_madd (rtx dst, rtx m1, rtx m2, rtx a) +rs6000_emit_madd (rtx target, rtx m1, rtx m2, rtx a) { - enum machine_mode mode = GET_MODE (dst); + enum machine_mode mode = GET_MODE (target); + rtx dst; - if (!TARGET_FUSED_MADD) - { - /* For the simple ops, use the generator function, rather than assuming - that the RTL is standard. */ - enum insn_code mcode = optab_handler (smul_optab, mode); - enum insn_code acode = optab_handler (add_optab, mode); - gen_2arg_fn_t gen_mul = (gen_2arg_fn_t) GEN_FCN (mcode); - gen_2arg_fn_t gen_add = (gen_2arg_fn_t) GEN_FCN (acode); - rtx mreg = gen_reg_rtx (mode); + dst = expand_ternary_op (mode, fma_optab, m1, m2, a, target, 0); + gcc_assert (dst != NULL); - gcc_assert (mcode != CODE_FOR_nothing && acode != CODE_FOR_nothing); - emit_insn (gen_mul (mreg, m1, m2)); - emit_insn (gen_add (dst, mreg, a)); - } - - else - emit_insn (gen_rtx_SET (VOIDmode, dst, - gen_rtx_PLUS (mode, - gen_rtx_MULT (mode, m1, m2), - a))); + if (dst != target) + emit_move_insn (target, dst); } -/* Generate a FMSUB instruction: - dst = (m1 * m2) - a +/* Generate a FMSUB instruction: dst = fma(m1, m2, -a). */ - generating different RTL based on the fused multiply/add switch. */ - static void -rs6000_emit_msub (rtx dst, rtx m1, rtx m2, rtx a) +rs6000_emit_msub (rtx target, rtx m1, rtx m2, rtx a) { - enum machine_mode mode = GET_MODE (dst); + enum machine_mode mode = GET_MODE (target); + rtx dst; - if (!TARGET_FUSED_MADD - || (mode == V4SFmode && VECTOR_UNIT_ALTIVEC_P (V4SFmode))) + /* Altivec does not support fms directly; + generate in terms of fma in that case. */ + if (optab_handler (fms_optab, mode) != CODE_FOR_nothing) + dst = expand_ternary_op (mode, fms_optab, m1, m2, a, target, 0); + else { - /* For the simple ops, use the generator function, rather than assuming - that the RTL is standard. 
*/ - enum insn_code mcode = optab_handler (smul_optab, mode); - enum insn_code scode = optab_handler (add_optab, mode); - gen_2arg_fn_t gen_mul = (gen_2arg_fn_t) GEN_FCN (mcode); - gen_2arg_fn_t gen_sub = (gen_2arg_fn_t) GEN_FCN (scode); - rtx mreg = gen_reg_rtx (mode); - - gcc_assert (mcode != CODE_FOR_nothing && scode != CODE_FOR_nothing); - emit_insn (gen_mul (mreg, m1, m2)); - emit_insn (gen_sub (dst, mreg, a)); + a = expand_unop (mode, neg_optab, a, NULL_RTX, 0); + dst = expand_ternary_op (mode, fma_optab, m1, m2, a, target, 0); } + gcc_assert (dst != NULL); - else - emit_insn (gen_rtx_SET (VOIDmode, dst, - gen_rtx_MINUS (mode, - gen_rtx_MULT (mode, m1, m2), - a))); + if (dst != target) + emit_move_insn (target, dst); } + +/* Generate a FNMSUB instruction: dst = -fma(m1, m2, -a). */ -/* Generate a FNMSUB instruction: - dst = - ((m1 * m2) - a) - - Which is equivalent to (except in the prescence of -0.0): - dst = a - (m1 * m2) - - generating different RTL based on the fast-math and fused multiply/add - switches. */ - static void rs6000_emit_nmsub (rtx dst, rtx m1, rtx m2, rtx a) { enum machine_mode mode = GET_MODE (dst); + rtx r; - if (!TARGET_FUSED_MADD) - { - /* For the simple ops, use the generator function, rather than assuming - that the RTL is standard. */ - enum insn_code mcode = optab_handler (smul_optab, mode); - enum insn_code scode = optab_handler (sub_optab, mode); - gen_2arg_fn_t gen_mul = (gen_2arg_fn_t) GEN_FCN (mcode); - gen_2arg_fn_t gen_sub = (gen_2arg_fn_t) GEN_FCN (scode); - rtx mreg = gen_reg_rtx (mode); + /* This is a tad more complicated, since the fnma_optab is for + a different expression: fma(-m1, m2, a), which is the same + thing except in the case of signed zeros. - gcc_assert (mcode != CODE_FOR_nothing && scode != CODE_FOR_nothing); - emit_insn (gen_mul (mreg, m1, m2)); - emit_insn (gen_sub (dst, a, mreg)); - } + Fortunately we know that if FMA is supported that FNMSUB is + also supported in the ISA. Just expand it directly. 
*/ - else - { - rtx m = gen_rtx_MULT (mode, m1, m2); + gcc_assert (optab_handler (fma_optab, mode) != CODE_FOR_nothing); - if (!HONOR_SIGNED_ZEROS (mode)) - emit_insn (gen_rtx_SET (VOIDmode, dst, gen_rtx_MINUS (mode, a, m))); - - else - emit_insn (gen_rtx_SET (VOIDmode, dst, - gen_rtx_NEG (mode, - gen_rtx_MINUS (mode, m, a)))); - } + r = gen_rtx_NEG (mode, a); + r = gen_rtx_FMA (mode, m1, m2, r); + r = gen_rtx_NEG (mode, r); + emit_insn (gen_rtx_SET (VOIDmode, dst, r)); } /* Newton-Raphson approximation of floating point divide with just 2 passes Index: gcc/config/rs6000/vsx.md =================================================================== --- gcc/config/rs6000/vsx.md (revision 166635) +++ gcc/config/rs6000/vsx.md (working copy) @@ -513,51 +513,12 @@ ;; Fused vector multiply/add instructions -;; Note we have a pattern for the multiply/add operations that uses unspec and -;; does not check -mfused-madd to allow users to use these ops when they know -;; they want the fused multiply/add. - -;; Fused multiply add. By default expand the FMA into (plus (mult)) to help -;; loop unrolling. Don't do negate multiply ops, because of complications with -;; honoring signed zero and fused-madd. 
- -(define_expand "vsx_fmadd4" - [(set (match_operand:VSX_B 0 "vsx_register_operand" "") - (plus:VSX_B - (mult:VSX_B (match_operand:VSX_B 1 "vsx_register_operand" "") - (match_operand:VSX_B 2 "vsx_register_operand" "")) - (match_operand:VSX_B 3 "vsx_register_operand" "")))] - "VECTOR_UNIT_VSX_P (mode)" -{ - if (!TARGET_FUSED_MADD) - { - emit_insn (gen_vsx_fmadd4_2 (operands[0], operands[1], - operands[2], operands[3])); - DONE; - } -}) - -(define_insn "*vsx_fmadd4_1" +(define_insn "*vsx_fma4" [(set (match_operand:VSX_B 0 "vsx_register_operand" "=,,?wa,?wa") - (plus:VSX_B - (mult:VSX_B + (fma:VSX_B (match_operand:VSX_B 1 "vsx_register_operand" "%,,wa,wa") - (match_operand:VSX_B 2 "vsx_register_operand" ",0,wa,0")) - (match_operand:VSX_B 3 "vsx_register_operand" "0,,0,wa")))] - "VECTOR_UNIT_VSX_P (mode) && TARGET_FUSED_MADD" - "@ - xmadda %x0,%x1,%x2 - xmaddm %x0,%x1,%x3 - xmadda %x0,%x1,%x2 - xmaddm %x0,%x1,%x3" - [(set_attr "type" "") - (set_attr "fp_type" "")]) - -(define_insn "vsx_fmadd4_2" - [(set (match_operand:VSX_B 0 "vsx_register_operand" "=,,?wa,?wa") - (fma:VSX_B (match_operand:VSX_B 1 "vsx_register_operand" "%,,wa,wa") - (match_operand:VSX_B 2 "vsx_register_operand" ",0,wa,0") - (match_operand:VSX_B 3 "vsx_register_operand" "0,,0,wa")))] + (match_operand:VSX_B 2 "vsx_register_operand" ",0,wa,0") + (match_operand:VSX_B 3 "vsx_register_operand" "0,,0,wa")))] "VECTOR_UNIT_VSX_P (mode)" "@ xmadda %x0,%x1,%x2 @@ -567,44 +528,13 @@ [(set_attr "type" "") (set_attr "fp_type" "")]) -(define_expand "vsx_fmsub4" - [(set (match_operand:VSX_B 0 "vsx_register_operand" "") - (minus:VSX_B - (mult:VSX_B (match_operand:VSX_B 1 "vsx_register_operand" "") - (match_operand:VSX_B 2 "vsx_register_operand" "")) - (match_operand:VSX_B 3 "vsx_register_operand" "")))] - "VECTOR_UNIT_VSX_P (mode)" -{ - if (!TARGET_FUSED_MADD) - { - emit_insn (gen_vsx_fmsub4_2 (operands[0], operands[1], - operands[2], operands[3])); - DONE; - } -}) - -(define_insn "*vsx_fmsub4_1" +(define_insn 
"*vsx_fms4" [(set (match_operand:VSX_B 0 "vsx_register_operand" "=,,?wa,?wa") - (minus:VSX_B - (mult:VSX_B + (fma:VSX_B (match_operand:VSX_B 1 "vsx_register_operand" "%,,wa,wa") - (match_operand:VSX_B 2 "vsx_register_operand" ",0,wa,0")) - (match_operand:VSX_B 3 "vsx_register_operand" "0,,0,wa")))] - "VECTOR_UNIT_VSX_P (mode) && TARGET_FUSED_MADD" - "@ - xmsuba %x0,%x1,%x2 - xmsubm %x0,%x1,%x3 - xmsuba %x0,%x1,%x2 - xmsubm %x0,%x1,%x3" - [(set_attr "type" "") - (set_attr "fp_type" "")]) - -(define_insn "vsx_fmsub4_2" - [(set (match_operand:VSX_B 0 "vsx_register_operand" "=,,?wa,?wa") - (fma:VSX_B (match_operand:VSX_B 1 "vsx_register_operand" "%,,wa,wa") - (match_operand:VSX_B 2 "vsx_register_operand" ",0,wa,0") - (neg:VSX_B - (match_operand:VSX_B 3 "vsx_register_operand" "0,,0,wa"))))] + (match_operand:VSX_B 2 "vsx_register_operand" ",0,wa,0") + (neg:VSX_B + (match_operand:VSX_B 3 "vsx_register_operand" "0,,0,wa"))))] "VECTOR_UNIT_VSX_P (mode)" "@ xmsuba %x0,%x1,%x2 @@ -614,7 +544,7 @@ [(set_attr "type" "") (set_attr "fp_type" "")]) -(define_insn "vsx_fnmadd4" +(define_insn "*vsx_nfma4" [(set (match_operand:VSX_B 0 "vsx_register_operand" "=,,?wa,?wa") (neg:VSX_B (fma:VSX_B @@ -630,49 +560,14 @@ [(set_attr "type" "") (set_attr "fp_type" "")]) -(define_insn "vsx_fnmadd4_1" +(define_insn "*vsx_nfms4" [(set (match_operand:VSX_B 0 "vsx_register_operand" "=,,?wa,?wa") (neg:VSX_B - (plus:VSX_B - (mult:VSX_B - (match_operand:VSX_B 1 "vsx_register_operand" ",,wa,wa") - (match_operand:VSX_B 2 "vsx_register_operand" ",0,wa,0")) - (match_operand:VSX_B 3 "vsx_register_operand" "0,,0,wa"))))] - "VECTOR_UNIT_VSX_P (mode) && TARGET_FUSED_MADD - && HONOR_SIGNED_ZEROS (DFmode)" - "@ - xnmadda %x0,%x1,%x2 - xnmaddm %x0,%x1,%x3 - xnmadda %x0,%x1,%x2 - xnmaddm %x0,%x1,%x3" - [(set_attr "type" "") - (set_attr "fp_type" "")]) - -(define_insn "vsx_fnmadd4_2" - [(set (match_operand:VSX_B 0 "vsx_register_operand" "=,,?wa,?wa") - (minus:VSX_B - (mult:VSX_B - (neg:VSX_B - (match_operand:VSX_B 
1 "gpc_reg_operand" ",,wa,wa")) - (match_operand:VSX_B 2 "gpc_reg_operand" ",0,wa,0")) - (match_operand:VSX_B 3 "vsx_register_operand" "0,,0,wa")))] - "VECTOR_UNIT_VSX_P (mode) && TARGET_FUSED_MADD - && !HONOR_SIGNED_ZEROS (DFmode)" - "@ - xnmadda %x0,%x1,%x2 - xnmaddm %x0,%x1,%x3 - xnmadda %x0,%x1,%x2 - xnmaddm %x0,%x1,%x3" - [(set_attr "type" "") - (set_attr "fp_type" "")]) - -(define_insn "vsx_fnmsub4" - [(set (match_operand:VSX_B 0 "vsx_register_operand" "=,,?wa,?wa") - (neg:VSX_B - (fma:VSX_B (match_operand:VSX_B 1 "vsx_register_operand" "%,,wa,wa") - (match_operand:VSX_B 2 "vsx_register_operand" ",0,wa,0") - (neg:VSX_B - (match_operand:VSX_B 3 "vsx_register_operand" "0,,0,wa")))))] + (fma:VSX_B + (match_operand:VSX_B 1 "vsx_register_operand" "%,,wa,wa") + (match_operand:VSX_B 2 "vsx_register_operand" ",0,wa,0") + (neg:VSX_B + (match_operand:VSX_B 3 "vsx_register_operand" "0,,0,wa")))))] "VECTOR_UNIT_VSX_P (mode)" "@ xnmsuba %x0,%x1,%x2 @@ -682,41 +577,6 @@ [(set_attr "type" "") (set_attr "fp_type" "")]) -(define_insn "vsx_fnmsub4_1" - [(set (match_operand:VSX_B 0 "vsx_register_operand" "=,,?wa,?wa") - (neg:VSX_B - (minus:VSX_B - (mult:VSX_B - (match_operand:VSX_B 1 "vsx_register_operand" "%,,wa,wa") - (match_operand:VSX_B 2 "vsx_register_operand" ",0,wa,0")) - (match_operand:VSX_B 3 "vsx_register_operand" "0,,0,wa"))))] - "VECTOR_UNIT_VSX_P (mode) && TARGET_FUSED_MADD - && HONOR_SIGNED_ZEROS (DFmode)" - "@ - xnmsuba %x0,%x1,%x2 - xnmsubm %x0,%x1,%x3 - xnmsuba %x0,%x1,%x2 - xnmsubm %x0,%x1,%x3" - [(set_attr "type" "") - (set_attr "fp_type" "")]) - -(define_insn "vsx_fnmsub4_2" - [(set (match_operand:VSX_B 0 "vsx_register_operand" "=,,?wa,?wa") - (minus:VSX_B - (match_operand:VSX_B 3 "vsx_register_operand" "0,,0,wa") - (mult:VSX_B - (match_operand:VSX_B 1 "vsx_register_operand" "%,,wa,wa") - (match_operand:VSX_B 2 "vsx_register_operand" ",0,wa,0"))))] - "VECTOR_UNIT_VSX_P (mode) && TARGET_FUSED_MADD - && !HONOR_SIGNED_ZEROS (DFmode)" - "@ - xnmsuba %x0,%x1,%x2 
- xnmsubm %x0,%x1,%x3 - xnmsuba %x0,%x1,%x2 - xnmsubm %x0,%x1,%x3" - [(set_attr "type" "") - (set_attr "fp_type" "")]) - ;; Vector conditional expressions (no scalar version for these instructions) (define_insn "vsx_eq" [(set (match_operand:VSX_F 0 "vsx_register_operand" "=,?wa") Index: gcc/config/rs6000/altivec.md =================================================================== --- gcc/config/rs6000/altivec.md (revision 166635) +++ gcc/config/rs6000/altivec.md (working copy) @@ -512,36 +512,10 @@ "vsel %0,%3,%2,%1" [(set_attr "type" "vecperm")]) -;; Fused multiply add. By default expand the FMA into (plus (mult)) to help -;; loop unrolling. Don't do negate multiply ops, because of complications with -;; honoring signed zero and fused-madd. +;; Fused multiply add. -(define_expand "altivec_vmaddfp" - [(set (match_operand:V4SF 0 "register_operand" "") - (plus:V4SF (mult:V4SF (match_operand:V4SF 1 "register_operand" "") - (match_operand:V4SF 2 "register_operand" "")) - (match_operand:V4SF 3 "register_operand" "")))] - "VECTOR_UNIT_ALTIVEC_P (V4SFmode)" -{ - if (!TARGET_FUSED_MADD) - { - emit_insn (gen_altivec_vmaddfp_2 (operands[0], operands[1], operands[2], - operands[3])); - DONE; - } -}) - -(define_insn "*altivec_vmaddfp_1" +(define_insn "*altivec_fmav4sf4" [(set (match_operand:V4SF 0 "register_operand" "=v") - (plus:V4SF (mult:V4SF (match_operand:V4SF 1 "register_operand" "v") - (match_operand:V4SF 2 "register_operand" "v")) - (match_operand:V4SF 3 "register_operand" "v")))] - "VECTOR_UNIT_ALTIVEC_P (V4SFmode) && TARGET_FUSED_MADD" - "vmaddfp %0,%1,%2,%3" - [(set_attr "type" "vecfloat")]) - -(define_insn "altivec_vmaddfp_2" - [(set (match_operand:V4SF 0 "register_operand" "=v") (fma:V4SF (match_operand:V4SF 1 "register_operand" "v") (match_operand:V4SF 2 "register_operand" "v") (match_operand:V4SF 3 "register_operand" "v")))] @@ -552,11 +526,11 @@ ;; We do multiply as a fused multiply-add with an add of a -0.0 vector. 
(define_expand "altivec_mulv4sf3" - [(use (match_operand:V4SF 0 "register_operand" "")) - (use (match_operand:V4SF 1 "register_operand" "")) - (use (match_operand:V4SF 2 "register_operand" ""))] + [(set (match_operand:V4SF 0 "register_operand" "") + (fma:V4SF (match_operand:V4SF 1 "register_operand" "") + (match_operand:V4SF 2 "register_operand" "") + (match_dup 3)))] "VECTOR_UNIT_ALTIVEC_P (V4SFmode)" - " { rtx neg0; @@ -565,11 +539,8 @@ emit_insn (gen_altivec_vspltisw (neg0, constm1_rtx)); emit_insn (gen_vashlv4si3 (neg0, neg0, neg0)); - /* Use the multiply-add. */ - emit_insn (gen_altivec_vmaddfp (operands[0], operands[1], operands[2], - gen_lowpart (V4SFmode, neg0))); - DONE; -}") + operands[3] = gen_lowpart (V4SFmode, neg0); +}) ;; 32-bit integer multiplication ;; A_high = Operand_0 & 0xFFFF0000 >> 16 @@ -653,7 +624,7 @@ }") ;; Fused multiply subtract -(define_insn "altivec_vnmsubfp" +(define_insn "*altivec_vnmsubfp" [(set (match_operand:V4SF 0 "register_operand" "=v") (neg:V4SF (fma:V4SF (match_operand:V4SF 1 "register_operand" "v") @@ -664,31 +635,6 @@ "vnmsubfp %0,%1,%2,%3" [(set_attr "type" "vecfloat")]) -(define_insn "*altivec_vnmsubfp_1" - [(set (match_operand:V4SF 0 "register_operand" "=v") - (neg:V4SF - (minus:V4SF - (mult:V4SF - (match_operand:V4SF 1 "register_operand" "v") - (match_operand:V4SF 2 "register_operand" "v")) - (match_operand:V4SF 3 "register_operand" "v"))))] - "VECTOR_UNIT_ALTIVEC_P (V4SFmode) && TARGET_FUSED_MADD - && HONOR_SIGNED_ZEROS (SFmode)" - "vnmsubfp %0,%1,%2,%3" - [(set_attr "type" "vecfloat")]) - -(define_insn "*altivec_vnmsubfp_2" - [(set (match_operand:V4SF 0 "register_operand" "=v") - (minus:V4SF - (match_operand:V4SF 3 "register_operand" "v") - (mult:V4SF - (match_operand:V4SF 1 "register_operand" "v") - (match_operand:V4SF 2 "register_operand" "v"))))] - "VECTOR_UNIT_ALTIVEC_P (V4SFmode) && TARGET_FUSED_MADD - && !HONOR_SIGNED_ZEROS (SFmode)" - "vnmsubfp %0,%1,%2,%3" - [(set_attr "type" "vecfloat")]) - (define_insn 
"altivec_vmsumum" [(set (match_operand:V4SI 0 "register_operand" "=v") (unspec:V4SI [(match_operand:VIshort 1 "register_operand" "v") Index: gcc/config/rs6000/rs6000.md =================================================================== --- gcc/config/rs6000/rs6000.md (revision 166635) +++ gcc/config/rs6000/rs6000.md (working copy) @@ -226,6 +226,16 @@ (DD "TARGET_DFP") (TD "TARGET_DFP")]) +; Any fma capable floating-point mode. +(define_mode_iterator FMA_F [ + (SF "TARGET_HARD_FLOAT && TARGET_FPRS && TARGET_SINGLE_FLOAT") + (DF "(TARGET_HARD_FLOAT && TARGET_FPRS && TARGET_DOUBLE_FLOAT) + || VECTOR_UNIT_VSX_P (DFmode)") + (V2SF "TARGET_PAIRED_FLOAT") + (V4SF "VECTOR_UNIT_ALTIVEC_OR_VSX_P (V4SFmode)") + (V2DF "VECTOR_UNIT_ALTIVEC_OR_VSX_P (V2DFmode)") + ]) + ; These modes do not fit in integer registers in 32-bit mode. ; but on e500v2, the gpr are 64 bit registers (define_mode_iterator DIFD [DI (DF "!TARGET_E500_DOUBLE") DD]) @@ -5845,28 +5855,17 @@ [(set_attr "type" "fp")]) ; builtin fmaf support -; If the user explicitly uses the fma builtin, don't convert this to -; (plus (mult op1 op2) op3) -(define_expand "fmasf4" - [(set (match_operand:SF 0 "gpc_reg_operand" "") - (fma:SF (match_operand:SF 1 "gpc_reg_operand" "") - (match_operand:SF 2 "gpc_reg_operand" "") - (match_operand:SF 3 "gpc_reg_operand" "")))] - "TARGET_HARD_FLOAT && TARGET_FPRS && TARGET_SINGLE_FLOAT" - "") - -(define_insn "fmasf4_fpr" +(define_insn "*fmasf4_fpr" [(set (match_operand:SF 0 "gpc_reg_operand" "=f") (fma:SF (match_operand:SF 1 "gpc_reg_operand" "f") (match_operand:SF 2 "gpc_reg_operand" "f") (match_operand:SF 3 "gpc_reg_operand" "f")))] "TARGET_HARD_FLOAT && TARGET_FPRS && TARGET_SINGLE_FLOAT" - "* { - return ((TARGET_POWERPC) - ? \"fmadds %0,%1,%2,%3\" - : \"{fma|fmadd} %0,%1,%2,%3\"); -}" + return (TARGET_POWERPC + ? 
"fmadds %0,%1,%2,%3" + : "{fma|fmadd} %0,%1,%2,%3"); +} [(set_attr "type" "fp") (set_attr "fp_type" "fp_maddsub_s")]) @@ -5876,168 +5875,42 @@ (match_operand:SF 2 "gpc_reg_operand" "f") (neg:SF (match_operand:SF 3 "gpc_reg_operand" "f"))))] "TARGET_HARD_FLOAT && TARGET_FPRS && TARGET_SINGLE_FLOAT" - "* { - return ((TARGET_POWERPC) - ? \"fmsubs %0,%1,%2,%3\" - : \"{fms|fmsub} %0,%1,%2,%3\"); -}" + return (TARGET_POWERPC + ? "fmsubs %0,%1,%2,%3" + : "{fms|fmsub} %0,%1,%2,%3"); +} [(set_attr "type" "fp") (set_attr "fp_type" "fp_maddsub_s")]) -(define_insn "*fnmasf4_fpr" +(define_insn "*nfmasf4_fpr" [(set (match_operand:SF 0 "gpc_reg_operand" "=f") (neg:SF (fma:SF (match_operand:SF 1 "gpc_reg_operand" "f") (match_operand:SF 2 "gpc_reg_operand" "f") (match_operand:SF 3 "gpc_reg_operand" "f"))))] "TARGET_HARD_FLOAT && TARGET_FPRS && TARGET_SINGLE_FLOAT" - "* { - return ((TARGET_POWERPC) - ? \"fnmadds %0,%1,%2,%3\" - : \"{fnma|fnmadd} %0,%1,%2,%3\"); -}" + return (TARGET_POWERPC + ? "fnmadds %0,%1,%2,%3" + : "{fnma|fnmadd} %0,%1,%2,%3"); +} [(set_attr "type" "fp") (set_attr "fp_type" "fp_maddsub_s")]) -(define_insn "*fnmssf4_fpr" +(define_insn "*nfmssf4_fpr" [(set (match_operand:SF 0 "gpc_reg_operand" "=f") (neg:SF (fma:SF (match_operand:SF 1 "gpc_reg_operand" "f") (match_operand:SF 2 "gpc_reg_operand" "f") (neg:SF (match_operand:SF 3 "gpc_reg_operand" "f")))))] "TARGET_HARD_FLOAT && TARGET_FPRS && TARGET_SINGLE_FLOAT" - "* { - return ((TARGET_POWERPC) - ? \"fnmsubs %0,%1,%2,%3\" - : \"{fnms|fnmsub} %0,%1,%2,%3\"); -}" + return (TARGET_POWERPC + ? 
"fnmsubs %0,%1,%2,%3" + : "{fnms|fnmsub} %0,%1,%2,%3"); +} [(set_attr "type" "fp") (set_attr "fp_type" "fp_maddsub_s")]) -; Fused multiply/add ops created by the combiner -(define_insn "*fmaddsf4_powerpc" - [(set (match_operand:SF 0 "gpc_reg_operand" "=f") - (plus:SF (mult:SF (match_operand:SF 1 "gpc_reg_operand" "%f") - (match_operand:SF 2 "gpc_reg_operand" "f")) - (match_operand:SF 3 "gpc_reg_operand" "f")))] - "TARGET_POWERPC && TARGET_HARD_FLOAT && TARGET_FPRS - && TARGET_SINGLE_FLOAT && TARGET_FUSED_MADD" - "fmadds %0,%1,%2,%3" - [(set_attr "type" "fp") - (set_attr "fp_type" "fp_maddsub_s")]) - -(define_insn "*fmaddsf4_power" - [(set (match_operand:SF 0 "gpc_reg_operand" "=f") - (plus:SF (mult:SF (match_operand:SF 1 "gpc_reg_operand" "%f") - (match_operand:SF 2 "gpc_reg_operand" "f")) - (match_operand:SF 3 "gpc_reg_operand" "f")))] - "! TARGET_POWERPC && TARGET_HARD_FLOAT && TARGET_FPRS && TARGET_FUSED_MADD" - "{fma|fmadd} %0,%1,%2,%3" - [(set_attr "type" "dmul")]) - -(define_insn "*fmsubsf4_powerpc" - [(set (match_operand:SF 0 "gpc_reg_operand" "=f") - (minus:SF (mult:SF (match_operand:SF 1 "gpc_reg_operand" "%f") - (match_operand:SF 2 "gpc_reg_operand" "f")) - (match_operand:SF 3 "gpc_reg_operand" "f")))] - "TARGET_POWERPC && TARGET_HARD_FLOAT && TARGET_FPRS - && TARGET_SINGLE_FLOAT && TARGET_FUSED_MADD" - "fmsubs %0,%1,%2,%3" - [(set_attr "type" "fp") - (set_attr "fp_type" "fp_maddsub_s")]) - -(define_insn "*fmsubsf4_power" - [(set (match_operand:SF 0 "gpc_reg_operand" "=f") - (minus:SF (mult:SF (match_operand:SF 1 "gpc_reg_operand" "%f") - (match_operand:SF 2 "gpc_reg_operand" "f")) - (match_operand:SF 3 "gpc_reg_operand" "f")))] - "! 
TARGET_POWERPC && TARGET_HARD_FLOAT && TARGET_FPRS && TARGET_FUSED_MADD" - "{fms|fmsub} %0,%1,%2,%3" - [(set_attr "type" "dmul")]) - -(define_insn "*fnmaddsf4_powerpc_1" - [(set (match_operand:SF 0 "gpc_reg_operand" "=f") - (neg:SF (plus:SF (mult:SF (match_operand:SF 1 "gpc_reg_operand" "%f") - (match_operand:SF 2 "gpc_reg_operand" "f")) - (match_operand:SF 3 "gpc_reg_operand" "f"))))] - "TARGET_POWERPC && TARGET_HARD_FLOAT && TARGET_FPRS && TARGET_FUSED_MADD - && TARGET_SINGLE_FLOAT" - "fnmadds %0,%1,%2,%3" - [(set_attr "type" "fp") - (set_attr "fp_type" "fp_maddsub_s")]) - -(define_insn "*fnmaddsf4_powerpc_2" - [(set (match_operand:SF 0 "gpc_reg_operand" "=f") - (minus:SF (mult:SF (neg:SF (match_operand:SF 1 "gpc_reg_operand" "f")) - (match_operand:SF 2 "gpc_reg_operand" "f")) - (match_operand:SF 3 "gpc_reg_operand" "f")))] - "TARGET_POWERPC && TARGET_SINGLE_FLOAT && TARGET_FPRS && TARGET_FUSED_MADD - && ! HONOR_SIGNED_ZEROS (SFmode)" - "fnmadds %0,%1,%2,%3" - [(set_attr "type" "fp") - (set_attr "fp_type" "fp_maddsub_s")]) - -(define_insn "*fnmaddsf4_power_1" - [(set (match_operand:SF 0 "gpc_reg_operand" "=f") - (neg:SF (plus:SF (mult:SF (match_operand:SF 1 "gpc_reg_operand" "%f") - (match_operand:SF 2 "gpc_reg_operand" "f")) - (match_operand:SF 3 "gpc_reg_operand" "f"))))] - "! TARGET_POWERPC && TARGET_HARD_FLOAT && TARGET_FPRS && TARGET_FUSED_MADD" - "{fnma|fnmadd} %0,%1,%2,%3" - [(set_attr "type" "dmul")]) - -(define_insn "*fnmaddsf4_power_2" - [(set (match_operand:SF 0 "gpc_reg_operand" "=f") - (minus:SF (mult:SF (neg:SF (match_operand:SF 1 "gpc_reg_operand" "f")) - (match_operand:SF 2 "gpc_reg_operand" "f")) - (match_operand:SF 3 "gpc_reg_operand" "f")))] - "! TARGET_POWERPC && TARGET_HARD_FLOAT && TARGET_FPRS && TARGET_FUSED_MADD - && ! 
HONOR_SIGNED_ZEROS (SFmode)" - "{fnma|fnmadd} %0,%1,%2,%3" - [(set_attr "type" "dmul")]) - -(define_insn "*fnmsubsf4_powerpc_1" - [(set (match_operand:SF 0 "gpc_reg_operand" "=f") - (neg:SF (minus:SF (mult:SF (match_operand:SF 1 "gpc_reg_operand" "%f") - (match_operand:SF 2 "gpc_reg_operand" "f")) - (match_operand:SF 3 "gpc_reg_operand" "f"))))] - "TARGET_POWERPC && TARGET_HARD_FLOAT && TARGET_FPRS && TARGET_FUSED_MADD - && TARGET_SINGLE_FLOAT" - "fnmsubs %0,%1,%2,%3" - [(set_attr "type" "fp") - (set_attr "fp_type" "fp_maddsub_s")]) - -(define_insn "*fnmsubsf4_powerpc_2" - [(set (match_operand:SF 0 "gpc_reg_operand" "=f") - (minus:SF (match_operand:SF 3 "gpc_reg_operand" "f") - (mult:SF (match_operand:SF 1 "gpc_reg_operand" "%f") - (match_operand:SF 2 "gpc_reg_operand" "f"))))] - "TARGET_POWERPC && TARGET_HARD_FLOAT && TARGET_FPRS && TARGET_FUSED_MADD - && TARGET_SINGLE_FLOAT && ! HONOR_SIGNED_ZEROS (SFmode)" - "fnmsubs %0,%1,%2,%3" - [(set_attr "type" "fp") - (set_attr "fp_type" "fp_maddsub_s")]) - -(define_insn "*fnmsubsf4_power_1" - [(set (match_operand:SF 0 "gpc_reg_operand" "=f") - (neg:SF (minus:SF (mult:SF (match_operand:SF 1 "gpc_reg_operand" "%f") - (match_operand:SF 2 "gpc_reg_operand" "f")) - (match_operand:SF 3 "gpc_reg_operand" "f"))))] - "! TARGET_POWERPC && TARGET_HARD_FLOAT && TARGET_FPRS && TARGET_FUSED_MADD" - "{fnms|fnmsub} %0,%1,%2,%3" - [(set_attr "type" "dmul")]) - -(define_insn "*fnmsubsf4_power_2" - [(set (match_operand:SF 0 "gpc_reg_operand" "=f") - (minus:SF (match_operand:SF 3 "gpc_reg_operand" "f") - (mult:SF (match_operand:SF 1 "gpc_reg_operand" "%f") - (match_operand:SF 2 "gpc_reg_operand" "f"))))] - "! TARGET_POWERPC && TARGET_HARD_FLOAT && TARGET_FPRS && TARGET_FUSED_MADD - && ! 
HONOR_SIGNED_ZEROS (SFmode)" - "{fnms|fnmsub} %0,%1,%2,%3" - [(set_attr "type" "dmul")]) - (define_expand "sqrtsf2" [(set (match_operand:SF 0 "gpc_reg_operand" "") (sqrt:SF (match_operand:SF 1 "gpc_reg_operand" "")))] @@ -6385,17 +6258,7 @@ [(set_attr "type" "fp")]) ; builtin fma support -; If the user explicitly uses the fma builtin, don't convert this to -; (plus (mult op1 op2) op3) -(define_expand "fmadf4" - [(set (match_operand:DF 0 "gpc_reg_operand" "") - (fma:DF (match_operand:DF 1 "gpc_reg_operand" "") - (match_operand:DF 2 "gpc_reg_operand" "") - (match_operand:DF 3 "gpc_reg_operand" "")))] - "TARGET_HARD_FLOAT && TARGET_FPRS && TARGET_DOUBLE_FLOAT" - "") - -(define_insn "fmadf4_fpr" +(define_insn "*fmadf4_fpr" [(set (match_operand:DF 0 "gpc_reg_operand" "=f") (fma:DF (match_operand:DF 1 "gpc_reg_operand" "f") (match_operand:DF 2 "gpc_reg_operand" "f") @@ -6417,7 +6280,7 @@ [(set_attr "type" "fp") (set_attr "fp_type" "fp_maddsub_s")]) -(define_insn "*fnmadf4_fpr" +(define_insn "*nfmadf4_fpr" [(set (match_operand:DF 0 "gpc_reg_operand" "=f") (neg:DF (fma:DF (match_operand:DF 1 "gpc_reg_operand" "f") (match_operand:DF 2 "gpc_reg_operand" "f") @@ -6428,7 +6291,7 @@ [(set_attr "type" "fp") (set_attr "fp_type" "fp_maddsub_s")]) -(define_insn "*fnmsdf4_fpr" +(define_insn "*nfmsdf4_fpr" [(set (match_operand:DF 0 "gpc_reg_operand" "=f") (neg:DF (fma:DF (match_operand:DF 1 "gpc_reg_operand" "f") (match_operand:DF 2 "gpc_reg_operand" "f") @@ -6439,73 +6302,6 @@ [(set_attr "type" "fp") (set_attr "fp_type" "fp_maddsub_s")]) -; Fused multiply/add ops created by the combiner -(define_insn "*fmadddf4_fpr" - [(set (match_operand:DF 0 "gpc_reg_operand" "=d") - (plus:DF (mult:DF (match_operand:DF 1 "gpc_reg_operand" "%d") - (match_operand:DF 2 "gpc_reg_operand" "d")) - (match_operand:DF 3 "gpc_reg_operand" "d")))] - "TARGET_HARD_FLOAT && TARGET_FPRS && TARGET_FUSED_MADD && TARGET_DOUBLE_FLOAT - && VECTOR_UNIT_NONE_P (DFmode)" - "{fma|fmadd} %0,%1,%2,%3" - [(set_attr "type" 
"dmul") - (set_attr "fp_type" "fp_maddsub_d")]) - -(define_insn "*fmsubdf4_fpr" - [(set (match_operand:DF 0 "gpc_reg_operand" "=d") - (minus:DF (mult:DF (match_operand:DF 1 "gpc_reg_operand" "%d") - (match_operand:DF 2 "gpc_reg_operand" "d")) - (match_operand:DF 3 "gpc_reg_operand" "d")))] - "TARGET_HARD_FLOAT && TARGET_FPRS && TARGET_FUSED_MADD && TARGET_DOUBLE_FLOAT - && VECTOR_UNIT_NONE_P (DFmode)" - "{fms|fmsub} %0,%1,%2,%3" - [(set_attr "type" "dmul") - (set_attr "fp_type" "fp_maddsub_d")]) - -(define_insn "*fnmadddf4_fpr_1" - [(set (match_operand:DF 0 "gpc_reg_operand" "=d") - (neg:DF (plus:DF (mult:DF (match_operand:DF 1 "gpc_reg_operand" "%d") - (match_operand:DF 2 "gpc_reg_operand" "d")) - (match_operand:DF 3 "gpc_reg_operand" "d"))))] - "TARGET_HARD_FLOAT && TARGET_FPRS && TARGET_FUSED_MADD && TARGET_DOUBLE_FLOAT - && VECTOR_UNIT_NONE_P (DFmode)" - "{fnma|fnmadd} %0,%1,%2,%3" - [(set_attr "type" "dmul") - (set_attr "fp_type" "fp_maddsub_d")]) - -(define_insn "*fnmadddf4_fpr_2" - [(set (match_operand:DF 0 "gpc_reg_operand" "=d") - (minus:DF (mult:DF (neg:DF (match_operand:DF 1 "gpc_reg_operand" "d")) - (match_operand:DF 2 "gpc_reg_operand" "d")) - (match_operand:DF 3 "gpc_reg_operand" "d")))] - "TARGET_HARD_FLOAT && TARGET_FPRS && TARGET_FUSED_MADD && TARGET_DOUBLE_FLOAT - && ! 
HONOR_SIGNED_ZEROS (DFmode) && VECTOR_UNIT_NONE_P (DFmode)"
-  "{fnma|fnmadd} %0,%1,%2,%3"
-  [(set_attr "type" "dmul")
-   (set_attr "fp_type" "fp_maddsub_d")])
-
-(define_insn "*fnmsubdf4_fpr_1"
-  [(set (match_operand:DF 0 "gpc_reg_operand" "=d")
-        (neg:DF (minus:DF (mult:DF (match_operand:DF 1 "gpc_reg_operand" "%d")
-                                   (match_operand:DF 2 "gpc_reg_operand" "d"))
-                          (match_operand:DF 3 "gpc_reg_operand" "d"))))]
-  "TARGET_HARD_FLOAT && TARGET_FPRS && TARGET_FUSED_MADD && TARGET_DOUBLE_FLOAT
-   && VECTOR_UNIT_NONE_P (DFmode)"
-  "{fnms|fnmsub} %0,%1,%2,%3"
-  [(set_attr "type" "dmul")
-   (set_attr "fp_type" "fp_maddsub_d")])
-
-(define_insn "*fnmsubdf4_fpr_2"
-  [(set (match_operand:DF 0 "gpc_reg_operand" "=d")
-        (minus:DF (match_operand:DF 3 "gpc_reg_operand" "d")
-                  (mult:DF (match_operand:DF 1 "gpc_reg_operand" "%d")
-                           (match_operand:DF 2 "gpc_reg_operand" "d"))))]
-  "TARGET_HARD_FLOAT && TARGET_FPRS && TARGET_FUSED_MADD && TARGET_DOUBLE_FLOAT
-   && ! HONOR_SIGNED_ZEROS (DFmode) && VECTOR_UNIT_NONE_P (DFmode)"
-  "{fnms|fnmsub} %0,%1,%2,%3"
-  [(set_attr "type" "dmul")
-   (set_attr "fp_type" "fp_maddsub_d")])
-
 (define_expand "sqrtdf2"
   [(set (match_operand:DF 0 "gpc_reg_operand" "")
         (sqrt:DF (match_operand:DF 1 "gpc_reg_operand" "")))]
@@ -16310,7 +16106,74 @@
   [(set_attr "type" "integer")])
 
+;; Builtin fma support.  Handle
+;; Note that the conditions for expansion are in the FMA_F iterator.
+(define_expand "fma<mode>4"
+  [(set (match_operand:FMA_F 0 "register_operand" "")
+        (fma:FMA_F
+          (match_operand:FMA_F 1 "register_operand" "")
+          (match_operand:FMA_F 2 "register_operand" "")
+          (match_operand:FMA_F 3 "register_operand" "")))]
+  ""
+  "")
+
+; Altivec only has fma and nfms.
+(define_expand "fms<mode>4"
+  [(set (match_operand:FMA_F 0 "register_operand" "")
+        (fma:FMA_F
+          (match_operand:FMA_F 1 "register_operand" "")
+          (match_operand:FMA_F 2 "register_operand" "")
+          (neg:FMA_F (match_operand:FMA_F 3 "register_operand" ""))))]
+  "!VECTOR_UNIT_ALTIVEC_P (<MODE>mode)"
+  "")
+
+;; If signed zeros are ignored, -(a * b - c) = -a * b + c.
+(define_expand "fnma<mode>4"
+  [(set (match_operand:FMA_F 0 "register_operand" "")
+        (neg:FMA_F
+          (fma:FMA_F
+            (match_operand:FMA_F 1 "register_operand" "")
+            (match_operand:FMA_F 2 "register_operand" "")
+            (neg:FMA_F (match_operand:FMA_F 3 "register_operand" "")))))]
+  "!HONOR_SIGNED_ZEROS (<MODE>mode)"
+  "")
+
+;; If signed zeros are ignored, -(a * b + c) = -a * b - c.
+(define_expand "fnms<mode>4"
+  [(set (match_operand:FMA_F 0 "register_operand" "")
+        (neg:FMA_F
+          (fma:FMA_F
+            (match_operand:FMA_F 1 "register_operand" "")
+            (match_operand:FMA_F 2 "register_operand" "")
+            (match_operand:FMA_F 3 "register_operand" ""))))]
+  "!HONOR_SIGNED_ZEROS (<MODE>mode) && !VECTOR_UNIT_ALTIVEC_P (<MODE>mode)"
+  "")
+
+; Not an official optab name, but used from builtins.
+(define_expand "nfma<mode>4"
+  [(set (match_operand:FMA_F 0 "register_operand" "")
+        (neg:FMA_F
+          (fma:FMA_F
+            (match_operand:FMA_F 1 "register_operand" "")
+            (match_operand:FMA_F 2 "register_operand" "")
+            (match_operand:FMA_F 3 "register_operand" ""))))]
+  "!VECTOR_UNIT_ALTIVEC_P (<MODE>mode)"
+  "")
+
+; Not an official optab name, but used from builtins.
+(define_expand "nfms<mode>4"
+  [(set (match_operand:FMA_F 0 "register_operand" "")
+        (neg:FMA_F
+          (fma:FMA_F
+            (match_operand:FMA_F 1 "register_operand" "")
+            (match_operand:FMA_F 2 "register_operand" "")
+            (neg:FMA_F (match_operand:FMA_F 3 "register_operand" "")))))]
+  ""
+  "")
+
+
+
 (include "sync.md")
 (include "vector.md")
 (include "vsx.md")