Message ID: 54D20CB2.4070200@arm.com
State: New
Ping. https://gcc.gnu.org/ml/gcc-patches/2015-02/msg00232.html

Btw, sorry if the diff looks hard to parse. Some patterns are deleted and
replaced with similar-looking ones, which makes the diff look weird. I've
tried a few diff algorithms but this is the best I could get.

Kyrill

On 04/02/15 12:12, Kyrill Tkachov wrote:
> Hi all,
>
> This patch improves the vc<cond> patterns in neon.md to use proper RTL
> operations rather than UNSPECs.
> It is done in a similar way to the analogous aarch64 operations, i.e.
> vceq is expressed as
> (neg (eq (...) (...)))
> since we want to write all 1s to the result element when 'eq' holds and
> 0s otherwise.
>
> The catch is that the floating-point comparisons can only be expanded to
> the RTL codes when -funsafe-math-optimizations is given, and they must
> continue to use the UNSPECs otherwise.
> For this I've created a define_expand that generates
> the correct RTL depending on -funsafe-math-optimizations and two
> define_insns to match the result: one using the RTL codes and one using
> UNSPECs.
>
> I've also compressed some of the patterns together using iterators for
> the [eq gt ge le lt] cases.
> NOTE: before this patch we would never generate
> 'vclt.<type> dm, dn, dp' instructions for le and lt, only
> 'vclt.<type> dm, dn, #0'.
> With this patch we can now generate 'vclt.<type> dm, dn, dp' assembly.
> According to the ARM ARM this is just a pseudo-instruction that maps to
> vcgt with the operands swapped around.
> I've confirmed that gas supports this form.
>
> The vcage and vcagt patterns are rewritten to use the form:
> (neg
>   (<cond>
>     (abs (...))
>     (abs (...))))
>
> and condensed together using iterators as well.
>
> Bootstrapped and tested on arm-none-linux-gnueabihf, made sure that the
> advanced-simd-intrinsics testsuite is passing
> (it did catch some bugs during development of this patch) and tried out
> other NEON intrinsics codebases.
>
> The test gcc.target/arm/neon/pr51534.c now generates 'vclt.<type> dn,
> dm, #0' instructions where appropriate instead of the previous vmov of
> #0 into a temp and then a 'vcgt.<type> dn, temp, dm'.
> I think that is correct behaviour since the test was trying to make sure
> that we didn't generate a .u<size>-typed comparison with #0, which is
> what the PR was talking about (from what I can gather).
>
> What do people think of this approach?
> I'm proposing this for next stage1, of course.
>
> Thanks,
> Kyrill
>
>
> 2015-02-04  Kyrylo Tkachov  <kyrylo.tkachov@arm.com>
>
>     * config/arm/iterators.md (GTGE, GTUGEU, COMPARISONS): New code
>     iterators.
>     (cmp_op, cmp_type): New code attributes.
>     (NEON_VCMP, NEON_VACMP): New int iterators.
>     (cmp_op_unsp): New int attribute.
>     * config/arm/neon.md (neon_vc<cmp_op><mode>): New define_expand.
>     (neon_vceq<mode>): Delete.
>     (neon_vc<cmp_op><mode>_insn): New pattern.
>     (neon_vc<cmp_op_unsp><mode>_insn_unspec): Likewise.
>     (neon_vcgeu<mode>): Delete.
>     (neon_vcle<mode>): Likewise.
>     (neon_vclt<mode>): Likewise.
>     (neon_vcage<mode>): Likewise.
>     (neon_vcagt<mode>): Likewise.
>     (neon_vca<cmp_op><mode>): New define_expand.
>     (neon_vca<cmp_op><mode>_insn): New pattern.
>     (neon_vca<cmp_op_unsp><mode>_insn_unspec): Likewise.
>
> 2015-02-04  Kyrylo Tkachov  <kyrylo.tkachov@arm.com>
>
>     * gcc.target/arm/neon/pr51534.c: Update vcg* scan-assembly patterns
>     to look for vcl* where appropriate.
Ping now that stage1 is open. https://gcc.gnu.org/ml/gcc-patches/2015-02/msg00232.html

Thanks,
Kyrill

On 04/02/15 12:12, Kyrill Tkachov wrote:
> [patch description and ChangeLog quoted unchanged from the original posting above]
Ping. https://gcc.gnu.org/ml/gcc-patches/2015-02/msg00232.html

Thanks,
Kyrill

On 04/02/15 12:12, Kyrill Tkachov wrote:
> [patch description and ChangeLog quoted unchanged from the original posting above]
On Wed, Feb 4, 2015 at 12:12 PM, Kyrill Tkachov <kyrylo.tkachov@arm.com> wrote:
> [...]
>
> What do people think of this approach?
> I'm proposing this for next stage1, of course.

This is OK - thanks.

Ramana

> Thanks,
> Kyrill
>
> [ChangeLog quoted unchanged from the original posting above]
diff --git a/gcc/config/arm/iterators.md b/gcc/config/arm/iterators.md
index f7f8ab7..66f3f4d 100644
--- a/gcc/config/arm/iterators.md
+++ b/gcc/config/arm/iterators.md
@@ -181,6 +181,15 @@ (define_mode_iterator VPF [V8QI V16QI V2SF V4SF])
 ;; compare a second time.
 (define_code_iterator LTUGEU [ltu geu])
 
+;; The signed gt, ge comparisons
+(define_code_iterator GTGE [gt ge])
+
+;; The unsigned gt, ge comparisons
+(define_code_iterator GTUGEU [gtu geu])
+
+;; Comparisons for vc<cmp>
+(define_code_iterator COMPARISONS [eq gt ge le lt])
+
 ;; A list of ...
 (define_code_iterator ior_xor [ior xor])
 
@@ -214,6 +223,11 @@ (define_code_attr t2_binop0
 (define_code_attr arith_shift_insn
   [(plus "add") (minus "rsb") (ior "orr") (xor "eor") (and "and")])
 
+(define_code_attr cmp_op [(eq "eq") (gt "gt") (ge "ge") (lt "lt") (le "le")
+                          (gtu "gt") (geu "ge")])
+
+(define_code_attr cmp_type [(eq "i") (gt "s") (ge "s") (lt "s") (le "s")])
+
 ;;----------------------------------------------------------------------------
 ;; Int iterators
 ;;----------------------------------------------------------------------------
@@ -221,6 +235,10 @@ (define_code_attr arith_shift_insn
 (define_int_iterator VRINT [UNSPEC_VRINTZ UNSPEC_VRINTP UNSPEC_VRINTM
                             UNSPEC_VRINTR UNSPEC_VRINTX UNSPEC_VRINTA])
 
+(define_int_iterator NEON_VCMP [UNSPEC_VCEQ UNSPEC_VCGT UNSPEC_VCGE UNSPEC_VCLT UNSPEC_VCLE])
+
+(define_int_iterator NEON_VACMP [UNSPEC_VCAGE UNSPEC_VCAGT])
+
 (define_int_iterator VCVT [UNSPEC_VRINTP UNSPEC_VRINTM UNSPEC_VRINTA])
 
 (define_int_iterator NEON_VRINT [UNSPEC_NVRINTP UNSPEC_NVRINTZ UNSPEC_NVRINTM
@@ -677,6 +695,11 @@ (define_int_attr sup [
 ])
 
+(define_int_attr cmp_op_unsp [(UNSPEC_VCEQ "eq") (UNSPEC_VCGT "gt")
+                              (UNSPEC_VCGE "ge") (UNSPEC_VCLE "le")
+                              (UNSPEC_VCLT "lt") (UNSPEC_VCAGE "ge")
+                              (UNSPEC_VCAGT "gt")])
+
 (define_int_attr r [
   (UNSPEC_VRHADD_S "r") (UNSPEC_VRHADD_U "r")
   (UNSPEC_VHADD_S "") (UNSPEC_VHADD_U "")
diff --git a/gcc/config/arm/neon.md b/gcc/config/arm/neon.md
index 63c327e..445df2a 100644
--- a/gcc/config/arm/neon.md
+++ b/gcc/config/arm/neon.md
@@ -2200,134 +2200,140 @@ (define_insn "neon_v<r>subhn<mode>"
   [(set_attr "type" "neon_sub_halve_narrow_q")]
 )
 
-(define_insn "neon_vceq<mode>"
-  [(set (match_operand:<V_cmp_result> 0 "s_register_operand" "=w,w")
-        (unspec:<V_cmp_result>
-          [(match_operand:VDQW 1 "s_register_operand" "w,w")
-           (match_operand:VDQW 2 "reg_or_zero_operand" "w,Dz")]
-          UNSPEC_VCEQ))]
+;; These may expand to an UNSPEC pattern when a floating point mode is used
+;; without unsafe math optimizations.
+(define_expand "neon_vc<cmp_op><mode>"
+  [(match_operand:<V_cmp_result> 0 "s_register_operand" "=w,w")
+   (neg:<V_cmp_result>
+     (COMPARISONS:VDQW (match_operand:VDQW 1 "s_register_operand" "w,w")
+                       (match_operand:VDQW 2 "reg_or_zero_operand" "w,Dz")))]
   "TARGET_NEON"
-  "@
-  vceq.<V_if_elem>\t%<V_reg>0, %<V_reg>1, %<V_reg>2
-  vceq.<V_if_elem>\t%<V_reg>0, %<V_reg>1, #0"
-  [(set (attr "type")
-      (if_then_else (match_test "<Is_float_mode>")
-                    (const_string "neon_fp_compare_s<q>")
-                    (if_then_else (match_operand 2 "zero_operand")
-                      (const_string "neon_compare_zero<q>")
-                      (const_string "neon_compare<q>"))))]
+  {
+    /* For FP comparisons use UNSPECS unless -funsafe-math-optimizations
+       are enabled.  */
+    if (GET_MODE_CLASS (<MODE>mode) == MODE_VECTOR_FLOAT
+        && !flag_unsafe_math_optimizations)
+      {
+        /* We don't just emit a gen_neon_vc<cmp_op><mode>_insn_unspec because
+           we define gen_neon_vceq<mode>_insn_unspec only for float modes
+           whereas this expander iterates over the integer modes as well,
+           but we will never expand to UNSPECs for the integer comparisons.  */
+        switch (<MODE>mode)
+          {
+            case V2SFmode:
+              emit_insn (gen_neon_vc<cmp_op>v2sf_insn_unspec (operands[0],
+                                                              operands[1],
+                                                              operands[2]));
+              break;
+            case V4SFmode:
+              emit_insn (gen_neon_vc<cmp_op>v4sf_insn_unspec (operands[0],
+                                                              operands[1],
+                                                              operands[2]));
+              break;
+            default:
+              gcc_unreachable ();
+          }
+      }
+    else
+      emit_insn (gen_neon_vc<cmp_op><mode>_insn (operands[0],
+                                                 operands[1],
+                                                 operands[2]));
+    DONE;
+  }
 )
 
-(define_insn "neon_vcge<mode>"
+(define_insn "neon_vc<cmp_op><mode>_insn"
   [(set (match_operand:<V_cmp_result> 0 "s_register_operand" "=w,w")
-        (unspec:<V_cmp_result>
-          [(match_operand:VDQW 1 "s_register_operand" "w,w")
-           (match_operand:VDQW 2 "reg_or_zero_operand" "w,Dz")]
-          UNSPEC_VCGE))]
-  "TARGET_NEON"
-  "@
-  vcge.<V_s_elem>\t%<V_reg>0, %<V_reg>1, %<V_reg>2
-  vcge.<V_s_elem>\t%<V_reg>0, %<V_reg>1, #0"
+        (neg:<V_cmp_result>
+          (COMPARISONS:<V_cmp_result>
+            (match_operand:VDQW 1 "s_register_operand" "w,w")
+            (match_operand:VDQW 2 "reg_or_zero_operand" "w,Dz"))))]
+  "TARGET_NEON && !(GET_MODE_CLASS (<MODE>mode) == MODE_VECTOR_FLOAT
+                    && !flag_unsafe_math_optimizations)"
+  {
+    char pattern[100];
+    sprintf (pattern, "vc<cmp_op>.%s%%#<V_sz_elem>\t%%<V_reg>0,"
+                      " %%<V_reg>1, %s",
+             GET_MODE_CLASS (<MODE>mode) == MODE_VECTOR_FLOAT
+               ? "f" : "<cmp_type>",
+             which_alternative == 0
+               ? "%<V_reg>2" : "#0");
    output_asm_insn (pattern, operands);
+    return "";
+  }
   [(set (attr "type")
-      (if_then_else (match_test "<Is_float_mode>")
-                    (const_string "neon_fp_compare_s<q>")
-                    (if_then_else (match_operand 2 "zero_operand")
-                      (const_string "neon_compare_zero<q>")
-                      (const_string "neon_compare<q>"))))]
-)
-
-(define_insn "neon_vcgeu<mode>"
-  [(set (match_operand:<V_cmp_result> 0 "s_register_operand" "=w")
-        (unspec:<V_cmp_result>
-          [(match_operand:VDQIW 1 "s_register_operand" "w")
-           (match_operand:VDQIW 2 "s_register_operand" "w")]
-          UNSPEC_VCGEU))]
-  "TARGET_NEON"
-  "vcge.u%#<V_sz_elem>\t%<V_reg>0, %<V_reg>1, %<V_reg>2"
-  [(set_attr "type" "neon_compare<q>")]
+      (if_then_else (match_operand 2 "zero_operand")
+        (const_string "neon_compare_zero<q>")
+        (const_string "neon_compare<q>")))]
 )
 
-(define_insn "neon_vcgt<mode>"
+(define_insn "neon_vc<cmp_op_unsp><mode>_insn_unspec"
   [(set (match_operand:<V_cmp_result> 0 "s_register_operand" "=w,w")
         (unspec:<V_cmp_result>
-          [(match_operand:VDQW 1 "s_register_operand" "w,w")
-           (match_operand:VDQW 2 "reg_or_zero_operand" "w,Dz")]
-          UNSPEC_VCGT))]
+          [(match_operand:VCVTF 1 "s_register_operand" "w,w")
+           (match_operand:VCVTF 2 "reg_or_zero_operand" "w,Dz")]
+          NEON_VCMP))]
   "TARGET_NEON"
-  "@
-  vcgt.<V_s_elem>\t%<V_reg>0, %<V_reg>1, %<V_reg>2
-  vcgt.<V_s_elem>\t%<V_reg>0, %<V_reg>1, #0"
-  [(set (attr "type")
-      (if_then_else (match_test "<Is_float_mode>")
-                    (const_string "neon_fp_compare_s<q>")
-                    (if_then_else (match_operand 2 "zero_operand")
-                      (const_string "neon_compare_zero<q>")
-                      (const_string "neon_compare<q>"))))]
+  {
+    char pattern[100];
+    sprintf (pattern, "vc<cmp_op_unsp>.f%%#<V_sz_elem>\t%%<V_reg>0,"
+                      " %%<V_reg>1, %s",
+             which_alternative == 0
+               ? "%<V_reg>2" : "#0");
+    output_asm_insn (pattern, operands);
+    return "";
+  }
+  [(set_attr "type" "neon_fp_compare_s<q>")]
 )
 
-(define_insn "neon_vcgtu<mode>"
+(define_insn "neon_vc<cmp_op>u<mode>"
   [(set (match_operand:<V_cmp_result> 0 "s_register_operand" "=w")
-        (unspec:<V_cmp_result>
-          [(match_operand:VDQIW 1 "s_register_operand" "w")
-           (match_operand:VDQIW 2 "s_register_operand" "w")]
-          UNSPEC_VCGTU))]
+        (neg:<V_cmp_result>
+          (GTUGEU:<V_cmp_result>
+            (match_operand:VDQIW 1 "s_register_operand" "w")
+            (match_operand:VDQIW 2 "s_register_operand" "w"))))]
   "TARGET_NEON"
-  "vcgt.u%#<V_sz_elem>\t%<V_reg>0, %<V_reg>1, %<V_reg>2"
+  "vc<cmp_op>.u%#<V_sz_elem>\t%<V_reg>0, %<V_reg>1, %<V_reg>2"
   [(set_attr "type" "neon_compare<q>")]
 )
 
-;; VCLE and VCLT only support comparisons with immediate zero (register
-;; variants are VCGE and VCGT with operands reversed).
-
-(define_insn "neon_vcle<mode>"
-  [(set (match_operand:<V_cmp_result> 0 "s_register_operand" "=w")
-        (unspec:<V_cmp_result>
-          [(match_operand:VDQW 1 "s_register_operand" "w")
-           (match_operand:VDQW 2 "zero_operand" "Dz")]
-          UNSPEC_VCLE))]
+(define_expand "neon_vca<cmp_op><mode>"
+  [(set (match_operand:<V_cmp_result> 0 "s_register_operand")
+        (neg:<V_cmp_result>
+          (GTGE:<V_cmp_result>
+            (abs:VCVTF (match_operand:VCVTF 1 "s_register_operand"))
+            (abs:VCVTF (match_operand:VCVTF 2 "s_register_operand")))))]
   "TARGET_NEON"
-  "vcle.<V_s_elem>\t%<V_reg>0, %<V_reg>1, #0"
-  [(set (attr "type")
-      (if_then_else (match_test "<Is_float_mode>")
-                    (const_string "neon_fp_compare_s<q>")
-                    (if_then_else (match_operand 2 "zero_operand")
-                      (const_string "neon_compare_zero<q>")
-                      (const_string "neon_compare<q>"))))]
-)
-
-(define_insn "neon_vclt<mode>"
-  [(set (match_operand:<V_cmp_result> 0 "s_register_operand" "=w")
-        (unspec:<V_cmp_result>
-          [(match_operand:VDQW 1 "s_register_operand" "w")
-           (match_operand:VDQW 2 "zero_operand" "Dz")]
-          UNSPEC_VCLT))]
-  "TARGET_NEON"
-  "vclt.<V_s_elem>\t%<V_reg>0, %<V_reg>1, #0"
-  [(set (attr "type")
-      (if_then_else (match_test "<Is_float_mode>")
-                    (const_string "neon_fp_compare_s<q>")
-                    (if_then_else (match_operand 2 "zero_operand")
-                      (const_string "neon_compare_zero<q>")
-                      (const_string "neon_compare<q>"))))]
+  {
+    if (flag_unsafe_math_optimizations)
+      emit_insn (gen_neon_vca<cmp_op><mode>_insn (operands[0], operands[1],
+                                                  operands[2]));
+    else
+      emit_insn (gen_neon_vca<cmp_op><mode>_insn_unspec (operands[0],
+                                                         operands[1],
+                                                         operands[2]));
+    DONE;
+  }
 )
 
-(define_insn "neon_vcage<mode>"
+(define_insn "neon_vca<cmp_op><mode>_insn"
   [(set (match_operand:<V_cmp_result> 0 "s_register_operand" "=w")
-        (unspec:<V_cmp_result> [(match_operand:VCVTF 1 "s_register_operand" "w")
-                                (match_operand:VCVTF 2 "s_register_operand" "w")]
-                               UNSPEC_VCAGE))]
-  "TARGET_NEON"
-  "vacge.<V_if_elem>\t%<V_reg>0, %<V_reg>1, %<V_reg>2"
+        (neg:<V_cmp_result>
+          (GTGE:<V_cmp_result>
+            (abs:VCVTF (match_operand:VCVTF 1 "s_register_operand" "w"))
+            (abs:VCVTF (match_operand:VCVTF 2 "s_register_operand" "w")))))]
+  "TARGET_NEON && flag_unsafe_math_optimizations"
+  "vac<cmp_op>.<V_if_elem>\t%<V_reg>0, %<V_reg>1, %<V_reg>2"
   [(set_attr "type" "neon_fp_compare_s<q>")]
 )
 
-(define_insn "neon_vcagt<mode>"
+(define_insn "neon_vca<cmp_op_unsp><mode>_insn_unspec"
   [(set (match_operand:<V_cmp_result> 0 "s_register_operand" "=w")
         (unspec:<V_cmp_result> [(match_operand:VCVTF 1 "s_register_operand" "w")
                                 (match_operand:VCVTF 2 "s_register_operand" "w")]
-                               UNSPEC_VCAGT))]
+                               NEON_VACMP))]
   "TARGET_NEON"
-  "vacgt.<V_if_elem>\t%<V_reg>0, %<V_reg>1, %<V_reg>2"
+  "vac<cmp_op_unsp>.<V_if_elem>\t%<V_reg>0, %<V_reg>1, %<V_reg>2"
   [(set_attr "type" "neon_fp_compare_s<q>")]
 )
diff --git a/gcc/testsuite/gcc.target/arm/neon/pr51534.c b/gcc/testsuite/gcc.target/arm/neon/pr51534.c
index 71cbb05..074bbd4 100644
--- a/gcc/testsuite/gcc.target/arm/neon/pr51534.c
+++ b/gcc/testsuite/gcc.target/arm/neon/pr51534.c
@@ -58,18 +58,18 @@ GEN_COND_TESTS(vceq)
 /* { dg-final { scan-assembler-times "vcge\.u16\[ \]+\[qQ\]\[0-9\]+, \[qQ\]\[0-9\]+, \[qQ\]\[0-9\]+" 2 } } */
 /* { dg-final { scan-assembler "vcge\.s32\[ \]+\[qQ\]\[0-9\]+, \[qQ\]\[0-9\]+, #0" } } */
 /* { dg-final { scan-assembler-times "vcge\.u32\[ \]+\[qQ\]\[0-9\]+, \[qQ\]\[0-9\]+, \[qQ\]\[0-9\]+" 2 } } */
-/* { dg-final { scan-assembler "vcgt\.s8\[ \]+\[dD\]\[0-9\]+, \[dD\]\[0-9\]+, \[dD\]\[0-9\]+" } } */
-/* { dg-final { scan-assembler "vcgt\.s16\[ \]+\[dD\]\[0-9\]+, \[dD\]\[0-9\]+, \[dD\]\[0-9\]+" } } */
-/* { dg-final { scan-assembler "vcgt\.s32\[ \]+\[dD\]\[0-9\]+, \[dD\]\[0-9\]+, \[dD\]\[0-9\]+" } } */
-/* { dg-final { scan-assembler "vcgt\.s8\[ \]+\[qQ\]\[0-9\]+, \[qQ\]\[0-9\]+, \[qQ\]\[0-9\]+" } } */
-/* { dg-final { scan-assembler "vcgt\.s16\[ \]+\[qQ\]\[0-9\]+, \[qQ\]\[0-9\]+, \[qQ\]\[0-9\]+" } } */
-/* { dg-final { scan-assembler "vcgt\.s32\[ \]+\[qQ\]\[0-9\]+, \[qQ\]\[0-9\]+, \[qQ\]\[0-9\]+" } } */
-/* { dg-final { scan-assembler "vcge\.s8\[ \]+\[dD\]\[0-9\]+, \[dD\]\[0-9\]+, \[dD\]\[0-9\]+" } } */
-/* { dg-final { scan-assembler "vcge\.s16\[ \]+\[dD\]\[0-9\]+, \[dD\]\[0-9\]+, \[dD\]\[0-9\]+" } } */
-/* { dg-final { scan-assembler "vcge\.s32\[ \]+\[dD\]\[0-9\]+, \[dD\]\[0-9\]+, \[dD\]\[0-9\]+" } } */
-/* { dg-final { scan-assembler "vcge\.s8\[ \]+\[qQ\]\[0-9\]+, \[qQ\]\[0-9\]+, \[qQ\]\[0-9\]+" } } */
-/* { dg-final { scan-assembler "vcge\.s16\[ \]+\[qQ\]\[0-9\]+, \[qQ\]\[0-9\]+, \[qQ\]\[0-9\]+" } } */
-/* { dg-final { scan-assembler "vcge\.s32\[ \]+\[qQ\]\[0-9\]+, \[qQ\]\[0-9\]+, \[qQ\]\[0-9\]+" } } */
+/* { dg-final { scan-assembler "vclt\.s8\[ \]+\[dD\]\[0-9\]+, \[dD\]\[0-9\]+, #0" } } */
+/* { dg-final { scan-assembler "vclt\.s16\[ \]+\[dD\]\[0-9\]+, \[dD\]\[0-9\]+, #0" } } */
+/* { dg-final { scan-assembler "vclt\.s32\[ \]+\[dD\]\[0-9\]+, \[dD\]\[0-9\]+, #0" } } */
+/* { dg-final { scan-assembler "vclt\.s8\[ \]+\[qQ\]\[0-9\]+, \[qQ\]\[0-9\]+, #0" } } */
+/* { dg-final { scan-assembler "vclt\.s16\[ \]+\[qQ\]\[0-9\]+, \[qQ\]\[0-9\]+, #0" } } */
+/* { dg-final { scan-assembler "vclt\.s32\[ \]+\[qQ\]\[0-9\]+, \[qQ\]\[0-9\]+, #0" } } */
+/* { dg-final { scan-assembler "vcle\.s8\[ \]+\[dD\]\[0-9\]+, \[dD\]\[0-9\]+, #0" } } */
+/* { dg-final { scan-assembler "vcle\.s16\[ \]+\[dD\]\[0-9\]+, \[dD\]\[0-9\]+, #0" } } */
+/* { dg-final { scan-assembler "vcle\.s32\[ \]+\[dD\]\[0-9\]+, \[dD\]\[0-9\]+, #0" } } */
+/* { dg-final { scan-assembler "vcle\.s8\[ \]+\[qQ\]\[0-9\]+, \[qQ\]\[0-9\]+, #0" } } */
+/* { dg-final { scan-assembler "vcle\.s16\[ \]+\[qQ\]\[0-9\]+, \[qQ\]\[0-9\]+, #0" } } */
+/* { dg-final { scan-assembler "vcle\.s32\[ \]+\[qQ\]\[0-9\]+, \[qQ\]\[0-9\]+, #0" } } */
 /* { dg-final { scan-assembler-times "vceq\.i8\[ \]+\[dD\]\[0-9\]+, \[dD\]\[0-9\]+, #0" 2 } } */
 /* { dg-final { scan-assembler-times "vceq\.i16\[ \]+\[dD\]\[0-9\]+, \[dD\]\[0-9\]+, #0" 2 } } */
 /* { dg-final { scan-assembler-times "vceq\.i32\[ \]+\[dD\]\[0-9\]+, \[dD\]\[0-9\]+, #0" 2 } } */