
[ARM] Rewrite vc<cond> NEON patterns to use RTL operations rather than UNSPECs

Message ID 54D20CB2.4070200@arm.com
State New

Commit Message

Kyrylo Tkachov Feb. 4, 2015, 12:12 p.m. UTC
Hi all,

This patch improves the vc<cond> patterns in neon.md to use proper RTL 
operations rather than UNSPECs.
It is done in a similar way to the analogous aarch64 operations, i.e. 
vceq is expressed as
(neg (eq (...) (...)))
since we want to write all 1s to the result element when 'eq' holds and 
0s otherwise.

The catch is that the floating-point comparisons can only be expanded to 
the RTL codes when -funsafe-math-optimizations is given; they must 
continue to use the UNSPECs otherwise.
For this I've created a define_expand that generates
the correct RTL depending on -funsafe-math-optimizations, and two 
define_insns to match the result: one using the RTL codes and one using 
UNSPECs.

I've also compressed some of the patterns together using iterators for 
the [eq gt ge le lt] cases.
NOTE: for le and lt, before this patch we would never generate 
'vclt.<type> dm, dn, dp' instructions, only 'vclt.<type> dm, dn, #0'.
With this patch we can now generate 'vclt.<type> dm, dn, dp' assembly. 
According to the ARM ARM this is just a pseudo-instruction that maps to 
vcgt with the operands swapped around.
I've confirmed that gas accepts this form.

The vcage and vcagt patterns are rewritten to use the form:
(neg
   (<cond>
     (abs (...))
     (abs (...))))

and condensed together using iterators as well.

Bootstrapped and tested on arm-none-linux-gnueabihf; made sure that the 
advanced-simd-intrinsics testsuite passes
(it caught some bugs during development of this patch) and tried the 
patch out on other NEON intrinsics codebases.

The test gcc.target/arm/neon/pr51534.c now generates 'vclt.<type> dn, 
dm, #0' instructions where appropriate, instead of the previous vmov of 
#0 into a temp followed by a 'vcgt.<type> dn, temp, dm'.
I think that is correct behaviour, since the test was trying to make sure 
that we didn't generate a .u<size>-typed comparison with #0, which is 
what the PR was about (from what I can gather).

What do people think of this approach?
I'm proposing this for next stage1, of course.

Thanks,
Kyrill


2015-02-04  Kyrylo Tkachov  <kyrylo.tkachov@arm.com>

     * config/arm/iterators.md (GTGE, GTUGEU, COMPARISONS): New code
     iterators.
     (cmp_op, cmp_type): New code attributes.
     (NEON_VCMP, NEON_VACMP): New int iterators.
     (cmp_op_unsp): New int attribute.
     * config/arm/neon.md (neon_vc<cmp_op><mode>): New define_expand.
     (neon_vceq<mode>): Delete.
     (neon_vc<cmp_op><mode>_insn): New pattern.
     (neon_vc<cmp_op_unsp><mode>_insn_unspec): Likewise.
     (neon_vcgeu<mode>): Delete.
     (neon_vcle<mode>): Likewise.
     (neon_vclt<mode>): Likewise.
     (neon_vcage<mode>): Likewise.
     (neon_vcagt<mode>): Likewise.
     (neon_vca<cmp_op><mode>): New define_expand.
     (neon_vca<cmp_op><mode>_insn): New pattern.
     (neon_vca<cmp_op_unsp><mode>_insn_unspec): Likewise.

2015-02-04  Kyrylo Tkachov  <kyrylo.tkachov@arm.com>

     * gcc.target/arm/neon/pr51534.c: Update vcg* scan-assembler patterns
     to look for vcl* where appropriate.

Comments

Kyrylo Tkachov Feb. 12, 2015, 3:58 p.m. UTC | #1
Ping.
https://gcc.gnu.org/ml/gcc-patches/2015-02/msg00232.html

btw, sorry if the diff looks hard to parse. Some patterns are deleted 
and replaced with similar-looking ones, which makes the diffs look 
weird. I've tried a few diff algorithms but this is the best I got.

Kyrill

Kyrylo Tkachov April 13, 2015, 12:50 p.m. UTC | #2
Ping now that stage1 is open.
https://gcc.gnu.org/ml/gcc-patches/2015-02/msg00232.html

Thanks,
Kyrill

Kyrylo Tkachov April 23, 2015, 10:48 a.m. UTC | #3
Ping.
https://gcc.gnu.org/ml/gcc-patches/2015-02/msg00232.html

Thanks,
Kyrill
Ramana Radhakrishnan April 23, 2015, 3 p.m. UTC | #4
On Wed, Feb 4, 2015 at 12:12 PM, Kyrill Tkachov <kyrylo.tkachov@arm.com> wrote:

This is OK - thanks.

Ramana

Patch

diff --git a/gcc/config/arm/iterators.md b/gcc/config/arm/iterators.md
index f7f8ab7..66f3f4d 100644
--- a/gcc/config/arm/iterators.md
+++ b/gcc/config/arm/iterators.md
@@ -181,6 +181,15 @@  (define_mode_iterator VPF [V8QI V16QI V2SF V4SF])
 ;; compare a second time.
 (define_code_iterator LTUGEU [ltu geu])
 
+;; The signed gt, ge comparisons
+(define_code_iterator GTGE [gt ge])
+
+;; The unsigned gt, ge comparisons
+(define_code_iterator GTUGEU [gtu geu])
+
+;; Comparisons for vc<cmp>
+(define_code_iterator COMPARISONS [eq gt ge le lt])
+
 ;; A list of ...
 (define_code_iterator ior_xor [ior xor])
 
@@ -214,6 +223,11 @@  (define_code_attr t2_binop0
 (define_code_attr arith_shift_insn
   [(plus "add") (minus "rsb") (ior "orr") (xor "eor") (and "and")])
 
+(define_code_attr cmp_op [(eq "eq") (gt "gt") (ge "ge") (lt "lt") (le "le")
+                          (gtu "gt") (geu "ge")])
+
+(define_code_attr cmp_type [(eq "i") (gt "s") (ge "s") (lt "s") (le "s")])
+
 ;;----------------------------------------------------------------------------
 ;; Int iterators
 ;;----------------------------------------------------------------------------
@@ -221,6 +235,10 @@  (define_code_attr arith_shift_insn
 (define_int_iterator VRINT [UNSPEC_VRINTZ UNSPEC_VRINTP UNSPEC_VRINTM
                             UNSPEC_VRINTR UNSPEC_VRINTX UNSPEC_VRINTA])
 
+(define_int_iterator NEON_VCMP [UNSPEC_VCEQ UNSPEC_VCGT UNSPEC_VCGE UNSPEC_VCLT UNSPEC_VCLE])
+
+(define_int_iterator NEON_VACMP [UNSPEC_VCAGE UNSPEC_VCAGT])
+
 (define_int_iterator VCVT [UNSPEC_VRINTP UNSPEC_VRINTM UNSPEC_VRINTA])
 
 (define_int_iterator NEON_VRINT [UNSPEC_NVRINTP UNSPEC_NVRINTZ UNSPEC_NVRINTM
@@ -677,6 +695,11 @@  (define_int_attr sup [
 
 ])
 
+(define_int_attr cmp_op_unsp [(UNSPEC_VCEQ "eq") (UNSPEC_VCGT "gt")
+                              (UNSPEC_VCGE "ge") (UNSPEC_VCLE "le")
+                              (UNSPEC_VCLT "lt") (UNSPEC_VCAGE "ge")
+                              (UNSPEC_VCAGT "gt")])
+
 (define_int_attr r [
   (UNSPEC_VRHADD_S "r") (UNSPEC_VRHADD_U "r")
   (UNSPEC_VHADD_S "") (UNSPEC_VHADD_U "")
diff --git a/gcc/config/arm/neon.md b/gcc/config/arm/neon.md
index 63c327e..445df2a 100644
--- a/gcc/config/arm/neon.md
+++ b/gcc/config/arm/neon.md
@@ -2200,134 +2200,140 @@  (define_insn "neon_v<r>subhn<mode>"
   [(set_attr "type" "neon_sub_halve_narrow_q")]
 )
 
-(define_insn "neon_vceq<mode>"
-  [(set (match_operand:<V_cmp_result> 0 "s_register_operand" "=w,w")
-        (unspec:<V_cmp_result>
-	  [(match_operand:VDQW 1 "s_register_operand" "w,w")
-	   (match_operand:VDQW 2 "reg_or_zero_operand" "w,Dz")]
-          UNSPEC_VCEQ))]
+;; These may expand to an UNSPEC pattern when a floating point mode is used
+;; without unsafe math optimizations.
+(define_expand "neon_vc<cmp_op><mode>"
+  [(match_operand:<V_cmp_result> 0 "s_register_operand" "=w,w")
+     (neg:<V_cmp_result>
+       (COMPARISONS:VDQW (match_operand:VDQW 1 "s_register_operand" "w,w")
+                         (match_operand:VDQW 2 "reg_or_zero_operand" "w,Dz")))]
   "TARGET_NEON"
-  "@
-  vceq.<V_if_elem>\t%<V_reg>0, %<V_reg>1, %<V_reg>2
-  vceq.<V_if_elem>\t%<V_reg>0, %<V_reg>1, #0"
-  [(set (attr "type")
-      (if_then_else (match_test "<Is_float_mode>")
-                    (const_string "neon_fp_compare_s<q>")
-                    (if_then_else (match_operand 2 "zero_operand")
-                      (const_string "neon_compare_zero<q>")
-                      (const_string "neon_compare<q>"))))]
+  {
+    /* For FP comparisons use UNSPECS unless -funsafe-math-optimizations
+       are enabled.  */
+    if (GET_MODE_CLASS (<MODE>mode) == MODE_VECTOR_FLOAT
+        && !flag_unsafe_math_optimizations)
+      {
+        /* We don't just emit a gen_neon_vc<cmp_op><mode>_insn_unspec because
+           we define gen_neon_vceq<mode>_insn_unspec only for float modes
+           whereas this expander iterates over the integer modes as well,
+           but we will never expand to UNSPECs for the integer comparisons.  */
+        switch (<MODE>mode)
+          {
+            case V2SFmode:
+              emit_insn (gen_neon_vc<cmp_op>v2sf_insn_unspec (operands[0],
+                                                              operands[1],
+                                                              operands[2]));
+              break;
+            case V4SFmode:
+              emit_insn (gen_neon_vc<cmp_op>v4sf_insn_unspec (operands[0],
+                                                              operands[1],
+                                                              operands[2]));
+              break;
+            default:
+              gcc_unreachable ();
+          }
+      }
+    else
+      emit_insn (gen_neon_vc<cmp_op><mode>_insn (operands[0],
+                                                 operands[1],
+                                                 operands[2]));
+    DONE;
+  }
 )
 
-(define_insn "neon_vcge<mode>"
+(define_insn "neon_vc<cmp_op><mode>_insn"
   [(set (match_operand:<V_cmp_result> 0 "s_register_operand" "=w,w")
-        (unspec:<V_cmp_result>
-	  [(match_operand:VDQW 1 "s_register_operand" "w,w")
-	   (match_operand:VDQW 2 "reg_or_zero_operand" "w,Dz")]
-          UNSPEC_VCGE))]
-  "TARGET_NEON"
-  "@
-  vcge.<V_s_elem>\t%<V_reg>0, %<V_reg>1, %<V_reg>2
-  vcge.<V_s_elem>\t%<V_reg>0, %<V_reg>1, #0"
+        (neg:<V_cmp_result>
+          (COMPARISONS:<V_cmp_result>
+            (match_operand:VDQW 1 "s_register_operand" "w,w")
+            (match_operand:VDQW 2 "reg_or_zero_operand" "w,Dz"))))]
+  "TARGET_NEON && !(GET_MODE_CLASS (<MODE>mode) == MODE_VECTOR_FLOAT
+                    && !flag_unsafe_math_optimizations)"
+  {
+    char pattern[100];
+    sprintf (pattern, "vc<cmp_op>.%s%%#<V_sz_elem>\t%%<V_reg>0,"
+                      " %%<V_reg>1, %s",
+                       GET_MODE_CLASS (<MODE>mode) == MODE_VECTOR_FLOAT
+                         ? "f" : "<cmp_type>",
+                       which_alternative == 0
+                         ? "%<V_reg>2" : "#0");
+    output_asm_insn (pattern, operands);
+    return "";
+  }
   [(set (attr "type")
-     (if_then_else (match_test "<Is_float_mode>")
-                   (const_string "neon_fp_compare_s<q>")
-                    (if_then_else (match_operand 2 "zero_operand")
+        (if_then_else (match_operand 2 "zero_operand")
                       (const_string "neon_compare_zero<q>")
-                      (const_string "neon_compare<q>"))))]
-)
-
-(define_insn "neon_vcgeu<mode>"
-  [(set (match_operand:<V_cmp_result> 0 "s_register_operand" "=w")
-        (unspec:<V_cmp_result>
-	  [(match_operand:VDQIW 1 "s_register_operand" "w")
-	   (match_operand:VDQIW 2 "s_register_operand" "w")]
-          UNSPEC_VCGEU))]
-  "TARGET_NEON"
-  "vcge.u%#<V_sz_elem>\t%<V_reg>0, %<V_reg>1, %<V_reg>2"
-  [(set_attr "type" "neon_compare<q>")]
+                      (const_string "neon_compare<q>")))]
 )
 
-(define_insn "neon_vcgt<mode>"
+(define_insn "neon_vc<cmp_op_unsp><mode>_insn_unspec"
   [(set (match_operand:<V_cmp_result> 0 "s_register_operand" "=w,w")
         (unspec:<V_cmp_result>
-	  [(match_operand:VDQW 1 "s_register_operand" "w,w")
-	   (match_operand:VDQW 2 "reg_or_zero_operand" "w,Dz")]
-          UNSPEC_VCGT))]
+	  [(match_operand:VCVTF 1 "s_register_operand" "w,w")
+	   (match_operand:VCVTF 2 "reg_or_zero_operand" "w,Dz")]
+          NEON_VCMP))]
   "TARGET_NEON"
-  "@
-  vcgt.<V_s_elem>\t%<V_reg>0, %<V_reg>1, %<V_reg>2
-  vcgt.<V_s_elem>\t%<V_reg>0, %<V_reg>1, #0"
-  [(set (attr "type")
-     (if_then_else (match_test "<Is_float_mode>")
-                   (const_string "neon_fp_compare_s<q>")
-                    (if_then_else (match_operand 2 "zero_operand")
-                      (const_string "neon_compare_zero<q>")
-                      (const_string "neon_compare<q>"))))]
+  {
+    char pattern[100];
+    sprintf (pattern, "vc<cmp_op_unsp>.f%%#<V_sz_elem>\t%%<V_reg>0,"
+                       " %%<V_reg>1, %s",
+                       which_alternative == 0
+                         ? "%<V_reg>2" : "#0");
+    output_asm_insn (pattern, operands);
+    return "";
+}
+  [(set_attr "type" "neon_fp_compare_s<q>")]
 )
 
-(define_insn "neon_vcgtu<mode>"
+(define_insn "neon_vc<cmp_op>u<mode>"
   [(set (match_operand:<V_cmp_result> 0 "s_register_operand" "=w")
-        (unspec:<V_cmp_result>
-	  [(match_operand:VDQIW 1 "s_register_operand" "w")
-	   (match_operand:VDQIW 2 "s_register_operand" "w")]
-          UNSPEC_VCGTU))]
+        (neg:<V_cmp_result>
+          (GTUGEU:<V_cmp_result>
+	    (match_operand:VDQIW 1 "s_register_operand" "w")
+	    (match_operand:VDQIW 2 "s_register_operand" "w"))))]
   "TARGET_NEON"
-  "vcgt.u%#<V_sz_elem>\t%<V_reg>0, %<V_reg>1, %<V_reg>2"
+  "vc<cmp_op>.u%#<V_sz_elem>\t%<V_reg>0, %<V_reg>1, %<V_reg>2"
   [(set_attr "type" "neon_compare<q>")]
 )
 
-;; VCLE and VCLT only support comparisons with immediate zero (register
-;; variants are VCGE and VCGT with operands reversed).
-
-(define_insn "neon_vcle<mode>"
-  [(set (match_operand:<V_cmp_result> 0 "s_register_operand" "=w")
-        (unspec:<V_cmp_result>
-	  [(match_operand:VDQW 1 "s_register_operand" "w")
-	   (match_operand:VDQW 2 "zero_operand" "Dz")]
-          UNSPEC_VCLE))]
+(define_expand "neon_vca<cmp_op><mode>"
+  [(set (match_operand:<V_cmp_result> 0 "s_register_operand")
+        (neg:<V_cmp_result>
+          (GTGE:<V_cmp_result>
+            (abs:VCVTF (match_operand:VCVTF 1 "s_register_operand"))
+            (abs:VCVTF (match_operand:VCVTF 2 "s_register_operand")))))]
   "TARGET_NEON"
-  "vcle.<V_s_elem>\t%<V_reg>0, %<V_reg>1, #0"
-  [(set (attr "type")
-      (if_then_else (match_test "<Is_float_mode>")
-                    (const_string "neon_fp_compare_s<q>")
-                    (if_then_else (match_operand 2 "zero_operand")
-                      (const_string "neon_compare_zero<q>")
-                      (const_string "neon_compare<q>"))))]
-)
-
-(define_insn "neon_vclt<mode>"
-  [(set (match_operand:<V_cmp_result> 0 "s_register_operand" "=w")
-        (unspec:<V_cmp_result>
-	  [(match_operand:VDQW 1 "s_register_operand" "w")
-	   (match_operand:VDQW 2 "zero_operand" "Dz")]
-          UNSPEC_VCLT))]
-  "TARGET_NEON"
-  "vclt.<V_s_elem>\t%<V_reg>0, %<V_reg>1, #0"
-  [(set (attr "type")
-      (if_then_else (match_test "<Is_float_mode>")
-                    (const_string "neon_fp_compare_s<q>")
-                    (if_then_else (match_operand 2 "zero_operand")
-                      (const_string "neon_compare_zero<q>")
-                      (const_string "neon_compare<q>"))))]
+  {
+    if (flag_unsafe_math_optimizations)
+      emit_insn (gen_neon_vca<cmp_op><mode>_insn (operands[0], operands[1],
+                                                  operands[2]));
+    else
+      emit_insn (gen_neon_vca<cmp_op><mode>_insn_unspec (operands[0],
+                                                         operands[1],
+                                                         operands[2]));
+    DONE;
+  }
 )
 
-(define_insn "neon_vcage<mode>"
+(define_insn "neon_vca<cmp_op><mode>_insn"
   [(set (match_operand:<V_cmp_result> 0 "s_register_operand" "=w")
-        (unspec:<V_cmp_result> [(match_operand:VCVTF 1 "s_register_operand" "w")
-		                (match_operand:VCVTF 2 "s_register_operand" "w")]
-                               UNSPEC_VCAGE))]
-  "TARGET_NEON"
-  "vacge.<V_if_elem>\t%<V_reg>0, %<V_reg>1, %<V_reg>2"
+        (neg:<V_cmp_result>
+          (GTGE:<V_cmp_result>
+            (abs:VCVTF (match_operand:VCVTF 1 "s_register_operand" "w"))
+            (abs:VCVTF (match_operand:VCVTF 2 "s_register_operand" "w")))))]
+  "TARGET_NEON && flag_unsafe_math_optimizations"
+  "vac<cmp_op>.<V_if_elem>\t%<V_reg>0, %<V_reg>1, %<V_reg>2"
   [(set_attr "type" "neon_fp_compare_s<q>")]
 )
 
-(define_insn "neon_vcagt<mode>"
+(define_insn "neon_vca<cmp_op_unsp><mode>_insn_unspec"
   [(set (match_operand:<V_cmp_result> 0 "s_register_operand" "=w")
         (unspec:<V_cmp_result> [(match_operand:VCVTF 1 "s_register_operand" "w")
 		                (match_operand:VCVTF 2 "s_register_operand" "w")]
-                               UNSPEC_VCAGT))]
+                               NEON_VACMP))]
   "TARGET_NEON"
-  "vacgt.<V_if_elem>\t%<V_reg>0, %<V_reg>1, %<V_reg>2"
+  "vac<cmp_op_unsp>.<V_if_elem>\t%<V_reg>0, %<V_reg>1, %<V_reg>2"
   [(set_attr "type" "neon_fp_compare_s<q>")]
 )
 
diff --git a/gcc/testsuite/gcc.target/arm/neon/pr51534.c b/gcc/testsuite/gcc.target/arm/neon/pr51534.c
index 71cbb05..074bbd4 100644
--- a/gcc/testsuite/gcc.target/arm/neon/pr51534.c
+++ b/gcc/testsuite/gcc.target/arm/neon/pr51534.c
@@ -58,18 +58,18 @@  GEN_COND_TESTS(vceq)
 /* { dg-final { scan-assembler-times "vcge\.u16\[ 	\]+\[qQ\]\[0-9\]+, \[qQ\]\[0-9\]+, \[qQ\]\[0-9\]+" 2 } } */
 /* { dg-final { scan-assembler "vcge\.s32\[ 	\]+\[qQ\]\[0-9\]+, \[qQ\]\[0-9\]+, #0" } } */
 /* { dg-final { scan-assembler-times "vcge\.u32\[ 	\]+\[qQ\]\[0-9\]+, \[qQ\]\[0-9\]+, \[qQ\]\[0-9\]+" 2 } } */
-/* { dg-final { scan-assembler "vcgt\.s8\[ 	\]+\[dD\]\[0-9\]+, \[dD\]\[0-9\]+, \[dD\]\[0-9\]+" } } */
-/* { dg-final { scan-assembler "vcgt\.s16\[ 	\]+\[dD\]\[0-9\]+, \[dD\]\[0-9\]+, \[dD\]\[0-9\]+" } } */
-/* { dg-final { scan-assembler "vcgt\.s32\[ 	\]+\[dD\]\[0-9\]+, \[dD\]\[0-9\]+, \[dD\]\[0-9\]+" } } */
-/* { dg-final { scan-assembler "vcgt\.s8\[ 	\]+\[qQ\]\[0-9\]+, \[qQ\]\[0-9\]+, \[qQ\]\[0-9\]+" } } */
-/* { dg-final { scan-assembler "vcgt\.s16\[ 	\]+\[qQ\]\[0-9\]+, \[qQ\]\[0-9\]+, \[qQ\]\[0-9\]+" } } */
-/* { dg-final { scan-assembler "vcgt\.s32\[ 	\]+\[qQ\]\[0-9\]+, \[qQ\]\[0-9\]+, \[qQ\]\[0-9\]+" } } */
-/* { dg-final { scan-assembler "vcge\.s8\[ 	\]+\[dD\]\[0-9\]+, \[dD\]\[0-9\]+, \[dD\]\[0-9\]+" } } */
-/* { dg-final { scan-assembler "vcge\.s16\[ 	\]+\[dD\]\[0-9\]+, \[dD\]\[0-9\]+, \[dD\]\[0-9\]+" } } */
-/* { dg-final { scan-assembler "vcge\.s32\[ 	\]+\[dD\]\[0-9\]+, \[dD\]\[0-9\]+, \[dD\]\[0-9\]+" } } */
-/* { dg-final { scan-assembler "vcge\.s8\[ 	\]+\[qQ\]\[0-9\]+, \[qQ\]\[0-9\]+, \[qQ\]\[0-9\]+" } } */
-/* { dg-final { scan-assembler "vcge\.s16\[ 	\]+\[qQ\]\[0-9\]+, \[qQ\]\[0-9\]+, \[qQ\]\[0-9\]+" } } */
-/* { dg-final { scan-assembler "vcge\.s32\[ 	\]+\[qQ\]\[0-9\]+, \[qQ\]\[0-9\]+, \[qQ\]\[0-9\]+" } } */
+/* { dg-final { scan-assembler "vclt\.s8\[ 	\]+\[dD\]\[0-9\]+, \[dD\]\[0-9\]+, #0" } } */
+/* { dg-final { scan-assembler "vclt\.s16\[ 	\]+\[dD\]\[0-9\]+, \[dD\]\[0-9\]+, #0" } } */
+/* { dg-final { scan-assembler "vclt\.s32\[ 	\]+\[dD\]\[0-9\]+, \[dD\]\[0-9\]+, #0" } } */
+/* { dg-final { scan-assembler "vclt\.s8\[ 	\]+\[qQ\]\[0-9\]+, \[qQ\]\[0-9\]+, #0" } } */
+/* { dg-final { scan-assembler "vclt\.s16\[ 	\]+\[qQ\]\[0-9\]+, \[qQ\]\[0-9\]+, #0" } } */
+/* { dg-final { scan-assembler "vclt\.s32\[ 	\]+\[qQ\]\[0-9\]+, \[qQ\]\[0-9\]+, #0" } } */
+/* { dg-final { scan-assembler "vcle\.s8\[ 	\]+\[dD\]\[0-9\]+, \[dD\]\[0-9\]+, #0" } } */
+/* { dg-final { scan-assembler "vcle\.s16\[ 	\]+\[dD\]\[0-9\]+, \[dD\]\[0-9\]+, #0" } } */
+/* { dg-final { scan-assembler "vcle\.s32\[ 	\]+\[dD\]\[0-9\]+, \[dD\]\[0-9\]+, #0" } } */
+/* { dg-final { scan-assembler "vcle\.s8\[ 	\]+\[qQ\]\[0-9\]+, \[qQ\]\[0-9\]+, #0" } } */
+/* { dg-final { scan-assembler "vcle\.s16\[ 	\]+\[qQ\]\[0-9\]+, \[qQ\]\[0-9\]+, #0" } } */
+/* { dg-final { scan-assembler "vcle\.s32\[ 	\]+\[qQ\]\[0-9\]+, \[qQ\]\[0-9\]+, #0" } } */
 /* { dg-final { scan-assembler-times "vceq\.i8\[ 	\]+\[dD\]\[0-9\]+, \[dD\]\[0-9\]+, #0" 2 } } */
 /* { dg-final { scan-assembler-times "vceq\.i16\[ 	\]+\[dD\]\[0-9\]+, \[dD\]\[0-9\]+, #0" 2 } } */
 /* { dg-final { scan-assembler-times "vceq\.i32\[ 	\]+\[dD\]\[0-9\]+, \[dD\]\[0-9\]+, #0" 2 } } */