diff mbox series

[rs6000] Refactor FP vector comparison operators

Message ID 5913e415-47b2-63bf-d401-3d0976d99184@linux.ibm.com
State New
Headers show
Series [rs6000] Refactor FP vector comparison operators | expand

Commit Message

Kewen.Lin Nov. 11, 2019, 7:40 a.m. UTC
Hi,

This is a subsequent patch to refactor the existing float point
vector comparison operator supports.  The patch to fix PR92132
supplemented vector float point comparison by exposing the names
for unordered/ordered/uneq/ltgt and adding ungt/unge/unlt/unle/
ne.  As Segher pointed out, some patterns can be refactored
together.  The main link on this is: 
https://gcc.gnu.org/ml/gcc-patches/2019-11/msg00452.html


The refactoring mainly follows the below patterns:

pattern 1:
  lt(a,b) = gt(b,a)
  le(a,b) = ge(b,a)

pattern 2:
  unge(a,b) = ~gt(b,a)
  unle(a,b) = ~gt(a,b)
  ne(a,b)   = ~eq(a,b)
  ungt(a,b) = ~ge(b,a)
  unlt(a,b) = ~ge(a,b)

pattern 3:
  ltgt: gt(a,b) | gt(b,a)
  ordered: ge(a,b) | ge(b,a)

pattern 4:
  uneq: ~gt(a,b) & ~gt(b,a)
  unordered: ~ge(a,b) & ~ge(b,a)

Naming the code iterators and attributes are really knotty for me :(.

Regression testing just launched.

BR,
Kewen

-------------------------------
gcc/ChangeLog

2019-11-11 Kewen Lin  <linkw@gcc.gnu.org>

	* config/rs6000/vector.md (vec_fp_cmp1): New code iterator.
	(vec_fp_cmp2): Likewise.
	(vec_fp_cmp3): Likewise.
	(vec_fp_cmp4): Likewise.
	(vec_fp_cmp1_attr): New code attribute.
	(vec_fp_cmp2_attr): Likewise.
	(vec_fp_cmp3_attr): Likewise.
	(vec_fp_cmp4_attr): Likewise.
	(vector_<code><mode> for VEC_F and vec_fp_cmp1): New
	define_and_split.
	(vector_<code><mode> for VEC_F and vec_fp_cmp2): Likewise.
	(vector_<code><mode> for VEC_F and vec_fp_cmp3): Likewise.
	(vector_<code><mode> for VEC_F and vec_fp_cmp4): Likewise.
	(vector_lt<mode> for VEC_F): Refactor with vec_fp_cmp1.
	(vector_le<mode> for VEC_F): Likewise.
	(vector_unge<mode> for VEC_F): Refactor with vec_fp_cmp2.
	(vector_unle<mode> for VEC_F): Likewise.
	(vector_ne<mode> for VEC_F): Likewise.
	(vector_ungt<mode> for VEC_F): Likewise.
	(vector_unlt<mode> for VEC_F): Likewise.
	(vector_ltgt<mode> for VEC_F): Refactor with vec_fp_cmp3.
	(vector_ordered<mode> for VEC_F): Likewise.
	(vector_uneq<mode> for VEC_F): Refactor with vec_fp_cmp4.
	(vector_unordered<mode> for VEC_F): Likewise.

Comments

Segher Boessenkool Nov. 11, 2019, 12:51 p.m. UTC | #1
Hi!

On Mon, Nov 11, 2019 at 03:40:51PM +0800, Kewen.Lin wrote:
> This is a subsequent patch to refactor the existing float point
> vector comparison operator supports.  The patch to fix PR92132
> supplemented vector float point comparison by exposing the names
> for unordered/ordered/uneq/ltgt and adding ungt/unge/unlt/unle/
> ne.  As Segher pointed out, some patterns can be refactored
> together.  The main link on this is: 
> https://gcc.gnu.org/ml/gcc-patches/2019-11/msg00452.html
> 
> 
> The refactoring mainly follows the below patterns:
> 
> pattern 1:
>   lt(a,b) = gt(b,a)
>   le(a,b) = ge(b,a)

This is done by swap_condition normally.

> pattern 2:
>   unge(a,b) = ~gt(b,a)
>   unle(a,b) = ~gt(a,b)
>   ne(a,b)   = ~eq(a,b)
>   ungt(a,b) = ~ge(b,a)
>   unlt(a,b) = ~ge(a,b)

This is reverse_condition_maybe_unordered (and a swap, in two cases).

> pattern 3:
>   ltgt: gt(a,b) | gt(b,a)
>   ordered: ge(a,b) | ge(b,a)

That's the only interesting one :-)

> pattern 4:
>   uneq: ~gt(a,b) & ~gt(b,a)
>   unordered: ~ge(a,b) & ~ge(b,a)

That is 3, reversed.

> Naming the code iterators and attributes are really knotty for me :(.

Maybe the above helps :-)

> 	(vector_<code><mode> for VEC_F and vec_fp_cmp1): New
> 	define_and_split.

> +;; code iterators and attributes for vector FP comparison operators:
> +
> +;; 1. lt and le.
> +(define_code_iterator vec_fp_cmp1 [lt le])
> +(define_code_attr vec_fp_cmp1_attr [(lt "gt")
> +				    (le "ge")])

So this is just (a subset of) swap_condition.

(I'm reordering stuff)

> +; 1. For lt and le:
> +; lt(a,b) = gt(b,a)
> +; le(a,b) = ge(b,a)
> +(define_insn_and_split "vector_<code><mode>"
>    [(set (match_operand:VEC_F 0 "vfloat_operand")
> +	(vec_fp_cmp1:VEC_F (match_operand:VEC_F 1 "vfloat_operand")
> +			   (match_operand:VEC_F 2 "vfloat_operand")))]
>    "VECTOR_UNIT_ALTIVEC_OR_VSX_P (<MODE>mode)"
>    "#"
>    ""
> +  [(set (match_dup 0)
> +	(<vec_fp_cmp1_attr>:VEC_F (match_dup 2)
> +				  (match_dup 1)))]
>  {
>  })

Empty preparation statements (the {}) can just be left out completely.

The split condition "" is incorrect, it should be "&& 1": if it starts
with "&&", the insn condition is included.

> +; 2. unge, unle, ne, ungt and unlt.
> +(define_code_iterator vec_fp_cmp2 [unge unle ne ungt unlt])
> +(define_code_attr vec_fp_cmp2_attr [(unge "gt")
> +				    (unle "gt")
> +				    (ne   "eq")
> +				    (ungt "ge")
> +				    (unlt "ge")])

> +; 2. For unge, unle, ne, ungt and unlt:
> +; unge(a,b) = ~gt(b,a)
> +; unle(a,b) = ~gt(a,b)
> +; ne(a,b)   = ~eq(a,b)
> +; ungt(a,b) = ~ge(b,a)
> +; unlt(a,b) = ~ge(a,b)
> +(define_insn_and_split "vector_<code><mode>"
>    [(set (match_operand:VEC_F 0 "vfloat_operand")
> +	(vec_fp_cmp2:VEC_F (match_operand:VEC_F 1 "vfloat_operand")
> +			   (match_operand:VEC_F 2 "vfloat_operand")))]
>    "VECTOR_UNIT_ALTIVEC_OR_VSX_P (<MODE>mode)"
>    "#"
>    ""
>    [(set (match_dup 3)
> +	(<vec_fp_cmp2_attr>:VEC_F (match_dup 1)
> +				  (match_dup 2)))
>     (set (match_dup 0)
> +	(not:VEC_F (match_dup 3)))]
>  {
>    operands[3] = gen_reg_rtx (<MODE>mode);
> +
> +  if (<CODE> == UNGE || <CODE> == UNGT)
> +    std::swap (operands[1], operands[2]);
>  })

So maybe it is simplest to *do* use match_operator here, handle all of
lt gt le ge eq unge unle ungt unlt ne  in one define_expand, which then
swaps the condition and the args, and expand the extra not for the five
where that is needed?


> +;; 3. ltgt and ordered.
> +(define_code_iterator vec_fp_cmp3 [ltgt ordered])
> +(define_code_attr vec_fp_cmp3_attr [(ltgt "gt")
> +				    (ordered "ge")])
> +
> +;; 4. uneq and unordered.
> +(define_code_iterator vec_fp_cmp4 [uneq unordered])
> +(define_code_attr vec_fp_cmp4_attr [(uneq "gt")
> +				    (unordered "ge")])

And then another one for  ltgt uneq ordered unordered  perhaps?

So you'll need to define two new predicates then.  Something like
vector_fp_comparison_operator and, erm, vector_fp_extra_comparison_operator,
which kind of sucks as a name, but there is only one ;-)

Sorry for sending you first one way, and then back the other.


Segher
Kewen.Lin Nov. 12, 2019, 10:41 a.m. UTC | #2
Hi Segher,

on 2019/11/11 下午8:51, Segher Boessenkool wrote:
> Hi!
> 
>> pattern 1:
>>   lt(a,b) = gt(b,a)
>>   le(a,b) = ge(b,a)
> 
> This is done by swap_condition normally.

Nice!  Done.

> 
>> pattern 2:
>>   unge(a,b) = ~gt(b,a)
>>   unle(a,b) = ~gt(a,b)
>>   ne(a,b)   = ~eq(a,b)
>>   ungt(a,b) = ~ge(b,a)
>>   unlt(a,b) = ~ge(a,b)
> 
> This is reverse_condition_maybe_unordered (and a swap, in two cases).
> 

Nice!  Done.

> 
>> pattern 4:
>>   uneq: ~gt(a,b) & ~gt(b,a)
>>   unordered: ~ge(a,b) & ~ge(b,a)
> 
> That is 3, reversed.
> 

Yes, merge them.

> 
>> +; 1. For lt and le:
>> +; lt(a,b) = gt(b,a)
>> +; le(a,b) = ge(b,a)
>> +(define_insn_and_split "vector_<code><mode>"
>>    [(set (match_operand:VEC_F 0 "vfloat_operand")
>> +	(vec_fp_cmp1:VEC_F (match_operand:VEC_F 1 "vfloat_operand")
>> +			   (match_operand:VEC_F 2 "vfloat_operand")))]
>>    "VECTOR_UNIT_ALTIVEC_OR_VSX_P (<MODE>mode)"
>>    "#"
>>    ""
>> +  [(set (match_dup 0)
>> +	(<vec_fp_cmp1_attr>:VEC_F (match_dup 2)
>> +				  (match_dup 1)))]
>>  {
>>  })
> 
> Empty preparation statements (the {}) can just be left out completely.
> 

OK, thanks!

> The split condition "" is incorrect, it should be "&& 1": if it starts
> with "&&", the insn condition is included.
> 

Got it, thanks!
> 
> So maybe it is simplest to *do* use match_operator here, handle all of
> lt gt le ge eq unge unle ungt unlt ne  in one define_expand, which then
> swaps the condition and the args, and expand the extra not for the five
> where that is needed?
> 

I still used define_and_split to support the RTL pattern recognition, so
I didn't use match_operator there, but I put le lt ne unge ungt unlt unle
together in one define_and_split as you suggested.

> 
>> +;; 3. ltgt and ordered.
>> +(define_code_iterator vec_fp_cmp3 [ltgt ordered])
>> +(define_code_attr vec_fp_cmp3_attr [(ltgt "gt")
>> +				    (ordered "ge")])
>> +
>> +;; 4. uneq and unordered.
>> +(define_code_iterator vec_fp_cmp4 [uneq unordered])
>> +(define_code_attr vec_fp_cmp4_attr [(uneq "gt")
>> +				    (unordered "ge")])
> 
> And then another one for  ltgt uneq ordered unordered  perhaps?
> 

Done.

> So you'll need to define two new predicates then.  Something like
> vector_fp_comparison_operator and, erm, vector_fp_extra_comparison_operator,
> which kind of sucks as a name, but there is only one ;-)

Thanks!  I used the names for code_iterator in define_and_split now.

> 
> Sorry for sending you first one way, and then back the other.
> 

It really doesn't matter!

The updated patch is attached.  Since we check those <CODE> and generate
related in prepare of define_and_split, I had one idea to factor those
two parts out and merged as one common split function (move to rs6000.c)?
Does it sound better?


BR,
Kewen

-------------------------------
gcc/ChangeLog

2019-11-12 Kewen Lin  <linkw@gcc.gnu.org>

	* config/rs6000/vector.md (vector_fp_comparison_operator):
	New code iterator.
	(vector_fp_extra_comparison_operator): Likewise.
	(vector_<code><mode> for VEC_F and
	vector_fp_comparison_operator): New define_and_split.
	(vector_<code><mode> for VEC_F and
	vector_fp_extra_comparison_operator): Likewise.
	(vector_lt<mode> for VEC_F): Refactor with
	vector_fp_comparison_operator.
	(vector_le<mode> for VEC_F): Likewise.
	(vector_unge<mode> for VEC_F): Likewise.
	(vector_unle<mode> for VEC_F): Likewise.
	(vector_ne<mode> for VEC_F): Likewise.
	(vector_ungt<mode> for VEC_F): Likewise.
	(vector_unlt<mode> for VEC_F): Likewise.
	(vector_ltgt<mode> for VEC_F): Refactor with
	vector_fp_extra_comparison_operator.
	(vector_ordered<mode> for VEC_F): Likewise.
	(vector_uneq<mode> for VEC_F): Likewise.
	(vector_unordered<mode> for VEC_F): Likewise.
diff --git a/gcc/config/rs6000/vector.md b/gcc/config/rs6000/vector.md
index b132037..9919fd4 100644
--- a/gcc/config/rs6000/vector.md
+++ b/gcc/config/rs6000/vector.md
@@ -107,6 +107,12 @@
 				 (smin "smin")
 				 (smax "smax")])
 
+;; code iterators and attributes for vector FP comparison operators:
+(define_code_iterator vector_fp_comparison_operator [lt le ne
+						     ungt unge unlt unle])
+(define_code_iterator vector_fp_extra_comparison_operator [ltgt uneq
+							   unordered ordered])
+
 
 ;; Vector move instructions.  Little-endian VSX loads and stores require
 ;; special handling to circumvent "element endianness."
@@ -665,88 +671,6 @@
   DONE;
 })
 
-; lt(a,b) = gt(b,a)
-(define_expand "vector_lt<mode>"
-  [(set (match_operand:VEC_F 0 "vfloat_operand")
-	(lt:VEC_F (match_operand:VEC_F 1 "vfloat_operand")
-		  (match_operand:VEC_F 2 "vfloat_operand")))]
-  "VECTOR_UNIT_ALTIVEC_OR_VSX_P (<MODE>mode)"
-{
-  emit_insn (gen_vector_gt<mode> (operands[0], operands[2], operands[1]));
-  DONE;
-})
-
-; le(a,b) = ge(b,a)
-(define_expand "vector_le<mode>"
-  [(set (match_operand:VEC_F 0 "vfloat_operand")
-	(le:VEC_F (match_operand:VEC_F 1 "vfloat_operand")
-		  (match_operand:VEC_F 2 "vfloat_operand")))]
-  "VECTOR_UNIT_ALTIVEC_OR_VSX_P (<MODE>mode)"
-{
-  emit_insn (gen_vector_ge<mode> (operands[0], operands[2], operands[1]));
-  DONE;
-})
-
-; ne(a,b) = ~eq(a,b)
-(define_expand "vector_ne<mode>"
-  [(set (match_operand:VEC_F 0 "vfloat_operand")
-	(ne:VEC_F (match_operand:VEC_F 1 "vfloat_operand")
-		  (match_operand:VEC_F 2 "vfloat_operand")))]
-  "VECTOR_UNIT_ALTIVEC_OR_VSX_P (<MODE>mode)"
-{
-  emit_insn (gen_vector_eq<mode> (operands[0], operands[1], operands[2]));
-  emit_insn (gen_one_cmpl<mode>2 (operands[0], operands[0]));
-  DONE;
-})
-
-; unge(a,b) = ~gt(b,a)
-(define_expand "vector_unge<mode>"
-  [(set (match_operand:VEC_F 0 "vfloat_operand")
-	(unge:VEC_F (match_operand:VEC_F 1 "vfloat_operand")
-		    (match_operand:VEC_F 2 "vfloat_operand")))]
-  "VECTOR_UNIT_ALTIVEC_OR_VSX_P (<MODE>mode)"
-{
-  emit_insn (gen_vector_gt<mode> (operands[0], operands[2], operands[1]));
-  emit_insn (gen_one_cmpl<mode>2 (operands[0], operands[0]));
-  DONE;
-})
-
-; ungt(a,b) = ~ge(b,a)
-(define_expand "vector_ungt<mode>"
-  [(set (match_operand:VEC_F 0 "vfloat_operand")
-	(ungt:VEC_F (match_operand:VEC_F 1 "vfloat_operand")
-		    (match_operand:VEC_F 2 "vfloat_operand")))]
-  "VECTOR_UNIT_ALTIVEC_OR_VSX_P (<MODE>mode)"
-{
-  emit_insn (gen_vector_ge<mode> (operands[0], operands[2], operands[1]));
-  emit_insn (gen_one_cmpl<mode>2 (operands[0], operands[0]));
-  DONE;
-})
-
-; unle(a,b) = ~gt(a,b)
-(define_expand "vector_unle<mode>"
-  [(set (match_operand:VEC_F 0 "vfloat_operand")
-	(unle:VEC_F (match_operand:VEC_F 1 "vfloat_operand")
-		    (match_operand:VEC_F 2 "vfloat_operand")))]
-  "VECTOR_UNIT_ALTIVEC_OR_VSX_P (<MODE>mode)"
-{
-  emit_insn (gen_vector_gt<mode> (operands[0], operands[1], operands[2]));
-  emit_insn (gen_one_cmpl<mode>2 (operands[0], operands[0]));
-  DONE;
-})
-
-; unlt(a,b) = ~ge(a,b)
-(define_expand "vector_unlt<mode>"
-  [(set (match_operand:VEC_F 0 "vfloat_operand")
-	(unlt:VEC_F (match_operand:VEC_F 1 "vfloat_operand")
-		    (match_operand:VEC_F 2 "vfloat_operand")))]
-  "VECTOR_UNIT_ALTIVEC_OR_VSX_P (<MODE>mode)"
-{
-  emit_insn (gen_vector_ge<mode> (operands[0], operands[1], operands[2]));
-  emit_insn (gen_one_cmpl<mode>2 (operands[0], operands[0]));
-  DONE;
-})
-
 (define_expand "vector_eq<mode>"
   [(set (match_operand:VEC_C 0 "vlogical_operand")
 	(eq:VEC_C (match_operand:VEC_C 1 "vlogical_operand")
@@ -761,13 +685,6 @@
   "VECTOR_UNIT_ALTIVEC_OR_VSX_P (<MODE>mode)"
   "")
 
-(define_expand "vector_ge<mode>"
-  [(set (match_operand:VEC_F 0 "vlogical_operand")
-	(ge:VEC_F (match_operand:VEC_F 1 "vlogical_operand")
-		  (match_operand:VEC_F 2 "vlogical_operand")))]
-  "VECTOR_UNIT_ALTIVEC_OR_VSX_P (<MODE>mode)"
-  "")
-
 ; >= for integer vectors: swap operands and apply not-greater-than
 (define_expand "vector_nlt<mode>"
   [(set (match_operand:VEC_I 3 "vlogical_operand")
@@ -829,88 +746,114 @@
   operands[3] = gen_reg_rtx_and_attrs (operands[0]);
 })
 
-(define_insn_and_split "vector_uneq<mode>"
-  [(set (match_operand:VEC_F 0 "vfloat_operand")
-	(uneq:VEC_F (match_operand:VEC_F 1 "vfloat_operand")
-		    (match_operand:VEC_F 2 "vfloat_operand")))]
-  "VECTOR_UNIT_ALTIVEC_OR_VSX_P (<MODE>mode)"
-  "#"
-  ""
-  [(set (match_dup 3)
-	(gt:VEC_F (match_dup 1)
-		  (match_dup 2)))
-   (set (match_dup 4)
-	(gt:VEC_F (match_dup 2)
-		  (match_dup 1)))
-   (set (match_dup 0)
-	(and:VEC_F (not:VEC_F (match_dup 3))
-		   (not:VEC_F (match_dup 4))))]
-{
-  operands[3] = gen_reg_rtx (<MODE>mode);
-  operands[4] = gen_reg_rtx (<MODE>mode);
-})
+; There are 14 possible vector FP comparison operators, gt and eq of them have
+; been expanded above, so just support 12 remaining operators here.
 
-(define_insn_and_split "vector_ltgt<mode>"
-  [(set (match_operand:VEC_F 0 "vfloat_operand")
-	(ltgt:VEC_F (match_operand:VEC_F 1 "vfloat_operand")
-		    (match_operand:VEC_F 2 "vfloat_operand")))]
+; 0. For ge:
+(define_expand "vector_ge<mode>"
+  [(set (match_operand:VEC_F 0 "vlogical_operand")
+	(ge:VEC_F (match_operand:VEC_F 1 "vlogical_operand")
+		  (match_operand:VEC_F 2 "vlogical_operand")))]
   "VECTOR_UNIT_ALTIVEC_OR_VSX_P (<MODE>mode)"
-  "#"
-  ""
-  [(set (match_dup 3)
-	(gt:VEC_F (match_dup 1)
-		  (match_dup 2)))
-   (set (match_dup 4)
-	(gt:VEC_F (match_dup 2)
-		  (match_dup 1)))
-   (set (match_dup 0)
-	(ior:VEC_F (match_dup 3)
-		   (match_dup 4)))]
-{
-  operands[3] = gen_reg_rtx (<MODE>mode);
-  operands[4] = gen_reg_rtx (<MODE>mode);
-})
+  "")
 
-(define_insn_and_split "vector_ordered<mode>"
+; 1. For lt/le/ne/ungt/unge/unlt/unle:
+; lt(a,b)   = gt(b,a)
+; le(a,b)   = ge(b,a)
+; unge(a,b) = ~lt(a,b)
+; unle(a,b) = ~gt(a,b)
+; ne(a,b)   = ~eq(a,b)
+; ungt(a,b) = ~le(a,b)
+; unlt(a,b) = ~ge(a,b)
+(define_insn_and_split "vector_<code><mode>"
   [(set (match_operand:VEC_F 0 "vfloat_operand")
-	(ordered:VEC_F (match_operand:VEC_F 1 "vfloat_operand")
-		       (match_operand:VEC_F 2 "vfloat_operand")))]
+	(vector_fp_comparison_operator:VEC_F
+			   (match_operand:VEC_F 1 "vfloat_operand")
+			   (match_operand:VEC_F 2 "vfloat_operand")))]
   "VECTOR_UNIT_ALTIVEC_OR_VSX_P (<MODE>mode)"
   "#"
-  ""
-  [(set (match_dup 3)
-	(ge:VEC_F (match_dup 1)
-		  (match_dup 2)))
-   (set (match_dup 4)
-	(ge:VEC_F (match_dup 2)
-		  (match_dup 1)))
-   (set (match_dup 0)
-	(ior:VEC_F (match_dup 3)
-		   (match_dup 4)))]
+  "can_create_pseudo_p ()"
+  [(pc)]
 {
-  operands[3] = gen_reg_rtx (<MODE>mode);
-  operands[4] = gen_reg_rtx (<MODE>mode);
+  enum rtx_code cond = <CODE>;
+  rtx res = gen_reg_rtx (<MODE>mode);
+  bool need_invert = false;
+  switch (cond)
+    {
+    case LT:
+    case LE:
+      cond = swap_condition (cond);
+      std::swap (operands[1], operands[2]);
+      break;
+    case UNLE:
+    case UNLT:
+    case NE:
+    case UNGE:
+    case UNGT:
+      cond = reverse_condition_maybe_unordered (cond);
+      need_invert = true;
+      break;
+    default:
+      gcc_unreachable ();
+    }
+
+  rtx comp = gen_rtx_fmt_ee (cond, <MODE>mode, operands[1], operands[2]);
+  emit_insn (gen_rtx_SET (res, comp));
+
+  if (need_invert)
+    emit_insn (gen_one_cmpl<mode>2 (res, res));
+
+  emit_insn (gen_rtx_SET (operands[0], res));
 })
 
-(define_insn_and_split "vector_unordered<mode>"
+; 2. For ltgt/uneq/ordered/unordered:
+; ltgt: gt(a,b) | gt(b,a)
+; uneq: ~gt(a,b) & ~gt(b,a)
+; ordered: ge(a,b) | ge(b,a)
+; unordered: ~ge(a,b) & ~ge(b,a)
+(define_insn_and_split "vector_<code><mode>"
   [(set (match_operand:VEC_F 0 "vfloat_operand")
-	(unordered:VEC_F (match_operand:VEC_F 1 "vfloat_operand")
-			 (match_operand:VEC_F 2 "vfloat_operand")))]
+	(vector_fp_extra_comparison_operator:VEC_F
+			   (match_operand:VEC_F 1 "vfloat_operand")
+			   (match_operand:VEC_F 2 "vfloat_operand")))]
   "VECTOR_UNIT_ALTIVEC_OR_VSX_P (<MODE>mode)"
   "#"
-  ""
-  [(set (match_dup 3)
-	(ge:VEC_F (match_dup 1)
-		  (match_dup 2)))
-   (set (match_dup 4)
-	(ge:VEC_F (match_dup 2)
-		  (match_dup 1)))
-   (set (match_dup 0)
-        (and:VEC_F (not:VEC_F (match_dup 3))
-                   (not:VEC_F (match_dup 4))))]
+  "can_create_pseudo_p ()"
+  [(pc)]
 {
-  operands[3] = gen_reg_rtx (<MODE>mode);
-  operands[4] = gen_reg_rtx (<MODE>mode);
+  enum rtx_code cond = <CODE>;
+  rtx res1 = gen_reg_rtx (<MODE>mode);
+  rtx res2 = gen_reg_rtx (<MODE>mode);
+  bool need_invert = false;
+  switch (cond)
+    {
+    case UNEQ:
+      need_invert = true;
+    /* Fall through.  */
+    case LTGT:
+      cond = GT;
+      break;
+    case UNORDERED:
+      need_invert = true;
+    /* Fall through.  */
+    case ORDERED:
+      cond = GE;
+      break;
+    default:
+      gcc_unreachable ();
+    }
+
+  rtx comp1 = gen_rtx_fmt_ee (cond, <MODE>mode, operands[1], operands[2]);
+  emit_insn (gen_rtx_SET (res1, comp1));
+  rtx comp2 = gen_rtx_fmt_ee (cond, <MODE>mode, operands[2], operands[1]);
+  emit_insn (gen_rtx_SET (res2, comp2));
+
+  emit_insn (gen_ior<mode>3 (res1, res1, res2));
+
+  if (need_invert)
+    emit_insn (gen_one_cmpl<mode>2 (res1, res1));
+
+  emit_insn (gen_rtx_SET (operands[0], res1));
 })
 
 ;; Note the arguments for __builtin_altivec_vsel are op2, op1, mask
Segher Boessenkool Nov. 19, 2019, 5:29 p.m. UTC | #3
Hi!

On Tue, Nov 12, 2019 at 06:41:07PM +0800, Kewen.Lin wrote:
> +;; code iterators and attributes for vector FP comparison operators:
> +(define_code_iterator vector_fp_comparison_operator [lt le ne
> +						     ungt unge unlt unle])

Let's indent that differently?

(define_code_iterator
  vector_fp_comparison_operator [lt le ne ungt unge unlt unle])

is nice I think.

> +; There are 14 possible vector FP comparison operators, gt and eq of them have
> +; been expanded above, so just support 12 remaining operators here.
>  
> +; 0. For ge:

(Don't number comments please).

> +(define_expand "vector_ge<mode>"
> +  [(set (match_operand:VEC_F 0 "vlogical_operand")
> +	(ge:VEC_F (match_operand:VEC_F 1 "vlogical_operand")
> +		  (match_operand:VEC_F 2 "vlogical_operand")))]
>    "VECTOR_UNIT_ALTIVEC_OR_VSX_P (<MODE>mode)"
> +  "")

This already exists?  Line 764 in vector.md?

> +; 1. For lt/le/ne/ungt/unge/unlt/unle:
> +; lt(a,b)   = gt(b,a)
> +; le(a,b)   = ge(b,a)
> +; unge(a,b) = ~lt(a,b)
> +; unle(a,b) = ~gt(a,b)
> +; ne(a,b)   = ~eq(a,b)
> +; ungt(a,b) = ~le(a,b)
> +; unlt(a,b) = ~ge(a,b)
> +(define_insn_and_split "vector_<code><mode>"
>    [(set (match_operand:VEC_F 0 "vfloat_operand")
> +	(vector_fp_comparison_operator:VEC_F
> +			   (match_operand:VEC_F 1 "vfloat_operand")
> +			   (match_operand:VEC_F 2 "vfloat_operand")))]
>    "VECTOR_UNIT_ALTIVEC_OR_VSX_P (<MODE>mode)"
>    "#"
> +  "can_create_pseudo_p ()"
> +  [(pc)]
>  {
> +  enum rtx_code cond = <CODE>;
> +  rtx res = gen_reg_rtx (<MODE>mode);
> +  bool need_invert = false;
> +  switch (cond)
> +    {
> +    case LT:
> +    case LE:
> +      cond = swap_condition (cond);
> +      std::swap (operands[1], operands[2]);
> +      break;
> +    case UNLE:
> +    case UNLT:
> +    case NE:
> +    case UNGE:
> +    case UNGT:
> +      cond = reverse_condition_maybe_unordered (cond);
> +      need_invert = true;
> +      break;
> +    default:
> +      gcc_unreachable ();
> +    }
> +
> +  rtx comp = gen_rtx_fmt_ee (cond, <MODE>mode, operands[1], operands[2]);
> +  emit_insn (gen_rtx_SET (res, comp));
> +
> +  if (need_invert)
> +    emit_insn (gen_one_cmpl<mode>2 (res, res));
> +
> +  emit_insn (gen_rtx_SET (operands[0], res));
>  })

Does this work for (say) ungt?  I would do two switch statements: the
first only to do the invert, and then the second to do the swap (with
the modified cond!)  So:

{
  enum rtx_code cond = <CODE>;
  bool need_invert = false;

  if (cond == UNLE || cond == UNLT || cond == NE
      || cond == UNGE || cond == UNGT)
    {
      cond = reverse_condition_maybe_unordered (cond);
      need_invert = true;
    }

  if (cond == LT || cond == LE)
    {
      cond = swap_condition (cond);
      std::swap (operands[1], operands[2]);
    }

  rtx comp = gen_rtx_fmt_ee (cond, <MODE>mode, operands[1], operands[2]);

  if (need_invert)
    {
      rtx res = gen_reg_rtx (<MODE>mode);
      emit_insn (gen_rtx_SET (res, comp));
      emit_insn (gen_one_cmpl<mode>2 (operands[0], res));
    }
  else
    emit_insn (gen_rtx_SET (operands[0], comp));
})

> +; 2. For ltgt/uneq/ordered/unordered:
> +; ltgt: gt(a,b) | gt(b,a)
> +; uneq: ~gt(a,b) & ~gt(b,a)
> +; ordered: ge(a,b) | ge(b,a)
> +; unordered: ~ge(a,b) & ~ge(b,a)

You could write that as
  ~(ge(a,b) | ge(b,a))
which may show the structure better?

> +(define_insn_and_split "vector_<code><mode>"
>    [(set (match_operand:VEC_F 0 "vfloat_operand")
> +	(vector_fp_extra_comparison_operator:VEC_F
> +			   (match_operand:VEC_F 1 "vfloat_operand")
> +			   (match_operand:VEC_F 2 "vfloat_operand")))]
>    "VECTOR_UNIT_ALTIVEC_OR_VSX_P (<MODE>mode)"
>    "#"
> +  "can_create_pseudo_p ()"

This should be "&& can_create_pseudo ()".

Also, can_create_pseudo in the split condition can fail, in theory
anyway.  It should be part of the insn condition, instead, and even
then it can fail: after the last split pass, but before reload. such
an insn can in principle be created, and then it sill be never split.

In this case, maybe you should add a scratch register; in the previous
case, you can reuse operands[0] for res instead of making a new reg
for it, if !can_create_pseudo ().

>  {
> +  enum rtx_code cond = <CODE>;
> +  rtx res1 = gen_reg_rtx (<MODE>mode);
> +  rtx res2 = gen_reg_rtx (<MODE>mode);
> +  bool need_invert = false;
> +  switch (cond)
> +    {
> +    case UNEQ:
> +      need_invert = true;
> +    /* Fall through.  */
> +    case LTGT:
> +      cond = GT;
> +      break;
> +    case UNORDERED:
> +      need_invert = true;
> +    /* Fall through.  */
> +    case ORDERED:
> +      cond = GE;
> +      break;
> +    default:
> +      gcc_unreachable ();
> +    }
> +
> +  rtx comp1 = gen_rtx_fmt_ee (cond, <MODE>mode, operands[1], operands[2]);
> +  emit_insn (gen_rtx_SET (res1, comp1));
> +  rtx comp2 = gen_rtx_fmt_ee (cond, <MODE>mode, operands[2], operands[1]);
> +  emit_insn (gen_rtx_SET (res2, comp2));
> +
> +  emit_insn (gen_ior<mode>3 (res1, res1, res2));
> +
> +  if (need_invert)
> +    emit_insn (gen_one_cmpl<mode>2 (res1, res1));
> +
> +  emit_insn (gen_rtx_SET (operands[0], res1));
>  })

You can do this similarly:

{
  enum rtx_code cond = <CODE>;
  bool need_invert = false;

  if (cond == UNORDERED || cond == UNEQ)
    {
      cond = reverse_condition_maybe_unordered (cond);
      need_invert = true;
    }

  switch (cond)
    {
    case LTGT:
      cond = GT;
      break;
    case ORDERED:
      cond = GE;
      break;
    default:
      gcc_unreachable ();
    }

  rtx comp1 = gen_rtx_fmt_ee (cond, <MODE>mode, operands[1], operands[2]);
  rtx res1 = gen_reg_rtx (<MODE>mode);
  emit_insn (gen_rtx_SET (res1, comp1));
  rtx comp2 = gen_rtx_fmt_ee (cond, <MODE>mode, operands[2], operands[1]);
  rtx res2 = gen_reg_rtx (<MODE>mode);
  emit_insn (gen_rtx_SET (res2, comp2));

  if (need_invert)
    {
      rtx comp3 = gen_rtx_fmt_ee (IOR, <MODE>mode, res1, res2);
      rtx comp4 = gen_rtx_fmt_e (NOT, <MODE>mode, comp3);
      emit_insn (gen_rtx_SET (operands[0], comp4));
    }
  else
    emit_insn (gen_ior<mode>3 (operands[0], res1, res2));
})

(the result of a split does not get combine etc. optimisations anymore,
so you have to generate pretty much ideal code directly).

HTH, sorry for the late reply,


Segher
Kewen.Lin Nov. 20, 2019, 7:31 a.m. UTC | #4
Hi Segher,

on 2019/11/20 上午1:29, Segher Boessenkool wrote:
> Hi!
> 
> On Tue, Nov 12, 2019 at 06:41:07PM +0800, Kewen.Lin wrote:
>> +;; code iterators and attributes for vector FP comparison operators:
>> +(define_code_iterator vector_fp_comparison_operator [lt le ne
>> +						     ungt unge unlt unle])
> 
> Let's indent that differently?
> 
> (define_code_iterator
>   vector_fp_comparison_operator [lt le ne ungt unge unlt unle])
> 
> is nice I think.
> 

Done.

>> +; There are 14 possible vector FP comparison operators, gt and eq of them have
>> +; been expanded above, so just support 12 remaining operators here.
>>  
>> +; 0. For ge:
> 
> (Don't number comments please).
> 

Done.

>> +(define_expand "vector_ge<mode>"
>> +  [(set (match_operand:VEC_F 0 "vlogical_operand")
>> +	(ge:VEC_F (match_operand:VEC_F 1 "vlogical_operand")
>> +		  (match_operand:VEC_F 2 "vlogical_operand")))]
>>    "VECTOR_UNIT_ALTIVEC_OR_VSX_P (<MODE>mode)"
>> +  "")
> 
> This already exists?  Line 764 in vector.md?
> 

Yes, that one is only for VEC_F (aka. vector FP), move to here to centralize
them.  Does it make sense?

>> +; 1. For lt/le/ne/ungt/unge/unlt/unle:
>> +; lt(a,b)   = gt(b,a)
>> +; le(a,b)   = ge(b,a)
>> +; unge(a,b) = ~lt(a,b)
>> +; unle(a,b) = ~gt(a,b)
>> +; ne(a,b)   = ~eq(a,b)
>> +; ungt(a,b) = ~le(a,b)
>> +; unlt(a,b) = ~ge(a,b)
>> +(define_insn_and_split "vector_<code><mode>"
>>    [(set (match_operand:VEC_F 0 "vfloat_operand")
>> +	(vector_fp_comparison_operator:VEC_F
>> +			   (match_operand:VEC_F 1 "vfloat_operand")
>> +			   (match_operand:VEC_F 2 "vfloat_operand")))]
>>    "VECTOR_UNIT_ALTIVEC_OR_VSX_P (<MODE>mode)"
>>    "#"
>> +  "can_create_pseudo_p ()"
>> +  [(pc)]
>>  {
>> +  enum rtx_code cond = <CODE>;
>> +  rtx res = gen_reg_rtx (<MODE>mode);
>> +  bool need_invert = false;
>> +  switch (cond)
>> +    {
>> +    case LT:
>> +    case LE:
>> +      cond = swap_condition (cond);
>> +      std::swap (operands[1], operands[2]);
>> +      break;
>> +    case UNLE:
>> +    case UNLT:
>> +    case NE:
>> +    case UNGE:
>> +    case UNGT:
>> +      cond = reverse_condition_maybe_unordered (cond);
>> +      need_invert = true;
>> +      break;
>> +    default:
>> +      gcc_unreachable ();
>> +    }
>> +
>> +  rtx comp = gen_rtx_fmt_ee (cond, <MODE>mode, operands[1], operands[2]);
>> +  emit_insn (gen_rtx_SET (res, comp));
>> +
>> +  if (need_invert)
>> +    emit_insn (gen_one_cmpl<mode>2 (res, res));
>> +
>> +  emit_insn (gen_rtx_SET (operands[0], res));
>>  })
> 
> Does this work for (say) ungt?  I would do two switch statements: the
> first only to do the invert, and then the second to do the swap (with
> the modified cond!)  So:
> 

Yes, it works but it has to call further splitting for "le".  It looks
better to split to the end at once.  Thanks a lot for your example!

> {
>   enum rtx_code cond = <CODE>;
>   bool need_invert = false;
> 
>   if (cond == UNLE || cond == UNLT || cond == NE
>       || cond == UNGE || cond == UNGT)
>     {
>       cond = reverse_condition_maybe_unordered (cond);
>       need_invert = true;
>     }
> 
>   if (cond == LT || cond == LE)
>     {
>       cond = swap_condition (cond);
>       std::swap (operands[1], operands[2]);
>     }
> 

I added one assert here to guard the codes.

>   rtx comp = gen_rtx_fmt_ee (cond, <MODE>mode, operands[1], operands[2]);
> 
>   if (need_invert)
>     {
>       rtx res = gen_reg_rtx (<MODE>mode);
>       emit_insn (gen_rtx_SET (res, comp));
>       emit_insn (gen_one_cmpl<mode>2 (operands[0], res));
>     }
>   else
>     emit_insn (gen_rtx_SET (operands[0], comp));
> })
> 

>> +; 2. For ltgt/uneq/ordered/unordered:
>> +; ltgt: gt(a,b) | gt(b,a)
>> +; uneq: ~gt(a,b) & ~gt(b,a)
>> +; ordered: ge(a,b) | ge(b,a)
>> +; unordered: ~ge(a,b) & ~ge(b,a)
> 
> You could write that as
>   ~(ge(a,b) | ge(b,a))
> which may show the structure better?
> 

Done.

>> +(define_insn_and_split "vector_<code><mode>"
>>    [(set (match_operand:VEC_F 0 "vfloat_operand")
>> +	(vector_fp_extra_comparison_operator:VEC_F
>> +			   (match_operand:VEC_F 1 "vfloat_operand")
>> +			   (match_operand:VEC_F 2 "vfloat_operand")))]
>>    "VECTOR_UNIT_ALTIVEC_OR_VSX_P (<MODE>mode)"
>>    "#"
>> +  "can_create_pseudo_p ()"
> 
> This should be "&& can_create_pseudo ()".
> 

Done.

> Also, can_create_pseudo in the split condition can fail, in theory
> anyway.  It should be part of the insn condition, instead, and even
> then it can fail: after the last split pass, but before reload. such
> an insn can in principle be created, and then it sill be never split.
> 
> In this case, maybe you should add a scratch register; in the previous
> case, you can reuse operands[0] for res instead of making a new reg
> for it, if !can_create_pseudo ().
> 

I've updated the previous case with operands[0] with !can_create_pseudo
check.  But for the later one I met ICE if I added two clobbers with
scratch into the RTL pattern like:

  [(set (match_operand:VEC_F 0 "vfloat_operand")
       (vector_fp_extra_comparison_operator:VEC_F
       (match_operand:VEC_F 1 "vfloat_operand")
       (match_operand:VEC_F 2 "vfloat_operand")))
   (clobber (match_scratch:VEC_F 3))
   (clobber (match_scratch:VEC_F 4))]

Some pattern without clobber as below can't be recognized (early in vregs).

(insn 24 23 25 4 (set (reg:V4SF 142)
        (uneq:V4SF (reg:V4SF 141 [ vect__4.11 ])
            (reg:V4SF 127 [ vect_cst__47 ]))) -1
     (nil))

I may miss something, but I can't find a way to satisfy both, to recog the 
simplified pattern and the pattern with clobber.  So I kept it as existing
codes for now to only handle can_create_pseudo (the old define_insn_and_split
even doesn't check can_create_pseudo, but it should since it tries to
gen_reg_rtx?).  Looking forward to your further suggestion!

>>  {
>> +  enum rtx_code cond = <CODE>;
>> +  rtx res1 = gen_reg_rtx (<MODE>mode);
>> +  rtx res2 = gen_reg_rtx (<MODE>mode);
>> +  bool need_invert = false;
>> +  switch (cond)
>> +    {
>> +    case UNEQ:
>> +      need_invert = true;
>> +    /* Fall through.  */
>> +    case LTGT:
>> +      cond = GT;
>> +      break;
>> +    case UNORDERED:
>> +      need_invert = true;
>> +    /* Fall through.  */
>> +    case ORDERED:
>> +      cond = GE;
>> +      break;
>> +    default:
>> +      gcc_unreachable ();
>> +    }
>> +
>> +  rtx comp1 = gen_rtx_fmt_ee (cond, <MODE>mode, operands[1], operands[2]);
>> +  emit_insn (gen_rtx_SET (res1, comp1));
>> +  rtx comp2 = gen_rtx_fmt_ee (cond, <MODE>mode, operands[2], operands[1]);
>> +  emit_insn (gen_rtx_SET (res2, comp2));
>> +
>> +  emit_insn (gen_ior<mode>3 (res1, res1, res2));
>> +
>> +  if (need_invert)
>> +    emit_insn (gen_one_cmpl<mode>2 (res1, res1));
>> +
>> +  emit_insn (gen_rtx_SET (operands[0], res1));
>>  })
> 
> You can do this similarly:
> 
> {
>   enum rtx_code cond = <CODE>;
>   bool need_invert = false;
> 
>   if (cond == UNORDERED || cond == UNEQ)
>     {
>       cond = reverse_condition_maybe_unordered (cond);
>       need_invert = true;
>     }
> 
>   switch (cond)
>     {
>     case LTGT:
>       cond = GT;
>       break;
>     case ORDERED:
>       cond = GE;
>       break;
>     default:
>       gcc_unreachable ();
>     }
> 
>   rtx comp1 = gen_rtx_fmt_ee (cond, <MODE>mode, operands[1], operands[2]);
>   rtx res1 = gen_reg_rtx (<MODE>mode);
>   emit_insn (gen_rtx_SET (res1, comp1));
>   rtx comp2 = gen_rtx_fmt_ee (cond, <MODE>mode, operands[2], operands[1]);
>   rtx res2 = gen_reg_rtx (<MODE>mode);
>   emit_insn (gen_rtx_SET (res2, comp2));
> 
>   if (need_invert)
>     {
>       rtx comp3 = gen_rtx_fmt_ee (IOR, <MODE>mode, res1, res2);
>       rtx comp4 = gen_rtx_fmt_e (NOT, <MODE>mode, comp3);

I had to add one more gen_rtx_SET for the IOR here, otherwise it gets the error
on the pattern can't be recognized.

>       emit_insn (gen_rtx_SET (operands[0], comp4));
>     }
>   else
>     emit_insn (gen_ior<mode>3 (operands[0], res1, res2));
> })
> 
> (the result of a split does not get combine etc. optimisations anymore,
> so you have to generate pretty much ideal code directly).
> 
> HTH, sorry for the late reply,

Yeah, it helps a lot, thanks again!

I have attached the updated version to get your further review.


BR,
Kewen
diff --git a/gcc/config/rs6000/vector.md b/gcc/config/rs6000/vector.md
index b132037..deeab9f 100644
--- a/gcc/config/rs6000/vector.md
+++ b/gcc/config/rs6000/vector.md
@@ -107,6 +107,12 @@
 				 (smin "smin")
 				 (smax "smax")])
 
+;; code iterators and attributes for vector FP comparison operators:
+(define_code_iterator
+	vector_fp_comparison_operator [lt le ne ungt unge unlt unle])
+(define_code_iterator
+	vector_fp_extra_comparison_operator [ltgt uneq unordered ordered])
+
 
 ;; Vector move instructions.  Little-endian VSX loads and stores require
 ;; special handling to circumvent "element endianness."
@@ -665,88 +671,6 @@
   DONE;
 })
 
-; lt(a,b) = gt(b,a)
-(define_expand "vector_lt<mode>"
-  [(set (match_operand:VEC_F 0 "vfloat_operand")
-	(lt:VEC_F (match_operand:VEC_F 1 "vfloat_operand")
-		  (match_operand:VEC_F 2 "vfloat_operand")))]
-  "VECTOR_UNIT_ALTIVEC_OR_VSX_P (<MODE>mode)"
-{
-  emit_insn (gen_vector_gt<mode> (operands[0], operands[2], operands[1]));
-  DONE;
-})
-
-; le(a,b) = ge(b,a)
-(define_expand "vector_le<mode>"
-  [(set (match_operand:VEC_F 0 "vfloat_operand")
-	(le:VEC_F (match_operand:VEC_F 1 "vfloat_operand")
-		  (match_operand:VEC_F 2 "vfloat_operand")))]
-  "VECTOR_UNIT_ALTIVEC_OR_VSX_P (<MODE>mode)"
-{
-  emit_insn (gen_vector_ge<mode> (operands[0], operands[2], operands[1]));
-  DONE;
-})
-
-; ne(a,b) = ~eq(a,b)
-(define_expand "vector_ne<mode>"
-  [(set (match_operand:VEC_F 0 "vfloat_operand")
-	(ne:VEC_F (match_operand:VEC_F 1 "vfloat_operand")
-		  (match_operand:VEC_F 2 "vfloat_operand")))]
-  "VECTOR_UNIT_ALTIVEC_OR_VSX_P (<MODE>mode)"
-{
-  emit_insn (gen_vector_eq<mode> (operands[0], operands[1], operands[2]));
-  emit_insn (gen_one_cmpl<mode>2 (operands[0], operands[0]));
-  DONE;
-})
-
-; unge(a,b) = ~gt(b,a)
-(define_expand "vector_unge<mode>"
-  [(set (match_operand:VEC_F 0 "vfloat_operand")
-	(unge:VEC_F (match_operand:VEC_F 1 "vfloat_operand")
-		    (match_operand:VEC_F 2 "vfloat_operand")))]
-  "VECTOR_UNIT_ALTIVEC_OR_VSX_P (<MODE>mode)"
-{
-  emit_insn (gen_vector_gt<mode> (operands[0], operands[2], operands[1]));
-  emit_insn (gen_one_cmpl<mode>2 (operands[0], operands[0]));
-  DONE;
-})
-
-; ungt(a,b) = ~ge(b,a)
-(define_expand "vector_ungt<mode>"
-  [(set (match_operand:VEC_F 0 "vfloat_operand")
-	(ungt:VEC_F (match_operand:VEC_F 1 "vfloat_operand")
-		    (match_operand:VEC_F 2 "vfloat_operand")))]
-  "VECTOR_UNIT_ALTIVEC_OR_VSX_P (<MODE>mode)"
-{
-  emit_insn (gen_vector_ge<mode> (operands[0], operands[2], operands[1]));
-  emit_insn (gen_one_cmpl<mode>2 (operands[0], operands[0]));
-  DONE;
-})
-
-; unle(a,b) = ~gt(a,b)
-(define_expand "vector_unle<mode>"
-  [(set (match_operand:VEC_F 0 "vfloat_operand")
-	(unle:VEC_F (match_operand:VEC_F 1 "vfloat_operand")
-		    (match_operand:VEC_F 2 "vfloat_operand")))]
-  "VECTOR_UNIT_ALTIVEC_OR_VSX_P (<MODE>mode)"
-{
-  emit_insn (gen_vector_gt<mode> (operands[0], operands[1], operands[2]));
-  emit_insn (gen_one_cmpl<mode>2 (operands[0], operands[0]));
-  DONE;
-})
-
-; unlt(a,b) = ~ge(a,b)
-(define_expand "vector_unlt<mode>"
-  [(set (match_operand:VEC_F 0 "vfloat_operand")
-	(unlt:VEC_F (match_operand:VEC_F 1 "vfloat_operand")
-		    (match_operand:VEC_F 2 "vfloat_operand")))]
-  "VECTOR_UNIT_ALTIVEC_OR_VSX_P (<MODE>mode)"
-{
-  emit_insn (gen_vector_ge<mode> (operands[0], operands[1], operands[2]));
-  emit_insn (gen_one_cmpl<mode>2 (operands[0], operands[0]));
-  DONE;
-})
-
 (define_expand "vector_eq<mode>"
   [(set (match_operand:VEC_C 0 "vlogical_operand")
 	(eq:VEC_C (match_operand:VEC_C 1 "vlogical_operand")
@@ -761,13 +685,6 @@
   "VECTOR_UNIT_ALTIVEC_OR_VSX_P (<MODE>mode)"
   "")
 
-(define_expand "vector_ge<mode>"
-  [(set (match_operand:VEC_F 0 "vlogical_operand")
-	(ge:VEC_F (match_operand:VEC_F 1 "vlogical_operand")
-		  (match_operand:VEC_F 2 "vlogical_operand")))]
-  "VECTOR_UNIT_ALTIVEC_OR_VSX_P (<MODE>mode)"
-  "")
-
 ; >= for integer vectors: swap operands and apply not-greater-than
 (define_expand "vector_nlt<mode>"
   [(set (match_operand:VEC_I 3 "vlogical_operand")
@@ -829,88 +746,118 @@
   operands[3] = gen_reg_rtx_and_attrs (operands[0]);
 })
 
-(define_insn_and_split "vector_uneq<mode>"
-  [(set (match_operand:VEC_F 0 "vfloat_operand")
-	(uneq:VEC_F (match_operand:VEC_F 1 "vfloat_operand")
-		    (match_operand:VEC_F 2 "vfloat_operand")))]
-  "VECTOR_UNIT_ALTIVEC_OR_VSX_P (<MODE>mode)"
-  "#"
-  ""
-  [(set (match_dup 3)
-	(gt:VEC_F (match_dup 1)
-		  (match_dup 2)))
-   (set (match_dup 4)
-	(gt:VEC_F (match_dup 2)
-		  (match_dup 1)))
-   (set (match_dup 0)
-	(and:VEC_F (not:VEC_F (match_dup 3))
-		   (not:VEC_F (match_dup 4))))]
-{
-  operands[3] = gen_reg_rtx (<MODE>mode);
-  operands[4] = gen_reg_rtx (<MODE>mode);
-})
+; There are 14 possible vector FP comparison operators, gt and eq of them have
+; been expanded above, so just support 12 remaining operators here.
 
-(define_insn_and_split "vector_ltgt<mode>"
-  [(set (match_operand:VEC_F 0 "vfloat_operand")
-	(ltgt:VEC_F (match_operand:VEC_F 1 "vfloat_operand")
-		    (match_operand:VEC_F 2 "vfloat_operand")))]
+; For ge:
+(define_expand "vector_ge<mode>"
+  [(set (match_operand:VEC_F 0 "vlogical_operand")
+	(ge:VEC_F (match_operand:VEC_F 1 "vlogical_operand")
+		  (match_operand:VEC_F 2 "vlogical_operand")))]
   "VECTOR_UNIT_ALTIVEC_OR_VSX_P (<MODE>mode)"
-  "#"
-  ""
-  [(set (match_dup 3)
-	(gt:VEC_F (match_dup 1)
-		  (match_dup 2)))
-   (set (match_dup 4)
-	(gt:VEC_F (match_dup 2)
-		  (match_dup 1)))
-   (set (match_dup 0)
-	(ior:VEC_F (match_dup 3)
-		   (match_dup 4)))]
-{
-  operands[3] = gen_reg_rtx (<MODE>mode);
-  operands[4] = gen_reg_rtx (<MODE>mode);
-})
+  "")
 
-(define_insn_and_split "vector_ordered<mode>"
+; For lt/le/ne/ungt/unge/unlt/unle:
+; lt(a,b)   = gt(b,a)
+; le(a,b)   = ge(b,a)
+; unge(a,b) = ~lt(a,b)
+; unle(a,b) = ~gt(a,b)
+; ne(a,b)   = ~eq(a,b)
+; ungt(a,b) = ~le(a,b)
+; unlt(a,b) = ~ge(a,b)
+(define_insn_and_split "vector_<code><mode>"
   [(set (match_operand:VEC_F 0 "vfloat_operand")
-	(ordered:VEC_F (match_operand:VEC_F 1 "vfloat_operand")
-		       (match_operand:VEC_F 2 "vfloat_operand")))]
+	(vector_fp_comparison_operator:VEC_F
+			   (match_operand:VEC_F 1 "vfloat_operand")
+			   (match_operand:VEC_F 2 "vfloat_operand")))]
   "VECTOR_UNIT_ALTIVEC_OR_VSX_P (<MODE>mode)"
   "#"
-  ""
-  [(set (match_dup 3)
-	(ge:VEC_F (match_dup 1)
-		  (match_dup 2)))
-   (set (match_dup 4)
-	(ge:VEC_F (match_dup 2)
-		  (match_dup 1)))
-   (set (match_dup 0)
-	(ior:VEC_F (match_dup 3)
-		   (match_dup 4)))]
+  "&& 1"
+  [(pc)]
 {
-  operands[3] = gen_reg_rtx (<MODE>mode);
-  operands[4] = gen_reg_rtx (<MODE>mode);
+  enum rtx_code cond = <CODE>;
+  bool need_invert = false;
+
+  if (cond == UNLE || cond == UNLT || cond == NE || cond == UNGE
+      || cond == UNGT)
+    {
+      cond = reverse_condition_maybe_unordered (cond);
+      need_invert = true;
+    }
+
+  if (cond == LT || cond == LE)
+    {
+      cond = swap_condition (cond);
+      std::swap (operands[1], operands[2]);
+    }
+
+  gcc_assert (cond == EQ || cond == GE || cond == GT);
+
+  rtx comp = gen_rtx_fmt_ee (cond, <MODE>mode, operands[1], operands[2]);
+
+  if (need_invert)
+    {
+      rtx res
+	= (!can_create_pseudo_p () ? operands[0] : gen_reg_rtx (<MODE>mode));
+      emit_insn (gen_rtx_SET (res, comp));
+      emit_insn (gen_one_cmpl<mode>2 (operands[0], res));
+    }
+  else
+    emit_insn (gen_rtx_SET (operands[0], comp));
 })
 
-(define_insn_and_split "vector_unordered<mode>"
+; For ltgt/uneq/ordered/unordered:
+; ltgt: gt(a,b) | gt(b,a)
+; uneq: ~(gt(a,b) | gt(b,a))
+; ordered: ge(a,b) | ge(b,a)
+; unordered: ~(ge(a,b) | ge(b,a))
+(define_insn_and_split "vector_<code><mode>"
   [(set (match_operand:VEC_F 0 "vfloat_operand")
-	(unordered:VEC_F (match_operand:VEC_F 1 "vfloat_operand")
-			 (match_operand:VEC_F 2 "vfloat_operand")))]
-  "VECTOR_UNIT_ALTIVEC_OR_VSX_P (<MODE>mode)"
+	(vector_fp_extra_comparison_operator:VEC_F
+			   (match_operand:VEC_F 1 "vfloat_operand")
+			   (match_operand:VEC_F 2 "vfloat_operand")))]
+  "VECTOR_UNIT_ALTIVEC_OR_VSX_P (<MODE>mode) && can_create_pseudo_p ()"
   "#"
-  ""
-  [(set (match_dup 3)
-	(ge:VEC_F (match_dup 1)
-		  (match_dup 2)))
-   (set (match_dup 4)
-	(ge:VEC_F (match_dup 2)
-		  (match_dup 1)))
-   (set (match_dup 0)
-        (and:VEC_F (not:VEC_F (match_dup 3))
-                   (not:VEC_F (match_dup 4))))]
+  "&& 1"
+  [(pc)]
 {
-  operands[3] = gen_reg_rtx (<MODE>mode);
-  operands[4] = gen_reg_rtx (<MODE>mode);
+  enum rtx_code cond = <CODE>;
+  bool need_invert = false;
+
+  if (cond == UNORDERED || cond == UNEQ)
+    {
+      cond = reverse_condition_maybe_unordered (cond);
+      need_invert = true;
+    }
+
+  switch (cond)
+    {
+    case LTGT:
+      cond = GT;
+      break;
+    case ORDERED:
+      cond = GE;
+      break;
+    default:
+      gcc_unreachable ();
+    }
+
+  rtx comp1 = gen_rtx_fmt_ee (cond, <MODE>mode, operands[1], operands[2]);
+  rtx res1 = gen_reg_rtx (<MODE>mode);
+  emit_insn (gen_rtx_SET (res1, comp1));
+  rtx comp2 = gen_rtx_fmt_ee (cond, <MODE>mode, operands[2], operands[1]);
+  rtx res2 = gen_reg_rtx (<MODE>mode);
+  emit_insn (gen_rtx_SET (res2, comp2));
+
+  if (need_invert)
+    {
+      rtx comp3 = gen_rtx_fmt_ee (IOR, <MODE>mode, res1, res2);
+      emit_insn (gen_rtx_SET (operands[0], comp3));
+      rtx comp4 = gen_rtx_fmt_e (NOT, <MODE>mode, operands[0]);
+      emit_insn (gen_rtx_SET (operands[0], comp4));
+    }
+  else
+    emit_insn (gen_ior<mode>3 (operands[0], res1, res2));
 })
 
 ;; Note the arguments for __builtin_altivec_vsel are op2, op1, mask
Segher Boessenkool Nov. 20, 2019, 2:06 p.m. UTC | #5
Hi!

On Wed, Nov 20, 2019 at 03:31:36PM +0800, Kewen.Lin wrote:
> >> +(define_expand "vector_ge<mode>"
> >> +  [(set (match_operand:VEC_F 0 "vlogical_operand")
> >> +	(ge:VEC_F (match_operand:VEC_F 1 "vlogical_operand")
> >> +		  (match_operand:VEC_F 2 "vlogical_operand")))]
> >>    "VECTOR_UNIT_ALTIVEC_OR_VSX_P (<MODE>mode)"
> >> +  "")
> > 
> > This already exists?  Line 764 in vector.md?
> > 
> 
> Yes, that one is only for VEC_F (aka. vector FP), move to here to centralize
> them.  Does it make sense?

Ah, I see.  Yeah, that makes sense.

> > Does this work for (say) ungt?  I would do two switch statements: the
> > first only to do the invert, and then the second to do the swap (with
> > the modified cond!)  So:
> 
> Yes, it works but it has to call further splitting for "le".  It looks
> better to split to the end at once.  Thanks a lot for your example!

Yes, exactly.  The code is simpler in this case, and more importantly,
it is clearer that the code the splitter outputs is somewhat optimsed.
Output from splitters does not get most basic RTL optimisations anymore,
the splitters run after those!

> >   if (cond == UNLE || cond == UNLT || cond == NE
> >       || cond == UNGE || cond == UNGT)
> >     {
> >       cond = reverse_condition_maybe_unordered (cond);
> >       need_invert = true;
> >     }
> > 
> >   if (cond == LT || cond == LE)
> >     {
> >       cond = swap_condition (cond);
> >       std::swap (operands[1], operands[2]);
> >     }
> 
> I added one assert here to guard the codes.

Sure, if you aren't certain no other codes can happen, or you expect
things to ever change, etc.

> > Also, can_create_pseudo in the split condition can fail, in theory
> > anyway.  It should be part of the insn condition, instead, and even
> > then it can fail: after the last split pass, but before reload. such
> > an insn can in principle be created, and then it sill be never split.
> > 
> > In this case, maybe you should add a scratch register; in the previous
> > case, you can reuse operands[0] for res instead of making a new reg
> > for it, if !can_create_pseudo ().
> 
> I've updated the previous case with operands[0] with !can_create_pseudo
> check.  But for the later one I met ICE if I added two clobbers with
> scratch into the RTL pattern like:
> 
>   [(set (match_operand:VEC_F 0 "vfloat_operand")
>        (vector_fp_extra_comparison_operator:VEC_F
>        (match_operand:VEC_F 1 "vfloat_operand")
>        (match_operand:VEC_F 2 "vfloat_operand")))
>    (clobber (match_scratch:VEC_F 3))
>    (clobber (match_scratch:VEC_F 4))]
> 
> Some pattern without clobber as below can't be recognized (early in vregs).
> 
> (insn 24 23 25 4 (set (reg:V4SF 142)
>         (uneq:V4SF (reg:V4SF 141 [ vect__4.11 ])
>             (reg:V4SF 127 [ vect_cst__47 ]))) -1
>      (nil))

Yeah.  Just doing can_create_pseudo in the insn condition (and in the
split condition, via &&) will work -- there just is this window of
failure you should be aware of, and we need to do something about that,
eventually.  It's a very general problem.  There are many places in
rs6000 where we already do evil things with this (and some much worse
than this even), so one more won't hurt too much.

It does need can_create_pseudo in the split condition then though.

> >   rtx comp1 = gen_rtx_fmt_ee (cond, <MODE>mode, operands[1], operands[2]);
> >   rtx res1 = gen_reg_rtx (<MODE>mode);
> >   emit_insn (gen_rtx_SET (res1, comp1));
> >   rtx comp2 = gen_rtx_fmt_ee (cond, <MODE>mode, operands[2], operands[1]);
> >   rtx res2 = gen_reg_rtx (<MODE>mode);
> >   emit_insn (gen_rtx_SET (res2, comp2));
> > 
> >   if (need_invert)
> >     {
> >       rtx comp3 = gen_rtx_fmt_ee (IOR, <MODE>mode, res1, res2);
> >       rtx comp4 = gen_rtx_fmt_e (NOT, <MODE>mode, comp3);
> 
> I had to add one more gen_rtx_SET for the IOR here, otherwise it gets the error
> on the pattern can't be recognized.

Oh, sorry.  The canonical way of writing a NOR is as an ANDCC, so this
should be

      rtx not1 = gen_rtx_fmt_e (NOT, <MODE>mode, res1);
      rtx not2 = gen_rtx_fmt_e (NOT, <MODE>mode, res2);
      rtx comp3 = gen_rtx_fmt_ee (AND, <MODE>mode, not1, not2);
      emit_insn (gen_rtx_SET (operands[0], comp3));

> diff --git a/gcc/config/rs6000/vector.md b/gcc/config/rs6000/vector.md
> index b132037..deeab9f 100644
> --- a/gcc/config/rs6000/vector.md
> +++ b/gcc/config/rs6000/vector.md
> @@ -107,6 +107,12 @@
>  				 (smin "smin")
>  				 (smax "smax")])
>  
> +;; code iterators and attributes for vector FP comparison operators:
> +(define_code_iterator
> +	vector_fp_comparison_operator [lt le ne ungt unge unlt unle])

Can you change this name so it is clear it is just the simple ones?


Segher
Kewen.Lin Nov. 21, 2019, 1:15 p.m. UTC | #6
Hi Segher,

on 2019/11/20 下午10:06, Segher Boessenkool wrote:
> Hi!
> 
> On Wed, Nov 20, 2019 at 03:31:36PM +0800, Kewen.Lin wrote:
> Yeah.  Just doing can_create_pseudo in the insn condition (and in the
> split condition, via &&) will work -- there just is this window of
> failure you should be aware of, and we need to do something about that,
> eventually.  It's a very general problem.  There are many places in
> rs6000 where we already do evil things with this (and some much worse
> than this even), so one more won't hurt too much.
> 
> It does need can_create_pseudo in the split condition then though.
> 

Done.

>>>   rtx comp1 = gen_rtx_fmt_ee (cond, <MODE>mode, operands[1], operands[2]);
>>>   rtx res1 = gen_reg_rtx (<MODE>mode);
>>>   emit_insn (gen_rtx_SET (res1, comp1));
>>>   rtx comp2 = gen_rtx_fmt_ee (cond, <MODE>mode, operands[2], operands[1]);
>>>   rtx res2 = gen_reg_rtx (<MODE>mode);
>>>   emit_insn (gen_rtx_SET (res2, comp2));
>>>
>>>   if (need_invert)
>>>     {
>>>       rtx comp3 = gen_rtx_fmt_ee (IOR, <MODE>mode, res1, res2);
>>>       rtx comp4 = gen_rtx_fmt_e (NOT, <MODE>mode, comp3);
>>
>> I had to add one more gen_rtx_SET for the IOR here, otherwise it gets the error
>> on the pattern can't be recognized.
> 
> Oh, sorry.  The canonical way of writing a NOR is as an ANDCC, so this
> should be
> 
>       rtx not1 = gen_rtx_fmt_e (NOT, <MODE>mode, res1);
>       rtx not2 = gen_rtx_fmt_e (NOT, <MODE>mode, res2);
>       rtx comp3 = gen_rtx_fmt_ee (AND, <MODE>mode, not1, not2);
>       emit_insn (gen_rtx_SET (operands[0], comp3));
> 

Thanks for the example, updated.

>> diff --git a/gcc/config/rs6000/vector.md b/gcc/config/rs6000/vector.md
>> index b132037..deeab9f 100644
>> --- a/gcc/config/rs6000/vector.md
>> +++ b/gcc/config/rs6000/vector.md
>> @@ -107,6 +107,12 @@
>>  				 (smin "smin")
>>  				 (smax "smax")])
>>  
>> +;; code iterators and attributes for vector FP comparison operators:
>> +(define_code_iterator
>> +	vector_fp_comparison_operator [lt le ne ungt unge unlt unle])
> 
> Can you change this name so it is clear it is just the simple ones?
> 

Renamed as vector_fp_comparison_simple and vector_fp_comparison_complex,
does it look better?

New version attached, compared with last version:
  1) Change code iterator names to vector_fp_comparison_simple
     and vector_fp_comparison_complex.
  2) Add can_create_pseudo in the split condition.
  3) Update NOR generation with AND (NOT, NOT).
  4) Append "DONE" at the end of define_insn_and_split.

Bootstrapped and regress tested on powerpc64le-linux-gnu.

BR,
Kewen

----
gcc/ChangeLog

2019-11-21 Kewen Lin  <linkw@gcc.gnu.org>

	* config/rs6000/vector.md (vector_fp_comparison_simple):
	New code iterator.
	(vector_fp_comparison_complex): Likewise.
	(vector_<code><mode> for VEC_F and
	vector_fp_comparison_simple): New define_and_split.
	(vector_<code><mode> for VEC_F and
	vector_fp_comparison_complex): Likewise.
	(vector_lt<mode> for VEC_F): Refactor with
	vector_fp_comparison_simple.
	(vector_le<mode> for VEC_F): Likewise.
	(vector_unge<mode> for VEC_F): Likewise.
	(vector_unle<mode> for VEC_F): Likewise.
	(vector_ne<mode> for VEC_F): Likewise.
	(vector_ungt<mode> for VEC_F): Likewise.
	(vector_unlt<mode> for VEC_F): Likewise.
	(vector_ltgt<mode> for VEC_F): Refactor with
	vector_fp_comparison_complex.
	(vector_ordered<mode> for VEC_F): Likewise.
	(vector_uneq<mode> for VEC_F): Likewise.
	(vector_unordered<mode> for VEC_F): Likewise.
diff --git a/gcc/config/rs6000/vector.md b/gcc/config/rs6000/vector.md
index b132037..aa3154a 100644
--- a/gcc/config/rs6000/vector.md
+++ b/gcc/config/rs6000/vector.md
@@ -107,6 +107,12 @@
 				 (smin "smin")
 				 (smax "smax")])
 
+;; code iterators and attributes for vector FP comparison operators:
+(define_code_iterator
+	vector_fp_comparison_simple [lt le ne ungt unge unlt unle])
+(define_code_iterator
+	vector_fp_comparison_complex [ltgt uneq unordered ordered])
+
 
 ;; Vector move instructions.  Little-endian VSX loads and stores require
 ;; special handling to circumvent "element endianness."
@@ -665,88 +671,6 @@
   DONE;
 })
 
-; lt(a,b) = gt(b,a)
-(define_expand "vector_lt<mode>"
-  [(set (match_operand:VEC_F 0 "vfloat_operand")
-	(lt:VEC_F (match_operand:VEC_F 1 "vfloat_operand")
-		  (match_operand:VEC_F 2 "vfloat_operand")))]
-  "VECTOR_UNIT_ALTIVEC_OR_VSX_P (<MODE>mode)"
-{
-  emit_insn (gen_vector_gt<mode> (operands[0], operands[2], operands[1]));
-  DONE;
-})
-
-; le(a,b) = ge(b,a)
-(define_expand "vector_le<mode>"
-  [(set (match_operand:VEC_F 0 "vfloat_operand")
-	(le:VEC_F (match_operand:VEC_F 1 "vfloat_operand")
-		  (match_operand:VEC_F 2 "vfloat_operand")))]
-  "VECTOR_UNIT_ALTIVEC_OR_VSX_P (<MODE>mode)"
-{
-  emit_insn (gen_vector_ge<mode> (operands[0], operands[2], operands[1]));
-  DONE;
-})
-
-; ne(a,b) = ~eq(a,b)
-(define_expand "vector_ne<mode>"
-  [(set (match_operand:VEC_F 0 "vfloat_operand")
-	(ne:VEC_F (match_operand:VEC_F 1 "vfloat_operand")
-		  (match_operand:VEC_F 2 "vfloat_operand")))]
-  "VECTOR_UNIT_ALTIVEC_OR_VSX_P (<MODE>mode)"
-{
-  emit_insn (gen_vector_eq<mode> (operands[0], operands[1], operands[2]));
-  emit_insn (gen_one_cmpl<mode>2 (operands[0], operands[0]));
-  DONE;
-})
-
-; unge(a,b) = ~gt(b,a)
-(define_expand "vector_unge<mode>"
-  [(set (match_operand:VEC_F 0 "vfloat_operand")
-	(unge:VEC_F (match_operand:VEC_F 1 "vfloat_operand")
-		    (match_operand:VEC_F 2 "vfloat_operand")))]
-  "VECTOR_UNIT_ALTIVEC_OR_VSX_P (<MODE>mode)"
-{
-  emit_insn (gen_vector_gt<mode> (operands[0], operands[2], operands[1]));
-  emit_insn (gen_one_cmpl<mode>2 (operands[0], operands[0]));
-  DONE;
-})
-
-; ungt(a,b) = ~ge(b,a)
-(define_expand "vector_ungt<mode>"
-  [(set (match_operand:VEC_F 0 "vfloat_operand")
-	(ungt:VEC_F (match_operand:VEC_F 1 "vfloat_operand")
-		    (match_operand:VEC_F 2 "vfloat_operand")))]
-  "VECTOR_UNIT_ALTIVEC_OR_VSX_P (<MODE>mode)"
-{
-  emit_insn (gen_vector_ge<mode> (operands[0], operands[2], operands[1]));
-  emit_insn (gen_one_cmpl<mode>2 (operands[0], operands[0]));
-  DONE;
-})
-
-; unle(a,b) = ~gt(a,b)
-(define_expand "vector_unle<mode>"
-  [(set (match_operand:VEC_F 0 "vfloat_operand")
-	(unle:VEC_F (match_operand:VEC_F 1 "vfloat_operand")
-		    (match_operand:VEC_F 2 "vfloat_operand")))]
-  "VECTOR_UNIT_ALTIVEC_OR_VSX_P (<MODE>mode)"
-{
-  emit_insn (gen_vector_gt<mode> (operands[0], operands[1], operands[2]));
-  emit_insn (gen_one_cmpl<mode>2 (operands[0], operands[0]));
-  DONE;
-})
-
-; unlt(a,b) = ~ge(a,b)
-(define_expand "vector_unlt<mode>"
-  [(set (match_operand:VEC_F 0 "vfloat_operand")
-	(unlt:VEC_F (match_operand:VEC_F 1 "vfloat_operand")
-		    (match_operand:VEC_F 2 "vfloat_operand")))]
-  "VECTOR_UNIT_ALTIVEC_OR_VSX_P (<MODE>mode)"
-{
-  emit_insn (gen_vector_ge<mode> (operands[0], operands[1], operands[2]));
-  emit_insn (gen_one_cmpl<mode>2 (operands[0], operands[0]));
-  DONE;
-})
-
 (define_expand "vector_eq<mode>"
   [(set (match_operand:VEC_C 0 "vlogical_operand")
 	(eq:VEC_C (match_operand:VEC_C 1 "vlogical_operand")
@@ -761,13 +685,6 @@
   "VECTOR_UNIT_ALTIVEC_OR_VSX_P (<MODE>mode)"
   "")
 
-(define_expand "vector_ge<mode>"
-  [(set (match_operand:VEC_F 0 "vlogical_operand")
-	(ge:VEC_F (match_operand:VEC_F 1 "vlogical_operand")
-		  (match_operand:VEC_F 2 "vlogical_operand")))]
-  "VECTOR_UNIT_ALTIVEC_OR_VSX_P (<MODE>mode)"
-  "")
-
 ; >= for integer vectors: swap operands and apply not-greater-than
 (define_expand "vector_nlt<mode>"
   [(set (match_operand:VEC_I 3 "vlogical_operand")
@@ -829,88 +746,122 @@
   operands[3] = gen_reg_rtx_and_attrs (operands[0]);
 })
 
-(define_insn_and_split "vector_uneq<mode>"
-  [(set (match_operand:VEC_F 0 "vfloat_operand")
-	(uneq:VEC_F (match_operand:VEC_F 1 "vfloat_operand")
-		    (match_operand:VEC_F 2 "vfloat_operand")))]
-  "VECTOR_UNIT_ALTIVEC_OR_VSX_P (<MODE>mode)"
-  "#"
-  ""
-  [(set (match_dup 3)
-	(gt:VEC_F (match_dup 1)
-		  (match_dup 2)))
-   (set (match_dup 4)
-	(gt:VEC_F (match_dup 2)
-		  (match_dup 1)))
-   (set (match_dup 0)
-	(and:VEC_F (not:VEC_F (match_dup 3))
-		   (not:VEC_F (match_dup 4))))]
-{
-  operands[3] = gen_reg_rtx (<MODE>mode);
-  operands[4] = gen_reg_rtx (<MODE>mode);
-})
+; There are 14 possible vector FP comparison operators, gt and eq of them have
+; been expanded above, so just support 12 remaining operators here.
 
-(define_insn_and_split "vector_ltgt<mode>"
-  [(set (match_operand:VEC_F 0 "vfloat_operand")
-	(ltgt:VEC_F (match_operand:VEC_F 1 "vfloat_operand")
-		    (match_operand:VEC_F 2 "vfloat_operand")))]
+; For ge:
+(define_expand "vector_ge<mode>"
+  [(set (match_operand:VEC_F 0 "vlogical_operand")
+	(ge:VEC_F (match_operand:VEC_F 1 "vlogical_operand")
+		  (match_operand:VEC_F 2 "vlogical_operand")))]
   "VECTOR_UNIT_ALTIVEC_OR_VSX_P (<MODE>mode)"
-  "#"
-  ""
-  [(set (match_dup 3)
-	(gt:VEC_F (match_dup 1)
-		  (match_dup 2)))
-   (set (match_dup 4)
-	(gt:VEC_F (match_dup 2)
-		  (match_dup 1)))
-   (set (match_dup 0)
-	(ior:VEC_F (match_dup 3)
-		   (match_dup 4)))]
-{
-  operands[3] = gen_reg_rtx (<MODE>mode);
-  operands[4] = gen_reg_rtx (<MODE>mode);
-})
+  "")
 
-(define_insn_and_split "vector_ordered<mode>"
+; For lt/le/ne/ungt/unge/unlt/unle:
+; lt(a,b)   = gt(b,a)
+; le(a,b)   = ge(b,a)
+; unge(a,b) = ~lt(a,b)
+; unle(a,b) = ~gt(a,b)
+; ne(a,b)   = ~eq(a,b)
+; ungt(a,b) = ~le(a,b)
+; unlt(a,b) = ~ge(a,b)
+(define_insn_and_split "vector_<code><mode>"
   [(set (match_operand:VEC_F 0 "vfloat_operand")
-	(ordered:VEC_F (match_operand:VEC_F 1 "vfloat_operand")
-		       (match_operand:VEC_F 2 "vfloat_operand")))]
+	(vector_fp_comparison_simple:VEC_F
+			   (match_operand:VEC_F 1 "vfloat_operand")
+			   (match_operand:VEC_F 2 "vfloat_operand")))]
   "VECTOR_UNIT_ALTIVEC_OR_VSX_P (<MODE>mode)"
   "#"
-  ""
-  [(set (match_dup 3)
-	(ge:VEC_F (match_dup 1)
-		  (match_dup 2)))
-   (set (match_dup 4)
-	(ge:VEC_F (match_dup 2)
-		  (match_dup 1)))
-   (set (match_dup 0)
-	(ior:VEC_F (match_dup 3)
-		   (match_dup 4)))]
+  "&& 1"
+  [(pc)]
 {
-  operands[3] = gen_reg_rtx (<MODE>mode);
-  operands[4] = gen_reg_rtx (<MODE>mode);
+  enum rtx_code cond = <CODE>;
+  bool need_invert = false;
+
+  if (cond == UNLE || cond == UNLT || cond == NE || cond == UNGE
+      || cond == UNGT)
+    {
+      cond = reverse_condition_maybe_unordered (cond);
+      need_invert = true;
+    }
+
+  if (cond == LT || cond == LE)
+    {
+      cond = swap_condition (cond);
+      std::swap (operands[1], operands[2]);
+    }
+
+  gcc_assert (cond == EQ || cond == GE || cond == GT);
+
+  rtx comp = gen_rtx_fmt_ee (cond, <MODE>mode, operands[1], operands[2]);
+
+  if (need_invert)
+    {
+      rtx res
+	= (!can_create_pseudo_p () ? operands[0] : gen_reg_rtx (<MODE>mode));
+      emit_insn (gen_rtx_SET (res, comp));
+      emit_insn (gen_one_cmpl<mode>2 (operands[0], res));
+    }
+  else
+    emit_insn (gen_rtx_SET (operands[0], comp));
+
+  DONE;
 })
 
-(define_insn_and_split "vector_unordered<mode>"
+; For ltgt/uneq/ordered/unordered:
+; ltgt: gt(a,b) | gt(b,a)
+; uneq: ~(gt(a,b) | gt(b,a))
+; ordered: ge(a,b) | ge(b,a)
+; unordered: ~(ge(a,b) | ge(b,a))
+(define_insn_and_split "vector_<code><mode>"
   [(set (match_operand:VEC_F 0 "vfloat_operand")
-	(unordered:VEC_F (match_operand:VEC_F 1 "vfloat_operand")
-			 (match_operand:VEC_F 2 "vfloat_operand")))]
-  "VECTOR_UNIT_ALTIVEC_OR_VSX_P (<MODE>mode)"
+	(vector_fp_comparison_complex:VEC_F
+			   (match_operand:VEC_F 1 "vfloat_operand")
+			   (match_operand:VEC_F 2 "vfloat_operand")))]
+  "VECTOR_UNIT_ALTIVEC_OR_VSX_P (<MODE>mode) && can_create_pseudo_p ()"
   "#"
-  ""
-  [(set (match_dup 3)
-	(ge:VEC_F (match_dup 1)
-		  (match_dup 2)))
-   (set (match_dup 4)
-	(ge:VEC_F (match_dup 2)
-		  (match_dup 1)))
-   (set (match_dup 0)
-        (and:VEC_F (not:VEC_F (match_dup 3))
-                   (not:VEC_F (match_dup 4))))]
+  "&& can_create_pseudo_p ()"
+  [(pc)]
 {
-  operands[3] = gen_reg_rtx (<MODE>mode);
-  operands[4] = gen_reg_rtx (<MODE>mode);
+  enum rtx_code cond = <CODE>;
+  bool need_invert = false;
+
+  if (cond == UNORDERED || cond == UNEQ)
+    {
+      cond = reverse_condition_maybe_unordered (cond);
+      need_invert = true;
+    }
+
+  switch (cond)
+    {
+    case LTGT:
+      cond = GT;
+      break;
+    case ORDERED:
+      cond = GE;
+      break;
+    default:
+      gcc_unreachable ();
+    }
+
+  rtx comp1 = gen_rtx_fmt_ee (cond, <MODE>mode, operands[1], operands[2]);
+  rtx res1 = gen_reg_rtx (<MODE>mode);
+  emit_insn (gen_rtx_SET (res1, comp1));
+  rtx comp2 = gen_rtx_fmt_ee (cond, <MODE>mode, operands[2], operands[1]);
+  rtx res2 = gen_reg_rtx (<MODE>mode);
+  emit_insn (gen_rtx_SET (res2, comp2));
+
+  if (need_invert)
+    {
+      rtx not1 = gen_rtx_fmt_e (NOT, <MODE>mode, res1);
+      rtx not2 = gen_rtx_fmt_e (NOT, <MODE>mode, res2);
+      rtx comp3 = gen_rtx_fmt_ee (AND, <MODE>mode, not1, not2);
+      emit_insn (gen_rtx_SET (operands[0], comp3));
+    }
+  else
+    emit_insn (gen_ior<mode>3 (operands[0], res1, res2));
+
+  DONE;
 })
 
 ;; Note the arguments for __builtin_altivec_vsel are op2, op1, mask
Segher Boessenkool Nov. 22, 2019, 4:08 p.m. UTC | #7
Hi!

On Thu, Nov 21, 2019 at 09:15:19PM +0800, Kewen.Lin wrote:
> >> +;; code iterators and attributes for vector FP comparison operators:
> >> +(define_code_iterator
> >> +	vector_fp_comparison_operator [lt le ne ungt unge unlt unle])
> > 
> > Can you change this name so it is clear it is just the simple ones?
> > 
> 
> Renamed as vector_fp_comparison_simple and vector_fp_comparison_complex,
> does it look better?

Sure.

>   4) Append "DONE" at the end of define_insn_and_split.

Ooh did I miss that?  That must have created some interesting code, heh.

> 2019-11-21 Kewen Lin  <linkw@gcc.gnu.org>
> 
> 	* config/rs6000/vector.md (vector_fp_comparison_simple):
> 	New code iterator.
> 	(vector_fp_comparison_complex): Likewise.
> 	(vector_<code><mode> for VEC_F and
> 	vector_fp_comparison_simple): New define_and_split.

(You don't have to wrap so early...  Line length is 80 columns.)

> +;; code iterators and attributes for vector FP comparison operators:
> +(define_code_iterator
> +	vector_fp_comparison_simple [lt le ne ungt unge unlt unle])
> +(define_code_iterator
> +	vector_fp_comparison_complex [ltgt uneq unordered ordered])

Please indent by two spaces, not a tab.

> +; For lt/le/ne/ungt/unge/unlt/unle:
> +; lt(a,b)   = gt(b,a)
> +; le(a,b)   = ge(b,a)
> +; unge(a,b) = ~lt(a,b)
> +; unle(a,b) = ~gt(a,b)
> +; ne(a,b)   = ~eq(a,b)
> +; ungt(a,b) = ~le(a,b)
> +; unlt(a,b) = ~ge(a,b)
> +(define_insn_and_split "vector_<code><mode>"
>    [(set (match_operand:VEC_F 0 "vfloat_operand")
> +	(vector_fp_comparison_simple:VEC_F
> +			   (match_operand:VEC_F 1 "vfloat_operand")
> +			   (match_operand:VEC_F 2 "vfloat_operand")))]

Indent is weird here as well.

> +; For ltgt/uneq/ordered/unordered:
> +; ltgt: gt(a,b) | gt(b,a)
> +; uneq: ~(gt(a,b) | gt(b,a))
> +; ordered: ge(a,b) | ge(b,a)
> +; unordered: ~(ge(a,b) | ge(b,a))
> +(define_insn_and_split "vector_<code><mode>"
>    [(set (match_operand:VEC_F 0 "vfloat_operand")
> +	(vector_fp_comparison_complex:VEC_F
> +			   (match_operand:VEC_F 1 "vfloat_operand")
> +			   (match_operand:VEC_F 2 "vfloat_operand")))]
> +  "VECTOR_UNIT_ALTIVEC_OR_VSX_P (<MODE>mode) && can_create_pseudo_p ()"
>    "#"
> +  "&& can_create_pseudo_p ()"

So hrm, if we do that here we can as well do that in the previous
splitter as well (and not do the operands[0] thing) (sorry for going
back on this -- I have said that before haven't I?)

> +  switch (cond)
> +    {
> +    case LTGT:
> +      cond = GT;
> +      break;
> +    case ORDERED:
> +      cond = GE;
> +      break;
> +    default:
> +      gcc_unreachable ();
> +    }

It feels a bit lighter (and is shorter) if you write

  if (cond == LTGT)
    cond = GT;
  else if (cond == ORDERED)
    cond = GE;
  else
    gcc_unreachable ();


Okay for trunk with those trivialities taken care of one way or the
other.  Thanks!


Segher
Kewen.Lin Nov. 25, 2019, 5:35 a.m. UTC | #8
Hi Segher,

on 2019/11/23 上午12:08, Segher Boessenkool wrote:
> Hi!
>> 2019-11-21 Kewen Lin  <linkw@gcc.gnu.org>
>>
>> 	* config/rs6000/vector.md (vector_fp_comparison_simple):
>> 	New code iterator.
>> 	(vector_fp_comparison_complex): Likewise.
>> 	(vector_<code><mode> for VEC_F and
>> 	vector_fp_comparison_simple): New define_and_split.
> 
> (You don't have to wrap so early...  Line length is 80 columns.)
> 
>> +;; code iterators and attributes for vector FP comparison operators:
>> +(define_code_iterator
>> +	vector_fp_comparison_simple [lt le ne ungt unge unlt unle])
>> +(define_code_iterator
>> +	vector_fp_comparison_complex [ltgt uneq unordered ordered])
> 
> Please indent by two spaces, not a tab.
> 
>> +; For lt/le/ne/ungt/unge/unlt/unle:
>> +; lt(a,b)   = gt(b,a)
>> +; le(a,b)   = ge(b,a)
>> +; unge(a,b) = ~lt(a,b)
>> +; unle(a,b) = ~gt(a,b)
>> +; ne(a,b)   = ~eq(a,b)
>> +; ungt(a,b) = ~le(a,b)
>> +; unlt(a,b) = ~ge(a,b)
>> +(define_insn_and_split "vector_<code><mode>"
>>    [(set (match_operand:VEC_F 0 "vfloat_operand")
>> +	(vector_fp_comparison_simple:VEC_F
>> +			   (match_operand:VEC_F 1 "vfloat_operand")
>> +			   (match_operand:VEC_F 2 "vfloat_operand")))]
> 
> Indent is weird here as well.
> 
>> +; For ltgt/uneq/ordered/unordered:
>> +; ltgt: gt(a,b) | gt(b,a)
>> +; uneq: ~(gt(a,b) | gt(b,a))
>> +; ordered: ge(a,b) | ge(b,a)
>> +; unordered: ~(ge(a,b) | ge(b,a))
>> +(define_insn_and_split "vector_<code><mode>"
>>    [(set (match_operand:VEC_F 0 "vfloat_operand")
>> +	(vector_fp_comparison_complex:VEC_F
>> +			   (match_operand:VEC_F 1 "vfloat_operand")
>> +			   (match_operand:VEC_F 2 "vfloat_operand")))]
>> +  "VECTOR_UNIT_ALTIVEC_OR_VSX_P (<MODE>mode) && can_create_pseudo_p ()"
>>    "#"
>> +  "&& can_create_pseudo_p ()"
> 
> So hrm, if we do that here we can as well do that in the previous
> splitter as well (and not do the operands[0] thing) (sorry for going
> back on this -- I have said that before haven't I?)
> 
>> +  switch (cond)
>> +    {
>> +    case LTGT:
>> +      cond = GT;
>> +      break;
>> +    case ORDERED:
>> +      cond = GE;
>> +      break;
>> +    default:
>> +      gcc_unreachable ();
>> +    }
> 
> It feels a bit lighter (and is shorter) if you write
> 
>   if (cond == LTGT)
>     cond = GT;
>   else if (cond == ORDERED)
>     cond = GE;
>   else
>     gcc_unreachable ();
> 
> 
> Okay for trunk with those trivialities taken care of one way or the
> other.  Thanks!
> 

Updated it as all your suggestions above and committed in r278665.

Thanks again!

BR,
Kewen
diff mbox series

Patch

diff --git a/gcc/config/rs6000/vector.md b/gcc/config/rs6000/vector.md
index b132037..be2d425 100644
--- a/gcc/config/rs6000/vector.md
+++ b/gcc/config/rs6000/vector.md
@@ -107,6 +107,31 @@ 
 				 (smin "smin")
 				 (smax "smax")])
 
+;; code iterators and attributes for vector FP comparison operators:
+
+;; 1. lt and le.
+(define_code_iterator vec_fp_cmp1 [lt le])
+(define_code_attr vec_fp_cmp1_attr [(lt "gt")
+				    (le "ge")])
+
+; 2. unge, unle, ne, ungt and unlt.
+(define_code_iterator vec_fp_cmp2 [unge unle ne ungt unlt])
+(define_code_attr vec_fp_cmp2_attr [(unge "gt")
+				    (unle "gt")
+				    (ne   "eq")
+				    (ungt "ge")
+				    (unlt "ge")])
+
+;; 3. ltgt and ordered.
+(define_code_iterator vec_fp_cmp3 [ltgt ordered])
+(define_code_attr vec_fp_cmp3_attr [(ltgt "gt")
+				    (ordered "ge")])
+
+;; 4. uneq and unordered.
+(define_code_iterator vec_fp_cmp4 [uneq unordered])
+(define_code_attr vec_fp_cmp4_attr [(uneq "gt")
+				    (unordered "ge")])
+
 
 ;; Vector move instructions.  Little-endian VSX loads and stores require
 ;; special handling to circumvent "element endianness."
@@ -665,88 +690,6 @@ 
   DONE;
 })
 
-; lt(a,b) = gt(b,a)
-(define_expand "vector_lt<mode>"
-  [(set (match_operand:VEC_F 0 "vfloat_operand")
-	(lt:VEC_F (match_operand:VEC_F 1 "vfloat_operand")
-		  (match_operand:VEC_F 2 "vfloat_operand")))]
-  "VECTOR_UNIT_ALTIVEC_OR_VSX_P (<MODE>mode)"
-{
-  emit_insn (gen_vector_gt<mode> (operands[0], operands[2], operands[1]));
-  DONE;
-})
-
-; le(a,b) = ge(b,a)
-(define_expand "vector_le<mode>"
-  [(set (match_operand:VEC_F 0 "vfloat_operand")
-	(le:VEC_F (match_operand:VEC_F 1 "vfloat_operand")
-		  (match_operand:VEC_F 2 "vfloat_operand")))]
-  "VECTOR_UNIT_ALTIVEC_OR_VSX_P (<MODE>mode)"
-{
-  emit_insn (gen_vector_ge<mode> (operands[0], operands[2], operands[1]));
-  DONE;
-})
-
-; ne(a,b) = ~eq(a,b)
-(define_expand "vector_ne<mode>"
-  [(set (match_operand:VEC_F 0 "vfloat_operand")
-	(ne:VEC_F (match_operand:VEC_F 1 "vfloat_operand")
-		  (match_operand:VEC_F 2 "vfloat_operand")))]
-  "VECTOR_UNIT_ALTIVEC_OR_VSX_P (<MODE>mode)"
-{
-  emit_insn (gen_vector_eq<mode> (operands[0], operands[1], operands[2]));
-  emit_insn (gen_one_cmpl<mode>2 (operands[0], operands[0]));
-  DONE;
-})
-
-; unge(a,b) = ~gt(b,a)
-(define_expand "vector_unge<mode>"
-  [(set (match_operand:VEC_F 0 "vfloat_operand")
-	(unge:VEC_F (match_operand:VEC_F 1 "vfloat_operand")
-		    (match_operand:VEC_F 2 "vfloat_operand")))]
-  "VECTOR_UNIT_ALTIVEC_OR_VSX_P (<MODE>mode)"
-{
-  emit_insn (gen_vector_gt<mode> (operands[0], operands[2], operands[1]));
-  emit_insn (gen_one_cmpl<mode>2 (operands[0], operands[0]));
-  DONE;
-})
-
-; ungt(a,b) = ~ge(b,a)
-(define_expand "vector_ungt<mode>"
-  [(set (match_operand:VEC_F 0 "vfloat_operand")
-	(ungt:VEC_F (match_operand:VEC_F 1 "vfloat_operand")
-		    (match_operand:VEC_F 2 "vfloat_operand")))]
-  "VECTOR_UNIT_ALTIVEC_OR_VSX_P (<MODE>mode)"
-{
-  emit_insn (gen_vector_ge<mode> (operands[0], operands[2], operands[1]));
-  emit_insn (gen_one_cmpl<mode>2 (operands[0], operands[0]));
-  DONE;
-})
-
-; unle(a,b) = ~gt(a,b)
-(define_expand "vector_unle<mode>"
-  [(set (match_operand:VEC_F 0 "vfloat_operand")
-	(unle:VEC_F (match_operand:VEC_F 1 "vfloat_operand")
-		    (match_operand:VEC_F 2 "vfloat_operand")))]
-  "VECTOR_UNIT_ALTIVEC_OR_VSX_P (<MODE>mode)"
-{
-  emit_insn (gen_vector_gt<mode> (operands[0], operands[1], operands[2]));
-  emit_insn (gen_one_cmpl<mode>2 (operands[0], operands[0]));
-  DONE;
-})
-
-; unlt(a,b) = ~ge(a,b)
-(define_expand "vector_unlt<mode>"
-  [(set (match_operand:VEC_F 0 "vfloat_operand")
-	(unlt:VEC_F (match_operand:VEC_F 1 "vfloat_operand")
-		    (match_operand:VEC_F 2 "vfloat_operand")))]
-  "VECTOR_UNIT_ALTIVEC_OR_VSX_P (<MODE>mode)"
-{
-  emit_insn (gen_vector_ge<mode> (operands[0], operands[1], operands[2]));
-  emit_insn (gen_one_cmpl<mode>2 (operands[0], operands[0]));
-  DONE;
-})
-
 (define_expand "vector_eq<mode>"
   [(set (match_operand:VEC_C 0 "vlogical_operand")
 	(eq:VEC_C (match_operand:VEC_C 1 "vlogical_operand")
@@ -761,13 +704,6 @@ 
   "VECTOR_UNIT_ALTIVEC_OR_VSX_P (<MODE>mode)"
   "")
 
-(define_expand "vector_ge<mode>"
-  [(set (match_operand:VEC_F 0 "vlogical_operand")
-	(ge:VEC_F (match_operand:VEC_F 1 "vlogical_operand")
-		  (match_operand:VEC_F 2 "vlogical_operand")))]
-  "VECTOR_UNIT_ALTIVEC_OR_VSX_P (<MODE>mode)"
-  "")
-
 ; >= for integer vectors: swap operands and apply not-greater-than
 (define_expand "vector_nlt<mode>"
   [(set (match_operand:VEC_I 3 "vlogical_operand")
@@ -829,61 +765,74 @@ 
   operands[3] = gen_reg_rtx_and_attrs (operands[0]);
 })
 
-(define_insn_and_split "vector_uneq<mode>"
+; There are 14 possible vector FP comparison operators, gt and eq of them have
+; been expanded above, so just support 12 remaining operators here.
+
+; 0. For ge:
+(define_expand "vector_ge<mode>"
+  [(set (match_operand:VEC_F 0 "vlogical_operand")
+	(ge:VEC_F (match_operand:VEC_F 1 "vlogical_operand")
+		  (match_operand:VEC_F 2 "vlogical_operand")))]
+  "VECTOR_UNIT_ALTIVEC_OR_VSX_P (<MODE>mode)"
+  "")
+
+; 1. For lt and le:
+; lt(a,b) = gt(b,a)
+; le(a,b) = ge(b,a)
+(define_insn_and_split "vector_<code><mode>"
   [(set (match_operand:VEC_F 0 "vfloat_operand")
-	(uneq:VEC_F (match_operand:VEC_F 1 "vfloat_operand")
-		    (match_operand:VEC_F 2 "vfloat_operand")))]
+	(vec_fp_cmp1:VEC_F (match_operand:VEC_F 1 "vfloat_operand")
+			   (match_operand:VEC_F 2 "vfloat_operand")))]
   "VECTOR_UNIT_ALTIVEC_OR_VSX_P (<MODE>mode)"
   "#"
   ""
-  [(set (match_dup 3)
-	(gt:VEC_F (match_dup 1)
-		  (match_dup 2)))
-   (set (match_dup 4)
-	(gt:VEC_F (match_dup 2)
-		  (match_dup 1)))
-   (set (match_dup 0)
-	(and:VEC_F (not:VEC_F (match_dup 3))
-		   (not:VEC_F (match_dup 4))))]
+  [(set (match_dup 0)
+	(<vec_fp_cmp1_attr>:VEC_F (match_dup 2)
+				  (match_dup 1)))]
 {
-  operands[3] = gen_reg_rtx (<MODE>mode);
-  operands[4] = gen_reg_rtx (<MODE>mode);
 })
 
-(define_insn_and_split "vector_ltgt<mode>"
+; 2. For unge, unle, ne, ungt and unlt:
+; unge(a,b) = ~gt(b,a)
+; unle(a,b) = ~gt(a,b)
+; ne(a,b)   = ~eq(a,b)
+; ungt(a,b) = ~ge(b,a)
+; unlt(a,b) = ~ge(a,b)
+(define_insn_and_split "vector_<code><mode>"
   [(set (match_operand:VEC_F 0 "vfloat_operand")
-	(ltgt:VEC_F (match_operand:VEC_F 1 "vfloat_operand")
-		    (match_operand:VEC_F 2 "vfloat_operand")))]
+	(vec_fp_cmp2:VEC_F (match_operand:VEC_F 1 "vfloat_operand")
+			   (match_operand:VEC_F 2 "vfloat_operand")))]
   "VECTOR_UNIT_ALTIVEC_OR_VSX_P (<MODE>mode)"
   "#"
   ""
   [(set (match_dup 3)
-	(gt:VEC_F (match_dup 1)
-		  (match_dup 2)))
-   (set (match_dup 4)
-	(gt:VEC_F (match_dup 2)
-		  (match_dup 1)))
+	(<vec_fp_cmp2_attr>:VEC_F (match_dup 1)
+				  (match_dup 2)))
    (set (match_dup 0)
-	(ior:VEC_F (match_dup 3)
-		   (match_dup 4)))]
+	(not:VEC_F (match_dup 3)))]
 {
   operands[3] = gen_reg_rtx (<MODE>mode);
-  operands[4] = gen_reg_rtx (<MODE>mode);
+
+  if (<CODE> == UNGE || <CODE> == UNGT)
+    std::swap (operands[1], operands[2]);
 })
 
-(define_insn_and_split "vector_ordered<mode>"
+; 3. For ltgt and ordered:
+; ltgt: gt(a,b) | gt(b,a)
+; ordered: ge(a,b) | ge(b,a)
+(define_insn_and_split "vector_<code><mode>"
   [(set (match_operand:VEC_F 0 "vfloat_operand")
-	(ordered:VEC_F (match_operand:VEC_F 1 "vfloat_operand")
-		       (match_operand:VEC_F 2 "vfloat_operand")))]
+	(vec_fp_cmp3:VEC_F (match_operand:VEC_F 1 "vfloat_operand")
+			   (match_operand:VEC_F 2 "vfloat_operand")))]
   "VECTOR_UNIT_ALTIVEC_OR_VSX_P (<MODE>mode)"
   "#"
   ""
   [(set (match_dup 3)
-	(ge:VEC_F (match_dup 1)
-		  (match_dup 2)))
+	(<vec_fp_cmp3_attr>:VEC_F (match_dup 1)
+				  (match_dup 2)))
    (set (match_dup 4)
-	(ge:VEC_F (match_dup 2)
-		  (match_dup 1)))
+	(<vec_fp_cmp3_attr>:VEC_F (match_dup 2)
+				  (match_dup 1)))
    (set (match_dup 0)
 	(ior:VEC_F (match_dup 3)
 		   (match_dup 4)))]
@@ -892,22 +841,25 @@ 
   operands[4] = gen_reg_rtx (<MODE>mode);
 })
 
-(define_insn_and_split "vector_unordered<mode>"
+; 4. For ltgt and ordered:
+; uneq: ~gt(a,b) & ~gt(b,a)
+; unordered: ~ge(a,b) & ~ge(b,a)
+(define_insn_and_split "vector_<code><mode>"
   [(set (match_operand:VEC_F 0 "vfloat_operand")
-	(unordered:VEC_F (match_operand:VEC_F 1 "vfloat_operand")
-			 (match_operand:VEC_F 2 "vfloat_operand")))]
+	(vec_fp_cmp4:VEC_F (match_operand:VEC_F 1 "vfloat_operand")
+			   (match_operand:VEC_F 2 "vfloat_operand")))]
   "VECTOR_UNIT_ALTIVEC_OR_VSX_P (<MODE>mode)"
   "#"
   ""
   [(set (match_dup 3)
-	(ge:VEC_F (match_dup 1)
-		  (match_dup 2)))
+	(<vec_fp_cmp4_attr>:VEC_F (match_dup 1)
+				  (match_dup 2)))
    (set (match_dup 4)
-	(ge:VEC_F (match_dup 2)
-		  (match_dup 1)))
+	(<vec_fp_cmp4_attr>:VEC_F (match_dup 2)
+				  (match_dup 1)))
    (set (match_dup 0)
-        (and:VEC_F (not:VEC_F (match_dup 3))
-                   (not:VEC_F (match_dup 4))))]
+	(and:VEC_F (not:VEC_F (match_dup 3))
+		   (not:VEC_F (match_dup 4))))]
 {
   operands[3] = gen_reg_rtx (<MODE>mode);
   operands[4] = gen_reg_rtx (<MODE>mode);