PowerPC backend fixes for 50310 (vectorization of IEEE floating point comparisons)

Submitter Michael Meissner
Date March 6, 2012, 12:23 a.m.
Message ID <20120306002350.GA26629@ibm-tiger.the-meissners.org>
Permalink /patch/144818/
State New

Comments

Michael Meissner - March 6, 2012, 12:23 a.m.
On power7 systems, the backend was not prepared to handle vector comparisons
with UNEQ, LTGT, ORDERED, and UNORDERED tests, since there is no single
comparison instruction for these cases.  This patch adds support for doing
vector conditional moves involving these operations.  I have bootstrapped the
compiler with these patches, and there were no regressions.  The test
gcc.c-torture/execute/ieee/pr50310.c now passes if you build the compiler using
--with-cpu=power7 as a default (or define ADDITIONAL_TORTURE_OPTIONS in the
site.exp file to add -mcpu=power7).  Is this ok to install in 4.8?

In addition, I would like to backport this fix to the current older branches.
Can I check it into the 4.7 branch or should this patch wait until after the
4.7 release for 4.7.1?

2012-03-05  Michael Meissner  <meissner@linux.vnet.ibm.com>

	PR target/50310
	* config/rs6000/vector.md (vector_uneq<mode>): Add support for
	UNEQ, LTGT, ORDERED, and UNORDERED IEEE vector comparisons.
	(vector_ltgt<mode>): Likewise.
	(vector_ordered<mode>): Likewise.
	(vector_unordered<mode>): Likewise.
	* config/rs6000/rs6000.c (rs6000_emit_vector_compare_inner):
	Likewise.
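
The four splits in the vector.md part of this patch each build the test from two GT or GE comparisons joined by OR, optionally inverted. As a reading aid, here is a minimal scalar sketch in C of the same identities, checked against the standard <math.h> comparison macros. The helper names (uneq_via_gt and so on) are hypothetical and do not appear in GCC; the sketch models one vector element at a time and assumes IEEE semantics (no -ffast-math).

```c
#include <assert.h>
#include <math.h>
#include <stdbool.h>

/* Hypothetical one-element models of the four splits; names are illustrative. */

/* UNEQ (unordered or equal): neither operand is greater than the other. */
static bool uneq_via_gt (double a, double b) { return !((a > b) || (b > a)); }

/* LTGT (less than or greater than): one operand is greater than the other. */
static bool ltgt_via_gt (double a, double b) { return (a > b) || (b > a); }

/* ORDERED: at least one of the two GE tests holds, so neither operand is a NaN. */
static bool ordered_via_ge (double a, double b) { return (a >= b) || (b >= a); }

/* UNORDERED: both GE tests fail, which only happens when a NaN is involved. */
static bool unordered_via_ge (double a, double b) { return !((a >= b) || (b >= a)); }

/* Compare every model against the C library's definition on a small value set. */
void check_decompositions (void)
{
  static const double vals[] = { -1.0, 0.0, 1.0, INFINITY, NAN };
  const int n = (int) (sizeof vals / sizeof vals[0]);

  for (int i = 0; i < n; i++)
    for (int j = 0; j < n; j++)
      {
        double a = vals[i], b = vals[j];
        bool unord = isunordered (a, b) != 0;

        assert (uneq_via_gt (a, b) == (unord || a == b));
        assert (ltgt_via_gt (a, b) == (islessgreater (a, b) != 0));
        assert (ordered_via_ge (a, b) == !unord);
        assert (unordered_via_ge (a, b) == unord);
      }
}
```

Calling check_decompositions () from a small main is enough to exercise all 25 operand pairs, including the NaN and infinity cases that the plain GT/GE/EQ instructions cannot express in one step.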
David Edelsohn - March 6, 2012, 3:11 a.m.
On Mon, Mar 5, 2012 at 7:23 PM, Michael Meissner
<meissner@linux.vnet.ibm.com> wrote:
> On power7 systems, the backend was not prepared to handle vector comparisons
> with UNEQ, LTGT, ORDERED, and UNORDERED tests, since there is no single
> comparison instruction for these cases.  This patch adds support for doing
> vector conditional moves involving these operations.  I have bootstrapped the
> compiler with these patches, and there were no regressions.  The test
> gcc.c-torture/execute/ieee/pr50310.c now passes if you build the compiler using
> --with-cpu=power7 as a default (or define ADDITIONAL_TORTURE_OPTIONS in the
> site.exp file to add -mcpu=power7).  Is this ok to install in 4.8?
>
> In addition, I would like to backport this fix to the current older branches.
> Can I check it into the 4.7 branch or should this patch wait until after the
> 4.7 release for 4.7.1?
>
> 2012-03-05  Michael Meissner  <meissner@linux.vnet.ibm.com>
>
>        PR target/50310
>        * config/rs6000/vector.md (vector_uneq<mode>): Add support for
>        UNEQ, LTGT, ORDERED, and UNORDERED IEEE vector comparisons.
>        (vector_ltgt<mode>): Likewise.
>        (vector_ordered<mode>): Likewise.
>        (vector_unordered<mode>): Likewise.
>        * config/rs6000/rs6000.c (rs6000_emit_vector_compare_inner):
>        Likewise.

The logical operations on floating point registers seem weird, but okay.

We need to ask the RMs if this patch is acceptable for GCC 4.7.0.

Thanks, David
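
On the logical operations: AltiVec and VSX vector compares set each result element to all ones or all zeros, so the IOR and NOT in the splits combine bit masks rather than floating point values, and a conditional move can then pick elements with a bitwise select. The following is a rough one-lane sketch in C under that assumption. The function names are hypothetical, 32-bit unsigned integers stand in for a single-precision lane, and this is not how the backend itself is written.

```c
#include <assert.h>
#include <math.h>
#include <stdint.h>
#include <string.h>

/* Model of one lane of a vector GT compare: all ones if a > b, else all zeros. */
static uint32_t lane_gt (float a, float b)
{
  return (a > b) ? UINT32_MAX : 0u;
}

/* UNEQ mask for one lane: NOR of the two GT masks, as in the uneq split. */
static uint32_t lane_uneq (float a, float b)
{
  return ~(lane_gt (a, b) | lane_gt (b, a));
}

/* Bitwise select: take if_true where mask bits are set, if_false elsewhere. */
static float lane_select (uint32_t mask, float if_true, float if_false)
{
  uint32_t t, f, r;
  float out;

  memcpy (&t, &if_true, sizeof t);    /* reinterpret the float bits safely */
  memcpy (&f, &if_false, sizeof f);
  r = (t & mask) | (f & ~mask);
  memcpy (&out, &r, sizeof out);
  return out;
}

/* A few spot checks of the mask model, including a NaN operand. */
void check_lane_model (void)
{
  assert (lane_uneq (1.0f, 1.0f) == UINT32_MAX);  /* equal */
  assert (lane_uneq (NAN, 1.0f) == UINT32_MAX);   /* unordered */
  assert (lane_uneq (1.0f, 2.0f) == 0u);          /* ordered and different */
  assert (lane_select (lane_uneq (NAN, 0.0f), 3.0f, 4.0f) == 3.0f);
  assert (lane_select (lane_uneq (1.0f, 2.0f), 3.0f, 4.0f) == 4.0f);
}
```

Because the mask is either all ones or all zeros per lane, the select never mixes bits from the two inputs, which is why applying integer-style logic to registers that hold floating point vectors is safe here.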
Jakub Jelinek - March 6, 2012, 7:40 a.m.
On Mon, Mar 05, 2012 at 10:11:49PM -0500, David Edelsohn wrote:
> We need to ask the RMs if this patch is acceptable for GCC 4.7.0.

I think it should wait for 4.7.1.

	Jakub

Patch

Index: gcc/config/rs6000/vector.md
===================================================================
--- gcc/config/rs6000/vector.md	(revision 184959)
+++ gcc/config/rs6000/vector.md	(working copy)
@@ -516,6 +516,94 @@  (define_expand "vector_geu<mode>"
   "VECTOR_UNIT_ALTIVEC_P (<MODE>mode)"
   "")
 
+(define_insn_and_split "*vector_uneq<mode>"
+  [(set (match_operand:VEC_F 0 "vfloat_operand" "")
+	(uneq:VEC_F (match_operand:VEC_F 1 "vfloat_operand" "")
+		    (match_operand:VEC_F 2 "vfloat_operand" "")))]
+  "VECTOR_UNIT_ALTIVEC_OR_VSX_P (<MODE>mode)"
+  "#"
+  ""
+  [(set (match_dup 3)
+	(gt:VEC_F (match_dup 1)
+		  (match_dup 2)))
+   (set (match_dup 4)
+	(gt:VEC_F (match_dup 2)
+		  (match_dup 1)))
+   (set (match_dup 0)
+	(not:VEC_F (ior:VEC_F (match_dup 3)
+			      (match_dup 4))))]
+  "
+{
+  operands[3] = gen_reg_rtx (<MODE>mode);
+  operands[4] = gen_reg_rtx (<MODE>mode);
+}")
+
+(define_insn_and_split "*vector_ltgt<mode>"
+  [(set (match_operand:VEC_F 0 "vfloat_operand" "")
+	(ltgt:VEC_F (match_operand:VEC_F 1 "vfloat_operand" "")
+		    (match_operand:VEC_F 2 "vfloat_operand" "")))]
+  "VECTOR_UNIT_ALTIVEC_OR_VSX_P (<MODE>mode)"
+  "#"
+  ""
+  [(set (match_dup 3)
+	(gt:VEC_F (match_dup 1)
+		  (match_dup 2)))
+   (set (match_dup 4)
+	(gt:VEC_F (match_dup 2)
+		  (match_dup 1)))
+   (set (match_dup 0)
+	(ior:VEC_F (match_dup 3)
+		   (match_dup 4)))]
+  "
+{
+  operands[3] = gen_reg_rtx (<MODE>mode);
+  operands[4] = gen_reg_rtx (<MODE>mode);
+}")
+
+(define_insn_and_split "*vector_ordered<mode>"
+  [(set (match_operand:VEC_F 0 "vfloat_operand" "")
+	(ordered:VEC_F (match_operand:VEC_F 1 "vfloat_operand" "")
+		       (match_operand:VEC_F 2 "vfloat_operand" "")))]
+  "VECTOR_UNIT_ALTIVEC_OR_VSX_P (<MODE>mode)"
+  "#"
+  ""
+  [(set (match_dup 3)
+	(ge:VEC_F (match_dup 1)
+		  (match_dup 2)))
+   (set (match_dup 4)
+	(ge:VEC_F (match_dup 2)
+		  (match_dup 1)))
+   (set (match_dup 0)
+	(ior:VEC_F (match_dup 3)
+		   (match_dup 4)))]
+  "
+{
+  operands[3] = gen_reg_rtx (<MODE>mode);
+  operands[4] = gen_reg_rtx (<MODE>mode);
+}")
+
+(define_insn_and_split "*vector_unordered<mode>"
+  [(set (match_operand:VEC_F 0 "vfloat_operand" "")
+	(unordered:VEC_F (match_operand:VEC_F 1 "vfloat_operand" "")
+			 (match_operand:VEC_F 2 "vfloat_operand" "")))]
+  "VECTOR_UNIT_ALTIVEC_OR_VSX_P (<MODE>mode)"
+  "#"
+  ""
+  [(set (match_dup 3)
+	(ge:VEC_F (match_dup 1)
+		  (match_dup 2)))
+   (set (match_dup 4)
+	(ge:VEC_F (match_dup 2)
+		  (match_dup 1)))
+   (set (match_dup 0)
+	(not:VEC_F (ior:VEC_F (match_dup 3)
+			      (match_dup 4))))]
+  "
+{
+  operands[3] = gen_reg_rtx (<MODE>mode);
+  operands[4] = gen_reg_rtx (<MODE>mode);
+}")
+
 ;; Note the arguments for __builtin_altivec_vsel are op2, op1, mask
 ;; which is in the reverse order that we want
 (define_expand "vector_select_<mode>"
Index: gcc/config/rs6000/rs6000.c
===================================================================
--- gcc/config/rs6000/rs6000.c	(revision 184959)
+++ gcc/config/rs6000/rs6000.c	(working copy)
@@ -16126,6 +16126,10 @@  rs6000_emit_vector_compare_inner (enum r
     case EQ:
     case GT:
     case GTU:
+    case ORDERED:
+    case UNORDERED:
+    case UNEQ:
+    case LTGT:
       mask = gen_reg_rtx (mode);
       emit_insn (gen_rtx_SET (VOIDmode,
 			      mask,