diff mbox

[applied] , Backport PowerPC ISA 3.0 xxperm, builtin, and vneg support to GCC 6.2

Message ID 20160601232722.GA9851@ibm-tiger.the-meissners.org
State New
Headers show

Commit Message

Michael Meissner June 1, 2016, 11:27 p.m. UTC
I applied the following patches that were applied to the trunk to the GCC 6.2
branch:

[gcc]
2016-06-01  Michael Meissner  <meissner@linux.vnet.ibm.com>

	Back port from trunk
	2016-05-23  Michael Meissner  <meissner@linux.vnet.ibm.com>

	PR target/71201
	* config/rs6000/altivec.md (altivec_vperm_<mode>_internal): Drop
	ISA 3.0 xxperm fusion alternative.
	(altivec_vperm_v8hiv16qi): Likewise.
	(altivec_vperm_<mode>_uns_internal): Likewise.
	(vperm_v8hiv4si): Likewise.
	(vperm_v16qiv8hi): Likewise.

	Back port from trunk
	2016-05-23  Michael Meissner  <meissner@linux.vnet.ibm.com>
		    Kelvin Nilsen  <kelvin@gcc.gnu.org>

	* config/rs6000/rs6000.c (rs6000_expand_vector_set): Generate
	vpermr/xxpermr on ISA 3.0.
	(altivec_expand_vec_perm_le): Likewise.
	* config/rs6000/altivec.md (UNSPEC_VPERMR): New unspec.
	(altivec_vpermr_<mode>_internal): Add VPERMR/XXPERMR support for
	ISA 3.0.

[gcc/testsuite]
2016-06-01  Michael Meissner  <meissner@linux.vnet.ibm.com>

	Back port from trunk
	2016-05-23  Michael Meissner  <meissner@linux.vnet.ibm.com>
		    Kelvin Nilsen  <kelvin@gcc.gnu.org>

	* gcc.target/powerpc/p9-permute.c: Run test on big endian as well
	as little endian.

	Back port from trunk
	2016-05-23  Michael Meissner  <meissner@linux.vnet.ibm.com>
		    Kelvin Nilsen  <kelvin@gcc.gnu.org>

	* gcc.target/powerpc/p9-vpermr.c: New test for ISA 3.0 vpermr
	support.

[gcc]
2016-06-01  Michael Meissner  <meissner@linux.vnet.ibm.com>

	Back port from trunk
	2016-05-24  Michael Meissner  <meissner@linux.vnet.ibm.com>

	* config/rs6000/altivec.md (VParity): New mode iterator for vector
	parity built-in functions.
	(p9v_ctz<mode>2): Add support for ISA 3.0 vector count trailing
	zeros.
	(p9v_parity<mode>2): Likewise.
	* config/rs6000/vector.md (VEC_IP): New mode iterator for vector
	parity.
	(ctz<mode>2): ISA 3.0 expander for vector count trailing zeros.
	(parity<mode>2): ISA 3.0 expander for vector parity.
	* config/rs6000/rs6000-builtin.def (BU_P9_MISC_1): New macros for
	power9 built-ins.
	(BU_P9_64BIT_MISC_0): Likewise.
	(BU_P9_MISC_0): Likewise.
	(BU_P9V_AV_1): Likewise.
	(BU_P9V_AV_2): Likewise.
	(BU_P9V_AV_3): Likewise.
	(BU_P9V_AV_P): Likewise.
	(BU_P9V_VSX_1): Likewise.
	(BU_P9V_OVERLOAD_1): Likewise.
	(BU_P9V_OVERLOAD_2): Likewise.
	(BU_P9V_OVERLOAD_3): Likewise.
	(VCTZB): Add vector count trailing zeros support.
	(VCTZH): Likewise.
	(VCTZW): Likewise.
	(VCTZD): Likewise.
	(VPRTYBD): Add vector parity support.
	(VPRTYBQ): Likewise.
	(VPRTYBW): Likewise.
	(VCTZ): Add overloaded vector count trailing zeros support.
	(VPRTYB): Add overloaded vector parity support.
	* config/rs6000/rs6000-c.c (altivec_overloaded_builtins): Add
	overloaded vector count trailing zeros and parity instructions.
	* config/rs6000/rs6000.md (wd mode attribute): Add V1TI and TI for
	vector parity support.
	* config/rs6000/altivec.h (vec_vctz): Add ISA 3.0 vector count
	trailing zeros support.
	(vec_cntlz): Likewise.
	(vec_vctzb): Likewise.
	(vec_vctzd): Likewise.
	(vec_vctzh): Likewise.
	(vec_vctzw): Likewise.
	(vec_vprtyb): Add ISA 3.0 vector parity support.
	(vec_vprtybd): Likewise.
	(vec_vprtybw): Likewise.
	(vec_vprtybq): Likewise.
	* doc/extend.texi (PowerPC AltiVec Built-in Functions): Document
	the ISA 3.0 vector count trailing zeros and vector parity built-in
	functions.

[gcc/testsuite]
2016-06-01  Michael Meissner  <meissner@linux.vnet.ibm.com>

	Back port from trunk
	2016-05-24  Michael Meissner  <meissner@linux.vnet.ibm.com>

	* gcc.target/powerpc/p9-vparity.c: New file to check ISA 3.0
	vector parity built-in functions.
	* gcc.target/powerpc/ctz-3.c: New file to check ISA 3.0 vector
	count trailing zeros automatic vectorization.
	* gcc.target/powerpc/ctz-4.c: New file to check ISA 3.0 vector
	count trailing zeros built-in functions.

[gcc]
2016-06-01  Michael Meissner  <meissner@linux.vnet.ibm.com>

	Back port from trunk
	2016-05-24  Michael Meissner  <meissner@linux.vnet.ibm.com>

	* config/rs6000/altivec.md (VNEG iterator): New iterator for
	VNEGW/VNEGD instructions.
	(p9_neg<mode>2): New insns for ISA 3.0 VNEGW/VNEGD.
	(neg<mode>2): Add expander for V2DImode added in ISA 2.07, and
	support for ISA 3.0 VNEGW/VNEGD instructions.

[gcc/testsuite]
2016-06-01  Michael Meissner  <meissner@linux.vnet.ibm.com>

	Back port from trunk
	2016-05-24  Michael Meissner  <meissner@linux.vnet.ibm.com>

	* gcc.target/powerpc/p9-vneg.c: New test for ISA 3.0 VNEGW/VNEGD
	instructions.
diff mbox

Patch

Index: gcc/config/rs6000/rs6000.c
===================================================================
--- gcc/config/rs6000/rs6000.c	(revision 236958)
+++ gcc/config/rs6000/rs6000.c	(working copy)
@@ -6853,21 +6853,29 @@  rs6000_expand_vector_set (rtx target, rt
 			gen_rtvec (3, target, reg,
 				   force_reg (V16QImode, x)),
 			UNSPEC_VPERM);
-  else 
+  else
     {
-      /* Invert selector.  We prefer to generate VNAND on P8 so
-         that future fusion opportunities can kick in, but must
-         generate VNOR elsewhere.  */
-      rtx notx = gen_rtx_NOT (V16QImode, force_reg (V16QImode, x));
-      rtx iorx = (TARGET_P8_VECTOR
-		  ? gen_rtx_IOR (V16QImode, notx, notx)
-		  : gen_rtx_AND (V16QImode, notx, notx));
-      rtx tmp = gen_reg_rtx (V16QImode);
-      emit_insn (gen_rtx_SET (tmp, iorx));
-
-      /* Permute with operands reversed and adjusted selector.  */
-      x = gen_rtx_UNSPEC (mode, gen_rtvec (3, reg, target, tmp),
-			  UNSPEC_VPERM);
+      if (TARGET_P9_VECTOR)
+	x = gen_rtx_UNSPEC (mode,
+			    gen_rtvec (3, target, reg,
+				       force_reg (V16QImode, x)),
+			    UNSPEC_VPERMR);
+      else
+	{
+	  /* Invert selector.  We prefer to generate VNAND on P8 so
+	     that future fusion opportunities can kick in, but must
+	     generate VNOR elsewhere.  */
+	  rtx notx = gen_rtx_NOT (V16QImode, force_reg (V16QImode, x));
+	  rtx iorx = (TARGET_P8_VECTOR
+		      ? gen_rtx_IOR (V16QImode, notx, notx)
+		      : gen_rtx_AND (V16QImode, notx, notx));
+	  rtx tmp = gen_reg_rtx (V16QImode);
+	  emit_insn (gen_rtx_SET (tmp, iorx));
+
+	  /* Permute with operands reversed and adjusted selector.  */
+	  x = gen_rtx_UNSPEC (mode, gen_rtvec (3, reg, target, tmp),
+			      UNSPEC_VPERM);
+	}
     }
 
   emit_insn (gen_rtx_SET (target, x));
@@ -34128,17 +34136,25 @@  altivec_expand_vec_perm_le (rtx operands
   if (!REG_P (target))
     tmp = gen_reg_rtx (mode);
 
-  /* Invert the selector with a VNAND if available, else a VNOR.
-     The VNAND is preferred for future fusion opportunities.  */
-  notx = gen_rtx_NOT (V16QImode, sel);
-  iorx = (TARGET_P8_VECTOR
-	  ? gen_rtx_IOR (V16QImode, notx, notx)
-	  : gen_rtx_AND (V16QImode, notx, notx));
-  emit_insn (gen_rtx_SET (norreg, iorx));
+  if (TARGET_P9_VECTOR)
+    {
+      unspec = gen_rtx_UNSPEC (mode, gen_rtvec (3, op0, op1, sel),
+			       UNSPEC_VPERMR);
+    }
+  else
+    {
+      /* Invert the selector with a VNAND if available, else a VNOR.
+	 The VNAND is preferred for future fusion opportunities.  */
+      notx = gen_rtx_NOT (V16QImode, sel);
+      iorx = (TARGET_P8_VECTOR
+	      ? gen_rtx_IOR (V16QImode, notx, notx)
+	      : gen_rtx_AND (V16QImode, notx, notx));
+      emit_insn (gen_rtx_SET (norreg, iorx));
 
-  /* Permute with operands reversed and adjusted selector.  */
-  unspec = gen_rtx_UNSPEC (mode, gen_rtvec (3, op1, op0, norreg),
-			   UNSPEC_VPERM);
+      /* Permute with operands reversed and adjusted selector.  */
+      unspec = gen_rtx_UNSPEC (mode, gen_rtvec (3, op1, op0, norreg),
+			       UNSPEC_VPERM);
+    }
 
   /* Copy into target, possibly by way of a register.  */
   if (!REG_P (target))
Index: gcc/config/rs6000/altivec.md
===================================================================
--- gcc/config/rs6000/altivec.md	(revision 236941)
+++ gcc/config/rs6000/altivec.md	(working copy)
@@ -58,6 +58,7 @@  (define_c_enum "unspec"
    UNSPEC_VSUM2SWS
    UNSPEC_VSUMSWS
    UNSPEC_VPERM
+   UNSPEC_VPERMR
    UNSPEC_VPERM_UNS
    UNSPEC_VRFIN
    UNSPEC_VCFUX
@@ -1949,32 +1950,30 @@  (define_expand "altivec_vperm_<mode>"
 
 ;; Slightly prefer vperm, since the target does not overlap the source
 (define_insn "*altivec_vperm_<mode>_internal"
-  [(set (match_operand:VM 0 "register_operand" "=v,?wo,?&wo")
-	(unspec:VM [(match_operand:VM 1 "register_operand" "v,0,wo")
-		    (match_operand:VM 2 "register_operand" "v,wo,wo")
-		    (match_operand:V16QI 3 "register_operand" "v,wo,wo")]
+  [(set (match_operand:VM 0 "register_operand" "=v,?wo")
+	(unspec:VM [(match_operand:VM 1 "register_operand" "v,0")
+		    (match_operand:VM 2 "register_operand" "v,wo")
+		    (match_operand:V16QI 3 "register_operand" "v,wo")]
 		   UNSPEC_VPERM))]
   "TARGET_ALTIVEC"
   "@
    vperm %0,%1,%2,%3
-   xxperm %x0,%x2,%x3
-   xxlor %x0,%x1,%x1\t\t# xxperm fusion\;xxperm %x0,%x2,%x3"
+   xxperm %x0,%x2,%x3"
   [(set_attr "type" "vecperm")
-   (set_attr "length" "4,4,8")])
+   (set_attr "length" "4")])
 
 (define_insn "altivec_vperm_v8hiv16qi"
-  [(set (match_operand:V16QI 0 "register_operand" "=v,?wo,?&wo")
-	(unspec:V16QI [(match_operand:V8HI 1 "register_operand" "v,0,wo")
-   	               (match_operand:V8HI 2 "register_operand" "v,wo,wo")
-		       (match_operand:V16QI 3 "register_operand" "v,wo,wo")]
+  [(set (match_operand:V16QI 0 "register_operand" "=v,?wo")
+	(unspec:V16QI [(match_operand:V8HI 1 "register_operand" "v,0")
+   	               (match_operand:V8HI 2 "register_operand" "v,wo")
+		       (match_operand:V16QI 3 "register_operand" "v,wo")]
 		   UNSPEC_VPERM))]
   "TARGET_ALTIVEC"
   "@
    vperm %0,%1,%2,%3
-   xxperm %x0,%x2,%x3
-   xxlor %x0,%x1,%x1\t\t# xxperm fusion\;xxperm %x0,%x2,%x3"
+   xxperm %x0,%x2,%x3"
   [(set_attr "type" "vecperm")
-   (set_attr "length" "4,4,8")])
+   (set_attr "length" "4")])
 
 (define_expand "altivec_vperm_<mode>_uns"
   [(set (match_operand:VM 0 "register_operand" "")
@@ -1992,18 +1991,17 @@  (define_expand "altivec_vperm_<mode>_uns
 })
 
 (define_insn "*altivec_vperm_<mode>_uns_internal"
-  [(set (match_operand:VM 0 "register_operand" "=v,?wo,?&wo")
-	(unspec:VM [(match_operand:VM 1 "register_operand" "v,0,wo")
-		    (match_operand:VM 2 "register_operand" "v,wo,wo")
-		    (match_operand:V16QI 3 "register_operand" "v,wo,wo")]
+  [(set (match_operand:VM 0 "register_operand" "=v,?wo")
+	(unspec:VM [(match_operand:VM 1 "register_operand" "v,0")
+		    (match_operand:VM 2 "register_operand" "v,wo")
+		    (match_operand:V16QI 3 "register_operand" "v,wo")]
 		   UNSPEC_VPERM_UNS))]
   "TARGET_ALTIVEC"
   "@
    vperm %0,%1,%2,%3
-   xxperm %x0,%x2,%x3
-   xxlor %x0,%x1,%x1\t\t# xxperm fusion\;xxperm %x0,%x2,%x3"
+   xxperm %x0,%x2,%x3"
   [(set_attr "type" "vecperm")
-   (set_attr "length" "4,4,8")])
+   (set_attr "length" "4")])
 
 (define_expand "vec_permv16qi"
   [(set (match_operand:V16QI 0 "register_operand" "")
@@ -2032,6 +2030,19 @@  (define_expand "vec_perm_constv16qi"
     FAIL;
 })
 
+(define_insn "*altivec_vpermr_<mode>_internal"
+  [(set (match_operand:VM 0 "register_operand" "=v,?wo")
+	(unspec:VM [(match_operand:VM 1 "register_operand" "v,0")
+		    (match_operand:VM 2 "register_operand" "v,wo")
+		    (match_operand:V16QI 3 "register_operand" "v,wo")]
+		   UNSPEC_VPERMR))]
+  "TARGET_P9_VECTOR"
+  "@
+   vpermr %0,%1,%2,%3
+   xxpermr %x0,%x2,%x3"
+  [(set_attr "type" "vecperm")
+   (set_attr "length" "4")])
+
 (define_insn "altivec_vrfip"		; ceil
   [(set (match_operand:V4SF 0 "register_operand" "=v")
         (unspec:V4SF [(match_operand:V4SF 1 "register_operand" "v")]
@@ -2791,32 +2802,30 @@  (define_expand "vec_unpacks_lo_<VP_small
   "")
 
 (define_insn "vperm_v8hiv4si"
-  [(set (match_operand:V4SI 0 "register_operand" "=v,?wo,?&wo")
-        (unspec:V4SI [(match_operand:V8HI 1 "register_operand" "v,0,wo")
-		      (match_operand:V4SI 2 "register_operand" "v,wo,wo")
-		      (match_operand:V16QI 3 "register_operand" "v,wo,wo")]
+  [(set (match_operand:V4SI 0 "register_operand" "=v,?wo")
+        (unspec:V4SI [(match_operand:V8HI 1 "register_operand" "v,0")
+		      (match_operand:V4SI 2 "register_operand" "v,wo")
+		      (match_operand:V16QI 3 "register_operand" "v,wo")]
                   UNSPEC_VPERMSI))]
   "TARGET_ALTIVEC"
   "@
    vperm %0,%1,%2,%3
-   xxperm %x0,%x2,%x3
-   xxlor %x0,%x1,%x1\t\t# xxperm fusion\;xxperm %x0,%x2,%x3"
+   xxperm %x0,%x2,%x3"
   [(set_attr "type" "vecperm")
-   (set_attr "length" "4,4,8")])
+   (set_attr "length" "4")])
 
 (define_insn "vperm_v16qiv8hi"
-  [(set (match_operand:V8HI 0 "register_operand" "=v,?wo,?&wo")
-        (unspec:V8HI [(match_operand:V16QI 1 "register_operand" "v,0,wo")
-		      (match_operand:V8HI 2 "register_operand" "v,wo,wo")
-		      (match_operand:V16QI 3 "register_operand" "v,wo,wo")]
+  [(set (match_operand:V8HI 0 "register_operand" "=v,?wo")
+        (unspec:V8HI [(match_operand:V16QI 1 "register_operand" "v,0")
+		      (match_operand:V8HI 2 "register_operand" "v,wo")
+		      (match_operand:V16QI 3 "register_operand" "v,wo")]
                   UNSPEC_VPERMHI))]
   "TARGET_ALTIVEC"
   "@
    vperm %0,%1,%2,%3
-   xxperm %x0,%x2,%x3
-   xxlor %x0,%x1,%x1\t\t# xxperm fusion\;xxperm %x0,%x2,%x3"
+   xxperm %x0,%x2,%x3"
   [(set_attr "type" "vecperm")
-   (set_attr "length" "4,4,8")])
+   (set_attr "length" "4")])
 
 
 (define_expand "vec_unpacku_hi_v16qi"
Index: gcc/testsuite/gcc.target/powerpc/p9-permute.c
===================================================================
--- gcc/testsuite/gcc.target/powerpc/p9-permute.c	(revision 236941)
+++ gcc/testsuite/gcc.target/powerpc/p9-permute.c	(working copy)
@@ -1,4 +1,4 @@ 
-/* { dg-do compile { target { powerpc64le-*-* } } } */
+/* { dg-do compile { target { powerpc64*-*-* } } } */
 /* { dg-skip-if "do not override -mcpu" { powerpc*-*-* } { "-mcpu=*" } { "-mcpu=power9" } } */
 /* { dg-options "-mcpu=power9 -O2" } */
 /* { dg-require-effective-target powerpc_p9vector_ok } */
@@ -17,5 +17,6 @@  permute (vector long long *p, vector lon
   return vec_perm (a, b, mask);
 }
 
+/* expect xxpermr on little-endian, xxperm on big-endian */
 /* { dg-final { scan-assembler	   "xxperm" } } */
 /* { dg-final { scan-assembler-not "vperm"  } } */
Index: gcc/testsuite/gcc.target/powerpc/p9-vpermr.c
===================================================================
--- gcc/testsuite/gcc.target/powerpc/p9-vpermr.c	(revision 0)
+++ gcc/testsuite/gcc.target/powerpc/p9-vpermr.c	(revision 0)
@@ -0,0 +1,21 @@ 
+/* { dg-do compile { target { powerpc64le-*-* } } } */
+/* { dg-skip-if "do not override -mcpu" { powerpc*-*-* } { "-mcpu=*" } { "-mcpu=power9" } } */
+/* { dg-options "-mcpu=power9 -O2" } */
+
+/* Test generation of VPERMR/XXPERMR on ISA 3.0 in little endian.  */
+
+#include <altivec.h>
+
+vector long long
+permute (vector long long *p, vector long long *q, vector unsigned char mask)
+{
+  vector long long a = *p;
+  vector long long b = *q;
+
+  /* Force a, b to be in altivec registers to select vpermr insn.  */
+  __asm__ (" # a: %x0, b: %x1" : "+v" (a), "+v" (b));
+
+  return vec_perm (a, b, mask);
+}
+
+/* { dg-final { scan-assembler "vpermr\|xxpermr" } } */