[#3] Add PowerPC ISA 3.0 vpermr/xxpermr support

Message ID 20160523222222.GB10818@ibm-tiger.the-meissners.org
State New

Commit Message

Michael Meissner May 23, 2016, 10:22 p.m. UTC
On Thu, May 19, 2016 at 10:33:41AM -0500, Segher Boessenkool wrote:
> On Thu, May 19, 2016 at 10:53:41AM -0400, Michael Meissner wrote:
> > GCC 6.1 added support for the XXPERM instruction for the PowerPC ISA 3.0.  The
> > XXPERM instruction is essentially a 4 operand instruction, with only 3 operands
> > in the instruction (the target register overlaps with the first input
> > register).  The Power9 hardware has fusion support where if the instruction
> > that precedes the XXPERM is a XXLOR move instruction to set the first input
> > argument, it is fused with the XXPERM.  I added code to support this fusion.
> > 
> > Unfortunately, in running the testsuite on the power9 simulator, we discovered
> > that the test gcc.c-torture/execute/pr56866.c would fail because the fusion
> > alternatives confused the register allocator and/or the passes after the
> > register allocator.  This patch removes the explicit fusion support from
> > XXPERM.
> 
> Okay.  Please keep the PR open until that problem is fixed.  It also
> shouldn't be "target" category, if the problem is RA.
> 
> > In addition, ISA 3.0 added XXPERMR and VPERMR instructions for little endian
> > support where the permute vector reverses the bytes.  This patch adds support
> > for XXPERMR/VPERMR.
> 
> Please send that as a separate patch, it has nothing to do with the PR.
> 
> > +	x = gen_rtx_UNSPEC (mode,
> > +			    gen_rtvec (3, target, reg, 
> 
> Trailing space.
> 
> > +  if (TARGET_P9_VECTOR)
> > +    {
> > +      unspec = gen_rtx_UNSPEC (mode, gen_rtvec (3, op0, op1, sel), 
> 
> And another.
> 
> > +	 The VNAND is preferred for future fusion opportunities.  */
> > +      notx = gen_rtx_NOT (V16QImode, sel);
> > +      iorx = (TARGET_P8_VECTOR
> > +	      ? gen_rtx_IOR (V16QImode, notx, notx)
> > +	      : gen_rtx_AND (V16QImode, notx, notx));
> > +      emit_insn (gen_rtx_SET (norreg, iorx));
> > +      
> 
> Some more.
> 
> > +/* { dg-final { scan-assembler	   "vpermr\|xxpermr" } } */
> 
> Tab in the middle of the line.

Here are the patches for xxpermr/vpermr support that are broken out from fixing
the xxperm fusion bug.  I have built a compiler with these patches (and the
xxperm patches) and it bootstraps and does not cause a regression.  Are they ok
to add to GCC 7 and eventually to GCC 6.2?

[gcc]
2016-05-23  Michael Meissner  <meissner@linux.vnet.ibm.com>
	    Kelvin Nilsen  <kelvin@gcc.gnu.org>

	* config/rs6000/rs6000.c (rs6000_expand_vector_set): Generate
	vpermr/xxpermr on ISA 3.0.
	(altivec_expand_vec_perm_le): Likewise.
	* config/rs6000/altivec.md (UNSPEC_VPERMR): New unspec.
	(altivec_vpermr_<mode>_internal): Add VPERMR/XXPERMR support for
	ISA 3.0.

[gcc/testsuite]
2016-05-23  Michael Meissner  <meissner@linux.vnet.ibm.com>
	    Kelvin Nilsen  <kelvin@gcc.gnu.org>

	* gcc.target/powerpc/p9-vpermr.c: New test for ISA 3.0 vpermr
	support.

Comments

Segher Boessenkool May 23, 2016, 11:16 p.m. UTC | #1
On Mon, May 23, 2016 at 06:22:22PM -0400, Michael Meissner wrote:
> Here are the patches for xxpermr/vpermr support that are broken out from fixing
> the xxperm fusion bug.  I have built a compiler with these patches (and the
> xxperm patches) and it bootstraps and does not cause a regression.  Are they ok
> to add to GCC 7 and eventually to GCC 6.2?
> 
> [gcc]
> 2016-05-23  Michael Meissner  <meissner@linux.vnet.ibm.com>
> 	    Kelvin Nilsen  <kelvin@gcc.gnu.org>
> 
> 	* config/rs6000/rs6000.c (rs6000_expand_vector_set): Generate
> 	vpermr/xxpermr on ISA 3.0.
> 	(altivec_expand_vec_perm_le): Likewise.
> 	* config/rs6000/altivec.md (UNSPEC_VPERMR): New unspec.
> 	(altivec_vpermr_<mode>_internal): Add VPERMR/XXPERMR support for
> 	ISA 3.0.
> 
> [gcc/testsuite]
> 2016-05-23  Michael Meissner  <meissner@linux.vnet.ibm.com>
> 	    Kelvin Nilsen  <kelvin@gcc.gnu.org>
> 
> 	* gcc.target/powerpc/p9-vpermr.c: New test for ISA 3.0 vpermr
> 	support.

Okay for trunk.  Okay for 6 after a week or so.

Thanks,


Segher

Kelvin Nilsen May 24, 2016, 5:30 p.m. UTC | #2
I have committed gcc.target/powerpc/p9-vpermr.c to trunk (separately
from the other files mentioned in this ChangeLog), revision 236655.
Approved offline.

On 05/23/2016 05:16 PM, Segher Boessenkool wrote:
> On Mon, May 23, 2016 at 06:22:22PM -0400, Michael Meissner wrote:
>> Here are the patches for xxpermr/vpermr support that are broken out from fixing
>> the xxperm fusion bug.  I have built a compiler with these patches (and the
>> xxperm patches) and it bootstraps and does not cause a regression.  Are they ok
>> to add to GCC 7 and eventually to GCC 6.2?
>>
>> [gcc]
>> 2016-05-23  Michael Meissner  <meissner@linux.vnet.ibm.com>
>> 	    Kelvin Nilsen  <kelvin@gcc.gnu.org>
>>
>> 	* config/rs6000/rs6000.c (rs6000_expand_vector_set): Generate
>> 	vpermr/xxpermr on ISA 3.0.
>> 	(altivec_expand_vec_perm_le): Likewise.
>> 	* config/rs6000/altivec.md (UNSPEC_VPERMR): New unspec.
>> 	(altivec_vpermr_<mode>_internal): Add VPERMR/XXPERMR support for
>> 	ISA 3.0.
>>
>> [gcc/testsuite]
>> 2016-05-23  Michael Meissner  <meissner@linux.vnet.ibm.com>
>> 	    Kelvin Nilsen  <kelvin@gcc.gnu.org>
>>
>> 	* gcc.target/powerpc/p9-vpermr.c: New test for ISA 3.0 vpermr
>> 	support.
> 
> Okay for trunk.  Okay for 6 after a week or so.
> 
> Thanks,
> 
> 
> Segher
> 
>

Patch

Index: gcc/config/rs6000/rs6000.c
===================================================================
--- gcc/config/rs6000/rs6000.c	(revision 236608)
+++ gcc/config/rs6000/rs6000.c	(working copy)
@@ -6863,21 +6863,29 @@  rs6000_expand_vector_set (rtx target, rt
 			gen_rtvec (3, target, reg,
 				   force_reg (V16QImode, x)),
 			UNSPEC_VPERM);
-  else 
+  else
     {
-      /* Invert selector.  We prefer to generate VNAND on P8 so
-         that future fusion opportunities can kick in, but must
-         generate VNOR elsewhere.  */
-      rtx notx = gen_rtx_NOT (V16QImode, force_reg (V16QImode, x));
-      rtx iorx = (TARGET_P8_VECTOR
-		  ? gen_rtx_IOR (V16QImode, notx, notx)
-		  : gen_rtx_AND (V16QImode, notx, notx));
-      rtx tmp = gen_reg_rtx (V16QImode);
-      emit_insn (gen_rtx_SET (tmp, iorx));
-
-      /* Permute with operands reversed and adjusted selector.  */
-      x = gen_rtx_UNSPEC (mode, gen_rtvec (3, reg, target, tmp),
-			  UNSPEC_VPERM);
+      if (TARGET_P9_VECTOR)
+	x = gen_rtx_UNSPEC (mode,
+			    gen_rtvec (3, target, reg,
+				       force_reg (V16QImode, x)),
+			    UNSPEC_VPERMR);
+      else
+	{
+	  /* Invert selector.  We prefer to generate VNAND on P8 so
+	     that future fusion opportunities can kick in, but must
+	     generate VNOR elsewhere.  */
+	  rtx notx = gen_rtx_NOT (V16QImode, force_reg (V16QImode, x));
+	  rtx iorx = (TARGET_P8_VECTOR
+		      ? gen_rtx_IOR (V16QImode, notx, notx)
+		      : gen_rtx_AND (V16QImode, notx, notx));
+	  rtx tmp = gen_reg_rtx (V16QImode);
+	  emit_insn (gen_rtx_SET (tmp, iorx));
+
+	  /* Permute with operands reversed and adjusted selector.  */
+	  x = gen_rtx_UNSPEC (mode, gen_rtvec (3, reg, target, tmp),
+			      UNSPEC_VPERM);
+	}
     }
 
   emit_insn (gen_rtx_SET (target, x));
@@ -34365,17 +34373,25 @@  altivec_expand_vec_perm_le (rtx operands
   if (!REG_P (target))
     tmp = gen_reg_rtx (mode);
 
-  /* Invert the selector with a VNAND if available, else a VNOR.
-     The VNAND is preferred for future fusion opportunities.  */
-  notx = gen_rtx_NOT (V16QImode, sel);
-  iorx = (TARGET_P8_VECTOR
-	  ? gen_rtx_IOR (V16QImode, notx, notx)
-	  : gen_rtx_AND (V16QImode, notx, notx));
-  emit_insn (gen_rtx_SET (norreg, iorx));
+  if (TARGET_P9_VECTOR)
+    {
+      unspec = gen_rtx_UNSPEC (mode, gen_rtvec (3, op0, op1, sel),
+			       UNSPEC_VPERMR);
+    }
+  else
+    {
+      /* Invert the selector with a VNAND if available, else a VNOR.
+	 The VNAND is preferred for future fusion opportunities.  */
+      notx = gen_rtx_NOT (V16QImode, sel);
+      iorx = (TARGET_P8_VECTOR
+	      ? gen_rtx_IOR (V16QImode, notx, notx)
+	      : gen_rtx_AND (V16QImode, notx, notx));
+      emit_insn (gen_rtx_SET (norreg, iorx));
 
-  /* Permute with operands reversed and adjusted selector.  */
-  unspec = gen_rtx_UNSPEC (mode, gen_rtvec (3, op1, op0, norreg),
-			   UNSPEC_VPERM);
+      /* Permute with operands reversed and adjusted selector.  */
+      unspec = gen_rtx_UNSPEC (mode, gen_rtvec (3, op1, op0, norreg),
+			       UNSPEC_VPERM);
+    }
 
   /* Copy into target, possibly by way of a register.  */
   if (!REG_P (target))
Index: gcc/config/rs6000/altivec.md
===================================================================
--- gcc/config/rs6000/altivec.md	(revision 236608)
+++ gcc/config/rs6000/altivec.md	(working copy)
@@ -58,6 +58,7 @@  (define_c_enum "unspec"
    UNSPEC_VSUM2SWS
    UNSPEC_VSUMSWS
    UNSPEC_VPERM
+   UNSPEC_VPERMR
    UNSPEC_VPERM_UNS
    UNSPEC_VRFIN
    UNSPEC_VCFUX
@@ -2032,6 +2033,19 @@  (define_expand "vec_perm_constv16qi"
     FAIL;
 })
 
+(define_insn "*altivec_vpermr_<mode>_internal"
+  [(set (match_operand:VM 0 "register_operand" "=v,?wo")
+	(unspec:VM [(match_operand:VM 1 "register_operand" "v,0")
+		    (match_operand:VM 2 "register_operand" "v,wo")
+		    (match_operand:V16QI 3 "register_operand" "v,wo")]
+		   UNSPEC_VPERMR))]
+  "TARGET_P9_VECTOR"
+  "@
+   vpermr %0,%1,%2,%3
+   xxpermr %x0,%x2,%x3"
+  [(set_attr "type" "vecperm")
+   (set_attr "length" "4")])
+
 (define_insn "altivec_vrfip"		; ceil
   [(set (match_operand:V4SF 0 "register_operand" "=v")
         (unspec:V4SF [(match_operand:V4SF 1 "register_operand" "v")]
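
The fallback path kept for pre-ISA 3.0 targets relies on an identity that the comments only state briefly: "invert the selector, then permute with operands reversed". The following is a minimal sketch of that identity in plain C, not part of the patch. It assumes vectors are stored as 16 bytes in the ISA's big-endian byte numbering, and all function names (model_vperm, model_perm_le, and so on) are hypothetical names chosen for illustration. It models vperm only; the exact semantics of vpermr/xxpermr should be taken from the ISA 3.0 document rather than from this sketch.

```c
#include <stdint.h>
#include <string.h>

/* Vectors are 16 bytes in big-endian numbering: index 0 is the
   leftmost (most significant) byte.  */
enum { VBYTES = 16 };

/* Model of vperm: the low 5 bits of each selector byte pick one of
   the 32 bytes of the concatenation a || b.  */
static void
model_vperm (uint8_t out[VBYTES], const uint8_t a[VBYTES],
             const uint8_t b[VBYTES], const uint8_t sel[VBYTES])
{
  uint8_t src[2 * VBYTES], res[VBYTES];

  memcpy (src, a, VBYTES);
  memcpy (src + VBYTES, b, VBYTES);
  for (int i = 0; i < VBYTES; i++)
    res[i] = src[sel[i] & 31];
  memcpy (out, res, VBYTES);
}

/* Reference little-endian permute.  Element j in little-endian
   numbering lives at big-endian index 15 - j, so element s of the
   little-endian concatenation is a[15 - s] for s < 16 and
   b[31 - s] otherwise.  */
static void
model_perm_le (uint8_t out[VBYTES], const uint8_t a[VBYTES],
               const uint8_t b[VBYTES], const uint8_t sel[VBYTES])
{
  uint8_t res[VBYTES];

  for (int j = 0; j < VBYTES; j++)
    {
      unsigned s = sel[15 - j] & 31;
      res[15 - j] = s < 16 ? a[15 - s] : b[31 - s];
    }
  memcpy (out, res, VBYTES);
}

/* The two-step sequence from the patch comments: invert the selector
   (~s & 31 == 31 - (s & 31)), then vperm with the inputs swapped.  */
static void
model_perm_le_via_vperm (uint8_t out[VBYTES], const uint8_t a[VBYTES],
                         const uint8_t b[VBYTES],
                         const uint8_t sel[VBYTES])
{
  uint8_t inv[VBYTES];

  for (int i = 0; i < VBYTES; i++)
    inv[i] = (uint8_t) ~sel[i];
  model_vperm (out, b, a, inv);
}

/* Return 1 if both models produce the same result.  */
static int
model_perm_le_agrees (const uint8_t a[VBYTES], const uint8_t b[VBYTES],
                      const uint8_t sel[VBYTES])
{
  uint8_t ref[VBYTES], seq[VBYTES];

  model_perm_le (ref, a, b, sel);
  model_perm_le_via_vperm (seq, a, b, sel);
  return memcmp (ref, seq, VBYTES) == 0;
}
```

Calling model_perm_le_agrees with arbitrary inputs should return 1, which is why the inverted selector and swapped operands give a little-endian permute on older targets. In the patch, the TARGET_P9_VECTOR branches skip the selector inversion and emit a single UNSPEC_VPERMR instead.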