diff mbox

, Add PowerPC ISA 3.0 vector extract support

Message ID 20160629193324.GA7321@ibm-tiger.the-meissners.org
State New
Headers show

Commit Message

Michael Meissner June 29, 2016, 7:33 p.m. UTC
This patch adds support to generate the ISA 3.0 VEXTRACTUB, VEXTRACTUH, and
XXEXTRACTUW instructions to extract a constant element from the small vector
integer types.  I also added support to generate STXSIWX if a DImode is in an
Altivec register (which is needed for this patch when extracting char or short
elements).

I built this on a little endian power8 system, and there were no regressions in
make check.  Is it ok to install in the trunk?

Ideally, I would like to also install it on the GCC 6.2 branch.  Note, it
depends on the June 15th changes to allow DImode into Altivec registers, and
the fix for PR 71677 that was submitted on June 27th being installed before
these patches can be applied.

Assuming the above patches are supplied can I back port it to the GCC 6.2 after
a burn-in period?

[gcc]
2016-06-29  Michael Meissner  <meissner@linux.vnet.ibm.com>

	* config/rs6000/predicates.md (const_0_to_7_operand): New
	predicate, recognize 0..7.
	* config/rs6000/rs6000.c (rs6000_expand_vector_extract): Add
	support for doing extracts from V16QImode, V8HImode, V4SImode
	under ISA 3.0.
	* config/rs6000/vsx.md (VSX_EXTRACT_I): Mode iterator for ISA 3.0
	vector extract support.
	(VSX_EXTRACT_PREDICATE): Mode attribute to validate element number
	for ISA 3.0 vector extract.
	(VSX_EXTRACT_INSN): Mode attribute to give the instruction to use
	for ISA 3.0 vector extract.
	(VSX_EX): Constraints to use for ISA 3.0 vector extract.
	(vsx_extract_<mode>, VSX_EXTRACT_I): Add support for doing
	extracts of a constant element number from small integer vectors
	on 64-bit ISA 3.0 systems.
	(vsx_extract_<mode>_di): Likewise.
	* config/rs6000/rs6000.h (TARGET_VEXTRACTUB): New target macro to
	say when we can do ISA 3.0 vector extracts.
	* config/rs6000/rs6000.md (stfiwx): Allow DImode in Altivec
	registers, using the stxsiwx instruction.

[gcc/testsuite]
2016-06-29  Michael Meissner  <meissner@linux.vnet.ibm.com>

	* gcc.target/powerpc/p9-extract-1.c: New file to test ISA 3.0
	vector extract instructions.
	* gcc.target/powerpc/p9-extract-2.c: Likewise.

Comments

Segher Boessenkool June 29, 2016, 10:20 p.m. UTC | #1
On Wed, Jun 29, 2016 at 03:33:24PM -0400, Michael Meissner wrote:
> Is it ok to install in the trunk?

See comments below.  Okay with that taken care of.

> Ideally, I would like to also install it on the GCC 6.2 branch.  Note, it
> depends on the June 15th changes to allow DImode into Altivec registers, and
> the fix for PR 71677 that was submitted on June 27th being installed before
> these patches can be applied.

Is the DImode thing not on trunk yet?

> Assuming the above patches are supplied can I back port it to the GCC 6.2 after
> a burn-in period?

Yes, thanks.


> +;; Mode attribute to give the instruction for vector extract.  %3 contains the
> +;; adjusted element count.
> +(define_mode_attr VSX_EXTRACT_INSN [(V16QI "vextractub %0,%1,%3")
> +				    (V8HI  "vextractuh %0,%1,%3")
> +				    (V4SI  "xxextractuw %x0,%x1,%3")])

Could you use %2 instead, please?  Much less surprising.  And please inline
it in the one place it is used.

> +(define_insn  "vsx_extract_<mode>_di"
> +  [(set (match_operand:DI 0 "gpc_reg_operand" "=<VSX_EX>")
> +	(zero_extend:DI
> +	 (vec_select:<VS_scalar>
> +	  (match_operand:VSX_EXTRACT_I 1 "gpc_reg_operand" "<VSX_EX>")
> +	  (parallel [(match_operand:QI 2 "<VSX_EXTRACT_PREDICATE>" "n")]))))]
> +  "VECTOR_MEM_VSX_P (<MODE>mode) && TARGET_VEXTRACTUB"
> +{
> +  int element = INTVAL (operands[2]);
> +  int unit_size = GET_MODE_UNIT_SIZE (<MODE>mode);
> +  int offset = ((VECTOR_ELT_ORDER_BIG)
> +		? unit_size * element
> +		: unit_size * (GET_MODE_NUNITS (<MODE>mode) - 1 - element));
> +
> +  operands[3] = GEN_INT (offset);
> +  return "<VSX_EXTRACT_INSN>";
> +}

So that would be

  operands[2] = GEN_INT (offset);
  if (unit_size == 4)
    return "xxextractuw %x0,%x1,%2";
  else
    return "vextractu<wd> %0,%1,%2";

or such.


Segher
Michael Meissner June 29, 2016, 10:26 p.m. UTC | #2
On Wed, Jun 29, 2016 at 05:20:26PM -0500, Segher Boessenkool wrote:
> On Wed, Jun 29, 2016 at 03:33:24PM -0400, Michael Meissner wrote:
> > Is it ok to install in the trunk?
> 
> See comments below.  Okay with that taken care of.
> 
> > Ideally, I would like to also install it on the GCC 6.2 branch.  Note, it
> > depends on the June 15th changes to allow DImode into Altivec registers, and
> > the fix for PR 71677 that was submitted on June 27th being installed before
> > these patches can be applied.
> 
> Is the DImode thing not on trunk yet?

The DImode patch is on the trunk.  It is not on 6.2, and in fact since this
patch fixes a bug in that patch, both patches need to be applied at the same
time.

> > Assuming the above patches are supplied can I back port it to the GCC 6.2 after
> > a burn-in period?
> 
> Yes, thanks.
> 
> 
> > +;; Mode attribute to give the instruction for vector extract.  %3 contains the
> > +;; adjusted element count.
> > +(define_mode_attr VSX_EXTRACT_INSN [(V16QI "vextractub %0,%1,%3")
> > +				    (V8HI  "vextractuh %0,%1,%3")
> > +				    (V4SI  "xxextractuw %x0,%x1,%3")])
> 
> Could you use %2 instead, please?  Much less surprising.  And please inline
> it in the one place it is used.

I generally don't like overwriting exisiting arguments when you are doing the
modification, but I can use %2 if you prefer.

> > +(define_insn  "vsx_extract_<mode>_di"
> > +  [(set (match_operand:DI 0 "gpc_reg_operand" "=<VSX_EX>")
> > +	(zero_extend:DI
> > +	 (vec_select:<VS_scalar>
> > +	  (match_operand:VSX_EXTRACT_I 1 "gpc_reg_operand" "<VSX_EX>")
> > +	  (parallel [(match_operand:QI 2 "<VSX_EXTRACT_PREDICATE>" "n")]))))]
> > +  "VECTOR_MEM_VSX_P (<MODE>mode) && TARGET_VEXTRACTUB"
> > +{
> > +  int element = INTVAL (operands[2]);
> > +  int unit_size = GET_MODE_UNIT_SIZE (<MODE>mode);
> > +  int offset = ((VECTOR_ELT_ORDER_BIG)
> > +		? unit_size * element
> > +		: unit_size * (GET_MODE_NUNITS (<MODE>mode) - 1 - element));
> > +
> > +  operands[3] = GEN_INT (offset);
> > +  return "<VSX_EXTRACT_INSN>";
> > +}
> 
> So that would be
> 
>   operands[2] = GEN_INT (offset);
>   if (unit_size == 4)
>     return "xxextractuw %x0,%x1,%2";
>   else
>     return "vextractu<wd> %0,%1,%2";

Ok.
diff mbox

Patch

Index: gcc/config/rs6000/predicates.md
===================================================================
--- gcc/config/rs6000/predicates.md	(.../svn+ssh://meissner@gcc.gnu.org/svn/gcc/trunk/gcc/config/rs6000)	(revision 237826)
+++ gcc/config/rs6000/predicates.md	(.../gcc/config/rs6000)	(working copy)
@@ -200,6 +200,11 @@  (define_predicate "const_2_to_3_operand"
   (and (match_code "const_int")
        (match_test "IN_RANGE (INTVAL (op), 2, 3)")))
 
+;; Match op = 0..7.
+(define_predicate "const_0_to_7_operand"
+  (and (match_code "const_int")
+       (match_test "IN_RANGE (INTVAL (op), 0, 7)")))
+
 ;; Match op = 0..15
 (define_predicate "const_0_to_15_operand"
   (and (match_code "const_int")
Index: gcc/config/rs6000/rs6000.c
===================================================================
--- gcc/config/rs6000/rs6000.c	(.../svn+ssh://meissner@gcc.gnu.org/svn/gcc/trunk/gcc/config/rs6000)	(revision 237826)
+++ gcc/config/rs6000/rs6000.c	(.../gcc/config/rs6000)	(working copy)
@@ -6916,6 +6916,30 @@  rs6000_expand_vector_extract (rtx target
 	case V4SFmode:
 	  emit_insn (gen_vsx_extract_v4sf (target, vec, GEN_INT (elt)));
 	  return;
+	case V16QImode:
+	  if (TARGET_VEXTRACTUB)
+	    {
+	      emit_insn (gen_vsx_extract_v16qi (target, vec, GEN_INT (elt)));
+	      return;
+	    }
+	  else
+	    break;
+	case V8HImode:
+	  if (TARGET_VEXTRACTUB)
+	    {
+	      emit_insn (gen_vsx_extract_v8hi (target, vec, GEN_INT (elt)));
+	      return;
+	    }
+	  else
+	    break;
+	case V4SImode:
+	  if (TARGET_VEXTRACTUB)
+	    {
+	      emit_insn (gen_vsx_extract_v4si (target, vec, GEN_INT (elt)));
+	      return;
+	    }
+	  else
+	    break;
 	}
     }
 
Index: gcc/config/rs6000/vsx.md
===================================================================
--- gcc/config/rs6000/vsx.md	(.../svn+ssh://meissner@gcc.gnu.org/svn/gcc/trunk/gcc/config/rs6000)	(revision 237826)
+++ gcc/config/rs6000/vsx.md	(.../gcc/config/rs6000)	(working copy)
@@ -263,6 +263,27 @@  (define_mode_attr VS_64reg [(V2DF	"ws")
 (define_mode_iterator VSINT_84  [V4SI V2DI DI])
 (define_mode_iterator VSINT_842 [V8HI V4SI V2DI])
 
+;; Iterator for ISA 3.0 vector extract/insert of integer vectors
+(define_mode_iterator VSX_EXTRACT_I [V16QI V8HI V4SI])
+
+;; Mode attribute to give the correct predicate for ISA 3.0 vector extract and
+;; insert to validate the operand number.
+(define_mode_attr VSX_EXTRACT_PREDICATE [(V16QI "const_0_to_15_operand")
+					 (V8HI  "const_0_to_7_operand")
+					 (V4SI  "const_0_to_3_operand")])
+
+;; Mode attribute to give the instruction for vector extract.  %3 contains the
+;; adjusted element count.
+(define_mode_attr VSX_EXTRACT_INSN [(V16QI "vextractub %0,%1,%3")
+				    (V8HI  "vextractuh %0,%1,%3")
+				    (V4SI  "xxextractuw %x0,%x1,%3")])
+
+;; Mode attribute to give the constraint for vector extract and insert
+;; operations.
+(define_mode_attr VSX_EX [(V16QI "v")
+			  (V8HI  "v")
+			  (V4SI  "wa")])
+
 ;; Constants for creating unspecs
 (define_c_enum "unspec"
   [UNSPEC_VSX_CONCAT
@@ -2322,6 +2343,75 @@  (define_expand "vec_perm_const<mode>"
     FAIL;
 })
 
+;; Extraction of a single element in a small integer vector.  None of the small
+;; types are currently allowed in a vector register, so we extract to a DImode
+;; and either do a direct move or store.
+(define_insn_and_split  "vsx_extract_<mode>"
+  [(set (match_operand:<VS_scalar> 0 "nonimmediate_operand" "=r,Z")
+	(vec_select:<VS_scalar>
+	 (match_operand:VSX_EXTRACT_I 1 "gpc_reg_operand" "<VSX_EX>,<VSX_EX>")
+	 (parallel [(match_operand:QI 2 "<VSX_EXTRACT_PREDICATE>" "n,n")])))
+   (clobber (match_scratch:DI 3 "=<VSX_EX>,<VSX_EX>"))]
+  "VECTOR_MEM_VSX_P (<MODE>mode) && TARGET_VEXTRACTUB"
+  "#"
+  "&& (reload_completed || MEM_P (operands[0]))"
+  [(const_int 0)]
+{
+  rtx dest = operands[0];
+  rtx src = operands[1];
+  rtx element = operands[2];
+  rtx di_tmp = operands[3];
+
+  if (GET_CODE (di_tmp) == SCRATCH)
+    di_tmp = gen_reg_rtx (DImode);
+
+  emit_insn (gen_vsx_extract_<mode>_di (di_tmp, src, element));
+
+  if (REG_P (dest))
+    emit_move_insn (gen_rtx_REG (DImode, REGNO (dest)), di_tmp);
+  else if (SUBREG_P (dest))
+    emit_move_insn (gen_rtx_REG (DImode, subreg_regno (dest)), di_tmp);
+  else if (MEM_P (operands[0]))
+    {
+      if (can_create_pseudo_p ())
+	dest = rs6000_address_for_fpconvert (dest);
+
+      if (<MODE>mode == V16QImode)
+	emit_insn (gen_p9_stxsibx (dest, di_tmp));
+      else if (<MODE>mode == V8HImode)
+	emit_insn (gen_p9_stxsihx (dest, di_tmp));
+      else if (<MODE>mode == V4SImode)
+	emit_insn (gen_stfiwx (dest, di_tmp));
+      else
+	gcc_unreachable ();
+    }
+  else
+    gcc_unreachable ();
+
+  DONE;
+}
+  [(set_attr "type" "vecsimple,fpstore")])
+
+(define_insn  "vsx_extract_<mode>_di"
+  [(set (match_operand:DI 0 "gpc_reg_operand" "=<VSX_EX>")
+	(zero_extend:DI
+	 (vec_select:<VS_scalar>
+	  (match_operand:VSX_EXTRACT_I 1 "gpc_reg_operand" "<VSX_EX>")
+	  (parallel [(match_operand:QI 2 "<VSX_EXTRACT_PREDICATE>" "n")]))))]
+  "VECTOR_MEM_VSX_P (<MODE>mode) && TARGET_VEXTRACTUB"
+{
+  int element = INTVAL (operands[2]);
+  int unit_size = GET_MODE_UNIT_SIZE (<MODE>mode);
+  int offset = ((VECTOR_ELT_ORDER_BIG)
+		? unit_size * element
+		: unit_size * (GET_MODE_NUNITS (<MODE>mode) - 1 - element));
+
+  operands[3] = GEN_INT (offset);
+  return "<VSX_EXTRACT_INSN>";
+}
+  [(set_attr "type" "vecsimple")])
+
+
 ;; Expanders for builtins
 (define_expand "vsx_mergel_<mode>"
   [(use (match_operand:VSX_D 0 "vsx_register_operand" ""))
Index: gcc/config/rs6000/rs6000.h
===================================================================
--- gcc/config/rs6000/rs6000.h	(.../svn+ssh://meissner@gcc.gnu.org/svn/gcc/trunk/gcc/config/rs6000)	(revision 237826)
+++ gcc/config/rs6000/rs6000.h	(.../gcc/config/rs6000)	(working copy)
@@ -599,6 +599,9 @@  extern int rs6000_vector_align[];
 #define TARGET_VADDUQM		(TARGET_P8_VECTOR && TARGET_POWERPC64)
 #define TARGET_DIRECT_MOVE_128	(TARGET_P9_VECTOR && TARGET_DIRECT_MOVE \
 				 && TARGET_POWERPC64)
+#define TARGET_VEXTRACTUB	(TARGET_P9_VECTOR && TARGET_DIRECT_MOVE \
+				 && TARGET_UPPER_REGS_DF \
+				 && TARGET_UPPER_REGS_DI && TARGET_POWERPC64)
 
 /* Byte/char syncs were added as phased in for ISA 2.06B, but are not present
    in power7, so conditionalize them on p8 features.  TImode syncs need quad
Index: gcc/config/rs6000/rs6000.md
===================================================================
--- gcc/config/rs6000/rs6000.md	(.../svn+ssh://meissner@gcc.gnu.org/svn/gcc/trunk/gcc/config/rs6000)	(revision 237826)
+++ gcc/config/rs6000/rs6000.md	(.../gcc/config/rs6000)	(working copy)
@@ -5696,11 +5696,13 @@  (define_expand "lround<mode>di2"
 
 ; An UNSPEC is used so we don't have to support SImode in FP registers.
 (define_insn "stfiwx"
-  [(set (match_operand:SI 0 "memory_operand" "=Z")
-	(unspec:SI [(match_operand:DI 1 "gpc_reg_operand" "d")]
+  [(set (match_operand:SI 0 "memory_operand" "=Z,Z")
+	(unspec:SI [(match_operand:DI 1 "gpc_reg_operand" "d,wv")]
 		   UNSPEC_STFIWX))]
   "TARGET_PPC_GFXOPT"
-  "stfiwx %1,%y0"
+  "@
+   stfiwx %1,%y0
+   stxsiwx %x1,%y0"
   [(set_attr "type" "fpstore")])
 
 ;; If we don't have a direct conversion to single precision, don't enable this

Index: gcc/testsuite/gcc.target/powerpc/p9-extract-1.c
===================================================================
--- gcc/testsuite/gcc.target/powerpc/p9-extract-1.c	(.../svn+ssh://meissner@gcc.gnu.org/svn/gcc/trunk/gcc/testsuite/gcc.target/powerpc)	(revision 0)
+++ gcc/testsuite/gcc.target/powerpc/p9-extract-1.c	(.../gcc/testsuite/gcc.target/powerpc)	(revision 237858)
@@ -0,0 +1,26 @@ 
+/* { dg-do compile { target { powerpc64*-*-* && lp64 } } } */
+/* { dg-skip-if "do not override -mcpu" { powerpc*-*-* } { "-mcpu=*" } { "-mcpu=power9" } } */
+/* { dg-require-effective-target powerpc_p9vector_ok } */
+/* { dg-options "-mcpu=power9 -O2" } */
+
+#include <altivec.h>
+
+int extract_int_0 (vector int a) { return vec_extract (a, 0); }
+int extract_int_3 (vector int a) { return vec_extract (a, 3); }
+
+int extract_short_0 (vector short a) { return vec_extract (a, 0); }
+int extract_short_3 (vector short a) { return vec_extract (a, 7); }
+
+int extract_schar_0 (vector signed char a) { return vec_extract (a, 0); }
+int extract_schar_3 (vector signed char a) { return vec_extract (a, 15); }
+
+/* { dg-final { scan-assembler     "vextractub"  } } */
+/* { dg-final { scan-assembler     "vextractuh"  } } */
+/* { dg-final { scan-assembler     "xxextractuw" } } */
+/* { dg-final { scan-assembler     "mfvsrd"      } } */
+/* { dg-final { scan-assembler-not "stxvd2x"     } } */
+/* { dg-final { scan-assembler-not "stxv"        } } */
+/* { dg-final { scan-assembler-not "lwa"         } } */
+/* { dg-final { scan-assembler-not "lwz"         } } */
+/* { dg-final { scan-assembler-not "lha"         } } */
+/* { dg-final { scan-assembler-not "lhz"         } } */
Index: gcc/testsuite/gcc.target/powerpc/p9-extract-2.c
===================================================================
--- gcc/testsuite/gcc.target/powerpc/p9-extract-2.c	(.../svn+ssh://meissner@gcc.gnu.org/svn/gcc/trunk/gcc/testsuite/gcc.target/powerpc)	(revision 0)
+++ gcc/testsuite/gcc.target/powerpc/p9-extract-2.c	(.../gcc/testsuite/gcc.target/powerpc)	(revision 237858)
@@ -0,0 +1,27 @@ 
+/* { dg-do compile { target { powerpc64*-*-* && lp64 } } } */
+/* { dg-skip-if "do not override -mcpu" { powerpc*-*-* } { "-mcpu=*" } { "-mcpu=power9" } } */
+/* { dg-require-effective-target powerpc_p9vector_ok } */
+/* { dg-options "-mcpu=power9 -O2" } */
+
+#include <altivec.h>
+
+void extract_int_0 (int *p, vector int a) { *p = vec_extract (a, 0); }
+void extract_int_3 (int *p, vector int a) { *p = vec_extract (a, 3); }
+
+void extract_short_0 (short *p, vector short a) { *p = vec_extract (a, 0); }
+void extract_short_3 (short *p, vector short a) { *p = vec_extract (a, 7); }
+
+void extract_schar_0 (signed char *p, vector signed char a) { *p = vec_extract (a, 0); }
+void extract_schar_3 (signed char *p, vector signed char a) { *p = vec_extract (a, 15); }
+
+/* { dg-final { scan-assembler     "vextractub"       } } */
+/* { dg-final { scan-assembler     "vextractuh"       } } */
+/* { dg-final { scan-assembler     "xxextractuw"      } } */
+/* { dg-final { scan-assembler     "stxsibx"          } } */
+/* { dg-final { scan-assembler     "stxsihx"          } } */
+/* { dg-final { scan-assembler     "stfiwx\\|stxsiwx" } } */
+/* { dg-final { scan-assembler-not "mfvsrd"           } } */
+/* { dg-final { scan-assembler-not "stxvd2x"          } } */
+/* { dg-final { scan-assembler-not "stxv"             } } */
+/* { dg-final { scan-assembler-not "lwa"              } } */
+/* { dg-final { scan-assembler-not "stw"              } } */