diff mbox

[rs6000] Support -maltivec=be in LE mode for vec_perm builtin

Message ID 1391211391.14633.19.camel@gnopaine
State New
Headers show

Commit Message

Bill Schmidt Jan. 31, 2014, 11:36 p.m. UTC
Hi,

One more patch in the set supporting -maltivec=be for the Altivec
builtins, this one for vec_perm.  Most of the work was done already in
rs6000.c:altivec_expand_vec_perm_le ().  We can reuse this logic for the
present work by generalizing the code to operate on vector types other
than V16QImode.

When -maltivec=be is specified for a little endian target, we call that
routine to expand the builtin rather than generating the vperm pattern
directly.  The pattern is then matched by
*altivec_vperm_<mode>{,_uns}_internal to generate the actual vperm
instruction.

There are two new test cases that verify correct vec_perm behavior for
BE, LE, and LE with -maltivec=be.  Because vec_perm now works the same
for BE and LE without -maltivec=be, one existing test needs modification
to avoid the previous workaround for endianness.

Bootstrapped and tested on powerpc64{,le}-unknown-linux-gnu with no
regressions.  Is this ok for trunk?

Thanks,
Bill


gcc:

2014-01-31  Bill Schmidt  <wschmidt@linux.vnet.ibm.com>

	* config/rs6000/rs6000.c (altivec_expand_vec_perm_le): Generalize
	for vector types other than V16QImode.
	* config/rs6000/altivec.md (altivec_vperm_<mode>): Change to a
	define_expand, and call altivec_expand_vec_perm_le when producing
	code with little endian element order.
	(*altivec_vperm_<mode>_internal): New insn having previous
	behavior of altivec_vperm_<mode>.
	(altivec_vperm_<mode>_uns): Change to a define_expand, and call
	altivec_expand_vec_perm_le when producing code with little endian
	element order.
	(*altivec_vperm_<mode>_uns_internal): New insn having previous
	behavior of altivec_vperm_<mode>_uns.

gcc/testsuite:

2014-01-31  Bill Schmidt  <wschmidt@linux.vnet.ibm.com>

	* gcc.dg/vmx/3b-15.c: Remove special handling for little endian.
	* gcc.dg/vmx/perm.c: New.
	* gcc.dg/vmx/perm-be-order.c: New.

Comments

David Edelsohn Feb. 1, 2014, 12:02 a.m. UTC | #1
On Fri, Jan 31, 2014 at 6:36 PM, Bill Schmidt
<wschmidt@linux.vnet.ibm.com> wrote:
> Hi,
>
> One more patch in the set supporting -maltivec=be for the Altivec
> builtins, this one for vec_perm.  Most of the work was done already in
> rs6000.c:altivec_expand_vec_perm_le ().  We can reuse this logic for the
> present work by generalizing the code to operate on vector types other
> than V16QImode.
>
> When -maltivec=be is specified for a little endian target, we call that
> routine to expand the builtin rather than generating the vperm pattern
> directly.  The pattern is then matched by
> *altivec_vperm_<mode>{,_uns}_internal to generate the actual vperm
> instruction.
>
> There are two new test cases that verify correct vec_perm behavior for
> BE, LE, and LE with -maltivec=be.  Because vec_perm now works the same
> for BE and LE without -maltivec=be, one existing test needs modification
> to avoid the previous workaround for endianness.
>
> Bootstrapped and tested on powerpc64{,le}-unknown-linux-gnu with no
> regressions.  Is this ok for trunk?
>
> Thanks,
> Bill
>
>
> gcc:
>
> 2014-01-31  Bill Schmidt  <wschmidt@linux.vnet.ibm.com>
>
>         * config/rs6000/rs6000.c (altivec_expand_vec_perm_le): Generalize
>         for vector types other than V16QImode.
>         * config/rs6000/altivec.md (altivec_vperm_<mode>): Change to a
>         define_expand, and call altivec_expand_vec_perm_le when producing
>         code with little endian element order.
>         (*altivec_vperm_<mode>_internal): New insn having previous
>         behavior of altivec_vperm_<mode>.
>         (altivec_vperm_<mode>_uns): Change to a define_expand, and call
>         altivec_expand_vec_perm_le when producing code with little endian
>         element order.
>         (*altivec_vperm_<mode>_uns_internal): New insn having previous
>         behavior of altivec_vperm_<mode>_uns.
>
> gcc/testsuite:
>
> 2014-01-31  Bill Schmidt  <wschmidt@linux.vnet.ibm.com>
>
>         * gcc.dg/vmx/3b-15.c: Remove special handling for little endian.
>         * gcc.dg/vmx/perm.c: New.
>         * gcc.dg/vmx/perm-be-order.c: New.

Okay.

Thanks, David
diff mbox

Patch

Index: gcc/testsuite/gcc.dg/vmx/perm-be-order.c
===================================================================
--- gcc/testsuite/gcc.dg/vmx/perm-be-order.c	(revision 0)
+++ gcc/testsuite/gcc.dg/vmx/perm-be-order.c	(revision 0)
@@ -0,0 +1,74 @@ 
+/* { dg-options "-maltivec=be -mabi=altivec -std=gnu99 -mno-vsx" } */
+
+#include "harness.h"
+
+static void test()
+{
+  /* Input vectors.  */
+  vector unsigned char vuca = {0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15};
+  vector unsigned char vucb = {16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31};
+  vector signed char vsca = {-16,-15,-14,-13,-12,-11,-10,-9,-8,-7,-6,-5,-4,-3,-2,-1};
+  vector signed char vscb = {0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15};
+  vector unsigned short vusa = {0,1,2,3,4,5,6,7};
+  vector unsigned short vusb = {8,9,10,11,12,13,14,15};
+  vector signed short vssa = {-8,-7,-6,-5,-4,-3,-2,-1};
+  vector signed short vssb = {0,1,2,3,4,5,6,7};
+  vector unsigned int vuia = {0,1,2,3};
+  vector unsigned int vuib = {4,5,6,7};
+  vector signed int vsia = {-4,-3,-2,-1};
+  vector signed int vsib = {0,1,2,3};
+  vector float vfa = {-4.0,-3.0,-2.0,-1.0};
+  vector float vfb = {0.0,1.0,2.0,3.0};
+
+#if __BYTE_ORDER__ == __ORDER_LITTLE_ENDIAN__
+  vector unsigned char vucp = {15,16,14,17,13,18,12,19,11,20,10,21,9,22,8,23};
+  vector unsigned char vscp = {15,16,14,17,13,18,12,19,11,20,10,21,9,22,8,23};
+  vector unsigned char vusp = {15,14,17,16,13,12,19,18,11,10,21,20,9,8,23,22};
+  vector unsigned char vssp = {15,14,17,16,13,12,19,18,11,10,21,20,9,8,23,22};
+  vector unsigned char vuip = {15,14,13,12,19,18,17,16,11,10,9,8,23,22,21,20};
+  vector unsigned char vsip = {15,14,13,12,19,18,17,16,11,10,9,8,23,22,21,20};
+  vector unsigned char vfp  = {15,14,13,12,19,18,17,16,11,10,9,8,23,22,21,20};
+#else
+  vector unsigned char vucp = {0,31,1,30,2,29,3,28,4,27,5,26,6,25,7,24};
+  vector unsigned char vscp = {0,31,1,30,2,29,3,28,4,27,5,26,6,25,7,24};
+  vector unsigned char vusp = {0,1,30,31,2,3,28,29,4,5,26,27,6,7,24,25};
+  vector unsigned char vssp = {0,1,30,31,2,3,28,29,4,5,26,27,6,7,24,25};
+  vector unsigned char vuip = {0,1,2,3,28,29,30,31,4,5,6,7,24,25,26,27};
+  vector unsigned char vsip = {0,1,2,3,28,29,30,31,4,5,6,7,24,25,26,27};
+  vector unsigned char vfp  = {0,1,2,3,28,29,30,31,4,5,6,7,24,25,26,27};
+#endif
+
+  /* Result vectors.  */
+  vector unsigned char vuc;
+  vector signed char vsc;
+  vector unsigned short vus;
+  vector signed short vss;
+  vector unsigned int vui;
+  vector signed int vsi;
+  vector float vf;
+
+  /* Expected result vectors.  */
+  vector unsigned char vucr = {0,31,1,30,2,29,3,28,4,27,5,26,6,25,7,24};
+  vector signed char vscr = {-16,15,-15,14,-14,13,-13,12,-12,11,-11,10,-10,9,-9,8};
+  vector unsigned short vusr = {0,15,1,14,2,13,3,12};
+  vector signed short vssr = {-8,7,-7,6,-6,5,-5,4};
+  vector unsigned int vuir = {0,7,1,6};
+  vector signed int vsir = {-4,3,-3,2};
+  vector float vfr = {-4.0,3.0,-3.0,2.0};
+
+  vuc = vec_perm (vuca, vucb, vucp);
+  vsc = vec_perm (vsca, vscb, vscp);
+  vus = vec_perm (vusa, vusb, vusp);
+  vss = vec_perm (vssa, vssb, vssp);
+  vui = vec_perm (vuia, vuib, vuip);
+  vsi = vec_perm (vsia, vsib, vsip);
+  vf  = vec_perm (vfa,  vfb,  vfp );
+
+  check (vec_all_eq (vuc, vucr), "vuc");
+  check (vec_all_eq (vsc, vscr), "vsc");
+  check (vec_all_eq (vus, vusr), "vus");
+  check (vec_all_eq (vss, vssr), "vss");
+  check (vec_all_eq (vui, vuir), "vui");
+  check (vec_all_eq (vsi, vsir), "vsi");
+  check (vec_all_eq (vf,  vfr),  "vf" );
+}
Index: gcc/testsuite/gcc.dg/vmx/3b-15.c
===================================================================
--- gcc/testsuite/gcc.dg/vmx/3b-15.c	(revision 207373)
+++ gcc/testsuite/gcc.dg/vmx/3b-15.c	(working copy)
@@ -3,11 +3,7 @@ 
 vector unsigned char
 f (vector unsigned char a, vector unsigned char b, vector unsigned char c)
 {
-#ifdef __BIG_ENDIAN__
   return vec_perm(a,b,c); 
-#else
-  return vec_perm(b,a,c);
-#endif
 }
 
 static void test()
@@ -16,13 +12,8 @@  static void test()
 					    8,9,10,11,12,13,14,15}),
 		     ((vector unsigned char){70,71,72,73,74,75,76,77,
 					    78,79,80,81,82,83,84,85}),
-#ifdef __BIG_ENDIAN__
 		     ((vector unsigned char){0x1,0x14,0x18,0x10,0x16,0x15,0x19,0x1a,
 					    0x1c,0x1c,0x1c,0x12,0x8,0x1d,0x1b,0xe})),
-#else
-                     ((vector unsigned char){0x1e,0xb,0x7,0xf,0x9,0xa,0x6,0x5,
-                                            0x3,0x3,0x3,0xd,0x17,0x2,0x4,0x11})),
-#endif
 		   ((vector unsigned char){1,74,78,70,76,75,79,80,82,82,82,72,8,83,81,14})),
 	"f");
 }
Index: gcc/testsuite/gcc.dg/vmx/perm.c
===================================================================
--- gcc/testsuite/gcc.dg/vmx/perm.c	(revision 0)
+++ gcc/testsuite/gcc.dg/vmx/perm.c	(revision 0)
@@ -0,0 +1,69 @@ 
+#include "harness.h"
+
+static void test()
+{
+  /* Input vectors.  */
+  vector unsigned char vuca = {0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15};
+  vector unsigned char vucb
+    = {16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31};
+  vector unsigned char vucp = {0,31,1,30,2,29,3,28,4,27,5,26,6,25,7,24};
+
+  vector signed char vsca
+    = {-16,-15,-14,-13,-12,-11,-10,-9,-8,-7,-6,-5,-4,-3,-2,-1};
+  vector signed char vscb = {0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15};
+  vector unsigned char vscp = {0,31,1,30,2,29,3,28,4,27,5,26,6,25,7,24};
+
+  vector unsigned short vusa = {0,1,2,3,4,5,6,7};
+  vector unsigned short vusb = {8,9,10,11,12,13,14,15};
+  vector unsigned char vusp = {0,1,30,31,2,3,28,29,4,5,26,27,6,7,24,25};
+
+  vector signed short vssa = {-8,-7,-6,-5,-4,-3,-2,-1};
+  vector signed short vssb = {0,1,2,3,4,5,6,7};
+  vector unsigned char vssp = {0,1,30,31,2,3,28,29,4,5,26,27,6,7,24,25};
+
+  vector unsigned int vuia = {0,1,2,3};
+  vector unsigned int vuib = {4,5,6,7};
+  vector unsigned char vuip = {0,1,2,3,28,29,30,31,4,5,6,7,24,25,26,27};
+
+  vector signed int vsia = {-4,-3,-2,-1};
+  vector signed int vsib = {0,1,2,3};
+  vector unsigned char vsip = {0,1,2,3,28,29,30,31,4,5,6,7,24,25,26,27};
+
+  vector float vfa = {-4.0,-3.0,-2.0,-1.0};
+  vector float vfb = {0.0,1.0,2.0,3.0};
+  vector unsigned char vfp = {0,1,2,3,28,29,30,31,4,5,6,7,24,25,26,27};
+
+  /* Result vectors.  */
+  vector unsigned char vuc;
+  vector signed char vsc;
+  vector unsigned short vus;
+  vector signed short vss;
+  vector unsigned int vui;
+  vector signed int vsi;
+  vector float vf;
+
+  /* Expected result vectors.  */
+  vector unsigned char vucr = {0,31,1,30,2,29,3,28,4,27,5,26,6,25,7,24};
+  vector signed char vscr = {-16,15,-15,14,-14,13,-13,12,-12,11,-11,10,-10,9,-9,8};
+  vector unsigned short vusr = {0,15,1,14,2,13,3,12};
+  vector signed short vssr = {-8,7,-7,6,-6,5,-5,4};
+  vector unsigned int vuir = {0,7,1,6};
+  vector signed int vsir = {-4,3,-3,2};
+  vector float vfr = {-4.0,3.0,-3.0,2.0};
+
+  vuc = vec_perm (vuca, vucb, vucp);
+  vsc = vec_perm (vsca, vscb, vscp);
+  vus = vec_perm (vusa, vusb, vusp);
+  vss = vec_perm (vssa, vssb, vssp);
+  vui = vec_perm (vuia, vuib, vuip);
+  vsi = vec_perm (vsia, vsib, vsip);
+  vf  = vec_perm (vfa,  vfb,  vfp );
+
+  check (vec_all_eq (vuc, vucr), "vuc");
+  check (vec_all_eq (vsc, vscr), "vsc");
+  check (vec_all_eq (vus, vusr), "vus");
+  check (vec_all_eq (vss, vssr), "vss");
+  check (vec_all_eq (vui, vuir), "vui");
+  check (vec_all_eq (vsi, vsir), "vsi");
+  check (vec_all_eq (vf,  vfr),  "vf" );
+}
Index: gcc/config/rs6000/rs6000.c
===================================================================
--- gcc/config/rs6000/rs6000.c	(revision 207373)
+++ gcc/config/rs6000/rs6000.c	(working copy)
@@ -29840,16 +29840,18 @@  altivec_expand_vec_perm_le (rtx operands[4])
   rtx op1 = operands[2];
   rtx sel = operands[3];
   rtx tmp = target;
+  rtx splatreg = gen_reg_rtx (V16QImode);
+  enum machine_mode mode = GET_MODE (target);
 
   /* Get everything in regs so the pattern matches.  */
   if (!REG_P (op0))
-    op0 = force_reg (V16QImode, op0);
+    op0 = force_reg (mode, op0);
   if (!REG_P (op1))
-    op1 = force_reg (V16QImode, op1);
+    op1 = force_reg (mode, op1);
   if (!REG_P (sel))
     sel = force_reg (V16QImode, sel);
   if (!REG_P (target))
-    tmp = gen_reg_rtx (V16QImode);
+    tmp = gen_reg_rtx (mode);
 
   /* SEL = splat(31) - SEL.  */
   /* We want to subtract from 31, but we can't vspltisb 31 since
@@ -29857,13 +29859,12 @@  altivec_expand_vec_perm_le (rtx operands[4])
      five bits of the permute control vector elements are used.  */
   splat = gen_rtx_VEC_DUPLICATE (V16QImode,
 				 gen_rtx_CONST_INT (QImode, -1));
-  emit_move_insn (tmp, splat);
-  sel = gen_rtx_MINUS (V16QImode, tmp, sel);
-  emit_move_insn (tmp, sel);
+  emit_move_insn (splatreg, splat);
+  sel = gen_rtx_MINUS (V16QImode, splatreg, sel);
+  emit_move_insn (splatreg, sel);
 
   /* Permute with operands reversed and adjusted selector.  */
-  unspec = gen_rtx_UNSPEC (V16QImode, gen_rtvec (3, op1, op0, tmp),
-			   UNSPEC_VPERM);
+  unspec = gen_rtx_UNSPEC (mode, gen_rtvec (3, op1, op0, splatreg), UNSPEC_VPERM);
 
   /* Copy into target, possibly by way of a register.  */
   if (!REG_P (target))
Index: gcc/config/rs6000/altivec.md
===================================================================
--- gcc/config/rs6000/altivec.md	(revision 207373)
+++ gcc/config/rs6000/altivec.md	(working copy)
@@ -1804,23 +1804,53 @@ 
   "vrfiz %0,%1"
   [(set_attr "type" "vecfloat")])
 
-(define_insn "altivec_vperm_<mode>"
+(define_expand "altivec_vperm_<mode>"
   [(set (match_operand:VM 0 "register_operand" "=v")
 	(unspec:VM [(match_operand:VM 1 "register_operand" "v")
 		    (match_operand:VM 2 "register_operand" "v")
 		    (match_operand:V16QI 3 "register_operand" "v")]
 		   UNSPEC_VPERM))]
   "TARGET_ALTIVEC"
+{
+  if (!VECTOR_ELT_ORDER_BIG)
+    {
+      altivec_expand_vec_perm_le (operands);
+      DONE;
+    }
+})
+
+(define_insn "*altivec_vperm_<mode>_internal"
+  [(set (match_operand:VM 0 "register_operand" "=v")
+	(unspec:VM [(match_operand:VM 1 "register_operand" "v")
+		    (match_operand:VM 2 "register_operand" "v")
+		    (match_operand:V16QI 3 "register_operand" "v")]
+		   UNSPEC_VPERM))]
+  "TARGET_ALTIVEC"
   "vperm %0,%1,%2,%3"
   [(set_attr "type" "vecperm")])
 
-(define_insn "altivec_vperm_<mode>_uns"
+(define_expand "altivec_vperm_<mode>_uns"
   [(set (match_operand:VM 0 "register_operand" "=v")
 	(unspec:VM [(match_operand:VM 1 "register_operand" "v")
 		    (match_operand:VM 2 "register_operand" "v")
 		    (match_operand:V16QI 3 "register_operand" "v")]
 		   UNSPEC_VPERM_UNS))]
   "TARGET_ALTIVEC"
+{
+  if (!VECTOR_ELT_ORDER_BIG)
+    {
+      altivec_expand_vec_perm_le (operands);
+      DONE;
+    }
+})
+
+(define_insn "*altivec_vperm_<mode>_uns_internal"
+  [(set (match_operand:VM 0 "register_operand" "=v")
+	(unspec:VM [(match_operand:VM 1 "register_operand" "v")
+		    (match_operand:VM 2 "register_operand" "v")
+		    (match_operand:V16QI 3 "register_operand" "v")]
+		   UNSPEC_VPERM_UNS))]
+  "TARGET_ALTIVEC"
   "vperm %0,%1,%2,%3"
   [(set_attr "type" "vecperm")])