[V5,rs6000] Support vrotr<mode>3 for int vector types

Message ID 85937573-94ae-8a13-2cf6-5d4b9edf97e2@linux.ibm.com
State New

Commit Message

Kewen.Lin Aug. 2, 2019, 8:59 a.m. UTC
Hi Segher,

Sorry for the late reply.  I've addressed your comments in the attached patch.
Some points:
  1) Remove explicit AND part.
  2) Rename predicate name to vint_reg_or_vint_const.
  3) Split test cases into altivec and power8.

As to the predicate name and usage, I checked the current vector shifts;
they don't need to check const_vector specially (like right to left
conversion), except for the one "vec_shr_<mode>", but it checks for
scalar const int.

Btw, I've changed the 
  +      rtx imm_vec = 
  +	simplify_const_unary_operation
back to 
  +      rtx imm_vec
  +	= simplify_const_unary_operation
Otherwise check_GNU_style will report "Trailing operator" error.  :(

Bootstrapped and regtested on powerpc64le-unknown-linux-gnu.

Thanks,
Kewen

------------

gcc/ChangeLog

2019-08-02  Kewen Lin  <linkw@gcc.gnu.org>

	* config/rs6000/predicates.md (vint_reg_or_vint_const): New predicate.
	* config/rs6000/vector.md (vrotr<mode>3): New define_expand.

gcc/testsuite/ChangeLog

2019-08-02  Kewen Lin  <linkw@gcc.gnu.org>

	* gcc.target/powerpc/vec_rotate-1.c: New test.
	* gcc.target/powerpc/vec_rotate-2.c: New test.
	* gcc.target/powerpc/vec_rotate-3.c: New test.
	* gcc.target/powerpc/vec_rotate-4.c: New test.

Comments

Segher Boessenkool Aug. 3, 2019, 8:52 p.m. UTC | #1
Hi!

I somehow lost track of this email, sorry.

On Fri, Aug 02, 2019 at 04:59:44PM +0800, Kewen.Lin wrote:
> As to the predicate name and usage, I checked the current vector shifts, 
> they don't need to check const_vector specially (like right to left 
> conversion), excepting for the one "vec_shr_<mode>", but it checks for
> scalar const int.

I don't understand why we want to expand rotate-by-vector-of-immediates
if we have no insns for that?  If you just use vint_operand, what happens
then?

> Btw, I've changed the 
>   +      rtx imm_vec = 
>   +	simplify_const_unary_operation
> back to 
>   +      rtx imm_vec
>   +	= simplify_const_unary_operation
> Otherwise check_GNU_style will report "Trailing operator" error.  :(

Yeah I got it the wrong way around.  Either way is ugly.  Oh well.

> +/* { dg-options "-O3" } */
> +/* { dg-require-effective-target powerpc_altivec_ok } */

If you use altivec_ok, you need to use -maltivec in the options, too.
This test should probably work with -O2 as well; use that, if possible.

> +/* { dg-require-effective-target powerpc_p8vector_ok } */

I don't think we need this anymore?  Not sure.


Segher
Kewen.Lin Aug. 5, 2019, 3:41 a.m. UTC | #2
Hi Segher,

on 2019/8/4 4:52 AM, Segher Boessenkool wrote:
> Hi!
> 
> I somehow lost track of this email, sorry.
> 
> On Fri, Aug 02, 2019 at 04:59:44PM +0800, Kewen.Lin wrote:
>> As to the predicate name and usage, I checked the current vector shifts, 
>> they don't need to check const_vector specially (like right to left 
>> conversion), excepting for the one "vec_shr_<mode>", but it checks for
>> scalar const int.
> 
> I don't understand why we want to expand rotate-by-vector-of-immediates
> if we have no insns for that?  If you just use vint_operand, what happens
> then?
> 

You are right: if we just use vint_operand, the functionality is the same.
The only small difference is that the negated constant rotation count
isn't masked, which is harmless for functionality.

For example, for ULL >r 8, with the const vector handling it gets
  xxspltib 33,56

Without the handling, it gets
  xxspltib 33,248

But I agree that the difference is trivial, so I've unified the handling
in the patch attached below.
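
To make the 56 versus 248 difference concrete, here is a minimal scalar
sketch in C.  It assumes, as discussed in this thread, that the rotate
only looks at the low 6 bits of the count for a doubleword element.  The
helpers rotl64, rotr64 and check_rotr8_counts are hypothetical names
used for illustration only; they model one element and are not part of
the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Scalar model of one doubleword element of a rotate-left: only the
   low 6 bits of the count are used (an assumption taken from the
   discussion in this thread).  */
static uint64_t
rotl64 (uint64_t x, unsigned int count)
{
  unsigned int n = count & 63;
  return n ? (x << n) | (x >> (64 - n)) : x;
}

/* Reference rotate-right, written like the loops in the test cases.  */
static uint64_t
rotr64 (uint64_t x, unsigned int count)
{
  unsigned int n = count & 63;
  return n ? (x >> n) | (x << (64 - n)) : x;
}

/* Rotate right by 8 via rotate left by the negated count, once with
   the count masked to the element width and once only truncated to
   a byte.  */
static void
check_rotr8_counts (void)
{
  uint64_t x = 0x0123456789abcdefULL;
  unsigned int masked = (0u - 8u) & 63;	   /* masked to 6 bits */
  unsigned int unmasked = (0u - 8u) & 255; /* low byte only */

  assert (masked == 56);
  assert (unmasked == 248);
  assert (rotl64 (x, masked) == rotr64 (x, 8));
  assert (rotl64 (x, unmasked) == rotr64 (x, 8));
}
```

Both counts select the same rotation because 248 and 56 agree in their
low 6 bits, which is why dropping the explicit masking does not change
the result.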

>> +/* { dg-options "-O3" } */
>> +/* { dg-require-effective-target powerpc_altivec_ok } */
> 
> If you use altivec_ok, you need to use -maltivec in the options, too.
> This test should probably work with -O2 as well; use that, if possible.
> 

Sorry, the test case depends on vectorization which isn't enabled at -O2
by default.

>> +/* { dg-require-effective-target powerpc_p8vector_ok } */
> 
> I don't think we need this anymore?  Not sure.
> 

I thought -mdejagnu-cpu=power8 only ensures that the power8 cpu setting takes
precedence, but can't guarantee that the current driver supports power8
compilation.  Going by your comments, I guess that since the gcc configuration
doesn't have without-cpu= etc., power8 support is always guaranteed?


Thanks,
Kewen

-----------------

gcc/ChangeLog

2019-08-05  Kewen Lin  <linkw@gcc.gnu.org>

	* config/rs6000/vector.md (vrotr<mode>3): New define_expand.

gcc/testsuite/ChangeLog

2019-08-05  Kewen Lin  <linkw@gcc.gnu.org>

	* gcc.target/powerpc/vec_rotate-1.c: New test.
	* gcc.target/powerpc/vec_rotate-2.c: New test.
	* gcc.target/powerpc/vec_rotate-3.c: New test.
	* gcc.target/powerpc/vec_rotate-4.c: New test.
diff --git a/gcc/config/rs6000/vector.md b/gcc/config/rs6000/vector.md
index 70bcfe02e22..886cbad1655 100644
--- a/gcc/config/rs6000/vector.md
+++ b/gcc/config/rs6000/vector.md
@@ -1260,6 +1260,19 @@
   "VECTOR_UNIT_ALTIVEC_OR_VSX_P (<MODE>mode)"
   "")
 
+;; Expanders for rotatert to make use of vrotl
+(define_expand "vrotr<mode>3"
+  [(set (match_operand:VEC_I 0 "vint_operand")
+	(rotatert:VEC_I (match_operand:VEC_I 1 "vint_operand")
+		(match_operand:VEC_I 2 "vint_operand")))]
+  "VECTOR_UNIT_ALTIVEC_OR_VSX_P (<MODE>mode)"
+{
+  rtx rot_count = gen_reg_rtx (<MODE>mode);
+  emit_insn (gen_neg<mode>2 (rot_count, operands[2]));
+  emit_insn (gen_vrotl<mode>3 (operands[0], operands[1], rot_count));
+  DONE;
+})
+
 ;; Expanders for arithmetic shift left on each vector element
 (define_expand "vashl<mode>3"
   [(set (match_operand:VEC_I 0 "vint_operand")
diff --git a/gcc/testsuite/gcc.target/powerpc/vec_rotate-1.c b/gcc/testsuite/gcc.target/powerpc/vec_rotate-1.c
new file mode 100644
index 00000000000..f035a578292
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/vec_rotate-1.c
@@ -0,0 +1,39 @@
+/* { dg-options "-O3" } */
+/* { dg-require-effective-target powerpc_altivec_ok } */
+
+/* Check vectorizer can exploit vector rotation instructions on Power, mainly
+   for the case rotation count is const number.
+
+   Check for instructions vrlb/vrlh/vrlw only available if altivec supported. */
+
+#define N 256
+unsigned int suw[N], ruw[N];
+unsigned short suh[N], ruh[N];
+unsigned char sub[N], rub[N];
+
+void
+testUW ()
+{
+  for (int i = 0; i < 256; ++i)
+    ruw[i] = (suw[i] >> 8) | (suw[i] << (sizeof (suw[0]) * 8 - 8));
+}
+
+void
+testUH ()
+{
+  for (int i = 0; i < 256; ++i)
+    ruh[i] = (unsigned short) (suh[i] >> 9)
+	     | (unsigned short) (suh[i] << (sizeof (suh[0]) * 8 - 9));
+}
+
+void
+testUB ()
+{
+  for (int i = 0; i < 256; ++i)
+    rub[i] = (unsigned char) (sub[i] >> 5)
+	     | (unsigned char) (sub[i] << (sizeof (sub[0]) * 8 - 5));
+}
+
+/* { dg-final { scan-assembler {\mvrlw\M} } } */
+/* { dg-final { scan-assembler {\mvrlh\M} } } */
+/* { dg-final { scan-assembler {\mvrlb\M} } } */
diff --git a/gcc/testsuite/gcc.target/powerpc/vec_rotate-2.c b/gcc/testsuite/gcc.target/powerpc/vec_rotate-2.c
new file mode 100644
index 00000000000..0a2a965ddcb
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/vec_rotate-2.c
@@ -0,0 +1,19 @@
+/* { dg-require-effective-target powerpc_p8vector_ok } */
+/* { dg-options "-O3 -mdejagnu-cpu=power8" } */
+
+/* Check vectorizer can exploit vector rotation instructions on Power8, mainly
+   for the case rotation count is const number.
+
+   Check for vrld which is available on Power8 and above.  */
+
+#define N 256
+unsigned long long sud[N], rud[N];
+
+void
+testULL ()
+{
+  for (int i = 0; i < 256; ++i)
+    rud[i] = (sud[i] >> 8) | (sud[i] << (sizeof (sud[0]) * 8 - 8));
+}
+
+/* { dg-final { scan-assembler {\mvrld\M} } } */
diff --git a/gcc/testsuite/gcc.target/powerpc/vec_rotate-3.c b/gcc/testsuite/gcc.target/powerpc/vec_rotate-3.c
new file mode 100644
index 00000000000..5e90ae6fd63
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/vec_rotate-3.c
@@ -0,0 +1,40 @@
+/* { dg-options "-O3" } */
+/* { dg-require-effective-target powerpc_altivec_ok } */
+
+/* Check vectorizer can exploit vector rotation instructions on Power, mainly
+   for the case rotation count isn't const number.
+
+   Check for instructions vrlb/vrlh/vrlw only available if altivec supported. */
+
+#define N 256
+unsigned int suw[N], ruw[N];
+unsigned short suh[N], ruh[N];
+unsigned char sub[N], rub[N];
+extern unsigned char rot_cnt;
+
+void
+testUW ()
+{
+  for (int i = 0; i < 256; ++i)
+    ruw[i] = (suw[i] >> rot_cnt) | (suw[i] << (sizeof (suw[0]) * 8 - rot_cnt));
+}
+
+void
+testUH ()
+{
+  for (int i = 0; i < 256; ++i)
+    ruh[i] = (unsigned short) (suh[i] >> rot_cnt)
+	     | (unsigned short) (suh[i] << (sizeof (suh[0]) * 8 - rot_cnt));
+}
+
+void
+testUB ()
+{
+  for (int i = 0; i < 256; ++i)
+    rub[i] = (unsigned char) (sub[i] >> rot_cnt)
+	     | (unsigned char) (sub[i] << (sizeof (sub[0]) * 8 - rot_cnt));
+}
+
+/* { dg-final { scan-assembler {\mvrlw\M} } } */
+/* { dg-final { scan-assembler {\mvrlh\M} } } */
+/* { dg-final { scan-assembler {\mvrlb\M} } } */
diff --git a/gcc/testsuite/gcc.target/powerpc/vec_rotate-4.c b/gcc/testsuite/gcc.target/powerpc/vec_rotate-4.c
new file mode 100644
index 00000000000..0d3e8378ed6
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/vec_rotate-4.c
@@ -0,0 +1,20 @@
+/* { dg-require-effective-target powerpc_p8vector_ok } */
+/* { dg-options "-O3 -mdejagnu-cpu=power8" } */
+
+/* Check vectorizer can exploit vector rotation instructions on Power8, mainly
+   for the case rotation count isn't const number.
+
+   Check for vrld which is available on Power8 and above.  */
+
+#define N 256
+unsigned long long sud[N], rud[N];
+extern unsigned char rot_cnt;
+
+void
+testULL ()
+{
+  for (int i = 0; i < 256; ++i)
+    rud[i] = (sud[i] >> rot_cnt) | (sud[i] << (sizeof (sud[0]) * 8 - rot_cnt));
+}
+
+/* { dg-final { scan-assembler {\mvrld\M} } } */
Segher Boessenkool Aug. 5, 2019, 9:21 p.m. UTC | #3
On Mon, Aug 05, 2019 at 11:41:41AM +0800, Kewen.Lin wrote:
> on 2019/8/4 4:52 AM, Segher Boessenkool wrote:
> > On Fri, Aug 02, 2019 at 04:59:44PM +0800, Kewen.Lin wrote:
> >> As to the predicate name and usage, I checked the current vector shifts, 
> >> they don't need to check const_vector specially (like right to left 
> >> conversion), excepting for the one "vec_shr_<mode>", but it checks for
> >> scalar const int.
> > 
> > I don't understand why we want to expand rotate-by-vector-of-immediates
> > if we have no insns for that?  If you just use vint_operand, what happens
> > then?
> 
> You are right: if we just use vint_operand, the functionality is the same.
> The only small difference is that the negated constant rotation count
> isn't masked, which is harmless for functionality.
> 
> For example, for ULL >r 8, with the const vector handling it gets
>   xxspltib 33,56
> 
> Without the handling, it gets
>   xxspltib 33,248
> 
> But I agree that the difference is trivial, so I've unified the handling
> in the patch attached below.

There are two cases: either all elements are rotated by the same amount,
or they are not.  When they are, on p8 and later we can always use
xxspltib, which allows immediates 0..255, and the rotates look only at
the low bits they need, in that element, for that element (so you can
always splat it to all bytes; all but the low-order bytes are just
ignored by the rotate insns).  Before p8, we use vsplti[bhw], and those
allow -16..15, so for vrlw you do *not* want to mask it with 31.
There are some mechanics with easy_altivec_constant that should help
here.  Maybe they can use some improvement.

The other case is if not all shift counts are the same.  I'm not sure
we actually care much about this case :-)
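
A minimal C sketch of why masking hurts before p8, assuming the -16..15
splat immediate range and the "only the low bits count" behaviour
described above.  The helpers fits_splat_imm5, rotl32 and
check_vrlw_count are hypothetical illustrations, not GCC functions.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: does V fit the -16..15 immediate range that
   the text above gives for vsplti[bhw]?  */
static int
fits_splat_imm5 (int v)
{
  return v >= -16 && v <= 15;
}

/* Scalar model of one word element of a rotate-left: only the low
   5 bits of the count are used.  */
static uint32_t
rotl32 (uint32_t x, uint32_t count)
{
  uint32_t n = count & 31;
  return n ? (x << n) | (x >> (32 - n)) : x;
}

/* For a rotate right by 8, the negated count -8 is splattable, but
   the masked count 24 is not, although both rotate identically.  */
static void
check_vrlw_count (void)
{
  int negated = -8;
  int masked = negated & 31;	/* 24 on two's complement targets */

  assert (fits_splat_imm5 (negated));
  assert (!fits_splat_imm5 (masked));
  assert (rotl32 (0xdeadbeefu, (uint32_t) negated)
	  == rotl32 (0xdeadbeefu, (uint32_t) masked));
}
```

So for word elements the unmasked negated count stays within the splat
immediate range in this example, while the masked value would have to be
materialised some other way.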

> >> +/* { dg-options "-O3" } */
> >> +/* { dg-require-effective-target powerpc_altivec_ok } */
> > 
> > If you use altivec_ok, you need to use -maltivec in the options, too.
> > This test should probably work with -O2 as well; use that, if possible.
> 
> Sorry, the test case depends on vectorization which isn't enabled at -O2
> by default.

Ah yes, that is pretty obvious.

> >> +/* { dg-require-effective-target powerpc_p8vector_ok } */
> > 
> > I don't think we need this anymore?  Not sure.
> 
> I thought -mdejagnu-cpu=power8 only ensures that the power8 cpu setting takes
> precedence, but can't guarantee that the current driver supports power8
> compilation.  Going by your comments, I guess that since the gcc configuration
> doesn't have without-cpu= etc., power8 support is always guaranteed?

The compiler always supports all CPUs.

If you are using something like -maltivec, things are different: you
might have selected a CPU that does not allow -maltivec, so we do need
altivec_ok.  But if you use -mcpu=power8 (or -mdejagnu-cpu=power8), you
can use all p8 insns, including the vector ones (unless you disable
them again with -mno-vsx or similar; just don't do that ;-) )

[ In the past, it was possible to configure the compiler without support
for p8 vector insns, if your assembler doesn't support them.  We do not
do this anymore: now, if your compiler does support things that your
assembler does not, you'll get errors from that assembler if you try to
use those instructions.  Which is fine, just make sure you use a new
enough assembler for the GCC version you use.  This always has been true,
but with a somewhat larger window of allowed versions.  But this "don't
support all insns if the assembler does not" means we need to test a lot
more configurations (or leave them untested, even worse).

As a side effect, most *_ok now do nothing.  *_hw of course is still
needed to check if the test system allows running the testcase.  ]

> gcc/ChangeLog
> 
> 2019-08-05  Kewen Lin  <linkw@gcc.gnu.org>
> 
> 	* config/rs6000/vector.md (vrotr<mode>3): New define_expand.
> 
> gcc/testsuite/ChangeLog
> 
> 2019-08-05  Kewen Lin  <linkw@gcc.gnu.org>
> 
> 	* gcc.target/powerpc/vec_rotate-1.c: New test.
> 	* gcc.target/powerpc/vec_rotate-2.c: New test.
> 	* gcc.target/powerpc/vec_rotate-3.c: New test.
> 	* gcc.target/powerpc/vec_rotate-4.c: New test.

Approved for trunk (with or without the p8vector_ok change).  Thank you!


Segher
Kewen.Lin Aug. 6, 2019, 2:19 a.m. UTC | #4
Hi Segher,

on 2019/8/6 5:21 AM, Segher Boessenkool wrote:
> On Mon, Aug 05, 2019 at 11:41:41AM +0800, Kewen.Lin wrote:
>> on 2019/8/4 4:52 AM, Segher Boessenkool wrote:
>>> On Fri, Aug 02, 2019 at 04:59:44PM +0800, Kewen.Lin wrote:
> There are two cases: either all elements are rotated by the same amount,
> or they are not.  When they are, on p8 and later we can always use
> xxspltib, which allows immediates 0..255, and the rotates look only at
> the low bits they need, in that element, for that element (so you can
> always splat it to all bytes; all but the low-order bytes are just
> ignored by the rotate insns).  Before p8, we use vsplti[bhw], and those
> allow -16..15, so for vrlw you do *not* want to mask it with 31.
> There are some mechanics with easy_altivec_constant that should help
> here.  Maybe they can use some improvement.
> 
> The other case is if not all shift counts are the same.  I'm not sure
> we actually care much about this case :-)
> 

Got it, I think even for the "other" case, the neg operation without masking
is also safe since the hardware instructions do ignore the unrelated bits.
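
As a rough scalar model of the expander for the case where the counts
differ per element (negate each count, then rotate left), here is a
sketch in C.  It assumes four word elements and that only the low 5 bits
of each count matter; vrotr_v4si_model and the other helper names are
hypothetical and only illustrate the idea.

```c
#include <assert.h>
#include <stdint.h>

/* One word element of a rotate-left; only the low 5 bits of the
   count are used (assumption taken from the discussion above).  */
static uint32_t
rotl32_elem (uint32_t x, uint32_t count)
{
  uint32_t n = count & 31;
  return n ? (x << n) | (x >> (32 - n)) : x;
}

/* Reference rotate-right for one word element.  */
static uint32_t
rotr32_elem (uint32_t x, uint32_t count)
{
  uint32_t n = count & 31;
  return n ? (x >> n) | (x << (32 - n)) : x;
}

/* Model of the expansion: negate every count without masking, then
   rotate left element by element.  */
static void
vrotr_v4si_model (const uint32_t src[4], const uint32_t cnt[4],
		  uint32_t dst[4])
{
  for (int i = 0; i < 4; ++i)
    {
      uint32_t rot_count = 0u - cnt[i];	/* unsigned negate, no AND */
      dst[i] = rotl32_elem (src[i], rot_count);
    }
}

/* Compare the model against the reference for mixed counts.  */
static void
check_vrotr_model (void)
{
  uint32_t src[4] = { 0x12345678u, 0x80000001u, 0xdeadbeefu, 0x0000ffffu };
  uint32_t cnt[4] = { 0, 1, 8, 31 };
  uint32_t dst[4];

  vrotr_v4si_model (src, cnt, dst);
  for (int i = 0; i < 4; ++i)
    assert (dst[i] == rotr32_elem (src[i], cnt[i]));
}
```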

>> I thought -mdejagnu-cpu=power8 only ensures that the power8 cpu setting takes
>> precedence, but can't guarantee that the current driver supports power8
>> compilation.  Going by your comments, I guess that since the gcc configuration
>> doesn't have without-cpu= etc., power8 support is always guaranteed?
> 
> The compiler always supports all CPUs.
> 
> If you are using something like -maltivec, things are different: you
> might have selected a CPU that does not allow -maltivec, so we do need
> altivec_ok.  But if you use -mcpu=power8 (or -mdejagnu-cpu=power8), you
> can use all p8 insns, including the vector ones (unless you disable
> them again with -mno-vsx or similar; just don't do that ;-) )
> 
> [ In the past, it was possible to configure the compiler without support
> for p8 vector insns, if your assembler doesn't support them.  We do not
> do this anymore: now, if your compiler does support things that your
> assembler does not, you'll get errors from that assembler if you try to
> use those instructions.  Which is fine, just make sure you use a new
> enough assembler for the GCC version you use.  This always has been true,
> but with a somewhat larger window of allowed versions.  But this "don't
> support all insns if the assembler does not" means we need to test a lot
> more configurations (or leave them untested, even worse).
> 
> As a side effect, most *_ok now do nothing.  *_hw of course is still
> needed to check if the test system allows running the testcase.  ]
> 

Thanks a lot for the detailed explanation!  I'll remove p8vector_ok.

>> gcc/ChangeLog
>>
>> 2019-08-05  Kewen Lin  <linkw@gcc.gnu.org>
>>
>> 	* config/rs6000/vector.md (vrotr<mode>3): New define_expand.
>>
>> gcc/testsuite/ChangeLog
>>
>> 2019-08-05  Kewen Lin  <linkw@gcc.gnu.org>
>>
>> 	* gcc.target/powerpc/vec_rotate-1.c: New test.
>> 	* gcc.target/powerpc/vec_rotate-2.c: New test.
>> 	* gcc.target/powerpc/vec_rotate-3.c: New test.
>> 	* gcc.target/powerpc/vec_rotate-4.c: New test.
> 
> Approved for trunk (with or without the p8vector_ok change).  Thank you!
> 

Thank you!  :)


Kewen
Segher Boessenkool Aug. 6, 2019, 3:03 p.m. UTC | #5
On Tue, Aug 06, 2019 at 10:19:34AM +0800, Kewen.Lin wrote:
> on 2019/8/6 5:21 AM, Segher Boessenkool wrote:
> > The other case is if not all shift counts are the same.  I'm not sure
> > we actually care much about this case :-)
> 
> Got it, I think even for the "other" case, the neg operation without masking
> is also safe since the hardware instructions do ignore the unrelated bits.

Oh it is perfectly safe.  Only wondering how well we optimise this :-)


Segher

Patch

diff --git a/gcc/config/rs6000/predicates.md b/gcc/config/rs6000/predicates.md
index 8ca98299950..faf057425a8 100644
--- a/gcc/config/rs6000/predicates.md
+++ b/gcc/config/rs6000/predicates.md
@@ -163,6 +163,17 @@ 
   return VINT_REGNO_P (REGNO (op));
 })
 
+;; Return 1 if op is a vector register that operates on integer vectors
+;; or if op is a const vector with integer vector modes.
+(define_predicate "vint_reg_or_vint_const"
+  (match_code "reg,subreg,const_vector")
+{
+  if (GET_CODE (op) == CONST_VECTOR && GET_MODE_CLASS (mode) == MODE_VECTOR_INT)
+    return 1;
+
+  return vint_operand (op, mode);
+})
+
 ;; Return 1 if op is a vector register to do logical operations on (and, or,
 ;; xor, etc.)
 (define_predicate "vlogical_operand"
diff --git a/gcc/config/rs6000/vector.md b/gcc/config/rs6000/vector.md
index 70bcfe02e22..3111ca9029f 100644
--- a/gcc/config/rs6000/vector.md
+++ b/gcc/config/rs6000/vector.md
@@ -1260,6 +1260,33 @@ 
   "VECTOR_UNIT_ALTIVEC_OR_VSX_P (<MODE>mode)"
   "")
 
+;; Expanders for rotatert to make use of vrotl
+(define_expand "vrotr<mode>3"
+  [(set (match_operand:VEC_I 0 "vint_operand")
+	(rotatert:VEC_I (match_operand:VEC_I 1 "vint_operand")
+		(match_operand:VEC_I 2 "vint_reg_or_vint_const")))]
+  "VECTOR_UNIT_ALTIVEC_OR_VSX_P (<MODE>mode)"
+{
+  rtx rot_count = gen_reg_rtx (<MODE>mode);
+  if (GET_CODE (operands[2]) == CONST_VECTOR)
+    {
+      machine_mode inner_mode = GET_MODE_INNER (<MODE>mode);
+      unsigned int bits = GET_MODE_PRECISION (inner_mode);
+      rtx mask_vec = gen_const_vec_duplicate (<MODE>mode, GEN_INT (bits - 1));
+      rtx imm_vec
+	= simplify_const_unary_operation (NEG, <MODE>mode, operands[2],
+					  GET_MODE (operands[2]));
+      imm_vec
+	= simplify_const_binary_operation (AND, <MODE>mode, imm_vec, mask_vec);
+      rot_count = force_reg (<MODE>mode, imm_vec);
+    }
+  else
+    emit_insn (gen_neg<mode>2 (rot_count, operands[2]));
+
+  emit_insn (gen_vrotl<mode>3 (operands[0], operands[1], rot_count));
+  DONE;
+})
+
 ;; Expanders for arithmetic shift left on each vector element
 (define_expand "vashl<mode>3"
   [(set (match_operand:VEC_I 0 "vint_operand")
diff --git a/gcc/testsuite/gcc.target/powerpc/vec_rotate-1.c b/gcc/testsuite/gcc.target/powerpc/vec_rotate-1.c
new file mode 100644
index 00000000000..f035a578292
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/vec_rotate-1.c
@@ -0,0 +1,39 @@ 
+/* { dg-options "-O3" } */
+/* { dg-require-effective-target powerpc_altivec_ok } */
+
+/* Check vectorizer can exploit vector rotation instructions on Power, mainly
+   for the case rotation count is const number.
+
+   Check for instructions vrlb/vrlh/vrlw only available if altivec supported. */
+
+#define N 256
+unsigned int suw[N], ruw[N];
+unsigned short suh[N], ruh[N];
+unsigned char sub[N], rub[N];
+
+void
+testUW ()
+{
+  for (int i = 0; i < 256; ++i)
+    ruw[i] = (suw[i] >> 8) | (suw[i] << (sizeof (suw[0]) * 8 - 8));
+}
+
+void
+testUH ()
+{
+  for (int i = 0; i < 256; ++i)
+    ruh[i] = (unsigned short) (suh[i] >> 9)
+	     | (unsigned short) (suh[i] << (sizeof (suh[0]) * 8 - 9));
+}
+
+void
+testUB ()
+{
+  for (int i = 0; i < 256; ++i)
+    rub[i] = (unsigned char) (sub[i] >> 5)
+	     | (unsigned char) (sub[i] << (sizeof (sub[0]) * 8 - 5));
+}
+
+/* { dg-final { scan-assembler {\mvrlw\M} } } */
+/* { dg-final { scan-assembler {\mvrlh\M} } } */
+/* { dg-final { scan-assembler {\mvrlb\M} } } */
diff --git a/gcc/testsuite/gcc.target/powerpc/vec_rotate-2.c b/gcc/testsuite/gcc.target/powerpc/vec_rotate-2.c
new file mode 100644
index 00000000000..0a2a965ddcb
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/vec_rotate-2.c
@@ -0,0 +1,19 @@ 
+/* { dg-require-effective-target powerpc_p8vector_ok } */
+/* { dg-options "-O3 -mdejagnu-cpu=power8" } */
+
+/* Check vectorizer can exploit vector rotation instructions on Power8, mainly
+   for the case rotation count is const number.
+
+   Check for vrld which is available on Power8 and above.  */
+
+#define N 256
+unsigned long long sud[N], rud[N];
+
+void
+testULL ()
+{
+  for (int i = 0; i < 256; ++i)
+    rud[i] = (sud[i] >> 8) | (sud[i] << (sizeof (sud[0]) * 8 - 8));
+}
+
+/* { dg-final { scan-assembler {\mvrld\M} } } */
diff --git a/gcc/testsuite/gcc.target/powerpc/vec_rotate-3.c b/gcc/testsuite/gcc.target/powerpc/vec_rotate-3.c
new file mode 100644
index 00000000000..5e90ae6fd63
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/vec_rotate-3.c
@@ -0,0 +1,40 @@ 
+/* { dg-options "-O3" } */
+/* { dg-require-effective-target powerpc_altivec_ok } */
+
+/* Check vectorizer can exploit vector rotation instructions on Power, mainly
+   for the case rotation count isn't const number.
+
+   Check for instructions vrlb/vrlh/vrlw only available if altivec supported. */
+
+#define N 256
+unsigned int suw[N], ruw[N];
+unsigned short suh[N], ruh[N];
+unsigned char sub[N], rub[N];
+extern unsigned char rot_cnt;
+
+void
+testUW ()
+{
+  for (int i = 0; i < 256; ++i)
+    ruw[i] = (suw[i] >> rot_cnt) | (suw[i] << (sizeof (suw[0]) * 8 - rot_cnt));
+}
+
+void
+testUH ()
+{
+  for (int i = 0; i < 256; ++i)
+    ruh[i] = (unsigned short) (suh[i] >> rot_cnt)
+	     | (unsigned short) (suh[i] << (sizeof (suh[0]) * 8 - rot_cnt));
+}
+
+void
+testUB ()
+{
+  for (int i = 0; i < 256; ++i)
+    rub[i] = (unsigned char) (sub[i] >> rot_cnt)
+	     | (unsigned char) (sub[i] << (sizeof (sub[0]) * 8 - rot_cnt));
+}
+
+/* { dg-final { scan-assembler {\mvrlw\M} } } */
+/* { dg-final { scan-assembler {\mvrlh\M} } } */
+/* { dg-final { scan-assembler {\mvrlb\M} } } */
diff --git a/gcc/testsuite/gcc.target/powerpc/vec_rotate-4.c b/gcc/testsuite/gcc.target/powerpc/vec_rotate-4.c
new file mode 100644
index 00000000000..0d3e8378ed6
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/vec_rotate-4.c
@@ -0,0 +1,20 @@ 
+/* { dg-require-effective-target powerpc_p8vector_ok } */
+/* { dg-options "-O3 -mdejagnu-cpu=power8" } */
+
+/* Check vectorizer can exploit vector rotation instructions on Power8, mainly
+   for the case rotation count isn't const number.
+
+   Check for vrld which is available on Power8 and above.  */
+
+#define N 256
+unsigned long long sud[N], rud[N];
+extern unsigned char rot_cnt;
+
+void
+testULL ()
+{
+  for (int i = 0; i < 256; ++i)
+    rud[i] = (sud[i] >> rot_cnt) | (sud[i] << (sizeof (sud[0]) * 8 - rot_cnt));
+}
+
+/* { dg-final { scan-assembler {\mvrld\M} } } */