diff mbox series

[rs6000] testcases for vec_insert

Message ID 1539184471.16697.57.camel@brimstone.rchland.ibm.com
State New

Commit Message

will schmidt Oct. 10, 2018, 3:14 p.m. UTC
Hi,
  Add some testcases for verification of vec_insert() codegen.
The char,float,int,short tests are broken out into -p8 and -p9
variants due to codegen variations between the platforms.

Tested across assorted power linux platforms.  OK for trunk?

Thanks
-Will
    
[testsuite]
    
2018-10-10  Will Schmidt  <will_schmidt@vnet.ibm.com>

	* gcc.target/powerpc/fold-vec-insert-char-p8.c: New.
	* gcc.target/powerpc/fold-vec-insert-char-p9.c: New.
	* gcc.target/powerpc/fold-vec-insert-double.c: New.
	* gcc.target/powerpc/fold-vec-insert-float-p8.c: New.
	* gcc.target/powerpc/fold-vec-insert-float-p9.c: New.
	* gcc.target/powerpc/fold-vec-insert-int-p8.c: New.
	* gcc.target/powerpc/fold-vec-insert-int-p9.c: New.
	* gcc.target/powerpc/fold-vec-insert-longlong.c: New.
	* gcc.target/powerpc/fold-vec-insert-short-p8.c: New.
	* gcc.target/powerpc/fold-vec-insert-short-p9.c: New.
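
Note on the constant index used in the tests: most of the _cst functions in the patch pass 12 as the element index, even for vectors with fewer than 13 elements. As far as I understand the documented behaviour of vec_insert, the index is reduced modulo the number of elements, so 12 selects element 12 of a char vector but element 0 of an int vector. The following is only a portable scalar sketch of that reading; the struct types and the model_insert_* helpers are hypothetical stand-ins, not the AltiVec types or the built-in itself, and element ordering/endianness is ignored.

```c
/* Portable scalar model of the element selection; NOT the real
   AltiVec vector types or the vec_insert built-in.  */
typedef struct { signed char e[16]; } model_v16qi; /* stands in for vector signed char */
typedef struct { int e[4]; } model_v4si;           /* stands in for vector signed int */

/* Return a copy of v with element (i mod 16) replaced by x.  */
model_v16qi model_insert_char(signed char x, model_v16qi v, unsigned int i)
{
    v.e[i % 16u] = x;
    return v;
}

/* Return a copy of v with element (i mod 4) replaced by x.  */
model_v4si model_insert_int(int x, model_v4si v, unsigned int i)
{
    v.e[i % 4u] = x;
    return v;
}
```

Under this model, an index of 12 lands on element 12 for the char case and wraps to element 0 for the int case, which is why the same constant can be reused across the differently sized tests.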

Comments

Segher Boessenkool Oct. 10, 2018, 5:33 p.m. UTC | #1
Hi!

On Wed, Oct 10, 2018 at 10:14:31AM -0500, Will Schmidt wrote:
>   Add some testcases for verification of vec_insert() codegen.
> The char,float,int,short tests are broken out into -p8 and -p9
> variants due to codegen variations between the platforms.
> 
> Tested across assorted power linux platforms.  OK for trunk?

> +++ b/gcc/testsuite/gcc.target/powerpc/fold-vec-insert-char-p8.c

> +/* { dg-do compile { target { powerpc*-*-linux* && lp64 } } } */

The usual questions wrt lp64 :-)

> +/* { dg-require-effective-target powerpc_p8vector_ok } */
> +/* { dg-options "-maltivec -O2" } */

You say the same thing (and more) two lines later :-)

> +/* { dg-skip-if "do not override -mcpu" { powerpc*-*-* } { "-mcpu=*" } { "-mcpu=power8" } } */
> +/* { dg-options "-maltivec -O2 -mcpu=power8" } */

-maltivec is implied by -mcpu=power8 if you do nothing special.

Similar for the other tests.

For all the scan-assembler tests, did you verify these are exactly the
instructions we want generated?

Minus the nits, looks great, okay for trunk.  Thanks!


Segher
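
For illustration, the two directive nits above (the duplicated dg-options line and the redundant -maltivec) could be addressed with a header along these lines for the char-p8 test. This is only a sketch of one possible cleanup, not necessarily what was eventually committed; the lp64 selector is left as submitted because that question is not resolved in this thread, and the rest of the file would be unchanged.

```c
/* { dg-do compile { target { powerpc*-*-linux* && lp64 } } } */
/* { dg-require-effective-target powerpc_p8vector_ok } */
/* { dg-skip-if "do not override -mcpu" { powerpc*-*-* } { "-mcpu=*" } { "-mcpu=power8" } } */
/* { dg-options "-O2 -mcpu=power8" } */
```

The single remaining dg-options line relies on -mcpu=power8 to enable AltiVec, as noted in the review.
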
will schmidt Oct. 10, 2018, 6:24 p.m. UTC | #2
On Wed, 2018-10-10 at 12:33 -0500, Segher Boessenkool wrote:
> Hi!
> 
> On Wed, Oct 10, 2018 at 10:14:31AM -0500, Will Schmidt wrote:
> >   Add some testcases for verification of vec_insert() codegen.
> > The char,float,int,short tests are broken out into -p8 and -p9
> > variants due to codegen variations between the platforms.
> > 
> > Tested across assorted power linux platforms.  OK for trunk?
> 
> > +++ b/gcc/testsuite/gcc.target/powerpc/fold-vec-insert-char-p8.c
> 
> > +/* { dg-do compile { target { powerpc*-*-linux* && lp64 } } } */
> 
> The usual questions wrt lp64 :-)
> 
> > +/* { dg-require-effective-target powerpc_p8vector_ok } */
> > +/* { dg-options "-maltivec -O2" } */
> 
> You say the same thing (and more) two lines later :-)

heh, I really meant it, i guess.

> 
> > +/* { dg-skip-if "do not override -mcpu" { powerpc*-*-* } { "-mcpu=*" } { "-mcpu=power8" } } */
> > +/* { dg-options "-maltivec -O2 -mcpu=power8" } */
> 
> -maltivec is implied by -mcpu=power8 if you do nothing special.
> 
> Similar for the other tests.

Yup, I'll clean those up.  (this and the other submitted patches).
Thanks for the review. :-)

> For all the scan-assembler tests, did you verify these are exactly the
> instructions we want generated?

"want" may be a bit strong, but I did verify that this is what we get now.

What I specifically do is compare what we do generate now with what we
end up generating after I attempt some early gimple-folding, and make
sure any changes are equivalent or better.

> Minus the nits, looks great, okay for trunk.  Thanks!
> 
> 
> Segher
>
Segher Boessenkool Oct. 10, 2018, 7:12 p.m. UTC | #3
On Wed, Oct 10, 2018 at 01:24:52PM -0500, Will Schmidt wrote:
> > For all the scan-assembler tests, did you verify these are exactly the
> > instructions we want generated?
> 
> "want" may be a bit strong, but I did verify that this is what we get now.

The difference is if the testcases start failing with random compiler
changes while there is nothing wrong, or not.

We'll know soon enough ;-)


Segher
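
To make the concern about exact instruction counts concrete, here is a rough sketch of what a directive such as scan-assembler-times {\mstvx\M|\mstxvw4x\M} 4 checks: it counts whole-word matches of any listed mnemonic across the entire assembly output and compares the total with the expected number. The helpers count_mnemonic and count_any below are hypothetical and only approximate the Tcl word-boundary escapes \m and \M; they are not part of the DejaGnu harness.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Characters treated as part of a word (letters, digits, underscore).  */
static int is_word_char(unsigned char c)
{
    return isalnum(c) || c == '_';
}

/* Count whole-word occurrences of MNEMONIC in ASM_TEXT, roughly what
   the pattern {\mMNEMONIC\M} would match.  */
size_t count_mnemonic(const char *asm_text, const char *mnemonic)
{
    size_t len = strlen(mnemonic);
    size_t count = 0;
    const char *p = asm_text;

    if (len == 0)
        return 0;
    while ((p = strstr(p, mnemonic)) != NULL) {
        int starts_word = (p == asm_text)
                          || !is_word_char((unsigned char) p[-1]);
        int ends_word = !is_word_char((unsigned char) p[len]);
        if (starts_word && ends_word)
            count++;
        p++; /* keep searching after this position */
    }
    return count;
}

/* Sum the counts for several alternatives, modelling {\ma\M|\mb\M}.  */
size_t count_any(const char *asm_text, const char *const *names, size_t n)
{
    size_t total = 0;
    for (size_t k = 0; k < n; k++)
        total += count_mnemonic(asm_text, names[k]);
    return total;
}
```

Because the total must match exactly, any unrelated compiler change that adds or removes one such instruction (for example, choosing stxv over stvx when only stvx is listed) would make the test fail even though the generated code is still fine. Listing alternatives in the pattern reduces that risk but does not remove it.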

Patch

diff --git a/gcc/testsuite/gcc.target/powerpc/fold-vec-insert-char-p8.c b/gcc/testsuite/gcc.target/powerpc/fold-vec-insert-char-p8.c
new file mode 100644
index 0000000..1c634c6
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/fold-vec-insert-char-p8.c
@@ -0,0 +1,60 @@ 
+/* Verify that overloaded built-ins for vec_insert () with char
+   inputs produce the right codegen.  */
+
+/* The below contains vec_insert () calls with both variable and constant
+ values.  Only the constant value calls are early-gimple folded, but all
+ are tested for coverage.  */
+
+/* { dg-do compile { target { powerpc*-*-linux* && lp64 } } } */
+/* { dg-require-effective-target powerpc_p8vector_ok } */
+/* { dg-options "-maltivec -O2" } */
+/* { dg-skip-if "do not override -mcpu" { powerpc*-*-* } { "-mcpu=*" } { "-mcpu=power8" } } */
+/* { dg-options "-maltivec -O2 -mcpu=power8" } */
+
+#include <altivec.h>
+
+vector bool char testub_var (unsigned char x, vector bool char v, signed int i)
+{
+       return vec_insert (x, v, i);
+}
+vector signed char testss_var (signed char x, vector signed char v, signed int i)
+{
+       return vec_insert (x, v, i);
+}
+vector unsigned char testsu_var (signed char x, vector unsigned char v, signed int i)
+{
+       return vec_insert (x, v, i);
+}
+vector unsigned char testuu_var (unsigned char x, vector unsigned char v, signed int i)
+{
+       return vec_insert (x, v, i);
+}
+vector bool char testub_cst  (unsigned char x, vector bool char v)
+{
+       return vec_insert (x, v, 12);
+}
+vector signed char testss_cst  (signed char x, vector signed char v)
+{
+       return vec_insert (x, v, 12);
+}
+vector unsigned char testsu_cst (signed char x, vector unsigned char v)
+{
+       return vec_insert (x, v, 12);
+}
+vector unsigned char testuu_cst (unsigned char x, vector unsigned char v)
+{
+       return vec_insert (x, v, 12);
+}
+
+/* one store per _var test */
+/* { dg-final { scan-assembler-times {\mstvx\M|\mstxvw4x\M} 4 } } */
+/* one store-byte per test */
+/* { dg-final { scan-assembler-times {\mstb\M} 8 } } */
+/* one load per test */
+/* { dg-final { scan-assembler-times {\mlvx\M|\mlxvw4x\M} 8 } } */
+
+/* one lvebx per _cst test.*/
+/* { dg-final { scan-assembler-times {\mlvebx\M} 4 } } */
+/* one vperm per _cst test.*/
+/* { dg-final { scan-assembler-times {\mvperm\M} 4 } } */
+
diff --git a/gcc/testsuite/gcc.target/powerpc/fold-vec-insert-char-p9.c b/gcc/testsuite/gcc.target/powerpc/fold-vec-insert-char-p9.c
new file mode 100644
index 0000000..4fb43e1
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/fold-vec-insert-char-p9.c
@@ -0,0 +1,57 @@ 
+/* Verify that overloaded built-ins for vec_insert () with char
+   inputs produce the right codegen.  */
+
+/* The below contains vec_insert () calls with both variable and constant
+ values.  Only the constant value calls are early-gimple folded, but all
+ are tested for coverage.  */
+
+/* { dg-do compile { target { powerpc*-*-linux* && lp64 } } } */
+/* { dg-require-effective-target powerpc_p9vector_ok } */
+/* { dg-options "-maltivec -O2" } */
+/* { dg-skip-if "do not override -mcpu" { powerpc*-*-* } { "-mcpu=*" } { "-mcpu=power9" } } */
+/* { dg-options "-maltivec -O2 -mcpu=power9" } */
+
+#include <altivec.h>
+
+vector bool char testub_var (unsigned char x, vector bool char v, signed int i)
+{
+       return vec_insert (x, v, i);
+}
+vector signed char testss_var (signed char x, vector signed char v, signed int i)
+{
+       return vec_insert (x, v, i);
+}
+vector unsigned char testsu_var (signed char x, vector unsigned char v, signed int i)
+{
+       return vec_insert (x, v, i);
+}
+vector unsigned char testuu_var (unsigned char x, vector unsigned char v, signed int i)
+{
+       return vec_insert (x, v, i);
+}
+vector bool char testub_cst  (unsigned char x, vector bool char v)
+{
+       return vec_insert (x, v, 12);
+}
+vector signed char testss_cst  (signed char x, vector signed char v)
+{
+       return vec_insert (x, v, 12);
+}
+vector unsigned char testsu_cst (signed char x, vector unsigned char v)
+{
+       return vec_insert (x, v, 12);
+}
+vector unsigned char testuu_cst (unsigned char x, vector unsigned char v)
+{
+       return vec_insert (x, v, 12);
+}
+
+/* load immediate, add, store, stb, load variable test.  */
+/* { dg-final { scan-assembler-times {\mstxv\M|\mstvx\M} 4 } } */
+/* { dg-final { scan-assembler-times {\mstb\M} 4 } } */
+/* { dg-final { scan-assembler-times {\mlvebx\M|\mlxv\M|\mlvx\M} 4 } } */
+
+/* an insert and a move per constant test. */
+/* { dg-final { scan-assembler-times {\mmtvsrwz\M} 4 } } */
+/* { dg-final { scan-assembler-times {\mvinsertb\M} 4 } } */
+
diff --git a/gcc/testsuite/gcc.target/powerpc/fold-vec-insert-double.c b/gcc/testsuite/gcc.target/powerpc/fold-vec-insert-double.c
new file mode 100644
index 0000000..a042c31
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/fold-vec-insert-double.c
@@ -0,0 +1,29 @@ 
+/* Verify that overloaded built-ins for vec_insert with 
+   double inputs produce the right codegen.  */
+
+/* { dg-do compile { target { powerpc*-*-linux* && lp64 } } } */
+/* { dg-require-effective-target powerpc_vsx_ok } */
+/* { dg-options "-mvsx -O2" } */
+
+#include <altivec.h>
+
+vector double
+testd_var (double d, vector double vd, signed int si)
+{
+  return vec_insert (d, vd, si);
+}
+
+vector double
+testd_cst (double d, vector double vd)
+{
+  return vec_insert (d, vd, 1);
+}
+/* The number of xxpermdi instructions varies between
+ P7,P8,P9, ensure at least one hit. */
+/* { dg-final { scan-assembler {\mxxpermdi\M} } } */
+
+/* { dg-final { scan-assembler-times {\mrldic\M} 1 } } */
+/* { dg-final { scan-assembler-times {\mstxvd2x\M|\mstxv\M|\mstvx\M} 1 } } */
+/* { dg-final { scan-assembler-times {\mstfdx\M} 1 } } */
+/* { dg-final { scan-assembler-times {\mlxvd2x\M|\mlxv\M|\mlvx\M} 1 } } */
+
diff --git a/gcc/testsuite/gcc.target/powerpc/fold-vec-insert-float-p8.c b/gcc/testsuite/gcc.target/powerpc/fold-vec-insert-float-p8.c
new file mode 100644
index 0000000..8c1088e
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/fold-vec-insert-float-p8.c
@@ -0,0 +1,31 @@ 
+/* Verify that overloaded built-ins for vec_insert with float
+   inputs produce the right codegen.  Power8 variant.  */
+
+/* { dg-do compile { target { powerpc*-*-linux* && lp64 } } } */
+/* { dg-require-effective-target powerpc_vsx_ok } */
+/* { dg-skip-if "do not override -mcpu" { powerpc*-*-* } { "-mcpu=*" } { "-mcpu=power8" } } */
+/* { dg-options "-maltivec -O2 -mcpu=power8" } */
+
+#include <altivec.h>
+
+vector float
+testf_var (float f, vector float vf, signed int i)
+{
+  return vec_insert (f, vf, i);
+}
+
+vector float
+testf_cst (float f, vector float vf)
+{
+  return vec_insert (f, vf, 12);
+}
+
+/* { dg-final { scan-assembler-times {\mstvx\M|\mstxv\M|\mstxvd2x\M} 1 } } */
+/* cst test has stfs instead of stfsx. */
+/* { dg-final { scan-assembler-times {\mstfs\M|\mstfsx\M} 2 } } */
+/* { dg-final { scan-assembler-times {\mlvx\M|\mlxv\M|\mlxvd2x\M|\mlxvw4x\M} 2 } } */
+
+/* cst test has a lvewx,vperm combo */
+/* { dg-final { scan-assembler-times {\mlvewx\M} 1 } } */
+/* { dg-final { scan-assembler-times {\mvperm\M} 1 } } */
+
diff --git a/gcc/testsuite/gcc.target/powerpc/fold-vec-insert-float-p9.c b/gcc/testsuite/gcc.target/powerpc/fold-vec-insert-float-p9.c
new file mode 100644
index 0000000..78ecad46
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/fold-vec-insert-float-p9.c
@@ -0,0 +1,31 @@ 
+/* Verify that overloaded built-ins for vec_insert with float
+   inputs produce the right codegen.  Power9 variant.  */
+
+/* { dg-do compile { target { powerpc*-*-linux* && lp64 } } } */
+/* { dg-require-effective-target powerpc_vsx_ok } */
+/* { dg-skip-if "do not override -mcpu" { powerpc*-*-* } { "-mcpu=*" } { "-mcpu=power9" } } */
+/* { dg-options "-maltivec -O2 -mcpu=power9" } */
+
+#include <altivec.h>
+
+vector float
+testf_var (float f, vector float vf, signed int i)
+{
+  return vec_insert (f, vf, i);
+}
+
+vector float
+testf_cst (float f, vector float vf)
+{
+  return vec_insert (f, vf, 12);
+}
+
+/* var test has a load and store. */
+/* { dg-final { scan-assembler-times {\mlxv\M|\mlvx\M} 1 } } */
+/* { dg-final { scan-assembler-times {\mstfsx\M} 1 } } */
+
+/* cst test has a xscvdpspn,xxextractuw,xxinsertw combo */
+/* { dg-final { scan-assembler-times {\mxscvdpspn\M} 1 } } */
+/* { dg-final { scan-assembler-times {\mxxextractuw\M} 1 } } */
+/* { dg-final { scan-assembler-times {\mxxinsertw\M} 1 } } */
+
diff --git a/gcc/testsuite/gcc.target/powerpc/fold-vec-insert-int-p8.c b/gcc/testsuite/gcc.target/powerpc/fold-vec-insert-int-p8.c
new file mode 100644
index 0000000..228a290
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/fold-vec-insert-int-p8.c
@@ -0,0 +1,58 @@ 
+/* Verify that overloaded built-ins for vec_insert() with int
+   inputs produce the right codegen.  Power8 variant.  */
+
+/* { dg-do compile { target { powerpc*-*-linux* && lp64 } } } */
+/* { dg-require-effective-target powerpc_p8vector_ok } */
+/* { dg-skip-if "do not override -mcpu" { powerpc*-*-* } { "-mcpu=*" } { "-mcpu=power8" } } */
+/* { dg-options "-maltivec -O2 -mcpu=power8" } */
+
+#include <altivec.h>
+
+vector bool int
+testbi_var(unsigned int x, vector bool int v, signed int i)
+{
+   return vec_insert(x, v, i);
+}
+vector signed int
+testsi_var(signed int x, vector signed int v, signed int i)
+{
+   return vec_insert(x, v, i);
+}
+vector unsigned int
+testui1_var(signed int x, vector unsigned int v, signed int i)
+{
+   return vec_insert(x, v, i);
+}
+vector unsigned int
+testui2_var(unsigned int x, vector unsigned int v, signed int i)
+{
+   return vec_insert(x, v, i);
+}
+vector bool int
+testbi_cst(unsigned int x, vector bool int v)
+{
+   return vec_insert(x, v, 12);
+}
+vector signed int
+testsi_cst(signed int x, vector signed int v)
+{
+   return vec_insert(x, v, 12);
+}
+vector unsigned int
+testui1_cst(signed int x, vector unsigned int v)
+{
+   return vec_insert(x, v, 12);
+}
+vector unsigned int
+testui2_cst(unsigned int x, vector unsigned int v)
+{
+   return vec_insert(x, v, 12);
+}
+
+/* Each test has lvx (8).  cst tests have additional lvewx. (4) */
+/* var tests have both stwx (4) and stvx (4).  cst tests have stw (4).*/
+/* { dg-final { scan-assembler-times {\mstvx\M|\mstwx\M|\mstw\M|\mstxvw4x\M} 12 } } */
+/* { dg-final { scan-assembler-times {\mlvx\M|\mlxvw4x\M} 8 } } */
+
+/* { dg-final { scan-assembler-times {\mlvewx\M} 4 } } */
+/* { dg-final { scan-assembler-times {\mvperm\M} 4 } } */
diff --git a/gcc/testsuite/gcc.target/powerpc/fold-vec-insert-int-p9.c b/gcc/testsuite/gcc.target/powerpc/fold-vec-insert-int-p9.c
new file mode 100644
index 0000000..97daf94
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/fold-vec-insert-int-p9.c
@@ -0,0 +1,62 @@ 
+/* Verify that overloaded built-ins for vec_insert() with int
+   inputs produce the right codegen.  Power9 variant.  */
+
+/* { dg-do compile { target { powerpc*-*-linux* && lp64 } } } */
+/* { dg-require-effective-target powerpc_altivec_ok } */
+/* { dg-skip-if "do not override -mcpu" { powerpc*-*-* } { "-mcpu=*" } { "-mcpu=power9" } } */
+/* { dg-options "-maltivec -O2 -mcpu=power9" } */
+
+#include <altivec.h>
+
+vector bool int
+testbi_var(unsigned int x, vector bool int v, signed int i)
+{
+   return vec_insert(x, v, i);
+}
+vector signed int
+testsi_var(signed int x, vector signed int v, signed int i)
+{
+   return vec_insert(x, v, i);
+}
+vector unsigned int
+testui1_var(signed int x, vector unsigned int v, signed int i)
+{
+   return vec_insert(x, v, i);
+}
+vector unsigned int
+testui2_var(unsigned int x, vector unsigned int v, signed int i)
+{
+   return vec_insert(x, v, i);
+}
+vector bool int
+testbi_cst(unsigned int x, vector bool int v)
+{
+   return vec_insert(x, v, 12);
+}
+vector signed int
+testsi_cst(signed int x, vector signed int v)
+{
+   return vec_insert(x, v, 12);
+}
+vector unsigned int
+testui1_cst(signed int x, vector unsigned int v)
+{
+   return vec_insert(x, v, 12);
+}
+vector unsigned int
+testui2_cst(unsigned int x, vector unsigned int v)
+{
+   return vec_insert(x, v, 12);
+}
+
+
+/* load immediate, add, store, stwx, load variable test.  */
+/* { dg-final { scan-assembler-times {\mstxv\M|\mstvx\M} 4 } } */
+/* { dg-final { scan-assembler-times {\mstwx\M} 4 } } */
+/* { dg-final { scan-assembler-times {\mlxv\M|\mlvx\M} 4 } } */
+
+/* an insert and a move per constant test. */
+/* { dg-final { scan-assembler-times {\mmtvsrwz\M} 4 } } */
+/* { dg-final { scan-assembler-times {\mxxinsertw\M} 4 } } */
+
+
diff --git a/gcc/testsuite/gcc.target/powerpc/fold-vec-insert-longlong.c b/gcc/testsuite/gcc.target/powerpc/fold-vec-insert-longlong.c
new file mode 100644
index 0000000..f77dbf2
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/fold-vec-insert-longlong.c
@@ -0,0 +1,69 @@ 
+/* Verify that overloaded built-ins for vec_insert() with long long
+   inputs produce the right codegen.  */
+
+/* { dg-do compile { target { powerpc*-*-linux* && lp64 } } } */
+/* { dg-require-effective-target powerpc_p8vector_ok } */
+/* { dg-options "-mvsx -O2" } */
+
+#include <altivec.h>
+
+vector bool long long
+testbl_var(unsigned long long x, vector bool long long v, signed int i)
+{
+   return vec_insert(x, v, i);
+}
+
+vector signed long long
+testsl_var(signed long long x, vector signed long long v, signed int i)
+{
+   return vec_insert(x, v, i);
+}
+
+vector unsigned long long
+testul1_var(signed long long x, vector unsigned long long v, signed int i)
+{
+   return vec_insert(x, v, i);
+}
+
+vector unsigned long long
+testul2_var(unsigned long long x, vector unsigned long long v, signed int i)
+{
+   return vec_insert(x, v, i);
+}
+
+vector bool long long
+testbl_cst(unsigned long long x, vector bool long long v)
+{
+   return vec_insert(x, v, 12);
+}
+
+vector signed long long
+testsl_cst(signed long long x, vector signed long long v)
+{
+   return vec_insert(x, v, 12);
+}
+
+vector unsigned long long
+testul1_cst(signed long long x, vector unsigned long long v)
+{
+   return vec_insert(x, v, 12);
+}
+
+vector unsigned long long
+testul2_cst(unsigned long long x, vector unsigned long long v)
+{
+   return vec_insert(x, v, 12);
+}
+
+/* Number of xxpermdi insns varies between power targets.  ensure at least one. */
+/* { dg-final { scan-assembler {\mxxpermdi\M} } } */
+
+/* { dg-final { scan-assembler-times {\mrldic\M} 4 } } */
+/* The number of addi instructions decreases on newer systems.  Measured as 8 on
+ power7 and power8 targets, and drops to 4 on power9 targets that use the
+ newer stxv,lxv instructions.  For this test ensure we get at least one.  */
+/* { dg-final { scan-assembler {\maddi\M} } } */
+/* { dg-final { scan-assembler-times {\mstxvd2x\M|\mstvx\M|\mstxv\M} 4 } } */
+/* { dg-final { scan-assembler-times {\mstdx\M} 4 } } */
+/* { dg-final { scan-assembler-times {\mlxvd2x\M|\mlxv\M|\mlvx\M} 4 } } */
+
diff --git a/gcc/testsuite/gcc.target/powerpc/fold-vec-insert-short-p8.c b/gcc/testsuite/gcc.target/powerpc/fold-vec-insert-short-p8.c
new file mode 100644
index 0000000..e99e867
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/fold-vec-insert-short-p8.c
@@ -0,0 +1,58 @@ 
+/* Verify that overloaded built-ins for vec_insert() with short
+   inputs produce the right codegen.  Power8 variant.  */
+
+/* { dg-do compile { target { powerpc*-*-linux* && lp64 } } } */
+/* { dg-require-effective-target powerpc_p8vector_ok } */
+/* { dg-skip-if "do not override -mcpu" { powerpc*-*-* } { "-mcpu=*" } { "-mcpu=power8" } } */
+/* { dg-options "-maltivec -O2 -mcpu=power8" } */
+
+#include <altivec.h>
+
+vector bool short
+testbs_var(unsigned short x, vector bool short v, signed int i)
+{
+   return vec_insert(x, v, i);
+}
+vector signed short
+testss_var(signed short x, vector signed short v, signed int i)
+{
+   return vec_insert(x, v, i);
+}
+vector unsigned short
+testus1_var(signed short x, vector unsigned short v, signed int i)
+{
+   return vec_insert(x, v, i);
+}
+vector unsigned short
+testus2_var(unsigned short x, vector unsigned short v, signed int i)
+{
+   return vec_insert(x, v, i);
+}
+vector bool short
+testbs_cst(signed short x, vector bool short v)
+{
+   return vec_insert(x, v, 12);
+}
+vector signed short
+testss_cst(signed short x, vector signed short v)
+{
+   return vec_insert(x, v, 12);
+}
+vector unsigned short
+testus1_cst(signed short x, vector unsigned short v)
+{
+   return vec_insert(x, v, 12);
+}
+vector unsigned short
+testus2_cst(unsigned short x, vector unsigned short v)
+{
+   return vec_insert(x, v, 12);
+}
+
+/* { dg-final { scan-assembler-times {\mlhz\M|\mlvx\M|\mlxv\M|\mlxvw4x\M} 8 } } */
+/* stores.. 2 each per variable tests, 1 each per cst test. */
+/* { dg-final { scan-assembler-times {\msthx\M|\mstvx\M|\msth\M|\mstxvw4x\M} 12 } } */
+
+/* { dg-final { scan-assembler-times {\mlvehx\M} 4 } } */
+/* { dg-final { scan-assembler-times {\mvperm\M} 4 } } */
+
diff --git a/gcc/testsuite/gcc.target/powerpc/fold-vec-insert-short-p9.c b/gcc/testsuite/gcc.target/powerpc/fold-vec-insert-short-p9.c
new file mode 100644
index 0000000..a9024c2
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/fold-vec-insert-short-p9.c
@@ -0,0 +1,57 @@ 
+/* Verify that overloaded built-ins for vec_insert() with short
+   inputs produce the right codegen.  Power9 variant.  */
+
+/* { dg-do compile { target { powerpc*-*-linux* && lp64 } } } */
+/* { dg-require-effective-target powerpc_p9vector_ok } */
+/* { dg-skip-if "do not override -mcpu" { powerpc*-*-* } { "-mcpu=*" } { "-mcpu=power9" } } */
+/* { dg-options "-maltivec -O2 -mcpu=power9" } */
+
+#include <altivec.h>
+
+vector bool short
+testbs_var(unsigned short x, vector bool short v, signed int i)
+{
+   return vec_insert(x, v, i);
+}
+vector signed short
+testss_var(signed short x, vector signed short v, signed int i)
+{
+   return vec_insert(x, v, i);
+}
+vector unsigned short
+testus1_var(signed short x, vector unsigned short v, signed int i)
+{
+   return vec_insert(x, v, i);
+}
+vector unsigned short
+testus2_var(unsigned short x, vector unsigned short v, signed int i)
+{
+   return vec_insert(x, v, i);
+}
+vector bool short
+testbs_cst(signed short x, vector bool short v)
+{
+   return vec_insert(x, v, 12);
+}
+vector signed short
+testss_cst(signed short x, vector signed short v)
+{
+   return vec_insert(x, v, 12);
+}
+vector unsigned short
+testus1_cst(signed short x, vector unsigned short v)
+{
+   return vec_insert(x, v, 12);
+}
+vector unsigned short
+testus2_cst(unsigned short x, vector unsigned short v)
+{
+   return vec_insert(x, v, 12);
+}
+
+/* { dg-final { scan-assembler-times {\mmtvsrwz\M} 4 } } */
+/* { dg-final { scan-assembler-times {\mvinserth\M} 4 } } */
+
+/* { dg-final { scan-assembler-times {\mstxv\M|\mstvx\M} 4 } } */
+/* { dg-final { scan-assembler-times {\mlxv\M|\mlvx\M} 4 } } */
+