diff mbox series

rs6000: Add MMA __builtin_vsx_lxvp and __builtin_vsx_stxvp built-ins

Message ID 6797c56b-bc5a-2410-4a90-424cda38abaf@linux.ibm.com
State New
Headers show
Series rs6000: Add MMA __builtin_vsx_lxvp and __builtin_vsx_stxvp built-ins | expand

Commit Message

Peter Bergner June 30, 2021, 9:56 p.m. UTC
LLVM added the __builtin_vsx_lxvp and __builtin_vsx_stxvp built-ins.
The following patch adds support for them to GCC so that we stay in sync
with LLVM.

The following patch is bootstrapping and regtesting on powerpc64le-linux.
Ok for trunk if there are no regressions?  Ok to backport to GCC 11 and
GCC 10 after baking on trunk for a while?

Peter


gcc/
	* config/rs6000/rs6000-builtin.def (BU_MMA_LD, BU_MMA_ST): New macros.
	(__builtin_vsx_lxvp, __builtin_vsx_stxvp): New built-ins.
	* config/rs6000/rs6000-call.c (rs6000_gimple_fold_mma_builtin): Expand
	lxvp and stxvp built-ins.
	(mma_init_builtins): Handle lxvp and stxvp built-ins.
	(builtin_function_type): Likewise.

gcc/testsuite/
	* gcc.target/powerpc/mma-builtin-7.c: New test.
	* gcc.target/powerpc/mma-builtin-8.c: New test.

Comments

Peter Bergner July 1, 2021, 1:35 a.m. UTC | #1
On 6/30/21 3:56 PM, Peter Bergner wrote:
> The following patch is bootstrapping and regtesting on powerpc64le-linux.
> Ok for trunk if there are no regressions?  Ok to backport to GCC 11 and
> GCC 10 after baking on trunk for a while?

Bootstrap and regression testing came back clean.

Peter
Segher Boessenkool July 1, 2021, 7:01 p.m. UTC | #2
Hi!

On Wed, Jun 30, 2021 at 04:56:04PM -0500, Peter Bergner wrote:
> LLVM added the __builtin_vsx_lxvp and __builtin_vsx_stxvp built-ins.
> The following patch adds support for them to GCC so that we stay in sync
> with LLVM.

This should be documented somewhere.

> +  else if (fncode == VSX_BUILTIN_LXVP)
> +    {
> +      push_gimplify_context (true);
> +      tree offset = gimple_call_arg (stmt, 0);
> +      tree ptr = gimple_call_arg (stmt, 1);
> +      tree lhs = gimple_call_lhs (stmt);
> +      tree mem = build_simple_mem_ref (build2 (POINTER_PLUS_EXPR, TREE_TYPE (ptr), ptr, offset));

Line (much) too long.

> +  else if (fncode == VSX_BUILTIN_STXVP)
> +    {
> +      push_gimplify_context (true);
> +      tree src = gimple_call_arg (stmt, 0);
> +      tree offset = gimple_call_arg (stmt, 1);
> +      tree ptr = gimple_call_arg (stmt, 2);
> +      tree mem = build_simple_mem_ref (build2 (POINTER_PLUS_EXPR, TREE_TYPE (ptr), ptr, offset));

Same.

Can BU_MMA_LD be used only for lxvp?  Name it BU_MMA_PAIR_LD then?  Same
for _ST ofc.

The patch is okay for trunk.  For the backports it is okay if Bill has
looked at this patch as well.  Thanks!


Segher
Peter Bergner July 1, 2021, 7:23 p.m. UTC | #3
On 7/1/21 1:01 PM, Segher Boessenkool wrote:
> On Wed, Jun 30, 2021 at 04:56:04PM -0500, Peter Bergner wrote:
>> LLVM added the __builtin_vsx_lxvp and __builtin_vsx_stxvp built-ins.
>> The following patch adds support for them to GCC so that we stay in sync
>> with LLVM.
> 
> This should be documented somewhere.

Hmmm, I had that change in one of my trees, but must have dropped
it when moving to another system/tree and didn't notice!  Doh!
Let me add that back and make the other changes you suggested
and then I'll repost and test here.  Thanks for catching that!

Peter
Peter Bergner July 1, 2021, 7:48 p.m. UTC | #4
On 7/1/21 1:01 PM, Segher Boessenkool wrote:
> On Wed, Jun 30, 2021 at 04:56:04PM -0500, Peter Bergner wrote:
>> LLVM added the __builtin_vsx_lxvp and __builtin_vsx_stxvp built-ins.
>> The following patch adds support for them to GCC so that we stay in sync
>> with LLVM.
> 
> This should be documented somewhere.

Done.


>> +      tree mem = build_simple_mem_ref (build2 (POINTER_PLUS_EXPR, TREE_TYPE (ptr), ptr, offset));
> 
> Line (much) too long.

Fixed.


>> +      tree mem = build_simple_mem_ref (build2 (POINTER_PLUS_EXPR, TREE_TYPE (ptr), ptr, offset));
> 
> Same.

Fixed.


> Can BU_MMA_LD be used only for lxvp?  Name it BU_MMA_PAIR_LD then?  Same
> for _ST ofc.

Yes, it's used only for lxvp and stxvp and uses our MMA specific
__vector_pair types too.  Macro name changed.



> The patch is okay for trunk.

Below is the updated patch which is bootstrapping now.  I'll commit it
if it shows no regressions.


> For the backports it is okay if Bill has looked at this patch as well.

Bill has not seen the patch.  I'm not sure when/if he'll get a chance
to either.


Peter




gcc/
	* config/rs6000/rs6000-builtin.def (BU_MMA_PAIR_LD, BU_MMA_PAIR_ST):
	New macros.
	(__builtin_vsx_lxvp, __builtin_vsx_stxvp): New built-ins.
	* config/rs6000/rs6000-call.c (rs6000_gimple_fold_mma_builtin): Expand
	lxvp and stxvp built-ins.
	(mma_init_builtins): Handle lxvp and stxvp built-ins.
	(builtin_function_type): Likewise.
	* doc/extend.texi (__builtin_vsx_lxvp, __builtin_mma_stxvp): Document.

gcc/testsuite/
	* gcc.target/powerpc/mma-builtin-7.c: New test.
	* gcc.target/powerpc/mma-builtin-8.c: New test.


diff --git a/gcc/config/rs6000/rs6000-builtin.def b/gcc/config/rs6000/rs6000-builtin.def
index d7ce4de421e..6270444ef70 100644
--- a/gcc/config/rs6000/rs6000-builtin.def
+++ b/gcc/config/rs6000/rs6000-builtin.def
@@ -484,6 +484,25 @@
 		     | RS6000_BTC_SENARY),				\
 		    CODE_FOR_ ## ICODE)			/* ICODE */
 
+#define BU_MMA_PAIR_LD(ENUM, NAME, ATTR)				\
+  RS6000_BUILTIN_M (VSX_BUILTIN_ ## ENUM,		/* ENUM */	\
+		    "__builtin_vsx_" NAME,		/* NAME */	\
+		    RS6000_BTM_MMA,			/* MASK */	\
+		    (RS6000_BTC_ ## ATTR		/* ATTR */	\
+		     | RS6000_BTC_BINARY				\
+		     | RS6000_BTC_GIMPLE),				\
+		    CODE_FOR_nothing)			/* ICODE */
+
+#define BU_MMA_PAIR_ST(ENUM, NAME, ATTR)				\
+  RS6000_BUILTIN_M (VSX_BUILTIN_ ## ENUM,		/* ENUM */	\
+		    "__builtin_vsx_" NAME,		/* NAME */	\
+		    RS6000_BTM_MMA,			/* MASK */	\
+		    (RS6000_BTC_ ## ATTR		/* ATTR */	\
+		     | RS6000_BTC_TERNARY				\
+		     | RS6000_BTC_VOID					\
+		     | RS6000_BTC_GIMPLE),				\
+		    CODE_FOR_nothing)			/* ICODE */
+
 /* ISA 2.05 (power6) convenience macros. */
 /* For functions that depend on the CMPB instruction */
 #define BU_P6_2(ENUM, NAME, ATTR, ICODE)				\
@@ -3253,6 +3272,9 @@ BU_SPECIAL_X (RS6000_BUILTIN_CFSTRING, "__builtin_cfstring", RS6000_BTM_ALWAYS,
 BU_P10V_VSX_1 (XVCVBF16SPN,	 "xvcvbf16spn",	MISC, vsx_xvcvbf16spn)
 BU_P10V_VSX_1 (XVCVSPBF16,	    "xvcvspbf16",	MISC, vsx_xvcvspbf16)
 
+BU_MMA_PAIR_LD (LXVP,	    "lxvp",		MISC)
+BU_MMA_PAIR_ST (STXVP,	    "stxvp",		PAIR)
+
 BU_MMA_1 (XXMFACC,	    "xxmfacc",		QUAD, mma_xxmfacc)
 BU_MMA_1 (XXMTACC,	    "xxmtacc",		QUAD, mma_xxmtacc)
 BU_MMA_1 (XXSETACCZ,	    "xxsetaccz",	MISC, mma_xxsetaccz)
diff --git a/gcc/config/rs6000/rs6000-call.c b/gcc/config/rs6000/rs6000-call.c
index b67789845a5..6115e3b34d9 100644
--- a/gcc/config/rs6000/rs6000-call.c
+++ b/gcc/config/rs6000/rs6000-call.c
@@ -11913,6 +11913,32 @@ rs6000_gimple_fold_mma_builtin (gimple_stmt_iterator *gsi)
       gsi_replace_with_seq (gsi, new_seq, true);
       return true;
     }
+  else if (fncode == VSX_BUILTIN_LXVP)
+    {
+      push_gimplify_context (true);
+      tree offset = gimple_call_arg (stmt, 0);
+      tree ptr = gimple_call_arg (stmt, 1);
+      tree lhs = gimple_call_lhs (stmt);
+      tree mem = build_simple_mem_ref (build2 (POINTER_PLUS_EXPR,
+					       TREE_TYPE (ptr), ptr, offset));
+      gimplify_assign (lhs, mem, &new_seq);
+      pop_gimplify_context (NULL);
+      gsi_replace_with_seq (gsi, new_seq, true);
+      return true;
+    }
+  else if (fncode == VSX_BUILTIN_STXVP)
+    {
+      push_gimplify_context (true);
+      tree src = gimple_call_arg (stmt, 0);
+      tree offset = gimple_call_arg (stmt, 1);
+      tree ptr = gimple_call_arg (stmt, 2);
+      tree mem = build_simple_mem_ref (build2 (POINTER_PLUS_EXPR,
+					       TREE_TYPE (ptr), ptr, offset));
+      gimplify_assign (mem, src, &new_seq);
+      pop_gimplify_context (NULL);
+      gsi_replace_with_seq (gsi, new_seq, true);
+      return true;
+    }
 
   /* Convert this built-in into an internal version that uses pass-by-value
      arguments.  The internal built-in follows immediately after this one.  */
@@ -14264,11 +14290,14 @@ mma_init_builtins (void)
       if (gimple_func)
 	{
 	  gcc_assert (icode == CODE_FOR_nothing);
-	  op[nopnds++] = void_type_node;
 	  /* Some MMA built-ins that are expanded into gimple are converted
 	     into internal MMA built-ins that are expanded into rtl.
 	     The internal built-in follows immediately after this built-in.  */
-	  icode = d[1].icode;
+	  if (d[1].icode != CODE_FOR_nothing)
+	    {
+	      op[nopnds++] = void_type_node;
+	      icode = d[1].icode;
+	    }
 	}
       else
 	{
@@ -14291,6 +14320,19 @@ mma_init_builtins (void)
 	  else
 	    op[nopnds++] = build_pointer_type (vector_pair_type_node);
 	}
+      else if (d->code == VSX_BUILTIN_LXVP)
+	{
+	  op[nopnds++] = vector_pair_type_node;
+	  op[nopnds++] = sizetype;
+	  op[nopnds++] = build_pointer_type (vector_pair_type_node);
+	}
+      else if (d->code == VSX_BUILTIN_STXVP)
+	{
+	  op[nopnds++] = void_type_node;
+	  op[nopnds++] = vector_pair_type_node;
+	  op[nopnds++] = sizetype;
+	  op[nopnds++] = build_pointer_type (vector_pair_type_node);
+	}
       else
 	{
 	  /* This is a normal MMA built-in function.  */
@@ -14781,6 +14823,16 @@ builtin_function_type (machine_mode mode_ret, machine_mode mode_arg0,
       h.uns_p[2] = 1;
       break;
 
+    case VSX_BUILTIN_LXVP:
+      h.uns_p[0] = 1;
+      h.uns_p[2] = 1;
+      break;
+
+    case VSX_BUILTIN_STXVP:
+      h.uns_p[1] = 1;
+      h.uns_p[3] = 1;
+      break;
+
     default:
       break;
     }
diff --git a/gcc/doc/extend.texi b/gcc/doc/extend.texi
index 8fc66d626d8..b83cd4919bb 100644
--- a/gcc/doc/extend.texi
+++ b/gcc/doc/extend.texi
@@ -20731,6 +20731,9 @@ void __builtin_vsx_disassemble_pair (void *, __vector_pair *);
 
 vec_t __builtin_vsx_xvcvspbf16 (vec_t);
 vec_t __builtin_vsx_xvcvbf16spn (vec_t);
+
+__vector_pair __builtin_vsx_lxvp (size_t, __vector_pair *);
+void __builtin_vsx_stxvp (__vector_pair, size_t, __vector_pair *);
 @end smallexample
 
 @node PRU Built-in Functions
diff --git a/gcc/testsuite/gcc.target/powerpc/mma-builtin-7.c b/gcc/testsuite/gcc.target/powerpc/mma-builtin-7.c
new file mode 100644
index 00000000000..c661a4b84bc
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/mma-builtin-7.c
@@ -0,0 +1,26 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target power10_ok } */
+/* { dg-options "-mdejagnu-cpu=power10 -O2" } */
+
+void
+foo (__vector_pair *dst, __vector_pair *src, long idx)
+{
+  dst[0] = __builtin_vsx_lxvp (0, src);
+  dst[2] = __builtin_vsx_lxvp (32, src);
+  dst[4] = __builtin_vsx_lxvp (64, src);
+  /* Non-constant offset should generate a lxvpx.  */
+  dst[6] = __builtin_vsx_lxvp (idx, src);
+  /* Non-aligned offset should generate a plxvp.  */
+  dst[8] = __builtin_vsx_lxvp (257, src);
+}
+
+#if !__has_builtin (__builtin_vsx_lxvp)
+#  error "__has_builtin (__builtin_vsx_lxvp) failed"
+#endif
+
+/* { dg-final { scan-assembler-not {\mlxv\M} } } */
+/* { dg-final { scan-assembler-not {\mstxv\M} } } */
+/* { dg-final { scan-assembler-times {\mlxvp\M} 3 } } */
+/* { dg-final { scan-assembler-times {\mlxvpx\M} 1 } } */
+/* { dg-final { scan-assembler-times {\mplxvp\M} 1 } } */
+/* { dg-final { scan-assembler-times {\mstxvp\M} 5 } } */
diff --git a/gcc/testsuite/gcc.target/powerpc/mma-builtin-8.c b/gcc/testsuite/gcc.target/powerpc/mma-builtin-8.c
new file mode 100644
index 00000000000..af29e479f83
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/mma-builtin-8.c
@@ -0,0 +1,27 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target power10_ok } */
+/* { dg-options "-mdejagnu-cpu=power10 -O2" } */
+
+void
+foo (__vector_pair *dst, __vector_pair *src, long idx)
+{
+  __vector_pair pair = *src;
+  __builtin_vsx_stxvp (pair, 0, dst);
+  __builtin_vsx_stxvp (pair, 32, dst);
+  __builtin_vsx_stxvp (pair, 64, dst);
+  /* Non-constant offset should generate a stxvpx.  */
+  __builtin_vsx_stxvp (pair, idx, dst);
+  /* Non-aligned offset should generate a pstxvp.  */
+  __builtin_vsx_stxvp (pair, 257, dst);
+}
+
+#if !__has_builtin (__builtin_vsx_stxvp)
+#  error "__has_builtin (__builtin_vsx_stxvp) failed"
+#endif
+
+/* { dg-final { scan-assembler-not {\mlxv\M} } } */
+/* { dg-final { scan-assembler-not {\mstxv\M} } } */
+/* { dg-final { scan-assembler-times {\mlxvp\M} 1 } } */
+/* { dg-final { scan-assembler-times {\mstxvp\M} 3 } } */
+/* { dg-final { scan-assembler-times {\mstxvpx\M} 1 } } */
+/* { dg-final { scan-assembler-times {\mpstxvp\M} 1 } } */
Peter Bergner July 2, 2021, 6:32 p.m. UTC | #5
On 7/1/21 2:48 PM, Peter Bergner wrote:
> On 7/1/21 1:01 PM, Segher Boessenkool wrote:
>> The patch is okay for trunk.
> 
> Below is the updated patch which is bootstrapping now.  I'll commit it
> if it shows no regressions.

Testing was clean so I pushed it to trunk.



>> For the backports it is okay if Bill has looked at this patch as well.
> 
> Bill has not seen the patch.  I'm not sure when/if he'll get a chance
> to either.

Since Bill won't have a chance to look at this, ok for the backport
after a couple of days on trunk?  Bill's builtin rewrite is only targeted
for trunk, so it doesn't affect the backport at all.

Peter
Segher Boessenkool July 3, 2021, 3:56 p.m. UTC | #6
On Fri, Jul 02, 2021 at 01:32:45PM -0500, Peter Bergner wrote:
> On 7/1/21 2:48 PM, Peter Bergner wrote:
> > On 7/1/21 1:01 PM, Segher Boessenkool wrote:
> >> The patch is okay for trunk.
> > 
> > Below is the updated patch which is bootstrapping now.  I'll commit it
> > if it shows no regressions.
> 
> Testing was clean so I pushed it to trunk.
> 
> 
> 
> >> For the backports it is okay if Bill has looked at this patch as well.
> > 
> > Bill has not seen the patch.  I'm not sure when/if he'll get a chance
> > to either.
> 
> Since Bill won't have a chance to look at this, ok for the backport
> after a couple of days on trunk?  Bill's builtin rewrite is only targeted
> for trunk, so it doesn't affect the backport at all.

Okay.  Thanks!


Segher
Li, Pan2 via Gcc-patches July 4, 2021, 3:28 p.m. UTC | #7
On 7/3/21 10:56 AM, Segher Boessenkool wrote:
> On Fri, Jul 02, 2021 at 01:32:45PM -0500, Peter Bergner wrote:
>> On 7/1/21 2:48 PM, Peter Bergner wrote:
>>> On 7/1/21 1:01 PM, Segher Boessenkool wrote:
>>>> The patch is okay for trunk.
>>> Below is the updated patch which is bootstrapping now.  I'll commit it
>>> if it shows no regressions.
>> Testing was clean so I pushed it to trunk.
>>
>>
>>
>>>> For the backports it is okay if Bill has looked at this patch as well.
>>> Bill has not seen the patch.  I'm not sure when/if he'll get a chance
>>> to either.
>> Since Bill won't have a chance to look at this, ok for the backport
>> after a couple of days on trunk?  Bill's builtin rewrite is only targeted
>> for trunk, so it doesn't affect the backport at all.
> Okay.  Thanks!

I will take a look hopefully today.  Mostly looks good but I want to 
check a couple of things before signing off.

Bill

>
>
> Segher
Li, Pan2 via Gcc-patches July 4, 2021, 4:32 p.m. UTC | #8
Hi Peter,

On 7/1/21 2:48 PM, Peter Bergner via Gcc-patches wrote:
> gcc/
> 	* config/rs6000/rs6000-builtin.def (BU_MMA_PAIR_LD, BU_MMA_PAIR_ST):
> 	New macros.
> 	(__builtin_vsx_lxvp, __builtin_vsx_stxvp): New built-ins.
> 	* config/rs6000/rs6000-call.c (rs6000_gimple_fold_mma_builtin): Expand
> 	lxvp and stxvp built-ins.
> 	(mma_init_builtins): Handle lxvp and stxvp built-ins.
> 	(builtin_function_type): Likewise.
> 	* doc/extend.texi (__builtin_vsx_lxvp, __builtin_mma_stxvp): Document.
>
> gcc/testsuite/
> 	* gcc.target/powerpc/mma-builtin-7.c: New test.
> 	* gcc.target/powerpc/mma-builtin-8.c: New test.
>
>
> diff --git a/gcc/config/rs6000/rs6000-builtin.def b/gcc/config/rs6000/rs6000-builtin.def
> index d7ce4de421e..6270444ef70 100644
> --- a/gcc/config/rs6000/rs6000-builtin.def
> +++ b/gcc/config/rs6000/rs6000-builtin.def
> @@ -484,6 +484,25 @@
>   		     | RS6000_BTC_SENARY),				\
>   		    CODE_FOR_ ## ICODE)			/* ICODE */
>
> +#define BU_MMA_PAIR_LD(ENUM, NAME, ATTR)				\
> +  RS6000_BUILTIN_M (VSX_BUILTIN_ ## ENUM,		/* ENUM */	\
> +		    "__builtin_vsx_" NAME,		/* NAME */	\
> +		    RS6000_BTM_MMA,			/* MASK */	\
> +		    (RS6000_BTC_ ## ATTR		/* ATTR */	\
> +		     | RS6000_BTC_BINARY				\
> +		     | RS6000_BTC_GIMPLE),				\
> +		    CODE_FOR_nothing)			/* ICODE */
> +
> +#define BU_MMA_PAIR_ST(ENUM, NAME, ATTR)				\
> +  RS6000_BUILTIN_M (VSX_BUILTIN_ ## ENUM,		/* ENUM */	\
> +		    "__builtin_vsx_" NAME,		/* NAME */	\
> +		    RS6000_BTM_MMA,			/* MASK */	\
> +		    (RS6000_BTC_ ## ATTR		/* ATTR */	\
> +		     | RS6000_BTC_TERNARY				\
> +		     | RS6000_BTC_VOID					\
> +		     | RS6000_BTC_GIMPLE),				\
> +		    CODE_FOR_nothing)			/* ICODE */
> +
>   /* ISA 2.05 (power6) convenience macros. */
>   /* For functions that depend on the CMPB instruction */
>   #define BU_P6_2(ENUM, NAME, ATTR, ICODE)				\
> @@ -3253,6 +3272,9 @@ BU_SPECIAL_X (RS6000_BUILTIN_CFSTRING, "__builtin_cfstring", RS6000_BTM_ALWAYS,
>   BU_P10V_VSX_1 (XVCVBF16SPN,	 "xvcvbf16spn",	MISC, vsx_xvcvbf16spn)
>   BU_P10V_VSX_1 (XVCVSPBF16,	    "xvcvspbf16",	MISC, vsx_xvcvspbf16)
>
> +BU_MMA_PAIR_LD (LXVP,	    "lxvp",		MISC)
> +BU_MMA_PAIR_ST (STXVP,	    "stxvp",		PAIR)
> +
>   BU_MMA_1 (XXMFACC,	    "xxmfacc",		QUAD, mma_xxmfacc)
>   BU_MMA_1 (XXMTACC,	    "xxmtacc",		QUAD, mma_xxmtacc)
>   BU_MMA_1 (XXSETACCZ,	    "xxsetaccz",	MISC, mma_xxsetaccz)
> diff --git a/gcc/config/rs6000/rs6000-call.c b/gcc/config/rs6000/rs6000-call.c
> index b67789845a5..6115e3b34d9 100644
> --- a/gcc/config/rs6000/rs6000-call.c
> +++ b/gcc/config/rs6000/rs6000-call.c
> @@ -11913,6 +11913,32 @@ rs6000_gimple_fold_mma_builtin (gimple_stmt_iterator *gsi)
>         gsi_replace_with_seq (gsi, new_seq, true);
>         return true;
>       }
> +  else if (fncode == VSX_BUILTIN_LXVP)
> +    {
> +      push_gimplify_context (true);
> +      tree offset = gimple_call_arg (stmt, 0);
> +      tree ptr = gimple_call_arg (stmt, 1);
> +      tree lhs = gimple_call_lhs (stmt);
> +      tree mem = build_simple_mem_ref (build2 (POINTER_PLUS_EXPR,
> +					       TREE_TYPE (ptr), ptr, offset));
> +      gimplify_assign (lhs, mem, &new_seq);
> +      pop_gimplify_context (NULL);
> +      gsi_replace_with_seq (gsi, new_seq, true);
> +      return true;
> +    }
> +  else if (fncode == VSX_BUILTIN_STXVP)
> +    {
> +      push_gimplify_context (true);
> +      tree src = gimple_call_arg (stmt, 0);
> +      tree offset = gimple_call_arg (stmt, 1);
> +      tree ptr = gimple_call_arg (stmt, 2);
> +      tree mem = build_simple_mem_ref (build2 (POINTER_PLUS_EXPR,
> +					       TREE_TYPE (ptr), ptr, offset));
> +      gimplify_assign (mem, src, &new_seq);
> +      pop_gimplify_context (NULL);
> +      gsi_replace_with_seq (gsi, new_seq, true);
> +      return true;
> +    }
>
>     /* Convert this built-in into an internal version that uses pass-by-value
>        arguments.  The internal built-in follows immediately after this one.  */
> @@ -14264,11 +14290,14 @@ mma_init_builtins (void)
>         if (gimple_func)
>   	{
>   	  gcc_assert (icode == CODE_FOR_nothing);
> -	  op[nopnds++] = void_type_node;
>   	  /* Some MMA built-ins that are expanded into gimple are converted
>   	     into internal MMA built-ins that are expanded into rtl.
>   	     The internal built-in follows immediately after this built-in.  */
> -	  icode = d[1].icode;
> +	  if (d[1].icode != CODE_FOR_nothing)
> +	    {
> +	      op[nopnds++] = void_type_node;
> +	      icode = d[1].icode;
> +	    }

This hunk bothers me.  The new macros BU_MMA_PAIR_LD and BU_MMA_PAIR_ST 
define only one builtin, not two.  They are both flagged as 
RS6000_BTC_GIMPLE.  The use of d[1] in this case is suspect and 
fragile.  It appears you're relying on the built-in following each of 
__builtin_vsx_lxvp and __builtin_vsx_stxvp to exist, as otherwise d[1] 
will be out of bounds.  It's otherwise rather meaningless because you 
later handle VSX_BUILTIN_LXVP and VSX_BUILTIN_STXVP separately.  Can you 
please replace this with something less fragile?  Perhaps leave the 
gimple_func leg alone, but first test for these two builtins and do the 
right thing for them instead.

Otherwise this LGTM.

Thanks,
Bill

>   	}
>         else
>   	{
> @@ -14291,6 +14320,19 @@ mma_init_builtins (void)
>   	  else
>   	    op[nopnds++] = build_pointer_type (vector_pair_type_node);
>   	}
> +      else if (d->code == VSX_BUILTIN_LXVP)
> +	{
> +	  op[nopnds++] = vector_pair_type_node;
> +	  op[nopnds++] = sizetype;
> +	  op[nopnds++] = build_pointer_type (vector_pair_type_node);
> +	}
> +      else if (d->code == VSX_BUILTIN_STXVP)
> +	{
> +	  op[nopnds++] = void_type_node;
> +	  op[nopnds++] = vector_pair_type_node;
> +	  op[nopnds++] = sizetype;
> +	  op[nopnds++] = build_pointer_type (vector_pair_type_node);
> +	}
>         else
>   	{
>   	  /* This is a normal MMA built-in function.  */
> @@ -14781,6 +14823,16 @@ builtin_function_type (machine_mode mode_ret, machine_mode mode_arg0,
>         h.uns_p[2] = 1;
>         break;
>
> +    case VSX_BUILTIN_LXVP:
> +      h.uns_p[0] = 1;
> +      h.uns_p[2] = 1;
> +      break;
> +
> +    case VSX_BUILTIN_STXVP:
> +      h.uns_p[1] = 1;
> +      h.uns_p[3] = 1;
> +      break;
> +
>       default:
>         break;
>       }
> diff --git a/gcc/doc/extend.texi b/gcc/doc/extend.texi
> index 8fc66d626d8..b83cd4919bb 100644
> --- a/gcc/doc/extend.texi
> +++ b/gcc/doc/extend.texi
> @@ -20731,6 +20731,9 @@ void __builtin_vsx_disassemble_pair (void *, __vector_pair *);
>
>   vec_t __builtin_vsx_xvcvspbf16 (vec_t);
>   vec_t __builtin_vsx_xvcvbf16spn (vec_t);
> +
> +__vector_pair __builtin_vsx_lxvp (size_t, __vector_pair *);
> +void __builtin_vsx_stxvp (__vector_pair, size_t, __vector_pair *);
>   @end smallexample
>
>   @node PRU Built-in Functions
> diff --git a/gcc/testsuite/gcc.target/powerpc/mma-builtin-7.c b/gcc/testsuite/gcc.target/powerpc/mma-builtin-7.c
> new file mode 100644
> index 00000000000..c661a4b84bc
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/powerpc/mma-builtin-7.c
> @@ -0,0 +1,26 @@
> +/* { dg-do compile } */
> +/* { dg-require-effective-target power10_ok } */
> +/* { dg-options "-mdejagnu-cpu=power10 -O2" } */
> +
> +void
> +foo (__vector_pair *dst, __vector_pair *src, long idx)
> +{
> +  dst[0] = __builtin_vsx_lxvp (0, src);
> +  dst[2] = __builtin_vsx_lxvp (32, src);
> +  dst[4] = __builtin_vsx_lxvp (64, src);
> +  /* Non-constant offset should generate a lxvpx.  */
> +  dst[6] = __builtin_vsx_lxvp (idx, src);
> +  /* Non-aligned offset should generate a plxvp.  */
> +  dst[8] = __builtin_vsx_lxvp (257, src);
> +}
> +
> +#if !__has_builtin (__builtin_vsx_lxvp)
> +#  error "__has_builtin (__builtin_vsx_lxvp) failed"
> +#endif
> +
> +/* { dg-final { scan-assembler-not {\mlxv\M} } } */
> +/* { dg-final { scan-assembler-not {\mstxv\M} } } */
> +/* { dg-final { scan-assembler-times {\mlxvp\M} 3 } } */
> +/* { dg-final { scan-assembler-times {\mlxvpx\M} 1 } } */
> +/* { dg-final { scan-assembler-times {\mplxvp\M} 1 } } */
> +/* { dg-final { scan-assembler-times {\mstxvp\M} 5 } } */
> diff --git a/gcc/testsuite/gcc.target/powerpc/mma-builtin-8.c b/gcc/testsuite/gcc.target/powerpc/mma-builtin-8.c
> new file mode 100644
> index 00000000000..af29e479f83
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/powerpc/mma-builtin-8.c
> @@ -0,0 +1,27 @@
> +/* { dg-do compile } */
> +/* { dg-require-effective-target power10_ok } */
> +/* { dg-options "-mdejagnu-cpu=power10 -O2" } */
> +
> +void
> +foo (__vector_pair *dst, __vector_pair *src, long idx)
> +{
> +  __vector_pair pair = *src;
> +  __builtin_vsx_stxvp (pair, 0, dst);
> +  __builtin_vsx_stxvp (pair, 32, dst);
> +  __builtin_vsx_stxvp (pair, 64, dst);
> +  /* Non-constant offset should generate a stxvpx.  */
> +  __builtin_vsx_stxvp (pair, idx, dst);
> +  /* Non-aligned offset should generate a pstxvp.  */
> +  __builtin_vsx_stxvp (pair, 257, dst);
> +}
> +
> +#if !__has_builtin (__builtin_vsx_stxvp)
> +#  error "__has_builtin (__builtin_vsx_stxvp) failed"
> +#endif
> +
> +/* { dg-final { scan-assembler-not {\mlxv\M} } } */
> +/* { dg-final { scan-assembler-not {\mstxv\M} } } */
> +/* { dg-final { scan-assembler-times {\mlxvp\M} 1 } } */
> +/* { dg-final { scan-assembler-times {\mstxvp\M} 3 } } */
> +/* { dg-final { scan-assembler-times {\mstxvpx\M} 1 } } */
> +/* { dg-final { scan-assembler-times {\mpstxvp\M} 1 } } */
Peter Bergner July 6, 2021, 7:29 p.m. UTC | #9
On 7/4/21 11:32 AM, Bill Schmidt wrote
> On 7/1/21 2:48 PM, Peter Bergner via Gcc-patches wrote:
>> +      if (d[1].icode != CODE_FOR_nothing)
>> +        {
>> +          op[nopnds++] = void_type_node;
>> +          icode = d[1].icode;
>> +        }


> This hunk bothers me.  The new macros BU_MMA_PAIR_LD and BU_MMA_PAIR_ST
> define only one builtin, not two.

Correct, unlike the other MMA builtins, they do not expand from gimple
into internal builtins that are expanded at rtl time.  They instead are
expanded directly into gimple memory references.


> They are both flagged as RS6000_BTC_GIMPLE.

Yes, because they are supposed to be expanded at gimple time.

Note, other non-MMA builtins that expand at gimple time should
probably also set this flag.  That would allow an early exit
to be added to rs6000_gimple_fold_builtin().  I can add that
as a TODO.




> The use of d[1] in this case is suspect and fragile.  It appears you're relying
> on the built-in following each of __builtin_vsx_lxvp and __builtin_vsx_stxvp
> to exist, as otherwise d[1] will be out of bounds.  

Ok, if they had been added as the last builtins in the array (they aren't),
that would be a problem.


> Can you please replace this with something less fragile?
> Perhaps leave the gimple_func leg alone, but first test for these two
> builtins and do the right thing for them instead.

These are RS6000_BTC_GIMPLE, so I think they should be handled within
the "if (gimple_func) ..." leg.  That said, how about the following
change to resolve the issue you have?  I'll kick off a bootstrap and
regtest for this change.

Peter


	* config/rs6000/rs6000-call.c (mma_init_builtins): Use VSX_BUILTIN_LXVP
	and VSX_BUILTIN_STXVP.

diff --git a/gcc/config/rs6000/rs6000-call.c b/gcc/config/rs6000/rs6000-call.c
index 6115e3b34d9..904e104c058 100644
--- a/gcc/config/rs6000/rs6000-call.c
+++ b/gcc/config/rs6000/rs6000-call.c
@@ -14293,7 +14293,8 @@ mma_init_builtins (void)
          /* Some MMA built-ins that are expanded into gimple are converted
             into internal MMA built-ins that are expanded into rtl.
             The internal built-in follows immediately after this built-in.  */
-         if (d[1].icode != CODE_FOR_nothing)
+         if (d->code != VSX_BUILTIN_LXVP
+             && d->code != VSX_BUILTIN_STXVP)
            {
              op[nopnds++] = void_type_node;
              icode = d[1].icode;
Li, Pan2 via Gcc-patches July 6, 2021, 8:08 p.m. UTC | #10
On 7/6/21 2:29 PM, Peter Bergner wrote:
> On 7/4/21 11:32 AM, Bill Schmidt wrote
>> On 7/1/21 2:48 PM, Peter Bergner via Gcc-patches wrote:
>>> +      if (d[1].icode != CODE_FOR_nothing)
>>> +        {
>>> +          op[nopnds++] = void_type_node;
>>> +          icode = d[1].icode;
>>> +        }
>
>> This hunk bothers me.  The new macros BU_MMA_PAIR_LD and BU_MMA_PAIR_ST
>> define only one builtin, not two.
> Correct, unlike the other MMA builtins, they do not expand from gimple
> into internal builtins that are expanded at rtl time.  They instead are
> expanded directly into gimple memory references.
>
>
>> They are both flagged as RS6000_BTC_GIMPLE.
> Yes, because they are supposed to be expanded at gimple time.
>
> Note, other non-MMA builtins that expand at gimple time should
> probably also set this flag.  That would allow an early exit
> to be added to rs6000_gimple_fold_builtin().  I can add that
> as a TODO.
>
>
>
>
>> The use of d[1] in this case is suspect and fragile.  It appears you're relying
>> on the built-in following each of __builtin_vsx_lxvp and __builtin_vsx_stxvp
>> to exist, as otherwise d[1] will be out of bounds.
> Ok, if they had been added as the last builtins in the array (they aren't),
> that would be a problem.
>
>
>> Can you please replace this with something less fragile?
>> Perhaps leave the gimple_func leg alone, but first test for these two
>> builtins and do the right thing for them instead.
> These are RS6000_BTC_GIMPLE, so I think they should be handled within
> the "if (gimple_func) ..." leg.  That said, how about the following
> change to resolve the issue you have?  I'll kick off a bootstrap and
> regtest for this change.


Thanks, yes, that works for me!

Bill

>
> Peter
>
>
> 	* config/rs6000/rs6000-call.c (mma_init_builtins): Use VSX_BUILTIN_LXVP
> 	and VSX_BUILTIN_STXVP.
>
> diff --git a/gcc/config/rs6000/rs6000-call.c b/gcc/config/rs6000/rs6000-call.c
> index 6115e3b34d9..904e104c058 100644
> --- a/gcc/config/rs6000/rs6000-call.c
> +++ b/gcc/config/rs6000/rs6000-call.c
> @@ -14293,7 +14293,8 @@ mma_init_builtins (void)
>            /* Some MMA built-ins that are expanded into gimple are converted
>               into internal MMA built-ins that are expanded into rtl.
>               The internal built-in follows immediately after this built-in.  */
> -         if (d[1].icode != CODE_FOR_nothing)
> +         if (d->code != VSX_BUILTIN_LXVP
> +             && d->code != VSX_BUILTIN_STXVP)
>              {
>                op[nopnds++] = void_type_node;
>                icode = d[1].icode;
>
>
>
Peter Bergner July 6, 2021, 9:13 p.m. UTC | #11
On 7/6/21 3:08 PM, Bill Schmidt wrote:
> On 7/6/21 2:29 PM, Peter Bergner wrote:
>> These are RS6000_BTC_GIMPLE, so I think they should be handled within
>> the "if (gimple_func) ..." leg.  That said, how about the following
>> change to resolve the issue you have?  I'll kick off a bootstrap and
>> regtest for this change.
> 
> 
> Thanks, yes, that works for me!

Great, thanks!


Segher, the patch was clean on testing.  Ok with you too?

Peter
Segher Boessenkool July 6, 2021, 10:05 p.m. UTC | #12
On Tue, Jul 06, 2021 at 04:13:06PM -0500, Peter Bergner wrote:
> On 7/6/21 3:08 PM, Bill Schmidt wrote:
> > On 7/6/21 2:29 PM, Peter Bergner wrote:
> >> These are RS6000_BTC_GIMPLE, so I think they should be handled within
> >> the "if (gimple_func) ..." leg.  That said, how about the following
> >> change to resolve the issue you have?  I'll kick off a bootstrap and
> >> regtest for this change.
> > 
> > 
> > Thanks, yes, that works for me!
> 
> Great, thanks!
> 
> 
> Segher, the patch was clean on testing.  Ok with you too?

Of course.  Okay for trunk and backports (you might want to hurry it
for 11.2).  Thanks to you both!


Segher
Peter Bergner July 7, 2021, 4:55 p.m. UTC | #13
On 7/6/21 5:05 PM, Segher Boessenkool wrote:
> On Tue, Jul 06, 2021 at 04:13:06PM -0500, Peter Bergner wrote:
>> On 7/6/21 3:08 PM, Bill Schmidt wrote:
>>> On 7/6/21 2:29 PM, Peter Bergner wrote:
>>>> These are RS6000_BTC_GIMPLE, so I think they should be handled within
>>>> the "if (gimple_func) ..." leg.  That said, how about the following
>>>> change to resolve the issue you have?  I'll kick off a bootstrap and
>>>> regtest for this change.
>>>
>>>
>>> Thanks, yes, that works for me!
>>
>> Great, thanks!
>>
>>
>> Segher, the patch was clean on testing.  Ok with you too?
> 
> Of course.  Okay for trunk and backports (you might want to hurry it
> for 11.2).  Thanks to you both!

Ok, committed to trunk.  I'm testing the GCC 11 backport now and will
commit if clean.  Thanks.


Peter
Peter Bergner July 7, 2021, 6:35 p.m. UTC | #14
On 7/7/21 11:55 AM, Peter Bergner wrote:
> On 7/6/21 5:05 PM, Segher Boessenkool wrote:
>> On Tue, Jul 06, 2021 at 04:13:06PM -0500, Peter Bergner wrote:
>>> On 7/6/21 3:08 PM, Bill Schmidt wrote:
>>>> On 7/6/21 2:29 PM, Peter Bergner wrote:
>>>>> These are RS6000_BTC_GIMPLE, so I think they should be handled within
>>>>> the "if (gimple_func) ..." leg.  That said, how about the following
>>>>> change to resolve the issue you have?  I'll kick off a bootstrap and
>>>>> regtest for this change.
>>>>
>>>>
>>>> Thanks, yes, that works for me!
>>>
>>> Great, thanks!
>>>
>>>
>>> Segher, the patch was clean on testing.  Ok with you too?
>>
>> Of course.  Okay for trunk and backports (you might want to hurry it
>> for 11.2).  Thanks to you both!
> 
> Ok, committed to trunk.  I'm testing the GCC 11 backport now and will
> commit if clean.  Thanks.

Backport testing was clean, so pushed to the GCC 11 branch.
I'll work on the GCC 10 backport in a day or two.

Peter
diff mbox series

Patch

diff --git a/gcc/config/rs6000/rs6000-builtin.def b/gcc/config/rs6000/rs6000-builtin.def
index d7ce4de421e..a42c3cbd55b 100644
--- a/gcc/config/rs6000/rs6000-builtin.def
+++ b/gcc/config/rs6000/rs6000-builtin.def
@@ -484,6 +484,25 @@ 
 		     | RS6000_BTC_SENARY),				\
 		    CODE_FOR_ ## ICODE)			/* ICODE */
 
+#define BU_MMA_LD(ENUM, NAME, ATTR)					\
+  RS6000_BUILTIN_M (VSX_BUILTIN_ ## ENUM,		/* ENUM */	\
+		    "__builtin_vsx_" NAME,		/* NAME */	\
+		    RS6000_BTM_MMA,			/* MASK */	\
+		    (RS6000_BTC_ ## ATTR		/* ATTR */	\
+		     | RS6000_BTC_BINARY				\
+		     | RS6000_BTC_GIMPLE),				\
+		    CODE_FOR_nothing)			/* ICODE */
+
+#define BU_MMA_ST(ENUM, NAME, ATTR)					\
+  RS6000_BUILTIN_M (VSX_BUILTIN_ ## ENUM,		/* ENUM */	\
+		    "__builtin_vsx_" NAME,		/* NAME */	\
+		    RS6000_BTM_MMA,			/* MASK */	\
+		    (RS6000_BTC_ ## ATTR		/* ATTR */	\
+		     | RS6000_BTC_TERNARY				\
+		     | RS6000_BTC_VOID					\
+		     | RS6000_BTC_GIMPLE),				\
+		    CODE_FOR_nothing)			/* ICODE */
+
 /* ISA 2.05 (power6) convenience macros. */
 /* For functions that depend on the CMPB instruction */
 #define BU_P6_2(ENUM, NAME, ATTR, ICODE)				\
@@ -3253,6 +3272,9 @@  BU_SPECIAL_X (RS6000_BUILTIN_CFSTRING, "__builtin_cfstring", RS6000_BTM_ALWAYS,
 BU_P10V_VSX_1 (XVCVBF16SPN,	 "xvcvbf16spn",	MISC, vsx_xvcvbf16spn)
 BU_P10V_VSX_1 (XVCVSPBF16,	    "xvcvspbf16",	MISC, vsx_xvcvspbf16)
 
+BU_MMA_LD (LXVP,	  "lxvp",	MISC)
+BU_MMA_ST (STXVP,	  "stxvp",	PAIR)
+
 BU_MMA_1 (XXMFACC,	    "xxmfacc",		QUAD, mma_xxmfacc)
 BU_MMA_1 (XXMTACC,	    "xxmtacc",		QUAD, mma_xxmtacc)
 BU_MMA_1 (XXSETACCZ,	    "xxsetaccz",	MISC, mma_xxsetaccz)
diff --git a/gcc/config/rs6000/rs6000-call.c b/gcc/config/rs6000/rs6000-call.c
index b67789845a5..ebc03da9b24 100644
--- a/gcc/config/rs6000/rs6000-call.c
+++ b/gcc/config/rs6000/rs6000-call.c
@@ -11913,6 +11913,30 @@  rs6000_gimple_fold_mma_builtin (gimple_stmt_iterator *gsi)
       gsi_replace_with_seq (gsi, new_seq, true);
       return true;
     }
+  else if (fncode == VSX_BUILTIN_LXVP)
+    {
+      push_gimplify_context (true);
+      tree offset = gimple_call_arg (stmt, 0);
+      tree ptr = gimple_call_arg (stmt, 1);
+      tree lhs = gimple_call_lhs (stmt);
+      tree mem = build_simple_mem_ref (build2 (POINTER_PLUS_EXPR, TREE_TYPE (ptr), ptr, offset));
+      gimplify_assign (lhs, mem, &new_seq);
+      pop_gimplify_context (NULL);
+      gsi_replace_with_seq (gsi, new_seq, true);
+      return true;
+    }
+  else if (fncode == VSX_BUILTIN_STXVP)
+    {
+      push_gimplify_context (true);
+      tree src = gimple_call_arg (stmt, 0);
+      tree offset = gimple_call_arg (stmt, 1);
+      tree ptr = gimple_call_arg (stmt, 2);
+      tree mem = build_simple_mem_ref (build2 (POINTER_PLUS_EXPR, TREE_TYPE (ptr), ptr, offset));
+      gimplify_assign (mem, src, &new_seq);
+      pop_gimplify_context (NULL);
+      gsi_replace_with_seq (gsi, new_seq, true);
+      return true;
+    }
 
   /* Convert this built-in into an internal version that uses pass-by-value
      arguments.  The internal built-in follows immediately after this one.  */
@@ -14264,11 +14288,14 @@  mma_init_builtins (void)
       if (gimple_func)
 	{
 	  gcc_assert (icode == CODE_FOR_nothing);
-	  op[nopnds++] = void_type_node;
 	  /* Some MMA built-ins that are expanded into gimple are converted
 	     into internal MMA built-ins that are expanded into rtl.
 	     The internal built-in follows immediately after this built-in.  */
-	  icode = d[1].icode;
+	  if (d[1].icode != CODE_FOR_nothing)
+	    {
+	      op[nopnds++] = void_type_node;
+	      icode = d[1].icode;
+	    }
 	}
       else
 	{
@@ -14291,6 +14318,19 @@  mma_init_builtins (void)
 	  else
 	    op[nopnds++] = build_pointer_type (vector_pair_type_node);
 	}
+      else if (d->code == VSX_BUILTIN_LXVP)
+	{
+	  op[nopnds++] = vector_pair_type_node;
+	  op[nopnds++] = sizetype;
+	  op[nopnds++] = build_pointer_type (vector_pair_type_node);
+	}
+      else if (d->code == VSX_BUILTIN_STXVP)
+	{
+	  op[nopnds++] = void_type_node;
+	  op[nopnds++] = vector_pair_type_node;
+	  op[nopnds++] = sizetype;
+	  op[nopnds++] = build_pointer_type (vector_pair_type_node);
+	}
       else
 	{
 	  /* This is a normal MMA built-in function.  */
@@ -14781,6 +14821,16 @@  builtin_function_type (machine_mode mode_ret, machine_mode mode_arg0,
       h.uns_p[2] = 1;
       break;
 
+    case VSX_BUILTIN_LXVP:
+      h.uns_p[0] = 1;
+      h.uns_p[2] = 1;
+      break;
+
+    case VSX_BUILTIN_STXVP:
+      h.uns_p[1] = 1;
+      h.uns_p[3] = 1;
+      break;
+
     default:
       break;
     }
diff --git a/gcc/testsuite/gcc.target/powerpc/mma-builtin-7.c b/gcc/testsuite/gcc.target/powerpc/mma-builtin-7.c
new file mode 100644
index 00000000000..c661a4b84bc
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/mma-builtin-7.c
@@ -0,0 +1,26 @@ 
+/* { dg-do compile } */
+/* { dg-require-effective-target power10_ok } */
+/* { dg-options "-mdejagnu-cpu=power10 -O2" } */
+
+void
+foo (__vector_pair *dst, __vector_pair *src, long idx)
+{
+  dst[0] = __builtin_vsx_lxvp (0, src);
+  dst[2] = __builtin_vsx_lxvp (32, src);
+  dst[4] = __builtin_vsx_lxvp (64, src);
+  /* Non-constant offset should generate a lxvpx.  */
+  dst[6] = __builtin_vsx_lxvp (idx, src);
+  /* Non-aligned offset should generate a plxvp.  */
+  dst[8] = __builtin_vsx_lxvp (257, src);
+}
+
+#if !__has_builtin (__builtin_vsx_lxvp)
+#  error "__has_builtin (__builtin_vsx_lxvp) failed"
+#endif
+
+/* { dg-final { scan-assembler-not {\mlxv\M} } } */
+/* { dg-final { scan-assembler-not {\mstxv\M} } } */
+/* { dg-final { scan-assembler-times {\mlxvp\M} 3 } } */
+/* { dg-final { scan-assembler-times {\mlxvpx\M} 1 } } */
+/* { dg-final { scan-assembler-times {\mplxvp\M} 1 } } */
+/* { dg-final { scan-assembler-times {\mstxvp\M} 5 } } */
diff --git a/gcc/testsuite/gcc.target/powerpc/mma-builtin-8.c b/gcc/testsuite/gcc.target/powerpc/mma-builtin-8.c
new file mode 100644
index 00000000000..af29e479f83
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/mma-builtin-8.c
@@ -0,0 +1,27 @@ 
+/* { dg-do compile } */
+/* { dg-require-effective-target power10_ok } */
+/* { dg-options "-mdejagnu-cpu=power10 -O2" } */
+
+void
+foo (__vector_pair *dst, __vector_pair *src, long idx)
+{
+  __vector_pair pair = *src;
+  __builtin_vsx_stxvp (pair, 0, dst);
+  __builtin_vsx_stxvp (pair, 32, dst);
+  __builtin_vsx_stxvp (pair, 64, dst);
+  /* Non-constant offset should generate a stxvpx.  */
+  __builtin_vsx_stxvp (pair, idx, dst);
+  /* Non-aligned offset should generate a pstxvp.  */
+  __builtin_vsx_stxvp (pair, 257, dst);
+}
+
+#if !__has_builtin (__builtin_vsx_stxvp)
+#  error "__has_builtin (__builtin_vsx_stxvp) failed"
+#endif
+
+/* { dg-final { scan-assembler-not {\mlxv\M} } } */
+/* { dg-final { scan-assembler-not {\mstxv\M} } } */
+/* { dg-final { scan-assembler-times {\mlxvp\M} 1 } } */
+/* { dg-final { scan-assembler-times {\mstxvp\M} 3 } } */
+/* { dg-final { scan-assembler-times {\mstxvpx\M} 1 } } */
+/* { dg-final { scan-assembler-times {\mpstxvp\M} 1 } } */