omp: Fix simdclone arguments with veclen lower than simdlen [PR113040]

Message ID a30a8faa-44c2-4462-af52-97cad01d28dc@arm.com
State New

Commit Message

Andre Vieira (lists) Dec. 20, 2023, 3:17 p.m. UTC
This patch fixes an issue introduced by:
commit ea4a3d08f11a59319df7b750a955ac613a3f438a
Author: Andre Vieira <andre.simoesdiasvieira@arm.com>
Date:   Wed Nov 1 17:02:41 2023 +0000

     omp: Reorder call for TARGET_SIMD_CLONE_ADJUST

The problem was that, after that commit, we no longer added multiple
parameters for vector arguments where the veclen was lower than the simdlen.
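
To illustrate the expansion this refers to, here is a minimal standalone
sketch. It is a toy model, not the GCC code: ToyArg and toy_clone_params
are hypothetical names, strings stand in for GCC's tree types, and it
assumes simdlen is an exact multiple of veclen. Each vector argument
contributes simdlen / veclen parameters; every other argument
contributes one.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Toy stand-in for one clone argument (hypothetical, not a GCC type).
struct ToyArg
{
  bool is_vector;    // true for a vector argument, false otherwise
  std::string type;  // scalar element type name, e.g. "double"
  unsigned veclen;   // lanes per vector parameter (unused if !is_vector)
};

// Build the parameter type list for a clone with the given simdlen.
// Assumption: simdlen is an exact multiple of veclen for vector arguments.
std::vector<std::string>
toy_clone_params (const std::vector<ToyArg> &args, unsigned simdlen)
{
  std::vector<std::string> params;
  for (const ToyArg &a : args)
    {
      if (!a.is_vector)
        {
          // Non-vector arguments keep a single parameter.
          params.push_back (a.type);
          continue;
        }
      assert (a.veclen != 0 && simdlen % a.veclen == 0);
      // Number of vector parameters needed to cover all simdlen lanes.
      unsigned k = simdlen / a.veclen;
      std::string vtype = a.type + " x" + std::to_string (a.veclen);
      for (unsigned j = 0; j < k; ++j)
        params.push_back (vtype);
    }
  return params;
}
```

In this model, a single double argument with veclen 4 in a clone with
simdlen 8 yields two vector parameters, which is the case the regression
collapsed to one.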

gcc/ChangeLog:

	* omp-simd-clone.cc (simd_clone_adjust_argument_types): Add multiple
	vector arguments where simdlen is larger than veclen.

Bootstrapped and regression tested on x86_64-pc-linux-gnu and 
aarch64-unknown-linux-gnu.

OK for trunk?

PS: I'm struggling to add a testcase for this. The dumps don't show the
simdclone prototype, and I can't easily create a run-test because it
requires glibc.  The only option is a very flaky assembly scan test that
checks whether it writes to ymm4 (i.e. whether it passes enough
parameters), but I haven't added one because I don't think that's a good
idea.
PPS: maybe we ought to print the simdclone prototype when passing
-fdump-ipa-simdclone?

Comments

Richard Biener Dec. 21, 2023, 7:19 a.m. UTC | #1
On Wed, 20 Dec 2023, Andre Vieira (lists) wrote:

> This patch fixes an issue introduced by:
> commit ea4a3d08f11a59319df7b750a955ac613a3f438a
> Author: Andre Vieira <andre.simoesdiasvieira@arm.com>
> Date:   Wed Nov 1 17:02:41 2023 +0000
> 
>     omp: Reorder call for TARGET_SIMD_CLONE_ADJUST
> 
> The problem was that after this patch we no longer added multiple arguments
> for vector arguments where the veclen was lower than the simdlen.
> 
> gcc/ChangeLog:
> 
> 	* omp-simd-clone.cc (simd_clone_adjust_argument_types): Add multiple
> 	vector arguments where simdlen is larger than veclen.
> 
> Bootstrapped and regression tested on x86_64-pc-linux-gnu and
> aarch64-unknown-linux-gnu.
> 
> OK for trunk?

OK.

Richard.
Patch

diff --git a/gcc/omp-simd-clone.cc b/gcc/omp-simd-clone.cc
index 3fbe428125243bc02bd58f6e50a3333c773e8df8..5151fef3bcdaa76802184df43ba13b8709645fd4 100644
--- a/gcc/omp-simd-clone.cc
+++ b/gcc/omp-simd-clone.cc
@@ -781,6 +781,7 @@  simd_clone_adjust_argument_types (struct cgraph_node *node)
   struct cgraph_simd_clone *sc = node->simdclone;
   unsigned i, k;
   poly_uint64 veclen;
+  auto_vec<tree> new_params;
 
   for (i = 0; i < sc->nargs; ++i)
     {
@@ -798,9 +799,11 @@  simd_clone_adjust_argument_types (struct cgraph_node *node)
       switch (sc->args[i].arg_type)
 	{
 	default:
+	  new_params.safe_push (parm_type);
 	  break;
 	case SIMD_CLONE_ARG_TYPE_LINEAR_UVAL_CONSTANT_STEP:
 	case SIMD_CLONE_ARG_TYPE_LINEAR_UVAL_VARIABLE_STEP:
+	  new_params.safe_push (parm_type);
 	  if (node->definition)
 	    sc->args[i].simd_array
 	      = create_tmp_simd_array (IDENTIFIER_POINTER (DECL_NAME (parm)),
@@ -828,6 +831,9 @@  simd_clone_adjust_argument_types (struct cgraph_node *node)
 	  else
 	    vtype = build_vector_type (parm_type, veclen);
 	  sc->args[i].vector_type = vtype;
+	  k = vector_unroll_factor (sc->simdlen, veclen);
+	  for (unsigned j = 0; j < k; j++)
+	    new_params.safe_push (vtype);
 
 	  if (node->definition)
 	    sc->args[i].simd_array
@@ -893,22 +899,8 @@  simd_clone_adjust_argument_types (struct cgraph_node *node)
 	last_parm_void = true;
 
       gcc_assert (TYPE_ARG_TYPES (TREE_TYPE (node->decl)));
-      for (i = 0; i < sc->nargs; i++)
-	{
-	  tree ptype;
-	  switch (sc->args[i].arg_type)
-	    {
-	    default:
-	      ptype = sc->args[i].orig_type;
-	      break;
-	    case SIMD_CLONE_ARG_TYPE_LINEAR_VAL_CONSTANT_STEP:
-	    case SIMD_CLONE_ARG_TYPE_LINEAR_VAL_VARIABLE_STEP:
-	    case SIMD_CLONE_ARG_TYPE_VECTOR:
-	      ptype = sc->args[i].vector_type;
-	      break;
-	    }
-	  new_arg_types = tree_cons (NULL_TREE, ptype, new_arg_types);
-	}
+      for (i = 0; i < new_params.length (); i++)
+	new_arg_types = tree_cons (NULL_TREE, new_params[i], new_arg_types);
       new_reversed = nreverse (new_arg_types);
       if (last_parm_void)
 	{