diff mbox series

[RFC] Extend the simd function attribute

Message ID d09cd430-398a-d680-8e43-bda3ffc6d7ed@arm.com
State New
Headers show
Series [RFC] Extend the simd function attribute | expand

Commit Message

Szabolcs Nagy Nov. 5, 2019, 12:10 p.m. UTC
(sorry for the resend, i forgot to add the mailing list)

GCC currently supports two ways to declare the availability of vector
variants of a scalar function:

  #pragma omp declare simd
  void f (void);

and

  __attribute__ ((simd))
  void f (void);

However neither can declare unambiguously a single vector variant, only
a set of vector variants which may change in the future as new vector
architectures or vector call ABIs are introduced. So these mechanisms
are not suitable for libraries which must not declare more than what
they provide.

One solution is to use the omp declare variant feature of OpenMP 5,
but that seems overcomplicated and still does not provide a reliable
mechanism (requires gcc or vendor specific extensions for unambiguous
declarations). And the omp pragma only works with -fopenmp-simd or
-fopenmp.

A simpler approach is to extend the gcc specific simd attribute such
that it can specify a single vector variant of simple scalar functions.
Where simple scalar functions are ones that only take or return scalar
integer or floating type values. I believe this can be achieved by

  __attribute__ ((simd (mask, simdlen, simdabi))))

where mask is "inbranch" or "notinbranch" like now, simdlen is an int
with the same meaning as in omp declare simd and simdabi is a string
specifying the call ABI (which the intel vector ABI calls ISA), can be
the same single letter string as the one used for name mangling.

A library may not want to use the ABI symbol name or provide multiple
implementations for the same vector function so it might make sense to
extend the syntax to

  __attribute__ ((simd (mask, simdlen, simdabi, name))))

when the same vector function is declared with different names then
it's unspecified which one the compiler picks.

The simd attribute currently can be used for both declarations and
definitions, in the latter case the simd varaints of the function are
generated, which should work with the extended simd attribute too.

The implementation of the simd attribute relied on openmp related
infrastructure in the compiler, i extended that to cover the new
attribute parameters, but I'm not sure if the new OMP_*_CLAUSEs may
break somewhere.

The simd attribute has some issues i did not try to address:

- incompatible vector prototypes are not checked:

  int _ZGVbN4_foo(int);
  __attribute__((simd("notinbranch", 4, "b")))
  void foo(void);

- incompatible symbol definitions are not checked:

  int _ZGVbN4_foo(int x) {return 2*x;}
  __attribute__((simd("notinbranch", 4, "b")))
  void foo(void) {}

- symbol redirection with asm does not work:

  int _ZGVbN4_foo(int) __asm__("bar");
  __attribute__((simd("notinbranch", 4, "b")))
  void foo(void);

Remaining work:

- extend the fortran builtin directive that's currently

  !GCC$ builtin (func) attributes simd FLAGS if(target)

I assume it can be

  !GCC$ builtin (func) attributes simd (mask, len, abi) if(target)

but i don't know how to handle old compilers in fortran.

- syntax for scalable vector functions: simdlen == 0 should
  work but requires internal changes,

- update documentation and add tests,

- and there are various TODOs left in the code.

Tested on aarch64-linux-gnu and x86_64-linux-gnu, I would like to see
some feedback on the approach.

Comments

Joseph Myers Nov. 5, 2019, 6:42 p.m. UTC | #1
On Tue, 5 Nov 2019, Szabolcs Nagy wrote:

> GCC currently supports two ways to declare the availability of vector
> variants of a scalar function:
> 
>   #pragma omp declare simd
>   void f (void);
> 
> and
> 
>   __attribute__ ((simd))
>   void f (void);
> 
> However neither can declare unambiguously a single vector variant, only
> a set of vector variants which may change in the future as new vector
> architectures or vector call ABIs are introduced. So these mechanisms
> are not suitable for libraries which must not declare more than what
> they provide.

Rather, these mechanisms must imply a fixed set of variants defined at the 
time the ABI was set, with any new variants added in future requiring some 
different attribute / pragma to be specified, to avoid breaking existing 
libraries and headers.  That's a requirement I made clear in the glibc 
review of the original libmvec addition, and is why the x86_64 vector ABI 
<https://sourceware.org/glibc/wiki/libmvec?action=AttachFile&do=view&target=VectorABI.txt> 
says "Compiler implementations must not generate calls to version of other 
ISAs unless some non-standard pragma or clause is used to declare those 
other versions are available.".
Szabolcs Nagy Nov. 7, 2019, 12:39 p.m. UTC | #2
On 05/11/2019 18:42, Joseph Myers wrote:
> On Tue, 5 Nov 2019, Szabolcs Nagy wrote:
>> GCC currently supports two ways to declare the availability of vector
>> variants of a scalar function:
>>
>>   #pragma omp declare simd
>>   void f (void);
>>
>> and
>>
>>   __attribute__ ((simd))
>>   void f (void);
>>
>> However neither can declare unambiguously a single vector variant, only
>> a set of vector variants which may change in the future as new vector
>> architectures or vector call ABIs are introduced. So these mechanisms
>> are not suitable for libraries which must not declare more than what
>> they provide.
> 
> Rather, these mechanisms must imply a fixed set of variants defined at the 
> time the ABI was set, with any new variants added in future requiring some 
> different attribute / pragma to be specified, to avoid breaking existing 
> libraries and headers.  That's a requirement I made clear in the glibc 
> review of the original libmvec addition, and is why the x86_64 vector ABI 
> <https://sourceware.org/glibc/wiki/libmvec?action=AttachFile&do=view&target=VectorABI.txt> 
> says "Compiler implementations must not generate calls to version of other 
> ISAs unless some non-standard pragma or clause is used to declare those 
> other versions are available.".

OK i guess similar text can be added to the aarch64 vector abi document,
so the meaning of omp declare simd is stable.

And then we only need the proposed mechanism to be able to specify a
smaller set of variants, or for future vector ABIs.
diff mbox series

Patch

diff --git a/gcc/c-family/c-attribs.c b/gcc/c-family/c-attribs.c
index 1c9f28587fbb2348cc30e302e889a5a22906901a..ae73b88a882cbe7fed1bade48d3d111d255433c5 100644
--- a/gcc/c-family/c-attribs.c
+++ b/gcc/c-family/c-attribs.c
@@ -448,7 +448,7 @@  const struct attribute_spec c_common_attribute_table[] =
 			      handle_omp_declare_variant_attribute, NULL },
   { "omp declare variant variant", 0, -1, true,  false, false, false,
 			      handle_omp_declare_variant_attribute, NULL },
-  { "simd",		      0, 1, true,  false, false, false,
+  { "simd",		      0, 4, true,  false, false, false,
 			      handle_simd_attribute, NULL },
   { "omp declare target",     0, -1, true, false, false, false,
 			      handle_omp_declare_target_attribute, NULL },
@@ -3094,6 +3094,15 @@  handle_simd_attribute (tree *node, tree name, tree args, int, bool *no_add_attrs
     {
       tree t = get_identifier ("omp declare simd");
       tree attr = NULL_TREE;
+
+      /* Allow
+	  simd
+	  simd (mask)
+	  simd (mask, simdlen)
+	  simd (mask, simdlen, simdabi)
+	  simd (mask, simdlen, simdabi, name)
+	 forms.  */
+
       if (args)
 	{
 	  tree id = TREE_VALUE (args);
@@ -3118,8 +3127,73 @@  handle_simd_attribute (tree *node, tree name, tree args, int, bool *no_add_attrs
 	      *no_add_attrs = true;
 	      return NULL_TREE;
 	    }
+
+	  args = TREE_CHAIN (args);
 	}
 
+      if (args)
+	{
+	  tree arg = TREE_VALUE (args);
+
+	  // TODO: simdlen == 0 means scalable ?
+	  arg = c_fully_fold (arg, false, NULL);
+	  if (TREE_CODE (arg) != INTEGER_CST
+	      || !INTEGRAL_TYPE_P (TREE_TYPE (arg))
+	      || tree_int_cst_sgn (arg) != 1)
+	    {
+	      error ("second argument of attribute %qE must be positive "
+		     "constant integer expression", name);
+	      *no_add_attrs = true;
+	      return NULL_TREE;
+	    }
+	  tree c = build_omp_clause (DECL_SOURCE_LOCATION (*node),
+				     OMP_CLAUSE_SIMDLEN);
+	  OMP_CLAUSE_SIMDLEN_EXPR (c) = arg;
+	  OMP_CLAUSE_CHAIN (c) = attr;
+	  attr = c;
+	  args = TREE_CHAIN (args);
+	}
+
+      if (args)
+	{
+	  tree arg = TREE_VALUE (args);
+
+	  if (TREE_CODE (arg) != STRING_CST)
+	    {
+	      error ("attribute %qE argument not a string", name);
+	      *no_add_attrs = true;
+	      return NULL_TREE;
+	    }
+	  // TODO: verify simdabi here with target hook?
+	  tree c = build_omp_clause (DECL_SOURCE_LOCATION (*node),
+				     OMP_CLAUSE__SIMDABI_);
+	  OMP_CLAUSE__SIMDABI__EXPR (c) = arg;
+	  OMP_CLAUSE_CHAIN (c) = attr;
+	  attr = c;
+	  args = TREE_CHAIN (args);
+	}
+
+      if (args)
+	{
+	  tree arg = TREE_VALUE (args);
+
+	  if (TREE_CODE (arg) != STRING_CST)
+	    {
+	      error ("attribute %qE argument not a string", name);
+	      *no_add_attrs = true;
+	      return NULL_TREE;
+	    }
+	  // TODO: check identifier? check type if it's declared?
+	  tree c = build_omp_clause (DECL_SOURCE_LOCATION (*node),
+				     OMP_CLAUSE__SIMDNAME_);
+	  OMP_CLAUSE__SIMDNAME__EXPR (c) = arg;
+	  OMP_CLAUSE_CHAIN (c) = attr;
+	  attr = c;
+	  args = TREE_CHAIN (args);
+	}
+
+      gcc_assert (args == NULL);
+
       DECL_ATTRIBUTES (*node)
 	= tree_cons (t, build_tree_list (NULL_TREE, attr),
 		     DECL_ATTRIBUTES (*node));
diff --git a/gcc/cgraph.h b/gcc/cgraph.h
index 0c5a6960d103e891741bd140c5e2b01ecec0e8ed..6f99ed0f361078ac9246f7f9cbe18d301566685a 100644
--- a/gcc/cgraph.h
+++ b/gcc/cgraph.h
@@ -805,6 +805,9 @@  struct GTY(()) cgraph_simd_clone_arg {
 /* Specific data for a SIMD function clone.  */
 
 struct GTY(()) cgraph_simd_clone {
+  /* Symbol associated with this clone or NULL if vector ABI name is used.  */
+  const char *simdname;
+
   /* Number of words in the SIMD lane associated with this clone.  */
   unsigned int simdlen;
 
diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
index 599d07a729e7438080f8b5240ee95037a49fb983..23ed8bebe69da832d7afe9f67ea2a579aff6a818 100644
--- a/gcc/config/aarch64/aarch64.c
+++ b/gcc/config/aarch64/aarch64.c
@@ -21283,7 +21283,15 @@  aarch64_simd_clone_compute_vecsize_and_simdlen (struct cgraph_node *node,
 	}
     }
 
-  clonei->vecsize_mangle = 'n';
+  if (clonei->vecsize_mangle == '\0')
+    clonei->vecsize_mangle = 'n';
+  else if (clonei->vecsize_mangle != 'n')
+    {
+      /* SIMD ABI is explicitly set.  */
+      warning_at (DECL_SOURCE_LOCATION (node->decl), 0,
+		  "unsupported simd abi %c", clonei->vecsize_mangle);
+      return 0;
+    }
   clonei->mask_mode = VOIDmode;
   elt_bits = GET_MODE_BITSIZE (SCALAR_TYPE_MODE (base_type));
   if (clonei->simdlen == 0)
diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
index 03a7082d2fc3e66b36760154b1c376e8bfa2cfad..92c117417c62cc5cb1f8c2d3b7677fe024022bcd 100644
--- a/gcc/config/i386/i386.c
+++ b/gcc/config/i386/i386.c
@@ -21841,7 +21841,18 @@  ix86_simd_clone_compute_vecsize_and_simdlen (struct cgraph_node *node,
 	}
     }
 
-  if (!TREE_PUBLIC (node->decl))
+  if (clonei->vecsize_mangle != '\0')
+    {
+      /* SIMD ABI is explicitly set.  */
+      if (strchr ("bcde", clonei->vecsize_mangle) == NULL)
+	{
+	  warning_at (DECL_SOURCE_LOCATION (node->decl), 0,
+		      "unsupported simd abi %c", clonei->vecsize_mangle);
+	  return 0;
+	}
+      ret = 1;
+    }
+  else if (!TREE_PUBLIC (node->decl))
     {
       /* If the function isn't exported, we can pick up just one ISA
 	 for the clones.  */
diff --git a/gcc/omp-low.c b/gcc/omp-low.c
index 279b6ef893ad704ce503860f5cf70f0652064779..be42e0222eab45a865473a9384cef290a0f7617f 100644
--- a/gcc/omp-low.c
+++ b/gcc/omp-low.c
@@ -1480,6 +1480,8 @@  scan_sharing_clauses (tree clauses, omp_context *ctx)
 	    install_var_local (decl, ctx);
 	  break;
 
+	case OMP_CLAUSE__SIMDABI_:
+	case OMP_CLAUSE__SIMDNAME_:
 	case OMP_CLAUSE__CACHE_:
 	default:
 	  gcc_unreachable ();
@@ -1660,6 +1662,8 @@  scan_sharing_clauses (tree clauses, omp_context *ctx)
 	case OMP_CLAUSE__CONDTEMP_:
 	  break;
 
+	case OMP_CLAUSE__SIMDABI_:
+	case OMP_CLAUSE__SIMDNAME_:
 	case OMP_CLAUSE__CACHE_:
 	default:
 	  gcc_unreachable ();
diff --git a/gcc/omp-simd-clone.c b/gcc/omp-simd-clone.c
index 76aea56bdcf3abcdbbf360bfb90fe407f607d147..a21c8288268bcd8abf562f0a21426a2a20480206 100644
--- a/gcc/omp-simd-clone.c
+++ b/gcc/omp-simd-clone.c
@@ -153,6 +153,24 @@  simd_clone_clauses_extract (struct cgraph_node *node, tree clauses,
 	  clone_info->simdlen
 	    = TREE_INT_CST_LOW (OMP_CLAUSE_SIMDLEN_EXPR (t));
 	  break;
+	case OMP_CLAUSE__SIMDABI_:
+	  {
+	    const char *simdabi
+	      = TREE_STRING_POINTER (OMP_CLAUSE__SIMDABI__EXPR (t));
+	    // TODO: only single char abi names are supported, check earlier?
+	    if (!simdabi[0] || simdabi[1])
+	      {
+		warning_at (OMP_CLAUSE_LOCATION (t), 0,
+			    "ignoring unsupported simd abi %s", simdabi);
+		return NULL;
+	      }
+	    clone_info->vecsize_mangle = simdabi[0];
+	    break;
+	  }
+	case OMP_CLAUSE__SIMDNAME_:
+	  clone_info->simdname
+	    = TREE_STRING_POINTER (OMP_CLAUSE__SIMDNAME__EXPR (t));
+	  break;
 	case OMP_CLAUSE_LINEAR:
 	  {
 	    tree decl = OMP_CLAUSE_DECL (t);
@@ -341,9 +359,14 @@  simd_clone_mangle (struct cgraph_node *node,
   unsigned int simdlen = clone_info->simdlen;
   unsigned int n;
   pretty_printer pp;
+  const char *str;
 
   gcc_assert (vecsize_mangle && simdlen);
 
+  str = clone_info->simdname;
+  if (str)
+    goto end;
+
   pp_string (&pp, "_ZGV");
   pp_character (&pp, vecsize_mangle);
   pp_character (&pp, mask);
@@ -408,12 +431,13 @@  simd_clone_mangle (struct cgraph_node *node,
     }
 
   pp_underscore (&pp);
-  const char *str = IDENTIFIER_POINTER (DECL_ASSEMBLER_NAME (node->decl));
+  str = IDENTIFIER_POINTER (DECL_ASSEMBLER_NAME (node->decl));
   if (*str == '*')
     ++str;
   pp_string (&pp, str);
   str = pp_formatted_text (&pp);
 
+end:
   /* If there already is a SIMD clone with the same mangled name, don't
      add another one.  This can happen e.g. for
      #pragma omp declare simd
diff --git a/gcc/tree-core.h b/gcc/tree-core.h
index cfd88393c04ed48b8e44bb767944c542a99a1359..fac0bfeb93402b72ca68e191ac90e437086f0d13 100644
--- a/gcc/tree-core.h
+++ b/gcc/tree-core.h
@@ -415,6 +415,12 @@  enum omp_clause_code {
   /* OpenMP clause: simdlen (constant-integer-expression).  */
   OMP_CLAUSE_SIMDLEN,
 
+  /* extension: SIMD call ABI.  */
+  OMP_CLAUSE__SIMDABI_,
+
+  /* extension: SIMD symbol name.  */
+  OMP_CLAUSE__SIMDNAME_,
+
   /* OpenMP clause: device_type ({host,nohost,any}).  */
   OMP_CLAUSE_DEVICE_TYPE,
 
diff --git a/gcc/tree-nested.c b/gcc/tree-nested.c
index 4df07edcf0a8bb31c33d71f7442aa1e343a14727..7823cd6a403b8d40e4b25d30a1727576a28ab000 100644
--- a/gcc/tree-nested.c
+++ b/gcc/tree-nested.c
@@ -1372,6 +1372,8 @@  convert_nonlocal_omp_clauses (tree *pclauses, struct walk_stmt_info *wi)
 	case OMP_CLAUSE_UNIFORM:
 	case OMP_CLAUSE_INBRANCH:
 	case OMP_CLAUSE_NOTINBRANCH:
+	case OMP_CLAUSE__SIMDABI_:
+	case OMP_CLAUSE__SIMDNAME_:
 	  /* The following clauses are only allowed on OpenMP cancel and
 	     cancellation point directives, which at this point have already
 	     been lowered into a function call.  */
diff --git a/gcc/tree-pretty-print.c b/gcc/tree-pretty-print.c
index 53b3f55a3e6a14d4df19e65deb7eb5829b5b06ae..cd7c9603872b1d2b968a17c79d67e6999bb66354 100644
--- a/gcc/tree-pretty-print.c
+++ b/gcc/tree-pretty-print.c
@@ -984,6 +984,20 @@  dump_omp_clause (pretty_printer *pp, tree clause, int spc, dump_flags_t flags)
       pp_right_paren (pp);
       break;
 
+    case OMP_CLAUSE__SIMDABI_:
+      pp_string (pp, "_simdabi_(");
+      dump_generic_node (pp, OMP_CLAUSE__SIMDABI__EXPR (clause),
+			 spc, flags, false);
+      pp_right_paren (pp);
+      break;
+
+    case OMP_CLAUSE__SIMDNAME_:
+      pp_string (pp, "_simdname_(");
+      dump_generic_node (pp, OMP_CLAUSE__SIMDNAME__EXPR (clause),
+			 spc, flags, false);
+      pp_right_paren (pp);
+      break;
+
     case OMP_CLAUSE_PRIORITY:
       pp_string (pp, "priority(");
       dump_generic_node (pp, OMP_CLAUSE_PRIORITY_EXPR (clause),
diff --git a/gcc/tree.h b/gcc/tree.h
index 81351548ecdf505b71bbe349489f69959abed08a..3c605919f8baeace24c8e8a385cbe1c36749834e 100644
--- a/gcc/tree.h
+++ b/gcc/tree.h
@@ -1725,6 +1725,12 @@  class auto_suppress_location_wrappers
 #define OMP_CLAUSE_SIMDLEN_EXPR(NODE) \
   OMP_CLAUSE_OPERAND (OMP_CLAUSE_SUBCODE_CHECK (NODE, OMP_CLAUSE_SIMDLEN), 0)
 
+#define OMP_CLAUSE__SIMDABI__EXPR(NODE) \
+  OMP_CLAUSE_OPERAND (OMP_CLAUSE_SUBCODE_CHECK (NODE, OMP_CLAUSE__SIMDABI_), 0)
+
+#define OMP_CLAUSE__SIMDNAME__EXPR(NODE) \
+  OMP_CLAUSE_OPERAND (OMP_CLAUSE_SUBCODE_CHECK (NODE, OMP_CLAUSE__SIMDNAME_), 0)
+
 #define OMP_CLAUSE__SIMDUID__DECL(NODE) \
   OMP_CLAUSE_OPERAND (OMP_CLAUSE_SUBCODE_CHECK (NODE, OMP_CLAUSE__SIMDUID_), 0)
 
diff --git a/gcc/tree.c b/gcc/tree.c
index 741f7a2ce659ab6a0386d36a6565bb021a522043..bfa450d4b53192e629c03a21ba2b20857384b5dc 100644
--- a/gcc/tree.c
+++ b/gcc/tree.c
@@ -334,6 +334,8 @@  unsigned const char omp_clause_num_ops[] =
   0, /* OMP_CLAUSE_PROC_BIND  */
   1, /* OMP_CLAUSE_SAFELEN  */
   1, /* OMP_CLAUSE_SIMDLEN  */
+  1, /* OMP_CLAUSE__SIMDABI_  */
+  1, /* OMP_CLAUSE__SIMDNAME_  */
   0, /* OMP_CLAUSE_DEVICE_TYPE  */
   0, /* OMP_CLAUSE_FOR  */
   0, /* OMP_CLAUSE_PARALLEL  */
@@ -419,6 +421,8 @@  const char * const omp_clause_code_name[] =
   "proc_bind",
   "safelen",
   "simdlen",
+  "_simdabi_",
+  "_simdname_",
   "device_type",
   "for",
   "parallel",
@@ -12079,6 +12083,8 @@  walk_tree_1 (tree *tp, walk_tree_fn func, void *data,
 	case OMP_CLAUSE__CONDTEMP_:
 	case OMP_CLAUSE__SCANTEMP_:
 	case OMP_CLAUSE__SIMDUID_:
+	case OMP_CLAUSE__SIMDABI_:
+	case OMP_CLAUSE__SIMDNAME_:
 	  WALK_SUBTREE (OMP_CLAUSE_OPERAND (*tp, 0));
 	  /* FALLTHRU */