diff mbox

add support for placing variables in shared memory

Message ID alpine.LNX.2.20.1604201957570.14803@monopod.intra.ispras.ru
State New
Headers show

Commit Message

Alexander Monakov April 20, 2016, 4:58 p.m. UTC
Allow using __attribute__((shared)) to place static variables in '.shared'
memory space.

Previously posted here:

[gomp-nvptx 04/13] nvptx backend: add support for placing variables in shared memory
https://gcc.gnu.org/ml/gcc-patches/2016-01/msg01546.html

[gomp-nvptx] doc: document nvptx shared attribute
https://gcc.gnu.org/ml/gcc-patches/2016-04/msg00940.html

2016-04-19  Alexander Monakov  <amonakov@ispras.ru>

	* doc/extend.texi (Nvidia PTX Variable Attributes): New section.

2016-01-17  Alexander Monakov  <amonakov@ispras.ru>

	* config/nvptx/nvptx.c (nvptx_encode_section_info): Handle "shared"
	attribute.
	(nvptx_handle_shared_attribute): New.  Use it...
	(nvptx_attribute_table): ... here (new entry).

Comments

Nathan Sidwell April 21, 2016, 2 p.m. UTC | #1
On 04/20/16 12:58, Alexander Monakov wrote:
> Allow using __attribute__((shared)) to place static variables in '.shared'
> memory space.

What is the rationale for a new attribute, rather than leveraging the existing 
section(".shared") machinery?

> +  else if (current_function_decl && !TREE_STATIC (decl))
> +    {
> +      error ("%qE attribute only applies to non-stack variables", name);

'non-stack'?  don't you mean 'static' or 'static storage'?

Needs a test case.

nathan
Alexander Monakov April 21, 2016, 2:25 p.m. UTC | #2
On Thu, 21 Apr 2016, Nathan Sidwell wrote:
> On 04/20/16 12:58, Alexander Monakov wrote:
> > Allow using __attribute__((shared)) to place static variables in '.shared'
> > memory space.
> 
> What is the rationale for a new attribute, rather than leveraging the existing
> section(".shared") machinery?

Section switching does not work at all on NVPTX in GCC at present.  PTX
assembly has no notion of different data sections, so the backend does not
advertise section switching capability to the middle end.

CUDA C does it via attributes too, and there's no point in diverging
gratuitously I think.

> > +  else if (current_function_decl && !TREE_STATIC (decl))
> > +    {
> > +      error ("%qE attribute only applies to non-stack variables", name);
> 
> 'non-stack'?  don't you mean 'static' or 'static storage'?

I avoided using 'static' because it applies to external declarations as well.
Other backends use "%qE attribute not allowed with auto storage class"; I'll
be happy to switch to that for consistency.

> Needs a test case.

OK, will follow up with a testcase.

Thanks.
Alexander
Nathan Sidwell April 22, 2016, 1:08 p.m. UTC | #3
On 04/21/16 10:25, Alexander Monakov wrote:
> On Thu, 21 Apr 2016, Nathan Sidwell wrote:
>> On 04/20/16 12:58, Alexander Monakov wrote:
>>> Allow using __attribute__((shared)) to place static variables in '.shared'
>>> memory space.
>>
>> What is the rationale for a new attribute, rather than leveraging the existing
>> section(".shared") machinery?
>
> Section switching does not work at all on NVPTX in GCC at present.  PTX
> assembly has no notion of different data sections, so the backend does not
> advertise section switching capability to the middle end.

Correct.  How is that relevant?  Look at DECL_SECTION_NAME in encode_section_info.

> CUDA C does it via attributes too, and there's no point in diverging
> gratuitously I think.

Also correct.  It seems you're trying to make the compiler look like CUDA rather 
than fit new features into existing idioms.

> I avoided using 'static' because it applies to external declarations as well.
> Other backends use "%qE attribute not allowed with auto storage class"; I'll
> be happy to switch to that for consistency.


Why can it not be applied to external declarations?  Doesn't '.extern .shared 
whatever' work?


Why can it not apply to  variables of auto storage?  I.e. function scope, 
function lifetime?  That would seem to be a useful property.

What happens if an initializer is present, is it silently ignored?  thinking 
further the uninitialized behaviour might be a reason  for not going with 
section(".shared").

nathan
Alexander Monakov April 22, 2016, 2:04 p.m. UTC | #4
On Fri, 22 Apr 2016, Nathan Sidwell wrote:
> On 04/21/16 10:25, Alexander Monakov wrote:
> > On Thu, 21 Apr 2016, Nathan Sidwell wrote:
> > > What is the rationale for a new attribute, rather than leveraging the
> > > existing section(".shared") machinery?
> >
> > Section switching does not work at all on NVPTX in GCC at present.  PTX
> > assembly has no notion of different data sections, so the backend does not
> > advertise section switching capability to the middle end.
> 
> Correct.  How is that relevant?  Look at DECL_SECTION_NAME in
> encode_section_info.

Middle end rejects the "section" attribute:

echo 'int v __attribute__((section("foo")));' |
  x86_64-pc-linux-gnu-accel-nvptx-none-gcc -xc - -o /dev/null
<stdin>:1:5: error: section attributes are not supported for this target

> > I avoided using 'static' because it applies to external declarations as
> > well.
> > Other backends use "%qE attribute not allowed with auto storage class"; I'll
> > be happy to switch to that for consistency.
> 
> 
> Why can it not be applied to external declarations?  Doesn't '.extern .shared
> whatever' work?

It can and it does; the point was to avoid ambiguity; the precise variable
class would be 'variables with static storage duration' (doesn't matter with
external or internal linkage, defined or declared), but due to possible
confusion with variables declared with 'static' keyword, it's useful to avoid
that wording.

> Why can it not apply to  variables of auto storage?  I.e. function scope,
> function lifetime?  That would seem to be a useful property.

Because PTX does not support auto storage semantics for .shared data.  It's
statically allocated at link time.

> What happens if an initializer is present, is it silently ignored?

GCC accepts and reemits it in assembly output (if non-zero), and ptxas rejects
it ("syntax error").

Alexander
Nathan Sidwell April 25, 2016, 2:47 p.m. UTC | #5
On 04/22/16 10:04, Alexander Monakov wrote:

> echo 'int v __attribute__((section("foo")));' |
>    x86_64-pc-linux-gnu-accel-nvptx-none-gcc -xc - -o /dev/null
> <stdin>:1:5: error: section attributes are not supported for this target

Presumably it's missing a necessary hook?  Couldn't such a hook check the 
section name is acceptable?


>> Why can it not apply to  variables of auto storage?  I.e. function scope,
>> function lifetime?  That would seem to be a useful property.
>
> Because PTX does not support auto storage semantics for .shared data.  It's
> statically allocated at link time.

I  suppose it's not worth going through hoops to define such function-scoped 
variables if PTX isn't going to take advantage of that.

>> What happens if an initializer is present, is it silently ignored?
>
> GCC accepts and reemits it in assembly output (if non-zero), and ptxas rejects
> it ("syntax error").

ptx errors are inscrutable.

It would be better for nvptx_assemble_decl_end to check if an initializer has 
been output and emit an error (you'll need to record the decl itself in the 
initializer structure to do that).  Record the  decl in 
nvptx_assemble_decl_begin if the symbol's data area is .shared, and then check 
in NADE?

nathan
Alexander Monakov April 25, 2016, 5:49 p.m. UTC | #6
On Mon, 25 Apr 2016, Nathan Sidwell wrote:
> On 04/22/16 10:04, Alexander Monakov wrote:
> > echo 'int v __attribute__((section("foo")));' |
> >    x86_64-pc-linux-gnu-accel-nvptx-none-gcc -xc - -o /dev/null
> > <stdin>:1:5: error: section attributes are not supported for this target
> 
> Presumably it's missing a necessary hook?  Couldn't such a hook check the
> section name is acceptable?

No, that really doesn't sound viable.  You'd need to somehow take into account
every instance where the compiler attempts to switch sections internally
(.text/.data/.bss, -ffunction-sections/-fdata-sections etc.).

> > > Why can it not apply to  variables of auto storage?  I.e. function scope,
> > > function lifetime?  That would seem to be a useful property.
> >
> > Because PTX does not support auto storage semantics for .shared data.  It's
> > statically allocated at link time.
> 
> I  suppose it's not worth going through hoops to define such function-scoped
> variables if PTX isn't going to take advantage of that.

It's not even about 'taking advantage', basic correctness expectations would be
violated (with auto storage you get new instances of the variable when
reentering function scope recursively).

> > > What happens if an initializer is present, is it silently ignored?
> >
> > GCC accepts and reemits it in assembly output (if non-zero), and ptxas
> > rejects
> > it ("syntax error").
> 
> ptx errors are inscrutable.
> 
> It would be better for nvptx_assemble_decl_end to check if an initializer has
> been output and emit an error (you'll need to record the decl itself in the
> initializer structure to do that).  Record the  decl in
> nvptx_assemble_decl_begin if the symbol's data area is .shared, and then check
> in NADE?

Ugh.  Checking DECL_INITIAL in nvptx_encode_section_info would be much
simpler (and that's how other backends perform a similar test).

Note, rejecting zero-initializers is debatable:
C and C++ don't have a concept of uninitialized global-scope data; if
the initializer is missing, it's exactly as if it was 0.  However, GCC has
-fcommon enabled by default (which, btw, shouldn't we change on NVPTX?), and
that makes a difference: 'int v = 0;' is a strong definition, while 'int v;'
becomes a common symbol, and ultimately a weak definition on NVPTX.

So if all-zeros initializers are rejected, to make a strong definition of a
shared variable one would have to write:

  int v __attribute__((shared,nocommon));

With -fno-common enabled by default on this target that wouldn't be an issue.

Thanks.
Alexander
Nathan Sidwell April 26, 2016, 1:06 p.m. UTC | #7
On 04/25/16 13:49, Alexander Monakov wrote:
> On Mon, 25 Apr 2016, Nathan Sidwell wrote:
acceptable?
>
> No, that really doesn't sound viable.  You'd need to somehow take into account
> every instance where the compiler attempts to switch sections internally
> (.text/.data/.bss, -ffunction-sections/-fdata-sections etc.).

Hm, -f{function/data}=sections would be the killer here.


> Ugh.  Checking DECL_INITIAL in nvptx_encode_section_info would be much
> simpler (and that's how other backends perform a similar test).

If that's available, then great.

> Note, rejecting zero-initializers is debatable:
> C and C++ don't have a concept of uninitialized global-scope data; if
> the initializer is missing, it's exactly as if it was 0.  However, GCC has
> -fcommon enabled by default (which, btw, shouldn't we change on NVPTX?), and
> that makes a difference: 'int v = 0;' is a strong definition, while 'int v;'
> becomes a common symbol, and ultimately a weak definition on NVPTX.
>
> So if all-zeros initializers are rejected, to make a strong definition of a
> shared variable one would have to write:

I think we should  reject the cases where the backend gets to see an explicit 
initializer.


fno-common would be a good default for PTX (I had thought it was  already on?)

nathan
diff mbox

Patch

diff --git a/gcc/config/nvptx/nvptx.c b/gcc/config/nvptx/nvptx.c
index 2d4dad1..e9e4d06 100644
--- a/gcc/config/nvptx/nvptx.c
+++ b/gcc/config/nvptx/nvptx.c
@@ -234,9 +224,12 @@  nvptx_encode_section_info (tree decl, rtx rtl, int first)
       if (TREE_CONSTANT (decl))
 	area = DATA_AREA_CONST;
       else if (TREE_CODE (decl) == VAR_DECL)
-	/* TODO: This would be a good place to check for a .shared or
-	   other section name.  */
-	area = TREE_READONLY (decl) ? DATA_AREA_CONST : DATA_AREA_GLOBAL;
+	{
+	  if (lookup_attribute ("shared", DECL_ATTRIBUTES (decl)))
+	    area = DATA_AREA_SHARED;
+	  else
+	    area = TREE_READONLY (decl) ? DATA_AREA_CONST : DATA_AREA_GLOBAL;
+	}
 
       SET_SYMBOL_DATA_AREA (XEXP (rtl, 0), area);
     }
@@ -3805,12 +4025,36 @@  nvptx_handle_kernel_attribute (tree *node, tree name, tree ARG_UNUSED (args),
   return NULL_TREE;
 }
 
+/* Handle a "shared" attribute; arguments as in
+   struct attribute_spec.handler.  */
+
+static tree
+nvptx_handle_shared_attribute (tree *node, tree name, tree ARG_UNUSED (args),
+			       int ARG_UNUSED (flags), bool *no_add_attrs)
+{
+  tree decl = *node;
+
+  if (TREE_CODE (decl) != VAR_DECL)
+    {
+      error ("%qE attribute only applies to variables", name);
+      *no_add_attrs = true;
+    }
+  else if (current_function_decl && !TREE_STATIC (decl))
+    {
+      error ("%qE attribute only applies to non-stack variables", name);
+      *no_add_attrs = true;
+    }
+
+  return NULL_TREE;
+}
+
 /* Table of valid machine attributes.  */
 static const struct attribute_spec nvptx_attribute_table[] =
 {
   /* { name, min_len, max_len, decl_req, type_req, fn_type_req, handler,
        affects_type_identity } */
   { "kernel", 0, 0, true, false,  false, nvptx_handle_kernel_attribute, false },
+  { "shared", 0, 0, true, false,  false, nvptx_handle_shared_attribute, false },
   { NULL, 0, 0, false, false, false, NULL, false }
 };
 
diff --git a/gcc/doc/extend.texi b/gcc/doc/extend.texi
index e11ce4d..5eeb179 100644
--- a/gcc/doc/extend.texi
+++ b/gcc/doc/extend.texi
@@ -5469,6 +5469,7 @@  attributes.
 * MeP Variable Attributes::
 * Microsoft Windows Variable Attributes::
 * MSP430 Variable Attributes::
+* Nvidia PTX Variable Attributes::
 * PowerPC Variable Attributes::
 * RL78 Variable Attributes::
 * SPU Variable Attributes::
@@ -6099,6 +6100,20 @@  same name (@pxref{MSP430 Function Attributes}).
 These attributes can be applied to both functions and variables.
 @end table
 
+@node Nvidia PTX Variable Attributes
+@subsection Nvidia PTX Variable Attributes
+
+These variable attributes are supported by the Nvidia PTX back end:
+
+@table @code
+@item shared
+@cindex @code{shared} attribute, Nvidia PTX
+Use this attribute to place a variable in the @code{.shared} memory space.
+This memory space is private to each cooperative thread array; only threads
+within one thread block refer to the same instance of the variable.
+The runtime does not initialize variables in this memory space.
+@end table
+
 @node PowerPC Variable Attributes
 @subsection PowerPC Variable Attributes