diff mbox

[gomp4] Propagate independent clause for OpenACC kernels pass

Message ID 874ml73wtj.fsf@schwinge.name
State New
Headers show

Commit Message

Thomas Schwinge July 14, 2015, 9:36 a.m. UTC
Hi Chung-Lin!

On Tue, 14 Jul 2015 13:46:04 +0800, Chung-Lin Tang <cltang@codesourcery.com> wrote:
> this patch provides a 'bool independent' field in struct loop, which
> will be switched on by an "independent" clause in a #pragma acc loop directive.

Thanks!


This patch has been developed in context of OpenACC kernels constructs,
but, is there anything still to be done regarding OpenACC parallel
constructs?  That is, are we currently *using* the "independent yes/no"
information appropriately for these?


>         * omp-low.c (struct omp_region): Add 'int kind' and
>         'bool independent' fields.

> --- omp-low.c	(revision 225758)
> +++ omp-low.c	(working copy)
> @@ -136,8 +136,16 @@ struct omp_region
>    /* True if this is nested inside an OpenACC kernels construct.  */
>    bool inside_kernels_p;
>  
> +  /* Records a generic kind field.  */
> +  int kind;
> +
>    /* For an OpenACC loop, the level of parallelism requested.  */
>    int gwv_this;
> +
> +  /* For an OpenACC loop directive, true if has the 'independent' clause.  */
> +  bool independent;
> +
> +  tree broadcast_array;
>  };

I'm assuming just a patch conflict resolution hiccup; committed in
r225767:

commit 4a10c97b741bbc3d7278779337d5c0bfea8297c2
Author: tschwinge <tschwinge@138bc75d-0d04-0410-961f-82ee72b054a4>
Date:   Tue Jul 14 09:30:02 2015 +0000

    Code cleanup
    
    	gcc/
    	* omp-low.c (struct omp_region): Remove broadcast_array member.
    
    ... that mistakenly reappeared in r225759, after having been removed in
    r225647.
    
    git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/branches/gomp-4_0-branch@225767 138bc75d-0d04-0410-961f-82ee72b054a4
---
 gcc/ChangeLog.gomp | 4 ++++
 gcc/omp-low.c      | 2 --
 2 files changed, 4 insertions(+), 2 deletions(-)



Grüße,
 Thomas

Comments

Thomas Schwinge July 14, 2015, 11:05 a.m. UTC | #1
Hi!

On Tue, 14 Jul 2015 11:36:24 +0200, I wrote:
> On Tue, 14 Jul 2015 13:46:04 +0800, Chung-Lin Tang <cltang@codesourcery.com> wrote:
> > this patch provides a 'bool independent' field in struct loop, which
> > will be switched on by an "independent" clause in a #pragma acc loop directive.
> 
> Thanks!
> 
> 
> This patch has been developed in context of OpenACC kernels constructs,
> but, is there anything still to be done regarding OpenACC parallel
> constructs?  That is, are we currently *using* the "independent yes/no"
> information appropriately for these?

Tom mentioned:

| openacc spec:
| ...
| 2.7.9 independent clause
| In a kernels construct, the independent clause tells the implementation 
| that the iterations of this loop are data-independent with respect to 
| each other. This allows the implementation to generate code to execute 
| the iterations in parallel with no synchronization.
| In a parallel construct, the independent clause is implied on all loop 
| directives without a seq clause.
| ...
|
| I think you're sort of asking if the seq clause has been implemented.
|
| openacc spec:
| ...
| 2.7.5 seq clause
| The seq clause specifies that the associated loop or loops are to be 
| executed sequentially by the accelerator. This clause will override any 
| automatic parallelization or vectorization.
| ...

Thanks, and right, I also realized that.  ;-)

Yet, my request is still to properly "document" this.  That is, if the
idea is that gcc/omp-low.c:scan_omp_for makes sure to emit a diagnostic
for the invalid combination of a gang/worker/vector clause together with
a seq clause, then in combination with the code newly added to
gcc/omp-low.c:find_omp_for_region_data to set region->independent = true
inside OpenACC parallel constructs, can't we then
assert(region->independent == true) in expand_omp_for_static_chunk and
expand_omp_for_static_nochunk?

So far it looks to me as if for OpenACC parallel constructs, we only have
a producer of region->independent = true, but no consumer.  Even if the
latter one is "implicit", we should "document" (using an assertion) this
in some way, for clarity.

While looking at that code, are we convinced that the diagnostic
machinery and subsequent handling will do the right thing in such cases:

    [...]
    #pragma acc loop [gang/worker/vector]
      for [...]
    #pragma acc loop seq
        for[...]

To me it looks (but I have not verified) that in such cases, the inner
region's ctx->gwv_this will have been initialized from the outer_ctx, so
to some combination of gang/worker/vector, and will that then be used to
parallelize the inner loop, which shouldn't be done, as I'm understanding
this?

It seems to be as if this gang/worker/vector/seq/independent clause
handling code that is currently distributed over
gcc/omp-low.c:scan_omp_for and gcc/omp-low.c:find_omp_for_region_data
(and, worse, also the front ends; see below) should be merged into one
place.

gcc/fortran/openmp.c:resolve_oacc_loop_blocks emits a diagnostic for seq
with independent; gcc/omp-low.c:scan_omp_for doesn't.  Should it?  Then,
why do we need this Fortran-specific checking code?  Should the
additional checking being done there get moved to OMP lowering?

In the C and C++ front ends, there's related checking code for those
clause attached to OpenACC routine constructs; not sure if that checking
should also be handled during OMP lowering, in one place?  I couldn't
find similar code in the Fortran front end (but didn't look very hard).

This is reminiscent of the discussion started in
<http://news.gmane.org/find-root.php?message_id=%3C87mw0zirnq.fsf%40schwinge.name%3E>
and following messages, about using
gcc/omp-low.c:check_omp_nesting_restrictions to do such checking (which
Cesar has not yet followed up on, as far as I know).

A few more points:

Does OMP_CLAUSE_INDEPENDENT need to be handled in
gcc/c-family/c-omp.c:c_oacc_split_loop_clauses?

While looking at that, ;-) again a few more things...  Are others clauses
missing to be handled there: tile, device_type (probably not; has all
been processed before?)?  Why is the firstprivate clause being duplicated
for loop_clauses/non_loop_clauses?  In OpenACC, there is no firstprivate
clause for loop constructs.  Why is the private clause being duplicated
for loop_clauses/non_loop_clauses?  My understanding is that the private
clause in a combined OpenACC parallel loop construct is to be handled as
if it were attached to the loop construct (... which is in contrast to
firstprivate -- yes OpenACC 2.0a is a bit assymetric in that regard).

What about the corresponding Fortran code,
gcc/fortran/trans-openmp.c:gfc_trans_oacc_combined_directive?

Do we have test cases to cover all this?


Grüße,
 Thomas
diff mbox

Patch

diff --git gcc/ChangeLog.gomp gcc/ChangeLog.gomp
index dfe0c95..d7459d0 100644
--- gcc/ChangeLog.gomp
+++ gcc/ChangeLog.gomp
@@ -1,3 +1,7 @@ 
+2015-07-14  Thomas Schwinge  <thomas@codesourcery.com>
+
+	* omp-low.c (struct omp_region): Remove broadcast_array member.
+
 2015-07-14  Tom de Vries  <tom@codesourcery.com>
 
 	* tree-parloops.c (parallelize_loops): Use marked_independent flag in
diff --git gcc/omp-low.c gcc/omp-low.c
index fc6c7a9..0419dcd 100644
--- gcc/omp-low.c
+++ gcc/omp-low.c
@@ -144,8 +144,6 @@  struct omp_region
 
   /* For an OpenACC loop directive, true if has the 'independent' clause.  */
   bool independent;
-
-  tree broadcast_array;
 };
 
 /* Context structure.  Used to store information about each parallel