diff mbox series

[Fortran] OpenACC – permit common blocks in some clauses

Message ID b71f7479-302f-01ad-9a00-daead56ac0d4@codesourcery.com
State New
Headers show
Series [Fortran] OpenACC – permit common blocks in some clauses | expand

Commit Message

Tobias Burnus Oct. 15, 2019, 9:32 p.m. UTC
This OpenACC-only patch extends the support for /common/ blocks.

[In OpenMP (4.0 to 5.0, unchanged) and gfortran, common blocks are supported in copyin/copyprivate, in firstprivate/lastprivate/private/shared, in threadprivate and in declare target.]

For OpenACC, gfortran already supports common blocks for device_resident/usedevice/cache/flush/link.

This patch adds them (for OpenACC only) to copy/copyin/copyout, create/delete,
host, pcopy/pcopy_in/pcopy_out, present_or_copy, present_or_copy_in,
present_or_copy_out, present_or_create and self.
[Of those, only "copy()" is also an OpenMP clause name.]
[Cf. OpenACC 2.7 in 1.9 (for the p* variants) and 2.13; the latter is new since OpenACC 2.0.]



I think the Fortran part is obvious, once one agrees on the list of clauses; and OK from a Fortran's maintainer view.

gcc/gimplify.c: oacc_default_clause contains some changes; there are additionally two lines which only differ for ORT_ACC – Hence, it is an OpenACC-only change!
The ME change is about privatizing common blocks (I haven't studied this part closer.)


@Thomas: Please review
@Jakub, all: comments and approvals are welcome.

Tobias

PS: This patch is the rediffed OG9 (alias OG8) patch 0793cef408c9937f4c4e2423dd1f7d6a97b9bed3 by Cesar Philippidis from 2016. (Which was on gomp-4_0-branch as r240165). Due to the wonders of GIT – when not requiring linear history and due to rebasing with GCC9, it is also part of the OG9 commit ac6c90812344f4f4cfe4d2f5901c1a9d038a4000 – which in addition also does some other things like handling OpenACC device pointers.

Comments

Thomas Schwinge Oct. 18, 2019, 1:26 p.m. UTC | #1
Hi!

On 2019-10-15T23:32:32+0200, Tobias Burnus <tobias@codesourcery.com> wrote:
> This OpenACC-only patch extends the support for /common/ blocks.

I'll be quick to note that I don't have any first-hand experience with
Fortran common blocks.  :-P

> [In OpenMP (4.0 to 5.0, unchanged) and gfortran, common blocks are supported in copyin/copyprivate, in firstprivate/lastprivate/private/shared, in threadprivate and in declare target.]
>
> For OpenACC, gfortran already supports common blocks for device_resident/usedevice/cache/flush/link.
>
> This patch adds them (for OpenACC only) to copy/copyin/copyout, create/delete,
> host, pcopy/pcopy_in/pcopy_out, present_or_copy, present_or_copy_in,
> present_or_copy_out, present_or_create and self.
> [Of those, only "copy()" is also an OpenMP clause name.]

I'm confused: in
<http://mid.mail-archive.com/20181204133007.GO12380@tucnak> Jakub stated
that "OpenMP doesn't have a copy clause, so I'd expect true here":

| @@ -1051,7 +1052,7 @@ gfc_match_omp_clauses (gfc_omp_clauses **cp, const omp_mask mask,
|  	  if ((mask & OMP_CLAUSE_COPY)
|  	      && gfc_match ("copy ( ") == MATCH_YES
|  	      && gfc_match_omp_map_clause (&c->lists[OMP_LIST_MAP],
| -					   OMP_MAP_TOFROM))
| +					   OMP_MAP_TOFROM, openacc))
|  	    continue;

> [Cf. OpenACC 2.7 in 1.9 (for the p* variants) and 2.13; the latter is new since OpenACC 2.0.]
>
>
>
> I think the Fortran part is obvious, once one agrees on the list of clauses; and OK from a Fortran's maintainer view.

I'll defer to your judgement there, but just one comment: I noticed that
OpenACC 2.7 in 2.7. "Data Clauses" states that "For all clauses except
'deviceptr' and 'present', the list argument may include a Fortran
_common block_ name enclosed within slashes, if that _common block_ name
also appears in a 'declare' directive 'link' clause".

Are we already properly handling the aspect that requires that the "that
_common block_ name also appears in a 'declare' directive 'link' clause"?

The libgomp execution test cases you're adding all state that "This test
does not exercise ACC DECLARE", yet they supposedly already do work fine.
Or am I understading the OpenACC specification wrongly here?

I'm certainly aware of (big) deficiencies in the OpenACC 'declare'
handling, so I guess my question here may be whether these test cases are
valid after all?

> gcc/gimplify.c: oacc_default_clause contains some changes; there are additionally two lines which only differ for ORT_ACC – Hence, it is an OpenACC-only change!
> The ME change is about privatizing common blocks (I haven't studied this part closer.)

So, please do study that closer.  ;-P

In <http://mid.mail-archive.com/87efx6haep.fsf@euler.schwinge.homeip.net>
I raised some questions, got a bit of an answer, and in
<http://mid.mail-archive.com/87bms85kra.fsf@hertz.schwinge.homeip.net>
asked further, didn't get an answer.

All the rationale from Cesar's original submission email should be
transferred into 'gcc/gimplify.c' as much as feasible, to make that
"voodoo code" better understandable.

> @Jakub, all: comments and approvals are welcome.

Indeed.  :-)

> 	gcc/
> 	* gimplify.c (oacc_default_clause): Privatize fortran common blocks.
> 	(omp_notice_variable): Defer the expansion of DECL_VALUE_EXPR for
> 	common block decls.

> --- a/gcc/gimplify.c
> +++ b/gcc/gimplify.c
> @@ -7218,15 +7218,20 @@ oacc_default_clause (struct gimplify_omp_ctx *ctx, tree decl, unsigned flags)
>  {
>    const char *rkind;
>    bool on_device = false;
> +  bool is_private = false;
>    bool declared = is_oacc_declared (decl);
>    tree type = TREE_TYPE (decl);
>  
>    if (lang_hooks.decls.omp_privatize_by_reference (decl))
>      type = TREE_TYPE (type);
>  
> +  if (RECORD_OR_UNION_TYPE_P (type))
> +    is_private = lang_hooks.decls.omp_disregard_value_expr (decl, false);
> +
>    if ((ctx->region_type & (ORT_ACC_PARALLEL | ORT_ACC_KERNELS)) != 0
>        && is_global_var (decl)
> -      && device_resident_p (decl))
> +      && device_resident_p (decl)
> +      && !is_private)
>      {
>        on_device = true;
>        flags |= GOVD_MAP_TO_ONLY;
> @@ -7237,7 +7242,9 @@ oacc_default_clause (struct gimplify_omp_ctx *ctx, tree decl, unsigned flags)
>      case ORT_ACC_KERNELS:
>        rkind = "kernels";
>  
> -      if (AGGREGATE_TYPE_P (type))
> +      if (is_private)
> +	flags |= GOVD_MAP;
> +      else if (AGGREGATE_TYPE_P (type))
>  	{
>  	  /* Aggregates default to 'present_or_copy', or 'present'.  */
>  	  if (ctx->default_kind != OMP_CLAUSE_DEFAULT_PRESENT)
> @@ -7254,7 +7261,9 @@ oacc_default_clause (struct gimplify_omp_ctx *ctx, tree decl, unsigned flags)
>      case ORT_ACC_PARALLEL:
>        rkind = "parallel";
>  
> -      if (on_device || declared)
> +      if (is_private)
> +	flags |= GOVD_FIRSTPRIVATE;
> +      else if (on_device || declared)
>  	flags |= GOVD_MAP;
>        else if (AGGREGATE_TYPE_P (type))
>  	{
> @@ -7320,7 +7329,8 @@ omp_notice_variable (struct gimplify_omp_ctx *ctx, tree decl, bool in_code)
>  	{
>  	  tree value = get_base_address (DECL_VALUE_EXPR (decl));
>  
> -	  if (value && DECL_P (value) && DECL_THREAD_LOCAL_P (value))
> +	  if (!(ctx->region_type & ORT_ACC)
> +	      && value && DECL_P (value) && DECL_THREAD_LOCAL_P (value))
>  	    return omp_notice_threadprivate_variable (ctx, decl, value);
>  	}
>  
> @@ -7352,7 +7362,8 @@ omp_notice_variable (struct gimplify_omp_ctx *ctx, tree decl, bool in_code)
>    n = splay_tree_lookup (ctx->variables, (splay_tree_key)decl);
>    if ((ctx->region_type & ORT_TARGET) != 0)
>      {
> -      ret = lang_hooks.decls.omp_disregard_value_expr (decl, true);
> +      shared = !(ctx->region_type & ORT_ACC);
> +      ret = lang_hooks.decls.omp_disregard_value_expr (decl, shared);
>        if (n == NULL)
>  	{
>  	  unsigned nflags = flags;
> @@ -7520,6 +7531,8 @@ omp_notice_variable (struct gimplify_omp_ctx *ctx, tree decl, bool in_code)
>      }
>  
>    shared = ((flags | n->value) & GOVD_SHARED) != 0;
> +  if (ctx->region_type & ORT_ACC)
> +    shared = false;
>    ret = lang_hooks.decls.omp_disregard_value_expr (decl, shared);
>  
>    /* If nothing changed, there's nothing left to do.  */


> PS: This patch is the rediffed OG9 (alias OG8) patch 0793cef408c9937f4c4e2423dd1f7d6a97b9bed3 by Cesar Philippidis from 2016. (Which was on gomp-4_0-branch as r240165). Due to the wonders of GIT – when not requiring linear history and due to rebasing with GCC9, it is also part of the OG9 commit ac6c90812344f4f4cfe4d2f5901c1a9d038a4000 – which in addition also does some other things like handling OpenACC device pointers.

There's no Git magic involved there: somebody just (manually) merged
several these patches together into one, for no good reason.  ;-\


Grüße
 Thomas


> diff --git a/gcc/testsuite/gfortran.dg/goacc/common-block-1.f90 b/gcc/testsuite/gfortran.dg/goacc/common-block-1.f90
> new file mode 100644
> index 00000000000..1cbbb49d638
> --- /dev/null
> +++ b/gcc/testsuite/gfortran.dg/goacc/common-block-1.f90
> @@ -0,0 +1,69 @@
> +! Test data clauses involving common blocks and common block data.
> +! Specifically, validates early matching errors.
> +
> +subroutine subtest
> +  implicit none
> +  integer, parameter :: n = 10
> +  integer a(n), b(n), c, d(n), e
> +  real*4 x(n), y(n), z, w(n), v
> +  common /blockA/ a, c, x
> +  common /blockB/ b, y, z
> +  !$acc declare link(/blockA/, /blockB/, e, v)
> +end subroutine subtest
> +
> +program test
> +  implicit none
> +  integer, parameter :: n = 10
> +  integer a(n), b(n), c, d(n), e
> +  real*4 x(n), y(n), z, w(n), v
> +  common /blockA/ a, c, x
> +  common /blockB/ b, y, z
> +  !$acc declare link(/blockA/, /blockB/, e, v)
> +
> +  !$acc data copy(/blockA/, /blockB/, e, v)
> +  !$acc end data
> +
> +  !$acc data copyin(/blockA/, /blockB/, e, v)
> +  !$acc end data
> +
> +  !$acc data copyout(/blockA/, /blockB/, e, v)
> +  !$acc end data
> +
> +  !$acc data create(/blockA/, /blockB/, e, v)
> +  !$acc end data
> +
> +  !$acc data copyout(/blockA/, /blockB/, e, v)
> +  !$acc end data
> +
> +  !$acc data pcopy(/blockA/, /blockB/, e, v)
> +  !$acc end data
> +
> +  !$acc data pcopyin(/blockA/, /blockB/, e, v)
> +  !$acc end data
> +
> +  !$acc data pcopyout(/blockA/, /blockB/, e, v)
> +  !$acc end data
> +
> +  !$acc data pcreate(/blockA/, /blockB/, e, v)
> +  !$acc end data
> +
> +  !$acc data pcopyout(/blockA/, /blockB/, e, v)
> +  !$acc end data
> +
> +  !$acc data present(/blockA/, /blockB/, e, v) ! { dg-error "Syntax error in OpenMP variable list" }
> +  !$acc end data ! { dg-error "Unexpected ..ACC END DATA statement" }
> +
> +  !$acc parallel private(/blockA/, /blockB/, e, v)
> +  !$acc end parallel
> +
> +  !$acc parallel firstprivate(/blockA/, /blockB/, e, v)
> +  !$acc end parallel
> +
> +  !$acc exit data delete(/blockA/, /blockB/, e, v)
> +
> +  !$acc data deviceptr(/blockA/, /blockB/, e, v) ! { dg-error "Syntax error in OpenMP variable list" }
> +  !$acc end data ! { dg-error "Unexpected ..ACC END DATA statement" }
> +
> +  !$acc data deviceptr(/blockA/, /blockB/, e, v) ! { dg-error "Syntax error in OpenMP variable list" }
> +  !$acc end data ! { dg-error "Unexpected ..ACC END DATA statement" }
> +end program test
> diff --git a/gcc/testsuite/gfortran.dg/goacc/common-block-2.f90 b/gcc/testsuite/gfortran.dg/goacc/common-block-2.f90
> new file mode 100644
> index 00000000000..b83638918a3
> --- /dev/null
> +++ b/gcc/testsuite/gfortran.dg/goacc/common-block-2.f90
> @@ -0,0 +1,49 @@
> +! Test data clauses involving common blocks and common block data.
> +! Specifically, resolver errors such as duplicate data clauses.
> +
> +program test
> +  implicit none
> +  integer, parameter :: n = 10
> +  integer a(n), b(n), c, d(n), e
> +  real*4 x(n), y(n), z, w(n), v
> +  common /blockA/ a, c, x
> +  common /blockB/ b, y, z
> +
> +  !$acc data copy(/blockA/, /blockB/, e, v, a) ! { dg-error "Symbol .a. present on multiple clauses" }
> +  !$acc end data
> +
> +  !$acc data copyin(/blockA/, /blockB/, e, v, a) ! { dg-error "Symbol .a. present on multiple clauses" }
> +  !$acc end data
> +
> +  !$acc data copyout(/blockA/, /blockB/, e, v, a) ! { dg-error "Symbol .a. present on multiple clauses" }
> +  !$acc end data
> +
> +  !$acc data create(/blockA/, /blockB/, e, v, a) ! { dg-error "Symbol .a. present on multiple clauses" }
> +  !$acc end data
> +
> +  !$acc data copyout(/blockA/, /blockB/, e, v, a) ! { dg-error "Symbol .a. present on multiple clauses" }
> +  !$acc end data
> +
> +  !$acc data pcopy(/blockA/, /blockB/, e, v, a) ! { dg-error "Symbol .a. present on multiple clauses" }
> +  !$acc end data
> +
> +  !$acc data pcopyin(/blockA/, /blockB/, e, v, a) ! { dg-error "Symbol .a. present on multiple clauses" }
> +  !$acc end data
> +
> +  !$acc data pcopyout(/blockA/, /blockB/, e, v, a) ! { dg-error "Symbol .a. present on multiple clauses" }
> +  !$acc end data
> +
> +  !$acc data pcreate(/blockA/, /blockB/, e, v, a) ! { dg-error "Symbol .a. present on multiple clauses" }
> +  !$acc end data
> +
> +  !$acc data pcopyout(/blockA/, /blockB/, e, v, a) ! { dg-error "Symbol .a. present on multiple clauses" }
> +  !$acc end data
> +
> +  !$acc parallel private(/blockA/, /blockB/, e, v, a) ! { dg-error "Symbol .a. present on multiple clauses" }
> +  !$acc end parallel
> +
> +  !$acc parallel firstprivate(/blockA/, /blockB/, e, v, a) ! { dg-error "Symbol .a. present on multiple clauses" }
> +  !$acc end parallel
> +
> +  !$acc exit data delete(/blockA/, /blockB/, e, v, a) ! { dg-error "Symbol .a. present on multiple clauses" }
> +end program test
> diff --git a/libgomp/testsuite/libgomp.oacc-fortran/common-block-1.f90 b/libgomp/testsuite/libgomp.oacc-fortran/common-block-1.f90
> new file mode 100644
> index 00000000000..a17a33536f3
> --- /dev/null
> +++ b/libgomp/testsuite/libgomp.oacc-fortran/common-block-1.f90
> @@ -0,0 +1,105 @@
> +! Test data located inside common blocks.  This test does not exercise
> +! ACC DECLARE.
> +
> +module const
> +  integer, parameter :: n = 100
> +end module const
> +
> +subroutine check
> +  use const
> +
> +  implicit none
> +  integer i, x(n), y
> +  common /BLOCK/ x, y
> +
> +  do i = 1, n
> +     if (x(i) .ne. y) call abort
> +  end do
> +end subroutine check
> +
> +module m
> +  use const
> +  integer a(n), b
> +  common /BLOCK/ a, b
> +
> +contains
> +  subroutine mod_implicit_incr
> +    implicit none
> +    integer i
> +
> +    !$acc parallel loop
> +    do i = 1, n
> +       a(i) = b
> +    end do
> +    !$acc end parallel loop
> +
> +    call check
> +  end subroutine mod_implicit_incr
> +
> +  subroutine mod_explicit_incr
> +    implicit none
> +    integer i
> +
> +    !$acc parallel loop copy(a(1:n)) copyin(b)
> +    do i = 1, n
> +       a(i) = b
> +    end do
> +    !$acc end parallel loop
> +
> +    call check
> +  end subroutine mod_explicit_incr
> +end module m
> +
> +subroutine sub_implicit_incr
> +  use const
> +
> +  implicit none
> +  integer i, x(n), y
> +  common /BLOCK/ x, y
> +
> +  !$acc parallel loop
> +  do i = 1, n
> +     x(i) = y
> +  end do
> +  !$acc end parallel loop
> +
> +  call check
> +end subroutine sub_implicit_incr
> +
> +subroutine sub_explicit_incr
> +  use const
> +
> +  implicit none
> +  integer i, x(n), y
> +  common /BLOCK/ x, y
> +
> +  !$acc parallel loop copy(x(1:n)) copyin(y)
> +  do i = 1, n
> +     x(i) = y
> +  end do
> +  !$acc end parallel loop
> +
> +  call check
> +end subroutine sub_explicit_incr
> +
> +program main
> +  use m
> +
> +  implicit none
> +
> +  a(:) = -1
> +  b = 5
> +  call mod_implicit_incr
> +
> +  a(:) = -2
> +  b = 6
> +  call mod_explicit_incr
> +
> +  a(:) = -3
> +  b = 7
> +  call sub_implicit_incr
> +
> +  a(:) = -4
> +  b = 8
> +  call sub_explicit_incr
> +end program main
> diff --git a/libgomp/testsuite/libgomp.oacc-fortran/common-block-2.f90 b/libgomp/testsuite/libgomp.oacc-fortran/common-block-2.f90
> new file mode 100644
> index 00000000000..e27a225a024
> --- /dev/null
> +++ b/libgomp/testsuite/libgomp.oacc-fortran/common-block-2.f90
> @@ -0,0 +1,150 @@
> +! Test data located inside common blocks.  This test does not exercise
> +! ACC DECLARE.  All data clauses are explicit.
> +
> +module consts
> +  integer, parameter :: n = 100
> +end module consts
> +
> +subroutine validate
> +  use consts
> +
> +  implicit none
> +  integer i, j
> +  real*4 x(n), y(n), z
> +  common /BLOCK/ x, y, z, j
> +
> +  do i = 1, n
> +     if (abs(x(i) - i - z) .ge. 0.0001) call abort
> +  end do
> +end subroutine validate
> +
> +subroutine incr
> +  use consts
> +
> +  implicit none
> +  integer i, j
> +  real*4 x(n), y(n), z
> +  common /BLOCK/ x, y, z, j
> +
> +  !$acc parallel loop pcopy(/BLOCK/)
> +  do i = 1, n
> +     x(i) = x(i) + z
> +  end do
> +  !$acc end parallel loop
> +end subroutine incr
> +
> +program main
> +  use consts
> +
> +  implicit none
> +  integer i, j
> +  real*4 a(n), b(n), c
> +  common /BLOCK/ a, b, c, j
> +
> +  ! Test copyout, pcopy, device
> +
> +  !$acc data copyout(a, c)
> +
> +  c = 1.0
> +
> +  !$acc update device(c)
> +
> +  !$acc parallel loop pcopy(a)
> +  do i = 1, n
> +     a(i) = i
> +  end do
> +  !$acc end parallel loop
> +
> +  call incr
> +  call incr
> +  call incr
> +  !$acc end data
> +
> +  c = 3.0
> +  call validate
> +
> +  ! Test pcopy without copyout
> +
> +  c = 2.0
> +  call incr
> +  c = 5.0
> +  call validate
> +
> +  ! Test create, delete, host, copyout, copyin
> +
> +  !$acc enter data create(b)
> +
> +  !$acc parallel loop pcopy(b)
> +  do i = 1, n
> +     b(i) = i
> +  end do
> +  !$acc end parallel loop
> +
> +  !$acc update host (b)
> +
> +  !$acc parallel loop pcopy(b) copyout(a) copyin(c)
> +  do i = 1, n
> +     a(i) = b(i) + c
> +  end do
> +  !$acc end parallel loop
> +
> +  !$acc exit data delete(b)
> +
> +  call validate
> +
> +  a(:) = b(:)
> +  c = 0.0
> +  call validate
> +
> +  ! Test copy
> +
> +  c = 1.0
> +  !$acc parallel loop copy(/BLOCK/)
> +  do i = 1, n
> +     a(i) = b(i) + c
> +  end do
> +  !$acc end parallel loop
> +
> +  call validate
> +
> +  ! Test pcopyin, pcopyout FIXME
> +
> +  c = 2.0
> +  !$acc data copyin(b, c) copyout(a)
> +
> +  !$acc parallel loop pcopyin(b, c) pcopyout(a)
> +  do i = 1, n
> +     a(i) = b(i) + c
> +  end do
> +  !$acc end parallel loop
> +
> +  !$acc end data
> +
> +  call validate
> +
> +  ! Test reduction, private
> +
> +  j = 0
> +
> +  !$acc parallel private(i) copy(j)
> +  !$acc loop reduction(+:j)
> +  do i = 1, n
> +     j = j + 1
> +  end do
> +  !$acc end parallel
> +
> +  if (j .ne. n) call abort
> +
> +  ! Test firstprivate, copy
> +
> +  a(:) = 0
> +  c = j
> +
> +  !$acc parallel loop firstprivate(c) copyout(a)
> +  do i = 1, n
> +     a(i) = i + c
> +  end do
> +  !$acc end parallel loop
> +
> +  call validate
> +end program main
> diff --git a/libgomp/testsuite/libgomp.oacc-fortran/common-block-3.f90 b/libgomp/testsuite/libgomp.oacc-fortran/common-block-3.f90
> new file mode 100644
> index 00000000000..90448d2da72
> --- /dev/null
> +++ b/libgomp/testsuite/libgomp.oacc-fortran/common-block-3.f90
> @@ -0,0 +1,137 @@
> +! Test data located inside common blocks.  This test does not exercise
> +! ACC DECLARE.  Most of the data clauses are implicit.
> +
> +module consts
> +  integer, parameter :: n = 100
> +end module consts
> +
> +subroutine validate
> +  use consts
> +
> +  implicit none
> +  integer i, j
> +  real*4 x(n), y(n), z
> +  common /BLOCK/ x, y, z, j
> +
> +  do i = 1, n
> +     if (abs(x(i) - i - z) .ge. 0.0001) call abort
> +  end do
> +end subroutine validate
> +
> +subroutine incr_parallel
> +  use consts
> +
> +  implicit none
> +  integer i, j
> +  real*4 x(n), y(n), z
> +  common /BLOCK/ x, y, z, j
> +
> +  !$acc parallel loop
> +  do i = 1, n
> +     x(i) = x(i) + z
> +  end do
> +  !$acc end parallel loop
> +end subroutine incr_parallel
> +
> +subroutine incr_kernels
> +  use consts
> +
> +  implicit none
> +  integer i, j
> +  real*4 x(n), y(n), z
> +  common /BLOCK/ x, y, z, j
> +
> +  !$acc kernels
> +  do i = 1, n
> +     x(i) = x(i) + z
> +  end do
> +  !$acc end kernels
> +end subroutine incr_kernels
> +
> +program main
> +  use consts
> +
> +  implicit none
> +  integer i, j
> +  real*4 a(n), b(n), c
> +  common /BLOCK/ a, b, c, j
> +
> +  !$acc data copyout(a, c)
> +
> +  c = 1.0
> +
> +  !$acc update device(c)
> +
> +  !$acc parallel loop
> +  do i = 1, n
> +     a(i) = i
> +  end do
> +  !$acc end parallel loop
> +
> +  call incr_parallel
> +  call incr_parallel
> +  call incr_parallel
> +  !$acc end data
> +
> +  c = 3.0
> +  call validate
> +
> +  ! Test pcopy without copyout
> +
> +  c = 2.0
> +  call incr_kernels
> +  c = 5.0
> +  call validate
> +
> +  !$acc kernels
> +  do i = 1, n
> +     b(i) = i
> +  end do
> +  !$acc end kernels
> +
> +  !$acc parallel loop
> +  do i = 1, n
> +     a(i) = b(i) + c
> +  end do
> +  !$acc end parallel loop
> +
> +  call validate
> +
> +  a(:) = b(:)
> +  c = 0.0
> +  call validate
> +
> +  ! Test copy
> +
> +  c = 1.0
> +  !$acc parallel loop
> +  do i = 1, n
> +     a(i) = b(i) + c
> +  end do
> +  !$acc end parallel loop
> +
> +  call validate
> +
> +  c = 2.0
> +  !$acc data copyin(b, c) copyout(a)
> +
> +  !$acc kernels
> +  do i = 1, n
> +     a(i) = b(i) + c
> +  end do
> +  !$acc end kernels
> +
> +  !$acc end data
> +
> +  call validate
> +
> +  j = 0
> +
> +  !$acc parallel loop reduction(+:j)
> +  do i = 1, n
> +     j = j + 1
> +  end do
> +  !$acc end parallel loop
> +
> +  if (j .ne. n) call abort
> +end program main
Tobias Burnus Oct. 23, 2019, 8:34 p.m. UTC | #2
Hi Thomas,

Updated version attached. Changes:
* Use "true" instead of "openacc" for the OpenACC-only "copy()" clause 
(as not shared w/ OpenMP)
* Add some documentation to gimplify.c
* Use GOVD_FIRSTPRIVATE also for "kernel"

The patch survived bootstrapping + regtesting on my laptop (no 
offloading) and on a build server (with nvptx offloading).

On 10/18/19 3:26 PM, Thomas Schwinge wrote:
> I'll be quick to note that I don't have any first-hand experience with 
> Fortran common blocks. :-P 

To quote you from below: "So, please do study that closer. ;-P"

I also do not have first-hand experience (as I started with Fortran 95 + 
some of F2003), but common blocks were a nice idea of the early 1960 to 
provide access to global memory, avoiding to pass all data as arguments 
(which also has stack issues). They have been replaced by derived types 
and variables declared at module level since Fortran 90. See 
https://j3-fortran.org/doc/year/18/18-007r1.pdf or 
https://web.stanford.edu/class/me200c/tutorial_77/13_common.html


On 10/18/19 3:26 PM, Thomas Schwinge wrote:
>> For OpenACC, gfortran already supports common blocks for device_resident/usedevice/cache/flush/link.
>> […] [Of those, only "copy()" is also an OpenMP clause name.]
> I'm confused: in […] "OpenMP doesn't have a copy clause, so I'd expect true here":

I concur – only "copyin" and "copyprivate" exist in OpenMP. (But thanks 
to "if (openacc)" no "openacc" is needed, either.)


> I'll defer to your judgement there, but just one comment: I noticed 
> that OpenACC 2.7 in 2.7. "Data Clauses" states that "For all clauses 
> except 'deviceptr' and 'present', the list argument may include a 
> Fortran_common block_ name enclosed within slashes, if that _common 
> block_ name also appears in a 'declare' directive 'link' clause".
>
> Are we already properly handling the aspect that requires that the 
> "that _common block_ name also appears in a 'declare' directive 'link' 
> clause"? 

I don't know neither the OpenACC spec nor the GCC implementation well 
enough to claim proper (!) handling. However, as stated above: 
device_resident/usedevice/cache/flush/link do support common block 
arguments.

Looking at the testsuite, link and device_resident are tested in 
gfortran.dg/goacc/declare-2.f95. (list.f95 and reduction.f95 also use 
come common blocks.) – And gfortran.dg/goacc/common-block-1.f90 has been 
added.


> The libgomp execution test cases you're adding all state that "This test does not exercise ACC DECLARE", yet they supposedly already do work fine. Or am I understading the OpenACC specification wrongly here?

You need to ask Cesar, who wrote the test case and that comment, why he 
added it.

The patch does not touch 'link'/'device_resident' clauses of 'declare', 
hence, I think he didn't see a reason to add a run-time test case for 
it. – That's independent from whether it is supported by the OpenACC 
spec and whether it is "properly" implemented in GCC/gfortran.

> I'm certainly aware of (big) deficiencies in the OpenACC 'declare' handling so I guess my question here may be whether these test cases are valid after all?

Well, you are the OpenACC specialist – both spec wise and 
GCC-implementation wise. However, as the test cases are currently 
parsing-only test cases, I think they should be fine.


>> gcc/gimplify.c: oacc_default_clause contains some changes; there are additionally two lines which only differ for ORT_ACC – Hence, it is an OpenACC-only change!
>> The ME change is about privatizing common blocks (I haven't studied this part closer.)
> So, please do study that closer.  ;-P
>
> In<http://mid.mail-archive.com/87efx6haep.fsf@euler.schwinge.homeip.net>
> I raised some questions, got a bit of an answer, and in
> <http://mid.mail-archive.com/87bms85kra.fsf@hertz.schwinge.homeip.net>
> asked further, didn't get an answer.
>
> All the rationale from Cesar's original submission email should be
> transferred into 'gcc/gimplify.c' as much as feasible, to make that
> "voodoo code" better understandable.


I have now added some comments to the patch. I also changed GOVD_MAP to 
GOVD_FIRSTPRIVATE for "acc kernels" to match "acc parallel"; I think 
that makes sense in terms of what Cesar has written – but I am not 
completely sure about this.

Cross ref: The original email is 
https://gcc.gnu.org/ml/gcc-patches/2016-09/msg00950.html ; the review 
starts here https://gcc.gnu.org/ml/gcc-patches/2017-04/msg00250.html 
(same email as mid.mail-archive.com link above).

BTW: That patch – rediffed for OG9 and augmented by several other 
patches (including deviceptr) – was then submitted at 
https://gcc.gnu.org/ml/gcc-patches/2018-06/msg01911.html and first 
reviewed at https://gcc.gnu.org/ml/gcc-patches/2018-12/msg00176.html and 
then committed to OG9 at 
https://gcc.gnu.org/ml/gcc-patches/2019-01/msg00051.html


>> Due to the wonders of GIT – when not requiring linear history and due to rebasing with GCC9, it is also part of the OG9 commit ac6c90812344f4f4cfe4d2f5901c1a9d038a4000
> There's no Git magic involved there: somebody just (manually) merged
> several these patches together into one, for no good reason.  ;-\

Well, there is more. If you do not enforce linear history, you cannot 
easily say to git: Give me all changes between this commit and that 
commit – as they pass by in a sneak path. And by default, GIT merges 
such that the private version is the "main" branch – and one merges the 
other branch ("upstream") into the own branch. This can quickly become 
quite confusing.

In particular, it is not easy to see when/why some code disappeared. You 
have some patch – someone else had merge problems, accidentally removed 
it and then if you diff or do "log -p", it looks as if the code was 
never there, unless you explicitly dig into the branch whose commits 
were merged into the "main" branch.

Tobias

PS: I am a great fan of patch submissions by the authors – it avoids 
later digging and guess work for reasons why someone else wrote 
something in a particular way.
Thomas Schwinge Oct. 25, 2019, 8:43 a.m. UTC | #3
Hi Tobias!

On 2019-10-23T22:34:42+0200, Tobias Burnus <tobias@codesourcery.com> wrote:
> Updated version attached. Changes:
> * Use "true" instead of "openacc" for the OpenACC-only "copy()" clause 
> (as not shared w/ OpenMP)
> * Add some documentation to gimplify.c
> * Use GOVD_FIRSTPRIVATE also for "kernel"

Thanks!

> The patch survived bootstrapping + regtesting on my laptop (no 
> offloading) and on a build server (with nvptx offloading).

OK for trunk, with the following few small items considered.  To record
the review effort, please include "Reviewed-by: Thomas Schwinge
<thomas@codesourcery.com>" in the commit log, see
<https://gcc.gnu.org/wiki/Reviewed-by>.


> On 10/18/19 3:26 PM, Thomas Schwinge wrote:
>> I'll be quick to note that I don't have any first-hand experience with 
>> Fortran common blocks. :-P 
>
> To quote you from below: "So, please do study that closer. ;-P"

Haha!  ;-P (You don't want to know how long is my list of items that I
might/want/could look into...)

> I also do not have first-hand experience (as I started with Fortran 95 + 
> some of F2003), but common blocks were a nice idea of the early 1960 to 
> provide access to global memory, avoiding to pass all data as arguments 
> (which also has stack issues). They have been replaced by derived types 
> and variables declared at module level since Fortran 90. See 
> https://j3-fortran.org/doc/year/18/18-007r1.pdf or 
> https://web.stanford.edu/class/me200c/tutorial_77/13_common.html

..., and didn't "They have been replaced by [...]" go as far as that
they've actually been deprecated in recent Fortran standard revisions?
(I may be misremembering.)


Anyway:

> On 10/18/19 3:26 PM, Thomas Schwinge wrote:
>>> For OpenACC, gfortran already supports common blocks for device_resident/usedevice/cache/flush/link.

>> I'll defer to your judgement there, but just one comment: I noticed 
>> that OpenACC 2.7 in 2.7. "Data Clauses" states that "For all clauses 
>> except 'deviceptr' and 'present', the list argument may include a 
>> Fortran_common block_ name enclosed within slashes, if that _common 
>> block_ name also appears in a 'declare' directive 'link' clause".
>>
>> Are we already properly handling the aspect that requires that the 
>> "that _common block_ name also appears in a 'declare' directive 'link' 
>> clause"? 
>
> I don't know neither the OpenACC spec nor the GCC implementation well 
> enough to claim proper (!) handling. However, as stated above: 
> device_resident/usedevice/cache/flush/link do support common block 
> arguments.

(... in the front end at least.)

> Looking at the testsuite, link and device_resident are tested in 
> gfortran.dg/goacc/declare-2.f95. (list.f95 and reduction.f95 also use 
> come common blocks.) – And gfortran.dg/goacc/common-block-1.f90 has been 
> added.

(..., again, that'S all front end testing, so not sufficient to claim it
actually works for executing user code.)  ;-\

>> The libgomp execution test cases you're adding all state that "This test does not exercise ACC DECLARE", yet they supposedly already do work fine. Or am I understading the OpenACC specification wrongly here?
>
> You need to ask Cesar, who wrote the test case and that comment, why he 
> added it.

Well, Cesar is not working on GCC anymore, thus you've been asked to
adopt his patch, and fix it up, change it as necessary.

> The patch does not touch 'link'/'device_resident' clauses of 'declare', 
> hence, I think he didn't see a reason to add a run-time test case for 
> it.

(Or such testing didn't work, but there was no time/interest at that
point to make it work.)

> – That's independent from whether it is supported by the OpenACC 
> spec and whether it is "properly" implemented in GCC/gfortran.
>
>> I'm certainly aware of (big) deficiencies in the OpenACC 'declare' handling so I guess my question here may be whether these test cases are valid after all?
>
> Well, you are the OpenACC specialist – both spec wise and 
> GCC-implementation wise.

Sure, I do know some things, but I'm certainly not all-knowing -- that's
why I needed you to look into this in more detail.

> However, as the test cases are currently 
> parsing-only test cases, I think they should be fine.

OK, and everything else we're thus delaying for later.  That's OK -- what
we got here now is certainly an improvement on its own.  I just wanted to
make sure that we're not missing something obvious.


>>> gcc/gimplify.c: oacc_default_clause contains some changes; there are additionally two lines which only differ for ORT_ACC – Hence, it is an OpenACC-only change!
>>> The ME change is about privatizing common blocks (I haven't studied this part closer.)
>> So, please do study that closer.  ;-P
>>
>> In<http://mid.mail-archive.com/87efx6haep.fsf@euler.schwinge.homeip.net>
>> I raised some questions, got a bit of an answer, and in
>> <http://mid.mail-archive.com/87bms85kra.fsf@hertz.schwinge.homeip.net>
>> asked further, didn't get an answer.

By the way, in the mean time I also found the original GCC trunk
submission email:
<http://mid.mail-archive.com/a028d039-7e9e-792a-7424-ccab1bb425f4@codesourcery.com>.
(Mentioning that just in case that carries any additional information for
you.)


>> All the rationale from Cesar's original submission email should be
>> transferred into 'gcc/gimplify.c' as much as feasible, to make that
>> "voodoo code" better understandable.
>
> I have now added some comments to the patch.

Thanks.


> I also changed GOVD_MAP to 
> GOVD_FIRSTPRIVATE for "acc kernels" to match "acc parallel"; I think 
> that makes sense in terms of what Cesar has written – but I am not 
> completely sure about this.

OK.  Given that this "abstractly" seems to make sense to both of us,
let's do it that way.

Or, would it be easy to add an OpenACC 'kernels' test case that otherwise
faild (at run time, say, with aforementioned duplicate mapping errors, or
would contain "strange"/duplicate/conflicting mapping items in the
'-fdump-tree-gimple' dump)?


>>> Due to the wonders of GIT – when not requiring linear history and due to rebasing with GCC9, it is also part of the OG9 commit ac6c90812344f4f4cfe4d2f5901c1a9d038a4000
>> There's no Git magic involved there: somebody just (manually) merged
>> several these patches together into one, for no good reason.  ;-\
>
> Well, there is more. If you do not enforce linear history, you cannot 
> easily say to git: Give me all changes between this commit and that 
> commit – as they pass by in a sneak path. And by default, GIT merges 
> such that the private version is the "main" branch – and one merges the 
> other branch ("upstream") into the own branch. This can quickly become 
> quite confusing.

I'm not sure I'm following your argument there.

> In particular, it is not easy to see when/why some code disappeared. You 
> have some patch – someone else had merge problems, accidentally removed 
> it and then if you diff or do "log -p", it looks as if the code was 
> never there, unless you explicitly dig into the branch whose commits 
> were merged into the "main" branch.

There are "Diff Formatting" options like '-c', '--cc', '-m', '-t' which
may help.

Anyway, that's certainly not related to Fortran common block support.


> PS: I am a great fan of patch submissions by the authors – it avoids 
> later digging and guess work for reasons why someone else wrote 
> something in a particular way.

Absolutely agreed!

On the other hand, what you've now done, re-engineering the original
rationale etc., makes my review much easier, because that then gives
greater confidence in the changes, I can then trust that you didn't just
copy the original patch, but instead spent your own time thinking it
through.


> --- a/gcc/gimplify.c
> +++ b/gcc/gimplify.c
> @@ -7219,15 +7219,28 @@ oacc_default_clause (struct gimplify_omp_ctx *ctx, tree decl, unsigned flags)
>  {
>    const char *rkind;
>    bool on_device = false;
> +  bool is_private = false;
>    bool declared = is_oacc_declared (decl);
>    tree type = TREE_TYPE (decl);
>  
>    if (lang_hooks.decls.omp_privatize_by_reference (decl))
>      type = TREE_TYPE (type);
>  
> +  /* For Fortran COMMON blocks, only used variables in those blocks are
> +     transfered and remapped.  The block itself will have a private clause to
> +     avoid transfering the data twice.
> +     The hook evaluates to false by default.  For a variable in Fortran's COMMON
> +     or EQUIVALENCE block, returns 'true' (as we have shared=false) - as only
> +     the variables in such a COMMON/EQUIVALENCE block shall be privatized not
> +     the whole block.  For C++ and Fortran, it can also be true under certain
> +     other conditions, if DECL_HAS_VALUE_EXPR.  */
> +  if (RECORD_OR_UNION_TYPE_P (type))
> +    is_private = lang_hooks.decls.omp_disregard_value_expr (decl, false);
> +
>    if ((ctx->region_type & (ORT_ACC_PARALLEL | ORT_ACC_KERNELS)) != 0
>        && is_global_var (decl)
> -      && device_resident_p (decl))
> +      && device_resident_p (decl)
> +      && !is_private)
>      {
>        on_device = true;
>        flags |= GOVD_MAP_TO_ONLY;
> @@ -7238,7 +7251,9 @@ oacc_default_clause (struct gimplify_omp_ctx *ctx, tree decl, unsigned flags)
>      case ORT_ACC_KERNELS:
>        rkind = "kernels";
>  
> -      if (AGGREGATE_TYPE_P (type))
> +      if (is_private)
> +	flags |= GOVD_FIRSTPRIVATE;
> +      else if (AGGREGATE_TYPE_P (type))
>  	{
>  	  /* Aggregates default to 'present_or_copy', or 'present'.  */
>  	  if (ctx->default_kind != OMP_CLAUSE_DEFAULT_PRESENT)
> @@ -7255,7 +7270,9 @@ oacc_default_clause (struct gimplify_omp_ctx *ctx, tree decl, unsigned flags)
>      case ORT_ACC_PARALLEL:
>        rkind = "parallel";
>  
> -      if (on_device || declared)
> +      if (is_private)
> +	flags |= GOVD_FIRSTPRIVATE;
> +      else if (on_device || declared)
>  	flags |= GOVD_MAP;
>        else if (AGGREGATE_TYPE_P (type))
>  	{
> @@ -7321,7 +7338,11 @@ omp_notice_variable (struct gimplify_omp_ctx *ctx, tree decl, bool in_code)
|        if (DECL_HAS_VALUE_EXPR_P (decl))
>  	{
>  	  tree value = get_base_address (DECL_VALUE_EXPR (decl));
>  
> -	  if (value && DECL_P (value) && DECL_THREAD_LOCAL_P (value))
> +	  /* For OpenACC, defer expansion of value to avoid transfering
> +	     privatized common block data instead of im-/explicitly transfered
> +	     variables which are in common blocks.  */
> +	  if (!(ctx->region_type & ORT_ACC)
> +	      && value && DECL_P (value) && DECL_THREAD_LOCAL_P (value))
>  	    return omp_notice_threadprivate_variable (ctx, decl, value);
>  	}

Wouldn't it be clearer if that latter one were written as follows:

    if (DECL_HAS_VALUE_EXPR_P (decl))
      {
        if (ctx->region_type & ORT_ACC)
          /* For OpenACC, defer expansion of value to avoid transfering
             privatized common block data instead of im-/explicitly transfered
             variables which are in common blocks.  */
          ;
        else
          {
            tree value = get_base_address (DECL_VALUE_EXPR (decl));
    
            if (value && DECL_P (value) && DECL_THREAD_LOCAL_P (value))
              return omp_notice_threadprivate_variable (ctx, decl, value);
          }
      }

> @@ -7353,7 +7374,9 @@ omp_notice_variable (struct gimplify_omp_ctx *ctx, tree decl, bool in_code)
>    n = splay_tree_lookup (ctx->variables, (splay_tree_key)decl);
>    if ((ctx->region_type & ORT_TARGET) != 0)
>      {
> -      ret = lang_hooks.decls.omp_disregard_value_expr (decl, true);
> +      /* For OpenACC, as remarked above, defer expansion.  */
> +      shared = !(ctx->region_type & ORT_ACC);
> +      ret = lang_hooks.decls.omp_disregard_value_expr (decl, shared);

Also more explicit, easier to read:

    if (ctx->region_type & ORT_ACC)
      /* For OpenACC, as remarked above, defer expansion.  */
      shared = false;
    else
      shared = true;

> @@ -7521,6 +7544,9 @@ omp_notice_variable (struct gimplify_omp_ctx *ctx, tree decl, bool in_code)
>      }
>  
>    shared = ((flags | n->value) & GOVD_SHARED) != 0;
> +  /* For OpenACC, cf. remark above regaring common blocks.  */
> +  if (ctx->region_type & ORT_ACC)
> +    shared = false;
>    ret = lang_hooks.decls.omp_disregard_value_expr (decl, shared);

And again:

    if (ctx->region_type & ORT_ACC)
      /* For OpenACC, cf. remark above regaring common blocks.  */
      shared = false;
    else
      shared = ((flags | n->value) & GOVD_SHARED) != 0;

(In all three cases, using an easy 'if (ctx->region_type & ORT_ACC)' to
point out the special case.)

It's still some kind of voodoo to me -- but at least, you've now also
reviewed this, and it's now documented what's going on.


> --- /dev/null
> +++ b/gcc/testsuite/gfortran.dg/goacc/common-block-1.f90

> @@ -0,0 +1,69 @@
> +! Test data clauses involving common blocks and common block data.
> +! Specifically, validates early matching errors.
> +
> +subroutine subtest
> +  implicit none
> +  integer, parameter :: n = 10
> +  integer a(n), b(n), c, d(n), e
> +  real*4 x(n), y(n), z, w(n), v
> +  common /blockA/ a, c, x
> +  common /blockB/ b, y, z
> +  !$acc declare link(/blockA/, /blockB/, e, v)
> +end subroutine subtest
> +
> +program test
> +  implicit none
> +  integer, parameter :: n = 10
> +  integer a(n), b(n), c, d(n), e
> +  real*4 x(n), y(n), z, w(n), v
> +  common /blockA/ a, c, x
> +  common /blockB/ b, y, z
> +  !$acc declare link(/blockA/, /blockB/, e, v)
> +
> +  !$acc data copy(/blockA/, /blockB/, e, v)
> +  !$acc end data
> +
> +  !$acc data copyin(/blockA/, /blockB/, e, v)
> +  !$acc end data
> +
> +  !$acc data copyout(/blockA/, /blockB/, e, v)
> +  !$acc end data
> +
> +  !$acc data create(/blockA/, /blockB/, e, v)
> +  !$acc end data
> +
> +  !$acc data copyout(/blockA/, /blockB/, e, v)
> +  !$acc end data
> +
> +  !$acc data pcopy(/blockA/, /blockB/, e, v)
> +  !$acc end data
> +
> +  !$acc data pcopyin(/blockA/, /blockB/, e, v)
> +  !$acc end data
> +
> +  !$acc data pcopyout(/blockA/, /blockB/, e, v)
> +  !$acc end data
> +
> +  !$acc data pcreate(/blockA/, /blockB/, e, v)
> +  !$acc end data
> +
> +  !$acc data pcopyout(/blockA/, /blockB/, e, v)
> +  !$acc end data
> +
> +  !$acc data present(/blockA/, /blockB/, e, v) ! { dg-error "Syntax error in OpenMP variable list" }
> +  !$acc end data ! { dg-error "Unexpected ..ACC END DATA statement" }
> +
> +  !$acc parallel private(/blockA/, /blockB/, e, v)
> +  !$acc end parallel
> +
> +  !$acc parallel firstprivate(/blockA/, /blockB/, e, v)
> +  !$acc end parallel
> +
> +  !$acc exit data delete(/blockA/, /blockB/, e, v)

I note there is one single 'exit data' test, but no 'enter data'.

Also, 'update' is missing, to test the 'device' and 'self'/'host' clauses.

> +  !$acc data deviceptr(/blockA/, /blockB/, e, v) ! { dg-error "Syntax error in OpenMP variable list" }
> +  !$acc end data ! { dg-error "Unexpected ..ACC END DATA statement" }
> +
> +  !$acc data deviceptr(/blockA/, /blockB/, e, v) ! { dg-error "Syntax error in OpenMP variable list" }
> +  !$acc end data ! { dg-error "Unexpected ..ACC END DATA statement" }
> +end program test

Is there a reason for the duplicated 'deviceptr' testing?

Move 'data deviceptr' up a little bit, next to the other 'data' construct
testing?

> --- /dev/null
> +++ b/gcc/testsuite/gfortran.dg/goacc/common-block-2.f90

Similarly.


Grüße
 Thomas
Tobias Burnus Oct. 25, 2019, 2:36 p.m. UTC | #4
Hi Thomas,

On 10/25/19 10:43 AM, Thomas Schwinge wrote:
> OK for trunk, with the following few small items considered.

Committed as Rev. 277451 – after a fresh bootstrap and regtesting.

Changes:
* I have now a new test case 
libgomp/testsuite/libgomp.oacc-fortran/common-block-3.f90 which looks at 
omplower.
* In the compile-time *{2,3} test case, there is now also a 'enter data' 
and 'update host/self/device' test.
* the libgomp tests have a 'dg-do run'.
* I modified the code in gimplify.c as proposed.


Regarding the new test case: Without the gcc/gimplify.c changes, one has 
(see last item before child fn):

     #pragma omp target oacc_parallel map(tofrom:a [len: 400]) 
map(tofrom:b [len: 400]) map(tofrom:c [len: 4]) map(tofrom:block [len: 
812]) [child fn …
     #pragma omp target oacc_kernels map(force_tofrom:i [len: 4]) 
map(tofrom:y [len: 400]) map(tofrom:x [len: 400]) 
map(tofrom:kernel_block [len: 804]) map(force_tofrom:c [len: 4]) 
map(tofrom:block [len: 812])  [child fn …

With the changes of gcc/gimplify.c, one has:

     #pragma omp target oacc_parallel map(tofrom:a [len: 400]) 
map(tofrom:b [len: 400]) map(tofrom:c [len: 4]) [child fn …
     #pragma omp target oacc_kernels map(force_tofrom:i [len: 4]) 
map(tofrom:y [len: 400]) map(tofrom:x [len: 400]) map(force_tofrom:c 
[len: 4])  [child fn …


And without gimplify.c, the added run-tests indeed fail with:
libgomp: Trying to map into device [0x407100..0x407294) object when 
[0x407100..0x407290) is already mapped


Tobias

PS:
> Or, would it be easy to add an OpenACC 'kernels' test case that otherwise
> faild (at run time, say, with aforementioned duplicate mapping errors, or
> would contain "strange"/duplicate/conflicting mapping items in the
> '-fdump-tree-gimple' dump)?

See new test case and result for the current tests.

Additionally, I have applied:

> Wouldn't it be clearer if that latter one were written as follows:
>      if (DECL_HAS_VALUE_EXPR_P (decl))
>        {
>          if (ctx->region_type & ORT_ACC)
>            /* For OpenACC, defer expansion of value to avoid transfering
>               privatized common block data instead of im-/explicitly transfered
>               variables which are in common blocks.  */
>            ;
>          else
>            {
>              tree value = get_base_address (DECL_VALUE_EXPR (decl));
>      
>              if (value && DECL_P (value) && DECL_THREAD_LOCAL_P (value))
>                return omp_notice_threadprivate_variable (ctx, decl, value);
>            }
>        }
>
>> @@ -7353,7 +7374,9 @@ omp_notice_variable (struct gimplify_omp_ctx *ctx, tree decl, bool in_code)
>>     n = splay_tree_lookup (ctx->variables, (splay_tree_key)decl);
>>     if ((ctx->region_type & ORT_TARGET) != 0)
>>       {
>> -      ret = lang_hooks.decls.omp_disregard_value_expr (decl, true);
>> +      /* For OpenACC, as remarked above, defer expansion.  */
>> +      shared = !(ctx->region_type & ORT_ACC);
>> +      ret = lang_hooks.decls.omp_disregard_value_expr (decl, shared);
> Also more explicit, easier to read:
>
>      if (ctx->region_type & ORT_ACC)
>        /* For OpenACC, as remarked above, defer expansion.  */
>        shared = false;
>      else
>        shared = true;
>
>> @@ -7521,6 +7544,9 @@ omp_notice_variable (struct gimplify_omp_ctx *ctx, tree decl, bool in_code)
>>       }
>>   
>>     shared = ((flags | n->value) & GOVD_SHARED) != 0;
>> +  /* For OpenACC, cf. remark above regaring common blocks.  */
>> +  if (ctx->region_type & ORT_ACC)
>> +    shared = false;
>>     ret = lang_hooks.decls.omp_disregard_value_expr (decl, shared);
> And again:
>
>      if (ctx->region_type & ORT_ACC)
>        /* For OpenACC, cf. remark above regaring common blocks.  */
>        shared = false;
>      else
>        shared = ((flags | n->value) & GOVD_SHARED) != 0;
>
> (In all three cases, using an easy 'if (ctx->region_type & ORT_ACC)' to
> point out the special case.)
>
> It's still some kind of voodoo to me -- but at least, you've now also
> reviewed this, and it's now documented what's going on.


And changed the test case based on:

>> +  !$acc exit data delete(/blockA/, /blockB/, e, v)
> I note there is one single 'exit data' test, but no 'enter data'.
>
> Also, 'update' is missing, to test the 'device' and 'self'/'host' clauses.
>
>> +  !$acc data deviceptr(/blockA/, /blockB/, e, v) ! { dg-error "Syntax error in OpenMP variable list" }
>> +  !$acc end data ! { dg-error "Unexpected ..ACC END DATA statement" }
>> +
>> +  !$acc data deviceptr(/blockA/, /blockB/, e, v) ! { dg-error "Syntax error in OpenMP variable list" }
>> +  !$acc end data ! { dg-error "Unexpected ..ACC END DATA statement" }
>> +end program test
> Is there a reason for the duplicated 'deviceptr' testing?
>
> Move 'data deviceptr' up a little bit, next to the other 'data' construct
> testing?
>
>> --- /dev/null
>> +++ b/gcc/testsuite/gfortran.dg/goacc/common-block-2.f90
> Similarly.
Thomas Schwinge Nov. 11, 2019, 9:39 a.m. UTC | #5
Hi Tobias!

By the way, do you know what's the status is for Fortran common blocks in
OpenMP: supported vs. expected per the specification?


On 2019-10-25T16:36:10+0200, Tobias Burnus <tobias@codesourcery.com> wrote:
> On 10/25/19 10:43 AM, Thomas Schwinge wrote:
>> Or, would it be easy to add an OpenACC 'kernels' test case that otherwise
>> faild (at run time, say, with aforementioned duplicate mapping errors, or
>> would contain "strange"/duplicate/conflicting mapping items in the
>> '-fdump-tree-gimple' dump)?

> * I have now a new test case 
> libgomp/testsuite/libgomp.oacc-fortran/common-block-3.f90 which looks at 
> omplower.

Thanks.

Curious: why 'omplower' instead of 'gimple' dump?


> Regarding the new test case: Without the gcc/gimplify.c changes, one has 
> (see last item before child fn):
>
>      #pragma omp target oacc_parallel map(tofrom:a [len: 400]) 
> map(tofrom:b [len: 400]) map(tofrom:c [len: 4]) map(tofrom:block [len: 
> 812]) [child fn …
>      #pragma omp target oacc_kernels map(force_tofrom:i [len: 4]) 
> map(tofrom:y [len: 400]) map(tofrom:x [len: 400]) 
> map(tofrom:kernel_block [len: 804]) map(force_tofrom:c [len: 4]) 
> map(tofrom:block [len: 812])  [child fn …
>
> With the changes of gcc/gimplify.c, one has:
>
>      #pragma omp target oacc_parallel map(tofrom:a [len: 400]) 
> map(tofrom:b [len: 400]) map(tofrom:c [len: 4]) [child fn …
>      #pragma omp target oacc_kernels map(force_tofrom:i [len: 4]) 
> map(tofrom:y [len: 400]) map(tofrom:x [len: 400]) map(force_tofrom:c 
> [len: 4])  [child fn …
>
>
> And without gimplify.c, the added run-tests indeed fail with:
> libgomp: Trying to map into device [0x407100..0x407294) object when 
> [0x407100..0x407290) is already mapped

OK, good, my suspicion was thus right that there's something "strange"
there.  ;-)

> --- /dev/null
> +++ b/gcc/testsuite/gfortran.dg/goacc/common-block-3.f90
> @@ -0,0 +1,39 @@
> +! { dg-options "-fopenacc -fdump-tree-omplower" }

(For later: we usually just use 'dg-additional-options
"-fdump-tree-omplower"'; '-fopenacc' is implied inside '*/goacc/'.)

> +
> +module consts
> +  integer, parameter :: n = 100
> +end module consts
> +
> +program main
> +  use consts
> +  implicit none
> +
> +  integer :: i, j
> +  real ::  a(n) = 0, b(n) = 0, c, d
> +  real ::  x(n) = 0, y(n), z
> +  common /BLOCK/ a, b, c, j, d
> +  common /KERNELS_BLOCK/ x, y, z
> +
> +  c = 1.0
> +  !$acc parallel loop copy(/BLOCK/)
> +  do i = 1, n
> +     a(i) = b(i) + c
> +  end do
> +  !$acc kernels
> +  do i = 1, n
> +     x(i) = y(i) + c
> +  end do
> +  !$acc end kernels
> +end program main
> +
> +! { dg-final { scan-tree-dump-times "omp target oacc_parallel .*map\\(tofrom:a \\\[len: 400\\\]\\)" 1 "omplower" } }
> +! { dg-final { scan-tree-dump-times "omp target oacc_parallel .*map\\(tofrom:b \\\[len: 400\\\]\\\)" 1 "omplower" } }
> +! { dg-final { scan-tree-dump-times "omp target oacc_parallel .*map\\(tofrom:c \\\[len: 4\\\]\\)" 1 "omplower" } }
> +
> +! { dg-final { scan-tree-dump-times "omp target oacc_kernels .*map\\(force_tofrom:i \\\[len: 4\\\]\\)" 1 "omplower" } }
> +! { dg-final { scan-tree-dump-times "omp target oacc_kernels .*map\\(tofrom:x \\\[len: 400\\\]\\)" 1 "omplower" } }
> +! { dg-final { scan-tree-dump-times "omp target oacc_kernels .*map\\(tofrom:y \\\[len: 400\\\]\\\)" 1 "omplower" } }
> +! { dg-final { scan-tree-dump-times "omp target oacc_kernels .*map\\(force_tofrom:c \\\[len: 4\\\]\\)" 1 "omplower" } }
> +
> +! { dg-final { scan-tree-dump-not "map\\(.*:block\\)" "omplower" } }
> +! { dg-final { scan-tree-dump-not "map\\(.*:kernels_block\\)" "omplower" } }

For my understanding: the several unused variables in the common blocks
are to make sure that they don't cause any issues, don't get mapped at
all?

I we were to add to 'gfortran.dg/goacc/common-block-3.f90' a test case
for the upcoming OpenACC 'serial' construct (which basically equals the
OpenACC 'parallel' construct), would we copy/adapt the 'parallel' 'BLOCK'
test case, or add a new, separate common block?

Or, asking the other way round: why aren't in the current test case,
'parallel' and 'kernels' using the same common block, and both explicitly
'copy' the common block vs. not do that?


> * In the compile-time *{2,3} test case, there is now also a 'enter data' 
> and 'update host/self/device' test.

;-) Heh, 'update' got inside the 'parallel' region:

> --- /dev/null
> +++ b/gcc/testsuite/gfortran.dg/goacc/common-block-1.f90
> @@ -0,0 +1,74 @@
> +[...]
> +  !$acc parallel firstprivate(/blockA/, /blockB/, e, v)
> +  !$acc update device(/blockA/)
> +  !$acc update self(/blockB/, v)
> +  !$acc update host(/blockA/, e, /blockB/)
> +  !$acc end parallel
> +[...]

> --- /dev/null
> +++ b/gcc/testsuite/gfortran.dg/goacc/common-block-2.f90

Likewise.

As obvoius; see attached, committed "Fix OpenACC directives nesting in
'gfortran.dg/goacc/common-block-1.f90',
'gfortran.dg/goacc/common-block-2.f90'" to trunk in r278047.


Grüße
 Thomas
Tobias Burnus Nov. 25, 2019, 2:02 p.m. UTC | #6
Hi Thomas,

sorry for the belated reply.

Some comments – and a patch modifying two test cases (see below).
Regarding the patch: OK for the trunk?

On 11/11/19 10:39 AM, Thomas Schwinge wrote:
> By the way, do you know what's the status is for Fortran common blocks in
> OpenMP: supported vs. expected per the specification?

No; however, I had a quick glance at the spec and at the test cases; 
both compile-time and run-time test have some coverage, although I 
didn't spot a run-time test for one 'omp target'.

Definition (3.32.1 in F2018): "blank common" = "unnamed common block". 
'common /name/ x" (continues) define the common block named "name" by 
adding 'x' to it. While "common // y" or "common y" appends 'y' to the 
blank common.

In OpenMP 5, common blocks appear twice – once [2.1, p.39, ll.11ff.] as 
general rule in the definition of "list item" (which are inherited by 
"extended list item" and "locator-list item"). [There are also some 
constraints and notes regarding common blocks)]. It does not really tell 
whether blank commons are permitted or not; some description is 
explicitly for named-common variables, leaving blank-common ones out 
(and undefined). But later sections explicitly make reference to blank 
commons, hence, one can assume they are permitted unless explicitly 
stated that they are not.

And then very selectively for some items:
* allocate – only with default allocator.
* declare target – some restrictions and no blank commons
* depend clause – no common permitted
* threadprivate – some notes and explanation of the syntax (why?)
   also only here requirement regarding common blocks with bind(c)
   (why not also for declare target?)
* linear clause – no common permitted
* copyin – some notes
* copyprivate – some notes

As target test cases were suspiciously left out, I tries '!$omp target 
map(/name/)' which was rejected. I think one should add test cases for 
newer features – which mostly means 'omp target' and add the missing 
common-block checks. – And one has to find out why blank commons are not 
permitted and for the places where they are permitted, support has to be 
added.

Talking about blank common blocks, the current OpenACC implementation 
does not seem to like them (see below); the spec (2.7) does not mention 
blank common blocks at all. – It talks about name between two slashes, 
but leaves it open whether the name can also be an empty string.

common // x,y  !blank common
!$acc parallel copyin(//)
!$acc end parallel
end

fails with:

     2 | !$acc parallel copyin(//)
       |                       1
Error: Syntax error in OpenMP variable list at (1)


On 2019-10-25T16:36:10+0200, Tobias Burnus<tobias@codesourcery.com>  wrote:

>> * I have now a new test case
>> libgomp/testsuite/libgomp.oacc-fortran/common-block-3.f90 which looks at
>> omplower.
> Thanks. Curious: why 'omplower' instead of 'gimple' dump?

So far I found -fdump-tree-original, -fdump-omplower and 
-fdump-optimized quite useful – I have so far not used 
-fdump-tree-gimple, hence, I cannot state what's the advantage of the 
latter.

The original dump I like because it shows what the FE generates, the 
omplower dump has the result after lowering including the assignments to 
the omp_arr variables but it keeps a readable pragma line (avoids 
guessing what the kind value was again etc.) while the optimized dump 
really shows what ends up in the call (with the pro and con that it 
depends on the optimization option).

If you think it makes sense, one can switch.

>> --- /dev/null
>> +++ b/gcc/testsuite/gfortran.dg/goacc/common-block-3.f90
>> @@ -0,0 +1,39 @@
>> +! { dg-options "-fopenacc -fdump-tree-omplower" }
> (For later: we usually just use 'dg-additional-options
> "-fdump-tree-omplower"'; '-fopenacc' is implied inside '*/goacc/'.)

My impression was that it is not effective unless repeated – and I think 
I even tries it. In gcc/testsuite/gfortran.dg/gomp/ all 64 test cases 
with dg-options specify re-add the "-fopenmp".

And in gcc/testsuite/gfortran.dg/goacc, 4 test cases use dg-options, 3 
specify -fopenacc and one doesn't. I wouldn't call this 'usually' and I 
wonder whether -fopenacc is really active for goacc/pr84963.f90. (The 
file uses no directive at all!)

Hence, I wonder whether one should add it to goacc/pr84963.f90 – see 
attached patch.

Without the patch, I get:

Executing on host: …/gfortran/../../gfortran -B…/gfortran/../../ 
-B…/./libgfortran/ …/goacc/pr84963.f90 -fno-diagnostics-show-caret 
-fno-diagnostics-show-line-numbers -fdiagnostics-color=never  
-fdiagnostics-urls=never    -O  -O2 -S -o pr84963.s    (timeout = 300)

And with the patch, I get
… -fdiagnostics-urls=never -O -fopenacc -O2 -S -o pr84963.s …


>> +  integer :: i, j
>> +  real ::  a(n) = 0, b(n) = 0, c, d
>> +  real ::  x(n) = 0, y(n), z
>> +  common /BLOCK/ a, b, c, j, d
>> +  common /KERNELS_BLOCK/ x, y, z
> For my understanding: the several unused variables in the common blocks
> are to make sure that they don't cause any issues, don't get mapped at
> all?

I think that's the idea – common-block variables which are not used 
should also not get mapped (= optimization). But, obviously, they should 
also not cause any issues.

Hence, one could/should also check that they are not mapped – done in 
the attached patch.

> I we were to add to 'gfortran.dg/goacc/common-block-3.f90' a test case
> for the upcoming OpenACC 'serial' construct (which basically equals the
> OpenACC 'parallel' construct), would we copy/adapt the 'parallel' 'BLOCK'
> test case, or add a new, separate common block?
>
> Or, asking the other way round: why aren't in the current test case,
> 'parallel' and 'kernels' using the same common block, and both explicitly
> 'copy' the common block vs. not do that?

I think one could do either way – by itself, the blocks should be 
independent and, hence, could re-use the same common block. Re-using the 
same common block tests other things, hence, maybe you should do so – 
and use different variables from both blocks.

>> * In the compile-time *{2,3} test case, there is now also a 'enter data'
>> and 'update host/self/device' test.
> ;-) Heh, 'update' got inside the 'parallel' region:
>
>> --- /dev/null
>> +++ b/gcc/testsuite/gfortran.dg/goacc/common-block-1.f90
>> @@ -0,0 +1,74 @@
>> +[...]
>> +  !$acc parallel firstprivate(/blockA/, /blockB/, e, v)
>> +  !$acc update device(/blockA/)
>> +  !$acc update self(/blockB/, v)
>> +  !$acc update host(/blockA/, e, /blockB/)
>> +  !$acc end parallel
>> +[...]
>> --- /dev/null
>> +++ b/gcc/testsuite/gfortran.dg/goacc/common-block-2.f90
> Likewise.
>
> As obvoius; see attached, committed "Fix OpenACC directives nesting in
> 'gfortran.dg/goacc/common-block-1.f90',
> 'gfortran.dg/goacc/common-block-2.f90'" to trunk in r278047.

Thanks.

Cheers,

Tobias
Tobias Burnus Nov. 26, 2019, 2:02 p.m. UTC | #7
Hi Thomas,

I now played also around common blocks with "!$acc declare 
device_resident (/block/)". [See attached test-case diff.]

Observations:

* !$acc declare has to come after the declaration of the common block. 
In terms of the spec, it just needs to be in the declaration section, 
i.e. it could also be before. – Seems as if one needs to split parsing 
and resolving clauses.

* If I just use '!$acc parallel', the used variables are copied in 
according to OpenMP 4.0 semantics, i.e. without a defaultmap clause (of 
OpenMP 4.5+; not yet in gfortran), scalars are firstprivate and arrays 
are map(fromto:). – Does this behaviour match the spec or should this 
automatically mapped to, e.g., no_create as the 'device_resident' is 
known? [Side remark: the module file does contain 
"OACC_DECLARE_DEVICE_RESIDENT".]

* If I explicitly use '!$acc parallel present(/block/)' that fails 
because present() does not permit common blocks.
(OpenACC 2.7, p36, l.1054: "For all clauses except deviceptr and 
present, the list argument may include a Fortran common block name 
enclosed within slashes"). I could use no_create, but that's not yet 
supported.

Cheers,

Tobias
Thomas Schwinge Nov. 28, 2019, 5:01 p.m. UTC | #8
Hi Tobias!

On 2019-11-25T15:02:16+0100, Tobias Burnus <tobias@codesourcery.com> wrote:
> sorry for the belated reply.

Eh, no worries -- I'm way more behind on things...


> On 11/11/19 10:39 AM, Thomas Schwinge wrote:
>> By the way, do you know what's the status is for Fortran common blocks in
>> OpenMP: supported vs. expected per the specification?
>
> No; however, I had a quick glance at the spec and at the test cases; 
> both compile-time and run-time test have some coverage, although I 
> didn't spot a run-time test for one 'omp target'.

Thanks.

> Definition (3.32.1 in F2018): "blank common" = "unnamed common block". 
> 'common /name/ x" (continues) define the common block named "name" by 
> adding 'x' to it. While "common // y" or "common y" appends 'y' to the 
> blank common.

Thanks for the concise summary.

> In OpenMP 5, common blocks appear twice – once [2.1, p.39, ll.11ff.] as 
> general rule in the definition of "list item" (which are inherited by 
> "extended list item" and "locator-list item"). [There are also some 
> constraints and notes regarding common blocks)]. It does not really tell 
> whether blank commons are permitted or not; some description is 
> explicitly for named-common variables, leaving blank-common ones out 
> (and undefined). But later sections explicitly make reference to blank 
> commons, hence, one can assume they are permitted unless explicitly 
> stated that they are not.

Yes, I go by the assumption that everything contained in the base
languages of OpenACC/OpenMP (so, the respective C, C++, Fortran
standards), should also work in an OpenACC/OpenMP context in a sensible
manner (detailed/clarified in the respective specification as necessary),
and if not supported then that ought to be spelled out explicitly.  (For
example, see either the "catch-all" notes in OpenACC 3.0,
1.7. "References", or the more in-detail notes in specific sections.)
Anything else I'd consider a bug in the respective specification, which
should be reported/fixed.

That said, if you think OpenMP needs to clarify whether Fortran blank
common blocks are supported or not, then file an issue or directly submit
a pull request against the specification on <https://github.com/OpenMP>
(once we've got access).

> And then very selectively for some items:
> * allocate – only with default allocator.
> * declare target – some restrictions and no blank commons
> * depend clause – no common permitted
> * threadprivate – some notes and explanation of the syntax (why?)
>    also only here requirement regarding common blocks with bind(c)
>    (why not also for declare target?)
> * linear clause – no common permitted
> * copyin – some notes
> * copyprivate – some notes
>
> As target test cases were suspiciously left out, I tries '!$omp target 
> map(/name/)' which was rejected. I think one should add test cases for 
> newer features – which mostly means 'omp target' and add the missing 
> common-block checks. – And one has to find out why blank commons are not 
> permitted and for the places where they are permitted, support has to be 
> added.

ACK.  Instead of "burying" such things in long emails, I like to see GCC
PRs filed, which can then be actioned on individually.

> Talking about blank common blocks, the current OpenACC implementation 
> does not seem to like them (see below); the spec (2.7) does not mention 
> blank common blocks at all. – It talks about name between two slashes, 
> but leaves it open whether the name can also be an empty string.

My assumption would thus be: yes, ought to be supported -- but I haven't
thought through whether that makes sense, so...

> common // x,y  !blank common
> !$acc parallel copyin(//)
> !$acc end parallel
> end
>
> fails with:
>
>      2 | !$acc parallel copyin(//)
>        |                       1
> Error: Syntax error in OpenMP variable list at (1)

..., please test with the PGI compiler (just to get more data), and
determine whether that makes sense to support in an OpenACC context
(likewise for OpenMP, of course), and then (once you've got access)
either file an issue, or (better) directly submit a pull request for
<https://github.com/OpenACC/openacc-spec/> to clarify that.  Sometimes
it's as easy as replacing non-standard text ("name between two slashes")
with the corresponding standard text (whatever the Fortran specification
calls this).


> On 2019-10-25T16:36:10+0200, Tobias Burnus<tobias@codesourcery.com>  wrote:
>
>>> * I have now a new test case
>>> libgomp/testsuite/libgomp.oacc-fortran/common-block-3.f90 which looks at
>>> omplower.
>> Thanks. Curious: why 'omplower' instead of 'gimple' dump?
>
> So far I found -fdump-tree-original, -fdump-omplower and 
> -fdump-optimized quite useful – I have so far not used 
> -fdump-tree-gimple, hence, I cannot state what's the advantage of the 
> latter.

My rationale is that your code changes are in 'gcc/gimplify.c', so you'd
test for that stuff in the 'gimple' dump (which is between 'original' and
'omplower').

> The original dump I like because it shows what the FE generates, the 
> omplower dump has the result after lowering including the assignments to 
> the omp_arr variables but it keeps a readable pragma line (avoids 
> guessing what the kind value was again etc.) while the optimized dump 
> really shows what ends up in the call (with the pro and con that it 
> depends on the optimization option).
>
> If you think it makes sense, one can switch.

I think it does (but please argue if it doesn't to you), but that's not
high priority, of course.


>>> --- /dev/null
>>> +++ b/gcc/testsuite/gfortran.dg/goacc/common-block-3.f90
>>> @@ -0,0 +1,39 @@
>>> +! { dg-options "-fopenacc -fdump-tree-omplower" }
>> (For later: we usually just use 'dg-additional-options
>> "-fdump-tree-omplower"'; '-fopenacc' is implied inside '*/goacc/'.)
>
> My impression was that it is not effective unless repeated – and I think 
> I even tries it. In gcc/testsuite/gfortran.dg/gomp/ all 64 test cases 
> with dg-options specify re-add the "-fopenmp".
>
> And in gcc/testsuite/gfortran.dg/goacc, 4 test cases use dg-options, 3 
> specify -fopenacc and one doesn't. I wouldn't call this 'usually'

Note that I said 'dg-additional-options', not 'dg-options', so please
re-consider.


> and I 
> wonder whether -fopenacc is really active for goacc/pr84963.f90. (The 
> file uses no directive at all!)
>
> Hence, I wonder whether one should add it to goacc/pr84963.f90 – see 
> attached patch.

Good find.  Please confirm that indeed this is meant to enable OpenACC
processing by reading discussion in <https://gcc.gnu.org/PR84963> and
<http://mid.mail-archive.com/ead44f52-ee30-6b5e-e18a-5dd49d9a2614@suse.cz>.
(Likely yes, of course.)

> Without the patch, I get:
>
> Executing on host: …/gfortran/../../gfortran -B…/gfortran/../../ 
> -B…/./libgfortran/ …/goacc/pr84963.f90 -fno-diagnostics-show-caret 
> -fno-diagnostics-show-line-numbers -fdiagnostics-color=never  
> -fdiagnostics-urls=never    -O  -O2 -S -o pr84963.s    (timeout = 300)

(Confirmed.)

> And with the patch, I get
> … -fdiagnostics-urls=never -O -fopenacc -O2 -S -o pr84963.s …

> --- a/gcc/testsuite/gfortran.dg/goacc/pr84963.f90
> +++ b/gcc/testsuite/gfortran.dg/goacc/pr84963.f90
> @@ -1,5 +1,5 @@
>  ! PR ipa/84963
> -! { dg-options "-O2" }
> +! { dg-options "-fopenacc -O2" }

I suggest to change 'dg-options "-O2"' to 'dg-additional-options "-O2"'.
Please verify, and then commit that to trunk, gcc-8-branch, gcc-7-branch,
referencing PR84963 in the ChangeLog update.

That's a separate fix/commit from everything else discussed here.


>>> +  integer :: i, j
>>> +  real ::  a(n) = 0, b(n) = 0, c, d
>>> +  real ::  x(n) = 0, y(n), z
>>> +  common /BLOCK/ a, b, c, j, d
>>> +  common /KERNELS_BLOCK/ x, y, z
>> For my understanding: the several unused variables in the common blocks
>> are to make sure that they don't cause any issues, don't get mapped at
>> all?
>
> I think that's the idea

Then let's please document that in the test case sources, for that's not
quite obvious.

> – common-block variables which are not used 
> should also not get mapped (= optimization). But, obviously, they should 
> also not cause any issues.
>
> Hence, one could/should also check that they are not mapped – done in 
> the attached patch.

Good, thanks.

> --- a/gcc/testsuite/gfortran.dg/goacc/common-block-3.f90
> +++ b/gcc/testsuite/gfortran.dg/goacc/common-block-3.f90
> @@ -9,7 +9,7 @@ program main
>    implicit none
>  
>    integer :: i, j
> -  real ::  a(n) = 0, b(n) = 0, c, d
> +  real ::  a(n) = 0, b(n) = 0, c, d, e(n)
>    real ::  x(n) = 0, y(n), z
>    common /BLOCK/ a, b, c, j, d
>    common /KERNELS_BLOCK/ x, y, z
> @@ -35,5 +35,8 @@ end program main
>  ! { dg-final { scan-tree-dump-times "omp target oacc_kernels .*map\\(tofrom:y \\\[len: 400\\\]\\\)" 1 "omplower" } }
>  ! { dg-final { scan-tree-dump-times "omp target oacc_kernels .*map\\(force_tofrom:c \\\[len: 4\\\]\\)" 1 "omplower" } }
>  
> -! { dg-final { scan-tree-dump-not "map\\(.*:block\\)" "omplower" } }
> -! { dg-final { scan-tree-dump-not "map\\(.*:kernels_block\\)" "omplower" } }
> +! { dg-final { scan-tree-dump-not "map\\(.*:block" "omplower" } }
> +! { dg-final { scan-tree-dump-not "map\\(.*:kernels_block" "omplower" } }
> +! { dg-final { scan-tree-dump-not "map\\(.*:d " "omplower" } }
> +! { dg-final { scan-tree-dump-not "map\\(.*:e " "omplower" } }
> +! { dg-final { scan-tree-dump-not "map\\(.*:z " "omplower" } }

OK for trunk, with some suitable commentary added ("expecting no mapping
of un-referenced blocks/variables", or something like that, before the
'scan-tree-dump-not' ones).  To record the review effort, please include
"Reviewed-by: Thomas Schwinge <thomas@codesourcery.com>" in the commit
log, see <https://gcc.gnu.org/wiki/Reviewed-by>.


>> I we were to add to 'gfortran.dg/goacc/common-block-3.f90' a test case
>> for the upcoming OpenACC 'serial' construct (which basically equals the
>> OpenACC 'parallel' construct), would we copy/adapt the 'parallel' 'BLOCK'
>> test case, or add a new, separate common block?
>>
>> Or, asking the other way round: why aren't in the current test case,
>> 'parallel' and 'kernels' using the same common block, and both explicitly
>> 'copy' the common block vs. not do that?
>
> I think one could do either way – by itself, the blocks should be 
> independent and, hence, could re-use the same common block. Re-using the 
> same common block tests other things, hence, maybe you should do so – 
> and use different variables from both blocks.

ACK, thanks.


Grüße
 Thomas
Thomas Schwinge Nov. 28, 2019, 5:02 p.m. UTC | #9
Hi Tobias!

On 2019-11-26T15:02:34+0100, Tobias Burnus <tobias@codesourcery.com> wrote:
> I now played also around common blocks with "!$acc declare 
> device_resident (/block/)". [See attached test-case diff.]

If you'd like to, please commit that, to document the status quo.  (I
have not reviewed.)


There are several issues with the OpenACC 'declare' implementation, so
that one generally needs to be re-visited as some point.  Basically
everything from the front ends handling, to middle end handling, to nvptx
back end handling (supposedly?; see <https://gcc.gnu.org/PR81689>), to
libgomp handling.  So, you're adding here some more.  ;-)

> Observations:
>
> * !$acc declare has to come after the declaration of the common block. 
> In terms of the spec, it just needs to be in the declaration section, 
> i.e. it could also be before. – Seems as if one needs to split parsing 
> and resolving clauses.

Good find -- purely a Fortran front end issue, as I understand.  Please
file a GCC PR, unless there is a reason (implementation complexity?) to
be more "strict" ("referenced variable/common block needs to be lexically
in scope", or something like that?), and the OpenACC specification should
be changed instead?

> * If I just use '!$acc parallel', the used variables are copied in 
> according to OpenMP 4.0 semantics, i.e. without a defaultmap clause (of 
> OpenMP 4.5+; not yet in gfortran), scalars are firstprivate and arrays 
> are map(fromto:). – Does this behaviour match the spec or should this 
> automatically mapped to, e.g., no_create as the 'device_resident' is 
> known? [Side remark: the module file does contain 
> "OACC_DECLARE_DEVICE_RESIDENT".]

Not sure at this point.

> * If I explicitly use '!$acc parallel present(/block/)' that fails 
> because present() does not permit common blocks.
> (OpenACC 2.7, p36, l.1054: "For all clauses except deviceptr and 
> present, the list argument may include a Fortran common block name 
> enclosed within slashes").

Do you understand the rationale behind that restriction, by the way?  I'm
not sure I do.  Is it because we don't know/can't be sure that *all* of
the common block has been mapped (per the rules set elsewhere)?  That
would make sense in context of this:

> I could use no_create

... which basically means 'present' but don't complain if not actually
present.

> but that's not yet 
> supported.

But will be soon.  :-)


Grüße
 Thomas


> --- a/libgomp/testsuite/libgomp.oacc-fortran/declare-5.f90
> +++ b/libgomp/testsuite/libgomp.oacc-fortran/declare-5.f90
> @@ -1,29 +1,106 @@
>  ! { dg-do run }
>  
>  module vars
>    implicit none
>    real b
> - !$acc declare device_resident (b)
> +  !$acc declare device_resident (b)
> +
> +  integer :: x, y, z
> +  common /block/ x, y, z
> +  !$acc declare device_resident (/block/)
>  end module vars
>  
> +subroutine set()
> +  use openacc
> +  implicit none
> +  integer :: a(5), b(1), c, vals(7)
> +  common /another/ a, b, c
> +  !$acc declare device_resident (/another/)
> +  if (.not. acc_is_present (a)) stop 10
> +  if (.not. acc_is_present (b)) stop 11
> +  if (.not. acc_is_present (c)) stop 12
> +
> +  vals = 99
> +  !$acc parallel copyout(vals) present(a, b, c) ! OK
> +                                                ! but w/o 'present', 'c' is firstprivate and a+b are 'map(fromto:'
> +                                                ! additionally, OpenACC 2.7 does not permit present(/another/)
> +                                                ! and no_create is not yet in the trunk (but submitted)
> +    a = [11,12,13,14,15]
> +    b = 16
> +    c = 47
> +    vals(1:5) = a
> +    vals(6:6) = b
> +    vals(7) = c
> +  !$acc end parallel
> +
> +  if (.not. acc_is_present (a)) stop 13
> +  if (.not. acc_is_present (b)) stop 14
> +  if (.not. acc_is_present (c)) stop 15
> +
> +  if (any (vals /= [11,12,13,14,15,16,47])) stop 16
> +end subroutine set
> +
> +subroutine check()
> +  use openacc
> +  implicit none
> +  integer :: g, h(3), i(3)
> +  common /another/ g, h, i
> +  integer :: val(7)
> +  !$acc declare device_resident (/another/)
> +  if (.not. acc_is_present (g)) stop 20
> +  if (.not. acc_is_present (h)) stop 21
> +  if (.not. acc_is_present (i)) stop 22
> +
> +  val = 99
> +  !$acc parallel copyout(val) present(g, h, i)
> +    val(5:7) = i
> +    val(1) = g
> +    val(2:4) = h
> +  !$acc end parallel
> +
> +  if (.not. acc_is_present (g)) stop 23
> +  if (.not. acc_is_present (h)) stop 24
> +  if (.not. acc_is_present (i)) stop 25
> +
> +
> +  !print *, val
> +  if (any (val /= [11,12,13,14,15,16,47])) stop 26
> +end subroutine check
> +
> +
>  program test
>    use vars
>    use openacc
>    implicit none
>    real a
> +  integer :: k
>  
> -  if (acc_is_present (b) .neqv. .true.) STOP 1
> +  call set()
> +  call check()
> +
> +  if (.not. acc_is_present (b)) stop 1
> +  if (.not. acc_is_present (x)) stop 2
> +  if (.not. acc_is_present (y)) stop 3
> +  if (.not. acc_is_present (z)) stop 4
>  
>    a = 2.0
> +  k = 42
>  
> -  !$acc parallel copy (a)
> +  !$acc parallel copy (a, k)
>      b = a
>      a = 1.0
>      a = a + b
> +    x = k
> +    y = 7*k - 2*x
> +    z = 3*y
> +    k = k - z + y
>     !$acc end parallel
>  
> -  if (acc_is_present (b) .neqv. .true.) STOP 2
> -
> -  if (a .ne. 3.0) STOP 3
> +  if (.not. acc_is_present (b)) stop 5
> +  if (.not. acc_is_present (x)) stop 6
> +  if (.not. acc_is_present (y)) stop 7
> +  if (.not. acc_is_present (z)) stop 8
>  
> +  if (a /= 3.0) stop 3
> +  if (k /= -378) stop 3
>  end program test
Tobias Burnus Nov. 29, 2019, 5:32 p.m. UTC | #10
Hi Thomas,

On 11/28/19 6:01 PM, Thomas Schwinge wrote:
> Definition (3.32.1 in F2018): "blank common" = "unnamed common block". 

I just want to add the following, which came into my mind after thinking 
more about device_resident (the other email in this thread). Fortran 
(here: 2018, 8.10.2.5) has:

"Named common blocks of the same name shall be of the same size in all 
scoping units of a program in which they appear, but blank common blocks 
may be of different sizes."

* * *

Depending on the use of a common block (see other email in the same 
thread, to be send shortly), that's fine or not. If the common block 
only exists on the device (i.e. in a device routine / 'target' 
procedure) or only on the host, everything is fine. — In this case, the 
connection between host and target is done by single variables – and no 
one cares whether they are in a common block or not.

It only becomes interesting if the same(-named) common block is known to 
both the host and the device – in that case, it is important that the 
size matches, otherwise either the copying to the device or (via 
'update') from the device to the host will write beyond the static 
variable! — Also in the latter case, it makes sense that 
'copy(/block_name/)' will map the whole common block and not only the 
directly used variables (which might be none).

* * *

OpenACC: Does one need device_resident to allocate 'static' global 
memory on the device? If not, then its only use would be for same-named 
common blocks, existing on both the device and the host. If it is 
needed, then one needs to think about the semantic – will it declare a 
common block which exists only on the device or one which exists on both 
device and host with the same name. — I think that needs to be spelt out 
in the spec clearly; at the moment, it is ambiguous. In any case, it 
influences how copy(/block_name/) acts.

For OpenMP, my impression is that the spec is completely silent on 
device-located common blocks. And if a common block is only on the host, 
copy(/block/) just maps the used (common-block) variables to the target 
– which is fine. — Seems as if some spec work is needed as well.

* * *

> I go by the assumption that everything contained in the base
> languages of OpenACC/OpenMP ([…] C, C++, Fortran
> standards), should also work in an OpenACC/OpenMP context in a sensible
> manner […]
I concur.
> ACK. Instead of "burying" such things in long emails, I like to see 
> GCC PRs filed, which can then be actioned on individually.

Well, I think one first needs to understand what's supposed to be in the 
standard. Having said this, I have now filled – PR 92728 + PR 92730.

[blank commons]

> assumption would thus be: yes, ought to be supported -- but I haven't 
> thought through whether that makes sense, so...

By itself, using blank commons make sense if one maps variables from a 
common block but not if one maps the whole common block.

[Blank commons + PGI]

I will later play around with the PGI compiler; but I think it is really 
a spec issue and I care less what a specific compiler does. (Even 
though, with OpenACC, it is kind of the reference compiler.)

> determine whether that makes sense to support in an OpenACC context

I think that needs discussion about what one wants to achieve instead of 
directly patching the spec.

>>>> * I have now a new test case
>>>> libgomp/testsuite/libgomp.oacc-fortran/common-block-3.f90 which looks at
>>>> omplower.
Actually, I think this should be: 
gcc/testsuite/gfortran.dg/goacc/common-block-3.f90
>>> Thanks. Curious: why 'omplower' instead of 'gimple' dump?
>> […]
> My rationale is that your code changes are in 'gcc/gimplify.c', so you'd
> test for that stuff in the 'gimple' dump (which is between 'original' and
> 'omplower').
[switching to gimple]
> I think it does (but please argue if it doesn't to you), but that's not
> high priority, of course.

Hmm, that might be more specific to other parts – but with optional 
arguments, I had constantly to look at what has been passed on to 
libgomp via omp_arr – even though the code was produced directly by the 
front end. The 'pragma' simply didn't tell the whole story – omp-low.c 
added and removed some '*' and '&' which were crucial.

Probably, 'map' as parsed here doesn't change any more between 
gimplify.c and the early stages of omp-low.c, but I feel safer if the 
wanted result survived until the end of omp-low.c and does not get 
modified in an unintended way later on.

> Note that I said 'dg-additional-options', not 'dg-options', so please
> re-consider.

Ups. Yes, dg-additional-options should work :-)

[oacc/pr84963.f90]

> Good find. […] to change 'dg-options "-O2"' to 'dg-additional-options 
> "-O2"
> Please verify, and then commit that to trunk, gcc-8-branch, gcc-7-branch, 

Done so – except I committed to GCC 9 instead of GCC 7, which is now 
closed :-)
[Question about a test case]

> Then let's please document that in the test case sources, for that's not
> quite obvious.

I have now committed the test-case patch with a comment as suggested: 
r278843.

Tobias
Tobias Burnus Nov. 29, 2019, 5:47 p.m. UTC | #11
Hi Thomas,

I have started with this email – and then stopped and replied to the 
other email in this thread: 
https://gcc.gnu.org/ml/gcc-patches/2019-11/msg02678.html – which covers 
parts which otherwise would belong into this email.

On 11/28/19 6:02 PM, Thomas Schwinge wrote:
[Test case which uses common blocks in device_resident.]
> If you'd like to, please commit that, to document the status quo.  (I
> have not reviewed.)

Did so as r278845 with a slightly updated comment.

>> Observations:
>> * !$acc declare has to come after the declaration of the common block.
That's now tracked in PR fortran/92728 for OpenMP/OpenACC – together 
with blank commons.
> Good find -- purely a Fortran front end issue […] is a reason 
> (implementation complexity?)

Having any order makes it feel more Fortran like; the complexity comes 
from splitting matching and checking the clauses, but it shouldn't be 
rocket science.

>> * If I just use '!$acc parallel', the used variables are copied in
>> according to OpenMP 4.0 semantics, i.e. without a defaultmap clause (of
>> OpenMP 4.5+; not yet in gfortran), scalars are firstprivate and arrays
>> are map(fromto:). – Does this behaviour match the spec or should this
>> automatically mapped to, e.g., no_create as the 'device_resident' is
>> known? [Side remark: the module file does contain
>> "OACC_DECLARE_DEVICE_RESIDENT".]
> Not sure at this point.

s/OpenMP 4.0/OpenMP 4.5/

Regarding the mapping: Both OpenACC and OpenMP agree and it is fine (cf. 
OpenACC 2.7 last 'Description' paragraph in parallel/kernels, 2.5.1 + 
2.5.2). And OpenMP 4.5, Sect. 2.15.5 (esp. last three bullet points) or 
OpenMP 5, Sect. 2.19.7. (Missing omp bits see PR fortran/92568.).

Regarding device_resident: It is not fully clear to me what the intent 
is – and "The host may not be able to access variables in a 
device_resident clause." does not make it clearer.

In terms of the spec, the mapping with firstprivate/[copy alias tofrom] 
is fine – as is the explict use of present. However, if commons exists 
on both device + host, 'copy(/block/)' should work and also copy 
common-block variables, which are not referrenced in the 
parallel/kernels block – which currently does not work.

>> * If I explicitly use '!$acc parallel present(/block/)' that fails
>> because present() does not permit common blocks.
>> (OpenACC 2.7, p36, l.1054: "For all clauses except deviceptr and
>> present, the list argument may include a Fortran common block name
>> enclosed within slashes").
> Do you understand the rationale behind that restriction, by the way?  I'm
> not sure I do.

Regarding 'present', I don't: If copy/no_create is fine, why should 
present be a problem? (And vice versa.)

For 'deviceptr', it kind of does make sense – unless one wants to store 
the pointer as 'intptr_t' in an integer variable or want to have a 
pointer (i.e. Fortran attribute) in 'common' which will cause mapping 
problems for the common block. — In any case, the 'dummy argument' 
constraint prevents common blocks. – BTW: Those constraints do not make 
sense but seem to be same as for OpenMP's is_device_ptr. (They are both 
too loose and to strict; I miss type(c_ptr) [as local var + as dummy w/ 
value attribute].)

Cheers,

Tobias
Tobias Burnus Dec. 2, 2019, 1:33 p.m. UTC | #12
Hi Thomas,

for completeness, I tried now *blank commons* with OpenACC in *PGI's 
pgfortran.*

 From the error message, it looks as if the parser does not handle blank 
commons at all. (Matches the current parser in gfortran.) pgfortran is 
also not very good at diagnostics as nonexisting common block names are 
not diagnosed.

Cheers,

Tobias
Thomas Schwinge Dec. 3, 2019, 9:16 a.m. UTC | #13
Hi Tobias!

On 2019-11-29T18:47:12+0100, Tobias Burnus <tobias@codesourcery.com> wrote:
> On 11/28/19 6:02 PM, Thomas Schwinge wrote:
> [Test case which uses common blocks in device_resident.]
>> If you'd like to, please commit that, to document the status quo.  (I
>> have not reviewed.)
>
> Did so as r278845 with a slightly updated comment.

Testing with nvptx offloading on two different systems, on both I'm
seeing 'STOP 10' execution test FAILure for all optimization levels:

    PASS: libgomp.oacc-fortran/declare-5.f90 -DACC_DEVICE_TYPE_nvidia=1 -DACC_MEM_SHARED=0 -foffload=nvptx-none  -O0  (test for excess errors)
    [-PASS:-]{+FAIL:+} libgomp.oacc-fortran/declare-5.f90 -DACC_DEVICE_TYPE_nvidia=1 -DACC_MEM_SHARED=0 -foffload=nvptx-none  -O0  execution test
    PASS: libgomp.oacc-fortran/declare-5.f90 -DACC_DEVICE_TYPE_nvidia=1 -DACC_MEM_SHARED=0 -foffload=nvptx-none  -O1  (test for excess errors)
    [-PASS:-]{+FAIL:+} libgomp.oacc-fortran/declare-5.f90 -DACC_DEVICE_TYPE_nvidia=1 -DACC_MEM_SHARED=0 -foffload=nvptx-none  -O1  execution test
    PASS: libgomp.oacc-fortran/declare-5.f90 -DACC_DEVICE_TYPE_nvidia=1 -DACC_MEM_SHARED=0 -foffload=nvptx-none  -O2  (test for excess errors)
    [-PASS:-]{+FAIL:+} libgomp.oacc-fortran/declare-5.f90 -DACC_DEVICE_TYPE_nvidia=1 -DACC_MEM_SHARED=0 -foffload=nvptx-none  -O2  execution test
    PASS: libgomp.oacc-fortran/declare-5.f90 -DACC_DEVICE_TYPE_nvidia=1 -DACC_MEM_SHARED=0 -foffload=nvptx-none  -O3 -fomit-frame-pointer -funroll-loops -fpeel-loops -ftracer -finline-functions  (test for excess errors)
    [-PASS:-]{+FAIL:+} libgomp.oacc-fortran/declare-5.f90 -DACC_DEVICE_TYPE_nvidia=1 -DACC_MEM_SHARED=0 -foffload=nvptx-none  -O3 -fomit-frame-pointer -funroll-loops -fpeel-loops -ftracer -finline-functions  execution test
    PASS: libgomp.oacc-fortran/declare-5.f90 -DACC_DEVICE_TYPE_nvidia=1 -DACC_MEM_SHARED=0 -foffload=nvptx-none  -O3 -g  (test for excess errors)
    [-PASS:-]{+FAIL:+} libgomp.oacc-fortran/declare-5.f90 -DACC_DEVICE_TYPE_nvidia=1 -DACC_MEM_SHARED=0 -foffload=nvptx-none  -O3 -g  execution test
    PASS: libgomp.oacc-fortran/declare-5.f90 -DACC_DEVICE_TYPE_nvidia=1 -DACC_MEM_SHARED=0 -foffload=nvptx-none  -Os  (test for excess errors)
    [-PASS:-]{+FAIL:+} libgomp.oacc-fortran/declare-5.f90 -DACC_DEVICE_TYPE_nvidia=1 -DACC_MEM_SHARED=0 -foffload=nvptx-none  -Os  execution test


Grüße
 Thomas
Tobias Burnus Dec. 3, 2019, 3:22 p.m. UTC | #14
Hi Thomas,

Quick version: The attached patch seems to work, kind of,  but fails at 
run time with:
libgomp: Trying to map into device [0x407218..0x40721c) object when 
[0x407210..0x40721c) is already mapped
This marks the common-block decl but not the common-block vars as 
'device resident' (alias "omp declare target").

The attached with '#if 0'  set to '1' does not work as it gives an ICE 
in lto1. – If one only marks the common-block variables, it fails as the 
ME check complains "variable 'block' has been referenced in offloaded 
code but hasn't been marked to be included in the offloaded code" –

I think the first version is fine, but it seems as if the ME needs to 
use pcreate and not create for those. I think that's also the reason for 
the odd is-program check mentioned at the very bottom.

Tobias

PS: Hmm, I really wonder why it seemed to have passed before. Looking at 
the code, it cannot have passed — more below. That goes wrong since 
r272453 for PR85221 (well, it can't before). I don't quickly see whether 
it also affects OpenMP or other clauses.

I think for a proper fix it would be very useful to know some more 
details about the intention of 'declare device_resident' (existing only 
on the device, existing on both host and device etc.). Cf. previous email.

In terms of this issue, if one does:  "integer :: a, b, c; common /name/ 
a,b,c; !$acc declare device_resident(a)", should this make all of the 
common-block variables as device resident or not? Internally, one gets 
for declare-5.f90 the following, i.e. /another/ is the common name and 
g, h and i are common-name variables:

   static integer(kind=4) g [value-expr: another.g];
   static integer(kind=4) h[3] [value-expr: another.h];
   static integer(kind=4) i[3] [value-expr: another.i];

For the test case, the issue is that 'gfc_get_symbol_decl' only called 
after it's tree representation (sym->backend_decl) has already been 
created; this happens for common blocks. – The attached patch fixes 
this, marking the common block decl and all its variables as declare 
device_resident.

One could think of handling other attributes (which ones?). For 
EQUIVALENCE in commons, the attributes are collected using 
accumulate_equivalence_attributes – and for normal variables, it is 
handled in trans-decl.c's add_attributes_to_decl

  * * *

Additionally, and unrelated to the test case, the following code looks 
very suspicious (from finish_oacc_declare in fortran/trans-decl.c):

   module_oacc_clauses = NULL;
   gfc_traverse_ns (ns, find_module_oacc_declare_clauses);
   if (module_oacc_clauses && sym->attr.flavor == FL_PROGRAM)

First, it very much looks like memory leak – one creates a linked list,
but always dumps it w/o using or freeing it if one is currently not
processing the main program. Additionally, it assumes that the main program
has a full view of module-declared 'declare device_resident' variables, but
it is trivial to construct programs where the main program does not see this
property. Most trivial example is:
   subroutine foo()
     use module_w_device_decl
   end subroutine
independent whether that function exists as such or as module procedure or
(at some place) is a procedure contained in another procedure. A general
assumption is also that the whole program is compiled with -fopenacc
and that the main program is written in Fortran and not, e.g., in C or C++.
diff mbox series

Patch

2019-10-15  Cesar Philippidis <cesar@codesourcery.com>
	    Tobias Burnus  <tobias@codesourcery.com>

	gcc/fortran/
	* openmp.c (gfc_match_omp_map_clause): Add and pass allow_commons
	argument.
	(gfc_match_omp_clauses): Update calls to permit common blocks for
	OpenACC's copy/copyin/copyout, create/delete, host,
	pcopy/pcopy_in/pcopy_out, present_or_copy, present_or_copy_in,
	present_or_copy_out, present_or_create and self.

	gcc/
	* gimplify.c (oacc_default_clause): Privatize fortran common blocks.
	(omp_notice_variable): Defer the expansion of DECL_VALUE_EXPR for
	common block decls.
    
	gcc/testsuite/
	* gfortran.dg/goacc/common-block-1.f90: New test.
	* gfortran.dg/goacc/common-block-2.f90: New test.
    
	libgomp/
	* testsuite/libgomp.oacc-fortran/common-block-1.f90: New test.
	* testsuite/libgomp.oacc-fortran/common-block-2.f90: New test.
	* testsuite/libgomp.oacc-fortran/common-block-3.f90: New test.

diff --git a/gcc/fortran/openmp.c b/gcc/fortran/openmp.c
index 5c91fcdfd31..dbcb647ea6a 100644
--- a/gcc/fortran/openmp.c
+++ b/gcc/fortran/openmp.c
@@ -926,10 +926,11 @@  omp_inv_mask::omp_inv_mask (const omp_mask &m) : omp_mask (m)
    mapping.  */
 
 static bool
-gfc_match_omp_map_clause (gfc_omp_namelist **list, gfc_omp_map_op map_op)
+gfc_match_omp_map_clause (gfc_omp_namelist **list, gfc_omp_map_op map_op,
+			  bool allow_common)
 {
   gfc_omp_namelist **head = NULL;
-  if (gfc_match_omp_variable_list ("", list, false, NULL, &head, true)
+  if (gfc_match_omp_variable_list ("", list, allow_common, NULL, &head, true)
       == MATCH_YES)
     {
       gfc_omp_namelist *n;
@@ -1051,7 +1052,7 @@  gfc_match_omp_clauses (gfc_omp_clauses **cp, const omp_mask mask,
 	  if ((mask & OMP_CLAUSE_COPY)
 	      && gfc_match ("copy ( ") == MATCH_YES
 	      && gfc_match_omp_map_clause (&c->lists[OMP_LIST_MAP],
-					   OMP_MAP_TOFROM))
+					   OMP_MAP_TOFROM, openacc))
 	    continue;
 	  if (mask & OMP_CLAUSE_COPYIN)
 	    {
@@ -1059,7 +1060,7 @@  gfc_match_omp_clauses (gfc_omp_clauses **cp, const omp_mask mask,
 		{
 		  if (gfc_match ("copyin ( ") == MATCH_YES
 		      && gfc_match_omp_map_clause (&c->lists[OMP_LIST_MAP],
-						   OMP_MAP_TO))
+						   OMP_MAP_TO, true))
 		    continue;
 		}
 	      else if (gfc_match_omp_variable_list ("copyin (",
@@ -1070,7 +1071,7 @@  gfc_match_omp_clauses (gfc_omp_clauses **cp, const omp_mask mask,
 	  if ((mask & OMP_CLAUSE_COPYOUT)
 	      && gfc_match ("copyout ( ") == MATCH_YES
 	      && gfc_match_omp_map_clause (&c->lists[OMP_LIST_MAP],
-					   OMP_MAP_FROM))
+					   OMP_MAP_FROM, true))
 	    continue;
 	  if ((mask & OMP_CLAUSE_COPYPRIVATE)
 	      && gfc_match_omp_variable_list ("copyprivate (",
@@ -1080,7 +1081,7 @@  gfc_match_omp_clauses (gfc_omp_clauses **cp, const omp_mask mask,
 	  if ((mask & OMP_CLAUSE_CREATE)
 	      && gfc_match ("create ( ") == MATCH_YES
 	      && gfc_match_omp_map_clause (&c->lists[OMP_LIST_MAP],
-					   OMP_MAP_ALLOC))
+					   OMP_MAP_ALLOC, true))
 	    continue;
 	  break;
 	case 'd':
@@ -1116,7 +1117,7 @@  gfc_match_omp_clauses (gfc_omp_clauses **cp, const omp_mask mask,
 	  if ((mask & OMP_CLAUSE_DELETE)
 	      && gfc_match ("delete ( ") == MATCH_YES
 	      && gfc_match_omp_map_clause (&c->lists[OMP_LIST_MAP],
-					   OMP_MAP_RELEASE))
+					   OMP_MAP_RELEASE, true))
 	    continue;
 	  if ((mask & OMP_CLAUSE_DEPEND)
 	      && gfc_match ("depend ( ") == MATCH_YES)
@@ -1168,12 +1169,12 @@  gfc_match_omp_clauses (gfc_omp_clauses **cp, const omp_mask mask,
 	      && openacc
 	      && gfc_match ("device ( ") == MATCH_YES
 	      && gfc_match_omp_map_clause (&c->lists[OMP_LIST_MAP],
-					   OMP_MAP_FORCE_TO))
+					   OMP_MAP_FORCE_TO, true))
 	    continue;
 	  if ((mask & OMP_CLAUSE_DEVICEPTR)
 	      && gfc_match ("deviceptr ( ") == MATCH_YES
 	      && gfc_match_omp_map_clause (&c->lists[OMP_LIST_MAP],
-					   OMP_MAP_FORCE_DEVICEPTR))
+					   OMP_MAP_FORCE_DEVICEPTR, false))
 	    continue;
 	  if ((mask & OMP_CLAUSE_DEVICE_RESIDENT)
 	      && gfc_match_omp_variable_list
@@ -1251,7 +1252,7 @@  gfc_match_omp_clauses (gfc_omp_clauses **cp, const omp_mask mask,
 	  if ((mask & OMP_CLAUSE_HOST_SELF)
 	      && gfc_match ("host ( ") == MATCH_YES
 	      && gfc_match_omp_map_clause (&c->lists[OMP_LIST_MAP],
-					   OMP_MAP_FORCE_FROM))
+					   OMP_MAP_FORCE_FROM, true))
 	    continue;
 	  break;
 	case 'i':
@@ -1523,47 +1524,47 @@  gfc_match_omp_clauses (gfc_omp_clauses **cp, const omp_mask mask,
 	  if ((mask & OMP_CLAUSE_COPY)
 	      && gfc_match ("pcopy ( ") == MATCH_YES
 	      && gfc_match_omp_map_clause (&c->lists[OMP_LIST_MAP],
-					   OMP_MAP_TOFROM))
+					   OMP_MAP_TOFROM, true))
 	    continue;
 	  if ((mask & OMP_CLAUSE_COPYIN)
 	      && gfc_match ("pcopyin ( ") == MATCH_YES
 	      && gfc_match_omp_map_clause (&c->lists[OMP_LIST_MAP],
-					   OMP_MAP_TO))
+					   OMP_MAP_TO, true))
 	    continue;
 	  if ((mask & OMP_CLAUSE_COPYOUT)
 	      && gfc_match ("pcopyout ( ") == MATCH_YES
 	      && gfc_match_omp_map_clause (&c->lists[OMP_LIST_MAP],
-					   OMP_MAP_FROM))
+					   OMP_MAP_FROM, true))
 	    continue;
 	  if ((mask & OMP_CLAUSE_CREATE)
 	      && gfc_match ("pcreate ( ") == MATCH_YES
 	      && gfc_match_omp_map_clause (&c->lists[OMP_LIST_MAP],
-					   OMP_MAP_ALLOC))
+					   OMP_MAP_ALLOC, true))
 	    continue;
 	  if ((mask & OMP_CLAUSE_PRESENT)
 	      && gfc_match ("present ( ") == MATCH_YES
 	      && gfc_match_omp_map_clause (&c->lists[OMP_LIST_MAP],
-					   OMP_MAP_FORCE_PRESENT))
+					   OMP_MAP_FORCE_PRESENT, false))
 	    continue;
 	  if ((mask & OMP_CLAUSE_COPY)
 	      && gfc_match ("present_or_copy ( ") == MATCH_YES
 	      && gfc_match_omp_map_clause (&c->lists[OMP_LIST_MAP],
-					   OMP_MAP_TOFROM))
+					   OMP_MAP_TOFROM, true))
 	    continue;
 	  if ((mask & OMP_CLAUSE_COPYIN)
 	      && gfc_match ("present_or_copyin ( ") == MATCH_YES
 	      && gfc_match_omp_map_clause (&c->lists[OMP_LIST_MAP],
-					   OMP_MAP_TO))
+					   OMP_MAP_TO, true))
 	    continue;
 	  if ((mask & OMP_CLAUSE_COPYOUT)
 	      && gfc_match ("present_or_copyout ( ") == MATCH_YES
 	      && gfc_match_omp_map_clause (&c->lists[OMP_LIST_MAP],
-					   OMP_MAP_FROM))
+					   OMP_MAP_FROM, true))
 	    continue;
 	  if ((mask & OMP_CLAUSE_CREATE)
 	      && gfc_match ("present_or_create ( ") == MATCH_YES
 	      && gfc_match_omp_map_clause (&c->lists[OMP_LIST_MAP],
-					   OMP_MAP_ALLOC))
+					   OMP_MAP_ALLOC, true))
 	    continue;
 	  if ((mask & OMP_CLAUSE_PRIORITY)
 	      && c->priority == NULL
@@ -1781,7 +1782,7 @@  gfc_match_omp_clauses (gfc_omp_clauses **cp, const omp_mask mask,
 	  if ((mask & OMP_CLAUSE_HOST_SELF)
 	      && gfc_match ("self ( ") == MATCH_YES
 	      && gfc_match_omp_map_clause (&c->lists[OMP_LIST_MAP],
-					   OMP_MAP_FORCE_FROM))
+					   OMP_MAP_FORCE_FROM, true))
 	    continue;
 	  if ((mask & OMP_CLAUSE_SEQ)
 	      && !c->seq
diff --git a/gcc/gimplify.c b/gcc/gimplify.c
index 836706961f3..258b756ef70 100644
--- a/gcc/gimplify.c
+++ b/gcc/gimplify.c
@@ -7218,15 +7218,20 @@  oacc_default_clause (struct gimplify_omp_ctx *ctx, tree decl, unsigned flags)
 {
   const char *rkind;
   bool on_device = false;
+  bool is_private = false;
   bool declared = is_oacc_declared (decl);
   tree type = TREE_TYPE (decl);
 
   if (lang_hooks.decls.omp_privatize_by_reference (decl))
     type = TREE_TYPE (type);
 
+  if (RECORD_OR_UNION_TYPE_P (type))
+    is_private = lang_hooks.decls.omp_disregard_value_expr (decl, false);
+
   if ((ctx->region_type & (ORT_ACC_PARALLEL | ORT_ACC_KERNELS)) != 0
       && is_global_var (decl)
-      && device_resident_p (decl))
+      && device_resident_p (decl)
+      && !is_private)
     {
       on_device = true;
       flags |= GOVD_MAP_TO_ONLY;
@@ -7237,7 +7242,9 @@  oacc_default_clause (struct gimplify_omp_ctx *ctx, tree decl, unsigned flags)
     case ORT_ACC_KERNELS:
       rkind = "kernels";
 
-      if (AGGREGATE_TYPE_P (type))
+      if (is_private)
+	flags |= GOVD_MAP;
+      else if (AGGREGATE_TYPE_P (type))
 	{
 	  /* Aggregates default to 'present_or_copy', or 'present'.  */
 	  if (ctx->default_kind != OMP_CLAUSE_DEFAULT_PRESENT)
@@ -7254,7 +7261,9 @@  oacc_default_clause (struct gimplify_omp_ctx *ctx, tree decl, unsigned flags)
     case ORT_ACC_PARALLEL:
       rkind = "parallel";
 
-      if (on_device || declared)
+      if (is_private)
+	flags |= GOVD_FIRSTPRIVATE;
+      else if (on_device || declared)
 	flags |= GOVD_MAP;
       else if (AGGREGATE_TYPE_P (type))
 	{
@@ -7320,7 +7329,8 @@  omp_notice_variable (struct gimplify_omp_ctx *ctx, tree decl, bool in_code)
 	{
 	  tree value = get_base_address (DECL_VALUE_EXPR (decl));
 
-	  if (value && DECL_P (value) && DECL_THREAD_LOCAL_P (value))
+	  if (!(ctx->region_type & ORT_ACC)
+	      && value && DECL_P (value) && DECL_THREAD_LOCAL_P (value))
 	    return omp_notice_threadprivate_variable (ctx, decl, value);
 	}
 
@@ -7352,7 +7362,8 @@  omp_notice_variable (struct gimplify_omp_ctx *ctx, tree decl, bool in_code)
   n = splay_tree_lookup (ctx->variables, (splay_tree_key)decl);
   if ((ctx->region_type & ORT_TARGET) != 0)
     {
-      ret = lang_hooks.decls.omp_disregard_value_expr (decl, true);
+      shared = !(ctx->region_type & ORT_ACC);
+      ret = lang_hooks.decls.omp_disregard_value_expr (decl, shared);
       if (n == NULL)
 	{
 	  unsigned nflags = flags;
@@ -7520,6 +7531,8 @@  omp_notice_variable (struct gimplify_omp_ctx *ctx, tree decl, bool in_code)
     }
 
   shared = ((flags | n->value) & GOVD_SHARED) != 0;
+  if (ctx->region_type & ORT_ACC)
+    shared = false;
   ret = lang_hooks.decls.omp_disregard_value_expr (decl, shared);
 
   /* If nothing changed, there's nothing left to do.  */
diff --git a/gcc/testsuite/gfortran.dg/goacc/common-block-1.f90 b/gcc/testsuite/gfortran.dg/goacc/common-block-1.f90
new file mode 100644
index 00000000000..1cbbb49d638
--- /dev/null
+++ b/gcc/testsuite/gfortran.dg/goacc/common-block-1.f90
@@ -0,0 +1,69 @@ 
+! Test data clauses involving common blocks and common block data.
+! Specifically, validates early matching errors.
+
+subroutine subtest
+  implicit none
+  integer, parameter :: n = 10
+  integer a(n), b(n), c, d(n), e
+  real*4 x(n), y(n), z, w(n), v
+  common /blockA/ a, c, x
+  common /blockB/ b, y, z
+  !$acc declare link(/blockA/, /blockB/, e, v)
+end subroutine subtest
+
+program test
+  implicit none
+  integer, parameter :: n = 10
+  integer a(n), b(n), c, d(n), e
+  real*4 x(n), y(n), z, w(n), v
+  common /blockA/ a, c, x
+  common /blockB/ b, y, z
+  !$acc declare link(/blockA/, /blockB/, e, v)
+
+  !$acc data copy(/blockA/, /blockB/, e, v)
+  !$acc end data
+
+  !$acc data copyin(/blockA/, /blockB/, e, v)
+  !$acc end data
+
+  !$acc data copyout(/blockA/, /blockB/, e, v)
+  !$acc end data
+
+  !$acc data create(/blockA/, /blockB/, e, v)
+  !$acc end data
+
+  !$acc data copyout(/blockA/, /blockB/, e, v)
+  !$acc end data
+
+  !$acc data pcopy(/blockA/, /blockB/, e, v)
+  !$acc end data
+
+  !$acc data pcopyin(/blockA/, /blockB/, e, v)
+  !$acc end data
+
+  !$acc data pcopyout(/blockA/, /blockB/, e, v)
+  !$acc end data
+
+  !$acc data pcreate(/blockA/, /blockB/, e, v)
+  !$acc end data
+
+  !$acc data pcopyout(/blockA/, /blockB/, e, v)
+  !$acc end data
+
+  !$acc data present(/blockA/, /blockB/, e, v) ! { dg-error "Syntax error in OpenMP variable list" }
+  !$acc end data ! { dg-error "Unexpected ..ACC END DATA statement" }
+
+  !$acc parallel private(/blockA/, /blockB/, e, v)
+  !$acc end parallel
+
+  !$acc parallel firstprivate(/blockA/, /blockB/, e, v)
+  !$acc end parallel
+
+  !$acc exit data delete(/blockA/, /blockB/, e, v)
+
+  !$acc data deviceptr(/blockA/, /blockB/, e, v) ! { dg-error "Syntax error in OpenMP variable list" }
+  !$acc end data ! { dg-error "Unexpected ..ACC END DATA statement" }
+
+  !$acc data deviceptr(/blockA/, /blockB/, e, v) ! { dg-error "Syntax error in OpenMP variable list" }
+  !$acc end data ! { dg-error "Unexpected ..ACC END DATA statement" }
+end program test
diff --git a/gcc/testsuite/gfortran.dg/goacc/common-block-2.f90 b/gcc/testsuite/gfortran.dg/goacc/common-block-2.f90
new file mode 100644
index 00000000000..b83638918a3
--- /dev/null
+++ b/gcc/testsuite/gfortran.dg/goacc/common-block-2.f90
@@ -0,0 +1,49 @@ 
+! Test data clauses involving common blocks and common block data.
+! Specifically, resolver errors such as duplicate data clauses.
+
+program test
+  implicit none
+  integer, parameter :: n = 10
+  integer a(n), b(n), c, d(n), e
+  real*4 x(n), y(n), z, w(n), v
+  common /blockA/ a, c, x
+  common /blockB/ b, y, z
+
+  !$acc data copy(/blockA/, /blockB/, e, v, a) ! { dg-error "Symbol .a. present on multiple clauses" }
+  !$acc end data
+
+  !$acc data copyin(/blockA/, /blockB/, e, v, a) ! { dg-error "Symbol .a. present on multiple clauses" }
+  !$acc end data
+
+  !$acc data copyout(/blockA/, /blockB/, e, v, a) ! { dg-error "Symbol .a. present on multiple clauses" }
+  !$acc end data
+
+  !$acc data create(/blockA/, /blockB/, e, v, a) ! { dg-error "Symbol .a. present on multiple clauses" }
+  !$acc end data
+
+  !$acc data copyout(/blockA/, /blockB/, e, v, a) ! { dg-error "Symbol .a. present on multiple clauses" }
+  !$acc end data
+
+  !$acc data pcopy(/blockA/, /blockB/, e, v, a) ! { dg-error "Symbol .a. present on multiple clauses" }
+  !$acc end data
+
+  !$acc data pcopyin(/blockA/, /blockB/, e, v, a) ! { dg-error "Symbol .a. present on multiple clauses" }
+  !$acc end data
+
+  !$acc data pcopyout(/blockA/, /blockB/, e, v, a) ! { dg-error "Symbol .a. present on multiple clauses" }
+  !$acc end data
+
+  !$acc data pcreate(/blockA/, /blockB/, e, v, a) ! { dg-error "Symbol .a. present on multiple clauses" }
+  !$acc end data
+
+  !$acc data pcopyout(/blockA/, /blockB/, e, v, a) ! { dg-error "Symbol .a. present on multiple clauses" }
+  !$acc end data
+
+  !$acc parallel private(/blockA/, /blockB/, e, v, a) ! { dg-error "Symbol .a. present on multiple clauses" }
+  !$acc end parallel
+
+  !$acc parallel firstprivate(/blockA/, /blockB/, e, v, a) ! { dg-error "Symbol .a. present on multiple clauses" }
+  !$acc end parallel
+
+  !$acc exit data delete(/blockA/, /blockB/, e, v, a) ! { dg-error "Symbol .a. present on multiple clauses" }
+end program test
diff --git a/libgomp/testsuite/libgomp.oacc-fortran/common-block-1.f90 b/libgomp/testsuite/libgomp.oacc-fortran/common-block-1.f90
new file mode 100644
index 00000000000..a17a33536f3
--- /dev/null
+++ b/libgomp/testsuite/libgomp.oacc-fortran/common-block-1.f90
@@ -0,0 +1,105 @@ 
+! Test data located inside common blocks.  This test does not exercise
+! ACC DECLARE.
+
+module const
+  integer, parameter :: n = 100
+end module const
+
+subroutine check
+  use const
+
+  implicit none
+  integer i, x(n), y
+  common /BLOCK/ x, y
+
+  do i = 1, n
+     if (x(i) .ne. y) call abort
+  end do
+end subroutine check
+
+module m
+  use const
+  integer a(n), b
+  common /BLOCK/ a, b
+
+contains
+  subroutine mod_implicit_incr
+    implicit none
+    integer i
+
+    !$acc parallel loop
+    do i = 1, n
+       a(i) = b
+    end do
+    !$acc end parallel loop
+
+    call check
+  end subroutine mod_implicit_incr
+
+  subroutine mod_explicit_incr
+    implicit none
+    integer i
+
+    !$acc parallel loop copy(a(1:n)) copyin(b)
+    do i = 1, n
+       a(i) = b
+    end do
+    !$acc end parallel loop
+
+    call check
+  end subroutine mod_explicit_incr
+end module m
+
+subroutine sub_implicit_incr
+  use const
+
+  implicit none
+  integer i, x(n), y
+  common /BLOCK/ x, y
+
+  !$acc parallel loop
+  do i = 1, n
+     x(i) = y
+  end do
+  !$acc end parallel loop
+
+  call check
+end subroutine sub_implicit_incr
+
+subroutine sub_explicit_incr
+  use const
+
+  implicit none
+  integer i, x(n), y
+  common /BLOCK/ x, y
+
+  !$acc parallel loop copy(x(1:n)) copyin(y)
+  do i = 1, n
+     x(i) = y
+  end do
+  !$acc end parallel loop
+
+  call check
+end subroutine sub_explicit_incr
+
+program main
+  use m
+
+  implicit none
+
+  a(:) = -1
+  b = 5
+  call mod_implicit_incr
+
+  a(:) = -2
+  b = 6
+  call mod_explicit_incr
+
+  a(:) = -3
+  b = 7
+  call sub_implicit_incr
+
+  a(:) = -4
+  b = 8
+  call sub_explicit_incr
+end program main
diff --git a/libgomp/testsuite/libgomp.oacc-fortran/common-block-2.f90 b/libgomp/testsuite/libgomp.oacc-fortran/common-block-2.f90
new file mode 100644
index 00000000000..e27a225a024
--- /dev/null
+++ b/libgomp/testsuite/libgomp.oacc-fortran/common-block-2.f90
@@ -0,0 +1,150 @@ 
+! Test data located inside common blocks.  This test does not exercise
+! ACC DECLARE.  All data clauses are explicit.
+
+module consts
+  integer, parameter :: n = 100
+end module consts
+
+subroutine validate
+  use consts
+
+  implicit none
+  integer i, j
+  real*4 x(n), y(n), z
+  common /BLOCK/ x, y, z, j
+
+  do i = 1, n
+     if (abs(x(i) - i - z) .ge. 0.0001) call abort
+  end do
+end subroutine validate
+
+subroutine incr
+  use consts
+
+  implicit none
+  integer i, j
+  real*4 x(n), y(n), z
+  common /BLOCK/ x, y, z, j
+
+  !$acc parallel loop pcopy(/BLOCK/)
+  do i = 1, n
+     x(i) = x(i) + z
+  end do
+  !$acc end parallel loop
+end subroutine incr
+
+program main
+  use consts
+
+  implicit none
+  integer i, j
+  real*4 a(n), b(n), c
+  common /BLOCK/ a, b, c, j
+
+  ! Test copyout, pcopy, device
+
+  !$acc data copyout(a, c)
+
+  c = 1.0
+
+  !$acc update device(c)
+
+  !$acc parallel loop pcopy(a)
+  do i = 1, n
+     a(i) = i
+  end do
+  !$acc end parallel loop
+
+  call incr
+  call incr
+  call incr
+  !$acc end data
+
+  c = 3.0
+  call validate
+
+  ! Test pcopy without copyout
+
+  c = 2.0
+  call incr
+  c = 5.0
+  call validate
+
+  ! Test create, delete, host, copyout, copyin
+
+  !$acc enter data create(b)
+
+  !$acc parallel loop pcopy(b)
+  do i = 1, n
+     b(i) = i
+  end do
+  !$acc end parallel loop
+
+  !$acc update host (b)
+
+  !$acc parallel loop pcopy(b) copyout(a) copyin(c)
+  do i = 1, n
+     a(i) = b(i) + c
+  end do
+  !$acc end parallel loop
+
+  !$acc exit data delete(b)
+
+  call validate
+
+  a(:) = b(:)
+  c = 0.0
+  call validate
+
+  ! Test copy
+
+  c = 1.0
+  !$acc parallel loop copy(/BLOCK/)
+  do i = 1, n
+     a(i) = b(i) + c
+  end do
+  !$acc end parallel loop
+
+  call validate
+
+  ! Test pcopyin, pcopyout FIXME
+
+  c = 2.0
+  !$acc data copyin(b, c) copyout(a)
+
+  !$acc parallel loop pcopyin(b, c) pcopyout(a)
+  do i = 1, n
+     a(i) = b(i) + c
+  end do
+  !$acc end parallel loop
+
+  !$acc end data
+
+  call validate
+
+  ! Test reduction, private
+
+  j = 0
+
+  !$acc parallel private(i) copy(j)
+  !$acc loop reduction(+:j)
+  do i = 1, n
+     j = j + 1
+  end do
+  !$acc end parallel
+
+  if (j .ne. n) call abort
+
+  ! Test firstprivate, copy
+
+  a(:) = 0
+  c = j
+
+  !$acc parallel loop firstprivate(c) copyout(a)
+  do i = 1, n
+     a(i) = i + c
+  end do
+  !$acc end parallel loop
+
+  call validate
+end program main
diff --git a/libgomp/testsuite/libgomp.oacc-fortran/common-block-3.f90 b/libgomp/testsuite/libgomp.oacc-fortran/common-block-3.f90
new file mode 100644
index 00000000000..90448d2da72
--- /dev/null
+++ b/libgomp/testsuite/libgomp.oacc-fortran/common-block-3.f90
@@ -0,0 +1,137 @@ 
+! Test data located inside common blocks.  This test does not exercise
+! ACC DECLARE.  Most of the data clauses are implicit.
+
+module consts
+  integer, parameter :: n = 100
+end module consts
+
+subroutine validate
+  use consts
+
+  implicit none
+  integer i, j
+  real*4 x(n), y(n), z
+  common /BLOCK/ x, y, z, j
+
+  do i = 1, n
+     if (abs(x(i) - i - z) .ge. 0.0001) call abort
+  end do
+end subroutine validate
+
+subroutine incr_parallel
+  use consts
+
+  implicit none
+  integer i, j
+  real*4 x(n), y(n), z
+  common /BLOCK/ x, y, z, j
+
+  !$acc parallel loop
+  do i = 1, n
+     x(i) = x(i) + z
+  end do
+  !$acc end parallel loop
+end subroutine incr_parallel
+
+subroutine incr_kernels
+  use consts
+
+  implicit none
+  integer i, j
+  real*4 x(n), y(n), z
+  common /BLOCK/ x, y, z, j
+
+  !$acc kernels
+  do i = 1, n
+     x(i) = x(i) + z
+  end do
+  !$acc end kernels
+end subroutine incr_kernels
+
+program main
+  use consts
+
+  implicit none
+  integer i, j
+  real*4 a(n), b(n), c
+  common /BLOCK/ a, b, c, j
+
+  !$acc data copyout(a, c)
+
+  c = 1.0
+
+  !$acc update device(c)
+
+  !$acc parallel loop
+  do i = 1, n
+     a(i) = i
+  end do
+  !$acc end parallel loop
+
+  call incr_parallel
+  call incr_parallel
+  call incr_parallel
+  !$acc end data
+
+  c = 3.0
+  call validate
+
+  ! Test pcopy without copyout
+
+  c = 2.0
+  call incr_kernels
+  c = 5.0
+  call validate
+
+  !$acc kernels
+  do i = 1, n
+     b(i) = i
+  end do
+  !$acc end kernels
+
+  !$acc parallel loop
+  do i = 1, n
+     a(i) = b(i) + c
+  end do
+  !$acc end parallel loop
+
+  call validate
+
+  a(:) = b(:)
+  c = 0.0
+  call validate
+
+  ! Test copy
+
+  c = 1.0
+  !$acc parallel loop
+  do i = 1, n
+     a(i) = b(i) + c
+  end do
+  !$acc end parallel loop
+
+  call validate
+
+  c = 2.0
+  !$acc data copyin(b, c) copyout(a)
+
+  !$acc kernels
+  do i = 1, n
+     a(i) = b(i) + c
+  end do
+  !$acc end kernels
+
+  !$acc end data
+
+  call validate
+
+  j = 0
+
+  !$acc parallel loop reduction(+:j)
+  do i = 1, n
+     j = j + 1
+  end do
+  !$acc end parallel loop
+
+  if (j .ne. n) call abort
+end program main