diff mbox

OpenACC routines in fortran modules

Message ID 5776D55A.4030002@codesourcery.com
State New
Headers show

Commit Message

Cesar Philippidis July 1, 2016, 8:40 p.m. UTC
It turns out that the acc routine parallelism isn't being recorded in
fortran .mod files. This is a problem because then the ME can't validate
if a routine has compatible parallelism with the call site. This patch
does two things:

 1. Encode gang, worker, vector and seq level parallelism in module
    files. This introduces a new oacc_function enum, which I ended
    up using to record the parallelism of standalone acc routines too.

 2. Extends gfc_match_oacc_routine to add acc routine directive support
    for intrinsic procedures such as abort.

Is this patch OK for trunk? I included support for intrinsic procedures
because it was necessary with my previous patch which treated all calls
to non-acc routines from within an OpenACC offloaded region as errors.
Now that it has been determined that those patches should be link time
errors, we technically don't need to add acc routine support for
intrinsic procedures. So I can drop that part of the patch if necessary.

Cesar

Comments

Tobias Burnus July 28, 2016, 9:55 a.m. UTC | #1
Cesar Philippidis wrote:
> It turns out that the acc routine parallelism isn't being recorded in
> fortran .mod files. This is a problem because then the ME can't validate
> if a routine has compatible parallelism with the call site. 

Nothing against saving such information in .mod files. However, I wonder
whether it can happen that one places such an 'acc routine' outside of a
module in one file - and still accesses it from another file. In the simple
non-ACC case, one can have:

!----- one.f90 ----
subroutine foo()
  print *, "abc"
end subroutine foo

!---- two.f90 ---
program example
  call foo()
end program example

where "foo()" is torn in without any information about it (except that it
is a subroutine, does not require an explicit interface, and takes no
arguments).

I don't know whether the ACC spec requires an explicit interface in that
case (i.e. for acc routines); I bet it does - or at least should. In that
case, something like the following would be valid - and should be supported
as well. (I don't know whether it currently is.)

!----- one.f90 ----
subroutine foo()
  !$acc routine gang
  .... ! something
end subroutine foo

!---- two.f90 ---
program example
  INTERFACE
    subroutine foo()
      !$acc routine gang
      ! Nothing here
    end subroutine foo
  END INTERFACE

  call foo()
end program example

Namely, a replication of the declaration of the procedure, including
the "acc routine", in the 'interface'.
(If one concats the two files, I would also expect an error with -fopenacc,
if the "acc routine" doesn't match between "foo" and the "foo" in the
"interface" block.)


Otherwise: Have you checked whether an unmodified gfortran still accepts the
.mod file written by the patched gfortran - and vice versa? Especially if
-fopenacc is not used, backward compatibility of .mod files is a goal.
(Even though we often have to bump the .mod version for major releases.)

Cheers,

Tobias
Cesar Philippidis July 28, 2016, 9:33 p.m. UTC | #2
On 07/28/2016 02:55 AM, Tobias Burnus wrote:
> Cesar Philippidis wrote:
>> It turns out that the acc routine parallelism isn't being recorded in
>> fortran .mod files. This is a problem because then the ME can't validate
>> if a routine has compatible parallelism with the call site. 
> 
> Nothing against saving such information in .mod files. However, I wonder
> whether it can happen that one places such an 'acc routine' outside of a
> module in one file - and still accesses it from another file. In the simple
> non-ACC case, one can have:
> 
> !----- one.f90 ----
> subroutine foo()
>   print *, "abc"
> end subroutine foo
> 
> !---- two.f90 ---
> program example
>   call foo()
> end program example
> 
> where "foo()" is torn in without any information about it (except that it
> is a subroutine, does not require an explicit interface, and takes no
> arguments).
> 
> I don't know whether the ACC spec requires an explicit interface in that
> case (i.e. for acc routines); I bet it does - or at least should. In that

Jakub and I discussed this issue a while ago. There were two major
problems with treating calls to non-acc routines as errors. 1) What do
we do about intrinsic procedures, and 2) how should builtin and
libc/libm functions get handled? Jakub and I came to the conclusion that
the linker should resolve those issues, hence this patch
<https://gcc.gnu.org/ml/gcc-patches/2016-07/msg00043.html> which teaches
the lto wrapper to error when it encounters missing symbols. From a
compiler standpoint, if the user does something like this

!$acc parallel
...
call foo()
...
!$acc end parallel

and if foo isn't marked as an acc routine, then the compiler will treat
that function as having an implicit 'acc routine seq'.

Note that trunk currently generates an error if the user tries apply an
acc routine directive on an intrinsic routine. This patch teaches
gfortran to accept acc routine directives on those procedures. However,
note that those routines aren't automatically parallelized though, i.e.
they are effectively implemented as 'acc routine seq'.

> case, something like the following would be valid - and should be supported
> as well. (I don't know whether it currently is.)
>
> !----- one.f90 ----
> subroutine foo()
>   !$acc routine gang
>   .... ! something
> end subroutine foo
> 
> !---- two.f90 ---
> program example
>   INTERFACE
>     subroutine foo()
>       !$acc routine gang
>       ! Nothing here
>     end subroutine foo
>   END INTERFACE
> 
>   call foo()
> end program example
> 
> Namely, a replication of the declaration of the procedure, including
> the "acc routine", in the 'interface'.
> (If one concats the two files, I would also expect an error with -fopenacc,
> if the "acc routine" doesn't match between "foo" and the "foo" in the
> "interface" block.)

I tested this case and it works. There is, however, a problem with
mismatched routine clauses. See PR72741 that Thomas filed recently.

> Otherwise: Have you checked whether an unmodified gfortran still accepts the
> .mod file written by the patched gfortran - and vice versa? Especially if
> -fopenacc is not used, backward compatibility of .mod files is a goal.
> (Even though we often have to bump the .mod version for major releases.)

I just tested this situation, and neither backward or forward compatible
isn't preserved. Basically, this patch introduces a mandatory
OACC_FUNCTION_ field inside the module file. Perhaps I should make that
field optional. At least that way we'd maintain backwards compatibility.
Is there something I can do to maintain forward compatibility?

Cesar
diff mbox

Patch

2016-07-01  Cesar Philippidis  <cesar@codesourcery.com>

	gcc/fortran/
	* gfortran.h (enum oacc_function): Define.
	(oacc_function_type): Declare.
	(symbol_attribute): Change the type of oacc_function from unsigned
	to an ENUM_BITFIELD.
	* module.c (oacc_function): New DECL_MIO_NAME.
	(mio_symbol_attribute): Set the oacc_function attribute.
	* openmp.c (gfc_oacc_routine_dims): Change the return type from
	int to oacc_function.
	(gfc_match_oacc_routine): Handle intrinsic procedures.
	* symbol.c (oacc_function_types): Define.
	* trans-decl.c (add_attributes_to_decl): Update to handle the
	retyped oacc_function attribute.

	gcc/testsuite/
	* gfortran.dg/goacc/fixed-1.f: Add test coverage.
	* gfortran.dg/goacc/routine-7.f90: New test.

	libgomp/
	* testsuite/libgomp.oacc-fortran/abort-1.f90: Test acc routine
	on intrinsic abort.
	* testsuite/libgomp.oacc-fortran/acc_on_device-1-2.f: Likewise.
	* testsuite/libgomp.oacc-fortran/routine-7.f90: Likewise.

diff --git a/gcc/fortran/gfortran.h b/gcc/fortran/gfortran.h
index 0bb71cb..fac94ca 100644
--- a/gcc/fortran/gfortran.h
+++ b/gcc/fortran/gfortran.h
@@ -303,6 +303,15 @@  enum save_state
 { SAVE_NONE = 0, SAVE_EXPLICIT, SAVE_IMPLICIT
 };
 
+/* Flags to keep track of ACC routine states.  */
+enum oacc_function
+{ OACC_FUNCTION_NONE = 0,
+  OACC_FUNCTION_SEQ,
+  OACC_FUNCTION_GANG,
+  OACC_FUNCTION_WORKER,
+  OACC_FUNCTION_VECTOR
+};
+
 /* Strings for all symbol attributes.  We use these for dumping the
    parse tree, in error messages, and also when reading and writing
    modules.  In symbol.c.  */
@@ -312,6 +321,7 @@  extern const mstring intents[];
 extern const mstring access_types[];
 extern const mstring ifsrc_types[];
 extern const mstring save_status[];
+extern const mstring oacc_function_types[];
 
 /* Enumeration of all the generic intrinsic functions.  Used by the
    backend for identification of a function.  */
@@ -862,7 +872,7 @@  typedef struct
   unsigned oacc_declare_link:1;
 
   /* This is an OpenACC acclerator function at level N - 1  */
-  unsigned oacc_function:3;
+  ENUM_BITFIELD (oacc_function) oacc_function:3;
 
   /* Attributes set by compiler extensions (!GCC$ ATTRIBUTES).  */
   unsigned ext_attr:EXT_ATTR_NUM;
diff --git a/gcc/fortran/module.c b/gcc/fortran/module.c
index 4d664f0..267858f 100644
--- a/gcc/fortran/module.c
+++ b/gcc/fortran/module.c
@@ -2095,6 +2095,7 @@  DECL_MIO_NAME (procedure_type)
 DECL_MIO_NAME (ref_type)
 DECL_MIO_NAME (sym_flavor)
 DECL_MIO_NAME (sym_intent)
+DECL_MIO_NAME (oacc_function)
 #undef DECL_MIO_NAME
 
 /* Symbol attributes are stored in list with the first three elements
@@ -2116,6 +2117,8 @@  mio_symbol_attribute (symbol_attribute *attr)
   attr->proc = MIO_NAME (procedure_type) (attr->proc, procedures);
   attr->if_source = MIO_NAME (ifsrc) (attr->if_source, ifsrc_types);
   attr->save = MIO_NAME (save_state) (attr->save, save_status);
+  attr->oacc_function = MIO_NAME (oacc_function) (attr->oacc_function,
+						  oacc_function_types);
 
   ext_attr = attr->ext_attr;
   mio_integer ((int *) &ext_attr);
diff --git a/gcc/fortran/openmp.c b/gcc/fortran/openmp.c
index 865e0d9..10b880c 100644
--- a/gcc/fortran/openmp.c
+++ b/gcc/fortran/openmp.c
@@ -1714,21 +1714,31 @@  gfc_match_oacc_cache (void)
 
 /* Determine the loop level for a routine.   */
 
-static int
+static oacc_function
 gfc_oacc_routine_dims (gfc_omp_clauses *clauses)
 {
   int level = -1;
+  oacc_function ret = OACC_FUNCTION_SEQ;
 
   if (clauses)
     {
       unsigned mask = 0;
 
       if (clauses->gang)
-	level = GOMP_DIM_GANG, mask |= GOMP_DIM_MASK (level);
+	{
+	  level = GOMP_DIM_GANG, mask |= GOMP_DIM_MASK (level);
+	  ret = OACC_FUNCTION_GANG;
+	}
       if (clauses->worker)
-	level = GOMP_DIM_WORKER, mask |= GOMP_DIM_MASK (level);
+	{
+	  level = GOMP_DIM_WORKER, mask |= GOMP_DIM_MASK (level);
+	  ret = OACC_FUNCTION_WORKER;
+	}
       if (clauses->vector)
-	level = GOMP_DIM_VECTOR, mask |= GOMP_DIM_MASK (level);
+	{
+	  level = GOMP_DIM_VECTOR, mask |= GOMP_DIM_MASK (level);
+	  ret = OACC_FUNCTION_VECTOR;
+	}
       if (clauses->seq)
 	level = GOMP_DIM_MAX, mask |= GOMP_DIM_MASK (level);
 
@@ -1736,10 +1746,7 @@  gfc_oacc_routine_dims (gfc_omp_clauses *clauses)
 	gfc_error ("Multiple loop axes specified for routine");
     }
 
-  if (level < 0)
-    level = GOMP_DIM_MAX;
-
-  return level;
+  return ret;
 }
 
 match
@@ -1750,6 +1757,7 @@  gfc_match_oacc_routine (void)
   match m;
   gfc_omp_clauses *c = NULL;
   gfc_oacc_routine_name *n = NULL;
+  gfc_intrinsic_sym *isym = NULL;
 
   old_loc = gfc_current_locus;
 
@@ -1767,12 +1775,14 @@  gfc_match_oacc_routine (void)
   if (m == MATCH_YES)
     {
       char buffer[GFC_MAX_SYMBOL_LEN + 1];
-      gfc_symtree *st;
+      gfc_symtree *st = NULL;
 
       m = gfc_match_name (buffer);
       if (m == MATCH_YES)
 	{
-	  st = gfc_find_symtree (gfc_current_ns->sym_root, buffer);
+	  if ((isym = gfc_find_function (buffer)) == NULL
+	      && (isym = gfc_find_subroutine (buffer)) == NULL)
+	    st = gfc_find_symtree (gfc_current_ns->sym_root, buffer);
 	  if (st)
 	    {
 	      sym = st->n.sym;
@@ -1780,7 +1790,7 @@  gfc_match_oacc_routine (void)
 	        sym = NULL;
 	    }
 
-	  if (st == NULL
+	  if ((st == NULL && isym == NULL)
 	      || (sym
 		  && !sym->attr.external
 		  && !sym->attr.function
@@ -1814,7 +1824,10 @@  gfc_match_oacc_routine (void)
 	  != MATCH_YES))
     return MATCH_ERROR;
 
-  if (sym != NULL)
+  if (isym != NULL)
+    /* There is nothing to do for intrinsic procedures.  */
+    ;
+  else if (sym != NULL)
     {
       n = gfc_get_oacc_routine_name ();
       n->sym = sym;
@@ -1832,7 +1845,7 @@  gfc_match_oacc_routine (void)
 				       &old_loc))
 	goto cleanup;
       gfc_current_ns->proc_name->attr.oacc_function
-	= gfc_oacc_routine_dims (c) + 1;
+	= gfc_oacc_routine_dims (c);
     }
 
   if (n)
diff --git a/gcc/fortran/symbol.c b/gcc/fortran/symbol.c
index 0ee7dec..b1dd32b 100644
--- a/gcc/fortran/symbol.c
+++ b/gcc/fortran/symbol.c
@@ -87,6 +87,15 @@  const mstring save_status[] =
     minit ("IMPLICIT-SAVE", SAVE_IMPLICIT),
 };
 
+const mstring oacc_function_types[] =
+{
+  minit ("NONE", OACC_FUNCTION_NONE),
+  minit ("OACC_FUNCTION_SEQ", OACC_FUNCTION_SEQ),
+  minit ("OACC_FUNCTION_GANG", OACC_FUNCTION_GANG),
+  minit ("OACC_FUNCTION_WORKER", OACC_FUNCTION_WORKER),
+  minit ("OACC_FUNCTION_VECTOR", OACC_FUNCTION_VECTOR)
+};
+
 /* This is to make sure the backend generates setup code in the correct
    order.  */
 
diff --git a/gcc/fortran/trans-decl.c b/gcc/fortran/trans-decl.c
index 2f5e434..04f9860 100644
--- a/gcc/fortran/trans-decl.c
+++ b/gcc/fortran/trans-decl.c
@@ -1327,11 +1327,26 @@  add_attributes_to_decl (symbol_attribute sym_attr, tree list)
     list = tree_cons (get_identifier ("omp declare target"),
 		      NULL_TREE, list);
 
-  if (sym_attr.oacc_function)
+  if (sym_attr.oacc_function != OACC_FUNCTION_NONE)
     {
       tree dims = NULL_TREE;
       int ix;
-      int level = sym_attr.oacc_function - 1;
+      int level = GOMP_DIM_MAX;
+
+      switch (sym_attr.oacc_function)
+	{
+	case OACC_FUNCTION_GANG:
+	  level = GOMP_DIM_GANG;
+	  break;
+	case OACC_FUNCTION_WORKER:
+	  level = GOMP_DIM_WORKER;
+	  break;
+	case OACC_FUNCTION_VECTOR:
+	  level = GOMP_DIM_VECTOR;
+	  break;
+	case OACC_FUNCTION_SEQ:
+	default:;
+	}
 
       for (ix = GOMP_DIM_MAX; ix--;)
 	dims = tree_cons (build_int_cst (boolean_type_node, ix >= level),
diff --git a/gcc/testsuite/gfortran.dg/goacc/fixed-1.f b/gcc/testsuite/gfortran.dg/goacc/fixed-1.f
index 6a454190..0c0fb98 100644
--- a/gcc/testsuite/gfortran.dg/goacc/fixed-1.f
+++ b/gcc/testsuite/gfortran.dg/goacc/fixed-1.f
@@ -1,3 +1,5 @@ 
+!$ACC ROUTINE(ABORT) SEQ
+
       INTEGER :: ARGC
       ARGC = COMMAND_ARGUMENT_COUNT ()
 
diff --git a/gcc/testsuite/gfortran.dg/goacc/routine-7.f90 b/gcc/testsuite/gfortran.dg/goacc/routine-7.f90
new file mode 100644
index 0000000..e1e0ab7
--- /dev/null
+++ b/gcc/testsuite/gfortran.dg/goacc/routine-7.f90
@@ -0,0 +1,69 @@ 
+! Test acc routines inside modules.
+
+! { dg-additional-options "-O0" }
+
+module routines
+contains
+  subroutine vector
+    implicit none
+    !$acc routine vector
+  end subroutine vector
+
+  subroutine worker
+    implicit none
+    !$acc routine worker
+  end subroutine worker
+
+  subroutine gang
+    implicit none
+    !$acc routine gang
+  end subroutine gang
+
+  subroutine seq
+    implicit none
+    !$acc routine seq
+  end subroutine seq
+end module routines
+
+program main
+  use routines
+  implicit none
+
+  integer :: i
+
+  !$acc parallel loop gang
+  do i = 1, 10
+     call gang ! { dg-error "routine call uses same OpenACC parallelism as containing loop" }
+     call worker
+     call vector
+     call seq
+  end do
+  !$acc end parallel loop
+
+  !$acc parallel loop worker
+  do i = 1, 10
+     call gang ! { dg-error "routine call uses same OpenACC parallelism as containing loop" }
+     call worker ! { dg-error "routine call uses same OpenACC parallelism as containing loop" }
+     call vector
+     call seq
+  end do
+  !$acc end parallel loop
+
+  !$acc parallel loop vector
+  do i = 1, 10
+     call gang ! { dg-error "routine call uses same OpenACC parallelism as containing loop" }
+     call worker ! { dg-error "routine call uses same OpenACC parallelism as containing loop" }
+     call vector ! { dg-error "routine call uses same OpenACC parallelism as containing loop" }
+     call seq
+  end do
+  !$acc end parallel loop
+
+  !$acc parallel loop seq
+  do i = 1, 10
+     call gang
+     call worker
+     call vector
+     call seq
+  end do
+  !$acc end parallel loop
+end program main
diff --git a/libgomp/testsuite/libgomp.oacc-fortran/abort-1.f90 b/libgomp/testsuite/libgomp.oacc-fortran/abort-1.f90
index b38303d..48ebc38 100644
--- a/libgomp/testsuite/libgomp.oacc-fortran/abort-1.f90
+++ b/libgomp/testsuite/libgomp.oacc-fortran/abort-1.f90
@@ -1,5 +1,6 @@ 
 program main
   implicit none
+  !$acc routine(abort) seq
 
   print *, "CheCKpOInT"
   !$acc parallel
diff --git a/libgomp/testsuite/libgomp.oacc-fortran/acc_on_device-1-2.f b/libgomp/testsuite/libgomp.oacc-fortran/acc_on_device-1-2.f
index a19045b..cbd1dd9 100644
--- a/libgomp/testsuite/libgomp.oacc-fortran/acc_on_device-1-2.f
+++ b/libgomp/testsuite/libgomp.oacc-fortran/acc_on_device-1-2.f
@@ -6,6 +6,7 @@ 
 
       USE OPENACC
       IMPLICIT NONE
+!$ACC ROUTINE(ABORT) SEQ
 
 !Host.
 
diff --git a/libgomp/testsuite/libgomp.oacc-fortran/routine-7.f90 b/libgomp/testsuite/libgomp.oacc-fortran/routine-7.f90
index 200188e..07cd6d9 100644
--- a/libgomp/testsuite/libgomp.oacc-fortran/routine-7.f90
+++ b/libgomp/testsuite/libgomp.oacc-fortran/routine-7.f90
@@ -1,121 +1,95 @@ 
+! Test acc routines inside modules.
 
 ! { dg-do run }
-! { dg-additional-options "-cpp" }
 
-#define M 8
-#define N 32
+module routines
+  integer, parameter :: N = 32
 
-program main
-  integer :: i
-  integer :: a(N)
-  integer :: b(M * N)
-
-  do i = 1, N
-    a(i) = 0
-  end do
+contains
+  subroutine vector (a)
+    implicit none
+    !$acc routine vector
+    integer, intent (inout) :: a(N)
+    integer :: i
 
-  !$acc parallel copy (a)
-  !$acc loop seq
+    !$acc loop vector
     do i = 1, N
-      call seq (a)
+       a(i) = 1
     end do
-  !$acc end parallel
+  end subroutine vector
 
-  do i = 1, N
-    if (a(i) .ne.N) call abort
-  end do
+  subroutine worker (a)
+    implicit none
+    !$acc routine worker
+    integer, intent (inout) :: a(N)
+    integer :: i
 
-  !$acc parallel copy (a)
-  !$acc loop seq
-    do i = 1, N 
-      call gang (a)
+    !$acc loop worker
+    do i = 1, N
+       a(i) = 2
     end do
-  !$acc end parallel
-
-  do i = 1, N
-    if (a(i) .ne. (N + (N * (-1 * i)))) call abort
-  end do
+  end subroutine worker
 
-  do i = 1, N
-    b(i) = i
-  end do
+  subroutine gang (a)
+    implicit none
+    !$acc routine gang
+    integer, intent (inout) :: a(N)
+    integer :: i
 
-  !$acc parallel copy (b)
-  !$acc loop seq
+    !$acc loop gang
     do i = 1, N
-      call worker (b)
+       a(i) = 3
     end do
-  !$acc end parallel
+  end subroutine gang
 
-  do i = 1, N
-    if (b(i) .ne. N + i) call abort
-  end do
+  subroutine seq (a)
+    implicit none
+    !$acc routine seq
+    integer, intent (inout) :: a(N)
+    integer :: i
 
-  do i = 1, N
-    a(i) = i
-  end do
-
-  !$acc parallel copy (a)
-  !$acc loop seq
     do i = 1, N
-      call vector (a)
+       a(i) = 4
     end do
-  !$acc end parallel
-
-  do i = 1, N
-    if (a(i) .ne. 0) call abort
-  end do
+  end subroutine seq
+end module routines
 
-contains
+program main
+  use routines
+  implicit none
 
-subroutine vector (a)
-  !$acc routine vector
-  integer, intent (inout) :: a(N)
   integer :: i
+  integer :: a(N)
+
+  !$acc parallel
+  call seq (a)
+  !$acc end parallel
 
-  !$acc loop vector
   do i = 1, N
-    a(i) = a(i) - a(i) 
+    if (a(i) .ne. 4) call abort
   end do
 
-end subroutine vector
-
-subroutine worker (b)
-  !$acc routine worker
-  integer, intent (inout) :: b(M*N)
-  integer :: i, j
+  !$acc parallel
+  call gang (a)
+  !$acc end parallel
 
-  !$acc loop worker
   do i = 1, N
-  !$acc loop vector
-    do j = 1, M
-      b(j + ((i - 1) * M)) = b(j + ((i - 1) * M)) + 1
-    end do
+    if (a(i) .ne. 3) call abort
   end do
 
-end subroutine worker
-
-subroutine gang (a)
-  !$acc routine gang
-  integer, intent (inout) :: a(N)
-  integer :: i
+  !$acc parallel
+  call worker (a)
+  !$acc end parallel
 
-  !$acc loop gang
   do i = 1, N
-    a(i) = a(i) - i 
+    if (a(i) .ne. 2) call abort
   end do
 
-end subroutine gang
-
-subroutine seq (a)
-  !$acc routine seq
-  integer, intent (inout) :: a(M)
-  integer :: i
+  !$acc parallel
+  call vector (a)
+  !$acc end parallel
 
   do i = 1, N
-    a(i) = a(i) + 1
+    if (a(i) .ne. 1) call abort
   end do
-
-end subroutine seq
-
 end program main