diff mbox

[WIP,PR,fortran/72741] Rework Fortran OpenACC routine clause handling (was: [PATCH] OpenACC routines in fortran modules)

Message ID 878tw35o6k.fsf@kepler.schwinge.homeip.net
State New
Headers show

Commit Message

Thomas Schwinge Aug. 11, 2016, 3:18 p.m. UTC
Hi!

This is still hacky and WIP; posting for Cesar and Tobias to have a look.
I'm still not too much of a Fortran person.  ;-)

On Fri, 1 Jul 2016 13:40:58 -0700, Cesar Philippidis <cesar@codesourcery.com> wrote:
> It turns out that the acc routine parallelism isn't being recorded in
> fortran .mod files. This is a problem because then the ME can't validate
> if a routine has compatible parallelism with the call site. This patch
> does two things:
> 
>  1. Encode gang, worker, vector and seq level parallelism in module
>     files. This introduces a new oacc_function enum, which I ended
>     up using to record the parallelism of standalone acc routines too.

Building on top of this patch, and on top of
<https://gcc.gnu.org/ml/gcc-patches/2016-07/msg01910.html> "[gomp4] Fix
PR72741", I reworked these patches (effectively reverting a lot of
Cesar's earlier changes, which nevertheless gave good guidance to me,
about which code I needed to touch).  With this patch, we now handle more
Fortran OpenACC routine directive use/misuse (see the test case changes),
much in spirit of what I discussed in <http://gcc.gnu.org/PR72741>
"Fortran OpenACC routine directive doesn't properly handle clauses
specifying the level of parallelism", minus items that Cesar already
clarified for me, where Fortran is different from what I expected,
different from the C/C++ environment I'm more used to.  This now also
paves the way for adding Fortran support to my recent patch
<https://gcc.gnu.org/ml/gcc-patches/2016-08/msg00069.html> "Use
verify_oacc_routine_clauses", and then ultimately
<https://gcc.gnu.org/ml/gcc-patches/2016-08/msg00071.html> "Repeated use
of the OpenACC routine directive".

However, my changes are still hacky and WIP, still contains a bunch of
TODOs.  Can you, Cesar and/or Tobias, please advise on these?

>  2. Extends gfc_match_oacc_routine to add acc routine directive support
>     for intrinsic procedures such as abort.
> 
> Is this patch OK for trunk? I included support for intrinsic procedures
> because it was necessary with my previous patch which treated all calls
> to non-acc routines from within an OpenACC offloaded region as errors.
> Now that it has been determined that those patches should be link time
> errors, we technically don't need to add acc routine support for
> intrinsic procedures. So I can drop that part of the patch if necessary.

That could've been a patch separate from the others, as it's doing a
separate thing.  We will want to handle intrinsics used with the OpenACC
routine directive with a name (but it certainly isn't a priority).  I
left in these changes, and also extended them a bit.

First some comments on Cesar's patch:

> --- a/gcc/fortran/module.c
> +++ b/gcc/fortran/module.c

>  [...]
> +DECL_MIO_NAME (oacc_function)
>  [...]

As discussed between Cesar and Tobias, these module.c/symbol.c changes
introduce an incompatibility in the Fortran module file format, which
we'll want to avoid.  Reverting that to use individual bit flags instead
of the "enum oacc_function", I think that we're safe (but I have not
verified that).  On the other hand, given that I'm not at all handling in
module.c/symbol.c the new "locus omp_clauses_locus" and "struct
symbol_attribute *next" members that I'm adding to "symbol_attribute",
I'm not sure whether I'm actually testing this properly.  ;-) I guess I'm
not.

> --- a/gcc/fortran/openmp.c
> +++ b/gcc/fortran/openmp.c

> @@ -1814,7 +1824,10 @@ gfc_match_oacc_routine (void)
>  	  != MATCH_YES))
>      return MATCH_ERROR;
>  
> -  if (sym != NULL)
> +  if (isym != NULL)
> +    /* There is nothing to do for intrinsic procedures.  */
> +    ;

We will want to check that no incompatible clauses are being specified,
for example (but, low priority).  I'm adding a hacky implementation of
that.

> +  else if (sym != NULL)
>      {
>        n = gfc_get_oacc_routine_name ();
>        n->sym = sym;

> --- a/gcc/fortran/trans-decl.c
> +++ b/gcc/fortran/trans-decl.c
> @@ -1327,11 +1327,26 @@ add_attributes_to_decl (symbol_attribute sym_attr, tree list)
>      list = tree_cons (get_identifier ("omp declare target"),
>  		      NULL_TREE, list);
>  
> -  if (sym_attr.oacc_function)
> +  if (sym_attr.oacc_function != OACC_FUNCTION_NONE)
>      {
>        tree dims = NULL_TREE;
>        int ix;
> -      int level = sym_attr.oacc_function - 1;
> +      int level = GOMP_DIM_MAX;
> +
> +      switch (sym_attr.oacc_function)
> +	{
> +	case OACC_FUNCTION_GANG:
> +	  level = GOMP_DIM_GANG;
> +	  break;
> +	case OACC_FUNCTION_WORKER:
> +	  level = GOMP_DIM_WORKER;
> +	  break;
> +	case OACC_FUNCTION_VECTOR:
> +	  level = GOMP_DIM_VECTOR;
> +	  break;
> +	case OACC_FUNCTION_SEQ:
> +	default:;
> +	}
>  
>        for (ix = GOMP_DIM_MAX; ix--;)
>  	dims = tree_cons (build_int_cst (boolean_type_node, ix >= level),

As discussed before, this should use the generic omp-low.c functions,
which I've implemented.

> --- /dev/null
> +++ b/gcc/testsuite/gfortran.dg/goacc/routine-7.f90
> @@ -0,0 +1,69 @@
> +! Test acc routines inside modules.
> +
> +! { dg-additional-options "-O0" }

-O0 to prevent inlining of functions tagged with OpenACC routine
directives, or another reason?

> --- a/libgomp/testsuite/libgomp.oacc-fortran/routine-7.f90
> +++ b/libgomp/testsuite/libgomp.oacc-fortran/routine-7.f90
> @@ -1,121 +1,95 @@
> +! Test acc routines inside modules.
>  
>  ! { dg-do run }
> -! { dg-additional-options "-cpp" }
>  
> -#define M 8
> -#define N 32
> +module routines
> +  integer, parameter :: N = 32
>  
> -program main
> -  integer :: i
> -  integer :: a(N)
> -  integer :: b(M * N)
> -
> -  do i = 1, N
> -    a(i) = 0
> -  end do
> +contains
> +  subroutine vector (a)
> +    implicit none
> +    !$acc routine vector
> +    integer, intent (inout) :: a(N)
> +    integer :: i
> [...]

This seems to completely rewrite the test case.  Is that intentional, or
should the original test case be preserved, and the changed/new/rewritten
one be added as a new test case?


Now, my hacky WIP patch.

One big chunk of the gcc/fortran/gfortran.h changes is just to move some
stuff around, without any changes, so that I can use "locus" in
"symbol_attribute".

I very much "cargo cult"ed all that "oacc_routine*" bit flag stuff in
module.c/symbol.c, replicating what's being done for "omp target
declare", without really knowing what I'm doing there.  I will appreciate
test cases actually exercising this code -- which doesn't currently at
all handle the new "locus omp_clauses_locus" and "struct symbol_attribute
*next" members that I'm adding to "symbol_attribute", as I've mentioned
before.  (But I suppose it should?)

We're not implementing the OpenACC device_type clause at the moment, so
the "TODO: handle device_type clauses" comment in
gcc/fortran/openmp.c:gfc_match_oacc_routine is not a concern right now.

With these changes, we're now actually also paying attention the clauses
specified with the OpenACC routine directive with a name -- one of the
things mentioned as missing in <http://gcc.gnu.org/PR72741> "Fortran
OpenACC routine directive doesn't properly handle clauses specifying the
level of parallelism".

To handle several "pending" OpenACC routine directives, I had to add the
"struct symbol_attribute *next" member to "symbol_attribute" -- I hope
that doesn't disqualify the proposed changes as too ugly.  (Several other
structs already contain such "next" pointers, and the use is very much
confined to only the OpenACC routine directive.)  I will of course be
happy to learn about a better/different way to do this.

commit ca4a098dab72f27c6e1121aa7e5e49764921974e
Author: Thomas Schwinge <thomas@codesourcery.com>
Date:   Thu Aug 11 16:34:22 2016 +0200

    [WIP] [PR fortran/72741] Rework Fortran OpenACC routine clause handling
---
 gcc/fortran/gfortran.h                             | 275 +++++++++++----------
 gcc/fortran/module.c                               |  34 ++-
 gcc/fortran/openmp.c                               | 106 ++++----
 gcc/fortran/symbol.c                               | 135 +++++++++-
 gcc/fortran/trans-decl.c                           | 106 ++++++--
 .../gfortran.dg/goacc/oaccdevlow-routine.f95       |   2 +-
 gcc/testsuite/gfortran.dg/goacc/pr72741-2.f        |  39 +++
 .../gfortran.dg/goacc/pr72741-intrinsic-1.f        |  16 ++
 gcc/testsuite/gfortran.dg/goacc/pr72741.f90        |  14 +-
 9 files changed, 501 insertions(+), 226 deletions(-)



Grüße
 Thomas

Comments

Jakub Jelinek Aug. 11, 2016, 3:40 p.m. UTC | #1
On Thu, Aug 11, 2016 at 05:18:43PM +0200, Thomas Schwinge wrote:
> --- gcc/fortran/gfortran.h
> +++ gcc/fortran/gfortran.h
> @@ -729,7 +839,7 @@ ext_attr_t;
>  extern const ext_attr_t ext_attr_list[];
>  
>  /* Symbol attribute structure.  */
> -typedef struct
> +typedef struct symbol_attribute
>  {
>    /* Variable attributes.  */
>    unsigned allocatable:1, dimension:1, codimension:1, external:1, intrinsic:1,
> @@ -864,6 +974,13 @@ typedef struct
>    /* Mentioned in OMP DECLARE TARGET.  */
>    unsigned omp_declare_target:1;
>  
> +  /* OpenACC routine.  */
> +  unsigned oacc_routine:1;
> +  unsigned oacc_routine_gang:1;
> +  unsigned oacc_routine_worker:1;
> +  unsigned oacc_routine_vector:1;
> +  unsigned oacc_routine_seq:1;
> +
>    /* Mentioned in OACC DECLARE.  */
>    unsigned oacc_declare_create:1;
>    unsigned oacc_declare_copyin:1;
> @@ -871,137 +988,24 @@ typedef struct
>    unsigned oacc_declare_device_resident:1;
>    unsigned oacc_declare_link:1;
>  
> -  /* This is an OpenACC acclerator function at level N - 1  */
> -  ENUM_BITFIELD (oacc_function) oacc_function:3;
> -
>    /* Attributes set by compiler extensions (!GCC$ ATTRIBUTES).  */
>    unsigned ext_attr:EXT_ATTR_NUM;
>  
> +  /* Location information for OMP clauses.  */
> +  //TODO: how to handle in module.c/symbol.c?
> +  locus omp_clauses_locus;
> +
>    /* The namespace where the attribute has been set.  */
>    struct gfc_namespace *volatile_ns, *asynchronous_ns;
> +
> +  /* Chain to another set of symbol attributes.  Currently only used for
> +     OpenACC routine.  */
> +  //TODO: how to handle in module.c/symbol.c?
> +  struct symbol_attribute *next;

While symbol_attribute is already bloated, I don't like bloating it this
much further.  Do you really need it for all symbols, or just all subroutines?
In the latter case, it is much better to add some openacc specific pointer
into the namespace structure and stick everything you need into some custom
structure it will refer to.  E.g. look at gfc_omp_declare_simd struct
in ns->omp_declare_simd.
omp_clauses_locus makes no sense, symbol_attribute contains parsed info from
many different clauses, which one it is?

	Jakub
Thomas Schwinge Aug. 11, 2016, 4:26 p.m. UTC | #2
Hi!

As Cesar asked for it, there is now a Git branch
tschwinge/omp/pr72741-wip containing these changes (plus some other
pending changes that I didn't single out at this time), at
<https://gcc.gnu.org/git/?p=gcc.git;a=shortlog;h=refs/heads/tschwinge/omp/pr72741-wip>.
(I expect it does, but I didn't verify that this actually builds; I have
further changes on top of that.)  Cesar, please tell me if you'd like me
to push this to GitHub, in case you want to use their review/commentary
functions, or the like.


On Thu, 11 Aug 2016 17:40:26 +0200, Jakub Jelinek <jakub@redhat.com> wrote:
> On Thu, Aug 11, 2016 at 05:18:43PM +0200, Thomas Schwinge wrote:
> > --- gcc/fortran/gfortran.h
> > +++ gcc/fortran/gfortran.h

> >  /* Symbol attribute structure.  */
> > -typedef struct
> > +typedef struct symbol_attribute
> >  {

> While symbol_attribute is already bloated, I don't like bloating it this
> much further.  Do you really need it for all symbols, or just all subroutines?

Certainly not for all symbole; just for what is valid to be used with the
OpenACC routine directive, which per OpenACC 2.0a, 2.13.1 Routine
Directive is:

    In Fortran the syntax of the routine directive is:
        !$acc routine clause-list
        !$acc routine( name ) clause-list
    In Fortran, the routine directive without a name may appear within the specification part of a subroutine or function definition, or within an interface body for a subroutine or function in an interface block, and applies to the containing subroutine or function. The routine directive with a name may appear in the specification part of a subroutine, function or module, and applies to the named subroutine or function.

(Pasting that in full just in case that contains some additional Fortran
lingo, meaning more than "subroutines".)

> In the latter case, it is much better to add some openacc specific pointer
> into the namespace structure and stick everything you need into some custom
> structure it will refer to.  E.g. look at gfc_omp_declare_simd struct
> in ns->omp_declare_simd.

Thanks for the suggestion, I'll look into that.


> omp_clauses_locus makes no sense, symbol_attribute contains parsed info from
> many different clauses, which one it is?

Well, it makes some sense -- it works no worse than the existing code ;-)
-- but I agree that it's not exactly pretty.  To the best of my
knowledge, in Fortran OpenACC/OpenMP clauses parsing, we're currently not
tracking (saving) specific location information for individual clauses
(at least, that's what a casual scan through the code, and
gfc_match_oacc_routine or gfc_match_omp_declare_target in particular make
me think: gfc_omp_clauses collects all clause data, but only contains a
single "locus loc" member (which maybe I should have used instead of
"old_loc", the location information for the directive itself?).  Maybe I
misunderstood, and we do have more precise location information available
for individual clauses?  In that case, I'll happily use that, of course.


Grüße
 Thomas
Jakub Jelinek Aug. 11, 2016, 4:41 p.m. UTC | #3
On Thu, Aug 11, 2016 at 06:26:50PM +0200, Thomas Schwinge wrote:
> > > --- gcc/fortran/gfortran.h
> > > +++ gcc/fortran/gfortran.h
> 
> > >  /* Symbol attribute structure.  */
> > > -typedef struct
> > > +typedef struct symbol_attribute
> > >  {
> 
> > While symbol_attribute is already bloated, I don't like bloating it this
> > much further.  Do you really need it for all symbols, or just all subroutines?
> 
> Certainly not for all symbole; just for what is valid to be used with the
> OpenACC routine directive, which per OpenACC 2.0a, 2.13.1 Routine
> Directive is:
> 
>     In Fortran the syntax of the routine directive is:
>         !$acc routine clause-list
>         !$acc routine( name ) clause-list
>     In Fortran, the routine directive without a name may appear within the specification part of a subroutine or function definition, or within an interface body for a subroutine or function in an interface block, and applies to the containing subroutine or function. The routine directive with a name may appear in the specification part of a subroutine, function or module, and applies to the named subroutine or function.
> 
> (Pasting that in full just in case that contains some additional Fortran
> lingo, meaning more than "subroutines".)

By "subroutines" I've meant of course also functions, those have their own
namespace structure too.

> > omp_clauses_locus makes no sense, symbol_attribute contains parsed info from
> > many different clauses, which one it is?
> 
> Well, it makes some sense -- it works no worse than the existing code ;-)
> -- but I agree that it's not exactly pretty.  To the best of my
> knowledge, in Fortran OpenACC/OpenMP clauses parsing, we're currently not
> tracking (saving) specific location information for individual clauses
> (at least, that's what a casual scan through the code, and
> gfc_match_oacc_routine or gfc_match_omp_declare_target in particular make
> me think: gfc_omp_clauses collects all clause data, but only contains a
> single "locus loc" member (which maybe I should have used instead of
> "old_loc", the location information for the directive itself?).  Maybe I
> misunderstood, and we do have more precise location information available
> for individual clauses?  In that case, I'll happily use that, of course.

The Fortran FE generally doesn't track locations of any of the attributes
symbols have, attributes as well as OpenMP clauses are represented just as
bits (for boolean stuff), etc., only if you have some expression you have
location for the expression.
I don't see what is so special on these clauses that they need to have
location tracked compared to say CONTIGUOUS or whatever other attribute, just
use the location of the function.  Unless of course you want to rewrite all
the Fortran FE data structures and track detailed locations for everything.
But just treating selected OpenACC clauses specially, ignoring how the FE is
structured, is at least inconsistent with the rest.

	Jakub
Cesar Philippidis Aug. 11, 2016, 4:44 p.m. UTC | #4
On 08/11/2016 08:18 AM, Thomas Schwinge wrote:

>> --- a/gcc/fortran/module.c
>> +++ b/gcc/fortran/module.c
> 
>>  [...]
>> +DECL_MIO_NAME (oacc_function)
>>  [...]
> 
> As discussed between Cesar and Tobias, these module.c/symbol.c changes
> introduce an incompatibility in the Fortran module file format, which
> we'll want to avoid.  Reverting that to use individual bit flags instead
> of the "enum oacc_function", I think that we're safe (but I have not
> verified that).  On the other hand, given that I'm not at all handling in
> module.c/symbol.c the new "locus omp_clauses_locus" and "struct
> symbol_attribute *next" members that I'm adding to "symbol_attribute",
> I'm not sure whether I'm actually testing this properly.  ;-) I guess I'm
> not.

How are you testing it? Basically, what you need to do is create two
source files, one containing a module and another with the program unit.
Then compile one of those files with the old, say gcc6 fortran, and the
other with trunk gfortran and try to link the .o files together.

I've attached some test cases so that you can experiment with. Each
driver file corresponds to a test file, with the exception of
test-driver which uses both test-interface.f90 and test-module.f90.

>> --- a/gcc/fortran/openmp.c
>> +++ b/gcc/fortran/openmp.c
> 
>> @@ -1814,7 +1824,10 @@ gfc_match_oacc_routine (void)
>>  	  != MATCH_YES))
>>      return MATCH_ERROR;
>>  
>> -  if (sym != NULL)
>> +  if (isym != NULL)
>> +    /* There is nothing to do for intrinsic procedures.  */
>> +    ;
> 
> We will want to check that no incompatible clauses are being specified,
> for example (but, low priority).  I'm adding a hacky implementation of
> that.

So this is what I was overlooking in PR72741. For some reason I was only
considering invalid clauses of the form

  !$acc routine gang worker

and not actually checking for compatible parallelism at the call sites.
The title "Fortran OpenACC routine directive doesn't properly handle
clauses specifying the level of parallelism" was kind of misleading.

Shouldn't the oaccdevlow pass already catch these types of errors already?

>> --- /dev/null
>> +++ b/gcc/testsuite/gfortran.dg/goacc/routine-7.f90
>> @@ -0,0 +1,69 @@
>> +! Test acc routines inside modules.
>> +
>> +! { dg-additional-options "-O0" }
> 
> -O0 to prevent inlining of functions tagged with OpenACC routine
> directives, or another reason?

I'm not sure why, but that's probably it.

>> --- a/libgomp/testsuite/libgomp.oacc-fortran/routine-7.f90
>> +++ b/libgomp/testsuite/libgomp.oacc-fortran/routine-7.f90
>> @@ -1,121 +1,95 @@
>> +! Test acc routines inside modules.
>>  
>>  ! { dg-do run }
>> -! { dg-additional-options "-cpp" }
>>  
>> -#define M 8
>> -#define N 32
>> +module routines
>> +  integer, parameter :: N = 32
>>  
>> -program main
>> -  integer :: i
>> -  integer :: a(N)
>> -  integer :: b(M * N)
>> -
>> -  do i = 1, N
>> -    a(i) = 0
>> -  end do
>> +contains
>> +  subroutine vector (a)
>> +    implicit none
>> +    !$acc routine vector
>> +    integer, intent (inout) :: a(N)
>> +    integer :: i
>> [...]
> 
> This seems to completely rewrite the test case.  Is that intentional, or
> should the original test case be preserved, and the changed/new/rewritten
> one be added as a new test case?

The original test was completely bogus because it had a lot of race
conditions when writing to variables. I could restore it, but then you'd
need to remove all of the gang, worker, vector clauses and force it to
run in seq. But that would defeat the intent behind the patch.

> Now, my hacky WIP patch.
> 
> One big chunk of the gcc/fortran/gfortran.h changes is just to move some
> stuff around, without any changes, so that I can use "locus" in
> "symbol_attribute".

I agree with Jakub about creating a new gfc_acc_routine struct to
contain the locus and clauses for acc routines. That way, you can also
link them together for device_type.

But at the same time, since device_type isn't a priority for in the near
term, we might be better off using the existing oacc_function and nohost
attribute bits instead of introducing a new struct.

> I very much "cargo cult"ed all that "oacc_routine*" bit flag stuff in
> module.c/symbol.c, replicating what's being done for "omp target
> declare", without really knowing what I'm doing there.  I will appreciate
> test cases actually exercising this code -- which doesn't currently at
> all handle the new "locus omp_clauses_locus" and "struct symbol_attribute
> *next" members that I'm adding to "symbol_attribute", as I've mentioned
> before.  (But I suppose it should?)
> 
> We're not implementing the OpenACC device_type clause at the moment, so
> the "TODO: handle device_type clauses" comment in
> gcc/fortran/openmp.c:gfc_match_oacc_routine is not a concern right now.
> 
> With these changes, we're now actually also paying attention the clauses
> specified with the OpenACC routine directive with a name -- one of the
> things mentioned as missing in <http://gcc.gnu.org/PR72741> "Fortran
> OpenACC routine directive doesn't properly handle clauses specifying the
> level of parallelism".
> 
> To handle several "pending" OpenACC routine directives, I had to add the
> "struct symbol_attribute *next" member to "symbol_attribute" -- I hope
> that doesn't disqualify the proposed changes as too ugly.  (Several other
> structs already contain such "next" pointers, and the use is very much
> confined to only the OpenACC routine directive.)  I will of course be
> happy to learn about a better/different way to do this.
> 
> commit ca4a098dab72f27c6e1121aa7e5e49764921974e
> Author: Thomas Schwinge <thomas@codesourcery.com>
> Date:   Thu Aug 11 16:34:22 2016 +0200
> 
>     [WIP] [PR fortran/72741] Rework Fortran OpenACC routine clause handling

>  /* Structure and list of supported extension attributes.  */
>  typedef enum
>  {
> @@ -729,7 +839,7 @@ ext_attr_t;
>  extern const ext_attr_t ext_attr_list[];
>  
>  /* Symbol attribute structure.  */
> -typedef struct
> +typedef struct symbol_attribute
>  {
>    /* Variable attributes.  */
>    unsigned allocatable:1, dimension:1, codimension:1, external:1, intrinsic:1,
> @@ -864,6 +974,13 @@ typedef struct
>    /* Mentioned in OMP DECLARE TARGET.  */
>    unsigned omp_declare_target:1;
>  
> +  /* OpenACC routine.  */
> +  unsigned oacc_routine:1;
> +  unsigned oacc_routine_gang:1;
> +  unsigned oacc_routine_worker:1;
> +  unsigned oacc_routine_vector:1;
> +  unsigned oacc_routine_seq:1;
> +
>    /* Mentioned in OACC DECLARE.  */
>    unsigned oacc_declare_create:1;
>    unsigned oacc_declare_copyin:1;
> @@ -871,137 +988,24 @@ typedef struct
>    unsigned oacc_declare_device_resident:1;
>    unsigned oacc_declare_link:1;
>  
> -  /* This is an OpenACC acclerator function at level N - 1  */
> -  ENUM_BITFIELD (oacc_function) oacc_function:3;
> -

I'm not sure what's better from a stylistic standpoint. Personally, I'd
prefer if all of these extra bits were coalesced into an oacc_routine
and oacc_declare enums. At least for acc routines, gang, worker, vector
and seq are all mutually exclusive.

> +++ gcc/fortran/module.c
> @@ -1986,6 +1986,7 @@ enum ab_attribute
>    AB_IS_CLASS, AB_PROCEDURE, AB_PROC_POINTER, AB_ASYNCHRONOUS, AB_CODIMENSION,
>    AB_COARRAY_COMP, AB_VTYPE, AB_VTAB, AB_CONTIGUOUS, AB_CLASS_POINTER,
>    AB_IMPLICIT_PURE, AB_ARTIFICIAL, AB_UNLIMITED_POLY, AB_OMP_DECLARE_TARGET,
> +  AB_OACC_ROUTINE, AB_OACC_ROUTINE_GANG, AB_OACC_ROUTINE_WORKER, AB_OACC_ROUTINE_VECTOR, AB_OACC_ROUTINE_SEQ,
>    AB_ARRAY_OUTER_DEPENDENCY, AB_MODULE_PROCEDURE, AB_OACC_DECLARE_CREATE,
>    AB_OACC_DECLARE_COPYIN, AB_OACC_DECLARE_DEVICEPTR,
>    AB_OACC_DECLARE_DEVICE_RESIDENT, AB_OACC_DECLARE_LINK
> @@ -2044,6 +2045,11 @@ static const mstring attr_bits[] =
>      minit ("IMPLICIT_PURE", AB_IMPLICIT_PURE),
>      minit ("UNLIMITED_POLY", AB_UNLIMITED_POLY),
>      minit ("OMP_DECLARE_TARGET", AB_OMP_DECLARE_TARGET),
> +    minit ("OACC_ROUTINE", AB_OACC_ROUTINE),
> +    minit ("OACC_ROUTINE_GANG", AB_OACC_ROUTINE_GANG),
> +    minit ("OACC_ROUTINE_WORKER", AB_OACC_ROUTINE_WORKER),
> +    minit ("OACC_ROUTINE_VECTOR", AB_OACC_ROUTINE_VECTOR),
> +    minit ("OACC_ROUTINE_SEQ", AB_OACC_ROUTINE_SEQ),
>      minit ("ARRAY_OUTER_DEPENDENCY", AB_ARRAY_OUTER_DEPENDENCY),
>      minit ("MODULE_PROCEDURE", AB_MODULE_PROCEDURE),
>      minit ("OACC_DECLARE_CREATE", AB_OACC_DECLARE_CREATE),
> @@ -2095,7 +2101,6 @@ DECL_MIO_NAME (procedure_type)
>  DECL_MIO_NAME (ref_type)
>  DECL_MIO_NAME (sym_flavor)
>  DECL_MIO_NAME (sym_intent)
> -DECL_MIO_NAME (oacc_function)
>  #undef DECL_MIO_NAME
>  
>  /* Symbol attributes are stored in list with the first three elements
> @@ -2117,8 +2122,6 @@ mio_symbol_attribute (symbol_attribute *attr)
>    attr->proc = MIO_NAME (procedure_type) (attr->proc, procedures);
>    attr->if_source = MIO_NAME (ifsrc) (attr->if_source, ifsrc_types);
>    attr->save = MIO_NAME (save_state) (attr->save, save_status);
> -  attr->oacc_function = MIO_NAME (oacc_function) (attr->oacc_function,
> -						  oacc_function_types);
>  
>    ext_attr = attr->ext_attr;
>    mio_integer ((int *) &ext_attr);
> @@ -2236,6 +2239,16 @@ mio_symbol_attribute (symbol_attribute *attr)
>  	MIO_NAME (ab_attribute) (AB_VTAB, attr_bits);
>        if (attr->omp_declare_target)
>  	MIO_NAME (ab_attribute) (AB_OMP_DECLARE_TARGET, attr_bits);
> +      if (attr->oacc_routine)
> +	MIO_NAME (ab_attribute) (AB_OACC_ROUTINE, attr_bits);
> +      if (attr->oacc_routine_gang)
> +	MIO_NAME (ab_attribute) (AB_OACC_ROUTINE_GANG, attr_bits);
> +      if (attr->oacc_routine_worker)
> +	MIO_NAME (ab_attribute) (AB_OACC_ROUTINE_WORKER, attr_bits);
> +      if (attr->oacc_routine_vector)
> +	MIO_NAME (ab_attribute) (AB_OACC_ROUTINE_VECTOR, attr_bits);
> +      if (attr->oacc_routine_seq)
> +	MIO_NAME (ab_attribute) (AB_OACC_ROUTINE_SEQ, attr_bits);
>        if (attr->array_outer_dependency)
>  	MIO_NAME (ab_attribute) (AB_ARRAY_OUTER_DEPENDENCY, attr_bits);
>        if (attr->module_procedure)
> @@ -2422,6 +2435,21 @@ mio_symbol_attribute (symbol_attribute *attr)
>  	    case AB_OMP_DECLARE_TARGET:
>  	      attr->omp_declare_target = 1;
>  	      break;
> +	    case AB_OACC_ROUTINE:
> +	      attr->oacc_routine = 1;
> +	      break;
> +	    case AB_OACC_ROUTINE_GANG:
> +	      attr->oacc_routine_gang = 1;
> +	      break;
> +	    case AB_OACC_ROUTINE_WORKER:
> +	      attr->oacc_routine_worker = 1;
> +	      break;
> +	    case AB_OACC_ROUTINE_VECTOR:
> +	      attr->oacc_routine_vector = 1;
> +	      break;
> +	    case AB_OACC_ROUTINE_SEQ:
> +	      attr->oacc_routine_seq = 1;
> +	      break;
>  	    case AB_ARRAY_OUTER_DEPENDENCY:
>  	      attr->array_outer_dependency =1;
>  	      break;

That seems similar to what my patch is did, albeit with some checking
deferred. I don't think this would maintain backwards compatibility with
object files generated by older versions of gcc.

Regarding backwards compatibility, maybe we should teach gfortran to
default to seq parallelism if an oacc_function attribute is missing in
an older version of the .mod file? I'm not sure if there's anything we
can do about forwards compatibility, i.e., linking a module generated by
gcc7 with gcc6.

> diff --git gcc/fortran/openmp.c gcc/fortran/openmp.c
> index 05e4661..5a69e38 100644
> --- gcc/fortran/openmp.c
> +++ gcc/fortran/openmp.c
> @@ -1714,44 +1714,6 @@ gfc_match_oacc_cache (void)
>    return MATCH_YES;
>  }
>  
> -/* Determine the loop level for a routine.  Returns OACC_FUNCTION_NONE if
> -   any error is detected.  */
> -
> -static oacc_function
> -gfc_oacc_routine_dims (gfc_omp_clauses *clauses)
> -{
> -  int level = -1;
> -  oacc_function ret = OACC_FUNCTION_SEQ;
> -
> -  if (clauses)
> -    {
> -      unsigned mask = 0;
> -
> -      if (clauses->gang)
> -	{
> -	  level = GOMP_DIM_GANG, mask |= GOMP_DIM_MASK (level);
> -	  ret = OACC_FUNCTION_GANG;
> -	}
> -      if (clauses->worker)
> -	{
> -	  level = GOMP_DIM_WORKER, mask |= GOMP_DIM_MASK (level);
> -	  ret = OACC_FUNCTION_WORKER;
> -	}
> -      if (clauses->vector)
> -	{
> -	  level = GOMP_DIM_VECTOR, mask |= GOMP_DIM_MASK (level);
> -	  ret = OACC_FUNCTION_VECTOR;
> -	}
> -      if (clauses->seq)
> -	level = GOMP_DIM_MAX, mask |= GOMP_DIM_MASK (level);
> -
> -      if (mask != (mask & -mask))
> -	ret = OACC_FUNCTION_NONE;
> -    }
> -
> -  return ret;
> -}
> -
>  match
>  gfc_match_oacc_routine (void)
>  {
> @@ -1761,7 +1723,8 @@ gfc_match_oacc_routine (void)
>    gfc_omp_clauses *c = NULL;
>    gfc_oacc_routine_name *n = NULL;
>    gfc_intrinsic_sym *isym = NULL;
> -  oacc_function dims = OACC_FUNCTION_NONE;
> +  symbol_attribute *add_attr = NULL;
> +  const char *add_attr_name = NULL;
>  
>    old_loc = gfc_current_locus;
>  
> @@ -1828,19 +1791,26 @@ gfc_match_oacc_routine (void)
>  	  != MATCH_YES))
>      return MATCH_ERROR;
>  
> -  dims = gfc_oacc_routine_dims (c);
> -  if (dims == OACC_FUNCTION_NONE)
> -    {
> -      gfc_error ("Multiple loop axes specified for routine %C");
> -      gfc_current_locus = old_loc;
> -      return MATCH_ERROR;
> -    }
> -
>    if (isym != NULL)
> -    /* There is nothing to do for intrinsic procedures.  */
> -    ;
> +    {
> +      //TODO gfc_intrinsic_sym doesn't have symbol_attribute?
> +      //add_attr = &isym->attr;
> +      //add_attr_name = NULL; //TODO
> +      /* Fake it.  TODO: handle device_type clauses...  */
> +      if (c->gang || c->worker || c->vector)
> +	{
> +	  gfc_error ("Intrinsic symbol specified in !$ACC ROUTINE ( NAME )"
> +		     " at %C, with incompatible clauses specifying the level"
> +		     " of parallelism");
> +	  gfc_current_locus = old_loc;
> +	  return MATCH_ERROR;
> +	}
> +    }
>    else if (sym != NULL)
>      {
> +      add_attr = &sym->attr;
> +      add_attr_name = NULL; //TODO
> +
>        n = gfc_get_oacc_routine_name ();
>        n->sym = sym;
>        n->clauses = NULL;
> @@ -1852,11 +1822,41 @@ gfc_match_oacc_routine (void)
>      }
>    else if (gfc_current_ns->proc_name)
>      {
> -      if (!gfc_add_omp_declare_target (&gfc_current_ns->proc_name->attr,
> -				       gfc_current_ns->proc_name->name,
> -				       &old_loc))
> +      add_attr = &gfc_current_ns->proc_name->attr;
> +      add_attr_name = gfc_current_ns->proc_name->name;
> +    }
> +  else
> +    gcc_unreachable ();
> +
> +  if (add_attr != NULL)
> +    {
> +      if (!gfc_add_omp_declare_target (add_attr, add_attr_name, &old_loc))
>  	goto cleanup;
> -      gfc_current_ns->proc_name->attr.oacc_function = dims;
> +      /* Skip over any existing symbol attributes capturing OpenACC routine
> +	 directives.  */
> +      while (add_attr->next != NULL)
> +	add_attr = add_attr->next;
> +      if (add_attr->oacc_routine)
> +	{
> +	  add_attr->next = XCNEW (symbol_attribute);
> +	  gfc_clear_attr (add_attr->next);
> +	  add_attr = add_attr->next;
> +	}
> +      if (!gfc_add_oacc_routine (add_attr, add_attr_name, &old_loc))
> +	goto cleanup;
> +      if (c && c->gang
> +	  && !gfc_add_oacc_routine_gang (add_attr, add_attr_name, &old_loc))
> +	goto cleanup;
> +      if (c && c->worker
> +	  && !gfc_add_oacc_routine_worker (add_attr, add_attr_name, &old_loc))
> +	goto cleanup;
> +      if (c && c->vector
> +	  && !gfc_add_oacc_routine_vector (add_attr, add_attr_name, &old_loc))
> +	goto cleanup;
> +      if (c && c->seq
> +	  && !gfc_add_oacc_routine_seq (add_attr, add_attr_name, &old_loc))
> +	goto cleanup;
> +      add_attr->omp_clauses_locus = old_loc; //TODO OK to just assign that?
>      }

This is another stylistic thing I don't like. Instead of having a single
function for mutually exclusive attributes, you need five. And each of
those functions are extremely similar, and I copy and paste issues when
dealing with such functions in the past.

With that in mind, I do see some value in preserving the routine clauses
and location information for device_type. But I thought that device_type
was more of a future project.

Cesar
diff mbox

Patch

diff --git gcc/fortran/gfortran.h gcc/fortran/gfortran.h
index c70f51f..5f19421 100644
--- gcc/fortran/gfortran.h
+++ gcc/fortran/gfortran.h
@@ -303,15 +303,6 @@  enum save_state
 { SAVE_NONE = 0, SAVE_EXPLICIT, SAVE_IMPLICIT
 };
 
-/* Flags to keep track of ACC routine states.  */
-enum oacc_function
-{ OACC_FUNCTION_NONE = 0,
-  OACC_FUNCTION_SEQ,
-  OACC_FUNCTION_GANG,
-  OACC_FUNCTION_WORKER,
-  OACC_FUNCTION_VECTOR
-};
-
 /* Strings for all symbol attributes.  We use these for dumping the
    parse tree, in error messages, and also when reading and writing
    modules.  In symbol.c.  */
@@ -321,7 +312,6 @@  extern const mstring intents[];
 extern const mstring access_types[];
 extern const mstring ifsrc_types[];
 extern const mstring save_status[];
-extern const mstring oacc_function_types[];
 
 /* Enumeration of all the generic intrinsic functions.  Used by the
    backend for identification of a function.  */
@@ -705,6 +695,126 @@  CInteropKind_t;
 extern CInteropKind_t c_interop_kinds_table[];
 
 
+/* We need to store source lines as sequences of multibyte source
+   characters. We define here a type wide enough to hold any multibyte
+   source character, just like libcpp does.  A 32-bit type is enough.  */
+
+#if HOST_BITS_PER_INT >= 32
+typedef unsigned int gfc_char_t;
+#elif HOST_BITS_PER_LONG >= 32
+typedef unsigned long gfc_char_t;
+#elif defined(HAVE_LONG_LONG) && (HOST_BITS_PER_LONGLONG >= 32)
+typedef unsigned long long gfc_char_t;
+#else
+# error "Cannot find an integer type with at least 32 bits"
+#endif
+
+
+/* The following three structures are used to identify a location in
+   the sources.
+
+   gfc_file is used to maintain a tree of the source files and how
+   they include each other
+
+   gfc_linebuf holds a single line of source code and information
+   which file it resides in
+
+   locus point to the sourceline and the character in the source
+   line.
+*/
+
+typedef struct gfc_file
+{
+  struct gfc_file *next, *up;
+  int inclusion_line, line;
+  char *filename;
+} gfc_file;
+
+typedef struct gfc_linebuf
+{
+  source_location location;
+  struct gfc_file *file;
+  struct gfc_linebuf *next;
+
+  int truncated;
+  bool dbg_emitted;
+
+  gfc_char_t line[1];
+} gfc_linebuf;
+
+#define gfc_linebuf_header_size (offsetof (gfc_linebuf, line))
+
+#define gfc_linebuf_linenum(LBUF) (LOCATION_LINE ((LBUF)->location))
+
+typedef struct
+{
+  gfc_char_t *nextc;
+  gfc_linebuf *lb;
+} locus;
+
+/* In order for the "gfc" format checking to work correctly, you must
+   have declared a typedef locus first.  */
+#if GCC_VERSION >= 4001
+#define ATTRIBUTE_GCC_GFC(m, n) __attribute__ ((__format__ (__gcc_gfc__, m, n))) ATTRIBUTE_NONNULL(m)
+#else
+#define ATTRIBUTE_GCC_GFC(m, n) ATTRIBUTE_NONNULL(m)
+#endif
+
+
+/* Suppress error messages or re-enable them.  */
+
+void gfc_push_suppress_errors (void);
+void gfc_pop_suppress_errors (void);
+
+
+/* Character length structures hold the expression that gives the
+   length of a character variable.  We avoid putting these into
+   gfc_typespec because doing so prevents us from doing structure
+   copies and forces us to deallocate any typespecs we create, as well
+   as structures that contain typespecs.  They also can have multiple
+   character typespecs pointing to them.
+
+   These structures form a singly linked list within the current
+   namespace and are deallocated with the namespace.  It is possible to
+   end up with gfc_charlen structures that have nothing pointing to them.  */
+
+typedef struct gfc_charlen
+{
+  struct gfc_expr *length;
+  struct gfc_charlen *next;
+  bool length_from_typespec; /* Length from explicit array ctor typespec?  */
+  tree backend_decl;
+  tree passed_length; /* Length argument explicitly passed.  */
+
+  int resolved;
+}
+gfc_charlen;
+
+#define gfc_get_charlen() XCNEW (gfc_charlen)
+
+/* Type specification structure.  */
+typedef struct
+{
+  bt type;
+  int kind;
+
+  union
+  {
+    struct gfc_symbol *derived;	/* For derived types only.  */
+    gfc_charlen *cl;		/* For character types only.  */
+    int pad;			/* For hollerith types only.  */
+  }
+  u;
+
+  struct gfc_symbol *interface;	/* For PROCEDURE declarations.  */
+  int is_c_interop;
+  int is_iso_c;
+  bt f90_type;
+  bool deferred;
+}
+gfc_typespec;
+
+
 /* Structure and list of supported extension attributes.  */
 typedef enum
 {
@@ -729,7 +839,7 @@  ext_attr_t;
 extern const ext_attr_t ext_attr_list[];
 
 /* Symbol attribute structure.  */
-typedef struct
+typedef struct symbol_attribute
 {
   /* Variable attributes.  */
   unsigned allocatable:1, dimension:1, codimension:1, external:1, intrinsic:1,
@@ -864,6 +974,13 @@  typedef struct
   /* Mentioned in OMP DECLARE TARGET.  */
   unsigned omp_declare_target:1;
 
+  /* OpenACC routine.  */
+  unsigned oacc_routine:1;
+  unsigned oacc_routine_gang:1;
+  unsigned oacc_routine_worker:1;
+  unsigned oacc_routine_vector:1;
+  unsigned oacc_routine_seq:1;
+
   /* Mentioned in OACC DECLARE.  */
   unsigned oacc_declare_create:1;
   unsigned oacc_declare_copyin:1;
@@ -871,137 +988,24 @@  typedef struct
   unsigned oacc_declare_device_resident:1;
   unsigned oacc_declare_link:1;
 
-  /* This is an OpenACC acclerator function at level N - 1  */
-  ENUM_BITFIELD (oacc_function) oacc_function:3;
-
   /* Attributes set by compiler extensions (!GCC$ ATTRIBUTES).  */
   unsigned ext_attr:EXT_ATTR_NUM;
 
+  /* Location information for OMP clauses.  */
+  //TODO: how to handle in module.c/symbol.c?
+  locus omp_clauses_locus;
+
   /* The namespace where the attribute has been set.  */
   struct gfc_namespace *volatile_ns, *asynchronous_ns;
+
+  /* Chain to another set of symbol attributes.  Currently only used for
+     OpenACC routine.  */
+  //TODO: how to handle in module.c/symbol.c?
+  struct symbol_attribute *next;
 }
 symbol_attribute;
 
 
-/* We need to store source lines as sequences of multibyte source
-   characters. We define here a type wide enough to hold any multibyte
-   source character, just like libcpp does.  A 32-bit type is enough.  */
-
-#if HOST_BITS_PER_INT >= 32
-typedef unsigned int gfc_char_t;
-#elif HOST_BITS_PER_LONG >= 32
-typedef unsigned long gfc_char_t;
-#elif defined(HAVE_LONG_LONG) && (HOST_BITS_PER_LONGLONG >= 32)
-typedef unsigned long long gfc_char_t;
-#else
-# error "Cannot find an integer type with at least 32 bits"
-#endif
-
-
-/* The following three structures are used to identify a location in
-   the sources.
-
-   gfc_file is used to maintain a tree of the source files and how
-   they include each other
-
-   gfc_linebuf holds a single line of source code and information
-   which file it resides in
-
-   locus point to the sourceline and the character in the source
-   line.
-*/
-
-typedef struct gfc_file
-{
-  struct gfc_file *next, *up;
-  int inclusion_line, line;
-  char *filename;
-} gfc_file;
-
-typedef struct gfc_linebuf
-{
-  source_location location;
-  struct gfc_file *file;
-  struct gfc_linebuf *next;
-
-  int truncated;
-  bool dbg_emitted;
-
-  gfc_char_t line[1];
-} gfc_linebuf;
-
-#define gfc_linebuf_header_size (offsetof (gfc_linebuf, line))
-
-#define gfc_linebuf_linenum(LBUF) (LOCATION_LINE ((LBUF)->location))
-
-typedef struct
-{
-  gfc_char_t *nextc;
-  gfc_linebuf *lb;
-} locus;
-
-/* In order for the "gfc" format checking to work correctly, you must
-   have declared a typedef locus first.  */
-#if GCC_VERSION >= 4001
-#define ATTRIBUTE_GCC_GFC(m, n) __attribute__ ((__format__ (__gcc_gfc__, m, n))) ATTRIBUTE_NONNULL(m)
-#else
-#define ATTRIBUTE_GCC_GFC(m, n) ATTRIBUTE_NONNULL(m)
-#endif
-
-
-/* Suppress error messages or re-enable them.  */
-
-void gfc_push_suppress_errors (void);
-void gfc_pop_suppress_errors (void);
-
-
-/* Character length structures hold the expression that gives the
-   length of a character variable.  We avoid putting these into
-   gfc_typespec because doing so prevents us from doing structure
-   copies and forces us to deallocate any typespecs we create, as well
-   as structures that contain typespecs.  They also can have multiple
-   character typespecs pointing to them.
-
-   These structures form a singly linked list within the current
-   namespace and are deallocated with the namespace.  It is possible to
-   end up with gfc_charlen structures that have nothing pointing to them.  */
-
-typedef struct gfc_charlen
-{
-  struct gfc_expr *length;
-  struct gfc_charlen *next;
-  bool length_from_typespec; /* Length from explicit array ctor typespec?  */
-  tree backend_decl;
-  tree passed_length; /* Length argument explicitly passed.  */
-
-  int resolved;
-}
-gfc_charlen;
-
-#define gfc_get_charlen() XCNEW (gfc_charlen)
-
-/* Type specification structure.  */
-typedef struct
-{
-  bt type;
-  int kind;
-
-  union
-  {
-    struct gfc_symbol *derived;	/* For derived types only.  */
-    gfc_charlen *cl;		/* For character types only.  */
-    int pad;			/* For hollerith types only.  */
-  }
-  u;
-
-  struct gfc_symbol *interface;	/* For PROCEDURE declarations.  */
-  int is_c_interop;
-  int is_iso_c;
-  bt f90_type;
-  bool deferred;
-}
-gfc_typespec;
-
 /* Array specification.  */
 typedef struct
 {
@@ -2816,6 +2820,11 @@  bool gfc_add_result (symbol_attribute *, const char *, locus *);
 bool gfc_add_save (symbol_attribute *, save_state, const char *, locus *);
 bool gfc_add_threadprivate (symbol_attribute *, const char *, locus *);
 bool gfc_add_omp_declare_target (symbol_attribute *, const char *, locus *);
+bool gfc_add_oacc_routine (symbol_attribute *, const char *, locus *);
+bool gfc_add_oacc_routine_gang (symbol_attribute *, const char *, locus *);
+bool gfc_add_oacc_routine_worker (symbol_attribute *, const char *, locus *);
+bool gfc_add_oacc_routine_vector (symbol_attribute *, const char *, locus *);
+bool gfc_add_oacc_routine_seq (symbol_attribute *, const char *, locus *);
 bool gfc_add_saved_common (symbol_attribute *, locus *);
 bool gfc_add_target (symbol_attribute *, locus *);
 bool gfc_add_dummy (symbol_attribute *, const char *, locus *);
diff --git gcc/fortran/module.c gcc/fortran/module.c
index 267858f..4b590c6 100644
--- gcc/fortran/module.c
+++ gcc/fortran/module.c
@@ -1986,6 +1986,7 @@  enum ab_attribute
   AB_IS_CLASS, AB_PROCEDURE, AB_PROC_POINTER, AB_ASYNCHRONOUS, AB_CODIMENSION,
   AB_COARRAY_COMP, AB_VTYPE, AB_VTAB, AB_CONTIGUOUS, AB_CLASS_POINTER,
   AB_IMPLICIT_PURE, AB_ARTIFICIAL, AB_UNLIMITED_POLY, AB_OMP_DECLARE_TARGET,
+  AB_OACC_ROUTINE, AB_OACC_ROUTINE_GANG, AB_OACC_ROUTINE_WORKER, AB_OACC_ROUTINE_VECTOR, AB_OACC_ROUTINE_SEQ,
   AB_ARRAY_OUTER_DEPENDENCY, AB_MODULE_PROCEDURE, AB_OACC_DECLARE_CREATE,
   AB_OACC_DECLARE_COPYIN, AB_OACC_DECLARE_DEVICEPTR,
   AB_OACC_DECLARE_DEVICE_RESIDENT, AB_OACC_DECLARE_LINK
@@ -2044,6 +2045,11 @@  static const mstring attr_bits[] =
     minit ("IMPLICIT_PURE", AB_IMPLICIT_PURE),
     minit ("UNLIMITED_POLY", AB_UNLIMITED_POLY),
     minit ("OMP_DECLARE_TARGET", AB_OMP_DECLARE_TARGET),
+    minit ("OACC_ROUTINE", AB_OACC_ROUTINE),
+    minit ("OACC_ROUTINE_GANG", AB_OACC_ROUTINE_GANG),
+    minit ("OACC_ROUTINE_WORKER", AB_OACC_ROUTINE_WORKER),
+    minit ("OACC_ROUTINE_VECTOR", AB_OACC_ROUTINE_VECTOR),
+    minit ("OACC_ROUTINE_SEQ", AB_OACC_ROUTINE_SEQ),
     minit ("ARRAY_OUTER_DEPENDENCY", AB_ARRAY_OUTER_DEPENDENCY),
     minit ("MODULE_PROCEDURE", AB_MODULE_PROCEDURE),
     minit ("OACC_DECLARE_CREATE", AB_OACC_DECLARE_CREATE),
@@ -2095,7 +2101,6 @@  DECL_MIO_NAME (procedure_type)
 DECL_MIO_NAME (ref_type)
 DECL_MIO_NAME (sym_flavor)
 DECL_MIO_NAME (sym_intent)
-DECL_MIO_NAME (oacc_function)
 #undef DECL_MIO_NAME
 
 /* Symbol attributes are stored in list with the first three elements
@@ -2117,8 +2122,6 @@  mio_symbol_attribute (symbol_attribute *attr)
   attr->proc = MIO_NAME (procedure_type) (attr->proc, procedures);
   attr->if_source = MIO_NAME (ifsrc) (attr->if_source, ifsrc_types);
   attr->save = MIO_NAME (save_state) (attr->save, save_status);
-  attr->oacc_function = MIO_NAME (oacc_function) (attr->oacc_function,
-						  oacc_function_types);
 
   ext_attr = attr->ext_attr;
   mio_integer ((int *) &ext_attr);
@@ -2236,6 +2239,16 @@  mio_symbol_attribute (symbol_attribute *attr)
 	MIO_NAME (ab_attribute) (AB_VTAB, attr_bits);
       if (attr->omp_declare_target)
 	MIO_NAME (ab_attribute) (AB_OMP_DECLARE_TARGET, attr_bits);
+      if (attr->oacc_routine)
+	MIO_NAME (ab_attribute) (AB_OACC_ROUTINE, attr_bits);
+      if (attr->oacc_routine_gang)
+	MIO_NAME (ab_attribute) (AB_OACC_ROUTINE_GANG, attr_bits);
+      if (attr->oacc_routine_worker)
+	MIO_NAME (ab_attribute) (AB_OACC_ROUTINE_WORKER, attr_bits);
+      if (attr->oacc_routine_vector)
+	MIO_NAME (ab_attribute) (AB_OACC_ROUTINE_VECTOR, attr_bits);
+      if (attr->oacc_routine_seq)
+	MIO_NAME (ab_attribute) (AB_OACC_ROUTINE_SEQ, attr_bits);
       if (attr->array_outer_dependency)
 	MIO_NAME (ab_attribute) (AB_ARRAY_OUTER_DEPENDENCY, attr_bits);
       if (attr->module_procedure)
@@ -2422,6 +2435,21 @@  mio_symbol_attribute (symbol_attribute *attr)
 	    case AB_OMP_DECLARE_TARGET:
 	      attr->omp_declare_target = 1;
 	      break;
+	    case AB_OACC_ROUTINE:
+	      attr->oacc_routine = 1;
+	      break;
+	    case AB_OACC_ROUTINE_GANG:
+	      attr->oacc_routine_gang = 1;
+	      break;
+	    case AB_OACC_ROUTINE_WORKER:
+	      attr->oacc_routine_worker = 1;
+	      break;
+	    case AB_OACC_ROUTINE_VECTOR:
+	      attr->oacc_routine_vector = 1;
+	      break;
+	    case AB_OACC_ROUTINE_SEQ:
+	      attr->oacc_routine_seq = 1;
+	      break;
 	    case AB_ARRAY_OUTER_DEPENDENCY:
 	      attr->array_outer_dependency =1;
 	      break;
diff --git gcc/fortran/openmp.c gcc/fortran/openmp.c
index 05e4661..5a69e38 100644
--- gcc/fortran/openmp.c
+++ gcc/fortran/openmp.c
@@ -1714,44 +1714,6 @@  gfc_match_oacc_cache (void)
   return MATCH_YES;
 }
 
-/* Determine the loop level for a routine.  Returns OACC_FUNCTION_NONE if
-   any error is detected.  */
-
-static oacc_function
-gfc_oacc_routine_dims (gfc_omp_clauses *clauses)
-{
-  int level = -1;
-  oacc_function ret = OACC_FUNCTION_SEQ;
-
-  if (clauses)
-    {
-      unsigned mask = 0;
-
-      if (clauses->gang)
-	{
-	  level = GOMP_DIM_GANG, mask |= GOMP_DIM_MASK (level);
-	  ret = OACC_FUNCTION_GANG;
-	}
-      if (clauses->worker)
-	{
-	  level = GOMP_DIM_WORKER, mask |= GOMP_DIM_MASK (level);
-	  ret = OACC_FUNCTION_WORKER;
-	}
-      if (clauses->vector)
-	{
-	  level = GOMP_DIM_VECTOR, mask |= GOMP_DIM_MASK (level);
-	  ret = OACC_FUNCTION_VECTOR;
-	}
-      if (clauses->seq)
-	level = GOMP_DIM_MAX, mask |= GOMP_DIM_MASK (level);
-
-      if (mask != (mask & -mask))
-	ret = OACC_FUNCTION_NONE;
-    }
-
-  return ret;
-}
-
 match
 gfc_match_oacc_routine (void)
 {
@@ -1761,7 +1723,8 @@  gfc_match_oacc_routine (void)
   gfc_omp_clauses *c = NULL;
   gfc_oacc_routine_name *n = NULL;
   gfc_intrinsic_sym *isym = NULL;
-  oacc_function dims = OACC_FUNCTION_NONE;
+  symbol_attribute *add_attr = NULL;
+  const char *add_attr_name = NULL;
 
   old_loc = gfc_current_locus;
 
@@ -1828,19 +1791,26 @@  gfc_match_oacc_routine (void)
 	  != MATCH_YES))
     return MATCH_ERROR;
 
-  dims = gfc_oacc_routine_dims (c);
-  if (dims == OACC_FUNCTION_NONE)
-    {
-      gfc_error ("Multiple loop axes specified for routine %C");
-      gfc_current_locus = old_loc;
-      return MATCH_ERROR;
-    }
-
   if (isym != NULL)
-    /* There is nothing to do for intrinsic procedures.  */
-    ;
+    {
+      //TODO gfc_intrinsic_sym doesn't have symbol_attribute?
+      //add_attr = &isym->attr;
+      //add_attr_name = NULL; //TODO
+      /* Fake it.  TODO: handle device_type clauses...  */
+      if (c->gang || c->worker || c->vector)
+	{
+	  gfc_error ("Intrinsic symbol specified in !$ACC ROUTINE ( NAME )"
+		     " at %C, with incompatible clauses specifying the level"
+		     " of parallelism");
+	  gfc_current_locus = old_loc;
+	  return MATCH_ERROR;
+	}
+    }
   else if (sym != NULL)
     {
+      add_attr = &sym->attr;
+      add_attr_name = NULL; //TODO
+
       n = gfc_get_oacc_routine_name ();
       n->sym = sym;
       n->clauses = NULL;
@@ -1852,11 +1822,41 @@  gfc_match_oacc_routine (void)
     }
   else if (gfc_current_ns->proc_name)
     {
-      if (!gfc_add_omp_declare_target (&gfc_current_ns->proc_name->attr,
-				       gfc_current_ns->proc_name->name,
-				       &old_loc))
+      add_attr = &gfc_current_ns->proc_name->attr;
+      add_attr_name = gfc_current_ns->proc_name->name;
+    }
+  else
+    gcc_unreachable ();
+
+  if (add_attr != NULL)
+    {
+      if (!gfc_add_omp_declare_target (add_attr, add_attr_name, &old_loc))
 	goto cleanup;
-      gfc_current_ns->proc_name->attr.oacc_function = dims;
+      /* Skip over any existing symbol attributes capturing OpenACC routine
+	 directives.  */
+      while (add_attr->next != NULL)
+	add_attr = add_attr->next;
+      if (add_attr->oacc_routine)
+	{
+	  add_attr->next = XCNEW (symbol_attribute);
+	  gfc_clear_attr (add_attr->next);
+	  add_attr = add_attr->next;
+	}
+      if (!gfc_add_oacc_routine (add_attr, add_attr_name, &old_loc))
+	goto cleanup;
+      if (c && c->gang
+	  && !gfc_add_oacc_routine_gang (add_attr, add_attr_name, &old_loc))
+	goto cleanup;
+      if (c && c->worker
+	  && !gfc_add_oacc_routine_worker (add_attr, add_attr_name, &old_loc))
+	goto cleanup;
+      if (c && c->vector
+	  && !gfc_add_oacc_routine_vector (add_attr, add_attr_name, &old_loc))
+	goto cleanup;
+      if (c && c->seq
+	  && !gfc_add_oacc_routine_seq (add_attr, add_attr_name, &old_loc))
+	goto cleanup;
+      add_attr->omp_clauses_locus = old_loc; //TODO OK to just assign that?
     }
 
   if (n)
diff --git gcc/fortran/symbol.c gcc/fortran/symbol.c
index 84fa2bd..36852da 100644
--- gcc/fortran/symbol.c
+++ gcc/fortran/symbol.c
@@ -87,15 +87,6 @@  const mstring save_status[] =
     minit ("IMPLICIT-SAVE", SAVE_IMPLICIT),
 };
 
-const mstring oacc_function_types[] =
-{
-  minit ("NONE", OACC_FUNCTION_NONE),
-  minit ("OACC_FUNCTION_SEQ", OACC_FUNCTION_SEQ),
-  minit ("OACC_FUNCTION_GANG", OACC_FUNCTION_GANG),
-  minit ("OACC_FUNCTION_WORKER", OACC_FUNCTION_WORKER),
-  minit ("OACC_FUNCTION_VECTOR", OACC_FUNCTION_VECTOR)
-};
-
 /* This is to make sure the backend generates setup code in the correct
    order.  */
 
@@ -385,6 +376,11 @@  check_conflict (symbol_attribute *attr, const char *name, locus *where)
     *contiguous = "CONTIGUOUS", *generic = "GENERIC";
   static const char *threadprivate = "THREADPRIVATE";
   static const char *omp_declare_target = "OMP DECLARE TARGET";
+  static const char *oacc_routine = "OACC ROUTINE";
+  static const char *oacc_routine_gang = "OACC ROUTINE GANG";
+  static const char *oacc_routine_worker = "OACC ROUTINE WORKER";
+  static const char *oacc_routine_vector = "OACC ROUTINE VECTOR";
+  static const char *oacc_routine_seq = "OACC ROUTINE SEQ";
   static const char *oacc_declare_copyin = "OACC DECLARE COPYIN";
   static const char *oacc_declare_create = "OACC DECLARE CREATE";
   static const char *oacc_declare_deviceptr = "OACC DECLARE DEVICEPTR";
@@ -482,6 +478,11 @@  check_conflict (symbol_attribute *attr, const char *name, locus *where)
   conf (dummy, intrinsic);
   conf (dummy, threadprivate);
   conf (dummy, omp_declare_target);
+  conf (dummy, oacc_routine);
+  conf (dummy, oacc_routine_gang);
+  conf (dummy, oacc_routine_worker);
+  conf (dummy, oacc_routine_vector);
+  conf (dummy, oacc_routine_seq);
   conf (pointer, target);
   conf (pointer, intrinsic);
   conf (pointer, elemental);
@@ -526,6 +527,11 @@  check_conflict (symbol_attribute *attr, const char *name, locus *where)
   conf (in_equivalence, allocatable);
   conf (in_equivalence, threadprivate);
   conf (in_equivalence, omp_declare_target);
+  conf (in_equivalence, oacc_routine);
+  conf (in_equivalence, oacc_routine_gang);
+  conf (in_equivalence, oacc_routine_worker);
+  conf (in_equivalence, oacc_routine_vector);
+  conf (in_equivalence, oacc_routine_seq);
   conf (in_equivalence, oacc_declare_create);
   conf (in_equivalence, oacc_declare_copyin);
   conf (in_equivalence, oacc_declare_deviceptr);
@@ -579,6 +585,11 @@  check_conflict (symbol_attribute *attr, const char *name, locus *where)
   conf (cray_pointee, in_equivalence);
   conf (cray_pointee, threadprivate);
   conf (cray_pointee, omp_declare_target);
+  conf (cray_pointee, oacc_routine);
+  conf (cray_pointee, oacc_routine_gang);
+  conf (cray_pointee, oacc_routine_worker);
+  conf (cray_pointee, oacc_routine_vector);
+  conf (cray_pointee, oacc_routine_seq);
   conf (cray_pointee, oacc_declare_create);
   conf (cray_pointee, oacc_declare_copyin);
   conf (cray_pointee, oacc_declare_deviceptr);
@@ -637,6 +648,11 @@  check_conflict (symbol_attribute *attr, const char *name, locus *where)
   conf (proc_pointer, abstract)
 
   conf (entry, omp_declare_target)
+  conf (entry, oacc_routine)
+  conf (entry, oacc_routine_gang)
+  conf (entry, oacc_routine_worker)
+  conf (entry, oacc_routine_vector)
+  conf (entry, oacc_routine_seq)
   conf (entry, oacc_declare_create)
   conf (entry, oacc_declare_copyin)
   conf (entry, oacc_declare_deviceptr)
@@ -678,6 +694,11 @@  check_conflict (symbol_attribute *attr, const char *name, locus *where)
       conf2 (subroutine);
       conf2 (threadprivate);
       conf2 (omp_declare_target);
+      conf2 (oacc_routine);
+      conf2 (oacc_routine_gang);
+      conf2 (oacc_routine_worker);
+      conf2 (oacc_routine_vector);
+      conf2 (oacc_routine_seq);
       conf2 (oacc_declare_create);
       conf2 (oacc_declare_copyin);
       conf2 (oacc_declare_deviceptr);
@@ -764,6 +785,11 @@  check_conflict (symbol_attribute *attr, const char *name, locus *where)
       conf2 (threadprivate);
       conf2 (result);
       conf2 (omp_declare_target);
+      conf2 (oacc_routine);
+      conf2 (oacc_routine_gang);
+      conf2 (oacc_routine_worker);
+      conf2 (oacc_routine_vector);
+      conf2 (oacc_routine_seq);
       conf2 (oacc_declare_create);
       conf2 (oacc_declare_copyin);
       conf2 (oacc_declare_deviceptr);
@@ -1266,7 +1292,6 @@  bool
 gfc_add_omp_declare_target (symbol_attribute *attr, const char *name,
 			    locus *where)
 {
-
   if (check_used (attr, name, where))
     return false;
 
@@ -1279,6 +1304,81 @@  gfc_add_omp_declare_target (symbol_attribute *attr, const char *name,
 
 
 bool
+gfc_add_oacc_routine (symbol_attribute *attr, const char *name,
+		      locus *where)
+{
+  if (check_used (attr, name, where))
+    return false;
+
+  if (attr->oacc_routine)
+    return true;
+
+  attr->oacc_routine = 1;
+  return check_conflict (attr, name, where);
+}
+
+
+bool
+gfc_add_oacc_routine_gang (symbol_attribute *attr, const char *name,
+			   locus *where)
+{
+  if (check_used (attr, name, where))
+    return false;
+
+  if (attr->oacc_routine_gang)
+    return true;
+
+  attr->oacc_routine_gang = 1;
+  return check_conflict (attr, name, where);
+}
+
+
+bool
+gfc_add_oacc_routine_worker (symbol_attribute *attr, const char *name,
+			     locus *where)
+{
+  if (check_used (attr, name, where))
+    return false;
+
+  if (attr->oacc_routine_worker)
+    return true;
+
+  attr->oacc_routine_worker = 1;
+  return check_conflict (attr, name, where);
+}
+
+
+bool
+gfc_add_oacc_routine_vector (symbol_attribute *attr, const char *name,
+			     locus *where)
+{
+  if (check_used (attr, name, where))
+    return false;
+
+  if (attr->oacc_routine_vector)
+    return true;
+
+  attr->oacc_routine_vector = 1;
+  return check_conflict (attr, name, where);
+}
+
+
+bool
+gfc_add_oacc_routine_seq (symbol_attribute *attr, const char *name,
+			  locus *where)
+{
+  if (check_used (attr, name, where))
+    return false;
+
+  if (attr->oacc_routine_seq)
+    return true;
+
+  attr->oacc_routine_seq = 1;
+  return check_conflict (attr, name, where);
+}
+
+
+bool
 gfc_add_oacc_declare_create (symbol_attribute *attr, const char *name,
 			     locus *where)
 {
@@ -1915,6 +2015,21 @@  gfc_copy_attr (symbol_attribute *dest, symbol_attribute *src, locus *where)
   if (src->omp_declare_target
       && !gfc_add_omp_declare_target (dest, NULL, where))
     goto fail;
+  if (src->oacc_routine
+      && !gfc_add_oacc_routine (dest, NULL, where))
+    goto fail;
+  if (src->oacc_routine_gang
+      && !gfc_add_oacc_routine_gang (dest, NULL, where))
+    goto fail;
+  if (src->oacc_routine_worker
+      && !gfc_add_oacc_routine_worker (dest, NULL, where))
+    goto fail;
+  if (src->oacc_routine_vector
+      && !gfc_add_oacc_routine_vector (dest, NULL, where))
+    goto fail;
+  if (src->oacc_routine_seq
+      && !gfc_add_oacc_routine_seq (dest, NULL, where))
+    goto fail;
   if (src->oacc_declare_create
       && !gfc_add_oacc_declare_create (dest, NULL, where))
     goto fail;
diff --git gcc/fortran/trans-decl.c gcc/fortran/trans-decl.c
index 1934453..d1b956c 100644
--- gcc/fortran/trans-decl.c
+++ gcc/fortran/trans-decl.c
@@ -46,6 +46,7 @@  along with GCC; see the file COPYING3.  If not see
 #include "trans-stmt.h"
 #include "gomp-constants.h"
 #include "gimplify.h"
+#include "omp-low.h"
 
 #define MAX_LABEL_VALUE 99999
 
@@ -1360,37 +1361,94 @@  add_attributes_to_decl (symbol_attribute sym_attr, tree list)
       }
 
   if (sym_attr.omp_declare_target)
-    list = tree_cons (get_identifier ("omp declare target"),
-		      NULL_TREE, list);
-
-  if (sym_attr.oacc_function != OACC_FUNCTION_NONE)
     {
-      tree dims = NULL_TREE;
-      int ix;
-      int level = GOMP_DIM_MAX;
+      tree clauses = NULL_TREE;
+      symbol_attribute *oacc_routine_attr = &sym_attr;
+      while (oacc_routine_attr != NULL
+	     && oacc_routine_attr->oacc_routine)
+	{
+	  location_t loc = oacc_routine_attr->omp_clauses_locus.lb->location;
+	  //TODO use gfc_trans_omp_clauses?
+	  tree clauses_ = NULL_TREE;
+	  if (oacc_routine_attr->oacc_routine_gang)
+	    {
+	      tree c = build_omp_clause (loc, OMP_CLAUSE_GANG);
+	      OMP_CLAUSE_CHAIN (c) = clauses_;
+	      clauses_ = c;
+	    }
+	  if (oacc_routine_attr->oacc_routine_worker)
+	    {
+	      tree c = build_omp_clause (loc, OMP_CLAUSE_WORKER);
+	      OMP_CLAUSE_CHAIN (c) = clauses_;
+	      clauses_ = c;
+	    }
+	  if (oacc_routine_attr->oacc_routine_vector)
+	    {
+	      tree c = build_omp_clause (loc, OMP_CLAUSE_VECTOR);
+	      OMP_CLAUSE_CHAIN (c) = clauses_;
+	      clauses_ = c;
+	    }
+	  /* Default to seq if nothing else has been specified.  */
+	  if (oacc_routine_attr->oacc_routine_seq
+	      || clauses_ == NULL_TREE)
+	    {
+	      tree c = build_omp_clause (loc, OMP_CLAUSE_SEQ);
+	      OMP_CLAUSE_CHAIN (c) = clauses_;
+	      clauses_ = c;
+	    }
+
+	  /* If we saw more than one clause specifying the level of
+	     parallelism...  */
+	  if (OMP_CLAUSE_CHAIN (clauses_) != NULL_TREE)
+	    {
+	      gfc_error ("Multiple loop axes specified for routine at %L",
+			 &oacc_routine_attr->omp_clauses_locus);
+
+	      /* ..., only one clause survives.  */
+	      OMP_CLAUSE_CHAIN (clauses_) = NULL_TREE;
+	    }
+
+	  OMP_CLAUSE_CHAIN (clauses_) = clauses;
+	  clauses = clauses_;
+
+	  oacc_routine_attr = oacc_routine_attr->next;
+	}
 
-      switch (sym_attr.oacc_function)
+      /* For any chained symbol attributes for OpenACC routine, handle, and
+	 clean these up.  */
+      while (sym_attr.next != NULL)
 	{
-	case OACC_FUNCTION_GANG:
-	  level = GOMP_DIM_GANG;
-	  break;
-	case OACC_FUNCTION_WORKER:
-	  level = GOMP_DIM_WORKER;
-	  break;
-	case OACC_FUNCTION_VECTOR:
-	  level = GOMP_DIM_VECTOR;
-	  break;
-	case OACC_FUNCTION_SEQ:
-	default:;
+	  symbol_attribute *sym_attr_next = sym_attr.next->next;
+
+	  gfc_error ("!$ACC ROUTINE already applied at %L",
+		     &sym_attr.next->omp_clauses_locus);
+
+	  free (sym_attr.next);
+
+	  sym_attr.next = sym_attr_next;
 	}
 
-      for (ix = GOMP_DIM_MAX; ix--;)
-	dims = tree_cons (build_int_cst (boolean_type_node, ix >= level),
-			  integer_zero_node, dims);
+      if (sym_attr.oacc_routine)
+	{
+	  gcc_checking_assert (clauses != NULL_TREE);
+	  /* If we saw more than one set of symbol attributes for OpenACC
+	     routine, only one clause survives.  */
+	  OMP_CLAUSE_CHAIN (clauses) = NULL_TREE;
 
-      list = tree_cons (get_identifier ("oacc function"),
-			dims, list);
+	  /* Set the routine's level of parallelism.  */
+	  tree dims = build_oacc_routine_dims (clauses);
+#if 0
+	  // TODO Can we call this before decl_attributes has been called, which happens only after returning from add_attributes_to_decl?
+	  replace_oacc_fn_attrib (fndecl, dims);
+#else
+	  list = tree_cons (get_identifier ("oacc function"),
+			    dims, list);
+#endif
+	}
+      list = tree_cons (get_identifier ("omp declare target"),
+			NULL_TREE, list);
     }
+  gcc_checking_assert (sym_attr.next == NULL);
 
   return list;
 }
diff --git gcc/testsuite/gfortran.dg/goacc/oaccdevlow-routine.f95 gcc/testsuite/gfortran.dg/goacc/oaccdevlow-routine.f95
index 2161fe2..6af19d5 100644
--- gcc/testsuite/gfortran.dg/goacc/oaccdevlow-routine.f95
+++ gcc/testsuite/gfortran.dg/goacc/oaccdevlow-routine.f95
@@ -20,7 +20,7 @@  subroutine ROUTINE
 end subroutine ROUTINE
 
 ! Check the offloaded function's attributes.
-! { dg-final { scan-tree-dump-times "(?n)__attribute__\\(\\(omp declare target, oacc function \\(0 0, 1 0, 1 0\\)\\)\\)" 1 "ompexp" } }
+! { dg-final { scan-tree-dump-times "(?n)__attribute__\\(\\(oacc function \\(0 1, 1 0, 1 0\\), omp declare target\\)\\)" 1 "ompexp" } }
 
 ! Check the offloaded function's classification and compute dimensions (will
 ! always be [1, 1, 1] for target compilation).
diff --git gcc/testsuite/gfortran.dg/goacc/pr72741-2.f gcc/testsuite/gfortran.dg/goacc/pr72741-2.f
new file mode 100644
index 0000000..e0c35d6
--- /dev/null
+++ gcc/testsuite/gfortran.dg/goacc/pr72741-2.f
@@ -0,0 +1,39 @@ 
+      SUBROUTINE v_1
+!$ACC ROUTINE
+!$ACC ROUTINE ! { dg-error "\\!\\\$ACC ROUTINE already applied" }
+!$ACC ROUTINE GANG ! { dg-error "\\!\\\$ACC ROUTINE already applied" }
+!$ACC ROUTINE SEQ ! { dg-error "\\!\\\$ACC ROUTINE already applied" }
+!$ACC ROUTINE ! { dg-error "\\!\\\$ACC ROUTINE already applied" }
+!$ACC ROUTINE WORKER ! { dg-error "\\!\\\$ACC ROUTINE already applied" }
+      END SUBROUTINE v_1
+
+      SUBROUTINE sub_1
+      IMPLICIT NONE
+      EXTERNAL :: g_1
+!$ACC ROUTINE (g_1) GANG
+!$ACC ROUTINE (g_1) VECTOR ! { dg-error "\\!\\\$ACC ROUTINE already applied" }
+!$ACC ROUTINE (g_1) SEQ ! { dg-error "\\!\\\$ACC ROUTINE already applied" }
+!$ACC ROUTINE (g_1) ! { dg-error "\\!\\\$ACC ROUTINE already applied" }
+!$ACC ROUTINE (g_1) ! { dg-error "\\!\\\$ACC ROUTINE already applied" }
+
+      CALL v_1
+      CALL g_1
+      CALL ABORT
+      END SUBROUTINE sub_1
+
+      MODULE m_w_1
+      IMPLICIT NONE
+      EXTERNAL :: w_1
+!$ACC ROUTINE (w_1) WORKER
+!$ACC ROUTINE (w_1) ! { dg-error "\\!\\\$ACC ROUTINE already applied" }
+!$ACC ROUTINE (w_1) SEQ ! { dg-error "\\!\\\$ACC ROUTINE already applied" }
+!$ACC ROUTINE (w_1) ! { dg-error "\\!\\\$ACC ROUTINE already applied" }
+!$ACC ROUTINE (w_1) VECTOR ! { dg-error "\\!\\\$ACC ROUTINE already applied" }
+
+      CONTAINS
+      SUBROUTINE sub_2
+      CALL v_1
+      CALL w_1
+      CALL ABORT
+      END SUBROUTINE sub_2
+      END MODULE m_w_1
diff --git gcc/testsuite/gfortran.dg/goacc/pr72741-intrinsic-1.f gcc/testsuite/gfortran.dg/goacc/pr72741-intrinsic-1.f
new file mode 100644
index 0000000..d84cdf9
--- /dev/null
+++ gcc/testsuite/gfortran.dg/goacc/pr72741-intrinsic-1.f
@@ -0,0 +1,16 @@ 
+      SUBROUTINE sub_1
+      IMPLICIT NONE
+!$ACC ROUTINE (ABORT) SEQ VECTOR ! { dg-error "Intrinsic symbol specified in \\!\\\$ACC ROUTINE \\( NAME \\) at \\(1\\), with incompatible clauses specifying the level of parallelism" }
+
+      CALL ABORT
+      END SUBROUTINE sub_1
+
+      MODULE m_w_1
+      IMPLICIT NONE
+!$ACC ROUTINE (ABORT) VECTOR GANG ! { dg-error "Intrinsic symbol specified in \\!\\\$ACC ROUTINE \\( NAME \\) at \\(1\\), with incompatible clauses specifying the level of parallelism" }
+
+      CONTAINS
+      SUBROUTINE sub_2
+      CALL ABORT
+      END SUBROUTINE sub_2
+      END MODULE m_w_1
diff --git gcc/testsuite/gfortran.dg/goacc/pr72741.f90 gcc/testsuite/gfortran.dg/goacc/pr72741.f90
index cf89727..bf47fc2 100644
--- gcc/testsuite/gfortran.dg/goacc/pr72741.f90
+++ gcc/testsuite/gfortran.dg/goacc/pr72741.f90
@@ -1,12 +1,19 @@ 
 SUBROUTINE v_1
   !$ACC ROUTINE VECTOR WORKER ! { dg-error "Multiple loop axes" }
+  !$ACC ROUTINE VECTOR ! { dg-error "\\!\\\$ACC ROUTINE already applied" }
+  !$ACC ROUTINE ! { dg-error "\\!\\\$ACC ROUTINE already applied" }
+  !$ACC ROUTINE GANG VECTOR ! { dg-error "Multiple loop axes" }
+  ! { dg-error "\\!\\\$ACC ROUTINE already applied" "already" { target *-*-* } 5 }
 END SUBROUTINE v_1
 
 SUBROUTINE sub_1
   IMPLICIT NONE
   EXTERNAL :: g_1
   !$ACC ROUTINE (g_1) GANG WORKER ! { dg-error "Multiple loop axes" }
-  !$ACC ROUTINE (ABORT) SEQ VECTOR ! { dg-error "Multiple loop axes" }
+  !$ACC ROUTINE (g_1) GANG ! { dg-error "\\!\\\$ACC ROUTINE already applied" }
+  !$ACC ROUTINE (g_1) ! { dg-error "\\!\\\$ACC ROUTINE already applied" }
+  !$ACC ROUTINE (g_1) VECTOR GANG ! { dg-error "Multiple loop axes" }
+  ! { dg-error "\\!\\\$ACC ROUTINE already applied" "already" { target *-*-* } 15 }
 
   CALL v_1
   CALL g_1
@@ -17,7 +24,10 @@  MODULE m_w_1
   IMPLICIT NONE
   EXTERNAL :: w_1
   !$ACC ROUTINE (w_1) WORKER SEQ ! { dg-error "Multiple loop axes" }
-  !$ACC ROUTINE (ABORT) VECTOR GANG ! { dg-error "Multiple loop axes" }
+  !$ACC ROUTINE (w_1) WORKER ! { dg-error "\\!\\\$ACC ROUTINE already applied" }
+  !$ACC ROUTINE (w_1) ! { dg-error "\\!\\\$ACC ROUTINE already applied" }
+  !$ACC ROUTINE (w_1) VECTOR WORKER ! { dg-error "Multiple loop axes" }
+  ! { dg-error "\\!\\\$ACC ROUTINE already applied" "already" { target *-*-* } 29 }
 
 CONTAINS
   SUBROUTINE sub_2