[gomp4] Remove device-specific filtering during parsing for OpenACC
diff mbox

Message ID 20150716163212.777040e9@octopus
State New
Headers show

Commit Message

Julian Brown July 16, 2015, 3:32 p.m. UTC
Hi,

This patch removes the device-specific filtering (for NVidia PTX) from
the parsing stages of the host compiler (for the device_type clause --
separately for C, C++ and Fortran) in favour of fully parsing the
device_type clauses, but not actually implementing anything for them
(device_type support is a feature that we're not planning to implement
just yet: the existing "support" is something of a red herring).

With this patch, the parsed device_type clauses will be ready at OMP
lowering time whenever we choose to do something with them (e.g.
transforming them into a representation that can be streamed out and
re-read by the appropriate offload compiler). The representation is
more-or-less the same for all supported languages, modulo
clause ordering.

I've altered the dtype-*.* tests to account for the new behaviour (and
to not use e.g. mixed-case "nVidia" or "acc_device_nvidia" names, which
are contrary to the recommendations in the spec).

OK to apply, or any comments?

Thanks,

Julian

ChangeLog

    gcc/
    * gimplify.c (gimplify_scan_omp_clauses): Handle
    OMP_CLAUSE_DEVICE_TYPE.
    (gimplify_adjust_omp_clauses): Likewise.
    * omp-low.c (scan_sharing_clauses): Likewise.
    (expand_omp_target): Add "sorry" for device_type support.
    * tree-pretty-print.c (dump_omp_clause): Add device_type support.
    * tree.c (walk_tree_1): Likewise.

    gcc/c/
    * c-parser.c (c_parser_oacc_all_clauses): Don't call
    c_oacc_filter_device_types.
    * c-typeck.c (c_finish_omp_clauses): Handle OMP_CLAUSE_DEVICE_TYPE.

    gcc/cp/
    * parser.c (cp_parser_oacc_all_clauses): Don't call
    c_oacc_filter_device_types.
    * pt.c (tsubst_omp_clauses): Handle OMP_CLAUSE_DEVICE_TYPE.
    * semantics.c (finish_omp_clauses): Likewise.

    gcc/fortran/
    * gfortran.h (gfc_omp_clauses): Change "dtype" int field to
    "device_types" gfc_expr_list.
    * openmp.c (gfc_match_omp_clauses): Remove scan_dtype variable (add
    OMP_CLAUSE_DEVICE_TYPE directly to appropriate bitmasks). Parse all
    device_type clauses without filtering.
    (OACC_LOOP_CLAUSE_DEVICE_TYPE_MASK)
    (OACC_KERNELS_CLAUSE_DEVICE_TYPE_MASK)
    (OACC_PARALLEL_CLAUSE_DEVICE_TYPE_MASK)
    (OACC_ROUTINE_CLAUSE_DEVICE_TYPE_MASK)
    (OACC_UPDATE_CLAUSE_DEVICE_TYPE_MASK): Add OMP_CLAUSE_DEVICE_TYPE.
    * trans-openmp.c (gfc_trans_omp_clauses): Translate device_type
    clauses, and split old body into...
    (gfc_trans_omp_clauses_1): New function.

    gcc/testsuite/
    * c-c++-common/goacc/dtype-1.c: Update test for new behaviour.
    * c-c++-common/goacc/dtype-2.c: Likewise.
    * c-c++-common/goacc/dtype-3.c: Likewise.
    * c-c++-common/goacc/dtype-4.c: Likewise.
    * gfortran.dg/goacc/dtype-1.f95: Likewise.
    * gfortran.dg/goacc/dtype-2.f95: Likewise.
    * gfortran.dg/goacc/dtype-3.f: Likewise.

Comments

Nathan Sidwell July 16, 2015, 5:15 p.m. UTC | #1
On 07/16/15 11:32, Julian Brown wrote:
> Hi,
>
> This patch removes the device-specific filtering (for NVidia PTX) from
> the parsing stages of the host compiler (for the device_type clause --
> separately for C, C++ and Fortran) in favour of fully parsing the
> device_type clauses, but not actually implementing anything for them
> (device_type support is a feature that we're not planning to implement
> just yet: the existing "support" is something of a red herring).
>
> With this patch, the parsed device_type clauses will be ready at OMP
> lowering time whenever we choose to do something with them (e.g.
> transforming them into a representation that can be streamed out and
> re-read by the appropriate offload compiler). The representation is
> more-or-less the same for all supported languages, modulo
> clause ordering.
>
> I've altered the dtype-*.* tests to account for the new behaviour (and
> to not use e.g. mixed-case "nVidia" or "acc_device_nvidia" names, which
> are contrary to the recommendations in the spec).
>
> OK to apply, or any comments?

thanks!
Thomas Schwinge July 17, 2015, 12:57 p.m. UTC | #2
Hi Julian!

On Thu, 16 Jul 2015 16:32:12 +0100, Julian Brown <julian@codesourcery.com> wrote:
> This patch removes the device-specific filtering (for NVidia PTX) from
> the parsing stages of the host compiler (for the device_type clause --
> separately for C, C++ and Fortran) in favour of fully parsing the
> device_type clauses, but not actually implementing anything for them
> (device_type support is a feature that we're not planning to implement
> just yet: the existing "support" is something of a red herring).
> 
> With this patch, the parsed device_type clauses will be ready at OMP
> lowering time whenever we choose to do something with them (e.g.
> transforming them into a representation that can be streamed out and
> re-read by the appropriate offload compiler). The representation is
> more-or-less the same for all supported languages

Thanks!

> modulo clause ordering.

Is that something that a) doesn't need to be/already has been addressed
(with your patch), or b) still needs to be addressed?


> I've altered the dtype-*.* tests to account for the new behaviour (and
> to not use e.g. mixed-case "nVidia" or "acc_device_nvidia" names, which
> are contrary to the recommendations in the spec).

OpenACC 2.0a indeed seems to suggest that device_type arguments are
case-sensitive -- contrary to the ACC_DEVICE_TYPE environment variable,
which probably is where the idea came from to parse them
case-insensitive.

As to the latter "invalid" names, I thought the idea has been to verify
that the clauses following such device_types clauses are indeed ignored
in the later processing.  (Obviously, there should've been comments
indicating that, as otherwise that's very confusing -- as we've just seen
-- due to the similarity to the runtime library's acc_device_* device
type values.)


> OK to apply, or any comments?

Your commit r225927 appears to have caused:

    [-PASS:-]{+FAIL: libgomp.fortran/declare-simd-2.f90   -O0  (internal compiler error)+}
    {+FAIL:+} libgomp.fortran/declare-simd-2.f90   -O0  (test for excess errors)
    [-PASS:-]{+UNRESOLVED:+} libgomp.fortran/declare-simd-2.f90   -O0  [-execution test-]
    [-PASS:-]{+compilation failed to produce executable+}
    [same for other optimization levels]

    [...]/source-gcc/libgomp/testsuite/libgomp.fortran/declare-simd-3.f90:17:0: internal compiler error: Segmentation fault
    0xc39b6f crash_signal
            [...]/source-gcc/gcc/toplev.c:352
    0x7043a8 gfc_trans_omp_clauses
            [...]/source-gcc/gcc/fortran/trans-openmp.c:2671
    0x7049a8 gfc_trans_omp_declare_simd(gfc_namespace*)
            [...]/source-gcc/gcc/fortran/trans-openmp.c:4589
    0x6b8542 gfc_get_extern_function_decl(gfc_symbol*)
            [...]/source-gcc/gcc/fortran/trans-decl.c:2025
    0x6b878d gfc_get_extern_function_decl(gfc_symbol*)
            [...]/source-gcc/gcc/fortran/trans-decl.c:1820
    0x6ce952 conv_function_val
            [...]/source-gcc/gcc/fortran/trans-expr.c:3601
    0x6ce952 gfc_conv_procedure_call(gfc_se*, gfc_symbol*, gfc_actual_arglist*, gfc_expr*, vec<tree_node*, va_gc, vl_embed>*)
            [...]/source-gcc/gcc/fortran/trans-expr.c:5873
    0x6cf4c2 gfc_conv_expr(gfc_se*, gfc_expr*)
            [...]/source-gcc/gcc/fortran/trans-expr.c:7391
    0x6d71d0 gfc_trans_assignment_1
            [...]/source-gcc/gcc/fortran/trans-expr.c:9127
    0x692465 trans_code
            [...]/source-gcc/gcc/fortran/trans.c:1674
    0x6fa457 gfc_trans_omp_code
            [...]/source-gcc/gcc/fortran/trans-openmp.c:2711
    0x705410 gfc_trans_omp_do
            [...]/source-gcc/gcc/fortran/trans-openmp.c:3459
    0x707f9f gfc_trans_omp_directive(gfc_code*)
            [...]/source-gcc/gcc/fortran/trans-openmp.c:4521
    0x6922b7 trans_code
            [...]/source-gcc/gcc/fortran/trans.c:1924
    0x6c0660 gfc_generate_function_code(gfc_namespace*)
            [...]/source-gcc/gcc/fortran/trans-decl.c:6231
    0x64d630 translate_all_program_units
            [...]/source-gcc/gcc/fortran/parse.c:5523
    0x64d630 gfc_parse_file()
            [...]/source-gcc/gcc/fortran/parse.c:5728
    0x68ef12 gfc_be_parse_file
            [...]/source-gcc/gcc/fortran/f95-lang.c:214


> --- a/gcc/c/c-parser.c
> +++ b/gcc/c/c-parser.c
> @@ -12439,10 +12439,7 @@ c_parser_oacc_all_clauses (c_parser *parser, omp_clause_mask mask,
>    c_parser_skip_to_pragma_eol (parser);
>  
>    if (finish_p)
> -    {
> -      clauses = c_oacc_filter_device_types (clauses);
> -      return c_finish_omp_clauses (clauses, true);
> -    }
> +    return c_finish_omp_clauses (clauses, true);
>  
>    return clauses;
>  }

In combination with the equivant change to
gcc/cp/parser.c:cp_parser_oacc_all_clauses,
gcc/c-family/c-omp.c:c_oacc_filter_device_types, and transitively also
the struct identifier_hasher and c_oacc_extract_device_id function
preceding it, are now unused.  (Not an exhaustive list; have not checked
which other auxilliary functions etc. Cesar has added in his device_type
changes.)  Does it make any sense to keep these for later, or dump them
now?


> --- a/gcc/c/c-typeck.c
> +++ b/gcc/c/c-typeck.c
> @@ -12568,6 +12568,10 @@ c_finish_omp_clauses (tree clauses, bool oacc)
>  	  pc = &OMP_CLAUSE_CHAIN (c);
>  	  continue;
>  
> +        case OMP_CLAUSE_DEVICE_TYPE:
> +	  pc = &OMP_CLAUSE_DEVICE_TYPE_CLAUSES (c);
> +	  continue;
> +
>  	case OMP_CLAUSE_INBRANCH:
>  	case OMP_CLAUSE_NOTINBRANCH:
>  	  if (branch_seen)

From a quick glance only, this seems to be different from the C++ front
end (have not checked Fortran).

I have not looked at what the front end parsing is now actually doing; is
it just attaching any clauses following a device_type clause to the
latter?  (The same should be done for all front ends, obviously.  Even if
it's not important right now, because of the sorry diagnostic that will
be emitted later on as soon as there is one device_type clause, this
should best be addressed now, while you still remember what's going on
here ;-) so that there will be no bad surprises once we actually
implement the handling in OMP lowering/streaming/device compilers.)

Do we need manually need to take care to "finalize" (c_finish_omp_clauses
et al.) such "masked" clause chains, or will the right thing happen
automatically?

For reference, C++ does not appear to use OMP_CLAUSE_DEVICE_TYPE_CLAUSES
here:

> --- a/gcc/cp/pt.c
> +++ b/gcc/cp/pt.c
> @@ -13666,6 +13666,7 @@ tsubst_omp_clauses (tree clauses, bool declare_simd,
>  	case OMP_CLAUSE_AUTO:
>  	case OMP_CLAUSE_SEQ:
>  	case OMP_CLAUSE_TILE:
> +	case OMP_CLAUSE_DEVICE_TYPE:
>  	  break;
>  	default:
>  	  gcc_unreachable ();
> --- a/gcc/cp/semantics.c
> +++ b/gcc/cp/semantics.c
> @@ -5951,6 +5951,7 @@ finish_omp_clauses (tree clauses, bool oacc)
>  	case OMP_CLAUSE_BIND:
>  	case OMP_CLAUSE_NOHOST:
>  	case OMP_CLAUSE_TILE:
> +	case OMP_CLAUSE_DEVICE_TYPE:
>  	  break;
>  
>  	case OMP_CLAUSE_INBRANCH:

(I have not checked Fortran.)


I also remember that I had a comment regarding device_type handling in
gcc/c-family/c-omp.c:c_oacc_split_loop_clauses,
<http://news.gmane.org/find-root.php?message_id=%3C87zj2z2e4x.fsf%40schwinge.name%3E>.
Given that these clauses are now no longer being handled to completion in
the front ends, does this need to be addressed?


> --- a/gcc/omp-low.c
> +++ b/gcc/omp-low.c
> @@ -2028,6 +2028,7 @@ scan_sharing_clauses (tree clauses, omp_context *ctx)
>  	case OMP_CLAUSE_AUTO:
>  	case OMP_CLAUSE_SEQ:
>  	case OMP_CLAUSE_TILE:
> +	case OMP_CLAUSE_DEVICE_TYPE:
>  	  break;
>  
>  	case OMP_CLAUSE_ALIGNED:
> @@ -2163,6 +2164,7 @@ scan_sharing_clauses (tree clauses, omp_context *ctx)
>  	case OMP_CLAUSE_AUTO:
>  	case OMP_CLAUSE_SEQ:
>  	case OMP_CLAUSE_TILE:
> +	case OMP_CLAUSE_DEVICE_TYPE:
>  	  break;
>  
>  	case OMP_CLAUSE_DEVICE_RESIDENT:
> @@ -9774,6 +9776,10 @@ expand_omp_target (struct omp_region *region)
>  	tree t_async;
>  	int t_wait_idx;
>  
> +	c = find_omp_clause (clauses, OMP_CLAUSE_DEVICE_TYPE);
> +	if (c)
> +	  sorry ("device_type clause is not supported yet");
> +
>  	/* Default values for t_async.  */
>  	t_async = fold_convert_loc (gimple_location (entry_stmt),
>  				    integer_type_node,

Typically, sorry messages are emitted in the generic clause handling code
in scan_sharing_clauses.


Grüße,
 Thomas
Julian Brown July 17, 2015, 1:43 p.m. UTC | #3
On Fri, 17 Jul 2015 14:57:14 +0200
Thomas Schwinge <thomas@codesourcery.com> wrote:

> Hi Julian!
> 
> On Thu, 16 Jul 2015 16:32:12 +0100, Julian Brown
> <julian@codesourcery.com> wrote:
> > This patch removes the device-specific filtering (for NVidia PTX)
> > from the parsing stages of the host compiler (for the device_type
> > clause -- separately for C, C++ and Fortran) in favour of fully
> > parsing the device_type clauses, but not actually implementing
> > anything for them (device_type support is a feature that we're not
> > planning to implement just yet: the existing "support" is something
> > of a red herring).
> > 
> > With this patch, the parsed device_type clauses will be ready at OMP
> > lowering time whenever we choose to do something with them (e.g.
> > transforming them into a representation that can be streamed out and
> > re-read by the appropriate offload compiler). The representation is
> > more-or-less the same for all supported languages
> 
> Thanks!
> 
> > modulo clause ordering.
> 
> Is that something that a) doesn't need to be/already has been
> addressed (with your patch), or b) still needs to be addressed?

It's something that doesn't matter, I think: clauses are chained
together like this:

  num_gangs
  num_workers
  ...
  |
  device_type(foo)
  \__num_gangs    (OMP_CLAUSE_DEVICE_TYPE_CLAUSES)
  |  num_workers
  |  ...
  device_type(bar)
  \__num_gangs
  |  num_workers
  |  ...
  V
  (OMP_CLAUSE_CHAIN)

"foo" and "bar" are OMP_CLAUSE_DEVICE_TYPE_DEVICES -- tree lists. The
Fortran front-end will emit num_gangs, num_workers etc. clauses in a
fixed order (irrespective of their order in the source program), but the
C and C++ frontends will emit them in the (reverse of the) order
encountered.

There isn't really a consumer for this information yet, but when there
is, it will just have to not care about that (which should be
straightforward, I think).

> > I've altered the dtype-*.* tests to account for the new behaviour
> > (and to not use e.g. mixed-case "nVidia" or "acc_device_nvidia"
> > names, which are contrary to the recommendations in the spec).
> 
> OpenACC 2.0a indeed seems to suggest that device_type arguments are
> case-sensitive -- contrary to the ACC_DEVICE_TYPE environment
> variable, which probably is where the idea came from to parse them
> case-insensitive.
> 
> As to the latter "invalid" names, I thought the idea has been to
> verify that the clauses following such device_types clauses are
> indeed ignored in the later processing.  (Obviously, there should've
> been comments indicating that, as otherwise that's very confusing --
> as we've just seen -- due to the similarity to the runtime library's
> acc_device_* device type values.)

Yes, and there are still some tests for that functionality. I figured
there wasn't much point in "over-testing" it, especially since none of
this code does that much yet.

> > OK to apply, or any comments?
> 
> Your commit r225927 appears to have caused:
> 
>     [-PASS:-]{+FAIL: libgomp.fortran/declare-simd-2.f90   -O0
> (internal compiler error)+} {+FAIL:+}
> libgomp.fortran/declare-simd-2.f90   -O0  (test for excess errors)
> [-PASS:-]{+UNRESOLVED:+} libgomp.fortran/declare-simd-2.f90   -O0
> [-execution test-] [-PASS:-]{+compilation failed to produce
> executable+} [same for other optimization levels]
> 
>     [...]/source-gcc/libgomp/testsuite/libgomp.fortran/declare-simd-3.f90:17:0:
> internal compiler error: Segmentation fault 0xc39b6f crash_signal
>             [...]/source-gcc/gcc/toplev.c:352
>     0x7043a8 gfc_trans_omp_clauses
>             [...]/source-gcc/gcc/fortran/trans-openmp.c:2671
>     0x7049a8 gfc_trans_omp_declare_simd(gfc_namespace*)
>             [...]/source-gcc/gcc/fortran/trans-openmp.c:4589
>     0x6b8542 gfc_get_extern_function_decl(gfc_symbol*)
>             [...]/source-gcc/gcc/fortran/trans-decl.c:2025
>     0x6b878d gfc_get_extern_function_decl(gfc_symbol*)
>             [...]/source-gcc/gcc/fortran/trans-decl.c:1820
>     0x6ce952 conv_function_val
>             [...]/source-gcc/gcc/fortran/trans-expr.c:3601
>     0x6ce952 gfc_conv_procedure_call(gfc_se*, gfc_symbol*,
> gfc_actual_arglist*, gfc_expr*, vec<tree_node*, va_gc, vl_embed>*)
> [...]/source-gcc/gcc/fortran/trans-expr.c:5873 0x6cf4c2
> gfc_conv_expr(gfc_se*, gfc_expr*)
> [...]/source-gcc/gcc/fortran/trans-expr.c:7391 0x6d71d0
> gfc_trans_assignment_1 [...]/source-gcc/gcc/fortran/trans-expr.c:9127
>     0x692465 trans_code
>             [...]/source-gcc/gcc/fortran/trans.c:1674
>     0x6fa457 gfc_trans_omp_code
>             [...]/source-gcc/gcc/fortran/trans-openmp.c:2711
>     0x705410 gfc_trans_omp_do
>             [...]/source-gcc/gcc/fortran/trans-openmp.c:3459
>     0x707f9f gfc_trans_omp_directive(gfc_code*)
>             [...]/source-gcc/gcc/fortran/trans-openmp.c:4521
>     0x6922b7 trans_code
>             [...]/source-gcc/gcc/fortran/trans.c:1924
>     0x6c0660 gfc_generate_function_code(gfc_namespace*)
>             [...]/source-gcc/gcc/fortran/trans-decl.c:6231
>     0x64d630 translate_all_program_units
>             [...]/source-gcc/gcc/fortran/parse.c:5523
>     0x64d630 gfc_parse_file()
>             [...]/source-gcc/gcc/fortran/parse.c:5728
>     0x68ef12 gfc_be_parse_file
>             [...]/source-gcc/gcc/fortran/f95-lang.c:214

Thanks, I'll have a look at this.

> > --- a/gcc/c/c-parser.c
> > +++ b/gcc/c/c-parser.c
> > @@ -12439,10 +12439,7 @@ c_parser_oacc_all_clauses (c_parser
> > *parser, omp_clause_mask mask, c_parser_skip_to_pragma_eol (parser);
> >  
> >    if (finish_p)
> > -    {
> > -      clauses = c_oacc_filter_device_types (clauses);
> > -      return c_finish_omp_clauses (clauses, true);
> > -    }
> > +    return c_finish_omp_clauses (clauses, true);
> >  
> >    return clauses;
> >  }
> 
> In combination with the equivant change to
> gcc/cp/parser.c:cp_parser_oacc_all_clauses,
> gcc/c-family/c-omp.c:c_oacc_filter_device_types, and transitively also
> the struct identifier_hasher and c_oacc_extract_device_id function
> preceding it, are now unused.  (Not an exhaustive list; have not
> checked which other auxilliary functions etc. Cesar has added in his
> device_type changes.)  Does it make any sense to keep these for
> later, or dump them now?

It probably makes sense to dump them.

> > --- a/gcc/c/c-typeck.c
> > +++ b/gcc/c/c-typeck.c
> > @@ -12568,6 +12568,10 @@ c_finish_omp_clauses (tree clauses, bool
> > oacc) pc = &OMP_CLAUSE_CHAIN (c);
> >  	  continue;
> >  
> > +        case OMP_CLAUSE_DEVICE_TYPE:
> > +	  pc = &OMP_CLAUSE_DEVICE_TYPE_CLAUSES (c);
> > +	  continue;
> > +
> >  	case OMP_CLAUSE_INBRANCH:
> >  	case OMP_CLAUSE_NOTINBRANCH:
> >  	  if (branch_seen)
> 
> From a quick glance only, this seems to be different from the C++
> front end (have not checked Fortran).
> 
> I have not looked at what the front end parsing is now actually
> doing; is it just attaching any clauses following a device_type
> clause to the latter?  (The same should be done for all front ends,
> obviously.  Even if it's not important right now, because of the
> sorry diagnostic that will be emitted later on as soon as there is
> one device_type clause, this should best be addressed now, while you
> still remember what's going on here ;-) so that there will be no bad
> surprises once we actually implement the handling in OMP
> lowering/streaming/device compilers.)
> 
> Do we need manually need to take care to
> "finalize" (c_finish_omp_clauses et al.) such "masked" clause chains,
> or will the right thing happen automatically?
> 
> For reference, C++ does not appear to use
> OMP_CLAUSE_DEVICE_TYPE_CLAUSES here:
> 
> > --- a/gcc/cp/pt.c
> > +++ b/gcc/cp/pt.c
> > @@ -13666,6 +13666,7 @@ tsubst_omp_clauses (tree clauses, bool
> > declare_simd, case OMP_CLAUSE_AUTO:
> >  	case OMP_CLAUSE_SEQ:
> >  	case OMP_CLAUSE_TILE:
> > +	case OMP_CLAUSE_DEVICE_TYPE:
> >  	  break;
> >  	default:
> >  	  gcc_unreachable ();
> > --- a/gcc/cp/semantics.c
> > +++ b/gcc/cp/semantics.c
> > @@ -5951,6 +5951,7 @@ finish_omp_clauses (tree clauses, bool oacc)
> >  	case OMP_CLAUSE_BIND:
> >  	case OMP_CLAUSE_NOHOST:
> >  	case OMP_CLAUSE_TILE:
> > +	case OMP_CLAUSE_DEVICE_TYPE:
> >  	  break;
> >  
> >  	case OMP_CLAUSE_INBRANCH:
> 
> (I have not checked Fortran.)

Hmm, not sure about that.

> I also remember that I had a comment regarding device_type handling in
> gcc/c-family/c-omp.c:c_oacc_split_loop_clauses,
> <http://news.gmane.org/find-root.php?message_id=%3C87zj2z2e4x.fsf%40schwinge.name%3E>.
> Given that these clauses are now no longer being handled to
> completion in the front ends, does this need to be addressed?

Nor that.

> > --- a/gcc/omp-low.c
> > +++ b/gcc/omp-low.c
> > @@ -2028,6 +2028,7 @@ scan_sharing_clauses (tree clauses,
> > omp_context *ctx) case OMP_CLAUSE_AUTO:
> >  	case OMP_CLAUSE_SEQ:
> >  	case OMP_CLAUSE_TILE:
> > +	case OMP_CLAUSE_DEVICE_TYPE:
> >  	  break;
> >  
> >  	case OMP_CLAUSE_ALIGNED:
> > @@ -2163,6 +2164,7 @@ scan_sharing_clauses (tree clauses,
> > omp_context *ctx) case OMP_CLAUSE_AUTO:
> >  	case OMP_CLAUSE_SEQ:
> >  	case OMP_CLAUSE_TILE:
> > +	case OMP_CLAUSE_DEVICE_TYPE:
> >  	  break;
> >  
> >  	case OMP_CLAUSE_DEVICE_RESIDENT:
> > @@ -9774,6 +9776,10 @@ expand_omp_target (struct omp_region *region)
> >  	tree t_async;
> >  	int t_wait_idx;
> >  
> > +	c = find_omp_clause (clauses, OMP_CLAUSE_DEVICE_TYPE);
> > +	if (c)
> > +	  sorry ("device_type clause is not supported yet");
> > +
> >  	/* Default values for t_async.  */
> >  	t_async = fold_convert_loc (gimple_location (entry_stmt),
> >  				    integer_type_node,
> 
> Typically, sorry messages are emitted in the generic clause handling
> code in scan_sharing_clauses.

OK, I guess that can be moved in a follow-up patch.

Cheers,

Julian

Patch
diff mbox

commit 123298186bb8ce87f84b6a3a72743939d4fdae11
Author: Julian Brown <julian@codesourcery.com>
Date:   Thu Jul 16 08:06:01 2015 -0700

    Fix device_type parsing, add sorry() for missing implementation of remainder.

diff --git a/gcc/c/c-parser.c b/gcc/c/c-parser.c
index 1c65abf..d90c18e 100644
--- a/gcc/c/c-parser.c
+++ b/gcc/c/c-parser.c
@@ -12439,10 +12439,7 @@  c_parser_oacc_all_clauses (c_parser *parser, omp_clause_mask mask,
   c_parser_skip_to_pragma_eol (parser);
 
   if (finish_p)
-    {
-      clauses = c_oacc_filter_device_types (clauses);
-      return c_finish_omp_clauses (clauses, true);
-    }
+    return c_finish_omp_clauses (clauses, true);
 
   return clauses;
 }
diff --git a/gcc/c/c-typeck.c b/gcc/c/c-typeck.c
index 98b8e3d..dcc246c 100644
--- a/gcc/c/c-typeck.c
+++ b/gcc/c/c-typeck.c
@@ -12568,6 +12568,10 @@  c_finish_omp_clauses (tree clauses, bool oacc)
 	  pc = &OMP_CLAUSE_CHAIN (c);
 	  continue;
 
+        case OMP_CLAUSE_DEVICE_TYPE:
+	  pc = &OMP_CLAUSE_DEVICE_TYPE_CLAUSES (c);
+	  continue;
+
 	case OMP_CLAUSE_INBRANCH:
 	case OMP_CLAUSE_NOTINBRANCH:
 	  if (branch_seen)
diff --git a/gcc/cp/parser.c b/gcc/cp/parser.c
index 28f0048..80aabed 100644
--- a/gcc/cp/parser.c
+++ b/gcc/cp/parser.c
@@ -29879,10 +29879,7 @@  cp_parser_oacc_all_clauses (cp_parser *parser, omp_clause_mask mask,
   cp_parser_skip_to_pragma_eol (parser, pragma_tok);
 
   if (finish_p)
-    {
-      clauses = c_oacc_filter_device_types (clauses);
-      return finish_omp_clauses (clauses, true);
-    }
+    return finish_omp_clauses (clauses, true);
 
   return clauses;
 }
diff --git a/gcc/cp/pt.c b/gcc/cp/pt.c
index 205dc30..056b2c1 100644
--- a/gcc/cp/pt.c
+++ b/gcc/cp/pt.c
@@ -13666,6 +13666,7 @@  tsubst_omp_clauses (tree clauses, bool declare_simd,
 	case OMP_CLAUSE_AUTO:
 	case OMP_CLAUSE_SEQ:
 	case OMP_CLAUSE_TILE:
+	case OMP_CLAUSE_DEVICE_TYPE:
 	  break;
 	default:
 	  gcc_unreachable ();
diff --git a/gcc/cp/semantics.c b/gcc/cp/semantics.c
index 8935eb6..1ce1dfa 100644
--- a/gcc/cp/semantics.c
+++ b/gcc/cp/semantics.c
@@ -5951,6 +5951,7 @@  finish_omp_clauses (tree clauses, bool oacc)
 	case OMP_CLAUSE_BIND:
 	case OMP_CLAUSE_NOHOST:
 	case OMP_CLAUSE_TILE:
+	case OMP_CLAUSE_DEVICE_TYPE:
 	  break;
 
 	case OMP_CLAUSE_INBRANCH:
diff --git a/gcc/fortran/gfortran.h b/gcc/fortran/gfortran.h
index 89f6816..12d46a9 100644
--- a/gcc/fortran/gfortran.h
+++ b/gcc/fortran/gfortran.h
@@ -1267,7 +1267,7 @@  typedef struct gfc_omp_clauses
   struct gfc_expr *num_workers_expr;
   struct gfc_expr *vector_length_expr;
   struct gfc_symbol *routine_bind;
-  int dtype;
+  gfc_expr_list *device_types;
   struct gfc_omp_clauses *dtype_clauses;
   gfc_expr_list *wait_list;
   gfc_expr_list *tile_list;
diff --git a/gcc/fortran/openmp.c b/gcc/fortran/openmp.c
index c3d3ccf..4be3417 100644
--- a/gcc/fortran/openmp.c
+++ b/gcc/fortran/openmp.c
@@ -507,7 +507,6 @@  gfc_match_omp_clauses (gfc_omp_clauses **cp, uint64_t mask,
 {
   gfc_omp_clauses *base_clauses, *c = gfc_get_omp_clauses ();
   locus old_loc;
-  bool scan_dtype = false;
 
   base_clauses = c;
 
@@ -1154,39 +1153,50 @@  gfc_match_omp_clauses (gfc_omp_clauses **cp, uint64_t mask,
       if ((mask & OMP_CLAUSE_DEVICE) && c->device == NULL
 	  && gfc_match ("device ( %e )", &c->device) == MATCH_YES)
 	continue;
-      if (((mask & OMP_CLAUSE_DEVICE_TYPE) || scan_dtype)
+      if ((mask & OMP_CLAUSE_DEVICE_TYPE)
 	  && (gfc_match ("device_type ( ") == MATCH_YES
 	      || gfc_match ("dtype ( ") == MATCH_YES))
 	{
-	  int device = GOMP_DEVICE_NONE;
 	  gfc_omp_clauses *t = gfc_get_omp_clauses ();
+	  gfc_expr_list *p = NULL, *head, *tail;
 
-	  c->dtype_clauses = t;
-	  c = t;
+	  head = tail = NULL;
 
 	  if (gfc_match (" * ") == MATCH_YES)
-	    device = GOMP_DEVICE_DEFAULT;
+	    {
+	      head = p = gfc_get_expr_list ();
+	      p->expr
+	        = gfc_get_character_expr (gfc_default_character_kind,
+					  &gfc_current_locus, "*", 1);
+	    }
 	  else
 	    {
 	      char n[GFC_MAX_SYMBOL_LEN + 1];
 
-	      do {
-		if (gfc_match (" %n ", n) == MATCH_YES)
-		  {
-		    if (!strcasecmp ("nvidia", n))
-		      device = GOMP_DEVICE_NVIDIA_PTX;
-		    else
-		      {
-			/* The OpenACC technical committee advises compilers
-			   to silently ignore unknown devices.  */
-		      }
-		  }
-		else
-		  {
-		    gfc_error ("missing device_type argument");
-		    continue;
-		  }
-	      } while (gfc_match (" , ") == MATCH_YES);
+	      do
+		{
+		  p = gfc_get_expr_list ();
+
+		  if (head == NULL)
+	            head = tail = p;
+		  else
+	            {
+		      tail->next = p;
+		      tail = p;
+		    }
+
+		  if (gfc_match (" %n ", n) == MATCH_YES)
+		    p->expr
+		      = gfc_get_character_expr (gfc_default_character_kind,
+						&gfc_current_locus, n,
+						strlen (n));
+		  else
+		    {
+		      gfc_error ("missing device_type argument");
+		      continue;
+		    }
+		}
+	      while (gfc_match (" , ") == MATCH_YES);
 	    }
 
 	  /* Consume the trailing ')'.  */
@@ -1196,9 +1206,12 @@  gfc_match_omp_clauses (gfc_omp_clauses **cp, uint64_t mask,
 	      continue;
 	    }
 
-	  c->dtype = device;
+	  /* Move to chained pointer for parsing remaining clauses.  */
+	  c->device_types = head;
+	  c->dtype_clauses = t;
+	  c = t;
+
 	  mask = dtype_mask;
-	  scan_dtype = true;
 	  continue;
 	}
       if ((mask & OMP_CLAUSE_THREAD_LIMIT) && c->thread_limit == NULL
@@ -1259,69 +1272,6 @@  gfc_match_omp_clauses (gfc_omp_clauses **cp, uint64_t mask,
       return MATCH_ERROR;
     }
 
-  /* Filter out the device_type clauses.  */
-  if (base_clauses->dtype_clauses)
-    {
-      gfc_omp_clauses *t;
-      gfc_omp_clauses *seen_default = NULL;
-      gfc_omp_clauses *seen_nvidia = NULL;
-
-      /* Scan for device_type clauses.  */
-      c = base_clauses->dtype_clauses;
-      while (c)
-	{
-	  if (c->dtype == GOMP_DEVICE_DEFAULT)
-	    {
-	      if (seen_default)
-		gfc_error ("duplicate device_type (*)");
-	      else
-		seen_default = c;
-	    }
-	  else if (c->dtype == GOMP_DEVICE_NVIDIA_PTX)
-	    {
-	      if (seen_nvidia)
-		gfc_error ("duplicate device_type (nvidia)");
-	      else
-		seen_nvidia = c;
-	    }
-	  c = c->dtype_clauses;
-	}
-
-      /* Update the clauses in the original set of clauses.  */
-      c = seen_nvidia ? seen_nvidia : seen_default;
-      if (c)
-	{
-#define acc_clause0(mask) do if (c->mask) { base_clauses->mask = 1; } while (0)
-#define acc_clause1(mask, expr, type) do if (c->mask) { type t; \
-	      base_clauses->mask = 1; t = base_clauses->expr; \
-	      base_clauses->expr = c->expr; c->expr = t; } while (0)
-
-	  acc_clause1 (acc_collapse, collapse, int);
-	  acc_clause1 (gang, gang_expr, gfc_expr *);
-	  acc_clause1 (worker, worker_expr, gfc_expr *);
-	  acc_clause1 (vector, vector_expr, gfc_expr *);
-	  acc_clause0 (par_auto);
-	  acc_clause0 (independent);
-	  acc_clause0 (seq);
-	  acc_clause1 (tile, tile_list, gfc_expr_list *);
-	  acc_clause1 (async, async_expr, gfc_expr *);
-	  acc_clause1 (wait, wait_list, gfc_expr_list *);
-	  acc_clause1 (num_gangs, num_gangs_expr, gfc_expr *);
-	  acc_clause1 (num_workers, num_workers_expr, gfc_expr *);
-	  acc_clause1 (vector_length, vector_length_expr, gfc_expr *);
-	  acc_clause1 (bind, routine_bind, gfc_symbol *);
-	}
-
-      /* Remove the device_type clauses.  */
-      c = base_clauses->dtype_clauses;
-      while (c)
-	{
-	  t = c->dtype_clauses;
-	  gfc_free_omp_clauses (c);
-	  c = t;
-	}      
-    }
-
   *cp = base_clauses;
   return MATCH_YES;
 }
@@ -1384,17 +1334,18 @@  gfc_match_omp_clauses (gfc_omp_clauses **cp, uint64_t mask,
 
 #define OACC_LOOP_CLAUSE_DEVICE_TYPE_MASK \
   (OMP_CLAUSE_COLLAPSE | OMP_CLAUSE_GANG | OMP_CLAUSE_WORKER		    \
-   | OMP_CLAUSE_VECTOR | OMP_CLAUSE_AUTO | OMP_CLAUSE_SEQ | OMP_CLAUSE_TILE)
+   | OMP_CLAUSE_VECTOR | OMP_CLAUSE_AUTO | OMP_CLAUSE_SEQ | OMP_CLAUSE_TILE \
+   | OMP_CLAUSE_DEVICE_TYPE)
 #define OACC_KERNELS_CLAUSE_DEVICE_TYPE_MASK \
-  (OMP_CLAUSE_ASYNC | OMP_CLAUSE_WAIT)
+  (OMP_CLAUSE_ASYNC | OMP_CLAUSE_WAIT | OMP_CLAUSE_DEVICE_TYPE)
 #define OACC_PARALLEL_CLAUSE_DEVICE_TYPE_MASK				   \
   (OMP_CLAUSE_ASYNC | OMP_CLAUSE_NUM_GANGS | OMP_CLAUSE_NUM_WORKERS	   \
-   | OMP_CLAUSE_VECTOR_LENGTH | OMP_CLAUSE_WAIT)
+   | OMP_CLAUSE_VECTOR_LENGTH | OMP_CLAUSE_WAIT | OMP_CLAUSE_DEVICE_TYPE)
 #define OACC_ROUTINE_CLAUSE_DEVICE_TYPE_MASK				   \
    (OMP_CLAUSE_GANG | OMP_CLAUSE_WORKER | OMP_CLAUSE_VECTOR		   \
-    | OMP_CLAUSE_SEQ | OMP_CLAUSE_BIND)
+    | OMP_CLAUSE_SEQ | OMP_CLAUSE_BIND | OMP_CLAUSE_DEVICE_TYPE)
 #define OACC_UPDATE_CLAUSE_DEVICE_TYPE_MASK				   \
-   (OMP_CLAUSE_ASYNC | OMP_CLAUSE_WAIT)
+   (OMP_CLAUSE_ASYNC | OMP_CLAUSE_WAIT | OMP_CLAUSE_DEVICE_TYPE)
 
 
 match
diff --git a/gcc/fortran/trans-openmp.c b/gcc/fortran/trans-openmp.c
index 56e65ec..20a1e65 100644
--- a/gcc/fortran/trans-openmp.c
+++ b/gcc/fortran/trans-openmp.c
@@ -1730,8 +1730,8 @@  gfc_convert_expr_to_tree (stmtblock_t *block, gfc_expr *expr)
 }
 
 static tree
-gfc_trans_omp_clauses (stmtblock_t *block, gfc_omp_clauses *clauses,
-		       locus where, bool declare_simd = false)
+gfc_trans_omp_clauses_1 (stmtblock_t *block, gfc_omp_clauses *clauses,
+			 locus where, bool declare_simd = false)
 {
   tree omp_clauses = NULL_TREE, chunk_size, c;
   int list;
@@ -2661,6 +2661,45 @@  gfc_trans_omp_clauses (stmtblock_t *block, gfc_omp_clauses *clauses,
   return nreverse (omp_clauses);
 }
 
+static tree
+gfc_trans_omp_clauses (stmtblock_t *block, gfc_omp_clauses *clauses,
+		       locus where, bool declare_simd = false)
+{
+  tree omp_clauses = gfc_trans_omp_clauses_1 (block, clauses, where,
+					      declare_simd);
+
+  for (; clauses->device_types; clauses = clauses->dtype_clauses)
+    {
+      tree c, following_clauses = NULL_TREE, dev_list = NULL_TREE;
+
+      if (clauses->dtype_clauses)
+        {
+	  gfc_expr_list *p;
+
+          following_clauses
+	    = gfc_trans_omp_clauses_1 (block, clauses->dtype_clauses,
+				       where, declare_simd);
+
+	  for (p = clauses->device_types; p; p = p->next)
+	    {
+	      tree dev = gfc_conv_constant_to_tree (p->expr);
+	      dev = get_identifier (TREE_STRING_POINTER (dev));
+	      if (dev_list)
+		dev_list = chainon (dev_list, dev);
+	      else
+	        dev_list = dev;
+	    }
+
+	  c = build_omp_clause (where.lb->location, OMP_CLAUSE_DEVICE_TYPE);
+	  OMP_CLAUSE_DEVICE_TYPE_CLAUSES (c) = following_clauses;
+	  OMP_CLAUSE_DEVICE_TYPE_DEVICES (c) = dev_list;
+	  omp_clauses = gfc_trans_add_clause (c, omp_clauses);
+	}
+    }
+
+  return omp_clauses;
+}
+
 /* Like gfc_trans_code, but force creation of a BIND_EXPR around it.  */
 
 static tree
diff --git a/gcc/gimplify.c b/gcc/gimplify.c
index efae2e5..f5ec04a 100644
--- a/gcc/gimplify.c
+++ b/gcc/gimplify.c
@@ -6616,6 +6616,7 @@  gimplify_scan_omp_clauses (tree *list_p, gimple_seq *pre_p,
 	case OMP_CLAUSE_PROC_BIND:
 	case OMP_CLAUSE_SAFELEN:
 	case OMP_CLAUSE_TILE:
+	case OMP_CLAUSE_DEVICE_TYPE:
 	  break;
 
 	case OMP_CLAUSE_ALIGNED:
@@ -7035,6 +7036,7 @@  gimplify_adjust_omp_clauses (gimple_seq *pre_p, tree *list_p)
 	case OMP_CLAUSE_AUTO:
 	case OMP_CLAUSE_SEQ:
 	case OMP_CLAUSE_TILE:
+	case OMP_CLAUSE_DEVICE_TYPE:
 	  break;
 
 	default:
diff --git a/gcc/omp-low.c b/gcc/omp-low.c
index 0419dcd..37b853f 100644
--- a/gcc/omp-low.c
+++ b/gcc/omp-low.c
@@ -2028,6 +2028,7 @@  scan_sharing_clauses (tree clauses, omp_context *ctx)
 	case OMP_CLAUSE_AUTO:
 	case OMP_CLAUSE_SEQ:
 	case OMP_CLAUSE_TILE:
+	case OMP_CLAUSE_DEVICE_TYPE:
 	  break;
 
 	case OMP_CLAUSE_ALIGNED:
@@ -2163,6 +2164,7 @@  scan_sharing_clauses (tree clauses, omp_context *ctx)
 	case OMP_CLAUSE_AUTO:
 	case OMP_CLAUSE_SEQ:
 	case OMP_CLAUSE_TILE:
+	case OMP_CLAUSE_DEVICE_TYPE:
 	  break;
 
 	case OMP_CLAUSE_DEVICE_RESIDENT:
@@ -9774,6 +9776,10 @@  expand_omp_target (struct omp_region *region)
 	tree t_async;
 	int t_wait_idx;
 
+	c = find_omp_clause (clauses, OMP_CLAUSE_DEVICE_TYPE);
+	if (c)
+	  sorry ("device_type clause is not supported yet");
+
 	/* Default values for t_async.  */
 	t_async = fold_convert_loc (gimple_location (entry_stmt),
 				    integer_type_node,
diff --git a/gcc/testsuite/c-c++-common/goacc/dtype-1.c b/gcc/testsuite/c-c++-common/goacc/dtype-1.c
index e6d6e54..665f21e 100644
--- a/gcc/testsuite/c-c++-common/goacc/dtype-1.c
+++ b/gcc/testsuite/c-c++-common/goacc/dtype-1.c
@@ -1,5 +1,6 @@ 
 /* { dg-do compile } */
 /* { dg-options "-fopenacc -fdump-tree-omplower" } */
+/* { dg-prune-output "sorry, unimplemented: device_type clause is not supported yet" } */
 
 void
 test ()
@@ -8,7 +9,7 @@  test ()
 
   /* ACC PARALLEL DEVICE_TYPE: */
 
-#pragma acc parallel device_type (nVidia) async (1) num_gangs (100) num_workers (100) vector_length (32) wait (1)
+#pragma acc parallel device_type (nvidia) async (1) num_gangs (100) num_workers (100) vector_length (32) wait (1)
   {
   }
 
@@ -45,49 +46,33 @@  test ()
   /* ACC LOOP DEVICE_TYPE: */
 
 #pragma acc parallel
-#pragma acc loop dtype (nVidia) gang tile (1)
+#pragma acc loop dtype (nvidia) gang tile (1)
   for (i1 = 1; i1 < 10; i1++)
-#pragma acc loop device_type (nVidia) worker collapse (1)
+#pragma acc loop device_type (nvidia) worker collapse (1)
     for (i2 = 1; i2 < 10; i2++)
-#pragma acc loop device_type (nVidia) vector
+#pragma acc loop device_type (nvidia) vector
       for (i3 = 1; i3 < 10; i3++)
-#pragma acc loop dtype (nVidia) auto
+#pragma acc loop dtype (nvidia) auto
 	for (i4 = 1; i4 < 10; i4++)
-#pragma acc loop dtype (nVidia)
+#pragma acc loop dtype (nvidia)
 	  for (i5 = 1; i5 < 10; i5++)
-#pragma acc loop device_type (nVidia) seq
+#pragma acc loop device_type (nvidia) seq
 	    for (i6 = 1; i6 < 10; i6++)
 	      {
 	      }
 
 #pragma acc parallel
-#pragma acc loop device_type (nVidia) gang tile (1) dtype (*) seq
+#pragma acc loop device_type (nvidia) gang tile (1) dtype (*) seq
   for (i1 = 1; i1 < 10; i1++)
-#pragma acc loop dtype (nVidia) worker collapse (1) device_type (*) seq
+#pragma acc loop dtype (nvidia) worker collapse (1) device_type (*) seq
     for (i2 = 1; i2 < 10; i2++)
-#pragma acc loop device_type (nVidia) vector dtype (*) seq
+#pragma acc loop device_type (nvidia) vector dtype (*) seq
       for (i3 = 1; i3 < 10; i3++)
-#pragma acc loop dtype (nVidia) auto device_type (*) seq
+#pragma acc loop dtype (nvidia) auto device_type (*) seq
 	for (i4 = 1; i4 < 10; i4++)
-#pragma acc loop device_type (nVidia) device_type (*) seq
+#pragma acc loop device_type (nvidia) device_type (*) seq
 	  for (i5 = 1; i5 < 10; i5++)
-#pragma acc loop device_type (nVidia) seq
-	    for (i6 = 1; i6 < 10; i6++)
-	      {
-	      }
-
-#pragma acc parallel
-#pragma acc loop dtype (nVidiaGPU) gang tile (1) device_type (*) seq
-  for (i1 = 1; i1 < 10; i1++)
-#pragma acc loop device_type (nVidiaGPU) worker collapse (1) dtype (*) seq
-    for (i2 = 1; i2 < 10; i2++)
-#pragma acc loop dtype (nVidiaGPU) vector device_type (*) seq
-      for (i3 = 1; i3 < 10; i3++)
-#pragma acc loop device_type (nVidiaGPU) auto device_type (*) seq
-	for (i4 = 1; i4 < 10; i4++)
-#pragma acc loop dtype (nVidiaGPU) dtype (*) seq
-	  for (i5 = 1; i5 < 10; i5++)
-#pragma acc loop device_type (nVidiaGPU) seq device_type (*) seq
+#pragma acc loop device_type (nvidia) seq
 	    for (i6 = 1; i6 < 10; i6++)
 	      {
 	      }
@@ -123,36 +108,36 @@  test ()
 #pragma acc routine (foo14) dtype (gpu) seq dtype (*) worker
 #pragma acc routine (foo15) dtype (gpu) bind (foo) dtype (*) seq
 
-/* { dg-final { scan-tree-dump-times "oacc_parallel wait\\(1\\) vector_length\\(32\\) num_workers\\(100\\) num_gangs\\(100\\) async\\(1\\)" 1 "omplower" } } */
+/* { dg-final { scan-tree-dump-times "oacc_parallel device_type\\(nvidia\\) \\\[ wait\\(1\\) vector_length\\(32\\) num_workers\\(100\\) num_gangs\\(100\\) async\\(1\\) \\\]" 1 "omplower" } } */
 
-/* { dg-final { scan-tree-dump-times "oacc_parallel wait\\(1\\) vector_length\\(1\\) num_workers\\(1\\) num_gangs\\(1\\) async\\(1\\) wait\\(2\\) vector_length\\(64\\) num_workers\\(200\\) num_gangs\\(200\\) async\\(2\\)" 1 "omplower" } } */
+/* { dg-final { scan-tree-dump-times "oacc_parallel device_type\\(nvidia\\) \\\[ wait\\(2\\) vector_length\\(64\\) num_workers\\(200\\) num_gangs\\(200\\) async\\(2\\) \\\] wait\\(1\\) vector_length\\(1\\) num_workers\\(1\\) num_gangs\\(1\\) async\\(1\\)" 1 "omplower" } } */
 
-/* { dg-final { scan-tree-dump-times "acc_parallel wait\\(1\\) vector_length\\(1\\) num_workers\\(1\\) num_gangs\\(1\\) async\\(1\\) wait\\(3\\) vector_length\\(128\\) num_workers\\(300\\) num_gangs\\(300\\) async\\(3" 1 "omplower" } } */
+/* { dg-final { scan-tree-dump-times "oacc_parallel device_type\\(\\*\\) \\\[ wait\\(10\\) vector_length\\(10\\) num_workers\\(10\\) num_gangs\\(10\\) async\\(10\\) \\\] device_type\\(nvidia\\) \\\[ wait\\(3\\) vector_length\\(128\\) num_workers\\(300\\) num_gangs\\(300\\) async\\(3\\) \\\] wait\\(1\\) vector_length\\(1\\) num_workers\\(1\\) num_gangs\\(1\\) async\\(1\\)" 1 "omplower" } } */
 
-/* { dg-final { scan-tree-dump-times "oacc_kernels async\\(-1\\)" 4 "omplower" } } */
+/* { dg-final { scan-tree-dump-times "oacc_kernels device_type\\(nvidia\\) \\\[ async\\(-1\\) \\\]" 1 "omplower" } } */
 
-/* { dg-final { scan-tree-dump-times "oacc_kernels async\\(-1\\) wait\\(2\\) async\\(2\\)" 1 "omplower" } } */
+/* { dg-final { scan-tree-dump-times "oacc_kernels device_type\\(nvidia\\) \\\[ wait\\(1\\) async\\(1\\) \\\] async\\(-1\\)" 1 "omplower" } } */
 
-/* { dg-final { scan-tree-dump-times "oacc_kernels async\\(-1\\) wait\\(0\\) async\\(0\\)" 1 "omplower" } } */
+/* { dg-final { scan-tree-dump-times "oacc_kernels device_type\\(\\*\\) \\\[ wait\\(0\\) async\\(0\\) \\\] device_type\\(nvidia\\) \\\[ wait\\(2\\) async\\(2\\) \\\] async\\(-1\\)" 1 "omplower" } } */
 
-/* { dg-final { scan-tree-dump-times "acc loop tile\\(1\\) gang private\\(i1\\.0\\) private\\(i1\\)" 1 "omplower" } } */
+/* { dg-final { scan-tree-dump-times "acc loop device_type\\(nvidia\\) \\\[ tile\\(1\\) gang \\\] private\\(i1\\.0\\) private\\(i1\\)" 1 "omplower" } } */
 
-/* { dg-final { scan-tree-dump-times "acc loop tile\\(1\\) gang private\\(i1\\.1\\) private\\(i1\\)" 1 "omplower" } } */
+/* { dg-final { scan-tree-dump-times "acc loop device_type\\(\\*\\) \\\[ seq \\\] device_type\\(nvidia\\) \\\[ tile\\(1\\) gang \\\] private\\(i1\\.1\\) private\\(i1\\)" 1 "omplower" } } */
 
-/* { dg-final { scan-tree-dump-times "acc loop seq private\\(i1\\.2\\) private\\(i1\\)" 1 "omplower" } } */
+/* { dg-final { scan-tree-dump-times "acc loop device_type\\(nvidia\\) \\\[ collapse\\(1\\) worker \\\] private\\(i2\\)" 1 "omplower" } } */
 
-/* { dg-final { scan-tree-dump-times "acc loop collapse\\(1\\) worker private\\(i2\\)" 2 "omplower" } } */
+/* { dg-final { scan-tree-dump-times "acc loop device_type\\(nvidia\\) \\\[ vector \\\] private\\(i3\\)" 1 "omplower" } } */
 
-/* { dg-final { scan-tree-dump-times "acc loop vector private\\(i3\\)" 2 "omplower" } } */
+/* { dg-final { scan-tree-dump-times "acc loop device_type\\(nvidia\\) \\\[ auto \\\] private\\(i4\\)" 1 "omplower" } } */
 
-/* { dg-final { scan-tree-dump-times "acc loop auto private\\(i4\\)" 2 "omplower" } } */
+/* { dg-final { scan-tree-dump-times "acc loop device_type\\(nvidia\\) \\\[ \\\] private\\(i5\\)" 1 "omplower" } } */
 
-/* { dg-final { scan-tree-dump-times "acc loop private\\(i5\\)" 1 "omplower" } } */
+/* { dg-final { scan-tree-dump-times "acc loop device_type\\(nvidia\\) \\\[ seq \\\] private\\(i6\\)" 2 "omplower" } } */
 
-/* { dg-final { scan-tree-dump-times "acc loop seq private\\(i6\\)" 3 "omplower" } } */
+/* { dg-final { scan-tree-dump-times "acc loop device_type\\(\\*\\) \\\[ seq \\\] device_type\\(nvidia\\) \\\[ collapse\\(1\\) worker \\\] private\\(i2\\)" 1 "omplower" } } */
 
-/* { dg-final { scan-tree-dump-times "acc loop seq private\\(i2\\)" 1 "omplower" } } */
+/* { dg-final { scan-tree-dump-times "acc loop device_type\\(\\*\\) \\\[ seq \\\] device_type\\(nvidia\\) \\\[ vector \\\] private\\(i3\\)" 1 "omplower" } } */
 
-/* { dg-final { scan-tree-dump-times "acc loop seq private\\(i4\\)" 1 "omplower" } } */
+/* { dg-final { scan-tree-dump-times "acc loop device_type\\(\\*\\) \\\[ seq \\\] device_type\\(nvidia\\) \\\[ auto \\\] private\\(i4\\)" 1 "omplower" } } */
 
-/* { dg-final { scan-tree-dump-times "acc loop seq private\\(i5\\)" 2 "omplower" } } */
+/* { dg-final { scan-tree-dump-times "acc loop device_type\\(\\*\\) \\\[ seq \\\] device_type\\(nvidia\\) \\\[ \\\] private\\(i5\\)" 1 "omplower" } } */
diff --git a/gcc/testsuite/c-c++-common/goacc/dtype-2.c b/gcc/testsuite/c-c++-common/goacc/dtype-2.c
index 6fa48b9..96acab0 100644
--- a/gcc/testsuite/c-c++-common/goacc/dtype-2.c
+++ b/gcc/testsuite/c-c++-common/goacc/dtype-2.c
@@ -1,4 +1,5 @@ 
 /* { dg-do compile } */
+/* { dg-prune-output "sorry, unimplemented: device_type clause is not supported yet" } */
 
 void
 test ()
@@ -7,7 +8,7 @@  test ()
 
   /* ACC PARALLEL DEVICE_TYPE: */
 
-#pragma acc parallel dtype (nVidia) async (1) num_gangs (100) num_workers (100) vector_length (32) wait (1) copy (i1) /* { dg-error "not valid" } */
+#pragma acc parallel dtype (nvidia) async (1) num_gangs (100) num_workers (100) vector_length (32) wait (1) copy (i1) /* { dg-error "not valid" } */
   {
   }
 
@@ -20,7 +21,7 @@  test ()
   /* ACC LOOP DEVICE_TYPE: */
 
 #pragma acc parallel
-#pragma acc loop device_type (nVidia) gang tile (1) private (i2) /* { dg-error "not valid" } */
+#pragma acc loop device_type (nvidia) gang tile (1) private (i2) /* { dg-error "not valid" } */
   for (i1 = 1; i1 < 10; i1++)
     {
     }
diff --git a/gcc/testsuite/c-c++-common/goacc/dtype-3.c b/gcc/testsuite/c-c++-common/goacc/dtype-3.c
index 53ab94c..bfb06c4 100644
--- a/gcc/testsuite/c-c++-common/goacc/dtype-3.c
+++ b/gcc/testsuite/c-c++-common/goacc/dtype-3.c
@@ -1,4 +1,5 @@ 
-/* { dg-do compile } */
+/* { dg-do compile { xfail *-*-* } } */
+/* { dg-prune-output "sorry, unimplemented: device_type clause is not supported yet" } */
 
 float b;
 #pragma acc declare link (b)
@@ -14,15 +15,15 @@  main (int argc, char **argv)
   {
   }
 
-#pragma acc parallel device_type (acc_device_nvidia) num_gangs (1)
+#pragma acc parallel device_type (nvidia) num_gangs (1)
   {
   }
 
-#pragma acc parallel device_type (acc_device_host, acc_device_nvidia) num_gangs (1)
+#pragma acc parallel device_type (host, nvidia) num_gangs (1)
   {
   }
 
-#pragma acc parallel device_type (acc_device_host) num_gangs (1)
+#pragma acc parallel device_type (host) num_gangs (1)
   {
   }
 
diff --git a/gcc/testsuite/c-c++-common/goacc/dtype-4.c b/gcc/testsuite/c-c++-common/goacc/dtype-4.c
index f49d522..8328783 100644
--- a/gcc/testsuite/c-c++-common/goacc/dtype-4.c
+++ b/gcc/testsuite/c-c++-common/goacc/dtype-4.c
@@ -1,4 +1,4 @@ 
-/* { dg-do compile } */
+/* { dg-do compile { xfail *-*-* } } */
 
 int
 main (int argc, char **argv)
@@ -8,19 +8,19 @@  main (int argc, char **argv)
   a = 2.0;
   b = 0.0;
 
-  #pragma acc parallel copy (a, b) device_type (acc_device_host) num_gangs (1) device_type (acc_device_nvidia) num_gangs (2)
+  #pragma acc parallel copy (a, b) device_type (host) num_gangs (1) device_type (nvidia) num_gangs (2) /* { dg-message "sorry, unimplemented: device_type clause is not supported yet" } */
   {
   }
 
-  #pragma acc parallel copy (a, b) num_gangs (3) device_type (acc_device_host) num_gangs (1) device_type (acc_device_nvidia) num_gangs (2)
+  #pragma acc parallel copy (a, b) num_gangs (3) device_type (host) num_gangs (1) device_type (nvidia) num_gangs (2) /* { dg-message "sorry, unimplemented: device_type clause is not supported yet" } */
   {
   }
 
-#pragma acc parallel copy (a, b) device_type (acc_device_host) num_gangs (1) device_type (acc_device_nvidia) num_gangs (2) device_type (acc_device_host) num_gangs (60) /* { dg-error "duplicate device_type" } */
+#pragma acc parallel copy (a, b) device_type (host) num_gangs (1) device_type (nvidia) num_gangs (2) device_type (host) num_gangs (60) /* { dg-message "sorry, unimplemented: device_type clause is not supported yet" } */
   {
   }
   
-#pragma acc parallel copy (a, b) num_gangs (3) device_type (nvidia) num_gangs (1) device_type (nvidia) num_gangs (2) /* { dg-error "duplicate device_type" } */
+#pragma acc parallel copy (a, b) num_gangs (3) device_type (nvidia) num_gangs (1) device_type (nvidia) num_gangs (2) /* { dg-message "sorry, unimplemented: device_type clause is not supported yet" } */
   {
   }
 
diff --git a/gcc/testsuite/gfortran.dg/goacc/dtype-1.f95 b/gcc/testsuite/gfortran.dg/goacc/dtype-1.f95
index 8f68bbe..5919ae4 100644
--- a/gcc/testsuite/gfortran.dg/goacc/dtype-1.f95
+++ b/gcc/testsuite/gfortran.dg/goacc/dtype-1.f95
@@ -1,12 +1,13 @@ 
 ! { dg-do compile }
 ! { dg-options "-fopenacc -fdump-tree-omplower" }
+! { dg-prune-output "sorry, unimplemented: device_type clause is not supported yet" }
 
 program dtype
   integer i1, i2, i3, i4, i5, i6
 
 !! ACC PARALLEL DEVICE_TYPE:
 
-!$acc parallel dtype (nVidia) async (1) num_gangs (100) &
+!$acc parallel dtype (nvidia) async (1) num_gangs (100) &
 !$acc&  num_workers (100) vector_length (32) wait (1)
 !$acc end parallel
 
@@ -46,17 +47,17 @@  program dtype
 !! ACC LOOP DEVICE_TYPE:
 
 !$acc parallel
-!$acc loop device_type (nVidia) gang tile (1)
+!$acc loop device_type (nvidia) gang tile (1)
   do i1 = 1, 10
-     !$acc loop dtype (nVidia) worker collapse (1)
+     !$acc loop dtype (nvidia) worker collapse (1)
      do i2 = 1, 10
-        !$acc loop device_type (nVidia) vector
+        !$acc loop device_type (nvidia) vector
         do i3 = 1, 10
-           !$acc loop device_type (nVidia) auto
+           !$acc loop device_type (nvidia) auto
            do i4 = 1, 10
-              !$acc loop dtype (nVidia)
+              !$acc loop dtype (nvidia)
               do i5 = 1, 10
-                 !$acc loop dtype (nVidia) seq
+                 !$acc loop dtype (nvidia) seq
                  do i6 = 1, 10
                  end do
               end do
@@ -67,42 +68,19 @@  program dtype
 !$acc end parallel
 
 !$acc parallel
-!$acc loop dtype (nVidia) gang tile (1) dtype (*) seq
+!$acc loop dtype (nvidia) gang tile (1) dtype (*) seq
   do i1 = 1, 10
-     !$acc loop device_type (nVidia) worker collapse (1) &
+     !$acc loop device_type (nvidia) worker collapse (1) &
      !$acc& device_type (*) seq
      do i2 = 1, 10
-        !$acc loop device_type (nVidia) vector dtype (*) seq
+        !$acc loop device_type (nvidia) vector dtype (*) seq
         do i3 = 1, 10
-           !$acc loop device_type (nVidia) auto dtype (*) seq
+           !$acc loop device_type (nvidia) auto dtype (*) seq
            do i4 = 1, 10
-              !$acc loop dtype (nVidia) &
+              !$acc loop dtype (nvidia) &
               !$acc& dtype (*) seq
               do i5 = 1, 10
-                 !$acc loop device_type (nVidia) seq
-                 do i6 = 1, 10
-                 end do
-              end do
-           end do
-        end do
-     end do
-  end do
-!$acc end parallel
-
-!$acc parallel
-!$acc loop dtype (nVidiaGPU) gang tile (1) dtype (*) seq
-  do i1 = 1, 10
-     !$acc loop dtype (nVidiaGPU) worker collapse (1) &
-     !$acc& device_type (*) seq
-     do i2 = 1, 10
-        !$acc loop device_type (nVidiaGPU) vector device_type (*) seq
-        do i3 = 1, 10
-           !$acc loop dtype (nVidiaGPU) auto device_type (*) seq
-           do i4 = 1, 10
-              !$acc loop dtype (nVidiaGPU) &
-              !$acc& dtype (*) seq
-              do i5 = 1, 10
-                 !$acc loop dtype (nVidiaGPU) seq device_type (*) seq
+                 !$acc loop device_type (nvidia) seq
                  do i6 = 1, 10
                  end do
               end do
@@ -189,42 +167,38 @@  subroutine sr5b ()
   !$acc routine dtype (gpu) bind (foo) device_type (*) seq
 end subroutine sr5b
 
-! { dg-final { scan-tree-dump-times "oacc_parallel async\\(1\\) wait\\(1\\) num_gangs\\(100\\) num_workers\\(100\\) vector_length\\(32\\)" 1 "omplower" } }
-
-! { dg-final { scan-tree-dump-times "oacc_parallel async\\(2\\) wait\\(2\\) num_gangs\\(200\\) num_workers\\(200\\) vector_length\\(64\\)" 1 "omplower" } }
-
-! { dg-final { scan-tree-dump-times "oacc_parallel async\\(3\\) wait\\(3\\) num_gangs\\(300\\) num_workers\\(300\\) vector_length\\(128\\)" 1 "omplower" } }
+! { dg-final { scan-tree-dump-times "oacc_parallel device_type\\(nvidia\\) \\\[ async\\(1\\) wait\\(1\\) num_gangs\\(100\\) num_workers\\(100\\) vector_length\\(32\\) \\\]" 1 "omplower" } }
 
-! { dg-final { scan-tree-dump-times "oacc_parallel async\\(10\\) wait\\(10\\) num_gangs\\(10\\) num_workers\\(10\\) vector_length\\(10\\)" 1 "omplower" } }
+! { dg-final { scan-tree-dump-times "oacc_parallel device_type\\(nvidia\\) \\\[ async\\(2\\) wait\\(2\\) num_gangs\\(200\\) num_workers\\(200\\) vector_length\\(64\\) \\\] async\\(1\\) wait\\(1\\) num_gangs\\(1\\) num_workers\\(1\\) vector_length\\(1\\)" 1 "omplower" } }
 
-! { dg-final { scan-tree-dump-times "oacc_kernels async\\(-1\\)" 1 "omplower" } }
+! { dg-final { scan-tree-dump-times "oacc_parallel device_type\\(\\*\\) \\\[ async\\(10\\) wait\\(10\\) num_gangs\\(10\\) num_workers\\(10\\) vector_length\\(10\\) \\\] device_type\\(nvidia\\) \\\[ async\\(3\\) wait\\(3\\) num_gangs\\(300\\) num_workers\\(300\\) vector_length\\(128\\) \\\] async\\(1\\) wait\\(1\\) num_gangs\\(1\\) num_workers\\(1\\) vector_length\\(1\\)" 1 "omplower" } }
 
-! { dg-final { scan-tree-dump-times "oacc_kernels async\\(1\\) wait\\(1\\)" 1 "omplower" } }
+! { dg-final { scan-tree-dump-times "oacc_parallel device_type\\(\\*\\) \\\[ async\\(10\\) wait\\(10\\) num_gangs\\(10\\) num_workers\\(10\\) vector_length\\(10\\) \\\] device_type\\(nvidia_ptx\\) \\\[ async\\(3\\) wait\\(3\\) num_gangs\\(300\\) num_workers\\(300\\) vector_length\\(128\\) \\\] async\\(1\\) wait\\(1\\) num_gangs\\(1\\) num_workers\\(1\\) vector_length\\(1\\)" 1 "omplower" } }
 
-! { dg-final { scan-tree-dump-times "oacc_kernels async\\(2\\) wait\\(2\\)" 1 "omplower" } }
+! { dg-final { scan-tree-dump-times "oacc_kernels device_type\\(nvidia\\) \\\[ async\\(-1\\) \\\]" 1 "omplower" } }
 
-! { dg-final { scan-tree-dump-times "oacc_kernels async\\(0\\) wait\\(0\\)" 1 "omplower" } }
+! { dg-final { scan-tree-dump-times "oacc_kernels device_type\\(nvidia\\) \\\[ async\\(1\\) wait\\(1\\) \\\]" 1 "omplower" } }
 
-! { dg-final { scan-tree-dump-times "acc loop private\\(i1\\) tile\\(1\\) gang private\\(i1\\.1\\)" 1 "omplower" } }
+! { dg-final { scan-tree-dump-times "oacc_kernels device_type\\(\\*\\) \\\[ async\\(0\\) wait\\(0\\) \\\] device_type\\(nvidia\\) \\\[ async\\(2\\) wait\\(2\\) \\\]" 1 "omplower" } }
 
-! { dg-final { scan-tree-dump-times "acc loop private\\(i1\\) tile\\(1\\) gang private\\(i1\\.2\\)" 1 "omplower" } }
+! { dg-final { scan-tree-dump-times "oacc_kernels device_type\\(\\*\\) \\\[ async\\(0\\) wait\\(0\\) \\\] device_type\\(nvidia_ptx\\) \\\[ async\\(1\\) wait\\(1\\) \\\] async\\(-1\\)" 1 "omplower" } }
 
-! { dg-final { scan-tree-dump-times "acc loop private\\(i1\\) seq private\\(i1\\.3\\)" 1 "omplower" } }
+! { dg-final { scan-tree-dump-times "acc loop device_type\\(nvidia\\) \\\[ tile\\(1\\) gang \\\] private\\(i1\\) private\\(i1\\.1\\)" 1 "omplower" } }
 
-! { dg-final { scan-tree-dump-times "acc loop private\\(i2\\) collapse\\(1\\) worker" 2 "omplower" } }
+! { dg-final { scan-tree-dump-times "acc loop device_type\\(\\*\\) \\\[ seq \\\] device_type\\(nvidia\\) \\\[ tile\\(1\\) gang \\\] private\\(i1\\) private\\(i1\\.2\\)" 1 "omplower" } }
 
-! { dg-final { scan-tree-dump-times "acc loop private\\(i3\\) vector" 2 "omplower" } }
+! { dg-final { scan-tree-dump-times "acc loop device_type\\(nvidia\\) \\\[ collapse\\(1\\) worker \\\] private\\(i2\\)" 1 "omplower" } }
 
-! { dg-final { scan-tree-dump-times "acc loop private\\(i4\\) auto" 2 "omplower" } }
+! { dg-final { scan-tree-dump-times "acc loop device_type\\(nvidia\\) \\\[ vector \\\] private\\(i3\\)" 1 "omplower" } }
 
-! { dg-final { scan-tree-dump-times "acc loop private\\(i4\\)" 3 "omplower" } }
+! { dg-final { scan-tree-dump-times "acc loop device_type\\(nvidia\\) \\\[ auto \\\] private\\(i4\\)" 1 "omplower" } }
 
-! { dg-final { scan-tree-dump-times "acc loop private\\(i5\\)" 3 "omplower" } }
+! { dg-final { scan-tree-dump-times "acc loop device_type\\(nvidia\\) \\\[ \\\] private\\(i5\\)" 1 "omplower" } }
 
-! { dg-final { scan-tree-dump-times "acc loop private\\(i6\\) seq" 3 "omplower" } }
+! { dg-final { scan-tree-dump-times "acc loop device_type\\(nvidia\\) \\\[ seq \\\] private\\(i6\\)" 2 "omplower" } }
 
-! { dg-final { scan-tree-dump-times "acc loop private\\(i2\\) seq" 1 "omplower" } }
+! { dg-final { scan-tree-dump-times "acc loop device_type\\(\\*\\) \\\[ seq \\\] device_type\\(nvidia\\) \\\[ collapse\\(1\\) worker \\\] private\\(i2\\)" 1 "omplower" } }
 
-! { dg-final { scan-tree-dump-times "acc loop private\\(i4\\) seq" 1 "omplower" } }
+! { dg-final { scan-tree-dump-times "acc loop device_type\\(\\*\\) \\\[ seq \\\] device_type\\(nvidia\\) \\\[ auto \\\] private\\(i4\\)" 1 "omplower" } }
 
-! { dg-final { scan-tree-dump-times "acc loop private\\(i5\\) seq" 1 "omplower" } }
+! { dg-final { scan-tree-dump-times "acc loop device_type\\(\\*\\) \\\[ seq \\\] device_type\\(nvidia\\) \\\[ \\\] private\\(i5\\)" 1 "omplower" } }
diff --git a/gcc/testsuite/gfortran.dg/goacc/dtype-2.f95 b/gcc/testsuite/gfortran.dg/goacc/dtype-2.f95
index 0d96e37..6405749 100644
--- a/gcc/testsuite/gfortran.dg/goacc/dtype-2.f95
+++ b/gcc/testsuite/gfortran.dg/goacc/dtype-2.f95
@@ -1,11 +1,12 @@ 
 ! { dg-do compile }
+! { dg-prune-output "sorry, unimplemented: device_type clause is not supported yet" }
 
 program dtype
   integer i1, i2, i3, i4, i5, i6
 
 !! ACC PARALLEL DEVICE_TYPE:
 
-!$acc parallel device_type (nVidia) async (1) num_gangs (100) &
+!$acc parallel device_type (nvidia) async (1) num_gangs (100) &
 !$acc&  num_workers (100) vector_length (32) wait (1) copy (i1)
 !$acc end parallel
 
@@ -17,7 +18,7 @@  program dtype
 !! ACC LOOP DEVICE_TYPE:
 
 !$acc parallel
-!$acc loop dtype (nVidia) gang tile (1) private (i1)
+!$acc loop dtype (nvidia) gang tile (1) private (i1)
   do i1 = 1, 10
   end do
 !$acc end parallel
@@ -28,12 +29,12 @@  program dtype
 
 end program dtype
 
-! { dg-error "Unclassifiable OpenACC directive" "" { target *-*-* } 8 }
-! { dg-error "Unexpected" "" { target *-*-* } 10 }
+! { dg-error "Unclassifiable OpenACC directive" "" { target *-*-* } 9 }
+! { dg-error "Unexpected" "" { target *-*-* } 11 }
 
-! { dg-error "Unclassifiable OpenACC directive" "" { target *-*-* } 14 }
-! { dg-error "Unexpected" "" { target *-*-* } 15 }
+! { dg-error "Unclassifiable OpenACC directive" "" { target *-*-* } 15 }
+! { dg-error "Unexpected" "" { target *-*-* } 16 }
 
-! { dg-error "Unclassifiable OpenACC directive" "" { target *-*-* } 20 }
+! { dg-error "Unclassifiable OpenACC directive" "" { target *-*-* } 21 }
 
-! { dg-error "Unclassifiable OpenACC directive" "" { target *-*-* } 27 }
+! { dg-error "Unclassifiable OpenACC directive" "" { target *-*-* } 28 }
diff --git a/gcc/testsuite/gfortran.dg/goacc/dtype-3.f b/gcc/testsuite/gfortran.dg/goacc/dtype-3.f
index 2b2d45f..dd82c78 100644
--- a/gcc/testsuite/gfortran.dg/goacc/dtype-3.f
+++ b/gcc/testsuite/gfortran.dg/goacc/dtype-3.f
@@ -1,4 +1,5 @@ 
 ! { dg-do compile }
+! { dg-prune-output "sorry, unimplemented: device_type clause is not supported yet" }
 
       IMPLICIT NONE
 
diff --git a/gcc/tree-pretty-print.c b/gcc/tree-pretty-print.c
index 4b410a3..00fcee5 100644
--- a/gcc/tree-pretty-print.c
+++ b/gcc/tree-pretty-print.c
@@ -798,6 +798,15 @@  dump_omp_clause (pretty_printer *pp, tree clause, int spc, int flags)
 			 spc, flags, false);
       pp_right_paren (pp);
       break;
+    case OMP_CLAUSE_DEVICE_TYPE:
+      pp_string (pp, "device_type(");
+      dump_generic_node (pp, OMP_CLAUSE_DEVICE_TYPE_DEVICES (clause),
+			 spc, flags, false);
+      pp_string (pp, ") [");
+      dump_omp_clauses (pp, OMP_CLAUSE_DEVICE_TYPE_CLAUSES (clause),
+			spc, flags);
+      pp_string (pp, " ]");
+      break;
 
     default:
       /* Should never happen.  */
diff --git a/gcc/tree.c b/gcc/tree.c
index e67315a..ff533a3 100644
--- a/gcc/tree.c
+++ b/gcc/tree.c
@@ -11401,6 +11401,11 @@  walk_tree_1 (tree *tp, walk_tree_fn func, void *data,
 	case OMP_CLAUSE_TILE:
 	  WALK_SUBTREE_TAIL (OMP_CLAUSE_CHAIN (*tp));
 
+	case OMP_CLAUSE_DEVICE_TYPE:
+	  WALK_SUBTREE (OMP_CLAUSE_DEVICE_TYPE_DEVICES (*tp));
+	  WALK_SUBTREE (OMP_CLAUSE_DEVICE_TYPE_CLAUSES (*tp));
+	  WALK_SUBTREE_TAIL (OMP_CLAUSE_CHAIN (*tp));
+
 	case OMP_CLAUSE_LASTPRIVATE:
 	  WALK_SUBTREE (OMP_CLAUSE_DECL (*tp));
 	  WALK_SUBTREE (OMP_CLAUSE_LASTPRIVATE_STMT (*tp));