diff mbox series

OpenACC 2.6 manual deep copy support (attach/detach)

Message ID 1543578069-386-1-git-send-email-julian@codesourcery.com
State New
Headers show
Series OpenACC 2.6 manual deep copy support (attach/detach) | expand

Commit Message

Julian Brown Nov. 30, 2018, 11:41 a.m. UTC
This is a new version of the patch incorporating
several improvements/bugfixes made on the og8 branch:

  https://gcc.gnu.org/ml/gcc-patches/2018-11/msg01773.html
  https://gcc.gnu.org/ml/gcc-patches/2018-11/msg02366.html

The "dynamic"/multidimensional arrays parts (mostly initialisations of
new struct fields in the appropriate places) are missing, because they
haven't landed upstream yet. Non-runtime parts of the patch are the same
as before. There are a few new tests. The patch also supersedes this one:

  https://gcc.gnu.org/ml/gcc-patches/2018-09/msg01175.html

and this one:

  https://gcc.gnu.org/ml/gcc-patches/2018-11/msg02543.html

Tested with offloading to nvptx and bootstrapped, on top of Chung-Lin's
async patches, previously-posted patches in the series and with my patch
to consistency-check refcounts (to be posted). OK?

Thanks,

Julian

ChangeLog

	gcc/c-family/
	* c-pragma.h (pragma_omp_clause): Add PRAGMA_OACC_CLAUSE_ATTACH,
	PRAGMA_OACC_CLAUSE_DETACH.

	gcc/c/
	* c-parser.c (c_parser_omp_clause_name): Add parsing of attach and
	detach clauses.
	(c_parser_omp_variable_list): Allow deref (->) in variable lists.
	(c_parser_oacc_data_clause): Support attach and detach clauses.
	(c_parser_oacc_all_clauses): Likewise.
	(OACC_DATA_CLAUSE_MASK, OACC_ENTER_DATA_CLAUSE_MASK)
	(OACC_KERNELS_CLAUSE_MASK, OACC_PARALLEL_CLAUSE_MASK): Add
	PRAGMA_OACC_CLAUSE_ATTACH.
	(OACC_EXIT_DATA_CLAUSE_MASK): Add PRAGMA_OACC_CLAUSE_DETACH.
	* c-typeck.c (handle_omp_array_sections_1): Reject subarrays for attach
	and detach.  Support deref.
	(c_oacc_check_attachments): New function.
	(c_finish_omp_clauses): Check attach/detach arguments for being
	pointers using above.  Support deref.

	gcc/cp/
	* parser.c (cp_parser_omp_clause_name): Support attach and detach
	clauses.
	(cp_parser_omp_var_list_no_open): Support deref.
	(cp_parser_oacc_data_clause): Support attach and detach clauses.
	(cp_parser_oacc_all_clauses): Likewise.
	(OACC_DATA_CLAUSE_MASK, OACC_ENTER_DATA_CLAUSE_MASK)
	(OACC_KERNELS_CLAUSE_MASK, OACC_PARALLEL_CLAUSE_MASK): Add
	PRAGMA_OACC_CLAUSE_ATTACH.
	(OACC_EXIT_DATA_CLAUSE_MASK): Add PRAGMA_OACC_CLAUSE_DETACH.
	* semantics.c (handle_omp_array_sections_1): Reject subarrays for
	attach and detach.
	(cp_oacc_check_attachments): New function.
	(finish_omp_clauses): Use above function.  Allow structure fields and
	class members to appear in OpenACC data clauses.  Support deref.

	gcc/fortran/
	* gfortran.h (gfc_omp_map_op): Add OMP_MAP_ATTACH, OMP_MAP_DETACH.
	* openmp.c (gfc_match_omp_variable_list): Add allow_derived parameter.
	Parse derived-type member accesses if true.
	(omp_mask2): Add OMP_CLAUSE_ATTACH, OMP_CLAUSE_DETACH.
	(gfc_match_omp_map_clause): Add allow_derived parameter.  Pass to
	gfc_match_omp_variable_list.
	(gfc_match_omp_clauses): Support attach and detach.  Support derived
	types for appropriate OpenACC directives.
	(OACC_PARALLEL_CLAUSES, OACC_KERNELS_CLAUSES, OACC_DATA_CLAUSES)
	(OACC_ENTER_DATA_CLAUSES): Add OMP_CLAUSE_ATTACH.
	(OACC_EXIT_DATA_CLAUSES): Add OMP_CLAUSE_DETACH.
	(check_symbol_not_pointer): Don't disallow pointer objects of derived
	type.
	(resolve_oacc_data_clauses): Don't disallow allocatable derived types.
	(resolve_omp_clauses): Perform duplicate checking only for non-derived
	type component accesses (plain variables and arrays or array sections).
	Support component refs.
	* trans-openmp.c (gfc_omp_privatize_by_reference): Support component
	refs.
	(gfc_trans_omp_clauses): Support component refs, attach and detach
	clauses.

	gcc/
	* gimplify.c (gimplify_omp_var_data): Add GOVD_MAP_HAS_ATTACHMENTS.
	(insert_struct_component_mapping): Support derived-type member mappings
	for arrays with descriptors which use GOMP_MAP_TO_PSET.
	(gimplify_scan_omp_clauses): Rewrite GOMP_MAP_ALWAYS_POINTER to
	GOMP_MAP_ATTACH for OpenACC struct/derived-type component pointers.
	Handle pointer mappings that use GOMP_MAP_TO_PSET.  Handle attach/detach
	clauses.
	(gimplify_adjust_omp_clauses_1): Skip adjustments for explicit
	attach/detach clauses.
	(gimplify_omp_target_update): Handle finalize for detach.
	* omp-low.c (lower_omp_target): Support GOMP_MAP_ATTACH,
	GOMP_MAP_DETACH, GOMP_MAP_FORCE_DETACH.
	* tree-pretty-print.c (dump_omp_clause): Likewise.

	gcc/include/
	* gomp-constants.h (GOMP_MAP_DEEP_COPY): Define.
	(gomp_map_kind): Add GOMP_MAP_ATTACH, GOMP_MAP_DETACH,
	GOMP_MAP_FORCE_DETACH.

	gcc/testsuite/
	* c-c++-common/goacc/mdc-1.c: New test.
	* c-c++-common/goacc/mdc-2.c: New test.
	* gcc.dg/goacc/mdc.C: New test.
	* gfortran.dg/goacc/data-clauses.f95: New test.
	* gfortran.dg/goacc/derived-types.f90: New test.
	* gfortran.dg/goacc/enter-exit-data.f95: New test.

	libgomp/
	* libgomp.h (struct target_var_desc): Add do_detach flag.
	(struct splay_tree_key_s): Add attach_count field.  Substitute
	dynamic_refcount field for virtual_refcount.
	(struct acc_dispatch_t): Remove data_environ field.
	(enum gomp_map_vars_kind): Add GOMP_MAP_VARS_OPENACC_ENTER_DATA.
	(gomp_acc_insert_pointer): Remove prototype.
	(gomp_acc_remove_pointer): Update prototype.
	(struct gomp_coalesce_buf): Add forward declaration.
	(gomp_map_val, gomp_attach_pointer, gomp_detach_pointer): Add
	prototypes.
	* libgomp.map (OACC_2.6): New section. Add acc_attach, acc_attach_async,
	acc_detach, acc_detach_async, acc_detach_finalize,
	acc_detach_finalize_async.
	* oacc-async.c (goacc_remove_var_async): New function.
	* oacc-host.c (host_dispatch): Don't initialise removed data_environ
	field.
	* oacc-init.c (acc_shutdown_1): Use gomp_remove_var instead of
	gomp_unmap_vars to remove mappings by splay tree key instead of target
	memory descriptor.
	* oacc-int.h (splay_tree_key_s): Add forward declaration.
	(goacc_remove_car_async): Add prototype.
	* oacc-mem.c (lookup_dev_1): New function.
	(lookup_dev): Reimplement using above.
	(acc_free, acc_hostptr): Update calls to lookup_dev.
	(acc_map_data): Likewise.  Don't add to data_environ list.
	(acc_unmap_data): Update call to gomp_unmap_vars.  Fix semantics to
	remove mapping, but not mapped data.
	(present_create_copy): Use virtual_refcount instead of
	dynamic_refcount.  Don't manipulate data_environ.  Fix target pointer
	return value.
	(delete_copyout): Update for virtual_refcount semantics.  Use
	goacc_remove_var_async for asynchronous delete/copyouts.
	(gomp_acc_insert_pointer): Remove function.
	(gomp_acc_remove_pointer): Reimplement.
	(acc_attach_async, acc_attach, goacc_detach_internal, acc_detach)
	(acc_detach_async, acc_detach_finalize, acc_detach_finalize_async): New
	functions.
	* oacc-parallel.c (find_pointer): Support attach/detach.  Make a little
	more strict.
	(GOACC_parallel_keyed): Use gomp_map_val to calculate device addresses.
	(GOACC_enter_exit_data): Support attach/detach and GOMP_MAP_STRUCT.
	Don't call gomp_acc_insert_pointer.
	* openacc.h (acc_attach, acc_attach_async, acc_detach)
	(acc_detach_async, acc_detach_finalize, acc_detach_finalize_async): Add
	prototypes.
	* target.c (limits.h): Include.
	(gomp_map_vars_existing): Initialise do_detach field of tgt_var_desc.
	(gomp_attach_pointer, gomp_detach_pointer): New functions.
	(gomp_map_val): Make global.
	(gomp_map_vars_async): Handle GOMP_MAP_VARS_OPENACC_ENTER_DATA.  Update
	for virtual_refcount semantics.  Support attach and detach.
	(gomp_remove_var): Free attach count array if present.
	(gomp_unmap_vars_async): Support detach and update for virtual_refcount
	semantics.
	(gomp_load_image_to_device): Zero-initialise virtual_refcount fields.
	(gomp_free_memmap): Remove function.
	(omp_target_associate_ptr): Zero-initialise virtual_refcount,
	attach_count and link_key splay tree key fields.
	(gomp_target_init): Don't initialise removed data_environ field.
	* testsuite/libgomp.oacc-c-c++-common/deep-copy-1.c: New test.
	* testsuite/libgomp.oacc-c-c++-common/deep-copy-2.c: New test.
	* testsuite/libgomp.oacc-c-c++-common/deep-copy-3.c: New test.
	* testsuite/libgomp.oacc-c-c++-common/deep-copy-4.c: New test.
	* testsuite/libgomp.oacc-c-c++-common/deep-copy-5.c: New test.
	* testsuite/libgomp.oacc-c-c++-common/deep-copy-6.c: New test.
	* testsuite/libgomp.oacc-c-c++-common/deep-copy-7.c: New test.
	* testsuite/libgomp.oacc-c-c++-common/deep-copy-8.c: New test.
	* testsuite/libgomp.oacc-fortran/deep-copy-1.c: New test.
	* testsuite/libgomp.oacc-fortran/deep-copy-2.c: New test.
	* testsuite/libgomp.oacc-fortran/deep-copy-3.c: New test.
	* testsuite/libgomp.oacc-fortran/deep-copy-4.c: New test.
	* testsuite/libgomp.oacc-fortran/deep-copy-5.c: New test.
	* testsuite/libgomp.oacc-fortran/deep-copy-6.c: New test.
	* testsuite/libgomp.oacc-fortran/deep-copy-7.c: New test.
	* testsuite/libgomp.oacc-fortran/deep-copy-8.c: New test.
	* testsuite/libgomp.oacc-fortran/data-2.f90: Update test.
	* testsuite/libgomp.oacc-fortran/derived-type-1.f90: New test.
	* testsuite/libgomp.oacc-fortran/update-2.f90: New test.
---
 gcc/c-family/c-pragma.h                            |    2 +
 gcc/c/c-parser.c                                   |   34 ++-
 gcc/c/c-typeck.c                                   |   59 +++-
 gcc/cp/parser.c                                    |   38 ++-
 gcc/cp/semantics.c                                 |   75 ++++-
 gcc/fortran/gfortran.h                             |    2 +
 gcc/fortran/openmp.c                               |  145 +++++---
 gcc/fortran/trans-openmp.c                         |   78 +++-
 gcc/gimplify.c                                     |   85 ++++-
 gcc/omp-low.c                                      |    3 +
 gcc/testsuite/c-c++-common/goacc/mdc-1.c           |   54 +++
 gcc/testsuite/c-c++-common/goacc/mdc-2.c           |   62 +++
 gcc/testsuite/g++.dg/goacc/mdc.C                   |   68 ++++
 gcc/testsuite/gfortran.dg/goacc/data-clauses.f95   |   38 +-
 gcc/testsuite/gfortran.dg/goacc/derived-types.f90  |   77 ++++
 .../gfortran.dg/goacc/enter-exit-data.f95          |   24 +-
 gcc/tree-pretty-print.c                            |    9 +
 include/gomp-constants.h                           |    8 +
 libgomp/libgomp.h                                  |   34 ++-
 libgomp/libgomp.map                                |   10 +
 libgomp/oacc-async.c                               |   18 +
 libgomp/oacc-host.c                                |    2 -
 libgomp/oacc-init.c                                |    9 +-
 libgomp/oacc-int.h                                 |    5 +
 libgomp/oacc-mem.c                                 |  402 +++++++++-----------
 libgomp/oacc-parallel.c                            |  240 +++++++++---
 libgomp/openacc.h                                  |    6 +
 libgomp/target.c                                   |  238 ++++++++++--
 .../libgomp.oacc-c-c++-common/context-2.c          |    6 +-
 .../libgomp.oacc-c-c++-common/context-4.c          |    6 +-
 .../libgomp.oacc-c-c++-common/deep-copy-1.c        |   24 ++
 .../libgomp.oacc-c-c++-common/deep-copy-2.c        |   29 ++
 .../libgomp.oacc-c-c++-common/deep-copy-3.c        |   34 ++
 .../libgomp.oacc-c-c++-common/deep-copy-4.c        |   87 +++++
 .../libgomp.oacc-c-c++-common/deep-copy-5.c        |   81 ++++
 .../libgomp.oacc-c-c++-common/deep-copy-6.c        |   59 +++
 .../libgomp.oacc-c-c++-common/deep-copy-7.c        |   45 +++
 .../libgomp.oacc-c-c++-common/deep-copy-8.c        |   54 +++
 libgomp/testsuite/libgomp.oacc-fortran/data-2.f90  |    7 +-
 .../testsuite/libgomp.oacc-fortran/deep-copy-1.f90 |   35 ++
 .../testsuite/libgomp.oacc-fortran/deep-copy-2.f90 |   33 ++
 .../testsuite/libgomp.oacc-fortran/deep-copy-3.f90 |   34 ++
 .../testsuite/libgomp.oacc-fortran/deep-copy-4.f90 |   49 +++
 .../testsuite/libgomp.oacc-fortran/deep-copy-5.f90 |   57 +++
 .../testsuite/libgomp.oacc-fortran/deep-copy-6.f90 |   61 +++
 .../testsuite/libgomp.oacc-fortran/deep-copy-7.f90 |   89 +++++
 .../testsuite/libgomp.oacc-fortran/deep-copy-8.f90 |   41 ++
 .../libgomp.oacc-fortran/derived-type-1.f90        |   28 ++
 .../testsuite/libgomp.oacc-fortran/update-2.f90    |  284 ++++++++++++++
 49 files changed, 2526 insertions(+), 442 deletions(-)
 create mode 100644 gcc/testsuite/c-c++-common/goacc/mdc-1.c
 create mode 100644 gcc/testsuite/c-c++-common/goacc/mdc-2.c
 create mode 100644 gcc/testsuite/g++.dg/goacc/mdc.C
 create mode 100644 gcc/testsuite/gfortran.dg/goacc/derived-types.f90
 create mode 100644 libgomp/testsuite/libgomp.oacc-c-c++-common/deep-copy-1.c
 create mode 100644 libgomp/testsuite/libgomp.oacc-c-c++-common/deep-copy-2.c
 create mode 100644 libgomp/testsuite/libgomp.oacc-c-c++-common/deep-copy-3.c
 create mode 100644 libgomp/testsuite/libgomp.oacc-c-c++-common/deep-copy-4.c
 create mode 100644 libgomp/testsuite/libgomp.oacc-c-c++-common/deep-copy-5.c
 create mode 100644 libgomp/testsuite/libgomp.oacc-c-c++-common/deep-copy-6.c
 create mode 100644 libgomp/testsuite/libgomp.oacc-c-c++-common/deep-copy-7.c
 create mode 100644 libgomp/testsuite/libgomp.oacc-c-c++-common/deep-copy-8.c
 create mode 100644 libgomp/testsuite/libgomp.oacc-fortran/deep-copy-1.f90
 create mode 100644 libgomp/testsuite/libgomp.oacc-fortran/deep-copy-2.f90
 create mode 100644 libgomp/testsuite/libgomp.oacc-fortran/deep-copy-3.f90
 create mode 100644 libgomp/testsuite/libgomp.oacc-fortran/deep-copy-4.f90
 create mode 100644 libgomp/testsuite/libgomp.oacc-fortran/deep-copy-5.f90
 create mode 100644 libgomp/testsuite/libgomp.oacc-fortran/deep-copy-6.f90
 create mode 100644 libgomp/testsuite/libgomp.oacc-fortran/deep-copy-7.f90
 create mode 100644 libgomp/testsuite/libgomp.oacc-fortran/deep-copy-8.f90
 create mode 100644 libgomp/testsuite/libgomp.oacc-fortran/derived-type-1.f90
 create mode 100644 libgomp/testsuite/libgomp.oacc-fortran/update-2.f90

Comments

Julian Brown Dec. 3, 2018, 5:03 p.m. UTC | #1
On Fri, 30 Nov 2018 03:41:09 -0800
Julian Brown <julian@codesourcery.com> wrote:

> This is a new version of the patch incorporating
> several improvements/bugfixes made on the og8 branch:

I released I forgot (again!) to incorporate the changes suggested by
Bernhard in:

https://gcc.gnu.org/ml/gcc-patches/2018-11/msg00861.html

I've folded those into my copy of the local patch now, but I'll hold
off resubmitting until the rest of the patch is reviewed.

Thanks,

Julian
Jakub Jelinek Dec. 7, 2018, 1:50 p.m. UTC | #2
On Fri, Nov 30, 2018 at 03:41:09AM -0800, Julian Brown wrote:
> 	gcc/c-family/
> 	* c-pragma.h (pragma_omp_clause): Add PRAGMA_OACC_CLAUSE_ATTACH,
> 	PRAGMA_OACC_CLAUSE_DETACH.
...
> @@ -11804,9 +11808,12 @@ c_parser_omp_variable_list (c_parser *parser,
>  	    case OMP_CLAUSE_MAP:
>  	    case OMP_CLAUSE_FROM:
>  	    case OMP_CLAUSE_TO:
> -	      while (c_parser_next_token_is (parser, CPP_DOT))
> +	      while (c_parser_next_token_is (parser, CPP_DOT)
> +		     || c_parser_next_token_is (parser, CPP_DEREF))
>  		{
>  		  location_t op_loc = c_parser_peek_token (parser)->location;
> +		  if (c_parser_next_token_is (parser, CPP_DEREF))
> +		    t = build_simple_mem_ref (t);

This change is not ok, if OpenACC allows it in clauses, OpenMP 4.5 does not
and OpenMP 5.0 allows arbitrary lvalues that will need to be handled
differently (still unimplemented).  So, this needs to be guarded for OpenACC
only (perhaps for selected OpenACC clauses)?

> @@ -12632,6 +12631,8 @@ handle_omp_array_sections_1 (tree c, tree t, vec<tree> &types,
>  		}
>  	      t = TREE_OPERAND (t, 0);
>  	    }
> +	  if (TREE_CODE (t) == MEM_REF)
> +	    t = TREE_OPERAND (t, 0);

Again, better guard this for OpenACC.  Maybe verify that mem_ref_offset is 0?

> @@ -14163,6 +14214,8 @@ c_finish_omp_clauses (tree clauses, enum c_omp_region_type ort)
>  		}
>  	      if (remove)
>  		break;
> +	      if (TREE_CODE (t) == MEM_REF)
> +		t = TREE_OPERAND (t, 0);

Guard again?

> @@ -31832,15 +31836,19 @@ cp_parser_omp_var_list_no_open (cp_parser *parser, enum omp_clause_code kind,
>  	    case OMP_CLAUSE_MAP:
>  	    case OMP_CLAUSE_FROM:
>  	    case OMP_CLAUSE_TO:
> -	      while (cp_lexer_next_token_is (parser->lexer, CPP_DOT))
> +	      while (cp_lexer_next_token_is (parser->lexer, CPP_DOT)
> +		     || cp_lexer_next_token_is (parser->lexer, CPP_DEREF))

Ditto as for C.

> @@ -4691,6 +4690,19 @@ handle_omp_array_sections_1 (tree c, tree t, vec<tree> &types,
>    if (low_bound == NULL_TREE)
>      low_bound = integer_zero_node;
>  
> +  if (OMP_CLAUSE_CODE (c) == OMP_CLAUSE_MAP
> +      && (OMP_CLAUSE_MAP_KIND (c) == GOMP_MAP_ATTACH
> +	  || OMP_CLAUSE_MAP_KIND (c) == GOMP_MAP_DETACH))
> +    {
> +      if (length != integer_one_node)
> +	{
> +	  error_at (OMP_CLAUSE_LOCATION (c),
> +		    OMP_CLAUSE_MAP_KIND (c) == GOMP_MAP_ATTACH
> +		    ? "array section in %<attach%> clause"
> +		    : "array section in %<detach%> clause");

So, are any array sections invalid, including e.g. [0:1] or say
[5:] where size of the array is 6 elts, or what exactly is invalid?

> +      if (TREE_CODE (type) != POINTER_TYPE)
> +	{
> +	  error_at (OMP_CLAUSE_LOCATION (c),
> +		    OMP_CLAUSE_MAP_KIND (c) == GOMP_MAP_ATTACH
> +		    ? "expected pointer in %<attach%> clause"
> +		    : "expected pointer in %<detach%> clause");

Perhaps you can use %qs and omp_clause_name [OMP_CLAUSE_CODE (c)] ?
> +	  return true;
> +	}
> +    }
> +
> +  return false;
> +}
> +
>  /* For all elements of CLAUSES, validate them vs OpenMP constraints.
>     Remove any elements from the list that are invalid.  */
>  
> @@ -6288,7 +6337,7 @@ finish_omp_clauses (tree clauses, enum c_omp_region_type ort)
>  	    t = OMP_CLAUSE_DECL (c);
>  	check_dup_generic_t:
>  	  if (t == current_class_ptr
> -	      && (ort != C_ORT_OMP_DECLARE_SIMD
> +	      && ((ort != C_ORT_OMP_DECLARE_SIMD && ort != C_ORT_ACC)
>  		  || (OMP_CLAUSE_CODE (c) != OMP_CLAUSE_LINEAR
>  		      && OMP_CLAUSE_CODE (c) != OMP_CLAUSE_UNIFORM)))
>  	    {
> @@ -6352,8 +6401,7 @@ finish_omp_clauses (tree clauses, enum c_omp_region_type ort)
>  	handle_field_decl:
>  	  if (!remove
>  	      && TREE_CODE (t) == FIELD_DECL
> -	      && t == OMP_CLAUSE_DECL (c)
> -	      && ort != C_ORT_ACC)
> +	      && t == OMP_CLAUSE_DECL (c))
>  	    {
>  	      OMP_CLAUSE_DECL (c)
>  		= omp_privatize_field (t, (OMP_CLAUSE_CODE (c)
> @@ -6420,7 +6468,7 @@ finish_omp_clauses (tree clauses, enum c_omp_region_type ort)
>  	    omp_note_field_privatization (t, OMP_CLAUSE_DECL (c));
>  	  else
>  	    t = OMP_CLAUSE_DECL (c);
> -	  if (t == current_class_ptr)
> +	  if (ort != C_ORT_ACC && t == current_class_ptr)
>  	    {
>  	      error_at (OMP_CLAUSE_LOCATION (c),
>  			"%<this%> allowed in OpenMP only in %<declare simd%>"
> @@ -6907,7 +6955,7 @@ finish_omp_clauses (tree clauses, enum c_omp_region_type ort)
>  	    }
>  	  if (t == error_mark_node)
>  	    remove = true;
> -	  else if (t == current_class_ptr)
> +	  else if (ort != C_ORT_ACC && t == current_class_ptr)
>  	    {
>  	      error_at (OMP_CLAUSE_LOCATION (c),
>  			"%<this%> allowed in OpenMP only in %<declare simd%>"
> @@ -7037,6 +7085,8 @@ finish_omp_clauses (tree clauses, enum c_omp_region_type ort)
>  			}
>  		    }
>  		}
> +	      if (cp_oacc_check_attachments (c))
> +		remove = true;
>  	      break;
>  	    }
>  	  if (t == error_mark_node)
> @@ -7044,14 +7094,25 @@ finish_omp_clauses (tree clauses, enum c_omp_region_type ort)
>  	      remove = true;
>  	      break;
>  	    }
> +	  /* OpenACC attach / detach clauses must be pointers.  */
> +	  if (cp_oacc_check_attachments (c))
> +	    {
> +	      remove = true;
> +	      break;
> +	    }
>  	  if (REFERENCE_REF_P (t)
>  	      && TREE_CODE (TREE_OPERAND (t, 0)) == COMPONENT_REF)
>  	    {
>  	      t = TREE_OPERAND (t, 0);
>  	      OMP_CLAUSE_DECL (c) = t;
>  	    }
> +	  if (ort == C_ORT_ACC
> +	      && TREE_CODE (t) == COMPONENT_REF
> +	      && TREE_CODE (TREE_OPERAND (t, 0)) == INDIRECT_REF)
> +	    t = TREE_OPERAND (TREE_OPERAND (t, 0), 0);
>  	  if (TREE_CODE (t) == COMPONENT_REF
> -	      && (ort & C_ORT_OMP_DECLARE_SIMD) == C_ORT_OMP
> +	      && ((ort & C_ORT_OMP_DECLARE_SIMD) == C_ORT_OMP
> +		  || ort == C_ORT_ACC)
>  	      && OMP_CLAUSE_CODE (c) != OMP_CLAUSE__CACHE_)
>  	    {
>  	      if (type_dependent_expression_p (t))
> diff --git a/gcc/fortran/gfortran.h b/gcc/fortran/gfortran.h
> index d8ef35d..9f96418 100644
> --- a/gcc/fortran/gfortran.h
> +++ b/gcc/fortran/gfortran.h
> @@ -1175,10 +1175,12 @@ enum gfc_omp_depend_op
>  enum gfc_omp_map_op
>  {
>    OMP_MAP_ALLOC,
> +  OMP_MAP_ATTACH,
>    OMP_MAP_TO,
>    OMP_MAP_FROM,
>    OMP_MAP_TOFROM,
>    OMP_MAP_DELETE,
> +  OMP_MAP_DETACH,
>    OMP_MAP_FORCE_ALLOC,
>    OMP_MAP_FORCE_TO,
>    OMP_MAP_FORCE_FROM,
> diff --git a/gcc/fortran/openmp.c b/gcc/fortran/openmp.c
> index 6430e61..ebba7ca 100644
> --- a/gcc/fortran/openmp.c
> +++ b/gcc/fortran/openmp.c
> @@ -222,7 +222,8 @@ static match
>  gfc_match_omp_variable_list (const char *str, gfc_omp_namelist **list,
>  			     bool allow_common, bool *end_colon = NULL,
>  			     gfc_omp_namelist ***headp = NULL,
> -			     bool allow_sections = false)
> +			     bool allow_sections = false,
> +			     bool allow_derived = false)
>  {
>    gfc_omp_namelist *head, *tail, *p;
>    locus old_loc, cur_loc;
> @@ -248,7 +249,8 @@ gfc_match_omp_variable_list (const char *str, gfc_omp_namelist **list,
>  	case MATCH_YES:
>  	  gfc_expr *expr;
>  	  expr = NULL;
> -	  if (allow_sections && gfc_peek_ascii_char () == '(')
> +	  if ((allow_sections && gfc_peek_ascii_char () == '(')
> +	      || (allow_derived && gfc_peek_ascii_char () == '%'))
>  	    {
>  	      gfc_current_locus = cur_loc;
>  	      m = gfc_match_variable (&expr, 0);
> @@ -785,7 +787,7 @@ enum omp_mask1
>    OMP_MASK1_LAST
>  };
>  
> -/* OpenACC 2.0 specific clauses. */
> +/* OpenACC 2.0+ specific clauses. */
>  enum omp_mask2
>  {
>    OMP_CLAUSE_ASYNC,
> @@ -811,6 +813,8 @@ enum omp_mask2
>    OMP_CLAUSE_TILE,
>    OMP_CLAUSE_IF_PRESENT,
>    OMP_CLAUSE_FINALIZE,
> +  OMP_CLAUSE_ATTACH,
> +  OMP_CLAUSE_DETACH,
>    /* This must come last.  */
>    OMP_MASK2_LAST
>  };
> @@ -914,10 +918,12 @@ omp_inv_mask::omp_inv_mask (const omp_mask &m) : omp_mask (m)
>     mapping.  */
>  
>  static bool
> -gfc_match_omp_map_clause (gfc_omp_namelist **list, gfc_omp_map_op map_op)
> +gfc_match_omp_map_clause (gfc_omp_namelist **list, gfc_omp_map_op map_op,
> +			  bool allow_derived = false)
>  {
>    gfc_omp_namelist **head = NULL;
> -  if (gfc_match_omp_variable_list ("", list, false, NULL, &head, true)
> +  if (gfc_match_omp_variable_list ("", list, false, NULL, &head, true,
> +				   allow_derived)
>        == MATCH_YES)
>      {
>        gfc_omp_namelist *n;
> @@ -939,6 +945,14 @@ gfc_match_omp_clauses (gfc_omp_clauses **cp, const omp_mask mask,
>  {
>    gfc_omp_clauses *c = gfc_get_omp_clauses ();
>    locus old_loc;
> +  /* Determine whether we're dealing with an OpenACC directive that permits
> +     derived type member accesses.  This in particular disallows
> +     "!$acc declare" from using such accesses, because it's not clear if/how
> +     that should work.  */
> +  bool allow_derived = (openacc
> +			&& ((mask & OMP_CLAUSE_ATTACH)
> +			    || (mask & OMP_CLAUSE_DETACH)
> +			    || (mask & OMP_CLAUSE_HOST_SELF)));
>  
>    gcc_checking_assert (OMP_MASK1_LAST <= 64 && OMP_MASK2_LAST <= 64);
>    *cp = NULL;
> @@ -1012,6 +1026,11 @@ gfc_match_omp_clauses (gfc_omp_clauses **cp, const omp_mask mask,
>  	      needs_space = true;
>  	      continue;
>  	    }
> +	  if ((mask & OMP_CLAUSE_ATTACH)
> +	      && gfc_match ("attach ( ") == MATCH_YES
> +	      && gfc_match_omp_map_clause (&c->lists[OMP_LIST_MAP],
> +					   OMP_MAP_ATTACH, allow_derived))
> +	    continue;
>  	  break;
>  	case 'c':
>  	  if ((mask & OMP_CLAUSE_COLLAPSE)
> @@ -1039,7 +1058,7 @@ gfc_match_omp_clauses (gfc_omp_clauses **cp, const omp_mask mask,
>  	  if ((mask & OMP_CLAUSE_COPY)
>  	      && gfc_match ("copy ( ") == MATCH_YES
>  	      && gfc_match_omp_map_clause (&c->lists[OMP_LIST_MAP],
> -					   OMP_MAP_TOFROM))
> +					   OMP_MAP_TOFROM, allow_derived))
>  	    continue;
>  	  if (mask & OMP_CLAUSE_COPYIN)
>  	    {
> @@ -1047,7 +1066,7 @@ gfc_match_omp_clauses (gfc_omp_clauses **cp, const omp_mask mask,
>  		{
>  		  if (gfc_match ("copyin ( ") == MATCH_YES
>  		      && gfc_match_omp_map_clause (&c->lists[OMP_LIST_MAP],
> -						   OMP_MAP_TO))
> +						   OMP_MAP_TO, allow_derived))
>  		    continue;
>  		}
>  	      else if (gfc_match_omp_variable_list ("copyin (",
> @@ -1058,7 +1077,7 @@ gfc_match_omp_clauses (gfc_omp_clauses **cp, const omp_mask mask,
>  	  if ((mask & OMP_CLAUSE_COPYOUT)
>  	      && gfc_match ("copyout ( ") == MATCH_YES
>  	      && gfc_match_omp_map_clause (&c->lists[OMP_LIST_MAP],
> -					   OMP_MAP_FROM))
> +					   OMP_MAP_FROM, allow_derived))
>  	    continue;
>  	  if ((mask & OMP_CLAUSE_COPYPRIVATE)
>  	      && gfc_match_omp_variable_list ("copyprivate (",
> @@ -1068,7 +1087,7 @@ gfc_match_omp_clauses (gfc_omp_clauses **cp, const omp_mask mask,
>  	  if ((mask & OMP_CLAUSE_CREATE)
>  	      && gfc_match ("create ( ") == MATCH_YES
>  	      && gfc_match_omp_map_clause (&c->lists[OMP_LIST_MAP],
> -					   OMP_MAP_ALLOC))
> +					   OMP_MAP_ALLOC, allow_derived))
>  	    continue;
>  	  break;
>  	case 'd':
> @@ -1104,7 +1123,7 @@ gfc_match_omp_clauses (gfc_omp_clauses **cp, const omp_mask mask,
>  	  if ((mask & OMP_CLAUSE_DELETE)
>  	      && gfc_match ("delete ( ") == MATCH_YES
>  	      && gfc_match_omp_map_clause (&c->lists[OMP_LIST_MAP],
> -					   OMP_MAP_RELEASE))
> +					   OMP_MAP_RELEASE, allow_derived))
>  	    continue;
>  	  if ((mask & OMP_CLAUSE_DEPEND)
>  	      && gfc_match ("depend ( ") == MATCH_YES)
> @@ -1147,6 +1166,11 @@ gfc_match_omp_clauses (gfc_omp_clauses **cp, const omp_mask mask,
>  	      else
>  		gfc_current_locus = old_loc;
>  	    }
> +	  if ((mask & OMP_CLAUSE_DETACH)
> +	      && gfc_match ("detach ( ") == MATCH_YES
> +	      && gfc_match_omp_map_clause (&c->lists[OMP_LIST_MAP],
> +					   OMP_MAP_DETACH, allow_derived))
> +	    continue;
>  	  if ((mask & OMP_CLAUSE_DEVICE)
>  	      && !openacc
>  	      && c->device == NULL
> @@ -1156,12 +1180,13 @@ gfc_match_omp_clauses (gfc_omp_clauses **cp, const omp_mask mask,
>  	      && openacc
>  	      && gfc_match ("device ( ") == MATCH_YES
>  	      && gfc_match_omp_map_clause (&c->lists[OMP_LIST_MAP],
> -					   OMP_MAP_FORCE_TO))
> +					   OMP_MAP_FORCE_TO, allow_derived))
>  	    continue;
>  	  if ((mask & OMP_CLAUSE_DEVICEPTR)
>  	      && gfc_match ("deviceptr ( ") == MATCH_YES
>  	      && gfc_match_omp_map_clause (&c->lists[OMP_LIST_MAP],
> -					   OMP_MAP_FORCE_DEVICEPTR))
> +					   OMP_MAP_FORCE_DEVICEPTR,
> +					   allow_derived))
>  	    continue;
>  	  if ((mask & OMP_CLAUSE_DEVICE_RESIDENT)
>  	      && gfc_match_omp_variable_list
> @@ -1239,7 +1264,7 @@ gfc_match_omp_clauses (gfc_omp_clauses **cp, const omp_mask mask,
>  	  if ((mask & OMP_CLAUSE_HOST_SELF)
>  	      && gfc_match ("host ( ") == MATCH_YES
>  	      && gfc_match_omp_map_clause (&c->lists[OMP_LIST_MAP],
> -					   OMP_MAP_FORCE_FROM))
> +					   OMP_MAP_FORCE_FROM, allow_derived))
>  	    continue;
>  	  break;
>  	case 'i':
> @@ -1511,47 +1536,48 @@ gfc_match_omp_clauses (gfc_omp_clauses **cp, const omp_mask mask,
>  	  if ((mask & OMP_CLAUSE_COPY)
>  	      && gfc_match ("pcopy ( ") == MATCH_YES
>  	      && gfc_match_omp_map_clause (&c->lists[OMP_LIST_MAP],
> -					   OMP_MAP_TOFROM))
> +					   OMP_MAP_TOFROM, allow_derived))
>  	    continue;
>  	  if ((mask & OMP_CLAUSE_COPYIN)
>  	      && gfc_match ("pcopyin ( ") == MATCH_YES
>  	      && gfc_match_omp_map_clause (&c->lists[OMP_LIST_MAP],
> -					   OMP_MAP_TO))
> +					   OMP_MAP_TO, allow_derived))
>  	    continue;
>  	  if ((mask & OMP_CLAUSE_COPYOUT)
>  	      && gfc_match ("pcopyout ( ") == MATCH_YES
>  	      && gfc_match_omp_map_clause (&c->lists[OMP_LIST_MAP],
> -					   OMP_MAP_FROM))
> +					   OMP_MAP_FROM, allow_derived))
>  	    continue;
>  	  if ((mask & OMP_CLAUSE_CREATE)
>  	      && gfc_match ("pcreate ( ") == MATCH_YES
>  	      && gfc_match_omp_map_clause (&c->lists[OMP_LIST_MAP],
> -					   OMP_MAP_ALLOC))
> +					   OMP_MAP_ALLOC, allow_derived))
>  	    continue;
>  	  if ((mask & OMP_CLAUSE_PRESENT)
>  	      && gfc_match ("present ( ") == MATCH_YES
>  	      && gfc_match_omp_map_clause (&c->lists[OMP_LIST_MAP],
> -					   OMP_MAP_FORCE_PRESENT))
> +					   OMP_MAP_FORCE_PRESENT,
> +					   allow_derived))
>  	    continue;
>  	  if ((mask & OMP_CLAUSE_COPY)
>  	      && gfc_match ("present_or_copy ( ") == MATCH_YES
>  	      && gfc_match_omp_map_clause (&c->lists[OMP_LIST_MAP],
> -					   OMP_MAP_TOFROM))
> +					   OMP_MAP_TOFROM, allow_derived))
>  	    continue;
>  	  if ((mask & OMP_CLAUSE_COPYIN)
>  	      && gfc_match ("present_or_copyin ( ") == MATCH_YES
>  	      && gfc_match_omp_map_clause (&c->lists[OMP_LIST_MAP],
> -					   OMP_MAP_TO))
> +					   OMP_MAP_TO, allow_derived))
>  	    continue;
>  	  if ((mask & OMP_CLAUSE_COPYOUT)
>  	      && gfc_match ("present_or_copyout ( ") == MATCH_YES
>  	      && gfc_match_omp_map_clause (&c->lists[OMP_LIST_MAP],
> -					   OMP_MAP_FROM))
> +					   OMP_MAP_FROM, allow_derived))
>  	    continue;
>  	  if ((mask & OMP_CLAUSE_CREATE)
>  	      && gfc_match ("present_or_create ( ") == MATCH_YES
>  	      && gfc_match_omp_map_clause (&c->lists[OMP_LIST_MAP],
> -					   OMP_MAP_ALLOC))
> +					   OMP_MAP_ALLOC, allow_derived))
>  	    continue;
>  	  if ((mask & OMP_CLAUSE_PRIORITY)
>  	      && c->priority == NULL
> @@ -1669,8 +1695,8 @@ gfc_match_omp_clauses (gfc_omp_clauses **cp, const omp_mask mask,
>  
>  	      if (gfc_match_omp_variable_list (" :",
>  					       &c->lists[OMP_LIST_REDUCTION],
> -					       false, NULL, &head,
> -					       openacc) == MATCH_YES)
> +					       false, NULL, &head, openacc,
> +					       allow_derived) == MATCH_YES)
>  		{
>  		  gfc_omp_namelist *n;
>  		  if (rop == OMP_REDUCTION_NONE)
> @@ -1769,7 +1795,7 @@ gfc_match_omp_clauses (gfc_omp_clauses **cp, const omp_mask mask,
>  	  if ((mask & OMP_CLAUSE_HOST_SELF)
>  	      && gfc_match ("self ( ") == MATCH_YES
>  	      && gfc_match_omp_map_clause (&c->lists[OMP_LIST_MAP],
> -					   OMP_MAP_FORCE_FROM))
> +					   OMP_MAP_FORCE_FROM, allow_derived))
>  	    continue;
>  	  if ((mask & OMP_CLAUSE_SEQ)
>  	      && !c->seq
> @@ -1927,17 +1953,17 @@ gfc_match_omp_clauses (gfc_omp_clauses **cp, const omp_mask mask,
>     | OMP_CLAUSE_COPY | OMP_CLAUSE_COPYIN | OMP_CLAUSE_COPYOUT		      \
>     | OMP_CLAUSE_CREATE | OMP_CLAUSE_PRESENT | OMP_CLAUSE_DEVICEPTR	      \
>     | OMP_CLAUSE_PRIVATE | OMP_CLAUSE_FIRSTPRIVATE | OMP_CLAUSE_DEFAULT	      \
> -   | OMP_CLAUSE_WAIT)
> +   | OMP_CLAUSE_WAIT | OMP_CLAUSE_ATTACH)
>  #define OACC_KERNELS_CLAUSES \
>    (omp_mask (OMP_CLAUSE_IF) | OMP_CLAUSE_ASYNC | OMP_CLAUSE_NUM_GANGS	      \
>     | OMP_CLAUSE_NUM_WORKERS | OMP_CLAUSE_VECTOR_LENGTH | OMP_CLAUSE_DEVICEPTR \
>     | OMP_CLAUSE_COPY | OMP_CLAUSE_COPYIN | OMP_CLAUSE_COPYOUT		      \
>     | OMP_CLAUSE_CREATE | OMP_CLAUSE_PRESENT | OMP_CLAUSE_DEFAULT	      \
> -   | OMP_CLAUSE_WAIT)
> +   | OMP_CLAUSE_WAIT | OMP_CLAUSE_ATTACH)
>  #define OACC_DATA_CLAUSES \
>    (omp_mask (OMP_CLAUSE_IF) | OMP_CLAUSE_DEVICEPTR  | OMP_CLAUSE_COPY	      \
>     | OMP_CLAUSE_COPYIN | OMP_CLAUSE_COPYOUT | OMP_CLAUSE_CREATE		      \
> -   | OMP_CLAUSE_PRESENT)
> +   | OMP_CLAUSE_PRESENT | OMP_CLAUSE_ATTACH)
>  #define OACC_LOOP_CLAUSES \
>    (omp_mask (OMP_CLAUSE_COLLAPSE) | OMP_CLAUSE_GANG | OMP_CLAUSE_WORKER	      \
>     | OMP_CLAUSE_VECTOR | OMP_CLAUSE_SEQ | OMP_CLAUSE_INDEPENDENT	      \
> @@ -1958,10 +1984,11 @@ gfc_match_omp_clauses (gfc_omp_clauses **cp, const omp_mask mask,
>     | OMP_CLAUSE_DEVICE | OMP_CLAUSE_WAIT | OMP_CLAUSE_IF_PRESENT)
>  #define OACC_ENTER_DATA_CLAUSES \
>    (omp_mask (OMP_CLAUSE_IF) | OMP_CLAUSE_ASYNC | OMP_CLAUSE_WAIT	      \
> -   | OMP_CLAUSE_COPYIN | OMP_CLAUSE_CREATE)
> +   | OMP_CLAUSE_COPYIN | OMP_CLAUSE_CREATE | OMP_CLAUSE_ATTACH)
>  #define OACC_EXIT_DATA_CLAUSES \
>    (omp_mask (OMP_CLAUSE_IF) | OMP_CLAUSE_ASYNC | OMP_CLAUSE_WAIT	      \
> -   | OMP_CLAUSE_COPYOUT | OMP_CLAUSE_DELETE | OMP_CLAUSE_FINALIZE)
> +   | OMP_CLAUSE_COPYOUT | OMP_CLAUSE_DELETE | OMP_CLAUSE_FINALIZE	      \
> +   | OMP_CLAUSE_DETACH)
>  #define OACC_WAIT_CLAUSES \
>    omp_mask (OMP_CLAUSE_ASYNC)
>  #define OACC_ROUTINE_CLAUSES \
> @@ -3734,9 +3761,6 @@ resolve_nonnegative_int_expr (gfc_expr *expr, const char *clause)
>  static void
>  check_symbol_not_pointer (gfc_symbol *sym, locus loc, const char *name)
>  {
> -  if (sym->ts.type == BT_DERIVED && sym->attr.pointer)
> -    gfc_error ("POINTER object %qs of derived type in %s clause at %L",
> -	       sym->name, name, &loc);
>    if (sym->ts.type == BT_DERIVED && sym->attr.cray_pointer)
>      gfc_error ("Cray pointer object %qs of derived type in %s clause at %L",
>  	       sym->name, name, &loc);
> @@ -3781,9 +3805,6 @@ check_array_not_assumed (gfc_symbol *sym, locus loc, const char *name)
>  static void
>  resolve_oacc_data_clauses (gfc_symbol *sym, locus loc, const char *name)
>  {
> -  if (sym->ts.type == BT_DERIVED && sym->attr.allocatable)
> -    gfc_error ("ALLOCATABLE object %qs of derived type in %s clause at %L",
> -	       sym->name, name, &loc);
>    if ((sym->ts.type == BT_ASSUMED && sym->attr.allocatable)
>        || (sym->ts.type == BT_CLASS && CLASS_DATA (sym)
>  	  && CLASS_DATA (sym)->attr.allocatable))
> @@ -4153,11 +4174,23 @@ resolve_omp_clauses (gfc_code *code, gfc_omp_clauses *omp_clauses,
>  	&& (list != OMP_LIST_REDUCTION || !openacc))
>        for (n = omp_clauses->lists[list]; n; n = n->next)
>  	{
> -	  if (n->sym->mark)
> -	    gfc_error ("Symbol %qs present on multiple clauses at %L",
> -		       n->sym->name, &n->where);
> -	  else
> -	    n->sym->mark = 1;
> +	  bool array_only_p = true;
> +	  /* Disallow duplicate bare variable references and multiple
> +	     subarrays of the same array here, but allow multiple components of
> +	     the same (e.g. derived-type) variable.  For the latter, duplicate
> +	     components are detected elsewhere.  */
> +	  if (openacc && n->expr && n->expr->expr_type == EXPR_VARIABLE)
> +	    for (gfc_ref *ref = n->expr->ref; ref; ref = ref->next)
> +	      if (ref->type != REF_ARRAY)
> +		array_only_p = false;
> +	  if (array_only_p)
> +	    {
> +	      if (n->sym->mark)
> +		gfc_error ("Symbol %qs present on multiple clauses at %L",
> +			   n->sym->name, &n->where);
> +	      else
> +		n->sym->mark = 1;
> +	    }
>  	}
>  
>    gcc_assert (OMP_LIST_LASTPRIVATE == OMP_LIST_FIRSTPRIVATE + 1);
> @@ -4348,23 +4381,41 @@ resolve_omp_clauses (gfc_code *code, gfc_omp_clauses *omp_clauses,
>  				 "are allowed on ORDERED directive at %L",
>  				 &n->where);
>  		  }
> +		gfc_ref *array_ref = NULL;
> +		bool resolved = false;
>  		if (n->expr)
>  		  {
> -		    if (!gfc_resolve_expr (n->expr)
> +		    array_ref = n->expr->ref;
> +		    resolved = gfc_resolve_expr (n->expr);
> +
> +		    /* Look through component refs to find last array
> +		       reference.  */
> +		    while (resolved
> +			   && array_ref
> +			   && (array_ref->type == REF_COMPONENT
> +			       || (array_ref->type == REF_ARRAY
> +				   && array_ref->next
> +			           && array_ref->next->type == REF_COMPONENT)))
> +		      array_ref = array_ref->next;

I'd guard this stuff for OpenACC only, keep what it did for OpenMP.

> +		  }
> +		if (array_ref
> +		    || (n->expr
> +			&& (!resolved || n->expr->expr_type != EXPR_VARIABLE)))
> +		  {
> +		    if (!resolved
>  			|| n->expr->expr_type != EXPR_VARIABLE
> -			|| n->expr->ref == NULL
> -			|| n->expr->ref->next
> -			|| n->expr->ref->type != REF_ARRAY)
> +			|| array_ref->next
> +			|| array_ref->type != REF_ARRAY)
>  		      gfc_error ("%qs in %s clause at %L is not a proper "
>  				 "array section", n->sym->name, name,
>  				 &n->where);
> -		    else if (n->expr->ref->u.ar.codimen)
> +		    else if (array_ref->u.ar.codimen)
>  		      gfc_error ("Coarrays not supported in %s clause at %L",
>  				 name, &n->where);
>  		    else
>  		      {
>  			int i;
> -			gfc_array_ref *ar = &n->expr->ref->u.ar;
> +			gfc_array_ref *ar = &array_ref->u.ar;
>  			for (i = 0; i < ar->dimen; i++)
>  			  if (ar->stride[i])
>  			    {

> +		  /* For OpenACC, pointers in structs should trigger an
> +		     attach action.  */
> +		  if (ptr && (region_type & ORT_ACC) != 0)
> +		    {
> +		      /* Turning a GOMP_MAP_ALWAYS_POINTER clause into a
> +			 GOMP_MAP_ATTACH clause after we have detected a case
> +			 that needs a GOMP_MAP_STRUCT mapping adding.  */
> +		      OMP_CLAUSE_SET_MAP_KIND (c,
> +			(code == OACC_EXIT_DATA) ? GOMP_MAP_DETACH
> +						 : GOMP_MAP_ATTACH);

Bad formatting, I'd suggest use a temporary with gomp_map_kind type.

> +		      has_attachments = true;
> +		    }
>  		  if (n == NULL || (n->value & GOVD_MAP) == 0)
>  		    {
>  		      tree l = build_omp_clause (OMP_CLAUSE_LOCATION (c),
>  						 OMP_CLAUSE_MAP);
> -		      OMP_CLAUSE_SET_MAP_KIND (l, GOMP_MAP_STRUCT);
> +		      OMP_CLAUSE_SET_MAP_KIND (l, attach
> +			? GOMP_MAP_FORCE_PRESENT : GOMP_MAP_STRUCT);

Likewise.
>  		      if (!base_eq_orig_base)
>  			OMP_CLAUSE_DECL (l) = unshare_expr (orig_base);
>  		      else
>  			OMP_CLAUSE_DECL (l) = decl;
> -		      OMP_CLAUSE_SIZE (l) = size_int (1);
> +		      OMP_CLAUSE_SIZE (l) = attach
> +			? (DECL_P (OMP_CLAUSE_DECL (l))
> +			     ? DECL_SIZE_UNIT (OMP_CLAUSE_DECL (l))
> +			     : TYPE_SIZE_UNIT (TREE_TYPE (OMP_CLAUSE_DECL (l))))
> +			: size_int (1);

Again, bad formatting. = attach should be on a next line, the indentation
is also weird, best like:
		      OMP_CLAUSE_SIZE (l)
			= (!attach
			   ? size_int (1)
			   : DECL_P (OMP_CLAUSE_DECL (l))
			   ? DECL_SIZE_UNIT (OMP_CLAUSE_DECL (l))
			   : TYPE_SIZE_UNIT (TREE_TYPE (OMP_CLAUSE_DECL (l))));

>  		      if (struct_map_to_clause == NULL)
>  			struct_map_to_clause = new hash_map<tree, tree>;
>  		      struct_map_to_clause->put (decl, l);
> @@ -8681,9 +8713,11 @@ gimplify_scan_omp_clauses (tree *list_p, gimple_seq *pre_p,
>  		      flags = GOVD_MAP | GOVD_EXPLICIT;
>  		      if (GOMP_MAP_ALWAYS_P (OMP_CLAUSE_MAP_KIND (c)) || ptr)
>  			flags |= GOVD_SEEN;
> +		      if (has_attachments)
> +			flags |= GOVD_MAP_HAS_ATTACHMENTS;
>  		      goto do_add_decl;
>  		    }
> -		  else
> +		  else if (struct_map_to_clause)
>  		    {
>  		      tree *osc = struct_map_to_clause->get (decl);
>  		      tree *sc = NULL, *scp = NULL;
> @@ -8692,8 +8726,10 @@ gimplify_scan_omp_clauses (tree *list_p, gimple_seq *pre_p,
>  		      sc = &OMP_CLAUSE_CHAIN (*osc);
>  		      if (*sc != c
>  			  && (OMP_CLAUSE_MAP_KIND (*sc)
> -			      == GOMP_MAP_FIRSTPRIVATE_REFERENCE)) 
> +			      == GOMP_MAP_FIRSTPRIVATE_REFERENCE))
>  			sc = &OMP_CLAUSE_CHAIN (*sc);
> +		      /* Here "prev_list_p" is the end of the inserted
> +			 alloc/release nodes after the struct node, OSC.  */
>  		      for (; *sc != c; sc = &OMP_CLAUSE_CHAIN (*sc))
>  			if (ptr && sc == prev_list_p)
>  			  break;
> @@ -8752,9 +8788,10 @@ gimplify_scan_omp_clauses (tree *list_p, gimple_seq *pre_p,
>  			  }
>  		      if (remove)
>  			break;
> -		      OMP_CLAUSE_SIZE (*osc)
> -			= size_binop (PLUS_EXPR, OMP_CLAUSE_SIZE (*osc),
> -				      size_one_node);
> +		      if (!attach)
> +			OMP_CLAUSE_SIZE (*osc)
> +			  = size_binop (PLUS_EXPR, OMP_CLAUSE_SIZE (*osc),
> +					size_one_node);
>  		      if (ptr)
>  			{
>  			  tree cl
> @@ -8786,11 +8823,15 @@ gimplify_scan_omp_clauses (tree *list_p, gimple_seq *pre_p,
>  		}
>  	      if (!remove
>  		  && OMP_CLAUSE_MAP_KIND (c) != GOMP_MAP_ALWAYS_POINTER
> +		  && OMP_CLAUSE_MAP_KIND (c) != GOMP_MAP_TO_PSET
>  		  && OMP_CLAUSE_CHAIN (c)
>  		  && OMP_CLAUSE_CODE (OMP_CLAUSE_CHAIN (c)) == OMP_CLAUSE_MAP
> -		  && (OMP_CLAUSE_MAP_KIND (OMP_CLAUSE_CHAIN (c))
> -		      == GOMP_MAP_ALWAYS_POINTER))
> +		  && ((OMP_CLAUSE_MAP_KIND (OMP_CLAUSE_CHAIN (c))
> +		       == GOMP_MAP_ALWAYS_POINTER)
> +		      || (OMP_CLAUSE_MAP_KIND (OMP_CLAUSE_CHAIN (c))
> +		          == GOMP_MAP_TO_PSET)))
>  		prev_list_p = list_p;
> +
>  	      break;
>  	    }
>  	  flags = GOVD_MAP | GOVD_EXPLICIT;
> @@ -9412,6 +9453,8 @@ gimplify_adjust_omp_clauses_1 (splay_tree_node n, void *data)
>      return 0;
>    if ((flags & GOVD_SEEN) == 0)
>      return 0;
> +  if ((flags & GOVD_MAP_HAS_ATTACHMENTS) != 0)
> +    return 0;
>    if (flags & GOVD_DEBUG_PRIVATE)
>      {
>        gcc_assert ((flags & GOVD_DATA_SHARE_CLASS) == GOVD_SHARED);
> @@ -11795,8 +11838,9 @@ gimplify_omp_target_update (tree *expr_p, gimple_seq *pre_p)
>  	   && omp_find_clause (OMP_STANDALONE_CLAUSES (expr),
>  			       OMP_CLAUSE_FINALIZE))
>      {
> -      /* Use GOMP_MAP_DELETE/GOMP_MAP_FORCE_FROM to denote that "finalize"
> -	 semantics apply to all mappings of this OpenACC directive.  */
> +      /* Use GOMP_MAP_DELETE, GOMP_MAP_FORCE_DETACH, and
> +	 GOMP_MAP_FORCE_FROM to denote that "finalize" semantics apply
> +	 to all mappings of this OpenACC directive.  */
>        bool finalize_marked = false;
>        for (tree c = OMP_STANDALONE_CLAUSES (expr); c; c = OMP_CLAUSE_CHAIN (c))
>  	if (OMP_CLAUSE_CODE (c) == OMP_CLAUSE_MAP)
> @@ -11810,10 +11854,19 @@ gimplify_omp_target_update (tree *expr_p, gimple_seq *pre_p)
>  	      OMP_CLAUSE_SET_MAP_KIND (c, GOMP_MAP_DELETE);
>  	      finalize_marked = true;
>  	      break;
> +	    case GOMP_MAP_DETACH:
> +	      OMP_CLAUSE_SET_MAP_KIND (c, GOMP_MAP_FORCE_DETACH);
> +	      finalize_marked = true;
> +	      break;
> +	    case GOMP_MAP_STRUCT:
> +	    case GOMP_MAP_FORCE_PRESENT:
> +	      /* Skip over an initial struct or force_present mapping.  */
> +	      break;
>  	    default:
> -	      /* Check consistency: libgomp relies on the very first data
> -		 mapping clause being marked, so make sure we did that before
> -		 any other mapping clauses.  */
> +	      /* Check consistency: libgomp relies on the very first
> +		 non-struct, non-force-present data mapping clause being
> +		 marked, so make sure we did that before any other mapping
> +		 clauses.  */
>  	      gcc_assert (finalize_marked);
>  	      break;
>  	    }
> diff --git a/gcc/omp-low.c b/gcc/omp-low.c
> index ca78d7a..55dbc0b 100644
> --- a/gcc/omp-low.c
> +++ b/gcc/omp-low.c
> @@ -9138,6 +9138,9 @@ lower_omp_target (gimple_stmt_iterator *gsi_p, omp_context *ctx)
>  	  case GOMP_MAP_FORCE_DEVICEPTR:
>  	  case GOMP_MAP_DEVICE_RESIDENT:
>  	  case GOMP_MAP_LINK:
> +	  case GOMP_MAP_ATTACH:
> +	  case GOMP_MAP_DETACH:
> +	  case GOMP_MAP_FORCE_DETACH:
>  	    gcc_assert (is_gimple_omp_oacc (stmt));
>  	    break;
>  	  default:
> diff --git a/gcc/testsuite/c-c++-common/goacc/mdc-1.c b/gcc/testsuite/c-c++-common/goacc/mdc-1.c
> new file mode 100644
> index 0000000..84a44af
> --- /dev/null
> +++ b/gcc/testsuite/c-c++-common/goacc/mdc-1.c
> @@ -0,0 +1,54 @@
> +/* Test OpenACC's support for manual deep copy, including the attach
> +   and detach clauses.  */
> +
> +/* { dg-additional-options "-fdump-tree-omplower" } */
> +
> +void
> +t1 ()
> +{
> +  struct foo {
> +    int *a, *b, c, d, *e;
> +  } s;
> +
> +  int *a, *z;
> +
> +#pragma acc enter data copyin(s)
> +  {
> +#pragma acc data copy(s.a[0:10]) copy(z[0:10])
> +    {
> +      s.e = z;
> +#pragma acc parallel loop attach(s.e)
> +      for (int i = 0; i < 10; i++)
> +        s.a[i] = s.e[i];
> +
> +
> +      a = s.e;
> +#pragma acc enter data attach(a)
> +#pragma acc exit data detach(a)
> +    }
> +
> +#pragma acc enter data copyin(a)
> +#pragma acc acc enter data attach(s.e)
> +#pragma acc exit data detach(s.e)
> +
> +#pragma acc data attach(s.e)
> +    {
> +    }
> +#pragma acc exit data delete(a)
> +
> +#pragma acc exit data detach(a) finalize
> +#pragma acc exit data detach(s.a) finalize
> +  }
> +}
> +
> +/* { dg-final { scan-tree-dump-times "pragma omp target oacc_enter_exit_data map.to:s .len: 32.." 1 "omplower" } } */
> +/* { dg-final { scan-tree-dump-times "pragma omp target oacc_data map.tofrom:.z .len: 40.. map.struct:s .len: 1.. map.alloc:s.a .len: 8.. map.tofrom:._1 .len: 40.. map.attach:s.a .len: 0.." 1 "omplower" } } */
> +/* { dg-final { scan-tree-dump-times "pragma omp target oacc_parallel map.force_present:s .len: 32.. map.attach:s.e .len: 8.." 1 "omplower" } } */
> +/* { dg-final { scan-tree-dump-times "pragma omp target oacc_enter_exit_data map.attach:a .len: 8.." 1 "omplower" } } */
> +/* { dg-final { scan-tree-dump-times "pragma omp target oacc_enter_exit_data map.detach:a .len: 8.." 1 "omplower" } } */
> +/* { dg-final { scan-tree-dump-times "pragma omp target oacc_enter_exit_data map.to:a .len: 8.." 1 "omplower" } } */
> +/* { dg-final { scan-tree-dump-times "pragma omp target oacc_enter_exit_data map.force_present:s .len: 32.. map.detach:s.e .len: 8.." 1 "omplower" } } */
> +/* { dg-final { scan-tree-dump-times "pragma omp target oacc_data map.force_present:s .len: 32.. map.attach:s.e .len: 8.." 1 "omplower" } } */
> +/* { dg-final { scan-tree-dump-times "pragma omp target oacc_enter_exit_data map.release:a .len: 8.." 1 "omplower" } } */
> +/* { dg-final { scan-tree-dump-times "pragma omp target oacc_enter_exit_data finalize map.force_detach:a .len: 8.." 1 "omplower" } } */
> +/* { dg-final { scan-tree-dump-times "pragma omp target oacc_enter_exit_data finalize map.force_present:s .len: 32.. map.force_detach:s.a .len: 8.." 1 "omplower" } } */

Aren't the lengths here heavily dependent on the target?  E.g. if it depends
on sizeof (int) == 4, maybe the whole test needs to be guarded with { target int32 }

> @@ -918,8 +920,13 @@ struct splay_tree_key_s {
>    uintptr_t tgt_offset;
>    /* Reference count.  */
>    uintptr_t refcount;
> -  /* Dynamic reference count.  */
> -  uintptr_t dynamic_refcount;
> +  /* Reference counts beyond those that represent genuine references in the
> +     linked splay tree key/target memory structures, e.g. for multiple OpenACC
> +     "present increment" operations (via "acc enter data") refering to the same
> +     host-memory block.  */
> +  uintptr_t virtual_refcount;
> +  /* For a block with attached pointers, the attachment counters for each.  */
> +  unsigned short *attach_count;
>    /* Pointer to the original mapping of "omp declare target link" object.  */
>    splay_tree_key link_key;
>  };

This is something I'm worried about a lot, the nodes keep growing way too
much.  Is there a way to reuse some other field if it is of certain kind?

Also, why unsigned short, can you only attach 65535 times?

	Jakub
Julian Brown Dec. 10, 2018, 7:41 p.m. UTC | #3
Hi Jakub,

Thank you for the review! Here's a new version of the patch.

On Fri, 7 Dec 2018 14:50:19 +0100
Jakub Jelinek <jakub@redhat.com> wrote:

> On Fri, Nov 30, 2018 at 03:41:09AM -0800, Julian Brown wrote:
> > 	gcc/c-family/
> > 	* c-pragma.h (pragma_omp_clause): Add
> > PRAGMA_OACC_CLAUSE_ATTACH, PRAGMA_OACC_CLAUSE_DETACH.  
> ...
> > @@ -11804,9 +11808,12 @@ c_parser_omp_variable_list (c_parser
> > *parser, case OMP_CLAUSE_MAP:
> >  	    case OMP_CLAUSE_FROM:
> >  	    case OMP_CLAUSE_TO:
> > -	      while (c_parser_next_token_is (parser, CPP_DOT))
> > +	      while (c_parser_next_token_is (parser, CPP_DOT)
> > +		     || c_parser_next_token_is (parser, CPP_DEREF))
> >  		{
> >  		  location_t op_loc = c_parser_peek_token
> > (parser)->location;
> > +		  if (c_parser_next_token_is (parser, CPP_DEREF))
> > +		    t = build_simple_mem_ref (t);  
> 
> This change is not ok, if OpenACC allows it in clauses, OpenMP 4.5
> does not and OpenMP 5.0 allows arbitrary lvalues that will need to be
> handled differently (still unimplemented).  So, this needs to be
> guarded for OpenACC only (perhaps for selected OpenACC clauses)?

I've added an "allow_deref" parameter to those functions. I guess that
can be extended/beautified when OpenMP 5.0 support comes along.

> > @@ -12632,6 +12631,8 @@ handle_omp_array_sections_1 (tree c, tree
> > t, vec<tree> &types, }
> >  	      t = TREE_OPERAND (t, 0);
> >  	    }
> > +	  if (TREE_CODE (t) == MEM_REF)
> > +	    t = TREE_OPERAND (t, 0);  
> 
> Again, better guard this for OpenACC.  Maybe verify that
> mem_ref_offset is 0?

Done (though I'm not sure how to trigger that for testing!).

> > @@ -14163,6 +14214,8 @@ c_finish_omp_clauses (tree clauses, enum
> > c_omp_region_type ort) }
> >  	      if (remove)
> >  		break;
> > +	      if (TREE_CODE (t) == MEM_REF)
> > +		t = TREE_OPERAND (t, 0);  
> 
> Guard again?

OK.

> > @@ -31832,15 +31836,19 @@ cp_parser_omp_var_list_no_open (cp_parser
> > *parser, enum omp_clause_code kind, case OMP_CLAUSE_MAP:
> >  	    case OMP_CLAUSE_FROM:
> >  	    case OMP_CLAUSE_TO:
> > -	      while (cp_lexer_next_token_is (parser->lexer,
> > CPP_DOT))
> > +	      while (cp_lexer_next_token_is (parser->lexer,
> > CPP_DOT)
> > +		     || cp_lexer_next_token_is (parser->lexer,
> > CPP_DEREF))  
> 
> Ditto as for C.

Fixed, similarly.

> > @@ -4691,6 +4690,19 @@ handle_omp_array_sections_1 (tree c, tree t,
> > vec<tree> &types, if (low_bound == NULL_TREE)
> >      low_bound = integer_zero_node;
> >  
> > +  if (OMP_CLAUSE_CODE (c) == OMP_CLAUSE_MAP
> > +      && (OMP_CLAUSE_MAP_KIND (c) == GOMP_MAP_ATTACH
> > +	  || OMP_CLAUSE_MAP_KIND (c) == GOMP_MAP_DETACH))
> > +    {
> > +      if (length != integer_one_node)
> > +	{
> > +	  error_at (OMP_CLAUSE_LOCATION (c),
> > +		    OMP_CLAUSE_MAP_KIND (c) == GOMP_MAP_ATTACH
> > +		    ? "array section in %<attach%> clause"
> > +		    : "array section in %<detach%> clause");  
> 
> So, are any array sections invalid, including e.g. [0:1] or say
> [5:] where size of the array is 6 elts, or what exactly is invalid?

IIUC, the restriction is that a single "attach" or "detach" clause may
only refer to a single pointer, so (possibly multiple dimensions of)
slices with single items are permitted. I've adjusted the error message
accordingly.

> > +      if (TREE_CODE (type) != POINTER_TYPE)
> > +	{
> > +	  error_at (OMP_CLAUSE_LOCATION (c),
> > +		    OMP_CLAUSE_MAP_KIND (c) == GOMP_MAP_ATTACH
> > +		    ? "expected pointer in %<attach%> clause"
> > +		    : "expected pointer in %<detach%> clause");  
> 
> Perhaps you can use %qs and omp_clause_name [OMP_CLAUSE_CODE (c)] ?

Fixed.

> >  		  }
> > +		gfc_ref *array_ref = NULL;
> > +		bool resolved = false;
> >  		if (n->expr)
> >  		  {
> > -		    if (!gfc_resolve_expr (n->expr)
> > +		    array_ref = n->expr->ref;
> > +		    resolved = gfc_resolve_expr (n->expr);
> > +
> > +		    /* Look through component refs to find last
> > array
> > +		       reference.  */
> > +		    while (resolved
> > +			   && array_ref
> > +			   && (array_ref->type == REF_COMPONENT
> > +			       || (array_ref->type == REF_ARRAY
> > +				   && array_ref->next
> > +			           && array_ref->next->type ==
> > REF_COMPONENT)))
> > +		      array_ref = array_ref->next;  
> 
> I'd guard this stuff for OpenACC only, keep what it did for OpenMP.

I've added a guard.

> > +		  /* For OpenACC, pointers in structs should
> > trigger an
> > +		     attach action.  */
> > +		  if (ptr && (region_type & ORT_ACC) != 0)
> > +		    {
> > +		      /* Turning a GOMP_MAP_ALWAYS_POINTER clause
> > into a
> > +			 GOMP_MAP_ATTACH clause after we have
> > detected a case
> > +			 that needs a GOMP_MAP_STRUCT mapping
> > adding.  */
> > +		      OMP_CLAUSE_SET_MAP_KIND (c,
> > +			(code == OACC_EXIT_DATA) ? GOMP_MAP_DETACH
> > +						 :
> > GOMP_MAP_ATTACH);  
> 
> Bad formatting, I'd suggest use a temporary with gomp_map_kind type.

Fixed.

> > +		      has_attachments = true;
> > +		    }
> >  		  if (n == NULL || (n->value & GOVD_MAP) == 0)
> >  		    {
> >  		      tree l = build_omp_clause
> > (OMP_CLAUSE_LOCATION (c), OMP_CLAUSE_MAP);
> > -		      OMP_CLAUSE_SET_MAP_KIND (l, GOMP_MAP_STRUCT);
> > +		      OMP_CLAUSE_SET_MAP_KIND (l, attach
> > +			? GOMP_MAP_FORCE_PRESENT :
> > GOMP_MAP_STRUCT);  
> 
> Likewise.

Fixed.

> >  		      if (!base_eq_orig_base)
> >  			OMP_CLAUSE_DECL (l) = unshare_expr
> > (orig_base); else
> >  			OMP_CLAUSE_DECL (l) = decl;
> > -		      OMP_CLAUSE_SIZE (l) = size_int (1);
> > +		      OMP_CLAUSE_SIZE (l) = attach
> > +			? (DECL_P (OMP_CLAUSE_DECL (l))
> > +			     ? DECL_SIZE_UNIT (OMP_CLAUSE_DECL (l))
> > +			     : TYPE_SIZE_UNIT (TREE_TYPE
> > (OMP_CLAUSE_DECL (l))))
> > +			: size_int (1);  
> 
> Again, bad formatting. = attach should be on a next line, the
> indentation is also weird, best like:
> 		      OMP_CLAUSE_SIZE (l)
> 			= (!attach
> 			   ? size_int (1)
> 			   : DECL_P (OMP_CLAUSE_DECL (l))
> 			   ? DECL_SIZE_UNIT (OMP_CLAUSE_DECL (l))
> 			   : TYPE_SIZE_UNIT (TREE_TYPE
> (OMP_CLAUSE_DECL (l))));

Fixed.

> > +/* { dg-final { scan-tree-dump-times "pragma omp target
> > oacc_enter_exit_data map.to:s .len: 32.." 1 "omplower" } } */ +/*
> > { dg-final { scan-tree-dump-times "pragma omp target oacc_data
> > map.tofrom:.z .len: 40.. map.struct:s .len: 1.. map.alloc:s.a .len:
> > 8.. map.tofrom:._1 .len: 40.. map.attach:s.a .len: 0.." 1
> > "omplower" } } */ +/* { dg-final { scan-tree-dump-times "pragma omp
> > target oacc_parallel map.force_present:s .len: 32..
> > map.attach:s.e .len: 8.." 1 "omplower" } } */ +/* { dg-final
> > { scan-tree-dump-times "pragma omp target oacc_enter_exit_data
> > map.attach:a .len: 8.." 1 "omplower" } } */ +/* { dg-final
> > { scan-tree-dump-times "pragma omp target oacc_enter_exit_data
> > map.detach:a .len: 8.." 1 "omplower" } } */ +/* { dg-final
> > { scan-tree-dump-times "pragma omp target oacc_enter_exit_data
> > map.to:a .len: 8.." 1 "omplower" } } */ +/* { dg-final
> > { scan-tree-dump-times "pragma omp target oacc_enter_exit_data
> > map.force_present:s .len: 32.. map.detach:s.e .len: 8.." 1
> > "omplower" } } */ +/* { dg-final { scan-tree-dump-times "pragma omp
> > target oacc_data map.force_present:s .len: 32..
> > map.attach:s.e .len: 8.." 1 "omplower" } } */ +/* { dg-final
> > { scan-tree-dump-times "pragma omp target oacc_enter_exit_data
> > map.release:a .len: 8.." 1 "omplower" } } */ +/* { dg-final
> > { scan-tree-dump-times "pragma omp target oacc_enter_exit_data
> > finalize map.force_detach:a .len: 8.." 1 "omplower" } } */ +/*
> > { dg-final { scan-tree-dump-times "pragma omp target
> > oacc_enter_exit_data finalize map.force_present:s .len: 32..
> > map.force_detach:s.a .len: 8.." 1 "omplower" } } */  
> 
> Aren't the lengths here heavily dependent on the target?  E.g. if it
> depends on sizeof (int) == 4, maybe the whole test needs to be
> guarded with { target int32 }

Fixed (by adding the { target int32 }).

> > @@ -918,8 +920,13 @@ struct splay_tree_key_s {
> >    uintptr_t tgt_offset;
> >    /* Reference count.  */
> >    uintptr_t refcount;
> > -  /* Dynamic reference count.  */
> > -  uintptr_t dynamic_refcount;
> > +  /* Reference counts beyond those that represent genuine
> > references in the
> > +     linked splay tree key/target memory structures, e.g. for
> > multiple OpenACC
> > +     "present increment" operations (via "acc enter data")
> > refering to the same
> > +     host-memory block.  */
> > +  uintptr_t virtual_refcount;
> > +  /* For a block with attached pointers, the attachment counters
> > for each.  */
> > +  unsigned short *attach_count;
> >    /* Pointer to the original mapping of "omp declare target link"
> > object.  */ splay_tree_key link_key;
> >  };  
> 
> This is something I'm worried about a lot, the nodes keep growing way
> too much.  Is there a way to reuse some other field if it is of
> certain kind?

How about this -- it seems that the link_key is only used for OpenMP,
and the attach count is only needed for OpenACC. So the obvious thing
to do is probably to put those two together into a tagged union. The
question is where to put the tag?

Options are, I guess:

1. The high or low bits of the address.  Potentially non-portable, ugly.

2. Or, the virtual refcount is also only needed for OpenACC, so we can
   reserve a magic value for that field to act as a tag.

I've tried implementing the latter in the attached patch, and it seems
to work OK.

It occurs to me that "uintptr_t" is probably overkill for each of the
refcounts: we could probably reduce each of those to int32_t without
loss of too much generality, if we wanted to make further savings.

> Also, why unsigned short, can you only attach 65535 times?

A probably-misguided attempt to keep memory usage low (there's no such
restriction in the spec). I've switched this to uintptr_t, in line with
the current refcount sizes elsewhere.

Re-tested with offloading to nvptx. OK?

Thanks,

Julian

ChangeLog

	gcc/c-family/
	* c-pragma.h (pragma_omp_clause): Add PRAGMA_OACC_CLAUSE_ATTACH,
	PRAGMA_OACC_CLAUSE_DETACH.

	gcc/c/
	* c-parser.c (c_parser_omp_clause_name): Add parsing of attach and
	detach clauses.
	(c_parser_omp_variable_list): Add ALLOW_DEREF parameter.  Allow deref
	(->) in variable lists if true.
	(c_parser_omp_var_list_parens): Add ALLOW_DEREF optional parameter. 
	Pass to c_parser_omp_variable_list.
	(c_parser_oacc_data_clause): Support attach and detach clauses.  Update
	call to c_parser_omp_variable_list.
	(c_parser_omp_clause_reduction, c_parser_omp_clause_aligned)
	(c_parser_omp_clause_linear, c_parser_omp_clause_depend)
	(c_parser_omp_clause_map, c_parser_omp_clause_uniform): Update calls to
	c_parser_omp_variable_list.
	(c_parser_oacc_all_clauses): Support attach and detach clauses.
	(OACC_DATA_CLAUSE_MASK, OACC_ENTER_DATA_CLAUSE_MASK)
	(OACC_KERNELS_CLAUSE_MASK, OACC_PARALLEL_CLAUSE_MASK): Add
	PRAGMA_OACC_CLAUSE_ATTACH.
	(OACC_EXIT_DATA_CLAUSE_MASK): Add PRAGMA_OACC_CLAUSE_DETACH.
	* c-typeck.c (handle_omp_array_sections_1): Reject subarrays for attach
	and detach.  Support deref.
	(c_oacc_check_attachments): New function.
	(c_finish_omp_clauses): Check attach/detach arguments for being
	pointers using above.  Support deref.

	gcc/cp/
	* parser.c (cp_parser_omp_clause_name): Support attach and detach
	clauses.
	(cp_parser_omp_var_list_no_open): Add ALLOW_DEREF parameter.
	Parse deref if true.
	(cp_parser_omp_var_list): Add ALLOW_DEREF optional parameter.  Pass to
	cp_parser_omp_var_list_no_open.
	(cp_parser_oacc_data_clause): Support attach and detach clauses.  
	Update call to cp_parser_omp_var_list_no_open.
	(cp_parser_omp_clause_aligned, cp_parser_omp_clause_lastprivate)
	(cp_parser_omp_clause_linear, cp_parser_omp_clause_map): Update calls
	to cp_parser_omp_var_list_no_open.
	(cp_parser_oacc_all_clauses): Support attach and detach.
	(OACC_DATA_CLAUSE_MASK, OACC_ENTER_DATA_CLAUSE_MASK)
	(OACC_KERNELS_CLAUSE_MASK, OACC_PARALLEL_CLAUSE_MASK): Add
	PRAGMA_OACC_CLAUSE_ATTACH.
	(OACC_EXIT_DATA_CLAUSE_MASK): Add PRAGMA_OACC_CLAUSE_DETACH.
	* semantics.c (handle_omp_array_sections_1): Reject subarrays for
	attach and detach.
	(cp_oacc_check_attachments): New function.
	(finish_omp_clauses): Use above function.  Allow structure fields and
	class members to appear in OpenACC data clauses.  Support deref.

	gcc/fortran/
	* gfortran.h (gfc_omp_map_op): Add OMP_MAP_ATTACH, OMP_MAP_DETACH.
	* openmp.c (gfc_match_omp_variable_list): Add allow_derived parameter.
	Parse derived-type member accesses if true.
	(omp_mask2): Add OMP_CLAUSE_ATTACH, OMP_CLAUSE_DETACH.
	(gfc_match_omp_map_clause): Add allow_derived parameter.  Pass to
	gfc_match_omp_variable_list.
	(gfc_match_omp_clauses): Support attach and detach.  Support derived
	types for appropriate OpenACC directives.
	(OACC_PARALLEL_CLAUSES, OACC_KERNELS_CLAUSES, OACC_DATA_CLAUSES)
	(OACC_ENTER_DATA_CLAUSES): Add OMP_CLAUSE_ATTACH.
	(OACC_EXIT_DATA_CLAUSES): Add OMP_CLAUSE_DETACH.
	(check_symbol_not_pointer): Don't disallow pointer objects of derived
	type.
	(resolve_oacc_data_clauses): Don't disallow allocatable derived types.
	(resolve_omp_clauses): Perform duplicate checking only for non-derived
	type component accesses (plain variables and arrays or array sections).
	Support component refs.
	* trans-openmp.c (gfc_omp_privatize_by_reference): Support component
	refs.
	(gfc_trans_omp_clauses): Support component refs, attach and detach
	clauses.

	gcc/
	* gimplify.c (gimplify_omp_var_data): Add GOVD_MAP_HAS_ATTACHMENTS.
	(insert_struct_component_mapping): Support derived-type member mappings
	for arrays with descriptors which use GOMP_MAP_TO_PSET.
	(gimplify_scan_omp_clauses): Rewrite GOMP_MAP_ALWAYS_POINTER to
	GOMP_MAP_ATTACH for OpenACC struct/derived-type component pointers.
	Handle pointer mappings that use GOMP_MAP_TO_PSET.  Handle attach/detach
	clauses.
	(gimplify_adjust_omp_clauses_1): Skip adjustments for explicit
	attach/detach clauses.
	(gimplify_omp_target_update): Handle finalize for detach.
	* omp-low.c (lower_omp_target): Support GOMP_MAP_ATTACH,
	GOMP_MAP_DETACH, GOMP_MAP_FORCE_DETACH.
	* tree-pretty-print.c (dump_omp_clause): Likewise.

	gcc/include/
	* gomp-constants.h (GOMP_MAP_DEEP_COPY): Define.
	(gomp_map_kind): Add GOMP_MAP_ATTACH, GOMP_MAP_DETACH,
	GOMP_MAP_FORCE_DETACH.

	gcc/testsuite/
	* c-c++-common/goacc/mdc-1.c: New test.
	* c-c++-common/goacc/mdc-2.c: New test.
	* gcc.dg/goacc/mdc.C: New test.
	* gfortran.dg/goacc/data-clauses.f95: New test.
	* gfortran.dg/goacc/derived-types.f90: New test.
	* gfortran.dg/goacc/enter-exit-data.f95: New test.

	libgomp/
	* libgomp.h (struct target_var_desc): Add do_detach flag.
	(VREFCOUNT_LINK_KEY): New macro.
	(struct splay_tree_key_s): Put link_key and new attach_count field into
	a new union.  Substitute dynamic_refcount field for virtual_refcount.
	(struct acc_dispatch_t): Remove data_environ field.
	(enum gomp_map_vars_kind): Add GOMP_MAP_VARS_OPENACC_ENTER_DATA.
	(gomp_acc_insert_pointer): Remove prototype.
	(gomp_acc_remove_pointer): Update prototype.
	(struct gomp_coalesce_buf): Add forward declaration.
	(gomp_map_val, gomp_attach_pointer, gomp_detach_pointer): Add
	prototypes.
	* libgomp.map (OACC_2.6): New section. Add acc_attach, acc_attach_async,
	acc_detach, acc_detach_async, acc_detach_finalize,
	acc_detach_finalize_async.
	* oacc-async.c (goacc_remove_var_async): New function.
	* oacc-host.c (host_dispatch): Don't initialise removed data_environ
	field.
	* oacc-init.c (acc_shutdown_1): Use gomp_remove_var instead of
	gomp_unmap_vars to remove mappings by splay tree key instead of target
	memory descriptor.
	* oacc-int.h (splay_tree_key_s): Add forward declaration.
	(goacc_remove_var_async): Add prototype.
	* oacc-mem.c (lookup_dev_1): New function.
	(lookup_dev): Reimplement using above.
	(acc_free, acc_hostptr): Update calls to lookup_dev.
	(acc_map_data): Likewise.  Don't add to data_environ list.
	(acc_unmap_data): Remove call to gomp_unmap_vars.  Fix semantics to
	remove mapping, but not mapped data.
	(present_create_copy): Use virtual_refcount instead of
	dynamic_refcount.  Don't manipulate data_environ.  Fix target pointer
	return value.
	(delete_copyout): Update for virtual_refcount semantics.  Use
	goacc_remove_var_async for asynchronous delete/copyouts.
	(gomp_acc_insert_pointer): Remove function.
	(gomp_acc_remove_pointer): Reimplement.
	(acc_attach_async, acc_attach, goacc_detach_internal, acc_detach)
	(acc_detach_async, acc_detach_finalize, acc_detach_finalize_async): New
	functions.
	* oacc-parallel.c (find_pointer): Support attach/detach.  Make a little
	more strict.
	(GOACC_parallel_keyed): Use gomp_map_val to calculate device addresses.
	(GOACC_enter_exit_data): Support attach/detach and GOMP_MAP_STRUCT.
	Don't call gomp_acc_insert_pointer.
	* openacc.h (acc_attach, acc_attach_async, acc_detach)
	(acc_detach_async, acc_detach_finalize, acc_detach_finalize_async): Add
	prototypes.
	* target.c (limits.h): Include.
	(gomp_map_vars_existing): Initialise do_detach field of tgt_var_desc.
	(gomp_attach_pointer, gomp_detach_pointer): New functions.
	(gomp_map_val): Make global.
	(gomp_map_vars_async): Handle GOMP_MAP_VARS_OPENACC_ENTER_DATA.  Update
	for virtual_refcount semantics.  Support attach and detach.
	(gomp_remove_var): Free attach count array if present.
	(gomp_unmap_vars_async): Support detach and update for virtual_refcount
	semantics.  Disambiguate link_key/attach_count using virtual_refcount
	with magic value as a tag.
	(gomp_load_image_to_device): Zero-initialise virtual_refcount fields.
	(gomp_free_memmap): Remove function.
	(gomp_exit_data): Check virtual_refcount for tag value before using
	link_key.
	(omp_target_associate_ptr): Zero-initialise virtual_refcount and
	link_key splay tree key fields.
	(gomp_target_init): Don't initialise removed data_environ field.
	* testsuite/libgomp.oacc-c-c++-common/context-2.c: Use correct API to
	deallocate acc_copyin'd data.
	* testsuite/libgomp.oacc-c-c++-common/context-4.c: Likewise.
	* testsuite/libgomp.oacc-c-c++-common/deep-copy-1.c: New test.
	* testsuite/libgomp.oacc-c-c++-common/deep-copy-2.c: New test.
	* testsuite/libgomp.oacc-c-c++-common/deep-copy-3.c: New test.
	* testsuite/libgomp.oacc-c-c++-common/deep-copy-4.c: New test.
	* testsuite/libgomp.oacc-c-c++-common/deep-copy-5.c: New test.
	* testsuite/libgomp.oacc-c-c++-common/deep-copy-6.c: New test.
	* testsuite/libgomp.oacc-c-c++-common/deep-copy-7.c: New test.
	* testsuite/libgomp.oacc-c-c++-common/deep-copy-8.c: New test.
	* testsuite/libgomp.oacc-fortran/deep-copy-1.c: New test.
	* testsuite/libgomp.oacc-fortran/deep-copy-2.c: New test.
	* testsuite/libgomp.oacc-fortran/deep-copy-3.c: New test.
	* testsuite/libgomp.oacc-fortran/deep-copy-4.c: New test.
	* testsuite/libgomp.oacc-fortran/deep-copy-5.c: New test.
	* testsuite/libgomp.oacc-fortran/deep-copy-6.c: New test.
	* testsuite/libgomp.oacc-fortran/deep-copy-7.c: New test.
	* testsuite/libgomp.oacc-fortran/deep-copy-8.c: New test.
	* testsuite/libgomp.oacc-fortran/data-2.f90: Update test.
	* testsuite/libgomp.oacc-fortran/derived-type-1.f90: New test.
	* testsuite/libgomp.oacc-fortran/update-2.f90: New test.
commit a002588d38ed34df7a20cedd5fbe4ed9c4ee6094
Author: Julian Brown <julian@codesourcery.com>
Date:   Mon Nov 5 16:12:09 2018 -0800

    OpenACC 2.6 manual deep copy support (attach/detach)
    
    	gcc/c-family/
    	* c-pragma.h (pragma_omp_clause): Add PRAGMA_OACC_CLAUSE_ATTACH,
    	PRAGMA_OACC_CLAUSE_DETACH.
    
    	gcc/c/
    	* c-parser.c (c_parser_omp_clause_name): Add parsing of attach and
    	detach clauses.
    	(c_parser_omp_variable_list): Add ALLOW_DEREF parameter.  Allow deref
    	(->) in variable lists if true.
    	(c_parser_omp_var_list_parens): Add ALLOW_DEREF optional parameter.
    	Pass to c_parser_omp_variable_list.
    	(c_parser_oacc_data_clause): Support attach and detach clauses.  Update
    	call to c_parser_omp_variable_list.
    	(c_parser_omp_clause_reduction, c_parser_omp_clause_aligned)
    	(c_parser_omp_clause_linear, c_parser_omp_clause_depend)
    	(c_parser_omp_clause_map, c_parser_omp_clause_uniform): Update calls to
    	c_parser_omp_variable_list.
    	(c_parser_oacc_all_clauses): Support attach and detach clauses.
    	(OACC_DATA_CLAUSE_MASK, OACC_ENTER_DATA_CLAUSE_MASK)
    	(OACC_KERNELS_CLAUSE_MASK, OACC_PARALLEL_CLAUSE_MASK): Add
    	PRAGMA_OACC_CLAUSE_ATTACH.
    	(OACC_EXIT_DATA_CLAUSE_MASK): Add PRAGMA_OACC_CLAUSE_DETACH.
    	* c-typeck.c (handle_omp_array_sections_1): Reject subarrays for attach
    	and detach.  Support deref.
    	(c_oacc_check_attachments): New function.
    	(c_finish_omp_clauses): Check attach/detach arguments for being
    	pointers using above.  Support deref.
    
    	gcc/cp/
    	* parser.c (cp_parser_omp_clause_name): Support attach and detach
    	clauses.
    	(cp_parser_omp_var_list_no_open): Add ALLOW_DEREF parameter.
    	Parse deref if true.
    	(cp_parser_omp_var_list): Add ALLOW_DEREF optional parameter.  Pass to
    	cp_parser_omp_var_list_no_open.
    	(cp_parser_oacc_data_clause): Support attach and detach clauses.
    	Update call to cp_parser_omp_var_list_no_open.
    	(cp_parser_omp_clause_aligned, cp_parser_omp_clause_lastprivate)
    	(cp_parser_omp_clause_linear, cp_parser_omp_clause_map): Update calls
    	to cp_parser_omp_var_list_no_open.
    	(cp_parser_oacc_all_clauses): Support attach and detach.
    	(OACC_DATA_CLAUSE_MASK, OACC_ENTER_DATA_CLAUSE_MASK)
    	(OACC_KERNELS_CLAUSE_MASK, OACC_PARALLEL_CLAUSE_MASK): Add
    	PRAGMA_OACC_CLAUSE_ATTACH.
    	(OACC_EXIT_DATA_CLAUSE_MASK): Add PRAGMA_OACC_CLAUSE_DETACH.
    	* semantics.c (handle_omp_array_sections_1): Reject subarrays for
    	attach and detach.
    	(cp_oacc_check_attachments): New function.
    	(finish_omp_clauses): Use above function.  Allow structure fields and
    	class members to appear in OpenACC data clauses.  Support deref.
    
    	gcc/fortran/
    	* gfortran.h (gfc_omp_map_op): Add OMP_MAP_ATTACH, OMP_MAP_DETACH.
    	* openmp.c (gfc_match_omp_variable_list): Add allow_derived parameter.
    	Parse derived-type member accesses if true.
    	(omp_mask2): Add OMP_CLAUSE_ATTACH, OMP_CLAUSE_DETACH.
    	(gfc_match_omp_map_clause): Add allow_derived parameter.  Pass to
    	gfc_match_omp_variable_list.
    	(gfc_match_omp_clauses): Support attach and detach.  Support derived
    	types for appropriate OpenACC directives.
    	(OACC_PARALLEL_CLAUSES, OACC_KERNELS_CLAUSES, OACC_DATA_CLAUSES)
    	(OACC_ENTER_DATA_CLAUSES): Add OMP_CLAUSE_ATTACH.
    	(OACC_EXIT_DATA_CLAUSES): Add OMP_CLAUSE_DETACH.
    	(check_symbol_not_pointer): Don't disallow pointer objects of derived
    	type.
    	(resolve_oacc_data_clauses): Don't disallow allocatable derived types.
    	(resolve_omp_clauses): Perform duplicate checking only for non-derived
    	type component accesses (plain variables and arrays or array sections).
    	Support component refs.
    	* trans-openmp.c (gfc_omp_privatize_by_reference): Support component
    	refs.
    	(gfc_trans_omp_clauses): Support component refs, attach and detach
    	clauses.
    
    	gcc/
    	* gimplify.c (gimplify_omp_var_data): Add GOVD_MAP_HAS_ATTACHMENTS.
    	(insert_struct_component_mapping): Support derived-type member mappings
    	for arrays with descriptors which use GOMP_MAP_TO_PSET.
    	(gimplify_scan_omp_clauses): Rewrite GOMP_MAP_ALWAYS_POINTER to
    	GOMP_MAP_ATTACH for OpenACC struct/derived-type component pointers.
    	Handle pointer mappings that use GOMP_MAP_TO_PSET.  Handle attach/detach
    	clauses.
    	(gimplify_adjust_omp_clauses_1): Skip adjustments for explicit
    	attach/detach clauses.
    	(gimplify_omp_target_update): Handle finalize for detach.
    	* omp-low.c (lower_omp_target): Support GOMP_MAP_ATTACH,
    	GOMP_MAP_DETACH, GOMP_MAP_FORCE_DETACH.
    	* tree-pretty-print.c (dump_omp_clause): Likewise.
    
    	gcc/include/
    	* gomp-constants.h (GOMP_MAP_DEEP_COPY): Define.
    	(gomp_map_kind): Add GOMP_MAP_ATTACH, GOMP_MAP_DETACH,
    	GOMP_MAP_FORCE_DETACH.
    
    	gcc/testsuite/
    	* c-c++-common/goacc/mdc-1.c: New test.
    	* c-c++-common/goacc/mdc-2.c: New test.
    	* gcc.dg/goacc/mdc.C: New test.
    	* gfortran.dg/goacc/data-clauses.f95: New test.
    	* gfortran.dg/goacc/derived-types.f90: New test.
    	* gfortran.dg/goacc/enter-exit-data.f95: New test.
    
    	libgomp/
    	* libgomp.h (struct target_var_desc): Add do_detach flag.
    	(VREFCOUNT_LINK_KEY): New macro.
    	(struct splay_tree_key_s): Put link_key and new attach_count field into
    	a new union.  Substitute dynamic_refcount field for virtual_refcount.
    	(struct acc_dispatch_t): Remove data_environ field.
    	(enum gomp_map_vars_kind): Add GOMP_MAP_VARS_OPENACC_ENTER_DATA.
    	(gomp_acc_insert_pointer): Remove prototype.
    	(gomp_acc_remove_pointer): Update prototype.
    	(struct gomp_coalesce_buf): Add forward declaration.
    	(gomp_map_val, gomp_attach_pointer, gomp_detach_pointer): Add
    	prototypes.
    	* libgomp.map (OACC_2.6): New section. Add acc_attach, acc_attach_async,
    	acc_detach, acc_detach_async, acc_detach_finalize,
    	acc_detach_finalize_async.
    	* oacc-async.c (goacc_remove_var_async): New function.
    	* oacc-host.c (host_dispatch): Don't initialise removed data_environ
    	field.
    	* oacc-init.c (acc_shutdown_1): Use gomp_remove_var instead of
    	gomp_unmap_vars to remove mappings by splay tree key instead of target
    	memory descriptor.
    	* oacc-int.h (splay_tree_key_s): Add forward declaration.
    	(goacc_remove_var_async): Add prototype.
    	* oacc-mem.c (lookup_dev_1): New function.
    	(lookup_dev): Reimplement using above.
    	(acc_free, acc_hostptr): Update calls to lookup_dev.
    	(acc_map_data): Likewise.  Don't add to data_environ list.
    	(acc_unmap_data): Remove call to gomp_unmap_vars.  Fix semantics to
    	remove mapping, but not mapped data.
    	(present_create_copy): Use virtual_refcount instead of
    	dynamic_refcount.  Don't manipulate data_environ.  Fix target pointer
    	return value.
    	(delete_copyout): Update for virtual_refcount semantics.  Use
    	goacc_remove_var_async for asynchronous delete/copyouts.
    	(gomp_acc_insert_pointer): Remove function.
    	(gomp_acc_remove_pointer): Reimplement.
    	(acc_attach_async, acc_attach, goacc_detach_internal, acc_detach)
    	(acc_detach_async, acc_detach_finalize, acc_detach_finalize_async): New
    	functions.
    	* oacc-parallel.c (find_pointer): Support attach/detach.  Make a little
    	more strict.
    	(GOACC_parallel_keyed): Use gomp_map_val to calculate device addresses.
    	(GOACC_enter_exit_data): Support attach/detach and GOMP_MAP_STRUCT.
    	Don't call gomp_acc_insert_pointer.
    	* openacc.h (acc_attach, acc_attach_async, acc_detach)
    	(acc_detach_async, acc_detach_finalize, acc_detach_finalize_async): Add
    	prototypes.
    	* target.c (limits.h): Include.
    	(gomp_map_vars_existing): Initialise do_detach field of tgt_var_desc.
    	(gomp_attach_pointer, gomp_detach_pointer): New functions.
    	(gomp_map_val): Make global.
    	(gomp_map_vars_async): Handle GOMP_MAP_VARS_OPENACC_ENTER_DATA.  Update
    	for virtual_refcount semantics.  Support attach and detach.
    	(gomp_remove_var): Free attach count array if present.
    	(gomp_unmap_vars_async): Support detach and update for virtual_refcount
    	semantics.  Disambiguate link_key/attach_count using virtual_refcount
    	with magic value as a tag.
    	(gomp_load_image_to_device): Zero-initialise virtual_refcount fields.
    	(gomp_free_memmap): Remove function.
    	(gomp_exit_data): Check virtual_refcount for tag value before using
    	link_key.
    	(omp_target_associate_ptr): Zero-initialise virtual_refcount and
    	link_key splay tree key fields.
    	(gomp_target_init): Don't initialise removed data_environ field.
    	* testsuite/libgomp.oacc-c-c++-common/context-2.c: Use correct API to
    	deallocate acc_copyin'd data.
    	* testsuite/libgomp.oacc-c-c++-common/context-4.c: Likewise.
    	* testsuite/libgomp.oacc-c-c++-common/deep-copy-1.c: New test.
    	* testsuite/libgomp.oacc-c-c++-common/deep-copy-2.c: New test.
    	* testsuite/libgomp.oacc-c-c++-common/deep-copy-3.c: New test.
    	* testsuite/libgomp.oacc-c-c++-common/deep-copy-4.c: New test.
    	* testsuite/libgomp.oacc-c-c++-common/deep-copy-5.c: New test.
    	* testsuite/libgomp.oacc-c-c++-common/deep-copy-6.c: New test.
    	* testsuite/libgomp.oacc-c-c++-common/deep-copy-7.c: New test.
    	* testsuite/libgomp.oacc-c-c++-common/deep-copy-8.c: New test.
    	* testsuite/libgomp.oacc-fortran/deep-copy-1.c: New test.
    	* testsuite/libgomp.oacc-fortran/deep-copy-2.c: New test.
    	* testsuite/libgomp.oacc-fortran/deep-copy-3.c: New test.
    	* testsuite/libgomp.oacc-fortran/deep-copy-4.c: New test.
    	* testsuite/libgomp.oacc-fortran/deep-copy-5.c: New test.
    	* testsuite/libgomp.oacc-fortran/deep-copy-6.c: New test.
    	* testsuite/libgomp.oacc-fortran/deep-copy-7.c: New test.
    	* testsuite/libgomp.oacc-fortran/deep-copy-8.c: New test.
    	* testsuite/libgomp.oacc-fortran/data-2.f90: Update test.
    	* testsuite/libgomp.oacc-fortran/derived-type-1.f90: New test.
    	* testsuite/libgomp.oacc-fortran/update-2.f90: New test.

diff --git a/gcc/c-family/c-pragma.h b/gcc/c-family/c-pragma.h
index b781f73..dd8d807 100644
--- a/gcc/c-family/c-pragma.h
+++ b/gcc/c-family/c-pragma.h
@@ -136,11 +136,13 @@ enum pragma_omp_clause {
 
   /* Clauses for OpenACC.  */
   PRAGMA_OACC_CLAUSE_ASYNC,
+  PRAGMA_OACC_CLAUSE_ATTACH,
   PRAGMA_OACC_CLAUSE_AUTO,
   PRAGMA_OACC_CLAUSE_COPY,
   PRAGMA_OACC_CLAUSE_COPYOUT,
   PRAGMA_OACC_CLAUSE_CREATE,
   PRAGMA_OACC_CLAUSE_DELETE,
+  PRAGMA_OACC_CLAUSE_DETACH,
   PRAGMA_OACC_CLAUSE_DEVICEPTR,
   PRAGMA_OACC_CLAUSE_DEVICE_RESIDENT,
   PRAGMA_OACC_CLAUSE_FINALIZE,
diff --git a/gcc/c/c-parser.c b/gcc/c/c-parser.c
index fef8594..a8d0062 100644
--- a/gcc/c/c-parser.c
+++ b/gcc/c/c-parser.c
@@ -11599,6 +11599,8 @@ c_parser_omp_clause_name (c_parser *parser)
 	    result = PRAGMA_OMP_CLAUSE_ALIGNED;
 	  else if (!strcmp ("async", p))
 	    result = PRAGMA_OACC_CLAUSE_ASYNC;
+	  else if (!strcmp ("attach", p))
+	    result = PRAGMA_OACC_CLAUSE_ATTACH;
 	  break;
 	case 'c':
 	  if (!strcmp ("collapse", p))
@@ -11621,6 +11623,8 @@ c_parser_omp_clause_name (c_parser *parser)
 	    result = PRAGMA_OACC_CLAUSE_DELETE;
 	  else if (!strcmp ("depend", p))
 	    result = PRAGMA_OMP_CLAUSE_DEPEND;
+	  else if (!strcmp ("detach", p))
+	    result = PRAGMA_OACC_CLAUSE_DETACH;
 	  else if (!strcmp ("device", p))
 	    result = PRAGMA_OMP_CLAUSE_DEVICE;
 	  else if (!strcmp ("deviceptr", p))
@@ -11870,7 +11874,8 @@ c_parser_oacc_wait_list (c_parser *parser, location_t clause_loc, tree list)
 static tree
 c_parser_omp_variable_list (c_parser *parser,
 			    location_t clause_loc,
-			    enum omp_clause_code kind, tree list)
+			    enum omp_clause_code kind, tree list,
+			    bool allow_deref)
 {
   auto_vec<c_token> tokens;
   unsigned int tokens_avail = 0;
@@ -11983,9 +11988,13 @@ c_parser_omp_variable_list (c_parser *parser,
 	    case OMP_CLAUSE_MAP:
 	    case OMP_CLAUSE_FROM:
 	    case OMP_CLAUSE_TO:
-	      while (c_parser_next_token_is (parser, CPP_DOT))
+	      while (c_parser_next_token_is (parser, CPP_DOT)
+		     || (allow_deref
+			 && c_parser_next_token_is (parser, CPP_DEREF)))
 		{
 		  location_t op_loc = c_parser_peek_token (parser)->location;
+		  if (c_parser_next_token_is (parser, CPP_DEREF))
+		    t = build_simple_mem_ref (t);
 		  c_parser_consume_token (parser);
 		  if (!c_parser_next_token_is (parser, CPP_NAME))
 		    {
@@ -12106,11 +12115,12 @@ c_parser_omp_variable_list (c_parser *parser,
 }
 
 /* Similarly, but expect leading and trailing parenthesis.  This is a very
-   common case for OpenACC and OpenMP clauses.  */
+   common case for OpenACC and OpenMP clauses.  The optional ALLOW_DEREF
+   argument is true if list items can use the deref (->) operator.  */
 
 static tree
 c_parser_omp_var_list_parens (c_parser *parser, enum omp_clause_code kind,
-			      tree list)
+			      tree list, bool allow_deref = false)
 {
   /* The clauses location.  */
   location_t loc = c_parser_peek_token (parser)->location;
@@ -12118,18 +12128,20 @@ c_parser_omp_var_list_parens (c_parser *parser, enum omp_clause_code kind,
   matching_parens parens;
   if (parens.require_open (parser))
     {
-      list = c_parser_omp_variable_list (parser, loc, kind, list);
+      list = c_parser_omp_variable_list (parser, loc, kind, list, allow_deref);
       parens.skip_until_found_close (parser);
     }
   return list;
 }
 
-/* OpenACC 2.0:
+/* OpenACC 2.5:
+   attach (variable-list )
    copy ( variable-list )
    copyin ( variable-list )
    copyout ( variable-list )
    create ( variable-list )
    delete ( variable-list )
+   detach ( variable-list )
    present ( variable-list ) */
 
 static tree
@@ -12139,6 +12151,9 @@ c_parser_oacc_data_clause (c_parser *parser, pragma_omp_clause c_kind,
   enum gomp_map_kind kind;
   switch (c_kind)
     {
+    case PRAGMA_OACC_CLAUSE_ATTACH:
+      kind = GOMP_MAP_ATTACH;
+      break;
     case PRAGMA_OACC_CLAUSE_COPY:
       kind = GOMP_MAP_TOFROM;
       break;
@@ -12154,6 +12169,9 @@ c_parser_oacc_data_clause (c_parser *parser, pragma_omp_clause c_kind,
     case PRAGMA_OACC_CLAUSE_DELETE:
       kind = GOMP_MAP_RELEASE;
       break;
+    case PRAGMA_OACC_CLAUSE_DETACH:
+      kind = GOMP_MAP_DETACH;
+      break;
     case PRAGMA_OACC_CLAUSE_DEVICE:
       kind = GOMP_MAP_FORCE_TO;
       break;
@@ -12173,7 +12191,7 @@ c_parser_oacc_data_clause (c_parser *parser, pragma_omp_clause c_kind,
       gcc_unreachable ();
     }
   tree nl, c;
-  nl = c_parser_omp_var_list_parens (parser, OMP_CLAUSE_MAP, list);
+  nl = c_parser_omp_var_list_parens (parser, OMP_CLAUSE_MAP, list, true);
 
   for (c = nl; c != list; c = OMP_CLAUSE_CHAIN (c))
     OMP_CLAUSE_SET_MAP_KIND (c, kind);
@@ -12579,7 +12597,8 @@ c_parser_omp_clause_lastprivate (c_parser *parser, tree list)
 	    }
 	}
       tree nlist = c_parser_omp_variable_list (parser, loc,
-					       OMP_CLAUSE_LASTPRIVATE, list);
+					       OMP_CLAUSE_LASTPRIVATE, list,
+					       false);
       c_parser_skip_until_found (parser, CPP_CLOSE_PAREN, "expected %<)%>");
       if (conditional)
 	for (tree c = nlist; c != list; c = OMP_CLAUSE_CHAIN (c))
@@ -13551,7 +13570,8 @@ c_parser_omp_clause_reduction (c_parser *parser, enum omp_clause_code kind,
 	{
 	  tree nl, c;
 
-	  nl = c_parser_omp_variable_list (parser, clause_loc, kind, list);
+	  nl = c_parser_omp_variable_list (parser, clause_loc, kind, list,
+					   false);
 	  for (c = nl; c != list; c = OMP_CLAUSE_CHAIN (c))
 	    {
 	      tree d = OMP_CLAUSE_DECL (c), type;
@@ -13942,8 +13962,8 @@ c_parser_omp_clause_aligned (c_parser *parser, tree list)
   if (!parens.require_open (parser))
     return list;
 
-  nl = c_parser_omp_variable_list (parser, clause_loc,
-				   OMP_CLAUSE_ALIGNED, list);
+  nl = c_parser_omp_variable_list (parser, clause_loc, OMP_CLAUSE_ALIGNED,
+				   list, false);
 
   if (c_parser_next_token_is (parser, CPP_COLON))
     {
@@ -14004,8 +14024,8 @@ c_parser_omp_clause_linear (c_parser *parser, tree list)
 	}
     }
 
-  nl = c_parser_omp_variable_list (parser, clause_loc,
-				   OMP_CLAUSE_LINEAR, list);
+  nl = c_parser_omp_variable_list (parser, clause_loc, OMP_CLAUSE_LINEAR, list,
+				   false);
 
   if (kind != OMP_CLAUSE_LINEAR_DEFAULT)
     parens.skip_until_found_close (parser);
@@ -14399,8 +14419,8 @@ c_parser_omp_clause_depend (c_parser *parser, tree list)
     nl = c_parser_omp_clause_depend_sink (parser, clause_loc, list);
   else
     {
-      nl = c_parser_omp_variable_list (parser, clause_loc,
-				       OMP_CLAUSE_DEPEND, list);
+      nl = c_parser_omp_variable_list (parser, clause_loc, OMP_CLAUSE_DEPEND,
+				       list, false);
 
       if (iterators)
 	{
@@ -14549,7 +14569,8 @@ c_parser_omp_clause_map (c_parser *parser, tree list)
 	}
     }
 
-  nl = c_parser_omp_variable_list (parser, clause_loc, OMP_CLAUSE_MAP, list);
+  nl = c_parser_omp_variable_list (parser, clause_loc, OMP_CLAUSE_MAP, list,
+				   false);
 
   for (c = nl; c != list; c = OMP_CLAUSE_CHAIN (c))
     OMP_CLAUSE_SET_MAP_KIND (c, kind);
@@ -14716,8 +14737,8 @@ c_parser_omp_clause_uniform (c_parser *parser, tree list)
   matching_parens parens;
   if (parens.require_open (parser))
     {
-      list = c_parser_omp_variable_list (parser, loc, OMP_CLAUSE_UNIFORM,
-					 list);
+      list = c_parser_omp_variable_list (parser, loc, OMP_CLAUSE_UNIFORM, list,
+					 false);
       parens.skip_until_found_close (parser);
     }
   return list;
@@ -14757,6 +14778,10 @@ c_parser_oacc_all_clauses (c_parser *parser, omp_clause_mask mask,
 						clauses);
 	  c_name = "auto";
 	  break;
+	case PRAGMA_OACC_CLAUSE_ATTACH:
+	  clauses = c_parser_oacc_data_clause (parser, c_kind, clauses);
+	  c_name = "attach";
+	  break;
 	case PRAGMA_OACC_CLAUSE_COLLAPSE:
 	  clauses = c_parser_omp_clause_collapse (parser, clauses);
 	  c_name = "collapse";
@@ -14785,6 +14810,10 @@ c_parser_oacc_all_clauses (c_parser *parser, omp_clause_mask mask,
 	  clauses = c_parser_omp_clause_default (parser, clauses, true);
 	  c_name = "default";
 	  break;
+	case PRAGMA_OACC_CLAUSE_DETACH:
+	  clauses = c_parser_oacc_data_clause (parser, c_kind, clauses);
+	  c_name = "detach";
+	  break;
 	case PRAGMA_OACC_CLAUSE_DEVICE:
 	  clauses = c_parser_oacc_data_clause (parser, c_kind, clauses);
 	  c_name = "device";
@@ -15263,7 +15292,8 @@ c_parser_oacc_cache (location_t loc, c_parser *parser)
 */
 
 #define OACC_DATA_CLAUSE_MASK						\
-	( (OMP_CLAUSE_MASK_1 << PRAGMA_OACC_CLAUSE_COPY)		\
+	( (OMP_CLAUSE_MASK_1 << PRAGMA_OACC_CLAUSE_ATTACH)		\
+	| (OMP_CLAUSE_MASK_1 << PRAGMA_OACC_CLAUSE_COPY)		\
 	| (OMP_CLAUSE_MASK_1 << PRAGMA_OACC_CLAUSE_COPYIN)		\
 	| (OMP_CLAUSE_MASK_1 << PRAGMA_OACC_CLAUSE_COPYOUT)		\
 	| (OMP_CLAUSE_MASK_1 << PRAGMA_OACC_CLAUSE_CREATE)		\
@@ -15446,6 +15476,7 @@ c_parser_oacc_declare (c_parser *parser)
 #define OACC_ENTER_DATA_CLAUSE_MASK					\
 	( (OMP_CLAUSE_MASK_1 << PRAGMA_OACC_CLAUSE_IF)			\
 	| (OMP_CLAUSE_MASK_1 << PRAGMA_OACC_CLAUSE_ASYNC)		\
+	| (OMP_CLAUSE_MASK_1 << PRAGMA_OACC_CLAUSE_ATTACH)		\
 	| (OMP_CLAUSE_MASK_1 << PRAGMA_OACC_CLAUSE_COPYIN)		\
 	| (OMP_CLAUSE_MASK_1 << PRAGMA_OACC_CLAUSE_CREATE)		\
 	| (OMP_CLAUSE_MASK_1 << PRAGMA_OACC_CLAUSE_WAIT) )
@@ -15455,6 +15486,7 @@ c_parser_oacc_declare (c_parser *parser)
 	| (OMP_CLAUSE_MASK_1 << PRAGMA_OACC_CLAUSE_ASYNC)		\
 	| (OMP_CLAUSE_MASK_1 << PRAGMA_OACC_CLAUSE_COPYOUT)		\
 	| (OMP_CLAUSE_MASK_1 << PRAGMA_OACC_CLAUSE_DELETE) 		\
+	| (OMP_CLAUSE_MASK_1 << PRAGMA_OACC_CLAUSE_DETACH) 		\
 	| (OMP_CLAUSE_MASK_1 << PRAGMA_OACC_CLAUSE_FINALIZE) 		\
 	| (OMP_CLAUSE_MASK_1 << PRAGMA_OACC_CLAUSE_WAIT) )
 
@@ -15589,6 +15621,7 @@ c_parser_oacc_loop (location_t loc, c_parser *parser, char *p_name,
 
 #define OACC_KERNELS_CLAUSE_MASK					\
 	( (OMP_CLAUSE_MASK_1 << PRAGMA_OACC_CLAUSE_ASYNC)		\
+	| (OMP_CLAUSE_MASK_1 << PRAGMA_OACC_CLAUSE_ATTACH)		\
 	| (OMP_CLAUSE_MASK_1 << PRAGMA_OACC_CLAUSE_COPY)		\
 	| (OMP_CLAUSE_MASK_1 << PRAGMA_OACC_CLAUSE_COPYIN)		\
 	| (OMP_CLAUSE_MASK_1 << PRAGMA_OACC_CLAUSE_COPYOUT)		\
@@ -15604,6 +15637,7 @@ c_parser_oacc_loop (location_t loc, c_parser *parser, char *p_name,
 
 #define OACC_PARALLEL_CLAUSE_MASK					\
 	( (OMP_CLAUSE_MASK_1 << PRAGMA_OACC_CLAUSE_ASYNC)		\
+	| (OMP_CLAUSE_MASK_1 << PRAGMA_OACC_CLAUSE_ATTACH)		\
 	| (OMP_CLAUSE_MASK_1 << PRAGMA_OACC_CLAUSE_COPY)		\
 	| (OMP_CLAUSE_MASK_1 << PRAGMA_OACC_CLAUSE_COPYIN)		\
 	| (OMP_CLAUSE_MASK_1 << PRAGMA_OACC_CLAUSE_COPYOUT)		\
diff --git a/gcc/c/c-typeck.c b/gcc/c/c-typeck.c
index 1a89727..6f6352d 100644
--- a/gcc/c/c-typeck.c
+++ b/gcc/c/c-typeck.c
@@ -12879,7 +12879,6 @@ handle_omp_array_sections_1 (tree c, tree t, vec<tree> &types,
 	  return error_mark_node;
 	}
       if (TREE_CODE (t) == COMPONENT_REF
-	  && ort == C_ORT_OMP
 	  && (OMP_CLAUSE_CODE (c) == OMP_CLAUSE_MAP
 	      || OMP_CLAUSE_CODE (c) == OMP_CLAUSE_TO
 	      || OMP_CLAUSE_CODE (c) == OMP_CLAUSE_FROM))
@@ -12901,6 +12900,16 @@ handle_omp_array_sections_1 (tree c, tree t, vec<tree> &types,
 		}
 	      t = TREE_OPERAND (t, 0);
 	    }
+	  if (ort == C_ORT_ACC && TREE_CODE (t) == MEM_REF)
+	    {
+	      poly_int64 offset = mem_ref_offset (t).force_shwi ();
+	      if (maybe_ne (offset, 0))
+	        error_at (OMP_CLAUSE_LOCATION (c),
+			  "cannot dereference %qE in %qs clause", t,
+			  omp_clause_code_name[OMP_CLAUSE_CODE (c)]);
+	      else
+		t = TREE_OPERAND (t, 0);
+	    }
 	}
       if (!VAR_P (t) && TREE_CODE (t) != PARM_DECL)
 	{
@@ -12985,7 +12994,19 @@ handle_omp_array_sections_1 (tree c, tree t, vec<tree> &types,
     length = fold_convert (sizetype, length);
   if (low_bound == NULL_TREE)
     low_bound = integer_zero_node;
-
+  if (OMP_CLAUSE_CODE (c) == OMP_CLAUSE_MAP
+      && (OMP_CLAUSE_MAP_KIND (c) == GOMP_MAP_ATTACH
+	  || OMP_CLAUSE_MAP_KIND (c) == GOMP_MAP_DETACH))
+    {
+      if (length != integer_one_node)
+	{
+	  error_at (OMP_CLAUSE_LOCATION (c),
+		    OMP_CLAUSE_MAP_KIND (c) == GOMP_MAP_ATTACH
+		    ? "array section in %<attach%> clause"
+		    : "array section in %<detach%> clause");
+	  return error_mark_node;
+	}
+    }
   if (length != NULL_TREE)
     {
       if (!integer_nonzerop (length))
@@ -13662,6 +13683,37 @@ c_omp_finish_iterators (tree iter)
   return ret;
 }
 
+/* Ensure that pointers are used in OpenACC attach and detach clauses.
+   Return true if an error has been detected.  */
+
+static bool
+c_oacc_check_attachments (tree c)
+{
+  if (OMP_CLAUSE_CODE (c) != OMP_CLAUSE_MAP)
+    return false;
+
+  /* OpenACC attach / detach clauses must be pointers.  */
+  if (OMP_CLAUSE_MAP_KIND (c) == GOMP_MAP_ATTACH
+      || OMP_CLAUSE_MAP_KIND (c) == GOMP_MAP_DETACH)
+    {
+      tree t = OMP_CLAUSE_DECL (c);
+
+      while (TREE_CODE (t) == TREE_LIST)
+	t = TREE_CHAIN (t);
+
+      if (TREE_CODE (TREE_TYPE (t)) != POINTER_TYPE)
+	{
+	  error_at (OMP_CLAUSE_LOCATION (c),
+		    OMP_CLAUSE_MAP_KIND (c) == GOMP_MAP_ATTACH
+		    ? "expected pointer in %<attach%> clause"
+		    : "expected pointer in %<detach%> clause");
+	  return true;
+	}
+    }
+
+  return false;
+}
+
 /* For all elements of CLAUSES, validate them against their constraints.
    Remove any elements from the list that are invalid.  */
 
@@ -14386,6 +14438,8 @@ c_finish_omp_clauses (tree clauses, enum c_omp_region_type ort)
 			}
 		    }
 		}
+	      if (c_oacc_check_attachments (c))
+		remove = true;
 	      break;
 	    }
 	  if (t == error_mark_node)
@@ -14393,8 +14447,13 @@ c_finish_omp_clauses (tree clauses, enum c_omp_region_type ort)
 	      remove = true;
 	      break;
 	    }
+	  /* OpenACC attach / detach clauses must be pointers.  */
+	  if (c_oacc_check_attachments (c))
+	    {
+	      remove = true;
+	      break;
+	    }
 	  if (TREE_CODE (t) == COMPONENT_REF
-	      && (ort & C_ORT_OMP)
 	      && OMP_CLAUSE_CODE (c) != OMP_CLAUSE__CACHE_)
 	    {
 	      if (DECL_BIT_FIELD (TREE_OPERAND (t, 1)))
@@ -14432,6 +14491,16 @@ c_finish_omp_clauses (tree clauses, enum c_omp_region_type ort)
 		}
 	      if (remove)
 		break;
+	      if (ort == C_ORT_ACC && TREE_CODE (t) == MEM_REF)
+	        {
+		  poly_int64 offset = mem_ref_offset (t).force_shwi ();
+		  if (maybe_ne (offset, 0))
+	            error_at (OMP_CLAUSE_LOCATION (c),
+			      "cannot dereference %qE in %qs clause", t,
+			      omp_clause_code_name[OMP_CLAUSE_CODE (c)]);
+		  else
+		    t = TREE_OPERAND (t, 0);
+		}
 	      if (VAR_P (t) || TREE_CODE (t) == PARM_DECL)
 		{
 		  if (bitmap_bit_p (&map_field_head, DECL_UID (t)))
diff --git a/gcc/cp/parser.c b/gcc/cp/parser.c
index 8223a70..2f91a14 100644
--- a/gcc/cp/parser.c
+++ b/gcc/cp/parser.c
@@ -31885,6 +31885,8 @@ cp_parser_omp_clause_name (cp_parser *parser)
 	    result = PRAGMA_OMP_CLAUSE_ALIGNED;
 	  else if (!strcmp ("async", p))
 	    result = PRAGMA_OACC_CLAUSE_ASYNC;
+	  else if (!strcmp ("attach", p))
+	    result = PRAGMA_OACC_CLAUSE_ATTACH;
 	  break;
 	case 'c':
 	  if (!strcmp ("collapse", p))
@@ -31905,6 +31907,8 @@ cp_parser_omp_clause_name (cp_parser *parser)
 	    result = PRAGMA_OMP_CLAUSE_DEFAULTMAP;
 	  else if (!strcmp ("depend", p))
 	    result = PRAGMA_OMP_CLAUSE_DEPEND;
+	  else if (!strcmp ("detach", p))
+	    result = PRAGMA_OACC_CLAUSE_DETACH;
 	  else if (!strcmp ("device", p))
 	    result = PRAGMA_OMP_CLAUSE_DEVICE;
 	  else if (!strcmp ("deviceptr", p))
@@ -32111,7 +32115,7 @@ check_no_duplicate_clause (tree clauses, enum omp_clause_code code,
 
 static tree
 cp_parser_omp_var_list_no_open (cp_parser *parser, enum omp_clause_code kind,
-				tree list, bool *colon)
+				tree list, bool *colon, bool allow_deref)
 {
   cp_token *token;
   bool saved_colon_corrects_to_scope_p = parser->colon_corrects_to_scope_p;
@@ -32184,15 +32188,20 @@ cp_parser_omp_var_list_no_open (cp_parser *parser, enum omp_clause_code kind,
 	    case OMP_CLAUSE_MAP:
 	    case OMP_CLAUSE_FROM:
 	    case OMP_CLAUSE_TO:
-	      while (cp_lexer_next_token_is (parser->lexer, CPP_DOT))
+	      while (cp_lexer_next_token_is (parser->lexer, CPP_DOT)
+		     || (allow_deref
+			 && cp_lexer_next_token_is (parser->lexer, CPP_DEREF)))
 		{
+		  cpp_ttype ttype
+		    = cp_lexer_next_token_is (parser->lexer, CPP_DOT)
+		      ? CPP_DOT : CPP_DEREF;
 		  location_t loc
 		    = cp_lexer_peek_token (parser->lexer)->location;
 		  cp_id_kind idk = CP_ID_KIND_NONE;
 		  cp_lexer_consume_token (parser->lexer);
 		  decl = convert_from_reference (decl);
 		  decl
-		    = cp_parser_postfix_dot_deref_expression (parser, CPP_DOT,
+		    = cp_parser_postfix_dot_deref_expression (parser, ttype,
 							      decl, false,
 							      &idk, loc);
 		}
@@ -32310,19 +32319,23 @@ cp_parser_omp_var_list_no_open (cp_parser *parser, enum omp_clause_code kind,
    common case for omp clauses.  */
 
 static tree
-cp_parser_omp_var_list (cp_parser *parser, enum omp_clause_code kind, tree list)
+cp_parser_omp_var_list (cp_parser *parser, enum omp_clause_code kind, tree list,
+			bool allow_deref = false)
 {
   if (cp_parser_require (parser, CPP_OPEN_PAREN, RT_OPEN_PAREN))
-    return cp_parser_omp_var_list_no_open (parser, kind, list, NULL);
+    return cp_parser_omp_var_list_no_open (parser, kind, list, NULL,
+					   allow_deref);
   return list;
 }
 
-/* OpenACC 2.0:
+/* OpenACC 2.5:
+   attach ( variable-list )
    copy ( variable-list )
    copyin ( variable-list )
    copyout ( variable-list )
    create ( variable-list )
    delete ( variable-list )
+   detach ( variable-list )
    present ( variable-list ) */
 
 static tree
@@ -32332,6 +32345,9 @@ cp_parser_oacc_data_clause (cp_parser *parser, pragma_omp_clause c_kind,
   enum gomp_map_kind kind;
   switch (c_kind)
     {
+    case PRAGMA_OACC_CLAUSE_ATTACH:
+      kind = GOMP_MAP_ATTACH;
+      break;
     case PRAGMA_OACC_CLAUSE_COPY:
       kind = GOMP_MAP_TOFROM;
       break;
@@ -32347,6 +32363,9 @@ cp_parser_oacc_data_clause (cp_parser *parser, pragma_omp_clause c_kind,
     case PRAGMA_OACC_CLAUSE_DELETE:
       kind = GOMP_MAP_RELEASE;
       break;
+    case PRAGMA_OACC_CLAUSE_DETACH:
+      kind = GOMP_MAP_DETACH;
+      break;
     case PRAGMA_OACC_CLAUSE_DEVICE:
       kind = GOMP_MAP_FORCE_TO;
       break;
@@ -32366,7 +32385,7 @@ cp_parser_oacc_data_clause (cp_parser *parser, pragma_omp_clause c_kind,
       gcc_unreachable ();
     }
   tree nl, c;
-  nl = cp_parser_omp_var_list (parser, OMP_CLAUSE_MAP, list);
+  nl = cp_parser_omp_var_list (parser, OMP_CLAUSE_MAP, list, true);
 
   for (c = nl; c != list; c = OMP_CLAUSE_CHAIN (c))
     OMP_CLAUSE_SET_MAP_KIND (c, kind);
@@ -33560,7 +33579,7 @@ cp_parser_omp_clause_reduction (cp_parser *parser, enum omp_clause_code kind,
     goto resync_fail;
 
   nlist = cp_parser_omp_var_list_no_open (parser, kind, list,
-					  NULL);
+					  NULL, false);
   for (c = nlist; c != list; c = OMP_CLAUSE_CHAIN (c))
     {
       OMP_CLAUSE_REDUCTION_CODE (c) = code;
@@ -33865,7 +33884,7 @@ cp_parser_omp_clause_aligned (cp_parser *parser, tree list)
     return list;
 
   nlist = cp_parser_omp_var_list_no_open (parser, OMP_CLAUSE_ALIGNED, list,
-					  &colon);
+					  &colon, false);
 
   if (colon)
     {
@@ -33915,7 +33934,7 @@ cp_parser_omp_clause_lastprivate (cp_parser *parser, tree list)
     }
 
   tree nlist = cp_parser_omp_var_list_no_open (parser, OMP_CLAUSE_LASTPRIVATE,
-					       list, NULL);
+					       list, NULL, false);
 
   if (conditional)
     for (tree c = nlist; c != list; c = OMP_CLAUSE_CHAIN (c))
@@ -33962,7 +33981,7 @@ cp_parser_omp_clause_linear (cp_parser *parser, tree list,
 
   if (kind == OMP_CLAUSE_LINEAR_DEFAULT)
     nlist = cp_parser_omp_var_list_no_open (parser, OMP_CLAUSE_LINEAR, list,
-					    &colon);
+					    &colon, false);
   else
     {
       nlist = cp_parser_omp_var_list (parser, OMP_CLAUSE_LINEAR, list);
@@ -34374,7 +34393,7 @@ cp_parser_omp_clause_depend (cp_parser *parser, tree list, location_t loc)
   else
     {
       nlist = cp_parser_omp_var_list_no_open (parser, OMP_CLAUSE_DEPEND,
-					      list, NULL);
+					      list, NULL, false);
 
       if (iterators)
 	{
@@ -34489,7 +34508,7 @@ cp_parser_omp_clause_map (cp_parser *parser, tree list)
     }
 
   nlist = cp_parser_omp_var_list_no_open (parser, OMP_CLAUSE_MAP, list,
-					  NULL);
+					  NULL, false);
 
   for (c = nlist; c != list; c = OMP_CLAUSE_CHAIN (c))
     OMP_CLAUSE_SET_MAP_KIND (c, kind);
@@ -34701,6 +34720,10 @@ cp_parser_oacc_all_clauses (cp_parser *parser, omp_clause_mask mask,
 						 clauses, here);
 	  c_name = "auto";
 	  break;
+	case PRAGMA_OACC_CLAUSE_ATTACH:
+	  clauses = cp_parser_oacc_data_clause (parser, c_kind, clauses);
+	  c_name = "attach";
+	  break;
 	case PRAGMA_OACC_CLAUSE_COLLAPSE:
 	  clauses = cp_parser_omp_clause_collapse (parser, clauses, here);
 	  c_name = "collapse";
@@ -34729,6 +34752,10 @@ cp_parser_oacc_all_clauses (cp_parser *parser, omp_clause_mask mask,
 	  clauses = cp_parser_omp_clause_default (parser, clauses, here, true);
 	  c_name = "default";
 	  break;
+	case PRAGMA_OACC_CLAUSE_DETACH:
+	  clauses = cp_parser_oacc_data_clause (parser, c_kind, clauses);
+	  c_name = "detach";
+	  break;
 	case PRAGMA_OACC_CLAUSE_DEVICE:
 	  clauses = cp_parser_oacc_data_clause (parser, c_kind, clauses);
 	  c_name = "device";
@@ -38370,10 +38397,12 @@ cp_parser_oacc_cache (cp_parser *parser, cp_token *pragma_tok)
      structured-block  */
 
 #define OACC_DATA_CLAUSE_MASK						\
-	( (OMP_CLAUSE_MASK_1 << PRAGMA_OACC_CLAUSE_COPY)		\
+	( (OMP_CLAUSE_MASK_1 << PRAGMA_OACC_CLAUSE_ATTACH)		\
+	| (OMP_CLAUSE_MASK_1 << PRAGMA_OACC_CLAUSE_COPY)		\
 	| (OMP_CLAUSE_MASK_1 << PRAGMA_OACC_CLAUSE_COPYIN)		\
 	| (OMP_CLAUSE_MASK_1 << PRAGMA_OACC_CLAUSE_COPYOUT)		\
 	| (OMP_CLAUSE_MASK_1 << PRAGMA_OACC_CLAUSE_CREATE)		\
+	| (OMP_CLAUSE_MASK_1 << PRAGMA_OACC_CLAUSE_DETACH)		\
 	| (OMP_CLAUSE_MASK_1 << PRAGMA_OACC_CLAUSE_DEVICEPTR)		\
 	| (OMP_CLAUSE_MASK_1 << PRAGMA_OACC_CLAUSE_IF)			\
 	| (OMP_CLAUSE_MASK_1 << PRAGMA_OACC_CLAUSE_PRESENT) )
@@ -38573,6 +38602,7 @@ cp_parser_oacc_declare (cp_parser *parser, cp_token *pragma_tok)
 
 #define OACC_ENTER_DATA_CLAUSE_MASK					\
 	( (OMP_CLAUSE_MASK_1 << PRAGMA_OACC_CLAUSE_IF)			\
+	| (OMP_CLAUSE_MASK_1 << PRAGMA_OACC_CLAUSE_ATTACH)		\
 	| (OMP_CLAUSE_MASK_1 << PRAGMA_OACC_CLAUSE_ASYNC)		\
 	| (OMP_CLAUSE_MASK_1 << PRAGMA_OACC_CLAUSE_COPYIN)		\
 	| (OMP_CLAUSE_MASK_1 << PRAGMA_OACC_CLAUSE_CREATE)		\
@@ -38583,6 +38613,7 @@ cp_parser_oacc_declare (cp_parser *parser, cp_token *pragma_tok)
 	| (OMP_CLAUSE_MASK_1 << PRAGMA_OACC_CLAUSE_ASYNC)		\
 	| (OMP_CLAUSE_MASK_1 << PRAGMA_OACC_CLAUSE_COPYOUT)		\
 	| (OMP_CLAUSE_MASK_1 << PRAGMA_OACC_CLAUSE_DELETE) 		\
+	| (OMP_CLAUSE_MASK_1 << PRAGMA_OACC_CLAUSE_DETACH)		\
 	| (OMP_CLAUSE_MASK_1 << PRAGMA_OACC_CLAUSE_FINALIZE) 		\
 	| (OMP_CLAUSE_MASK_1 << PRAGMA_OACC_CLAUSE_WAIT) )
 
@@ -38686,6 +38717,7 @@ cp_parser_oacc_loop (cp_parser *parser, cp_token *pragma_tok, char *p_name,
 
 #define OACC_KERNELS_CLAUSE_MASK					\
 	( (OMP_CLAUSE_MASK_1 << PRAGMA_OACC_CLAUSE_ASYNC)		\
+	| (OMP_CLAUSE_MASK_1 << PRAGMA_OACC_CLAUSE_ATTACH)		\
 	| (OMP_CLAUSE_MASK_1 << PRAGMA_OACC_CLAUSE_COPY)		\
 	| (OMP_CLAUSE_MASK_1 << PRAGMA_OACC_CLAUSE_COPYIN)		\
 	| (OMP_CLAUSE_MASK_1 << PRAGMA_OACC_CLAUSE_COPYOUT)		\
@@ -38701,6 +38733,7 @@ cp_parser_oacc_loop (cp_parser *parser, cp_token *pragma_tok, char *p_name,
 
 #define OACC_PARALLEL_CLAUSE_MASK					\
 	( (OMP_CLAUSE_MASK_1 << PRAGMA_OACC_CLAUSE_ASYNC)		\
+	| (OMP_CLAUSE_MASK_1 << PRAGMA_OACC_CLAUSE_ATTACH)		\
 	| (OMP_CLAUSE_MASK_1 << PRAGMA_OACC_CLAUSE_COPY)		\
 	| (OMP_CLAUSE_MASK_1 << PRAGMA_OACC_CLAUSE_COPYIN)		\
 	| (OMP_CLAUSE_MASK_1 << PRAGMA_OACC_CLAUSE_COPYOUT)		\
diff --git a/gcc/cp/semantics.c b/gcc/cp/semantics.c
index c1240cc..084ef38 100644
--- a/gcc/cp/semantics.c
+++ b/gcc/cp/semantics.c
@@ -4558,7 +4558,6 @@ handle_omp_array_sections_1 (tree c, tree t, vec<tree> &types,
 	t = TREE_OPERAND (t, 0);
       ret = t;
       if (TREE_CODE (t) == COMPONENT_REF
-	  && ort == C_ORT_OMP
 	  && (OMP_CLAUSE_CODE (c) == OMP_CLAUSE_MAP
 	      || OMP_CLAUSE_CODE (c) == OMP_CLAUSE_TO
 	      || OMP_CLAUSE_CODE (c) == OMP_CLAUSE_FROM)
@@ -4681,6 +4680,18 @@ handle_omp_array_sections_1 (tree c, tree t, vec<tree> &types,
   if (low_bound == NULL_TREE)
     low_bound = integer_zero_node;
 
+  if (OMP_CLAUSE_CODE (c) == OMP_CLAUSE_MAP
+      && (OMP_CLAUSE_MAP_KIND (c) == GOMP_MAP_ATTACH
+	  || OMP_CLAUSE_MAP_KIND (c) == GOMP_MAP_DETACH))
+    {
+      if (length != integer_one_node)
+	{
+	  error_at (OMP_CLAUSE_LOCATION (c),
+		    "expected single pointer in %qs clause",
+		    omp_clause_code_name[OMP_CLAUSE_CODE (c)]);
+	  return error_mark_node;
+	}
+    }
   if (length != NULL_TREE)
     {
       if (!integer_nonzerop (length))
@@ -6038,6 +6049,42 @@ cp_omp_finish_iterators (tree iter)
   return ret;
 }
 
+/* Ensure that pointers are used in OpenACC attach and detach clauses.
+   Return true if an error has been detected.  */
+
+static bool
+cp_oacc_check_attachments (tree c)
+{
+  if (OMP_CLAUSE_CODE (c) != OMP_CLAUSE_MAP)
+    return false;
+
+  /* OpenACC attach / detach clauses must be pointers.  */
+  if (OMP_CLAUSE_MAP_KIND (c) == GOMP_MAP_ATTACH
+      || OMP_CLAUSE_MAP_KIND (c) == GOMP_MAP_DETACH)
+    {
+      tree t = OMP_CLAUSE_DECL (c);
+      tree type;
+
+      while (TREE_CODE (t) == TREE_LIST)
+	t = TREE_CHAIN (t);
+
+      type = TREE_TYPE (t);
+
+      if (TREE_CODE (type) == REFERENCE_TYPE)
+	type = TREE_TYPE (type);
+
+      if (TREE_CODE (type) != POINTER_TYPE)
+	{
+	  error_at (OMP_CLAUSE_LOCATION (c),
+		    "expected pointer in %qs clause",
+		    omp_clause_code_name[OMP_CLAUSE_CODE (c)]);
+	  return true;
+	}
+    }
+
+  return false;
+}
+
 /* For all elements of CLAUSES, validate them vs OpenMP constraints.
    Remove any elements from the list that are invalid.  */
 
@@ -6278,7 +6325,7 @@ finish_omp_clauses (tree clauses, enum c_omp_region_type ort)
 	    t = OMP_CLAUSE_DECL (c);
 	check_dup_generic_t:
 	  if (t == current_class_ptr
-	      && (ort != C_ORT_OMP_DECLARE_SIMD
+	      && ((ort != C_ORT_OMP_DECLARE_SIMD && ort != C_ORT_ACC)
 		  || (OMP_CLAUSE_CODE (c) != OMP_CLAUSE_LINEAR
 		      && OMP_CLAUSE_CODE (c) != OMP_CLAUSE_UNIFORM)))
 	    {
@@ -6342,8 +6389,7 @@ finish_omp_clauses (tree clauses, enum c_omp_region_type ort)
 	handle_field_decl:
 	  if (!remove
 	      && TREE_CODE (t) == FIELD_DECL
-	      && t == OMP_CLAUSE_DECL (c)
-	      && ort != C_ORT_ACC)
+	      && t == OMP_CLAUSE_DECL (c))
 	    {
 	      OMP_CLAUSE_DECL (c)
 		= omp_privatize_field (t, (OMP_CLAUSE_CODE (c)
@@ -6410,7 +6456,7 @@ finish_omp_clauses (tree clauses, enum c_omp_region_type ort)
 	    omp_note_field_privatization (t, OMP_CLAUSE_DECL (c));
 	  else
 	    t = OMP_CLAUSE_DECL (c);
-	  if (t == current_class_ptr)
+	  if (ort != C_ORT_ACC && t == current_class_ptr)
 	    {
 	      error_at (OMP_CLAUSE_LOCATION (c),
 			"%<this%> allowed in OpenMP only in %<declare simd%>"
@@ -6897,7 +6943,7 @@ finish_omp_clauses (tree clauses, enum c_omp_region_type ort)
 	    }
 	  if (t == error_mark_node)
 	    remove = true;
-	  else if (t == current_class_ptr)
+	  else if (ort != C_ORT_ACC && t == current_class_ptr)
 	    {
 	      error_at (OMP_CLAUSE_LOCATION (c),
 			"%<this%> allowed in OpenMP only in %<declare simd%>"
@@ -7027,6 +7073,8 @@ finish_omp_clauses (tree clauses, enum c_omp_region_type ort)
 			}
 		    }
 		}
+	      if (cp_oacc_check_attachments (c))
+		remove = true;
 	      break;
 	    }
 	  if (t == error_mark_node)
@@ -7034,14 +7082,25 @@ finish_omp_clauses (tree clauses, enum c_omp_region_type ort)
 	      remove = true;
 	      break;
 	    }
+	  /* OpenACC attach / detach clauses must be pointers.  */
+	  if (cp_oacc_check_attachments (c))
+	    {
+	      remove = true;
+	      break;
+	    }
 	  if (REFERENCE_REF_P (t)
 	      && TREE_CODE (TREE_OPERAND (t, 0)) == COMPONENT_REF)
 	    {
 	      t = TREE_OPERAND (t, 0);
 	      OMP_CLAUSE_DECL (c) = t;
 	    }
+	  if (ort == C_ORT_ACC
+	      && TREE_CODE (t) == COMPONENT_REF
+	      && TREE_CODE (TREE_OPERAND (t, 0)) == INDIRECT_REF)
+	    t = TREE_OPERAND (TREE_OPERAND (t, 0), 0);
 	  if (TREE_CODE (t) == COMPONENT_REF
-	      && (ort & C_ORT_OMP_DECLARE_SIMD) == C_ORT_OMP
+	      && ((ort & C_ORT_OMP_DECLARE_SIMD) == C_ORT_OMP
+		  || ort == C_ORT_ACC)
 	      && OMP_CLAUSE_CODE (c) != OMP_CLAUSE__CACHE_)
 	    {
 	      if (type_dependent_expression_p (t))
diff --git a/gcc/fortran/gfortran.h b/gcc/fortran/gfortran.h
index 4dd6298..96bed48 100644
--- a/gcc/fortran/gfortran.h
+++ b/gcc/fortran/gfortran.h
@@ -1175,10 +1175,12 @@ enum gfc_omp_depend_op
 enum gfc_omp_map_op
 {
   OMP_MAP_ALLOC,
+  OMP_MAP_ATTACH,
   OMP_MAP_TO,
   OMP_MAP_FROM,
   OMP_MAP_TOFROM,
   OMP_MAP_DELETE,
+  OMP_MAP_DETACH,
   OMP_MAP_FORCE_ALLOC,
   OMP_MAP_FORCE_TO,
   OMP_MAP_FORCE_FROM,
diff --git a/gcc/fortran/openmp.c b/gcc/fortran/openmp.c
index e1560c1..d075a69 100644
--- a/gcc/fortran/openmp.c
+++ b/gcc/fortran/openmp.c
@@ -222,7 +222,8 @@ static match
 gfc_match_omp_variable_list (const char *str, gfc_omp_namelist **list,
 			     bool allow_common, bool *end_colon = NULL,
 			     gfc_omp_namelist ***headp = NULL,
-			     bool allow_sections = false)
+			     bool allow_sections = false,
+			     bool allow_derived = false)
 {
   gfc_omp_namelist *head, *tail, *p;
   locus old_loc, cur_loc;
@@ -248,7 +249,8 @@ gfc_match_omp_variable_list (const char *str, gfc_omp_namelist **list,
 	case MATCH_YES:
 	  gfc_expr *expr;
 	  expr = NULL;
-	  if (allow_sections && gfc_peek_ascii_char () == '(')
+	  if ((allow_sections && gfc_peek_ascii_char () == '(')
+	      || (allow_derived && gfc_peek_ascii_char () == '%'))
 	    {
 	      gfc_current_locus = cur_loc;
 	      m = gfc_match_variable (&expr, 0);
@@ -785,7 +787,7 @@ enum omp_mask1
   OMP_MASK1_LAST
 };
 
-/* OpenACC 2.0 specific clauses. */
+/* OpenACC 2.0+ specific clauses. */
 enum omp_mask2
 {
   OMP_CLAUSE_ASYNC,
@@ -811,6 +813,8 @@ enum omp_mask2
   OMP_CLAUSE_TILE,
   OMP_CLAUSE_IF_PRESENT,
   OMP_CLAUSE_FINALIZE,
+  OMP_CLAUSE_ATTACH,
+  OMP_CLAUSE_DETACH,
   /* This must come last.  */
   OMP_MASK2_LAST
 };
@@ -914,10 +918,12 @@ omp_inv_mask::omp_inv_mask (const omp_mask &m) : omp_mask (m)
    mapping.  */
 
 static bool
-gfc_match_omp_map_clause (gfc_omp_namelist **list, gfc_omp_map_op map_op)
+gfc_match_omp_map_clause (gfc_omp_namelist **list, gfc_omp_map_op map_op,
+			  bool allow_derived = false)
 {
   gfc_omp_namelist **head = NULL;
-  if (gfc_match_omp_variable_list ("", list, false, NULL, &head, true)
+  if (gfc_match_omp_variable_list ("", list, false, NULL, &head, true,
+				   allow_derived)
       == MATCH_YES)
     {
       gfc_omp_namelist *n;
@@ -939,6 +945,14 @@ gfc_match_omp_clauses (gfc_omp_clauses **cp, const omp_mask mask,
 {
   gfc_omp_clauses *c = gfc_get_omp_clauses ();
   locus old_loc;
+  /* Determine whether we're dealing with an OpenACC directive that permits
+     derived type member accesses.  This in particular disallows
+     "!$acc declare" from using such accesses, because it's not clear if/how
+     that should work.  */
+  bool allow_derived = (openacc
+			&& ((mask & OMP_CLAUSE_ATTACH)
+			    || (mask & OMP_CLAUSE_DETACH)
+			    || (mask & OMP_CLAUSE_HOST_SELF)));
 
   gcc_checking_assert (OMP_MASK1_LAST <= 64 && OMP_MASK2_LAST <= 64);
   *cp = NULL;
@@ -1012,6 +1026,11 @@ gfc_match_omp_clauses (gfc_omp_clauses **cp, const omp_mask mask,
 	      needs_space = true;
 	      continue;
 	    }
+	  if ((mask & OMP_CLAUSE_ATTACH)
+	      && gfc_match ("attach ( ") == MATCH_YES
+	      && gfc_match_omp_map_clause (&c->lists[OMP_LIST_MAP],
+					   OMP_MAP_ATTACH, allow_derived))
+	    continue;
 	  break;
 	case 'c':
 	  if ((mask & OMP_CLAUSE_COLLAPSE)
@@ -1039,7 +1058,7 @@ gfc_match_omp_clauses (gfc_omp_clauses **cp, const omp_mask mask,
 	  if ((mask & OMP_CLAUSE_COPY)
 	      && gfc_match ("copy ( ") == MATCH_YES
 	      && gfc_match_omp_map_clause (&c->lists[OMP_LIST_MAP],
-					   OMP_MAP_TOFROM))
+					   OMP_MAP_TOFROM, allow_derived))
 	    continue;
 	  if (mask & OMP_CLAUSE_COPYIN)
 	    {
@@ -1047,7 +1066,7 @@ gfc_match_omp_clauses (gfc_omp_clauses **cp, const omp_mask mask,
 		{
 		  if (gfc_match ("copyin ( ") == MATCH_YES
 		      && gfc_match_omp_map_clause (&c->lists[OMP_LIST_MAP],
-						   OMP_MAP_TO))
+						   OMP_MAP_TO, allow_derived))
 		    continue;
 		}
 	      else if (gfc_match_omp_variable_list ("copyin (",
@@ -1058,7 +1077,7 @@ gfc_match_omp_clauses (gfc_omp_clauses **cp, const omp_mask mask,
 	  if ((mask & OMP_CLAUSE_COPYOUT)
 	      && gfc_match ("copyout ( ") == MATCH_YES
 	      && gfc_match_omp_map_clause (&c->lists[OMP_LIST_MAP],
-					   OMP_MAP_FROM))
+					   OMP_MAP_FROM, allow_derived))
 	    continue;
 	  if ((mask & OMP_CLAUSE_COPYPRIVATE)
 	      && gfc_match_omp_variable_list ("copyprivate (",
@@ -1068,7 +1087,7 @@ gfc_match_omp_clauses (gfc_omp_clauses **cp, const omp_mask mask,
 	  if ((mask & OMP_CLAUSE_CREATE)
 	      && gfc_match ("create ( ") == MATCH_YES
 	      && gfc_match_omp_map_clause (&c->lists[OMP_LIST_MAP],
-					   OMP_MAP_ALLOC))
+					   OMP_MAP_ALLOC, allow_derived))
 	    continue;
 	  break;
 	case 'd':
@@ -1104,7 +1123,7 @@ gfc_match_omp_clauses (gfc_omp_clauses **cp, const omp_mask mask,
 	  if ((mask & OMP_CLAUSE_DELETE)
 	      && gfc_match ("delete ( ") == MATCH_YES
 	      && gfc_match_omp_map_clause (&c->lists[OMP_LIST_MAP],
-					   OMP_MAP_RELEASE))
+					   OMP_MAP_RELEASE, allow_derived))
 	    continue;
 	  if ((mask & OMP_CLAUSE_DEPEND)
 	      && gfc_match ("depend ( ") == MATCH_YES)
@@ -1147,6 +1166,11 @@ gfc_match_omp_clauses (gfc_omp_clauses **cp, const omp_mask mask,
 	      else
 		gfc_current_locus = old_loc;
 	    }
+	  if ((mask & OMP_CLAUSE_DETACH)
+	      && gfc_match ("detach ( ") == MATCH_YES
+	      && gfc_match_omp_map_clause (&c->lists[OMP_LIST_MAP],
+					   OMP_MAP_DETACH, allow_derived))
+	    continue;
 	  if ((mask & OMP_CLAUSE_DEVICE)
 	      && !openacc
 	      && c->device == NULL
@@ -1156,12 +1180,13 @@ gfc_match_omp_clauses (gfc_omp_clauses **cp, const omp_mask mask,
 	      && openacc
 	      && gfc_match ("device ( ") == MATCH_YES
 	      && gfc_match_omp_map_clause (&c->lists[OMP_LIST_MAP],
-					   OMP_MAP_FORCE_TO))
+					   OMP_MAP_FORCE_TO, allow_derived))
 	    continue;
 	  if ((mask & OMP_CLAUSE_DEVICEPTR)
 	      && gfc_match ("deviceptr ( ") == MATCH_YES
 	      && gfc_match_omp_map_clause (&c->lists[OMP_LIST_MAP],
-					   OMP_MAP_FORCE_DEVICEPTR))
+					   OMP_MAP_FORCE_DEVICEPTR,
+					   allow_derived))
 	    continue;
 	  if ((mask & OMP_CLAUSE_DEVICE_RESIDENT)
 	      && gfc_match_omp_variable_list
@@ -1239,7 +1264,7 @@ gfc_match_omp_clauses (gfc_omp_clauses **cp, const omp_mask mask,
 	  if ((mask & OMP_CLAUSE_HOST_SELF)
 	      && gfc_match ("host ( ") == MATCH_YES
 	      && gfc_match_omp_map_clause (&c->lists[OMP_LIST_MAP],
-					   OMP_MAP_FORCE_FROM))
+					   OMP_MAP_FORCE_FROM, allow_derived))
 	    continue;
 	  break;
 	case 'i':
@@ -1511,47 +1536,48 @@ gfc_match_omp_clauses (gfc_omp_clauses **cp, const omp_mask mask,
 	  if ((mask & OMP_CLAUSE_COPY)
 	      && gfc_match ("pcopy ( ") == MATCH_YES
 	      && gfc_match_omp_map_clause (&c->lists[OMP_LIST_MAP],
-					   OMP_MAP_TOFROM))
+					   OMP_MAP_TOFROM, allow_derived))
 	    continue;
 	  if ((mask & OMP_CLAUSE_COPYIN)
 	      && gfc_match ("pcopyin ( ") == MATCH_YES
 	      && gfc_match_omp_map_clause (&c->lists[OMP_LIST_MAP],
-					   OMP_MAP_TO))
+					   OMP_MAP_TO, allow_derived))
 	    continue;
 	  if ((mask & OMP_CLAUSE_COPYOUT)
 	      && gfc_match ("pcopyout ( ") == MATCH_YES
 	      && gfc_match_omp_map_clause (&c->lists[OMP_LIST_MAP],
-					   OMP_MAP_FROM))
+					   OMP_MAP_FROM, allow_derived))
 	    continue;
 	  if ((mask & OMP_CLAUSE_CREATE)
 	      && gfc_match ("pcreate ( ") == MATCH_YES
 	      && gfc_match_omp_map_clause (&c->lists[OMP_LIST_MAP],
-					   OMP_MAP_ALLOC))
+					   OMP_MAP_ALLOC, allow_derived))
 	    continue;
 	  if ((mask & OMP_CLAUSE_PRESENT)
 	      && gfc_match ("present ( ") == MATCH_YES
 	      && gfc_match_omp_map_clause (&c->lists[OMP_LIST_MAP],
-					   OMP_MAP_FORCE_PRESENT))
+					   OMP_MAP_FORCE_PRESENT,
+					   allow_derived))
 	    continue;
 	  if ((mask & OMP_CLAUSE_COPY)
 	      && gfc_match ("present_or_copy ( ") == MATCH_YES
 	      && gfc_match_omp_map_clause (&c->lists[OMP_LIST_MAP],
-					   OMP_MAP_TOFROM))
+					   OMP_MAP_TOFROM, allow_derived))
 	    continue;
 	  if ((mask & OMP_CLAUSE_COPYIN)
 	      && gfc_match ("present_or_copyin ( ") == MATCH_YES
 	      && gfc_match_omp_map_clause (&c->lists[OMP_LIST_MAP],
-					   OMP_MAP_TO))
+					   OMP_MAP_TO, allow_derived))
 	    continue;
 	  if ((mask & OMP_CLAUSE_COPYOUT)
 	      && gfc_match ("present_or_copyout ( ") == MATCH_YES
 	      && gfc_match_omp_map_clause (&c->lists[OMP_LIST_MAP],
-					   OMP_MAP_FROM))
+					   OMP_MAP_FROM, allow_derived))
 	    continue;
 	  if ((mask & OMP_CLAUSE_CREATE)
 	      && gfc_match ("present_or_create ( ") == MATCH_YES
 	      && gfc_match_omp_map_clause (&c->lists[OMP_LIST_MAP],
-					   OMP_MAP_ALLOC))
+					   OMP_MAP_ALLOC, allow_derived))
 	    continue;
 	  if ((mask & OMP_CLAUSE_PRIORITY)
 	      && c->priority == NULL
@@ -1669,8 +1695,8 @@ gfc_match_omp_clauses (gfc_omp_clauses **cp, const omp_mask mask,
 
 	      if (gfc_match_omp_variable_list (" :",
 					       &c->lists[OMP_LIST_REDUCTION],
-					       false, NULL, &head,
-					       openacc) == MATCH_YES)
+					       false, NULL, &head, openacc,
+					       allow_derived) == MATCH_YES)
 		{
 		  gfc_omp_namelist *n;
 		  if (rop == OMP_REDUCTION_NONE)
@@ -1769,7 +1795,7 @@ gfc_match_omp_clauses (gfc_omp_clauses **cp, const omp_mask mask,
 	  if ((mask & OMP_CLAUSE_HOST_SELF)
 	      && gfc_match ("self ( ") == MATCH_YES
 	      && gfc_match_omp_map_clause (&c->lists[OMP_LIST_MAP],
-					   OMP_MAP_FORCE_FROM))
+					   OMP_MAP_FORCE_FROM, allow_derived))
 	    continue;
 	  if ((mask & OMP_CLAUSE_SEQ)
 	      && !c->seq
@@ -1926,17 +1952,17 @@ gfc_match_omp_clauses (gfc_omp_clauses **cp, const omp_mask mask,
    | OMP_CLAUSE_COPY | OMP_CLAUSE_COPYIN | OMP_CLAUSE_COPYOUT		      \
    | OMP_CLAUSE_CREATE | OMP_CLAUSE_PRESENT | OMP_CLAUSE_DEVICEPTR	      \
    | OMP_CLAUSE_PRIVATE | OMP_CLAUSE_FIRSTPRIVATE | OMP_CLAUSE_DEFAULT	      \
-   | OMP_CLAUSE_WAIT)
+   | OMP_CLAUSE_WAIT | OMP_CLAUSE_ATTACH)
 #define OACC_KERNELS_CLAUSES \
   (omp_mask (OMP_CLAUSE_IF) | OMP_CLAUSE_ASYNC | OMP_CLAUSE_NUM_GANGS	      \
    | OMP_CLAUSE_NUM_WORKERS | OMP_CLAUSE_VECTOR_LENGTH | OMP_CLAUSE_DEVICEPTR \
    | OMP_CLAUSE_COPY | OMP_CLAUSE_COPYIN | OMP_CLAUSE_COPYOUT		      \
    | OMP_CLAUSE_CREATE | OMP_CLAUSE_PRESENT | OMP_CLAUSE_DEFAULT	      \
-   | OMP_CLAUSE_WAIT)
+   | OMP_CLAUSE_WAIT | OMP_CLAUSE_ATTACH)
 #define OACC_DATA_CLAUSES \
   (omp_mask (OMP_CLAUSE_IF) | OMP_CLAUSE_DEVICEPTR  | OMP_CLAUSE_COPY	      \
    | OMP_CLAUSE_COPYIN | OMP_CLAUSE_COPYOUT | OMP_CLAUSE_CREATE		      \
-   | OMP_CLAUSE_PRESENT)
+   | OMP_CLAUSE_PRESENT | OMP_CLAUSE_ATTACH)
 #define OACC_LOOP_CLAUSES \
   (omp_mask (OMP_CLAUSE_COLLAPSE) | OMP_CLAUSE_GANG | OMP_CLAUSE_WORKER	      \
    | OMP_CLAUSE_VECTOR | OMP_CLAUSE_SEQ | OMP_CLAUSE_INDEPENDENT	      \
@@ -1957,10 +1983,11 @@ gfc_match_omp_clauses (gfc_omp_clauses **cp, const omp_mask mask,
    | OMP_CLAUSE_DEVICE | OMP_CLAUSE_WAIT | OMP_CLAUSE_IF_PRESENT)
 #define OACC_ENTER_DATA_CLAUSES \
   (omp_mask (OMP_CLAUSE_IF) | OMP_CLAUSE_ASYNC | OMP_CLAUSE_WAIT	      \
-   | OMP_CLAUSE_COPYIN | OMP_CLAUSE_CREATE)
+   | OMP_CLAUSE_COPYIN | OMP_CLAUSE_CREATE | OMP_CLAUSE_ATTACH)
 #define OACC_EXIT_DATA_CLAUSES \
   (omp_mask (OMP_CLAUSE_IF) | OMP_CLAUSE_ASYNC | OMP_CLAUSE_WAIT	      \
-   | OMP_CLAUSE_COPYOUT | OMP_CLAUSE_DELETE | OMP_CLAUSE_FINALIZE)
+   | OMP_CLAUSE_COPYOUT | OMP_CLAUSE_DELETE | OMP_CLAUSE_FINALIZE	      \
+   | OMP_CLAUSE_DETACH)
 #define OACC_WAIT_CLAUSES \
   omp_mask (OMP_CLAUSE_ASYNC)
 #define OACC_ROUTINE_CLAUSES \
@@ -3733,9 +3760,6 @@ resolve_nonnegative_int_expr (gfc_expr *expr, const char *clause)
 static void
 check_symbol_not_pointer (gfc_symbol *sym, locus loc, const char *name)
 {
-  if (sym->ts.type == BT_DERIVED && sym->attr.pointer)
-    gfc_error ("POINTER object %qs of derived type in %s clause at %L",
-	       sym->name, name, &loc);
   if (sym->ts.type == BT_DERIVED && sym->attr.cray_pointer)
     gfc_error ("Cray pointer object %qs of derived type in %s clause at %L",
 	       sym->name, name, &loc);
@@ -3780,9 +3804,6 @@ check_array_not_assumed (gfc_symbol *sym, locus loc, const char *name)
 static void
 resolve_oacc_data_clauses (gfc_symbol *sym, locus loc, const char *name)
 {
-  if (sym->ts.type == BT_DERIVED && sym->attr.allocatable)
-    gfc_error ("ALLOCATABLE object %qs of derived type in %s clause at %L",
-	       sym->name, name, &loc);
   if ((sym->ts.type == BT_ASSUMED && sym->attr.allocatable)
       || (sym->ts.type == BT_CLASS && CLASS_DATA (sym)
 	  && CLASS_DATA (sym)->attr.allocatable))
@@ -4152,11 +4173,26 @@ resolve_omp_clauses (gfc_code *code, gfc_omp_clauses *omp_clauses,
 	&& (list != OMP_LIST_REDUCTION || !openacc))
       for (n = omp_clauses->lists[list]; n; n = n->next)
 	{
-	  if (n->sym->mark)
-	    gfc_error ("Symbol %qs present on multiple clauses at %L",
-		       n->sym->name, &n->where);
-	  else
-	    n->sym->mark = 1;
+	  bool array_only_p = true;
+	  /* Disallow duplicate bare variable references and multiple
+	     subarrays of the same array here, but allow multiple components of
+	     the same (e.g. derived-type) variable.  For the latter, duplicate
+	     components are detected elsewhere.  */
+	  if (openacc && n->expr && n->expr->expr_type == EXPR_VARIABLE)
+	    for (gfc_ref *ref = n->expr->ref; ref; ref = ref->next)
+	      if (ref->type != REF_ARRAY)
+	        {
+		  array_only_p = false;
+		  break;
+		}
+	  if (array_only_p)
+	    {
+	      if (n->sym->mark)
+		gfc_error ("Symbol %qs present on multiple clauses at %L",
+			   n->sym->name, &n->where);
+	      else
+		n->sym->mark = 1;
+	    }
 	}
 
   gcc_assert (OMP_LIST_LASTPRIVATE == OMP_LIST_FIRSTPRIVATE + 1);
@@ -4347,23 +4383,43 @@ resolve_omp_clauses (gfc_code *code, gfc_omp_clauses *omp_clauses,
 				 "are allowed on ORDERED directive at %L",
 				 &n->where);
 		  }
+		gfc_ref *array_ref = NULL;
+		bool resolved = false;
 		if (n->expr)
 		  {
-		    if (!gfc_resolve_expr (n->expr)
+		    array_ref = n->expr->ref;
+		    resolved = gfc_resolve_expr (n->expr);
+
+		    /* Look through component refs to find last array
+		       reference.  */
+		    if (openacc)
+		      while (resolved
+			     && array_ref
+			     && (array_ref->type == REF_COMPONENT
+				 || (array_ref->type == REF_ARRAY
+				     && array_ref->next
+			             && (array_ref->next->type
+					 == REF_COMPONENT))))
+			array_ref = array_ref->next;
+		  }
+		if (array_ref
+		    || (n->expr
+			&& (!resolved || n->expr->expr_type != EXPR_VARIABLE)))
+		  {
+		    if (!resolved
 			|| n->expr->expr_type != EXPR_VARIABLE
-			|| n->expr->ref == NULL
-			|| n->expr->ref->next
-			|| n->expr->ref->type != REF_ARRAY)
+			|| array_ref->next
+			|| array_ref->type != REF_ARRAY)
 		      gfc_error ("%qs in %s clause at %L is not a proper "
 				 "array section", n->sym->name, name,
 				 &n->where);
-		    else if (n->expr->ref->u.ar.codimen)
+		    else if (array_ref->u.ar.codimen)
 		      gfc_error ("Coarrays not supported in %s clause at %L",
 				 name, &n->where);
 		    else
 		      {
 			int i;
-			gfc_array_ref *ar = &n->expr->ref->u.ar;
+			gfc_array_ref *ar = &array_ref->u.ar;
 			for (i = 0; i < ar->dimen; i++)
 			  if (ar->stride[i])
 			    {
diff --git a/gcc/fortran/trans-openmp.c b/gcc/fortran/trans-openmp.c
index 07dd790..4d50c6d 100644
--- a/gcc/fortran/trans-openmp.c
+++ b/gcc/fortran/trans-openmp.c
@@ -60,6 +60,9 @@ gfc_omp_privatize_by_reference (const_tree decl)
 
   if (TREE_CODE (type) == POINTER_TYPE)
     {
+      while (TREE_CODE (decl) == COMPONENT_REF)
+        decl = TREE_OPERAND (decl, 1);
+
       /* Array POINTER/ALLOCATABLE have aggregate types, all user variables
 	 that have POINTER_TYPE type and aren't scalar pointers, scalar
 	 allocatables, Cray pointees or C pointers are supposed to be
@@ -2113,20 +2116,47 @@ gfc_trans_omp_clauses (stmtblock_t *block, gfc_omp_clauses *clauses,
 	      tree decl = gfc_get_symbol_decl (n->sym);
 	      if (DECL_P (decl))
 		TREE_ADDRESSABLE (decl) = 1;
-	      if (n->expr == NULL || n->expr->ref->u.ar.type == AR_FULL)
+
+	      gfc_ref *ref = n->expr ? n->expr->ref : NULL;
+	      symbol_attribute *sym_attr = &n->sym->attr;
+	      gomp_map_kind ptr_map_kind = GOMP_MAP_POINTER;
+
+	      if (ref && n->sym->ts.type == BT_DERIVED)
+	        {
+		  if (gfc_omp_privatize_by_reference (decl))
+		    decl = build_fold_indirect_ref (decl);
+
+		  for (; ref && ref->type == REF_COMPONENT; ref = ref->next)
+		    {
+		      tree field = ref->u.c.component->backend_decl;
+		      gcc_assert (field && TREE_CODE (field) == FIELD_DECL);
+		      decl = fold_build3 (COMPONENT_REF, TREE_TYPE (field),
+					  decl, field, NULL_TREE);
+		      sym_attr = &ref->u.c.component->attr;
+		    }
+
+		  ptr_map_kind = GOMP_MAP_ALWAYS_POINTER;
+		}
+
+	      if (ref == NULL || ref->u.ar.type == AR_FULL)
 		{
+		  tree field = decl;
+
+		  while (TREE_CODE (field) == COMPONENT_REF)
+		    field = TREE_OPERAND (field, 1);
+
 		  if (POINTER_TYPE_P (TREE_TYPE (decl))
 		      && (gfc_omp_privatize_by_reference (decl)
-			  || GFC_DECL_GET_SCALAR_POINTER (decl)
-			  || GFC_DECL_GET_SCALAR_ALLOCATABLE (decl)
-			  || GFC_DECL_CRAY_POINTEE (decl)
+			  || GFC_DECL_GET_SCALAR_POINTER (field)
+			  || GFC_DECL_GET_SCALAR_ALLOCATABLE (field)
+			  || GFC_DECL_CRAY_POINTEE (field)
 			  || GFC_DESCRIPTOR_TYPE_P
-					(TREE_TYPE (TREE_TYPE (decl)))))
+					(TREE_TYPE (TREE_TYPE (field)))))
 		    {
 		      tree orig_decl = decl;
 		      node4 = build_omp_clause (input_location,
 						OMP_CLAUSE_MAP);
-		      OMP_CLAUSE_SET_MAP_KIND (node4, GOMP_MAP_POINTER);
+		      OMP_CLAUSE_SET_MAP_KIND (node4, ptr_map_kind);
 		      OMP_CLAUSE_DECL (node4) = decl;
 		      OMP_CLAUSE_SIZE (node4) = size_int (0);
 		      decl = build_fold_indirect_ref (decl);
@@ -2136,13 +2166,15 @@ gfc_trans_omp_clauses (stmtblock_t *block, gfc_omp_clauses *clauses,
 			{
 			  node3 = build_omp_clause (input_location,
 						    OMP_CLAUSE_MAP);
-			  OMP_CLAUSE_SET_MAP_KIND (node3, GOMP_MAP_POINTER);
+			  OMP_CLAUSE_SET_MAP_KIND (node3, ptr_map_kind);
 			  OMP_CLAUSE_DECL (node3) = decl;
 			  OMP_CLAUSE_SIZE (node3) = size_int (0);
 			  decl = build_fold_indirect_ref (decl);
 			}
 		    }
-		  if (GFC_DESCRIPTOR_TYPE_P (TREE_TYPE (decl)))
+		  if (GFC_DESCRIPTOR_TYPE_P (TREE_TYPE (decl))
+		      && n->u.map_op != OMP_MAP_ATTACH
+		      && n->u.map_op != OMP_MAP_DETACH)
 		    {
 		      tree type = TREE_TYPE (decl);
 		      tree ptr = gfc_conv_descriptor_data_get (decl);
@@ -2157,14 +2189,16 @@ gfc_trans_omp_clauses (stmtblock_t *block, gfc_omp_clauses *clauses,
 		      OMP_CLAUSE_SIZE (node2) = TYPE_SIZE_UNIT (type);
 		      node3 = build_omp_clause (input_location,
 						OMP_CLAUSE_MAP);
-		      OMP_CLAUSE_SET_MAP_KIND (node3, GOMP_MAP_POINTER);
+		      OMP_CLAUSE_SET_MAP_KIND (node3, ptr_map_kind);
 		      OMP_CLAUSE_DECL (node3)
 			= gfc_conv_descriptor_data_get (decl);
+		      if (ptr_map_kind == GOMP_MAP_ALWAYS_POINTER)
+		        STRIP_NOPS (OMP_CLAUSE_DECL (node3));
 		      OMP_CLAUSE_SIZE (node3) = size_int (0);
 
 		      /* We have to check for n->sym->attr.dimension because
 			 of scalar coarrays.  */
-		      if (n->sym->attr.pointer && n->sym->attr.dimension)
+		      if (sym_attr->pointer && sym_attr->dimension)
 			{
 			  stmtblock_t cond_block;
 			  tree size
@@ -2194,11 +2228,11 @@ gfc_trans_omp_clauses (stmtblock_t *block, gfc_omp_clauses *clauses,
 							     else_b));
 			  OMP_CLAUSE_SIZE (node) = size;
 			}
-		      else if (n->sym->attr.dimension)
+		      else if (sym_attr->dimension)
 			OMP_CLAUSE_SIZE (node)
 			  = gfc_full_array_size (block, decl,
 						 GFC_TYPE_ARRAY_RANK (type));
-		      if (n->sym->attr.dimension)
+		      if (sym_attr->dimension)
 			{
 			  tree elemsz
 			    = TYPE_SIZE_UNIT (gfc_get_element_type (type));
@@ -2211,11 +2245,11 @@ gfc_trans_omp_clauses (stmtblock_t *block, gfc_omp_clauses *clauses,
 		  else
 		    OMP_CLAUSE_DECL (node) = decl;
 		}
-	      else
+	      else if (ref)
 		{
 		  tree ptr, ptr2;
 		  gfc_init_se (&se, NULL);
-		  if (n->expr->ref->u.ar.type == AR_ELEMENT)
+		  if (ref->u.ar.type == AR_ELEMENT)
 		    {
 		      gfc_conv_expr_reference (&se, n->expr);
 		      gfc_add_block_to_block (block, &se.pre);
@@ -2249,7 +2283,7 @@ gfc_trans_omp_clauses (stmtblock_t *block, gfc_omp_clauses *clauses,
 		    {
 		      node4 = build_omp_clause (input_location,
 						OMP_CLAUSE_MAP);
-		      OMP_CLAUSE_SET_MAP_KIND (node4, GOMP_MAP_POINTER);
+		      OMP_CLAUSE_SET_MAP_KIND (node4, ptr_map_kind);
 		      OMP_CLAUSE_DECL (node4) = decl;
 		      OMP_CLAUSE_SIZE (node4) = size_int (0);
 		      decl = build_fold_indirect_ref (decl);
@@ -2266,9 +2300,11 @@ gfc_trans_omp_clauses (stmtblock_t *block, gfc_omp_clauses *clauses,
 		      OMP_CLAUSE_SIZE (node2) = TYPE_SIZE_UNIT (type);
 		      node3 = build_omp_clause (input_location,
 						OMP_CLAUSE_MAP);
-		      OMP_CLAUSE_SET_MAP_KIND (node3, GOMP_MAP_POINTER);
+		      OMP_CLAUSE_SET_MAP_KIND (node3, ptr_map_kind);
 		      OMP_CLAUSE_DECL (node3)
 			= gfc_conv_descriptor_data_get (decl);
+		      if (ptr_map_kind == GOMP_MAP_ALWAYS_POINTER)
+		        STRIP_NOPS (OMP_CLAUSE_DECL (node3));
 		    }
 		  else
 		    {
@@ -2281,18 +2317,23 @@ gfc_trans_omp_clauses (stmtblock_t *block, gfc_omp_clauses *clauses,
 			}
 		      node3 = build_omp_clause (input_location,
 						OMP_CLAUSE_MAP);
-		      OMP_CLAUSE_SET_MAP_KIND (node3, GOMP_MAP_POINTER);
+		      OMP_CLAUSE_SET_MAP_KIND (node3, ptr_map_kind);
 		      OMP_CLAUSE_DECL (node3) = decl;
 		    }
 		  ptr2 = fold_convert (sizetype, ptr2);
 		  OMP_CLAUSE_SIZE (node3)
 		    = fold_build2 (MINUS_EXPR, sizetype, ptr, ptr2);
 		}
+	      else
+	        gcc_unreachable ();
 	      switch (n->u.map_op)
 		{
 		case OMP_MAP_ALLOC:
 		  OMP_CLAUSE_SET_MAP_KIND (node, GOMP_MAP_ALLOC);
 		  break;
+		case OMP_MAP_ATTACH:
+		  OMP_CLAUSE_SET_MAP_KIND (node, GOMP_MAP_ATTACH);
+		  break;
 		case OMP_MAP_TO:
 		  OMP_CLAUSE_SET_MAP_KIND (node, GOMP_MAP_TO);
 		  break;
@@ -2317,6 +2358,9 @@ gfc_trans_omp_clauses (stmtblock_t *block, gfc_omp_clauses *clauses,
 		case OMP_MAP_DELETE:
 		  OMP_CLAUSE_SET_MAP_KIND (node, GOMP_MAP_DELETE);
 		  break;
+		case OMP_MAP_DETACH:
+		  OMP_CLAUSE_SET_MAP_KIND (node, GOMP_MAP_DETACH);
+		  break;
 		case OMP_MAP_FORCE_ALLOC:
 		  OMP_CLAUSE_SET_MAP_KIND (node, GOMP_MAP_FORCE_ALLOC);
 		  break;
diff --git a/gcc/gimplify.c b/gcc/gimplify.c
index 93e49ed..c27ea76 100644
--- a/gcc/gimplify.c
+++ b/gcc/gimplify.c
@@ -113,6 +113,10 @@ enum gimplify_omp_var_data
 
   GOVD_NONTEMPORAL = 4194304,
 
+  /* Flag for GOVD_MAP: (struct) vars that have pointer attachments for
+     fields.  */
+  GOVD_MAP_HAS_ATTACHMENTS = 8388608,
+
   GOVD_DATA_SHARE_CLASS = (GOVD_SHARED | GOVD_PRIVATE | GOVD_FIRSTPRIVATE
 			   | GOVD_LASTPRIVATE | GOVD_REDUCTION | GOVD_LINEAR
 			   | GOVD_LOCAL)
@@ -8018,7 +8022,13 @@ insert_struct_component_mapping (enum tree_code code, tree c, tree struct_node,
   OMP_CLAUSE_SET_MAP_KIND (c2, mkind);
   OMP_CLAUSE_DECL (c2) = unshare_expr (OMP_CLAUSE_DECL (c));
   OMP_CLAUSE_CHAIN (c2) = scp ? *scp : prev_node;
-  OMP_CLAUSE_SIZE (c2) = TYPE_SIZE_UNIT (ptr_type_node);
+  if (OMP_CLAUSE_CHAIN (prev_node) != c
+      && OMP_CLAUSE_CODE (OMP_CLAUSE_CHAIN (prev_node)) == OMP_CLAUSE_MAP
+      && (OMP_CLAUSE_MAP_KIND (OMP_CLAUSE_CHAIN (prev_node))
+	  == GOMP_MAP_TO_PSET))
+    OMP_CLAUSE_SIZE (c2) = OMP_CLAUSE_SIZE (OMP_CLAUSE_CHAIN (prev_node));
+  else
+    OMP_CLAUSE_SIZE (c2) = TYPE_SIZE_UNIT (ptr_type_node);
   if (struct_node)
     OMP_CLAUSE_CHAIN (struct_node) = c2;
 
@@ -8612,7 +8622,9 @@ gimplify_scan_omp_clauses (tree *list_p, gimple_seq *pre_p,
 		  remove = true;
 		  break;
 		}
-	      if (DECL_P (decl))
+	      if (DECL_P (decl)
+		  && OMP_CLAUSE_MAP_KIND (c) != GOMP_MAP_TO_PSET
+		  && code != OACC_UPDATE)
 		{
 		  if (error_operand_p (decl))
 		    {
@@ -8664,16 +8676,40 @@ gimplify_scan_omp_clauses (tree *list_p, gimple_seq *pre_p,
 		    = splay_tree_lookup (ctx->variables, (splay_tree_key)decl);
 		  bool ptr = (OMP_CLAUSE_MAP_KIND (c)
 			      == GOMP_MAP_ALWAYS_POINTER);
+		  bool attach = OMP_CLAUSE_MAP_KIND (c) == GOMP_MAP_ATTACH
+			        || OMP_CLAUSE_MAP_KIND (c) == GOMP_MAP_DETACH;
+		  bool has_attachments = false;
+		  /* For OpenACC, pointers in structs should trigger an
+		     attach action.  */
+		  if (ptr && (region_type & ORT_ACC) != 0)
+		    {
+		      /* Turning a GOMP_MAP_ALWAYS_POINTER clause into a
+			 GOMP_MAP_ATTACH clause after we have detected a case
+			 that needs a GOMP_MAP_STRUCT mapping added.  */
+		      gomp_map_kind k
+			= (code == OACC_EXIT_DATA) ? GOMP_MAP_DETACH
+						   : GOMP_MAP_ATTACH;
+		      OMP_CLAUSE_SET_MAP_KIND (c, k);
+		      has_attachments = true;
+		    }
 		  if (n == NULL || (n->value & GOVD_MAP) == 0)
 		    {
 		      tree l = build_omp_clause (OMP_CLAUSE_LOCATION (c),
 						 OMP_CLAUSE_MAP);
-		      OMP_CLAUSE_SET_MAP_KIND (l, GOMP_MAP_STRUCT);
+		      gomp_map_kind k = attach ? GOMP_MAP_FORCE_PRESENT
+					       : GOMP_MAP_STRUCT;
+
+		      OMP_CLAUSE_SET_MAP_KIND (l, k);
 		      if (!base_eq_orig_base)
 			OMP_CLAUSE_DECL (l) = unshare_expr (orig_base);
 		      else
 			OMP_CLAUSE_DECL (l) = decl;
-		      OMP_CLAUSE_SIZE (l) = size_int (1);
+		      OMP_CLAUSE_SIZE (l)
+			= (!attach
+			   ? size_int (1)
+			   : DECL_P (OMP_CLAUSE_DECL (l))
+			   ? DECL_SIZE_UNIT (OMP_CLAUSE_DECL (l))
+			   : TYPE_SIZE_UNIT (TREE_TYPE (OMP_CLAUSE_DECL (l))));
 		      if (struct_map_to_clause == NULL)
 			struct_map_to_clause = new hash_map<tree, tree>;
 		      struct_map_to_clause->put (decl, l);
@@ -8705,9 +8741,11 @@ gimplify_scan_omp_clauses (tree *list_p, gimple_seq *pre_p,
 		      flags = GOVD_MAP | GOVD_EXPLICIT;
 		      if (GOMP_MAP_ALWAYS_P (OMP_CLAUSE_MAP_KIND (c)) || ptr)
 			flags |= GOVD_SEEN;
+		      if (has_attachments)
+			flags |= GOVD_MAP_HAS_ATTACHMENTS;
 		      goto do_add_decl;
 		    }
-		  else
+		  else if (struct_map_to_clause)
 		    {
 		      tree *osc = struct_map_to_clause->get (decl);
 		      tree *sc = NULL, *scp = NULL;
@@ -8716,8 +8754,10 @@ gimplify_scan_omp_clauses (tree *list_p, gimple_seq *pre_p,
 		      sc = &OMP_CLAUSE_CHAIN (*osc);
 		      if (*sc != c
 			  && (OMP_CLAUSE_MAP_KIND (*sc)
-			      == GOMP_MAP_FIRSTPRIVATE_REFERENCE)) 
+			      == GOMP_MAP_FIRSTPRIVATE_REFERENCE))
 			sc = &OMP_CLAUSE_CHAIN (*sc);
+		      /* Here "prev_list_p" is the end of the inserted
+			 alloc/release nodes after the struct node, OSC.  */
 		      for (; *sc != c; sc = &OMP_CLAUSE_CHAIN (*sc))
 			if (ptr && sc == prev_list_p)
 			  break;
@@ -8776,9 +8816,10 @@ gimplify_scan_omp_clauses (tree *list_p, gimple_seq *pre_p,
 			  }
 		      if (remove)
 			break;
-		      OMP_CLAUSE_SIZE (*osc)
-			= size_binop (PLUS_EXPR, OMP_CLAUSE_SIZE (*osc),
-				      size_one_node);
+		      if (!attach)
+			OMP_CLAUSE_SIZE (*osc)
+			  = size_binop (PLUS_EXPR, OMP_CLAUSE_SIZE (*osc),
+					size_one_node);
 		      if (ptr)
 			{
 			  tree cl
@@ -8810,11 +8851,15 @@ gimplify_scan_omp_clauses (tree *list_p, gimple_seq *pre_p,
 		}
 	      if (!remove
 		  && OMP_CLAUSE_MAP_KIND (c) != GOMP_MAP_ALWAYS_POINTER
+		  && OMP_CLAUSE_MAP_KIND (c) != GOMP_MAP_TO_PSET
 		  && OMP_CLAUSE_CHAIN (c)
 		  && OMP_CLAUSE_CODE (OMP_CLAUSE_CHAIN (c)) == OMP_CLAUSE_MAP
-		  && (OMP_CLAUSE_MAP_KIND (OMP_CLAUSE_CHAIN (c))
-		      == GOMP_MAP_ALWAYS_POINTER))
+		  && ((OMP_CLAUSE_MAP_KIND (OMP_CLAUSE_CHAIN (c))
+		       == GOMP_MAP_ALWAYS_POINTER)
+		      || (OMP_CLAUSE_MAP_KIND (OMP_CLAUSE_CHAIN (c))
+		          == GOMP_MAP_TO_PSET)))
 		prev_list_p = list_p;
+
 	      break;
 	    }
 	  flags = GOVD_MAP | GOVD_EXPLICIT;
@@ -9436,6 +9481,8 @@ gimplify_adjust_omp_clauses_1 (splay_tree_node n, void *data)
     return 0;
   if ((flags & GOVD_SEEN) == 0)
     return 0;
+  if ((flags & GOVD_MAP_HAS_ATTACHMENTS) != 0)
+    return 0;
   if (flags & GOVD_DEBUG_PRIVATE)
     {
       gcc_assert ((flags & GOVD_DATA_SHARE_CLASS) == GOVD_SHARED);
@@ -11819,8 +11866,9 @@ gimplify_omp_target_update (tree *expr_p, gimple_seq *pre_p)
 	   && omp_find_clause (OMP_STANDALONE_CLAUSES (expr),
 			       OMP_CLAUSE_FINALIZE))
     {
-      /* Use GOMP_MAP_DELETE/GOMP_MAP_FORCE_FROM to denote that "finalize"
-	 semantics apply to all mappings of this OpenACC directive.  */
+      /* Use GOMP_MAP_DELETE, GOMP_MAP_FORCE_DETACH, and
+	 GOMP_MAP_FORCE_FROM to denote that "finalize" semantics apply
+	 to all mappings of this OpenACC directive.  */
       bool finalize_marked = false;
       for (tree c = OMP_STANDALONE_CLAUSES (expr); c; c = OMP_CLAUSE_CHAIN (c))
 	if (OMP_CLAUSE_CODE (c) == OMP_CLAUSE_MAP)
@@ -11834,10 +11882,19 @@ gimplify_omp_target_update (tree *expr_p, gimple_seq *pre_p)
 	      OMP_CLAUSE_SET_MAP_KIND (c, GOMP_MAP_DELETE);
 	      finalize_marked = true;
 	      break;
+	    case GOMP_MAP_DETACH:
+	      OMP_CLAUSE_SET_MAP_KIND (c, GOMP_MAP_FORCE_DETACH);
+	      finalize_marked = true;
+	      break;
+	    case GOMP_MAP_STRUCT:
+	    case GOMP_MAP_FORCE_PRESENT:
+	      /* Skip over an initial struct or force_present mapping.  */
+	      break;
 	    default:
-	      /* Check consistency: libgomp relies on the very first data
-		 mapping clause being marked, so make sure we did that before
-		 any other mapping clauses.  */
+	      /* Check consistency: libgomp relies on the very first
+		 non-struct, non-force-present data mapping clause being
+		 marked, so make sure we did that before any other mapping
+		 clauses.  */
 	      gcc_assert (finalize_marked);
 	      break;
 	    }
diff --git a/gcc/omp-low.c b/gcc/omp-low.c
index b406ce7..60d541b 100644
--- a/gcc/omp-low.c
+++ b/gcc/omp-low.c
@@ -9152,6 +9152,9 @@ lower_omp_target (gimple_stmt_iterator *gsi_p, omp_context *ctx)
 	  case GOMP_MAP_FORCE_DEVICEPTR:
 	  case GOMP_MAP_DEVICE_RESIDENT:
 	  case GOMP_MAP_LINK:
+	  case GOMP_MAP_ATTACH:
+	  case GOMP_MAP_DETACH:
+	  case GOMP_MAP_FORCE_DETACH:
 	    gcc_assert (is_gimple_omp_oacc (stmt));
 	    break;
 	  default:
diff --git a/gcc/testsuite/c-c++-common/goacc/mdc-1.c b/gcc/testsuite/c-c++-common/goacc/mdc-1.c
new file mode 100644
index 0000000..b8d03a0
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/goacc/mdc-1.c
@@ -0,0 +1,55 @@
+/* Test OpenACC's support for manual deep copy, including the attach
+   and detach clauses.  */
+
+/* { dg-do compile { target int32 } } */
+/* { dg-additional-options "-fdump-tree-omplower" } */
+
+void
+t1 ()
+{
+  struct foo {
+    int *a, *b, c, d, *e;
+  } s;
+
+  int *a, *z;
+
+#pragma acc enter data copyin(s)
+  {
+#pragma acc data copy(s.a[0:10]) copy(z[0:10])
+    {
+      s.e = z;
+#pragma acc parallel loop attach(s.e)
+      for (int i = 0; i < 10; i++)
+        s.a[i] = s.e[i];
+
+
+      a = s.e;
+#pragma acc enter data attach(a)
+#pragma acc exit data detach(a)
+    }
+
+#pragma acc enter data copyin(a)
+#pragma acc acc enter data attach(s.e)
+#pragma acc exit data detach(s.e)
+
+#pragma acc data attach(s.e)
+    {
+    }
+#pragma acc exit data delete(a)
+
+#pragma acc exit data detach(a) finalize
+#pragma acc exit data detach(s.a) finalize
+  }
+}
+
+/* { dg-final { scan-tree-dump-times "pragma omp target oacc_enter_exit_data map.to:s .len: 32.." 1 "omplower" } } */
+/* { dg-final { scan-tree-dump-times "pragma omp target oacc_data map.tofrom:.z .len: 40.. map.struct:s .len: 1.. map.alloc:s.a .len: 8.. map.tofrom:._1 .len: 40.. map.attach:s.a .len: 0.." 1 "omplower" } } */
+/* { dg-final { scan-tree-dump-times "pragma omp target oacc_parallel map.force_present:s .len: 32.. map.attach:s.e .len: 8.." 1 "omplower" } } */
+/* { dg-final { scan-tree-dump-times "pragma omp target oacc_enter_exit_data map.attach:a .len: 8.." 1 "omplower" } } */
+/* { dg-final { scan-tree-dump-times "pragma omp target oacc_enter_exit_data map.detach:a .len: 8.." 1 "omplower" } } */
+/* { dg-final { scan-tree-dump-times "pragma omp target oacc_enter_exit_data map.to:a .len: 8.." 1 "omplower" } } */
+/* { dg-final { scan-tree-dump-times "pragma omp target oacc_enter_exit_data map.force_present:s .len: 32.. map.detach:s.e .len: 8.." 1 "omplower" } } */
+/* { dg-final { scan-tree-dump-times "pragma omp target oacc_data map.force_present:s .len: 32.. map.attach:s.e .len: 8.." 1 "omplower" } } */
+/* { dg-final { scan-tree-dump-times "pragma omp target oacc_enter_exit_data map.release:a .len: 8.." 1 "omplower" } } */
+/* { dg-final { scan-tree-dump-times "pragma omp target oacc_enter_exit_data finalize map.force_detach:a .len: 8.." 1 "omplower" } } */
+/* { dg-final { scan-tree-dump-times "pragma omp target oacc_enter_exit_data finalize map.force_present:s .len: 32.. map.force_detach:s.a .len: 8.." 1 "omplower" } } */
diff --git a/gcc/testsuite/c-c++-common/goacc/mdc-2.c b/gcc/testsuite/c-c++-common/goacc/mdc-2.c
new file mode 100644
index 0000000..ebfb99d
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/goacc/mdc-2.c
@@ -0,0 +1,62 @@
+/* Test OpenACC's support for manual deep copy, including the attach
+   and detach clauses.  */
+
+void
+t1 ()
+{
+  struct foo {
+    int *a, *b, c, d, *e;
+  } s;
+
+  int *a, *z, scalar, **y;
+
+#pragma acc enter data copyin(s) detach(z) /* { dg-error ".detach. is not valid for" } */
+  {
+#pragma acc data copy(s.a[0:10]) copy(z[0:10])
+    {
+      s.e = z;
+#pragma acc parallel loop attach(s.e) detach(s.b) /* { dg-error ".detach. is not valid for" } */
+      for (int i = 0; i < 10; i++)
+        s.a[i] = s.e[i];
+
+      a = s.e;
+#pragma acc enter data attach(a) detach(s.c) /* { dg-error ".detach. is not valid for" } */
+#pragma acc exit data detach(a)
+    }
+
+#pragma acc enter data attach(z[:5]) /* { dg-error "array section in .attach. clause" } */
+/* { dg-error "has no data movement clause" "" { target *-*-* } .-1 } */
+#pragma acc exit data detach(z[:5]) /* { dg-error "array section in .detach. clause" } */
+/* { dg-error "has no data movement clause" "" { target *-*-* } .-1 } */
+#pragma acc enter data attach(z[1:]) /* { dg-error "array section in .attach. clause" } */
+/* { dg-error "has no data movement clause" "" { target *-*-* } .-1 } */
+#pragma acc exit data detach(z[1:]) /* { dg-error "array section in .detach. clause" } */
+/* { dg-error "has no data movement clause" "" { target *-*-* } .-1 } */
+#pragma acc enter data attach(z[:]) /* { dg-error "array section in .attach. clause" } */
+/* { dg-error "has no data movement clause" "" { target *-*-* } .-1 } */
+#pragma acc exit data detach(z[:]) /* { dg-error "array section in .detach. clause" } */
+/* { dg-error "has no data movement clause" "" { target *-*-* } .-1 } */
+#pragma acc enter data attach(z[3]) /* { dg-error "expected pointer in .attach. clause" } */
+#pragma acc exit data detach(z[3]) /* { dg-error "expected pointer in .detach. clause" } */
+
+#pragma acc acc enter data attach(s.e)
+#pragma acc exit data detach(s.e) attach(z) /* { dg-error ".attach. is not valid for" } */
+
+#pragma acc data attach(s.e)
+    {
+    }
+#pragma acc exit data delete(a) attach(s.a) /* { dg-error ".attach. is not valid for" } */
+
+#pragma acc enter data attach(scalar) /* { dg-error "expected pointer in .attach. clause" } */
+/* { dg-error "has no data movement clause" "" { target *-*-* } .-1 } */
+#pragma acc exit data detach(scalar) /* { dg-error "expected pointer in .detach. clause" } */
+/* { dg-error "has no data movement clause" "" { target *-*-* } .-1 } */
+#pragma acc enter data attach(s) /* { dg-error "expected pointer in .attach. clause" } */
+/* { dg-error "has no data movement clause" "" { target *-*-* } .-1 } */
+#pragma acc exit data detach(s) /* { dg-error "expected pointer in .detach. clause" } */
+/* { dg-error "has no data movement clause" "" { target *-*-* } .-1 } */
+  }
+
+#pragma acc enter data attach(y[10])
+#pragma acc exit data detach(y[10])
+}
diff --git a/gcc/testsuite/g++.dg/goacc/mdc.C b/gcc/testsuite/g++.dg/goacc/mdc.C
new file mode 100644
index 0000000..fbd43aa
--- /dev/null
+++ b/gcc/testsuite/g++.dg/goacc/mdc.C
@@ -0,0 +1,68 @@
+/* Test OpenACC's support for manual deep copy, including the attach
+   and detach clauses.  */
+
+void
+t1 ()
+{
+  struct foo {
+    int *a, *b, c, d, *e;
+  } s;
+
+  struct foo& rs = s;
+  
+  int *a, *z, scalar, **y;
+  int* const &ra = a;
+  int* const &rz = z;
+  int& rscalar = scalar;
+  int** const &ry = y;
+
+#pragma acc enter data copyin(rs) detach(rz) /* { dg-error ".detach. is not valid for" } */
+  {
+#pragma acc data copy(rs.a[0:10]) copy(rz[0:10])
+    {
+      s.e = z;
+#pragma acc parallel loop attach(rs.e) detach(rs.b) /* { dg-error ".detach. is not valid for" } */
+      for (int i = 0; i < 10; i++)
+        s.a[i] = s.e[i];
+
+      a = s.e;
+#pragma acc enter data attach(ra) detach(rs.c) /* { dg-error ".detach. is not valid for" } */
+#pragma acc exit data detach(ra)
+    }
+
+#pragma acc enter data attach(rz[:5]) /* { dg-error "array section in .attach. clause" } */
+/* { dg-error "has no data movement clause" "" { target *-*-* } .-1 } */
+#pragma acc exit data detach(rz[:5]) /* { dg-error "array section in .detach. clause" } */
+/* { dg-error "has no data movement clause" "" { target *-*-* } .-1 } */
+#pragma acc enter data attach(rz[1:]) /* { dg-error "array section in .attach. clause" } */
+/* { dg-error "has no data movement clause" "" { target *-*-* } .-1 } */
+#pragma acc exit data detach(rz[1:]) /* { dg-error "array section in .detach. clause" } */
+/* { dg-error "has no data movement clause" "" { target *-*-* } .-1 } */
+#pragma acc enter data attach(rz[:]) /* { dg-error "array section in .attach. clause" } */
+/* { dg-error "has no data movement clause" "" { target *-*-* } .-1 } */
+#pragma acc exit data detach(rz[:]) /* { dg-error "array section in .detach. clause" } */
+/* { dg-error "has no data movement clause" "" { target *-*-* } .-1 } */
+#pragma acc enter data attach(rz[3]) /* { dg-error "expected pointer in .attach. clause" } */
+#pragma acc exit data detach(rz[3]) /* { dg-error "expected pointer in .detach. clause" } */
+
+#pragma acc acc enter data attach(rs.e)
+#pragma acc exit data detach(rs.e) attach(rz) /* { dg-error ".attach. is not valid for" } */
+
+#pragma acc data attach(rs.e)
+    {
+    }
+#pragma acc exit data delete(ra) attach(rs.a) /* { dg-error ".attach. is not valid for" } */
+
+#pragma acc enter data attach(rscalar) /* { dg-error "expected pointer in .attach. clause" } */
+/* { dg-error "has no data movement clause" "" { target *-*-* } .-1 } */
+#pragma acc exit data detach(rscalar) /* { dg-error "expected pointer in .detach. clause" } */
+/* { dg-error "has no data movement clause" "" { target *-*-* } .-1 } */
+#pragma acc enter data attach(rs) /* { dg-error "expected pointer in .attach. clause" } */
+/* { dg-error "has no data movement clause" "" { target *-*-* } .-1 } */
+#pragma acc exit data detach(rs) /* { dg-error "expected pointer in .detach. clause" } */
+/* { dg-error "has no data movement clause" "" { target *-*-* } .-1 } */
+  }
+
+#pragma acc enter data attach(ry[10])
+#pragma acc exit data detach(ry[10])
+}
diff --git a/gcc/testsuite/gfortran.dg/goacc/data-clauses.f95 b/gcc/testsuite/gfortran.dg/goacc/data-clauses.f95
index b94214e..1a4a671 100644
--- a/gcc/testsuite/gfortran.dg/goacc/data-clauses.f95
+++ b/gcc/testsuite/gfortran.dg/goacc/data-clauses.f95
@@ -39,9 +39,9 @@ contains
   !$acc end data
 
 
-  !$acc parallel copy (tip) ! { dg-error "POINTER" }
+  !$acc parallel copy (tip)
   !$acc end parallel
-  !$acc parallel copy (tia) ! { dg-error "ALLOCATABLE" }
+  !$acc parallel copy (tia)
   !$acc end parallel
   !$acc parallel deviceptr (i) copy (i) ! { dg-error "multiple clauses" }
   !$acc end parallel
@@ -54,9 +54,9 @@ contains
   !$acc end data
 
 
-  !$acc parallel copyin (tip) ! { dg-error "POINTER" }
+  !$acc parallel copyin (tip)
   !$acc end parallel
-  !$acc parallel copyin (tia) ! { dg-error "ALLOCATABLE" }
+  !$acc parallel copyin (tia)
   !$acc end parallel
   !$acc parallel deviceptr (i) copyin (i) ! { dg-error "multiple clauses" }
   !$acc end parallel
@@ -71,9 +71,9 @@ contains
   !$acc end data
 
 
-  !$acc parallel copyout (tip) ! { dg-error "POINTER" }
+  !$acc parallel copyout (tip)
   !$acc end parallel
-  !$acc parallel copyout (tia) ! { dg-error "ALLOCATABLE" }
+  !$acc parallel copyout (tia)
   !$acc end parallel
   !$acc parallel deviceptr (i) copyout (i) ! { dg-error "multiple clauses" }
   !$acc end parallel
@@ -90,9 +90,9 @@ contains
   !$acc end data
 
 
-  !$acc parallel create (tip) ! { dg-error "POINTER" }
+  !$acc parallel create (tip)
   !$acc end parallel
-  !$acc parallel create (tia) ! { dg-error "ALLOCATABLE" }
+  !$acc parallel create (tia)
   !$acc end parallel
   !$acc parallel deviceptr (i) create (i) ! { dg-error "multiple clauses" }
   !$acc end parallel
@@ -111,9 +111,9 @@ contains
   !$acc end data
 
 
-  !$acc parallel present (tip) ! { dg-error "POINTER" }
+  !$acc parallel present (tip)
   !$acc end parallel
-  !$acc parallel present (tia) ! { dg-error "ALLOCATABLE" }
+  !$acc parallel present (tia)
   !$acc end parallel
   !$acc parallel deviceptr (i) present (i) ! { dg-error "multiple clauses" }
   !$acc end parallel
@@ -144,9 +144,9 @@ contains
   !$acc end parallel
 
 
-  !$acc parallel present_or_copy (tip) ! { dg-error "POINTER" }
+  !$acc parallel present_or_copy (tip)
   !$acc end parallel
-  !$acc parallel present_or_copy (tia) ! { dg-error "ALLOCATABLE" }
+  !$acc parallel present_or_copy (tia)
   !$acc end parallel
   !$acc parallel deviceptr (i) present_or_copy (i) ! { dg-error "multiple clauses" }
   !$acc end parallel
@@ -169,9 +169,9 @@ contains
   !$acc end data
 
 
-  !$acc parallel present_or_copyin (tip) ! { dg-error "POINTER" }
+  !$acc parallel present_or_copyin (tip)
   !$acc end parallel
-  !$acc parallel present_or_copyin (tia) ! { dg-error "ALLOCATABLE" }
+  !$acc parallel present_or_copyin (tia)
   !$acc end parallel
   !$acc parallel deviceptr (i) present_or_copyin (i) ! { dg-error "multiple clauses" }
   !$acc end parallel
@@ -196,9 +196,9 @@ contains
   !$acc end data
 
 
-  !$acc parallel present_or_copyout (tip) ! { dg-error "POINTER" }
+  !$acc parallel present_or_copyout (tip)
   !$acc end parallel
-  !$acc parallel present_or_copyout (tia) ! { dg-error "ALLOCATABLE" }
+  !$acc parallel present_or_copyout (tia)
   !$acc end parallel
   !$acc parallel deviceptr (i) present_or_copyout (i) ! { dg-error "multiple clauses" }
   !$acc end parallel
@@ -225,9 +225,9 @@ contains
   !$acc end data
 
 
-  !$acc parallel present_or_create (tip) ! { dg-error "POINTER" }
+  !$acc parallel present_or_create (tip)
   !$acc end parallel
-  !$acc parallel present_or_create (tia) ! { dg-error "ALLOCATABLE" }
+  !$acc parallel present_or_create (tia)
   !$acc end parallel
   !$acc parallel deviceptr (i) present_or_create (i) ! { dg-error "multiple clauses" }
   !$acc end parallel
@@ -256,4 +256,4 @@ contains
   !$acc end data
 
   end subroutine foo
-end module test
\ No newline at end of file
+end module test
diff --git a/gcc/testsuite/gfortran.dg/goacc/derived-types.f90 b/gcc/testsuite/gfortran.dg/goacc/derived-types.f90
new file mode 100644
index 0000000..5fb2981
--- /dev/null
+++ b/gcc/testsuite/gfortran.dg/goacc/derived-types.f90
@@ -0,0 +1,77 @@
+! Test ACC UPDATE with derived types.
+
+module dt
+  integer, parameter :: n = 10
+  type inner
+     integer :: d(n)
+  end type inner
+  type dtype
+     integer(8) :: a, b, c(n)
+     type(inner) :: in
+  end type dtype
+end module dt
+
+program derived_acc
+  use dt
+  
+  implicit none
+  type(dtype):: var
+  integer i
+  !$acc declare create(var)
+  !$acc declare pcopy(var%a) ! { dg-error "Syntax error in OpenMP" }
+
+  !$acc update host(var)
+  !$acc update host(var%a)
+  !$acc update device(var)
+  !$acc update device(var%a)
+  !$acc update self(var)
+  !$acc update self(var%a)
+  
+  !$acc enter data copyin(var)
+  !$acc enter data copyin(var%a)
+
+  !$acc exit data copyout(var)
+  !$acc exit data copyout(var%a)
+
+  !$acc data copy(var)
+  !$acc end data
+
+  !$acc data copyout(var%a)
+  !$acc end data
+
+  !$acc parallel loop pcopyout(var)
+  do i = 1, 10
+  end do  
+  !$acc end parallel loop
+
+  !$acc parallel loop copyout(var%a)
+  do i = 1, 10
+  end do
+  !$acc end parallel loop
+
+  !$acc parallel pcopy(var)
+  !$acc end parallel
+
+  !$acc parallel pcopy(var%a)
+  do i = 1, 10
+  end do
+  !$acc end parallel
+  
+  !$acc kernels pcopyin(var)
+  !$acc end kernels
+
+  !$acc kernels pcopy(var%a)
+  do i = 1, 10
+  end do
+  !$acc end kernels
+
+  !$acc kernels loop pcopyin(var)
+  do i = 1, 10
+  end do
+  !$acc end kernels loop
+
+  !$acc kernels loop pcopy(var%a)
+  do i = 1, 10
+  end do
+  !$acc end kernels loop
+end program derived_acc
diff --git a/gcc/testsuite/gfortran.dg/goacc/enter-exit-data.f95 b/gcc/testsuite/gfortran.dg/goacc/enter-exit-data.f95
index 805459c..b616b39 100644
--- a/gcc/testsuite/gfortran.dg/goacc/enter-exit-data.f95
+++ b/gcc/testsuite/gfortran.dg/goacc/enter-exit-data.f95
@@ -44,14 +44,14 @@ contains
   !$acc enter data wait (i, 1) 
   !$acc enter data wait (a) ! { dg-error "INTEGER" }
   !$acc enter data wait (b(5:6)) ! { dg-error "INTEGER" }
-  !$acc enter data copyin (tip) ! { dg-error "POINTER" }
-  !$acc enter data copyin (tia) ! { dg-error "ALLOCATABLE" }
-  !$acc enter data create (tip) ! { dg-error "POINTER" }
-  !$acc enter data create (tia) ! { dg-error "ALLOCATABLE" }
-  !$acc enter data present_or_copyin (tip) ! { dg-error "POINTER" }
-  !$acc enter data present_or_copyin (tia) ! { dg-error "ALLOCATABLE" }
-  !$acc enter data present_or_create (tip) ! { dg-error "POINTER" }
-  !$acc enter data present_or_create (tia) ! { dg-error "ALLOCATABLE" }
+  !$acc enter data copyin (tip)
+  !$acc enter data copyin (tia)
+  !$acc enter data create (tip)
+  !$acc enter data create (tia)
+  !$acc enter data present_or_copyin (tip)
+  !$acc enter data present_or_copyin (tia)
+  !$acc enter data present_or_create (tip)
+  !$acc enter data present_or_create (tia)
   !$acc enter data copyin (i) create (i) ! { dg-error "multiple clauses" }
   !$acc enter data copyin (i) present_or_copyin (i) ! { dg-error "multiple clauses" }
   !$acc enter data create (i) present_or_copyin (i) ! { dg-error "multiple clauses" }
@@ -79,10 +79,10 @@ contains
   !$acc exit data wait (i, 1) 
   !$acc exit data wait (a) ! { dg-error "INTEGER" }
   !$acc exit data wait (b(5:6)) ! { dg-error "INTEGER" }
-  !$acc exit data copyout (tip) ! { dg-error "POINTER" }
-  !$acc exit data copyout (tia) ! { dg-error "ALLOCATABLE" }
-  !$acc exit data delete (tip) ! { dg-error "POINTER" }
-  !$acc exit data delete (tia) ! { dg-error "ALLOCATABLE" }
+  !$acc exit data copyout (tip)
+  !$acc exit data copyout (tia)
+  !$acc exit data delete (tip)
+  !$acc exit data delete (tia)
   !$acc exit data copyout (i) delete (i) ! { dg-error "multiple clauses" }
   !$acc exit data finalize
   !$acc exit data finalize copyout (i)
diff --git a/gcc/tree-pretty-print.c b/gcc/tree-pretty-print.c
index 0861cc9..c648da5 100644
--- a/gcc/tree-pretty-print.c
+++ b/gcc/tree-pretty-print.c
@@ -826,6 +826,15 @@ dump_omp_clause (pretty_printer *pp, tree clause, int spc, dump_flags_t flags)
 	case GOMP_MAP_LINK:
 	  pp_string (pp, "link");
 	  break;
+	case GOMP_MAP_ATTACH:
+	  pp_string (pp, "attach");
+	  break;
+	case GOMP_MAP_DETACH:
+	  pp_string (pp, "detach");
+	  break;
+	case GOMP_MAP_FORCE_DETACH:
+	  pp_string (pp, "force_detach");
+	  break;
 	default:
 	  gcc_unreachable ();
 	}
diff --git a/include/gomp-constants.h b/include/gomp-constants.h
index acd2585..b7ce640 100644
--- a/include/gomp-constants.h
+++ b/include/gomp-constants.h
@@ -42,6 +42,7 @@
 #define GOMP_MAP_FLAG_SPECIAL_2		(1 << 4)
 #define GOMP_MAP_FLAG_SPECIAL		(GOMP_MAP_FLAG_SPECIAL_1 \
 					 | GOMP_MAP_FLAG_SPECIAL_0)
+#define GOMP_MAP_DEEP_COPY		(1 << 5)
 /* Flag to force a specific behavior (or else, trigger a run-time error).  */
 #define GOMP_MAP_FLAG_FORCE		(1 << 7)
 
@@ -128,6 +129,13 @@ enum gomp_map_kind
     /* Decrement usage count and deallocate if zero.  */
     GOMP_MAP_RELEASE =			(GOMP_MAP_FLAG_SPECIAL_2
 					 | GOMP_MAP_DELETE),
+    /* In OpenACC, attach a pointer to a mapped struct field.  */
+    GOMP_MAP_ATTACH =			(GOMP_MAP_DEEP_COPY | 0),
+    /* In OpenACC, detach a pointer to a mapped struct field.  */
+    GOMP_MAP_DETACH =			(GOMP_MAP_DEEP_COPY | 1),
+    /* In OpenACC, detach a pointer to a mapped struct field.  */
+    GOMP_MAP_FORCE_DETACH =		(GOMP_MAP_DEEP_COPY
+					 | GOMP_MAP_FLAG_FORCE | 1),
 
     /* Internal to GCC, not used in libgomp.  */
     /* Do not map, but pointer assign a pointer instead.  */
diff --git a/libgomp/libgomp.h b/libgomp/libgomp.h
index 17aee87..3c1024a 100644
--- a/libgomp/libgomp.h
+++ b/libgomp/libgomp.h
@@ -866,6 +866,8 @@ struct target_var_desc {
   bool copy_from;
   /* True if data always should be copied from device to host at the end.  */
   bool always_copy_from;
+  /* True if variable should be detached at end of region.  */
+  bool do_detach;
   /* Relative offset against key host_start.  */
   uintptr_t offset;
   /* Actual length.  */
@@ -902,6 +904,10 @@ struct target_mem_desc {
    artificial pointer to "omp declare target link" object.  */
 #define REFCOUNT_LINK (~(uintptr_t) 1)
 
+/* A special tag value for "virtual_refcount" in the splay_tree_key_s structure
+   below.  */
+#define VREFCOUNT_LINK_KEY (~(uintptr_t) 0)
+
 /* Special offset values.  */
 #define OFFSET_INLINED (~(uintptr_t) 0)
 #define OFFSET_POINTER (~(uintptr_t) 1)
@@ -918,10 +924,21 @@ struct splay_tree_key_s {
   uintptr_t tgt_offset;
   /* Reference count.  */
   uintptr_t refcount;
-  /* Dynamic reference count.  */
-  uintptr_t dynamic_refcount;
-  /* Pointer to the original mapping of "omp declare target link" object.  */
-  splay_tree_key link_key;
+  /* Reference counts beyond those that represent genuine references in the
+     linked splay tree key/target memory structures, e.g. for multiple OpenACC
+     "present increment" operations (via "acc enter data") refering to the same
+     host-memory block.
+     If set to VREFCOUNT_LINK_KEY (for OpenMP, where this field is not otherwise
+     needed), the union below represents a link key.  */
+  uintptr_t virtual_refcount;
+  union {
+    /* For a block with attached pointers, the attachment counters for each.
+       Only used for OpenACC.  */
+    uintptr_t *attach_count;
+    /* Pointer to the original mapping of "omp declare target link" object.
+       Only used for OpenMP.  */
+    splay_tree_key link_key;
+  } u;
 };
 
 /* The comparison function.  */
@@ -943,13 +960,6 @@ splay_compare (splay_tree_key x, splay_tree_key y)
 
 typedef struct acc_dispatch_t
 {
-  /* This is a linked list of data mapped using the
-     acc_map_data/acc_unmap_data or "acc enter data"/"acc exit data" pragmas.
-     Unlike mapped_data in the goacc_thread struct, unmapping can
-     happen out-of-order with respect to mapping.  */
-  /* This is guarded by the lock in the "outer" struct gomp_device_descr.  */
-  struct target_mem_desc *data_environ;
-
   /* Execute.  */
   __typeof (GOMP_OFFLOAD_openacc_exec) *exec_func;
 
@@ -1056,13 +1066,17 @@ struct gomp_device_descr
 enum gomp_map_vars_kind
 {
   GOMP_MAP_VARS_OPENACC,
+  GOMP_MAP_VARS_OPENACC_ENTER_DATA,
   GOMP_MAP_VARS_TARGET,
   GOMP_MAP_VARS_DATA,
   GOMP_MAP_VARS_ENTER_DATA
 };
 
-extern void gomp_acc_insert_pointer (size_t, void **, size_t *, void *, int);
-extern void gomp_acc_remove_pointer (void *, size_t, bool, int, int, int);
+struct gomp_coalesce_buf;
+
+extern void gomp_acc_remove_pointer (struct gomp_device_descr *, void **,
+				     size_t *, unsigned short *, int, bool,
+				     int);
 extern void gomp_acc_declare_allocate (bool, size_t, void **, size_t *,
 				       unsigned short *);
 struct gomp_coalesce_buf;
@@ -1072,6 +1086,14 @@ extern void gomp_copy_host2dev (struct gomp_device_descr *,
 extern void gomp_copy_dev2host (struct gomp_device_descr *,
 				struct goacc_asyncqueue *, void *, const void *,
 				size_t);
+extern uintptr_t gomp_map_val (struct target_mem_desc *, void **, size_t);
+extern void gomp_attach_pointer (struct gomp_device_descr *,
+				 struct goacc_asyncqueue *, splay_tree,
+				 splay_tree_key, uintptr_t, size_t,
+				 struct gomp_coalesce_buf *);
+extern void gomp_detach_pointer (struct gomp_device_descr *,
+				 struct goacc_asyncqueue *, splay_tree_key,
+				 uintptr_t, bool, struct gomp_coalesce_buf *);
 
 extern struct target_mem_desc *gomp_map_vars (struct gomp_device_descr *,
 					      size_t, void **, void **,
diff --git a/libgomp/libgomp.map b/libgomp/libgomp.map
index ba9218b..a086dd2 100644
--- a/libgomp/libgomp.map
+++ b/libgomp/libgomp.map
@@ -480,6 +480,16 @@ OACC_2.5 {
 	acc_update_self_async_array_h_;
 } OACC_2.0.1;
 
+OACC_2.6 {
+  global:
+	acc_attach;
+	acc_attach_async;
+	acc_detach;
+	acc_detach_async;
+	acc_detach_finalize;
+	acc_detach_finalize_async;
+} OACC_2.5;
+
 GOACC_2.0 {
   global:
 	GOACC_data_end;
diff --git a/libgomp/oacc-async.c b/libgomp/oacc-async.c
index 68aaf19..077e28f 100644
--- a/libgomp/oacc-async.c
+++ b/libgomp/oacc-async.c
@@ -263,6 +263,24 @@ goacc_async_copyout_unmap_vars (struct target_mem_desc *tgt,
 					      (void *) tgt);
 }
 
+/* Remove a variable asynchronously.  This actually removes the variable
+   mapping immediately, but retains the linked target_mem_desc until the
+   asynchronous operation has completed (as it may still refer to target
+   memory).  The device lock must be held before entry, and remains locked on
+   exit.  */
+
+attribute_hidden void
+goacc_remove_var_async (struct gomp_device_descr *devicep, splay_tree_key n,
+                       struct goacc_asyncqueue *aq)
+{
+  struct target_mem_desc *tgt = n->tgt;
+  assert (tgt);
+  tgt->refcount++;
+  gomp_remove_var (devicep, n);
+  devicep->openacc.async.queue_callback_func (aq, goacc_async_unmap_tgt,
+                                             (void *) tgt);
+}
+
 attribute_hidden void
 goacc_async_free (struct gomp_device_descr *devicep,
 		  struct goacc_asyncqueue *aq, void *ptr)
diff --git a/libgomp/oacc-host.c b/libgomp/oacc-host.c
index 53658c8..6ab77a9 100644
--- a/libgomp/oacc-host.c
+++ b/libgomp/oacc-host.c
@@ -262,8 +262,6 @@ static struct gomp_device_descr host_dispatch =
     .state = GOMP_DEVICE_UNINITIALIZED,
 
     .openacc = {
-      .data_environ = NULL,
-
       .exec_func = host_openacc_exec,
 
       .async = {
diff --git a/libgomp/oacc-init.c b/libgomp/oacc-init.c
index 823e20e..f644d15 100644
--- a/libgomp/oacc-init.c
+++ b/libgomp/oacc-init.c
@@ -302,9 +302,13 @@ acc_shutdown_1 (acc_device_t d)
 
       if (walk->dev)
 	{
-	  gomp_mutex_lock (&walk->dev->lock);
-	  gomp_free_memmap (&walk->dev->mem_map);
-	  gomp_mutex_unlock (&walk->dev->lock);
+	  while (walk->dev->mem_map.root)
+	    {
+	      splay_tree_key k = &walk->dev->mem_map.root->key;
+	      if (k->virtual_refcount == VREFCOUNT_LINK_KEY)
+		k->u.link_key = NULL;
+	      gomp_remove_var (walk->dev, k);
+	    }
 
 	  walk->dev = NULL;
 	  walk->base_dev = NULL;
diff --git a/libgomp/oacc-int.h b/libgomp/oacc-int.h
index 3354eb6..2e4045e 100644
--- a/libgomp/oacc-int.h
+++ b/libgomp/oacc-int.h
@@ -102,10 +102,15 @@ void goacc_restore_bind (void);
 void goacc_lazy_initialize (void);
 void goacc_host_init (void);
 
+struct splay_tree_key_s;
+
 void goacc_init_asyncqueues (struct gomp_device_descr *);
 bool goacc_fini_asyncqueues (struct gomp_device_descr *);
 void goacc_async_copyout_unmap_vars (struct target_mem_desc *,
 				     struct goacc_asyncqueue *);
+void goacc_remove_var_async (struct gomp_device_descr *devicep,
+			     struct splay_tree_key_s *n,
+			     struct goacc_asyncqueue *aq);
 void goacc_async_free (struct gomp_device_descr *,
 		       struct goacc_asyncqueue *, void *);
 struct goacc_asyncqueue *get_goacc_asyncqueue (int);
diff --git a/libgomp/oacc-mem.c b/libgomp/oacc-mem.c
index 050eb0d..eace3db 100644
--- a/libgomp/oacc-mem.c
+++ b/libgomp/oacc-mem.c
@@ -52,6 +52,25 @@ lookup_host (struct gomp_device_descr *dev, void *h, size_t s)
   return key;
 }
 
+/* Helper for lookup_dev.  Iterate over splay tree.  */
+
+static splay_tree_key
+lookup_dev_1 (splay_tree_node node, uintptr_t d, size_t s)
+{
+  splay_tree_key k = &node->key;
+  struct target_mem_desc *t = k->tgt;
+
+  if (d >= t->tgt_start && d + s <= t->tgt_end)
+    return k;
+
+  if (node->left)
+    return lookup_dev_1 (node->left, d, s);
+  if (node->right)
+    return lookup_dev_1 (node->right, d, s);
+
+  return NULL;
+}
+
 /* Return block containing [D->S), or NULL if not contained.
    The list isn't ordered by device address, so we have to iterate
    over the whole array.  This is not expected to be a common
@@ -59,35 +78,12 @@ lookup_host (struct gomp_device_descr *dev, void *h, size_t s)
    remains locked on exit.  */
 
 static splay_tree_key
-lookup_dev (struct target_mem_desc *tgt, void *d, size_t s)
+lookup_dev (splay_tree mem_map, void *d, size_t s)
 {
-  int i;
-  struct target_mem_desc *t;
-
-  if (!tgt)
-    return NULL;
-
-  for (t = tgt; t != NULL; t = t->prev)
-    {
-      if (t->tgt_start <= (uintptr_t) d && t->tgt_end >= (uintptr_t) d + s)
-        break;
-    }
-
-  if (!t)
+  if (!mem_map || !mem_map->root)
     return NULL;
 
-  for (i = 0; i < t->list_count; i++)
-    {
-      void * offset;
-
-      splay_tree_key k = &t->array[i].key;
-      offset = d - t->tgt_start + k->tgt_offset;
-
-      if (k->host_start + offset <= (void *) k->host_end)
-        return k;
-    }
-
-  return NULL;
+  return lookup_dev_1 (mem_map->root, (uintptr_t) d, s);
 }
 
 /* OpenACC is silent on how memory exhaustion is indicated.  We return
@@ -136,7 +132,7 @@ acc_free (void *d)
   /* We don't have to call lazy open here, as the ptr value must have
      been returned by acc_malloc.  It's not permitted to pass NULL in
      (unless you got that null from acc_malloc).  */
-  if ((k = lookup_dev (acc_dev->openacc.data_environ, d, 1)))
+  if ((k = lookup_dev (&acc_dev->mem_map, d, 1)))
     {
       void *offset;
 
@@ -260,7 +256,7 @@ acc_hostptr (void *d)
 
   gomp_mutex_lock (&acc_dev->lock);
 
-  n = lookup_dev (acc_dev->openacc.data_environ, d, 1);
+  n = lookup_dev (&acc_dev->mem_map, d, 1);
 
   if (!n)
     {
@@ -348,7 +344,7 @@ acc_map_data (void *h, void *d, size_t s)
 		      (int)s);
 	}
 
-      if (lookup_dev (thr->dev->openacc.data_environ, d, s))
+      if (lookup_dev (&thr->dev->mem_map, d, s))
         {
 	  gomp_mutex_unlock (&acc_dev->lock);
 	  gomp_fatal ("device address [%p, +%d] is already mapped", (void *)d,
@@ -361,11 +357,6 @@ acc_map_data (void *h, void *d, size_t s)
 			   &kinds, true, GOMP_MAP_VARS_OPENACC);
       tgt->list[0].key->refcount = REFCOUNT_INFINITY;
     }
-
-  gomp_mutex_lock (&acc_dev->lock);
-  tgt->prev = acc_dev->openacc.data_environ;
-  acc_dev->openacc.data_environ = tgt;
-  gomp_mutex_unlock (&acc_dev->lock);
 }
 
 void
@@ -373,6 +364,7 @@ acc_unmap_data (void *h)
 {
   struct goacc_thread *thr = goacc_thread ();
   struct gomp_device_descr *acc_dev = thr->dev;
+  struct splay_tree_key_s cur_node;
 
   /* No need to call lazy open, as the address must have been mapped.  */
 
@@ -380,12 +372,11 @@ acc_unmap_data (void *h)
   if (acc_dev->capabilities & GOMP_OFFLOAD_CAP_SHARED_MEM)
     return;
 
-  size_t host_size;
-
   gomp_mutex_lock (&acc_dev->lock);
 
-  splay_tree_key n = lookup_host (acc_dev, h, 1);
-  struct target_mem_desc *t;
+  cur_node.host_start = (uintptr_t) h;
+  cur_node.host_end = cur_node.host_start + 1;
+  splay_tree_key n = splay_tree_lookup (&acc_dev->mem_map, &cur_node);
 
   if (!n)
     {
@@ -393,46 +384,27 @@ acc_unmap_data (void *h)
       gomp_fatal ("%p is not a mapped block", (void *)h);
     }
 
-  host_size = n->host_end - n->host_start;
-
   if (n->host_start != (uintptr_t) h)
     {
+      size_t host_size = n->host_end - n->host_start;
       gomp_mutex_unlock (&acc_dev->lock);
       gomp_fatal ("[%p,%d] surrounds %p",
 		  (void *) n->host_start, (int) host_size, (void *) h);
     }
 
-  /* Mark for removal.  */
-  n->refcount = 1;
+  splay_tree_remove (&acc_dev->mem_map, n);
 
-  t = n->tgt;
+  struct target_mem_desc *tgt = n->tgt;
 
-  if (t->refcount == 2)
+  if (tgt->refcount > 0)
+    tgt->refcount--;
+  else
     {
-      struct target_mem_desc *tp;
-
-      /* This is the last reference, so pull the descriptor off the
-         chain. This avoids gomp_unmap_vars via gomp_unmap_tgt from
-         freeing the device memory. */
-      t->tgt_end = 0;
-      t->to_free = 0;
-
-      for (tp = NULL, t = acc_dev->openacc.data_environ; t != NULL;
-	   tp = t, t = t->prev)
-	if (n->tgt == t)
-	  {
-	    if (tp)
-	      tp->prev = t->prev;
-	    else
-	      acc_dev->openacc.data_environ = t->prev;
-
-	    break;
-	  }
+      free (tgt->array);
+      free (tgt);
     }
 
   gomp_mutex_unlock (&acc_dev->lock);
-
-  gomp_unmap_vars (t, true);
 }
 
 #define FLAG_PRESENT (1 << 0)
@@ -476,11 +448,14 @@ present_create_copy (unsigned f, void *h, size_t s, int async)
 	  gomp_fatal ("[%p,+%d] not mapped", (void *)h, (int)s);
 	}
 
+      assert (n->virtual_refcount != VREFCOUNT_LINK_KEY);
+
       if (n->refcount != REFCOUNT_INFINITY)
 	{
 	  n->refcount++;
-	  n->dynamic_refcount++;
+	  n->virtual_refcount++;
 	}
+
       gomp_mutex_unlock (&acc_dev->lock);
     }
   else if (!(f & FLAG_CREATE))
@@ -490,7 +465,6 @@ present_create_copy (unsigned f, void *h, size_t s, int async)
     }
   else
     {
-      struct target_mem_desc *tgt;
       size_t mapnum = 1;
       unsigned short kinds;
       void *hostaddrs = h;
@@ -504,18 +478,14 @@ present_create_copy (unsigned f, void *h, size_t s, int async)
 
       goacc_aq aq = get_goacc_asyncqueue (async);
 
-      tgt = gomp_map_vars_async (acc_dev, aq, mapnum, &hostaddrs, NULL, &s,
-				 &kinds, true, GOMP_MAP_VARS_OPENACC);
-
-      /* Initialize dynamic refcount.  */
-      tgt->list[0].key->dynamic_refcount = 1;
+      gomp_map_vars_async (acc_dev, aq, mapnum, &hostaddrs, NULL, &s, &kinds,
+			   true, GOMP_MAP_VARS_OPENACC_ENTER_DATA);
 
       gomp_mutex_lock (&acc_dev->lock);
-
-      d = tgt->to_free;
-      tgt->prev = acc_dev->openacc.data_environ;
-      acc_dev->openacc.data_environ = tgt;
-
+      n = lookup_host (acc_dev, h, s);
+      assert (n != NULL);
+      d = (void *) (n->tgt->tgt_start + n->tgt_offset + (uintptr_t) h
+		    - n->host_start);
       gomp_mutex_unlock (&acc_dev->lock);
     }
 
@@ -590,7 +560,6 @@ delete_copyout (unsigned f, void *h, size_t s, int async, const char *libfnname)
 {
   size_t host_size;
   splay_tree_key n;
-  void *d;
   struct goacc_thread *thr = goacc_thread ();
   struct gomp_device_descr *acc_dev = thr->dev;
 
@@ -610,8 +579,7 @@ delete_copyout (unsigned f, void *h, size_t s, int async, const char *libfnname)
       gomp_fatal ("[%p,%d] is not mapped", (void *)h, (int)s);
     }
 
-  d = (void *) (n->tgt->tgt_start + n->tgt_offset
-		+ (uintptr_t) h - n->host_start);
+  assert (n->virtual_refcount != VREFCOUNT_LINK_KEY);
 
   host_size = n->host_end - n->host_start;
 
@@ -625,48 +593,37 @@ delete_copyout (unsigned f, void *h, size_t s, int async, const char *libfnname)
   if (n->refcount == REFCOUNT_INFINITY)
     {
       n->refcount = 0;
-      n->dynamic_refcount = 0;
-    }
-  if (n->refcount < n->dynamic_refcount)
-    {
-      gomp_mutex_unlock (&acc_dev->lock);
-      gomp_fatal ("Dynamic reference counting assert fail\n");
+      n->virtual_refcount = 0;
     }
 
   if (f & FLAG_FINALIZE)
     {
-      n->refcount -= n->dynamic_refcount;
-      n->dynamic_refcount = 0;
+      n->refcount -= n->virtual_refcount;
+      n->virtual_refcount = 0;
     }
-  else if (n->dynamic_refcount)
+
+  if (n->virtual_refcount > 0)
     {
-      n->dynamic_refcount--;
       n->refcount--;
+      n->virtual_refcount--;
     }
+  else if (n->refcount > 0)
+    n->refcount--;
 
   if (n->refcount == 0)
     {
-      if (n->tgt->refcount == 2)
-	{
-	  struct target_mem_desc *tp, *t;
-	  for (tp = NULL, t = acc_dev->openacc.data_environ; t != NULL;
-	       tp = t, t = t->prev)
-	    if (n->tgt == t)
-	      {
-		if (tp)
-		  tp->prev = t->prev;
-		else
-		  acc_dev->openacc.data_environ = t->prev;
-		break;
-	      }
-	}
+      goacc_aq aq = get_goacc_asyncqueue (async);
 
       if (f & FLAG_COPYOUT)
-	{
-	  goacc_aq aq = get_goacc_asyncqueue (async);
+        {
+	  void *d = (void *) (n->tgt->tgt_start + n->tgt_offset
+			      + (uintptr_t) h - n->host_start);
 	  gomp_copy_dev2host (acc_dev, aq, h, d, s);
 	}
-      gomp_remove_var (acc_dev, n);
+      if (aq)
+	goacc_remove_var_async (acc_dev, n, aq);
+      else
+	gomp_remove_var (acc_dev, n);
     }
 
   gomp_mutex_unlock (&acc_dev->lock);
@@ -783,140 +740,161 @@ acc_update_self_async (void *h, size_t s, int async)
 }
 
 void
-gomp_acc_insert_pointer (size_t mapnum, void **hostaddrs, size_t *sizes,
-			 void *kinds, int async)
+gomp_acc_remove_pointer (struct gomp_device_descr *acc_dev, void **hostaddrs,
+			 size_t *sizes, unsigned short *kinds, int async,
+			 bool finalize, int mapnum)
 {
-  struct target_mem_desc *tgt;
-  struct goacc_thread *thr = goacc_thread ();
-  struct gomp_device_descr *acc_dev = thr->dev;
+  struct splay_tree_key_s cur_node;
+  splay_tree_key n;
+
+  gomp_mutex_lock (&acc_dev->lock);
 
-  if (acc_is_present (*hostaddrs, *sizes))
+  for (int i = 0; i < mapnum; i++)
     {
-      splay_tree_key n;
-      gomp_mutex_lock (&acc_dev->lock);
-      n = lookup_host (acc_dev, *hostaddrs, *sizes);
-      gomp_mutex_unlock (&acc_dev->lock);
+      int kind = kinds[i] & 0xff;
+      bool copyfrom = false;
 
-      tgt = n->tgt;
-      for (size_t i = 0; i < tgt->list_count; i++)
-	if (tgt->list[i].key == n)
-	  {
-	    for (size_t j = 0; j < mapnum; j++)
-	      if (i + j < tgt->list_count && tgt->list[i + j].key)
-		{
-		  tgt->list[i + j].key->refcount++;
-		  tgt->list[i + j].key->dynamic_refcount++;
-		}
-	    return;
-	  }
-      /* Should not reach here.  */
-      gomp_fatal ("Dynamic refcount incrementing failed for pointer/pset");
-    }
+      switch (kind)
+        {
+	case GOMP_MAP_FROM:
+	case GOMP_MAP_FORCE_FROM:
+	case GOMP_MAP_ALWAYS_FROM:
+	  copyfrom = true;
+	  /* Fallthrough.  */
+
+	case GOMP_MAP_TO_PSET:
+	case GOMP_MAP_POINTER:
+	case GOMP_MAP_DELETE:
+	case GOMP_MAP_RELEASE:
+	case GOMP_MAP_DETACH:
+	case GOMP_MAP_FORCE_DETACH:
+	  cur_node.host_start = (uintptr_t) hostaddrs[i];
+	  cur_node.host_end = cur_node.host_start
+			      + ((kind == GOMP_MAP_DETACH
+				  || kind == GOMP_MAP_FORCE_DETACH
+				  || kind == GOMP_MAP_POINTER)
+				 ? sizeof (void *) : sizes[i]);
+	  n = splay_tree_lookup (&acc_dev->mem_map, &cur_node);
+
+	  if (n == NULL)
+	    continue;
+
+	  assert (n->virtual_refcount != VREFCOUNT_LINK_KEY);
+
+	  if (n->refcount == REFCOUNT_INFINITY)
+	    {
+	      n->refcount = 1;
+	      n->virtual_refcount = 0;
+	    }
 
-  gomp_debug (0, "  %s: prepare mappings\n", __FUNCTION__);
-  goacc_aq aq = get_goacc_asyncqueue (async);
-  tgt = gomp_map_vars_async (acc_dev, aq, mapnum, hostaddrs,
-			     NULL, sizes, kinds, true, GOMP_MAP_VARS_OPENACC);
-  gomp_debug (0, "  %s: mappings prepared\n", __FUNCTION__);
+	  if (finalize)
+	    {
+	      n->refcount -= n->virtual_refcount;
+	      n->virtual_refcount = 0;
+	    }
 
-  /* Initialize dynamic refcount.  */
-  tgt->list[0].key->dynamic_refcount = 1;
+	  if (n->virtual_refcount > 0)
+	    {
+	      n->refcount--;
+	      n->virtual_refcount--;
+	    }
+	  else if (n->refcount > 0)
+	    n->refcount--;
+
+	  if (copyfrom)
+	    gomp_copy_dev2host (acc_dev, NULL, (void *) cur_node.host_start,
+				(void *) (n->tgt->tgt_start + n->tgt_offset
+					  + cur_node.host_start
+					  - n->host_start),
+				cur_node.host_end - cur_node.host_start);
+
+	  if (n->refcount == 0)
+	    gomp_remove_var (acc_dev, n);
+	  break;
+
+	default:
+	  gomp_mutex_unlock (&acc_dev->lock);
+	  gomp_fatal ("gomp_acc_remove_pointer unhandled kind 0x%.2x",
+		      kind);
+	}
+    }
 
-  gomp_mutex_lock (&acc_dev->lock);
-  tgt->prev = acc_dev->openacc.data_environ;
-  acc_dev->openacc.data_environ = tgt;
   gomp_mutex_unlock (&acc_dev->lock);
 }
 
+
 void
-gomp_acc_remove_pointer (void *h, size_t s, bool force_copyfrom, int async,
-			 int finalize, int mapnum)
+acc_attach_async (void **hostaddr, int async)
 {
   struct goacc_thread *thr = goacc_thread ();
   struct gomp_device_descr *acc_dev = thr->dev;
+  goacc_aq aq = get_goacc_asyncqueue (async);
+
+  struct splay_tree_key_s cur_node;
   splay_tree_key n;
-  struct target_mem_desc *t;
-  int minrefs = (mapnum == 1) ? 2 : 3;
 
-  if (!acc_is_present (h, s))
+  if (thr->dev->capabilities & GOMP_OFFLOAD_CAP_SHARED_MEM)
     return;
 
-  gomp_mutex_lock (&acc_dev->lock);
+  cur_node.host_start = (uintptr_t) hostaddr;
+  cur_node.host_end = cur_node.host_start + sizeof (void *);
+  n = splay_tree_lookup (&acc_dev->mem_map, &cur_node);
 
-  n = lookup_host (acc_dev, h, 1);
+  if (n == NULL)
+    gomp_fatal ("struct not mapped for acc_attach");
 
-  if (!n)
-    {
-      gomp_mutex_unlock (&acc_dev->lock);
-      gomp_fatal ("%p is not a mapped block", (void *)h);
-    }
+  gomp_attach_pointer (acc_dev, aq, &acc_dev->mem_map, n, (uintptr_t) hostaddr,
+		       0, NULL);
+}
 
-  gomp_debug (0, "  %s: restore mappings\n", __FUNCTION__);
+void
+acc_attach (void **hostaddr)
+{
+  acc_attach_async (hostaddr, acc_async_sync);
+}
 
-  t = n->tgt;
+static void
+goacc_detach_internal (void **hostaddr, int async, bool finalize)
+{
+  struct goacc_thread *thr = goacc_thread ();
+  struct gomp_device_descr *acc_dev = thr->dev;
+  struct splay_tree_key_s cur_node;
+  splay_tree_key n;
+  struct goacc_asyncqueue *aq = get_goacc_asyncqueue (async);
 
-  if (n->refcount < n->dynamic_refcount)
-    {
-      gomp_mutex_unlock (&acc_dev->lock);
-      gomp_fatal ("Dynamic reference counting assert fail\n");
-    }
+  if (thr->dev->capabilities & GOMP_OFFLOAD_CAP_SHARED_MEM)
+    return;
 
-  if (finalize)
-    {
-      n->refcount -= n->dynamic_refcount;
-      n->dynamic_refcount = 0;
-    }
-  else if (n->dynamic_refcount)
-    {
-      n->dynamic_refcount--;
-      n->refcount--;
-    }
+  cur_node.host_start = (uintptr_t) hostaddr;
+  cur_node.host_end = cur_node.host_start + sizeof (void *);
+  n = splay_tree_lookup (&acc_dev->mem_map, &cur_node);
 
-  gomp_mutex_unlock (&acc_dev->lock);
+  if (n == NULL)
+    gomp_fatal ("struct not mapped for acc_detach");
 
-  if (n->refcount == 0)
-    {
-      if (t->refcount == minrefs)
-	{
-	  /* This is the last reference, so pull the descriptor off the
-	     chain. This prevents gomp_unmap_vars via gomp_unmap_tgt from
-	     freeing the device memory. */
-	  struct target_mem_desc *tp;
-	  for (tp = NULL, t = acc_dev->openacc.data_environ; t != NULL;
-	       tp = t, t = t->prev)
-	    {
-	      if (n->tgt == t)
-		{
-		  if (tp)
-		    tp->prev = t->prev;
-		  else
-		    acc_dev->openacc.data_environ = t->prev;
-		  break;
-		}
-	    }
-	}
+  gomp_detach_pointer (acc_dev, aq, n, (uintptr_t) hostaddr, finalize, NULL);
+}
 
-      /* Set refcount to 1 to allow gomp_unmap_vars to unmap it.  */
-      n->refcount = 1;
-      t->refcount = minrefs;
-      for (size_t i = 0; i < t->list_count; i++)
-	if (t->list[i].key == n)
-	  {
-	    t->list[i].copy_from = force_copyfrom ? 1 : 0;
-	    break;
-	  }
-
-      /* If running synchronously, unmap immediately.  */
-      if (async < acc_async_noval)
-	gomp_unmap_vars (t, true);
-      else
-	{
-	  goacc_aq aq = get_goacc_asyncqueue (async);
-	  goacc_async_copyout_unmap_vars (t, aq);
-	}
-    }
+void
+acc_detach (void **hostaddr)
+{
+  goacc_detach_internal (hostaddr, acc_async_sync, false);
+}
 
-  gomp_mutex_unlock (&acc_dev->lock);
+void
+acc_detach_async (void **hostaddr, int async)
+{
+  goacc_detach_internal (hostaddr, async, false);
+}
 
-  gomp_debug (0, "  %s: mappings restored\n", __FUNCTION__);
+void
+acc_detach_finalize (void **hostaddr)
+{
+  goacc_detach_internal (hostaddr, acc_async_sync, true);
+}
+
+void
+acc_detach_finalize_async (void **hostaddr, int async)
+{
+  goacc_detach_internal (hostaddr, async, true);
 }
diff --git a/libgomp/oacc-parallel.c b/libgomp/oacc-parallel.c
index 251154e..08f54cf 100644
--- a/libgomp/oacc-parallel.c
+++ b/libgomp/oacc-parallel.c
@@ -47,12 +47,32 @@ find_pointer (int pos, size_t mapnum, unsigned short *kinds)
   if (pos + 1 >= mapnum)
     return 0;
 
-  unsigned char kind = kinds[pos+1] & 0xff;
+  unsigned char kind0 = kinds[pos] & 0xff;
 
-  if (kind == GOMP_MAP_TO_PSET)
-    return 3;
-  else if (kind == GOMP_MAP_POINTER)
-    return 2;
+  switch (kind0)
+    {
+    case GOMP_MAP_TO:
+    case GOMP_MAP_FORCE_TO:
+    case GOMP_MAP_FROM:
+    case GOMP_MAP_FORCE_FROM:
+    case GOMP_MAP_TOFROM:
+    case GOMP_MAP_FORCE_TOFROM:
+    case GOMP_MAP_ALLOC:
+    case GOMP_MAP_RELEASE:
+      {
+	unsigned char kind1 = kinds[pos + 1] & 0xff;
+	if (kind1 == GOMP_MAP_POINTER
+	    || kind1 == GOMP_MAP_ALWAYS_POINTER
+	    || kind1 == GOMP_MAP_ATTACH
+	    || kind1 == GOMP_MAP_DETACH
+	    || kind1 == GOMP_MAP_FORCE_DETACH)
+	  return 2;
+	else if (kind1 == GOMP_MAP_TO_PSET)
+	  return 3;
+      }
+    default:
+      /* empty.  */;
+    }
 
   return 0;
 }
@@ -235,9 +255,8 @@ GOACC_parallel_keyed (int device, void (*fn) (void *),
 
   devaddrs = gomp_alloca (sizeof (void *) * mapnum);
   for (i = 0; i < mapnum; i++)
-    devaddrs[i] = (void *) (tgt->list[i].key->tgt->tgt_start
-			    + tgt->list[i].key->tgt_offset
-			    + tgt->list[i].offset);
+    devaddrs[i] = (void *) gomp_map_val (tgt, hostaddrs, i);
+
   if (aq == NULL)
     {
       acc_dev->openacc.exec_func (tgt_fn, mapnum, hostaddrs, devaddrs,
@@ -356,6 +375,10 @@ GOACC_enter_exit_data (int device, size_t mapnum,
   if (mapnum > 0)
     {
       unsigned char kind = kinds[0] & 0xff;
+
+      if (kind == GOMP_MAP_STRUCT || kind == GOMP_MAP_FORCE_PRESENT)
+        kind = kinds[1] & 0xff;
+
       if (kind == GOMP_MAP_DELETE
 	  || kind == GOMP_MAP_FORCE_FROM)
 	finalize = true;
@@ -366,11 +389,14 @@ GOACC_enter_exit_data (int device, size_t mapnum,
     {
       unsigned char kind = kinds[i] & 0xff;
 
-      if (kind == GOMP_MAP_POINTER || kind == GOMP_MAP_TO_PSET)
+      if (kind == GOMP_MAP_POINTER
+	  || kind == GOMP_MAP_TO_PSET
+	  || kind == GOMP_MAP_STRUCT
+	  || kind == GOMP_MAP_FORCE_PRESENT)
 	continue;
 
       if (kind == GOMP_MAP_FORCE_ALLOC
-	  || kind == GOMP_MAP_FORCE_PRESENT
+	  || kind == GOMP_MAP_ATTACH
 	  || kind == GOMP_MAP_FORCE_TO
 	  || kind == GOMP_MAP_TO
 	  || kind == GOMP_MAP_ALLOC)
@@ -381,6 +407,8 @@ GOACC_enter_exit_data (int device, size_t mapnum,
 
       if (kind == GOMP_MAP_RELEASE
 	  || kind == GOMP_MAP_DELETE
+	  || kind == GOMP_MAP_DETACH
+	  || kind == GOMP_MAP_FORCE_DETACH
 	  || kind == GOMP_MAP_FROM
 	  || kind == GOMP_MAP_FORCE_FROM)
 	break;
@@ -414,6 +442,9 @@ GOACC_enter_exit_data (int device, size_t mapnum,
 		case GOMP_MAP_ALLOC:
 		  acc_present_or_create (hostaddrs[i], sizes[i]);
 		  break;
+		case GOMP_MAP_ATTACH:
+		case GOMP_MAP_FORCE_PRESENT:
+		  break;
 		case GOMP_MAP_FORCE_ALLOC:
 		  acc_create (hostaddrs[i], sizes[i]);
 		  break;
@@ -423,6 +454,16 @@ GOACC_enter_exit_data (int device, size_t mapnum,
 		case GOMP_MAP_FORCE_TO:
 		  acc_copyin (hostaddrs[i], sizes[i]);
 		  break;
+		case GOMP_MAP_STRUCT:
+		  {
+		    int elems = sizes[i];
+		    goacc_aq aq = get_goacc_asyncqueue (async);
+		    gomp_map_vars_async (acc_dev, aq, elems + 1, &hostaddrs[i],
+					 NULL, &sizes[i], &kinds[i], true,
+					 GOMP_MAP_VARS_OPENACC_ENTER_DATA);
+		    i += elems;
+		  }
+		  break;
 		default:
 		  gomp_fatal (">>>> GOACC_enter_exit_data UNHANDLED kind 0x%.2x",
 			      kind);
@@ -431,8 +472,14 @@ GOACC_enter_exit_data (int device, size_t mapnum,
 	    }
 	  else
 	    {
-	      gomp_acc_insert_pointer (pointer, &hostaddrs[i],
-				       &sizes[i], &kinds[i], async);
+	      goacc_aq aq = get_goacc_asyncqueue (async);
+	      for (int j = 0; j < 2; j++)
+		gomp_map_vars_async (acc_dev, aq,
+				     (j == 0 || pointer == 2) ? 1 : 2,
+				     &hostaddrs[i + j], NULL,
+				     &sizes[i + j], &kinds[i + j], true,
+				     GOMP_MAP_VARS_OPENACC_ENTER_DATA);
+
 	      /* Increment 'i' by two because OpenACC requires fortran
 		 arrays to be contiguous, so each PSET is associated with
 		 one of MAP_FORCE_ALLOC/MAP_FORCE_PRESET/MAP_FORCE_TO, and
@@ -440,51 +487,145 @@ GOACC_enter_exit_data (int device, size_t mapnum,
 	      i += pointer - 1;
 	    }
 	}
+
+      /* This loop only handles explicit "attach" clauses that are not an
+	 implicit part of a copy{,in,out}, etc. mapping.  */
+      for (i = 0; i < mapnum; i++)
+        {
+	  unsigned char kind = kinds[i] & 0xff;
+
+	  /* Scan for pointers and PSETs.  */
+	  int pointer = find_pointer (i, mapnum, kinds);
+
+	  if (!pointer)
+	    {
+	      if (kind == GOMP_MAP_ATTACH)
+		acc_attach (hostaddrs[i]);
+	      else if (kind == GOMP_MAP_STRUCT)
+	        i += sizes[i];
+	    }
+	  else
+	    i += pointer - 1;
+	}
     }
   else
-    for (i = 0; i < mapnum; ++i)
-      {
-	unsigned char kind = kinds[i] & 0xff;
+    {
+      /* Handle "detach" before copyback/deletion of mapped data.  */
+      for (i = 0; i < mapnum; i++)
+        {
+	  unsigned char kind = kinds[i] & 0xff;
 
-	int pointer = find_pointer (i, mapnum, kinds);
+	  int pointer = find_pointer (i, mapnum, kinds);
 
-	if (!pointer)
-	  {
-	    switch (kind)
-	      {
-	      case GOMP_MAP_RELEASE:
-	      case GOMP_MAP_DELETE:
-		if (acc_is_present (hostaddrs[i], sizes[i]))
+	  if (!pointer)
+	    {
+	      if (kind == GOMP_MAP_DETACH)
+		acc_detach (hostaddrs[i]);
+	      else if (kind == GOMP_MAP_FORCE_DETACH)
+		acc_detach_finalize (hostaddrs[i]);
+	      else if (kind == GOMP_MAP_STRUCT)
+	        i += sizes[i];
+	    }
+	  else
+	    {
+	      unsigned char kind2 = kinds[i + pointer - 1] & 0xff;
+
+	      if (kind2 == GOMP_MAP_DETACH)
+		acc_detach (hostaddrs[i + pointer - 1]);
+	      else if (kind2 == GOMP_MAP_FORCE_DETACH)
+	        acc_detach_finalize (hostaddrs[i + pointer - 1]);
+
+	      i += pointer - 1;
+	    }
+	}
+
+      for (i = 0; i < mapnum; ++i)
+	{
+	  unsigned char kind = kinds[i] & 0xff;
+
+	  int pointer = find_pointer (i, mapnum, kinds);
+
+	  if (!pointer)
+	    {
+	      switch (kind)
+		{
+		case GOMP_MAP_RELEASE:
+		case GOMP_MAP_DELETE:
+		  if (acc_is_present (hostaddrs[i], sizes[i]))
+		    {
+		      if (finalize)
+			acc_delete_finalize_async (hostaddrs[i], sizes[i],
+						   async);
+		      else
+			acc_delete_async (hostaddrs[i], sizes[i], async);
+		    }
+		  break;
+		case GOMP_MAP_DETACH:
+		case GOMP_MAP_FORCE_DETACH:
+		case GOMP_MAP_FORCE_PRESENT:
+		  break;
+		case GOMP_MAP_FROM:
+		case GOMP_MAP_FORCE_FROM:
+		  if (finalize)
+		    acc_copyout_finalize_async (hostaddrs[i], sizes[i], async);
+		  else
+		    acc_copyout_async (hostaddrs[i], sizes[i], async);
+		  break;
+		case GOMP_MAP_STRUCT:
 		  {
-		    if (finalize)
-		      acc_delete_finalize_async (hostaddrs[i], sizes[i], async);
-		    else
-		      acc_delete_async (hostaddrs[i], sizes[i], async);
+		    int elems = sizes[i];
+		    goacc_aq aq = get_goacc_asyncqueue (async);
+		    for (int j = 1; j <= elems; j++)
+		      {
+			struct splay_tree_key_s k;
+			k.host_start = (uintptr_t) hostaddrs[i + j];
+			k.host_end = k.host_start + sizes[i + j];
+			splay_tree_key str;
+			gomp_mutex_lock (&acc_dev->lock);
+			str = splay_tree_lookup (&acc_dev->mem_map, &k);
+			gomp_mutex_unlock (&acc_dev->lock);
+			if (str)
+		          {
+			    assert (str->virtual_refcount
+				    != VREFCOUNT_LINK_KEY);
+			    if (finalize)
+			      {
+				str->refcount -= str->virtual_refcount;
+				str->virtual_refcount = 0;
+			      }
+			    if (str->virtual_refcount > 0)
+			      {
+				str->refcount--;
+				str->virtual_refcount--;
+			      }
+			    else if (str->refcount > 0)
+			      str->refcount--;
+			    if (str->refcount == 0)
+			      {
+				if (aq)
+				  goacc_remove_var_async (acc_dev, str, aq);
+				else
+				  gomp_remove_var (acc_dev, str);
+			      }
+			  }
+		      }
+		    i += elems;
 		  }
-		break;
-	      case GOMP_MAP_FROM:
-	      case GOMP_MAP_FORCE_FROM:
-		if (finalize)
-		  acc_copyout_finalize_async (hostaddrs[i], sizes[i], async);
-		else
-		  acc_copyout_async (hostaddrs[i], sizes[i], async);
-		break;
-	      default:
-		gomp_fatal (">>>> GOACC_enter_exit_data UNHANDLED kind 0x%.2x",
-			    kind);
-		break;
-	      }
-	  }
-	else
-	  {
-	    bool copyfrom = (kind == GOMP_MAP_FORCE_FROM
-			     || kind == GOMP_MAP_FROM);
-	    gomp_acc_remove_pointer (hostaddrs[i], sizes[i], copyfrom, async,
-				     finalize, pointer);
-	    /* See the above comment.  */
-	    i += pointer - 1;
-	  }
-      }
+		  break;
+		default:
+		  gomp_fatal (">>>> GOACC_enter_exit_data UNHANDLED kind 0x%.2x",
+			      kind);
+		  break;
+		}
+	    }
+	  else
+	    {
+	      gomp_acc_remove_pointer (acc_dev, &hostaddrs[i], &sizes[i],
+				       &kinds[i], async, finalize, pointer);
+	      i += pointer - 1;
+	    }
+	}
+    }
 }
 
 static void
diff --git a/libgomp/openacc.h b/libgomp/openacc.h
index 2505ac0..1bf2d65 100644
--- a/libgomp/openacc.h
+++ b/libgomp/openacc.h
@@ -113,12 +113,18 @@ void *acc_hostptr (void *) __GOACC_NOTHROW;
 int acc_is_present (void *, size_t) __GOACC_NOTHROW;
 void acc_memcpy_to_device (void *, void *, size_t) __GOACC_NOTHROW;
 void acc_memcpy_from_device (void *, void *, size_t) __GOACC_NOTHROW;
+void acc_attach (void **) __GOACC_NOTHROW;
+void acc_attach_async (void **, int) __GOACC_NOTHROW;
+void acc_detach (void **) __GOACC_NOTHROW;
+void acc_detach_async (void **, int) __GOACC_NOTHROW;
 
 /* Finalize versions of copyout/delete functions, specified in OpenACC 2.5.  */
 void acc_copyout_finalize (void *, size_t) __GOACC_NOTHROW;
 void acc_copyout_finalize_async (void *, size_t, int) __GOACC_NOTHROW;
 void acc_delete_finalize (void *, size_t) __GOACC_NOTHROW;
 void acc_delete_finalize_async (void *, size_t, int) __GOACC_NOTHROW;
+void acc_detach_finalize (void **) __GOACC_NOTHROW;
+void acc_detach_finalize_async (void **, int) __GOACC_NOTHROW;
 
 /* Async functions, specified in OpenACC 2.5.  */
 void acc_copyin_async (void *, size_t, int) __GOACC_NOTHROW;
diff --git a/libgomp/target.c b/libgomp/target.c
index 2bfc7e2..464e529 100644
--- a/libgomp/target.c
+++ b/libgomp/target.c
@@ -39,6 +39,9 @@
 #include <string.h>
 #include <assert.h>
 #include <errno.h>
+#ifdef RC_CHECKING
+#include <stdio.h>
+#endif
 
 #ifdef PLUGIN_SUPPORT
 #include <dlfcn.h>
@@ -372,6 +375,7 @@ gomp_map_vars_existing (struct gomp_device_descr *devicep,
   tgt_var->key = oldn;
   tgt_var->copy_from = GOMP_MAP_COPY_FROM_P (kind);
   tgt_var->always_copy_from = GOMP_MAP_ALWAYS_FROM_P (kind);
+  tgt_var->do_detach = kind == GOMP_MAP_ATTACH;
   tgt_var->offset = newn->host_start - oldn->host_start;
   tgt_var->length = newn->host_end - newn->host_start;
 
@@ -505,7 +509,136 @@ gomp_map_fields_existing (struct target_mem_desc *tgt,
 	      (void *) cur_node.host_end);
 }
 
-static inline uintptr_t
+void
+gomp_attach_pointer (struct gomp_device_descr *devicep,
+		     struct goacc_asyncqueue *aq, splay_tree mem_map,
+		     splay_tree_key n, uintptr_t attach_to, size_t bias,
+		     struct gomp_coalesce_buf *cbufp)
+{
+  struct splay_tree_key_s s;
+  size_t size, idx;
+
+  if (n == NULL)
+    {
+      gomp_mutex_unlock (&devicep->lock);
+      gomp_fatal ("enclosing struct not mapped for attach");
+    }
+
+  size = (n->host_end - n->host_start + sizeof (void *) - 1) / sizeof (void *);
+  /* We might have a pointer in a packed struct: however we cannot have more
+     than one such pointer in each pointer-sized portion of the struct, so
+     this is safe.  */
+  idx = (attach_to - n->host_start) / sizeof (void *);
+
+  assert (n->virtual_refcount != VREFCOUNT_LINK_KEY);
+
+  if (!n->u.attach_count)
+    n->u.attach_count
+      = gomp_malloc_cleared (sizeof (*n->u.attach_count) * size);
+
+  if (n->u.attach_count[idx] < UINTPTR_MAX)
+    n->u.attach_count[idx]++;
+  else
+    {
+      gomp_mutex_unlock (&devicep->lock);
+      gomp_fatal ("attach count overflow");
+    }
+
+  if (n->u.attach_count[idx] == 1)
+    {
+      uintptr_t devptr = n->tgt->tgt_start + n->tgt_offset + attach_to
+			 - n->host_start;
+      uintptr_t target = (uintptr_t) *(void **) attach_to;
+      splay_tree_key tn;
+      uintptr_t data;
+
+      if ((void *) target == NULL)
+	{
+	  gomp_mutex_unlock (&devicep->lock);
+	  gomp_fatal ("attempt to attach null pointer");
+	}
+
+      s.host_start = target + bias;
+      s.host_end = s.host_start + 1;
+      tn = splay_tree_lookup (mem_map, &s);
+
+      if (!tn)
+	{
+	  gomp_mutex_unlock (&devicep->lock);
+	  gomp_fatal ("pointer target not mapped for attach");
+	}
+
+      data = tn->tgt->tgt_start + tn->tgt_offset + target - tn->host_start;
+
+      gomp_debug (1,
+		  "%s: attaching host %p, target %p (struct base %p) to %p\n",
+		  __FUNCTION__, (void *) attach_to, (void *) devptr,
+		  (void *) (n->tgt->tgt_start + n->tgt_offset), (void *) data);
+
+      gomp_copy_host2dev (devicep, aq, (void *) devptr, (void *) &data,
+			  sizeof (void *), cbufp);
+    }
+  else
+    gomp_debug (1, "%s: attach count for %p -> %u\n", __FUNCTION__,
+		(void *) attach_to, (int) n->u.attach_count[idx]);
+}
+
+void
+gomp_detach_pointer (struct gomp_device_descr *devicep,
+		     struct goacc_asyncqueue *aq, splay_tree_key n,
+		     uintptr_t detach_from, bool finalize,
+		     struct gomp_coalesce_buf *cbufp)
+{
+  size_t idx;
+
+  if (n == NULL)
+    {
+      gomp_mutex_unlock (&devicep->lock);
+      gomp_fatal ("enclosing struct not mapped for detach");
+    }
+
+  idx = (detach_from - n->host_start) / sizeof (void *);
+
+  assert (n->virtual_refcount != VREFCOUNT_LINK_KEY);
+
+  if (!n->u.attach_count)
+    {
+      gomp_mutex_unlock (&devicep->lock);
+      gomp_fatal ("no attachment counters for struct");
+    }
+
+  if (finalize)
+    n->u.attach_count[idx] = 1;
+
+  if (n->u.attach_count[idx] == 0)
+    {
+      gomp_mutex_unlock (&devicep->lock);
+      gomp_fatal ("attach count underflow");
+    }
+  else
+    n->u.attach_count[idx]--;
+
+  if (n->u.attach_count[idx] == 0)
+    {
+      uintptr_t devptr = n->tgt->tgt_start + n->tgt_offset + detach_from
+			 - n->host_start;
+      uintptr_t target = (uintptr_t) *(void **) detach_from;
+
+      gomp_debug (1,
+		  "%s: detaching host %p, target %p (struct base %p) to %p\n",
+		  __FUNCTION__, (void *) detach_from, (void *) devptr,
+		  (void *) (n->tgt->tgt_start + n->tgt_offset),
+		  (void *) target);
+
+      gomp_copy_host2dev (devicep, aq, (void *) devptr, (void *) &target,
+			  sizeof (void *), cbufp);
+    }
+  else
+    gomp_debug (1, "%s: attach count for %p -> %u\n", __FUNCTION__,
+		(void *) detach_from, (int) n->u.attach_count[idx]);
+}
+
+uintptr_t
 gomp_map_val (struct target_mem_desc *tgt, void **hostaddrs, size_t i)
 {
   if (tgt->list[i].key != NULL)
@@ -547,8 +680,9 @@ gomp_map_vars (struct gomp_device_descr *devicep, size_t mapnum,
 attribute_hidden struct target_mem_desc *
 gomp_map_vars_async (struct gomp_device_descr *devicep,
 		     struct goacc_asyncqueue *aq, size_t mapnum,
-		     void **hostaddrs, void **devaddrs, size_t *sizes, void *kinds,
-		     bool short_mapkind, enum gomp_map_vars_kind pragma_kind)
+		     void **hostaddrs, void **devaddrs, size_t *sizes,
+		     void *kinds, bool short_mapkind,
+		     enum gomp_map_vars_kind pragma_kind)
 {
   size_t i, tgt_align, tgt_size, not_found_cnt = 0;
   bool has_firstprivate = false;
@@ -559,7 +693,8 @@ gomp_map_vars_async (struct gomp_device_descr *devicep,
   struct target_mem_desc *tgt
     = gomp_malloc (sizeof (*tgt) + sizeof (tgt->list[0]) * mapnum);
   tgt->list_count = mapnum;
-  tgt->refcount = pragma_kind == GOMP_MAP_VARS_ENTER_DATA ? 0 : 1;
+  tgt->refcount = (pragma_kind == GOMP_MAP_VARS_ENTER_DATA
+		   || pragma_kind == GOMP_MAP_VARS_OPENACC_ENTER_DATA) ? 0 : 1;
   tgt->device_descr = devicep;
   struct gomp_coalesce_buf cbuf, *cbufp = NULL;
 
@@ -674,8 +809,15 @@ gomp_map_vars_async (struct gomp_device_descr *devicep,
 	  has_firstprivate = true;
 	  continue;
 	}
+      else if ((kind & typemask) == GOMP_MAP_ATTACH)
+	{
+	  tgt->list[i].key = NULL;
+	  has_firstprivate = true;
+	  continue;
+	}
       cur_node.host_start = (uintptr_t) hostaddrs[i];
-      if (!GOMP_MAP_POINTER_P (kind & typemask))
+      if (!GOMP_MAP_POINTER_P (kind & typemask)
+          && (kind & typemask) != GOMP_MAP_ATTACH)
 	cur_node.host_end = cur_node.host_start + sizes[i];
       else
 	cur_node.host_end = cur_node.host_start + sizeof (void *);
@@ -882,6 +1024,32 @@ gomp_map_vars_async (struct gomp_device_descr *devicep,
 		cur_node.tgt_offset = n->tgt->tgt_start + n->tgt_offset
 				      + cur_node.host_start - n->host_start;
 		continue;
+	      case GOMP_MAP_ATTACH:
+		{
+		  cur_node.host_start = (uintptr_t) hostaddrs[i];
+		  cur_node.host_end = cur_node.host_start + sizeof (void *);
+		  splay_tree_key n = splay_tree_lookup (mem_map, &cur_node);
+		  if (n != NULL)
+		    {
+		      tgt->list[i].key = n;
+		      tgt->list[i].offset = cur_node.host_start - n->host_start;
+		      tgt->list[i].length = n->host_end - n->host_start;
+		      tgt->list[i].copy_from = false;
+		      tgt->list[i].always_copy_from = false;
+		      tgt->list[i].do_detach
+		        = (pragma_kind != GOMP_MAP_VARS_OPENACC_ENTER_DATA);
+		      n->refcount++;
+		    }
+		  else
+		    {
+		      gomp_mutex_unlock (&devicep->lock);
+		      gomp_fatal ("outer struct not mapped for attach");
+		    }
+		  gomp_attach_pointer (devicep, aq, mem_map, n,
+				       (uintptr_t) hostaddrs[i], sizes[i],
+				       cbufp);
+		  continue;
+		}
 	      default:
 		break;
 	      }
@@ -899,13 +1067,15 @@ gomp_map_vars_async (struct gomp_device_descr *devicep,
 				      kind & typemask, cbufp);
 	    else
 	      {
-		k->link_key = NULL;
+		if (k->virtual_refcount == VREFCOUNT_LINK_KEY)
+		  k->u.link_key = NULL;
 		if (n && n->refcount == REFCOUNT_LINK)
 		  {
 		    /* Replace target address of the pointer with target address
 		       of mapped object in the splay tree.  */
 		    splay_tree_remove (mem_map, n);
-		    k->link_key = n;
+		    k->u.link_key = n;
+		    k->virtual_refcount = VREFCOUNT_LINK_KEY;
 		  }
 		size_t align = (size_t) 1 << (kind >> rshift);
 		tgt->list[i].key = k;
@@ -926,10 +1096,12 @@ gomp_map_vars_async (struct gomp_device_descr *devicep,
 		tgt->list[i].copy_from = GOMP_MAP_COPY_FROM_P (kind & typemask);
 		tgt->list[i].always_copy_from
 		  = GOMP_MAP_ALWAYS_FROM_P (kind & typemask);
+		tgt->list[i].do_detach = false;
 		tgt->list[i].offset = 0;
 		tgt->list[i].length = k->host_end - k->host_start;
 		k->refcount = 1;
-		k->dynamic_refcount = 0;
+		k->virtual_refcount = 0;
+		k->u.attach_count = NULL;
 		tgt->refcount++;
 		array->left = NULL;
 		array->right = NULL;
@@ -980,6 +1152,7 @@ gomp_map_vars_async (struct gomp_device_descr *devicep,
 			  tgt->list[j].key = k;
 			  tgt->list[j].copy_from = false;
 			  tgt->list[j].always_copy_from = false;
+			  tgt->list[j].do_detach = false;
 			  if (k->refcount != REFCOUNT_INFINITY)
 			    k->refcount++;
 			  gomp_map_pointer (tgt, aq,
@@ -1023,7 +1196,7 @@ gomp_map_vars_async (struct gomp_device_descr *devicep,
 				kind);
 		  }
 
-		if (k->link_key)
+		if (k->virtual_refcount == VREFCOUNT_LINK_KEY && k->u.link_key)
 		  {
 		    /* Set link pointer on target to the device address of the
 		       mapped object.  */
@@ -1063,8 +1236,20 @@ gomp_map_vars_async (struct gomp_device_descr *devicep,
   /* If the variable from "omp target enter data" map-list was already mapped,
      tgt is not needed.  Otherwise tgt will be freed by gomp_unmap_vars or
      gomp_exit_data.  */
-  if (pragma_kind == GOMP_MAP_VARS_ENTER_DATA && tgt->refcount == 0)
-    {
+  if ((pragma_kind == GOMP_MAP_VARS_ENTER_DATA
+       || pragma_kind == GOMP_MAP_VARS_OPENACC_ENTER_DATA)
+      && tgt->refcount == 0)
+    {
+      /* If we're about to discard a target_mem_desc with no "structural"
+	 references (tgt->refcount == 0), any splay keys linked in the tgt's
+	 list must have their virtual refcount incremented to represent that
+	 "lost" reference in order to implement the semantics of the OpenACC
+	 "present increment" operation properly.  */
+      if (pragma_kind == GOMP_MAP_VARS_OPENACC_ENTER_DATA)
+	for (i = 0; i < tgt->list_count; i++)
+	  if (tgt->list[i].key)
+	    tgt->list[i].key->virtual_refcount++;
+
       free (tgt);
       tgt = NULL;
     }
@@ -1089,8 +1274,10 @@ gomp_remove_var (struct gomp_device_descr *devicep, splay_tree_key k)
 {
   bool is_tgt_unmapped = false;
   splay_tree_remove (&devicep->mem_map, k);
-  if (k->link_key)
-    splay_tree_insert (&devicep->mem_map, (splay_tree_node) k->link_key);
+  if (k->virtual_refcount == VREFCOUNT_LINK_KEY && k->u.link_key)
+    splay_tree_insert (&devicep->mem_map, (splay_tree_node) k->u.link_key);
+  if (k->virtual_refcount != VREFCOUNT_LINK_KEY && k->u.attach_count)
+    free (k->u.attach_count);
   if (k->tgt->refcount > 1)
     k->tgt->refcount--;
   else
@@ -1133,14 +1320,35 @@ gomp_unmap_vars_async (struct target_mem_desc *tgt, bool do_copyfrom,
     }
 
   size_t i;
+
+  /* We must perform detachments before any copies back to the host.  */
+  for (i = 0; i < tgt->list_count; i++)
+    {
+      splay_tree_key k = tgt->list[i].key;
+
+      if (k != NULL && tgt->list[i].do_detach)
+	gomp_detach_pointer (devicep, aq, k, tgt->list[i].key->host_start
+					     + tgt->list[i].offset,
+			     k->refcount == 1, NULL);
+    }
+
   for (i = 0; i < tgt->list_count; i++)
     {
       splay_tree_key k = tgt->list[i].key;
+
       if (k == NULL)
 	continue;
 
       bool do_unmap = false;
-      if (k->refcount > 1 && k->refcount != REFCOUNT_INFINITY)
+      if (k->tgt == tgt
+	  && k->virtual_refcount > 0
+	  && k->virtual_refcount != VREFCOUNT_LINK_KEY
+	  && k->refcount != REFCOUNT_INFINITY)
+	{
+	  k->virtual_refcount--;
+	  k->refcount--;
+	}
+      else if (k->refcount > 1 && k->refcount != REFCOUNT_INFINITY)
 	k->refcount--;
       else if (k->refcount == 1)
 	{
@@ -1283,7 +1491,8 @@ gomp_load_image_to_device (struct gomp_device_descr *devicep, unsigned version,
       k->tgt = tgt;
       k->tgt_offset = target_table[i].start;
       k->refcount = REFCOUNT_INFINITY;
-      k->link_key = NULL;
+      k->virtual_refcount = 0;
+      k->u.link_key = NULL;
       array->left = NULL;
       array->right = NULL;
       splay_tree_insert (&devicep->mem_map, array);
@@ -1315,7 +1524,8 @@ gomp_load_image_to_device (struct gomp_device_descr *devicep, unsigned version,
       k->tgt = tgt;
       k->tgt_offset = target_var->start;
       k->refcount = target_size & link_bit ? REFCOUNT_LINK : REFCOUNT_INFINITY;
-      k->link_key = NULL;
+      k->virtual_refcount = 0;
+      k->u.link_key = NULL;
       array->left = NULL;
       array->right = NULL;
       splay_tree_insert (&devicep->mem_map, array);
@@ -1549,22 +1759,6 @@ gomp_unload_device (struct gomp_device_descr *devicep)
     }
 }
 
-/* Free address mapping tables.  MM must be locked on entry, and remains locked
-   on return.  */
-
-attribute_hidden void
-gomp_free_memmap (struct splay_tree_s *mem_map)
-{
-  while (mem_map->root)
-    {
-      struct target_mem_desc *tgt = mem_map->root->key.tgt;
-
-      splay_tree_remove (mem_map, &mem_map->root->key);
-      free (tgt->array);
-      free (tgt);
-    }
-}
-
 /* Do we have offload data available for the given offload target type?
    Instead of verifying that *all* offload data is available that could
    possibly be required, we instead just look for *any*.  If we later find any
@@ -2092,9 +2286,9 @@ gomp_exit_data (struct gomp_device_descr *devicep, size_t mapnum,
 	  if (k->refcount == 0)
 	    {
 	      splay_tree_remove (&devicep->mem_map, k);
-	      if (k->link_key)
+	      if (k->virtual_refcount == VREFCOUNT_LINK_KEY && k->u.link_key)
 		splay_tree_insert (&devicep->mem_map,
-				   (splay_tree_node) k->link_key);
+				   (splay_tree_node) k->u.link_key);
 	      if (k->tgt->refcount > 1)
 		k->tgt->refcount--;
 	      else
@@ -2631,6 +2825,8 @@ omp_target_associate_ptr (const void *host_ptr, const void *device_ptr,
       k->tgt = tgt;
       k->tgt_offset = (uintptr_t) device_ptr + device_offset;
       k->refcount = REFCOUNT_INFINITY;
+      k->virtual_refcount = 0;
+      k->u.link_key = NULL;
       array->left = NULL;
       array->right = NULL;
       splay_tree_insert (&devicep->mem_map, array);
@@ -3087,7 +3283,6 @@ gomp_target_init (void)
 		current_device.type = current_device.get_type_func ();
 		current_device.mem_map.root = NULL;
 		current_device.state = GOMP_DEVICE_UNINITIALIZED;
-		current_device.openacc.data_environ = NULL;
 
 		/* Augment DEVICES and NUM_DEVICES.  */
 		devices = gomp_realloc (devices,
diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/context-2.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/context-2.c
index 6a52f74..6bdcfe7 100644
--- a/libgomp/testsuite/libgomp.oacc-c-c++-common/context-2.c
+++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/context-2.c
@@ -182,13 +182,13 @@ main (int argc, char **argv)
         exit (EXIT_FAILURE);
     }
 
+    acc_delete (&h_X[0], N * sizeof (float));
+    acc_delete (&h_Y1[0], N * sizeof (float));
+
     free (h_X);
     free (h_Y1);
     free (h_Y2);
 
-    acc_free (d_X);
-    acc_free (d_Y);
-
     context_check (pctx);
 
     s = cublasDestroy (h);
diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/context-4.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/context-4.c
index 71365e8..b403a5c 100644
--- a/libgomp/testsuite/libgomp.oacc-c-c++-common/context-4.c
+++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/context-4.c
@@ -176,13 +176,13 @@ main (int argc, char **argv)
         exit (EXIT_FAILURE);
     }
 
+    acc_delete (&h_X[0], N * sizeof (float));
+    acc_delete (&h_Y1[0], N * sizeof (float));
+
     free (h_X);
     free (h_Y1);
     free (h_Y2);
 
-    acc_free (d_X);
-    acc_free (d_Y);
-
     context_check (pctx);
 
     s = cublasDestroy (h);
diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/deep-copy-1.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/deep-copy-1.c
new file mode 100644
index 0000000..d8d7067
--- /dev/null
+++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/deep-copy-1.c
@@ -0,0 +1,24 @@
+#include <stdlib.h>
+#include <assert.h>
+
+struct dc
+{
+  int a;
+  int *b;
+};
+
+int
+main ()
+{
+  int n = 100, i;
+  struct dc v = { .a = 3, .b = (int *) malloc (sizeof (int) * n) };
+
+#pragma acc parallel loop copy(v.a, v.b[:n])
+  for (i = 0; i < n; i++)
+    v.b[i] = v.a;
+
+  for (i = 0; i < 10; i++)
+    assert (v.b[i] == v.a);
+
+  return 0;
+}
diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/deep-copy-2.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/deep-copy-2.c
new file mode 100644
index 0000000..7e26e9a
--- /dev/null
+++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/deep-copy-2.c
@@ -0,0 +1,29 @@
+#include <assert.h>
+#include <stdlib.h>
+
+int
+main(int argc, char* argv[])
+{
+  struct foo {
+    int *a, *b, c, d, *e;
+  } s;
+
+  s.a = (int *) malloc (16 * sizeof (int));
+  s.b = (int *) malloc (16 * sizeof (int));
+  s.e = (int *) malloc (16 * sizeof (int));
+
+  #pragma acc data copy(s)
+  {
+    #pragma acc data copy(s.a[0:10])
+    {
+      #pragma acc parallel loop attach(s.a)
+      for (int i = 0; i < 10; i++)
+	s.a[i] = i;
+    }
+  }
+
+  for (int i = 0; i < 10; i++)
+    assert (s.a[i] == i);
+
+  return 0;
+}
diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/deep-copy-3.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/deep-copy-3.c
new file mode 100644
index 0000000..cec764b
--- /dev/null
+++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/deep-copy-3.c
@@ -0,0 +1,34 @@
+#include <assert.h>
+#include <stdlib.h>
+#include <openacc.h>
+
+int
+main ()
+{
+  int n = 100, i;
+  int *a = (int *) malloc (sizeof (int) * n);
+  int *b;
+
+  for (i = 0; i < n; i++)
+    a[i] = i+1;
+
+#pragma acc enter data copyin(a[:n]) create(b)
+
+  b = a;
+  acc_attach ((void **)&b);
+
+#pragma acc parallel loop present (b[:n])
+  for (i = 0; i < n; i++)
+    b[i] = i+1;
+
+  acc_detach ((void **)&b);
+
+#pragma acc exit data copyout(a[:n], b)
+
+  for (i = 0; i < 10; i++)
+    assert (a[i] == b[i]);
+
+  free (a);
+
+  return 0;
+}
diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/deep-copy-4.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/deep-copy-4.c
new file mode 100644
index 0000000..8874ca0
--- /dev/null
+++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/deep-copy-4.c
@@ -0,0 +1,87 @@
+#include <assert.h>
+#include <stdlib.h>
+
+#define LIST_LENGTH 10
+
+struct node
+{
+  struct node *next;
+  int val;
+};
+
+int
+sum_nodes (struct node *head)
+{
+  int i = 0, sum = 0;
+
+#pragma acc parallel reduction(+:sum) present(head[:1])
+  {
+    for (; head != NULL; head = head->next)
+      sum += head->val;
+  }
+
+  return sum;
+}
+
+void
+insert (struct node *head, int val)
+{
+  struct node *n = (struct node *) malloc (sizeof (struct node));
+
+  if (head->next)
+    {
+#pragma acc exit data detach(head->next)
+    }
+
+  n->val = val;
+  n->next = head->next;
+  head->next = n;
+
+#pragma acc enter data copyin(n[:1])
+#pragma acc enter data attach(head->next)
+  if (n->next)
+    {
+#pragma acc enter data attach(n->next)
+    }
+}
+
+void
+destroy (struct node *head)
+{
+  while (head->next != NULL)
+    {
+#pragma acc exit data detach(head->next)
+      struct node * n = head->next;
+      head->next = n->next;
+      if (n->next)
+	{
+#pragma acc exit data detach(n->next)
+	}
+#pragma acc exit data delete (n[:1])
+      if (head->next)
+	{
+#pragma acc enter data attach(head->next)
+	}
+      free (n);
+    }
+}
+
+int
+main ()
+{
+  struct node list = { .next = NULL, .val = 0 };
+  int i;
+
+#pragma acc enter data copyin(list)
+
+  for (i = 0; i < LIST_LENGTH; i++)
+    insert (&list, i + 1);
+
+  assert (sum_nodes (&list) == (LIST_LENGTH * LIST_LENGTH + LIST_LENGTH) / 2);
+
+  destroy (&list);
+
+#pragma acc exit data delete(list)
+
+  return 0;
+}
diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/deep-copy-5.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/deep-copy-5.c
new file mode 100644
index 0000000..89cafbb
--- /dev/null
+++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/deep-copy-5.c
@@ -0,0 +1,81 @@
+#include <assert.h>
+#include <stdlib.h>
+#include <openacc.h>
+
+struct node
+{
+  struct node *next;
+  int val;
+};
+
+int
+sum_nodes (struct node *head)
+{
+  int i = 0, sum = 0;
+
+#pragma acc parallel reduction(+:sum) present(head[:1])
+  {
+    for (; head != NULL; head = head->next)
+      sum += head->val;
+  }
+
+  return sum;
+}
+
+void
+insert (struct node *head, int val)
+{
+  struct node *n = (struct node *) malloc (sizeof (struct node));
+
+  if (head->next)
+    acc_detach ((void **) &head->next);
+
+  n->val = val;
+  n->next = head->next;
+  head->next = n;
+
+  acc_copyin (n, sizeof (struct node));
+  acc_attach((void **) &head->next);
+
+  if (n->next)
+    acc_attach ((void **) &n->next);
+}
+
+void
+destroy (struct node *head)
+{
+  while (head->next != NULL)
+    {
+      acc_detach ((void **) &head->next);
+      struct node * n = head->next;
+      head->next = n->next;
+      if (n->next)
+	acc_detach ((void **) &n->next);
+
+      acc_delete (n, sizeof (struct node));
+      if (head->next)
+	acc_attach((void **) &head->next);
+
+      free (n);
+    }
+}
+
+int
+main ()
+{
+  struct node list = { .next = NULL, .val = 0 };
+  int i;
+
+  acc_copyin (&list, sizeof (struct node));
+
+  for (i = 0; i < 10; i++)
+    insert (&list, 2);
+
+  assert (sum_nodes (&list) == 10 * 2);
+
+  destroy (&list);
+
+  acc_delete (&list, sizeof (struct node));
+
+  return 0;
+}
diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/deep-copy-6.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/deep-copy-6.c
new file mode 100644
index 0000000..81c1c5e
--- /dev/null
+++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/deep-copy-6.c
@@ -0,0 +1,59 @@
+/* { dg-do run { target { ! openacc_host_selected } } } */
+
+#include <stdlib.h>
+#include <assert.h>
+#include <openacc.h>
+
+struct dc
+{
+  int a;
+  int **b;
+};
+
+int
+main ()
+{
+  int n = 100, i, j, k;
+  struct dc v = { .a = 3 };
+
+  v.b = (int **) malloc (sizeof (int *) * n);
+  for (i = 0; i < n; i++)
+    v.b[i] = (int *) malloc (sizeof (int) * n);
+
+  for (k = 0; k < 16; k++)
+    {
+#pragma acc data copy(v)
+      {
+#pragma acc data copy(v.b[:n])
+	{
+	  for (i = 0; i < n; i++)
+	    {
+	      acc_copyin (v.b[i], sizeof (int) * n);
+	      acc_attach ((void **) &v.b[i]);
+	    }
+
+#pragma acc parallel loop
+	  for (i = 0; i < n; i++)
+	    for (j = 0; j < n; j++)
+	      v.b[i][j] = v.a + i + j;
+
+	  for (i = 0; i < n; i++)
+	    {
+	      acc_detach ((void **) &v.b[i]);
+	      acc_copyout (v.b[i], sizeof (int) * n);
+	    }
+	}
+      }
+
+      for (i = 0; i < n; i++)
+	for (j = 0; j < n; j++)
+	  assert (v.b[i][j] == v.a + i + j);
+
+      assert (!acc_is_present (&v, sizeof (v)));
+      assert (!acc_is_present (v.b, sizeof (int *) * n));
+      for (i = 0; i < n; i++)
+        assert (!acc_is_present (v.b[i], sizeof (int) * n));
+    }
+
+  return 0;
+}
diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/deep-copy-7.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/deep-copy-7.c
new file mode 100644
index 0000000..a59047a
--- /dev/null
+++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/deep-copy-7.c
@@ -0,0 +1,45 @@
+/* { dg-do run { target { ! openacc_host_selected } } } */
+
+#include <stdlib.h>
+#include <assert.h>
+#include <openacc.h>
+
+struct dc
+{
+  int a;
+  int *b;
+};
+
+int
+main ()
+{
+  int n = 100, i, j, k;
+  struct dc v = { .a = 3 };
+
+  v.b = (int *) malloc (sizeof (int) * n);
+
+  for (k = 0; k < 16; k++)
+    {
+      /* Here, we do not explicitly copy the enclosing structure, but work
+	 with fields directly.  Make sure attachment counters and reference
+	 counters work properly in that case.  */
+#pragma acc enter data copyin(v.a, v.b[0:n])
+#pragma acc enter data pcopyin(v.b[0:n])
+#pragma acc enter data pcopyin(v.b[0:n])
+
+#pragma acc parallel loop present(v.a, v.b)
+      for (i = 0; i < n; i++)
+	v.b[i] = v.a + i;
+
+#pragma acc exit data copyout(v.b[:n]) finalize
+#pragma acc exit data delete(v.a)
+
+      for (i = 0; i < n; i++)
+	assert (v.b[i] == v.a + i);
+
+      assert (!acc_is_present (&v, sizeof (v)));
+      assert (!acc_is_present (v.b, sizeof (int *) * n));
+    }
+
+  return 0;
+}
diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/deep-copy-8.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/deep-copy-8.c
new file mode 100644
index 0000000..0ca5990
--- /dev/null
+++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/deep-copy-8.c
@@ -0,0 +1,54 @@
+/* { dg-do run { target { ! openacc_host_selected } } } */
+
+#include <stdlib.h>
+#include <assert.h>
+#include <openacc.h>
+
+struct dc
+{
+  int a;
+  int *b;
+  int *c;
+  int *d;
+};
+
+int
+main ()
+{
+  int n = 100, i, j, k;
+  struct dc v = { .a = 3 };
+
+  v.b = (int *) malloc (sizeof (int) * n);
+  v.c = (int *) malloc (sizeof (int) * n);
+  v.d = (int *) malloc (sizeof (int) * n);
+
+#pragma acc enter data copyin(v)
+
+  for (k = 0; k < 16; k++)
+    {
+#pragma acc enter data copyin(v.a, v.b[:n], v.c[:n], v.d[:n])
+
+#pragma acc parallel loop
+      for (i = 0; i < n; i++)
+	v.b[i] = v.a + i;
+
+#pragma acc exit data copyout(v.b[:n])
+#pragma acc exit data copyout(v.c[:n])
+#pragma acc exit data copyout(v.d[:n])
+#pragma acc exit data copyout(v.a)
+
+      for (i = 0; i < n; i++)
+	assert (v.b[i] == v.a + i);
+
+      assert (acc_is_present (&v, sizeof (v)));
+      assert (!acc_is_present (v.b, sizeof (int *) * n));
+      assert (!acc_is_present (v.c, sizeof (int *) * n));
+      assert (!acc_is_present (v.d, sizeof (int *) * n));
+    }
+
+#pragma acc exit data copyout(v)
+
+  assert (!acc_is_present (&v, sizeof (v)));
+
+  return 0;
+}
diff --git a/libgomp/testsuite/libgomp.oacc-fortran/data-2.f90 b/libgomp/testsuite/libgomp.oacc-fortran/data-2.f90
index 83a5400..6bb92c1 100644
--- a/libgomp/testsuite/libgomp.oacc-fortran/data-2.f90
+++ b/libgomp/testsuite/libgomp.oacc-fortran/data-2.f90
@@ -1,4 +1,5 @@
 ! { dg-do run }
+! { dg-additional-options "-cpp" }
 
 program test
   use openacc
@@ -70,10 +71,14 @@ program test
     end do
   !$acc end parallel
   
-  !$acc exit data copyout (d(1:N)) async
+  !$acc exit data delete (c(1:N)) copyout (d(1:N)) async
   !$acc exit data async
   !$acc wait
 
+#if !ACC_MEM_SHARED
+  if (acc_is_present (c) .eqv. .TRUE.) call abort
+#endif
+
   do i = 1, N
     if (d(i) .ne. 4.0) call abort
   end do
diff --git a/libgomp/testsuite/libgomp.oacc-fortran/deep-copy-1.f90 b/libgomp/testsuite/libgomp.oacc-fortran/deep-copy-1.f90
new file mode 100644
index 0000000..c4cea11
--- /dev/null
+++ b/libgomp/testsuite/libgomp.oacc-fortran/deep-copy-1.f90
@@ -0,0 +1,35 @@
+! { dg-do run }
+
+! Test of attach/detach with "acc data".
+
+program dtype
+  implicit none
+  integer, parameter :: n = 512
+  type mytype
+    integer, allocatable :: a(:)
+  end type mytype
+  integer i
+
+  type(mytype) :: var
+
+  allocate(var%a(1:n))
+
+!$acc data copy(var)
+!$acc data copy(var%a)
+
+!$acc parallel loop
+  do i = 1,n
+    var%a(i) = i
+  end do
+!$acc end parallel loop
+
+!$acc end data
+!$acc end data
+
+  do i = 1,n
+    if (i .ne. var%a(i)) stop 1
+  end do
+
+  deallocate(var%a)
+
+end program dtype
diff --git a/libgomp/testsuite/libgomp.oacc-fortran/deep-copy-2.f90 b/libgomp/testsuite/libgomp.oacc-fortran/deep-copy-2.f90
new file mode 100644
index 0000000..3593661
--- /dev/null
+++ b/libgomp/testsuite/libgomp.oacc-fortran/deep-copy-2.f90
@@ -0,0 +1,33 @@
+! { dg-do run }
+
+! Test of attach/detach with "acc data", two clauses at once.
+
+program dtype
+  implicit none
+  integer, parameter :: n = 512
+  type mytype
+    integer, allocatable :: a(:)
+  end type mytype
+  integer i
+
+  type(mytype) :: var
+
+  allocate(var%a(1:n))
+
+!$acc data copy(var) copy(var%a)
+
+!$acc parallel loop
+  do i = 1,n
+    var%a(i) = i
+  end do
+!$acc end parallel loop
+
+!$acc end data
+
+  do i = 1,n
+    if (i .ne. var%a(i)) stop 1
+  end do
+
+  deallocate(var%a)
+
+end program dtype
diff --git a/libgomp/testsuite/libgomp.oacc-fortran/deep-copy-3.f90 b/libgomp/testsuite/libgomp.oacc-fortran/deep-copy-3.f90
new file mode 100644
index 0000000..667d944
--- /dev/null
+++ b/libgomp/testsuite/libgomp.oacc-fortran/deep-copy-3.f90
@@ -0,0 +1,34 @@
+! { dg-do run }
+
+! Test of attach/detach with "acc parallel".
+
+program dtype
+  implicit none
+  integer, parameter :: n = 512
+  type mytype
+    integer, allocatable :: a(:)
+    integer, allocatable :: b(:)
+  end type mytype
+  integer i
+
+  type(mytype) :: var
+
+  allocate(var%a(1:n))
+  allocate(var%b(1:n))
+
+!$acc parallel loop copy(var) copy(var%a(1:n)) copy(var%b(1:n))
+  do i = 1,n
+    var%a(i) = i
+    var%b(i) = i
+  end do
+!$acc end parallel loop
+
+  do i = 1,n
+    if (i .ne. var%a(i)) stop 1
+    if (i .ne. var%b(i)) stop 2
+  end do
+
+  deallocate(var%a)
+  deallocate(var%b)
+
+end program dtype
diff --git a/libgomp/testsuite/libgomp.oacc-fortran/deep-copy-4.f90 b/libgomp/testsuite/libgomp.oacc-fortran/deep-copy-4.f90
new file mode 100644
index 0000000..6949e12
--- /dev/null
+++ b/libgomp/testsuite/libgomp.oacc-fortran/deep-copy-4.f90
@@ -0,0 +1,49 @@
+! { dg-do run }
+
+! Test of attach/detach with "acc enter/exit data".
+
+program dtype
+  implicit none
+  integer, parameter :: n = 512
+  type mytype
+    integer, allocatable :: a(:)
+    integer, allocatable :: b(:)
+  end type mytype
+  integer, allocatable :: r(:)
+  integer i
+
+  type(mytype) :: var
+
+  allocate(var%a(1:n))
+  allocate(var%b(1:n))
+  allocate(r(1:n))
+
+!$acc enter data copyin(var)
+
+!$acc enter data copyin(var%a, var%b, r)
+
+!$acc parallel loop
+  do i = 1,n
+    var%a(i) = i
+    var%b(i) = i * 2
+    r(i) = i * 3
+  end do
+!$acc end parallel loop
+
+!$acc exit data copyout(var%a)
+!$acc exit data copyout(var%b)
+!$acc exit data copyout(r)
+
+  do i = 1,n
+    if (i .ne. var%a(i)) stop 1
+    if (i * 2 .ne. var%b(i)) stop 2
+    if (i * 3 .ne. r(i)) stop 3
+  end do
+
+!$acc exit data delete(var)
+
+  deallocate(var%a)
+  deallocate(var%b)
+  deallocate(r)
+
+end program dtype
diff --git a/libgomp/testsuite/libgomp.oacc-fortran/deep-copy-5.f90 b/libgomp/testsuite/libgomp.oacc-fortran/deep-copy-5.f90
new file mode 100644
index 0000000..6843cf1
--- /dev/null
+++ b/libgomp/testsuite/libgomp.oacc-fortran/deep-copy-5.f90
@@ -0,0 +1,57 @@
+! { dg-do run }
+
+! Test of attach/detach, "enter data" inside "data", and subarray.
+
+program dtype
+  implicit none
+  integer, parameter :: n = 512
+  type mytype
+    integer, allocatable :: a(:)
+    integer, allocatable :: b(:)
+  end type mytype
+  integer i
+
+  type(mytype) :: var
+
+  allocate(var%a(1:n))
+  allocate(var%b(1:n))
+
+!$acc data copy(var)
+
+  do i = 1, n
+    var%a(i) = 0
+    var%b(i) = 0
+  end do
+
+!$acc enter data copyin(var%a(5:n - 5), var%b(5:n - 5))
+
+!$acc parallel loop
+  do i = 5,n - 5
+    var%a(i) = i
+    var%b(i) = i * 2
+  end do
+!$acc end parallel loop
+
+!$acc exit data copyout(var%a(5:n - 5), var%b(5:n - 5))
+
+!$acc end data
+
+  do i = 1,4
+    if (var%a(i) .ne. 0) stop 1
+    if (var%b(i) .ne. 0) stop 2
+  end do
+
+  do i = 5,n - 5
+    if (i .ne. var%a(i)) stop 3
+    if (i * 2 .ne. var%b(i)) stop 4
+  end do
+
+  do i = n - 4,n
+    if (var%a(i) .ne. 0) stop 5
+    if (var%b(i) .ne. 0) stop 6
+  end do
+
+  deallocate(var%a)
+  deallocate(var%b)
+
+end program dtype
diff --git a/libgomp/testsuite/libgomp.oacc-fortran/deep-copy-6.f90 b/libgomp/testsuite/libgomp.oacc-fortran/deep-copy-6.f90
new file mode 100644
index 0000000..12910d0
--- /dev/null
+++ b/libgomp/testsuite/libgomp.oacc-fortran/deep-copy-6.f90
@@ -0,0 +1,61 @@
+! { dg-do run }
+
+! Test of attachment counters and finalize.
+
+program dtype
+  implicit none
+  integer, parameter :: n = 512
+  type mytype
+    integer, allocatable :: a(:)
+    integer, allocatable :: b(:)
+  end type mytype
+  integer i
+
+  type(mytype) :: var
+
+  allocate(var%a(1:n))
+  allocate(var%b(1:n))
+
+!$acc data copy(var)
+
+  do i = 1, n
+    var%a(i) = 0
+    var%b(i) = 0
+  end do
+
+!$acc enter data copyin(var%a(5:n - 5), var%b(5:n - 5))
+
+  do i = 1,20
+    !$acc enter data attach(var%a)
+  end do
+
+!$acc parallel loop
+  do i = 5,n - 5
+    var%a(i) = i
+    var%b(i) = i * 2
+  end do
+!$acc end parallel loop
+
+!$acc exit data copyout(var%a(5:n - 5), var%b(5:n - 5)) finalize
+
+!$acc end data
+
+  do i = 1,4
+    if (var%a(i) .ne. 0) stop 1
+    if (var%b(i) .ne. 0) stop 2
+  end do
+
+  do i = 5,n - 5
+    if (i .ne. var%a(i)) stop 3
+    if (i * 2 .ne. var%b(i)) stop 4
+  end do
+
+  do i = n - 4,n
+    if (var%a(i) .ne. 0) stop 5
+    if (var%b(i) .ne. 0) stop 6
+  end do
+
+  deallocate(var%a)
+  deallocate(var%b)
+
+end program dtype
diff --git a/libgomp/testsuite/libgomp.oacc-fortran/deep-copy-7.f90 b/libgomp/testsuite/libgomp.oacc-fortran/deep-copy-7.f90
new file mode 100644
index 0000000..ab44f0a
--- /dev/null
+++ b/libgomp/testsuite/libgomp.oacc-fortran/deep-copy-7.f90
@@ -0,0 +1,89 @@
+! { dg-do run }
+
+! Test of attach/detach with scalar elements and nested derived types.
+
+program dtype
+  implicit none
+  integer, parameter :: n = 512
+  type subtype
+    integer :: g, h
+    integer, allocatable :: q(:)
+  end type subtype
+  type mytype
+    integer, allocatable :: a(:)
+    integer, allocatable :: c, d
+    integer, allocatable :: b(:)
+    integer :: f
+    type(subtype) :: s
+  end type mytype
+  integer i
+
+  type(mytype) :: var
+
+  allocate(var%a(1:n))
+  allocate(var%b(1:n))
+  allocate(var%c)
+  allocate(var%d)
+  allocate(var%s%q(1:n))
+
+  var%c = 16
+  var%d = 20
+  var%f = 7
+  var%s%g = 21
+  var%s%h = 38
+
+!$acc enter data copyin(var)
+
+  do i = 1, n
+    var%a(i) = 0
+    var%b(i) = 0
+    var%s%q(i) = 0
+  end do
+
+!$acc data copy(var%a(5:n - 5), var%b(5:n - 5), var%c, var%d) &
+!$acc & copy(var%s%q)
+
+!$acc parallel loop default(none) present(var)
+  do i = 5,n - 5
+    var%a(i) = i
+    var%b(i) = i * 2
+    var%s%q(i) = i * 3
+    var%s%g = 100
+    var%s%h = 101
+  end do
+!$acc end parallel loop
+
+!$acc end data
+
+!$acc exit data copyout(var)
+
+  do i = 1,4
+    if (var%a(i) .ne. 0) stop 1
+    if (var%b(i) .ne. 0) stop 2
+    if (var%s%q(i) .ne. 0) stop 3
+  end do
+
+  do i = 5,n - 5
+    if (i .ne. var%a(i)) stop 4
+    if (i * 2 .ne. var%b(i)) stop 5
+    if (i * 3 .ne. var%s%q(i)) stop 6
+  end do
+
+  do i = n - 4,n
+    if (var%a(i) .ne. 0) stop 7
+    if (var%b(i) .ne. 0) stop 8
+    if (var%s%q(i) .ne. 0) stop 9
+  end do
+
+  if (var%c .ne. 16) stop 10
+  if (var%d .ne. 20) stop 11
+  if (var%s%g .ne. 100 .or. var%s%h .ne. 101) stop 12
+  if (var%f .ne. 7) stop 13
+
+  deallocate(var%a)
+  deallocate(var%b)
+  deallocate(var%c)
+  deallocate(var%d)
+  deallocate(var%s%q)
+
+end program dtype
diff --git a/libgomp/testsuite/libgomp.oacc-fortran/deep-copy-8.f90 b/libgomp/testsuite/libgomp.oacc-fortran/deep-copy-8.f90
new file mode 100644
index 0000000..d142763a
--- /dev/null
+++ b/libgomp/testsuite/libgomp.oacc-fortran/deep-copy-8.f90
@@ -0,0 +1,41 @@
+! { dg-do run }
+
+! Test of explicit attach/detach clauses and attachment counters. There are no
+! acc_attach/acc_detach API routines in Fortran.
+
+program dtype
+  use openacc
+  implicit none
+  integer, parameter :: n = 512
+  type mytype
+    integer, allocatable :: a(:)
+  end type mytype
+  integer i
+
+  type(mytype) :: var
+
+  allocate(var%a(1:n))
+
+  call acc_copyin(var)
+  call acc_copyin(var%a)
+
+  !$acc enter data attach(var%a)
+
+!$acc parallel loop attach(var%a)
+  do i = 1,n
+    var%a(i) = i
+  end do
+!$acc end parallel loop
+
+  !$acc exit data detach(var%a)
+
+  call acc_copyout(var%a)
+  call acc_copyout(var)
+
+  do i = 1,n
+    if (i .ne. var%a(i)) stop 1
+  end do
+
+  deallocate(var%a)
+
+end program dtype
diff --git a/libgomp/testsuite/libgomp.oacc-fortran/derived-type-1.f90 b/libgomp/testsuite/libgomp.oacc-fortran/derived-type-1.f90
new file mode 100644
index 0000000..eb7812d
--- /dev/null
+++ b/libgomp/testsuite/libgomp.oacc-fortran/derived-type-1.f90
@@ -0,0 +1,28 @@
+! Test derived types with subarrays
+
+! { dg-do run }
+
+  implicit none
+  type dtype
+     integer :: a, b, c
+  end type dtype
+  integer, parameter :: n = 100
+  integer i
+  type (dtype), dimension(n) :: d
+
+  !$acc data copy(d(1:n))
+  !$acc parallel loop
+  do i = 1, n
+     d(i)%a = i
+     d(i)%b = i-1
+     d(i)%c = i+1
+  end do
+  !$acc end data
+
+  do i = 1, n
+     if (d(i)%a /= i) stop 1
+     if (d(i)%b /= i-1) stop 2
+     if (d(i)%c /= i+1) stop 3
+  end do
+end program
+
diff --git a/libgomp/testsuite/libgomp.oacc-fortran/update-2.f90 b/libgomp/testsuite/libgomp.oacc-fortran/update-2.f90
new file mode 100644
index 0000000..c3c8a07
--- /dev/null
+++ b/libgomp/testsuite/libgomp.oacc-fortran/update-2.f90
@@ -0,0 +1,284 @@
+! Test ACC UPDATE with derived types.
+
+! { dg-do run }
+
+module dt
+  integer, parameter :: n = 10
+  type inner
+     integer :: d(n)
+  end type inner
+  type mytype
+     integer(8) :: a, b, c(n)
+     type(inner) :: in
+  end type mytype
+end module dt
+
+program derived_acc
+  use dt
+
+  implicit none
+  integer i, res
+  type(mytype) :: var
+
+  var%a = 0
+  var%b = 1
+  var%c(:) = 10
+  var%in%d(:) = 100
+
+  var%c(:) = 10
+
+  !$acc enter data copyin(var)
+
+  !$acc parallel loop present(var)
+  do i = 1, 1
+     var%a = var%b
+  end do
+  !$acc end parallel loop
+
+  !$acc update host(var%a)
+
+  if (var%a /= var%b) stop 1
+
+  var%b = 100
+
+  !$acc update device(var%b)
+
+  !$acc parallel loop present(var)
+  do i = 1, 1
+     var%a = var%b
+  end do
+  !$acc end parallel loop
+
+  !$acc update host(var%a)
+
+  if (var%a /= var%b) stop 2
+
+  !$acc parallel loop present (var)
+  do i = 1, n
+     var%c(i) = i
+  end do
+  !$acc end parallel loop
+
+  !$acc update host(var%c)
+
+  var%a = -1
+
+  do i = 1, n
+     if (var%c(i) /= i) stop 3
+     var%c(i) = var%a
+  end do
+
+  !$acc update device(var%a)
+  !$acc update device(var%c)
+
+  res = 0
+
+  !$acc parallel loop present(var) reduction(+:res)
+  do i = 1, n
+     if (var%c(i) /= var%a) res = res + 1
+  end do
+
+  if (res /= 0) stop 4
+
+  var%c(:) = 0
+
+  !$acc update device(var%c)
+
+  !$acc parallel loop present(var)
+  do i = 5, 5
+     var%c(i) = 1
+  end do
+  !$acc end parallel loop
+
+  !$acc update host(var%c(5))
+
+  do i = 1, n
+     if (i /= 5 .and. var%c(i) /= 0) stop 5
+     if (i == 5 .and. var%c(i) /= 1) stop 6
+  end do
+
+  !$acc parallel loop present(var)
+  do i = 1, n
+     var%in%d = var%a
+  end do
+  !$acc end parallel loop
+
+  !$acc update host(var%in%d)
+
+  do i = 1, n
+     if (var%in%d(i) /= var%a) stop 7
+  end do
+
+  var%c(:) = 0
+
+  !$acc update device(var%c)
+
+  var%c(:) = -1
+
+  !$acc parallel loop present(var)
+  do i = n/2, n
+     var%c(i) = i
+  end do
+  !$acc end parallel loop
+
+  !$acc update host(var%c(n/2:n))
+
+  do i = 1,n
+     if (i < n/2 .and. var%c(i) /= -1) stop 8
+     if (i >= n/2 .and. var%c(i) /= i) stop 9
+  end do
+
+  var%in%d(:) = 0
+  !$acc update device(var%in%d)
+
+  !$acc parallel loop present(var)
+  do i = 5, 5
+     var%in%d(i) = 1
+  end do
+  !$acc end parallel loop
+
+  !$acc update host(var%in%d(5))
+
+  do i = 1, n
+     if (i /= 5 .and. var%in%d(i) /= 0) stop 10
+     if (i == 5 .and. var%in%d(i) /= 1) stop 11
+  end do
+
+  !$acc exit data delete(var)
+
+  call derived_acc_subroutine(var)
+end program derived_acc
+
+subroutine derived_acc_subroutine(var)
+  use dt
+
+  implicit none
+  integer i, res
+  type(mytype) :: var
+
+  var%a = 0
+  var%b = 1
+  var%c(:) = 10
+  var%in%d(:) = 100
+
+  var%c(:) = 10
+
+  !$acc enter data copyin(var)
+
+  !$acc parallel loop present(var)
+  do i = 1, 1
+     var%a = var%b
+  end do
+  !$acc end parallel loop
+
+  !$acc update host(var%a)
+
+  if (var%a /= var%b) stop 12
+
+  var%b = 100
+
+  !$acc update device(var%b)
+
+  !$acc parallel loop present(var)
+  do i = 1, 1
+     var%a = var%b
+  end do
+  !$acc end parallel loop
+
+  !$acc update host(var%a)
+
+  if (var%a /= var%b) stop 13
+
+  !$acc parallel loop present (var)
+  do i = 1, n
+     var%c(i) = i
+  end do
+  !$acc end parallel loop
+
+  !$acc update host(var%c)
+
+  var%a = -1
+
+  do i = 1, n
+     if (var%c(i) /= i) stop 14
+     var%c(i) = var%a
+  end do
+
+  !$acc update device(var%a)
+  !$acc update device(var%c)
+
+  res = 0
+
+  !$acc parallel loop present(var) reduction(+:res)
+  do i = 1, n
+     if (var%c(i) /= var%a) res = res + 1
+  end do
+
+  if (res /= 0) stop 15
+
+  var%c(:) = 0
+
+  !$acc update device(var%c)
+
+  !$acc parallel loop present(var)
+  do i = 5, 5
+     var%c(i) = 1
+  end do
+  !$acc end parallel loop
+
+  !$acc update host(var%c(5))
+
+  do i = 1, n
+     if (i /= 5 .and. var%c(i) /= 0) stop 16
+     if (i == 5 .and. var%c(i) /= 1) stop 17
+  end do
+
+  !$acc parallel loop present(var)
+  do i = 1, n
+     var%in%d = var%a
+  end do
+  !$acc end parallel loop
+
+  !$acc update host(var%in%d)
+
+  do i = 1, n
+     if (var%in%d(i) /= var%a) stop 18
+  end do
+
+  var%c(:) = 0
+
+  !$acc update device(var%c)
+
+  var%c(:) = -1
+
+  !$acc parallel loop present(var)
+  do i = n/2, n
+     var%c(i) = i
+  end do
+  !$acc end parallel loop
+
+  !$acc update host(var%c(n/2:n))
+
+  do i = 1,n
+     if (i < n/2 .and. var%c(i) /= -1) stop 19
+     if (i >= n/2 .and. var%c(i) /= i) stop 20
+  end do
+
+  var%in%d(:) = 0
+  !$acc update device(var%in%d)
+
+  !$acc parallel loop present(var)
+  do i = 5, 5
+     var%in%d(i) = 1
+  end do
+  !$acc end parallel loop
+
+  !$acc update host(var%in%d(5))
+
+  do i = 1, n
+     if (i /= 5 .and. var%in%d(i) /= 0) stop 21
+     if (i == 5 .and. var%in%d(i) /= 1) stop 22
+  end do
+
+  !$acc exit data delete(var)
+end subroutine derived_acc_subroutine
Jakub Jelinek Dec. 13, 2018, 10:57 a.m. UTC | #4
On Mon, Dec 10, 2018 at 07:41:37PM +0000, Julian Brown wrote:
> @@ -11870,7 +11874,8 @@ c_parser_oacc_wait_list (c_parser *parser, location_t clause_loc, tree list)
>  static tree
>  c_parser_omp_variable_list (c_parser *parser,
>  			    location_t clause_loc,
> -			    enum omp_clause_code kind, tree list)
> +			    enum omp_clause_code kind, tree list,
> +			    bool allow_deref)

Make it bool allow_deref = false so that you don't have to change all
callers?

> @@ -12579,7 +12597,8 @@ c_parser_omp_clause_lastprivate (c_parser *parser, tree list)
>  	    }
>  	}
>        tree nlist = c_parser_omp_variable_list (parser, loc,
> -					       OMP_CLAUSE_LASTPRIVATE, list);
> +					       OMP_CLAUSE_LASTPRIVATE, list,
> +					       false);
>        c_parser_skip_until_found (parser, CPP_CLOSE_PAREN, "expected %<)%>");
>        if (conditional)
>  	for (tree c = nlist; c != list; c = OMP_CLAUSE_CHAIN (c))

Like these etc.

> +	  if (ort == C_ORT_ACC && TREE_CODE (t) == MEM_REF)
> +	    {
> +	      poly_int64 offset = mem_ref_offset (t).force_shwi ();
> +	      if (maybe_ne (offset, 0))

Just do if (maybe_ne (mem_ref_offset (t), 0)) ?

> @@ -14432,6 +14491,16 @@ c_finish_omp_clauses (tree clauses, enum c_omp_region_type ort)
>  		}
>  	      if (remove)
>  		break;
> +	      if (ort == C_ORT_ACC && TREE_CODE (t) == MEM_REF)
> +	        {
> +		  poly_int64 offset = mem_ref_offset (t).force_shwi ();
> +		  if (maybe_ne (offset, 0))

Likewise.

> @@ -32111,7 +32115,7 @@ check_no_duplicate_clause (tree clauses, enum omp_clause_code code,
>  
>  static tree
>  cp_parser_omp_var_list_no_open (cp_parser *parser, enum omp_clause_code kind,
> -				tree list, bool *colon)
> +				tree list, bool *colon, bool allow_deref)

See above.

> @@ -33560,7 +33579,7 @@ cp_parser_omp_clause_reduction (cp_parser *parser, enum omp_clause_code kind,
>      goto resync_fail;
>  
>    nlist = cp_parser_omp_var_list_no_open (parser, kind, list,
> -					  NULL);
> +					  NULL, false);
>    for (c = nlist; c != list; c = OMP_CLAUSE_CHAIN (c))
>      {
>        OMP_CLAUSE_REDUCTION_CODE (c) = code;

See above.

> --- a/libgomp/target.c
> +++ b/libgomp/target.c
> @@ -39,6 +39,9 @@
>  #include <string.h>
>  #include <assert.h>
>  #include <errno.h>
> +#ifdef RC_CHECKING
> +#include <stdio.h>
> +#endif

This doesn't belong here.

> @@ -1089,8 +1274,10 @@ gomp_remove_var (struct gomp_device_descr *devicep, splay_tree_key k)
>  {
>    bool is_tgt_unmapped = false;
>    splay_tree_remove (&devicep->mem_map, k);
> -  if (k->link_key)
> -    splay_tree_insert (&devicep->mem_map, (splay_tree_node) k->link_key);
> +  if (k->virtual_refcount == VREFCOUNT_LINK_KEY && k->u.link_key)
> +    splay_tree_insert (&devicep->mem_map, (splay_tree_node) k->u.link_key);
> +  if (k->virtual_refcount != VREFCOUNT_LINK_KEY && k->u.attach_count)
> +    free (k->u.attach_count);

So write
  if (k->virtual_refcount == VREFCOUNT_LINK_KEY)
    {
      if (k->u.link_key)
	splay_tree_insert (&devicep->mem_map, (splay_tree_node) k->u.link_key);
    }
  else if (k->u.attach_count)
    free (k->u.attach_count);
?

	Jakub
Julian Brown Dec. 14, 2018, 7 p.m. UTC | #5
Hi!

Thanks again for review. Here's another iteration of the patch, with
your comments addressed.

One small thing, additionally: I've introduced a function
c_omp_map_clause_name in c-family/c-omp.c to return the name of OpenACC
map clauses -- since many of these use OMP_CLAUSE_MAP and then use
OMP_CLAUSE_MAP_KIND to distinguish between different mapping types,
just emitting 'map' to refer to several different clauses won't be very
meaningful to the user. I didn't see any existing way to get the right
names.

On Thu, 13 Dec 2018 11:57:05 +0100
Jakub Jelinek <jakub@redhat.com> wrote:

> On Mon, Dec 10, 2018 at 07:41:37PM +0000, Julian Brown wrote:
> > @@ -11870,7 +11874,8 @@ c_parser_oacc_wait_list (c_parser *parser,
> > location_t clause_loc, tree list) static tree
> >  c_parser_omp_variable_list (c_parser *parser,
> >  			    location_t clause_loc,
> > -			    enum omp_clause_code kind, tree list)
> > +			    enum omp_clause_code kind, tree list,
> > +			    bool allow_deref)  
> 
> Make it bool allow_deref = false so that you don't have to change all
> callers?

Done.

> > @@ -12579,7 +12597,8 @@ c_parser_omp_clause_lastprivate (c_parser
> > *parser, tree list) }
> >  	}
> >        tree nlist = c_parser_omp_variable_list (parser, loc,
> > -
> > OMP_CLAUSE_LASTPRIVATE, list);
> > +
> > OMP_CLAUSE_LASTPRIVATE, list,
> > +					       false);
> >        c_parser_skip_until_found (parser, CPP_CLOSE_PAREN,
> > "expected %<)%>"); if (conditional)
> >  	for (tree c = nlist; c != list; c = OMP_CLAUSE_CHAIN (c))  
> 
> Like these etc.

Those bits reverted.

> > +	  if (ort == C_ORT_ACC && TREE_CODE (t) == MEM_REF)
> > +	    {
> > +	      poly_int64 offset = mem_ref_offset (t).force_shwi ();
> > +	      if (maybe_ne (offset, 0))  
> 
> Just do if (maybe_ne (mem_ref_offset (t), 0)) ?

Fixed.

> > @@ -14432,6 +14491,16 @@ c_finish_omp_clauses (tree clauses, enum
> > c_omp_region_type ort) }
> >  	      if (remove)
> >  		break;
> > +	      if (ort == C_ORT_ACC && TREE_CODE (t) == MEM_REF)
> > +	        {
> > +		  poly_int64 offset = mem_ref_offset
> > (t).force_shwi ();
> > +		  if (maybe_ne (offset, 0))  
> 
> Likewise.

Fixed.

> > @@ -32111,7 +32115,7 @@ check_no_duplicate_clause (tree clauses,
> > enum omp_clause_code code, 
> >  static tree
> >  cp_parser_omp_var_list_no_open (cp_parser *parser, enum
> > omp_clause_code kind,
> > -				tree list, bool *colon)
> > +				tree list, bool *colon, bool
> > allow_deref)  
> 
> See above.

Fixed.

> > @@ -33560,7 +33579,7 @@ cp_parser_omp_clause_reduction (cp_parser
> > *parser, enum omp_clause_code kind, goto resync_fail;
> >  
> >    nlist = cp_parser_omp_var_list_no_open (parser, kind, list,
> > -					  NULL);
> > +					  NULL, false);
> >    for (c = nlist; c != list; c = OMP_CLAUSE_CHAIN (c))
> >      {
> >        OMP_CLAUSE_REDUCTION_CODE (c) = code;  
> 
> See above.

Fixed.

> > --- a/libgomp/target.c
> > +++ b/libgomp/target.c
> > @@ -39,6 +39,9 @@
> >  #include <string.h>
> >  #include <assert.h>
> >  #include <errno.h>
> > +#ifdef RC_CHECKING
> > +#include <stdio.h>
> > +#endif  
> 
> This doesn't belong here.

Oops! This leaked over from the refcount-checking patch via a bad
merge. Fixed.

> > @@ -1089,8 +1274,10 @@ gomp_remove_var (struct gomp_device_descr
> > *devicep, splay_tree_key k) {
> >    bool is_tgt_unmapped = false;
> >    splay_tree_remove (&devicep->mem_map, k);
> > -  if (k->link_key)
> > -    splay_tree_insert (&devicep->mem_map, (splay_tree_node)
> > k->link_key);
> > +  if (k->virtual_refcount == VREFCOUNT_LINK_KEY && k->u.link_key)
> > +    splay_tree_insert (&devicep->mem_map, (splay_tree_node)
> > k->u.link_key);
> > +  if (k->virtual_refcount != VREFCOUNT_LINK_KEY &&
> > k->u.attach_count)
> > +    free (k->u.attach_count);  
> 
> So write
>   if (k->virtual_refcount == VREFCOUNT_LINK_KEY)
>     {
>       if (k->u.link_key)
> 	splay_tree_insert (&devicep->mem_map, (splay_tree_node)
> k->u.link_key); }
>   else if (k->u.attach_count)
>     free (k->u.attach_count);
> ?

I've done so.

Re-tested with offloading to nvptx. OK?

Thanks,

Julian
commit 238772fb45bd4c757ba9f9c29ff714e8d753a94f
Author: Julian Brown <julian@codesourcery.com>
Date:   Mon Nov 5 16:12:09 2018 -0800

    OpenACC 2.6 manual deep copy support (attach/detach)
    
    	gcc/c-family/
    	* c-pragma.h (pragma_omp_clause): Add PRAGMA_OACC_CLAUSE_ATTACH,
    	PRAGMA_OACC_CLAUSE_DETACH.
    	* c-common.h (c_omp_map_clause_name): Add prototype.
    	* c-omp.c (c_omp_map_clause_name): New function.
    
    	gcc/c/
    	* c-parser.c (c_parser_omp_clause_name): Add parsing of attach and
    	detach clauses.
    	(c_parser_omp_variable_list): Add ALLOW_DEREF optional parameter.
    	Allow deref (->) in variable lists if true.
    	(c_parser_omp_var_list_parens): Add ALLOW_DEREF optional parameter.
    	Pass to c_parser_omp_variable_list.
    	(c_parser_oacc_data_clause): Support attach and detach clauses.  Update
    	call to c_parser_omp_variable_list.
    	(c_parser_oacc_all_clauses): Support attach and detach clauses.
    	(OACC_DATA_CLAUSE_MASK, OACC_ENTER_DATA_CLAUSE_MASK)
    	(OACC_KERNELS_CLAUSE_MASK, OACC_PARALLEL_CLAUSE_MASK): Add
    	PRAGMA_OACC_CLAUSE_ATTACH.
    	(OACC_EXIT_DATA_CLAUSE_MASK): Add PRAGMA_OACC_CLAUSE_DETACH.
    	* c-typeck.c (handle_omp_array_sections_1): Reject subarrays for attach
    	and detach.  Support deref.
    	(c_oacc_check_attachments): New function.
    	(c_finish_omp_clauses): Check attach/detach arguments for being
    	pointers using above.  Support deref.
    
    	gcc/cp/
    	* parser.c (cp_parser_omp_clause_name): Support attach and detach
    	clauses.
    	(cp_parser_omp_var_list_no_open): Add ALLOW_DEREF optional parameter.
    	Parse deref if true.
    	(cp_parser_omp_var_list): Add ALLOW_DEREF optional parameter.  Pass to
    	cp_parser_omp_var_list_no_open.
    	(cp_parser_oacc_data_clause): Support attach and detach clauses.
    	Update call to cp_parser_omp_var_list_no_open.
    	(cp_parser_oacc_all_clauses): Support attach and detach.
    	(OACC_DATA_CLAUSE_MASK, OACC_ENTER_DATA_CLAUSE_MASK)
    	(OACC_KERNELS_CLAUSE_MASK, OACC_PARALLEL_CLAUSE_MASK): Add
    	PRAGMA_OACC_CLAUSE_ATTACH.
    	(OACC_EXIT_DATA_CLAUSE_MASK): Add PRAGMA_OACC_CLAUSE_DETACH.
    	* semantics.c (handle_omp_array_sections_1): Reject subarrays for
    	attach and detach.
    	(cp_oacc_check_attachments): New function.
    	(finish_omp_clauses): Use above function.  Allow structure fields and
    	class members to appear in OpenACC data clauses.  Support deref.
    
    	gcc/fortran/
    	* gfortran.h (gfc_omp_map_op): Add OMP_MAP_ATTACH, OMP_MAP_DETACH.
    	* openmp.c (gfc_match_omp_variable_list): Add allow_derived parameter.
    	Parse derived-type member accesses if true.
    	(omp_mask2): Add OMP_CLAUSE_ATTACH, OMP_CLAUSE_DETACH.
    	(gfc_match_omp_map_clause): Add allow_derived parameter.  Pass to
    	gfc_match_omp_variable_list.
    	(gfc_match_omp_clauses): Support attach and detach.  Support derived
    	types for appropriate OpenACC directives.
    	(OACC_PARALLEL_CLAUSES, OACC_KERNELS_CLAUSES, OACC_DATA_CLAUSES)
    	(OACC_ENTER_DATA_CLAUSES): Add OMP_CLAUSE_ATTACH.
    	(OACC_EXIT_DATA_CLAUSES): Add OMP_CLAUSE_DETACH.
    	(check_symbol_not_pointer): Don't disallow pointer objects of derived
    	type.
    	(resolve_oacc_data_clauses): Don't disallow allocatable derived types.
    	(resolve_omp_clauses): Perform duplicate checking only for non-derived
    	type component accesses (plain variables and arrays or array sections).
    	Support component refs.
    	* trans-openmp.c (gfc_omp_privatize_by_reference): Support component
    	refs.
    	(gfc_trans_omp_clauses): Support component refs, attach and detach
    	clauses.
    
    	gcc/
    	* gimplify.c (gimplify_omp_var_data): Add GOVD_MAP_HAS_ATTACHMENTS.
    	(insert_struct_component_mapping): Support derived-type member mappings
    	for arrays with descriptors which use GOMP_MAP_TO_PSET.
    	(gimplify_scan_omp_clauses): Rewrite GOMP_MAP_ALWAYS_POINTER to
    	GOMP_MAP_ATTACH for OpenACC struct/derived-type component pointers.
    	Handle pointer mappings that use GOMP_MAP_TO_PSET.  Handle attach/detach
    	clauses.
    	(gimplify_adjust_omp_clauses_1): Skip adjustments for explicit
    	attach/detach clauses.
    	(gimplify_omp_target_update): Handle finalize for detach.
    	* omp-low.c (lower_omp_target): Support GOMP_MAP_ATTACH,
    	GOMP_MAP_DETACH, GOMP_MAP_FORCE_DETACH.
    	* tree-pretty-print.c (dump_omp_clause): Likewise.
    
    	include/
    	* gomp-constants.h (GOMP_MAP_DEEP_COPY): Define.
    	(gomp_map_kind): Add GOMP_MAP_ATTACH, GOMP_MAP_DETACH,
    	GOMP_MAP_FORCE_DETACH.
    
    	gcc/testsuite/
    	* c-c++-common/goacc/mdc-1.c: New test.
    	* c-c++-common/goacc/mdc-2.c: New test.
    	* gcc.dg/goacc/mdc.C: New test.
    	* gfortran.dg/goacc/data-clauses.f95: New test.
    	* gfortran.dg/goacc/derived-types.f90: New test.
    	* gfortran.dg/goacc/enter-exit-data.f95: New test.
    
    	libgomp/
    	* libgomp.h (struct target_var_desc): Add do_detach flag.
    	(VREFCOUNT_LINK_KEY): New macro.
    	(struct splay_tree_key_s): Put link_key and new attach_count field into
    	a new union.  Substitute dynamic_refcount field for virtual_refcount.
    	(struct acc_dispatch_t): Remove data_environ field.
    	(enum gomp_map_vars_kind): Add GOMP_MAP_VARS_OPENACC_ENTER_DATA.
    	(gomp_acc_insert_pointer): Remove prototype.
    	(gomp_acc_remove_pointer): Update prototype.
    	(struct gomp_coalesce_buf): Add forward declaration.
    	(gomp_map_val, gomp_attach_pointer, gomp_detach_pointer): Add
    	prototypes.
    	* libgomp.map (OACC_2.6): New section. Add acc_attach, acc_attach_async,
    	acc_detach, acc_detach_async, acc_detach_finalize,
    	acc_detach_finalize_async.
    	* oacc-async.c (goacc_remove_var_async): New function.
    	* oacc-host.c (host_dispatch): Don't initialise removed data_environ
    	field.
    	* oacc-init.c (acc_shutdown_1): Use gomp_remove_var instead of
    	gomp_unmap_vars to remove mappings by splay tree key instead of target
    	memory descriptor.
    	* oacc-int.h (splay_tree_key_s): Add forward declaration.
    	(goacc_remove_var_async): Add prototype.
    	* oacc-mem.c (lookup_dev_1): New function.
    	(lookup_dev): Reimplement using above.
    	(acc_free, acc_hostptr): Update calls to lookup_dev.
    	(acc_map_data): Likewise.  Don't add to data_environ list.
    	(acc_unmap_data): Remove call to gomp_unmap_vars.  Fix semantics to
    	remove mapping, but not mapped data.
    	(present_create_copy): Use virtual_refcount instead of
    	dynamic_refcount.  Don't manipulate data_environ.  Fix target pointer
    	return value.
    	(delete_copyout): Update for virtual_refcount semantics.  Use
    	goacc_remove_var_async for asynchronous delete/copyouts.
    	(gomp_acc_insert_pointer): Remove function.
    	(gomp_acc_remove_pointer): Reimplement.
    	(acc_attach_async, acc_attach, goacc_detach_internal, acc_detach)
    	(acc_detach_async, acc_detach_finalize, acc_detach_finalize_async): New
    	functions.
    	* oacc-parallel.c (find_pointer): Support attach/detach.  Make a little
    	more strict.
    	(GOACC_parallel_keyed): Use gomp_map_val to calculate device addresses.
    	(GOACC_enter_exit_data): Support attach/detach and GOMP_MAP_STRUCT.
    	Don't call gomp_acc_insert_pointer.
    	* openacc.h (acc_attach, acc_attach_async, acc_detach)
    	(acc_detach_async, acc_detach_finalize, acc_detach_finalize_async): Add
    	prototypes.
    	* target.c (gomp_map_vars_existing): Initialise do_detach field of
    	tgt_var_desc.
    	(gomp_attach_pointer, gomp_detach_pointer): New functions.
    	(gomp_map_val): Make global.
    	(gomp_map_vars_async): Handle GOMP_MAP_VARS_OPENACC_ENTER_DATA.  Update
    	for virtual_refcount semantics.  Support attach and detach.
    	(gomp_remove_var): Free attach count array if present.
    	(gomp_unmap_vars_async): Support detach and update for virtual_refcount
    	semantics.  Disambiguate link_key/attach_count using virtual_refcount
    	with magic value as a tag.
    	(gomp_load_image_to_device): Zero-initialise virtual_refcount fields.
    	(gomp_free_memmap): Remove function.
    	(gomp_exit_data): Check virtual_refcount for tag value before using
    	link_key.
    	(omp_target_associate_ptr): Zero-initialise virtual_refcount and
    	link_key splay tree key fields.
    	(gomp_target_init): Don't initialise removed data_environ field.
    	* testsuite/libgomp.oacc-c-c++-common/context-2.c: Use correct API to
    	deallocate acc_copyin'd data.
    	* testsuite/libgomp.oacc-c-c++-common/context-4.c: Likewise.
    	* testsuite/libgomp.oacc-c-c++-common/deep-copy-1.c: New test.
    	* testsuite/libgomp.oacc-c-c++-common/deep-copy-2.c: New test.
    	* testsuite/libgomp.oacc-c-c++-common/deep-copy-3.c: New test.
    	* testsuite/libgomp.oacc-c-c++-common/deep-copy-4.c: New test.
    	* testsuite/libgomp.oacc-c-c++-common/deep-copy-5.c: New test.
    	* testsuite/libgomp.oacc-c-c++-common/deep-copy-6.c: New test.
    	* testsuite/libgomp.oacc-c-c++-common/deep-copy-7.c: New test.
    	* testsuite/libgomp.oacc-c-c++-common/deep-copy-8.c: New test.
    	* testsuite/libgomp.oacc-fortran/deep-copy-1.c: New test.
    	* testsuite/libgomp.oacc-fortran/deep-copy-2.c: New test.
    	* testsuite/libgomp.oacc-fortran/deep-copy-3.c: New test.
    	* testsuite/libgomp.oacc-fortran/deep-copy-4.c: New test.
    	* testsuite/libgomp.oacc-fortran/deep-copy-5.c: New test.
    	* testsuite/libgomp.oacc-fortran/deep-copy-6.c: New test.
    	* testsuite/libgomp.oacc-fortran/deep-copy-7.c: New test.
    	* testsuite/libgomp.oacc-fortran/deep-copy-8.c: New test.
    	* testsuite/libgomp.oacc-fortran/data-2.f90: Update test.
    	* testsuite/libgomp.oacc-fortran/derived-type-1.f90: New test.
    	* testsuite/libgomp.oacc-fortran/update-2.f90: New test.

diff --git a/gcc/c-family/c-common.h b/gcc/c-family/c-common.h
index 4187343..0d831d8 100644
--- a/gcc/c-family/c-common.h
+++ b/gcc/c-family/c-common.h
@@ -1176,6 +1176,7 @@ extern void c_omp_split_clauses (location_t, enum tree_code, omp_clause_mask,
 extern tree c_omp_declare_simd_clauses_to_numbers (tree, tree);
 extern void c_omp_declare_simd_clauses_to_decls (tree, tree);
 extern enum omp_clause_default_kind c_omp_predetermined_sharing (tree);
+extern const char * c_omp_map_clause_name (tree, bool);
 
 /* Return next tree in the chain for chain_next walking of tree nodes.  */
 static inline tree
diff --git a/gcc/c-family/c-omp.c b/gcc/c-family/c-omp.c
index c7d4456..d6a06c2 100644
--- a/gcc/c-family/c-omp.c
+++ b/gcc/c-family/c-omp.c
@@ -2035,3 +2035,36 @@ c_omp_predetermined_sharing (tree decl)
 
   return OMP_CLAUSE_DEFAULT_UNSPECIFIED;
 }
+
+/* For OpenACC, the OMP_CLAUSE_MAP_KIND of an OMP_CLAUSE_MAP is used internally
+   to distinguish clauses as seen by the user.  Return the "friendly" clause
+   name for error messages etc., where possible.  See also
+   c/c-parser.c:c_parser_oacc_data_clause and
+   cp/parser.c:cp_parser_oacc_data_clause.  */
+
+const char *
+c_omp_map_clause_name (tree clause, bool oacc)
+{
+  if (oacc && OMP_CLAUSE_CODE (clause) == OMP_CLAUSE_MAP)
+    switch (OMP_CLAUSE_MAP_KIND (clause))
+    {
+    case GOMP_MAP_FORCE_ALLOC:
+    case GOMP_MAP_ALLOC: return "create";
+    case GOMP_MAP_FORCE_TO:
+    case GOMP_MAP_TO: return "copyin";
+    case GOMP_MAP_FORCE_FROM:
+    case GOMP_MAP_FROM: return "copyout";
+    case GOMP_MAP_FORCE_TOFROM:
+    case GOMP_MAP_TOFROM: return "copy";
+    case GOMP_MAP_RELEASE: return "delete";
+    case GOMP_MAP_FORCE_PRESENT: return "present";
+    case GOMP_MAP_ATTACH: return "attach";
+    case GOMP_MAP_FORCE_DETACH:
+    case GOMP_MAP_DETACH: return "detach";
+    case GOMP_MAP_DEVICE_RESIDENT: return "device_resident";
+    case GOMP_MAP_LINK: return "link";
+    case GOMP_MAP_FORCE_DEVICEPTR: return "deviceptr";
+    default:;
+    }
+  return omp_clause_code_name[OMP_CLAUSE_CODE (clause)];
+}
diff --git a/gcc/c-family/c-pragma.h b/gcc/c-family/c-pragma.h
index b781f73..dd8d807 100644
--- a/gcc/c-family/c-pragma.h
+++ b/gcc/c-family/c-pragma.h
@@ -136,11 +136,13 @@ enum pragma_omp_clause {
 
   /* Clauses for OpenACC.  */
   PRAGMA_OACC_CLAUSE_ASYNC,
+  PRAGMA_OACC_CLAUSE_ATTACH,
   PRAGMA_OACC_CLAUSE_AUTO,
   PRAGMA_OACC_CLAUSE_COPY,
   PRAGMA_OACC_CLAUSE_COPYOUT,
   PRAGMA_OACC_CLAUSE_CREATE,
   PRAGMA_OACC_CLAUSE_DELETE,
+  PRAGMA_OACC_CLAUSE_DETACH,
   PRAGMA_OACC_CLAUSE_DEVICEPTR,
   PRAGMA_OACC_CLAUSE_DEVICE_RESIDENT,
   PRAGMA_OACC_CLAUSE_FINALIZE,
diff --git a/gcc/c/c-parser.c b/gcc/c/c-parser.c
index fef8594..302ff6e 100644
--- a/gcc/c/c-parser.c
+++ b/gcc/c/c-parser.c
@@ -11599,6 +11599,8 @@ c_parser_omp_clause_name (c_parser *parser)
 	    result = PRAGMA_OMP_CLAUSE_ALIGNED;
 	  else if (!strcmp ("async", p))
 	    result = PRAGMA_OACC_CLAUSE_ASYNC;
+	  else if (!strcmp ("attach", p))
+	    result = PRAGMA_OACC_CLAUSE_ATTACH;
 	  break;
 	case 'c':
 	  if (!strcmp ("collapse", p))
@@ -11621,6 +11623,8 @@ c_parser_omp_clause_name (c_parser *parser)
 	    result = PRAGMA_OACC_CLAUSE_DELETE;
 	  else if (!strcmp ("depend", p))
 	    result = PRAGMA_OMP_CLAUSE_DEPEND;
+	  else if (!strcmp ("detach", p))
+	    result = PRAGMA_OACC_CLAUSE_DETACH;
 	  else if (!strcmp ("device", p))
 	    result = PRAGMA_OMP_CLAUSE_DEVICE;
 	  else if (!strcmp ("deviceptr", p))
@@ -11865,12 +11869,16 @@ c_parser_oacc_wait_list (c_parser *parser, location_t clause_loc, tree list)
    If KIND is nonzero, CLAUSE_LOC is the location of the clause.
 
    If KIND is zero, create a TREE_LIST with the decl in TREE_PURPOSE;
-   return the list created.  */
+   return the list created.
+
+   The optional ALLOW_DEREF argument is true if list items can use the deref
+   (->) operator.  */
 
 static tree
 c_parser_omp_variable_list (c_parser *parser,
 			    location_t clause_loc,
-			    enum omp_clause_code kind, tree list)
+			    enum omp_clause_code kind, tree list,
+			    bool allow_deref = false)
 {
   auto_vec<c_token> tokens;
   unsigned int tokens_avail = 0;
@@ -11983,9 +11991,13 @@ c_parser_omp_variable_list (c_parser *parser,
 	    case OMP_CLAUSE_MAP:
 	    case OMP_CLAUSE_FROM:
 	    case OMP_CLAUSE_TO:
-	      while (c_parser_next_token_is (parser, CPP_DOT))
+	      while (c_parser_next_token_is (parser, CPP_DOT)
+		     || (allow_deref
+			 && c_parser_next_token_is (parser, CPP_DEREF)))
 		{
 		  location_t op_loc = c_parser_peek_token (parser)->location;
+		  if (c_parser_next_token_is (parser, CPP_DEREF))
+		    t = build_simple_mem_ref (t);
 		  c_parser_consume_token (parser);
 		  if (!c_parser_next_token_is (parser, CPP_NAME))
 		    {
@@ -12106,11 +12118,12 @@ c_parser_omp_variable_list (c_parser *parser,
 }
 
 /* Similarly, but expect leading and trailing parenthesis.  This is a very
-   common case for OpenACC and OpenMP clauses.  */
+   common case for OpenACC and OpenMP clauses.  The optional ALLOW_DEREF
+   argument is true if list items can use the deref (->) operator.  */
 
 static tree
 c_parser_omp_var_list_parens (c_parser *parser, enum omp_clause_code kind,
-			      tree list)
+			      tree list, bool allow_deref = false)
 {
   /* The clauses location.  */
   location_t loc = c_parser_peek_token (parser)->location;
@@ -12118,18 +12131,20 @@ c_parser_omp_var_list_parens (c_parser *parser, enum omp_clause_code kind,
   matching_parens parens;
   if (parens.require_open (parser))
     {
-      list = c_parser_omp_variable_list (parser, loc, kind, list);
+      list = c_parser_omp_variable_list (parser, loc, kind, list, allow_deref);
       parens.skip_until_found_close (parser);
     }
   return list;
 }
 
-/* OpenACC 2.0:
+/* OpenACC 2.5:
+   attach ( variable-list )
    copy ( variable-list )
    copyin ( variable-list )
    copyout ( variable-list )
    create ( variable-list )
    delete ( variable-list )
+   detach ( variable-list )
    present ( variable-list ) */
 
 static tree
@@ -12139,6 +12154,9 @@ c_parser_oacc_data_clause (c_parser *parser, pragma_omp_clause c_kind,
   enum gomp_map_kind kind;
   switch (c_kind)
     {
+    case PRAGMA_OACC_CLAUSE_ATTACH:
+      kind = GOMP_MAP_ATTACH;
+      break;
     case PRAGMA_OACC_CLAUSE_COPY:
       kind = GOMP_MAP_TOFROM;
       break;
@@ -12154,6 +12172,9 @@ c_parser_oacc_data_clause (c_parser *parser, pragma_omp_clause c_kind,
     case PRAGMA_OACC_CLAUSE_DELETE:
       kind = GOMP_MAP_RELEASE;
       break;
+    case PRAGMA_OACC_CLAUSE_DETACH:
+      kind = GOMP_MAP_DETACH;
+      break;
     case PRAGMA_OACC_CLAUSE_DEVICE:
       kind = GOMP_MAP_FORCE_TO;
       break;
@@ -12173,7 +12194,7 @@ c_parser_oacc_data_clause (c_parser *parser, pragma_omp_clause c_kind,
       gcc_unreachable ();
     }
   tree nl, c;
-  nl = c_parser_omp_var_list_parens (parser, OMP_CLAUSE_MAP, list);
+  nl = c_parser_omp_var_list_parens (parser, OMP_CLAUSE_MAP, list, true);
 
   for (c = nl; c != list; c = OMP_CLAUSE_CHAIN (c))
     OMP_CLAUSE_SET_MAP_KIND (c, kind);
@@ -14757,6 +14778,10 @@ c_parser_oacc_all_clauses (c_parser *parser, omp_clause_mask mask,
 						clauses);
 	  c_name = "auto";
 	  break;
+	case PRAGMA_OACC_CLAUSE_ATTACH:
+	  clauses = c_parser_oacc_data_clause (parser, c_kind, clauses);
+	  c_name = "attach";
+	  break;
 	case PRAGMA_OACC_CLAUSE_COLLAPSE:
 	  clauses = c_parser_omp_clause_collapse (parser, clauses);
 	  c_name = "collapse";
@@ -14785,6 +14810,10 @@ c_parser_oacc_all_clauses (c_parser *parser, omp_clause_mask mask,
 	  clauses = c_parser_omp_clause_default (parser, clauses, true);
 	  c_name = "default";
 	  break;
+	case PRAGMA_OACC_CLAUSE_DETACH:
+	  clauses = c_parser_oacc_data_clause (parser, c_kind, clauses);
+	  c_name = "detach";
+	  break;
 	case PRAGMA_OACC_CLAUSE_DEVICE:
 	  clauses = c_parser_oacc_data_clause (parser, c_kind, clauses);
 	  c_name = "device";
@@ -15263,7 +15292,8 @@ c_parser_oacc_cache (location_t loc, c_parser *parser)
 */
 
 #define OACC_DATA_CLAUSE_MASK						\
-	( (OMP_CLAUSE_MASK_1 << PRAGMA_OACC_CLAUSE_COPY)		\
+	( (OMP_CLAUSE_MASK_1 << PRAGMA_OACC_CLAUSE_ATTACH)		\
+	| (OMP_CLAUSE_MASK_1 << PRAGMA_OACC_CLAUSE_COPY)		\
 	| (OMP_CLAUSE_MASK_1 << PRAGMA_OACC_CLAUSE_COPYIN)		\
 	| (OMP_CLAUSE_MASK_1 << PRAGMA_OACC_CLAUSE_COPYOUT)		\
 	| (OMP_CLAUSE_MASK_1 << PRAGMA_OACC_CLAUSE_CREATE)		\
@@ -15446,6 +15476,7 @@ c_parser_oacc_declare (c_parser *parser)
 #define OACC_ENTER_DATA_CLAUSE_MASK					\
 	( (OMP_CLAUSE_MASK_1 << PRAGMA_OACC_CLAUSE_IF)			\
 	| (OMP_CLAUSE_MASK_1 << PRAGMA_OACC_CLAUSE_ASYNC)		\
+	| (OMP_CLAUSE_MASK_1 << PRAGMA_OACC_CLAUSE_ATTACH)		\
 	| (OMP_CLAUSE_MASK_1 << PRAGMA_OACC_CLAUSE_COPYIN)		\
 	| (OMP_CLAUSE_MASK_1 << PRAGMA_OACC_CLAUSE_CREATE)		\
 	| (OMP_CLAUSE_MASK_1 << PRAGMA_OACC_CLAUSE_WAIT) )
@@ -15455,6 +15486,7 @@ c_parser_oacc_declare (c_parser *parser)
 	| (OMP_CLAUSE_MASK_1 << PRAGMA_OACC_CLAUSE_ASYNC)		\
 	| (OMP_CLAUSE_MASK_1 << PRAGMA_OACC_CLAUSE_COPYOUT)		\
 	| (OMP_CLAUSE_MASK_1 << PRAGMA_OACC_CLAUSE_DELETE) 		\
+	| (OMP_CLAUSE_MASK_1 << PRAGMA_OACC_CLAUSE_DETACH) 		\
 	| (OMP_CLAUSE_MASK_1 << PRAGMA_OACC_CLAUSE_FINALIZE) 		\
 	| (OMP_CLAUSE_MASK_1 << PRAGMA_OACC_CLAUSE_WAIT) )
 
@@ -15589,6 +15621,7 @@ c_parser_oacc_loop (location_t loc, c_parser *parser, char *p_name,
 
 #define OACC_KERNELS_CLAUSE_MASK					\
 	( (OMP_CLAUSE_MASK_1 << PRAGMA_OACC_CLAUSE_ASYNC)		\
+	| (OMP_CLAUSE_MASK_1 << PRAGMA_OACC_CLAUSE_ATTACH)		\
 	| (OMP_CLAUSE_MASK_1 << PRAGMA_OACC_CLAUSE_COPY)		\
 	| (OMP_CLAUSE_MASK_1 << PRAGMA_OACC_CLAUSE_COPYIN)		\
 	| (OMP_CLAUSE_MASK_1 << PRAGMA_OACC_CLAUSE_COPYOUT)		\
@@ -15604,6 +15637,7 @@ c_parser_oacc_loop (location_t loc, c_parser *parser, char *p_name,
 
 #define OACC_PARALLEL_CLAUSE_MASK					\
 	( (OMP_CLAUSE_MASK_1 << PRAGMA_OACC_CLAUSE_ASYNC)		\
+	| (OMP_CLAUSE_MASK_1 << PRAGMA_OACC_CLAUSE_ATTACH)		\
 	| (OMP_CLAUSE_MASK_1 << PRAGMA_OACC_CLAUSE_COPY)		\
 	| (OMP_CLAUSE_MASK_1 << PRAGMA_OACC_CLAUSE_COPYIN)		\
 	| (OMP_CLAUSE_MASK_1 << PRAGMA_OACC_CLAUSE_COPYOUT)		\
diff --git a/gcc/c/c-typeck.c b/gcc/c/c-typeck.c
index 1a89727..88c15b2 100644
--- a/gcc/c/c-typeck.c
+++ b/gcc/c/c-typeck.c
@@ -12879,7 +12879,6 @@ handle_omp_array_sections_1 (tree c, tree t, vec<tree> &types,
 	  return error_mark_node;
 	}
       if (TREE_CODE (t) == COMPONENT_REF
-	  && ort == C_ORT_OMP
 	  && (OMP_CLAUSE_CODE (c) == OMP_CLAUSE_MAP
 	      || OMP_CLAUSE_CODE (c) == OMP_CLAUSE_TO
 	      || OMP_CLAUSE_CODE (c) == OMP_CLAUSE_FROM))
@@ -12901,6 +12900,15 @@ handle_omp_array_sections_1 (tree c, tree t, vec<tree> &types,
 		}
 	      t = TREE_OPERAND (t, 0);
 	    }
+	  if (ort == C_ORT_ACC && TREE_CODE (t) == MEM_REF)
+	    {
+	      if (maybe_ne (mem_ref_offset (t), 0))
+	        error_at (OMP_CLAUSE_LOCATION (c),
+			  "cannot dereference %qE in %qs clause", t,
+			  c_omp_map_clause_name (c, true));
+	      else
+		t = TREE_OPERAND (t, 0);
+	    }
 	}
       if (!VAR_P (t) && TREE_CODE (t) != PARM_DECL)
 	{
@@ -12985,7 +12993,18 @@ handle_omp_array_sections_1 (tree c, tree t, vec<tree> &types,
     length = fold_convert (sizetype, length);
   if (low_bound == NULL_TREE)
     low_bound = integer_zero_node;
-
+  if (OMP_CLAUSE_CODE (c) == OMP_CLAUSE_MAP
+      && (OMP_CLAUSE_MAP_KIND (c) == GOMP_MAP_ATTACH
+	  || OMP_CLAUSE_MAP_KIND (c) == GOMP_MAP_DETACH))
+    {
+      if (length != integer_one_node)
+	{
+	  error_at (OMP_CLAUSE_LOCATION (c),
+		    "expected single pointer in %qs clause",
+		    c_omp_map_clause_name (c, ort == C_ORT_ACC));
+	  return error_mark_node;
+	}
+    }
   if (length != NULL_TREE)
     {
       if (!integer_nonzerop (length))
@@ -13662,6 +13681,35 @@ c_omp_finish_iterators (tree iter)
   return ret;
 }
 
+/* Ensure that pointers are used in OpenACC attach and detach clauses.
+   Return true if an error has been detected.  */
+
+static bool
+c_oacc_check_attachments (tree c)
+{
+  if (OMP_CLAUSE_CODE (c) != OMP_CLAUSE_MAP)
+    return false;
+
+  /* OpenACC attach / detach clauses must be pointers.  */
+  if (OMP_CLAUSE_MAP_KIND (c) == GOMP_MAP_ATTACH
+      || OMP_CLAUSE_MAP_KIND (c) == GOMP_MAP_DETACH)
+    {
+      tree t = OMP_CLAUSE_DECL (c);
+
+      while (TREE_CODE (t) == TREE_LIST)
+	t = TREE_CHAIN (t);
+
+      if (TREE_CODE (TREE_TYPE (t)) != POINTER_TYPE)
+	{
+	  error_at (OMP_CLAUSE_LOCATION (c), "expected pointer in %qs clause",
+		    c_omp_map_clause_name (c, true));
+	  return true;
+	}
+    }
+
+  return false;
+}
+
 /* For all elements of CLAUSES, validate them against their constraints.
    Remove any elements from the list that are invalid.  */
 
@@ -14386,6 +14434,8 @@ c_finish_omp_clauses (tree clauses, enum c_omp_region_type ort)
 			}
 		    }
 		}
+	      if (c_oacc_check_attachments (c))
+		remove = true;
 	      break;
 	    }
 	  if (t == error_mark_node)
@@ -14393,8 +14443,13 @@ c_finish_omp_clauses (tree clauses, enum c_omp_region_type ort)
 	      remove = true;
 	      break;
 	    }
+	  /* OpenACC attach / detach clauses must be pointers.  */
+	  if (c_oacc_check_attachments (c))
+	    {
+	      remove = true;
+	      break;
+	    }
 	  if (TREE_CODE (t) == COMPONENT_REF
-	      && (ort & C_ORT_OMP)
 	      && OMP_CLAUSE_CODE (c) != OMP_CLAUSE__CACHE_)
 	    {
 	      if (DECL_BIT_FIELD (TREE_OPERAND (t, 1)))
@@ -14432,6 +14487,15 @@ c_finish_omp_clauses (tree clauses, enum c_omp_region_type ort)
 		}
 	      if (remove)
 		break;
+	      if (ort == C_ORT_ACC && TREE_CODE (t) == MEM_REF)
+	        {
+		  if (maybe_ne (mem_ref_offset (t), 0))
+	            error_at (OMP_CLAUSE_LOCATION (c),
+			      "cannot dereference %qE in %qs clause", t,
+			      c_omp_map_clause_name (c, true));
+		  else
+		    t = TREE_OPERAND (t, 0);
+		}
 	      if (VAR_P (t) || TREE_CODE (t) == PARM_DECL)
 		{
 		  if (bitmap_bit_p (&map_field_head, DECL_UID (t)))
diff --git a/gcc/cp/parser.c b/gcc/cp/parser.c
index 8223a70..4bb8d05 100644
--- a/gcc/cp/parser.c
+++ b/gcc/cp/parser.c
@@ -31885,6 +31885,8 @@ cp_parser_omp_clause_name (cp_parser *parser)
 	    result = PRAGMA_OMP_CLAUSE_ALIGNED;
 	  else if (!strcmp ("async", p))
 	    result = PRAGMA_OACC_CLAUSE_ASYNC;
+	  else if (!strcmp ("attach", p))
+	    result = PRAGMA_OACC_CLAUSE_ATTACH;
 	  break;
 	case 'c':
 	  if (!strcmp ("collapse", p))
@@ -31905,6 +31907,8 @@ cp_parser_omp_clause_name (cp_parser *parser)
 	    result = PRAGMA_OMP_CLAUSE_DEFAULTMAP;
 	  else if (!strcmp ("depend", p))
 	    result = PRAGMA_OMP_CLAUSE_DEPEND;
+	  else if (!strcmp ("detach", p))
+	    result = PRAGMA_OACC_CLAUSE_DETACH;
 	  else if (!strcmp ("device", p))
 	    result = PRAGMA_OMP_CLAUSE_DEVICE;
 	  else if (!strcmp ("deviceptr", p))
@@ -32107,11 +32111,15 @@ check_no_duplicate_clause (tree clauses, enum omp_clause_code code,
 
    COLON can be NULL if only closing parenthesis should end the list,
    or pointer to bool which will receive false if the list is terminated
-   by closing parenthesis or true if the list is terminated by colon.  */
+   by closing parenthesis or true if the list is terminated by colon.
+
+   The optional ALLOW_DEREF argument is true if list items can use the deref
+   (->) operator.  */
 
 static tree
 cp_parser_omp_var_list_no_open (cp_parser *parser, enum omp_clause_code kind,
-				tree list, bool *colon)
+				tree list, bool *colon,
+				bool allow_deref = false)
 {
   cp_token *token;
   bool saved_colon_corrects_to_scope_p = parser->colon_corrects_to_scope_p;
@@ -32184,15 +32192,20 @@ cp_parser_omp_var_list_no_open (cp_parser *parser, enum omp_clause_code kind,
 	    case OMP_CLAUSE_MAP:
 	    case OMP_CLAUSE_FROM:
 	    case OMP_CLAUSE_TO:
-	      while (cp_lexer_next_token_is (parser->lexer, CPP_DOT))
+	      while (cp_lexer_next_token_is (parser->lexer, CPP_DOT)
+		     || (allow_deref
+			 && cp_lexer_next_token_is (parser->lexer, CPP_DEREF)))
 		{
+		  cpp_ttype ttype
+		    = cp_lexer_next_token_is (parser->lexer, CPP_DOT)
+		      ? CPP_DOT : CPP_DEREF;
 		  location_t loc
 		    = cp_lexer_peek_token (parser->lexer)->location;
 		  cp_id_kind idk = CP_ID_KIND_NONE;
 		  cp_lexer_consume_token (parser->lexer);
 		  decl = convert_from_reference (decl);
 		  decl
-		    = cp_parser_postfix_dot_deref_expression (parser, CPP_DOT,
+		    = cp_parser_postfix_dot_deref_expression (parser, ttype,
 							      decl, false,
 							      &idk, loc);
 		}
@@ -32310,19 +32323,23 @@ cp_parser_omp_var_list_no_open (cp_parser *parser, enum omp_clause_code kind,
    common case for omp clauses.  */
 
 static tree
-cp_parser_omp_var_list (cp_parser *parser, enum omp_clause_code kind, tree list)
+cp_parser_omp_var_list (cp_parser *parser, enum omp_clause_code kind, tree list,
+			bool allow_deref = false)
 {
   if (cp_parser_require (parser, CPP_OPEN_PAREN, RT_OPEN_PAREN))
-    return cp_parser_omp_var_list_no_open (parser, kind, list, NULL);
+    return cp_parser_omp_var_list_no_open (parser, kind, list, NULL,
+					   allow_deref);
   return list;
 }
 
-/* OpenACC 2.0:
+/* OpenACC 2.5:
+   attach ( variable-list )
    copy ( variable-list )
    copyin ( variable-list )
    copyout ( variable-list )
    create ( variable-list )
    delete ( variable-list )
+   detach ( variable-list )
    present ( variable-list ) */
 
 static tree
@@ -32332,6 +32349,9 @@ cp_parser_oacc_data_clause (cp_parser *parser, pragma_omp_clause c_kind,
   enum gomp_map_kind kind;
   switch (c_kind)
     {
+    case PRAGMA_OACC_CLAUSE_ATTACH:
+      kind = GOMP_MAP_ATTACH;
+      break;
     case PRAGMA_OACC_CLAUSE_COPY:
       kind = GOMP_MAP_TOFROM;
       break;
@@ -32347,6 +32367,9 @@ cp_parser_oacc_data_clause (cp_parser *parser, pragma_omp_clause c_kind,
     case PRAGMA_OACC_CLAUSE_DELETE:
       kind = GOMP_MAP_RELEASE;
       break;
+    case PRAGMA_OACC_CLAUSE_DETACH:
+      kind = GOMP_MAP_DETACH;
+      break;
     case PRAGMA_OACC_CLAUSE_DEVICE:
       kind = GOMP_MAP_FORCE_TO;
       break;
@@ -32366,7 +32389,7 @@ cp_parser_oacc_data_clause (cp_parser *parser, pragma_omp_clause c_kind,
       gcc_unreachable ();
     }
   tree nl, c;
-  nl = cp_parser_omp_var_list (parser, OMP_CLAUSE_MAP, list);
+  nl = cp_parser_omp_var_list (parser, OMP_CLAUSE_MAP, list, true);
 
   for (c = nl; c != list; c = OMP_CLAUSE_CHAIN (c))
     OMP_CLAUSE_SET_MAP_KIND (c, kind);
@@ -34701,6 +34724,10 @@ cp_parser_oacc_all_clauses (cp_parser *parser, omp_clause_mask mask,
 						 clauses, here);
 	  c_name = "auto";
 	  break;
+	case PRAGMA_OACC_CLAUSE_ATTACH:
+	  clauses = cp_parser_oacc_data_clause (parser, c_kind, clauses);
+	  c_name = "attach";
+	  break;
 	case PRAGMA_OACC_CLAUSE_COLLAPSE:
 	  clauses = cp_parser_omp_clause_collapse (parser, clauses, here);
 	  c_name = "collapse";
@@ -34729,6 +34756,10 @@ cp_parser_oacc_all_clauses (cp_parser *parser, omp_clause_mask mask,
 	  clauses = cp_parser_omp_clause_default (parser, clauses, here, true);
 	  c_name = "default";
 	  break;
+	case PRAGMA_OACC_CLAUSE_DETACH:
+	  clauses = cp_parser_oacc_data_clause (parser, c_kind, clauses);
+	  c_name = "detach";
+	  break;
 	case PRAGMA_OACC_CLAUSE_DEVICE:
 	  clauses = cp_parser_oacc_data_clause (parser, c_kind, clauses);
 	  c_name = "device";
@@ -38370,10 +38401,12 @@ cp_parser_oacc_cache (cp_parser *parser, cp_token *pragma_tok)
      structured-block  */
 
 #define OACC_DATA_CLAUSE_MASK						\
-	( (OMP_CLAUSE_MASK_1 << PRAGMA_OACC_CLAUSE_COPY)		\
+	( (OMP_CLAUSE_MASK_1 << PRAGMA_OACC_CLAUSE_ATTACH)		\
+	| (OMP_CLAUSE_MASK_1 << PRAGMA_OACC_CLAUSE_COPY)		\
 	| (OMP_CLAUSE_MASK_1 << PRAGMA_OACC_CLAUSE_COPYIN)		\
 	| (OMP_CLAUSE_MASK_1 << PRAGMA_OACC_CLAUSE_COPYOUT)		\
 	| (OMP_CLAUSE_MASK_1 << PRAGMA_OACC_CLAUSE_CREATE)		\
+	| (OMP_CLAUSE_MASK_1 << PRAGMA_OACC_CLAUSE_DETACH)		\
 	| (OMP_CLAUSE_MASK_1 << PRAGMA_OACC_CLAUSE_DEVICEPTR)		\
 	| (OMP_CLAUSE_MASK_1 << PRAGMA_OACC_CLAUSE_IF)			\
 	| (OMP_CLAUSE_MASK_1 << PRAGMA_OACC_CLAUSE_PRESENT) )
@@ -38573,6 +38606,7 @@ cp_parser_oacc_declare (cp_parser *parser, cp_token *pragma_tok)
 
 #define OACC_ENTER_DATA_CLAUSE_MASK					\
 	( (OMP_CLAUSE_MASK_1 << PRAGMA_OACC_CLAUSE_IF)			\
+	| (OMP_CLAUSE_MASK_1 << PRAGMA_OACC_CLAUSE_ATTACH)		\
 	| (OMP_CLAUSE_MASK_1 << PRAGMA_OACC_CLAUSE_ASYNC)		\
 	| (OMP_CLAUSE_MASK_1 << PRAGMA_OACC_CLAUSE_COPYIN)		\
 	| (OMP_CLAUSE_MASK_1 << PRAGMA_OACC_CLAUSE_CREATE)		\
@@ -38583,6 +38617,7 @@ cp_parser_oacc_declare (cp_parser *parser, cp_token *pragma_tok)
 	| (OMP_CLAUSE_MASK_1 << PRAGMA_OACC_CLAUSE_ASYNC)		\
 	| (OMP_CLAUSE_MASK_1 << PRAGMA_OACC_CLAUSE_COPYOUT)		\
 	| (OMP_CLAUSE_MASK_1 << PRAGMA_OACC_CLAUSE_DELETE) 		\
+	| (OMP_CLAUSE_MASK_1 << PRAGMA_OACC_CLAUSE_DETACH)		\
 	| (OMP_CLAUSE_MASK_1 << PRAGMA_OACC_CLAUSE_FINALIZE) 		\
 	| (OMP_CLAUSE_MASK_1 << PRAGMA_OACC_CLAUSE_WAIT) )
 
@@ -38686,6 +38721,7 @@ cp_parser_oacc_loop (cp_parser *parser, cp_token *pragma_tok, char *p_name,
 
 #define OACC_KERNELS_CLAUSE_MASK					\
 	( (OMP_CLAUSE_MASK_1 << PRAGMA_OACC_CLAUSE_ASYNC)		\
+	| (OMP_CLAUSE_MASK_1 << PRAGMA_OACC_CLAUSE_ATTACH)		\
 	| (OMP_CLAUSE_MASK_1 << PRAGMA_OACC_CLAUSE_COPY)		\
 	| (OMP_CLAUSE_MASK_1 << PRAGMA_OACC_CLAUSE_COPYIN)		\
 	| (OMP_CLAUSE_MASK_1 << PRAGMA_OACC_CLAUSE_COPYOUT)		\
@@ -38701,6 +38737,7 @@ cp_parser_oacc_loop (cp_parser *parser, cp_token *pragma_tok, char *p_name,
 
 #define OACC_PARALLEL_CLAUSE_MASK					\
 	( (OMP_CLAUSE_MASK_1 << PRAGMA_OACC_CLAUSE_ASYNC)		\
+	| (OMP_CLAUSE_MASK_1 << PRAGMA_OACC_CLAUSE_ATTACH)		\
 	| (OMP_CLAUSE_MASK_1 << PRAGMA_OACC_CLAUSE_COPY)		\
 	| (OMP_CLAUSE_MASK_1 << PRAGMA_OACC_CLAUSE_COPYIN)		\
 	| (OMP_CLAUSE_MASK_1 << PRAGMA_OACC_CLAUSE_COPYOUT)		\
diff --git a/gcc/cp/semantics.c b/gcc/cp/semantics.c
index c1240cc..a9d5015 100644
--- a/gcc/cp/semantics.c
+++ b/gcc/cp/semantics.c
@@ -4558,7 +4558,6 @@ handle_omp_array_sections_1 (tree c, tree t, vec<tree> &types,
 	t = TREE_OPERAND (t, 0);
       ret = t;
       if (TREE_CODE (t) == COMPONENT_REF
-	  && ort == C_ORT_OMP
 	  && (OMP_CLAUSE_CODE (c) == OMP_CLAUSE_MAP
 	      || OMP_CLAUSE_CODE (c) == OMP_CLAUSE_TO
 	      || OMP_CLAUSE_CODE (c) == OMP_CLAUSE_FROM)
@@ -4681,6 +4680,18 @@ handle_omp_array_sections_1 (tree c, tree t, vec<tree> &types,
   if (low_bound == NULL_TREE)
     low_bound = integer_zero_node;
 
+  if (OMP_CLAUSE_CODE (c) == OMP_CLAUSE_MAP
+      && (OMP_CLAUSE_MAP_KIND (c) == GOMP_MAP_ATTACH
+	  || OMP_CLAUSE_MAP_KIND (c) == GOMP_MAP_DETACH))
+    {
+      if (length != integer_one_node)
+	{
+	  error_at (OMP_CLAUSE_LOCATION (c),
+		    "expected single pointer in %qs clause",
+		    c_omp_map_clause_name (c, ort == C_ORT_ACC));
+	  return error_mark_node;
+	}
+    }
   if (length != NULL_TREE)
     {
       if (!integer_nonzerop (length))
@@ -6038,6 +6049,41 @@ cp_omp_finish_iterators (tree iter)
   return ret;
 }
 
+/* Ensure that pointers are used in OpenACC attach and detach clauses.
+   Return true if an error has been detected.  */
+
+static bool
+cp_oacc_check_attachments (tree c)
+{
+  if (OMP_CLAUSE_CODE (c) != OMP_CLAUSE_MAP)
+    return false;
+
+  /* OpenACC attach / detach clauses must be pointers.  */
+  if (OMP_CLAUSE_MAP_KIND (c) == GOMP_MAP_ATTACH
+      || OMP_CLAUSE_MAP_KIND (c) == GOMP_MAP_DETACH)
+    {
+      tree t = OMP_CLAUSE_DECL (c);
+      tree type;
+
+      while (TREE_CODE (t) == TREE_LIST)
+	t = TREE_CHAIN (t);
+
+      type = TREE_TYPE (t);
+
+      if (TREE_CODE (type) == REFERENCE_TYPE)
+	type = TREE_TYPE (type);
+
+      if (TREE_CODE (type) != POINTER_TYPE)
+	{
+	  error_at (OMP_CLAUSE_LOCATION (c), "expected pointer in %qs clause",
+		    c_omp_map_clause_name (c, true));
+	  return true;
+	}
+    }
+
+  return false;
+}
+
 /* For all elements of CLAUSES, validate them vs OpenMP constraints.
    Remove any elements from the list that are invalid.  */
 
@@ -6278,7 +6324,7 @@ finish_omp_clauses (tree clauses, enum c_omp_region_type ort)
 	    t = OMP_CLAUSE_DECL (c);
 	check_dup_generic_t:
 	  if (t == current_class_ptr
-	      && (ort != C_ORT_OMP_DECLARE_SIMD
+	      && ((ort != C_ORT_OMP_DECLARE_SIMD && ort != C_ORT_ACC)
 		  || (OMP_CLAUSE_CODE (c) != OMP_CLAUSE_LINEAR
 		      && OMP_CLAUSE_CODE (c) != OMP_CLAUSE_UNIFORM)))
 	    {
@@ -6342,8 +6388,7 @@ finish_omp_clauses (tree clauses, enum c_omp_region_type ort)
 	handle_field_decl:
 	  if (!remove
 	      && TREE_CODE (t) == FIELD_DECL
-	      && t == OMP_CLAUSE_DECL (c)
-	      && ort != C_ORT_ACC)
+	      && t == OMP_CLAUSE_DECL (c))
 	    {
 	      OMP_CLAUSE_DECL (c)
 		= omp_privatize_field (t, (OMP_CLAUSE_CODE (c)
@@ -6410,7 +6455,7 @@ finish_omp_clauses (tree clauses, enum c_omp_region_type ort)
 	    omp_note_field_privatization (t, OMP_CLAUSE_DECL (c));
 	  else
 	    t = OMP_CLAUSE_DECL (c);
-	  if (t == current_class_ptr)
+	  if (ort != C_ORT_ACC && t == current_class_ptr)
 	    {
 	      error_at (OMP_CLAUSE_LOCATION (c),
 			"%<this%> allowed in OpenMP only in %<declare simd%>"
@@ -6897,7 +6942,7 @@ finish_omp_clauses (tree clauses, enum c_omp_region_type ort)
 	    }
 	  if (t == error_mark_node)
 	    remove = true;
-	  else if (t == current_class_ptr)
+	  else if (ort != C_ORT_ACC && t == current_class_ptr)
 	    {
 	      error_at (OMP_CLAUSE_LOCATION (c),
 			"%<this%> allowed in OpenMP only in %<declare simd%>"
@@ -7027,6 +7072,8 @@ finish_omp_clauses (tree clauses, enum c_omp_region_type ort)
 			}
 		    }
 		}
+	      if (cp_oacc_check_attachments (c))
+		remove = true;
 	      break;
 	    }
 	  if (t == error_mark_node)
@@ -7034,14 +7081,25 @@ finish_omp_clauses (tree clauses, enum c_omp_region_type ort)
 	      remove = true;
 	      break;
 	    }
+	  /* OpenACC attach / detach clauses must be pointers.  */
+	  if (cp_oacc_check_attachments (c))
+	    {
+	      remove = true;
+	      break;
+	    }
 	  if (REFERENCE_REF_P (t)
 	      && TREE_CODE (TREE_OPERAND (t, 0)) == COMPONENT_REF)
 	    {
 	      t = TREE_OPERAND (t, 0);
 	      OMP_CLAUSE_DECL (c) = t;
 	    }
+	  if (ort == C_ORT_ACC
+	      && TREE_CODE (t) == COMPONENT_REF
+	      && TREE_CODE (TREE_OPERAND (t, 0)) == INDIRECT_REF)
+	    t = TREE_OPERAND (TREE_OPERAND (t, 0), 0);
 	  if (TREE_CODE (t) == COMPONENT_REF
-	      && (ort & C_ORT_OMP_DECLARE_SIMD) == C_ORT_OMP
+	      && ((ort & C_ORT_OMP_DECLARE_SIMD) == C_ORT_OMP
+		  || ort == C_ORT_ACC)
 	      && OMP_CLAUSE_CODE (c) != OMP_CLAUSE__CACHE_)
 	    {
 	      if (type_dependent_expression_p (t))
diff --git a/gcc/fortran/gfortran.h b/gcc/fortran/gfortran.h
index 4dd6298..96bed48 100644
--- a/gcc/fortran/gfortran.h
+++ b/gcc/fortran/gfortran.h
@@ -1175,10 +1175,12 @@ enum gfc_omp_depend_op
 enum gfc_omp_map_op
 {
   OMP_MAP_ALLOC,
+  OMP_MAP_ATTACH,
   OMP_MAP_TO,
   OMP_MAP_FROM,
   OMP_MAP_TOFROM,
   OMP_MAP_DELETE,
+  OMP_MAP_DETACH,
   OMP_MAP_FORCE_ALLOC,
   OMP_MAP_FORCE_TO,
   OMP_MAP_FORCE_FROM,
diff --git a/gcc/fortran/openmp.c b/gcc/fortran/openmp.c
index e1560c1..d075a69 100644
--- a/gcc/fortran/openmp.c
+++ b/gcc/fortran/openmp.c
@@ -222,7 +222,8 @@ static match
 gfc_match_omp_variable_list (const char *str, gfc_omp_namelist **list,
 			     bool allow_common, bool *end_colon = NULL,
 			     gfc_omp_namelist ***headp = NULL,
-			     bool allow_sections = false)
+			     bool allow_sections = false,
+			     bool allow_derived = false)
 {
   gfc_omp_namelist *head, *tail, *p;
   locus old_loc, cur_loc;
@@ -248,7 +249,8 @@ gfc_match_omp_variable_list (const char *str, gfc_omp_namelist **list,
 	case MATCH_YES:
 	  gfc_expr *expr;
 	  expr = NULL;
-	  if (allow_sections && gfc_peek_ascii_char () == '(')
+	  if ((allow_sections && gfc_peek_ascii_char () == '(')
+	      || (allow_derived && gfc_peek_ascii_char () == '%'))
 	    {
 	      gfc_current_locus = cur_loc;
 	      m = gfc_match_variable (&expr, 0);
@@ -785,7 +787,7 @@ enum omp_mask1
   OMP_MASK1_LAST
 };
 
-/* OpenACC 2.0 specific clauses. */
+/* OpenACC 2.0+ specific clauses. */
 enum omp_mask2
 {
   OMP_CLAUSE_ASYNC,
@@ -811,6 +813,8 @@ enum omp_mask2
   OMP_CLAUSE_TILE,
   OMP_CLAUSE_IF_PRESENT,
   OMP_CLAUSE_FINALIZE,
+  OMP_CLAUSE_ATTACH,
+  OMP_CLAUSE_DETACH,
   /* This must come last.  */
   OMP_MASK2_LAST
 };
@@ -914,10 +918,12 @@ omp_inv_mask::omp_inv_mask (const omp_mask &m) : omp_mask (m)
    mapping.  */
 
 static bool
-gfc_match_omp_map_clause (gfc_omp_namelist **list, gfc_omp_map_op map_op)
+gfc_match_omp_map_clause (gfc_omp_namelist **list, gfc_omp_map_op map_op,
+			  bool allow_derived = false)
 {
   gfc_omp_namelist **head = NULL;
-  if (gfc_match_omp_variable_list ("", list, false, NULL, &head, true)
+  if (gfc_match_omp_variable_list ("", list, false, NULL, &head, true,
+				   allow_derived)
       == MATCH_YES)
     {
       gfc_omp_namelist *n;
@@ -939,6 +945,14 @@ gfc_match_omp_clauses (gfc_omp_clauses **cp, const omp_mask mask,
 {
   gfc_omp_clauses *c = gfc_get_omp_clauses ();
   locus old_loc;
+  /* Determine whether we're dealing with an OpenACC directive that permits
+     derived type member accesses.  This in particular disallows
+     "!$acc declare" from using such accesses, because it's not clear if/how
+     that should work.  */
+  bool allow_derived = (openacc
+			&& ((mask & OMP_CLAUSE_ATTACH)
+			    || (mask & OMP_CLAUSE_DETACH)
+			    || (mask & OMP_CLAUSE_HOST_SELF)));
 
   gcc_checking_assert (OMP_MASK1_LAST <= 64 && OMP_MASK2_LAST <= 64);
   *cp = NULL;
@@ -1012,6 +1026,11 @@ gfc_match_omp_clauses (gfc_omp_clauses **cp, const omp_mask mask,
 	      needs_space = true;
 	      continue;
 	    }
+	  if ((mask & OMP_CLAUSE_ATTACH)
+	      && gfc_match ("attach ( ") == MATCH_YES
+	      && gfc_match_omp_map_clause (&c->lists[OMP_LIST_MAP],
+					   OMP_MAP_ATTACH, allow_derived))
+	    continue;
 	  break;
 	case 'c':
 	  if ((mask & OMP_CLAUSE_COLLAPSE)
@@ -1039,7 +1058,7 @@ gfc_match_omp_clauses (gfc_omp_clauses **cp, const omp_mask mask,
 	  if ((mask & OMP_CLAUSE_COPY)
 	      && gfc_match ("copy ( ") == MATCH_YES
 	      && gfc_match_omp_map_clause (&c->lists[OMP_LIST_MAP],
-					   OMP_MAP_TOFROM))
+					   OMP_MAP_TOFROM, allow_derived))
 	    continue;
 	  if (mask & OMP_CLAUSE_COPYIN)
 	    {
@@ -1047,7 +1066,7 @@ gfc_match_omp_clauses (gfc_omp_clauses **cp, const omp_mask mask,
 		{
 		  if (gfc_match ("copyin ( ") == MATCH_YES
 		      && gfc_match_omp_map_clause (&c->lists[OMP_LIST_MAP],
-						   OMP_MAP_TO))
+						   OMP_MAP_TO, allow_derived))
 		    continue;
 		}
 	      else if (gfc_match_omp_variable_list ("copyin (",
@@ -1058,7 +1077,7 @@ gfc_match_omp_clauses (gfc_omp_clauses **cp, const omp_mask mask,
 	  if ((mask & OMP_CLAUSE_COPYOUT)
 	      && gfc_match ("copyout ( ") == MATCH_YES
 	      && gfc_match_omp_map_clause (&c->lists[OMP_LIST_MAP],
-					   OMP_MAP_FROM))
+					   OMP_MAP_FROM, allow_derived))
 	    continue;
 	  if ((mask & OMP_CLAUSE_COPYPRIVATE)
 	      && gfc_match_omp_variable_list ("copyprivate (",
@@ -1068,7 +1087,7 @@ gfc_match_omp_clauses (gfc_omp_clauses **cp, const omp_mask mask,
 	  if ((mask & OMP_CLAUSE_CREATE)
 	      && gfc_match ("create ( ") == MATCH_YES
 	      && gfc_match_omp_map_clause (&c->lists[OMP_LIST_MAP],
-					   OMP_MAP_ALLOC))
+					   OMP_MAP_ALLOC, allow_derived))
 	    continue;
 	  break;
 	case 'd':
@@ -1104,7 +1123,7 @@ gfc_match_omp_clauses (gfc_omp_clauses **cp, const omp_mask mask,
 	  if ((mask & OMP_CLAUSE_DELETE)
 	      && gfc_match ("delete ( ") == MATCH_YES
 	      && gfc_match_omp_map_clause (&c->lists[OMP_LIST_MAP],
-					   OMP_MAP_RELEASE))
+					   OMP_MAP_RELEASE, allow_derived))
 	    continue;
 	  if ((mask & OMP_CLAUSE_DEPEND)
 	      && gfc_match ("depend ( ") == MATCH_YES)
@@ -1147,6 +1166,11 @@ gfc_match_omp_clauses (gfc_omp_clauses **cp, const omp_mask mask,
 	      else
 		gfc_current_locus = old_loc;
 	    }
+	  if ((mask & OMP_CLAUSE_DETACH)
+	      && gfc_match ("detach ( ") == MATCH_YES
+	      && gfc_match_omp_map_clause (&c->lists[OMP_LIST_MAP],
+					   OMP_MAP_DETACH, allow_derived))
+	    continue;
 	  if ((mask & OMP_CLAUSE_DEVICE)
 	      && !openacc
 	      && c->device == NULL
@@ -1156,12 +1180,13 @@ gfc_match_omp_clauses (gfc_omp_clauses **cp, const omp_mask mask,
 	      && openacc
 	      && gfc_match ("device ( ") == MATCH_YES
 	      && gfc_match_omp_map_clause (&c->lists[OMP_LIST_MAP],
-					   OMP_MAP_FORCE_TO))
+					   OMP_MAP_FORCE_TO, allow_derived))
 	    continue;
 	  if ((mask & OMP_CLAUSE_DEVICEPTR)
 	      && gfc_match ("deviceptr ( ") == MATCH_YES
 	      && gfc_match_omp_map_clause (&c->lists[OMP_LIST_MAP],
-					   OMP_MAP_FORCE_DEVICEPTR))
+					   OMP_MAP_FORCE_DEVICEPTR,
+					   allow_derived))
 	    continue;
 	  if ((mask & OMP_CLAUSE_DEVICE_RESIDENT)
 	      && gfc_match_omp_variable_list
@@ -1239,7 +1264,7 @@ gfc_match_omp_clauses (gfc_omp_clauses **cp, const omp_mask mask,
 	  if ((mask & OMP_CLAUSE_HOST_SELF)
 	      && gfc_match ("host ( ") == MATCH_YES
 	      && gfc_match_omp_map_clause (&c->lists[OMP_LIST_MAP],
-					   OMP_MAP_FORCE_FROM))
+					   OMP_MAP_FORCE_FROM, allow_derived))
 	    continue;
 	  break;
 	case 'i':
@@ -1511,47 +1536,48 @@ gfc_match_omp_clauses (gfc_omp_clauses **cp, const omp_mask mask,
 	  if ((mask & OMP_CLAUSE_COPY)
 	      && gfc_match ("pcopy ( ") == MATCH_YES
 	      && gfc_match_omp_map_clause (&c->lists[OMP_LIST_MAP],
-					   OMP_MAP_TOFROM))
+					   OMP_MAP_TOFROM, allow_derived))
 	    continue;
 	  if ((mask & OMP_CLAUSE_COPYIN)
 	      && gfc_match ("pcopyin ( ") == MATCH_YES
 	      && gfc_match_omp_map_clause (&c->lists[OMP_LIST_MAP],
-					   OMP_MAP_TO))
+					   OMP_MAP_TO, allow_derived))
 	    continue;
 	  if ((mask & OMP_CLAUSE_COPYOUT)
 	      && gfc_match ("pcopyout ( ") == MATCH_YES
 	      && gfc_match_omp_map_clause (&c->lists[OMP_LIST_MAP],
-					   OMP_MAP_FROM))
+					   OMP_MAP_FROM, allow_derived))
 	    continue;
 	  if ((mask & OMP_CLAUSE_CREATE)
 	      && gfc_match ("pcreate ( ") == MATCH_YES
 	      && gfc_match_omp_map_clause (&c->lists[OMP_LIST_MAP],
-					   OMP_MAP_ALLOC))
+					   OMP_MAP_ALLOC, allow_derived))
 	    continue;
 	  if ((mask & OMP_CLAUSE_PRESENT)
 	      && gfc_match ("present ( ") == MATCH_YES
 	      && gfc_match_omp_map_clause (&c->lists[OMP_LIST_MAP],
-					   OMP_MAP_FORCE_PRESENT))
+					   OMP_MAP_FORCE_PRESENT,
+					   allow_derived))
 	    continue;
 	  if ((mask & OMP_CLAUSE_COPY)
 	      && gfc_match ("present_or_copy ( ") == MATCH_YES
 	      && gfc_match_omp_map_clause (&c->lists[OMP_LIST_MAP],
-					   OMP_MAP_TOFROM))
+					   OMP_MAP_TOFROM, allow_derived))
 	    continue;
 	  if ((mask & OMP_CLAUSE_COPYIN)
 	      && gfc_match ("present_or_copyin ( ") == MATCH_YES
 	      && gfc_match_omp_map_clause (&c->lists[OMP_LIST_MAP],
-					   OMP_MAP_TO))
+					   OMP_MAP_TO, allow_derived))
 	    continue;
 	  if ((mask & OMP_CLAUSE_COPYOUT)
 	      && gfc_match ("present_or_copyout ( ") == MATCH_YES
 	      && gfc_match_omp_map_clause (&c->lists[OMP_LIST_MAP],
-					   OMP_MAP_FROM))
+					   OMP_MAP_FROM, allow_derived))
 	    continue;
 	  if ((mask & OMP_CLAUSE_CREATE)
 	      && gfc_match ("present_or_create ( ") == MATCH_YES
 	      && gfc_match_omp_map_clause (&c->lists[OMP_LIST_MAP],
-					   OMP_MAP_ALLOC))
+					   OMP_MAP_ALLOC, allow_derived))
 	    continue;
 	  if ((mask & OMP_CLAUSE_PRIORITY)
 	      && c->priority == NULL
@@ -1669,8 +1695,8 @@ gfc_match_omp_clauses (gfc_omp_clauses **cp, const omp_mask mask,
 
 	      if (gfc_match_omp_variable_list (" :",
 					       &c->lists[OMP_LIST_REDUCTION],
-					       false, NULL, &head,
-					       openacc) == MATCH_YES)
+					       false, NULL, &head, openacc,
+					       allow_derived) == MATCH_YES)
 		{
 		  gfc_omp_namelist *n;
 		  if (rop == OMP_REDUCTION_NONE)
@@ -1769,7 +1795,7 @@ gfc_match_omp_clauses (gfc_omp_clauses **cp, const omp_mask mask,
 	  if ((mask & OMP_CLAUSE_HOST_SELF)
 	      && gfc_match ("self ( ") == MATCH_YES
 	      && gfc_match_omp_map_clause (&c->lists[OMP_LIST_MAP],
-					   OMP_MAP_FORCE_FROM))
+					   OMP_MAP_FORCE_FROM, allow_derived))
 	    continue;
 	  if ((mask & OMP_CLAUSE_SEQ)
 	      && !c->seq
@@ -1926,17 +1952,17 @@ gfc_match_omp_clauses (gfc_omp_clauses **cp, const omp_mask mask,
    | OMP_CLAUSE_COPY | OMP_CLAUSE_COPYIN | OMP_CLAUSE_COPYOUT		      \
    | OMP_CLAUSE_CREATE | OMP_CLAUSE_PRESENT | OMP_CLAUSE_DEVICEPTR	      \
    | OMP_CLAUSE_PRIVATE | OMP_CLAUSE_FIRSTPRIVATE | OMP_CLAUSE_DEFAULT	      \
-   | OMP_CLAUSE_WAIT)
+   | OMP_CLAUSE_WAIT | OMP_CLAUSE_ATTACH)
 #define OACC_KERNELS_CLAUSES \
   (omp_mask (OMP_CLAUSE_IF) | OMP_CLAUSE_ASYNC | OMP_CLAUSE_NUM_GANGS	      \
    | OMP_CLAUSE_NUM_WORKERS | OMP_CLAUSE_VECTOR_LENGTH | OMP_CLAUSE_DEVICEPTR \
    | OMP_CLAUSE_COPY | OMP_CLAUSE_COPYIN | OMP_CLAUSE_COPYOUT		      \
    | OMP_CLAUSE_CREATE | OMP_CLAUSE_PRESENT | OMP_CLAUSE_DEFAULT	      \
-   | OMP_CLAUSE_WAIT)
+   | OMP_CLAUSE_WAIT | OMP_CLAUSE_ATTACH)
 #define OACC_DATA_CLAUSES \
   (omp_mask (OMP_CLAUSE_IF) | OMP_CLAUSE_DEVICEPTR  | OMP_CLAUSE_COPY	      \
    | OMP_CLAUSE_COPYIN | OMP_CLAUSE_COPYOUT | OMP_CLAUSE_CREATE		      \
-   | OMP_CLAUSE_PRESENT)
+   | OMP_CLAUSE_PRESENT | OMP_CLAUSE_ATTACH)
 #define OACC_LOOP_CLAUSES \
   (omp_mask (OMP_CLAUSE_COLLAPSE) | OMP_CLAUSE_GANG | OMP_CLAUSE_WORKER	      \
    | OMP_CLAUSE_VECTOR | OMP_CLAUSE_SEQ | OMP_CLAUSE_INDEPENDENT	      \
@@ -1957,10 +1983,11 @@ gfc_match_omp_clauses (gfc_omp_clauses **cp, const omp_mask mask,
    | OMP_CLAUSE_DEVICE | OMP_CLAUSE_WAIT | OMP_CLAUSE_IF_PRESENT)
 #define OACC_ENTER_DATA_CLAUSES \
   (omp_mask (OMP_CLAUSE_IF) | OMP_CLAUSE_ASYNC | OMP_CLAUSE_WAIT	      \
-   | OMP_CLAUSE_COPYIN | OMP_CLAUSE_CREATE)
+   | OMP_CLAUSE_COPYIN | OMP_CLAUSE_CREATE | OMP_CLAUSE_ATTACH)
 #define OACC_EXIT_DATA_CLAUSES \
   (omp_mask (OMP_CLAUSE_IF) | OMP_CLAUSE_ASYNC | OMP_CLAUSE_WAIT	      \
-   | OMP_CLAUSE_COPYOUT | OMP_CLAUSE_DELETE | OMP_CLAUSE_FINALIZE)
+   | OMP_CLAUSE_COPYOUT | OMP_CLAUSE_DELETE | OMP_CLAUSE_FINALIZE	      \
+   | OMP_CLAUSE_DETACH)
 #define OACC_WAIT_CLAUSES \
   omp_mask (OMP_CLAUSE_ASYNC)
 #define OACC_ROUTINE_CLAUSES \
@@ -3733,9 +3760,6 @@ resolve_nonnegative_int_expr (gfc_expr *expr, const char *clause)
 static void
 check_symbol_not_pointer (gfc_symbol *sym, locus loc, const char *name)
 {
-  if (sym->ts.type == BT_DERIVED && sym->attr.pointer)
-    gfc_error ("POINTER object %qs of derived type in %s clause at %L",
-	       sym->name, name, &loc);
   if (sym->ts.type == BT_DERIVED && sym->attr.cray_pointer)
     gfc_error ("Cray pointer object %qs of derived type in %s clause at %L",
 	       sym->name, name, &loc);
@@ -3780,9 +3804,6 @@ check_array_not_assumed (gfc_symbol *sym, locus loc, const char *name)
 static void
 resolve_oacc_data_clauses (gfc_symbol *sym, locus loc, const char *name)
 {
-  if (sym->ts.type == BT_DERIVED && sym->attr.allocatable)
-    gfc_error ("ALLOCATABLE object %qs of derived type in %s clause at %L",
-	       sym->name, name, &loc);
   if ((sym->ts.type == BT_ASSUMED && sym->attr.allocatable)
       || (sym->ts.type == BT_CLASS && CLASS_DATA (sym)
 	  && CLASS_DATA (sym)->attr.allocatable))
@@ -4152,11 +4173,26 @@ resolve_omp_clauses (gfc_code *code, gfc_omp_clauses *omp_clauses,
 	&& (list != OMP_LIST_REDUCTION || !openacc))
       for (n = omp_clauses->lists[list]; n; n = n->next)
 	{
-	  if (n->sym->mark)
-	    gfc_error ("Symbol %qs present on multiple clauses at %L",
-		       n->sym->name, &n->where);
-	  else
-	    n->sym->mark = 1;
+	  bool array_only_p = true;
+	  /* Disallow duplicate bare variable references and multiple
+	     subarrays of the same array here, but allow multiple components of
+	     the same (e.g. derived-type) variable.  For the latter, duplicate
+	     components are detected elsewhere.  */
+	  if (openacc && n->expr && n->expr->expr_type == EXPR_VARIABLE)
+	    for (gfc_ref *ref = n->expr->ref; ref; ref = ref->next)
+	      if (ref->type != REF_ARRAY)
+	        {
+		  array_only_p = false;
+		  break;
+		}
+	  if (array_only_p)
+	    {
+	      if (n->sym->mark)
+		gfc_error ("Symbol %qs present on multiple clauses at %L",
+			   n->sym->name, &n->where);
+	      else
+		n->sym->mark = 1;
+	    }
 	}
 
   gcc_assert (OMP_LIST_LASTPRIVATE == OMP_LIST_FIRSTPRIVATE + 1);
@@ -4347,23 +4383,43 @@ resolve_omp_clauses (gfc_code *code, gfc_omp_clauses *omp_clauses,
 				 "are allowed on ORDERED directive at %L",
 				 &n->where);
 		  }
+		gfc_ref *array_ref = NULL;
+		bool resolved = false;
 		if (n->expr)
 		  {
-		    if (!gfc_resolve_expr (n->expr)
+		    array_ref = n->expr->ref;
+		    resolved = gfc_resolve_expr (n->expr);
+
+		    /* Look through component refs to find last array
+		       reference.  */
+		    if (openacc)
+		      while (resolved
+			     && array_ref
+			     && (array_ref->type == REF_COMPONENT
+				 || (array_ref->type == REF_ARRAY
+				     && array_ref->next
+			             && (array_ref->next->type
+					 == REF_COMPONENT))))
+			array_ref = array_ref->next;
+		  }
+		if (array_ref
+		    || (n->expr
+			&& (!resolved || n->expr->expr_type != EXPR_VARIABLE)))
+		  {
+		    if (!resolved
 			|| n->expr->expr_type != EXPR_VARIABLE
-			|| n->expr->ref == NULL
-			|| n->expr->ref->next
-			|| n->expr->ref->type != REF_ARRAY)
+			|| array_ref->next
+			|| array_ref->type != REF_ARRAY)
 		      gfc_error ("%qs in %s clause at %L is not a proper "
 				 "array section", n->sym->name, name,
 				 &n->where);
-		    else if (n->expr->ref->u.ar.codimen)
+		    else if (array_ref->u.ar.codimen)
 		      gfc_error ("Coarrays not supported in %s clause at %L",
 				 name, &n->where);
 		    else
 		      {
 			int i;
-			gfc_array_ref *ar = &n->expr->ref->u.ar;
+			gfc_array_ref *ar = &array_ref->u.ar;
 			for (i = 0; i < ar->dimen; i++)
 			  if (ar->stride[i])
 			    {
diff --git a/gcc/fortran/trans-openmp.c b/gcc/fortran/trans-openmp.c
index 07dd790..4d50c6d 100644
--- a/gcc/fortran/trans-openmp.c
+++ b/gcc/fortran/trans-openmp.c
@@ -60,6 +60,9 @@ gfc_omp_privatize_by_reference (const_tree decl)
 
   if (TREE_CODE (type) == POINTER_TYPE)
     {
+      while (TREE_CODE (decl) == COMPONENT_REF)
+        decl = TREE_OPERAND (decl, 1);
+
       /* Array POINTER/ALLOCATABLE have aggregate types, all user variables
 	 that have POINTER_TYPE type and aren't scalar pointers, scalar
 	 allocatables, Cray pointees or C pointers are supposed to be
@@ -2113,20 +2116,47 @@ gfc_trans_omp_clauses (stmtblock_t *block, gfc_omp_clauses *clauses,
 	      tree decl = gfc_get_symbol_decl (n->sym);
 	      if (DECL_P (decl))
 		TREE_ADDRESSABLE (decl) = 1;
-	      if (n->expr == NULL || n->expr->ref->u.ar.type == AR_FULL)
+
+	      gfc_ref *ref = n->expr ? n->expr->ref : NULL;
+	      symbol_attribute *sym_attr = &n->sym->attr;
+	      gomp_map_kind ptr_map_kind = GOMP_MAP_POINTER;
+
+	      if (ref && n->sym->ts.type == BT_DERIVED)
+	        {
+		  if (gfc_omp_privatize_by_reference (decl))
+		    decl = build_fold_indirect_ref (decl);
+
+		  for (; ref && ref->type == REF_COMPONENT; ref = ref->next)
+		    {
+		      tree field = ref->u.c.component->backend_decl;
+		      gcc_assert (field && TREE_CODE (field) == FIELD_DECL);
+		      decl = fold_build3 (COMPONENT_REF, TREE_TYPE (field),
+					  decl, field, NULL_TREE);
+		      sym_attr = &ref->u.c.component->attr;
+		    }
+
+		  ptr_map_kind = GOMP_MAP_ALWAYS_POINTER;
+		}
+
+	      if (ref == NULL || ref->u.ar.type == AR_FULL)
 		{
+		  tree field = decl;
+
+		  while (TREE_CODE (field) == COMPONENT_REF)
+		    field = TREE_OPERAND (field, 1);
+
 		  if (POINTER_TYPE_P (TREE_TYPE (decl))
 		      && (gfc_omp_privatize_by_reference (decl)
-			  || GFC_DECL_GET_SCALAR_POINTER (decl)
-			  || GFC_DECL_GET_SCALAR_ALLOCATABLE (decl)
-			  || GFC_DECL_CRAY_POINTEE (decl)
+			  || GFC_DECL_GET_SCALAR_POINTER (field)
+			  || GFC_DECL_GET_SCALAR_ALLOCATABLE (field)
+			  || GFC_DECL_CRAY_POINTEE (field)
 			  || GFC_DESCRIPTOR_TYPE_P
-					(TREE_TYPE (TREE_TYPE (decl)))))
+					(TREE_TYPE (TREE_TYPE (field)))))
 		    {
 		      tree orig_decl = decl;
 		      node4 = build_omp_clause (input_location,
 						OMP_CLAUSE_MAP);
-		      OMP_CLAUSE_SET_MAP_KIND (node4, GOMP_MAP_POINTER);
+		      OMP_CLAUSE_SET_MAP_KIND (node4, ptr_map_kind);
 		      OMP_CLAUSE_DECL (node4) = decl;
 		      OMP_CLAUSE_SIZE (node4) = size_int (0);
 		      decl = build_fold_indirect_ref (decl);
@@ -2136,13 +2166,15 @@ gfc_trans_omp_clauses (stmtblock_t *block, gfc_omp_clauses *clauses,
 			{
 			  node3 = build_omp_clause (input_location,
 						    OMP_CLAUSE_MAP);
-			  OMP_CLAUSE_SET_MAP_KIND (node3, GOMP_MAP_POINTER);
+			  OMP_CLAUSE_SET_MAP_KIND (node3, ptr_map_kind);
 			  OMP_CLAUSE_DECL (node3) = decl;
 			  OMP_CLAUSE_SIZE (node3) = size_int (0);
 			  decl = build_fold_indirect_ref (decl);
 			}
 		    }
-		  if (GFC_DESCRIPTOR_TYPE_P (TREE_TYPE (decl)))
+		  if (GFC_DESCRIPTOR_TYPE_P (TREE_TYPE (decl))
+		      && n->u.map_op != OMP_MAP_ATTACH
+		      && n->u.map_op != OMP_MAP_DETACH)
 		    {
 		      tree type = TREE_TYPE (decl);
 		      tree ptr = gfc_conv_descriptor_data_get (decl);
@@ -2157,14 +2189,16 @@ gfc_trans_omp_clauses (stmtblock_t *block, gfc_omp_clauses *clauses,
 		      OMP_CLAUSE_SIZE (node2) = TYPE_SIZE_UNIT (type);
 		      node3 = build_omp_clause (input_location,
 						OMP_CLAUSE_MAP);
-		      OMP_CLAUSE_SET_MAP_KIND (node3, GOMP_MAP_POINTER);
+		      OMP_CLAUSE_SET_MAP_KIND (node3, ptr_map_kind);
 		      OMP_CLAUSE_DECL (node3)
 			= gfc_conv_descriptor_data_get (decl);
+		      if (ptr_map_kind == GOMP_MAP_ALWAYS_POINTER)
+		        STRIP_NOPS (OMP_CLAUSE_DECL (node3));
 		      OMP_CLAUSE_SIZE (node3) = size_int (0);
 
 		      /* We have to check for n->sym->attr.dimension because
 			 of scalar coarrays.  */
-		      if (n->sym->attr.pointer && n->sym->attr.dimension)
+		      if (sym_attr->pointer && sym_attr->dimension)
 			{
 			  stmtblock_t cond_block;
 			  tree size
@@ -2194,11 +2228,11 @@ gfc_trans_omp_clauses (stmtblock_t *block, gfc_omp_clauses *clauses,
 							     else_b));
 			  OMP_CLAUSE_SIZE (node) = size;
 			}
-		      else if (n->sym->attr.dimension)
+		      else if (sym_attr->dimension)
 			OMP_CLAUSE_SIZE (node)
 			  = gfc_full_array_size (block, decl,
 						 GFC_TYPE_ARRAY_RANK (type));
-		      if (n->sym->attr.dimension)
+		      if (sym_attr->dimension)
 			{
 			  tree elemsz
 			    = TYPE_SIZE_UNIT (gfc_get_element_type (type));
@@ -2211,11 +2245,11 @@ gfc_trans_omp_clauses (stmtblock_t *block, gfc_omp_clauses *clauses,
 		  else
 		    OMP_CLAUSE_DECL (node) = decl;
 		}
-	      else
+	      else if (ref)
 		{
 		  tree ptr, ptr2;
 		  gfc_init_se (&se, NULL);
-		  if (n->expr->ref->u.ar.type == AR_ELEMENT)
+		  if (ref->u.ar.type == AR_ELEMENT)
 		    {
 		      gfc_conv_expr_reference (&se, n->expr);
 		      gfc_add_block_to_block (block, &se.pre);
@@ -2249,7 +2283,7 @@ gfc_trans_omp_clauses (stmtblock_t *block, gfc_omp_clauses *clauses,
 		    {
 		      node4 = build_omp_clause (input_location,
 						OMP_CLAUSE_MAP);
-		      OMP_CLAUSE_SET_MAP_KIND (node4, GOMP_MAP_POINTER);
+		      OMP_CLAUSE_SET_MAP_KIND (node4, ptr_map_kind);
 		      OMP_CLAUSE_DECL (node4) = decl;
 		      OMP_CLAUSE_SIZE (node4) = size_int (0);
 		      decl = build_fold_indirect_ref (decl);
@@ -2266,9 +2300,11 @@ gfc_trans_omp_clauses (stmtblock_t *block, gfc_omp_clauses *clauses,
 		      OMP_CLAUSE_SIZE (node2) = TYPE_SIZE_UNIT (type);
 		      node3 = build_omp_clause (input_location,
 						OMP_CLAUSE_MAP);
-		      OMP_CLAUSE_SET_MAP_KIND (node3, GOMP_MAP_POINTER);
+		      OMP_CLAUSE_SET_MAP_KIND (node3, ptr_map_kind);
 		      OMP_CLAUSE_DECL (node3)
 			= gfc_conv_descriptor_data_get (decl);
+		      if (ptr_map_kind == GOMP_MAP_ALWAYS_POINTER)
+		        STRIP_NOPS (OMP_CLAUSE_DECL (node3));
 		    }
 		  else
 		    {
@@ -2281,18 +2317,23 @@ gfc_trans_omp_clauses (stmtblock_t *block, gfc_omp_clauses *clauses,
 			}
 		      node3 = build_omp_clause (input_location,
 						OMP_CLAUSE_MAP);
-		      OMP_CLAUSE_SET_MAP_KIND (node3, GOMP_MAP_POINTER);
+		      OMP_CLAUSE_SET_MAP_KIND (node3, ptr_map_kind);
 		      OMP_CLAUSE_DECL (node3) = decl;
 		    }
 		  ptr2 = fold_convert (sizetype, ptr2);
 		  OMP_CLAUSE_SIZE (node3)
 		    = fold_build2 (MINUS_EXPR, sizetype, ptr, ptr2);
 		}
+	      else
+	        gcc_unreachable ();
 	      switch (n->u.map_op)
 		{
 		case OMP_MAP_ALLOC:
 		  OMP_CLAUSE_SET_MAP_KIND (node, GOMP_MAP_ALLOC);
 		  break;
+		case OMP_MAP_ATTACH:
+		  OMP_CLAUSE_SET_MAP_KIND (node, GOMP_MAP_ATTACH);
+		  break;
 		case OMP_MAP_TO:
 		  OMP_CLAUSE_SET_MAP_KIND (node, GOMP_MAP_TO);
 		  break;
@@ -2317,6 +2358,9 @@ gfc_trans_omp_clauses (stmtblock_t *block, gfc_omp_clauses *clauses,
 		case OMP_MAP_DELETE:
 		  OMP_CLAUSE_SET_MAP_KIND (node, GOMP_MAP_DELETE);
 		  break;
+		case OMP_MAP_DETACH:
+		  OMP_CLAUSE_SET_MAP_KIND (node, GOMP_MAP_DETACH);
+		  break;
 		case OMP_MAP_FORCE_ALLOC:
 		  OMP_CLAUSE_SET_MAP_KIND (node, GOMP_MAP_FORCE_ALLOC);
 		  break;
diff --git a/gcc/gimplify.c b/gcc/gimplify.c
index 93e49ed..c27ea76 100644
--- a/gcc/gimplify.c
+++ b/gcc/gimplify.c
@@ -113,6 +113,10 @@ enum gimplify_omp_var_data
 
   GOVD_NONTEMPORAL = 4194304,
 
+  /* Flag for GOVD_MAP: (struct) vars that have pointer attachments for
+     fields.  */
+  GOVD_MAP_HAS_ATTACHMENTS = 8388608,
+
   GOVD_DATA_SHARE_CLASS = (GOVD_SHARED | GOVD_PRIVATE | GOVD_FIRSTPRIVATE
 			   | GOVD_LASTPRIVATE | GOVD_REDUCTION | GOVD_LINEAR
 			   | GOVD_LOCAL)
@@ -8018,7 +8022,13 @@ insert_struct_component_mapping (enum tree_code code, tree c, tree struct_node,
   OMP_CLAUSE_SET_MAP_KIND (c2, mkind);
   OMP_CLAUSE_DECL (c2) = unshare_expr (OMP_CLAUSE_DECL (c));
   OMP_CLAUSE_CHAIN (c2) = scp ? *scp : prev_node;
-  OMP_CLAUSE_SIZE (c2) = TYPE_SIZE_UNIT (ptr_type_node);
+  if (OMP_CLAUSE_CHAIN (prev_node) != c
+      && OMP_CLAUSE_CODE (OMP_CLAUSE_CHAIN (prev_node)) == OMP_CLAUSE_MAP
+      && (OMP_CLAUSE_MAP_KIND (OMP_CLAUSE_CHAIN (prev_node))
+	  == GOMP_MAP_TO_PSET))
+    OMP_CLAUSE_SIZE (c2) = OMP_CLAUSE_SIZE (OMP_CLAUSE_CHAIN (prev_node));
+  else
+    OMP_CLAUSE_SIZE (c2) = TYPE_SIZE_UNIT (ptr_type_node);
   if (struct_node)
     OMP_CLAUSE_CHAIN (struct_node) = c2;
 
@@ -8612,7 +8622,9 @@ gimplify_scan_omp_clauses (tree *list_p, gimple_seq *pre_p,
 		  remove = true;
 		  break;
 		}
-	      if (DECL_P (decl))
+	      if (DECL_P (decl)
+		  && OMP_CLAUSE_MAP_KIND (c) != GOMP_MAP_TO_PSET
+		  && code != OACC_UPDATE)
 		{
 		  if (error_operand_p (decl))
 		    {
@@ -8664,16 +8676,40 @@ gimplify_scan_omp_clauses (tree *list_p, gimple_seq *pre_p,
 		    = splay_tree_lookup (ctx->variables, (splay_tree_key)decl);
 		  bool ptr = (OMP_CLAUSE_MAP_KIND (c)
 			      == GOMP_MAP_ALWAYS_POINTER);
+		  bool attach = OMP_CLAUSE_MAP_KIND (c) == GOMP_MAP_ATTACH
+			        || OMP_CLAUSE_MAP_KIND (c) == GOMP_MAP_DETACH;
+		  bool has_attachments = false;
+		  /* For OpenACC, pointers in structs should trigger an
+		     attach action.  */
+		  if (ptr && (region_type & ORT_ACC) != 0)
+		    {
+		      /* Turning a GOMP_MAP_ALWAYS_POINTER clause into a
+			 GOMP_MAP_ATTACH clause after we have detected a case
+			 that needs a GOMP_MAP_STRUCT mapping added.  */
+		      gomp_map_kind k
+			= (code == OACC_EXIT_DATA) ? GOMP_MAP_DETACH
+						   : GOMP_MAP_ATTACH;
+		      OMP_CLAUSE_SET_MAP_KIND (c, k);
+		      has_attachments = true;
+		    }
 		  if (n == NULL || (n->value & GOVD_MAP) == 0)
 		    {
 		      tree l = build_omp_clause (OMP_CLAUSE_LOCATION (c),
 						 OMP_CLAUSE_MAP);
-		      OMP_CLAUSE_SET_MAP_KIND (l, GOMP_MAP_STRUCT);
+		      gomp_map_kind k = attach ? GOMP_MAP_FORCE_PRESENT
+					       : GOMP_MAP_STRUCT;
+
+		      OMP_CLAUSE_SET_MAP_KIND (l, k);
 		      if (!base_eq_orig_base)
 			OMP_CLAUSE_DECL (l) = unshare_expr (orig_base);
 		      else
 			OMP_CLAUSE_DECL (l) = decl;
-		      OMP_CLAUSE_SIZE (l) = size_int (1);
+		      OMP_CLAUSE_SIZE (l)
+			= (!attach
+			   ? size_int (1)
+			   : DECL_P (OMP_CLAUSE_DECL (l))
+			   ? DECL_SIZE_UNIT (OMP_CLAUSE_DECL (l))
+			   : TYPE_SIZE_UNIT (TREE_TYPE (OMP_CLAUSE_DECL (l))));
 		      if (struct_map_to_clause == NULL)
 			struct_map_to_clause = new hash_map<tree, tree>;
 		      struct_map_to_clause->put (decl, l);
@@ -8705,9 +8741,11 @@ gimplify_scan_omp_clauses (tree *list_p, gimple_seq *pre_p,
 		      flags = GOVD_MAP | GOVD_EXPLICIT;
 		      if (GOMP_MAP_ALWAYS_P (OMP_CLAUSE_MAP_KIND (c)) || ptr)
 			flags |= GOVD_SEEN;
+		      if (has_attachments)
+			flags |= GOVD_MAP_HAS_ATTACHMENTS;
 		      goto do_add_decl;
 		    }
-		  else
+		  else if (struct_map_to_clause)
 		    {
 		      tree *osc = struct_map_to_clause->get (decl);
 		      tree *sc = NULL, *scp = NULL;
@@ -8716,8 +8754,10 @@ gimplify_scan_omp_clauses (tree *list_p, gimple_seq *pre_p,
 		      sc = &OMP_CLAUSE_CHAIN (*osc);
 		      if (*sc != c
 			  && (OMP_CLAUSE_MAP_KIND (*sc)
-			      == GOMP_MAP_FIRSTPRIVATE_REFERENCE)) 
+			      == GOMP_MAP_FIRSTPRIVATE_REFERENCE))
 			sc = &OMP_CLAUSE_CHAIN (*sc);
+		      /* Here "prev_list_p" is the end of the inserted
+			 alloc/release nodes after the struct node, OSC.  */
 		      for (; *sc != c; sc = &OMP_CLAUSE_CHAIN (*sc))
 			if (ptr && sc == prev_list_p)
 			  break;
@@ -8776,9 +8816,10 @@ gimplify_scan_omp_clauses (tree *list_p, gimple_seq *pre_p,
 			  }
 		      if (remove)
 			break;
-		      OMP_CLAUSE_SIZE (*osc)
-			= size_binop (PLUS_EXPR, OMP_CLAUSE_SIZE (*osc),
-				      size_one_node);
+		      if (!attach)
+			OMP_CLAUSE_SIZE (*osc)
+			  = size_binop (PLUS_EXPR, OMP_CLAUSE_SIZE (*osc),
+					size_one_node);
 		      if (ptr)
 			{
 			  tree cl
@@ -8810,11 +8851,15 @@ gimplify_scan_omp_clauses (tree *list_p, gimple_seq *pre_p,
 		}
 	      if (!remove
 		  && OMP_CLAUSE_MAP_KIND (c) != GOMP_MAP_ALWAYS_POINTER
+		  && OMP_CLAUSE_MAP_KIND (c) != GOMP_MAP_TO_PSET
 		  && OMP_CLAUSE_CHAIN (c)
 		  && OMP_CLAUSE_CODE (OMP_CLAUSE_CHAIN (c)) == OMP_CLAUSE_MAP
-		  && (OMP_CLAUSE_MAP_KIND (OMP_CLAUSE_CHAIN (c))
-		      == GOMP_MAP_ALWAYS_POINTER))
+		  && ((OMP_CLAUSE_MAP_KIND (OMP_CLAUSE_CHAIN (c))
+		       == GOMP_MAP_ALWAYS_POINTER)
+		      || (OMP_CLAUSE_MAP_KIND (OMP_CLAUSE_CHAIN (c))
+		          == GOMP_MAP_TO_PSET)))
 		prev_list_p = list_p;
+
 	      break;
 	    }
 	  flags = GOVD_MAP | GOVD_EXPLICIT;
@@ -9436,6 +9481,8 @@ gimplify_adjust_omp_clauses_1 (splay_tree_node n, void *data)
     return 0;
   if ((flags & GOVD_SEEN) == 0)
     return 0;
+  if ((flags & GOVD_MAP_HAS_ATTACHMENTS) != 0)
+    return 0;
   if (flags & GOVD_DEBUG_PRIVATE)
     {
       gcc_assert ((flags & GOVD_DATA_SHARE_CLASS) == GOVD_SHARED);
@@ -11819,8 +11866,9 @@ gimplify_omp_target_update (tree *expr_p, gimple_seq *pre_p)
 	   && omp_find_clause (OMP_STANDALONE_CLAUSES (expr),
 			       OMP_CLAUSE_FINALIZE))
     {
-      /* Use GOMP_MAP_DELETE/GOMP_MAP_FORCE_FROM to denote that "finalize"
-	 semantics apply to all mappings of this OpenACC directive.  */
+      /* Use GOMP_MAP_DELETE, GOMP_MAP_FORCE_DETACH, and
+	 GOMP_MAP_FORCE_FROM to denote that "finalize" semantics apply
+	 to all mappings of this OpenACC directive.  */
       bool finalize_marked = false;
       for (tree c = OMP_STANDALONE_CLAUSES (expr); c; c = OMP_CLAUSE_CHAIN (c))
 	if (OMP_CLAUSE_CODE (c) == OMP_CLAUSE_MAP)
@@ -11834,10 +11882,19 @@ gimplify_omp_target_update (tree *expr_p, gimple_seq *pre_p)
 	      OMP_CLAUSE_SET_MAP_KIND (c, GOMP_MAP_DELETE);
 	      finalize_marked = true;
 	      break;
+	    case GOMP_MAP_DETACH:
+	      OMP_CLAUSE_SET_MAP_KIND (c, GOMP_MAP_FORCE_DETACH);
+	      finalize_marked = true;
+	      break;
+	    case GOMP_MAP_STRUCT:
+	    case GOMP_MAP_FORCE_PRESENT:
+	      /* Skip over an initial struct or force_present mapping.  */
+	      break;
 	    default:
-	      /* Check consistency: libgomp relies on the very first data
-		 mapping clause being marked, so make sure we did that before
-		 any other mapping clauses.  */
+	      /* Check consistency: libgomp relies on the very first
+		 non-struct, non-force-present data mapping clause being
+		 marked, so make sure we did that before any other mapping
+		 clauses.  */
 	      gcc_assert (finalize_marked);
 	      break;
 	    }
diff --git a/gcc/omp-low.c b/gcc/omp-low.c
index b406ce7..60d541b 100644
--- a/gcc/omp-low.c
+++ b/gcc/omp-low.c
@@ -9152,6 +9152,9 @@ lower_omp_target (gimple_stmt_iterator *gsi_p, omp_context *ctx)
 	  case GOMP_MAP_FORCE_DEVICEPTR:
 	  case GOMP_MAP_DEVICE_RESIDENT:
 	  case GOMP_MAP_LINK:
+	  case GOMP_MAP_ATTACH:
+	  case GOMP_MAP_DETACH:
+	  case GOMP_MAP_FORCE_DETACH:
 	    gcc_assert (is_gimple_omp_oacc (stmt));
 	    break;
 	  default:
diff --git a/gcc/testsuite/c-c++-common/goacc/mdc-1.c b/gcc/testsuite/c-c++-common/goacc/mdc-1.c
new file mode 100644
index 0000000..b8d03a0
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/goacc/mdc-1.c
@@ -0,0 +1,55 @@
+/* Test OpenACC's support for manual deep copy, including the attach
+   and detach clauses.  */
+
+/* { dg-do compile { target int32 } } */
+/* { dg-additional-options "-fdump-tree-omplower" } */
+
+void
+t1 ()
+{
+  struct foo {
+    int *a, *b, c, d, *e;
+  } s;
+
+  int *a, *z;
+
+#pragma acc enter data copyin(s)
+  {
+#pragma acc data copy(s.a[0:10]) copy(z[0:10])
+    {
+      s.e = z;
+#pragma acc parallel loop attach(s.e)
+      for (int i = 0; i < 10; i++)
+        s.a[i] = s.e[i];
+
+
+      a = s.e;
+#pragma acc enter data attach(a)
+#pragma acc exit data detach(a)
+    }
+
+#pragma acc enter data copyin(a)
+#pragma acc acc enter data attach(s.e)
+#pragma acc exit data detach(s.e)
+
+#pragma acc data attach(s.e)
+    {
+    }
+#pragma acc exit data delete(a)
+
+#pragma acc exit data detach(a) finalize
+#pragma acc exit data detach(s.a) finalize
+  }
+}
+
+/* { dg-final { scan-tree-dump-times "pragma omp target oacc_enter_exit_data map.to:s .len: 32.." 1 "omplower" } } */
+/* { dg-final { scan-tree-dump-times "pragma omp target oacc_data map.tofrom:.z .len: 40.. map.struct:s .len: 1.. map.alloc:s.a .len: 8.. map.tofrom:._1 .len: 40.. map.attach:s.a .len: 0.." 1 "omplower" } } */
+/* { dg-final { scan-tree-dump-times "pragma omp target oacc_parallel map.force_present:s .len: 32.. map.attach:s.e .len: 8.." 1 "omplower" } } */
+/* { dg-final { scan-tree-dump-times "pragma omp target oacc_enter_exit_data map.attach:a .len: 8.." 1 "omplower" } } */
+/* { dg-final { scan-tree-dump-times "pragma omp target oacc_enter_exit_data map.detach:a .len: 8.." 1 "omplower" } } */
+/* { dg-final { scan-tree-dump-times "pragma omp target oacc_enter_exit_data map.to:a .len: 8.." 1 "omplower" } } */
+/* { dg-final { scan-tree-dump-times "pragma omp target oacc_enter_exit_data map.force_present:s .len: 32.. map.detach:s.e .len: 8.." 1 "omplower" } } */
+/* { dg-final { scan-tree-dump-times "pragma omp target oacc_data map.force_present:s .len: 32.. map.attach:s.e .len: 8.." 1 "omplower" } } */
+/* { dg-final { scan-tree-dump-times "pragma omp target oacc_enter_exit_data map.release:a .len: 8.." 1 "omplower" } } */
+/* { dg-final { scan-tree-dump-times "pragma omp target oacc_enter_exit_data finalize map.force_detach:a .len: 8.." 1 "omplower" } } */
+/* { dg-final { scan-tree-dump-times "pragma omp target oacc_enter_exit_data finalize map.force_present:s .len: 32.. map.force_detach:s.a .len: 8.." 1 "omplower" } } */
diff --git a/gcc/testsuite/c-c++-common/goacc/mdc-2.c b/gcc/testsuite/c-c++-common/goacc/mdc-2.c
new file mode 100644
index 0000000..fae8667
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/goacc/mdc-2.c
@@ -0,0 +1,62 @@
+/* Test OpenACC's support for manual deep copy, including the attach
+   and detach clauses.  */
+
+void
+t1 ()
+{
+  struct foo {
+    int *a, *b, c, d, *e;
+  } s;
+
+  int *a, *z, scalar, **y;
+
+#pragma acc enter data copyin(s) detach(z) /* { dg-error ".detach. is not valid for" } */
+  {
+#pragma acc data copy(s.a[0:10]) copy(z[0:10])
+    {
+      s.e = z;
+#pragma acc parallel loop attach(s.e) detach(s.b) /* { dg-error ".detach. is not valid for" } */
+      for (int i = 0; i < 10; i++)
+        s.a[i] = s.e[i];
+
+      a = s.e;
+#pragma acc enter data attach(a) detach(s.c) /* { dg-error ".detach. is not valid for" } */
+#pragma acc exit data detach(a)
+    }
+
+#pragma acc enter data attach(z[:5]) /* { dg-error "expected single pointer in .attach. clause" } */
+/* { dg-error "has no data movement clause" "" { target *-*-* } .-1 } */
+#pragma acc exit data detach(z[:5]) /* { dg-error "expected single pointer in .detach. clause" } */
+/* { dg-error "has no data movement clause" "" { target *-*-* } .-1 } */
+#pragma acc enter data attach(z[1:]) /* { dg-error "expected single pointer in .attach. clause" } */
+/* { dg-error "has no data movement clause" "" { target *-*-* } .-1 } */
+#pragma acc exit data detach(z[1:]) /* { dg-error "expected single pointer in .detach. clause" } */
+/* { dg-error "has no data movement clause" "" { target *-*-* } .-1 } */
+#pragma acc enter data attach(z[:]) /* { dg-error "expected single pointer in .attach. clause" } */
+/* { dg-error "has no data movement clause" "" { target *-*-* } .-1 } */
+#pragma acc exit data detach(z[:]) /* { dg-error "expected single pointer in .detach. clause" } */
+/* { dg-error "has no data movement clause" "" { target *-*-* } .-1 } */
+#pragma acc enter data attach(z[3]) /* { dg-error "expected pointer in .attach. clause" } */
+#pragma acc exit data detach(z[3]) /* { dg-error "expected pointer in .detach. clause" } */
+
+#pragma acc acc enter data attach(s.e)
+#pragma acc exit data detach(s.e) attach(z) /* { dg-error ".attach. is not valid for" } */
+
+#pragma acc data attach(s.e)
+    {
+    }
+#pragma acc exit data delete(a) attach(s.a) /* { dg-error ".attach. is not valid for" } */
+
+#pragma acc enter data attach(scalar) /* { dg-error "expected pointer in .attach. clause" } */
+/* { dg-error "has no data movement clause" "" { target *-*-* } .-1 } */
+#pragma acc exit data detach(scalar) /* { dg-error "expected pointer in .detach. clause" } */
+/* { dg-error "has no data movement clause" "" { target *-*-* } .-1 } */
+#pragma acc enter data attach(s) /* { dg-error "expected pointer in .attach. clause" } */
+/* { dg-error "has no data movement clause" "" { target *-*-* } .-1 } */
+#pragma acc exit data detach(s) /* { dg-error "expected pointer in .detach. clause" } */
+/* { dg-error "has no data movement clause" "" { target *-*-* } .-1 } */
+  }
+
+#pragma acc enter data attach(y[10])
+#pragma acc exit data detach(y[10])
+}
diff --git a/gcc/testsuite/g++.dg/goacc/mdc.C b/gcc/testsuite/g++.dg/goacc/mdc.C
new file mode 100644
index 0000000..b3abab3
--- /dev/null
+++ b/gcc/testsuite/g++.dg/goacc/mdc.C
@@ -0,0 +1,68 @@
+/* Test OpenACC's support for manual deep copy, including the attach
+   and detach clauses.  */
+
+void
+t1 ()
+{
+  struct foo {
+    int *a, *b, c, d, *e;
+  } s;
+
+  struct foo& rs = s;
+  
+  int *a, *z, scalar, **y;
+  int* const &ra = a;
+  int* const &rz = z;
+  int& rscalar = scalar;
+  int** const &ry = y;
+
+#pragma acc enter data copyin(rs) detach(rz) /* { dg-error ".detach. is not valid for" } */
+  {
+#pragma acc data copy(rs.a[0:10]) copy(rz[0:10])
+    {
+      s.e = z;
+#pragma acc parallel loop attach(rs.e) detach(rs.b) /* { dg-error ".detach. is not valid for" } */
+      for (int i = 0; i < 10; i++)
+        s.a[i] = s.e[i];
+
+      a = s.e;
+#pragma acc enter data attach(ra) detach(rs.c) /* { dg-error ".detach. is not valid for" } */
+#pragma acc exit data detach(ra)
+    }
+
+#pragma acc enter data attach(rz[:5]) /* { dg-error "expected single pointer in .attach. clause" } */
+/* { dg-error "has no data movement clause" "" { target *-*-* } .-1 } */
+#pragma acc exit data detach(rz[:5]) /* { dg-error "expected single pointer in .detach. clause" } */
+/* { dg-error "has no data movement clause" "" { target *-*-* } .-1 } */
+#pragma acc enter data attach(rz[1:]) /* { dg-error "expected single pointer in .attach. clause" } */
+/* { dg-error "has no data movement clause" "" { target *-*-* } .-1 } */
+#pragma acc exit data detach(rz[1:]) /* { dg-error "expected single pointer in .detach. clause" } */
+/* { dg-error "has no data movement clause" "" { target *-*-* } .-1 } */
+#pragma acc enter data attach(rz[:]) /* { dg-error "expected single pointer in .attach. clause" } */
+/* { dg-error "has no data movement clause" "" { target *-*-* } .-1 } */
+#pragma acc exit data detach(rz[:]) /* { dg-error "expected single pointer in .detach. clause" } */
+/* { dg-error "has no data movement clause" "" { target *-*-* } .-1 } */
+#pragma acc enter data attach(rz[3]) /* { dg-error "expected pointer in .attach. clause" } */
+#pragma acc exit data detach(rz[3]) /* { dg-error "expected pointer in .detach. clause" } */
+
+#pragma acc acc enter data attach(rs.e)
+#pragma acc exit data detach(rs.e) attach(rz) /* { dg-error ".attach. is not valid for" } */
+
+#pragma acc data attach(rs.e)
+    {
+    }
+#pragma acc exit data delete(ra) attach(rs.a) /* { dg-error ".attach. is not valid for" } */
+
+#pragma acc enter data attach(rscalar) /* { dg-error "expected pointer in .attach. clause" } */
+/* { dg-error "has no data movement clause" "" { target *-*-* } .-1 } */
+#pragma acc exit data detach(rscalar) /* { dg-error "expected pointer in .detach. clause" } */
+/* { dg-error "has no data movement clause" "" { target *-*-* } .-1 } */
+#pragma acc enter data attach(rs) /* { dg-error "expected pointer in .attach. clause" } */
+/* { dg-error "has no data movement clause" "" { target *-*-* } .-1 } */
+#pragma acc exit data detach(rs) /* { dg-error "expected pointer in .detach. clause" } */
+/* { dg-error "has no data movement clause" "" { target *-*-* } .-1 } */
+  }
+
+#pragma acc enter data attach(ry[10])
+#pragma acc exit data detach(ry[10])
+}
diff --git a/gcc/testsuite/gfortran.dg/goacc/data-clauses.f95 b/gcc/testsuite/gfortran.dg/goacc/data-clauses.f95
index b94214e..1a4a671 100644
--- a/gcc/testsuite/gfortran.dg/goacc/data-clauses.f95
+++ b/gcc/testsuite/gfortran.dg/goacc/data-clauses.f95
@@ -39,9 +39,9 @@ contains
   !$acc end data
 
 
-  !$acc parallel copy (tip) ! { dg-error "POINTER" }
+  !$acc parallel copy (tip)
   !$acc end parallel
-  !$acc parallel copy (tia) ! { dg-error "ALLOCATABLE" }
+  !$acc parallel copy (tia)
   !$acc end parallel
   !$acc parallel deviceptr (i) copy (i) ! { dg-error "multiple clauses" }
   !$acc end parallel
@@ -54,9 +54,9 @@ contains
   !$acc end data
 
 
-  !$acc parallel copyin (tip) ! { dg-error "POINTER" }
+  !$acc parallel copyin (tip)
   !$acc end parallel
-  !$acc parallel copyin (tia) ! { dg-error "ALLOCATABLE" }
+  !$acc parallel copyin (tia)
   !$acc end parallel
   !$acc parallel deviceptr (i) copyin (i) ! { dg-error "multiple clauses" }
   !$acc end parallel
@@ -71,9 +71,9 @@ contains
   !$acc end data
 
 
-  !$acc parallel copyout (tip) ! { dg-error "POINTER" }
+  !$acc parallel copyout (tip)
   !$acc end parallel
-  !$acc parallel copyout (tia) ! { dg-error "ALLOCATABLE" }
+  !$acc parallel copyout (tia)
   !$acc end parallel
   !$acc parallel deviceptr (i) copyout (i) ! { dg-error "multiple clauses" }
   !$acc end parallel
@@ -90,9 +90,9 @@ contains
   !$acc end data
 
 
-  !$acc parallel create (tip) ! { dg-error "POINTER" }
+  !$acc parallel create (tip)
   !$acc end parallel
-  !$acc parallel create (tia) ! { dg-error "ALLOCATABLE" }
+  !$acc parallel create (tia)
   !$acc end parallel
   !$acc parallel deviceptr (i) create (i) ! { dg-error "multiple clauses" }
   !$acc end parallel
@@ -111,9 +111,9 @@ contains
   !$acc end data
 
 
-  !$acc parallel present (tip) ! { dg-error "POINTER" }
+  !$acc parallel present (tip)
   !$acc end parallel
-  !$acc parallel present (tia) ! { dg-error "ALLOCATABLE" }
+  !$acc parallel present (tia)
   !$acc end parallel
   !$acc parallel deviceptr (i) present (i) ! { dg-error "multiple clauses" }
   !$acc end parallel
@@ -144,9 +144,9 @@ contains
   !$acc end parallel
 
 
-  !$acc parallel present_or_copy (tip) ! { dg-error "POINTER" }
+  !$acc parallel present_or_copy (tip)
   !$acc end parallel
-  !$acc parallel present_or_copy (tia) ! { dg-error "ALLOCATABLE" }
+  !$acc parallel present_or_copy (tia)
   !$acc end parallel
   !$acc parallel deviceptr (i) present_or_copy (i) ! { dg-error "multiple clauses" }
   !$acc end parallel
@@ -169,9 +169,9 @@ contains
   !$acc end data
 
 
-  !$acc parallel present_or_copyin (tip) ! { dg-error "POINTER" }
+  !$acc parallel present_or_copyin (tip)
   !$acc end parallel
-  !$acc parallel present_or_copyin (tia) ! { dg-error "ALLOCATABLE" }
+  !$acc parallel present_or_copyin (tia)
   !$acc end parallel
   !$acc parallel deviceptr (i) present_or_copyin (i) ! { dg-error "multiple clauses" }
   !$acc end parallel
@@ -196,9 +196,9 @@ contains
   !$acc end data
 
 
-  !$acc parallel present_or_copyout (tip) ! { dg-error "POINTER" }
+  !$acc parallel present_or_copyout (tip)
   !$acc end parallel
-  !$acc parallel present_or_copyout (tia) ! { dg-error "ALLOCATABLE" }
+  !$acc parallel present_or_copyout (tia)
   !$acc end parallel
   !$acc parallel deviceptr (i) present_or_copyout (i) ! { dg-error "multiple clauses" }
   !$acc end parallel
@@ -225,9 +225,9 @@ contains
   !$acc end data
 
 
-  !$acc parallel present_or_create (tip) ! { dg-error "POINTER" }
+  !$acc parallel present_or_create (tip)
   !$acc end parallel
-  !$acc parallel present_or_create (tia) ! { dg-error "ALLOCATABLE" }
+  !$acc parallel present_or_create (tia)
   !$acc end parallel
   !$acc parallel deviceptr (i) present_or_create (i) ! { dg-error "multiple clauses" }
   !$acc end parallel
@@ -256,4 +256,4 @@ contains
   !$acc end data
 
   end subroutine foo
-end module test
\ No newline at end of file
+end module test
diff --git a/gcc/testsuite/gfortran.dg/goacc/derived-types.f90 b/gcc/testsuite/gfortran.dg/goacc/derived-types.f90
new file mode 100644
index 0000000..5fb2981
--- /dev/null
+++ b/gcc/testsuite/gfortran.dg/goacc/derived-types.f90
@@ -0,0 +1,77 @@
+! Test ACC UPDATE with derived types.
+
+module dt
+  integer, parameter :: n = 10
+  type inner
+     integer :: d(n)
+  end type inner
+  type dtype
+     integer(8) :: a, b, c(n)
+     type(inner) :: in
+  end type dtype
+end module dt
+
+program derived_acc
+  use dt
+  
+  implicit none
+  type(dtype):: var
+  integer i
+  !$acc declare create(var)
+  !$acc declare pcopy(var%a) ! { dg-error "Syntax error in OpenMP" }
+
+  !$acc update host(var)
+  !$acc update host(var%a)
+  !$acc update device(var)
+  !$acc update device(var%a)
+  !$acc update self(var)
+  !$acc update self(var%a)
+  
+  !$acc enter data copyin(var)
+  !$acc enter data copyin(var%a)
+
+  !$acc exit data copyout(var)
+  !$acc exit data copyout(var%a)
+
+  !$acc data copy(var)
+  !$acc end data
+
+  !$acc data copyout(var%a)
+  !$acc end data
+
+  !$acc parallel loop pcopyout(var)
+  do i = 1, 10
+  end do  
+  !$acc end parallel loop
+
+  !$acc parallel loop copyout(var%a)
+  do i = 1, 10
+  end do
+  !$acc end parallel loop
+
+  !$acc parallel pcopy(var)
+  !$acc end parallel
+
+  !$acc parallel pcopy(var%a)
+  do i = 1, 10
+  end do
+  !$acc end parallel
+  
+  !$acc kernels pcopyin(var)
+  !$acc end kernels
+
+  !$acc kernels pcopy(var%a)
+  do i = 1, 10
+  end do
+  !$acc end kernels
+
+  !$acc kernels loop pcopyin(var)
+  do i = 1, 10
+  end do
+  !$acc end kernels loop
+
+  !$acc kernels loop pcopy(var%a)
+  do i = 1, 10
+  end do
+  !$acc end kernels loop
+end program derived_acc
diff --git a/gcc/testsuite/gfortran.dg/goacc/enter-exit-data.f95 b/gcc/testsuite/gfortran.dg/goacc/enter-exit-data.f95
index 805459c..b616b39 100644
--- a/gcc/testsuite/gfortran.dg/goacc/enter-exit-data.f95
+++ b/gcc/testsuite/gfortran.dg/goacc/enter-exit-data.f95
@@ -44,14 +44,14 @@ contains
   !$acc enter data wait (i, 1) 
   !$acc enter data wait (a) ! { dg-error "INTEGER" }
   !$acc enter data wait (b(5:6)) ! { dg-error "INTEGER" }
-  !$acc enter data copyin (tip) ! { dg-error "POINTER" }
-  !$acc enter data copyin (tia) ! { dg-error "ALLOCATABLE" }
-  !$acc enter data create (tip) ! { dg-error "POINTER" }
-  !$acc enter data create (tia) ! { dg-error "ALLOCATABLE" }
-  !$acc enter data present_or_copyin (tip) ! { dg-error "POINTER" }
-  !$acc enter data present_or_copyin (tia) ! { dg-error "ALLOCATABLE" }
-  !$acc enter data present_or_create (tip) ! { dg-error "POINTER" }
-  !$acc enter data present_or_create (tia) ! { dg-error "ALLOCATABLE" }
+  !$acc enter data copyin (tip)
+  !$acc enter data copyin (tia)
+  !$acc enter data create (tip)
+  !$acc enter data create (tia)
+  !$acc enter data present_or_copyin (tip)
+  !$acc enter data present_or_copyin (tia)
+  !$acc enter data present_or_create (tip)
+  !$acc enter data present_or_create (tia)
   !$acc enter data copyin (i) create (i) ! { dg-error "multiple clauses" }
   !$acc enter data copyin (i) present_or_copyin (i) ! { dg-error "multiple clauses" }
   !$acc enter data create (i) present_or_copyin (i) ! { dg-error "multiple clauses" }
@@ -79,10 +79,10 @@ contains
   !$acc exit data wait (i, 1) 
   !$acc exit data wait (a) ! { dg-error "INTEGER" }
   !$acc exit data wait (b(5:6)) ! { dg-error "INTEGER" }
-  !$acc exit data copyout (tip) ! { dg-error "POINTER" }
-  !$acc exit data copyout (tia) ! { dg-error "ALLOCATABLE" }
-  !$acc exit data delete (tip) ! { dg-error "POINTER" }
-  !$acc exit data delete (tia) ! { dg-error "ALLOCATABLE" }
+  !$acc exit data copyout (tip)
+  !$acc exit data copyout (tia)
+  !$acc exit data delete (tip)
+  !$acc exit data delete (tia)
   !$acc exit data copyout (i) delete (i) ! { dg-error "multiple clauses" }
   !$acc exit data finalize
   !$acc exit data finalize copyout (i)
diff --git a/gcc/tree-pretty-print.c b/gcc/tree-pretty-print.c
index 0861cc9..c648da5 100644
--- a/gcc/tree-pretty-print.c
+++ b/gcc/tree-pretty-print.c
@@ -826,6 +826,15 @@ dump_omp_clause (pretty_printer *pp, tree clause, int spc, dump_flags_t flags)
 	case GOMP_MAP_LINK:
 	  pp_string (pp, "link");
 	  break;
+	case GOMP_MAP_ATTACH:
+	  pp_string (pp, "attach");
+	  break;
+	case GOMP_MAP_DETACH:
+	  pp_string (pp, "detach");
+	  break;
+	case GOMP_MAP_FORCE_DETACH:
+	  pp_string (pp, "force_detach");
+	  break;
 	default:
 	  gcc_unreachable ();
 	}
diff --git a/include/gomp-constants.h b/include/gomp-constants.h
index acd2585..b7ce640 100644
--- a/include/gomp-constants.h
+++ b/include/gomp-constants.h
@@ -42,6 +42,7 @@
 #define GOMP_MAP_FLAG_SPECIAL_2		(1 << 4)
 #define GOMP_MAP_FLAG_SPECIAL		(GOMP_MAP_FLAG_SPECIAL_1 \
 					 | GOMP_MAP_FLAG_SPECIAL_0)
+#define GOMP_MAP_DEEP_COPY		(1 << 5)
 /* Flag to force a specific behavior (or else, trigger a run-time error).  */
 #define GOMP_MAP_FLAG_FORCE		(1 << 7)
 
@@ -128,6 +129,13 @@ enum gomp_map_kind
     /* Decrement usage count and deallocate if zero.  */
     GOMP_MAP_RELEASE =			(GOMP_MAP_FLAG_SPECIAL_2
 					 | GOMP_MAP_DELETE),
+    /* In OpenACC, attach a pointer to a mapped struct field.  */
+    GOMP_MAP_ATTACH =			(GOMP_MAP_DEEP_COPY | 0),
+    /* In OpenACC, detach a pointer to a mapped struct field.  */
+    GOMP_MAP_DETACH =			(GOMP_MAP_DEEP_COPY | 1),
+    /* In OpenACC, detach a pointer to a mapped struct field.  */
+    GOMP_MAP_FORCE_DETACH =		(GOMP_MAP_DEEP_COPY
+					 | GOMP_MAP_FLAG_FORCE | 1),
 
     /* Internal to GCC, not used in libgomp.  */
     /* Do not map, but pointer assign a pointer instead.  */
diff --git a/libgomp/libgomp.h b/libgomp/libgomp.h
index 17aee87..3e1aebb 100644
--- a/libgomp/libgomp.h
+++ b/libgomp/libgomp.h
@@ -866,6 +866,8 @@ struct target_var_desc {
   bool copy_from;
   /* True if data always should be copied from device to host at the end.  */
   bool always_copy_from;
+  /* True if variable should be detached at end of region.  */
+  bool do_detach;
   /* Relative offset against key host_start.  */
   uintptr_t offset;
   /* Actual length.  */
@@ -902,6 +904,10 @@ struct target_mem_desc {
    artificial pointer to "omp declare target link" object.  */
 #define REFCOUNT_LINK (~(uintptr_t) 1)
 
+/* A special tag value for "virtual_refcount" in the splay_tree_key_s structure
+   below.  */
+#define VREFCOUNT_LINK_KEY (~(uintptr_t) 0)
+
 /* Special offset values.  */
 #define OFFSET_INLINED (~(uintptr_t) 0)
 #define OFFSET_POINTER (~(uintptr_t) 1)
@@ -918,10 +924,21 @@ struct splay_tree_key_s {
   uintptr_t tgt_offset;
   /* Reference count.  */
   uintptr_t refcount;
-  /* Dynamic reference count.  */
-  uintptr_t dynamic_refcount;
-  /* Pointer to the original mapping of "omp declare target link" object.  */
-  splay_tree_key link_key;
+  /* Reference counts beyond those that represent genuine references in the
+     linked splay tree key/target memory structures, e.g. for multiple OpenACC
+     "present increment" operations (via "acc enter data") referring to the same
+     host-memory block.
+     If set to VREFCOUNT_LINK_KEY (for OpenMP, where this field is not otherwise
+     needed), the union below represents a link key.  */
+  uintptr_t virtual_refcount;
+  union {
+    /* For a block with attached pointers, the attachment counters for each.
+       Only used for OpenACC.  */
+    uintptr_t *attach_count;
+    /* Pointer to the original mapping of "omp declare target link" object.
+       Only used for OpenMP.  */
+    splay_tree_key link_key;
+  } u;
 };
 
 /* The comparison function.  */
@@ -943,13 +960,6 @@ splay_compare (splay_tree_key x, splay_tree_key y)
 
 typedef struct acc_dispatch_t
 {
-  /* This is a linked list of data mapped using the
-     acc_map_data/acc_unmap_data or "acc enter data"/"acc exit data" pragmas.
-     Unlike mapped_data in the goacc_thread struct, unmapping can
-     happen out-of-order with respect to mapping.  */
-  /* This is guarded by the lock in the "outer" struct gomp_device_descr.  */
-  struct target_mem_desc *data_environ;
-
   /* Execute.  */
   __typeof (GOMP_OFFLOAD_openacc_exec) *exec_func;
 
@@ -1056,13 +1066,17 @@ struct gomp_device_descr
 enum gomp_map_vars_kind
 {
   GOMP_MAP_VARS_OPENACC,
+  GOMP_MAP_VARS_OPENACC_ENTER_DATA,
   GOMP_MAP_VARS_TARGET,
   GOMP_MAP_VARS_DATA,
   GOMP_MAP_VARS_ENTER_DATA
 };
 
-extern void gomp_acc_insert_pointer (size_t, void **, size_t *, void *, int);
-extern void gomp_acc_remove_pointer (void *, size_t, bool, int, int, int);
+struct gomp_coalesce_buf;
+
+extern void gomp_acc_remove_pointer (struct gomp_device_descr *, void **,
+				     size_t *, unsigned short *, int, bool,
+				     int);
 extern void gomp_acc_declare_allocate (bool, size_t, void **, size_t *,
 				       unsigned short *);
 struct gomp_coalesce_buf;
@@ -1072,6 +1086,14 @@ extern void gomp_copy_host2dev (struct gomp_device_descr *,
 extern void gomp_copy_dev2host (struct gomp_device_descr *,
 				struct goacc_asyncqueue *, void *, const void *,
 				size_t);
+extern uintptr_t gomp_map_val (struct target_mem_desc *, void **, size_t);
+extern void gomp_attach_pointer (struct gomp_device_descr *,
+				 struct goacc_asyncqueue *, splay_tree,
+				 splay_tree_key, uintptr_t, size_t,
+				 struct gomp_coalesce_buf *);
+extern void gomp_detach_pointer (struct gomp_device_descr *,
+				 struct goacc_asyncqueue *, splay_tree_key,
+				 uintptr_t, bool, struct gomp_coalesce_buf *);
 
 extern struct target_mem_desc *gomp_map_vars (struct gomp_device_descr *,
 					      size_t, void **, void **,
diff --git a/libgomp/libgomp.map b/libgomp/libgomp.map
index ba9218b..a086dd2 100644
--- a/libgomp/libgomp.map
+++ b/libgomp/libgomp.map
@@ -480,6 +480,16 @@ OACC_2.5 {
 	acc_update_self_async_array_h_;
 } OACC_2.0.1;
 
+OACC_2.6 {
+  global:
+	acc_attach;
+	acc_attach_async;
+	acc_detach;
+	acc_detach_async;
+	acc_detach_finalize;
+	acc_detach_finalize_async;
+} OACC_2.5;
+
 GOACC_2.0 {
   global:
 	GOACC_data_end;
diff --git a/libgomp/oacc-async.c b/libgomp/oacc-async.c
index 68aaf19..077e28f 100644
--- a/libgomp/oacc-async.c
+++ b/libgomp/oacc-async.c
@@ -263,6 +263,24 @@ goacc_async_copyout_unmap_vars (struct target_mem_desc *tgt,
 					      (void *) tgt);
 }
 
+/* Remove a variable asynchronously.  This actually removes the variable
+   mapping immediately, but retains the linked target_mem_desc until the
+   asynchronous operation has completed (as it may still refer to target
+   memory).  The device lock must be held before entry, and remains locked on
+   exit.  */
+
+attribute_hidden void
+goacc_remove_var_async (struct gomp_device_descr *devicep, splay_tree_key n,
+                       struct goacc_asyncqueue *aq)
+{
+  struct target_mem_desc *tgt = n->tgt;
+  assert (tgt);
+  tgt->refcount++;
+  gomp_remove_var (devicep, n);
+  devicep->openacc.async.queue_callback_func (aq, goacc_async_unmap_tgt,
+                                             (void *) tgt);
+}
+
 attribute_hidden void
 goacc_async_free (struct gomp_device_descr *devicep,
 		  struct goacc_asyncqueue *aq, void *ptr)
diff --git a/libgomp/oacc-host.c b/libgomp/oacc-host.c
index 53658c8..6ab77a9 100644
--- a/libgomp/oacc-host.c
+++ b/libgomp/oacc-host.c
@@ -262,8 +262,6 @@ static struct gomp_device_descr host_dispatch =
     .state = GOMP_DEVICE_UNINITIALIZED,
 
     .openacc = {
-      .data_environ = NULL,
-
       .exec_func = host_openacc_exec,
 
       .async = {
diff --git a/libgomp/oacc-init.c b/libgomp/oacc-init.c
index 823e20e..f644d15 100644
--- a/libgomp/oacc-init.c
+++ b/libgomp/oacc-init.c
@@ -302,9 +302,13 @@ acc_shutdown_1 (acc_device_t d)
 
       if (walk->dev)
 	{
-	  gomp_mutex_lock (&walk->dev->lock);
-	  gomp_free_memmap (&walk->dev->mem_map);
-	  gomp_mutex_unlock (&walk->dev->lock);
+	  while (walk->dev->mem_map.root)
+	    {
+	      splay_tree_key k = &walk->dev->mem_map.root->key;
+	      if (k->virtual_refcount == VREFCOUNT_LINK_KEY)
+		k->u.link_key = NULL;
+	      gomp_remove_var (walk->dev, k);
+	    }
 
 	  walk->dev = NULL;
 	  walk->base_dev = NULL;
diff --git a/libgomp/oacc-int.h b/libgomp/oacc-int.h
index 3354eb6..2e4045e 100644
--- a/libgomp/oacc-int.h
+++ b/libgomp/oacc-int.h
@@ -102,10 +102,15 @@ void goacc_restore_bind (void);
 void goacc_lazy_initialize (void);
 void goacc_host_init (void);
 
+struct splay_tree_key_s;
+
 void goacc_init_asyncqueues (struct gomp_device_descr *);
 bool goacc_fini_asyncqueues (struct gomp_device_descr *);
 void goacc_async_copyout_unmap_vars (struct target_mem_desc *,
 				     struct goacc_asyncqueue *);
+void goacc_remove_var_async (struct gomp_device_descr *devicep,
+			     struct splay_tree_key_s *n,
+			     struct goacc_asyncqueue *aq);
 void goacc_async_free (struct gomp_device_descr *,
 		       struct goacc_asyncqueue *, void *);
 struct goacc_asyncqueue *get_goacc_asyncqueue (int);
diff --git a/libgomp/oacc-mem.c b/libgomp/oacc-mem.c
index 050eb0d..eace3db 100644
--- a/libgomp/oacc-mem.c
+++ b/libgomp/oacc-mem.c
@@ -52,6 +52,25 @@ lookup_host (struct gomp_device_descr *dev, void *h, size_t s)
   return key;
 }
 
+/* Helper for lookup_dev.  Iterate over splay tree.  */
+
+static splay_tree_key
+lookup_dev_1 (splay_tree_node node, uintptr_t d, size_t s)
+{
+  splay_tree_key k = &node->key;
+  struct target_mem_desc *t = k->tgt;
+
+  if (d >= t->tgt_start && d + s <= t->tgt_end)
+    return k;
+
+  if (node->left)
+    return lookup_dev_1 (node->left, d, s);
+  if (node->right)
+    return lookup_dev_1 (node->right, d, s);
+
+  return NULL;
+}
+
 /* Return block containing [D->S), or NULL if not contained.
    The list isn't ordered by device address, so we have to iterate
    over the whole array.  This is not expected to be a common
@@ -59,35 +78,12 @@ lookup_host (struct gomp_device_descr *dev, void *h, size_t s)
    remains locked on exit.  */
 
 static splay_tree_key
-lookup_dev (struct target_mem_desc *tgt, void *d, size_t s)
+lookup_dev (splay_tree mem_map, void *d, size_t s)
 {
-  int i;
-  struct target_mem_desc *t;
-
-  if (!tgt)
-    return NULL;
-
-  for (t = tgt; t != NULL; t = t->prev)
-    {
-      if (t->tgt_start <= (uintptr_t) d && t->tgt_end >= (uintptr_t) d + s)
-        break;
-    }
-
-  if (!t)
+  if (!mem_map || !mem_map->root)
     return NULL;
 
-  for (i = 0; i < t->list_count; i++)
-    {
-      void * offset;
-
-      splay_tree_key k = &t->array[i].key;
-      offset = d - t->tgt_start + k->tgt_offset;
-
-      if (k->host_start + offset <= (void *) k->host_end)
-        return k;
-    }
-
-  return NULL;
+  return lookup_dev_1 (mem_map->root, (uintptr_t) d, s);
 }
 
 /* OpenACC is silent on how memory exhaustion is indicated.  We return
@@ -136,7 +132,7 @@ acc_free (void *d)
   /* We don't have to call lazy open here, as the ptr value must have
      been returned by acc_malloc.  It's not permitted to pass NULL in
      (unless you got that null from acc_malloc).  */
-  if ((k = lookup_dev (acc_dev->openacc.data_environ, d, 1)))
+  if ((k = lookup_dev (&acc_dev->mem_map, d, 1)))
     {
       void *offset;
 
@@ -260,7 +256,7 @@ acc_hostptr (void *d)
 
   gomp_mutex_lock (&acc_dev->lock);
 
-  n = lookup_dev (acc_dev->openacc.data_environ, d, 1);
+  n = lookup_dev (&acc_dev->mem_map, d, 1);
 
   if (!n)
     {
@@ -348,7 +344,7 @@ acc_map_data (void *h, void *d, size_t s)
 		      (int)s);
 	}
 
-      if (lookup_dev (thr->dev->openacc.data_environ, d, s))
+      if (lookup_dev (&thr->dev->mem_map, d, s))
         {
 	  gomp_mutex_unlock (&acc_dev->lock);
 	  gomp_fatal ("device address [%p, +%d] is already mapped", (void *)d,
@@ -361,11 +357,6 @@ acc_map_data (void *h, void *d, size_t s)
 			   &kinds, true, GOMP_MAP_VARS_OPENACC);
       tgt->list[0].key->refcount = REFCOUNT_INFINITY;
     }
-
-  gomp_mutex_lock (&acc_dev->lock);
-  tgt->prev = acc_dev->openacc.data_environ;
-  acc_dev->openacc.data_environ = tgt;
-  gomp_mutex_unlock (&acc_dev->lock);
 }
 
 void
@@ -373,6 +364,7 @@ acc_unmap_data (void *h)
 {
   struct goacc_thread *thr = goacc_thread ();
   struct gomp_device_descr *acc_dev = thr->dev;
+  struct splay_tree_key_s cur_node;
 
   /* No need to call lazy open, as the address must have been mapped.  */
 
@@ -380,12 +372,11 @@ acc_unmap_data (void *h)
   if (acc_dev->capabilities & GOMP_OFFLOAD_CAP_SHARED_MEM)
     return;
 
-  size_t host_size;
-
   gomp_mutex_lock (&acc_dev->lock);
 
-  splay_tree_key n = lookup_host (acc_dev, h, 1);
-  struct target_mem_desc *t;
+  cur_node.host_start = (uintptr_t) h;
+  cur_node.host_end = cur_node.host_start + 1;
+  splay_tree_key n = splay_tree_lookup (&acc_dev->mem_map, &cur_node);
 
   if (!n)
     {
@@ -393,46 +384,27 @@ acc_unmap_data (void *h)
       gomp_fatal ("%p is not a mapped block", (void *)h);
     }
 
-  host_size = n->host_end - n->host_start;
-
   if (n->host_start != (uintptr_t) h)
     {
+      size_t host_size = n->host_end - n->host_start;
       gomp_mutex_unlock (&acc_dev->lock);
       gomp_fatal ("[%p,%d] surrounds %p",
 		  (void *) n->host_start, (int) host_size, (void *) h);
     }
 
-  /* Mark for removal.  */
-  n->refcount = 1;
+  splay_tree_remove (&acc_dev->mem_map, n);
 
-  t = n->tgt;
+  struct target_mem_desc *tgt = n->tgt;
 
-  if (t->refcount == 2)
+  if (tgt->refcount > 0)
+    tgt->refcount--;
+  else
     {
-      struct target_mem_desc *tp;
-
-      /* This is the last reference, so pull the descriptor off the
-         chain. This avoids gomp_unmap_vars via gomp_unmap_tgt from
-         freeing the device memory. */
-      t->tgt_end = 0;
-      t->to_free = 0;
-
-      for (tp = NULL, t = acc_dev->openacc.data_environ; t != NULL;
-	   tp = t, t = t->prev)
-	if (n->tgt == t)
-	  {
-	    if (tp)
-	      tp->prev = t->prev;
-	    else
-	      acc_dev->openacc.data_environ = t->prev;
-
-	    break;
-	  }
+      free (tgt->array);
+      free (tgt);
     }
 
   gomp_mutex_unlock (&acc_dev->lock);
-
-  gomp_unmap_vars (t, true);
 }
 
 #define FLAG_PRESENT (1 << 0)
@@ -476,11 +448,14 @@ present_create_copy (unsigned f, void *h, size_t s, int async)
 	  gomp_fatal ("[%p,+%d] not mapped", (void *)h, (int)s);
 	}
 
+      assert (n->virtual_refcount != VREFCOUNT_LINK_KEY);
+
       if (n->refcount != REFCOUNT_INFINITY)
 	{
 	  n->refcount++;
-	  n->dynamic_refcount++;
+	  n->virtual_refcount++;
 	}
+
       gomp_mutex_unlock (&acc_dev->lock);
     }
   else if (!(f & FLAG_CREATE))
@@ -490,7 +465,6 @@ present_create_copy (unsigned f, void *h, size_t s, int async)
     }
   else
     {
-      struct target_mem_desc *tgt;
       size_t mapnum = 1;
       unsigned short kinds;
       void *hostaddrs = h;
@@ -504,18 +478,14 @@ present_create_copy (unsigned f, void *h, size_t s, int async)
 
       goacc_aq aq = get_goacc_asyncqueue (async);
 
-      tgt = gomp_map_vars_async (acc_dev, aq, mapnum, &hostaddrs, NULL, &s,
-				 &kinds, true, GOMP_MAP_VARS_OPENACC);
-
-      /* Initialize dynamic refcount.  */
-      tgt->list[0].key->dynamic_refcount = 1;
+      gomp_map_vars_async (acc_dev, aq, mapnum, &hostaddrs, NULL, &s, &kinds,
+			   true, GOMP_MAP_VARS_OPENACC_ENTER_DATA);
 
       gomp_mutex_lock (&acc_dev->lock);
-
-      d = tgt->to_free;
-      tgt->prev = acc_dev->openacc.data_environ;
-      acc_dev->openacc.data_environ = tgt;
-
+      n = lookup_host (acc_dev, h, s);
+      assert (n != NULL);
+      d = (void *) (n->tgt->tgt_start + n->tgt_offset + (uintptr_t) h
+		    - n->host_start);
       gomp_mutex_unlock (&acc_dev->lock);
     }
 
@@ -590,7 +560,6 @@ delete_copyout (unsigned f, void *h, size_t s, int async, const char *libfnname)
 {
   size_t host_size;
   splay_tree_key n;
-  void *d;
   struct goacc_thread *thr = goacc_thread ();
   struct gomp_device_descr *acc_dev = thr->dev;
 
@@ -610,8 +579,7 @@ delete_copyout (unsigned f, void *h, size_t s, int async, const char *libfnname)
       gomp_fatal ("[%p,%d] is not mapped", (void *)h, (int)s);
     }
 
-  d = (void *) (n->tgt->tgt_start + n->tgt_offset
-		+ (uintptr_t) h - n->host_start);
+  assert (n->virtual_refcount != VREFCOUNT_LINK_KEY);
 
   host_size = n->host_end - n->host_start;
 
@@ -625,48 +593,37 @@ delete_copyout (unsigned f, void *h, size_t s, int async, const char *libfnname)
   if (n->refcount == REFCOUNT_INFINITY)
     {
       n->refcount = 0;
-      n->dynamic_refcount = 0;
-    }
-  if (n->refcount < n->dynamic_refcount)
-    {
-      gomp_mutex_unlock (&acc_dev->lock);
-      gomp_fatal ("Dynamic reference counting assert fail\n");
+      n->virtual_refcount = 0;
     }
 
   if (f & FLAG_FINALIZE)
     {
-      n->refcount -= n->dynamic_refcount;
-      n->dynamic_refcount = 0;
+      n->refcount -= n->virtual_refcount;
+      n->virtual_refcount = 0;
     }
-  else if (n->dynamic_refcount)
+
+  if (n->virtual_refcount > 0)
     {
-      n->dynamic_refcount--;
       n->refcount--;
+      n->virtual_refcount--;
     }
+  else if (n->refcount > 0)
+    n->refcount--;
 
   if (n->refcount == 0)
     {
-      if (n->tgt->refcount == 2)
-	{
-	  struct target_mem_desc *tp, *t;
-	  for (tp = NULL, t = acc_dev->openacc.data_environ; t != NULL;
-	       tp = t, t = t->prev)
-	    if (n->tgt == t)
-	      {
-		if (tp)
-		  tp->prev = t->prev;
-		else
-		  acc_dev->openacc.data_environ = t->prev;
-		break;
-	      }
-	}
+      goacc_aq aq = get_goacc_asyncqueue (async);
 
       if (f & FLAG_COPYOUT)
-	{
-	  goacc_aq aq = get_goacc_asyncqueue (async);
+        {
+	  void *d = (void *) (n->tgt->tgt_start + n->tgt_offset
+			      + (uintptr_t) h - n->host_start);
 	  gomp_copy_dev2host (acc_dev, aq, h, d, s);
 	}
-      gomp_remove_var (acc_dev, n);
+      if (aq)
+	goacc_remove_var_async (acc_dev, n, aq);
+      else
+	gomp_remove_var (acc_dev, n);
     }
 
   gomp_mutex_unlock (&acc_dev->lock);
@@ -783,140 +740,161 @@ acc_update_self_async (void *h, size_t s, int async)
 }
 
 void
-gomp_acc_insert_pointer (size_t mapnum, void **hostaddrs, size_t *sizes,
-			 void *kinds, int async)
+gomp_acc_remove_pointer (struct gomp_device_descr *acc_dev, void **hostaddrs,
+			 size_t *sizes, unsigned short *kinds, int async,
+			 bool finalize, int mapnum)
 {
-  struct target_mem_desc *tgt;
-  struct goacc_thread *thr = goacc_thread ();
-  struct gomp_device_descr *acc_dev = thr->dev;
+  struct splay_tree_key_s cur_node;
+  splay_tree_key n;
+
+  gomp_mutex_lock (&acc_dev->lock);
 
-  if (acc_is_present (*hostaddrs, *sizes))
+  for (int i = 0; i < mapnum; i++)
     {
-      splay_tree_key n;
-      gomp_mutex_lock (&acc_dev->lock);
-      n = lookup_host (acc_dev, *hostaddrs, *sizes);
-      gomp_mutex_unlock (&acc_dev->lock);
+      int kind = kinds[i] & 0xff;
+      bool copyfrom = false;
 
-      tgt = n->tgt;
-      for (size_t i = 0; i < tgt->list_count; i++)
-	if (tgt->list[i].key == n)
-	  {
-	    for (size_t j = 0; j < mapnum; j++)
-	      if (i + j < tgt->list_count && tgt->list[i + j].key)
-		{
-		  tgt->list[i + j].key->refcount++;
-		  tgt->list[i + j].key->dynamic_refcount++;
-		}
-	    return;
-	  }
-      /* Should not reach here.  */
-      gomp_fatal ("Dynamic refcount incrementing failed for pointer/pset");
-    }
+      switch (kind)
+        {
+	case GOMP_MAP_FROM:
+	case GOMP_MAP_FORCE_FROM:
+	case GOMP_MAP_ALWAYS_FROM:
+	  copyfrom = true;
+	  /* Fallthrough.  */
+
+	case GOMP_MAP_TO_PSET:
+	case GOMP_MAP_POINTER:
+	case GOMP_MAP_DELETE:
+	case GOMP_MAP_RELEASE:
+	case GOMP_MAP_DETACH:
+	case GOMP_MAP_FORCE_DETACH:
+	  cur_node.host_start = (uintptr_t) hostaddrs[i];
+	  cur_node.host_end = cur_node.host_start
+			      + ((kind == GOMP_MAP_DETACH
+				  || kind == GOMP_MAP_FORCE_DETACH
+				  || kind == GOMP_MAP_POINTER)
+				 ? sizeof (void *) : sizes[i]);
+	  n = splay_tree_lookup (&acc_dev->mem_map, &cur_node);
+
+	  if (n == NULL)
+	    continue;
+
+	  assert (n->virtual_refcount != VREFCOUNT_LINK_KEY);
+
+	  if (n->refcount == REFCOUNT_INFINITY)
+	    {
+	      n->refcount = 1;
+	      n->virtual_refcount = 0;
+	    }
 
-  gomp_debug (0, "  %s: prepare mappings\n", __FUNCTION__);
-  goacc_aq aq = get_goacc_asyncqueue (async);
-  tgt = gomp_map_vars_async (acc_dev, aq, mapnum, hostaddrs,
-			     NULL, sizes, kinds, true, GOMP_MAP_VARS_OPENACC);
-  gomp_debug (0, "  %s: mappings prepared\n", __FUNCTION__);
+	  if (finalize)
+	    {
+	      n->refcount -= n->virtual_refcount;
+	      n->virtual_refcount = 0;
+	    }
 
-  /* Initialize dynamic refcount.  */
-  tgt->list[0].key->dynamic_refcount = 1;
+	  if (n->virtual_refcount > 0)
+	    {
+	      n->refcount--;
+	      n->virtual_refcount--;
+	    }
+	  else if (n->refcount > 0)
+	    n->refcount--;
+
+	  if (copyfrom)
+	    gomp_copy_dev2host (acc_dev, NULL, (void *) cur_node.host_start,
+				(void *) (n->tgt->tgt_start + n->tgt_offset
+					  + cur_node.host_start
+					  - n->host_start),
+				cur_node.host_end - cur_node.host_start);
+
+	  if (n->refcount == 0)
+	    gomp_remove_var (acc_dev, n);
+	  break;
+
+	default:
+	  gomp_mutex_unlock (&acc_dev->lock);
+	  gomp_fatal ("gomp_acc_remove_pointer unhandled kind 0x%.2x",
+		      kind);
+	}
+    }
 
-  gomp_mutex_lock (&acc_dev->lock);
-  tgt->prev = acc_dev->openacc.data_environ;
-  acc_dev->openacc.data_environ = tgt;
   gomp_mutex_unlock (&acc_dev->lock);
 }
 
+
 void
-gomp_acc_remove_pointer (void *h, size_t s, bool force_copyfrom, int async,
-			 int finalize, int mapnum)
+acc_attach_async (void **hostaddr, int async)
 {
   struct goacc_thread *thr = goacc_thread ();
   struct gomp_device_descr *acc_dev = thr->dev;
+  goacc_aq aq = get_goacc_asyncqueue (async);
+
+  struct splay_tree_key_s cur_node;
   splay_tree_key n;
-  struct target_mem_desc *t;
-  int minrefs = (mapnum == 1) ? 2 : 3;
 
-  if (!acc_is_present (h, s))
+  if (thr->dev->capabilities & GOMP_OFFLOAD_CAP_SHARED_MEM)
     return;
 
-  gomp_mutex_lock (&acc_dev->lock);
+  cur_node.host_start = (uintptr_t) hostaddr;
+  cur_node.host_end = cur_node.host_start + sizeof (void *);
+  n = splay_tree_lookup (&acc_dev->mem_map, &cur_node);
 
-  n = lookup_host (acc_dev, h, 1);
+  if (n == NULL)
+    gomp_fatal ("struct not mapped for acc_attach");
 
-  if (!n)
-    {
-      gomp_mutex_unlock (&acc_dev->lock);
-      gomp_fatal ("%p is not a mapped block", (void *)h);
-    }
+  gomp_attach_pointer (acc_dev, aq, &acc_dev->mem_map, n, (uintptr_t) hostaddr,
+		       0, NULL);
+}
 
-  gomp_debug (0, "  %s: restore mappings\n", __FUNCTION__);
+void
+acc_attach (void **hostaddr)
+{
+  acc_attach_async (hostaddr, acc_async_sync);
+}
 
-  t = n->tgt;
+static void
+goacc_detach_internal (void **hostaddr, int async, bool finalize)
+{
+  struct goacc_thread *thr = goacc_thread ();
+  struct gomp_device_descr *acc_dev = thr->dev;
+  struct splay_tree_key_s cur_node;
+  splay_tree_key n;
+  struct goacc_asyncqueue *aq = get_goacc_asyncqueue (async);
 
-  if (n->refcount < n->dynamic_refcount)
-    {
-      gomp_mutex_unlock (&acc_dev->lock);
-      gomp_fatal ("Dynamic reference counting assert fail\n");
-    }
+  if (thr->dev->capabilities & GOMP_OFFLOAD_CAP_SHARED_MEM)
+    return;
 
-  if (finalize)
-    {
-      n->refcount -= n->dynamic_refcount;
-      n->dynamic_refcount = 0;
-    }
-  else if (n->dynamic_refcount)
-    {
-      n->dynamic_refcount--;
-      n->refcount--;
-    }
+  cur_node.host_start = (uintptr_t) hostaddr;
+  cur_node.host_end = cur_node.host_start + sizeof (void *);
+  n = splay_tree_lookup (&acc_dev->mem_map, &cur_node);
 
-  gomp_mutex_unlock (&acc_dev->lock);
+  if (n == NULL)
+    gomp_fatal ("struct not mapped for acc_detach");
 
-  if (n->refcount == 0)
-    {
-      if (t->refcount == minrefs)
-	{
-	  /* This is the last reference, so pull the descriptor off the
-	     chain. This prevents gomp_unmap_vars via gomp_unmap_tgt from
-	     freeing the device memory. */
-	  struct target_mem_desc *tp;
-	  for (tp = NULL, t = acc_dev->openacc.data_environ; t != NULL;
-	       tp = t, t = t->prev)
-	    {
-	      if (n->tgt == t)
-		{
-		  if (tp)
-		    tp->prev = t->prev;
-		  else
-		    acc_dev->openacc.data_environ = t->prev;
-		  break;
-		}
-	    }
-	}
+  gomp_detach_pointer (acc_dev, aq, n, (uintptr_t) hostaddr, finalize, NULL);
+}
 
-      /* Set refcount to 1 to allow gomp_unmap_vars to unmap it.  */
-      n->refcount = 1;
-      t->refcount = minrefs;
-      for (size_t i = 0; i < t->list_count; i++)
-	if (t->list[i].key == n)
-	  {
-	    t->list[i].copy_from = force_copyfrom ? 1 : 0;
-	    break;
-	  }
-
-      /* If running synchronously, unmap immediately.  */
-      if (async < acc_async_noval)
-	gomp_unmap_vars (t, true);
-      else
-	{
-	  goacc_aq aq = get_goacc_asyncqueue (async);
-	  goacc_async_copyout_unmap_vars (t, aq);
-	}
-    }
+void
+acc_detach (void **hostaddr)
+{
+  goacc_detach_internal (hostaddr, acc_async_sync, false);
+}
 
-  gomp_mutex_unlock (&acc_dev->lock);
+void
+acc_detach_async (void **hostaddr, int async)
+{
+  goacc_detach_internal (hostaddr, async, false);
+}
 
-  gomp_debug (0, "  %s: mappings restored\n", __FUNCTION__);
+void
+acc_detach_finalize (void **hostaddr)
+{
+  goacc_detach_internal (hostaddr, acc_async_sync, true);
+}
+
+void
+acc_detach_finalize_async (void **hostaddr, int async)
+{
+  goacc_detach_internal (hostaddr, async, true);
 }
diff --git a/libgomp/oacc-parallel.c b/libgomp/oacc-parallel.c
index 251154e..08f54cf 100644
--- a/libgomp/oacc-parallel.c
+++ b/libgomp/oacc-parallel.c
@@ -47,12 +47,32 @@ find_pointer (int pos, size_t mapnum, unsigned short *kinds)
   if (pos + 1 >= mapnum)
     return 0;
 
-  unsigned char kind = kinds[pos+1] & 0xff;
+  unsigned char kind0 = kinds[pos] & 0xff;
 
-  if (kind == GOMP_MAP_TO_PSET)
-    return 3;
-  else if (kind == GOMP_MAP_POINTER)
-    return 2;
+  switch (kind0)
+    {
+    case GOMP_MAP_TO:
+    case GOMP_MAP_FORCE_TO:
+    case GOMP_MAP_FROM:
+    case GOMP_MAP_FORCE_FROM:
+    case GOMP_MAP_TOFROM:
+    case GOMP_MAP_FORCE_TOFROM:
+    case GOMP_MAP_ALLOC:
+    case GOMP_MAP_RELEASE:
+      {
+	unsigned char kind1 = kinds[pos + 1] & 0xff;
+	if (kind1 == GOMP_MAP_POINTER
+	    || kind1 == GOMP_MAP_ALWAYS_POINTER
+	    || kind1 == GOMP_MAP_ATTACH
+	    || kind1 == GOMP_MAP_DETACH
+	    || kind1 == GOMP_MAP_FORCE_DETACH)
+	  return 2;
+	else if (kind1 == GOMP_MAP_TO_PSET)
+	  return 3;
+      }
+    default:
+      /* empty.  */;
+    }
 
   return 0;
 }
@@ -235,9 +255,8 @@ GOACC_parallel_keyed (int device, void (*fn) (void *),
 
   devaddrs = gomp_alloca (sizeof (void *) * mapnum);
   for (i = 0; i < mapnum; i++)
-    devaddrs[i] = (void *) (tgt->list[i].key->tgt->tgt_start
-			    + tgt->list[i].key->tgt_offset
-			    + tgt->list[i].offset);
+    devaddrs[i] = (void *) gomp_map_val (tgt, hostaddrs, i);
+
   if (aq == NULL)
     {
       acc_dev->openacc.exec_func (tgt_fn, mapnum, hostaddrs, devaddrs,
@@ -356,6 +375,10 @@ GOACC_enter_exit_data (int device, size_t mapnum,
   if (mapnum > 0)
     {
       unsigned char kind = kinds[0] & 0xff;
+
+      if (kind == GOMP_MAP_STRUCT || kind == GOMP_MAP_FORCE_PRESENT)
+        kind = kinds[1] & 0xff;
+
       if (kind == GOMP_MAP_DELETE
 	  || kind == GOMP_MAP_FORCE_FROM)
 	finalize = true;
@@ -366,11 +389,14 @@ GOACC_enter_exit_data (int device, size_t mapnum,
     {
       unsigned char kind = kinds[i] & 0xff;
 
-      if (kind == GOMP_MAP_POINTER || kind == GOMP_MAP_TO_PSET)
+      if (kind == GOMP_MAP_POINTER
+	  || kind == GOMP_MAP_TO_PSET
+	  || kind == GOMP_MAP_STRUCT
+	  || kind == GOMP_MAP_FORCE_PRESENT)
 	continue;
 
       if (kind == GOMP_MAP_FORCE_ALLOC
-	  || kind == GOMP_MAP_FORCE_PRESENT
+	  || kind == GOMP_MAP_ATTACH
 	  || kind == GOMP_MAP_FORCE_TO
 	  || kind == GOMP_MAP_TO
 	  || kind == GOMP_MAP_ALLOC)
@@ -381,6 +407,8 @@ GOACC_enter_exit_data (int device, size_t mapnum,
 
       if (kind == GOMP_MAP_RELEASE
 	  || kind == GOMP_MAP_DELETE
+	  || kind == GOMP_MAP_DETACH
+	  || kind == GOMP_MAP_FORCE_DETACH
 	  || kind == GOMP_MAP_FROM
 	  || kind == GOMP_MAP_FORCE_FROM)
 	break;
@@ -414,6 +442,9 @@ GOACC_enter_exit_data (int device, size_t mapnum,
 		case GOMP_MAP_ALLOC:
 		  acc_present_or_create (hostaddrs[i], sizes[i]);
 		  break;
+		case GOMP_MAP_ATTACH:
+		case GOMP_MAP_FORCE_PRESENT:
+		  break;
 		case GOMP_MAP_FORCE_ALLOC:
 		  acc_create (hostaddrs[i], sizes[i]);
 		  break;
@@ -423,6 +454,16 @@ GOACC_enter_exit_data (int device, size_t mapnum,
 		case GOMP_MAP_FORCE_TO:
 		  acc_copyin (hostaddrs[i], sizes[i]);
 		  break;
+		case GOMP_MAP_STRUCT:
+		  {
+		    int elems = sizes[i];
+		    goacc_aq aq = get_goacc_asyncqueue (async);
+		    gomp_map_vars_async (acc_dev, aq, elems + 1, &hostaddrs[i],
+					 NULL, &sizes[i], &kinds[i], true,
+					 GOMP_MAP_VARS_OPENACC_ENTER_DATA);
+		    i += elems;
+		  }
+		  break;
 		default:
 		  gomp_fatal (">>>> GOACC_enter_exit_data UNHANDLED kind 0x%.2x",
 			      kind);
@@ -431,8 +472,14 @@ GOACC_enter_exit_data (int device, size_t mapnum,
 	    }
 	  else
 	    {
-	      gomp_acc_insert_pointer (pointer, &hostaddrs[i],
-				       &sizes[i], &kinds[i], async);
+	      goacc_aq aq = get_goacc_asyncqueue (async);
+	      for (int j = 0; j < 2; j++)
+		gomp_map_vars_async (acc_dev, aq,
+				     (j == 0 || pointer == 2) ? 1 : 2,
+				     &hostaddrs[i + j], NULL,
+				     &sizes[i + j], &kinds[i + j], true,
+				     GOMP_MAP_VARS_OPENACC_ENTER_DATA);
+
 	      /* Increment 'i' by two because OpenACC requires fortran
 		 arrays to be contiguous, so each PSET is associated with
 		 one of MAP_FORCE_ALLOC/MAP_FORCE_PRESET/MAP_FORCE_TO, and
@@ -440,51 +487,145 @@ GOACC_enter_exit_data (int device, size_t mapnum,
 	      i += pointer - 1;
 	    }
 	}
+
+      /* This loop only handles explicit "attach" clauses that are not an
+	 implicit part of a copy{,in,out}, etc. mapping.  */
+      for (i = 0; i < mapnum; i++)
+        {
+	  unsigned char kind = kinds[i] & 0xff;
+
+	  /* Scan for pointers and PSETs.  */
+	  int pointer = find_pointer (i, mapnum, kinds);
+
+	  if (!pointer)
+	    {
+	      if (kind == GOMP_MAP_ATTACH)
+		acc_attach (hostaddrs[i]);
+	      else if (kind == GOMP_MAP_STRUCT)
+	        i += sizes[i];
+	    }
+	  else
+	    i += pointer - 1;
+	}
     }
   else
-    for (i = 0; i < mapnum; ++i)
-      {
-	unsigned char kind = kinds[i] & 0xff;
+    {
+      /* Handle "detach" before copyback/deletion of mapped data.  */
+      for (i = 0; i < mapnum; i++)
+        {
+	  unsigned char kind = kinds[i] & 0xff;
 
-	int pointer = find_pointer (i, mapnum, kinds);
+	  int pointer = find_pointer (i, mapnum, kinds);
 
-	if (!pointer)
-	  {
-	    switch (kind)
-	      {
-	      case GOMP_MAP_RELEASE:
-	      case GOMP_MAP_DELETE:
-		if (acc_is_present (hostaddrs[i], sizes[i]))
+	  if (!pointer)
+	    {
+	      if (kind == GOMP_MAP_DETACH)
+		acc_detach (hostaddrs[i]);
+	      else if (kind == GOMP_MAP_FORCE_DETACH)
+		acc_detach_finalize (hostaddrs[i]);
+	      else if (kind == GOMP_MAP_STRUCT)
+	        i += sizes[i];
+	    }
+	  else
+	    {
+	      unsigned char kind2 = kinds[i + pointer - 1] & 0xff;
+
+	      if (kind2 == GOMP_MAP_DETACH)
+		acc_detach (hostaddrs[i + pointer - 1]);
+	      else if (kind2 == GOMP_MAP_FORCE_DETACH)
+	        acc_detach_finalize (hostaddrs[i + pointer - 1]);
+
+	      i += pointer - 1;
+	    }
+	}
+
+      for (i = 0; i < mapnum; ++i)
+	{
+	  unsigned char kind = kinds[i] & 0xff;
+
+	  int pointer = find_pointer (i, mapnum, kinds);
+
+	  if (!pointer)
+	    {
+	      switch (kind)
+		{
+		case GOMP_MAP_RELEASE:
+		case GOMP_MAP_DELETE:
+		  if (acc_is_present (hostaddrs[i], sizes[i]))
+		    {
+		      if (finalize)
+			acc_delete_finalize_async (hostaddrs[i], sizes[i],
+						   async);
+		      else
+			acc_delete_async (hostaddrs[i], sizes[i], async);
+		    }
+		  break;
+		case GOMP_MAP_DETACH:
+		case GOMP_MAP_FORCE_DETACH:
+		case GOMP_MAP_FORCE_PRESENT:
+		  break;
+		case GOMP_MAP_FROM:
+		case GOMP_MAP_FORCE_FROM:
+		  if (finalize)
+		    acc_copyout_finalize_async (hostaddrs[i], sizes[i], async);
+		  else
+		    acc_copyout_async (hostaddrs[i], sizes[i], async);
+		  break;
+		case GOMP_MAP_STRUCT:
 		  {
-		    if (finalize)
-		      acc_delete_finalize_async (hostaddrs[i], sizes[i], async);
-		    else
-		      acc_delete_async (hostaddrs[i], sizes[i], async);
+		    int elems = sizes[i];
+		    goacc_aq aq = get_goacc_asyncqueue (async);
+		    for (int j = 1; j <= elems; j++)
+		      {
+			struct splay_tree_key_s k;
+			k.host_start = (uintptr_t) hostaddrs[i + j];
+			k.host_end = k.host_start + sizes[i + j];
+			splay_tree_key str;
+			gomp_mutex_lock (&acc_dev->lock);
+			str = splay_tree_lookup (&acc_dev->mem_map, &k);
+			gomp_mutex_unlock (&acc_dev->lock);
+			if (str)
+		          {
+			    assert (str->virtual_refcount
+				    != VREFCOUNT_LINK_KEY);
+			    if (finalize)
+			      {
+				str->refcount -= str->virtual_refcount;
+				str->virtual_refcount = 0;
+			      }
+			    if (str->virtual_refcount > 0)
+			      {
+				str->refcount--;
+				str->virtual_refcount--;
+			      }
+			    else if (str->refcount > 0)
+			      str->refcount--;
+			    if (str->refcount == 0)
+			      {
+				if (aq)
+				  goacc_remove_var_async (acc_dev, str, aq);
+				else
+				  gomp_remove_var (acc_dev, str);
+			      }
+			  }
+		      }
+		    i += elems;
 		  }
-		break;
-	      case GOMP_MAP_FROM:
-	      case GOMP_MAP_FORCE_FROM:
-		if (finalize)
-		  acc_copyout_finalize_async (hostaddrs[i], sizes[i], async);
-		else
-		  acc_copyout_async (hostaddrs[i], sizes[i], async);
-		break;
-	      default:
-		gomp_fatal (">>>> GOACC_enter_exit_data UNHANDLED kind 0x%.2x",
-			    kind);
-		break;
-	      }
-	  }
-	else
-	  {
-	    bool copyfrom = (kind == GOMP_MAP_FORCE_FROM
-			     || kind == GOMP_MAP_FROM);
-	    gomp_acc_remove_pointer (hostaddrs[i], sizes[i], copyfrom, async,
-				     finalize, pointer);
-	    /* See the above comment.  */
-	    i += pointer - 1;
-	  }
-      }
+		  break;
+		default:
+		  gomp_fatal (">>>> GOACC_enter_exit_data UNHANDLED kind 0x%.2x",
+			      kind);
+		  break;
+		}
+	    }
+	  else
+	    {
+	      gomp_acc_remove_pointer (acc_dev, &hostaddrs[i], &sizes[i],
+				       &kinds[i], async, finalize, pointer);
+	      i += pointer - 1;
+	    }
+	}
+    }
 }
 
 static void
diff --git a/libgomp/openacc.h b/libgomp/openacc.h
index 2505ac0..1bf2d65 100644
--- a/libgomp/openacc.h
+++ b/libgomp/openacc.h
@@ -113,12 +113,18 @@ void *acc_hostptr (void *) __GOACC_NOTHROW;
 int acc_is_present (void *, size_t) __GOACC_NOTHROW;
 void acc_memcpy_to_device (void *, void *, size_t) __GOACC_NOTHROW;
 void acc_memcpy_from_device (void *, void *, size_t) __GOACC_NOTHROW;
+void acc_attach (void **) __GOACC_NOTHROW;
+void acc_attach_async (void **, int) __GOACC_NOTHROW;
+void acc_detach (void **) __GOACC_NOTHROW;
+void acc_detach_async (void **, int) __GOACC_NOTHROW;
 
 /* Finalize versions of copyout/delete functions, specified in OpenACC 2.5.  */
 void acc_copyout_finalize (void *, size_t) __GOACC_NOTHROW;
 void acc_copyout_finalize_async (void *, size_t, int) __GOACC_NOTHROW;
 void acc_delete_finalize (void *, size_t) __GOACC_NOTHROW;
 void acc_delete_finalize_async (void *, size_t, int) __GOACC_NOTHROW;
+void acc_detach_finalize (void **) __GOACC_NOTHROW;
+void acc_detach_finalize_async (void **, int) __GOACC_NOTHROW;
 
 /* Async functions, specified in OpenACC 2.5.  */
 void acc_copyin_async (void *, size_t, int) __GOACC_NOTHROW;
diff --git a/libgomp/target.c b/libgomp/target.c
index 2bfc7e2..b776b84 100644
--- a/libgomp/target.c
+++ b/libgomp/target.c
@@ -372,6 +372,7 @@ gomp_map_vars_existing (struct gomp_device_descr *devicep,
   tgt_var->key = oldn;
   tgt_var->copy_from = GOMP_MAP_COPY_FROM_P (kind);
   tgt_var->always_copy_from = GOMP_MAP_ALWAYS_FROM_P (kind);
+  tgt_var->do_detach = kind == GOMP_MAP_ATTACH;
   tgt_var->offset = newn->host_start - oldn->host_start;
   tgt_var->length = newn->host_end - newn->host_start;
 
@@ -505,7 +506,136 @@ gomp_map_fields_existing (struct target_mem_desc *tgt,
 	      (void *) cur_node.host_end);
 }
 
-static inline uintptr_t
+void
+gomp_attach_pointer (struct gomp_device_descr *devicep,
+		     struct goacc_asyncqueue *aq, splay_tree mem_map,
+		     splay_tree_key n, uintptr_t attach_to, size_t bias,
+		     struct gomp_coalesce_buf *cbufp)
+{
+  struct splay_tree_key_s s;
+  size_t size, idx;
+
+  if (n == NULL)
+    {
+      gomp_mutex_unlock (&devicep->lock);
+      gomp_fatal ("enclosing struct not mapped for attach");
+    }
+
+  size = (n->host_end - n->host_start + sizeof (void *) - 1) / sizeof (void *);
+  /* We might have a pointer in a packed struct: however we cannot have more
+     than one such pointer in each pointer-sized portion of the struct, so
+     this is safe.  */
+  idx = (attach_to - n->host_start) / sizeof (void *);
+
+  assert (n->virtual_refcount != VREFCOUNT_LINK_KEY);
+
+  if (!n->u.attach_count)
+    n->u.attach_count
+      = gomp_malloc_cleared (sizeof (*n->u.attach_count) * size);
+
+  if (n->u.attach_count[idx] < UINTPTR_MAX)
+    n->u.attach_count[idx]++;
+  else
+    {
+      gomp_mutex_unlock (&devicep->lock);
+      gomp_fatal ("attach count overflow");
+    }
+
+  if (n->u.attach_count[idx] == 1)
+    {
+      uintptr_t devptr = n->tgt->tgt_start + n->tgt_offset + attach_to
+			 - n->host_start;
+      uintptr_t target = (uintptr_t) *(void **) attach_to;
+      splay_tree_key tn;
+      uintptr_t data;
+
+      if ((void *) target == NULL)
+	{
+	  gomp_mutex_unlock (&devicep->lock);
+	  gomp_fatal ("attempt to attach null pointer");
+	}
+
+      s.host_start = target + bias;
+      s.host_end = s.host_start + 1;
+      tn = splay_tree_lookup (mem_map, &s);
+
+      if (!tn)
+	{
+	  gomp_mutex_unlock (&devicep->lock);
+	  gomp_fatal ("pointer target not mapped for attach");
+	}
+
+      data = tn->tgt->tgt_start + tn->tgt_offset + target - tn->host_start;
+
+      gomp_debug (1,
+		  "%s: attaching host %p, target %p (struct base %p) to %p\n",
+		  __FUNCTION__, (void *) attach_to, (void *) devptr,
+		  (void *) (n->tgt->tgt_start + n->tgt_offset), (void *) data);
+
+      gomp_copy_host2dev (devicep, aq, (void *) devptr, (void *) &data,
+			  sizeof (void *), cbufp);
+    }
+  else
+    gomp_debug (1, "%s: attach count for %p -> %u\n", __FUNCTION__,
+		(void *) attach_to, (int) n->u.attach_count[idx]);
+}
+
+void
+gomp_detach_pointer (struct gomp_device_descr *devicep,
+		     struct goacc_asyncqueue *aq, splay_tree_key n,
+		     uintptr_t detach_from, bool finalize,
+		     struct gomp_coalesce_buf *cbufp)
+{
+  size_t idx;
+
+  if (n == NULL)
+    {
+      gomp_mutex_unlock (&devicep->lock);
+      gomp_fatal ("enclosing struct not mapped for detach");
+    }
+
+  idx = (detach_from - n->host_start) / sizeof (void *);
+
+  assert (n->virtual_refcount != VREFCOUNT_LINK_KEY);
+
+  if (!n->u.attach_count)
+    {
+      gomp_mutex_unlock (&devicep->lock);
+      gomp_fatal ("no attachment counters for struct");
+    }
+
+  if (finalize)
+    n->u.attach_count[idx] = 1;
+
+  if (n->u.attach_count[idx] == 0)
+    {
+      gomp_mutex_unlock (&devicep->lock);
+      gomp_fatal ("attach count underflow");
+    }
+  else
+    n->u.attach_count[idx]--;
+
+  if (n->u.attach_count[idx] == 0)
+    {
+      uintptr_t devptr = n->tgt->tgt_start + n->tgt_offset + detach_from
+			 - n->host_start;
+      uintptr_t target = (uintptr_t) *(void **) detach_from;
+
+      gomp_debug (1,
+		  "%s: detaching host %p, target %p (struct base %p) to %p\n",
+		  __FUNCTION__, (void *) detach_from, (void *) devptr,
+		  (void *) (n->tgt->tgt_start + n->tgt_offset),
+		  (void *) target);
+
+      gomp_copy_host2dev (devicep, aq, (void *) devptr, (void *) &target,
+			  sizeof (void *), cbufp);
+    }
+  else
+    gomp_debug (1, "%s: attach count for %p -> %u\n", __FUNCTION__,
+		(void *) detach_from, (int) n->u.attach_count[idx]);
+}
+
+uintptr_t
 gomp_map_val (struct target_mem_desc *tgt, void **hostaddrs, size_t i)
 {
   if (tgt->list[i].key != NULL)
@@ -547,8 +677,9 @@ gomp_map_vars (struct gomp_device_descr *devicep, size_t mapnum,
 attribute_hidden struct target_mem_desc *
 gomp_map_vars_async (struct gomp_device_descr *devicep,
 		     struct goacc_asyncqueue *aq, size_t mapnum,
-		     void **hostaddrs, void **devaddrs, size_t *sizes, void *kinds,
-		     bool short_mapkind, enum gomp_map_vars_kind pragma_kind)
+		     void **hostaddrs, void **devaddrs, size_t *sizes,
+		     void *kinds, bool short_mapkind,
+		     enum gomp_map_vars_kind pragma_kind)
 {
   size_t i, tgt_align, tgt_size, not_found_cnt = 0;
   bool has_firstprivate = false;
@@ -559,7 +690,8 @@ gomp_map_vars_async (struct gomp_device_descr *devicep,
   struct target_mem_desc *tgt
     = gomp_malloc (sizeof (*tgt) + sizeof (tgt->list[0]) * mapnum);
   tgt->list_count = mapnum;
-  tgt->refcount = pragma_kind == GOMP_MAP_VARS_ENTER_DATA ? 0 : 1;
+  tgt->refcount = (pragma_kind == GOMP_MAP_VARS_ENTER_DATA
+		   || pragma_kind == GOMP_MAP_VARS_OPENACC_ENTER_DATA) ? 0 : 1;
   tgt->device_descr = devicep;
   struct gomp_coalesce_buf cbuf, *cbufp = NULL;
 
@@ -674,8 +806,15 @@ gomp_map_vars_async (struct gomp_device_descr *devicep,
 	  has_firstprivate = true;
 	  continue;
 	}
+      else if ((kind & typemask) == GOMP_MAP_ATTACH)
+	{
+	  tgt->list[i].key = NULL;
+	  has_firstprivate = true;
+	  continue;
+	}
       cur_node.host_start = (uintptr_t) hostaddrs[i];
-      if (!GOMP_MAP_POINTER_P (kind & typemask))
+      if (!GOMP_MAP_POINTER_P (kind & typemask)
+          && (kind & typemask) != GOMP_MAP_ATTACH)
 	cur_node.host_end = cur_node.host_start + sizes[i];
       else
 	cur_node.host_end = cur_node.host_start + sizeof (void *);
@@ -882,6 +1021,32 @@ gomp_map_vars_async (struct gomp_device_descr *devicep,
 		cur_node.tgt_offset = n->tgt->tgt_start + n->tgt_offset
 				      + cur_node.host_start - n->host_start;
 		continue;
+	      case GOMP_MAP_ATTACH:
+		{
+		  cur_node.host_start = (uintptr_t) hostaddrs[i];
+		  cur_node.host_end = cur_node.host_start + sizeof (void *);
+		  splay_tree_key n = splay_tree_lookup (mem_map, &cur_node);
+		  if (n != NULL)
+		    {
+		      tgt->list[i].key = n;
+		      tgt->list[i].offset = cur_node.host_start - n->host_start;
+		      tgt->list[i].length = n->host_end - n->host_start;
+		      tgt->list[i].copy_from = false;
+		      tgt->list[i].always_copy_from = false;
+		      tgt->list[i].do_detach
+		        = (pragma_kind != GOMP_MAP_VARS_OPENACC_ENTER_DATA);
+		      n->refcount++;
+		    }
+		  else
+		    {
+		      gomp_mutex_unlock (&devicep->lock);
+		      gomp_fatal ("outer struct not mapped for attach");
+		    }
+		  gomp_attach_pointer (devicep, aq, mem_map, n,
+				       (uintptr_t) hostaddrs[i], sizes[i],
+				       cbufp);
+		  continue;
+		}
 	      default:
 		break;
 	      }
@@ -899,13 +1064,15 @@ gomp_map_vars_async (struct gomp_device_descr *devicep,
 				      kind & typemask, cbufp);
 	    else
 	      {
-		k->link_key = NULL;
+		if (k->virtual_refcount == VREFCOUNT_LINK_KEY)
+		  k->u.link_key = NULL;
 		if (n && n->refcount == REFCOUNT_LINK)
 		  {
 		    /* Replace target address of the pointer with target address
 		       of mapped object in the splay tree.  */
 		    splay_tree_remove (mem_map, n);
-		    k->link_key = n;
+		    k->u.link_key = n;
+		    k->virtual_refcount = VREFCOUNT_LINK_KEY;
 		  }
 		size_t align = (size_t) 1 << (kind >> rshift);
 		tgt->list[i].key = k;
@@ -926,10 +1093,12 @@ gomp_map_vars_async (struct gomp_device_descr *devicep,
 		tgt->list[i].copy_from = GOMP_MAP_COPY_FROM_P (kind & typemask);
 		tgt->list[i].always_copy_from
 		  = GOMP_MAP_ALWAYS_FROM_P (kind & typemask);
+		tgt->list[i].do_detach = false;
 		tgt->list[i].offset = 0;
 		tgt->list[i].length = k->host_end - k->host_start;
 		k->refcount = 1;
-		k->dynamic_refcount = 0;
+		k->virtual_refcount = 0;
+		k->u.attach_count = NULL;
 		tgt->refcount++;
 		array->left = NULL;
 		array->right = NULL;
@@ -980,6 +1149,7 @@ gomp_map_vars_async (struct gomp_device_descr *devicep,
 			  tgt->list[j].key = k;
 			  tgt->list[j].copy_from = false;
 			  tgt->list[j].always_copy_from = false;
+			  tgt->list[j].do_detach = false;
 			  if (k->refcount != REFCOUNT_INFINITY)
 			    k->refcount++;
 			  gomp_map_pointer (tgt, aq,
@@ -1023,7 +1193,7 @@ gomp_map_vars_async (struct gomp_device_descr *devicep,
 				kind);
 		  }
 
-		if (k->link_key)
+		if (k->virtual_refcount == VREFCOUNT_LINK_KEY && k->u.link_key)
 		  {
 		    /* Set link pointer on target to the device address of the
 		       mapped object.  */
@@ -1063,8 +1233,20 @@ gomp_map_vars_async (struct gomp_device_descr *devicep,
   /* If the variable from "omp target enter data" map-list was already mapped,
      tgt is not needed.  Otherwise tgt will be freed by gomp_unmap_vars or
      gomp_exit_data.  */
-  if (pragma_kind == GOMP_MAP_VARS_ENTER_DATA && tgt->refcount == 0)
-    {
+  if ((pragma_kind == GOMP_MAP_VARS_ENTER_DATA
+       || pragma_kind == GOMP_MAP_VARS_OPENACC_ENTER_DATA)
+      && tgt->refcount == 0)
+    {
+      /* If we're about to discard a target_mem_desc with no "structural"
+	 references (tgt->refcount == 0), any splay keys linked in the tgt's
+	 list must have their virtual refcount incremented to represent that
+	 "lost" reference in order to implement the semantics of the OpenACC
+	 "present increment" operation properly.  */
+      if (pragma_kind == GOMP_MAP_VARS_OPENACC_ENTER_DATA)
+	for (i = 0; i < tgt->list_count; i++)
+	  if (tgt->list[i].key)
+	    tgt->list[i].key->virtual_refcount++;
+
       free (tgt);
       tgt = NULL;
     }
@@ -1089,8 +1271,13 @@ gomp_remove_var (struct gomp_device_descr *devicep, splay_tree_key k)
 {
   bool is_tgt_unmapped = false;
   splay_tree_remove (&devicep->mem_map, k);
-  if (k->link_key)
-    splay_tree_insert (&devicep->mem_map, (splay_tree_node) k->link_key);
+  if (k->virtual_refcount == VREFCOUNT_LINK_KEY)
+    {
+      if (k->u.link_key)
+	splay_tree_insert (&devicep->mem_map, (splay_tree_node) k->u.link_key);
+    }
+  else if (k->u.attach_count)
+    free (k->u.attach_count);
   if (k->tgt->refcount > 1)
     k->tgt->refcount--;
   else
@@ -1133,14 +1320,35 @@ gomp_unmap_vars_async (struct target_mem_desc *tgt, bool do_copyfrom,
     }
 
   size_t i;
+
+  /* We must perform detachments before any copies back to the host.  */
+  for (i = 0; i < tgt->list_count; i++)
+    {
+      splay_tree_key k = tgt->list[i].key;
+
+      if (k != NULL && tgt->list[i].do_detach)
+	gomp_detach_pointer (devicep, aq, k, tgt->list[i].key->host_start
+					     + tgt->list[i].offset,
+			     k->refcount == 1, NULL);
+    }
+
   for (i = 0; i < tgt->list_count; i++)
     {
       splay_tree_key k = tgt->list[i].key;
+
       if (k == NULL)
 	continue;
 
       bool do_unmap = false;
-      if (k->refcount > 1 && k->refcount != REFCOUNT_INFINITY)
+      if (k->tgt == tgt
+	  && k->virtual_refcount > 0
+	  && k->virtual_refcount != VREFCOUNT_LINK_KEY
+	  && k->refcount != REFCOUNT_INFINITY)
+	{
+	  k->virtual_refcount--;
+	  k->refcount--;
+	}
+      else if (k->refcount > 1 && k->refcount != REFCOUNT_INFINITY)
 	k->refcount--;
       else if (k->refcount == 1)
 	{
@@ -1283,7 +1491,8 @@ gomp_load_image_to_device (struct gomp_device_descr *devicep, unsigned version,
       k->tgt = tgt;
       k->tgt_offset = target_table[i].start;
       k->refcount = REFCOUNT_INFINITY;
-      k->link_key = NULL;
+      k->virtual_refcount = 0;
+      k->u.link_key = NULL;
       array->left = NULL;
       array->right = NULL;
       splay_tree_insert (&devicep->mem_map, array);
@@ -1315,7 +1524,8 @@ gomp_load_image_to_device (struct gomp_device_descr *devicep, unsigned version,
       k->tgt = tgt;
       k->tgt_offset = target_var->start;
       k->refcount = target_size & link_bit ? REFCOUNT_LINK : REFCOUNT_INFINITY;
-      k->link_key = NULL;
+      k->virtual_refcount = 0;
+      k->u.link_key = NULL;
       array->left = NULL;
       array->right = NULL;
       splay_tree_insert (&devicep->mem_map, array);
@@ -1549,22 +1759,6 @@ gomp_unload_device (struct gomp_device_descr *devicep)
     }
 }
 
-/* Free address mapping tables.  MM must be locked on entry, and remains locked
-   on return.  */
-
-attribute_hidden void
-gomp_free_memmap (struct splay_tree_s *mem_map)
-{
-  while (mem_map->root)
-    {
-      struct target_mem_desc *tgt = mem_map->root->key.tgt;
-
-      splay_tree_remove (mem_map, &mem_map->root->key);
-      free (tgt->array);
-      free (tgt);
-    }
-}
-
 /* Do we have offload data available for the given offload target type?
    Instead of verifying that *all* offload data is available that could
    possibly be required, we instead just look for *any*.  If we later find any
@@ -2092,9 +2286,9 @@ gomp_exit_data (struct gomp_device_descr *devicep, size_t mapnum,
 	  if (k->refcount == 0)
 	    {
 	      splay_tree_remove (&devicep->mem_map, k);
-	      if (k->link_key)
+	      if (k->virtual_refcount == VREFCOUNT_LINK_KEY && k->u.link_key)
 		splay_tree_insert (&devicep->mem_map,
-				   (splay_tree_node) k->link_key);
+				   (splay_tree_node) k->u.link_key);
 	      if (k->tgt->refcount > 1)
 		k->tgt->refcount--;
 	      else
@@ -2631,6 +2825,8 @@ omp_target_associate_ptr (const void *host_ptr, const void *device_ptr,
       k->tgt = tgt;
       k->tgt_offset = (uintptr_t) device_ptr + device_offset;
       k->refcount = REFCOUNT_INFINITY;
+      k->virtual_refcount = 0;
+      k->u.link_key = NULL;
       array->left = NULL;
       array->right = NULL;
       splay_tree_insert (&devicep->mem_map, array);
@@ -3087,7 +3283,6 @@ gomp_target_init (void)
 		current_device.type = current_device.get_type_func ();
 		current_device.mem_map.root = NULL;
 		current_device.state = GOMP_DEVICE_UNINITIALIZED;
-		current_device.openacc.data_environ = NULL;
 
 		/* Augment DEVICES and NUM_DEVICES.  */
 		devices = gomp_realloc (devices,
diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/context-2.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/context-2.c
index 6a52f74..6bdcfe7 100644
--- a/libgomp/testsuite/libgomp.oacc-c-c++-common/context-2.c
+++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/context-2.c
@@ -182,13 +182,13 @@ main (int argc, char **argv)
         exit (EXIT_FAILURE);
     }
 
+    acc_delete (&h_X[0], N * sizeof (float));
+    acc_delete (&h_Y1[0], N * sizeof (float));
+
     free (h_X);
     free (h_Y1);
     free (h_Y2);
 
-    acc_free (d_X);
-    acc_free (d_Y);
-
     context_check (pctx);
 
     s = cublasDestroy (h);
diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/context-4.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/context-4.c
index 71365e8..b403a5c 100644
--- a/libgomp/testsuite/libgomp.oacc-c-c++-common/context-4.c
+++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/context-4.c
@@ -176,13 +176,13 @@ main (int argc, char **argv)
         exit (EXIT_FAILURE);
     }
 
+    acc_delete (&h_X[0], N * sizeof (float));
+    acc_delete (&h_Y1[0], N * sizeof (float));
+
     free (h_X);
     free (h_Y1);
     free (h_Y2);
 
-    acc_free (d_X);
-    acc_free (d_Y);
-
     context_check (pctx);
 
     s = cublasDestroy (h);
diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/deep-copy-1.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/deep-copy-1.c
new file mode 100644
index 0000000..d8d7067
--- /dev/null
+++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/deep-copy-1.c
@@ -0,0 +1,24 @@
+#include <stdlib.h>
+#include <assert.h>
+
+struct dc
+{
+  int a;
+  int *b;
+};
+
+int
+main ()
+{
+  int n = 100, i;
+  struct dc v = { .a = 3, .b = (int *) malloc (sizeof (int) * n) };
+
+#pragma acc parallel loop copy(v.a, v.b[:n])
+  for (i = 0; i < n; i++)
+    v.b[i] = v.a;
+
+  for (i = 0; i < 10; i++)
+    assert (v.b[i] == v.a);
+
+  return 0;
+}
diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/deep-copy-2.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/deep-copy-2.c
new file mode 100644
index 0000000..7e26e9a
--- /dev/null
+++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/deep-copy-2.c
@@ -0,0 +1,29 @@
+#include <assert.h>
+#include <stdlib.h>
+
+int
+main(int argc, char* argv[])
+{
+  struct foo {
+    int *a, *b, c, d, *e;
+  } s;
+
+  s.a = (int *) malloc (16 * sizeof (int));
+  s.b = (int *) malloc (16 * sizeof (int));
+  s.e = (int *) malloc (16 * sizeof (int));
+
+  #pragma acc data copy(s)
+  {
+    #pragma acc data copy(s.a[0:10])
+    {
+      #pragma acc parallel loop attach(s.a)
+      for (int i = 0; i < 10; i++)
+	s.a[i] = i;
+    }
+  }
+
+  for (int i = 0; i < 10; i++)
+    assert (s.a[i] == i);
+
+  return 0;
+}
diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/deep-copy-3.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/deep-copy-3.c
new file mode 100644
index 0000000..cec764b
--- /dev/null
+++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/deep-copy-3.c
@@ -0,0 +1,34 @@
+#include <assert.h>
+#include <stdlib.h>
+#include <openacc.h>
+
+int
+main ()
+{
+  int n = 100, i;
+  int *a = (int *) malloc (sizeof (int) * n);
+  int *b;
+
+  for (i = 0; i < n; i++)
+    a[i] = i+1;
+
+#pragma acc enter data copyin(a[:n]) create(b)
+
+  b = a;
+  acc_attach ((void **)&b);
+
+#pragma acc parallel loop present (b[:n])
+  for (i = 0; i < n; i++)
+    b[i] = i+1;
+
+  acc_detach ((void **)&b);
+
+#pragma acc exit data copyout(a[:n], b)
+
+  for (i = 0; i < 10; i++)
+    assert (a[i] == b[i]);
+
+  free (a);
+
+  return 0;
+}
diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/deep-copy-4.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/deep-copy-4.c
new file mode 100644
index 0000000..8874ca0
--- /dev/null
+++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/deep-copy-4.c
@@ -0,0 +1,87 @@
+#include <assert.h>
+#include <stdlib.h>
+
+#define LIST_LENGTH 10
+
+struct node
+{
+  struct node *next;
+  int val;
+};
+
+int
+sum_nodes (struct node *head)
+{
+  int i = 0, sum = 0;
+
+#pragma acc parallel reduction(+:sum) present(head[:1])
+  {
+    for (; head != NULL; head = head->next)
+      sum += head->val;
+  }
+
+  return sum;
+}
+
+void
+insert (struct node *head, int val)
+{
+  struct node *n = (struct node *) malloc (sizeof (struct node));
+
+  if (head->next)
+    {
+#pragma acc exit data detach(head->next)
+    }
+
+  n->val = val;
+  n->next = head->next;
+  head->next = n;
+
+#pragma acc enter data copyin(n[:1])
+#pragma acc enter data attach(head->next)
+  if (n->next)
+    {
+#pragma acc enter data attach(n->next)
+    }
+}
+
+void
+destroy (struct node *head)
+{
+  while (head->next != NULL)
+    {
+#pragma acc exit data detach(head->next)
+      struct node * n = head->next;
+      head->next = n->next;
+      if (n->next)
+	{
+#pragma acc exit data detach(n->next)
+	}
+#pragma acc exit data delete (n[:1])
+      if (head->next)
+	{
+#pragma acc enter data attach(head->next)
+	}
+      free (n);
+    }
+}
+
+int
+main ()
+{
+  struct node list = { .next = NULL, .val = 0 };
+  int i;
+
+#pragma acc enter data copyin(list)
+
+  for (i = 0; i < LIST_LENGTH; i++)
+    insert (&list, i + 1);
+
+  assert (sum_nodes (&list) == (LIST_LENGTH * LIST_LENGTH + LIST_LENGTH) / 2);
+
+  destroy (&list);
+
+#pragma acc exit data delete(list)
+
+  return 0;
+}
diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/deep-copy-5.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/deep-copy-5.c
new file mode 100644
index 0000000..89cafbb
--- /dev/null
+++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/deep-copy-5.c
@@ -0,0 +1,81 @@
+#include <assert.h>
+#include <stdlib.h>
+#include <openacc.h>
+
+struct node
+{
+  struct node *next;
+  int val;
+};
+
+int
+sum_nodes (struct node *head)
+{
+  int i = 0, sum = 0;
+
+#pragma acc parallel reduction(+:sum) present(head[:1])
+  {
+    for (; head != NULL; head = head->next)
+      sum += head->val;
+  }
+
+  return sum;
+}
+
+void
+insert (struct node *head, int val)
+{
+  struct node *n = (struct node *) malloc (sizeof (struct node));
+
+  if (head->next)
+    acc_detach ((void **) &head->next);
+
+  n->val = val;
+  n->next = head->next;
+  head->next = n;
+
+  acc_copyin (n, sizeof (struct node));
+  acc_attach((void **) &head->next);
+
+  if (n->next)
+    acc_attach ((void **) &n->next);
+}
+
+void
+destroy (struct node *head)
+{
+  while (head->next != NULL)
+    {
+      acc_detach ((void **) &head->next);
+      struct node * n = head->next;
+      head->next = n->next;
+      if (n->next)
+	acc_detach ((void **) &n->next);
+
+      acc_delete (n, sizeof (struct node));
+      if (head->next)
+	acc_attach((void **) &head->next);
+
+      free (n);
+    }
+}
+
+int
+main ()
+{
+  struct node list = { .next = NULL, .val = 0 };
+  int i;
+
+  acc_copyin (&list, sizeof (struct node));
+
+  for (i = 0; i < 10; i++)
+    insert (&list, 2);
+
+  assert (sum_nodes (&list) == 10 * 2);
+
+  destroy (&list);
+
+  acc_delete (&list, sizeof (struct node));
+
+  return 0;
+}
diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/deep-copy-6.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/deep-copy-6.c
new file mode 100644
index 0000000..81c1c5e
--- /dev/null
+++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/deep-copy-6.c
@@ -0,0 +1,59 @@
+/* { dg-do run { target { ! openacc_host_selected } } } */
+
+#include <stdlib.h>
+#include <assert.h>
+#include <openacc.h>
+
+struct dc
+{
+  int a;
+  int **b;
+};
+
+int
+main ()
+{
+  int n = 100, i, j, k;
+  struct dc v = { .a = 3 };
+
+  v.b = (int **) malloc (sizeof (int *) * n);
+  for (i = 0; i < n; i++)
+    v.b[i] = (int *) malloc (sizeof (int) * n);
+
+  for (k = 0; k < 16; k++)
+    {
+#pragma acc data copy(v)
+      {
+#pragma acc data copy(v.b[:n])
+	{
+	  for (i = 0; i < n; i++)
+	    {
+	      acc_copyin (v.b[i], sizeof (int) * n);
+	      acc_attach ((void **) &v.b[i]);
+	    }
+
+#pragma acc parallel loop
+	  for (i = 0; i < n; i++)
+	    for (j = 0; j < n; j++)
+	      v.b[i][j] = v.a + i + j;
+
+	  for (i = 0; i < n; i++)
+	    {
+	      acc_detach ((void **) &v.b[i]);
+	      acc_copyout (v.b[i], sizeof (int) * n);
+	    }
+	}
+      }
+
+      for (i = 0; i < n; i++)
+	for (j = 0; j < n; j++)
+	  assert (v.b[i][j] == v.a + i + j);
+
+      assert (!acc_is_present (&v, sizeof (v)));
+      assert (!acc_is_present (v.b, sizeof (int *) * n));
+      for (i = 0; i < n; i++)
+        assert (!acc_is_present (v.b[i], sizeof (int) * n));
+    }
+
+  return 0;
+}
diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/deep-copy-7.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/deep-copy-7.c
new file mode 100644
index 0000000..a59047a
--- /dev/null
+++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/deep-copy-7.c
@@ -0,0 +1,45 @@
+/* { dg-do run { target { ! openacc_host_selected } } } */
+
+#include <stdlib.h>
+#include <assert.h>
+#include <openacc.h>
+
+struct dc
+{
+  int a;
+  int *b;
+};
+
+int
+main ()
+{
+  int n = 100, i, j, k;
+  struct dc v = { .a = 3 };
+
+  v.b = (int *) malloc (sizeof (int) * n);
+
+  for (k = 0; k < 16; k++)
+    {
+      /* Here, we do not explicitly copy the enclosing structure, but work
+	 with fields directly.  Make sure attachment counters and reference
+	 counters work properly in that case.  */
+#pragma acc enter data copyin(v.a, v.b[0:n])
+#pragma acc enter data pcopyin(v.b[0:n])
+#pragma acc enter data pcopyin(v.b[0:n])
+
+#pragma acc parallel loop present(v.a, v.b)
+      for (i = 0; i < n; i++)
+	v.b[i] = v.a + i;
+
+#pragma acc exit data copyout(v.b[:n]) finalize
+#pragma acc exit data delete(v.a)
+
+      for (i = 0; i < n; i++)
+	assert (v.b[i] == v.a + i);
+
+      assert (!acc_is_present (&v, sizeof (v)));
+      assert (!acc_is_present (v.b, sizeof (int *) * n));
+    }
+
+  return 0;
+}
diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/deep-copy-8.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/deep-copy-8.c
new file mode 100644
index 0000000..0ca5990
--- /dev/null
+++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/deep-copy-8.c
@@ -0,0 +1,54 @@
+/* { dg-do run { target { ! openacc_host_selected } } } */
+
+#include <stdlib.h>
+#include <assert.h>
+#include <openacc.h>
+
+struct dc
+{
+  int a;
+  int *b;
+  int *c;
+  int *d;
+};
+
+int
+main ()
+{
+  int n = 100, i, j, k;
+  struct dc v = { .a = 3 };
+
+  v.b = (int *) malloc (sizeof (int) * n);
+  v.c = (int *) malloc (sizeof (int) * n);
+  v.d = (int *) malloc (sizeof (int) * n);
+
+#pragma acc enter data copyin(v)
+
+  for (k = 0; k < 16; k++)
+    {
+#pragma acc enter data copyin(v.a, v.b[:n], v.c[:n], v.d[:n])
+
+#pragma acc parallel loop
+      for (i = 0; i < n; i++)
+	v.b[i] = v.a + i;
+
+#pragma acc exit data copyout(v.b[:n])
+#pragma acc exit data copyout(v.c[:n])
+#pragma acc exit data copyout(v.d[:n])
+#pragma acc exit data copyout(v.a)
+
+      for (i = 0; i < n; i++)
+	assert (v.b[i] == v.a + i);
+
+      assert (acc_is_present (&v, sizeof (v)));
+      assert (!acc_is_present (v.b, sizeof (int *) * n));
+      assert (!acc_is_present (v.c, sizeof (int *) * n));
+      assert (!acc_is_present (v.d, sizeof (int *) * n));
+    }
+
+#pragma acc exit data copyout(v)
+
+  assert (!acc_is_present (&v, sizeof (v)));
+
+  return 0;
+}
diff --git a/libgomp/testsuite/libgomp.oacc-fortran/data-2.f90 b/libgomp/testsuite/libgomp.oacc-fortran/data-2.f90
index 83a5400..6bb92c1 100644
--- a/libgomp/testsuite/libgomp.oacc-fortran/data-2.f90
+++ b/libgomp/testsuite/libgomp.oacc-fortran/data-2.f90
@@ -1,4 +1,5 @@
 ! { dg-do run }
+! { dg-additional-options "-cpp" }
 
 program test
   use openacc
@@ -70,10 +71,14 @@ program test
     end do
   !$acc end parallel
   
-  !$acc exit data copyout (d(1:N)) async
+  !$acc exit data delete (c(1:N)) copyout (d(1:N)) async
   !$acc exit data async
   !$acc wait
 
+#if !ACC_MEM_SHARED
+  if (acc_is_present (c) .eqv. .TRUE.) call abort
+#endif
+
   do i = 1, N
     if (d(i) .ne. 4.0) call abort
   end do
diff --git a/libgomp/testsuite/libgomp.oacc-fortran/deep-copy-1.f90 b/libgomp/testsuite/libgomp.oacc-fortran/deep-copy-1.f90
new file mode 100644
index 0000000..c4cea11
--- /dev/null
+++ b/libgomp/testsuite/libgomp.oacc-fortran/deep-copy-1.f90
@@ -0,0 +1,35 @@
+! { dg-do run }
+
+! Test of attach/detach with "acc data".
+
+program dtype
+  implicit none
+  integer, parameter :: n = 512
+  type mytype
+    integer, allocatable :: a(:)
+  end type mytype
+  integer i
+
+  type(mytype) :: var
+
+  allocate(var%a(1:n))
+
+!$acc data copy(var)
+!$acc data copy(var%a)
+
+!$acc parallel loop
+  do i = 1,n
+    var%a(i) = i
+  end do
+!$acc end parallel loop
+
+!$acc end data
+!$acc end data
+
+  do i = 1,n
+    if (i .ne. var%a(i)) stop 1
+  end do
+
+  deallocate(var%a)
+
+end program dtype
diff --git a/libgomp/testsuite/libgomp.oacc-fortran/deep-copy-2.f90 b/libgomp/testsuite/libgomp.oacc-fortran/deep-copy-2.f90
new file mode 100644
index 0000000..3593661
--- /dev/null
+++ b/libgomp/testsuite/libgomp.oacc-fortran/deep-copy-2.f90
@@ -0,0 +1,33 @@
+! { dg-do run }
+
+! Test of attach/detach with "acc data", two clauses at once.
+
+program dtype
+  implicit none
+  integer, parameter :: n = 512
+  type mytype
+    integer, allocatable :: a(:)
+  end type mytype
+  integer i
+
+  type(mytype) :: var
+
+  allocate(var%a(1:n))
+
+!$acc data copy(var) copy(var%a)
+
+!$acc parallel loop
+  do i = 1,n
+    var%a(i) = i
+  end do
+!$acc end parallel loop
+
+!$acc end data
+
+  do i = 1,n
+    if (i .ne. var%a(i)) stop 1
+  end do
+
+  deallocate(var%a)
+
+end program dtype
diff --git a/libgomp/testsuite/libgomp.oacc-fortran/deep-copy-3.f90 b/libgomp/testsuite/libgomp.oacc-fortran/deep-copy-3.f90
new file mode 100644
index 0000000..667d944
--- /dev/null
+++ b/libgomp/testsuite/libgomp.oacc-fortran/deep-copy-3.f90
@@ -0,0 +1,34 @@
+! { dg-do run }
+
+! Test of attach/detach with "acc parallel".
+
+program dtype
+  implicit none
+  integer, parameter :: n = 512
+  type mytype
+    integer, allocatable :: a(:)
+    integer, allocatable :: b(:)
+  end type mytype
+  integer i
+
+  type(mytype) :: var
+
+  allocate(var%a(1:n))
+  allocate(var%b(1:n))
+
+!$acc parallel loop copy(var) copy(var%a(1:n)) copy(var%b(1:n))
+  do i = 1,n
+    var%a(i) = i
+    var%b(i) = i
+  end do
+!$acc end parallel loop
+
+  do i = 1,n
+    if (i .ne. var%a(i)) stop 1
+    if (i .ne. var%b(i)) stop 2
+  end do
+
+  deallocate(var%a)
+  deallocate(var%b)
+
+end program dtype
diff --git a/libgomp/testsuite/libgomp.oacc-fortran/deep-copy-4.f90 b/libgomp/testsuite/libgomp.oacc-fortran/deep-copy-4.f90
new file mode 100644
index 0000000..6949e12
--- /dev/null
+++ b/libgomp/testsuite/libgomp.oacc-fortran/deep-copy-4.f90
@@ -0,0 +1,49 @@
+! { dg-do run }
+
+! Test of attach/detach with "acc enter/exit data".
+
+program dtype
+  implicit none
+  integer, parameter :: n = 512
+  type mytype
+    integer, allocatable :: a(:)
+    integer, allocatable :: b(:)
+  end type mytype
+  integer, allocatable :: r(:)
+  integer i
+
+  type(mytype) :: var
+
+  allocate(var%a(1:n))
+  allocate(var%b(1:n))
+  allocate(r(1:n))
+
+!$acc enter data copyin(var)
+
+!$acc enter data copyin(var%a, var%b, r)
+
+!$acc parallel loop
+  do i = 1,n
+    var%a(i) = i
+    var%b(i) = i * 2
+    r(i) = i * 3
+  end do
+!$acc end parallel loop
+
+!$acc exit data copyout(var%a)
+!$acc exit data copyout(var%b)
+!$acc exit data copyout(r)
+
+  do i = 1,n
+    if (i .ne. var%a(i)) stop 1
+    if (i * 2 .ne. var%b(i)) stop 2
+    if (i * 3 .ne. r(i)) stop 3
+  end do
+
+!$acc exit data delete(var)
+
+  deallocate(var%a)
+  deallocate(var%b)
+  deallocate(r)
+
+end program dtype
diff --git a/libgomp/testsuite/libgomp.oacc-fortran/deep-copy-5.f90 b/libgomp/testsuite/libgomp.oacc-fortran/deep-copy-5.f90
new file mode 100644
index 0000000..6843cf1
--- /dev/null
+++ b/libgomp/testsuite/libgomp.oacc-fortran/deep-copy-5.f90
@@ -0,0 +1,57 @@
+! { dg-do run }
+
+! Test of attach/detach, "enter data" inside "data", and subarray.
+
+program dtype
+  implicit none
+  integer, parameter :: n = 512
+  type mytype
+    integer, allocatable :: a(:)
+    integer, allocatable :: b(:)
+  end type mytype
+  integer i
+
+  type(mytype) :: var
+
+  allocate(var%a(1:n))
+  allocate(var%b(1:n))
+
+!$acc data copy(var)
+
+  do i = 1, n
+    var%a(i) = 0
+    var%b(i) = 0
+  end do
+
+!$acc enter data copyin(var%a(5:n - 5), var%b(5:n - 5))
+
+!$acc parallel loop
+  do i = 5,n - 5
+    var%a(i) = i
+    var%b(i) = i * 2
+  end do
+!$acc end parallel loop
+
+!$acc exit data copyout(var%a(5:n - 5), var%b(5:n - 5))
+
+!$acc end data
+
+  do i = 1,4
+    if (var%a(i) .ne. 0) stop 1
+    if (var%b(i) .ne. 0) stop 2
+  end do
+
+  do i = 5,n - 5
+    if (i .ne. var%a(i)) stop 3
+    if (i * 2 .ne. var%b(i)) stop 4
+  end do
+
+  do i = n - 4,n
+    if (var%a(i) .ne. 0) stop 5
+    if (var%b(i) .ne. 0) stop 6
+  end do
+
+  deallocate(var%a)
+  deallocate(var%b)
+
+end program dtype
diff --git a/libgomp/testsuite/libgomp.oacc-fortran/deep-copy-6.f90 b/libgomp/testsuite/libgomp.oacc-fortran/deep-copy-6.f90
new file mode 100644
index 0000000..12910d0
--- /dev/null
+++ b/libgomp/testsuite/libgomp.oacc-fortran/deep-copy-6.f90
@@ -0,0 +1,61 @@
+! { dg-do run }
+
+! Test of attachment counters and finalize.
+
+program dtype
+  implicit none
+  integer, parameter :: n = 512
+  type mytype
+    integer, allocatable :: a(:)
+    integer, allocatable :: b(:)
+  end type mytype
+  integer i
+
+  type(mytype) :: var
+
+  allocate(var%a(1:n))
+  allocate(var%b(1:n))
+
+!$acc data copy(var)
+
+  do i = 1, n
+    var%a(i) = 0
+    var%b(i) = 0
+  end do
+
+!$acc enter data copyin(var%a(5:n - 5), var%b(5:n - 5))
+
+  do i = 1,20
+    !$acc enter data attach(var%a)
+  end do
+
+!$acc parallel loop
+  do i = 5,n - 5
+    var%a(i) = i
+    var%b(i) = i * 2
+  end do
+!$acc end parallel loop
+
+!$acc exit data copyout(var%a(5:n - 5), var%b(5:n - 5)) finalize
+
+!$acc end data
+
+  do i = 1,4
+    if (var%a(i) .ne. 0) stop 1
+    if (var%b(i) .ne. 0) stop 2
+  end do
+
+  do i = 5,n - 5
+    if (i .ne. var%a(i)) stop 3
+    if (i * 2 .ne. var%b(i)) stop 4
+  end do
+
+  do i = n - 4,n
+    if (var%a(i) .ne. 0) stop 5
+    if (var%b(i) .ne. 0) stop 6
+  end do
+
+  deallocate(var%a)
+  deallocate(var%b)
+
+end program dtype
diff --git a/libgomp/testsuite/libgomp.oacc-fortran/deep-copy-7.f90 b/libgomp/testsuite/libgomp.oacc-fortran/deep-copy-7.f90
new file mode 100644
index 0000000..ab44f0a
--- /dev/null
+++ b/libgomp/testsuite/libgomp.oacc-fortran/deep-copy-7.f90
@@ -0,0 +1,89 @@
+! { dg-do run }
+
+! Test of attach/detach with scalar elements and nested derived types.
+
+program dtype
+  implicit none
+  integer, parameter :: n = 512
+  type subtype
+    integer :: g, h
+    integer, allocatable :: q(:)
+  end type subtype
+  type mytype
+    integer, allocatable :: a(:)
+    integer, allocatable :: c, d
+    integer, allocatable :: b(:)
+    integer :: f
+    type(subtype) :: s
+  end type mytype
+  integer i
+
+  type(mytype) :: var
+
+  allocate(var%a(1:n))
+  allocate(var%b(1:n))
+  allocate(var%c)
+  allocate(var%d)
+  allocate(var%s%q(1:n))
+
+  var%c = 16
+  var%d = 20
+  var%f = 7
+  var%s%g = 21
+  var%s%h = 38
+
+!$acc enter data copyin(var)
+
+  do i = 1, n
+    var%a(i) = 0
+    var%b(i) = 0
+    var%s%q(i) = 0
+  end do
+
+!$acc data copy(var%a(5:n - 5), var%b(5:n - 5), var%c, var%d) &
+!$acc & copy(var%s%q)
+
+!$acc parallel loop default(none) present(var)
+  do i = 5,n - 5
+    var%a(i) = i
+    var%b(i) = i * 2
+    var%s%q(i) = i * 3
+    var%s%g = 100
+    var%s%h = 101
+  end do
+!$acc end parallel loop
+
+!$acc end data
+
+!$acc exit data copyout(var)
+
+  do i = 1,4
+    if (var%a(i) .ne. 0) stop 1
+    if (var%b(i) .ne. 0) stop 2
+    if (var%s%q(i) .ne. 0) stop 3
+  end do
+
+  do i = 5,n - 5
+    if (i .ne. var%a(i)) stop 4
+    if (i * 2 .ne. var%b(i)) stop 5
+    if (i * 3 .ne. var%s%q(i)) stop 6
+  end do
+
+  do i = n - 4,n
+    if (var%a(i) .ne. 0) stop 7
+    if (var%b(i) .ne. 0) stop 8
+    if (var%s%q(i) .ne. 0) stop 9
+  end do
+
+  if (var%c .ne. 16) stop 10
+  if (var%d .ne. 20) stop 11
+  if (var%s%g .ne. 100 .or. var%s%h .ne. 101) stop 12
+  if (var%f .ne. 7) stop 13
+
+  deallocate(var%a)
+  deallocate(var%b)
+  deallocate(var%c)
+  deallocate(var%d)
+  deallocate(var%s%q)
+
+end program dtype
diff --git a/libgomp/testsuite/libgomp.oacc-fortran/deep-copy-8.f90 b/libgomp/testsuite/libgomp.oacc-fortran/deep-copy-8.f90
new file mode 100644
index 0000000..d142763a
--- /dev/null
+++ b/libgomp/testsuite/libgomp.oacc-fortran/deep-copy-8.f90
@@ -0,0 +1,41 @@
+! { dg-do run }
+
+! Test of explicit attach/detach clauses and attachment counters. There are no
+! acc_attach/acc_detach API routines in Fortran.
+
+program dtype
+  use openacc
+  implicit none
+  integer, parameter :: n = 512
+  type mytype
+    integer, allocatable :: a(:)
+  end type mytype
+  integer i
+
+  type(mytype) :: var
+
+  allocate(var%a(1:n))
+
+  call acc_copyin(var)
+  call acc_copyin(var%a)
+
+  !$acc enter data attach(var%a)
+
+!$acc parallel loop attach(var%a)
+  do i = 1,n
+    var%a(i) = i
+  end do
+!$acc end parallel loop
+
+  !$acc exit data detach(var%a)
+
+  call acc_copyout(var%a)
+  call acc_copyout(var)
+
+  do i = 1,n
+    if (i .ne. var%a(i)) stop 1
+  end do
+
+  deallocate(var%a)
+
+end program dtype
diff --git a/libgomp/testsuite/libgomp.oacc-fortran/derived-type-1.f90 b/libgomp/testsuite/libgomp.oacc-fortran/derived-type-1.f90
new file mode 100644
index 0000000..eb7812d
--- /dev/null
+++ b/libgomp/testsuite/libgomp.oacc-fortran/derived-type-1.f90
@@ -0,0 +1,28 @@
+! Test derived types with subarrays
+
+! { dg-do run }
+
+  implicit none
+  type dtype
+     integer :: a, b, c
+  end type dtype
+  integer, parameter :: n = 100
+  integer i
+  type (dtype), dimension(n) :: d
+
+  !$acc data copy(d(1:n))
+  !$acc parallel loop
+  do i = 1, n
+     d(i)%a = i
+     d(i)%b = i-1
+     d(i)%c = i+1
+  end do
+  !$acc end data
+
+  do i = 1, n
+     if (d(i)%a /= i) stop 1
+     if (d(i)%b /= i-1) stop 2
+     if (d(i)%c /= i+1) stop 3
+  end do
+end program
+
diff --git a/libgomp/testsuite/libgomp.oacc-fortran/update-2.f90 b/libgomp/testsuite/libgomp.oacc-fortran/update-2.f90
new file mode 100644
index 0000000..c3c8a07
--- /dev/null
+++ b/libgomp/testsuite/libgomp.oacc-fortran/update-2.f90
@@ -0,0 +1,284 @@
+! Test ACC UPDATE with derived types.
+
+! { dg-do run }
+
+module dt
+  integer, parameter :: n = 10
+  type inner
+     integer :: d(n)
+  end type inner
+  type mytype
+     integer(8) :: a, b, c(n)
+     type(inner) :: in
+  end type mytype
+end module dt
+
+program derived_acc
+  use dt
+
+  implicit none
+  integer i, res
+  type(mytype) :: var
+
+  var%a = 0
+  var%b = 1
+  var%c(:) = 10
+  var%in%d(:) = 100
+
+  var%c(:) = 10
+
+  !$acc enter data copyin(var)
+
+  !$acc parallel loop present(var)
+  do i = 1, 1
+     var%a = var%b
+  end do
+  !$acc end parallel loop
+
+  !$acc update host(var%a)
+
+  if (var%a /= var%b) stop 1
+
+  var%b = 100
+
+  !$acc update device(var%b)
+
+  !$acc parallel loop present(var)
+  do i = 1, 1
+     var%a = var%b
+  end do
+  !$acc end parallel loop
+
+  !$acc update host(var%a)
+
+  if (var%a /= var%b) stop 2
+
+  !$acc parallel loop present (var)
+  do i = 1, n
+     var%c(i) = i
+  end do
+  !$acc end parallel loop
+
+  !$acc update host(var%c)
+
+  var%a = -1
+
+  do i = 1, n
+     if (var%c(i) /= i) stop 3
+     var%c(i) = var%a
+  end do
+
+  !$acc update device(var%a)
+  !$acc update device(var%c)
+
+  res = 0
+
+  !$acc parallel loop present(var) reduction(+:res)
+  do i = 1, n
+     if (var%c(i) /= var%a) res = res + 1
+  end do
+
+  if (res /= 0) stop 4
+
+  var%c(:) = 0
+
+  !$acc update device(var%c)
+
+  !$acc parallel loop present(var)
+  do i = 5, 5
+     var%c(i) = 1
+  end do
+  !$acc end parallel loop
+
+  !$acc update host(var%c(5))
+
+  do i = 1, n
+     if (i /= 5 .and. var%c(i) /= 0) stop 5
+     if (i == 5 .and. var%c(i) /= 1) stop 6
+  end do
+
+  !$acc parallel loop present(var)
+  do i = 1, n
+     var%in%d = var%a
+  end do
+  !$acc end parallel loop
+
+  !$acc update host(var%in%d)
+
+  do i = 1, n
+     if (var%in%d(i) /= var%a) stop 7
+  end do
+
+  var%c(:) = 0
+
+  !$acc update device(var%c)
+
+  var%c(:) = -1
+
+  !$acc parallel loop present(var)
+  do i = n/2, n
+     var%c(i) = i
+  end do
+  !$acc end parallel loop
+
+  !$acc update host(var%c(n/2:n))
+
+  do i = 1,n
+     if (i < n/2 .and. var%c(i) /= -1) stop 8
+     if (i >= n/2 .and. var%c(i) /= i) stop 9
+  end do
+
+  var%in%d(:) = 0
+  !$acc update device(var%in%d)
+
+  !$acc parallel loop present(var)
+  do i = 5, 5
+     var%in%d(i) = 1
+  end do
+  !$acc end parallel loop
+
+  !$acc update host(var%in%d(5))
+
+  do i = 1, n
+     if (i /= 5 .and. var%in%d(i) /= 0) stop 10
+     if (i == 5 .and. var%in%d(i) /= 1) stop 11
+  end do
+
+  !$acc exit data delete(var)
+
+  call derived_acc_subroutine(var)
+end program derived_acc
+
+subroutine derived_acc_subroutine(var)
+  use dt
+
+  implicit none
+  integer i, res
+  type(mytype) :: var
+
+  var%a = 0
+  var%b = 1
+  var%c(:) = 10
+  var%in%d(:) = 100
+
+  var%c(:) = 10
+
+  !$acc enter data copyin(var)
+
+  !$acc parallel loop present(var)
+  do i = 1, 1
+     var%a = var%b
+  end do
+  !$acc end parallel loop
+
+  !$acc update host(var%a)
+
+  if (var%a /= var%b) stop 12
+
+  var%b = 100
+
+  !$acc update device(var%b)
+
+  !$acc parallel loop present(var)
+  do i = 1, 1
+     var%a = var%b
+  end do
+  !$acc end parallel loop
+
+  !$acc update host(var%a)
+
+  if (var%a /= var%b) stop 13
+
+  !$acc parallel loop present (var)
+  do i = 1, n
+     var%c(i) = i
+  end do
+  !$acc end parallel loop
+
+  !$acc update host(var%c)
+
+  var%a = -1
+
+  do i = 1, n
+     if (var%c(i) /= i) stop 14
+     var%c(i) = var%a
+  end do
+
+  !$acc update device(var%a)
+  !$acc update device(var%c)
+
+  res = 0
+
+  !$acc parallel loop present(var) reduction(+:res)
+  do i = 1, n
+     if (var%c(i) /= var%a) res = res + 1
+  end do
+
+  if (res /= 0) stop 15
+
+  var%c(:) = 0
+
+  !$acc update device(var%c)
+
+  !$acc parallel loop present(var)
+  do i = 5, 5
+     var%c(i) = 1
+  end do
+  !$acc end parallel loop
+
+  !$acc update host(var%c(5))
+
+  do i = 1, n
+     if (i /= 5 .and. var%c(i) /= 0) stop 16
+     if (i == 5 .and. var%c(i) /= 1) stop 17
+  end do
+
+  !$acc parallel loop present(var)
+  do i = 1, n
+     var%in%d = var%a
+  end do
+  !$acc end parallel loop
+
+  !$acc update host(var%in%d)
+
+  do i = 1, n
+     if (var%in%d(i) /= var%a) stop 18
+  end do
+
+  var%c(:) = 0
+
+  !$acc update device(var%c)
+
+  var%c(:) = -1
+
+  !$acc parallel loop present(var)
+  do i = n/2, n
+     var%c(i) = i
+  end do
+  !$acc end parallel loop
+
+  !$acc update host(var%c(n/2:n))
+
+  do i = 1,n
+     if (i < n/2 .and. var%c(i) /= -1) stop 19
+     if (i >= n/2 .and. var%c(i) /= i) stop 20
+  end do
+
+  var%in%d(:) = 0
+  !$acc update device(var%in%d)
+
+  !$acc parallel loop present(var)
+  do i = 5, 5
+     var%in%d(i) = 1
+  end do
+  !$acc end parallel loop
+
+  !$acc update host(var%in%d(5))
+
+  do i = 1, n
+     if (i /= 5 .and. var%in%d(i) /= 0) stop 21
+     if (i == 5 .and. var%in%d(i) /= 1) stop 22
+  end do
+
+  !$acc exit data delete(var)
+end subroutine derived_acc_subroutine
Jakub Jelinek Dec. 18, 2018, 12:25 p.m. UTC | #6
On Fri, Dec 14, 2018 at 07:00:30PM +0000, Julian Brown wrote:
>     OpenACC 2.6 manual deep copy support (attach/detach)
>     
>     	gcc/c-family/
>     	* c-pragma.h (pragma_omp_clause): Add PRAGMA_OACC_CLAUSE_ATTACH,
>     	PRAGMA_OACC_CLAUSE_DETACH.

and instead of , ?
> --- a/gcc/c-family/c-common.h
> +++ b/gcc/c-family/c-common.h
> @@ -1176,6 +1176,7 @@ extern void c_omp_split_clauses (location_t, enum tree_code, omp_clause_mask,
>  extern tree c_omp_declare_simd_clauses_to_numbers (tree, tree);
>  extern void c_omp_declare_simd_clauses_to_decls (tree, tree);
>  extern enum omp_clause_default_kind c_omp_predetermined_sharing (tree);
> +extern const char * c_omp_map_clause_name (tree, bool);

No space after * in this case.

> +const char *
> +c_omp_map_clause_name (tree clause, bool oacc)
> +{
> +  if (oacc && OMP_CLAUSE_CODE (clause) == OMP_CLAUSE_MAP)
> +    switch (OMP_CLAUSE_MAP_KIND (clause))
> +    {
> +    case GOMP_MAP_FORCE_ALLOC:
> +    case GOMP_MAP_ALLOC: return "create";
> +    case GOMP_MAP_FORCE_TO:
> +    case GOMP_MAP_TO: return "copyin";
> +    case GOMP_MAP_FORCE_FROM:
> +    case GOMP_MAP_FROM: return "copyout";
> +    case GOMP_MAP_FORCE_TOFROM:
> +    case GOMP_MAP_TOFROM: return "copy";
> +    case GOMP_MAP_RELEASE: return "delete";
> +    case GOMP_MAP_FORCE_PRESENT: return "present";
> +    case GOMP_MAP_ATTACH: return "attach";
> +    case GOMP_MAP_FORCE_DETACH:
> +    case GOMP_MAP_DETACH: return "detach";
> +    case GOMP_MAP_DEVICE_RESIDENT: return "device_resident";
> +    case GOMP_MAP_LINK: return "link";
> +    case GOMP_MAP_FORCE_DEVICEPTR: return "deviceptr";
> +    default:;

Please use default: break; instead.

>    for (i = 0; i < tgt->list_count; i++)
>      {
>        splay_tree_key k = tgt->list[i].key;
> +
>        if (k == NULL)
>  	continue;

Why the blank change?

Otherwise LGTM, if Thomas is ok with it.

	Jakub
Thomas Schwinge Dec. 22, 2018, 2:19 a.m. UTC | #7
Hi Julian!

The following review has been done in a rush, so can be expected to
contain more errors than usually.


Unfortunately, you're posting here changes to: add new features (great!),
add independent other new features (great!), refactor code (great!), fix
bugs (great!) -- all great, but would be much easier to review if all
these were discussed separately.


I rebased your latest patch onto the current version of the async
re-work, as published in
<https://github.com/tschwinge/gcc/tree/wip-async_re-work>.

I had to resolve a number of (mostly easy) textual conflicts, but I also
had to apply additional changes.

As part of the async re-work review and following re-work of the patch,
the "gomp_unmap_vars_async" function is no more.  I have instead added
"bool finalize" to "gomp_unmap_vars_internal" etc.  For symmetry to
"gomp_unmap_vars", I moved "bool finalize" for "gomp_unmap_vars_async"
before "struct goacc_asyncqueue *aq".

Additionally, and a bit more invasive, "goacc_async_unmap_tgt" has been
removed, which you call from the "goacc_remove_var_async" function that
you're adding.  I have removed that function and instead adapted
"gomp_unref_tgt" to be usable for this case, too (added "bool
is_tgt_unmapped" return value), added a "gomp_unref_tgt_void" wrapper,
renamed "gomp_remove_var" to "gomp_remove_var_internal", and added
"gomp_remove_var" and "gomp_remove_var_async" wrappers, the latter
replacing "goacc_remove_var_async".

For easier review of the latter changes, I'm attaching an incremental
patch, "0001-WIP-OpenACC-2.6-manual-deep-copy-support-attach-deta.patch".
(Maybe Chung-Lin could help with that, too?)

The attached
"0001-OpenACC-2.6-manual-deep-copy-support-attach-detach.patch" then is
the complete patch, that I've now (somewhat) tested, and (somewhat)
reviewed.

I'll be OK with that going into trunk, with my following comments
addressed in some way, or disputed, should they not make a lot of sense,
after long weeks, now at 03:00 in the night.


On Fri, 14 Dec 2018 19:00:30 +0000, Julian Brown <julian@codesourcery.com> wrote:
> [...]

Thanks, Julian (and before, Cesar), for taking on this big item of work,
and Jakub for his earlier review comments.


> One small thing, additionally: I've introduced a function
> c_omp_map_clause_name in c-family/c-omp.c to return the name of OpenACC
> map clauses -- since many of these use OMP_CLAUSE_MAP and then use
> OMP_CLAUSE_MAP_KIND to distinguish between different mapping types,
> just emitting 'map' to refer to several different clauses won't be very
> meaningful to the user. I didn't see any existing way to get the right
> names.

ACK.  

Eventually (that is, later), this should move into generic code, next to
the other "clause printing".  Also to be shared with Fortran.

> --- a/gcc/c/c-typeck.c
> +++ b/gcc/c/c-typeck.c

> @@ -12901,6 +12900,15 @@ handle_omp_array_sections_1 (tree c, tree t, vec<tree> &types,
>  		}
>  	      t = TREE_OPERAND (t, 0);
>  	    }
> +	  if (ort == C_ORT_ACC && TREE_CODE (t) == MEM_REF)
> +	    {
> +	      if (maybe_ne (mem_ref_offset (t), 0))
> +	        error_at (OMP_CLAUSE_LOCATION (c),
> +			  "cannot dereference %qE in %qs clause", t,
> +			  c_omp_map_clause_name (c, true));
> +	      else
> +		t = TREE_OPERAND (t, 0);
> +	    }
>  	}
>        if (!VAR_P (t) && TREE_CODE (t) != PARM_DECL)
>  	{

No test cases exercising that?  (Can be addressed later.)

> +/* Ensure that pointers are used in OpenACC attach and detach clauses.
> +   Return true if an error has been detected.  */
> +
> +static bool
> +c_oacc_check_attachments (tree c)
> +{
> +  if (OMP_CLAUSE_CODE (c) != OMP_CLAUSE_MAP)
> +    return false;
> +
> +  /* OpenACC attach / detach clauses must be pointers.  */
> +  if (OMP_CLAUSE_MAP_KIND (c) == GOMP_MAP_ATTACH
> +      || OMP_CLAUSE_MAP_KIND (c) == GOMP_MAP_DETACH)
> +    {
> +      tree t = OMP_CLAUSE_DECL (c);
> +
> +      while (TREE_CODE (t) == TREE_LIST)
> +	t = TREE_CHAIN (t);
> +
> +      if (TREE_CODE (TREE_TYPE (t)) != POINTER_TYPE)
> +	{
> +	  error_at (OMP_CLAUSE_LOCATION (c), "expected pointer in %qs clause",
> +		    c_omp_map_clause_name (c, true));
> +	  return true;
> +	}
> +    }
> +
> +  return false;
> +}

Share that with the C++ "cp_oacc_check_attachments"?  (Can be addressed
later.)

> @@ -14432,6 +14487,15 @@ c_finish_omp_clauses (tree clauses, enum c_omp_region_type ort)
>  		}
>  	      if (remove)
>  		break;
> +	      if (ort == C_ORT_ACC && TREE_CODE (t) == MEM_REF)
> +	        {
> +		  if (maybe_ne (mem_ref_offset (t), 0))
> +	            error_at (OMP_CLAUSE_LOCATION (c),
> +			      "cannot dereference %qE in %qs clause", t,
> +			      c_omp_map_clause_name (c, true));
> +		  else
> +		    t = TREE_OPERAND (t, 0);
> +		}

Same error as above; no test cases exercising that?  (Can be addressed
later.)

Also, that error is not relevant for C++?

> --- a/gcc/gimplify.c
> +++ b/gcc/gimplify.c

> @@ -11819,8 +11866,9 @@ gimplify_omp_target_update (tree *expr_p, gimple_seq *pre_p)
>  	   && omp_find_clause (OMP_STANDALONE_CLAUSES (expr),
>  			       OMP_CLAUSE_FINALIZE))
>      {
> -      /* Use GOMP_MAP_DELETE/GOMP_MAP_FORCE_FROM to denote that "finalize"
> -	 semantics apply to all mappings of this OpenACC directive.  */
> +      /* Use GOMP_MAP_DELETE, GOMP_MAP_FORCE_DETACH, and
> +	 GOMP_MAP_FORCE_FROM to denote that "finalize" semantics apply
> +	 to all mappings of this OpenACC directive.  */
>        bool finalize_marked = false;
>        for (tree c = OMP_STANDALONE_CLAUSES (expr); c; c = OMP_CLAUSE_CHAIN (c))
>  	if (OMP_CLAUSE_CODE (c) == OMP_CLAUSE_MAP)
> @@ -11834,10 +11882,19 @@ gimplify_omp_target_update (tree *expr_p, gimple_seq *pre_p)
>  	      OMP_CLAUSE_SET_MAP_KIND (c, GOMP_MAP_DELETE);
>  	      finalize_marked = true;
>  	      break;
> +	    case GOMP_MAP_DETACH:
> +	      OMP_CLAUSE_SET_MAP_KIND (c, GOMP_MAP_FORCE_DETACH);
> +	      finalize_marked = true;
> +	      break;
> +	    case GOMP_MAP_STRUCT:
> +	    case GOMP_MAP_FORCE_PRESENT:
> +	      /* Skip over an initial struct or force_present mapping.  */
> +	      break;
>  	    default:
> -	      /* Check consistency: libgomp relies on the very first data
> -		 mapping clause being marked, so make sure we did that before
> -		 any other mapping clauses.  */
> +	      /* Check consistency: libgomp relies on the very first
> +		 non-struct, non-force-present data mapping clause being
> +		 marked, so make sure we did that before any other mapping
> +		 clauses.  */
>  	      gcc_assert (finalize_marked);
>  	      break;
>  	    }

But libgomp hasn't been updated accordingly, to look for
"GOMP_MAP_FORCE_DETACH" ("GOACC_enter_exit_data", "finalize" detection)?

> --- a/gcc/testsuite/gfortran.dg/goacc/data-clauses.f95
> +++ b/gcc/testsuite/gfortran.dg/goacc/data-clauses.f95

> --- /dev/null
> +++ b/gcc/testsuite/gfortran.dg/goacc/derived-types.f90

> --- a/gcc/testsuite/gfortran.dg/goacc/enter-exit-data.f95
> +++ b/gcc/testsuite/gfortran.dg/goacc/enter-exit-data.f95

No Fortran test cases for at least testing valid/invalid syntax of the
"attach" and "detach" clauses?

> --- a/include/gomp-constants.h
> +++ b/include/gomp-constants.h
> @@ -42,6 +42,7 @@
>  #define GOMP_MAP_FLAG_SPECIAL_2		(1 << 4)
>  #define GOMP_MAP_FLAG_SPECIAL		(GOMP_MAP_FLAG_SPECIAL_1 \
>  					 | GOMP_MAP_FLAG_SPECIAL_0)
> +#define GOMP_MAP_DEEP_COPY		(1 << 5)
>  /* Flag to force a specific behavior (or else, trigger a run-time error).  */
>  #define GOMP_MAP_FLAG_FORCE		(1 << 7)

I wondered whether we really need to reserve a whole class (one bit out
of eight) here?

> @@ -128,6 +129,13 @@ enum gomp_map_kind
>      /* Decrement usage count and deallocate if zero.  */
>      GOMP_MAP_RELEASE =			(GOMP_MAP_FLAG_SPECIAL_2
>  					 | GOMP_MAP_DELETE),
> +    /* In OpenACC, attach a pointer to a mapped struct field.  */
> +    GOMP_MAP_ATTACH =			(GOMP_MAP_DEEP_COPY | 0),
> +    /* In OpenACC, detach a pointer to a mapped struct field.  */
> +    GOMP_MAP_DETACH =			(GOMP_MAP_DEEP_COPY | 1),
> +    /* In OpenACC, detach a pointer to a mapped struct field.  */
> +    GOMP_MAP_FORCE_DETACH =		(GOMP_MAP_DEEP_COPY
> +					 | GOMP_MAP_FLAG_FORCE | 1),

But anyway, we can still use for other things any other values with the
"GOMP_MAP_DEEP_COPY" bit set, should this become necessary later on.

Adding the "GOMP_MAP_FORCE_DETACH" mapping kind (that is,
"GOMP_MAP_DETACH | GOMP_MAP_FLAG_FORCE") could be avoided by using the
new mechanism 'For libgomp OpenACC entry points, redefine the "device"
argument to "flags"',
<https://gcc.gnu.org/ml/gcc-patches/2018-12/msg01420.html>, that I plan
to get into trunk soon.  However, obviously, there's already precedence
to communicate via other "| GOMP_MAP_FLAG_FORCE" mapping variants to
libgomp the "finalize" semantics, so that'd be a general change now.
(... to be done later; just mentioning this now, for completeness.)

> --- a/libgomp/libgomp.h
> +++ b/libgomp/libgomp.h

> @@ -902,6 +904,10 @@ struct target_mem_desc {
>     artificial pointer to "omp declare target link" object.  */
>  #define REFCOUNT_LINK (~(uintptr_t) 1)
>  
> +/* A special tag value for "virtual_refcount" in the splay_tree_key_s structure
> +   below.  */
> +#define VREFCOUNT_LINK_KEY (~(uintptr_t) 0)
> +
>  /* Special offset values.  */
>  #define OFFSET_INLINED (~(uintptr_t) 0)
>  #define OFFSET_POINTER (~(uintptr_t) 1)
> @@ -918,10 +924,21 @@ struct splay_tree_key_s {
>    uintptr_t tgt_offset;
>    /* Reference count.  */
>    uintptr_t refcount;
> -  /* Dynamic reference count.  */
> -  uintptr_t dynamic_refcount;
> -  /* Pointer to the original mapping of "omp declare target link" object.  */
> -  splay_tree_key link_key;
> +  /* Reference counts beyond those that represent genuine references in the
> +     linked splay tree key/target memory structures, e.g. for multiple OpenACC
> +     "present increment" operations (via "acc enter data") referring to the same
> +     host-memory block.
> +     If set to VREFCOUNT_LINK_KEY (for OpenMP, where this field is not otherwise
> +     needed), the union below represents a link key.  */
> +  uintptr_t virtual_refcount;
> +  union {
> +    /* For a block with attached pointers, the attachment counters for each.
> +       Only used for OpenACC.  */
> +    uintptr_t *attach_count;
> +    /* Pointer to the original mapping of "omp declare target link" object.
> +       Only used for OpenMP.  */
> +    splay_tree_key link_key;
> +  } u;
>  };

(Will reference that one later.)

> @@ -943,13 +960,6 @@ splay_compare (splay_tree_key x, splay_tree_key y)
>  
>  typedef struct acc_dispatch_t
>  {
> -  /* This is a linked list of data mapped using the
> -     acc_map_data/acc_unmap_data or "acc enter data"/"acc exit data" pragmas.
> -     Unlike mapped_data in the goacc_thread struct, unmapping can
> -     happen out-of-order with respect to mapping.  */
> -  /* This is guarded by the lock in the "outer" struct gomp_device_descr.  */
> -  struct target_mem_desc *data_environ;

Also revise the comment in "libgomp.h" containing a "data_environ"
mention.

> +struct gomp_coalesce_buf;
> [...]
>  struct gomp_coalesce_buf;

Duplicate.  (I removed it, I think.)

> --- a/libgomp/oacc-init.c
> +++ b/libgomp/oacc-init.c
> @@ -302,9 +302,13 @@ acc_shutdown_1 (acc_device_t d)
>  
>        if (walk->dev)
>  	{
> -	  gomp_mutex_lock (&walk->dev->lock);
> -	  gomp_free_memmap (&walk->dev->mem_map);
> -	  gomp_mutex_unlock (&walk->dev->lock);
> +	  while (walk->dev->mem_map.root)
> +	    {
> +	      splay_tree_key k = &walk->dev->mem_map.root->key;
> +	      if (k->virtual_refcount == VREFCOUNT_LINK_KEY)
> +		k->u.link_key = NULL;
> +	      gomp_remove_var (walk->dev, k);
> +	    }

Surprising to see here the OpenMP-only "VREFCOUNT_LINK_KEY", "u.link_key"
mentions, in OpenACC-specific code?

> --- a/libgomp/oacc-mem.c
> +++ b/libgomp/oacc-mem.c

> @@ -373,6 +364,7 @@ acc_unmap_data (void *h)

Hmm, just noting that "acc_unmap_data" looks rather similar to
"gomp_remove_var_internal" in what it's doing (but without the device
memory deallocation).  (Cleaning up any such things is for later,
obviously.)

> -  if (t->refcount == 2)
> +  if (tgt->refcount > 0)
> +    tgt->refcount--;
> +  else
>      {
> -[...]
> +      free (tgt->array);
> +      free (tgt);
>      }
>  
>    gomp_mutex_unlock (&acc_dev->lock);
> -
> -  gomp_unmap_vars (t, true);
>  }

But doesn't it need to duplicate the "free ([attach_count])" here,
though?

> +      if (aq)
> +	goacc_remove_var_async (acc_dev, n, aq);
> +      else
> +	gomp_remove_var (acc_dev, n);

I changed (two, I think) such things to just always call
"goacc_remove_var_async".

> +gomp_acc_remove_pointer (struct gomp_device_descr *acc_dev, void **hostaddrs,
> +			 size_t *sizes, unsigned short *kinds, int async,
> +			 bool finalize, int mapnum)

Should be "attribute_hidden"?

This one duplicates code from "delete_copyout", "GOACC_enter_exit_data"
(where the latter is also the only place it gets called from).  (Again,
that's been an issue before, for similar code, and certainly not the only
instance of such things, but we should sort this out eventually.)

This one has an unused "async" formal parameter.  It that meant to be
resolved to an asyncqueue, and pass that one to "gomp_copy_dev2host", and
call "gomp_remove_var_async" instead of "gomp_remove_var"?  That's here:

> +	  if (copyfrom)
> +	    gomp_copy_dev2host (acc_dev, NULL, (void *) cur_node.host_start,
> +				(void *) (n->tgt->tgt_start + n->tgt_offset
> +					  + cur_node.host_start
> +					  - n->host_start),
> +				cur_node.host_end - cur_node.host_start);
> +
> +	  if (n->refcount == 0)
> +	    gomp_remove_var (acc_dev, n);
> +	  break;

> --- a/libgomp/oacc-parallel.c
> +++ b/libgomp/oacc-parallel.c

> @@ -356,6 +375,10 @@ GOACC_enter_exit_data (int device, size_t mapnum,
>    if (mapnum > 0)
>      {
>        unsigned char kind = kinds[0] & 0xff;
> +
> +      if (kind == GOMP_MAP_STRUCT || kind == GOMP_MAP_FORCE_PRESENT)
> +        kind = kinds[1] & 0xff;

Are we sure that "kinds[1]" is always valid to dereference?

I was generally confused by your "GOMP_MAP_FORCE_PRESENT" mapping kind
changes/usage in this whole patch, and in "GOACC_enter_exit_data"
specifically.  It seems as if you're changing the behavior of
"GOMP_MAP_FORCE_PRESENT" here, which -- at least initially -- might look
like a problem for ABI compatibility, for code compiled with previous
versions of GCC?  I see you're now during gimplification generating
"GOMP_MAP_FORCE_PRESENT" mappings for some "attach" cases.  This should
actually be OK, as no OpenACC directive that directly maps to
"GOACC_enter_exit_data" (that is, "update", "enter data", "exit data")
support the "present" clause ("GOMP_MAP_FORCE_PRESENT").  On the other
hand, "GOACC_enter_exit_data" is also called via "GOACC_declare", and the
"declare" directive does support "present" clauses -- but that's handled
separately in "GOACC_declare" itself, ha.  So, why did
"GOACC_enter_exit_data" then support "GOMP_MAP_FORCE_PRESENT" already?
If that was for no particular reason, and before, "GOACC_enter_exit_data"
never had to deal with "GOMP_MAP_FORCE_PRESENT" in practice, then we
don't have to worry, otherwise, do we have to better understand this?

In addition to the hunk above, I mean things like this:

> @@ -366,11 +389,14 @@ GOACC_enter_exit_data (int device, size_t mapnum,
>      {
>        unsigned char kind = kinds[i] & 0xff;
>  
> -      if (kind == GOMP_MAP_POINTER || kind == GOMP_MAP_TO_PSET)
> +      if (kind == GOMP_MAP_POINTER
> +	  || kind == GOMP_MAP_TO_PSET
> +	  || kind == GOMP_MAP_STRUCT
> +	  || kind == GOMP_MAP_FORCE_PRESENT)
>  	continue;
>  
>        if (kind == GOMP_MAP_FORCE_ALLOC
> -	  || kind == GOMP_MAP_FORCE_PRESENT
> +	  || kind == GOMP_MAP_ATTACH
>  	  || kind == GOMP_MAP_FORCE_TO
>  	  || kind == GOMP_MAP_TO
>  	  || kind == GOMP_MAP_ALLOC)

> @@ -431,8 +472,14 @@ GOACC_enter_exit_data (int device, size_t mapnum,
>  	    }
>  	  else
>  	    {
> -	      gomp_acc_insert_pointer (pointer, &hostaddrs[i],
> -				       &sizes[i], &kinds[i], async);
> +	      goacc_aq aq = get_goacc_asyncqueue (async);
> +	      for (int j = 0; j < 2; j++)
> +		gomp_map_vars_async (acc_dev, aq,
> +				     (j == 0 || pointer == 2) ? 1 : 2,
> +				     &hostaddrs[i + j], NULL,
> +				     &sizes[i + j], &kinds[i + j], true,
> +				     GOMP_MAP_VARS_OPENACC_ENTER_DATA);

Can we get a comment added to such "magic", please?

> @@ -440,51 +487,145 @@ GOACC_enter_exit_data (int device, size_t mapnum,
>  	      i += pointer - 1;
>  	    }
>  	}
> +
> +      /* This loop only handles explicit "attach" clauses that are not an
> +	 implicit part of a copy{,in,out}, etc. mapping.  */
> +      for (i = 0; i < mapnum; i++)
> +        {
> +	  unsigned char kind = kinds[i] & 0xff;
> +
> +	  /* Scan for pointers and PSETs.  */
> +	  int pointer = find_pointer (i, mapnum, kinds);
> +
> +	  if (!pointer)
> +	    {
> +	      if (kind == GOMP_MAP_ATTACH)
> +		acc_attach (hostaddrs[i]);

Shouldn't this call the async variant?

> +	      else if (kind == GOMP_MAP_STRUCT)
> +	        i += sizes[i];
> +	    }
> +	  else
> +	    i += pointer - 1;
> +	}
>      }
>    else
> -    for (i = 0; i < mapnum; ++i)
> -      {
> -	unsigned char kind = kinds[i] & 0xff;
> +    {
> +      /* Handle "detach" before copyback/deletion of mapped data.  */
> +      for (i = 0; i < mapnum; i++)
> +        {
> +	  unsigned char kind = kinds[i] & 0xff;
>  
> -	int pointer = find_pointer (i, mapnum, kinds);
> +	  int pointer = find_pointer (i, mapnum, kinds);
>  
> -	if (!pointer)
> -	  {
> -	    switch (kind)
> -	      {
> -	      case GOMP_MAP_RELEASE:
> -	      case GOMP_MAP_DELETE:
> -		if (acc_is_present (hostaddrs[i], sizes[i]))
> +	  if (!pointer)
> +	    {
> +	      if (kind == GOMP_MAP_DETACH)
> +		acc_detach (hostaddrs[i]);
> +	      else if (kind == GOMP_MAP_FORCE_DETACH)
> +		acc_detach_finalize (hostaddrs[i]);
> +	      else if (kind == GOMP_MAP_STRUCT)
> +	        i += sizes[i];
> +	    }
> +	  else
> +	    {
> +	      unsigned char kind2 = kinds[i + pointer - 1] & 0xff;
> +
> +	      if (kind2 == GOMP_MAP_DETACH)
> +		acc_detach (hostaddrs[i + pointer - 1]);
> +	      else if (kind2 == GOMP_MAP_FORCE_DETACH)
> +	        acc_detach_finalize (hostaddrs[i + pointer - 1]);

Likewise for all these?

> --- a/libgomp/target.c
> +++ b/libgomp/target.c

> +void
> +gomp_attach_pointer (struct gomp_device_descr *devicep,
> +		     struct goacc_asyncqueue *aq, splay_tree mem_map,
> +		     splay_tree_key n, uintptr_t attach_to, size_t bias,
> +		     struct gomp_coalesce_buf *cbufp)

Should be "attribute_hidden"?

> +{
> +  struct splay_tree_key_s s;
> +  size_t size, idx;
> +
> +  if (n == NULL)
> +    {
> +      gomp_mutex_unlock (&devicep->lock);
> +      gomp_fatal ("enclosing struct not mapped for attach");
> +    }
> +
> +  size = (n->host_end - n->host_start + sizeof (void *) - 1) / sizeof (void *);
> +  /* We might have a pointer in a packed struct: however we cannot have more
> +     than one such pointer in each pointer-sized portion of the struct, so
> +     this is safe.  */
> +  idx = (attach_to - n->host_start) / sizeof (void *);
> +
> +  assert (n->virtual_refcount != VREFCOUNT_LINK_KEY);
> +
> +  if (!n->u.attach_count)
> +    n->u.attach_count
> +      = gomp_malloc_cleared (sizeof (*n->u.attach_count) * size);

OK, good enough in terms of memory usage, I suppose.

> +void
> +gomp_detach_pointer (struct gomp_device_descr *devicep,
> +		     struct goacc_asyncqueue *aq, splay_tree_key n,
> +		     uintptr_t detach_from, bool finalize,
> +		     struct gomp_coalesce_buf *cbufp)

Should be "attribute_hidden"?

"gomp_detach_pointer" can't make use of its "struct gomp_coalesce_buf
*cbufp" formal parameter, I think?  (And for "gomp_attach_pointer" I have
not thought through whether that actually helps, but it's at least
plausible, being called from "gomp_map_vars_internal".)

> -static inline uintptr_t
> +uintptr_t
>  gomp_map_val (struct target_mem_desc *tgt, void **hostaddrs, size_t i)

Should be "attribute_hidden"?

> @@ -899,13 +1064,15 @@ gomp_map_vars_async (struct gomp_device_descr *devicep,
>  				      kind & typemask, cbufp);
>  	    else
>  	      {
> -		k->link_key = NULL;
> +		if (k->virtual_refcount == VREFCOUNT_LINK_KEY)
> +		  k->u.link_key = NULL;
>  		if (n && n->refcount == REFCOUNT_LINK)
>  		  {
>  		    /* Replace target address of the pointer with target address
>  		       of mapped object in the splay tree.  */
>  		    splay_tree_remove (mem_map, n);
> -		    k->link_key = n;
> +		    k->u.link_key = n;
> +		    k->virtual_refcount = VREFCOUNT_LINK_KEY;
>  		  }
>  		size_t align = (size_t) 1 << (kind >> rshift);
>  		tgt->list[i].key = k;
> @@ -926,10 +1093,12 @@ gomp_map_vars_async (struct gomp_device_descr *devicep,
>  		tgt->list[i].copy_from = GOMP_MAP_COPY_FROM_P (kind & typemask);
>  		tgt->list[i].always_copy_from
>  		  = GOMP_MAP_ALWAYS_FROM_P (kind & typemask);
> +		tgt->list[i].do_detach = false;
>  		tgt->list[i].offset = 0;
>  		tgt->list[i].length = k->host_end - k->host_start;
>  		k->refcount = 1;
> -		k->dynamic_refcount = 0;
> +		k->virtual_refcount = 0;
> +		k->u.attach_count = NULL;
>  		tgt->refcount++;
>  		array->left = NULL;
>  		array->right = NULL;

Don't the latter two assignments unconditionally overwrite the earlier
ones ("k->u.link_key") quoted above?  (This doesn't seem to cause any
testsuite regressions, though.)  I must be missing something.

> @@ -1283,7 +1491,8 @@ gomp_load_image_to_device (struct gomp_device_descr *devicep, unsigned version,
>        k->tgt = tgt;
>        k->tgt_offset = target_table[i].start;
>        k->refcount = REFCOUNT_INFINITY;
> -      k->link_key = NULL;
> +      k->virtual_refcount = 0;
> +      k->u.link_key = NULL;
>        array->left = NULL;
>        array->right = NULL;
>        splay_tree_insert (&devicep->mem_map, array);
> @@ -1315,7 +1524,8 @@ gomp_load_image_to_device (struct gomp_device_descr *devicep, unsigned version,
>        k->tgt = tgt;
>        k->tgt_offset = target_var->start;
>        k->refcount = target_size & link_bit ? REFCOUNT_LINK : REFCOUNT_INFINITY;
> -      k->link_key = NULL;
> +      k->virtual_refcount = 0;
> +      k->u.link_key = NULL;
>        array->left = NULL;
>        array->right = NULL;
>        splay_tree_insert (&devicep->mem_map, array);

Surprising to see here "k->virtual_refcount = 0" (that is, OpenACC case),
but then "k->u.link_key = NULL" (that is, OpenMP case).

> -/* Free address mapping tables.  MM must be locked on entry, and remains locked
> -   on return.  */
> -
> -attribute_hidden void
> -gomp_free_memmap (struct splay_tree_s *mem_map)
> -{
> -  while (mem_map->root)
> -    {
> -      struct target_mem_desc *tgt = mem_map->root->key.tgt;
> -
> -      splay_tree_remove (mem_map, &mem_map->root->key);
> -      free (tgt->array);
> -      free (tgt);
> -    }
> -}

Also remove the prototype in "libgomp.h".

> @@ -2631,6 +2825,8 @@ omp_target_associate_ptr (const void *host_ptr, const void *device_ptr,
>        k->tgt = tgt;
>        k->tgt_offset = (uintptr_t) device_ptr + device_offset;
>        k->refcount = REFCOUNT_INFINITY;
> +      k->virtual_refcount = 0;
> +      k->u.link_key = NULL;
>        array->left = NULL;
>        array->right = NULL;
>        splay_tree_insert (&devicep->mem_map, array);

As above.

> --- a/libgomp/testsuite/libgomp.oacc-c-c++-common/context-2.c
> +++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/context-2.c
> @@ -182,13 +182,13 @@ main (int argc, char **argv)
>          exit (EXIT_FAILURE);
>      }
>  
> +    acc_delete (&h_X[0], N * sizeof (float));
> +    acc_delete (&h_Y1[0], N * sizeof (float));
> +
>      free (h_X);
>      free (h_Y1);
>      free (h_Y2);
>  
> -    acc_free (d_X);
> -    acc_free (d_Y);
> -
>      context_check (pctx);
>  
>      s = cublasDestroy (h);
> --- a/libgomp/testsuite/libgomp.oacc-c-c++-common/context-4.c
> +++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/context-4.c
> @@ -176,13 +176,13 @@ main (int argc, char **argv)
>          exit (EXIT_FAILURE);
>      }
>  
> +    acc_delete (&h_X[0], N * sizeof (float));
> +    acc_delete (&h_Y1[0], N * sizeof (float));
> +
>      free (h_X);
>      free (h_Y1);
>      free (h_Y2);
>  
> -    acc_free (d_X);
> -    acc_free (d_Y);
> -
>      context_check (pctx);
>  
>      s = cublasDestroy (h);

Hmm, why is this not also required for "context-1.c" and "context-3.c"?

> [...]

Thanks for adding at least a good number of libgomp test cases.

There are, however, no libgomp test cases for acc_detach_finalize, and
generally none for any of the several async variants (including using the
respective OpenACC directives).


As I said above, I'm fine with the changes generally, and some/most? of
my comments can be dealt with later on.


Grüße
 Thomas
From cc70f9cc92d588bc915d505d3b2c15def2eacb01 Mon Sep 17 00:00:00 2001
From: Thomas Schwinge <thomas@codesourcery.com>
Date: Fri, 21 Dec 2018 22:08:57 +0100
Subject: [PATCH] [WIP] OpenACC 2.6 manual deep copy support (attach/detach):
 adjust for "goacc_async_unmap_tgt" removal

---
 libgomp/libgomp.h       |  2 ++
 libgomp/oacc-async.c    | 18 ------------
 libgomp/oacc-int.h      |  4 ---
 libgomp/oacc-mem.c      |  5 +---
 libgomp/oacc-parallel.c |  7 +----
 libgomp/target.c        | 63 ++++++++++++++++++++++++++++++-----------
 6 files changed, 51 insertions(+), 48 deletions(-)

diff --git a/libgomp/libgomp.h b/libgomp/libgomp.h
index f0bc33cce1d1..310e79693d39 100644
--- a/libgomp/libgomp.h
+++ b/libgomp/libgomp.h
@@ -1119,6 +1119,8 @@ extern void gomp_init_device (struct gomp_device_descr *);
 extern bool gomp_fini_device (struct gomp_device_descr *);
 extern void gomp_unload_device (struct gomp_device_descr *);
 extern bool gomp_remove_var (struct gomp_device_descr *, splay_tree_key);
+extern void gomp_remove_var_async (struct gomp_device_descr *, splay_tree_key,
+				   struct goacc_asyncqueue *);
 
 /* work.c */
 
diff --git a/libgomp/oacc-async.c b/libgomp/oacc-async.c
index e8a836ea2460..dbb501335f11 100644
--- a/libgomp/oacc-async.c
+++ b/libgomp/oacc-async.c
@@ -259,24 +259,6 @@ acc_wait_all_async (int async)
   gomp_mutex_unlock (&thr->dev->openacc.async.lock);
 }
 
-/* Remove a variable asynchronously.  This actually removes the variable
-   mapping immediately, but retains the linked target_mem_desc until the
-   asynchronous operation has completed (as it may still refer to target
-   memory).  The device lock must be held before entry, and remains locked on
-   exit.  */
-
-attribute_hidden void
-goacc_remove_var_async (struct gomp_device_descr *devicep, splay_tree_key n,
-                       struct goacc_asyncqueue *aq)
-{
-  struct target_mem_desc *tgt = n->tgt;
-  assert (tgt);
-  tgt->refcount++;
-  gomp_remove_var (devicep, n);
-  devicep->openacc.async.queue_callback_func (aq, GONE goacc_async_unmap_tgt,
-                                             (void *) tgt);
-}
-
 attribute_hidden void
 goacc_async_free (struct gomp_device_descr *devicep,
 		  struct goacc_asyncqueue *aq, void *ptr)
diff --git a/libgomp/oacc-int.h b/libgomp/oacc-int.h
index 1eeb3e0ab6f4..9c6e0b606bd4 100644
--- a/libgomp/oacc-int.h
+++ b/libgomp/oacc-int.h
@@ -101,10 +101,6 @@ void goacc_host_init (void);
 
 void goacc_init_asyncqueues (struct gomp_device_descr *);
 bool goacc_fini_asyncqueues (struct gomp_device_descr *);
-struct splay_tree_key_s;
-void goacc_remove_var_async (struct gomp_device_descr *devicep,
-			     struct splay_tree_key_s *n,
-			     struct goacc_asyncqueue *aq);
 void goacc_async_free (struct gomp_device_descr *, struct goacc_asyncqueue *,
 		       void *);
 struct goacc_asyncqueue *get_goacc_asyncqueue (int);
diff --git a/libgomp/oacc-mem.c b/libgomp/oacc-mem.c
index 405b7392546b..ae452712a863 100644
--- a/libgomp/oacc-mem.c
+++ b/libgomp/oacc-mem.c
@@ -623,10 +623,7 @@ delete_copyout (unsigned f, void *h, size_t s, int async, const char *libfnname)
 			      + (uintptr_t) h - n->host_start);
 	  gomp_copy_dev2host (acc_dev, aq, h, d, s);
 	}
-      if (aq)
-	goacc_remove_var_async (acc_dev, n, aq);
-      else
-	gomp_remove_var (acc_dev, n);
+      gomp_remove_var_async (acc_dev, n, aq);
     }
 
   gomp_mutex_unlock (&acc_dev->lock);
diff --git a/libgomp/oacc-parallel.c b/libgomp/oacc-parallel.c
index 6ac0ccaf7af9..85d168d12b68 100644
--- a/libgomp/oacc-parallel.c
+++ b/libgomp/oacc-parallel.c
@@ -610,12 +610,7 @@ GOACC_enter_exit_data (int flags_m, size_t mapnum,
 			    else if (str->refcount > 0)
 			      str->refcount--;
 			    if (str->refcount == 0)
-			      {
-				if (aq)
-				  goacc_remove_var_async (acc_dev, str, aq);
-				else
-				  gomp_remove_var (acc_dev, str);
-			      }
+			      gomp_remove_var_async (acc_dev, str, aq);
 			  }
 		      }
 		    i += elems;
diff --git a/libgomp/target.c b/libgomp/target.c
index 59b8946b0a71..a047beb90583 100644
--- a/libgomp/target.c
+++ b/libgomp/target.c
@@ -1266,8 +1266,33 @@ gomp_unmap_tgt (struct target_mem_desc *tgt)
   free (tgt);
 }
 
-attribute_hidden bool
-gomp_remove_var (struct gomp_device_descr *devicep, splay_tree_key k)
+static bool
+gomp_unref_tgt (void *ptr)
+{
+  bool is_tgt_unmapped = false;
+
+  struct target_mem_desc *tgt = (struct target_mem_desc *) ptr;
+
+  if (tgt->refcount > 1)
+    tgt->refcount--;
+  else
+    {
+      gomp_unmap_tgt (tgt);
+      is_tgt_unmapped = true;
+    }
+
+  return is_tgt_unmapped;
+}
+
+static void
+gomp_unref_tgt_void (void *ptr)
+{
+  (void) gomp_unref_tgt (ptr);
+}
+
+static inline __attribute__((always_inline)) bool
+gomp_remove_var_internal (struct gomp_device_descr *devicep, splay_tree_key k,
+			  struct goacc_asyncqueue *aq)
 {
   bool is_tgt_unmapped = false;
   splay_tree_remove (&devicep->mem_map, k);
@@ -1278,25 +1303,31 @@ gomp_remove_var (struct gomp_device_descr *devicep, splay_tree_key k)
     }
   else if (k->u.attach_count)
     free (k->u.attach_count);
-  if (k->tgt->refcount > 1)
-    k->tgt->refcount--;
+  if (aq)
+    devicep->openacc.async.queue_callback_func (aq, gomp_unref_tgt_void,
+						(void *) k->tgt);
   else
-    {
-      is_tgt_unmapped = true;
-      gomp_unmap_tgt (k->tgt);
-    }
+    is_tgt_unmapped = gomp_unref_tgt ((void *) k->tgt);
   return is_tgt_unmapped;
 }
 
-static void
-gomp_unref_tgt (void *ptr)
+attribute_hidden bool
+gomp_remove_var (struct gomp_device_descr *devicep, splay_tree_key k)
 {
-  struct target_mem_desc *tgt = (struct target_mem_desc *) ptr;
+  return gomp_remove_var_internal (devicep, k, NULL);
+}
 
-  if (tgt->refcount > 1)
-    tgt->refcount--;
-  else
-    gomp_unmap_tgt (tgt);
+/* Remove a variable asynchronously.  This actually removes the variable
+   mapping immediately, but retains the linked target_mem_desc until the
+   asynchronous operation has completed (as it may still refer to target
+   memory).  The device lock must be held before entry, and remains locked on
+   exit.  */
+
+attribute_hidden void
+gomp_remove_var_async (struct gomp_device_descr *devicep, splay_tree_key k,
+		       struct goacc_asyncqueue *aq)
+{
+  (void) gomp_remove_var_internal (devicep, k, aq);
 }
 
 /* Unmap variables described by TGT.  If DO_COPYFROM is true, copy relevant
@@ -1373,7 +1404,7 @@ gomp_unmap_vars_internal (struct target_mem_desc *tgt, bool do_copyfrom,
     }
 
   if (aq)
-    devicep->openacc.async.queue_callback_func (aq, gomp_unref_tgt,
+    devicep->openacc.async.queue_callback_func (aq, gomp_unref_tgt_void,
 						(void *) tgt);
   else
     gomp_unref_tgt ((void *) tgt);
Thomas Schwinge Oct. 18, 2019, 4:47 p.m. UTC | #8
Hi!

While reviewing
<http://mid.mail-archive.com/20191003163505.49997-2-julian@codesourcery.com>
"OpenACC reference count overhaul", I've just now stumbled over one thing
that originally was designed here:

On 2018-12-10T19:41:37+0000, Julian Brown <julian@codesourcery.com> wrote:
> On Fri, 7 Dec 2018 14:50:19 +0100
> Jakub Jelinek <jakub@redhat.com> wrote:
>
>> On Fri, Nov 30, 2018 at 03:41:09AM -0800, Julian Brown wrote:
>> > @@ -918,8 +920,13 @@ struct splay_tree_key_s {
>> >    uintptr_t tgt_offset;
>> >    /* Reference count.  */
>> >    uintptr_t refcount;
>> > -  /* Dynamic reference count.  */
>> > -  uintptr_t dynamic_refcount;
>> > +  /* Reference counts beyond those that represent genuine references in the
>> > +     linked splay tree key/target memory structures, e.g. for multiple OpenACC
>> > +     "present increment" operations (via "acc enter data") refering to the same
>> > +     host-memory block.  */
>> > +  uintptr_t virtual_refcount;
>> > +  /* For a block with attached pointers, the attachment counters for each.  */
>> > +  unsigned short *attach_count;
>> >    /* Pointer to the original mapping of "omp declare target link" object.  */
>> >    splay_tree_key link_key;
>> >  };  
>> 
>> This is something I'm worried about a lot, the nodes keep growing way
>> too much.

Is that just a would-be-nice-to-avoid, or is it an actual problem?

If the latter, can we maybe move some data into on-the-side data
structures, say an associative array keyed by [something suitable]?  I
would assume that compared to actual host to/from device data movement
(or even lookup etc.), lookup of values from such an associative array
should be relatively cheap?

I'm bringing this up, because:

>> Is there a way to reuse some other field if it is of
>> certain kind?
>
> How about this -- it seems that the link_key is only used for OpenMP,

So, is that actually correct?  Per my understanding, for the OpenACC
'link' clause we uses 'GOMP_MAP_LINK', which sets "omp declare target
link", and thus:

> and the attach count is only needed for OpenACC. So the obvious thing
> to do is probably to put those two together into a tagged union. The
> question is where to put the tag?
>
> Options are, I guess:
>
> 1. The high or low bits of the address.  Potentially non-portable, ugly.
>
> 2. Or, the virtual refcount is also only needed for OpenACC, so we can
>    reserve a magic value for that field to act as a tag.
>
> I've tried implementing the latter in the attached patch, and it seems
> to work OK.

... this is not actually feasible?

It's certainly possible that we're totally lacking sufficient testsuite
coverage, and that there are issues in the 'link' implementation
(<https://gcc.gnu.org/PR81689> "libgomp.c/target-link-1.c fails for
nvptx: #pragma omp target link not implemented" comes to mind
immediatelly, and certainly for OpenACC I used to be aware of additional
issues; I think I intended to use that mechanism for Fortran
'allocatable' with OpenACC 'declare'), but the libgomp handling to me
seems reasonable upon quick review -- just that we need to keep it alive
for OpenACC, too, unless I'm confused?


Simplifying the libgomp code to avoid the 'VREFCOUNT_LINK_KEY' toggle
flag, and not putting 'link_key' into an union together with
'attach_count', that should -- I hope -- resolve/obsolete some of the
questions raised in my late-night pre-Christmas 2018 review,
<https://gcc.gnu.org/ml/gcc-patches/2018-12/msg01620.html>, where I'm now
not sure yet whether all my questions have been addressed (or disputed,
but I didn't hear anything) in the recent -- split-out, thanks! --
version of this patch,
<http://mid.mail-archive.com/20191003163505.49997-2-julian@codesourcery.com>
"OpenACC reference count overhaul".


Grüße
 Thomas
Julian Brown Nov. 6, 2019, 6:43 p.m. UTC | #9
Hi,

This version of the patch has been adjusted for changes to the
reference-count overhaul patch last posted here:

https://gcc.gnu.org/ml/gcc-patches/2019-10/msg02041.html

In particular, it incorporates the idea alluded to in that patch
relating to adding the new "attach_count" field to the memory-mapping
splay tree key type without growing that structure in the common case
(i.e. when structure components are not being mapped, or for OpenMP).
In short, a new auxiliary structure is added containing the previous
"link_key" and "attach_count" fields: so, you can either have both
pointers (though of course one of them may be NULL), or in the common
case no aux pointer at all, so no growth in the base struct size.

I've also addressed a couple of Thomas's earlier review comments from
December 2018:

https://gcc.gnu.org/ml/gcc-patches/2018-12/msg01620.html

And also, as a minor point, the patch reincorporates the patch adding
c_omp_map_clause_name, since it refers to the newly-added
GOMP_MAP_ATTACH values so the two patches are mutually dependent. That
one was posted here:

https://gcc.gnu.org/ml/gcc-patches/2019-10/msg00443.html

On Fri, 18 Oct 2019 18:47:08 +0200
Thomas Schwinge <thomas@codesourcery.com> wrote:

> Hi!
> 
> While reviewing
> <http://mid.mail-archive.com/20191003163505.49997-2-julian@codesourcery.com>
> "OpenACC reference count overhaul", I've just now stumbled over one
> thing that originally was designed here:
> 
> On 2018-12-10T19:41:37+0000, Julian Brown <julian@codesourcery.com>
> wrote:
> > On Fri, 7 Dec 2018 14:50:19 +0100
> > Jakub Jelinek <jakub@redhat.com> wrote:
> >  
> >> On Fri, Nov 30, 2018 at 03:41:09AM -0800, Julian Brown wrote:  
> >> > @@ -918,8 +920,13 @@ struct splay_tree_key_s {
> >> >    uintptr_t tgt_offset;
> >> >    /* Reference count.  */
> >> >    uintptr_t refcount;
> >> > -  /* Dynamic reference count.  */
> >> > -  uintptr_t dynamic_refcount;
> >> > +  /* Reference counts beyond those that represent genuine
> >> > references in the
> >> > +     linked splay tree key/target memory structures, e.g. for
> >> > multiple OpenACC
> >> > +     "present increment" operations (via "acc enter data")
> >> > refering to the same
> >> > +     host-memory block.  */
> >> > +  uintptr_t virtual_refcount;
> >> > +  /* For a block with attached pointers, the attachment
> >> > counters for each.  */
> >> > +  unsigned short *attach_count;
> >> >    /* Pointer to the original mapping of "omp declare target
> >> > link" object.  */ splay_tree_key link_key;
> >> >  };    
> >> 
> >> This is something I'm worried about a lot, the nodes keep growing
> >> way too much.  
> 
> Is that just a would-be-nice-to-avoid, or is it an actual problem?
> 
> If the latter, can we maybe move some data into on-the-side data
> structures, say an associative array keyed by [something suitable]?  I
> would assume that compared to actual host to/from device data movement
> (or even lookup etc.), lookup of values from such an associative array
> should be relatively cheap?

I'd be extremely wary of adding a completely separate off-the-side
structure to keep track of attachment counters: the reference-counting
behaviour is already complicated enough, and the risk of messing things
up with another indirectly-linked structure to keep track of is too
high (never mind the extra runtime overhead). With the approach in this
patch, at least the extra info for link_key/attach_count is directly
accessible from the splay tree key struct via pointer indirection.

This version entails slight additional overhead (another malloc'd
block and another pointer indirection) for the link_key field (and also
for the attach_count pointer). I've not benchmarked memory use or
performance though, so I'm not sure how much impact this has on real
code.

> I'm bringing this up, because:
> 
> >> Is there a way to reuse some other field if it is of
> >> certain kind?  
> >
> > How about this -- it seems that the link_key is only used for
> > OpenMP,  
> 
> So, is that actually correct?  Per my understanding, for the OpenACC
> 'link' clause we uses 'GOMP_MAP_LINK', which sets "omp declare target
> link", and thus:
> 
> > and the attach count is only needed for OpenACC. So the obvious
> > thing to do is probably to put those two together into a tagged
> > union. The question is where to put the tag?
> >
> > Options are, I guess:
> >
> > 1. The high or low bits of the address.  Potentially non-portable,
> > ugly.
> >
> > 2. Or, the virtual refcount is also only needed for OpenACC, so we
> > can reserve a magic value for that field to act as a tag.
> >
> > I've tried implementing the latter in the attached patch, and it
> > seems to work OK.  
> 
> ... this is not actually feasible?

I'd missed that link_key could be used for OpenACC as well as OpenMP
(obviously). This version of the patch avoids that problem.

Is this version OK? Re-tested with offloading to nvptx.

Thanks,

Julian

ChangeLog

2019-11-06  Julian Brown  <julian@codesourcery.com>
            Thomas Schwinge  <thomas@codesourcery.com>

    gcc/c-family/
    * c-common.h (c_omp_map_clause_name): Add prototype.
    * c-omp.c (c_omp_map_clause_name): New function.
    * c-pragma.h (pragma_omp_clause): Add PRAGMA_OACC_CLAUSE_ATTACH and
    PRAGMA_OACC_CLAUSE_DETACH.

    gcc/c/
    * c-parser.c (c_parser_omp_clause_name): Add parsing of attach and
    detach clauses.
    (c_parser_omp_variable_list): Add ALLOW_DEREF optional parameter.
    Allow deref (->) in variable lists if true.
    (c_parser_omp_var_list_parens): Add ALLOW_DEREF optional parameter.
    Pass to c_parser_omp_variable_list.
    (c_parser_oacc_data_clause): Support attach and detach clauses.  Update
    call to c_parser_omp_variable_list.
    (c_parser_oacc_all_clauses): Support attach and detach clauses.
    (OACC_DATA_CLAUSE_MASK, OACC_ENTER_DATA_CLAUSE_MASK)
    (OACC_KERNELS_CLAUSE_MASK, OACC_PARALLEL_CLAUSE_MASK): Add
    PRAGMA_OACC_CLAUSE_ATTACH.
    (OACC_EXIT_DATA_CLAUSE_MASK): Add PRAGMA_OACC_CLAUSE_DETACH.
    * c-typeck.c (handle_omp_array_sections_1): Reject subarrays for attach
    and detach.  Support deref.
    (handle_omp_array_sections): Use GOMP_MAP_ATTACH_DETACH instead of
    GOMP_MAP_ALWAYS_POINTER for OpenACC.
    (c_oacc_check_attachments): New function.
    (c_finish_omp_clauses): Check attach/detach arguments for being
    pointers using above.  Support deref.

    gcc/cp/
    * parser.c (cp_parser_omp_clause_name): Support attach and detach
    clauses.
    (cp_parser_omp_var_list_no_open): Add ALLOW_DEREF optional parameter.
    Parse deref if true.
    (cp_parser_omp_var_list): Add ALLOW_DEREF optional parameter.  Pass to
    cp_parser_omp_var_list_no_open.
    (cp_parser_oacc_data_clause): Support attach and detach clauses.
    Update call to cp_parser_omp_var_list_no_open.
    (cp_parser_oacc_all_clauses): Support attach and detach.
    (OACC_DATA_CLAUSE_MASK, OACC_ENTER_DATA_CLAUSE_MASK)
    (OACC_KERNELS_CLAUSE_MASK, OACC_PARALLEL_CLAUSE_MASK): Add
    PRAGMA_OACC_CLAUSE_ATTACH.
    (OACC_EXIT_DATA_CLAUSE_MASK): Add PRAGMA_OACC_CLAUSE_DETACH.
    * semantics.c (handle_omp_array_sections_1): Reject subarrays for
    attach and detach.
    (handle_omp_array_sections): Use GOMP_MAP_ATTACH_DETACH instead of
    GOMP_MAP_ALWAYS_POINTER for OpenACC.
    (cp_oacc_check_attachments): New function.
    (finish_omp_clauses): Use above function.  Allow structure fields and
    class members to appear in OpenACC data clauses.  Support
    GOMP_MAP_ATTACH_DETACH.  Support deref.

    gcc/fortran/
    * gfortran.h (gfc_omp_map_op): Add OMP_MAP_ATTACH, OMP_MAP_DETACH.
    * openmp.c (gfc_match_omp_variable_list): Add allow_derived parameter.
    Parse derived-type member accesses if true.
    (omp_mask2): Add OMP_CLAUSE_ATTACH and OMP_CLAUSE_DETACH.
    (gfc_match_omp_map_clause): Add allow_derived parameter.  Pass to
    gfc_match_omp_variable_list.
    (gfc_match_omp_clauses): Support attach and detach.  Support derived
    types for appropriate OpenACC directives.
    (OACC_PARALLEL_CLAUSES, OACC_KERNELS_CLAUSES, OACC_DATA_CLAUSES)
    (OACC_ENTER_DATA_CLAUSES): Add OMP_CLAUSE_ATTACH.
    (OACC_EXIT_DATA_CLAUSES): Add OMP_CLAUSE_DETACH.
    (check_symbol_not_pointer): Don't disallow pointer objects of derived
    type.
    (resolve_oacc_data_clauses): Don't disallow allocatable derived types.
    (resolve_omp_clauses): Perform duplicate checking only for non-derived
    type component accesses (plain variables and arrays or array sections).
    Support component refs.
    * trans-expr.c (gfc_conv_component_ref,
    conv_parent_component_references): Make global.
    (gfc_auto_dereference_var): New function, broken out of...
    (gfc_conv_variable): ...here.  Call above function.
    * trans-openmp.c (gfc_omp_privatize_by_reference): Support component
    refs.
    (gfc_trans_omp_array_section): New function, broken out of...
    (gfc_trans_omp_clauses): ...here.  Support component refs/derived
    types, attach and detach clauses.
    * trans.h (gfc_conv_component_ref, conv_parent_component_references,
    gfc_auto_dereference_var): Add prototypes.

    gcc/
    * gimplify.c (gimplify_omp_var_data): Add GOVD_MAP_HAS_ATTACHMENTS.
    (insert_struct_comp_map): Support derived-type member mappings
    for arrays with descriptors which use GOMP_MAP_TO_PSET.  Support
    GOMP_MAP_ATTACH_DETACH.
    (gimplify_scan_omp_clauses): Tidy up OACC_ENTER_DATA/OACC_EXIT_DATA
    mappings.  Handle attach/detach clauses and component references.
    (gimplify_adjust_omp_clauses_1): Skip adjustments for explicit
    attach/detach clauses.
    (gimplify_omp_target_update): Handle finalize for detach.
    * omp-low.c (lower_omp_target): Support GOMP_MAP_ATTACH,
    GOMP_MAP_DETACH, GOMP_MAP_FORCE_DETACH.
    * tree-pretty-print.c (dump_omp_clause): Likewise, plus
    GOMP_MAP_ATTACH_DETACH.

    include/
    * gomp-constants.h (GOMP_MAP_FLAG_SPECIAL_4, GOMP_MAP_DEEP_COPY):
    Define.
    (gomp_map_kind): Add GOMP_MAP_ATTACH, GOMP_MAP_DETACH,
    GOMP_MAP_FORCE_DETACH.

    gcc/testsuite/
    * c-c++-common/goacc/deep-copy-arrayofstruct.c: New test.
    * c-c++-common/goacc/mdc-1.c: New test.
    * c-c++-common/goacc/mdc-2.c: New test.
    * gcc.dg/goacc/mdc.C: New test.
    * gfortran.dg/goacc/derived-types.f90: New test.
    * gfortran.dg/goacc/derived-types-2.f90: New test.
    * gfortran.dg/goacc/data-clauses.f95: Adjust for expected errors.
    * gfortran.dg/goacc/enter-exit-data.f95: Likewise.

    libgomp/
    * libgomp.h (struct target_var_desc): Add do_detach flag.
    (struct splay_tree_aux): New.
    (struct splay_tree_key_s): Replace link_key field with aux pointer.
    (gomp_attach_pointer, gomp_detach_pointer): Add prototypes.
    * libgomp.map (OACC_2.6): New section. Add acc_attach,
    acc_attach_async, acc_detach, acc_detach_async, acc_detach_finalize,
    acc_detach_finalize_async.
    * oacc-init.c (acc_shutdown_1): Free aux block if present.
    * oacc-mem.c (acc_attach_async, acc_attach, goacc_detach_internal,
    acc_detach, acc_detach_async, acc_detach_finalize,
    acc_detach_finalize_async): New functions.
    * oacc-parallel.c (find_group_last): Add SIZES parameter. Support
    struct components.  Tidy up and add some new checks.
    (goacc_enter_data_internal): Update call to find_group_last.
    (goacc_exit_data_internal): Support detach operations and
    GOMP_MAP_STRUCT.
    (GOACC_enter_exit_data): Handle initial GOMP_MAP_STRUCT or
    GOMP_MAP_FORCE_PRESENT in finalization detection code.  Handle
    attach/detach in enter/exit data detection code.
    * openacc.h (acc_attach, acc_attach_async, acc_detach,
    (acc_detach_async, acc_detach_finalize, acc_detach_finalize_async): Add
    prototypes.
    * target.c (dump_tgt): Support aux field.
    (gomp_map_vars_existing): Initialise do_detach field of tgt_var_desc.
    (gomp_attach_pointer, gomp_detach_pointer): New functions.
    (gomp_map_vars_internal): Support attach and detach.
    (gomp_remove_var_internal): Free aux block and attachment counts if
    present.
    (gomp_unmap_vars_internal): Support detach.
    (gomp_load_image_to_device): Zero-initialise aux field instead of
    link_key field.
    (gomp_exit_data): Handle link key in aux field.  Free aux field when
    appropriate.
    (omp_target_associate_ptr): Zero-initialize aux field instead of
    link_key.
    * testsuite/libgomp.oacc-c-c++-common/deep-copy-1.c: New test.
    * testsuite/libgomp.oacc-c-c++-common/deep-copy-2.c: New test.
    * testsuite/libgomp.oacc-c-c++-common/deep-copy-3.c: New test.
    * testsuite/libgomp.oacc-c-c++-common/deep-copy-4.c: New test.
    * testsuite/libgomp.oacc-c-c++-common/deep-copy-5.c: New test.
    * testsuite/libgomp.oacc-c-c++-common/deep-copy-6.c: New test.
    * testsuite/libgomp.oacc-c-c++-common/deep-copy-7.c: New test.
    * testsuite/libgomp.oacc-c-c++-common/deep-copy-8.c: New test.
    * testsuite/libgomp.oacc-c-c++-common/deep-copy-9.c: New test.
    * testsuite/libgomp.oacc-c-c++-common/deep-copy-10.c: New test.
    * testsuite/libgomp.oacc-c-c++-common/deep-copy-11.c: New test.
    * testsuite/libgomp.oacc-c-c++-common/deep-copy-14.c: New test.
    * testsuite/libgomp.oacc-c++/deep-copy-12.C: New test.
    * testsuite/libgomp.oacc-c++/deep-copy-13.C: New test.
    * testsuite/libgomp.oacc-fortran/deep-copy-1.f90: New test.
    * testsuite/libgomp.oacc-fortran/deep-copy-2.f90: New test.
    * testsuite/libgomp.oacc-fortran/deep-copy-3.f90: New test.
    * testsuite/libgomp.oacc-fortran/deep-copy-4.f90: New test.
    * testsuite/libgomp.oacc-fortran/deep-copy-5.f90: New test.
    * testsuite/libgomp.oacc-fortran/deep-copy-6.f90: New test.
    * testsuite/libgomp.oacc-fortran/deep-copy-7.f90: New test.
    * testsuite/libgomp.oacc-fortran/deep-copy-8.f90: New test.
    * testsuite/libgomp.oacc-fortran/derived-type-1.f90: New test.
    * testsuite/libgomp.oacc-fortran/derivedtype-1.f95: New test.
    * testsuite/libgomp.oacc-fortran/derivedtype-2.f95: New test.
    * testsuite/libgomp.oacc-fortran/multidim-slice.f95: New test.
    * testsuite/libgomp.oacc-fortran/update-2.f90: New test.
Julian Brown Nov. 22, 2019, 11:42 p.m. UTC | #10
Hi,

On Wed, 6 Nov 2019 18:43:39 +0000
Julian Brown <julian@codesourcery.com> wrote:

> This version of the patch has been adjusted for changes to the
> reference-count overhaul patch last posted here:
> 
> https://gcc.gnu.org/ml/gcc-patches/2019-10/msg02041.html

Here's a rebased/slightly-updated/retested version of the patch, for
convenience. It's mostly the same as before, apart from mechanical
merges arising from the rebase. One improvement is that the attach
clause should work on the "serial" construct now also, though there are
no new tests for that as yet.

OK?

Thanks,

Julian
Tobias Burnus Nov. 25, 2019, 10:27 a.m. UTC | #11
Hi Julian,

On 11/23/19 12:42 AM, Julian Brown wrote:
>
>              gcc/fortran/
>              * trans-expr.c
>              (gfc_auto_dereference_var): New function, broken out of...
> […]
> +gfc_auto_dereference_var (location_t loc, gfc_symbol *sym, tree var,
> +			  bool descriptor_only_p, bool is_classarray)
> […]
> +	var = build_fold_indirect_ref_loc (input_location, var);
As your patch takes a location as argument – it also makes sense to use 
that location. (Alternatively, one could remove the argument as both 
callers explicitly pass 'input_location' as argument.)

One could do either way, but the current variant does not make sense – 
and, additionally, the current variant causes a compile-time warning.

See also my OG9 commit 500483e6ced44e2e0fea6a37e4f8c267ebaf826a where do 
s/input_location/loc/g in that function.

Cheers,

Tobias
Julian Brown Nov. 26, 2019, 2:45 a.m. UTC | #12
On Mon, 25 Nov 2019 11:27:35 +0100
Tobias Burnus <tobias@codesourcery.com> wrote:

> Hi Julian,
> 
> On 11/23/19 12:42 AM, Julian Brown wrote:
> >
> >              gcc/fortran/
> >              * trans-expr.c
> >              (gfc_auto_dereference_var): New function, broken out
> > of... […]
> > +gfc_auto_dereference_var (location_t loc, gfc_symbol *sym, tree
> > var,
> > +			  bool descriptor_only_p, bool
> > is_classarray) […]
> > +	var = build_fold_indirect_ref_loc (input_location, var);  
> As your patch takes a location as argument – it also makes sense to
> use that location. (Alternatively, one could remove the argument as
> both callers explicitly pass 'input_location' as argument.)
> 
> One could do either way, but the current variant does not make sense
> – and, additionally, the current variant causes a compile-time
> warning.
> 
> See also my OG9 commit 500483e6ced44e2e0fea6a37e4f8c267ebaf826a where
> do s/input_location/loc/g in that function.

Thanks -- I'd missed that. This version of the patch uses your
alternative suggestion -- i.e., removing the location_t parameter from
gfc_auto_dereference_var. That gets rid of the warning.

The rest of the patch is as the last-posted version.

Cheers,

Julian
Thomas Schwinge Dec. 17, 2019, 11:28 a.m. UTC | #13
Hi Julian!

As a first step, can you please split out just the code required to make
the OpenACC 'acc_attach*', 'acc_detach*' runtime library routines work?

Assuming there were no other defects in libgomp, whould this already make
the 'libgomp.oacc-c-c++-common/deep-copy-3.c',
'libgomp.oacc-c-c++-common/deep-copy-5.c' test cases work?


Grüße
 Thomas
diff mbox series

Patch

diff --git a/gcc/c-family/c-pragma.h b/gcc/c-family/c-pragma.h
index b781f73..dd8d807 100644
--- a/gcc/c-family/c-pragma.h
+++ b/gcc/c-family/c-pragma.h
@@ -136,11 +136,13 @@  enum pragma_omp_clause {
 
   /* Clauses for OpenACC.  */
   PRAGMA_OACC_CLAUSE_ASYNC,
+  PRAGMA_OACC_CLAUSE_ATTACH,
   PRAGMA_OACC_CLAUSE_AUTO,
   PRAGMA_OACC_CLAUSE_COPY,
   PRAGMA_OACC_CLAUSE_COPYOUT,
   PRAGMA_OACC_CLAUSE_CREATE,
   PRAGMA_OACC_CLAUSE_DELETE,
+  PRAGMA_OACC_CLAUSE_DETACH,
   PRAGMA_OACC_CLAUSE_DEVICEPTR,
   PRAGMA_OACC_CLAUSE_DEVICE_RESIDENT,
   PRAGMA_OACC_CLAUSE_FINALIZE,
diff --git a/gcc/c/c-parser.c b/gcc/c/c-parser.c
index 9589502..e12a7aa 100644
--- a/gcc/c/c-parser.c
+++ b/gcc/c/c-parser.c
@@ -11412,6 +11412,8 @@  c_parser_omp_clause_name (c_parser *parser)
 	    result = PRAGMA_OMP_CLAUSE_ALIGNED;
 	  else if (!strcmp ("async", p))
 	    result = PRAGMA_OACC_CLAUSE_ASYNC;
+	  else if (!strcmp ("attach", p))
+	    result = PRAGMA_OACC_CLAUSE_ATTACH;
 	  break;
 	case 'c':
 	  if (!strcmp ("collapse", p))
@@ -11434,6 +11436,8 @@  c_parser_omp_clause_name (c_parser *parser)
 	    result = PRAGMA_OACC_CLAUSE_DELETE;
 	  else if (!strcmp ("depend", p))
 	    result = PRAGMA_OMP_CLAUSE_DEPEND;
+	  else if (!strcmp ("detach", p))
+	    result = PRAGMA_OACC_CLAUSE_DETACH;
 	  else if (!strcmp ("device", p))
 	    result = PRAGMA_OMP_CLAUSE_DEVICE;
 	  else if (!strcmp ("deviceptr", p))
@@ -11804,9 +11808,12 @@  c_parser_omp_variable_list (c_parser *parser,
 	    case OMP_CLAUSE_MAP:
 	    case OMP_CLAUSE_FROM:
 	    case OMP_CLAUSE_TO:
-	      while (c_parser_next_token_is (parser, CPP_DOT))
+	      while (c_parser_next_token_is (parser, CPP_DOT)
+		     || c_parser_next_token_is (parser, CPP_DEREF))
 		{
 		  location_t op_loc = c_parser_peek_token (parser)->location;
+		  if (c_parser_next_token_is (parser, CPP_DEREF))
+		    t = build_simple_mem_ref (t);
 		  c_parser_consume_token (parser);
 		  if (!c_parser_next_token_is (parser, CPP_NAME))
 		    {
@@ -11945,12 +11952,14 @@  c_parser_omp_var_list_parens (c_parser *parser, enum omp_clause_code kind,
   return list;
 }
 
-/* OpenACC 2.0:
+/* OpenACC 2.5:
+   attach (variable-list )
    copy ( variable-list )
    copyin ( variable-list )
    copyout ( variable-list )
    create ( variable-list )
    delete ( variable-list )
+   detach ( variable-list )
    present ( variable-list ) */
 
 static tree
@@ -11960,6 +11969,9 @@  c_parser_oacc_data_clause (c_parser *parser, pragma_omp_clause c_kind,
   enum gomp_map_kind kind;
   switch (c_kind)
     {
+    case PRAGMA_OACC_CLAUSE_ATTACH:
+      kind = GOMP_MAP_ATTACH;
+      break;
     case PRAGMA_OACC_CLAUSE_COPY:
       kind = GOMP_MAP_TOFROM;
       break;
@@ -11975,6 +11987,9 @@  c_parser_oacc_data_clause (c_parser *parser, pragma_omp_clause c_kind,
     case PRAGMA_OACC_CLAUSE_DELETE:
       kind = GOMP_MAP_RELEASE;
       break;
+    case PRAGMA_OACC_CLAUSE_DETACH:
+      kind = GOMP_MAP_DETACH;
+      break;
     case PRAGMA_OACC_CLAUSE_DEVICE:
       kind = GOMP_MAP_FORCE_TO;
       break;
@@ -14559,6 +14574,10 @@  c_parser_oacc_all_clauses (c_parser *parser, omp_clause_mask mask,
 						clauses);
 	  c_name = "auto";
 	  break;
+	case PRAGMA_OACC_CLAUSE_ATTACH:
+	  clauses = c_parser_oacc_data_clause (parser, c_kind, clauses);
+	  c_name = "attach";
+	  break;
 	case PRAGMA_OACC_CLAUSE_COLLAPSE:
 	  clauses = c_parser_omp_clause_collapse (parser, clauses);
 	  c_name = "collapse";
@@ -14587,6 +14606,10 @@  c_parser_oacc_all_clauses (c_parser *parser, omp_clause_mask mask,
 	  clauses = c_parser_omp_clause_default (parser, clauses, true);
 	  c_name = "default";
 	  break;
+	case PRAGMA_OACC_CLAUSE_DETACH:
+	  clauses = c_parser_oacc_data_clause (parser, c_kind, clauses);
+	  c_name = "detach";
+	  break;
 	case PRAGMA_OACC_CLAUSE_DEVICE:
 	  clauses = c_parser_oacc_data_clause (parser, c_kind, clauses);
 	  c_name = "device";
@@ -15065,7 +15088,8 @@  c_parser_oacc_cache (location_t loc, c_parser *parser)
 */
 
 #define OACC_DATA_CLAUSE_MASK						\
-	( (OMP_CLAUSE_MASK_1 << PRAGMA_OACC_CLAUSE_COPY)		\
+	( (OMP_CLAUSE_MASK_1 << PRAGMA_OACC_CLAUSE_ATTACH)		\
+	| (OMP_CLAUSE_MASK_1 << PRAGMA_OACC_CLAUSE_COPY)		\
 	| (OMP_CLAUSE_MASK_1 << PRAGMA_OACC_CLAUSE_COPYIN)		\
 	| (OMP_CLAUSE_MASK_1 << PRAGMA_OACC_CLAUSE_COPYOUT)		\
 	| (OMP_CLAUSE_MASK_1 << PRAGMA_OACC_CLAUSE_CREATE)		\
@@ -15248,6 +15272,7 @@  c_parser_oacc_declare (c_parser *parser)
 #define OACC_ENTER_DATA_CLAUSE_MASK					\
 	( (OMP_CLAUSE_MASK_1 << PRAGMA_OACC_CLAUSE_IF)			\
 	| (OMP_CLAUSE_MASK_1 << PRAGMA_OACC_CLAUSE_ASYNC)		\
+	| (OMP_CLAUSE_MASK_1 << PRAGMA_OACC_CLAUSE_ATTACH)		\
 	| (OMP_CLAUSE_MASK_1 << PRAGMA_OACC_CLAUSE_COPYIN)		\
 	| (OMP_CLAUSE_MASK_1 << PRAGMA_OACC_CLAUSE_CREATE)		\
 	| (OMP_CLAUSE_MASK_1 << PRAGMA_OACC_CLAUSE_WAIT) )
@@ -15257,6 +15282,7 @@  c_parser_oacc_declare (c_parser *parser)
 	| (OMP_CLAUSE_MASK_1 << PRAGMA_OACC_CLAUSE_ASYNC)		\
 	| (OMP_CLAUSE_MASK_1 << PRAGMA_OACC_CLAUSE_COPYOUT)		\
 	| (OMP_CLAUSE_MASK_1 << PRAGMA_OACC_CLAUSE_DELETE) 		\
+	| (OMP_CLAUSE_MASK_1 << PRAGMA_OACC_CLAUSE_DETACH) 		\
 	| (OMP_CLAUSE_MASK_1 << PRAGMA_OACC_CLAUSE_FINALIZE) 		\
 	| (OMP_CLAUSE_MASK_1 << PRAGMA_OACC_CLAUSE_WAIT) )
 
@@ -15391,6 +15417,7 @@  c_parser_oacc_loop (location_t loc, c_parser *parser, char *p_name,
 
 #define OACC_KERNELS_CLAUSE_MASK					\
 	( (OMP_CLAUSE_MASK_1 << PRAGMA_OACC_CLAUSE_ASYNC)		\
+	| (OMP_CLAUSE_MASK_1 << PRAGMA_OACC_CLAUSE_ATTACH)		\
 	| (OMP_CLAUSE_MASK_1 << PRAGMA_OACC_CLAUSE_COPY)		\
 	| (OMP_CLAUSE_MASK_1 << PRAGMA_OACC_CLAUSE_COPYIN)		\
 	| (OMP_CLAUSE_MASK_1 << PRAGMA_OACC_CLAUSE_COPYOUT)		\
@@ -15406,6 +15433,7 @@  c_parser_oacc_loop (location_t loc, c_parser *parser, char *p_name,
 
 #define OACC_PARALLEL_CLAUSE_MASK					\
 	( (OMP_CLAUSE_MASK_1 << PRAGMA_OACC_CLAUSE_ASYNC)		\
+	| (OMP_CLAUSE_MASK_1 << PRAGMA_OACC_CLAUSE_ATTACH)		\
 	| (OMP_CLAUSE_MASK_1 << PRAGMA_OACC_CLAUSE_COPY)		\
 	| (OMP_CLAUSE_MASK_1 << PRAGMA_OACC_CLAUSE_COPYIN)		\
 	| (OMP_CLAUSE_MASK_1 << PRAGMA_OACC_CLAUSE_COPYOUT)		\
diff --git a/gcc/c/c-typeck.c b/gcc/c/c-typeck.c
index 144977e..b2de3b4 100644
--- a/gcc/c/c-typeck.c
+++ b/gcc/c/c-typeck.c
@@ -12610,7 +12610,6 @@  handle_omp_array_sections_1 (tree c, tree t, vec<tree> &types,
 	  return error_mark_node;
 	}
       if (TREE_CODE (t) == COMPONENT_REF
-	  && ort == C_ORT_OMP
 	  && (OMP_CLAUSE_CODE (c) == OMP_CLAUSE_MAP
 	      || OMP_CLAUSE_CODE (c) == OMP_CLAUSE_TO
 	      || OMP_CLAUSE_CODE (c) == OMP_CLAUSE_FROM))
@@ -12632,6 +12631,8 @@  handle_omp_array_sections_1 (tree c, tree t, vec<tree> &types,
 		}
 	      t = TREE_OPERAND (t, 0);
 	    }
+	  if (TREE_CODE (t) == MEM_REF)
+	    t = TREE_OPERAND (t, 0);
 	}
       if (!VAR_P (t) && TREE_CODE (t) != PARM_DECL)
 	{
@@ -12716,7 +12717,19 @@  handle_omp_array_sections_1 (tree c, tree t, vec<tree> &types,
     length = fold_convert (sizetype, length);
   if (low_bound == NULL_TREE)
     low_bound = integer_zero_node;
-
+  if (OMP_CLAUSE_CODE (c) == OMP_CLAUSE_MAP
+      && (OMP_CLAUSE_MAP_KIND (c) == GOMP_MAP_ATTACH
+	  || OMP_CLAUSE_MAP_KIND (c) == GOMP_MAP_DETACH))
+    {
+      if (length != integer_one_node)
+	{
+	  error_at (OMP_CLAUSE_LOCATION (c),
+		    OMP_CLAUSE_MAP_KIND (c) == GOMP_MAP_ATTACH
+		    ? "array section in %<attach%> clause"
+		    : "array section in %<detach%> clause");
+	  return error_mark_node;
+	}
+    }
   if (length != NULL_TREE)
     {
       if (!integer_nonzerop (length))
@@ -13393,6 +13406,37 @@  c_omp_finish_iterators (tree iter)
   return ret;
 }
 
+/* Ensure that pointers are used in OpenACC attach and detach clauses.
+   Return true if an error has been detected.  */
+
+static bool
+c_oacc_check_attachments (tree c)
+{
+  if (OMP_CLAUSE_CODE (c) != OMP_CLAUSE_MAP)
+    return false;
+
+  /* OpenACC attach / detach clauses must be pointers.  */
+  if (OMP_CLAUSE_MAP_KIND (c) == GOMP_MAP_ATTACH
+      || OMP_CLAUSE_MAP_KIND (c) == GOMP_MAP_DETACH)
+    {
+      tree t = OMP_CLAUSE_DECL (c);
+
+      while (TREE_CODE (t) == TREE_LIST)
+	t = TREE_CHAIN (t);
+
+      if (TREE_CODE (TREE_TYPE (t)) != POINTER_TYPE)
+	{
+	  error_at (OMP_CLAUSE_LOCATION (c),
+		    OMP_CLAUSE_MAP_KIND (c) == GOMP_MAP_ATTACH
+		    ? "expected pointer in %<attach%> clause"
+		    : "expected pointer in %<detach%> clause");
+	  return true;
+	}
+    }
+
+  return false;
+}
+
 /* For all elements of CLAUSES, validate them against their constraints.
    Remove any elements from the list that are invalid.  */
 
@@ -14117,6 +14161,8 @@  c_finish_omp_clauses (tree clauses, enum c_omp_region_type ort)
 			}
 		    }
 		}
+	      if (c_oacc_check_attachments (c))
+		remove = true;
 	      break;
 	    }
 	  if (t == error_mark_node)
@@ -14124,8 +14170,13 @@  c_finish_omp_clauses (tree clauses, enum c_omp_region_type ort)
 	      remove = true;
 	      break;
 	    }
+	  /* OpenACC attach / detach clauses must be pointers.  */
+	  if (c_oacc_check_attachments (c))
+	    {
+	      remove = true;
+	      break;
+	    }
 	  if (TREE_CODE (t) == COMPONENT_REF
-	      && (ort & C_ORT_OMP)
 	      && OMP_CLAUSE_CODE (c) != OMP_CLAUSE__CACHE_)
 	    {
 	      if (DECL_BIT_FIELD (TREE_OPERAND (t, 1)))
@@ -14163,6 +14214,8 @@  c_finish_omp_clauses (tree clauses, enum c_omp_region_type ort)
 		}
 	      if (remove)
 		break;
+	      if (TREE_CODE (t) == MEM_REF)
+		t = TREE_OPERAND (t, 0);
 	      if (VAR_P (t) || TREE_CODE (t) == PARM_DECL)
 		{
 		  if (bitmap_bit_p (&map_field_head, DECL_UID (t)))
diff --git a/gcc/cp/parser.c b/gcc/cp/parser.c
index 006d1ed..3508f09 100644
--- a/gcc/cp/parser.c
+++ b/gcc/cp/parser.c
@@ -31533,6 +31533,8 @@  cp_parser_omp_clause_name (cp_parser *parser)
 	    result = PRAGMA_OMP_CLAUSE_ALIGNED;
 	  else if (!strcmp ("async", p))
 	    result = PRAGMA_OACC_CLAUSE_ASYNC;
+	  else if (!strcmp ("attach", p))
+	    result = PRAGMA_OACC_CLAUSE_ATTACH;
 	  break;
 	case 'c':
 	  if (!strcmp ("collapse", p))
@@ -31553,6 +31555,8 @@  cp_parser_omp_clause_name (cp_parser *parser)
 	    result = PRAGMA_OMP_CLAUSE_DEFAULTMAP;
 	  else if (!strcmp ("depend", p))
 	    result = PRAGMA_OMP_CLAUSE_DEPEND;
+	  else if (!strcmp ("detach", p))
+	    result = PRAGMA_OACC_CLAUSE_DETACH;
 	  else if (!strcmp ("device", p))
 	    result = PRAGMA_OMP_CLAUSE_DEVICE;
 	  else if (!strcmp ("deviceptr", p))
@@ -31832,15 +31836,19 @@  cp_parser_omp_var_list_no_open (cp_parser *parser, enum omp_clause_code kind,
 	    case OMP_CLAUSE_MAP:
 	    case OMP_CLAUSE_FROM:
 	    case OMP_CLAUSE_TO:
-	      while (cp_lexer_next_token_is (parser->lexer, CPP_DOT))
+	      while (cp_lexer_next_token_is (parser->lexer, CPP_DOT)
+		     || cp_lexer_next_token_is (parser->lexer, CPP_DEREF))
 		{
+		  cpp_ttype ttype
+		    = cp_lexer_next_token_is (parser->lexer, CPP_DOT)
+		      ? CPP_DOT : CPP_DEREF;
 		  location_t loc
 		    = cp_lexer_peek_token (parser->lexer)->location;
 		  cp_id_kind idk = CP_ID_KIND_NONE;
 		  cp_lexer_consume_token (parser->lexer);
 		  decl = convert_from_reference (decl);
 		  decl
-		    = cp_parser_postfix_dot_deref_expression (parser, CPP_DOT,
+		    = cp_parser_postfix_dot_deref_expression (parser, ttype,
 							      decl, false,
 							      &idk, loc);
 		}
@@ -31965,12 +31973,14 @@  cp_parser_omp_var_list (cp_parser *parser, enum omp_clause_code kind, tree list)
   return list;
 }
 
-/* OpenACC 2.0:
+/* OpenACC 2.5:
+   attach ( variable-list )
    copy ( variable-list )
    copyin ( variable-list )
    copyout ( variable-list )
    create ( variable-list )
    delete ( variable-list )
+   detach ( variable-list )
    present ( variable-list ) */
 
 static tree
@@ -31980,6 +31990,9 @@  cp_parser_oacc_data_clause (cp_parser *parser, pragma_omp_clause c_kind,
   enum gomp_map_kind kind;
   switch (c_kind)
     {
+    case PRAGMA_OACC_CLAUSE_ATTACH:
+      kind = GOMP_MAP_ATTACH;
+      break;
     case PRAGMA_OACC_CLAUSE_COPY:
       kind = GOMP_MAP_TOFROM;
       break;
@@ -31995,6 +32008,9 @@  cp_parser_oacc_data_clause (cp_parser *parser, pragma_omp_clause c_kind,
     case PRAGMA_OACC_CLAUSE_DELETE:
       kind = GOMP_MAP_RELEASE;
       break;
+    case PRAGMA_OACC_CLAUSE_DETACH:
+      kind = GOMP_MAP_DETACH;
+      break;
     case PRAGMA_OACC_CLAUSE_DEVICE:
       kind = GOMP_MAP_FORCE_TO;
       break;
@@ -34344,6 +34360,10 @@  cp_parser_oacc_all_clauses (cp_parser *parser, omp_clause_mask mask,
 						 clauses, here);
 	  c_name = "auto";
 	  break;
+	case PRAGMA_OACC_CLAUSE_ATTACH:
+	  clauses = cp_parser_oacc_data_clause (parser, c_kind, clauses);
+	  c_name = "attach";
+	  break;
 	case PRAGMA_OACC_CLAUSE_COLLAPSE:
 	  clauses = cp_parser_omp_clause_collapse (parser, clauses, here);
 	  c_name = "collapse";
@@ -34372,6 +34392,10 @@  cp_parser_oacc_all_clauses (cp_parser *parser, omp_clause_mask mask,
 	  clauses = cp_parser_omp_clause_default (parser, clauses, here, true);
 	  c_name = "default";
 	  break;
+	case PRAGMA_OACC_CLAUSE_DETACH:
+	  clauses = cp_parser_oacc_data_clause (parser, c_kind, clauses);
+	  c_name = "detach";
+	  break;
 	case PRAGMA_OACC_CLAUSE_DEVICE:
 	  clauses = cp_parser_oacc_data_clause (parser, c_kind, clauses);
 	  c_name = "device";
@@ -38011,10 +38035,12 @@  cp_parser_oacc_cache (cp_parser *parser, cp_token *pragma_tok)
      structured-block  */
 
 #define OACC_DATA_CLAUSE_MASK						\
-	( (OMP_CLAUSE_MASK_1 << PRAGMA_OACC_CLAUSE_COPY)		\
+	( (OMP_CLAUSE_MASK_1 << PRAGMA_OACC_CLAUSE_ATTACH)		\
+	| (OMP_CLAUSE_MASK_1 << PRAGMA_OACC_CLAUSE_COPY)		\
 	| (OMP_CLAUSE_MASK_1 << PRAGMA_OACC_CLAUSE_COPYIN)		\
 	| (OMP_CLAUSE_MASK_1 << PRAGMA_OACC_CLAUSE_COPYOUT)		\
 	| (OMP_CLAUSE_MASK_1 << PRAGMA_OACC_CLAUSE_CREATE)		\
+	| (OMP_CLAUSE_MASK_1 << PRAGMA_OACC_CLAUSE_DETACH)		\
 	| (OMP_CLAUSE_MASK_1 << PRAGMA_OACC_CLAUSE_DEVICEPTR)		\
 	| (OMP_CLAUSE_MASK_1 << PRAGMA_OACC_CLAUSE_IF)			\
 	| (OMP_CLAUSE_MASK_1 << PRAGMA_OACC_CLAUSE_PRESENT) )
@@ -38214,6 +38240,7 @@  cp_parser_oacc_declare (cp_parser *parser, cp_token *pragma_tok)
 
 #define OACC_ENTER_DATA_CLAUSE_MASK					\
 	( (OMP_CLAUSE_MASK_1 << PRAGMA_OACC_CLAUSE_IF)			\
+	| (OMP_CLAUSE_MASK_1 << PRAGMA_OACC_CLAUSE_ATTACH)		\
 	| (OMP_CLAUSE_MASK_1 << PRAGMA_OACC_CLAUSE_ASYNC)		\
 	| (OMP_CLAUSE_MASK_1 << PRAGMA_OACC_CLAUSE_COPYIN)		\
 	| (OMP_CLAUSE_MASK_1 << PRAGMA_OACC_CLAUSE_CREATE)		\
@@ -38224,6 +38251,7 @@  cp_parser_oacc_declare (cp_parser *parser, cp_token *pragma_tok)
 	| (OMP_CLAUSE_MASK_1 << PRAGMA_OACC_CLAUSE_ASYNC)		\
 	| (OMP_CLAUSE_MASK_1 << PRAGMA_OACC_CLAUSE_COPYOUT)		\
 	| (OMP_CLAUSE_MASK_1 << PRAGMA_OACC_CLAUSE_DELETE) 		\
+	| (OMP_CLAUSE_MASK_1 << PRAGMA_OACC_CLAUSE_DETACH)		\
 	| (OMP_CLAUSE_MASK_1 << PRAGMA_OACC_CLAUSE_FINALIZE) 		\
 	| (OMP_CLAUSE_MASK_1 << PRAGMA_OACC_CLAUSE_WAIT) )
 
@@ -38327,6 +38355,7 @@  cp_parser_oacc_loop (cp_parser *parser, cp_token *pragma_tok, char *p_name,
 
 #define OACC_KERNELS_CLAUSE_MASK					\
 	( (OMP_CLAUSE_MASK_1 << PRAGMA_OACC_CLAUSE_ASYNC)		\
+	| (OMP_CLAUSE_MASK_1 << PRAGMA_OACC_CLAUSE_ATTACH)		\
 	| (OMP_CLAUSE_MASK_1 << PRAGMA_OACC_CLAUSE_COPY)		\
 	| (OMP_CLAUSE_MASK_1 << PRAGMA_OACC_CLAUSE_COPYIN)		\
 	| (OMP_CLAUSE_MASK_1 << PRAGMA_OACC_CLAUSE_COPYOUT)		\
@@ -38342,6 +38371,7 @@  cp_parser_oacc_loop (cp_parser *parser, cp_token *pragma_tok, char *p_name,
 
 #define OACC_PARALLEL_CLAUSE_MASK					\
 	( (OMP_CLAUSE_MASK_1 << PRAGMA_OACC_CLAUSE_ASYNC)		\
+	| (OMP_CLAUSE_MASK_1 << PRAGMA_OACC_CLAUSE_ATTACH)		\
 	| (OMP_CLAUSE_MASK_1 << PRAGMA_OACC_CLAUSE_COPY)		\
 	| (OMP_CLAUSE_MASK_1 << PRAGMA_OACC_CLAUSE_COPYIN)		\
 	| (OMP_CLAUSE_MASK_1 << PRAGMA_OACC_CLAUSE_COPYOUT)		\
diff --git a/gcc/cp/semantics.c b/gcc/cp/semantics.c
index 182d360..303bcc4 100644
--- a/gcc/cp/semantics.c
+++ b/gcc/cp/semantics.c
@@ -4568,7 +4568,6 @@  handle_omp_array_sections_1 (tree c, tree t, vec<tree> &types,
 	t = TREE_OPERAND (t, 0);
       ret = t;
       if (TREE_CODE (t) == COMPONENT_REF
-	  && ort == C_ORT_OMP
 	  && (OMP_CLAUSE_CODE (c) == OMP_CLAUSE_MAP
 	      || OMP_CLAUSE_CODE (c) == OMP_CLAUSE_TO
 	      || OMP_CLAUSE_CODE (c) == OMP_CLAUSE_FROM)
@@ -4691,6 +4690,19 @@  handle_omp_array_sections_1 (tree c, tree t, vec<tree> &types,
   if (low_bound == NULL_TREE)
     low_bound = integer_zero_node;
 
+  if (OMP_CLAUSE_CODE (c) == OMP_CLAUSE_MAP
+      && (OMP_CLAUSE_MAP_KIND (c) == GOMP_MAP_ATTACH
+	  || OMP_CLAUSE_MAP_KIND (c) == GOMP_MAP_DETACH))
+    {
+      if (length != integer_one_node)
+	{
+	  error_at (OMP_CLAUSE_LOCATION (c),
+		    OMP_CLAUSE_MAP_KIND (c) == GOMP_MAP_ATTACH
+		    ? "array section in %<attach%> clause"
+		    : "array section in %<detach%> clause");
+	  return error_mark_node;
+	}
+    }
   if (length != NULL_TREE)
     {
       if (!integer_nonzerop (length))
@@ -6048,6 +6060,43 @@  cp_omp_finish_iterators (tree iter)
   return ret;
 }
 
+/* Ensure that pointers are used in OpenACC attach and detach clauses.
+   Return true if an error has been detected.  */
+
+static bool
+cp_oacc_check_attachments (tree c)
+{
+  if (OMP_CLAUSE_CODE (c) != OMP_CLAUSE_MAP)
+    return false;
+
+  /* OpenACC attach / detach clauses must be pointers.  */
+  if (OMP_CLAUSE_MAP_KIND (c) == GOMP_MAP_ATTACH
+      || OMP_CLAUSE_MAP_KIND (c) == GOMP_MAP_DETACH)
+    {
+      tree t = OMP_CLAUSE_DECL (c);
+      tree type;
+
+      while (TREE_CODE (t) == TREE_LIST)
+	t = TREE_CHAIN (t);
+
+      type = TREE_TYPE (t);
+
+      if (TREE_CODE (type) == REFERENCE_TYPE)
+	type = TREE_TYPE (type);
+
+      if (TREE_CODE (type) != POINTER_TYPE)
+	{
+	  error_at (OMP_CLAUSE_LOCATION (c),
+		    OMP_CLAUSE_MAP_KIND (c) == GOMP_MAP_ATTACH
+		    ? "expected pointer in %<attach%> clause"
+		    : "expected pointer in %<detach%> clause");
+	  return true;
+	}
+    }
+
+  return false;
+}
+
 /* For all elements of CLAUSES, validate them vs OpenMP constraints.
    Remove any elements from the list that are invalid.  */
 
@@ -6288,7 +6337,7 @@  finish_omp_clauses (tree clauses, enum c_omp_region_type ort)
 	    t = OMP_CLAUSE_DECL (c);
 	check_dup_generic_t:
 	  if (t == current_class_ptr
-	      && (ort != C_ORT_OMP_DECLARE_SIMD
+	      && ((ort != C_ORT_OMP_DECLARE_SIMD && ort != C_ORT_ACC)
 		  || (OMP_CLAUSE_CODE (c) != OMP_CLAUSE_LINEAR
 		      && OMP_CLAUSE_CODE (c) != OMP_CLAUSE_UNIFORM)))
 	    {
@@ -6352,8 +6401,7 @@  finish_omp_clauses (tree clauses, enum c_omp_region_type ort)
 	handle_field_decl:
 	  if (!remove
 	      && TREE_CODE (t) == FIELD_DECL
-	      && t == OMP_CLAUSE_DECL (c)
-	      && ort != C_ORT_ACC)
+	      && t == OMP_CLAUSE_DECL (c))
 	    {
 	      OMP_CLAUSE_DECL (c)
 		= omp_privatize_field (t, (OMP_CLAUSE_CODE (c)
@@ -6420,7 +6468,7 @@  finish_omp_clauses (tree clauses, enum c_omp_region_type ort)
 	    omp_note_field_privatization (t, OMP_CLAUSE_DECL (c));
 	  else
 	    t = OMP_CLAUSE_DECL (c);
-	  if (t == current_class_ptr)
+	  if (ort != C_ORT_ACC && t == current_class_ptr)
 	    {
 	      error_at (OMP_CLAUSE_LOCATION (c),
 			"%<this%> allowed in OpenMP only in %<declare simd%>"
@@ -6907,7 +6955,7 @@  finish_omp_clauses (tree clauses, enum c_omp_region_type ort)
 	    }
 	  if (t == error_mark_node)
 	    remove = true;
-	  else if (t == current_class_ptr)
+	  else if (ort != C_ORT_ACC && t == current_class_ptr)
 	    {
 	      error_at (OMP_CLAUSE_LOCATION (c),
 			"%<this%> allowed in OpenMP only in %<declare simd%>"
@@ -7037,6 +7085,8 @@  finish_omp_clauses (tree clauses, enum c_omp_region_type ort)
 			}
 		    }
 		}
+	      if (cp_oacc_check_attachments (c))
+		remove = true;
 	      break;
 	    }
 	  if (t == error_mark_node)
@@ -7044,14 +7094,25 @@  finish_omp_clauses (tree clauses, enum c_omp_region_type ort)
 	      remove = true;
 	      break;
 	    }
+	  /* OpenACC attach / detach clauses must be pointers.  */
+	  if (cp_oacc_check_attachments (c))
+	    {
+	      remove = true;
+	      break;
+	    }
 	  if (REFERENCE_REF_P (t)
 	      && TREE_CODE (TREE_OPERAND (t, 0)) == COMPONENT_REF)
 	    {
 	      t = TREE_OPERAND (t, 0);
 	      OMP_CLAUSE_DECL (c) = t;
 	    }
+	  if (ort == C_ORT_ACC
+	      && TREE_CODE (t) == COMPONENT_REF
+	      && TREE_CODE (TREE_OPERAND (t, 0)) == INDIRECT_REF)
+	    t = TREE_OPERAND (TREE_OPERAND (t, 0), 0);
 	  if (TREE_CODE (t) == COMPONENT_REF
-	      && (ort & C_ORT_OMP_DECLARE_SIMD) == C_ORT_OMP
+	      && ((ort & C_ORT_OMP_DECLARE_SIMD) == C_ORT_OMP
+		  || ort == C_ORT_ACC)
 	      && OMP_CLAUSE_CODE (c) != OMP_CLAUSE__CACHE_)
 	    {
 	      if (type_dependent_expression_p (t))
diff --git a/gcc/fortran/gfortran.h b/gcc/fortran/gfortran.h
index d8ef35d..9f96418 100644
--- a/gcc/fortran/gfortran.h
+++ b/gcc/fortran/gfortran.h
@@ -1175,10 +1175,12 @@  enum gfc_omp_depend_op
 enum gfc_omp_map_op
 {
   OMP_MAP_ALLOC,
+  OMP_MAP_ATTACH,
   OMP_MAP_TO,
   OMP_MAP_FROM,
   OMP_MAP_TOFROM,
   OMP_MAP_DELETE,
+  OMP_MAP_DETACH,
   OMP_MAP_FORCE_ALLOC,
   OMP_MAP_FORCE_TO,
   OMP_MAP_FORCE_FROM,
diff --git a/gcc/fortran/openmp.c b/gcc/fortran/openmp.c
index 6430e61..ebba7ca 100644
--- a/gcc/fortran/openmp.c
+++ b/gcc/fortran/openmp.c
@@ -222,7 +222,8 @@  static match
 gfc_match_omp_variable_list (const char *str, gfc_omp_namelist **list,
 			     bool allow_common, bool *end_colon = NULL,
 			     gfc_omp_namelist ***headp = NULL,
-			     bool allow_sections = false)
+			     bool allow_sections = false,
+			     bool allow_derived = false)
 {
   gfc_omp_namelist *head, *tail, *p;
   locus old_loc, cur_loc;
@@ -248,7 +249,8 @@  gfc_match_omp_variable_list (const char *str, gfc_omp_namelist **list,
 	case MATCH_YES:
 	  gfc_expr *expr;
 	  expr = NULL;
-	  if (allow_sections && gfc_peek_ascii_char () == '(')
+	  if ((allow_sections && gfc_peek_ascii_char () == '(')
+	      || (allow_derived && gfc_peek_ascii_char () == '%'))
 	    {
 	      gfc_current_locus = cur_loc;
 	      m = gfc_match_variable (&expr, 0);
@@ -785,7 +787,7 @@  enum omp_mask1
   OMP_MASK1_LAST
 };
 
-/* OpenACC 2.0 specific clauses. */
+/* OpenACC 2.0+ specific clauses. */
 enum omp_mask2
 {
   OMP_CLAUSE_ASYNC,
@@ -811,6 +813,8 @@  enum omp_mask2
   OMP_CLAUSE_TILE,
   OMP_CLAUSE_IF_PRESENT,
   OMP_CLAUSE_FINALIZE,
+  OMP_CLAUSE_ATTACH,
+  OMP_CLAUSE_DETACH,
   /* This must come last.  */
   OMP_MASK2_LAST
 };
@@ -914,10 +918,12 @@  omp_inv_mask::omp_inv_mask (const omp_mask &m) : omp_mask (m)
    mapping.  */
 
 static bool
-gfc_match_omp_map_clause (gfc_omp_namelist **list, gfc_omp_map_op map_op)
+gfc_match_omp_map_clause (gfc_omp_namelist **list, gfc_omp_map_op map_op,
+			  bool allow_derived = false)
 {
   gfc_omp_namelist **head = NULL;
-  if (gfc_match_omp_variable_list ("", list, false, NULL, &head, true)
+  if (gfc_match_omp_variable_list ("", list, false, NULL, &head, true,
+				   allow_derived)
       == MATCH_YES)
     {
       gfc_omp_namelist *n;
@@ -939,6 +945,14 @@  gfc_match_omp_clauses (gfc_omp_clauses **cp, const omp_mask mask,
 {
   gfc_omp_clauses *c = gfc_get_omp_clauses ();
   locus old_loc;
+  /* Determine whether we're dealing with an OpenACC directive that permits
+     derived type member accesses.  This in particular disallows
+     "!$acc declare" from using such accesses, because it's not clear if/how
+     that should work.  */
+  bool allow_derived = (openacc
+			&& ((mask & OMP_CLAUSE_ATTACH)
+			    || (mask & OMP_CLAUSE_DETACH)
+			    || (mask & OMP_CLAUSE_HOST_SELF)));
 
   gcc_checking_assert (OMP_MASK1_LAST <= 64 && OMP_MASK2_LAST <= 64);
   *cp = NULL;
@@ -1012,6 +1026,11 @@  gfc_match_omp_clauses (gfc_omp_clauses **cp, const omp_mask mask,
 	      needs_space = true;
 	      continue;
 	    }
+	  if ((mask & OMP_CLAUSE_ATTACH)
+	      && gfc_match ("attach ( ") == MATCH_YES
+	      && gfc_match_omp_map_clause (&c->lists[OMP_LIST_MAP],
+					   OMP_MAP_ATTACH, allow_derived))
+	    continue;
 	  break;
 	case 'c':
 	  if ((mask & OMP_CLAUSE_COLLAPSE)
@@ -1039,7 +1058,7 @@  gfc_match_omp_clauses (gfc_omp_clauses **cp, const omp_mask mask,
 	  if ((mask & OMP_CLAUSE_COPY)
 	      && gfc_match ("copy ( ") == MATCH_YES
 	      && gfc_match_omp_map_clause (&c->lists[OMP_LIST_MAP],
-					   OMP_MAP_TOFROM))
+					   OMP_MAP_TOFROM, allow_derived))
 	    continue;
 	  if (mask & OMP_CLAUSE_COPYIN)
 	    {
@@ -1047,7 +1066,7 @@  gfc_match_omp_clauses (gfc_omp_clauses **cp, const omp_mask mask,
 		{
 		  if (gfc_match ("copyin ( ") == MATCH_YES
 		      && gfc_match_omp_map_clause (&c->lists[OMP_LIST_MAP],
-						   OMP_MAP_TO))
+						   OMP_MAP_TO, allow_derived))
 		    continue;
 		}
 	      else if (gfc_match_omp_variable_list ("copyin (",
@@ -1058,7 +1077,7 @@  gfc_match_omp_clauses (gfc_omp_clauses **cp, const omp_mask mask,
 	  if ((mask & OMP_CLAUSE_COPYOUT)
 	      && gfc_match ("copyout ( ") == MATCH_YES
 	      && gfc_match_omp_map_clause (&c->lists[OMP_LIST_MAP],
-					   OMP_MAP_FROM))
+					   OMP_MAP_FROM, allow_derived))
 	    continue;
 	  if ((mask & OMP_CLAUSE_COPYPRIVATE)
 	      && gfc_match_omp_variable_list ("copyprivate (",
@@ -1068,7 +1087,7 @@  gfc_match_omp_clauses (gfc_omp_clauses **cp, const omp_mask mask,
 	  if ((mask & OMP_CLAUSE_CREATE)
 	      && gfc_match ("create ( ") == MATCH_YES
 	      && gfc_match_omp_map_clause (&c->lists[OMP_LIST_MAP],
-					   OMP_MAP_ALLOC))
+					   OMP_MAP_ALLOC, allow_derived))
 	    continue;
 	  break;
 	case 'd':
@@ -1104,7 +1123,7 @@  gfc_match_omp_clauses (gfc_omp_clauses **cp, const omp_mask mask,
 	  if ((mask & OMP_CLAUSE_DELETE)
 	      && gfc_match ("delete ( ") == MATCH_YES
 	      && gfc_match_omp_map_clause (&c->lists[OMP_LIST_MAP],
-					   OMP_MAP_RELEASE))
+					   OMP_MAP_RELEASE, allow_derived))
 	    continue;
 	  if ((mask & OMP_CLAUSE_DEPEND)
 	      && gfc_match ("depend ( ") == MATCH_YES)
@@ -1147,6 +1166,11 @@  gfc_match_omp_clauses (gfc_omp_clauses **cp, const omp_mask mask,
 	      else
 		gfc_current_locus = old_loc;
 	    }
+	  if ((mask & OMP_CLAUSE_DETACH)
+	      && gfc_match ("detach ( ") == MATCH_YES
+	      && gfc_match_omp_map_clause (&c->lists[OMP_LIST_MAP],
+					   OMP_MAP_DETACH, allow_derived))
+	    continue;
 	  if ((mask & OMP_CLAUSE_DEVICE)
 	      && !openacc
 	      && c->device == NULL
@@ -1156,12 +1180,13 @@  gfc_match_omp_clauses (gfc_omp_clauses **cp, const omp_mask mask,
 	      && openacc
 	      && gfc_match ("device ( ") == MATCH_YES
 	      && gfc_match_omp_map_clause (&c->lists[OMP_LIST_MAP],
-					   OMP_MAP_FORCE_TO))
+					   OMP_MAP_FORCE_TO, allow_derived))
 	    continue;
 	  if ((mask & OMP_CLAUSE_DEVICEPTR)
 	      && gfc_match ("deviceptr ( ") == MATCH_YES
 	      && gfc_match_omp_map_clause (&c->lists[OMP_LIST_MAP],
-					   OMP_MAP_FORCE_DEVICEPTR))
+					   OMP_MAP_FORCE_DEVICEPTR,
+					   allow_derived))
 	    continue;
 	  if ((mask & OMP_CLAUSE_DEVICE_RESIDENT)
 	      && gfc_match_omp_variable_list
@@ -1239,7 +1264,7 @@  gfc_match_omp_clauses (gfc_omp_clauses **cp, const omp_mask mask,
 	  if ((mask & OMP_CLAUSE_HOST_SELF)
 	      && gfc_match ("host ( ") == MATCH_YES
 	      && gfc_match_omp_map_clause (&c->lists[OMP_LIST_MAP],
-					   OMP_MAP_FORCE_FROM))
+					   OMP_MAP_FORCE_FROM, allow_derived))
 	    continue;
 	  break;
 	case 'i':
@@ -1511,47 +1536,48 @@  gfc_match_omp_clauses (gfc_omp_clauses **cp, const omp_mask mask,
 	  if ((mask & OMP_CLAUSE_COPY)
 	      && gfc_match ("pcopy ( ") == MATCH_YES
 	      && gfc_match_omp_map_clause (&c->lists[OMP_LIST_MAP],
-					   OMP_MAP_TOFROM))
+					   OMP_MAP_TOFROM, allow_derived))
 	    continue;
 	  if ((mask & OMP_CLAUSE_COPYIN)
 	      && gfc_match ("pcopyin ( ") == MATCH_YES
 	      && gfc_match_omp_map_clause (&c->lists[OMP_LIST_MAP],
-					   OMP_MAP_TO))
+					   OMP_MAP_TO, allow_derived))
 	    continue;
 	  if ((mask & OMP_CLAUSE_COPYOUT)
 	      && gfc_match ("pcopyout ( ") == MATCH_YES
 	      && gfc_match_omp_map_clause (&c->lists[OMP_LIST_MAP],
-					   OMP_MAP_FROM))
+					   OMP_MAP_FROM, allow_derived))
 	    continue;
 	  if ((mask & OMP_CLAUSE_CREATE)
 	      && gfc_match ("pcreate ( ") == MATCH_YES
 	      && gfc_match_omp_map_clause (&c->lists[OMP_LIST_MAP],
-					   OMP_MAP_ALLOC))
+					   OMP_MAP_ALLOC, allow_derived))
 	    continue;
 	  if ((mask & OMP_CLAUSE_PRESENT)
 	      && gfc_match ("present ( ") == MATCH_YES
 	      && gfc_match_omp_map_clause (&c->lists[OMP_LIST_MAP],
-					   OMP_MAP_FORCE_PRESENT))
+					   OMP_MAP_FORCE_PRESENT,
+					   allow_derived))
 	    continue;
 	  if ((mask & OMP_CLAUSE_COPY)
 	      && gfc_match ("present_or_copy ( ") == MATCH_YES
 	      && gfc_match_omp_map_clause (&c->lists[OMP_LIST_MAP],
-					   OMP_MAP_TOFROM))
+					   OMP_MAP_TOFROM, allow_derived))
 	    continue;
 	  if ((mask & OMP_CLAUSE_COPYIN)
 	      && gfc_match ("present_or_copyin ( ") == MATCH_YES
 	      && gfc_match_omp_map_clause (&c->lists[OMP_LIST_MAP],
-					   OMP_MAP_TO))
+					   OMP_MAP_TO, allow_derived))
 	    continue;
 	  if ((mask & OMP_CLAUSE_COPYOUT)
 	      && gfc_match ("present_or_copyout ( ") == MATCH_YES
 	      && gfc_match_omp_map_clause (&c->lists[OMP_LIST_MAP],
-					   OMP_MAP_FROM))
+					   OMP_MAP_FROM, allow_derived))
 	    continue;
 	  if ((mask & OMP_CLAUSE_CREATE)
 	      && gfc_match ("present_or_create ( ") == MATCH_YES
 	      && gfc_match_omp_map_clause (&c->lists[OMP_LIST_MAP],
-					   OMP_MAP_ALLOC))
+					   OMP_MAP_ALLOC, allow_derived))
 	    continue;
 	  if ((mask & OMP_CLAUSE_PRIORITY)
 	      && c->priority == NULL
@@ -1669,8 +1695,8 @@  gfc_match_omp_clauses (gfc_omp_clauses **cp, const omp_mask mask,
 
 	      if (gfc_match_omp_variable_list (" :",
 					       &c->lists[OMP_LIST_REDUCTION],
-					       false, NULL, &head,
-					       openacc) == MATCH_YES)
+					       false, NULL, &head, openacc,
+					       allow_derived) == MATCH_YES)
 		{
 		  gfc_omp_namelist *n;
 		  if (rop == OMP_REDUCTION_NONE)
@@ -1769,7 +1795,7 @@  gfc_match_omp_clauses (gfc_omp_clauses **cp, const omp_mask mask,
 	  if ((mask & OMP_CLAUSE_HOST_SELF)
 	      && gfc_match ("self ( ") == MATCH_YES
 	      && gfc_match_omp_map_clause (&c->lists[OMP_LIST_MAP],
-					   OMP_MAP_FORCE_FROM))
+					   OMP_MAP_FORCE_FROM, allow_derived))
 	    continue;
 	  if ((mask & OMP_CLAUSE_SEQ)
 	      && !c->seq
@@ -1927,17 +1953,17 @@  gfc_match_omp_clauses (gfc_omp_clauses **cp, const omp_mask mask,
    | OMP_CLAUSE_COPY | OMP_CLAUSE_COPYIN | OMP_CLAUSE_COPYOUT		      \
    | OMP_CLAUSE_CREATE | OMP_CLAUSE_PRESENT | OMP_CLAUSE_DEVICEPTR	      \
    | OMP_CLAUSE_PRIVATE | OMP_CLAUSE_FIRSTPRIVATE | OMP_CLAUSE_DEFAULT	      \
-   | OMP_CLAUSE_WAIT)
+   | OMP_CLAUSE_WAIT | OMP_CLAUSE_ATTACH)
 #define OACC_KERNELS_CLAUSES \
   (omp_mask (OMP_CLAUSE_IF) | OMP_CLAUSE_ASYNC | OMP_CLAUSE_NUM_GANGS	      \
    | OMP_CLAUSE_NUM_WORKERS | OMP_CLAUSE_VECTOR_LENGTH | OMP_CLAUSE_DEVICEPTR \
    | OMP_CLAUSE_COPY | OMP_CLAUSE_COPYIN | OMP_CLAUSE_COPYOUT		      \
    | OMP_CLAUSE_CREATE | OMP_CLAUSE_PRESENT | OMP_CLAUSE_DEFAULT	      \
-   | OMP_CLAUSE_WAIT)
+   | OMP_CLAUSE_WAIT | OMP_CLAUSE_ATTACH)
 #define OACC_DATA_CLAUSES \
   (omp_mask (OMP_CLAUSE_IF) | OMP_CLAUSE_DEVICEPTR  | OMP_CLAUSE_COPY	      \
    | OMP_CLAUSE_COPYIN | OMP_CLAUSE_COPYOUT | OMP_CLAUSE_CREATE		      \
-   | OMP_CLAUSE_PRESENT)
+   | OMP_CLAUSE_PRESENT | OMP_CLAUSE_ATTACH)
 #define OACC_LOOP_CLAUSES \
   (omp_mask (OMP_CLAUSE_COLLAPSE) | OMP_CLAUSE_GANG | OMP_CLAUSE_WORKER	      \
    | OMP_CLAUSE_VECTOR | OMP_CLAUSE_SEQ | OMP_CLAUSE_INDEPENDENT	      \
@@ -1958,10 +1984,11 @@  gfc_match_omp_clauses (gfc_omp_clauses **cp, const omp_mask mask,
    | OMP_CLAUSE_DEVICE | OMP_CLAUSE_WAIT | OMP_CLAUSE_IF_PRESENT)
 #define OACC_ENTER_DATA_CLAUSES \
   (omp_mask (OMP_CLAUSE_IF) | OMP_CLAUSE_ASYNC | OMP_CLAUSE_WAIT	      \
-   | OMP_CLAUSE_COPYIN | OMP_CLAUSE_CREATE)
+   | OMP_CLAUSE_COPYIN | OMP_CLAUSE_CREATE | OMP_CLAUSE_ATTACH)
 #define OACC_EXIT_DATA_CLAUSES \
   (omp_mask (OMP_CLAUSE_IF) | OMP_CLAUSE_ASYNC | OMP_CLAUSE_WAIT	      \
-   | OMP_CLAUSE_COPYOUT | OMP_CLAUSE_DELETE | OMP_CLAUSE_FINALIZE)
+   | OMP_CLAUSE_COPYOUT | OMP_CLAUSE_DELETE | OMP_CLAUSE_FINALIZE	      \
+   | OMP_CLAUSE_DETACH)
 #define OACC_WAIT_CLAUSES \
   omp_mask (OMP_CLAUSE_ASYNC)
 #define OACC_ROUTINE_CLAUSES \
@@ -3734,9 +3761,6 @@  resolve_nonnegative_int_expr (gfc_expr *expr, const char *clause)
 static void
 check_symbol_not_pointer (gfc_symbol *sym, locus loc, const char *name)
 {
-  if (sym->ts.type == BT_DERIVED && sym->attr.pointer)
-    gfc_error ("POINTER object %qs of derived type in %s clause at %L",
-	       sym->name, name, &loc);
   if (sym->ts.type == BT_DERIVED && sym->attr.cray_pointer)
     gfc_error ("Cray pointer object %qs of derived type in %s clause at %L",
 	       sym->name, name, &loc);
@@ -3781,9 +3805,6 @@  check_array_not_assumed (gfc_symbol *sym, locus loc, const char *name)
 static void
 resolve_oacc_data_clauses (gfc_symbol *sym, locus loc, const char *name)
 {
-  if (sym->ts.type == BT_DERIVED && sym->attr.allocatable)
-    gfc_error ("ALLOCATABLE object %qs of derived type in %s clause at %L",
-	       sym->name, name, &loc);
   if ((sym->ts.type == BT_ASSUMED && sym->attr.allocatable)
       || (sym->ts.type == BT_CLASS && CLASS_DATA (sym)
 	  && CLASS_DATA (sym)->attr.allocatable))
@@ -4153,11 +4174,23 @@  resolve_omp_clauses (gfc_code *code, gfc_omp_clauses *omp_clauses,
 	&& (list != OMP_LIST_REDUCTION || !openacc))
       for (n = omp_clauses->lists[list]; n; n = n->next)
 	{
-	  if (n->sym->mark)
-	    gfc_error ("Symbol %qs present on multiple clauses at %L",
-		       n->sym->name, &n->where);
-	  else
-	    n->sym->mark = 1;
+	  bool array_only_p = true;
+	  /* Disallow duplicate bare variable references and multiple
+	     subarrays of the same array here, but allow multiple components of
+	     the same (e.g. derived-type) variable.  For the latter, duplicate
+	     components are detected elsewhere.  */
+	  if (openacc && n->expr && n->expr->expr_type == EXPR_VARIABLE)
+	    for (gfc_ref *ref = n->expr->ref; ref; ref = ref->next)
+	      if (ref->type != REF_ARRAY)
+		array_only_p = false;
+	  if (array_only_p)
+	    {
+	      if (n->sym->mark)
+		gfc_error ("Symbol %qs present on multiple clauses at %L",
+			   n->sym->name, &n->where);
+	      else
+		n->sym->mark = 1;
+	    }
 	}
 
   gcc_assert (OMP_LIST_LASTPRIVATE == OMP_LIST_FIRSTPRIVATE + 1);
@@ -4348,23 +4381,41 @@  resolve_omp_clauses (gfc_code *code, gfc_omp_clauses *omp_clauses,
 				 "are allowed on ORDERED directive at %L",
 				 &n->where);
 		  }
+		gfc_ref *array_ref = NULL;
+		bool resolved = false;
 		if (n->expr)
 		  {
-		    if (!gfc_resolve_expr (n->expr)
+		    array_ref = n->expr->ref;
+		    resolved = gfc_resolve_expr (n->expr);
+
+		    /* Look through component refs to find last array
+		       reference.  */
+		    while (resolved
+			   && array_ref
+			   && (array_ref->type == REF_COMPONENT
+			       || (array_ref->type == REF_ARRAY
+				   && array_ref->next
+			           && array_ref->next->type == REF_COMPONENT)))
+		      array_ref = array_ref->next;
+		  }
+		if (array_ref
+		    || (n->expr
+			&& (!resolved || n->expr->expr_type != EXPR_VARIABLE)))
+		  {
+		    if (!resolved
 			|| n->expr->expr_type != EXPR_VARIABLE
-			|| n->expr->ref == NULL
-			|| n->expr->ref->next
-			|| n->expr->ref->type != REF_ARRAY)
+			|| array_ref->next
+			|| array_ref->type != REF_ARRAY)
 		      gfc_error ("%qs in %s clause at %L is not a proper "
 				 "array section", n->sym->name, name,
 				 &n->where);
-		    else if (n->expr->ref->u.ar.codimen)
+		    else if (array_ref->u.ar.codimen)
 		      gfc_error ("Coarrays not supported in %s clause at %L",
 				 name, &n->where);
 		    else
 		      {
 			int i;
-			gfc_array_ref *ar = &n->expr->ref->u.ar;
+			gfc_array_ref *ar = &array_ref->u.ar;
 			for (i = 0; i < ar->dimen; i++)
 			  if (ar->stride[i])
 			    {
diff --git a/gcc/fortran/trans-openmp.c b/gcc/fortran/trans-openmp.c
index e7f3a78..b42f57a 100644
--- a/gcc/fortran/trans-openmp.c
+++ b/gcc/fortran/trans-openmp.c
@@ -60,6 +60,9 @@  gfc_omp_privatize_by_reference (const_tree decl)
 
   if (TREE_CODE (type) == POINTER_TYPE)
     {
+      while (TREE_CODE (decl) == COMPONENT_REF)
+        decl = TREE_OPERAND (decl, 1);
+
       /* Array POINTER/ALLOCATABLE have aggregate types, all user variables
 	 that have POINTER_TYPE type and aren't scalar pointers, scalar
 	 allocatables, Cray pointees or C pointers are supposed to be
@@ -2108,20 +2111,47 @@  gfc_trans_omp_clauses (stmtblock_t *block, gfc_omp_clauses *clauses,
 	      tree decl = gfc_get_symbol_decl (n->sym);
 	      if (DECL_P (decl))
 		TREE_ADDRESSABLE (decl) = 1;
-	      if (n->expr == NULL || n->expr->ref->u.ar.type == AR_FULL)
+
+	      gfc_ref *ref = n->expr ? n->expr->ref : NULL;
+	      symbol_attribute *sym_attr = &n->sym->attr;
+	      gomp_map_kind ptr_map_kind = GOMP_MAP_POINTER;
+
+	      if (ref && n->sym->ts.type == BT_DERIVED)
+	        {
+		  if (gfc_omp_privatize_by_reference (decl))
+		    decl = build_fold_indirect_ref (decl);
+
+		  for (; ref && ref->type == REF_COMPONENT; ref = ref->next)
+		    {
+		      tree field = ref->u.c.component->backend_decl;
+		      gcc_assert (field && TREE_CODE (field) == FIELD_DECL);
+		      decl = fold_build3 (COMPONENT_REF, TREE_TYPE (field),
+					  decl, field, NULL_TREE);
+		      sym_attr = &ref->u.c.component->attr;
+		    }
+
+		  ptr_map_kind = GOMP_MAP_ALWAYS_POINTER;
+		}
+
+	      if (ref == NULL || ref->u.ar.type == AR_FULL)
 		{
+		  tree field = decl;
+
+		  while (TREE_CODE (field) == COMPONENT_REF)
+		    field = TREE_OPERAND (field, 1);
+
 		  if (POINTER_TYPE_P (TREE_TYPE (decl))
 		      && (gfc_omp_privatize_by_reference (decl)
-			  || GFC_DECL_GET_SCALAR_POINTER (decl)
-			  || GFC_DECL_GET_SCALAR_ALLOCATABLE (decl)
-			  || GFC_DECL_CRAY_POINTEE (decl)
+			  || GFC_DECL_GET_SCALAR_POINTER (field)
+			  || GFC_DECL_GET_SCALAR_ALLOCATABLE (field)
+			  || GFC_DECL_CRAY_POINTEE (field)
 			  || GFC_DESCRIPTOR_TYPE_P
-					(TREE_TYPE (TREE_TYPE (decl)))))
+					(TREE_TYPE (TREE_TYPE (field)))))
 		    {
 		      tree orig_decl = decl;
 		      node4 = build_omp_clause (input_location,
 						OMP_CLAUSE_MAP);
-		      OMP_CLAUSE_SET_MAP_KIND (node4, GOMP_MAP_POINTER);
+		      OMP_CLAUSE_SET_MAP_KIND (node4, ptr_map_kind);
 		      OMP_CLAUSE_DECL (node4) = decl;
 		      OMP_CLAUSE_SIZE (node4) = size_int (0);
 		      decl = build_fold_indirect_ref (decl);
@@ -2131,13 +2161,15 @@  gfc_trans_omp_clauses (stmtblock_t *block, gfc_omp_clauses *clauses,
 			{
 			  node3 = build_omp_clause (input_location,
 						    OMP_CLAUSE_MAP);
-			  OMP_CLAUSE_SET_MAP_KIND (node3, GOMP_MAP_POINTER);
+			  OMP_CLAUSE_SET_MAP_KIND (node3, ptr_map_kind);
 			  OMP_CLAUSE_DECL (node3) = decl;
 			  OMP_CLAUSE_SIZE (node3) = size_int (0);
 			  decl = build_fold_indirect_ref (decl);
 			}
 		    }
-		  if (GFC_DESCRIPTOR_TYPE_P (TREE_TYPE (decl)))
+		  if (GFC_DESCRIPTOR_TYPE_P (TREE_TYPE (decl))
+		      && n->u.map_op != OMP_MAP_ATTACH
+		      && n->u.map_op != OMP_MAP_DETACH)
 		    {
 		      tree type = TREE_TYPE (decl);
 		      tree ptr = gfc_conv_descriptor_data_get (decl);
@@ -2152,14 +2184,16 @@  gfc_trans_omp_clauses (stmtblock_t *block, gfc_omp_clauses *clauses,
 		      OMP_CLAUSE_SIZE (node2) = TYPE_SIZE_UNIT (type);
 		      node3 = build_omp_clause (input_location,
 						OMP_CLAUSE_MAP);
-		      OMP_CLAUSE_SET_MAP_KIND (node3, GOMP_MAP_POINTER);
+		      OMP_CLAUSE_SET_MAP_KIND (node3, ptr_map_kind);
 		      OMP_CLAUSE_DECL (node3)
 			= gfc_conv_descriptor_data_get (decl);
+		      if (ptr_map_kind == GOMP_MAP_ALWAYS_POINTER)
+		        STRIP_NOPS (OMP_CLAUSE_DECL (node3));
 		      OMP_CLAUSE_SIZE (node3) = size_int (0);
 
 		      /* We have to check for n->sym->attr.dimension because
 			 of scalar coarrays.  */
-		      if (n->sym->attr.pointer && n->sym->attr.dimension)
+		      if (sym_attr->pointer && sym_attr->dimension)
 			{
 			  stmtblock_t cond_block;
 			  tree size
@@ -2189,11 +2223,11 @@  gfc_trans_omp_clauses (stmtblock_t *block, gfc_omp_clauses *clauses,
 							     else_b));
 			  OMP_CLAUSE_SIZE (node) = size;
 			}
-		      else if (n->sym->attr.dimension)
+		      else if (sym_attr->dimension)
 			OMP_CLAUSE_SIZE (node)
 			  = gfc_full_array_size (block, decl,
 						 GFC_TYPE_ARRAY_RANK (type));
-		      if (n->sym->attr.dimension)
+		      if (sym_attr->dimension)
 			{
 			  tree elemsz
 			    = TYPE_SIZE_UNIT (gfc_get_element_type (type));
@@ -2206,11 +2240,11 @@  gfc_trans_omp_clauses (stmtblock_t *block, gfc_omp_clauses *clauses,
 		  else
 		    OMP_CLAUSE_DECL (node) = decl;
 		}
-	      else
+	      else if (ref)
 		{
 		  tree ptr, ptr2;
 		  gfc_init_se (&se, NULL);
-		  if (n->expr->ref->u.ar.type == AR_ELEMENT)
+		  if (ref->u.ar.type == AR_ELEMENT)
 		    {
 		      gfc_conv_expr_reference (&se, n->expr);
 		      gfc_add_block_to_block (block, &se.pre);
@@ -2244,7 +2278,7 @@  gfc_trans_omp_clauses (stmtblock_t *block, gfc_omp_clauses *clauses,
 		    {
 		      node4 = build_omp_clause (input_location,
 						OMP_CLAUSE_MAP);
-		      OMP_CLAUSE_SET_MAP_KIND (node4, GOMP_MAP_POINTER);
+		      OMP_CLAUSE_SET_MAP_KIND (node4, ptr_map_kind);
 		      OMP_CLAUSE_DECL (node4) = decl;
 		      OMP_CLAUSE_SIZE (node4) = size_int (0);
 		      decl = build_fold_indirect_ref (decl);
@@ -2261,9 +2295,11 @@  gfc_trans_omp_clauses (stmtblock_t *block, gfc_omp_clauses *clauses,
 		      OMP_CLAUSE_SIZE (node2) = TYPE_SIZE_UNIT (type);
 		      node3 = build_omp_clause (input_location,
 						OMP_CLAUSE_MAP);
-		      OMP_CLAUSE_SET_MAP_KIND (node3, GOMP_MAP_POINTER);
+		      OMP_CLAUSE_SET_MAP_KIND (node3, ptr_map_kind);
 		      OMP_CLAUSE_DECL (node3)
 			= gfc_conv_descriptor_data_get (decl);
+		      if (ptr_map_kind == GOMP_MAP_ALWAYS_POINTER)
+		        STRIP_NOPS (OMP_CLAUSE_DECL (node3));
 		    }
 		  else
 		    {
@@ -2276,18 +2312,23 @@  gfc_trans_omp_clauses (stmtblock_t *block, gfc_omp_clauses *clauses,
 			}
 		      node3 = build_omp_clause (input_location,
 						OMP_CLAUSE_MAP);
-		      OMP_CLAUSE_SET_MAP_KIND (node3, GOMP_MAP_POINTER);
+		      OMP_CLAUSE_SET_MAP_KIND (node3, ptr_map_kind);
 		      OMP_CLAUSE_DECL (node3) = decl;
 		    }
 		  ptr2 = fold_convert (sizetype, ptr2);
 		  OMP_CLAUSE_SIZE (node3)
 		    = fold_build2 (MINUS_EXPR, sizetype, ptr, ptr2);
 		}
+	      else
+	        gcc_unreachable ();
 	      switch (n->u.map_op)
 		{
 		case OMP_MAP_ALLOC:
 		  OMP_CLAUSE_SET_MAP_KIND (node, GOMP_MAP_ALLOC);
 		  break;
+		case OMP_MAP_ATTACH:
+		  OMP_CLAUSE_SET_MAP_KIND (node, GOMP_MAP_ATTACH);
+		  break;
 		case OMP_MAP_TO:
 		  OMP_CLAUSE_SET_MAP_KIND (node, GOMP_MAP_TO);
 		  break;
@@ -2312,6 +2353,9 @@  gfc_trans_omp_clauses (stmtblock_t *block, gfc_omp_clauses *clauses,
 		case OMP_MAP_DELETE:
 		  OMP_CLAUSE_SET_MAP_KIND (node, GOMP_MAP_DELETE);
 		  break;
+		case OMP_MAP_DETACH:
+		  OMP_CLAUSE_SET_MAP_KIND (node, GOMP_MAP_DETACH);
+		  break;
 		case OMP_MAP_FORCE_ALLOC:
 		  OMP_CLAUSE_SET_MAP_KIND (node, GOMP_MAP_FORCE_ALLOC);
 		  break;
diff --git a/gcc/gimplify.c b/gcc/gimplify.c
index 274edc0..aa7723d 100644
--- a/gcc/gimplify.c
+++ b/gcc/gimplify.c
@@ -113,6 +113,10 @@  enum gimplify_omp_var_data
 
   GOVD_NONTEMPORAL = 4194304,
 
+  /* Flag for GOVD_MAP: (struct) vars that have pointer attachments for
+     fields.  */
+  GOVD_MAP_HAS_ATTACHMENTS = 8388608,
+
   GOVD_DATA_SHARE_CLASS = (GOVD_SHARED | GOVD_PRIVATE | GOVD_FIRSTPRIVATE
 			   | GOVD_LASTPRIVATE | GOVD_REDUCTION | GOVD_LINEAR
 			   | GOVD_LOCAL)
@@ -7998,7 +8002,13 @@  insert_struct_component_mapping (enum tree_code code, tree c, tree struct_node,
   OMP_CLAUSE_SET_MAP_KIND (c2, mkind);
   OMP_CLAUSE_DECL (c2) = unshare_expr (OMP_CLAUSE_DECL (c));
   OMP_CLAUSE_CHAIN (c2) = scp ? *scp : prev_node;
-  OMP_CLAUSE_SIZE (c2) = TYPE_SIZE_UNIT (ptr_type_node);
+  if (OMP_CLAUSE_CHAIN (prev_node) != c
+      && OMP_CLAUSE_CODE (OMP_CLAUSE_CHAIN (prev_node)) == OMP_CLAUSE_MAP
+      && (OMP_CLAUSE_MAP_KIND (OMP_CLAUSE_CHAIN (prev_node))
+	  == GOMP_MAP_TO_PSET))
+    OMP_CLAUSE_SIZE (c2) = OMP_CLAUSE_SIZE (OMP_CLAUSE_CHAIN (prev_node));
+  else
+    OMP_CLAUSE_SIZE (c2) = TYPE_SIZE_UNIT (ptr_type_node);
   if (struct_node)
     OMP_CLAUSE_CHAIN (struct_node) = c2;
 
@@ -8588,7 +8598,9 @@  gimplify_scan_omp_clauses (tree *list_p, gimple_seq *pre_p,
 		  remove = true;
 		  break;
 		}
-	      if (DECL_P (decl))
+	      if (DECL_P (decl)
+		  && OMP_CLAUSE_MAP_KIND (c) != GOMP_MAP_TO_PSET
+		  && code != OACC_UPDATE)
 		{
 		  if (error_operand_p (decl))
 		    {
@@ -8640,16 +8652,36 @@  gimplify_scan_omp_clauses (tree *list_p, gimple_seq *pre_p,
 		    = splay_tree_lookup (ctx->variables, (splay_tree_key)decl);
 		  bool ptr = (OMP_CLAUSE_MAP_KIND (c)
 			      == GOMP_MAP_ALWAYS_POINTER);
+		  bool attach = OMP_CLAUSE_MAP_KIND (c) == GOMP_MAP_ATTACH
+			        || OMP_CLAUSE_MAP_KIND (c) == GOMP_MAP_DETACH;
+		  bool has_attachments = false;
+		  /* For OpenACC, pointers in structs should trigger an
+		     attach action.  */
+		  if (ptr && (region_type & ORT_ACC) != 0)
+		    {
+		      /* Turning a GOMP_MAP_ALWAYS_POINTER clause into a
+			 GOMP_MAP_ATTACH clause after we have detected a case
+			 that needs a GOMP_MAP_STRUCT mapping adding.  */
+		      OMP_CLAUSE_SET_MAP_KIND (c,
+			(code == OACC_EXIT_DATA) ? GOMP_MAP_DETACH
+						 : GOMP_MAP_ATTACH);
+		      has_attachments = true;
+		    }
 		  if (n == NULL || (n->value & GOVD_MAP) == 0)
 		    {
 		      tree l = build_omp_clause (OMP_CLAUSE_LOCATION (c),
 						 OMP_CLAUSE_MAP);
-		      OMP_CLAUSE_SET_MAP_KIND (l, GOMP_MAP_STRUCT);
+		      OMP_CLAUSE_SET_MAP_KIND (l, attach
+			? GOMP_MAP_FORCE_PRESENT : GOMP_MAP_STRUCT);
 		      if (!base_eq_orig_base)
 			OMP_CLAUSE_DECL (l) = unshare_expr (orig_base);
 		      else
 			OMP_CLAUSE_DECL (l) = decl;
-		      OMP_CLAUSE_SIZE (l) = size_int (1);
+		      OMP_CLAUSE_SIZE (l) = attach
+			? (DECL_P (OMP_CLAUSE_DECL (l))
+			     ? DECL_SIZE_UNIT (OMP_CLAUSE_DECL (l))
+			     : TYPE_SIZE_UNIT (TREE_TYPE (OMP_CLAUSE_DECL (l))))
+			: size_int (1);
 		      if (struct_map_to_clause == NULL)
 			struct_map_to_clause = new hash_map<tree, tree>;
 		      struct_map_to_clause->put (decl, l);
@@ -8681,9 +8713,11 @@  gimplify_scan_omp_clauses (tree *list_p, gimple_seq *pre_p,
 		      flags = GOVD_MAP | GOVD_EXPLICIT;
 		      if (GOMP_MAP_ALWAYS_P (OMP_CLAUSE_MAP_KIND (c)) || ptr)
 			flags |= GOVD_SEEN;
+		      if (has_attachments)
+			flags |= GOVD_MAP_HAS_ATTACHMENTS;
 		      goto do_add_decl;
 		    }
-		  else
+		  else if (struct_map_to_clause)
 		    {
 		      tree *osc = struct_map_to_clause->get (decl);
 		      tree *sc = NULL, *scp = NULL;
@@ -8692,8 +8726,10 @@  gimplify_scan_omp_clauses (tree *list_p, gimple_seq *pre_p,
 		      sc = &OMP_CLAUSE_CHAIN (*osc);
 		      if (*sc != c
 			  && (OMP_CLAUSE_MAP_KIND (*sc)
-			      == GOMP_MAP_FIRSTPRIVATE_REFERENCE)) 
+			      == GOMP_MAP_FIRSTPRIVATE_REFERENCE))
 			sc = &OMP_CLAUSE_CHAIN (*sc);
+		      /* Here "prev_list_p" is the end of the inserted
+			 alloc/release nodes after the struct node, OSC.  */
 		      for (; *sc != c; sc = &OMP_CLAUSE_CHAIN (*sc))
 			if (ptr && sc == prev_list_p)
 			  break;
@@ -8752,9 +8788,10 @@  gimplify_scan_omp_clauses (tree *list_p, gimple_seq *pre_p,
 			  }
 		      if (remove)
 			break;
-		      OMP_CLAUSE_SIZE (*osc)
-			= size_binop (PLUS_EXPR, OMP_CLAUSE_SIZE (*osc),
-				      size_one_node);
+		      if (!attach)
+			OMP_CLAUSE_SIZE (*osc)
+			  = size_binop (PLUS_EXPR, OMP_CLAUSE_SIZE (*osc),
+					size_one_node);
 		      if (ptr)
 			{
 			  tree cl
@@ -8786,11 +8823,15 @@  gimplify_scan_omp_clauses (tree *list_p, gimple_seq *pre_p,
 		}
 	      if (!remove
 		  && OMP_CLAUSE_MAP_KIND (c) != GOMP_MAP_ALWAYS_POINTER
+		  && OMP_CLAUSE_MAP_KIND (c) != GOMP_MAP_TO_PSET
 		  && OMP_CLAUSE_CHAIN (c)
 		  && OMP_CLAUSE_CODE (OMP_CLAUSE_CHAIN (c)) == OMP_CLAUSE_MAP
-		  && (OMP_CLAUSE_MAP_KIND (OMP_CLAUSE_CHAIN (c))
-		      == GOMP_MAP_ALWAYS_POINTER))
+		  && ((OMP_CLAUSE_MAP_KIND (OMP_CLAUSE_CHAIN (c))
+		       == GOMP_MAP_ALWAYS_POINTER)
+		      || (OMP_CLAUSE_MAP_KIND (OMP_CLAUSE_CHAIN (c))
+		          == GOMP_MAP_TO_PSET)))
 		prev_list_p = list_p;
+
 	      break;
 	    }
 	  flags = GOVD_MAP | GOVD_EXPLICIT;
@@ -9412,6 +9453,8 @@  gimplify_adjust_omp_clauses_1 (splay_tree_node n, void *data)
     return 0;
   if ((flags & GOVD_SEEN) == 0)
     return 0;
+  if ((flags & GOVD_MAP_HAS_ATTACHMENTS) != 0)
+    return 0;
   if (flags & GOVD_DEBUG_PRIVATE)
     {
       gcc_assert ((flags & GOVD_DATA_SHARE_CLASS) == GOVD_SHARED);
@@ -11795,8 +11838,9 @@  gimplify_omp_target_update (tree *expr_p, gimple_seq *pre_p)
 	   && omp_find_clause (OMP_STANDALONE_CLAUSES (expr),
 			       OMP_CLAUSE_FINALIZE))
     {
-      /* Use GOMP_MAP_DELETE/GOMP_MAP_FORCE_FROM to denote that "finalize"
-	 semantics apply to all mappings of this OpenACC directive.  */
+      /* Use GOMP_MAP_DELETE, GOMP_MAP_FORCE_DETACH, and
+	 GOMP_MAP_FORCE_FROM to denote that "finalize" semantics apply
+	 to all mappings of this OpenACC directive.  */
       bool finalize_marked = false;
       for (tree c = OMP_STANDALONE_CLAUSES (expr); c; c = OMP_CLAUSE_CHAIN (c))
 	if (OMP_CLAUSE_CODE (c) == OMP_CLAUSE_MAP)
@@ -11810,10 +11854,19 @@  gimplify_omp_target_update (tree *expr_p, gimple_seq *pre_p)
 	      OMP_CLAUSE_SET_MAP_KIND (c, GOMP_MAP_DELETE);
 	      finalize_marked = true;
 	      break;
+	    case GOMP_MAP_DETACH:
+	      OMP_CLAUSE_SET_MAP_KIND (c, GOMP_MAP_FORCE_DETACH);
+	      finalize_marked = true;
+	      break;
+	    case GOMP_MAP_STRUCT:
+	    case GOMP_MAP_FORCE_PRESENT:
+	      /* Skip over an initial struct or force_present mapping.  */
+	      break;
 	    default:
-	      /* Check consistency: libgomp relies on the very first data
-		 mapping clause being marked, so make sure we did that before
-		 any other mapping clauses.  */
+	      /* Check consistency: libgomp relies on the very first
+		 non-struct, non-force-present data mapping clause being
+		 marked, so make sure we did that before any other mapping
+		 clauses.  */
 	      gcc_assert (finalize_marked);
 	      break;
 	    }
diff --git a/gcc/omp-low.c b/gcc/omp-low.c
index ca78d7a..55dbc0b 100644
--- a/gcc/omp-low.c
+++ b/gcc/omp-low.c
@@ -9138,6 +9138,9 @@  lower_omp_target (gimple_stmt_iterator *gsi_p, omp_context *ctx)
 	  case GOMP_MAP_FORCE_DEVICEPTR:
 	  case GOMP_MAP_DEVICE_RESIDENT:
 	  case GOMP_MAP_LINK:
+	  case GOMP_MAP_ATTACH:
+	  case GOMP_MAP_DETACH:
+	  case GOMP_MAP_FORCE_DETACH:
 	    gcc_assert (is_gimple_omp_oacc (stmt));
 	    break;
 	  default:
diff --git a/gcc/testsuite/c-c++-common/goacc/mdc-1.c b/gcc/testsuite/c-c++-common/goacc/mdc-1.c
new file mode 100644
index 0000000..84a44af
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/goacc/mdc-1.c
@@ -0,0 +1,54 @@ 
+/* Test OpenACC's support for manual deep copy, including the attach
+   and detach clauses.  */
+
+/* { dg-additional-options "-fdump-tree-omplower" } */
+
+void
+t1 ()
+{
+  struct foo {
+    int *a, *b, c, d, *e;
+  } s;
+
+  int *a, *z;
+
+#pragma acc enter data copyin(s)
+  {
+#pragma acc data copy(s.a[0:10]) copy(z[0:10])
+    {
+      s.e = z;
+#pragma acc parallel loop attach(s.e)
+      for (int i = 0; i < 10; i++)
+        s.a[i] = s.e[i];
+
+
+      a = s.e;
+#pragma acc enter data attach(a)
+#pragma acc exit data detach(a)
+    }
+
+#pragma acc enter data copyin(a)
+#pragma acc acc enter data attach(s.e)
+#pragma acc exit data detach(s.e)
+
+#pragma acc data attach(s.e)
+    {
+    }
+#pragma acc exit data delete(a)
+
+#pragma acc exit data detach(a) finalize
+#pragma acc exit data detach(s.a) finalize
+  }
+}
+
+/* { dg-final { scan-tree-dump-times "pragma omp target oacc_enter_exit_data map.to:s .len: 32.." 1 "omplower" } } */
+/* { dg-final { scan-tree-dump-times "pragma omp target oacc_data map.tofrom:.z .len: 40.. map.struct:s .len: 1.. map.alloc:s.a .len: 8.. map.tofrom:._1 .len: 40.. map.attach:s.a .len: 0.." 1 "omplower" } } */
+/* { dg-final { scan-tree-dump-times "pragma omp target oacc_parallel map.force_present:s .len: 32.. map.attach:s.e .len: 8.." 1 "omplower" } } */
+/* { dg-final { scan-tree-dump-times "pragma omp target oacc_enter_exit_data map.attach:a .len: 8.." 1 "omplower" } } */
+/* { dg-final { scan-tree-dump-times "pragma omp target oacc_enter_exit_data map.detach:a .len: 8.." 1 "omplower" } } */
+/* { dg-final { scan-tree-dump-times "pragma omp target oacc_enter_exit_data map.to:a .len: 8.." 1 "omplower" } } */
+/* { dg-final { scan-tree-dump-times "pragma omp target oacc_enter_exit_data map.force_present:s .len: 32.. map.detach:s.e .len: 8.." 1 "omplower" } } */
+/* { dg-final { scan-tree-dump-times "pragma omp target oacc_data map.force_present:s .len: 32.. map.attach:s.e .len: 8.." 1 "omplower" } } */
+/* { dg-final { scan-tree-dump-times "pragma omp target oacc_enter_exit_data map.release:a .len: 8.." 1 "omplower" } } */
+/* { dg-final { scan-tree-dump-times "pragma omp target oacc_enter_exit_data finalize map.force_detach:a .len: 8.." 1 "omplower" } } */
+/* { dg-final { scan-tree-dump-times "pragma omp target oacc_enter_exit_data finalize map.force_present:s .len: 32.. map.force_detach:s.a .len: 8.." 1 "omplower" } } */
diff --git a/gcc/testsuite/c-c++-common/goacc/mdc-2.c b/gcc/testsuite/c-c++-common/goacc/mdc-2.c
new file mode 100644
index 0000000..ebfb99d
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/goacc/mdc-2.c
@@ -0,0 +1,62 @@ 
+/* Test OpenACC's support for manual deep copy, including the attach
+   and detach clauses.  */
+
+void
+t1 ()
+{
+  struct foo {
+    int *a, *b, c, d, *e;
+  } s;
+
+  int *a, *z, scalar, **y;
+
+#pragma acc enter data copyin(s) detach(z) /* { dg-error ".detach. is not valid for" } */
+  {
+#pragma acc data copy(s.a[0:10]) copy(z[0:10])
+    {
+      s.e = z;
+#pragma acc parallel loop attach(s.e) detach(s.b) /* { dg-error ".detach. is not valid for" } */
+      for (int i = 0; i < 10; i++)
+        s.a[i] = s.e[i];
+
+      a = s.e;
+#pragma acc enter data attach(a) detach(s.c) /* { dg-error ".detach. is not valid for" } */
+#pragma acc exit data detach(a)
+    }
+
+#pragma acc enter data attach(z[:5]) /* { dg-error "array section in .attach. clause" } */
+/* { dg-error "has no data movement clause" "" { target *-*-* } .-1 } */
+#pragma acc exit data detach(z[:5]) /* { dg-error "array section in .detach. clause" } */
+/* { dg-error "has no data movement clause" "" { target *-*-* } .-1 } */
+#pragma acc enter data attach(z[1:]) /* { dg-error "array section in .attach. clause" } */
+/* { dg-error "has no data movement clause" "" { target *-*-* } .-1 } */
+#pragma acc exit data detach(z[1:]) /* { dg-error "array section in .detach. clause" } */
+/* { dg-error "has no data movement clause" "" { target *-*-* } .-1 } */
+#pragma acc enter data attach(z[:]) /* { dg-error "array section in .attach. clause" } */
+/* { dg-error "has no data movement clause" "" { target *-*-* } .-1 } */
+#pragma acc exit data detach(z[:]) /* { dg-error "array section in .detach. clause" } */
+/* { dg-error "has no data movement clause" "" { target *-*-* } .-1 } */
+#pragma acc enter data attach(z[3]) /* { dg-error "expected pointer in .attach. clause" } */
+#pragma acc exit data detach(z[3]) /* { dg-error "expected pointer in .detach. clause" } */
+
+#pragma acc acc enter data attach(s.e)
+#pragma acc exit data detach(s.e) attach(z) /* { dg-error ".attach. is not valid for" } */
+
+#pragma acc data attach(s.e)
+    {
+    }
+#pragma acc exit data delete(a) attach(s.a) /* { dg-error ".attach. is not valid for" } */
+
+#pragma acc enter data attach(scalar) /* { dg-error "expected pointer in .attach. clause" } */
+/* { dg-error "has no data movement clause" "" { target *-*-* } .-1 } */
+#pragma acc exit data detach(scalar) /* { dg-error "expected pointer in .detach. clause" } */
+/* { dg-error "has no data movement clause" "" { target *-*-* } .-1 } */
+#pragma acc enter data attach(s) /* { dg-error "expected pointer in .attach. clause" } */
+/* { dg-error "has no data movement clause" "" { target *-*-* } .-1 } */
+#pragma acc exit data detach(s) /* { dg-error "expected pointer in .detach. clause" } */
+/* { dg-error "has no data movement clause" "" { target *-*-* } .-1 } */
+  }
+
+#pragma acc enter data attach(y[10])
+#pragma acc exit data detach(y[10])
+}
diff --git a/gcc/testsuite/g++.dg/goacc/mdc.C b/gcc/testsuite/g++.dg/goacc/mdc.C
new file mode 100644
index 0000000..fbd43aa
--- /dev/null
+++ b/gcc/testsuite/g++.dg/goacc/mdc.C
@@ -0,0 +1,68 @@ 
+/* Test OpenACC's support for manual deep copy, including the attach
+   and detach clauses.  */
+
+void
+t1 ()
+{
+  struct foo {
+    int *a, *b, c, d, *e;
+  } s;
+
+  struct foo& rs = s;
+  
+  int *a, *z, scalar, **y;
+  int* const &ra = a;
+  int* const &rz = z;
+  int& rscalar = scalar;
+  int** const &ry = y;
+
+#pragma acc enter data copyin(rs) detach(rz) /* { dg-error ".detach. is not valid for" } */
+  {
+#pragma acc data copy(rs.a[0:10]) copy(rz[0:10])
+    {
+      s.e = z;
+#pragma acc parallel loop attach(rs.e) detach(rs.b) /* { dg-error ".detach. is not valid for" } */
+      for (int i = 0; i < 10; i++)
+        s.a[i] = s.e[i];
+
+      a = s.e;
+#pragma acc enter data attach(ra) detach(rs.c) /* { dg-error ".detach. is not valid for" } */
+#pragma acc exit data detach(ra)
+    }
+
+#pragma acc enter data attach(rz[:5]) /* { dg-error "array section in .attach. clause" } */
+/* { dg-error "has no data movement clause" "" { target *-*-* } .-1 } */
+#pragma acc exit data detach(rz[:5]) /* { dg-error "array section in .detach. clause" } */
+/* { dg-error "has no data movement clause" "" { target *-*-* } .-1 } */
+#pragma acc enter data attach(rz[1:]) /* { dg-error "array section in .attach. clause" } */
+/* { dg-error "has no data movement clause" "" { target *-*-* } .-1 } */
+#pragma acc exit data detach(rz[1:]) /* { dg-error "array section in .detach. clause" } */
+/* { dg-error "has no data movement clause" "" { target *-*-* } .-1 } */
+#pragma acc enter data attach(rz[:]) /* { dg-error "array section in .attach. clause" } */
+/* { dg-error "has no data movement clause" "" { target *-*-* } .-1 } */
+#pragma acc exit data detach(rz[:]) /* { dg-error "array section in .detach. clause" } */
+/* { dg-error "has no data movement clause" "" { target *-*-* } .-1 } */
+#pragma acc enter data attach(rz[3]) /* { dg-error "expected pointer in .attach. clause" } */
+#pragma acc exit data detach(rz[3]) /* { dg-error "expected pointer in .detach. clause" } */
+
+#pragma acc acc enter data attach(rs.e)
+#pragma acc exit data detach(rs.e) attach(rz) /* { dg-error ".attach. is not valid for" } */
+
+#pragma acc data attach(rs.e)
+    {
+    }
+#pragma acc exit data delete(ra) attach(rs.a) /* { dg-error ".attach. is not valid for" } */
+
+#pragma acc enter data attach(rscalar) /* { dg-error "expected pointer in .attach. clause" } */
+/* { dg-error "has no data movement clause" "" { target *-*-* } .-1 } */
+#pragma acc exit data detach(rscalar) /* { dg-error "expected pointer in .detach. clause" } */
+/* { dg-error "has no data movement clause" "" { target *-*-* } .-1 } */
+#pragma acc enter data attach(rs) /* { dg-error "expected pointer in .attach. clause" } */
+/* { dg-error "has no data movement clause" "" { target *-*-* } .-1 } */
+#pragma acc exit data detach(rs) /* { dg-error "expected pointer in .detach. clause" } */
+/* { dg-error "has no data movement clause" "" { target *-*-* } .-1 } */
+  }
+
+#pragma acc enter data attach(ry[10])
+#pragma acc exit data detach(ry[10])
+}
diff --git a/gcc/testsuite/gfortran.dg/goacc/data-clauses.f95 b/gcc/testsuite/gfortran.dg/goacc/data-clauses.f95
index b94214e..1a4a671 100644
--- a/gcc/testsuite/gfortran.dg/goacc/data-clauses.f95
+++ b/gcc/testsuite/gfortran.dg/goacc/data-clauses.f95
@@ -39,9 +39,9 @@  contains
   !$acc end data
 
 
-  !$acc parallel copy (tip) ! { dg-error "POINTER" }
+  !$acc parallel copy (tip)
   !$acc end parallel
-  !$acc parallel copy (tia) ! { dg-error "ALLOCATABLE" }
+  !$acc parallel copy (tia)
   !$acc end parallel
   !$acc parallel deviceptr (i) copy (i) ! { dg-error "multiple clauses" }
   !$acc end parallel
@@ -54,9 +54,9 @@  contains
   !$acc end data
 
 
-  !$acc parallel copyin (tip) ! { dg-error "POINTER" }
+  !$acc parallel copyin (tip)
   !$acc end parallel
-  !$acc parallel copyin (tia) ! { dg-error "ALLOCATABLE" }
+  !$acc parallel copyin (tia)
   !$acc end parallel
   !$acc parallel deviceptr (i) copyin (i) ! { dg-error "multiple clauses" }
   !$acc end parallel
@@ -71,9 +71,9 @@  contains
   !$acc end data
 
 
-  !$acc parallel copyout (tip) ! { dg-error "POINTER" }
+  !$acc parallel copyout (tip)
   !$acc end parallel
-  !$acc parallel copyout (tia) ! { dg-error "ALLOCATABLE" }
+  !$acc parallel copyout (tia)
   !$acc end parallel
   !$acc parallel deviceptr (i) copyout (i) ! { dg-error "multiple clauses" }
   !$acc end parallel
@@ -90,9 +90,9 @@  contains
   !$acc end data
 
 
-  !$acc parallel create (tip) ! { dg-error "POINTER" }
+  !$acc parallel create (tip)
   !$acc end parallel
-  !$acc parallel create (tia) ! { dg-error "ALLOCATABLE" }
+  !$acc parallel create (tia)
   !$acc end parallel
   !$acc parallel deviceptr (i) create (i) ! { dg-error "multiple clauses" }
   !$acc end parallel
@@ -111,9 +111,9 @@  contains
   !$acc end data
 
 
-  !$acc parallel present (tip) ! { dg-error "POINTER" }
+  !$acc parallel present (tip)
   !$acc end parallel
-  !$acc parallel present (tia) ! { dg-error "ALLOCATABLE" }
+  !$acc parallel present (tia)
   !$acc end parallel
   !$acc parallel deviceptr (i) present (i) ! { dg-error "multiple clauses" }
   !$acc end parallel
@@ -144,9 +144,9 @@  contains
   !$acc end parallel
 
 
-  !$acc parallel present_or_copy (tip) ! { dg-error "POINTER" }
+  !$acc parallel present_or_copy (tip)
   !$acc end parallel
-  !$acc parallel present_or_copy (tia) ! { dg-error "ALLOCATABLE" }
+  !$acc parallel present_or_copy (tia)
   !$acc end parallel
   !$acc parallel deviceptr (i) present_or_copy (i) ! { dg-error "multiple clauses" }
   !$acc end parallel
@@ -169,9 +169,9 @@  contains
   !$acc end data
 
 
-  !$acc parallel present_or_copyin (tip) ! { dg-error "POINTER" }
+  !$acc parallel present_or_copyin (tip)
   !$acc end parallel
-  !$acc parallel present_or_copyin (tia) ! { dg-error "ALLOCATABLE" }
+  !$acc parallel present_or_copyin (tia)
   !$acc end parallel
   !$acc parallel deviceptr (i) present_or_copyin (i) ! { dg-error "multiple clauses" }
   !$acc end parallel
@@ -196,9 +196,9 @@  contains
   !$acc end data
 
 
-  !$acc parallel present_or_copyout (tip) ! { dg-error "POINTER" }
+  !$acc parallel present_or_copyout (tip)
   !$acc end parallel
-  !$acc parallel present_or_copyout (tia) ! { dg-error "ALLOCATABLE" }
+  !$acc parallel present_or_copyout (tia)
   !$acc end parallel
   !$acc parallel deviceptr (i) present_or_copyout (i) ! { dg-error "multiple clauses" }
   !$acc end parallel
@@ -225,9 +225,9 @@  contains
   !$acc end data
 
 
-  !$acc parallel present_or_create (tip) ! { dg-error "POINTER" }
+  !$acc parallel present_or_create (tip)
   !$acc end parallel
-  !$acc parallel present_or_create (tia) ! { dg-error "ALLOCATABLE" }
+  !$acc parallel present_or_create (tia)
   !$acc end parallel
   !$acc parallel deviceptr (i) present_or_create (i) ! { dg-error "multiple clauses" }
   !$acc end parallel
@@ -256,4 +256,4 @@  contains
   !$acc end data
 
   end subroutine foo
-end module test
\ No newline at end of file
+end module test
diff --git a/gcc/testsuite/gfortran.dg/goacc/derived-types.f90 b/gcc/testsuite/gfortran.dg/goacc/derived-types.f90
new file mode 100644
index 0000000..5fb2981
--- /dev/null
+++ b/gcc/testsuite/gfortran.dg/goacc/derived-types.f90
@@ -0,0 +1,77 @@ 
+! Test ACC UPDATE with derived types.
+
+module dt
+  integer, parameter :: n = 10
+  type inner
+     integer :: d(n)
+  end type inner
+  type dtype
+     integer(8) :: a, b, c(n)
+     type(inner) :: in
+  end type dtype
+end module dt
+
+program derived_acc
+  use dt
+  
+  implicit none
+  type(dtype):: var
+  integer i
+  !$acc declare create(var)
+  !$acc declare pcopy(var%a) ! { dg-error "Syntax error in OpenMP" }
+
+  !$acc update host(var)
+  !$acc update host(var%a)
+  !$acc update device(var)
+  !$acc update device(var%a)
+  !$acc update self(var)
+  !$acc update self(var%a)
+  
+  !$acc enter data copyin(var)
+  !$acc enter data copyin(var%a)
+
+  !$acc exit data copyout(var)
+  !$acc exit data copyout(var%a)
+
+  !$acc data copy(var)
+  !$acc end data
+
+  !$acc data copyout(var%a)
+  !$acc end data
+
+  !$acc parallel loop pcopyout(var)
+  do i = 1, 10
+  end do  
+  !$acc end parallel loop
+
+  !$acc parallel loop copyout(var%a)
+  do i = 1, 10
+  end do
+  !$acc end parallel loop
+
+  !$acc parallel pcopy(var)
+  !$acc end parallel
+
+  !$acc parallel pcopy(var%a)
+  do i = 1, 10
+  end do
+  !$acc end parallel
+  
+  !$acc kernels pcopyin(var)
+  !$acc end kernels
+
+  !$acc kernels pcopy(var%a)
+  do i = 1, 10
+  end do
+  !$acc end kernels
+
+  !$acc kernels loop pcopyin(var)
+  do i = 1, 10
+  end do
+  !$acc end kernels loop
+
+  !$acc kernels loop pcopy(var%a)
+  do i = 1, 10
+  end do
+  !$acc end kernels loop
+end program derived_acc
diff --git a/gcc/testsuite/gfortran.dg/goacc/enter-exit-data.f95 b/gcc/testsuite/gfortran.dg/goacc/enter-exit-data.f95
index 805459c..b616b39 100644
--- a/gcc/testsuite/gfortran.dg/goacc/enter-exit-data.f95
+++ b/gcc/testsuite/gfortran.dg/goacc/enter-exit-data.f95
@@ -44,14 +44,14 @@  contains
   !$acc enter data wait (i, 1) 
   !$acc enter data wait (a) ! { dg-error "INTEGER" }
   !$acc enter data wait (b(5:6)) ! { dg-error "INTEGER" }
-  !$acc enter data copyin (tip) ! { dg-error "POINTER" }
-  !$acc enter data copyin (tia) ! { dg-error "ALLOCATABLE" }
-  !$acc enter data create (tip) ! { dg-error "POINTER" }
-  !$acc enter data create (tia) ! { dg-error "ALLOCATABLE" }
-  !$acc enter data present_or_copyin (tip) ! { dg-error "POINTER" }
-  !$acc enter data present_or_copyin (tia) ! { dg-error "ALLOCATABLE" }
-  !$acc enter data present_or_create (tip) ! { dg-error "POINTER" }
-  !$acc enter data present_or_create (tia) ! { dg-error "ALLOCATABLE" }
+  !$acc enter data copyin (tip)
+  !$acc enter data copyin (tia)
+  !$acc enter data create (tip)
+  !$acc enter data create (tia)
+  !$acc enter data present_or_copyin (tip)
+  !$acc enter data present_or_copyin (tia)
+  !$acc enter data present_or_create (tip)
+  !$acc enter data present_or_create (tia)
   !$acc enter data copyin (i) create (i) ! { dg-error "multiple clauses" }
   !$acc enter data copyin (i) present_or_copyin (i) ! { dg-error "multiple clauses" }
   !$acc enter data create (i) present_or_copyin (i) ! { dg-error "multiple clauses" }
@@ -79,10 +79,10 @@  contains
   !$acc exit data wait (i, 1) 
   !$acc exit data wait (a) ! { dg-error "INTEGER" }
   !$acc exit data wait (b(5:6)) ! { dg-error "INTEGER" }
-  !$acc exit data copyout (tip) ! { dg-error "POINTER" }
-  !$acc exit data copyout (tia) ! { dg-error "ALLOCATABLE" }
-  !$acc exit data delete (tip) ! { dg-error "POINTER" }
-  !$acc exit data delete (tia) ! { dg-error "ALLOCATABLE" }
+  !$acc exit data copyout (tip)
+  !$acc exit data copyout (tia)
+  !$acc exit data delete (tip)
+  !$acc exit data delete (tia)
   !$acc exit data copyout (i) delete (i) ! { dg-error "multiple clauses" }
   !$acc exit data finalize
   !$acc exit data finalize copyout (i)
diff --git a/gcc/tree-pretty-print.c b/gcc/tree-pretty-print.c
index 99eca4a..5455da9 100644
--- a/gcc/tree-pretty-print.c
+++ b/gcc/tree-pretty-print.c
@@ -826,6 +826,15 @@  dump_omp_clause (pretty_printer *pp, tree clause, int spc, dump_flags_t flags)
 	case GOMP_MAP_LINK:
 	  pp_string (pp, "link");
 	  break;
+	case GOMP_MAP_ATTACH:
+	  pp_string (pp, "attach");
+	  break;
+	case GOMP_MAP_DETACH:
+	  pp_string (pp, "detach");
+	  break;
+	case GOMP_MAP_FORCE_DETACH:
+	  pp_string (pp, "force_detach");
+	  break;
 	default:
 	  gcc_unreachable ();
 	}
diff --git a/include/gomp-constants.h b/include/gomp-constants.h
index acd2585..b7ce640 100644
--- a/include/gomp-constants.h
+++ b/include/gomp-constants.h
@@ -42,6 +42,7 @@ 
 #define GOMP_MAP_FLAG_SPECIAL_2		(1 << 4)
 #define GOMP_MAP_FLAG_SPECIAL		(GOMP_MAP_FLAG_SPECIAL_1 \
 					 | GOMP_MAP_FLAG_SPECIAL_0)
+#define GOMP_MAP_DEEP_COPY		(1 << 5)
 /* Flag to force a specific behavior (or else, trigger a run-time error).  */
 #define GOMP_MAP_FLAG_FORCE		(1 << 7)
 
@@ -128,6 +129,13 @@  enum gomp_map_kind
     /* Decrement usage count and deallocate if zero.  */
     GOMP_MAP_RELEASE =			(GOMP_MAP_FLAG_SPECIAL_2
 					 | GOMP_MAP_DELETE),
+    /* In OpenACC, attach a pointer to a mapped struct field.  */
+    GOMP_MAP_ATTACH =			(GOMP_MAP_DEEP_COPY | 0),
+    /* In OpenACC, detach a pointer to a mapped struct field.  */
+    GOMP_MAP_DETACH =			(GOMP_MAP_DEEP_COPY | 1),
+    /* In OpenACC, detach a pointer to a mapped struct field.  */
+    GOMP_MAP_FORCE_DETACH =		(GOMP_MAP_DEEP_COPY
+					 | GOMP_MAP_FLAG_FORCE | 1),
 
     /* Internal to GCC, not used in libgomp.  */
     /* Do not map, but pointer assign a pointer instead.  */
diff --git a/libgomp/libgomp.h b/libgomp/libgomp.h
index cb25e86..df49c1b 100644
--- a/libgomp/libgomp.h
+++ b/libgomp/libgomp.h
@@ -866,6 +866,8 @@  struct target_var_desc {
   bool copy_from;
   /* True if data always should be copied from device to host at the end.  */
   bool always_copy_from;
+  /* True if variable should be detached at end of region.  */
+  bool do_detach;
   /* Relative offset against key host_start.  */
   uintptr_t offset;
   /* Actual length.  */
@@ -918,8 +920,13 @@  struct splay_tree_key_s {
   uintptr_t tgt_offset;
   /* Reference count.  */
   uintptr_t refcount;
-  /* Dynamic reference count.  */
-  uintptr_t dynamic_refcount;
+  /* Reference counts beyond those that represent genuine references in the
+     linked splay tree key/target memory structures, e.g. for multiple OpenACC
+     "present increment" operations (via "acc enter data") refering to the same
+     host-memory block.  */
+  uintptr_t virtual_refcount;
+  /* For a block with attached pointers, the attachment counters for each.  */
+  unsigned short *attach_count;
   /* Pointer to the original mapping of "omp declare target link" object.  */
   splay_tree_key link_key;
 };
@@ -943,13 +950,6 @@  splay_compare (splay_tree_key x, splay_tree_key y)
 
 typedef struct acc_dispatch_t
 {
-  /* This is a linked list of data mapped using the
-     acc_map_data/acc_unmap_data or "acc enter data"/"acc exit data" pragmas.
-     Unlike mapped_data in the goacc_thread struct, unmapping can
-     happen out-of-order with respect to mapping.  */
-  /* This is guarded by the lock in the "outer" struct gomp_device_descr.  */
-  struct target_mem_desc *data_environ;
-
   /* Execute.  */
   __typeof (GOMP_OFFLOAD_openacc_exec) *exec_func;
 
@@ -1056,13 +1056,17 @@  struct gomp_device_descr
 enum gomp_map_vars_kind
 {
   GOMP_MAP_VARS_OPENACC,
+  GOMP_MAP_VARS_OPENACC_ENTER_DATA,
   GOMP_MAP_VARS_TARGET,
   GOMP_MAP_VARS_DATA,
   GOMP_MAP_VARS_ENTER_DATA
 };
 
-extern void gomp_acc_insert_pointer (size_t, void **, size_t *, void *, int);
-extern void gomp_acc_remove_pointer (void *, size_t, bool, int, int, int);
+struct gomp_coalesce_buf;
+
+extern void gomp_acc_remove_pointer (struct gomp_device_descr *, void **,
+				     size_t *, unsigned short *, int, bool,
+				     int);
 extern void gomp_acc_declare_allocate (bool, size_t, void **, size_t *,
 				       unsigned short *);
 struct gomp_coalesce_buf;
@@ -1072,6 +1076,14 @@  extern void gomp_copy_host2dev (struct gomp_device_descr *,
 extern void gomp_copy_dev2host (struct gomp_device_descr *,
 				struct goacc_asyncqueue *, void *, const void *,
 				size_t);
+extern uintptr_t gomp_map_val (struct target_mem_desc *, void **, size_t);
+extern void gomp_attach_pointer (struct gomp_device_descr *,
+				 struct goacc_asyncqueue *, splay_tree,
+				 splay_tree_key, uintptr_t, size_t,
+				 struct gomp_coalesce_buf *);
+extern void gomp_detach_pointer (struct gomp_device_descr *,
+				 struct goacc_asyncqueue *, splay_tree_key,
+				 uintptr_t, bool, struct gomp_coalesce_buf *);
 
 extern struct target_mem_desc *gomp_map_vars (struct gomp_device_descr *,
 					      size_t, void **, void **,
diff --git a/libgomp/libgomp.map b/libgomp/libgomp.map
index ba9218b..a086dd2 100644
--- a/libgomp/libgomp.map
+++ b/libgomp/libgomp.map
@@ -480,6 +480,16 @@  OACC_2.5 {
 	acc_update_self_async_array_h_;
 } OACC_2.0.1;
 
+OACC_2.6 {
+  global:
+	acc_attach;
+	acc_attach_async;
+	acc_detach;
+	acc_detach_async;
+	acc_detach_finalize;
+	acc_detach_finalize_async;
+} OACC_2.5;
+
 GOACC_2.0 {
   global:
 	GOACC_data_end;
diff --git a/libgomp/oacc-async.c b/libgomp/oacc-async.c
index 68aaf19..077e28f 100644
--- a/libgomp/oacc-async.c
+++ b/libgomp/oacc-async.c
@@ -263,6 +263,24 @@  goacc_async_copyout_unmap_vars (struct target_mem_desc *tgt,
 					      (void *) tgt);
 }
 
+/* Remove a variable asynchronously.  This actually removes the variable
+   mapping immediately, but retains the linked target_mem_desc until the
+   asynchronous operation has completed (as it may still refer to target
+   memory).  The device lock must be held before entry, and remains locked on
+   exit.  */
+
+attribute_hidden void
+goacc_remove_var_async (struct gomp_device_descr *devicep, splay_tree_key n,
+                       struct goacc_asyncqueue *aq)
+{
+  struct target_mem_desc *tgt = n->tgt;
+  assert (tgt);
+  tgt->refcount++;
+  gomp_remove_var (devicep, n);
+  devicep->openacc.async.queue_callback_func (aq, goacc_async_unmap_tgt,
+                                             (void *) tgt);
+}
+
 attribute_hidden void
 goacc_async_free (struct gomp_device_descr *devicep,
 		  struct goacc_asyncqueue *aq, void *ptr)
diff --git a/libgomp/oacc-host.c b/libgomp/oacc-host.c
index 53658c8..6ab77a9 100644
--- a/libgomp/oacc-host.c
+++ b/libgomp/oacc-host.c
@@ -262,8 +262,6 @@  static struct gomp_device_descr host_dispatch =
     .state = GOMP_DEVICE_UNINITIALIZED,
 
     .openacc = {
-      .data_environ = NULL,
-
       .exec_func = host_openacc_exec,
 
       .async = {
diff --git a/libgomp/oacc-init.c b/libgomp/oacc-init.c
index 823e20e..c3b8234 100644
--- a/libgomp/oacc-init.c
+++ b/libgomp/oacc-init.c
@@ -302,9 +302,12 @@  acc_shutdown_1 (acc_device_t d)
 
       if (walk->dev)
 	{
-	  gomp_mutex_lock (&walk->dev->lock);
-	  gomp_free_memmap (&walk->dev->mem_map);
-	  gomp_mutex_unlock (&walk->dev->lock);
+	  while (walk->dev->mem_map.root)
+	    {
+	      splay_tree_key k = &walk->dev->mem_map.root->key;
+	      k->link_key = NULL;
+	      gomp_remove_var (walk->dev, k);
+	    }
 
 	  walk->dev = NULL;
 	  walk->base_dev = NULL;
diff --git a/libgomp/oacc-int.h b/libgomp/oacc-int.h
index 3354eb6..2e4045e 100644
--- a/libgomp/oacc-int.h
+++ b/libgomp/oacc-int.h
@@ -102,10 +102,15 @@  void goacc_restore_bind (void);
 void goacc_lazy_initialize (void);
 void goacc_host_init (void);
 
+struct splay_tree_key_s;
+
 void goacc_init_asyncqueues (struct gomp_device_descr *);
 bool goacc_fini_asyncqueues (struct gomp_device_descr *);
 void goacc_async_copyout_unmap_vars (struct target_mem_desc *,
 				     struct goacc_asyncqueue *);
+void goacc_remove_var_async (struct gomp_device_descr *devicep,
+			     struct splay_tree_key_s *n,
+			     struct goacc_asyncqueue *aq);
 void goacc_async_free (struct gomp_device_descr *,
 		       struct goacc_asyncqueue *, void *);
 struct goacc_asyncqueue *get_goacc_asyncqueue (int);
diff --git a/libgomp/oacc-mem.c b/libgomp/oacc-mem.c
index 050eb0d..466e75b 100644
--- a/libgomp/oacc-mem.c
+++ b/libgomp/oacc-mem.c
@@ -52,6 +52,25 @@  lookup_host (struct gomp_device_descr *dev, void *h, size_t s)
   return key;
 }
 
+/* Helper for lookup_dev.  Iterate over splay tree.  */
+
+static splay_tree_key
+lookup_dev_1 (splay_tree_node node, uintptr_t d, size_t s)
+{
+  splay_tree_key k = &node->key;
+  struct target_mem_desc *t = k->tgt;
+
+  if (d >= t->tgt_start && d + s <= t->tgt_end)
+    return k;
+
+  if (node->left)
+    return lookup_dev_1 (node->left, d, s);
+  if (node->right)
+    return lookup_dev_1 (node->right, d, s);
+
+  return NULL;
+}
+
 /* Return block containing [D->S), or NULL if not contained.
    The list isn't ordered by device address, so we have to iterate
    over the whole array.  This is not expected to be a common
@@ -59,35 +78,12 @@  lookup_host (struct gomp_device_descr *dev, void *h, size_t s)
    remains locked on exit.  */
 
 static splay_tree_key
-lookup_dev (struct target_mem_desc *tgt, void *d, size_t s)
+lookup_dev (splay_tree mem_map, void *d, size_t s)
 {
-  int i;
-  struct target_mem_desc *t;
-
-  if (!tgt)
-    return NULL;
-
-  for (t = tgt; t != NULL; t = t->prev)
-    {
-      if (t->tgt_start <= (uintptr_t) d && t->tgt_end >= (uintptr_t) d + s)
-        break;
-    }
-
-  if (!t)
+  if (!mem_map || !mem_map->root)
     return NULL;
 
-  for (i = 0; i < t->list_count; i++)
-    {
-      void * offset;
-
-      splay_tree_key k = &t->array[i].key;
-      offset = d - t->tgt_start + k->tgt_offset;
-
-      if (k->host_start + offset <= (void *) k->host_end)
-        return k;
-    }
-
-  return NULL;
+  return lookup_dev_1 (mem_map->root, (uintptr_t) d, s);
 }
 
 /* OpenACC is silent on how memory exhaustion is indicated.  We return
@@ -136,7 +132,7 @@  acc_free (void *d)
   /* We don't have to call lazy open here, as the ptr value must have
      been returned by acc_malloc.  It's not permitted to pass NULL in
      (unless you got that null from acc_malloc).  */
-  if ((k = lookup_dev (acc_dev->openacc.data_environ, d, 1)))
+  if ((k = lookup_dev (&acc_dev->mem_map, d, 1)))
     {
       void *offset;
 
@@ -260,7 +256,7 @@  acc_hostptr (void *d)
 
   gomp_mutex_lock (&acc_dev->lock);
 
-  n = lookup_dev (acc_dev->openacc.data_environ, d, 1);
+  n = lookup_dev (&acc_dev->mem_map, d, 1);
 
   if (!n)
     {
@@ -348,7 +344,7 @@  acc_map_data (void *h, void *d, size_t s)
 		      (int)s);
 	}
 
-      if (lookup_dev (thr->dev->openacc.data_environ, d, s))
+      if (lookup_dev (&thr->dev->mem_map, d, s))
         {
 	  gomp_mutex_unlock (&acc_dev->lock);
 	  gomp_fatal ("device address [%p, +%d] is already mapped", (void *)d,
@@ -361,11 +357,6 @@  acc_map_data (void *h, void *d, size_t s)
 			   &kinds, true, GOMP_MAP_VARS_OPENACC);
       tgt->list[0].key->refcount = REFCOUNT_INFINITY;
     }
-
-  gomp_mutex_lock (&acc_dev->lock);
-  tgt->prev = acc_dev->openacc.data_environ;
-  acc_dev->openacc.data_environ = tgt;
-  gomp_mutex_unlock (&acc_dev->lock);
 }
 
 void
@@ -373,6 +364,7 @@  acc_unmap_data (void *h)
 {
   struct goacc_thread *thr = goacc_thread ();
   struct gomp_device_descr *acc_dev = thr->dev;
+  struct splay_tree_key_s cur_node;
 
   /* No need to call lazy open, as the address must have been mapped.  */
 
@@ -380,12 +372,11 @@  acc_unmap_data (void *h)
   if (acc_dev->capabilities & GOMP_OFFLOAD_CAP_SHARED_MEM)
     return;
 
-  size_t host_size;
-
   gomp_mutex_lock (&acc_dev->lock);
 
-  splay_tree_key n = lookup_host (acc_dev, h, 1);
-  struct target_mem_desc *t;
+  cur_node.host_start = (uintptr_t) h;
+  cur_node.host_end = cur_node.host_start + 1;
+  splay_tree_key n = splay_tree_lookup (&acc_dev->mem_map, &cur_node);
 
   if (!n)
     {
@@ -393,46 +384,27 @@  acc_unmap_data (void *h)
       gomp_fatal ("%p is not a mapped block", (void *)h);
     }
 
-  host_size = n->host_end - n->host_start;
-
   if (n->host_start != (uintptr_t) h)
     {
+      size_t host_size = n->host_end - n->host_start;
       gomp_mutex_unlock (&acc_dev->lock);
       gomp_fatal ("[%p,%d] surrounds %p",
 		  (void *) n->host_start, (int) host_size, (void *) h);
     }
 
-  /* Mark for removal.  */
-  n->refcount = 1;
+  splay_tree_remove (&acc_dev->mem_map, n);
 
-  t = n->tgt;
+  struct target_mem_desc *tgt = n->tgt;
 
-  if (t->refcount == 2)
+  if (tgt->refcount > 0)
+    tgt->refcount--;
+  else
     {
-      struct target_mem_desc *tp;
-
-      /* This is the last reference, so pull the descriptor off the
-         chain. This avoids gomp_unmap_vars via gomp_unmap_tgt from
-         freeing the device memory. */
-      t->tgt_end = 0;
-      t->to_free = 0;
-
-      for (tp = NULL, t = acc_dev->openacc.data_environ; t != NULL;
-	   tp = t, t = t->prev)
-	if (n->tgt == t)
-	  {
-	    if (tp)
-	      tp->prev = t->prev;
-	    else
-	      acc_dev->openacc.data_environ = t->prev;
-
-	    break;
-	  }
+      free (tgt->array);
+      free (tgt);
     }
 
   gomp_mutex_unlock (&acc_dev->lock);
-
-  gomp_unmap_vars (t, true);
 }
 
 #define FLAG_PRESENT (1 << 0)
@@ -479,8 +451,9 @@  present_create_copy (unsigned f, void *h, size_t s, int async)
       if (n->refcount != REFCOUNT_INFINITY)
 	{
 	  n->refcount++;
-	  n->dynamic_refcount++;
+	  n->virtual_refcount++;
 	}
+
       gomp_mutex_unlock (&acc_dev->lock);
     }
   else if (!(f & FLAG_CREATE))
@@ -490,7 +463,6 @@  present_create_copy (unsigned f, void *h, size_t s, int async)
     }
   else
     {
-      struct target_mem_desc *tgt;
       size_t mapnum = 1;
       unsigned short kinds;
       void *hostaddrs = h;
@@ -504,18 +476,14 @@  present_create_copy (unsigned f, void *h, size_t s, int async)
 
       goacc_aq aq = get_goacc_asyncqueue (async);
 
-      tgt = gomp_map_vars_async (acc_dev, aq, mapnum, &hostaddrs, NULL, &s,
-				 &kinds, true, GOMP_MAP_VARS_OPENACC);
-
-      /* Initialize dynamic refcount.  */
-      tgt->list[0].key->dynamic_refcount = 1;
+      gomp_map_vars_async (acc_dev, aq, mapnum, &hostaddrs, NULL, &s, &kinds,
+			   true, GOMP_MAP_VARS_OPENACC_ENTER_DATA);
 
       gomp_mutex_lock (&acc_dev->lock);
-
-      d = tgt->to_free;
-      tgt->prev = acc_dev->openacc.data_environ;
-      acc_dev->openacc.data_environ = tgt;
-
+      n = lookup_host (acc_dev, h, s);
+      assert (n != NULL);
+      d = (void *) (n->tgt->tgt_start + n->tgt_offset + (uintptr_t) h
+		    - n->host_start);
       gomp_mutex_unlock (&acc_dev->lock);
     }
 
@@ -590,7 +558,6 @@  delete_copyout (unsigned f, void *h, size_t s, int async, const char *libfnname)
 {
   size_t host_size;
   splay_tree_key n;
-  void *d;
   struct goacc_thread *thr = goacc_thread ();
   struct gomp_device_descr *acc_dev = thr->dev;
 
@@ -610,9 +577,6 @@  delete_copyout (unsigned f, void *h, size_t s, int async, const char *libfnname)
       gomp_fatal ("[%p,%d] is not mapped", (void *)h, (int)s);
     }
 
-  d = (void *) (n->tgt->tgt_start + n->tgt_offset
-		+ (uintptr_t) h - n->host_start);
-
   host_size = n->host_end - n->host_start;
 
   if (n->host_start != (uintptr_t) h || host_size != s)
@@ -625,48 +589,37 @@  delete_copyout (unsigned f, void *h, size_t s, int async, const char *libfnname)
   if (n->refcount == REFCOUNT_INFINITY)
     {
       n->refcount = 0;
-      n->dynamic_refcount = 0;
-    }
-  if (n->refcount < n->dynamic_refcount)
-    {
-      gomp_mutex_unlock (&acc_dev->lock);
-      gomp_fatal ("Dynamic reference counting assert fail\n");
+      n->virtual_refcount = 0;
     }
 
   if (f & FLAG_FINALIZE)
     {
-      n->refcount -= n->dynamic_refcount;
-      n->dynamic_refcount = 0;
+      n->refcount -= n->virtual_refcount;
+      n->virtual_refcount = 0;
     }
-  else if (n->dynamic_refcount)
+
+  if (n->virtual_refcount > 0)
     {
-      n->dynamic_refcount--;
       n->refcount--;
+      n->virtual_refcount--;
     }
+  else if (n->refcount > 0)
+    n->refcount--;
 
   if (n->refcount == 0)
     {
-      if (n->tgt->refcount == 2)
-	{
-	  struct target_mem_desc *tp, *t;
-	  for (tp = NULL, t = acc_dev->openacc.data_environ; t != NULL;
-	       tp = t, t = t->prev)
-	    if (n->tgt == t)
-	      {
-		if (tp)
-		  tp->prev = t->prev;
-		else
-		  acc_dev->openacc.data_environ = t->prev;
-		break;
-	      }
-	}
+      goacc_aq aq = get_goacc_asyncqueue (async);
 
       if (f & FLAG_COPYOUT)
-	{
-	  goacc_aq aq = get_goacc_asyncqueue (async);
+        {
+	  void *d = (void *) (n->tgt->tgt_start + n->tgt_offset
+			      + (uintptr_t) h - n->host_start);
 	  gomp_copy_dev2host (acc_dev, aq, h, d, s);
 	}
-      gomp_remove_var (acc_dev, n);
+      if (aq)
+	goacc_remove_var_async (acc_dev, n, aq);
+      else
+	gomp_remove_var (acc_dev, n);
     }
 
   gomp_mutex_unlock (&acc_dev->lock);
@@ -783,140 +736,159 @@  acc_update_self_async (void *h, size_t s, int async)
 }
 
 void
-gomp_acc_insert_pointer (size_t mapnum, void **hostaddrs, size_t *sizes,
-			 void *kinds, int async)
+gomp_acc_remove_pointer (struct gomp_device_descr *acc_dev, void **hostaddrs,
+			 size_t *sizes, unsigned short *kinds, int async,
+			 bool finalize, int mapnum)
 {
-  struct target_mem_desc *tgt;
-  struct goacc_thread *thr = goacc_thread ();
-  struct gomp_device_descr *acc_dev = thr->dev;
+  struct splay_tree_key_s cur_node;
+  splay_tree_key n;
+
+  gomp_mutex_lock (&acc_dev->lock);
 
-  if (acc_is_present (*hostaddrs, *sizes))
+  for (int i = 0; i < mapnum; i++)
     {
-      splay_tree_key n;
-      gomp_mutex_lock (&acc_dev->lock);
-      n = lookup_host (acc_dev, *hostaddrs, *sizes);
-      gomp_mutex_unlock (&acc_dev->lock);
+      int kind = kinds[i] & 0xff;
+      bool copyfrom = false;
 
-      tgt = n->tgt;
-      for (size_t i = 0; i < tgt->list_count; i++)
-	if (tgt->list[i].key == n)
-	  {
-	    for (size_t j = 0; j < mapnum; j++)
-	      if (i + j < tgt->list_count && tgt->list[i + j].key)
-		{
-		  tgt->list[i + j].key->refcount++;
-		  tgt->list[i + j].key->dynamic_refcount++;
-		}
-	    return;
-	  }
-      /* Should not reach here.  */
-      gomp_fatal ("Dynamic refcount incrementing failed for pointer/pset");
-    }
+      switch (kind)
+        {
+	case GOMP_MAP_FROM:
+	case GOMP_MAP_FORCE_FROM:
+	case GOMP_MAP_ALWAYS_FROM:
+	  copyfrom = true;
+	  /* Fallthrough.  */
+
+	case GOMP_MAP_TO_PSET:
+	case GOMP_MAP_POINTER:
+	case GOMP_MAP_DELETE:
+	case GOMP_MAP_RELEASE:
+	case GOMP_MAP_DETACH:
+	case GOMP_MAP_FORCE_DETACH:
+	  cur_node.host_start = (uintptr_t) hostaddrs[i];
+	  cur_node.host_end = cur_node.host_start
+			      + ((kind == GOMP_MAP_DETACH
+				  || kind == GOMP_MAP_FORCE_DETACH
+				  || kind == GOMP_MAP_POINTER)
+				 ? sizeof (void *) : sizes[i]);
+	  n = splay_tree_lookup (&acc_dev->mem_map, &cur_node);
+
+	  if (n == NULL)
+	    continue;
+
+	  if (n->refcount == REFCOUNT_INFINITY)
+	    {
+	      n->refcount = 1;
+	      n->virtual_refcount = 0;
+	    }
 
-  gomp_debug (0, "  %s: prepare mappings\n", __FUNCTION__);
-  goacc_aq aq = get_goacc_asyncqueue (async);
-  tgt = gomp_map_vars_async (acc_dev, aq, mapnum, hostaddrs,
-			     NULL, sizes, kinds, true, GOMP_MAP_VARS_OPENACC);
-  gomp_debug (0, "  %s: mappings prepared\n", __FUNCTION__);
+	  if (finalize)
+	    {
+	      n->refcount -= n->virtual_refcount;
+	      n->virtual_refcount = 0;
+	    }
 
-  /* Initialize dynamic refcount.  */
-  tgt->list[0].key->dynamic_refcount = 1;
+	  if (n->virtual_refcount > 0)
+	    {
+	      n->refcount--;
+	      n->virtual_refcount--;
+	    }
+	  else if (n->refcount > 0)
+	    n->refcount--;
+
+	  if (copyfrom)
+	    gomp_copy_dev2host (acc_dev, NULL, (void *) cur_node.host_start,
+				(void *) (n->tgt->tgt_start + n->tgt_offset
+					  + cur_node.host_start
+					  - n->host_start),
+				cur_node.host_end - cur_node.host_start);
+
+	  if (n->refcount == 0)
+	    gomp_remove_var (acc_dev, n);
+	  break;
+
+	default:
+	  gomp_mutex_unlock (&acc_dev->lock);
+	  gomp_fatal ("gomp_acc_remove_pointer unhandled kind 0x%.2x",
+		      kind);
+	}
+    }
 
-  gomp_mutex_lock (&acc_dev->lock);
-  tgt->prev = acc_dev->openacc.data_environ;
-  acc_dev->openacc.data_environ = tgt;
   gomp_mutex_unlock (&acc_dev->lock);
 }
 
+
 void
-gomp_acc_remove_pointer (void *h, size_t s, bool force_copyfrom, int async,
-			 int finalize, int mapnum)
+acc_attach_async (void **hostaddr, int async)
 {
   struct goacc_thread *thr = goacc_thread ();
   struct gomp_device_descr *acc_dev = thr->dev;
+  goacc_aq aq = get_goacc_asyncqueue (async);
+
+  struct splay_tree_key_s cur_node;
   splay_tree_key n;
-  struct target_mem_desc *t;
-  int minrefs = (mapnum == 1) ? 2 : 3;
 
-  if (!acc_is_present (h, s))
+  if (thr->dev->capabilities & GOMP_OFFLOAD_CAP_SHARED_MEM)
     return;
 
-  gomp_mutex_lock (&acc_dev->lock);
+  cur_node.host_start = (uintptr_t) hostaddr;
+  cur_node.host_end = cur_node.host_start + sizeof (void *);
+  n = splay_tree_lookup (&acc_dev->mem_map, &cur_node);
 
-  n = lookup_host (acc_dev, h, 1);
+  if (n == NULL)
+    gomp_fatal ("struct not mapped for acc_attach");
 
-  if (!n)
-    {
-      gomp_mutex_unlock (&acc_dev->lock);
-      gomp_fatal ("%p is not a mapped block", (void *)h);
-    }
+  gomp_attach_pointer (acc_dev, aq, &acc_dev->mem_map, n, (uintptr_t) hostaddr,
+		       0, NULL);
+}
 
-  gomp_debug (0, "  %s: restore mappings\n", __FUNCTION__);
+void
+acc_attach (void **hostaddr)
+{
+  acc_attach_async (hostaddr, acc_async_sync);
+}
 
-  t = n->tgt;
+static void
+goacc_detach_internal (void **hostaddr, int async, bool finalize)
+{
+  struct goacc_thread *thr = goacc_thread ();
+  struct gomp_device_descr *acc_dev = thr->dev;
+  struct splay_tree_key_s cur_node;
+  splay_tree_key n;
+  struct goacc_asyncqueue *aq = get_goacc_asyncqueue (async);
 
-  if (n->refcount < n->dynamic_refcount)
-    {
-      gomp_mutex_unlock (&acc_dev->lock);
-      gomp_fatal ("Dynamic reference counting assert fail\n");
-    }
+  if (thr->dev->capabilities & GOMP_OFFLOAD_CAP_SHARED_MEM)
+    return;
 
-  if (finalize)
-    {
-      n->refcount -= n->dynamic_refcount;
-      n->dynamic_refcount = 0;
-    }
-  else if (n->dynamic_refcount)
-    {
-      n->dynamic_refcount--;
-      n->refcount--;
-    }
+  cur_node.host_start = (uintptr_t) hostaddr;
+  cur_node.host_end = cur_node.host_start + sizeof (void *);
+  n = splay_tree_lookup (&acc_dev->mem_map, &cur_node);
 
-  gomp_mutex_unlock (&acc_dev->lock);
+  if (n == NULL)
+    gomp_fatal ("struct not mapped for acc_detach");
 
-  if (n->refcount == 0)
-    {
-      if (t->refcount == minrefs)
-	{
-	  /* This is the last reference, so pull the descriptor off the
-	     chain. This prevents gomp_unmap_vars via gomp_unmap_tgt from
-	     freeing the device memory. */
-	  struct target_mem_desc *tp;
-	  for (tp = NULL, t = acc_dev->openacc.data_environ; t != NULL;
-	       tp = t, t = t->prev)
-	    {
-	      if (n->tgt == t)
-		{
-		  if (tp)
-		    tp->prev = t->prev;
-		  else
-		    acc_dev->openacc.data_environ = t->prev;
-		  break;
-		}
-	    }
-	}
+  gomp_detach_pointer (acc_dev, aq, n, (uintptr_t) hostaddr, finalize, NULL);
+}
 
-      /* Set refcount to 1 to allow gomp_unmap_vars to unmap it.  */
-      n->refcount = 1;
-      t->refcount = minrefs;
-      for (size_t i = 0; i < t->list_count; i++)
-	if (t->list[i].key == n)
-	  {
-	    t->list[i].copy_from = force_copyfrom ? 1 : 0;
-	    break;
-	  }
-
-      /* If running synchronously, unmap immediately.  */
-      if (async < acc_async_noval)
-	gomp_unmap_vars (t, true);
-      else
-	{
-	  goacc_aq aq = get_goacc_asyncqueue (async);
-	  goacc_async_copyout_unmap_vars (t, aq);
-	}
-    }
+void
+acc_detach (void **hostaddr)
+{
+  goacc_detach_internal (hostaddr, acc_async_sync, false);
+}
 
-  gomp_mutex_unlock (&acc_dev->lock);
+void
+acc_detach_async (void **hostaddr, int async)
+{
+  goacc_detach_internal (hostaddr, async, false);
+}
 
-  gomp_debug (0, "  %s: mappings restored\n", __FUNCTION__);
+void
+acc_detach_finalize (void **hostaddr)
+{
+  goacc_detach_internal (hostaddr, acc_async_sync, true);
+}
+
+void
+acc_detach_finalize_async (void **hostaddr, int async)
+{
+  goacc_detach_internal (hostaddr, async, true);
 }
diff --git a/libgomp/oacc-parallel.c b/libgomp/oacc-parallel.c
index 07d0338..0e9a3e8 100644
--- a/libgomp/oacc-parallel.c
+++ b/libgomp/oacc-parallel.c
@@ -47,12 +47,32 @@  find_pointer (int pos, size_t mapnum, unsigned short *kinds)
   if (pos + 1 >= mapnum)
     return 0;
 
-  unsigned char kind = kinds[pos+1] & 0xff;
+  unsigned char kind0 = kinds[pos] & 0xff;
 
-  if (kind == GOMP_MAP_TO_PSET)
-    return 3;
-  else if (kind == GOMP_MAP_POINTER)
-    return 2;
+  switch (kind0)
+    {
+    case GOMP_MAP_TO:
+    case GOMP_MAP_FORCE_TO:
+    case GOMP_MAP_FROM:
+    case GOMP_MAP_FORCE_FROM:
+    case GOMP_MAP_TOFROM:
+    case GOMP_MAP_FORCE_TOFROM:
+    case GOMP_MAP_ALLOC:
+    case GOMP_MAP_RELEASE:
+      {
+	unsigned char kind1 = kinds[pos + 1] & 0xff;
+	if (kind1 == GOMP_MAP_POINTER
+	    || kind1 == GOMP_MAP_ALWAYS_POINTER
+	    || kind1 == GOMP_MAP_ATTACH
+	    || kind1 == GOMP_MAP_DETACH
+	    || kind1 == GOMP_MAP_FORCE_DETACH)
+	  return 2;
+	else if (kind1 == GOMP_MAP_TO_PSET)
+	  return 3;
+      }
+    default:
+      /* empty.  */;
+    }
 
   return 0;
 }
@@ -235,8 +255,8 @@  GOACC_parallel_keyed (int device, void (*fn) (void *),
 
   devaddrs = gomp_alloca (sizeof (void *) * mapnum);
   for (i = 0; i < mapnum; i++)
-    devaddrs[i] = (void *) (tgt->list[i].key->tgt->tgt_start
-			    + tgt->list[i].key->tgt_offset);
+    devaddrs[i] = (void *) gomp_map_val (tgt, hostaddrs, i);
+
   if (aq == NULL)
     {
       acc_dev->openacc.exec_func (tgt_fn, mapnum, hostaddrs, devaddrs,
@@ -355,6 +375,10 @@  GOACC_enter_exit_data (int device, size_t mapnum,
   if (mapnum > 0)
     {
       unsigned char kind = kinds[0] & 0xff;
+
+      if (kind == GOMP_MAP_STRUCT || kind == GOMP_MAP_FORCE_PRESENT)
+        kind = kinds[1] & 0xff;
+
       if (kind == GOMP_MAP_DELETE
 	  || kind == GOMP_MAP_FORCE_FROM)
 	finalize = true;
@@ -365,11 +389,14 @@  GOACC_enter_exit_data (int device, size_t mapnum,
     {
       unsigned char kind = kinds[i] & 0xff;
 
-      if (kind == GOMP_MAP_POINTER || kind == GOMP_MAP_TO_PSET)
+      if (kind == GOMP_MAP_POINTER
+	  || kind == GOMP_MAP_TO_PSET
+	  || kind == GOMP_MAP_STRUCT
+	  || kind == GOMP_MAP_FORCE_PRESENT)
 	continue;
 
       if (kind == GOMP_MAP_FORCE_ALLOC
-	  || kind == GOMP_MAP_FORCE_PRESENT
+	  || kind == GOMP_MAP_ATTACH
 	  || kind == GOMP_MAP_FORCE_TO
 	  || kind == GOMP_MAP_TO
 	  || kind == GOMP_MAP_ALLOC)
@@ -380,6 +407,8 @@  GOACC_enter_exit_data (int device, size_t mapnum,
 
       if (kind == GOMP_MAP_RELEASE
 	  || kind == GOMP_MAP_DELETE
+	  || kind == GOMP_MAP_DETACH
+	  || kind == GOMP_MAP_FORCE_DETACH
 	  || kind == GOMP_MAP_FROM
 	  || kind == GOMP_MAP_FORCE_FROM)
 	break;
@@ -413,6 +442,9 @@  GOACC_enter_exit_data (int device, size_t mapnum,
 		case GOMP_MAP_ALLOC:
 		  acc_present_or_create (hostaddrs[i], sizes[i]);
 		  break;
+		case GOMP_MAP_ATTACH:
+		case GOMP_MAP_FORCE_PRESENT:
+		  break;
 		case GOMP_MAP_FORCE_ALLOC:
 		  acc_create (hostaddrs[i], sizes[i]);
 		  break;
@@ -422,6 +454,16 @@  GOACC_enter_exit_data (int device, size_t mapnum,
 		case GOMP_MAP_FORCE_TO:
 		  acc_copyin (hostaddrs[i], sizes[i]);
 		  break;
+		case GOMP_MAP_STRUCT:
+		  {
+		    int elems = sizes[i];
+		    goacc_aq aq = get_goacc_asyncqueue (async);
+		    gomp_map_vars_async (acc_dev, aq, elems + 1, &hostaddrs[i],
+					 NULL, &sizes[i], &kinds[i], true,
+					 GOMP_MAP_VARS_OPENACC_ENTER_DATA);
+		    i += elems;
+		  }
+		  break;
 		default:
 		  gomp_fatal (">>>> GOACC_enter_exit_data UNHANDLED kind 0x%.2x",
 			      kind);
@@ -430,8 +472,14 @@  GOACC_enter_exit_data (int device, size_t mapnum,
 	    }
 	  else
 	    {
-	      gomp_acc_insert_pointer (pointer, &hostaddrs[i],
-				       &sizes[i], &kinds[i], async);
+	      goacc_aq aq = get_goacc_asyncqueue (async);
+	      for (int j = 0; j < 2; j++)
+		gomp_map_vars_async (acc_dev, aq,
+				     (j == 0 || pointer == 2) ? 1 : 2,
+				     &hostaddrs[i + j], NULL,
+				     &sizes[i + j], &kinds[i + j], true,
+				     GOMP_MAP_VARS_OPENACC_ENTER_DATA);
+
 	      /* Increment 'i' by two because OpenACC requires fortran
 		 arrays to be contiguous, so each PSET is associated with
 		 one of MAP_FORCE_ALLOC/MAP_FORCE_PRESET/MAP_FORCE_TO, and
@@ -439,51 +487,143 @@  GOACC_enter_exit_data (int device, size_t mapnum,
 	      i += pointer - 1;
 	    }
 	}
+
+      /* This loop only handles explicit "attach" clauses that are not an
+	 implicit part of a copy{,in,out}, etc. mapping.  */
+      for (i = 0; i < mapnum; i++)
+        {
+	  unsigned char kind = kinds[i] & 0xff;
+
+	  /* Scan for pointers and PSETs.  */
+	  int pointer = find_pointer (i, mapnum, kinds);
+
+	  if (!pointer)
+	    {
+	      if (kind == GOMP_MAP_ATTACH)
+		acc_attach (hostaddrs[i]);
+	      else if (kind == GOMP_MAP_STRUCT)
+	        i += sizes[i];
+	    }
+	  else
+	    i += pointer - 1;
+	}
     }
   else
-    for (i = 0; i < mapnum; ++i)
-      {
-	unsigned char kind = kinds[i] & 0xff;
+    {
+      /* Handle "detach" before copyback/deletion of mapped data.  */
+      for (i = 0; i < mapnum; i++)
+        {
+	  unsigned char kind = kinds[i] & 0xff;
 
-	int pointer = find_pointer (i, mapnum, kinds);
+	  int pointer = find_pointer (i, mapnum, kinds);
 
-	if (!pointer)
-	  {
-	    switch (kind)
-	      {
-	      case GOMP_MAP_RELEASE:
-	      case GOMP_MAP_DELETE:
-		if (acc_is_present (hostaddrs[i], sizes[i]))
+	  if (!pointer)
+	    {
+	      if (kind == GOMP_MAP_DETACH)
+		acc_detach (hostaddrs[i]);
+	      else if (kind == GOMP_MAP_FORCE_DETACH)
+		acc_detach_finalize (hostaddrs[i]);
+	      else if (kind == GOMP_MAP_STRUCT)
+	        i += sizes[i];
+	    }
+	  else
+	    {
+	      unsigned char kind2 = kinds[i + pointer - 1] & 0xff;
+
+	      if (kind2 == GOMP_MAP_DETACH)
+		acc_detach (hostaddrs[i + pointer - 1]);
+	      else if (kind2 == GOMP_MAP_FORCE_DETACH)
+	        acc_detach_finalize (hostaddrs[i + pointer - 1]);
+
+	      i += pointer - 1;
+	    }
+	}
+
+      for (i = 0; i < mapnum; ++i)
+	{
+	  unsigned char kind = kinds[i] & 0xff;
+
+	  int pointer = find_pointer (i, mapnum, kinds);
+
+	  if (!pointer)
+	    {
+	      switch (kind)
+		{
+		case GOMP_MAP_RELEASE:
+		case GOMP_MAP_DELETE:
+		  if (acc_is_present (hostaddrs[i], sizes[i]))
+		    {
+		      if (finalize)
+			acc_delete_finalize_async (hostaddrs[i], sizes[i],
+						   async);
+		      else
+			acc_delete_async (hostaddrs[i], sizes[i], async);
+		    }
+		  break;
+		case GOMP_MAP_DETACH:
+		case GOMP_MAP_FORCE_DETACH:
+		case GOMP_MAP_FORCE_PRESENT:
+		  break;
+		case GOMP_MAP_FROM:
+		case GOMP_MAP_FORCE_FROM:
+		  if (finalize)
+		    acc_copyout_finalize_async (hostaddrs[i], sizes[i], async);
+		  else
+		    acc_copyout_async (hostaddrs[i], sizes[i], async);
+		  break;
+		case GOMP_MAP_STRUCT:
 		  {
-		    if (finalize)
-		      acc_delete_finalize_async (hostaddrs[i], sizes[i], async);
-		    else
-		      acc_delete_async (hostaddrs[i], sizes[i], async);
+		    int elems = sizes[i];
+		    goacc_aq aq = get_goacc_asyncqueue (async);
+		    for (int j = 1; j <= elems; j++)
+		      {
+			struct splay_tree_key_s k;
+			k.host_start = (uintptr_t) hostaddrs[i + j];
+			k.host_end = k.host_start + sizes[i + j];
+			splay_tree_key str;
+			gomp_mutex_lock (&acc_dev->lock);
+			str = splay_tree_lookup (&acc_dev->mem_map, &k);
+			gomp_mutex_unlock (&acc_dev->lock);
+			if (str)
+		          {
+			    if (finalize)
+			      {
+				str->refcount -= str->virtual_refcount;
+				str->virtual_refcount = 0;
+			      }
+			    if (str->virtual_refcount > 0)
+			      {
+				str->refcount--;
+				str->virtual_refcount--;
+			      }
+			    else if (str->refcount > 0)
+			      str->refcount--;
+			    if (str->refcount == 0)
+			      {
+				if (aq)
+				  goacc_remove_var_async (acc_dev, str, aq);
+				else
+				  gomp_remove_var (acc_dev, str);
+			      }
+			  }
+		      }
+		    i += elems;
 		  }
-		break;
-	      case GOMP_MAP_FROM:
-	      case GOMP_MAP_FORCE_FROM:
-		if (finalize)
-		  acc_copyout_finalize_async (hostaddrs[i], sizes[i], async);
-		else
-		  acc_copyout_async (hostaddrs[i], sizes[i], async);
-		break;
-	      default:
-		gomp_fatal (">>>> GOACC_enter_exit_data UNHANDLED kind 0x%.2x",
-			    kind);
-		break;
-	      }
-	  }
-	else
-	  {
-	    bool copyfrom = (kind == GOMP_MAP_FORCE_FROM
-			     || kind == GOMP_MAP_FROM);
-	    gomp_acc_remove_pointer (hostaddrs[i], sizes[i], copyfrom, async,
-				     finalize, pointer);
-	    /* See the above comment.  */
-	    i += pointer - 1;
-	  }
-      }
+		  break;
+		default:
+		  gomp_fatal (">>>> GOACC_enter_exit_data UNHANDLED kind 0x%.2x",
+			      kind);
+		  break;
+		}
+	    }
+	  else
+	    {
+	      gomp_acc_remove_pointer (acc_dev, &hostaddrs[i], &sizes[i],
+				       &kinds[i], async, finalize, pointer);
+	      i += pointer - 1;
+	    }
+	}
+    }
 }
 
 static void
diff --git a/libgomp/openacc.h b/libgomp/openacc.h
index 2505ac0..1bf2d65 100644
--- a/libgomp/openacc.h
+++ b/libgomp/openacc.h
@@ -113,12 +113,18 @@  void *acc_hostptr (void *) __GOACC_NOTHROW;
 int acc_is_present (void *, size_t) __GOACC_NOTHROW;
 void acc_memcpy_to_device (void *, void *, size_t) __GOACC_NOTHROW;
 void acc_memcpy_from_device (void *, void *, size_t) __GOACC_NOTHROW;
+void acc_attach (void **) __GOACC_NOTHROW;
+void acc_attach_async (void **, int) __GOACC_NOTHROW;
+void acc_detach (void **) __GOACC_NOTHROW;
+void acc_detach_async (void **, int) __GOACC_NOTHROW;
 
 /* Finalize versions of copyout/delete functions, specified in OpenACC 2.5.  */
 void acc_copyout_finalize (void *, size_t) __GOACC_NOTHROW;
 void acc_copyout_finalize_async (void *, size_t, int) __GOACC_NOTHROW;
 void acc_delete_finalize (void *, size_t) __GOACC_NOTHROW;
 void acc_delete_finalize_async (void *, size_t, int) __GOACC_NOTHROW;
+void acc_detach_finalize (void **) __GOACC_NOTHROW;
+void acc_detach_finalize_async (void **, int) __GOACC_NOTHROW;
 
 /* Async functions, specified in OpenACC 2.5.  */
 void acc_copyin_async (void *, size_t, int) __GOACC_NOTHROW;
diff --git a/libgomp/target.c b/libgomp/target.c
index 2bfc7e2..6e115d1 100644
--- a/libgomp/target.c
+++ b/libgomp/target.c
@@ -39,6 +39,7 @@ 
 #include <string.h>
 #include <assert.h>
 #include <errno.h>
+#include <limits.h>
 
 #ifdef PLUGIN_SUPPORT
 #include <dlfcn.h>
@@ -372,6 +373,7 @@  gomp_map_vars_existing (struct gomp_device_descr *devicep,
   tgt_var->key = oldn;
   tgt_var->copy_from = GOMP_MAP_COPY_FROM_P (kind);
   tgt_var->always_copy_from = GOMP_MAP_ALWAYS_FROM_P (kind);
+  tgt_var->do_detach = kind == GOMP_MAP_ATTACH;
   tgt_var->offset = newn->host_start - oldn->host_start;
   tgt_var->length = newn->host_end - newn->host_start;
 
@@ -505,7 +507,131 @@  gomp_map_fields_existing (struct target_mem_desc *tgt,
 	      (void *) cur_node.host_end);
 }
 
-static inline uintptr_t
+void
+gomp_attach_pointer (struct gomp_device_descr *devicep,
+		     struct goacc_asyncqueue *aq, splay_tree mem_map,
+		     splay_tree_key n, uintptr_t attach_to, size_t bias,
+		     struct gomp_coalesce_buf *cbufp)
+{
+  struct splay_tree_key_s s;
+  size_t size, idx;
+
+  if (n == NULL)
+    {
+      gomp_mutex_unlock (&devicep->lock);
+      gomp_fatal ("enclosing struct not mapped for attach");
+    }
+
+  size = (n->host_end - n->host_start + sizeof (void *) - 1) / sizeof (void *);
+  /* We might have a pointer in a packed struct: however we cannot have more
+     than one such pointer in each pointer-sized portion of the struct, so
+     this is safe.  */
+  idx = (attach_to - n->host_start) / sizeof (void *);
+
+  if (!n->attach_count)
+    n->attach_count = gomp_malloc_cleared (sizeof (*n->attach_count) * size);
+
+  if (n->attach_count[idx] < USHRT_MAX)
+    n->attach_count[idx]++;
+  else
+    {
+      gomp_mutex_unlock (&devicep->lock);
+      gomp_fatal ("attach count overflow");
+    }
+
+  if (n->attach_count[idx] == 1)
+    {
+      uintptr_t devptr = n->tgt->tgt_start + n->tgt_offset + attach_to
+			 - n->host_start;
+      uintptr_t target = (uintptr_t) *(void **) attach_to;
+      splay_tree_key tn;
+      uintptr_t data;
+
+      if ((void *) target == NULL)
+	{
+	  gomp_mutex_unlock (&devicep->lock);
+	  gomp_fatal ("attempt to attach null pointer");
+	}
+
+      s.host_start = target + bias;
+      s.host_end = s.host_start + 1;
+      tn = splay_tree_lookup (mem_map, &s);
+
+      if (!tn)
+	{
+	  gomp_mutex_unlock (&devicep->lock);
+	  gomp_fatal ("pointer target not mapped for attach");
+	}
+
+      data = tn->tgt->tgt_start + tn->tgt_offset + target - tn->host_start;
+
+      gomp_debug (1,
+		  "%s: attaching host %p, target %p (struct base %p) to %p\n",
+		  __FUNCTION__, (void *) attach_to, (void *) devptr,
+		  (void *) (n->tgt->tgt_start + n->tgt_offset), (void *) data);
+
+      gomp_copy_host2dev (devicep, aq, (void *) devptr, (void *) &data,
+			  sizeof (void *), cbufp);
+    }
+  else
+    gomp_debug (1, "%s: attach count for %p -> %u\n", __FUNCTION__,
+		(void *) attach_to, n->attach_count[idx]);
+}
+
+void
+gomp_detach_pointer (struct gomp_device_descr *devicep,
+		     struct goacc_asyncqueue *aq, splay_tree_key n,
+		     uintptr_t detach_from, bool finalize,
+		     struct gomp_coalesce_buf *cbufp)
+{
+  size_t idx;
+
+  if (n == NULL)
+    {
+      gomp_mutex_unlock (&devicep->lock);
+      gomp_fatal ("enclosing struct not mapped for detach");
+    }
+
+  idx = (detach_from - n->host_start) / sizeof (void *);
+
+  if (!n->attach_count)
+    {
+      gomp_mutex_unlock (&devicep->lock);
+      gomp_fatal ("no attachment counters for struct");
+    }
+
+  if (finalize)
+    n->attach_count[idx] = 1;
+
+  if (n->attach_count[idx] == 0)
+    {
+      gomp_mutex_unlock (&devicep->lock);
+      gomp_fatal ("attach count underflow");
+    }
+  else
+    n->attach_count[idx]--;
+
+  if (n->attach_count[idx] == 0)
+    {
+      uintptr_t devptr = n->tgt->tgt_start + n->tgt_offset + detach_from
+			 - n->host_start;
+      uintptr_t target = (uintptr_t) *(void **) detach_from;
+
+      gomp_debug (1,
+		  "%s: detaching host %p, target %p (struct base %p) to %p\n",
+		  __FUNCTION__, (void *) detach_from, (void *) devptr,
+		  (void *) (n->tgt->tgt_start + n->tgt_offset),
+		  (void *) target);
+
+      gomp_copy_host2dev (devicep, aq, (void *) devptr, (void *) &target,
+			  sizeof (void *), cbufp);
+    }
+  else
+    gomp_debug (1, "%s: attach count for %p -> %u\n", __FUNCTION__,
+		(void *) detach_from, n->attach_count[idx]);
+}
+
+uintptr_t
 gomp_map_val (struct target_mem_desc *tgt, void **hostaddrs, size_t i)
 {
   if (tgt->list[i].key != NULL)
@@ -547,8 +673,9 @@  gomp_map_vars (struct gomp_device_descr *devicep, size_t mapnum,
 attribute_hidden struct target_mem_desc *
 gomp_map_vars_async (struct gomp_device_descr *devicep,
 		     struct goacc_asyncqueue *aq, size_t mapnum,
-		     void **hostaddrs, void **devaddrs, size_t *sizes, void *kinds,
-		     bool short_mapkind, enum gomp_map_vars_kind pragma_kind)
+		     void **hostaddrs, void **devaddrs, size_t *sizes,
+		     void *kinds, bool short_mapkind,
+		     enum gomp_map_vars_kind pragma_kind)
 {
   size_t i, tgt_align, tgt_size, not_found_cnt = 0;
   bool has_firstprivate = false;
@@ -559,7 +686,8 @@  gomp_map_vars_async (struct gomp_device_descr *devicep,
   struct target_mem_desc *tgt
     = gomp_malloc (sizeof (*tgt) + sizeof (tgt->list[0]) * mapnum);
   tgt->list_count = mapnum;
-  tgt->refcount = pragma_kind == GOMP_MAP_VARS_ENTER_DATA ? 0 : 1;
+  tgt->refcount = (pragma_kind == GOMP_MAP_VARS_ENTER_DATA
+		   || pragma_kind == GOMP_MAP_VARS_OPENACC_ENTER_DATA) ? 0 : 1;
   tgt->device_descr = devicep;
   struct gomp_coalesce_buf cbuf, *cbufp = NULL;
 
@@ -674,8 +802,15 @@  gomp_map_vars_async (struct gomp_device_descr *devicep,
 	  has_firstprivate = true;
 	  continue;
 	}
+      else if ((kind & typemask) == GOMP_MAP_ATTACH)
+	{
+	  tgt->list[i].key = NULL;
+	  has_firstprivate = true;
+	  continue;
+	}
       cur_node.host_start = (uintptr_t) hostaddrs[i];
-      if (!GOMP_MAP_POINTER_P (kind & typemask))
+      if (!GOMP_MAP_POINTER_P (kind & typemask)
+          && (kind & typemask) != GOMP_MAP_ATTACH)
 	cur_node.host_end = cur_node.host_start + sizes[i];
       else
 	cur_node.host_end = cur_node.host_start + sizeof (void *);
@@ -882,6 +1017,32 @@  gomp_map_vars_async (struct gomp_device_descr *devicep,
 		cur_node.tgt_offset = n->tgt->tgt_start + n->tgt_offset
 				      + cur_node.host_start - n->host_start;
 		continue;
+	      case GOMP_MAP_ATTACH:
+		{
+		  cur_node.host_start = (uintptr_t) hostaddrs[i];
+		  cur_node.host_end = cur_node.host_start + sizeof (void *);
+		  splay_tree_key n = splay_tree_lookup (mem_map, &cur_node);
+		  if (n != NULL)
+		    {
+		      tgt->list[i].key = n;
+		      tgt->list[i].offset = cur_node.host_start - n->host_start;
+		      tgt->list[i].length = n->host_end - n->host_start;
+		      tgt->list[i].copy_from = false;
+		      tgt->list[i].always_copy_from = false;
+		      tgt->list[i].do_detach
+		        = (pragma_kind != GOMP_MAP_VARS_OPENACC_ENTER_DATA);
+		      n->refcount++;
+		    }
+		  else
+		    {
+		      gomp_mutex_unlock (&devicep->lock);
+		      gomp_fatal ("outer struct not mapped for attach");
+		    }
+		  gomp_attach_pointer (devicep, aq, mem_map, n,
+				       (uintptr_t) hostaddrs[i], sizes[i],
+				       cbufp);
+		  continue;
+		}
 	      default:
 		break;
 	      }
@@ -926,10 +1087,12 @@  gomp_map_vars_async (struct gomp_device_descr *devicep,
 		tgt->list[i].copy_from = GOMP_MAP_COPY_FROM_P (kind & typemask);
 		tgt->list[i].always_copy_from
 		  = GOMP_MAP_ALWAYS_FROM_P (kind & typemask);
+		tgt->list[i].do_detach = false;
 		tgt->list[i].offset = 0;
 		tgt->list[i].length = k->host_end - k->host_start;
 		k->refcount = 1;
-		k->dynamic_refcount = 0;
+		k->virtual_refcount = 0;
+		k->attach_count = NULL;
 		tgt->refcount++;
 		array->left = NULL;
 		array->right = NULL;
@@ -980,6 +1143,7 @@  gomp_map_vars_async (struct gomp_device_descr *devicep,
 			  tgt->list[j].key = k;
 			  tgt->list[j].copy_from = false;
 			  tgt->list[j].always_copy_from = false;
+			  tgt->list[j].do_detach = false;
 			  if (k->refcount != REFCOUNT_INFINITY)
 			    k->refcount++;
 			  gomp_map_pointer (tgt, aq,
@@ -1063,8 +1227,20 @@  gomp_map_vars_async (struct gomp_device_descr *devicep,
   /* If the variable from "omp target enter data" map-list was already mapped,
      tgt is not needed.  Otherwise tgt will be freed by gomp_unmap_vars or
      gomp_exit_data.  */
-  if (pragma_kind == GOMP_MAP_VARS_ENTER_DATA && tgt->refcount == 0)
-    {
+  if ((pragma_kind == GOMP_MAP_VARS_ENTER_DATA
+       || pragma_kind == GOMP_MAP_VARS_OPENACC_ENTER_DATA)
+      && tgt->refcount == 0)
+    {
+      /* If we're about to discard a target_mem_desc with no "structural"
+	 references (tgt->refcount == 0), any splay keys linked in the tgt's
+	 list must have their virtual refcount incremented to represent that
+	 "lost" reference in order to implement the semantics of the OpenACC
+	 "present increment" operation properly.  */
+      if (pragma_kind == GOMP_MAP_VARS_OPENACC_ENTER_DATA)
+	for (i = 0; i < tgt->list_count; i++)
+	  if (tgt->list[i].key)
+	    tgt->list[i].key->virtual_refcount++;
+
       free (tgt);
       tgt = NULL;
     }
@@ -1091,6 +1267,8 @@  gomp_remove_var (struct gomp_device_descr *devicep, splay_tree_key k)
   splay_tree_remove (&devicep->mem_map, k);
   if (k->link_key)
     splay_tree_insert (&devicep->mem_map, (splay_tree_node) k->link_key);
+  if (k->attach_count)
+    free (k->attach_count);
   if (k->tgt->refcount > 1)
     k->tgt->refcount--;
   else
@@ -1133,14 +1311,34 @@  gomp_unmap_vars_async (struct target_mem_desc *tgt, bool do_copyfrom,
     }
 
   size_t i;
+
+  /* We must perform detachments before any copies back to the host.  */
+  for (i = 0; i < tgt->list_count; i++)
+    {
+      splay_tree_key k = tgt->list[i].key;
+
+      if (k != NULL && tgt->list[i].do_detach)
+	gomp_detach_pointer (devicep, aq, k, tgt->list[i].key->host_start
+					     + tgt->list[i].offset,
+			     k->refcount == 1, NULL);
+    }
+
   for (i = 0; i < tgt->list_count; i++)
     {
       splay_tree_key k = tgt->list[i].key;
+
       if (k == NULL)
 	continue;
 
       bool do_unmap = false;
-      if (k->refcount > 1 && k->refcount != REFCOUNT_INFINITY)
+      if (k->tgt == tgt
+	  && k->virtual_refcount > 0
+	  && k->refcount != REFCOUNT_INFINITY)
+	{
+	  k->virtual_refcount--;
+	  k->refcount--;
+	}
+      else if (k->refcount > 1 && k->refcount != REFCOUNT_INFINITY)
 	k->refcount--;
       else if (k->refcount == 1)
 	{
@@ -1283,6 +1481,7 @@  gomp_load_image_to_device (struct gomp_device_descr *devicep, unsigned version,
       k->tgt = tgt;
       k->tgt_offset = target_table[i].start;
       k->refcount = REFCOUNT_INFINITY;
+      k->virtual_refcount = 0;
       k->link_key = NULL;
       array->left = NULL;
       array->right = NULL;
@@ -1315,6 +1514,7 @@  gomp_load_image_to_device (struct gomp_device_descr *devicep, unsigned version,
       k->tgt = tgt;
       k->tgt_offset = target_var->start;
       k->refcount = target_size & link_bit ? REFCOUNT_LINK : REFCOUNT_INFINITY;
+      k->virtual_refcount = 0;
       k->link_key = NULL;
       array->left = NULL;
       array->right = NULL;
@@ -1549,22 +1749,6 @@  gomp_unload_device (struct gomp_device_descr *devicep)
     }
 }
 
-/* Free address mapping tables.  MM must be locked on entry, and remains locked
-   on return.  */
-
-attribute_hidden void
-gomp_free_memmap (struct splay_tree_s *mem_map)
-{
-  while (mem_map->root)
-    {
-      struct target_mem_desc *tgt = mem_map->root->key.tgt;
-
-      splay_tree_remove (mem_map, &mem_map->root->key);
-      free (tgt->array);
-      free (tgt);
-    }
-}
-
 /* Do we have offload data available for the given offload target type?
    Instead of verifying that *all* offload data is available that could
    possibly be required, we instead just look for *any*.  If we later find any
@@ -2631,6 +2815,9 @@  omp_target_associate_ptr (const void *host_ptr, const void *device_ptr,
       k->tgt = tgt;
       k->tgt_offset = (uintptr_t) device_ptr + device_offset;
       k->refcount = REFCOUNT_INFINITY;
+      k->virtual_refcount = 0;
+      k->attach_count = NULL;
+      k->link_key = NULL;
       array->left = NULL;
       array->right = NULL;
       splay_tree_insert (&devicep->mem_map, array);
@@ -3087,7 +3274,6 @@  gomp_target_init (void)
 		current_device.type = current_device.get_type_func ();
 		current_device.mem_map.root = NULL;
 		current_device.state = GOMP_DEVICE_UNINITIALIZED;
-		current_device.openacc.data_environ = NULL;
 
 		/* Augment DEVICES and NUM_DEVICES.  */
 		devices = gomp_realloc (devices,
diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/context-2.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/context-2.c
index 6a52f74..6bdcfe7 100644
--- a/libgomp/testsuite/libgomp.oacc-c-c++-common/context-2.c
+++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/context-2.c
@@ -182,13 +182,13 @@  main (int argc, char **argv)
         exit (EXIT_FAILURE);
     }
 
+    acc_delete (&h_X[0], N * sizeof (float));
+    acc_delete (&h_Y1[0], N * sizeof (float));
+
     free (h_X);
     free (h_Y1);
     free (h_Y2);
 
-    acc_free (d_X);
-    acc_free (d_Y);
-
     context_check (pctx);
 
     s = cublasDestroy (h);
diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/context-4.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/context-4.c
index 71365e8..b403a5c 100644
--- a/libgomp/testsuite/libgomp.oacc-c-c++-common/context-4.c
+++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/context-4.c
@@ -176,13 +176,13 @@  main (int argc, char **argv)
         exit (EXIT_FAILURE);
     }
 
+    acc_delete (&h_X[0], N * sizeof (float));
+    acc_delete (&h_Y1[0], N * sizeof (float));
+
     free (h_X);
     free (h_Y1);
     free (h_Y2);
 
-    acc_free (d_X);
-    acc_free (d_Y);
-
     context_check (pctx);
 
     s = cublasDestroy (h);
diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/deep-copy-1.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/deep-copy-1.c
new file mode 100644
index 0000000..d8d7067
--- /dev/null
+++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/deep-copy-1.c
@@ -0,0 +1,24 @@ 
+#include <stdlib.h>
+#include <assert.h>
+
+struct dc
+{
+  int a;
+  int *b;
+};
+
+int
+main ()
+{
+  int n = 100, i;
+  struct dc v = { .a = 3, .b = (int *) malloc (sizeof (int) * n) };
+
+#pragma acc parallel loop copy(v.a, v.b[:n])
+  for (i = 0; i < n; i++)
+    v.b[i] = v.a;
+
+  for (i = 0; i < 10; i++)
+    assert (v.b[i] == v.a);
+
+  return 0;
+}
diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/deep-copy-2.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/deep-copy-2.c
new file mode 100644
index 0000000..7e26e9a
--- /dev/null
+++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/deep-copy-2.c
@@ -0,0 +1,29 @@ 
+#include <assert.h>
+#include <stdlib.h>
+
+int
+main(int argc, char* argv[])
+{
+  struct foo {
+    int *a, *b, c, d, *e;
+  } s;
+
+  s.a = (int *) malloc (16 * sizeof (int));
+  s.b = (int *) malloc (16 * sizeof (int));
+  s.e = (int *) malloc (16 * sizeof (int));
+
+  #pragma acc data copy(s)
+  {
+    #pragma acc data copy(s.a[0:10])
+    {
+      #pragma acc parallel loop attach(s.a)
+      for (int i = 0; i < 10; i++)
+	s.a[i] = i;
+    }
+  }
+
+  for (int i = 0; i < 10; i++)
+    assert (s.a[i] == i);
+
+  return 0;
+}
diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/deep-copy-3.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/deep-copy-3.c
new file mode 100644
index 0000000..cec764b
--- /dev/null
+++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/deep-copy-3.c
@@ -0,0 +1,34 @@ 
+#include <assert.h>
+#include <stdlib.h>
+#include <openacc.h>
+
+int
+main ()
+{
+  int n = 100, i;
+  int *a = (int *) malloc (sizeof (int) * n);
+  int *b;
+
+  for (i = 0; i < n; i++)
+    a[i] = i+1;
+
+#pragma acc enter data copyin(a[:n]) create(b)
+
+  b = a;
+  acc_attach ((void **)&b);
+
+#pragma acc parallel loop present (b[:n])
+  for (i = 0; i < n; i++)
+    b[i] = i+1;
+
+  acc_detach ((void **)&b);
+
+#pragma acc exit data copyout(a[:n], b)
+
+  for (i = 0; i < 10; i++)
+    assert (a[i] == b[i]);
+
+  free (a);
+
+  return 0;
+}
diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/deep-copy-4.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/deep-copy-4.c
new file mode 100644
index 0000000..8874ca0
--- /dev/null
+++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/deep-copy-4.c
@@ -0,0 +1,87 @@ 
+#include <assert.h>
+#include <stdlib.h>
+
+#define LIST_LENGTH 10
+
+struct node
+{
+  struct node *next;
+  int val;
+};
+
+int
+sum_nodes (struct node *head)
+{
+  int i = 0, sum = 0;
+
+#pragma acc parallel reduction(+:sum) present(head[:1])
+  {
+    for (; head != NULL; head = head->next)
+      sum += head->val;
+  }
+
+  return sum;
+}
+
+void
+insert (struct node *head, int val)
+{
+  struct node *n = (struct node *) malloc (sizeof (struct node));
+
+  if (head->next)
+    {
+#pragma acc exit data detach(head->next)
+    }
+
+  n->val = val;
+  n->next = head->next;
+  head->next = n;
+
+#pragma acc enter data copyin(n[:1])
+#pragma acc enter data attach(head->next)
+  if (n->next)
+    {
+#pragma acc enter data attach(n->next)
+    }
+}
+
+void
+destroy (struct node *head)
+{
+  while (head->next != NULL)
+    {
+#pragma acc exit data detach(head->next)
+      struct node * n = head->next;
+      head->next = n->next;
+      if (n->next)
+	{
+#pragma acc exit data detach(n->next)
+	}
+#pragma acc exit data delete (n[:1])
+      if (head->next)
+	{
+#pragma acc enter data attach(head->next)
+	}
+      free (n);
+    }
+}
+
+int
+main ()
+{
+  struct node list = { .next = NULL, .val = 0 };
+  int i;
+
+#pragma acc enter data copyin(list)
+
+  for (i = 0; i < LIST_LENGTH; i++)
+    insert (&list, i + 1);
+
+  assert (sum_nodes (&list) == (LIST_LENGTH * LIST_LENGTH + LIST_LENGTH) / 2);
+
+  destroy (&list);
+
+#pragma acc exit data delete(list)
+
+  return 0;
+}
diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/deep-copy-5.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/deep-copy-5.c
new file mode 100644
index 0000000..89cafbb
--- /dev/null
+++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/deep-copy-5.c
@@ -0,0 +1,81 @@ 
+#include <assert.h>
+#include <stdlib.h>
+#include <openacc.h>
+
+struct node
+{
+  struct node *next;
+  int val;
+};
+
+int
+sum_nodes (struct node *head)
+{
+  int i = 0, sum = 0;
+
+#pragma acc parallel reduction(+:sum) present(head[:1])
+  {
+    for (; head != NULL; head = head->next)
+      sum += head->val;
+  }
+
+  return sum;
+}
+
+void
+insert (struct node *head, int val)
+{
+  struct node *n = (struct node *) malloc (sizeof (struct node));
+
+  if (head->next)
+    acc_detach ((void **) &head->next);
+
+  n->val = val;
+  n->next = head->next;
+  head->next = n;
+
+  acc_copyin (n, sizeof (struct node));
+  acc_attach((void **) &head->next);
+
+  if (n->next)
+    acc_attach ((void **) &n->next);
+}
+
+void
+destroy (struct node *head)
+{
+  while (head->next != NULL)
+    {
+      acc_detach ((void **) &head->next);
+      struct node * n = head->next;
+      head->next = n->next;
+      if (n->next)
+	acc_detach ((void **) &n->next);
+
+      acc_delete (n, sizeof (struct node));
+      if (head->next)
+	acc_attach((void **) &head->next);
+
+      free (n);
+    }
+}
+
+int
+main ()
+{
+  struct node list = { .next = NULL, .val = 0 };
+  int i;
+
+  acc_copyin (&list, sizeof (struct node));
+
+  for (i = 0; i < 10; i++)
+    insert (&list, 2);
+
+  assert (sum_nodes (&list) == 10 * 2);
+
+  destroy (&list);
+
+  acc_delete (&list, sizeof (struct node));
+
+  return 0;
+}
diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/deep-copy-6.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/deep-copy-6.c
new file mode 100644
index 0000000..81c1c5e
--- /dev/null
+++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/deep-copy-6.c
@@ -0,0 +1,59 @@ 
+/* { dg-do run { target { ! openacc_host_selected } } } */
+
+#include <stdlib.h>
+#include <assert.h>
+#include <openacc.h>
+
+struct dc
+{
+  int a;
+  int **b;
+};
+
+int
+main ()
+{
+  int n = 100, i, j, k;
+  struct dc v = { .a = 3 };
+
+  v.b = (int **) malloc (sizeof (int *) * n);
+  for (i = 0; i < n; i++)
+    v.b[i] = (int *) malloc (sizeof (int) * n);
+
+  for (k = 0; k < 16; k++)
+    {
+#pragma acc data copy(v)
+      {
+#pragma acc data copy(v.b[:n])
+	{
+	  for (i = 0; i < n; i++)
+	    {
+	      acc_copyin (v.b[i], sizeof (int) * n);
+	      acc_attach ((void **) &v.b[i]);
+	    }
+
+#pragma acc parallel loop
+	  for (i = 0; i < n; i++)
+	    for (j = 0; j < n; j++)
+	      v.b[i][j] = v.a + i + j;
+
+	  for (i = 0; i < n; i++)
+	    {
+	      acc_detach ((void **) &v.b[i]);
+	      acc_copyout (v.b[i], sizeof (int) * n);
+	    }
+	}
+      }
+
+      for (i = 0; i < n; i++)
+	for (j = 0; j < n; j++)
+	  assert (v.b[i][j] == v.a + i + j);
+
+      assert (!acc_is_present (&v, sizeof (v)));
+      assert (!acc_is_present (v.b, sizeof (int *) * n));
+      for (i = 0; i < n; i++)
+        assert (!acc_is_present (v.b[i], sizeof (int) * n));
+    }
+
+  return 0;
+}
diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/deep-copy-7.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/deep-copy-7.c
new file mode 100644
index 0000000..a59047a
--- /dev/null
+++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/deep-copy-7.c
@@ -0,0 +1,45 @@ 
+/* { dg-do run { target { ! openacc_host_selected } } } */
+
+#include <stdlib.h>
+#include <assert.h>
+#include <openacc.h>
+
+struct dc
+{
+  int a;
+  int *b;
+};
+
+int
+main ()
+{
+  int n = 100, i, j, k;
+  struct dc v = { .a = 3 };
+
+  v.b = (int *) malloc (sizeof (int) * n);
+
+  for (k = 0; k < 16; k++)
+    {
+      /* Here, we do not explicitly copy the enclosing structure, but work
+	 with fields directly.  Make sure attachment counters and reference
+	 counters work properly in that case.  */
+#pragma acc enter data copyin(v.a, v.b[0:n])
+#pragma acc enter data pcopyin(v.b[0:n])
+#pragma acc enter data pcopyin(v.b[0:n])
+
+#pragma acc parallel loop present(v.a, v.b)
+      for (i = 0; i < n; i++)
+	v.b[i] = v.a + i;
+
+#pragma acc exit data copyout(v.b[:n]) finalize
+#pragma acc exit data delete(v.a)
+
+      for (i = 0; i < n; i++)
+	assert (v.b[i] == v.a + i);
+
+      assert (!acc_is_present (&v, sizeof (v)));
+      assert (!acc_is_present (v.b, sizeof (int *) * n));
+    }
+
+  return 0;
+}
diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/deep-copy-8.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/deep-copy-8.c
new file mode 100644
index 0000000..0ca5990
--- /dev/null
+++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/deep-copy-8.c
@@ -0,0 +1,54 @@ 
+/* { dg-do run { target { ! openacc_host_selected } } } */
+
+#include <stdlib.h>
+#include <assert.h>
+#include <openacc.h>
+
+struct dc
+{
+  int a;
+  int *b;
+  int *c;
+  int *d;
+};
+
+int
+main ()
+{
+  int n = 100, i, j, k;
+  struct dc v = { .a = 3 };
+
+  v.b = (int *) malloc (sizeof (int) * n);
+  v.c = (int *) malloc (sizeof (int) * n);
+  v.d = (int *) malloc (sizeof (int) * n);
+
+#pragma acc enter data copyin(v)
+
+  for (k = 0; k < 16; k++)
+    {
+#pragma acc enter data copyin(v.a, v.b[:n], v.c[:n], v.d[:n])
+
+#pragma acc parallel loop
+      for (i = 0; i < n; i++)
+	v.b[i] = v.a + i;
+
+#pragma acc exit data copyout(v.b[:n])
+#pragma acc exit data copyout(v.c[:n])
+#pragma acc exit data copyout(v.d[:n])
+#pragma acc exit data copyout(v.a)
+
+      for (i = 0; i < n; i++)
+	assert (v.b[i] == v.a + i);
+
+      assert (acc_is_present (&v, sizeof (v)));
+      assert (!acc_is_present (v.b, sizeof (int *) * n));
+      assert (!acc_is_present (v.c, sizeof (int *) * n));
+      assert (!acc_is_present (v.d, sizeof (int *) * n));
+    }
+
+#pragma acc exit data copyout(v)
+
+  assert (!acc_is_present (&v, sizeof (v)));
+
+  return 0;
+}
diff --git a/libgomp/testsuite/libgomp.oacc-fortran/data-2.f90 b/libgomp/testsuite/libgomp.oacc-fortran/data-2.f90
index 83a5400..6bb92c1 100644
--- a/libgomp/testsuite/libgomp.oacc-fortran/data-2.f90
+++ b/libgomp/testsuite/libgomp.oacc-fortran/data-2.f90
@@ -1,4 +1,5 @@ 
 ! { dg-do run }
+! { dg-additional-options "-cpp" }
 
 program test
   use openacc
@@ -70,10 +71,14 @@  program test
     end do
   !$acc end parallel
   
-  !$acc exit data copyout (d(1:N)) async
+  !$acc exit data delete (c(1:N)) copyout (d(1:N)) async
   !$acc exit data async
   !$acc wait
 
+#if !ACC_MEM_SHARED
+  if (acc_is_present (c) .eqv. .TRUE.) call abort
+#endif
+
   do i = 1, N
     if (d(i) .ne. 4.0) call abort
   end do
diff --git a/libgomp/testsuite/libgomp.oacc-fortran/deep-copy-1.f90 b/libgomp/testsuite/libgomp.oacc-fortran/deep-copy-1.f90
new file mode 100644
index 0000000..c4cea11
--- /dev/null
+++ b/libgomp/testsuite/libgomp.oacc-fortran/deep-copy-1.f90
@@ -0,0 +1,35 @@ 
+! { dg-do run }
+
+! Test of attach/detach with "acc data".
+
+program dtype
+  implicit none
+  integer, parameter :: n = 512
+  type mytype
+    integer, allocatable :: a(:)
+  end type mytype
+  integer i
+
+  type(mytype) :: var
+
+  allocate(var%a(1:n))
+
+!$acc data copy(var)
+!$acc data copy(var%a)
+
+!$acc parallel loop
+  do i = 1,n
+    var%a(i) = i
+  end do
+!$acc end parallel loop
+
+!$acc end data
+!$acc end data
+
+  do i = 1,n
+    if (i .ne. var%a(i)) stop 1
+  end do
+
+  deallocate(var%a)
+
+end program dtype
diff --git a/libgomp/testsuite/libgomp.oacc-fortran/deep-copy-2.f90 b/libgomp/testsuite/libgomp.oacc-fortran/deep-copy-2.f90
new file mode 100644
index 0000000..3593661
--- /dev/null
+++ b/libgomp/testsuite/libgomp.oacc-fortran/deep-copy-2.f90
@@ -0,0 +1,33 @@ 
+! { dg-do run }
+
+! Test of attach/detach with "acc data", two clauses at once.
+
+program dtype
+  implicit none
+  integer, parameter :: n = 512
+  type mytype
+    integer, allocatable :: a(:)
+  end type mytype
+  integer i
+
+  type(mytype) :: var
+
+  allocate(var%a(1:n))
+
+!$acc data copy(var) copy(var%a)
+
+!$acc parallel loop
+  do i = 1,n
+    var%a(i) = i
+  end do
+!$acc end parallel loop
+
+!$acc end data
+
+  do i = 1,n
+    if (i .ne. var%a(i)) stop 1
+  end do
+
+  deallocate(var%a)
+
+end program dtype
diff --git a/libgomp/testsuite/libgomp.oacc-fortran/deep-copy-3.f90 b/libgomp/testsuite/libgomp.oacc-fortran/deep-copy-3.f90
new file mode 100644
index 0000000..667d944
--- /dev/null
+++ b/libgomp/testsuite/libgomp.oacc-fortran/deep-copy-3.f90
@@ -0,0 +1,34 @@ 
+! { dg-do run }
+
+! Test of attach/detach with "acc parallel".
+
+program dtype
+  implicit none
+  integer, parameter :: n = 512
+  type mytype
+    integer, allocatable :: a(:)
+    integer, allocatable :: b(:)
+  end type mytype
+  integer i
+
+  type(mytype) :: var
+
+  allocate(var%a(1:n))
+  allocate(var%b(1:n))
+
+!$acc parallel loop copy(var) copy(var%a(1:n)) copy(var%b(1:n))
+  do i = 1,n
+    var%a(i) = i
+    var%b(i) = i
+  end do
+!$acc end parallel loop
+
+  do i = 1,n
+    if (i .ne. var%a(i)) stop 1
+    if (i .ne. var%b(i)) stop 2
+  end do
+
+  deallocate(var%a)
+  deallocate(var%b)
+
+end program dtype
diff --git a/libgomp/testsuite/libgomp.oacc-fortran/deep-copy-4.f90 b/libgomp/testsuite/libgomp.oacc-fortran/deep-copy-4.f90
new file mode 100644
index 0000000..6949e12
--- /dev/null
+++ b/libgomp/testsuite/libgomp.oacc-fortran/deep-copy-4.f90
@@ -0,0 +1,49 @@ 
+! { dg-do run }
+
+! Test of attach/detach with "acc enter/exit data".
+
+program dtype
+  implicit none
+  integer, parameter :: n = 512
+  type mytype
+    integer, allocatable :: a(:)
+    integer, allocatable :: b(:)
+  end type mytype
+  integer, allocatable :: r(:)
+  integer i
+
+  type(mytype) :: var
+
+  allocate(var%a(1:n))
+  allocate(var%b(1:n))
+  allocate(r(1:n))
+
+!$acc enter data copyin(var)
+
+!$acc enter data copyin(var%a, var%b, r)
+
+!$acc parallel loop
+  do i = 1,n
+    var%a(i) = i
+    var%b(i) = i * 2
+    r(i) = i * 3
+  end do
+!$acc end parallel loop
+
+!$acc exit data copyout(var%a)
+!$acc exit data copyout(var%b)
+!$acc exit data copyout(r)
+
+  do i = 1,n
+    if (i .ne. var%a(i)) stop 1
+    if (i * 2 .ne. var%b(i)) stop 2
+    if (i * 3 .ne. r(i)) stop 3
+  end do
+
+!$acc exit data delete(var)
+
+  deallocate(var%a)
+  deallocate(var%b)
+  deallocate(r)
+
+end program dtype
diff --git a/libgomp/testsuite/libgomp.oacc-fortran/deep-copy-5.f90 b/libgomp/testsuite/libgomp.oacc-fortran/deep-copy-5.f90
new file mode 100644
index 0000000..6843cf1
--- /dev/null
+++ b/libgomp/testsuite/libgomp.oacc-fortran/deep-copy-5.f90
@@ -0,0 +1,57 @@ 
+! { dg-do run }
+
+! Test of attach/detach, "enter data" inside "data", and subarray.
+
+program dtype
+  implicit none
+  integer, parameter :: n = 512
+  type mytype
+    integer, allocatable :: a(:)
+    integer, allocatable :: b(:)
+  end type mytype
+  integer i
+
+  type(mytype) :: var
+
+  allocate(var%a(1:n))
+  allocate(var%b(1:n))
+
+!$acc data copy(var)
+
+  do i = 1, n
+    var%a(i) = 0
+    var%b(i) = 0
+  end do
+
+!$acc enter data copyin(var%a(5:n - 5), var%b(5:n - 5))
+
+!$acc parallel loop
+  do i = 5,n - 5
+    var%a(i) = i
+    var%b(i) = i * 2
+  end do
+!$acc end parallel loop
+
+!$acc exit data copyout(var%a(5:n - 5), var%b(5:n - 5))
+
+!$acc end data
+
+  do i = 1,4
+    if (var%a(i) .ne. 0) stop 1
+    if (var%b(i) .ne. 0) stop 2
+  end do
+
+  do i = 5,n - 5
+    if (i .ne. var%a(i)) stop 3
+    if (i * 2 .ne. var%b(i)) stop 4
+  end do
+
+  do i = n - 4,n
+    if (var%a(i) .ne. 0) stop 5
+    if (var%b(i) .ne. 0) stop 6
+  end do
+
+  deallocate(var%a)
+  deallocate(var%b)
+
+end program dtype
diff --git a/libgomp/testsuite/libgomp.oacc-fortran/deep-copy-6.f90 b/libgomp/testsuite/libgomp.oacc-fortran/deep-copy-6.f90
new file mode 100644
index 0000000..12910d0
--- /dev/null
+++ b/libgomp/testsuite/libgomp.oacc-fortran/deep-copy-6.f90
@@ -0,0 +1,61 @@ 
+! { dg-do run }
+
+! Test of attachment counters and finalize.
+
+program dtype
+  implicit none
+  integer, parameter :: n = 512
+  type mytype
+    integer, allocatable :: a(:)
+    integer, allocatable :: b(:)
+  end type mytype
+  integer i
+
+  type(mytype) :: var
+
+  allocate(var%a(1:n))
+  allocate(var%b(1:n))
+
+!$acc data copy(var)
+
+  do i = 1, n
+    var%a(i) = 0
+    var%b(i) = 0
+  end do
+
+!$acc enter data copyin(var%a(5:n - 5), var%b(5:n - 5))
+
+  do i = 1,20
+    !$acc enter data attach(var%a)
+  end do
+
+!$acc parallel loop
+  do i = 5,n - 5
+    var%a(i) = i
+    var%b(i) = i * 2
+  end do
+!$acc end parallel loop
+
+!$acc exit data copyout(var%a(5:n - 5), var%b(5:n - 5)) finalize
+
+!$acc end data
+
+  do i = 1,4
+    if (var%a(i) .ne. 0) stop 1
+    if (var%b(i) .ne. 0) stop 2
+  end do
+
+  do i = 5,n - 5
+    if (i .ne. var%a(i)) stop 3
+    if (i * 2 .ne. var%b(i)) stop 4
+  end do
+
+  do i = n - 4,n
+    if (var%a(i) .ne. 0) stop 5
+    if (var%b(i) .ne. 0) stop 6
+  end do
+
+  deallocate(var%a)
+  deallocate(var%b)
+
+end program dtype
diff --git a/libgomp/testsuite/libgomp.oacc-fortran/deep-copy-7.f90 b/libgomp/testsuite/libgomp.oacc-fortran/deep-copy-7.f90
new file mode 100644
index 0000000..ab44f0a
--- /dev/null
+++ b/libgomp/testsuite/libgomp.oacc-fortran/deep-copy-7.f90
@@ -0,0 +1,89 @@ 
+! { dg-do run }
+
+! Test of attach/detach with scalar elements and nested derived types.
+
+program dtype
+  implicit none
+  integer, parameter :: n = 512
+  type subtype
+    integer :: g, h
+    integer, allocatable :: q(:)
+  end type subtype
+  type mytype
+    integer, allocatable :: a(:)
+    integer, allocatable :: c, d
+    integer, allocatable :: b(:)
+    integer :: f
+    type(subtype) :: s
+  end type mytype
+  integer i
+
+  type(mytype) :: var
+
+  allocate(var%a(1:n))
+  allocate(var%b(1:n))
+  allocate(var%c)
+  allocate(var%d)
+  allocate(var%s%q(1:n))
+
+  var%c = 16
+  var%d = 20
+  var%f = 7
+  var%s%g = 21
+  var%s%h = 38
+
+!$acc enter data copyin(var)
+
+  do i = 1, n
+    var%a(i) = 0
+    var%b(i) = 0
+    var%s%q(i) = 0
+  end do
+
+!$acc data copy(var%a(5:n - 5), var%b(5:n - 5), var%c, var%d) &
+!$acc & copy(var%s%q)
+
+!$acc parallel loop default(none) present(var)
+  do i = 5,n - 5
+    var%a(i) = i
+    var%b(i) = i * 2
+    var%s%q(i) = i * 3
+    var%s%g = 100
+    var%s%h = 101
+  end do
+!$acc end parallel loop
+
+!$acc end data
+
+!$acc exit data copyout(var)
+
+  do i = 1,4
+    if (var%a(i) .ne. 0) stop 1
+    if (var%b(i) .ne. 0) stop 2
+    if (var%s%q(i) .ne. 0) stop 3
+  end do
+
+  do i = 5,n - 5
+    if (i .ne. var%a(i)) stop 4
+    if (i * 2 .ne. var%b(i)) stop 5
+    if (i * 3 .ne. var%s%q(i)) stop 6
+  end do
+
+  do i = n - 4,n
+    if (var%a(i) .ne. 0) stop 7
+    if (var%b(i) .ne. 0) stop 8
+    if (var%s%q(i) .ne. 0) stop 9
+  end do
+
+  if (var%c .ne. 16) stop 10
+  if (var%d .ne. 20) stop 11
+  if (var%s%g .ne. 100 .or. var%s%h .ne. 101) stop 12
+  if (var%f .ne. 7) stop 13
+
+  deallocate(var%a)
+  deallocate(var%b)
+  deallocate(var%c)
+  deallocate(var%d)
+  deallocate(var%s%q)
+
+end program dtype
diff --git a/libgomp/testsuite/libgomp.oacc-fortran/deep-copy-8.f90 b/libgomp/testsuite/libgomp.oacc-fortran/deep-copy-8.f90
new file mode 100644
index 0000000..d142763
--- /dev/null
+++ b/libgomp/testsuite/libgomp.oacc-fortran/deep-copy-8.f90
@@ -0,0 +1,41 @@ 
+! { dg-do run }
+
+! Test of explicit attach/detach clauses and attachment counters. There are no
+! acc_attach/acc_detach API routines in Fortran.
+
+program dtype
+  use openacc
+  implicit none
+  integer, parameter :: n = 512
+  type mytype
+    integer, allocatable :: a(:)
+  end type mytype
+  integer i
+
+  type(mytype) :: var
+
+  allocate(var%a(1:n))
+
+  call acc_copyin(var)
+  call acc_copyin(var%a)
+
+  !$acc enter data attach(var%a)
+
+!$acc parallel loop attach(var%a)
+  do i = 1,n
+    var%a(i) = i
+  end do
+!$acc end parallel loop
+
+  !$acc exit data detach(var%a)
+
+  call acc_copyout(var%a)
+  call acc_copyout(var)
+
+  do i = 1,n
+    if (i .ne. var%a(i)) stop 1
+  end do
+
+  deallocate(var%a)
+
+end program dtype
diff --git a/libgomp/testsuite/libgomp.oacc-fortran/derived-type-1.f90 b/libgomp/testsuite/libgomp.oacc-fortran/derived-type-1.f90
new file mode 100644
index 0000000..eb7812d
--- /dev/null
+++ b/libgomp/testsuite/libgomp.oacc-fortran/derived-type-1.f90
@@ -0,0 +1,28 @@ 
+! Test derived types with subarrays
+
+! { dg-do run }
+
+  implicit none
+  type dtype
+     integer :: a, b, c
+  end type dtype
+  integer, parameter :: n = 100
+  integer i
+  type (dtype), dimension(n) :: d
+
+  !$acc data copy(d(1:n))
+  !$acc parallel loop
+  do i = 1, n
+     d(i)%a = i
+     d(i)%b = i-1
+     d(i)%c = i+1
+  end do
+  !$acc end data
+
+  do i = 1, n
+     if (d(i)%a /= i) stop 1
+     if (d(i)%b /= i-1) stop 2
+     if (d(i)%c /= i+1) stop 3
+  end do
+end program
+
diff --git a/libgomp/testsuite/libgomp.oacc-fortran/update-2.f90 b/libgomp/testsuite/libgomp.oacc-fortran/update-2.f90
new file mode 100644
index 0000000..c3c8a07
--- /dev/null
+++ b/libgomp/testsuite/libgomp.oacc-fortran/update-2.f90
@@ -0,0 +1,284 @@ 
+! Test ACC UPDATE with derived types.
+
+! { dg-do run }
+
+module dt
+  integer, parameter :: n = 10
+  type inner
+     integer :: d(n)
+  end type inner
+  type mytype
+     integer(8) :: a, b, c(n)
+     type(inner) :: in
+  end type mytype
+end module dt
+
+program derived_acc
+  use dt
+
+  implicit none
+  integer i, res
+  type(mytype) :: var
+
+  var%a = 0
+  var%b = 1
+  var%c(:) = 10
+  var%in%d(:) = 100
+
+  var%c(:) = 10
+
+  !$acc enter data copyin(var)
+
+  !$acc parallel loop present(var)
+  do i = 1, 1
+     var%a = var%b
+  end do
+  !$acc end parallel loop
+
+  !$acc update host(var%a)
+
+  if (var%a /= var%b) stop 1
+
+  var%b = 100
+
+  !$acc update device(var%b)
+
+  !$acc parallel loop present(var)
+  do i = 1, 1
+     var%a = var%b
+  end do
+  !$acc end parallel loop
+
+  !$acc update host(var%a)
+
+  if (var%a /= var%b) stop 2
+
+  !$acc parallel loop present (var)
+  do i = 1, n
+     var%c(i) = i
+  end do
+  !$acc end parallel loop
+
+  !$acc update host(var%c)
+
+  var%a = -1
+
+  do i = 1, n
+     if (var%c(i) /= i) stop 3
+     var%c(i) = var%a
+  end do
+
+  !$acc update device(var%a)
+  !$acc update device(var%c)
+
+  res = 0
+
+  !$acc parallel loop present(var) reduction(+:res)
+  do i = 1, n
+     if (var%c(i) /= var%a) res = res + 1
+  end do
+
+  if (res /= 0) stop 4
+
+  var%c(:) = 0
+
+  !$acc update device(var%c)
+
+  !$acc parallel loop present(var)
+  do i = 5, 5
+     var%c(i) = 1
+  end do
+  !$acc end parallel loop
+
+  !$acc update host(var%c(5))
+
+  do i = 1, n
+     if (i /= 5 .and. var%c(i) /= 0) stop 5
+     if (i == 5 .and. var%c(i) /= 1) stop 6
+  end do
+
+  !$acc parallel loop present(var)
+  do i = 1, n
+     var%in%d = var%a
+  end do
+  !$acc end parallel loop
+
+  !$acc update host(var%in%d)
+
+  do i = 1, n
+     if (var%in%d(i) /= var%a) stop 7
+  end do
+
+  var%c(:) = 0
+
+  !$acc update device(var%c)
+
+  var%c(:) = -1
+
+  !$acc parallel loop present(var)
+  do i = n/2, n
+     var%c(i) = i
+  end do
+  !$acc end parallel loop
+
+  !$acc update host(var%c(n/2:n))
+
+  do i = 1,n
+     if (i < n/2 .and. var%c(i) /= -1) stop 8
+     if (i >= n/2 .and. var%c(i) /= i) stop 9
+  end do
+
+  var%in%d(:) = 0
+  !$acc update device(var%in%d)
+
+  !$acc parallel loop present(var)
+  do i = 5, 5
+     var%in%d(i) = 1
+  end do
+  !$acc end parallel loop
+
+  !$acc update host(var%in%d(5))
+
+  do i = 1, n
+     if (i /= 5 .and. var%in%d(i) /= 0) stop 10
+     if (i == 5 .and. var%in%d(i) /= 1) stop 11
+  end do
+
+  !$acc exit data delete(var)
+
+  call derived_acc_subroutine(var)
+end program derived_acc
+
+subroutine derived_acc_subroutine(var)
+  use dt
+
+  implicit none
+  integer i, res
+  type(mytype) :: var
+
+  var%a = 0
+  var%b = 1
+  var%c(:) = 10
+  var%in%d(:) = 100
+
+  var%c(:) = 10
+
+  !$acc enter data copyin(var)
+
+  !$acc parallel loop present(var)
+  do i = 1, 1
+     var%a = var%b
+  end do
+  !$acc end parallel loop
+
+  !$acc update host(var%a)
+
+  if (var%a /= var%b) stop 12
+
+  var%b = 100
+
+  !$acc update device(var%b)
+
+  !$acc parallel loop present(var)
+  do i = 1, 1
+     var%a = var%b
+  end do
+  !$acc end parallel loop
+
+  !$acc update host(var%a)
+
+  if (var%a /= var%b) stop 13
+
+  !$acc parallel loop present (var)
+  do i = 1, n
+     var%c(i) = i
+  end do
+  !$acc end parallel loop
+
+  !$acc update host(var%c)
+
+  var%a = -1
+
+  do i = 1, n
+     if (var%c(i) /= i) stop 14
+     var%c(i) = var%a
+  end do
+
+  !$acc update device(var%a)
+  !$acc update device(var%c)
+
+  res = 0
+
+  !$acc parallel loop present(var) reduction(+:res)
+  do i = 1, n
+     if (var%c(i) /= var%a) res = res + 1
+  end do
+
+  if (res /= 0) stop 15
+
+  var%c(:) = 0
+
+  !$acc update device(var%c)
+
+  !$acc parallel loop present(var)
+  do i = 5, 5
+     var%c(i) = 1
+  end do
+  !$acc end parallel loop
+
+  !$acc update host(var%c(5))
+
+  do i = 1, n
+     if (i /= 5 .and. var%c(i) /= 0) stop 16
+     if (i == 5 .and. var%c(i) /= 1) stop 17
+  end do
+
+  !$acc parallel loop present(var)
+  do i = 1, n
+     var%in%d = var%a
+  end do
+  !$acc end parallel loop
+
+  !$acc update host(var%in%d)
+
+  do i = 1, n
+     if (var%in%d(i) /= var%a) stop 18
+  end do
+
+  var%c(:) = 0
+
+  !$acc update device(var%c)
+
+  var%c(:) = -1
+
+  !$acc parallel loop present(var)
+  do i = n/2, n
+     var%c(i) = i
+  end do
+  !$acc end parallel loop
+
+  !$acc update host(var%c(n/2:n))
+
+  do i = 1,n
+     if (i < n/2 .and. var%c(i) /= -1) stop 19
+     if (i >= n/2 .and. var%c(i) /= i) stop 20
+  end do
+
+  var%in%d(:) = 0
+  !$acc update device(var%in%d)
+
+  !$acc parallel loop present(var)
+  do i = 5, 5
+     var%in%d(i) = 1
+  end do
+  !$acc end parallel loop
+
+  !$acc update host(var%in%d(5))
+
+  do i = 1, n
+     if (i /= 5 .and. var%in%d(i) /= 0) stop 21
+     if (i == 5 .and. var%in%d(i) /= 1) stop 22
+  end do
+
+  !$acc exit data delete(var)
+end subroutine derived_acc_subroutine