diff mbox series

OpenMP/Fortran: Fix (re)mapping of allocatable/pointer arrays [PR96668]

Message ID 34d058a5-aa21-2ef9-d644-f95723c54e61@codesourcery.com
State New
Headers show
Series OpenMP/Fortran: Fix (re)mapping of allocatable/pointer arrays [PR96668] | expand

Commit Message

Tobias Burnus Sept. 11, 2020, 6:13 p.m. UTC
This is a first attempt to improve the OpenMP mapping for allocatables
and pointers; there are some more issues – cf. PR and for scalars
PR 97021.

In real world code, a usage like the following is not uncommon:

real, allocatable :: A(:,:)
!$omp target enter data map(to: A)

This maps an unallocated array (a.data == NULL), the array descriptor
itself ("a", pointer set) and then pointer associates on the device
the mapped data (well, NULL) with the device's "a.data"

That works well – and one can now use A on the device and allocate
it (and, before, 'end target' deallocate it).

However, many programs now do on the host:

allocate(A(n,m))
!$omp target
   do i = ...
     do j = ...
       A(j,i) = ...
!$omp end target

which gets an implicit "map(tofrom:A)". While "a.data" now gets
mapped, the "a" is not updated as it is already present and
pointer-setting 'a.data' on the device is also not needed as it
is already there.

As written, such code is rather common and other compilers handle this.

The Fortran spec between OpenMP 4.5 and TR 8 is a bit unclear; in
TR 9 (not yet available), the code above is only valid with
   map(always, tofrom: A)  (or 'to:')
where the 'always' is required. The general notion is that it should
be also valid for the case above, but allocatable components of
derived types should not always be rechecked/remapped every time
map(dt) is used. — Hence, this was deferred and only the 'always'
part was clarified in the draft for the upcoming TR 9.

Additionally, for POINTER there is already the following wording
in the spec, which implies that the pointer has to be (potentially)
updated every time:

"If a list item in a map clause is an associated pointer and
  the pointer is not the base pointer of another list item in
  a map clause on the same construct, then it is treated as if
  its pointer target is implicitly mapped in the same clause.
  For the purposes of the map clause, the mapped pointer
  target is treated as if its base pointer is the associated pointer."

OK?

Tobias

-----------------
Mentor Graphics (Deutschland) GmbH, Arnulfstraße 201, 80634 München / Germany
Registergericht München HRB 106955, Geschäftsführer: Thomas Heurung, Alexander Walter

Comments

Tobias Burnus Sept. 14, 2020, 7:50 a.m. UTC | #1
Second version, new is the second testcase, the change in omp-low.c,
and around the two asserts in libgomp/target.c

The new version now also handles 'omp declare target' variables. This
usage occurs in five SPEC ACCEL testcases (all without 'always' clause),
hence, real-world code seems to expect that the mapped as never allocated
feature works.

OK for the trunk?

Tobias

On 9/11/20 8:13 PM, Tobias Burnus wrote:

> This is a first attempt to improve the OpenMP mapping for allocatables
> and pointers; there are some more issues – cf. PR and for scalars
> PR 97021.
>
> In real world code, a usage like the following is not uncommon:
>
> real, allocatable :: A(:,:)
> !$omp target enter data map(to: A)
>
> This maps an unallocated array (a.data == NULL), the array descriptor
> itself ("a", pointer set) and then pointer associates on the device
> the mapped data (well, NULL) with the device's "a.data"
>
> That works well – and one can now use A on the device and allocate
> it (and, before, 'end target' deallocate it).
>
> However, many programs now do on the host:
>
> allocate(A(n,m))
> !$omp target
>   do i = ...
>     do j = ...
>       A(j,i) = ...
> !$omp end target
>
> which gets an implicit "map(tofrom:A)". While "a.data" now gets
> mapped, the "a" is not updated as it is already present and
> pointer-setting 'a.data' on the device is also not needed as it
> is already there.
>
> As written, such code is rather common and other compilers handle this.
>
> The Fortran spec between OpenMP 4.5 and TR 8 is a bit unclear; in
> TR 9 (not yet available), the code above is only valid with
>   map(always, tofrom: A)  (or 'to:')
> where the 'always' is required. The general notion is that it should
> be also valid for the case above, but allocatable components of
> derived types should not always be rechecked/remapped every time
> map(dt) is used. — Hence, this was deferred and only the 'always'
> part was clarified in the draft for the upcoming TR 9.
>
> Additionally, for POINTER there is already the following wording
> in the spec, which implies that the pointer has to be (potentially)
> updated every time:
>
> "If a list item in a map clause is an associated pointer and
>  the pointer is not the base pointer of another list item in
>  a map clause on the same construct, then it is treated as if
>  its pointer target is implicitly mapped in the same clause.
>  For the purposes of the map clause, the mapped pointer
>  target is treated as if its base pointer is the associated pointer."
>
> OK?
>
> Tobias
>
-----------------
Mentor Graphics (Deutschland) GmbH, Arnulfstraße 201, 80634 München / Germany
Registergericht München HRB 106955, Geschäftsführer: Thomas Heurung, Alexander Walter
Jakub Jelinek Sept. 14, 2020, 11:03 a.m. UTC | #2
On Mon, Sep 14, 2020 at 09:50:08AM +0200, Tobias Burnus wrote:
> --- a/gcc/fortran/trans-openmp.c
> +++ b/gcc/fortran/trans-openmp.c
> @@ -1357,6 +1357,15 @@ gfc_omp_finish_clause (tree c, gimple_seq *pre_p)
>        tree type = TREE_TYPE (decl);
>        tree ptr = gfc_conv_descriptor_data_get (decl);
>  
> +      /* OpenMP: automatically map pointer targets with the pointer;
> +	 hence, always update the descriptor/pointer itself.
> +	 NOTE: This also remaps the pointer for allocatable arrays with
> +	 'target' attribute which also don't have the 'restrict' qualifier.  */
> +      bool always_modifier = false;
> +
> +      if (flag_openmp && !(TYPE_QUALS (TREE_TYPE (ptr)) & TYPE_QUAL_RESTRICT))
> +	always_modifier = true;

I think we don't want to depend on flag_openmp here, because
one can compile with -fopenmp -fopenacc and simply encounter OpenMP or
OpenACC code or none of that (or both as long as we don't reject it due to
invalid mixing).
So, if we need to handle the two languages differently, we probably need to
change the langhook's arguments and pass from the gimplifier a flag whether
it is OpenMP or OpenACC construct (or tree code of the construct or whatever
else).  Guess for now an openacc flag with (ctx->region_type & ORT_ACC) != 0
passed to it seems easiest.

> diff --git a/libgomp/target.c b/libgomp/target.c
> index 3e292eb..f037151 100644
> --- a/libgomp/target.c
> +++ b/libgomp/target.c
> @@ -1,3 +1,4 @@
> +#pragma GCC optimize("O0")
>  /* Copyright (C) 2013-2020 Free Software Foundation, Inc.
>     Contributed by Jakub Jelinek <jakub@redhat.com>.
>  

Not this change please ;).

> @@ -472,8 +474,8 @@ gomp_map_fields_existing (struct target_mem_desc *tgt,
>  	      && n2->host_start - n->host_start
>  		 == n2->tgt_offset - n->tgt_offset)
>  	    {
> -	      gomp_map_vars_existing (devicep, aq, n2, &cur_node,
> -				      &tgt->list[i], kind & typemask, cbuf);
> +	      gomp_map_vars_existing (devicep, aq, n2, &cur_node, &tgt->list[i],
> +				      kind & typemask,false,  cbuf);

Formatting (move 1 space before false from after it).

	Jakub
Tobias Burnus Sept. 14, 2020, 10:48 p.m. UTC | #3
On 9/14/20 1:03 PM, Jakub Jelinek wrote:
>> +
>> +      if (flag_openmp && !(TYPE_QUALS (TREE_TYPE (ptr)) & TYPE_QUAL_RESTRICT))
>> +    always_modifier = true;
> I think we don't want to depend on flag_openmp here, ... Guess for now
> an openacc flag with (ctx->region_type & ORT_ACC) != 0 passed to it
> seems easiest.

Done so. For gimplify_omp_for, I pass the bool as argument as
gimplify_omp_ctxp is often NULL.

>> +#pragma GCC optimize("O0")
> Not this change please ;).
True ;-)
> Formatting (move 1 space before false from after it).

Fixed. Thanks for the quick first review.

New version attached.

Tobias

PS: Regtesting still on-going but most parts already tested
successfully.

-----------------
Mentor Graphics (Deutschland) GmbH, Arnulfstraße 201, 80634 München / Germany
Registergericht München HRB 106955, Geschäftsführer: Thomas Heurung, Alexander Walter
Jakub Jelinek Sept. 15, 2020, 6:56 a.m. UTC | #4
On Tue, Sep 15, 2020 at 12:48:59AM +0200, Tobias Burnus wrote:
> 	PR fortran/96668
> 	* gomp-constants.h (GOMP_MAP_ALWAYS_POINTER_P):

Missing description (Define. ?)
> 
> libgomp/ChangeLog:
> 
> 	PR fortran/96668
> 	* libgomp.h (struct target_var_desc):
> 	* target.c (gomp_map_vars_existing):
> 	(gomp_map_fields_existing):
> 	(gomp_map_vars_internal):
> 	(GOMP_target_enter_exit_data):

Missing descriptions.

> @@ -11030,7 +11034,7 @@ gimplify_omp_taskloop_expr (tree type, tree *tp, gimple_seq *pre_p,
>  /* Gimplify the gross structure of an OMP_FOR statement.  */
>  
>  static enum gimplify_status
> -gimplify_omp_for (tree *expr_p, gimple_seq *pre_p)
> +gimplify_omp_for (tree *expr_p, gimple_seq *pre_p, bool openacc)
>  {
>    tree for_stmt, orig_for_stmt, inner_for_stmt = NULL_TREE, decl, var, t;
>    enum gimplify_status ret = GS_ALL_DONE;

Wouldn't it be easier to not add an argument, but:
  bool openacc = TREE_CODE (*expr_p) == OACC_LOOP;
inside of the routine?

Otherwise LGTM.

	Jakub
Jakub Jelinek Sept. 15, 2020, 5:03 p.m. UTC | #5
On Tue, Sep 15, 2020 at 12:48:59AM +0200, Tobias Burnus wrote:
> +	      bool has_nullptr;
> +	      size_t j;
> +	      for (j = 0; j < n->tgt->list_count; j++)
> +		if (n->tgt->list[j].key == n)
> +		  {
> +		    has_nullptr = n->tgt->list[j].has_null_ptr_assoc;
> +		    break;
> +		  }
> +	      if (n->tgt->list_count == 0)
> +		{
> +		  /* 'declare target'; assume has_nullptr; it could also be
> +		     statically assigned pointer, but that it should be to
> +		     the equivalent variable on the host.  */
> +		  assert (n->refcount == REFCOUNT_INFINITY);
> +		  has_nullptr = true;
> +		}
> +	      else
> +		assert (j < n->tgt->list_count);
> +	      /* Re-map the data if there is an 'always' modifier or if it a
> +		 null pointer was there and non a nonnull has been found; that
> +		 permits transparent re-mapping for Fortran array descriptors
> +		 which were previously mapped unallocated.  */
> +	      for (j = i + 1; j < mapnum; j++)
> +		{
> +		  int ptr_kind = get_kind (short_mapkind, kinds, j) & typemask;
> +		  if (!GOMP_MAP_ALWAYS_POINTER_P (ptr_kind)
> +		      && (!has_nullptr

David Edelsohn just reported this (rightly so) results in -Wuninitialized
warnings, I think you meant bool has_nullptr = false;
in the definition (in both places that it is defined at).

	Jakub
Tobias Burnus Sept. 15, 2020, 7:49 p.m. UTC | #6
On 9/15/20 7:03 PM, Jakub Jelinek wrote:
> On Tue, Sep 15, 2020 at 12:48:59AM +0200, Tobias Burnus wrote:
>> +          bool has_nullptr;
>> +          size_t j;
>> +          for (j = 0; j < n->tgt->list_count; j++)
>> +            if (n->tgt->list[j].key == n)
>> +              {
>> +                has_nullptr = n->tgt->list[j].has_null_ptr_assoc;
>> +                break;
>> +              }
>> +          if (n->tgt->list_count == 0)
>> +              has_nullptr = true;
>> +          else
>> +            assert (j < n->tgt->list_count);
> David Edelsohn just reported this (rightly so) results in -Wuninitialized
> warnings, I think you meant bool has_nullptr = false;
> in the definition (in both places that it is defined at).

No, I meant that it should be always set.

I think the uninitialized warning is a false positive:
list_count and j are unsigned. For list_count == 0 (and, hence, j = 0),
the value is set (special case). Otherwise, if j < list_count, has_nullptr
has been set via the 'for' loop. If it is not set in the loop, j == list_count
and that's (plus j > list_count) caught by the assert.

I admit that with the assert being disabled (NDEBUG is set), that's not
visible to the compiler but otherwise, it should be able to find out.

Tobias

-----------------
Mentor Graphics (Deutschland) GmbH, Arnulfstraße 201, 80634 München / Germany
Registergericht München HRB 106955, Geschäftsführer: Thomas Heurung, Alexander Walter
diff mbox series

Patch

OpenMP/Fortran: Fix (re)mapping of allocatable/pointer arrays [PR96668]

gcc/fortran/ChangeLog:

	PR fortran/96668
	* trans-openmp.c (gfc_omp_finish_clause): Use GOMP_MAP_ALWAYS_POINTER
	with PSET for pointers.
	(gfc_trans_omp_clauses): Likewise and also if the always modifier is
	used.

gcc/ChangeLog:

	PR fortran/96668
	* gimplify.c (gimplify_scan_omp_clauses): Handle
	GOMP_MAP_ALWAYS_POINTER like GOMP_MAP_POINTER for target exit data.

include/ChangeLog:

	PR fortran/96668
	* gomp-constants.h (GOMP_MAP_ALWAYS_POINTER_P): New define.

libgomp/ChangeLog:

	PR fortran/96668
	* libgomp.h (struct target_var_desc): Add has_null_ptr_assoc member.
	* target.c (gomp_map_vars_existing): Add always_to_flag flag.
	(gomp_map_vars_existing): Update call to it.
	(gomp_map_fields_existing): Likewise
	(gomp_map_vars_internal): Update PSET handling such that if a nullptr is
	now allocated or if GOMP_MAP_POINTER is used PSET is updated and pointer
	remapped.
	(GOMP_target_enter_exit_data): Hanlde GOMP_MAP_ALWAYS_POINTER like
	GOMP_MAP_POINTER.
	* testsuite/libgomp.fortran/map-alloc-ptr-1.f90: New test.

 gcc/fortran/trans-openmp.c                         |  28 +++-
 gcc/gimplify.c                                     |   1 +
 include/gomp-constants.h                           |   3 +
 libgomp/libgomp.h                                  |   3 +
 libgomp/target.c                                   | 173 ++++++++++++++++-----
 .../testsuite/libgomp.fortran/map-alloc-ptr-1.f90  | 114 ++++++++++++++
 6 files changed, 282 insertions(+), 40 deletions(-)

diff --git a/gcc/fortran/trans-openmp.c b/gcc/fortran/trans-openmp.c
index 0e1da04..268467d 100644
--- a/gcc/fortran/trans-openmp.c
+++ b/gcc/fortran/trans-openmp.c
@@ -1357,6 +1357,15 @@  gfc_omp_finish_clause (tree c, gimple_seq *pre_p)
       tree type = TREE_TYPE (decl);
       tree ptr = gfc_conv_descriptor_data_get (decl);
 
+      /* OpenMP: automatically map pointer targets with the pointer;
+	 hence, always update the descriptor/pointer itself.
+	 NOTE: This also remaps the pointer for allocatable arrays with
+	 'target' attribute which also don't have the 'restrict' qualifier.  */
+      bool always_modifier = false;
+
+      if (flag_openmp && !(TYPE_QUALS (TREE_TYPE (ptr)) & TYPE_QUAL_RESTRICT))
+	always_modifier = true;
+
       if (present)
 	ptr = gfc_build_cond_assign_expr (&block, present, ptr,
 					  null_pointer_node);
@@ -1376,7 +1385,8 @@  gfc_omp_finish_clause (tree c, gimple_seq *pre_p)
 	OMP_CLAUSE_DECL (c2) = decl;
       OMP_CLAUSE_SIZE (c2) = TYPE_SIZE_UNIT (type);
       c3 = build_omp_clause (OMP_CLAUSE_LOCATION (c), OMP_CLAUSE_MAP);
-      OMP_CLAUSE_SET_MAP_KIND (c3, GOMP_MAP_POINTER);
+      OMP_CLAUSE_SET_MAP_KIND (c3, always_modifier ? GOMP_MAP_ALWAYS_POINTER
+						   : GOMP_MAP_POINTER);
       if (present)
 	{
 	  ptr = gfc_conv_descriptor_data_get (decl);
@@ -2549,11 +2559,19 @@  gfc_trans_omp_clauses (stmtblock_t *block, gfc_omp_clauses *clauses,
 	      if (!n->sym->attr.referenced)
 		continue;
 
+	      bool always_modifier = false;
 	      tree node = build_omp_clause (input_location, OMP_CLAUSE_MAP);
 	      tree node2 = NULL_TREE;
 	      tree node3 = NULL_TREE;
 	      tree node4 = NULL_TREE;
 
+	      /* OpenMP: automatically map pointer targets with the pointer;
+		 hence, always update the descriptor/pointer itself.  */
+              if (!openacc
+		  && ((n->expr == NULL && n->sym->attr.pointer)
+		      || (n->expr && gfc_expr_attr (n->expr).pointer)))
+		always_modifier = true;
+
 	      switch (n->u.map_op)
 		{
 		case OMP_MAP_ALLOC:
@@ -2575,12 +2593,15 @@  gfc_trans_omp_clauses (stmtblock_t *block, gfc_omp_clauses *clauses,
 		  OMP_CLAUSE_SET_MAP_KIND (node, GOMP_MAP_TOFROM);
 		  break;
 		case OMP_MAP_ALWAYS_TO:
+		  always_modifier = true;
 		  OMP_CLAUSE_SET_MAP_KIND (node, GOMP_MAP_ALWAYS_TO);
 		  break;
 		case OMP_MAP_ALWAYS_FROM:
+		  always_modifier = true;
 		  OMP_CLAUSE_SET_MAP_KIND (node, GOMP_MAP_ALWAYS_FROM);
 		  break;
 		case OMP_MAP_ALWAYS_TOFROM:
+		  always_modifier = true;
 		  OMP_CLAUSE_SET_MAP_KIND (node, GOMP_MAP_ALWAYS_TOFROM);
 		  break;
 		case OMP_MAP_RELEASE:
@@ -2760,7 +2781,10 @@  gfc_trans_omp_clauses (stmtblock_t *block, gfc_omp_clauses *clauses,
 			  goto finalize_map_clause;
 			}
 		      else
-			OMP_CLAUSE_SET_MAP_KIND (node3, GOMP_MAP_POINTER);
+			OMP_CLAUSE_SET_MAP_KIND (node3,
+						 always_modifier
+						 ? GOMP_MAP_ALWAYS_POINTER
+						 : GOMP_MAP_POINTER);
 
 		      /* We have to check for n->sym->attr.dimension because
 			 of scalar coarrays.  */
diff --git a/gcc/gimplify.c b/gcc/gimplify.c
index 23d0e25..108525c 100644
--- a/gcc/gimplify.c
+++ b/gcc/gimplify.c
@@ -8803,6 +8803,7 @@  gimplify_scan_omp_clauses (tree *list_p, gimple_seq *pre_p,
 					? GOMP_MAP_DELETE : GOMP_MAP_RELEASE);
 	  else if ((code == OMP_TARGET_EXIT_DATA || code == OMP_TARGET_UPDATE)
 		   && (OMP_CLAUSE_MAP_KIND (c) == GOMP_MAP_POINTER
+		       || OMP_CLAUSE_MAP_KIND (c) == GOMP_MAP_ALWAYS_POINTER
 		       || OMP_CLAUSE_MAP_KIND (c) == GOMP_MAP_TO_PSET))
 	    remove = true;
 
diff --git a/include/gomp-constants.h b/include/gomp-constants.h
index 16f2d13..309cbca 100644
--- a/include/gomp-constants.h
+++ b/include/gomp-constants.h
@@ -171,6 +171,9 @@  enum gomp_map_kind
   (!((X) & GOMP_MAP_FLAG_SPECIAL) \
    && ((X) & GOMP_MAP_FLAG_FROM))
 
+#define GOMP_MAP_ALWAYS_POINTER_P(X) \
+  ((X) == GOMP_MAP_ALWAYS_POINTER)
+
 #define GOMP_MAP_POINTER_P(X) \
   ((X) == GOMP_MAP_POINTER)
 
diff --git a/libgomp/libgomp.h b/libgomp/libgomp.h
index f9080e9..87f939a 100644
--- a/libgomp/libgomp.h
+++ b/libgomp/libgomp.h
@@ -954,6 +954,9 @@  struct target_var_desc {
   bool always_copy_from;
   /* True if this is for OpenACC 'attach'.  */
   bool is_attach;
+  /* If GOMP_MAP_TO_PSET had a NULL pointer; used for Fortran descriptors,
+     which were initially unallocated.  */
+  bool has_null_ptr_assoc;
   /* Relative offset against key host_start.  */
   uintptr_t offset;
   /* Actual length.  */
diff --git a/libgomp/target.c b/libgomp/target.c
index 3e292eb..faef15b 100644
--- a/libgomp/target.c
+++ b/libgomp/target.c
@@ -1,3 +1,4 @@ 
+#pragma GCC optimize("O0")
 /* Copyright (C) 2013-2020 Free Software Foundation, Inc.
    Contributed by Jakub Jelinek <jakub@redhat.com>.
 
@@ -355,7 +356,8 @@  static inline void
 gomp_map_vars_existing (struct gomp_device_descr *devicep,
 			struct goacc_asyncqueue *aq, splay_tree_key oldn,
 			splay_tree_key newn, struct target_var_desc *tgt_var,
-			unsigned char kind, struct gomp_coalesce_buf *cbuf)
+			unsigned char kind, bool always_to_flag,
+			struct gomp_coalesce_buf *cbuf)
 {
   assert (kind != GOMP_MAP_ATTACH);
 
@@ -377,7 +379,7 @@  gomp_map_vars_existing (struct gomp_device_descr *devicep,
 		  (void *) oldn->host_start, (void *) oldn->host_end);
     }
 
-  if (GOMP_MAP_ALWAYS_TO_P (kind))
+  if (GOMP_MAP_ALWAYS_TO_P (kind) || always_to_flag)
     gomp_copy_host2dev (devicep, aq,
 			(void *) (oldn->tgt->tgt_start + oldn->tgt_offset
 				  + newn->host_start - oldn->host_start),
@@ -456,8 +458,8 @@  gomp_map_fields_existing (struct target_mem_desc *tgt,
       && n2->tgt == n->tgt
       && n2->host_start - n->host_start == n2->tgt_offset - n->tgt_offset)
     {
-      gomp_map_vars_existing (devicep, aq, n2, &cur_node,
-			      &tgt->list[i], kind & typemask, cbuf);
+      gomp_map_vars_existing (devicep, aq, n2, &cur_node, &tgt->list[i],
+			      kind & typemask, false, cbuf);
       return;
     }
   if (sizes[i] == 0)
@@ -472,8 +474,8 @@  gomp_map_fields_existing (struct target_mem_desc *tgt,
 	      && n2->host_start - n->host_start
 		 == n2->tgt_offset - n->tgt_offset)
 	    {
-	      gomp_map_vars_existing (devicep, aq, n2, &cur_node,
-				      &tgt->list[i], kind & typemask, cbuf);
+	      gomp_map_vars_existing (devicep, aq, n2, &cur_node, &tgt->list[i],
+				      kind & typemask,false,  cbuf);
 	      return;
 	    }
 	}
@@ -485,7 +487,7 @@  gomp_map_fields_existing (struct target_mem_desc *tgt,
 	  && n2->host_start - n->host_start == n2->tgt_offset - n->tgt_offset)
 	{
 	  gomp_map_vars_existing (devicep, aq, n2, &cur_node, &tgt->list[i],
-				  kind & typemask, cbuf);
+				  kind & typemask, false, cbuf);
 	  return;
 	}
     }
@@ -661,6 +663,7 @@  gomp_map_vars_internal (struct gomp_device_descr *devicep,
 {
   size_t i, tgt_align, tgt_size, not_found_cnt = 0;
   bool has_firstprivate = false;
+  bool has_always_ptrset = false;
   const int rshift = short_mapkind ? 8 : 3;
   const int typemask = short_mapkind ? 0xff : 0x7;
   struct splay_tree_s *mem_map = &devicep->mem_map;
@@ -848,8 +851,46 @@  gomp_map_vars_internal (struct gomp_device_descr *devicep,
       else
 	n = splay_tree_lookup (mem_map, &cur_node);
       if (n && n->refcount != REFCOUNT_LINK)
-	gomp_map_vars_existing (devicep, aq, n, &cur_node, &tgt->list[i],
-				kind & typemask, NULL);
+	{
+	  int always_to_cnt = 0;
+	  if ((kind & typemask) == GOMP_MAP_TO_PSET)
+	    {
+	      bool has_nullptr;
+	      size_t j;
+	      for (j = 0; j < n->tgt->list_count; j++)
+		if (n->tgt->list[j].key == n)
+		  {
+		    has_nullptr = n->tgt->list[j].has_null_ptr_assoc;
+		    break;
+		  }
+	      assert (j < n->tgt->list_count);
+	      /* Re-map the data if there is an 'always' modifier or if it a
+		 null pointer was there and non a nonnull has been found; that
+		 permits transparent re-mapping for Fortran array descriptors
+		 which were previously mapped unallocated.  */
+	      for (j = i + 1; j < mapnum; j++)
+		{
+		  int ptr_kind = get_kind (short_mapkind, kinds, j) & typemask;
+		  if (!GOMP_MAP_ALWAYS_POINTER_P (ptr_kind)
+		      && (!has_nullptr
+			  || !GOMP_MAP_POINTER_P (ptr_kind)
+			  || *(void **) hostaddrs[j] == NULL))
+		    break;
+		  else if ((uintptr_t) hostaddrs[j] < cur_node.host_start
+			   || ((uintptr_t) hostaddrs[j] + sizeof (void *)
+			       > cur_node.host_end))
+		    break;
+		  else
+		    {
+		      has_always_ptrset = true;
+		      ++always_to_cnt;
+		    }
+		}
+	    }
+	  gomp_map_vars_existing (devicep, aq, n, &cur_node, &tgt->list[i],
+				  kind & typemask, always_to_cnt > 0, NULL);
+	  i += always_to_cnt;
+	}
       else
 	{
 	  tgt->list[i].key = NULL;
@@ -881,9 +922,11 @@  gomp_map_vars_internal (struct gomp_device_descr *devicep,
 	  if ((kind & typemask) == GOMP_MAP_TO_PSET)
 	    {
 	      size_t j;
+	      int kind;
 	      for (j = i + 1; j < mapnum; j++)
-		if (!GOMP_MAP_POINTER_P (get_kind (short_mapkind, kinds, j)
-					 & typemask))
+		if (!GOMP_MAP_POINTER_P ((kind = (get_kind (short_mapkind,
+						  kinds, j)) & typemask))
+		    && !GOMP_MAP_ALWAYS_POINTER_P (kind))
 		  break;
 		else if ((uintptr_t) hostaddrs[j] < cur_node.host_start
 			 || ((uintptr_t) hostaddrs[j] + sizeof (void *)
@@ -951,7 +994,7 @@  gomp_map_vars_internal (struct gomp_device_descr *devicep,
     tgt_size = mapnum * sizeof (void *);
 
   tgt->array = NULL;
-  if (not_found_cnt || has_firstprivate)
+  if (not_found_cnt || has_firstprivate || has_always_ptrset)
     {
       if (not_found_cnt)
 	tgt->array = gomp_malloc (not_found_cnt * sizeof (*tgt->array));
@@ -960,7 +1003,55 @@  gomp_map_vars_internal (struct gomp_device_descr *devicep,
       uintptr_t field_tgt_base = 0;
 
       for (i = 0; i < mapnum; i++)
-	if (tgt->list[i].key == NULL)
+	if (has_always_ptrset
+	    && tgt->list[i].key
+	    && (get_kind (short_mapkind, kinds, i) & typemask)
+	       == GOMP_MAP_TO_PSET)
+	  {
+	    splay_tree_key k = tgt->list[i].key;
+	    bool has_nullptr;
+	    size_t j;
+	    for (j = 0; j < k->tgt->list_count; j++)
+	      if (k->tgt->list[j].key == k)
+		{
+		  has_nullptr = k->tgt->list[j].has_null_ptr_assoc;
+		  break;
+		}
+	    assert (j < k->tgt->list_count);
+
+	    tgt->list[i].has_null_ptr_assoc = false;
+	    for (j = i + 1; j < mapnum; j++)
+	      {
+		int ptr_kind = get_kind (short_mapkind, kinds, j) & typemask;
+		if (!GOMP_MAP_ALWAYS_POINTER_P (ptr_kind)
+		    && (!has_nullptr
+			|| !GOMP_MAP_POINTER_P (ptr_kind)
+			|| *(void **) hostaddrs[j] == NULL))
+		  break;
+		else if ((uintptr_t) hostaddrs[j] < k->host_start
+			 || ((uintptr_t) hostaddrs[j] + sizeof (void *)
+			     > k->host_end))
+		  break;
+		else
+		  {
+		    if (*(void **) hostaddrs[j] == NULL)
+		      tgt->list[i].has_null_ptr_assoc = true;
+		    tgt->list[j].key = k;
+		    tgt->list[j].copy_from = false;
+		    tgt->list[j].always_copy_from = false;
+		    tgt->list[j].is_attach = false;
+		    if (k->refcount != REFCOUNT_INFINITY)
+		      k->refcount++;
+		    gomp_map_pointer (k->tgt, aq,
+				      (uintptr_t) *(void **) hostaddrs[j],
+				      k->tgt_offset + ((uintptr_t) hostaddrs[j]
+						       - k->host_start),
+				      sizes[j], cbufp);
+		  }
+	      }
+	    i = j - 1;
+	  }
+	else if (tgt->list[i].key == NULL)
 	  {
 	    int kind = get_kind (short_mapkind, kinds, i);
 	    if (hostaddrs[i] == NULL)
@@ -1120,7 +1211,7 @@  gomp_map_vars_internal (struct gomp_device_descr *devicep,
 	    splay_tree_key n = splay_tree_lookup (mem_map, k);
 	    if (n && n->refcount != REFCOUNT_LINK)
 	      gomp_map_vars_existing (devicep, aq, n, k, &tgt->list[i],
-				      kind & typemask, cbufp);
+				      kind & typemask, false, cbufp);
 	    else
 	      {
 		k->aux = NULL;
@@ -1192,32 +1283,37 @@  gomp_map_vars_internal (struct gomp_device_descr *devicep,
 						  + k->tgt_offset),
 					(void *) k->host_start,
 					k->host_end - k->host_start, cbufp);
+		    tgt->list[i].has_null_ptr_assoc = false;
 
 		    for (j = i + 1; j < mapnum; j++)
-		      if (!GOMP_MAP_POINTER_P (get_kind (short_mapkind, kinds,
-							 j)
-					       & typemask))
-			break;
-		      else if ((uintptr_t) hostaddrs[j] < k->host_start
-			       || ((uintptr_t) hostaddrs[j] + sizeof (void *)
-				   > k->host_end))
-			break;
-		      else
-			{
-			  tgt->list[j].key = k;
-			  tgt->list[j].copy_from = false;
-			  tgt->list[j].always_copy_from = false;
-			  tgt->list[j].is_attach = false;
-			  if (k->refcount != REFCOUNT_INFINITY)
-			    k->refcount++;
-			  gomp_map_pointer (tgt, aq,
-					    (uintptr_t) *(void **) hostaddrs[j],
-					    k->tgt_offset
-					    + ((uintptr_t) hostaddrs[j]
-					       - k->host_start),
-					    sizes[j], cbufp);
-			  i++;
+		      {
+			int ptr_kind = (get_kind (short_mapkind, kinds, j)
+					& typemask);
+			if (!GOMP_MAP_POINTER_P (ptr_kind)
+			    && !GOMP_MAP_ALWAYS_POINTER_P (ptr_kind))
+			  break;
+			else if ((uintptr_t) hostaddrs[j] < k->host_start
+				 || ((uintptr_t) hostaddrs[j] + sizeof (void *)
+				     > k->host_end))
+			  break;
+			else
+			  {
+			    tgt->list[j].key = k;
+			    tgt->list[j].copy_from = false;
+			    tgt->list[j].always_copy_from = false;
+			    tgt->list[j].is_attach = false;
+			    tgt->list[i].has_null_ptr_assoc |= !(*(void **) hostaddrs[j]);
+			    if (k->refcount != REFCOUNT_INFINITY)
+			      k->refcount++;
+			    gomp_map_pointer (tgt, aq,
+					      (uintptr_t) *(void **) hostaddrs[j],
+					      k->tgt_offset
+					      + ((uintptr_t) hostaddrs[j]
+						 - k->host_start),
+					      sizes[j], cbufp);
+			  }
 			}
+		    i = j - 1;
 		    break;
 		  case GOMP_MAP_FORCE_PRESENT:
 		    {
@@ -2481,7 +2577,8 @@  GOMP_target_enter_exit_data (int device, size_t mapnum, void **hostaddrs,
       else if ((kinds[i] & 0xff) == GOMP_MAP_TO_PSET)
 	{
 	  for (j = i + 1; j < mapnum; j++)
-	    if (!GOMP_MAP_POINTER_P (get_kind (true, kinds, j) & 0xff))
+	    if (!GOMP_MAP_POINTER_P (get_kind (true, kinds, j) & 0xff)
+		&& !GOMP_MAP_ALWAYS_POINTER_P (get_kind (true, kinds, j) & 0xff))
 	      break;
 	  gomp_map_vars (devicep, j-i, &hostaddrs[i], NULL, &sizes[i],
 			 &kinds[i], true, GOMP_MAP_VARS_ENTER_DATA);
diff --git a/libgomp/testsuite/libgomp.fortran/map-alloc-ptr-1.f90 b/libgomp/testsuite/libgomp.fortran/map-alloc-ptr-1.f90
new file mode 100644
index 0000000..a1ff1d6
--- /dev/null
+++ b/libgomp/testsuite/libgomp.fortran/map-alloc-ptr-1.f90
@@ -0,0 +1,114 @@ 
+! { dg-do run }
+! 
+! PR fortran/96668
+
+implicit none
+  integer, pointer :: p1(:), p2(:), p3(:)
+  integer, allocatable :: a1(:), a2(:)
+  p1 => null()
+  p3 => null()
+
+  !$omp target enter data map(to:p3)
+
+  !$omp target data map(a1, a2, p1)
+     !$omp target
+       if (allocated (a1)) stop 1
+       if (allocated (a2)) stop 1
+       if (associated (p1)) stop 1
+       if (associated (p3)) stop 1
+     !$omp end target
+
+     allocate (a1, source=[10,11,12,13,14])
+     allocate (a2, source=[10,11,12,13,14])
+     allocate (p1, source=[9,8,7,6,5,4])
+     allocate (p3, source=[4,5,6])
+     p2 => p1
+
+     !$omp target enter data map(to:p3)
+
+     ! allocatable, TR9 requires 'always' modifier:
+     !$omp target map(always, tofrom: a1)
+       if (.not. allocated(a1)) stop 2
+       if (size(a1) /= 5) stop 3
+       if (any (a1 /= [10,11,12,13,14])) stop 5
+       a1(:) = [101, 102, 103, 104, 105]
+     !$omp end target
+
+     ! allocatable, extension (OpenMP 6.0?): without 'always'
+     !$omp target
+       if (.not. allocated(a2)) stop 2
+       if (size(a2) /= 5) stop 3
+       if (any (a2 /= [10,11,12,13,14])) stop 5
+       a2(:) = [101, 102, 103, 104, 105]
+     !$omp end target
+
+     ! pointer: target is automatically mapped
+     ! without requiring an explicit mapping or even the always modifier
+     !$omp target  !! map(always, tofrom: p1)
+       if (.not. associated(p1)) stop 7
+       if (size(p1) /= 6) stop 8
+       if (any (p1 /= [9,8,7,6,5,4])) stop 10
+       p1(:) = [-1, -2, -3, -4, -5, -6]
+     !$omp end target
+
+     !$omp target  !! map(always, tofrom: p3)
+       if (.not. associated(p3)) stop 7
+       if (size(p3) /= 3) stop 8
+       if (any (p3 /= [4,5,6])) stop 10
+       p3(:) = [23,24,25]
+     !$omp end target
+
+     if (any (p1 /= [-1, -2, -3, -4, -5, -6])) stop 141
+
+  !$omp target exit data map(from:p3)
+  !$omp target exit data map(from:p3)
+     if (any (p3 /= [23,24,25])) stop 141
+
+     allocate (p1, source=[99,88,77,66,55,44,33])
+
+     !$omp target  ! And this also should work
+       if (.not. associated(p1)) stop 7
+       if (size(p1) /= 7) stop 8
+       if (any (p1 /= [99,88,77,66,55,44,33])) stop 10
+       p1(:) = [-11, -22, -33, -44, -55, -66, -77]
+     !$omp end target
+  !$omp end target data
+
+  if (any (a1 /= [101, 102, 103, 104, 105])) stop 12
+  if (any (a2 /= [101, 102, 103, 104, 105])) stop 12
+
+  if (any (p1 /= [-11, -22, -33, -44, -55, -66, -77])) stop 142
+  if (any (p2 /= [-1, -2, -3, -4, -5, -6])) stop 143
+
+
+  block
+    integer, pointer :: tmp(:), tmp2(:), tmp3(:)
+    tmp => p1
+    tmp2 => p2
+    tmp3 => p3
+    !$omp target enter data map(to:p3)
+
+    !$omp target data map(to: p1, p2)
+      p1 => null ()
+      p2 => null ()
+      p3 => null ()
+      !$omp target map(always, tofrom: p1)
+        if (associated (p1)) stop 22
+      !$omp end target
+      if (associated (p1)) stop 22
+
+      !$omp target
+        if (associated (p2)) stop 22
+      !$omp end target
+      if (associated (p2)) stop 22
+
+      !$omp target
+        if (associated (p3)) stop 22
+      !$omp end target
+      if (associated (p3)) stop 22
+    !$omp end target data
+    !$omp target exit data map(from:p3)
+    deallocate(tmp, tmp2, tmp3) 
+  end block
+  deallocate(a1, a2)
+end