diff mbox

[libgomp,OpenACC] Additional enter/exit data map handling

Message ID 66f8ce3d-7206-ee8a-abaa-4bb25423e4eb@codesourcery.com
State New
Headers show

Commit Message

Chung-Lin Tang Aug. 29, 2016, 7:46 a.m. UTC
Hi Jakub,
this patch is a port of some changes from gomp-4_0-branch,
including adding additional map type handling in OpenACC enter/exit data
directives, and some pointer set handling changes. Updated
testsuite case are also included.

Tested on trunk to ensure no regressions, is this okay for trunk?

Thanks,
Chung-Lin

2016-08-29  Cesar Philippidis  <cesar@codesourcery.com>
            Thomas Schwinge  <thomas@codesourcery.com>
            Chung-Lin Tang  <cltang@codesourcery.com>

        libgomp/
        * oacc-parallel.c (find_pset): Adjust and rename from...
        (find_pointer): ...this function.
        (GOACC_enter_exit_data): Handle GOMP_MAP_TO and GOMP_MAP_ALLOC,
        adjust find_pointer calls into find_pset, adjust pointer map handling,
        add acc_is_present guards to calls to gomp_acc_insert_pointer and
        gomp_acc_remove_pointer.

        * testsuite/libgomp.oacc-c-c++-common/data-2.c: Update test.
        * testsuite/libgomp.oacc-c-c++-common/enter-data.c: New test.
        * testsuite/libgomp.oacc-fortran/data-2.f90: Update test.

Comments

Chung-Lin Tang Sept. 6, 2016, 11:45 a.m. UTC | #1
Ping.

On 2016/8/29 03:46 PM, Chung-Lin Tang wrote:
> Hi Jakub,
> this patch is a port of some changes from gomp-4_0-branch,
> including adding additional map type handling in OpenACC enter/exit data
> directives, and some pointer set handling changes. Updated
> testsuite case are also included.
> 
> Tested on trunk to ensure no regressions, is this okay for trunk?
> 
> Thanks,
> Chung-Lin
> 
> 2016-08-29  Cesar Philippidis  <cesar@codesourcery.com>
>             Thomas Schwinge  <thomas@codesourcery.com>
>             Chung-Lin Tang  <cltang@codesourcery.com>
> 
>         libgomp/
>         * oacc-parallel.c (find_pset): Adjust and rename from...
>         (find_pointer): ...this function.
>         (GOACC_enter_exit_data): Handle GOMP_MAP_TO and GOMP_MAP_ALLOC,
>         adjust find_pointer calls into find_pset, adjust pointer map handling,
>         add acc_is_present guards to calls to gomp_acc_insert_pointer and
>         gomp_acc_remove_pointer.
> 
>         * testsuite/libgomp.oacc-c-c++-common/data-2.c: Update test.
>         * testsuite/libgomp.oacc-c-c++-common/enter-data.c: New test.
>         * testsuite/libgomp.oacc-fortran/data-2.f90: Update test.
>
Thomas Schwinge Sept. 6, 2016, 12:11 p.m. UTC | #2
Hi!

On Mon, 29 Aug 2016 15:46:47 +0800, Chung-Lin Tang <cltang@codesourcery.com> wrote:
> this patch is a port of some changes from gomp-4_0-branch,
> including adding additional map type handling in OpenACC enter/exit data
> directives, and some pointer set handling changes. Updated
> testsuite case are also included.
> 
> Tested on trunk to ensure no regressions, is this okay for trunk?

> 2016-08-29  Cesar Philippidis  <cesar@codesourcery.com>
>             Thomas Schwinge  <thomas@codesourcery.com>
>             Chung-Lin Tang  <cltang@codesourcery.com>

Maybe I'm misremembering, but I can't remember having been involved in
this.  ;-)

>         libgomp/
>         * oacc-parallel.c (find_pset): Adjust and rename from...
>         (find_pointer): ...this function.
>         (GOACC_enter_exit_data): Handle GOMP_MAP_TO and GOMP_MAP_ALLOC,
>         adjust find_pointer calls into find_pset, adjust pointer map handling,
>         add acc_is_present guards to calls to gomp_acc_insert_pointer and
>         gomp_acc_remove_pointer.

> --- oacc-parallel.c	(revision 239814)
> +++ oacc-parallel.c	(working copy)
> @@ -38,15 +38,23 @@
>  #include <stdarg.h>
>  #include <assert.h>
>  
> +/* Returns the number of mappings associated with the pointer or pset. PSET
> +   have three mappings, whereas pointer have two.  */
> +
>  static int
> -find_pset (int pos, size_t mapnum, unsigned short *kinds)
> +find_pointer (int pos, size_t mapnum, unsigned short *kinds)
>  {
>    if (pos + 1 >= mapnum)
>      return 0;
>  
>    unsigned char kind = kinds[pos+1] & 0xff;
>  
> -  return kind == GOMP_MAP_TO_PSET;
> +  if (kind == GOMP_MAP_TO_PSET)
> +    return 3;
> +  else if (kind == GOMP_MAP_POINTER)
> +    return 2;
> +
> +  return 0;
>  }

I'm still confused about that find_pset/find_pointer handling.  Why is
that required?  Essentially, that means that GOACC_enter_exit_data is
skipping over some mappings, right?  If yes, why do the front ends
(Fortran only?) then emit these mappings to begin with, if we're then
ignoring them in the runtime?

> @@ -298,7 +306,9 @@ GOACC_enter_exit_data (int device, size_t mapnum,
>  
>        if (kind == GOMP_MAP_FORCE_ALLOC
>  	  || kind == GOMP_MAP_FORCE_PRESENT
> -	  || kind == GOMP_MAP_FORCE_TO)
> +	  || kind == GOMP_MAP_FORCE_TO
> +	  || kind == GOMP_MAP_TO
> +	  || kind == GOMP_MAP_ALLOC)
>  	{
>  	  data_enter = true;
>  	  break;
> @@ -312,31 +322,39 @@ GOACC_enter_exit_data (int device, size_t mapnum,
>  		      kind);
>      }
>  
> +  /* In c, non-pointers and arrays are represented by a single data clause.
> +     Dynamically allocated arrays and subarrays are represented by a data
> +     clause followed by an internal GOMP_MAP_POINTER.
> +
> +     In fortran, scalars and not allocated arrays are represented by a
> +     single data clause. Allocated arrays and subarrays have three mappings:
> +     1) the original data clause, 2) a PSET 3) a pointer to the array data.
> +  */
> +
>    if (data_enter)
>      {
>        for (i = 0; i < mapnum; i++)
>  	{
>  	  unsigned char kind = kinds[i] & 0xff;
>  
> -	  /* Scan for PSETs.  */
> -	  int psets = find_pset (i, mapnum, kinds);
> +	  /* Scan for pointers and PSETs.  */
> +	  int pointer = find_pointer (i, mapnum, kinds);
>  
> -	  if (!psets)
> +	  if (!pointer)
>  	    {
>  	      switch (kind)
>  		{
> -		case GOMP_MAP_POINTER:
> -		  gomp_acc_insert_pointer (1, &hostaddrs[i], &sizes[i],
> -					&kinds[i]);
> +		case GOMP_MAP_ALLOC:
> +		  acc_present_or_create (hostaddrs[i], sizes[i]);
>  		  break;
>  		case GOMP_MAP_FORCE_ALLOC:
>  		  acc_create (hostaddrs[i], sizes[i]);
>  		  break;
> -		case GOMP_MAP_FORCE_PRESENT:
> +		case GOMP_MAP_TO:
>  		  acc_present_or_copyin (hostaddrs[i], sizes[i]);
>  		  break;
>  		case GOMP_MAP_FORCE_TO:
> -		  acc_present_or_copyin (hostaddrs[i], sizes[i]);
> +		  acc_copyin (hostaddrs[i], sizes[i]);
>  		  break;
>  		default:
>  		  gomp_fatal (">>>> GOACC_enter_exit_data UNHANDLED kind 0x%.2x",
> @@ -346,12 +364,16 @@ GOACC_enter_exit_data (int device, size_t mapnum,
>  	    }
>  	  else
>  	    {
> -	      gomp_acc_insert_pointer (3, &hostaddrs[i], &sizes[i], &kinds[i]);
> +	      if (!acc_is_present (hostaddrs[i], sizes[i]))
> +		{
> +		  gomp_acc_insert_pointer (pointer, &hostaddrs[i],
> +					   &sizes[i], &kinds[i]);
> +		}
>  	      /* Increment 'i' by two because OpenACC requires fortran
>  		 arrays to be contiguous, so each PSET is associated with
>  		 one of MAP_FORCE_ALLOC/MAP_FORCE_PRESET/MAP_FORCE_TO, and
>  		 one MAP_POINTER.  */
> -	      i += 2;
> +	      i += pointer - 1;
>  	    }
>  	}
>      }
> @@ -360,19 +382,15 @@ GOACC_enter_exit_data (int device, size_t mapnum,
>        {
>  	unsigned char kind = kinds[i] & 0xff;
>  
> -	int psets = find_pset (i, mapnum, kinds);
> +	int pointer = find_pointer (i, mapnum, kinds);
>  
> -	if (!psets)
> +	if (!pointer)
>  	  {
>  	    switch (kind)
>  	      {
> -	      case GOMP_MAP_POINTER:
> -		gomp_acc_remove_pointer (hostaddrs[i], (kinds[i] & 0xff)
> -					 == GOMP_MAP_FORCE_FROM,
> -					 async, 1);
> -		break;
>  	      case GOMP_MAP_DELETE:
> -		acc_delete (hostaddrs[i], sizes[i]);
> +		if (acc_is_present (hostaddrs[i], sizes[i]))
> +		  acc_delete (hostaddrs[i], sizes[i]);
>  		break;
>  	      case GOMP_MAP_FORCE_FROM:
>  		acc_copyout (hostaddrs[i], sizes[i]);
> @@ -385,10 +403,14 @@ GOACC_enter_exit_data (int device, size_t mapnum,
>  	  }
>  	else
>  	  {
> -	    gomp_acc_remove_pointer (hostaddrs[i], (kinds[i] & 0xff)
> -				     == GOMP_MAP_FORCE_FROM, async, 3);
> -	    /* See the above comment.  */
> -	    i += 2;
> +	    if (acc_is_present (hostaddrs[i], sizes[i]))
> +	      {
> +		gomp_acc_remove_pointer (hostaddrs[i], (kinds[i] & 0xff)
> +					 == GOMP_MAP_FORCE_FROM, async,
> +					 pointer);
> +		/* See the above comment.  */
> +	      }
> +	    i += pointer - 1;
>  	  }
>        }
>  


Grüße
 Thomas
Chung-Lin Tang Sept. 8, 2016, 11:18 a.m. UTC | #3
On 2016/9/6 8:11 PM, Thomas Schwinge wrote:
> Hi!
> 
> On Mon, 29 Aug 2016 15:46:47 +0800, Chung-Lin Tang <cltang@codesourcery.com> wrote:
>> this patch is a port of some changes from gomp-4_0-branch,
>> including adding additional map type handling in OpenACC enter/exit data
>> directives, and some pointer set handling changes. Updated
>> testsuite case are also included.
>>
>> Tested on trunk to ensure no regressions, is this okay for trunk?
> 
>> 2016-08-29  Cesar Philippidis  <cesar@codesourcery.com>
>>             Thomas Schwinge  <thomas@codesourcery.com>
>>             Chung-Lin Tang  <cltang@codesourcery.com>
> 
> Maybe I'm misremembering, but I can't remember having been involved in
> this.  ;-)

A part of this was picked from r223178, which you committed to gomp-4_0-branch.

>> +/* Returns the number of mappings associated with the pointer or pset. PSET
>> +   have three mappings, whereas pointer have two.  */
>> +
>>  static int
>> -find_pset (int pos, size_t mapnum, unsigned short *kinds)
>> +find_pointer (int pos, size_t mapnum, unsigned short *kinds)
>>  {
>>    if (pos + 1 >= mapnum)
>>      return 0;
>>  
>>    unsigned char kind = kinds[pos+1] & 0xff;
>>  
>> -  return kind == GOMP_MAP_TO_PSET;
>> +  if (kind == GOMP_MAP_TO_PSET)
>> +    return 3;
>> +  else if (kind == GOMP_MAP_POINTER)
>> +    return 2;
>> +
>> +  return 0;
>>  }
> 
> I'm still confused about that find_pset/find_pointer handling.  Why is
> that required?  Essentially, that means that GOACC_enter_exit_data is
> skipping over some mappings, right?  If yes, why do the front ends
> (Fortran only?) then emit these mappings to begin with, if we're then
> ignoring them in the runtime?

It's not skipping mappings. GOMP_MAP_PSET uses 3 continuous entries while
GOMP_MAP_POINTER uses 2, see how these are eventually processed together
in gomp_map_vars().

Chung-Lin
Thomas Schwinge Sept. 8, 2016, 1:29 p.m. UTC | #4
Hi!

On Thu, 8 Sep 2016 19:18:30 +0800, Chung-Lin Tang <chunglin_tang@mentor.com> wrote:
> On 2016/9/6 8:11 PM, Thomas Schwinge wrote:
> > On Mon, 29 Aug 2016 15:46:47 +0800, Chung-Lin Tang <cltang@codesourcery.com> wrote:
> >> this patch is a port of some changes from gomp-4_0-branch,
> >> including adding additional map type handling in OpenACC enter/exit data
> >> directives, and some pointer set handling changes. Updated
> >> testsuite case are also included.
> >>
> >> Tested on trunk to ensure no regressions, is this okay for trunk?
> > 
> >> 2016-08-29  Cesar Philippidis  <cesar@codesourcery.com>
> >>             Thomas Schwinge  <thomas@codesourcery.com>
> >>             Chung-Lin Tang  <cltang@codesourcery.com>
> > 
> > Maybe I'm misremembering, but I can't remember having been involved in
> > this.  ;-)
> 
> A part of this was picked from r223178, which you committed to gomp-4_0-branch.

Heh, right, though that was a commit containing "Assorted OpenACC
changes", so merging various changes from our internal development
branch, done by several people.  Anyway, nothing to waste much time on.
;-)


> >> +/* Returns the number of mappings associated with the pointer or pset. PSET
> >> +   have three mappings, whereas pointer have two.  */
> >> +
> >>  static int
> >> -find_pset (int pos, size_t mapnum, unsigned short *kinds)
> >> +find_pointer (int pos, size_t mapnum, unsigned short *kinds)
> >>  {
> >>    if (pos + 1 >= mapnum)
> >>      return 0;
> >>  
> >>    unsigned char kind = kinds[pos+1] & 0xff;
> >>  
> >> -  return kind == GOMP_MAP_TO_PSET;
> >> +  if (kind == GOMP_MAP_TO_PSET)
> >> +    return 3;
> >> +  else if (kind == GOMP_MAP_POINTER)
> >> +    return 2;
> >> +
> >> +  return 0;
> >>  }
> > 
> > I'm still confused about that find_pset/find_pointer handling.  Why is
> > that required?  Essentially, that means that GOACC_enter_exit_data is
> > skipping over some mappings, right?  If yes, why do the front ends
> > (Fortran only?) then emit these mappings to begin with, if we're then
> > ignoring them in the runtime?
> 
> It's not skipping mappings. GOMP_MAP_PSET uses 3 continuous entries while
> GOMP_MAP_POINTER uses 2, see how these are eventually processed together
> in gomp_map_vars().

I now see how for the "pointer != 0" case, *the address of*
"hostaddrs[i]" etc. is passed to gomp_acc_insert_pointer, which then
calls gomp_map_vars.  So, you're (or more precisely, those who once
committed these changes to our internal development branch) indeed just
extend the existing GOMP_MAP_TO_PSET handling to also cover
GOMP_MAP_POINTER.  This code still doesn't look very pretty generally,
but that's not your task to fix, right now.


Thus, your patch is back in the queue, waiting for approval.


Grüße
 Thomas
Chung-Lin Tang Sept. 19, 2016, 5:06 a.m. UTC | #5
Ping.

On 2016/9/6 7:45 PM, Chung-Lin Tang wrote:
> Ping.
> 
> On 2016/8/29 03:46 PM, Chung-Lin Tang wrote:
>> Hi Jakub,
>> this patch is a port of some changes from gomp-4_0-branch,
>> including adding additional map type handling in OpenACC enter/exit data
>> directives, and some pointer set handling changes. Updated
>> testsuite case are also included.
>>
>> Tested on trunk to ensure no regressions, is this okay for trunk?
>>
>> Thanks,
>> Chung-Lin
>>
>> 2016-08-29  Cesar Philippidis  <cesar@codesourcery.com>
>>             Thomas Schwinge  <thomas@codesourcery.com>
>>             Chung-Lin Tang  <cltang@codesourcery.com>
>>
>>         libgomp/
>>         * oacc-parallel.c (find_pset): Adjust and rename from...
>>         (find_pointer): ...this function.
>>         (GOACC_enter_exit_data): Handle GOMP_MAP_TO and GOMP_MAP_ALLOC,
>>         adjust find_pointer calls into find_pset, adjust pointer map handling,
>>         add acc_is_present guards to calls to gomp_acc_insert_pointer and
>>         gomp_acc_remove_pointer.
>>
>>         * testsuite/libgomp.oacc-c-c++-common/data-2.c: Update test.
>>         * testsuite/libgomp.oacc-c-c++-common/enter-data.c: New test.
>>         * testsuite/libgomp.oacc-fortran/data-2.f90: Update test.
>>
>
Cesar Philippidis Sept. 20, 2016, 4:43 p.m. UTC | #6
On 08/29/2016 12:46 AM, Chung-Lin Tang wrote:

> Index: oacc-parallel.c
> ===================================================================
> --- oacc-parallel.c	(revision 239814)
> +++ oacc-parallel.c	(working copy)
> @@ -38,15 +38,23 @@
>  #include <stdarg.h>
>  #include <assert.h>
>  
> +/* Returns the number of mappings associated with the pointer or pset. PSET
> +   have three mappings, whereas pointer have two.  */
> +
>  static int
> -find_pset (int pos, size_t mapnum, unsigned short *kinds)
> +find_pointer (int pos, size_t mapnum, unsigned short *kinds)
>  {
>    if (pos + 1 >= mapnum)
>      return 0;
>  
>    unsigned char kind = kinds[pos+1] & 0xff;
>  
> -  return kind == GOMP_MAP_TO_PSET;
> +  if (kind == GOMP_MAP_TO_PSET)
> +    return 3;
> +  else if (kind == GOMP_MAP_POINTER)
> +    return 2;
> +
> +  return 0;
>  }

Is this still necessary with the firstprivatization of subarrays
pointers? Well, it might be for fortran. Conceptually, the gimplifier
should prune out those unnecessary firstprivate pointer clauses for
executable constructs such as enter/exit data and update.

Actually, this is one area in the spec where the intent of enter/exit
data conflicts with what it describes. If you look at the runtime
documentation for, say, acc_create, it states that

  acc_create (pvar, n*sizeof(var))

is equivalent to

  acc enter data create (pvar[n])

And to free acc_create, you use acc_delete. So in theory, you should be
able to

  #pragma acc enter data create (pvar[n])
  acc_free (pvar)

but this may result in a memory leak if the pointer mapping isn't freed.

Fortran is somewhat special because of the pointer sets. I'm not sure if
its possible to make the OpenACC runtime API compatible with enter/exit
data.

>  static void goacc_wait (int async, int num_waits, va_list *ap);
> @@ -298,7 +306,9 @@ GOACC_enter_exit_data (int device, size_t mapnum,
>  
>        if (kind == GOMP_MAP_FORCE_ALLOC
>  	  || kind == GOMP_MAP_FORCE_PRESENT
> -	  || kind == GOMP_MAP_FORCE_TO)
> +	  || kind == GOMP_MAP_FORCE_TO
> +	  || kind == GOMP_MAP_TO
> +	  || kind == GOMP_MAP_ALLOC)
>  	{
>  	  data_enter = true;
>  	  break;
> @@ -312,31 +322,39 @@ GOACC_enter_exit_data (int device, size_t mapnum,
>  		      kind);
>      }
>  
> +  /* In c, non-pointers and arrays are represented by a single data clause.
> +     Dynamically allocated arrays and subarrays are represented by a data
> +     clause followed by an internal GOMP_MAP_POINTER.
> +
> +     In fortran, scalars and not allocated arrays are represented by a
> +     single data clause. Allocated arrays and subarrays have three mappings:
> +     1) the original data clause, 2) a PSET 3) a pointer to the array data.
> +  */
> +
>    if (data_enter)
>      {
>        for (i = 0; i < mapnum; i++)
>  	{
>  	  unsigned char kind = kinds[i] & 0xff;
>  
> -	  /* Scan for PSETs.  */
> -	  int psets = find_pset (i, mapnum, kinds);
> +	  /* Scan for pointers and PSETs.  */
> +	  int pointer = find_pointer (i, mapnum, kinds);
>  
> -	  if (!psets)
> +	  if (!pointer)
>  	    {
>  	      switch (kind)
>  		{
> -		case GOMP_MAP_POINTER:
> -		  gomp_acc_insert_pointer (1, &hostaddrs[i], &sizes[i],
> -					&kinds[i]);
> +		case GOMP_MAP_ALLOC:
> +		  acc_present_or_create (hostaddrs[i], sizes[i]);
>  		  break;
>  		case GOMP_MAP_FORCE_ALLOC:
>  		  acc_create (hostaddrs[i], sizes[i]);
>  		  break;
> -		case GOMP_MAP_FORCE_PRESENT:
> +		case GOMP_MAP_TO:
>  		  acc_present_or_copyin (hostaddrs[i], sizes[i]);
>  		  break;
>  		case GOMP_MAP_FORCE_TO:
> -		  acc_present_or_copyin (hostaddrs[i], sizes[i]);
> +		  acc_copyin (hostaddrs[i], sizes[i]);
>  		  break;

Thanks for correcting that. I had some of those data mappings wrong.

>  		default:
>  		  gomp_fatal (">>>> GOACC_enter_exit_data UNHANDLED kind 0x%.2x",
> @@ -346,12 +364,16 @@ GOACC_enter_exit_data (int device, size_t mapnum,
>  	    }
>  	  else
>  	    {
> -	      gomp_acc_insert_pointer (3, &hostaddrs[i], &sizes[i], &kinds[i]);
> +	      if (!acc_is_present (hostaddrs[i], sizes[i]))
> +		{
> +		  gomp_acc_insert_pointer (pointer, &hostaddrs[i],
> +					   &sizes[i], &kinds[i]);
> +		}
>  	      /* Increment 'i' by two because OpenACC requires fortran
>  		 arrays to be contiguous, so each PSET is associated with
>  		 one of MAP_FORCE_ALLOC/MAP_FORCE_PRESET/MAP_FORCE_TO, and
>  		 one MAP_POINTER.  */
> -	      i += 2;
> +	      i += pointer - 1;
>  	    }
>  	}
>      }
> @@ -360,19 +382,15 @@ GOACC_enter_exit_data (int device, size_t mapnum,
>        {
>  	unsigned char kind = kinds[i] & 0xff;
>  
> -	int psets = find_pset (i, mapnum, kinds);
> +	int pointer = find_pointer (i, mapnum, kinds);
>  
> -	if (!psets)
> +	if (!pointer)
>  	  {
>  	    switch (kind)
>  	      {
> -	      case GOMP_MAP_POINTER:
> -		gomp_acc_remove_pointer (hostaddrs[i], (kinds[i] & 0xff)
> -					 == GOMP_MAP_FORCE_FROM,
> -					 async, 1);
> -		break;
>  	      case GOMP_MAP_DELETE:
> -		acc_delete (hostaddrs[i], sizes[i]);
> +		if (acc_is_present (hostaddrs[i], sizes[i]))
> +		  acc_delete (hostaddrs[i], sizes[i]);
>  		break;
>  	      case GOMP_MAP_FORCE_FROM:
>  		acc_copyout (hostaddrs[i], sizes[i]);
> @@ -385,10 +403,14 @@ GOACC_enter_exit_data (int device, size_t mapnum,
>  	  }
>  	else
>  	  {
> -	    gomp_acc_remove_pointer (hostaddrs[i], (kinds[i] & 0xff)
> -				     == GOMP_MAP_FORCE_FROM, async, 3);
> -	    /* See the above comment.  */
> -	    i += 2;
> +	    if (acc_is_present (hostaddrs[i], sizes[i]))
> +	      {
> +		gomp_acc_remove_pointer (hostaddrs[i], (kinds[i] & 0xff)
> +					 == GOMP_MAP_FORCE_FROM, async,
> +					 pointer);
> +		/* See the above comment.  */
> +	      }
> +	    i += pointer - 1;
>  	  }
>        }
>  
> 
> 
> libgomp-enter-exit-testsuite.patch
> 
> 
> Index: testsuite/libgomp.oacc-c-c++-common/data-2.c
> ===================================================================
> --- testsuite/libgomp.oacc-c-c++-common/data-2.c	(revision 239814)
> +++ testsuite/libgomp.oacc-c-c++-common/data-2.c	(working copy)
> @@ -3,6 +3,7 @@
>  /* { dg-do run } */
>  
>  #include <stdlib.h>
> +#include <openacc.h>
>  
>  int
>  main (int argc, char **argv)
> @@ -32,7 +33,7 @@ main (int argc, char **argv)
>    for (i = 0; i < N; i++)
>      b[i] = a[i];
>  
> -#pragma acc exit data copyout (a[0:N]) copyout (b[0:N]) wait async
> +#pragma acc exit data copyout (a[0:N], b[0:N]) delete (N) wait async
>  #pragma acc wait

One note about these tests in general. I wonder if we should also be
testing subarrays with non-zero base offsets. We already hit one bug
with local arrays.

Cesar
Chung-Lin Tang Nov. 3, 2016, 2:22 p.m. UTC | #7
Ping this patch again.

On 2016/9/21 12:43 AM, Cesar Philippidis wrote:
>> +/* Returns the number of mappings associated with the pointer or pset. PSET
>> > +   have three mappings, whereas pointer have two.  */
>> > +
>> >  static int
>> > -find_pset (int pos, size_t mapnum, unsigned short *kinds)
>> > +find_pointer (int pos, size_t mapnum, unsigned short *kinds)
>> >  {
>> >    if (pos + 1 >= mapnum)
>> >      return 0;
>> >  
>> >    unsigned char kind = kinds[pos+1] & 0xff;
>> >  
>> > -  return kind == GOMP_MAP_TO_PSET;
>> > +  if (kind == GOMP_MAP_TO_PSET)
>> > +    return 3;
>> > +  else if (kind == GOMP_MAP_POINTER)
>> > +    return 2;
>> > +
>> > +  return 0;
>> >  }
> Is this still necessary with the firstprivatization of subarrays
> pointers? Well, it might be for fortran. Conceptually, the gimplifier
> should prune out those unnecessary firstprivate pointer clauses for
> executable constructs such as enter/exit data and update.

It appears that GOMP_MAP_POINTER/GOMP_MAP_TO_PSET maps are currently
created only from the Fortran FE, so I think your description is accurate.

> Actually, this is one area in the spec where the intent of enter/exit
> data conflicts with what it describes. If you look at the runtime
> documentation for, say, acc_create, it states that
> 
>   acc_create (pvar, n*sizeof(var))
> 
> is equivalent to
> 
>   acc enter data create (pvar[n])
> 
> And to free acc_create, you use acc_delete. So in theory, you should be
> able to
> 
>   #pragma acc enter data create (pvar[n])
>   acc_free (pvar)
> 
> but this may result in a memory leak if the pointer mapping isn't freed.

Upon re-reading the OpenACC spec, it appears that acc_malloc/acc_free are supposed
to be "dumb" allocation/deallocation interfaces, i.e. the implementation is likely
to be something that directly wires to the alloc_func/free_func plugin hooks.
I don't think it's supposed to be something that works with the enter/exit data directives,
or anything that works on the maps managed by libgomp.

Chung-Lin
Chung-Lin Tang Dec. 6, 2016, 3:49 p.m. UTC | #8
Ping.

On 2016/11/3 10:22 PM, Chung-Lin Tang wrote:
> 
> Ping this patch again.
> 
> On 2016/9/21 12:43 AM, Cesar Philippidis wrote:
>>> +/* Returns the number of mappings associated with the pointer or pset. PSET
>>>> +   have three mappings, whereas pointer have two.  */
>>>> +
>>>>  static int
>>>> -find_pset (int pos, size_t mapnum, unsigned short *kinds)
>>>> +find_pointer (int pos, size_t mapnum, unsigned short *kinds)
>>>>  {
>>>>    if (pos + 1 >= mapnum)
>>>>      return 0;
>>>>  
>>>>    unsigned char kind = kinds[pos+1] & 0xff;
>>>>  
>>>> -  return kind == GOMP_MAP_TO_PSET;
>>>> +  if (kind == GOMP_MAP_TO_PSET)
>>>> +    return 3;
>>>> +  else if (kind == GOMP_MAP_POINTER)
>>>> +    return 2;
>>>> +
>>>> +  return 0;
>>>>  }
>> Is this still necessary with the firstprivatization of subarrays
>> pointers? Well, it might be for fortran. Conceptually, the gimplifier
>> should prune out those unnecessary firstprivate pointer clauses for
>> executable constructs such as enter/exit data and update.
> 
> It appears that GOMP_MAP_POINTER/GOMP_MAP_TO_PSET maps are currently
> created only from the Fortran FE, so I think your description is accurate.
> 
>> Actually, this is one area in the spec where the intent of enter/exit
>> data conflicts with what it describes. If you look at the runtime
>> documentation for, say, acc_create, it states that
>>
>>   acc_create (pvar, n*sizeof(var))
>>
>> is equivalent to
>>
>>   acc enter data create (pvar[n])
>>
>> And to free acc_create, you use acc_delete. So in theory, you should be
>> able to
>>
>>   #pragma acc enter data create (pvar[n])
>>   acc_free (pvar)
>>
>> but this may result in a memory leak if the pointer mapping isn't freed.
> 
> Upon re-reading the OpenACC spec, it appears that acc_malloc/acc_free are supposed
> to be "dumb" allocation/deallocation interfaces, i.e. the implementation is likely
> to be something that directly wires to the alloc_func/free_func plugin hooks.
> I don't think it's supposed to be something that works with the enter/exit data directives,
> or anything that works on the maps managed by libgomp.
> 
> Chung-Lin
> 
> 
>
diff mbox

Patch

Index: testsuite/libgomp.oacc-c-c++-common/data-2.c
===================================================================
--- testsuite/libgomp.oacc-c-c++-common/data-2.c	(revision 239814)
+++ testsuite/libgomp.oacc-c-c++-common/data-2.c	(working copy)
@@ -3,6 +3,7 @@ 
 /* { dg-do run } */
 
 #include <stdlib.h>
+#include <openacc.h>
 
 int
 main (int argc, char **argv)
@@ -32,7 +33,7 @@  main (int argc, char **argv)
   for (i = 0; i < N; i++)
     b[i] = a[i];
 
-#pragma acc exit data copyout (a[0:N]) copyout (b[0:N]) wait async
+#pragma acc exit data copyout (a[0:N], b[0:N]) delete (N) wait async
 #pragma acc wait
 
   for (i = 0; i < N; i++)
@@ -46,6 +47,32 @@  main (int argc, char **argv)
 
   for (i = 0; i < N; i++)
     {
+      a[i] = 3.0;
+      b[i] = 0.0;
+    }
+
+#pragma acc enter data copyin (a[0:N]) async 
+#pragma acc enter data copyin (b[0:N]) async wait
+#pragma acc enter data copyin (N) async wait
+#pragma acc parallel async wait present (a[0:N]) present (b[0:N]) present (N)
+#pragma acc loop
+  for (i = 0; i < N; i++)
+    b[i] = a[i];
+
+#pragma acc exit data copyout (a[0:N]) copyout (b[0:N]) delete (N) wait async
+#pragma acc wait
+
+  for (i = 0; i < N; i++)
+    {
+      if (a[i] != 3.0)
+	abort ();
+
+      if (b[i] != 3.0)
+	abort ();
+    }
+
+  for (i = 0; i < N; i++)
+    {
       a[i] = 2.0;
       b[i] = 0.0;
     }
@@ -56,7 +83,7 @@  main (int argc, char **argv)
   for (i = 0; i < N; i++)
     b[i] = a[i];
 
-#pragma acc exit data copyout (a[0:N]) copyout (b[0:N]) wait (1) async (1)
+#pragma acc exit data copyout (a[0:N]) copyout (b[0:N]) delete (N) wait (1) async (1)
 #pragma acc wait (1)
 
   for (i = 0; i < N; i++)
@@ -93,7 +120,7 @@  main (int argc, char **argv)
   for (i = 0; i < N; i++)
     d[i] = ((a[i] * a[i] + a[i]) / a[i]) - a[i];
 
-#pragma acc exit data copyout (a[0:N]) copyout (b[0:N]) copyout (c[0:N]) copyout (d[0:N]) wait (1, 2, 3) async (1)
+#pragma acc exit data copyout (a[0:N]) copyout (b[0:N]) copyout (c[0:N]) copyout (d[0:N]) delete (N) wait (1, 2, 3) async (1)
 #pragma acc wait (1)
 
   for (i = 0; i < N; i++)
@@ -161,5 +188,156 @@  main (int argc, char **argv)
 	abort ();
     }
 
+#if !ACC_MEM_SHARED
+  for (i = 0; i < N; i++)
+    {
+      a[i] = 3.0;
+      b[i] = 0.0;
+    }
+
+#pragma acc enter data present_or_copyin (a[0:N])
+
+  if (!acc_is_present (a, nbytes))
+    abort ();
+
+#pragma acc exit data copyout (a[0:N])
+
+  if (acc_is_present (a, nbytes))
+    abort ();
+
+#pragma acc exit data delete (a[0:N])
+
+  if (acc_is_present (a, nbytes))
+    abort ();
+
+#pragma acc enter data present_or_copyin (a[0:N], b[0:N])
+
+  if (!acc_is_present (a, nbytes))
+    abort ();
+
+  if (!acc_is_present (b, nbytes))
+    abort ();
+
+#pragma acc exit data copyout (a[0:N], b[0:N])
+
+  if (acc_is_present (a, nbytes))
+    abort ();
+
+  if (acc_is_present (b, nbytes))
+    abort ();
+
+#pragma acc exit data delete (a[0:N], b[0:N])
+
+  if (acc_is_present (a, nbytes))
+    abort ();
+
+  if (acc_is_present (b, nbytes))
+    abort ();
+
+#pragma acc enter data present_or_create (a[0:N], b[0:N])
+
+  if (!acc_is_present (a, nbytes))
+    abort ();
+
+  if (!acc_is_present (b, nbytes))
+    abort ();
+
+#pragma acc exit data copyout (a[0:N], b[0:N])
+
+  if (acc_is_present (a, nbytes))
+    abort ();
+
+  if (acc_is_present (b, nbytes))
+    abort ();
+
+#pragma acc enter data present_or_create (a[0:N], b[0:N])
+
+  if (!acc_is_present (a, nbytes))
+    abort ();
+
+  if (!acc_is_present (b, nbytes))
+    abort ();
+
+#pragma acc enter data present_or_create (a[0:N], b[0:N])
+
+  if (!acc_is_present (a, nbytes))
+    abort ();
+
+  if (!acc_is_present (b, nbytes))
+    abort ();
+
+#pragma acc exit data delete (a[0:N], b[0:N])
+
+  if (acc_is_present (a, nbytes))
+    abort ();
+
+  if (acc_is_present (b, nbytes))
+    abort ();
+
+#pragma acc exit data delete (a[0:N], b[0:N])
+
+  if (acc_is_present (a, nbytes))
+    abort ();
+
+  if (acc_is_present (b, nbytes))
+    abort ();
+
+#pragma acc enter data create (a[0:N], b[0:N])
+
+  if (!acc_is_present (a, nbytes))
+    abort ();
+
+  if (!acc_is_present (b, nbytes))
+    abort ();
+
+#pragma acc enter data present_or_copyin (a[0:N], b[0:N])
+
+  if (!acc_is_present (a, nbytes))
+    abort ();
+
+  if (!acc_is_present (b, nbytes))
+    abort ();
+
+#pragma acc exit data delete (a[0:N], b[0:N])
+
+  if (acc_is_present (a, nbytes))
+    abort ();
+
+  if (acc_is_present (b, nbytes))
+    abort ();
+
+#pragma acc exit data delete (a[0:N], b[0:N])
+
+  if (acc_is_present (a, nbytes))
+    abort ();
+
+  if (acc_is_present (b, nbytes))
+    abort ();
+
+#pragma acc enter data present_or_copyin (a[0:N])
+
+  if (!acc_is_present (a, nbytes))
+    abort ();
+
+  if (acc_is_present (b, nbytes))
+    abort ();
+
+#pragma acc exit data delete (a[0:N], b[0:N])
+
+  if (acc_is_present (a, nbytes))
+    abort ();
+
+  if (acc_is_present (b, nbytes))
+    abort ();
+
+#pragma acc exit data delete (a[0:N], b[0:N])
+
+  if (acc_is_present (a, nbytes))
+    abort ();
+
+  if (acc_is_present (b, nbytes))
+    abort ();
+#endif
+
   return 0;
 }
Index: testsuite/libgomp.oacc-c-c++-common/enter-data.c
===================================================================
--- testsuite/libgomp.oacc-c-c++-common/enter-data.c	(revision 0)
+++ testsuite/libgomp.oacc-c-c++-common/enter-data.c	(revision 0)
@@ -0,0 +1,23 @@ 
+/* This test verifies that the present data clauses to acc enter data
+   don't cause duplicate mapping failures at runtime.  */
+
+/* { dg-do run } */
+
+#include <stdlib.h>
+
+int
+main (void)
+{
+  int a;
+
+#pragma acc enter data copyin (a)
+#pragma acc enter data pcopyin (a)
+#pragma acc enter data pcreate (a)
+#pragma acc exit data delete (a)
+
+#pragma acc enter data create (a)
+#pragma acc enter data pcreate (a)
+#pragma acc exit data delete (a)
+
+  return 0;
+}
Index: testsuite/libgomp.oacc-fortran/data-2.f90
===================================================================
--- testsuite/libgomp.oacc-fortran/data-2.f90	(revision 239814)
+++ testsuite/libgomp.oacc-fortran/data-2.f90	(working copy)
@@ -1,9 +1,16 @@ 
 ! { dg-do run }
+! { dg-additional-options "-cpp" }
 
 program test
+  use openacc
   integer, parameter :: N = 8
   real, allocatable :: a(:,:), b(:,:)
+  real, allocatable :: c(:), d(:)
+  integer i, j
 
+  i = 0
+  j = 0
+
   allocate (a(N,N))
   allocate (b(N,N))
 
@@ -12,7 +19,7 @@  program test
 
   !$acc enter data copyin (a(1:N,1:N), b(1:N,1:N))
 
-  !$acc parallel
+  !$acc parallel present (a(1:N,1:N), b(1:N,1:N))
   do i = 1, n
     do j = 1, n
       b(j,i) = a (j,i)
@@ -28,4 +35,171 @@  program test
       if (b(j,i) .ne. 3.0) call abort
     end do
   end do
+
+  allocate (c(N))
+  allocate (d(N))
+
+  c(:) = 3.0
+  d(:) = 0.0
+
+  !$acc enter data copyin (c(1:N)) create (d(1:N)) async
+  !$acc wait
+  
+  !$acc parallel present (c(1:N), d(1:N))
+    do i = 1, N
+      d(i) = c(i) + 1
+    end do
+  !$acc end parallel
+
+  !$acc exit data copyout (c(1:N), d(1:N)) async
+  !$acc wait
+
+  do i = 1, N
+    if (d(i) .ne. 4.0) call abort
+  end do
+
+  c(:) = 3.0
+  d(:) = 0.0
+
+  !$acc enter data copyin (c(1:N)) async
+  !$acc enter data create (d(1:N)) wait
+  !$acc wait
+
+  !$acc parallel present (c(1:N), d(1:N))
+    do i = 1, N
+      d(i) = c(i) + 1
+    end do
+  !$acc end parallel
+  
+  !$acc exit data delete (c(1:N)) copyout (d(1:N)) async
+  !$acc exit data async
+  !$acc wait
+
+  do i = 1, N
+    if (d(i) .ne. 4.0) call abort
+  end do
+
+#if !ACC_MEM_SHARED
+
+  c(:) = 3.0
+  d(:) = 0.0
+
+  !$acc enter data present_or_copyin (c(0:N))
+
+  if (acc_is_present (c) .eqv. .FALSE.) call abort
+
+  !$acc exit data copyout (c(0:N))
+
+  if (acc_is_present (c) .eqv. .TRUE.) call abort
+
+  !$acc exit data delete (c(0:N))
+
+  if (acc_is_present (c) .eqv. .TRUE.) call abort
+
+  do i = 1, N
+    if (c(i) .ne. 3.0) call abort
+  end do
+
+  c(:) = 5.0
+  d(:) = 9.0
+
+  !$acc enter data present_or_copyin (c(0:N), d(0:N))
+
+  if (acc_is_present (c) .eqv. .FALSE.) call abort
+  if (acc_is_present (d) .eqv. .FALSE.) call abort
+
+  !$acc exit data copyout (c(0:N), d(0:N))
+
+  if (acc_is_present (c) .eqv. .TRUE.) call abort
+  if (acc_is_present (d) .eqv. .TRUE.) call abort
+
+  !$acc exit data delete (c(0:N), d(0:N))
+
+  if (acc_is_present (c) .eqv. .TRUE.) call abort
+  if (acc_is_present (d) .eqv. .TRUE.) call abort
+
+  do i = 1, N
+    if (c(i) .ne. 5.0) call abort
+    if (d(i) .ne. 9.0) call abort
+  end do
+
+  !$acc enter data present_or_create (c(0:N), d(0:N))
+
+  if (acc_is_present (c) .eqv. .FALSE.) call abort
+  if (acc_is_present (d) .eqv. .FALSE.) call abort
+
+  !$acc parallel present (c(0:N), d(0:N))
+    do i = 1, N
+      c(i) = 1.0;
+      d(i) = 2.0;
+    end do
+  !$acc end parallel
+
+  !$acc exit data copyout (c(0:N), d(0:N))
+
+  if (acc_is_present (c) .eqv. .TRUE.) call abort
+  if (acc_is_present (d) .eqv. .TRUE.) call abort
+
+  do i = 1, N
+    if (c(i) .ne. 1.0) call abort
+    if (d(i) .ne. 2.0) call abort
+  end do
+
+  !$acc enter data present_or_create (c(0:N), d(0:N))
+
+  if (acc_is_present (c) .eqv. .FALSE.) call abort
+  if (acc_is_present (d) .eqv. .FALSE.) call abort
+
+  !$acc enter data present_or_create (c(0:N), d(0:N))
+
+  if (acc_is_present (c) .eqv. .FALSE.) call abort
+  if (acc_is_present (d) .eqv. .FALSE.) call abort
+
+  !$acc exit data delete (c(0:N), d(0:N))
+
+  if (acc_is_present (c) .eqv. .TRUE.) call abort
+  if (acc_is_present (d) .eqv. .TRUE.) call abort
+
+  !$acc exit data delete (c(0:N), d(0:N))
+
+  if (acc_is_present (c) .eqv. .TRUE.) call abort
+  if (acc_is_present (d) .eqv. .TRUE.) call abort
+
+  !$acc enter data create (c(0:N), d(0:N))
+
+  if (acc_is_present (c) .eqv. .FALSE.) call abort
+  if (acc_is_present (d) .eqv. .FALSE.) call abort
+
+  !$acc enter data present_or_copyin (c(0:N), d(0:N))
+
+  if (acc_is_present (c) .eqv. .FALSE.) call abort
+  if (acc_is_present (d) .eqv. .FALSE.) call abort
+
+  !$acc exit data delete (c(0:N), d(0:N))
+
+  if (acc_is_present (c) .eqv. .TRUE.) call abort
+  if (acc_is_present (d) .eqv. .TRUE.) call abort
+
+  !$acc exit data delete (c(0:N), d(0:N))
+
+  if (acc_is_present (c) .eqv. .TRUE.) call abort
+  if (acc_is_present (d) .eqv. .TRUE.) call abort
+
+  !$acc enter data present_or_copyin (c(0:N))
+
+  if (acc_is_present (c) .eqv. .FALSE.) call abort
+  if (acc_is_present (d) .eqv. .TRUE.) call abort
+
+  !$acc exit data delete (c(0:N), d(0:N))
+
+  if (acc_is_present (c) .eqv. .TRUE.) call abort
+  if (acc_is_present (d) .eqv. .TRUE.) call abort
+
+  !$acc exit data delete (c(0:N), d(0:N))
+
+  if (acc_is_present (c) .eqv. .TRUE.) call abort
+  if (acc_is_present (d) .eqv. .TRUE.) call abort
+
+#endif
+
 end program test