diff mbox series

OpenMP, libgomp: Add new runtime routine omp_target_is_accessible.

Message ID 7fa4a70c-60e7-fa18-0fcd-98301c0b3344@codesourcery.com
State New
Headers show
Series OpenMP, libgomp: Add new runtime routine omp_target_is_accessible. | expand

Commit Message

Marcel Vollweiler March 11, 2022, 12:30 p.m. UTC
Hi,

This patch adds the OpenMP runtime routine "omp_target_is_accessible" which was
introduced in OpenMP 5.1 (specification section 3.8.4):

"The omp_target_is_accessible routine tests whether host memory is accessible
from a given device."

"This routine returns true if the storage of size bytes starting at the address
given by ptr is accessible from device device_num. Otherwise, it returns false."

"The value of ptr must be a valid host pointer or NULL (or C_NULL_PTR, for
Fortran). The device_num argument must be greater than or equal to zero and less
than or equal to the result of omp_get_num_devices()."

"When called from within a target region the effect is unspecified."

Currently, the only way of accessing host memory on a non-host device is via
shared memory. This will change with unified shared memory (usm) that was
recently submitted but not yet approved/committed. A follow-up patch for
omp_target_is_accessible is planned considering usm when available. The current
patch handles the basic implementation for C/C++ and Fortran and includes
comments pointing to usm.

Although not explicitly specified in the OpenMP 5.1 standard, the implemented
function returns "true" if the given device_num is equal to
"omp_get_num_devices" (i.e. the host) as it is expected that host memory can be
accessed from the host device.

The patch was tested on x86_64-linux and PowerPC, both with nvptx offloading.
All with no regressions.

Marcel
-----------------
Siemens Electronic Design Automation GmbH; Anschrift: Arnulfstraße 201, 80634 München; Gesellschaft mit beschränkter Haftung; Geschäftsführer: Thomas Heurung, Frank Thürauf; Sitz der Gesellschaft: München; Registergericht München, HRB 106955
OpenMP, libgomp: Add new runtime routine omp_target_is_accessible.

gcc/ChangeLog:

	* omp-low.cc (omp_runtime_api_call): Added target_is_accessible to
	omp_runtime_apis array.

libgomp/ChangeLog:

	* libgomp.map: Added omp_target_is_accessible.
	* libgomp.texi: Tagged omp_target_is_accessible as supported.
	* omp.h.in: Added omp_target_is_accessible.
	* omp_lib.f90.in: Added interface for omp_target_is_accessible.
	* omp_lib.h.in: Likewise.
	* target.c (omp_target_is_accessible): Added implementation of
	omp_target_is_accessible.
	* testsuite/libgomp.c-c++-common/target-is-accessible-1.c: New test.
	* testsuite/libgomp.fortran/target-is-accessible-1.f90: New test.

Comments

Tobias Burnus March 11, 2022, 2:12 p.m. UTC | #1
Minor remark to the test:

On 11.03.22 13:30, Marcel Vollweiler wrote:
> +  int d = omp_get_default_device ();
...
> +  int shared_mem = 0;
> +  #pragma omp target map (alloc: shared_mem) device (d)
> +    shared_mem = 1;
> +  if (omp_target_is_accessible (p, sizeof (int), d) != shared_mem)
> +    __builtin_abort ();

I wonder whether it makes sense to do instead
   for (d = 0; d <= omp_get_num_devices(); ++d)
instead of just
   d = omp_get_default_device();
given that we have already found once in a while bugs when testing more
than just the default device - be it because devices differed or because
'0' was special.

In particular, I could image having at the same time two or three devices
available of type intelmic + gcn + nvptx, possibly mixing shared memory,
nonshared memory and semi-shared memory*

Tobias

(* semi-shared: I am especially thinking of nvptx with %dynamic_smem_size,
which requires some special handling. By contrast with HMM and Pascal GPUs,
real USM is possible.)

-----------------
Siemens Electronic Design Automation GmbH; Anschrift: Arnulfstraße 201, 80634 München; Gesellschaft mit beschränkter Haftung; Geschäftsführer: Thomas Heurung, Frank Thürauf; Sitz der Gesellschaft: München; Registergericht München, HRB 106955
Marcel Vollweiler March 14, 2022, 3:42 p.m. UTC | #2
Hi Tobias,

> Minor remark to the test:
>
> On 11.03.22 13:30, Marcel Vollweiler wrote:
>> +  int d = omp_get_default_device ();
> ...
>> +  int shared_mem = 0;
>> +  #pragma omp target map (alloc: shared_mem) device (d)
>> +    shared_mem = 1;
>> +  if (omp_target_is_accessible (p, sizeof (int), d) != shared_mem)
>> +    __builtin_abort ();
>
> I wonder whether it makes sense to do instead
>    for (d = 0; d <= omp_get_num_devices(); ++d)
> instead of just
>    d = omp_get_default_device();
> given that we have already found once in a while bugs when testing more
> than just the default device - be it because devices differed or because
> '0' was special.
>
> In particular, I could image having at the same time two or three devices
> available of type intelmic + gcn + nvptx, possibly mixing shared memory,
> nonshared memory and semi-shared memory*

Good hint, thanks. I updated the C(++) and Fortran tests accordingly and
attached the updated patch.

Marcel
-----------------
Siemens Electronic Design Automation GmbH; Anschrift: Arnulfstraße 201, 80634 München; Gesellschaft mit beschränkter Haftung; Geschäftsführer: Thomas Heurung, Frank Thürauf; Sitz der Gesellschaft: München; Registergericht München, HRB 106955
OpenMP, libgomp: Add new runtime routine omp_target_is_accessible.

gcc/ChangeLog:

	* omp-low.cc (omp_runtime_api_call): Added target_is_accessible to
	omp_runtime_apis array.

libgomp/ChangeLog:

	* libgomp.map: Added omp_target_is_accessible.
	* libgomp.texi: Tagged omp_target_is_accessible as supported.
	* omp.h.in: Added omp_target_is_accessible.
	* omp_lib.f90.in: Added interface for omp_target_is_accessible.
	* omp_lib.h.in: Likewise.
	* target.c (omp_target_is_accessible): Added implementation of
	omp_target_is_accessible.
	* testsuite/libgomp.c-c++-common/target-is-accessible-1.c: New test.
	* testsuite/libgomp.fortran/target-is-accessible-1.f90: New test.

diff --git a/gcc/omp-low.cc b/gcc/omp-low.cc
index 77176ef..bf38fad 100644
--- a/gcc/omp-low.cc
+++ b/gcc/omp-low.cc
@@ -3959,6 +3959,7 @@ omp_runtime_api_call (const_tree fndecl)
       "target_associate_ptr",
       "target_disassociate_ptr",
       "target_free",
+      "target_is_accessible",
       "target_is_present",
       "target_memcpy",
       "target_memcpy_rect",
diff --git a/libgomp/libgomp.map b/libgomp/libgomp.map
index 2ac5809..1764380 100644
--- a/libgomp/libgomp.map
+++ b/libgomp/libgomp.map
@@ -226,6 +226,11 @@ OMP_5.1 {
 	omp_get_teams_thread_limit_;
 } OMP_5.0.2;
 
+OMP_5.1.1 {
+  global:
+	omp_target_is_accessible;
+} OMP_5.1;
+
 GOMP_1.0 {
   global:
 	GOMP_atomic_end;
diff --git a/libgomp/libgomp.texi b/libgomp/libgomp.texi
index 161a423..58e432c 100644
--- a/libgomp/libgomp.texi
+++ b/libgomp/libgomp.texi
@@ -311,7 +311,7 @@ The OpenMP 4.5 specification is fully supported.
 @item @code{omp_set_num_teams}, @code{omp_set_teams_thread_limit},
       @code{omp_get_max_teams}, @code{omp_get_teams_thread_limit} runtime
       routines @tab Y @tab
-@item @code{omp_target_is_accessible} runtime routine @tab N @tab
+@item @code{omp_target_is_accessible} runtime routine @tab Y @tab
 @item @code{omp_target_memcpy_async} and @code{omp_target_memcpy_rect_async}
       runtime routines @tab N @tab
 @item @code{omp_get_mapped_ptr} runtime routine @tab N @tab
diff --git a/libgomp/omp.h.in b/libgomp/omp.h.in
index 89c5d65..1ec7415 100644
--- a/libgomp/omp.h.in
+++ b/libgomp/omp.h.in
@@ -282,6 +282,8 @@ extern int omp_target_memcpy_rect (void *, const void *, __SIZE_TYPE__, int,
 extern int omp_target_associate_ptr (const void *, const void *, __SIZE_TYPE__,
 				     __SIZE_TYPE__, int) __GOMP_NOTHROW;
 extern int omp_target_disassociate_ptr (const void *, int) __GOMP_NOTHROW;
+extern int omp_target_is_accessible (const void *, __SIZE_TYPE__, int)
+  __GOMP_NOTHROW;
 
 extern void omp_set_affinity_format (const char *) __GOMP_NOTHROW;
 extern __SIZE_TYPE__ omp_get_affinity_format (char *, __SIZE_TYPE__)
diff --git a/libgomp/omp_lib.f90.in b/libgomp/omp_lib.f90.in
index daf40dc..f369507 100644
--- a/libgomp/omp_lib.f90.in
+++ b/libgomp/omp_lib.f90.in
@@ -835,6 +835,16 @@
           end function omp_target_disassociate_ptr
         end interface
 
+        interface
+          function omp_target_is_accessible (ptr, size, device_num) bind(c)
+            use, intrinsic :: iso_c_binding, only : c_ptr, c_size_t, c_int
+            integer(c_int) :: omp_target_is_accessible
+            type(c_ptr), value :: ptr
+            integer(c_size_t), value :: size
+            integer(c_int), value :: device_num
+          end function omp_target_is_accessible
+        end interface
+
 #if _OPENMP >= 201811
 !GCC$ ATTRIBUTES DEPRECATED :: omp_get_nested, omp_set_nested
 #endif
diff --git a/libgomp/omp_lib.h.in b/libgomp/omp_lib.h.in
index ff857a4..5ea0366 100644
--- a/libgomp/omp_lib.h.in
+++ b/libgomp/omp_lib.h.in
@@ -416,3 +416,14 @@
           integer(c_int), value :: device_num
         end function omp_target_disassociate_ptr
       end interface
+
+      interface
+        function omp_target_is_accessible (ptr, size, device_num)          &
+     &      bind(c)
+          use, intrinsic :: iso_c_binding, only : c_ptr, c_size_t, c_int
+          integer(c_int) :: omp_target_is_accessible
+          type(c_ptr), value :: ptr
+          integer(c_size_t), value :: size
+          integer(c_int), value :: device_num
+        end function omp_target_is_accessible
+      end interface
diff --git a/libgomp/target.c b/libgomp/target.c
index 9017458..01d36d1 100644
--- a/libgomp/target.c
+++ b/libgomp/target.c
@@ -3666,6 +3666,24 @@ omp_target_disassociate_ptr (const void *ptr, int device_num)
 }
 
 int
+omp_target_is_accessible (const void *ptr, size_t size, int device_num)
+{
+  if (device_num < 0 || device_num > gomp_get_num_devices ())
+    return false;
+
+  if (device_num == gomp_get_num_devices ())
+    return true;
+
+  struct gomp_device_descr *devicep = resolve_device (device_num);
+  if (devicep == NULL)
+    return false;
+
+  /* TODO: Unified shared memory must be handled when available.  */
+
+  return devicep->capabilities & GOMP_OFFLOAD_CAP_SHARED_MEM;
+}
+
+int
 omp_pause_resource (omp_pause_resource_t kind, int device_num)
 {
   (void) kind;
diff --git a/libgomp/testsuite/libgomp.c-c++-common/target-is-accessible-1.c b/libgomp/testsuite/libgomp.c-c++-common/target-is-accessible-1.c
new file mode 100644
index 0000000..7c2cf62
--- /dev/null
+++ b/libgomp/testsuite/libgomp.c-c++-common/target-is-accessible-1.c
@@ -0,0 +1,47 @@
+#include <omp.h>
+
+int
+main ()
+{
+  int d = omp_get_default_device ();
+  int id = omp_get_initial_device ();
+  int n = omp_get_num_devices ();
+  void *p;
+
+  if (d < 0 || d >= n)
+    d = id;
+
+  if (!omp_target_is_accessible (p, sizeof (int), n))
+    __builtin_abort ();
+
+  if (!omp_target_is_accessible (p, sizeof (int), id))
+    __builtin_abort ();
+
+  if (omp_target_is_accessible (p, sizeof (int), -1))
+    __builtin_abort ();
+
+  if (omp_target_is_accessible (p, sizeof (int), n + 1))
+    __builtin_abort ();
+
+  /* Currently, a host pointer is accessible if the device supports shared
+     memory or omp_target_is_accessible is executed on the host. This
+     test case must be adapted when unified shared memory is avialable.  */
+  int a[128];
+  for (int d = 0; d <= omp_get_num_devices (); d++)
+    {
+      int shared_mem = 0;
+      #pragma omp target map (alloc: shared_mem) device (d)
+	shared_mem = 1;
+      if (omp_target_is_accessible (p, sizeof (int), d) != shared_mem)
+	__builtin_abort ();
+
+      if (omp_target_is_accessible (a, 128 * sizeof (int), d) != shared_mem)
+	__builtin_abort ();
+
+      for (int i = 0; i < 128; i++)
+	if (omp_target_is_accessible (&a[i], sizeof (int), d) != shared_mem)
+	  __builtin_abort ();
+    }
+
+  return 0;
+}
diff --git a/libgomp/testsuite/libgomp.fortran/target-is-accessible-1.f90 b/libgomp/testsuite/libgomp.fortran/target-is-accessible-1.f90
new file mode 100644
index 0000000..2611855
--- /dev/null
+++ b/libgomp/testsuite/libgomp.fortran/target-is-accessible-1.f90
@@ -0,0 +1,50 @@
+program main
+  use omp_lib
+  use iso_c_binding
+  implicit none (external, type)
+  integer :: d, id, n, shared_mem, i
+  integer, target :: a(1:128)
+  type(c_ptr) :: p
+
+  d = omp_get_default_device ()
+  id = omp_get_initial_device ()
+  n = omp_get_num_devices ()
+
+  if (d < 0 .or. d >= n) &
+    d = id
+
+  if (omp_target_is_accessible (p, c_sizeof (d), n) /= 1) &
+    stop 1
+
+  if (omp_target_is_accessible (p, c_sizeof (d), id) /= 1) &
+    stop 2
+
+  if (omp_target_is_accessible (p, c_sizeof (d), -1) /= 0) &
+    stop 3
+
+  if (omp_target_is_accessible (p, c_sizeof (d), n + 1) /= 0) &
+    stop 4
+
+  ! Currently, a host pointer is accessible if the device supports shared
+  ! memory or omp_target_is_accessible is executed on the host. This
+  ! test case must be adapted when unified shared memory is avialable.
+  do d = 0, omp_get_num_devices ()
+    shared_mem = 0;
+    !$omp target map (alloc: shared_mem) device (d)
+      shared_mem = 1;
+    !$omp end target
+
+    if (omp_target_is_accessible (p, c_sizeof (d), d) /= shared_mem) &
+      stop 5;
+
+    if (omp_target_is_accessible (c_loc (a), 128 * sizeof (a(1)), d) /= shared_mem) &
+      stop 6;
+
+    do i = 1, 128
+      if (omp_target_is_accessible (c_loc (a(i)), sizeof (a(i)), d) /= shared_mem) &
+        stop 7;
+    end do
+
+  end do
+
+end program main
Jakub Jelinek May 5, 2022, 9:33 a.m. UTC | #3
On Mon, Mar 14, 2022 at 04:42:14PM +0100, Marcel Vollweiler wrote:
> --- a/libgomp/libgomp.map
> +++ b/libgomp/libgomp.map
> @@ -226,6 +226,11 @@ OMP_5.1 {
>  	omp_get_teams_thread_limit_;
>  } OMP_5.0.2;
>  
> +OMP_5.1.1 {
> +  global:
> +	omp_target_is_accessible;
> +} OMP_5.1;
> +

You've already added another OMP_5.1.1 symbol, so this hunk will need to be
adjusted.  Keep the names in there alphabetically sorted.

> --- a/libgomp/omp_lib.f90.in
> +++ b/libgomp/omp_lib.f90.in
> @@ -835,6 +835,16 @@
>            end function omp_target_disassociate_ptr
>          end interface
>  
> +        interface
> +          function omp_target_is_accessible (ptr, size, device_num) bind(c)
> +            use, intrinsic :: iso_c_binding, only : c_ptr, c_size_t, c_int
> +            integer(c_int) :: omp_target_is_accessible

The function returning integer(c_int) rather than logical seems like
a screw up in the standard, but too late to fix that :(.

> --- a/libgomp/target.c
> +++ b/libgomp/target.c
> @@ -3666,6 +3666,24 @@ omp_target_disassociate_ptr (const void *ptr, int device_num)
>  }
>  
>  int
> +omp_target_is_accessible (const void *ptr, size_t size, int device_num)
> +{
> +  if (device_num < 0 || device_num > gomp_get_num_devices ())
> +    return false;
> +
> +  if (device_num == gomp_get_num_devices ())
> +    return true;
> +
> +  struct gomp_device_descr *devicep = resolve_device (device_num);
> +  if (devicep == NULL)
> +    return false;
> +
> +  /* TODO: Unified shared memory must be handled when available.  */
> +
> +  return devicep->capabilities & GOMP_OFFLOAD_CAP_SHARED_MEM;

I guess for now it is reasonable, but I wonder if even without
GOMP_OFFLOAD_CAP_SHARED_MEM one can't for CUDA or GCN allocate host
memory (not all, but just some subset) that will be accessible on the
device (I bet that means accessible through the same address on the host and
device, aka partial shared mem).

So, ok for trunk.

OT, tried to look how libomptarget implements it and they don't at least
on llvm-project trunk, but while looking at that, noticed that for
omp_target_is_present they do return false from omp_target_is_present
while we return true.  It is unclear if NULL has corresponding storage
on the device (NULL always corresponds to NULL on the device) or not.

	Jakub
Tobias Burnus May 5, 2022, 9:45 a.m. UTC | #4
Hi,

On 05.05.22 11:33, Jakub Jelinek via Gcc-patches wrote:
> On Mon, Mar 14, 2022 at 04:42:14PM +0100, Marcel Vollweiler wrote:
>> +        interface
>> +          function omp_target_is_accessible (ptr, size, device_num) bind(c)
>> +            use, intrinsic :: iso_c_binding, only : c_ptr, c_size_t, c_int
>> +            integer(c_int) :: omp_target_is_accessible
> The function returning integer(c_int) rather than logical seems like
> a screw up in the standard, but too late to fix that :(.

I think the idea is that it can directly call the C function without
needing a wrapper. And as default-kind 'logical' != 'integer(c_int)' in
general, it cannot return logical. (In case of GCC, just claiming that
it is logical would work. But some Fortran compilers use -1 for .true.
and only flip a single bit for .not. For those,
"if(.not.omp_target_is_accessible(..)) will not work properly, if the C
function returns 1.

But I concur that requiring "/= 0" is ugly!

> OT, tried to look how libomptarget implements it and they don't at least
> on llvm-project trunk, but while looking at that, noticed that for
> omp_target_is_present they do return false from omp_target_is_present
> while we return true.  It is unclear if NULL has corresponding storage
> on the device (NULL always corresponds to NULL on the device) or not.

Regarding NULL: no idea what's the best semantic – we could ask
for clarification.

Regarding target:
I think "false" from on device makes more sense in general, especially
if the device number points to a different device. It might work in some
cases – but false simply plays save. Note that the spec states:

"When called from within a target region the effect is unspecified."

Thus, either behavior is fine.

Tobias

-----------------
Siemens Electronic Design Automation GmbH; Anschrift: Arnulfstraße 201, 80634 München; Gesellschaft mit beschränkter Haftung; Geschäftsführer: Thomas Heurung, Frank Thürauf; Sitz der Gesellschaft: München; Registergericht München, HRB 106955
Jakub Jelinek May 5, 2022, 9:51 a.m. UTC | #5
On Thu, May 05, 2022 at 11:45:19AM +0200, Tobias Burnus wrote:
> > On Mon, Mar 14, 2022 at 04:42:14PM +0100, Marcel Vollweiler wrote:
> > > +        interface
> > > +          function omp_target_is_accessible (ptr, size, device_num) bind(c)
> > > +            use, intrinsic :: iso_c_binding, only : c_ptr, c_size_t, c_int
> > > +            integer(c_int) :: omp_target_is_accessible
> > The function returning integer(c_int) rather than logical seems like
> > a screw up in the standard, but too late to fix that :(.
> 
> I think the idea is that it can directly call the C function without
> needing a wrapper. And as default-kind 'logical' != 'integer(c_int)' in
> general, it cannot return logical. (In case of GCC, just claiming that
> it is logical would work. But some Fortran compilers use -1 for .true.
> and only flip a single bit for .not. For those,
> "if(.not.omp_target_is_accessible(..)) will not work properly, if the C
> function returns 1.
> 
> But I concur that requiring "/= 0" is ugly!

Yeah, but for the APIs that don't have any iso_c_binding arguments
we just use wrappers rather than bind(c) and it allows for more Fortran-like
callers.  So, if omp_target_is_accessible had the *_ wrapper (or alias if
we determine logical ir the same as c_int in the ABI passing), people could
avoid the /= 0 stuff.
Anyway, that is just a thought for future APIs that if they return
false/true only bind(c) isn't always a good idea.

	Jakub
Marcel Vollweiler May 6, 2022, 11:14 a.m. UTC | #6
Hi Jakub,

Am 05.05.2022 um 11:33 schrieb Jakub Jelinek:
> On Mon, Mar 14, 2022 at 04:42:14PM +0100, Marcel Vollweiler wrote:
>> --- a/libgomp/libgomp.map
>> +++ b/libgomp/libgomp.map
>> @@ -226,6 +226,11 @@ OMP_5.1 {
>>      omp_get_teams_thread_limit_;
>>   } OMP_5.0.2;
>>
>> +OMP_5.1.1 {
>> +  global:
>> +    omp_target_is_accessible;
>> +} OMP_5.1;
>> +
>
> You've already added another OMP_5.1.1 symbol, so this hunk will need to be
> adjusted.  Keep the names in there alphabetically sorted.

Adjusted.

>> --- a/libgomp/omp_lib.f90.in
>> +++ b/libgomp/omp_lib.f90.in
>> @@ -835,6 +835,16 @@
>>             end function omp_target_disassociate_ptr
>>           end interface
>>
>> +        interface
>> +          function omp_target_is_accessible (ptr, size, device_num) bind(c)
>> +            use, intrinsic :: iso_c_binding, only : c_ptr, c_size_t, c_int
>> +            integer(c_int) :: omp_target_is_accessible
>
> The function returning integer(c_int) rather than logical seems like
> a screw up in the standard, but too late to fix that :(.
>
>> --- a/libgomp/target.c
>> +++ b/libgomp/target.c
>> @@ -3666,6 +3666,24 @@ omp_target_disassociate_ptr (const void *ptr, int device_num)
>>   }
>>
>>   int
>> +omp_target_is_accessible (const void *ptr, size_t size, int device_num)
>> +{
>> +  if (device_num < 0 || device_num > gomp_get_num_devices ())
>> +    return false;
>> +
>> +  if (device_num == gomp_get_num_devices ())
>> +    return true;
>> +
>> +  struct gomp_device_descr *devicep = resolve_device (device_num);
>> +  if (devicep == NULL)
>> +    return false;
>> +
>> +  /* TODO: Unified shared memory must be handled when available.  */
>> +
>> +  return devicep->capabilities & GOMP_OFFLOAD_CAP_SHARED_MEM;
>
> I guess for now it is reasonable, but I wonder if even without
> GOMP_OFFLOAD_CAP_SHARED_MEM one can't for CUDA or GCN allocate host
> memory (not all, but just some subset) that will be accessible on the
> device (I bet that means accessible through the same address on the host and
> device, aka partial shared mem).

Currently, I am only aware of

(a) physically shared memory which is used for some architectures where CPU and
GPU are close together (handled via GOMP_OFFLOAD_CAP_SHARED_MEM) and
(b) unified shared memory as being more a logical memory sharing via managed
memory (using sth. like cudaMallocManaged).

For (b) I will submit a follow up patch very soon that depends on the submitted
but not yet approved/committed usm patches:
    https://gcc.gnu.org/pipermail/gcc-patches/2022-March/591349.html

>
> So, ok for trunk.
>
> OT, tried to look how libomptarget implements it and they don't at least
> on llvm-project trunk, but while looking at that, noticed that for
> omp_target_is_present they do return false from omp_target_is_present
> while we return true.  It is unclear if NULL has corresponding storage
> on the device (NULL always corresponds to NULL on the device) or not.

That's indeed an interesting point. I am not sure whether returning "true" for a
given NULL pointer is the desired behaviour for omp_target_is_present. For the
host that might be ok (for whatever reason) but for offload devices this implies
that NULL is actually mapped to some address on the device (as far as I
understand the definition):

"The omp_target_is_present routine tests whether a host pointer refers to
storage that is mapped to a given device."

I don't know if such a "NULL mapping" is valid/useful.

Marcel
-----------------
Siemens Electronic Design Automation GmbH; Anschrift: Arnulfstraße 201, 80634 München; Gesellschaft mit beschränkter Haftung; Geschäftsführer: Thomas Heurung, Frank Thürauf; Sitz der Gesellschaft: München; Registergericht München, HRB 106955
diff mbox series

Patch

diff --git a/gcc/omp-low.cc b/gcc/omp-low.cc
index 77176ef..bf38fad 100644
--- a/gcc/omp-low.cc
+++ b/gcc/omp-low.cc
@@ -3959,6 +3959,7 @@  omp_runtime_api_call (const_tree fndecl)
       "target_associate_ptr",
       "target_disassociate_ptr",
       "target_free",
+      "target_is_accessible",
       "target_is_present",
       "target_memcpy",
       "target_memcpy_rect",
diff --git a/libgomp/libgomp.map b/libgomp/libgomp.map
index 2ac5809..1764380 100644
--- a/libgomp/libgomp.map
+++ b/libgomp/libgomp.map
@@ -226,6 +226,11 @@  OMP_5.1 {
 	omp_get_teams_thread_limit_;
 } OMP_5.0.2;
 
+OMP_5.1.1 {
+  global:
+	omp_target_is_accessible;
+} OMP_5.1;
+
 GOMP_1.0 {
   global:
 	GOMP_atomic_end;
diff --git a/libgomp/libgomp.texi b/libgomp/libgomp.texi
index 161a423..58e432c 100644
--- a/libgomp/libgomp.texi
+++ b/libgomp/libgomp.texi
@@ -311,7 +311,7 @@  The OpenMP 4.5 specification is fully supported.
 @item @code{omp_set_num_teams}, @code{omp_set_teams_thread_limit},
       @code{omp_get_max_teams}, @code{omp_get_teams_thread_limit} runtime
       routines @tab Y @tab
-@item @code{omp_target_is_accessible} runtime routine @tab N @tab
+@item @code{omp_target_is_accessible} runtime routine @tab Y @tab
 @item @code{omp_target_memcpy_async} and @code{omp_target_memcpy_rect_async}
       runtime routines @tab N @tab
 @item @code{omp_get_mapped_ptr} runtime routine @tab N @tab
diff --git a/libgomp/omp.h.in b/libgomp/omp.h.in
index 89c5d65..1ec7415 100644
--- a/libgomp/omp.h.in
+++ b/libgomp/omp.h.in
@@ -282,6 +282,8 @@  extern int omp_target_memcpy_rect (void *, const void *, __SIZE_TYPE__, int,
 extern int omp_target_associate_ptr (const void *, const void *, __SIZE_TYPE__,
 				     __SIZE_TYPE__, int) __GOMP_NOTHROW;
 extern int omp_target_disassociate_ptr (const void *, int) __GOMP_NOTHROW;
+extern int omp_target_is_accessible (const void *, __SIZE_TYPE__, int)
+  __GOMP_NOTHROW;
 
 extern void omp_set_affinity_format (const char *) __GOMP_NOTHROW;
 extern __SIZE_TYPE__ omp_get_affinity_format (char *, __SIZE_TYPE__)
diff --git a/libgomp/omp_lib.f90.in b/libgomp/omp_lib.f90.in
index daf40dc..f369507 100644
--- a/libgomp/omp_lib.f90.in
+++ b/libgomp/omp_lib.f90.in
@@ -835,6 +835,16 @@ 
           end function omp_target_disassociate_ptr
         end interface
 
+        interface
+          function omp_target_is_accessible (ptr, size, device_num) bind(c)
+            use, intrinsic :: iso_c_binding, only : c_ptr, c_size_t, c_int
+            integer(c_int) :: omp_target_is_accessible
+            type(c_ptr), value :: ptr
+            integer(c_size_t), value :: size
+            integer(c_int), value :: device_num
+          end function omp_target_is_accessible
+        end interface
+
 #if _OPENMP >= 201811
 !GCC$ ATTRIBUTES DEPRECATED :: omp_get_nested, omp_set_nested
 #endif
diff --git a/libgomp/omp_lib.h.in b/libgomp/omp_lib.h.in
index ff857a4..5ea0366 100644
--- a/libgomp/omp_lib.h.in
+++ b/libgomp/omp_lib.h.in
@@ -416,3 +416,14 @@ 
           integer(c_int), value :: device_num
         end function omp_target_disassociate_ptr
       end interface
+
+      interface
+        function omp_target_is_accessible (ptr, size, device_num)          &
+     &      bind(c)
+          use, intrinsic :: iso_c_binding, only : c_ptr, c_size_t, c_int
+          integer(c_int) :: omp_target_is_accessible
+          type(c_ptr), value :: ptr
+          integer(c_size_t), value :: size
+          integer(c_int), value :: device_num
+        end function omp_target_is_accessible
+      end interface
diff --git a/libgomp/target.c b/libgomp/target.c
index 9017458..01d36d1 100644
--- a/libgomp/target.c
+++ b/libgomp/target.c
@@ -3666,6 +3666,24 @@  omp_target_disassociate_ptr (const void *ptr, int device_num)
 }
 
 int
+omp_target_is_accessible (const void *ptr, size_t size, int device_num)
+{
+  if (device_num < 0 || device_num > gomp_get_num_devices ())
+    return false;
+
+  if (device_num == gomp_get_num_devices ())
+    return true;
+
+  struct gomp_device_descr *devicep = resolve_device (device_num);
+  if (devicep == NULL)
+    return false;
+
+  /* TODO: Unified shared memory must be handled when available.  */
+
+  return devicep->capabilities & GOMP_OFFLOAD_CAP_SHARED_MEM;
+}
+
+int
 omp_pause_resource (omp_pause_resource_t kind, int device_num)
 {
   (void) kind;
diff --git a/libgomp/testsuite/libgomp.c-c++-common/target-is-accessible-1.c b/libgomp/testsuite/libgomp.c-c++-common/target-is-accessible-1.c
new file mode 100644
index 0000000..6788894
--- /dev/null
+++ b/libgomp/testsuite/libgomp.c-c++-common/target-is-accessible-1.c
@@ -0,0 +1,42 @@ 
+#include <omp.h>
+
+int
+main ()
+{
+  int d = omp_get_default_device ();
+  int id = omp_get_initial_device ();
+  int n = omp_get_num_devices ();
+  void *p;
+
+  if (d < 0 || d >= n)
+    d = id;
+
+  if (!omp_target_is_accessible (p, sizeof (int), n))
+    __builtin_abort ();
+
+  if (!omp_target_is_accessible (p, sizeof (int), id))
+    __builtin_abort ();
+
+  if (omp_target_is_accessible (p, sizeof (int), -1))
+    __builtin_abort ();
+
+  if (omp_target_is_accessible (p, sizeof (int), n + 1))
+    __builtin_abort ();
+
+  /* Currently, a host pointer is accessible if the device supports shared
+     memory or omp_target_is_accessible is executed on the host. This
+     test case must be adapted when unified shared memory is avialable.  */
+  int shared_mem = 0;
+  #pragma omp target map (alloc: shared_mem) device (d)
+    shared_mem = 1;
+  if (omp_target_is_accessible (p, sizeof (int), d) != shared_mem)
+    __builtin_abort ();
+
+  int a[128];
+  p = &a;
+  for (int i = 0; i < 128; i++)
+    if (omp_target_is_accessible (p, 128 * sizeof (int), d) != shared_mem)
+      __builtin_abort ();
+
+  return 0;
+}
diff --git a/libgomp/testsuite/libgomp.fortran/target-is-accessible-1.f90 b/libgomp/testsuite/libgomp.fortran/target-is-accessible-1.f90
new file mode 100644
index 0000000..eae696f
--- /dev/null
+++ b/libgomp/testsuite/libgomp.fortran/target-is-accessible-1.f90
@@ -0,0 +1,44 @@ 
+program main
+  use omp_lib
+  use iso_c_binding
+  implicit none (external, type)
+  integer :: d, id, n, shared_mem, i
+  integer, target :: a(1:128)
+  type(c_ptr) :: p
+
+  d = omp_get_default_device ()
+  id = omp_get_initial_device ()
+  n = omp_get_num_devices ()
+
+  if (d < 0 .or. d >= n) &
+    d = id
+
+  if (omp_target_is_accessible (p, c_sizeof (d), n) /= 1) &
+    stop 1
+
+  if (omp_target_is_accessible (p, c_sizeof (d), id) /= 1) &
+    stop 2
+
+  if (omp_target_is_accessible (p, c_sizeof (d), -1) /= 0) &
+    stop 3
+
+  if (omp_target_is_accessible (p, c_sizeof (d), n + 1) /= 0) &
+    stop 4
+
+  ! Currently, a host pointer is accessible if the device supports shared
+  ! memory or omp_target_is_accessible is executed on the host. This
+  ! test case must be adapted when unified shared memory is avialable.
+  shared_mem = 0;
+  !$omp target map (alloc: shared_mem) device (d)
+    shared_mem = 1;
+  !$omp end target
+  if (omp_target_is_accessible (p, c_sizeof (d), d) /= shared_mem) &
+    stop 5;
+
+  p = c_loc (a)
+  do i = 1, 128
+    if (omp_target_is_accessible (p, 128 * sizeof (d), d) /= shared_mem) &
+      stop 6;
+  end do
+
+end program main