diff mbox series

libgomp: Fix reverse-offload for GOMP_MAP_TO_PSET

Message ID 85ecf108-8249-21a3-b9ee-23b89b6816bd@codesourcery.com
State New
Headers show
Series libgomp: Fix reverse-offload for GOMP_MAP_TO_PSET | expand

Commit Message

Tobias Burnus Feb. 6, 2023, 11:52 a.m. UTC
Seems as if I missed a GOMP_MAP_TO_PSET issue before. As nvptx is
XFAILed before, I only found it when testing on AMDGCN.

For an array-descriptor 'ai' variable, omplower has:
   map(tofrom:MEM <integer(kind=4)[0:]> [(integer(kind=4)[0:] *)D.4346] [len: D.4345])
   map(to:ai [pointer set, len: 64])
   map(alloc:ai.data [pointer assign, bias: 0])

The latter reaches GCC with the same address as 'ai' – i.e. the
one of the array descriptor. This then needs to be dereferenced
to get the address of the actual pointer.
The patch assumes (and asserts) that 'ai.data' is part of the 'ai'
such that I can use the host address of 'ai' to access the data.
If that's not guaranteed, we have to find another way (e.g. another
lookup). But so far it seems to hold and I have not seen a bias
other than 0.

With that patch, libgomp.fortran/reverse-offload-5.f90 now works
with AMDGCN.

OK? Any comments to the attached patch?

Tobias
-----------------
Siemens Electronic Design Automation GmbH; Anschrift: Arnulfstraße 201, 80634 München; Gesellschaft mit beschränkter Haftung; Geschäftsführer: Thomas Heurung, Frank Thürauf; Sitz der Gesellschaft: München; Registergericht München, HRB 106955

Comments

Thomas Schwinge Feb. 7, 2023, 10:51 p.m. UTC | #1
Hi Tobias!

On 2023-02-06T12:52:11+0100, Tobias Burnus <tobias@codesourcery.com> wrote:
> Seems as if I missed a GOMP_MAP_TO_PSET issue before. As nvptx is
> XFAILed before, I only found it when testing on AMDGCN.
>
> For an array-descriptor 'ai' variable, omplower has:
>    map(tofrom:MEM <integer(kind=4)[0:]> [(integer(kind=4)[0:] *)D.4346] [len: D.4345])
>    map(to:ai [pointer set, len: 64])
>    map(alloc:ai.data [pointer assign, bias: 0])
>
> The latter reaches GCC with the same address as 'ai' – i.e. the
> one of the array descriptor. This then needs to be dereferenced
> to get the address of the actual pointer.
> The patch assumes (and asserts) that 'ai.data' is part of the 'ai'
> such that I can use the host address of 'ai' to access the data.
> If that's not guaranteed, we have to find another way (e.g. another
> lookup). But so far it seems to hold and I have not seen a bias
> other than 0.
>
> With that patch, libgomp.fortran/reverse-offload-5.f90 now works
> with AMDGCN.

Confirming the latter, thanks.  (..., but I've not actually reviewed the
code changes.)

> OK? Any comments to the attached patch?

Yes:

> libgomp: Fix reverse-offload for GOMP_MAP_TO_PSET
>
> libgomp/
>       * target.c (gomp_target_rev): Dereference ptr
>       to get device address.
>       * libgomp.fortran/reverse-offload-5.f90: Add test
>       for unallocated allocatable.
>
>  libgomp/target.c                                        | 7 ++++++-
>  libgomp/testsuite/libgomp.fortran/reverse-offload-5.f90 | 6 ++++--
>  2 files changed, 10 insertions(+), 3 deletions(-)
>
> diff --git a/libgomp/target.c b/libgomp/target.c
> index c1682caea13..5cdd845291a 100644
> --- a/libgomp/target.c
> +++ b/libgomp/target.c
> @@ -3579,8 +3579,13 @@ gomp_target_rev (uint64_t fn_ptr, uint64_t mapnum, uint64_t devaddrs_ptr,
>                     }
>                   int k;
>                   n2 = NULL;
> -                 cdata[i].present = true;
> +                 /* Dereference devaddrs[j] to get the device addr.  */
> +                 assert (devaddrs[j]-sizes[j] == cdata[i].devaddr);
> +                 devaddrs[j] = *(uint64_t *) (devaddrs[i] + sizes[j]);

For x86_64-pc-linux-gnu '-m32' multilib:

    [...]
    [...]/source-gcc/libgomp/target.c: In function ‘gomp_target_rev’:
    [...]/source-gcc/libgomp/target.c:3615:36: error: cast to pointer from integer of different size [-Werror=int-to-pointer-cast]
     3615 |                     devaddrs[j] = *(uint64_t *) (devaddrs[i] + sizes[j]);
          |                                    ^
    cc1: all warnings being treated as errors
    make[9]: *** [target.lo] Error 1
    make[9]: Leaving directory `[...]/build-gcc/x86_64-pc-linux-gnu/32/libgomp'
    [...]

I suppose you'd do similar to what you already have a few lines above;
that is, cast through 'uintptr_t':

    devaddrs[j] = *(uint64_t *) (uintptr_t) (devaddrs[i] + sizes[j]);


Grüße
 Thomas


> +                 cdata[j].present = true;
>                   cdata[j].devaddr = devaddrs[j];
> +                 if (devaddrs[j] == 0)
> +                   continue;
>                   k = gomp_map_cdata_lookup (cdata, devaddrs, kinds, sizes, j,
>                                              devaddrs[j],
>                                              devaddrs[j] + sizeof (void*),
> diff --git a/libgomp/testsuite/libgomp.fortran/reverse-offload-5.f90 b/libgomp/testsuite/libgomp.fortran/reverse-offload-5.f90
> index ef7eb7bdd52..16810eb47de 100644
> --- a/libgomp/testsuite/libgomp.fortran/reverse-offload-5.f90
> +++ b/libgomp/testsuite/libgomp.fortran/reverse-offload-5.f90
> @@ -24,7 +24,7 @@ s2 = 55
>
>  !$omp target map(to: A, A2, s1, s2)
>  block
> -  integer, allocatable :: ai(:), ai2(:), si1, si2
> +  integer, allocatable :: ai(:), ai2(:), ai3(:), si1, si2, si3
>
>    a = a * 2
>    a2 = a2 * 3
> @@ -38,7 +38,7 @@ block
>
>    !$omp target device (ancestor:1)  &
>    !$omp&       map(to: A, s1, ai, si1) map(always, to: a2, s2)  &
> -  !$omp&       map(tofrom: ai2, si2)
> +  !$omp&       map(tofrom: ai2, si2, ai3, si3)
>      if (shared_mem) then
>        if (any (a  /= 2 * [1,2,3,4])) stop 1
>        if (s1 /= 4 * 532) stop 2
> @@ -52,6 +52,7 @@ block
>      if (any (ai2 /= [8,4,7,1])) stop 8
>      if (si1 /= 64) stop 9
>      if (si2 /= 765) stop 10
> +    if (allocated (ai3) .or. allocated(si3)) stop 26
>
>      a = a*3
>      a2 = a2*7
> @@ -80,6 +81,7 @@ block
>    endif
>    if (any (ai2 /= 21 * [8,4,7,1])) stop 24
>    if (si2 /= 31 * 765) stop 25
> +  if (allocated (ai3) .or. allocated(si3)) stop 27
>
>    deallocate (ai, ai2, si1, si2)
>  end block
-----------------
Siemens Electronic Design Automation GmbH; Anschrift: Arnulfstraße 201, 80634 München; Gesellschaft mit beschränkter Haftung; Geschäftsführer: Thomas Heurung, Frank Thürauf; Sitz der Gesellschaft: München; Registergericht München, HRB 106955
Tobias Burnus Feb. 9, 2023, 9:23 a.m. UTC | #2
(Updated to fix -m32 build, otherwise unchanged.)

Any other comments?

On 07.02.23 23:51, Thomas Schwinge wrote:

> On 2023-02-06T12:52:11+0100, Tobias Burnus <tobias@codesourcery.com> wrote:
>> Seems as if I missed a GOMP_MAP_TO_PSET issue before. As nvptx is
>> XFAILed before, I only found it when testing on AMDGCN.
>>
>> For an array-descriptor 'ai' variable, omplower has:
>>     map(tofrom:MEM <integer(kind=4)[0:]> [(integer(kind=4)[0:] *)D.4346] [len: D.4345])
>>     map(to:ai [pointer set, len: 64])
>>     map(alloc:ai.data [pointer assign, bias: 0])
>>
>> The latter reaches GCC with the same address as 'ai' – i.e. the
>> one of the array descriptor. This then needs to be dereferenced
>> to get the address of the actual pointer.
>> The patch assumes (and asserts) that 'ai.data' is part of the 'ai'
>> such that I can use the host address of 'ai' to access the data.
>> If that's not guaranteed, we have to find another way (e.g. another
>> lookup). But so far it seems to hold and I have not seen a bias
>> other than 0.
>>
>> With that patch, libgomp.fortran/reverse-offload-5.f90 now works
>> with AMDGCN.
...
>> OK? Any comments to the attached patch?
...
> For x86_64-pc-linux-gnu '-m32' multilib:
...
> I suppose you'd do similar to what you already have a few lines above;
> that is, cast through 'uintptr_t':
>
>      devaddrs[j] = *(uint64_t *) (uintptr_t) (devaddrs[i] + sizes[j]);
Tobias
-----------------
Siemens Electronic Design Automation GmbH; Anschrift: Arnulfstraße 201, 80634 München; Gesellschaft mit beschränkter Haftung; Geschäftsführer: Thomas Heurung, Frank Thürauf; Sitz der Gesellschaft: München; Registergericht München, HRB 106955
Jakub Jelinek Feb. 15, 2023, 10:17 a.m. UTC | #3
On Thu, Feb 09, 2023 at 10:23:53AM +0100, Tobias Burnus wrote:
> libgomp: Fix reverse-offload for GOMP_MAP_TO_PSET
> 
> libgomp/
> 	* target.c (gomp_target_rev): Dereference ptr
> 	to get device address.
> 	* libgomp.fortran/reverse-offload-5.f90: Add test
> 	for unallocated allocatable.
> 
> --- a/libgomp/target.c
> +++ b/libgomp/target.c
> @@ -3579,8 +3579,14 @@ gomp_target_rev (uint64_t fn_ptr, uint64_t mapnum, uint64_t devaddrs_ptr,
>  		      }
>  		    int k;
>  		    n2 = NULL;
> -		    cdata[i].present = true;
> +		    /* Dereference devaddrs[j] to get the device addr.  */
> +		    assert (devaddrs[j]-sizes[j] == cdata[i].devaddr);

Formatting, there should be spaces around - on both sides.

> +		    devaddrs[j] = *(uint64_t *) (uintptr_t) (devaddrs[i]
> +							     + sizes[j]);
> +		    cdata[j].present = true;
>  		    cdata[j].devaddr = devaddrs[j];
> +		    if (devaddrs[j] == 0)
> +		      continue;
>  		    k = gomp_map_cdata_lookup (cdata, devaddrs, kinds, sizes, j,
>  					       devaddrs[j],
>  					       devaddrs[j] + sizeof (void*),

Otherwise LGTM.

	Jakub
diff mbox series

Patch

libgomp: Fix reverse-offload for GOMP_MAP_TO_PSET

libgomp/
	* target.c (gomp_target_rev): Dereference ptr
	to get device address.
	* libgomp.fortran/reverse-offload-5.f90: Add test
	for unallocated allocatable.

 libgomp/target.c                                        | 7 ++++++-
 libgomp/testsuite/libgomp.fortran/reverse-offload-5.f90 | 6 ++++--
 2 files changed, 10 insertions(+), 3 deletions(-)

diff --git a/libgomp/target.c b/libgomp/target.c
index c1682caea13..5cdd845291a 100644
--- a/libgomp/target.c
+++ b/libgomp/target.c
@@ -3579,8 +3579,13 @@  gomp_target_rev (uint64_t fn_ptr, uint64_t mapnum, uint64_t devaddrs_ptr,
 		      }
 		    int k;
 		    n2 = NULL;
-		    cdata[i].present = true;
+		    /* Dereference devaddrs[j] to get the device addr.  */
+		    assert (devaddrs[j]-sizes[j] == cdata[i].devaddr);
+		    devaddrs[j] = *(uint64_t *) (devaddrs[i] + sizes[j]);
+		    cdata[j].present = true;
 		    cdata[j].devaddr = devaddrs[j];
+		    if (devaddrs[j] == 0)
+		      continue;
 		    k = gomp_map_cdata_lookup (cdata, devaddrs, kinds, sizes, j,
 					       devaddrs[j],
 					       devaddrs[j] + sizeof (void*),
diff --git a/libgomp/testsuite/libgomp.fortran/reverse-offload-5.f90 b/libgomp/testsuite/libgomp.fortran/reverse-offload-5.f90
index ef7eb7bdd52..16810eb47de 100644
--- a/libgomp/testsuite/libgomp.fortran/reverse-offload-5.f90
+++ b/libgomp/testsuite/libgomp.fortran/reverse-offload-5.f90
@@ -24,7 +24,7 @@  s2 = 55
 
 !$omp target map(to: A, A2, s1, s2)
 block
-  integer, allocatable :: ai(:), ai2(:), si1, si2
+  integer, allocatable :: ai(:), ai2(:), ai3(:), si1, si2, si3
 
   a = a * 2
   a2 = a2 * 3
@@ -38,7 +38,7 @@  block
 
   !$omp target device (ancestor:1)  &
   !$omp&       map(to: A, s1, ai, si1) map(always, to: a2, s2)  &
-  !$omp&       map(tofrom: ai2, si2)
+  !$omp&       map(tofrom: ai2, si2, ai3, si3)
     if (shared_mem) then
       if (any (a  /= 2 * [1,2,3,4])) stop 1
       if (s1 /= 4 * 532) stop 2
@@ -52,6 +52,7 @@  block
     if (any (ai2 /= [8,4,7,1])) stop 8
     if (si1 /= 64) stop 9
     if (si2 /= 765) stop 10
+    if (allocated (ai3) .or. allocated(si3)) stop 26
 
     a = a*3
     a2 = a2*7
@@ -80,6 +81,7 @@  block
   endif
   if (any (ai2 /= 21 * [8,4,7,1])) stop 24
   if (si2 /= 31 * 765) stop 25
+  if (allocated (ai3) .or. allocated(si3)) stop 27
 
   deallocate (ai, ai2, si1, si2)
 end block