diff mbox series

[Fortran] OpenMP/OpenACC – fix more issues with OPTIONAL

Message ID 01b9ed0e-aee8-b12a-c293-b057d71fac21@codesourcery.com
State New
Headers show
Series [Fortran] OpenMP/OpenACC – fix more issues with OPTIONAL | expand

Commit Message

Tobias Burnus Dec. 10, 2019, 5:54 p.m. UTC
Nonallocatable, nonpointer array arguments (of assumed shape) are 
special as they get a get an array descriptor ('arg') as argument but 
create a local variable which accesses the actual data ('arg.0 = 
arg->data').

With OPTIONAL, there are/were two outstanding issues:

(A) If the argument is not present, 'arg.0' is/was never assigned to.

(B) The optional-arg-is-present check is not just 'if (arg)' but 'if 
(arg && arg->data)' as passing an unallocated allocatable/disassociated 
pointer (i.e. 'arg->data = NULL') to a nonpointer, nonallocatable 
optional dummy argument counts as absent argument; this affects (A).

Solution:

(B) is now solved by updating what gfc_omp_check_optional_argument 
returns; as is now always returns an boolean_type_node, one can clean up 
the code which adds "!= NULL" when using the "present" tree variable.

(A) For mapping, one also does GOMP_MAP_POINTER; if one replaces this by 
a temporary variable 'D.124 = present ? arg.0 : NULL', it will later ICE 
in omp-low.c one confuses the identifier handling, which replaces the 
variables in 'target (data)'.

Build on x86-64-gnu-linux w/o offloading and on one nvptx configuration 
with actual offloading.
OK?

Tobias

PS: Besides adding tons of test cases, it also fixes the transient issue 
(which does only occur with -O1 ?!?) with the existing 
use_device_addr-{3,4}.f90 test case. That failed due to reason (A). – 
Cf. https://gcc.gnu.org/ml/gcc-patches/2019-12/msg00499.html

PPS: I haven't tried polymorphic data types but I am positive they will 
fail. Cray pointers are also candidates for additional failures.

Comments

Tobias Burnus Dec. 16, 2019, 8:06 a.m. UTC | #1
Ping.

On 12/10/19 6:54 PM, Tobias Burnus wrote:
> Nonallocatable, nonpointer array arguments (of assumed shape) are 
> special as they get a get an array descriptor ('arg') as argument but 
> create a local variable which accesses the actual data ('arg.0 = 
> arg->data').
>
> With OPTIONAL, there are/were two outstanding issues:
>
> (A) If the argument is not present, 'arg.0' is/was never assigned to.
>
> (B) The optional-arg-is-present check is not just 'if (arg)' but 'if 
> (arg && arg->data)' as passing an unallocated 
> allocatable/disassociated pointer (i.e. 'arg->data = NULL') to a 
> nonpointer, nonallocatable optional dummy argument counts as absent 
> argument; this affects (A).
>
> Solution:
Tobias Burnus Dec. 29, 2019, 10:27 p.m. UTC | #2
On 12/16/19 9:06 AM, Tobias Burnus wrote:
> Ping.
>
> On 12/10/19 6:54 PM, Tobias Burnus wrote:
>> Nonallocatable, nonpointer array arguments (of assumed shape) are 
>> special as they get a get an array descriptor ('arg') as argument but 
>> create a local variable which accesses the actual data ('arg.0 = 
>> arg->data').
>>
>> With OPTIONAL, there are/were two outstanding issues:
>>
>> (A) If the argument is not present, 'arg.0' is/was never assigned to.
>>
>> (B) The optional-arg-is-present check is not just 'if (arg)' but 'if 
>> (arg && arg->data)' as passing an unallocated 
>> allocatable/disassociated pointer (i.e. 'arg->data = NULL') to a 
>> nonpointer, nonallocatable optional dummy argument counts as absent 
>> argument; this affects (A).
>>
>> Solution:
Jerry DeLisle Dec. 30, 2019, 3:06 a.m. UTC | #3
Between Holidays and being short on people that understand this, I would 
say commit it unless Jakub objects.

(When in doubt, make a decision and move forward principle, assuming one 
is not stupid,)

Cheers,

Jerry

On 12/29/19 2:27 PM, Tobias Burnus wrote:
> 
> On 12/16/19 9:06 AM, Tobias Burnus wrote:
>> Ping.
>>
>> On 12/10/19 6:54 PM, Tobias Burnus wrote:
>>> Nonallocatable, nonpointer array arguments (of assumed shape) are 
>>> special as they get a get an array descriptor ('arg') as argument but 
>>> create a local variable which accesses the actual data ('arg.0 = 
>>> arg->data').
>>>
>>> With OPTIONAL, there are/were two outstanding issues:
>>>
>>> (A) If the argument is not present, 'arg.0' is/was never assigned to.
>>>
>>> (B) The optional-arg-is-present check is not just 'if (arg)' but 'if 
>>> (arg && arg->data)' as passing an unallocated 
>>> allocatable/disassociated pointer (i.e. 'arg->data = NULL') to a 
>>> nonpointer, nonallocatable optional dummy argument counts as absent 
>>> argument; this affects (A).
>>>
>>> Solution:
Jakub Jelinek Jan. 3, 2020, 11:29 a.m. UTC | #4
On Tue, Dec 10, 2019 at 06:54:19PM +0100, Tobias Burnus wrote:
> 2019-12-10  Tobias Burnus  <tobias@codesourcery.com>
> 
> 	gcc/fortran/
> 	* trans-openmp.c (gfc_omp_check_optional_argument): Always return a
> 	Boolean expression; handle unallocated/disassociated actual arguments
> 	as absent if passed to nonallocatable/nonpointer dummy array arguments.
> 	(gfc_build_cond_assign): Change to assume a Boolean expr not a pointer.
> 	(gfc_omp_finish_clause, gfc_trans_omp_clauses): Assign NULL to generated
> 	array-data variable if the argument is absent. Simplify code as
> 	'present' is now a Boolean expression.
> 
> 	libgomp/
> 	* testsuite/libgomp.fortran/optional-map.f90: Add test for
> 	unallocated/disassociated actual arguments to nonallocatable/nonpointer
> 	dummy arguments; those are/shall be regarded as absent arguments.
> 	* testsuite/libgomp.fortran/use_device_ptr-optional-2.f90: Ditto.
> 	* testsuite/libgomp.fortran/use_device_ptr-optional-3.f90: New.

Ok.  Sorry for the delay.

	Jakub
Thomas Schwinge Jan. 8, 2020, 8:33 a.m. UTC | #5
Hi Tobias!

On 2019-12-10T18:54:19+0100, Tobias Burnus <tobias@codesourcery.com> wrote:
> PS: Besides adding tons of test cases, [r279858] also fixes the transient issue 
> (which does only occur with -O1 ?!?)

(I saw it with different/differing optimization levels.)

> with the existing 
> use_device_addr-{3,4}.f90 test case. That failed due to [...]. – 
> Cf. https://gcc.gnu.org/ml/gcc-patches/2019-12/msg00499.html

ACK, thanks.


> PPS: I haven't tried polymorphic data types but I am positive they will 
> fail. Cray pointers are also candidates for additional failures.

Please file PRs as appropriate.


> 	* testsuite/libgomp.fortran/use_device_ptr-optional-3.f90: New.

With 'dg-do run' added, on powerpc64le-unknown-linux-gnu without
offloading I'm seeing:

    PASS: libgomp.fortran/use_device_ptr-optional-3.f90   -O0  (test for excess errors)
    FAIL: libgomp.fortran/use_device_ptr-optional-3.f90   -O0  execution test
    PASS: libgomp.fortran/use_device_ptr-optional-3.f90   -O1  (test for excess errors)
    PASS: libgomp.fortran/use_device_ptr-optional-3.f90   -O1  execution test
    PASS: libgomp.fortran/use_device_ptr-optional-3.f90   -O2  (test for excess errors)
    PASS: libgomp.fortran/use_device_ptr-optional-3.f90   -O2  execution test
    PASS: libgomp.fortran/use_device_ptr-optional-3.f90   -O3 -fomit-frame-pointer -funroll-loops -fpeel-loops -ftracer -finline-functions  (test for excess errors)
    PASS: libgomp.fortran/use_device_ptr-optional-3.f90   -O3 -fomit-frame-pointer -funroll-loops -fpeel-loops -ftracer -finline-functions  execution test
    PASS: libgomp.fortran/use_device_ptr-optional-3.f90   -O3 -g  (test for excess errors)
    PASS: libgomp.fortran/use_device_ptr-optional-3.f90   -O3 -g  execution test
    PASS: libgomp.fortran/use_device_ptr-optional-3.f90   -Os  (test for excess errors)
    PASS: libgomp.fortran/use_device_ptr-optional-3.f90   -Os  execution test

... with the '-O0' (only) execution test FAILing with 'STOP 1', and on
x86_64-pc-linux-gnu with offloading I'm seeing:

    PASS: libgomp.fortran/use_device_ptr-optional-3.f90   -O0  (test for excess errors)
    PASS: libgomp.fortran/use_device_ptr-optional-3.f90   -O0  execution test
    PASS: libgomp.fortran/use_device_ptr-optional-3.f90   -O1  (test for excess errors)
    PASS: libgomp.fortran/use_device_ptr-optional-3.f90   -O1  execution test
    FAIL: libgomp.fortran/use_device_ptr-optional-3.f90   -O2  (test for excess errors)
    UNRESOLVED: libgomp.fortran/use_device_ptr-optional-3.f90   -O2  compilation failed to produce executable
    FAIL: libgomp.fortran/use_device_ptr-optional-3.f90   -O3 -fomit-frame-pointer -funroll-loops -fpeel-loops -ftracer -finline-functions  (test for excess errors)
    UNRESOLVED: libgomp.fortran/use_device_ptr-optional-3.f90   -O3 -fomit-frame-pointer -funroll-loops -fpeel-loops -ftracer -finline-functions  compilation failed to produce executable
    FAIL: libgomp.fortran/use_device_ptr-optional-3.f90   -O3 -g  (test for excess errors)
    UNRESOLVED: libgomp.fortran/use_device_ptr-optional-3.f90   -O3 -g  compilation failed to produce executable
    FAIL: libgomp.fortran/use_device_ptr-optional-3.f90   -Os  (test for excess errors)
    UNRESOLVED: libgomp.fortran/use_device_ptr-optional-3.f90   -Os  compilation failed to produce executable

... due to:

    /tmp/cciVc43I.o:(.gnu.offload_vars+0x10): undefined reference to `A.12.4064'
    [...]

..., which may be something like PR90779, PR85063, PR84592, PR90779,
PR80411, PR71536 -- or something else.  ;-)


Grüße
 Thomas


> --- /dev/null
> +++ b/libgomp/testsuite/libgomp.fortran/use_device_ptr-optional-3.f90
> @@ -0,0 +1,140 @@
> +! Check whether absent optional arguments are properly
> +! handled with use_device_{addr,ptr}.
> +program main
> +  use iso_c_binding, only: c_ptr, c_loc, c_associated, c_f_pointer
> +  implicit none (type, external)
> +
> +  integer, target :: u
> +  integer, target :: v
> +  integer, target :: w
> +  integer, target :: x(4)
> +  integer, target, allocatable :: y
> +  integer, target, allocatable :: z(:)
> +  type(c_ptr), target :: cptr
> +  type(c_ptr), target :: cptr_in
> +  integer :: dummy
> +
> +  u = 42
> +  v = 5
> +  w = 7
> +  x = [3,4,6,2]
> +  y = 88
> +  z = [1,2,3]
> +
> +  !$omp target enter data map(to:u)
> +  !$omp target data map(to:dummy) use_device_addr(u)
> +   cptr_in = c_loc(u) ! Has to be outside 'foo' due to 'intent(in)'
> +  !$omp end target data
> +
> +  call foo (u, v, w, x, y, z, cptr, cptr_in)
> +  deallocate (y, z)
> +contains
> +  subroutine foo (u, v, w, x, y, z, cptr, cptr_in)
> +    integer, target, optional, value :: v
> +    integer, target, optional :: u, w
> +    integer, target, optional :: x(:)
> +    integer, target, optional, allocatable :: y
> +    integer, target, optional, allocatable :: z(:)
> +    type(c_ptr), target, optional, value :: cptr
> +    type(c_ptr), target, optional, value, intent(in) :: cptr_in
> +    integer :: d
> +
> +    type(c_ptr) :: p_u, p_v, p_w, p_x, p_y, p_z, p_cptr, p_cptr_in
> +
> +    !$omp target enter data map(to:w, x, y, z)
> +    !$omp target data map(dummy) use_device_addr(x)
> +      cptr = c_loc(x)
> +    !$omp end target data
> +
> +    ! Need to map per-VALUE arguments, if present
> +    if (present(v)) then
> +      !$omp target enter data map(to:v)
> +    else
> +      stop 1
> +    end if
> +    if (present(cptr)) then
> +      !$omp target enter data map(to:cptr)
> +    else
> +      stop 2
> +    end if
> +    if (present(cptr_in)) then
> +      !$omp target enter data map(to:cptr_in)
> +    else
> +      stop 3
> +    end if
> +
> +    !$omp target data map(d) use_device_addr(u, v, w, x, y, z)
> +    !$omp target data map(d) use_device_addr(cptr, cptr_in)
> +      if (.not. present(u)) stop 10
> +      if (.not. present(v)) stop 11
> +      if (.not. present(w)) stop 12
> +      if (.not. present(x)) stop 13
> +      if (.not. present(y)) stop 14
> +      if (.not. present(z)) stop 15
> +      if (.not. present(cptr)) stop 16
> +      if (.not. present(cptr_in)) stop 17
> +      p_u = c_loc(u)
> +      p_v = c_loc(v)
> +      p_w = c_loc(w)
> +      p_x = c_loc(x)
> +      p_y = c_loc(y)
> +      p_z = c_loc(z)
> +      p_cptr = c_loc(cptr)
> +      p_cptr_in = c_loc(cptr_in)
> +    !$omp end target data
> +    !$omp end target data
> +    call check(p_u, p_v, p_w, p_x, p_y, p_z, p_cptr, p_cptr_in, size(x), size(z))
> +  end subroutine foo
> +
> +  subroutine check(p_u, p_v, p_w, p_x, p_y, p_z, p_cptr, p_cptr_in, Nx, Nz)
> +    type(c_ptr), value :: p_u, p_v, p_w, p_x, p_y, p_z, p_cptr, p_cptr_in
> +    integer, value :: Nx, Nz
> +    integer, pointer :: c_u(:), c_v(:), c_w(:), c_x(:), c_y(:), c_z(:)
> +    type(c_ptr), pointer :: c_cptr(:), c_cptr_in(:)
> +
> +    ! As is_device_ptr does not handle scalars, we map them to a size-1 array
> +    call c_f_pointer(p_u, c_u, shape=[1])
> +    call c_f_pointer(p_v, c_v, shape=[1])
> +    call c_f_pointer(p_w, c_w, shape=[1])
> +    call c_f_pointer(p_x, c_x, shape=[Nx])
> +    call c_f_pointer(p_y, c_y, shape=[1])
> +    call c_f_pointer(p_z, c_z, shape=[Nz])
> +    call c_f_pointer(p_cptr, c_cptr, shape=[1])
> +    call c_f_pointer(p_cptr_in, c_cptr_in, shape=[1])
> +    call run_target(c_u, c_v, c_w, c_x, c_y, c_z, c_cptr, c_cptr_in, Nx, Nz)
> +  end subroutine check
> +
> +  subroutine run_target(c_u, c_v, c_w, c_x, c_y, c_z, c_cptr, c_cptr_in, Nx, Nz)
> +    integer, target :: c_u(:), c_v(:), c_w(:), c_x(:), c_y(:), c_z(:)
> +    type(c_ptr) :: c_cptr(:), c_cptr_in(:)
> +    integer, value :: Nx, Nz
> +    !$omp target is_device_ptr(c_u, c_v, c_w, c_x, c_y, c_z, c_cptr, c_cptr_in) map(to:Nx, Nz)
> +      call target_fn(c_u(1), c_v(1), c_w(1), c_x, c_y(1), c_z, c_cptr(1), c_cptr_in(1), Nx, Nz)
> +    !$omp end target
> +  end subroutine run_target
> +
> +  subroutine target_fn(c_u, c_v, c_w, c_x, c_y, c_z, c_cptr, c_cptr_in, Nx, Nz)
> +    !$omp declare target
> +    integer, target :: c_u, c_v, c_w, c_x(:), c_y, c_z(:)
> +    type(c_ptr), value :: c_cptr, c_cptr_in
> +    integer, value :: Nx, Nz
> +    integer, pointer :: u, x(:)
> +    if (c_u /= 42) stop 30
> +    if (c_v /= 5) stop 31
> +    if (c_w /= 7) stop 32
> +    if (Nx /= 4) stop 33
> +    if (any (c_x /= [3,4,6,2])) stop 34
> +    if (c_y /= 88) stop 35
> +    if (Nz /= 3) stop 36
> +    if (any (c_z /= [1,2,3])) stop 37
> +    if (.not. c_associated (c_cptr)) stop 38
> +    if (.not. c_associated (c_cptr_in)) stop 39
> +    if (.not. c_associated (c_cptr, c_loc(c_x))) stop 40
> +    if (.not. c_associated (c_cptr_in, c_loc(c_u))) stop 41
> +    call c_f_pointer(c_cptr_in, u)
> +    call c_f_pointer(c_cptr, x, shape=[Nx])
> +    if (u /= c_u .or. u /= 42)  stop 42
> +    if (any (x /= c_x))  stop 43
> +    if (any (x /= [3,4,6,2]))  stop 44
> +  end subroutine target_fn
> +end program main
Tobias Burnus Jan. 8, 2020, 8:55 a.m. UTC | #6
Hi Thomas,

On 1/8/20 9:33 AM, Thomas Schwinge wrote:
> With 'dg-do run' added, on powerpc64le-unknown-linux-gnu

Have I already expressed that I started to hate that target arch?

I think we really should find out what goes wrong for the small example 
of https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92305 — Help and 
suggestions very much appreciated!

The reduced test case is in Comment 9 and Comment 11 shows the dump + 
the assembler of caller and callee. (The example is that short that 
pasting those in the comment still makes a rather short comment!) — 
Analysis is in later comments, especially in the last comment (Comment 16).

I think we should try to understand what goes wrong in this case before 
starting to look at other issues: it is already partially analyzed and 
it short. — Again, help and suggestions are welcome!

Hence, I am inclined to ignore the following issue — until we have 
understood and possibly fixed for PR92305.

> ... with the '-O0' (only) execution test FAILing with 'STOP 1'
While:
> x86_64-pc-linux-gnu with offloading I'm seeing:
>      FAIL: libgomp.fortran/use_device_ptr-optional-3.f90   -O2  (test for excess errors)
>      UNRESOLVED: libgomp.fortran/use_device_ptr-optional-3.f90   -O2  compilation failed to produce executable
> ... due to:
>      /tmp/cciVc43I.o:(.gnu.offload_vars+0x10): undefined reference to `A.12.4064'
> which may be something like PR90779, PR85063, PR84592, PR90779,
> PR80411, PR71536 -- or something else.

Hmm. It is surely among the listed items, if all fails in the last item. 
Note that PR85063 is fixed and PR84592 a duplicate of PR90779 (which is 
listed twice). To through in another number it could also be a variant 
of PR 92029 to though in yet another number …

Cheers,

Tobias
Thomas Schwinge April 29, 2020, 10 a.m. UTC | #7
Hi Tobias!

Do you happen to have any update regarding this one:

On 2020-01-08T09:55:06+0100, Tobias Burnus <tobias@codesourcery.com> wrote:
> On 1/8/20 9:33 AM, Thomas Schwinge wrote:
>> With 'dg-do run' added, on [...] x86_64-pc-linux-gnu with offloading I'm seeing:

|     PASS: libgomp.fortran/use_device_ptr-optional-3.f90   -O0  (test for excess errors)
|     PASS: libgomp.fortran/use_device_ptr-optional-3.f90   -O0  execution test
|     PASS: libgomp.fortran/use_device_ptr-optional-3.f90   -O1  (test for excess errors)
|     PASS: libgomp.fortran/use_device_ptr-optional-3.f90   -O1  execution test
|     FAIL: libgomp.fortran/use_device_ptr-optional-3.f90   -O2  (test for excess errors)
|     UNRESOLVED: libgomp.fortran/use_device_ptr-optional-3.f90   -O2  compilation failed to produce executable
|     FAIL: libgomp.fortran/use_device_ptr-optional-3.f90   -O3 -fomit-frame-pointer -funroll-loops -fpeel-loops -ftracer -finline-functions  (test for excess errors)
|     UNRESOLVED: libgomp.fortran/use_device_ptr-optional-3.f90   -O3 -fomit-frame-pointer -funroll-loops -fpeel-loops -ftracer -finline-functions  compilation failed to produce executable
|     FAIL: libgomp.fortran/use_device_ptr-optional-3.f90   -O3 -g  (test for excess errors)
|     UNRESOLVED: libgomp.fortran/use_device_ptr-optional-3.f90   -O3 -g  compilation failed to produce executable
|     FAIL: libgomp.fortran/use_device_ptr-optional-3.f90   -Os  (test for excess errors)
|     UNRESOLVED: libgomp.fortran/use_device_ptr-optional-3.f90   -Os  compilation failed to produce executable

>> ... due to:
>>      /tmp/cciVc43I.o:(.gnu.offload_vars+0x10): undefined reference to `A.12.4064'
>> which may be something like PR90779, PR85063, PR84592, PR90779,
>> PR80411, PR71536 -- or something else.
>
> Hmm. It is surely among the listed items, if all fails in the last item.
> Note that PR85063 is fixed and PR84592 a duplicate of PR90779 (which is
> listed twice). To through in another number it could also be a variant
> of PR 92029 to though in yet another number …


Grüße
 Thomas
-----------------
Mentor Graphics (Deutschland) GmbH, Arnulfstraße 201, 80634 München / Germany
Registergericht München HRB 106955, Geschäftsführer: Thomas Heurung, Alexander Walter
Tobias Burnus April 29, 2020, 2:01 p.m. UTC | #8
Hi Thomas,

was a bit on the backburner but I now digged again.
See: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94848

The problem is a generated static array variable in the
device function:
   static integer(kind=4) A.12[3] = {1, 2, 3};
used as
   _26 = A.12[S.13_67];

With -ftree-pre, the expressions using _26 now directly use
the value (1, 2 and 3). It seems as if the variable A.12 is
eliminated before writing to LTO but the usage (A.12[S.…])
is not – at least it still appears in the device dumps.

Maybe it is also related to something else, but crucial is
the -ftree-pre on the host side; the optimization level on
the device lto1 side is irrelevant.

Cheers,

Tobias

On 4/29/20 12:00 PM, Thomas Schwinge wrote:

> Hi Tobias!
>
> Do you happen to have any update regarding this one:
>
> On 2020-01-08T09:55:06+0100, Tobias Burnus <tobias@codesourcery.com> wrote:
>> On 1/8/20 9:33 AM, Thomas Schwinge wrote:
>>> With 'dg-do run' added, on [...] x86_64-pc-linux-gnu with offloading I'm seeing:
> |     PASS: libgomp.fortran/use_device_ptr-optional-3.f90   -O0  (test for excess errors)
> |     PASS: libgomp.fortran/use_device_ptr-optional-3.f90   -O0  execution test
> |     PASS: libgomp.fortran/use_device_ptr-optional-3.f90   -O1  (test for excess errors)
> |     PASS: libgomp.fortran/use_device_ptr-optional-3.f90   -O1  execution test
> |     FAIL: libgomp.fortran/use_device_ptr-optional-3.f90   -O2  (test for excess errors)
> |     UNRESOLVED: libgomp.fortran/use_device_ptr-optional-3.f90   -O2  compilation failed to produce executable
> |     FAIL: libgomp.fortran/use_device_ptr-optional-3.f90   -O3 -fomit-frame-pointer -funroll-loops -fpeel-loops -ftracer -finline-functions  (test for excess errors)
> |     UNRESOLVED: libgomp.fortran/use_device_ptr-optional-3.f90   -O3 -fomit-frame-pointer -funroll-loops -fpeel-loops -ftracer -finline-functions  compilation failed to produce executable
> |     FAIL: libgomp.fortran/use_device_ptr-optional-3.f90   -O3 -g  (test for excess errors)
> |     UNRESOLVED: libgomp.fortran/use_device_ptr-optional-3.f90   -O3 -g  compilation failed to produce executable
> |     FAIL: libgomp.fortran/use_device_ptr-optional-3.f90   -Os  (test for excess errors)
> |     UNRESOLVED: libgomp.fortran/use_device_ptr-optional-3.f90   -Os  compilation failed to produce executable
>
>>> ... due to:
>>>       /tmp/cciVc43I.o:(.gnu.offload_vars+0x10): undefined reference to `A.12.4064'
>>> which may be something like PR90779, PR85063, PR84592, PR90779,
>>> PR80411, PR71536 -- or something else.
>> Hmm. It is surely among the listed items, if all fails in the last item.
>> Note that PR85063 is fixed and PR84592 a duplicate of PR90779 (which is
>> listed twice). To through in another number it could also be a variant
>> of PR 92029 to though in yet another number …
>
> Grüße
>   Thomas
> -----------------
> Mentor Graphics (Deutschland) GmbH, Arnulfstraße 201, 80634 München / Germany
> Registergericht München HRB 106955, Geschäftsführer: Thomas Heurung, Alexander Walter
-----------------
Mentor Graphics (Deutschland) GmbH, Arnulfstraße 201, 80634 München / Germany
Registergericht München HRB 106955, Geschäftsführer: Thomas Heurung, Alexander Walter
Richard Biener April 30, 2020, 6:24 a.m. UTC | #9
On Wed, Apr 29, 2020 at 5:05 PM Tobias Burnus <tobias@codesourcery.com> wrote:
>
> Hi Thomas,
>
> was a bit on the backburner but I now digged again.
> See: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94848
>
> The problem is a generated static array variable in the
> device function:
>    static integer(kind=4) A.12[3] = {1, 2, 3};
> used as
>    _26 = A.12[S.13_67];
>
> With -ftree-pre, the expressions using _26 now directly use
> the value (1, 2 and 3). It seems as if the variable A.12 is
> eliminated before writing to LTO but the usage (A.12[S.…])
> is not – at least it still appears in the device dumps.
>
> Maybe it is also related to something else, but crucial is
> the -ftree-pre on the host side; the optimization level on
> the device lto1 side is irrelevant.

IIRC there was a bugreport similar to this where offload
variables were computed "early" instead of at the point
of streaming.

It's obvious that the offload "part" use is not reflected in
the non-offloat "part" IL (like via a function call parameter
or so), so I assume this is a local static in a function
we simply compile twice (but did not actually clone),
once for each offloat target and once for the host.  So it's
likely the above issue.

Richard.

> Cheers,
>
> Tobias
>
> On 4/29/20 12:00 PM, Thomas Schwinge wrote:
>
> > Hi Tobias!
> >
> > Do you happen to have any update regarding this one:
> >
> > On 2020-01-08T09:55:06+0100, Tobias Burnus <tobias@codesourcery.com> wrote:
> >> On 1/8/20 9:33 AM, Thomas Schwinge wrote:
> >>> With 'dg-do run' added, on [...] x86_64-pc-linux-gnu with offloading I'm seeing:
> > |     PASS: libgomp.fortran/use_device_ptr-optional-3.f90   -O0  (test for excess errors)
> > |     PASS: libgomp.fortran/use_device_ptr-optional-3.f90   -O0  execution test
> > |     PASS: libgomp.fortran/use_device_ptr-optional-3.f90   -O1  (test for excess errors)
> > |     PASS: libgomp.fortran/use_device_ptr-optional-3.f90   -O1  execution test
> > |     FAIL: libgomp.fortran/use_device_ptr-optional-3.f90   -O2  (test for excess errors)
> > |     UNRESOLVED: libgomp.fortran/use_device_ptr-optional-3.f90   -O2  compilation failed to produce executable
> > |     FAIL: libgomp.fortran/use_device_ptr-optional-3.f90   -O3 -fomit-frame-pointer -funroll-loops -fpeel-loops -ftracer -finline-functions  (test for excess errors)
> > |     UNRESOLVED: libgomp.fortran/use_device_ptr-optional-3.f90   -O3 -fomit-frame-pointer -funroll-loops -fpeel-loops -ftracer -finline-functions  compilation failed to produce executable
> > |     FAIL: libgomp.fortran/use_device_ptr-optional-3.f90   -O3 -g  (test for excess errors)
> > |     UNRESOLVED: libgomp.fortran/use_device_ptr-optional-3.f90   -O3 -g  compilation failed to produce executable
> > |     FAIL: libgomp.fortran/use_device_ptr-optional-3.f90   -Os  (test for excess errors)
> > |     UNRESOLVED: libgomp.fortran/use_device_ptr-optional-3.f90   -Os  compilation failed to produce executable
> >
> >>> ... due to:
> >>>       /tmp/cciVc43I.o:(.gnu.offload_vars+0x10): undefined reference to `A.12.4064'
> >>> which may be something like PR90779, PR85063, PR84592, PR90779,
> >>> PR80411, PR71536 -- or something else.
> >> Hmm. It is surely among the listed items, if all fails in the last item.
> >> Note that PR85063 is fixed and PR84592 a duplicate of PR90779 (which is
> >> listed twice). To through in another number it could also be a variant
> >> of PR 92029 to though in yet another number …
> >
> > Grüße
> >   Thomas
> > -----------------
> > Mentor Graphics (Deutschland) GmbH, Arnulfstraße 201, 80634 München / Germany
> > Registergericht München HRB 106955, Geschäftsführer: Thomas Heurung, Alexander Walter
> -----------------
> Mentor Graphics (Deutschland) GmbH, Arnulfstraße 201, 80634 München / Germany
> Registergericht München HRB 106955, Geschäftsführer: Thomas Heurung, Alexander Walter
Jakub Jelinek April 30, 2020, 7:17 a.m. UTC | #10
On Thu, Apr 30, 2020 at 08:24:02AM +0200, Richard Biener wrote:
> On Wed, Apr 29, 2020 at 5:05 PM Tobias Burnus <tobias@codesourcery.com> wrote:
> > was a bit on the backburner but I now digged again.
> > See: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94848
> >
> > The problem is a generated static array variable in the
> > device function:
> >    static integer(kind=4) A.12[3] = {1, 2, 3};
> > used as
> >    _26 = A.12[S.13_67];
> >
> > With -ftree-pre, the expressions using _26 now directly use
> > the value (1, 2 and 3). It seems as if the variable A.12 is
> > eliminated before writing to LTO but the usage (A.12[S.…])
> > is not – at least it still appears in the device dumps.
> >
> > Maybe it is also related to something else, but crucial is
> > the -ftree-pre on the host side; the optimization level on
> > the device lto1 side is irrelevant.
> 
> IIRC there was a bugreport similar to this where offload
> variables were computed "early" instead of at the point
> of streaming.
> 
> It's obvious that the offload "part" use is not reflected in
> the non-offloat "part" IL (like via a function call parameter
> or so), so I assume this is a local static in a function
> we simply compile twice (but did not actually clone),
> once for each offloat target and once for the host.  So it's
> likely the above issue.

I'll try to do something about that early in stage1 and
eventually backport, we need to follow OpenMP rules anyway and
those require auto-discovery of referenced functions (if they are
definitions rather than just declarations), and similarly in
some cases for variables.  And the same code then could, if something
shouldn't be marked for offloading by the standard, complain loudly rather
than leaving it up for later (lto1 of the offloading compiler).

	Jakub
Tobias Burnus May 5, 2020, 9:08 a.m. UTC | #11
On 4/30/20 9:17 AM, Jakub Jelinek wrote:

> I'll try to do something about that early in stage1 and
> eventually backport, we need to follow OpenMP rules anyway and
> those require auto-discovery of referenced functions (if they are
> definitions rather than just declarations), and similarly in
> some cases for variables.  And the same code then could, if something
> shouldn't be marked for offloading by the standard, complain loudly rather
> than leaving it up for later (lto1 of the offloading compiler).

Just as cross reference:

* PR 94320 – is about automatically marking functions as
   offloading; that's mostly relevant for C++ templates where
   using std::vector<> for instance might require that
   operator[] is marked for offloading.

* PR 94848 – is about the discussion in this thread, where
   a local static variable in a target function is partially
   optimized away. In principle, that could be solved by writing out
   the LTO stream consistently, i.e. either with the variable
   fully optimized away or being fully present (e.g. with
   "proper" function cloning).
   However, as done in other cases (see PR) – and alluded by
   Jakub above, is can also be solved by ensuring the variable
   is written out, i.e. by marking it for offloading.

Tobias

-----------------
Mentor Graphics (Deutschland) GmbH, Arnulfstraße 201, 80634 München / Germany
Registergericht München HRB 106955, Geschäftsführer: Thomas Heurung, Alexander Walter
Thomas Schwinge June 17, 2020, 10:22 p.m. UTC | #12
Hi!

On 2020-04-29T16:01:56+0200, Tobias Burnus <tobias@codesourcery.com> wrote:
> See: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94848
>
> The problem is a generated static array variable in the
> device function:
>    static integer(kind=4) A.12[3] = {1, 2, 3};
> used as
>    _26 = A.12[S.13_67];
>
> [...]

..., and that PR was recently fixed, and so...

> On 4/29/20 12:00 PM, Thomas Schwinge wrote:
>> On 2020-01-08T09:55:06+0100, Tobias Burnus <tobias@codesourcery.com> wrote:
>>> On 1/8/20 9:33 AM, Thomas Schwinge wrote:
>>>> With 'dg-do run' added, on [...] x86_64-pc-linux-gnu with offloading I'm seeing:
>> |     PASS: libgomp.fortran/use_device_ptr-optional-3.f90   -O0  (test for excess errors)
>> |     PASS: libgomp.fortran/use_device_ptr-optional-3.f90   -O0  execution test
>> |     PASS: libgomp.fortran/use_device_ptr-optional-3.f90   -O1  (test for excess errors)
>> |     PASS: libgomp.fortran/use_device_ptr-optional-3.f90   -O1  execution test
>> |     FAIL: libgomp.fortran/use_device_ptr-optional-3.f90   -O2  (test for excess errors)
>> |     UNRESOLVED: libgomp.fortran/use_device_ptr-optional-3.f90   -O2  compilation failed to produce executable
>> |     FAIL: libgomp.fortran/use_device_ptr-optional-3.f90   -O3 -fomit-frame-pointer -funroll-loops -fpeel-loops -ftracer -finline-functions  (test for excess errors)
>> |     UNRESOLVED: libgomp.fortran/use_device_ptr-optional-3.f90   -O3 -fomit-frame-pointer -funroll-loops -fpeel-loops -ftracer -finline-functions  compilation failed to produce executable
>> |     FAIL: libgomp.fortran/use_device_ptr-optional-3.f90   -O3 -g  (test for excess errors)
>> |     UNRESOLVED: libgomp.fortran/use_device_ptr-optional-3.f90   -O3 -g  compilation failed to produce executable
>> |     FAIL: libgomp.fortran/use_device_ptr-optional-3.f90   -Os  (test for excess errors)
>> |     UNRESOLVED: libgomp.fortran/use_device_ptr-optional-3.f90   -Os  compilation failed to produce executable

... that problem now is gone, too.  I've thus pushed "Add 'dg-do run' to
'libgomp.fortran/use_device_ptr-optional-3.f90' [PR94848]" to master
branch in commit 5864930754f63e2dcef9606f2514ae20e80f436e, and
releases/gcc-10 branch in commit
61c896d84bdefbfffa7573a8af89119d4db7b3de, see attached.


Grüße
 Thomas


-----------------
Mentor Graphics (Deutschland) GmbH, Arnulfstraße 201, 80634 München / Germany
Registergericht München HRB 106955, Geschäftsführer: Thomas Heurung, Alexander Walter
diff mbox series

Patch

2019-12-10  Tobias Burnus  <tobias@codesourcery.com>

	gcc/fortran/
	* trans-openmp.c (gfc_omp_check_optional_argument): Always return a
	Boolean expression; handle unallocated/disassociated actual arguments
	as absent if passed to nonallocatable/nonpointer dummy array arguments.
	(gfc_build_cond_assign): Change to assume a Boolean expr not a pointer.
	(gfc_omp_finish_clause, gfc_trans_omp_clauses): Assign NULL to generated
	array-data variable if the argument is absent. Simplify code as
	'present' is now a Boolean expression.

	libgomp/
	* testsuite/libgomp.fortran/optional-map.f90: Add test for
	unallocated/disassociated actual arguments to nonallocatable/nonpointer
	dummy arguments; those are/shall be regarded as absent arguments.
	* testsuite/libgomp.fortran/use_device_ptr-optional-2.f90: Ditto.
	* testsuite/libgomp.fortran/use_device_ptr-optional-3.f90: New.

 gcc/fortran/trans-openmp.c                         | 117 +++++++++++------
 libgomp/testsuite/libgomp.fortran/optional-map.f90 |  13 ++
 .../libgomp.fortran/use_device_ptr-optional-2.f90  |  11 ++
 .../libgomp.fortran/use_device_ptr-optional-3.f90  | 140 +++++++++++++++++++++
 4 files changed, 242 insertions(+), 39 deletions(-)

diff --git a/gcc/fortran/trans-openmp.c b/gcc/fortran/trans-openmp.c
index 356fd04e6c3..e46086d3916 100644
--- a/gcc/fortran/trans-openmp.c
+++ b/gcc/fortran/trans-openmp.c
@@ -90,11 +90,16 @@  gfc_omp_check_optional_argument (tree decl, bool for_present_check)
   if (!DECL_LANG_SPECIFIC (decl))
     return NULL_TREE;
 
+  bool is_array_type = false;
+
   /* For assumed-shape arrays, a local decl with arg->data is used.  */
   if (TREE_CODE (decl) != PARM_DECL
       && (GFC_DESCRIPTOR_TYPE_P (TREE_TYPE (decl))
 	  || GFC_ARRAY_TYPE_P (TREE_TYPE (decl))))
-    decl = GFC_DECL_SAVED_DESCRIPTOR (decl);
+    {
+      is_array_type = true;
+      decl = GFC_DECL_SAVED_DESCRIPTOR (decl);
+    }
 
   if (TREE_CODE (decl) != PARM_DECL
       || !DECL_LANG_SPECIFIC (decl)
@@ -126,7 +131,23 @@  gfc_omp_check_optional_argument (tree decl, bool for_present_check)
       return decl;
     }
 
-  return decl;
+  tree cond = fold_build2_loc (input_location, NE_EXPR, boolean_type_node,
+			       decl, null_pointer_node);
+
+  /* Fortran regards unallocated allocatables/disassociated pointer which
+     are passed to a nonallocatable, nonpointer argument as not associated;
+     cf. F2018, 15.5.2.12, Paragraph 1.  */
+  if (is_array_type)
+    {
+      tree cond2 = build_fold_indirect_ref_loc (input_location, decl);
+      cond2 = gfc_conv_array_data (cond2);
+      cond2 = fold_build2_loc (input_location, NE_EXPR, boolean_type_node,
+			       cond2, null_pointer_node);
+      cond = fold_build2_loc (input_location, TRUTH_ANDIF_EXPR,
+			      boolean_type_node, cond, cond2);
+    }
+
+  return cond;
 }
 
 
@@ -1189,7 +1210,7 @@  gfc_build_cond_assign (stmtblock_t *block, tree val, tree cond_val,
 		       tree then_b, tree else_val)
 {
   stmtblock_t cond_block;
-  tree cond, else_b = NULL_TREE;
+  tree else_b = NULL_TREE;
   tree val_ty = TREE_TYPE (val);
 
   if (else_val)
@@ -1198,15 +1219,9 @@  gfc_build_cond_assign (stmtblock_t *block, tree val, tree cond_val,
       gfc_add_modify (&cond_block, val, fold_convert (val_ty, else_val));
       else_b = gfc_finish_block (&cond_block);
     }
-  cond = fold_build2_loc (input_location, NE_EXPR,
-			  logical_type_node,
-			  cond_val, null_pointer_node);
   gfc_add_expr_to_block (block,
-			 build3_loc (input_location,
-				     COND_EXPR,
-				     void_type_node,
-				     cond, then_b,
-				     else_b));
+			 build3_loc (input_location, COND_EXPR, void_type_node,
+				     cond_val, then_b, else_b));
 }
 
 /* Build a conditional expression in BLOCK, returning a temporary
@@ -1257,8 +1272,7 @@  gfc_omp_finish_clause (tree c, gimple_seq *pre_p)
     }
 
   tree c2 = NULL_TREE, c3 = NULL_TREE, c4 = NULL_TREE;
-  tree present = (gfc_omp_is_optional_argument (decl)
-		  ? gfc_omp_check_optional_argument (decl, true) : NULL_TREE);
+  tree present = gfc_omp_check_optional_argument (decl, true);
   if (POINTER_TYPE_P (TREE_TYPE (decl)))
     {
       if (!gfc_omp_privatize_by_reference (decl)
@@ -1268,6 +1282,23 @@  gfc_omp_finish_clause (tree c, gimple_seq *pre_p)
 	  && !GFC_DESCRIPTOR_TYPE_P (TREE_TYPE (TREE_TYPE (decl))))
 	return;
       tree orig_decl = decl;
+
+      /* For nonallocatable, nonpointer arrays, a temporary variable is
+	 generated, but this one is only defined if the variable is present;
+	 hence, we now set it to NULL to avoid accessing undefined variables.
+	 We cannot use a temporary variable here as otherwise the replacement
+	 of the variables in omp-low.c will not work.  */
+      if (present && GFC_ARRAY_TYPE_P (TREE_TYPE (decl)))
+	{
+	  tree tmp = fold_build2_loc (input_location, MODIFY_EXPR,
+				      void_type_node, decl, null_pointer_node);
+	  tree cond = fold_build1_loc (input_location, TRUTH_NOT_EXPR,
+				       boolean_type_node, present);
+	  tmp = build3_loc (input_location, COND_EXPR, void_type_node,
+			    cond, tmp, NULL_TREE);
+	  gimplify_and_add (tmp, pre_p);
+	}
+
       c4 = build_omp_clause (OMP_CLAUSE_LOCATION (c), OMP_CLAUSE_MAP);
       OMP_CLAUSE_SET_MAP_KIND (c4, GOMP_MAP_POINTER);
       OMP_CLAUSE_DECL (c4) = decl;
@@ -1375,10 +1406,8 @@  gfc_omp_finish_clause (tree c, gimple_seq *pre_p)
 				  boolean_type_node, tem, null_pointer_node);
 	  if (present)
 	    {
-	      tem = fold_build2_loc (input_location, NE_EXPR, boolean_type_node,
-				     present, null_pointer_node);
 	      cond = fold_build2_loc (input_location, TRUTH_ANDIF_EXPR,
-				      boolean_type_node, tem, cond);
+				      boolean_type_node, present, cond);
 	    }
 	  gfc_add_expr_to_block (&block, build3_loc (input_location, COND_EXPR,
 						     void_type_node, cond,
@@ -2380,9 +2409,7 @@  gfc_trans_omp_clauses (stmtblock_t *block, gfc_omp_clauses *clauses,
 		TREE_ADDRESSABLE (decl) = 1;
 	      if (n->expr == NULL || n->expr->ref->u.ar.type == AR_FULL)
 		{
-		  tree present = (gfc_omp_is_optional_argument (decl)
-				  ? gfc_omp_check_optional_argument (decl, true)
-				  : NULL_TREE);
+		  tree present = gfc_omp_check_optional_argument (decl, true);
 		  if (POINTER_TYPE_P (TREE_TYPE (decl))
 		      && (gfc_omp_privatize_by_reference (decl)
 			  || GFC_DECL_GET_SCALAR_POINTER (decl)
@@ -2392,6 +2419,30 @@  gfc_trans_omp_clauses (stmtblock_t *block, gfc_omp_clauses *clauses,
 					(TREE_TYPE (TREE_TYPE (decl)))))
 		    {
 		      tree orig_decl = decl;
+
+		      /* For nonallocatable, nonpointer arrays, a temporary
+			 variable is generated, but this one is only defined if
+			 the variable is present; hence, we now set it to NULL
+			 to avoid accessing undefined variables.  We cannot use
+			 a temporary variable here as otherwise the replacement
+			 of the variables in omp-low.c will not work.  */
+		      if (present && GFC_ARRAY_TYPE_P (TREE_TYPE (decl)))
+			{
+			  tree tmp = fold_build2_loc (input_location,
+						      MODIFY_EXPR,
+						      void_type_node, decl,
+						      null_pointer_node);
+			  tree cond = fold_build1_loc (input_location,
+						       TRUTH_NOT_EXPR,
+						       boolean_type_node,
+						       present);
+			  gfc_add_expr_to_block (block,
+						 build3_loc (input_location,
+							     COND_EXPR,
+							     void_type_node,
+							     cond, tmp,
+							     NULL_TREE));
+			}
 		      node4 = build_omp_clause (input_location,
 						OMP_CLAUSE_MAP);
 		      OMP_CLAUSE_SET_MAP_KIND (node4, GOMP_MAP_POINTER);
@@ -2469,17 +2520,10 @@  gfc_trans_omp_clauses (stmtblock_t *block, gfc_omp_clauses *clauses,
 						  boolean_type_node,
 						  tem, null_pointer_node);
 			  if (present)
-			    {
-			      tree tmp = fold_build2_loc (input_location,
-							  NE_EXPR,
-							  boolean_type_node,
-							  present,
-							  null_pointer_node);
-			      cond = fold_build2_loc (input_location,
-						      TRUTH_ANDIF_EXPR,
-						      boolean_type_node,
-						      tmp, cond);
-			    }
+			    cond = fold_build2_loc (input_location,
+						    TRUTH_ANDIF_EXPR,
+						    boolean_type_node,
+						    present, cond);
 			  gfc_add_expr_to_block (block,
 						 build3_loc (input_location,
 							     COND_EXPR,
@@ -2498,16 +2542,11 @@  gfc_trans_omp_clauses (stmtblock_t *block, gfc_omp_clauses *clauses,
 			    {
 			      tree var = gfc_create_var (gfc_array_index_type,
 							 NULL);
-			      tree cond = fold_build2_loc (input_location,
-							   NE_EXPR,
-							   boolean_type_node,
-							   present,
-							   null_pointer_node);
 			      gfc_add_modify (&cond_block, var, size);
-			      cond = build3_loc (input_location, COND_EXPR,
-						 void_type_node, cond,
-						 gfc_finish_block (&cond_block),
-						 NULL_TREE);
+			      tree cond_body = gfc_finish_block (&cond_block);
+			      tree cond = build3_loc (input_location, COND_EXPR,
+						      void_type_node, present,
+						      cond_body, NULL_TREE);
 			      gfc_add_expr_to_block (block, cond);
 			      OMP_CLAUSE_SIZE (node) = var;
 			    }
diff --git a/libgomp/testsuite/libgomp.fortran/optional-map.f90 b/libgomp/testsuite/libgomp.fortran/optional-map.f90
index eebe58cc45c..b06efcc90d1 100644
--- a/libgomp/testsuite/libgomp.fortran/optional-map.f90
+++ b/libgomp/testsuite/libgomp.fortran/optional-map.f90
@@ -1,11 +1,24 @@ 
 ! { dg-do run }
 !
 implicit none (type, external)
+integer, allocatable :: a_ii, a_ival, a_iarr(:)
+integer, pointer :: p_ii, p_ival, p_iarr(:)
+
+nullify (p_ii, p_ival, p_iarr)
+
 call sub()
 call sub2()
 call call_present_1()
 call call_present_2()
 
+! unallocated/disassociated actual arguments to nonallocatable, nonpointer
+! dummy arguments are regarded as absent
+! Skipping 'ival' dummy argument due to PR fortran/92887
+call sub(ii=a_ii, iarr=a_iarr)
+call sub(ii=p_ii, iarr=p_iarr)
+call sub2(ii=a_ii, iarr=a_iarr)
+call sub2(ii=p_ii, iarr=p_iarr)
+
 contains
 
 subroutine call_present_1()
diff --git a/libgomp/testsuite/libgomp.fortran/use_device_ptr-optional-2.f90 b/libgomp/testsuite/libgomp.fortran/use_device_ptr-optional-2.f90
index d33b7d1cce0..641ebd98962 100644
--- a/libgomp/testsuite/libgomp.fortran/use_device_ptr-optional-2.f90
+++ b/libgomp/testsuite/libgomp.fortran/use_device_ptr-optional-2.f90
@@ -3,8 +3,19 @@ 
 program main
  use iso_c_binding, only: c_ptr, c_loc, c_associated
  implicit none (type, external)
+ integer, allocatable :: a_w, a_x(:)
+ integer, pointer :: p_w, p_x(:)
+
+ nullify (p_w, p_x)
  call foo()
+
+ ! unallocated/disassociated actual arguments to nonallocatable, nonpointer
+ ! dummy arguments are regarded as absent
+ call foo (w=a_w, x=a_x)
+ call foo (w=p_w, x=p_x)
+
 contains
+
   subroutine foo(v, w, x, y, z, cptr, cptr_in)
     integer, target, optional, value :: v
     integer, target, optional :: w
diff --git a/libgomp/testsuite/libgomp.fortran/use_device_ptr-optional-3.f90 b/libgomp/testsuite/libgomp.fortran/use_device_ptr-optional-3.f90
new file mode 100644
index 00000000000..f2e1a60757f
--- /dev/null
+++ b/libgomp/testsuite/libgomp.fortran/use_device_ptr-optional-3.f90
@@ -0,0 +1,140 @@ 
+! Check whether absent optional arguments are properly
+! handled with use_device_{addr,ptr}.
+program main
+  use iso_c_binding, only: c_ptr, c_loc, c_associated, c_f_pointer
+  implicit none (type, external)
+
+  integer, target :: u
+  integer, target :: v
+  integer, target :: w
+  integer, target :: x(4)
+  integer, target, allocatable :: y
+  integer, target, allocatable :: z(:)
+  type(c_ptr), target :: cptr
+  type(c_ptr), target :: cptr_in
+  integer :: dummy
+
+  u = 42
+  v = 5
+  w = 7
+  x = [3,4,6,2]
+  y = 88
+  z = [1,2,3]
+
+  !$omp target enter data map(to:u)
+  !$omp target data map(to:dummy) use_device_addr(u)
+   cptr_in = c_loc(u) ! Has to be outside 'foo' due to 'intent(in)'
+  !$omp end target data
+
+  call foo (u, v, w, x, y, z, cptr, cptr_in)
+  deallocate (y, z)
+contains
+  subroutine foo (u, v, w, x, y, z, cptr, cptr_in)
+    integer, target, optional, value :: v
+    integer, target, optional :: u, w
+    integer, target, optional :: x(:)
+    integer, target, optional, allocatable :: y
+    integer, target, optional, allocatable :: z(:)
+    type(c_ptr), target, optional, value :: cptr
+    type(c_ptr), target, optional, value, intent(in) :: cptr_in
+    integer :: d
+
+    type(c_ptr) :: p_u, p_v, p_w, p_x, p_y, p_z, p_cptr, p_cptr_in
+
+    !$omp target enter data map(to:w, x, y, z)
+    !$omp target data map(dummy) use_device_addr(x)
+      cptr = c_loc(x)
+    !$omp end target data
+
+    ! Need to map per-VALUE arguments, if present
+    if (present(v)) then
+      !$omp target enter data map(to:v)
+    else
+      stop 1
+    end if
+    if (present(cptr)) then
+      !$omp target enter data map(to:cptr)
+    else
+      stop 2
+    end if
+    if (present(cptr_in)) then
+      !$omp target enter data map(to:cptr_in)
+    else
+      stop 3
+    end if
+
+    !$omp target data map(d) use_device_addr(u, v, w, x, y, z)
+    !$omp target data map(d) use_device_addr(cptr, cptr_in)
+      if (.not. present(u)) stop 10
+      if (.not. present(v)) stop 11
+      if (.not. present(w)) stop 12
+      if (.not. present(x)) stop 13
+      if (.not. present(y)) stop 14
+      if (.not. present(z)) stop 15
+      if (.not. present(cptr)) stop 16
+      if (.not. present(cptr_in)) stop 17
+      p_u = c_loc(u)
+      p_v = c_loc(v)
+      p_w = c_loc(w)
+      p_x = c_loc(x)
+      p_y = c_loc(y)
+      p_z = c_loc(z)
+      p_cptr = c_loc(cptr)
+      p_cptr_in = c_loc(cptr_in)
+    !$omp end target data
+    !$omp end target data
+    call check(p_u, p_v, p_w, p_x, p_y, p_z, p_cptr, p_cptr_in, size(x), size(z))
+  end subroutine foo
+
+  subroutine check(p_u, p_v, p_w, p_x, p_y, p_z, p_cptr, p_cptr_in, Nx, Nz)
+    type(c_ptr), value :: p_u, p_v, p_w, p_x, p_y, p_z, p_cptr, p_cptr_in
+    integer, value :: Nx, Nz
+    integer, pointer :: c_u(:), c_v(:), c_w(:), c_x(:), c_y(:), c_z(:)
+    type(c_ptr), pointer :: c_cptr(:), c_cptr_in(:)
+
+    ! As is_device_ptr does not handle scalars, we map them to a size-1 array
+    call c_f_pointer(p_u, c_u, shape=[1])
+    call c_f_pointer(p_v, c_v, shape=[1])
+    call c_f_pointer(p_w, c_w, shape=[1])
+    call c_f_pointer(p_x, c_x, shape=[Nx])
+    call c_f_pointer(p_y, c_y, shape=[1])
+    call c_f_pointer(p_z, c_z, shape=[Nz])
+    call c_f_pointer(p_cptr, c_cptr, shape=[1])
+    call c_f_pointer(p_cptr_in, c_cptr_in, shape=[1])
+    call run_target(c_u, c_v, c_w, c_x, c_y, c_z, c_cptr, c_cptr_in, Nx, Nz)
+  end subroutine check
+
+  subroutine run_target(c_u, c_v, c_w, c_x, c_y, c_z, c_cptr, c_cptr_in, Nx, Nz)
+    integer, target :: c_u(:), c_v(:), c_w(:), c_x(:), c_y(:), c_z(:)
+    type(c_ptr) :: c_cptr(:), c_cptr_in(:)
+    integer, value :: Nx, Nz
+    !$omp target is_device_ptr(c_u, c_v, c_w, c_x, c_y, c_z, c_cptr, c_cptr_in) map(to:Nx, Nz)
+      call target_fn(c_u(1), c_v(1), c_w(1), c_x, c_y(1), c_z, c_cptr(1), c_cptr_in(1), Nx, Nz)
+    !$omp end target
+  end subroutine run_target
+
+  subroutine target_fn(c_u, c_v, c_w, c_x, c_y, c_z, c_cptr, c_cptr_in, Nx, Nz)
+    !$omp declare target
+    integer, target :: c_u, c_v, c_w, c_x(:), c_y, c_z(:)
+    type(c_ptr), value :: c_cptr, c_cptr_in
+    integer, value :: Nx, Nz
+    integer, pointer :: u, x(:)
+    if (c_u /= 42) stop 30
+    if (c_v /= 5) stop 31
+    if (c_w /= 7) stop 32
+    if (Nx /= 4) stop 33
+    if (any (c_x /= [3,4,6,2])) stop 34
+    if (c_y /= 88) stop 35
+    if (Nz /= 3) stop 36
+    if (any (c_z /= [1,2,3])) stop 37
+    if (.not. c_associated (c_cptr)) stop 38
+    if (.not. c_associated (c_cptr_in)) stop 39
+    if (.not. c_associated (c_cptr, c_loc(c_x))) stop 40
+    if (.not. c_associated (c_cptr_in, c_loc(c_u))) stop 41
+    call c_f_pointer(c_cptr_in, u)
+    call c_f_pointer(c_cptr, x, shape=[Nx])
+    if (u /= c_u .or. u /= 42)  stop 42
+    if (any (x /= c_x))  stop 43
+    if (any (x /= [3,4,6,2]))  stop 44
+  end subroutine target_fn
+end program main