From patchwork Fri Sep 27 14:52:30 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: Tobias Burnus X-Patchwork-Id: 1168593 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (mailfrom) smtp.mailfrom=gcc.gnu.org (client-ip=209.132.180.131; helo=sourceware.org; envelope-from=gcc-patches-return-509707-incoming=patchwork.ozlabs.org@gcc.gnu.org; receiver=) Authentication-Results: ozlabs.org; dmarc=none (p=none dis=none) header.from=codesourcery.com Authentication-Results: ozlabs.org; dkim=pass (1024-bit key; unprotected) header.d=gcc.gnu.org header.i=@gcc.gnu.org header.b="exXlvudK"; dkim-atps=neutral Received: from sourceware.org (server1.sourceware.org [209.132.180.131]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 46fvt508Pmz9s4Y for ; Sat, 28 Sep 2019 00:52:59 +1000 (AEST) DomainKey-Signature: a=rsa-sha1; c=nofws; d=gcc.gnu.org; h=list-id :list-unsubscribe:list-archive:list-post:list-help:sender:from :subject:to:cc:message-id:date:mime-version:content-type; q=dns; s=default; b=p5DfWAIpelYhJ1Y0cFGiXp+gTUOv+fvp2VuZsvmW6wMSEtIpNj rsr5wfAASCswlbmiwJ4x7MgqKyZamz2yru52Kdr3pEOc8WgRZF7WANg9R3s0UL/L CXOENj6nygzPP8d5IxXxr8Lu1R0VFmIfK50LhM5G8zFPzrCifwlH3yolU= DKIM-Signature: v=1; a=rsa-sha1; c=relaxed; d=gcc.gnu.org; h=list-id :list-unsubscribe:list-archive:list-post:list-help:sender:from :subject:to:cc:message-id:date:mime-version:content-type; s= default; bh=eYkbnHDph2xFR2sK0x2+28Z2iXU=; b=exXlvudKDnl4v6EAH1Ox X9w//LUlQDHHXOLqnKdpC2KjjY4q3MAT/2n61UFORC0ejQSyvZ9R4l4j2WQV4eH9 6ZGZ3rCWg2J8gJHN/zx85LvNbF253uRqxWyUnR+gyruLIokmb7dHLZCmFLQW4nD8 g35W4skrLzA5h+UTb5nR9KM= Received: (qmail 48128 invoked by alias); 27 Sep 2019 14:52:44 -0000 Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm Precedence: bulk List-Id: List-Unsubscribe: List-Archive: List-Post: List-Help: Sender: gcc-patches-owner@gcc.gnu.org Delivered-To: mailing list gcc-patches@gcc.gnu.org Received: (qmail 48109 invoked by uid 89); 27 Sep 2019 14:52:43 -0000 Authentication-Results: sourceware.org; auth=none X-Spam-SWARE-Status: No, score=-22.0 required=5.0 tests=AWL, BAYES_00, GIT_PATCH_0, GIT_PATCH_1, GIT_PATCH_2, GIT_PATCH_3, RCVD_IN_DNSWL_LOW, SPF_PASS autolearn=ham version=3.3.1 spammy= X-HELO: esa1.mentor.iphmx.com Received: from esa1.mentor.iphmx.com (HELO esa1.mentor.iphmx.com) (68.232.129.153) by sourceware.org (qpsmtpd/0.93/v0.84-503-g423c35a) with ESMTP; Fri, 27 Sep 2019 14:52:37 +0000 IronPort-SDR: HGwfA1tTV54RFBGmiaKD/LO93GOy52c5WDgt/dKagTvYizsxiKvou2OBKjCAu3ZnuZL03PmlNh AvIeJ5oKvMqsUslKPlG4qgHeVRz71MtHDUS2vkRsIkByNSm2dJdlZWcmJL3piwf+67+r3dcHY7 jJyUZLJz1wovhsGWMhP/Wa+NHL4ft5RopKjDRoHvjKJF+Q4pJ/ocxy2FaRHNniZg1mEApJygUM H9SV+3eEQzbZyVCRd+hv7L6JUkLpjCJ+/69dTtEWlbfI8v9hlt71hxT+ZIxXZzLpCu/583/XX/ uB4= Received: from orw-gwy-01-in.mentorg.com ([192.94.38.165]) by esa1.mentor.iphmx.com with ESMTP; 27 Sep 2019 06:52:35 -0800 IronPort-SDR: XkJiSaJ2OZtiBSUMoZMxamz9JeR0UiXmD+mZHke8xiBEVMU4tTKLb2pOHh2asL27SgtsEI8Tgv K7WAOucGYQiWMpTmjXVPe5eYl3+JhYyvET6Zaqu5MYA+47O2Bhi9VhkCEtQwHp+xwy/IW/1sNV pPQgtuTP5hJ7AHT/Tp4/8LdEJzr9auBPd6GIuHW2Nm7WM1W6FLXyUWyYLJekhxRNKyld1rt6KY sg+81F5OktM1TMvLzSRnZ6OzKegmxVKdOVK4SrMYtJeYne8lsy8lEoXwk67qPsq9ndT4jouoQc cs0= From: Tobias Burnus Subject: [Patch][fortran, omp-low.c] Fix OpenMP's use_device_ptr clause with array descriptors To: gcc-patches , Jakub Jelinek CC: Thomas Schwinge , Andrew Stubbs , fortran Message-ID: <9fe893c5-d986-ef05-1659-04df3db3cccb@codesourcery.com> Date: Fri, 27 Sep 2019 16:52:30 +0200 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:68.0) Gecko/20100101 Thunderbird/68.1.0 MIME-Version: 1.0 X-IsSubscribed: yes Hi all, using !$omp target data map(A) use_device_ptr(A) call foo(c_loc(A)) !$omp end target data fails on trunk if "A" is not a simple pointer but an array with descriptor. For the use_device_ptr clause, one needs to replace the host pointer address by the address on the device (done by calling libgomp's GOMP_target_data_ext). For array pointers, that's not directly "A" but "A.data" which needs to be updated. That's what the patch does. I think I covered all cases in the testsuite – but I might have missed some. "use_device_addr" might also need some changes, but as gfortran (contrary to g++/gcc and omp-low.c) does not support it, yet, I have not tested it. "is_device_addr" is tested in the run-time test case. Quoting OpenMP 5's "2.12.5 target Construct": "A list item that appears in an is_device_ptr clause must be a dummy argument that does not have the ALLOCATABLE, POINTER or VALUE attribute." As that test case (commented parts) shows, this is currently one of those things which are not tested at compile time – and not handled at run time (ICE). In this patch round, I have not attempted to fix it. Normally array-descriptor dummy arguments are immediately converted using: "dummy.0 = dummy->data" (happens during the array-bounds remapping). As "from.0" is what arrives in omp-low.c, that one can be handled flawlessly. I have tested the patch without offloading support on my laptop – and on the server which actually supports nvptx offloading (both: x86_64-gnu-linux). I am sure, there will be many suggestions how things should be done better. Still: OK for the trunk? Thanks, Tobias PS: A follow-up work is to add support for the "use_device_addr" clause and adding a bunch of argument checks; albeit, I do _not_ intent to do it as immediate follow-up patch. 2019-09-27 Tobias Burnus gcc/fortran/ * trans.h (gfc_omp_array_data): Declare. * trans-openmp.c (gfc_omp_array_data): New. * f95-lang.c (LANG_HOOKS_OMP_ARRAY_DATA): Use it. * trans-array.c (gfc_conv_descriptor_data_get): Handle reference types. gcc/ * hooks.h (hook_tree_tree_null): Declare. * hooks.c (hook_tree_tree_null): New. * langhooks-def.h (LANG_HOOKS_OMP_ARRAY_DATA): New. (LANG_HOOKS_FOR_TYPES_INITIALIZER): Add it. * langhooks.h (lang_hooks_for_types): Add omp_array_data. * omp-low.c (omp_context): Add array_data_map. (install_var_field): Handle unnamed field declarations. (delete_omp_context): Dealloc array_data_map. (scan_sharing_clauses, lower_omp_target): Handle array descriptors in use_device_ptr. gcc/testsuite/ * gfortran.dg/gomp/use_device_ptr1.f90: New. * gfortran.dg/gomp/use_device_ptr2.f90: New. * gfortran.dg/gomp/use_device_ptr3.f90: New. libgomp/ * testsuite/libgomp.fortran/use_device_ptr1.f90: New. gcc/fortran/f95-lang.c | 2 + gcc/fortran/trans-array.c | 3 + gcc/fortran/trans-openmp.c | 19 + gcc/fortran/trans.h | 1 + gcc/hooks.c | 6 + gcc/hooks.h | 1 + gcc/langhooks-def.h | 2 + gcc/langhooks.h | 4 + gcc/omp-low.c | 113 +++- gcc/testsuite/gfortran.dg/gomp/use_device_ptr1.f90 | 102 ++++ gcc/testsuite/gfortran.dg/gomp/use_device_ptr2.f90 | 107 ++++ gcc/testsuite/gfortran.dg/gomp/use_device_ptr3.f90 | 108 ++++ .../testsuite/libgomp.fortran/use_device_ptr1.f90 | 590 +++++++++++++++++++++ 13 files changed, 1047 insertions(+), 11 deletions(-) diff --git a/gcc/fortran/f95-lang.c b/gcc/fortran/f95-lang.c index 6b9f490d2bb..259d767e390 100644 --- a/gcc/fortran/f95-lang.c +++ b/gcc/fortran/f95-lang.c @@ -112,6 +112,7 @@ static const struct attribute_spec gfc_attribute_table[] = #undef LANG_HOOKS_MARK_ADDRESSABLE #undef LANG_HOOKS_TYPE_FOR_MODE #undef LANG_HOOKS_TYPE_FOR_SIZE +#undef LANG_HOOKS_OMP_ARRAY_DATA #undef LANG_HOOKS_INIT_TS #undef LANG_HOOKS_OMP_PRIVATIZE_BY_REFERENCE #undef LANG_HOOKS_OMP_PREDETERMINED_SHARING @@ -145,6 +146,7 @@ static const struct attribute_spec gfc_attribute_table[] = #define LANG_HOOKS_TYPE_FOR_MODE gfc_type_for_mode #define LANG_HOOKS_TYPE_FOR_SIZE gfc_type_for_size #define LANG_HOOKS_INIT_TS gfc_init_ts +#define LANG_HOOKS_OMP_ARRAY_DATA gfc_omp_array_data #define LANG_HOOKS_OMP_PRIVATIZE_BY_REFERENCE gfc_omp_privatize_by_reference #define LANG_HOOKS_OMP_PREDETERMINED_SHARING gfc_omp_predetermined_sharing #define LANG_HOOKS_OMP_REPORT_DECL gfc_omp_report_decl diff --git a/gcc/fortran/trans-array.c b/gcc/fortran/trans-array.c index 07c4d7e671f..822c53508dc 100644 --- a/gcc/fortran/trans-array.c +++ b/gcc/fortran/trans-array.c @@ -142,6 +142,9 @@ gfc_conv_descriptor_data_get (tree desc) tree field, type, t; type = TREE_TYPE (desc); + if (TREE_CODE (type) == REFERENCE_TYPE) + type = TREE_TYPE (type); + gcc_assert (GFC_DESCRIPTOR_TYPE_P (type)); field = TYPE_FIELDS (type); diff --git a/gcc/fortran/trans-openmp.c b/gcc/fortran/trans-openmp.c index b4c77aebf4d..809fbf52515 100644 --- a/gcc/fortran/trans-openmp.c +++ b/gcc/fortran/trans-openmp.c @@ -47,6 +47,25 @@ along with GCC; see the file COPYING3. If not see int ompws_flags; +tree +gfc_omp_array_data (tree decl) +{ + tree type = TREE_TYPE (decl); + + if (TREE_CODE (type) == REFERENCE_TYPE || POINTER_TYPE_P (type)) + type = TREE_TYPE (type); + + if (!GFC_DESCRIPTOR_TYPE_P (type)) + return NULL_TREE; + + if (POINTER_TYPE_P (TREE_TYPE (decl))) + decl = build_fold_indirect_ref (decl); + + decl = gfc_conv_descriptor_data_get (decl); + STRIP_NOPS (decl); + return decl; +} + /* True if OpenMP should privatize what this DECL points to rather than the DECL itself. */ diff --git a/gcc/fortran/trans.h b/gcc/fortran/trans.h index 84793dc1df0..56e77fd4b35 100644 --- a/gcc/fortran/trans.h +++ b/gcc/fortran/trans.h @@ -785,6 +785,7 @@ struct array_descr_info; bool gfc_get_array_descr_info (const_tree, struct array_descr_info *); /* In trans-openmp.c */ +tree gfc_omp_array_data (tree); bool gfc_omp_privatize_by_reference (const_tree); enum omp_clause_default_kind gfc_omp_predetermined_sharing (tree); tree gfc_omp_report_decl (tree); diff --git a/gcc/hooks.c b/gcc/hooks.c index ca731c440e7..6456eb1eebc 100644 --- a/gcc/hooks.c +++ b/gcc/hooks.c @@ -430,6 +430,12 @@ hook_tree_tree_int_treep_bool_null (tree, int, tree *, bool) return NULL; } +tree +hook_tree_tree_null (tree) +{ + return NULL; +} + tree hook_tree_tree_tree_null (tree, tree) { diff --git a/gcc/hooks.h b/gcc/hooks.h index 040eff008db..53671fbd7d1 100644 --- a/gcc/hooks.h +++ b/gcc/hooks.h @@ -106,6 +106,7 @@ extern HOST_WIDE_INT hook_hwi_void_0 (void); extern tree hook_tree_const_tree_null (const_tree); extern tree hook_tree_void_null (void); +extern tree hook_tree_tree_null (tree); extern tree hook_tree_tree_tree_null (tree, tree); extern tree hook_tree_tree_tree_tree_null (tree, tree, tree); extern tree hook_tree_tree_int_treep_bool_null (tree, int, tree *, bool); diff --git a/gcc/langhooks-def.h b/gcc/langhooks-def.h index a059841b3df..46ec9d11968 100644 --- a/gcc/langhooks-def.h +++ b/gcc/langhooks-def.h @@ -188,6 +188,7 @@ extern tree lhd_unit_size_without_reusable_padding (tree); #define LANG_HOOKS_OMP_FIRSTPRIVATIZE_TYPE_SIZES \ lhd_omp_firstprivatize_type_sizes #define LANG_HOOKS_OMP_MAPPABLE_TYPE lhd_omp_mappable_type +#define LANG_HOOKS_OMP_ARRAY_DATA hook_tree_tree_null #define LANG_HOOKS_TYPE_HASH_EQ NULL #define LANG_HOOKS_COPY_LANG_QUALIFIERS NULL #define LANG_HOOKS_GET_ARRAY_DESCR_INFO NULL @@ -214,6 +215,7 @@ extern tree lhd_unit_size_without_reusable_padding (tree); LANG_HOOKS_TYPE_MAX_SIZE, \ LANG_HOOKS_OMP_FIRSTPRIVATIZE_TYPE_SIZES, \ LANG_HOOKS_OMP_MAPPABLE_TYPE, \ + LANG_HOOKS_OMP_ARRAY_DATA, \ LANG_HOOKS_TYPE_HASH_EQ, \ LANG_HOOKS_COPY_LANG_QUALIFIERS, \ LANG_HOOKS_GET_ARRAY_DESCR_INFO, \ diff --git a/gcc/langhooks.h b/gcc/langhooks.h index a45579b3325..f7fa5f3e224 100644 --- a/gcc/langhooks.h +++ b/gcc/langhooks.h @@ -117,6 +117,10 @@ struct lang_hooks_for_types /* Return true if TYPE is a mappable type. */ bool (*omp_mappable_type) (tree type); + /* Return a tree for of the actual data of an array descriptor - or + NULL_TREE if original tree is not an array descriptor. */ + tree (*omp_array_data) (tree); + /* Return TRUE if TYPE1 and TYPE2 are identical for type hashing purposes. Called only after doing all language independent checks. At present, this function is only called when both TYPE1 and TYPE2 are diff --git a/gcc/omp-low.c b/gcc/omp-low.c index 5db182c6841..b9522e9094c 100644 --- a/gcc/omp-low.c +++ b/gcc/omp-low.c @@ -90,6 +90,7 @@ struct omp_context /* Map variables to fields in a structure that allows communication between sending and receiving threads. */ splay_tree field_map; + splay_tree array_data_map; tree record_type; tree sender_decl; tree receiver_decl; @@ -715,7 +716,9 @@ install_var_field (tree var, bool by_ref, int mask, omp_context *ctx) tree field, type, sfield = NULL_TREE; splay_tree_key key = (splay_tree_key) var; - if ((mask & 8) != 0) + if ((mask & 16) != 0) + key = (splay_tree_key) var; + else if ((mask & 8) != 0) { key = (splay_tree_key) &DECL_UID (var); gcc_checking_assert (key != (splay_tree_key) var); @@ -745,14 +748,17 @@ install_var_field (tree var, bool by_ref, int mask, omp_context *ctx) else if ((mask & 3) == 1 && omp_is_reference (var)) type = TREE_TYPE (type); - field = build_decl (DECL_SOURCE_LOCATION (var), - FIELD_DECL, DECL_NAME (var), type); + if ((mask & 16) != 0) + field = build_decl (UNKNOWN_LOCATION, FIELD_DECL, NULL_TREE, type); + else + field = build_decl (DECL_SOURCE_LOCATION (var), + FIELD_DECL, DECL_NAME (var), type); /* Remember what variable this field was created for. This does have a side effect of making dwarf2out ignore this member, so for helpful debugging we clear it later in delete_omp_context. */ DECL_ABSTRACT_ORIGIN (field) = var; - if (type == TREE_TYPE (var)) + if ((mask & 16) == 0 && type == TREE_TYPE (var)) { SET_DECL_ALIGN (field, DECL_ALIGN (var)); DECL_USER_ALIGN (field) = DECL_USER_ALIGN (var); @@ -976,6 +982,8 @@ delete_omp_context (splay_tree_value value) splay_tree_delete (ctx->field_map); if (ctx->sfield_map) splay_tree_delete (ctx->sfield_map); + if (ctx->array_data_map) + splay_tree_delete (ctx->array_data_map); /* We hijacked DECL_ABSTRACT_ORIGIN earlier. We need to clear it before it produces corrupt debug information. */ @@ -1070,7 +1078,7 @@ fixup_child_record_type (omp_context *ctx) static void scan_sharing_clauses (tree clauses, omp_context *ctx) { - tree c, decl; + tree c, decl, x; bool scan_array_reductions = false; for (c = clauses; c; c = OMP_CLAUSE_CHAIN (c)) @@ -1240,7 +1248,30 @@ scan_sharing_clauses (tree clauses, omp_context *ctx) case OMP_CLAUSE_USE_DEVICE_PTR: case OMP_CLAUSE_USE_DEVICE_ADDR: decl = OMP_CLAUSE_DECL (c); - if ((OMP_CLAUSE_CODE (c) == OMP_CLAUSE_USE_DEVICE_ADDR + x = NULL; + // Handle array descriptors + if (TREE_CODE (TREE_TYPE (decl)) == RECORD_TYPE || + (omp_is_reference (decl) + && TREE_CODE (TREE_TYPE (TREE_TYPE (decl))) == RECORD_TYPE)) + x = lang_hooks.types.omp_array_data (decl); + + if (x) + { + gcc_assert (!ctx->array_data_map + || !splay_tree_lookup (ctx->array_data_map, + (splay_tree_key) decl)); + if (!ctx->array_data_map) + ctx->array_data_map + = splay_tree_new (splay_tree_compare_pointers, 0, 0); + + splay_tree_insert (ctx->array_data_map, (splay_tree_key) decl, + (splay_tree_value) x); + + install_var_field (x, false, 19, ctx); + DECL_SOURCE_LOCATION (lookup_field (x, ctx)) + = OMP_CLAUSE_LOCATION (c); + } + else if ((OMP_CLAUSE_CODE (c) == OMP_CLAUSE_USE_DEVICE_ADDR && !omp_is_reference (decl)) || TREE_CODE (TREE_TYPE (decl)) == ARRAY_TYPE) install_var_field (decl, true, 11, ctx); @@ -11852,7 +11883,18 @@ lower_omp_target (gimple_stmt_iterator *gsi_p, omp_context *ctx) case OMP_CLAUSE_IS_DEVICE_PTR: ovar = OMP_CLAUSE_DECL (c); var = lookup_decl_in_outer_ctx (ovar, ctx); - if (OMP_CLAUSE_CODE (c) != OMP_CLAUSE_IS_DEVICE_PTR) + + // For arrays with descriptor, use the pointer to the actual data + splay_tree_node n = ctx->array_data_map + ? splay_tree_lookup (ctx->array_data_map, + (splay_tree_key) ovar) + : NULL; + if (n) + { + tkind = GOMP_MAP_USE_DEVICE_PTR; + x = build_sender_ref ((tree) n->value, ctx); + } + else if (OMP_CLAUSE_CODE (c) != OMP_CLAUSE_IS_DEVICE_PTR) { tkind = GOMP_MAP_USE_DEVICE_PTR; x = build_sender_ref ((splay_tree_key) &DECL_UID (ovar), ctx); @@ -11863,7 +11905,9 @@ lower_omp_target (gimple_stmt_iterator *gsi_p, omp_context *ctx) x = build_sender_ref (ovar, ctx); } type = TREE_TYPE (ovar); - if ((OMP_CLAUSE_CODE (c) == OMP_CLAUSE_USE_DEVICE_ADDR + if (n) + var = (tree) n->value; + else if ((OMP_CLAUSE_CODE (c) == OMP_CLAUSE_USE_DEVICE_ADDR && !omp_is_reference (ovar)) || TREE_CODE (type) == ARRAY_TYPE) var = build_fold_addr_expr (var); @@ -12028,11 +12072,58 @@ lower_omp_target (gimple_stmt_iterator *gsi_p, omp_context *ctx) case OMP_CLAUSE_USE_DEVICE_ADDR: case OMP_CLAUSE_IS_DEVICE_PTR: var = OMP_CLAUSE_DECL (c); + tree array_data = NULL; + if (ctx->array_data_map) + { + splay_tree_node n = splay_tree_lookup (ctx->array_data_map, + (splay_tree_key) var); + if (n) + array_data = (tree) n->value; + } + if (OMP_CLAUSE_CODE (c) != OMP_CLAUSE_IS_DEVICE_PTR) - x = build_sender_ref ((splay_tree_key) &DECL_UID (var), ctx); + x = build_sender_ref (array_data + ? (splay_tree_key) array_data + : (splay_tree_key) &DECL_UID (var), ctx); else - x = build_receiver_ref (var, false, ctx); - if (is_variable_sized (var)) + x = build_receiver_ref (array_data ? array_data : var, false, ctx); + + if (array_data) + { + tree new_var = lookup_decl (var, ctx); + new_var = DECL_VALUE_EXPR (new_var); + if (omp_is_reference (var) + && OMP_CLAUSE_CODE (c) != OMP_CLAUSE_USE_DEVICE_ADDR) + { + tree type = TREE_TYPE (TREE_TYPE (var)); + tree v = create_tmp_var_raw (type, get_name (var)); + gimple_add_tmp_var (v); + TREE_ADDRESSABLE (v) = 1; + tree x2 = build_fold_indirect_ref (var); + gimplify_expr (&x2, &new_body, NULL, is_gimple_val, fb_rvalue); + gimple_seq_add_stmt (&new_body, gimple_build_assign (v, x2)); + + tree v2 = lang_hooks.types.omp_array_data (v); + gcc_assert (v2); + gimplify_expr (&x, &new_body, NULL, is_gimple_val, fb_rvalue); + gimple_seq_add_stmt (&new_body, + gimple_build_assign (v2, x)); + x = build_fold_addr_expr (v); + gimple_seq_add_stmt (&new_body, + gimple_build_assign (new_var, x)); + } + else + { + gimple_seq_add_stmt (&new_body, + gimple_build_assign (new_var, var)); + new_var = lang_hooks.types.omp_array_data (new_var); + gcc_assert (new_var); + gimplify_expr (&x, &new_body, NULL, is_gimple_val, fb_rvalue); + gimple_seq_add_stmt (&new_body, + gimple_build_assign (new_var, x)); + } + } + else if (is_variable_sized (var)) { tree pvar = DECL_VALUE_EXPR (var); gcc_assert (TREE_CODE (pvar) == INDIRECT_REF); diff --git a/gcc/testsuite/gfortran.dg/gomp/use_device_ptr1.f90 b/gcc/testsuite/gfortran.dg/gomp/use_device_ptr1.f90 new file mode 100644 index 00000000000..5b2ddf50206 --- /dev/null +++ b/gcc/testsuite/gfortran.dg/gomp/use_device_ptr1.f90 @@ -0,0 +1,102 @@ +! { dg-do compile } +! { dg-options "-fopenmp -fdump-tree-optimized -O0" } + +! Check use_device_ptr with local variables + +module offloading + use iso_c_binding + implicit none + interface + subroutine copy3_array_data(from, to, N) bind(C) + import :: c_ptr + type(c_ptr), value :: from, to + integer, value :: N + end subroutine copy3_array_data + end interface +end module offloading + +subroutine omp_device_ptr() + use iso_c_binding + use offloading + implicit none + + integer, parameter :: N = 1000 + real(c_double), pointer :: AA(:), BBB(:) + real(c_double), allocatable, target :: CC(:), DD(:) + real(c_double), target :: EE(N), FF(N), dummy(1) + + ! allocate(AA(N), BBB(N), CC(N), DD(N)) ! make dump more readable + + ! AA = 11.0_c_double + ! BBB = 22.0_c_double + ! CC = 33.0_c_double + ! DD = 44.0_c_double + ! EE = 55.0_c_double + ! FF = 66.0_c_double + + ! NOTE: OpenMP 5's use_device_addr is (at time of writing) not yet supported + + ! pointer-type array to use_device_ptr + ! !$omp target data map(to:AA) map(from:BBB) use_device_ptr(AA,BBB) + !$omp target data map(alloc:dummy) use_device_ptr(AA,BBB) + call copy3_array_data(c_loc(AA), c_loc(BBB), N) + !$omp end target data + + ! allocatable array to use_device_ptr + !!$omp target data map(to:CC) map(from:DD) use_device_ptr(CC,DD) + !$omp target data map(alloc:dummy) use_device_ptr(CC,DD) + call copy3_array_data(c_loc(CC), c_loc(DD), N) + !$omp end target data + + ! fixed-size decriptorless array to use_device_ptr + !!$omp target data map(to:EE) map(from:FF) use_device_ptr(EE,FF) + !$omp target data map(alloc:dummy) use_device_ptr(EE,FF) + call copy3_array_data(c_loc(EE), c_loc(FF), N) + !$omp end target data + + ! deallocate(AA, BBB) ! Free all pointers, only +end subroutine omp_device_ptr + + +! { dg-final { scan-tree-dump-not ".omp_data_arr.\[0-9\]+.aa = &aa" "optimized" } } +! { dg-final { scan-tree-dump-not ".omp_data_arr.\[0-9\]+.aa = aa" "optimized" } } +! { dg-final { scan-tree-dump-not ".omp_data_arr.\[0-9\]+.bbb = &bbb" "optimized" } } +! { dg-final { scan-tree-dump-not ".omp_data_arr.\[0-9\]+.bbb = bbb" "optimized" } } +! { dg-final { scan-tree-dump-not ".omp_data_arr.\[0-9\]+.cc = &cc" "optimized" } } +! { dg-final { scan-tree-dump-not ".omp_data_arr.\[0-9\]+.cc = cc" "optimized" } } +! { dg-final { scan-tree-dump-not ".omp_data_arr.\[0-9\]+.dd = &dd" "optimized" } } +! { dg-final { scan-tree-dump-not ".omp_data_arr.\[0-9\]+.dd = dd" "optimized" } } +! { dg-final { scan-tree-dump-not ".omp_data_arr.\[0-9\]+.ee = ee" "optimized" } } +! { dg-final { scan-tree-dump-not ".omp_data_arr.\[0-9\]+.ff = ff" "optimized" } } +! { dg-final { scan-tree-dump-times ".omp_data_arr.\[0-9\]+.ee = &ee" 1 "optimized" } } +! { dg-final { scan-tree-dump-times ".omp_data_arr.\[0-9\]+.ff = &ff" 1 "optimized" } } + +! { dg-final { scan-tree-dump-times "_\[0-9\]+ = aa.data;" 1 "optimized" } } +! { dg-final { scan-tree-dump-times "aa\.\[0-9\]+ = aa;" 1 "optimized" } } +! { dg-final { scan-tree-dump-times "aa\.\[0-9\]+\.data = _\[0-9\]+;" 1 "optimized" } } +! { dg-final { scan-tree-dump-times "_\[0-9\]+ = aa\.\[0-9\]+\.data;" 1 "optimized" } } + +! { dg-final { scan-tree-dump-times "_\[0-9\]+ = bbb.data;" 1 "optimized" } } +! { dg-final { scan-tree-dump-times "bbb\.\[0-9\]+ = bbb;" 1 "optimized" } } +! { dg-final { scan-tree-dump-times "bbb\.\[0-9\]+\.data = _\[0-9\]+;" 1 "optimized" } } +! { dg-final { scan-tree-dump-times "_\[0-9\]+ = bbb\.\[0-9\]+\.data;" 1 "optimized" } } + +! '3' because of automatic deallocation +! { dg-final { scan-tree-dump-times "_\[0-9\]+ = cc.data;" 3 "optimized" } } +! { dg-final { scan-tree-dump-times "cc\.\[0-9\]+ = cc;" 1 "optimized" } } +! { dg-final { scan-tree-dump-times "cc\.\[0-9\]+\.data = _\[0-9\]+;" 1 "optimized" } } +! { dg-final { scan-tree-dump-times "_\[0-9\]+ = cc\.\[0-9\]+\.data;" 1 "optimized" } } + +! '3' because of automatic deallocation +! { dg-final { scan-tree-dump-times "_\[0-9\]+ = dd.data;" 3 "optimized" } } +! { dg-final { scan-tree-dump-times "dd\.\[0-9\]+ = dd;" 1 "optimized" } } +! { dg-final { scan-tree-dump-times "dd\.\[0-9\]+\.data = _\[0-9\]+;" 1 "optimized" } } +! { dg-final { scan-tree-dump-times "_\[0-9\]+ = dd\.\[0-9\]+\.data;" 1 "optimized" } } + +! { dg-final { scan-tree-dump-times "_\[0-9\]+ = .omp_data_arr.\[0-9\]+.ee;" 1 "optimized" } } +! { dg-final { scan-tree-dump-times "ee\.\[0-9\]+_\[0-9\]+ = _\[0-9\]+;" 1 "optimized" } } +! { dg-final { scan-tree-dump-times "_\[0-9\]+ = &\\\(\\*ee.\[0-9\]+_\[0-9\]+\\\);" 1 "optimized" } } + +! { dg-final { scan-tree-dump-times "_\[0-9\]+ = .omp_data_arr.\[0-9\]+.ff;" 1 "optimized" } } +! { dg-final { scan-tree-dump-times "ff\.\[0-9\]+_\[0-9\]+ = _\[0-9\]+;" 1 "optimized" } } +! { dg-final { scan-tree-dump-times "_\[0-9\]+ = &\\\(\\*ff.\[0-9\]+_\[0-9\]+\\\);" 1 "optimized" } } diff --git a/gcc/testsuite/gfortran.dg/gomp/use_device_ptr2.f90 b/gcc/testsuite/gfortran.dg/gomp/use_device_ptr2.f90 new file mode 100644 index 00000000000..c2b837a416b --- /dev/null +++ b/gcc/testsuite/gfortran.dg/gomp/use_device_ptr2.f90 @@ -0,0 +1,107 @@ +! { dg-do compile } +! { dg-options "-fopenmp -fdump-tree-optimized -O0" } + +! Check use_device_ptr with nonoptional dummy arguments + +module offloading + use iso_c_binding + implicit none + interface + subroutine copy3_array_data(from, to, N) bind(C) + import :: c_ptr + type(c_ptr), value :: from, to + integer, value :: N + end subroutine copy3_array_data + end interface +end module offloading + +subroutine omp_device_ptr(AA, BBB, CC, DD, EE, FF, N) + use iso_c_binding + use offloading + implicit none + + integer, value :: N + real(c_double), pointer :: AA(:), BBB(:) + real(c_double), allocatable, target :: CC(:), DD(:) + real(c_double), target :: EE(N), FF(N), dummy(1) + + ! allocate(AA(N), BBB(N), CC(N), DD(N)) ! make dump more readable + + ! AA = 11.0_c_double + ! BBB = 22.0_c_double + ! CC = 33.0_c_double + ! DD = 44.0_c_double + ! EE = 55.0_c_double + ! FF = 66.0_c_double + + ! NOTE: OpenMP 5's use_device_addr is (at time of writing) not yet supported + + ! pointer-type array to use_device_ptr + ! !$omp target data map(to:AA) map(from:BBB) use_device_ptr(AA,BBB) + !$omp target data map(alloc:dummy) use_device_ptr(AA,BBB) + call copy3_array_data(c_loc(AA), c_loc(BBB), N) + !$omp end target data + + ! allocatable array to use_device_ptr + !!$omp target data map(to:CC) map(from:DD) use_device_ptr(CC,DD) + !$omp target data map(alloc:dummy) use_device_ptr(CC,DD) + call copy3_array_data(c_loc(CC), c_loc(DD), N) + !$omp end target data + + ! fixed-size decriptorless array to use_device_ptr + !!$omp target data map(to:EE) map(from:FF) use_device_ptr(EE,FF) + !$omp target data map(alloc:dummy) use_device_ptr(EE,FF) + call copy3_array_data(c_loc(EE), c_loc(FF), N) + !$omp end target data + + ! deallocate(AA, BBB) ! Free all pointers, only +end subroutine omp_device_ptr + +! { dg-final { scan-tree-dump-not ".omp_data_arr.aa" "optimized" } } +! { dg-final { scan-tree-dump-not ".omp_data_arr.\[0-9\]+.aa" "optimized" } } +! { dg-final { scan-tree-dump-not ".omp_data_arr.bbb" "optimized" } } +! { dg-final { scan-tree-dump-not ".omp_data_arr.\[0-9\]+.bbb" "optimized" } } +! { dg-final { scan-tree-dump-not ".omp_data_arr.cc" "optimized" } } +! { dg-final { scan-tree-dump-not ".omp_data_arr.\[0-9\]+.cc" "optimized" } } +! { dg-final { scan-tree-dump-not ".omp_data_arr.dd" "optimized" } } +! { dg-final { scan-tree-dump-not ".omp_data_arr.\[0-9\]+.dd" "optimized" } } + +! { dg-final { scan-tree-dump-times ".omp_data_arr.\[0-9\]+.D.\[0-9\]+ = _\[0-9\]+" 4 "optimized" } } +! { dg-final { scan-tree-dump-times "_\[0-9\]+ = .omp_data_arr.\[0-9\]+.D.\[0-9\]+" 4 "optimized" } } +! { dg-final { scan-tree-dump-times ".omp_data_arr.\[0-9\]+.ee = ee_\[0-9\]+.D.;" 1 "optimized" } } +! { dg-final { scan-tree-dump-times ".omp_data_arr.\[0-9\]+.ff = ff_\[0-9\]+.D.;" 1 "optimized" } } +! { dg-final { scan-tree-dump-times "_\[0-9\]+ = .omp_data_arr.\[0-9\]+.ee;" 1 "optimized" } } + +! { dg-final { scan-tree-dump-times "_\[0-9\]+ = aa_\[0-9\]+.D.->data;" 1 "optimized" } } +! { dg-final { scan-tree-dump-times "D.\[0-9\]+ = \\*aa_\[0-9\]+.D.;" 1 "optimized" } } +! { dg-final { scan-tree-dump-times "aa.\[0-9\]+ = D.\[0-9\]+;" 1 "optimized" } } +! { dg-final { scan-tree-dump-times "aa.\[0-9\]+.data = _\[0-9\]+;" 1 "optimized" } } +! { dg-final { scan-tree-dump-times "aa.\[0-9\]+_\[0-9\]+ = &aa.\[0-9\]+;" 1 "optimized" } } +! { dg-final { scan-tree-dump-times "_\[0-9\]+ = aa.\[0-9\]+_\[0-9\]+->data;" 1 "optimized" } } + +! { dg-final { scan-tree-dump-times "_\[0-9\]+ = bbb_\[0-9\]+.D.->data;" 1 "optimized" } } +! { dg-final { scan-tree-dump-times "D.\[0-9\]+ = \\*bbb_\[0-9\]+.D.;" 1 "optimized" } } +! { dg-final { scan-tree-dump-times "bbb.\[0-9\]+ = D.\[0-9\]+;" 1 "optimized" } } +! { dg-final { scan-tree-dump-times "bbb.\[0-9\]+.data = _\[0-9\]+;" 1 "optimized" } } +! { dg-final { scan-tree-dump-times "bbb.\[0-9\]+_\[0-9\]+ = &bbb.\[0-9\]+;" 1 "optimized" } } +! { dg-final { scan-tree-dump-times "_\[0-9\]+ = bbb.\[0-9\]+_\[0-9\]+->data;" 1 "optimized" } } + +! { dg-final { scan-tree-dump-times "_\[0-9\]+ = cc_\[0-9\]+.D.->data;" 1 "optimized" } } +! { dg-final { scan-tree-dump-times "D.\[0-9\]+ = \\*cc_\[0-9\]+.D.;" 1 "optimized" } } +! { dg-final { scan-tree-dump-times "cc.\[0-9\]+ = D.\[0-9\]+;" 1 "optimized" } } +! { dg-final { scan-tree-dump-times "cc.\[0-9\]+.data = _\[0-9\]+;" 1 "optimized" } } +! { dg-final { scan-tree-dump-times "cc.\[0-9\]+_\[0-9\]+ = &cc.\[0-9\]+;" 1 "optimized" } } +! { dg-final { scan-tree-dump-times "_\[0-9\]+ = cc.\[0-9\]+_\[0-9\]+->data;" 1 "optimized" } } + +! { dg-final { scan-tree-dump-times "_\[0-9\]+ = dd_\[0-9\]+.D.->data;" 1 "optimized" } } +! { dg-final { scan-tree-dump-times "D.\[0-9\]+ = \\*dd_\[0-9\]+.D.;" 1 "optimized" } } +! { dg-final { scan-tree-dump-times "dd.\[0-9\]+ = D.\[0-9\]+;" 1 "optimized" } } +! { dg-final { scan-tree-dump-times "dd.\[0-9\]+.data = _\[0-9\]+;" 1 "optimized" } } +! { dg-final { scan-tree-dump-times "dd.\[0-9\]+_\[0-9\]+ = &dd.\[0-9\]+;" 1 "optimized" } } +! { dg-final { scan-tree-dump-times "_\[0-9\]+ = dd.\[0-9\]+_\[0-9\]+->data;" 1 "optimized" } } + +! { dg-final { scan-tree-dump-times "ee.\[0-9\]+_\[0-9\]+ = _\[0-9\]+;" 1 "optimized" } } +! { dg-final { scan-tree-dump-times "_\[0-9\]+ = ee.\[0-9\]+_\[0-9\]+" 1 "optimized" } } + +! { dg-final { scan-tree-dump-times "ff.\[0-9\]+_\[0-9\]+ = _\[0-9\]+;" 1 "optimized" } } +! { dg-final { scan-tree-dump-times "_\[0-9\]+ = ff.\[0-9\]+_\[0-9\]+" 1 "optimized" } } diff --git a/gcc/testsuite/gfortran.dg/gomp/use_device_ptr3.f90 b/gcc/testsuite/gfortran.dg/gomp/use_device_ptr3.f90 new file mode 100644 index 00000000000..7ce0ca6eb0b --- /dev/null +++ b/gcc/testsuite/gfortran.dg/gomp/use_device_ptr3.f90 @@ -0,0 +1,108 @@ +! { dg-do compile } +! { dg-options "-fopenmp -fdump-tree-optimized -O0" } + +! Check use_device_ptr with optional dummy arguments + +module offloading + use iso_c_binding + implicit none + interface + subroutine copy3_array_data(from, to, N) bind(C) + import :: c_ptr + type(c_ptr), value :: from, to + integer, value :: N + end subroutine copy3_array_data + end interface +end module offloading + +subroutine omp_device_ptr(AA, BBB, CC, DD, EE, FF, N) + use iso_c_binding + use offloading + implicit none + + integer, value :: N + real(c_double), optional, pointer :: AA(:), BBB(:) + real(c_double), optional, allocatable, target :: CC(:), DD(:) + real(c_double), optional, target :: EE(N), FF(N) + real(c_double) :: dummy(1) + + ! allocate(AA(N), BBB(N), CC(N), DD(N)) ! make dump more readable + + ! AA = 11.0_c_double + ! BBB = 22.0_c_double + ! CC = 33.0_c_double + ! DD = 44.0_c_double + ! EE = 55.0_c_double + ! FF = 66.0_c_double + + ! NOTE: OpenMP 5's use_device_addr is (at time of writing) not yet supported + + ! pointer-type array to use_device_ptr + ! !$omp target data map(to:AA) map(from:BBB) use_device_ptr(AA,BBB) + !$omp target data map(alloc:dummy) use_device_ptr(AA,BBB) + call copy3_array_data(c_loc(AA), c_loc(BBB), N) + !$omp end target data + + ! allocatable array to use_device_ptr + !!$omp target data map(to:CC) map(from:DD) use_device_ptr(CC,DD) + !$omp target data map(alloc:dummy) use_device_ptr(CC,DD) + call copy3_array_data(c_loc(CC), c_loc(DD), N) + !$omp end target data + + ! fixed-size decriptorless array to use_device_ptr + !!$omp target data map(to:EE) map(from:FF) use_device_ptr(EE,FF) + !$omp target data map(alloc:dummy) use_device_ptr(EE,FF) + call copy3_array_data(c_loc(EE), c_loc(FF), N) + !$omp end target data + + ! deallocate(AA, BBB) ! Free all pointers, only +end subroutine omp_device_ptr + +! { dg-final { scan-tree-dump-not ".omp_data_arr.aa" "optimized" } } +! { dg-final { scan-tree-dump-not ".omp_data_arr.\[0-9\]+.aa" "optimized" } } +! { dg-final { scan-tree-dump-not ".omp_data_arr.bbb" "optimized" } } +! { dg-final { scan-tree-dump-not ".omp_data_arr.\[0-9\]+.bbb" "optimized" } } +! { dg-final { scan-tree-dump-not ".omp_data_arr.cc" "optimized" } } +! { dg-final { scan-tree-dump-not ".omp_data_arr.\[0-9\]+.cc" "optimized" } } +! { dg-final { scan-tree-dump-not ".omp_data_arr.dd" "optimized" } } +! { dg-final { scan-tree-dump-not ".omp_data_arr.\[0-9\]+.dd" "optimized" } } + +! { dg-final { scan-tree-dump-times ".omp_data_arr.\[0-9\]+.D.\[0-9\]+ = _\[0-9\]+" 4 "optimized" } } +! { dg-final { scan-tree-dump-times "_\[0-9\]+ = .omp_data_arr.\[0-9\]+.D.\[0-9\]+" 4 "optimized" } } +! { dg-final { scan-tree-dump-times ".omp_data_arr.\[0-9\]+.ee = ee_\[0-9\]+.D.;" 1 "optimized" } } +! { dg-final { scan-tree-dump-times ".omp_data_arr.\[0-9\]+.ff = ff_\[0-9\]+.D.;" 1 "optimized" } } +! { dg-final { scan-tree-dump-times "_\[0-9\]+ = .omp_data_arr.\[0-9\]+.ee;" 1 "optimized" } } + +! { dg-final { scan-tree-dump-times "_\[0-9\]+ = aa_\[0-9\]+.D.->data;" 1 "optimized" } } +! { dg-final { scan-tree-dump-times "D.\[0-9\]+ = \\*aa_\[0-9\]+.D.;" 1 "optimized" } } +! { dg-final { scan-tree-dump-times "aa.\[0-9\]+ = D.\[0-9\]+;" 1 "optimized" } } +! { dg-final { scan-tree-dump-times "aa.\[0-9\]+.data = _\[0-9\]+;" 1 "optimized" } } +! { dg-final { scan-tree-dump-times "aa.\[0-9\]+_\[0-9\]+ = &aa.\[0-9\]+;" 1 "optimized" } } +! { dg-final { scan-tree-dump-times "_\[0-9\]+ = aa.\[0-9\]+_\[0-9\]+->data;" 1 "optimized" } } + +! { dg-final { scan-tree-dump-times "_\[0-9\]+ = bbb_\[0-9\]+.D.->data;" 1 "optimized" } } +! { dg-final { scan-tree-dump-times "D.\[0-9\]+ = \\*bbb_\[0-9\]+.D.;" 1 "optimized" } } +! { dg-final { scan-tree-dump-times "bbb.\[0-9\]+ = D.\[0-9\]+;" 1 "optimized" } } +! { dg-final { scan-tree-dump-times "bbb.\[0-9\]+.data = _\[0-9\]+;" 1 "optimized" } } +! { dg-final { scan-tree-dump-times "bbb.\[0-9\]+_\[0-9\]+ = &bbb.\[0-9\]+;" 1 "optimized" } } +! { dg-final { scan-tree-dump-times "_\[0-9\]+ = bbb.\[0-9\]+_\[0-9\]+->data;" 1 "optimized" } } + +! { dg-final { scan-tree-dump-times "_\[0-9\]+ = cc_\[0-9\]+.D.->data;" 1 "optimized" } } +! { dg-final { scan-tree-dump-times "D.\[0-9\]+ = \\*cc_\[0-9\]+.D.;" 1 "optimized" } } +! { dg-final { scan-tree-dump-times "cc.\[0-9\]+ = D.\[0-9\]+;" 1 "optimized" } } +! { dg-final { scan-tree-dump-times "cc.\[0-9\]+.data = _\[0-9\]+;" 1 "optimized" } } +! { dg-final { scan-tree-dump-times "cc.\[0-9\]+_\[0-9\]+ = &cc.\[0-9\]+;" 1 "optimized" } } +! { dg-final { scan-tree-dump-times "_\[0-9\]+ = cc.\[0-9\]+_\[0-9\]+->data;" 1 "optimized" } } + +! { dg-final { scan-tree-dump-times "_\[0-9\]+ = dd_\[0-9\]+.D.->data;" 1 "optimized" } } +! { dg-final { scan-tree-dump-times "D.\[0-9\]+ = \\*dd_\[0-9\]+.D.;" 1 "optimized" } } +! { dg-final { scan-tree-dump-times "dd.\[0-9\]+ = D.\[0-9\]+;" 1 "optimized" } } +! { dg-final { scan-tree-dump-times "dd.\[0-9\]+.data = _\[0-9\]+;" 1 "optimized" } } +! { dg-final { scan-tree-dump-times "dd.\[0-9\]+_\[0-9\]+ = &dd.\[0-9\]+;" 1 "optimized" } } +! { dg-final { scan-tree-dump-times "_\[0-9\]+ = dd.\[0-9\]+_\[0-9\]+->data;" 1 "optimized" } } + +! { dg-final { scan-tree-dump-times "ee.\[0-9\]+_\[0-9\]+ = _\[0-9\]+;" 1 "optimized" } } +! { dg-final { scan-tree-dump-times "_\[0-9\]+ = ee.\[0-9\]+_\[0-9\]+" 1 "optimized" } } + +! { dg-final { scan-tree-dump-times "ff.\[0-9\]+_\[0-9\]+ = _\[0-9\]+;" 1 "optimized" } } +! { dg-final { scan-tree-dump-times "_\[0-9\]+ = ff.\[0-9\]+_\[0-9\]+" 1 "optimized" } } diff --git a/libgomp/testsuite/libgomp.fortran/use_device_ptr1.f90 b/libgomp/testsuite/libgomp.fortran/use_device_ptr1.f90 new file mode 100644 index 00000000000..1efc2b7c3fc --- /dev/null +++ b/libgomp/testsuite/libgomp.fortran/use_device_ptr1.f90 @@ -0,0 +1,590 @@ +module offloading + use iso_c_binding + implicit none +contains + subroutine copy3_array_data_int(from, to, N) + !$omp declare target + type(c_ptr), value :: from, to + integer, value :: N + + real(c_double), pointer :: from_ptr(:) + real(c_double), pointer :: to_ptr(:) + integer :: i + + call c_f_pointer(from, from_ptr, shape=[N]) + call c_f_pointer(to, to_ptr, shape=[N]) + !$omp parallel do + do i = 1, N + to_ptr(i) = 3 * from_ptr(i) + end do + !$omp end parallel do + end subroutine copy3_array_data_int + + subroutine copy3_array_data(from, to, N) + type(c_ptr), value :: from, to + integer, value :: N + !$omp target is_device_ptr(from, to) + call copy3_array_data_int(from, to, N) + !$omp end target + end subroutine copy3_array_data + + subroutine copy3_array1(from, to) + real(c_double), target :: from(:), to(:) + integer :: N + N = size(from) + + !$omp target is_device_ptr(from, to) + call copy3_array_data_int(c_loc(from), c_loc(to), N) + !$omp end target + end subroutine copy3_array1 + +! ICE - the following code gives (currently) an ICE +! It is accepted by the frontend but it is invalid +! OpenMP 5 as only "a dummy argument that does not have the +! ALLOCATABLE, POINTER or VALUE attribute." +! +! subroutine copy3_array2(from, to) +! real(c_double), pointer :: from(:), to(:) +! integer :: N +! N = size(from) +! +! !$omp target is_device_ptr(from, to) +! call copy3_array_data_int(c_loc(from), c_loc(to), N) +! !$omp end target +! end subroutine copy3_array2 + + subroutine copy3_array3(from, to) + real(c_double), optional, target :: from(:), to(:) + integer :: N + N = size(from) + + !$omp target is_device_ptr(from, to) + call copy3_array_data_int(c_loc(from), c_loc(to), N) + !$omp end target + end subroutine copy3_array3 + +! ICE - the following code gives (currently) an ICE +! It is accepted by the frontend but it is invalid +! OpenMP 5 as only "a dummy argument that does not have the +! ALLOCATABLE, POINTER or VALUE attribute." +! +! subroutine copy3_array4(from, to) +! real(c_double), optional, pointer :: from(:), to(:) +! integer :: N +! N = size(from) +! +! !$omp target is_device_ptr(from, to) +! call copy3_array_data_int(c_loc(from), c_loc(to), N) +! !$omp end target +! end subroutine copy3_array4 +end module offloading + + + +module offloading2 + use iso_c_binding + use offloading + implicit none +contains + ! Same as main program but uses dummy *nonoptional* arguments + subroutine use_device_ptr_sub(AA, BB, CC, DD, EE, FF, AptrA, BptrB, N) + real(c_double), pointer :: AA(:), BB(:) + real(c_double), allocatable, target :: CC(:), DD(:) + real(c_double), target :: EE(N), FF(N), dummy(1) + real(c_double), pointer :: AptrA(:), BptrB(:) + intent(inout) :: AA, BB, CC, DD, EE, FF + integer, value :: N + + type(c_ptr) :: tgt_aptr, tgt_bptr, tgt_cptr, tgt_dptr, tgt_eptr, tgt_fptr + + AA = 11.0_c_double + BB = 22.0_c_double + CC = 33.0_c_double + DD = 44.0_c_double + EE = 55.0_c_double + FF = 66.0_c_double + + ! NOTE: OpenMP 5's use_device_addr is (at time of writing) not yet supported + + ! pointer-type array to use_device_ptr + !$omp target data map(to:AA) map(from:BB) use_device_ptr(AA,BB) + call copy3_array_data(c_loc(AA), c_loc(BB), N) + !$omp end target data + + if (any(abs(AA - 11.0_c_double) > 10.0_c_double * epsilon(AA))) call abort() + if (any(abs(3.0_c_double * AA - BB) > 10.0_c_double * epsilon(AA))) call abort() + + ! allocatable array to use_device_ptr + !$omp target data map(to:CC) map(from:DD) use_device_ptr(CC,DD) + call copy3_array_data(c_loc(CC), c_loc(DD), N) + !$omp end target data + + if (any(abs(CC - 33.0_c_double) > 10.0_c_double * epsilon(CC))) call abort() + if (any(abs(3.0_c_double * CC - DD) > 10.0_c_double * epsilon(CC))) call abort() + + ! fixed-size decriptorless array to use_device_ptr + !$omp target data map(to:EE) map(from:FF) use_device_ptr(EE,FF) + call copy3_array_data(c_loc(EE), c_loc(FF), N) + !$omp end target data + + if (any(abs(EE - 55.0_c_double) > 10.0_c_double * epsilon(EE))) call abort() + if (any(abs(3.0_c_double * EE - FF) > 10.0_c_double * epsilon(EE))) call abort() + + + + AA = 111.0_c_double + BB = 222.0_c_double + CC = 333.0_c_double + DD = 444.0_c_double + EE = 555.0_c_double + FF = 666.0_c_double + + ! pointer-type array to use_device_ptr + !$omp target data map(to:AA) map(from:BB) + !$omp target data map(alloc:dummy) use_device_ptr(AA,BB) + tgt_aptr = c_loc(AA) + tgt_bptr = c_loc(BB) + AptrA => AA + BptrB => BB + !$omp end target data + + call copy3_array_data(tgt_aptr, tgt_bptr, N) + !$omp target update from(BB) + if (any(abs(AA - 111.0_c_double) > 10.0_c_double * epsilon(AA))) call abort() + if (any(abs(3.0_c_double * AA - BB) > 10.0_c_double * epsilon(AA))) call abort() + + AA = 1111.0_c_double + !$omp target update to(AA) + call copy3_array_data(tgt_aptr, tgt_bptr, N) + !$omp target update from(BB) + if (any(abs(AA - 1111.0_c_double) > 10.0_c_double * epsilon(AA))) call abort() + if (any(abs(3.0_c_double * AA - BB) > 10.0_c_double * epsilon(AA))) call abort() + + ! AprtA tests + AA = 7.0_c_double + !$omp target update to(AA) + call copy3_array_data(c_loc(AptrA), c_loc(BptrB), N) + !$omp target update from(BB) + if (any(abs(AA - 7.0_c_double) > 10.0_c_double * epsilon(AA))) call abort() + if (any(abs(3.0_c_double * AA - BB) > 10.0_c_double * epsilon(AA))) call abort() + + AA = 77.0_c_double + !$omp target update to(AA) + call copy3_array1(AptrA, BptrB) + !$omp target update from(BB) + if (any(abs(AA - 77.0_c_double) > 10.0_c_double * epsilon(AA))) call abort() + if (any(abs(3.0_c_double * AA - BB) > 10.0_c_double * epsilon(AA))) call abort() + +! AA = 777.0_c_double +! !$omp target update to(AA) +! call copy3_array2(AptrA, BptrB) +! !$omp target update from(BB) +! if (any(abs(AA - 777.0_c_double) > 10.0_c_double * epsilon(AA))) call abort() +! if (any(abs(3.0_c_double * AA - BB) > 10.0_c_double * epsilon(AA))) call abort() + + AA = 7777.0_c_double + !$omp target update to(AA) + call copy3_array3(AptrA, BptrB) + !$omp target update from(BB) + if (any(abs(AA - 7777.0_c_double) > 10.0_c_double * epsilon(AA))) call abort() + if (any(abs(3.0_c_double * AA - BB) > 10.0_c_double * epsilon(AA))) call abort() + +! AA = 77777.0_c_double +! !$omp target update to(AA) +! call copy3_array4(AptrA, BptrB) +! !$omp target update from(BB) + !$omp end target data +! +! if (any(abs(AA - 77777.0_c_double) > 10.0_c_double * epsilon(AA))) call abort() +! if (any(abs(3.0_c_double * AA - BB) > 10.0_c_double * epsilon(AA))) call abort() + + + + ! allocatable array to use_device_ptr + !$omp target data map(to:CC) map(from:DD) + !$omp target data map(alloc:dummy) use_device_ptr(CC,DD) + tgt_cptr = c_loc(CC) + tgt_dptr = c_loc(DD) + !$omp end target data + + call copy3_array_data(tgt_cptr, tgt_dptr, N) + !$omp target update from(DD) + if (any(abs(CC - 333.0_c_double) > 10.0_c_double * epsilon(CC))) call abort() + if (any(abs(3.0_c_double * CC - DD) > 10.0_c_double * epsilon(CC))) call abort() + + CC = 3333.0_c_double + !$omp target update to(CC) + call copy3_array_data(tgt_cptr, tgt_dptr, N) + !$omp target update from(DD) + !$omp end target data + + if (any(abs(CC - 3333.0_c_double) > 10.0_c_double * epsilon(CC))) call abort() + if (any(abs(3.0_c_double * CC - DD) > 10.0_c_double * epsilon(CC))) call abort() + + + + ! fixed-size decriptorless array to use_device_ptr + !$omp target data map(to:EE) map(from:FF) + !$omp target data map(alloc:dummy) use_device_ptr(EE,FF) + tgt_eptr = c_loc(EE) + tgt_fptr = c_loc(FF) + !$omp end target data + + call copy3_array_data(tgt_eptr, tgt_fptr, N) + !$omp target update from(FF) + if (any(abs(EE - 555.0_c_double) > 10.0_c_double * epsilon(EE))) call abort() + if (any(abs(3.0_c_double * EE - FF) > 10.0_c_double * epsilon(EE))) call abort() + + EE = 5555.0_c_double + !$omp target update to(EE) + call copy3_array_data(tgt_eptr, tgt_fptr, N) + !$omp target update from(FF) + !$omp end target data + + if (any(abs(EE - 5555.0_c_double) > 10.0_c_double * epsilon(EE))) call abort() + if (any(abs(3.0_c_double * EE - FF) > 10.0_c_double * epsilon(EE))) call abort() + end subroutine use_device_ptr_sub + + + + ! Same as main program but uses dummy *optional* arguments + subroutine use_device_ptr_sub2(AA, BB, CC, DD, EE, FF, AptrA, BptrB, N) + real(c_double), optional, pointer :: AA(:), BB(:) + real(c_double), optional, allocatable, target :: CC(:), DD(:) + real(c_double), optional, target :: EE(N), FF(N) + real(c_double), pointer :: AptrA(:), BptrB(:) + intent(inout) :: AA, BB, CC, DD, EE, FF + real(c_double), target :: dummy(1) + integer, value :: N + + type(c_ptr) :: tgt_aptr, tgt_bptr, tgt_cptr, tgt_dptr, tgt_eptr, tgt_fptr + + AA = 11.0_c_double + BB = 22.0_c_double + CC = 33.0_c_double + DD = 44.0_c_double + EE = 55.0_c_double + FF = 66.0_c_double + + ! NOTE: OpenMP 5's use_device_addr is (at time of writing) not yet supported + + ! pointer-type array to use_device_ptr + !$omp target data map(to:AA) map(from:BB) use_device_ptr(AA,BB) + call copy3_array_data(c_loc(AA), c_loc(BB), N) + !$omp end target data + + if (any(abs(AA - 11.0_c_double) > 10.0_c_double * epsilon(AA))) call abort() + if (any(abs(3.0_c_double * AA - BB) > 10.0_c_double * epsilon(AA))) call abort() + + ! allocatable array to use_device_ptr + !$omp target data map(to:CC) map(from:DD) use_device_ptr(CC,DD) + call copy3_array_data(c_loc(CC), c_loc(DD), N) + !$omp end target data + + if (any(abs(CC - 33.0_c_double) > 10.0_c_double * epsilon(CC))) call abort() + if (any(abs(3.0_c_double * CC - DD) > 10.0_c_double * epsilon(CC))) call abort() + + ! fixed-size decriptorless array to use_device_ptr + !$omp target data map(to:EE) map(from:FF) use_device_ptr(EE,FF) + call copy3_array_data(c_loc(EE), c_loc(FF), N) + !$omp end target data + + if (any(abs(EE - 55.0_c_double) > 10.0_c_double * epsilon(EE))) call abort() + if (any(abs(3.0_c_double * EE - FF) > 10.0_c_double * epsilon(EE))) call abort() + + + + AA = 111.0_c_double + BB = 222.0_c_double + CC = 333.0_c_double + DD = 444.0_c_double + EE = 555.0_c_double + FF = 666.0_c_double + + ! pointer-type array to use_device_ptr + !$omp target data map(to:AA) map(from:BB) + !$omp target data map(alloc:dummy) use_device_ptr(AA,BB) + tgt_aptr = c_loc(AA) + tgt_bptr = c_loc(BB) + AptrA => AA + BptrB => BB + !$omp end target data + + call copy3_array_data(tgt_aptr, tgt_bptr, N) + !$omp target update from(BB) + if (any(abs(AA - 111.0_c_double) > 10.0_c_double * epsilon(AA))) call abort() + if (any(abs(3.0_c_double * AA - BB) > 10.0_c_double * epsilon(AA))) call abort() + + AA = 1111.0_c_double + !$omp target update to(AA) + call copy3_array_data(tgt_aptr, tgt_bptr, N) + !$omp target update from(BB) + if (any(abs(AA - 1111.0_c_double) > 10.0_c_double * epsilon(AA))) call abort() + if (any(abs(3.0_c_double * AA - BB) > 10.0_c_double * epsilon(AA))) call abort() + + ! AprtA tests + AA = 7.0_c_double + !$omp target update to(AA) + call copy3_array_data(c_loc(AptrA), c_loc(BptrB), N) + !$omp target update from(BB) + if (any(abs(AA - 7.0_c_double) > 10.0_c_double * epsilon(AA))) call abort() + if (any(abs(3.0_c_double * AA - BB) > 10.0_c_double * epsilon(AA))) call abort() + + AA = 77.0_c_double + !$omp target update to(AA) + call copy3_array1(AptrA, BptrB) + !$omp target update from(BB) + if (any(abs(AA - 77.0_c_double) > 10.0_c_double * epsilon(AA))) call abort() + if (any(abs(3.0_c_double * AA - BB) > 10.0_c_double * epsilon(AA))) call abort() + +! AA = 777.0_c_double +! !$omp target update to(AA) +! call copy3_array2(AptrA, BptrB) +! !$omp target update from(BB) +! if (any(abs(AA - 777.0_c_double) > 10.0_c_double * epsilon(AA))) call abort() +! if (any(abs(3.0_c_double * AA - BB) > 10.0_c_double * epsilon(AA))) call abort() + + AA = 7777.0_c_double + !$omp target update to(AA) + call copy3_array3(AptrA, BptrB) + !$omp target update from(BB) + if (any(abs(AA - 7777.0_c_double) > 10.0_c_double * epsilon(AA))) call abort() + if (any(abs(3.0_c_double * AA - BB) > 10.0_c_double * epsilon(AA))) call abort() + +! AA = 77777.0_c_double +! !$omp target update to(AA) +! call copy3_array4(AptrA, BptrB) +! !$omp target update from(BB) + !$omp end target data +! +! if (any(abs(AA - 77777.0_c_double) > 10.0_c_double * epsilon(AA))) call abort() +! if (any(abs(3.0_c_double * AA - BB) > 10.0_c_double * epsilon(AA))) call abort() + + + + ! allocatable array to use_device_ptr + !$omp target data map(to:CC) map(from:DD) + !$omp target data map(alloc:dummy) use_device_ptr(CC,DD) + tgt_cptr = c_loc(CC) + tgt_dptr = c_loc(DD) + !$omp end target data + + call copy3_array_data(tgt_cptr, tgt_dptr, N) + !$omp target update from(DD) + if (any(abs(CC - 333.0_c_double) > 10.0_c_double * epsilon(CC))) call abort() + if (any(abs(3.0_c_double * CC - DD) > 10.0_c_double * epsilon(CC))) call abort() + + CC = 3333.0_c_double + !$omp target update to(CC) + call copy3_array_data(tgt_cptr, tgt_dptr, N) + !$omp target update from(DD) + !$omp end target data + + if (any(abs(CC - 3333.0_c_double) > 10.0_c_double * epsilon(CC))) call abort() + if (any(abs(3.0_c_double * CC - DD) > 10.0_c_double * epsilon(CC))) call abort() + + + + ! fixed-size decriptorless array to use_device_ptr + !$omp target data map(to:EE) map(from:FF) + !$omp target data map(alloc:dummy) use_device_ptr(EE,FF) + tgt_eptr = c_loc(EE) + tgt_fptr = c_loc(FF) + !$omp end target data + + call copy3_array_data(tgt_eptr, tgt_fptr, N) + !$omp target update from(FF) + if (any(abs(EE - 555.0_c_double) > 10.0_c_double * epsilon(EE))) call abort() + if (any(abs(3.0_c_double * EE - FF) > 10.0_c_double * epsilon(EE))) call abort() + + EE = 5555.0_c_double + !$omp target update to(EE) + call copy3_array_data(tgt_eptr, tgt_fptr, N) + !$omp end target data + + if (any(abs(EE - 5555.0_c_double) > 10.0_c_double * epsilon(EE))) call abort() + if (any(abs(3.0_c_double * EE - FF) > 10.0_c_double * epsilon(EE))) call abort() + end subroutine use_device_ptr_sub2 +end module offloading2 + + + +program omp_device_ptr + use iso_c_binding + use offloading + use offloading2 + implicit none + + integer, parameter :: N = 1000 + real(c_double), pointer :: AA(:), BB(:), arg_AA(:), arg_BB(:), arg2_AA(:), arg2_BB(:) + real(c_double), allocatable, target :: CC(:), DD(:), arg_CC(:), arg_DD(:), arg2_CC(:), arg2_DD(:) + real(c_double), target :: EE(N), FF(N), dummy(1), arg_EE(N), arg_FF(N), arg2_EE(N), arg2_FF(N) + + real(c_double), pointer :: AptrA(:), BptrB(:) + type(c_ptr) :: tgt_aptr, tgt_bptr, tgt_cptr, tgt_dptr, tgt_eptr, tgt_fptr + + allocate(AA(N), BB(N), CC(N), DD(N)) + + AA = 11.0_c_double + BB = 22.0_c_double + CC = 33.0_c_double + DD = 44.0_c_double + EE = 55.0_c_double + FF = 66.0_c_double + + ! NOTE: OpenMP 5's use_device_addr is (at time of writing) not yet supported + + ! pointer-type array to use_device_ptr + !$omp target data map(to:AA) map(from:BB) use_device_ptr(AA,BB) + call copy3_array_data(c_loc(AA), c_loc(BB), N) + !$omp end target data + + if (any(abs(AA - 11.0_c_double) > 10.0_c_double * epsilon(AA))) call abort() + if (any(abs(3.0_c_double * AA - BB) > 10.0_c_double * epsilon(AA))) call abort() + + ! allocatable array to use_device_ptr + !$omp target data map(to:CC) map(from:DD) use_device_ptr(CC,DD) + call copy3_array_data(c_loc(CC), c_loc(DD), N) + !$omp end target data + + if (any(abs(CC - 33.0_c_double) > 10.0_c_double * epsilon(CC))) call abort() + if (any(abs(3.0_c_double * CC - DD) > 10.0_c_double * epsilon(CC))) call abort() + + ! fixed-size decriptorless array to use_device_ptr + !$omp target data map(to:EE) map(from:FF) use_device_ptr(EE,FF) + call copy3_array_data(c_loc(EE), c_loc(FF), N) + !$omp end target data + + if (any(abs(EE - 55.0_c_double) > 10.0_c_double * epsilon(EE))) call abort() + if (any(abs(3.0_c_double * EE - FF) > 10.0_c_double * epsilon(EE))) call abort() + + + + AA = 111.0_c_double + BB = 222.0_c_double + CC = 333.0_c_double + DD = 444.0_c_double + EE = 555.0_c_double + FF = 666.0_c_double + + ! pointer-type array to use_device_ptr + !$omp target data map(to:AA) map(from:BB) + !$omp target data map(alloc:dummy) use_device_ptr(AA,BB) + tgt_aptr = c_loc(AA) + tgt_bptr = c_loc(BB) + AptrA => AA + BptrB => BB + !$omp end target data + + call copy3_array_data(tgt_aptr, tgt_bptr, N) + !$omp target update from(BB) + if (any(abs(AA - 111.0_c_double) > 10.0_c_double * epsilon(AA))) call abort() + if (any(abs(3.0_c_double * AA - BB) > 10.0_c_double * epsilon(AA))) call abort() + + AA = 1111.0_c_double + !$omp target update to(AA) + call copy3_array_data(tgt_aptr, tgt_bptr, N) + !$omp target update from(BB) + if (any(abs(AA - 1111.0_c_double) > 10.0_c_double * epsilon(AA))) call abort() + if (any(abs(3.0_c_double * AA - BB) > 10.0_c_double * epsilon(AA))) call abort() + + ! AprtA tests + AA = 7.0_c_double + !$omp target update to(AA) + call copy3_array_data(c_loc(AptrA), c_loc(BptrB), N) + !$omp target update from(BB) + if (any(abs(AA - 7.0_c_double) > 10.0_c_double * epsilon(AA))) call abort() + if (any(abs(3.0_c_double * AA - BB) > 10.0_c_double * epsilon(AA))) call abort() + + AA = 77.0_c_double + !$omp target update to(AA) + call copy3_array1(AptrA, BptrB) + !$omp target update from(BB) + if (any(abs(AA - 77.0_c_double) > 10.0_c_double * epsilon(AA))) call abort() + if (any(abs(3.0_c_double * AA - BB) > 10.0_c_double * epsilon(AA))) call abort() + +! AA = 777.0_c_double +! !$omp target update to(AA) +! call copy3_array2(AptrA, BptrB) +! !$omp target update from(BB) +! if (any(abs(AA - 777.0_c_double) > 10.0_c_double * epsilon(AA))) call abort() +! if (any(abs(3.0_c_double * AA - BB) > 10.0_c_double * epsilon(AA))) call abort() + + AA = 7777.0_c_double + !$omp target update to(AA) + call copy3_array3(AptrA, BptrB) + !$omp target update from(BB) + if (any(abs(AA - 7777.0_c_double) > 10.0_c_double * epsilon(AA))) call abort() + if (any(abs(3.0_c_double * AA - BB) > 10.0_c_double * epsilon(AA))) call abort() + +! AA = 77777.0_c_double +! !$omp target update to(AA) +! call copy3_array4(AptrA, BptrB) +! !$omp target update from(BB) + !$omp end target data +! +! if (any(abs(AA - 77777.0_c_double) > 10.0_c_double * epsilon(AA))) call abort() +! if (any(abs(3.0_c_double * AA - BB) > 10.0_c_double * epsilon(AA))) call abort() + + + + ! allocatable array to use_device_ptr + !$omp target data map(to:CC) map(from:DD) + !$omp target data map(alloc:dummy) use_device_ptr(CC,DD) + tgt_cptr = c_loc(CC) + tgt_dptr = c_loc(DD) + !$omp end target data + + call copy3_array_data(tgt_cptr, tgt_dptr, N) + !$omp target update from(DD) + if (any(abs(CC - 333.0_c_double) > 10.0_c_double * epsilon(CC))) call abort() + if (any(abs(3.0_c_double * CC - DD) > 10.0_c_double * epsilon(CC))) call abort() + + CC = 3333.0_c_double + !$omp target update to(CC) + call copy3_array_data(tgt_cptr, tgt_dptr, N) + !$omp target update from(DD) + !$omp end target data + + if (any(abs(CC - 3333.0_c_double) > 10.0_c_double * epsilon(CC))) call abort() + if (any(abs(3.0_c_double * CC - DD) > 10.0_c_double * epsilon(CC))) call abort() + + + + ! fixed-size decriptorless array to use_device_ptr + !$omp target data map(to:EE) map(from:FF) + !$omp target data map(alloc:dummy) use_device_ptr(EE,FF) + tgt_eptr = c_loc(EE) + tgt_fptr = c_loc(FF) + !$omp end target data + + call copy3_array_data(tgt_eptr, tgt_fptr, N) + !$omp target update from(FF) + if (any(abs(EE - 555.0_c_double) > 10.0_c_double * epsilon(EE))) call abort() + if (any(abs(3.0_c_double * EE - FF) > 10.0_c_double * epsilon(EE))) call abort() + + EE = 5555.0_c_double + !$omp target update to(EE) + call copy3_array_data(tgt_eptr, tgt_fptr, N) + !$omp target update from(FF) + !$omp end target data + + if (any(abs(EE - 5555.0_c_double) > 10.0_c_double * epsilon(EE))) call abort() + if (any(abs(3.0_c_double * EE - FF) > 10.0_c_double * epsilon(EE))) call abort() + + + + deallocate(AA, BB) ! Free pointers only + + AptrA => null() + BptrB => null() + allocate(arg_AA(N), arg_BB(N), arg_CC(N), arg_DD(N)) + call use_device_ptr_sub(arg_AA, arg_BB, arg_CC, arg_DD, arg_EE, arg_FF, AptrA, BptrB, N) + deallocate(arg_AA, arg_BB) + + AptrA => null() + BptrB => null() + allocate(arg2_AA(N), arg2_BB(N), arg2_CC(N), arg2_DD(N)) + call use_device_ptr_sub(arg2_AA, arg2_BB, arg2_CC, arg2_DD, arg2_EE, arg2_FF, AptrA, BptrB, N) + deallocate(arg2_AA, arg2_BB) +end program omp_device_ptr