Message ID | 7a5f39e8-a33b-048a-f9c1-1355b941771e@codesourcery.com |
---|---|
State | New |
Headers | show |
Series | [Fortran/OpenMP] Don't create "alloc:" for 'target exit data' | expand |
On 10/18/19 11:27 AM, Tobias Burnus wrote: > Currently, one has for > !$omp target exit data map(delete:x) > in the original dump: > #pragma omp target exit data map(delete:*x) map(alloc:x [pointer > assign, bias: 0]) > > The "alloc:" not only does not make sense but also gives run-time > messages like: > libgomp: GOMP_target_enter_exit_data unhandled kind 0x04 > > [Depending on the data type, in gfc_trans_omp_clauses's OMP_LIST_MAP, > add map clauses of type GOMP_MAP_POINTER and/or GOMP_MAP_TO_PSET.] > > That's for release:/delete:. However, for 'target exit data' > (GOMP_target_enter_exit_data) the same issue occurs for > "from:"/"always, from:". But "from:" implies "alloc:". – While > "alloc:" does not make sense for "target exit data" or "update", for > "target" or "target data" it surely matters. Hence, I only exclude > "from:" for exit data and update. > > See attached patch. I have additionally Fortran-fied > libgomp.c/target-20.c to have at least one 'enter/exit target data' > test case for Fortran. > > Build + regtested on x86_64-gnu-linux w/o offloading. And I have > tested the new test case with nvptx. > > Tobias >
On Fri, Oct 18, 2019 at 11:27:39AM +0200, Tobias Burnus wrote: > Currently, one has for > !$omp target exit data map(delete:x) > in the original dump: > #pragma omp target exit data map(delete:*x) map(alloc:x [pointer assign, > bias: 0]) > > The "alloc:" not only does not make sense but also gives run-time messages > like: > libgomp: GOMP_target_enter_exit_data unhandled kind 0x04 > > [Depending on the data type, in gfc_trans_omp_clauses's OMP_LIST_MAP, add > map clauses of type GOMP_MAP_POINTER and/or GOMP_MAP_TO_PSET.] > > That's for release:/delete:. However, for 'target exit data' > (GOMP_target_enter_exit_data) the same issue occurs for "from:"/"always, > from:". But "from:" implies "alloc:". – While "alloc:" does not make sense > for "target exit data" or "update", for "target" or "target data" it surely > matters. Hence, I only exclude "from:" for exit data and update. > > See attached patch. I have additionally Fortran-fied libgomp.c/target-20.c > to have at least one 'enter/exit target data' test case for Fortran. > > Build + regtested on x86_64-gnu-linux w/o offloading. And I have tested the > new test case with nvptx. I believe it is easier to handle it at the same spot as we do it e.g. for C/C++ pointer attachments (where we create the same clauses regardless of the exact construct and then drop them later), in particular in gimplify_scan_omp_clauses. There we have: case OMP_TARGET: break; case OACC_DATA: if (TREE_CODE (TREE_TYPE (decl)) != ARRAY_TYPE) break; /* FALLTHRU */ case OMP_TARGET_DATA: case OMP_TARGET_ENTER_DATA: case OMP_TARGET_EXIT_DATA: case OACC_ENTER_DATA: case OACC_EXIT_DATA: case OACC_HOST_DATA: if (OMP_CLAUSE_MAP_KIND (c) == GOMP_MAP_FIRSTPRIVATE_POINTER || (OMP_CLAUSE_MAP_KIND (c) == GOMP_MAP_FIRSTPRIVATE_REFERENCE)) /* For target {,enter ,exit }data only the array slice is mapped, but not the pointer to it. */ remove = true; break; So, I think best would be to add if (code == OMP_TARGET_EXIT_DATA && OMP_CLAUSE_MAP_KIND (c) == GOMP_MAP_WHATEVER_IS_NOT_VALID_FOR_EXIT_DATA) remove = true; with a comment explaining that. The testcase LGTM. Jakub
On 10/30/19 11:12 AM, Jakub Jelinek wrote: > I believe it is easier to handle it at the same spot as we do it e.g. > for C/C++ pointer attachments (where we create the same clauses > regardless of the exact construct and then drop them later), in > particular in gimplify_scan_omp_clauses. […] I concur. Semantically, it is not identical – but I think still okay. For 'omp exit data', 'to:'/'alloc:' mapping does not make sense and it not handled in libgomp's gomp_exit_data. Hence, I exclude GOMP_MAP_POINTER (dump: 'alloc:') and GOMP_MAP_TO_PSET (dump: 'to:'). – Those are only internally used, hence, user-specified 'alloc:' will get diagnosed. ['delete:'/'release:' in other directives than 'exit data' doesn't make much sense. Other directives accept it but their libgomp function silently ignore it.] 'omp update': The gomp_update function only handles GOMP_MAP_COPY_TO_P and GOMP_MAP_COPY_FROM_P (and silently ignores others). Both macros have !((X) & GOMP_MAP_FLAG_SPECIAL). Hence, we can save a few bytes and avoid calling 'omp update' with GOMP_MAP_POINTER and GOMP_MAP_TO_PSET. [TO_PSET only appears in gfc_trans_omp_clauses (once); POINTER appears there and in gfc_omp_finish_clause and in c/c-typeck.c's handle_omp_array_sections but only if "(ort != C_ORT_OMP && ort != C_ORT_ACC)".] I moved trans-openmp.c change to gimplify.c and left the test case unchanged. Then, I bootstrapped on a non-offloading system and regtested it also with a nvptx system. Tobias
On Wed, Oct 30, 2019 at 04:48:43PM +0100, Tobias Burnus wrote: > On 10/30/19 11:12 AM, Jakub Jelinek wrote: > > I believe it is easier to handle it at the same spot as we do it e.g. > > for C/C++ pointer attachments (where we create the same clauses > > regardless of the exact construct and then drop them later), in > > particular in gimplify_scan_omp_clauses. […] > > I concur. Semantically, it is not identical – but I think still okay. > > For 'omp exit data', 'to:'/'alloc:' mapping does not make sense and it not > handled in libgomp's gomp_exit_data. Hence, I exclude GOMP_MAP_POINTER > (dump: 'alloc:') and GOMP_MAP_TO_PSET (dump: 'to:'). – Those are only > internally used, hence, user-specified 'alloc:' will get diagnosed. > > ['delete:'/'release:' in other directives than 'exit data' doesn't make much > sense. Other directives accept it but their libgomp function silently ignore > it.] Do they? At least the C/C++ FEs should complain/remove before it makes its way into the middle-end. E.g. c_parser_omp_target_enter_data has: switch (OMP_CLAUSE_MAP_KIND (*pc)) { case GOMP_MAP_TO: case GOMP_MAP_ALWAYS_TO: case GOMP_MAP_ALLOC: map_seen = 3; break; case GOMP_MAP_FIRSTPRIVATE_POINTER: case GOMP_MAP_ALWAYS_POINTER: break; default: map_seen |= 1; error_at (OMP_CLAUSE_LOCATION (*pc), "%<#pragma omp target enter data%> with map-type other " "than %<to%> or %<alloc%> on %<map%> clause"); *pc = OMP_CLAUSE_CHAIN (*pc); continue; } Haven't checked the Fortran FE. > gcc/ > * gimplify.c (gimplify_scan_omp_clauses): Remove FE-generated > GOMP_MAP_TO_PSET and GOMP_MAP_POINTER mapping for 'target update' > and 'target exit data'. > > libgomp/ > * testsuite/libgomp.fortran/target9.f90: New. Ok. Jakub
On 10/30/19 4:55 PM, Jakub Jelinek wrote: > Do they? At least the C/C++ FEs should complain/remove before it makes > its way into the middle-end. […] > Haven't checked the Fortran FE. The Fortran FE lacks many checks the C/C++ FE has – but, admittedly, it *does* have this check. (Which obviously does not apply to FE generated code.) > Ok. Thanks for the quick review. (Committed as Rev. 277631.) Tobias
Hi! On 2019-10-30T16:48:43+0100, Tobias Burnus <tobias@codesourcery.com> wrote: > --- /dev/null > +++ b/libgomp/testsuite/libgomp.fortran/target9.f90 As obvious; see attached, committed "Torture testing: 'libgomp.fortran/target9.f90'" to trunk in r278045. Grüße Thomas
gcc/fortran/ * trans-openmp.c (gfc_trans_omp_clauses): Do not create map(alloc:) for map(delete:/release:) and for (from:/always,from:) only if new arg require_from_alloc is true, which is the default. (gfc_trans_omp_target_exit_data, gfc_trans_omp_target_update): Call it with require_from_alloc = false. libgomp/ * testsuite/libgomp.fortran/target9.f90: New. diff --git a/gcc/fortran/trans-openmp.c b/gcc/fortran/trans-openmp.c index dad11a24430..f890629c73d 100644 --- a/gcc/fortran/trans-openmp.c +++ b/gcc/fortran/trans-openmp.c @@ -1852,7 +1852,8 @@ static vec<tree, va_heap, vl_embed> *doacross_steps; static tree gfc_trans_omp_clauses (stmtblock_t *block, gfc_omp_clauses *clauses, - locus where, bool declare_simd = false) + locus where, bool declare_simd = false, + bool require_from_alloc = true) { tree omp_clauses = NULL_TREE, chunk_size, c; int list, ifc; @@ -2163,6 +2164,16 @@ gfc_trans_omp_clauses (stmtblock_t *block, gfc_omp_clauses *clauses, if (!n->sym->attr.referenced) continue; + /* map(alloc:) etc. is not needed for delete/release + For 'from:', it is needed when setting up the environment + but not for updating or copying out of the data. */ + bool no_extra_pointer = n->u.map_op == OMP_MAP_DELETE + || n->u.map_op == OMP_MAP_RELEASE + || (!require_from_alloc + && (n->u.map_op == OMP_MAP_FROM + || n->u.map_op + == OMP_MAP_ALWAYS_FROM)); + tree node = build_omp_clause (input_location, OMP_CLAUSE_MAP); tree node2 = NULL_TREE; tree node3 = NULL_TREE; @@ -2172,7 +2183,8 @@ gfc_trans_omp_clauses (stmtblock_t *block, gfc_omp_clauses *clauses, TREE_ADDRESSABLE (decl) = 1; if (n->expr == NULL || n->expr->ref->u.ar.type == AR_FULL) { - if (POINTER_TYPE_P (TREE_TYPE (decl)) + if (!no_extra_pointer + && POINTER_TYPE_P (TREE_TYPE (decl)) && (gfc_omp_privatize_by_reference (decl) || GFC_DECL_GET_SCALAR_POINTER (decl) || GFC_DECL_GET_SCALAR_ALLOCATABLE (decl) @@ -2208,17 +2220,20 @@ gfc_trans_omp_clauses (stmtblock_t *block, gfc_omp_clauses *clauses, ptr); ptr = build_fold_indirect_ref (ptr); OMP_CLAUSE_DECL (node) = ptr; - node2 = build_omp_clause (input_location, - OMP_CLAUSE_MAP); - OMP_CLAUSE_SET_MAP_KIND (node2, GOMP_MAP_TO_PSET); - OMP_CLAUSE_DECL (node2) = decl; - OMP_CLAUSE_SIZE (node2) = TYPE_SIZE_UNIT (type); - node3 = build_omp_clause (input_location, - OMP_CLAUSE_MAP); - OMP_CLAUSE_SET_MAP_KIND (node3, GOMP_MAP_POINTER); - OMP_CLAUSE_DECL (node3) - = gfc_conv_descriptor_data_get (decl); - OMP_CLAUSE_SIZE (node3) = size_int (0); + if (!no_extra_pointer) + { + node2 = build_omp_clause (input_location, + OMP_CLAUSE_MAP); + OMP_CLAUSE_SET_MAP_KIND (node2, GOMP_MAP_TO_PSET); + OMP_CLAUSE_DECL (node2) = decl; + OMP_CLAUSE_SIZE (node2) = TYPE_SIZE_UNIT (type); + node3 = build_omp_clause (input_location, + OMP_CLAUSE_MAP); + OMP_CLAUSE_SET_MAP_KIND (node3, GOMP_MAP_POINTER); + OMP_CLAUSE_DECL (node3) + = gfc_conv_descriptor_data_get (decl); + OMP_CLAUSE_SIZE (node3) = size_int (0); + } /* We have to check for n->sym->attr.dimension because of scalar coarrays. */ @@ -2302,6 +2317,9 @@ gfc_trans_omp_clauses (stmtblock_t *block, gfc_omp_clauses *clauses, ptr); OMP_CLAUSE_DECL (node) = build_fold_indirect_ref (ptr); + if (no_extra_pointer) + goto skip_extra_map_pointer; + if (POINTER_TYPE_P (TREE_TYPE (decl)) && GFC_DESCRIPTOR_TYPE_P (TREE_TYPE (TREE_TYPE (decl)))) { @@ -2346,6 +2364,9 @@ gfc_trans_omp_clauses (stmtblock_t *block, gfc_omp_clauses *clauses, OMP_CLAUSE_SIZE (node3) = fold_build2 (MINUS_EXPR, sizetype, ptr, ptr2); } + + skip_extra_map_pointer: + switch (n->u.map_op) { case OMP_MAP_ALLOC: @@ -4979,7 +5000,7 @@ gfc_trans_omp_target_exit_data (gfc_code *code) gfc_start_block (&block); omp_clauses = gfc_trans_omp_clauses (&block, code->ext.omp_clauses, - code->loc); + code->loc, false, false); stmt = build1_loc (input_location, OMP_TARGET_EXIT_DATA, void_type_node, omp_clauses); gfc_add_expr_to_block (&block, stmt); @@ -4994,7 +5015,7 @@ gfc_trans_omp_target_update (gfc_code *code) gfc_start_block (&block); omp_clauses = gfc_trans_omp_clauses (&block, code->ext.omp_clauses, - code->loc); + code->loc, false, false); stmt = build1_loc (input_location, OMP_TARGET_UPDATE, void_type_node, omp_clauses); gfc_add_expr_to_block (&block, stmt); diff --git a/libgomp/testsuite/libgomp.fortran/target9.f90 b/libgomp/testsuite/libgomp.fortran/target9.f90 new file mode 100644 index 00000000000..91d60a33307 --- /dev/null +++ b/libgomp/testsuite/libgomp.fortran/target9.f90 @@ -0,0 +1,123 @@ +! { dg-require-effective-target offload_device_nonshared_as } */ + +module target_test + implicit none (type, external) + integer, parameter :: N = 40 + integer :: sum + integer :: var1 = 1 + integer :: var2 = 2 + + !$omp declare target to(D) + integer :: D(N) = 0 +contains + subroutine enter_data (X) + integer :: X(:) + !$omp target enter data map(to: var1, var2, X) map(alloc: sum) + end subroutine enter_data + + subroutine exit_data_0 (D) + integer :: D(N) + !$omp target exit data map(delete: D) + end subroutine exit_data_0 + + subroutine exit_data_1 () + !$omp target exit data map(from: var1) + end subroutine exit_data_1 + + subroutine exit_data_2 (X) + integer :: X(N) + !$omp target exit data map(from: var2) map(release: X, sum) + end subroutine exit_data_2 + + subroutine exit_data_3 (p, idx) + integer :: p(:) + integer, value :: idx + !$omp target exit data map(from: p(idx)) + end subroutine exit_data_3 + + subroutine test_nested () + integer :: X, Y, Z + X = 0 + Y = 0 + Z = 0 + + !$omp target data map(from: X, Y, Z) + !$omp target data map(from: X, Y, Z) + !$omp target map(from: X, Y, Z) + X = 1337 + Y = 1337 + Z = 1337 + !$omp end target + if (X /= 0) stop 11 + if (Y /= 0) stop 12 + if (Z /= 0) stop 13 + + !$omp target exit data map(from: X) map(release: Y) + if (X /= 0) stop 14 + if (Y /= 0) stop 15 + + !$omp target exit data map(release: Y) map(delete: Z) + if (Y /= 0) stop 16 + if (Z /= 0) stop 17 + !$omp end target data + if (X /= 1337) stop 18 + if (Y /= 0) stop 19 + if (Z /= 0) stop 20 + + !$omp target map(from: X) + X = 2448 + !$omp end target + if (X /= 2448) stop 21 + if (Y /= 0) stop 22 + if (Z /= 0) stop 23 + + X = 4896 + !$omp end target data + if (X /= 4896) stop 24 + if (Y /= 0) stop 25 + if (Z /= 0) stop 26 + end subroutine test_nested +end module target_test + +program main + use target_test + implicit none (type, external) + + integer, allocatable :: X(:) + integer, pointer, contiguous :: Y(:) + + + allocate(X(N), Y(N)) + X(10) = 10 + Y(20) = 20 + call enter_data (X) + + call exit_data_0 (D) ! This should have no effect on D. + + !$omp target map(alloc: var1, var2, X) map(to: Y) map(always, from: sum) + var1 = var1 + X(10) + var2 = var2 + Y(20) + sum = var1 + var2 + D(sum) = D(sum) + 1 + !$omp end target + + if (var1 /= 1) stop 1 + if (var2 /= 2) stop 2 + if (sum /= 33) stop 3 + + call exit_data_1 () + if (var1 /= 11) stop 4 + if (var2 /= 2) stop 5 + + ! Increase refcount of already mapped X(1:N). + !$omp target enter data map(alloc: X(16:17)) + + call exit_data_2 (X) + if (var2 /= 22) stop 6 + + call exit_data_3 (X, 5) ! Unmap X(1:N). + + deallocate (X, Y) + + call test_nested () +end program main