From patchwork Fri Nov 30 11:41:09 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Julian Brown X-Patchwork-Id: 1005905 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (mailfrom) smtp.mailfrom=gcc.gnu.org (client-ip=209.132.180.131; helo=sourceware.org; envelope-from=gcc-patches-return-491354-incoming=patchwork.ozlabs.org@gcc.gnu.org; receiver=) Authentication-Results: ozlabs.org; dmarc=none (p=none dis=none) header.from=codesourcery.com Authentication-Results: ozlabs.org; dkim=pass (1024-bit key; unprotected) header.d=gcc.gnu.org header.i=@gcc.gnu.org header.b="a7IVaDgx"; dkim-atps=neutral Received: from sourceware.org (server1.sourceware.org [209.132.180.131]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 435std3mdfz9sBQ for ; Fri, 30 Nov 2018 22:41:59 +1100 (AEDT) DomainKey-Signature: a=rsa-sha1; c=nofws; d=gcc.gnu.org; h=list-id :list-unsubscribe:list-archive:list-post:list-help:sender:from :to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-type; q=dns; s=default; b=Z/cKpmPKw9GK6K1D G/EejlQ/g7WeRrHGoSklgnk6jXoxj8CJZmUVkN/eqAfECQ0OS8WgXm8LJryQoi2G ti4EnOSidMbxlQuLlrX7cK4z+aMhxVudJhSXmYJLW34bDgdtRuUIaD9DB+WrVGfI 5xb2wxRR3PLlivTNlPS06HAf0y0= DKIM-Signature: v=1; a=rsa-sha1; c=relaxed; d=gcc.gnu.org; h=list-id :list-unsubscribe:list-archive:list-post:list-help:sender:from :to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-type; s=default; bh=lxpFRgm0PgYeqFIu4Uqj4w Z1btk=; b=a7IVaDgx4zl1kjtNlXjuCR8fzoYBU6tpCvCZu1iQlUdPkV5KNYRvWL RIAPBjkCj/mcjeAX4O/CSywJw8jBY8f8rdgNPYw/B5VYnk40kyARcQdV/KxTeeTm dHLnS9lr8tQBRs7VNLsE1pM/ngxGqKGmcUboDjtyk409acLSd8dB0= Received: (qmail 45901 invoked by alias); 30 Nov 2018 11:41:43 -0000 Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm Precedence: bulk List-Id: List-Unsubscribe: List-Archive: List-Post: List-Help: Sender: gcc-patches-owner@gcc.gnu.org Delivered-To: mailing list gcc-patches@gcc.gnu.org Received: (qmail 45855 invoked by uid 89); 30 Nov 2018 11:41:42 -0000 Authentication-Results: sourceware.org; auth=none X-Spam-SWARE-Status: No, score=-1.9 required=5.0 tests=BAYES_00, RCVD_IN_DNSWL_NONE, SPF_PASS, TIME_LIMIT_EXCEEDED autolearn=unavailable version=3.3.2 spammy=va, collapse, Turning, allocatable X-HELO: relay1.mentorg.com Received: from relay1.mentorg.com (HELO relay1.mentorg.com) (192.94.38.131) by sourceware.org (qpsmtpd/0.93/v0.84-503-g423c35a) with ESMTP; Fri, 30 Nov 2018 11:41:31 +0000 Received: from nat-ies.mentorg.com ([192.94.31.2] helo=SVR-IES-MBX-04.mgc.mentorg.com) by relay1.mentorg.com with esmtps (TLSv1.2:ECDHE-RSA-AES256-SHA384:256) id 1gShAl-0005cA-Du from Julian_Brown@mentor.com ; Fri, 30 Nov 2018 03:41:28 -0800 Received: from localhost.localdomain (147.34.91.1) by SVR-IES-MBX-04.mgc.mentorg.com (139.181.222.4) with Microsoft SMTP Server (TLS) id 15.0.1320.4; Fri, 30 Nov 2018 11:41:22 +0000 From: Julian Brown To: CC: Chung-Lin Tang , Thomas Schwinge , Jakub Jelinek , Subject: [PATCH] OpenACC 2.6 manual deep copy support (attach/detach) Date: Fri, 30 Nov 2018 03:41:09 -0800 Message-ID: <1543578069-386-1-git-send-email-julian@codesourcery.com> In-Reply-To: References: MIME-Version: 1.0 X-IsSubscribed: yes This is a new version of the patch incorporating several improvements/bugfixes made on the og8 branch: https://gcc.gnu.org/ml/gcc-patches/2018-11/msg01773.html https://gcc.gnu.org/ml/gcc-patches/2018-11/msg02366.html The "dynamic"/multidimensional arrays parts (mostly initialisations of new struct fields in the appropriate places) are missing, because they haven't landed upstream yet. Non-runtime parts of the patch are the same as before. There are a few new tests. The patch also supersedes this one: https://gcc.gnu.org/ml/gcc-patches/2018-09/msg01175.html and this one: https://gcc.gnu.org/ml/gcc-patches/2018-11/msg02543.html Tested with offloading to nvptx and bootstrapped, on top of Chung-Lin's async patches, previously-posted patches in the series and with my patch to consistency-check refcounts (to be posted). OK? Thanks, Julian ChangeLog gcc/c-family/ * c-pragma.h (pragma_omp_clause): Add PRAGMA_OACC_CLAUSE_ATTACH, PRAGMA_OACC_CLAUSE_DETACH. gcc/c/ * c-parser.c (c_parser_omp_clause_name): Add parsing of attach and detach clauses. (c_parser_omp_variable_list): Allow deref (->) in variable lists. (c_parser_oacc_data_clause): Support attach and detach clauses. (c_parser_oacc_all_clauses): Likewise. (OACC_DATA_CLAUSE_MASK, OACC_ENTER_DATA_CLAUSE_MASK) (OACC_KERNELS_CLAUSE_MASK, OACC_PARALLEL_CLAUSE_MASK): Add PRAGMA_OACC_CLAUSE_ATTACH. (OACC_EXIT_DATA_CLAUSE_MASK): Add PRAGMA_OACC_CLAUSE_DETACH. * c-typeck.c (handle_omp_array_sections_1): Reject subarrays for attach and detach. Support deref. (c_oacc_check_attachments): New function. (c_finish_omp_clauses): Check attach/detach arguments for being pointers using above. Support deref. gcc/cp/ * parser.c (cp_parser_omp_clause_name): Support attach and detach clauses. (cp_parser_omp_var_list_no_open): Support deref. (cp_parser_oacc_data_clause): Support attach and detach clauses. (cp_parser_oacc_all_clauses): Likewise. (OACC_DATA_CLAUSE_MASK, OACC_ENTER_DATA_CLAUSE_MASK) (OACC_KERNELS_CLAUSE_MASK, OACC_PARALLEL_CLAUSE_MASK): Add PRAGMA_OACC_CLAUSE_ATTACH. (OACC_EXIT_DATA_CLAUSE_MASK): Add PRAGMA_OACC_CLAUSE_DETACH. * semantics.c (handle_omp_array_sections_1): Reject subarrays for attach and detach. (cp_oacc_check_attachments): New function. (finish_omp_clauses): Use above function. Allow structure fields and class members to appear in OpenACC data clauses. Support deref. gcc/fortran/ * gfortran.h (gfc_omp_map_op): Add OMP_MAP_ATTACH, OMP_MAP_DETACH. * openmp.c (gfc_match_omp_variable_list): Add allow_derived parameter. Parse derived-type member accesses if true. (omp_mask2): Add OMP_CLAUSE_ATTACH, OMP_CLAUSE_DETACH. (gfc_match_omp_map_clause): Add allow_derived parameter. Pass to gfc_match_omp_variable_list. (gfc_match_omp_clauses): Support attach and detach. Support derived types for appropriate OpenACC directives. (OACC_PARALLEL_CLAUSES, OACC_KERNELS_CLAUSES, OACC_DATA_CLAUSES) (OACC_ENTER_DATA_CLAUSES): Add OMP_CLAUSE_ATTACH. (OACC_EXIT_DATA_CLAUSES): Add OMP_CLAUSE_DETACH. (check_symbol_not_pointer): Don't disallow pointer objects of derived type. (resolve_oacc_data_clauses): Don't disallow allocatable derived types. (resolve_omp_clauses): Perform duplicate checking only for non-derived type component accesses (plain variables and arrays or array sections). Support component refs. * trans-openmp.c (gfc_omp_privatize_by_reference): Support component refs. (gfc_trans_omp_clauses): Support component refs, attach and detach clauses. gcc/ * gimplify.c (gimplify_omp_var_data): Add GOVD_MAP_HAS_ATTACHMENTS. (insert_struct_component_mapping): Support derived-type member mappings for arrays with descriptors which use GOMP_MAP_TO_PSET. (gimplify_scan_omp_clauses): Rewrite GOMP_MAP_ALWAYS_POINTER to GOMP_MAP_ATTACH for OpenACC struct/derived-type component pointers. Handle pointer mappings that use GOMP_MAP_TO_PSET. Handle attach/detach clauses. (gimplify_adjust_omp_clauses_1): Skip adjustments for explicit attach/detach clauses. (gimplify_omp_target_update): Handle finalize for detach. * omp-low.c (lower_omp_target): Support GOMP_MAP_ATTACH, GOMP_MAP_DETACH, GOMP_MAP_FORCE_DETACH. * tree-pretty-print.c (dump_omp_clause): Likewise. gcc/include/ * gomp-constants.h (GOMP_MAP_DEEP_COPY): Define. (gomp_map_kind): Add GOMP_MAP_ATTACH, GOMP_MAP_DETACH, GOMP_MAP_FORCE_DETACH. gcc/testsuite/ * c-c++-common/goacc/mdc-1.c: New test. * c-c++-common/goacc/mdc-2.c: New test. * gcc.dg/goacc/mdc.C: New test. * gfortran.dg/goacc/data-clauses.f95: New test. * gfortran.dg/goacc/derived-types.f90: New test. * gfortran.dg/goacc/enter-exit-data.f95: New test. libgomp/ * libgomp.h (struct target_var_desc): Add do_detach flag. (struct splay_tree_key_s): Add attach_count field. Substitute dynamic_refcount field for virtual_refcount. (struct acc_dispatch_t): Remove data_environ field. (enum gomp_map_vars_kind): Add GOMP_MAP_VARS_OPENACC_ENTER_DATA. (gomp_acc_insert_pointer): Remove prototype. (gomp_acc_remove_pointer): Update prototype. (struct gomp_coalesce_buf): Add forward declaration. (gomp_map_val, gomp_attach_pointer, gomp_detach_pointer): Add prototypes. * libgomp.map (OACC_2.6): New section. Add acc_attach, acc_attach_async, acc_detach, acc_detach_async, acc_detach_finalize, acc_detach_finalize_async. * oacc-async.c (goacc_remove_var_async): New function. * oacc-host.c (host_dispatch): Don't initialise removed data_environ field. * oacc-init.c (acc_shutdown_1): Use gomp_remove_var instead of gomp_unmap_vars to remove mappings by splay tree key instead of target memory descriptor. * oacc-int.h (splay_tree_key_s): Add forward declaration. (goacc_remove_car_async): Add prototype. * oacc-mem.c (lookup_dev_1): New function. (lookup_dev): Reimplement using above. (acc_free, acc_hostptr): Update calls to lookup_dev. (acc_map_data): Likewise. Don't add to data_environ list. (acc_unmap_data): Update call to gomp_unmap_vars. Fix semantics to remove mapping, but not mapped data. (present_create_copy): Use virtual_refcount instead of dynamic_refcount. Don't manipulate data_environ. Fix target pointer return value. (delete_copyout): Update for virtual_refcount semantics. Use goacc_remove_var_async for asynchronous delete/copyouts. (gomp_acc_insert_pointer): Remove function. (gomp_acc_remove_pointer): Reimplement. (acc_attach_async, acc_attach, goacc_detach_internal, acc_detach) (acc_detach_async, acc_detach_finalize, acc_detach_finalize_async): New functions. * oacc-parallel.c (find_pointer): Support attach/detach. Make a little more strict. (GOACC_parallel_keyed): Use gomp_map_val to calculate device addresses. (GOACC_enter_exit_data): Support attach/detach and GOMP_MAP_STRUCT. Don't call gomp_acc_insert_pointer. * openacc.h (acc_attach, acc_attach_async, acc_detach) (acc_detach_async, acc_detach_finalize, acc_detach_finalize_async): Add prototypes. * target.c (limits.h): Include. (gomp_map_vars_existing): Initialise do_detach field of tgt_var_desc. (gomp_attach_pointer, gomp_detach_pointer): New functions. (gomp_map_val): Make global. (gomp_map_vars_async): Handle GOMP_MAP_VARS_OPENACC_ENTER_DATA. Update for virtual_refcount semantics. Support attach and detach. (gomp_remove_var): Free attach count array if present. (gomp_unmap_vars_async): Support detach and update for virtual_refcount semantics. (gomp_load_image_to_device): Zero-initialise virtual_refcount fields. (gomp_free_memmap): Remove function. (omp_target_associate_ptr): Zero-initialise virtual_refcount, attach_count and link_key splay tree key fields. (gomp_target_init): Don't initialise removed data_environ field. * testsuite/libgomp.oacc-c-c++-common/deep-copy-1.c: New test. * testsuite/libgomp.oacc-c-c++-common/deep-copy-2.c: New test. * testsuite/libgomp.oacc-c-c++-common/deep-copy-3.c: New test. * testsuite/libgomp.oacc-c-c++-common/deep-copy-4.c: New test. * testsuite/libgomp.oacc-c-c++-common/deep-copy-5.c: New test. * testsuite/libgomp.oacc-c-c++-common/deep-copy-6.c: New test. * testsuite/libgomp.oacc-c-c++-common/deep-copy-7.c: New test. * testsuite/libgomp.oacc-c-c++-common/deep-copy-8.c: New test. * testsuite/libgomp.oacc-fortran/deep-copy-1.c: New test. * testsuite/libgomp.oacc-fortran/deep-copy-2.c: New test. * testsuite/libgomp.oacc-fortran/deep-copy-3.c: New test. * testsuite/libgomp.oacc-fortran/deep-copy-4.c: New test. * testsuite/libgomp.oacc-fortran/deep-copy-5.c: New test. * testsuite/libgomp.oacc-fortran/deep-copy-6.c: New test. * testsuite/libgomp.oacc-fortran/deep-copy-7.c: New test. * testsuite/libgomp.oacc-fortran/deep-copy-8.c: New test. * testsuite/libgomp.oacc-fortran/data-2.f90: Update test. * testsuite/libgomp.oacc-fortran/derived-type-1.f90: New test. * testsuite/libgomp.oacc-fortran/update-2.f90: New test. --- gcc/c-family/c-pragma.h | 2 + gcc/c/c-parser.c | 34 ++- gcc/c/c-typeck.c | 59 +++- gcc/cp/parser.c | 38 ++- gcc/cp/semantics.c | 75 ++++- gcc/fortran/gfortran.h | 2 + gcc/fortran/openmp.c | 145 +++++--- gcc/fortran/trans-openmp.c | 78 +++- gcc/gimplify.c | 85 ++++- gcc/omp-low.c | 3 + gcc/testsuite/c-c++-common/goacc/mdc-1.c | 54 +++ gcc/testsuite/c-c++-common/goacc/mdc-2.c | 62 +++ gcc/testsuite/g++.dg/goacc/mdc.C | 68 ++++ gcc/testsuite/gfortran.dg/goacc/data-clauses.f95 | 38 +- gcc/testsuite/gfortran.dg/goacc/derived-types.f90 | 77 ++++ .../gfortran.dg/goacc/enter-exit-data.f95 | 24 +- gcc/tree-pretty-print.c | 9 + include/gomp-constants.h | 8 + libgomp/libgomp.h | 34 ++- libgomp/libgomp.map | 10 + libgomp/oacc-async.c | 18 + libgomp/oacc-host.c | 2 - libgomp/oacc-init.c | 9 +- libgomp/oacc-int.h | 5 + libgomp/oacc-mem.c | 402 +++++++++----------- libgomp/oacc-parallel.c | 240 +++++++++--- libgomp/openacc.h | 6 + libgomp/target.c | 238 ++++++++++-- .../libgomp.oacc-c-c++-common/context-2.c | 6 +- .../libgomp.oacc-c-c++-common/context-4.c | 6 +- .../libgomp.oacc-c-c++-common/deep-copy-1.c | 24 ++ .../libgomp.oacc-c-c++-common/deep-copy-2.c | 29 ++ .../libgomp.oacc-c-c++-common/deep-copy-3.c | 34 ++ .../libgomp.oacc-c-c++-common/deep-copy-4.c | 87 +++++ .../libgomp.oacc-c-c++-common/deep-copy-5.c | 81 ++++ .../libgomp.oacc-c-c++-common/deep-copy-6.c | 59 +++ .../libgomp.oacc-c-c++-common/deep-copy-7.c | 45 +++ .../libgomp.oacc-c-c++-common/deep-copy-8.c | 54 +++ libgomp/testsuite/libgomp.oacc-fortran/data-2.f90 | 7 +- .../testsuite/libgomp.oacc-fortran/deep-copy-1.f90 | 35 ++ .../testsuite/libgomp.oacc-fortran/deep-copy-2.f90 | 33 ++ .../testsuite/libgomp.oacc-fortran/deep-copy-3.f90 | 34 ++ .../testsuite/libgomp.oacc-fortran/deep-copy-4.f90 | 49 +++ .../testsuite/libgomp.oacc-fortran/deep-copy-5.f90 | 57 +++ .../testsuite/libgomp.oacc-fortran/deep-copy-6.f90 | 61 +++ .../testsuite/libgomp.oacc-fortran/deep-copy-7.f90 | 89 +++++ .../testsuite/libgomp.oacc-fortran/deep-copy-8.f90 | 41 ++ .../libgomp.oacc-fortran/derived-type-1.f90 | 28 ++ .../testsuite/libgomp.oacc-fortran/update-2.f90 | 284 ++++++++++++++ 49 files changed, 2526 insertions(+), 442 deletions(-) create mode 100644 gcc/testsuite/c-c++-common/goacc/mdc-1.c create mode 100644 gcc/testsuite/c-c++-common/goacc/mdc-2.c create mode 100644 gcc/testsuite/g++.dg/goacc/mdc.C create mode 100644 gcc/testsuite/gfortran.dg/goacc/derived-types.f90 create mode 100644 libgomp/testsuite/libgomp.oacc-c-c++-common/deep-copy-1.c create mode 100644 libgomp/testsuite/libgomp.oacc-c-c++-common/deep-copy-2.c create mode 100644 libgomp/testsuite/libgomp.oacc-c-c++-common/deep-copy-3.c create mode 100644 libgomp/testsuite/libgomp.oacc-c-c++-common/deep-copy-4.c create mode 100644 libgomp/testsuite/libgomp.oacc-c-c++-common/deep-copy-5.c create mode 100644 libgomp/testsuite/libgomp.oacc-c-c++-common/deep-copy-6.c create mode 100644 libgomp/testsuite/libgomp.oacc-c-c++-common/deep-copy-7.c create mode 100644 libgomp/testsuite/libgomp.oacc-c-c++-common/deep-copy-8.c create mode 100644 libgomp/testsuite/libgomp.oacc-fortran/deep-copy-1.f90 create mode 100644 libgomp/testsuite/libgomp.oacc-fortran/deep-copy-2.f90 create mode 100644 libgomp/testsuite/libgomp.oacc-fortran/deep-copy-3.f90 create mode 100644 libgomp/testsuite/libgomp.oacc-fortran/deep-copy-4.f90 create mode 100644 libgomp/testsuite/libgomp.oacc-fortran/deep-copy-5.f90 create mode 100644 libgomp/testsuite/libgomp.oacc-fortran/deep-copy-6.f90 create mode 100644 libgomp/testsuite/libgomp.oacc-fortran/deep-copy-7.f90 create mode 100644 libgomp/testsuite/libgomp.oacc-fortran/deep-copy-8.f90 create mode 100644 libgomp/testsuite/libgomp.oacc-fortran/derived-type-1.f90 create mode 100644 libgomp/testsuite/libgomp.oacc-fortran/update-2.f90 diff --git a/gcc/c-family/c-pragma.h b/gcc/c-family/c-pragma.h index b781f73..dd8d807 100644 --- a/gcc/c-family/c-pragma.h +++ b/gcc/c-family/c-pragma.h @@ -136,11 +136,13 @@ enum pragma_omp_clause { /* Clauses for OpenACC. */ PRAGMA_OACC_CLAUSE_ASYNC, + PRAGMA_OACC_CLAUSE_ATTACH, PRAGMA_OACC_CLAUSE_AUTO, PRAGMA_OACC_CLAUSE_COPY, PRAGMA_OACC_CLAUSE_COPYOUT, PRAGMA_OACC_CLAUSE_CREATE, PRAGMA_OACC_CLAUSE_DELETE, + PRAGMA_OACC_CLAUSE_DETACH, PRAGMA_OACC_CLAUSE_DEVICEPTR, PRAGMA_OACC_CLAUSE_DEVICE_RESIDENT, PRAGMA_OACC_CLAUSE_FINALIZE, diff --git a/gcc/c/c-parser.c b/gcc/c/c-parser.c index 9589502..e12a7aa 100644 --- a/gcc/c/c-parser.c +++ b/gcc/c/c-parser.c @@ -11412,6 +11412,8 @@ c_parser_omp_clause_name (c_parser *parser) result = PRAGMA_OMP_CLAUSE_ALIGNED; else if (!strcmp ("async", p)) result = PRAGMA_OACC_CLAUSE_ASYNC; + else if (!strcmp ("attach", p)) + result = PRAGMA_OACC_CLAUSE_ATTACH; break; case 'c': if (!strcmp ("collapse", p)) @@ -11434,6 +11436,8 @@ c_parser_omp_clause_name (c_parser *parser) result = PRAGMA_OACC_CLAUSE_DELETE; else if (!strcmp ("depend", p)) result = PRAGMA_OMP_CLAUSE_DEPEND; + else if (!strcmp ("detach", p)) + result = PRAGMA_OACC_CLAUSE_DETACH; else if (!strcmp ("device", p)) result = PRAGMA_OMP_CLAUSE_DEVICE; else if (!strcmp ("deviceptr", p)) @@ -11804,9 +11808,12 @@ c_parser_omp_variable_list (c_parser *parser, case OMP_CLAUSE_MAP: case OMP_CLAUSE_FROM: case OMP_CLAUSE_TO: - while (c_parser_next_token_is (parser, CPP_DOT)) + while (c_parser_next_token_is (parser, CPP_DOT) + || c_parser_next_token_is (parser, CPP_DEREF)) { location_t op_loc = c_parser_peek_token (parser)->location; + if (c_parser_next_token_is (parser, CPP_DEREF)) + t = build_simple_mem_ref (t); c_parser_consume_token (parser); if (!c_parser_next_token_is (parser, CPP_NAME)) { @@ -11945,12 +11952,14 @@ c_parser_omp_var_list_parens (c_parser *parser, enum omp_clause_code kind, return list; } -/* OpenACC 2.0: +/* OpenACC 2.5: + attach (variable-list ) copy ( variable-list ) copyin ( variable-list ) copyout ( variable-list ) create ( variable-list ) delete ( variable-list ) + detach ( variable-list ) present ( variable-list ) */ static tree @@ -11960,6 +11969,9 @@ c_parser_oacc_data_clause (c_parser *parser, pragma_omp_clause c_kind, enum gomp_map_kind kind; switch (c_kind) { + case PRAGMA_OACC_CLAUSE_ATTACH: + kind = GOMP_MAP_ATTACH; + break; case PRAGMA_OACC_CLAUSE_COPY: kind = GOMP_MAP_TOFROM; break; @@ -11975,6 +11987,9 @@ c_parser_oacc_data_clause (c_parser *parser, pragma_omp_clause c_kind, case PRAGMA_OACC_CLAUSE_DELETE: kind = GOMP_MAP_RELEASE; break; + case PRAGMA_OACC_CLAUSE_DETACH: + kind = GOMP_MAP_DETACH; + break; case PRAGMA_OACC_CLAUSE_DEVICE: kind = GOMP_MAP_FORCE_TO; break; @@ -14559,6 +14574,10 @@ c_parser_oacc_all_clauses (c_parser *parser, omp_clause_mask mask, clauses); c_name = "auto"; break; + case PRAGMA_OACC_CLAUSE_ATTACH: + clauses = c_parser_oacc_data_clause (parser, c_kind, clauses); + c_name = "attach"; + break; case PRAGMA_OACC_CLAUSE_COLLAPSE: clauses = c_parser_omp_clause_collapse (parser, clauses); c_name = "collapse"; @@ -14587,6 +14606,10 @@ c_parser_oacc_all_clauses (c_parser *parser, omp_clause_mask mask, clauses = c_parser_omp_clause_default (parser, clauses, true); c_name = "default"; break; + case PRAGMA_OACC_CLAUSE_DETACH: + clauses = c_parser_oacc_data_clause (parser, c_kind, clauses); + c_name = "detach"; + break; case PRAGMA_OACC_CLAUSE_DEVICE: clauses = c_parser_oacc_data_clause (parser, c_kind, clauses); c_name = "device"; @@ -15065,7 +15088,8 @@ c_parser_oacc_cache (location_t loc, c_parser *parser) */ #define OACC_DATA_CLAUSE_MASK \ - ( (OMP_CLAUSE_MASK_1 << PRAGMA_OACC_CLAUSE_COPY) \ + ( (OMP_CLAUSE_MASK_1 << PRAGMA_OACC_CLAUSE_ATTACH) \ + | (OMP_CLAUSE_MASK_1 << PRAGMA_OACC_CLAUSE_COPY) \ | (OMP_CLAUSE_MASK_1 << PRAGMA_OACC_CLAUSE_COPYIN) \ | (OMP_CLAUSE_MASK_1 << PRAGMA_OACC_CLAUSE_COPYOUT) \ | (OMP_CLAUSE_MASK_1 << PRAGMA_OACC_CLAUSE_CREATE) \ @@ -15248,6 +15272,7 @@ c_parser_oacc_declare (c_parser *parser) #define OACC_ENTER_DATA_CLAUSE_MASK \ ( (OMP_CLAUSE_MASK_1 << PRAGMA_OACC_CLAUSE_IF) \ | (OMP_CLAUSE_MASK_1 << PRAGMA_OACC_CLAUSE_ASYNC) \ + | (OMP_CLAUSE_MASK_1 << PRAGMA_OACC_CLAUSE_ATTACH) \ | (OMP_CLAUSE_MASK_1 << PRAGMA_OACC_CLAUSE_COPYIN) \ | (OMP_CLAUSE_MASK_1 << PRAGMA_OACC_CLAUSE_CREATE) \ | (OMP_CLAUSE_MASK_1 << PRAGMA_OACC_CLAUSE_WAIT) ) @@ -15257,6 +15282,7 @@ c_parser_oacc_declare (c_parser *parser) | (OMP_CLAUSE_MASK_1 << PRAGMA_OACC_CLAUSE_ASYNC) \ | (OMP_CLAUSE_MASK_1 << PRAGMA_OACC_CLAUSE_COPYOUT) \ | (OMP_CLAUSE_MASK_1 << PRAGMA_OACC_CLAUSE_DELETE) \ + | (OMP_CLAUSE_MASK_1 << PRAGMA_OACC_CLAUSE_DETACH) \ | (OMP_CLAUSE_MASK_1 << PRAGMA_OACC_CLAUSE_FINALIZE) \ | (OMP_CLAUSE_MASK_1 << PRAGMA_OACC_CLAUSE_WAIT) ) @@ -15391,6 +15417,7 @@ c_parser_oacc_loop (location_t loc, c_parser *parser, char *p_name, #define OACC_KERNELS_CLAUSE_MASK \ ( (OMP_CLAUSE_MASK_1 << PRAGMA_OACC_CLAUSE_ASYNC) \ + | (OMP_CLAUSE_MASK_1 << PRAGMA_OACC_CLAUSE_ATTACH) \ | (OMP_CLAUSE_MASK_1 << PRAGMA_OACC_CLAUSE_COPY) \ | (OMP_CLAUSE_MASK_1 << PRAGMA_OACC_CLAUSE_COPYIN) \ | (OMP_CLAUSE_MASK_1 << PRAGMA_OACC_CLAUSE_COPYOUT) \ @@ -15406,6 +15433,7 @@ c_parser_oacc_loop (location_t loc, c_parser *parser, char *p_name, #define OACC_PARALLEL_CLAUSE_MASK \ ( (OMP_CLAUSE_MASK_1 << PRAGMA_OACC_CLAUSE_ASYNC) \ + | (OMP_CLAUSE_MASK_1 << PRAGMA_OACC_CLAUSE_ATTACH) \ | (OMP_CLAUSE_MASK_1 << PRAGMA_OACC_CLAUSE_COPY) \ | (OMP_CLAUSE_MASK_1 << PRAGMA_OACC_CLAUSE_COPYIN) \ | (OMP_CLAUSE_MASK_1 << PRAGMA_OACC_CLAUSE_COPYOUT) \ diff --git a/gcc/c/c-typeck.c b/gcc/c/c-typeck.c index 144977e..b2de3b4 100644 --- a/gcc/c/c-typeck.c +++ b/gcc/c/c-typeck.c @@ -12610,7 +12610,6 @@ handle_omp_array_sections_1 (tree c, tree t, vec &types, return error_mark_node; } if (TREE_CODE (t) == COMPONENT_REF - && ort == C_ORT_OMP && (OMP_CLAUSE_CODE (c) == OMP_CLAUSE_MAP || OMP_CLAUSE_CODE (c) == OMP_CLAUSE_TO || OMP_CLAUSE_CODE (c) == OMP_CLAUSE_FROM)) @@ -12632,6 +12631,8 @@ handle_omp_array_sections_1 (tree c, tree t, vec &types, } t = TREE_OPERAND (t, 0); } + if (TREE_CODE (t) == MEM_REF) + t = TREE_OPERAND (t, 0); } if (!VAR_P (t) && TREE_CODE (t) != PARM_DECL) { @@ -12716,7 +12717,19 @@ handle_omp_array_sections_1 (tree c, tree t, vec &types, length = fold_convert (sizetype, length); if (low_bound == NULL_TREE) low_bound = integer_zero_node; - + if (OMP_CLAUSE_CODE (c) == OMP_CLAUSE_MAP + && (OMP_CLAUSE_MAP_KIND (c) == GOMP_MAP_ATTACH + || OMP_CLAUSE_MAP_KIND (c) == GOMP_MAP_DETACH)) + { + if (length != integer_one_node) + { + error_at (OMP_CLAUSE_LOCATION (c), + OMP_CLAUSE_MAP_KIND (c) == GOMP_MAP_ATTACH + ? "array section in % clause" + : "array section in % clause"); + return error_mark_node; + } + } if (length != NULL_TREE) { if (!integer_nonzerop (length)) @@ -13393,6 +13406,37 @@ c_omp_finish_iterators (tree iter) return ret; } +/* Ensure that pointers are used in OpenACC attach and detach clauses. + Return true if an error has been detected. */ + +static bool +c_oacc_check_attachments (tree c) +{ + if (OMP_CLAUSE_CODE (c) != OMP_CLAUSE_MAP) + return false; + + /* OpenACC attach / detach clauses must be pointers. */ + if (OMP_CLAUSE_MAP_KIND (c) == GOMP_MAP_ATTACH + || OMP_CLAUSE_MAP_KIND (c) == GOMP_MAP_DETACH) + { + tree t = OMP_CLAUSE_DECL (c); + + while (TREE_CODE (t) == TREE_LIST) + t = TREE_CHAIN (t); + + if (TREE_CODE (TREE_TYPE (t)) != POINTER_TYPE) + { + error_at (OMP_CLAUSE_LOCATION (c), + OMP_CLAUSE_MAP_KIND (c) == GOMP_MAP_ATTACH + ? "expected pointer in % clause" + : "expected pointer in % clause"); + return true; + } + } + + return false; +} + /* For all elements of CLAUSES, validate them against their constraints. Remove any elements from the list that are invalid. */ @@ -14117,6 +14161,8 @@ c_finish_omp_clauses (tree clauses, enum c_omp_region_type ort) } } } + if (c_oacc_check_attachments (c)) + remove = true; break; } if (t == error_mark_node) @@ -14124,8 +14170,13 @@ c_finish_omp_clauses (tree clauses, enum c_omp_region_type ort) remove = true; break; } + /* OpenACC attach / detach clauses must be pointers. */ + if (c_oacc_check_attachments (c)) + { + remove = true; + break; + } if (TREE_CODE (t) == COMPONENT_REF - && (ort & C_ORT_OMP) && OMP_CLAUSE_CODE (c) != OMP_CLAUSE__CACHE_) { if (DECL_BIT_FIELD (TREE_OPERAND (t, 1))) @@ -14163,6 +14214,8 @@ c_finish_omp_clauses (tree clauses, enum c_omp_region_type ort) } if (remove) break; + if (TREE_CODE (t) == MEM_REF) + t = TREE_OPERAND (t, 0); if (VAR_P (t) || TREE_CODE (t) == PARM_DECL) { if (bitmap_bit_p (&map_field_head, DECL_UID (t))) diff --git a/gcc/cp/parser.c b/gcc/cp/parser.c index 006d1ed..3508f09 100644 --- a/gcc/cp/parser.c +++ b/gcc/cp/parser.c @@ -31533,6 +31533,8 @@ cp_parser_omp_clause_name (cp_parser *parser) result = PRAGMA_OMP_CLAUSE_ALIGNED; else if (!strcmp ("async", p)) result = PRAGMA_OACC_CLAUSE_ASYNC; + else if (!strcmp ("attach", p)) + result = PRAGMA_OACC_CLAUSE_ATTACH; break; case 'c': if (!strcmp ("collapse", p)) @@ -31553,6 +31555,8 @@ cp_parser_omp_clause_name (cp_parser *parser) result = PRAGMA_OMP_CLAUSE_DEFAULTMAP; else if (!strcmp ("depend", p)) result = PRAGMA_OMP_CLAUSE_DEPEND; + else if (!strcmp ("detach", p)) + result = PRAGMA_OACC_CLAUSE_DETACH; else if (!strcmp ("device", p)) result = PRAGMA_OMP_CLAUSE_DEVICE; else if (!strcmp ("deviceptr", p)) @@ -31832,15 +31836,19 @@ cp_parser_omp_var_list_no_open (cp_parser *parser, enum omp_clause_code kind, case OMP_CLAUSE_MAP: case OMP_CLAUSE_FROM: case OMP_CLAUSE_TO: - while (cp_lexer_next_token_is (parser->lexer, CPP_DOT)) + while (cp_lexer_next_token_is (parser->lexer, CPP_DOT) + || cp_lexer_next_token_is (parser->lexer, CPP_DEREF)) { + cpp_ttype ttype + = cp_lexer_next_token_is (parser->lexer, CPP_DOT) + ? CPP_DOT : CPP_DEREF; location_t loc = cp_lexer_peek_token (parser->lexer)->location; cp_id_kind idk = CP_ID_KIND_NONE; cp_lexer_consume_token (parser->lexer); decl = convert_from_reference (decl); decl - = cp_parser_postfix_dot_deref_expression (parser, CPP_DOT, + = cp_parser_postfix_dot_deref_expression (parser, ttype, decl, false, &idk, loc); } @@ -31965,12 +31973,14 @@ cp_parser_omp_var_list (cp_parser *parser, enum omp_clause_code kind, tree list) return list; } -/* OpenACC 2.0: +/* OpenACC 2.5: + attach ( variable-list ) copy ( variable-list ) copyin ( variable-list ) copyout ( variable-list ) create ( variable-list ) delete ( variable-list ) + detach ( variable-list ) present ( variable-list ) */ static tree @@ -31980,6 +31990,9 @@ cp_parser_oacc_data_clause (cp_parser *parser, pragma_omp_clause c_kind, enum gomp_map_kind kind; switch (c_kind) { + case PRAGMA_OACC_CLAUSE_ATTACH: + kind = GOMP_MAP_ATTACH; + break; case PRAGMA_OACC_CLAUSE_COPY: kind = GOMP_MAP_TOFROM; break; @@ -31995,6 +32008,9 @@ cp_parser_oacc_data_clause (cp_parser *parser, pragma_omp_clause c_kind, case PRAGMA_OACC_CLAUSE_DELETE: kind = GOMP_MAP_RELEASE; break; + case PRAGMA_OACC_CLAUSE_DETACH: + kind = GOMP_MAP_DETACH; + break; case PRAGMA_OACC_CLAUSE_DEVICE: kind = GOMP_MAP_FORCE_TO; break; @@ -34344,6 +34360,10 @@ cp_parser_oacc_all_clauses (cp_parser *parser, omp_clause_mask mask, clauses, here); c_name = "auto"; break; + case PRAGMA_OACC_CLAUSE_ATTACH: + clauses = cp_parser_oacc_data_clause (parser, c_kind, clauses); + c_name = "attach"; + break; case PRAGMA_OACC_CLAUSE_COLLAPSE: clauses = cp_parser_omp_clause_collapse (parser, clauses, here); c_name = "collapse"; @@ -34372,6 +34392,10 @@ cp_parser_oacc_all_clauses (cp_parser *parser, omp_clause_mask mask, clauses = cp_parser_omp_clause_default (parser, clauses, here, true); c_name = "default"; break; + case PRAGMA_OACC_CLAUSE_DETACH: + clauses = cp_parser_oacc_data_clause (parser, c_kind, clauses); + c_name = "detach"; + break; case PRAGMA_OACC_CLAUSE_DEVICE: clauses = cp_parser_oacc_data_clause (parser, c_kind, clauses); c_name = "device"; @@ -38011,10 +38035,12 @@ cp_parser_oacc_cache (cp_parser *parser, cp_token *pragma_tok) structured-block */ #define OACC_DATA_CLAUSE_MASK \ - ( (OMP_CLAUSE_MASK_1 << PRAGMA_OACC_CLAUSE_COPY) \ + ( (OMP_CLAUSE_MASK_1 << PRAGMA_OACC_CLAUSE_ATTACH) \ + | (OMP_CLAUSE_MASK_1 << PRAGMA_OACC_CLAUSE_COPY) \ | (OMP_CLAUSE_MASK_1 << PRAGMA_OACC_CLAUSE_COPYIN) \ | (OMP_CLAUSE_MASK_1 << PRAGMA_OACC_CLAUSE_COPYOUT) \ | (OMP_CLAUSE_MASK_1 << PRAGMA_OACC_CLAUSE_CREATE) \ + | (OMP_CLAUSE_MASK_1 << PRAGMA_OACC_CLAUSE_DETACH) \ | (OMP_CLAUSE_MASK_1 << PRAGMA_OACC_CLAUSE_DEVICEPTR) \ | (OMP_CLAUSE_MASK_1 << PRAGMA_OACC_CLAUSE_IF) \ | (OMP_CLAUSE_MASK_1 << PRAGMA_OACC_CLAUSE_PRESENT) ) @@ -38214,6 +38240,7 @@ cp_parser_oacc_declare (cp_parser *parser, cp_token *pragma_tok) #define OACC_ENTER_DATA_CLAUSE_MASK \ ( (OMP_CLAUSE_MASK_1 << PRAGMA_OACC_CLAUSE_IF) \ + | (OMP_CLAUSE_MASK_1 << PRAGMA_OACC_CLAUSE_ATTACH) \ | (OMP_CLAUSE_MASK_1 << PRAGMA_OACC_CLAUSE_ASYNC) \ | (OMP_CLAUSE_MASK_1 << PRAGMA_OACC_CLAUSE_COPYIN) \ | (OMP_CLAUSE_MASK_1 << PRAGMA_OACC_CLAUSE_CREATE) \ @@ -38224,6 +38251,7 @@ cp_parser_oacc_declare (cp_parser *parser, cp_token *pragma_tok) | (OMP_CLAUSE_MASK_1 << PRAGMA_OACC_CLAUSE_ASYNC) \ | (OMP_CLAUSE_MASK_1 << PRAGMA_OACC_CLAUSE_COPYOUT) \ | (OMP_CLAUSE_MASK_1 << PRAGMA_OACC_CLAUSE_DELETE) \ + | (OMP_CLAUSE_MASK_1 << PRAGMA_OACC_CLAUSE_DETACH) \ | (OMP_CLAUSE_MASK_1 << PRAGMA_OACC_CLAUSE_FINALIZE) \ | (OMP_CLAUSE_MASK_1 << PRAGMA_OACC_CLAUSE_WAIT) ) @@ -38327,6 +38355,7 @@ cp_parser_oacc_loop (cp_parser *parser, cp_token *pragma_tok, char *p_name, #define OACC_KERNELS_CLAUSE_MASK \ ( (OMP_CLAUSE_MASK_1 << PRAGMA_OACC_CLAUSE_ASYNC) \ + | (OMP_CLAUSE_MASK_1 << PRAGMA_OACC_CLAUSE_ATTACH) \ | (OMP_CLAUSE_MASK_1 << PRAGMA_OACC_CLAUSE_COPY) \ | (OMP_CLAUSE_MASK_1 << PRAGMA_OACC_CLAUSE_COPYIN) \ | (OMP_CLAUSE_MASK_1 << PRAGMA_OACC_CLAUSE_COPYOUT) \ @@ -38342,6 +38371,7 @@ cp_parser_oacc_loop (cp_parser *parser, cp_token *pragma_tok, char *p_name, #define OACC_PARALLEL_CLAUSE_MASK \ ( (OMP_CLAUSE_MASK_1 << PRAGMA_OACC_CLAUSE_ASYNC) \ + | (OMP_CLAUSE_MASK_1 << PRAGMA_OACC_CLAUSE_ATTACH) \ | (OMP_CLAUSE_MASK_1 << PRAGMA_OACC_CLAUSE_COPY) \ | (OMP_CLAUSE_MASK_1 << PRAGMA_OACC_CLAUSE_COPYIN) \ | (OMP_CLAUSE_MASK_1 << PRAGMA_OACC_CLAUSE_COPYOUT) \ diff --git a/gcc/cp/semantics.c b/gcc/cp/semantics.c index 182d360..303bcc4 100644 --- a/gcc/cp/semantics.c +++ b/gcc/cp/semantics.c @@ -4568,7 +4568,6 @@ handle_omp_array_sections_1 (tree c, tree t, vec &types, t = TREE_OPERAND (t, 0); ret = t; if (TREE_CODE (t) == COMPONENT_REF - && ort == C_ORT_OMP && (OMP_CLAUSE_CODE (c) == OMP_CLAUSE_MAP || OMP_CLAUSE_CODE (c) == OMP_CLAUSE_TO || OMP_CLAUSE_CODE (c) == OMP_CLAUSE_FROM) @@ -4691,6 +4690,19 @@ handle_omp_array_sections_1 (tree c, tree t, vec &types, if (low_bound == NULL_TREE) low_bound = integer_zero_node; + if (OMP_CLAUSE_CODE (c) == OMP_CLAUSE_MAP + && (OMP_CLAUSE_MAP_KIND (c) == GOMP_MAP_ATTACH + || OMP_CLAUSE_MAP_KIND (c) == GOMP_MAP_DETACH)) + { + if (length != integer_one_node) + { + error_at (OMP_CLAUSE_LOCATION (c), + OMP_CLAUSE_MAP_KIND (c) == GOMP_MAP_ATTACH + ? "array section in % clause" + : "array section in % clause"); + return error_mark_node; + } + } if (length != NULL_TREE) { if (!integer_nonzerop (length)) @@ -6048,6 +6060,43 @@ cp_omp_finish_iterators (tree iter) return ret; } +/* Ensure that pointers are used in OpenACC attach and detach clauses. + Return true if an error has been detected. */ + +static bool +cp_oacc_check_attachments (tree c) +{ + if (OMP_CLAUSE_CODE (c) != OMP_CLAUSE_MAP) + return false; + + /* OpenACC attach / detach clauses must be pointers. */ + if (OMP_CLAUSE_MAP_KIND (c) == GOMP_MAP_ATTACH + || OMP_CLAUSE_MAP_KIND (c) == GOMP_MAP_DETACH) + { + tree t = OMP_CLAUSE_DECL (c); + tree type; + + while (TREE_CODE (t) == TREE_LIST) + t = TREE_CHAIN (t); + + type = TREE_TYPE (t); + + if (TREE_CODE (type) == REFERENCE_TYPE) + type = TREE_TYPE (type); + + if (TREE_CODE (type) != POINTER_TYPE) + { + error_at (OMP_CLAUSE_LOCATION (c), + OMP_CLAUSE_MAP_KIND (c) == GOMP_MAP_ATTACH + ? "expected pointer in % clause" + : "expected pointer in % clause"); + return true; + } + } + + return false; +} + /* For all elements of CLAUSES, validate them vs OpenMP constraints. Remove any elements from the list that are invalid. */ @@ -6288,7 +6337,7 @@ finish_omp_clauses (tree clauses, enum c_omp_region_type ort) t = OMP_CLAUSE_DECL (c); check_dup_generic_t: if (t == current_class_ptr - && (ort != C_ORT_OMP_DECLARE_SIMD + && ((ort != C_ORT_OMP_DECLARE_SIMD && ort != C_ORT_ACC) || (OMP_CLAUSE_CODE (c) != OMP_CLAUSE_LINEAR && OMP_CLAUSE_CODE (c) != OMP_CLAUSE_UNIFORM))) { @@ -6352,8 +6401,7 @@ finish_omp_clauses (tree clauses, enum c_omp_region_type ort) handle_field_decl: if (!remove && TREE_CODE (t) == FIELD_DECL - && t == OMP_CLAUSE_DECL (c) - && ort != C_ORT_ACC) + && t == OMP_CLAUSE_DECL (c)) { OMP_CLAUSE_DECL (c) = omp_privatize_field (t, (OMP_CLAUSE_CODE (c) @@ -6420,7 +6468,7 @@ finish_omp_clauses (tree clauses, enum c_omp_region_type ort) omp_note_field_privatization (t, OMP_CLAUSE_DECL (c)); else t = OMP_CLAUSE_DECL (c); - if (t == current_class_ptr) + if (ort != C_ORT_ACC && t == current_class_ptr) { error_at (OMP_CLAUSE_LOCATION (c), "% allowed in OpenMP only in %" @@ -6907,7 +6955,7 @@ finish_omp_clauses (tree clauses, enum c_omp_region_type ort) } if (t == error_mark_node) remove = true; - else if (t == current_class_ptr) + else if (ort != C_ORT_ACC && t == current_class_ptr) { error_at (OMP_CLAUSE_LOCATION (c), "% allowed in OpenMP only in %" @@ -7037,6 +7085,8 @@ finish_omp_clauses (tree clauses, enum c_omp_region_type ort) } } } + if (cp_oacc_check_attachments (c)) + remove = true; break; } if (t == error_mark_node) @@ -7044,14 +7094,25 @@ finish_omp_clauses (tree clauses, enum c_omp_region_type ort) remove = true; break; } + /* OpenACC attach / detach clauses must be pointers. */ + if (cp_oacc_check_attachments (c)) + { + remove = true; + break; + } if (REFERENCE_REF_P (t) && TREE_CODE (TREE_OPERAND (t, 0)) == COMPONENT_REF) { t = TREE_OPERAND (t, 0); OMP_CLAUSE_DECL (c) = t; } + if (ort == C_ORT_ACC + && TREE_CODE (t) == COMPONENT_REF + && TREE_CODE (TREE_OPERAND (t, 0)) == INDIRECT_REF) + t = TREE_OPERAND (TREE_OPERAND (t, 0), 0); if (TREE_CODE (t) == COMPONENT_REF - && (ort & C_ORT_OMP_DECLARE_SIMD) == C_ORT_OMP + && ((ort & C_ORT_OMP_DECLARE_SIMD) == C_ORT_OMP + || ort == C_ORT_ACC) && OMP_CLAUSE_CODE (c) != OMP_CLAUSE__CACHE_) { if (type_dependent_expression_p (t)) diff --git a/gcc/fortran/gfortran.h b/gcc/fortran/gfortran.h index d8ef35d..9f96418 100644 --- a/gcc/fortran/gfortran.h +++ b/gcc/fortran/gfortran.h @@ -1175,10 +1175,12 @@ enum gfc_omp_depend_op enum gfc_omp_map_op { OMP_MAP_ALLOC, + OMP_MAP_ATTACH, OMP_MAP_TO, OMP_MAP_FROM, OMP_MAP_TOFROM, OMP_MAP_DELETE, + OMP_MAP_DETACH, OMP_MAP_FORCE_ALLOC, OMP_MAP_FORCE_TO, OMP_MAP_FORCE_FROM, diff --git a/gcc/fortran/openmp.c b/gcc/fortran/openmp.c index 6430e61..ebba7ca 100644 --- a/gcc/fortran/openmp.c +++ b/gcc/fortran/openmp.c @@ -222,7 +222,8 @@ static match gfc_match_omp_variable_list (const char *str, gfc_omp_namelist **list, bool allow_common, bool *end_colon = NULL, gfc_omp_namelist ***headp = NULL, - bool allow_sections = false) + bool allow_sections = false, + bool allow_derived = false) { gfc_omp_namelist *head, *tail, *p; locus old_loc, cur_loc; @@ -248,7 +249,8 @@ gfc_match_omp_variable_list (const char *str, gfc_omp_namelist **list, case MATCH_YES: gfc_expr *expr; expr = NULL; - if (allow_sections && gfc_peek_ascii_char () == '(') + if ((allow_sections && gfc_peek_ascii_char () == '(') + || (allow_derived && gfc_peek_ascii_char () == '%')) { gfc_current_locus = cur_loc; m = gfc_match_variable (&expr, 0); @@ -785,7 +787,7 @@ enum omp_mask1 OMP_MASK1_LAST }; -/* OpenACC 2.0 specific clauses. */ +/* OpenACC 2.0+ specific clauses. */ enum omp_mask2 { OMP_CLAUSE_ASYNC, @@ -811,6 +813,8 @@ enum omp_mask2 OMP_CLAUSE_TILE, OMP_CLAUSE_IF_PRESENT, OMP_CLAUSE_FINALIZE, + OMP_CLAUSE_ATTACH, + OMP_CLAUSE_DETACH, /* This must come last. */ OMP_MASK2_LAST }; @@ -914,10 +918,12 @@ omp_inv_mask::omp_inv_mask (const omp_mask &m) : omp_mask (m) mapping. */ static bool -gfc_match_omp_map_clause (gfc_omp_namelist **list, gfc_omp_map_op map_op) +gfc_match_omp_map_clause (gfc_omp_namelist **list, gfc_omp_map_op map_op, + bool allow_derived = false) { gfc_omp_namelist **head = NULL; - if (gfc_match_omp_variable_list ("", list, false, NULL, &head, true) + if (gfc_match_omp_variable_list ("", list, false, NULL, &head, true, + allow_derived) == MATCH_YES) { gfc_omp_namelist *n; @@ -939,6 +945,14 @@ gfc_match_omp_clauses (gfc_omp_clauses **cp, const omp_mask mask, { gfc_omp_clauses *c = gfc_get_omp_clauses (); locus old_loc; + /* Determine whether we're dealing with an OpenACC directive that permits + derived type member accesses. This in particular disallows + "!$acc declare" from using such accesses, because it's not clear if/how + that should work. */ + bool allow_derived = (openacc + && ((mask & OMP_CLAUSE_ATTACH) + || (mask & OMP_CLAUSE_DETACH) + || (mask & OMP_CLAUSE_HOST_SELF))); gcc_checking_assert (OMP_MASK1_LAST <= 64 && OMP_MASK2_LAST <= 64); *cp = NULL; @@ -1012,6 +1026,11 @@ gfc_match_omp_clauses (gfc_omp_clauses **cp, const omp_mask mask, needs_space = true; continue; } + if ((mask & OMP_CLAUSE_ATTACH) + && gfc_match ("attach ( ") == MATCH_YES + && gfc_match_omp_map_clause (&c->lists[OMP_LIST_MAP], + OMP_MAP_ATTACH, allow_derived)) + continue; break; case 'c': if ((mask & OMP_CLAUSE_COLLAPSE) @@ -1039,7 +1058,7 @@ gfc_match_omp_clauses (gfc_omp_clauses **cp, const omp_mask mask, if ((mask & OMP_CLAUSE_COPY) && gfc_match ("copy ( ") == MATCH_YES && gfc_match_omp_map_clause (&c->lists[OMP_LIST_MAP], - OMP_MAP_TOFROM)) + OMP_MAP_TOFROM, allow_derived)) continue; if (mask & OMP_CLAUSE_COPYIN) { @@ -1047,7 +1066,7 @@ gfc_match_omp_clauses (gfc_omp_clauses **cp, const omp_mask mask, { if (gfc_match ("copyin ( ") == MATCH_YES && gfc_match_omp_map_clause (&c->lists[OMP_LIST_MAP], - OMP_MAP_TO)) + OMP_MAP_TO, allow_derived)) continue; } else if (gfc_match_omp_variable_list ("copyin (", @@ -1058,7 +1077,7 @@ gfc_match_omp_clauses (gfc_omp_clauses **cp, const omp_mask mask, if ((mask & OMP_CLAUSE_COPYOUT) && gfc_match ("copyout ( ") == MATCH_YES && gfc_match_omp_map_clause (&c->lists[OMP_LIST_MAP], - OMP_MAP_FROM)) + OMP_MAP_FROM, allow_derived)) continue; if ((mask & OMP_CLAUSE_COPYPRIVATE) && gfc_match_omp_variable_list ("copyprivate (", @@ -1068,7 +1087,7 @@ gfc_match_omp_clauses (gfc_omp_clauses **cp, const omp_mask mask, if ((mask & OMP_CLAUSE_CREATE) && gfc_match ("create ( ") == MATCH_YES && gfc_match_omp_map_clause (&c->lists[OMP_LIST_MAP], - OMP_MAP_ALLOC)) + OMP_MAP_ALLOC, allow_derived)) continue; break; case 'd': @@ -1104,7 +1123,7 @@ gfc_match_omp_clauses (gfc_omp_clauses **cp, const omp_mask mask, if ((mask & OMP_CLAUSE_DELETE) && gfc_match ("delete ( ") == MATCH_YES && gfc_match_omp_map_clause (&c->lists[OMP_LIST_MAP], - OMP_MAP_RELEASE)) + OMP_MAP_RELEASE, allow_derived)) continue; if ((mask & OMP_CLAUSE_DEPEND) && gfc_match ("depend ( ") == MATCH_YES) @@ -1147,6 +1166,11 @@ gfc_match_omp_clauses (gfc_omp_clauses **cp, const omp_mask mask, else gfc_current_locus = old_loc; } + if ((mask & OMP_CLAUSE_DETACH) + && gfc_match ("detach ( ") == MATCH_YES + && gfc_match_omp_map_clause (&c->lists[OMP_LIST_MAP], + OMP_MAP_DETACH, allow_derived)) + continue; if ((mask & OMP_CLAUSE_DEVICE) && !openacc && c->device == NULL @@ -1156,12 +1180,13 @@ gfc_match_omp_clauses (gfc_omp_clauses **cp, const omp_mask mask, && openacc && gfc_match ("device ( ") == MATCH_YES && gfc_match_omp_map_clause (&c->lists[OMP_LIST_MAP], - OMP_MAP_FORCE_TO)) + OMP_MAP_FORCE_TO, allow_derived)) continue; if ((mask & OMP_CLAUSE_DEVICEPTR) && gfc_match ("deviceptr ( ") == MATCH_YES && gfc_match_omp_map_clause (&c->lists[OMP_LIST_MAP], - OMP_MAP_FORCE_DEVICEPTR)) + OMP_MAP_FORCE_DEVICEPTR, + allow_derived)) continue; if ((mask & OMP_CLAUSE_DEVICE_RESIDENT) && gfc_match_omp_variable_list @@ -1239,7 +1264,7 @@ gfc_match_omp_clauses (gfc_omp_clauses **cp, const omp_mask mask, if ((mask & OMP_CLAUSE_HOST_SELF) && gfc_match ("host ( ") == MATCH_YES && gfc_match_omp_map_clause (&c->lists[OMP_LIST_MAP], - OMP_MAP_FORCE_FROM)) + OMP_MAP_FORCE_FROM, allow_derived)) continue; break; case 'i': @@ -1511,47 +1536,48 @@ gfc_match_omp_clauses (gfc_omp_clauses **cp, const omp_mask mask, if ((mask & OMP_CLAUSE_COPY) && gfc_match ("pcopy ( ") == MATCH_YES && gfc_match_omp_map_clause (&c->lists[OMP_LIST_MAP], - OMP_MAP_TOFROM)) + OMP_MAP_TOFROM, allow_derived)) continue; if ((mask & OMP_CLAUSE_COPYIN) && gfc_match ("pcopyin ( ") == MATCH_YES && gfc_match_omp_map_clause (&c->lists[OMP_LIST_MAP], - OMP_MAP_TO)) + OMP_MAP_TO, allow_derived)) continue; if ((mask & OMP_CLAUSE_COPYOUT) && gfc_match ("pcopyout ( ") == MATCH_YES && gfc_match_omp_map_clause (&c->lists[OMP_LIST_MAP], - OMP_MAP_FROM)) + OMP_MAP_FROM, allow_derived)) continue; if ((mask & OMP_CLAUSE_CREATE) && gfc_match ("pcreate ( ") == MATCH_YES && gfc_match_omp_map_clause (&c->lists[OMP_LIST_MAP], - OMP_MAP_ALLOC)) + OMP_MAP_ALLOC, allow_derived)) continue; if ((mask & OMP_CLAUSE_PRESENT) && gfc_match ("present ( ") == MATCH_YES && gfc_match_omp_map_clause (&c->lists[OMP_LIST_MAP], - OMP_MAP_FORCE_PRESENT)) + OMP_MAP_FORCE_PRESENT, + allow_derived)) continue; if ((mask & OMP_CLAUSE_COPY) && gfc_match ("present_or_copy ( ") == MATCH_YES && gfc_match_omp_map_clause (&c->lists[OMP_LIST_MAP], - OMP_MAP_TOFROM)) + OMP_MAP_TOFROM, allow_derived)) continue; if ((mask & OMP_CLAUSE_COPYIN) && gfc_match ("present_or_copyin ( ") == MATCH_YES && gfc_match_omp_map_clause (&c->lists[OMP_LIST_MAP], - OMP_MAP_TO)) + OMP_MAP_TO, allow_derived)) continue; if ((mask & OMP_CLAUSE_COPYOUT) && gfc_match ("present_or_copyout ( ") == MATCH_YES && gfc_match_omp_map_clause (&c->lists[OMP_LIST_MAP], - OMP_MAP_FROM)) + OMP_MAP_FROM, allow_derived)) continue; if ((mask & OMP_CLAUSE_CREATE) && gfc_match ("present_or_create ( ") == MATCH_YES && gfc_match_omp_map_clause (&c->lists[OMP_LIST_MAP], - OMP_MAP_ALLOC)) + OMP_MAP_ALLOC, allow_derived)) continue; if ((mask & OMP_CLAUSE_PRIORITY) && c->priority == NULL @@ -1669,8 +1695,8 @@ gfc_match_omp_clauses (gfc_omp_clauses **cp, const omp_mask mask, if (gfc_match_omp_variable_list (" :", &c->lists[OMP_LIST_REDUCTION], - false, NULL, &head, - openacc) == MATCH_YES) + false, NULL, &head, openacc, + allow_derived) == MATCH_YES) { gfc_omp_namelist *n; if (rop == OMP_REDUCTION_NONE) @@ -1769,7 +1795,7 @@ gfc_match_omp_clauses (gfc_omp_clauses **cp, const omp_mask mask, if ((mask & OMP_CLAUSE_HOST_SELF) && gfc_match ("self ( ") == MATCH_YES && gfc_match_omp_map_clause (&c->lists[OMP_LIST_MAP], - OMP_MAP_FORCE_FROM)) + OMP_MAP_FORCE_FROM, allow_derived)) continue; if ((mask & OMP_CLAUSE_SEQ) && !c->seq @@ -1927,17 +1953,17 @@ gfc_match_omp_clauses (gfc_omp_clauses **cp, const omp_mask mask, | OMP_CLAUSE_COPY | OMP_CLAUSE_COPYIN | OMP_CLAUSE_COPYOUT \ | OMP_CLAUSE_CREATE | OMP_CLAUSE_PRESENT | OMP_CLAUSE_DEVICEPTR \ | OMP_CLAUSE_PRIVATE | OMP_CLAUSE_FIRSTPRIVATE | OMP_CLAUSE_DEFAULT \ - | OMP_CLAUSE_WAIT) + | OMP_CLAUSE_WAIT | OMP_CLAUSE_ATTACH) #define OACC_KERNELS_CLAUSES \ (omp_mask (OMP_CLAUSE_IF) | OMP_CLAUSE_ASYNC | OMP_CLAUSE_NUM_GANGS \ | OMP_CLAUSE_NUM_WORKERS | OMP_CLAUSE_VECTOR_LENGTH | OMP_CLAUSE_DEVICEPTR \ | OMP_CLAUSE_COPY | OMP_CLAUSE_COPYIN | OMP_CLAUSE_COPYOUT \ | OMP_CLAUSE_CREATE | OMP_CLAUSE_PRESENT | OMP_CLAUSE_DEFAULT \ - | OMP_CLAUSE_WAIT) + | OMP_CLAUSE_WAIT | OMP_CLAUSE_ATTACH) #define OACC_DATA_CLAUSES \ (omp_mask (OMP_CLAUSE_IF) | OMP_CLAUSE_DEVICEPTR | OMP_CLAUSE_COPY \ | OMP_CLAUSE_COPYIN | OMP_CLAUSE_COPYOUT | OMP_CLAUSE_CREATE \ - | OMP_CLAUSE_PRESENT) + | OMP_CLAUSE_PRESENT | OMP_CLAUSE_ATTACH) #define OACC_LOOP_CLAUSES \ (omp_mask (OMP_CLAUSE_COLLAPSE) | OMP_CLAUSE_GANG | OMP_CLAUSE_WORKER \ | OMP_CLAUSE_VECTOR | OMP_CLAUSE_SEQ | OMP_CLAUSE_INDEPENDENT \ @@ -1958,10 +1984,11 @@ gfc_match_omp_clauses (gfc_omp_clauses **cp, const omp_mask mask, | OMP_CLAUSE_DEVICE | OMP_CLAUSE_WAIT | OMP_CLAUSE_IF_PRESENT) #define OACC_ENTER_DATA_CLAUSES \ (omp_mask (OMP_CLAUSE_IF) | OMP_CLAUSE_ASYNC | OMP_CLAUSE_WAIT \ - | OMP_CLAUSE_COPYIN | OMP_CLAUSE_CREATE) + | OMP_CLAUSE_COPYIN | OMP_CLAUSE_CREATE | OMP_CLAUSE_ATTACH) #define OACC_EXIT_DATA_CLAUSES \ (omp_mask (OMP_CLAUSE_IF) | OMP_CLAUSE_ASYNC | OMP_CLAUSE_WAIT \ - | OMP_CLAUSE_COPYOUT | OMP_CLAUSE_DELETE | OMP_CLAUSE_FINALIZE) + | OMP_CLAUSE_COPYOUT | OMP_CLAUSE_DELETE | OMP_CLAUSE_FINALIZE \ + | OMP_CLAUSE_DETACH) #define OACC_WAIT_CLAUSES \ omp_mask (OMP_CLAUSE_ASYNC) #define OACC_ROUTINE_CLAUSES \ @@ -3734,9 +3761,6 @@ resolve_nonnegative_int_expr (gfc_expr *expr, const char *clause) static void check_symbol_not_pointer (gfc_symbol *sym, locus loc, const char *name) { - if (sym->ts.type == BT_DERIVED && sym->attr.pointer) - gfc_error ("POINTER object %qs of derived type in %s clause at %L", - sym->name, name, &loc); if (sym->ts.type == BT_DERIVED && sym->attr.cray_pointer) gfc_error ("Cray pointer object %qs of derived type in %s clause at %L", sym->name, name, &loc); @@ -3781,9 +3805,6 @@ check_array_not_assumed (gfc_symbol *sym, locus loc, const char *name) static void resolve_oacc_data_clauses (gfc_symbol *sym, locus loc, const char *name) { - if (sym->ts.type == BT_DERIVED && sym->attr.allocatable) - gfc_error ("ALLOCATABLE object %qs of derived type in %s clause at %L", - sym->name, name, &loc); if ((sym->ts.type == BT_ASSUMED && sym->attr.allocatable) || (sym->ts.type == BT_CLASS && CLASS_DATA (sym) && CLASS_DATA (sym)->attr.allocatable)) @@ -4153,11 +4174,23 @@ resolve_omp_clauses (gfc_code *code, gfc_omp_clauses *omp_clauses, && (list != OMP_LIST_REDUCTION || !openacc)) for (n = omp_clauses->lists[list]; n; n = n->next) { - if (n->sym->mark) - gfc_error ("Symbol %qs present on multiple clauses at %L", - n->sym->name, &n->where); - else - n->sym->mark = 1; + bool array_only_p = true; + /* Disallow duplicate bare variable references and multiple + subarrays of the same array here, but allow multiple components of + the same (e.g. derived-type) variable. For the latter, duplicate + components are detected elsewhere. */ + if (openacc && n->expr && n->expr->expr_type == EXPR_VARIABLE) + for (gfc_ref *ref = n->expr->ref; ref; ref = ref->next) + if (ref->type != REF_ARRAY) + array_only_p = false; + if (array_only_p) + { + if (n->sym->mark) + gfc_error ("Symbol %qs present on multiple clauses at %L", + n->sym->name, &n->where); + else + n->sym->mark = 1; + } } gcc_assert (OMP_LIST_LASTPRIVATE == OMP_LIST_FIRSTPRIVATE + 1); @@ -4348,23 +4381,41 @@ resolve_omp_clauses (gfc_code *code, gfc_omp_clauses *omp_clauses, "are allowed on ORDERED directive at %L", &n->where); } + gfc_ref *array_ref = NULL; + bool resolved = false; if (n->expr) { - if (!gfc_resolve_expr (n->expr) + array_ref = n->expr->ref; + resolved = gfc_resolve_expr (n->expr); + + /* Look through component refs to find last array + reference. */ + while (resolved + && array_ref + && (array_ref->type == REF_COMPONENT + || (array_ref->type == REF_ARRAY + && array_ref->next + && array_ref->next->type == REF_COMPONENT))) + array_ref = array_ref->next; + } + if (array_ref + || (n->expr + && (!resolved || n->expr->expr_type != EXPR_VARIABLE))) + { + if (!resolved || n->expr->expr_type != EXPR_VARIABLE - || n->expr->ref == NULL - || n->expr->ref->next - || n->expr->ref->type != REF_ARRAY) + || array_ref->next + || array_ref->type != REF_ARRAY) gfc_error ("%qs in %s clause at %L is not a proper " "array section", n->sym->name, name, &n->where); - else if (n->expr->ref->u.ar.codimen) + else if (array_ref->u.ar.codimen) gfc_error ("Coarrays not supported in %s clause at %L", name, &n->where); else { int i; - gfc_array_ref *ar = &n->expr->ref->u.ar; + gfc_array_ref *ar = &array_ref->u.ar; for (i = 0; i < ar->dimen; i++) if (ar->stride[i]) { diff --git a/gcc/fortran/trans-openmp.c b/gcc/fortran/trans-openmp.c index e7f3a78..b42f57a 100644 --- a/gcc/fortran/trans-openmp.c +++ b/gcc/fortran/trans-openmp.c @@ -60,6 +60,9 @@ gfc_omp_privatize_by_reference (const_tree decl) if (TREE_CODE (type) == POINTER_TYPE) { + while (TREE_CODE (decl) == COMPONENT_REF) + decl = TREE_OPERAND (decl, 1); + /* Array POINTER/ALLOCATABLE have aggregate types, all user variables that have POINTER_TYPE type and aren't scalar pointers, scalar allocatables, Cray pointees or C pointers are supposed to be @@ -2108,20 +2111,47 @@ gfc_trans_omp_clauses (stmtblock_t *block, gfc_omp_clauses *clauses, tree decl = gfc_get_symbol_decl (n->sym); if (DECL_P (decl)) TREE_ADDRESSABLE (decl) = 1; - if (n->expr == NULL || n->expr->ref->u.ar.type == AR_FULL) + + gfc_ref *ref = n->expr ? n->expr->ref : NULL; + symbol_attribute *sym_attr = &n->sym->attr; + gomp_map_kind ptr_map_kind = GOMP_MAP_POINTER; + + if (ref && n->sym->ts.type == BT_DERIVED) + { + if (gfc_omp_privatize_by_reference (decl)) + decl = build_fold_indirect_ref (decl); + + for (; ref && ref->type == REF_COMPONENT; ref = ref->next) + { + tree field = ref->u.c.component->backend_decl; + gcc_assert (field && TREE_CODE (field) == FIELD_DECL); + decl = fold_build3 (COMPONENT_REF, TREE_TYPE (field), + decl, field, NULL_TREE); + sym_attr = &ref->u.c.component->attr; + } + + ptr_map_kind = GOMP_MAP_ALWAYS_POINTER; + } + + if (ref == NULL || ref->u.ar.type == AR_FULL) { + tree field = decl; + + while (TREE_CODE (field) == COMPONENT_REF) + field = TREE_OPERAND (field, 1); + if (POINTER_TYPE_P (TREE_TYPE (decl)) && (gfc_omp_privatize_by_reference (decl) - || GFC_DECL_GET_SCALAR_POINTER (decl) - || GFC_DECL_GET_SCALAR_ALLOCATABLE (decl) - || GFC_DECL_CRAY_POINTEE (decl) + || GFC_DECL_GET_SCALAR_POINTER (field) + || GFC_DECL_GET_SCALAR_ALLOCATABLE (field) + || GFC_DECL_CRAY_POINTEE (field) || GFC_DESCRIPTOR_TYPE_P - (TREE_TYPE (TREE_TYPE (decl))))) + (TREE_TYPE (TREE_TYPE (field))))) { tree orig_decl = decl; node4 = build_omp_clause (input_location, OMP_CLAUSE_MAP); - OMP_CLAUSE_SET_MAP_KIND (node4, GOMP_MAP_POINTER); + OMP_CLAUSE_SET_MAP_KIND (node4, ptr_map_kind); OMP_CLAUSE_DECL (node4) = decl; OMP_CLAUSE_SIZE (node4) = size_int (0); decl = build_fold_indirect_ref (decl); @@ -2131,13 +2161,15 @@ gfc_trans_omp_clauses (stmtblock_t *block, gfc_omp_clauses *clauses, { node3 = build_omp_clause (input_location, OMP_CLAUSE_MAP); - OMP_CLAUSE_SET_MAP_KIND (node3, GOMP_MAP_POINTER); + OMP_CLAUSE_SET_MAP_KIND (node3, ptr_map_kind); OMP_CLAUSE_DECL (node3) = decl; OMP_CLAUSE_SIZE (node3) = size_int (0); decl = build_fold_indirect_ref (decl); } } - if (GFC_DESCRIPTOR_TYPE_P (TREE_TYPE (decl))) + if (GFC_DESCRIPTOR_TYPE_P (TREE_TYPE (decl)) + && n->u.map_op != OMP_MAP_ATTACH + && n->u.map_op != OMP_MAP_DETACH) { tree type = TREE_TYPE (decl); tree ptr = gfc_conv_descriptor_data_get (decl); @@ -2152,14 +2184,16 @@ gfc_trans_omp_clauses (stmtblock_t *block, gfc_omp_clauses *clauses, OMP_CLAUSE_SIZE (node2) = TYPE_SIZE_UNIT (type); node3 = build_omp_clause (input_location, OMP_CLAUSE_MAP); - OMP_CLAUSE_SET_MAP_KIND (node3, GOMP_MAP_POINTER); + OMP_CLAUSE_SET_MAP_KIND (node3, ptr_map_kind); OMP_CLAUSE_DECL (node3) = gfc_conv_descriptor_data_get (decl); + if (ptr_map_kind == GOMP_MAP_ALWAYS_POINTER) + STRIP_NOPS (OMP_CLAUSE_DECL (node3)); OMP_CLAUSE_SIZE (node3) = size_int (0); /* We have to check for n->sym->attr.dimension because of scalar coarrays. */ - if (n->sym->attr.pointer && n->sym->attr.dimension) + if (sym_attr->pointer && sym_attr->dimension) { stmtblock_t cond_block; tree size @@ -2189,11 +2223,11 @@ gfc_trans_omp_clauses (stmtblock_t *block, gfc_omp_clauses *clauses, else_b)); OMP_CLAUSE_SIZE (node) = size; } - else if (n->sym->attr.dimension) + else if (sym_attr->dimension) OMP_CLAUSE_SIZE (node) = gfc_full_array_size (block, decl, GFC_TYPE_ARRAY_RANK (type)); - if (n->sym->attr.dimension) + if (sym_attr->dimension) { tree elemsz = TYPE_SIZE_UNIT (gfc_get_element_type (type)); @@ -2206,11 +2240,11 @@ gfc_trans_omp_clauses (stmtblock_t *block, gfc_omp_clauses *clauses, else OMP_CLAUSE_DECL (node) = decl; } - else + else if (ref) { tree ptr, ptr2; gfc_init_se (&se, NULL); - if (n->expr->ref->u.ar.type == AR_ELEMENT) + if (ref->u.ar.type == AR_ELEMENT) { gfc_conv_expr_reference (&se, n->expr); gfc_add_block_to_block (block, &se.pre); @@ -2244,7 +2278,7 @@ gfc_trans_omp_clauses (stmtblock_t *block, gfc_omp_clauses *clauses, { node4 = build_omp_clause (input_location, OMP_CLAUSE_MAP); - OMP_CLAUSE_SET_MAP_KIND (node4, GOMP_MAP_POINTER); + OMP_CLAUSE_SET_MAP_KIND (node4, ptr_map_kind); OMP_CLAUSE_DECL (node4) = decl; OMP_CLAUSE_SIZE (node4) = size_int (0); decl = build_fold_indirect_ref (decl); @@ -2261,9 +2295,11 @@ gfc_trans_omp_clauses (stmtblock_t *block, gfc_omp_clauses *clauses, OMP_CLAUSE_SIZE (node2) = TYPE_SIZE_UNIT (type); node3 = build_omp_clause (input_location, OMP_CLAUSE_MAP); - OMP_CLAUSE_SET_MAP_KIND (node3, GOMP_MAP_POINTER); + OMP_CLAUSE_SET_MAP_KIND (node3, ptr_map_kind); OMP_CLAUSE_DECL (node3) = gfc_conv_descriptor_data_get (decl); + if (ptr_map_kind == GOMP_MAP_ALWAYS_POINTER) + STRIP_NOPS (OMP_CLAUSE_DECL (node3)); } else { @@ -2276,18 +2312,23 @@ gfc_trans_omp_clauses (stmtblock_t *block, gfc_omp_clauses *clauses, } node3 = build_omp_clause (input_location, OMP_CLAUSE_MAP); - OMP_CLAUSE_SET_MAP_KIND (node3, GOMP_MAP_POINTER); + OMP_CLAUSE_SET_MAP_KIND (node3, ptr_map_kind); OMP_CLAUSE_DECL (node3) = decl; } ptr2 = fold_convert (sizetype, ptr2); OMP_CLAUSE_SIZE (node3) = fold_build2 (MINUS_EXPR, sizetype, ptr, ptr2); } + else + gcc_unreachable (); switch (n->u.map_op) { case OMP_MAP_ALLOC: OMP_CLAUSE_SET_MAP_KIND (node, GOMP_MAP_ALLOC); break; + case OMP_MAP_ATTACH: + OMP_CLAUSE_SET_MAP_KIND (node, GOMP_MAP_ATTACH); + break; case OMP_MAP_TO: OMP_CLAUSE_SET_MAP_KIND (node, GOMP_MAP_TO); break; @@ -2312,6 +2353,9 @@ gfc_trans_omp_clauses (stmtblock_t *block, gfc_omp_clauses *clauses, case OMP_MAP_DELETE: OMP_CLAUSE_SET_MAP_KIND (node, GOMP_MAP_DELETE); break; + case OMP_MAP_DETACH: + OMP_CLAUSE_SET_MAP_KIND (node, GOMP_MAP_DETACH); + break; case OMP_MAP_FORCE_ALLOC: OMP_CLAUSE_SET_MAP_KIND (node, GOMP_MAP_FORCE_ALLOC); break; diff --git a/gcc/gimplify.c b/gcc/gimplify.c index 274edc0..aa7723d 100644 --- a/gcc/gimplify.c +++ b/gcc/gimplify.c @@ -113,6 +113,10 @@ enum gimplify_omp_var_data GOVD_NONTEMPORAL = 4194304, + /* Flag for GOVD_MAP: (struct) vars that have pointer attachments for + fields. */ + GOVD_MAP_HAS_ATTACHMENTS = 8388608, + GOVD_DATA_SHARE_CLASS = (GOVD_SHARED | GOVD_PRIVATE | GOVD_FIRSTPRIVATE | GOVD_LASTPRIVATE | GOVD_REDUCTION | GOVD_LINEAR | GOVD_LOCAL) @@ -7998,7 +8002,13 @@ insert_struct_component_mapping (enum tree_code code, tree c, tree struct_node, OMP_CLAUSE_SET_MAP_KIND (c2, mkind); OMP_CLAUSE_DECL (c2) = unshare_expr (OMP_CLAUSE_DECL (c)); OMP_CLAUSE_CHAIN (c2) = scp ? *scp : prev_node; - OMP_CLAUSE_SIZE (c2) = TYPE_SIZE_UNIT (ptr_type_node); + if (OMP_CLAUSE_CHAIN (prev_node) != c + && OMP_CLAUSE_CODE (OMP_CLAUSE_CHAIN (prev_node)) == OMP_CLAUSE_MAP + && (OMP_CLAUSE_MAP_KIND (OMP_CLAUSE_CHAIN (prev_node)) + == GOMP_MAP_TO_PSET)) + OMP_CLAUSE_SIZE (c2) = OMP_CLAUSE_SIZE (OMP_CLAUSE_CHAIN (prev_node)); + else + OMP_CLAUSE_SIZE (c2) = TYPE_SIZE_UNIT (ptr_type_node); if (struct_node) OMP_CLAUSE_CHAIN (struct_node) = c2; @@ -8588,7 +8598,9 @@ gimplify_scan_omp_clauses (tree *list_p, gimple_seq *pre_p, remove = true; break; } - if (DECL_P (decl)) + if (DECL_P (decl) + && OMP_CLAUSE_MAP_KIND (c) != GOMP_MAP_TO_PSET + && code != OACC_UPDATE) { if (error_operand_p (decl)) { @@ -8640,16 +8652,36 @@ gimplify_scan_omp_clauses (tree *list_p, gimple_seq *pre_p, = splay_tree_lookup (ctx->variables, (splay_tree_key)decl); bool ptr = (OMP_CLAUSE_MAP_KIND (c) == GOMP_MAP_ALWAYS_POINTER); + bool attach = OMP_CLAUSE_MAP_KIND (c) == GOMP_MAP_ATTACH + || OMP_CLAUSE_MAP_KIND (c) == GOMP_MAP_DETACH; + bool has_attachments = false; + /* For OpenACC, pointers in structs should trigger an + attach action. */ + if (ptr && (region_type & ORT_ACC) != 0) + { + /* Turning a GOMP_MAP_ALWAYS_POINTER clause into a + GOMP_MAP_ATTACH clause after we have detected a case + that needs a GOMP_MAP_STRUCT mapping adding. */ + OMP_CLAUSE_SET_MAP_KIND (c, + (code == OACC_EXIT_DATA) ? GOMP_MAP_DETACH + : GOMP_MAP_ATTACH); + has_attachments = true; + } if (n == NULL || (n->value & GOVD_MAP) == 0) { tree l = build_omp_clause (OMP_CLAUSE_LOCATION (c), OMP_CLAUSE_MAP); - OMP_CLAUSE_SET_MAP_KIND (l, GOMP_MAP_STRUCT); + OMP_CLAUSE_SET_MAP_KIND (l, attach + ? GOMP_MAP_FORCE_PRESENT : GOMP_MAP_STRUCT); if (!base_eq_orig_base) OMP_CLAUSE_DECL (l) = unshare_expr (orig_base); else OMP_CLAUSE_DECL (l) = decl; - OMP_CLAUSE_SIZE (l) = size_int (1); + OMP_CLAUSE_SIZE (l) = attach + ? (DECL_P (OMP_CLAUSE_DECL (l)) + ? DECL_SIZE_UNIT (OMP_CLAUSE_DECL (l)) + : TYPE_SIZE_UNIT (TREE_TYPE (OMP_CLAUSE_DECL (l)))) + : size_int (1); if (struct_map_to_clause == NULL) struct_map_to_clause = new hash_map; struct_map_to_clause->put (decl, l); @@ -8681,9 +8713,11 @@ gimplify_scan_omp_clauses (tree *list_p, gimple_seq *pre_p, flags = GOVD_MAP | GOVD_EXPLICIT; if (GOMP_MAP_ALWAYS_P (OMP_CLAUSE_MAP_KIND (c)) || ptr) flags |= GOVD_SEEN; + if (has_attachments) + flags |= GOVD_MAP_HAS_ATTACHMENTS; goto do_add_decl; } - else + else if (struct_map_to_clause) { tree *osc = struct_map_to_clause->get (decl); tree *sc = NULL, *scp = NULL; @@ -8692,8 +8726,10 @@ gimplify_scan_omp_clauses (tree *list_p, gimple_seq *pre_p, sc = &OMP_CLAUSE_CHAIN (*osc); if (*sc != c && (OMP_CLAUSE_MAP_KIND (*sc) - == GOMP_MAP_FIRSTPRIVATE_REFERENCE)) + == GOMP_MAP_FIRSTPRIVATE_REFERENCE)) sc = &OMP_CLAUSE_CHAIN (*sc); + /* Here "prev_list_p" is the end of the inserted + alloc/release nodes after the struct node, OSC. */ for (; *sc != c; sc = &OMP_CLAUSE_CHAIN (*sc)) if (ptr && sc == prev_list_p) break; @@ -8752,9 +8788,10 @@ gimplify_scan_omp_clauses (tree *list_p, gimple_seq *pre_p, } if (remove) break; - OMP_CLAUSE_SIZE (*osc) - = size_binop (PLUS_EXPR, OMP_CLAUSE_SIZE (*osc), - size_one_node); + if (!attach) + OMP_CLAUSE_SIZE (*osc) + = size_binop (PLUS_EXPR, OMP_CLAUSE_SIZE (*osc), + size_one_node); if (ptr) { tree cl @@ -8786,11 +8823,15 @@ gimplify_scan_omp_clauses (tree *list_p, gimple_seq *pre_p, } if (!remove && OMP_CLAUSE_MAP_KIND (c) != GOMP_MAP_ALWAYS_POINTER + && OMP_CLAUSE_MAP_KIND (c) != GOMP_MAP_TO_PSET && OMP_CLAUSE_CHAIN (c) && OMP_CLAUSE_CODE (OMP_CLAUSE_CHAIN (c)) == OMP_CLAUSE_MAP - && (OMP_CLAUSE_MAP_KIND (OMP_CLAUSE_CHAIN (c)) - == GOMP_MAP_ALWAYS_POINTER)) + && ((OMP_CLAUSE_MAP_KIND (OMP_CLAUSE_CHAIN (c)) + == GOMP_MAP_ALWAYS_POINTER) + || (OMP_CLAUSE_MAP_KIND (OMP_CLAUSE_CHAIN (c)) + == GOMP_MAP_TO_PSET))) prev_list_p = list_p; + break; } flags = GOVD_MAP | GOVD_EXPLICIT; @@ -9412,6 +9453,8 @@ gimplify_adjust_omp_clauses_1 (splay_tree_node n, void *data) return 0; if ((flags & GOVD_SEEN) == 0) return 0; + if ((flags & GOVD_MAP_HAS_ATTACHMENTS) != 0) + return 0; if (flags & GOVD_DEBUG_PRIVATE) { gcc_assert ((flags & GOVD_DATA_SHARE_CLASS) == GOVD_SHARED); @@ -11795,8 +11838,9 @@ gimplify_omp_target_update (tree *expr_p, gimple_seq *pre_p) && omp_find_clause (OMP_STANDALONE_CLAUSES (expr), OMP_CLAUSE_FINALIZE)) { - /* Use GOMP_MAP_DELETE/GOMP_MAP_FORCE_FROM to denote that "finalize" - semantics apply to all mappings of this OpenACC directive. */ + /* Use GOMP_MAP_DELETE, GOMP_MAP_FORCE_DETACH, and + GOMP_MAP_FORCE_FROM to denote that "finalize" semantics apply + to all mappings of this OpenACC directive. */ bool finalize_marked = false; for (tree c = OMP_STANDALONE_CLAUSES (expr); c; c = OMP_CLAUSE_CHAIN (c)) if (OMP_CLAUSE_CODE (c) == OMP_CLAUSE_MAP) @@ -11810,10 +11854,19 @@ gimplify_omp_target_update (tree *expr_p, gimple_seq *pre_p) OMP_CLAUSE_SET_MAP_KIND (c, GOMP_MAP_DELETE); finalize_marked = true; break; + case GOMP_MAP_DETACH: + OMP_CLAUSE_SET_MAP_KIND (c, GOMP_MAP_FORCE_DETACH); + finalize_marked = true; + break; + case GOMP_MAP_STRUCT: + case GOMP_MAP_FORCE_PRESENT: + /* Skip over an initial struct or force_present mapping. */ + break; default: - /* Check consistency: libgomp relies on the very first data - mapping clause being marked, so make sure we did that before - any other mapping clauses. */ + /* Check consistency: libgomp relies on the very first + non-struct, non-force-present data mapping clause being + marked, so make sure we did that before any other mapping + clauses. */ gcc_assert (finalize_marked); break; } diff --git a/gcc/omp-low.c b/gcc/omp-low.c index ca78d7a..55dbc0b 100644 --- a/gcc/omp-low.c +++ b/gcc/omp-low.c @@ -9138,6 +9138,9 @@ lower_omp_target (gimple_stmt_iterator *gsi_p, omp_context *ctx) case GOMP_MAP_FORCE_DEVICEPTR: case GOMP_MAP_DEVICE_RESIDENT: case GOMP_MAP_LINK: + case GOMP_MAP_ATTACH: + case GOMP_MAP_DETACH: + case GOMP_MAP_FORCE_DETACH: gcc_assert (is_gimple_omp_oacc (stmt)); break; default: diff --git a/gcc/testsuite/c-c++-common/goacc/mdc-1.c b/gcc/testsuite/c-c++-common/goacc/mdc-1.c new file mode 100644 index 0000000..84a44af --- /dev/null +++ b/gcc/testsuite/c-c++-common/goacc/mdc-1.c @@ -0,0 +1,54 @@ +/* Test OpenACC's support for manual deep copy, including the attach + and detach clauses. */ + +/* { dg-additional-options "-fdump-tree-omplower" } */ + +void +t1 () +{ + struct foo { + int *a, *b, c, d, *e; + } s; + + int *a, *z; + +#pragma acc enter data copyin(s) + { +#pragma acc data copy(s.a[0:10]) copy(z[0:10]) + { + s.e = z; +#pragma acc parallel loop attach(s.e) + for (int i = 0; i < 10; i++) + s.a[i] = s.e[i]; + + + a = s.e; +#pragma acc enter data attach(a) +#pragma acc exit data detach(a) + } + +#pragma acc enter data copyin(a) +#pragma acc acc enter data attach(s.e) +#pragma acc exit data detach(s.e) + +#pragma acc data attach(s.e) + { + } +#pragma acc exit data delete(a) + +#pragma acc exit data detach(a) finalize +#pragma acc exit data detach(s.a) finalize + } +} + +/* { dg-final { scan-tree-dump-times "pragma omp target oacc_enter_exit_data map.to:s .len: 32.." 1 "omplower" } } */ +/* { dg-final { scan-tree-dump-times "pragma omp target oacc_data map.tofrom:.z .len: 40.. map.struct:s .len: 1.. map.alloc:s.a .len: 8.. map.tofrom:._1 .len: 40.. map.attach:s.a .len: 0.." 1 "omplower" } } */ +/* { dg-final { scan-tree-dump-times "pragma omp target oacc_parallel map.force_present:s .len: 32.. map.attach:s.e .len: 8.." 1 "omplower" } } */ +/* { dg-final { scan-tree-dump-times "pragma omp target oacc_enter_exit_data map.attach:a .len: 8.." 1 "omplower" } } */ +/* { dg-final { scan-tree-dump-times "pragma omp target oacc_enter_exit_data map.detach:a .len: 8.." 1 "omplower" } } */ +/* { dg-final { scan-tree-dump-times "pragma omp target oacc_enter_exit_data map.to:a .len: 8.." 1 "omplower" } } */ +/* { dg-final { scan-tree-dump-times "pragma omp target oacc_enter_exit_data map.force_present:s .len: 32.. map.detach:s.e .len: 8.." 1 "omplower" } } */ +/* { dg-final { scan-tree-dump-times "pragma omp target oacc_data map.force_present:s .len: 32.. map.attach:s.e .len: 8.." 1 "omplower" } } */ +/* { dg-final { scan-tree-dump-times "pragma omp target oacc_enter_exit_data map.release:a .len: 8.." 1 "omplower" } } */ +/* { dg-final { scan-tree-dump-times "pragma omp target oacc_enter_exit_data finalize map.force_detach:a .len: 8.." 1 "omplower" } } */ +/* { dg-final { scan-tree-dump-times "pragma omp target oacc_enter_exit_data finalize map.force_present:s .len: 32.. map.force_detach:s.a .len: 8.." 1 "omplower" } } */ diff --git a/gcc/testsuite/c-c++-common/goacc/mdc-2.c b/gcc/testsuite/c-c++-common/goacc/mdc-2.c new file mode 100644 index 0000000..ebfb99d --- /dev/null +++ b/gcc/testsuite/c-c++-common/goacc/mdc-2.c @@ -0,0 +1,62 @@ +/* Test OpenACC's support for manual deep copy, including the attach + and detach clauses. */ + +void +t1 () +{ + struct foo { + int *a, *b, c, d, *e; + } s; + + int *a, *z, scalar, **y; + +#pragma acc enter data copyin(s) detach(z) /* { dg-error ".detach. is not valid for" } */ + { +#pragma acc data copy(s.a[0:10]) copy(z[0:10]) + { + s.e = z; +#pragma acc parallel loop attach(s.e) detach(s.b) /* { dg-error ".detach. is not valid for" } */ + for (int i = 0; i < 10; i++) + s.a[i] = s.e[i]; + + a = s.e; +#pragma acc enter data attach(a) detach(s.c) /* { dg-error ".detach. is not valid for" } */ +#pragma acc exit data detach(a) + } + +#pragma acc enter data attach(z[:5]) /* { dg-error "array section in .attach. clause" } */ +/* { dg-error "has no data movement clause" "" { target *-*-* } .-1 } */ +#pragma acc exit data detach(z[:5]) /* { dg-error "array section in .detach. clause" } */ +/* { dg-error "has no data movement clause" "" { target *-*-* } .-1 } */ +#pragma acc enter data attach(z[1:]) /* { dg-error "array section in .attach. clause" } */ +/* { dg-error "has no data movement clause" "" { target *-*-* } .-1 } */ +#pragma acc exit data detach(z[1:]) /* { dg-error "array section in .detach. clause" } */ +/* { dg-error "has no data movement clause" "" { target *-*-* } .-1 } */ +#pragma acc enter data attach(z[:]) /* { dg-error "array section in .attach. clause" } */ +/* { dg-error "has no data movement clause" "" { target *-*-* } .-1 } */ +#pragma acc exit data detach(z[:]) /* { dg-error "array section in .detach. clause" } */ +/* { dg-error "has no data movement clause" "" { target *-*-* } .-1 } */ +#pragma acc enter data attach(z[3]) /* { dg-error "expected pointer in .attach. clause" } */ +#pragma acc exit data detach(z[3]) /* { dg-error "expected pointer in .detach. clause" } */ + +#pragma acc acc enter data attach(s.e) +#pragma acc exit data detach(s.e) attach(z) /* { dg-error ".attach. is not valid for" } */ + +#pragma acc data attach(s.e) + { + } +#pragma acc exit data delete(a) attach(s.a) /* { dg-error ".attach. is not valid for" } */ + +#pragma acc enter data attach(scalar) /* { dg-error "expected pointer in .attach. clause" } */ +/* { dg-error "has no data movement clause" "" { target *-*-* } .-1 } */ +#pragma acc exit data detach(scalar) /* { dg-error "expected pointer in .detach. clause" } */ +/* { dg-error "has no data movement clause" "" { target *-*-* } .-1 } */ +#pragma acc enter data attach(s) /* { dg-error "expected pointer in .attach. clause" } */ +/* { dg-error "has no data movement clause" "" { target *-*-* } .-1 } */ +#pragma acc exit data detach(s) /* { dg-error "expected pointer in .detach. clause" } */ +/* { dg-error "has no data movement clause" "" { target *-*-* } .-1 } */ + } + +#pragma acc enter data attach(y[10]) +#pragma acc exit data detach(y[10]) +} diff --git a/gcc/testsuite/g++.dg/goacc/mdc.C b/gcc/testsuite/g++.dg/goacc/mdc.C new file mode 100644 index 0000000..fbd43aa --- /dev/null +++ b/gcc/testsuite/g++.dg/goacc/mdc.C @@ -0,0 +1,68 @@ +/* Test OpenACC's support for manual deep copy, including the attach + and detach clauses. */ + +void +t1 () +{ + struct foo { + int *a, *b, c, d, *e; + } s; + + struct foo& rs = s; + + int *a, *z, scalar, **y; + int* const &ra = a; + int* const &rz = z; + int& rscalar = scalar; + int** const &ry = y; + +#pragma acc enter data copyin(rs) detach(rz) /* { dg-error ".detach. is not valid for" } */ + { +#pragma acc data copy(rs.a[0:10]) copy(rz[0:10]) + { + s.e = z; +#pragma acc parallel loop attach(rs.e) detach(rs.b) /* { dg-error ".detach. is not valid for" } */ + for (int i = 0; i < 10; i++) + s.a[i] = s.e[i]; + + a = s.e; +#pragma acc enter data attach(ra) detach(rs.c) /* { dg-error ".detach. is not valid for" } */ +#pragma acc exit data detach(ra) + } + +#pragma acc enter data attach(rz[:5]) /* { dg-error "array section in .attach. clause" } */ +/* { dg-error "has no data movement clause" "" { target *-*-* } .-1 } */ +#pragma acc exit data detach(rz[:5]) /* { dg-error "array section in .detach. clause" } */ +/* { dg-error "has no data movement clause" "" { target *-*-* } .-1 } */ +#pragma acc enter data attach(rz[1:]) /* { dg-error "array section in .attach. clause" } */ +/* { dg-error "has no data movement clause" "" { target *-*-* } .-1 } */ +#pragma acc exit data detach(rz[1:]) /* { dg-error "array section in .detach. clause" } */ +/* { dg-error "has no data movement clause" "" { target *-*-* } .-1 } */ +#pragma acc enter data attach(rz[:]) /* { dg-error "array section in .attach. clause" } */ +/* { dg-error "has no data movement clause" "" { target *-*-* } .-1 } */ +#pragma acc exit data detach(rz[:]) /* { dg-error "array section in .detach. clause" } */ +/* { dg-error "has no data movement clause" "" { target *-*-* } .-1 } */ +#pragma acc enter data attach(rz[3]) /* { dg-error "expected pointer in .attach. clause" } */ +#pragma acc exit data detach(rz[3]) /* { dg-error "expected pointer in .detach. clause" } */ + +#pragma acc acc enter data attach(rs.e) +#pragma acc exit data detach(rs.e) attach(rz) /* { dg-error ".attach. is not valid for" } */ + +#pragma acc data attach(rs.e) + { + } +#pragma acc exit data delete(ra) attach(rs.a) /* { dg-error ".attach. is not valid for" } */ + +#pragma acc enter data attach(rscalar) /* { dg-error "expected pointer in .attach. clause" } */ +/* { dg-error "has no data movement clause" "" { target *-*-* } .-1 } */ +#pragma acc exit data detach(rscalar) /* { dg-error "expected pointer in .detach. clause" } */ +/* { dg-error "has no data movement clause" "" { target *-*-* } .-1 } */ +#pragma acc enter data attach(rs) /* { dg-error "expected pointer in .attach. clause" } */ +/* { dg-error "has no data movement clause" "" { target *-*-* } .-1 } */ +#pragma acc exit data detach(rs) /* { dg-error "expected pointer in .detach. clause" } */ +/* { dg-error "has no data movement clause" "" { target *-*-* } .-1 } */ + } + +#pragma acc enter data attach(ry[10]) +#pragma acc exit data detach(ry[10]) +} diff --git a/gcc/testsuite/gfortran.dg/goacc/data-clauses.f95 b/gcc/testsuite/gfortran.dg/goacc/data-clauses.f95 index b94214e..1a4a671 100644 --- a/gcc/testsuite/gfortran.dg/goacc/data-clauses.f95 +++ b/gcc/testsuite/gfortran.dg/goacc/data-clauses.f95 @@ -39,9 +39,9 @@ contains !$acc end data - !$acc parallel copy (tip) ! { dg-error "POINTER" } + !$acc parallel copy (tip) !$acc end parallel - !$acc parallel copy (tia) ! { dg-error "ALLOCATABLE" } + !$acc parallel copy (tia) !$acc end parallel !$acc parallel deviceptr (i) copy (i) ! { dg-error "multiple clauses" } !$acc end parallel @@ -54,9 +54,9 @@ contains !$acc end data - !$acc parallel copyin (tip) ! { dg-error "POINTER" } + !$acc parallel copyin (tip) !$acc end parallel - !$acc parallel copyin (tia) ! { dg-error "ALLOCATABLE" } + !$acc parallel copyin (tia) !$acc end parallel !$acc parallel deviceptr (i) copyin (i) ! { dg-error "multiple clauses" } !$acc end parallel @@ -71,9 +71,9 @@ contains !$acc end data - !$acc parallel copyout (tip) ! { dg-error "POINTER" } + !$acc parallel copyout (tip) !$acc end parallel - !$acc parallel copyout (tia) ! { dg-error "ALLOCATABLE" } + !$acc parallel copyout (tia) !$acc end parallel !$acc parallel deviceptr (i) copyout (i) ! { dg-error "multiple clauses" } !$acc end parallel @@ -90,9 +90,9 @@ contains !$acc end data - !$acc parallel create (tip) ! { dg-error "POINTER" } + !$acc parallel create (tip) !$acc end parallel - !$acc parallel create (tia) ! { dg-error "ALLOCATABLE" } + !$acc parallel create (tia) !$acc end parallel !$acc parallel deviceptr (i) create (i) ! { dg-error "multiple clauses" } !$acc end parallel @@ -111,9 +111,9 @@ contains !$acc end data - !$acc parallel present (tip) ! { dg-error "POINTER" } + !$acc parallel present (tip) !$acc end parallel - !$acc parallel present (tia) ! { dg-error "ALLOCATABLE" } + !$acc parallel present (tia) !$acc end parallel !$acc parallel deviceptr (i) present (i) ! { dg-error "multiple clauses" } !$acc end parallel @@ -144,9 +144,9 @@ contains !$acc end parallel - !$acc parallel present_or_copy (tip) ! { dg-error "POINTER" } + !$acc parallel present_or_copy (tip) !$acc end parallel - !$acc parallel present_or_copy (tia) ! { dg-error "ALLOCATABLE" } + !$acc parallel present_or_copy (tia) !$acc end parallel !$acc parallel deviceptr (i) present_or_copy (i) ! { dg-error "multiple clauses" } !$acc end parallel @@ -169,9 +169,9 @@ contains !$acc end data - !$acc parallel present_or_copyin (tip) ! { dg-error "POINTER" } + !$acc parallel present_or_copyin (tip) !$acc end parallel - !$acc parallel present_or_copyin (tia) ! { dg-error "ALLOCATABLE" } + !$acc parallel present_or_copyin (tia) !$acc end parallel !$acc parallel deviceptr (i) present_or_copyin (i) ! { dg-error "multiple clauses" } !$acc end parallel @@ -196,9 +196,9 @@ contains !$acc end data - !$acc parallel present_or_copyout (tip) ! { dg-error "POINTER" } + !$acc parallel present_or_copyout (tip) !$acc end parallel - !$acc parallel present_or_copyout (tia) ! { dg-error "ALLOCATABLE" } + !$acc parallel present_or_copyout (tia) !$acc end parallel !$acc parallel deviceptr (i) present_or_copyout (i) ! { dg-error "multiple clauses" } !$acc end parallel @@ -225,9 +225,9 @@ contains !$acc end data - !$acc parallel present_or_create (tip) ! { dg-error "POINTER" } + !$acc parallel present_or_create (tip) !$acc end parallel - !$acc parallel present_or_create (tia) ! { dg-error "ALLOCATABLE" } + !$acc parallel present_or_create (tia) !$acc end parallel !$acc parallel deviceptr (i) present_or_create (i) ! { dg-error "multiple clauses" } !$acc end parallel @@ -256,4 +256,4 @@ contains !$acc end data end subroutine foo -end module test \ No newline at end of file +end module test diff --git a/gcc/testsuite/gfortran.dg/goacc/derived-types.f90 b/gcc/testsuite/gfortran.dg/goacc/derived-types.f90 new file mode 100644 index 0000000..5fb2981 --- /dev/null +++ b/gcc/testsuite/gfortran.dg/goacc/derived-types.f90 @@ -0,0 +1,77 @@ +! Test ACC UPDATE with derived types. + +module dt + integer, parameter :: n = 10 + type inner + integer :: d(n) + end type inner + type dtype + integer(8) :: a, b, c(n) + type(inner) :: in + end type dtype +end module dt + +program derived_acc + use dt + + implicit none + type(dtype):: var + integer i + !$acc declare create(var) + !$acc declare pcopy(var%a) ! { dg-error "Syntax error in OpenMP" } + + !$acc update host(var) + !$acc update host(var%a) + !$acc update device(var) + !$acc update device(var%a) + !$acc update self(var) + !$acc update self(var%a) + + !$acc enter data copyin(var) + !$acc enter data copyin(var%a) + + !$acc exit data copyout(var) + !$acc exit data copyout(var%a) + + !$acc data copy(var) + !$acc end data + + !$acc data copyout(var%a) + !$acc end data + + !$acc parallel loop pcopyout(var) + do i = 1, 10 + end do + !$acc end parallel loop + + !$acc parallel loop copyout(var%a) + do i = 1, 10 + end do + !$acc end parallel loop + + !$acc parallel pcopy(var) + !$acc end parallel + + !$acc parallel pcopy(var%a) + do i = 1, 10 + end do + !$acc end parallel + + !$acc kernels pcopyin(var) + !$acc end kernels + + !$acc kernels pcopy(var%a) + do i = 1, 10 + end do + !$acc end kernels + + !$acc kernels loop pcopyin(var) + do i = 1, 10 + end do + !$acc end kernels loop + + !$acc kernels loop pcopy(var%a) + do i = 1, 10 + end do + !$acc end kernels loop +end program derived_acc diff --git a/gcc/testsuite/gfortran.dg/goacc/enter-exit-data.f95 b/gcc/testsuite/gfortran.dg/goacc/enter-exit-data.f95 index 805459c..b616b39 100644 --- a/gcc/testsuite/gfortran.dg/goacc/enter-exit-data.f95 +++ b/gcc/testsuite/gfortran.dg/goacc/enter-exit-data.f95 @@ -44,14 +44,14 @@ contains !$acc enter data wait (i, 1) !$acc enter data wait (a) ! { dg-error "INTEGER" } !$acc enter data wait (b(5:6)) ! { dg-error "INTEGER" } - !$acc enter data copyin (tip) ! { dg-error "POINTER" } - !$acc enter data copyin (tia) ! { dg-error "ALLOCATABLE" } - !$acc enter data create (tip) ! { dg-error "POINTER" } - !$acc enter data create (tia) ! { dg-error "ALLOCATABLE" } - !$acc enter data present_or_copyin (tip) ! { dg-error "POINTER" } - !$acc enter data present_or_copyin (tia) ! { dg-error "ALLOCATABLE" } - !$acc enter data present_or_create (tip) ! { dg-error "POINTER" } - !$acc enter data present_or_create (tia) ! { dg-error "ALLOCATABLE" } + !$acc enter data copyin (tip) + !$acc enter data copyin (tia) + !$acc enter data create (tip) + !$acc enter data create (tia) + !$acc enter data present_or_copyin (tip) + !$acc enter data present_or_copyin (tia) + !$acc enter data present_or_create (tip) + !$acc enter data present_or_create (tia) !$acc enter data copyin (i) create (i) ! { dg-error "multiple clauses" } !$acc enter data copyin (i) present_or_copyin (i) ! { dg-error "multiple clauses" } !$acc enter data create (i) present_or_copyin (i) ! { dg-error "multiple clauses" } @@ -79,10 +79,10 @@ contains !$acc exit data wait (i, 1) !$acc exit data wait (a) ! { dg-error "INTEGER" } !$acc exit data wait (b(5:6)) ! { dg-error "INTEGER" } - !$acc exit data copyout (tip) ! { dg-error "POINTER" } - !$acc exit data copyout (tia) ! { dg-error "ALLOCATABLE" } - !$acc exit data delete (tip) ! { dg-error "POINTER" } - !$acc exit data delete (tia) ! { dg-error "ALLOCATABLE" } + !$acc exit data copyout (tip) + !$acc exit data copyout (tia) + !$acc exit data delete (tip) + !$acc exit data delete (tia) !$acc exit data copyout (i) delete (i) ! { dg-error "multiple clauses" } !$acc exit data finalize !$acc exit data finalize copyout (i) diff --git a/gcc/tree-pretty-print.c b/gcc/tree-pretty-print.c index 99eca4a..5455da9 100644 --- a/gcc/tree-pretty-print.c +++ b/gcc/tree-pretty-print.c @@ -826,6 +826,15 @@ dump_omp_clause (pretty_printer *pp, tree clause, int spc, dump_flags_t flags) case GOMP_MAP_LINK: pp_string (pp, "link"); break; + case GOMP_MAP_ATTACH: + pp_string (pp, "attach"); + break; + case GOMP_MAP_DETACH: + pp_string (pp, "detach"); + break; + case GOMP_MAP_FORCE_DETACH: + pp_string (pp, "force_detach"); + break; default: gcc_unreachable (); } diff --git a/include/gomp-constants.h b/include/gomp-constants.h index acd2585..b7ce640 100644 --- a/include/gomp-constants.h +++ b/include/gomp-constants.h @@ -42,6 +42,7 @@ #define GOMP_MAP_FLAG_SPECIAL_2 (1 << 4) #define GOMP_MAP_FLAG_SPECIAL (GOMP_MAP_FLAG_SPECIAL_1 \ | GOMP_MAP_FLAG_SPECIAL_0) +#define GOMP_MAP_DEEP_COPY (1 << 5) /* Flag to force a specific behavior (or else, trigger a run-time error). */ #define GOMP_MAP_FLAG_FORCE (1 << 7) @@ -128,6 +129,13 @@ enum gomp_map_kind /* Decrement usage count and deallocate if zero. */ GOMP_MAP_RELEASE = (GOMP_MAP_FLAG_SPECIAL_2 | GOMP_MAP_DELETE), + /* In OpenACC, attach a pointer to a mapped struct field. */ + GOMP_MAP_ATTACH = (GOMP_MAP_DEEP_COPY | 0), + /* In OpenACC, detach a pointer to a mapped struct field. */ + GOMP_MAP_DETACH = (GOMP_MAP_DEEP_COPY | 1), + /* In OpenACC, detach a pointer to a mapped struct field. */ + GOMP_MAP_FORCE_DETACH = (GOMP_MAP_DEEP_COPY + | GOMP_MAP_FLAG_FORCE | 1), /* Internal to GCC, not used in libgomp. */ /* Do not map, but pointer assign a pointer instead. */ diff --git a/libgomp/libgomp.h b/libgomp/libgomp.h index cb25e86..df49c1b 100644 --- a/libgomp/libgomp.h +++ b/libgomp/libgomp.h @@ -866,6 +866,8 @@ struct target_var_desc { bool copy_from; /* True if data always should be copied from device to host at the end. */ bool always_copy_from; + /* True if variable should be detached at end of region. */ + bool do_detach; /* Relative offset against key host_start. */ uintptr_t offset; /* Actual length. */ @@ -918,8 +920,13 @@ struct splay_tree_key_s { uintptr_t tgt_offset; /* Reference count. */ uintptr_t refcount; - /* Dynamic reference count. */ - uintptr_t dynamic_refcount; + /* Reference counts beyond those that represent genuine references in the + linked splay tree key/target memory structures, e.g. for multiple OpenACC + "present increment" operations (via "acc enter data") refering to the same + host-memory block. */ + uintptr_t virtual_refcount; + /* For a block with attached pointers, the attachment counters for each. */ + unsigned short *attach_count; /* Pointer to the original mapping of "omp declare target link" object. */ splay_tree_key link_key; }; @@ -943,13 +950,6 @@ splay_compare (splay_tree_key x, splay_tree_key y) typedef struct acc_dispatch_t { - /* This is a linked list of data mapped using the - acc_map_data/acc_unmap_data or "acc enter data"/"acc exit data" pragmas. - Unlike mapped_data in the goacc_thread struct, unmapping can - happen out-of-order with respect to mapping. */ - /* This is guarded by the lock in the "outer" struct gomp_device_descr. */ - struct target_mem_desc *data_environ; - /* Execute. */ __typeof (GOMP_OFFLOAD_openacc_exec) *exec_func; @@ -1056,13 +1056,17 @@ struct gomp_device_descr enum gomp_map_vars_kind { GOMP_MAP_VARS_OPENACC, + GOMP_MAP_VARS_OPENACC_ENTER_DATA, GOMP_MAP_VARS_TARGET, GOMP_MAP_VARS_DATA, GOMP_MAP_VARS_ENTER_DATA }; -extern void gomp_acc_insert_pointer (size_t, void **, size_t *, void *, int); -extern void gomp_acc_remove_pointer (void *, size_t, bool, int, int, int); +struct gomp_coalesce_buf; + +extern void gomp_acc_remove_pointer (struct gomp_device_descr *, void **, + size_t *, unsigned short *, int, bool, + int); extern void gomp_acc_declare_allocate (bool, size_t, void **, size_t *, unsigned short *); struct gomp_coalesce_buf; @@ -1072,6 +1076,14 @@ extern void gomp_copy_host2dev (struct gomp_device_descr *, extern void gomp_copy_dev2host (struct gomp_device_descr *, struct goacc_asyncqueue *, void *, const void *, size_t); +extern uintptr_t gomp_map_val (struct target_mem_desc *, void **, size_t); +extern void gomp_attach_pointer (struct gomp_device_descr *, + struct goacc_asyncqueue *, splay_tree, + splay_tree_key, uintptr_t, size_t, + struct gomp_coalesce_buf *); +extern void gomp_detach_pointer (struct gomp_device_descr *, + struct goacc_asyncqueue *, splay_tree_key, + uintptr_t, bool, struct gomp_coalesce_buf *); extern struct target_mem_desc *gomp_map_vars (struct gomp_device_descr *, size_t, void **, void **, diff --git a/libgomp/libgomp.map b/libgomp/libgomp.map index ba9218b..a086dd2 100644 --- a/libgomp/libgomp.map +++ b/libgomp/libgomp.map @@ -480,6 +480,16 @@ OACC_2.5 { acc_update_self_async_array_h_; } OACC_2.0.1; +OACC_2.6 { + global: + acc_attach; + acc_attach_async; + acc_detach; + acc_detach_async; + acc_detach_finalize; + acc_detach_finalize_async; +} OACC_2.5; + GOACC_2.0 { global: GOACC_data_end; diff --git a/libgomp/oacc-async.c b/libgomp/oacc-async.c index 68aaf19..077e28f 100644 --- a/libgomp/oacc-async.c +++ b/libgomp/oacc-async.c @@ -263,6 +263,24 @@ goacc_async_copyout_unmap_vars (struct target_mem_desc *tgt, (void *) tgt); } +/* Remove a variable asynchronously. This actually removes the variable + mapping immediately, but retains the linked target_mem_desc until the + asynchronous operation has completed (as it may still refer to target + memory). The device lock must be held before entry, and remains locked on + exit. */ + +attribute_hidden void +goacc_remove_var_async (struct gomp_device_descr *devicep, splay_tree_key n, + struct goacc_asyncqueue *aq) +{ + struct target_mem_desc *tgt = n->tgt; + assert (tgt); + tgt->refcount++; + gomp_remove_var (devicep, n); + devicep->openacc.async.queue_callback_func (aq, goacc_async_unmap_tgt, + (void *) tgt); +} + attribute_hidden void goacc_async_free (struct gomp_device_descr *devicep, struct goacc_asyncqueue *aq, void *ptr) diff --git a/libgomp/oacc-host.c b/libgomp/oacc-host.c index 53658c8..6ab77a9 100644 --- a/libgomp/oacc-host.c +++ b/libgomp/oacc-host.c @@ -262,8 +262,6 @@ static struct gomp_device_descr host_dispatch = .state = GOMP_DEVICE_UNINITIALIZED, .openacc = { - .data_environ = NULL, - .exec_func = host_openacc_exec, .async = { diff --git a/libgomp/oacc-init.c b/libgomp/oacc-init.c index 823e20e..c3b8234 100644 --- a/libgomp/oacc-init.c +++ b/libgomp/oacc-init.c @@ -302,9 +302,12 @@ acc_shutdown_1 (acc_device_t d) if (walk->dev) { - gomp_mutex_lock (&walk->dev->lock); - gomp_free_memmap (&walk->dev->mem_map); - gomp_mutex_unlock (&walk->dev->lock); + while (walk->dev->mem_map.root) + { + splay_tree_key k = &walk->dev->mem_map.root->key; + k->link_key = NULL; + gomp_remove_var (walk->dev, k); + } walk->dev = NULL; walk->base_dev = NULL; diff --git a/libgomp/oacc-int.h b/libgomp/oacc-int.h index 3354eb6..2e4045e 100644 --- a/libgomp/oacc-int.h +++ b/libgomp/oacc-int.h @@ -102,10 +102,15 @@ void goacc_restore_bind (void); void goacc_lazy_initialize (void); void goacc_host_init (void); +struct splay_tree_key_s; + void goacc_init_asyncqueues (struct gomp_device_descr *); bool goacc_fini_asyncqueues (struct gomp_device_descr *); void goacc_async_copyout_unmap_vars (struct target_mem_desc *, struct goacc_asyncqueue *); +void goacc_remove_var_async (struct gomp_device_descr *devicep, + struct splay_tree_key_s *n, + struct goacc_asyncqueue *aq); void goacc_async_free (struct gomp_device_descr *, struct goacc_asyncqueue *, void *); struct goacc_asyncqueue *get_goacc_asyncqueue (int); diff --git a/libgomp/oacc-mem.c b/libgomp/oacc-mem.c index 050eb0d..466e75b 100644 --- a/libgomp/oacc-mem.c +++ b/libgomp/oacc-mem.c @@ -52,6 +52,25 @@ lookup_host (struct gomp_device_descr *dev, void *h, size_t s) return key; } +/* Helper for lookup_dev. Iterate over splay tree. */ + +static splay_tree_key +lookup_dev_1 (splay_tree_node node, uintptr_t d, size_t s) +{ + splay_tree_key k = &node->key; + struct target_mem_desc *t = k->tgt; + + if (d >= t->tgt_start && d + s <= t->tgt_end) + return k; + + if (node->left) + return lookup_dev_1 (node->left, d, s); + if (node->right) + return lookup_dev_1 (node->right, d, s); + + return NULL; +} + /* Return block containing [D->S), or NULL if not contained. The list isn't ordered by device address, so we have to iterate over the whole array. This is not expected to be a common @@ -59,35 +78,12 @@ lookup_host (struct gomp_device_descr *dev, void *h, size_t s) remains locked on exit. */ static splay_tree_key -lookup_dev (struct target_mem_desc *tgt, void *d, size_t s) +lookup_dev (splay_tree mem_map, void *d, size_t s) { - int i; - struct target_mem_desc *t; - - if (!tgt) - return NULL; - - for (t = tgt; t != NULL; t = t->prev) - { - if (t->tgt_start <= (uintptr_t) d && t->tgt_end >= (uintptr_t) d + s) - break; - } - - if (!t) + if (!mem_map || !mem_map->root) return NULL; - for (i = 0; i < t->list_count; i++) - { - void * offset; - - splay_tree_key k = &t->array[i].key; - offset = d - t->tgt_start + k->tgt_offset; - - if (k->host_start + offset <= (void *) k->host_end) - return k; - } - - return NULL; + return lookup_dev_1 (mem_map->root, (uintptr_t) d, s); } /* OpenACC is silent on how memory exhaustion is indicated. We return @@ -136,7 +132,7 @@ acc_free (void *d) /* We don't have to call lazy open here, as the ptr value must have been returned by acc_malloc. It's not permitted to pass NULL in (unless you got that null from acc_malloc). */ - if ((k = lookup_dev (acc_dev->openacc.data_environ, d, 1))) + if ((k = lookup_dev (&acc_dev->mem_map, d, 1))) { void *offset; @@ -260,7 +256,7 @@ acc_hostptr (void *d) gomp_mutex_lock (&acc_dev->lock); - n = lookup_dev (acc_dev->openacc.data_environ, d, 1); + n = lookup_dev (&acc_dev->mem_map, d, 1); if (!n) { @@ -348,7 +344,7 @@ acc_map_data (void *h, void *d, size_t s) (int)s); } - if (lookup_dev (thr->dev->openacc.data_environ, d, s)) + if (lookup_dev (&thr->dev->mem_map, d, s)) { gomp_mutex_unlock (&acc_dev->lock); gomp_fatal ("device address [%p, +%d] is already mapped", (void *)d, @@ -361,11 +357,6 @@ acc_map_data (void *h, void *d, size_t s) &kinds, true, GOMP_MAP_VARS_OPENACC); tgt->list[0].key->refcount = REFCOUNT_INFINITY; } - - gomp_mutex_lock (&acc_dev->lock); - tgt->prev = acc_dev->openacc.data_environ; - acc_dev->openacc.data_environ = tgt; - gomp_mutex_unlock (&acc_dev->lock); } void @@ -373,6 +364,7 @@ acc_unmap_data (void *h) { struct goacc_thread *thr = goacc_thread (); struct gomp_device_descr *acc_dev = thr->dev; + struct splay_tree_key_s cur_node; /* No need to call lazy open, as the address must have been mapped. */ @@ -380,12 +372,11 @@ acc_unmap_data (void *h) if (acc_dev->capabilities & GOMP_OFFLOAD_CAP_SHARED_MEM) return; - size_t host_size; - gomp_mutex_lock (&acc_dev->lock); - splay_tree_key n = lookup_host (acc_dev, h, 1); - struct target_mem_desc *t; + cur_node.host_start = (uintptr_t) h; + cur_node.host_end = cur_node.host_start + 1; + splay_tree_key n = splay_tree_lookup (&acc_dev->mem_map, &cur_node); if (!n) { @@ -393,46 +384,27 @@ acc_unmap_data (void *h) gomp_fatal ("%p is not a mapped block", (void *)h); } - host_size = n->host_end - n->host_start; - if (n->host_start != (uintptr_t) h) { + size_t host_size = n->host_end - n->host_start; gomp_mutex_unlock (&acc_dev->lock); gomp_fatal ("[%p,%d] surrounds %p", (void *) n->host_start, (int) host_size, (void *) h); } - /* Mark for removal. */ - n->refcount = 1; + splay_tree_remove (&acc_dev->mem_map, n); - t = n->tgt; + struct target_mem_desc *tgt = n->tgt; - if (t->refcount == 2) + if (tgt->refcount > 0) + tgt->refcount--; + else { - struct target_mem_desc *tp; - - /* This is the last reference, so pull the descriptor off the - chain. This avoids gomp_unmap_vars via gomp_unmap_tgt from - freeing the device memory. */ - t->tgt_end = 0; - t->to_free = 0; - - for (tp = NULL, t = acc_dev->openacc.data_environ; t != NULL; - tp = t, t = t->prev) - if (n->tgt == t) - { - if (tp) - tp->prev = t->prev; - else - acc_dev->openacc.data_environ = t->prev; - - break; - } + free (tgt->array); + free (tgt); } gomp_mutex_unlock (&acc_dev->lock); - - gomp_unmap_vars (t, true); } #define FLAG_PRESENT (1 << 0) @@ -479,8 +451,9 @@ present_create_copy (unsigned f, void *h, size_t s, int async) if (n->refcount != REFCOUNT_INFINITY) { n->refcount++; - n->dynamic_refcount++; + n->virtual_refcount++; } + gomp_mutex_unlock (&acc_dev->lock); } else if (!(f & FLAG_CREATE)) @@ -490,7 +463,6 @@ present_create_copy (unsigned f, void *h, size_t s, int async) } else { - struct target_mem_desc *tgt; size_t mapnum = 1; unsigned short kinds; void *hostaddrs = h; @@ -504,18 +476,14 @@ present_create_copy (unsigned f, void *h, size_t s, int async) goacc_aq aq = get_goacc_asyncqueue (async); - tgt = gomp_map_vars_async (acc_dev, aq, mapnum, &hostaddrs, NULL, &s, - &kinds, true, GOMP_MAP_VARS_OPENACC); - - /* Initialize dynamic refcount. */ - tgt->list[0].key->dynamic_refcount = 1; + gomp_map_vars_async (acc_dev, aq, mapnum, &hostaddrs, NULL, &s, &kinds, + true, GOMP_MAP_VARS_OPENACC_ENTER_DATA); gomp_mutex_lock (&acc_dev->lock); - - d = tgt->to_free; - tgt->prev = acc_dev->openacc.data_environ; - acc_dev->openacc.data_environ = tgt; - + n = lookup_host (acc_dev, h, s); + assert (n != NULL); + d = (void *) (n->tgt->tgt_start + n->tgt_offset + (uintptr_t) h + - n->host_start); gomp_mutex_unlock (&acc_dev->lock); } @@ -590,7 +558,6 @@ delete_copyout (unsigned f, void *h, size_t s, int async, const char *libfnname) { size_t host_size; splay_tree_key n; - void *d; struct goacc_thread *thr = goacc_thread (); struct gomp_device_descr *acc_dev = thr->dev; @@ -610,9 +577,6 @@ delete_copyout (unsigned f, void *h, size_t s, int async, const char *libfnname) gomp_fatal ("[%p,%d] is not mapped", (void *)h, (int)s); } - d = (void *) (n->tgt->tgt_start + n->tgt_offset - + (uintptr_t) h - n->host_start); - host_size = n->host_end - n->host_start; if (n->host_start != (uintptr_t) h || host_size != s) @@ -625,48 +589,37 @@ delete_copyout (unsigned f, void *h, size_t s, int async, const char *libfnname) if (n->refcount == REFCOUNT_INFINITY) { n->refcount = 0; - n->dynamic_refcount = 0; - } - if (n->refcount < n->dynamic_refcount) - { - gomp_mutex_unlock (&acc_dev->lock); - gomp_fatal ("Dynamic reference counting assert fail\n"); + n->virtual_refcount = 0; } if (f & FLAG_FINALIZE) { - n->refcount -= n->dynamic_refcount; - n->dynamic_refcount = 0; + n->refcount -= n->virtual_refcount; + n->virtual_refcount = 0; } - else if (n->dynamic_refcount) + + if (n->virtual_refcount > 0) { - n->dynamic_refcount--; n->refcount--; + n->virtual_refcount--; } + else if (n->refcount > 0) + n->refcount--; if (n->refcount == 0) { - if (n->tgt->refcount == 2) - { - struct target_mem_desc *tp, *t; - for (tp = NULL, t = acc_dev->openacc.data_environ; t != NULL; - tp = t, t = t->prev) - if (n->tgt == t) - { - if (tp) - tp->prev = t->prev; - else - acc_dev->openacc.data_environ = t->prev; - break; - } - } + goacc_aq aq = get_goacc_asyncqueue (async); if (f & FLAG_COPYOUT) - { - goacc_aq aq = get_goacc_asyncqueue (async); + { + void *d = (void *) (n->tgt->tgt_start + n->tgt_offset + + (uintptr_t) h - n->host_start); gomp_copy_dev2host (acc_dev, aq, h, d, s); } - gomp_remove_var (acc_dev, n); + if (aq) + goacc_remove_var_async (acc_dev, n, aq); + else + gomp_remove_var (acc_dev, n); } gomp_mutex_unlock (&acc_dev->lock); @@ -783,140 +736,159 @@ acc_update_self_async (void *h, size_t s, int async) } void -gomp_acc_insert_pointer (size_t mapnum, void **hostaddrs, size_t *sizes, - void *kinds, int async) +gomp_acc_remove_pointer (struct gomp_device_descr *acc_dev, void **hostaddrs, + size_t *sizes, unsigned short *kinds, int async, + bool finalize, int mapnum) { - struct target_mem_desc *tgt; - struct goacc_thread *thr = goacc_thread (); - struct gomp_device_descr *acc_dev = thr->dev; + struct splay_tree_key_s cur_node; + splay_tree_key n; + + gomp_mutex_lock (&acc_dev->lock); - if (acc_is_present (*hostaddrs, *sizes)) + for (int i = 0; i < mapnum; i++) { - splay_tree_key n; - gomp_mutex_lock (&acc_dev->lock); - n = lookup_host (acc_dev, *hostaddrs, *sizes); - gomp_mutex_unlock (&acc_dev->lock); + int kind = kinds[i] & 0xff; + bool copyfrom = false; - tgt = n->tgt; - for (size_t i = 0; i < tgt->list_count; i++) - if (tgt->list[i].key == n) - { - for (size_t j = 0; j < mapnum; j++) - if (i + j < tgt->list_count && tgt->list[i + j].key) - { - tgt->list[i + j].key->refcount++; - tgt->list[i + j].key->dynamic_refcount++; - } - return; - } - /* Should not reach here. */ - gomp_fatal ("Dynamic refcount incrementing failed for pointer/pset"); - } + switch (kind) + { + case GOMP_MAP_FROM: + case GOMP_MAP_FORCE_FROM: + case GOMP_MAP_ALWAYS_FROM: + copyfrom = true; + /* Fallthrough. */ + + case GOMP_MAP_TO_PSET: + case GOMP_MAP_POINTER: + case GOMP_MAP_DELETE: + case GOMP_MAP_RELEASE: + case GOMP_MAP_DETACH: + case GOMP_MAP_FORCE_DETACH: + cur_node.host_start = (uintptr_t) hostaddrs[i]; + cur_node.host_end = cur_node.host_start + + ((kind == GOMP_MAP_DETACH + || kind == GOMP_MAP_FORCE_DETACH + || kind == GOMP_MAP_POINTER) + ? sizeof (void *) : sizes[i]); + n = splay_tree_lookup (&acc_dev->mem_map, &cur_node); + + if (n == NULL) + continue; + + if (n->refcount == REFCOUNT_INFINITY) + { + n->refcount = 1; + n->virtual_refcount = 0; + } - gomp_debug (0, " %s: prepare mappings\n", __FUNCTION__); - goacc_aq aq = get_goacc_asyncqueue (async); - tgt = gomp_map_vars_async (acc_dev, aq, mapnum, hostaddrs, - NULL, sizes, kinds, true, GOMP_MAP_VARS_OPENACC); - gomp_debug (0, " %s: mappings prepared\n", __FUNCTION__); + if (finalize) + { + n->refcount -= n->virtual_refcount; + n->virtual_refcount = 0; + } - /* Initialize dynamic refcount. */ - tgt->list[0].key->dynamic_refcount = 1; + if (n->virtual_refcount > 0) + { + n->refcount--; + n->virtual_refcount--; + } + else if (n->refcount > 0) + n->refcount--; + + if (copyfrom) + gomp_copy_dev2host (acc_dev, NULL, (void *) cur_node.host_start, + (void *) (n->tgt->tgt_start + n->tgt_offset + + cur_node.host_start + - n->host_start), + cur_node.host_end - cur_node.host_start); + + if (n->refcount == 0) + gomp_remove_var (acc_dev, n); + break; + + default: + gomp_mutex_unlock (&acc_dev->lock); + gomp_fatal ("gomp_acc_remove_pointer unhandled kind 0x%.2x", + kind); + } + } - gomp_mutex_lock (&acc_dev->lock); - tgt->prev = acc_dev->openacc.data_environ; - acc_dev->openacc.data_environ = tgt; gomp_mutex_unlock (&acc_dev->lock); } + void -gomp_acc_remove_pointer (void *h, size_t s, bool force_copyfrom, int async, - int finalize, int mapnum) +acc_attach_async (void **hostaddr, int async) { struct goacc_thread *thr = goacc_thread (); struct gomp_device_descr *acc_dev = thr->dev; + goacc_aq aq = get_goacc_asyncqueue (async); + + struct splay_tree_key_s cur_node; splay_tree_key n; - struct target_mem_desc *t; - int minrefs = (mapnum == 1) ? 2 : 3; - if (!acc_is_present (h, s)) + if (thr->dev->capabilities & GOMP_OFFLOAD_CAP_SHARED_MEM) return; - gomp_mutex_lock (&acc_dev->lock); + cur_node.host_start = (uintptr_t) hostaddr; + cur_node.host_end = cur_node.host_start + sizeof (void *); + n = splay_tree_lookup (&acc_dev->mem_map, &cur_node); - n = lookup_host (acc_dev, h, 1); + if (n == NULL) + gomp_fatal ("struct not mapped for acc_attach"); - if (!n) - { - gomp_mutex_unlock (&acc_dev->lock); - gomp_fatal ("%p is not a mapped block", (void *)h); - } + gomp_attach_pointer (acc_dev, aq, &acc_dev->mem_map, n, (uintptr_t) hostaddr, + 0, NULL); +} - gomp_debug (0, " %s: restore mappings\n", __FUNCTION__); +void +acc_attach (void **hostaddr) +{ + acc_attach_async (hostaddr, acc_async_sync); +} - t = n->tgt; +static void +goacc_detach_internal (void **hostaddr, int async, bool finalize) +{ + struct goacc_thread *thr = goacc_thread (); + struct gomp_device_descr *acc_dev = thr->dev; + struct splay_tree_key_s cur_node; + splay_tree_key n; + struct goacc_asyncqueue *aq = get_goacc_asyncqueue (async); - if (n->refcount < n->dynamic_refcount) - { - gomp_mutex_unlock (&acc_dev->lock); - gomp_fatal ("Dynamic reference counting assert fail\n"); - } + if (thr->dev->capabilities & GOMP_OFFLOAD_CAP_SHARED_MEM) + return; - if (finalize) - { - n->refcount -= n->dynamic_refcount; - n->dynamic_refcount = 0; - } - else if (n->dynamic_refcount) - { - n->dynamic_refcount--; - n->refcount--; - } + cur_node.host_start = (uintptr_t) hostaddr; + cur_node.host_end = cur_node.host_start + sizeof (void *); + n = splay_tree_lookup (&acc_dev->mem_map, &cur_node); - gomp_mutex_unlock (&acc_dev->lock); + if (n == NULL) + gomp_fatal ("struct not mapped for acc_detach"); - if (n->refcount == 0) - { - if (t->refcount == minrefs) - { - /* This is the last reference, so pull the descriptor off the - chain. This prevents gomp_unmap_vars via gomp_unmap_tgt from - freeing the device memory. */ - struct target_mem_desc *tp; - for (tp = NULL, t = acc_dev->openacc.data_environ; t != NULL; - tp = t, t = t->prev) - { - if (n->tgt == t) - { - if (tp) - tp->prev = t->prev; - else - acc_dev->openacc.data_environ = t->prev; - break; - } - } - } + gomp_detach_pointer (acc_dev, aq, n, (uintptr_t) hostaddr, finalize, NULL); +} - /* Set refcount to 1 to allow gomp_unmap_vars to unmap it. */ - n->refcount = 1; - t->refcount = minrefs; - for (size_t i = 0; i < t->list_count; i++) - if (t->list[i].key == n) - { - t->list[i].copy_from = force_copyfrom ? 1 : 0; - break; - } - - /* If running synchronously, unmap immediately. */ - if (async < acc_async_noval) - gomp_unmap_vars (t, true); - else - { - goacc_aq aq = get_goacc_asyncqueue (async); - goacc_async_copyout_unmap_vars (t, aq); - } - } +void +acc_detach (void **hostaddr) +{ + goacc_detach_internal (hostaddr, acc_async_sync, false); +} - gomp_mutex_unlock (&acc_dev->lock); +void +acc_detach_async (void **hostaddr, int async) +{ + goacc_detach_internal (hostaddr, async, false); +} - gomp_debug (0, " %s: mappings restored\n", __FUNCTION__); +void +acc_detach_finalize (void **hostaddr) +{ + goacc_detach_internal (hostaddr, acc_async_sync, true); +} + +void +acc_detach_finalize_async (void **hostaddr, int async) +{ + goacc_detach_internal (hostaddr, async, true); } diff --git a/libgomp/oacc-parallel.c b/libgomp/oacc-parallel.c index 07d0338..0e9a3e8 100644 --- a/libgomp/oacc-parallel.c +++ b/libgomp/oacc-parallel.c @@ -47,12 +47,32 @@ find_pointer (int pos, size_t mapnum, unsigned short *kinds) if (pos + 1 >= mapnum) return 0; - unsigned char kind = kinds[pos+1] & 0xff; + unsigned char kind0 = kinds[pos] & 0xff; - if (kind == GOMP_MAP_TO_PSET) - return 3; - else if (kind == GOMP_MAP_POINTER) - return 2; + switch (kind0) + { + case GOMP_MAP_TO: + case GOMP_MAP_FORCE_TO: + case GOMP_MAP_FROM: + case GOMP_MAP_FORCE_FROM: + case GOMP_MAP_TOFROM: + case GOMP_MAP_FORCE_TOFROM: + case GOMP_MAP_ALLOC: + case GOMP_MAP_RELEASE: + { + unsigned char kind1 = kinds[pos + 1] & 0xff; + if (kind1 == GOMP_MAP_POINTER + || kind1 == GOMP_MAP_ALWAYS_POINTER + || kind1 == GOMP_MAP_ATTACH + || kind1 == GOMP_MAP_DETACH + || kind1 == GOMP_MAP_FORCE_DETACH) + return 2; + else if (kind1 == GOMP_MAP_TO_PSET) + return 3; + } + default: + /* empty. */; + } return 0; } @@ -235,8 +255,8 @@ GOACC_parallel_keyed (int device, void (*fn) (void *), devaddrs = gomp_alloca (sizeof (void *) * mapnum); for (i = 0; i < mapnum; i++) - devaddrs[i] = (void *) (tgt->list[i].key->tgt->tgt_start - + tgt->list[i].key->tgt_offset); + devaddrs[i] = (void *) gomp_map_val (tgt, hostaddrs, i); + if (aq == NULL) { acc_dev->openacc.exec_func (tgt_fn, mapnum, hostaddrs, devaddrs, @@ -355,6 +375,10 @@ GOACC_enter_exit_data (int device, size_t mapnum, if (mapnum > 0) { unsigned char kind = kinds[0] & 0xff; + + if (kind == GOMP_MAP_STRUCT || kind == GOMP_MAP_FORCE_PRESENT) + kind = kinds[1] & 0xff; + if (kind == GOMP_MAP_DELETE || kind == GOMP_MAP_FORCE_FROM) finalize = true; @@ -365,11 +389,14 @@ GOACC_enter_exit_data (int device, size_t mapnum, { unsigned char kind = kinds[i] & 0xff; - if (kind == GOMP_MAP_POINTER || kind == GOMP_MAP_TO_PSET) + if (kind == GOMP_MAP_POINTER + || kind == GOMP_MAP_TO_PSET + || kind == GOMP_MAP_STRUCT + || kind == GOMP_MAP_FORCE_PRESENT) continue; if (kind == GOMP_MAP_FORCE_ALLOC - || kind == GOMP_MAP_FORCE_PRESENT + || kind == GOMP_MAP_ATTACH || kind == GOMP_MAP_FORCE_TO || kind == GOMP_MAP_TO || kind == GOMP_MAP_ALLOC) @@ -380,6 +407,8 @@ GOACC_enter_exit_data (int device, size_t mapnum, if (kind == GOMP_MAP_RELEASE || kind == GOMP_MAP_DELETE + || kind == GOMP_MAP_DETACH + || kind == GOMP_MAP_FORCE_DETACH || kind == GOMP_MAP_FROM || kind == GOMP_MAP_FORCE_FROM) break; @@ -413,6 +442,9 @@ GOACC_enter_exit_data (int device, size_t mapnum, case GOMP_MAP_ALLOC: acc_present_or_create (hostaddrs[i], sizes[i]); break; + case GOMP_MAP_ATTACH: + case GOMP_MAP_FORCE_PRESENT: + break; case GOMP_MAP_FORCE_ALLOC: acc_create (hostaddrs[i], sizes[i]); break; @@ -422,6 +454,16 @@ GOACC_enter_exit_data (int device, size_t mapnum, case GOMP_MAP_FORCE_TO: acc_copyin (hostaddrs[i], sizes[i]); break; + case GOMP_MAP_STRUCT: + { + int elems = sizes[i]; + goacc_aq aq = get_goacc_asyncqueue (async); + gomp_map_vars_async (acc_dev, aq, elems + 1, &hostaddrs[i], + NULL, &sizes[i], &kinds[i], true, + GOMP_MAP_VARS_OPENACC_ENTER_DATA); + i += elems; + } + break; default: gomp_fatal (">>>> GOACC_enter_exit_data UNHANDLED kind 0x%.2x", kind); @@ -430,8 +472,14 @@ GOACC_enter_exit_data (int device, size_t mapnum, } else { - gomp_acc_insert_pointer (pointer, &hostaddrs[i], - &sizes[i], &kinds[i], async); + goacc_aq aq = get_goacc_asyncqueue (async); + for (int j = 0; j < 2; j++) + gomp_map_vars_async (acc_dev, aq, + (j == 0 || pointer == 2) ? 1 : 2, + &hostaddrs[i + j], NULL, + &sizes[i + j], &kinds[i + j], true, + GOMP_MAP_VARS_OPENACC_ENTER_DATA); + /* Increment 'i' by two because OpenACC requires fortran arrays to be contiguous, so each PSET is associated with one of MAP_FORCE_ALLOC/MAP_FORCE_PRESET/MAP_FORCE_TO, and @@ -439,51 +487,143 @@ GOACC_enter_exit_data (int device, size_t mapnum, i += pointer - 1; } } + + /* This loop only handles explicit "attach" clauses that are not an + implicit part of a copy{,in,out}, etc. mapping. */ + for (i = 0; i < mapnum; i++) + { + unsigned char kind = kinds[i] & 0xff; + + /* Scan for pointers and PSETs. */ + int pointer = find_pointer (i, mapnum, kinds); + + if (!pointer) + { + if (kind == GOMP_MAP_ATTACH) + acc_attach (hostaddrs[i]); + else if (kind == GOMP_MAP_STRUCT) + i += sizes[i]; + } + else + i += pointer - 1; + } } else - for (i = 0; i < mapnum; ++i) - { - unsigned char kind = kinds[i] & 0xff; + { + /* Handle "detach" before copyback/deletion of mapped data. */ + for (i = 0; i < mapnum; i++) + { + unsigned char kind = kinds[i] & 0xff; - int pointer = find_pointer (i, mapnum, kinds); + int pointer = find_pointer (i, mapnum, kinds); - if (!pointer) - { - switch (kind) - { - case GOMP_MAP_RELEASE: - case GOMP_MAP_DELETE: - if (acc_is_present (hostaddrs[i], sizes[i])) + if (!pointer) + { + if (kind == GOMP_MAP_DETACH) + acc_detach (hostaddrs[i]); + else if (kind == GOMP_MAP_FORCE_DETACH) + acc_detach_finalize (hostaddrs[i]); + else if (kind == GOMP_MAP_STRUCT) + i += sizes[i]; + } + else + { + unsigned char kind2 = kinds[i + pointer - 1] & 0xff; + + if (kind2 == GOMP_MAP_DETACH) + acc_detach (hostaddrs[i + pointer - 1]); + else if (kind2 == GOMP_MAP_FORCE_DETACH) + acc_detach_finalize (hostaddrs[i + pointer - 1]); + + i += pointer - 1; + } + } + + for (i = 0; i < mapnum; ++i) + { + unsigned char kind = kinds[i] & 0xff; + + int pointer = find_pointer (i, mapnum, kinds); + + if (!pointer) + { + switch (kind) + { + case GOMP_MAP_RELEASE: + case GOMP_MAP_DELETE: + if (acc_is_present (hostaddrs[i], sizes[i])) + { + if (finalize) + acc_delete_finalize_async (hostaddrs[i], sizes[i], + async); + else + acc_delete_async (hostaddrs[i], sizes[i], async); + } + break; + case GOMP_MAP_DETACH: + case GOMP_MAP_FORCE_DETACH: + case GOMP_MAP_FORCE_PRESENT: + break; + case GOMP_MAP_FROM: + case GOMP_MAP_FORCE_FROM: + if (finalize) + acc_copyout_finalize_async (hostaddrs[i], sizes[i], async); + else + acc_copyout_async (hostaddrs[i], sizes[i], async); + break; + case GOMP_MAP_STRUCT: { - if (finalize) - acc_delete_finalize_async (hostaddrs[i], sizes[i], async); - else - acc_delete_async (hostaddrs[i], sizes[i], async); + int elems = sizes[i]; + goacc_aq aq = get_goacc_asyncqueue (async); + for (int j = 1; j <= elems; j++) + { + struct splay_tree_key_s k; + k.host_start = (uintptr_t) hostaddrs[i + j]; + k.host_end = k.host_start + sizes[i + j]; + splay_tree_key str; + gomp_mutex_lock (&acc_dev->lock); + str = splay_tree_lookup (&acc_dev->mem_map, &k); + gomp_mutex_unlock (&acc_dev->lock); + if (str) + { + if (finalize) + { + str->refcount -= str->virtual_refcount; + str->virtual_refcount = 0; + } + if (str->virtual_refcount > 0) + { + str->refcount--; + str->virtual_refcount--; + } + else if (str->refcount > 0) + str->refcount--; + if (str->refcount == 0) + { + if (aq) + goacc_remove_var_async (acc_dev, str, aq); + else + gomp_remove_var (acc_dev, str); + } + } + } + i += elems; } - break; - case GOMP_MAP_FROM: - case GOMP_MAP_FORCE_FROM: - if (finalize) - acc_copyout_finalize_async (hostaddrs[i], sizes[i], async); - else - acc_copyout_async (hostaddrs[i], sizes[i], async); - break; - default: - gomp_fatal (">>>> GOACC_enter_exit_data UNHANDLED kind 0x%.2x", - kind); - break; - } - } - else - { - bool copyfrom = (kind == GOMP_MAP_FORCE_FROM - || kind == GOMP_MAP_FROM); - gomp_acc_remove_pointer (hostaddrs[i], sizes[i], copyfrom, async, - finalize, pointer); - /* See the above comment. */ - i += pointer - 1; - } - } + break; + default: + gomp_fatal (">>>> GOACC_enter_exit_data UNHANDLED kind 0x%.2x", + kind); + break; + } + } + else + { + gomp_acc_remove_pointer (acc_dev, &hostaddrs[i], &sizes[i], + &kinds[i], async, finalize, pointer); + i += pointer - 1; + } + } + } } static void diff --git a/libgomp/openacc.h b/libgomp/openacc.h index 2505ac0..1bf2d65 100644 --- a/libgomp/openacc.h +++ b/libgomp/openacc.h @@ -113,12 +113,18 @@ void *acc_hostptr (void *) __GOACC_NOTHROW; int acc_is_present (void *, size_t) __GOACC_NOTHROW; void acc_memcpy_to_device (void *, void *, size_t) __GOACC_NOTHROW; void acc_memcpy_from_device (void *, void *, size_t) __GOACC_NOTHROW; +void acc_attach (void **) __GOACC_NOTHROW; +void acc_attach_async (void **, int) __GOACC_NOTHROW; +void acc_detach (void **) __GOACC_NOTHROW; +void acc_detach_async (void **, int) __GOACC_NOTHROW; /* Finalize versions of copyout/delete functions, specified in OpenACC 2.5. */ void acc_copyout_finalize (void *, size_t) __GOACC_NOTHROW; void acc_copyout_finalize_async (void *, size_t, int) __GOACC_NOTHROW; void acc_delete_finalize (void *, size_t) __GOACC_NOTHROW; void acc_delete_finalize_async (void *, size_t, int) __GOACC_NOTHROW; +void acc_detach_finalize (void **) __GOACC_NOTHROW; +void acc_detach_finalize_async (void **, int) __GOACC_NOTHROW; /* Async functions, specified in OpenACC 2.5. */ void acc_copyin_async (void *, size_t, int) __GOACC_NOTHROW; diff --git a/libgomp/target.c b/libgomp/target.c index 2bfc7e2..6e115d1 100644 --- a/libgomp/target.c +++ b/libgomp/target.c @@ -39,6 +39,7 @@ #include #include #include +#include #ifdef PLUGIN_SUPPORT #include @@ -372,6 +373,7 @@ gomp_map_vars_existing (struct gomp_device_descr *devicep, tgt_var->key = oldn; tgt_var->copy_from = GOMP_MAP_COPY_FROM_P (kind); tgt_var->always_copy_from = GOMP_MAP_ALWAYS_FROM_P (kind); + tgt_var->do_detach = kind == GOMP_MAP_ATTACH; tgt_var->offset = newn->host_start - oldn->host_start; tgt_var->length = newn->host_end - newn->host_start; @@ -505,7 +507,131 @@ gomp_map_fields_existing (struct target_mem_desc *tgt, (void *) cur_node.host_end); } -static inline uintptr_t +void +gomp_attach_pointer (struct gomp_device_descr *devicep, + struct goacc_asyncqueue *aq, splay_tree mem_map, + splay_tree_key n, uintptr_t attach_to, size_t bias, + struct gomp_coalesce_buf *cbufp) +{ + struct splay_tree_key_s s; + size_t size, idx; + + if (n == NULL) + { + gomp_mutex_unlock (&devicep->lock); + gomp_fatal ("enclosing struct not mapped for attach"); + } + + size = (n->host_end - n->host_start + sizeof (void *) - 1) / sizeof (void *); + /* We might have a pointer in a packed struct: however we cannot have more + than one such pointer in each pointer-sized portion of the struct, so + this is safe. */ + idx = (attach_to - n->host_start) / sizeof (void *); + + if (!n->attach_count) + n->attach_count = gomp_malloc_cleared (sizeof (*n->attach_count) * size); + + if (n->attach_count[idx] < USHRT_MAX) + n->attach_count[idx]++; + else + { + gomp_mutex_unlock (&devicep->lock); + gomp_fatal ("attach count overflow"); + } + + if (n->attach_count[idx] == 1) + { + uintptr_t devptr = n->tgt->tgt_start + n->tgt_offset + attach_to + - n->host_start; + uintptr_t target = (uintptr_t) *(void **) attach_to; + splay_tree_key tn; + uintptr_t data; + + if ((void *) target == NULL) + { + gomp_mutex_unlock (&devicep->lock); + gomp_fatal ("attempt to attach null pointer"); + } + + s.host_start = target + bias; + s.host_end = s.host_start + 1; + tn = splay_tree_lookup (mem_map, &s); + + if (!tn) + { + gomp_mutex_unlock (&devicep->lock); + gomp_fatal ("pointer target not mapped for attach"); + } + + data = tn->tgt->tgt_start + tn->tgt_offset + target - tn->host_start; + + gomp_debug (1, + "%s: attaching host %p, target %p (struct base %p) to %p\n", + __FUNCTION__, (void *) attach_to, (void *) devptr, + (void *) (n->tgt->tgt_start + n->tgt_offset), (void *) data); + + gomp_copy_host2dev (devicep, aq, (void *) devptr, (void *) &data, + sizeof (void *), cbufp); + } + else + gomp_debug (1, "%s: attach count for %p -> %u\n", __FUNCTION__, + (void *) attach_to, n->attach_count[idx]); +} + +void +gomp_detach_pointer (struct gomp_device_descr *devicep, + struct goacc_asyncqueue *aq, splay_tree_key n, + uintptr_t detach_from, bool finalize, + struct gomp_coalesce_buf *cbufp) +{ + size_t idx; + + if (n == NULL) + { + gomp_mutex_unlock (&devicep->lock); + gomp_fatal ("enclosing struct not mapped for detach"); + } + + idx = (detach_from - n->host_start) / sizeof (void *); + + if (!n->attach_count) + { + gomp_mutex_unlock (&devicep->lock); + gomp_fatal ("no attachment counters for struct"); + } + + if (finalize) + n->attach_count[idx] = 1; + + if (n->attach_count[idx] == 0) + { + gomp_mutex_unlock (&devicep->lock); + gomp_fatal ("attach count underflow"); + } + else + n->attach_count[idx]--; + + if (n->attach_count[idx] == 0) + { + uintptr_t devptr = n->tgt->tgt_start + n->tgt_offset + detach_from + - n->host_start; + uintptr_t target = (uintptr_t) *(void **) detach_from; + + gomp_debug (1, + "%s: detaching host %p, target %p (struct base %p) to %p\n", + __FUNCTION__, (void *) detach_from, (void *) devptr, + (void *) (n->tgt->tgt_start + n->tgt_offset), + (void *) target); + + gomp_copy_host2dev (devicep, aq, (void *) devptr, (void *) &target, + sizeof (void *), cbufp); + } + else + gomp_debug (1, "%s: attach count for %p -> %u\n", __FUNCTION__, + (void *) detach_from, n->attach_count[idx]); +} + +uintptr_t gomp_map_val (struct target_mem_desc *tgt, void **hostaddrs, size_t i) { if (tgt->list[i].key != NULL) @@ -547,8 +673,9 @@ gomp_map_vars (struct gomp_device_descr *devicep, size_t mapnum, attribute_hidden struct target_mem_desc * gomp_map_vars_async (struct gomp_device_descr *devicep, struct goacc_asyncqueue *aq, size_t mapnum, - void **hostaddrs, void **devaddrs, size_t *sizes, void *kinds, - bool short_mapkind, enum gomp_map_vars_kind pragma_kind) + void **hostaddrs, void **devaddrs, size_t *sizes, + void *kinds, bool short_mapkind, + enum gomp_map_vars_kind pragma_kind) { size_t i, tgt_align, tgt_size, not_found_cnt = 0; bool has_firstprivate = false; @@ -559,7 +686,8 @@ gomp_map_vars_async (struct gomp_device_descr *devicep, struct target_mem_desc *tgt = gomp_malloc (sizeof (*tgt) + sizeof (tgt->list[0]) * mapnum); tgt->list_count = mapnum; - tgt->refcount = pragma_kind == GOMP_MAP_VARS_ENTER_DATA ? 0 : 1; + tgt->refcount = (pragma_kind == GOMP_MAP_VARS_ENTER_DATA + || pragma_kind == GOMP_MAP_VARS_OPENACC_ENTER_DATA) ? 0 : 1; tgt->device_descr = devicep; struct gomp_coalesce_buf cbuf, *cbufp = NULL; @@ -674,8 +802,15 @@ gomp_map_vars_async (struct gomp_device_descr *devicep, has_firstprivate = true; continue; } + else if ((kind & typemask) == GOMP_MAP_ATTACH) + { + tgt->list[i].key = NULL; + has_firstprivate = true; + continue; + } cur_node.host_start = (uintptr_t) hostaddrs[i]; - if (!GOMP_MAP_POINTER_P (kind & typemask)) + if (!GOMP_MAP_POINTER_P (kind & typemask) + && (kind & typemask) != GOMP_MAP_ATTACH) cur_node.host_end = cur_node.host_start + sizes[i]; else cur_node.host_end = cur_node.host_start + sizeof (void *); @@ -882,6 +1017,32 @@ gomp_map_vars_async (struct gomp_device_descr *devicep, cur_node.tgt_offset = n->tgt->tgt_start + n->tgt_offset + cur_node.host_start - n->host_start; continue; + case GOMP_MAP_ATTACH: + { + cur_node.host_start = (uintptr_t) hostaddrs[i]; + cur_node.host_end = cur_node.host_start + sizeof (void *); + splay_tree_key n = splay_tree_lookup (mem_map, &cur_node); + if (n != NULL) + { + tgt->list[i].key = n; + tgt->list[i].offset = cur_node.host_start - n->host_start; + tgt->list[i].length = n->host_end - n->host_start; + tgt->list[i].copy_from = false; + tgt->list[i].always_copy_from = false; + tgt->list[i].do_detach + = (pragma_kind != GOMP_MAP_VARS_OPENACC_ENTER_DATA); + n->refcount++; + } + else + { + gomp_mutex_unlock (&devicep->lock); + gomp_fatal ("outer struct not mapped for attach"); + } + gomp_attach_pointer (devicep, aq, mem_map, n, + (uintptr_t) hostaddrs[i], sizes[i], + cbufp); + continue; + } default: break; } @@ -926,10 +1087,12 @@ gomp_map_vars_async (struct gomp_device_descr *devicep, tgt->list[i].copy_from = GOMP_MAP_COPY_FROM_P (kind & typemask); tgt->list[i].always_copy_from = GOMP_MAP_ALWAYS_FROM_P (kind & typemask); + tgt->list[i].do_detach = false; tgt->list[i].offset = 0; tgt->list[i].length = k->host_end - k->host_start; k->refcount = 1; - k->dynamic_refcount = 0; + k->virtual_refcount = 0; + k->attach_count = NULL; tgt->refcount++; array->left = NULL; array->right = NULL; @@ -980,6 +1143,7 @@ gomp_map_vars_async (struct gomp_device_descr *devicep, tgt->list[j].key = k; tgt->list[j].copy_from = false; tgt->list[j].always_copy_from = false; + tgt->list[j].do_detach = false; if (k->refcount != REFCOUNT_INFINITY) k->refcount++; gomp_map_pointer (tgt, aq, @@ -1063,8 +1227,20 @@ gomp_map_vars_async (struct gomp_device_descr *devicep, /* If the variable from "omp target enter data" map-list was already mapped, tgt is not needed. Otherwise tgt will be freed by gomp_unmap_vars or gomp_exit_data. */ - if (pragma_kind == GOMP_MAP_VARS_ENTER_DATA && tgt->refcount == 0) - { + if ((pragma_kind == GOMP_MAP_VARS_ENTER_DATA + || pragma_kind == GOMP_MAP_VARS_OPENACC_ENTER_DATA) + && tgt->refcount == 0) + { + /* If we're about to discard a target_mem_desc with no "structural" + references (tgt->refcount == 0), any splay keys linked in the tgt's + list must have their virtual refcount incremented to represent that + "lost" reference in order to implement the semantics of the OpenACC + "present increment" operation properly. */ + if (pragma_kind == GOMP_MAP_VARS_OPENACC_ENTER_DATA) + for (i = 0; i < tgt->list_count; i++) + if (tgt->list[i].key) + tgt->list[i].key->virtual_refcount++; + free (tgt); tgt = NULL; } @@ -1091,6 +1267,8 @@ gomp_remove_var (struct gomp_device_descr *devicep, splay_tree_key k) splay_tree_remove (&devicep->mem_map, k); if (k->link_key) splay_tree_insert (&devicep->mem_map, (splay_tree_node) k->link_key); + if (k->attach_count) + free (k->attach_count); if (k->tgt->refcount > 1) k->tgt->refcount--; else @@ -1133,14 +1311,34 @@ gomp_unmap_vars_async (struct target_mem_desc *tgt, bool do_copyfrom, } size_t i; + + /* We must perform detachments before any copies back to the host. */ + for (i = 0; i < tgt->list_count; i++) + { + splay_tree_key k = tgt->list[i].key; + + if (k != NULL && tgt->list[i].do_detach) + gomp_detach_pointer (devicep, aq, k, tgt->list[i].key->host_start + + tgt->list[i].offset, + k->refcount == 1, NULL); + } + for (i = 0; i < tgt->list_count; i++) { splay_tree_key k = tgt->list[i].key; + if (k == NULL) continue; bool do_unmap = false; - if (k->refcount > 1 && k->refcount != REFCOUNT_INFINITY) + if (k->tgt == tgt + && k->virtual_refcount > 0 + && k->refcount != REFCOUNT_INFINITY) + { + k->virtual_refcount--; + k->refcount--; + } + else if (k->refcount > 1 && k->refcount != REFCOUNT_INFINITY) k->refcount--; else if (k->refcount == 1) { @@ -1283,6 +1481,7 @@ gomp_load_image_to_device (struct gomp_device_descr *devicep, unsigned version, k->tgt = tgt; k->tgt_offset = target_table[i].start; k->refcount = REFCOUNT_INFINITY; + k->virtual_refcount = 0; k->link_key = NULL; array->left = NULL; array->right = NULL; @@ -1315,6 +1514,7 @@ gomp_load_image_to_device (struct gomp_device_descr *devicep, unsigned version, k->tgt = tgt; k->tgt_offset = target_var->start; k->refcount = target_size & link_bit ? REFCOUNT_LINK : REFCOUNT_INFINITY; + k->virtual_refcount = 0; k->link_key = NULL; array->left = NULL; array->right = NULL; @@ -1549,22 +1749,6 @@ gomp_unload_device (struct gomp_device_descr *devicep) } } -/* Free address mapping tables. MM must be locked on entry, and remains locked - on return. */ - -attribute_hidden void -gomp_free_memmap (struct splay_tree_s *mem_map) -{ - while (mem_map->root) - { - struct target_mem_desc *tgt = mem_map->root->key.tgt; - - splay_tree_remove (mem_map, &mem_map->root->key); - free (tgt->array); - free (tgt); - } -} - /* Do we have offload data available for the given offload target type? Instead of verifying that *all* offload data is available that could possibly be required, we instead just look for *any*. If we later find any @@ -2631,6 +2815,9 @@ omp_target_associate_ptr (const void *host_ptr, const void *device_ptr, k->tgt = tgt; k->tgt_offset = (uintptr_t) device_ptr + device_offset; k->refcount = REFCOUNT_INFINITY; + k->virtual_refcount = 0; + k->attach_count = NULL; + k->link_key = NULL; array->left = NULL; array->right = NULL; splay_tree_insert (&devicep->mem_map, array); @@ -3087,7 +3274,6 @@ gomp_target_init (void) current_device.type = current_device.get_type_func (); current_device.mem_map.root = NULL; current_device.state = GOMP_DEVICE_UNINITIALIZED; - current_device.openacc.data_environ = NULL; /* Augment DEVICES and NUM_DEVICES. */ devices = gomp_realloc (devices, diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/context-2.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/context-2.c index 6a52f74..6bdcfe7 100644 --- a/libgomp/testsuite/libgomp.oacc-c-c++-common/context-2.c +++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/context-2.c @@ -182,13 +182,13 @@ main (int argc, char **argv) exit (EXIT_FAILURE); } + acc_delete (&h_X[0], N * sizeof (float)); + acc_delete (&h_Y1[0], N * sizeof (float)); + free (h_X); free (h_Y1); free (h_Y2); - acc_free (d_X); - acc_free (d_Y); - context_check (pctx); s = cublasDestroy (h); diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/context-4.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/context-4.c index 71365e8..b403a5c 100644 --- a/libgomp/testsuite/libgomp.oacc-c-c++-common/context-4.c +++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/context-4.c @@ -176,13 +176,13 @@ main (int argc, char **argv) exit (EXIT_FAILURE); } + acc_delete (&h_X[0], N * sizeof (float)); + acc_delete (&h_Y1[0], N * sizeof (float)); + free (h_X); free (h_Y1); free (h_Y2); - acc_free (d_X); - acc_free (d_Y); - context_check (pctx); s = cublasDestroy (h); diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/deep-copy-1.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/deep-copy-1.c new file mode 100644 index 0000000..d8d7067 --- /dev/null +++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/deep-copy-1.c @@ -0,0 +1,24 @@ +#include +#include + +struct dc +{ + int a; + int *b; +}; + +int +main () +{ + int n = 100, i; + struct dc v = { .a = 3, .b = (int *) malloc (sizeof (int) * n) }; + +#pragma acc parallel loop copy(v.a, v.b[:n]) + for (i = 0; i < n; i++) + v.b[i] = v.a; + + for (i = 0; i < 10; i++) + assert (v.b[i] == v.a); + + return 0; +} diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/deep-copy-2.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/deep-copy-2.c new file mode 100644 index 0000000..7e26e9a --- /dev/null +++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/deep-copy-2.c @@ -0,0 +1,29 @@ +#include +#include + +int +main(int argc, char* argv[]) +{ + struct foo { + int *a, *b, c, d, *e; + } s; + + s.a = (int *) malloc (16 * sizeof (int)); + s.b = (int *) malloc (16 * sizeof (int)); + s.e = (int *) malloc (16 * sizeof (int)); + + #pragma acc data copy(s) + { + #pragma acc data copy(s.a[0:10]) + { + #pragma acc parallel loop attach(s.a) + for (int i = 0; i < 10; i++) + s.a[i] = i; + } + } + + for (int i = 0; i < 10; i++) + assert (s.a[i] == i); + + return 0; +} diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/deep-copy-3.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/deep-copy-3.c new file mode 100644 index 0000000..cec764b --- /dev/null +++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/deep-copy-3.c @@ -0,0 +1,34 @@ +#include +#include +#include + +int +main () +{ + int n = 100, i; + int *a = (int *) malloc (sizeof (int) * n); + int *b; + + for (i = 0; i < n; i++) + a[i] = i+1; + +#pragma acc enter data copyin(a[:n]) create(b) + + b = a; + acc_attach ((void **)&b); + +#pragma acc parallel loop present (b[:n]) + for (i = 0; i < n; i++) + b[i] = i+1; + + acc_detach ((void **)&b); + +#pragma acc exit data copyout(a[:n], b) + + for (i = 0; i < 10; i++) + assert (a[i] == b[i]); + + free (a); + + return 0; +} diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/deep-copy-4.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/deep-copy-4.c new file mode 100644 index 0000000..8874ca0 --- /dev/null +++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/deep-copy-4.c @@ -0,0 +1,87 @@ +#include +#include + +#define LIST_LENGTH 10 + +struct node +{ + struct node *next; + int val; +}; + +int +sum_nodes (struct node *head) +{ + int i = 0, sum = 0; + +#pragma acc parallel reduction(+:sum) present(head[:1]) + { + for (; head != NULL; head = head->next) + sum += head->val; + } + + return sum; +} + +void +insert (struct node *head, int val) +{ + struct node *n = (struct node *) malloc (sizeof (struct node)); + + if (head->next) + { +#pragma acc exit data detach(head->next) + } + + n->val = val; + n->next = head->next; + head->next = n; + +#pragma acc enter data copyin(n[:1]) +#pragma acc enter data attach(head->next) + if (n->next) + { +#pragma acc enter data attach(n->next) + } +} + +void +destroy (struct node *head) +{ + while (head->next != NULL) + { +#pragma acc exit data detach(head->next) + struct node * n = head->next; + head->next = n->next; + if (n->next) + { +#pragma acc exit data detach(n->next) + } +#pragma acc exit data delete (n[:1]) + if (head->next) + { +#pragma acc enter data attach(head->next) + } + free (n); + } +} + +int +main () +{ + struct node list = { .next = NULL, .val = 0 }; + int i; + +#pragma acc enter data copyin(list) + + for (i = 0; i < LIST_LENGTH; i++) + insert (&list, i + 1); + + assert (sum_nodes (&list) == (LIST_LENGTH * LIST_LENGTH + LIST_LENGTH) / 2); + + destroy (&list); + +#pragma acc exit data delete(list) + + return 0; +} diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/deep-copy-5.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/deep-copy-5.c new file mode 100644 index 0000000..89cafbb --- /dev/null +++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/deep-copy-5.c @@ -0,0 +1,81 @@ +#include +#include +#include + +struct node +{ + struct node *next; + int val; +}; + +int +sum_nodes (struct node *head) +{ + int i = 0, sum = 0; + +#pragma acc parallel reduction(+:sum) present(head[:1]) + { + for (; head != NULL; head = head->next) + sum += head->val; + } + + return sum; +} + +void +insert (struct node *head, int val) +{ + struct node *n = (struct node *) malloc (sizeof (struct node)); + + if (head->next) + acc_detach ((void **) &head->next); + + n->val = val; + n->next = head->next; + head->next = n; + + acc_copyin (n, sizeof (struct node)); + acc_attach((void **) &head->next); + + if (n->next) + acc_attach ((void **) &n->next); +} + +void +destroy (struct node *head) +{ + while (head->next != NULL) + { + acc_detach ((void **) &head->next); + struct node * n = head->next; + head->next = n->next; + if (n->next) + acc_detach ((void **) &n->next); + + acc_delete (n, sizeof (struct node)); + if (head->next) + acc_attach((void **) &head->next); + + free (n); + } +} + +int +main () +{ + struct node list = { .next = NULL, .val = 0 }; + int i; + + acc_copyin (&list, sizeof (struct node)); + + for (i = 0; i < 10; i++) + insert (&list, 2); + + assert (sum_nodes (&list) == 10 * 2); + + destroy (&list); + + acc_delete (&list, sizeof (struct node)); + + return 0; +} diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/deep-copy-6.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/deep-copy-6.c new file mode 100644 index 0000000..81c1c5e --- /dev/null +++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/deep-copy-6.c @@ -0,0 +1,59 @@ +/* { dg-do run { target { ! openacc_host_selected } } } */ + +#include +#include +#include + +struct dc +{ + int a; + int **b; +}; + +int +main () +{ + int n = 100, i, j, k; + struct dc v = { .a = 3 }; + + v.b = (int **) malloc (sizeof (int *) * n); + for (i = 0; i < n; i++) + v.b[i] = (int *) malloc (sizeof (int) * n); + + for (k = 0; k < 16; k++) + { +#pragma acc data copy(v) + { +#pragma acc data copy(v.b[:n]) + { + for (i = 0; i < n; i++) + { + acc_copyin (v.b[i], sizeof (int) * n); + acc_attach ((void **) &v.b[i]); + } + +#pragma acc parallel loop + for (i = 0; i < n; i++) + for (j = 0; j < n; j++) + v.b[i][j] = v.a + i + j; + + for (i = 0; i < n; i++) + { + acc_detach ((void **) &v.b[i]); + acc_copyout (v.b[i], sizeof (int) * n); + } + } + } + + for (i = 0; i < n; i++) + for (j = 0; j < n; j++) + assert (v.b[i][j] == v.a + i + j); + + assert (!acc_is_present (&v, sizeof (v))); + assert (!acc_is_present (v.b, sizeof (int *) * n)); + for (i = 0; i < n; i++) + assert (!acc_is_present (v.b[i], sizeof (int) * n)); + } + + return 0; +} diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/deep-copy-7.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/deep-copy-7.c new file mode 100644 index 0000000..a59047a --- /dev/null +++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/deep-copy-7.c @@ -0,0 +1,45 @@ +/* { dg-do run { target { ! openacc_host_selected } } } */ + +#include +#include +#include + +struct dc +{ + int a; + int *b; +}; + +int +main () +{ + int n = 100, i, j, k; + struct dc v = { .a = 3 }; + + v.b = (int *) malloc (sizeof (int) * n); + + for (k = 0; k < 16; k++) + { + /* Here, we do not explicitly copy the enclosing structure, but work + with fields directly. Make sure attachment counters and reference + counters work properly in that case. */ +#pragma acc enter data copyin(v.a, v.b[0:n]) +#pragma acc enter data pcopyin(v.b[0:n]) +#pragma acc enter data pcopyin(v.b[0:n]) + +#pragma acc parallel loop present(v.a, v.b) + for (i = 0; i < n; i++) + v.b[i] = v.a + i; + +#pragma acc exit data copyout(v.b[:n]) finalize +#pragma acc exit data delete(v.a) + + for (i = 0; i < n; i++) + assert (v.b[i] == v.a + i); + + assert (!acc_is_present (&v, sizeof (v))); + assert (!acc_is_present (v.b, sizeof (int *) * n)); + } + + return 0; +} diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/deep-copy-8.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/deep-copy-8.c new file mode 100644 index 0000000..0ca5990 --- /dev/null +++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/deep-copy-8.c @@ -0,0 +1,54 @@ +/* { dg-do run { target { ! openacc_host_selected } } } */ + +#include +#include +#include + +struct dc +{ + int a; + int *b; + int *c; + int *d; +}; + +int +main () +{ + int n = 100, i, j, k; + struct dc v = { .a = 3 }; + + v.b = (int *) malloc (sizeof (int) * n); + v.c = (int *) malloc (sizeof (int) * n); + v.d = (int *) malloc (sizeof (int) * n); + +#pragma acc enter data copyin(v) + + for (k = 0; k < 16; k++) + { +#pragma acc enter data copyin(v.a, v.b[:n], v.c[:n], v.d[:n]) + +#pragma acc parallel loop + for (i = 0; i < n; i++) + v.b[i] = v.a + i; + +#pragma acc exit data copyout(v.b[:n]) +#pragma acc exit data copyout(v.c[:n]) +#pragma acc exit data copyout(v.d[:n]) +#pragma acc exit data copyout(v.a) + + for (i = 0; i < n; i++) + assert (v.b[i] == v.a + i); + + assert (acc_is_present (&v, sizeof (v))); + assert (!acc_is_present (v.b, sizeof (int *) * n)); + assert (!acc_is_present (v.c, sizeof (int *) * n)); + assert (!acc_is_present (v.d, sizeof (int *) * n)); + } + +#pragma acc exit data copyout(v) + + assert (!acc_is_present (&v, sizeof (v))); + + return 0; +} diff --git a/libgomp/testsuite/libgomp.oacc-fortran/data-2.f90 b/libgomp/testsuite/libgomp.oacc-fortran/data-2.f90 index 83a5400..6bb92c1 100644 --- a/libgomp/testsuite/libgomp.oacc-fortran/data-2.f90 +++ b/libgomp/testsuite/libgomp.oacc-fortran/data-2.f90 @@ -1,4 +1,5 @@ ! { dg-do run } +! { dg-additional-options "-cpp" } program test use openacc @@ -70,10 +71,14 @@ program test end do !$acc end parallel - !$acc exit data copyout (d(1:N)) async + !$acc exit data delete (c(1:N)) copyout (d(1:N)) async !$acc exit data async !$acc wait +#if !ACC_MEM_SHARED + if (acc_is_present (c) .eqv. .TRUE.) call abort +#endif + do i = 1, N if (d(i) .ne. 4.0) call abort end do diff --git a/libgomp/testsuite/libgomp.oacc-fortran/deep-copy-1.f90 b/libgomp/testsuite/libgomp.oacc-fortran/deep-copy-1.f90 new file mode 100644 index 0000000..c4cea11 --- /dev/null +++ b/libgomp/testsuite/libgomp.oacc-fortran/deep-copy-1.f90 @@ -0,0 +1,35 @@ +! { dg-do run } + +! Test of attach/detach with "acc data". + +program dtype + implicit none + integer, parameter :: n = 512 + type mytype + integer, allocatable :: a(:) + end type mytype + integer i + + type(mytype) :: var + + allocate(var%a(1:n)) + +!$acc data copy(var) +!$acc data copy(var%a) + +!$acc parallel loop + do i = 1,n + var%a(i) = i + end do +!$acc end parallel loop + +!$acc end data +!$acc end data + + do i = 1,n + if (i .ne. var%a(i)) stop 1 + end do + + deallocate(var%a) + +end program dtype diff --git a/libgomp/testsuite/libgomp.oacc-fortran/deep-copy-2.f90 b/libgomp/testsuite/libgomp.oacc-fortran/deep-copy-2.f90 new file mode 100644 index 0000000..3593661 --- /dev/null +++ b/libgomp/testsuite/libgomp.oacc-fortran/deep-copy-2.f90 @@ -0,0 +1,33 @@ +! { dg-do run } + +! Test of attach/detach with "acc data", two clauses at once. + +program dtype + implicit none + integer, parameter :: n = 512 + type mytype + integer, allocatable :: a(:) + end type mytype + integer i + + type(mytype) :: var + + allocate(var%a(1:n)) + +!$acc data copy(var) copy(var%a) + +!$acc parallel loop + do i = 1,n + var%a(i) = i + end do +!$acc end parallel loop + +!$acc end data + + do i = 1,n + if (i .ne. var%a(i)) stop 1 + end do + + deallocate(var%a) + +end program dtype diff --git a/libgomp/testsuite/libgomp.oacc-fortran/deep-copy-3.f90 b/libgomp/testsuite/libgomp.oacc-fortran/deep-copy-3.f90 new file mode 100644 index 0000000..667d944 --- /dev/null +++ b/libgomp/testsuite/libgomp.oacc-fortran/deep-copy-3.f90 @@ -0,0 +1,34 @@ +! { dg-do run } + +! Test of attach/detach with "acc parallel". + +program dtype + implicit none + integer, parameter :: n = 512 + type mytype + integer, allocatable :: a(:) + integer, allocatable :: b(:) + end type mytype + integer i + + type(mytype) :: var + + allocate(var%a(1:n)) + allocate(var%b(1:n)) + +!$acc parallel loop copy(var) copy(var%a(1:n)) copy(var%b(1:n)) + do i = 1,n + var%a(i) = i + var%b(i) = i + end do +!$acc end parallel loop + + do i = 1,n + if (i .ne. var%a(i)) stop 1 + if (i .ne. var%b(i)) stop 2 + end do + + deallocate(var%a) + deallocate(var%b) + +end program dtype diff --git a/libgomp/testsuite/libgomp.oacc-fortran/deep-copy-4.f90 b/libgomp/testsuite/libgomp.oacc-fortran/deep-copy-4.f90 new file mode 100644 index 0000000..6949e12 --- /dev/null +++ b/libgomp/testsuite/libgomp.oacc-fortran/deep-copy-4.f90 @@ -0,0 +1,49 @@ +! { dg-do run } + +! Test of attach/detach with "acc enter/exit data". + +program dtype + implicit none + integer, parameter :: n = 512 + type mytype + integer, allocatable :: a(:) + integer, allocatable :: b(:) + end type mytype + integer, allocatable :: r(:) + integer i + + type(mytype) :: var + + allocate(var%a(1:n)) + allocate(var%b(1:n)) + allocate(r(1:n)) + +!$acc enter data copyin(var) + +!$acc enter data copyin(var%a, var%b, r) + +!$acc parallel loop + do i = 1,n + var%a(i) = i + var%b(i) = i * 2 + r(i) = i * 3 + end do +!$acc end parallel loop + +!$acc exit data copyout(var%a) +!$acc exit data copyout(var%b) +!$acc exit data copyout(r) + + do i = 1,n + if (i .ne. var%a(i)) stop 1 + if (i * 2 .ne. var%b(i)) stop 2 + if (i * 3 .ne. r(i)) stop 3 + end do + +!$acc exit data delete(var) + + deallocate(var%a) + deallocate(var%b) + deallocate(r) + +end program dtype diff --git a/libgomp/testsuite/libgomp.oacc-fortran/deep-copy-5.f90 b/libgomp/testsuite/libgomp.oacc-fortran/deep-copy-5.f90 new file mode 100644 index 0000000..6843cf1 --- /dev/null +++ b/libgomp/testsuite/libgomp.oacc-fortran/deep-copy-5.f90 @@ -0,0 +1,57 @@ +! { dg-do run } + +! Test of attach/detach, "enter data" inside "data", and subarray. + +program dtype + implicit none + integer, parameter :: n = 512 + type mytype + integer, allocatable :: a(:) + integer, allocatable :: b(:) + end type mytype + integer i + + type(mytype) :: var + + allocate(var%a(1:n)) + allocate(var%b(1:n)) + +!$acc data copy(var) + + do i = 1, n + var%a(i) = 0 + var%b(i) = 0 + end do + +!$acc enter data copyin(var%a(5:n - 5), var%b(5:n - 5)) + +!$acc parallel loop + do i = 5,n - 5 + var%a(i) = i + var%b(i) = i * 2 + end do +!$acc end parallel loop + +!$acc exit data copyout(var%a(5:n - 5), var%b(5:n - 5)) + +!$acc end data + + do i = 1,4 + if (var%a(i) .ne. 0) stop 1 + if (var%b(i) .ne. 0) stop 2 + end do + + do i = 5,n - 5 + if (i .ne. var%a(i)) stop 3 + if (i * 2 .ne. var%b(i)) stop 4 + end do + + do i = n - 4,n + if (var%a(i) .ne. 0) stop 5 + if (var%b(i) .ne. 0) stop 6 + end do + + deallocate(var%a) + deallocate(var%b) + +end program dtype diff --git a/libgomp/testsuite/libgomp.oacc-fortran/deep-copy-6.f90 b/libgomp/testsuite/libgomp.oacc-fortran/deep-copy-6.f90 new file mode 100644 index 0000000..12910d0 --- /dev/null +++ b/libgomp/testsuite/libgomp.oacc-fortran/deep-copy-6.f90 @@ -0,0 +1,61 @@ +! { dg-do run } + +! Test of attachment counters and finalize. + +program dtype + implicit none + integer, parameter :: n = 512 + type mytype + integer, allocatable :: a(:) + integer, allocatable :: b(:) + end type mytype + integer i + + type(mytype) :: var + + allocate(var%a(1:n)) + allocate(var%b(1:n)) + +!$acc data copy(var) + + do i = 1, n + var%a(i) = 0 + var%b(i) = 0 + end do + +!$acc enter data copyin(var%a(5:n - 5), var%b(5:n - 5)) + + do i = 1,20 + !$acc enter data attach(var%a) + end do + +!$acc parallel loop + do i = 5,n - 5 + var%a(i) = i + var%b(i) = i * 2 + end do +!$acc end parallel loop + +!$acc exit data copyout(var%a(5:n - 5), var%b(5:n - 5)) finalize + +!$acc end data + + do i = 1,4 + if (var%a(i) .ne. 0) stop 1 + if (var%b(i) .ne. 0) stop 2 + end do + + do i = 5,n - 5 + if (i .ne. var%a(i)) stop 3 + if (i * 2 .ne. var%b(i)) stop 4 + end do + + do i = n - 4,n + if (var%a(i) .ne. 0) stop 5 + if (var%b(i) .ne. 0) stop 6 + end do + + deallocate(var%a) + deallocate(var%b) + +end program dtype diff --git a/libgomp/testsuite/libgomp.oacc-fortran/deep-copy-7.f90 b/libgomp/testsuite/libgomp.oacc-fortran/deep-copy-7.f90 new file mode 100644 index 0000000..ab44f0a --- /dev/null +++ b/libgomp/testsuite/libgomp.oacc-fortran/deep-copy-7.f90 @@ -0,0 +1,89 @@ +! { dg-do run } + +! Test of attach/detach with scalar elements and nested derived types. + +program dtype + implicit none + integer, parameter :: n = 512 + type subtype + integer :: g, h + integer, allocatable :: q(:) + end type subtype + type mytype + integer, allocatable :: a(:) + integer, allocatable :: c, d + integer, allocatable :: b(:) + integer :: f + type(subtype) :: s + end type mytype + integer i + + type(mytype) :: var + + allocate(var%a(1:n)) + allocate(var%b(1:n)) + allocate(var%c) + allocate(var%d) + allocate(var%s%q(1:n)) + + var%c = 16 + var%d = 20 + var%f = 7 + var%s%g = 21 + var%s%h = 38 + +!$acc enter data copyin(var) + + do i = 1, n + var%a(i) = 0 + var%b(i) = 0 + var%s%q(i) = 0 + end do + +!$acc data copy(var%a(5:n - 5), var%b(5:n - 5), var%c, var%d) & +!$acc & copy(var%s%q) + +!$acc parallel loop default(none) present(var) + do i = 5,n - 5 + var%a(i) = i + var%b(i) = i * 2 + var%s%q(i) = i * 3 + var%s%g = 100 + var%s%h = 101 + end do +!$acc end parallel loop + +!$acc end data + +!$acc exit data copyout(var) + + do i = 1,4 + if (var%a(i) .ne. 0) stop 1 + if (var%b(i) .ne. 0) stop 2 + if (var%s%q(i) .ne. 0) stop 3 + end do + + do i = 5,n - 5 + if (i .ne. var%a(i)) stop 4 + if (i * 2 .ne. var%b(i)) stop 5 + if (i * 3 .ne. var%s%q(i)) stop 6 + end do + + do i = n - 4,n + if (var%a(i) .ne. 0) stop 7 + if (var%b(i) .ne. 0) stop 8 + if (var%s%q(i) .ne. 0) stop 9 + end do + + if (var%c .ne. 16) stop 10 + if (var%d .ne. 20) stop 11 + if (var%s%g .ne. 100 .or. var%s%h .ne. 101) stop 12 + if (var%f .ne. 7) stop 13 + + deallocate(var%a) + deallocate(var%b) + deallocate(var%c) + deallocate(var%d) + deallocate(var%s%q) + +end program dtype diff --git a/libgomp/testsuite/libgomp.oacc-fortran/deep-copy-8.f90 b/libgomp/testsuite/libgomp.oacc-fortran/deep-copy-8.f90 new file mode 100644 index 0000000..d142763 --- /dev/null +++ b/libgomp/testsuite/libgomp.oacc-fortran/deep-copy-8.f90 @@ -0,0 +1,41 @@ +! { dg-do run } + +! Test of explicit attach/detach clauses and attachment counters. There are no +! acc_attach/acc_detach API routines in Fortran. + +program dtype + use openacc + implicit none + integer, parameter :: n = 512 + type mytype + integer, allocatable :: a(:) + end type mytype + integer i + + type(mytype) :: var + + allocate(var%a(1:n)) + + call acc_copyin(var) + call acc_copyin(var%a) + + !$acc enter data attach(var%a) + +!$acc parallel loop attach(var%a) + do i = 1,n + var%a(i) = i + end do +!$acc end parallel loop + + !$acc exit data detach(var%a) + + call acc_copyout(var%a) + call acc_copyout(var) + + do i = 1,n + if (i .ne. var%a(i)) stop 1 + end do + + deallocate(var%a) + +end program dtype diff --git a/libgomp/testsuite/libgomp.oacc-fortran/derived-type-1.f90 b/libgomp/testsuite/libgomp.oacc-fortran/derived-type-1.f90 new file mode 100644 index 0000000..eb7812d --- /dev/null +++ b/libgomp/testsuite/libgomp.oacc-fortran/derived-type-1.f90 @@ -0,0 +1,28 @@ +! Test derived types with subarrays + +! { dg-do run } + + implicit none + type dtype + integer :: a, b, c + end type dtype + integer, parameter :: n = 100 + integer i + type (dtype), dimension(n) :: d + + !$acc data copy(d(1:n)) + !$acc parallel loop + do i = 1, n + d(i)%a = i + d(i)%b = i-1 + d(i)%c = i+1 + end do + !$acc end data + + do i = 1, n + if (d(i)%a /= i) stop 1 + if (d(i)%b /= i-1) stop 2 + if (d(i)%c /= i+1) stop 3 + end do +end program + diff --git a/libgomp/testsuite/libgomp.oacc-fortran/update-2.f90 b/libgomp/testsuite/libgomp.oacc-fortran/update-2.f90 new file mode 100644 index 0000000..c3c8a07 --- /dev/null +++ b/libgomp/testsuite/libgomp.oacc-fortran/update-2.f90 @@ -0,0 +1,284 @@ +! Test ACC UPDATE with derived types. + +! { dg-do run } + +module dt + integer, parameter :: n = 10 + type inner + integer :: d(n) + end type inner + type mytype + integer(8) :: a, b, c(n) + type(inner) :: in + end type mytype +end module dt + +program derived_acc + use dt + + implicit none + integer i, res + type(mytype) :: var + + var%a = 0 + var%b = 1 + var%c(:) = 10 + var%in%d(:) = 100 + + var%c(:) = 10 + + !$acc enter data copyin(var) + + !$acc parallel loop present(var) + do i = 1, 1 + var%a = var%b + end do + !$acc end parallel loop + + !$acc update host(var%a) + + if (var%a /= var%b) stop 1 + + var%b = 100 + + !$acc update device(var%b) + + !$acc parallel loop present(var) + do i = 1, 1 + var%a = var%b + end do + !$acc end parallel loop + + !$acc update host(var%a) + + if (var%a /= var%b) stop 2 + + !$acc parallel loop present (var) + do i = 1, n + var%c(i) = i + end do + !$acc end parallel loop + + !$acc update host(var%c) + + var%a = -1 + + do i = 1, n + if (var%c(i) /= i) stop 3 + var%c(i) = var%a + end do + + !$acc update device(var%a) + !$acc update device(var%c) + + res = 0 + + !$acc parallel loop present(var) reduction(+:res) + do i = 1, n + if (var%c(i) /= var%a) res = res + 1 + end do + + if (res /= 0) stop 4 + + var%c(:) = 0 + + !$acc update device(var%c) + + !$acc parallel loop present(var) + do i = 5, 5 + var%c(i) = 1 + end do + !$acc end parallel loop + + !$acc update host(var%c(5)) + + do i = 1, n + if (i /= 5 .and. var%c(i) /= 0) stop 5 + if (i == 5 .and. var%c(i) /= 1) stop 6 + end do + + !$acc parallel loop present(var) + do i = 1, n + var%in%d = var%a + end do + !$acc end parallel loop + + !$acc update host(var%in%d) + + do i = 1, n + if (var%in%d(i) /= var%a) stop 7 + end do + + var%c(:) = 0 + + !$acc update device(var%c) + + var%c(:) = -1 + + !$acc parallel loop present(var) + do i = n/2, n + var%c(i) = i + end do + !$acc end parallel loop + + !$acc update host(var%c(n/2:n)) + + do i = 1,n + if (i < n/2 .and. var%c(i) /= -1) stop 8 + if (i >= n/2 .and. var%c(i) /= i) stop 9 + end do + + var%in%d(:) = 0 + !$acc update device(var%in%d) + + !$acc parallel loop present(var) + do i = 5, 5 + var%in%d(i) = 1 + end do + !$acc end parallel loop + + !$acc update host(var%in%d(5)) + + do i = 1, n + if (i /= 5 .and. var%in%d(i) /= 0) stop 10 + if (i == 5 .and. var%in%d(i) /= 1) stop 11 + end do + + !$acc exit data delete(var) + + call derived_acc_subroutine(var) +end program derived_acc + +subroutine derived_acc_subroutine(var) + use dt + + implicit none + integer i, res + type(mytype) :: var + + var%a = 0 + var%b = 1 + var%c(:) = 10 + var%in%d(:) = 100 + + var%c(:) = 10 + + !$acc enter data copyin(var) + + !$acc parallel loop present(var) + do i = 1, 1 + var%a = var%b + end do + !$acc end parallel loop + + !$acc update host(var%a) + + if (var%a /= var%b) stop 12 + + var%b = 100 + + !$acc update device(var%b) + + !$acc parallel loop present(var) + do i = 1, 1 + var%a = var%b + end do + !$acc end parallel loop + + !$acc update host(var%a) + + if (var%a /= var%b) stop 13 + + !$acc parallel loop present (var) + do i = 1, n + var%c(i) = i + end do + !$acc end parallel loop + + !$acc update host(var%c) + + var%a = -1 + + do i = 1, n + if (var%c(i) /= i) stop 14 + var%c(i) = var%a + end do + + !$acc update device(var%a) + !$acc update device(var%c) + + res = 0 + + !$acc parallel loop present(var) reduction(+:res) + do i = 1, n + if (var%c(i) /= var%a) res = res + 1 + end do + + if (res /= 0) stop 15 + + var%c(:) = 0 + + !$acc update device(var%c) + + !$acc parallel loop present(var) + do i = 5, 5 + var%c(i) = 1 + end do + !$acc end parallel loop + + !$acc update host(var%c(5)) + + do i = 1, n + if (i /= 5 .and. var%c(i) /= 0) stop 16 + if (i == 5 .and. var%c(i) /= 1) stop 17 + end do + + !$acc parallel loop present(var) + do i = 1, n + var%in%d = var%a + end do + !$acc end parallel loop + + !$acc update host(var%in%d) + + do i = 1, n + if (var%in%d(i) /= var%a) stop 18 + end do + + var%c(:) = 0 + + !$acc update device(var%c) + + var%c(:) = -1 + + !$acc parallel loop present(var) + do i = n/2, n + var%c(i) = i + end do + !$acc end parallel loop + + !$acc update host(var%c(n/2:n)) + + do i = 1,n + if (i < n/2 .and. var%c(i) /= -1) stop 19 + if (i >= n/2 .and. var%c(i) /= i) stop 20 + end do + + var%in%d(:) = 0 + !$acc update device(var%in%d) + + !$acc parallel loop present(var) + do i = 5, 5 + var%in%d(i) = 1 + end do + !$acc end parallel loop + + !$acc update host(var%in%d(5)) + + do i = 1, n + if (i /= 5 .and. var%in%d(i) /= 0) stop 21 + if (i == 5 .and. var%in%d(i) /= 1) stop 22 + end do + + !$acc exit data delete(var) +end subroutine derived_acc_subroutine