From patchwork Sat Nov 10 17:11:18 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Julian Brown X-Patchwork-Id: 995937 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (mailfrom) smtp.mailfrom=gcc.gnu.org (client-ip=209.132.180.131; helo=sourceware.org; envelope-from=gcc-patches-return-489622-incoming=patchwork.ozlabs.org@gcc.gnu.org; receiver=) Authentication-Results: ozlabs.org; dmarc=none (p=none dis=none) header.from=codesourcery.com Authentication-Results: ozlabs.org; dkim=pass (1024-bit key; unprotected) header.d=gcc.gnu.org header.i=@gcc.gnu.org header.b="HKyIX8cE"; dkim-atps=neutral Received: from sourceware.org (server1.sourceware.org [209.132.180.131]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 42sk8g323Fz9s9G for ; Sun, 11 Nov 2018 04:12:03 +1100 (AEDT) DomainKey-Signature: a=rsa-sha1; c=nofws; d=gcc.gnu.org; h=list-id :list-unsubscribe:list-archive:list-post:list-help:sender:from :to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-type; q=dns; s=default; b=fuq1ec+jHydFiQsn TIKUiDJ2uD5YRAqZ3gL6nH7Bvm4DEeu4Lv7KDF6/w0qq1esQaXA4sZnMdyf4sGy+ Lk4BmduXRkzFr2fatkUDkC8fCEZIeEwU2oVBPYpRGR3OMO34p6peiwt1fUvspEab 2u8Ct4pPZ27Cd4A2ZweQVe7H7sQ= DKIM-Signature: v=1; a=rsa-sha1; c=relaxed; d=gcc.gnu.org; h=list-id :list-unsubscribe:list-archive:list-post:list-help:sender:from :to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-type; s=default; bh=8fstFUlIwyp6xomMdvNZ/c MUEZM=; b=HKyIX8cExD0uU1ob41c8tI0TBWBPpUx8W9ZU5KqHYEUgqXNXAM9zXM 6dHFqzQz4ghBU6DmTTg04sJmefw0ok9Qa1IJZSxD2ElsEh4Ts4pd5nQT/qMOFosk FqVVGX9tElbH4bVD2FdAsvm6tBWLNYJEzI+YOqLP4qzc5KhjvX7wo= Received: (qmail 100468 invoked by alias); 10 Nov 2018 17:11:41 -0000 Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm Precedence: bulk List-Id: List-Unsubscribe: List-Archive: List-Post: List-Help: Sender: gcc-patches-owner@gcc.gnu.org Delivered-To: mailing list gcc-patches@gcc.gnu.org Received: (qmail 100365 invoked by uid 89); 10 Nov 2018 17:11:40 -0000 Authentication-Results: sourceware.org; auth=none X-Spam-SWARE-Status: No, score=-22.9 required=5.0 tests=AWL, BAYES_00, GIT_PATCH_0, GIT_PATCH_1, GIT_PATCH_2, GIT_PATCH_3, RCVD_IN_DNSWL_NONE, SPF_PASS, UNWANTED_LANGUAGE_BODY autolearn=ham version=3.3.2 spammy=transfers, 8766, 458, 2058 X-HELO: relay1.mentorg.com Received: from relay1.mentorg.com (HELO relay1.mentorg.com) (192.94.38.131) by sourceware.org (qpsmtpd/0.93/v0.84-503-g423c35a) with ESMTP; Sat, 10 Nov 2018 17:11:38 +0000 Received: from nat-ies.mentorg.com ([192.94.31.2] helo=SVR-IES-MBX-04.mgc.mentorg.com) by relay1.mentorg.com with esmtps (TLSv1.2:ECDHE-RSA-AES256-SHA384:256) id 1gLWnI-0005c4-EZ from Julian_Brown@mentor.com ; Sat, 10 Nov 2018 09:11:36 -0800 Received: from localhost.localdomain (147.34.91.1) by SVR-IES-MBX-04.mgc.mentorg.com (139.181.222.4) with Microsoft SMTP Server (TLS) id 15.0.1320.4; Sat, 10 Nov 2018 17:11:31 +0000 From: Julian Brown To: CC: , , , Subject: [PATCH 1/3] Host-to-device transfer coalescing & magic offset value self-documentation Date: Sat, 10 Nov 2018 09:11:18 -0800 Message-ID: <8340b3d7685106871b060c54f894105f20cdc052.1541863637.git.julian@codesourcery.com> In-Reply-To: References: MIME-Version: 1.0 X-IsSubscribed: yes This patch (by Cesar, with some minor additional changes) replaces usage of several magic constants in target.c with named macros, and replaces the flat array of size_t pairs used for coalescing host-to-device copies with an array of a new struct with start/end fields instead. Tested and bootstrapped alongside the other patches in this series (plus the async patches).. OK? Julian ChangeLog libgomp/ * libgomp.h (OFFSET_INLINED, OFFSET_POINTER, OFFSET_STRUCT): Define. * target.c (FIELD_TGT_EMPTY): Define. (gomp_coalesce_chunk): New. (gomp_coalesce_buf): Use above instead of flat array of size_t pairs. (gomp_coalesce_buf_add): Adjust for above change. (gomp_copy_host2dev): Likewise. (gomp_map_val): Use OFFSET_* macros instead of magic constants. Write as switch instead of list of ifs. (gomp_map_vars_async): Adjust for gomp_coalesce_chunk change. Use OFFSET_* macros. --- libgomp/libgomp.h | 5 +++ libgomp/target.c | 101 ++++++++++++++++++++++++++++++++---------------------- 2 files changed, 65 insertions(+), 41 deletions(-) diff --git a/libgomp/libgomp.h b/libgomp/libgomp.h index dac8dc4..cb25e86 100644 --- a/libgomp/libgomp.h +++ b/libgomp/libgomp.h @@ -902,6 +902,11 @@ struct target_mem_desc { artificial pointer to "omp declare target link" object. */ #define REFCOUNT_LINK (~(uintptr_t) 1) +/* Special offset values. */ +#define OFFSET_INLINED (~(uintptr_t) 0) +#define OFFSET_POINTER (~(uintptr_t) 1) +#define OFFSET_STRUCT (~(uintptr_t) 2) + struct splay_tree_key_s { /* Address of the host object. */ uintptr_t host_start; diff --git a/libgomp/target.c b/libgomp/target.c index f3e2332..2bfc7e2 100644 --- a/libgomp/target.c +++ b/libgomp/target.c @@ -45,6 +45,8 @@ #include "plugin-suffix.h" #endif +#define FIELD_TGT_EMPTY (~(size_t) 0) + static void gomp_target_init (void); /* The whole initialization code for offloading plugins is only run one. */ @@ -205,8 +207,14 @@ goacc_device_copy_async (struct gomp_device_descr *devicep, } } -/* Infrastructure for coalescing adjacent or nearly adjacent (in device addresses) - host to device memory transfers. */ +/* Infrastructure for coalescing adjacent or nearly adjacent (in device + addresses) host to device memory transfers. */ + +struct gomp_coalesce_chunk +{ + /* The starting and ending point of a coalesced chunk of memory. */ + size_t start, end; +}; struct gomp_coalesce_buf { @@ -214,10 +222,10 @@ struct gomp_coalesce_buf it will be copied to the device. */ void *buf; struct target_mem_desc *tgt; - /* Array with offsets, chunks[2 * i] is the starting offset and - chunks[2 * i + 1] ending offset relative to tgt->tgt_start device address + /* Array with offsets, chunks[i].start is the starting offset and + chunks[i].end ending offset relative to tgt->tgt_start device address of chunks which are to be copied to buf and later copied to device. */ - size_t *chunks; + struct gomp_coalesce_chunk *chunks; /* Number of chunks in chunks array, or -1 if coalesce buffering should not be performed. */ long chunk_cnt; @@ -250,14 +258,14 @@ gomp_coalesce_buf_add (struct gomp_coalesce_buf *cbuf, size_t start, size_t len) { if (cbuf->chunk_cnt < 0) return; - if (start < cbuf->chunks[2 * cbuf->chunk_cnt - 1]) + if (start < cbuf->chunks[cbuf->chunk_cnt-1].end) { cbuf->chunk_cnt = -1; return; } - if (start < cbuf->chunks[2 * cbuf->chunk_cnt - 1] + MAX_COALESCE_BUF_GAP) + if (start < cbuf->chunks[cbuf->chunk_cnt-1].end + MAX_COALESCE_BUF_GAP) { - cbuf->chunks[2 * cbuf->chunk_cnt - 1] = start + len; + cbuf->chunks[cbuf->chunk_cnt-1].end = start + len; cbuf->use_cnt++; return; } @@ -267,8 +275,8 @@ gomp_coalesce_buf_add (struct gomp_coalesce_buf *cbuf, size_t start, size_t len) if (cbuf->use_cnt == 1) cbuf->chunk_cnt--; } - cbuf->chunks[2 * cbuf->chunk_cnt] = start; - cbuf->chunks[2 * cbuf->chunk_cnt + 1] = start + len; + cbuf->chunks[cbuf->chunk_cnt].start = start; + cbuf->chunks[cbuf->chunk_cnt].end = start + len; cbuf->chunk_cnt++; cbuf->use_cnt = 1; } @@ -300,20 +308,20 @@ gomp_copy_host2dev (struct gomp_device_descr *devicep, if (cbuf) { uintptr_t doff = (uintptr_t) d - cbuf->tgt->tgt_start; - if (doff < cbuf->chunks[2 * cbuf->chunk_cnt - 1]) + if (doff < cbuf->chunks[cbuf->chunk_cnt-1].end) { long first = 0; long last = cbuf->chunk_cnt - 1; while (first <= last) { long middle = (first + last) >> 1; - if (cbuf->chunks[2 * middle + 1] <= doff) + if (cbuf->chunks[middle].end <= doff) first = middle + 1; - else if (cbuf->chunks[2 * middle] <= doff) + else if (cbuf->chunks[middle].start <= doff) { - if (doff + sz > cbuf->chunks[2 * middle + 1]) + if (doff + sz > cbuf->chunks[middle].end) gomp_fatal ("internal libgomp cbuf error"); - memcpy ((char *) cbuf->buf + (doff - cbuf->chunks[0]), + memcpy ((char *) cbuf->buf + (doff - cbuf->chunks[0].start), h, sz); return; } @@ -504,17 +512,25 @@ gomp_map_val (struct target_mem_desc *tgt, void **hostaddrs, size_t i) return tgt->list[i].key->tgt->tgt_start + tgt->list[i].key->tgt_offset + tgt->list[i].offset; - if (tgt->list[i].offset == ~(uintptr_t) 0) - return (uintptr_t) hostaddrs[i]; - if (tgt->list[i].offset == ~(uintptr_t) 1) - return 0; - if (tgt->list[i].offset == ~(uintptr_t) 2) - return tgt->list[i + 1].key->tgt->tgt_start - + tgt->list[i + 1].key->tgt_offset - + tgt->list[i + 1].offset - + (uintptr_t) hostaddrs[i] - - (uintptr_t) hostaddrs[i + 1]; - return tgt->tgt_start + tgt->list[i].offset; + + switch (tgt->list[i].offset) + { + case OFFSET_INLINED: + return (uintptr_t) hostaddrs[i]; + + case OFFSET_POINTER: + return 0; + + case OFFSET_STRUCT: + return tgt->list[i + 1].key->tgt->tgt_start + + tgt->list[i + 1].key->tgt_offset + + tgt->list[i + 1].offset + + (uintptr_t) hostaddrs[i] + - (uintptr_t) hostaddrs[i + 1]; + + default: + return tgt->tgt_start + tgt->list[i].offset; + } } attribute_hidden struct target_mem_desc * @@ -562,8 +578,8 @@ gomp_map_vars_async (struct gomp_device_descr *devicep, cbuf.buf = NULL; if (mapnum > 1 || pragma_kind == GOMP_MAP_VARS_TARGET) { - cbuf.chunks - = (size_t *) gomp_alloca ((2 * mapnum + 2) * sizeof (size_t)); + size_t chunk_size = (mapnum + 1) * sizeof (struct gomp_coalesce_chunk); + cbuf.chunks = (struct gomp_coalesce_chunk *) gomp_alloca (chunk_size); cbuf.chunk_cnt = 0; } if (pragma_kind == GOMP_MAP_VARS_TARGET) @@ -573,8 +589,8 @@ gomp_map_vars_async (struct gomp_device_descr *devicep, tgt_size = mapnum * sizeof (void *); cbuf.chunk_cnt = 1; cbuf.use_cnt = 1 + (mapnum > 1); - cbuf.chunks[0] = 0; - cbuf.chunks[1] = tgt_size; + cbuf.chunks[0].start = 0; + cbuf.chunks[0].end = tgt_size; } gomp_mutex_lock (&devicep->lock); @@ -592,7 +608,7 @@ gomp_map_vars_async (struct gomp_device_descr *devicep, || (kind & typemask) == GOMP_MAP_FIRSTPRIVATE_INT) { tgt->list[i].key = NULL; - tgt->list[i].offset = ~(uintptr_t) 0; + tgt->list[i].offset = OFFSET_INLINED; continue; } else if ((kind & typemask) == GOMP_MAP_USE_DEVICE_PTR) @@ -610,7 +626,7 @@ gomp_map_vars_async (struct gomp_device_descr *devicep, = (void *) (n->tgt->tgt_start + n->tgt_offset + cur_node.host_start); tgt->list[i].key = NULL; - tgt->list[i].offset = ~(uintptr_t) 0; + tgt->list[i].offset = OFFSET_INLINED; continue; } else if ((kind & typemask) == GOMP_MAP_STRUCT) @@ -621,7 +637,7 @@ gomp_map_vars_async (struct gomp_device_descr *devicep, cur_node.host_end = (uintptr_t) hostaddrs[last] + sizes[last]; tgt->list[i].key = NULL; - tgt->list[i].offset = ~(uintptr_t) 2; + tgt->list[i].offset = OFFSET_STRUCT; splay_tree_key n = splay_tree_lookup (mem_map, &cur_node); if (n == NULL) { @@ -654,7 +670,7 @@ gomp_map_vars_async (struct gomp_device_descr *devicep, else if ((kind & typemask) == GOMP_MAP_ALWAYS_POINTER) { tgt->list[i].key = NULL; - tgt->list[i].offset = ~(uintptr_t) 1; + tgt->list[i].offset = OFFSET_POINTER; has_firstprivate = true; continue; } @@ -684,7 +700,7 @@ gomp_map_vars_async (struct gomp_device_descr *devicep, if (!n) { tgt->list[i].key = NULL; - tgt->list[i].offset = ~(uintptr_t) 1; + tgt->list[i].offset = OFFSET_POINTER; continue; } } @@ -759,7 +775,7 @@ gomp_map_vars_async (struct gomp_device_descr *devicep, if (cbuf.chunk_cnt > 0) { cbuf.buf - = malloc (cbuf.chunks[2 * cbuf.chunk_cnt - 1] - cbuf.chunks[0]); + = malloc (cbuf.chunks[cbuf.chunk_cnt-1].end - cbuf.chunks[0].start); if (cbuf.buf) { cbuf.tgt = tgt; @@ -876,6 +892,8 @@ gomp_map_vars_async (struct gomp_device_descr *devicep, else k->host_end = k->host_start + sizeof (void *); splay_tree_key n = splay_tree_lookup (mem_map, k); + /* Need to account for the case where a struct field hasn't been + mapped onto the accelerator yet. */ if (n && n->refcount != REFCOUNT_LINK) gomp_map_vars_existing (devicep, aq, n, k, &tgt->list[i], kind & typemask, cbufp); @@ -892,12 +910,12 @@ gomp_map_vars_async (struct gomp_device_descr *devicep, size_t align = (size_t) 1 << (kind >> rshift); tgt->list[i].key = k; k->tgt = tgt; - if (field_tgt_clear != ~(size_t) 0) + if (field_tgt_clear != FIELD_TGT_EMPTY) { k->tgt_offset = k->host_start - field_tgt_base + field_tgt_offset; if (i == field_tgt_clear) - field_tgt_clear = ~(size_t) 0; + field_tgt_clear = FIELD_TGT_EMPTY; } else { @@ -1035,9 +1053,10 @@ gomp_map_vars_async (struct gomp_device_descr *devicep, long c = 0; for (c = 0; c < cbuf.chunk_cnt; ++c) gomp_copy_host2dev (devicep, aq, - (void *) (tgt->tgt_start + cbuf.chunks[2 * c]), - (char *) cbuf.buf + (cbuf.chunks[2 * c] - cbuf.chunks[0]), - cbuf.chunks[2 * c + 1] - cbuf.chunks[2 * c], NULL); + (void *) (tgt->tgt_start + cbuf.chunks[c].start), + (char *) cbuf.buf + (cbuf.chunks[c].start + - cbuf.chunks[0].start), + cbuf.chunks[c].end - cbuf.chunks[c].start, NULL); free (cbuf.buf); } From patchwork Sat Nov 10 17:11:19 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Julian Brown X-Patchwork-Id: 995938 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (mailfrom) smtp.mailfrom=gcc.gnu.org (client-ip=209.132.180.131; helo=sourceware.org; envelope-from=gcc-patches-return-489623-incoming=patchwork.ozlabs.org@gcc.gnu.org; receiver=) Authentication-Results: ozlabs.org; dmarc=none (p=none dis=none) header.from=codesourcery.com Authentication-Results: ozlabs.org; dkim=pass (1024-bit key; unprotected) header.d=gcc.gnu.org header.i=@gcc.gnu.org header.b="b87ytaVY"; dkim-atps=neutral Received: from sourceware.org (server1.sourceware.org [209.132.180.131]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 42sk941hzmz9s55 for ; Sun, 11 Nov 2018 04:12:23 +1100 (AEDT) DomainKey-Signature: a=rsa-sha1; c=nofws; d=gcc.gnu.org; h=list-id :list-unsubscribe:list-archive:list-post:list-help:sender:from :to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-type; q=dns; s=default; b=Agy/sct3YZTmNKoO mA2ayQJJ9rJGWVRhTN2eocCeuNJF1ee3TQysrU6IXk+siFm8PTaIcJlc7Xyip++B 7AsYdksWul6SkgcwxLPJIwRPpDGikpg2wvdx3Lcjf2yA8VaMd0RJjNILvnGt0AS+ +qyqbjm5lWfJc4UhGYOaq74fee4= DKIM-Signature: v=1; a=rsa-sha1; c=relaxed; d=gcc.gnu.org; h=list-id :list-unsubscribe:list-archive:list-post:list-help:sender:from :to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-type; s=default; bh=PIH6nHYAmqdIFVNx0DIQ63 z2Fhk=; b=b87ytaVY3kJLzTDQo5R7fFq8lAuJ08AK4+m910Nm2k4A3zWckgSJeQ QumLOI/zr2b7CAgzjjoqYiyi1pyMwvcsAbcCP4MhMuK3iTHIk8O8N55RzcA6L62+ abxwCuUzvvad3JE6Jx5e2u6FSK+EQmQ2PBg3KqtwvF6+2vIqMaMyg= Received: (qmail 101108 invoked by alias); 10 Nov 2018 17:11:45 -0000 Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm Precedence: bulk List-Id: List-Unsubscribe: List-Archive: List-Post: List-Help: Sender: gcc-patches-owner@gcc.gnu.org Delivered-To: mailing list gcc-patches@gcc.gnu.org Received: (qmail 100975 invoked by uid 89); 10 Nov 2018 17:11:45 -0000 Authentication-Results: sourceware.org; auth=none X-Spam-SWARE-Status: No, score=-26.0 required=5.0 tests=AWL, BAYES_00, GIT_PATCH_0, GIT_PATCH_1, GIT_PATCH_2, GIT_PATCH_3, RCVD_IN_DNSWL_NONE, SPF_PASS autolearn=ham version=3.3.2 spammy=RELEASE, think!, array_type, ARRAY_TYPE X-HELO: relay1.mentorg.com Received: from relay1.mentorg.com (HELO relay1.mentorg.com) (192.94.38.131) by sourceware.org (qpsmtpd/0.93/v0.84-503-g423c35a) with ESMTP; Sat, 10 Nov 2018 17:11:41 +0000 Received: from nat-ies.mentorg.com ([192.94.31.2] helo=SVR-IES-MBX-04.mgc.mentorg.com) by relay1.mentorg.com with esmtps (TLSv1.2:ECDHE-RSA-AES256-SHA384:256) id 1gLWnM-0005cL-0B from Julian_Brown@mentor.com ; Sat, 10 Nov 2018 09:11:40 -0800 Received: from localhost.localdomain (147.34.91.1) by SVR-IES-MBX-04.mgc.mentorg.com (139.181.222.4) with Microsoft SMTP Server (TLS) id 15.0.1320.4; Sat, 10 Nov 2018 17:11:35 +0000 From: Julian Brown To: CC: , , , Subject: [PATCH 2/3] Factor out duplicate code in gimplify_scan_omp_clauses Date: Sat, 10 Nov 2018 09:11:19 -0800 Message-ID: In-Reply-To: References: MIME-Version: 1.0 X-IsSubscribed: yes This patch, created while trying to figure out the open-coded linked-list handling in gimplify_scan_omp_clauses, factors out four somewhat repetitive portions of that function into two new outlined functions. This was done largely mechanically; the actual lines of executed code are more-or-less the same. That means the interfaces to the new functions is somewhat eccentric though, and could no doubt be improved. I've tried to add commentary to the best of my understanding, but suggestions for improvements are welcome! As a bonus, one apparent bug introduced during an earlier refactoring to use the polynomial types has been fixed (I think!): "known_eq (o1, 2)" should have been "known_eq (o1, o2)". Tested alongside other patches in this series and the async patches. OK? ChangeLog gcc/ * gimplify.c (insert_struct_component_mapping) (check_base_and_compare_lt): New. (gimplify_scan_omp_clauses): Outline duplicated code into calls to above two functions. --- gcc/gimplify.c | 307 ++++++++++++++++++++++++++++++++------------------------- 1 file changed, 174 insertions(+), 133 deletions(-) diff --git a/gcc/gimplify.c b/gcc/gimplify.c index 61dca24..274edc0 100644 --- a/gcc/gimplify.c +++ b/gcc/gimplify.c @@ -7967,6 +7967,160 @@ gimplify_omp_depend (tree *list_p, gimple_seq *pre_p) return 1; } +/* Insert a GOMP_MAP_ALLOC or GOMP_MAP_RELEASE node following a + GOMP_MAP_STRUCT mapping. C is an always_pointer mapping. STRUCT_NODE is + the struct node to insert the new mapping after (when the struct node is + initially created). PREV_NODE is the first of two or three mappings for a + pointer, and is either: + - the node before C, when a pair of mappings is used, e.g. for a C/C++ + array section. + - not the node before C. This is true when we have a reference-to-pointer + type (with a mapping for the reference and for the pointer), or for + Fortran derived-type mappings with a GOMP_MAP_TO_PSET. + If SCP is non-null, the new node is inserted before *SCP. + if SCP is null, the new node is inserted before PREV_NODE. + The return type is: + - PREV_NODE, if SCP is non-null. + - The newly-created ALLOC or RELEASE node, if SCP is null. + - The second newly-created ALLOC or RELEASE node, if we are mapping a + reference to a pointer. */ + +static tree +insert_struct_component_mapping (enum tree_code code, tree c, tree struct_node, + tree prev_node, tree *scp) +{ + enum gomp_map_kind mkind = (code == OMP_TARGET_EXIT_DATA + || code == OACC_EXIT_DATA) + ? GOMP_MAP_RELEASE : GOMP_MAP_ALLOC; + + tree c2 = build_omp_clause (OMP_CLAUSE_LOCATION (c), OMP_CLAUSE_MAP); + tree cl = scp ? prev_node : c2; + OMP_CLAUSE_SET_MAP_KIND (c2, mkind); + OMP_CLAUSE_DECL (c2) = unshare_expr (OMP_CLAUSE_DECL (c)); + OMP_CLAUSE_CHAIN (c2) = scp ? *scp : prev_node; + OMP_CLAUSE_SIZE (c2) = TYPE_SIZE_UNIT (ptr_type_node); + if (struct_node) + OMP_CLAUSE_CHAIN (struct_node) = c2; + + /* We might need to create an additional mapping if we have a reference to a + pointer (in C++). Don't do this if we have something other than a + GOMP_MAP_ALWAYS_POINTER though, i.e. a GOMP_MAP_TO_PSET. */ + if (OMP_CLAUSE_CHAIN (prev_node) != c + && OMP_CLAUSE_CODE (OMP_CLAUSE_CHAIN (prev_node)) == OMP_CLAUSE_MAP + && (OMP_CLAUSE_MAP_KIND (OMP_CLAUSE_CHAIN (prev_node)) + == GOMP_MAP_ALWAYS_POINTER)) + { + tree c4 = OMP_CLAUSE_CHAIN (prev_node); + tree c3 = build_omp_clause (OMP_CLAUSE_LOCATION (c), OMP_CLAUSE_MAP); + OMP_CLAUSE_SET_MAP_KIND (c3, mkind); + OMP_CLAUSE_DECL (c3) = unshare_expr (OMP_CLAUSE_DECL (c4)); + OMP_CLAUSE_SIZE (c3) = TYPE_SIZE_UNIT (ptr_type_node); + OMP_CLAUSE_CHAIN (c3) = prev_node; + if (!scp) + OMP_CLAUSE_CHAIN (c2) = c3; + else + cl = c3; + } + + if (scp) + *scp = c2; + + return cl; +} + +/* Called initially with ORIG_BASE non-null, sets PREV_BITPOS and PREV_POFFSET + to the offset of the field given in BASE. Return type is 1 if BASE is equal + to *ORIG_BASE after stripping off ARRAY_REF and INDIRECT_REF nodes and + calling get_inner_reference, else 0. + + Called subsequently with ORIG_BASE null, compares the offset of the field + given in BASE to PREV_BITPOS, PREV_POFFSET. Returns -1 if the base object + has changed, 0 if the new value has a higher bit position than that + described by the aforementioned arguments, or 1 if the new value is less + than them. Used for (insertion) sorting components after a GOMP_MAP_STRUCT + mapping. */ + +static int +check_base_and_compare_lt (tree base, tree *orig_base, tree decl, + poly_int64 *prev_bitpos, + poly_offset_int *prev_poffset) +{ + tree offset; + poly_int64 bitsize, bitpos; + machine_mode mode; + int unsignedp, reversep, volatilep = 0; + poly_offset_int poffset; + + if (orig_base) + { + while (TREE_CODE (base) == ARRAY_REF) + base = TREE_OPERAND (base, 0); + + if (TREE_CODE (base) == INDIRECT_REF) + base = TREE_OPERAND (base, 0); + } + else + { + if (TREE_CODE (base) == ARRAY_REF) + { + while (TREE_CODE (base) == ARRAY_REF) + base = TREE_OPERAND (base, 0); + if (TREE_CODE (base) != COMPONENT_REF + || TREE_CODE (TREE_TYPE (base)) != ARRAY_TYPE) + return -1; + } + else if (TREE_CODE (base) == INDIRECT_REF + && TREE_CODE (TREE_OPERAND (base, 0)) == COMPONENT_REF + && (TREE_CODE (TREE_TYPE (TREE_OPERAND (base, 0))) + == REFERENCE_TYPE)) + base = TREE_OPERAND (base, 0); + } + + base = get_inner_reference (base, &bitsize, &bitpos, &offset, &mode, + &unsignedp, &reversep, &volatilep); + + if (orig_base) + *orig_base = base; + + if ((TREE_CODE (base) == INDIRECT_REF + || (TREE_CODE (base) == MEM_REF + && integer_zerop (TREE_OPERAND (base, 1)))) + && DECL_P (TREE_OPERAND (base, 0)) + && TREE_CODE (TREE_TYPE (TREE_OPERAND (base, 0))) == REFERENCE_TYPE) + base = TREE_OPERAND (base, 0); + + gcc_assert (offset == NULL_TREE || poly_int_tree_p (offset)); + + if (offset) + poffset = wi::to_poly_offset (offset); + else + poffset = 0; + + if (maybe_ne (bitpos, 0)) + poffset += bits_to_bytes_round_down (bitpos); + + if (orig_base) + { + gcc_assert (base == decl); + + *prev_bitpos = bitpos; + *prev_poffset = poffset; + + return *orig_base == base; + } + else + { + if (base != decl) + return -1; + + return (maybe_lt (*prev_poffset, poffset) + || (known_eq (*prev_poffset, poffset) + && maybe_lt (*prev_bitpos, bitpos))); + } + + return 0; +} + /* Scan the OMP clauses in *LIST_P, installing mappings into a new and previous omp contexts. */ @@ -8474,29 +8628,13 @@ gimplify_scan_omp_clauses (tree *list_p, gimple_seq *pre_p, } } - tree offset; - poly_int64 bitsize, bitpos; - machine_mode mode; - int unsignedp, reversep, volatilep = 0; - tree base = OMP_CLAUSE_DECL (c); - while (TREE_CODE (base) == ARRAY_REF) - base = TREE_OPERAND (base, 0); - if (TREE_CODE (base) == INDIRECT_REF) - base = TREE_OPERAND (base, 0); - base = get_inner_reference (base, &bitsize, &bitpos, &offset, - &mode, &unsignedp, &reversep, - &volatilep); - tree orig_base = base; - if ((TREE_CODE (base) == INDIRECT_REF - || (TREE_CODE (base) == MEM_REF - && integer_zerop (TREE_OPERAND (base, 1)))) - && DECL_P (TREE_OPERAND (base, 0)) - && (TREE_CODE (TREE_TYPE (TREE_OPERAND (base, 0))) - == REFERENCE_TYPE)) - base = TREE_OPERAND (base, 0); - gcc_assert (base == decl - && (offset == NULL_TREE - || poly_int_tree_p (offset))); + tree orig_base; + poly_int64 bitpos1; + poly_offset_int offset1; + + int base_eq_orig_base + = check_base_and_compare_lt (OMP_CLAUSE_DECL (c), + &orig_base, decl, &bitpos1, &offset1); splay_tree_node n = splay_tree_lookup (ctx->variables, (splay_tree_key)decl); @@ -8507,7 +8645,7 @@ gimplify_scan_omp_clauses (tree *list_p, gimple_seq *pre_p, tree l = build_omp_clause (OMP_CLAUSE_LOCATION (c), OMP_CLAUSE_MAP); OMP_CLAUSE_SET_MAP_KIND (l, GOMP_MAP_STRUCT); - if (orig_base != base) + if (!base_eq_orig_base) OMP_CLAUSE_DECL (l) = unshare_expr (orig_base); else OMP_CLAUSE_DECL (l) = decl; @@ -8517,32 +8655,8 @@ gimplify_scan_omp_clauses (tree *list_p, gimple_seq *pre_p, struct_map_to_clause->put (decl, l); if (ptr) { - enum gomp_map_kind mkind - = code == OMP_TARGET_EXIT_DATA - ? GOMP_MAP_RELEASE : GOMP_MAP_ALLOC; - tree c2 = build_omp_clause (OMP_CLAUSE_LOCATION (c), - OMP_CLAUSE_MAP); - OMP_CLAUSE_SET_MAP_KIND (c2, mkind); - OMP_CLAUSE_DECL (c2) - = unshare_expr (OMP_CLAUSE_DECL (c)); - OMP_CLAUSE_CHAIN (c2) = *prev_list_p; - OMP_CLAUSE_SIZE (c2) - = TYPE_SIZE_UNIT (ptr_type_node); - OMP_CLAUSE_CHAIN (l) = c2; - if (OMP_CLAUSE_CHAIN (*prev_list_p) != c) - { - tree c4 = OMP_CLAUSE_CHAIN (*prev_list_p); - tree c3 - = build_omp_clause (OMP_CLAUSE_LOCATION (c), - OMP_CLAUSE_MAP); - OMP_CLAUSE_SET_MAP_KIND (c3, mkind); - OMP_CLAUSE_DECL (c3) - = unshare_expr (OMP_CLAUSE_DECL (c4)); - OMP_CLAUSE_SIZE (c3) - = TYPE_SIZE_UNIT (ptr_type_node); - OMP_CLAUSE_CHAIN (c3) = *prev_list_p; - OMP_CLAUSE_CHAIN (c2) = c3; - } + insert_struct_component_mapping (code, c, l, + *prev_list_p, NULL); *prev_list_p = l; prev_list_p = NULL; } @@ -8552,7 +8666,7 @@ gimplify_scan_omp_clauses (tree *list_p, gimple_seq *pre_p, *list_p = l; list_p = &OMP_CLAUSE_CHAIN (l); } - if (orig_base != base && code == OMP_TARGET) + if (!base_eq_orig_base && code == OMP_TARGET) { tree c2 = build_omp_clause (OMP_CLAUSE_LOCATION (c), OMP_CLAUSE_MAP); @@ -8575,13 +8689,6 @@ gimplify_scan_omp_clauses (tree *list_p, gimple_seq *pre_p, tree *sc = NULL, *scp = NULL; if (GOMP_MAP_ALWAYS_P (OMP_CLAUSE_MAP_KIND (c)) || ptr) n->value |= GOVD_SEEN; - poly_offset_int o1, o2; - if (offset) - o1 = wi::to_poly_offset (offset); - else - o1 = 0; - if (maybe_ne (bitpos, 0)) - o1 += bits_to_bytes_round_down (bitpos); sc = &OMP_CLAUSE_CHAIN (*osc); if (*sc != c && (OMP_CLAUSE_MAP_KIND (*sc) @@ -8599,44 +8706,14 @@ gimplify_scan_omp_clauses (tree *list_p, gimple_seq *pre_p, break; else { - tree offset2; - poly_int64 bitsize2, bitpos2; - base = OMP_CLAUSE_DECL (*sc); - if (TREE_CODE (base) == ARRAY_REF) - { - while (TREE_CODE (base) == ARRAY_REF) - base = TREE_OPERAND (base, 0); - if (TREE_CODE (base) != COMPONENT_REF - || (TREE_CODE (TREE_TYPE (base)) - != ARRAY_TYPE)) - break; - } - else if (TREE_CODE (base) == INDIRECT_REF - && (TREE_CODE (TREE_OPERAND (base, 0)) - == COMPONENT_REF) - && (TREE_CODE (TREE_TYPE - (TREE_OPERAND (base, 0))) - == REFERENCE_TYPE)) - base = TREE_OPERAND (base, 0); - base = get_inner_reference (base, &bitsize2, - &bitpos2, &offset2, - &mode, &unsignedp, - &reversep, &volatilep); - if ((TREE_CODE (base) == INDIRECT_REF - || (TREE_CODE (base) == MEM_REF - && integer_zerop (TREE_OPERAND (base, - 1)))) - && DECL_P (TREE_OPERAND (base, 0)) - && (TREE_CODE (TREE_TYPE (TREE_OPERAND (base, - 0))) - == REFERENCE_TYPE)) - base = TREE_OPERAND (base, 0); - if (base != decl) + int same_decl_offset_lt + = check_base_and_compare_lt ( + OMP_CLAUSE_DECL (*sc), NULL, decl, + &bitpos1, &offset1); + if (same_decl_offset_lt == -1) break; if (scp) continue; - gcc_assert (offset == NULL_TREE - || poly_int_tree_p (offset)); tree d1 = OMP_CLAUSE_DECL (*sc); tree d2 = OMP_CLAUSE_DECL (c); while (TREE_CODE (d1) == ARRAY_REF) @@ -8665,14 +8742,7 @@ gimplify_scan_omp_clauses (tree *list_p, gimple_seq *pre_p, remove = true; break; } - if (offset2) - o2 = wi::to_poly_offset (offset2); - else - o2 = 0; - o2 += bits_to_bytes_round_down (bitpos2); - if (maybe_lt (o1, o2) - || (known_eq (o1, 2) - && maybe_lt (bitpos, bitpos2))) + if (same_decl_offset_lt) { if (ptr) scp = sc; @@ -8687,38 +8757,9 @@ gimplify_scan_omp_clauses (tree *list_p, gimple_seq *pre_p, size_one_node); if (ptr) { - tree c2 = build_omp_clause (OMP_CLAUSE_LOCATION (c), - OMP_CLAUSE_MAP); - tree cl = NULL_TREE; - enum gomp_map_kind mkind - = code == OMP_TARGET_EXIT_DATA - ? GOMP_MAP_RELEASE : GOMP_MAP_ALLOC; - OMP_CLAUSE_SET_MAP_KIND (c2, mkind); - OMP_CLAUSE_DECL (c2) - = unshare_expr (OMP_CLAUSE_DECL (c)); - OMP_CLAUSE_CHAIN (c2) = scp ? *scp : *prev_list_p; - OMP_CLAUSE_SIZE (c2) - = TYPE_SIZE_UNIT (ptr_type_node); - cl = scp ? *prev_list_p : c2; - if (OMP_CLAUSE_CHAIN (*prev_list_p) != c) - { - tree c4 = OMP_CLAUSE_CHAIN (*prev_list_p); - tree c3 - = build_omp_clause (OMP_CLAUSE_LOCATION (c), - OMP_CLAUSE_MAP); - OMP_CLAUSE_SET_MAP_KIND (c3, mkind); - OMP_CLAUSE_DECL (c3) - = unshare_expr (OMP_CLAUSE_DECL (c4)); - OMP_CLAUSE_SIZE (c3) - = TYPE_SIZE_UNIT (ptr_type_node); - OMP_CLAUSE_CHAIN (c3) = *prev_list_p; - if (!scp) - OMP_CLAUSE_CHAIN (c2) = c3; - else - cl = c3; - } - if (scp) - *scp = c2; + tree cl + = insert_struct_component_mapping (code, c, NULL, + *prev_list_p, scp); if (sc == prev_list_p) { *sc = cl; From patchwork Sat Nov 10 17:11:20 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Julian Brown X-Patchwork-Id: 995939 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (mailfrom) smtp.mailfrom=gcc.gnu.org (client-ip=209.132.180.131; helo=sourceware.org; envelope-from=gcc-patches-return-489624-incoming=patchwork.ozlabs.org@gcc.gnu.org; receiver=) Authentication-Results: ozlabs.org; dmarc=none (p=none dis=none) header.from=codesourcery.com Authentication-Results: ozlabs.org; dkim=pass (1024-bit key; unprotected) header.d=gcc.gnu.org header.i=@gcc.gnu.org header.b="E4QyNiE4"; dkim-atps=neutral Received: from sourceware.org (server1.sourceware.org [209.132.180.131]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 42sk9R2Sy3z9s9G for ; Sun, 11 Nov 2018 04:12:43 +1100 (AEDT) DomainKey-Signature: a=rsa-sha1; c=nofws; d=gcc.gnu.org; h=list-id :list-unsubscribe:list-archive:list-post:list-help:sender:from :to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-type; q=dns; s=default; b=Mv2Sloy1OLFId1P/ T2gLtkVB191rdB9IPLfQNkBhzhcGtdzJQwvpAuyLTojgmmYKqV9AX8s555mBtzXH TIquaYgo42JbYVBDPseKIlzcrX2fRQtpnNGDCimdD91suHOwZN01sTWoGpA9FG05 3Fap0pGPuya4qbMOHUmSSkn/5K8= DKIM-Signature: v=1; a=rsa-sha1; c=relaxed; d=gcc.gnu.org; h=list-id :list-unsubscribe:list-archive:list-post:list-help:sender:from :to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-type; s=default; bh=Ept6u/HBlWCJuHQAWsxWO7 I46qU=; b=E4QyNiE4t3SXw4F6hDtbCkzxQqvnXMJP8OJP2dd3bOTqhEDnz62qzh hUkiFAxuFkRPXKDooc3F/KbRhY1XS2dI9baHAf9b8AVTJXB60jQZS70VxZFedCj6 qHNOqVuaLM2thRZ8LlaJIZ0u8Av8lzaH+hGWXLiKC9sK7ON9Qzot4= Received: (qmail 103055 invoked by alias); 10 Nov 2018 17:12:02 -0000 Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm Precedence: bulk List-Id: List-Unsubscribe: List-Archive: List-Post: List-Help: Sender: gcc-patches-owner@gcc.gnu.org Delivered-To: mailing list gcc-patches@gcc.gnu.org Received: (qmail 102527 invoked by uid 89); 10 Nov 2018 17:11:58 -0000 Authentication-Results: sourceware.org; auth=none X-Spam-SWARE-Status: No, score=-26.9 required=5.0 tests=BAYES_00, GIT_PATCH_0, GIT_PATCH_1, GIT_PATCH_2, GIT_PATCH_3, RCVD_IN_DNSWL_NONE, SPF_PASS, TIME_LIMIT_EXCEEDED autolearn=unavailable version=3.3.2 spammy=shorts X-HELO: relay1.mentorg.com Received: from relay1.mentorg.com (HELO relay1.mentorg.com) (192.94.38.131) by sourceware.org (qpsmtpd/0.93/v0.84-503-g423c35a) with ESMTP; Sat, 10 Nov 2018 17:11:46 +0000 Received: from nat-ies.mentorg.com ([192.94.31.2] helo=SVR-IES-MBX-04.mgc.mentorg.com) by relay1.mentorg.com with esmtps (TLSv1.2:ECDHE-RSA-AES256-SHA384:256) id 1gLWnQ-0005cQ-3O from Julian_Brown@mentor.com ; Sat, 10 Nov 2018 09:11:45 -0800 Received: from localhost.localdomain (147.34.91.1) by SVR-IES-MBX-04.mgc.mentorg.com (139.181.222.4) with Microsoft SMTP Server (TLS) id 15.0.1320.4; Sat, 10 Nov 2018 17:11:38 +0000 From: Julian Brown To: CC: , , , Subject: [PATCH 3/3] OpenACC 2.6 manual deep copy support (attach/detach) Date: Sat, 10 Nov 2018 09:11:20 -0800 Message-ID: In-Reply-To: References: MIME-Version: 1.0 X-IsSubscribed: yes This patch implements the bulk of support for OpenACC 2.6 manual deep copy for the C, C++ and Fortran front-ends, the middle end and the libgomp runtime. I've incorporated parts of the patches previously posted by Cesar: https://gcc.gnu.org/ml/gcc-patches/2018-10/msg01941.html https://gcc.gnu.org/ml/gcc-patches/2018-10/msg01942.html https://gcc.gnu.org/ml/gcc-patches/2018-10/msg01943.html https://gcc.gnu.org/ml/gcc-patches/2018-10/msg01946.html The patch also supersedes the patch posted earlier to support OpenACC 2.5 "update" directives with Fortran derived types: https://gcc.gnu.org/ml/gcc-patches/2018-09/msg00153.html Some brief notes: * Struct members mapped with a tuple of map(to/from), optional pset and an always_pointer are rewritten in gimplify_scan_omp_clauses to use a new GOMP_MAP_ATTACH mapping type instead of the final GOMP_MAP_ALWAYS_POINTER. Explicit "attach" clauses also use the GOMP_MAP_ATTACH mapping, and explicit "detach" uses GOMP_MAP_DETACH. This means that the new "attach operation" takes place when, and only when, the GOMP_MAP_ATTACH appears explicitly in the list of clauses (as rewritten by gimplify.c). Similarly for GOMP_MAP_DETACH. * The runtime needs to keep track of potentially multiple "attachment counters" for each mapped struct/derived type. The way I've implemented this is as a simple array of shorts, where each element maps 1-to-1 onto logical "slots" in the mapped struct. The attachment counters are associated with the block of memory containing the structure in the host's address space, hence the array is allocated on-demand in the splay_tree_key_s structure. This does unfortunately grow that structure a little in all cases. Tested alongside the other patches in the series and bootstrapped. OK? Julian ChangeLog gcc/c-family/ * c-pragma.h (pragma_omp_clause): Add PRAGMA_OACC_CLAUSE_ATTACH, PRAGMA_OACC_CLAUSE_DETACH. gcc/c/ * c-parser.c (c_parser_omp_clause_name): Add parsing of attach and detach clauses. (c_parser_omp_variable_list): Allow deref (->) in variable lists. (c_parser_oacc_data_clause): Support attach and detach clauses. (c_parser_oacc_all_clauses): Likewise. (OACC_DATA_CLAUSE_MASK, OACC_ENTER_DATA_CLAUSE_MASK) (OACC_KERNELS_CLAUSE_MASK, OACC_PARALLEL_CLAUSE_MASK): Add PRAGMA_OACC_CLAUSE_ATTACH. (OACC_EXIT_DATA_CLAUSE_MASK): Add PRAGMA_OACC_CLAUSE_DETACH. * c-typeck.c (handle_omp_array_sections_1): Reject subarrays for attach and detach. Support deref. (c_oacc_check_attachments): New function. (c_finish_omp_clauses): Check attach/detach arguments for being pointers using above. Support deref. gcc/cp/ * parser.c (cp_parser_omp_clause_name): Support attach and detach clauses. (cp_parser_omp_var_list_no_open): Support deref. (cp_parser_oacc_data_clause): Support attach and detach clauses. (cp_parser_oacc_all_clauses): Likewise. (OACC_DATA_CLAUSE_MASK, OACC_ENTER_DATA_CLAUSE_MASK) (OACC_KERNELS_CLAUSE_MASK, OACC_PARALLEL_CLAUSE_MASK): Add PRAGMA_OACC_CLAUSE_ATTACH. (OACC_EXIT_DATA_CLAUSE_MASK): Add PRAGMA_OACC_CLAUSE_DETACH. * semantics.c (handle_omp_array_sections_1): Reject subarrays for attach and detach. (cp_oacc_check_attachments): New function. (finish_omp_clauses): Use above function. Allow structure fields and class members to appear in OpenACC data clauses. Support deref. gcc/fortran/ * gfortran.h (gfc_omp_map_op): Add OMP_MAP_ATTACH, OMP_MAP_DETACH. * openmp.c (gfc_match_omp_variable_list): Add allow_derived parameter. Parse derived-type member accesses if true. (omp_mask2): Add OMP_CLAUSE_ATTACH, OMP_CLAUSE_DETACH. (gfc_match_omp_map_clause): Add allow_derived parameter. Pass to gfc_match_omp_variable_list. (gfc_match_omp_clauses): Support attach and detach. Support derived types for appropriate OpenACC directives. (OACC_PARALLEL_CLAUSES, OACC_KERNELS_CLAUSES, OACC_DATA_CLAUSES) (OACC_ENTER_DATA_CLAUSES): Add OMP_CLAUSE_ATTACH. (OACC_EXIT_DATA_CLAUSES): Add OMP_CLAUSE_DETACH. (check_symbol_not_pointer): Don't disallow pointer objects of derived type. (resolve_oacc_data_clauses): Don't disallow allocatable derived types. (resolve_omp_clauses): Perform duplicate checking only for non-derived type component accesses (plain variables and arrays or array sections). Support component refs. * trans-openmp.c (gfc_omp_privatize_by_reference): Support component refs. (gfc_trans_omp_clauses): Support component refs, attach and detach clauses. gcc/ * gimplify.c (gimplify_omp_var_data): Add GOVD_MAP_HAS_ATTACHMENTS. (insert_struct_component_mapping): Support derived-type member mappings for arrays with descriptors which use GOMP_MAP_TO_PSET. (gimplify_scan_omp_clauses): Rewrite GOMP_MAP_ALWAYS_POINTER to GOMP_MAP_ATTACH for OpenACC struct/derived-type component pointers. Handle pointer mappings that use GOMP_MAP_TO_PSET. Handle attach/detach clauses. (gimplify_adjust_omp_clauses_1): Skip adjustments for explicit attach/detach clauses. (gimplify_omp_target_update): Handle finalize for detach. * omp-low.c (lower_omp_target): Support GOMP_MAP_ATTACH, GOMP_MAP_DETACH, GOMP_MAP_FORCE_DETACH. * tree-pretty-print.c (dump_omp_clause): Likewise. gcc/include/ * gomp-constants.h (GOMP_MAP_DEEP_COPY): Define. (gomp_map_kind): Add GOMP_MAP_ATTACH, GOMP_MAP_DETACH, GOMP_MAP_FORCE_DETACH. gcc/testsuite/ * c-c++-common/goacc/mdc-1.c: New test. * c-c++-common/goacc/mdc-2.c: New test. * gcc.dg/goacc/mdc.C: New test. * gfortran.dg/goacc/data-clauses.f95: New test. * gfortran.dg/goacc/derived-types.f90: New test. * gfortran.dg/goacc/enter-exit-data.f95: New test. libgomp/ * libgomp.h (struct target_var_desc): Add do_detach flag. (struct splay_tree_key_s): Add attach_count field. (struct gomp_coalesce_buf): Add forward declaration. (gomp_map_val, gomp_attach_pointer, gomp_detach_pointer): Add prototypes. (gomp_unmap_vars): Add finalize parameter. * libgomp.map (OACC_2.6): New section. Add acc_attach, acc_attach_async, acc_detach, acc_detach_async, acc_detach_finalize, acc_detach_finalize_async. * oacc-async.c (goacc_async_copyout_unmap_vars): Add finalize parameter. Pass to gomp_unmap_vars_async. * oacc-int.h (goacc_async_copyout_unmap_vars): Add finalize parameter. * oacc-mem.c (acc_unmap_data): Update call to gomp_unmap_vars. (present_create_copy): Initialise attach_count. (delete_copyout): Likewise. (gomp_acc_insert_pointer): Likewise. (gomp_acc_remove_pointer): Update calls to gomp_unmap_vars, goacc_async_copyout_unmap_vars. (acc_attach_async, acc_attach, goacc_detach_internal, acc_detach) (acc_detach_async, acc_detach_finalize, acc_detach_finalize_async): New functions. * oacc-parallel.c (find_pointer): Support attach/detach. Make a little more strict. (GOACC_parallel_keyed): Use gomp_map_val to calculate device addresses. Update calls to gomp_unmap_vars, goacc_async_copyout_unmap_vars. (GOACC_data_end): Update call to gomp_unmap_vars. (GOACC_enter_exit_data): Support attach/detach and GOMP_MAP_STRUCT. * openacc.h (acc_attach, acc_attach_async, acc_detach) (acc_detach_async, acc_detach_finalize, acc_detach_finalize_async): Add prototypes. * target.c (limits.h): Include. (gomp_map_vars_existing): Initialise do_detach field of tgt_var_desc. (gomp_attach_pointer, gomp_detach_pointer): New functions. (gomp_map_val): Make global. (gomp_map_vars_async): Support attach and detach. (gomp_remove_var): Free attach count array if present. (gomp_unmap_vars): Add finalize parameter. Update call to gomp_unmap_vars_async. (gomp_unmap_vars_async): Add finalize parameter. Add pointer detaching support. (GOMP_target): Update call to gomp_unmap_vars. (GOMP_target_ext): Likewise. (gomp_exit_data): Free attach count array if present. (gomp_target_task_fn): Update call to gomp_unmap_vars. * testsuite/libgomp.oacc-c-c++-common/deep-copy-1.c: New test. * testsuite/libgomp.oacc-c-c++-common/deep-copy-2.c: New test. * testsuite/libgomp.oacc-c-c++-common/deep-copy-3.c: New test. * testsuite/libgomp.oacc-c-c++-common/deep-copy-4.c: New test. * testsuite/libgomp.oacc-c-c++-common/deep-copy-5.c: New test. * testsuite/libgomp.oacc-fortran/deep-copy-1.c: New test. * testsuite/libgomp.oacc-fortran/deep-copy-2.c: New test. * testsuite/libgomp.oacc-fortran/deep-copy-3.c: New test. * testsuite/libgomp.oacc-fortran/deep-copy-4.c: New test. * testsuite/libgomp.oacc-fortran/deep-copy-5.c: New test. * testsuite/libgomp.oacc-fortran/deep-copy-6.c: New test. * testsuite/libgomp.oacc-fortran/deep-copy-7.c: New test. * testsuite/libgomp.oacc-fortran/deep-copy-8.c: New test. * testsuite/libgomp.oacc-fortran/derived-type-1.f90: New test. * testsuite/libgomp.oacc-fortran/update-2.f90: New test. --- gcc/c-family/c-pragma.h | 2 + gcc/c/c-parser.c | 34 ++- gcc/c/c-typeck.c | 59 ++++- gcc/cp/parser.c | 38 ++- gcc/cp/semantics.c | 75 +++++- gcc/fortran/gfortran.h | 2 + gcc/fortran/openmp.c | 145 +++++++---- gcc/fortran/trans-openmp.c | 78 ++++-- gcc/gimplify.c | 85 ++++-- gcc/omp-low.c | 3 + gcc/testsuite/c-c++-common/goacc/mdc-1.c | 54 ++++ gcc/testsuite/c-c++-common/goacc/mdc-2.c | 62 +++++ gcc/testsuite/g++.dg/goacc/mdc.C | 68 +++++ gcc/testsuite/gfortran.dg/goacc/data-clauses.f95 | 38 +-- gcc/testsuite/gfortran.dg/goacc/derived-types.f90 | 77 ++++++ .../gfortran.dg/goacc/enter-exit-data.f95 | 24 +- gcc/tree-pretty-print.c | 9 + include/gomp-constants.h | 8 + libgomp/libgomp.h | 18 +- libgomp/libgomp.map | 10 + libgomp/oacc-async.c | 4 +- libgomp/oacc-int.h | 2 +- libgomp/oacc-mem.c | 86 ++++++- libgomp/oacc-parallel.c | 220 ++++++++++++---- libgomp/openacc.h | 6 + libgomp/target.c | 191 +++++++++++++- .../libgomp.oacc-c-c++-common/deep-copy-1.c | 24 ++ .../libgomp.oacc-c-c++-common/deep-copy-2.c | 29 +++ .../libgomp.oacc-c-c++-common/deep-copy-3.c | 34 +++ .../libgomp.oacc-c-c++-common/deep-copy-4.c | 87 +++++++ .../libgomp.oacc-c-c++-common/deep-copy-5.c | 81 ++++++ .../testsuite/libgomp.oacc-fortran/deep-copy-1.f90 | 35 +++ .../testsuite/libgomp.oacc-fortran/deep-copy-2.f90 | 33 +++ .../testsuite/libgomp.oacc-fortran/deep-copy-3.f90 | 34 +++ .../testsuite/libgomp.oacc-fortran/deep-copy-4.f90 | 49 ++++ .../testsuite/libgomp.oacc-fortran/deep-copy-5.f90 | 57 +++++ .../testsuite/libgomp.oacc-fortran/deep-copy-6.f90 | 61 +++++ .../testsuite/libgomp.oacc-fortran/deep-copy-7.f90 | 89 +++++++ .../testsuite/libgomp.oacc-fortran/deep-copy-8.f90 | 41 +++ .../libgomp.oacc-fortran/derived-type-1.f90 | 28 ++ .../testsuite/libgomp.oacc-fortran/update-2.f90 | 284 +++++++++++++++++++++ 41 files changed, 2168 insertions(+), 196 deletions(-) create mode 100644 gcc/testsuite/c-c++-common/goacc/mdc-1.c create mode 100644 gcc/testsuite/c-c++-common/goacc/mdc-2.c create mode 100644 gcc/testsuite/g++.dg/goacc/mdc.C create mode 100644 gcc/testsuite/gfortran.dg/goacc/derived-types.f90 create mode 100644 libgomp/testsuite/libgomp.oacc-c-c++-common/deep-copy-1.c create mode 100644 libgomp/testsuite/libgomp.oacc-c-c++-common/deep-copy-2.c create mode 100644 libgomp/testsuite/libgomp.oacc-c-c++-common/deep-copy-3.c create mode 100644 libgomp/testsuite/libgomp.oacc-c-c++-common/deep-copy-4.c create mode 100644 libgomp/testsuite/libgomp.oacc-c-c++-common/deep-copy-5.c create mode 100644 libgomp/testsuite/libgomp.oacc-fortran/deep-copy-1.f90 create mode 100644 libgomp/testsuite/libgomp.oacc-fortran/deep-copy-2.f90 create mode 100644 libgomp/testsuite/libgomp.oacc-fortran/deep-copy-3.f90 create mode 100644 libgomp/testsuite/libgomp.oacc-fortran/deep-copy-4.f90 create mode 100644 libgomp/testsuite/libgomp.oacc-fortran/deep-copy-5.f90 create mode 100644 libgomp/testsuite/libgomp.oacc-fortran/deep-copy-6.f90 create mode 100644 libgomp/testsuite/libgomp.oacc-fortran/deep-copy-7.f90 create mode 100644 libgomp/testsuite/libgomp.oacc-fortran/deep-copy-8.f90 create mode 100644 libgomp/testsuite/libgomp.oacc-fortran/derived-type-1.f90 create mode 100644 libgomp/testsuite/libgomp.oacc-fortran/update-2.f90 diff --git a/gcc/c-family/c-pragma.h b/gcc/c-family/c-pragma.h index b781f73..dd8d807 100644 --- a/gcc/c-family/c-pragma.h +++ b/gcc/c-family/c-pragma.h @@ -136,11 +136,13 @@ enum pragma_omp_clause { /* Clauses for OpenACC. */ PRAGMA_OACC_CLAUSE_ASYNC, + PRAGMA_OACC_CLAUSE_ATTACH, PRAGMA_OACC_CLAUSE_AUTO, PRAGMA_OACC_CLAUSE_COPY, PRAGMA_OACC_CLAUSE_COPYOUT, PRAGMA_OACC_CLAUSE_CREATE, PRAGMA_OACC_CLAUSE_DELETE, + PRAGMA_OACC_CLAUSE_DETACH, PRAGMA_OACC_CLAUSE_DEVICEPTR, PRAGMA_OACC_CLAUSE_DEVICE_RESIDENT, PRAGMA_OACC_CLAUSE_FINALIZE, diff --git a/gcc/c/c-parser.c b/gcc/c/c-parser.c index 624d5a3..03a9e5b 100644 --- a/gcc/c/c-parser.c +++ b/gcc/c/c-parser.c @@ -11412,6 +11412,8 @@ c_parser_omp_clause_name (c_parser *parser) result = PRAGMA_OMP_CLAUSE_ALIGNED; else if (!strcmp ("async", p)) result = PRAGMA_OACC_CLAUSE_ASYNC; + else if (!strcmp ("attach", p)) + result = PRAGMA_OACC_CLAUSE_ATTACH; break; case 'c': if (!strcmp ("collapse", p)) @@ -11434,6 +11436,8 @@ c_parser_omp_clause_name (c_parser *parser) result = PRAGMA_OACC_CLAUSE_DELETE; else if (!strcmp ("depend", p)) result = PRAGMA_OMP_CLAUSE_DEPEND; + else if (!strcmp ("detach", p)) + result = PRAGMA_OACC_CLAUSE_DETACH; else if (!strcmp ("device", p)) result = PRAGMA_OMP_CLAUSE_DEVICE; else if (!strcmp ("deviceptr", p)) @@ -11804,9 +11808,12 @@ c_parser_omp_variable_list (c_parser *parser, case OMP_CLAUSE_MAP: case OMP_CLAUSE_FROM: case OMP_CLAUSE_TO: - while (c_parser_next_token_is (parser, CPP_DOT)) + while (c_parser_next_token_is (parser, CPP_DOT) + || c_parser_next_token_is (parser, CPP_DEREF)) { location_t op_loc = c_parser_peek_token (parser)->location; + if (c_parser_next_token_is (parser, CPP_DEREF)) + t = build_simple_mem_ref (t); c_parser_consume_token (parser); if (!c_parser_next_token_is (parser, CPP_NAME)) { @@ -11945,12 +11952,14 @@ c_parser_omp_var_list_parens (c_parser *parser, enum omp_clause_code kind, return list; } -/* OpenACC 2.0: +/* OpenACC 2.5: + attach (variable-list ) copy ( variable-list ) copyin ( variable-list ) copyout ( variable-list ) create ( variable-list ) delete ( variable-list ) + detach ( variable-list ) present ( variable-list ) */ static tree @@ -11960,6 +11969,9 @@ c_parser_oacc_data_clause (c_parser *parser, pragma_omp_clause c_kind, enum gomp_map_kind kind; switch (c_kind) { + case PRAGMA_OACC_CLAUSE_ATTACH: + kind = GOMP_MAP_ATTACH; + break; case PRAGMA_OACC_CLAUSE_COPY: kind = GOMP_MAP_TOFROM; break; @@ -11975,6 +11987,9 @@ c_parser_oacc_data_clause (c_parser *parser, pragma_omp_clause c_kind, case PRAGMA_OACC_CLAUSE_DELETE: kind = GOMP_MAP_RELEASE; break; + case PRAGMA_OACC_CLAUSE_DETACH: + kind = GOMP_MAP_DETACH; + break; case PRAGMA_OACC_CLAUSE_DEVICE: kind = GOMP_MAP_FORCE_TO; break; @@ -14551,6 +14566,10 @@ c_parser_oacc_all_clauses (c_parser *parser, omp_clause_mask mask, clauses); c_name = "auto"; break; + case PRAGMA_OACC_CLAUSE_ATTACH: + clauses = c_parser_oacc_data_clause (parser, c_kind, clauses); + c_name = "attach"; + break; case PRAGMA_OACC_CLAUSE_COLLAPSE: clauses = c_parser_omp_clause_collapse (parser, clauses); c_name = "collapse"; @@ -14579,6 +14598,10 @@ c_parser_oacc_all_clauses (c_parser *parser, omp_clause_mask mask, clauses = c_parser_omp_clause_default (parser, clauses, true); c_name = "default"; break; + case PRAGMA_OACC_CLAUSE_DETACH: + clauses = c_parser_oacc_data_clause (parser, c_kind, clauses); + c_name = "detach"; + break; case PRAGMA_OACC_CLAUSE_DEVICE: clauses = c_parser_oacc_data_clause (parser, c_kind, clauses); c_name = "device"; @@ -15057,7 +15080,8 @@ c_parser_oacc_cache (location_t loc, c_parser *parser) */ #define OACC_DATA_CLAUSE_MASK \ - ( (OMP_CLAUSE_MASK_1 << PRAGMA_OACC_CLAUSE_COPY) \ + ( (OMP_CLAUSE_MASK_1 << PRAGMA_OACC_CLAUSE_ATTACH) \ + | (OMP_CLAUSE_MASK_1 << PRAGMA_OACC_CLAUSE_COPY) \ | (OMP_CLAUSE_MASK_1 << PRAGMA_OACC_CLAUSE_COPYIN) \ | (OMP_CLAUSE_MASK_1 << PRAGMA_OACC_CLAUSE_COPYOUT) \ | (OMP_CLAUSE_MASK_1 << PRAGMA_OACC_CLAUSE_CREATE) \ @@ -15240,6 +15264,7 @@ c_parser_oacc_declare (c_parser *parser) #define OACC_ENTER_DATA_CLAUSE_MASK \ ( (OMP_CLAUSE_MASK_1 << PRAGMA_OACC_CLAUSE_IF) \ | (OMP_CLAUSE_MASK_1 << PRAGMA_OACC_CLAUSE_ASYNC) \ + | (OMP_CLAUSE_MASK_1 << PRAGMA_OACC_CLAUSE_ATTACH) \ | (OMP_CLAUSE_MASK_1 << PRAGMA_OACC_CLAUSE_COPYIN) \ | (OMP_CLAUSE_MASK_1 << PRAGMA_OACC_CLAUSE_CREATE) \ | (OMP_CLAUSE_MASK_1 << PRAGMA_OACC_CLAUSE_WAIT) ) @@ -15249,6 +15274,7 @@ c_parser_oacc_declare (c_parser *parser) | (OMP_CLAUSE_MASK_1 << PRAGMA_OACC_CLAUSE_ASYNC) \ | (OMP_CLAUSE_MASK_1 << PRAGMA_OACC_CLAUSE_COPYOUT) \ | (OMP_CLAUSE_MASK_1 << PRAGMA_OACC_CLAUSE_DELETE) \ + | (OMP_CLAUSE_MASK_1 << PRAGMA_OACC_CLAUSE_DETACH) \ | (OMP_CLAUSE_MASK_1 << PRAGMA_OACC_CLAUSE_FINALIZE) \ | (OMP_CLAUSE_MASK_1 << PRAGMA_OACC_CLAUSE_WAIT) ) @@ -15383,6 +15409,7 @@ c_parser_oacc_loop (location_t loc, c_parser *parser, char *p_name, #define OACC_KERNELS_CLAUSE_MASK \ ( (OMP_CLAUSE_MASK_1 << PRAGMA_OACC_CLAUSE_ASYNC) \ + | (OMP_CLAUSE_MASK_1 << PRAGMA_OACC_CLAUSE_ATTACH) \ | (OMP_CLAUSE_MASK_1 << PRAGMA_OACC_CLAUSE_COPY) \ | (OMP_CLAUSE_MASK_1 << PRAGMA_OACC_CLAUSE_COPYIN) \ | (OMP_CLAUSE_MASK_1 << PRAGMA_OACC_CLAUSE_COPYOUT) \ @@ -15398,6 +15425,7 @@ c_parser_oacc_loop (location_t loc, c_parser *parser, char *p_name, #define OACC_PARALLEL_CLAUSE_MASK \ ( (OMP_CLAUSE_MASK_1 << PRAGMA_OACC_CLAUSE_ASYNC) \ + | (OMP_CLAUSE_MASK_1 << PRAGMA_OACC_CLAUSE_ATTACH) \ | (OMP_CLAUSE_MASK_1 << PRAGMA_OACC_CLAUSE_COPY) \ | (OMP_CLAUSE_MASK_1 << PRAGMA_OACC_CLAUSE_COPYIN) \ | (OMP_CLAUSE_MASK_1 << PRAGMA_OACC_CLAUSE_COPYOUT) \ diff --git a/gcc/c/c-typeck.c b/gcc/c/c-typeck.c index 144977e..b2de3b4 100644 --- a/gcc/c/c-typeck.c +++ b/gcc/c/c-typeck.c @@ -12610,7 +12610,6 @@ handle_omp_array_sections_1 (tree c, tree t, vec &types, return error_mark_node; } if (TREE_CODE (t) == COMPONENT_REF - && ort == C_ORT_OMP && (OMP_CLAUSE_CODE (c) == OMP_CLAUSE_MAP || OMP_CLAUSE_CODE (c) == OMP_CLAUSE_TO || OMP_CLAUSE_CODE (c) == OMP_CLAUSE_FROM)) @@ -12632,6 +12631,8 @@ handle_omp_array_sections_1 (tree c, tree t, vec &types, } t = TREE_OPERAND (t, 0); } + if (TREE_CODE (t) == MEM_REF) + t = TREE_OPERAND (t, 0); } if (!VAR_P (t) && TREE_CODE (t) != PARM_DECL) { @@ -12716,7 +12717,19 @@ handle_omp_array_sections_1 (tree c, tree t, vec &types, length = fold_convert (sizetype, length); if (low_bound == NULL_TREE) low_bound = integer_zero_node; - + if (OMP_CLAUSE_CODE (c) == OMP_CLAUSE_MAP + && (OMP_CLAUSE_MAP_KIND (c) == GOMP_MAP_ATTACH + || OMP_CLAUSE_MAP_KIND (c) == GOMP_MAP_DETACH)) + { + if (length != integer_one_node) + { + error_at (OMP_CLAUSE_LOCATION (c), + OMP_CLAUSE_MAP_KIND (c) == GOMP_MAP_ATTACH + ? "array section in % clause" + : "array section in % clause"); + return error_mark_node; + } + } if (length != NULL_TREE) { if (!integer_nonzerop (length)) @@ -13393,6 +13406,37 @@ c_omp_finish_iterators (tree iter) return ret; } +/* Ensure that pointers are used in OpenACC attach and detach clauses. + Return true if an error has been detected. */ + +static bool +c_oacc_check_attachments (tree c) +{ + if (OMP_CLAUSE_CODE (c) != OMP_CLAUSE_MAP) + return false; + + /* OpenACC attach / detach clauses must be pointers. */ + if (OMP_CLAUSE_MAP_KIND (c) == GOMP_MAP_ATTACH + || OMP_CLAUSE_MAP_KIND (c) == GOMP_MAP_DETACH) + { + tree t = OMP_CLAUSE_DECL (c); + + while (TREE_CODE (t) == TREE_LIST) + t = TREE_CHAIN (t); + + if (TREE_CODE (TREE_TYPE (t)) != POINTER_TYPE) + { + error_at (OMP_CLAUSE_LOCATION (c), + OMP_CLAUSE_MAP_KIND (c) == GOMP_MAP_ATTACH + ? "expected pointer in % clause" + : "expected pointer in % clause"); + return true; + } + } + + return false; +} + /* For all elements of CLAUSES, validate them against their constraints. Remove any elements from the list that are invalid. */ @@ -14117,6 +14161,8 @@ c_finish_omp_clauses (tree clauses, enum c_omp_region_type ort) } } } + if (c_oacc_check_attachments (c)) + remove = true; break; } if (t == error_mark_node) @@ -14124,8 +14170,13 @@ c_finish_omp_clauses (tree clauses, enum c_omp_region_type ort) remove = true; break; } + /* OpenACC attach / detach clauses must be pointers. */ + if (c_oacc_check_attachments (c)) + { + remove = true; + break; + } if (TREE_CODE (t) == COMPONENT_REF - && (ort & C_ORT_OMP) && OMP_CLAUSE_CODE (c) != OMP_CLAUSE__CACHE_) { if (DECL_BIT_FIELD (TREE_OPERAND (t, 1))) @@ -14163,6 +14214,8 @@ c_finish_omp_clauses (tree clauses, enum c_omp_region_type ort) } if (remove) break; + if (TREE_CODE (t) == MEM_REF) + t = TREE_OPERAND (t, 0); if (VAR_P (t) || TREE_CODE (t) == PARM_DECL) { if (bitmap_bit_p (&map_field_head, DECL_UID (t))) diff --git a/gcc/cp/parser.c b/gcc/cp/parser.c index deaca5c..033c37a 100644 --- a/gcc/cp/parser.c +++ b/gcc/cp/parser.c @@ -31533,6 +31533,8 @@ cp_parser_omp_clause_name (cp_parser *parser) result = PRAGMA_OMP_CLAUSE_ALIGNED; else if (!strcmp ("async", p)) result = PRAGMA_OACC_CLAUSE_ASYNC; + else if (!strcmp ("attach", p)) + result = PRAGMA_OACC_CLAUSE_ATTACH; break; case 'c': if (!strcmp ("collapse", p)) @@ -31553,6 +31555,8 @@ cp_parser_omp_clause_name (cp_parser *parser) result = PRAGMA_OMP_CLAUSE_DEFAULTMAP; else if (!strcmp ("depend", p)) result = PRAGMA_OMP_CLAUSE_DEPEND; + else if (!strcmp ("detach", p)) + result = PRAGMA_OACC_CLAUSE_DETACH; else if (!strcmp ("device", p)) result = PRAGMA_OMP_CLAUSE_DEVICE; else if (!strcmp ("deviceptr", p)) @@ -31832,15 +31836,19 @@ cp_parser_omp_var_list_no_open (cp_parser *parser, enum omp_clause_code kind, case OMP_CLAUSE_MAP: case OMP_CLAUSE_FROM: case OMP_CLAUSE_TO: - while (cp_lexer_next_token_is (parser->lexer, CPP_DOT)) + while (cp_lexer_next_token_is (parser->lexer, CPP_DOT) + || cp_lexer_next_token_is (parser->lexer, CPP_DEREF)) { + cpp_ttype ttype + = cp_lexer_next_token_is (parser->lexer, CPP_DOT) + ? CPP_DOT : CPP_DEREF; location_t loc = cp_lexer_peek_token (parser->lexer)->location; cp_id_kind idk = CP_ID_KIND_NONE; cp_lexer_consume_token (parser->lexer); decl = convert_from_reference (decl); decl - = cp_parser_postfix_dot_deref_expression (parser, CPP_DOT, + = cp_parser_postfix_dot_deref_expression (parser, ttype, decl, false, &idk, loc); } @@ -31965,12 +31973,14 @@ cp_parser_omp_var_list (cp_parser *parser, enum omp_clause_code kind, tree list) return list; } -/* OpenACC 2.0: +/* OpenACC 2.5: + attach ( variable-list ) copy ( variable-list ) copyin ( variable-list ) copyout ( variable-list ) create ( variable-list ) delete ( variable-list ) + detach ( variable-list ) present ( variable-list ) */ static tree @@ -31980,6 +31990,9 @@ cp_parser_oacc_data_clause (cp_parser *parser, pragma_omp_clause c_kind, enum gomp_map_kind kind; switch (c_kind) { + case PRAGMA_OACC_CLAUSE_ATTACH: + kind = GOMP_MAP_ATTACH; + break; case PRAGMA_OACC_CLAUSE_COPY: kind = GOMP_MAP_TOFROM; break; @@ -31995,6 +32008,9 @@ cp_parser_oacc_data_clause (cp_parser *parser, pragma_omp_clause c_kind, case PRAGMA_OACC_CLAUSE_DELETE: kind = GOMP_MAP_RELEASE; break; + case PRAGMA_OACC_CLAUSE_DETACH: + kind = GOMP_MAP_DETACH; + break; case PRAGMA_OACC_CLAUSE_DEVICE: kind = GOMP_MAP_FORCE_TO; break; @@ -34338,6 +34354,10 @@ cp_parser_oacc_all_clauses (cp_parser *parser, omp_clause_mask mask, clauses, here); c_name = "auto"; break; + case PRAGMA_OACC_CLAUSE_ATTACH: + clauses = cp_parser_oacc_data_clause (parser, c_kind, clauses); + c_name = "attach"; + break; case PRAGMA_OACC_CLAUSE_COLLAPSE: clauses = cp_parser_omp_clause_collapse (parser, clauses, here); c_name = "collapse"; @@ -34366,6 +34386,10 @@ cp_parser_oacc_all_clauses (cp_parser *parser, omp_clause_mask mask, clauses = cp_parser_omp_clause_default (parser, clauses, here, true); c_name = "default"; break; + case PRAGMA_OACC_CLAUSE_DETACH: + clauses = cp_parser_oacc_data_clause (parser, c_kind, clauses); + c_name = "detach"; + break; case PRAGMA_OACC_CLAUSE_DEVICE: clauses = cp_parser_oacc_data_clause (parser, c_kind, clauses); c_name = "device"; @@ -38005,10 +38029,12 @@ cp_parser_oacc_cache (cp_parser *parser, cp_token *pragma_tok) structured-block */ #define OACC_DATA_CLAUSE_MASK \ - ( (OMP_CLAUSE_MASK_1 << PRAGMA_OACC_CLAUSE_COPY) \ + ( (OMP_CLAUSE_MASK_1 << PRAGMA_OACC_CLAUSE_ATTACH) \ + | (OMP_CLAUSE_MASK_1 << PRAGMA_OACC_CLAUSE_COPY) \ | (OMP_CLAUSE_MASK_1 << PRAGMA_OACC_CLAUSE_COPYIN) \ | (OMP_CLAUSE_MASK_1 << PRAGMA_OACC_CLAUSE_COPYOUT) \ | (OMP_CLAUSE_MASK_1 << PRAGMA_OACC_CLAUSE_CREATE) \ + | (OMP_CLAUSE_MASK_1 << PRAGMA_OACC_CLAUSE_DETACH) \ | (OMP_CLAUSE_MASK_1 << PRAGMA_OACC_CLAUSE_DEVICEPTR) \ | (OMP_CLAUSE_MASK_1 << PRAGMA_OACC_CLAUSE_IF) \ | (OMP_CLAUSE_MASK_1 << PRAGMA_OACC_CLAUSE_PRESENT) ) @@ -38208,6 +38234,7 @@ cp_parser_oacc_declare (cp_parser *parser, cp_token *pragma_tok) #define OACC_ENTER_DATA_CLAUSE_MASK \ ( (OMP_CLAUSE_MASK_1 << PRAGMA_OACC_CLAUSE_IF) \ + | (OMP_CLAUSE_MASK_1 << PRAGMA_OACC_CLAUSE_ATTACH) \ | (OMP_CLAUSE_MASK_1 << PRAGMA_OACC_CLAUSE_ASYNC) \ | (OMP_CLAUSE_MASK_1 << PRAGMA_OACC_CLAUSE_COPYIN) \ | (OMP_CLAUSE_MASK_1 << PRAGMA_OACC_CLAUSE_CREATE) \ @@ -38218,6 +38245,7 @@ cp_parser_oacc_declare (cp_parser *parser, cp_token *pragma_tok) | (OMP_CLAUSE_MASK_1 << PRAGMA_OACC_CLAUSE_ASYNC) \ | (OMP_CLAUSE_MASK_1 << PRAGMA_OACC_CLAUSE_COPYOUT) \ | (OMP_CLAUSE_MASK_1 << PRAGMA_OACC_CLAUSE_DELETE) \ + | (OMP_CLAUSE_MASK_1 << PRAGMA_OACC_CLAUSE_DETACH) \ | (OMP_CLAUSE_MASK_1 << PRAGMA_OACC_CLAUSE_FINALIZE) \ | (OMP_CLAUSE_MASK_1 << PRAGMA_OACC_CLAUSE_WAIT) ) @@ -38321,6 +38349,7 @@ cp_parser_oacc_loop (cp_parser *parser, cp_token *pragma_tok, char *p_name, #define OACC_KERNELS_CLAUSE_MASK \ ( (OMP_CLAUSE_MASK_1 << PRAGMA_OACC_CLAUSE_ASYNC) \ + | (OMP_CLAUSE_MASK_1 << PRAGMA_OACC_CLAUSE_ATTACH) \ | (OMP_CLAUSE_MASK_1 << PRAGMA_OACC_CLAUSE_COPY) \ | (OMP_CLAUSE_MASK_1 << PRAGMA_OACC_CLAUSE_COPYIN) \ | (OMP_CLAUSE_MASK_1 << PRAGMA_OACC_CLAUSE_COPYOUT) \ @@ -38336,6 +38365,7 @@ cp_parser_oacc_loop (cp_parser *parser, cp_token *pragma_tok, char *p_name, #define OACC_PARALLEL_CLAUSE_MASK \ ( (OMP_CLAUSE_MASK_1 << PRAGMA_OACC_CLAUSE_ASYNC) \ + | (OMP_CLAUSE_MASK_1 << PRAGMA_OACC_CLAUSE_ATTACH) \ | (OMP_CLAUSE_MASK_1 << PRAGMA_OACC_CLAUSE_COPY) \ | (OMP_CLAUSE_MASK_1 << PRAGMA_OACC_CLAUSE_COPYIN) \ | (OMP_CLAUSE_MASK_1 << PRAGMA_OACC_CLAUSE_COPYOUT) \ diff --git a/gcc/cp/semantics.c b/gcc/cp/semantics.c index 182d360..303bcc4 100644 --- a/gcc/cp/semantics.c +++ b/gcc/cp/semantics.c @@ -4568,7 +4568,6 @@ handle_omp_array_sections_1 (tree c, tree t, vec &types, t = TREE_OPERAND (t, 0); ret = t; if (TREE_CODE (t) == COMPONENT_REF - && ort == C_ORT_OMP && (OMP_CLAUSE_CODE (c) == OMP_CLAUSE_MAP || OMP_CLAUSE_CODE (c) == OMP_CLAUSE_TO || OMP_CLAUSE_CODE (c) == OMP_CLAUSE_FROM) @@ -4691,6 +4690,19 @@ handle_omp_array_sections_1 (tree c, tree t, vec &types, if (low_bound == NULL_TREE) low_bound = integer_zero_node; + if (OMP_CLAUSE_CODE (c) == OMP_CLAUSE_MAP + && (OMP_CLAUSE_MAP_KIND (c) == GOMP_MAP_ATTACH + || OMP_CLAUSE_MAP_KIND (c) == GOMP_MAP_DETACH)) + { + if (length != integer_one_node) + { + error_at (OMP_CLAUSE_LOCATION (c), + OMP_CLAUSE_MAP_KIND (c) == GOMP_MAP_ATTACH + ? "array section in % clause" + : "array section in % clause"); + return error_mark_node; + } + } if (length != NULL_TREE) { if (!integer_nonzerop (length)) @@ -6048,6 +6060,43 @@ cp_omp_finish_iterators (tree iter) return ret; } +/* Ensure that pointers are used in OpenACC attach and detach clauses. + Return true if an error has been detected. */ + +static bool +cp_oacc_check_attachments (tree c) +{ + if (OMP_CLAUSE_CODE (c) != OMP_CLAUSE_MAP) + return false; + + /* OpenACC attach / detach clauses must be pointers. */ + if (OMP_CLAUSE_MAP_KIND (c) == GOMP_MAP_ATTACH + || OMP_CLAUSE_MAP_KIND (c) == GOMP_MAP_DETACH) + { + tree t = OMP_CLAUSE_DECL (c); + tree type; + + while (TREE_CODE (t) == TREE_LIST) + t = TREE_CHAIN (t); + + type = TREE_TYPE (t); + + if (TREE_CODE (type) == REFERENCE_TYPE) + type = TREE_TYPE (type); + + if (TREE_CODE (type) != POINTER_TYPE) + { + error_at (OMP_CLAUSE_LOCATION (c), + OMP_CLAUSE_MAP_KIND (c) == GOMP_MAP_ATTACH + ? "expected pointer in % clause" + : "expected pointer in % clause"); + return true; + } + } + + return false; +} + /* For all elements of CLAUSES, validate them vs OpenMP constraints. Remove any elements from the list that are invalid. */ @@ -6288,7 +6337,7 @@ finish_omp_clauses (tree clauses, enum c_omp_region_type ort) t = OMP_CLAUSE_DECL (c); check_dup_generic_t: if (t == current_class_ptr - && (ort != C_ORT_OMP_DECLARE_SIMD + && ((ort != C_ORT_OMP_DECLARE_SIMD && ort != C_ORT_ACC) || (OMP_CLAUSE_CODE (c) != OMP_CLAUSE_LINEAR && OMP_CLAUSE_CODE (c) != OMP_CLAUSE_UNIFORM))) { @@ -6352,8 +6401,7 @@ finish_omp_clauses (tree clauses, enum c_omp_region_type ort) handle_field_decl: if (!remove && TREE_CODE (t) == FIELD_DECL - && t == OMP_CLAUSE_DECL (c) - && ort != C_ORT_ACC) + && t == OMP_CLAUSE_DECL (c)) { OMP_CLAUSE_DECL (c) = omp_privatize_field (t, (OMP_CLAUSE_CODE (c) @@ -6420,7 +6468,7 @@ finish_omp_clauses (tree clauses, enum c_omp_region_type ort) omp_note_field_privatization (t, OMP_CLAUSE_DECL (c)); else t = OMP_CLAUSE_DECL (c); - if (t == current_class_ptr) + if (ort != C_ORT_ACC && t == current_class_ptr) { error_at (OMP_CLAUSE_LOCATION (c), "% allowed in OpenMP only in %" @@ -6907,7 +6955,7 @@ finish_omp_clauses (tree clauses, enum c_omp_region_type ort) } if (t == error_mark_node) remove = true; - else if (t == current_class_ptr) + else if (ort != C_ORT_ACC && t == current_class_ptr) { error_at (OMP_CLAUSE_LOCATION (c), "% allowed in OpenMP only in %" @@ -7037,6 +7085,8 @@ finish_omp_clauses (tree clauses, enum c_omp_region_type ort) } } } + if (cp_oacc_check_attachments (c)) + remove = true; break; } if (t == error_mark_node) @@ -7044,14 +7094,25 @@ finish_omp_clauses (tree clauses, enum c_omp_region_type ort) remove = true; break; } + /* OpenACC attach / detach clauses must be pointers. */ + if (cp_oacc_check_attachments (c)) + { + remove = true; + break; + } if (REFERENCE_REF_P (t) && TREE_CODE (TREE_OPERAND (t, 0)) == COMPONENT_REF) { t = TREE_OPERAND (t, 0); OMP_CLAUSE_DECL (c) = t; } + if (ort == C_ORT_ACC + && TREE_CODE (t) == COMPONENT_REF + && TREE_CODE (TREE_OPERAND (t, 0)) == INDIRECT_REF) + t = TREE_OPERAND (TREE_OPERAND (t, 0), 0); if (TREE_CODE (t) == COMPONENT_REF - && (ort & C_ORT_OMP_DECLARE_SIMD) == C_ORT_OMP + && ((ort & C_ORT_OMP_DECLARE_SIMD) == C_ORT_OMP + || ort == C_ORT_ACC) && OMP_CLAUSE_CODE (c) != OMP_CLAUSE__CACHE_) { if (type_dependent_expression_p (t)) diff --git a/gcc/fortran/gfortran.h b/gcc/fortran/gfortran.h index d8ef35d..9f96418 100644 --- a/gcc/fortran/gfortran.h +++ b/gcc/fortran/gfortran.h @@ -1175,10 +1175,12 @@ enum gfc_omp_depend_op enum gfc_omp_map_op { OMP_MAP_ALLOC, + OMP_MAP_ATTACH, OMP_MAP_TO, OMP_MAP_FROM, OMP_MAP_TOFROM, OMP_MAP_DELETE, + OMP_MAP_DETACH, OMP_MAP_FORCE_ALLOC, OMP_MAP_FORCE_TO, OMP_MAP_FORCE_FROM, diff --git a/gcc/fortran/openmp.c b/gcc/fortran/openmp.c index 6430e61..ebba7ca 100644 --- a/gcc/fortran/openmp.c +++ b/gcc/fortran/openmp.c @@ -222,7 +222,8 @@ static match gfc_match_omp_variable_list (const char *str, gfc_omp_namelist **list, bool allow_common, bool *end_colon = NULL, gfc_omp_namelist ***headp = NULL, - bool allow_sections = false) + bool allow_sections = false, + bool allow_derived = false) { gfc_omp_namelist *head, *tail, *p; locus old_loc, cur_loc; @@ -248,7 +249,8 @@ gfc_match_omp_variable_list (const char *str, gfc_omp_namelist **list, case MATCH_YES: gfc_expr *expr; expr = NULL; - if (allow_sections && gfc_peek_ascii_char () == '(') + if ((allow_sections && gfc_peek_ascii_char () == '(') + || (allow_derived && gfc_peek_ascii_char () == '%')) { gfc_current_locus = cur_loc; m = gfc_match_variable (&expr, 0); @@ -785,7 +787,7 @@ enum omp_mask1 OMP_MASK1_LAST }; -/* OpenACC 2.0 specific clauses. */ +/* OpenACC 2.0+ specific clauses. */ enum omp_mask2 { OMP_CLAUSE_ASYNC, @@ -811,6 +813,8 @@ enum omp_mask2 OMP_CLAUSE_TILE, OMP_CLAUSE_IF_PRESENT, OMP_CLAUSE_FINALIZE, + OMP_CLAUSE_ATTACH, + OMP_CLAUSE_DETACH, /* This must come last. */ OMP_MASK2_LAST }; @@ -914,10 +918,12 @@ omp_inv_mask::omp_inv_mask (const omp_mask &m) : omp_mask (m) mapping. */ static bool -gfc_match_omp_map_clause (gfc_omp_namelist **list, gfc_omp_map_op map_op) +gfc_match_omp_map_clause (gfc_omp_namelist **list, gfc_omp_map_op map_op, + bool allow_derived = false) { gfc_omp_namelist **head = NULL; - if (gfc_match_omp_variable_list ("", list, false, NULL, &head, true) + if (gfc_match_omp_variable_list ("", list, false, NULL, &head, true, + allow_derived) == MATCH_YES) { gfc_omp_namelist *n; @@ -939,6 +945,14 @@ gfc_match_omp_clauses (gfc_omp_clauses **cp, const omp_mask mask, { gfc_omp_clauses *c = gfc_get_omp_clauses (); locus old_loc; + /* Determine whether we're dealing with an OpenACC directive that permits + derived type member accesses. This in particular disallows + "!$acc declare" from using such accesses, because it's not clear if/how + that should work. */ + bool allow_derived = (openacc + && ((mask & OMP_CLAUSE_ATTACH) + || (mask & OMP_CLAUSE_DETACH) + || (mask & OMP_CLAUSE_HOST_SELF))); gcc_checking_assert (OMP_MASK1_LAST <= 64 && OMP_MASK2_LAST <= 64); *cp = NULL; @@ -1012,6 +1026,11 @@ gfc_match_omp_clauses (gfc_omp_clauses **cp, const omp_mask mask, needs_space = true; continue; } + if ((mask & OMP_CLAUSE_ATTACH) + && gfc_match ("attach ( ") == MATCH_YES + && gfc_match_omp_map_clause (&c->lists[OMP_LIST_MAP], + OMP_MAP_ATTACH, allow_derived)) + continue; break; case 'c': if ((mask & OMP_CLAUSE_COLLAPSE) @@ -1039,7 +1058,7 @@ gfc_match_omp_clauses (gfc_omp_clauses **cp, const omp_mask mask, if ((mask & OMP_CLAUSE_COPY) && gfc_match ("copy ( ") == MATCH_YES && gfc_match_omp_map_clause (&c->lists[OMP_LIST_MAP], - OMP_MAP_TOFROM)) + OMP_MAP_TOFROM, allow_derived)) continue; if (mask & OMP_CLAUSE_COPYIN) { @@ -1047,7 +1066,7 @@ gfc_match_omp_clauses (gfc_omp_clauses **cp, const omp_mask mask, { if (gfc_match ("copyin ( ") == MATCH_YES && gfc_match_omp_map_clause (&c->lists[OMP_LIST_MAP], - OMP_MAP_TO)) + OMP_MAP_TO, allow_derived)) continue; } else if (gfc_match_omp_variable_list ("copyin (", @@ -1058,7 +1077,7 @@ gfc_match_omp_clauses (gfc_omp_clauses **cp, const omp_mask mask, if ((mask & OMP_CLAUSE_COPYOUT) && gfc_match ("copyout ( ") == MATCH_YES && gfc_match_omp_map_clause (&c->lists[OMP_LIST_MAP], - OMP_MAP_FROM)) + OMP_MAP_FROM, allow_derived)) continue; if ((mask & OMP_CLAUSE_COPYPRIVATE) && gfc_match_omp_variable_list ("copyprivate (", @@ -1068,7 +1087,7 @@ gfc_match_omp_clauses (gfc_omp_clauses **cp, const omp_mask mask, if ((mask & OMP_CLAUSE_CREATE) && gfc_match ("create ( ") == MATCH_YES && gfc_match_omp_map_clause (&c->lists[OMP_LIST_MAP], - OMP_MAP_ALLOC)) + OMP_MAP_ALLOC, allow_derived)) continue; break; case 'd': @@ -1104,7 +1123,7 @@ gfc_match_omp_clauses (gfc_omp_clauses **cp, const omp_mask mask, if ((mask & OMP_CLAUSE_DELETE) && gfc_match ("delete ( ") == MATCH_YES && gfc_match_omp_map_clause (&c->lists[OMP_LIST_MAP], - OMP_MAP_RELEASE)) + OMP_MAP_RELEASE, allow_derived)) continue; if ((mask & OMP_CLAUSE_DEPEND) && gfc_match ("depend ( ") == MATCH_YES) @@ -1147,6 +1166,11 @@ gfc_match_omp_clauses (gfc_omp_clauses **cp, const omp_mask mask, else gfc_current_locus = old_loc; } + if ((mask & OMP_CLAUSE_DETACH) + && gfc_match ("detach ( ") == MATCH_YES + && gfc_match_omp_map_clause (&c->lists[OMP_LIST_MAP], + OMP_MAP_DETACH, allow_derived)) + continue; if ((mask & OMP_CLAUSE_DEVICE) && !openacc && c->device == NULL @@ -1156,12 +1180,13 @@ gfc_match_omp_clauses (gfc_omp_clauses **cp, const omp_mask mask, && openacc && gfc_match ("device ( ") == MATCH_YES && gfc_match_omp_map_clause (&c->lists[OMP_LIST_MAP], - OMP_MAP_FORCE_TO)) + OMP_MAP_FORCE_TO, allow_derived)) continue; if ((mask & OMP_CLAUSE_DEVICEPTR) && gfc_match ("deviceptr ( ") == MATCH_YES && gfc_match_omp_map_clause (&c->lists[OMP_LIST_MAP], - OMP_MAP_FORCE_DEVICEPTR)) + OMP_MAP_FORCE_DEVICEPTR, + allow_derived)) continue; if ((mask & OMP_CLAUSE_DEVICE_RESIDENT) && gfc_match_omp_variable_list @@ -1239,7 +1264,7 @@ gfc_match_omp_clauses (gfc_omp_clauses **cp, const omp_mask mask, if ((mask & OMP_CLAUSE_HOST_SELF) && gfc_match ("host ( ") == MATCH_YES && gfc_match_omp_map_clause (&c->lists[OMP_LIST_MAP], - OMP_MAP_FORCE_FROM)) + OMP_MAP_FORCE_FROM, allow_derived)) continue; break; case 'i': @@ -1511,47 +1536,48 @@ gfc_match_omp_clauses (gfc_omp_clauses **cp, const omp_mask mask, if ((mask & OMP_CLAUSE_COPY) && gfc_match ("pcopy ( ") == MATCH_YES && gfc_match_omp_map_clause (&c->lists[OMP_LIST_MAP], - OMP_MAP_TOFROM)) + OMP_MAP_TOFROM, allow_derived)) continue; if ((mask & OMP_CLAUSE_COPYIN) && gfc_match ("pcopyin ( ") == MATCH_YES && gfc_match_omp_map_clause (&c->lists[OMP_LIST_MAP], - OMP_MAP_TO)) + OMP_MAP_TO, allow_derived)) continue; if ((mask & OMP_CLAUSE_COPYOUT) && gfc_match ("pcopyout ( ") == MATCH_YES && gfc_match_omp_map_clause (&c->lists[OMP_LIST_MAP], - OMP_MAP_FROM)) + OMP_MAP_FROM, allow_derived)) continue; if ((mask & OMP_CLAUSE_CREATE) && gfc_match ("pcreate ( ") == MATCH_YES && gfc_match_omp_map_clause (&c->lists[OMP_LIST_MAP], - OMP_MAP_ALLOC)) + OMP_MAP_ALLOC, allow_derived)) continue; if ((mask & OMP_CLAUSE_PRESENT) && gfc_match ("present ( ") == MATCH_YES && gfc_match_omp_map_clause (&c->lists[OMP_LIST_MAP], - OMP_MAP_FORCE_PRESENT)) + OMP_MAP_FORCE_PRESENT, + allow_derived)) continue; if ((mask & OMP_CLAUSE_COPY) && gfc_match ("present_or_copy ( ") == MATCH_YES && gfc_match_omp_map_clause (&c->lists[OMP_LIST_MAP], - OMP_MAP_TOFROM)) + OMP_MAP_TOFROM, allow_derived)) continue; if ((mask & OMP_CLAUSE_COPYIN) && gfc_match ("present_or_copyin ( ") == MATCH_YES && gfc_match_omp_map_clause (&c->lists[OMP_LIST_MAP], - OMP_MAP_TO)) + OMP_MAP_TO, allow_derived)) continue; if ((mask & OMP_CLAUSE_COPYOUT) && gfc_match ("present_or_copyout ( ") == MATCH_YES && gfc_match_omp_map_clause (&c->lists[OMP_LIST_MAP], - OMP_MAP_FROM)) + OMP_MAP_FROM, allow_derived)) continue; if ((mask & OMP_CLAUSE_CREATE) && gfc_match ("present_or_create ( ") == MATCH_YES && gfc_match_omp_map_clause (&c->lists[OMP_LIST_MAP], - OMP_MAP_ALLOC)) + OMP_MAP_ALLOC, allow_derived)) continue; if ((mask & OMP_CLAUSE_PRIORITY) && c->priority == NULL @@ -1669,8 +1695,8 @@ gfc_match_omp_clauses (gfc_omp_clauses **cp, const omp_mask mask, if (gfc_match_omp_variable_list (" :", &c->lists[OMP_LIST_REDUCTION], - false, NULL, &head, - openacc) == MATCH_YES) + false, NULL, &head, openacc, + allow_derived) == MATCH_YES) { gfc_omp_namelist *n; if (rop == OMP_REDUCTION_NONE) @@ -1769,7 +1795,7 @@ gfc_match_omp_clauses (gfc_omp_clauses **cp, const omp_mask mask, if ((mask & OMP_CLAUSE_HOST_SELF) && gfc_match ("self ( ") == MATCH_YES && gfc_match_omp_map_clause (&c->lists[OMP_LIST_MAP], - OMP_MAP_FORCE_FROM)) + OMP_MAP_FORCE_FROM, allow_derived)) continue; if ((mask & OMP_CLAUSE_SEQ) && !c->seq @@ -1927,17 +1953,17 @@ gfc_match_omp_clauses (gfc_omp_clauses **cp, const omp_mask mask, | OMP_CLAUSE_COPY | OMP_CLAUSE_COPYIN | OMP_CLAUSE_COPYOUT \ | OMP_CLAUSE_CREATE | OMP_CLAUSE_PRESENT | OMP_CLAUSE_DEVICEPTR \ | OMP_CLAUSE_PRIVATE | OMP_CLAUSE_FIRSTPRIVATE | OMP_CLAUSE_DEFAULT \ - | OMP_CLAUSE_WAIT) + | OMP_CLAUSE_WAIT | OMP_CLAUSE_ATTACH) #define OACC_KERNELS_CLAUSES \ (omp_mask (OMP_CLAUSE_IF) | OMP_CLAUSE_ASYNC | OMP_CLAUSE_NUM_GANGS \ | OMP_CLAUSE_NUM_WORKERS | OMP_CLAUSE_VECTOR_LENGTH | OMP_CLAUSE_DEVICEPTR \ | OMP_CLAUSE_COPY | OMP_CLAUSE_COPYIN | OMP_CLAUSE_COPYOUT \ | OMP_CLAUSE_CREATE | OMP_CLAUSE_PRESENT | OMP_CLAUSE_DEFAULT \ - | OMP_CLAUSE_WAIT) + | OMP_CLAUSE_WAIT | OMP_CLAUSE_ATTACH) #define OACC_DATA_CLAUSES \ (omp_mask (OMP_CLAUSE_IF) | OMP_CLAUSE_DEVICEPTR | OMP_CLAUSE_COPY \ | OMP_CLAUSE_COPYIN | OMP_CLAUSE_COPYOUT | OMP_CLAUSE_CREATE \ - | OMP_CLAUSE_PRESENT) + | OMP_CLAUSE_PRESENT | OMP_CLAUSE_ATTACH) #define OACC_LOOP_CLAUSES \ (omp_mask (OMP_CLAUSE_COLLAPSE) | OMP_CLAUSE_GANG | OMP_CLAUSE_WORKER \ | OMP_CLAUSE_VECTOR | OMP_CLAUSE_SEQ | OMP_CLAUSE_INDEPENDENT \ @@ -1958,10 +1984,11 @@ gfc_match_omp_clauses (gfc_omp_clauses **cp, const omp_mask mask, | OMP_CLAUSE_DEVICE | OMP_CLAUSE_WAIT | OMP_CLAUSE_IF_PRESENT) #define OACC_ENTER_DATA_CLAUSES \ (omp_mask (OMP_CLAUSE_IF) | OMP_CLAUSE_ASYNC | OMP_CLAUSE_WAIT \ - | OMP_CLAUSE_COPYIN | OMP_CLAUSE_CREATE) + | OMP_CLAUSE_COPYIN | OMP_CLAUSE_CREATE | OMP_CLAUSE_ATTACH) #define OACC_EXIT_DATA_CLAUSES \ (omp_mask (OMP_CLAUSE_IF) | OMP_CLAUSE_ASYNC | OMP_CLAUSE_WAIT \ - | OMP_CLAUSE_COPYOUT | OMP_CLAUSE_DELETE | OMP_CLAUSE_FINALIZE) + | OMP_CLAUSE_COPYOUT | OMP_CLAUSE_DELETE | OMP_CLAUSE_FINALIZE \ + | OMP_CLAUSE_DETACH) #define OACC_WAIT_CLAUSES \ omp_mask (OMP_CLAUSE_ASYNC) #define OACC_ROUTINE_CLAUSES \ @@ -3734,9 +3761,6 @@ resolve_nonnegative_int_expr (gfc_expr *expr, const char *clause) static void check_symbol_not_pointer (gfc_symbol *sym, locus loc, const char *name) { - if (sym->ts.type == BT_DERIVED && sym->attr.pointer) - gfc_error ("POINTER object %qs of derived type in %s clause at %L", - sym->name, name, &loc); if (sym->ts.type == BT_DERIVED && sym->attr.cray_pointer) gfc_error ("Cray pointer object %qs of derived type in %s clause at %L", sym->name, name, &loc); @@ -3781,9 +3805,6 @@ check_array_not_assumed (gfc_symbol *sym, locus loc, const char *name) static void resolve_oacc_data_clauses (gfc_symbol *sym, locus loc, const char *name) { - if (sym->ts.type == BT_DERIVED && sym->attr.allocatable) - gfc_error ("ALLOCATABLE object %qs of derived type in %s clause at %L", - sym->name, name, &loc); if ((sym->ts.type == BT_ASSUMED && sym->attr.allocatable) || (sym->ts.type == BT_CLASS && CLASS_DATA (sym) && CLASS_DATA (sym)->attr.allocatable)) @@ -4153,11 +4174,23 @@ resolve_omp_clauses (gfc_code *code, gfc_omp_clauses *omp_clauses, && (list != OMP_LIST_REDUCTION || !openacc)) for (n = omp_clauses->lists[list]; n; n = n->next) { - if (n->sym->mark) - gfc_error ("Symbol %qs present on multiple clauses at %L", - n->sym->name, &n->where); - else - n->sym->mark = 1; + bool array_only_p = true; + /* Disallow duplicate bare variable references and multiple + subarrays of the same array here, but allow multiple components of + the same (e.g. derived-type) variable. For the latter, duplicate + components are detected elsewhere. */ + if (openacc && n->expr && n->expr->expr_type == EXPR_VARIABLE) + for (gfc_ref *ref = n->expr->ref; ref; ref = ref->next) + if (ref->type != REF_ARRAY) + array_only_p = false; + if (array_only_p) + { + if (n->sym->mark) + gfc_error ("Symbol %qs present on multiple clauses at %L", + n->sym->name, &n->where); + else + n->sym->mark = 1; + } } gcc_assert (OMP_LIST_LASTPRIVATE == OMP_LIST_FIRSTPRIVATE + 1); @@ -4348,23 +4381,41 @@ resolve_omp_clauses (gfc_code *code, gfc_omp_clauses *omp_clauses, "are allowed on ORDERED directive at %L", &n->where); } + gfc_ref *array_ref = NULL; + bool resolved = false; if (n->expr) { - if (!gfc_resolve_expr (n->expr) + array_ref = n->expr->ref; + resolved = gfc_resolve_expr (n->expr); + + /* Look through component refs to find last array + reference. */ + while (resolved + && array_ref + && (array_ref->type == REF_COMPONENT + || (array_ref->type == REF_ARRAY + && array_ref->next + && array_ref->next->type == REF_COMPONENT))) + array_ref = array_ref->next; + } + if (array_ref + || (n->expr + && (!resolved || n->expr->expr_type != EXPR_VARIABLE))) + { + if (!resolved || n->expr->expr_type != EXPR_VARIABLE - || n->expr->ref == NULL - || n->expr->ref->next - || n->expr->ref->type != REF_ARRAY) + || array_ref->next + || array_ref->type != REF_ARRAY) gfc_error ("%qs in %s clause at %L is not a proper " "array section", n->sym->name, name, &n->where); - else if (n->expr->ref->u.ar.codimen) + else if (array_ref->u.ar.codimen) gfc_error ("Coarrays not supported in %s clause at %L", name, &n->where); else { int i; - gfc_array_ref *ar = &n->expr->ref->u.ar; + gfc_array_ref *ar = &array_ref->u.ar; for (i = 0; i < ar->dimen; i++) if (ar->stride[i]) { diff --git a/gcc/fortran/trans-openmp.c b/gcc/fortran/trans-openmp.c index 483ca66..d56cda0c 100644 --- a/gcc/fortran/trans-openmp.c +++ b/gcc/fortran/trans-openmp.c @@ -60,6 +60,9 @@ gfc_omp_privatize_by_reference (const_tree decl) if (TREE_CODE (type) == POINTER_TYPE) { + while (TREE_CODE (decl) == COMPONENT_REF) + decl = TREE_OPERAND (decl, 1); + /* Array POINTER/ALLOCATABLE have aggregate types, all user variables that have POINTER_TYPE type and aren't scalar pointers, scalar allocatables, Cray pointees or C pointers are supposed to be @@ -2108,20 +2111,47 @@ gfc_trans_omp_clauses (stmtblock_t *block, gfc_omp_clauses *clauses, tree decl = gfc_get_symbol_decl (n->sym); if (DECL_P (decl)) TREE_ADDRESSABLE (decl) = 1; - if (n->expr == NULL || n->expr->ref->u.ar.type == AR_FULL) + + gfc_ref *ref = n->expr ? n->expr->ref : NULL; + symbol_attribute *sym_attr = &n->sym->attr; + gomp_map_kind ptr_map_kind = GOMP_MAP_POINTER; + + if (ref && n->sym->ts.type == BT_DERIVED) + { + if (gfc_omp_privatize_by_reference (decl)) + decl = build_fold_indirect_ref (decl); + + for (; ref && ref->type == REF_COMPONENT; ref = ref->next) + { + tree field = ref->u.c.component->backend_decl; + gcc_assert (field && TREE_CODE (field) == FIELD_DECL); + decl = fold_build3 (COMPONENT_REF, TREE_TYPE (field), + decl, field, NULL_TREE); + sym_attr = &ref->u.c.component->attr; + } + + ptr_map_kind = GOMP_MAP_ALWAYS_POINTER; + } + + if (ref == NULL || ref->u.ar.type == AR_FULL) { + tree field = decl; + + while (TREE_CODE (field) == COMPONENT_REF) + field = TREE_OPERAND (field, 1); + if (POINTER_TYPE_P (TREE_TYPE (decl)) && (gfc_omp_privatize_by_reference (decl) - || GFC_DECL_GET_SCALAR_POINTER (decl) - || GFC_DECL_GET_SCALAR_ALLOCATABLE (decl) - || GFC_DECL_CRAY_POINTEE (decl) + || GFC_DECL_GET_SCALAR_POINTER (field) + || GFC_DECL_GET_SCALAR_ALLOCATABLE (field) + || GFC_DECL_CRAY_POINTEE (field) || GFC_DESCRIPTOR_TYPE_P - (TREE_TYPE (TREE_TYPE (decl))))) + (TREE_TYPE (TREE_TYPE (field))))) { tree orig_decl = decl; node4 = build_omp_clause (input_location, OMP_CLAUSE_MAP); - OMP_CLAUSE_SET_MAP_KIND (node4, GOMP_MAP_POINTER); + OMP_CLAUSE_SET_MAP_KIND (node4, ptr_map_kind); OMP_CLAUSE_DECL (node4) = decl; OMP_CLAUSE_SIZE (node4) = size_int (0); decl = build_fold_indirect_ref (decl); @@ -2131,13 +2161,15 @@ gfc_trans_omp_clauses (stmtblock_t *block, gfc_omp_clauses *clauses, { node3 = build_omp_clause (input_location, OMP_CLAUSE_MAP); - OMP_CLAUSE_SET_MAP_KIND (node3, GOMP_MAP_POINTER); + OMP_CLAUSE_SET_MAP_KIND (node3, ptr_map_kind); OMP_CLAUSE_DECL (node3) = decl; OMP_CLAUSE_SIZE (node3) = size_int (0); decl = build_fold_indirect_ref (decl); } } - if (GFC_DESCRIPTOR_TYPE_P (TREE_TYPE (decl))) + if (GFC_DESCRIPTOR_TYPE_P (TREE_TYPE (decl)) + && n->u.map_op != OMP_MAP_ATTACH + && n->u.map_op != OMP_MAP_DETACH) { tree type = TREE_TYPE (decl); tree ptr = gfc_conv_descriptor_data_get (decl); @@ -2152,14 +2184,16 @@ gfc_trans_omp_clauses (stmtblock_t *block, gfc_omp_clauses *clauses, OMP_CLAUSE_SIZE (node2) = TYPE_SIZE_UNIT (type); node3 = build_omp_clause (input_location, OMP_CLAUSE_MAP); - OMP_CLAUSE_SET_MAP_KIND (node3, GOMP_MAP_POINTER); + OMP_CLAUSE_SET_MAP_KIND (node3, ptr_map_kind); OMP_CLAUSE_DECL (node3) = gfc_conv_descriptor_data_get (decl); + if (ptr_map_kind == GOMP_MAP_ALWAYS_POINTER) + STRIP_NOPS (OMP_CLAUSE_DECL (node3)); OMP_CLAUSE_SIZE (node3) = size_int (0); /* We have to check for n->sym->attr.dimension because of scalar coarrays. */ - if (n->sym->attr.pointer && n->sym->attr.dimension) + if (sym_attr->pointer && sym_attr->dimension) { stmtblock_t cond_block; tree size @@ -2189,11 +2223,11 @@ gfc_trans_omp_clauses (stmtblock_t *block, gfc_omp_clauses *clauses, else_b)); OMP_CLAUSE_SIZE (node) = size; } - else if (n->sym->attr.dimension) + else if (sym_attr->dimension) OMP_CLAUSE_SIZE (node) = gfc_full_array_size (block, decl, GFC_TYPE_ARRAY_RANK (type)); - if (n->sym->attr.dimension) + if (sym_attr->dimension) { tree elemsz = TYPE_SIZE_UNIT (gfc_get_element_type (type)); @@ -2206,11 +2240,11 @@ gfc_trans_omp_clauses (stmtblock_t *block, gfc_omp_clauses *clauses, else OMP_CLAUSE_DECL (node) = decl; } - else + else if (ref) { tree ptr, ptr2; gfc_init_se (&se, NULL); - if (n->expr->ref->u.ar.type == AR_ELEMENT) + if (ref->u.ar.type == AR_ELEMENT) { gfc_conv_expr_reference (&se, n->expr); gfc_add_block_to_block (block, &se.pre); @@ -2244,7 +2278,7 @@ gfc_trans_omp_clauses (stmtblock_t *block, gfc_omp_clauses *clauses, { node4 = build_omp_clause (input_location, OMP_CLAUSE_MAP); - OMP_CLAUSE_SET_MAP_KIND (node4, GOMP_MAP_POINTER); + OMP_CLAUSE_SET_MAP_KIND (node4, ptr_map_kind); OMP_CLAUSE_DECL (node4) = decl; OMP_CLAUSE_SIZE (node4) = size_int (0); decl = build_fold_indirect_ref (decl); @@ -2261,9 +2295,11 @@ gfc_trans_omp_clauses (stmtblock_t *block, gfc_omp_clauses *clauses, OMP_CLAUSE_SIZE (node2) = TYPE_SIZE_UNIT (type); node3 = build_omp_clause (input_location, OMP_CLAUSE_MAP); - OMP_CLAUSE_SET_MAP_KIND (node3, GOMP_MAP_POINTER); + OMP_CLAUSE_SET_MAP_KIND (node3, ptr_map_kind); OMP_CLAUSE_DECL (node3) = gfc_conv_descriptor_data_get (decl); + if (ptr_map_kind == GOMP_MAP_ALWAYS_POINTER) + STRIP_NOPS (OMP_CLAUSE_DECL (node3)); } else { @@ -2276,18 +2312,23 @@ gfc_trans_omp_clauses (stmtblock_t *block, gfc_omp_clauses *clauses, } node3 = build_omp_clause (input_location, OMP_CLAUSE_MAP); - OMP_CLAUSE_SET_MAP_KIND (node3, GOMP_MAP_POINTER); + OMP_CLAUSE_SET_MAP_KIND (node3, ptr_map_kind); OMP_CLAUSE_DECL (node3) = decl; } ptr2 = fold_convert (sizetype, ptr2); OMP_CLAUSE_SIZE (node3) = fold_build2 (MINUS_EXPR, sizetype, ptr, ptr2); } + else + gcc_unreachable (); switch (n->u.map_op) { case OMP_MAP_ALLOC: OMP_CLAUSE_SET_MAP_KIND (node, GOMP_MAP_ALLOC); break; + case OMP_MAP_ATTACH: + OMP_CLAUSE_SET_MAP_KIND (node, GOMP_MAP_ATTACH); + break; case OMP_MAP_TO: OMP_CLAUSE_SET_MAP_KIND (node, GOMP_MAP_TO); break; @@ -2312,6 +2353,9 @@ gfc_trans_omp_clauses (stmtblock_t *block, gfc_omp_clauses *clauses, case OMP_MAP_DELETE: OMP_CLAUSE_SET_MAP_KIND (node, GOMP_MAP_DELETE); break; + case OMP_MAP_DETACH: + OMP_CLAUSE_SET_MAP_KIND (node, GOMP_MAP_DETACH); + break; case OMP_MAP_FORCE_ALLOC: OMP_CLAUSE_SET_MAP_KIND (node, GOMP_MAP_FORCE_ALLOC); break; diff --git a/gcc/gimplify.c b/gcc/gimplify.c index 274edc0..aa7723d 100644 --- a/gcc/gimplify.c +++ b/gcc/gimplify.c @@ -113,6 +113,10 @@ enum gimplify_omp_var_data GOVD_NONTEMPORAL = 4194304, + /* Flag for GOVD_MAP: (struct) vars that have pointer attachments for + fields. */ + GOVD_MAP_HAS_ATTACHMENTS = 8388608, + GOVD_DATA_SHARE_CLASS = (GOVD_SHARED | GOVD_PRIVATE | GOVD_FIRSTPRIVATE | GOVD_LASTPRIVATE | GOVD_REDUCTION | GOVD_LINEAR | GOVD_LOCAL) @@ -7998,7 +8002,13 @@ insert_struct_component_mapping (enum tree_code code, tree c, tree struct_node, OMP_CLAUSE_SET_MAP_KIND (c2, mkind); OMP_CLAUSE_DECL (c2) = unshare_expr (OMP_CLAUSE_DECL (c)); OMP_CLAUSE_CHAIN (c2) = scp ? *scp : prev_node; - OMP_CLAUSE_SIZE (c2) = TYPE_SIZE_UNIT (ptr_type_node); + if (OMP_CLAUSE_CHAIN (prev_node) != c + && OMP_CLAUSE_CODE (OMP_CLAUSE_CHAIN (prev_node)) == OMP_CLAUSE_MAP + && (OMP_CLAUSE_MAP_KIND (OMP_CLAUSE_CHAIN (prev_node)) + == GOMP_MAP_TO_PSET)) + OMP_CLAUSE_SIZE (c2) = OMP_CLAUSE_SIZE (OMP_CLAUSE_CHAIN (prev_node)); + else + OMP_CLAUSE_SIZE (c2) = TYPE_SIZE_UNIT (ptr_type_node); if (struct_node) OMP_CLAUSE_CHAIN (struct_node) = c2; @@ -8588,7 +8598,9 @@ gimplify_scan_omp_clauses (tree *list_p, gimple_seq *pre_p, remove = true; break; } - if (DECL_P (decl)) + if (DECL_P (decl) + && OMP_CLAUSE_MAP_KIND (c) != GOMP_MAP_TO_PSET + && code != OACC_UPDATE) { if (error_operand_p (decl)) { @@ -8640,16 +8652,36 @@ gimplify_scan_omp_clauses (tree *list_p, gimple_seq *pre_p, = splay_tree_lookup (ctx->variables, (splay_tree_key)decl); bool ptr = (OMP_CLAUSE_MAP_KIND (c) == GOMP_MAP_ALWAYS_POINTER); + bool attach = OMP_CLAUSE_MAP_KIND (c) == GOMP_MAP_ATTACH + || OMP_CLAUSE_MAP_KIND (c) == GOMP_MAP_DETACH; + bool has_attachments = false; + /* For OpenACC, pointers in structs should trigger an + attach action. */ + if (ptr && (region_type & ORT_ACC) != 0) + { + /* Turning a GOMP_MAP_ALWAYS_POINTER clause into a + GOMP_MAP_ATTACH clause after we have detected a case + that needs a GOMP_MAP_STRUCT mapping adding. */ + OMP_CLAUSE_SET_MAP_KIND (c, + (code == OACC_EXIT_DATA) ? GOMP_MAP_DETACH + : GOMP_MAP_ATTACH); + has_attachments = true; + } if (n == NULL || (n->value & GOVD_MAP) == 0) { tree l = build_omp_clause (OMP_CLAUSE_LOCATION (c), OMP_CLAUSE_MAP); - OMP_CLAUSE_SET_MAP_KIND (l, GOMP_MAP_STRUCT); + OMP_CLAUSE_SET_MAP_KIND (l, attach + ? GOMP_MAP_FORCE_PRESENT : GOMP_MAP_STRUCT); if (!base_eq_orig_base) OMP_CLAUSE_DECL (l) = unshare_expr (orig_base); else OMP_CLAUSE_DECL (l) = decl; - OMP_CLAUSE_SIZE (l) = size_int (1); + OMP_CLAUSE_SIZE (l) = attach + ? (DECL_P (OMP_CLAUSE_DECL (l)) + ? DECL_SIZE_UNIT (OMP_CLAUSE_DECL (l)) + : TYPE_SIZE_UNIT (TREE_TYPE (OMP_CLAUSE_DECL (l)))) + : size_int (1); if (struct_map_to_clause == NULL) struct_map_to_clause = new hash_map; struct_map_to_clause->put (decl, l); @@ -8681,9 +8713,11 @@ gimplify_scan_omp_clauses (tree *list_p, gimple_seq *pre_p, flags = GOVD_MAP | GOVD_EXPLICIT; if (GOMP_MAP_ALWAYS_P (OMP_CLAUSE_MAP_KIND (c)) || ptr) flags |= GOVD_SEEN; + if (has_attachments) + flags |= GOVD_MAP_HAS_ATTACHMENTS; goto do_add_decl; } - else + else if (struct_map_to_clause) { tree *osc = struct_map_to_clause->get (decl); tree *sc = NULL, *scp = NULL; @@ -8692,8 +8726,10 @@ gimplify_scan_omp_clauses (tree *list_p, gimple_seq *pre_p, sc = &OMP_CLAUSE_CHAIN (*osc); if (*sc != c && (OMP_CLAUSE_MAP_KIND (*sc) - == GOMP_MAP_FIRSTPRIVATE_REFERENCE)) + == GOMP_MAP_FIRSTPRIVATE_REFERENCE)) sc = &OMP_CLAUSE_CHAIN (*sc); + /* Here "prev_list_p" is the end of the inserted + alloc/release nodes after the struct node, OSC. */ for (; *sc != c; sc = &OMP_CLAUSE_CHAIN (*sc)) if (ptr && sc == prev_list_p) break; @@ -8752,9 +8788,10 @@ gimplify_scan_omp_clauses (tree *list_p, gimple_seq *pre_p, } if (remove) break; - OMP_CLAUSE_SIZE (*osc) - = size_binop (PLUS_EXPR, OMP_CLAUSE_SIZE (*osc), - size_one_node); + if (!attach) + OMP_CLAUSE_SIZE (*osc) + = size_binop (PLUS_EXPR, OMP_CLAUSE_SIZE (*osc), + size_one_node); if (ptr) { tree cl @@ -8786,11 +8823,15 @@ gimplify_scan_omp_clauses (tree *list_p, gimple_seq *pre_p, } if (!remove && OMP_CLAUSE_MAP_KIND (c) != GOMP_MAP_ALWAYS_POINTER + && OMP_CLAUSE_MAP_KIND (c) != GOMP_MAP_TO_PSET && OMP_CLAUSE_CHAIN (c) && OMP_CLAUSE_CODE (OMP_CLAUSE_CHAIN (c)) == OMP_CLAUSE_MAP - && (OMP_CLAUSE_MAP_KIND (OMP_CLAUSE_CHAIN (c)) - == GOMP_MAP_ALWAYS_POINTER)) + && ((OMP_CLAUSE_MAP_KIND (OMP_CLAUSE_CHAIN (c)) + == GOMP_MAP_ALWAYS_POINTER) + || (OMP_CLAUSE_MAP_KIND (OMP_CLAUSE_CHAIN (c)) + == GOMP_MAP_TO_PSET))) prev_list_p = list_p; + break; } flags = GOVD_MAP | GOVD_EXPLICIT; @@ -9412,6 +9453,8 @@ gimplify_adjust_omp_clauses_1 (splay_tree_node n, void *data) return 0; if ((flags & GOVD_SEEN) == 0) return 0; + if ((flags & GOVD_MAP_HAS_ATTACHMENTS) != 0) + return 0; if (flags & GOVD_DEBUG_PRIVATE) { gcc_assert ((flags & GOVD_DATA_SHARE_CLASS) == GOVD_SHARED); @@ -11795,8 +11838,9 @@ gimplify_omp_target_update (tree *expr_p, gimple_seq *pre_p) && omp_find_clause (OMP_STANDALONE_CLAUSES (expr), OMP_CLAUSE_FINALIZE)) { - /* Use GOMP_MAP_DELETE/GOMP_MAP_FORCE_FROM to denote that "finalize" - semantics apply to all mappings of this OpenACC directive. */ + /* Use GOMP_MAP_DELETE, GOMP_MAP_FORCE_DETACH, and + GOMP_MAP_FORCE_FROM to denote that "finalize" semantics apply + to all mappings of this OpenACC directive. */ bool finalize_marked = false; for (tree c = OMP_STANDALONE_CLAUSES (expr); c; c = OMP_CLAUSE_CHAIN (c)) if (OMP_CLAUSE_CODE (c) == OMP_CLAUSE_MAP) @@ -11810,10 +11854,19 @@ gimplify_omp_target_update (tree *expr_p, gimple_seq *pre_p) OMP_CLAUSE_SET_MAP_KIND (c, GOMP_MAP_DELETE); finalize_marked = true; break; + case GOMP_MAP_DETACH: + OMP_CLAUSE_SET_MAP_KIND (c, GOMP_MAP_FORCE_DETACH); + finalize_marked = true; + break; + case GOMP_MAP_STRUCT: + case GOMP_MAP_FORCE_PRESENT: + /* Skip over an initial struct or force_present mapping. */ + break; default: - /* Check consistency: libgomp relies on the very first data - mapping clause being marked, so make sure we did that before - any other mapping clauses. */ + /* Check consistency: libgomp relies on the very first + non-struct, non-force-present data mapping clause being + marked, so make sure we did that before any other mapping + clauses. */ gcc_assert (finalize_marked); break; } diff --git a/gcc/omp-low.c b/gcc/omp-low.c index ca78d7a..55dbc0b 100644 --- a/gcc/omp-low.c +++ b/gcc/omp-low.c @@ -9138,6 +9138,9 @@ lower_omp_target (gimple_stmt_iterator *gsi_p, omp_context *ctx) case GOMP_MAP_FORCE_DEVICEPTR: case GOMP_MAP_DEVICE_RESIDENT: case GOMP_MAP_LINK: + case GOMP_MAP_ATTACH: + case GOMP_MAP_DETACH: + case GOMP_MAP_FORCE_DETACH: gcc_assert (is_gimple_omp_oacc (stmt)); break; default: diff --git a/gcc/testsuite/c-c++-common/goacc/mdc-1.c b/gcc/testsuite/c-c++-common/goacc/mdc-1.c new file mode 100644 index 0000000..84a44af --- /dev/null +++ b/gcc/testsuite/c-c++-common/goacc/mdc-1.c @@ -0,0 +1,54 @@ +/* Test OpenACC's support for manual deep copy, including the attach + and detach clauses. */ + +/* { dg-additional-options "-fdump-tree-omplower" } */ + +void +t1 () +{ + struct foo { + int *a, *b, c, d, *e; + } s; + + int *a, *z; + +#pragma acc enter data copyin(s) + { +#pragma acc data copy(s.a[0:10]) copy(z[0:10]) + { + s.e = z; +#pragma acc parallel loop attach(s.e) + for (int i = 0; i < 10; i++) + s.a[i] = s.e[i]; + + + a = s.e; +#pragma acc enter data attach(a) +#pragma acc exit data detach(a) + } + +#pragma acc enter data copyin(a) +#pragma acc acc enter data attach(s.e) +#pragma acc exit data detach(s.e) + +#pragma acc data attach(s.e) + { + } +#pragma acc exit data delete(a) + +#pragma acc exit data detach(a) finalize +#pragma acc exit data detach(s.a) finalize + } +} + +/* { dg-final { scan-tree-dump-times "pragma omp target oacc_enter_exit_data map.to:s .len: 32.." 1 "omplower" } } */ +/* { dg-final { scan-tree-dump-times "pragma omp target oacc_data map.tofrom:.z .len: 40.. map.struct:s .len: 1.. map.alloc:s.a .len: 8.. map.tofrom:._1 .len: 40.. map.attach:s.a .len: 0.." 1 "omplower" } } */ +/* { dg-final { scan-tree-dump-times "pragma omp target oacc_parallel map.force_present:s .len: 32.. map.attach:s.e .len: 8.." 1 "omplower" } } */ +/* { dg-final { scan-tree-dump-times "pragma omp target oacc_enter_exit_data map.attach:a .len: 8.." 1 "omplower" } } */ +/* { dg-final { scan-tree-dump-times "pragma omp target oacc_enter_exit_data map.detach:a .len: 8.." 1 "omplower" } } */ +/* { dg-final { scan-tree-dump-times "pragma omp target oacc_enter_exit_data map.to:a .len: 8.." 1 "omplower" } } */ +/* { dg-final { scan-tree-dump-times "pragma omp target oacc_enter_exit_data map.force_present:s .len: 32.. map.detach:s.e .len: 8.." 1 "omplower" } } */ +/* { dg-final { scan-tree-dump-times "pragma omp target oacc_data map.force_present:s .len: 32.. map.attach:s.e .len: 8.." 1 "omplower" } } */ +/* { dg-final { scan-tree-dump-times "pragma omp target oacc_enter_exit_data map.release:a .len: 8.." 1 "omplower" } } */ +/* { dg-final { scan-tree-dump-times "pragma omp target oacc_enter_exit_data finalize map.force_detach:a .len: 8.." 1 "omplower" } } */ +/* { dg-final { scan-tree-dump-times "pragma omp target oacc_enter_exit_data finalize map.force_present:s .len: 32.. map.force_detach:s.a .len: 8.." 1 "omplower" } } */ diff --git a/gcc/testsuite/c-c++-common/goacc/mdc-2.c b/gcc/testsuite/c-c++-common/goacc/mdc-2.c new file mode 100644 index 0000000..ebfb99d --- /dev/null +++ b/gcc/testsuite/c-c++-common/goacc/mdc-2.c @@ -0,0 +1,62 @@ +/* Test OpenACC's support for manual deep copy, including the attach + and detach clauses. */ + +void +t1 () +{ + struct foo { + int *a, *b, c, d, *e; + } s; + + int *a, *z, scalar, **y; + +#pragma acc enter data copyin(s) detach(z) /* { dg-error ".detach. is not valid for" } */ + { +#pragma acc data copy(s.a[0:10]) copy(z[0:10]) + { + s.e = z; +#pragma acc parallel loop attach(s.e) detach(s.b) /* { dg-error ".detach. is not valid for" } */ + for (int i = 0; i < 10; i++) + s.a[i] = s.e[i]; + + a = s.e; +#pragma acc enter data attach(a) detach(s.c) /* { dg-error ".detach. is not valid for" } */ +#pragma acc exit data detach(a) + } + +#pragma acc enter data attach(z[:5]) /* { dg-error "array section in .attach. clause" } */ +/* { dg-error "has no data movement clause" "" { target *-*-* } .-1 } */ +#pragma acc exit data detach(z[:5]) /* { dg-error "array section in .detach. clause" } */ +/* { dg-error "has no data movement clause" "" { target *-*-* } .-1 } */ +#pragma acc enter data attach(z[1:]) /* { dg-error "array section in .attach. clause" } */ +/* { dg-error "has no data movement clause" "" { target *-*-* } .-1 } */ +#pragma acc exit data detach(z[1:]) /* { dg-error "array section in .detach. clause" } */ +/* { dg-error "has no data movement clause" "" { target *-*-* } .-1 } */ +#pragma acc enter data attach(z[:]) /* { dg-error "array section in .attach. clause" } */ +/* { dg-error "has no data movement clause" "" { target *-*-* } .-1 } */ +#pragma acc exit data detach(z[:]) /* { dg-error "array section in .detach. clause" } */ +/* { dg-error "has no data movement clause" "" { target *-*-* } .-1 } */ +#pragma acc enter data attach(z[3]) /* { dg-error "expected pointer in .attach. clause" } */ +#pragma acc exit data detach(z[3]) /* { dg-error "expected pointer in .detach. clause" } */ + +#pragma acc acc enter data attach(s.e) +#pragma acc exit data detach(s.e) attach(z) /* { dg-error ".attach. is not valid for" } */ + +#pragma acc data attach(s.e) + { + } +#pragma acc exit data delete(a) attach(s.a) /* { dg-error ".attach. is not valid for" } */ + +#pragma acc enter data attach(scalar) /* { dg-error "expected pointer in .attach. clause" } */ +/* { dg-error "has no data movement clause" "" { target *-*-* } .-1 } */ +#pragma acc exit data detach(scalar) /* { dg-error "expected pointer in .detach. clause" } */ +/* { dg-error "has no data movement clause" "" { target *-*-* } .-1 } */ +#pragma acc enter data attach(s) /* { dg-error "expected pointer in .attach. clause" } */ +/* { dg-error "has no data movement clause" "" { target *-*-* } .-1 } */ +#pragma acc exit data detach(s) /* { dg-error "expected pointer in .detach. clause" } */ +/* { dg-error "has no data movement clause" "" { target *-*-* } .-1 } */ + } + +#pragma acc enter data attach(y[10]) +#pragma acc exit data detach(y[10]) +} diff --git a/gcc/testsuite/g++.dg/goacc/mdc.C b/gcc/testsuite/g++.dg/goacc/mdc.C new file mode 100644 index 0000000..fbd43aa --- /dev/null +++ b/gcc/testsuite/g++.dg/goacc/mdc.C @@ -0,0 +1,68 @@ +/* Test OpenACC's support for manual deep copy, including the attach + and detach clauses. */ + +void +t1 () +{ + struct foo { + int *a, *b, c, d, *e; + } s; + + struct foo& rs = s; + + int *a, *z, scalar, **y; + int* const &ra = a; + int* const &rz = z; + int& rscalar = scalar; + int** const &ry = y; + +#pragma acc enter data copyin(rs) detach(rz) /* { dg-error ".detach. is not valid for" } */ + { +#pragma acc data copy(rs.a[0:10]) copy(rz[0:10]) + { + s.e = z; +#pragma acc parallel loop attach(rs.e) detach(rs.b) /* { dg-error ".detach. is not valid for" } */ + for (int i = 0; i < 10; i++) + s.a[i] = s.e[i]; + + a = s.e; +#pragma acc enter data attach(ra) detach(rs.c) /* { dg-error ".detach. is not valid for" } */ +#pragma acc exit data detach(ra) + } + +#pragma acc enter data attach(rz[:5]) /* { dg-error "array section in .attach. clause" } */ +/* { dg-error "has no data movement clause" "" { target *-*-* } .-1 } */ +#pragma acc exit data detach(rz[:5]) /* { dg-error "array section in .detach. clause" } */ +/* { dg-error "has no data movement clause" "" { target *-*-* } .-1 } */ +#pragma acc enter data attach(rz[1:]) /* { dg-error "array section in .attach. clause" } */ +/* { dg-error "has no data movement clause" "" { target *-*-* } .-1 } */ +#pragma acc exit data detach(rz[1:]) /* { dg-error "array section in .detach. clause" } */ +/* { dg-error "has no data movement clause" "" { target *-*-* } .-1 } */ +#pragma acc enter data attach(rz[:]) /* { dg-error "array section in .attach. clause" } */ +/* { dg-error "has no data movement clause" "" { target *-*-* } .-1 } */ +#pragma acc exit data detach(rz[:]) /* { dg-error "array section in .detach. clause" } */ +/* { dg-error "has no data movement clause" "" { target *-*-* } .-1 } */ +#pragma acc enter data attach(rz[3]) /* { dg-error "expected pointer in .attach. clause" } */ +#pragma acc exit data detach(rz[3]) /* { dg-error "expected pointer in .detach. clause" } */ + +#pragma acc acc enter data attach(rs.e) +#pragma acc exit data detach(rs.e) attach(rz) /* { dg-error ".attach. is not valid for" } */ + +#pragma acc data attach(rs.e) + { + } +#pragma acc exit data delete(ra) attach(rs.a) /* { dg-error ".attach. is not valid for" } */ + +#pragma acc enter data attach(rscalar) /* { dg-error "expected pointer in .attach. clause" } */ +/* { dg-error "has no data movement clause" "" { target *-*-* } .-1 } */ +#pragma acc exit data detach(rscalar) /* { dg-error "expected pointer in .detach. clause" } */ +/* { dg-error "has no data movement clause" "" { target *-*-* } .-1 } */ +#pragma acc enter data attach(rs) /* { dg-error "expected pointer in .attach. clause" } */ +/* { dg-error "has no data movement clause" "" { target *-*-* } .-1 } */ +#pragma acc exit data detach(rs) /* { dg-error "expected pointer in .detach. clause" } */ +/* { dg-error "has no data movement clause" "" { target *-*-* } .-1 } */ + } + +#pragma acc enter data attach(ry[10]) +#pragma acc exit data detach(ry[10]) +} diff --git a/gcc/testsuite/gfortran.dg/goacc/data-clauses.f95 b/gcc/testsuite/gfortran.dg/goacc/data-clauses.f95 index b94214e..1a4a671 100644 --- a/gcc/testsuite/gfortran.dg/goacc/data-clauses.f95 +++ b/gcc/testsuite/gfortran.dg/goacc/data-clauses.f95 @@ -39,9 +39,9 @@ contains !$acc end data - !$acc parallel copy (tip) ! { dg-error "POINTER" } + !$acc parallel copy (tip) !$acc end parallel - !$acc parallel copy (tia) ! { dg-error "ALLOCATABLE" } + !$acc parallel copy (tia) !$acc end parallel !$acc parallel deviceptr (i) copy (i) ! { dg-error "multiple clauses" } !$acc end parallel @@ -54,9 +54,9 @@ contains !$acc end data - !$acc parallel copyin (tip) ! { dg-error "POINTER" } + !$acc parallel copyin (tip) !$acc end parallel - !$acc parallel copyin (tia) ! { dg-error "ALLOCATABLE" } + !$acc parallel copyin (tia) !$acc end parallel !$acc parallel deviceptr (i) copyin (i) ! { dg-error "multiple clauses" } !$acc end parallel @@ -71,9 +71,9 @@ contains !$acc end data - !$acc parallel copyout (tip) ! { dg-error "POINTER" } + !$acc parallel copyout (tip) !$acc end parallel - !$acc parallel copyout (tia) ! { dg-error "ALLOCATABLE" } + !$acc parallel copyout (tia) !$acc end parallel !$acc parallel deviceptr (i) copyout (i) ! { dg-error "multiple clauses" } !$acc end parallel @@ -90,9 +90,9 @@ contains !$acc end data - !$acc parallel create (tip) ! { dg-error "POINTER" } + !$acc parallel create (tip) !$acc end parallel - !$acc parallel create (tia) ! { dg-error "ALLOCATABLE" } + !$acc parallel create (tia) !$acc end parallel !$acc parallel deviceptr (i) create (i) ! { dg-error "multiple clauses" } !$acc end parallel @@ -111,9 +111,9 @@ contains !$acc end data - !$acc parallel present (tip) ! { dg-error "POINTER" } + !$acc parallel present (tip) !$acc end parallel - !$acc parallel present (tia) ! { dg-error "ALLOCATABLE" } + !$acc parallel present (tia) !$acc end parallel !$acc parallel deviceptr (i) present (i) ! { dg-error "multiple clauses" } !$acc end parallel @@ -144,9 +144,9 @@ contains !$acc end parallel - !$acc parallel present_or_copy (tip) ! { dg-error "POINTER" } + !$acc parallel present_or_copy (tip) !$acc end parallel - !$acc parallel present_or_copy (tia) ! { dg-error "ALLOCATABLE" } + !$acc parallel present_or_copy (tia) !$acc end parallel !$acc parallel deviceptr (i) present_or_copy (i) ! { dg-error "multiple clauses" } !$acc end parallel @@ -169,9 +169,9 @@ contains !$acc end data - !$acc parallel present_or_copyin (tip) ! { dg-error "POINTER" } + !$acc parallel present_or_copyin (tip) !$acc end parallel - !$acc parallel present_or_copyin (tia) ! { dg-error "ALLOCATABLE" } + !$acc parallel present_or_copyin (tia) !$acc end parallel !$acc parallel deviceptr (i) present_or_copyin (i) ! { dg-error "multiple clauses" } !$acc end parallel @@ -196,9 +196,9 @@ contains !$acc end data - !$acc parallel present_or_copyout (tip) ! { dg-error "POINTER" } + !$acc parallel present_or_copyout (tip) !$acc end parallel - !$acc parallel present_or_copyout (tia) ! { dg-error "ALLOCATABLE" } + !$acc parallel present_or_copyout (tia) !$acc end parallel !$acc parallel deviceptr (i) present_or_copyout (i) ! { dg-error "multiple clauses" } !$acc end parallel @@ -225,9 +225,9 @@ contains !$acc end data - !$acc parallel present_or_create (tip) ! { dg-error "POINTER" } + !$acc parallel present_or_create (tip) !$acc end parallel - !$acc parallel present_or_create (tia) ! { dg-error "ALLOCATABLE" } + !$acc parallel present_or_create (tia) !$acc end parallel !$acc parallel deviceptr (i) present_or_create (i) ! { dg-error "multiple clauses" } !$acc end parallel @@ -256,4 +256,4 @@ contains !$acc end data end subroutine foo -end module test \ No newline at end of file +end module test diff --git a/gcc/testsuite/gfortran.dg/goacc/derived-types.f90 b/gcc/testsuite/gfortran.dg/goacc/derived-types.f90 new file mode 100644 index 0000000..5fb2981 --- /dev/null +++ b/gcc/testsuite/gfortran.dg/goacc/derived-types.f90 @@ -0,0 +1,77 @@ +! Test ACC UPDATE with derived types. + +module dt + integer, parameter :: n = 10 + type inner + integer :: d(n) + end type inner + type dtype + integer(8) :: a, b, c(n) + type(inner) :: in + end type dtype +end module dt + +program derived_acc + use dt + + implicit none + type(dtype):: var + integer i + !$acc declare create(var) + !$acc declare pcopy(var%a) ! { dg-error "Syntax error in OpenMP" } + + !$acc update host(var) + !$acc update host(var%a) + !$acc update device(var) + !$acc update device(var%a) + !$acc update self(var) + !$acc update self(var%a) + + !$acc enter data copyin(var) + !$acc enter data copyin(var%a) + + !$acc exit data copyout(var) + !$acc exit data copyout(var%a) + + !$acc data copy(var) + !$acc end data + + !$acc data copyout(var%a) + !$acc end data + + !$acc parallel loop pcopyout(var) + do i = 1, 10 + end do + !$acc end parallel loop + + !$acc parallel loop copyout(var%a) + do i = 1, 10 + end do + !$acc end parallel loop + + !$acc parallel pcopy(var) + !$acc end parallel + + !$acc parallel pcopy(var%a) + do i = 1, 10 + end do + !$acc end parallel + + !$acc kernels pcopyin(var) + !$acc end kernels + + !$acc kernels pcopy(var%a) + do i = 1, 10 + end do + !$acc end kernels + + !$acc kernels loop pcopyin(var) + do i = 1, 10 + end do + !$acc end kernels loop + + !$acc kernels loop pcopy(var%a) + do i = 1, 10 + end do + !$acc end kernels loop +end program derived_acc diff --git a/gcc/testsuite/gfortran.dg/goacc/enter-exit-data.f95 b/gcc/testsuite/gfortran.dg/goacc/enter-exit-data.f95 index 805459c..b616b39 100644 --- a/gcc/testsuite/gfortran.dg/goacc/enter-exit-data.f95 +++ b/gcc/testsuite/gfortran.dg/goacc/enter-exit-data.f95 @@ -44,14 +44,14 @@ contains !$acc enter data wait (i, 1) !$acc enter data wait (a) ! { dg-error "INTEGER" } !$acc enter data wait (b(5:6)) ! { dg-error "INTEGER" } - !$acc enter data copyin (tip) ! { dg-error "POINTER" } - !$acc enter data copyin (tia) ! { dg-error "ALLOCATABLE" } - !$acc enter data create (tip) ! { dg-error "POINTER" } - !$acc enter data create (tia) ! { dg-error "ALLOCATABLE" } - !$acc enter data present_or_copyin (tip) ! { dg-error "POINTER" } - !$acc enter data present_or_copyin (tia) ! { dg-error "ALLOCATABLE" } - !$acc enter data present_or_create (tip) ! { dg-error "POINTER" } - !$acc enter data present_or_create (tia) ! { dg-error "ALLOCATABLE" } + !$acc enter data copyin (tip) + !$acc enter data copyin (tia) + !$acc enter data create (tip) + !$acc enter data create (tia) + !$acc enter data present_or_copyin (tip) + !$acc enter data present_or_copyin (tia) + !$acc enter data present_or_create (tip) + !$acc enter data present_or_create (tia) !$acc enter data copyin (i) create (i) ! { dg-error "multiple clauses" } !$acc enter data copyin (i) present_or_copyin (i) ! { dg-error "multiple clauses" } !$acc enter data create (i) present_or_copyin (i) ! { dg-error "multiple clauses" } @@ -79,10 +79,10 @@ contains !$acc exit data wait (i, 1) !$acc exit data wait (a) ! { dg-error "INTEGER" } !$acc exit data wait (b(5:6)) ! { dg-error "INTEGER" } - !$acc exit data copyout (tip) ! { dg-error "POINTER" } - !$acc exit data copyout (tia) ! { dg-error "ALLOCATABLE" } - !$acc exit data delete (tip) ! { dg-error "POINTER" } - !$acc exit data delete (tia) ! { dg-error "ALLOCATABLE" } + !$acc exit data copyout (tip) + !$acc exit data copyout (tia) + !$acc exit data delete (tip) + !$acc exit data delete (tia) !$acc exit data copyout (i) delete (i) ! { dg-error "multiple clauses" } !$acc exit data finalize !$acc exit data finalize copyout (i) diff --git a/gcc/tree-pretty-print.c b/gcc/tree-pretty-print.c index 99eca4a..5455da9 100644 --- a/gcc/tree-pretty-print.c +++ b/gcc/tree-pretty-print.c @@ -826,6 +826,15 @@ dump_omp_clause (pretty_printer *pp, tree clause, int spc, dump_flags_t flags) case GOMP_MAP_LINK: pp_string (pp, "link"); break; + case GOMP_MAP_ATTACH: + pp_string (pp, "attach"); + break; + case GOMP_MAP_DETACH: + pp_string (pp, "detach"); + break; + case GOMP_MAP_FORCE_DETACH: + pp_string (pp, "force_detach"); + break; default: gcc_unreachable (); } diff --git a/include/gomp-constants.h b/include/gomp-constants.h index fc7c7a2..a3fe624 100644 --- a/include/gomp-constants.h +++ b/include/gomp-constants.h @@ -42,6 +42,7 @@ #define GOMP_MAP_FLAG_SPECIAL_2 (1 << 4) #define GOMP_MAP_FLAG_SPECIAL (GOMP_MAP_FLAG_SPECIAL_1 \ | GOMP_MAP_FLAG_SPECIAL_0) +#define GOMP_MAP_DEEP_COPY (1 << 5) /* Flag to force a specific behavior (or else, trigger a run-time error). */ #define GOMP_MAP_FLAG_FORCE (1 << 7) @@ -128,6 +129,13 @@ enum gomp_map_kind /* Decrement usage count and deallocate if zero. */ GOMP_MAP_RELEASE = (GOMP_MAP_FLAG_SPECIAL_2 | GOMP_MAP_DELETE), + /* In OpenACC, attach a pointer to a mapped struct field. */ + GOMP_MAP_ATTACH = (GOMP_MAP_DEEP_COPY | 0), + /* In OpenACC, detach a pointer to a mapped struct field. */ + GOMP_MAP_DETACH = (GOMP_MAP_DEEP_COPY | 1), + /* In OpenACC, detach a pointer to a mapped struct field. */ + GOMP_MAP_FORCE_DETACH = (GOMP_MAP_DEEP_COPY + | GOMP_MAP_FLAG_FORCE | 1), /* Internal to GCC, not used in libgomp. */ /* Do not map, but pointer assign a pointer instead. */ diff --git a/libgomp/libgomp.h b/libgomp/libgomp.h index cb25e86..5636030 100644 --- a/libgomp/libgomp.h +++ b/libgomp/libgomp.h @@ -866,6 +866,8 @@ struct target_var_desc { bool copy_from; /* True if data always should be copied from device to host at the end. */ bool always_copy_from; + /* True if variable should be detached at end of region. */ + bool do_detach; /* Relative offset against key host_start. */ uintptr_t offset; /* Actual length. */ @@ -920,6 +922,8 @@ struct splay_tree_key_s { uintptr_t refcount; /* Dynamic reference count. */ uintptr_t dynamic_refcount; + /* For a block with attached pointers, the attachment counters for each. */ + unsigned short *attach_count; /* Pointer to the original mapping of "omp declare target link" object. */ splay_tree_key link_key; }; @@ -1061,6 +1065,8 @@ enum gomp_map_vars_kind GOMP_MAP_VARS_ENTER_DATA }; +struct gomp_coalesce_buf; + extern void gomp_acc_insert_pointer (size_t, void **, size_t *, void *, int); extern void gomp_acc_remove_pointer (void *, size_t, bool, int, int, int); extern void gomp_acc_declare_allocate (bool, size_t, void **, size_t *, @@ -1072,6 +1078,14 @@ extern void gomp_copy_host2dev (struct gomp_device_descr *, extern void gomp_copy_dev2host (struct gomp_device_descr *, struct goacc_asyncqueue *, void *, const void *, size_t); +extern uintptr_t gomp_map_val (struct target_mem_desc *, void **, size_t); +extern void gomp_attach_pointer (struct gomp_device_descr *, + struct goacc_asyncqueue *, splay_tree, + splay_tree_key, uintptr_t, size_t, + struct gomp_coalesce_buf *); +extern void gomp_detach_pointer (struct gomp_device_descr *, + struct goacc_asyncqueue *, splay_tree_key, + uintptr_t, bool, struct gomp_coalesce_buf *); extern struct target_mem_desc *gomp_map_vars (struct gomp_device_descr *, size_t, void **, void **, @@ -1083,9 +1097,9 @@ extern struct target_mem_desc *gomp_map_vars_async (struct gomp_device_descr *, size_t *, void *, bool, enum gomp_map_vars_kind); extern void gomp_unmap_tgt (struct target_mem_desc *); -extern void gomp_unmap_vars (struct target_mem_desc *, bool); +extern void gomp_unmap_vars (struct target_mem_desc *, bool, bool); extern void gomp_unmap_vars_async (struct target_mem_desc *, bool, - struct goacc_asyncqueue *); + struct goacc_asyncqueue *, bool); extern void gomp_init_device (struct gomp_device_descr *); extern bool gomp_fini_device (struct gomp_device_descr *); extern void gomp_free_memmap (struct splay_tree_s *); diff --git a/libgomp/libgomp.map b/libgomp/libgomp.map index ba9218b..a086dd2 100644 --- a/libgomp/libgomp.map +++ b/libgomp/libgomp.map @@ -480,6 +480,16 @@ OACC_2.5 { acc_update_self_async_array_h_; } OACC_2.0.1; +OACC_2.6 { + global: + acc_attach; + acc_attach_async; + acc_detach; + acc_detach_async; + acc_detach_finalize; + acc_detach_finalize_async; +} OACC_2.5; + GOACC_2.0 { global: GOACC_data_end; diff --git a/libgomp/oacc-async.c b/libgomp/oacc-async.c index 68aaf19..8f700e0 100644 --- a/libgomp/oacc-async.c +++ b/libgomp/oacc-async.c @@ -251,14 +251,14 @@ goacc_async_unmap_tgt (void *ptr) attribute_hidden void goacc_async_copyout_unmap_vars (struct target_mem_desc *tgt, - struct goacc_asyncqueue *aq) + struct goacc_asyncqueue *aq, bool finalize) { struct gomp_device_descr *devicep = tgt->device_descr; /* Increment reference to delay freeing of device memory until callback has triggered. */ tgt->refcount++; - gomp_unmap_vars_async (tgt, true, aq); + gomp_unmap_vars_async (tgt, true, aq, finalize); devicep->openacc.async.queue_callback_func (aq, goacc_async_unmap_tgt, (void *) tgt); } diff --git a/libgomp/oacc-int.h b/libgomp/oacc-int.h index 3354eb6..9203068 100644 --- a/libgomp/oacc-int.h +++ b/libgomp/oacc-int.h @@ -105,7 +105,7 @@ void goacc_host_init (void); void goacc_init_asyncqueues (struct gomp_device_descr *); bool goacc_fini_asyncqueues (struct gomp_device_descr *); void goacc_async_copyout_unmap_vars (struct target_mem_desc *, - struct goacc_asyncqueue *); + struct goacc_asyncqueue *, bool); void goacc_async_free (struct gomp_device_descr *, struct goacc_asyncqueue *, void *); struct goacc_asyncqueue *get_goacc_asyncqueue (int); diff --git a/libgomp/oacc-mem.c b/libgomp/oacc-mem.c index 050eb0d..ad84857 100644 --- a/libgomp/oacc-mem.c +++ b/libgomp/oacc-mem.c @@ -432,7 +432,7 @@ acc_unmap_data (void *h) gomp_mutex_unlock (&acc_dev->lock); - gomp_unmap_vars (t, true); + gomp_unmap_vars (t, true, false); } #define FLAG_PRESENT (1 << 0) @@ -509,6 +509,7 @@ present_create_copy (unsigned f, void *h, size_t s, int async) /* Initialize dynamic refcount. */ tgt->list[0].key->dynamic_refcount = 1; + tgt->list[0].key->attach_count = NULL; gomp_mutex_lock (&acc_dev->lock); @@ -626,6 +627,7 @@ delete_copyout (unsigned f, void *h, size_t s, int async, const char *libfnname) { n->refcount = 0; n->dynamic_refcount = 0; + n->attach_count = NULL; } if (n->refcount < n->dynamic_refcount) { @@ -821,6 +823,7 @@ gomp_acc_insert_pointer (size_t mapnum, void **hostaddrs, size_t *sizes, /* Initialize dynamic refcount. */ tgt->list[0].key->dynamic_refcount = 1; + tgt->list[0].key->attach_count = NULL; gomp_mutex_lock (&acc_dev->lock); tgt->prev = acc_dev->openacc.data_environ; @@ -908,11 +911,11 @@ gomp_acc_remove_pointer (void *h, size_t s, bool force_copyfrom, int async, /* If running synchronously, unmap immediately. */ if (async < acc_async_noval) - gomp_unmap_vars (t, true); + gomp_unmap_vars (t, true, finalize); else { goacc_aq aq = get_goacc_asyncqueue (async); - goacc_async_copyout_unmap_vars (t, aq); + goacc_async_copyout_unmap_vars (t, aq, finalize); } } @@ -920,3 +923,80 @@ gomp_acc_remove_pointer (void *h, size_t s, bool force_copyfrom, int async, gomp_debug (0, " %s: mappings restored\n", __FUNCTION__); } + + +void +acc_attach_async (void **hostaddr, int async) +{ + struct goacc_thread *thr = goacc_thread (); + struct gomp_device_descr *acc_dev = thr->dev; + goacc_aq aq = get_goacc_asyncqueue (async); + + struct splay_tree_key_s cur_node; + splay_tree_key n; + + if (thr->dev->capabilities & GOMP_OFFLOAD_CAP_SHARED_MEM) + return; + + cur_node.host_start = (uintptr_t) hostaddr; + cur_node.host_end = cur_node.host_start + sizeof (void *); + n = splay_tree_lookup (&acc_dev->mem_map, &cur_node); + + if (n == NULL) + gomp_fatal ("struct not mapped for acc_attach"); + + gomp_attach_pointer (acc_dev, aq, &acc_dev->mem_map, n, (uintptr_t) hostaddr, + 0, NULL); +} + +void +acc_attach (void **hostaddr) +{ + acc_attach_async (hostaddr, acc_async_sync); +} + +static void +goacc_detach_internal (void **hostaddr, int async, bool finalize) +{ + struct goacc_thread *thr = goacc_thread (); + struct gomp_device_descr *acc_dev = thr->dev; + struct splay_tree_key_s cur_node; + splay_tree_key n; + struct goacc_asyncqueue *aq = get_goacc_asyncqueue (async); + + if (thr->dev->capabilities & GOMP_OFFLOAD_CAP_SHARED_MEM) + return; + + cur_node.host_start = (uintptr_t) hostaddr; + cur_node.host_end = cur_node.host_start + sizeof (void *); + n = splay_tree_lookup (&acc_dev->mem_map, &cur_node); + + if (n == NULL) + gomp_fatal ("struct not mapped for acc_detach"); + + gomp_detach_pointer (acc_dev, aq, n, (uintptr_t) hostaddr, finalize, NULL); +} + +void +acc_detach (void **hostaddr) +{ + goacc_detach_internal (hostaddr, acc_async_sync, false); +} + +void +acc_detach_async (void **hostaddr, int async) +{ + goacc_detach_internal (hostaddr, async, false); +} + +void +acc_detach_finalize (void **hostaddr) +{ + goacc_detach_internal (hostaddr, acc_async_sync, true); +} + +void +acc_detach_finalize_async (void **hostaddr, int async) +{ + goacc_detach_internal (hostaddr, async, true); +} diff --git a/libgomp/oacc-parallel.c b/libgomp/oacc-parallel.c index a01e05c..f00aaf2 100644 --- a/libgomp/oacc-parallel.c +++ b/libgomp/oacc-parallel.c @@ -47,12 +47,29 @@ find_pointer (int pos, size_t mapnum, unsigned short *kinds) if (pos + 1 >= mapnum) return 0; - unsigned char kind = kinds[pos+1] & 0xff; + unsigned char kind0 = kinds[pos] & 0xff; - if (kind == GOMP_MAP_TO_PSET) - return 3; - else if (kind == GOMP_MAP_POINTER) - return 2; + switch (kind0) + { + case GOMP_MAP_TO: + case GOMP_MAP_FORCE_TO: + case GOMP_MAP_FROM: + case GOMP_MAP_FORCE_FROM: + case GOMP_MAP_ALLOC: + case GOMP_MAP_RELEASE: + { + unsigned char kind1 = kinds[pos + 1] & 0xff; + if (kind1 == GOMP_MAP_POINTER + || kind1 == GOMP_MAP_ALWAYS_POINTER + || kind1 == GOMP_MAP_ATTACH + || kind1 == GOMP_MAP_DETACH) + return 2; + else if (kind1 == GOMP_MAP_TO_PSET) + return 3; + } + default: + /* empty. */; + } return 0; } @@ -231,20 +248,20 @@ GOACC_parallel_keyed (int device, void (*fn) (void *), devaddrs = gomp_alloca (sizeof (void *) * mapnum); for (i = 0; i < mapnum; i++) - devaddrs[i] = (void *) (tgt->list[i].key->tgt->tgt_start - + tgt->list[i].key->tgt_offset); + devaddrs[i] = (void *) gomp_map_val (tgt, hostaddrs, i); + if (aq == NULL) { acc_dev->openacc.exec_func (tgt_fn, mapnum, hostaddrs, devaddrs, dims, tgt); /* If running synchronously, unmap immediately. */ - gomp_unmap_vars (tgt, true); + gomp_unmap_vars (tgt, true, false); } else { acc_dev->openacc.async.exec_func (tgt_fn, mapnum, hostaddrs, devaddrs, dims, tgt, aq); - goacc_async_copyout_unmap_vars (tgt, aq); + goacc_async_copyout_unmap_vars (tgt, aq, false); } } @@ -310,7 +327,7 @@ GOACC_data_end (void) gomp_debug (0, " %s: restore mappings\n", __FUNCTION__); thr->mapped_data = tgt->prev; - gomp_unmap_vars (tgt, true); + gomp_unmap_vars (tgt, true, false); gomp_debug (0, " %s: mappings restored\n", __FUNCTION__); } @@ -349,6 +366,10 @@ GOACC_enter_exit_data (int device, size_t mapnum, if (mapnum > 0) { unsigned char kind = kinds[0] & 0xff; + + if (kind == GOMP_MAP_STRUCT || kind == GOMP_MAP_FORCE_PRESENT) + kind = kinds[1] & 0xff; + if (kind == GOMP_MAP_DELETE || kind == GOMP_MAP_FORCE_FROM) finalize = true; @@ -359,11 +380,14 @@ GOACC_enter_exit_data (int device, size_t mapnum, { unsigned char kind = kinds[i] & 0xff; - if (kind == GOMP_MAP_POINTER || kind == GOMP_MAP_TO_PSET) + if (kind == GOMP_MAP_POINTER + || kind == GOMP_MAP_TO_PSET + || kind == GOMP_MAP_STRUCT + || kind == GOMP_MAP_FORCE_PRESENT) continue; if (kind == GOMP_MAP_FORCE_ALLOC - || kind == GOMP_MAP_FORCE_PRESENT + || kind == GOMP_MAP_ATTACH || kind == GOMP_MAP_FORCE_TO || kind == GOMP_MAP_TO || kind == GOMP_MAP_ALLOC) @@ -374,6 +398,8 @@ GOACC_enter_exit_data (int device, size_t mapnum, if (kind == GOMP_MAP_RELEASE || kind == GOMP_MAP_DELETE + || kind == GOMP_MAP_DETACH + || kind == GOMP_MAP_FORCE_DETACH || kind == GOMP_MAP_FROM || kind == GOMP_MAP_FORCE_FROM) break; @@ -407,6 +433,9 @@ GOACC_enter_exit_data (int device, size_t mapnum, case GOMP_MAP_ALLOC: acc_present_or_create (hostaddrs[i], sizes[i]); break; + case GOMP_MAP_ATTACH: + case GOMP_MAP_FORCE_PRESENT: + break; case GOMP_MAP_FORCE_ALLOC: acc_create (hostaddrs[i], sizes[i]); break; @@ -416,6 +445,27 @@ GOACC_enter_exit_data (int device, size_t mapnum, case GOMP_MAP_FORCE_TO: acc_copyin (hostaddrs[i], sizes[i]); break; + case GOMP_MAP_STRUCT: + { + int elems = sizes[i]; + struct splay_tree_key_s k; + splay_tree_key str; + k.host_start = (uintptr_t) hostaddrs[i]; + k.host_end = k.host_start + 1; + gomp_mutex_lock (&acc_dev->lock); + str = splay_tree_lookup (&acc_dev->mem_map, &k); + gomp_mutex_unlock (&acc_dev->lock); + /* We increment the dynamic reference count for the struct + itself by the number of struct elements that we + mapped. */ + if (str->refcount != REFCOUNT_INFINITY) + { + str->refcount += elems; + str->dynamic_refcount += elems; + } + i += elems; + } + break; default: gomp_fatal (">>>> GOACC_enter_exit_data UNHANDLED kind 0x%.2x", kind); @@ -433,51 +483,119 @@ GOACC_enter_exit_data (int device, size_t mapnum, i += pointer - 1; } } + + /* This loop only handles explicit "attach" clauses that are not an + implicit part of a copy{,in,out}, etc. mapping. */ + for (i = 0; i < mapnum; i++) + { + unsigned char kind = kinds[i] & 0xff; + + /* Scan for pointers and PSETs. */ + int pointer = find_pointer (i, mapnum, kinds); + + if (!pointer) + { + if (kind == GOMP_MAP_ATTACH) + acc_attach (hostaddrs[i]); + else if (kind == GOMP_MAP_STRUCT) + i += sizes[i]; + } + else + i += pointer - 1; + } } else - for (i = 0; i < mapnum; ++i) - { - unsigned char kind = kinds[i] & 0xff; + { + /* This loop only handles explicit "detach" clauses that are not an + implicit part of a copy{,in,out}, etc. mapping. */ + for (i = 0; i < mapnum; i++) + { + unsigned char kind = kinds[i] & 0xff; - int pointer = find_pointer (i, mapnum, kinds); + int pointer = find_pointer (i, mapnum, kinds); - if (!pointer) - { - switch (kind) - { - case GOMP_MAP_RELEASE: - case GOMP_MAP_DELETE: - if (acc_is_present (hostaddrs[i], sizes[i])) + if (!pointer) + { + if (kind == GOMP_MAP_DETACH) + acc_detach (hostaddrs[i]); + else if (kind == GOMP_MAP_FORCE_DETACH) + acc_detach_finalize (hostaddrs[i]); + else if (kind == GOMP_MAP_STRUCT) + i += sizes[i]; + } + else + i += pointer - 1; + } + + for (i = 0; i < mapnum; ++i) + { + unsigned char kind = kinds[i] & 0xff; + + int pointer = find_pointer (i, mapnum, kinds); + + if (!pointer) + { + switch (kind) + { + case GOMP_MAP_RELEASE: + case GOMP_MAP_DELETE: + if (acc_is_present (hostaddrs[i], sizes[i])) + { + if (finalize) + acc_delete_finalize_async (hostaddrs[i], sizes[i], + async); + else + acc_delete_async (hostaddrs[i], sizes[i], async); + } + break; + case GOMP_MAP_DETACH: + case GOMP_MAP_FORCE_DETACH: + case GOMP_MAP_FORCE_PRESENT: + break; + case GOMP_MAP_FROM: + case GOMP_MAP_FORCE_FROM: + if (finalize) + acc_copyout_finalize_async (hostaddrs[i], sizes[i], async); + else + acc_copyout_async (hostaddrs[i], sizes[i], async); + break; + case GOMP_MAP_STRUCT: { - if (finalize) - acc_delete_finalize_async (hostaddrs[i], sizes[i], async); - else - acc_delete_async (hostaddrs[i], sizes[i], async); + int elems = sizes[i]; + struct splay_tree_key_s k; + splay_tree_key str; + k.host_start = (uintptr_t) hostaddrs[i]; + k.host_end = k.host_start + 1; + gomp_mutex_lock (&acc_dev->lock); + str = splay_tree_lookup (&acc_dev->mem_map, &k); + gomp_mutex_unlock (&acc_dev->lock); + /* Decrement dynamic reference count for the struct by the + number of elements that we are unmapping. */ + if (str->dynamic_refcount >= elems) + { + str->dynamic_refcount -= elems; + str->refcount -= elems; + } + i += elems; } - break; - case GOMP_MAP_FROM: - case GOMP_MAP_FORCE_FROM: - if (finalize) - acc_copyout_finalize_async (hostaddrs[i], sizes[i], async); - else - acc_copyout_async (hostaddrs[i], sizes[i], async); - break; - default: - gomp_fatal (">>>> GOACC_enter_exit_data UNHANDLED kind 0x%.2x", - kind); - break; - } - } - else - { - bool copyfrom = (kind == GOMP_MAP_FORCE_FROM - || kind == GOMP_MAP_FROM); - gomp_acc_remove_pointer (hostaddrs[i], sizes[i], copyfrom, async, - finalize, pointer); - /* See the above comment. */ - i += pointer - 1; - } - } + break; + default: + gomp_fatal (">>>> GOACC_enter_exit_data UNHANDLED kind 0x%.2x", + kind); + break; + } + } + else + { + bool copyfrom = (kind == GOMP_MAP_FORCE_FROM + || kind == GOMP_MAP_FROM); + gomp_acc_remove_pointer (hostaddrs[i], sizes[i], copyfrom, async, + finalize, pointer); + /* See the above comment. */ + i += pointer - 1; + } + } + } } static void diff --git a/libgomp/openacc.h b/libgomp/openacc.h index 2505ac0..1bf2d65 100644 --- a/libgomp/openacc.h +++ b/libgomp/openacc.h @@ -113,12 +113,18 @@ void *acc_hostptr (void *) __GOACC_NOTHROW; int acc_is_present (void *, size_t) __GOACC_NOTHROW; void acc_memcpy_to_device (void *, void *, size_t) __GOACC_NOTHROW; void acc_memcpy_from_device (void *, void *, size_t) __GOACC_NOTHROW; +void acc_attach (void **) __GOACC_NOTHROW; +void acc_attach_async (void **, int) __GOACC_NOTHROW; +void acc_detach (void **) __GOACC_NOTHROW; +void acc_detach_async (void **, int) __GOACC_NOTHROW; /* Finalize versions of copyout/delete functions, specified in OpenACC 2.5. */ void acc_copyout_finalize (void *, size_t) __GOACC_NOTHROW; void acc_copyout_finalize_async (void *, size_t, int) __GOACC_NOTHROW; void acc_delete_finalize (void *, size_t) __GOACC_NOTHROW; void acc_delete_finalize_async (void *, size_t, int) __GOACC_NOTHROW; +void acc_detach_finalize (void **) __GOACC_NOTHROW; +void acc_detach_finalize_async (void **, int) __GOACC_NOTHROW; /* Async functions, specified in OpenACC 2.5. */ void acc_copyin_async (void *, size_t, int) __GOACC_NOTHROW; diff --git a/libgomp/target.c b/libgomp/target.c index 2bfc7e2..bd51982 100644 --- a/libgomp/target.c +++ b/libgomp/target.c @@ -39,6 +39,7 @@ #include #include #include +#include #ifdef PLUGIN_SUPPORT #include @@ -372,6 +373,7 @@ gomp_map_vars_existing (struct gomp_device_descr *devicep, tgt_var->key = oldn; tgt_var->copy_from = GOMP_MAP_COPY_FROM_P (kind); tgt_var->always_copy_from = GOMP_MAP_ALWAYS_FROM_P (kind); + tgt_var->do_detach = false; tgt_var->offset = newn->host_start - oldn->host_start; tgt_var->length = newn->host_end - newn->host_start; @@ -505,7 +507,128 @@ gomp_map_fields_existing (struct target_mem_desc *tgt, (void *) cur_node.host_end); } -static inline uintptr_t +void +gomp_attach_pointer (struct gomp_device_descr *devicep, + struct goacc_asyncqueue *aq, splay_tree mem_map, + splay_tree_key n, uintptr_t attach_to, size_t bias, + struct gomp_coalesce_buf *cbufp) +{ + struct splay_tree_key_s s; + size_t size, idx; + + if (n == NULL) + { + gomp_mutex_unlock (&devicep->lock); + gomp_fatal ("enclosing struct not mapped for attach"); + } + + size = (n->host_end - n->host_start + sizeof (void *) - 1) / sizeof (void *); + /* We might have a pointer in a packed struct: however we cannot have more + than one such pointer in each pointer-sized portion of the struct, so + this is safe. */ + idx = (attach_to - n->host_start) / sizeof (void *); + + if (!n->attach_count) + n->attach_count = gomp_malloc_cleared (sizeof (*n->attach_count) * size); + + if (n->attach_count[idx] < USHRT_MAX) + n->attach_count[idx]++; + else + { + gomp_mutex_unlock (&devicep->lock); + gomp_fatal ("attach count overflow"); + } + + if (n->attach_count[idx] == 1) + { + uintptr_t devptr = n->tgt->tgt_start + n->tgt_offset + attach_to + - n->host_start; + uintptr_t target = (uintptr_t) *(void **) attach_to; + splay_tree_key tn; + uintptr_t data; + + if ((void *) target == NULL) + { + gomp_mutex_unlock (&devicep->lock); + gomp_fatal ("attempt to attach null pointer"); + } + + s.host_start = target + bias; + s.host_end = s.host_start + 1; + tn = splay_tree_lookup (mem_map, &s); + + if (!tn) + { + gomp_mutex_unlock (&devicep->lock); + gomp_fatal ("pointer target not mapped for attach"); + } + + data = tn->tgt->tgt_start + tn->tgt_offset + target - tn->host_start; + + gomp_debug (1, + "%s: attaching host %p, target %p (struct base %p) to %p\n", + __FUNCTION__, (void *) attach_to, (void *) devptr, + (void *) (n->tgt->tgt_start + n->tgt_offset), (void *) data); + + gomp_copy_host2dev (devicep, aq, (void *) devptr, (void *) &data, + sizeof (void *), cbufp); + } + else + gomp_debug (1, "%s: attach count for %p -> %u\n", __FUNCTION__, + (void *) attach_to, n->attach_count[idx]); +} + +void +gomp_detach_pointer (struct gomp_device_descr *devicep, + struct goacc_asyncqueue *aq, splay_tree_key n, + uintptr_t detach_from, bool finalize, + struct gomp_coalesce_buf *cbufp) +{ + size_t idx; + + if (n == NULL) + { + gomp_mutex_unlock (&devicep->lock); + gomp_fatal ("enclosing struct not mapped for detach"); + } + + idx = (detach_from - n->host_start) / sizeof (void *); + + if (!n->attach_count) + gomp_fatal ("no attachment counters for struct"); + + if (finalize) + n->attach_count[idx] = 1; + + if (n->attach_count[idx] == 0) + { + gomp_mutex_unlock (&devicep->lock); + gomp_fatal ("attach count underflow"); + } + else + n->attach_count[idx]--; + + if (n->attach_count[idx] == 0) + { + uintptr_t devptr = n->tgt->tgt_start + n->tgt_offset + detach_from + - n->host_start; + uintptr_t target = (uintptr_t) *(void **) detach_from; + + gomp_debug (1, + "%s: detaching host %p, target %p (struct base %p) to %p\n", + __FUNCTION__, (void *) detach_from, (void *) devptr, + (void *) (n->tgt->tgt_start + n->tgt_offset), + (void *) target); + + gomp_copy_host2dev (devicep, aq, (void *) devptr, (void *) &target, + sizeof (void *), cbufp); + } + else + gomp_debug (1, "%s: attach count for %p -> %u\n", __FUNCTION__, + (void *) detach_from, n->attach_count[idx]); +} + +uintptr_t gomp_map_val (struct target_mem_desc *tgt, void **hostaddrs, size_t i) { if (tgt->list[i].key != NULL) @@ -674,6 +797,12 @@ gomp_map_vars_async (struct gomp_device_descr *devicep, has_firstprivate = true; continue; } + else if ((kind & typemask) == GOMP_MAP_ATTACH) + { + tgt->list[i].key = NULL; + has_firstprivate = true; + continue; + } cur_node.host_start = (uintptr_t) hostaddrs[i]; if (!GOMP_MAP_POINTER_P (kind & typemask)) cur_node.host_end = cur_node.host_start + sizes[i]; @@ -882,6 +1011,30 @@ gomp_map_vars_async (struct gomp_device_descr *devicep, cur_node.tgt_offset = n->tgt->tgt_start + n->tgt_offset + cur_node.host_start - n->host_start; continue; + case GOMP_MAP_ATTACH: + { + cur_node.host_start = (uintptr_t) hostaddrs[i]; + cur_node.host_end = cur_node.host_start + sizeof (void *); + splay_tree_key n = splay_tree_lookup (mem_map, &cur_node); + if (n != NULL) + { + tgt->list[i].key = n; + tgt->list[i].offset = cur_node.host_start - n->host_start; + tgt->list[i].length = n->host_end - n->host_start; + tgt->list[i].copy_from = false; + tgt->list[i].always_copy_from = false; + tgt->list[i].do_detach = true; + } + else + { + gomp_mutex_unlock (&devicep->lock); + gomp_fatal ("outer struct not mapped for attach"); + } + gomp_attach_pointer (devicep, aq, mem_map, n, + (uintptr_t) hostaddrs[i], sizes[i], + cbufp); + continue; + } default: break; } @@ -926,10 +1079,12 @@ gomp_map_vars_async (struct gomp_device_descr *devicep, tgt->list[i].copy_from = GOMP_MAP_COPY_FROM_P (kind & typemask); tgt->list[i].always_copy_from = GOMP_MAP_ALWAYS_FROM_P (kind & typemask); + tgt->list[i].do_detach = false; tgt->list[i].offset = 0; tgt->list[i].length = k->host_end - k->host_start; k->refcount = 1; k->dynamic_refcount = 0; + k->attach_count = NULL; tgt->refcount++; array->left = NULL; array->right = NULL; @@ -980,6 +1135,7 @@ gomp_map_vars_async (struct gomp_device_descr *devicep, tgt->list[j].key = k; tgt->list[j].copy_from = false; tgt->list[j].always_copy_from = false; + tgt->list[j].do_detach = false; if (k->refcount != REFCOUNT_INFINITY) k->refcount++; gomp_map_pointer (tgt, aq, @@ -1098,6 +1254,8 @@ gomp_remove_var (struct gomp_device_descr *devicep, splay_tree_key k) is_tgt_unmapped = true; gomp_unmap_tgt (k->tgt); } + if (k->attach_count) + free (k->attach_count); return is_tgt_unmapped; } @@ -1106,14 +1264,14 @@ gomp_remove_var (struct gomp_device_descr *devicep, splay_tree_key k) has been done already. */ attribute_hidden void -gomp_unmap_vars (struct target_mem_desc *tgt, bool do_copyfrom) +gomp_unmap_vars (struct target_mem_desc *tgt, bool do_copyfrom, bool finalize) { - gomp_unmap_vars_async (tgt, do_copyfrom, NULL); + gomp_unmap_vars_async (tgt, do_copyfrom, NULL, finalize); } attribute_hidden void gomp_unmap_vars_async (struct target_mem_desc *tgt, bool do_copyfrom, - struct goacc_asyncqueue *aq) + struct goacc_asyncqueue *aq, bool finalize) { struct gomp_device_descr *devicep = tgt->device_descr; @@ -1133,10 +1291,23 @@ gomp_unmap_vars_async (struct target_mem_desc *tgt, bool do_copyfrom, } size_t i; + + /* We must perform detachments before any copies back to the host. */ for (i = 0; i < tgt->list_count; i++) { splay_tree_key k = tgt->list[i].key; - if (k == NULL) + + if (k != NULL && tgt->list[i].do_detach) + gomp_detach_pointer (devicep, aq, k, tgt->list[i].key->host_start + + tgt->list[i].offset, finalize, + NULL); + } + + for (i = 0; i < tgt->list_count; i++) + { + splay_tree_key k = tgt->list[i].key; + + if (k == NULL || tgt->list[i].do_detach) continue; bool do_unmap = false; @@ -1731,7 +1902,7 @@ GOMP_target (int device, void (*fn) (void *), const void *unused, GOMP_MAP_VARS_TARGET); devicep->run_func (devicep->target_id, fn_addr, (void *) tgt_vars->tgt_start, NULL); - gomp_unmap_vars (tgt_vars, true); + gomp_unmap_vars (tgt_vars, true, false); } /* Like GOMP_target, but KINDS is 16-bit, UNUSED is no longer present, @@ -1875,7 +2046,7 @@ GOMP_target_ext (int device, void (*fn) (void *), size_t mapnum, tgt_vars ? (void *) tgt_vars->tgt_start : hostaddrs, args); if (tgt_vars) - gomp_unmap_vars (tgt_vars, true); + gomp_unmap_vars (tgt_vars, true, false); } /* Host fallback for GOMP_target_data{,_ext} routines. */ @@ -1944,7 +2115,7 @@ GOMP_target_end_data (void) { struct target_mem_desc *tgt = icv->target_data; icv->target_data = tgt->prev; - gomp_unmap_vars (tgt, true); + gomp_unmap_vars (tgt, true, false); } } @@ -2099,6 +2270,8 @@ gomp_exit_data (struct gomp_device_descr *devicep, size_t mapnum, k->tgt->refcount--; else gomp_unmap_tgt (k->tgt); + if (k->attach_count) + free (k->attach_count); } break; @@ -2226,7 +2399,7 @@ gomp_target_task_fn (void *data) if (ttask->state == GOMP_TARGET_TASK_FINISHED) { if (ttask->tgt) - gomp_unmap_vars (ttask->tgt, true); + gomp_unmap_vars (ttask->tgt, true, false); return false; } diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/deep-copy-1.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/deep-copy-1.c new file mode 100644 index 0000000..d8d7067 --- /dev/null +++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/deep-copy-1.c @@ -0,0 +1,24 @@ +#include +#include + +struct dc +{ + int a; + int *b; +}; + +int +main () +{ + int n = 100, i; + struct dc v = { .a = 3, .b = (int *) malloc (sizeof (int) * n) }; + +#pragma acc parallel loop copy(v.a, v.b[:n]) + for (i = 0; i < n; i++) + v.b[i] = v.a; + + for (i = 0; i < 10; i++) + assert (v.b[i] == v.a); + + return 0; +} diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/deep-copy-2.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/deep-copy-2.c new file mode 100644 index 0000000..7e26e9a --- /dev/null +++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/deep-copy-2.c @@ -0,0 +1,29 @@ +#include +#include + +int +main(int argc, char* argv[]) +{ + struct foo { + int *a, *b, c, d, *e; + } s; + + s.a = (int *) malloc (16 * sizeof (int)); + s.b = (int *) malloc (16 * sizeof (int)); + s.e = (int *) malloc (16 * sizeof (int)); + + #pragma acc data copy(s) + { + #pragma acc data copy(s.a[0:10]) + { + #pragma acc parallel loop attach(s.a) + for (int i = 0; i < 10; i++) + s.a[i] = i; + } + } + + for (int i = 0; i < 10; i++) + assert (s.a[i] == i); + + return 0; +} diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/deep-copy-3.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/deep-copy-3.c new file mode 100644 index 0000000..cec764b --- /dev/null +++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/deep-copy-3.c @@ -0,0 +1,34 @@ +#include +#include +#include + +int +main () +{ + int n = 100, i; + int *a = (int *) malloc (sizeof (int) * n); + int *b; + + for (i = 0; i < n; i++) + a[i] = i+1; + +#pragma acc enter data copyin(a[:n]) create(b) + + b = a; + acc_attach ((void **)&b); + +#pragma acc parallel loop present (b[:n]) + for (i = 0; i < n; i++) + b[i] = i+1; + + acc_detach ((void **)&b); + +#pragma acc exit data copyout(a[:n], b) + + for (i = 0; i < 10; i++) + assert (a[i] == b[i]); + + free (a); + + return 0; +} diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/deep-copy-4.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/deep-copy-4.c new file mode 100644 index 0000000..8874ca0 --- /dev/null +++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/deep-copy-4.c @@ -0,0 +1,87 @@ +#include +#include + +#define LIST_LENGTH 10 + +struct node +{ + struct node *next; + int val; +}; + +int +sum_nodes (struct node *head) +{ + int i = 0, sum = 0; + +#pragma acc parallel reduction(+:sum) present(head[:1]) + { + for (; head != NULL; head = head->next) + sum += head->val; + } + + return sum; +} + +void +insert (struct node *head, int val) +{ + struct node *n = (struct node *) malloc (sizeof (struct node)); + + if (head->next) + { +#pragma acc exit data detach(head->next) + } + + n->val = val; + n->next = head->next; + head->next = n; + +#pragma acc enter data copyin(n[:1]) +#pragma acc enter data attach(head->next) + if (n->next) + { +#pragma acc enter data attach(n->next) + } +} + +void +destroy (struct node *head) +{ + while (head->next != NULL) + { +#pragma acc exit data detach(head->next) + struct node * n = head->next; + head->next = n->next; + if (n->next) + { +#pragma acc exit data detach(n->next) + } +#pragma acc exit data delete (n[:1]) + if (head->next) + { +#pragma acc enter data attach(head->next) + } + free (n); + } +} + +int +main () +{ + struct node list = { .next = NULL, .val = 0 }; + int i; + +#pragma acc enter data copyin(list) + + for (i = 0; i < LIST_LENGTH; i++) + insert (&list, i + 1); + + assert (sum_nodes (&list) == (LIST_LENGTH * LIST_LENGTH + LIST_LENGTH) / 2); + + destroy (&list); + +#pragma acc exit data delete(list) + + return 0; +} diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/deep-copy-5.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/deep-copy-5.c new file mode 100644 index 0000000..89cafbb --- /dev/null +++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/deep-copy-5.c @@ -0,0 +1,81 @@ +#include +#include +#include + +struct node +{ + struct node *next; + int val; +}; + +int +sum_nodes (struct node *head) +{ + int i = 0, sum = 0; + +#pragma acc parallel reduction(+:sum) present(head[:1]) + { + for (; head != NULL; head = head->next) + sum += head->val; + } + + return sum; +} + +void +insert (struct node *head, int val) +{ + struct node *n = (struct node *) malloc (sizeof (struct node)); + + if (head->next) + acc_detach ((void **) &head->next); + + n->val = val; + n->next = head->next; + head->next = n; + + acc_copyin (n, sizeof (struct node)); + acc_attach((void **) &head->next); + + if (n->next) + acc_attach ((void **) &n->next); +} + +void +destroy (struct node *head) +{ + while (head->next != NULL) + { + acc_detach ((void **) &head->next); + struct node * n = head->next; + head->next = n->next; + if (n->next) + acc_detach ((void **) &n->next); + + acc_delete (n, sizeof (struct node)); + if (head->next) + acc_attach((void **) &head->next); + + free (n); + } +} + +int +main () +{ + struct node list = { .next = NULL, .val = 0 }; + int i; + + acc_copyin (&list, sizeof (struct node)); + + for (i = 0; i < 10; i++) + insert (&list, 2); + + assert (sum_nodes (&list) == 10 * 2); + + destroy (&list); + + acc_delete (&list, sizeof (struct node)); + + return 0; +} diff --git a/libgomp/testsuite/libgomp.oacc-fortran/deep-copy-1.f90 b/libgomp/testsuite/libgomp.oacc-fortran/deep-copy-1.f90 new file mode 100644 index 0000000..c4cea11 --- /dev/null +++ b/libgomp/testsuite/libgomp.oacc-fortran/deep-copy-1.f90 @@ -0,0 +1,35 @@ +! { dg-do run } + +! Test of attach/detach with "acc data". + +program dtype + implicit none + integer, parameter :: n = 512 + type mytype + integer, allocatable :: a(:) + end type mytype + integer i + + type(mytype) :: var + + allocate(var%a(1:n)) + +!$acc data copy(var) +!$acc data copy(var%a) + +!$acc parallel loop + do i = 1,n + var%a(i) = i + end do +!$acc end parallel loop + +!$acc end data +!$acc end data + + do i = 1,n + if (i .ne. var%a(i)) stop 1 + end do + + deallocate(var%a) + +end program dtype diff --git a/libgomp/testsuite/libgomp.oacc-fortran/deep-copy-2.f90 b/libgomp/testsuite/libgomp.oacc-fortran/deep-copy-2.f90 new file mode 100644 index 0000000..3593661 --- /dev/null +++ b/libgomp/testsuite/libgomp.oacc-fortran/deep-copy-2.f90 @@ -0,0 +1,33 @@ +! { dg-do run } + +! Test of attach/detach with "acc data", two clauses at once. + +program dtype + implicit none + integer, parameter :: n = 512 + type mytype + integer, allocatable :: a(:) + end type mytype + integer i + + type(mytype) :: var + + allocate(var%a(1:n)) + +!$acc data copy(var) copy(var%a) + +!$acc parallel loop + do i = 1,n + var%a(i) = i + end do +!$acc end parallel loop + +!$acc end data + + do i = 1,n + if (i .ne. var%a(i)) stop 1 + end do + + deallocate(var%a) + +end program dtype diff --git a/libgomp/testsuite/libgomp.oacc-fortran/deep-copy-3.f90 b/libgomp/testsuite/libgomp.oacc-fortran/deep-copy-3.f90 new file mode 100644 index 0000000..667d944 --- /dev/null +++ b/libgomp/testsuite/libgomp.oacc-fortran/deep-copy-3.f90 @@ -0,0 +1,34 @@ +! { dg-do run } + +! Test of attach/detach with "acc parallel". + +program dtype + implicit none + integer, parameter :: n = 512 + type mytype + integer, allocatable :: a(:) + integer, allocatable :: b(:) + end type mytype + integer i + + type(mytype) :: var + + allocate(var%a(1:n)) + allocate(var%b(1:n)) + +!$acc parallel loop copy(var) copy(var%a(1:n)) copy(var%b(1:n)) + do i = 1,n + var%a(i) = i + var%b(i) = i + end do +!$acc end parallel loop + + do i = 1,n + if (i .ne. var%a(i)) stop 1 + if (i .ne. var%b(i)) stop 2 + end do + + deallocate(var%a) + deallocate(var%b) + +end program dtype diff --git a/libgomp/testsuite/libgomp.oacc-fortran/deep-copy-4.f90 b/libgomp/testsuite/libgomp.oacc-fortran/deep-copy-4.f90 new file mode 100644 index 0000000..6949e12 --- /dev/null +++ b/libgomp/testsuite/libgomp.oacc-fortran/deep-copy-4.f90 @@ -0,0 +1,49 @@ +! { dg-do run } + +! Test of attach/detach with "acc enter/exit data". + +program dtype + implicit none + integer, parameter :: n = 512 + type mytype + integer, allocatable :: a(:) + integer, allocatable :: b(:) + end type mytype + integer, allocatable :: r(:) + integer i + + type(mytype) :: var + + allocate(var%a(1:n)) + allocate(var%b(1:n)) + allocate(r(1:n)) + +!$acc enter data copyin(var) + +!$acc enter data copyin(var%a, var%b, r) + +!$acc parallel loop + do i = 1,n + var%a(i) = i + var%b(i) = i * 2 + r(i) = i * 3 + end do +!$acc end parallel loop + +!$acc exit data copyout(var%a) +!$acc exit data copyout(var%b) +!$acc exit data copyout(r) + + do i = 1,n + if (i .ne. var%a(i)) stop 1 + if (i * 2 .ne. var%b(i)) stop 2 + if (i * 3 .ne. r(i)) stop 3 + end do + +!$acc exit data delete(var) + + deallocate(var%a) + deallocate(var%b) + deallocate(r) + +end program dtype diff --git a/libgomp/testsuite/libgomp.oacc-fortran/deep-copy-5.f90 b/libgomp/testsuite/libgomp.oacc-fortran/deep-copy-5.f90 new file mode 100644 index 0000000..6843cf1 --- /dev/null +++ b/libgomp/testsuite/libgomp.oacc-fortran/deep-copy-5.f90 @@ -0,0 +1,57 @@ +! { dg-do run } + +! Test of attach/detach, "enter data" inside "data", and subarray. + +program dtype + implicit none + integer, parameter :: n = 512 + type mytype + integer, allocatable :: a(:) + integer, allocatable :: b(:) + end type mytype + integer i + + type(mytype) :: var + + allocate(var%a(1:n)) + allocate(var%b(1:n)) + +!$acc data copy(var) + + do i = 1, n + var%a(i) = 0 + var%b(i) = 0 + end do + +!$acc enter data copyin(var%a(5:n - 5), var%b(5:n - 5)) + +!$acc parallel loop + do i = 5,n - 5 + var%a(i) = i + var%b(i) = i * 2 + end do +!$acc end parallel loop + +!$acc exit data copyout(var%a(5:n - 5), var%b(5:n - 5)) + +!$acc end data + + do i = 1,4 + if (var%a(i) .ne. 0) stop 1 + if (var%b(i) .ne. 0) stop 2 + end do + + do i = 5,n - 5 + if (i .ne. var%a(i)) stop 3 + if (i * 2 .ne. var%b(i)) stop 4 + end do + + do i = n - 4,n + if (var%a(i) .ne. 0) stop 5 + if (var%b(i) .ne. 0) stop 6 + end do + + deallocate(var%a) + deallocate(var%b) + +end program dtype diff --git a/libgomp/testsuite/libgomp.oacc-fortran/deep-copy-6.f90 b/libgomp/testsuite/libgomp.oacc-fortran/deep-copy-6.f90 new file mode 100644 index 0000000..12910d0 --- /dev/null +++ b/libgomp/testsuite/libgomp.oacc-fortran/deep-copy-6.f90 @@ -0,0 +1,61 @@ +! { dg-do run } + +! Test of attachment counters and finalize. + +program dtype + implicit none + integer, parameter :: n = 512 + type mytype + integer, allocatable :: a(:) + integer, allocatable :: b(:) + end type mytype + integer i + + type(mytype) :: var + + allocate(var%a(1:n)) + allocate(var%b(1:n)) + +!$acc data copy(var) + + do i = 1, n + var%a(i) = 0 + var%b(i) = 0 + end do + +!$acc enter data copyin(var%a(5:n - 5), var%b(5:n - 5)) + + do i = 1,20 + !$acc enter data attach(var%a) + end do + +!$acc parallel loop + do i = 5,n - 5 + var%a(i) = i + var%b(i) = i * 2 + end do +!$acc end parallel loop + +!$acc exit data copyout(var%a(5:n - 5), var%b(5:n - 5)) finalize + +!$acc end data + + do i = 1,4 + if (var%a(i) .ne. 0) stop 1 + if (var%b(i) .ne. 0) stop 2 + end do + + do i = 5,n - 5 + if (i .ne. var%a(i)) stop 3 + if (i * 2 .ne. var%b(i)) stop 4 + end do + + do i = n - 4,n + if (var%a(i) .ne. 0) stop 5 + if (var%b(i) .ne. 0) stop 6 + end do + + deallocate(var%a) + deallocate(var%b) + +end program dtype diff --git a/libgomp/testsuite/libgomp.oacc-fortran/deep-copy-7.f90 b/libgomp/testsuite/libgomp.oacc-fortran/deep-copy-7.f90 new file mode 100644 index 0000000..ab44f0a --- /dev/null +++ b/libgomp/testsuite/libgomp.oacc-fortran/deep-copy-7.f90 @@ -0,0 +1,89 @@ +! { dg-do run } + +! Test of attach/detach with scalar elements and nested derived types. + +program dtype + implicit none + integer, parameter :: n = 512 + type subtype + integer :: g, h + integer, allocatable :: q(:) + end type subtype + type mytype + integer, allocatable :: a(:) + integer, allocatable :: c, d + integer, allocatable :: b(:) + integer :: f + type(subtype) :: s + end type mytype + integer i + + type(mytype) :: var + + allocate(var%a(1:n)) + allocate(var%b(1:n)) + allocate(var%c) + allocate(var%d) + allocate(var%s%q(1:n)) + + var%c = 16 + var%d = 20 + var%f = 7 + var%s%g = 21 + var%s%h = 38 + +!$acc enter data copyin(var) + + do i = 1, n + var%a(i) = 0 + var%b(i) = 0 + var%s%q(i) = 0 + end do + +!$acc data copy(var%a(5:n - 5), var%b(5:n - 5), var%c, var%d) & +!$acc & copy(var%s%q) + +!$acc parallel loop default(none) present(var) + do i = 5,n - 5 + var%a(i) = i + var%b(i) = i * 2 + var%s%q(i) = i * 3 + var%s%g = 100 + var%s%h = 101 + end do +!$acc end parallel loop + +!$acc end data + +!$acc exit data copyout(var) + + do i = 1,4 + if (var%a(i) .ne. 0) stop 1 + if (var%b(i) .ne. 0) stop 2 + if (var%s%q(i) .ne. 0) stop 3 + end do + + do i = 5,n - 5 + if (i .ne. var%a(i)) stop 4 + if (i * 2 .ne. var%b(i)) stop 5 + if (i * 3 .ne. var%s%q(i)) stop 6 + end do + + do i = n - 4,n + if (var%a(i) .ne. 0) stop 7 + if (var%b(i) .ne. 0) stop 8 + if (var%s%q(i) .ne. 0) stop 9 + end do + + if (var%c .ne. 16) stop 10 + if (var%d .ne. 20) stop 11 + if (var%s%g .ne. 100 .or. var%s%h .ne. 101) stop 12 + if (var%f .ne. 7) stop 13 + + deallocate(var%a) + deallocate(var%b) + deallocate(var%c) + deallocate(var%d) + deallocate(var%s%q) + +end program dtype diff --git a/libgomp/testsuite/libgomp.oacc-fortran/deep-copy-8.f90 b/libgomp/testsuite/libgomp.oacc-fortran/deep-copy-8.f90 new file mode 100644 index 0000000..d142763a --- /dev/null +++ b/libgomp/testsuite/libgomp.oacc-fortran/deep-copy-8.f90 @@ -0,0 +1,41 @@ +! { dg-do run } + +! Test of explicit attach/detach clauses and attachment counters. There are no +! acc_attach/acc_detach API routines in Fortran. + +program dtype + use openacc + implicit none + integer, parameter :: n = 512 + type mytype + integer, allocatable :: a(:) + end type mytype + integer i + + type(mytype) :: var + + allocate(var%a(1:n)) + + call acc_copyin(var) + call acc_copyin(var%a) + + !$acc enter data attach(var%a) + +!$acc parallel loop attach(var%a) + do i = 1,n + var%a(i) = i + end do +!$acc end parallel loop + + !$acc exit data detach(var%a) + + call acc_copyout(var%a) + call acc_copyout(var) + + do i = 1,n + if (i .ne. var%a(i)) stop 1 + end do + + deallocate(var%a) + +end program dtype diff --git a/libgomp/testsuite/libgomp.oacc-fortran/derived-type-1.f90 b/libgomp/testsuite/libgomp.oacc-fortran/derived-type-1.f90 new file mode 100644 index 0000000..eb7812d --- /dev/null +++ b/libgomp/testsuite/libgomp.oacc-fortran/derived-type-1.f90 @@ -0,0 +1,28 @@ +! Test derived types with subarrays + +! { dg-do run } + + implicit none + type dtype + integer :: a, b, c + end type dtype + integer, parameter :: n = 100 + integer i + type (dtype), dimension(n) :: d + + !$acc data copy(d(1:n)) + !$acc parallel loop + do i = 1, n + d(i)%a = i + d(i)%b = i-1 + d(i)%c = i+1 + end do + !$acc end data + + do i = 1, n + if (d(i)%a /= i) stop 1 + if (d(i)%b /= i-1) stop 2 + if (d(i)%c /= i+1) stop 3 + end do +end program + diff --git a/libgomp/testsuite/libgomp.oacc-fortran/update-2.f90 b/libgomp/testsuite/libgomp.oacc-fortran/update-2.f90 new file mode 100644 index 0000000..c3c8a07 --- /dev/null +++ b/libgomp/testsuite/libgomp.oacc-fortran/update-2.f90 @@ -0,0 +1,284 @@ +! Test ACC UPDATE with derived types. + +! { dg-do run } + +module dt + integer, parameter :: n = 10 + type inner + integer :: d(n) + end type inner + type mytype + integer(8) :: a, b, c(n) + type(inner) :: in + end type mytype +end module dt + +program derived_acc + use dt + + implicit none + integer i, res + type(mytype) :: var + + var%a = 0 + var%b = 1 + var%c(:) = 10 + var%in%d(:) = 100 + + var%c(:) = 10 + + !$acc enter data copyin(var) + + !$acc parallel loop present(var) + do i = 1, 1 + var%a = var%b + end do + !$acc end parallel loop + + !$acc update host(var%a) + + if (var%a /= var%b) stop 1 + + var%b = 100 + + !$acc update device(var%b) + + !$acc parallel loop present(var) + do i = 1, 1 + var%a = var%b + end do + !$acc end parallel loop + + !$acc update host(var%a) + + if (var%a /= var%b) stop 2 + + !$acc parallel loop present (var) + do i = 1, n + var%c(i) = i + end do + !$acc end parallel loop + + !$acc update host(var%c) + + var%a = -1 + + do i = 1, n + if (var%c(i) /= i) stop 3 + var%c(i) = var%a + end do + + !$acc update device(var%a) + !$acc update device(var%c) + + res = 0 + + !$acc parallel loop present(var) reduction(+:res) + do i = 1, n + if (var%c(i) /= var%a) res = res + 1 + end do + + if (res /= 0) stop 4 + + var%c(:) = 0 + + !$acc update device(var%c) + + !$acc parallel loop present(var) + do i = 5, 5 + var%c(i) = 1 + end do + !$acc end parallel loop + + !$acc update host(var%c(5)) + + do i = 1, n + if (i /= 5 .and. var%c(i) /= 0) stop 5 + if (i == 5 .and. var%c(i) /= 1) stop 6 + end do + + !$acc parallel loop present(var) + do i = 1, n + var%in%d = var%a + end do + !$acc end parallel loop + + !$acc update host(var%in%d) + + do i = 1, n + if (var%in%d(i) /= var%a) stop 7 + end do + + var%c(:) = 0 + + !$acc update device(var%c) + + var%c(:) = -1 + + !$acc parallel loop present(var) + do i = n/2, n + var%c(i) = i + end do + !$acc end parallel loop + + !$acc update host(var%c(n/2:n)) + + do i = 1,n + if (i < n/2 .and. var%c(i) /= -1) stop 8 + if (i >= n/2 .and. var%c(i) /= i) stop 9 + end do + + var%in%d(:) = 0 + !$acc update device(var%in%d) + + !$acc parallel loop present(var) + do i = 5, 5 + var%in%d(i) = 1 + end do + !$acc end parallel loop + + !$acc update host(var%in%d(5)) + + do i = 1, n + if (i /= 5 .and. var%in%d(i) /= 0) stop 10 + if (i == 5 .and. var%in%d(i) /= 1) stop 11 + end do + + !$acc exit data delete(var) + + call derived_acc_subroutine(var) +end program derived_acc + +subroutine derived_acc_subroutine(var) + use dt + + implicit none + integer i, res + type(mytype) :: var + + var%a = 0 + var%b = 1 + var%c(:) = 10 + var%in%d(:) = 100 + + var%c(:) = 10 + + !$acc enter data copyin(var) + + !$acc parallel loop present(var) + do i = 1, 1 + var%a = var%b + end do + !$acc end parallel loop + + !$acc update host(var%a) + + if (var%a /= var%b) stop 12 + + var%b = 100 + + !$acc update device(var%b) + + !$acc parallel loop present(var) + do i = 1, 1 + var%a = var%b + end do + !$acc end parallel loop + + !$acc update host(var%a) + + if (var%a /= var%b) stop 13 + + !$acc parallel loop present (var) + do i = 1, n + var%c(i) = i + end do + !$acc end parallel loop + + !$acc update host(var%c) + + var%a = -1 + + do i = 1, n + if (var%c(i) /= i) stop 14 + var%c(i) = var%a + end do + + !$acc update device(var%a) + !$acc update device(var%c) + + res = 0 + + !$acc parallel loop present(var) reduction(+:res) + do i = 1, n + if (var%c(i) /= var%a) res = res + 1 + end do + + if (res /= 0) stop 15 + + var%c(:) = 0 + + !$acc update device(var%c) + + !$acc parallel loop present(var) + do i = 5, 5 + var%c(i) = 1 + end do + !$acc end parallel loop + + !$acc update host(var%c(5)) + + do i = 1, n + if (i /= 5 .and. var%c(i) /= 0) stop 16 + if (i == 5 .and. var%c(i) /= 1) stop 17 + end do + + !$acc parallel loop present(var) + do i = 1, n + var%in%d = var%a + end do + !$acc end parallel loop + + !$acc update host(var%in%d) + + do i = 1, n + if (var%in%d(i) /= var%a) stop 18 + end do + + var%c(:) = 0 + + !$acc update device(var%c) + + var%c(:) = -1 + + !$acc parallel loop present(var) + do i = n/2, n + var%c(i) = i + end do + !$acc end parallel loop + + !$acc update host(var%c(n/2:n)) + + do i = 1,n + if (i < n/2 .and. var%c(i) /= -1) stop 19 + if (i >= n/2 .and. var%c(i) /= i) stop 20 + end do + + var%in%d(:) = 0 + !$acc update device(var%in%d) + + !$acc parallel loop present(var) + do i = 5, 5 + var%in%d(i) = 1 + end do + !$acc end parallel loop + + !$acc update host(var%in%d(5)) + + do i = 1, n + if (i /= 5 .and. var%in%d(i) /= 0) stop 21 + if (i == 5 .and. var%in%d(i) /= 1) stop 22 + end do + + !$acc exit data delete(var) +end subroutine derived_acc_subroutine