From patchwork Fri Apr 8 14:35:35 2016 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Cesar Philippidis X-Patchwork-Id: 608042 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Received: from sourceware.org (server1.sourceware.org [209.132.180.131]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 3qhMTD3hYbz9sf6 for ; Sat, 9 Apr 2016 00:35:59 +1000 (AEST) Authentication-Results: ozlabs.org; dkim=pass (1024-bit key; unprotected) header.d=gcc.gnu.org header.i=@gcc.gnu.org header.b=clcVTBtV; dkim-atps=neutral DomainKey-Signature: a=rsa-sha1; c=nofws; d=gcc.gnu.org; h=list-id :list-unsubscribe:list-archive:list-post:list-help:sender :subject:to:references:cc:from:message-id:date:mime-version :in-reply-to:content-type; q=dns; s=default; b=weHGRFcTfd+6BGFW5 PgcPXoV02B1lRBijGHcEtehsUBaRuFYX2UfO8PMLCXa5ly7VEDhyzQmLrTUHB4ic g6MFcJKZRgyqX/e29nfobngHVLsOe7EDb9eDTAWD6KN3hcZ5yHNBpfzoXwymmhBS 8p51gt259cEAf1Yy34/hfhnqZk= DKIM-Signature: v=1; a=rsa-sha1; c=relaxed; d=gcc.gnu.org; h=list-id :list-unsubscribe:list-archive:list-post:list-help:sender :subject:to:references:cc:from:message-id:date:mime-version :in-reply-to:content-type; s=default; bh=wh01/vBqh0z6SshmLZ2H/tt Bp0Y=; b=clcVTBtVzWIqlhvL97TXHli640YmwsIv+22JngoYFwPFhL3fZibQS9O WDzB4lvdXxmlxMRICS23rQXk+bxjPo6tzNMRSmtYbxFWGyXjiUMlBO+JwHPdh951 vjUbMLW783CtH0Qp7tp4ob68UhYoQaEK81eUqcO1C/WufbcVNLkg= Received: (qmail 52370 invoked by alias); 8 Apr 2016 14:35:50 -0000 Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm Precedence: bulk List-Id: List-Unsubscribe: List-Archive: List-Post: List-Help: Sender: gcc-patches-owner@gcc.gnu.org Delivered-To: mailing list gcc-patches@gcc.gnu.org Received: (qmail 51279 invoked by uid 89); 8 Apr 2016 14:35:49 -0000 Authentication-Results: sourceware.org; auth=none X-Virus-Found: No X-Spam-SWARE-Status: No, score=-1.6 required=5.0 tests=AWL, BAYES_00, RCVD_IN_DNSWL_NONE, SPF_PASS, URIBL_RED autolearn=ham version=3.3.2 spammy=sum, sk:protect X-HELO: relay1.mentorg.com Received: from relay1.mentorg.com (HELO relay1.mentorg.com) (192.94.38.131) by sourceware.org (qpsmtpd/0.93/v0.84-503-g423c35a) with (AES256-GCM-SHA384 encrypted) ESMTPS; Fri, 08 Apr 2016 14:35:39 +0000 Received: from svr-orw-fem-03.mgc.mentorg.com ([147.34.97.39]) by relay1.mentorg.com with esmtp id 1aoXVX-0004Vs-RR from Cesar_Philippidis@mentor.com ; Fri, 08 Apr 2016 07:35:35 -0700 Received: from [127.0.0.1] (147.34.91.1) by svr-orw-fem-03.mgc.mentorg.com (147.34.97.39) with Microsoft SMTP Server id 14.3.224.2; Fri, 8 Apr 2016 07:35:35 -0700 Subject: Re: openacc reference reductions To: Jakub Jelinek References: <56BA0257.6050607@codesourcery.com> <56BA06C3.90606@acm.org> <56BA10FC.90705@codesourcery.com> <56CB2A76.3090809@codesourcery.com> <57046C2B.6080002@codesourcery.com> <20160406142340.GZ19207@tucnak.redhat.com> <57056FCA.7040602@codesourcery.com> <20160407095657.GD19207@tucnak.redhat.com> <570734E3.7030601@codesourcery.com> <20160408074054.GN19207@tucnak.redhat.com> CC: "gcc-patches@gcc.gnu.org" , Nathan Sidwell From: Cesar Philippidis Message-ID: <5707C1B7.6080808@codesourcery.com> Date: Fri, 8 Apr 2016 07:35:35 -0700 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:38.0) Gecko/20100101 Thunderbird/38.6.0 MIME-Version: 1.0 In-Reply-To: <20160408074054.GN19207@tucnak.redhat.com> On 04/08/2016 12:40 AM, Jakub Jelinek wrote: > On Thu, Apr 07, 2016 at 09:34:43PM -0700, Cesar Philippidis wrote: >> --- a/gcc/gimplify.c >> +++ b/gcc/gimplify.c >> @@ -5802,7 +5802,8 @@ omp_add_variable (struct gimplify_omp_ctx *ctx, tree decl, unsigned int flags) >> flags |= GOVD_SEEN; >> >> n = splay_tree_lookup (ctx->variables, (splay_tree_key)decl); >> - if (n != NULL && (n->value & GOVD_DATA_SHARE_CLASS) != 0) >> + if (n != NULL && (n->value & GOVD_DATA_SHARE_CLASS) != 0 >> + && ctx->region_type != ORT_ACC_PARALLEL) >> { >> /* We shouldn't be re-adding the decl with the same data >> sharing class. */ > > Why? Because I was trying be clever and do everything in gimplify_scan_omp_clauses initially. I removed this in the attached patch. >> @@ -6557,6 +6558,24 @@ gimplify_scan_omp_clauses (tree *list_p, gimple_seq *pre_p, >> break; >> } >> >> +/* OpenACC parallel reductions need a present_or_copy clause to ensure >> + that the original variable used in the reduction gets updated on >> + the host. Scan the list of clauses for reduction so that any existing >> + data clause can be adjusted if necessary. */ >> + if (region_type == ORT_ACC_PARALLEL) >> + { >> + for (c = *list_p; c; c = OMP_CLAUSE_CHAIN (c)) >> + { >> + tree decl = NULL_TREE; >> + >> + if (OMP_CLAUSE_CODE (c) != OMP_CLAUSE_REDUCTION) >> + continue; >> + >> + decl = OMP_CLAUSE_DECL (c); >> + omp_add_variable (ctx, decl, GOVD_REDUCTION); >> + } >> + } >> + > > And this looks also wrong, why? > If I try under the debugger 3 cases: > void f1 (int sum) > { > #pragma acc parallel reduction(+:sum) present_or_copy(sum) > ; > } > void f2 (int sum) > { > #pragma acc parallel present_or_copy(sum) > ; > } > void f3 (int sum) > { > #pragma acc parallel reduction(+:sum) > ; > } > then I see the loop that starts with the while below doing the right thing > already. In the first case you end up with > GOVD_SEEN | GOVD_EXPLICIT | GOVD_REDUCTION | GOVD_MAP > in the second with > GOVD_SEEN | GOVD_EXPLICIT | GOVD_MAP > and third one with > GOVD_SEEN | GOVD_EXPLICIT | GOVD_REDUCTION > > That is where you IMHO should stop at the gimplify_scan_omp_clauses side, > so don't modify neither omp_add_variable nor gimplify_scan_omp_clauses > at all, and do everything else in gimplify_adjust_omp_clauses. > That function walks the explicit clauses and has all the info gathered > during gimplify_scan_omp_clauses available in the splay tree. > So, you can do all the checking there. Say on OMP_CLAUSE_REDUCTION > for the ORT_ACC_PARALLEL check the flags if they include GOVD_PRIVATE > or GOVD_FIRSTPRIVATE, if yes, complain. Also check if GOVD_MAP is included, > if not, add the extra OMP_CLAUSE_MAP tofrom. > And, on OMP_CLAUSE_MAP, check if GOVD_REDUCTION is set on ORT_ACC_PARALLEL, > and if yes, check if it is tofrom and complain otherwise. Yeah, that does simplify things quite a bit. This patch still needs to finish testing. Is it OK for trunk if the test results comes back clean? On 04/08/2016 12:43 AM, Jakub Jelinek wrote: > On Fri, Apr 08, 2016 at 09:40:54AM +0200, Jakub Jelinek wrote: >> So, you can do all the checking there. Say on OMP_CLAUSE_REDUCTION >> for the ORT_ACC_PARALLEL check the flags if they include GOVD_PRIVATE >> or GOVD_FIRSTPRIVATE, if yes, complain. Also check if GOVD_MAP is included, > > Though, > void f1 (int sum) > { > #pragma acc parallel reduction(+:sum) firstprivate(sum) > ; > } > void f2 (int sum) > { > #pragma acc parallel reduction(+:sum) private(sum) > ; > } > is already rejected in the FE, so not sure why you want to deal with that. The FEs a little inconsistent, and I didn't want to make this patch that invasive. Can the FE changes wait to gcc7? Cesar 2016-04-08 Cesar Philippidis PR lto/70289 PR ipa/70348 PR tree-optimization/70373 PR middle-end/70533 PR middle-end/70534 PR middle-end/70535 * gimplify.c (gimplify_adjust_omp_clauses): Add or adjust data clauses for acc parallel reductions as necessary. Error on those that are private. * omp-low.c (is_oacc_parallel_reduction): New function. (scan_sharing_clauses): Use it to prevent installing local variables for those used in acc parallel reductions. (lower_rec_input_clauses): Remove dead code. (lower_oacc_reductions): Add support for reference reductions. (lower_reduction_clauses): Remove dead code. (lower_omp_target): Don't remap variables appearing in acc parallel reductions. * tree.h (OMP_CLAUSE_MAP_IN_REDUCTION): New macro. diff --git a/gcc/gimplify.c b/gcc/gimplify.c index 9c0119e..e376cde 100644 --- a/gcc/gimplify.c +++ b/gcc/gimplify.c @@ -7987,6 +7987,34 @@ gimplify_adjust_omp_clauses (gimple_seq *pre_p, gimple_seq body, tree *list_p, break; } decl = OMP_CLAUSE_DECL (c); + /* Data clasues associated with acc parallel reductions must be + compatible with present_or_copy. Warn and adjust the clause + if that is not the case. */ + if (ctx->region_type == ORT_ACC_PARALLEL) + { + tree t = DECL_P (decl) ? decl : TREE_OPERAND (decl, 0); + n = NULL; + + if (DECL_P (t)) + n = splay_tree_lookup (ctx->variables, (splay_tree_key)t); + + if (n && (n->value & GOVD_REDUCTION)) + { + int kind = OMP_CLAUSE_MAP_KIND (c); + + OMP_CLAUSE_MAP_IN_REDUCTION(c) = 1; + if ((kind & GOMP_MAP_TOFROM) != GOMP_MAP_TOFROM + && kind != GOMP_MAP_FORCE_PRESENT + && kind != GOMP_MAP_POINTER) + { + warning_at (OMP_CLAUSE_LOCATION (c), 0, + "incompatible data clause with reduction " + "on %qE; promoting to present_or_copy", + DECL_NAME (t)); + OMP_CLAUSE_SET_MAP_KIND (c, GOMP_MAP_TOFROM); + } + } + } if (!DECL_P (decl)) { if ((ctx->region_type & ORT_TARGET) != 0 @@ -8118,6 +8146,34 @@ gimplify_adjust_omp_clauses (gimple_seq *pre_p, gimple_seq body, tree *list_p, case OMP_CLAUSE_REDUCTION: decl = OMP_CLAUSE_DECL (c); + /* OpenACC reductions need a present_or_copy data clause. + Add one if necessary. Error is the reduction is private. */ + if (ctx->region_type == ORT_ACC_PARALLEL) + { + n = splay_tree_lookup (ctx->variables, (splay_tree_key)decl); + if (n->value & (GOVD_PRIVATE | GOVD_FIRSTPRIVATE)) + { + error_at (OMP_CLAUSE_LOCATION (c), "invalid private " + "reduction on %qE", DECL_NAME (decl)); + } + else if ((n->value & GOVD_MAP) == 0) + { + tree next = OMP_CLAUSE_CHAIN (c); + tree nc = build_omp_clause (UNKNOWN_LOCATION, OMP_CLAUSE_MAP); + OMP_CLAUSE_SET_MAP_KIND (nc, GOMP_MAP_TOFROM); + OMP_CLAUSE_DECL (nc) = decl; + OMP_CLAUSE_CHAIN (c) = nc; + lang_hooks.decls.omp_finish_clause (nc, pre_p); + for (; nc; nc = OMP_CLAUSE_CHAIN (nc)) + { + OMP_CLAUSE_MAP_IN_REDUCTION (nc) = 1; + if (OMP_CLAUSE_CHAIN (nc) == NULL) + break; + } + OMP_CLAUSE_CHAIN (nc) = next; + n->value |= GOVD_MAP; + } + } if (DECL_P (decl) && omp_shared_to_firstprivate_optimizable_decl_p (decl)) omp_mark_stores (gimplify_omp_ctxp->outer_context, decl); diff --git a/gcc/omp-low.c b/gcc/omp-low.c index 979926d..ed47853 100644 --- a/gcc/omp-low.c +++ b/gcc/omp-low.c @@ -2122,7 +2122,8 @@ scan_sharing_clauses (tree clauses, omp_context *ctx, else install_var_field (decl, true, 3, ctx, base_pointers_restrict); - if (is_gimple_omp_offloaded (ctx->stmt)) + if (is_gimple_omp_offloaded (ctx->stmt) + && !OMP_CLAUSE_MAP_IN_REDUCTION (c)) install_var_local (decl, ctx); } } @@ -4839,7 +4840,7 @@ lower_rec_input_clauses (tree clauses, gimple_seq *ilist, gimple_seq *dlist, gimplify_assign (ptr, x, ilist); } } - else if (is_reference (var) && !is_oacc_parallel (ctx)) + else if (is_reference (var)) { /* For references that are being privatized for Fortran, allocate new backing storage for the new pointer @@ -5575,7 +5576,8 @@ lower_oacc_reductions (location_t loc, tree clauses, tree level, bool inner, tree orig = OMP_CLAUSE_DECL (c); tree var = maybe_lookup_decl (orig, ctx); tree ref_to_res = NULL_TREE; - tree incoming, outgoing; + tree incoming, outgoing, v1, v2, v3; + bool is_private = false; enum tree_code rcode = OMP_CLAUSE_REDUCTION_CODE (c); if (rcode == MINUS_EXPR) @@ -5588,7 +5590,6 @@ lower_oacc_reductions (location_t loc, tree clauses, tree level, bool inner, if (!var) var = orig; - gcc_assert (!is_reference (var)); incoming = outgoing = var; @@ -5624,22 +5625,38 @@ lower_oacc_reductions (location_t loc, tree clauses, tree level, bool inner, for (; cls; cls = OMP_CLAUSE_CHAIN (cls)) if (OMP_CLAUSE_CODE (cls) == OMP_CLAUSE_REDUCTION && orig == OMP_CLAUSE_DECL (cls)) - goto has_outer_reduction; + { + incoming = outgoing = lookup_decl (orig, probe); + goto has_outer_reduction; + } + else if ((OMP_CLAUSE_CODE (cls) == OMP_CLAUSE_FIRSTPRIVATE + || OMP_CLAUSE_CODE (cls) == OMP_CLAUSE_PRIVATE) + && orig == OMP_CLAUSE_DECL (cls)) + { + is_private = true; + goto do_lookup; + } } do_lookup: /* This is the outermost construct with this reduction, see if there's a mapping for it. */ if (gimple_code (outer->stmt) == GIMPLE_OMP_TARGET - && maybe_lookup_field (orig, outer)) + && maybe_lookup_field (orig, outer) && !is_private) { ref_to_res = build_receiver_ref (orig, false, outer); if (is_reference (orig)) ref_to_res = build_simple_mem_ref (ref_to_res); + tree type = TREE_TYPE (var); + if (POINTER_TYPE_P (type)) + type = TREE_TYPE (type); + outgoing = var; - incoming = omp_reduction_init_op (loc, rcode, TREE_TYPE (var)); + incoming = omp_reduction_init_op (loc, rcode, type); } + else if (ctx->outer) + incoming = outgoing = lookup_decl (orig, ctx->outer); else incoming = outgoing = orig; @@ -5649,6 +5666,37 @@ lower_oacc_reductions (location_t loc, tree clauses, tree level, bool inner, if (!ref_to_res) ref_to_res = integer_zero_node; + if (is_reference (orig)) + { + tree type = TREE_TYPE (var); + const char *id = IDENTIFIER_POINTER (DECL_NAME (var)); + + if (!inner) + { + tree x = create_tmp_var (TREE_TYPE (type), id); + gimplify_assign (var, build_fold_addr_expr (x), fork_seq); + } + + v1 = create_tmp_var (type, id); + v2 = create_tmp_var (type, id); + v3 = create_tmp_var (type, id); + + gimplify_assign (v1, var, fork_seq); + gimplify_assign (v2, var, fork_seq); + gimplify_assign (v3, var, fork_seq); + + var = build_simple_mem_ref (var); + v1 = build_simple_mem_ref (v1); + v2 = build_simple_mem_ref (v2); + v3 = build_simple_mem_ref (v3); + outgoing = build_simple_mem_ref (outgoing); + + if (TREE_CODE (incoming) != INTEGER_CST) + incoming = build_simple_mem_ref (incoming); + } + else + v1 = v2 = v3 = var; + /* Determine position in reduction buffer, which may be used by target. */ enum machine_mode mode = TYPE_MODE (TREE_TYPE (var)); @@ -5678,20 +5726,20 @@ lower_oacc_reductions (location_t loc, tree clauses, tree level, bool inner, = build_call_expr_internal_loc (loc, IFN_GOACC_REDUCTION, TREE_TYPE (var), 6, init_code, unshare_expr (ref_to_res), - var, level, op, off); + v1, level, op, off); tree fini_call = build_call_expr_internal_loc (loc, IFN_GOACC_REDUCTION, TREE_TYPE (var), 6, fini_code, unshare_expr (ref_to_res), - var, level, op, off); + v2, level, op, off); tree teardown_call = build_call_expr_internal_loc (loc, IFN_GOACC_REDUCTION, TREE_TYPE (var), 6, teardown_code, - ref_to_res, var, level, op, off); + ref_to_res, v3, level, op, off); - gimplify_assign (var, setup_call, &before_fork); - gimplify_assign (var, init_call, &after_fork); - gimplify_assign (var, fini_call, &before_join); + gimplify_assign (v1, setup_call, &before_fork); + gimplify_assign (v2, init_call, &after_fork); + gimplify_assign (v3, fini_call, &before_join); gimplify_assign (outgoing, teardown_call, &after_join); } @@ -5933,9 +5981,6 @@ lower_reduction_clauses (tree clauses, gimple_seq *stmt_seqp, omp_context *ctx) } } - if (is_gimple_omp_oacc (ctx->stmt)) - return; - stmt = gimple_build_call (builtin_decl_explicit (BUILT_IN_GOMP_ATOMIC_START), 0); gimple_seq_add_stmt (stmt_seqp, stmt); @@ -15829,7 +15874,10 @@ lower_omp_target (gimple_stmt_iterator *gsi_p, omp_context *ctx) if (!maybe_lookup_field (var, ctx)) continue; - if (offloaded) + /* Don't remap oacc parallel reduction variables, because the + intermediate result must be local to each gang. */ + if (offloaded && !(OMP_CLAUSE_CODE (c) == OMP_CLAUSE_MAP + && OMP_CLAUSE_MAP_IN_REDUCTION(c))) { x = build_receiver_ref (var, true, ctx); tree new_var = lookup_decl (var, ctx); diff --git a/gcc/tree.h b/gcc/tree.h index fa70596..87e7563 100644 --- a/gcc/tree.h +++ b/gcc/tree.h @@ -1536,6 +1536,9 @@ extern void protected_set_expr_location (tree, location_t); treatment if OMP_CLAUSE_SIZE is zero. */ #define OMP_CLAUSE_MAP_MAYBE_ZERO_LENGTH_ARRAY_SECTION(NODE) \ TREE_PROTECTED (OMP_CLAUSE_SUBCODE_CHECK (NODE, OMP_CLAUSE_MAP)) +/* Nonzero if this map clause is for an ACC parallel reduction variable. */ +#define OMP_CLAUSE_MAP_IN_REDUCTION(NODE) \ + TREE_PRIVATE (OMP_CLAUSE_SUBCODE_CHECK (NODE, OMP_CLAUSE_MAP)) #define OMP_CLAUSE_PROC_BIND_KIND(NODE) \ (OMP_CLAUSE_SUBCODE_CHECK (NODE, OMP_CLAUSE_PROC_BIND)->omp_clause.subcode.proc_bind_kind)