From patchwork Thu Nov 19 10:30:34 2015
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Tom de Vries
X-Patchwork-Id: 546378
Delivered-To: mailing list gcc-patches@gcc.gnu.org
Subject: Re: [PATCH, 10/16] Add pass_oacc_kernels pass group in passes.def
To: Richard Biener
References: <5640BD31.2060602@mentor.com> <5640FB07.6010008@mentor.com> <5649C41A.40403@mentor.com>
CC: "gcc-patches@gnu.org", Jakub Jelinek
From: Tom de Vries
Message-ID: <564DA4CA.3020506@mentor.com>
Date: Thu, 19 Nov 2015 11:30:34 +0100

On 16/11/15 13:45, Richard Biener wrote:
>> I've eliminated all the uses for pass_tree_loop_init/pass_tree_loop_done in
>> the pass group. Instead, I've added conditional loop optimizer setup in:
>> - pass_lim and pass_scev_cprop (added in this patch), and

Reposting the "Add pass_oacc_kernels pass group in passes.def" patch.

pass_scev_cprop is no longer part of the pass group.

And I've dropped the scev_initialize in pass_lim. Pass_lim is part of the
pass_tree_loop pass group, where AFAIU scev info is initialized at the start
of the pass group and updated or reset by passes in the pass group if
necessary, such that it's always available, or can be recalculated on the
spot.

First, pass_lim doesn't invalidate scev info. And second, AFAIU pass_lim
doesn't use scev info. So there doesn't seem to be a need to do anything
about scev info for using pass_lim outside pass_tree_loop.

>> - pass_parallelize_loops_oacc_kernels (added in patch "Add
>>   pass_parallelize_loops_oacc_kernels").
> You miss calling scev_finalize ().

I've added the scev_finalize () in patch "Add
pass_parallelize_loops_oacc_kernels".

Thanks,
- Tom

Add pass_oacc_kernels pass group in passes.def

2015-11-09  Tom de Vries

	* omp-low.c (pass_expand_omp_ssa::clone): New function.
	* passes.def: Add pass_oacc_kernels pass group.
	* tree-ssa-loop-ch.c (pass_ch::clone): New function.
	* tree-ssa-loop-im.c (tree_ssa_lim): Make static.
	(pass_lim::execute): Allow to run outside pass_tree_loop.

---
 gcc/omp-low.c          |  1 +
 gcc/passes.def         | 25 +++++++++++++++++++++++++
 gcc/tree-ssa-loop-ch.c |  2 ++
 gcc/tree-ssa-loop-im.c | 10 +++++++++-
 4 files changed, 37 insertions(+), 1 deletion(-)

diff --git a/gcc/omp-low.c b/gcc/omp-low.c
index 9c27396..d2f88b3 100644
--- a/gcc/omp-low.c
+++ b/gcc/omp-low.c
@@ -13385,6 +13385,7 @@ public:
       return !(fun->curr_properties & PROP_gimple_eomp);
     }
   virtual unsigned int execute (function *) { return execute_expand_omp (); }
+  opt_pass * clone () { return new pass_expand_omp_ssa (m_ctxt); }
 
 }; // class pass_expand_omp_ssa
 
diff --git a/gcc/passes.def b/gcc/passes.def
index 17027786..00446c3 100644
--- a/gcc/passes.def
+++ b/gcc/passes.def
@@ -88,7 +88,32 @@ along with GCC; see the file COPYING3.  If not see
 	  /* pass_build_ealias is a dummy pass that ensures that we
 	     execute TODO_rebuild_alias at this point.  */
 	  NEXT_PASS (pass_build_ealias);
+	  /* Pass group that runs when the function is an offloaded function
+	     containing oacc kernels loops.  Part 1.  */
+	  NEXT_PASS (pass_oacc_kernels);
+	  PUSH_INSERT_PASSES_WITHIN (pass_oacc_kernels)
+	      /* We need pass_ch here, because pass_lim has no effect on
+	         exit-first loops (PR65442).  Ideally we want to remove both
+	         this pass instantiation, and the reverse transformation
+	         transform_to_exit_first_loop_alt, which is done in
+	         pass_parallelize_loops_oacc_kernels.  */
+	      NEXT_PASS (pass_ch);
+	  POP_INSERT_PASSES ()
 	  NEXT_PASS (pass_fre);
+	  /* Pass group that runs when the function is an offloaded function
+	     containing oacc kernels loops.  Part 2.  */
+	  NEXT_PASS (pass_oacc_kernels2);
+	  PUSH_INSERT_PASSES_WITHIN (pass_oacc_kernels2)
+	      /* We use pass_lim to rewrite in-memory iteration and reduction
+	         variable accesses in loops into local variables accesses.  */
+	      NEXT_PASS (pass_lim);
+	      NEXT_PASS (pass_dominator, false /* may_peel_loop_headers_p */);
+	      NEXT_PASS (pass_lim);
+	      NEXT_PASS (pass_dominator, false /* may_peel_loop_headers_p */);
+	      NEXT_PASS (pass_dce);
+	      NEXT_PASS (pass_parallelize_loops_oacc_kernels);
+	      NEXT_PASS (pass_expand_omp_ssa);
+	  POP_INSERT_PASSES ()
 	  NEXT_PASS (pass_merge_phi);
 	  NEXT_PASS (pass_dse);
 	  NEXT_PASS (pass_cd_dce);
diff --git a/gcc/tree-ssa-loop-ch.c b/gcc/tree-ssa-loop-ch.c
index 7e618bf..6493fcc 100644
--- a/gcc/tree-ssa-loop-ch.c
+++ b/gcc/tree-ssa-loop-ch.c
@@ -165,6 +165,8 @@ public:
   /* Initialize and finalize loop structures, copying headers inbetween.  */
   virtual unsigned int execute (function *);
 
+  opt_pass * clone () { return new pass_ch (m_ctxt); }
+
 protected:
   /* ch_base method: */
   virtual bool process_loop_p (struct loop *loop);
diff --git a/gcc/tree-ssa-loop-im.c b/gcc/tree-ssa-loop-im.c
index 30b53ce..96f05f2 100644
--- a/gcc/tree-ssa-loop-im.c
+++ b/gcc/tree-ssa-loop-im.c
@@ -2496,7 +2496,7 @@ tree_ssa_lim_finalize (void)
 /* Moves invariants from loops.  Only "expensive" invariants are moved out --
    i.e. those that are likely to be win regardless of the register pressure.  */
 
-unsigned int
+static unsigned int
 tree_ssa_lim (void)
 {
   unsigned int todo;
@@ -2560,9 +2560,17 @@ public:
 
 unsigned int
 pass_lim::execute (function *fun)
 {
+  if (!loops_state_satisfies_p (LOOPS_NORMAL
+				| LOOPS_HAVE_RECORDED_EXITS))
+    loop_optimizer_init (LOOPS_NORMAL
+			 | LOOPS_HAVE_RECORDED_EXITS);
+
   if (number_of_loops (fun) <= 1)
     return 0;
 
+  if (!loops_state_satisfies_p (LOOP_CLOSED_SSA))
+    rewrite_into_loop_closed_ssa (NULL, TODO_update_ssa);
+
   return tree_ssa_lim ();
 }
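
For readers who want to see the control flow of the new pass_lim::execute in isolation, here is a minimal stand-alone sketch. It is not GCC code: toy_function, state_satisfies, toy_lim_execute and toy_demo are hypothetical names invented for this illustration, and the flag values are arbitrary. Only the order of the checks is taken from the hunk above: set up the loop optimizer state if it is missing, bail out if there are no real loops, rewrite into loop-closed SSA if needed, then run the invariant motion.

```cpp
#include <cassert>

// Toy stand-ins for the loop-state flags named in the patch.  The values
// are invented for this sketch and do not match GCC's definitions.
enum toy_loop_state : unsigned
{
  LOOPS_NORMAL = 1u << 0,
  LOOPS_HAVE_RECORDED_EXITS = 1u << 1,
  LOOP_CLOSED_SSA = 1u << 2
};

// Hypothetical model of the per-function state the pass looks at.
struct toy_function
{
  unsigned loops_state = 0;    // loop properties that currently hold
  unsigned num_loops = 1;      // assumed to include the root pseudo-loop
  unsigned init_calls = 0;     // how often loop setup had to run
  unsigned lcssa_rewrites = 0; // how often the LCSSA rewrite had to run
  unsigned lim_runs = 0;       // how often the actual transformation ran
};

// True if every flag in FLAGS is set for F.
bool state_satisfies (const toy_function &f, unsigned flags)
{
  return (f.loops_state & flags) == flags;
}

// Mirrors the check order of the patched pass_lim::execute.
unsigned toy_lim_execute (toy_function &f)
{
  const unsigned needed = LOOPS_NORMAL | LOOPS_HAVE_RECORDED_EXITS;

  // Outside the loop pass group nobody has set up loop info yet.
  if (!state_satisfies (f, needed))
    {
      f.loops_state |= needed;
      ++f.init_calls;
    }

  // Only the root pseudo-loop exists: nothing to hoist.
  if (f.num_loops <= 1)
    return 0;

  // The transformation expects loop-closed SSA form.
  if (!state_satisfies (f, LOOP_CLOSED_SSA))
    {
      f.loops_state |= LOOP_CLOSED_SSA;
      ++f.lcssa_rewrites;
    }

  ++f.lim_runs;
  return 1;
}

// Usage: running the pass twice on the same function, as the new pass
// group does, performs the setup only once.
bool toy_demo ()
{
  toy_function f;
  f.num_loops = 2;
  toy_lim_execute (f);
  toy_lim_execute (f);
  assert (f.init_calls == 1 && f.lcssa_rewrites == 1);
  return f.lim_runs == 2;
}
```

The point of the guards is that the same execute function then works both inside pass_tree_loop, where the state is already set up and both checks fall through, and in the new group, where the first pass_lim instance has to establish it.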