From patchwork Tue Nov 3 15:31:28 2020
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
X-Patchwork-Submitter: Frederik Harwath
X-Patchwork-Id: 1393141
From: Frederik Harwath
To: Thomas Schwinge, Richard Biener, Tobias Burnus
Cc: GCC Patches
Subject: Move pass_oacc_device_lower after pass_graphite
Date: Tue, 3 Nov 2020 16:31:28 +0100

Hi,
as a first step towards enabling the use of Graphite for optimizing
OpenACC loops, this patch moves the OpenACC device lowering after the
Graphite pass. This means that the device lowering now takes place
after some crucial optimization passes. Thus new instances of those
passes are added inside of a new pass pass_oacc_functions which ensures
that they run on OpenACC functions only. The choice of the new position
for pass_oacc_device_lower is further constrained by the need to
execute it before pass_vectorize. This means that
pass_oacc_device_lower now runs inside of pass_tree_loop.
A further instance of the pass that handles functions without loops is
added inside of pass_tree_no_loop. Yet another pass instance that
executes if optimizations are disabled is included inside of a new
pass_no_optimizations.

The patch has been bootstrapped on x86_64-linux-gnu and tested with the
GCC testsuite and with the libgomp testsuite with nvptx and gcn
offloading.

The patch should have no impact on non-OpenACC user code. However, the
new pass instances have changed the pass instance numbering and hence
the dump scanning commands in several tests had to be adjusted. I hope
that I found all that needed adjustment, but it is quite possible that
I missed some tests that execute only for particular targets or
non-default languages. The resulting UNRESOLVED tests are usually
easily fixed by appending a pass number to the name of a pass that
previously had no number (e.g. "cunrolli" becomes "cunrolli1") or by
incrementing the pass number (e.g. "dce6" becomes "dce7") in a dump
scanning command.

The patch leads to several new unresolved tests in the libgomp
testsuite which are caused by the combination of torture testing,
missing cleanup of the offload dump files, and the new pass numbering.
If a test that uses, for instance, "-foffload=fdump-tree-oaccdevlow"
gets compiled with "-O0" and afterwards with "-O2", each run of the
test executes different instances of pass_oacc_device_lower and
produces dumps whose names differ only in the pass instance number. The
dump scanning command in the second run fails, because the dump files
do not get removed after the first run and the command consequently
matches two different dump files. This seems to be a known issue. I am
going to submit a patch that implements the cleanup of the offload
dumps soon.

I have tried to rule out performance regressions by running different
benchmark suites with nvptx and gcn offloading.
Nevertheless, I think that it makes sense to keep an eye on OpenACC
performance in the near future and revisit the optimizations that run
on the device-lowered function if necessary.

Ok to include the patch in master?

Best regards,
Frederik

-----------------
Mentor Graphics (Deutschland) GmbH, Arnulfstraße 201, 80634 München / Germany
Registergericht München HRB 106955, Geschäftsführer: Thomas Heurung, Alexander Walter

From 93fb166876a0540416e19c9428316d1370dd1e1b Mon Sep 17 00:00:00 2001
From: Frederik Harwath
Date: Tue, 3 Nov 2020 12:58:37 +0100
Subject: [PATCH] Move pass_oacc_device_lower after pass_graphite

As a first step towards enabling the use of Graphite for optimizing
OpenACC loops, the OpenACC device lowering must be moved after the
Graphite pass. This means that the device lowering now takes place
after some crucial optimization passes. Thus new instances of those
passes are added inside of a new pass pass_oacc_functions which ensures
that they execute on OpenACC functions only. The choice of the new
position for pass_oacc_device_lower is further constrained by the need
to execute it before pass_vectorize. This means that
pass_oacc_device_lower now runs inside of pass_tree_loop. A further
instance of the pass that handles functions without loops is added
inside of pass_tree_no_loop. Yet another pass instance that executes if
optimizations are disabled is included inside of a new
pass_no_optimizations.

2020-11-03  Frederik Harwath
	    Thomas Schwinge

gcc/ChangeLog:

	* omp-general.c (oacc_get_fn_dim_size): Adapt.
	* omp-offload.c (pass_oacc_device_lower::clone): New method.
	* passes.c (class pass_no_optimizations): New pass.
	(make_pass_no_optimizations): New static function.
	* passes.def: Move pass_oacc_device_lower into pass_tree_loop
	and add further instances to pass_tree_no_loop and to new pass
	pass_no_optimizations.
	Add new instances of pass_lower_complex, pass_ccp, pass_sink_code,
	pass_complete_unrolli, pass_backprop, pass_phiprop, pass_forwprop,
	pass_vrp, pass_dce, pass_tree_loop_done, pass_tree_loop_init,
	pass_fix_loops supporting the pass_oacc_device_lower instance in
	pass_tree_loop.
	* tree-pass.h (make_pass_oacc_functions): New static function.
	* tree-ssa-loop-ivcanon.c (pass_complete_unroll::clone): New method.
	(pass_complete_unrolli::clone): New method.
	* tree-ssa-loop.c (pass_fix_loops::clone): New method.
	(pass_tree_loop_init::clone): New method.
	(pass_tree_loop_done::clone): New method.
	* tree-ssa-phiprop.c (pass_phiprop::clone): New method.
	* tree-ssa-sink.c (pass_sink_code::clone): New method.

libgomp/ChangeLog:

	* testsuite/libgomp.oacc-c-c++-common/vector-length-128-1.c: Adapt to
	changed pass instance numbering.
	* testsuite/libgomp.oacc-c-c++-common/vector-length-128-2.c: Likewise.
	* testsuite/libgomp.oacc-c-c++-common/vector-length-128-3.c: Likewise.
	* testsuite/libgomp.oacc-c-c++-common/vector-length-128-4.c: Likewise.
	* testsuite/libgomp.oacc-c-c++-common/vector-length-128-5.c: Likewise.
	* testsuite/libgomp.oacc-c-c++-common/vector-length-128-6.c: Likewise.
	* testsuite/libgomp.oacc-c-c++-common/vector-length-128-7.c: Likewise.
	* testsuite/libgomp.oacc-c-c++-common/pr84955-1.c: Likewise.
	* testsuite/libgomp.oacc-c-c++-common/pr85486-2.c: Likewise.
	* testsuite/libgomp.oacc-c-c++-common/pr85486-3.c: Likewise.
	* testsuite/libgomp.oacc-c-c++-common/pr85486.c: Likewise.

gcc/testsuite/ChangeLog:

	* c-c++-common/goacc/classify-kernels-unparallelized.c: Adapt to
	changed pass instance numbering.
	* c-c++-common/goacc/classify-kernels.c: Likewise.
	* c-c++-common/goacc/classify-parallel.c: Likewise.
	* c-c++-common/goacc/classify-routine.c: Likewise.
	* c-c++-common/unroll-1.c: Likewise.
	* c-c++-common/unroll-4.c: Likewise.
	* g++.dg/ext/unroll-1.C: Likewise.
	* g++.dg/ext/unroll-2.C: Likewise.
	* g++.dg/ext/unroll-3.C: Likewise.
	* g++.dg/tree-ssa/pr49911.C: Likewise.
	* g++.dg/vect/pr36648.cc: Likewise.
	* gcc.dg/goacc/loop-processing-1.c: Likewise.
	* gcc.dg/graphite/fuse-1.c: Likewise.
	* gcc.dg/tree-ssa/backprop-1.c: Likewise.
	* gcc.dg/tree-ssa/backprop-2.c: Likewise.
	* gcc.dg/tree-ssa/backprop-3.c: Likewise.
	* gcc.dg/tree-ssa/backprop-4.c: Likewise.
	* gcc.dg/tree-ssa/backprop-5.c: Likewise.
	* gcc.dg/tree-ssa/backprop-6.c: Likewise.
	* gcc.dg/tree-ssa/cunroll-1.c: Likewise.
	* gcc.dg/tree-ssa/cunroll-3.c: Likewise.
	* gcc.dg/tree-ssa/cunroll-9.c: Likewise.
	* gcc.dg/tree-ssa/ldist-17.c: Likewise.
	* gcc.dg/tree-ssa/loop-38.c: Likewise.
	* gcc.dg/tree-ssa/pr21463.c: Likewise.
	* gcc.dg/tree-ssa/pr45427.c: Likewise.
	* gcc.dg/tree-ssa/pr61743-1.c: Likewise.
	* gcc.dg/tree-ssa/pr68234.c: Likewise.
	* gcc.dg/tree-ssa/pr70232.c: Likewise.
	* gcc.dg/tree-ssa/ssa-dom-thread-7.c: Likewise.
	* gcc.dg/tree-ssa/ssa-sink-1.c: Likewise.
	* gcc.dg/tree-ssa/ssa-sink-10.c: Likewise.
	* gcc.dg/tree-ssa/ssa-sink-13.c: Likewise.
	* gcc.dg/tree-ssa/ssa-sink-14.c: Likewise.
	* gcc.dg/tree-ssa/ssa-sink-16.c: Likewise.
	* gcc.dg/tree-ssa/ssa-sink-17.c: Likewise.
	* gcc.dg/tree-ssa/ssa-sink-2.c: Likewise.
	* gcc.dg/tree-ssa/ssa-sink-3.c: Likewise.
	* gcc.dg/tree-ssa/ssa-sink-4.c: Likewise.
	* gcc.dg/tree-ssa/ssa-sink-5.c: Likewise.
	* gcc.dg/tree-ssa/ssa-sink-6.c: Likewise.
	* gcc.dg/tree-ssa/ssa-sink-7.c: Likewise.
	* gcc.dg/tree-ssa/ssa-sink-8.c: Likewise.
	* gcc.dg/tree-ssa/ssa-sink-9.c: Likewise.
	* gcc.dg/tree-ssa/ssa-thread-11.c: Likewise.
	* gcc.dg/tree-ssa/vrp47.c: Likewise.
	* gcc.dg/tree-ssa/vrp91.c: Likewise.
	* gcc.dg/unroll-2.c: Likewise.
	* gcc.dg/unroll-3.c: Likewise.
	* gcc.dg/unroll-4.c: Likewise.
	* gcc.dg/unroll-5.c: Likewise.
	* gcc.dg/vect/bb-slp-59.c: Likewise.
	* gcc.dg/vect/pr26359.c: Likewise.
	* gcc.dg/vect/vect-profile-1.c: Likewise.
	* gcc.dg/vrp-min-max-2.c: Likewise.
	* gcc.dg/wrapped-binop-simplify.c: Likewise.
	* gfortran.dg/directive_unroll_1.f90: Likewise.
	* gfortran.dg/directive_unroll_4.f90: Likewise.
	* gfortran.dg/goacc/classify-kernels-unparallelized.f95: Likewise.
	* gfortran.dg/goacc/classify-kernels.f95: Likewise.
	* gfortran.dg/goacc/classify-parallel.f95: Likewise.
	* gfortran.dg/goacc/classify-routine.f95: Likewise.
	* gnat.dg/unroll1.adb: Likewise.
	* gnat.dg/unroll2.adb: Likewise.
	* c-c++-common/goacc/device-lowering-no-optimizations.c: New test.
	* c-c++-common/goacc/device-lowering-with-optimizations.c: New test.
---
 gcc/omp-general.c | 8 ++-
 gcc/omp-offload.c | 1 +
 gcc/passes.c | 35 ++++++++++++
 gcc/passes.def | 29 +++++++++-
 .../goacc/classify-kernels-unparallelized.c | 6 +--
 .../c-c++-common/goacc/classify-kernels.c | 6 +--
 .../c-c++-common/goacc/classify-parallel.c | 6 +--
 .../c-c++-common/goacc/classify-routine.c | 6 +--
 .../goacc/device-lowering-no-optimizations.c | 25 +++++++++
 .../device-lowering-with-optimizations.c | 30 +++++++++++
 gcc/testsuite/c-c++-common/unroll-1.c | 8 +--
 gcc/testsuite/c-c++-common/unroll-4.c | 4 +-
 gcc/testsuite/g++.dg/ext/unroll-1.C | 2 +-
 gcc/testsuite/g++.dg/ext/unroll-2.C | 2 +-
 gcc/testsuite/g++.dg/ext/unroll-3.C | 2 +-
 gcc/testsuite/g++.dg/tree-ssa/pr49911.C | 4 +-
 gcc/testsuite/g++.dg/vect/pr36648.cc | 2 +-
 .../gcc.dg/goacc/loop-processing-1.c | 3 +-
 gcc/testsuite/gcc.dg/graphite/fuse-1.c | 2 +-
 gcc/testsuite/gcc.dg/tree-ssa/backprop-1.c | 6 +--
 gcc/testsuite/gcc.dg/tree-ssa/backprop-2.c | 4 +-
 gcc/testsuite/gcc.dg/tree-ssa/backprop-3.c | 4 +-
 gcc/testsuite/gcc.dg/tree-ssa/backprop-4.c | 6 +--
 gcc/testsuite/gcc.dg/tree-ssa/backprop-5.c | 4 +-
 gcc/testsuite/gcc.dg/tree-ssa/backprop-6.c | 6 +--
 gcc/testsuite/gcc.dg/tree-ssa/cunroll-1.c | 6 +--
 gcc/testsuite/gcc.dg/tree-ssa/cunroll-3.c | 4 +-
 gcc/testsuite/gcc.dg/tree-ssa/cunroll-9.c | 4 +-
 gcc/testsuite/gcc.dg/tree-ssa/ldist-17.c | 2 +-
 gcc/testsuite/gcc.dg/tree-ssa/loop-38.c | 4 +-
 gcc/testsuite/gcc.dg/tree-ssa/pr21463.c | 4 +-
 gcc/testsuite/gcc.dg/tree-ssa/pr45427.c | 4 +-
 gcc/testsuite/gcc.dg/tree-ssa/pr61743-1.c | 2 +-
 gcc/testsuite/gcc.dg/tree-ssa/pr68234.c | 4 +-
 gcc/testsuite/gcc.dg/tree-ssa/pr70232.c | 4 +-
 .../gcc.dg/tree-ssa/ssa-dom-thread-7.c | 4 +-
 gcc/testsuite/gcc.dg/tree-ssa/ssa-sink-1.c | 2 +-
 gcc/testsuite/gcc.dg/tree-ssa/ssa-sink-10.c | 2 +-
 gcc/testsuite/gcc.dg/tree-ssa/ssa-sink-13.c | 4 +-
 gcc/testsuite/gcc.dg/tree-ssa/ssa-sink-14.c | 4 +-
 gcc/testsuite/gcc.dg/tree-ssa/ssa-sink-16.c | 2 +-
 gcc/testsuite/gcc.dg/tree-ssa/ssa-sink-17.c | 2 +-
 gcc/testsuite/gcc.dg/tree-ssa/ssa-sink-2.c | 2 +-
 gcc/testsuite/gcc.dg/tree-ssa/ssa-sink-3.c | 2 +-
 gcc/testsuite/gcc.dg/tree-ssa/ssa-sink-4.c | 2 +-
 gcc/testsuite/gcc.dg/tree-ssa/ssa-sink-5.c | 2 +-
 gcc/testsuite/gcc.dg/tree-ssa/ssa-sink-6.c | 2 +-
 gcc/testsuite/gcc.dg/tree-ssa/ssa-sink-7.c | 2 +-
 gcc/testsuite/gcc.dg/tree-ssa/ssa-sink-8.c | 2 +-
 gcc/testsuite/gcc.dg/tree-ssa/ssa-sink-9.c | 2 +-
 gcc/testsuite/gcc.dg/tree-ssa/ssa-thread-11.c | 4 +-
 gcc/testsuite/gcc.dg/tree-ssa/vrp47.c | 4 +-
 gcc/testsuite/gcc.dg/tree-ssa/vrp91.c | 4 +-
 gcc/testsuite/gcc.dg/unroll-2.c | 2 +-
 gcc/testsuite/gcc.dg/unroll-3.c | 4 +-
 gcc/testsuite/gcc.dg/unroll-4.c | 4 +-
 gcc/testsuite/gcc.dg/unroll-5.c | 4 +-
 gcc/testsuite/gcc.dg/vect/bb-slp-59.c | 2 +-
 gcc/testsuite/gcc.dg/vect/pr26359.c | 4 +-
 gcc/testsuite/gcc.dg/vect/vect-profile-1.c | 2 +-
 gcc/testsuite/gcc.dg/vrp-min-max-2.c | 6 +--
 gcc/testsuite/gcc.dg/wrapped-binop-simplify.c | 4 +-
 .../gfortran.dg/directive_unroll_1.f90 | 2 +-
 .../gfortran.dg/directive_unroll_4.f90 | 2 +-
 .../goacc/classify-kernels-unparallelized.f95 | 6 +--
 .../gfortran.dg/goacc/classify-kernels.f95 | 6 +--
 .../gfortran.dg/goacc/classify-parallel.f95 | 6 +--
 .../gfortran.dg/goacc/classify-routine.f95 | 6 +--
 gcc/testsuite/gnat.dg/unroll1.adb | 2 +-
 gcc/testsuite/gnat.dg/unroll2.adb | 2 +-
 gcc/tree-pass.h | 1 +
 gcc/tree-ssa-loop-ivcanon.c | 2 +
 gcc/tree-ssa-loop.c | 54 +++++++++++++++++++
 gcc/tree-ssa-phiprop.c | 2 +
 gcc/tree-ssa-sink.c | 2 +
 .../libgomp.oacc-c-c++-common/pr84955-1.c | 4 +-
 .../libgomp.oacc-c-c++-common/pr85486-2.c | 2 +-
.../libgomp.oacc-c-c++-common/pr85486-3.c | 2 +- .../libgomp.oacc-c-c++-common/pr85486.c | 2 +- .../vector-length-128-1.c | 2 +- .../vector-length-128-2.c | 2 +- .../vector-length-128-3.c | 2 +- .../vector-length-128-4.c | 2 +- .../vector-length-128-5.c | 2 +- .../vector-length-128-6.c | 2 +- .../vector-length-128-7.c | 2 +- 86 files changed, 316 insertions(+), 130 deletions(-) create mode 100644 gcc/testsuite/c-c++-common/goacc/device-lowering-no-optimizations.c create mode 100644 gcc/testsuite/c-c++-common/goacc/device-lowering-with-optimizations.c diff --git a/gcc/omp-general.c b/gcc/omp-general.c index b66dfb58257..7b848d9b20c 100644 --- a/gcc/omp-general.c +++ b/gcc/omp-general.c @@ -2778,7 +2778,13 @@ oacc_get_fn_dim_size (tree fn, int axis) while (axis--) dims = TREE_CHAIN (dims); - int size = TREE_INT_CST_LOW (TREE_VALUE (dims)); + tree v = TREE_VALUE (dims); + /* TODO With 'pass_oacc_device_lower' moved "later", this is necessary to + avoid ICE for some OpenACC 'kernels' ("parloops") constructs. 
*/ + if (v == NULL_TREE) + return 0; + + int size = TREE_INT_CST_LOW (v); return size; } diff --git a/gcc/omp-offload.c b/gcc/omp-offload.c index 4490701147c..8ff4675153c 100644 --- a/gcc/omp-offload.c +++ b/gcc/omp-offload.c @@ -2027,6 +2027,7 @@ public: { return execute_oacc_device_lower (); } + opt_pass * clone () { return new pass_oacc_device_lower (m_ctxt); } }; // class pass_oacc_device_lower diff --git a/gcc/passes.c b/gcc/passes.c index f71f63918f4..51fa39e94e8 100644 --- a/gcc/passes.c +++ b/gcc/passes.c @@ -620,6 +620,41 @@ make_pass_all_optimizations_g (gcc::context *ctxt) namespace { +const pass_data pass_data_no_optimizations = +{ + GIMPLE_PASS, /* type */ + "*no_optimizations", /* name */ + OPTGROUP_NONE, /* optinfo_flags */ + TV_OPTIMIZE, /* tv_id */ + 0, /* properties_required */ + 0, /* properties_provided */ + 0, /* properties_destroyed */ + 0, /* todo_flags_start */ + 0, /* todo_flags_finish */ +}; + +class pass_no_optimizations : public gimple_opt_pass +{ +public: + pass_no_optimizations (gcc::context *ctxt) + : gimple_opt_pass (pass_data_no_optimizations, ctxt) + {} + + /* opt_pass methods: */ + virtual bool gate (function *) { return !optimize; } + +}; // class pass_no_optimizations + +} // anon namespace + +static gimple_opt_pass * +make_pass_no_optimizations (gcc::context *ctxt) +{ + return new pass_no_optimizations (ctxt); +} + +namespace { + const pass_data pass_data_rest_of_compilation = { RTL_PASS, /* type */ diff --git a/gcc/passes.def b/gcc/passes.def index c68231287b6..58f9be8f957 100644 --- a/gcc/passes.def +++ b/gcc/passes.def @@ -181,7 +181,6 @@ along with GCC; see the file COPYING3. If not see INSERT_PASSES_AFTER (all_passes) NEXT_PASS (pass_fixup_cfg); NEXT_PASS (pass_lower_eh_dispatch); - NEXT_PASS (pass_oacc_device_lower); NEXT_PASS (pass_omp_device_lower); NEXT_PASS (pass_omp_target_link); NEXT_PASS (pass_adjust_alignment); @@ -284,6 +283,29 @@ along with GCC; see the file COPYING3. 
If not see POP_INSERT_PASSES () NEXT_PASS (pass_parallelize_loops, false /* oacc_kernels_p */); NEXT_PASS (pass_expand_omp_ssa); + /* Interrupt pass_tree_loop for OpenACC device lowering. */ + NEXT_PASS (pass_oacc_functions); + PUSH_INSERT_PASSES_WITHIN (pass_oacc_functions) + NEXT_PASS (pass_tree_loop_done); + NEXT_PASS (pass_oacc_device_lower); + /* Passes that must run after OpenACC device lowering. */ + /* Lower complex number instructions arising from reductions. */ + NEXT_PASS (pass_lower_complex); + /* Those optimizations are generally beneficial, but they are + particularly important to help the vectorizer which is crucial + for AMD GCN offloading. */ + NEXT_PASS (pass_ccp, true /* nonzero_p */); + NEXT_PASS (pass_sink_code); + NEXT_PASS (pass_complete_unrolli); + NEXT_PASS (pass_backprop); + NEXT_PASS (pass_phiprop); + NEXT_PASS (pass_forwprop); + NEXT_PASS (pass_vrp, false /* warn_array_bounds_p */); + NEXT_PASS (pass_dce); + NEXT_PASS (pass_fix_loops); + /* Continue pass_tree_loop after OpenACC device lowering. */ + NEXT_PASS (pass_tree_loop_init); + POP_INSERT_PASSES () NEXT_PASS (pass_ch_vect); NEXT_PASS (pass_if_conversion); /* pass_vectorize must immediately follow pass_if_conversion. @@ -312,6 +334,7 @@ along with GCC; see the file COPYING3. If not see NEXT_PASS (pass_tree_no_loop); PUSH_INSERT_PASSES_WITHIN (pass_tree_no_loop) NEXT_PASS (pass_slp_vectorize); + NEXT_PASS (pass_oacc_device_lower); POP_INSERT_PASSES () NEXT_PASS (pass_simduid_cleanup); NEXT_PASS (pass_lower_vector_ssa); @@ -387,6 +410,10 @@ along with GCC; see the file COPYING3. 
If not see NEXT_PASS (pass_local_pure_const); NEXT_PASS (pass_modref); POP_INSERT_PASSES () + NEXT_PASS (pass_no_optimizations); + PUSH_INSERT_PASSES_WITHIN (pass_no_optimizations) + NEXT_PASS (pass_oacc_device_lower); + POP_INSERT_PASSES () NEXT_PASS (pass_tm_init); PUSH_INSERT_PASSES_WITHIN (pass_tm_init) NEXT_PASS (pass_tm_mark); diff --git a/gcc/testsuite/c-c++-common/goacc/classify-kernels-unparallelized.c b/gcc/testsuite/c-c++-common/goacc/classify-kernels-unparallelized.c index d4c4b2ca237..df733954847 100644 --- a/gcc/testsuite/c-c++-common/goacc/classify-kernels-unparallelized.c +++ b/gcc/testsuite/c-c++-common/goacc/classify-kernels-unparallelized.c @@ -35,6 +35,6 @@ void KERNELS () /* Check the offloaded function's classification and compute dimensions (will always be 1 x 1 x 1 for non-offloading compilation). - { dg-final { scan-tree-dump-times "(?n)Function is unparallelized OpenACC kernels offload" 1 "oaccdevlow" } } - { dg-final { scan-tree-dump-times "(?n)Compute dimensions \\\[1, 1, 1\\\]" 1 "oaccdevlow" } } - { dg-final { scan-tree-dump-times "(?n)__attribute__\\(\\(oacc function \\(1, 1, 1\\), oacc kernels, omp target entrypoint\\)\\)" 1 "oaccdevlow" } } */ + { dg-final { scan-tree-dump-times "(?n)Function is unparallelized OpenACC kernels offload" 1 "oaccdevlow1" } } + { dg-final { scan-tree-dump-times "(?n)Compute dimensions \\\[1, 1, 1\\\]" 1 "oaccdevlow1" } } + { dg-final { scan-tree-dump-times "(?n)__attribute__\\(\\(oacc function \\(1, 1, 1\\), oacc kernels, omp target entrypoint\\)\\)" 1 "oaccdevlow1" } } */ diff --git a/gcc/testsuite/c-c++-common/goacc/classify-kernels.c b/gcc/testsuite/c-c++-common/goacc/classify-kernels.c index 16e9b9e31d1..649ef317e93 100644 --- a/gcc/testsuite/c-c++-common/goacc/classify-kernels.c +++ b/gcc/testsuite/c-c++-common/goacc/classify-kernels.c @@ -31,6 +31,6 @@ void KERNELS () /* Check the offloaded function's classification and compute dimensions (will always be 1 x 1 x 1 for non-offloading compilation). 
- { dg-final { scan-tree-dump-times "(?n)Function is parallelized OpenACC kernels offload" 1 "oaccdevlow" } } - { dg-final { scan-tree-dump-times "(?n)Compute dimensions \\\[1, 1, 1\\\]" 1 "oaccdevlow" } } - { dg-final { scan-tree-dump-times "(?n)__attribute__\\(\\(oacc function \\(1, 1, 1\\), oacc kernels parallelized, oacc function \\(, , \\), oacc kernels, omp target entrypoint\\)\\)" 1 "oaccdevlow" } } */ + { dg-final { scan-tree-dump-times "(?n)Function is parallelized OpenACC kernels offload" 1 "oaccdevlow1" } } + { dg-final { scan-tree-dump-times "(?n)Compute dimensions \\\[1, 1, 1\\\]" 1 "oaccdevlow1" } } + { dg-final { scan-tree-dump-times "(?n)__attribute__\\(\\(oacc function \\(1, 1, 1\\), oacc kernels parallelized, oacc function \\(, , \\), oacc kernels, omp target entrypoint\\)\\)" 1 "oaccdevlow1" } } */ diff --git a/gcc/testsuite/c-c++-common/goacc/classify-parallel.c b/gcc/testsuite/c-c++-common/goacc/classify-parallel.c index 66a6d133663..3dc528fa099 100644 --- a/gcc/testsuite/c-c++-common/goacc/classify-parallel.c +++ b/gcc/testsuite/c-c++-common/goacc/classify-parallel.c @@ -24,6 +24,6 @@ void PARALLEL () /* Check the offloaded function's classification and compute dimensions (will always be 1 x 1 x 1 for non-offloading compilation). 
- { dg-final { scan-tree-dump-times "(?n)Function is OpenACC parallel offload" 1 "oaccdevlow" } } - { dg-final { scan-tree-dump-times "(?n)Compute dimensions \\\[1, 1, 1\\\]" 1 "oaccdevlow" } } - { dg-final { scan-tree-dump-times "(?n)__attribute__\\(\\(oacc function \\(1, 1, 1\\), omp target entrypoint\\)\\)" 1 "oaccdevlow" } } */ + { dg-final { scan-tree-dump-times "(?n)Function is OpenACC parallel offload" 1 "oaccdevlow1" } } + { dg-final { scan-tree-dump-times "(?n)Compute dimensions \\\[1, 1, 1\\\]" 1 "oaccdevlow1" } } + { dg-final { scan-tree-dump-times "(?n)__attribute__\\(\\(oacc function \\(1, 1, 1\\), omp target entrypoint\\)\\)" 1 "oaccdevlow1" } } */ diff --git a/gcc/testsuite/c-c++-common/goacc/classify-routine.c b/gcc/testsuite/c-c++-common/goacc/classify-routine.c index 0b9ba6ea69f..6509103c52e 100644 --- a/gcc/testsuite/c-c++-common/goacc/classify-routine.c +++ b/gcc/testsuite/c-c++-common/goacc/classify-routine.c @@ -26,6 +26,6 @@ void ROUTINE () /* Check the offloaded function's classification and compute dimensions (will always be 1 x 1 x 1 for non-offloading compilation). 
- { dg-final { scan-tree-dump-times "(?n)Function is OpenACC routine level 1" 1 "oaccdevlow" } } - { dg-final { scan-tree-dump-times "(?n)Compute dimensions \\\[1, 1, 1\\\]" 1 "oaccdevlow" } } - { dg-final { scan-tree-dump-times "(?n)__attribute__\\(\\(oacc function \\(0 1, 1 1, 1 1\\), omp declare target \\(worker\\), oacc function \\(0 1, 1 0, 1 0\\)\\)\\)" 1 "oaccdevlow" } } */ + { dg-final { scan-tree-dump-times "(?n)Function is OpenACC routine level 1" 1 "oaccdevlow1" } } + { dg-final { scan-tree-dump-times "(?n)Compute dimensions \\\[1, 1, 1\\\]" 1 "oaccdevlow1" } } + { dg-final { scan-tree-dump-times "(?n)__attribute__\\(\\(oacc function \\(0 1, 1 1, 1 1\\), omp declare target \\(worker\\), oacc function \\(0 1, 1 0, 1 0\\)\\)\\)" 1 "oaccdevlow1" } } */ diff --git a/gcc/testsuite/c-c++-common/goacc/device-lowering-no-optimizations.c b/gcc/testsuite/c-c++-common/goacc/device-lowering-no-optimizations.c new file mode 100644 index 00000000000..ce90891e342 --- /dev/null +++ b/gcc/testsuite/c-c++-common/goacc/device-lowering-no-optimizations.c @@ -0,0 +1,25 @@ +/* Check that the instance of the OpenACC device lowering pass that is + supposed to run if optimizations are disabled does get executed. 
*/ + +/* { dg-additional-options "-fdump-tree-oaccdevlow" } */ +/* { dg-additional-options "-O0" } */ + +#pragma acc routine +int test (int x) +{ + return x * x; +} + +int test2 (int x) +{ +#pragma acc parallel + { + for (int i = 1; i < 1000; ++i) + x += x; + } + + return x; +} + +/* { dg-final { scan-tree-dump-times "Function is OpenACC routine" 1 "oaccdevlow3" } } */ +/* { dg-final { scan-tree-dump-times "Function is OpenACC parallel" 1 "oaccdevlow3" } } */ diff --git a/gcc/testsuite/c-c++-common/goacc/device-lowering-with-optimizations.c b/gcc/testsuite/c-c++-common/goacc/device-lowering-with-optimizations.c new file mode 100644 index 00000000000..9b7cb625b35 --- /dev/null +++ b/gcc/testsuite/c-c++-common/goacc/device-lowering-with-optimizations.c @@ -0,0 +1,30 @@ +/* Check that the different instances of the OpenACC device lowering + pass get executed on the types of functions they are supposed to + handle if optimizations are enabled. */ + +/* { dg-additional-options "-fdump-tree-oaccdevlow" } */ +/* { dg-additional-options "-O" } */ + +#pragma acc routine +int test (int x) +{ + return x * x; +} + +int test2 (int x) +{ +#pragma acc parallel + { + for (int i = 1; i < 1000; ++i) + x += x; + } + + return x; +} + + +/* { dg-final { scan-tree-dump-times "Function is OpenACC routine" 1 "oaccdevlow2" } } + The acc routine should be handled by the pass instance for functions without loops. */ +/* { dg-final { scan-tree-dump-times "Function is OpenACC parallel" 1 "oaccdevlow1" } } + The function with the parallel region should be handled by the pass instance + for functions with loops. 
*/ diff --git a/gcc/testsuite/c-c++-common/unroll-1.c b/gcc/testsuite/c-c++-common/unroll-1.c index fe7f4f31912..8e57a44be23 100644 --- a/gcc/testsuite/c-c++-common/unroll-1.c +++ b/gcc/testsuite/c-c++-common/unroll-1.c @@ -1,5 +1,5 @@ -/* { dg-do compile } */ -/* { dg-options "-O2 -fdump-tree-cunrolli-details -fdump-rtl-loop2_unroll-details" } */ +/* { dg-do compile } * +/* { dg-options "-O2 -fdump-tree-cunrolli1-details -fdump-rtl-loop2_unroll-details" } */ extern void bar (int); @@ -10,12 +10,12 @@ void test (void) #pragma GCC unroll 8 for (unsigned long i = 1; i <= 8; ++i) bar(i); - /* { dg-final { scan-tree-dump "11:.*: loop with 8 iterations completely unrolled" "cunrolli" } } */ + /* { dg-final { scan-tree-dump "11:.*: loop with 8 iterations completely unrolled" "cunrolli1" } } */ #pragma GCC unroll 8 for (unsigned long i = 1; i <= 7; ++i) bar(i); - /* { dg-final { scan-tree-dump "16:.*: loop with 7 iterations completely unrolled" "cunrolli" } } */ + /* { dg-final { scan-tree-dump "16:.*: loop with 7 iterations completely unrolled" "cunrolli1" } } */ #pragma GCC unroll 8 for (unsigned long i = 1; i <= 15; ++i) diff --git a/gcc/testsuite/c-c++-common/unroll-4.c b/gcc/testsuite/c-c++-common/unroll-4.c index 1c1988174ba..fe7f9e10626 100644 --- a/gcc/testsuite/c-c++-common/unroll-4.c +++ b/gcc/testsuite/c-c++-common/unroll-4.c @@ -1,5 +1,5 @@ /* { dg-do compile } */ -/* { dg-options "-O2 -funroll-all-loops -fdump-rtl-loop2_unroll-details -fdump-tree-cunrolli-details" } */ +/* { dg-options "-O2 -funroll-all-loops -fdump-rtl-loop2_unroll-details -fdump-tree-cunrolli1-details" } */ extern void bar (int); @@ -17,6 +17,6 @@ void test (void) for (unsigned long i = 1; i <= j; ++i) bar(i); - /* { dg-final { scan-tree-dump "Not unrolling loop .: user didn't want it unrolled completely" "cunrolli" } } */ + /* { dg-final { scan-tree-dump "Not unrolling loop .: user didn't want it unrolled completely" "cunrolli1" } } */ /* { dg-final { scan-rtl-dump-times "Not unrolling 
loop, user didn't want it unrolled" 2 "loop2_unroll" } } */
 }
diff --git a/gcc/testsuite/g++.dg/ext/unroll-1.C b/gcc/testsuite/g++.dg/ext/unroll-1.C
index aa11b2e6ef7..0e087dfd251 100644
--- a/gcc/testsuite/g++.dg/ext/unroll-1.C
+++ b/gcc/testsuite/g++.dg/ext/unroll-1.C
@@ -16,4 +16,4 @@ bar (int *a, int *b, int *c)
   foo (a, b, c);
 }
 
-// { dg-final { scan-tree-dump "loop with 8 iterations completely unrolled" "cunrolli" } }
+// { dg-final { scan-tree-dump "loop with 8 iterations completely unrolled" "cunrolli1" } }
diff --git a/gcc/testsuite/g++.dg/ext/unroll-2.C b/gcc/testsuite/g++.dg/ext/unroll-2.C
index f9ec892dbdd..4feb23bf565 100644
--- a/gcc/testsuite/g++.dg/ext/unroll-2.C
+++ b/gcc/testsuite/g++.dg/ext/unroll-2.C
@@ -10,4 +10,4 @@ foo (int (&a)[8], int *b, int *c)
     a[i] = b[i] * c[i];
 }
 
-// { dg-final { scan-tree-dump "loop with 8 iterations completely unrolled" "cunrolli" } }
+// { dg-final { scan-tree-dump "loop with 8 iterations completely unrolled" "cunrolli1" } }
diff --git a/gcc/testsuite/g++.dg/ext/unroll-3.C b/gcc/testsuite/g++.dg/ext/unroll-3.C
index dda94c56af2..3b772fa45c8 100644
--- a/gcc/testsuite/g++.dg/ext/unroll-3.C
+++ b/gcc/testsuite/g++.dg/ext/unroll-3.C
@@ -17,4 +17,4 @@ bar (int (&a)[8], int *b, int *c)
   foo (a, b, c);
 }
 
-// { dg-final { scan-tree-dump "loop with 8 iterations completely unrolled" "cunrolli" } }
+// { dg-final { scan-tree-dump "loop with 8 iterations completely unrolled" "cunrolli1" } }
diff --git a/gcc/testsuite/g++.dg/tree-ssa/pr49911.C b/gcc/testsuite/g++.dg/tree-ssa/pr49911.C
index e31a3f4b1d9..5df6b6f9291 100644
--- a/gcc/testsuite/g++.dg/tree-ssa/pr49911.C
+++ b/gcc/testsuite/g++.dg/tree-ssa/pr49911.C
@@ -1,5 +1,5 @@
 /* { dg-do compile } */
-/* { dg-options "-O2 -fstrict-enums -fno-rtti -fno-exceptions -fno-strict-aliasing -fdump-tree-vrp2" } */
+/* { dg-options "-O2 -fstrict-enums -fno-rtti -fno-exceptions -fno-strict-aliasing -fdump-tree-vrp3" } */
 
 extern void JS_Assert();
 
@@ -37,4 +37,4 @@ void jsop_setelem(bool y, int z) {
   x = frame.dataRematInfo2(y, z);
 }
 
-/* { dg-final { scan-tree-dump-times "Folding predicate.*45" 0 "vrp2"} } */
+/* { dg-final { scan-tree-dump-times "Folding predicate.*45" 0 "vrp3"} } */
diff --git a/gcc/testsuite/g++.dg/vect/pr36648.cc b/gcc/testsuite/g++.dg/vect/pr36648.cc
index 8d24d3d445d..8990041e4fa 100644
--- a/gcc/testsuite/g++.dg/vect/pr36648.cc
+++ b/gcc/testsuite/g++.dg/vect/pr36648.cc
@@ -1,6 +1,6 @@
 /* { dg-do compile } */
 /* { dg-require-effective-target vect_float } */
-/* { dg-additional-options "-fdisable-tree-cunrolli" } */
+/* { dg-additional-options "-fdisable-tree-cunrolli1" } */
 
 struct vector
 {
diff --git a/gcc/testsuite/gcc.dg/goacc/loop-processing-1.c b/gcc/testsuite/gcc.dg/goacc/loop-processing-1.c
index bd4c07e7d81..4dc33241b78 100644
--- a/gcc/testsuite/gcc.dg/goacc/loop-processing-1.c
+++ b/gcc/testsuite/gcc.dg/goacc/loop-processing-1.c
@@ -15,4 +15,5 @@ void vector_1 (int *ary, int size)
   }
 }
 
-/* { dg-final { scan-tree-dump {OpenACC loops.*Loop 0\(0\).*Loop 24\(1\).*\.data_dep\.[0-9_]+ = \.UNIQUE \(OACC_HEAD_MARK, 0, 1, 36\);.*Head-0:.*\.data_dep\.[0-9_]+ = \.UNIQUE \(OACC_HEAD_MARK, 0, 1, 36\);.*\.data_dep\.[0-9_]+ = \.UNIQUE \(OACC_FORK, \.data_dep\.[0-9_]+, 0\);.*Tail-0:.*\.data_dep\.[0-9_]+ = \.UNIQUE \(OACC_TAIL_MARK, \.data_dep\.[0-9_]+, 1\);.*\.data_dep\.[0-9_]+ = \.UNIQUE \(OACC_JOIN, \.data_dep\.[0-9_]+, 0\);.*Loop 6\(6\).*\.data_dep\.[0-9_]+ = \.UNIQUE \(OACC_HEAD_MARK, 0, 2, 6\);.*Head-0:.*\.data_dep\.[0-9_]+ = \.UNIQUE \(OACC_HEAD_MARK, 0, 2, 6\);.*\.data_dep\.[0-9_]+ = \.UNIQUE \(OACC_FORK, \.data_dep\.[0-9_]+, 1\);.*Head-1:.*\.data_dep\.[0-9_]+ = \.UNIQUE \(OACC_HEAD_MARK, \.data_dep\.[0-9_]+, 1\);.*\.data_dep\.[0-9_]+ = \.UNIQUE \(OACC_FORK, \.data_dep\.[0-9_]+, 2\);.*Tail-1:.*\.data_dep\.[0-9_]+ = \.UNIQUE \(OACC_TAIL_MARK, \.data_dep\.[0-9_]+, 2\);.*\.data_dep\.[0-9_]+ = \.UNIQUE \(OACC_JOIN, \.data_dep\.[0-9_]+, 2\);.*Tail-0:.*\.data_dep\.[0-9_]+ = \.UNIQUE \(OACC_TAIL_MARK, \.data_dep\.[0-9_]+, 1\);.*\.data_dep\.[0-9_]+ = \.UNIQUE \(OACC_JOIN, \.data_dep\.[0-9_]+, 1\);} "oaccdevlow" } } */
+/* { dg-final { scan-tree-dump {
+OpenACC loops.*Loop 0\(0\).*Loop [0-9]{2}\(1\).*\.data_dep\.[0-9_]+ = \.UNIQUE \(OACC_HEAD_MARK, 0, 1, 36\);.*Head-0:.*\.data_dep\.[0-9_]+ = \.UNIQUE \(OACC_HEAD_MARK, 0, 1, 36\);.*\.data_dep\.[0-9_]+ = \.UNIQUE \(OACC_FORK, \.data_dep\.[0-9_]+, 0\);.*Tail-0:.*\.data_dep\.[0-9_]+ = \.UNIQUE \(OACC_TAIL_MARK, \.data_dep\.[0-9_]+, 1\);.*\.data_dep\.[0-9_]+ = \.UNIQUE \(OACC_JOIN, \.data_dep\.[0-9_]+, 0\);.*Loop 6\(6\).*\.data_dep\.[0-9_]+ = \.UNIQUE \(OACC_HEAD_MARK, 0, 2, 6\);.*Head-0:.*\.data_dep\.[0-9_]+ = \.UNIQUE \(OACC_HEAD_MARK, 0, 2, 6\);.*\.data_dep\.[0-9_]+ = \.UNIQUE \(OACC_FORK, \.data_dep\.[0-9_]+, 1\);.*Head-1:.*\.data_dep\.[0-9_]+ = \.UNIQUE \(OACC_HEAD_MARK, \.data_dep\.[0-9_]+, 1\);.*\.data_dep\.[0-9_]+ = \.UNIQUE \(OACC_FORK, \.data_dep\.[0-9_]+, 2\);.*Tail-1:.*\.data_dep\.[0-9_]+ = \.UNIQUE \(OACC_TAIL_MARK, \.data_dep\.[0-9_]+, 2\);.*\.data_dep\.[0-9_]+ = \.UNIQUE \(OACC_JOIN, \.data_dep\.[0-9_]+, 2\);.*Tail-0:.*\.data_dep\.[0-9_]+ = \.UNIQUE \(OACC_TAIL_MARK, \.data_dep\.[0-9_]+, 1\);.*\.data_dep\.[0-9_]+ = \.UNIQUE \(OACC_JOIN, \.data_dep\.[0-9_]+, 1\);} "oaccdevlow*" } } */
diff --git a/gcc/testsuite/gcc.dg/graphite/fuse-1.c b/gcc/testsuite/gcc.dg/graphite/fuse-1.c
index 204d3b20703..8a0ac433f92 100644
--- a/gcc/testsuite/gcc.dg/graphite/fuse-1.c
+++ b/gcc/testsuite/gcc.dg/graphite/fuse-1.c
@@ -12,7 +12,7 @@ for (int c0 = 0; c0 <= 99; c0 += 1) {
 /* { dg-final { scan-tree-dump-times "AST generated by isl:.*for \\(int c0 = 0; c0 <= 99; c0 \\+= 1\\) \\{.*S_.*\\(c0\\);.*S_.*\\(c0\\);.*S_.*\\(c0\\);.*\\}" 1 "graphite" } } */
 
 /* Check that after fusing the loops, the scalar computation is also fused. */
-/* { dg-final { scan-tree-dump-times "gimple_simplified to\[^\\n\]*\\^ 12" 1 "forwprop4" } } */
+/* { dg-final { scan-tree-dump-times "gimple_simplified to\[^\\n\]*\\^ 12" 1 "forwprop5" } } */
 
 #define MAX 100
 int A[MAX];
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/backprop-1.c b/gcc/testsuite/gcc.dg/tree-ssa/backprop-1.c
index 302fdb570b6..b6b11bf30af 100644
--- a/gcc/testsuite/gcc.dg/tree-ssa/backprop-1.c
+++ b/gcc/testsuite/gcc.dg/tree-ssa/backprop-1.c
@@ -1,5 +1,5 @@
 /* { dg-do compile } */
-/* { dg-options "-O -g -fdump-tree-backprop-details" } */
+/* { dg-options "-O -g -fdump-tree-backprop1-details" } */
 
 /* Test a simple case of non-looping code in which both uses ignore
    the sign and both definitions are sign ops. */
@@ -18,5 +18,5 @@ TEST_FUNCTION (float, f)
 TEST_FUNCTION (double, )
 TEST_FUNCTION (long double, l)
 
-/* { dg-final { scan-tree-dump-times {Deleting[^\n]* = -x} 3 "backprop" } } */
-/* { dg-final { scan-tree-dump-times {Deleting[^\n]* = ABS_EXPR > 6" "vrp2" } } */
+/* { dg-final { scan-tree-dump ">> 6" "vrp3" } } */
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/pr70232.c b/gcc/testsuite/gcc.dg/tree-ssa/pr70232.c
index 6cc987a722a..672878d7bd1 100644
--- a/gcc/testsuite/gcc.dg/tree-ssa/pr70232.c
+++ b/gcc/testsuite/gcc.dg/tree-ssa/pr70232.c
@@ -1,12 +1,12 @@
 /* { dg-do compile } */
-/* { dg-options "-O2 -w -fdump-tree-vrp1-details -fdump-tree-vrp2-details -fdump-tree-dom2-details -fdump-tree-dom3-details" } */
+/* { dg-options "-O2 -w -fdump-tree-vrp1-details -fdump-tree-vrp3-details -fdump-tree-dom2-details -fdump-tree-dom3-details" } */
 
 /* All the threads found by the FSM threader should have too
    many statements to be profitable. */
 /* { dg-final { scan-tree-dump-not "Registering FSM " "dom2"} } */
 /* { dg-final { scan-tree-dump-not "Registering FSM " "dom3"} } */
 /* { dg-final { scan-tree-dump-not "Registering FSM " "vrp1"} } */
-/* { dg-final { scan-tree-dump-not "Registering FSM " "vrp2"} } */
+/* { dg-final { scan-tree-dump-not "Registering FSM " "vrp3"} } */
 
 typedef _Bool bool;
 typedef unsigned char uint8_t;
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/ssa-dom-thread-7.c b/gcc/testsuite/gcc.dg/tree-ssa/ssa-dom-thread-7.c
index bad5bc1d003..cec2132ce65 100644
--- a/gcc/testsuite/gcc.dg/tree-ssa/ssa-dom-thread-7.c
+++ b/gcc/testsuite/gcc.dg/tree-ssa/ssa-dom-thread-7.c
@@ -1,5 +1,5 @@
 /* { dg-do compile } */
-/* { dg-options "-O2 -fdump-tree-thread1-stats -fdump-tree-thread2-stats -fdump-tree-dom2-stats -fdump-tree-thread3-stats -fdump-tree-dom3-stats -fdump-tree-vrp2-stats -fno-guess-branch-probability" } */
+/* { dg-options "-O2 -fdump-tree-thread1-stats -fdump-tree-thread2-stats -fdump-tree-dom2-stats -fdump-tree-thread3-stats -fdump-tree-dom3-stats -fdump-tree-vrp3-stats -fno-guess-branch-probability" } */
 
 /* Here we have the same issue as was commented in ssa-dom-thread-6.c.
    The PHI coming into the threader has a lot more constants, so the
@@ -24,7 +24,7 @@ $ diff clean/a.c.105t.mergephi2 a.c.105t.mergephi2
    to change decisions in switch expansion which in turn can expose new
    jump threading opportunities. Skip the later tests on aarch64. */
 /* { dg-final { scan-tree-dump-not "Jumps threaded" "dom3" { target { ! aarch64*-*-* } } } } */
-/* { dg-final { scan-tree-dump-not "Jumps threaded" "vrp2" { target { ! aarch64*-*-* } } } } */
+/* { dg-final { scan-tree-dump-not "Jumps threaded" "vrp3" { target { ! aarch64*-*-* } } } } */
 
 enum STATE {
   S0=0,
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/ssa-sink-1.c b/gcc/testsuite/gcc.dg/tree-ssa/ssa-sink-1.c
index 411585a6dc4..57b501681f3 100644
--- a/gcc/testsuite/gcc.dg/tree-ssa/ssa-sink-1.c
+++ b/gcc/testsuite/gcc.dg/tree-ssa/ssa-sink-1.c
@@ -7,4 +7,4 @@ foo (int a, int b, int c)
   return c ? x : a;
 }
 /* We should sink the x = a * b calculation into the branch that returns x. */
-/* { dg-final { scan-tree-dump-times "Sunk statements: 1" 1 "sink" } } */
+/* { dg-final { scan-tree-dump-times "Sunk statements: 1" 1 "sink1" } } */
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/ssa-sink-10.c b/gcc/testsuite/gcc.dg/tree-ssa/ssa-sink-10.c
index 37e4d2fe687..535cb3208f5 100644
--- a/gcc/testsuite/gcc.dg/tree-ssa/ssa-sink-10.c
+++ b/gcc/testsuite/gcc.dg/tree-ssa/ssa-sink-10.c
@@ -16,4 +16,4 @@ void foo (void)
     }
 }
 
-/* { dg-final { scan-tree-dump-times "Sinking # VUSE" 4 "sink" } } */
+/* { dg-final { scan-tree-dump-times "Sinking # VUSE" 4 "sink1" } } */
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/ssa-sink-13.c b/gcc/testsuite/gcc.dg/tree-ssa/ssa-sink-13.c
index a65ba35d4ba..584fd91f43a 100644
--- a/gcc/testsuite/gcc.dg/tree-ssa/ssa-sink-13.c
+++ b/gcc/testsuite/gcc.dg/tree-ssa/ssa-sink-13.c
@@ -21,5 +21,5 @@ void test ()
 
 /* We should sink/merge all stores and end up with a single BB. */
 
-/* { dg-final { scan-tree-dump-times "MEM\[^\n\r\]* = 0;" 3 "sink" } } */
-/* { dg-final { scan-tree-dump-times "= 2; }
   virtual unsigned int execute (function *);
+  opt_pass * clone () { return new pass_complete_unrolli (m_ctxt); }
 
 }; // class pass_complete_unrolli
 
diff --git a/gcc/tree-ssa-loop.c b/gcc/tree-ssa-loop.c
index 5e8365d4e83..79ece2c941f 100644
--- a/gcc/tree-ssa-loop.c
+++ b/gcc/tree-ssa-loop.c
@@ -70,6 +70,9 @@ public:
   virtual bool gate (function *) { return flag_tree_loop_optimize; }
 
   virtual unsigned int execute (function *fn);
+
+  opt_pass * clone () { return new pass_fix_loops (m_ctxt); }
+
 }; // class pass_fix_loops
 
 unsigned int
@@ -202,6 +205,53 @@ make_pass_oacc_kernels (gcc::context *ctxt)
   return new pass_oacc_kernels (ctxt);
 }
 
+/* A superpass that runs only on OpenACC functions. */
+
+namespace {
+
+const pass_data pass_data_oacc_functions =
+{
+  GIMPLE_PASS, /* type */
+  "*oacc_functions", /* name */
+  OPTGROUP_LOOP, /* optinfo_flags */
+  TV_TREE_LOOP, /* tv_id */
+  0, /* properties_required */
+  0, /* properties_provided */
+  0, /* properties_destroyed */
+  0, /* todo_flags_start */
+  0, /* todo_flags_finish */
+};
+
+class pass_oacc_functions: public gimple_opt_pass
+{
+public:
+  pass_oacc_functions (gcc::context *ctxt)
+    : gimple_opt_pass (pass_data_oacc_functions, ctxt)
+  {}
+
+  /* opt_pass methods: */
+  virtual bool gate (function *fn) {
+    if (!flag_openacc)
+      return false;
+
+    if (!oacc_get_fn_attrib (fn->decl))
+      return false;
+
+    return true;
+  }
+
+}; // class pass_oacc_functions
+
+} // anon namespace
+
+gimple_opt_pass *
+make_pass_oacc_functions (gcc::context *ctxt)
+{
+  return new pass_oacc_functions (ctxt);
+}
+
+
+
 /* The ipa oacc superpass. */
 
 namespace {
@@ -344,6 +394,8 @@ public:
   /* opt_pass methods: */
   virtual unsigned int execute (function *);
 
+  opt_pass * clone () { return new pass_tree_loop_init (m_ctxt); }
+
 }; // class pass_tree_loop_init
 
 unsigned int
@@ -558,6 +610,8 @@ public:
   /* opt_pass methods: */
   virtual unsigned int execute (function *) { return tree_ssa_loop_done (); }
 
+  opt_pass * clone () { return new pass_tree_loop_done (m_ctxt); }
+
 }; // class pass_tree_loop_done
 
 } // anon namespace
diff --git a/gcc/tree-ssa-phiprop.c b/gcc/tree-ssa-phiprop.c
index 024da8c408c..6c67e95c0b0 100644
--- a/gcc/tree-ssa-phiprop.c
+++ b/gcc/tree-ssa-phiprop.c
@@ -479,6 +479,8 @@ public:
   virtual bool gate (function *) { return flag_tree_phiprop; }
   virtual unsigned int execute (function *);
 
+  opt_pass * clone () { return new pass_phiprop (m_ctxt); }
+
 }; // class pass_phiprop
 
 unsigned int
diff --git a/gcc/tree-ssa-sink.c b/gcc/tree-ssa-sink.c
index 207aae2818a..e64fd077b84 100644
--- a/gcc/tree-ssa-sink.c
+++ b/gcc/tree-ssa-sink.c
@@ -816,6 +816,8 @@ public:
   virtual bool gate (function *) { return flag_tree_sink != 0; }
   virtual unsigned int execute (function *);
 
+  opt_pass * clone () { return new pass_sink_code (m_ctxt); }
+
 }; // class pass_sink_code
 
 unsigned int
diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/pr84955-1.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/pr84955-1.c
index 44767cd27c3..9fff1a35143 100644
--- a/libgomp/testsuite/libgomp.oacc-c-c++-common/pr84955-1.c
+++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/pr84955-1.c
@@ -1,5 +1,5 @@
 /* { dg-do compile } */
-/* { dg-options "-O2 -fdump-tree-cddce2 -ffinite-loops" } */
+/* { dg-options "-O2 -fdump-tree-cddce -ffinite-loops" } */
 
 int
 f1 (void)
@@ -28,4 +28,4 @@ f2 (void)
   return i + j;
 }
 
-/* { dg-final { scan-tree-dump-not "if" "cddce2"} } */
+/* { dg-final { scan-tree-dump-not "if" "cddce3"} } */
diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/pr85486-2.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/pr85486-2.c
index d45326488cd..a04905eab2d 100644
--- a/libgomp/testsuite/libgomp.oacc-c-c++-common/pr85486-2.c
+++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/pr85486-2.c
@@ -7,5 +7,5 @@
 
 #include "pr85486.c"
 
-/* { dg-final { scan-offload-tree-dump "__attribute__\\(\\(oacc function \\(1, 1, 32\\)" "oaccdevlow" } } */
+/* { dg-final { scan-offload-tree-dump "__attribute__\\(\\(oacc function \\(1, 1, 32\\)" "oaccdevlow{2,3}" } } */
 /* { dg-output "nvptx_exec: kernel main\\\$_omp_fn\\\$0: launch gangs=1, workers=1, vectors=32" } */
diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/pr85486-3.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/pr85486-3.c
index 33480a4ae68..abd36f93686 100644
--- a/libgomp/testsuite/libgomp.oacc-c-c++-common/pr85486-3.c
+++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/pr85486-3.c
@@ -7,5 +7,5 @@
 
 #include "pr85486.c"
 
-/* { dg-final { scan-offload-tree-dump "__attribute__\\(\\(oacc function \\(1, 1, 32\\)" "oaccdevlow" } } */
+/* { dg-final { scan-offload-tree-dump "__attribute__\\(\\(oacc function \\(1, 1, 32\\)" "oaccdevlow{2,3}" } } */
 /* { dg-output "nvptx_exec: kernel main\\\$_omp_fn\\\$0: launch gangs=1, workers=1, vectors=32" } */
diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/pr85486.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/pr85486.c
index 0d98b82f993..78df5b140ba 100644
--- a/libgomp/testsuite/libgomp.oacc-c-c++-common/pr85486.c
+++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/pr85486.c
@@ -54,5 +54,5 @@ main (void)
   return 0;
 }
 
-/* { dg-final { scan-offload-tree-dump "__attribute__\\(\\(oacc function \\(1, 1, 32\\)" "oaccdevlow" } } */
+/* { dg-final { scan-offload-tree-dump "__attribute__\\(\\(oacc function \\(1, 1, 32\\)" "oaccdevlow{2,3}" } } */
 /* { dg-output "nvptx_exec: kernel main\\\$_omp_fn\\\$0: launch gangs=1, workers=1, vectors=32" } */
diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/vector-length-128-1.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/vector-length-128-1.c
index 18d77cc5ecb..085d7ffe287 100644
--- a/libgomp/testsuite/libgomp.oacc-c-c++-common/vector-length-128-1.c
+++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/vector-length-128-1.c
@@ -34,5 +34,5 @@ main (void)
   return 0;
 }
 
-/* { dg-final { scan-offload-tree-dump "__attribute__\\(\\(oacc function \\(1, 1, 128\\)" "oaccdevlow" } } */
+/* { dg-final { scan-offload-tree-dump "__attribute__\\(\\(oacc function \\(1, 1, 128\\)" "oaccdevlow*" } } */
 /* { dg-output "nvptx_exec: kernel main\\\$_omp_fn\\\$0: launch gangs=1, workers=1, vectors=128" } */
diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/vector-length-128-2.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/vector-length-128-2.c
index 8b5b2a4a92d..391aa845f42 100644
--- a/libgomp/testsuite/libgomp.oacc-c-c++-common/vector-length-128-2.c
+++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/vector-length-128-2.c
@@ -35,5 +35,5 @@ main (void)
   return 0;
 }
 
-/* { dg-final { scan-offload-tree-dump "__attribute__\\(\\(oacc function \\(1, 1, 128\\)" "oaccdevlow" } } */
+/* { dg-final { scan-offload-tree-dump "__attribute__\\(\\(oacc function \\(1, 1, 128\\)" "oaccdevlow*" } } */
 /* { dg-output "nvptx_exec: kernel main\\\$_omp_fn\\\$0: launch gangs=1, workers=1, vectors=128" } */
diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/vector-length-128-3.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/vector-length-128-3.c
index 59be37a7c27..3be8b21ef01 100644
--- a/libgomp/testsuite/libgomp.oacc-c-c++-common/vector-length-128-3.c
+++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/vector-length-128-3.c
@@ -38,5 +38,5 @@ main (void)
   return 0;
 }
 
-/* { dg-final { scan-offload-tree-dump "__attribute__\\(\\(oacc function \\(1, 1, 32\\)" "oaccdevlow" } } */
+/* { dg-final { scan-offload-tree-dump "__attribute__\\(\\(oacc function \\(1, 1, 32\\)" "oaccdevlow*" } } */
 /* { dg-output "nvptx_exec: kernel main\\\$_omp_fn\\\$0: launch gangs=1, workers=1, vectors=32" } */
diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/vector-length-128-4.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/vector-length-128-4.c
index e5d1df09b8a..a9a00d1141b 100644
--- a/libgomp/testsuite/libgomp.oacc-c-c++-common/vector-length-128-4.c
+++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/vector-length-128-4.c
@@ -36,5 +36,5 @@ main (void)
   return 0;
 }
 
-/* { dg-final { scan-offload-tree-dump "__attribute__\\(\\(oacc function \\(1, 2, 128\\)" "oaccdevlow" } } */
+/* { dg-final { scan-offload-tree-dump "__attribute__\\(\\(oacc function \\(1, 2, 128\\)" "oaccdevlow*" } } */
 /* { dg-output "nvptx_exec: kernel main\\\$_omp_fn\\\$0: launch gangs=1, workers=2, vectors=128" } */
diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/vector-length-128-5.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/vector-length-128-5.c
index e60f1c28db4..1633a6ca81a 100644
--- a/libgomp/testsuite/libgomp.oacc-c-c++-common/vector-length-128-5.c
+++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/vector-length-128-5.c
@@ -37,5 +37,5 @@ main (void)
   return 0;
 }
 
-/* { dg-final { scan-offload-tree-dump "__attribute__\\(\\(oacc function \\(1, 2, 128\\)" "oaccdevlow" } } */
+/* { dg-final { scan-offload-tree-dump "__attribute__\\(\\(oacc function \\(1, 2, 128\\)" "oaccdevlow*" } } */
 /* { dg-output "nvptx_exec: kernel main\\\$_omp_fn\\\$0: launch gangs=1, workers=2, vectors=128" } */
diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/vector-length-128-6.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/vector-length-128-6.c
index a1f67622f84..16af1c9e6c6 100644
--- a/libgomp/testsuite/libgomp.oacc-c-c++-common/vector-length-128-6.c
+++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/vector-length-128-6.c
@@ -37,5 +37,5 @@ main (void)
   return 0;
 }
 
-/* { dg-final { scan-offload-tree-dump "__attribute__\\(\\(oacc function \\(1, 0, 128\\)" "oaccdevlow" } } */
+/* { dg-final { scan-offload-tree-dump "__attribute__\\(\\(oacc function \\(1, 0, 128\\)" "oaccdevlow*" } } */
 /* { dg-output "nvptx_exec: kernel main\\\$_omp_fn\\\$0: launch gangs=1, workers=2, vectors=128" } */
diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/vector-length-128-7.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/vector-length-128-7.c
index c419f6499b5..57830542ad0 100644
--- a/libgomp/testsuite/libgomp.oacc-c-c++-common/vector-length-128-7.c
+++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/vector-length-128-7.c
@@ -36,5 +36,5 @@ main (void)
   return 0;
 }
 
-/* { dg-final { scan-offload-tree-dump "__attribute__\\(\\(oacc function \\(1, 0, 128\\)" "oaccdevlow" } } */
+/* { dg-final { scan-offload-tree-dump "__attribute__\\(\\(oacc function \\(1, 0, 128\\)" "oaccdevlow*" } } */
 /* { dg-output "nvptx_exec: kernel main\\\$_omp_fn\\\$0: launch gangs=1, workers=8, vectors=128" } */
--
2.17.1