diff mbox

[10/16] Add pass_oacc_kernels pass group in passes.def

Message ID 565455C9.7000206@mentor.com
State New
Headers show

Commit Message

Tom de Vries Nov. 24, 2015, 12:19 p.m. UTC
On 23/11/15 11:02, Richard Biener wrote:
> On Fri, 20 Nov 2015, Tom de Vries wrote:
>
>> On 20/11/15 14:29, Richard Biener wrote:
>>> I agree it's somewhat of an odd behavior but all passes should
>>> either be placed in a sub-pipeline with an outer
>>> loop_optimizer_init()/finalize () call or call both themselves.
>>
>> Hmm, but adding loop_optimizer_finalize at the end of pass_lim breaks the loop
>> pipeline.
>>
>> We could use the style used in pass_slp_vectorize::execute:
>> ...
>> pass_slp_vectorize::execute (function *fun)
>> {
>>    basic_block bb;
>>
>>    bool in_loop_pipeline = scev_initialized_p ();
>>    if (!in_loop_pipeline)
>>      {
>>        loop_optimizer_init (LOOPS_NORMAL);
>>        scev_initialize ();
>>      }
>>
>>    ...
>>
>>    if (!in_loop_pipeline)
>>      {
>>        scev_finalize ();
>>        loop_optimizer_finalize ();
>>      }
>> ...
>>
>> Although that doesn't strike me as particularly clean.
>
> At least it would be a consistent "unclean" style.  So yes, the
> above would work for me.
>

Reposting using the in_loop_pipeline style in pass_lim.

Thanks,
- Tom

Comments

Richard Biener Nov. 24, 2015, 1:13 p.m. UTC | #1
On Tue, 24 Nov 2015, Tom de Vries wrote:

> On 23/11/15 11:02, Richard Biener wrote:
> > On Fri, 20 Nov 2015, Tom de Vries wrote:
> > 
> > > On 20/11/15 14:29, Richard Biener wrote:
> > > > I agree it's somewhat of an odd behavior but all passes should
> > > > either be placed in a sub-pipeline with an outer
> > > > loop_optimizer_init()/finalize () call or call both themselves.
> > > 
> > > Hmm, but adding loop_optimizer_finalize at the end of pass_lim breaks the
> > > loop
> > > pipeline.
> > > 
> > > We could use the style used in pass_slp_vectorize::execute:
> > > ...
> > > pass_slp_vectorize::execute (function *fun)
> > > {
> > >    basic_block bb;
> > > 
> > >    bool in_loop_pipeline = scev_initialized_p ();
> > >    if (!in_loop_pipeline)
> > >      {
> > >        loop_optimizer_init (LOOPS_NORMAL);
> > >        scev_initialize ();
> > >      }
> > > 
> > >    ...
> > > 
> > >    if (!in_loop_pipeline)
> > >      {
> > >        scev_finalize ();
> > >        loop_optimizer_finalize ();
> > >      }
> > > ...
> > > 
> > > Although that doesn't strike me as particularly clean.
> > 
> > At least it would be a consistent "unclean" style.  So yes, the
> > above would work for me.
> > 
> 
> Reposting using the in_loop_pipeline style in pass_lim.

The tree-ssa-loop-im.c changes are ok (I suppose the other changes
are in the other patch you posted as well).

Thanks,
Richard.
Tom de Vries Nov. 24, 2015, 2:02 p.m. UTC | #2
On 24/11/15 14:13, Richard Biener wrote:
> On Tue, 24 Nov 2015, Tom de Vries wrote:
>
>> >On 23/11/15 11:02, Richard Biener wrote:
>>> > >On Fri, 20 Nov 2015, Tom de Vries wrote:
>>> > >
>>>> > > >On 20/11/15 14:29, Richard Biener wrote:
>>>>> > > > >I agree it's somewhat of an odd behavior but all passes should
>>>>> > > > >either be placed in a sub-pipeline with an outer
>>>>> > > > >loop_optimizer_init()/finalize () call or call both themselves.
>>>> > > >
>>>> > > >Hmm, but adding loop_optimizer_finalize at the end of pass_lim breaks the
>>>> > > >loop
>>>> > > >pipeline.
>>>> > > >
>>>> > > >We could use the style used in pass_slp_vectorize::execute:
>>>> > > >...
>>>> > > >pass_slp_vectorize::execute (function *fun)
>>>> > > >{
>>>> > > >    basic_block bb;
>>>> > > >
>>>> > > >    bool in_loop_pipeline = scev_initialized_p ();
>>>> > > >    if (!in_loop_pipeline)
>>>> > > >      {
>>>> > > >        loop_optimizer_init (LOOPS_NORMAL);
>>>> > > >        scev_initialize ();
>>>> > > >      }
>>>> > > >
>>>> > > >    ...
>>>> > > >
>>>> > > >    if (!in_loop_pipeline)
>>>> > > >      {
>>>> > > >        scev_finalize ();
>>>> > > >        loop_optimizer_finalize ();
>>>> > > >      }
>>>> > > >...
>>>> > > >
>>>> > > >Although that doesn't strike me as particularly clean.
>>> > >
>>> > >At least it would be a consistent "unclean" style.  So yes, the
>>> > >above would work for me.
>>> > >
>> >
>> >Reposting using the in_loop_pipeline style in pass_lim.
> The tree-ssa-loop-im.c changes are ok

OK, I'll commit those.

> (I suppose the other changes
> are in the other patch you posted as well).

This ( https://gcc.gnu.org/ml/gcc-patches/2015-11/msg02882.html ) patch 
contains changes related to adding pass_oacc_kernels2. Are those the 
"other changes" you're referring to?

Thanks,
- Tom
Richard Biener Nov. 24, 2015, 2:33 p.m. UTC | #3
On Tue, 24 Nov 2015, Tom de Vries wrote:

> On 24/11/15 14:13, Richard Biener wrote:
> > On Tue, 24 Nov 2015, Tom de Vries wrote:
> > 
> > > >On 23/11/15 11:02, Richard Biener wrote:
> > > > > >On Fri, 20 Nov 2015, Tom de Vries wrote:
> > > > > >
> > > > > > > >On 20/11/15 14:29, Richard Biener wrote:
> > > > > > > > > >I agree it's somewhat of an odd behavior but all passes
> > > > > > should
> > > > > > > > > >either be placed in a sub-pipeline with an outer
> > > > > > > > > >loop_optimizer_init()/finalize () call or call both
> > > > > > themselves.
> > > > > > > >
> > > > > > > >Hmm, but adding loop_optimizer_finalize at the end of pass_lim
> > > > > breaks the
> > > > > > > >loop
> > > > > > > >pipeline.
> > > > > > > >
> > > > > > > >We could use the style used in pass_slp_vectorize::execute:
> > > > > > > >...
> > > > > > > >pass_slp_vectorize::execute (function *fun)
> > > > > > > >{
> > > > > > > >    basic_block bb;
> > > > > > > >
> > > > > > > >    bool in_loop_pipeline = scev_initialized_p ();
> > > > > > > >    if (!in_loop_pipeline)
> > > > > > > >      {
> > > > > > > >        loop_optimizer_init (LOOPS_NORMAL);
> > > > > > > >        scev_initialize ();
> > > > > > > >      }
> > > > > > > >
> > > > > > > >    ...
> > > > > > > >
> > > > > > > >    if (!in_loop_pipeline)
> > > > > > > >      {
> > > > > > > >        scev_finalize ();
> > > > > > > >        loop_optimizer_finalize ();
> > > > > > > >      }
> > > > > > > >...
> > > > > > > >
> > > > > > > >Although that doesn't strike me as particularly clean.
> > > > > >
> > > > > >At least it would be a consistent "unclean" style.  So yes, the
> > > > > >above would work for me.
> > > > > >
> > > >
> > > >Reposting using the in_loop_pipeline style in pass_lim.
> > The tree-ssa-loop-im.c changes are ok
> 
> OK, I'll commit those.
> 
> > (I suppose the other changes
> > are in the other patch you posted as well).
> 
> This ( https://gcc.gnu.org/ml/gcc-patches/2015-11/msg02882.html ) patch
> contains changes related to adding pass_oacc_kernels2. Are those the "other
> changes" you're referring to?

No, the other pathc adding oacc_kernels pass group to passes.def.

Btw, at some point splitting patches too much becomes very much
confusing instead of helping.

Richard.
Tom de Vries Nov. 24, 2015, 2:57 p.m. UTC | #4
On 24/11/15 15:33, Richard Biener wrote:
> On Tue, 24 Nov 2015, Tom de Vries wrote:
>
>> On 24/11/15 14:13, Richard Biener wrote:
>>> On Tue, 24 Nov 2015, Tom de Vries wrote:
>>>
>>>>> On 23/11/15 11:02, Richard Biener wrote:
>>>>>>> On Fri, 20 Nov 2015, Tom de Vries wrote:
>>>>>>>
>>>>>>>>> On 20/11/15 14:29, Richard Biener wrote:
>>>>>>>>>>> I agree it's somewhat of an odd behavior but all passes
>>>>>>> should
>>>>>>>>>>> either be placed in a sub-pipeline with an outer
>>>>>>>>>>> loop_optimizer_init()/finalize () call or call both
>>>>>>> themselves.
>>>>>>>>>
>>>>>>>>> Hmm, but adding loop_optimizer_finalize at the end of pass_lim
>>>>>> breaks the
>>>>>>>>> loop
>>>>>>>>> pipeline.
>>>>>>>>>
>>>>>>>>> We could use the style used in pass_slp_vectorize::execute:
>>>>>>>>> ...
>>>>>>>>> pass_slp_vectorize::execute (function *fun)
>>>>>>>>> {
>>>>>>>>>     basic_block bb;
>>>>>>>>>
>>>>>>>>>     bool in_loop_pipeline = scev_initialized_p ();
>>>>>>>>>     if (!in_loop_pipeline)
>>>>>>>>>       {
>>>>>>>>>         loop_optimizer_init (LOOPS_NORMAL);
>>>>>>>>>         scev_initialize ();
>>>>>>>>>       }
>>>>>>>>>
>>>>>>>>>     ...
>>>>>>>>>
>>>>>>>>>     if (!in_loop_pipeline)
>>>>>>>>>       {
>>>>>>>>>         scev_finalize ();
>>>>>>>>>         loop_optimizer_finalize ();
>>>>>>>>>       }
>>>>>>>>> ...
>>>>>>>>>
>>>>>>>>> Although that doesn't strike me as particularly clean.
>>>>>>>
>>>>>>> At least it would be a consistent "unclean" style.  So yes, the
>>>>>>> above would work for me.
>>>>>>>
>>>>>
>>>>> Reposting using the in_loop_pipeline style in pass_lim.
>>> The tree-ssa-loop-im.c changes are ok
>>
>> OK, I'll commit those.
>>
>>> (I suppose the other changes
>>> are in the other patch you posted as well).
>>
>> This ( https://gcc.gnu.org/ml/gcc-patches/2015-11/msg02882.html ) patch
>> contains changes related to adding pass_oacc_kernels2. Are those the "other
>> changes" you're referring to?
>
> No, the other pathc adding oacc_kernels pass group to passes.def.
>

I don't understand. There 's only one patch adding oacc_kernels pass 
group to passes.def (which is the one in this thread).

> Btw, at some point splitting patches too much becomes very much
> confusing instead of helping.

Would it help if I merge "Add pass_oacc_kernels" with this patch?

Thanks,
- Tom
Richard Biener Nov. 25, 2015, 10:42 a.m. UTC | #5
On Tue, 24 Nov 2015, Tom de Vries wrote:

> On 24/11/15 15:33, Richard Biener wrote:
> > On Tue, 24 Nov 2015, Tom de Vries wrote:
> > 
> > > On 24/11/15 14:13, Richard Biener wrote:
> > > > On Tue, 24 Nov 2015, Tom de Vries wrote:
> > > > 
> > > > > > On 23/11/15 11:02, Richard Biener wrote:
> > > > > > > > On Fri, 20 Nov 2015, Tom de Vries wrote:
> > > > > > > > 
> > > > > > > > > > On 20/11/15 14:29, Richard Biener wrote:
> > > > > > > > > > > > I agree it's somewhat of an odd behavior but all passes
> > > > > > > > should
> > > > > > > > > > > > either be placed in a sub-pipeline with an outer
> > > > > > > > > > > > loop_optimizer_init()/finalize () call or call both
> > > > > > > > themselves.
> > > > > > > > > > 
> > > > > > > > > > Hmm, but adding loop_optimizer_finalize at the end of
> > > > > > > > > > pass_lim
> > > > > > > breaks the
> > > > > > > > > > loop
> > > > > > > > > > pipeline.
> > > > > > > > > > 
> > > > > > > > > > We could use the style used in pass_slp_vectorize::execute:
> > > > > > > > > > ...
> > > > > > > > > > pass_slp_vectorize::execute (function *fun)
> > > > > > > > > > {
> > > > > > > > > >     basic_block bb;
> > > > > > > > > > 
> > > > > > > > > >     bool in_loop_pipeline = scev_initialized_p ();
> > > > > > > > > >     if (!in_loop_pipeline)
> > > > > > > > > >       {
> > > > > > > > > >         loop_optimizer_init (LOOPS_NORMAL);
> > > > > > > > > >         scev_initialize ();
> > > > > > > > > >       }
> > > > > > > > > > 
> > > > > > > > > >     ...
> > > > > > > > > > 
> > > > > > > > > >     if (!in_loop_pipeline)
> > > > > > > > > >       {
> > > > > > > > > >         scev_finalize ();
> > > > > > > > > >         loop_optimizer_finalize ();
> > > > > > > > > >       }
> > > > > > > > > > ...
> > > > > > > > > > 
> > > > > > > > > > Although that doesn't strike me as particularly clean.
> > > > > > > > 
> > > > > > > > At least it would be a consistent "unclean" style.  So yes, the
> > > > > > > > above would work for me.
> > > > > > > > 
> > > > > > 
> > > > > > Reposting using the in_loop_pipeline style in pass_lim.
> > > > The tree-ssa-loop-im.c changes are ok
> > > 
> > > OK, I'll commit those.
> > > 
> > > > (I suppose the other changes
> > > > are in the other patch you posted as well).
> > > 
> > > This ( https://gcc.gnu.org/ml/gcc-patches/2015-11/msg02882.html ) patch
> > > contains changes related to adding pass_oacc_kernels2. Are those the
> > > "other
> > > changes" you're referring to?
> > 
> > No, the other pathc adding oacc_kernels pass group to passes.def.
> > 
> 
> I don't understand. There 's only one patch adding oacc_kernels pass group to
> passes.def (which is the one in this thread).
> 
> > Btw, at some point splitting patches too much becomes very much
> > confusing instead of helping.
> 
> Would it help if I merge "Add pass_oacc_kernels" with this patch?

It would have, yes.  As said, the excessive splitting just confuses
the review process.  Will review in the present state anyway.

Richard.

> Thanks,
> - Tom
> 
>
Richard Biener Nov. 25, 2015, 10:43 a.m. UTC | #6
On Tue, 24 Nov 2015, Tom de Vries wrote:

> On 23/11/15 11:02, Richard Biener wrote:
> > On Fri, 20 Nov 2015, Tom de Vries wrote:
> > 
> > > On 20/11/15 14:29, Richard Biener wrote:
> > > > I agree it's somewhat of an odd behavior but all passes should
> > > > either be placed in a sub-pipeline with an outer
> > > > loop_optimizer_init()/finalize () call or call both themselves.
> > > 
> > > Hmm, but adding loop_optimizer_finalize at the end of pass_lim breaks the
> > > loop
> > > pipeline.
> > > 
> > > We could use the style used in pass_slp_vectorize::execute:
> > > ...
> > > pass_slp_vectorize::execute (function *fun)
> > > {
> > >    basic_block bb;
> > > 
> > >    bool in_loop_pipeline = scev_initialized_p ();
> > >    if (!in_loop_pipeline)
> > >      {
> > >        loop_optimizer_init (LOOPS_NORMAL);
> > >        scev_initialize ();
> > >      }
> > > 
> > >    ...
> > > 
> > >    if (!in_loop_pipeline)
> > >      {
> > >        scev_finalize ();
> > >        loop_optimizer_finalize ();
> > >      }
> > > ...
> > > 
> > > Although that doesn't strike me as particularly clean.
> > 
> > At least it would be a consistent "unclean" style.  So yes, the
> > above would work for me.
> > 
> 
> Reposting using the in_loop_pipeline style in pass_lim.

Ok.

Thanks,
Richard.
diff mbox

Patch

Add pass_oacc_kernels pass group in passes.def

2015-11-09  Tom de Vries  <tom@codesourcery.com>

	* omp-low.c (pass_expand_omp_ssa::clone): New function.
	* passes.def: Add pass_oacc_kernels pass group.
	* tree-ssa-loop-ch.c (pass_ch::clone): New function.
	* tree-ssa-loop-im.c (tree_ssa_lim): Make static.
	(pass_lim::execute): Allow to run outside pass_tree_loop.

---
 gcc/omp-low.c          |  1 +
 gcc/passes.def         | 18 ++++++++++++++++++
 gcc/tree-ssa-loop-ch.c |  2 ++
 gcc/tree-ssa-loop-im.c | 12 ++++++++++--
 4 files changed, 31 insertions(+), 2 deletions(-)

diff --git a/gcc/omp-low.c b/gcc/omp-low.c
index efe5d3a..7318b0e 100644
--- a/gcc/omp-low.c
+++ b/gcc/omp-low.c
@@ -13366,6 +13366,7 @@  public:
       return !(fun->curr_properties & PROP_gimple_eomp);
     }
   virtual unsigned int execute (function *) { return execute_expand_omp (); }
+  opt_pass * clone () { return new pass_expand_omp_ssa (m_ctxt); }
 
 }; // class pass_expand_omp_ssa
 
diff --git a/gcc/passes.def b/gcc/passes.def
index 17027786..f1969c0 100644
--- a/gcc/passes.def
+++ b/gcc/passes.def
@@ -88,7 +88,25 @@  along with GCC; see the file COPYING3.  If not see
 	  /* pass_build_ealias is a dummy pass that ensures that we
 	     execute TODO_rebuild_alias at this point.  */
 	  NEXT_PASS (pass_build_ealias);
+	  /* Pass group that runs when the function is an offloaded function
+	     containing oacc kernels loops.  Part 1.  */
+	  NEXT_PASS (pass_oacc_kernels);
+	  PUSH_INSERT_PASSES_WITHIN (pass_oacc_kernels)
+	      NEXT_PASS (pass_ch);
+	  POP_INSERT_PASSES ()
 	  NEXT_PASS (pass_fre);
+	  /* Pass group that runs when the function is an offloaded function
+	     containing oacc kernels loops.  Part 2.  */
+	  NEXT_PASS (pass_oacc_kernels2);
+	  PUSH_INSERT_PASSES_WITHIN (pass_oacc_kernels2)
+	      /* We use pass_lim to rewrite in-memory iteration and reduction
+	         variable accesses in loops into local variables accesses.  */
+	      NEXT_PASS (pass_lim);
+	      NEXT_PASS (pass_dominator, false /* may_peel_loop_headers_p */);
+	      NEXT_PASS (pass_dce);
+	      NEXT_PASS (pass_parallelize_loops_oacc_kernels);
+	      NEXT_PASS (pass_expand_omp_ssa);
+	  POP_INSERT_PASSES ()
 	  NEXT_PASS (pass_merge_phi);
           NEXT_PASS (pass_dse);
 	  NEXT_PASS (pass_cd_dce);
diff --git a/gcc/tree-ssa-loop-ch.c b/gcc/tree-ssa-loop-ch.c
index 7e618bf..6493fcc 100644
--- a/gcc/tree-ssa-loop-ch.c
+++ b/gcc/tree-ssa-loop-ch.c
@@ -165,6 +165,8 @@  public:
   /* Initialize and finalize loop structures, copying headers inbetween.  */
   virtual unsigned int execute (function *);
 
+  opt_pass * clone () { return new pass_ch (m_ctxt); }
+
 protected:
   /* ch_base method: */
   virtual bool process_loop_p (struct loop *loop);
diff --git a/gcc/tree-ssa-loop-im.c b/gcc/tree-ssa-loop-im.c
index 30b53ce..0d82d36 100644
--- a/gcc/tree-ssa-loop-im.c
+++ b/gcc/tree-ssa-loop-im.c
@@ -43,6 +43,7 @@  along with GCC; see the file COPYING3.  If not see
 #include "tree-ssa-propagate.h"
 #include "trans-mem.h"
 #include "gimple-fold.h"
+#include "tree-scalar-evolution.h"
 
 /* TODO:  Support for predicated code motion.  I.e.
 
@@ -2496,7 +2497,7 @@  tree_ssa_lim_finalize (void)
 /* Moves invariants from loops.  Only "expensive" invariants are moved out --
    i.e. those that are likely to be win regardless of the register pressure.  */
 
-unsigned int
+static unsigned int
 tree_ssa_lim (void)
 {
   unsigned int todo;
@@ -2560,10 +2561,17 @@  public:
 unsigned int
 pass_lim::execute (function *fun)
 {
+  bool in_loop_pipeline = scev_initialized_p ();
+  if (!in_loop_pipeline)
+    loop_optimizer_init (LOOPS_NORMAL | LOOPS_HAVE_RECORDED_EXITS);
+
   if (number_of_loops (fun) <= 1)
     return 0;
+  unsigned int todo = tree_ssa_lim ();
 
-  return tree_ssa_lim ();
+  if (!in_loop_pipeline)
+    loop_optimizer_finalize ();
+  return todo;
 }
 
 } // anon namespace