[4/16] Implement -foffload-alias

Message ID 566AC56E.8050701@mentor.com
State New
Commit Message

Tom de Vries Dec. 11, 2015, 12:45 p.m. UTC
On 13/11/15 12:39, Jakub Jelinek wrote:
> We simply have some compiler internal interface between the caller and
> callee of the outlined regions, each interface in between those has
> its own structure type used to communicate the info;
> we can attach attributes on the fields, or some flags to indicate some
> properties interesting from aliasing POV.  We don't really need to perform
> full IPA-PTA, perhaps it would be enough to a) record somewhere in cgraph
> the relationship in between such callers and callees (for offloading regions
> we already have "omp target entrypoint" attribute on the callee and a
> single caller), tell LTO if possible not to split those into different
> partitions if easily possible, and then just for these pairs perform
> aliasing/points-to analysis in the caller and the result record using
> cliques/special attributes/whatever to the callee side, so that the callee
> (outlined OpenMP/OpenACC/Cilk+ region) can then improve its alias analysis.
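
To make the quoted interface concrete, here is a minimal sketch of an outlined region in plain C. Every name in it (region_data, region_fn, run_region) is hypothetical and chosen for illustration only; it is not what the compiler actually emits. The point is that the caller can see that a and b are distinct objects, while the callee only sees two pointer fields loaded from a record.

```c
#include <stddef.h>

/* Hypothetical record used to pass the region's data from caller to
   callee.  Illustration only, not the compiler's real layout.  */
struct region_data
{
  float *a;
  float *b;
  size_t n;
};

/* Hypothetical outlined region body.  Seen in isolation, nothing here
   says whether d->a and d->b overlap.  */
static void
region_fn (void *arg)
{
  struct region_data *d = arg;
  for (size_t i = 0; i < d->n; i++)
    d->a[i] = d->b[i] + 1.0f;
}

/* Caller side: packs the pointers into the record and invokes the
   outlined body.  Aliasing facts known here are what the proposal
   wants to hand over to region_fn.  */
void
run_region (float *a, float *b, size_t n)
{
  struct region_data d = { a, b, n };
  region_fn (&d);
}
```

Annotating the fields of such a record, as the quoted text suggests, would be one way to carry the caller's knowledge across this boundary.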

Hi,

This work-in-progress patch allows me to use IPA PTA information in the 
kernels pass group.

Since:
- I'm running IPA PTA before ealias, and IPA PTA does not interpret
  restrict, and
- compute_may_aliases doesn't run if IPA PTA information is present,
I needed to convince ealias to do the restrict clique/base annotation.
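
The resulting control flow can be sketched as a small standalone C model. This is a simplification under stated assumptions: struct fn_state and the model_* functions are hypothetical stand-ins, the counters exist only to make the behaviour observable, and only the two flags correspond to fields that appear in the patch.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the per-function state touched by the
   patch.  The real flags live in struct gimple_df.  */
struct fn_state
{
  bool ipa_pta;                     /* IPA PTA results are present.  */
  bool clique_base_annotation_done; /* Flag added by the patch.  */
  int local_solves;                 /* Model only: local solver runs.  */
  int points_to_updates;            /* Model only: points-to info written.  */
  int clique_annotations;           /* Model only: restrict annotations.  */
};

/* Models compute_points_to_sets (bool): always solve, but publish the
   solution only when SET_POINTS_TO_INFO is true.  */
static void
model_compute_points_to_sets (struct fn_state *fn, bool set_points_to_info)
{
  fn->local_solves++;
  if (set_points_to_info)
    fn->points_to_updates++;
}

/* Models the patched compute_may_aliases.  */
unsigned int
model_compute_may_aliases (struct fn_state *fn)
{
  bool set_points_to_info = true;

  if (fn->ipa_pta)
    {
      /* With IPA PTA info present, annotate at most once.  */
      if (fn->clique_base_annotation_done)
        return 0;
      /* Keep the IPA PTA solution; solve locally only to annotate.  */
      set_points_to_info = false;
    }

  model_compute_points_to_sets (fn, set_points_to_info);
  fn->clique_annotations++; /* Stands for compute_dependence_clique ().  */
  fn->clique_base_annotation_done = true;
  return 0;
}
```

In this model, a function with IPA PTA information gets one local solve and one annotation on the first call, leaves the points-to info untouched, and returns early on later calls.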

It would be more logical to fit IPA PTA after ealias, but one is an IPA 
pass, the other a regular one-function pass, so I would have to split 
the containing pass groups pass_all_early_optimizations and 
pass_local_optimization_passes. I'll give that a try now.

Any comments?

Thanks,
- Tom

Comments

Richard Biener Dec. 11, 2015, 1 p.m. UTC | #1
On Fri, 11 Dec 2015, Tom de Vries wrote:

> On 13/11/15 12:39, Jakub Jelinek wrote:
> > We simply have some compiler internal interface between the caller and
> > callee of the outlined regions, each interface in between those has
> > its own structure type used to communicate the info;
> > we can attach attributes on the fields, or some flags to indicate some
> > properties interesting from aliasing POV.  We don't really need to perform
> > full IPA-PTA, perhaps it would be enough to a) record somewhere in cgraph
> > the relationship in between such callers and callees (for offloading regions
> > we already have "omp target entrypoint" attribute on the callee and a
> > single caller), tell LTO if possible not to split those into different
> > partitions if easily possible, and then just for these pairs perform
> > aliasing/points-to analysis in the caller and the result record using
> > cliques/special attributes/whatever to the callee side, so that the callee
> > (outlined OpenMP/OpenACC/Cilk+ region) can then improve its alias analysis.
> 
> Hi,
> 
> This work-in-progress patch allows me to use IPA PTA information in the
> kernels pass group.
> 
> Since:
> - I'm running IPA PTA before ealias, and IPA PTA does not interpret
>   restrict, and
> - compute_may_aliases doesn't run if IPA PTA information is present,
> I needed to convince ealias to do the restrict clique/base annotation.
> 
> It would be more logical to fit IPA PTA after ealias, but one is an IPA pass,
> the other a regular one-function pass, so I would have to split the containing
> pass groups pass_all_early_optimizations and pass_local_optimization_passes.
> I'll give that a try now.
> 
> Any comments?

I don't think you want to run IPA PTA before early
optimizations; both it and ealias rely on some initial cleanup to
do anything meaningful with well-spent resources.

The local PTA "hack" also looks more like a waste of resources, but well 
... teaching IPA PTA to honor restrict might be an impossible task
though I didn't think much about it other than handling it only for
nonlocal_p functions (for others we should see all incoming args
if IPA PTA works optimally).  The restrict tags will leak all over
the place of course and in the end no meaningful cliques may remain.

Richard.
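
For readers less familiar with what restrict conveys, here is a minimal C sketch; scale_copy is a hypothetical function written for this illustration and is not part of the patch. With both parameters restrict-qualified, the compiler may assume that stores through dst do not modify what src points to, which is the kind of fact the clique/base annotation is meant to record on memory references.

```c
#include <stddef.h>

/* Hypothetical example.  The restrict qualifiers promise that dst and
   src do not overlap for the duration of the call, so loads from src
   need not be repeated after stores to dst.  */
void
scale_copy (float *restrict dst, const float *restrict src, size_t n,
            float k)
{
  for (size_t i = 0; i < n; i++)
    dst[i] = k * src[i];
}
```

Calling this with overlapping arrays would break the promise and is undefined behaviour in C99, so the example assumes distinct buffers.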

Patch

Run pass_ipa_pta before pass_local_optimization_passes

---
 gcc/gimple-ssa.h           |  2 ++
 gcc/passes.def             |  1 +
 gcc/tree-pass.h            |  1 +
 gcc/tree-ssa-structalias.c | 60 +++++++++++++++++++++++++++++++++++++++++++---
 4 files changed, 61 insertions(+), 3 deletions(-)

diff --git a/gcc/gimple-ssa.h b/gcc/gimple-ssa.h
index 39551da..aff2fb7 100644
--- a/gcc/gimple-ssa.h
+++ b/gcc/gimple-ssa.h
@@ -83,6 +83,8 @@  struct GTY(()) gimple_df {
   /* The PTA solution for the ESCAPED artificial variable.  */
   struct pt_solution escaped;
 
+  bool clique_base_annotation_done;
+
   /* A map of decls to artificial ssa-names that point to the partition
      of the decl.  */
   hash_map<tree, tree> * GTY((skip(""))) decls_to_pointers;
diff --git a/gcc/passes.def b/gcc/passes.def
index 678a900..5293be0 100644
--- a/gcc/passes.def
+++ b/gcc/passes.def
@@ -68,6 +68,7 @@  along with GCC; see the file COPYING3.  If not see
       NEXT_PASS (pass_rebuild_cgraph_edges);
   POP_INSERT_PASSES ()
 
+  NEXT_PASS (pass_ipa_pta_oacc_kernels);
   NEXT_PASS (pass_local_optimization_passes);
   PUSH_INSERT_PASSES_WITHIN (pass_local_optimization_passes)
       NEXT_PASS (pass_fixup_cfg);
diff --git a/gcc/tree-pass.h b/gcc/tree-pass.h
index 4566d33..980922e 100644
--- a/gcc/tree-pass.h
+++ b/gcc/tree-pass.h
@@ -497,6 +497,7 @@  extern ipa_opt_pass_d *make_pass_ipa_devirt (gcc::context *ctxt);
 extern ipa_opt_pass_d *make_pass_ipa_reference (gcc::context *ctxt);
 extern ipa_opt_pass_d *make_pass_ipa_pure_const (gcc::context *ctxt);
 extern simple_ipa_opt_pass *make_pass_ipa_pta (gcc::context *ctxt);
+extern simple_ipa_opt_pass *make_pass_ipa_pta_oacc_kernels (gcc::context *ctxt);
 extern simple_ipa_opt_pass *make_pass_ipa_tm (gcc::context *ctxt);
 extern simple_ipa_opt_pass *make_pass_target_clone (gcc::context *ctxt);
 extern simple_ipa_opt_pass *make_pass_dispatcher_calls (gcc::context *ctxt);
diff --git a/gcc/tree-ssa-structalias.c b/gcc/tree-ssa-structalias.c
index 7420ce1..dfc0422 100644
--- a/gcc/tree-ssa-structalias.c
+++ b/gcc/tree-ssa-structalias.c
@@ -6939,7 +6939,7 @@  solve_constraints (void)
    at the start of the file for an algorithmic overview.  */
 
 static void
-compute_points_to_sets (void)
+compute_points_to_sets (bool set_points_to_info)
 {
   basic_block bb;
   unsigned i;
@@ -6981,6 +6981,9 @@  compute_points_to_sets (void)
   /* From the constraints compute the points-to sets.  */
   solve_constraints ();
 
+  if (!set_points_to_info)
+    goto done;
+
   /* Compute the points-to set for ESCAPED used for call-clobber analysis.  */
   cfun->gimple_df->escaped = find_what_var_points_to (cfun->decl,
 						      get_varinfo (escaped_id));
@@ -7057,6 +7060,7 @@  compute_points_to_sets (void)
 	}
     }
 
+ done:
   timevar_pop (TV_TREE_PTA);
 }
 
@@ -7289,6 +7293,8 @@  compute_dependence_clique (void)
 unsigned int
 compute_may_aliases (void)
 {
+  bool set_points_to_info = true;
+
   if (cfun->gimple_df->ipa_pta)
     {
       if (dump_file)
@@ -7300,13 +7306,16 @@  compute_may_aliases (void)
 	  dump_alias_info (dump_file);
 	}
 
-      return 0;
+      if (cfun->gimple_df->clique_base_annotation_done)
+	return 0;
+
+      set_points_to_info = false;
     }
 
   /* For each pointer P_i, determine the sets of variables that P_i may
      point-to.  Compute the reachability set of escaped and call-used
      variables.  */
-  compute_points_to_sets ();
+  compute_points_to_sets (set_points_to_info);
 
   /* Debugging dumps.  */
   if (dump_file)
@@ -7314,6 +7323,7 @@  compute_may_aliases (void)
 
   /* Compute restrict-based memory disambiguations.  */
   compute_dependence_clique ();
+  cfun->gimple_df->clique_base_annotation_done = true;
 
   /* Deallocate memory used by aliasing data structures and the internal
      points-to solution.  */
@@ -7816,3 +7826,47 @@  make_pass_ipa_pta (gcc::context *ctxt)
 {
   return new pass_ipa_pta (ctxt);
 }
+
+namespace {
+
+const pass_data pass_data_ipa_pta_oacc_kernels =
+{
+  SIMPLE_IPA_PASS, /* type */
+  "pta_oacc_kernels", /* name */
+  OPTGROUP_NONE, /* optinfo_flags */
+  TV_IPA_PTA, /* tv_id */
+  0, /* properties_required */
+  0, /* properties_provided */
+  0, /* properties_destroyed */
+  0, /* todo_flags_start */
+  0, /* todo_flags_finish */
+};
+
+class pass_ipa_pta_oacc_kernels : public simple_ipa_opt_pass
+{
+public:
+  pass_ipa_pta_oacc_kernels (gcc::context *ctxt)
+    : simple_ipa_opt_pass (pass_data_ipa_pta_oacc_kernels, ctxt)
+  {}
+
+  /* opt_pass methods: */
+  virtual bool gate (function *)
+    {
+      return (optimize
+	      && flag_openacc
+	      && flag_tree_parallelize_loops > 1
+	      /* Don't bother doing anything if the program has errors.  */
+	      && !seen_error ());
+    }
+
+  virtual unsigned int execute (function *) { return ipa_pta_execute (); }
+
+}; // class pass_ipa_pta_oacc_kernels
+
+} // anon namespace
+
+simple_ipa_opt_pass *
+make_pass_ipa_pta_oacc_kernels (gcc::context *ctxt)
+{
+  return new pass_ipa_pta_oacc_kernels (ctxt);
+}