[1/8] Expand oacc kernels after pass_build_ealias

Message ID 54730EE7.40000@mentor.com
State New
Commit Message

Tom de Vries Nov. 24, 2014, 10:56 a.m. UTC
On 15-11-14 18:19, Tom de Vries wrote:
> On 15-11-14 13:14, Tom de Vries wrote:
>> Hi,
>>
>> I'm submitting a patch series with initial support for the oacc kernels
>> directive.
>>
>> The patch series uses pass_parallelize_loops to implement parallelization of
>> loops in the oacc kernels region.
>>
>> The patch series consists of these 8 patches:
>> ...
>>      1  Expand oacc kernels after pass_build_ealias
>>      2  Add pass_oacc_kernels
>>      3  Add pass_ch_oacc_kernels to pass_oacc_kernels
>>      4  Add pass_tree_loop_{init,done} to pass_oacc_kernels
>>      5  Add pass_loop_im to pass_oacc_kernels
>>      6  Add pass_ccp to pass_oacc_kernels
>>      7  Add pass_parloops_oacc_kernels to pass_oacc_kernels
>>      8  Do simple omp lowering for no address taken var
>> ...
>
> This patch moves omp expansion of the oacc kernels directive to after
> pass_build_ealias.
>
> The rationale is that in order to use pass_parallelize_loops for analysis and
> transformation of an oacc kernels region, we postpone omp expansion of that
> region until the earliest point in the pass list where enough information is
> available to run pass_parallelize_loops, in other words, after pass_build_ealias.
>
> The patch postpones expansion in expand_omp, and ensures expansion by adding
> pass_expand_omp_ssa:
> - after pass_build_ealias, and
> - after pass_all_early_optimizations for the case we're not optimizing.
>
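
[ Editorial sketch, not part of the patch: a toy C model of the gating
described above, assuming simplified stand-ins for the real GCC types.  The
names fn_model, PROP_EOMP and the model_* functions are hypothetical; only
the logic (postpone in expand_omp when not in ssa, set PROP_gimple_eomp only
when there are no oacc kernels, gate pass_expand_omp_ssa on that property) is
taken from the diff below. ]

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for PROP_gimple_eomp; not the real definition.  */
enum { PROP_EOMP = 1u << 0 };

/* Hypothetical, heavily simplified model of struct function.  */
struct fn_model
{
  unsigned curr_properties;
  bool contains_oacc_kernels;
  bool in_ssa;
  int kernels_expanded;	/* Number of times the kernels region was expanded.  */
};

/* Models expand_omp for a kernels region: expand only when in ssa form.  */
void
model_expand (struct fn_model *fn)
{
  if (fn->contains_oacc_kernels && fn->in_ssa)
    fn->kernels_expanded++;
}

/* Models pass_expand_omp, which runs before the function is in ssa form.
   The property is provided only if nothing was postponed.  */
void
model_pass_expand_omp (struct fn_model *fn)
{
  assert (!fn->in_ssa);
  model_expand (fn);
  if (!fn->contains_oacc_kernels)
    fn->curr_properties |= PROP_EOMP;
}

/* Models pass_expand_omp_ssa: gated on the property not being provided yet.
   Returns true if the pass actually ran.  */
bool
model_pass_expand_omp_ssa (struct fn_model *fn)
{
  if (fn->curr_properties & PROP_EOMP)
    return false;
  assert (fn->in_ssa);
  model_expand (fn);
  fn->curr_properties |= PROP_EOMP;
  return true;
}
```

In this model, the second instance of the ssa pass (the one after
pass_all_early_optimizations) is a no-op whenever the first instance ran,
because the property is already set by then.
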
> To make sure the oacc kernels region arrives at pass_expand_omp_ssa in the
> same state in which it left expand_omp, the patch makes pass_ccp and
> pass_forwprop aware of lowered omp code, so that they handle it conservatively.
>
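
[ Editorial sketch, not part of the patch: what "handle it conservatively"
amounts to, modelled in plain C.  The struct prop_stmt and the model_*
functions are hypothetical; the idea of recognizing the statement by the name
of the variable it defines (".omp_data_i") and then skipping it follows
gimple_stmt_omp_data_i_init_p and its callers in the diff below. ]

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified statement of the form "lhs = constant".  */
struct prop_stmt
{
  const char *lhs_name;	/* NULL models an anonymous SSA name.  */
  int rhs;
  bool folded;		/* Set when the toy pass has touched the statement.  */
};

/* Models gimple_stmt_omp_data_i_init_p: true if STMT defines a variable
   named ".omp_data_i".  */
bool
model_omp_data_i_init_p (const struct prop_stmt *stmt)
{
  return (stmt->lhs_name != NULL
	  && strcmp (stmt->lhs_name, ".omp_data_i") == 0);
}

/* Toy propagation pass: touches every statement except the .omp_data_i
   initialization, which is left exactly as lowering produced it.  Returns
   the number of statements touched.  */
size_t
model_propagate (struct prop_stmt *stmts, size_t n)
{
  size_t touched = 0;

  assert (stmts != NULL || n == 0);
  for (size_t i = 0; i < n; i++)
    {
      if (model_omp_data_i_init_p (&stmts[i]))
	continue;
      stmts[i].folded = true;
      touched++;
    }
  return touched;
}
```
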
> The patch contains changes in expand_omp_target to deal with ssa-code, similar
> to what is already present in expand_omp_taskreg.
>
> Furthermore, the patch forces the .omp_data_sizes and .omp_data_kinds to not be
> static for oacc kernels. It does this to get some references to .omp_data_sizes
> and .omp_data_kinds in the ssa code.  Without these references, the definitions
> will be removed. The reference of the variables in GIMPLE_OACC_KERNELS is not
> enough to have them not removed. [ In vries/oacc-kernels, I used a BUILT_IN_USE
> kludge for this purpose ].
>
> Finally, at the end of pass_expand_omp_ssa we're left with SSA_NAMEs in the
> original function of which the definition has been removed (as in moved to the
> split off function). TODO_remove_unused_locals takes care of some of them, but
> not the anonymous ones. So the patch iterates over all SSA_NAMEs to find these
> dangling SSA_NAMEs and releases them.
>
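
[ Editorial sketch, not part of the patch: the dangling-name check from the
paragraph above, modelled in plain C.  The types def_stmt_model and
name_model and the model_* functions are hypothetical simplifications; the
test itself (a name is dangling when its recorded defining statement no
longer lists it among its definitions) mirrors the loop added to
pass_expand_omp_ssa::execute in the diff below. ]

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_DEFS 4

struct name_model;

/* Hypothetical statement: just the list of names it defines.  */
struct def_stmt_model
{
  struct name_model *defs[MAX_DEFS];
  size_t n_defs;
};

/* Hypothetical SSA name: a back pointer to its defining statement.  */
struct name_model
{
  struct def_stmt_model *def_stmt;
  bool released;
};

/* Return true if NAME's recorded defining statement still defines NAME.  */
bool
model_defined_by_own_stmt_p (const struct name_model *name)
{
  const struct def_stmt_model *stmt = name->def_stmt;

  if (stmt == NULL)
    return false;
  assert (stmt->n_defs <= MAX_DEFS);
  for (size_t i = 0; i < stmt->n_defs; i++)
    if (stmt->defs[i] == name)
      return true;
  return false;
}

/* Walk the name table (index 0 is skipped, as in the patch) and release
   every name that is no longer defined by its own statement.  Returns the
   number of names released.  */
size_t
model_release_dangling (struct name_model **names, size_t n)
{
  size_t released = 0;

  assert (names != NULL);
  for (size_t i = 1; i < n; i++)
    {
      struct name_model *name = names[i];
      if (name == NULL || name->released)
	continue;
      if (!model_defined_by_own_stmt_p (name))
	{
	  name->released = true;
	  released++;
	}
    }
  return released;
}
```

In the model, a statement that was moved to the split-off function is
represented by a statement whose definition now points at a different name,
so the original name no longer finds itself there.
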

Reposting with small update: I've replaced the use of the rather generic 
gimple_stmt_omp_lowering_p with the more specific gimple_stmt_omp_data_i_init_p.

Bootstrapped and reg-tested in the same way as before.

> OK for trunk?
>
> Thanks,
> - Tom
Patch

2014-11-14  Tom de Vries  <tom@codesourcery.com>

	* function.h (struct function): Add contains_oacc_kernels field.
	* gimplify.c (gimplify_omp_workshare): Set contains_oacc_kernels.
	* omp-low.c: Include gimple-pretty-print.h.
	(release_first_vuse_in_edge_dest): New function.
	(expand_omp_target): Handle ssa-code.
	(expand_omp): Don't expand GIMPLE_OACC_KERNELS when not in ssa.
	(pass_data_expand_omp): Don't set PROP_gimple_eomp unconditionally in
	properties_provided field.
	(pass_expand_omp::execute): Set PROP_gimple_eomp in
	cfun->curr_properties only if cfun does not contain oacc kernels.
	(pass_data_expand_omp_ssa): Add TODO_remove_unused_locals to
	todo_flags_finish field.
	(pass_expand_omp_ssa::execute): Release dangling SSA_NAMEs after calling
	execute_expand_omp.
	(lower_omp_target): Add static_arrays variable, init to 1.  Don't use
	static arrays for kernels directive.  Use static_arrays variable.
	Handle case that .omp_data_kinds is not static.
	(gimple_stmt_ssa_operand_references_var_p)
	(gimple_stmt_omp_data_i_init_p): New functions.
	* omp-low.h (gimple_stmt_omp_data_i_init_p): Declare.
	* passes.def: Add pass_expand_omp_ssa after pass_build_ealias.
	* tree-ssa-ccp.c: Include omp-low.h.
	(surely_varying_stmt_p, ccp_visit_stmt): Handle omp lowering code
	conservatively.
	* tree-ssa-forwprop.c: Include omp-low.h.
	(pass_forwprop::execute): Handle omp lowering code conservatively.
---
 gcc/function.h          |   3 +
 gcc/gimplify.c          |   1 +
 gcc/omp-low.c           | 196 +++++++++++++++++++++++++++++++++++++++++++++---
 gcc/omp-low.h           |   1 +
 gcc/passes.def          |   2 +
 gcc/tree-ssa-ccp.c      |   6 ++
 gcc/tree-ssa-forwprop.c |   4 +-
 7 files changed, 200 insertions(+), 13 deletions(-)

diff --git a/gcc/function.h b/gcc/function.h
index 3a6305c..bb48775 100644
--- a/gcc/function.h
+++ b/gcc/function.h
@@ -667,6 +667,9 @@  struct GTY(()) function {
 
   /* Set when the tail call has been identified.  */
   unsigned int tail_call_marked : 1;
+
+  /* Set when the function contains oacc kernels directives.  */
+  unsigned int contains_oacc_kernels : 1;
 };
 
 /* Add the decl D to the local_decls list of FUN.  */
diff --git a/gcc/gimplify.c b/gcc/gimplify.c
index ad48d51..c40f20f 100644
--- a/gcc/gimplify.c
+++ b/gcc/gimplify.c
@@ -7316,6 +7316,7 @@  gimplify_omp_workshare (tree *expr_p, gimple_seq *pre_p)
       break;
     case OACC_KERNELS:
       stmt = gimple_build_oacc_kernels (body, OACC_KERNELS_CLAUSES (expr));
+      cfun->contains_oacc_kernels = 1;
       break;
     case OACC_PARALLEL:
       stmt = gimple_build_oacc_parallel (body, OACC_PARALLEL_CLAUSES (expr));
diff --git a/gcc/omp-low.c b/gcc/omp-low.c
index c503cc1..767fa87 100644
--- a/gcc/omp-low.c
+++ b/gcc/omp-low.c
@@ -88,6 +88,7 @@  along with GCC; see the file COPYING3.  If not see
 #include "tree-eh.h"
 #include "cilk.h"
 #include "lto-section-names.h"
+#include "gimple-pretty-print.h"
 
 
 /* Lowering of OpenMP parallel and workshare constructs proceeds in two
@@ -5338,6 +5339,35 @@  expand_omp_build_assign (gimple_stmt_iterator *gsi_p, tree to, tree from)
     }
 }
 
+static void
+release_first_vuse_in_edge_dest (edge e)
+{
+  gimple_stmt_iterator i;
+  basic_block bb = e->dest;
+
+  for (i = gsi_start_phis (bb); !gsi_end_p (i); gsi_next (&i))
+    {
+      gimple phi = gsi_stmt (i);
+      tree arg = PHI_ARG_DEF_FROM_EDGE (phi, e);
+
+      if (!virtual_operand_p (arg))
+	continue;
+
+      mark_virtual_operand_for_renaming (arg);
+      return;
+    }
+
+  for (i = gsi_start_bb (bb); !gsi_end_p (i); gsi_next_nondebug (&i))
+    {
+      gimple stmt = gsi_stmt (i);
+      if (gimple_vuse (stmt) == NULL_TREE)
+	continue;
+
+      mark_virtual_operand_for_renaming (gimple_vuse (stmt));
+      return;
+    }
+}
+
 /* Expand the OpenMP parallel or task directive starting at REGION.  */
 
 static void
@@ -8832,7 +8862,6 @@  expand_omp_target (struct omp_region *region)
   /* Supported by expand_omp_taskreg, but not here.  */
   if (child_cfun != NULL)
     gcc_assert (!child_cfun->cfg);
-  gcc_assert (!gimple_in_ssa_p (cfun));
 
   entry_bb = region->entry;
   exit_bb = region->exit;
@@ -8858,7 +8887,7 @@  expand_omp_target (struct omp_region *region)
 	{
 	  basic_block entry_succ_bb = single_succ (entry_bb);
 	  gimple_stmt_iterator gsi;
-	  tree arg;
+	  tree arg, narg;
 	  gimple tgtcopy_stmt = NULL;
 	  tree sender = TREE_VEC_ELT (gimple_omp_data_arg (entry_stmt), 0);
 
@@ -8888,8 +8917,27 @@  expand_omp_target (struct omp_region *region)
 	  gcc_assert (tgtcopy_stmt != NULL);
 	  arg = DECL_ARGUMENTS (child_fn);
 
-	  gcc_assert (gimple_assign_lhs (tgtcopy_stmt) == arg);
-	  gsi_remove (&gsi, true);
+	  if (!gimple_in_ssa_p (cfun))
+	    {
+	      gcc_assert (gimple_assign_lhs (tgtcopy_stmt) == arg);
+	      gsi_remove (&gsi, true);
+	    }
+	  else
+	    {
+	      gcc_assert (SSA_NAME_VAR (gimple_assign_lhs (tgtcopy_stmt))
+			  == arg);
+
+	      /* If we are in ssa form, we must load the value from the default
+		 definition of the argument.  That should not be defined now,
+		 since the argument is not used uninitialized.  */
+	      gcc_assert (ssa_default_def (cfun, arg) == NULL);
+	      narg = make_ssa_name (arg, gimple_build_nop ());
+	      set_ssa_default_def (cfun, arg, narg);
+	      /* ?? Is setting the subcode really necessary ??  */
+	      gimple_omp_set_subcode (tgtcopy_stmt, TREE_CODE (narg));
+	      gimple_assign_set_rhs1 (tgtcopy_stmt, narg);
+	      update_stmt (tgtcopy_stmt);
+	    }
 	}
 
       /* Declare local variables needed in CHILD_CFUN.  */
@@ -8932,11 +8980,23 @@  expand_omp_target (struct omp_region *region)
 	  stmt = gimple_build_return (NULL);
 	  gsi_insert_after (&gsi, stmt, GSI_SAME_STMT);
 	  gsi_remove (&gsi, true);
+
+	  /* A vuse in single_succ (exit_bb) may use a vdef from the region
+	     which is about to be split off.  Mark the vdef for renaming.  */
+	  release_first_vuse_in_edge_dest (single_succ_edge (exit_bb));
 	}
 
       /* Move the offloading region into CHILD_CFUN.  */
 
-      block = gimple_block (entry_stmt);
+      if (gimple_in_ssa_p (cfun))
+	{
+	  init_tree_ssa (child_cfun);
+	  init_ssa_operands (child_cfun);
+	  child_cfun->gimple_df->in_ssa_p = true;
+	  block = NULL_TREE;
+	}
+      else
+	block = gimple_block (entry_stmt);
 
       new_bb = move_sese_region_to_fn (child_cfun, entry_bb, exit_bb, block);
       if (exit_bb)
@@ -8986,6 +9046,8 @@  expand_omp_target (struct omp_region *region)
 	  if (changed)
 	    cleanup_tree_cfg ();
 	}
+      if (gimple_in_ssa_p (cfun))
+	update_ssa (TODO_update_ssa);
       pop_cfun ();
     }
 
@@ -9262,6 +9324,8 @@  expand_omp_target (struct omp_region *region)
       gcc_assert (g && gimple_code (g) == GIMPLE_OMP_RETURN);
       gsi_remove (&gsi, true);
     }
+  if (gimple_in_ssa_p (cfun))
+    update_ssa (TODO_update_ssa_only_virtuals);
 }
 
 
@@ -9332,6 +9396,15 @@  expand_omp (struct omp_region *region)
 	  break;
 
 	case GIMPLE_OACC_KERNELS:
+	  if (!gimple_in_ssa_p (cfun))
+	    /* We're in pass_expand_omp.  Postpone expanding till
+	       pass_expand_omp_ssa.  */
+	    break;
+
+	  /* We're in pass_expand_omp_ssa.  Expand now.  */
+
+	  /* FALLTHRU.  */
+
 	case GIMPLE_OACC_PARALLEL:
 	case GIMPLE_OMP_TARGET:
 	  expand_omp_target (region);
@@ -9504,7 +9577,7 @@  const pass_data pass_data_expand_omp =
   OPTGROUP_NONE, /* optinfo_flags */
   TV_NONE, /* tv_id */
   PROP_gimple_any, /* properties_required */
-  PROP_gimple_eomp, /* properties_provided */
+  0 /* Possibly PROP_gimple_eomp.  */, /* properties_provided */
   0, /* properties_destroyed */
   0, /* todo_flags_start */
   0, /* todo_flags_finish */
@@ -9518,7 +9591,7 @@  public:
   {}
 
   /* opt_pass methods: */
-  virtual unsigned int execute (function *)
+  virtual unsigned int execute (function *fun)
     {
       bool gate = ((flag_openacc != 0 || flag_openmp != 0
 		    || flag_openmp_simd != 0 || flag_cilkplus != 0)
@@ -9529,7 +9602,12 @@  public:
       if (!gate)
 	return 0;
 
-      return execute_expand_omp ();
+      unsigned int res = execute_expand_omp ();
+
+      if (!cfun->contains_oacc_kernels)
+	fun->curr_properties |= PROP_gimple_eomp;
+
+      return res;
     }
 
 }; // class pass_expand_omp
@@ -9554,7 +9632,8 @@  const pass_data pass_data_expand_omp_ssa =
   PROP_gimple_eomp, /* properties_provided */
   0, /* properties_destroyed */
   0, /* todo_flags_start */
-  TODO_cleanup_cfg | TODO_rebuild_alias, /* todo_flags_finish */
+  TODO_cleanup_cfg | TODO_rebuild_alias
+  | TODO_remove_unused_locals, /* todo_flags_finish */
 };
 
 class pass_expand_omp_ssa : public gimple_opt_pass
@@ -9569,7 +9648,47 @@  public:
     {
       return !(fun->curr_properties & PROP_gimple_eomp);
     }
-  virtual unsigned int execute (function *) { return execute_expand_omp (); }
+  virtual unsigned int execute (function *)
+    {
+      unsigned res = execute_expand_omp ();
+
+      /* After running pass_expand_omp_ssa to expand the oacc kernels
+	 directive, we are left in the original function with anonymous
+	 SSA_NAMEs, with a defining statement that has been deleted.  This
+	 pass finds those SSA_NAMEs and releases them.  */
+      unsigned int i;
+      for (i = 1; i < num_ssa_names; ++i)
+	{
+	  tree name = ssa_name (i);
+	  if (name == NULL_TREE)
+	    continue;
+
+	  gimple stmt = SSA_NAME_DEF_STMT (name);
+	  bool found = false;
+
+	  ssa_op_iter op_iter;
+	  def_operand_p def_p;
+	  FOR_EACH_PHI_OR_STMT_DEF (def_p, stmt, op_iter, SSA_OP_ALL_DEFS)
+	    {
+	      tree def = DEF_FROM_PTR (def_p);
+	      if (def == name)
+		{
+		  found = true;
+		  break;
+		}
+	    }
+
+	  if (!found)
+	    {
+	      if (dump_file)
+		fprintf (dump_file, "Released dangling ssa name %u\n", i);
+	      release_ssa_name (name);
+	    }
+	}
+
+      return res;
+    }
+  opt_pass * clone () { return new pass_expand_omp_ssa (m_ctxt); }
 
 }; // class pass_expand_omp_ssa
 
@@ -11195,6 +11314,7 @@  lower_omp_target (gimple_stmt_iterator *gsi_p, omp_context *ctx)
   unsigned int map_cnt = 0;
   tree (*gimple_omp_clauses) (const_gimple);
   void (*gimple_omp_set_data_arg) (gimple, tree);
+  unsigned int static_arrays = 1;
 
   offloaded = is_gimple_omp_offloaded (stmt);
   data_region = false;
@@ -11203,6 +11323,7 @@  lower_omp_target (gimple_stmt_iterator *gsi_p, omp_context *ctx)
     case GIMPLE_OACC_KERNELS:
       gimple_omp_clauses = gimple_oacc_kernels_clauses;
       gimple_omp_set_data_arg = gimple_oacc_kernels_set_data_arg;
+      static_arrays = 0;
       break;
     case GIMPLE_OACC_PARALLEL:
       gimple_omp_clauses = gimple_oacc_parallel_clauses;
@@ -11369,7 +11490,7 @@  lower_omp_target (gimple_stmt_iterator *gsi_p, omp_context *ctx)
 			  ".omp_data_sizes");
       DECL_NAMELESS (TREE_VEC_ELT (t, 1)) = 1;
       TREE_ADDRESSABLE (TREE_VEC_ELT (t, 1)) = 1;
-      TREE_STATIC (TREE_VEC_ELT (t, 1)) = 1;
+      TREE_STATIC (TREE_VEC_ELT (t, 1)) = static_arrays;
       tree tkind_type;
       int talign_shift;
       if (is_gimple_omp_oacc_specifically (stmt))
@@ -11387,7 +11508,7 @@  lower_omp_target (gimple_stmt_iterator *gsi_p, omp_context *ctx)
 			  ".omp_data_kinds");
       DECL_NAMELESS (TREE_VEC_ELT (t, 2)) = 1;
       TREE_ADDRESSABLE (TREE_VEC_ELT (t, 2)) = 1;
-      TREE_STATIC (TREE_VEC_ELT (t, 2)) = 1;
+      TREE_STATIC (TREE_VEC_ELT (t, 2)) = static_arrays;
       gimple_omp_set_data_arg (stmt, t);
 
       vec<constructor_elt, va_gc> *vsize;
@@ -11560,6 +11681,22 @@  lower_omp_target (gimple_stmt_iterator *gsi_p, omp_context *ctx)
 						    clobber));
 	}
 
+      if (!TREE_STATIC (TREE_VEC_ELT (t, 2)))
+	{
+	  gimple_seq initlist = NULL;
+	  force_gimple_operand (build1 (DECL_EXPR, void_type_node,
+					TREE_VEC_ELT (t, 2)),
+				&initlist, true, NULL_TREE);
+	  gimple_seq_add_seq (&ilist, initlist);
+
+	  tree clobber = build_constructor (TREE_TYPE (TREE_VEC_ELT (t, 2)),
+					    NULL);
+	  TREE_THIS_VOLATILE (clobber) = 1;
+	  gimple_seq_add_stmt (&olist,
+			       gimple_build_assign (TREE_VEC_ELT (t, 2),
+						    clobber));
+	}
+
       tree clobber = build_constructor (ctx->record_type, NULL);
       TREE_THIS_VOLATILE (clobber) = 1;
       gimple_seq_add_stmt (&olist, gimple_build_assign (ctx->sender_decl,
@@ -13740,4 +13877,39 @@  omp_finish_file (void)
     }
 }
 
+static bool
+gimple_stmt_ssa_operand_references_var_p (gimple stmt, const char **varnames,
+					  unsigned int nr_varnames,
+					  unsigned int flags)
+{
+  tree use;
+  ssa_op_iter iter;
+  const char *s;
+
+  FOR_EACH_SSA_TREE_OPERAND (use, stmt, iter, flags)
+    {
+      if (SSA_NAME_IDENTIFIER (use) == NULL_TREE)
+	continue;
+      s = IDENTIFIER_POINTER (SSA_NAME_IDENTIFIER (use));
+
+      unsigned int i;
+      for (i = 0; i < nr_varnames; ++i)
+	if (strcmp (varnames[i], s) == 0)
+	  return true;
+    }
+
+  return false;
+}
+
+/* Return true if STMT is .omp_data_i init.  */
+
+bool
+gimple_stmt_omp_data_i_init_p (gimple stmt)
+{
+  const char *varnames[] = { ".omp_data_i" };
+  unsigned int nr_varnames = sizeof (varnames) / sizeof (varnames[0]);
+  return gimple_stmt_ssa_operand_references_var_p (stmt, varnames, nr_varnames,
+						   SSA_OP_DEF);
+}
+
 #include "gt-omp-low.h"
diff --git a/gcc/omp-low.h b/gcc/omp-low.h
index ac587d0..32076e4 100644
--- a/gcc/omp-low.h
+++ b/gcc/omp-low.h
@@ -28,6 +28,7 @@  extern void free_omp_regions (void);
 extern tree omp_reduction_init (tree, tree);
 extern bool make_gimple_omp_edges (basic_block, struct omp_region **, int *);
 extern void omp_finish_file (void);
+extern bool gimple_stmt_omp_data_i_init_p (gimple);
 
 extern GTY(()) vec<tree, va_gc> *offload_funcs;
 extern GTY(()) vec<tree, va_gc> *offload_vars;
diff --git a/gcc/passes.def b/gcc/passes.def
index ebd2b95..dc45e3f 100644
--- a/gcc/passes.def
+++ b/gcc/passes.def
@@ -85,6 +85,7 @@  along with GCC; see the file COPYING3.  If not see
 	  /* pass_build_ealias is a dummy pass that ensures that we
 	     execute TODO_rebuild_alias at this point.  */
 	  NEXT_PASS (pass_build_ealias);
+	  NEXT_PASS (pass_expand_omp_ssa);
 	  NEXT_PASS (pass_fre);
 	  NEXT_PASS (pass_merge_phi);
 	  NEXT_PASS (pass_cd_dce);
@@ -99,6 +100,7 @@  along with GCC; see the file COPYING3.  If not see
 	      late.  */
 	  NEXT_PASS (pass_split_functions);
       POP_INSERT_PASSES ()
+      NEXT_PASS (pass_expand_omp_ssa);
       NEXT_PASS (pass_release_ssa_names);
       NEXT_PASS (pass_rebuild_cgraph_edges);
       NEXT_PASS (pass_inline_parameters);
diff --git a/gcc/tree-ssa-ccp.c b/gcc/tree-ssa-ccp.c
index 52d8503..23185e6 100644
--- a/gcc/tree-ssa-ccp.c
+++ b/gcc/tree-ssa-ccp.c
@@ -165,6 +165,7 @@  along with GCC; see the file COPYING3.  If not see
 #include "wide-int-print.h"
 #include "builtins.h"
 #include "tree-chkp.h"
+#include "omp-low.h"
 
 
 /* Possible lattice values.  */
@@ -789,6 +790,9 @@  surely_varying_stmt_p (gimple stmt)
       && gimple_code (stmt) != GIMPLE_CALL)
     return true;
 
+  if (gimple_stmt_omp_data_i_init_p (stmt))
+    return true;
+
   return false;
 }
 
@@ -2297,6 +2301,8 @@  ccp_visit_stmt (gimple stmt, edge *taken_edge_p, tree *output_p)
   switch (gimple_code (stmt))
     {
       case GIMPLE_ASSIGN:
+	if (gimple_stmt_omp_data_i_init_p (stmt))
+	  break;
         /* If the statement is an assignment that produces a single
            output value, evaluate its RHS to see if the lattice value of
            its output has changed.  */
diff --git a/gcc/tree-ssa-forwprop.c b/gcc/tree-ssa-forwprop.c
index feb8253..860c53e 100644
--- a/gcc/tree-ssa-forwprop.c
+++ b/gcc/tree-ssa-forwprop.c
@@ -68,6 +68,7 @@  along with GCC; see the file COPYING3.  If not see
 #include "tree-cfgcleanup.h"
 #include "tree-into-ssa.h"
 #include "cfganal.h"
+#include "omp-low.h"
 
 /* This pass propagates the RHS of assignment statements into use
    sites of the LHS of the assignment.  It's basically a specialized
@@ -2244,7 +2245,8 @@  pass_forwprop::execute (function *fun)
 	  tree lhs, rhs;
 	  enum tree_code code;
 
-	  if (!is_gimple_assign (stmt))
+	  if (!is_gimple_assign (stmt)
+	      || gimple_stmt_omp_data_i_init_p (stmt))
 	    {
 	      gsi_next (&gsi);
 	      continue;
-- 
1.9.1