diff mbox

Fix pr60363 by adding backtraced value of phi arg along jump threading path

Message ID 003701cf6763$8d4bc4d0$a7e34e70$@arm.com
State New
Headers show

Commit Message

Bin Cheng May 4, 2014, 6:39 a.m. UTC
> -----Original Message-----
> From: Jeff Law [mailto:law@redhat.com]
> Sent: Friday, April 25, 2014 4:53 AM
> To: Bin.Cheng
> Cc: Bin Cheng; gcc-patches List
> Subject: Re: [PATCH GCC]Fix pr60363 by adding backtraced value of phi arg
> along jump threading path
> 
> On 04/17/14 04:07, Bin.Cheng wrote:
> > On Thu, Apr 17, 2014 at 1:30 PM, Jeff Law <law@redhat.com> wrote:
> >> On 03/18/14 04:13, bin.cheng wrote:
> >>>
> >>> Hi,
> >>> After control flow graph change made by
> >>> http://gcc.gnu.org/ml/gcc-patches/2014-02/msg01492.html, case
> >>> gcc.dg/tree-ssa/ssa-dom-thread-4.c is broken on
> >>> logical_op_short_circuit targets including cortex-m3/cortex-m0.
> >>> The regression reveals a missed opportunity in jump threading, which
> >>> causes a forward basic block doesn't get removed in cfgcleanup after
> >>> jump threading in VRP1.  Root cause is stated at the corresponding
> >>> PR:
> >>> http://gcc.gnu.org/bugzilla/show_bug.cgi?id=60363, please refer to
> >>> it for detailed report.
> >>>
> >>> This patch fixes the issue by adding constant value instead of
> >>> ssa_name as the new phi argument.  Bootstrap and test on x86_64,
> >>> also test on
> >>> cortex-m3
> >>> and the regression is gone.
> >>> I think this should wait for stage1, but would like to hear some
> >>> comments now.  So does it look reasonable?
> >>>
> >>>
> >>> 2014-03-18  Bin Cheng<bin.cheng@arm.com>
> >>>
> >>>          PR regression/60363
> >>>          * gcc/tree-ssa-threadupdate.c (get_value_locus_in_path): New.
> >>>          (copy_phi_args): New parameters.  Call
get_value_locus_in_path.
> >>>          (update_destination_phis): New parameter.
> >>>          (create_edge_and_update_destination_phis): Ditto.
> >>>          (ssa_fix_duplicate_block_edges): Pass new arguments.
> >>>          (thread_single_edge): Ditto.
> >>
> >> This is a good and interesting catch. DOM knows how to propagate
> >> these context sensitive equivalences which should expose the
> >> optimizable forwarder blocks.
> > At the time I was looking into the problem, DOM couldn't understand
> > the equivalence.  Maybe it can be improved too.
> That's something I improved during 4.9 stage1 development.  There were
> cases where we clearly missed propagating a context sensitive equivalence
> into PHI nodes.  I didn't check this specific case, but it looks
reasonably similar
> to the cases I fixed a few months back.
> 
> >>> +  for (int j = idx - 1; j >= 0; j--)
> >>> +    {
> >>> +      edge e = (*path)[j]->e;
> >>> +      if (e->dest == def_bb)
> >>> +       {
> >>> +         arg = gimple_phi_arg_def (def_phi, e->dest_idx);
> >>> +         *locus = gimple_phi_arg_location (def_phi, e->dest_idx);
> >>> +         return (TREE_CODE (arg) == INTEGER_CST ? arg : def);
> >>
> >> Presumably any constant that can legitimately appear in a PHI node is
> >> good here.  So for example ADDR_EXPR <something in static storage>
> >> ought to be handled as well.
> >>
> >> One could also argue that we should go ahead and do a context
> >> sensitive copy propagation here too if ARG turns out to be an
> >> SSA_NAME.  You have to be a bit more careful with those and use
> >> may_propagate_copy_p and you'd probably want to test the loop depth
> >> of the SSA_NAMEs to ensure you're not doing a propagation that is
> >> going to muck up LICM.  See loop_depth_of_name uses in tree-ssa-dom.c.
> >>
> >> Overall I think it's good.  We just need to resolve whether or not we
> >> want to catch constant ADDR_EXPRs and/or do the context sensitive
> >> copy propagations.
> > Do you mean const/copy propagation in jump threading optimization, or
> > just an independent opt somewhere else?  It's naturally flow sensitive
> > along jump threading path, which looks interesting to me.
> Thinking more about it, I'd probably hold off on the context sensitive
copy
> propagation right now since you have to check a variety of things to
ensure
> correctness and performance.  Handling that as a follow-up would make
> sense if you're interested.
> 
> So for this patch to go forward, all I think you need to do is change the
test
> TREE_CODE (arg) == INTEGER_CST above to instead test
> is_gimple_min_invariant (arg), bootstrap & test.  Assuming the bootstrap
> and regressino test pass with that change, consider the patch
pre-approved.

Hi,
I updated and rebased the patch against latest trunk.  It passes bootstrap
and regression test on x86/x86_64.   Also pr60363 is fixed on
logical_op_short_circuit targets.  Is it OK?

Since ssa-dom-thread-4.c is fixed now, I also reverted the XFAIL test for
it.

An irrelevant point, the const propagation opportunities exist not only for
threaded jump path, but also for the original path after threading jump.  So
maybe it's worth to do the general transformation as Jeff suggested before.

Thanks,
bin


2014-05-04  Bin Cheng  <bin.cheng@arm.com>

	PR regression/60363
	* gcc/tree-ssa-threadupdate.c (get_value_locus_in_path): New.
	(copy_phi_args): New parameters.  Call get_value_locus_in_path.
	(update_destination_phis): New parameter.
	(create_edge_and_update_destination_phis): Ditto.
	(ssa_fix_duplicate_block_edges): Pass new arguments.
	(thread_single_edge): Ditto.

gcc/testsuite/ChangeLog
2014-05-04  Bin Cheng  <bin.cheng@arm.com>

	PR target/60363
	* gcc.target/tree-ssa/ssa-dom-thread-4.c: Revert XFAIL test.

Comments

Jeff Law May 4, 2014, 1:05 p.m. UTC | #1
On 05/04/14 00:39, bin.cheng wrote:>
> Hi,
> I updated and rebased the patch against latest trunk.  It passes bootstrap
> and regression test on x86/x86_64.   Also pr60363 is fixed on
> logical_op_short_circuit targets.  Is it OK?
>
> Since ssa-dom-thread-4.c is fixed now, I also reverted the XFAIL test for
> it.
Yes.  This is fine.  Please install.


>
> An irrelevant point, the const propagation opportunities exist not only for
> threaded jump path, but also for the original path after threading jump.  So
> maybe it's worth to do the general transformation as Jeff suggested before.
Most of the time, const propagation opportunities on the original path 
are limited to PHIs that become degenerate as a result of isolation of 
the threaded path.  We use a specialized propagator which seeds strictly 
from PHIs to catch those.

If you've got testcases where there's const propagation opportunities on 
the original path that do not originate from PHIs that become degenerate 
as a result of threading, certainly file a BZ so we can all take a look.


thanks,
Jeff
diff mbox

Patch

Index: gcc/tree-ssa-threadupdate.c
===================================================================
--- gcc/tree-ssa-threadupdate.c	(revision 210047)
+++ gcc/tree-ssa-threadupdate.c	(working copy)
@@ -403,10 +403,59 @@  copy_phi_arg_into_existing_phi (edge src_e, edge t
     }
 }
 
-/* For each PHI in BB, copy the argument associated with SRC_E to TGT_E.  */
+/* Given ssa_name DEF, backtrack jump threading PATH from node IDX
+   to see if it has constant value in a flow sensitive manner.  Set
+   LOCUS to location of the constant phi arg and return the value.
+   Return DEF directly if either PATH or idx is ZERO.  */
 
+static tree
+get_value_locus_in_path (tree def, vec<jump_thread_edge *> *path,
+			 basic_block bb, int idx, source_location *locus)
+{
+  tree arg;
+  gimple def_phi;
+  basic_block def_bb;
+
+  if (path == NULL || idx == 0)
+    return def;
+
+  def_phi = SSA_NAME_DEF_STMT (def);
+  if (gimple_code (def_phi) != GIMPLE_PHI)
+    return def;
+
+  def_bb = gimple_bb (def_phi);
+  /* Don't propagate loop invariants into deeper loops.  */
+  if (!def_bb || bb_loop_depth (def_bb) < bb_loop_depth (bb))
+    return def;
+
+  /* Backtrack jump threading path from IDX to see if def has constant
+     value.  */
+  for (int j = idx - 1; j >= 0; j--)
+    {
+      edge e = (*path)[j]->e;
+      if (e->dest == def_bb)
+	{
+	  arg = gimple_phi_arg_def (def_phi, e->dest_idx);
+	  if (is_gimple_min_invariant (arg))
+	    {
+	      *locus = gimple_phi_arg_location (def_phi, e->dest_idx);
+	      return arg;
+	    }
+	  break;
+	}
+    }
+
+  return def;
+}
+
+/* For each PHI in BB, copy the argument associated with SRC_E to TGT_E.
+   Try to backtrack jump threading PATH from node IDX to see if the arg
+   has constant value, copy constant value instead of argument itself
+   if yes.  */
+
 static void
-copy_phi_args (basic_block bb, edge src_e, edge tgt_e)
+copy_phi_args (basic_block bb, edge src_e, edge tgt_e,
+	       vec<jump_thread_edge *> *path, int idx)
 {
   gimple_stmt_iterator gsi;
   int src_indx = src_e->dest_idx;
@@ -414,8 +463,14 @@  static void
   for (gsi = gsi_start_phis (bb); !gsi_end_p (gsi); gsi_next (&gsi))
     {
       gimple phi = gsi_stmt (gsi);
+      tree def = gimple_phi_arg_def (phi, src_indx);
       source_location locus = gimple_phi_arg_location (phi, src_indx);
-      add_phi_arg (phi, gimple_phi_arg_def (phi, src_indx), tgt_e, locus);
+
+      if (TREE_CODE (def) == SSA_NAME
+	  && !virtual_operand_p (gimple_phi_result (phi)))
+	def = get_value_locus_in_path (def, path, bb, idx, &locus);
+
+      add_phi_arg (phi, def, tgt_e, locus);
     }
 }
 
@@ -423,10 +478,13 @@  static void
    edges.  The copy is NEW_BB.  Every PHI node in every direct successor of
    ORIG_BB has a new argument associated with edge from NEW_BB to the
    successor.  Initialize the PHI argument so that it is equal to the PHI
-   argument associated with the edge from ORIG_BB to the successor.  */
+   argument associated with the edge from ORIG_BB to the successor.
+   PATH and IDX are used to check if the new PHI argument has constant
+   value in a flow sensitive manner.  */
 
 static void
-update_destination_phis (basic_block orig_bb, basic_block new_bb)
+update_destination_phis (basic_block orig_bb, basic_block new_bb,
+			 vec<jump_thread_edge *> *path, int idx)
 {
   edge_iterator ei;
   edge e;
@@ -434,7 +492,7 @@  static void
   FOR_EACH_EDGE (e, ei, orig_bb->succs)
     {
       edge e2 = find_edge (new_bb, e->dest);
-      copy_phi_args (e->dest, e, e2);
+      copy_phi_args (e->dest, e, e2, path, idx);
     }
 }
 
@@ -443,11 +501,13 @@  static void
    destination.
 
    Add an additional argument to any PHI nodes at the single
-   destination.  */
+   destination.  IDX is the start node in jump threading path
+   we start to check to see if the new PHI argument has constant
+   value along the jump threading path.  */
 
 static void
 create_edge_and_update_destination_phis (struct redirection_data *rd,
-					 basic_block bb)
+					 basic_block bb, int idx)
 {
   edge e = make_edge (bb, rd->path->last ()->e->dest, EDGE_FALLTHRU);
 
@@ -476,7 +536,7 @@  create_edge_and_update_destination_phis (struct re
      from the duplicate block, then we will need to add a new argument
      to them.  The argument should have the same value as the argument
      associated with the outgoing edge stored in RD.  */
-  copy_phi_args (e->dest, rd->path->last ()->e, e);
+  copy_phi_args (e->dest, rd->path->last ()->e, e, rd->path, idx);
 }
 
 /* Look through PATH beginning at START and return TRUE if there are
@@ -501,6 +561,7 @@  void
 ssa_fix_duplicate_block_edges (struct redirection_data *rd,
 			       ssa_local_info_t *local_info)
 {
+  bool multi_incomings = (rd->incoming_edges->next != NULL);
   edge e = rd->incoming_edges->e;
   vec<jump_thread_edge *> *path = THREAD_PATH (e);
 
@@ -516,8 +577,10 @@  ssa_fix_duplicate_block_edges (struct redirection_
 	  edge e2;
 
 	  /* This updates the PHIs at the destination of the duplicate
-	     block.  */
-	  update_destination_phis (local_info->bb, rd->dup_blocks[count]);
+	     block.  Pass 0 instead of i if we are threading a path which
+	     has multiple incoming edges.  */
+	  update_destination_phis (local_info->bb, rd->dup_blocks[count],
+				   path, multi_incomings ? 0 : i);
 
 	  /* Find the edge from the duplicate block to the block we're
 	     threading through.  That's the edge we want to redirect.  */
@@ -535,7 +598,8 @@  ssa_fix_duplicate_block_edges (struct redirection_
 		 case), then the PHIs in the target already have the correct
 		 arguments.  */
 	      if (e2 == victim)
-		copy_phi_args (e2->dest, path->last ()->e, e2);
+		copy_phi_args (e2->dest, path->last ()->e, e2,
+			       path, multi_incomings ? 0 : i);
 	    }
 	  else
 	    {
@@ -567,7 +631,8 @@  ssa_fix_duplicate_block_edges (struct redirection_
       else if ((*path)[i]->type == EDGE_COPY_SRC_BLOCK)
 	{
 	  remove_ctrl_stmt_and_useless_edges (rd->dup_blocks[count], NULL);
-	  create_edge_and_update_destination_phis (rd, rd->dup_blocks[count]);
+	  create_edge_and_update_destination_phis (rd, rd->dup_blocks[count],
+						   multi_incomings ? 0 : i);
 	  if (count == 1)
 	    single_succ_edge (rd->dup_blocks[1])->aux = NULL;
 	  count++;
@@ -989,7 +1054,7 @@  thread_single_edge (edge e)
 
   create_block_for_threading (bb, &rd, 0);
   remove_ctrl_stmt_and_useless_edges (rd.dup_blocks[0], NULL);
-  create_edge_and_update_destination_phis (&rd, rd.dup_blocks[0]);
+  create_edge_and_update_destination_phis (&rd, rd.dup_blocks[0], 0);
 
   if (dump_file && (dump_flags & TDF_DETAILS))
     fprintf (dump_file, "  Threaded jump %d --> %d to %d\n",
Index: gcc/testsuite/gcc.dg/tree-ssa/ssa-dom-thread-4.c
===================================================================
--- gcc/testsuite/gcc.dg/tree-ssa/ssa-dom-thread-4.c	(revision 210047)
+++ gcc/testsuite/gcc.dg/tree-ssa/ssa-dom-thread-4.c	(working copy)
@@ -75,6 +75,6 @@  bitmap_ior_and_compl (bitmap dst, const_bitmap a,
       -> "kill_elt->indx == b_elt->indx" in the second condition,
 	 skipping the known-true "b_elt && kill_elt" in the second
 	 condition.  */
-/* { dg-final { scan-tree-dump-times "Threaded" 4 "dom1" { target logical_op_short_circuit xfail logical_op_short_circuit } } } */
+/* { dg-final { scan-tree-dump-times "Threaded" 4 "dom1" { target logical_op_short_circuit } } } */
 /* { dg-final { cleanup-tree-dump "dom1" } } */