diff mbox

Simplify jump threading slightly

Message ID 5228E9CC.8030207@redhat.com
State New
Headers show

Commit Message

Jeff Law Sept. 5, 2013, 8:30 p.m. UTC
The basic structure we're moving towards for the general form of the FSA 
optimization will look something like:

while (something changed)
   {
     changed = thread_through_empty_blocks ()
     if (!threaded_through_block_with_side_effects)
        changed |= thread_through_block_with_wide_effects ();
     if (!threaded_through_joiner_block)
       changed |= thread_through_joiner_block ();
   }

The idea being that the updating code will know how to handle updating 
the CFG and SSA graph when copying precisely one block with side effects 
and one joiner block, potentially in either order with an unlimited 
number of blocks which just transfer control without side effects thrown 
into the mix as well.

So basically we're going to be classifying blocks into 3 categories.

   1. Those with no side effects other than transferring control and we
      can statically determine the successor.  Note these should require
      no copying as we can just wire up the edges in the desired fashion.

   2. Blocks with side effects where we can statically determine the
      successor.  These require duplication so we can preserve the
      side effects.  Obviously the duplicate will transfer control to
      the desired destination unconditionally.

   3. Joiner blocks.  These blocks aren't threadable, but have a
      successor which is threadable.  We have to copy the joiner
      block and redirect the edge leading to the threadable successor
      either to the ultimate destination (if the successor is of type
      #1) or to a duplicate of the successor if it is of type #2.


We're going to annotate each block in the path with its type so that the 
updating code can "easily" determine how to perform the update.

To this end I'm reorganizing the code to detect the thread path into 
subroutines to detect each of the three types of blocks.  This is the 
first (and obviously easiest) of those changes.

In the specific case of the FSA optimization, the key path(s) to thread 
have the form #3 -> #1 -> #2.  Where #1 is the loop back edge.

Bootstrapped and regression tested on x86_64-unknown-linux-gnu. 
Installed onto the trunk.

Comments

Gerald Pfeifer Sept. 6, 2013, 8:10 p.m. UTC | #1
Hi Jeff,

On Thu, 5 Sep 2013, Jeff Law wrote:
> Bootstrapped and regression tested on x86_64-unknown-linux-gnu. 
> Installed onto the trunk.

is it possible this

  2013-09-05  Jeff Law  <law@redhat.com>

       * tree-ssa-threadedge.c (thread_around_empty_blocks): Renamed
       from thread_around_empty_block.  Record threading path into PATH.
       Recurse if threading through the initial block is successful.
       (thread_across_edge): Corresponding changes to slightly simplify.

is causing http://gcc.gnu.org/bugzilla/show_bug.cgi?id=58340 , or
least unearthing it?

That's amd64-unknown-freebsd8.0, pretty similar to your test platform,
so this would be a bit surprising, but...

Gerald
Jeff Law Sept. 6, 2013, 8:13 p.m. UTC | #2
On 09/06/2013 02:10 PM, Gerald Pfeifer wrote:
> Hi Jeff,
>
> On Thu, 5 Sep 2013, Jeff Law wrote:
>> Bootstrapped and regression tested on x86_64-unknown-linux-gnu.
>> Installed onto the trunk.
>
> is it possible this
>
>    2013-09-05  Jeff Law  <law@redhat.com>
>
>         * tree-ssa-threadedge.c (thread_around_empty_blocks): Renamed
>         from thread_around_empty_block.  Record threading path into PATH.
>         Recurse if threading through the initial block is successful.
>         (thread_across_edge): Corresponding changes to slightly simplify.
>
> is causing http://gcc.gnu.org/bugzilla/show_bug.cgi?id=58340 , or
> least unearthing it?
>
> That's amd64-unknown-freebsd8.0, pretty similar to your test platform,
> so this would be a bit surprising, but...
I'll take a look.  I'm not a big believer in coincidences -- with that 
code recently changing and you seeing a bootstrap failure, odds are, 
it's something in that change.

The day is winding down, so it'll probably be monday before I get too 
far on it.

jeff
Markus Trippelsdorf Sept. 10, 2013, 7:40 a.m. UTC | #3
On 2013.09.05 at 14:30 -0600, Jeff Law wrote:
> 
> Bootstrapped and regression tested on x86_64-unknown-linux-gnu. 
> Installed onto the trunk.

This patch caused http://gcc.gnu.org/bugzilla/show_bug.cgi?id=58373
diff mbox

Patch

diff --git a/gcc/ChangeLog b/gcc/ChangeLog
index 91f41b1..ce0e39d 100644
--- a/gcc/ChangeLog
+++ b/gcc/ChangeLog
@@ -1,3 +1,10 @@ 
+2013-09-05  Jeff Law  <law@redhat.com>
+
+	* tree-ssa-threadedge.c (thread_around_empty_blocks): Renamed
+	from thread_around_empty_block.  Record threading path into PATH.
+	Recurse if threading through the initial block is successful.
+	(thread_across_edge): Corresponding changes to slightly simplify.
+
 2013-09-05  James Greenhalgh  <james.greenhalgh@arm.com>
 
 	* config/aarch64/aarch64.md
diff --git a/gcc/tree-ssa-threadedge.c b/gcc/tree-ssa-threadedge.c
index dddcfce..afdd0af 100644
--- a/gcc/tree-ssa-threadedge.c
+++ b/gcc/tree-ssa-threadedge.c
@@ -738,46 +738,68 @@  propagate_threaded_block_debug_into (basic_block dest, basic_block src)
     fewvars.release ();
 }
 
-/* TAKEN_EDGE represents the an edge taken as a result of jump threading.
-   See if we can thread around TAKEN_EDGE->dest as well.  If so, return
-   the edge out of TAKEN_EDGE->dest that we can statically compute will be
-   traversed.
-
-   We are much more restrictive as to the contents of TAKEN_EDGE->dest
-   as the path isolation code in tree-ssa-threadupdate.c isn't prepared
-   to handle copying intermediate blocks on a threaded path. 
-
-   Long term a more consistent and structured approach to path isolation
-   would be a huge help.   */
-static edge
-thread_around_empty_block (edge taken_edge,
-			   gimple dummy_cond,
-			   bool handle_dominating_asserts,
-			   tree (*simplify) (gimple, gimple),
-			   bitmap visited)
+/* See if TAKEN_EDGE->dest is a threadable block with no side effecs (ie, it
+   need not be duplicated as part of the CFG/SSA updating process).
+
+   If it is threadable, add it to PATH and VISITED and recurse, ultimately
+   returning TRUE from the toplevel call.   Otherwise do nothing and
+   return false.
+
+   DUMMY_COND, HANDLE_DOMINATING_ASSERTS and SIMPLIFY are used to
+   try and simplify the condition at the end of TAKEN_EDGE->dest.  */
+static bool
+thread_around_empty_blocks (edge taken_edge,
+			    gimple dummy_cond,
+			    bool handle_dominating_asserts,
+			    tree (*simplify) (gimple, gimple),
+			    bitmap visited,
+			    vec<edge> *path)
 {
   basic_block bb = taken_edge->dest;
   gimple_stmt_iterator gsi;
   gimple stmt;
   tree cond;
 
-  /* This block can have no PHI nodes.  This is overly conservative.  */
+  /* The key property of these blocks is that they need not be duplicated
+     when threading.  Thus they can not have visible side effects such
+     as PHI nodes.  */
   if (!gsi_end_p (gsi_start_phis (bb)))
     return NULL;
 
   /* Skip over DEBUG statements at the start of the block.  */
   gsi = gsi_start_nondebug_bb (bb);
 
+  /* If the block has no statements, but does have a single successor, then
+     it's just a forwarding block and we can thread through it trivially.  */
   if (gsi_end_p (gsi))
-    return NULL;
+    {
+      if (single_succ_p (bb))
+	{
+	  taken_edge = single_succ_edge (bb);
+	  if ((taken_edge->flags & EDGE_DFS_BACK) == 0
+	      && !bitmap_bit_p (visited, taken_edge->dest->index))
+	    {
+	      bitmap_set_bit (visited, taken_edge->dest->index);
+	      path->safe_push (taken_edge);
+	      thread_around_empty_blocks (taken_edge,
+					  dummy_cond,
+					  handle_dominating_asserts,
+					  simplify,
+					  visited,
+					  path);
+	      return true;
+	    }
+	}
+      return false;
+    }
 
-  /* This block can have no statements other than its control altering
-     statement.  This is overly conservative.  */
+  /* The only real statements this block can have are a control
+     flow altering statement.  Anything else stops the thread.  */
   stmt = gsi_stmt (gsi);
   if (gimple_code (stmt) != GIMPLE_COND
       && gimple_code (stmt) != GIMPLE_GOTO
       && gimple_code (stmt) != GIMPLE_SWITCH)
-    return NULL;
+    return false;
 
   /* Extract and simplify the condition.  */
   cond = simplify_control_stmt_condition (taken_edge, stmt, dummy_cond,
@@ -788,15 +810,22 @@  thread_around_empty_block (edge taken_edge,
      path.  */
   if (cond && is_gimple_min_invariant (cond))
     {
-      edge taken_edge = find_taken_edge (bb, cond);
+      taken_edge = find_taken_edge (bb, cond);
 
       if (bitmap_bit_p (visited, taken_edge->dest->index))
-	return NULL;
+	return false;
       bitmap_set_bit (visited, taken_edge->dest->index);
-      return taken_edge;
+      path->safe_push (taken_edge);
+      thread_around_empty_blocks (taken_edge,
+				  dummy_cond,
+				  handle_dominating_asserts,
+				  simplify,
+				  visited,
+				  path);
+      return true;
     }
  
-  return NULL;
+  return false;
 }
       
 /* E1 and E2 are edges into the same basic block.  Return TRUE if the
@@ -896,51 +925,40 @@  thread_across_edge (gimple dummy_cond,
 	  edge taken_edge = find_taken_edge (e->dest, cond);
 	  basic_block dest = (taken_edge ? taken_edge->dest : NULL);
 	  bitmap visited;
-	  edge e2;
 
-	  if (dest == e->dest)
+	  /* DEST could be NULL for a computed jump to an absolute
+	     address.  */
+	  if (dest == NULL || dest == e->dest)
 	    goto fail;
 
 	  vec<edge> path = vNULL;
 	  path.safe_push (e);
 	  path.safe_push (taken_edge);
 
-	  /* DEST could be null for a computed jump to an absolute
-	     address.  If DEST is not null, then see if we can thread
-	     through it as well, this helps capture secondary effects
-	     of threading without having to re-run DOM or VRP.  */
-	  if (dest
-	      && ((e->flags & EDGE_DFS_BACK) == 0
-		  || ! cond_arg_set_in_bb (taken_edge, e->dest)))
+	  /* See if we can thread through DEST as well, this helps capture
+	     secondary effects of threading without having to re-run DOM or
+	     VRP.  */
+	  if ((e->flags & EDGE_DFS_BACK) == 0
+	       || ! cond_arg_set_in_bb (taken_edge, e->dest))
 	    {
 	      /* We don't want to thread back to a block we have already
  		 visited.  This may be overly conservative.  */
 	      visited = BITMAP_ALLOC (NULL);
 	      bitmap_set_bit (visited, dest->index);
 	      bitmap_set_bit (visited, e->dest->index);
-	      do
-		{
-		  e2 = thread_around_empty_block (taken_edge,
-						  dummy_cond,
-						  handle_dominating_asserts,
-						  simplify,
-						  visited);
-		  if (e2)
-		    {
-		      taken_edge = e2;
-		      path.safe_push (e2);
-		    }
-		}
-	      while (e2);
+	      thread_around_empty_blocks (taken_edge,
+					  dummy_cond,
+					  handle_dominating_asserts,
+					  simplify,
+					  visited,
+					  &path);
 	      BITMAP_FREE (visited);
 	    }
 
 	  remove_temporary_equivalences (stack);
-	  if (taken_edge)
-	    {
-	      propagate_threaded_block_debug_into (taken_edge->dest, e->dest);
-	      register_jump_thread (path, false);
-	    }
+	  propagate_threaded_block_debug_into (path[path.length () - 1]->dest,
+					       e->dest);
+	  register_jump_thread (path, false);
 	  path.release ();
 	  return;
 	}
@@ -958,7 +976,7 @@  thread_across_edge (gimple dummy_cond,
     This is a stopgap until we have a more structured approach to path
     isolation.  */
   {
-    edge e2, e3, taken_edge;
+    edge taken_edge;
     edge_iterator ei;
     bool found = false;
     bitmap visited = BITMAP_ALLOC (NULL);
@@ -976,28 +994,14 @@  thread_across_edge (gimple dummy_cond,
 	   of E->dest.  */
 	path.safe_push (e);
 	path.safe_push (taken_edge);
-	found = false;
-	e3 = taken_edge;
-	do
-	  {
-	    if ((e->flags & EDGE_DFS_BACK) == 0
-		|| ! cond_arg_set_in_bb (e3, e->dest))
-	      e2 = thread_around_empty_block (e3,
+	if ((e->flags & EDGE_DFS_BACK) == 0
+	    || ! cond_arg_set_in_bb (path[path.length () - 1], e->dest))
+	  found = thread_around_empty_blocks (taken_edge,
 					      dummy_cond,
 					      handle_dominating_asserts,
 					      simplify,
-					      visited);
-	    else
-	      e2 = NULL;
-
-	    if (e2)
-	      {
-		path.safe_push (e2);
-	        e3 = e2;
-		found = true;
-	      }
-	  }
-        while (e2);
+					      visited,
+					      &path);
 
 	/* If we were able to thread through a successor of E->dest, then
 	   record the jump threading opportunity.  */
@@ -1008,10 +1012,10 @@  thread_across_edge (gimple dummy_cond,
 	       (E2->src) to the final target (E3->dest), then make sure that
 	       the PHI args associated with the edges E2 and E3 are the
 	       same.  */
-	    tmp = find_edge (taken_edge->src, e3->dest);
-	    if (!tmp || phi_args_equal_on_edges (tmp, e3))
+	    tmp = find_edge (taken_edge->src, path[path.length () - 1]->dest);
+	    if (!tmp || phi_args_equal_on_edges (tmp, path[path.length () - 1]))
 	      {
-		propagate_threaded_block_debug_into (e3->dest,
+		propagate_threaded_block_debug_into (path[path.length () - 1]->dest,
 						     taken_edge->dest);
 		register_jump_thread (path, true);
 	      }