diff mbox

Fix PR 65177: diamonds are not valid execution threads for jump threading

Message ID 20150319195402.GA21421@f1.c.bardezibar.internal
State New
Headers show

Commit Message

Sebastian Pop March 19, 2015, 7:54 p.m. UTC
Richard Biener wrote:
> please instead fixup after copy_bbs in duplicate_seme_region.
> 

Thanks for the review.
Attached patch that does not modify copy_bbs.
Fixes make check in hmmer and make check RUNTESTFLAGS=tree-ssa.exp

Full bootstrap and regtest in progress on x86_64-linux.  Ok for trunk?

Comments

Richard Biener March 20, 2015, 9:07 a.m. UTC | #1
On Thu, Mar 19, 2015 at 8:54 PM, Sebastian Pop <sebpop@gmail.com> wrote:
> Richard Biener wrote:
>> please instead fixup after copy_bbs in duplicate_seme_region.
>>
>
> Thanks for the review.
> Attached patch that does not modify copy_bbs.
> Fixes make check in hmmer and make check RUNTESTFLAGS=tree-ssa.exp
>
> Full bootstrap and regtest in progress on x86_64-linux.  Ok for trunk?

Looks good to me.  Please also add a testcase.

Thanks,
Richard.
Jeff Law March 23, 2015, 5:41 a.m. UTC | #2
On 03/19/2015 01:54 PM, Sebastian Pop wrote:
> Richard Biener wrote:
>> please instead fixup after copy_bbs in duplicate_seme_region.
>>
>
> Thanks for the review.
> Attached patch that does not modify copy_bbs.
> Fixes make check in hmmer and make check RUNTESTFLAGS=tree-ssa.exp
>
> Full bootstrap and regtest in progress on x86_64-linux.  Ok for trunk?
I'm going to have to find time tomorrow to wrap this up.  I've got a 
morning full of meetings, then I on a plane for the east coast for the 
rest of the day.  I'm in an exit row seat for the 2nd leg, so there 
ought to be enough room to open my laptop and poke at this.

jeff
Jeff Law March 25, 2015, 12:21 p.m. UTC | #3
On 03/19/15 13:54, Sebastian Pop wrote:
> Richard Biener wrote:
>> >please instead fixup after copy_bbs in duplicate_seme_region.
>> >
> Thanks for the review.
> Attached patch that does not modify copy_bbs.
> Fixes make check in hmmer and make check RUNTESTFLAGS=tree-ssa.exp
>
> Full bootstrap and regtest in progress on x86_64-linux.  Ok for trunk?
>
>
> 0001-diamonds-are-not-valid-execution-threads-for-jump-th.patch
>
>
>  From 8f1516235bce3e1c4f359149dcc546d813ed7817 Mon Sep 17 00:00:00 2001
> From: Sebastian Pop<sebpop@gmail.com>
> Date: Tue, 17 Mar 2015 20:28:19 +0100
> Subject: [PATCH] diamonds are not valid execution threads for jump threading
>
> 	PR tree-optimization/65177
> 	* tree-ssa-threadupdate.c (verify_seme): Renamed verify_jump_thread.
> 	(bb_in_bbs): New.
> 	(duplicate_seme_region): Renamed duplicate_thread_path.  Redirect all
> 	edges not adjacent on the path to the original code.
OK for the trunk.  Though I think there's some stage1 refactoring that 
we're going to want to do.

Specifically, it seems to me that copy_bbs should be refactored into 
copy_bbs and copy_bbs_for_threading or somesuch.  Where those routines 
call into refactored common subroutines, but obviously handle wiring up 
the outgoing edges from the copied blocks differently.

The goal would be to eliminate the overly complex block copy/CFG update 
scheme in tree-ssa-threadupdate.c as part of a larger project to convert 
to a backward threader that can run independently of DOM.

Jeff
Sebastian Pop March 25, 2015, 11:09 p.m. UTC | #4
Jeff Law wrote:
> >	PR tree-optimization/65177
> >	* tree-ssa-threadupdate.c (verify_seme): Renamed verify_jump_thread.
> >	(bb_in_bbs): New.
> >	(duplicate_seme_region): Renamed duplicate_thread_path.  Redirect all
> >	edges not adjacent on the path to the original code.
> OK for the trunk.  

Committed r221675.

> Though I think there's some stage1 refactoring that we're going to want to do.

Agreed.

> Specifically, it seems to me that copy_bbs should be refactored into
> copy_bbs and copy_bbs_for_threading or somesuch.  Where those
> routines call into refactored common subroutines, but obviously
> handle wiring up the outgoing edges from the copied blocks
> differently.
> 

That would be a good cleanup: I don't like to arbitrarily redirect edges in
copy_bbs just to redirect them back to their initial place in the caller.

> The goal would be to eliminate the overly complex block copy/CFG
> update scheme in tree-ssa-threadupdate.c as part of a larger project
> to convert to a backward threader that can run independently of DOM.

I have a start of a patch for that cleanup, it currently runs wild as it would
replace the existing threadupdate code generator with a call to the new
duplicate_thread_path.  I think we should take smaller more manageable steps to
ease the review and to not destabilize the jump-threader.  In particular I think
we should have both code generators for a while and turn one on/off with an option.

Sebastian
Jeff Law March 26, 2015, 3:50 p.m. UTC | #5
On 03/25/2015 05:09 PM, Sebastian Pop wrote:
>> Specifically, it seems to me that copy_bbs should be refactored into
>> copy_bbs and copy_bbs_for_threading or somesuch.  Where those
>> routines call into refactored common subroutines, but obviously
>> handle wiring up the outgoing edges from the copied blocks
>> differently.
>>
>
> That would be a good cleanup: I don't like to arbitrarily redirect edges in
> copy_bbs just to redirect them back to their initial place in the caller.
Exactly.

>
>> The goal would be to eliminate the overly complex block copy/CFG
>> update scheme in tree-ssa-threadupdate.c as part of a larger project
>> to convert to a backward threader that can run independently of DOM.
>
> I have a start of a patch for that cleanup, it currently runs wild as it would
> replace the existing threadupdate code generator with a call to the new
> duplicate_thread_path.  I think we should take smaller more manageable steps to
> ease the review and to not destabilize the jump-threader.  In particular I think
> we should have both code generators for a while and turn one on/off with an option.
I hadn't planned on supporting both with an option, I'd rather make the 
switch and not look back :-)   An option just adds maintenance burdens 
(supporting both approaches) and doesn't actually help the end users 
(though it would help us as developers).

Regardless, probably the first step is a common API for the two path 
duplication approaches.

Jeff
Richard Biener March 27, 2015, 8:53 a.m. UTC | #6
On Thu, Mar 26, 2015 at 4:50 PM, Jeff Law <law@redhat.com> wrote:
> On 03/25/2015 05:09 PM, Sebastian Pop wrote:
>>>
>>> Specifically, it seems to me that copy_bbs should be refactored into
>>> copy_bbs and copy_bbs_for_threading or somesuch.  Where those
>>> routines call into refactored common subroutines, but obviously
>>> handle wiring up the outgoing edges from the copied blocks
>>> differently.
>>>
>>
>> That would be a good cleanup: I don't like to arbitrarily redirect edges
>> in
>> copy_bbs just to redirect them back to their initial place in the caller.
>
> Exactly.
>
>>
>>> The goal would be to eliminate the overly complex block copy/CFG
>>> update scheme in tree-ssa-threadupdate.c as part of a larger project
>>> to convert to a backward threader that can run independently of DOM.
>>
>>
>> I have a start of a patch for that cleanup, it currently runs wild as it
>> would
>> replace the existing threadupdate code generator with a call to the new
>> duplicate_thread_path.  I think we should take smaller more manageable
>> steps to
>> ease the review and to not destabilize the jump-threader.  In particular I
>> think
>> we should have both code generators for a while and turn one on/off with
>> an option.
>
> I hadn't planned on supporting both with an option, I'd rather make the
> switch and not look back :-)   An option just adds maintenance burdens
> (supporting both approaches) and doesn't actually help the end users (though
> it would help us as developers).
>
> Regardless, probably the first step is a common API for the two path
> duplication approaches.

Yeah, and refactoring copy_bbs so that the actual edge duplication happens
in another function (thus we can have two of them).

Note we have way too many copy-XYZ APIs/workers on GIMPLE already.

I was also playing with the idea to support value-numbering the stmts
on-the-fly as we copy them and use this for cost estimation of copies
(ok, well - basically do the copy and then either throw it away again
or accept it).
I thought about applying this to loop unrolling, but obviously this also applies
to any block duplication we do during jump threading.

Richard.

> Jeff
Jeff Law March 27, 2015, 8:26 p.m. UTC | #7
On 03/27/2015 02:53 AM, Richard Biener wrote:
>
> Yeah, and refactoring copy_bbs so that the actual edge duplication happens
> in another function (thus we can have two of them).
Exactly.


>
> I was also playing with the idea to support value-numbering the stmts
> on-the-fly as we copy them and use this for cost estimation of copies
> (ok, well - basically do the copy and then either throw it away again
> or accept it).
> I thought about applying this to loop unrolling, but obviously this also applies
> to any block duplication we do during jump threading.
It'd definitely be useful in threading.  It's often the case that PHIs 
in the duplicates are degenerates and we can/should cprop them away and 
perform resultant simplifications that propagation enables.

If we could do that integrated with the actual duplication, then we'd be 
able to drop the phi-only cprop pass which exists solely to clean up 
that stuff.

jeff
diff mbox

Patch

From 8f1516235bce3e1c4f359149dcc546d813ed7817 Mon Sep 17 00:00:00 2001
From: Sebastian Pop <sebpop@gmail.com>
Date: Tue, 17 Mar 2015 20:28:19 +0100
Subject: [PATCH] diamonds are not valid execution threads for jump threading

	PR tree-optimization/65177
	* tree-ssa-threadupdate.c (verify_seme): Renamed verify_jump_thread.
	(bb_in_bbs): New.
	(duplicate_seme_region): Renamed duplicate_thread_path.  Redirect all
	edges not adjacent on the path to the original code.
---
 gcc/tree-ssa-threadupdate.c |   93 +++++++++++++++++++++++++++++++------------
 1 files changed, 67 insertions(+), 26 deletions(-)

diff --git a/gcc/tree-ssa-threadupdate.c b/gcc/tree-ssa-threadupdate.c
index 7a159bb..610e807 100644
--- a/gcc/tree-ssa-threadupdate.c
+++ b/gcc/tree-ssa-threadupdate.c
@@ -2328,36 +2328,32 @@  bb_ends_with_multiway_branch (basic_block bb ATTRIBUTE_UNUSED)
   return false;
 }
 
-/* Verify that the REGION is a Single Entry Multiple Exits region: make sure no
-   edge other than ENTRY is entering the REGION.  */
+/* Verify that the REGION is a valid jump thread.  A jump thread is a special
+   case of SEME Single Entry Multiple Exits region in which all nodes in the
+   REGION have exactly one incoming edge.  The only exception is the first block
+   that may not have been connected to the rest of the cfg yet.  */
 
 DEBUG_FUNCTION void
-verify_seme (edge entry, basic_block *region, unsigned n_region)
+verify_jump_thread (basic_block *region, unsigned n_region)
 {
-  bitmap bbs = BITMAP_ALLOC (NULL);
-
   for (unsigned i = 0; i < n_region; i++)
-    bitmap_set_bit (bbs, region[i]->index);
+    gcc_assert (EDGE_COUNT (region[i]->preds) <= 1);
+}
 
-  for (unsigned i = 0; i < n_region; i++)
-    {
-      edge e;
-      edge_iterator ei;
-      basic_block bb = region[i];
+/* Return true when BB is one of the first N items in BBS.  */
 
-      /* All predecessors other than ENTRY->src should be in the region.  */
-      for (ei = ei_start (bb->preds); (e = ei_safe_edge (ei)); ei_next (&ei))
-	if (e != entry)
-	  gcc_assert (bitmap_bit_p (bbs, e->src->index));
-    }
+static inline bool
+bb_in_bbs (basic_block bb, basic_block *bbs, int n)
+{
+  for (int i = 0; i < n; i++)
+    if (bb == bbs[i])
+      return true;
 
-  BITMAP_FREE (bbs);
+  return false;
 }
 
-/* Duplicates a Single Entry Multiple Exit REGION (set of N_REGION basic
-   blocks).  The ENTRY edge is redirected to the duplicate of the region.  If
-   REGION is not a Single Entry region, ignore any incoming edges other than
-   ENTRY: this makes the copied region a Single Entry region.
+/* Duplicates a jump-thread path of N_REGION basic blocks.
+   The ENTRY edge is redirected to the duplicate of the region.
 
    Remove the last conditional statement in the last basic block in the REGION,
    and create a single fallthru edge pointing to the same destination as the
@@ -2369,7 +2365,7 @@  verify_seme (edge entry, basic_block *region, unsigned n_region)
    Returns false if it is unable to copy the region, true otherwise.  */
 
 static bool
-duplicate_seme_region (edge entry, edge exit,
+duplicate_thread_path (edge entry, edge exit,
 		       basic_block *region, unsigned n_region,
 		       basic_block *region_copy)
 {
@@ -2428,7 +2424,53 @@  duplicate_seme_region (edge entry, edge exit,
     }
 
   copy_bbs (region, n_region, region_copy, &exit, 1, &exit_copy, loop,
-	    split_edge_bb_loc (entry), 0);
+	    split_edge_bb_loc (entry), false);
+
+  /* Fix up: copy_bbs redirects all edges pointing to copied blocks.  The
+     following code ensures that all the edges exiting the jump-thread path are
+     redirected back to the original code: these edges are exceptions
+     invalidating the property that is propagated by executing all the blocks of
+     the jump-thread path in order.  */
+
+  for (i = 0; i < n_region; i++)
+    {
+      edge e;
+      edge_iterator ei;
+      basic_block bb = region_copy[i];
+
+      if (single_succ_p (bb))
+	{
+	  /* Make sure the successor is the next node in the path.  */
+	  gcc_assert (i + 1 == n_region
+		      || region_copy[i + 1] == single_succ_edge (bb)->dest);
+	  continue;
+	}
+
+      /* Special case the last block on the path: make sure that it does not
+	 jump back on the copied path.  */
+      if (i + 1 == n_region)
+	{
+	  FOR_EACH_EDGE (e, ei, bb->succs)
+	    if (bb_in_bbs (e->dest, region_copy, n_region - 1))
+	      {
+		basic_block orig = get_bb_original (e->dest);
+		if (orig)
+		  redirect_edge_and_branch_force (e, orig);
+	      }
+	  continue;
+	}
+
+      /* Redirect all other edges jumping to non-adjacent blocks back to the
+	 original code.  */
+      FOR_EACH_EDGE (e, ei, bb->succs)
+	if (region_copy[i + 1] != e->dest)
+	  {
+	    basic_block orig = get_bb_original (e->dest);
+	    if (orig)
+	      redirect_edge_and_branch_force (e, orig);
+	  }
+    }
+
   if (total_count)
     {
       scale_bbs_frequencies_gcov_type (region, n_region,
@@ -2445,8 +2487,7 @@  duplicate_seme_region (edge entry, edge exit,
     }
 
 #ifdef ENABLE_CHECKING
-  /* Make sure no edge other than ENTRY is entering the copied region.  */
-  verify_seme (entry, region_copy, n_region);
+  verify_jump_thread (region_copy, n_region);
 #endif
 
   /* Remove the last branch in the jump thread path.  */
@@ -2550,7 +2591,7 @@  thread_through_all_blocks (bool may_peel_loop_headers)
       for (unsigned int j = 0; j < len - 1; j++)
 	region[j] = (*path)[j]->e->dest;
 
-      if (duplicate_seme_region (entry, exit, region, len - 1, NULL))
+      if (duplicate_thread_path (entry, exit, region, len - 1, NULL))
 	{
 	  /* We do not update dominance info.  */
 	  free_dominance_info (CDI_DOMINATORS);
-- 
1.7.2.5