Patchwork [PR50878] Fix for verify_dominators in -ftree-tail-merge

login
register
mail settings
Submitter Tom de Vries
Date Oct. 31, 2011, 8:19 p.m.
Message ID <4EAF02CE.4060507@mentor.com>
Download mbox | patch
Permalink /patch/122936/
State New
Headers show

Comments

Tom de Vries - Oct. 31, 2011, 8:19 p.m.
On 10/30/2011 10:54 AM, Richard Guenther wrote:
> On Sun, Oct 30, 2011 at 9:27 AM, Tom de Vries <Tom_deVries@mentor.com> wrote:
>> On 10/30/2011 09:20 AM, Tom de Vries wrote:
>>> Richard,
>>>
>>> I have a fix for PR50878.
>>
>> Sorry, with patch this time.
> 
> Ok for now, but see Davids mail and the complexity issue with iteratively
> updating dominators.

I'm not sure which mail you mean.

> It seems to me that we know exactly what to update
> and how, and we should do that (well, if we need up-to-date dominators,
> re-computing them once in the pass would be ok).
> 

Indeed, in this example we know exactly what to update and how. However, PR50908
popped up, and there that's not the case anymore.

Consider the following cfg, where A is the direct dominator of I:

                A
               / \
              B   \
             / \   \
                C   D
               /|   |\
                E   F
                |\ /|
                | x |
                |/ \|
                G   H
                 \ /
                  I

Say E and F are duplicates, and F is removed.  The cfg then looks like
this:

                A
               / \
              B   \
             / \   \
                C   D
               / \ / \
                  E
                 / \
                G   H
                 \ /
                  I

E is now the new direct dominator of I.

The patch for PR50878 did not address this example, since it uses the set of bbs
directly dominated by the (single) predecessor of bb1 and bb2.

The new patch calculates the updated dominator info by taking the nearest common
dominator (A) of bb1 (F) and bb2 (E), and getting the set of bbs immediately
dominated by it.  Part of this set is now directly dominated by bb2.

Ideally we would have a means to determine which bbs in the set are now
directly dominated by bb2, and call set_immediate_dominator for those bbs, but
we don't, so instead we let iterate_fix_dominators figure it out.

Additionally, the patch makes sure it updates dominator info before updating the
vuses, this fixes a latent bug.

The patch fixes both PR50908 and PR50878.

Bootstrapped and reg-tested on x86_64 and i686, and build and reg-tested on ARM
and MIPS.

Ok for trunk?

Thanks,
- Tom

> Richard.
> 
>> Thanks,
>> - Tom
>>
>>>
>>> A simplified form of the problem from the test-case of the PR is shown in this
>>> cfg. Block 12 has as direct dominator block 5.
>>>
>>>         5
>>>        / \
>>>       /   \
>>>      *     *
>>>      6     7
>>>      |     |
>>>      |     |
>>>      *     *
>>>      8     9
>>>       \   /
>>>        \ /
>>>         *
>>>        12
>>>
>>> tail_merge_optimize finds that blocks 6 and 7 are duplicates. After replacing
>>> block 7 by block 6, the cfg looks like this:
>>>
>>>         5
>>>         |
>>>         |
>>>         *
>>>         6
>>>        / \
>>>       /   \
>>>      *     *
>>>      8     9
>>>       \   /
>>>        \ /
>>>         *
>>>        12
>>>
>>> The new direct dominator of block 12 is block 6, but the current algorithm only
>>> recalculates dominator info for blocks 6, 8 and 9.
>>>
>>> The patch fixes this by additionally recalculating the dominator info for blocks
>>> immediately dominated by bb2 (block 6 in the example), if bb2 has a single
>>> predecessor after replacement.
>>>
>>> Bootstapped and reg-tested on x86_64 and i686. Build and reg-tested on MIPS and ARM.
>>>
>>> Ok for trunk?
>>>
>>> Thanks,
>>> - Tom
>>>
>>> 2011-10-30  Tom de Vries  <tom@codesourcery.com>
>>>
>>>       PR tree-optimization/50878
>>>       * tree-ssa-tail-merge.c (replace_block_by): Recalculate dominator info
>>>       for blocks immediately dominated by bb2, if bb2 has a single predecessor
>>>       after replacement.
>>
>>

2011-10-31  Tom de Vries  <tom@codesourcery.com>

	PR tree-optimization/50908
	* tree-ssa-tail-merge.c (update_vuses): Now that edges are removed
	before update_vuses, test for 1 predecessor rather than two.
	(delete_block_update_dominator_info): New function, part of it factored
	out of ...
	(replace_block_by): Use delete_block_update_dominator_info.  Call
	update_vuses after deleting bb1 and updating dominator info, instead of
	before.
Richard Guenther - Nov. 1, 2011, 9:43 a.m.
On Mon, Oct 31, 2011 at 9:19 PM, Tom de Vries <Tom_deVries@mentor.com> wrote:
> On 10/30/2011 10:54 AM, Richard Guenther wrote:
>> On Sun, Oct 30, 2011 at 9:27 AM, Tom de Vries <Tom_deVries@mentor.com> wrote:
>>> On 10/30/2011 09:20 AM, Tom de Vries wrote:
>>>> Richard,
>>>>
>>>> I have a fix for PR50878.
>>>
>>> Sorry, with patch this time.
>>
>> Ok for now, but see Davids mail and the complexity issue with iteratively
>> updating dominators.
>
> I'm not sure which mail you mean.

The one I CCed you on, which complained about iterative dominator fixing
taking 70% of the compile-time in some GCC testsuite test.

>> It seems to me that we know exactly what to update
>> and how, and we should do that (well, if we need up-to-date dominators,
>> re-computing them once in the pass would be ok).
>>
>
> Indeed, in this example we know exactly what to update and how. However, PR50908
> popped up, and there that's not the case anymore.
>
> Consider the following cfg, where A is the direct dominator of I:
>
>                A
>               / \
>              B   \
>             / \   \
>                C   D
>               /|   |\
>                E   F
>                |\ /|
>                | x |
>                |/ \|
>                G   H
>                 \ /
>                  I
>
> Say E and F are duplicates, and F is removed.  The cfg then looks like
> this:
>
>                A
>               / \
>              B   \
>             / \   \
>                C   D
>               / \ / \
>                  E
>                 / \
>                G   H
>                 \ /
>                  I
>
> E is now the new direct dominator of I.
>
> The patch for PR50878 did not address this example, since it uses the set of bbs
> directly dominated by the (single) predecessor of bb1 and bb2.
>
> The new patch calculates the updated dominator info by taking the nearest common
> dominator (A) of bb1 (F) and bb2 (E), and getting the set of bbs immediately
> dominated by it.  Part of this set is now directly dominated by bb2.
>
> Ideally we would have a means to determine which bbs in the set are now
> directly dominated by bb2, and call set_immediate_dominator for those bbs, but
> we don't, so instead we let iterate_fix_dominators figure it out.
>
> Additionally, the patch makes sure it updates dominator info before updating the
> vuses, this fixes a latent bug.
>
> The patch fixes both PR50908 and PR50878.
>
> Bootstrapped and reg-tested on x86_64 and i686, and build and reg-tested on ARM
> and MIPS.
>
> Ok for trunk?

Ok, but please add testcases for all the bugs you fixed.  This helps adding test
coverage for these cases.

Thanks,
Richard.

> Thanks,
> - Tom
>
>> Richard.
>>
>>> Thanks,
>>> - Tom
>>>
>>>>
>>>> A simplified form of the problem from the test-case of the PR is shown in this
>>>> cfg. Block 12 has as direct dominator block 5.
>>>>
>>>>         5
>>>>        / \
>>>>       /   \
>>>>      *     *
>>>>      6     7
>>>>      |     |
>>>>      |     |
>>>>      *     *
>>>>      8     9
>>>>       \   /
>>>>        \ /
>>>>         *
>>>>        12
>>>>
>>>> tail_merge_optimize finds that blocks 6 and 7 are duplicates. After replacing
>>>> block 7 by block 6, the cfg looks like this:
>>>>
>>>>         5
>>>>         |
>>>>         |
>>>>         *
>>>>         6
>>>>        / \
>>>>       /   \
>>>>      *     *
>>>>      8     9
>>>>       \   /
>>>>        \ /
>>>>         *
>>>>        12
>>>>
>>>> The new direct dominator of block 12 is block 6, but the current algorithm only
>>>> recalculates dominator info for blocks 6, 8 and 9.
>>>>
>>>> The patch fixes this by additionally recalculating the dominator info for blocks
>>>> immediately dominated by bb2 (block 6 in the example), if bb2 has a single
>>>> predecessor after replacement.
>>>>
>>>> Bootstapped and reg-tested on x86_64 and i686. Build and reg-tested on MIPS and ARM.
>>>>
>>>> Ok for trunk?
>>>>
>>>> Thanks,
>>>> - Tom
>>>>
>>>> 2011-10-30  Tom de Vries  <tom@codesourcery.com>
>>>>
>>>>       PR tree-optimization/50878
>>>>       * tree-ssa-tail-merge.c (replace_block_by): Recalculate dominator info
>>>>       for blocks immediately dominated by bb2, if bb2 has a single predecessor
>>>>       after replacement.
>>>
>>>
>
> 2011-10-31  Tom de Vries  <tom@codesourcery.com>
>
>        PR tree-optimization/50908
>        * tree-ssa-tail-merge.c (update_vuses): Now that edges are removed
>        before update_vuses, test for 1 predecessor rather than two.
>        (delete_block_update_dominator_info): New function, part of it factored
>        out of ...
>        (replace_block_by): Use delete_block_update_dominator_info.  Call
>        update_vuses after deleting bb1 and updating dominator info, instead of
>        before.
>
Tom de Vries - Nov. 7, 2011, 9:43 a.m.
On 10/31/2011 09:19 PM, Tom de Vries wrote:
> On 10/30/2011 10:54 AM, Richard Guenther wrote:
>> On Sun, Oct 30, 2011 at 9:27 AM, Tom de Vries <Tom_deVries@mentor.com> wrote:
>>> On 10/30/2011 09:20 AM, Tom de Vries wrote:
>>>> Richard,
>>>>
>>>> I have a fix for PR50878.
>>>
>>> Sorry, with patch this time.
>>
>> Ok for now, but see Davids mail and the complexity issue with iteratively
>> updating dominators.
> 
> I'm not sure which mail you mean.
> 

I had a problem with my mail filter, but luckily I found the message this
weekend by accident. I filed http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51005
about this issue.

Thanks,
- Tom

Patch

Index: gcc/tree-ssa-tail-merge.c
===================================================================
--- gcc/tree-ssa-tail-merge.c (revision 180521)
+++ gcc/tree-ssa-tail-merge.c (working copy)
@@ -1458,7 +1458,7 @@  update_vuses (bool vuse1_phi_args, tree
 	      if (!dominated_by_p (CDI_DOMINATORS, pred, bb2))
 		continue;
 
-	      if (pred == bb2 && EDGE_COUNT (gimple_bb (stmt)->preds) == 2)
+	      if (pred == bb2 && EDGE_COUNT (gimple_bb (stmt)->preds) == 1)
 		{
 		  gimple_stmt_iterator gsi = gsi_for_stmt (stmt);
 		  unlink_virtual_phi (stmt, lhs);
@@ -1526,6 +1526,88 @@  vop_at_entry (basic_block bb)
 	  : NULL_TREE);
 }
 
+/* Given that all incoming edges of BB1 have been redirected to BB2, delete BB1
+   and recompute dominator info.  */
+
+static void
+delete_block_update_dominator_info (basic_block bb1, basic_block bb2)
+{
+  VEC (basic_block,heap) *fix_dom_bb;
+  unsigned int i;
+  basic_block bb, dom;
+  edge e;
+  edge_iterator ei;
+
+  /* Consider the following cfg, where A is the direct dominator of I:
+
+                A
+               / \
+              B   \
+             / \   \
+                C   D
+               /|   |\
+                E   F
+                |\ /|
+                | x |
+                |/ \|
+                G   H
+                 \ /
+                  I
+
+     Say E and F are duplicates, and F is removed.  The cfg then looks like
+     this:
+
+                A
+               / \
+              B   \
+             / \   \
+                C   D
+               / \ / \
+                  E
+                 / \
+                G   H
+                 \ /
+                  I
+
+     E is now the new direct dominator of I.
+
+     In order to calculate the new dominator info, we take the nearest common
+     dominator (A) of bb1 (F) and bb2 (E), and get the set of bbs immediately
+     dominated by it.  Some of this set may now be directly dominated by bb2.
+
+     Ideally we would have a means to determine which bbs in the set are now
+     dominated by bb2, and call set_immediate_dominator for those bbs, but we
+     don't, so instead we let iterate_fix_dominators figure it out.  */
+
+  /* Add bbs immediately dominated by the most common dominator.  */
+  dom = nearest_common_dominator (CDI_DOMINATORS, bb1, bb2);
+  fix_dom_bb = get_dominated_by (CDI_DOMINATORS, dom);
+
+  if (get_immediate_dominator (CDI_DOMINATORS, bb1) == dom)
+    for (i = 0; VEC_iterate (basic_block, fix_dom_bb, i, bb); ++i)
+      {
+	if (bb != bb1)
+	  continue;
+	VEC_unordered_remove (basic_block, fix_dom_bb, i);
+	break;
+      }
+
+  /* Add bb2, but not twice.  */
+  if (get_immediate_dominator (CDI_DOMINATORS, bb2) != dom)
+    VEC_safe_push (basic_block, heap, fix_dom_bb, bb2);
+  /* Add succs of bb2, but not twice.  */
+  FOR_EACH_EDGE (e, ei, bb2->succs)
+    if (get_immediate_dominator (CDI_DOMINATORS, e->dest) != dom)
+      VEC_safe_push (basic_block, heap, fix_dom_bb, e->dest);
+
+  delete_basic_block (bb1);
+  iterate_fix_dominators (CDI_DOMINATORS, fix_dom_bb, false);
+#if defined (ENABLE_CHECKING)
+  verify_dominators (CDI_DOMINATORS);
+#endif
+  VEC_free (basic_block, heap, fix_dom_bb);
+}
+
 /* Redirect all edges from BB1 to BB2, marks BB1 for removal, and if
    UPDATE_VOPS, inserts vop phis.  */
 
@@ -1539,7 +1621,6 @@  replace_block_by (basic_block bb1, basic
   edge e;
   edge_iterator ei;
   bool vuse1_phi_args = false;
-  VEC (basic_block,heap) *fix_dom_bb;
 
   phi_vuse2 = vop_at_entry (bb2);
   if (phi_vuse2 != NULL_TREE && TREE_CODE (phi_vuse2) != SSA_NAME)
@@ -1550,7 +1631,8 @@  replace_block_by (basic_block bb1, basic
       /* Find the vops at entry of bb1 and bb2.  */
       phi_vuse1 = vop_at_entry (bb1);
 
-      /* If both are not found, it means there's no need to update.  */
+      /* If both are not found, it means there's no need to update.  Uses old
+	 dominator info.  */
       if (phi_vuse1 == NULL_TREE && phi_vuse2 == NULL_TREE)
 	update_vops = false;
       else if (phi_vuse1 == NULL_TREE)
@@ -1591,25 +1673,20 @@  replace_block_by (basic_block bb1, basic
 		     pred_edge, UNKNOWN_LOCATION);
     }
 
-  /* Update the vops.  */
+  /* Do updates that use bb1, before deleting bb1.  */
+  if (!update_vops)
+    release_last_vdef (bb1);
+  same_succ_flush_bb (bb1);
+
+  delete_block_update_dominator_info (bb1, bb2);
+
+  /* Update the vops.  Uses new dominator info.  */
   if (update_vops)
     {
       update_vuses (vuse1_phi_args, phi_vuse1, phi_vuse2, bb2,
 		    redirected_edges);
       VEC_free (edge, heap, redirected_edges);
     }
-  else
-    release_last_vdef (bb1);
-
-  same_succ_flush_bb (bb1);
-  delete_basic_block (bb1);
-
-  fix_dom_bb = VEC_alloc (basic_block, heap, 2);
-  VEC_safe_push (basic_block, heap, fix_dom_bb, bb2);
-  FOR_EACH_EDGE (e, ei, bb2->succs)
-    VEC_safe_push (basic_block, heap, fix_dom_bb, e->dest);
-  iterate_fix_dominators (CDI_DOMINATORS, fix_dom_bb, false);
-  VEC_free (basic_block, heap, fix_dom_bb);
 }
 
 /* Bbs for which update_debug_stmt need to be called.  */