Patchwork Preserve loops from CFG build until after RTL loop opts

login
register
mail settings
Submitter Tom de Vries
Date April 28, 2013, 12:26 p.m.
Message ID <517D155C.3080509@mentor.com>
Download mbox | patch
Permalink /patch/240247/
State New
Headers show

Comments

Tom de Vries - April 28, 2013, 12:26 p.m.
On 26/04/13 16:27, Tom de Vries wrote:
> On 25/04/13 16:19, Richard Biener wrote:
> <SNIP>
>> and compared to the previous patch changed the tree-ssa-tailmerge.c
>> part to deal with merging of loop latch and loop preheader (even
>> if that's a really bad idea) to not regress gcc.dg/pr50763.c.
>> Any suggestion on how to improve that part welcome.
<SNIP>
> So I think this is really a cornercase, and we should disregard it if that makes
> things simpler.
> 
> Rather than fixing up the loop structure, we could prevent tail-merge in these
> cases.
> 
> The current fix tests for current_loops == NULL, and I'm not sure that can still
> happen there, given that we have PROP_loops.

Richard,

I've found that it happens in these g++ test-cases:
  g++.dg/ext/mv1.C
  g++.dg/ext/mv12.C
  g++.dg/ext/mv2.C
  g++.dg/ext/mv5.C
  g++.dg/torture/covariant-1.C
  g++.dg/torture/pr43068.C
  g++.dg/torture/pr47714.C
This seems rare enough to just bail out of tail-merge in those cases.

> It's not evident to me that the test bb2->loop_father->latch == bb2 is
> sufficient. Before calling tail_merge_optimize, we call loop_optimizer_finalize
> in which we assert that LOOPS_MAY_HAVE_MULTIPLE_LATCHES from there on, so in
> theory we might miss some latches.
> 
> But I guess that pre (having started out with simple latches) maintains simple
> latches throughout, and that tail-merge does the same.

I've added a comment related to this in the patch.

Bootstrapped and reg-tested (ada inclusive) on x86_64.

OK for trunk?

Thanks,
- Tom

2013-04-28  Tom de Vries  <tom@codesourcery.com>

	* tree-ssa-tail-merge.c (find_same_succ_bb): Skip loop latch bbs.
	(replace_block_by): Don't set LOOPS_NEED_FIXUP.
	(tail_merge_optimize): Handle current_loops == NULL.

	* gcc.dg/pr50763.c: Update test.
Richard Guenther - April 29, 2013, 7:54 a.m.
On Sun, 28 Apr 2013, Tom de Vries wrote:

> On 26/04/13 16:27, Tom de Vries wrote:
> > On 25/04/13 16:19, Richard Biener wrote:
> > <SNIP>
> >> and compared to the previous patch changed the tree-ssa-tailmerge.c
> >> part to deal with merging of loop latch and loop preheader (even
> >> if that's a really bad idea) to not regress gcc.dg/pr50763.c.
> >> Any suggestion on how to improve that part welcome.
> <SNIP>
> > So I think this is really a cornercase, and we should disregard it if that makes
> > things simpler.
> > 
> > Rather than fixing up the loop structure, we could prevent tail-merge in these
> > cases.
> > 
> > The current fix tests for current_loops == NULL, and I'm not sure that can still
> > happen there, given that we have PROP_loops.
> 
> Richard,
> 
> I've found that it happens in these g++ test-cases:
>   g++.dg/ext/mv1.C
>   g++.dg/ext/mv12.C
>   g++.dg/ext/mv2.C
>   g++.dg/ext/mv5.C
>   g++.dg/torture/covariant-1.C
>   g++.dg/torture/pr43068.C
>   g++.dg/torture/pr47714.C
> This seems rare enough to just bail out of tail-merge in those cases.
> 
> > It's not evident to me that the test bb2->loop_father->latch == bb2 is
> > sufficient. Before calling tail_merge_optimize, we call loop_optimizer_finalize
> > in which we assert that LOOPS_MAY_HAVE_MULTIPLE_LATCHES from there on, so in
> > theory we might miss some latches.
> > 
> > But I guess that pre (having started out with simple latches) maintains simple
> > latches throughout, and that tail-merge does the same.
> 
> I've added a comment related to this in the patch.
> 
> Bootstrapped and reg-tested (ada inclusive) on x86_64.
> 
> OK for trunk?

+  if (bb == NULL
+      /* Be conservative with loop structure.  It's not evident that this
test
+        is sufficient.  Before tail-merge, we've just called
+        loop_optimizer_finalize, and LOOPS_MAY_HAVE_MULTIPLE_LATCHES is 
now
+        set, so there's no guarantee that the loop->latch value is still
valid.
+        But we assume that, since we've forced LOOPS_HAVE_SIMPLE_LATCHES 
at
the
+        start of pre, we've kept that property intact throughout pre, and
are
+        keeping it throughout tail-merge using this test.  */
+      || bb->loop_father->latch == bb)
     return;

A more complete test would be to use what the bb_loop_header_p predicate
does - skip latch _edges_.  Not sure if that's easily possible in
the loop looking at succs

  FOR_EACH_EDGE (e, ei, bb->succs)
    {
      int index = e->dest->index;
      bitmap_set_bit (same->succs, index);
      same_succ_edge_flags[index] = e->flags;
    }

but we'd skip all edges for which

    dominated_by_p (CDI_DOMINATORS, e->src, e->dest)

of course that's equal to skipping the whole basic-block if the
above is true.

I suppose the patch is ok as-is for now, but let's keep the above
in mind (I want to audit the whole bootstrap process for loops
that vanish and eventually re-appear, I just didn't get around
thinking about a proper way to efficiently instrument for that).

Thanks,
Richard.

Patch

diff --git a/gcc/tree-ssa-tail-merge.c b/gcc/tree-ssa-tail-merge.c
index f2ab7444..5467ae5 100644
--- a/gcc/tree-ssa-tail-merge.c
+++ b/gcc/tree-ssa-tail-merge.c
@@ -689,7 +689,15 @@  find_same_succ_bb (basic_block bb, same_succ *same_p)
   edge_iterator ei;
   edge e;
 
-  if (bb == NULL)
+  if (bb == NULL
+      /* Be conservative with loop structure.  It's not evident that this test
+	 is sufficient.  Before tail-merge, we've just called
+	 loop_optimizer_finalize, and LOOPS_MAY_HAVE_MULTIPLE_LATCHES is now
+	 set, so there's no guarantee that the loop->latch value is still valid.
+	 But we assume that, since we've forced LOOPS_HAVE_SIMPLE_LATCHES at the
+	 start of pre, we've kept that property intact throughout pre, and are
+	 keeping it throughout tail-merge using this test.  */
+      || bb->loop_father->latch == bb)
     return;
   bitmap_set_bit (same->bbs, bb->index);
   FOR_EACH_EDGE (e, ei, bb->succs)
@@ -1460,17 +1468,6 @@  replace_block_by (basic_block bb1, basic_block bb2)
   /* Mark the basic block as deleted.  */
   mark_basic_block_deleted (bb1);
 
-  /* ???  If we merge the loop preheader with the loop latch we are creating
-     additional entries into the loop, eventually rotating it.
-     Mark loops for fixup in this case.
-     ???  This is a completely unwanted transform and will wreck most
-     loops at this point - but with just not considering loop latches as
-     merge candidates we fail to commonize the two loops in gcc.dg/pr50763.c.
-     A better fix to avoid that regression is needed.  */
-  if (current_loops
-      && bb2->loop_father->latch == bb2)
-    loops_state_set (LOOPS_NEED_FIXUP);
-
   /* Redirect the incoming edges of bb1 to bb2.  */
   for (i = EDGE_COUNT (bb1->preds); i > 0 ; --i)
     {
@@ -1612,7 +1609,19 @@  tail_merge_optimize (unsigned int todo)
   int iteration_nr = 0;
   int max_iterations = PARAM_VALUE (PARAM_MAX_TAIL_MERGE_ITERATIONS);
 
-  if (!flag_tree_tail_merge || max_iterations == 0)
+  if (!flag_tree_tail_merge
+      || max_iterations == 0
+      /* We try to be conservative with respect to loop structure, since:
+	 - the cases where tail-merging could both affect loop structure and be
+	   benificial are rare,
+	 - it prevents us from having to fixup the loops using
+	   loops_state_set (LOOPS_NEED_FIXUP), and
+	 - keeping loop structure may allow us to simplify the pass.
+	 In order to be conservative, we need loop information.	 In rare cases
+	 (about 7 test-cases in the g++ testsuite) there is none (because
+	 loop_optimizer_finalize has been called before tail-merge, and
+	 PROP_loops is not set), so we bail out.  */
+      || current_loops == NULL)
     return 0;
 
   timevar_push (TV_TREE_TAIL_MERGE);
diff --git a/gcc/testsuite/gcc.dg/pr50763.c b/gcc/testsuite/gcc.dg/pr50763.c
index 60025e3..695b61c 100644
--- a/gcc/testsuite/gcc.dg/pr50763.c
+++ b/gcc/testsuite/gcc.dg/pr50763.c
@@ -12,5 +12,5 @@  foo (int c, int d)
   while (c == d);
 }
 
-/* { dg-final { scan-tree-dump-times "== 33" 1 "pre"} } */
+/* { dg-final { scan-tree-dump-times "== 33" 2 "pre"} } */
 /* { dg-final { cleanup-tree-dump "pre" } } */