Patchwork Don't bypass blocks with multiple latch edges (PR middle-end/54838)

login
register
mail settings
Submitter Marek Polacek
Date Nov. 29, 2012, 3:38 p.m.
Message ID <20121129153852.GC10621@redhat.com>
Download mbox | patch
Permalink /patch/202769/
State New
Headers show

Comments

Marek Polacek - Nov. 29, 2012, 3:38 p.m.
On Thu, Nov 29, 2012 at 09:34:31AM +0100, Richard Biener wrote:
> Definitely not - that means to not preserve loops until after cprop.  The goal
> is to preserve loops everywhere!

Yikes, sorry, it wasn't clear to me what PROP_loops really does.  Anyway,
I think I have a better fix now.  The problem is just that when removing
BB 4 (which was a header), we have to zap ->header and ->latch.  We
already have code for this:

  if (current_loops != NULL
      && e->src->loop_father->latch == e->src)
    {
      /* ???  Now we are creating (or may create) a loop
         with multiple entries.  Simply mark it for
         removal.  Alternatively we could not do this
         threading.  */
      e->src->loop_father->header = NULL;
      e->src->loop_father->latch = NULL;
    }

but the thing is that when there are multiple latch edges, then
->latch is NULL.  So we need to keep track of how many latch edges
the header has.  Regtested/bootstrapped on x86_64, ok for trunk?

Can I get rid of may_be_loop_header (and just use n_latch_edges > 0
instead at that one place) in a followup?

2012-11-29  Marek Polacek  <polacek@redhat.com>

	PR middle-end/54838
	* cprop.c (bypass_block): Set header and latch to NULL when
	BB has more than one latch edge.
	(n_latches): New variable.

	* gcc.dg/pr54838.c: New test.


	Marek
Marek Polacek - Nov. 29, 2012, 3:42 p.m.
On Thu, Nov 29, 2012 at 04:38:52PM +0100, Marek Polacek wrote:
> 2012-11-29  Marek Polacek  <polacek@redhat.com>
> 
> 	PR middle-end/54838
> 	* cprop.c (bypass_block): Set header and latch to NULL when
> 	BB has more than one latch edge.
> 	(n_latches): New variable.

Of course here should be (n_latch_edges).

	Marek
Steven Bosscher - Nov. 29, 2012, 3:50 p.m.
On Thu, Nov 29, 2012 at 4:38 PM, Marek Polacek wrote:
> On Thu, Nov 29, 2012 at 09:34:31AM +0100, Richard Biener wrote:
>> Definitely not - that means to not preserve loops until after cprop.  The goal
>> is to preserve loops everywhere!
>
> Yikes, sorry, it wasn't clear to me what PROP_loops really does.  Anyway,
> I think I have a better fix now.  The problem is just that when removing
> BB 4 (which was a header), we have to zap ->header and ->latch.  We
> already have code for this:
>
>   if (current_loops != NULL
>       && e->src->loop_father->latch == e->src)
>     {
>       /* ???  Now we are creating (or may create) a loop
>          with multiple entries.  Simply mark it for
>          removal.  Alternatively we could not do this
>          threading.  */
>       e->src->loop_father->header = NULL;
>       e->src->loop_father->latch = NULL;
>     }
>
> but the thing is that when there are multiple latch edges, then
> ->latch is NULL.  So we need to keep track of how many latch edges
> the header has.  Regtested/bootstrapped on x86_64, ok for trunk?
>
> Can I get rid of may_be_loop_header (and just use n_latch_edges > 0
> instead at that one place) in a followup?
>
> 2012-11-29  Marek Polacek  <>
>
>         PR middle-end/54838
>         * cprop.c (bypass_block): Set header and latch to NULL when
>         BB has more than one latch edge.
>         (n_latches): New variable.

You don't have to mention a new local variable in the ChangeLog.

But FWIW, not all DFS back edges are latches. Maybe name it n_back_edges?



>         * gcc.dg/pr54838.c: New test.
>
> --- gcc/cprop.c.mp      2012-11-29 15:49:53.120524295 +0100
> +++ gcc/cprop.c 2012-11-29 15:50:01.421547832 +0100
> @@ -1499,6 +1499,7 @@ bypass_block (basic_block bb, rtx setcc,
>    int may_be_loop_header;
>    unsigned removed_p;
>    unsigned i;
> +  unsigned n_latch_edges;
>    edge_iterator ei;
>
>    insn = (setcc != NULL) ? setcc : jump;
> @@ -1510,13 +1511,12 @@ bypass_block (basic_block bb, rtx setcc,
>    if (note)
>      find_used_regs (&XEXP (note, 0), NULL);
>
> -  may_be_loop_header = false;
> +  n_latch_edges = 0;
>    FOR_EACH_EDGE (e, ei, bb->preds)
>      if (e->flags & EDGE_DFS_BACK)
> -      {
> -       may_be_loop_header = true;
> -       break;
> -      }
> +      n_latch_edges++;
> +
> +  may_be_loop_header = n_latch_edges > 0;
>
>    change = 0;
>    for (ei = ei_start (bb->preds); (e = ei_safe_edge (ei)); )
> @@ -1605,7 +1605,8 @@ bypass_block (basic_block bb, rtx setcc,
>               && dest != EXIT_BLOCK_PTR)
>              {
>               if (current_loops != NULL
> -                 && e->src->loop_father->latch == e->src)
> +                 && (e->src->loop_father->latch == e->src
> +                     || n_latch_edges > 1))
>                 {
>                   /* ???  Now we are creating (or may create) a loop
>                      with multiple entries.  Simply mark it for

It seems to me that this threading should just not happen. Creating
loops with multiple entries is something to be avoided because most
loop-based optimizations don't work on irreducible regions. So this
affects all passes that run after CPROP, including unrolling, IRA, the
scheduler, etc.

There is already code that tries to avoid creating multi-entry loops:

      /* The irreducible loops created by redirecting of edges entering the
         loop from outside would decrease effectiveness of some of the
         following optimizations, so prevent this.  */
      if (may_be_loop_header
          && !(e->flags & EDGE_DFS_BACK))
        {
          ei_next (&ei);
          continue;
        }

Apparently your test case manages to slip through, and I wonder why.

Ciao!
Steven
Eric Botcazou - Nov. 29, 2012, 5:43 p.m.
> Yikes, sorry, it wasn't clear to me what PROP_loops really does.  Anyway,
> I think I have a better fix now.  The problem is just that when removing
> BB 4 (which was a header), we have to zap ->header and ->latch.  We
> already have code for this:
> 
>   if (current_loops != NULL
>       && e->src->loop_father->latch == e->src)
>     {
>       /* ???  Now we are creating (or may create) a loop
>          with multiple entries.  Simply mark it for
>          removal.  Alternatively we could not do this
>          threading.  */
>       e->src->loop_father->header = NULL;
>       e->src->loop_father->latch = NULL;
>     }
> 
> but the thing is that when there are multiple latch edges, then
> ->latch is NULL.  So we need to keep track of how many latch edges
> the header has.  Regtested/bootstrapped on x86_64, ok for trunk?
> 
> Can I get rid of may_be_loop_header (and just use n_latch_edges > 0
> instead at that one place) in a followup?
> 
> 2012-11-29  Marek Polacek  <polacek@redhat.com>
> 
> 	PR middle-end/54838
> 	* cprop.c (bypass_block): Set header and latch to NULL when
> 	BB has more than one latch edge.
> 	(n_latches): New variable.

This looks good on principle, but can't we do better now that we have the loop 
structure?   Can't we compute is_loop_header instead of may_be_loop_header and 
simplify the condition under which we mark the loop for removal?  Or maybe we 
should call disambiguate_loops_with_multiple_latches on entry of the pass?

Richard, what would be the "modern" approach to solving the problem here?
Richard Guenther - Nov. 30, 2012, 9:01 a.m.
On Thu, Nov 29, 2012 at 6:43 PM, Eric Botcazou <ebotcazou@adacore.com> wrote:
>> Yikes, sorry, it wasn't clear to me what PROP_loops really does.  Anyway,
>> I think I have a better fix now.  The problem is just that when removing
>> BB 4 (which was a header), we have to zap ->header and ->latch.  We
>> already have code for this:
>>
>>   if (current_loops != NULL
>>       && e->src->loop_father->latch == e->src)
>>     {
>>       /* ???  Now we are creating (or may create) a loop
>>          with multiple entries.  Simply mark it for
>>          removal.  Alternatively we could not do this
>>          threading.  */
>>       e->src->loop_father->header = NULL;
>>       e->src->loop_father->latch = NULL;
>>     }
>>
>> but the thing is that when there are multiple latch edges, then
>> ->latch is NULL.  So we need to keep track of how many latch edges
>> the header has.  Regtested/bootstrapped on x86_64, ok for trunk?
>>
>> Can I get rid of may_be_loop_header (and just use n_latch_edges > 0
>> instead at that one place) in a followup?
>>
>> 2012-11-29  Marek Polacek  <polacek@redhat.com>
>>
>>       PR middle-end/54838
>>       * cprop.c (bypass_block): Set header and latch to NULL when
>>       BB has more than one latch edge.
>>       (n_latches): New variable.
>
> This looks good on principle, but can't we do better now that we have the loop
> structure?   Can't we compute is_loop_header instead of may_be_loop_header and
> simplify the condition under which we mark the loop for removal?  Or maybe we
> should call disambiguate_loops_with_multiple_latches on entry of the pass?
>
> Richard, what would be the "modern" approach to solving the problem here?

RTL cprop seems to run both before and after RTL loop optimizers (currently
after RTL loop optimizers we throw away loops - an arbitrary chosen point
before IRA across which I could not get things to work).  Thus you could do

  if (current_loops)
    is_loop_header = bb == bb->loop_father->header;
  else
    {
  may_be_loop_header = false;
  FOR_EACH_EDGE (e, ei, bb->preds)
    if (e->flags & EDGE_DFS_BACK)
      {
        may_be_loop_header = true;
        break;
      }
    }

I don't understand

      /* The irreducible loops created by redirecting of edges entering the
         loop from outside would decrease effectiveness of some of the
         following optimizations, so prevent this.  */
      if (may_be_loop_header
          && !(e->flags & EDGE_DFS_BACK))
        {
          ei_next (&ei);
          continue;
        }

why isn't this simply

      if (may_be_loop_header)
        {
          ei_next (&ei);
          continue;
        }

?  It looks like the code tries to allow "rotating" a loop - but that's only
good if bb has exactly two predecessors (one entry and one latch edge).
And even then it requires to manually update the loop structures (update
what the new header and latch blocks are).

That said, removing the !(e->flags & EDGE_DFS_BACK) condition seems
to fix the ICE.  Threading across a loop header is in fact complicated
(see the special routine tree-ssa-threadupdate.c:thread_through_loop_header
necessary for that).  Let's declare the GIMPLE level did all interesting
threadings through headers.

Btw, if a pass wants to look at anything loop related _besides_ loop headers
it has to call loop_optimizer_init.  The only thing that is guaranteed
to be kept
up-to-date when PROP_loops is set (and thus current_loops != NULL) is
the header field in the loop tree.

Richard.

> --
> Eric Botcazou
Eric Botcazou - Nov. 30, 2012, 9:45 p.m.
> RTL cprop seems to run both before and after RTL loop optimizers (currently
> after RTL loop optimizers we throw away loops - an arbitrary chosen point
> before IRA across which I could not get things to work).  Thus you could do
> 
>   if (current_loops)
>     is_loop_header = bb == bb->loop_father->header;
>   else
>     {
>   may_be_loop_header = false;
>   FOR_EACH_EDGE (e, ei, bb->preds)
>     if (e->flags & EDGE_DFS_BACK)
>       {
>         may_be_loop_header = true;
>         break;
>       }
>     }

OK, let's do something like this then:

  if (current_loops)
    may_be_loop_header = (bb == bb->loop_father->header);
  else
    {
       may_be_loop_header = false;
       FOR_EACH_EDGE (e, ei, bb->preds)
       if (e->flags & EDGE_DFS_BACK)
         {
          may_be_loop_header = true;
          break;
         }
    }

since we cannot always be sure that it's a header.

> I don't understand
> 
>       /* The irreducible loops created by redirecting of edges entering the
>          loop from outside would decrease effectiveness of some of the
>          following optimizations, so prevent this.  */
>       if (may_be_loop_header
>           && !(e->flags & EDGE_DFS_BACK))
>         {
>           ei_next (&ei);
>           continue;
>         }
> 
> why isn't this simply
> 
>       if (may_be_loop_header)
>         {
>           ei_next (&ei);
>           continue;
>         }
> 
> ?  It looks like the code tries to allow "rotating" a loop - but that's only
> good if bb has exactly two predecessors (one entry and one latch edge). And
> even then it requires to manually update the loop structures (update what
> the new header and latch blocks are).

That's the bug.  We have a loop with 2 latch edges, but the loop fixup code in 
bypass_block only handles one.

> That said, removing the !(e->flags & EDGE_DFS_BACK) condition seems
> to fix the ICE.  Threading across a loop header is in fact complicated
> (see the special routine tree-ssa-threadupdate.c:thread_through_loop_header
> necessary for that).

Scary enough indeed.  But this seems to work fine here if the loop has exactly 
one latch edge, so we don't need to change that.  Removing the above condition 
is equivalent to early returning from bypass_block for all potential headers.

> Let's declare the GIMPLE level did all interesting threadings through
> headers.

The testcase is precisely a counterexample with 2 latch edges.

But OK, let's punt if there is more than a single (potential) latch edge.
Eric Botcazou - Nov. 30, 2012, 10 p.m.
> Yikes, sorry, it wasn't clear to me what PROP_loops really does.  Anyway,
> I think I have a better fix now.  The problem is just that when removing
> BB 4 (which was a header), we have to zap ->header and ->latch.  We
> already have code for this:
> 
>   if (current_loops != NULL
>       && e->src->loop_father->latch == e->src)
>     {
>       /* ???  Now we are creating (or may create) a loop
>          with multiple entries.  Simply mark it for
>          removal.  Alternatively we could not do this
>          threading.  */
>       e->src->loop_father->header = NULL;
>       e->src->loop_father->latch = NULL;
>     }
> 
> but the thing is that when there are multiple latch edges, then
> ->latch is NULL.  So we need to keep track of how many latch edges
> the header has.  Regtested/bootstrapped on x86_64, ok for trunk?
> 
> Can I get rid of may_be_loop_header (and just use n_latch_edges > 0
> instead at that one place) in a followup?
> 
> 2012-11-29  Marek Polacek  <polacek@redhat.com>
> 
> 	PR middle-end/54838
> 	* cprop.c (bypass_block): Set header and latch to NULL when
> 	BB has more than one latch edge.
> 	(n_latches): New variable.

OK, let's tweak the patch as follows:
 1) when current_loops is not NULL, we compute may_be_loop_header and whether 
the loop has more than 1 latch edge exactly,
 2) when current_loops is NULL, we use your above method to do the same,
 3) once this is done, we return from the function before entering the loop if 
this is a (potential) header with more than 1 (potential) latch edge.  The 
comment can say that threading through a loop header with more than 1 latch 
edge is delicate and cite tree-threadupdate.c:thread_through_loop_header.

Patch

--- gcc/cprop.c.mp	2012-11-29 15:49:53.120524295 +0100
+++ gcc/cprop.c	2012-11-29 15:50:01.421547832 +0100
@@ -1499,6 +1499,7 @@  bypass_block (basic_block bb, rtx setcc,
   int may_be_loop_header;
   unsigned removed_p;
   unsigned i;
+  unsigned n_latch_edges;
   edge_iterator ei;
 
   insn = (setcc != NULL) ? setcc : jump;
@@ -1510,13 +1511,12 @@  bypass_block (basic_block bb, rtx setcc,
   if (note)
     find_used_regs (&XEXP (note, 0), NULL);
 
-  may_be_loop_header = false;
+  n_latch_edges = 0;
   FOR_EACH_EDGE (e, ei, bb->preds)
     if (e->flags & EDGE_DFS_BACK)
-      {
-	may_be_loop_header = true;
-	break;
-      }
+      n_latch_edges++;
+
+  may_be_loop_header = n_latch_edges > 0;
 
   change = 0;
   for (ei = ei_start (bb->preds); (e = ei_safe_edge (ei)); )
@@ -1605,7 +1605,8 @@  bypass_block (basic_block bb, rtx setcc,
 	      && dest != EXIT_BLOCK_PTR)
             {
 	      if (current_loops != NULL
-		  && e->src->loop_father->latch == e->src)
+		  && (e->src->loop_father->latch == e->src
+		      || n_latch_edges > 1))
 		{
 		  /* ???  Now we are creating (or may create) a loop
 		     with multiple entries.  Simply mark it for
--- gcc/testsuite/gcc.dg/pr54838.c.mp	2012-11-26 14:48:43.783980854 +0100
+++ gcc/testsuite/gcc.dg/pr54838.c	2012-11-26 14:49:51.051158719 +0100
@@ -0,0 +1,24 @@ 
+/* PR middle-end/54838 */
+/* { dg-do compile } */
+/* { dg-options "-O2 -fno-forward-propagate -ftracer" } */
+
+void bar (void);
+
+void
+foo (void *b, int *c)
+{
+again:
+  switch (*c)
+    {
+    case 1:
+      if (!b)
+	{
+	  bar ();
+	  return;
+	}
+      goto again;
+    case 3:
+      if (!b)
+	goto again;
+    }
+}