Message ID | 20121129153852.GC10621@redhat.com |
---|---|
State | New |
Headers | show |
On Thu, Nov 29, 2012 at 04:38:52PM +0100, Marek Polacek wrote: > 2012-11-29 Marek Polacek <polacek@redhat.com> > > PR middle-end/54838 > * cprop.c (bypass_block): Set header and latch to NULL when > BB has more than one latch edge. > (n_latches): New variable. Of course here should be (n_latch_edges). Marek
On Thu, Nov 29, 2012 at 4:38 PM, Marek Polacek wrote: > On Thu, Nov 29, 2012 at 09:34:31AM +0100, Richard Biener wrote: >> Definitely not - that means to not preserve loops until after cprop. The goal >> is to preserve loops everywhere! > > Yikes, sorry, it wasn't clear to me what PROP_loops really does. Anyway, > I think I have a better fix now. The problem is just that when removing > BB 4 (which was a header), we have to zap ->header and ->latch. We > already have code for this: > > if (current_loops != NULL > && e->src->loop_father->latch == e->src) > { > /* ??? Now we are creating (or may create) a loop > with multiple entries. Simply mark it for > removal. Alternatively we could not do this > threading. */ > e->src->loop_father->header = NULL; > e->src->loop_father->latch = NULL; > } > > but the thing is that when there are multiple latch edges, then > ->latch is NULL. So we need to keep track of how many latch edges > the header has. Regtested/bootstrapped on x86_64, ok for trunk? > > Can I get rid of may_be_loop_header (and just use n_latch_edges > 0 > instead at that one place) in a followup? > > 2012-11-29 Marek Polacek <> > > PR middle-end/54838 > * cprop.c (bypass_block): Set header and latch to NULL when > BB has more than one latch edge. > (n_latches): New variable. You don't have to mention a new local variable in the ChangeLog. But FWIW, not all DFS back edges are latches. Maybe name it n_back_edges? > * gcc.dg/pr54838.c: New test. > > --- gcc/cprop.c.mp 2012-11-29 15:49:53.120524295 +0100 > +++ gcc/cprop.c 2012-11-29 15:50:01.421547832 +0100 > @@ -1499,6 +1499,7 @@ bypass_block (basic_block bb, rtx setcc, > int may_be_loop_header; > unsigned removed_p; > unsigned i; > + unsigned n_latch_edges; > edge_iterator ei; > > insn = (setcc != NULL) ? setcc : jump; > @@ -1510,13 +1511,12 @@ bypass_block (basic_block bb, rtx setcc, > if (note) > find_used_regs (&XEXP (note, 0), NULL); > > - may_be_loop_header = false; > + n_latch_edges = 0; > FOR_EACH_EDGE (e, ei, bb->preds) > if (e->flags & EDGE_DFS_BACK) > - { > - may_be_loop_header = true; > - break; > - } > + n_latch_edges++; > + > + may_be_loop_header = n_latch_edges > 0; > > change = 0; > for (ei = ei_start (bb->preds); (e = ei_safe_edge (ei)); ) > @@ -1605,7 +1605,8 @@ bypass_block (basic_block bb, rtx setcc, > && dest != EXIT_BLOCK_PTR) > { > if (current_loops != NULL > - && e->src->loop_father->latch == e->src) > + && (e->src->loop_father->latch == e->src > + || n_latch_edges > 1)) > { > /* ??? Now we are creating (or may create) a loop > with multiple entries. Simply mark it for It seems to me that this threading should just not happen. Creating loops with multiple entries is something to be avoided because most loop-based optimizations don't work on irreducible regions. So this affects all passes that run after CPROP, including unrolling, IRA, the scheduler, etc. There is already code that tries to avoid creating multi-entry loops: /* The irreducible loops created by redirecting of edges entering the loop from outside would decrease effectiveness of some of the following optimizations, so prevent this. */ if (may_be_loop_header && !(e->flags & EDGE_DFS_BACK)) { ei_next (&ei); continue; } Apparently your test case manages to slip through, and I wonder why. Ciao! Steven
> Yikes, sorry, it wasn't clear to me what PROP_loops really does. Anyway, > I think I have a better fix now. The problem is just that when removing > BB 4 (which was a header), we have to zap ->header and ->latch. We > already have code for this: > > if (current_loops != NULL > && e->src->loop_father->latch == e->src) > { > /* ??? Now we are creating (or may create) a loop > with multiple entries. Simply mark it for > removal. Alternatively we could not do this > threading. */ > e->src->loop_father->header = NULL; > e->src->loop_father->latch = NULL; > } > > but the thing is that when there are multiple latch edges, then > ->latch is NULL. So we need to keep track of how many latch edges > the header has. Regtested/bootstrapped on x86_64, ok for trunk? > > Can I get rid of may_be_loop_header (and just use n_latch_edges > 0 > instead at that one place) in a followup? > > 2012-11-29 Marek Polacek <polacek@redhat.com> > > PR middle-end/54838 > * cprop.c (bypass_block): Set header and latch to NULL when > BB has more than one latch edge. > (n_latches): New variable. This looks good on principle, but can't we do better now that we have the loop structure? Can't we compute is_loop_header instead of may_be_loop_header and simplify the condition under which we mark the loop for removal? Or maybe we should call disambiguate_loops_with_multiple_latches on entry of the pass? Richard, what would be the "modern" approach to solving the problem here?
On Thu, Nov 29, 2012 at 6:43 PM, Eric Botcazou <ebotcazou@adacore.com> wrote: >> Yikes, sorry, it wasn't clear to me what PROP_loops really does. Anyway, >> I think I have a better fix now. The problem is just that when removing >> BB 4 (which was a header), we have to zap ->header and ->latch. We >> already have code for this: >> >> if (current_loops != NULL >> && e->src->loop_father->latch == e->src) >> { >> /* ??? Now we are creating (or may create) a loop >> with multiple entries. Simply mark it for >> removal. Alternatively we could not do this >> threading. */ >> e->src->loop_father->header = NULL; >> e->src->loop_father->latch = NULL; >> } >> >> but the thing is that when there are multiple latch edges, then >> ->latch is NULL. So we need to keep track of how many latch edges >> the header has. Regtested/bootstrapped on x86_64, ok for trunk? >> >> Can I get rid of may_be_loop_header (and just use n_latch_edges > 0 >> instead at that one place) in a followup? >> >> 2012-11-29 Marek Polacek <polacek@redhat.com> >> >> PR middle-end/54838 >> * cprop.c (bypass_block): Set header and latch to NULL when >> BB has more than one latch edge. >> (n_latches): New variable. > > This looks good on principle, but can't we do better now that we have the loop > structure? Can't we compute is_loop_header instead of may_be_loop_header and > simplify the condition under which we mark the loop for removal? Or maybe we > should call disambiguate_loops_with_multiple_latches on entry of the pass? > > Richard, what would be the "modern" approach to solving the problem here? RTL cprop seems to run both before and after RTL loop optimizers (currently after RTL loop optimizers we throw away loops - an arbitrary chosen point before IRA across which I could not get things to work). Thus you could do if (current_loops) is_loop_header = bb == bb->loop_father->header; else { may_be_loop_header = false; FOR_EACH_EDGE (e, ei, bb->preds) if (e->flags & EDGE_DFS_BACK) { may_be_loop_header = true; break; } } I don't understand /* The irreducible loops created by redirecting of edges entering the loop from outside would decrease effectiveness of some of the following optimizations, so prevent this. */ if (may_be_loop_header && !(e->flags & EDGE_DFS_BACK)) { ei_next (&ei); continue; } why isn't this simply if (may_be_loop_header) { ei_next (&ei); continue; } ? It looks like the code tries to allow "rotating" a loop - but that's only good if bb has exactly two predecessors (one entry and one latch edge). And even then it requires to manually update the loop structures (update what the new header and latch blocks are). That said, removing the !(e->flags & EDGE_DFS_BACK) condition seems to fix the ICE. Threading across a loop header is in fact complicated (see the special routine tree-ssa-threadupdate.c:thread_through_loop_header necessary for that). Let's declare the GIMPLE level did all interesting threadings through headers. Btw, if a pass wants to look at anything loop related _besides_ loop headers it has to call loop_optimizer_init. The only thing that is guaranteed to be kept up-to-date when PROP_loops is set (and thus current_loops != NULL) is the header field in the loop tree. Richard. > -- > Eric Botcazou
> RTL cprop seems to run both before and after RTL loop optimizers (currently > after RTL loop optimizers we throw away loops - an arbitrary chosen point > before IRA across which I could not get things to work). Thus you could do > > if (current_loops) > is_loop_header = bb == bb->loop_father->header; > else > { > may_be_loop_header = false; > FOR_EACH_EDGE (e, ei, bb->preds) > if (e->flags & EDGE_DFS_BACK) > { > may_be_loop_header = true; > break; > } > } OK, let's do something like this then: if (current_loops) may_be_loop_header = (bb == bb->loop_father->header); else { may_be_loop_header = false; FOR_EACH_EDGE (e, ei, bb->preds) if (e->flags & EDGE_DFS_BACK) { may_be_loop_header = true; break; } } since we cannot always be sure that it's a header. > I don't understand > > /* The irreducible loops created by redirecting of edges entering the > loop from outside would decrease effectiveness of some of the > following optimizations, so prevent this. */ > if (may_be_loop_header > && !(e->flags & EDGE_DFS_BACK)) > { > ei_next (&ei); > continue; > } > > why isn't this simply > > if (may_be_loop_header) > { > ei_next (&ei); > continue; > } > > ? It looks like the code tries to allow "rotating" a loop - but that's only > good if bb has exactly two predecessors (one entry and one latch edge). And > even then it requires to manually update the loop structures (update what > the new header and latch blocks are). That's the bug. We have a loop with 2 latch edges, but the loop fixup code in bypass_block only handles one. > That said, removing the !(e->flags & EDGE_DFS_BACK) condition seems > to fix the ICE. Threading across a loop header is in fact complicated > (see the special routine tree-ssa-threadupdate.c:thread_through_loop_header > necessary for that). Scary enough indeed. But this seems to work fine here if the loop has exactly one latch edge, so we don't need to change that. Removing the above condition is equivalent to early returning from bypass_block for all potential headers. > Let's declare the GIMPLE level did all interesting threadings through > headers. The testcase is precisely a counterexample with 2 latch edges. But OK, let's punt if there is more than a single (potential) latch edge.
> Yikes, sorry, it wasn't clear to me what PROP_loops really does. Anyway, > I think I have a better fix now. The problem is just that when removing > BB 4 (which was a header), we have to zap ->header and ->latch. We > already have code for this: > > if (current_loops != NULL > && e->src->loop_father->latch == e->src) > { > /* ??? Now we are creating (or may create) a loop > with multiple entries. Simply mark it for > removal. Alternatively we could not do this > threading. */ > e->src->loop_father->header = NULL; > e->src->loop_father->latch = NULL; > } > > but the thing is that when there are multiple latch edges, then > ->latch is NULL. So we need to keep track of how many latch edges > the header has. Regtested/bootstrapped on x86_64, ok for trunk? > > Can I get rid of may_be_loop_header (and just use n_latch_edges > 0 > instead at that one place) in a followup? > > 2012-11-29 Marek Polacek <polacek@redhat.com> > > PR middle-end/54838 > * cprop.c (bypass_block): Set header and latch to NULL when > BB has more than one latch edge. > (n_latches): New variable. OK, let's tweak the patch as follows: 1) when current_loops is not NULL, we compute may_be_loop_header and whether the loop has more than 1 latch edge exactly, 2) when current_loops is NULL, we use your above method to do the same, 3) once this is done, we return from the function before entering the loop if this is a (potential) header with more than 1 (potential) latch edge. The comment can say that threading through a loop header with more than 1 latch edge is delicate and cite tree-threadupdate.c:thread_through_loop_header.
--- gcc/cprop.c.mp 2012-11-29 15:49:53.120524295 +0100 +++ gcc/cprop.c 2012-11-29 15:50:01.421547832 +0100 @@ -1499,6 +1499,7 @@ bypass_block (basic_block bb, rtx setcc, int may_be_loop_header; unsigned removed_p; unsigned i; + unsigned n_latch_edges; edge_iterator ei; insn = (setcc != NULL) ? setcc : jump; @@ -1510,13 +1511,12 @@ bypass_block (basic_block bb, rtx setcc, if (note) find_used_regs (&XEXP (note, 0), NULL); - may_be_loop_header = false; + n_latch_edges = 0; FOR_EACH_EDGE (e, ei, bb->preds) if (e->flags & EDGE_DFS_BACK) - { - may_be_loop_header = true; - break; - } + n_latch_edges++; + + may_be_loop_header = n_latch_edges > 0; change = 0; for (ei = ei_start (bb->preds); (e = ei_safe_edge (ei)); ) @@ -1605,7 +1605,8 @@ bypass_block (basic_block bb, rtx setcc, && dest != EXIT_BLOCK_PTR) { if (current_loops != NULL - && e->src->loop_father->latch == e->src) + && (e->src->loop_father->latch == e->src + || n_latch_edges > 1)) { /* ??? Now we are creating (or may create) a loop with multiple entries. Simply mark it for --- gcc/testsuite/gcc.dg/pr54838.c.mp 2012-11-26 14:48:43.783980854 +0100 +++ gcc/testsuite/gcc.dg/pr54838.c 2012-11-26 14:49:51.051158719 +0100 @@ -0,0 +1,24 @@ +/* PR middle-end/54838 */ +/* { dg-do compile } */ +/* { dg-options "-O2 -fno-forward-propagate -ftracer" } */ + +void bar (void); + +void +foo (void *b, int *c) +{ +again: + switch (*c) + { + case 1: + if (!b) + { + bar (); + return; + } + goto again; + case 3: + if (!b) + goto again; + } +}