diff mbox

[IRA] Really record loop exits

Message ID CABu31nOwifrw=ZY43rGs2T9ixkW76z4ureCj8Q_EhGXLgvtUqQ@mail.gmail.com
State New
Headers show

Commit Message

Steven Bosscher Oct. 11, 2012, 8:17 p.m. UTC
Hello,

IRA uses record_loop_exits() to cache the loop exit edges, but due to
a code ordering bug the edges are not actually recorded.
record_loop_exits() starts with:

  if (!current_loops)
    return;

So ira.c should set current_loops before calling record_loop_exits.
With the current order, the exists are not actually recorded. Also,
ira.c should release recorded exits and loops, this releases a
considerable amount of GC memory that can be re-used in the next IRA
iteration.

It's not clear to me why IRA has to re-compute the cfgloop tree from
scratch, though, so I've added a "???" comment for that.

Bootstrapped&tested (with and without checking) on
powerpc64-unknown-linux-gnu. OK for trunk?

Ciao!
Steven

        * ira.c (ira): Set current_loops to &ira_loops before recording
        loop exits.  Release recorded exits and loops early.

Comments

Vladimir Makarov Oct. 12, 2012, 4:16 a.m. UTC | #1
On 12-10-11 4:17 PM, Steven Bosscher wrote:
> Hello,
>
> IRA uses record_loop_exits() to cache the loop exit edges, but due to
> a code ordering bug the edges are not actually recorded.
> record_loop_exits() starts with:
>
>    if (!current_loops)
>      return;
I have no idea why record_loop_exits is here.  Loop exits are not used 
in IRA.  I think it should be removed.
> So ira.c should set current_loops before calling record_loop_exits.
> With the current order, the exists are not actually recorded. Also,
> ira.c should release recorded exits and loops, this releases a
> considerable amount of GC memory that can be re-used in the next IRA
> iteration.
>
> It's not clear to me why IRA has to re-compute the cfgloop tree from
> scratch, though, so I've added a "???" comment for that.
I am not a specialist in cfgloop code.  I did not find how to update it 
incrementally.  So I just rebuild it.

My be you can advice how to do this.  IRA can generate insns on the BB 
edges, it results in creation of new BBs on the region exits/enters.
> Bootstrapped&tested (with and without checking) on
> powerpc64-unknown-linux-gnu. OK for trunk?
>
>
Steven Bosscher Oct. 12, 2012, 8:12 a.m. UTC | #2
On Fri, Oct 12, 2012 at 6:16 AM, Vladimir Makarov wrote:
> On 12-10-11 4:17 PM, Steven Bosscher wrote:
>>
>> Hello,
>>
>> IRA uses record_loop_exits() to cache the loop exit edges, but due to
>> a code ordering bug the edges are not actually recorded.
>> record_loop_exits() starts with:
>>
>>    if (!current_loops)
>>      return;
>
> I have no idea why record_loop_exits is here.  Loop exits are not used in
> IRA.  I think it should be removed.

Actually they are used alright:

$ grep -n get_loop_exit_edges ira*
ira-build.c:169:          edges = get_loop_exit_edges (loop);
ira-build.c:1774:      edges = get_loop_exit_edges (loop_node->loop);
ira-build.c:1972:  edges = get_loop_exit_edges (loop);
ira-color.c:2023:      edges = get_loop_exit_edges (loop_node->loop);

Especially the last one, in ira_loop_edge_freq, seems to be called
frequently enough to justify caching loop exit edges.


>> So ira.c should set current_loops before calling record_loop_exits.
>> With the current order, the exists are not actually recorded. Also,
>> ira.c should release recorded exits and loops, this releases a
>> considerable amount of GC memory that can be re-used in the next IRA
>> iteration.
>>
>> It's not clear to me why IRA has to re-compute the cfgloop tree from
>> scratch, though, so I've added a "???" comment for that.
>
> I am not a specialist in cfgloop code.  I did not find how to update it
> incrementally.  So I just rebuild it.
>
> My be you can advice how to do this.  IRA can generate insns on the BB
> edges, it results in creation of new BBs on the region exits/enters.

AFAIU this mostly happens automatically for simple edge splitting
operations, and for more complex transformations there is
fix_loop_structure. But I'm not sure, perhaps Richi can give some
advice.

Ciao!
Steven
Richard Biener Oct. 12, 2012, 8:56 a.m. UTC | #3
On Fri, 12 Oct 2012, Steven Bosscher wrote:

> On Fri, Oct 12, 2012 at 6:16 AM, Vladimir Makarov wrote:
> > On 12-10-11 4:17 PM, Steven Bosscher wrote:
> >>
> >> Hello,
> >>
> >> IRA uses record_loop_exits() to cache the loop exit edges, but due to
> >> a code ordering bug the edges are not actually recorded.
> >> record_loop_exits() starts with:
> >>
> >>    if (!current_loops)
> >>      return;
> >
> > I have no idea why record_loop_exits is here.  Loop exits are not used in
> > IRA.  I think it should be removed.
> 
> Actually they are used alright:
> 
> $ grep -n get_loop_exit_edges ira*
> ira-build.c:169:          edges = get_loop_exit_edges (loop);
> ira-build.c:1774:      edges = get_loop_exit_edges (loop_node->loop);
> ira-build.c:1972:  edges = get_loop_exit_edges (loop);
> ira-color.c:2023:      edges = get_loop_exit_edges (loop_node->loop);
> 
> Especially the last one, in ira_loop_edge_freq, seems to be called
> frequently enough to justify caching loop exit edges.
> 
> 
> >> So ira.c should set current_loops before calling record_loop_exits.
> >> With the current order, the exists are not actually recorded. Also,
> >> ira.c should release recorded exits and loops, this releases a
> >> considerable amount of GC memory that can be re-used in the next IRA
> >> iteration.
> >>
> >> It's not clear to me why IRA has to re-compute the cfgloop tree from
> >> scratch, though, so I've added a "???" comment for that.
> >
> > I am not a specialist in cfgloop code.  I did not find how to update it
> > incrementally.  So I just rebuild it.
> >
> > My be you can advice how to do this.  IRA can generate insns on the BB
> > edges, it results in creation of new BBs on the region exits/enters.
> 
> AFAIU this mostly happens automatically for simple edge splitting
> operations, and for more complex transformations there is
> fix_loop_structure. But I'm not sure, perhaps Richi can give some
> advice.

Just something I noticed when working on preserving loop structures.
IRA re-builds it in a weird way instead of simply using
loop_optimizer_init (), not sure why:

  ira_assert (current_loops == NULL);
  if (flag_ira_region == IRA_REGION_ALL || flag_ira_region == 
IRA_REGION_MIXED)
    {
      flow_loops_find (&ira_loops);
      record_loop_exits ();
      current_loops = &ira_loops;

why does it have 'ira_loops' at all, instead of using current_loops!?

I noticed that when removing all this because loops are preserved
that inbetween IRA and reload loop structure verification would
fail (not sure it's a good idea to have separate passes if
the state inbetween is "broken").

And yes, the cfg manipulation primitives update loops.  fixup is
necessary in case you explicitely remove loops or the cfg
manipulation primitives "give up" and schedule loops for removal.

See how RTL cfgcleanup handles this now (yes, how ugly ...).

If IRA makes serious use of loops then it should arrange to
use the loop structures that are still available (ok, I destroy
them after RTL loop transforms currently).  IRA could be the
point where we destroy loops, but IIRC even machine-reorg
likes to know loops.

Richard.
Vladimir Makarov Oct. 12, 2012, 1:31 p.m. UTC | #4
On 12-10-12 4:12 AM, Steven Bosscher wrote:
> On Fri, Oct 12, 2012 at 6:16 AM, Vladimir Makarov wrote:
>> On 12-10-11 4:17 PM, Steven Bosscher wrote:
>>> Hello,
>>>
>>> IRA uses record_loop_exits() to cache the loop exit edges, but due to
>>> a code ordering bug the edges are not actually recorded.
>>> record_loop_exits() starts with:
>>>
>>>     if (!current_loops)
>>>       return;
>> I have no idea why record_loop_exits is here.  Loop exits are not used in
>> IRA.  I think it should be removed.
> Actually they are used alright:
>
> $ grep -n get_loop_exit_edges ira*
> ira-build.c:169:          edges = get_loop_exit_edges (loop);
> ira-build.c:1774:      edges = get_loop_exit_edges (loop_node->loop);
> ira-build.c:1972:  edges = get_loop_exit_edges (loop);
> ira-color.c:2023:      edges = get_loop_exit_edges (loop_node->loop);
>
> Especially the last one, in ira_loop_edge_freq, seems to be called
> frequently enough to justify caching loop exit edges.
Ops.  Sorry, Steven.  I did a wrong conclusion because I thought I would 
have found such code generation problem if it had an affect.

I've just tried about 20 tests and did not find a difference in code 
generation when I moved current_loops assignment above record_loop_exits ().

   I did some investigation and it seems to me it is important to have 
the right loop exits when such exits are abnormal edges.  It would be 
problem if there is a generation of reg shuffle insns on such edges.  
But it does not happen (if it happens it should result in compiler 
crash), probably IRA makes the same allocation of pseudos somehow on 
different ends of such edges or such cases are extremely rare.

Another problem would be a possibility of different allocation when loop 
has a pseudo living at the exit edges but not on the enters which I 
think is also a rare situation.

Still it is good to have right loop exits for some rare corner cases.

So your patch is ok.  Thanks for working on this.

>
>>> So ira.c should set current_loops before calling record_loop_exits.
>>> With the current order, the exists are not actually recorded. Also,
>>> ira.c should release recorded exits and loops, this releases a
>>> considerable amount of GC memory that can be re-used in the next IRA
>>> iteration.
>>>
>>> It's not clear to me why IRA has to re-compute the cfgloop tree from
>>> scratch, though, so I've added a "???" comment for that.
>> I am not a specialist in cfgloop code.  I did not find how to update it
>> incrementally.  So I just rebuild it.
>>
>> My be you can advice how to do this.  IRA can generate insns on the BB
>> edges, it results in creation of new BBs on the region exits/enters.
> AFAIU this mostly happens automatically for simple edge splitting
> operations, and for more complex transformations there is
> fix_loop_structure. But I'm not sure, perhaps Richi can give some
> advice.
>
Vladimir Makarov Oct. 12, 2012, 1:56 p.m. UTC | #5
On 12-10-12 4:56 AM, Richard Biener wrote:
> On Fri, 12 Oct 2012, Steven Bosscher wrote:
>
>> On Fri, Oct 12, 2012 at 6:16 AM, Vladimir Makarov wrote:
>>> On 12-10-11 4:17 PM, Steven Bosscher wrote:
>>>> Hello,
>>>>
>>>> IRA uses record_loop_exits() to cache the loop exit edges, but due to
>>>> a code ordering bug the edges are not actually recorded.
>>>> record_loop_exits() starts with:
>>>>
>>>>     if (!current_loops)
>>>>       return;
>>> I have no idea why record_loop_exits is here.  Loop exits are not used in
>>> IRA.  I think it should be removed.
>> Actually they are used alright:
>>
>> $ grep -n get_loop_exit_edges ira*
>> ira-build.c:169:          edges = get_loop_exit_edges (loop);
>> ira-build.c:1774:      edges = get_loop_exit_edges (loop_node->loop);
>> ira-build.c:1972:  edges = get_loop_exit_edges (loop);
>> ira-color.c:2023:      edges = get_loop_exit_edges (loop_node->loop);
>>
>> Especially the last one, in ira_loop_edge_freq, seems to be called
>> frequently enough to justify caching loop exit edges.
>>
>>
>>>> So ira.c should set current_loops before calling record_loop_exits.
>>>> With the current order, the exists are not actually recorded. Also,
>>>> ira.c should release recorded exits and loops, this releases a
>>>> considerable amount of GC memory that can be re-used in the next IRA
>>>> iteration.
>>>>
>>>> It's not clear to me why IRA has to re-compute the cfgloop tree from
>>>> scratch, though, so I've added a "???" comment for that.
>>> I am not a specialist in cfgloop code.  I did not find how to update it
>>> incrementally.  So I just rebuild it.
>>>
>>> My be you can advice how to do this.  IRA can generate insns on the BB
>>> edges, it results in creation of new BBs on the region exits/enters.
>> AFAIU this mostly happens automatically for simple edge splitting
>> operations, and for more complex transformations there is
>> fix_loop_structure. But I'm not sure, perhaps Richi can give some
>> advice.
> Just something I noticed when working on preserving loop structures.
> IRA re-builds it in a weird way instead of simply using
> loop_optimizer_init (), not sure why:
>
>    ira_assert (current_loops == NULL);
>    if (flag_ira_region == IRA_REGION_ALL || flag_ira_region ==
> IRA_REGION_MIXED)
>      {
>        flow_loops_find (&ira_loops);
>        record_loop_exits ();
>        current_loops = &ira_loops;
>
> why does it have 'ira_loops' at all, instead of using current_loops!?
Some code in IRA is very old.  So I don't remember already reasons for this.

I'd appreciate any your patch switching to current_loops.
> I noticed that when removing all this because loops are preserved
> that inbetween IRA and reload loop structure verification would
> fail (not sure it's a good idea to have separate passes if
> the state inbetween is "broken").
> And yes, the cfg manipulation primitives update loops.  fixup is
> necessary in case you explicitely remove loops or the cfg
> manipulation primitives "give up" and schedule loops for removal.
As I wrote IRA only generates new BBs because it is putting sometimes 
insns on the edges.
If there is a real saving just updating loop structures not rebuilding 
it.  I'd prefer it.
But again I am not well familiar with cfgloop code and  will welcome any 
patch from others.

On the other hand, common knowledge says me that there will be no 
significant savings as finding loops is pretty fast because it works on 
BBs (not insns, not insns operands, not insn alternatives).  May be it 
has a sense when you need recalculate loop info frequently as in loop 
optimizations.
> See how RTL cfgcleanup handles this now (yes, how ugly ...).
>
> If IRA makes serious use of loops then it should arrange to
> use the loop structures that are still available (ok, I destroy
> them after RTL loop transforms currently).  IRA could be the
> point where we destroy loops, but IIRC even machine-reorg
> likes to know loops.
>
>
Yes, some info about loops (like BB depth in loops) would be useful for 
different heuristics even in later passes.
Richard Biener Oct. 12, 2012, 2:06 p.m. UTC | #6
On Fri, 12 Oct 2012, Vladimir Makarov wrote:

> On 12-10-12 4:56 AM, Richard Biener wrote:
> > On Fri, 12 Oct 2012, Steven Bosscher wrote:
> > 
> > > On Fri, Oct 12, 2012 at 6:16 AM, Vladimir Makarov wrote:
> > > > On 12-10-11 4:17 PM, Steven Bosscher wrote:
> > > > > Hello,
> > > > > 
> > > > > IRA uses record_loop_exits() to cache the loop exit edges, but due to
> > > > > a code ordering bug the edges are not actually recorded.
> > > > > record_loop_exits() starts with:
> > > > > 
> > > > >     if (!current_loops)
> > > > >       return;
> > > > I have no idea why record_loop_exits is here.  Loop exits are not used
> > > > in
> > > > IRA.  I think it should be removed.
> > > Actually they are used alright:
> > > 
> > > $ grep -n get_loop_exit_edges ira*
> > > ira-build.c:169:          edges = get_loop_exit_edges (loop);
> > > ira-build.c:1774:      edges = get_loop_exit_edges (loop_node->loop);
> > > ira-build.c:1972:  edges = get_loop_exit_edges (loop);
> > > ira-color.c:2023:      edges = get_loop_exit_edges (loop_node->loop);
> > > 
> > > Especially the last one, in ira_loop_edge_freq, seems to be called
> > > frequently enough to justify caching loop exit edges.
> > > 
> > > 
> > > > > So ira.c should set current_loops before calling record_loop_exits.
> > > > > With the current order, the exists are not actually recorded. Also,
> > > > > ira.c should release recorded exits and loops, this releases a
> > > > > considerable amount of GC memory that can be re-used in the next IRA
> > > > > iteration.
> > > > > 
> > > > > It's not clear to me why IRA has to re-compute the cfgloop tree from
> > > > > scratch, though, so I've added a "???" comment for that.
> > > > I am not a specialist in cfgloop code.  I did not find how to update it
> > > > incrementally.  So I just rebuild it.
> > > > 
> > > > My be you can advice how to do this.  IRA can generate insns on the BB
> > > > edges, it results in creation of new BBs on the region exits/enters.
> > > AFAIU this mostly happens automatically for simple edge splitting
> > > operations, and for more complex transformations there is
> > > fix_loop_structure. But I'm not sure, perhaps Richi can give some
> > > advice.
> > Just something I noticed when working on preserving loop structures.
> > IRA re-builds it in a weird way instead of simply using
> > loop_optimizer_init (), not sure why:
> > 
> >    ira_assert (current_loops == NULL);
> >    if (flag_ira_region == IRA_REGION_ALL || flag_ira_region ==
> > IRA_REGION_MIXED)
> >      {
> >        flow_loops_find (&ira_loops);
> >        record_loop_exits ();
> >        current_loops = &ira_loops;
> > 
> > why does it have 'ira_loops' at all, instead of using current_loops!?
> Some code in IRA is very old.  So I don't remember already reasons for this.
> 
> I'd appreciate any your patch switching to current_loops.
> > I noticed that when removing all this because loops are preserved
> > that inbetween IRA and reload loop structure verification would
> > fail (not sure it's a good idea to have separate passes if
> > the state inbetween is "broken").
> > And yes, the cfg manipulation primitives update loops.  fixup is
> > necessary in case you explicitely remove loops or the cfg
> > manipulation primitives "give up" and schedule loops for removal.
> As I wrote IRA only generates new BBs because it is putting sometimes insns on
> the edges.
> If there is a real saving just updating loop structures not rebuilding it.
> I'd prefer it.
> But again I am not well familiar with cfgloop code and  will welcome any patch
> from others.
> 
> On the other hand, common knowledge says me that there will be no significant
> savings as finding loops is pretty fast because it works on BBs (not insns,
> not insns operands, not insn alternatives).  May be it has a sense when you
> need recalculate loop info frequently as in loop optimizations.

The point is not to save recalculation time but to be able to associate
information with loops that is preserved across passes.

I'll see if I can make IRA use current_loops.

Richard.
Steven Bosscher Oct. 12, 2012, 2:16 p.m. UTC | #7
On Fri, Oct 12, 2012 at 3:31 PM, Vladimir Makarov wrote:
> Ops.  Sorry, Steven.  I did a wrong conclusion because I thought I would
> have found such code generation problem if it had an affect.

Oh, the patch shouldn't (and doesn't) change the generated code, that
is not how cfgloops works: record_loop_exits is just a cache.

If you use record_loop_exits, cfgloops sets up a cache of the loop
exits so that if you look them up frequently it can return the VEC of
edges faster. But if you don't use record_loop_exits, it still returns
the same set of edges (and in the same order, too). It just will be
slower because cfgloop will search for the loop exit edges by loading
the loop body and scanning the successor edges of the loop body basic
blocks.


> So your patch is ok.  Thanks for working on this.

I'll commit the patch later this weekend. Thanks for the quick review!

Ciao!
Steven
diff mbox

Patch

Index: ira.c
===================================================================
--- ira.c       (revision 192377)
+++ ira.c       (working copy)
@@ -4233,8 +4233,8 @@  ira (FILE *f)
   if (flag_ira_region == IRA_REGION_ALL || flag_ira_region == IRA_REGION_MIXED)
     {
       flow_loops_find (&ira_loops);
+      current_loops = &ira_loops;
       record_loop_exits ();
-      current_loops = &ira_loops;
     }

   if (internal_flag_ira_verbose > 0 && ira_dump_file != NULL)
@@ -4277,9 +4277,14 @@  ira (FILE *f)
             info.  */
          df_analyze ();

+         /* ??? Rebuild the loop tree, but why?  Does the loop tree
+            change if new insns were generated?  Can that be handled
+            by updating the loop tree incrementally?  */
+         release_recorded_exits ();
+         flow_loops_free (&ira_loops);
          flow_loops_find (&ira_loops);
+         current_loops = &ira_loops;
          record_loop_exits ();
-         current_loops = &ira_loops;

          setup_allocno_assignment_flags ();
          ira_initiate_assign ();
@@ -4363,6 +4368,7 @@  do_reload (void)

   if (current_loops != NULL)
     {
+      release_recorded_exits ();
       flow_loops_free (&ira_loops);
       free_dominance_info (CDI_DOMINATORS);
     }