Message ID | CABu31nOwifrw=ZY43rGs2T9ixkW76z4ureCj8Q_EhGXLgvtUqQ@mail.gmail.com |
---|---|
State | New |
Headers | show |
On 12-10-11 4:17 PM, Steven Bosscher wrote: > Hello, > > IRA uses record_loop_exits() to cache the loop exit edges, but due to > a code ordering bug the edges are not actually recorded. > record_loop_exits() starts with: > > if (!current_loops) > return; I have no idea why record_loop_exits is here. Loop exits are not used in IRA. I think it should be removed. > So ira.c should set current_loops before calling record_loop_exits. > With the current order, the exists are not actually recorded. Also, > ira.c should release recorded exits and loops, this releases a > considerable amount of GC memory that can be re-used in the next IRA > iteration. > > It's not clear to me why IRA has to re-compute the cfgloop tree from > scratch, though, so I've added a "???" comment for that. I am not a specialist in cfgloop code. I did not find how to update it incrementally. So I just rebuild it. My be you can advice how to do this. IRA can generate insns on the BB edges, it results in creation of new BBs on the region exits/enters. > Bootstrapped&tested (with and without checking) on > powerpc64-unknown-linux-gnu. OK for trunk? > >
On Fri, Oct 12, 2012 at 6:16 AM, Vladimir Makarov wrote: > On 12-10-11 4:17 PM, Steven Bosscher wrote: >> >> Hello, >> >> IRA uses record_loop_exits() to cache the loop exit edges, but due to >> a code ordering bug the edges are not actually recorded. >> record_loop_exits() starts with: >> >> if (!current_loops) >> return; > > I have no idea why record_loop_exits is here. Loop exits are not used in > IRA. I think it should be removed. Actually they are used alright: $ grep -n get_loop_exit_edges ira* ira-build.c:169: edges = get_loop_exit_edges (loop); ira-build.c:1774: edges = get_loop_exit_edges (loop_node->loop); ira-build.c:1972: edges = get_loop_exit_edges (loop); ira-color.c:2023: edges = get_loop_exit_edges (loop_node->loop); Especially the last one, in ira_loop_edge_freq, seems to be called frequently enough to justify caching loop exit edges. >> So ira.c should set current_loops before calling record_loop_exits. >> With the current order, the exists are not actually recorded. Also, >> ira.c should release recorded exits and loops, this releases a >> considerable amount of GC memory that can be re-used in the next IRA >> iteration. >> >> It's not clear to me why IRA has to re-compute the cfgloop tree from >> scratch, though, so I've added a "???" comment for that. > > I am not a specialist in cfgloop code. I did not find how to update it > incrementally. So I just rebuild it. > > My be you can advice how to do this. IRA can generate insns on the BB > edges, it results in creation of new BBs on the region exits/enters. AFAIU this mostly happens automatically for simple edge splitting operations, and for more complex transformations there is fix_loop_structure. But I'm not sure, perhaps Richi can give some advice. Ciao! Steven
On Fri, 12 Oct 2012, Steven Bosscher wrote: > On Fri, Oct 12, 2012 at 6:16 AM, Vladimir Makarov wrote: > > On 12-10-11 4:17 PM, Steven Bosscher wrote: > >> > >> Hello, > >> > >> IRA uses record_loop_exits() to cache the loop exit edges, but due to > >> a code ordering bug the edges are not actually recorded. > >> record_loop_exits() starts with: > >> > >> if (!current_loops) > >> return; > > > > I have no idea why record_loop_exits is here. Loop exits are not used in > > IRA. I think it should be removed. > > Actually they are used alright: > > $ grep -n get_loop_exit_edges ira* > ira-build.c:169: edges = get_loop_exit_edges (loop); > ira-build.c:1774: edges = get_loop_exit_edges (loop_node->loop); > ira-build.c:1972: edges = get_loop_exit_edges (loop); > ira-color.c:2023: edges = get_loop_exit_edges (loop_node->loop); > > Especially the last one, in ira_loop_edge_freq, seems to be called > frequently enough to justify caching loop exit edges. > > > >> So ira.c should set current_loops before calling record_loop_exits. > >> With the current order, the exists are not actually recorded. Also, > >> ira.c should release recorded exits and loops, this releases a > >> considerable amount of GC memory that can be re-used in the next IRA > >> iteration. > >> > >> It's not clear to me why IRA has to re-compute the cfgloop tree from > >> scratch, though, so I've added a "???" comment for that. > > > > I am not a specialist in cfgloop code. I did not find how to update it > > incrementally. So I just rebuild it. > > > > My be you can advice how to do this. IRA can generate insns on the BB > > edges, it results in creation of new BBs on the region exits/enters. > > AFAIU this mostly happens automatically for simple edge splitting > operations, and for more complex transformations there is > fix_loop_structure. But I'm not sure, perhaps Richi can give some > advice. Just something I noticed when working on preserving loop structures. IRA re-builds it in a weird way instead of simply using loop_optimizer_init (), not sure why: ira_assert (current_loops == NULL); if (flag_ira_region == IRA_REGION_ALL || flag_ira_region == IRA_REGION_MIXED) { flow_loops_find (&ira_loops); record_loop_exits (); current_loops = &ira_loops; why does it have 'ira_loops' at all, instead of using current_loops!? I noticed that when removing all this because loops are preserved that inbetween IRA and reload loop structure verification would fail (not sure it's a good idea to have separate passes if the state inbetween is "broken"). And yes, the cfg manipulation primitives update loops. fixup is necessary in case you explicitely remove loops or the cfg manipulation primitives "give up" and schedule loops for removal. See how RTL cfgcleanup handles this now (yes, how ugly ...). If IRA makes serious use of loops then it should arrange to use the loop structures that are still available (ok, I destroy them after RTL loop transforms currently). IRA could be the point where we destroy loops, but IIRC even machine-reorg likes to know loops. Richard.
On 12-10-12 4:12 AM, Steven Bosscher wrote: > On Fri, Oct 12, 2012 at 6:16 AM, Vladimir Makarov wrote: >> On 12-10-11 4:17 PM, Steven Bosscher wrote: >>> Hello, >>> >>> IRA uses record_loop_exits() to cache the loop exit edges, but due to >>> a code ordering bug the edges are not actually recorded. >>> record_loop_exits() starts with: >>> >>> if (!current_loops) >>> return; >> I have no idea why record_loop_exits is here. Loop exits are not used in >> IRA. I think it should be removed. > Actually they are used alright: > > $ grep -n get_loop_exit_edges ira* > ira-build.c:169: edges = get_loop_exit_edges (loop); > ira-build.c:1774: edges = get_loop_exit_edges (loop_node->loop); > ira-build.c:1972: edges = get_loop_exit_edges (loop); > ira-color.c:2023: edges = get_loop_exit_edges (loop_node->loop); > > Especially the last one, in ira_loop_edge_freq, seems to be called > frequently enough to justify caching loop exit edges. Ops. Sorry, Steven. I did a wrong conclusion because I thought I would have found such code generation problem if it had an affect. I've just tried about 20 tests and did not find a difference in code generation when I moved current_loops assignment above record_loop_exits (). I did some investigation and it seems to me it is important to have the right loop exits when such exits are abnormal edges. It would be problem if there is a generation of reg shuffle insns on such edges. But it does not happen (if it happens it should result in compiler crash), probably IRA makes the same allocation of pseudos somehow on different ends of such edges or such cases are extremely rare. Another problem would be a possibility of different allocation when loop has a pseudo living at the exit edges but not on the enters which I think is also a rare situation. Still it is good to have right loop exits for some rare corner cases. So your patch is ok. Thanks for working on this. > >>> So ira.c should set current_loops before calling record_loop_exits. >>> With the current order, the exists are not actually recorded. Also, >>> ira.c should release recorded exits and loops, this releases a >>> considerable amount of GC memory that can be re-used in the next IRA >>> iteration. >>> >>> It's not clear to me why IRA has to re-compute the cfgloop tree from >>> scratch, though, so I've added a "???" comment for that. >> I am not a specialist in cfgloop code. I did not find how to update it >> incrementally. So I just rebuild it. >> >> My be you can advice how to do this. IRA can generate insns on the BB >> edges, it results in creation of new BBs on the region exits/enters. > AFAIU this mostly happens automatically for simple edge splitting > operations, and for more complex transformations there is > fix_loop_structure. But I'm not sure, perhaps Richi can give some > advice. >
On 12-10-12 4:56 AM, Richard Biener wrote: > On Fri, 12 Oct 2012, Steven Bosscher wrote: > >> On Fri, Oct 12, 2012 at 6:16 AM, Vladimir Makarov wrote: >>> On 12-10-11 4:17 PM, Steven Bosscher wrote: >>>> Hello, >>>> >>>> IRA uses record_loop_exits() to cache the loop exit edges, but due to >>>> a code ordering bug the edges are not actually recorded. >>>> record_loop_exits() starts with: >>>> >>>> if (!current_loops) >>>> return; >>> I have no idea why record_loop_exits is here. Loop exits are not used in >>> IRA. I think it should be removed. >> Actually they are used alright: >> >> $ grep -n get_loop_exit_edges ira* >> ira-build.c:169: edges = get_loop_exit_edges (loop); >> ira-build.c:1774: edges = get_loop_exit_edges (loop_node->loop); >> ira-build.c:1972: edges = get_loop_exit_edges (loop); >> ira-color.c:2023: edges = get_loop_exit_edges (loop_node->loop); >> >> Especially the last one, in ira_loop_edge_freq, seems to be called >> frequently enough to justify caching loop exit edges. >> >> >>>> So ira.c should set current_loops before calling record_loop_exits. >>>> With the current order, the exists are not actually recorded. Also, >>>> ira.c should release recorded exits and loops, this releases a >>>> considerable amount of GC memory that can be re-used in the next IRA >>>> iteration. >>>> >>>> It's not clear to me why IRA has to re-compute the cfgloop tree from >>>> scratch, though, so I've added a "???" comment for that. >>> I am not a specialist in cfgloop code. I did not find how to update it >>> incrementally. So I just rebuild it. >>> >>> My be you can advice how to do this. IRA can generate insns on the BB >>> edges, it results in creation of new BBs on the region exits/enters. >> AFAIU this mostly happens automatically for simple edge splitting >> operations, and for more complex transformations there is >> fix_loop_structure. But I'm not sure, perhaps Richi can give some >> advice. > Just something I noticed when working on preserving loop structures. > IRA re-builds it in a weird way instead of simply using > loop_optimizer_init (), not sure why: > > ira_assert (current_loops == NULL); > if (flag_ira_region == IRA_REGION_ALL || flag_ira_region == > IRA_REGION_MIXED) > { > flow_loops_find (&ira_loops); > record_loop_exits (); > current_loops = &ira_loops; > > why does it have 'ira_loops' at all, instead of using current_loops!? Some code in IRA is very old. So I don't remember already reasons for this. I'd appreciate any your patch switching to current_loops. > I noticed that when removing all this because loops are preserved > that inbetween IRA and reload loop structure verification would > fail (not sure it's a good idea to have separate passes if > the state inbetween is "broken"). > And yes, the cfg manipulation primitives update loops. fixup is > necessary in case you explicitely remove loops or the cfg > manipulation primitives "give up" and schedule loops for removal. As I wrote IRA only generates new BBs because it is putting sometimes insns on the edges. If there is a real saving just updating loop structures not rebuilding it. I'd prefer it. But again I am not well familiar with cfgloop code and will welcome any patch from others. On the other hand, common knowledge says me that there will be no significant savings as finding loops is pretty fast because it works on BBs (not insns, not insns operands, not insn alternatives). May be it has a sense when you need recalculate loop info frequently as in loop optimizations. > See how RTL cfgcleanup handles this now (yes, how ugly ...). > > If IRA makes serious use of loops then it should arrange to > use the loop structures that are still available (ok, I destroy > them after RTL loop transforms currently). IRA could be the > point where we destroy loops, but IIRC even machine-reorg > likes to know loops. > > Yes, some info about loops (like BB depth in loops) would be useful for different heuristics even in later passes.
On Fri, 12 Oct 2012, Vladimir Makarov wrote: > On 12-10-12 4:56 AM, Richard Biener wrote: > > On Fri, 12 Oct 2012, Steven Bosscher wrote: > > > > > On Fri, Oct 12, 2012 at 6:16 AM, Vladimir Makarov wrote: > > > > On 12-10-11 4:17 PM, Steven Bosscher wrote: > > > > > Hello, > > > > > > > > > > IRA uses record_loop_exits() to cache the loop exit edges, but due to > > > > > a code ordering bug the edges are not actually recorded. > > > > > record_loop_exits() starts with: > > > > > > > > > > if (!current_loops) > > > > > return; > > > > I have no idea why record_loop_exits is here. Loop exits are not used > > > > in > > > > IRA. I think it should be removed. > > > Actually they are used alright: > > > > > > $ grep -n get_loop_exit_edges ira* > > > ira-build.c:169: edges = get_loop_exit_edges (loop); > > > ira-build.c:1774: edges = get_loop_exit_edges (loop_node->loop); > > > ira-build.c:1972: edges = get_loop_exit_edges (loop); > > > ira-color.c:2023: edges = get_loop_exit_edges (loop_node->loop); > > > > > > Especially the last one, in ira_loop_edge_freq, seems to be called > > > frequently enough to justify caching loop exit edges. > > > > > > > > > > > So ira.c should set current_loops before calling record_loop_exits. > > > > > With the current order, the exists are not actually recorded. Also, > > > > > ira.c should release recorded exits and loops, this releases a > > > > > considerable amount of GC memory that can be re-used in the next IRA > > > > > iteration. > > > > > > > > > > It's not clear to me why IRA has to re-compute the cfgloop tree from > > > > > scratch, though, so I've added a "???" comment for that. > > > > I am not a specialist in cfgloop code. I did not find how to update it > > > > incrementally. So I just rebuild it. > > > > > > > > My be you can advice how to do this. IRA can generate insns on the BB > > > > edges, it results in creation of new BBs on the region exits/enters. > > > AFAIU this mostly happens automatically for simple edge splitting > > > operations, and for more complex transformations there is > > > fix_loop_structure. But I'm not sure, perhaps Richi can give some > > > advice. > > Just something I noticed when working on preserving loop structures. > > IRA re-builds it in a weird way instead of simply using > > loop_optimizer_init (), not sure why: > > > > ira_assert (current_loops == NULL); > > if (flag_ira_region == IRA_REGION_ALL || flag_ira_region == > > IRA_REGION_MIXED) > > { > > flow_loops_find (&ira_loops); > > record_loop_exits (); > > current_loops = &ira_loops; > > > > why does it have 'ira_loops' at all, instead of using current_loops!? > Some code in IRA is very old. So I don't remember already reasons for this. > > I'd appreciate any your patch switching to current_loops. > > I noticed that when removing all this because loops are preserved > > that inbetween IRA and reload loop structure verification would > > fail (not sure it's a good idea to have separate passes if > > the state inbetween is "broken"). > > And yes, the cfg manipulation primitives update loops. fixup is > > necessary in case you explicitely remove loops or the cfg > > manipulation primitives "give up" and schedule loops for removal. > As I wrote IRA only generates new BBs because it is putting sometimes insns on > the edges. > If there is a real saving just updating loop structures not rebuilding it. > I'd prefer it. > But again I am not well familiar with cfgloop code and will welcome any patch > from others. > > On the other hand, common knowledge says me that there will be no significant > savings as finding loops is pretty fast because it works on BBs (not insns, > not insns operands, not insn alternatives). May be it has a sense when you > need recalculate loop info frequently as in loop optimizations. The point is not to save recalculation time but to be able to associate information with loops that is preserved across passes. I'll see if I can make IRA use current_loops. Richard.
On Fri, Oct 12, 2012 at 3:31 PM, Vladimir Makarov wrote: > Ops. Sorry, Steven. I did a wrong conclusion because I thought I would > have found such code generation problem if it had an affect. Oh, the patch shouldn't (and doesn't) change the generated code, that is not how cfgloops works: record_loop_exits is just a cache. If you use record_loop_exits, cfgloops sets up a cache of the loop exits so that if you look them up frequently it can return the VEC of edges faster. But if you don't use record_loop_exits, it still returns the same set of edges (and in the same order, too). It just will be slower because cfgloop will search for the loop exit edges by loading the loop body and scanning the successor edges of the loop body basic blocks. > So your patch is ok. Thanks for working on this. I'll commit the patch later this weekend. Thanks for the quick review! Ciao! Steven
Index: ira.c =================================================================== --- ira.c (revision 192377) +++ ira.c (working copy) @@ -4233,8 +4233,8 @@ ira (FILE *f) if (flag_ira_region == IRA_REGION_ALL || flag_ira_region == IRA_REGION_MIXED) { flow_loops_find (&ira_loops); + current_loops = &ira_loops; record_loop_exits (); - current_loops = &ira_loops; } if (internal_flag_ira_verbose > 0 && ira_dump_file != NULL) @@ -4277,9 +4277,14 @@ ira (FILE *f) info. */ df_analyze (); + /* ??? Rebuild the loop tree, but why? Does the loop tree + change if new insns were generated? Can that be handled + by updating the loop tree incrementally? */ + release_recorded_exits (); + flow_loops_free (&ira_loops); flow_loops_find (&ira_loops); + current_loops = &ira_loops; record_loop_exits (); - current_loops = &ira_loops; setup_allocno_assignment_flags (); ira_initiate_assign (); @@ -4363,6 +4368,7 @@ do_reload (void) if (current_loops != NULL) { + release_recorded_exits (); flow_loops_free (&ira_loops); free_dominance_info (CDI_DOMINATORS); }