Message ID | 20121012194333.GA8987@kam.mff.cuni.cz |
---|---|
State | New |
Headers | show |
On 2012.10.12 at 21:43 +0200, Jan Hubicka wrote: > I finally tracked down twolf misoptimization triggered by my loop-unroll.c > changes. It has turned out to be semi-latent wrong code issue in webizer. > What happens is: > > 1) gcse.c drop REG_EQUAL note on the induction variable > 2) loop optimizer unrolls the loop enabling webizer to cleanup > 3) webizer do not track reaching refs into REG_EQUAL note because > the variable is dead and renames it to unused pseudo > Program is stil valid but the REG_EQUAL is bogus. > 4) CSE eventually takes the value and put it back into the code > 5) init-regs initializes it to 0 > > and result is a segfault on the following testcase. > Fixed by making webizer to not prune dead regs. > Will commit it after testing on x86_64-linux. FYI this also fixes the lto/profiledbootstrap breakage, PR 54885. Thanks.
On Fri, Oct 12, 2012 at 9:43 PM, Jan Hubicka <hubicka@ucw.cz> wrote: > Hi, > I finally tracked down twolf misoptimization triggered by my loop-unroll.c > changes. It has turned out to be semi-latent wrong code issue in webizer. > What happens is: > > 1) gcse.c drop REG_EQUAL note on the induction variable > 2) loop optimizer unrolls the loop enabling webizer to cleanup > 3) webizer do not track reaching refs into REG_EQUAL note because > the variable is dead and renames it to unused pseudo > Program is stil valid but the REG_EQUAL is bogus. What do you mean, "not track"? web.c does track EQ notes via DF this: web.c:315: df_set_flags (DF_NO_HARD_REGS + DF_EQ_NOTES); > 4) CSE eventually takes the value and put it back into the code > 5) init-regs initializes it to 0 > > and result is a segfault on the following testcase. > Fixed by making webizer to not prune dead regs. > Will commit it after testing on x86_64-linux. > > Honza > > /* { dg-do run } */ > /* { dg-options "-O3 -funroll-loops" } */ > typedef struct rowbox { > int startx ; > int endx ; > int endx1 ; > int startx2 ; > int ypos ; > int desiredL ; > } ROWBOX ; > ROWBOX rowArray1[2] ; > ROWBOX *rowArray = rowArray1; > > int numRows = 2; > > int row = 1; > int block = 0; > double ckt_size_factor ; > > __attribute__ ((noinline)) > configure2() > { > block = 0 ; > for( row = 1 ; row <= numRows ; row++ ) { > block++ ; > if( rowArray[row].endx1 > 0 ) { > block++ ; > } > } > } > > main() > { > configure2(); > } > > * web.c (web_main): Do not set DF_RD_PRUNE_DEAD_DEFS flag. > Index: web.c > =================================================================== > --- web.c (revision 192369) > +++ web.c (working copy) > @@ -313,7 +313,8 @@ web_main (void) > rtx insn; > > df_set_flags (DF_NO_HARD_REGS + DF_EQ_NOTES); > - df_set_flags (DF_RD_PRUNE_DEAD_DEFS); > + /* We can not RD_PRUNE_DEAD_DEFS, because we care about REG_EQUAL > + notes. */ > df_chain_add_problem (DF_UD_CHAIN); > df_analyze (); > df_set_flags (DF_DEFER_INSN_RESCAN); This is bogus. You're papering over the underlying bug of an invalid REG_EQUAL note. This patch is not OK! I did not add DF_RD_PRUNE_DEAD_DEFS for no reason. You're re-introducing a major compile time hog. Did you do timings on some significant body of code with -fweb enabled? Ciao! Steven
Hi, On your test case, I have this: Web oldreg=72 newreg=111 Web oldreg=72 newreg=112 Web oldreg=72 newreg=139 Web oldreg=72 newreg=145 Web oldreg=72 newreg=151 Web oldreg=72 newreg=157 Web oldreg=72 newreg=163 Web oldreg=72 newreg=169 --- t.c.182r.loop2_done 2012-10-12 22:23:59.000000000 +0200 +++ t.c.183r.web 2012-10-12 22:24:23.000000000 +0200 - 79 r72:DI=r102:DI - REG_EQUAL: r72:DI+0x18 + 79 r112:DI=r102:DI + REG_EQUAL: r111:DI+0x18 102 r72:DI=r102:DI - REG_EQUAL: r72:DI+0x18 + REG_EQUAL: r111:DI+0x18 124 r72:DI=r102:DI - REG_EQUAL: r72:DI+0x18 + REG_EQUAL: r111:DI+0x18 150 r72:DI=r102:DI - REG_EQUAL: r72:DI+0x18 + REG_EQUAL: r111:DI+0x18 176 r72:DI=r102:DI - REG_EQUAL: r72:DI+0x18 + REG_EQUAL: r111:DI+0x18 202 r72:DI=r102:DI - REG_EQUAL: r72:DI+0x18 + REG_EQUAL: r111:DI+0x18 228 r72:DI=r102:DI - REG_EQUAL: r72:DI+0x18 + REG_EQUAL: r111:DI+0x18 254 r72:DI=r102:DI - REG_EQUAL: r72:DI+0x18 + REG_EQUAL: r111:DI+0x18 - 285 r72:DI=r102:DI - REG_EQUAL: r72:DI+0x18 + 285 r139:DI=r114:DI + REG_EQUAL: r111:DI+0x18 - 306 r72:DI=r102:DI - REG_EQUAL: r72:DI+0x18 + 306 r145:DI=r141:DI + REG_EQUAL: r111:DI+0x18 - 327 r72:DI=r102:DI - REG_EQUAL: r72:DI+0x18 + 327 r151:DI=r147:DI + REG_EQUAL: r111:DI+0x18 - 348 r72:DI=r102:DI - REG_EQUAL: r72:DI+0x18 + 348 r157:DI=r153:DI + REG_EQUAL: r111:DI+0x18 - 369 r72:DI=r102:DI - REG_EQUAL: r72:DI+0x18 + 369 r163:DI=r159:DI + REG_EQUAL: r111:DI+0x18 - 390 r72:DI=r102:DI - REG_EQUAL: r72:DI+0x18 + 390 r169:DI=r165:DI + REG_EQUAL: r111:DI+0x18 - 411 r72:DI=r102:DI - REG_EQUAL: r72:DI+0x18 + 411 r72:DI=r171:DI + REG_EQUAL: r111:DI+0x18 So all appearances of r72 in REG_EQUAL notes have been replaced with r111. That means the constructed webs are wrong. If the REG_EQUAL is for the DEF operand of the insn (which must be a single_set insn, or there wouldn't be a note) then the DEF and the EQ_USE should be in the same web. I don't think this has anything to do with DF_RD_PRUNE_DEAD_DEFS. This is a pre-existing bug in web.c that just happens to be exposed by loop unrolling. Please do not apply the patch, it only papers over another problem in web.c. Ciao! Steven
> On Fri, Oct 12, 2012 at 9:43 PM, Jan Hubicka <hubicka@ucw.cz> wrote: > > Hi, > > I finally tracked down twolf misoptimization triggered by my loop-unroll.c > > changes. It has turned out to be semi-latent wrong code issue in webizer. > > What happens is: > > > > 1) gcse.c drop REG_EQUAL note on the induction variable > > 2) loop optimizer unrolls the loop enabling webizer to cleanup > > 3) webizer do not track reaching refs into REG_EQUAL note because > > the variable is dead and renames it to unused pseudo > > Program is stil valid but the REG_EQUAL is bogus. > > What do you mean, "not track"? web.c does track EQ notes via DF this: > > web.c:315: df_set_flags (DF_NO_HARD_REGS + DF_EQ_NOTES); > > This is bogus. You're papering over the underlying bug of an invalid > REG_EQUAL note. This patch is not OK! The REG_EQUAL note is valid until webizer breaks it. DF_EQ_NOTES makes UD chain problem to look into uses of REG_EQUAL and add all reaching definitions into the list. To do the reaching definitions it uses RD problem solution. This solution do not really care about REG_EQUAL notes because it does only over definitions. It however uses results of LIVENESS problem via REG_DEAD notes and the pruning trick. Now th REG_EQUAL note uses dead register (that is aloved), because the liveness is computed ignoring REG_EQUAL notes and thus the definition gets prunned from the list and consequentely webizer misses the link. > > I did not add DF_RD_PRUNE_DEAD_DEFS for no reason. You're > re-introducing a major compile time hog. Did you do timings on some > significant body of code with -fweb enabled? Well, I can not think how to fix it without 1) computing liveness with REG_EQUAL included prior RD that means a lot of shuffling of REG_DEAD notes 2) making webizer to drop all REG_EQUAL notes that use dead register (for that it will need to track liveness too) Do you have better ideas? Honza > > Ciao! > Steven
On Fri, Oct 12, 2012 at 10:44 PM, Jan Hubicka <hubicka@ucw.cz> wrote: > 1) computing liveness with REG_EQUAL included prior RD that means a lot > of shuffling of REG_DEAD notes I was already working on a patch for this. I'll send it here later tonight. Ciao! Steven
> On Fri, Oct 12, 2012 at 10:44 PM, Jan Hubicka <hubicka@ucw.cz> wrote: > > 1) computing liveness with REG_EQUAL included prior RD that means a lot > > of shuffling of REG_DEAD notes > > I was already working on a patch for this. I'll send it here later tonight. Great, thanks! This is probably most sensible approach even if we will need to recompute liveness before/after webizer. This bug really took me days to reduce into something sensible. It tends to reproduce only in terribly complex functions with many loops. Honza > > Ciao! > Steven
On Fri, Oct 12, 2012 at 11:16 PM, Jan Hubicka <hubicka@ucw.cz> wrote: >> On Fri, Oct 12, 2012 at 10:44 PM, Jan Hubicka <hubicka@ucw.cz> wrote: >> > 1) computing liveness with REG_EQUAL included prior RD that means a lot >> > of shuffling of REG_DEAD notes >> >> I was already working on a patch for this. I'll send it here later tonight. > > Great, thanks! This is probably most sensible approach even if we will need to > recompute liveness before/after webizer. I don't think we have to touch the liveness sets. We can compute an extra set of registers live only for REG_EQUAL/REG_EQUIV notes. Attached is what I had in mind. Untested, etc. it's late (and the Yankees are playing) so I'll get back to properly testing this tomorrow. Ciao! Steven
Il 13/10/2012 00:25, Steven Bosscher ha scritto: > On Fri, Oct 12, 2012 at 11:16 PM, Jan Hubicka <hubicka@ucw.cz> wrote: >>> On Fri, Oct 12, 2012 at 10:44 PM, Jan Hubicka <hubicka@ucw.cz> wrote: >>>> 1) computing liveness with REG_EQUAL included prior RD that means a lot >>>> of shuffling of REG_DEAD notes >>> >>> I was already working on a patch for this. I'll send it here later tonight. >> >> Great, thanks! This is probably most sensible approach even if we will need to >> recompute liveness before/after webizer. > > I don't think we have to touch the liveness sets. We can compute an > extra set of registers live only for REG_EQUAL/REG_EQUIV notes. > Attached is what I had in mind. Untested, etc. it's late (and the > Yankees are playing) so I'll get back to properly testing this > tomorrow. Can we just simulate liveness for web, and drop REG_EQUAL/REG_EQUIV notes that refer to a dead pseudo? Paolo
On Sun, Oct 14, 2012 at 9:02 AM, Paolo Bonzini wrote: > Can we just simulate liveness for web, and drop REG_EQUAL/REG_EQUIV > notes that refer to a dead pseudo? I don't think we want to do that. A REG_EQUAL/REG_EQUIV note can use a pseudo that isn't live and still be valid. Consider a simple example like this: a = b + 3 // b dies here c = a {REG_EQUAL b+3} The REG_EQUAL note is valid and may help optimization. Removing it just because b is dead at that point would be unnecessarily pessimistic. I also don't want to compute DF_LR taking EQ_USES into account as real uses for liveness, because that involves recomputing and enlarging the DF_LR sets (all of them, both globally and locally) before LR&RD and after LR&RD. That's why I implemented the quick-and-dirty liveness computation for the notes: It's non-intrusive on DF_LR and it's cheap. Ciao! Steven
> I don't think we want to do that. A REG_EQUAL/REG_EQUIV note can use a > pseudo that isn't live and still be valid. Consider a simple example > like this: > > a = b + 3 > // b dies here > c = a {REG_EQUAL b+3} > > The REG_EQUAL note is valid and may help optimization. Removing it > just because b is dead at that point would be unnecessarily > pessimistic. But if you have a REG_DEAD note for b on the first insn, then you cannot rematerialize the REG_EQUAL note after it, otherwise bad things can happen. See PR rtl-optimization/51505 for an example.
On Sun, Oct 14, 2012 at 11:25 PM, Eric Botcazou wrote: >> I don't think we want to do that. A REG_EQUAL/REG_EQUIV note can use a >> pseudo that isn't live and still be valid. Consider a simple example >> like this: >> >> a = b + 3 >> // b dies here >> c = a {REG_EQUAL b+3} >> >> The REG_EQUAL note is valid and may help optimization. Removing it >> just because b is dead at that point would be unnecessarily >> pessimistic. > > But if you have a REG_DEAD note for b on the first insn, then you cannot > rematerialize the REG_EQUAL note after it, otherwise bad things can happen. > > See PR rtl-optimization/51505 for an example. That's not the case here. The register is only dead because the webizer renamed one of its live ranges but "forgets" to rename the EQ_NOTE use. Ciao! Steven
Il 14/10/2012 22:59, Steven Bosscher ha scritto: > On Sun, Oct 14, 2012 at 9:02 AM, Paolo Bonzini wrote: >> Can we just simulate liveness for web, and drop REG_EQUAL/REG_EQUIV >> notes that refer to a dead pseudo? > > I don't think we want to do that. A REG_EQUAL/REG_EQUIV note can use a > pseudo that isn't live and still be valid. Consider a simple example > like this: > > a = b + 3 > // b dies here > c = a {REG_EQUAL b+3} > > The REG_EQUAL note is valid and may help optimization. Removing it > just because b is dead at that point would be unnecessarily > pessimistic. I disagree that it is valid. At least it is risky to consider it valid, because a pass that simulates liveness might end up doing something wrong because of that note. If simulation is done backwards, it doesn't even require any interaction with REG_DEAD notes. > I also don't want to compute DF_LR taking EQ_USES into account as real > uses for liveness, because that involves recomputing and enlarging the > DF_LR sets (all of them, both globally and locally) before LR&RD and > after LR&RD. That's why I implemented the quick-and-dirty liveness > computation for the notes: It's non-intrusive on DF_LR and it's cheap. Yes, I agree on that part of the implementation. :) Paolo
On Mon, Oct 15, 2012 at 9:38 AM, Paolo Bonzini wrote: >> I don't think we want to do that. A REG_EQUAL/REG_EQUIV note can use a >> pseudo that isn't live and still be valid. Consider a simple example >> like this: >> >> a = b + 3 >> // b dies here >> c = a {REG_EQUAL b+3} >> >> The REG_EQUAL note is valid and may help optimization. Removing it >> just because b is dead at that point would be unnecessarily >> pessimistic. > > I disagree that it is valid. At least it is risky to consider it valid, > because a pass that simulates liveness might end up doing something > wrong because of that note. If simulation is done backwards, it doesn't > even require any interaction with REG_DEAD notes. In any case, if web doesn't properly rename the register in the REG_EQUAL note (which it doesn't do without my patch) and we declare such a note invalid, then we should remove the note. You're right that GCC ends up doing something wrong, that's why Honza's test case fails. With my patch, the registers in the notes are properly renamed and there are no REG_EQUAL notes that refer to dead registers, so the point whether such a note would be valid is moot. After renaming, the note is valid, and the behavior is restored that GCC had before I added DF_RD_PRUNE_DEAD_DEFS. I think we should come to a conclusion of this discussion: Either we drop the notes (e.g. by re-computing the DF_NOTE problem after web) or we update them, like my patch does. I prefer the patch I proposed because it re-instates the behavior GCC had before. Ciao! Steven
Il 15/10/2012 10:13, Steven Bosscher ha scritto: > > I disagree that it is valid. At least it is risky to consider it valid, > > because a pass that simulates liveness might end up doing something > > wrong because of that note. If simulation is done backwards, it doesn't > > even require any interaction with REG_DEAD notes. > > In any case, if web doesn't properly rename the register in the > REG_EQUAL note (which it doesn't do without my patch) and we declare > such a note invalid, then we should remove the note. You're right that > GCC ends up doing something wrong, that's why Honza's test case fails. > > I think we should come to a conclusion of this discussion: Either we > drop the notes (e.g. by re-computing the DF_NOTE problem after web) or > we update them, like my patch does. I prefer the patch I proposed > because it re-instates the behavior GCC had before. I prefer to declare the notes invalid and drop the notes. Paolo
On Mon, Oct 15, 2012 at 10:19 AM, Paolo Bonzini wrote:
> I prefer to declare the notes invalid and drop the notes.
Then, afaic, our only option is to drop them all in web, as per attached patch.
I strongly disagree with this approach though. It destroys information
that is correct, that we had before DF_RD_PRUNE_DEAD_DEFS, that we can
update, and that helps with optimization.
This whole discussion about notes being dead has gone in completely
the wrong direction. With renaming these notes are valid, and do not
refer to dead regs. Perhaps you could be convinced if you look at
Honza's test case with the patch of r192413 reverted.
The test case still fails with --param max-unroll-times=3, that makes
visualizing the problem easier.
Ciao!
Steven
On Mon, Oct 15, 2012 at 10:37 AM, Steven Bosscher <stevenb.gcc@gmail.com> wrote: > On Mon, Oct 15, 2012 at 10:19 AM, Paolo Bonzini wrote: >> I prefer to declare the notes invalid and drop the notes. > > I strongly disagree with this approach though. It destroys information > that is correct, that we had before DF_RD_PRUNE_DEAD_DEFS, that we can > update, and that helps with optimization. PR54916 is a case where dropping the notes would cause a missed optimization in cse-after-loop. With my patch to update the notes, the optimization (CSE of a load-immediate) is retained. Ciao! Steven
Il 15/10/2012 10:37, Steven Bosscher ha scritto: >> I prefer to declare the notes invalid and drop the notes. > Then, afaic, our only option is to drop them all in web, as per attached patch. > > I strongly disagree with this approach though. It destroys information > that is correct, that we had before DF_RD_PRUNE_DEAD_DEFS, that we can > update, and that helps with optimization. With renaming these notes > are valid, and do not refer to dead regs I agree it is bad. But I do not understand the last sentence: I suppose you mean that _without_ renaming these notes are valid, on the other hand it is normal that some of the notes will be dropped if you shorten live ranges. Without removing all of the notes you can do something like this: - drop the deferred rescanning from web.c. Instead, make replace_ref return a bool and call df_insn_rescan manually from web_main. - attribute new registers to webs in a separate pass that happens before rewriting, and compute a special version of LR_IN/LR_OUT that uses the rewritten registers. - process instructions in reverse order; before starting the visit of a basic block, initialize the local LR bitmap with the rewritten LR_OUT of the previous step - after rewriting and scanning each statement, simulate liveness using the new defs and uses. - after rewriting each statement, look for EQ_USES referring to registers that are dead just before the statement, and delete REG_EQUAL notes if this is the case Paolo > This whole discussion about notes being dead has gone in completely > the wrong direction. With renaming these notes are valid, and do not > refer to dead regs. Perhaps you could be convinced if you look at > Honza's test case with the patch of r192413 reverted. > The test case still fails with --param max-unroll-times=3, that makes > visualizing the problem easier.
On Mon, Oct 15, 2012 at 1:49 PM, Paolo Bonzini wrote: >> I strongly disagree with this approach though. It destroys information >> that is correct, that we had before DF_RD_PRUNE_DEAD_DEFS, that we can >> update, and that helps with optimization. With renaming these notes >> are valid, and do not refer to dead regs > > I agree it is bad. But I do not understand the last sentence: I > suppose you mean that _without_ renaming these notes are valid, on the > other hand it is normal that some of the notes will be dropped if you > shorten live ranges. OK, I now got so confused that I looked into this a bit deeper. The situation is something like the following just after unrolling, but before web: 33 r72:DI=[`rowArray'] 34 {r103:DI=r72:DI+0x18;clobber flags:CC;} ... 45 flags:CCNO=cmp([r72:DI+0x20],0) REG_DEAD: r72:DI ;; diamond shape region follows, joining up again in bb 9: 79 r72:DI=r103:DI REG_EQUAL: r72:DI+0x18 On entry to bb9, r72 is not in LR_IN, so after loop unrolling this note is already invalid if we say that a note should not refer to a dead register. But the register dies much earlier. The first place where insn 79 appears is in the .169r.pre dump: ;; basic block 8, loop depth 1, count 0, freq 9100, maybe hot ;; prev block 7, next block 9, flags: (REACHABLE, RTL, MODIFIED) ;; pred: 7 ;; 6 ;; bb 8 artificial_defs: { } ;; bb 8 artificial_uses: { u43(6){ }u44(7){ }u45(16){ }u46(20){ }} ;; lr in 6 [bp] 7 [sp] 16 [argp] 20 [frame] 72 82 85 87 ;; lr use 6 [bp] 7 [sp] 16 [argp] 20 [frame] 72 87 ;; lr def 17 [flags] 72 ;; live in 6 [bp] 7 [sp] 16 [argp] 20 [frame] 72 82 85 87 ;; live gen 17 [flags] 72 ;; live kill 17 [flags] L49: 50 NOTE_INSN_BASIC_BLOCK 79 r72:DI=r103:DI REG_EQUAL: r72:DI+0x18 Here, r72 is still in LR_IN so the note is valid. Then in .171r.cprop2: ;; basic block 8, loop depth 1, count 0, freq 9100, maybe hot ;; prev block 7, next block 9, flags: (REACHABLE, RTL, MODIFIED) ;; pred: 7 ;; 6 ;; bb 8 artificial_defs: { } ;; bb 8 artificial_uses: { u43(6){ }u44(7){ }u45(16){ }u46(20){ }} ;; lr in 6 [bp] 7 [sp] 16 [argp] 20 [frame] 82 85 87 103 ;; lr use 6 [bp] 7 [sp] 16 [argp] 20 [frame] 87 103 ;; lr def 17 [flags] 72 ;; live in 6 [bp] 7 [sp] 16 [argp] 20 [frame] 82 85 87 103 ;; live gen 17 [flags] 72 ;; live kill L49: 50 NOTE_INSN_BASIC_BLOCK 79 r72:DI=r103:DI REG_EQUAL: r72:DI+0x18 So already after CPROP2, the REG_EQUAL note is invalid if we require that they only refer to live registers. This all happens well before any pass uses the DF_RD problem, so this is a pre-existing problem if we consider this kind of REG_EQUAL note to be invalid. > Without removing all of the notes you can do something like this: > > - drop the deferred rescanning from web.c. Instead, make replace_ref > return a bool and call df_insn_rescan manually from web_main. > > - attribute new registers to webs in a separate pass that happens > before rewriting, and compute a special version of LR_IN/LR_OUT that > uses the rewritten registers. > > - process instructions in reverse order; before starting the visit of > a basic block, initialize the local LR bitmap with the rewritten > LR_OUT of the previous step > > - after rewriting and scanning each statement, simulate liveness using > the new defs and uses. > > - after rewriting each statement, look for EQ_USES referring to > registers that are dead just before the statement, and delete > REG_EQUAL notes if this is the case I think I've shown above that we're all looking at the wrong pass... Ciao! Steven
Index: web.c =================================================================== --- web.c (revision 192369) +++ web.c (working copy) @@ -313,7 +313,8 @@ web_main (void) rtx insn; df_set_flags (DF_NO_HARD_REGS + DF_EQ_NOTES); - df_set_flags (DF_RD_PRUNE_DEAD_DEFS); + /* We can not RD_PRUNE_DEAD_DEFS, because we care about REG_EQUAL + notes. */ df_chain_add_problem (DF_UD_CHAIN); df_analyze (); df_set_flags (DF_DEFER_INSN_RESCAN);