Message ID | 000b01d013e5$4bc87010$e3595030$@com |
---|---|
State | New |
Headers | show |
On 12/09/14 12:21, Wilco Dijkstra wrote: > This fixes a bug in register preferencing. When live range splitting creates a new register from > another, it copies most fields except for the register preferences. The preference GENERAL_REGS is > used as reg_pref[i].prefclass is initialized with GENERAL_REGS in allocate_reg_info () and > resize_reg_info (). > > This initialization value is not innocuous like the comment suggests - if a new register has a > non-integer mode, it is forced to prefer GENERAL_REGS. This changes the register costs in pass 2 so > that they are incorrect. As a result the liverange is either spilled or allocated to an integer > register: > > void g(double); > void f(double x) > { > if (x == 0) > return; > g (x); > g (x); > } > > f: > fcmp d0, #0.0 > bne .L6 > ret > .L6: > stp x19, x30, [sp, -16]! > fmov x19, d0 > bl g > fmov d0, x19 > ldp x19, x30, [sp], 16 > b g > > With the fix it uses a floating point register as expected. Given a similar issue in > https://gcc.gnu.org/ml/gcc-patches/2014-11/msg02253.html, would it not be better to change the > initialization values of reg_pref to illegal register classes so this kind of issue can be trivially > found with an assert? Also would it not be a good idea to have a single register copy function that > ensures all data is copied? But there are other times when you really don't want to copy, say when the original had a small class, but the copy can go into a larger class. I banged my head on this when I was doing similar work on range splitting a few years back and ended up recomputing the preferred and alternate class information. That was much better than copying the classes. I pondered heuristics to expand the preferred class, but never implemented anything IIRC. A trivial heuristic would be to bump to the next larger class if the given class was a singleton, otherwise copy the class. The obvious counter to that heuristic is an allocno that has two ranges (or N ranges) where we would prefer a different singleton class for each range. In fact, I'm pretty sure I ran into this kind of situation and that led me down the "just recompute it" path. I'd hazard a guess that the simple heuristic would do better than what we're doing now with GENERAL_REGS though or what you're doing with copying. jeff
> Jeff Law wrote: > On 12/09/14 12:21, Wilco Dijkstra wrote: > > With the fix it uses a floating point register as expected. Given a similar issue in > > https://gcc.gnu.org/ml/gcc-patches/2014-11/msg02253.html, would it not be better to change > the > > initialization values of reg_pref to illegal register classes so this kind of issue can be > trivially > > found with an assert? Also would it not be a good idea to have a single register copy > function that > > ensures all data is copied? > But there are other times when you really don't want to copy, say when > the original had a small class, but the copy can go into a larger class. > > I banged my head on this when I was doing similar work on range > splitting a few years back and ended up recomputing the preferred and > alternate class information. That was much better than copying the > classes. If recomputing is best does that mean that record_reg_classes should not give a boost to the preferred class in the 2nd pass? I don't understand what purpose this has - if the preferred class is from the first pass, it is already correct, so all it does is boost the preferred class further. And if the preferred class is wrong (eg. after live range splitting), it will boost the incorrect class even harder, so in the end you never get a different class. > I pondered heuristics to expand the preferred class, but never > implemented anything IIRC. A trivial heuristic would be to bump to the > next larger class if the given class was a singleton, otherwise copy the > class. > > The obvious counter to that heuristic is an allocno that has two ranges > (or N ranges) where we would prefer a different singleton class for each > range. In fact, I'm pretty sure I ran into this kind of situation and > that led me down the "just recompute it" path. > > I'd hazard a guess that the simple heuristic would do better than what > we're doing now with GENERAL_REGS though or what you're doing with copying. From what you're saying, recomputing seems best, and I'd be happy to submit a patch to remove all the preferred class code from record_reg_classes. However there is still the possibility the preferred class is queried before the recomputation happens (I think that is a case Renlin fixed). Either these should be faulted and fixed by forcing recomputation, or we need to provide a correct preferred class. That should be a copy of the original class. Wilco
On 12/10/14 06:26, Wilco Dijkstra wrote: > > If recomputing is best does that mean that record_reg_classes should not > give a boost to the preferred class in the 2nd pass? Perhaps. I haven't looked deeply at this part of IRA. I was relaying my experiences with (ab)using the ira-reload callbacks to handle allocation after splitting -- where getting the costs and classes updated in a reasonable manner was clearly important to getting good code. One could probably argue I should have kept testcases from that work :-) I don't understand > what purpose this has - if the preferred class is from the first pass, it > is already correct, so all it does is boost the preferred class further. > And if the preferred class is wrong (eg. after live range splitting), it > will boost the incorrect class even harder, so in the end you never get > a different class. It may be historical from the old regclass code, not really sure. > From what you're saying, recomputing seems best, and I'd be happy to submit > a patch to remove all the preferred class code from record_reg_classes. Recomputing certainly helped the cases I was looking at. > > However there is still the possibility the preferred class is queried before > the recomputation happens (I think that is a case Renlin fixed). Either these > should be faulted and fixed by forcing recomputation, or we need to provide a > correct preferred class. That should be a copy of the original class. I believe I had copied the original classes, then recomputed them to avoid any uninitialized memory reads and the like. But looking at the old branch, I don't see the recomputation for classes (though I do see it for costs). Of course all the backwards walk stuff isn't there either, so there's clearly code I worked on extensively, but never committed... Jeff
> Jeff Law wrote: > On 12/10/14 06:26, Wilco Dijkstra wrote: > > > > If recomputing is best does that mean that record_reg_classes should not > > give a boost to the preferred class in the 2nd pass? > Perhaps. I haven't looked deeply at this part of IRA. I was relaying > my experiences with (ab)using the ira-reload callbacks to handle > allocation after splitting -- where getting the costs and classes > updated in a reasonable manner was clearly important to getting good > code. One could probably argue I should have kept testcases from that > work :-) > > > I don't understand > > what purpose this has - if the preferred class is from the first pass, it > > is already correct, so all it does is boost the preferred class further. > > And if the preferred class is wrong (eg. after live range splitting), it > > will boost the incorrect class even harder, so in the end you never get > > a different class. > It may be historical from the old regclass code, not really sure. > > > From what you're saying, recomputing seems best, and I'd be happy to submit > > a patch to remove all the preferred class code from record_reg_classes. > Recomputing certainly helped the cases I was looking at. > > > > However there is still the possibility the preferred class is queried before > > the recomputation happens (I think that is a case Renlin fixed). Either these > > should be faulted and fixed by forcing recomputation, or we need to provide a > > correct preferred class. That should be a copy of the original class. > I believe I had copied the original classes, then recomputed them to > avoid any uninitialized memory reads and the like. But looking at the > old branch, I don't see the recomputation for classes (though I do see > it for costs). Of course all the backwards walk stuff isn't there > either, so there's clearly code I worked on extensively, but never > committed... Well, looking into this further it does look like the preferences are improved during the 2nd pass (in particular fewer cases of the bad ALL_REGS preference that causes all the inefficient code), so it looks like recomputing preferences is not beneficial and my original patch is the right fix for now. https://gcc.gnu.org/ml/gcc-patches/2014-12/msg00829.html Wilco
diff --git a/gcc/ira-emit.c b/gcc/ira-emit.c index d246b7f..d736836 100644 --- a/gcc/ira-emit.c +++ b/gcc/ira-emit.c @@ -348,6 +348,7 @@ rtx ira_create_new_reg (rtx original_reg) { rtx new_reg; + int original_regno = REGNO (original_reg); new_reg = gen_reg_rtx (GET_MODE (original_reg)); ORIGINAL_REGNO (new_reg) = ORIGINAL_REGNO (original_reg); @@ -356,8 +357,16 @@ ira_create_new_reg (rtx original_reg) REG_ATTRS (new_reg) = REG_ATTRS (original_reg); if (internal_flag_ira_verbose > 3 && ira_dump_file != NULL) fprintf (ira_dump_file, " Creating newreg=%i from oldreg=%i\n", - REGNO (new_reg), REGNO (original_reg)); + REGNO (new_reg), original_regno); ira_expand_reg_equiv (); + + /* Copy the preference classes to new_reg. */ + resize_reg_info (); + setup_reg_classes (REGNO (new_reg), + reg_preferred_class (original_regno), + reg_alternate_class (original_regno), + reg_allocno_class (original_regno)); + return new_reg; }