diff mbox

Fix IRA register preferencing

Message ID 000b01d013e5$4bc87010$e3595030$@com
State New
Headers show

Commit Message

Wilco Dec. 9, 2014, 7:21 p.m. UTC
This fixes a bug in register preferencing. When live range splitting creates a new register from
another, it copies most fields except for the register preferences. The preference GENERAL_REGS is
used as reg_pref[i].prefclass is initialized with GENERAL_REGS in allocate_reg_info () and
resize_reg_info ().

This initialization value is not innocuous like the comment suggests - if a new register has a
non-integer mode, it is forced to prefer GENERAL_REGS. This changes the register costs in pass 2 so
that they are incorrect. As a result the liverange is either spilled or allocated to an integer
register:

void g(double);
void f(double x) 
{ 
  if (x == 0) 
    return; 
  g (x);
  g (x); 
}

f:
        fcmp    d0, #0.0
        bne     .L6
        ret
.L6:
        stp     x19, x30, [sp, -16]!
        fmov    x19, d0
        bl      g
        fmov    d0, x19
        ldp     x19, x30, [sp], 16
        b       g

With the fix it uses a floating point register as expected. Given a similar issue in
https://gcc.gnu.org/ml/gcc-patches/2014-11/msg02253.html, would it not be better to change the
initialization values of reg_pref to illegal register classes so this kind of issue can be trivially
found with an assert? Also would it not be a good idea to have a single register copy function that
ensures all data is copied?


ChangeLog: 2014-12-09  Wilco Dijkstra  wdijkstr@arm.com

	* gcc/ira-emit.c (ira_create_new_reg): Copy preference classes.

---
 gcc/ira-emit.c | 11 ++++++++++-
 1 file changed, 10 insertions(+), 1 deletion(-)

Comments

Jeff Law Dec. 9, 2014, 11:17 p.m. UTC | #1
On 12/09/14 12:21, Wilco Dijkstra wrote:
> This fixes a bug in register preferencing. When live range splitting creates a new register from
> another, it copies most fields except for the register preferences. The preference GENERAL_REGS is
> used as reg_pref[i].prefclass is initialized with GENERAL_REGS in allocate_reg_info () and
> resize_reg_info ().
>
> This initialization value is not innocuous like the comment suggests - if a new register has a
> non-integer mode, it is forced to prefer GENERAL_REGS. This changes the register costs in pass 2 so
> that they are incorrect. As a result the liverange is either spilled or allocated to an integer
> register:
>
> void g(double);
> void f(double x)
> {
>    if (x == 0)
>      return;
>    g (x);
>    g (x);
> }
>
> f:
>          fcmp    d0, #0.0
>          bne     .L6
>          ret
> .L6:
>          stp     x19, x30, [sp, -16]!
>          fmov    x19, d0
>          bl      g
>          fmov    d0, x19
>          ldp     x19, x30, [sp], 16
>          b       g
>
> With the fix it uses a floating point register as expected. Given a similar issue in
> https://gcc.gnu.org/ml/gcc-patches/2014-11/msg02253.html, would it not be better to change the
> initialization values of reg_pref to illegal register classes so this kind of issue can be trivially
> found with an assert? Also would it not be a good idea to have a single register copy function that
> ensures all data is copied?
But there are other times when you really don't want to copy, say when 
the original had a small class, but the copy can go into a larger class.

I banged my head on this when I was doing similar work on range 
splitting a few years back and ended up recomputing the preferred and 
alternate class information.  That was much better than copying the 
classes.

I pondered heuristics to expand the preferred class, but never 
implemented anything IIRC.  A trivial heuristic would be to bump to the 
next larger class if the given class was a singleton, otherwise copy the 
class.

The obvious counter to that heuristic is an allocno that has two ranges 
(or N ranges) where we would prefer a different singleton class for each 
range.  In fact, I'm pretty sure I ran into this kind of situation and 
that led me down the "just recompute it" path.

I'd hazard a guess that the simple heuristic would do better than what 
we're doing now with GENERAL_REGS though or what you're doing with copying.

jeff
Wilco Dec. 10, 2014, 1:26 p.m. UTC | #2
> Jeff Law wrote:
> On 12/09/14 12:21, Wilco Dijkstra wrote:
> > With the fix it uses a floating point register as expected. Given a similar issue in
> > https://gcc.gnu.org/ml/gcc-patches/2014-11/msg02253.html, would it not be better to change
> the
> > initialization values of reg_pref to illegal register classes so this kind of issue can be
> trivially
> > found with an assert? Also would it not be a good idea to have a single register copy
> function that
> > ensures all data is copied?
> But there are other times when you really don't want to copy, say when
> the original had a small class, but the copy can go into a larger class.
> 
> I banged my head on this when I was doing similar work on range
> splitting a few years back and ended up recomputing the preferred and
> alternate class information.  That was much better than copying the
> classes.

If recomputing is best does that mean that record_reg_classes should not
give a boost to the preferred class in the 2nd pass? I don't understand
what purpose this has - if the preferred class is from the first pass, it
is already correct, so all it does is boost the preferred class further. 
And if the preferred class is wrong (eg. after live range splitting), it 
will boost the incorrect class even harder, so in the end you never get 
a different class.

> I pondered heuristics to expand the preferred class, but never
> implemented anything IIRC.  A trivial heuristic would be to bump to the
> next larger class if the given class was a singleton, otherwise copy the
> class.
> 
> The obvious counter to that heuristic is an allocno that has two ranges
> (or N ranges) where we would prefer a different singleton class for each
> range.  In fact, I'm pretty sure I ran into this kind of situation and
> that led me down the "just recompute it" path.
> 
> I'd hazard a guess that the simple heuristic would do better than what
> we're doing now with GENERAL_REGS though or what you're doing with copying.

From what you're saying, recomputing seems best, and I'd be happy to submit
a patch to remove all the preferred class code from record_reg_classes.

However there is still the possibility the preferred class is queried before
the recomputation happens (I think that is a case Renlin fixed). Either these
should be faulted and fixed by forcing recomputation, or we need to provide a 
correct preferred class. That should be a copy of the original class.

Wilco
Jeff Law Dec. 10, 2014, 8:47 p.m. UTC | #3
On 12/10/14 06:26, Wilco Dijkstra wrote:
>
> If recomputing is best does that mean that record_reg_classes should not
> give a boost to the preferred class in the 2nd pass?
Perhaps.  I haven't looked deeply at this part of IRA.  I was relaying 
my experiences with (ab)using the ira-reload callbacks to handle 
allocation after splitting -- where getting the costs and classes 
updated in a reasonable manner was clearly important to getting good 
code.  One could probably argue I should have kept testcases from that 
work :-)


I don't understand
> what purpose this has - if the preferred class is from the first pass, it
> is already correct, so all it does is boost the preferred class further.
> And if the preferred class is wrong (eg. after live range splitting), it
> will boost the incorrect class even harder, so in the end you never get
> a different class.
It may be historical from the old regclass code, not really sure.

>  From what you're saying, recomputing seems best, and I'd be happy to submit
> a patch to remove all the preferred class code from record_reg_classes.
Recomputing certainly helped the cases I was looking at.
>
> However there is still the possibility the preferred class is queried before
> the recomputation happens (I think that is a case Renlin fixed). Either these
> should be faulted and fixed by forcing recomputation, or we need to provide a
> correct preferred class. That should be a copy of the original class.
I believe I had copied the original classes, then recomputed them to 
avoid any uninitialized memory reads and the like.  But looking at the 
old branch, I don't see the recomputation for classes (though I do see 
it for costs).  Of course all the backwards walk stuff isn't there 
either, so there's clearly code I worked on extensively, but never 
committed...

Jeff
Wilco April 27, 2015, 3:01 p.m. UTC | #4
> Jeff Law wrote:
> On 12/10/14 06:26, Wilco Dijkstra wrote:
> >
> > If recomputing is best does that mean that record_reg_classes should not
> > give a boost to the preferred class in the 2nd pass?
> Perhaps.  I haven't looked deeply at this part of IRA.  I was relaying
> my experiences with (ab)using the ira-reload callbacks to handle
> allocation after splitting -- where getting the costs and classes
> updated in a reasonable manner was clearly important to getting good
> code.  One could probably argue I should have kept testcases from that
> work :-)
> 
> 
> I don't understand
> > what purpose this has - if the preferred class is from the first pass, it
> > is already correct, so all it does is boost the preferred class further.
> > And if the preferred class is wrong (eg. after live range splitting), it
> > will boost the incorrect class even harder, so in the end you never get
> > a different class.
> It may be historical from the old regclass code, not really sure.
> 
> >  From what you're saying, recomputing seems best, and I'd be happy to submit
> > a patch to remove all the preferred class code from record_reg_classes.
> Recomputing certainly helped the cases I was looking at.
> >
> > However there is still the possibility the preferred class is queried before
> > the recomputation happens (I think that is a case Renlin fixed). Either these
> > should be faulted and fixed by forcing recomputation, or we need to provide a
> > correct preferred class. That should be a copy of the original class.
> I believe I had copied the original classes, then recomputed them to
> avoid any uninitialized memory reads and the like.  But looking at the
> old branch, I don't see the recomputation for classes (though I do see
> it for costs).  Of course all the backwards walk stuff isn't there
> either, so there's clearly code I worked on extensively, but never
> committed...

Well, looking into this further it does look like the preferences are improved
during the 2nd pass (in particular fewer cases of the bad ALL_REGS preference
that causes all the inefficient code), so it looks like recomputing preferences
is not beneficial and my original patch is the right fix for now.

https://gcc.gnu.org/ml/gcc-patches/2014-12/msg00829.html

Wilco
diff mbox

Patch

diff --git a/gcc/ira-emit.c b/gcc/ira-emit.c
index d246b7f..d736836 100644
--- a/gcc/ira-emit.c
+++ b/gcc/ira-emit.c
@@ -348,6 +348,7 @@  rtx
 ira_create_new_reg (rtx original_reg)
 {
   rtx new_reg;
+  int original_regno = REGNO (original_reg);
 
   new_reg = gen_reg_rtx (GET_MODE (original_reg));
   ORIGINAL_REGNO (new_reg) = ORIGINAL_REGNO (original_reg);
@@ -356,8 +357,16 @@  ira_create_new_reg (rtx original_reg)
   REG_ATTRS (new_reg) = REG_ATTRS (original_reg);
   if (internal_flag_ira_verbose > 3 && ira_dump_file != NULL)
     fprintf (ira_dump_file, "      Creating newreg=%i from oldreg=%i\n",
-	     REGNO (new_reg), REGNO (original_reg));
+	     REGNO (new_reg), original_regno);
   ira_expand_reg_equiv ();
+
+  /* Copy the preference classes to new_reg.  */
+  resize_reg_info ();
+  setup_reg_classes (REGNO (new_reg),
+		     reg_preferred_class (original_regno),
+		     reg_alternate_class (original_regno),
+		     reg_allocno_class (original_regno));
+
   return new_reg;
 }