diff mbox

[RFC] Fix spill failure at -O on 64-bit Windows

Message ID 2808912.mi0B0sGudI@polaris
State New
Headers show

Commit Message

Eric Botcazou Oct. 15, 2012, 5:45 p.m. UTC
For the attached Ada testcase, the compiler aborts with a spill failure at -O:

p.adb: In function 'P.F':
p.adb:16:7: error: unable to find a register to spill in class 'DREG'
p.adb:16:7: error: this is the insn:
(insn 141 140 142 17 (parallel [
            (set (reg:SI 0 ax [174])
                (div:SI (subreg:SI (reg:DI 43 r14 [orig:81 iftmp.6 ] [81]) 0)
                    (reg:SI 41 r12 [orig:140 q__l.7 ] [140])))
            (set (reg:SI 43 r14 [175])
                (mod:SI (subreg:SI (reg:DI 43 r14 [orig:81 iftmp.6 ] [81]) 0)
                    (reg:SI 41 r12 [orig:140 q__l.7 ] [140])))
            (clobber (reg:CC 17 flags))
        ]) p.adb:6 349 {*divmodsi4}
     (expr_list:REG_DEAD (reg:SI 41 r12 [orig:140 q__l.7 ] [140])
        (expr_list:REG_DEAD (reg:DI 43 r14 [orig:81 iftmp.6 ] [81])
            (expr_list:REG_UNUSED (reg:SI 43 r14 [175])
                (expr_list:REG_UNUSED (reg:CC 17 flags)
                    (nil))))))
+===========================GNAT BUG DETECTED==============================+
| 4.8.0 20121015 (experimental) [trunk revision 192447] (x86_64-pc-mingw32) 
GCC error:|
| in spill_failure, at reload1.c:2124               

The problem is that LIM hoists instructions preparing an argument register for 
a call out of a loop, without moving the associated clobber:

(insn 249 248 250 33 (clobber (reg:SC 2 cx)) p.adb:12 -1
     (nil))
[...]
(insn 253 251 254 33 (parallel [
            (set (reg:DI 204)
                (and:DI (reg:DI 2 cx)
                    (reg:DI 195)))
            (clobber (reg:CC 17 flags))
        ]) p.adb:12 375 {*anddi_1}
     (expr_list:REG_EQUAL (and:DI (reg:DI 2 cx)
            (const_int -4294967296 [0xffffffff00000000]))
        (expr_list:REG_DEAD (reg:DI 2 cx)
            (expr_list:REG_UNUSED (reg:CC 17 flags)
                (nil)))))

Set in insn 253 is invariant (8), cost 8, depends on 6
Decided to move invariant 8 -- gain 8

This extends the lifetime of the hard register up to the beginning of the 
function, causing reload to die on the complex division instruction.

The attached patch prevents the invariant from being hoisted in this very 
particular case.  Any better idea?

Tested on x86_64-suse-linux.


2012-10-15  Eric Botcazou  <ebotcazou@adacore.com>

	* loop-invariant.c: Include target.h.
	(check_dependency): Return false for an uninitialized argument register
	that is likely to be spilled.
	* Makefile.in (loop-invariant.o): Add $(TARGET_H).


2012-10-15  Eric Botcazou  <ebotcazou@adacore.com>

	* gnat.dg/loop_optimization13.ad[sb]: New test.
	* gnat.dg/loop_optimization13_pkg.ads: New helper.

Comments

Steven Bosscher Oct. 15, 2012, 6:10 p.m. UTC | #1
On Mon, Oct 15, 2012 at 7:45 PM, Eric Botcazou wrote:
> This extends the lifetime of the hard register up to the beginning of the
> function, causing reload to die on the complex division instruction.

Does this still happen after my patch from yesterday to use DF_LIVE in IRA?


> The attached patch prevents the invariant from being hoisted in this very
> particular case.  Any better idea?

Maybe add a cost-free dependency on the clobber, so that it's moved
with the insn?

And/or maybe punt on all likely_spilled hard registers if
-fira-loop-pressure is not in effect, it's unlikely to be a win in any
case.

Ciao!
Steven
Eric Botcazou Oct. 17, 2012, 9:55 a.m. UTC | #2
> Does this still happen after my patch from yesterday to use DF_LIVE in IRA?

Yes, it even fails at -O2.

> Maybe add a cost-free dependency on the clobber, so that it's moved
> with the insn?

Maybe.  But I'm a little worried about (1) extending the lifetime of the hard 
register and (2) simply moving around a clobber of a hard register.

> And/or maybe punt on all likely_spilled hard registers if
> -fira-loop-pressure is not in effect, it's unlikely to be a win in any
> case.

I'd like to keep the change minimal, in particular restricted to uninitialized 
hard registers.  The failure mode is very specific (and might be unique to the 
64-bit ABI on Windows) so I'm not sure it's worth generalizing.
Steven Bosscher Oct. 17, 2012, 11:42 a.m. UTC | #3
On Wed, Oct 17, 2012 at 11:55 AM, Eric Botcazou wrote:
>> Maybe add a cost-free dependency on the clobber, so that it's moved
>> with the insn?
>
> Maybe.  But I'm a little worried about (1) extending the lifetime of the hard
> register and (2) simply moving around a clobber of a hard register.

AFAIU the clobber is there because it was emitted specially for this
particular use of the hard register. So moving it around with the
reason why it's there in the first place would make sense to me.

I don't understand what you mean with extending the life of the hard
register in this case. If you move the clobber with the instruction,
the hard reg is dead before the clobber and after the insn that uses
it, just like when the insn is not hoisted from the loop. So you don't
extend the reg's live range if you move the clobber with it. And if
the register is not dead after the insn, it must be live through the
loop (otherwise it would be only partially available on the loop
entry, coming in from the loop latch) so you don't extend the live
range in that case either.


(One thing I don't fully understand here, though, is why the clobber
doesn't prevent the code motion to begin with. If you don't move the
clobber, that could change the semantics of the program, right? E.g if
you have before LIM something like this:

  while (some_condition) { clobber(ecx); invariant_use (ecx);  ... };
  use (ecx);

and after LIM you have:

  invariant_use (ecx);
  while (some_condition) { clobber(ecx);  ... };
  use (ecx);

that's not the same program anymore if some_condition isn't true...
Not sure if that matters for this kind of uninitialized use.)


> I'd like to keep the change minimal, in particular restricted to uninitialized
> hard registers.  The failure mode is very specific (and might be unique to the
> 64-bit ABI on Windows) so I'm not sure it's worth generalizing.

I've got a great dislike for special cases in the GCC code base, they
tend to grow into common practice at alarmingly fast rates :-)
But I suppose your point of view is more pragmatic in this peculiar case.

Ciao!
Steven
Eric Botcazou Oct. 17, 2012, 2:24 p.m. UTC | #4
> I don't understand what you mean with extending the life of the hard
> register in this case. If you move the clobber with the instruction,
> the hard reg is dead before the clobber and after the insn that uses
> it, just like when the insn is not hoisted from the loop. So you don't
> extend the reg's live range if you move the clobber with it. And if
> the register is not dead after the insn, it must be live through the
> loop (otherwise it would be only partially available on the loop
> entry, coming in from the loop latch) so you don't extend the live
> range in that case either.

In the former case, I agree that you don't really extend the live range.
But you nevertheless extend its "span", i.e. you create a live range for the 
hard register out of the loop, while it's supposed to be used only just before 
the call in the argument preparation.  In the latter case, I'm not sure that 
your conclusion holds: why can't it be used in a second insn and die there?

> (One thing I don't fully understand here, though, is why the clobber
> doesn't prevent the code motion to begin with. If you don't move the
> clobber, that could change the semantics of the program, right? E.g if
> you have before LIM something like this:
> 
>   while (some_condition) { clobber(ecx); invariant_use (ecx);  ... };
>   use (ecx);
> 
> and after LIM you have:
> 
>   invariant_use (ecx);
>   while (some_condition) { clobber(ecx);  ... };
>   use (ecx);
> 
> that's not the same program anymore if some_condition isn't true...
> Not sure if that matters for this kind of uninitialized use.)

I agree that moving an uninitialized use (of a hard register) without moving 
the associated clobber is weird.  But I think most RTL passes don't care and 
this probably doesn't affect the correctness of the generated code.

> I've got a great dislike for special cases in the GCC code base, they
> tend to grow into common practice at alarmingly fast rates :-)
> But I suppose your point of view is more pragmatic in this peculiar case.

I'm just realistic: you don't want to make substantial changes in a powerful 
optimization pass just because of a corner case in an exotic language on a 
semi-exotic platform. :-)
diff mbox

Patch

Index: Makefile.in
===================================================================
--- Makefile.in	(revision 192447)
+++ Makefile.in	(working copy)
@@ -3101,7 +3101,7 @@  loop-iv.o : loop-iv.c $(CONFIG_H) $(SYST
    intl.h $(DIAGNOSTIC_CORE_H) $(DF_H) $(HASHTAB_H)
 loop-invariant.o : loop-invariant.c $(CONFIG_H) $(SYSTEM_H) coretypes.h dumpfile.h \
    $(RTL_H) $(BASIC_BLOCK_H) hard-reg-set.h $(CFGLOOP_H) $(EXPR_H) $(RECOG_H) \
-   $(TM_H) $(TM_P_H) $(FUNCTION_H) $(FLAGS_H) $(DF_H) \
+   $(TM_H) $(TM_P_H) $(FUNCTION_H) $(FLAGS_H) $(DF_H) $(TARGET_H) \
    $(OBSTACK_H) $(HASHTAB_H) $(EXCEPT_H) $(PARAMS_H) $(REGS_H) ira.h
 cfgloopmanip.o : cfgloopmanip.c $(CONFIG_H) $(SYSTEM_H) $(RTL_H) \
    $(BASIC_BLOCK_H) hard-reg-set.h $(CFGLOOP_H) \
Index: loop-invariant.c
===================================================================
--- loop-invariant.c	(revision 192447)
+++ loop-invariant.c	(working copy)
@@ -47,6 +47,7 @@  along with GCC; see the file COPYING3.
 #include "cfgloop.h"
 #include "expr.h"
 #include "recog.h"
+#include "target.h"
 #include "function.h"
 #include "flags.h"
 #include "df.h"
@@ -784,7 +785,22 @@  check_dependency (basic_block bb, df_ref
 
   defs = DF_REF_CHAIN (use);
   if (!defs)
-    return true;
+    {
+      unsigned int regno = DF_REF_REGNO (use);
+
+      /* If this is the use of an uninitialized argument register that is
+	 likely to be spilled, do not move it lest this might extend its
+	 lifetime and cause reload to die.  This can occur for a call to
+	 a function taking complex number arguments and moving the insns
+	 preparing the arguments without moving the call itself wouldn't
+	 gain much in practice.  */
+      if ((DF_REF_FLAGS (use) & DF_HARD_REG_LIVE)
+	  && FUNCTION_ARG_REGNO_P (regno)
+	  && targetm.class_likely_spilled_p (REGNO_REG_CLASS (regno)))
+	return false;
+
+      return true;
+    }
 
   if (defs->next)
     return false;