diff mbox

PATCH [10/n]: Prepare x32: PR rtl-optimization/49114: Reload failed to handle (set reg:X (plus:X (subreg:X (reg:Y) 0) (const_int)))

Message ID 20110625183336.GA9651@intel.com
State New
Headers show

Commit Message

H.J. Lu June 25, 2011, 6:33 p.m. UTC
Hi,

When reload gets:

(insn 588 587 589 28 (set (mem:DF (zero_extend:DI (plus:SI (subreg:SI
(reg/v/f:DI 182 [ b ]) 0) 
                    (const_int 8 [0x8]))) [4 MEM[base: b_96(D), index:
D.15020_278, step: 8, offset: 0B]+0 S8 A64])
        (reg:DF 340 [ D.14980 ])) spooles.c:291 106
{*movdf_internal_rex64}
     (expr_list:REG_DEAD (reg:DF 340 [ D.14980 ])
        (nil)))

it generates:

Reloads for insn # 588
Reload 0: reload_in (DI) = (reg/v/f:DI 182 [ b ]) 
        GENERAL_REGS, RELOAD_FOR_OPERAND_ADDRESS (opnum = 0) 
        reload_in_reg: (reg/v/f:DI 182 [ b ]) 
        reload_reg_rtx: (reg:DI 1 dx)
Reload 1: reload_in (DI) = (zero_extend:DI (plus:SI (subreg:SI
(reg/v/f:DI 182
[ b ]) 0)
                                                        (const_int 8
[0x8])))
        GENERAL_REGS, RELOAD_FOR_OPERAND_ADDRESS (opnum = 0) 
        reload_in_reg: (zero_extend:DI (plus:SI (subreg:SI (reg/v/f:DI
182 [ b
]) 0)
                                                        (const_int 8
[0x8])))
        reload_reg_rtx: (reg:DI 1 dx)
Reload 2: reload_out (DF) = (mem:DF (zero_extend:DI (plus:SI (subreg:SI
(reg/v/f:DI 182 [ b ]) 0)
                                                            (const_int 8
[0x8]))) [4 MEM[base: b_96(D), index: D.15020_278, step: 8, offset:
0B]+0 S8
A64])
        NO_REGS, RELOAD_FOR_OUTPUT (opnum = 0), optional
        reload_out_reg: (mem:DF (zero_extend:DI (plus:SI (subreg:SI
(reg/v/f:DI
182 [ b ]) 0)
                                                            (const_int 8
[0x8]))) [4 MEM[base: b_96(D), index: D.15020_278, step: 8, offset:
0B]+0 S8
A64])

leads to

(insn 1017 587 1020 34 (set (reg:DI 1 dx)
        (mem/c:DI (plus:DI (reg/f:DI 7 sp)
                (const_int 112 [0x70])) [5 %sfp+-208 S8 A64]))
spooles.c:291 62
{*movdi_internal_rex64}
     (nil))

(insn 1020 1017 1022 34 (set (reg:SI 1 dx)
        (const_int 8 [0x8])) spooles.c:291 64 {*movsi_internal}
     (nil))

(insn 1022 1020 1023 34 (set (reg:SI 1 dx)
        (reg:SI 1 dx)) spooles.c:291 64 {*movsi_internal}
     (nil))

(insn 1023 1022 1024 34 (set (reg:SI 1 dx)
        (plus:SI (reg:SI 1 dx)
            (const_int 8 [0x8]))) spooles.c:291 248 {*lea_1_x32}
     (expr_list:REG_EQUIV (plus:SI (subreg:SI (reg:DI 1 dx) 0)
            (const_int 8 [0x8]))
        (nil)))

(insn 1024 1023 588 34 (set (reg:DI 1 dx)
        (zero_extend:DI (reg:SI 1 dx))) spooles.c:291 112
{*zero_extendsidi2_rex64}
     (expr_list:REG_EQUIV (zero_extend:DI (plus:SI (subreg:SI (reg:DI 1
dx) 0)
                (const_int 8 [0x8])))
        (nil)))

(insn 588 1024 589 34 (set (mem:DF (reg:DI 1 dx) [4 MEM[base: b_96(D),
index:
D.15020_278, step: 8, offset: 0B]+0 S8 A64])
        (reg:DF 0 ax [orig:340 D.14980 ] [340])) spooles.c:291 106
{*movdf_internal_rex64}
     (nil))

gen_load has

     if (CONSTANT_P (op1) || MEM_P (op1) || GET_CODE (op1) == SUBREG
          || (REG_P (op1)

For

(plus:SI (subreg:SI (reg/v/f:DI 182 [ b ]) 0) (const_int 8 [0x8]))

it swaps SUBREG and CONST_INT.  It leads to wrong code.  This patch
checks if OP0 is SUBREG before swapping.  OK for trunk?

Thanks.


H.J.
----
2011-06-25  H.J. Lu  <hongjiu.lu@intel.com>

	PR rtl-optimization/49114
	* reload1.c (gen_reload): Properly handle
	(set reg:X (plus:X (subreg:X (reg:Y) 0) (const_int)))

Comments

H.J. Lu June 27, 2011, 6:03 p.m. UTC | #1
On Mon, Jun 27, 2011 at 7:47 AM, Ulrich Weigand <uweigand@de.ibm.com> wrote:
> H.J. Lu wrote:
>
>> When reload gets:
>>
>> (insn 588 587 589 28 (set (mem:DF (zero_extend:DI (plus:SI (subreg:SI
>> (reg/v/f:DI 182 [ b ]) 0)
>>                     (const_int 8 [0x8]))) [4 MEM[base: b_96(D), index:
>> D.15020_278, step: 8, offset: 0B]+0 S8 A64])
>>         (reg:DF 340 [ D.14980 ])) spooles.c:291 106
>> {*movdf_internal_rex64}
>>      (expr_list:REG_DEAD (reg:DF 340 [ D.14980 ])
>>         (nil)))
>
> Reloading of PLUS expressions is a long-standing problem.  gen_reload
> supports those only for PLUSes that look more or less like address
> computations, and then only the "usual" cases.
>
> Is the address above (once the pseudo reg:DI 182 is replaced by
> a hard reg) really a legitimate address on your platform?  If not,
> this would need to be fixed at some earlier place.
>
> If this *is* a valid address (and just not valid for this particular
> insn pattern), the back-end needs to provide some means to reload to
> allow reloading of such expressions.  This can be done either by
> providing an insn (plus post-reload splitter if necessary), or else
> defining a secondary reload to handle the case where additional
> registers are required.  Assuming the generic gen_reload code is
> powerful enough to handle complex expressions like this is probably
> not wise ...
>
> In any case, however, gen_reload should not generate *wrong*
> code, so there's indeed a bug here.
>
> However, this:
>
>> -      if (CONSTANT_P (op1) || MEM_P (op1) || GET_CODE (op1) == SUBREG
>> +      if ((GET_CODE (op0) != SUBREG
>> +        && (CONSTANT_P (op1) || MEM_P (op1)))
>> +       || GET_CODE (op1) == SUBREG
>>         || (REG_P (op1)
>>             && REGNO (op1) >= FIRST_PSEUDO_REGISTER)
>>         || (code != CODE_FOR_nothing
>
> doesn't look like the proper fix for all cases.  The actual problem
> here is that this part of gen_reload takes the approach to transform
>
>     out <- op0 + op1
>
> into
>
>     out <- op0
>     out <- out + op1
>
> which is invalid if writing to out clobbers op1.
>
> This means that:
>
> - The "if" testing whether to swap op0 and op1 should verify
>    !reg_overlap_mentioned_p (out, op0)
>
> - Regardless of whether we swapped or not, there needs to be a
>    gcc_assert (!reg_overlap_mentioned_p (out, op1));
>  before the gen_reload (out, op0, opnum, type) line.
>
> There may still be cases where the algorithm of gen_reload doesn't
> work, but at least we'll get an ICE instead of wrong code now.
> Those cases will have to be fixed by the back-end as described
> above ...
>

The problem is reload 0 puts OP1 in OUT. Adding

gcc_assert (!reg_overlap_mentioned_p (out, op1));

doesn't help in reload 2.  How can I check if OP1 overlaps with
OUT in previous reload?
H.J. Lu June 27, 2011, 6:40 p.m. UTC | #2
On Mon, Jun 27, 2011 at 11:28 AM, Ulrich Weigand <uweigand@de.ibm.com> wrote:
> H.J. Lu wrote:
>> On Mon, Jun 27, 2011 at 7:47 AM, Ulrich Weigand <uweigand@de.ibm.com> wrote:
>> > The actual problem
>> > here is that this part of gen_reload takes the approach to transform
>> >
>> >  out <- op0 + op1
>> >
>> > into
>> >
>> >  out <- op0
>> >  out <- out + op1
>> >
>> > which is invalid if writing to out clobbers op1.
>
>> The problem is reload 0 puts OP1 in OUT. Adding
>>
>> gcc_assert (!reg_overlap_mentioned_p (out, op1));
>>
>> doesn't help in reload 2.  How can I check if OP1 overlaps with
>> OUT in previous reload?
>
> Sorry, I don't understand how previous reloads come into play here.
> gen_reload is supposed to load "in" (which happens to be of the
> form op0 + op1) into "out", which means it is of course supposed
> to clobber "out" (as long as that doesn't implictly clobber op0
> or op1 before they're used).  Any conflicts with other reloads ought
> to have been resolved earlier.
>
> Can you elaborate?
>

Here is the output from reload:

Reloads for insn # 588
Reload 0: reload_in (DI) = (reg/v/f:DI 182 [ b ])
        GENERAL_REGS, RELOAD_FOR_OPERAND_ADDRESS (opnum = 0)
        reload_in_reg: (reg/v/f:DI 182 [ b ])
        reload_reg_rtx: (reg:DI 1 dx)
Reload 1: reload_in (DI) = (zero_extend:DI (plus:SI (subreg:SI (reg/v/f:DI 182
[ b ]) 0)
                                                        (const_int 8 [0x8])))
        GENERAL_REGS, RELOAD_FOR_OPERAND_ADDRESS (opnum = 0)
        reload_in_reg: (zero_extend:DI (plus:SI (subreg:SI (reg/v/f:DI 182 [ b
]) 0)
                                                        (const_int 8 [0x8])))
        reload_reg_rtx: (reg:DI 1 dx)
Reload 2: reload_out (DF) = (mem:DF (zero_extend:DI (plus:SI (subreg:SI
(reg/v/f:DI 182 [ b ]) 0)
                                                            (const_int 8
[0x8]))) [4 MEM[base: b_96(D), index: D.15020_278, step: 8, offset: 0B]+0 S8
A64])
        NO_REGS, RELOAD_FOR_OUTPUT (opnum = 0), optional
        reload_out_reg: (mem:DF (zero_extend:DI (plus:SI (subreg:SI (reg/v/f:DI
182 [ b ]) 0)
                                                            (const_int 8
[0x8]))) [4 MEM[base: b_96(D), index: D.15020_278, step: 8, offset: 0B]+0 S8
A64])

leads to

(insn 1017 587 1020 34 (set (reg:DI 1 dx)
        (mem/c:DI (plus:DI (reg/f:DI 7 sp)
                (const_int 112 [0x70])) [5 %sfp+-208 S8 A64])) spooles.c:291 62
{*movdi_internal_rex64}
     (nil))

(insn 1020 1017 1022 34 (set (reg:SI 1 dx)
        (const_int 8 [0x8])) spooles.c:291 64 {*movsi_internal}
     (nil))

(insn 1022 1020 1023 34 (set (reg:SI 1 dx)
        (reg:SI 1 dx)) spooles.c:291 64 {*movsi_internal}
     (nil))

(insn 1023 1022 1024 34 (set (reg:SI 1 dx)
        (plus:SI (reg:SI 1 dx)
            (const_int 8 [0x8]))) spooles.c:291 248 {*lea_1_x32}
     (expr_list:REG_EQUIV (plus:SI (subreg:SI (reg:DI 1 dx) 0)
            (const_int 8 [0x8]))
        (nil)))

(insn 1024 1023 588 34 (set (reg:DI 1 dx)
        (zero_extend:DI (reg:SI 1 dx))) spooles.c:291 112
{*zero_extendsidi2_rex64}
     (expr_list:REG_EQUIV (zero_extend:DI (plus:SI (subreg:SI (reg:DI 1 dx) 0)
                (const_int 8 [0x8])))
        (nil)))

(insn 588 1024 589 34 (set (mem:DF (reg:DI 1 dx) [4 MEM[base: b_96(D), index:
D.15020_278, step: 8, offset: 0B]+0 S8 A64])
        (reg:DF 0 ax [orig:340 D.14980 ] [340])) spooles.c:291 106
{*movdf_internal_rex64}
     (nil))

Reload 0 puts (reg/v/f:DI 182 [ b ]) in  (reg:DI 1 dx) for input.
However, reload 2
puts (reg/v/f:DI 182 [ b ]) in  (reg:DI 1 dx) for output.without checking what
reload 0 did.
H.J. Lu June 27, 2011, 8:42 p.m. UTC | #3
On Mon, Jun 27, 2011 at 11:59 AM, Ulrich Weigand <uweigand@de.ibm.com> wrote:
> H.J. Lu wrote:
>
>> Reloads for insn # 588
>> Reload 0: reload_in (DI) =3D (reg/v/f:DI 182 [ b ])
>>         GENERAL_REGS, RELOAD_FOR_OPERAND_ADDRESS (opnum =3D 0)
>>         reload_in_reg: (reg/v/f:DI 182 [ b ])
>>         reload_reg_rtx: (reg:DI 1 dx)
>> Reload 1: reload_in (DI) =3D (zero_extend:DI (plus:SI (subreg:SI (reg/v/f:D=
>> I 182
>> [ b ]) 0)
>>                                                         (const_int 8 [0x8])=
>> ))
>>         GENERAL_REGS, RELOAD_FOR_OPERAND_ADDRESS (opnum =3D 0)
>>         reload_in_reg: (zero_extend:DI (plus:SI (subreg:SI (reg/v/f:DI 182 =
>> [ b
>> ]) 0)
>>                                                         (const_int 8 [0x8])=
>> ))
>>         reload_reg_rtx: (reg:DI 1 dx)
>> Reload 2: reload_out (DF) =3D (mem:DF (zero_extend:DI (plus:SI (subreg:SI
>> (reg/v/f:DI 182 [ b ]) 0)
>>                                                             (const_int 8
>> [0x8]))) [4 MEM[base: b_96(D), index: D.15020_278, step: 8, offset: 0B]+0 S8
>> A64])
>>         NO_REGS, RELOAD_FOR_OUTPUT (opnum =3D 0), optional
>>         reload_out_reg: (mem:DF (zero_extend:DI (plus:SI (subreg:SI (reg/v/=
>> f:DI
>> 182 [ b ]) 0)
>>                                                             (const_int 8
>> [0x8]))) [4 MEM[base: b_96(D), index: D.15020_278, step: 8, offset: 0B]+0 S8
>> A64])
>>
>> leads to
>>
>
>> (insn 1017 587 1020 34 (set (reg:DI 1 dx)
>>         (mem/c:DI (plus:DI (reg/f:DI 7 sp)
>>                 (const_int 112 [0x70])) [5 %sfp+-208 S8 A64])) spooles.c:29=
>> 1 62
>> {*movdi_internal_rex64}
>>      (nil))
>
> So this is the reload insn generated from reload 0.  So far so good.
>
>> (insn 1020 1017 1022 34 (set (reg:SI 1 dx)
>>         (const_int 8 [0x8])) spooles.c:291 64 {*movsi_internal}
>>      (nil))
>>
>> (insn 1022 1020 1023 34 (set (reg:SI 1 dx)
>>         (reg:SI 1 dx)) spooles.c:291 64 {*movsi_internal}
>>      (nil))
>>
>> (insn 1023 1022 1024 34 (set (reg:SI 1 dx)
>>         (plus:SI (reg:SI 1 dx)
>>             (const_int 8 [0x8]))) spooles.c:291 248 {*lea_1_x32}
>>      (expr_list:REG_EQUIV (plus:SI (subreg:SI (reg:DI 1 dx) 0)
>>             (const_int 8 [0x8]))
>>         (nil)))
>>
>> (insn 1024 1023 588 34 (set (reg:DI 1 dx)
>>         (zero_extend:DI (reg:SI 1 dx))) spooles.c:291 112
>> {*zero_extendsidi2_rex64}
>>      (expr_list:REG_EQUIV (zero_extend:DI (plus:SI (subreg:SI (reg:DI 1 dx)=
>>  0)
>>                 (const_int 8 [0x8])))
>>         (nil)))
>
> All these reload insns are generated from reload 1.
>
>> (insn 588 1024 589 34 (set (mem:DF (reg:DI 1 dx) [4 MEM[base: b_96(D), inde=
>> x:
>> D.15020_278, step: 8, offset: 0B]+0 S8 A64])
>>         (reg:DF 0 ax [orig:340 D.14980 ] [340])) spooles.c:291 106
>> {*movdf_internal_rex64}
>>      (nil))
>
> This is the original reloaded insn.
>
>> Reload 0 puts (reg/v/f:DI 182 [ b ]) in  (reg:DI 1 dx) for input.
>> However, reload 2
>> puts (reg/v/f:DI 182 [ b ]) in  (reg:DI 1 dx) for output.without checking w=
>> hat
>> reload 0 did.
>
> Reload 2 is an optional reload which reload chose not to utilize, so this
> is not really relevant here in any case.  There is no output reload.
>
> The wrong code above originates from how reload 1 is handled:
>
> gen_reload is called to load the ZERO_EXTEND into (reg:DI 1).  This triggers
> the "unary predicate" path, which recurses into gen_reload to load the operand
> of the ZERO_EXTEND (reg:SI 1), and subsequently generates insn 1024.
>
> The recursive call loads (plus:SI (subreg:SI (reg:DI 1)) (const_int 8)) into
> (reg:SI 1).  It attempts to do that in a single SET and fails (for some
> reason).  It then attempts to load the constant (const_int 8) into the
> destination register (insn 1020) [** which is broken **], and re-tries.
> This still fails, so it falls through to the last attempt, which is to
> instead copy the subreg to the destination (which results in insn 1022
> as the subreg is optimized away at this point), followed by adding the
> constant.
>
> Note that the point marked with "[** which is broken **]" is the place
> I pointed out in the previous mail.
>

reload generates:

(insn 914 912 0 (set (reg:SI 0 ax)
        (plus:SI (subreg:SI (reg/v/f:DI 182 [ b ]) 0)
            (const_int 8 [0x8]))) 248 {*lea_1_x32}
     (nil))

from

insn = emit_insn_if_valid_for_reload (gen_rtx_SET (VOIDmode, out, in));

Since (reg/v/f:DI 182 [ b ]) is a pseudo register, it is
rejected by

      if ((strict && ! REG_OK_FOR_BASE_STRICT_P (reg))
          || (! strict && ! REG_OK_FOR_BASE_NONSTRICT_P (reg)))
        /* Base is not valid.  */
        return false;

in ix86_legitimate_address_p.
H.J. Lu June 27, 2011, 10:19 p.m. UTC | #4
On Mon, Jun 27, 2011 at 3:08 PM, Ulrich Weigand <uweigand@de.ibm.com> wrote:
> H.J. Lu wrote:
>
>> reload generates:
>>
>> (insn 914 912 0 (set (reg:SI 0 ax)
>>         (plus:SI (subreg:SI (reg/v/f:DI 182 [ b ]) 0)
>>             (const_int 8 [0x8]))) 248 {*lea_1_x32}
>>      (nil))
>>
>> from
>>
>> insn = emit_insn_if_valid_for_reload (gen_rtx_SET (VOIDmode, out, in));
>
> Interesting.  The pseudo should have been replaced by the
> hard register (reg:DI 1) during the preceding call to
>      op0 = find_replacement (&XEXP (in, 0));
> (since reload 0 should have pushed a replacement record.)
>
> Interestingly enough, in the final output that replacement *is*
> performed in the REG_EQUIV note:
>
> (insn 1023 1022 1024 34 (set (reg:SI 1 dx)
>        (plus:SI (reg:SI 1 dx)
>            (const_int 8 [0x8]))) spooles.c:291 248 {*lea_1_x32}
>     (expr_list:REG_EQUIV (plus:SI (subreg:SI (reg:DI 1 dx) 0)
>            (const_int 8 [0x8]))
>        (nil)))
>
> which is why I hadn't expected this to be a problem here.
>
> Can you try to find out why the find_replacement doesn't work
> with your test case?
>

I will investigate.  Could (reg:SI 1 dx) vs  (subreg:SI (reg:DI 1 dx) 0)
a problem?
H.J. Lu June 27, 2011, 10:25 p.m. UTC | #5
On Mon, Jun 27, 2011 at 3:19 PM, H.J. Lu <hjl.tools@gmail.com> wrote:
> On Mon, Jun 27, 2011 at 3:08 PM, Ulrich Weigand <uweigand@de.ibm.com> wrote:
>> H.J. Lu wrote:
>>
>>> reload generates:
>>>
>>> (insn 914 912 0 (set (reg:SI 0 ax)
>>>         (plus:SI (subreg:SI (reg/v/f:DI 182 [ b ]) 0)
>>>             (const_int 8 [0x8]))) 248 {*lea_1_x32}
>>>      (nil))
>>>
>>> from
>>>
>>> insn = emit_insn_if_valid_for_reload (gen_rtx_SET (VOIDmode, out, in));
>>
>> Interesting.  The pseudo should have been replaced by the
>> hard register (reg:DI 1) during the preceding call to
>>      op0 = find_replacement (&XEXP (in, 0));
>> (since reload 0 should have pushed a replacement record.)
>>
>> Interestingly enough, in the final output that replacement *is*
>> performed in the REG_EQUIV note:
>>
>> (insn 1023 1022 1024 34 (set (reg:SI 1 dx)
>>        (plus:SI (reg:SI 1 dx)
>>            (const_int 8 [0x8]))) spooles.c:291 248 {*lea_1_x32}
>>     (expr_list:REG_EQUIV (plus:SI (subreg:SI (reg:DI 1 dx) 0)
>>            (const_int 8 [0x8]))
>>        (nil)))
>>
>> which is why I hadn't expected this to be a problem here.
>>
>> Can you try to find out why the find_replacement doesn't work
>> with your test case?
>>
>
> I will investigate.  Could (reg:SI 1 dx) vs  (subreg:SI (reg:DI 1 dx) 0)
> a problem?
>

find_replacement never checks subreg:

Breakpoint 3, find_replacement (loc=0x7ffff068ab00)
    at /export/gnu/import/git/gcc-x32/gcc/reload.c:6411
6411	      if (reloadreg && r->where == loc)
(reg:DI 0 ax)
(reg/v/f:DI 182 [ b ])
(gdb) call debug_rtx (*loc)
(subreg:SI (reg/v/f:DI 182 [ b ]) 0)
(gdb)
diff mbox

Patch

diff --git a/gcc/reload1.c b/gcc/reload1.c
index 4a697c2..d618a29 100644
--- a/gcc/reload1.c
+++ b/gcc/reload1.c
@@ -8528,7 +8528,9 @@  gen_reload (rtx out, rtx in, int opnum, enum reload_type type)
 
       code = optab_handler (add_optab, GET_MODE (out));
 
-      if (CONSTANT_P (op1) || MEM_P (op1) || GET_CODE (op1) == SUBREG
+      if ((GET_CODE (op0) != SUBREG
+	   && (CONSTANT_P (op1) || MEM_P (op1)))
+	  || GET_CODE (op1) == SUBREG
 	  || (REG_P (op1)
 	      && REGNO (op1) >= FIRST_PSEUDO_REGISTER)
 	  || (code != CODE_FOR_nothing