Message ID | 517A8974.7000202@codesourcery.com |
---|---|
State | New |
Headers | show |
Bernd Schmidt <bernds@codesourcery.com> writes: > This patch here: > http://gcc.gnu.org/ml/gcc-patches/2012-10/msg00661.html > > changed simplification code from > case TRUNCATE: > - /* We can't handle truncation to a partial integer mode here > - because we don't know the real bitsize of the partial > - integer mode. */ > - if (GET_MODE_CLASS (mode) == MODE_PARTIAL_INT) > - break; > > to > + if (GET_MODE_CLASS (mode) == MODE_PARTIAL_INT) > + { > + if (TRULY_NOOP_TRUNCATION_MODES_P (mode, GET_MODE (op))) > + return rtl_hooks.gen_lowpart_no_emit (mode, op); > + /* We can't handle truncation to a partial integer mode here > + because we don't know the real bitsize of the partial > + integer mode. */ > + break; > + } > > This is problematic for m32c; it defines TRULY_NOOP_TRUNCATION as 1, and > it's not really possible to define it meaningfully for partial int > modes, since it only gets passed precisions. Allowing subregs of PSImode > values leads to out of registers reload failures, so it kind of relies > on the previous behaviour. Argh, that's unfortunate. The point of that change was to make simplify_gen_unary (TRUNCATE, ...) no worse than using a subreg. Would the equivalent lowpart simplify_gen_subreg call succeed (return nonnull)? If so, I think we want truncate to do the same. What simplification is this blocking, and why does it lead to reload failures? Thanks, Richard
On 04/27/2013 10:39 AM, Richard Sandiford wrote: > Argh, that's unfortunate. The point of that change was to make > simplify_gen_unary (TRUNCATE, ...) no worse than using a subreg. > Would the equivalent lowpart simplify_gen_subreg call succeed > (return nonnull)? If so, I think we want truncate to do the same. > > What simplification is this blocking, and why does it lead to > reload failures? There's an explicit (set (reg:PSI) (truncate:PSI (reg:SI)) insn which currently gets changed to (set (reg:PSI) (subreg:PSI (reg:SI)) during cse1. Reload fails because the subreg gets propagated into a memory address, which requires a class of A_REGS, but A_REGS can only hold PSImode values, not SImode. This shows that the truncation is not always a no-op: in this case it involves a register move, but there's no way to describe this using TRULY_NOOP_TRUNCATION. Bernd
Bernd Schmidt <bernds@codesourcery.com> writes: > On 04/27/2013 10:39 AM, Richard Sandiford wrote: >> Argh, that's unfortunate. The point of that change was to make >> simplify_gen_unary (TRUNCATE, ...) no worse than using a subreg. >> Would the equivalent lowpart simplify_gen_subreg call succeed >> (return nonnull)? If so, I think we want truncate to do the same. >> >> What simplification is this blocking, and why does it lead to >> reload failures? > > There's an explicit (set (reg:PSI) (truncate:PSI (reg:SI)) insn which > currently gets changed to (set (reg:PSI) (subreg:PSI (reg:SI)) during > cse1. Reload fails because the subreg gets propagated into a memory > address, which requires a class of A_REGS, but A_REGS can only hold > PSImode values, not SImode. This shows that the truncation is not > always a no-op: in this case it involves a register move, but there's no > way to describe this using TRULY_NOOP_TRUNCATION. Hmm, but isn't this a reload bug? We have: (insn 53 51 54 10 (set (reg:HI 0 r0 [orig:26 D.2817 ] [26]) (zero_extend:HI (mem/u/j:QI (plus:PSI (subreg:PSI (reg:SI 44 [ D.2818 ]) 0) (symbol_ref:PSI ("__clz_tab") [flags 0x40] <var_decl 0x7f2c253d42f8 __clz_tab>)) [0 __clz_tab S1 A8]))) /home/richards/gcc/HEAD/gcc/libgcc/libgcc2.c:520 115 {zero_extendqihi2} (expr_list:REG_DEAD (reg:SI 44 [ D.2818 ]) (nil))) Reloads for insn # 53 Reload 0: reload_in (SI) = (reg:SI 44 [ D.2818 ]) A_REGS, RELOAD_FOR_OTHER_ADDRESS (opnum = 0) reload_in_reg: (reg:SI 44 [ D.2818 ]) find_reloads_address_1 is reloading the SUBREG_REG rather than the SUBREG itself, even though SImode is not valid for BASE_REGS == A_REGS: if (GET_CODE (op0) == SUBREG) { op0 = SUBREG_REG (op0); code0 = GET_CODE (op0); if (code0 == REG && REGNO (op0) < FIRST_PSEUDO_REGISTER) op0 = gen_rtx_REG (word_mode, (REGNO (op0) + subreg_regno_offset (REGNO (SUBREG_REG (orig_op0)), GET_MODE (SUBREG_REG (orig_op0)), SUBREG_BYTE (orig_op0), GET_MODE (orig_op0)))); } push_reloads would specifically not convert a SUBREG reload to a REG reload in this case. In principle, I think address subregs should be handled in the same way. So is the problem really that (subreg:PSI (reg:SI ...)) isn't a valid truncation on m32c? Without TRULY_NOOP_TRUNCATION, I don't see what forces most code to use (truncate:PSI (reg:SI ...)) instead. Many places would call gen_lowpart directly. Sorry for missing the truncation patterns, I should have grepped more than m32c.md. They look a lot like normal moves though. Is truncation really not a noop, or are the patterns there to work around something (probably this :-))? Richard
> Sorry for missing the truncation patterns, I should have grepped > more than m32c.md. They look a lot like normal moves though. Is > truncation really not a noop, or are the patterns there to work > around something (probably this :-))? Not sure which pattern you're talking about, but in general, the m32c's registers are either 16-bit or 24-bit. You can move a pair of 16-bit registers into a 24-bit register and it truncates as part of the move, likewise from 32-bit memory to 24-bit reg. Note that moves to other 32-bit destinations do *not* truncate, nor can 24-bit registers hold 32-bit values (duh). The 24-bit registers may also hold a 16-bit value. If you move a 16-bit value into a 24-bit register, it zero_extends.
DJ Delorie <dj@redhat.com> writes: >> Sorry for missing the truncation patterns, I should have grepped >> more than m32c.md. They look a lot like normal moves though. Is >> truncation really not a noop, or are the patterns there to work >> around something (probably this :-))? > > Not sure which pattern you're talking about, but in general, the > m32c's registers are either 16-bit or 24-bit. You can move a pair of > 16-bit registers into a 24-bit register and it truncates as part of > the move, likewise from 32-bit memory to 24-bit reg. Note that moves > to other 32-bit destinations do *not* truncate, nor can 24-bit > registers hold 32-bit values (duh). The 24-bit registers may also > hold a 16-bit value. The pattern in this case was: (define_insn "truncsipsi2_24" [(set (match_operand:PSI 0 "m32c_nonimmediate_operand" "=RsiSd*Rmm,Raa,!Rcl,RsiSd*Rmm") (truncate:PSI (match_operand:SI 1 "m32c_nonimmediate_operand" "0,RsiSd*Rmm,RsiSd*Rmm,!Rcl")))] "TARGET_A24" "@ ; no-op trunc si %1 to psi %0 mov.l\t%1,%0 ldc\t%1,%0 stc\t%1,%0" [(set_attr "flags" "n,sz,n,n")] ) (the ICE I mentioned was on -mcpu=m32cm, forgot to mention that sorry). It looked like alternatives 0 and 1 were really moves, with alternative 0 being a no-op move. The question was really whether: (set (reg:PSI foo) (truncate:PSI (reg:SI bar))) and: (set (reg:PSI foo) (subreg:PSI (reg:SI bar) 0)) are fundamentally different on this target. In other words, does m32c really want to set TRULY_NOOP_TRUNCATION to false for SImode->PSImode truncations, but can't because the interface is broken for partial ints? It looked to me like the answer was "no" in both cases: truncation from SImode to PSImode seems to work just like a lowpart subreg on this target. The corresponding move pattern seems to be: (define_insn "movpsi_op" [(set (match_operand:PSI 0 "m32c_nonimmediate_operand" "=Raa, SdRmmRpi, Rcl, RpiSd*Rmm, <, <, Rcl, RpiRaa*Rmm") (match_operand:PSI 1 "m32c_any_operand" "sIU3, iSdRmmRpi, iRpiSd*Rmm, Rcl, Rpi*Rmm, Rcl, >, >"))] "TARGET_A24 && m32c_mov_ok (operands, PSImode)" "@ mov.l:s\t%1,%0 mov.l\t%1,%0 ldc\t%1,%0 stc\t%1,%0 push.l\t%1 pushc\t%1 popc\t%0 #" [(set_attr "flags" "sz,sz,n,n,n,n,n,*")] ) and AIUI alternative 1 in the truncsipsi2_24 pattern is basically acting like alternative 1 in movpsi_op. If that's right, what do you think of the patch I posted yesterday? Thanks, Richard
* simplify-rtx.c (simplify_unary_operation_1): Don't try to simplify truncations of partial int modes. diff --git a/gcc/simplify-rtx.c b/gcc/simplify-rtx.c index 791f91a..6a8221c 100644 --- a/gcc/simplify-rtx.c +++ b/gcc/simplify-rtx.c @@ -1038,12 +1038,6 @@ simplify_unary_operation_1 (enum rtx_code code, enum machine_mode mode, rtx op) if (GET_MODE_CLASS (mode) == MODE_PARTIAL_INT) { - if (TRULY_NOOP_TRUNCATION_MODES_P (mode, GET_MODE (op))) - { - temp = rtl_hooks.gen_lowpart_no_emit (mode, op); - if (temp) - return temp; - } /* We can't handle truncation to a partial integer mode here because we don't know the real bitsize of the partial integer mode. */