diff mbox

Make m32c build, fix PSImode truncation

Message ID 517A8974.7000202@codesourcery.com
State New
Headers show

Commit Message

Bernd Schmidt April 26, 2013, 2:04 p.m. UTC
This patch here:
  http://gcc.gnu.org/ml/gcc-patches/2012-10/msg00661.html

changed simplification code from
     case TRUNCATE:
-      /* We can't handle truncation to a partial integer mode here
-         because we don't know the real bitsize of the partial
-         integer mode.  */
-      if (GET_MODE_CLASS (mode) == MODE_PARTIAL_INT)
-        break;

to
+      if (GET_MODE_CLASS (mode) == MODE_PARTIAL_INT)
+       {
+         if (TRULY_NOOP_TRUNCATION_MODES_P (mode, GET_MODE (op)))
+           return rtl_hooks.gen_lowpart_no_emit (mode, op);
+         /* We can't handle truncation to a partial integer mode here
+            because we don't know the real bitsize of the partial
+            integer mode.  */
+         break;
+       }

This is problematic for m32c; it defines TRULY_NOOP_TRUNCATION as 1, and
it's not really possible to define it meaningfully for partial int
modes, since it only gets passed precisions. Allowing subregs of PSImode
values leads to out of registers reload failures, so it kind of relies
on the previous behaviour.

The patch below restores the old behaviour. Bootstrapped and tested on
x86_64-linux, and it makes m32c build. Ok?


Bernd

Comments

Richard Sandiford April 27, 2013, 8:39 a.m. UTC | #1
Bernd Schmidt <bernds@codesourcery.com> writes:
> This patch here:
>   http://gcc.gnu.org/ml/gcc-patches/2012-10/msg00661.html
>
> changed simplification code from
>      case TRUNCATE:
> -      /* We can't handle truncation to a partial integer mode here
> -         because we don't know the real bitsize of the partial
> -         integer mode.  */
> -      if (GET_MODE_CLASS (mode) == MODE_PARTIAL_INT)
> -        break;
>
> to
> +      if (GET_MODE_CLASS (mode) == MODE_PARTIAL_INT)
> +       {
> +         if (TRULY_NOOP_TRUNCATION_MODES_P (mode, GET_MODE (op)))
> +           return rtl_hooks.gen_lowpart_no_emit (mode, op);
> +         /* We can't handle truncation to a partial integer mode here
> +            because we don't know the real bitsize of the partial
> +            integer mode.  */
> +         break;
> +       }
>
> This is problematic for m32c; it defines TRULY_NOOP_TRUNCATION as 1, and
> it's not really possible to define it meaningfully for partial int
> modes, since it only gets passed precisions. Allowing subregs of PSImode
> values leads to out of registers reload failures, so it kind of relies
> on the previous behaviour.

Argh, that's unfortunate.  The point of that change was to make
simplify_gen_unary (TRUNCATE, ...) no worse than using a subreg.
Would the equivalent lowpart simplify_gen_subreg call succeed
(return nonnull)?  If so, I think we want truncate to do the same.

What simplification is this blocking, and why does it lead to
reload failures?

Thanks,
Richard
Bernd Schmidt April 29, 2013, 9:52 a.m. UTC | #2
On 04/27/2013 10:39 AM, Richard Sandiford wrote:
> Argh, that's unfortunate.  The point of that change was to make
> simplify_gen_unary (TRUNCATE, ...) no worse than using a subreg.
> Would the equivalent lowpart simplify_gen_subreg call succeed
> (return nonnull)?  If so, I think we want truncate to do the same.
> 
> What simplification is this blocking, and why does it lead to
> reload failures?

There's an explicit (set (reg:PSI) (truncate:PSI (reg:SI)) insn which
currently gets changed to (set (reg:PSI) (subreg:PSI (reg:SI)) during
cse1. Reload fails because the subreg gets propagated into a memory
address, which requires a class of A_REGS, but A_REGS can only hold
PSImode values, not SImode.  This shows that the truncation is not
always a no-op: in this case it involves a register move, but there's no
way to describe this using TRULY_NOOP_TRUNCATION.


Bernd
Richard Sandiford April 29, 2013, 12:25 p.m. UTC | #3
Bernd Schmidt <bernds@codesourcery.com> writes:
> On 04/27/2013 10:39 AM, Richard Sandiford wrote:
>> Argh, that's unfortunate.  The point of that change was to make
>> simplify_gen_unary (TRUNCATE, ...) no worse than using a subreg.
>> Would the equivalent lowpart simplify_gen_subreg call succeed
>> (return nonnull)?  If so, I think we want truncate to do the same.
>> 
>> What simplification is this blocking, and why does it lead to
>> reload failures?
>
> There's an explicit (set (reg:PSI) (truncate:PSI (reg:SI)) insn which
> currently gets changed to (set (reg:PSI) (subreg:PSI (reg:SI)) during
> cse1. Reload fails because the subreg gets propagated into a memory
> address, which requires a class of A_REGS, but A_REGS can only hold
> PSImode values, not SImode.  This shows that the truncation is not
> always a no-op: in this case it involves a register move, but there's no
> way to describe this using TRULY_NOOP_TRUNCATION.

Hmm, but isn't this a reload bug?  We have:

(insn 53 51 54 10 (set (reg:HI 0 r0 [orig:26 D.2817 ] [26])
        (zero_extend:HI (mem/u/j:QI (plus:PSI (subreg:PSI (reg:SI 44 [ D.2818 ]) 0)
                    (symbol_ref:PSI ("__clz_tab") [flags 0x40]  <var_decl 0x7f2c253d42f8 __clz_tab>)) [0 __clz_tab S1 A8]))) /home/richards/gcc/HEAD/gcc/libgcc/libgcc2.c:520 115 {zero_extendqihi2}
     (expr_list:REG_DEAD (reg:SI 44 [ D.2818 ])
        (nil)))

Reloads for insn # 53
Reload 0: reload_in (SI) = (reg:SI 44 [ D.2818 ])
        A_REGS, RELOAD_FOR_OTHER_ADDRESS (opnum = 0)
        reload_in_reg: (reg:SI 44 [ D.2818 ])

find_reloads_address_1 is reloading the SUBREG_REG rather than the
SUBREG itself, even though SImode is not valid for BASE_REGS == A_REGS:

	if (GET_CODE (op0) == SUBREG)
	  {
	    op0 = SUBREG_REG (op0);
	    code0 = GET_CODE (op0);
	    if (code0 == REG && REGNO (op0) < FIRST_PSEUDO_REGISTER)
	      op0 = gen_rtx_REG (word_mode,
				 (REGNO (op0) +
				  subreg_regno_offset (REGNO (SUBREG_REG (orig_op0)),
						       GET_MODE (SUBREG_REG (orig_op0)),
						       SUBREG_BYTE (orig_op0),
						       GET_MODE (orig_op0))));
	  }

push_reloads would specifically not convert a SUBREG reload to a
REG reload in this case.  In principle, I think address subregs
should be handled in the same way.

So is the problem really that (subreg:PSI (reg:SI ...)) isn't a valid
truncation on m32c?  Without TRULY_NOOP_TRUNCATION, I don't see what
forces most code to use (truncate:PSI (reg:SI ...)) instead.  Many places
would call gen_lowpart directly.

Sorry for missing the truncation patterns, I should have grepped more
than m32c.md.  They look a lot like normal moves though.  Is truncation
really not a noop, or are the patterns there to work around something
(probably this :-))?

Richard
DJ Delorie April 29, 2013, 6:33 p.m. UTC | #4
> Sorry for missing the truncation patterns, I should have grepped
> more than m32c.md.  They look a lot like normal moves though.  Is
> truncation really not a noop, or are the patterns there to work
> around something (probably this :-))?

Not sure which pattern you're talking about, but in general, the
m32c's registers are either 16-bit or 24-bit.  You can move a pair of
16-bit registers into a 24-bit register and it truncates as part of
the move, likewise from 32-bit memory to 24-bit reg.  Note that moves
to other 32-bit destinations do *not* truncate, nor can 24-bit
registers hold 32-bit values (duh).  The 24-bit registers may also
hold a 16-bit value.

If you move a 16-bit value into a 24-bit register, it zero_extends.
Richard Sandiford April 30, 2013, 7:24 a.m. UTC | #5
DJ Delorie <dj@redhat.com> writes:
>> Sorry for missing the truncation patterns, I should have grepped
>> more than m32c.md.  They look a lot like normal moves though.  Is
>> truncation really not a noop, or are the patterns there to work
>> around something (probably this :-))?
>
> Not sure which pattern you're talking about, but in general, the
> m32c's registers are either 16-bit or 24-bit.  You can move a pair of
> 16-bit registers into a 24-bit register and it truncates as part of
> the move, likewise from 32-bit memory to 24-bit reg.  Note that moves
> to other 32-bit destinations do *not* truncate, nor can 24-bit
> registers hold 32-bit values (duh).  The 24-bit registers may also
> hold a 16-bit value.

The pattern in this case was:

(define_insn "truncsipsi2_24"
  [(set (match_operand:PSI 0              "m32c_nonimmediate_operand" "=RsiSd*Rmm,Raa,!Rcl,RsiSd*Rmm")
	(truncate:PSI (match_operand:SI 1 "m32c_nonimmediate_operand" "0,RsiSd*Rmm,RsiSd*Rmm,!Rcl")))]
  "TARGET_A24"
  "@
   ; no-op trunc si %1 to psi %0
   mov.l\t%1,%0
   ldc\t%1,%0
   stc\t%1,%0"
  [(set_attr "flags" "n,sz,n,n")]
  )

(the ICE I mentioned was on -mcpu=m32cm, forgot to mention that sorry).
It looked like alternatives 0 and 1 were really moves, with alternative
0 being a no-op move.

The question was really whether:

   (set (reg:PSI foo) (truncate:PSI (reg:SI bar)))
and:
   (set (reg:PSI foo) (subreg:PSI (reg:SI bar) 0))

are fundamentally different on this target.  In other words,
does m32c really want to set TRULY_NOOP_TRUNCATION to false for
SImode->PSImode truncations, but can't because the interface is broken
for partial ints?  It looked to me like the answer was "no" in both cases:
truncation from SImode to PSImode seems to work just like a lowpart subreg
on this target.

The corresponding move pattern seems to be:

(define_insn "movpsi_op"
  [(set (match_operand:PSI 0 "m32c_nonimmediate_operand"
			   "=Raa, SdRmmRpi,  Rcl,  RpiSd*Rmm, <,       <, Rcl, RpiRaa*Rmm")
	(match_operand:PSI 1 "m32c_any_operand"
			   "sIU3, iSdRmmRpi, iRpiSd*Rmm, Rcl, Rpi*Rmm, Rcl, >, >"))]
  "TARGET_A24 && m32c_mov_ok (operands, PSImode)"
  "@
   mov.l:s\t%1,%0
   mov.l\t%1,%0
   ldc\t%1,%0
   stc\t%1,%0
   push.l\t%1
   pushc\t%1
   popc\t%0
   #"
  [(set_attr "flags" "sz,sz,n,n,n,n,n,*")]
  )

and AIUI alternative 1 in the truncsipsi2_24 pattern is basically acting
like alternative 1 in movpsi_op.

If that's right, what do you think of the patch I posted yesterday?

Thanks,
Richard
diff mbox

Patch

	* simplify-rtx.c (simplify_unary_operation_1): Don't try to
	simplify truncations of partial int modes.

diff --git a/gcc/simplify-rtx.c b/gcc/simplify-rtx.c
index 791f91a..6a8221c 100644
--- a/gcc/simplify-rtx.c
+++ b/gcc/simplify-rtx.c
@@ -1038,12 +1038,6 @@  simplify_unary_operation_1 (enum rtx_code code, enum machine_mode mode, rtx op)
 
       if (GET_MODE_CLASS (mode) == MODE_PARTIAL_INT)
 	{
-	  if (TRULY_NOOP_TRUNCATION_MODES_P (mode, GET_MODE (op)))
-	    {
-	      temp = rtl_hooks.gen_lowpart_no_emit (mode, op);
-	      if (temp)
-		return temp;
-	    }
 	  /* We can't handle truncation to a partial integer mode here
 	     because we don't know the real bitsize of the partial
 	     integer mode.  */