Patchwork [committed] forward_propagate_subreg vs. mode_rep_extended

login
register
mail settings
Submitter Richard Sandiford
Date Dec. 19, 2011, 10:06 p.m.
Message ID <87hb0w8bd9.fsf@firetop.home>
Download mbox | patch
Permalink /patch/132323/
State New
Headers show

Comments

Richard Sandiford - Dec. 19, 2011, 10:06 p.m.
forward_propagate_subreg has code to optimise sequences like:

    (set (reg:DI X) (*_extend:DI (reg:SI Y)))
    (... (subreg:SI (reg:DI X)) ...)

by replacing (subreg:SI (reg:DI X)) with (reg:SI Y).  However, there is
a special case to stop this happening if Y is loaded from memory and the
extension matches LOAD_EXTEND_OP.  In that case the transformation isn't
profitable, because we'll be converting a single load into a load followed
by an extension.

The same problem applies to any SI->DI sign extension on 64-bit MIPS,
because SI registers are stored in sign-extended form.  The comment
below explains why in a bit more detail.

This optimisation was causing a failure in gcc.target/mips/octeon-bbit-2.c
for -mabi=64.  We reused the input to a sign_extend instruction, such that
the inputs and outputs were simultaneously live and could no longer be tied.

This probably isn't going to be the kind of thing that will show
up in benchmarks, but I tried compiling cc1 *.i files for -mabi=64
with and without the patch.  The patch removed 4089 redundant moves
and 21,384 text bytes overall.

I've normally tried to avoid self-approving rtl optimiser stuff,
but since this is a simple extension of the existing MEM handling,
and since the mode_rep_extended test effectively makes it MIPS-specific,
I hope it's OK to make an exception here.

Tested on mips-sde-elf and mips64-linux-gnu.  Applied.

Richard


gcc/
	* fwprop.c (forward_propagate_subreg): Skip the SIGN/ZERO_EXTEND
	optimization if the source register is already extended.
Andrew Pinski - Dec. 19, 2011, 10:13 p.m.
On Mon, Dec 19, 2011 at 2:06 PM, Richard Sandiford
<rdsandiford@googlemail.com> wrote:
> forward_propagate_subreg has code to optimise sequences like:
>
>    (set (reg:DI X) (*_extend:DI (reg:SI Y)))
>    (... (subreg:SI (reg:DI X)) ...)
>
> by replacing (subreg:SI (reg:DI X)) with (reg:SI Y).  However, there is
> a special case to stop this happening if Y is loaded from memory and the
> extension matches LOAD_EXTEND_OP.  In that case the transformation isn't
> profitable, because we'll be converting a single load into a load followed
> by an extension.
>
> The same problem applies to any SI->DI sign extension on 64-bit MIPS,
> because SI registers are stored in sign-extended form.  The comment
> below explains why in a bit more detail.
>
> This optimisation was causing a failure in gcc.target/mips/octeon-bbit-2.c
> for -mabi=64.  We reused the input to a sign_extend instruction, such that
> the inputs and outputs were simultaneously live and could no longer be tied.

This is PR 42839.

Thanks,
Andrew Pinski
Richard Sandiford - Dec. 19, 2011, 10:25 p.m.
Andrew Pinski <pinskia@gmail.com> writes:
> On Mon, Dec 19, 2011 at 2:06 PM, Richard Sandiford
> <rdsandiford@googlemail.com> wrote:
>> forward_propagate_subreg has code to optimise sequences like:
>>
>>    (set (reg:DI X) (*_extend:DI (reg:SI Y)))
>>    (... (subreg:SI (reg:DI X)) ...)
>>
>> by replacing (subreg:SI (reg:DI X)) with (reg:SI Y).  However, there is
>> a special case to stop this happening if Y is loaded from memory and the
>> extension matches LOAD_EXTEND_OP.  In that case the transformation isn't
>> profitable, because we'll be converting a single load into a load followed
>> by an extension.
>>
>> The same problem applies to any SI->DI sign extension on 64-bit MIPS,
>> because SI registers are stored in sign-extended form.  The comment
>> below explains why in a bit more detail.
>>
>> This optimisation was causing a failure in gcc.target/mips/octeon-bbit-2.c
>> for -mabi=64.  We reused the input to a sign_extend instruction, such that
>> the inputs and outputs were simultaneously live and could no longer be tied.
>
> This is PR 42839.

Doh.  I really should to search for this stuff in bugzilla first.

I've just added a PR reference to the ChangeLog entry.  I don't think it's
the kind of thing that should be backported to branches, but I'm never
too sure whether that means the bug should be closed as fixed or not.

Richard
Mike Stump - Dec. 20, 2011, 1:09 a.m.
On Dec 19, 2011, at 2:25 PM, Richard Sandiford wrote:
> I've just added a PR reference to the ChangeLog entry.  I don't think it's
> the kind of thing that should be backported to branches, but I'm never
> too sure whether that means the bug should be closed as fixed or not.

Closed, fixed, known to work in 4.7.0, known to fail in 4.6.0.  It's reasonable to leave a PR open for expected back ports, if one or more are planned.
Paolo Bonzini - Dec. 20, 2011, 11:47 a.m.
On 12/19/2011 11:06 PM, Richard Sandiford wrote:
>
> I've normally tried to avoid self-approving rtl optimiser stuff,
> but since this is a simple extension of the existing MEM handling,
> and since the mode_rep_extended test effectively makes it MIPS-specific,
> I hope it's OK to make an exception here.

Looks good anyway. :)

Paolo

Patch

Index: gcc/fwprop.c
===================================================================
--- gcc/fwprop.c	2011-12-19 21:18:42.000000000 +0000
+++ gcc/fwprop.c	2011-12-19 21:52:03.000000000 +0000
@@ -1112,7 +1112,18 @@  forward_propagate_subreg (df_ref use, rt
   /* If this is a SUBREG of a ZERO_EXTEND or SIGN_EXTEND, and the SUBREG
      is the low part of the reg being extended then just use the inner
      operand.  Don't do this if the ZERO_EXTEND or SIGN_EXTEND insn will
-     be removed due to it matching a LOAD_EXTEND_OP load from memory.  */
+     be removed due to it matching a LOAD_EXTEND_OP load from memory,
+     or due to the operation being a no-op when applied to registers.
+     For example, if we have:
+
+	 A: (set (reg:DI X) (sign_extend:DI (reg:SI Y)))
+	 B: (... (subreg:SI (reg:DI X)) ...)
+
+     and mode_rep_extended says that Y is already sign-extended,
+     the backend will typically allow A to be combined with the
+     definition of Y or, failing that, allow A to be deleted after
+     reload through register tying.  Introducing more uses of Y
+     prevents both optimisations.  */
   else if (subreg_lowpart_p (use_reg))
     {
       use_insn = DF_REF_INSN (use);
@@ -1123,6 +1134,8 @@  forward_propagate_subreg (df_ref use, rt
 	  && REGNO (XEXP (src, 0)) >= FIRST_PSEUDO_REGISTER
 	  && GET_MODE (XEXP (src, 0)) == use_mode
 	  && !free_load_extend (src, def_insn)
+	  && (targetm.mode_rep_extended (use_mode, GET_MODE (src))
+	      != (int) GET_CODE (src))
 	  && all_uses_available_at (def_insn, use_insn))
 	return try_fwprop_subst (use, DF_REF_LOC (use), XEXP (src, 0),
 				 def_insn, false);