diff mbox

[mips] delete bit-rotten ADJUST_REG_ALLOC_ORDER definition

Message ID 878uq4ci0m.fsf@talisman.default
State New
Headers show

Commit Message

Richard Sandiford May 14, 2014, 6:49 p.m. UTC
Jeff Law <law@redhat.com> writes:
> On 05/13/14 14:11, Sandra Loosemore wrote:
>> This patch is a follow-up to this thread from a few years ago:
>>
>> https://gcc.gnu.org/ml/gcc/2011-01/msg00093.html
>> https://gcc.gnu.org/ml/gcc/2011-01/msg00158.html
>>
>> As noted there, the current definition of ADJUST_REG_ALLOC_ORDER is
>> obsolete:
>>
>> (1) This hook is a holdover from the old pre-IRA register allocator and
>> it's not clear it does anything useful with IRA.
>>
>> (2) It's inconsistent with the current REG_ALLOC_ORDER definition which
>> is not just {0, 1, 2, ...} any more.
>>
>> I considered re-working the hook to jiggle the $24 ordering for MIPS16
>> relative to the current REG_ALLOC_ORDER definition, but it wasn't clear
>> to me either where in the ordering it should go, or that it would be
>> worthwhile to try to tune this.  Indeed, the CSiBE results for removing
>> it entirely are pretty much in the noise range, except that removing the
>> hack that is supposed to benefit MIPS16 resulted in more improvement for
>> that case than any of the others tested!  Here are some numbers
>> comparing geomean for old and new with -Os for various combinations:
>>
>> default: 1062.3,1062.28
>> -EL -mips16 -msoft-float: 758.624,758.551
>> -EL -mips16: 763.398,763.321
>> -EL -msoft-float: 1115.11,1115.09
>> -EL: 1062.22,1062.2
>> -EL -mabi=64: 1181.8,1181.81
>> -mabi=64: 1181.93,1181.94
>> -mips16 -msoft-float: 758.337,758.236
>> -mips16: 763.11,763.011
>> -EL -mabi=64 -msoft-float: 1218.11,1218.11
>> -mabi=64 -msoft-float: 1218.24,1218.25
>> -msoft-float: 1115.73,1115.71
>> -EL -mmicromips -msoft-float: 792.314,792.32
>> -EL -mnan=2008: 1062.22,1062.2
>> -mnan=2008: 1062.3,1062.28
>>
>> As for the invalid JALR patch I posted just now
>> https://gcc.gnu.org/ml/gcc-patches/2014-05/msg01015.html
>> I had trouble regression-testing this patch on trunk due to various
>> other breakage in the past week, and ended up doing it on a 4.9.0
>> checkout modified to support Mentor's extended set of mips-sde-elf
>> multilibs instead.  That's also the source of the CSiBE numbers above.
>> Also, those numbers do include the JALR patch in the baseline.
>>
>> This patch did fix some of the dspr2-MULT.c and dspr2-MULTU.c test
>> failures that were reported in the original discussion, but I believe
>> the current XFAILs for those tests are still valid for the same reasons
>> listed in PR target/51729.
>>
>> So, this patch should be considered more of a code cleanup thing, than
>> either an optimization improvement or a test regression fix.  OK to commit?
>>
>> -Sandra
>>
>>
>> 2014-05-13  Catherine Moore  <clm@codesourcery.com>
>>          Sandra Loosemore  <sandra@codesourcery.com>
>>
>>      gcc/
>>      * config/mips/mips.c (mips_order_regs_for_local_alloc): Delete.
>>      * config/mips/mips.h (ADJUST_REG_ALLOC_ORDER): Delete.
>>      * config/mips/mips-protos.h (mips_order_regs_for_local_alloc): Delete.
> OK for the trunk.

Would it be OK to hold off until after the switch to LRA?  That patch
has been written and the MIPS parts approved, but we're waiting for
some legal things to be sorted out and for a fixed version of the LRA
EXTRA_MEMORY_CONSTRAINT patch.  I just think it'd be better to tune this
sort of thing once that's done, rather than tune it against reload.

I realise this is old code (goes back to the initial MIPS16 commit),
but the reasoning behind promoting $24 still seems sound.  So I was
hoping to see whether the bad performance is from promoting $24 itself
or whether it's from the damage that:

  for (i = 0; i < FIRST_PSEUDO_REGISTER; i++)
    reg_alloc_order[i] = i;

does to the other registers.

FWIW here's an old patch I had lying around.  I wasn't happy with
the results for CSiBE at the time but maybe it'd be worth looking
at it again after LRA.  E.g., we might want to put the non-MIPS16
call-clobbered GPRs ahead of the MIPS16 ones, so that when we're
looking for any old GPR, we only pick a MIPS16 register if no
others are free.

Thanks,
Richard

Comments

Sandra Loosemore May 14, 2014, 8:51 p.m. UTC | #1
On 05/14/2014 12:49 PM, Richard Sandiford wrote:
> Jeff Law <law@redhat.com> writes:
>> On 05/13/14 14:11, Sandra Loosemore wrote:
>>> 2014-05-13  Catherine Moore  <clm@codesourcery.com>
>>>           Sandra Loosemore  <sandra@codesourcery.com>
>>>
>>>       gcc/
>>>       * config/mips/mips.c (mips_order_regs_for_local_alloc): Delete.
>>>       * config/mips/mips.h (ADJUST_REG_ALLOC_ORDER): Delete.
>>>       * config/mips/mips-protos.h (mips_order_regs_for_local_alloc): Delete.
>> OK for the trunk.
>
> Would it be OK to hold off until after the switch to LRA?  That patch
> has been written and the MIPS parts approved, but we're waiting for
> some legal things to be sorted out and for a fixed version of the LRA
> EXTRA_MEMORY_CONSTRAINT patch.  I just think it'd be better to tune this
> sort of thing once that's done, rather than tune it against reload.

Yeah, sure -- I think we're all in agreement that the current code needs 
to go, but re-benchmarking would be a good thing.

> FWIW here's an old patch I had lying around.  I wasn't happy with
> the results for CSiBE at the time but maybe it'd be worth looking
> at it again after LRA.  E.g., we might want to put the non-MIPS16
> call-clobbered GPRs ahead of the MIPS16 ones, so that when we're
> looking for any old GPR, we only pick a MIPS16 register if no
> others are free.

FWIW, I think that just messing with REG_ALLOC_ORDER directly, as this 
patch does, is a better approach for tuning anyway than the 
ADJUST_REG_ALLOC_ORDER hook.

-Sandra
Sandra Loosemore June 22, 2014, 11:53 p.m. UTC | #2
On 05/14/2014 12:49 PM, Richard Sandiford wrote:
> Jeff Law <law@redhat.com> writes:
>> On 05/13/14 14:11, Sandra Loosemore wrote:
>>>
>>> 2014-05-13  Catherine Moore  <clm@codesourcery.com>
>>>           Sandra Loosemore  <sandra@codesourcery.com>
>>>
>>>       gcc/
>>>       * config/mips/mips.c (mips_order_regs_for_local_alloc): Delete.
>>>       * config/mips/mips.h (ADJUST_REG_ALLOC_ORDER): Delete.
>>>       * config/mips/mips-protos.h (mips_order_regs_for_local_alloc): Delete.
>> OK for the trunk.
>
> Would it be OK to hold off until after the switch to LRA?  That patch
> has been written and the MIPS parts approved, but we're waiting for
> some legal things to be sorted out and for a fixed version of the LRA
> EXTRA_MEMORY_CONSTRAINT patch.  I just think it'd be better to tune this
> sort of thing once that's done, rather than tune it against reload.

Richard, is it OK to commit this patch now that LRA is in, or do you 
want to experiment some more with tuning first?  I think we're all in 
agreement that this is broken old code that should be removed regardless 
of whether we do other things to tune REG_ALLOC_ORDER.

-Sandra
Richard Sandiford June 24, 2014, 8:19 p.m. UTC | #3
Sandra Loosemore <sandra@codesourcery.com> writes:
> On 05/14/2014 12:49 PM, Richard Sandiford wrote:
>> Jeff Law <law@redhat.com> writes:
>>> On 05/13/14 14:11, Sandra Loosemore wrote:
>>>>
>>>> 2014-05-13  Catherine Moore  <clm@codesourcery.com>
>>>>           Sandra Loosemore  <sandra@codesourcery.com>
>>>>
>>>>       gcc/
>>>>       * config/mips/mips.c (mips_order_regs_for_local_alloc): Delete.
>>>>       * config/mips/mips.h (ADJUST_REG_ALLOC_ORDER): Delete.
>>>>       * config/mips/mips-protos.h (mips_order_regs_for_local_alloc): Delete.
>>> OK for the trunk.
>>
>> Would it be OK to hold off until after the switch to LRA?  That patch
>> has been written and the MIPS parts approved, but we're waiting for
>> some legal things to be sorted out and for a fixed version of the LRA
>> EXTRA_MEMORY_CONSTRAINT patch.  I just think it'd be better to tune this
>> sort of thing once that's done, rather than tune it against reload.
>
> Richard, is it OK to commit this patch now that LRA is in, or do you 
> want to experiment some more with tuning first?  I think we're all in 
> agreement that this is broken old code that should be removed regardless 
> of whether we do other things to tune REG_ALLOC_ORDER.

Sure, go ahead.  I retried with trunk and removing the definition had
very little effect on non-MIPS16 and an overall positive effect on MIPS16.
Which is a bit ironic, given that the hook was supposed to help MIPS16
and at face value would hurt non-MIPS16 more.  It looks like moving $24
first really isn't a win for MIPS16 with IRA.

To reinforce that, I tried the old patch I posted in place of yours, but
moving $24 ahead of the other registers was better for non-MIPS16 code
and worse for MIPS16.  The MIPS16 results were as close to trunk as the
non-MIPS16 ones were with your patch; just two differences.  So the
MIPS16 benefit of your patch really is coming from having $24 after
the MIPS16 registers.

The reason for the improvement in non-MIPS16 results with my patch
seemed to be that putting the return and argument registers first leads
to less freedom of movement and thus more unfilled delay slots.  In some
cases I think these were more due to reorg.c's infamous approach to
register liveness rather than real interference.  E.g. branches to a
return statement were not having their delay slots filled with an
assignment to $2 even though the return statement set $2 itself.  Change
the assigned register from $2 to $24 and the delay slot could be filled.

So it might be interesting to experiment with putting the later
call-clobbered registers first, but that's a separate change.

Richard
diff mbox

Patch

Index: gcc/config/mips/mips-protos.h
===================================================================
--- gcc/config/mips/mips-protos.h	2011-09-17 10:18:14.000000000 +0100
+++ gcc/config/mips/mips-protos.h	2011-09-17 19:22:49.000000000 +0100
@@ -248,7 +248,6 @@  extern bool mips_expand_ext_as_unaligned
 extern bool mips_expand_ins_as_unaligned_store (rtx, rtx, HOST_WIDE_INT,
 						HOST_WIDE_INT);
 extern bool mips_mem_fits_mode_p (enum machine_mode mode, rtx x);
-extern void mips_order_regs_for_local_alloc (void);
 extern HOST_WIDE_INT mips_debugger_offset (rtx, HOST_WIDE_INT);
 
 extern void mips_push_asm_switch (struct mips_asm_switch *);
Index: gcc/config/mips/mips.c
===================================================================
--- gcc/config/mips/mips.c	2011-09-17 08:59:21.000000000 +0100
+++ gcc/config/mips/mips.c	2011-09-17 19:22:49.000000000 +0100
@@ -15900,28 +15900,6 @@  mips_expand_vector_init (rtx target, rtx
   emit_move_insn (target, mem);
 }
 
-/* When generating MIPS16 code, we want to allocate $24 (T_REG) before
-   other registers for instructions for which it is possible.  This
-   encourages the compiler to use CMP in cases where an XOR would
-   require some register shuffling.  */
-
-void
-mips_order_regs_for_local_alloc (void)
-{
-  int i;
-
-  for (i = 0; i < FIRST_PSEUDO_REGISTER; i++)
-    reg_alloc_order[i] = i;
-
-  if (TARGET_MIPS16)
-    {
-      /* It really doesn't matter where we put register 0, since it is
-         a fixed register anyhow.  */
-      reg_alloc_order[0] = 24;
-      reg_alloc_order[24] = 0;
-    }
-}
-
 /* Implement EH_USES.  */
 
 bool
Index: gcc/config/mips/mips.h
===================================================================
--- gcc/config/mips/mips.h	2011-09-17 08:51:17.000000000 +0100
+++ gcc/config/mips/mips.h	2011-09-17 10:18:04.000000000 +0100
@@ -1930,9 +1930,18 @@  #define REG_ALLOC_ORDER							\
      of the extra accumulators available with -mdspr2.  In some cases,	\
      it can also help to reduce register pressure.  */			\
   64, 65,176,177,178,179,180,181,					\
-  /* Call-clobbered GPRs.  */						\
-  1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11, 12, 13, 14, 15,		\
-  24, 25, 31,								\
+  /* Call-clobbered GPRs.  Put $24 first, because when M16_REGS and	\
+     T_REG have equal cost, we prefer to use T_REG.  In particular,	\
+     it means that CMP+branch sequences will generally be preferred	\
+     to XOR+branch sequences; the former are shorter, and using T_REG	\
+     reduces the register pressure on the main MIPS16 registers.	\
+     This does not seem to adversely affect non-MIPS16 code.		\
+									\
+     Normally we'd keep register pairs together, but in this case,	\
+     it's better not to.  $25 is not a MIPS16 register, and in		\
+     abicalls code, we'd prefer to leave it free for its ABI use.  */	\
+  24, 2,  3,  4,  5,  6,  7,  8,  9, 10, 11, 12, 13, 14, 15,		\
+  25, 1,  31,								\
   /* The global pointer.  This is call-clobbered for o32 and o64	\
      abicalls, call-saved for n32 and n64 abicalls, and a program	\
      invariant otherwise.  Putting it between the call-clobbered	\
@@ -1964,13 +1973,6 @@  #define REG_ALLOC_ORDER							\
   182,183,184,185,186,187						\
 }
 
-/* ADJUST_REG_ALLOC_ORDER is a macro which permits reg_alloc_order
-   to be rearranged based on a particular function.  On the mips16, we
-   want to allocate $24 (T_REG) before other registers for
-   instructions for which it is possible.  */
-
-#define ADJUST_REG_ALLOC_ORDER mips_order_regs_for_local_alloc ()
-
 /* True if VALUE is an unsigned 6-bit number.  */
 
 #define UIMM6_OPERAND(VALUE) \