Patchwork [RFC,ARM] Audit uses of optimize_size in the ARM backend.

Submitter Ramana Radhakrishnan
Date Aug. 12, 2011, 12:24 p.m.
Message ID <CACUk7=WcdAs3_X0=f6a7G6oC=cu-kiNDJ9pJphhc5M0Ym9r1pQ@mail.gmail.com>
Permalink /patch/109836/
State New
Headers show

Comments

Ramana Radhakrishnan - Aug. 12, 2011, 12:24 p.m.
Hi,

Quite some time back someone pointed out that the ARM backend uses
optimize_size directly in quite a few places, and that backends
shouldn't use it directly in patterns any more. I wrote this patch a
few weeks ago; it has been sitting in one of my trees and has gone
through some degree of testing.

While the ARM backend doesn't support hot/cold partitioning of basic
blocks, because of issues with minipool placement, I suspect this is a
good cleanup by itself. The one part I'm not yet convinced about is
the change from optimize_size to optimize_insn_for_size_p in
thumb_legitimize_address, and I'm looking for comments there.

There are still other uses of optimize_size; here are some thoughts on
what we should do about them. I will come back to this when I next
have some free time, but I hope to have the changes in before stage1
is over, if they are deemed useful.


   - arm/aout.h : ASM_OUTPUT_ADDR_DIFF_ELT : replace with
optimize_function_for_size_p ?
   - arm/arm.h : TARGET_USE_MOVT : probably another place that could
benefit from the change.
   - arm/arm.h : CONSTANT_ALIGNMENT : should probably retain optimize_size.
   - arm/arm.h : DATA_ALIGNMENT : likewise.
   - arm/arm.h : CASE_VECTOR_PC_RELATIVE : should go hand in glove with
the addr_diff_elt output.
   - arm/coff.h or arm/elf.h : JUMP_TABLES_IN_TEXT_SECTION :
optimize_function_for_size_p () ?
   - arm/arm.c (arm_compute_save_reg_mask) : replace optimize_size
with optimize_function_for_size_p ().
   - arm/arm.c (arm_output_epilogue) : likewise.
   - arm/arm.c (arm_expand_prologue) : likewise.
   - arm/arm.c (thumb1_extra_regs_pushed) : optimize_function_for_size_p.
   - arm/arm.c (arm_final_prescan_insn) : probably optimize_insn_for_size_p ().
   - arm/arm.c (arm_conditional_register_usage) : optimize_function_for_size_p.


OK for trunk after a bootstrap and test run? Thoughts on what we
should do with the rest of the uses?

cheers
Ramana


          * config/arm/arm.md ("*mulsi3_compare0_v6"): Replace optimize_size
            with optimize_insn_for_size_p.
            ("*mulsi_compare0_scratch_v6"): Likewise.
            ("*mulsi3addsi_compare0_v6"): Likewise.
            ("casesi"): Likewise.
            (dimode_general_splitter): Name existing splitter and replace
            as above.
            ("bswapsi2"): Likewise.
            * config/arm/thumb2.md (t2_muls_peepholes): Likewise.
            * config/arm/arm.c (thumb_legitimize_address): Replace
            optimize_size with optimize_insn_for_size_p.
            (adjacent_mem_locations): Likewise.
            (arm_const_double_by_parts): Likewise.
            * config/arm/arm.h (FUNCTION_BOUNDARY): Use
            optimize_function_for_size_p.
            (MODE_BASE_REG_CLASS): Likewise.
            * config/arm/constraints.md (constraint "Dc"): Use
            optimize_insn_for_size_p.
Richard Sandiford - Aug. 12, 2011, 12:58 p.m.
Ramana Radhakrishnan <ramana.radhakrishnan@linaro.org> writes:
> While the ARM backend doesn't support hot cold partitioning of basic blocks
> because of issues with the mini-pool placements, I suspect
> this by itself is a good cleanup. The bits of the use that I'm not
> convinced about
> yet are the changes of optimize_size in thumb_legitimize_address
> to optimize_insn_for_size_p and I'm looking for some comments there.

I'm not sure it's correct for the define_insns either.  I think it can
only be called during passes that produce new code, and which explicitly
set the global state appropriately (e.g. expand, split and peephole2).
I might be wrong, but as things stand, I don't think you can guarantee
that the global state describes the insn under test every time that
recog is called.

For existing insns, I think optimize_bb_for_speed_p (BLOCK_FOR_INSN (insn))
is the canonical check.

Richard

Patch

diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
index 6cd80f8..97dd249 100644
--- a/gcc/config/arm/arm.c
+++ b/gcc/config/arm/arm.c
@@ -6359,7 +6359,7 @@  thumb_legitimize_address (rtx x, rtx orig_x, enum machine_mode mode)
       /* Try and fold the offset into a biasing of the base register and
 	 then offsetting that.  Don't do this when optimizing for space
 	 since it can cause too many CSEs.  */
-      if (optimize_size && offset >= 0
+      if (optimize_insn_for_size_p () && offset >= 0
 	  && offset < 256 + 31 * GET_MODE_SIZE (mode))
 	{
 	  HOST_WIDE_INT delta;
@@ -9787,7 +9787,7 @@  adjacent_mem_locations (rtx a, rtx b)
 	  /* If the target has load delay slots, then there's no benefit
 	     to using an ldm instruction unless the offset is zero and
 	     we are optimizing for size.  */
-	  return (optimize_size && (REGNO (reg0) == REGNO (reg1))
+	  return (optimize_insn_for_size_p () && (REGNO (reg0) == REGNO (reg1))
 		  && (val0 == 0 || val1 == 0 || val0 == 4 || val1 == 4)
 		  && (val_diff == 4 || val_diff == -4));
 	}
@@ -9868,7 +9868,7 @@  multiple_operation_profitable_p (bool is_store ATTRIBUTE_UNUSED,
 
      As a compromise, we use ldr for counts of 1 or 2 regs, and ldm
      for counts of 3 or 4 regs.  */
-  if (nops <= 2 && arm_tune_xscale && !optimize_size)
+  if (nops <= 2 && arm_tune_xscale && !optimize_insn_for_size_p ())
     return false;
   return true;
 }
@@ -12445,7 +12445,7 @@  arm_const_double_by_parts (rtx val)
   enum machine_mode mode = GET_MODE (val);
   rtx part;
 
-  if (optimize_size || arm_ld_sched)
+  if (optimize_insn_for_size_p () || arm_ld_sched)
     return true;
 
   if (mode == VOIDmode)
diff --git a/gcc/config/arm/arm.h b/gcc/config/arm/arm.h
index 869b9a9..b18f08e 100644
--- a/gcc/config/arm/arm.h
+++ b/gcc/config/arm/arm.h
@@ -533,7 +533,7 @@  extern int arm_arch_thumb_hwdiv;
 #define PREFERRED_STACK_BOUNDARY \
     (arm_abi == ARM_ABI_ATPCS ? 64 : STACK_BOUNDARY)
 
-#define FUNCTION_BOUNDARY  ((TARGET_THUMB && optimize_size) ? 16 : 32)
+#define FUNCTION_BOUNDARY  ((TARGET_THUMB && optimize_function_for_size_p (cfun)) ? 16 : 32)
 
 /* The lowest bit is used to indicate Thumb-mode functions, so the
    vbit must go into the delta field of pointers to member
@@ -1141,9 +1141,10 @@  enum reg_class
 /* For the Thumb the high registers cannot be used as base registers
    when addressing quantities in QI or HI mode; if we don't know the
    mode, then we must be conservative.  */
-#define MODE_BASE_REG_CLASS(MODE)					\
-    (TARGET_ARM || (TARGET_THUMB2 && !optimize_size) ? CORE_REGS :      \
-     (((MODE) == SImode) ? BASE_REGS : LO_REGS))
+#define MODE_BASE_REG_CLASS(MODE)   				        \
+  (TARGET_ARM || (TARGET_THUMB2 && !optimize_function_for_size_p ()) ?	\
+   CORE_REGS :								\
+   (((MODE) == SImode) ? BASE_REGS : LO_REGS))
 
 /* For Thumb we can not support SP+reg addressing, so we return LO_REGS
    instead of BASE_REGS.  */
diff --git a/gcc/config/arm/arm.md b/gcc/config/arm/arm.md
index 3d4dcfa..8e17930 100644
--- a/gcc/config/arm/arm.md
+++ b/gcc/config/arm/arm.md
@@ -1400,7 +1400,7 @@ 
 			 (const_int 0)))
    (set (match_operand:SI 0 "s_register_operand" "=r")
 	(mult:SI (match_dup 2) (match_dup 1)))]
-  "TARGET_ARM && arm_arch6 && optimize_size"
+  "TARGET_ARM && arm_arch6 && optimize_insn_for_size_p ()"
   "mul%.\\t%0, %2, %1"
   [(set_attr "conds" "set")
    (set_attr "insn" "muls")]
@@ -1426,7 +1426,7 @@ 
 			  (match_operand:SI 1 "s_register_operand" "r"))
 			 (const_int 0)))
    (clobber (match_scratch:SI 0 "=r"))]
-  "TARGET_ARM && arm_arch6 && optimize_size"
+  "TARGET_ARM && arm_arch6 && optimize_insn_for_size_p ()"
   "mul%.\\t%0, %2, %1"
   [(set_attr "conds" "set")
    (set_attr "insn" "muls")]
@@ -1486,7 +1486,7 @@ 
    (set (match_operand:SI 0 "s_register_operand" "=r")
 	(plus:SI (mult:SI (match_dup 2) (match_dup 1))
 		 (match_dup 3)))]
-  "TARGET_ARM && arm_arch6 && optimize_size"
+  "TARGET_ARM && arm_arch6 && optimize_insn_for_size_p ()"
   "mla%.\\t%0, %2, %1, %3"
   [(set_attr "conds" "set")
    (set_attr "insn" "mlas")]
@@ -1516,7 +1516,7 @@ 
 		  (match_operand:SI 3 "s_register_operand" "r"))
 	 (const_int 0)))
    (clobber (match_scratch:SI 0 "=r"))]
-  "TARGET_ARM && arm_arch6 && optimize_size"
+  "TARGET_ARM && arm_arch6 && optimize_insn_for_size_p ()"
   "mla%.\\t%0, %2, %1, %3"
   [(set_attr "conds" "set")
    (set_attr "insn" "mlas")]
@@ -4992,13 +4992,13 @@ 
    (set_attr "thumb2_neg_pool_range" "*,*,*,0,*")]
 )
 
-(define_split
+(define_split ;; dimode_general_splitter
   [(set (match_operand:ANY64 0 "arm_general_register_operand" "")
 	(match_operand:ANY64 1 "const_double_operand" ""))]
   "TARGET_32BIT
    && reload_completed
    && (arm_const_double_inline_cost (operands[1])
-       <= ((optimize_size || arm_ld_sched) ? 3 : 4))"
+       <= ((optimize_insn_for_size_p () || arm_ld_sched) ? 3 : 4))"
   [(const_int 0)]
   "
   arm_split_constant (SET, SImode, curr_insn,
@@ -8477,7 +8477,7 @@ 
    (match_operand:SI 2 "const_int_operand" "")	; total range
    (match_operand:SI 3 "" "")			; table label
    (match_operand:SI 4 "" "")]			; Out of range label
-  "TARGET_32BIT || optimize_size || flag_pic"
+  "TARGET_32BIT || optimize_insn_for_size_p () || flag_pic"
   "
   {
     enum insn_code code;
@@ -10845,7 +10845,7 @@ 
 (define_expand "bswapsi2"
   [(set (match_operand:SI 0 "s_register_operand" "=r")
   	(bswap:SI (match_operand:SI 1 "s_register_operand" "r")))]
-"TARGET_EITHER && (arm_arch6 || !optimize_size)"
+"TARGET_EITHER && (arm_arch6 || !optimize_insn_for_size_p ())"
 "
     if (!arm_arch6)
       {
diff --git a/gcc/config/arm/constraints.md b/gcc/config/arm/constraints.md
index f5b8521..c7e13ec 100644
--- a/gcc/config/arm/constraints.md
+++ b/gcc/config/arm/constraints.md
@@ -232,7 +232,7 @@ 
   if optimizing for space or when we have load-delay slots to fill."
  (and (match_code "const_double,const_int,const_vector")
       (match_test "TARGET_32BIT && arm_const_double_inline_cost (op) == 4
-		   && !(optimize_size || arm_ld_sched)")))
+		   && !(optimize_insn_for_size_p () || arm_ld_sched)")))
 
 (define_constraint "Di"
  "@internal
diff --git a/gcc/config/arm/thumb2.md b/gcc/config/arm/thumb2.md
index 9a11012..e228963 100644
--- a/gcc/config/arm/thumb2.md
+++ b/gcc/config/arm/thumb2.md
@@ -873,11 +873,11 @@ 
 ;; implementations and since "mul<c>" will be generated by
 ;; "*arm_mulsi3_v6" anyhow.  The assembler will use a 16-bit encoding
 ;; for "mul<c>" whenever possible anyhow.
-(define_peephole2
+(define_peephole2 ;; t2_muls_peepholes
   [(set (match_operand:SI 0 "low_register_operand" "")
         (mult:SI (match_operand:SI 1 "low_register_operand" "")
                  (match_dup 0)))]
-  "TARGET_THUMB2 && optimize_size && peep2_regno_dead_p (0, CC_REGNUM)"
+  "TARGET_THUMB2 && optimize_insn_for_size_p () && peep2_regno_dead_p (0, CC_REGNUM)"
   [(parallel
     [(set (match_dup 0)
            (mult:SI (match_dup 0) (match_dup 1)))
@@ -889,7 +889,7 @@ 
   [(set (match_operand:SI 0 "low_register_operand" "")
         (mult:SI (match_dup 0)
 	         (match_operand:SI 1 "low_register_operand" "")))]
-  "TARGET_THUMB2 && optimize_size && peep2_regno_dead_p (0, CC_REGNUM)"
+  "TARGET_THUMB2 && optimize_insn_for_size_p () && peep2_regno_dead_p (0, CC_REGNUM)"
   [(parallel
     [(set (match_dup 0)
            (mult:SI (match_dup 0) (match_dup 1)))
@@ -902,7 +902,7 @@ 
         (mult:SI (match_operand:SI 1 "low_register_operand" "%0")
                  (match_operand:SI 2 "low_register_operand" "l")))
    (clobber (reg:CC CC_REGNUM))]
-  "TARGET_THUMB2 && optimize_size && reload_completed"
+  "TARGET_THUMB2 && optimize_insn_for_size_p () && reload_completed"
   "mul%!\\t%0, %2, %0"
   [(set_attr "predicable" "yes")
    (set_attr "length" "2")
@@ -916,7 +916,7 @@ 
          (const_int 0)))
    (set (match_operand:SI 0 "register_operand" "=l")
 	(mult:SI (match_dup 1) (match_dup 2)))]
-  "TARGET_THUMB2 && optimize_size"
+  "TARGET_THUMB2 && optimize_insn_for_size_p ()"
   "muls\\t%0, %2, %0"
   [(set_attr "length" "2")
    (set_attr "insn" "muls")])
@@ -928,7 +928,7 @@ 
 	          (match_operand:SI 2 "register_operand" "l"))
          (const_int 0)))
    (clobber (match_scratch:SI 0 "=l"))]
-  "TARGET_THUMB2 && optimize_size"
+  "TARGET_THUMB2 && optimize_insn_for_size_p ()"
   "muls\\t%0, %2, %0"
   [(set_attr "length" "2")
    (set_attr "insn" "muls")])