From patchwork Fri Aug 12 12:24:55 2011
X-Patchwork-Submitter: Ramana Radhakrishnan
X-Patchwork-Id: 109836
Date: Fri, 12 Aug 2011 13:24:55 +0100
Subject: [RFC ARM] Audit uses of optimize_size in the ARM backend.
From: Ramana Radhakrishnan
To: gcc-patches
Cc: Patch Tracking, Richard Earnshaw

Hi,

Quite some time back someone pointed out that the ARM backend uses
optimize_size in quite a few places and that backends shouldn't use it
directly in patterns any more. I wrote this patch up a few weeks back;
it has been sitting in one of my trees and has gone through some degree
of testing. While the ARM backend doesn't support hot/cold partitioning
of basic blocks because of issues with minipool placement, I suspect
this is a good cleanup by itself.

The part I'm not yet convinced about is the change in
thumb_legitimize_address from optimize_size to optimize_insn_for_size_p,
and I'm looking for comments there.

There are still other uses of optimize_size; here are some thoughts on
what we should do with them (a rough sketch of how the predicates
differ follows the list). I will go back and do this when I next have
some free time, but I hope to have the changes in before stage1 is over
if they are deemed useful.

- arm/aout.h (ASM_OUTPUT_ADDR_DIFF_ELT): replace with
  optimize_function_for_size_p ?
- arm/arm.h (TARGET_USE_MOVT): probably another place that could
  benefit from the change.
- arm/arm.h (CONSTANT_ALIGNMENT): probably should retain optimize_size.
- arm/arm.h (DATA_ALIGNMENT): likewise.
- arm/arm.h (CASE_VECTOR_PC_RELATIVE): should go hand in glove with the
  addr_diff_elt output.
- arm/coff.h or arm/elf.h (JUMP_TABLES_IN_TEXT_SECTION):
  optimize_function_for_size_p () ?
- arm/arm.c (arm_compute_save_reg_mask): replace optimize_size with
  optimize_function_for_size_p ().
- arm/arm.c (arm_output_epilogue): likewise.
- arm/arm.c (arm_expand_prologue): likewise.
- arm/arm.c (thumb1_extra_regs_pushed): optimize_function_for_size_p.
- arm/arm.c (arm_final_prescan_insn): probably optimize_insn_for_size_p ().
- arm/arm.c (arm_conditional_register_usage): optimize_function_for_size_p.
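To make the distinction concrete, here is a rough sketch of the
granularity each predicate works at. This is illustration only, not
part of the patch; it assumes the usual definitions in gcc/predict.c,
and the helper name is made up.

/* Illustration only, not part of the patch.  The helper name
   arm_prefer_size_savings_p is hypothetical.  */

static bool
arm_prefer_size_savings_p (void)
{
  /* optimize_size: global flag, true for the whole translation unit
     when -Os is given.  */
  if (optimize_size)
    return true;

  /* optimize_function_for_size_p (cfun): per function; also true when
     the current function is known to be cold (e.g. from profile
     feedback), even without -Os.  */
  if (optimize_function_for_size_p (cfun))
    return true;

  /* optimize_insn_for_size_p (): per insn; follows the basic block
     currently being expanded or optimized, so individual hot or cold
     blocks can be treated differently once hot/cold partitioning is
     usable.  */
  return optimize_insn_for_size_p ();
}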
Ok for trunk after a bootstrap and test run?

Thoughts about what we do with the rest of the uses?

cheers
Ramana

	* config/arm/arm.md ("*mulsi3_compare0_v6"): Replace optimize_size
	with optimize_insn_for_size_p.
	("*mulsi_compare0_scratch_v6"): Likewise.
	("*mulsi3addsi_compare0_v6"): Likewise.
	("*mulsi3addsi_compare0_scratch_v6"): Likewise.
	("casesi"): Likewise.
	(dimode_general_splitter): Name the existing splitter and likewise.
	("bswapsi2"): Likewise.
	* config/arm/thumb2.md (t2_muls_peepholes): Likewise.
	* config/arm/arm.c (thumb_legitimize_address): Replace optimize_size
	with optimize_insn_for_size_p.
	(adjacent_mem_locations): Likewise.
	(multiple_operation_profitable_p): Likewise.
	(arm_const_double_by_parts): Likewise.
	* config/arm/arm.h (FUNCTION_BOUNDARY): Use
	optimize_function_for_size_p.
	(MODE_BASE_REG_CLASS): Likewise.
	* config/arm/constraints.md (constraint "Dc"): Use
	optimize_insn_for_size_p.

diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
index 6cd80f8..97dd249 100644
--- a/gcc/config/arm/arm.c
+++ b/gcc/config/arm/arm.c
@@ -6359,7 +6359,7 @@ thumb_legitimize_address (rtx x, rtx orig_x, enum machine_mode mode)
       /* Try and fold the offset into a biasing of the base register and
          then offsetting that.  Don't do this when optimizing for space
          since it can cause too many CSEs.  */
-      if (optimize_size && offset >= 0
+      if (optimize_insn_for_size_p () && offset >= 0
           && offset < 256 + 31 * GET_MODE_SIZE (mode))
         {
           HOST_WIDE_INT delta;
@@ -9787,7 +9787,7 @@ adjacent_mem_locations (rtx a, rtx b)
       /* If the target has load delay slots, then there's no benefit
          to using an ldm instruction unless the offset is zero and
          we are optimizing for size.  */
-      return (optimize_size && (REGNO (reg0) == REGNO (reg1))
+      return (optimize_insn_for_size_p () && (REGNO (reg0) == REGNO (reg1))
               && (val0 == 0 || val1 == 0 || val0 == 4 || val1 == 4)
               && (val_diff == 4 || val_diff == -4));
     }
@@ -9868,7 +9868,7 @@ multiple_operation_profitable_p (bool is_store ATTRIBUTE_UNUSED,
      As a compromise, we use ldr for counts of 1 or 2 regs, and ldm
      for counts of 3 or 4 regs.  */
-  if (nops <= 2 && arm_tune_xscale && !optimize_size)
+  if (nops <= 2 && arm_tune_xscale && !optimize_insn_for_size_p ())
     return false;
 
   return true;
 }
@@ -12445,7 +12445,7 @@ arm_const_double_by_parts (rtx val)
   enum machine_mode mode = GET_MODE (val);
   rtx part;
 
-  if (optimize_size || arm_ld_sched)
+  if (optimize_insn_for_size_p () || arm_ld_sched)
     return true;
 
   if (mode == VOIDmode)
diff --git a/gcc/config/arm/arm.h b/gcc/config/arm/arm.h
index 869b9a9..b18f08e 100644
--- a/gcc/config/arm/arm.h
+++ b/gcc/config/arm/arm.h
@@ -533,7 +533,7 @@ extern int arm_arch_thumb_hwdiv;
 #define PREFERRED_STACK_BOUNDARY \
     (arm_abi == ARM_ABI_ATPCS ? 64 : STACK_BOUNDARY)
 
-#define FUNCTION_BOUNDARY ((TARGET_THUMB && optimize_size) ? 16 : 32)
+#define FUNCTION_BOUNDARY ((TARGET_THUMB && optimize_function_for_size_p (cfun)) ? 16 : 32)
 
 /* The lowest bit is used to indicate Thumb-mode functions, so the
    vbit must go into the delta field of pointers to member
@@ -1141,9 +1141,10 @@ enum reg_class
 /* For the Thumb the high registers cannot be used as base registers
    when addressing quantities in QI or HI mode; if we don't know the
    mode, then we must be conservative.  */
-#define MODE_BASE_REG_CLASS(MODE)                                      \
-    (TARGET_ARM || (TARGET_THUMB2 && !optimize_size) ? CORE_REGS :     \
-     (((MODE) == SImode) ? BASE_REGS : LO_REGS))
+#define MODE_BASE_REG_CLASS(MODE)                                      \
+    (TARGET_ARM || (TARGET_THUMB2 && !optimize_function_for_size_p (cfun)) ? \
+     CORE_REGS :                                                        \
+     (((MODE) == SImode) ? BASE_REGS : LO_REGS))
 
 /* For Thumb we can not support SP+reg addressing, so we return LO_REGS
    instead of BASE_REGS.  */
diff --git a/gcc/config/arm/arm.md b/gcc/config/arm/arm.md
index 3d4dcfa..8e17930 100644
--- a/gcc/config/arm/arm.md
+++ b/gcc/config/arm/arm.md
@@ -1400,7 +1400,7 @@
                          (const_int 0)))
    (set (match_operand:SI 0 "s_register_operand" "=r")
         (mult:SI (match_dup 2) (match_dup 1)))]
-  "TARGET_ARM && arm_arch6 && optimize_size"
+  "TARGET_ARM && arm_arch6 && optimize_insn_for_size_p ()"
   "mul%.\\t%0, %2, %1"
   [(set_attr "conds" "set")
    (set_attr "insn" "muls")]
@@ -1426,7 +1426,7 @@
                           (match_operand:SI 1 "s_register_operand" "r"))
                          (const_int 0)))
    (clobber (match_scratch:SI 0 "=r"))]
-  "TARGET_ARM && arm_arch6 && optimize_size"
+  "TARGET_ARM && arm_arch6 && optimize_insn_for_size_p ()"
   "mul%.\\t%0, %2, %1"
   [(set_attr "conds" "set")
    (set_attr "insn" "muls")]
@@ -1486,7 +1486,7 @@
    (set (match_operand:SI 0 "s_register_operand" "=r")
         (plus:SI (mult:SI (match_dup 2) (match_dup 1))
                  (match_dup 3)))]
-  "TARGET_ARM && arm_arch6 && optimize_size"
+  "TARGET_ARM && arm_arch6 && optimize_insn_for_size_p ()"
   "mla%.\\t%0, %2, %1, %3"
   [(set_attr "conds" "set")
    (set_attr "insn" "mlas")]
@@ -1516,7 +1516,7 @@
                           (match_operand:SI 3 "s_register_operand" "r"))
                          (const_int 0)))
    (clobber (match_scratch:SI 0 "=r"))]
-  "TARGET_ARM && arm_arch6 && optimize_size"
+  "TARGET_ARM && arm_arch6 && optimize_insn_for_size_p ()"
   "mla%.\\t%0, %2, %1, %3"
   [(set_attr "conds" "set")
    (set_attr "insn" "mlas")]
@@ -4992,13 +4992,13 @@
    (set_attr "thumb2_neg_pool_range" "*,*,*,0,*")]
 )
 
-(define_split
+(define_split ;; dimode_general_splitter
   [(set (match_operand:ANY64 0 "arm_general_register_operand" "")
         (match_operand:ANY64 1 "const_double_operand" ""))]
   "TARGET_32BIT && reload_completed
    && (arm_const_double_inline_cost (operands[1])
-       <= ((optimize_size || arm_ld_sched) ? 3 : 4))"
+       <= ((optimize_insn_for_size_p () || arm_ld_sched) ? 3 : 4))"
   [(const_int 0)]
   "
   arm_split_constant (SET, SImode, curr_insn,
@@ -8477,7 +8477,7 @@
    (match_operand:SI 2 "const_int_operand" "")  ; total range
    (match_operand:SI 3 "" "")                   ; table label
    (match_operand:SI 4 "" "")]                  ; Out of range label
-  "TARGET_32BIT || optimize_size || flag_pic"
+  "TARGET_32BIT || optimize_insn_for_size_p () || flag_pic"
   "
   {
     enum insn_code code;
@@ -10845,7 +10845,7 @@
 (define_expand "bswapsi2"
   [(set (match_operand:SI 0 "s_register_operand" "=r")
         (bswap:SI (match_operand:SI 1 "s_register_operand" "r")))]
-"TARGET_EITHER && (arm_arch6 || !optimize_size)"
+"TARGET_EITHER && (arm_arch6 || !optimize_insn_for_size_p ())"
 "
   if (!arm_arch6)
     {
diff --git a/gcc/config/arm/constraints.md b/gcc/config/arm/constraints.md
index f5b8521..c7e13ec 100644
--- a/gcc/config/arm/constraints.md
+++ b/gcc/config/arm/constraints.md
@@ -232,7 +232,7 @@
    if optimizing for space or when we have load-delay slots to fill."
   (and (match_code "const_double,const_int,const_vector")
        (match_test "TARGET_32BIT && arm_const_double_inline_cost (op) == 4
-                    && !(optimize_size || arm_ld_sched)")))
+                    && !(optimize_insn_for_size_p () || arm_ld_sched)")))
 
 (define_constraint "Di"
   "@internal
diff --git a/gcc/config/arm/thumb2.md b/gcc/config/arm/thumb2.md
index 9a11012..e228963 100644
--- a/gcc/config/arm/thumb2.md
+++ b/gcc/config/arm/thumb2.md
@@ -873,11 +873,11 @@
 ;; implementations and since "mul" will be generated by
 ;; "*arm_mulsi3_v6" anyhow.  The assembler will use a 16-bit encoding
 ;; for "mul" whenever possible anyhow.
-(define_peephole2
+(define_peephole2 ;; t2_muls_peepholes
   [(set (match_operand:SI 0 "low_register_operand" "")
         (mult:SI (match_operand:SI 1 "low_register_operand" "")
                  (match_dup 0)))]
-  "TARGET_THUMB2 && optimize_size && peep2_regno_dead_p (0, CC_REGNUM)"
+  "TARGET_THUMB2 && optimize_insn_for_size_p () && peep2_regno_dead_p (0, CC_REGNUM)"
   [(parallel
     [(set (match_dup 0)
           (mult:SI (match_dup 0) (match_dup 1)))
@@ -889,7 +889,7 @@
   [(set (match_operand:SI 0 "low_register_operand" "")
         (mult:SI (match_dup 0)
                  (match_operand:SI 1 "low_register_operand" "")))]
-  "TARGET_THUMB2 && optimize_size && peep2_regno_dead_p (0, CC_REGNUM)"
+  "TARGET_THUMB2 && optimize_insn_for_size_p () && peep2_regno_dead_p (0, CC_REGNUM)"
   [(parallel
     [(set (match_dup 0)
           (mult:SI (match_dup 0) (match_dup 1)))
@@ -902,7 +902,7 @@
         (mult:SI (match_operand:SI 1 "low_register_operand" "%0")
                  (match_operand:SI 2 "low_register_operand" "l")))
    (clobber (reg:CC CC_REGNUM))]
-  "TARGET_THUMB2 && optimize_size && reload_completed"
+  "TARGET_THUMB2 && optimize_insn_for_size_p () && reload_completed"
   "mul%!\\t%0, %2, %0"
   [(set_attr "predicable" "yes")
    (set_attr "length" "2")
@@ -916,7 +916,7 @@
                          (const_int 0)))
    (set (match_operand:SI 0 "register_operand" "=l")
         (mult:SI (match_dup 1) (match_dup 2)))]
-  "TARGET_THUMB2 && optimize_size"
+  "TARGET_THUMB2 && optimize_insn_for_size_p ()"
   "muls\\t%0, %2, %0"
   [(set_attr "length" "2")
    (set_attr "insn" "muls")])
@@ -928,7 +928,7 @@
                           (match_operand:SI 2 "register_operand" "l"))
                          (const_int 0)))
    (clobber (match_scratch:SI 0 "=l"))]
-  "TARGET_THUMB2 && optimize_size"
+  "TARGET_THUMB2 && optimize_insn_for_size_p ()"
   "muls\\t%0, %2, %0"
   [(set_attr "length" "2")
    (set_attr "insn" "muls")])