From patchwork Fri Aug 12 12:24:55 2011
X-Patchwork-Submitter: Ramana Radhakrishnan
X-Patchwork-Id: 109836
Date: Fri, 12 Aug 2011 13:24:55 +0100
Subject: [RFC ARM] Audit uses of optimize_size in the ARM backend.
From: Ramana Radhakrishnan
To: gcc-patches
Cc: Patch Tracking, Richard Earnshaw

Hi,

Quite some time back someone pointed out that the ARM backend uses
optimize_size in quite a few places and that backends shouldn't use it
directly in patterns any more. I wrote this patch up a few weeks back;
it has been sitting in one of my trees and has gone through some degree
of testing. While the ARM backend doesn't support hot/cold partitioning
of basic blocks because of issues with minipool placement, I suspect
this is a good cleanup by itself.

The part I'm not yet convinced about is the change in
thumb_legitimize_address from optimize_size to optimize_insn_for_size_p,
and I'm looking for comments there.

There are still other uses of optimize_size; here are some thoughts on
what we should do with them (a rough sketch of how the predicates
differ follows the list). I will go back and do this when I next have
some free time, but I hope to have the changes in before stage1 is over
if they are deemed useful.

- arm/aout.h (ASM_OUTPUT_ADDR_DIFF_ELT): replace with
  optimize_function_for_size_p ?
- arm/arm.h (TARGET_USE_MOVT): probably another place that could
  benefit from the change.
- arm/arm.h (CONSTANT_ALIGNMENT): probably should retain optimize_size.
- arm/arm.h (DATA_ALIGNMENT): likewise.
- arm/arm.h (CASE_VECTOR_PC_RELATIVE): should go hand in glove with the
  addr_diff_elt output.
- arm/coff.h or arm/elf.h (JUMP_TABLES_IN_TEXT_SECTION):
  optimize_function_for_size_p () ?
- arm/arm.c (arm_compute_save_reg_mask): replace optimize_size with
  optimize_function_for_size_p ().
- arm/arm.c (arm_output_epilogue): likewise.
- arm/arm.c (arm_expand_prologue): likewise.
- arm/arm.c (thumb1_extra_regs_pushed): optimize_function_for_size_p.
- arm/arm.c (arm_final_prescan_insn): probably optimize_insn_for_size_p ().
- arm/arm.c (arm_conditional_register_usage): optimize_function_for_size_p.
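To make the distinction concrete, here is a rough sketch of the
granularity each predicate works at. This is illustration only, not
part of the patch; it assumes the usual definitions in gcc/predict.c,
and the helper name is made up.

/* Illustration only, not part of the patch.  The helper name
   arm_prefer_size_savings_p is hypothetical.  */

static bool
arm_prefer_size_savings_p (void)
{
  /* optimize_size: global flag, true for the whole translation unit
     when -Os is given.  */
  if (optimize_size)
    return true;

  /* optimize_function_for_size_p (cfun): per function; also true when
     the current function is known to be cold (e.g. from profile
     feedback), even without -Os.  */
  if (optimize_function_for_size_p (cfun))
    return true;

  /* optimize_insn_for_size_p (): per insn; follows the basic block
     currently being expanded or optimized, so individual hot or cold
     blocks can be treated differently once hot/cold partitioning is
     usable.  */
  return optimize_insn_for_size_p ();
}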
Ok for trunk after a bootstrap and test run?

Thoughts about what we do with the rest of the uses?

cheers
Ramana

	* config/arm/arm.md ("*mulsi3_compare0_v6"): Replace optimize_size
	with optimize_insn_for_size_p.
	("*mulsi_compare0_scratch_v6"): Likewise.
	("*mulsi3addsi_compare0_v6"): Likewise.
	("*mulsi3addsi_compare0_scratch_v6"): Likewise.
	("casesi"): Likewise.
	(dimode_general_splitter): Name the existing splitter and likewise.
	("bswapsi2"): Likewise.
	* config/arm/thumb2.md (t2_muls_peepholes): Likewise.
	* config/arm/arm.c (thumb_legitimize_address): Replace optimize_size
	with optimize_insn_for_size_p.
	(adjacent_mem_locations): Likewise.
	(multiple_operation_profitable_p): Likewise.
	(arm_const_double_by_parts): Likewise.
	* config/arm/arm.h (FUNCTION_BOUNDARY): Use
	optimize_function_for_size_p.
	(MODE_BASE_REG_CLASS): Likewise.
	* config/arm/constraints.md (constraint "Dc"): Use
	optimize_insn_for_size_p.

diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
index 6cd80f8..97dd249 100644
--- a/gcc/config/arm/arm.c
+++ b/gcc/config/arm/arm.c
@@ -6359,7 +6359,7 @@ thumb_legitimize_address (rtx x, rtx orig_x, enum machine_mode mode)
       /* Try and fold the offset into a biasing of the base register and
          then offsetting that.  Don't do this when optimizing for space
          since it can cause too many CSEs.  */
-      if (optimize_size && offset >= 0
+      if (optimize_insn_for_size_p () && offset >= 0
           && offset < 256 + 31 * GET_MODE_SIZE (mode))
         {
           HOST_WIDE_INT delta;
@@ -9787,7 +9787,7 @@ adjacent_mem_locations (rtx a, rtx b)
       /* If the target has load delay slots, then there's no benefit
          to using an ldm instruction unless the offset is zero and
          we are optimizing for size.  */
-      return (optimize_size && (REGNO (reg0) == REGNO (reg1))
+      return (optimize_insn_for_size_p () && (REGNO (reg0) == REGNO (reg1))
               && (val0 == 0 || val1 == 0 || val0 == 4 || val1 == 4)
               && (val_diff == 4 || val_diff == -4));
     }
@@ -9868,7 +9868,7 @@ multiple_operation_profitable_p (bool is_store ATTRIBUTE_UNUSED,
      As a compromise, we use ldr for counts of 1 or 2 regs, and ldm
      for counts of 3 or 4 regs.  */
-  if (nops <= 2 && arm_tune_xscale && !optimize_size)
+  if (nops <= 2 && arm_tune_xscale && !optimize_insn_for_size_p ())
     return false;
 
   return true;
 }
@@ -12445,7 +12445,7 @@ arm_const_double_by_parts (rtx val)
   enum machine_mode mode = GET_MODE (val);
   rtx part;
 
-  if (optimize_size || arm_ld_sched)
+  if (optimize_insn_for_size_p () || arm_ld_sched)
     return true;
 
   if (mode == VOIDmode)
diff --git a/gcc/config/arm/arm.h b/gcc/config/arm/arm.h
index 869b9a9..b18f08e 100644
--- a/gcc/config/arm/arm.h
+++ b/gcc/config/arm/arm.h
@@ -533,7 +533,7 @@ extern int arm_arch_thumb_hwdiv;
 #define PREFERRED_STACK_BOUNDARY \
     (arm_abi == ARM_ABI_ATPCS ? 64 : STACK_BOUNDARY)
 
-#define FUNCTION_BOUNDARY ((TARGET_THUMB && optimize_size) ? 16 : 32)
+#define FUNCTION_BOUNDARY ((TARGET_THUMB && optimize_function_for_size_p (cfun)) ? 16 : 32)
 
 /* The lowest bit is used to indicate Thumb-mode functions, so the
    vbit must go into the delta field of pointers to member
@@ -1141,9 +1141,10 @@ enum reg_class
 /* For the Thumb the high registers cannot be used as base registers
    when addressing quantities in QI or HI mode; if we don't know the
    mode, then we must be conservative.  */
-#define MODE_BASE_REG_CLASS(MODE)                                      \
-    (TARGET_ARM || (TARGET_THUMB2 && !optimize_size) ? CORE_REGS :     \
-     (((MODE) == SImode) ? BASE_REGS : LO_REGS))
+#define MODE_BASE_REG_CLASS(MODE)                                      \
+    (TARGET_ARM || (TARGET_THUMB2 && !optimize_function_for_size_p (cfun)) ? \
+     CORE_REGS :                                                        \
+     (((MODE) == SImode) ? BASE_REGS : LO_REGS))
 
 /* For Thumb we can not support SP+reg addressing, so we return LO_REGS
    instead of BASE_REGS.  */
diff --git a/gcc/config/arm/arm.md b/gcc/config/arm/arm.md
index 3d4dcfa..8e17930 100644
--- a/gcc/config/arm/arm.md
+++ b/gcc/config/arm/arm.md
@@ -1400,7 +1400,7 @@
                          (const_int 0)))
    (set (match_operand:SI 0 "s_register_operand" "=r")
         (mult:SI (match_dup 2) (match_dup 1)))]
-  "TARGET_ARM && arm_arch6 && optimize_size"
+  "TARGET_ARM && arm_arch6 && optimize_insn_for_size_p ()"
   "mul%.\\t%0, %2, %1"
   [(set_attr "conds" "set")
    (set_attr "insn" "muls")]
@@ -1426,7 +1426,7 @@
                           (match_operand:SI 1 "s_register_operand" "r"))
                          (const_int 0)))
    (clobber (match_scratch:SI 0 "=r"))]
-  "TARGET_ARM && arm_arch6 && optimize_size"
+  "TARGET_ARM && arm_arch6 && optimize_insn_for_size_p ()"
   "mul%.\\t%0, %2, %1"
   [(set_attr "conds" "set")
    (set_attr "insn" "muls")]
@@ -1486,7 +1486,7 @@
    (set (match_operand:SI 0 "s_register_operand" "=r")
         (plus:SI (mult:SI (match_dup 2) (match_dup 1))
                  (match_dup 3)))]
-  "TARGET_ARM && arm_arch6 && optimize_size"
+  "TARGET_ARM && arm_arch6 && optimize_insn_for_size_p ()"
   "mla%.\\t%0, %2, %1, %3"
   [(set_attr "conds" "set")
    (set_attr "insn" "mlas")]
@@ -1516,7 +1516,7 @@
                           (match_operand:SI 3 "s_register_operand" "r"))
                          (const_int 0)))
    (clobber (match_scratch:SI 0 "=r"))]
-  "TARGET_ARM && arm_arch6 && optimize_size"
+  "TARGET_ARM && arm_arch6 && optimize_insn_for_size_p ()"
   "mla%.\\t%0, %2, %1, %3"
   [(set_attr "conds" "set")
    (set_attr "insn" "mlas")]
@@ -4992,13 +4992,13 @@
    (set_attr "thumb2_neg_pool_range" "*,*,*,0,*")]
 )
 
-(define_split
+(define_split ;; dimode_general_splitter
   [(set (match_operand:ANY64 0 "arm_general_register_operand" "")
         (match_operand:ANY64 1 "const_double_operand" ""))]
   "TARGET_32BIT && reload_completed
    && (arm_const_double_inline_cost (operands[1])
-       <= ((optimize_size || arm_ld_sched) ? 3 : 4))"
+       <= ((optimize_insn_for_size_p () || arm_ld_sched) ? 3 : 4))"
   [(const_int 0)]
   "
   arm_split_constant (SET, SImode, curr_insn,
@@ -8477,7 +8477,7 @@
    (match_operand:SI 2 "const_int_operand" "")  ; total range
    (match_operand:SI 3 "" "")                   ; table label
    (match_operand:SI 4 "" "")]                  ; Out of range label
-  "TARGET_32BIT || optimize_size || flag_pic"
+  "TARGET_32BIT || optimize_insn_for_size_p () || flag_pic"
   "
   {
     enum insn_code code;
@@ -10845,7 +10845,7 @@
 (define_expand "bswapsi2"
   [(set (match_operand:SI 0 "s_register_operand" "=r")
         (bswap:SI (match_operand:SI 1 "s_register_operand" "r")))]
-"TARGET_EITHER && (arm_arch6 || !optimize_size)"
+"TARGET_EITHER && (arm_arch6 || !optimize_insn_for_size_p ())"
 "
   if (!arm_arch6)
     {
diff --git a/gcc/config/arm/constraints.md b/gcc/config/arm/constraints.md
index f5b8521..c7e13ec 100644
--- a/gcc/config/arm/constraints.md
+++ b/gcc/config/arm/constraints.md
@@ -232,7 +232,7 @@
    if optimizing for space or when we have load-delay slots to fill."
   (and (match_code "const_double,const_int,const_vector")
        (match_test "TARGET_32BIT && arm_const_double_inline_cost (op) == 4
-                    && !(optimize_size || arm_ld_sched)")))
+                    && !(optimize_insn_for_size_p () || arm_ld_sched)")))
 
 (define_constraint "Di"
   "@internal
diff --git a/gcc/config/arm/thumb2.md b/gcc/config/arm/thumb2.md
index 9a11012..e228963 100644
--- a/gcc/config/arm/thumb2.md
+++ b/gcc/config/arm/thumb2.md
@@ -873,11 +873,11 @@
 ;; implementations and since "mul" will be generated by
 ;; "*arm_mulsi3_v6" anyhow.  The assembler will use a 16-bit encoding
 ;; for "mul" whenever possible anyhow.
-(define_peephole2
+(define_peephole2 ;; t2_muls_peepholes
   [(set (match_operand:SI 0 "low_register_operand" "")
         (mult:SI (match_operand:SI 1 "low_register_operand" "")
                  (match_dup 0)))]
-  "TARGET_THUMB2 && optimize_size && peep2_regno_dead_p (0, CC_REGNUM)"
+  "TARGET_THUMB2 && optimize_insn_for_size_p () && peep2_regno_dead_p (0, CC_REGNUM)"
   [(parallel
     [(set (match_dup 0)
           (mult:SI (match_dup 0) (match_dup 1)))
@@ -889,7 +889,7 @@
   [(set (match_operand:SI 0 "low_register_operand" "")
         (mult:SI (match_dup 0)
                  (match_operand:SI 1 "low_register_operand" "")))]
-  "TARGET_THUMB2 && optimize_size && peep2_regno_dead_p (0, CC_REGNUM)"
+  "TARGET_THUMB2 && optimize_insn_for_size_p () && peep2_regno_dead_p (0, CC_REGNUM)"
   [(parallel
     [(set (match_dup 0)
           (mult:SI (match_dup 0) (match_dup 1)))
@@ -902,7 +902,7 @@
         (mult:SI (match_operand:SI 1 "low_register_operand" "%0")
                  (match_operand:SI 2 "low_register_operand" "l")))
    (clobber (reg:CC CC_REGNUM))]
-  "TARGET_THUMB2 && optimize_size && reload_completed"
+  "TARGET_THUMB2 && optimize_insn_for_size_p () && reload_completed"
   "mul%!\\t%0, %2, %0"
   [(set_attr "predicable" "yes")
    (set_attr "length" "2")
@@ -916,7 +916,7 @@
                          (const_int 0)))
    (set (match_operand:SI 0 "register_operand" "=l")
         (mult:SI (match_dup 1) (match_dup 2)))]
-  "TARGET_THUMB2 && optimize_size"
+  "TARGET_THUMB2 && optimize_insn_for_size_p ()"
   "muls\\t%0, %2, %0"
   [(set_attr "length" "2")
    (set_attr "insn" "muls")])
@@ -928,7 +928,7 @@
                           (match_operand:SI 2 "register_operand" "l"))
                          (const_int 0)))
    (clobber (match_scratch:SI 0 "=l"))]
-  "TARGET_THUMB2 && optimize_size"
+  "TARGET_THUMB2 && optimize_insn_for_size_p ()"
   "muls\\t%0, %2, %0"
   [(set_attr "length" "2")
    (set_attr "insn" "muls")])