From patchwork Fri Oct 12 08:08:58 2012
X-Patchwork-Submitter: Bin Cheng
X-Patchwork-Id: 191070
From: "Bin Cheng"
To: "'Steven Bosscher'"
Cc: "Jeff Law"
Subject: RE: [PATCH RFA] Implement register pressure directed hoist pass
Date: Fri, 12 Oct 2012 16:08:58 +0800
Message-ID: <000c01cda850$d6fa2460$84ee6d20$@cheng@arm.com>
Delivered-To: mailing list gcc-patches@gcc.gnu.org

Hi,

These are the updated patches, split from the original one according to
Steven's suggestion. I also fixed the spelling errors.

Apart from this, I also implemented a draft patch that simulates register
pressure accurately during hoisting. Unfortunately, the size data is no
better than with this patch. If the draft is right, this supports my
previous observation that a decrease of register pressure during the
hoisting process is rare. I will continue investigating the correctness of
that patch and see what I can get.

Please review.

Thanks

And the ChangeLog:

2012-10-12  Bin Cheng

	* gcse.c: Update copyright dates.
	(hoist_expr_reaches_here_p): Change parameter type from char *
	to sbitmap.

2012-10-12  Bin Cheng

	* common.opt (flag_ira_hoist_pressure): New.
	* doc/invoke.texi (-fira-hoist-pressure): Describe.
	* ira-costs.c (ira_set_pseudo_classes): New parameter.
	* ira.h: Update copyright dates.
	(ira_set_pseudo_classes): Update prototype.
	* haifa-sched.c (sched_init): Update call.
	* ira.c (ira): Update call.
	* regmove.c: Update copyright dates.
	(regmove_optimize): Update call.
	* loop-invariant.c: Update copyright dates.
	(move_loop_invariants): Update call.
	* gcse.c (struct bb_data): New structure.
	(BB_DATA): New macro.
	(curr_bb, curr_reg_pressure): New static variables.
	(should_hoist_expr_to_dom): Rename from hoist_expr_reaches_here_p.
	Change parameter expr_index to expr. New parameters pressure_class,
	nregs and hoisted_bbs. Use reg pressure to determine the distance
	expr can be hoisted.
	(hoist_code): Use reg pressure to direct the hoist process.
	(get_regno_pressure_class, get_pressure_class_and_nregs)
	(change_pressure, calculate_bb_reg_pressure): New.
	(one_code_hoisting_pass): Calculate register pressure. Allocate
	and free data.

gcc/testsuite/ChangeLog
2012-10-12  Bin Cheng

	* testsuite/gcc.dg/hoist-register-pressure.c: New test.

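For readers who want the idea behind the should_hoist_expr_to_dom entry before reading the diff, here is a minimal sketch of the distance rule in plain C. It is only a toy model, not the patch itself: struct toy_block, toy_charge_distance and toy_can_hoist_through are hypothetical names, the path is a single chain of blocks rather than a walk over all predecessors, and the budget is assumed to be positive (the special "unlimited" case is left out).

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model of one basic block; this is NOT a GCC data structure.  */
struct toy_block
{
  int size;          /* number of instructions in the block */
  int max_pressure;  /* maximal register pressure seen in the block */
};

/* Return the distance budget left after hoisting an expression through
   BLOCK.  AVAILABLE_REGS stands for the number of hard registers in the
   expression's pressure class.  The budget is charged only when pressure
   directed hoisting is off, when the block already has high pressure, or
   when the expression is a constant.  */
int
toy_charge_distance (int distance, const struct toy_block *block,
                     int available_regs, bool is_constant,
                     bool use_pressure)
{
  assert (distance > 0);
  if (!use_pressure
      || block->max_pressure >= available_regs
      || is_constant)
    distance -= block->size;
  return distance;
}

/* Return true if an expression with budget DISTANCE can be hoisted up
   through the N blocks in PATH, listed from the occurrence towards the
   dominating block.  */
bool
toy_can_hoist_through (int distance, const struct toy_block *path, int n,
                       int available_regs, bool is_constant,
                       bool use_pressure)
{
  for (int i = 0; i < n; i++)
    {
      distance = toy_charge_distance (distance, &path[i], available_regs,
                                      is_constant, use_pressure);
      if (distance <= 0)
        return false;
    }
  return true;
}
```

With a budget of 6 and two blocks of 5 instructions each, the toy model allows the hoist when only one block has high pressure and pressure direction is on, and refuses it when every block is charged.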
diff -x .svn -rpN wc1/gcc/common.opt wc2/gcc/common.opt *** wc1/gcc/common.opt 2012-10-12 15:13:41.679617846 +0800 --- wc2/gcc/common.opt 2012-10-12 15:12:15.614174292 +0800 *************** Enum(ira_region) String(all) Value(IRA_R *** 1392,1397 **** --- 1392,1402 ---- EnumValue Enum(ira_region) String(mixed) Value(IRA_REGION_MIXED) + fira-hoist-pressure + Common Report Var(flag_ira_hoist_pressure) Init(1) Optimization + Use IRA based register pressure calculation + in RTL hoist optimizations. + fira-loop-pressure Common Report Var(flag_ira_loop_pressure) Use IRA based register pressure calculation diff -x .svn -rpN wc1/gcc/doc/invoke.texi wc2/gcc/doc/invoke.texi *** wc1/gcc/doc/invoke.texi 2012-10-12 15:13:40.282479595 +0800 --- wc2/gcc/doc/invoke.texi 2012-10-12 15:12:15.622110427 +0800 *************** Objective-C and Objective-C++ Dialects}. *** 372,378 **** -finline-small-functions -fipa-cp -fipa-cp-clone @gol -fipa-pta -fipa-profile -fipa-pure-const -fipa-reference @gol -fira-algorithm=@var{algorithm} @gol ! -fira-region=@var{region} @gol -fira-loop-pressure -fno-ira-share-save-slots @gol -fno-ira-share-spill-slots -fira-verbose=@var{n} @gol -fivopts -fkeep-inline-functions -fkeep-static-consts @gol --- 372,378 ---- -finline-small-functions -fipa-cp -fipa-cp-clone @gol -fipa-pta -fipa-profile -fipa-pure-const -fipa-reference @gol -fira-algorithm=@var{algorithm} @gol ! -fira-region=@var{region} -fira-hoist-pressure @gol -fira-loop-pressure -fno-ira-share-save-slots @gol -fno-ira-share-spill-slots -fira-verbose=@var{n} @gol -fivopts -fkeep-inline-functions -fkeep-static-consts @gol *************** This typically results in the smallest c *** 6996,7001 **** --- 6996,7009 ---- @end table + @item -fira-hoist-pressure + @opindex fira-hoist-pressure + Use IRA to evaluate register pressure in the code hoisting pass for + decisions to hoist expressions. This option usually results in smaller + code, but it can slow the compiler down. 
+ + This option is enabled at level @option{-Os} for all targets. + @item -fira-loop-pressure @opindex fira-loop-pressure Use IRA to evaluate register pressure in loops for decisions to move diff -x .svn -rpN wc1/gcc/gcse.c wc2/gcc/gcse.c *** wc1/gcc/gcse.c 2012-10-12 15:45:12.822998377 +0800 --- wc2/gcc/gcse.c 2012-10-12 15:31:45.062189749 +0800 *************** along with GCC; see the file COPYING3. *** 20,28 **** /* TODO - reordering of memory allocation and freeing to be more space efficient ! - do rough calc of how many regs are needed in each block, and a rough ! calc of how many regs are available in each class and use that to ! throttle back the code in cases where RTX_COST is minimal. */ /* References searched while implementing this. --- 20,30 ---- /* TODO - reordering of memory allocation and freeing to be more space efficient ! - simulate register pressure change of each basic block accurately during ! hoist process. But I doubt the benefit since most expressions hoisted ! are constant or address, which usually won't reduce register pressure. ! - calc rough register pressure information and use the info to drive all ! kinds of code motion (including code hoisting) in a unified way. */ /* References searched while implementing this. *************** along with GCC; see the file COPYING3. *** 141,151 **** #include "diagnostic-core.h" #include "toplev.h" #include "rtl.h" #include "tree.h" #include "tm_p.h" #include "regs.h" ! #include "hard-reg-set.h" #include "flags.h" #include "insn-config.h" #include "recog.h" --- 143,154 ---- #include "diagnostic-core.h" #include "toplev.h" + #include "hard-reg-set.h" #include "rtl.h" #include "tree.h" #include "tm_p.h" #include "regs.h" ! #include "ira.h" #include "flags.h" #include "insn-config.h" #include "recog.h" *************** static bool doing_code_hoisting_p = fals *** 412,417 **** --- 415,436 ---- /* For available exprs */ static sbitmap *ae_kill; + /* Data stored for each basic block. 
*/ + struct bb_data + { + /* Maximal register pressure inside basic block for given register class + (defined only for the pressure classes). */ + int max_reg_pressure[N_REG_CLASSES]; + }; + + #define BB_DATA(bb) ((struct bb_data *) (bb)->aux) + + static basic_block curr_bb; + + /* Currently register pressure for each pressure class. */ + static int curr_reg_pressure[N_REG_CLASSES]; + + static void compute_can_copy (void); static void *gmalloc (size_t) ATTRIBUTE_MALLOC; static void *gcalloc (size_t, size_t) ATTRIBUTE_MALLOC; *************** static void alloc_code_hoist_mem (int, i *** 460,468 **** static void free_code_hoist_mem (void); static void compute_code_hoist_vbeinout (void); static void compute_code_hoist_data (void); ! static int hoist_expr_reaches_here_p (basic_block, int, basic_block, sbitmap, ! int, int *); static int hoist_code (void); static int one_code_hoisting_pass (void); static rtx process_insert_insn (struct expr *); static int pre_edge_insert (struct edge_list *, struct expr **); --- 479,489 ---- static void free_code_hoist_mem (void); static void compute_code_hoist_vbeinout (void); static void compute_code_hoist_data (void); ! static int should_hoist_expr_to_dom (basic_block, struct expr *, basic_block, ! sbitmap, int, int *, enum reg_class, ! int *, bitmap); static int hoist_code (void); + static enum reg_class get_pressure_class_and_nregs (rtx insn, int *nregs); static int one_code_hoisting_pass (void); static rtx process_insert_insn (struct expr *); static int pre_edge_insert (struct edge_list *, struct expr **); *************** prune_expressions (bool pre_p) *** 1857,1863 **** a basic block we should account for any side-effects of a subsequent jump instructions that could clobber the expression. It would be best to implement this check along the lines of ! hoist_expr_reaches_here_p where the target block is already known and, hence, there's no need to conservatively prune expressions on "intermediate" set-and-jump instructions. 
*/ FOR_EACH_EDGE (e, ei, bb->preds) --- 1878,1884 ---- a basic block we should account for any side-effects of a subsequent jump instructions that could clobber the expression. It would be best to implement this check along the lines of ! should_hoist_expr_to_dom where the target block is already known and, hence, there's no need to conservatively prune expressions on "intermediate" set-and-jump instructions. */ FOR_EACH_EDGE (e, ei, bb->preds) *************** compute_code_hoist_data (void) *** 2825,2834 **** fprintf (dump_file, "\n"); } ! /* Determine if the expression identified by EXPR_INDEX would ! reach BB unimpared if it was placed at the end of EXPR_BB. ! Stop the search if the expression would need to be moved more ! than DISTANCE instructions. It's unclear exactly what Muchnick meant by "unimpared". It seems to me that the expression must either be computed or transparent in --- 2846,2866 ---- fprintf (dump_file, "\n"); } ! /* Determine if the expression EXPR should be hoisted to EXPR_BB up in ! flow graph, if it can reach BB unimpared. Stop the search if the ! expression would need to be moved more than DISTANCE instructions. ! ! DISTANCE is the number of instructions through which EXPR can be ! hoisted up in flow graph. ! ! BB_SIZE points to an array which contains the number of instructions ! for each basic block. ! ! PRESSURE_CLASS and NREGS are register class and number of hard registers ! for storing EXPR. ! ! HOISTED_BBS points to a bitmap indicating basic blocks through which ! EXPR is hoisted. It's unclear exactly what Muchnick meant by "unimpared". It seems to me that the expression must either be computed or transparent in *************** compute_code_hoist_data (void) *** 2841,2858 **** paths. */ static int ! hoist_expr_reaches_here_p (basic_block expr_bb, int expr_index, basic_block bb, ! 
sbitmap visited, int distance, int *bb_size) { edge pred; edge_iterator ei; int visited_allocated_locally = 0; /* Terminate the search if distance, for which EXPR is allowed to move, is exhausted. */ if (distance > 0) { ! distance -= bb_size[bb->index]; if (distance <= 0) return 0; --- 2873,2904 ---- paths. */ static int ! should_hoist_expr_to_dom (basic_block expr_bb, struct expr *expr, ! basic_block bb, sbitmap visited, int distance, ! int *bb_size, enum reg_class pressure_class, ! int *nregs, bitmap hoisted_bbs) { + unsigned int i; edge pred; edge_iterator ei; + sbitmap_iterator sbi; int visited_allocated_locally = 0; /* Terminate the search if distance, for which EXPR is allowed to move, is exhausted. */ if (distance > 0) { ! /* Let EXPR be hoisted through basic block at no cost if the block ! has low register pressure. An exception is constant expression, ! because hoisting constant expr aggressively results in worse code. ! The exception is made by the observation of CSiBE on ARM target, ! while it has no obvious effect on other targets like x86, x86_64, ! mips and powerpc. */ ! if (!flag_ira_hoist_pressure ! || (BB_DATA (bb)->max_reg_pressure[pressure_class] ! >= ira_class_hard_regs_num[pressure_class] ! || CONST_INT_P (expr->expr))) ! distance -= bb_size[bb->index]; if (distance <= 0) return 0; *************** hoist_expr_reaches_here_p (basic_block e *** 2877,2897 **** continue; else if (TEST_BIT (visited, pred_bb->index)) continue; ! ! else if (! TEST_BIT (transp[pred_bb->index], expr_index)) break; - /* Not killed. */ else { SET_BIT (visited, pred_bb->index); ! if (! hoist_expr_reaches_here_p (expr_bb, expr_index, pred_bb, ! visited, distance, bb_size)) break; } } if (visited_allocated_locally) ! sbitmap_free (visited); return (pred == NULL); } --- 2923,2957 ---- continue; else if (TEST_BIT (visited, pred_bb->index)) continue; ! else if (! TEST_BIT (transp[pred_bb->index], expr->bitmap_index)) break; /* Not killed. 
*/ else { SET_BIT (visited, pred_bb->index); ! if (! should_hoist_expr_to_dom (expr_bb, expr, pred_bb, ! visited, distance, bb_size, ! pressure_class, nregs, hoisted_bbs)) break; } } if (visited_allocated_locally) ! { ! /* If EXPR can be hoisted to expr_bb, record basic blocks through ! which EXPR is hoisted in hoisted_bbs. Also update register ! pressure for basic blocks newly added in hoisted_bbs. */ ! if (flag_ira_hoist_pressure && !pred) ! { ! EXECUTE_IF_SET_IN_SBITMAP (visited, 0, i, sbi) ! if (!bitmap_bit_p (hoisted_bbs, i)) ! { ! bitmap_set_bit (hoisted_bbs, i); ! BB_DATA (BASIC_BLOCK (i))->max_reg_pressure[pressure_class] ! += *nregs; ! } ! } ! sbitmap_free (visited); ! } return (pred == NULL); } *************** find_occr_in_bb (struct occr *occr, basi *** 2908,2914 **** return occr; } ! /* Actually perform code hoisting. */ static int hoist_code (void) --- 2968,3011 ---- return occr; } ! /* Actually perform code hoisting. ! ! The code hoisting pass can hoist multiple computations of the same ! expression along dominated path to a dominating basic block, like ! from b2/b3 to b1 as depicted below: ! ! b1 ------ ! /\ | ! / \ | ! bx by distance ! / \ | ! / \ | ! b2 b3 ------ ! ! Unfortunately code hoisting generally extends the live range of an ! output pseudo register, which increases register pressure and hurts ! register allocation. To address this issue, an attribute MAX_DISTANCE ! is computed and attached to each expression. The attribute is computed ! from rtx cost of the corresponding expression and it's used to control ! how long the expression can be hoisted up in flow graph. As the ! expression is hoisted up in flow graph, GCC decreases its DISTANCE ! and stops the hoist if DISTANCE reaches 0. ! ! Option "-fira-hoist-pressure" implements register pressure directed ! hoist based on upper method. The rationale is: ! 1. Calculate register pressure for each basic block by reusing IRA ! facility. ! 2. 
When expression is hoisted through one basic block, GCC checks ! register pressure of the basic block and decrease DISTANCE only ! when the register pressure is high. In other words, expression ! will be hoisted through basic block with low register pressure ! at no cost. ! 3. Update register pressure information for basic blocks through ! which expression is hoisted. ! TODO: It is possible to have register pressure decreased because ! of shrinked live ranges of input pseudo registers when hoisting ! an expression. For now, this effect is not simulated and we just ! increase register pressure for hoisted expressions. */ static int hoist_code (void) *************** hoist_code (void) *** 2917,2928 **** VEC (basic_block, heap) *dom_tree_walk; unsigned int dom_tree_walk_index; VEC (basic_block, heap) *domby; ! unsigned int i,j; struct expr **index_map; struct expr *expr; int *to_bb_head; int *bb_size; int changed = 0; /* Compute a mapping from expression number (`bitmap_index') to hash table entry. */ --- 3014,3031 ---- VEC (basic_block, heap) *dom_tree_walk; unsigned int dom_tree_walk_index; VEC (basic_block, heap) *domby; ! unsigned int i, j, k; struct expr **index_map; struct expr *expr; int *to_bb_head; int *bb_size; int changed = 0; + struct bb_data *data; + /* Basic blocks that have occurrences reachable from BB. */ + bitmap from_bbs; + /* Basic blocks through which expr is hoisted. */ + bitmap hoisted_bbs = NULL; + bitmap_iterator bi; /* Compute a mapping from expression number (`bitmap_index') to hash table entry. 
*/ *************** hoist_code (void) *** 2960,2965 **** --- 3063,3072 ---- && (EDGE_SUCC (ENTRY_BLOCK_PTR, 0)->dest == ENTRY_BLOCK_PTR->next_bb)); + from_bbs = BITMAP_ALLOC (NULL); + if (flag_ira_hoist_pressure) + hoisted_bbs = BITMAP_ALLOC (NULL); + dom_tree_walk = get_all_dominated_blocks (CDI_DOMINATORS, ENTRY_BLOCK_PTR->next_bb); *************** hoist_code (void) *** 2978,2989 **** { if (TEST_BIT (hoist_vbeout[bb->index], i)) { /* Current expression. */ struct expr *expr = index_map[i]; /* Number of occurrences of EXPR that can be hoisted to BB. */ int hoistable = 0; - /* Basic blocks that have occurrences reachable from BB. */ - bitmap_head _from_bbs, *from_bbs = &_from_bbs; /* Occurrences reachable from BB. */ VEC (occr_t, heap) *occrs_to_hoist = NULL; /* We want to insert the expression into BB only once, so --- 3085,3096 ---- { if (TEST_BIT (hoist_vbeout[bb->index], i)) { + int nregs = 0; + enum reg_class pressure_class = NO_REGS; /* Current expression. */ struct expr *expr = index_map[i]; /* Number of occurrences of EXPR that can be hoisted to BB. */ int hoistable = 0; /* Occurrences reachable from BB. */ VEC (occr_t, heap) *occrs_to_hoist = NULL; /* We want to insert the expression into BB only once, so *************** hoist_code (void) *** 2991,2998 **** int insn_inserted_p; occr_t occr; - bitmap_initialize (from_bbs, 0); - /* If an expression is computed in BB and is available at end of BB, hoist all occurrences dominated by BB to BB. */ if (TEST_BIT (comp[bb->index], i)) --- 3098,3103 ---- *************** hoist_code (void) *** 3046,3058 **** max_distance += (bb_size[dominated->index] - to_bb_head[INSN_UID (occr->insn)]); ! /* Note if the expression would reach the dominated block ! unimpared if it was placed at the end of BB. Keep track of how many times this expression is hoistable from a dominated block into BB. */ ! if (hoist_expr_reaches_here_p (bb, i, dominated, NULL, ! 
max_distance, bb_size)) { hoistable++; VEC_safe_push (occr_t, heap, --- 3151,3168 ---- max_distance += (bb_size[dominated->index] - to_bb_head[INSN_UID (occr->insn)]); ! pressure_class = get_pressure_class_and_nregs (occr->insn, ! &nregs); ! ! /* Note if the expression should be hoisted from the dominated ! block to BB if it can reach DOMINATED unimpared. Keep track of how many times this expression is hoistable from a dominated block into BB. */ ! if (should_hoist_expr_to_dom (bb, expr, dominated, NULL, ! max_distance, bb_size, ! pressure_class, &nregs, ! hoisted_bbs)) { hoistable++; VEC_safe_push (occr_t, heap, *************** hoist_code (void) *** 3073,3078 **** --- 3183,3195 ---- to nullify any benefit we get from code hoisting. */ if (hoistable > 1 && dbg_cnt (hoist_insn)) { + /* Update register pressure for basic block to which expr + is hoisted. */ + if (flag_ira_hoist_pressure) + { + data = BB_DATA (bb); + data->max_reg_pressure[pressure_class] += nregs; + } /* If (hoistable != VEC_length), then there is an occurrence of EXPR in BB itself. Don't waste time looking for LCA in this case. */ *************** hoist_code (void) *** 3090,3097 **** } } else ! /* Punt, no point hoisting a single occurence. */ ! VEC_free (occr_t, heap, occrs_to_hoist); insn_inserted_p = 0; --- 3207,3226 ---- } } else ! { ! /* Punt, no point hoisting a single occurence. */ ! VEC_free (occr_t, heap, occrs_to_hoist); ! /* Restore register pressure of basic block recorded in ! hoisted_bbs when expr will not be hoisted. */ ! if (flag_ira_hoist_pressure) ! EXECUTE_IF_SET_IN_BITMAP (hoisted_bbs, 0, k, bi) ! { ! data = BB_DATA (BASIC_BLOCK (k)); ! data->max_reg_pressure[pressure_class] -= nregs; ! } ! } ! if (flag_ira_hoist_pressure) ! 
bitmap_clear (hoisted_bbs); insn_inserted_p = 0; *************** hoist_code (void) *** 3141,3146 **** --- 3270,3279 ---- } VEC_free (basic_block, heap, dom_tree_walk); + BITMAP_FREE (from_bbs); + if (flag_ira_hoist_pressure) + BITMAP_FREE (hoisted_bbs); + free (bb_size); free (to_bb_head); free (index_map); *************** hoist_code (void) *** 3148,3153 **** --- 3281,3445 ---- return changed; } + /* Return pressure class and number of needed hard registers (through + *NREGS) of register REGNO. */ + static enum reg_class + get_regno_pressure_class (int regno, int *nregs) + { + if (regno >= FIRST_PSEUDO_REGISTER) + { + enum reg_class pressure_class; + + pressure_class = reg_allocno_class (regno); + pressure_class = ira_pressure_class_translate[pressure_class]; + *nregs + = ira_reg_class_max_nregs[pressure_class][PSEUDO_REGNO_MODE (regno)]; + return pressure_class; + } + else if (! TEST_HARD_REG_BIT (ira_no_alloc_regs, regno) + && ! TEST_HARD_REG_BIT (eliminable_regset, regno)) + { + *nregs = 1; + return ira_pressure_class_translate[REGNO_REG_CLASS (regno)]; + } + else + { + *nregs = 0; + return NO_REGS; + } + } + + /* Return pressure class and number of hard registers (through *NREGS) + for destination of INSN. */ + static enum reg_class + get_pressure_class_and_nregs (rtx insn, int *nregs) + { + rtx reg; + enum reg_class pressure_class; + rtx set = single_set (insn); + + /* Considered invariant insns have only one set. */ + gcc_assert (set != NULL_RTX); + reg = SET_DEST (set); + if (GET_CODE (reg) == SUBREG) + reg = SUBREG_REG (reg); + if (MEM_P (reg)) + { + *nregs = 0; + pressure_class = NO_REGS; + } + else + { + gcc_assert (REG_P (reg)); + pressure_class = reg_allocno_class (REGNO (reg)); + pressure_class = ira_pressure_class_translate[pressure_class]; + *nregs + = ira_reg_class_max_nregs[pressure_class][GET_MODE (SET_SRC (set))]; + } + return pressure_class; + } + + /* Increase (if INCR_P) or decrease current register pressure for + register REGNO. 
*/ + static void + change_pressure (int regno, bool incr_p) + { + int nregs; + enum reg_class pressure_class; + + pressure_class = get_regno_pressure_class (regno, &nregs); + if (! incr_p) + curr_reg_pressure[pressure_class] -= nregs; + else + { + curr_reg_pressure[pressure_class] += nregs; + if (BB_DATA (curr_bb)->max_reg_pressure[pressure_class] + < curr_reg_pressure[pressure_class]) + BB_DATA (curr_bb)->max_reg_pressure[pressure_class] + = curr_reg_pressure[pressure_class]; + } + } + + /* Calculate register pressure for each basic block by walking insns + from last to first. */ + static void + calculate_bb_reg_pressure (void) + { + int i; + unsigned int j; + rtx insn; + basic_block bb; + bitmap curr_regs_live; + bitmap_iterator bi; + + + ira_setup_eliminable_regset (); + curr_regs_live = BITMAP_ALLOC (&reg_obstack); + FOR_EACH_BB (bb) + { + curr_bb = bb; + bitmap_copy (curr_regs_live, DF_LR_OUT (bb)); + for (i = 0; i < ira_pressure_classes_num; i++) + curr_reg_pressure[ira_pressure_classes[i]] = 0; + EXECUTE_IF_SET_IN_BITMAP (curr_regs_live, 0, j, bi) + change_pressure (j, true); + + FOR_BB_INSNS_REVERSE (bb, insn) + { + rtx dreg; + int regno; + df_ref *def_rec, *use_rec; + + if (!
NONDEBUG_INSN_P (insn)) + continue; + + for (def_rec = DF_INSN_DEFS (insn); *def_rec; def_rec++) + { + dreg = DF_REF_REAL_REG (*def_rec); + gcc_assert (REG_P (dreg)); + regno = REGNO (dreg); + if (!(DF_REF_FLAGS (*def_rec) + & (DF_REF_PARTIAL | DF_REF_CONDITIONAL))) + { + if (bitmap_clear_bit (curr_regs_live, regno)) + change_pressure (regno, false); + } + } + + for (use_rec = DF_INSN_USES (insn); *use_rec; use_rec++) + { + dreg = DF_REF_REAL_REG (*use_rec); + gcc_assert (REG_P (dreg)); + regno = REGNO (dreg); + if (bitmap_set_bit (curr_regs_live, regno)) + change_pressure (regno, true); + } + } + } + BITMAP_FREE (curr_regs_live); + + if (dump_file == NULL) + return; + + fprintf (dump_file, "\nRegister Pressure: \n"); + FOR_EACH_BB (bb) + { + fprintf (dump_file, " Basic block %d: \n", bb->index); + for (i = 0; (int) i < ira_pressure_classes_num; i++) + { + enum reg_class pressure_class; + + pressure_class = ira_pressure_classes[i]; + if (BB_DATA (bb)->max_reg_pressure[pressure_class] == 0) + continue; + + fprintf (dump_file, " %s=%d\n", reg_class_names[pressure_class], + BB_DATA (bb)->max_reg_pressure[pressure_class]); + } + } + fprintf (dump_file, "\n"); + } + /* Top level routine to perform one code hoisting (aka unification) pass Return nonzero if a change was made. */ *************** one_code_hoisting_pass (void) *** 3167,3172 **** --- 3459,3474 ---- doing_code_hoisting_p = true; + /* Calculate register pressure for each basic block. */ + if (flag_ira_hoist_pressure) + { + regstat_init_n_sets_and_refs (); + ira_set_pseudo_classes (false, dump_file); + alloc_aux_for_blocks (sizeof (struct bb_data)); + calculate_bb_reg_pressure (); + regstat_free_n_sets_and_refs (); + } + /* We need alias. 
*/ init_alias_analysis (); *************** one_code_hoisting_pass (void) *** 3187,3192 **** --- 3489,3499 ---- free_code_hoist_mem (); } + if (flag_ira_hoist_pressure) + { + free_aux_for_blocks (); + free_reg_info (); + } free_hash_table (&expr_hash_table); free_gcse_mem (); obstack_free (&gcse_obstack, NULL); diff -x .svn -rpN wc1/gcc/haifa-sched.c wc2/gcc/haifa-sched.c *** wc1/gcc/haifa-sched.c 2012-10-12 15:13:41.582914633 +0800 --- wc2/gcc/haifa-sched.c 2012-10-12 15:12:15.690110976 +0800 *************** sched_init (void) *** 6629,6635 **** /* We need info about pseudos for rtl dumps about pseudo classes and costs. */ regstat_init_n_sets_and_refs (); ! ira_set_pseudo_classes (sched_verbose ? sched_dump : NULL); sched_regno_pressure_class = (enum reg_class *) xmalloc (max_regno * sizeof (enum reg_class)); for (i = 0; i < max_regno; i++) --- 6629,6635 ---- /* We need info about pseudos for rtl dumps about pseudo classes and costs. */ regstat_init_n_sets_and_refs (); ! ira_set_pseudo_classes (true, sched_verbose ? sched_dump : NULL); sched_regno_pressure_class = (enum reg_class *) xmalloc (max_regno * sizeof (enum reg_class)); for (i = 0; i < max_regno; i++) diff -x .svn -rpN wc1/gcc/ira.c wc2/gcc/ira.c *** wc1/gcc/ira.c 2012-10-12 15:13:41.687618514 +0800 --- wc2/gcc/ira.c 2012-10-12 15:12:15.690110976 +0800 *************** ira (FILE *f) *** 4183,4189 **** crtl->is_leaf = leaf_function_p (); if (resize_reg_info () && flag_ira_loop_pressure) ! ira_set_pseudo_classes (ira_dump_file); rebuild_p = update_equiv_regs (); --- 4183,4189 ---- crtl->is_leaf = leaf_function_p (); if (resize_reg_info () && flag_ira_loop_pressure) ! ira_set_pseudo_classes (true, ira_dump_file); rebuild_p = update_equiv_regs (); diff -x .svn -rpN wc1/gcc/ira-costs.c wc2/gcc/ira-costs.c *** wc1/gcc/ira-costs.c 2012-10-12 15:13:41.694110750 +0800 --- wc2/gcc/ira-costs.c 2012-10-12 15:12:15.690110976 +0800 *************** ira_costs (void) *** 2048,2056 **** ira_free (total_allocno_costs); } ! 
/* Entry function which defines classes for pseudos. */ void ! ira_set_pseudo_classes (FILE *dump_file) { allocno_p = false; internal_flag_ira_verbose = flag_ira_verbose; --- 2048,2057 ---- ira_free (total_allocno_costs); } ! /* Entry function which defines classes for pseudos. ! Set pseudo_classes_defined_p only if DEFINE_PSEUDO_CLASSES is true. */ void ! ira_set_pseudo_classes (bool define_pseudo_classes, FILE *dump_file) { allocno_p = false; internal_flag_ira_verbose = flag_ira_verbose; *************** ira_set_pseudo_classes (FILE *dump_file) *** 2059,2065 **** initiate_regno_cost_classes (); find_costs_and_classes (dump_file); finish_regno_cost_classes (); ! pseudo_classes_defined_p = true; finish_costs (); } --- 2060,2068 ---- initiate_regno_cost_classes (); find_costs_and_classes (dump_file); finish_regno_cost_classes (); ! if (define_pseudo_classes) ! pseudo_classes_defined_p = true; ! finish_costs (); } diff -x .svn -rpN wc1/gcc/ira.h wc2/gcc/ira.h *** wc1/gcc/ira.h 2012-10-12 15:13:41.691617905 +0800 --- wc2/gcc/ira.h 2012-10-12 15:12:15.690110976 +0800 *************** *** 1,6 **** /* Communication between the Integrated Register Allocator (IRA) and the rest of the compiler. ! Copyright (C) 2006, 2007, 2008, 2009, 2010 Free Software Foundation, Inc. Contributed by Vladimir Makarov . --- 1,6 ---- /* Communication between the Integrated Register Allocator (IRA) and the rest of the compiler. ! Copyright (C) 2006, 2007, 2008, 2009, 2010, 2012 Free Software Foundation, Inc. Contributed by Vladimir Makarov . *************** extern void ira_init (void); *** 131,137 **** extern void ira_finish_once (void); extern void ira_setup_eliminable_regset (void); extern rtx ira_eliminate_regs (rtx, enum machine_mode); ! 
extern void ira_set_pseudo_classes (FILE *); extern void ira_implicitly_set_insn_hard_regs (HARD_REG_SET *); extern void ira_sort_regnos_for_alter_reg (int *, int, unsigned int *); --- 131,137 ---- extern void ira_finish_once (void); extern void ira_setup_eliminable_regset (void); extern rtx ira_eliminate_regs (rtx, enum machine_mode); ! extern void ira_set_pseudo_classes (bool, FILE *); extern void ira_implicitly_set_insn_hard_regs (HARD_REG_SET *); extern void ira_sort_regnos_for_alter_reg (int *, int, unsigned int *); diff -x .svn -rpN wc1/gcc/loop-invariant.c wc2/gcc/loop-invariant.c *** wc1/gcc/loop-invariant.c 2012-10-12 15:13:41.674117845 +0800 --- wc2/gcc/loop-invariant.c 2012-10-12 15:12:15.710111844 +0800 *************** *** 1,5 **** /* RTL-level loop invariant motion. ! Copyright (C) 2004, 2005, 2006, 2007, 2008, 2009, 2010 Free Software Foundation, Inc. This file is part of GCC. --- 1,5 ---- /* RTL-level loop invariant motion. ! Copyright (C) 2004, 2005, 2006, 2007, 2008, 2009, 2010, 2012 Free Software Foundation, Inc. This file is part of GCC. *************** move_loop_invariants (void) *** 1915,1921 **** { df_analyze (); regstat_init_n_sets_and_refs (); ! ira_set_pseudo_classes (dump_file); calculate_loop_reg_pressure (); regstat_free_n_sets_and_refs (); } --- 1915,1921 ---- { df_analyze (); regstat_init_n_sets_and_refs (); ! ira_set_pseudo_classes (true, dump_file); calculate_loop_reg_pressure (); regstat_free_n_sets_and_refs (); } diff -x .svn -rpN wc1/gcc/regmove.c wc2/gcc/regmove.c *** wc1/gcc/regmove.c 2012-10-12 15:13:41.659610135 +0800 --- wc2/gcc/regmove.c 2012-10-12 15:12:15.710111844 +0800 *************** *** 1,6 **** /* Move registers around to reduce number of move instructions needed. Copyright (C) 1987, 1988, 1989, 1992, 1993, 1994, 1995, 1996, 1997, 1998, ! 1999, 2000, 2001, 2002, 2003, 2004, 2005, 2006, 2007, 2008, 2009, 2010 Free Software Foundation, Inc. This file is part of GCC. 
--- 1,7 ----
  /* Move registers around to reduce number of move instructions needed.
     Copyright (C) 1987, 1988, 1989, 1992, 1993, 1994, 1995, 1996, 1997, 1998,
!    1999, 2000, 2001, 2002, 2003, 2004, 2005, 2006, 2007, 2008, 2009, 2010,
!    2012
     Free Software Foundation, Inc.

  This file is part of GCC.
*************** regmove_optimize (void)
*** 1237,1243 ****
    regstat_compute_ri ();

    if (flag_ira_loop_pressure)
!     ira_set_pseudo_classes (dump_file);

    regno_src_regno = XNEWVEC (int, nregs);
    for (i = nregs; --i >= 0; )
--- 1238,1244 ----
    regstat_compute_ri ();

    if (flag_ira_loop_pressure)
!     ira_set_pseudo_classes (true, dump_file);

    regno_src_regno = XNEWVEC (int, nregs);
    for (i = nregs; --i >= 0; )
diff -x .svn -rpN wc1/gcc/testsuite/gcc.dg/hoist-register-pressure.c wc2/gcc/testsuite/gcc.dg/hoist-register-pressure.c
*** wc1/gcc/testsuite/gcc.dg/hoist-register-pressure.c 1970-01-01 08:00:00.000000000 +0800
--- wc2/gcc/testsuite/gcc.dg/hoist-register-pressure.c 2012-10-12 15:12:15.710111844 +0800
***************
*** 0 ****
--- 1,31 ----
+ /* { dg-options "-Os -fdump-rtl-hoist" }  */
+ /* { dg-final { scan-rtl-dump "PRE/HOIST: end of bb .* copying expression" "hoist" } } */
+
+ #define BUF 100
+ int a[BUF];
+
+ void com (int);
+ void bar (int);
+
+ int foo (int x, int y, int z)
+ {
+   /* "x+y" won't be hoisted if "-fira-hoist-pressure" is disabled,
+      because its rtx_cost is too small.  */
+   if (z)
+     {
+       a[1] = a[0] + a[2];
+       a[2] = a[1] + a[3];
+       a[3] = a[2] + a[4];
+       a[4] = a[3] + a[5];
+       a[5] = a[4] + a[6];
+       a[6] = a[5] + a[7];
+       a[7] = a[6] + a[8];
+       com (x+y);
+     }
+   else
+     {
+       bar (x+y);
+     }
+
+   return 0;
+ }
Index: gcc/gcse.c
===================================================================
--- gcc/gcse.c (revision 192194)
+++ gcc/gcse.c (working copy)
@@ -1,6 +1,6 @@
 /* Partial redundancy elimination / Hoisting for RTL.
    Copyright (C) 1997, 1998, 1999, 2000, 2001, 2002, 2003, 2004, 2005,
-   2006, 2007, 2008, 2009, 2010, 2011 Free Software Foundation, Inc.
+   2006, 2007, 2008, 2009, 2010, 2011, 2012 Free Software Foundation, Inc.

 This file is part of GCC.

@@ -460,7 +460,7 @@ static void alloc_code_hoist_mem (int, int);
 static void free_code_hoist_mem (void);
 static void compute_code_hoist_vbeinout (void);
 static void compute_code_hoist_data (void);
-static int hoist_expr_reaches_here_p (basic_block, int, basic_block, char *,
+static int hoist_expr_reaches_here_p (basic_block, int, basic_block, sbitmap,
                                       int, int *);
 static int hoist_code (void);
 static int one_code_hoisting_pass (void);
@@ -2842,7 +2842,7 @@ compute_code_hoist_data (void)

 static int
 hoist_expr_reaches_here_p (basic_block expr_bb, int expr_index, basic_block bb,
-                           char *visited, int distance, int *bb_size)
+                           sbitmap visited, int distance, int *bb_size)
 {
   edge pred;
   edge_iterator ei;
@@ -2863,7 +2863,8 @@ hoist_expr_reaches_here_p (basic_block expr_bb, in
   if (visited == NULL)
     {
       visited_allocated_locally = 1;
-      visited = XCNEWVEC (char, last_basic_block);
+      visited = sbitmap_alloc (last_basic_block);
+      sbitmap_zero (visited);
     }

   FOR_EACH_EDGE (pred, ei, bb->preds)
@@ -2874,7 +2875,7 @@ hoist_expr_reaches_here_p (basic_block expr_bb, in
         break;
       else if (pred_bb == expr_bb)
         continue;
-      else if (visited[pred_bb->index])
+      else if (TEST_BIT (visited, pred_bb->index))
         continue;

       else if (! TEST_BIT (transp[pred_bb->index], expr_index))
@@ -2883,14 +2884,14 @@ hoist_expr_reaches_here_p (basic_block expr_bb, in
       /* Not killed.  */
       else
         {
-          visited[pred_bb->index] = 1;
+          SET_BIT (visited, pred_bb->index);
           if (! hoist_expr_reaches_here_p (expr_bb, expr_index, pred_bb,
                                            visited, distance, bb_size))
             break;
         }
     }
   if (visited_allocated_locally)
-    free (visited);
+    sbitmap_free (visited);

   return (pred == NULL);
 }