From patchwork Fri Nov 2 08:34:57 2012
X-Patchwork-Submitter: Bin Cheng
X-Patchwork-Id: 196509
From: "Bin Cheng"
Subject: [PATCH Version 2][RFA]Improving register pressure directed hoist
Date: Fri, 2 Nov 2012 16:34:57 +0800
Message-ID: <007001cdb8d4$f2198a30$d64c9e90$@cheng@arm.com>
Delivered-To: mailing list gcc-patches@gcc.gnu.org

Hi,

I posted a patch improving register pressure directed hoist at
http://gcc.gnu.org/ml/gcc-patches/2012-10/msg02552.html

It turns out that patch has a mistake in update_bb_reg_pressure, resulting
in changing register pressure incorrectly.  Here is the second version of
the patch for review.  Unfortunately and strangely, this correct patch isn't
as good as the bogus one.  The improvement is as below:

             improvement
  Thumb1     0.17% (0.13% before)
  ARM        0.15% (0.12% before)
  AARCH64    0.33%
  MIPS       0.24%
  PowerPC    0.62%
  Thumb2     X
  x86        X
  x86_64     X

X means no obvious effect on the corresponding target.

Though the effect is not as good as expected, I still think it is worth
reviewing, because:
  a) it does improve code size a little on the ARM targets and introduces
     fewer regressions.
  b) it makes further optimization possible (for example, computing VBE
     optimistically in compute_code_hoist_vbeinout).

Also, I don't understand why the bogus patch catches more hoist
opportunities and improves code size, so please help if you have any idea
about this.

It is re-tested on x86.  OK?

Thanks very much.

2012-11-02  Bin Cheng

	* gcse.c (struct bb_data): Add new fields, old_pressure, live_in
	and backup.
	(calculate_bb_reg_pressure): Initialize live_in and backup.
	(update_bb_reg_pressure): New.
	(should_hoist_expr_to_dom): Add new parameter from.
	Monitor the change of reg pressure and use it to drive hoisting.
	(hoist_code): Update LIVE and reg pressure information.

gcc/testsuite/ChangeLog
2012-11-02  Bin Cheng

	* gcc.dg/hoist-register-pressure-3.c: New test.

Index: gcc/testsuite/gcc.dg/hoist-register-pressure-3.c
===================================================================
--- gcc/testsuite/gcc.dg/hoist-register-pressure-3.c	(revision 0)
+++ gcc/testsuite/gcc.dg/hoist-register-pressure-3.c	(revision 0)
@@ -0,0 +1,32 @@
+/* { dg-options "-Os -fdump-rtl-hoist" }  */
+/* { dg-final { scan-rtl-dump "PRE/HOIST: end of bb .* copying expression" "hoist" } } */
+
+#define BUF 100
+int a[BUF];
+
+void com (int);
+void bar (int);
+
+int foo (int x, int y, int z)
+{
+  /* "x+y" won't be hoisted if "-fira-hoist-pressure" is disabled,
+     because its rtx_cost is too small.  */
+  if (z)
+    {
+      a[1] = a[0] + a[2] + a[3] + a[4] + a[5] + a[6];
+      a[2] = a[1] + a[3] + a[5] + a[5] + a[6] + a[7];
+      a[3] = a[2] + a[5] + a[7] + a[6] + a[7] + a[8];
+      a[4] = a[3] + a[7] + a[11] + a[7] + a[8] + a[9];
+      a[5] = a[5] + a[11] + a[13] + a[8] + a[9] + a[10];
+      a[6] = a[7] + a[13] + a[17] + a[9] + a[10] + a[11];
+      a[7] = a[11] + a[17] + a[19] + a[10] + a[11] + a[12];
+      com (x+y);
+    }
+  else
+    {
+      bar (x+y);
+    }
+
+  return 0;
+}
+
Index: gcc/gcse.c
===================================================================
--- gcc/gcse.c	(revision 193013)
+++ gcc/gcse.c	(working copy)
@@ -20,9 +20,6 @@ along with GCC; see the file COPYING3.  If not see
 
 /* TODO
    - reordering of memory allocation and freeing to be more space efficient
-   - simulate register pressure change of each basic block accurately during
-     hoist process.  But I doubt the benefit since most expressions hoisted
-     are constant or address, which usually won't reduce register pressure.
    - calc rough register pressure information and use the info to drive all
      kinds of code motion (including code hoisting) in a unified way.
 */
@@ -421,6 +418,15 @@ struct bb_data
   /* Maximal register pressure inside basic block for given register class
      (defined only for the pressure classes).  */
   int max_reg_pressure[N_REG_CLASSES];
+  /* Recorded register pressure of basic block before trying to hoist
+     an expression.  Will be used to restore the register pressure
+     if the expression should not be hoisted.  */
+  int old_pressure;
+  /* Recorded register live_in info of basic block during code hoisting
+     process.  BACKUP is used to record live_in info before trying to
+     hoist an expression, and will be used to restore LIVE_IN if the
+     expression should not be hoisted.  */
+  bitmap live_in, backup;
 };
 
 #define BB_DATA(bb) ((struct bb_data *) (bb)->aux)
@@ -481,7 +487,7 @@ static void compute_code_hoist_vbeinout (void);
 static void compute_code_hoist_data (void);
 static int should_hoist_expr_to_dom (basic_block, struct expr *, basic_block,
                                      sbitmap, int, int *, enum reg_class,
-                                     int *, bitmap);
+                                     int *, bitmap, rtx);
 static int hoist_code (void);
 static enum reg_class get_pressure_class_and_nregs (rtx insn, int *nregs);
 static int one_code_hoisting_pass (void);
@@ -2847,6 +2853,69 @@ compute_code_hoist_data (void)
     fprintf (dump_file, "\n");
 }
 
+/* Update register pressure for BB when hoisting an expression from
+   instruction FROM, if live ranges of inputs are shrunk.  Also
+   maintain live_in information if live range of register referred
+   in FROM is shrunk.
+
+   Return 0 if register pressure doesn't change, otherwise return
+   the number by which register pressure is decreased.
+
+   NOTE: Register pressure won't be increased in this function.  */
+
+static int
+update_bb_reg_pressure (basic_block bb, rtx from,
+                        enum reg_class pressure_class, int nregs)
+{
+  rtx dreg, insn;
+  basic_block succ_bb;
+  df_ref *op, op_ref;
+  edge succ;
+  edge_iterator ei;
+  int decreased_pressure = 0;
+
+  for (op = DF_INSN_USES (from); *op; op++)
+    {
+      dreg = DF_REF_REAL_REG (*op);
+      /* The live range of register is shrunk only if it isn't:
+         1. referred on any path from the end of this block to EXIT, or
+         2. referred by insns other than FROM in this block.  */
+      FOR_EACH_EDGE (succ, ei, bb->succs)
+        {
+          succ_bb = succ->dest;
+          if (succ_bb == EXIT_BLOCK_PTR)
+            continue;
+
+          if (bitmap_bit_p (BB_DATA (succ_bb)->live_in, REGNO (dreg)))
+            break;
+        }
+      if (succ != NULL)
+        continue;
+
+      op_ref = DF_REG_USE_CHAIN (REGNO (dreg));
+      for (; op_ref; op_ref = DF_REF_NEXT_REG (op_ref))
+        {
+          if (!DF_REF_INSN_INFO (op_ref))
+            continue;
+
+          insn = DF_REF_INSN (op_ref);
+          if (BLOCK_FOR_INSN (insn) == bb
+              && NONDEBUG_INSN_P (insn) && insn != from)
+            break;
+        }
+
+      /* Decrease register pressure and update live_in information for
+         this block.  */
+      if (!op_ref)
+        {
+          decreased_pressure += nregs;
+          BB_DATA (bb)->max_reg_pressure[pressure_class] -= nregs;
+          bitmap_clear_bit (BB_DATA (bb)->live_in, REGNO (dreg));
+        }
+    }
+  return decreased_pressure;
+}
+
 /* Determine if the expression EXPR should be hoisted to EXPR_BB up in
    flow graph, if it can reach BB unimpared.  Stop the search if the
    expression would need to be moved more than DISTANCE instructions.
@@ -2863,6 +2932,8 @@ compute_code_hoist_data (void)
    HOISTED_BBS points to a bitmap indicating basic blocks through which
    EXPR is hoisted.
 
+   FROM is the instruction from which EXPR is hoisted.
+
    It's unclear exactly what Muchnick meant by "unimpared".  It seems
    to me that the expression must either be computed or transparent in
    *every* block in the path(s) from EXPR_BB to BB.  Any other definition
@@ -2877,28 +2948,55 @@ static int
 should_hoist_expr_to_dom (basic_block expr_bb, struct expr *expr,
                           basic_block bb, sbitmap visited, int distance,
                           int *bb_size, enum reg_class pressure_class,
-                          int *nregs, bitmap hoisted_bbs)
+                          int *nregs, bitmap hoisted_bbs, rtx from)
 {
   unsigned int i;
   edge pred;
   edge_iterator ei;
   sbitmap_iterator sbi;
   int visited_allocated_locally = 0;
+  int decreased_pressure = 0;
 
+  if (flag_ira_hoist_pressure)
+    {
+      /* Record old information of basic block BB when it is visited
+         for the first time.  */
+      if (!bitmap_bit_p (hoisted_bbs, bb->index))
+        {
+          struct bb_data *data = BB_DATA (bb);
+          bitmap_copy (data->backup, data->live_in);
+          data->old_pressure = data->max_reg_pressure[pressure_class];
+        }
+      decreased_pressure = update_bb_reg_pressure (bb, from,
+                                                   pressure_class, *nregs);
+    }
   /* Terminate the search if distance, for which EXPR is allowed to move,
      is exhausted.  */
   if (distance > 0)
     {
-      /* Let EXPR be hoisted through basic block at no cost if the block
-         has low register pressure.  An exception is constant expression,
-         because hoisting constant expr aggressively results in worse code.
-         The exception is made by the observation of CSiBE on ARM target,
-         while it has no obvious effect on other targets like x86, x86_64,
-         mips and powerpc.  */
-      if (!flag_ira_hoist_pressure
-          || (BB_DATA (bb)->max_reg_pressure[pressure_class]
-                >= ira_class_hard_regs_num[pressure_class]
-              || CONST_INT_P (expr->expr)))
+      if (flag_ira_hoist_pressure)
+        {
+          /* Prefer to hoist EXPR if register pressure is decreased.  */
+          if (decreased_pressure > *nregs)
+            distance += bb_size[bb->index];
+          /* Let EXPR be hoisted through basic block at no cost if one
+             of following conditions is satisfied:
+
+             1. The basic block has low register pressure.
+             2. Register pressure won't be increased after hoisting EXPR.
+
+             Constant expressions are handled conservatively, because
+             hoisting constant expression aggressively results in worse
+             code.  This decision is made by the observation of CSiBE
+             on ARM target, while it has no obvious effect on other
+             targets like x86, x86_64, mips and powerpc.  */
+          else if (CONST_INT_P (expr->expr)
+                   || (BB_DATA (bb)->max_reg_pressure[pressure_class]
+                         >= ira_class_hard_regs_num[pressure_class]
+                       && decreased_pressure < *nregs))
+            distance -= bb_size[bb->index];
+        }
+      else
         distance -= bb_size[bb->index];
 
       if (distance <= 0)
@@ -2932,24 +3030,21 @@ should_hoist_expr_to_dom (basic_block expr_bb, str
           SET_BIT (visited, pred_bb->index);
           if (! should_hoist_expr_to_dom (expr_bb, expr, pred_bb,
                                           visited, distance, bb_size,
-                                          pressure_class, nregs, hoisted_bbs))
+                                          pressure_class, nregs,
+                                          hoisted_bbs, from))
             break;
         }
     }
   if (visited_allocated_locally)
     {
       /* If EXPR can be hoisted to expr_bb, record basic blocks through
-         which EXPR is hoisted in hoisted_bbs.  Also update register
-         pressure for basic blocks newly added in hoisted_bbs.  */
+         which EXPR is hoisted in hoisted_bbs.  */
       if (flag_ira_hoist_pressure && !pred)
         {
+          /* Record the basic block from which EXPR is hoisted.  */
+          SET_BIT (visited, bb->index);
           EXECUTE_IF_SET_IN_SBITMAP (visited, 0, i, sbi)
-            if (!bitmap_bit_p (hoisted_bbs, i))
-              {
-                bitmap_set_bit (hoisted_bbs, i);
-                BB_DATA (BASIC_BLOCK (i))->max_reg_pressure[pressure_class]
-                    += *nregs;
-              }
+            bitmap_set_bit (hoisted_bbs, i);
         }
       sbitmap_free (visited);
     }
@@ -2990,23 +3085,28 @@ find_occr_in_bb (struct occr *occr, basic_block bb
    from rtx cost of the corresponding expression and it's used to control
    how long the expression can be hoisted up in flow graph.  As the
    expression is hoisted up in flow graph, GCC decreases its DISTANCE
-   and stops the hoist if DISTANCE reaches 0.
+   and stops the hoist if DISTANCE reaches 0.  Code hoisting can decrease
+   register pressure if live ranges of inputs are shrunk.
 
    Option "-fira-hoist-pressure" implements register pressure directed
    hoist based on upper method.  The rationale is:
      1. Calculate register pressure for each basic block by reusing IRA
         facility.
      2. When expression is hoisted through one basic block, GCC checks
-        register pressure of the basic block and decrease DISTANCE only
-        when the register pressure is high.  In other words, expression
-        will be hoisted through basic block with low register pressure
-        at no cost.
-     3. Update register pressure information for basic blocks through
-        which expression is hoisted.
-        TODO: It is possible to have register pressure decreased because
-        of shrinked live ranges of input pseudo registers when hoisting
-        an expression.  For now, this effect is not simulated and we just
-        increase register pressure for hoisted expressions.  */
+        the change of live ranges for inputs/output.  The basic block's
+        register pressure will be increased because of extended live
+        range of output.  However, register pressure will be decreased
+        if the live ranges of inputs are shrunk.
+     3. After knowing how hoisting affects register pressure, GCC prefers
+        to hoist the expression if it can decrease register pressure, by
+        increasing DISTANCE of the corresponding expression.
+     4. If hoisting the expression increases register pressure, GCC checks
+        register pressure of the basic block and decrease DISTANCE only if
+        the register pressure is high.  In other words, expression will be
+        hoisted through at no cost if the basic block has low register
+        pressure.
+     5. Update register pressure information for basic blocks through
+        which expression is hoisted.  */
 
 static int
 hoist_code (void)
@@ -3163,7 +3263,7 @@ hoist_code (void)
                   if (should_hoist_expr_to_dom (bb, expr, dominated, NULL,
                                                 max_distance, bb_size,
                                                 pressure_class, &nregs,
-                                                hoisted_bbs))
+                                                hoisted_bbs, occr->insn))
                     {
                       hoistable++;
                       VEC_safe_push (occr_t, heap,
@@ -3207,19 +3307,28 @@ hoist_code (void)
               if (flag_ira_hoist_pressure
                   && !VEC_empty (occr_t, occrs_to_hoist))
                 {
-                  /* Update register pressure for basic block to which expr
-                     is hoisted.  */
+                  /* Increase register pressure of basic blocks to which
+                     expr is hoisted because of extended live range of
+                     output.  */
                   data = BB_DATA (bb);
                   data->max_reg_pressure[pressure_class] += nregs;
+                  EXECUTE_IF_SET_IN_BITMAP (hoisted_bbs, 0, k, bi)
+                    {
+                      data = BB_DATA (BASIC_BLOCK (k));
+                      data->max_reg_pressure[pressure_class] += nregs;
+                    }
                 }
               else if (flag_ira_hoist_pressure)
                 {
-                  /* Restore register pressure of basic block recorded in
-                     hoisted_bbs when expr will not be hoisted.  */
+                  /* Restore register pressure and live_in info for basic
+                     blocks recorded in hoisted_bbs when expr will not be
+                     hoisted.  */
                   EXECUTE_IF_SET_IN_BITMAP (hoisted_bbs, 0, k, bi)
                     {
                       data = BB_DATA (BASIC_BLOCK (k));
-                      data->max_reg_pressure[pressure_class] -= nregs;
+                      bitmap_copy (data->live_in, data->backup);
+                      data->max_reg_pressure[pressure_class]
+                          = data->old_pressure;
                     }
                 }
 
@@ -3382,7 +3491,10 @@ calculate_bb_reg_pressure (void)
   FOR_EACH_BB (bb)
     {
       curr_bb = bb;
-      bitmap_copy (curr_regs_live, DF_LR_OUT (bb));
+      BB_DATA (bb)->live_in = BITMAP_ALLOC (NULL);
+      BB_DATA (bb)->backup = BITMAP_ALLOC (NULL);
+      bitmap_copy (BB_DATA (bb)->live_in, df_get_live_in (bb));
+      bitmap_copy (curr_regs_live, df_get_live_out (bb));
       for (i = 0; i < ira_pressure_classes_num; i++)
         curr_reg_pressure[ira_pressure_classes[i]] = 0;
       EXECUTE_IF_SET_IN_BITMAP (curr_regs_live, 0, j, bi)
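
For reviewers, the per-block DISTANCE adjustment that the patched should_hoist_expr_to_dom applies when -fira-hoist-pressure is on can be restated as a small standalone C function. This is only a minimal sketch and not code from gcse.c: struct block_info, adjust_distance and the plain int/bool parameters are hypothetical stand-ins for BB_DATA (bb)->max_reg_pressure[], bb_size[], ira_class_hard_regs_num[] and CONST_INT_P (expr->expr).

```c
#include <stdbool.h>

/* Hypothetical summary of one basic block; the field names are
   illustration-only stand-ins for the GCC data named in the lead-in.  */
struct block_info
{
  int size;          /* stands in for bb_size[bb->index] */
  int max_pressure;  /* stands in for max_reg_pressure[pressure_class] */
  int hard_regs;     /* stands in for ira_class_hard_regs_num[pressure_class] */
};

/* Return the DISTANCE left after hoisting through block B, mirroring
   the flag_ira_hoist_pressure branch of the patch.
   DECREASED_PRESSURE is what update_bb_reg_pressure would return and
   NREGS is the number of registers the hoisted expression needs.  */
static int
adjust_distance (int distance, const struct block_info *b,
                 int decreased_pressure, int nregs, bool is_const_int)
{
  /* Hoisting frees more registers than it consumes: reward it.  */
  if (decreased_pressure > nregs)
    return distance + b->size;

  /* Constants always pay; other expressions pay only when the block is
     under high pressure and hoisting frees fewer registers than NREGS.  */
  if (is_const_int
      || (b->max_pressure >= b->hard_regs && decreased_pressure < nregs))
    return distance - b->size;

  /* Low pressure, or pressure not increased: pass through at no cost.  */
  return distance;
}
```

With a 10-insn block and a starting DISTANCE of 20, a low-pressure block leaves DISTANCE at 20, a high-pressure block with no freed registers reduces it to 10, and a block where hoisting frees two registers for an expression needing one raises it to 30. Note that the case decreased_pressure == nregs in a high-pressure block falls through at no cost, which matches condition 2 in the patch comment.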