From: Revital Eres
To: Ayal Zaks
Cc: richard sandiford, gcc-patches@gcc.gnu.org, Patch Tracking
Date: Mon, 21 Nov 2011 07:07:46 +0200
Subject: [PATCH SMS 2/2, RFC] Register pressure estimation for the partial schedule
X-Patchwork-Submitter: Revital Eres
X-Patchwork-Id: 126684

Hello,

The attached patch adds register pressure estimation for the partial
schedule.

Bootstrapped and tested together with the other patches in the series on
ppc64-redhat-linux, enabling SMS on loops with SC 1.

Comments are welcome.

Thanks,
Revital

ChangeLog:

* loop-invariant.c (get_regno_pressure_class): Move function to...
* ira.c (get_regno_pressure_class): Here.
* common.opt (fmodulo-sched-reg-pressure): New flag.
* doc/invoke.texi (fmodulo-sched-reg-pressure): Document it.
* ira.h (get_regno_pressure_class): Declare.
* rtl.h (set_reg_allocno_class): Declare.
* reginfo.c (set_reg_allocno_class): New function.
* Makefile.in (modulo-sched.o): Include ira.h.
* modulo-sched.c (ira.h): New include.
(rtl_insn_ps, undo_reg_moves, mark_def_regs, mark_reg_use,
mark_reg_use_1, insn_exists_in_epilog_p, calc_lr_out_regs,
change_pressure, update_reg_moves_pressure_info,
initiate_reg_pressure_info, mark_regno_live, mark_reg_birth_1,
mark_reg_birth, mark_regno_death, mark_ref_regs,
calc_insn_reg_pressure_info, calc_reg_pressure, free_loop_data,
free_reg_pressure_info, ps_reg_pressure_p): New functions.
(apply_reg_moves): Add parameter.
(curr_regs_live, curr_reg_pressure, curr_loop): New data-structures.
(loop_data): New struct.
(LOOP_DATA): New definition.
(sms_schedule): Use register pressure estimation.

Index: doc/invoke.texi =================================================================== --- doc/invoke.texi (revision 181149) +++ doc/invoke.texi (working copy) @@ -373,6 +373,7 @@ Objective-C and Objective-C++ Dialects}. -floop-parallelize-all -flto -flto-compression-level @gol -flto-partition=@var{alg} -flto-report -fmerge-all-constants @gol -fmerge-constants -fmodulo-sched -fmodulo-sched-allow-regmoves @gol +-fmodulo-sched-reg-pressure @gol -fmove-loop-invariants fmudflap -fmudflapir -fmudflapth -fno-branch-count-reg @gol -fno-default-inline @gol -fno-defer-pop -fno-function-cse -fno-guess-branch-probability @gol @@ -6457,6 +6458,11 @@ deleted which will trigger the generatio life-range analysis. This option is effective only with @option{-fmodulo-sched} enabled. 
+@item -fmodulo-sched-reg-pressure +@opindex fmodulo-sched-reg-pressure +Perform SMS based modulo scheduling with register pressure estimation. +This option is effective only with @option{-fmodulo-sched} enabled. + @item -fno-branch-count-reg @opindex fno-branch-count-reg Do not use ``decrement and branch'' instructions on a count register, Index: loop-invariant.c =================================================================== --- loop-invariant.c (revision 181149) +++ loop-invariant.c (working copy) @@ -1619,34 +1619,6 @@ static rtx regs_set[(FIRST_PSEUDO_REGIST /* Number of regs stored in the previous array. */ static int n_regs_set; -/* Return pressure class and number of needed hard registers (through - *NREGS) of register REGNO. */ -static enum reg_class -get_regno_pressure_class (int regno, int *nregs) -{ - if (regno >= FIRST_PSEUDO_REGISTER) - { - enum reg_class pressure_class; - - pressure_class = reg_allocno_class (regno); - pressure_class = ira_pressure_class_translate[pressure_class]; - *nregs - = ira_reg_class_max_nregs[pressure_class][PSEUDO_REGNO_MODE (regno)]; - return pressure_class; - } - else if (! TEST_HARD_REG_BIT (ira_no_alloc_regs, regno) - && ! TEST_HARD_REG_BIT (eliminable_regset, regno)) - { - *nregs = 1; - return ira_pressure_class_translate[REGNO_REG_CLASS (regno)]; - } - else - { - *nregs = 0; - return NO_REGS; - } -} - /* Increase (if INCR_P) or decrease current register pressure for register REGNO. */ static void Index: common.opt =================================================================== --- common.opt (revision 181149) +++ common.opt (working copy) @@ -1457,6 +1457,10 @@ fmodulo-sched-allow-regmoves Common Report Var(flag_modulo_sched_allow_regmoves) Perform SMS based modulo scheduling with register moves allowed +fmodulo-sched-reg-pressure +Common Report Var(flag_modulo_sched_reg_pressure) +Perform SMS based modulo scheduling with register pressure estimation. 
+ fmove-loop-invariants Common Report Var(flag_move_loop_invariants) Init(1) Optimization Move loop invariant computations out of loops Index: ira.c =================================================================== --- ira.c (revision 181149) +++ ira.c (working copy) @@ -3784,6 +3784,34 @@ ira (FILE *f) timevar_pop (TV_IRA); } +/* Return pressure class and number of needed hard registers (through + *NREGS) of register REGNO. */ +enum reg_class +get_regno_pressure_class (int regno, int *nregs) +{ + if (regno >= FIRST_PSEUDO_REGISTER) + { + enum reg_class pressure_class; + + pressure_class = reg_allocno_class (regno); + pressure_class = ira_pressure_class_translate[pressure_class]; + *nregs + = ira_reg_class_max_nregs[pressure_class][PSEUDO_REGNO_MODE (regno)]; + return pressure_class; + } + else if (!TEST_HARD_REG_BIT (ira_no_alloc_regs, regno) + && !TEST_HARD_REG_BIT (eliminable_regset, regno)) + { + *nregs = 1; + return ira_pressure_class_translate[REGNO_REG_CLASS (regno)]; + } + else + { + *nregs = 0; + return NO_REGS; + } +} + static bool Index: ira.h =================================================================== --- ira.h (revision 181149) +++ ira.h (working copy) @@ -145,3 +145,4 @@ extern bool ira_better_spill_reload_regn extern bool ira_bad_reload_regno (int, rtx, rtx); extern void ira_adjust_equiv_reg_cost (unsigned, int); +enum reg_class get_regno_pressure_class (int, int *); Index: rtl.h =================================================================== --- rtl.h (revision 181149) +++ rtl.h (working copy) @@ -2074,6 +2074,7 @@ extern const char *decode_asm_operands ( extern enum reg_class reg_preferred_class (int); extern enum reg_class reg_alternate_class (int); extern enum reg_class reg_allocno_class (int); +extern void set_reg_allocno_class (int, enum reg_class); extern void setup_reg_classes (int, enum reg_class, enum reg_class, enum reg_class); Index: reginfo.c =================================================================== --- reginfo.c 
(revision 181149) +++ reginfo.c (working copy) @@ -953,6 +953,16 @@ reg_allocno_class (int regno) return (enum reg_class) reg_pref[regno].allocnoclass; } +/* Set the register REG with reg_class ALLOCNOCLASS. */ +void +set_reg_allocno_class (int regno, enum reg_class allocnoclass) +{ + if (reg_pref == 0) + return; + + reg_pref[regno].allocnoclass = allocnoclass; +} + /* Allocate space for reg info. */ Index: Makefile.in =================================================================== --- Makefile.in (revision 181149) +++ Makefile.in (working copy) @@ -3311,7 +3311,7 @@ modulo-sched.o : modulo-sched.c $(DDG_H) $(FLAGS_H) insn-config.h $(INSN_ATTR_H) $(EXCEPT_H) $(RECOG_H) \ $(SCHED_INT_H) $(CFGLAYOUT_H) $(CFGLOOP_H) $(EXPR_H) $(PARAMS_H) \ cfghooks.h $(GCOV_IO_H) hard-reg-set.h $(TM_H) $(TIMEVAR_H) $(TREE_PASS_H) \ - $(DF_H) $(DBGCNT_H) + $(DF_H) $(DBGCNT_H) ira.h haifa-sched.o : haifa-sched.c $(CONFIG_H) $(SYSTEM_H) coretypes.h $(TM_H) $(RTL_H) \ $(SCHED_INT_H) $(REGS_H) hard-reg-set.h $(FLAGS_H) insn-config.h $(FUNCTION_H) \ $(INSN_ATTR_H) $(DIAGNOSTIC_CORE_H) $(RECOG_H) $(EXCEPT_H) $(TM_P_H) $(TARGET_H) output.h \ --- modulo-sched2.c 2011-11-20 08:30:12.000000000 +0100 +++ modulo-sched.c 2011-11-20 09:14:48.000000000 +0100 @@ -48,6 +48,7 @@ along with GCC; see the file COPYING3. #include "tree-pass.h" #include "dbgcnt.h" #include "df.h" +#include "ira.h" #ifdef INSN_SCHEDULING @@ -326,6 +327,22 @@ ps_rtl_insn (partial_schedule_ptr ps, in return ps_reg_move (ps, id)->insn; } +/* Return the ID of the partial schedule instruction in PS which belongs + to INSN. */ +static int +rtl_insn_ps (partial_schedule_ptr ps, rtx insn) +{ + int row; + ps_insn_ptr crr_insn; + + for (row = 0; row < ps->ii; row++) + for (crr_insn = ps->rows[row]; crr_insn; crr_insn = crr_insn->next_in_row) + if (insn == ps_rtl_insn (ps, crr_insn->id)) + return crr_insn->id; + + return -1; +} + /* Partial schedule instruction ID, which belongs to PS, occured in the original (unscheduled) loop. 
Return the first instruction in the loop that was associated with ps_rtl_insn (PS, ID). @@ -823,10 +840,11 @@ schedule_reg_moves (partial_schedule_ptr return true; } -/* Emit the moves associatied with PS. Apply the substitutions - associated with them. */ +/* Emit the moves associated with PS. Apply the substitutions + associated with them. If DF_RESCAN_P is true, update the df information. + */ static void -apply_reg_moves (partial_schedule_ptr ps) +apply_reg_moves (partial_schedule_ptr ps, bool df_rescan_p) { ps_reg_move_info *move; int i; @@ -839,11 +857,29 @@ apply_reg_moves (partial_schedule_ptr ps EXECUTE_IF_SET_IN_SBITMAP (move->uses, 0, i_use, sbi) { replace_rtx (ps->g->nodes[i_use].insn, move->old_reg, move->new_reg); - df_insn_rescan (ps->g->nodes[i_use].insn); + if (df_rescan_p) + df_insn_rescan (ps->g->nodes[i_use].insn); } } } +/* Undo the moves associated with PS. */ +static void +undo_reg_moves (partial_schedule_ptr ps) +{ + ps_reg_move_info *move; + int i; + + FOR_EACH_VEC_ELT (ps_reg_move_info, ps->reg_moves, i, move) + { + unsigned int i_use; + sbitmap_iterator sbi; + + EXECUTE_IF_SET_IN_SBITMAP (move->uses, 0, i_use, sbi) + replace_rtx (ps->g->nodes[i_use].insn, move->new_reg, move->old_reg); + } +} + /* Bump the SCHED_TIMEs of all nodes by AMOUNT. Set the values of SCHED_ROW and SCHED_STAGE. Instruction scheduled on cycle AMOUNT will move to cycle zero. */ @@ -1334,6 +1370,586 @@ setup_sched_infos (void) current_sched_info = &sms_sched_info; } +/* Registers currently live. */ +static bitmap_head curr_regs_live; + +/* Current reg pressure for each pressure class. */ +static int curr_reg_pressure[N_REG_CLASSES]; + +/* The data stored for the loop. */ + +struct loop_data +{ + /* Maximal register pressure inside loop for given register class + (defined only for the pressure classes). */ + int max_reg_pressure[N_REG_CLASSES]; + /* Loop regs referenced and live pseudo-registers. 
*/ + bitmap_head regs_ref; + bitmap_head regs_live; +}; + +#define LOOP_DATA(LOOP) ((struct loop_data *) (LOOP)->aux) + +/* Currently processed loop. */ +static struct loop *curr_loop; + +/* Auxiliary function for calc_lr_out_regs. */ +static void +mark_def_regs (rtx reg, const_rtx setter ATTRIBUTE_UNUSED, void *data) +{ + bitmap_head *def_regs = (bitmap_head *) data; + + if (GET_CODE (reg) == SUBREG) + reg = SUBREG_REG (reg); + + if (!REG_P (reg)) + return; + + bitmap_set_bit (def_regs, REGNO (reg)); + return; +} + +/* Auxiliary function for mark_reg_use_1. */ +static int +mark_reg_use (rtx * x, void *data) +{ + bitmap_head *reg_used = (bitmap_head *) data; + + if (REG_P (*x)) + bitmap_set_bit (reg_used, REGNO (*x)); + + return 0; +} + +/* Auxiliary function for calc_lr_out_regs. */ +static void +mark_reg_use_1 (rtx * x, void *data) +{ + for_each_rtx (x, mark_reg_use, data); +} + +/* Return TRUE if the instruction noted by ID will be emitted in the + epilog. Otherwise return FALSE. Use STAGE_COUNT in the calculation. + */ +static bool +insn_exists_in_epilog_p (partial_schedule_ptr ps, int id, int stage_count) +{ + int last_stage = stage_count - 1; + int first_u, last_u; + int i; + + first_u = SCHED_STAGE (id); + last_u = first_u + ps_num_consecutive_stages (ps, id) - 1; + + for (i = 0; i < last_stage; i++) + if ((i + 1) <= last_u && last_stage >= first_u) + return true; + + return false; +} + +/* Calculate the registers that live out of the basic-block and mark + them in LR_OUT_REGS bitmap. Use stage-count SC in the calculation. + Mark in SKIP_INSNS bitmap instructions that should not be considered + for the register pressure calculation. 
*/ +static void +calc_lr_out_regs (partial_schedule_ptr ps, + bitmap_head *lr_out_regs, bitmap_head *skip_insns, int sc) +{ + rtx insn; + bitmap_head insn_defs; + bitmap_head tmp_lr_out_regs; + basic_block bb = ps->g->bb; + unsigned int j; + bitmap_iterator bi; + int k; + unsigned rd_num; + struct df_rd_bb_info *rd_bb_info; + rtx link; + + bitmap_initialize (&tmp_lr_out_regs, &reg_obstack); + bitmap_initialize (&insn_defs, &reg_obstack); + + /* Start with the set of registers in DF_LR_OUT. */ + bitmap_copy (lr_out_regs, DF_LR_OUT (bb)); + for (k = ps->ii - 1; k >= 0; k--) + { + ps_insn_ptr ps_i = ps->rows_reverse[k]; + + while (ps_i) + { + insn = ps_rtl_insn (ps, ps_i->id); + + if (!NONDEBUG_INSN_P (insn)) + continue; + + bitmap_clear (&tmp_lr_out_regs); + bitmap_clear (&insn_defs); + note_uses (&PATTERN (insn), mark_reg_use_1, &tmp_lr_out_regs); +#ifdef AUTO_INC_DEC + for (link = REG_NOTES (insn); link; link = XEXP (link, 1)) + if (REG_NOTE_KIND (link) == REG_INC) + mark_def_regs (XEXP (link, 0), NULL, &insn_defs); +#endif + note_stores (PATTERN (insn), mark_def_regs, &insn_defs); + /* Remove from the set of lr_out_regs registers any register + defined in the current instruction. */ + EXECUTE_IF_SET_IN_BITMAP (&insn_defs, 0, j, bi) + bitmap_clear_bit (lr_out_regs, j); + bitmap_ior_into (lr_out_regs, &tmp_lr_out_regs); + ps_i = ps_i->prev_in_row; + } + } + /* Add to the set of live-out regs all the registers defined in bb + which have uses outside of it (those registers were eliminated in + the above calculation). Eliminate from this set the definitions + that exist in the epilog and with no uses inside the basic-block + as these definitions will be eliminated from the bb and thus should + not be considered for estimating register pressure in the bb. 
*/ + rd_bb_info = DF_RD_BB_INFO (bb); + EXECUTE_IF_SET_IN_BITMAP (&rd_bb_info->gen, 0, rd_num, bi) + { + df_ref rd = DF_DEFS_GET (rd_num); + struct df_link *r_use; + int regno = DF_REF_REGNO (rd); + rtx def_insn = DF_REF_INSN (rd); + bool use_outside_of_bb = false; + int num_uses = 0; + + if (!bitmap_bit_p (DF_LR_OUT (bb), regno)) + continue; + + for (r_use = DF_REF_CHAIN (rd); r_use != NULL; r_use = r_use->next) + { + rtx use_insn = DF_REF_INSN (r_use->ref); + + if (BLOCK_FOR_INSN (use_insn) != bb) + { + use_outside_of_bb = true; + continue; + } + + num_uses++; + } + + if (use_outside_of_bb) + bitmap_set_bit (lr_out_regs, regno); + + if (num_uses == 0) + { + int id = rtl_insn_ps (ps, def_insn); + + gcc_assert (id >= 0); + + if (insn_exists_in_epilog_p (ps, id, sc)) + { + bitmap_set_bit (skip_insns, INSN_UID (def_insn)); + bitmap_clear_bit (lr_out_regs, regno); + } + } + } + + bitmap_clear (&insn_defs); + bitmap_clear (&tmp_lr_out_regs); +} + +/* Increase (if INCR_P) or decrease current register pressure for + register REGNO. */ +static void +change_pressure (int regno, bool incr_p) +{ + int nregs; + enum reg_class pressure_class; + + pressure_class = get_regno_pressure_class (regno, &nregs); + if (!incr_p) + curr_reg_pressure[pressure_class] -= nregs; + else + curr_reg_pressure[pressure_class] += nregs; +} + +/* Update the register class information for the register moves in PS. */ +static void +update_reg_moves_pressure_info (partial_schedule_ptr ps) +{ + ps_reg_move_info *move; + int i; + + if (resize_reg_info ()) + ira_set_pseudo_classes (dump_file); + + FOR_EACH_VEC_ELT (ps_reg_move_info, ps->reg_moves, i, move) + { + enum reg_class pressure_class; + int regno_new = REGNO (move->new_reg); + int regno_old = REGNO (move->old_reg); + + /* Update register class information for the register moves. 
*/ + pressure_class = reg_allocno_class (regno_old); + set_reg_allocno_class (regno_new, pressure_class); + pressure_class = reg_allocno_class (regno_new); + + pressure_class = ira_pressure_class_translate[pressure_class]; + ira_reg_class_max_nregs[pressure_class][PSEUDO_REGNO_MODE (regno_new)] + = + ira_reg_class_max_nregs[pressure_class][PSEUDO_REGNO_MODE (regno_old)]; + + setup_reg_classes (regno_new, reg_preferred_class (regno_old), + reg_alternate_class (regno_old), + reg_allocno_class (regno_old)); + } +} + +/* Initialize the data-structures needed for the register pressure + calculation. Mark in LR_OUT_REGS bitmap the live out registers + and in SKIP_INSNS the instructions that should not be considered in + the calculation. */ +static void +initiate_reg_pressure_info (partial_schedule_ptr ps, + bitmap_head *lr_out_regs, + bitmap_head *skip_insns, int sc) +{ + struct loop *loop; + loop_iterator li; + int i; + unsigned int j; + bitmap_iterator bi; + basic_block bb = ps->g->bb; + + /* Calculate the LR_LIVE_OUT set of registers. 
*/ + calc_lr_out_regs (ps, lr_out_regs, skip_insns, sc); + update_reg_moves_pressure_info (ps); + + FOR_EACH_LOOP (li, loop, 0) + if (loop->aux == NULL) + { + loop->aux = xcalloc (1, sizeof (struct loop_data)); + memset (LOOP_DATA (loop)->max_reg_pressure, INT_MIN, + sizeof (LOOP_DATA (loop)->max_reg_pressure)); + bitmap_initialize (&LOOP_DATA (loop)->regs_ref, &reg_obstack); + bitmap_initialize (&LOOP_DATA (loop)->regs_live, &reg_obstack); + } + + bitmap_initialize (&curr_regs_live, &reg_obstack); + curr_loop = bb->loop_father; + if (curr_loop != current_loops->tree_root) + for (loop = curr_loop; + loop->num != current_loops->tree_root->num; + loop = loop_outer (loop)) + bitmap_ior_into (&LOOP_DATA (loop)->regs_live, DF_LR_OUT (bb)); + + bitmap_ior_into (&LOOP_DATA (curr_loop)->regs_live, lr_out_regs); + bitmap_copy (&curr_regs_live, lr_out_regs); + for (i = 0; i < ira_pressure_classes_num; i++) + curr_reg_pressure[ira_pressure_classes[i]] = 0; + + EXECUTE_IF_SET_IN_BITMAP (&curr_regs_live, 0, j, bi) + change_pressure (j, true); +} + +/* Mark REGNO birth. */ +static void +mark_regno_live (int regno) +{ + struct loop *loop; + + for (loop = curr_loop; + loop != current_loops->tree_root; + loop = loop_outer (loop)) + bitmap_set_bit (&LOOP_DATA (loop)->regs_live, regno); + if (!bitmap_set_bit (&curr_regs_live, regno)) + return; + + change_pressure (regno, true); +} + +static int +mark_reg_birth_1 (rtx *x, void *data ATTRIBUTE_UNUSED) +{ + int regno; + rtx reg = *x; + + if (GET_CODE (reg) == SUBREG) + reg = SUBREG_REG (reg); + + if (!REG_P (reg)) + return 0; + + regno = REGNO (reg); + + if (regno >= FIRST_PSEUDO_REGISTER) + mark_regno_live (regno); + else + { + int last = regno + hard_regno_nregs[regno][GET_MODE (reg)]; + + while (regno < last) + { + mark_regno_live (regno); + regno++; + } + } + return 0; +} + +/* Mark any register in X as live. */ +static void +mark_reg_birth (rtx *x, void *data) +{ + for_each_rtx (x, mark_reg_birth_1, data); +} + +/* Mark REGNO death. 
*/ +static void +mark_regno_death (int regno, void *data ATTRIBUTE_UNUSED) +{ + if (!bitmap_clear_bit (&curr_regs_live, regno)) + return; + + change_pressure (regno, false); +} + +/* Mark register REG death. */ +static void +mark_reg_death (rtx reg, const_rtx setter ATTRIBUTE_UNUSED, + void *data ATTRIBUTE_UNUSED) +{ + int regno; + + if (GET_CODE (reg) == SUBREG) + reg = SUBREG_REG (reg); + + if (!REG_P (reg)) + return; + + regno = REGNO (reg); + + if (regno >= FIRST_PSEUDO_REGISTER) + mark_regno_death (regno, data); + else + { + int last = regno + hard_regno_nregs[regno][GET_MODE (reg)]; + + while (regno < last) + { + mark_regno_death (regno, data); + regno++; + } + } +} + +/* Mark occurrence of registers in X. TMP_CURR_REGS_LIVE + bitmap holds the set of live registers. TMP_REG_PRESSURE holds the + register pressure so far. */ +static void +mark_ref_regs (rtx x, int *tmp_reg_pressure, bitmap_head *tmp_curr_regs_live) +{ + RTX_CODE code; + int i; + const char *fmt; + int nregs; + enum reg_class pressure_class; + + if (!x) + return; + + code = GET_CODE (x); + if (code == REG) + { + struct loop *loop; + + for (loop = curr_loop; + loop != current_loops->tree_root; loop = loop_outer (loop)) + bitmap_set_bit (&LOOP_DATA (loop)->regs_ref, REGNO (x)); + + if (bitmap_set_bit (tmp_curr_regs_live, REGNO (x))) + { + pressure_class = get_regno_pressure_class (REGNO (x), &nregs); + tmp_reg_pressure[pressure_class] += nregs; + } + return; + } + fmt = GET_RTX_FORMAT (code); + for (i = GET_RTX_LENGTH (code) - 1; i >= 0; i--) + if (fmt[i] == 'e') + mark_ref_regs (XEXP (x, i), tmp_reg_pressure, tmp_curr_regs_live); + else if (fmt[i] == 'E') + { + int j; + + for (j = 0; j < XVECLEN (x, i); j++) + mark_ref_regs (XVECEXP (x, i, j), tmp_reg_pressure, + tmp_curr_regs_live); + } +} + +/* Update the register pressure for INSN. 
*/ +static void +calc_insn_reg_pressure_info (rtx insn) +{ + rtx link; + int i; + int tmp_reg_pressure[N_REG_CLASSES]; + bitmap_head tmp_curr_regs_live; + + bitmap_initialize (&tmp_curr_regs_live, &reg_obstack); + + /* Tmp_curr_regs_live and tmp_reg_pressure hold the register pressure + information seen so far including for the current instruction. + We are taking a conservative approach here in the sense that we do + not add the dead registers in the current instruction to the pool + of available registers just yet. */ + bitmap_copy (&tmp_curr_regs_live, &curr_regs_live); + memcpy (tmp_reg_pressure, curr_reg_pressure, N_REG_CLASSES * sizeof (int)); + mark_ref_regs (PATTERN (insn), tmp_reg_pressure, &tmp_curr_regs_live); + + /* Update the pressure after the instruction. */ + note_stores (PATTERN (insn), mark_reg_death, NULL); + +#ifdef AUTO_INC_DEC + for (link = REG_NOTES (insn); link; link = XEXP (link, 1)) + if (REG_NOTE_KIND (link) == REG_INC) + mark_reg_death (XEXP (link, 0), NULL, NULL); +#endif + note_uses (&PATTERN (insn), mark_reg_birth, NULL); + /* Update max pressure. */ + for (i = 0; (int) i < ira_pressure_classes_num; i++) + { + enum reg_class pressure_class; + + pressure_class = ira_pressure_classes[i]; + if (LOOP_DATA (curr_loop)->max_reg_pressure[pressure_class] < + tmp_reg_pressure[pressure_class]) + LOOP_DATA (curr_loop)->max_reg_pressure[pressure_class] = + tmp_reg_pressure[pressure_class]; + } + + bitmap_clear (&tmp_curr_regs_live); +} + +/* Calculate the register pressure in PS. SKIP_INSNS bitmap holds + the instructions that should be ignored during the calculation. 
*/ +static void +calc_reg_pressure (partial_schedule_ptr ps, + bitmap_head *skip_insns) +{ + int k; + + for (k = ps->ii - 1; k >= 0; k--) + { + ps_insn_ptr ps_i = ps->rows_reverse[k]; + + while (ps_i) + { + rtx insn = ps_rtl_insn (ps, ps_i->id); + + if (bitmap_bit_p (skip_insns, INSN_UID (insn))) + goto next; + + if (!NONDEBUG_INSN_P (insn)) + goto next; + + calc_insn_reg_pressure_info (insn); + next: + ps_i = ps_i->prev_in_row; + } + } +} + +/* Release the auxiliary data for LOOP. */ +static void +free_loop_data (struct loop *loop) +{ + struct loop_data *data = LOOP_DATA (loop); + if (!data) + return; + + bitmap_clear (&LOOP_DATA (loop)->regs_ref); + bitmap_clear (&LOOP_DATA (loop)->regs_live); + free (data); + loop->aux = NULL; +} + +/* Free the data-structures needed for the calculation. */ +static void +free_reg_pressure_info (void) +{ + loop_iterator li; + struct loop *loop; + + bitmap_clear (&curr_regs_live); + + FOR_EACH_LOOP (li, loop, 0) + free_loop_data (loop); +} + +/* Return TRUE if the estimated register pressure of PS exceeds the + number of hard registers of some pressure class; otherwise return FALSE. + LOOP is the original loop and SC is the stage count. */ +static bool +ps_reg_pressure_p (struct loop *loop, partial_schedule_ptr ps, int sc) +{ + bool pressure_p = false; + int i; + bitmap_head lr_out_regs; + bitmap_head skip_insns; + + bitmap_initialize (&lr_out_regs, &reg_obstack); + bitmap_initialize (&skip_insns, &reg_obstack); + apply_reg_moves (ps, 0); + initiate_reg_pressure_info (ps, &lr_out_regs, &skip_insns, sc); + calc_reg_pressure (ps, &skip_insns); + + if (dump_file) + { + struct loop *parent; + unsigned int j; + bitmap_iterator bi; + + parent = loop_outer (loop); + fprintf (dump_file, "\n Loop %d (parent %d, header bb%d, depth %d)\n", + loop->num, (parent == NULL ? -1 : parent->num), + loop->header->index, loop_depth (loop)); + fprintf (dump_file, "\n ref. 
regnos:"); + EXECUTE_IF_SET_IN_BITMAP (&LOOP_DATA (loop)->regs_ref, 0, j, bi) + fprintf (dump_file, " %d", j); + fprintf (dump_file, "\n live regnos:"); + EXECUTE_IF_SET_IN_BITMAP (&LOOP_DATA (loop)->regs_live, 0, j, bi) + fprintf (dump_file, " %d", j); + fprintf (dump_file, "\n Pressure:"); + } + + for (i = 0; i < ira_pressure_classes_num; i++) + { + enum reg_class pressure_class; + + pressure_class = ira_pressure_classes[i]; + + if (LOOP_DATA (loop)->max_reg_pressure[pressure_class] == 0) + continue; + + if (dump_file) + fprintf (dump_file, "%s=%d %d ", reg_class_names[pressure_class], + LOOP_DATA (loop)->max_reg_pressure[pressure_class], + ira_available_class_regs[pressure_class]); + + if (LOOP_DATA (loop)->max_reg_pressure[pressure_class] + > ira_class_hard_regs_num[pressure_class]) + { + if (dump_file) + fprintf (dump_file, "(pressure)\n"); + + pressure_p = true; + } + } + + bitmap_clear (&lr_out_regs); + bitmap_clear (&skip_insns); + free_reg_pressure_info (); + undo_reg_moves (ps); + return pressure_p; +} + /* Probability in % that the sms-ed loop rolls enough so that optimized version may be entered. Just a guess. */ #define PROB_SMS_ENOUGH_ITERATIONS 80 @@ -1366,6 +1982,13 @@ sms_schedule (void) return; /* There are no loops to schedule. */ } + if (flag_modulo_sched_reg_pressure) + { + regstat_init_n_sets_and_refs (); + ira_set_pseudo_classes (dump_file); + ira_setup_eliminable_regset (); + } + /* Initialize issue_rate. */ if (targetm.sched.issue_rate) { @@ -1681,7 +2304,9 @@ sms_schedule (void) set_columns_for_ps (ps); min_cycle = PS_MIN_CYCLE (ps) - SMODULO (PS_MIN_CYCLE (ps), ps->ii); - if (!schedule_reg_moves (ps)) + if (!schedule_reg_moves (ps) + || (flag_modulo_sched_reg_pressure + && ps_reg_pressure_p (loop, ps, stage_count))) { mii = ps->ii + 1; free_partial_schedule (ps); @@ -1742,7 +2367,7 @@ sms_schedule (void) /* The life-info is not valid any more. 
*/ df_set_bb_dirty (g->bb); - apply_reg_moves (ps); + apply_reg_moves (ps, 1); if (dump_file) print_node_sched_params (dump_file, g->num_nodes, ps); /* Generate prolog and epilog. */ @@ -1757,6 +2382,11 @@ sms_schedule (void) } free (g_arr); + if (flag_modulo_sched_reg_pressure) + { + regstat_free_n_sets_and_refs (); + free_reg_info (); + } /* Release scheduler data, needed until now because of DFA. */ haifa_sched_finish ();
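
For readers following the patch, the backward scan performed by calc_reg_pressure and calc_insn_reg_pressure_info can be sketched in isolation. The code below is not taken from the patch. It is a minimal sketch that assumes a single pressure class, at most 32 registers that each occupy one hard register, and instructions reduced to def/use bit masks. The names toy_insn, popcount32, toy_max_pressure and toy_pressure_p are hypothetical and exist only for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical toy instruction: bit I of DEFS/USES stands for register I.  */
struct toy_insn
{
  uint32_t defs;
  uint32_t uses;
};

/* Count the set bits in X.  */
static int
popcount32 (uint32_t x)
{
  int n = 0;

  for (; x != 0; x &= x - 1)
    n++;
  return n;
}

/* Return the maximal number of simultaneously live registers seen while
   scanning the N instructions in INSNS from last to first, starting
   from the LIVE_OUT set.  */
static int
toy_max_pressure (const struct toy_insn *insns, size_t n, uint32_t live_out)
{
  uint32_t live = live_out;
  int max_pressure = popcount32 (live);
  size_t i;

  assert (insns != NULL || n == 0);

  for (i = n; i-- > 0;)
    {
      /* Conservative count at the instruction: every register it
         references is counted, including the ones it defines.  */
      uint32_t at_insn = live | insns[i].defs | insns[i].uses;
      int pressure = popcount32 (at_insn);

      if (pressure > max_pressure)
        max_pressure = pressure;

      /* Moving above the instruction: its definitions die and its
         uses become live.  */
      live = (live & ~insns[i].defs) | insns[i].uses;
    }
  return max_pressure;
}

/* Return nonzero if the estimate exceeds AVAILABLE hard registers.  */
static int
toy_pressure_p (const struct toy_insn *insns, size_t n, uint32_t live_out,
                int available)
{
  return toy_max_pressure (insns, n, live_out) > available;
}
```

For the three instructions r0 = ..., r1 = r0, r2 = r0 + r1 with r2 live out, the sketch reports a maximum of 3, so a budget of two registers is flagged. In the patch, a positive answer from ps_reg_pressure_p makes sms_schedule free the partial schedule and retry with mii = ps->ii + 1.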