From patchwork Wed Jul 14 11:28:12 2010 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit Subject: Patch: Revert find_reloads part of the IRA merge Date: Wed, 14 Jul 2010 01:28:12 -0000 From: Bernd Schmidt X-Patchwork-Id: 58869 Message-Id: <4C3D9F4C.8080304@codesourcery.com> To: GCC Patches Cc: "Vladimir N. Makarov" , Richard Earnshaw I've now reverted the find_reloads change. Attached is a patch which tries to solve the problem in a different way; I fixed some other problems that showed up in testing on ARM. Specifically: * Immediately marking used_spill_regs as ever_live can cause a situation where the next iteration will use a different set of hard regs, and we end up pushing registers that are never used. If we delay settings regs_ever_live until nothing else has changed, we can avoid that, at the cost of a little bit of complexity, and the risk of requiring more iterations. Not sure this is worth it. * When evaluating costs of alternatives, reload isn't taking into account whether something is going to need secondary reloads. * Use live_throughout sets to determine if a reload will force a spill. This seems to work better than the previous heuristic, but it can still fail (dead_or_set isn't taken into account since it's quite a useless set, the test isn't run on address reloads, etc.) Not committed. It would be nice to get reports from some target maintainers about what this does to code quality. Bernd Index: gcc/reload1.c =================================================================== --- gcc.orig/reload1.c +++ gcc/reload1.c @@ -197,7 +197,7 @@ static HARD_REG_SET bad_spill_regs; insn. This includes registers used for user variables and registers that we can't eliminate. A register that appears in this set also can't be used to retry register allocation. */ -static HARD_REG_SET bad_spill_regs_global; +HARD_REG_SET bad_spill_regs_global; /* Describes order of use of registers for reloading of spilled pseudo-registers. `n_spills' is the number of @@ -227,6 +227,10 @@ static HARD_REG_SET *pseudo_forbidden_re marked in this set. */ static HARD_REG_SET used_spill_regs; +/* Keeps track of hard registers used for spills in the current iteration. + This set will be merged into used_spill_regs in finish_spills. */ +static HARD_REG_SET current_used_spill_regs; + /* Index of last register assigned as a spill register. We allocate in a round-robin fashion. */ static int last_spill_reg; @@ -823,6 +827,7 @@ reload (rtx first, int global) /* Spill any hard regs that we know we can't eliminate. */ CLEAR_HARD_REG_SET (used_spill_regs); + CLEAR_HARD_REG_SET (current_used_spill_regs); /* There can be multiple ways to eliminate a register; they should be listed adjacently. Elimination for any register fails only if all possible ways fail. */ @@ -1511,7 +1516,7 @@ calculate_needs_all_insns (int global) did_elimination = eliminate_regs_in_insn (insn, 0); /* Analyze the instruction. */ - operands_changed = find_reloads (insn, 0, spill_indirect_levels, + operands_changed = find_reloads (chain, 0, spill_indirect_levels, global, spill_reg_order); /* If a no-op set needs more than one reload, this is likely @@ -2042,7 +2047,7 @@ find_reload_regs (struct insn_chain *cha } COPY_HARD_REG_SET (chain->used_spill_regs, used_spill_regs_local); - IOR_HARD_REG_SET (used_spill_regs, used_spill_regs_local); + IOR_HARD_REG_SET (current_used_spill_regs, used_spill_regs_local); memcpy (chain->rld, rld, n_reloads * sizeof (struct reload)); } @@ -2052,6 +2057,7 @@ select_reload_regs (void) { struct insn_chain *chain; + CLEAR_HARD_REG_SET (current_used_spill_regs); /* Try to satisfy the needs for each insn. */ for (chain = insns_need_reload; chain != 0; chain = chain->next_need_reload) @@ -4279,34 +4285,10 @@ finish_spills (int global) { struct insn_chain *chain; int something_changed = 0; + int new_regs_used; unsigned i; reg_set_iterator rsi; - /* Build the spill_regs array for the function. */ - /* If there are some registers still to eliminate and one of the spill regs - wasn't ever used before, additional stack space may have to be - allocated to store this register. Thus, we may have changed the offset - between the stack and frame pointers, so mark that something has changed. - - One might think that we need only set VAL to 1 if this is a call-used - register. However, the set of registers that must be saved by the - prologue is not identical to the call-used set. For example, the - register used by the call insn for the return PC is a call-used register, - but must be saved by the prologue. */ - - n_spills = 0; - for (i = 0; i < FIRST_PSEUDO_REGISTER; i++) - if (TEST_HARD_REG_BIT (used_spill_regs, i)) - { - spill_reg_order[i] = n_spills; - spill_regs[n_spills++] = i; - if (num_eliminable && ! df_regs_ever_live_p (i)) - something_changed = 1; - df_set_regs_ever_live (i, true); - } - else - spill_reg_order[i] = -1; - EXECUTE_IF_SET_IN_REG_SET (&spilled_pseudos, FIRST_PSEUDO_REGISTER, i, rsi) if (! ira_conflicts_p || reg_renumber[i] >= 0) { @@ -4388,6 +4370,8 @@ finish_spills (int global) makes inheritance work somewhat better. */ if (chain->need_reload) { + HARD_REG_SET all_used_spill_regs; + REG_SET_TO_HARD_REG_SET (used_by_pseudos, &chain->live_throughout); REG_SET_TO_HARD_REG_SET (used_by_pseudos2, &chain->dead_or_set); IOR_HARD_REG_SET (used_by_pseudos, used_by_pseudos2); @@ -4398,8 +4382,10 @@ finish_spills (int global) may be not included in the value calculated here because of possible removing caller-saves insns (see function delete_caller_save_insns. */ + COPY_HARD_REG_SET (all_used_spill_regs, used_spill_regs); + IOR_HARD_REG_SET (all_used_spill_regs, current_used_spill_regs); COMPL_HARD_REG_SET (chain->used_spill_regs, used_by_pseudos); - AND_HARD_REG_SET (chain->used_spill_regs, used_spill_regs); + AND_HARD_REG_SET (chain->used_spill_regs, all_used_spill_regs); } } @@ -4425,6 +4411,43 @@ finish_spills (int global) } } + /* Build the spill_regs array for the function. */ + /* If there are some registers still to eliminate and one of the spill regs + wasn't ever used before, additional stack space may have to be + allocated to store this register. Thus, we may have changed the offset + between the stack and frame pointers, so mark that something has changed. + However, only do this if nothing else has changed - if we spilled a pseudo, + we may not end up using the same spill reg in the next iteration. + + One might think that we need only set VAL to 1 if this is a call-used + register. However, the set of registers that must be saved by the + prologue is not identical to the call-used set. For example, the + register used by the call insn for the return PC is a call-used register, + but must be saved by the prologue. */ + + n_spills = 0; + new_regs_used = 0; + for (i = 0; i < FIRST_PSEUDO_REGISTER; i++) + { + if (!something_changed && TEST_HARD_REG_BIT (current_used_spill_regs, i)) + { + if (num_eliminable && ! df_regs_ever_live_p (i)) + new_regs_used++; + df_set_regs_ever_live (i, true); + SET_HARD_REG_BIT (used_spill_regs, i); + } + + if (TEST_HARD_REG_BIT (used_spill_regs, i)) + { + spill_reg_order[i] = n_spills; + spill_regs[n_spills++] = i; + } + else + spill_reg_order[i] = -1; + } + if (new_regs_used) + something_changed = 1; + return something_changed; } @@ -4587,7 +4610,7 @@ reload_as_needed (int live_known) CLEAR_REG_SET (®_has_output_reload); CLEAR_HARD_REG_SET (reg_is_output_reload); - find_reloads (insn, 1, spill_indirect_levels, live_known, + find_reloads (chain, 1, spill_indirect_levels, live_known, spill_reg_order); } Index: gcc/reload.c =================================================================== --- gcc.orig/reload.c +++ gcc/reload.c @@ -2533,8 +2533,8 @@ safe_from_earlyclobber (rtx op, rtx clob return immune_p (op, clobber, early_data); } -/* Main entry point of this file: search the body of INSN - for values that need reloading and record them with push_reload. +/* Main entry point of this file: search the body of the insn found in + CHAIN for values that need reloading and record them with push_reload. REPLACE nonzero means record also where the values occur so that subst_reloads can be used. @@ -2556,9 +2556,10 @@ safe_from_earlyclobber (rtx op, rtx clob commutative operands, reg_equiv_address substitution, or whatever. */ int -find_reloads (rtx insn, int replace, int ind_levels, int live_known, - short *reload_reg_p) +find_reloads (struct insn_chain *chain, int replace, int ind_levels, + int live_known, short *reload_reg_p) { + rtx insn = chain->insn; int insn_code_number; int i, j; int noperands; @@ -2601,7 +2602,7 @@ find_reloads (rtx insn, int replace, int char goal_alternative_offmemok[MAX_RECOG_OPERANDS]; char goal_alternative_earlyclobber[MAX_RECOG_OPERANDS]; int goal_alternative_swapped; - int best; + int best, best_forced_reloads; int commutative; char operands_match[MAX_RECOG_OPERANDS][MAX_RECOG_OPERANDS]; rtx substed_operand[MAX_RECOG_OPERANDS]; @@ -2808,7 +2809,7 @@ find_reloads (rtx insn, int replace, int || GET_CODE (recog_data.operand[i]) == PLUS)) { INSN_CODE (insn) = -1; - retval = find_reloads (insn, replace, ind_levels, live_known, + retval = find_reloads (chain, replace, ind_levels, live_known, reload_reg_p); return retval; } @@ -2927,6 +2928,7 @@ find_reloads (rtx insn, int replace, int all the operands together against the register constraints. */ best = MAX_RECOG_OPERANDS * 2 + 600; + best_forced_reloads = 0; swapped = 0; goal_alternative_swapped = 0; @@ -3484,6 +3486,11 @@ find_reloads (rtx insn, int replace, int this_alternative_offmemok[i] = offmemok; losers++; + if (secondary_reload_class (modified[i] != RELOAD_WRITE, + this_alternative[i], + operand_mode[i], operand) != NO_REGS) + reject += 2; + if (badop) bad = 1; /* Alternative loses if it has no regs for a reg operand. */ @@ -3713,7 +3720,33 @@ find_reloads (rtx insn, int replace, int record it as the chosen goal for reloading. */ if (! bad) { - if (best > losers) + int forced_reloads = 0; + bool change_p = best > losers; + if (best >= losers) + { + for (i = 0; i < noperands; i++) + { + HARD_REG_SET tmp, used_by_pseudos; + enum reg_class alt_class = this_alternative[i]; + if (this_alternative_win[i] + || this_alternative_match_win[i] + || this_alternative_matches[i] >= 0 + || alt_class == NO_REGS) + continue; + COPY_HARD_REG_SET (tmp, reg_class_contents[alt_class]); + AND_COMPL_HARD_REG_SET (tmp, bad_spill_regs_global); + REG_SET_TO_HARD_REG_SET (used_by_pseudos, + &chain->live_throughout); + compute_use_by_pseudos (&used_by_pseudos, + &chain->live_throughout); + AND_COMPL_HARD_REG_SET (tmp, used_by_pseudos); + if (hard_reg_set_empty_p (tmp)) + forced_reloads++; + } + if (best == losers && forced_reloads < best_forced_reloads) + change_p = true; + } + if (change_p) { for (i = 0; i < noperands; i++) { @@ -3729,6 +3762,7 @@ find_reloads (rtx insn, int replace, int } goal_alternative_swapped = swapped; best = losers; + best_forced_reloads = forced_reloads; goal_alternative_number = this_alternative_number; goal_earlyclobber = this_earlyclobber; } Index: gcc/reload.h =================================================================== --- gcc.orig/reload.h +++ gcc/reload.h @@ -241,9 +241,20 @@ extern struct insn_chain *reload_insn_ch /* Allocate a new insn_chain structure. */ extern struct insn_chain *new_insn_chain (void); + +/* Search the body of the insn in CHAIN for values that need reloading + and record them with push_reload. REPLACE nonzero means record + also where the values occur so that subst_reloads can be used. */ +extern int find_reloads (struct insn_chain *, int, int, int, short *); #endif #if defined SET_HARD_REG_BIT +/* These are the hard registers that can't be used as spill register for any + insn. This includes registers used for user variables and registers that + we can't eliminate. A register that appears in this set also can't be used + to retry register allocation. */ +extern HARD_REG_SET bad_spill_regs_global; + extern void compute_use_by_pseudos (HARD_REG_SET *, bitmap); #endif @@ -282,11 +293,6 @@ extern int operands_match_p (rtx, rtx); /* Return 1 if altering OP will not modify the value of CLOBBER. */ extern int safe_from_earlyclobber (rtx, rtx); -/* Search the body of INSN for values that need reloading and record them - with push_reload. REPLACE nonzero means record also where the values occur - so that subst_reloads can be used. */ -extern int find_reloads (rtx, int, int, int, short *); - /* Compute the sum of X and Y, making canonicalizations assumed in an address, namely: sum constant integers, surround the sum of two constants with a CONST, put the constant as the second operand, and