From patchwork Sat Jun 26 00:29:08 2010
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
Subject: A few simple DImode improvements
Date: Fri, 25 Jun 2010 14:29:08 -0000
From: Bernd Schmidt
X-Patchwork-Id: 57045
Message-Id: <4C2549D4.10608@codesourcery.com>
To: GCC Patches

There are ways to improve register allocation for DImode without going
to extremes like the patch I posted on Monday.  The attached patch is
quite simple and fixes two issues I noticed while working on the DCE
patch I posted earlier.

First, when we set DImode registers in a sequence of multiple SImode
accesses, we emit a clobber at the start of the sequence to tell life
analysis that the register is dead above this point.  Scheduling can
move these clobbers, extending the lifetime as far as IRA is able to
tell.  The changes in ira-lives.c make us move these clobbers downwards
as far as possible.  This may sometimes help even for single-word regs
that are initialized with a sequence of partial sets involving
zero_extract.

The changes in the lower-subreg pass are motivated by nonsensical
transformations like the following:

-(insn 205 204 26 4 (set (reg:DI 68)
-        (reg:DI 178)) 49 (REG_DEAD (reg:DI 178)
+(insn 261 204 262 4 (clobber (reg:DI 193))
+
+(insn 262 261 263 4 (set (subreg:SI (reg:DI 193) 0) (reg:SI 191))
+
+(insn 263 262 264 4 (set (subreg:SI (reg:DI 193) 4) (reg:SI 192))
+
+(insn 264 263 26 4 (set (reg:DI 68) (reg:DI 193))

In resolve_simple_move, we generate a new DImode temporary pseudo
rather than storing into the DImode pseudo which is the actual
destination.  This, I think, is due to confusion about the use of
can_decompose_p; it makes sense to use it in this context only for hard
regs.  For pseudos, it tests the non_decomposable_context bitmap, which
is irrelevant here.

Bootstrapped and regression tested on i686-linux.  Ok?


Bernd

	* ira-lives.c (process_bb_node_lives): Move clobber insns downwards
	as much as possible before analyzing a basic block.
	* lower-subreg.c (resolve_simple_move): Don't call can_decompose_p
	on pseudos.

Index: ira-lives.c
===================================================================
--- ira-lives.c	(revision 161371)
+++ ira-lives.c	(working copy)
@@ -877,7 +877,7 @@ process_bb_node_lives (ira_loop_tree_nod
   int i, freq;
   unsigned int j;
   basic_block bb;
-  rtx insn;
+  rtx insn, prev;
   bitmap_iterator bi;
   bitmap reg_live_out;
   unsigned int px;
@@ -925,6 +925,42 @@ process_bb_node_lives (ira_loop_tree_nod
       /* Invalidate all allocno_saved_at_call entries.  */
       last_call_num++;
 
+      /* Previous passes such as the first scheduling pass may have lengthened
+         the lifetime of pseudos by moving CLOBBER insns upwards.  Undo this
+         here.  */
+      FOR_BB_INSNS_REVERSE_SAFE (bb, insn, prev)
+        {
+          rtx pat, next, reg;
+          if (!NONJUMP_INSN_P (insn) || insn == BB_END (bb))
+            continue;
+          pat = PATTERN (insn);
+          if (GET_CODE (pat) != CLOBBER)
+            continue;
+          reg = XEXP (pat, 0);
+          if (!REG_P (reg) || REGNO (reg) < FIRST_PSEUDO_REGISTER)
+            continue;
+          next = insn;
+          while (next != BB_END (bb))
+            {
+              next = NEXT_INSN (next);
+              if (!NONDEBUG_INSN_P (next))
+                continue;
+              if (reg_mentioned_p (XEXP (pat, 0), PATTERN (next)))
+                break;
+            }
+          if (next == NEXT_INSN (insn))
+            continue;
+
+          NEXT_INSN (PREV_INSN (insn)) = NEXT_INSN (insn);
+          PREV_INSN (NEXT_INSN (insn)) = PREV_INSN (insn);
+
+          PREV_INSN (insn) = PREV_INSN (next);
+          NEXT_INSN (insn) = next;
+
+          NEXT_INSN (PREV_INSN (next)) = insn;
+          PREV_INSN (next) = insn;
+        }
+
       /* Scan the code of this basic block, noting which allocnos
          and hard regs are born or die.
 
Index: lower-subreg.c
===================================================================
--- lower-subreg.c	(revision 161371)
+++ lower-subreg.c	(working copy)
@@ -717,7 +717,7 @@ resolve_simple_move (rtx set, rtx insn)
   /* If SRC is a register which we can't decompose, or has side
      effects, we need to move via a temporary register.  */
 
-  if (!can_decompose_p (src)
+  if ((REG_P (src) && HARD_REGISTER_P (src) && !can_decompose_p (src))
       || side_effects_p (src)
       || GET_CODE (src) == ASM_OPERANDS)
     {
@@ -737,7 +737,7 @@ resolve_simple_move (rtx set, rtx insn)
     dest_mode = orig_mode;
 
   pushing = push_operand (dest, dest_mode);
-  if (!can_decompose_p (dest)
+  if ((REG_P (dest) && HARD_REGISTER_P (dest) && !can_decompose_p (dest))
       || (side_effects_p (dest) && !pushing)
       || (!SCALAR_INT_MODE_P (dest_mode)
           && !resolve_reg_p (dest)
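
For readers who want to experiment with the clobber-sinking idea from
the ira-lives.c hunk outside of GCC, here is a rough standalone model
of it.  It is only a sketch: struct toy_insn, mentions, toy_link and
sink_clobbers are hypothetical names made up for this illustration,
the "uses" array stands in for reg_mentioned_p on a real pattern, and
none of it is GCC code.  It assumes the chain is a single isolated
basic block whose last insn plays the role of BB_END.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for an insn in a doubly linked chain (hypothetical type,
   not GCC's rtx).  */
struct toy_insn {
    struct toy_insn *prev, *next;
    bool is_clobber;  /* true for a (clobber (reg N)) insn */
    int clobbered;    /* register clobbered when is_clobber is set */
    int uses[2];      /* registers mentioned otherwise, -1 = unused slot */
};

/* Toy replacement for reg_mentioned_p: does INSN mention REG?  */
static bool mentions(const struct toy_insn *insn, int reg)
{
    if (insn->is_clobber)
        return insn->clobbered == reg;
    return insn->uses[0] == reg || insn->uses[1] == reg;
}

/* Link the N elements of V into one chain and return its head.  */
static struct toy_insn *toy_link(struct toy_insn *v, size_t n)
{
    assert(n > 0);
    for (size_t i = 0; i < n; i++) {
        v[i].prev = i > 0 ? &v[i - 1] : NULL;
        v[i].next = i + 1 < n ? &v[i + 1] : NULL;
    }
    return &v[0];
}

/* Walk the block backwards and move each clobber down so that it sits
   just before the first later insn mentioning its register (or just
   before TAIL if there is none).  TAIL itself is never moved.
   Returns the possibly changed head of the chain.  */
static struct toy_insn *sink_clobbers(struct toy_insn *head,
                                      struct toy_insn *tail)
{
    struct toy_insn *insn, *prev;

    for (insn = tail; insn != NULL; insn = prev) {
        struct toy_insn *next;

        prev = insn->prev;  /* saved first, insn may be relinked below */
        if (!insn->is_clobber || insn == tail)
            continue;

        /* Find the first later insn that mentions the register.  */
        next = insn;
        while (next != tail) {
            next = next->next;
            if (mentions(next, insn->clobbered))
                break;
        }
        if (next == insn->next)
            continue;  /* already as far down as it can go */

        /* Unlink the clobber from its current position.  */
        if (insn->prev != NULL)
            insn->prev->next = insn->next;
        else
            head = insn->next;
        insn->next->prev = insn->prev;

        /* Relink it immediately before NEXT.  */
        insn->prev = next->prev;
        insn->next = next;
        next->prev->next = insn;
        next->prev = insn;
    }
    return head;
}
```

With a chain of a clobber of reg 193, two unrelated insns, and then
two insns mentioning reg 193, sink_clobbers should leave the clobber
directly in front of the first of those two insns, which is the
shortened lifetime the ira-lives.c change is after.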