From patchwork Thu Apr 11 17:30:52 2013
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Vladimir Makarov
X-Patchwork-Id: 235844
Message-ID: <5166F34C.30901@redhat.com>
Date: Thu, 11 Apr 2013 13:30:52 -0400
From: Vladimir Makarov
To: gcc-patches, David Edelsohn, Michael Meissner, "Bergner, Peter"
Subject: RFA: enable LRA for rs6000

Here is a patch to enable LRA for rs6000.  The patch includes code
changes in the rs6000 machine-dependent parts and in the LRA parts.

I've added a new switch, -mlra, for rs6000 to make debugging LRA on
rs6000 easier.  I have not documented it because it will be removed at
the end of stage1 (possibly together with the ability to use LRA for
rs6000, if the results are unsatisfactory).

I have only one question, about its default value.  Currently LRA is
used by default, but if you prefer the opposite I can reverse it.
Please just let me know your opinion.

The patch was successfully bootstrapped and tested on ppc64 and
x86/x86-64.

Are the rs6000 parts ok for trunk?

2013-04-11  Vladimir Makarov

	* rtl.h (struct rtx_def): Add comment for field jump.
	(LRA_SUBREG_P): New macro.
	* recog.c (register_operand): Check LRA_SUBREG_P.
	* lra-constraints.c (match_reload, simplify_operand_subreg): Use
	LRA_SUBREG_P.
	(emit_spill_move): Set up LRA_SUBREG_P.
	* lra.c (lra): Add note at the end of RTL code.  Align non-empty
	stack frame.
	* lra-spills.c (lra_spill): Align stack after spilling pseudos.
	(lra_final_code_change): Skip subreg change for operators.
	* lra-eliminations.c (eliminate_regs_in_insn): Make return earlier
	if there are no operand changes.
	* lra-constraints.c (curr_insn_set): New.
	(match_reload): Set LRA_SUBREG_P.
	(emit_spill_move): Ditto.
	(check_and_process_move): Use curr_insn_set.  Process only single
	set insns.  Don't initialize sec_mem_p and change_p.
	(simplify_operand_subreg): Use LRA_SUBREG_P.
	(reg_in_class_p): New function.
	(process_alt_operands): Use it.  Use #if HAVE_ATTR_enabled instead
	of #ifdef.  Add code to remove cycling.
	(process_address): Check EXTRA_CONSTRAINT_STR.  Process even if
	non-null disp.  Reload inner instead of disp when base and index
	are null.
	(EBB_PROBABILITY_CUTOFF): Redefine probability in percents.
	(curr_insn_transform): Initialize sec_mem_p and change_p.  Set up
	curr_insn_set.  Call check_and_process_move only for single set
	insns.
	* config/rs6000/rs6000.opt (mlra): New option.
	* config/rs6000/rs6000.h (SECONDARY_MEMORY_NEEDED_MODE): New macro.
	* config/rs6000/rs6000-protos.h
	(rs6000_secondary_memory_needed_mode): New prototype.
	* config/rs6000/rs6000.c: Include ira.h.
	(TARGET_LRA_P): Redefine.
	(legitimate_lo_sum_address_p): Permit modes bigger than word for
	LRA.
	(rs6000_emit_move): Add movsd generation code for LRA.
	(rs6000_secondary_memory_needed_mode): New function.
	(rs6000_lra_p): Ditto.
	(rs6000_alloc_sdmode_stack_slot): Ignore code for LRA.
	(rs6000_secondary_reload_class): Return NO_REGS for LRA in the case
	of constants, memory, or FP regs.

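For anyone who wants to experiment with the SDmode rule outside of GCC,
here is a minimal standalone sketch of what
rs6000_secondary_memory_needed_mode does in this patch: an SDmode value
gets a DDmode-sized secondary memory slot, and every other mode keeps
its own size.  The toy_* names and the reduced mode list are
assumptions made only for this illustration; they are not GCC
interfaces.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for GCC's enum machine_mode; only the modes
   needed for the illustration are listed.  */
enum toy_mode { TOY_SImode, TOY_SDmode, TOY_DDmode, TOY_DFmode };

/* Size in bytes of each toy mode: SDmode is a 32-bit decimal float,
   DDmode a 64-bit decimal float.  */
size_t
toy_mode_size (enum toy_mode mode)
{
  switch (mode)
    {
    case TOY_SImode:
    case TOY_SDmode:
      return 4;
    case TOY_DDmode:
    case TOY_DFmode:
      return 8;
    }
  return 0;
}

/* Same rule as rs6000_secondary_memory_needed_mode in the patch:
   widen SDmode to DDmode, leave every other mode unchanged.  */
enum toy_mode
toy_secondary_memory_needed_mode (enum toy_mode mode)
{
  if (mode == TOY_SDmode)
    return TOY_DDmode;
  return mode;
}
```

With this sketch, asking for the slot mode of TOY_SDmode yields
TOY_DDmode (8 bytes), while TOY_DFmode is returned unchanged.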
Index: config/rs6000/rs6000-protos.h
===================================================================
--- config/rs6000/rs6000-protos.h (revision 197640)
+++ config/rs6000/rs6000-protos.h (working copy)
@@ -118,6 +118,8 @@ extern rtx create_TOC_reference (rtx, rt
 extern void rs6000_split_multireg_move (rtx, rtx);
 extern void rs6000_emit_move (rtx, rtx, enum machine_mode);
 extern rtx rs6000_secondary_memory_needed_rtx (enum machine_mode);
+extern enum machine_mode rs6000_secondary_memory_needed_mode (enum
+                                                              machine_mode);
 extern rtx (*rs6000_legitimize_reload_address_ptr) (rtx, enum machine_mode,
                                                     int, int, int, int *);
 extern bool rs6000_legitimate_offset_address_p (enum machine_mode, rtx,
Index: config/rs6000/rs6000.c
===================================================================
--- config/rs6000/rs6000.c (revision 197640)
+++ config/rs6000/rs6000.c (working copy)
@@ -56,6 +56,7 @@
 #include "intl.h"
 #include "params.h"
 #include "tm-constrs.h"
+#include "ira.h"
 #include "opts.h"
 #include "tree-vectorizer.h"
 #include "dumpfile.h"
@@ -1425,6 +1426,9 @@ static const struct attribute_spec rs600
 #undef TARGET_MODE_DEPENDENT_ADDRESS_P
 #define TARGET_MODE_DEPENDENT_ADDRESS_P rs6000_mode_dependent_address_p
 
+#undef TARGET_LRA_P
+#define TARGET_LRA_P rs6000_lra_p
+
 #undef TARGET_CAN_ELIMINATE
 #define TARGET_CAN_ELIMINATE rs6000_can_eliminate
 
@@ -5561,7 +5565,7 @@ rs6000_legitimate_offset_address_p (enum
     return false;
   if (!reg_offset_addressing_ok_p (mode))
     return virtual_stack_registers_memory_p (x);
-  if (legitimate_constant_pool_address_p (x, mode, strict))
+  if (legitimate_constant_pool_address_p (x, mode, strict || lra_in_progress))
     return true;
   if (GET_CODE (XEXP (x, 1)) != CONST_INT)
     return false;
@@ -5701,19 +5705,23 @@ legitimate_lo_sum_address_p (enum machin
 
   if (TARGET_ELF || TARGET_MACHO)
     {
+      bool toc_ok_p;
+
       if (DEFAULT_ABI != ABI_AIX && DEFAULT_ABI != ABI_DARWIN && flag_pic)
         return false;
-      if (TARGET_TOC)
+      toc_ok_p = (lra_in_progress && TARGET_CMODEL != CMODEL_SMALL
+                  && small_toc_ref (x, VOIDmode));
+      if (TARGET_TOC && ! toc_ok_p)
         return false;
       if (GET_MODE_NUNITS (mode) != 1)
         return false;
-      if (GET_MODE_SIZE (mode) > UNITS_PER_WORD
+      if (! lra_in_progress && GET_MODE_SIZE (mode) > UNITS_PER_WORD
           && !(/* ??? Assume floating point reg based on mode?  */
                TARGET_HARD_FLOAT && TARGET_FPRS && TARGET_DOUBLE_FLOAT
                && (mode == DFmode || mode == DDmode)))
         return false;
 
-      return CONSTANT_P (x);
+      return CONSTANT_P (x) || toc_ok_p;
     }
 
   return false;
@@ -6711,7 +6719,8 @@ rs6000_legitimate_address_p (enum machin
   if (reg_offset_p && legitimate_small_data_p (mode, x))
     return 1;
   if (reg_offset_p
-      && legitimate_constant_pool_address_p (x, mode, reg_ok_strict))
+      && legitimate_constant_pool_address_p (x, mode,
+                                             reg_ok_strict || lra_in_progress))
     return 1;
   /* If not REG_OK_STRICT (before reload) let pass any stack offset.  */
   if (! reg_ok_strict
@@ -7000,6 +7009,7 @@ rs6000_conditional_register_usage (void)
           fixed_regs[i] = call_used_regs[i] = call_really_used_regs[i] = 1;
     }
 }
+
 
 /* Try to output insns to set TARGET equal to the constant C if it can be
    done in less than N insns.  Do all computations in MODE.
@@ -7331,6 +7341,68 @@ rs6000_emit_move (rtx dest, rtx source,
     cfun->machine->sdmode_stack_slot =
       eliminate_regs (cfun->machine->sdmode_stack_slot, VOIDmode, NULL_RTX);
 
+
+  if (lra_in_progress
+      && mode == SDmode
+      && REG_P (operands[0]) && REGNO (operands[0]) >= FIRST_PSEUDO_REGISTER
+      && reg_preferred_class (REGNO (operands[0])) == NO_REGS
+      && (REG_P (operands[1])
+          || (GET_CODE (operands[1]) == SUBREG
+              && REG_P (SUBREG_REG (operands[1])))))
+    {
+      int regno = REGNO (GET_CODE (operands[1]) == SUBREG
+                         ? SUBREG_REG (operands[1]) : operands[1]);
+      enum reg_class cl;
+
+      if (regno >= FIRST_PSEUDO_REGISTER)
+        {
+          cl = reg_preferred_class (regno);
+          gcc_assert (cl != NO_REGS);
+          regno = ira_class_hard_regs[cl][0];
+        }
+      if (FP_REGNO_P (regno))
+        {
+          if (GET_MODE (operands[0]) != DDmode)
+            operands[0] = gen_rtx_SUBREG (DDmode, operands[0], 0);
+          emit_insn (gen_movsd_store (operands[0], operands[1]));
+        }
+      else if (INT_REGNO_P (regno))
+        emit_insn (gen_movsd_hardfloat (operands[0], operands[1]));
+      else
+        gcc_unreachable();
+      return;
+    }
+  if (lra_in_progress
+      && mode == SDmode
+      && (REG_P (operands[0])
+          || (GET_CODE (operands[0]) == SUBREG
+              && REG_P (SUBREG_REG (operands[0]))))
+      && REG_P (operands[1]) && REGNO (operands[1]) >= FIRST_PSEUDO_REGISTER
+      && reg_preferred_class (REGNO (operands[1])) == NO_REGS)
+    {
+      int regno = REGNO (GET_CODE (operands[0]) == SUBREG
+                         ? SUBREG_REG (operands[0]) : operands[0]);
+      enum reg_class cl;
+
+      if (regno >= FIRST_PSEUDO_REGISTER)
+        {
+          cl = reg_preferred_class (regno);
+          gcc_assert (cl != NO_REGS);
+          regno = ira_class_hard_regs[cl][0];
+        }
+      if (FP_REGNO_P (regno))
+        {
+          if (GET_MODE (operands[1]) != DDmode)
+            operands[1] = gen_rtx_SUBREG (DDmode, operands[1], 0);
+          emit_insn (gen_movsd_load (operands[0], operands[1]));
+        }
+      else if (INT_REGNO_P (regno))
+        emit_insn (gen_movsd_hardfloat (operands[0], operands[1]));
+      else
+        gcc_unreachable();
+      return;
+    }
+
   if (reload_in_progress
       && mode == SDmode
       && cfun->machine->sdmode_stack_slot != NULL_RTX
@@ -13848,6 +13920,17 @@ rs6000_secondary_memory_needed_rtx (enum
   return ret;
 }
 
+/* Return the mode to be used for memory when a secondary memory
+   location is needed.  For SDmode values we need to use DDmode, in
+   all other cases we can use the same mode.  */
+enum machine_mode
+rs6000_secondary_memory_needed_mode (enum machine_mode mode)
+{
+  if (mode == SDmode)
+    return DDmode;
+  return mode;
+}
+
 static tree
 rs6000_check_sdmode (tree *tp, int *walk_subtrees, void *data ATTRIBUTE_UNUSED)
 {
@@ -14511,6 +14594,10 @@ rs6000_alloc_sdmode_stack_slot (void)
   gimple_stmt_iterator gsi;
 
   gcc_assert (cfun->machine->sdmode_stack_slot == NULL_RTX);
+  /* We use a different approach for dealing with the secondary
+     memory in LRA.  */
+  if (ira_use_lra_p)
+    return;
 
   if (TARGET_NO_SDMODE_STACK)
     return;
@@ -14747,7 +14834,7 @@ rs6000_secondary_reload_class (enum reg_
   /* Constants, memory, and FP registers can go into FP registers.  */
   if ((regno == -1 || FP_REGNO_P (regno))
       && (rclass == FLOAT_REGS || rclass == NON_SPECIAL_REGS))
-    return (mode != SDmode) ? NO_REGS : GENERAL_REGS;
+    return (mode != SDmode || lra_in_progress) ? NO_REGS : GENERAL_REGS;
 
   /* Memory, and FP/altivec registers can go into fp/altivec registers under
      VSX.  However, for scalar variables, use the traditional floating point
@@ -27666,6 +27753,13 @@ rs6000_libcall_value (enum machine_mode
 }
 
 
+/* Return true if we use LRA instead of reload pass.  */
+static bool
+rs6000_lra_p (void)
+{
+  return rs6000_lra_flag;
+}
+
 /* Given FROM and TO register numbers, say whether this elimination is allowed.
    Frame pointer elimination is automatically handled.
 
Index: config/rs6000/rs6000.h
===================================================================
--- config/rs6000/rs6000.h (revision 197640)
+++ config/rs6000/rs6000.h (working copy)
@@ -1391,6 +1391,13 @@ extern enum reg_class rs6000_constraints
 #define SECONDARY_MEMORY_NEEDED_RTX(MODE) \
   rs6000_secondary_memory_needed_rtx (MODE)
 
+/* Specify the mode to be used for memory when a secondary memory
+   location is needed.  For cpus that cannot load/store SDmode values
+   from the 64-bit FP registers without using a full 64-bit
+   load/store, we need a wider mode.  */
+#define SECONDARY_MEMORY_NEEDED_MODE(MODE) \
+  rs6000_secondary_memory_needed_mode (MODE)
+
 /* Return the maximum number of consecutive registers
    needed to represent mode MODE in a register of class CLASS.
 
Index: config/rs6000/rs6000.opt
===================================================================
--- config/rs6000/rs6000.opt (revision 197640)
+++ config/rs6000/rs6000.opt (working copy)
@@ -443,6 +443,10 @@ mlong-double-
 Target RejectNegative Joined UInteger Var(rs6000_long_double_type_size) Save
 -mlong-double- Specify size of long double (64 or 128 bits)
 
+mlra
+Target Report Var(rs6000_lra_flag) Init(1) Save
+Use LRA instead of reload
+
 msched-costly-dep=
 Target RejectNegative Joined Var(rs6000_sched_costly_dep_str)
 Determine which dependences between insns are considered costly
Index: lra-constraints.c
===================================================================
--- lra-constraints.c (revision 197640)
+++ lra-constraints.c (working copy)
@@ -135,10 +135,11 @@
    reload insns.  */
 static int bb_reload_num;
 
-/* The current insn being processed and corresponding its data (basic
-   block, the insn data, the insn static data, and the mode of each
-   operand).  */
+/* The current insn being processed and corresponding its single set
+   (NULL otherwise), its data (basic block, the insn data, the insn
+   static data, and the mode of each operand).  */
 static rtx curr_insn;
+static rtx curr_insn_set;
 static basic_block curr_bb;
 static lra_insn_recog_data_t curr_id;
 static struct lra_static_insn_data *curr_static_id;
@@ -698,6 +699,7 @@ match_reload (signed char out, signed ch
             new_out_reg = gen_lowpart_SUBREG (outmode, reg);
           else
             new_out_reg = gen_rtx_SUBREG (outmode, reg, 0);
+          LRA_SUBREG_P (new_out_reg) = 1;
           /* If the input reg is dying here, we can use the same hard
              register for REG and IN_RTX.  We do it only for original
              pseudos as reload pseudos can die although original
@@ -721,6 +723,7 @@ match_reload (signed char out, signed ch
              it at the end of LRA work.  */
           clobber = emit_clobber (new_out_reg);
           LRA_TEMP_CLOBBER_P (PATTERN (clobber)) = 1;
+          LRA_SUBREG_P (new_in_reg) = 1;
           if (GET_CODE (in_rtx) == SUBREG)
             {
               rtx subreg_reg = SUBREG_REG (in_rtx);
@@ -856,32 +859,34 @@ static rtx
 emit_spill_move (bool to_p, rtx mem_pseudo, rtx val)
 {
   if (GET_MODE (mem_pseudo) != GET_MODE (val))
-    val = gen_rtx_SUBREG (GET_MODE (mem_pseudo),
-                          GET_CODE (val) == SUBREG ? SUBREG_REG (val) : val,
-                          0);
+    {
+      val = gen_rtx_SUBREG (GET_MODE (mem_pseudo),
+                            GET_CODE (val) == SUBREG ? SUBREG_REG (val) : val,
+                            0);
+      LRA_SUBREG_P (val) = 1;
+    }
   return (to_p
-          ? gen_move_insn (mem_pseudo, val)
-          : gen_move_insn (val, mem_pseudo));
+          ? gen_move_insn (mem_pseudo, val)
+          : gen_move_insn (val, mem_pseudo));
 }
 
 /* Process a special case insn (register move), return true if we
-   don't need to process it anymore.  Return that RTL was changed
-   through CHANGE_P and macro SECONDARY_MEMORY_NEEDED says to use
-   secondary memory through SEC_MEM_P.  */
+   don't need to process it anymore.  INSN should be a single set
+   insn.  Set up that RTL was changed through CHANGE_P and macro
+   SECONDARY_MEMORY_NEEDED says to use secondary memory through
+   SEC_MEM_P.  */
 static bool
-check_and_process_move (bool *change_p, bool *sec_mem_p)
+check_and_process_move (bool *change_p, bool *sec_mem_p ATTRIBUTE_UNUSED)
 {
   int sregno, dregno;
-  rtx set, dest, src, dreg, sreg, old_sreg, new_reg, before, scratch_reg;
+  rtx dest, src, dreg, sreg, old_sreg, new_reg, before, scratch_reg;
   enum reg_class dclass, sclass, secondary_class;
   enum machine_mode sreg_mode;
   secondary_reload_info sri;
 
-  *sec_mem_p = *change_p = false;
-  if ((set = single_set (curr_insn)) == NULL)
-    return false;
-  dreg = dest = SET_DEST (set);
-  sreg = src = SET_SRC (set);
+  lra_assert (curr_insn_set != NULL_RTX);
+  dreg = dest = SET_DEST (curr_insn_set);
+  sreg = src = SET_SRC (curr_insn_set);
   /* Quick check on the right move insn which does not need
      reloads.  */
   if ((dclass = get_op_class (dest)) != NO_REGS
@@ -1008,7 +1013,7 @@ check_and_process_move (bool *change_p,
       if (GET_CODE (src) == SUBREG)
         SUBREG_REG (src) = new_reg;
       else
-        SET_SRC (set) = new_reg;
+        SET_SRC (curr_insn_set) = new_reg;
     }
   else
     {
@@ -1205,7 +1210,10 @@ simplify_operand_subreg (int nop, enum m
            && (hard_regno_nregs[hard_regno][GET_MODE (reg)]
                >= hard_regno_nregs[hard_regno][mode])
            && simplify_subreg_regno (hard_regno, GET_MODE (reg),
-                                     SUBREG_BYTE (operand), mode) < 0)
+                                     SUBREG_BYTE (operand), mode) < 0
+           /* Don't reload subreg for matching reload.  It is actually
+              valid subreg in LRA.  */
+           && ! LRA_SUBREG_P (operand))
       || CONSTANT_P (reg) || GET_CODE (reg) == PLUS || MEM_P (reg))
     {
       enum op_type type = curr_static_id->operand[nop].type;
@@ -1312,6 +1320,14 @@ general_constant_p (rtx x)
   return CONSTANT_P (x) && (! flag_pic || LEGITIMATE_PIC_OPERAND_P (x));
 }
 
+static bool
+reg_in_class_p (rtx reg, enum reg_class cl)
+{
+  if (cl == NO_REGS)
+    return get_reg_class (REGNO (reg)) == NO_REGS;
+  return in_class_p (reg, cl, NULL);
+}
+
 /* Major function to choose the current insn alternative and what
    operands should be reloaded and how.  If ONLY_ALTERNATIVE is not
    negative we should consider only this alternative.  Return false if
@@ -1391,7 +1407,7 @@ process_alt_operands (int only_alternati
   for (nalt = 0; nalt < n_alternatives; nalt++)
     {
       /* Loop over operands for one constraint alternative.  */
-#ifdef HAVE_ATTR_enabled
+#if HAVE_ATTR_enabled
       if (curr_id->alternative_enabled_p != NULL
           && ! curr_id->alternative_enabled_p[nalt])
         continue;
@@ -2048,6 +2064,31 @@ process_alt_operands (int only_alternati
           if (early_clobber_p && operand_reg[nop] != NULL_RTX)
             early_clobbered_nops[early_clobbered_regs_num++] = nop;
         }
+      if (curr_insn_set != NULL_RTX && n_operands == 2
+          && ((! curr_alt_win[0] && ! curr_alt_win[1]
+               && REG_P (no_subreg_reg_operand[0])
+               && REG_P (no_subreg_reg_operand[1])
+               && (reg_in_class_p (no_subreg_reg_operand[0], curr_alt[1])
+                   || reg_in_class_p (no_subreg_reg_operand[1], curr_alt[0])))
+              || (! curr_alt_win[0] && curr_alt_win[1]
+                  && REG_P (no_subreg_reg_operand[1])
+                  && reg_in_class_p (no_subreg_reg_operand[1], curr_alt[0]))
+              || (curr_alt_win[0] && ! curr_alt_win[1]
+                  && REG_P (no_subreg_reg_operand[0])
+                  && reg_in_class_p (no_subreg_reg_operand[0], curr_alt[1])
+                  && (! CONST_POOL_OK_P (curr_operand_mode[1],
+                                         no_subreg_reg_operand[1])
+                      || (targetm.preferred_reload_class
+                          (no_subreg_reg_operand[1],
+                           (enum reg_class) curr_alt[1]) != NO_REGS))
+                  /* If it is a result of recent elimination in move
+                     insn we can transform it into an add still by
+                     using this alternative.  */
+                  && GET_CODE (no_subreg_reg_operand[1]) != PLUS)))
+        /* We have a move insn and a new reload insn will be similar
+           to the current insn.  We should avoid such situation as it
+           results in LRA cycling.  */
+        overall += LRA_MAX_REJECT;
       ok_p = true;
       curr_alt_dont_inherit_ops_num = 0;
       for (nop = 0; nop < early_clobbered_regs_num; nop++)
@@ -2419,27 +2460,35 @@ process_address (int nop, rtx *before, r
       && process_addr_reg (ad.index_term, before, NULL, INDEX_REG_CLASS))
     change_p = true;
 
+#ifdef EXTRA_CONSTRAINT_STR
+  /* Target hooks sometimes reject extra constraint addresses -- use
+     EXTRA_CONSTRAINT_STR for the validation.  */
+  if (constraint[0] != 'p'
+      && EXTRA_ADDRESS_CONSTRAINT (constraint[0], constraint)
+      && EXTRA_CONSTRAINT_STR (op, constraint[0], constraint))
+    return change_p;
+#endif
+
   /* There are three cases where the shape of *AD.INNER may now be invalid:
 
      1) the original address was valid, but either elimination or
-        equiv_address_substitution applied a displacement that made
-        it invalid.
+        equiv_address_substitution was applied and that made
+        the address invalid.
 
      2) the address is an invalid symbolic address created by
         force_const_to_mem.
 
      3) the address is a frame address with an invalid offset.
 
-     All these cases involve a displacement and a non-autoinc address,
-     so there is no point revalidating other types.  */
-  if (ad.disp == NULL || ad.autoinc_p || valid_address_p (&ad))
+     All these cases involve a non-autoinc address, so there is no
+     point revalidating other types.  */
+  if (ad.autoinc_p || valid_address_p (&ad))
     return change_p;
 
   /* Any index existed before LRA started, so we can assume that the
      presence and shape of the index is valid.  */
   push_to_sequence (*before);
-  gcc_assert (ad.segment == NULL);
-  gcc_assert (ad.disp == ad.disp_term);
+  lra_assert (ad.disp == ad.disp_term);
   if (ad.base == NULL)
     {
       if (ad.index == NULL)
@@ -2447,25 +2496,25 @@ process_address (int nop, rtx *before, r
           int code = -1;
           enum reg_class cl = base_reg_class (ad.mode, ad.as,
                                               SCRATCH, SCRATCH);
-          rtx disp = *ad.disp;
+          rtx addr = *ad.inner;
 
-          new_reg = lra_create_new_reg (Pmode, NULL_RTX, cl, "disp");
+          new_reg = lra_create_new_reg (Pmode, NULL_RTX, cl, "addr");
 #ifdef HAVE_lo_sum
           {
             rtx insn;
             rtx last = get_last_insn ();
 
-            /* disp => lo_sum (new_base, disp), case (2) above.  */
+            /* addr => lo_sum (new_base, addr), case (2) above.  */
             insn = emit_insn (gen_rtx_SET
                               (VOIDmode, new_reg,
-                               gen_rtx_HIGH (Pmode, copy_rtx (disp))));
+                               gen_rtx_HIGH (Pmode, copy_rtx (addr))));
             code = recog_memoized (insn);
             if (code >= 0)
               {
-                *ad.disp = gen_rtx_LO_SUM (Pmode, new_reg, disp);
+                *ad.inner = gen_rtx_LO_SUM (Pmode, new_reg, addr);
                 if (! valid_address_p (ad.mode, *ad.outer, ad.as))
                   {
-                    *ad.disp = disp;
+                    *ad.inner = addr;
                     code = -1;
                   }
               }
@@ -2475,9 +2524,9 @@ process_address (int nop, rtx *before, r
 #endif
           if (code < 0)
             {
-              /* disp => new_base, case (2) above.  */
-              lra_emit_move (new_reg, disp);
-              *ad.disp = new_reg;
+              /* addr => new_base, case (2) above.  */
+              lra_emit_move (new_reg, addr);
+              *ad.inner = new_reg;
             }
         }
       else
@@ -2690,7 +2739,10 @@ curr_insn_transform (void)
   no_input_reloads_p = no_output_reloads_p = false;
   goal_alt_number = -1;
 
-  if (check_and_process_move (&change_p, &sec_mem_p))
+  change_p = sec_mem_p = false;
+  curr_insn_set = single_set (curr_insn);
+  if (curr_insn_set != NULL_RTX
+      && check_and_process_move (&change_p, &sec_mem_p))
     return change_p;
 
   /* JUMP_INSNs and CALL_INSNs are not allowed to have any output
@@ -4806,7 +4858,7 @@ inherit_in_ebb (rtx head, rtx tail)
 /* This value affects EBB forming.  If probability of edge from EBB to
    a BB is not greater than the following value, we don't add the BB
    to EBB.  */
-#define EBB_PROBABILITY_CUTOFF (REG_BR_PROB_BASE / 2)
+#define EBB_PROBABILITY_CUTOFF ((REG_BR_PROB_BASE * 50) / 100)
 
 /* Current number of inheritance/split iteration.  */
 int lra_inheritance_iter;
Index: lra-eliminations.c
===================================================================
--- lra-eliminations.c (revision 197640)
+++ lra-eliminations.c (working copy)
@@ -975,6 +975,9 @@ eliminate_regs_in_insn (rtx insn, bool r
         }
     }
 
+  if (! validate_p)
+    return;
+
   /* Substitute the operands; the new values are in the substed_operand
      array.  */
   for (i = 0; i < static_id->n_operands; i++)
@@ -982,16 +985,13 @@ eliminate_regs_in_insn (rtx insn, bool r
   for (i = 0; i < static_id->n_dups; i++)
     *id->dup_loc[i] = substed_operand[(int) static_id->dup_num[i]];
 
-  if (validate_p)
-    {
-      /* If we had a move insn but now we don't, re-recognize it.
-         This will cause spurious re-recognition if the old move had a
-         PARALLEL since the new one still will, but we can't call
-         single_set without having put new body into the insn and the
-         re-recognition won't hurt in this rare case.  */
-      id = lra_update_insn_recog_data (insn);
-      static_id = id->insn_static_data;
-    }
+  /* If we had a move insn but now we don't, re-recognize it.
+     This will cause spurious re-recognition if the old move had a
+     PARALLEL since the new one still will, but we can't call
+     single_set without having put new body into the insn and the
+     re-recognition won't hurt in this rare case.  */
+  id = lra_update_insn_recog_data (insn);
+  static_id = id->insn_static_data;
 }
 
 /* Spill pseudos which are assigned to hard registers in SET.  Add
Index: lra-spills.c
===================================================================
--- lra-spills.c (revision 197640)
+++ lra-spills.c (working copy)
@@ -548,6 +548,11 @@ lra_spill (void)
   for (i = 0; i < n; i++)
     if (pseudo_slots[pseudo_regnos[i]].mem == NULL_RTX)
       assign_mem_slot (pseudo_regnos[i]);
+  if (n > 0 && crtl->stack_alignment_needed)
+    /* If we have a stack frame, we must align it now.  The stack size
+       may be a part of the offset computation for register
+       elimination.  */
+    assign_stack_local (BLKmode, 0, crtl->stack_alignment_needed);
   if (lra_dump_file != NULL)
     {
       for (i = 0; i < slots_num; i++)
@@ -644,10 +649,12 @@ lra_final_code_change (void)
             }
 
           lra_insn_recog_data_t id = lra_get_insn_recog_data (insn);
+          struct lra_static_insn_data *static_id = id->insn_static_data;
           bool insn_change_p = false;
 
           for (i = id->insn_static_data->n_operands - 1; i >= 0; i--)
-            if (alter_subregs (id->operand_loc[i], ! DEBUG_INSN_P (insn)))
+            if ((DEBUG_INSN_P (insn) || ! static_id->operand[i].is_operator)
+                && alter_subregs (id->operand_loc[i], ! DEBUG_INSN_P (insn)))
               {
                 lra_update_dup (id, i);
                 insn_change_p = true;
Index: lra.c
===================================================================
--- lra.c (revision 197640)
+++ lra.c (working copy)
@@ -2202,6 +2202,10 @@ lra (FILE *f)
 
   timevar_push (TV_LRA);
 
+  /* Make sure that the last insn is a note.  Some subsequent passes
+     need it.  */
+  emit_note (NOTE_INSN_DELETED);
+
   COPY_HARD_REG_SET (lra_no_alloc_regs, ira_no_alloc_regs);
 
   init_reg_info ();
@@ -2258,6 +2262,11 @@ lra (FILE *f)
   bitmap_initialize (&lra_split_regs, &reg_obstack);
   bitmap_initialize (&lra_optional_reload_pseudos, &reg_obstack);
   live_p = false;
+  if (get_frame_size () != 0 && crtl->stack_alignment_needed)
+    /* If we have a stack frame, we must align it now.  The stack size
+       may be a part of the offset computation for register
+       elimination.  */
+    assign_stack_local (BLKmode, 0, crtl->stack_alignment_needed);
   for (;;)
     {
       for (;;)
Index: recog.c
===================================================================
--- recog.c (revision 197640)
+++ recog.c (working copy)
@@ -1065,7 +1065,8 @@ register_operand (rtx op, enum machine_m
           && REGNO (sub) < FIRST_PSEUDO_REGISTER
           && REG_CANNOT_CHANGE_MODE_P (REGNO (sub), GET_MODE (sub), mode)
           && GET_MODE_CLASS (GET_MODE (sub)) != MODE_COMPLEX_INT
-          && GET_MODE_CLASS (GET_MODE (sub)) != MODE_COMPLEX_FLOAT)
+          && GET_MODE_CLASS (GET_MODE (sub)) != MODE_COMPLEX_FLOAT
+          && ! LRA_SUBREG_P (op))
         return 0;
 #endif
 
Index: rtl.h
===================================================================
--- rtl.h (revision 197640)
+++ rtl.h (working copy)
@@ -265,7 +265,8 @@ struct GTY((chain_next ("RTX_NEXT (&%h)"
      1 in a SET that is for a return.
      In a CODE_LABEL, part of the two-bit alternate entry field.
      1 in a CONCAT is VAL_EXPR_IS_COPIED in var-tracking.c.
-     1 in a VALUE is SP_BASED_VALUE_P in cselib.c.  */
+     1 in a VALUE is SP_BASED_VALUE_P in cselib.c.
+     1 in a SUBREG generated by LRA for reload insns.  */
   unsigned int jump : 1;
   /* In a CODE_LABEL, part of the two-bit alternate entry field.
      1 in a MEM if it cannot trap.
@@ -1411,6 +1412,11 @@ do { \
   ((RTL_FLAG_CHECK1("SUBREG_PROMOTED_UNSIGNED_P", (RTX), SUBREG)->volatil) \
    ? -1 : (int) (RTX)->unchanging)
 
+/* True if the subreg was generated by LRA for reload insns.  Such
+   subregs are valid only during LRA.  */
+#define LRA_SUBREG_P(RTX) \
+  (RTL_FLAG_CHECK1("LRA_SUBREG_P", (RTX), SUBREG)->jump)
+
 /* Access various components of an ASM_OPERANDS rtx.  */
 
 #define ASM_OPERANDS_TEMPLATE(RTX) XCSTR (RTX, 0, ASM_OPERANDS)
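
As a rough illustration of how the LRA_SUBREG_P flag is meant to
interact with the register_operand check in recog.c, here is a small
standalone sketch.  It is not GCC code: struct toy_rtx, toy_flag_check,
TOY_LRA_SUBREG_P and toy_subreg_acceptable_p are hypothetical names
that only imitate the pattern of a one-bit flag guarded by an rtx code
check (the role RTL_FLAG_CHECK1 plays in rtl.h).

```c
#include <assert.h>

/* Hypothetical, heavily reduced model of an rtx: a code plus the
   one-bit "jump" flag that the patch reuses for SUBREGs.  */
enum toy_code { TOY_REG, TOY_SUBREG };

struct toy_rtx
{
  enum toy_code code;
  unsigned int jump : 1;
};

/* Imitates the checking done by RTL_FLAG_CHECK1: the flag may only be
   accessed on an rtx of the expected code.  */
struct toy_rtx *
toy_flag_check (struct toy_rtx *x, enum toy_code expected)
{
  assert (x->code == expected);
  return x;
}

/* Toy counterpart of LRA_SUBREG_P: usable both to read and to set the
   flag, because it expands to an lvalue.  */
#define TOY_LRA_SUBREG_P(RTX) (toy_flag_check ((RTX), TOY_SUBREG)->jump)

/* Toy counterpart of the register_operand change: a subreg whose mode
   change would normally be rejected is still accepted when it was
   created by LRA for a reload insn.  */
int
toy_subreg_acceptable_p (struct toy_rtx *op, int cannot_change_mode)
{
  if (cannot_change_mode && ! TOY_LRA_SUBREG_P (op))
    return 0;
  return 1;
}
```

In this model a TOY_SUBREG with the flag clear is rejected when
cannot_change_mode is nonzero, and the same object is accepted after
TOY_LRA_SUBREG_P (op) = 1, which mirrors the intent of the extra
"&& ! LRA_SUBREG_P (op)" condition.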