Message ID | 51705B25.6020402@redhat.com |
---|---|
State | New |
Headers | show |
On Thu, Apr 18, 2013 at 04:44:21PM -0400, Vladimir Makarov wrote: > LRA can fix the wrong address but secondary reload was done before > processing addresses. It could be fixed in rs6000.c code too but it > is complicated and I found a better (and i think more right) > solution by moving secondary reload generation after address > processing. I tended to think secondary_reload should always happen after address processing. > Here is the patch for your branch (patch for trunk is a bit > different as some changes in affected code were done on trunk). > > 2013-04-18 Vladimir Makarov <vmakarov@redhat.com> > > * lra-constraints.c (check_and_process_move): Move code for move > cost check to simple_move_p. Remove equiv_substitution. > (simple_move_p): New function. > (curr_insn_transform): Use the new function. Move call of > check_and_process_move after operand equiv substitution and > address process. > > Tomorrow I am going to look at SPEC2006 dealII crash for 32-bit mode. > Thanks for the patch. Unfortunately I just updated the branch to be merged up to 198065. So if you could send me a patch against current trunk, or update the branch (branches/ibm/meissner-lra) and check it into the branch, it would be appreciated.
On 13-04-18 4:44 PM, Vladimir Makarov wrote: > > Tomorrow I am going to look at SPEC2006 dealII crash for 32-bit mode. > LRA crashes on insn (insn 406 575 391 22 (set (reg:TF 35 3) (mem/u/c:TF (lo_sum:SI (reg:SI 7 7 [414]) (symbol_ref/u:SI ("*.LC10") [flags 0x82])) [85 S16 A128])) /home/cygnus/vmakarov/build/spec2006/benchspec/CPU2006/447.dealII/src/quadrature_lib.cc:80 373 {*movtf_intern\ al} (expr_list:REG_DEAD (reg:SI 7 7 [414]) (nil))) As I understand this correct insn. LRA checks the insn correctness in if (!constrain_operands (1)) fatal_insn_not_found (insn); and fails. The reason for this code in rs6000.c::legitimate_lo_sum_address_p if (! lra_in_progress && GET_MODE_SIZE (mode) > UNITS_PER_WORD && !(/* ??? Assume floating point reg based on mode? */ TARGET_HARD_FLOAT && TARGET_FPRS && TARGET_DOUBLE_FLOAT && (mode == DFmode || mode == DDmode))) return false; lra_in_progress is false and mode==TFmode. I don't want to setup lra_in_progress to true at this stage as I need right correctness check (lra_in_progress may affect code checking RTLcorrectness, make the test less rigourous). If I change the above code to if (! lra_in_progress && GET_MODE_SIZE (mode) > UNITS_PER_WORD && !(/* ??? Assume floating point reg based on mode? */ TARGET_HARD_FLOAT && TARGET_FPRS && TARGET_DOUBLE_FLOAT && (mode == DFmode || mode == DDmode || mode == TFmode || mode == TDmode))) return false; the crash is gone. I don't understand what this check means and what comments ??? means too. So it is up to you, Mike, to decide is my change correct or not. Thanks.
On Fri, Apr 19, 2013 at 05:16:43PM -0400, Vladimir Makarov wrote: > If I change the above code to > > if (! lra_in_progress && GET_MODE_SIZE (mode) > UNITS_PER_WORD > && !(/* ??? Assume floating point reg based on mode? */ > TARGET_HARD_FLOAT && TARGET_FPRS && TARGET_DOUBLE_FLOAT > && (mode == DFmode || mode == DDmode || mode == > TFmode || mode == TDmode))) > return false; > > the crash is gone. A lo_sum address isn't always offsettable. You know that you can't read or write the whole TFmode with one instruction, and that the mem will be accessed at addr+0 and addr+8 (and possibly at addr+4 and addr+12). If it so happens that addr is n*65536+32760, then addr+8 will overflow the 16-bit lo_sum and you'll actually try to access n*65536-32768 for the second part. So I don't think your change is correct. > I don't understand what this check means and what comments ??? means too. A lo_sum mem is only valid if you know it won't be offset (or that offsetting will never cross a 64k+32k boundary). If the access is smaller than a word then the load or store can be done in one insn. No offset required. If the access is a DFmode *and* you are loading or storing a floating point reg, then the access is also done in one insn. The ??? comment is referring to the fact that you don't know for sure that the DFmode is in a floating point reg. It usually is, but may be in two general purpose regs. Which then need an offset to load/store the second reg.
On 13-04-22 12:35 AM, Alan Modra wrote: > On Fri, Apr 19, 2013 at 05:16:43PM -0400, Vladimir Makarov wrote: >> I don't understand what this check means and what comments ??? means too. > A lo_sum mem is only valid if you know it won't be offset (or that > offsetting will never cross a 64k+32k boundary). If the access is > smaller than a word then the load or store can be done in one insn. > No offset required. If the access is a DFmode *and* you are loading > or storing a floating point reg, then the access is also done in one > insn. The ??? comment is referring to the fact that you don't know > for sure that the DFmode is in a floating point reg. It usually is, > but may be in two general purpose regs. Which then need an offset to > load/store the second reg. > Alan, thanks for the explanation. I'll search for another solution.
On Mon, Apr 22, 2013 at 03:26:45PM -0400, Vladimir Makarov wrote: > On 13-04-22 12:35 AM, Alan Modra wrote: > >On Fri, Apr 19, 2013 at 05:16:43PM -0400, Vladimir Makarov wrote: > >> I don't understand what this check means and what comments ??? means too. > >A lo_sum mem is only valid if you know it won't be offset (or that > >offsetting will never cross a 64k+32k boundary). If the access is > >smaller than a word then the load or store can be done in one insn. > >No offset required. If the access is a DFmode *and* you are loading > >or storing a floating point reg, then the access is also done in one > >insn. The ??? comment is referring to the fact that you don't know > >for sure that the DFmode is in a floating point reg. It usually is, > >but may be in two general purpose regs. Which then need an offset to > >load/store the second reg. > > > Alan, thanks for the explanation. I'll search for another solution. I'm suspecting secondary_reload needs more tuning for TF/TD modes.
Index: lra-constraints.c =================================================================== --- lra-constraints.c (revision 198028) +++ lra-constraints.c (working copy) @@ -887,14 +887,6 @@ check_and_process_move (bool *change_p, lra_assert (curr_insn_set != NULL_RTX); dreg = dest = SET_DEST (curr_insn_set); sreg = src = SET_SRC (curr_insn_set); - /* Quick check on the right move insn which does not need - reloads. */ - if ((dclass = get_op_class (dest)) != NO_REGS - && (sclass = get_op_class (src)) != NO_REGS - /* The backend guarantees that register moves of cost 2 never - need reloads. */ - && targetm.register_move_cost (GET_MODE (src), dclass, sclass) == 2) - return true; if (GET_CODE (dest) == SUBREG) dreg = SUBREG_REG (dest); if (GET_CODE (src) == SUBREG) @@ -902,7 +894,6 @@ check_and_process_move (bool *change_p, if (! REG_P (dreg) || ! REG_P (sreg)) return false; sclass = dclass = NO_REGS; - dreg = get_equiv_substitution (dreg); if (REG_P (dreg)) dclass = get_reg_class (REGNO (dreg)); if (dclass == ALL_REGS) @@ -916,7 +907,6 @@ check_and_process_move (bool *change_p, return false; sreg_mode = GET_MODE (sreg); old_sreg = sreg; - sreg = get_equiv_substitution (sreg); if (REG_P (sreg)) sclass = get_reg_class (REGNO (sreg)); if (sclass == ALL_REGS) @@ -2693,6 +2683,24 @@ emit_inc (enum reg_class new_rclass, rtx return result; } +/* Return true if the current move insn does not need processing as we + already know that it satisfies its constraints. */ +static bool +simple_move_p (void) +{ + rtx dest, src; + enum reg_class dclass, sclass; + + lra_assert (curr_insn_set != NULL_RTX); + dest = SET_DEST (curr_insn_set); + src = SET_SRC (curr_insn_set); + return ((dclass = get_op_class (dest)) != NO_REGS + && (sclass = get_op_class (src)) != NO_REGS + /* The backend guarantees that register moves of cost 2 + never need reloads. */ + && targetm.register_move_cost (GET_MODE (src), dclass, sclass) == 2); + } + /* Swap operands NOP and NOP + 1. */ static inline void swap_operands (int nop) @@ -2736,15 +2744,13 @@ curr_insn_transform (void) int max_regno_before; int reused_alternative_num; + curr_insn_set = single_set (curr_insn); + if (curr_insn_set != NULL_RTX && simple_move_p ()) + return false; + no_input_reloads_p = no_output_reloads_p = false; goal_alt_number = -1; - change_p = sec_mem_p = false; - curr_insn_set = single_set (curr_insn); - if (curr_insn_set != NULL_RTX - && check_and_process_move (&change_p, &sec_mem_p)) - return change_p; - /* JUMP_INSNs and CALL_INSNs are not allowed to have any output reloads; neither are insns that SET cc0. Insns that use CC0 are not allowed to have any input reloads. */ @@ -2839,6 +2845,10 @@ curr_insn_transform (void) we chose previously may no longer be valid. */ lra_set_used_insn_alternative (curr_insn, -1); + if (curr_insn_set != NULL_RTX + && check_and_process_move (&change_p, &sec_mem_p)) + return change_p; + try_swapped: reused_alternative_num = curr_id->used_insn_alternative;