Message ID | 4C2549D4.10608@codesourcery.com |
---|---|
State | New |
Headers | show |
> The changes in the lower-subreg pass are motivated by nonsensical > transformations like the following: > > -(insn 205 204 26 4 (set (reg:DI 68) > - (reg:DI 178)) 49 (REG_DEAD (reg:DI 178) > +(insn 261 204 262 4 (clobber (reg:DI 193)) > + > +(insn 262 261 263 4 (set (subreg:SI (reg:DI 193) 0) (reg:SI 191)) > + > +(insn 263 262 264 4 (set (subreg:SI (reg:DI 193) 4) (reg:SI 192)) > + > +(insn 264 263 26 4 (set (reg:DI 68) (reg:DI 193)) > > In resolve_simple_move, we generate a new DImode temporary pseudo rather > than storing into the DImode pseudo which is the actual destination. > This, I think, is due to confusion about the use of can_decompose_p; it > makes sense to use it in this context only for hard regs. For pseudos, > it tests the non_decomposable_context bitmap, which is irrelevant here. can_decompose_p is only used in this context though. But I agree that this looks superfluous. Ian, do you have any insight here?
Eric Botcazou <ebotcazou@adacore.com> writes: >> The changes in the lower-subreg pass are motivated by nonsensical >> transformations like the following: >> >> -(insn 205 204 26 4 (set (reg:DI 68) >> - (reg:DI 178)) 49 (REG_DEAD (reg:DI 178) >> +(insn 261 204 262 4 (clobber (reg:DI 193)) >> + >> +(insn 262 261 263 4 (set (subreg:SI (reg:DI 193) 0) (reg:SI 191)) >> + >> +(insn 263 262 264 4 (set (subreg:SI (reg:DI 193) 4) (reg:SI 192)) >> + >> +(insn 264 263 26 4 (set (reg:DI 68) (reg:DI 193)) >> >> In resolve_simple_move, we generate a new DImode temporary pseudo rather >> than storing into the DImode pseudo which is the actual destination. >> This, I think, is due to confusion about the use of can_decompose_p; it >> makes sense to use it in this context only for hard regs. For pseudos, >> it tests the non_decomposable_context bitmap, which is irrelevant here. > > can_decompose_p is only used in this context though. But I agree that this > looks superfluous. > > Ian, do you have any insight here? It's superfluous for a simple example like this one, but it's not clearly superfluous in all cases. If register 68 here should wind up being allocated to a floating point register which can't be SUBREG'ed, then my concern is that taking a SUBREG is going to make it harder to allocate the register correctly. In other words, the bitmap is not irrelevant; it's a proxy for non-SImode register classes. I don't know whether this a concern with IRA; can IRA allocate a DImode pseudo to a floating point register if there are SImode SUBREGs of it? The old register allocator wouldn't, and you could wind up with a bunch of reloads. (I actually had patches for the old register allocator which tracked subregs independently to improve allocation for cases like this. I delayed them to wait for the dataflow work, and then they became irrelevant with IRA.) Ian
On 06/28/2010 08:05 PM, Ian Lance Taylor wrote: > It's superfluous for a simple example like this one, but it's not > clearly superfluous in all cases. If register 68 here should wind up > being allocated to a floating point register which can't be SUBREG'ed, > then my concern is that taking a SUBREG is going to make it harder to > allocate the register correctly. In other words, the bitmap is not > irrelevant; it's a proxy for non-SImode register classes. That however applies to only one case in which bits can be set in that bitmap, the SUBREG case in find_decomposable_subregs. The normal case, I think, is that these bits come from occurrences of the reg that are in NOT_SIMPLE_MOVE, and I think this is irrelevant here. Do we need to keep track of two different bitmaps? > I don't know > whether this a concern with IRA; can IRA allocate a DImode pseudo to a > floating point register if there are SImode SUBREGs of it? The old > register allocator wouldn't, and you could wind up with a bunch of > reloads. I don't know whether IRA would do that or whether it can determine the costs correctly; I can't find anything right now which would deal with this case. Bernd
On 06/28/10 12:05, Ian Lance Taylor wrote: > I don't know > whether this a concern with IRA; can IRA allocate a DImode pseudo to a > floating point register if there are SImode SUBREGs of it? The old > register allocator wouldn't, and you could wind up with a bunch of > reloads. > I would expect that unless the target explicitly forbids such an allocation via CANNOT_CHANGE_MODE_CLASS, then it ought to work. In fact, this is a normal code generation situation for the PA family to support xmpyu -- if it had been broken I suspect we would have heard about it by now as anything doing an integer multiply would suddenly be using libcalls instead of xmpyu. Jeff
On 06/28/10 12:20, Bernd Schmidt wrote: > I don't know >> whether this a concern with IRA; can IRA allocate a DImode pseudo to a >> floating point register if there are SImode SUBREGs of it? The old >> register allocator wouldn't, and you could wind up with a bunch of >> reloads. >> > I don't know whether IRA would do that or whether it can determine the > costs correctly; I can't find anything right now which would deal with > this case. > It should be merely a question of consulting CANNOT_CHANGE_MODE_CLASS (for correctness) and costing models. It shouldn't really need special handling. jeff
Bernd Schmidt <bernds@codesourcery.com> writes: > On 06/28/2010 08:05 PM, Ian Lance Taylor wrote: > >> It's superfluous for a simple example like this one, but it's not >> clearly superfluous in all cases. If register 68 here should wind up >> being allocated to a floating point register which can't be SUBREG'ed, >> then my concern is that taking a SUBREG is going to make it harder to >> allocate the register correctly. In other words, the bitmap is not >> irrelevant; it's a proxy for non-SImode register classes. > > That however applies to only one case in which bits can be set in that > bitmap, the SUBREG case in find_decomposable_subregs. The normal case, > I think, is that these bits come from occurrences of the reg that are in > NOT_SIMPLE_MOVE, and I think this is irrelevant here. Do we need to > keep track of two different bitmaps? As far as I can see, it's not irrelevant there, because the pseudo might be used somewhere else in a case which is NOT_SIMPLE_MOVE. E.g., a floating point addition. What am I missing? Ian
On 06/28/2010 08:57 PM, Ian Lance Taylor wrote: > As far as I can see, it's not irrelevant there, because the pseudo might > be used somewhere else in a case which is NOT_SIMPLE_MOVE. E.g., a > floating point addition. What am I missing? Do you want to also verify that the mode of the final destination pseudo is DImode (i.e. orig_mode == dest_mode) before omitting the temporary? In that case it would be quite unlikely for it to be used in a floating point addition, and I think it would still avoid the extra temporary in the vast majority of the cases? Bernd
On 06/28/2010 11:33 AM, Jeff Law wrote: > On 06/28/10 12:05, Ian Lance Taylor wrote: >> I don't know >> whether this a concern with IRA; can IRA allocate a DImode pseudo to a >> floating point register if there are SImode SUBREGs of it? The old >> register allocator wouldn't, and you could wind up with a bunch of >> reloads. >> > I would expect that unless the target explicitly forbids such an > allocation via CANNOT_CHANGE_MODE_CLASS, then it ought to work. In > fact, this is a normal code generation situation for the PA family to > support xmpyu -- if it had been broken I suspect we would have heard > about it by now as anything doing an integer multiply would suddenly be > using libcalls instead of xmpyu. CANNOT_CHANGE_MODE_CLASS is the normal state of affairs for several ports doing int<->fp conversion. Where DImode is the only valid integer mode, and the single-precision load instruction does some bit swizzling (to place the SFmode value into register format) so that 32-bit integer values cannot be loaded or manipulated. This is true of some ppc variants and alpha, at least. As such, this is sort-of a non-answer, Jeff. r~
Bernd Schmidt <bernds@codesourcery.com> writes: > On 06/28/2010 08:57 PM, Ian Lance Taylor wrote: >> As far as I can see, it's not irrelevant there, because the pseudo might >> be used somewhere else in a case which is NOT_SIMPLE_MOVE. E.g., a >> floating point addition. What am I missing? > > Do you want to also verify that the mode of the final destination pseudo > is DImode (i.e. orig_mode == dest_mode) before omitting the temporary? > In that case it would be quite unlikely for it to be used in a floating > point addition, and I think it would still avoid the extra temporary in > the vast majority of the cases? I see what you're getting at. It's unlikely, but it does happen in code which examines the bits of a floating point number via a union or using -fno-strict-aliasing and a type cast. So if that code works correctly if perhaps less efficiently, then I agree that checking that the mode class is MODE_INT should work OK here. Ian
On 06/29/2010 01:03 AM, Ian Lance Taylor wrote: > I see what you're getting at. It's unlikely, but it does happen in code > which examines the bits of a floating point number via a union or using > -fno-strict-aliasing and a type cast. So if that code works correctly > if perhaps less efficiently, then I agree that checking that the mode > class is MODE_INT should work OK here. Hmm. Looking at this again now, aren't we already doing that? (resolve_reg_p checks whether DEST is a CONCAT) if (!can_decompose_p (dest) || (side_effects_p (dest) && !pushing) || (!SCALAR_INT_MODE_P (dest_mode) && !resolve_reg_p (dest) && !resolve_subreg_p (dest))) Bernd
Index: ira-lives.c =================================================================== --- ira-lives.c (revision 161371) +++ ira-lives.c (working copy) @@ -877,7 +877,7 @@ process_bb_node_lives (ira_loop_tree_nod int i, freq; unsigned int j; basic_block bb; - rtx insn; + rtx insn, prev; bitmap_iterator bi; bitmap reg_live_out; unsigned int px; @@ -925,6 +925,42 @@ process_bb_node_lives (ira_loop_tree_nod /* Invalidate all allocno_saved_at_call entries. */ last_call_num++; + /* Previous passes such as the first scheduling pass may have lengthened + the lifetime of pseudos by moving CLOBBER insns upwards. Undo this + here. */ + FOR_BB_INSNS_REVERSE_SAFE (bb, insn, prev) + { + rtx pat, next, reg; + if (!NONJUMP_INSN_P (insn) || insn == BB_END (bb)) + continue; + pat = PATTERN (insn); + if (GET_CODE (pat) != CLOBBER) + continue; + reg = XEXP (pat, 0); + if (!REG_P (reg) || REGNO (reg) < FIRST_PSEUDO_REGISTER) + continue; + next = insn; + while (next != BB_END (bb)) + { + next = NEXT_INSN (next); + if (!NONDEBUG_INSN_P (next)) + continue; + if (reg_mentioned_p (XEXP (pat, 0), PATTERN (next))) + break; + } + if (next == NEXT_INSN (insn)) + continue; + + NEXT_INSN (PREV_INSN (insn)) = NEXT_INSN (insn); + PREV_INSN (NEXT_INSN (insn)) = PREV_INSN (insn); + + PREV_INSN (insn) = PREV_INSN (next); + NEXT_INSN (insn) = next; + + NEXT_INSN (PREV_INSN (next)) = insn; + PREV_INSN (next) = insn; + } + /* Scan the code of this basic block, noting which allocnos and hard regs are born or die. Index: lower-subreg.c =================================================================== --- lower-subreg.c (revision 161371) +++ lower-subreg.c (working copy) @@ -717,7 +717,7 @@ resolve_simple_move (rtx set, rtx insn) /* If SRC is a register which we can't decompose, or has side effects, we need to move via a temporary register. */ - if (!can_decompose_p (src) + if ((REG_P (src) && HARD_REGISTER_P (src) && !can_decompose_p (src)) || side_effects_p (src) || GET_CODE (src) == ASM_OPERANDS) { @@ -737,7 +737,7 @@ resolve_simple_move (rtx set, rtx insn) dest_mode = orig_mode; pushing = push_operand (dest, dest_mode); - if (!can_decompose_p (dest) + if ((REG_P (dest) && HARD_REGISTER_P (dest) && !can_decompose_p (dest)) || (side_effects_p (dest) && !pushing) || (!SCALAR_INT_MODE_P (dest_mode) && !resolve_reg_p (dest)