diff mbox

IRA patch: move clobbers downwards

Message ID 4C3C34B4.3070309@codesourcery.com
State New
Headers show

Commit Message

Bernd Schmidt July 13, 2010, 9:41 a.m. UTC
This is an older patch I'm resubmitting.  The scheduler can move reg
clobbers away from the insns for which they were generated, generating a
window during which IRA thinks the register is live when it isn't.  This
can cause unnecessary conflicts.

The real problem here is that IRA does a backwards scan; just as in
peep2, it would probably be better to use a forwards scan using REG_DEAD
notes.  However, I'm told it's expensive to create those.

Bootstrapped and regression tested on i686-linux, also regression tested
on ARM.  Ok?


Bernd
* ira-lives.c (process_bb_node_lives): Move clobber insns downwards
	as much as possible before analyzing a basic block.

Comments

Steven Bosscher July 13, 2010, 9:59 a.m. UTC | #1
On Tue, Jul 13, 2010 at 11:41 AM, Bernd Schmidt <bernds@codesourcery.com> wrote:
> This is an older patch I'm resubmitting.  The scheduler can move reg
> clobbers away from the insns for which they were generated, generating a
> window during which IRA thinks the register is live when it isn't.  This
> can cause unnecessary conflicts.
>
> The real problem here is that IRA does a backwards scan; just as in
> peep2, it would probably be better to use a forwards scan using REG_DEAD
> notes.  However, I'm told it's expensive to create those.

A backward scan is what practically every compiler (and every text
book) except GCC uses, so that can't really be the "real problem".

I'd say the real problem is that the clobbers are moved, or generated
at all for registers that are not live anyway. Are those clobbers
necessary? Can they be removed instead?

Ciao!
Steven
Bernd Schmidt July 13, 2010, 10:01 a.m. UTC | #2
On 07/13/2010 11:59 AM, Steven Bosscher wrote:
> On Tue, Jul 13, 2010 at 11:41 AM, Bernd Schmidt <bernds@codesourcery.com> wrote:
>> This is an older patch I'm resubmitting.  The scheduler can move reg
>> clobbers away from the insns for which they were generated, generating a
>> window during which IRA thinks the register is live when it isn't.  This
>> can cause unnecessary conflicts.
>>
>> The real problem here is that IRA does a backwards scan; just as in
>> peep2, it would probably be better to use a forwards scan using REG_DEAD
>> notes.  However, I'm told it's expensive to create those.
> 
> A backward scan is what practically every compiler (and every text
> book) except GCC uses, so that can't really be the "real problem".
> 
> I'd say the real problem is that the clobbers are moved, or generated
> at all for registers that are not live anyway. Are those clobbers
> necessary? Can they be removed instead?

(clobber (reg:DI))
(set (low half))
(set (high half))

is a common pattern in GCC to limit lifetimes of DImode pseudos.  Maybe
the text books ignore this problem, it wouldn't be the only one.


Bernd
Steven Bosscher July 13, 2010, 10:15 a.m. UTC | #3
On Tue, Jul 13, 2010 at 12:01 PM, Bernd Schmidt <bernds@codesourcery.com> wrote:
> On 07/13/2010 11:59 AM, Steven Bosscher wrote:
>> On Tue, Jul 13, 2010 at 11:41 AM, Bernd Schmidt <bernds@codesourcery.com> wrote:
>>> This is an older patch I'm resubmitting.  The scheduler can move reg
>>> clobbers away from the insns for which they were generated, generating a
>>> window during which IRA thinks the register is live when it isn't.  This
>>> can cause unnecessary conflicts.
>>>
>>> The real problem here is that IRA does a backwards scan; just as in
>>> peep2, it would probably be better to use a forwards scan using REG_DEAD
>>> notes.  However, I'm told it's expensive to create those.
>>
>> A backward scan is what practically every compiler (and every text
>> book) except GCC uses, so that can't really be the "real problem".
>>
>> I'd say the real problem is that the clobbers are moved, or generated
>> at all for registers that are not live anyway. Are those clobbers
>> necessary? Can they be removed instead?
>
> (clobber (reg:DI))
> (set (low half))
> (set (high half))
>
> is a common pattern in GCC to limit lifetimes of DImode pseudos.  Maybe
> the text books ignore this problem, it wouldn't be the only one.

Most text books do, but most compilers do not.

Anyway, it seems to me that the scheduler should not move the clobber
up for a case like this, and that this is the real problem. Perhaps
the scheduler can be taught to give clobbers a low-enough priority to
schedule them no earlier than the first instruction that depends on it
(assuming the scheduler creates dependencies of the SETs on the
CLOBBER...?).

Ciao!
Steven
Bernd Schmidt July 13, 2010, 10:21 a.m. UTC | #4
On 07/13/2010 12:15 PM, Steven Bosscher wrote:
> Anyway, it seems to me that the scheduler should not move the clobber
> up for a case like this, and that this is the real problem. Perhaps
> the scheduler can be taught to give clobbers a low-enough priority to
> schedule them no earlier than the first instruction that depends on it
> (assuming the scheduler creates dependencies of the SETs on the
> CLOBBER...?).

I looked at that.  The problem is that you need to schedule the clobber
to make the instruction that follows it ready, so you'd want to schedule
them as soon as possible, really.  I tried putting them in a little
queue on the side and emitting them only once they became necessary
(i.e. we're about to schedule an insn that sets the reg), but that got
quite ugly and didn't work very well.  Hence, the IRA patch.

If you have a nice clean scheduler patch, feel free to post it.


Bernd
Vladimir Makarov July 13, 2010, 3:29 p.m. UTC | #5
Bernd Schmidt wrote:
> This is an older patch I'm resubmitting.  The scheduler can move reg
> clobbers away from the insns for which they were generated, generating a
> window during which IRA thinks the register is live when it isn't.  This
> can cause unnecessary conflicts.
>
>   
I am not sure that this is the right place to fix this problem.  Common 
sense is saying me that it should be solved in the first place (insn 
scheduling).

Moreover I am not sure that this problem is not solved already with 
-fsched-pressure.  If it is not it could be easily solved by using the 
infrastructure for -fsched-pressure.
> The real problem here is that IRA does a backwards scan; just as in
> peep2, it would probably be better to use a forwards scan using REG_DEAD
> notes.  However, I'm told it's expensive to create those.
>
>   
Originally, it was done on the forward scan in IRA.  As I know for many 
people the final goal was to remove REG_DEAD notes.  I was not in a 
hurry to implement the backward scan because the major problem of 
removing REG_DEAD is the reload pass which uses it.  But Richard 
Sandiford made a patch to use a backward pass.  We invested a lot of 
efforts to solve all (non-trivial) problems of the backward pass patch.  
So I think it is quite unreasonable to switch back to the forward pass.
> Bootstrapped and regression tested on i686-linux, also regression tested
> on ARM.  Ok?
>
>   

Bernd, could you check please that -fsched-pressure solves the problem.
Jeff Law July 13, 2010, 4:19 p.m. UTC | #6
On 07/13/10 09:29, Vladimir Makarov wrote:
> Bernd Schmidt wrote:
>> This is an older patch I'm resubmitting.  The scheduler can move reg
>> clobbers away from the insns for which they were generated, generating a
>> window during which IRA thinks the register is live when it isn't.  This
>> can cause unnecessary conflicts.
>>
> I am not sure that this is the right place to fix this problem.  
> Common sense is saying me that it should be solved in the first place 
> (insn scheduling).
>
> Moreover I am not sure that this problem is not solved already with 
> -fsched-pressure.  If it is not it could be easily solved by using the 
> infrastructure for -fsched-pressure.
You know, this sounds an awful lot like a discussion I had with David E. 
a while back.

ISTM the CLOBBER should be completely ignored by scheduling.  The 
clobber serves no purpose other than to tell the dataflow analyzers that 
a double-word value is going to be completely replaced.

Ideally, we'd just ignore the clobbers in the scheduler; however, that 
might be a PITA to implement.  But ISTM we ought to be able to shrink 
the clobber's lifetime in a single pass over the insns after the 
scheduler has run.

Or we could build a real pressure reduction pass using much of the 
scheduler's infrastructure.

>> The real problem here is that IRA does a backwards scan; just as in
>> peep2, it would probably be better to use a forwards scan using REG_DEAD
>> notes.  However, I'm told it's expensive to create those.
>>
> Originally, it was done on the forward scan in IRA.  As I know for 
> many people the final goal was to remove REG_DEAD notes.  I was not in 
> a hurry to implement the backward scan because the major problem of 
> removing REG_DEAD is the reload pass which uses it.  But Richard 
> Sandiford made a patch to use a backward pass.  We invested a lot of 
> efforts to solve all (non-trivial) problems of the backward pass 
> patch.  So I think it is quite unreasonable to switch back to the 
> forward pass.
Interestingly enough, reload's use of REG_DEAD notes caused me some 
major grief recently.    I'll spare everyone the details.

Jeff
Bernd Schmidt July 13, 2010, 4:23 p.m. UTC | #7
On 07/13/2010 06:19 PM, Jeff Law wrote:

> Ideally, we'd just ignore the clobbers in the scheduler; however, that
> might be a PITA to implement.  But ISTM we ought to be able to shrink
> the clobber's lifetime in a single pass over the insns after the
> scheduler has run.

Well, that's exactly what my patch does.

Ignoring them isn't going to work, I think - where to place them depends
on the scheduling of other insns, but ideally the scheduling of other
insns won't depend on the clobbers.

> Or we could build a real pressure reduction pass using much of the
> scheduler's infrastructure.

Possibly something I have to do anyway for some of the PRs on my list.

I haven't tried if -fsched-pressure fixes this, but it seemed to help
with some other issues I've seen recently.  Can we make this the default?


Bernd
Vladimir Makarov July 13, 2010, 4:33 p.m. UTC | #8
Bernd Schmidt wrote:
> On 07/13/2010 06:19 PM, Jeff Law wrote:
>
>   
>
> Possibly something I have to do anyway for some of the PRs on my list.
>
> I haven't tried if -fsched-pressure fixes this, but it seemed to help
> with some other issues I've seen recently.  Can we make this the default?
>
>   
As I remember there were some issues for -fsched-pressure on powerpc.  I 
proposed to decide to make it default for specific targets to target 
maintainers.  -fsched-presurre could slow down the compiler a lot (like 
1-2%).

So I can not say that we should make it default for -O2 or -O3 without 
extensive benchmarking a lot of targets.   But I have no time or 
resources for this.
Jeff Law July 13, 2010, 4:40 p.m. UTC | #9
On 07/13/10 10:23, Bernd Schmidt wrote:
> On 07/13/2010 06:19 PM, Jeff Law wrote:
>
>    
>> Ideally, we'd just ignore the clobbers in the scheduler; however, that
>> might be a PITA to implement.  But ISTM we ought to be able to shrink
>> the clobber's lifetime in a single pass over the insns after the
>> scheduler has run.
>>      
> Well, that's exactly what my patch does.
>    
Perhaps the thing to do is move it out of IRA and into sched-whatever.c :-)

> Ignoring them isn't going to work, I think - where to place them depends
> on the scheduling of other insns, but ideally the scheduling of other
> insns won't depend on the clobbers.
>    
The problem (as I see it) is the clobber has to issue prior to the insns 
with write-write dependencies.  Finding a way to issue the clobber, then 
immediately issue the write-write dependent insns sounds ugly to 
implement.  So the mental picture I was working with was to remove the 
clobber from the stream, schedule normally, then reinsert the clobber 
after scheduling was complete.  There's some bookkeeping to do when we 
pull the clobber out of the stream, but in the absence of inter-block 
motions, this shouldn't be terribly hard.




>> Or we could build a real pressure reduction pass using much of the
>> scheduler's infrastructure.
>>      
> Possibly something I have to do anyway for some of the PRs on my list.
>    
I keep pondering the dependency analysis and scheduler's list handling 
as the way to solve lifetime shrinkage (probably to be run in response 
to the existence of unallocated pseudos).    My gut tells me 90% of what 
we need is just sitting there.

Jeff
Bernd Schmidt July 14, 2010, 9:44 p.m. UTC | #10
On 07/14/2010 09:23 PM, Jeff Law wrote:

> When I first looked at this, I kept hearing a little voice saying I'd
> seen this discussion before.  So I did some searching today :-)
> 
> The first thread starts here:
> 
> http://gcc.gnu.org/ml/gcc-patches/2002-03/msg01651.html

Lovely :-)

>> The real problem here is that IRA does a backwards scan; just as in
>> peep2, it would probably be better to use a forwards scan using REG_DEAD
>> notes.  However, I'm told it's expensive to create those.
>>   
> Just so I'm clear, we're doing a backwards life analysis.  We see one or
> more assignments to the components of a double-word pseudo which makes
> the pseudo live.  The pseudo is assumed to still be live until we find
> its clobber thus making the pseudo live between the clobber and the
> component assignments, even though it holds no useful value.  Right?

That's what happens.  SETs of DF_REF_PARTIAL things are ignored for the
purpose of tracking liveness.
With a forward scan we'd ignore the clobbers and mark the reg live on
the first partial set, getting the right answer.

> I guess we could compensate for this with some special clobber handling
> when computing lifetimes, but just moving the clobbers after sched might
> be the best solution.

Since we seem to have decided to use my IRA subword patch, the issue is
a bit moot.  We can still run into trouble for

(clobber (reg:SI x))
(set (zero_extract (reg x) (low half)) (something))
(set (zero_extract (reg x) (high half)) (something else))

which IIRC happens for CHImode and maybe some other cases.  It's
probably no longer that important.


Bernd
Jeff Law July 14, 2010, 10:32 p.m. UTC | #11
On 07/14/10 15:44, Bernd Schmidt wrote:
> On 07/14/2010 09:23 PM, Jeff Law wrote:
>
>    
>> When I first looked at this, I kept hearing a little voice saying I'd
>> seen this discussion before.  So I did some searching today :-)
>>
>> The first thread starts here:
>>
>> http://gcc.gnu.org/ml/gcc-patches/2002-03/msg01651.html
>>      
> Lovely :-)
>    
Yea, we've been kicking this around for 8+ years.

David E.-- with Bernd's double-word IRA improvements you might be able 
to remove the PPC port hack referenced in that thread.


>    
>> I guess we could compensate for this with some special clobber handling
>> when computing lifetimes, but just moving the clobbers after sched might
>> be the best solution.
>>      
> Since we seem to have decided to use my IRA subword patch, the issue is
> a bit moot.  We can still run into trouble for
>
> (clobber (reg:SI x))
> (set (zero_extract (reg x) (low half)) (something))
> (set (zero_extract (reg x) (high half)) (something else))
>
> which IIRC happens for CHImode and maybe some other cases.  It's
> probably no longer that important.
>    
True.  There may be other things running post-sched which suffer from 
this issue, but I suspect if there are, they are trivial compared to 
getting this handled in IRA.

Jeff
Bernd Schmidt July 16, 2010, 12:36 a.m. UTC | #12
On 07/13/2010 06:40 PM, Jeff Law wrote:
> On 07/13/10 10:23, Bernd Schmidt wrote:
>> On 07/13/2010 06:19 PM, Jeff Law wrote:
>>
>>   
>>> Ideally, we'd just ignore the clobbers in the scheduler; however, that
>>> might be a PITA to implement.  But ISTM we ought to be able to shrink
>>> the clobber's lifetime in a single pass over the insns after the
>>> scheduler has run.
>>>      
>> Well, that's exactly what my patch does.
>>    
> Perhaps the thing to do is move it out of IRA and into sched-whatever.c :-)

I actually tried, and it's not that easy :-(  To avoid duplicating it
between the various sched-xxx.c files you'd want to put it into
schedule_block in haifa-sched.c, but then you get aborts when freeing
the dependency structures because it doesn't like it when you move
clobbers past debug_insns...

Really, the IRA thing is the simplest way to fix it (short of making IRA
do a forward scan), and it _is_ a long-standing problem.  Do we want to
fix it?


Bernd
Jeff Law July 16, 2010, 6:36 p.m. UTC | #13
On 07/15/10 18:36, Bernd Schmidt wrote:
> On 07/13/2010 06:40 PM, Jeff Law wrote:
>    
>> On 07/13/10 10:23, Bernd Schmidt wrote:
>>      
>>> On 07/13/2010 06:19 PM, Jeff Law wrote:
>>>
>>>
>>>        
>>>> Ideally, we'd just ignore the clobbers in the scheduler; however, that
>>>> might be a PITA to implement.  But ISTM we ought to be able to shrink
>>>> the clobber's lifetime in a single pass over the insns after the
>>>> scheduler has run.
>>>>
>>>>          
>>> Well, that's exactly what my patch does.
>>>
>>>        
>> Perhaps the thing to do is move it out of IRA and into sched-whatever.c :-)
>>      
> I actually tried, and it's not that easy :-(  To avoid duplicating it
> between the various sched-xxx.c files you'd want to put it into
> schedule_block in haifa-sched.c, but then you get aborts when freeing
> the dependency structures because it doesn't like it when you move
> clobbers past debug_insns...
>
> Really, the IRA thing is the simplest way to fix it (short of making IRA
> do a forward scan), and it _is_ a long-standing problem.  Do we want to
> fix it?
>    
BUt as you mentioned, is there any benefit to sinking the clobbers after 
your IRA change to more accurately track conflicts in the subwords of 
double-word objects?  I guess it would help with things like XFmode in 
GPRs on x86 or any quad-word modes, but are they common enough to matter.

jeff
diff mbox

Patch

Index: ira-lives.c
===================================================================
--- ira-lives.c	(revision 161371)
+++ ira-lives.c	(working copy)
@@ -877,7 +877,7 @@  process_bb_node_lives (ira_loop_tree_nod
   int i, freq;
   unsigned int j;
   basic_block bb;
-  rtx insn;
+  rtx insn, prev;
   bitmap_iterator bi;
   bitmap reg_live_out;
   unsigned int px;
@@ -925,6 +925,42 @@  process_bb_node_lives (ira_loop_tree_nod
       /* Invalidate all allocno_saved_at_call entries.  */
       last_call_num++;
 
+      /* Previous passes such as the first scheduling pass may have lengthened
+	 the lifetime of pseudos by moving CLOBBER insns upwards.  Undo this
+	 here.  */
+      FOR_BB_INSNS_REVERSE_SAFE (bb, insn, prev)
+	{
+	  rtx pat, next, reg;
+	  if (!NONJUMP_INSN_P (insn) || insn == BB_END (bb))
+	    continue;
+	  pat = PATTERN (insn);
+	  if (GET_CODE (pat) != CLOBBER)
+	    continue;
+	  reg = XEXP (pat, 0);
+	  if (!REG_P (reg) || REGNO (reg) < FIRST_PSEUDO_REGISTER)
+	    continue;
+	  next = insn;
+	  while (next != BB_END (bb))
+	    {
+	      next = NEXT_INSN (next);
+	      if (!NONDEBUG_INSN_P (next))
+		continue;
+	      if (reg_mentioned_p (XEXP (pat, 0), PATTERN (next)))
+		break;
+	    }
+	  if (next == NEXT_INSN (insn))
+	    continue;
+
+	  NEXT_INSN (PREV_INSN (insn)) = NEXT_INSN (insn);
+	  PREV_INSN (NEXT_INSN (insn)) = PREV_INSN (insn);
+
+	  PREV_INSN (insn) = PREV_INSN (next);
+	  NEXT_INSN (insn) = next;
+
+	  NEXT_INSN (PREV_INSN (next)) = insn;
+	  PREV_INSN (next) = insn;
+	}
+
       /* Scan the code of this basic block, noting which allocnos and
 	 hard regs are born or die.