diff mbox

df: Keep return address register undefined until epilogue_completed

Message ID c6b7f1d6bfb232511d9e0a3467bd87db65e27b6b.1472488939.git.segher@kernel.crashing.org
State New
Headers show

Commit Message

Segher Boessenkool Aug. 29, 2016, 4:50 p.m. UTC
For separate shrink-wrapping we need to find out which basic blocks
need what components set up by a prologue, so that we can move those
prologue pieces deeper into the function, so that those pieces are
executed less frequently, improving performance.

To do this, target code will normally look at the LIVE info: a block
needs a register set up if that register is in the IN, GEN, or KILL
sets.  This works fine for the callee-saved registers, since those
are not in entry_block_defs, but it does not work for the rs6000 link
register, because df_get_entry_block_def_set unconditinally adds the
register that is INCOMING_RETURN_ADDR_RTX (if it is a register).

This patch changes that so that that def is only added after
epilogue_completed.

With that, in the rs6000 backend we can use the same IN+GEN+KILL
condition to see whether LR is used (in the current patch set, only
KILL is used, which isn't correct; the backend can write to LR to
temporarily hold other values; a crazy idea, on modern hardware anyway,
but it does regardless).

Does this work on all other targets?  Should anything else be done?


Segher


2016-08-29  Segher Boessenkool  <segher@kernel.crashing.org>

	* df-scan.c (df_get_entry_block_def_set): Do not add the
	INCOMING_RETURN_ADDR_RTX register until epilogue_completed.

---
 gcc/df-scan.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

Comments

Steven Bosscher Aug. 29, 2016, 8:19 p.m. UTC | #1
On Mon, Aug 29, 2016 at 6:50 PM, Segher Boessenkool wrote:
> This patch changes that so that that def is only added after
> epilogue_completed.
...
> Does this work on all other targets?

Guessing: not for targets where INCOMING_RETURN_ADDR_RTX is the stack register?
That'd be at least h8300, rl78, and rx.

Ciao!
Steven
Segher Boessenkool Aug. 29, 2016, 8:41 p.m. UTC | #2
On Mon, Aug 29, 2016 at 10:19:25PM +0200, Steven Bosscher wrote:
> On Mon, Aug 29, 2016 at 6:50 PM, Segher Boessenkool wrote:
> > This patch changes that so that that def is only added after
> > epilogue_completed.
> ...
> > Does this work on all other targets?
> 
> Guessing: not for targets where INCOMING_RETURN_ADDR_RTX is the stack register?
> That'd be at least h8300, rl78, and rx.

The stack pointer is always added (earlier in this function):

  /* The always important stack pointer.  */
  bitmap_set_bit (entry_block_defs, STACK_POINTER_REGNUM);

so that isn't it.

History suggests it was a fix (or workaround) for a problem on ia64,
but I might well have that wrong.


Segher
Jeff Law Sept. 9, 2016, 10:19 p.m. UTC | #3
On 08/29/2016 02:19 PM, Steven Bosscher wrote:
> On Mon, Aug 29, 2016 at 6:50 PM, Segher Boessenkool wrote:
>> This patch changes that so that that def is only added after
>> epilogue_completed.
> ...
>> Does this work on all other targets?
>
> Guessing: not for targets where INCOMING_RETURN_ADDR_RTX is the stack register?
> That'd be at least h8300, rl78, and rx.
On those ports INCOMING_RETURN_ADDR_RTX is (mem (sp)), not plain (sp). 
So they're not materially different than x86.


Jeff
Jeff Law Sept. 9, 2016, 10:40 p.m. UTC | #4
On 08/29/2016 10:50 AM, Segher Boessenkool wrote:
> For separate shrink-wrapping we need to find out which basic blocks
> need what components set up by a prologue, so that we can move those
> prologue pieces deeper into the function, so that those pieces are
> executed less frequently, improving performance.
>
> To do this, target code will normally look at the LIVE info: a block
> needs a register set up if that register is in the IN, GEN, or KILL
> sets.  This works fine for the callee-saved registers, since those
> are not in entry_block_defs, but it does not work for the rs6000 link
> register, because df_get_entry_block_def_set unconditinally adds the
> register that is INCOMING_RETURN_ADDR_RTX (if it is a register).
>
> This patch changes that so that that def is only added after
> epilogue_completed.
>
> With that, in the rs6000 backend we can use the same IN+GEN+KILL
> condition to see whether LR is used (in the current patch set, only
> KILL is used, which isn't correct; the backend can write to LR to
> temporarily hold other values; a crazy idea, on modern hardware anyway,
> but it does regardless).
>
> Does this work on all other targets?  Should anything else be done?
>
>
> Segher
>
>
> 2016-08-29  Segher Boessenkool  <segher@kernel.crashing.org>
>
> 	* df-scan.c (df_get_entry_block_def_set): Do not add the
> 	INCOMING_RETURN_ADDR_RTX register until epilogue_completed.
Right now the dataflow is conservatively correct WRT the return register.

If we made the change you want to make than the dataflow becomes overly 
optimistic about the range over which the return register is live prior 
to inserting the prologue/epilogue into the insn chain.

This seems unsafe to me.
jeff
Jeff Law Sept. 9, 2016, 10:42 p.m. UTC | #5
On 08/29/2016 02:41 PM, Segher Boessenkool wrote:
> On Mon, Aug 29, 2016 at 10:19:25PM +0200, Steven Bosscher wrote:
>> On Mon, Aug 29, 2016 at 6:50 PM, Segher Boessenkool wrote:
>>> This patch changes that so that that def is only added after
>>> epilogue_completed.
>> ...
>>> Does this work on all other targets?
>>
>> Guessing: not for targets where INCOMING_RETURN_ADDR_RTX is the stack register?
>> That'd be at least h8300, rl78, and rx.
>
> The stack pointer is always added (earlier in this function):
>
>   /* The always important stack pointer.  */
>   bitmap_set_bit (entry_block_defs, STACK_POINTER_REGNUM);
>
> so that isn't it.
>
> History suggests it was a fix (or workaround) for a problem on ia64,
> but I might well have that wrong.

It came in as part of a much larger change where Kenny was trying to 
compute the correct set of registers that are live at function entry.  I 
believe this is the earliest draft of the function in question:

http://permalink.gmane.org/gmane.comp.gcc.devel/75329

Jeff
Segher Boessenkool Sept. 10, 2016, 7:17 a.m. UTC | #6
On Fri, Sep 09, 2016 at 04:40:12PM -0600, Jeff Law wrote:
> Right now the dataflow is conservatively correct WRT the return register.

And conservatively incorrect wrt all other callee-saved regs!

> If we made the change you want to make than the dataflow becomes overly 
> optimistic about the range over which the return register is live prior 
> to inserting the prologue/epilogue into the insn chain.
> 
> This seems unsafe to me.

Yes, but so does the current situation.  And it all seems to work
nevertheless :-)


Segher
Jeff Law Sept. 12, 2016, 4:30 p.m. UTC | #7
On 09/10/2016 01:17 AM, Segher Boessenkool wrote:
> On Fri, Sep 09, 2016 at 04:40:12PM -0600, Jeff Law wrote:
>> Right now the dataflow is conservatively correct WRT the return register.
>
> And conservatively incorrect wrt all other callee-saved regs!
But prior to prologue/epilogue insertion it shouldn't matter.  In fact, 
explicit references to callee saved regs prior to register allocation 
has always been problematical.

I do think our handling of life information for callee-saved regs after 
insertion of the prologue/epilogue could be improved.

In a separate shrink wrapped world ISTM that we want to move to a model 
where the return insn itself is a use of the appropriate callee saved 
regs.  If we did that I suspect some of the hacks from the separate 
shrink wrapping kit would just "go away".

Attaching actual USEs to those insns is obviously problematical though 
from a recognition standpoint.  It'd probably have to be structured more 
like what we do with CALL_INSNs to mark registers used/set.


>
>> If we made the change you want to make than the dataflow becomes overly
>> optimistic about the range over which the return register is live prior
>> to inserting the prologue/epilogue into the insn chain.
>>
>> This seems unsafe to me.
>
> Yes, but so does the current situation.  And it all seems to work
> nevertheless :-)
But that doesn't make it right...

Jeff
Segher Boessenkool Sept. 14, 2016, 12:38 p.m. UTC | #8
On Mon, Sep 12, 2016 at 10:30:56AM -0600, Jeff Law wrote:
> On 09/10/2016 01:17 AM, Segher Boessenkool wrote:
> >On Fri, Sep 09, 2016 at 04:40:12PM -0600, Jeff Law wrote:
> >>Right now the dataflow is conservatively correct WRT the return register.
> >
> >And conservatively incorrect wrt all other callee-saved regs!
> But prior to prologue/epilogue insertion it shouldn't matter.  In fact, 
> explicit references to callee saved regs prior to register allocation 
> has always been problematical.
> 
> I do think our handling of life information for callee-saved regs after 
> insertion of the prologue/epilogue could be improved.

And before: it is not really incorrect for a program to access a callee-
saved register (with "register asm", say).  Not very useful except for
debugging, but not really incorrect either.  This all is handled correctly
for LR already, I'm not sure about LIVE though.  Not sure it matters either,
everything seems to stumble along just fine.

> In a separate shrink wrapped world ISTM that we want to move to a model 
> where the return insn itself is a use of the appropriate callee saved 
> regs.  If we did that I suspect some of the hacks from the separate 
> shrink wrapping kit would just "go away".

That doesn't help, sorry.  In fact the rs6000 port used to do just as you
say (for return patterns; not for simple_return patterns), and that only
hindered optimisation (the exit block already has an artificial use of
the link register, so it is not needed; the return being a parallel was
not recognised by various things).

> Attaching actual USEs to those insns is obviously problematical though 
> from a recognition standpoint.  It'd probably have to be structured more 
> like what we do with CALL_INSNs to mark registers used/set.

And it already is, if I understand you correctly :-)

> >>If we made the change you want to make than the dataflow becomes overly
> >>optimistic about the range over which the return register is live prior
> >>to inserting the prologue/epilogue into the insn chain.
> >>
> >>This seems unsafe to me.
> >
> >Yes, but so does the current situation.  And it all seems to work
> >nevertheless :-)
> But that doesn't make it right...

Yes, but neither is the current situation, for powerpc at least: the link
register should *not* be live at function entry, just like the situation
with other callee-save registers, it does not hold a value the current
function has any business with -- before prologue generation.

Now, for separate shrink-wrapping, I want to use the LIVE problem to
determine which basic blocks have which concerns, sorry, need which
components set up; to do that i compute IN+GEN+KILL for the registers
of those components, for each basic block (all this is target code).
This needs those registers *not* marked as live in the entry and exit
blocks, which I probably have to force manually anyway.  And the setting
this patch corrects gets in the way for that a bit, and since it is
incorrect in general anyway, I wanted to do something about it.

But I'll just hard force it off during separate shrink-wrapping, and
we'll leave this problem for another day.


Segher
diff mbox

Patch

diff --git a/gcc/df-scan.c b/gcc/df-scan.c
index 9cd647a..ea94f37 100644
--- a/gcc/df-scan.c
+++ b/gcc/df-scan.c
@@ -3567,7 +3567,7 @@  df_get_entry_block_def_set (bitmap entry_block_defs)
     }
 
 #ifdef INCOMING_RETURN_ADDR_RTX
-  if (REG_P (INCOMING_RETURN_ADDR_RTX))
+  if (REG_P (INCOMING_RETURN_ADDR_RTX) && epilogue_completed)
     bitmap_set_bit (entry_block_defs, REGNO (INCOMING_RETURN_ADDR_RTX));
 #endif