Patchwork [3/6] Allow jumps in epilogues

login
register
mail settings
Submitter Bernd Schmidt
Date March 23, 2011, 2:50 p.m.
Message ID <4D8A089D.7020507@codesourcery.com>
Download mbox | patch
Permalink /patch/88092/
State New
Headers show

Comments

Bernd Schmidt - March 23, 2011, 2:50 p.m.
dwarf2out has code that starts scanning from NOTE_INSN_EPILOGUE_BEG
until it finds the return jump. When there is common code in several
blocks ending in a return, we might want to share this, and in that case
it would be possible to encounter a simplejump rather than a returnjump.
This should be safe, and the following patch allows it.


Bernd
* cfgcleanup.c (flow_find_head_matching_sequence): Ignore
    	epilogue notes.
    	* df-problems.c (can_move_insns_across): Don't stop at epilogue
    	notes.
    	* dwarf2out.c (dwarf2out_cfi_begin_epilogue): Also allow a
    	simplejump to end the block.
Richard Henderson - March 23, 2011, 4:46 p.m.
On 03/23/2011 07:50 AM, Bernd Schmidt wrote:
> dwarf2out has code that starts scanning from NOTE_INSN_EPILOGUE_BEG
> until it finds the return jump. When there is common code in several
> blocks ending in a return, we might want to share this, and in that case
> it would be possible to encounter a simplejump rather than a returnjump.
> This should be safe, and the following patch allows it.

With no more code than this, I cannot believe you're generating correct
unwind info anymore.

It would be possible to handle code merging including epilogue blocks
if (and IMO only if) you track unwind state on a per-block basis, and
propagate this information around the CFG, finally linearizing this
when blocks are re-ordered for the last time before final.

At present, sadly, we assume steady-state for the unwind info except
before PROLOGUE_END and after EPILOGUE_BEG.


r~
Bernd Schmidt - March 23, 2011, 4:48 p.m.
On 03/23/2011 05:46 PM, Richard Henderson wrote:
> On 03/23/2011 07:50 AM, Bernd Schmidt wrote:
>> dwarf2out has code that starts scanning from NOTE_INSN_EPILOGUE_BEG
>> until it finds the return jump. When there is common code in several
>> blocks ending in a return, we might want to share this, and in that case
>> it would be possible to encounter a simplejump rather than a returnjump.
>> This should be safe, and the following patch allows it.
> 
> With no more code than this, I cannot believe you're generating correct
> unwind info anymore.

Why not? Are you worried about the code at the destination of the jump?
That should be preceded by another block falling through into it which
also has a NOTE_INSN_EPILOGUE_BEG.

If that isn't the problem you have in mind, what is and how can we test
for it?


Bernd
Richard Henderson - March 23, 2011, 5:19 p.m.
On 03/23/2011 09:48 AM, Bernd Schmidt wrote:
>> With no more code than this, I cannot believe you're generating correct
>> unwind info anymore.
> 
> Why not? Are you worried about the code at the destination of the jump?
> That should be preceded by another block falling through into it which
> also has a NOTE_INSN_EPILOGUE_BEG.
> 
> If that isn't the problem you have in mind, what is and how can we test
> for it?

	body
	body
	restore r1	XXX
	restore r2	XXX
	jmp L2		XXX

L1:	body		YYY
	body		YYY
	restore r2

L2:	restore r3
	return

Assume for the moment that "restore" on this target is something that
can't be delayed or repeated.  E.g. a pop, rather than a move which
leaves the saved value in memory at a unknown offset from the CFA.

This means we have to emit unwind directives immediately after the 
restore insn and cannot delay the epilogue unwind until we deallocate
the entire stack frame.

This means that your patch either gets the unwind info wrong for
the XXX sequence or the YYY sequence.

Correct unwind info would look like

	body
	body
	.cfi_remember_state
	restore r1
	.cfi_restore r1
	restore r2
	.cfi_restore r2
	jmp L2
	.cfi_restore_state

L1:	body
	body
	restore r2
	.cfi_restore r2

L2:	// validate the unwind info across the CFG making sure that the incoming
	// edges contain the same unwind info here.
	restore r3
	.cfi_restore r3
	return

In general, with shrink-wrapping, we can have essentially arbitrary
differences in unwind info between blocks that are sequential.  We have
to be prepared to fully adjust the unwind state between blocks.

Assume a { x } is the set of registers saved into the stack frame in a
given block.  We have both incoming and outgoing sets.

foo:				// in: { }
	cmp	r1,r2
	jne	L1		// out: { }

L0:				// in: { }
	save r8
	save r9
	body
	...			// out: { r8, r9 }

L2:				// in: { r8, r9, r10 }
	body
	body
	...			// out: { r8, r9, r10 }

L1:				// in: { }
	save r8
	save r9
	save r10
	body
	...			// out: { r8, r9, r10 }

L3:				// in: { r8, r9, r10 }
	restore r10		// out: { r8, r9 }

L4:				// in: { r8, r9 }
	restore r9
	restore r8
	return

This layout requires more than just .cfi_remember_state/restore_state
between blocks.  We have to be prepared to emit full unwind info at
any point.  Assume cfi info marked with XXX exists between basic blocks
to fixup the transition points:

L0:	save r8
	save r9
	.cfi_offset r8,x
	.cfi_offset r9,y
	body
	
	.cfi_offset r10,z	XXX

L2:	body
	body

	.cfi_restore r8		XXX
	.cfi_restore r9		XXX
	.cfi_restore r10	XXX

L1:	save r8
	save r9
	save r10
	.cfi_offset r8,x
	.cfi_offset r9,y
	.cfi_offset r10,z
	body

If this isn't clear, please ask questions.  The problem of unwinding is
way more complicated than what you appear to be assuming.


r~
Bernd Schmidt - March 23, 2011, 5:22 p.m.
On 03/23/2011 06:19 PM, Richard Henderson wrote:
> 	body
> 	body
> 	restore r1	XXX
> 	restore r2	XXX
> 	jmp L2		XXX
> 
> L1:	body		YYY
> 	body		YYY
> 	restore r2
> 
> L2:	restore r3
> 	return

> In general, with shrink-wrapping, we can have essentially arbitrary
> differences in unwind info between blocks that are sequential.

I don't think this can actually happen with the current implementation.
There is only one prologue, and all epilogues (the normal one and the
sibcall epilogues) match it exactly. I don't believe we can generate
code as in the example above, both before and after my patch.


Bernd
Richard Henderson - March 23, 2011, 5:27 p.m.
On 03/23/2011 10:22 AM, Bernd Schmidt wrote:
> On 03/23/2011 06:19 PM, Richard Henderson wrote:
>> 	body
>> 	body
>> 	restore r1	XXX
>> 	restore r2	XXX
>> 	jmp L2		XXX
>>
>> L1:	body		YYY
>> 	body		YYY
>> 	restore r2
>>
>> L2:	restore r3
>> 	return
> 
>> In general, with shrink-wrapping, we can have essentially arbitrary
>> differences in unwind info between blocks that are sequential.
> 
> I don't think this can actually happen with the current implementation.
> There is only one prologue, and all epilogues (the normal one and the
> sibcall epilogues) match it exactly. I don't believe we can generate
> code as in the example above, both before and after my patch.

Um.. then what's this "allow jumps in epilogues" thing of which you speak?
If there's a jump, then it goes somewhere, and branches over something.
I see no constraints on what that something might be.

Could you give an example of a transformation that is allowed by this?



r~
Bernd Schmidt - March 24, 2011, 10:29 a.m.
On 03/23/2011 06:27 PM, Richard Henderson wrote:
> On 03/23/2011 10:22 AM, Bernd Schmidt wrote:
>> On 03/23/2011 06:19 PM, Richard Henderson wrote:
>>> 	body
>>> 	body
>>> 	restore r1	XXX
>>> 	restore r2	XXX
>>> 	jmp L2		XXX
>>>
>>> L1:	body		YYY
>>> 	body		YYY
>>> 	restore r2
>>>
>>> L2:	restore r3
>>> 	return
>>
>>> In general, with shrink-wrapping, we can have essentially arbitrary
>>> differences in unwind info between blocks that are sequential.
>>
>> I don't think this can actually happen with the current implementation.
>> There is only one prologue, and all epilogues (the normal one and the
>> sibcall epilogues) match it exactly. I don't believe we can generate
>> code as in the example above, both before and after my patch.
> 
> Um.. then what's this "allow jumps in epilogues" thing of which you speak?
> If there's a jump, then it goes somewhere, and branches over something.
> I see no constraints on what that something might be.
> 
> Could you give an example of a transformation that is allowed by this?

The idea was to be able to share a single return instruction between
epilogue/non-epilogue return paths, so that e.g. on i686 a conditional
return could be implemented as a conditional jump to a common return
insn. The allow-jumps patch then becomes necessary because bbro can move
the blocks around.

It does seem, however, that bbro can in fact cause problems for the
unwind information when the prologue is no longer in the first block.
Let me try to come up with a solution for that.


Bernd
Bernd Schmidt - March 25, 2011, 5:34 p.m.
On 03/23/2011 06:19 PM, Richard Henderson wrote:
> In general, with shrink-wrapping, we can have essentially arbitrary
> differences in unwind info between blocks that are sequential.

So, while that isn't the case just yet with the current shrink-wrapping
patch, it seems I will either have to make dwarf2out fully general, or
ensure that basic blocks occur in only a certain order (the prologue
must be written out only after all basic blocks that can be executed
before or without reaching it).

I don't know much about the unwinding code. I'm currently thinking about
writing out a cfi_remember_state at the start of the function, restoring
that clean state when necessary at the start of a new block and emitting
the necessary directives to reach the correct state. What directives
should I expect to be required? Can I get by just with cfi_offset and
cfi_def_cfa_offset, or will something else be necessary?


Bernd
Richard Henderson - March 26, 2011, 3:26 a.m.
On 03/25/2011 10:34 AM, Bernd Schmidt wrote:
> I don't know much about the unwinding code. I'm currently thinking about
> writing out a cfi_remember_state at the start of the function, restoring
> that clean state when necessary at the start of a new block and emitting
> the necessary directives to reach the correct state. What directives
> should I expect to be required? Can I get by just with cfi_offset and
> cfi_def_cfa_offset, or will something else be necessary?

Yes, several things: register, expression, gnu_args_size, perhaps a few more.

I think the ideal thing would be a pass while the cfg is still extant that
captures the unwind info into notes; these can be recorded at basic block
boundaries, so that they persist until the end of compilation.

So long as late late compilation passes continue to not move frame-related
insns across basic block boundaries, we should be fine.

Irritatingly, the exact place to locate this pass is difficult to pin down.
Immediately before md_reorg is the last place we have the cfg.  But we do
strange things in, e.g. ia64 where we rebuild the cfg and run sched_ebb
during md_reorg.

Of course, ia64 is a bad example because its unwind info is target-specific,
and quite a lot of the possible benefit of shrink wrapping is lost via the
register windowing.

I'm willing to work with you on the problem of cfg-aware unwind info.  We
have needed this for a really long time; there are existing bugs related 
to exception handling and !ACCUMULATE_OUTGOING_ARGS that would be fixed by
this.


r~

Patch

Index: gcc/cfgcleanup.c
===================================================================
--- gcc.orig/cfgcleanup.c
+++ gcc/cfgcleanup.c
@@ -1184,20 +1184,12 @@  flow_find_head_matching_sequence (basic_
 
   while (true)
     {
-      /* Ignore notes, except NOTE_INSN_EPILOGUE_BEG.  */
+      /* Ignore notes.  */
       while (!NONDEBUG_INSN_P (i1) && i1 != BB_END (bb1))
-	{
-	  if (NOTE_P (i1) && NOTE_KIND (i1) == NOTE_INSN_EPILOGUE_BEG)
-	    break;
-	  i1 = NEXT_INSN (i1);
-	}
+	i1 = NEXT_INSN (i1);
 
       while (!NONDEBUG_INSN_P (i2) && i2 != BB_END (bb2))
-	{
-	  if (NOTE_P (i2) && NOTE_KIND (i2) == NOTE_INSN_EPILOGUE_BEG)
-	    break;
-	  i2 = NEXT_INSN (i2);
-	}
+	i2 = NEXT_INSN (i2);
 
       if ((i1 == BB_END (bb1) && !NONDEBUG_INSN_P (i1))
 	  || (i2 == BB_END (bb2) && !NONDEBUG_INSN_P (i2)))
Index: gcc/df-problems.c
===================================================================
--- gcc.orig/df-problems.c
+++ gcc/df-problems.c
@@ -3953,8 +3953,6 @@  can_move_insns_across (rtx from, rtx to,
     {
       if (CALL_P (insn))
 	break;
-      if (NOTE_P (insn) && NOTE_KIND (insn) == NOTE_INSN_EPILOGUE_BEG)
-	break;
       if (NONDEBUG_INSN_P (insn))
 	{
 	  if (may_trap_or_fault_p (PATTERN (insn))
Index: gcc/dwarf2out.c
===================================================================
--- gcc.orig/dwarf2out.c
+++ gcc/dwarf2out.c
@@ -2939,10 +2939,10 @@  dwarf2out_frame_debug (rtx insn, bool af
     dwarf2out_flush_queued_reg_saves ();
 }
 
-/* Determine if we need to save and restore CFI information around this
-   epilogue.  If SIBCALL is true, then this is a sibcall epilogue.  If
-   we do need to save/restore, then emit the save now, and insert a
-   NOTE_INSN_CFA_RESTORE_STATE at the appropriate place in the stream.  */
+/* Determine if we need to save and restore CFI information around
+   this epilogue.  If we do need to save/restore, then emit the save
+   now, and insert a NOTE_INSN_CFA_RESTORE_STATE at the appropriate
+   place in the stream.  */
 
 void
 dwarf2out_cfi_begin_epilogue (rtx insn)
@@ -2957,8 +2957,10 @@  dwarf2out_cfi_begin_epilogue (rtx insn)
       if (!INSN_P (i))
 	continue;
 
-      /* Look for both regular and sibcalls to end the block.  */
-      if (returnjump_p (i))
+      /* Look for both regular and sibcalls to end the block.  Various
+	 optimization passes may cause us to jump to a common epilogue
+	 tail, so we also accept simplejumps.  */
+      if (returnjump_p (i) || simplejump_p (i))
 	break;
       if (CALL_P (i) && SIBLING_CALL_P (i))
 	break;