[3/3] function: Restructure *logue insertion

Message ID 70c67b8f39aca9cf574f617fbfce43b93e2560ff.1463428211.git.segher@kernel.crashing.org
State New

Commit Message

Segher Boessenkool May 17, 2016, 1:09 a.m. UTC
This patch restructures how the prologues/epilogues are inserted.  Sibcalls
that run without prologue are now handled in shrink-wrap.c; it communicates
what is already handled by setting the EDGE_IGNORE flag.  The
try_shrink_wrapping function then doesn't need to be passed the bb_flags
anymore.
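
A minimal sketch of the hand-off described above, for readers who want to see the shape of it outside the diff.  It uses hypothetical stand-ins (struct sketch_edge, SKETCH_EDGE_IGNORE and both function names are invented for illustration), not GCC's real edge type or flag values, and only shows the idea: one pass flags the edges it has already dealt with, and a later pass skips them.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for illustration only; these are not GCC's
   real edge type or flag values.  */
enum { SKETCH_EDGE_IGNORE = 1 << 0 };

struct sketch_edge
{
  int is_sibcall;      /* Source block ends in a sibling call.  */
  int needs_prologue;  /* Block runs with the prologue active.  */
  unsigned flags;
};

/* Producer side (the role the description gives shrink-wrap.c):
   flag every sibcall edge that runs without a prologue, so that no
   epilogue is emitted for it later.  Returns the number flagged.  */
static size_t
sketch_mark_handled (struct sketch_edge *edges, size_t n)
{
  assert (edges != NULL || n == 0);
  size_t marked = 0;
  for (size_t i = 0; i < n; i++)
    if (edges[i].is_sibcall && !edges[i].needs_prologue)
      {
        edges[i].flags |= SKETCH_EDGE_IGNORE;
        marked++;
      }
  return marked;
}

/* Consumer side (the role of thread_prologue_and_epilogue_insns):
   count the sibcall edges that still need a sibcall epilogue,
   skipping anything the producer already flagged.  */
static size_t
sketch_count_sibcall_epilogues (const struct sketch_edge *edges, size_t n)
{
  assert (edges != NULL || n == 0);
  size_t wanted = 0;
  for (size_t i = 0; i < n; i++)
    {
      if (!edges[i].is_sibcall)
        continue;
      if (edges[i].flags & SKETCH_EDGE_IGNORE)
        continue;
      wanted++;
    }
  return wanted;
}
```

In this sketch the consumer needs no per-block bitmap; the flag on the edge carries all the state that has to cross between the two passes.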

Tested like the previous two patches; is this okay for trunk?


Segher


2016-05-16  Segher Boessenkool  <segher@kernel.crashing.org>

	* function.c (make_epilogue_seq): Remove epilogue_end parameter.
	(thread_prologue_and_epilogue_insns): Remove bb_flags.  Restructure
	code.  Ignore sibcalls on EDGE_IGNORE edges.
	* shrink-wrap.c (handle_simple_exit): New function.  Set EDGE_IGNORE
	on edges for sibcalls that run without prologue.  The rest of the
	function is combined from...
	(fix_fake_fallthrough_edge): ... this, and ...
	(try_shrink_wrapping): ... a part of this.  Remove the bb_with
	function argument, make it a local variable.

---
 gcc/function.c    | 168 ++++++++++++++++++++++--------------------------------
 gcc/shrink-wrap.c |  88 ++++++++++++++--------------
 gcc/shrink-wrap.h |   3 +-
 3 files changed, 113 insertions(+), 146 deletions(-)

Comments

Segher Boessenkool May 19, 2016, 8:04 a.m. UTC | #1
On Tue, May 17, 2016 at 01:09:11AM +0000, Segher Boessenkool wrote:
> This patch restructures how the prologues/epilogues are inserted.  Sibcalls
> that run without prologue are now handled in shrink-wrap.c; it communicates
> what is already handled by setting the EDGE_IGNORE flag.  The
> try_shrink_wrapping function then doesn't need to be passed the bb_flags
> anymore.

> 2016-05-16  Segher Boessenkool  <segher@kernel.crashing.org>
> 
> 	* function.c (make_epilogue_seq): Remove epilogue_end parameter.
> 	(thread_prologue_and_epilogue_insns): Remove bb_flags.  Restructure
> 	code.  Ignore sibcalls on EDGE_IGNORE edges.
> 	* shrink-wrap.c (handle_simple_exit): New function.  Set EDGE_IGNORE
> 	on edges for sibcalls that run without prologue.  The rest of the
> 	function is combined from...
> 	(fix_fake_fallthrough_edge): ... this, and ...
> 	(try_shrink_wrapping): ... a part of this.  Remove the bb_with
> 	function argument, make it a local variable.

As promised in the 1/3 subthread, I looked at the difference of building
Linux with and without (only) this patch.  With the 24 (of 30) targets
that build, I saw no generated code differences.  There also are no
testsuite regressions on powerpc, powerpc64, powerpc64le, or x86_64.


Segher

Jeff Law May 19, 2016, 10 p.m. UTC | #2
On 05/16/2016 07:09 PM, Segher Boessenkool wrote:
> This patch restructures how the prologues/epilogues are inserted.  Sibcalls
> that run without prologue are now handled in shrink-wrap.c; it communicates
> what is already handled by setting the EDGE_IGNORE flag.  The
> try_shrink_wrapping function then doesn't need to be passed the bb_flags
> anymore.
>
> Tested like the previous two patches; is this okay for trunk?
>
>
> Segher
>
>
> 2016-05-16  Segher Boessenkool  <segher@kernel.crashing.org>
>
> 	* function.c (make_epilogue_seq): Remove epilogue_end parameter.
> 	(thread_prologue_and_epilogue_insns): Remove bb_flags.  Restructure
> 	code.  Ignore sibcalls on EDGE_IGNORE edges.
> 	* shrink-wrap.c (handle_simple_exit): New function.  Set EDGE_IGNORE
> 	on edges for sibcalls that run without prologue.  The rest of the
> 	function is combined from...
> 	(fix_fake_fallthrough_edge): ... this, and ...
> 	(try_shrink_wrapping): ... a part of this.  Remove the bb_with
> 	function argument, make it a local variable.
For some reason I found this patch awful to walk through.  In 
retrospect, it might have been better to break this down further. Not 
because it's conceptually difficult to follow, but because the diffs 
themselves are difficult to read.

I kept slicing out hunks when I could pair up the original code to its 
new functional equivalent and hunks which were just "fluff" and kept 
iterating until there was nothing left that seemed unreasonable.

OK for the trunk, but please watch closely for any fallout.

jeff

Segher Boessenkool May 19, 2016, 10:20 p.m. UTC | #3
On Thu, May 19, 2016 at 04:00:22PM -0600, Jeff Law wrote:
> >	* function.c (make_epilogue_seq): Remove epilogue_end parameter.
> >	(thread_prologue_and_epilogue_insns): Remove bb_flags.  Restructure
> >	code.  Ignore sibcalls on EDGE_IGNORE edges.
> >	* shrink-wrap.c (handle_simple_exit): New function.  Set EDGE_IGNORE
> >	on edges for sibcalls that run without prologue.  The rest of the
> >	function is combined from...
> >	(fix_fake_fallthrough_edge): ... this, and ...
> >	(try_shrink_wrapping): ... a part of this.  Remove the bb_with
> >	function argument, make it a local variable.
> For some reason I found this patch awful to walk through.  In 
> retrospect, it might have been better to break this down further. Not 
> because it's conceptually difficult to follow, but because the diffs 
> themselves are difficult to read.

Yeah, I should have realised that because the changelog was hard to write.

> I kept slicing out hunks when I could pair up the original code to its 
> new functional equivalent and hunks which were just "fluff" and kept 
> iterating until there was nothing left that seemed unreasonable.
> 
> OK for the trunk, but please watch closely for any fallout.

Thanks, and I will!


Segher

Thomas Schwinge May 20, 2016, 9:28 a.m. UTC | #4
Hi!

> > >	* function.c (make_epilogue_seq): Remove epilogue_end parameter.
> > >	(thread_prologue_and_epilogue_insns): Remove bb_flags.  Restructure
> > >	code.  Ignore sibcalls on EDGE_IGNORE edges.
> > >	* shrink-wrap.c (handle_simple_exit): New function.  Set EDGE_IGNORE
> > >	on edges for sibcalls that run without prologue.  The rest of the
> > >	function is combined from...
> > >	(fix_fake_fallthrough_edge): ... this, and ...
> > >	(try_shrink_wrapping): ... a part of this.  Remove the bb_with
> > >	function argument, make it a local variable.

On Thu, 19 May 2016 17:20:46 -0500, Segher Boessenkool <segher@kernel.crashing.org> wrote:
> On Thu, May 19, 2016 at 04:00:22PM -0600, Jeff Law wrote:
> > OK for the trunk, but please watch closely for any fallout.
> 
> Thanks, and I will!

With nvptx offloading on x86_64 GNU/Linux, this (r236491) is causing
several execution test failures.  I'll have a look.


Grüße
 Thomas

Thomas Schwinge May 20, 2016, 1:21 p.m. UTC | #5
Hi!

The nvptx maintainers Bernd, Nathan: can you take it from here, or should
I continue to figure it out?

On Fri, 20 May 2016 11:28:25 +0200, I wrote:
> > > >	* function.c (make_epilogue_seq): Remove epilogue_end parameter.
> > > >	(thread_prologue_and_epilogue_insns): Remove bb_flags.  Restructure
> > > >	code.  Ignore sibcalls on EDGE_IGNORE edges.
> > > >	* shrink-wrap.c (handle_simple_exit): New function.  Set EDGE_IGNORE
> > > >	on edges for sibcalls that run without prologue.  The rest of the
> > > >	function is combined from...
> > > >	(fix_fake_fallthrough_edge): ... this, and ...
> > > >	(try_shrink_wrapping): ... a part of this.  Remove the bb_with
> > > >	function argument, make it a local variable.
> 
> On Thu, 19 May 2016 17:20:46 -0500, Segher Boessenkool <segher@kernel.crashing.org> wrote:
> > On Thu, May 19, 2016 at 04:00:22PM -0600, Jeff Law wrote:
> > > OK for the trunk, but please watch closely for any fallout.
> > 
> > Thanks, and I will!
> 
> With nvptx offloading on x86_64 GNU/Linux, this (r236491) is causing
> several execution test failures.  I'll have a look.

OK, no offloading required.  The problem -- or, "a" problem; hopefully
the same ;-) -- also reproduces with a nvptx-none target configuration.
A before/after r236491 diff of:

    $ build-gcc/gcc/xgcc -Bbuild-gcc/gcc/ source-gcc/gcc/testsuite/gcc.c-torture/execute/20000121-1.c -O0 -Wall -Wextra -Bbuild-gcc/nvptx-none/newlib/ -Lbuild-gcc/nvptx-none/newlib -mmainkernel -o ./20000121-1.exe -fdump-tree-all -fdump-ipa-all -fdump-rtl-all -save-temps

..., shows the execution failure ("nvptx-none-run-single 20000121-1.exe"
returns exit code 1), and (aside from earlier, hopefully benign
address/ID changes) shows the following dump changes, starting with:

    --- before/20000121-1.c.281r.mach       2016-05-20 14:56:37.794367323 +0200
    +++ after/20000121-1.c.281r.mach        2016-05-20 14:54:34.537741174 +0200
    @@ -5,16 +5,10 @@
     ending the processing of deferred insns
     df_analyze called
     df_worklist_dataflow_doublequeue:n_basic_blocks 3 n_edges 2 count 3 (    1)
    -scanning new insn with uid = 11.
    -changing bb of uid 13
    -  unscanned insn
    -changing bb of uid 11
    -  from 2 to 3
     starting the processing of deferred insns
     ending the processing of deferred insns
     df_analyze called
    -df_worklist_dataflow_doublequeue:n_basic_blocks 4 n_edges 3 count 4 (    1)
    -df_worklist_dataflow_doublequeue:n_basic_blocks 4 n_edges 3 count 4 (    1)
    +df_worklist_dataflow_doublequeue:n_basic_blocks 3 n_edges 2 count 3 (    1)
     
     
     big
    @@ -27,8 +21,8 @@
     ;;  entry block defs    1 [%stack] 2 [%frame] 3 [%args] 4 [%chain]
     ;;  exit block uses     1 [%stack] 2 [%frame]
     ;;  regs ever live      2 [%frame]
    -;;  ref usage  r1={1d,3u} r2={1d,4u} r3={1d,2u} r4={1d} r22={1d,1u} 
    -;;    total ref usage 15{5d,10u,0e} in 4{4 regular + 0 call} insns.
    +;;  ref usage  r1={1d,2u} r2={1d,3u} r3={1d,1u} r4={1d} r22={1d,1u} 
    +;;    total ref usage 12{5d,7u,0e} in 3{3 regular + 0 call} insns.
     
     ( )->[0]->( 2 )
     ;; bb 0 artificial_defs: { d-1(1){ }d-1(2){ }d-1(3){ }d-1(4){ }}
    @@ -42,7 +36,7 @@
     ;; lr  out      1 [%stack] 2 [%frame] 3 [%args]
     ;; live  out    1 [%stack] 2 [%frame] 3 [%args]
     
    -( 0 )->[2]->( 3 )
    +( 0 )->[2]->( 1 )
     ;; bb 2 artificial_defs: { }
     ;; bb 2 artificial_uses: { u-1(1){ }u-1(2){ }u-1(3){ }}
     ;; lr  in       1 [%stack] 2 [%frame] 3 [%args]
    @@ -54,19 +48,7 @@
     ;; lr  out      1 [%stack] 2 [%frame] 3 [%args]
     ;; live  out    1 [%stack] 2 [%frame] 3 [%args]
     
    -( 2 )->[3]->( 1 )
    -;; bb 3 artificial_defs: { }
    -;; bb 3 artificial_uses: { u-1(1){ }u-1(2){ }u-1(3){ }}
    -;; lr  in       1 [%stack] 2 [%frame] 3 [%args]
    -;; lr  use      1 [%stack] 2 [%frame] 3 [%args]
    -;; lr  def     
    -;; live  in     1 [%stack] 2 [%frame] 3 [%args]
    -;; live  gen   
    -;; live  kill  
    -;; lr  out      1 [%stack] 2 [%frame] 3 [%args]
    -;; live  out    1 [%stack] 2 [%frame] 3 [%args]
    -
    -( 3 )->[1]->( )
    +( 2 )->[1]->( )
     ;; bb 1 artificial_defs: { }
     ;; bb 1 artificial_uses: { u-1(1){ }u-1(2){ }}
     ;; lr  in       1 [%stack] 2 [%frame]
    @@ -92,8 +74,8 @@
     ;;  entry block defs    1 [%stack] 2 [%frame] 3 [%args] 4 [%chain]
     ;;  exit block uses     1 [%stack] 2 [%frame]
     ;;  regs ever live      2 [%frame]
    -;;  ref usage  r1={1d,3u} r2={1d,4u} r3={1d,2u} r4={1d} r22={1d,1u} 
    -;;    total ref usage 15{5d,10u,0e} in 4{4 regular + 0 call} insns.
    +;;  ref usage  r1={1d,2u} r2={1d,3u} r3={1d,1u} r4={1d} r22={1d,1u} 
    +;;    total ref usage 12{5d,7u,0e} in 3{3 regular + 0 call} insns.
     (note 1 0 5 NOTE_INSN_DELETED)
     (note 5 1 2 2 [bb 2] NOTE_INSN_BASIC_BLOCK)
     (insn 2 5 3 2 (set (reg:DI 22)
    @@ -105,14 +87,8 @@
             (reg:DI 22)) source-gcc/gcc/testsuite/gcc.c-torture/execute/20000121-1.c:1 5 {*movdi_insn}
          (nil))
     (note 4 3 9 2 NOTE_INSN_FUNCTION_BEG)
    -(insn 9 4 10 2 (const_int 0 [0]) source-gcc/gcc/testsuite/gcc.c-torture/execute/20000121-1.c:1 191 {nop}
    +(insn 9 4 0 2 (const_int 0 [0]) source-gcc/gcc/testsuite/gcc.c-torture/execute/20000121-1.c:1 191 {nop}
          (nil))
    -(note 10 9 13 2 NOTE_INSN_EPILOGUE_BEG)
    -(note 13 10 11 3 [bb 3] NOTE_INSN_BASIC_BLOCK)
    -(jump_insn 11 13 12 3 (return) source-gcc/gcc/testsuite/gcc.c-torture/execute/20000121-1.c:1 192 {return}
    -     (nil)
    - -> return)
    -(barrier 12 11 0)
     
     ;; Function doit (doit, funcdef_no=1, decl_uid=1375, cgraph_uid=1, symbol_order=1)
     
    @@ -120,19 +96,13 @@
     ending the processing of deferred insns
     df_analyze called
     df_worklist_dataflow_doublequeue:n_basic_blocks 3 n_edges 2 count 3 (    1)
    -scanning new insn with uid = 27.
     verify found no changes in insn with uid = 16.
     verify found no changes in insn with uid = 19.
     verify found no changes in insn with uid = 22.
    -changing bb of uid 29
    -  unscanned insn
    -changing bb of uid 27
    -  from 2 to 3
     starting the processing of deferred insns
     ending the processing of deferred insns
     df_analyze called
    -df_worklist_dataflow_doublequeue:n_basic_blocks 4 n_edges 3 count 4 (    1)
    -df_worklist_dataflow_doublequeue:n_basic_blocks 4 n_edges 3 count 4 (    1)
    +df_worklist_dataflow_doublequeue:n_basic_blocks 3 n_edges 2 count 3 (    1)
     
     
     doit
    @@ -145,8 +115,8 @@
     ;;  entry block defs    1 [%stack] 2 [%frame] 3 [%args] 4 [%chain]
     ;;  exit block uses     1 [%stack] 2 [%frame]
     ;;  regs ever live      1 [%stack] 2 [%frame]
    -;;  ref usage  r0={3d} r1={1d,6u} r2={1d,9u} r3={1d,2u} r4={4d} r5={3d} r6={3d} r7={3d} r8={3d} r9={3d} r10={3d} r11={3d} r12={3d} r13={3d} r14={3d} r15={3d} r22={1d,1u} r23={1d,1u} r24={1d,1u} r25={1d,1u} r26={1d,1u} r27={1d,1u} r28={1d,1u} r29={1d,1u} r30={1d,1u} r31={1d,1u} r32={1d,1u} r33={1d,1u} 
    -;;    total ref usage 84{55d,29u,0e} in 20{17 regular + 3 call} insns.
    +;;  ref usage  r0={3d} r1={1d,5u} r2={1d,8u} r3={1d,1u} r4={4d} r5={3d} r6={3d} r7={3d} r8={3d} r9={3d} r10={3d} r11={3d} r12={3d} r13={3d} r14={3d} r15={3d} r22={1d,1u} r23={1d,1u} r24={1d,1u} r25={1d,1u} r26={1d,1u} r27={1d,1u} r28={1d,1u} r29={1d,1u} r30={1d,1u} r31={1d,1u} r32={1d,1u} r33={1d,1u} 
    +;;    total ref usage 81{55d,26u,0e} in 19{16 regular + 3 call} insns.
     
     ( )->[0]->( 2 )
     ;; bb 0 artificial_defs: { d-1(1){ }d-1(2){ }d-1(3){ }d-1(4){ }}
    @@ -160,7 +130,7 @@
     ;; lr  out      1 [%stack] 2 [%frame] 3 [%args]
     ;; live  out    1 [%stack] 2 [%frame] 3 [%args]
     
    -( 0 )->[2]->( 3 )
    +( 0 )->[2]->( 1 )
     ;; bb 2 artificial_defs: { }
     ;; bb 2 artificial_uses: { u-1(1){ }u-1(2){ }u-1(3){ }}
     ;; lr  in       1 [%stack] 2 [%frame] 3 [%args]
    @@ -172,19 +142,7 @@
     ;; lr  out      1 [%stack] 2 [%frame] 3 [%args]
     ;; live  out    1 [%stack] 2 [%frame] 3 [%args]
     
    -( 2 )->[3]->( 1 )
    -;; bb 3 artificial_defs: { }
    -;; bb 3 artificial_uses: { u-1(1){ }u-1(2){ }u-1(3){ }}
    -;; lr  in       1 [%stack] 2 [%frame] 3 [%args]
    -;; lr  use      1 [%stack] 2 [%frame] 3 [%args]
    -;; lr  def     
    -;; live  in     1 [%stack] 2 [%frame] 3 [%args]
    -;; live  gen   
    -;; live  kill  
    -;; lr  out      1 [%stack] 2 [%frame] 3 [%args]
    -;; live  out    1 [%stack] 2 [%frame] 3 [%args]
    -
    -( 3 )->[1]->( )
    +( 2 )->[1]->( )
     ;; bb 1 artificial_defs: { }
     ;; bb 1 artificial_uses: { u-1(1){ }u-1(2){ }}
     ;; lr  in       1 [%stack] 2 [%frame]
    @@ -210,8 +168,8 @@
     ;;  entry block defs    1 [%stack] 2 [%frame] 3 [%args] 4 [%chain]
     ;;  exit block uses     1 [%stack] 2 [%frame]
     ;;  regs ever live      1 [%stack] 2 [%frame]
    -;;  ref usage  r0={3d} r1={1d,6u} r2={1d,9u} r3={1d,2u} r4={4d} r5={3d} r6={3d} r7={3d} r8={3d} r9={3d} r10={3d} r11={3d} r12={3d} r13={3d} r14={3d} r15={3d} r22={1d,1u} r23={1d,1u} r24={1d,1u} r25={1d,1u} r26={1d,1u} r27={1d,1u} r28={1d,1u} r29={1d,1u} r30={1d,1u} r31={1d,1u} r32={1d,1u} r33={1d,1u} 
    -;;    total ref usage 84{55d,29u,0e} in 20{17 regular + 3 call} insns.
    +;;  ref usage  r0={3d} r1={1d,5u} r2={1d,8u} r3={1d,1u} r4={4d} r5={3d} r6={3d} r7={3d} r8={3d} r9={3d} r10={3d} r11={3d} r12={3d} r13={3d} r14={3d} r15={3d} r22={1d,1u} r23={1d,1u} r24={1d,1u} r25={1d,1u} r26={1d,1u} r27={1d,1u} r28={1d,1u} r29={1d,1u} r30={1d,1u} r31={1d,1u} r32={1d,1u} r33={1d,1u} 
    +;;    total ref usage 81{55d,26u,0e} in 19{16 regular + 3 call} insns.
     (note 1 0 9 NOTE_INSN_DELETED)
     (note 9 1 2 2 [bb 2] NOTE_INSN_BASIC_BLOCK)
     (insn 2 9 3 2 (set (reg:SI 26)
    @@ -258,7 +216,7 @@
             (reg:DI 23 [ _2 ])) source-gcc/gcc/testsuite/gcc.c-torture/execute/20000121-1.c:5 5 {*movdi_insn}
          (nil))
     (call_insn 16 15 17 2 (parallel [
    -            (call (mem:QI (symbol_ref:DI ("big") [flags 0x3]  <function_decl 0x7fea3de250e0 big>) [0 big S1 A8])
    +            (call (mem:QI (symbol_ref:DI ("big") [flags 0x3]  <function_decl 0x7fb5fdc880e0 big>) [0 big S1 A8])
                     (const_int 0 [0]))
                 (use (reg:DI 31))
             ]) source-gcc/gcc/testsuite/gcc.c-torture/execute/20000121-1.c:5 129 {call_insn}
    @@ -271,7 +229,7 @@
             (reg:DI 24 [ _3 ])) source-gcc/gcc/testsuite/gcc.c-torture/execute/20000121-1.c:6 5 {*movdi_insn}
          (nil))
     (call_insn 19 18 20 2 (parallel [
    -            (call (mem:QI (symbol_ref:DI ("big") [flags 0x3]  <function_decl 0x7fea3de250e0 big>) [0 big S1 A8])
    +            (call (mem:QI (symbol_ref:DI ("big") [flags 0x3]  <function_decl 0x7fb5fdc880e0 big>) [0 big S1 A8])
                     (const_int 0 [0]))
                 (use (reg:DI 32))
             ]) source-gcc/gcc/testsuite/gcc.c-torture/execute/20000121-1.c:6 129 {call_insn}
    @@ -285,20 +243,14 @@
             (reg:DI 25 [ _4 ])) source-gcc/gcc/testsuite/gcc.c-torture/execute/20000121-1.c:7 5 {*movdi_insn}
          (nil))
     (call_insn 22 21 25 2 (parallel [
    -            (call (mem:QI (symbol_ref:DI ("big") [flags 0x3]  <function_decl 0x7fea3de250e0 big>) [0 big S1 A8])
    +            (call (mem:QI (symbol_ref:DI ("big") [flags 0x3]  <function_decl 0x7fb5fdc880e0 big>) [0 big S1 A8])
                     (const_int 0 [0]))
                 (use (reg:DI 33))
             ]) source-gcc/gcc/testsuite/gcc.c-torture/execute/20000121-1.c:7 129 {call_insn}
          (nil)
         (nil))
    -(insn 25 22 26 2 (const_int 0 [0]) source-gcc/gcc/testsuite/gcc.c-torture/execute/20000121-1.c:8 191 {nop}
    +(insn 25 22 0 2 (const_int 0 [0]) source-gcc/gcc/testsuite/gcc.c-torture/execute/20000121-1.c:8 191 {nop}
          (nil))
    -(note 26 25 29 2 NOTE_INSN_EPILOGUE_BEG)
    -(note 29 26 27 3 [bb 3] NOTE_INSN_BASIC_BLOCK)
    -(jump_insn 27 29 28 3 (return) source-gcc/gcc/testsuite/gcc.c-torture/execute/20000121-1.c:8 192 {return}
    -     (nil)
    - -> return)
    -(barrier 28 27 0)
     
     ;; Function main (main, funcdef_no=2, decl_uid=1378, cgraph_uid=2, symbol_order=2)
     
    @@ -306,17 +258,11 @@
     ending the processing of deferred insns
     df_analyze called
     df_worklist_dataflow_doublequeue:n_basic_blocks 3 n_edges 2 count 3 (    1)
    -scanning new insn with uid = 20.
     verify found no changes in insn with uid = 8.
    -changing bb of uid 22
    -  unscanned insn
    -changing bb of uid 20
    -  from 2 to 3
     starting the processing of deferred insns
     ending the processing of deferred insns
     df_analyze called
    -df_worklist_dataflow_doublequeue:n_basic_blocks 4 n_edges 3 count 4 (    1)
    -df_worklist_dataflow_doublequeue:n_basic_blocks 4 n_edges 3 count 4 (    1)
    +df_worklist_dataflow_doublequeue:n_basic_blocks 3 n_edges 2 count 3 (    1)
     
     
     main
    @@ -329,8 +275,8 @@
     ;;  entry block defs    1 [%stack] 2 [%frame] 3 [%args] 4 [%chain]
     ;;  exit block uses     0 [%value] 1 [%stack] 2 [%frame]
     ;;  regs ever live      0 [%value] 1 [%stack]
    -;;  ref usage  r0={2d,2u} r1={1d,4u} r2={1d,3u} r3={1d,2u} r4={2d} r5={1d} r6={1d} r7={1d} r8={1d} r9={1d} r10={1d} r11={1d} r12={1d} r13={1d} r14={1d} r15={1d} r22={1d,1u} r23={1d,1u} r24={1d,1u} r25={1d,1u} r26={1d,1u} 
    -;;    total ref usage 39{23d,16u,0e} in 9{8 regular + 1 call} insns.
    +;;  ref usage  r0={2d,2u} r1={1d,3u} r2={1d,2u} r3={1d,1u} r4={2d} r5={1d} r6={1d} r7={1d} r8={1d} r9={1d} r10={1d} r11={1d} r12={1d} r13={1d} r14={1d} r15={1d} r22={1d,1u} r23={1d,1u} r24={1d,1u} r25={1d,1u} r26={1d,1u} 
    +;;    total ref usage 36{23d,13u,0e} in 8{7 regular + 1 call} insns.
     
     ( )->[0]->( 2 )
     ;; bb 0 artificial_defs: { d-1(1){ }d-1(2){ }d-1(3){ }d-1(4){ }}
    @@ -344,7 +290,7 @@
     ;; lr  out      1 [%stack] 2 [%frame] 3 [%args]
     ;; live  out    1 [%stack] 2 [%frame] 3 [%args]
     
    -( 0 )->[2]->( 3 )
    +( 0 )->[2]->( 1 )
     ;; bb 2 artificial_defs: { }
     ;; bb 2 artificial_uses: { u-1(1){ }u-1(2){ }u-1(3){ }}
     ;; lr  in       1 [%stack] 2 [%frame] 3 [%args]
    @@ -356,19 +302,7 @@
     ;; lr  out      0 [%value] 1 [%stack] 2 [%frame] 3 [%args]
     ;; live  out    0 [%value] 1 [%stack] 2 [%frame] 3 [%args]
     
    -( 2 )->[3]->( 1 )
    -;; bb 3 artificial_defs: { }
    -;; bb 3 artificial_uses: { u-1(1){ }u-1(2){ }u-1(3){ }}
    -;; lr  in       0 [%value] 1 [%stack] 2 [%frame] 3 [%args]
    -;; lr  use      1 [%stack] 2 [%frame] 3 [%args]
    -;; lr  def     
    -;; live  in     0 [%value] 1 [%stack] 2 [%frame] 3 [%args]
    -;; live  gen   
    -;; live  kill  
    -;; lr  out      0 [%value] 1 [%stack] 2 [%frame] 3 [%args]
    -;; live  out    0 [%value] 1 [%stack] 2 [%frame] 3 [%args]
    -
    -( 3 )->[1]->( )
    +( 2 )->[1]->( )
     ;; bb 1 artificial_defs: { }
     ;; bb 1 artificial_uses: { u-1(0){ }u-1(1){ }u-1(2){ }}
     ;; lr  in       0 [%value] 1 [%stack] 2 [%frame]
    @@ -394,13 +328,13 @@
     ;;  entry block defs    1 [%stack] 2 [%frame] 3 [%args] 4 [%chain]
     ;;  exit block uses     0 [%value] 1 [%stack] 2 [%frame]
     ;;  regs ever live      0 [%value] 1 [%stack]
    -;;  ref usage  r0={2d,2u} r1={1d,4u} r2={1d,3u} r3={1d,2u} r4={2d} r5={1d} r6={1d} r7={1d} r8={1d} r9={1d} r10={1d} r11={1d} r12={1d} r13={1d} r14={1d} r15={1d} r22={1d,1u} r23={1d,1u} r24={1d,1u} r25={1d,1u} r26={1d,1u} 
    -;;    total ref usage 39{23d,16u,0e} in 9{8 regular + 1 call} insns.
    +;;  ref usage  r0={2d,2u} r1={1d,3u} r2={1d,2u} r3={1d,1u} r4={2d} r5={1d} r6={1d} r7={1d} r8={1d} r9={1d} r10={1d} r11={1d} r12={1d} r13={1d} r14={1d} r15={1d} r22={1d,1u} r23={1d,1u} r24={1d,1u} r25={1d,1u} r26={1d,1u} 
    +;;    total ref usage 36{23d,13u,0e} in 8{7 regular + 1 call} insns.
     (note 1 0 3 NOTE_INSN_DELETED)
     (note 3 1 2 2 [bb 2] NOTE_INSN_BASIC_BLOCK)
     (note 2 3 5 2 NOTE_INSN_FUNCTION_BEG)
     (insn 5 2 6 2 (set (reg:DI 26)
    -        (symbol_ref/f:DI ("$LC0") [flags 0x802]  <var_decl 0x7fea3f484480 $LC0>)) source-gcc/gcc/testsuite/gcc.c-torture/execute/20000121-1.c:12 5 {*movdi_insn}
    +        (symbol_ref/f:DI ("$LC0") [flags 0x802]  <var_decl 0x7fb5ff2e7480 $LC0>)) source-gcc/gcc/testsuite/gcc.c-torture/execute/20000121-1.c:12 5 {*movdi_insn}
          (nil))
     (insn 6 5 7 2 (set (reg:SI 25)
             (const_int 1 [0x1])) source-gcc/gcc/testsuite/gcc.c-torture/execute/20000121-1.c:12 4 {*movsi_insn}
    @@ -409,7 +343,7 @@
             (const_int 1 [0x1])) source-gcc/gcc/testsuite/gcc.c-torture/execute/20000121-1.c:12 4 {*movsi_insn}
          (nil))
     (call_insn 8 7 9 2 (parallel [
    -            (call (mem:QI (symbol_ref:DI ("doit") [flags 0x3]  <function_decl 0x7fea3de251c0 doit>) [0 doit S1 A8])
    +            (call (mem:QI (symbol_ref:DI ("doit") [flags 0x3]  <function_decl 0x7fb5fdc881c0 doit>) [0 doit S1 A8])
                     (const_int 0 [0]))
                 (use (reg:SI 24))
                 (use (reg:SI 25))
    @@ -426,11 +360,5 @@
     (insn 16 12 17 2 (set (reg/i:SI 0 %value)
             (reg:SI 23 [ <retval> ])) source-gcc/gcc/testsuite/gcc.c-torture/execute/20000121-1.c:14 4 {*movsi_insn}
          (nil))
    -(insn 17 16 19 2 (use (reg/i:SI 0 %value)) source-gcc/gcc/testsuite/gcc.c-torture/execute/20000121-1.c:14 -1
    +(insn 17 16 0 2 (use (reg/i:SI 0 %value)) source-gcc/gcc/testsuite/gcc.c-torture/execute/20000121-1.c:14 -1
          (nil))
    -(note 19 17 22 2 NOTE_INSN_EPILOGUE_BEG)
    -(note 22 19 20 3 [bb 3] NOTE_INSN_BASIC_BLOCK)
    -(jump_insn 20 22 21 3 (return) source-gcc/gcc/testsuite/gcc.c-torture/execute/20000121-1.c:14 192 {return}
    -     (nil)
    - -> return)
    -(barrier 21 20 0)
    --- before/20000121-1.c.282r.barriers   2016-05-20 14:56:37.794367323 +0200
    +++ after/20000121-1.c.282r.barriers    2016-05-20 14:54:34.537741174 +0200
    [...]
    --- before/20000121-1.c.286r.shorten    2016-05-20 14:56:37.794367323 +0200
    +++ after/20000121-1.c.286r.shorten     2016-05-20 14:54:34.537741174 +0200
    [...]
    --- before/20000121-1.c.287r.nothrow    2016-05-20 14:56:37.794367323 +0200
    +++ after/20000121-1.c.287r.nothrow     2016-05-20 14:54:34.537741174 +0200
    [...]
    --- before/20000121-1.c.289r.final      2016-05-20 14:56:37.794367323 +0200
    +++ after/20000121-1.c.289r.final       2016-05-20 14:54:34.537741174 +0200
    [...]
    --- before/20000121-1.c.290r.dfinish    2016-05-20 14:56:37.794367323 +0200
    +++ after/20000121-1.c.290r.dfinish     2016-05-20 14:54:34.537741174 +0200
    [...]

..., and resulting in the following assembly changes:

    --- before/20000121-1.s 2016-05-20 14:56:37.794367323 +0200
    +++ after/20000121-1.s  2016-05-20 14:54:34.537741174 +0200
    @@ -19,7 +19,6 @@
            .reg.u64 %r22;
                    mov.u64 %r22, %ar0;
                    st.u64  [%frame], %r22;
    -       ret;
     }
     
     // BEGIN GLOBAL FUNCTION DECL: doit
    @@ -79,7 +78,6 @@
                    st.param.u64 [%out_arg1], %r33;
                    call big, (%out_arg1);
            }
    -       ret;
     }
     
     // BEGIN VAR DEF: $LC0
    @@ -112,6 +110,4 @@
                    mov.u32 %r22, 0;
                    mov.u32 %r23, %r22;
                    mov.u32 %value, %r23;
    -       st.param.u32    [%value_out], %value;
    -       ret;
     }

The disappearing "ret" statements don't matter, but the disappearing
store at the end of "main" does.


Grüße
 Thomas

Nathan Sidwell May 20, 2016, 2:47 p.m. UTC | #6
On 05/20/16 09:21, Thomas Schwinge wrote:
> Hi!
>
> The nvptx maintainers Bernd, Nathan: can you take it from here, or should
> I continue to figure it out?

What is the defect?

Segher Boessenkool May 20, 2016, 3:35 p.m. UTC | #7
On Fri, May 20, 2016 at 10:47:19AM -0400, Nathan Sidwell wrote:
> On 05/20/16 09:21, Thomas Schwinge wrote:
> >Hi!
> >
> >The nvptx maintainers Bernd, Nathan: can you take it from here, or should
> >I continue to figure it out?
> 
> What is the defect?

I have a fix, testing now.


Segher

Patch

diff --git a/gcc/function.c b/gcc/function.c
index 75d2ad4..278aaf6 100644
--- a/gcc/function.c
+++ b/gcc/function.c
@@ -5819,13 +5819,13 @@  make_prologue_seq (void)
 }
 
 static rtx_insn *
-make_epilogue_seq (rtx_insn **epilogue_end)
+make_epilogue_seq (void)
 {
   if (!targetm.have_epilogue ())
     return NULL;
 
   start_sequence ();
-  *epilogue_end = emit_note (NOTE_INSN_EPILOGUE_BEG);
+  emit_note (NOTE_INSN_EPILOGUE_BEG);
   rtx_insn *seq = targetm.gen_epilogue ();
   if (seq)
     emit_jump_insn (seq);
@@ -5897,66 +5897,29 @@  make_epilogue_seq (rtx_insn **epilogue_end)
 void
 thread_prologue_and_epilogue_insns (void)
 {
-  bool inserted;
-  bitmap_head bb_flags;
-  rtx_insn *epilogue_end ATTRIBUTE_UNUSED;
-  edge e, entry_edge, orig_entry_edge, exit_fallthru_edge;
-  edge_iterator ei;
-
   df_analyze ();
 
-  rtl_profile_for_bb (ENTRY_BLOCK_PTR_FOR_FN (cfun));
-
-  inserted = false;
-  epilogue_end = NULL;
-
   /* Can't deal with multiple successors of the entry block at the
      moment.  Function should always have at least one entry
      point.  */
   gcc_assert (single_succ_p (ENTRY_BLOCK_PTR_FOR_FN (cfun)));
-  entry_edge = single_succ_edge (ENTRY_BLOCK_PTR_FOR_FN (cfun));
-  orig_entry_edge = entry_edge;
+
+  edge entry_edge = single_succ_edge (ENTRY_BLOCK_PTR_FOR_FN (cfun));
+  edge orig_entry_edge = entry_edge;
 
   rtx_insn *split_prologue_seq = make_split_prologue_seq ();
   rtx_insn *prologue_seq = make_prologue_seq ();
-  rtx_insn *epilogue_seq = make_epilogue_seq (&epilogue_end);
-
-  bitmap_initialize (&bb_flags, &bitmap_default_obstack);
+  rtx_insn *epilogue_seq = make_epilogue_seq ();
 
   /* Try to perform a kind of shrink-wrapping, making sure the
      prologue/epilogue is emitted only around those parts of the
      function that require it.  */
 
-  try_shrink_wrapping (&entry_edge, &bb_flags, prologue_seq);
+  try_shrink_wrapping (&entry_edge, prologue_seq);
 
-  if (split_prologue_seq != NULL_RTX)
-    {
-      insert_insn_on_edge (split_prologue_seq, orig_entry_edge);
-      inserted = true;
-    }
-  if (prologue_seq != NULL_RTX)
-    {
-      insert_insn_on_edge (prologue_seq, entry_edge);
-      inserted = true;
-    }
-
-  /* If the exit block has no non-fake predecessors, we don't need
-     an epilogue.  */
-  FOR_EACH_EDGE (e, ei, EXIT_BLOCK_PTR_FOR_FN (cfun)->preds)
-    if ((e->flags & EDGE_FAKE) == 0)
-      break;
-  if (e == NULL)
-    goto epilogue_done;
 
   rtl_profile_for_bb (EXIT_BLOCK_PTR_FOR_FN (cfun));
 
-  exit_fallthru_edge = find_fallthru_edge (EXIT_BLOCK_PTR_FOR_FN (cfun)->preds);
-
-  /* If nothing falls through into the exit block, we don't need an
-     epilogue.  */
-  if (exit_fallthru_edge == NULL)
-    goto epilogue_done;
-
   /* A small fib -- epilogue is not yet completed, but we wish to re-use
      this marker for the splits of EH_RETURN patterns, and nothing else
      uses the flag in the meantime.  */
@@ -5967,6 +5930,8 @@  thread_prologue_and_epilogue_insns (void)
      code.  In order to be able to properly annotate these with unwind
      info, try to split them now.  If we get a valid split, drop an
      EPILOGUE_BEG note and mark the insns as epilogue insns.  */
+  edge e;
+  edge_iterator ei;
   FOR_EACH_EDGE (e, ei, EXIT_BLOCK_PTR_FOR_FN (cfun)->preds)
     {
       rtx_insn *prev, *last, *trial;
@@ -5986,78 +5951,84 @@  thread_prologue_and_epilogue_insns (void)
       emit_note_after (NOTE_INSN_EPILOGUE_BEG, prev);
     }
 
-  if (epilogue_seq)
-    {
-      insert_insn_on_edge (epilogue_seq, exit_fallthru_edge);
-      inserted = true;
-    }
-  else
-    {
-      basic_block cur_bb;
+  edge exit_fallthru_edge = find_fallthru_edge (EXIT_BLOCK_PTR_FOR_FN (cfun)->preds);
 
-      if (! next_active_insn (BB_END (exit_fallthru_edge->src)))
-	goto epilogue_done;
-      /* We have a fall-through edge to the exit block, the source is not
-         at the end of the function, and there will be an assembler epilogue
-         at the end of the function.
-         We can't use force_nonfallthru here, because that would try to
-	 use return.  Inserting a jump 'by hand' is extremely messy, so
-	 we take advantage of cfg_layout_finalize using
-	 fixup_fallthru_exit_predecessor.  */
-      cfg_layout_initialize (0);
-      FOR_EACH_BB_FN (cur_bb, cfun)
-	if (cur_bb->index >= NUM_FIXED_BLOCKS
-	    && cur_bb->next_bb->index >= NUM_FIXED_BLOCKS)
-	  cur_bb->aux = cur_bb->next_bb;
-      cfg_layout_finalize ();
+  if (exit_fallthru_edge)
+    {
+      if (epilogue_seq)
+	{
+	  insert_insn_on_edge (epilogue_seq, exit_fallthru_edge);
+
+	  /* The epilogue insns we inserted may cause the exit edge to no longer
+	     be fallthru.  */
+	  FOR_EACH_EDGE (e, ei, EXIT_BLOCK_PTR_FOR_FN (cfun)->preds)
+	    {
+	      if (((e->flags & EDGE_FALLTHRU) != 0)
+		  && returnjump_p (BB_END (e->src)))
+		e->flags &= ~EDGE_FALLTHRU;
+	    }
+	}
+      else if (next_active_insn (BB_END (exit_fallthru_edge->src)))
+	{
+	  /* We have a fall-through edge to the exit block, the source is not
+	     at the end of the function, and there will be an assembler epilogue
+	     at the end of the function.
+	     We can't use force_nonfallthru here, because that would try to
+	     use return.  Inserting a jump 'by hand' is extremely messy, so
+	     we take advantage of cfg_layout_finalize using
+	     fixup_fallthru_exit_predecessor.  */
+	  cfg_layout_initialize (0);
+	  basic_block cur_bb;
+	  FOR_EACH_BB_FN (cur_bb, cfun)
+	    if (cur_bb->index >= NUM_FIXED_BLOCKS
+		&& cur_bb->next_bb->index >= NUM_FIXED_BLOCKS)
+	      cur_bb->aux = cur_bb->next_bb;
+	  cfg_layout_finalize ();
+	}
     }
 
-epilogue_done:
+  /* Insert the prologue.  */
 
-  default_rtl_profile ();
+  rtl_profile_for_bb (ENTRY_BLOCK_PTR_FOR_FN (cfun));
 
-  if (inserted)
+  if (split_prologue_seq || prologue_seq)
     {
-      sbitmap blocks;
+      if (split_prologue_seq)
+	insert_insn_on_edge (split_prologue_seq, orig_entry_edge);
+
+      if (prologue_seq)
+	insert_insn_on_edge (prologue_seq, entry_edge);
 
       commit_edge_insertions ();
 
       /* Look for basic blocks within the prologue insns.  */
-      blocks = sbitmap_alloc (last_basic_block_for_fn (cfun));
+      sbitmap blocks = sbitmap_alloc (last_basic_block_for_fn (cfun));
       bitmap_clear (blocks);
       bitmap_set_bit (blocks, entry_edge->dest->index);
       bitmap_set_bit (blocks, orig_entry_edge->dest->index);
       find_many_sub_basic_blocks (blocks);
       sbitmap_free (blocks);
-
-      /* The epilogue insns we inserted may cause the exit edge to no longer
-	 be fallthru.  */
-      FOR_EACH_EDGE (e, ei, EXIT_BLOCK_PTR_FOR_FN (cfun)->preds)
-	{
-	  if (((e->flags & EDGE_FALLTHRU) != 0)
-	      && returnjump_p (BB_END (e->src)))
-	    e->flags &= ~EDGE_FALLTHRU;
-	}
     }
 
-  /* Emit sibling epilogues before any sibling call sites.  */
-  for (ei = ei_start (EXIT_BLOCK_PTR_FOR_FN (cfun)->preds); (e =
-							     ei_safe_edge (ei));
-							     )
-    {
-      basic_block bb = e->src;
-      rtx_insn *insn = BB_END (bb);
+  default_rtl_profile ();
 
-      if (!CALL_P (insn)
-	  || ! SIBLING_CALL_P (insn)
-	  || (targetm.have_simple_return ()
-	      && entry_edge != orig_entry_edge
-	      && !bitmap_bit_p (&bb_flags, bb->index)))
+  /* Emit sibling epilogues before any sibling call sites.  */
+  for (ei = ei_start (EXIT_BLOCK_PTR_FOR_FN (cfun)->preds);
+       (e = ei_safe_edge (ei));
+       ei_next (&ei))
+    {
+      /* Skip those already handled, the ones that run without prologue.  */
+      if (e->flags & EDGE_IGNORE)
 	{
-	  ei_next (&ei);
+	  e->flags &= ~EDGE_IGNORE;
 	  continue;
 	}
 
+      rtx_insn *insn = BB_END (e->src);
+
+      if (!(CALL_P (insn) && SIBLING_CALL_P (insn)))
+	continue;
+
       if (rtx_insn *ep_seq = targetm.gen_sibcall_epilogue ())
 	{
 	  start_sequence ();
@@ -6074,10 +6045,9 @@  epilogue_done:
 
 	  emit_insn_before (seq, insn);
 	}
-      ei_next (&ei);
     }
 
-  if (epilogue_end)
+  if (epilogue_seq)
     {
       rtx_insn *insn, *next;
 
@@ -6086,17 +6056,15 @@  epilogue_done:
 	 of such a note.  Also possibly move
 	 NOTE_INSN_FUNCTION_BEG notes, as those can be relevant for debug
 	 info generation.  */
-      for (insn = epilogue_end; insn; insn = next)
+      for (insn = epilogue_seq; insn; insn = next)
 	{
 	  next = NEXT_INSN (insn);
 	  if (NOTE_P (insn)
 	      && (NOTE_KIND (insn) == NOTE_INSN_FUNCTION_BEG))
-	    reorder_insns (insn, insn, PREV_INSN (epilogue_end));
+	    reorder_insns (insn, insn, PREV_INSN (epilogue_seq));
 	}
     }
 
-  bitmap_clear (&bb_flags);
-
   /* Threading the prologue and epilogue changes the artificial refs
      in the entry and exit blocks.  */
   epilogue_completed = 1;
diff --git a/gcc/shrink-wrap.c b/gcc/shrink-wrap.c
index 0ba1fed..b85b1c3 100644
--- a/gcc/shrink-wrap.c
+++ b/gcc/shrink-wrap.c
@@ -529,30 +529,49 @@  can_dup_for_shrink_wrapping (basic_block bb, basic_block pro, unsigned max_size)
   return true;
 }
 
-/* If the source of edge E has more than one successor, the verifier for
-   branch probabilities gets confused by the fake edges we make where
-   simple_return statements will be inserted later (because those are not
-   marked as fallthrough edges).  Fix this by creating an extra block just
-   for that fallthrough.  */
+/* Do whatever needs to be done for exits that run without prologue.
+   Sibcalls need nothing done.  Normal exits get a simple_return inserted.  */
 
-static edge
-fix_fake_fallthrough_edge (edge e)
+static void
+handle_simple_exit (edge e)
 {
-  if (EDGE_COUNT (e->src->succs) <= 1)
-    return e;
 
-  basic_block old_bb = e->src;
-  rtx_insn *end = BB_END (old_bb);
-  rtx_note *note = emit_note_after (NOTE_INSN_DELETED, end);
-  basic_block new_bb = create_basic_block (note, note, old_bb);
-  BB_COPY_PARTITION (new_bb, old_bb);
-  BB_END (old_bb) = end;
+  if (e->flags & EDGE_SIBCALL)
+    {
+      /* Tell function.c to take no further action on this edge.  */
+      e->flags |= EDGE_IGNORE;
 
-  redirect_edge_succ (e, new_bb);
-  e->flags |= EDGE_FALLTHRU;
-  e->flags &= ~EDGE_FAKE;
+      e->flags &= ~EDGE_FALLTHRU;
+      emit_barrier_after_bb (e->src);
+      return;
+    }
 
-  return make_edge (new_bb, EXIT_BLOCK_PTR_FOR_FN (cfun), EDGE_FAKE);
+  /* If the basic block the edge comes from has multiple successors,
+     split the edge.  */
+  if (EDGE_COUNT (e->src->succs) > 1)
+    {
+      basic_block old_bb = e->src;
+      rtx_insn *end = BB_END (old_bb);
+      rtx_note *note = emit_note_after (NOTE_INSN_DELETED, end);
+      basic_block new_bb = create_basic_block (note, note, old_bb);
+      BB_COPY_PARTITION (new_bb, old_bb);
+      BB_END (old_bb) = end;
+
+      redirect_edge_succ (e, new_bb);
+      e->flags |= EDGE_FALLTHRU;
+
+      e = make_edge (new_bb, EXIT_BLOCK_PTR_FOR_FN (cfun), 0);
+    }
+
+  e->flags &= ~EDGE_FALLTHRU;
+  rtx_jump_insn *ret = emit_jump_insn_after (targetm.gen_simple_return (),
+					     BB_END (e->src));
+  JUMP_LABEL (ret) = simple_return_rtx;
+  emit_barrier_after_bb (e->src);
+
+  if (dump_file)
+    fprintf (dump_file, "Made simple_return with UID %d in bb %d\n",
+	     INSN_UID (ret), e->src->index);
 }
 
 /* Try to perform a kind of shrink-wrapping, making sure the
@@ -610,13 +629,10 @@  fix_fake_fallthrough_edge (edge e)
    (bb 4 is duplicated to 5; the prologue is inserted on the edge 5->3).
 
    ENTRY_EDGE is the edge where the prologue will be placed, possibly
-   changed by this function.  BB_WITH is a bitmap that, if we do shrink-
-   wrap, will on return contain the interesting blocks that run with
-   prologue.  PROLOGUE_SEQ is the prologue we will insert.  */
+   changed by this function.  PROLOGUE_SEQ is the prologue we will insert.  */
 
 void
-try_shrink_wrapping (edge *entry_edge, bitmap_head *bb_with,
-		     rtx_insn *prologue_seq)
+try_shrink_wrapping (edge *entry_edge, rtx_insn *prologue_seq)
 {
   /* If we cannot shrink-wrap, are told not to shrink-wrap, or it makes
      no sense to shrink-wrap: then do not shrink-wrap!  */
@@ -739,6 +755,7 @@  try_shrink_wrapping (edge *entry_edge, bitmap_head *bb_with,
      reachable from PRO that we already found, and in VEC a stack of
      those we still need to consider (to find successors).  */
 
+  bitmap bb_with = BITMAP_ALLOC (NULL);
   bitmap_set_bit (bb_with, pro->index);
 
   vec<basic_block> vec;
@@ -851,6 +868,7 @@  try_shrink_wrapping (edge *entry_edge, bitmap_head *bb_with,
 
   if (pro == entry)
     {
+      BITMAP_FREE (bb_with);
       free_dominance_info (CDI_DOMINATORS);
       return;
     }
@@ -952,26 +970,7 @@  try_shrink_wrapping (edge *entry_edge, bitmap_head *bb_with,
     if (!bitmap_bit_p (bb_with, bb->index))
       FOR_EACH_EDGE (e, ei, bb->succs)
 	if (e->dest == EXIT_BLOCK_PTR_FOR_FN (cfun))
-	  {
-	    e = fix_fake_fallthrough_edge (e);
-
-	    e->flags &= ~EDGE_FALLTHRU;
-	    if (!(e->flags & EDGE_SIBCALL))
-	      {
-		rtx_insn *ret = targetm.gen_simple_return ();
-		rtx_insn *end = BB_END (e->src);
-		rtx_jump_insn *start = emit_jump_insn_after (ret, end);
-		JUMP_LABEL (start) = simple_return_rtx;
-		e->flags &= ~EDGE_FAKE;
-
-		if (dump_file)
-		  fprintf (dump_file,
-			   "Made simple_return with UID %d in bb %d\n",
-			   INSN_UID (start), e->src->index);
-	      }
-
-	    emit_barrier_after_bb (e->src);
-	  }
+	  handle_simple_exit (e);
 
   /* Finally, we want a single edge to put the prologue on.  Make a new
      block before the PRO block; the edge beteen them is the edge we want.
@@ -1004,5 +1003,6 @@  try_shrink_wrapping (edge *entry_edge, bitmap_head *bb_with,
   *entry_edge = make_single_succ_edge (new_bb, pro, EDGE_FALLTHRU);
   force_nonfallthru (*entry_edge);
 
+  BITMAP_FREE (bb_with);
   free_dominance_info (CDI_DOMINATORS);
 }
diff --git a/gcc/shrink-wrap.h b/gcc/shrink-wrap.h
index 4d821d7..e06ab37 100644
--- a/gcc/shrink-wrap.h
+++ b/gcc/shrink-wrap.h
@@ -24,8 +24,7 @@  along with GCC; see the file COPYING3.  If not see
 
 /* In shrink-wrap.c.  */
 extern bool requires_stack_frame_p (rtx_insn *, HARD_REG_SET, HARD_REG_SET);
-extern void try_shrink_wrapping (edge *entry_edge, bitmap_head *bb_flags,
-				 rtx_insn *prologue_seq);
+extern void try_shrink_wrapping (edge *entry_edge, rtx_insn *prologue_seq);
 #define SHRINK_WRAPPING_ENABLED \
   (flag_shrink_wrap && targetm.have_simple_return ())