diff mbox

Fix up __builtin_setjmp_receiver handling (PR c++/60082)

Message ID 20140206073732.GQ12671@tucnak.redhat.com
State New
Headers show

Commit Message

Jakub Jelinek Feb. 6, 2014, 7:37 a.m. UTC
Hi!

I've looked at the spawner_inline.c timeout issue, and it looks like the
problem is that __builtin_setjmp_receiver is a stmt_ends_bb_p call and
insert_backedge_copies then happily inserts statements before the call,
because it can't insert them after the call.  But __builtin_setjmp_receiver
is quite special beast, it's bb is reachable only through an abnormal edge
and it pretty much relies on being at the beginning of the bb, as before the
call e.g. frame pointer value is bogus, the expansion of the call itself
fixes that up and then emits a blockage to make sure things aren't
reordered.  But when outof-ssa emits something before the builtin, it isn't
reordered, but is on the wrong side of the frame pointer restoration.

Fixed thusly, bootstrapped/regtested on x86_64-linux and i686-linux, ok for
trunk?  Alternatively we could special case BUILT_IN_SETJMP_RECEIVER instead
in call_can_make_abnormal_goto, there is probably no need to have AB edge
out of __builtin_setjmp_receiver, we already have one out of
__builtin_setjmp_setup and receiver is really expanded just to load of frame
pointer, so it doesn't jump anywhere.

Note that this fixes both the CK tests on x86_64-linux, on i686-linux
neither of them timeout, but catch_exc.cc still fails:
 FAIL: g++.dg/cilk-plus/CK/catch_exc.cc  -O1 -fcilkplus execution test
 FAIL: g++.dg/cilk-plus/CK/catch_exc.cc  -O3 -fcilkplus execution test
 FAIL: g++.dg/cilk-plus/CK/catch_exc.cc  -g -O2 -fcilkplus execution test

2014-02-06  Jakub Jelinek  <jakub@redhat.com>

	PR c++/60082
	* tree-outof-ssa.c (insert_backedge_copies): Never insert statements
	before __builtin_setjmp_receiver.

	Revert
	2014-02-05  Balaji V. Iyer  <balaji.v.iyer@intel.com>

	* g++.dg/cilk-plus/CK/catch_exc.cc: Disable test for -O1.
	* c-c++-common/cilk-plus/CK/spawner_inline.c: Likewise.


	Jakub

Comments

Richard Biener Feb. 6, 2014, 9:13 a.m. UTC | #1
On Thu, 6 Feb 2014, Jakub Jelinek wrote:

> Hi!
> 
> I've looked at the spawner_inline.c timeout issue, and it looks like the
> problem is that __builtin_setjmp_receiver is a stmt_ends_bb_p call and
> insert_backedge_copies then happily inserts statements before the call,
> because it can't insert them after the call.  But __builtin_setjmp_receiver
> is quite special beast, it's bb is reachable only through an abnormal edge
> and it pretty much relies on being at the beginning of the bb, as before the
> call e.g. frame pointer value is bogus, the expansion of the call itself
> fixes that up and then emits a blockage to make sure things aren't
> reordered.  But when outof-ssa emits something before the builtin, it isn't
> reordered, but is on the wrong side of the frame pointer restoration.
> 
> Fixed thusly, bootstrapped/regtested on x86_64-linux and i686-linux, ok for
> trunk?  Alternatively we could special case BUILT_IN_SETJMP_RECEIVER instead
> in call_can_make_abnormal_goto, there is probably no need to have AB edge
> out of __builtin_setjmp_receiver, we already have one out of
> __builtin_setjmp_setup and receiver is really expanded just to load of frame
> pointer, so it doesn't jump anywhere.
> 
> Note that this fixes both the CK tests on x86_64-linux, on i686-linux
> neither of them timeout, but catch_exc.cc still fails:
>  FAIL: g++.dg/cilk-plus/CK/catch_exc.cc  -O1 -fcilkplus execution test
>  FAIL: g++.dg/cilk-plus/CK/catch_exc.cc  -O3 -fcilkplus execution test
>  FAIL: g++.dg/cilk-plus/CK/catch_exc.cc  -g -O2 -fcilkplus execution test

I don't like special-casing BUILT_IN_SETJMP_RECEIVER - we don't
currently have a suitable exported stmt_starts_bb_p () (the existing
one is too special), but similar to is_ctrl_altering_stmt we could
add a is_ctrl_receiving_stmt that covers nonlocal labels and other
stmts we add abnormal edges to.  Btw, why isn't BUILT_IN_SETJMP_RECEIVER
ECF_RETURNS_TWICE?

Thanks,
Richard.

> 2014-02-06  Jakub Jelinek  <jakub@redhat.com>
> 
> 	PR c++/60082
> 	* tree-outof-ssa.c (insert_backedge_copies): Never insert statements
> 	before __builtin_setjmp_receiver.
> 
> 	Revert
> 	2014-02-05  Balaji V. Iyer  <balaji.v.iyer@intel.com>
> 
> 	* g++.dg/cilk-plus/CK/catch_exc.cc: Disable test for -O1.
> 	* c-c++-common/cilk-plus/CK/spawner_inline.c: Likewise.
> 
> --- gcc/tree-outof-ssa.c.jj	2014-01-03 11:41:01.000000000 +0100
> +++ gcc/tree-outof-ssa.c	2014-02-05 22:12:03.637412102 +0100
> @@ -1152,6 +1152,12 @@ insert_backedge_copies (void)
>  		      if (TREE_CODE (arg) == SSA_NAME
>  			  && SSA_NAME_DEF_STMT (arg) == last)
>  			continue;
> +		      /* Never insert anything before
> +			 __builtin_setjmp_receiver, that builtin should be
> +			 immediately after labels.  */
> +		      if (gimple_call_builtin_p (last,
> +						 BUILT_IN_SETJMP_RECEIVER))
> +			continue;
>  		    }
>  
>  		  /* Create a new instance of the underlying variable of the
> --- gcc/testsuite/g++.dg/cilk-plus/CK/catch_exc.cc	(revision 207519)
> +++ gcc/testsuite/g++.dg/cilk-plus/CK/catch_exc.cc	(revision 207518)
> @@ -1,7 +1,6 @@
>  /* { dg-options "-fcilkplus" } */
>  /* { dg-do run { target i?86-*-* x86_64-*-* arm*-*-* } } */
>  /* { dg-options "-fcilkplus -lcilkrts" { target { i?86-*-* x86_64-*-* arm*-*-* } } } */
> -/* { dg-skip-if "" { *-*-* } { "-O1" } { "" } } */
>  
>  #include <assert.h>
>  #include <unistd.h>
> --- gcc/testsuite/c-c++-common/cilk-plus/CK/spawner_inline.c	(revision 207519)
> +++ gcc/testsuite/c-c++-common/cilk-plus/CK/spawner_inline.c	(revision 207518)
> @@ -1,7 +1,6 @@
>  /* { dg-do run  { target { i?86-*-* x86_64-*-* } } } */
>  /* { dg-options "-fcilkplus" } */
>  /* { dg-additional-options "-lcilkrts" { target { i?86-*-* x86_64-*-* } } } */
> -/* { dg-skip-if "" { *-*-* } { "-O1" } { "" } } */
>  
>  #include <stdlib.h>
>  #define DEFAULT_VALUE 30
> 
> 	Jakub
> 
>
Jakub Jelinek Feb. 6, 2014, 9:22 a.m. UTC | #2
On Thu, Feb 06, 2014 at 10:13:25AM +0100, Richard Biener wrote:
> > trunk?  Alternatively we could special case BUILT_IN_SETJMP_RECEIVER instead
> > in call_can_make_abnormal_goto, there is probably no need to have AB edge
> > out of __builtin_setjmp_receiver, we already have one out of
> > __builtin_setjmp_setup and receiver is really expanded just to load of frame
> > pointer, so it doesn't jump anywhere.
> > 
> > Note that this fixes both the CK tests on x86_64-linux, on i686-linux
> > neither of them timeout, but catch_exc.cc still fails:
> >  FAIL: g++.dg/cilk-plus/CK/catch_exc.cc  -O1 -fcilkplus execution test
> >  FAIL: g++.dg/cilk-plus/CK/catch_exc.cc  -O3 -fcilkplus execution test
> >  FAIL: g++.dg/cilk-plus/CK/catch_exc.cc  -g -O2 -fcilkplus execution test
> 
> I don't like special-casing BUILT_IN_SETJMP_RECEIVER - we don't

Not even in call_can_make_abnormal_goto?  I mean, as we split
__builtin_setjmp into __builtin_setjmp{_setup,_receiver} and the former
has abnormal edge out of it, there is no need to have another abnormal edge
out of the receiver.

> currently have a suitable exported stmt_starts_bb_p () (the existing
> one is too special), but similar to is_ctrl_altering_stmt we could
> add a is_ctrl_receiving_stmt that covers nonlocal labels and other

Yes, that works for me.  I wondered if insert_backedge_copies wouldn't do
harm if it inserts the assignments before normal setjmp call or other
call that returns twice.

> stmts we add abnormal edges to.  Btw, why isn't BUILT_IN_SETJMP_RECEIVER
> ECF_RETURNS_TWICE?

See PR60003, #c5 in particular which really didn't work at all, and Eric's
preference in #c8.

	Jakub
Richard Biener Feb. 6, 2014, 9:33 a.m. UTC | #3
On Thu, 6 Feb 2014, Jakub Jelinek wrote:

> On Thu, Feb 06, 2014 at 10:13:25AM +0100, Richard Biener wrote:
> > > trunk?  Alternatively we could special case BUILT_IN_SETJMP_RECEIVER instead
> > > in call_can_make_abnormal_goto, there is probably no need to have AB edge
> > > out of __builtin_setjmp_receiver, we already have one out of
> > > __builtin_setjmp_setup and receiver is really expanded just to load of frame
> > > pointer, so it doesn't jump anywhere.
> > > 
> > > Note that this fixes both the CK tests on x86_64-linux, on i686-linux
> > > neither of them timeout, but catch_exc.cc still fails:
> > >  FAIL: g++.dg/cilk-plus/CK/catch_exc.cc  -O1 -fcilkplus execution test
> > >  FAIL: g++.dg/cilk-plus/CK/catch_exc.cc  -O3 -fcilkplus execution test
> > >  FAIL: g++.dg/cilk-plus/CK/catch_exc.cc  -g -O2 -fcilkplus execution test
> > 
> > I don't like special-casing BUILT_IN_SETJMP_RECEIVER - we don't
> 
> Not even in call_can_make_abnormal_goto?  I mean, as we split
> __builtin_setjmp into __builtin_setjmp{_setup,_receiver} and the former
> has abnormal edge out of it, there is no need to have another abnormal edge
> out of the receiver.

longjmp has outgoing abnormal edges, setjmp has
incoming abnormal edges.  So how's the dispatcher working exactly?
It receives all abnormal edges from all possibly longjmp calling
stmts and dispatches them to all setjmp calls?

So why would there be extra edges from the receiver?

> > currently have a suitable exported stmt_starts_bb_p () (the existing
> > one is too special), but similar to is_ctrl_altering_stmt we could
> > add a is_ctrl_receiving_stmt that covers nonlocal labels and other
> 
> Yes, that works for me.  I wondered if insert_backedge_copies wouldn't do
> harm if it inserts the assignments before normal setjmp call or other
> call that returns twice.

Yeah, it disrupts the CFG (unfortunately we don't verify that abnormal
edge receievers start a BB, or rather that if a BB receives abnormal
edges there is a stmt at the beginning that receives them).  Adding
such verification to gimple_verify_flow_info would be nice.

> > stmts we add abnormal edges to.  Btw, why isn't BUILT_IN_SETJMP_RECEIVER
> > ECF_RETURNS_TWICE?
> 
> See PR60003, #c5 in particular which really didn't work at all, and Eric's
> preference in #c8.

Hmm, but it really is ECF_RETURNS_TWICE ...
Jakub Jelinek Feb. 6, 2014, 9:47 a.m. UTC | #4
On Thu, Feb 06, 2014 at 10:33:54AM +0100, Richard Biener wrote:
> > Not even in call_can_make_abnormal_goto?  I mean, as we split
> > __builtin_setjmp into __builtin_setjmp{_setup,_receiver} and the former
> > has abnormal edge out of it, there is no need to have another abnormal edge
> > out of the receiver.
> 
> longjmp has outgoing abnormal edges, setjmp has
> incoming abnormal edges.  So how's the dispatcher working exactly?
> It receives all abnormal edges from all possibly longjmp calling
> stmts and dispatches them to all setjmp calls?

I've actually removed the special __builtin_setjmp_dispatcher and replaced
it with the same ABNORMAL_DISPATCHER used for normal setjmp/returns_twice
calls.  __builtin_setjmp_receiver actually is not really a returns_twice
function, it only has the abnormal predecessor edge (from
ABNORMAL_DISPATCHER) and no other predecessor edge, so it really works as a
normal call with just the requirement that it is at the start of the bb.
We just need to treat it as receiver of the abnormal edges (we do already).
It is __builtin_setjmp_setup that has both normal fallthru edge and abnormal
edge to the dispatcher.  The dispatcher is expanded as nothing at all (it's
basic block is actually removed), the receiver as loading of frame pointer
etc. from somewhere and setup as saving frame pointer somewhere.

> So why would there be extra edges from the receiver?

That is what I'd like to change.  The reason for the extra edge from the
receiver is that it is a non-leaf call.  Supposedly if we set ECF_LEAF on
__builtin_setjmp_receiver, it might work, but that looks too dangerous to
me, at least right now.

> > > stmts we add abnormal edges to.  Btw, why isn't BUILT_IN_SETJMP_RECEIVER
> > > ECF_RETURNS_TWICE?
> > 
> > See PR60003, #c5 in particular which really didn't work at all, and Eric's
> > preference in #c8.
> 
> Hmm, but it really is ECF_RETURNS_TWICE ...

It is not.

	Jakub
Richard Biener Feb. 6, 2014, 10 a.m. UTC | #5
On Thu, 6 Feb 2014, Jakub Jelinek wrote:

> On Thu, Feb 06, 2014 at 10:33:54AM +0100, Richard Biener wrote:
> > > Not even in call_can_make_abnormal_goto?  I mean, as we split
> > > __builtin_setjmp into __builtin_setjmp{_setup,_receiver} and the former
> > > has abnormal edge out of it, there is no need to have another abnormal edge
> > > out of the receiver.
> > 
> > longjmp has outgoing abnormal edges, setjmp has
> > incoming abnormal edges.  So how's the dispatcher working exactly?
> > It receives all abnormal edges from all possibly longjmp calling
> > stmts and dispatches them to all setjmp calls?
> 
> I've actually removed the special __builtin_setjmp_dispatcher and replaced
> it with the same ABNORMAL_DISPATCHER used for normal setjmp/returns_twice
> calls.  __builtin_setjmp_receiver actually is not really a returns_twice
> function, it only has the abnormal predecessor edge (from
> ABNORMAL_DISPATCHER) and no other predecessor edge, so it really works as a
> normal call with just the requirement that it is at the start of the bb.
> We just need to treat it as receiver of the abnormal edges (we do already).
> It is __builtin_setjmp_setup that has both normal fallthru edge and abnormal
> edge to the dispatcher.  The dispatcher is expanded as nothing at all (it's
> basic block is actually removed), the receiver as loading of frame pointer
> etc. from somewhere and setup as saving frame pointer somewhere.
> 
> > So why would there be extra edges from the receiver?
> 
> That is what I'd like to change.  The reason for the extra edge from the
> receiver is that it is a non-leaf call.  Supposedly if we set ECF_LEAF on
> __builtin_setjmp_receiver, it might work, but that looks too dangerous to
> me, at least right now.

Ah, so __builtin_setjmp_receiver is like setjmp in this regard
and setjmp is LEAF (it's a stmt that doesn't direct control-flow
anywhere else).  So __builtin_setjmp_receiver should be LEAF as well
(and so should __builtin_setjmp_setup).

Doesn't sound dangerous at all to me ...

(same applies to trampoline setup and stack save/restore, _not_ to
__builtin_nonlocal_goto though)

> > > > stmts we add abnormal edges to.  Btw, why isn't BUILT_IN_SETJMP_RECEIVER
> > > > ECF_RETURNS_TWICE?
> > > 
> > > See PR60003, #c5 in particular which really didn't work at all, and Eric's
> > > preference in #c8.
> > 
> > Hmm, but it really is ECF_RETURNS_TWICE ...
> 
> It is not.

Then __builtin_setjmp_setup is?

Looks like I have to build a cross to a sjlj EH target ...

Richard.
Jakub Jelinek Feb. 6, 2014, 10:09 a.m. UTC | #6
On Thu, Feb 06, 2014 at 11:00:14AM +0100, Richard Biener wrote:
> Then __builtin_setjmp_setup is?

No, it never returns twice through the fallthru edge.  It only
returns second time through the abnormal edge virtually to the
ABNORMAL_DISPATCHER and from that through another abnormal edge
to __builtin_setjmp_receiver.  So, among other things, we don't
want abnormal edge to the start of __builtin_setjmp_setup.

Perhaps we could stop this lowering of __builtin_setjmp into
two special builtins now, but such changes look too big for me for stage4.
At expansion time we'd need to make sure to have the abnormal edges
(which right now expansion recreates from scratch) point to
the middle of __builtin_setjmp expansion (the non-local label we
insert there).  Then __builtin_setjmp would be a normal returns_twice
function.  But then there is Eric's argument that we also need to revisit
all cfun->calls_setjmp inhibiting optimizations, at least at the tree level,
and for RTL need to keep abnormal edges and kill the rest of the
optimization inhibitions.

	Jakub
Eric Botcazou Feb. 8, 2014, 9:49 a.m. UTC | #7
> Looks like I have to build a cross to a sjlj EH target ...

Let's clear a common misconception here: the SJLJ EH scheme implemented in C++ 
doesn't explicitly use __builtin_setjmp/__buitin_longjmp so it won't help to 
understand how things work for the case at hand; it's instead a hybrid scheme, 
i.e. EH machinery and __builtin_setjmp/__buitin_longjmp at the RTL level.

If you want to experiment with __builtin_setjmp/__buitin_longjmp without 
writing them explicitly, you can use the SJLJ EH scheme implemented in Ada,
because it is entirely piggybacked on __builtin_setjmp/__buitin_longjmp: gigi 
lowers all the EH constructs to use them (and consequently doesn't use the EH 
machinery at all, for example there is not a single EH edge in the CFG).

And, of course, the canonical setjmp/longjmp are something else and are never 
turned into __builtin_setjmp/__buitin_longjmp because the semantics is not the 
same.  They were recently changed to make use of the existing machinery for 
__builtin_setjmp/__buitin_longjmp under the hood (AB edges then AB dispatcher) 
but they are still optimization blockers because of cfun->calls_setjmp, which 
is set neither for __builtin_setjmp/__buitin_longjmp nor for SJLJ EH schemes.
diff mbox

Patch

--- gcc/tree-outof-ssa.c.jj	2014-01-03 11:41:01.000000000 +0100
+++ gcc/tree-outof-ssa.c	2014-02-05 22:12:03.637412102 +0100
@@ -1152,6 +1152,12 @@  insert_backedge_copies (void)
 		      if (TREE_CODE (arg) == SSA_NAME
 			  && SSA_NAME_DEF_STMT (arg) == last)
 			continue;
+		      /* Never insert anything before
+			 __builtin_setjmp_receiver, that builtin should be
+			 immediately after labels.  */
+		      if (gimple_call_builtin_p (last,
+						 BUILT_IN_SETJMP_RECEIVER))
+			continue;
 		    }
 
 		  /* Create a new instance of the underlying variable of the
--- gcc/testsuite/g++.dg/cilk-plus/CK/catch_exc.cc	(revision 207519)
+++ gcc/testsuite/g++.dg/cilk-plus/CK/catch_exc.cc	(revision 207518)
@@ -1,7 +1,6 @@ 
 /* { dg-options "-fcilkplus" } */
 /* { dg-do run { target i?86-*-* x86_64-*-* arm*-*-* } } */
 /* { dg-options "-fcilkplus -lcilkrts" { target { i?86-*-* x86_64-*-* arm*-*-* } } } */
-/* { dg-skip-if "" { *-*-* } { "-O1" } { "" } } */
 
 #include <assert.h>
 #include <unistd.h>
--- gcc/testsuite/c-c++-common/cilk-plus/CK/spawner_inline.c	(revision 207519)
+++ gcc/testsuite/c-c++-common/cilk-plus/CK/spawner_inline.c	(revision 207518)
@@ -1,7 +1,6 @@ 
 /* { dg-do run  { target { i?86-*-* x86_64-*-* } } } */
 /* { dg-options "-fcilkplus" } */
 /* { dg-additional-options "-lcilkrts" { target { i?86-*-* x86_64-*-* } } } */
-/* { dg-skip-if "" { *-*-* } { "-O1" } { "" } } */
 
 #include <stdlib.h>
 #define DEFAULT_VALUE 30