Message ID | dd888f582418c5e1d257ed149c4bc0d6b057bd78.1495972471.git.segher@kernel.crashing.org |
---|---|
State | New |
Headers | show |
Ping. (Sorry for the very aggressive ping; this fixes 764 testsuite failures on powerpc-linux). Segher On Sun, May 28, 2017 at 12:31:12PM +0000, Segher Boessenkool wrote: > __atomic_add_fetch adds a value to some memory, and returns the result. > If there is no direct support for this, expand_builtin_atomic_fetch_op > is asked to implement this as __atomic_fetch_add (which returns the > original value of the mem), followed by the addition. Now, the > __atomic_add_fetch could have been a tail call, but we shouldn't > perform the __atomic_fetch_add as a tail call: following code would > not be executed, and in fact thrown away because there is a barrier > after tail calls. > > This fixes it. > > Tested on powerpc64-linux {-m32,-m64}. Is this okay for trunk? > > > Segher > > > 2017-05-28 Segher Boessenkool <segher@kernel.crashing.org> > > PR middle-end/80902 > * builtins.c (expand_builtin_atomic_fetch_op): If emitting code after > a call, force the call to not be a tail call. > > --- > gcc/builtins.c | 6 ++++++ > 1 file changed, 6 insertions(+) > > diff --git a/gcc/builtins.c b/gcc/builtins.c > index 4f6c9c4..3a70693 100644 > --- a/gcc/builtins.c > +++ b/gcc/builtins.c > @@ -6079,6 +6079,12 @@ expand_builtin_atomic_fetch_op (machine_mode mode, tree exp, rtx target, > gcc_assert (TREE_OPERAND (addr, 0) == fndecl); > TREE_OPERAND (addr, 0) = builtin_decl_explicit (ext_call); > > + /* If we will emit code after the call, the call can not be a tail call. > + If it is emitted as a tail call, a barrier is emitted after it, and > + then all trailing code is removed. */ > + if (!ignore) > + CALL_EXPR_TAILCALL (exp) = 0; > + > /* Expand the call here so we can emit trailing code. */ > ret = expand_call (exp, target, ignore); > > -- > 1.9.3
Ping. On Wed, May 31, 2017 at 08:19:56AM -0500, Segher Boessenkool wrote: > Ping. > > (Sorry for the very aggressive ping; this fixes 764 testsuite failures > on powerpc-linux). > > > Segher > > > On Sun, May 28, 2017 at 12:31:12PM +0000, Segher Boessenkool wrote: > > __atomic_add_fetch adds a value to some memory, and returns the result. > > If there is no direct support for this, expand_builtin_atomic_fetch_op > > is asked to implement this as __atomic_fetch_add (which returns the > > original value of the mem), followed by the addition. Now, the > > __atomic_add_fetch could have been a tail call, but we shouldn't > > perform the __atomic_fetch_add as a tail call: following code would > > not be executed, and in fact thrown away because there is a barrier > > after tail calls. > > > > This fixes it. > > > > Tested on powerpc64-linux {-m32,-m64}. Is this okay for trunk? > > > > > > Segher > > > > > > 2017-05-28 Segher Boessenkool <segher@kernel.crashing.org> > > > > PR middle-end/80902 > > * builtins.c (expand_builtin_atomic_fetch_op): If emitting code after > > a call, force the call to not be a tail call. > > > > --- > > gcc/builtins.c | 6 ++++++ > > 1 file changed, 6 insertions(+) > > > > diff --git a/gcc/builtins.c b/gcc/builtins.c > > index 4f6c9c4..3a70693 100644 > > --- a/gcc/builtins.c > > +++ b/gcc/builtins.c > > @@ -6079,6 +6079,12 @@ expand_builtin_atomic_fetch_op (machine_mode mode, tree exp, rtx target, > > gcc_assert (TREE_OPERAND (addr, 0) == fndecl); > > TREE_OPERAND (addr, 0) = builtin_decl_explicit (ext_call); > > > > + /* If we will emit code after the call, the call can not be a tail call. > > + If it is emitted as a tail call, a barrier is emitted after it, and > > + then all trailing code is removed. */ > > + if (!ignore) > > + CALL_EXPR_TAILCALL (exp) = 0; > > + > > /* Expand the call here so we can emit trailing code. */ > > ret = expand_call (exp, target, ignore); > > > > -- > > 1.9.3
On 05/28/2017 06:31 AM, Segher Boessenkool wrote: > __atomic_add_fetch adds a value to some memory, and returns the result. > If there is no direct support for this, expand_builtin_atomic_fetch_op > is asked to implement this as __atomic_fetch_add (which returns the > original value of the mem), followed by the addition. Now, the > __atomic_add_fetch could have been a tail call, but we shouldn't > perform the __atomic_fetch_add as a tail call: following code would > not be executed, and in fact thrown away because there is a barrier > after tail calls. > > This fixes it. > > Tested on powerpc64-linux {-m32,-m64}. Is this okay for trunk? > > > Segher > > > 2017-05-28 Segher Boessenkool <segher@kernel.crashing.org> > > PR middle-end/80902 > * builtins.c (expand_builtin_atomic_fetch_op): If emitting code after > a call, force the call to not be a tail call. Hmmm. I wonder if we have similar problems elsewhere. For example expand_builtin_int_roundingfn_2, stack_protect_epilogue, expand_builtin_trap (though this one probably isn't broken in practice), expand_ifn_atomic_compare_exchange_into_call. OK, but please check the other instances where we call expand_call, then continue generating code afterwards. Fixing those can be a follow-up patch. Sorry for the delay. Jeff
On Thu, Jun 22, 2017 at 10:59:05PM -0600, Jeff Law wrote: > On 05/28/2017 06:31 AM, Segher Boessenkool wrote: > > __atomic_add_fetch adds a value to some memory, and returns the result. > > If there is no direct support for this, expand_builtin_atomic_fetch_op > > is asked to implement this as __atomic_fetch_add (which returns the > > original value of the mem), followed by the addition. Now, the > > __atomic_add_fetch could have been a tail call, but we shouldn't > > perform the __atomic_fetch_add as a tail call: following code would > > not be executed, and in fact thrown away because there is a barrier > > after tail calls. > > > > This fixes it. > > PR middle-end/80902 > > * builtins.c (expand_builtin_atomic_fetch_op): If emitting code after > > a call, force the call to not be a tail call. > Hmmm. I wonder if we have similar problems elsewhere. For example > expand_builtin_int_roundingfn_2, stack_protect_epilogue, > expand_builtin_trap (though this one probably isn't broken in practice), > expand_ifn_atomic_compare_exchange_into_call. > > OK, but please check the other instances where we call expand_call, then > continue generating code afterwards. Fixing those can be a follow-up patch. I guess we want an expand_call_notail helper? Or, hrm, why are function calls expanded as tail calls at all, should that not be decided later? Segher
On 06/23/2017 11:44 AM, Segher Boessenkool wrote: > On Thu, Jun 22, 2017 at 10:59:05PM -0600, Jeff Law wrote: >> On 05/28/2017 06:31 AM, Segher Boessenkool wrote: >>> __atomic_add_fetch adds a value to some memory, and returns the result. >>> If there is no direct support for this, expand_builtin_atomic_fetch_op >>> is asked to implement this as __atomic_fetch_add (which returns the >>> original value of the mem), followed by the addition. Now, the >>> __atomic_add_fetch could have been a tail call, but we shouldn't >>> perform the __atomic_fetch_add as a tail call: following code would >>> not be executed, and in fact thrown away because there is a barrier >>> after tail calls. >>> >>> This fixes it. > >>> PR middle-end/80902 >>> * builtins.c (expand_builtin_atomic_fetch_op): If emitting code after >>> a call, force the call to not be a tail call. >> Hmmm. I wonder if we have similar problems elsewhere. For example >> expand_builtin_int_roundingfn_2, stack_protect_epilogue, >> expand_builtin_trap (though this one probably isn't broken in practice), >> expand_ifn_atomic_compare_exchange_into_call. >> >> OK, but please check the other instances where we call expand_call, then >> continue generating code afterwards. Fixing those can be a follow-up patch. > > I guess we want an expand_call_notail helper? Probably. Or, hrm, why are function > calls expanded as tail calls at all, should that not be decided later? That's how I thought it worked. We create two streams of insns, then decide later which of the two streams to use. But I think part of the criteria for creating streams was that call was in the tail position to start with. And that's not the case with the code you pointed out and the others I found. Jeff
On Thu, Jun 22, 2017 at 10:59:05PM -0600, Jeff Law wrote: > On 05/28/2017 06:31 AM, Segher Boessenkool wrote: > > __atomic_add_fetch adds a value to some memory, and returns the result. > > If there is no direct support for this, expand_builtin_atomic_fetch_op > > is asked to implement this as __atomic_fetch_add (which returns the > > original value of the mem), followed by the addition. Now, the > > __atomic_add_fetch could have been a tail call, but we shouldn't > > perform the __atomic_fetch_add as a tail call: following code would > > not be executed, and in fact thrown away because there is a barrier > > after tail calls. > Hmmm. I wonder if we have similar problems elsewhere. For example > expand_builtin_int_roundingfn_2, stack_protect_epilogue, > expand_builtin_trap (though this one probably isn't broken in practice), > expand_ifn_atomic_compare_exchange_into_call. > > OK, but please check the other instances where we call expand_call, then > continue generating code afterwards. Fixing those can be a follow-up patch. We certainly have similar problems elsewhere. I'm doing tests detecting whenever we create dead code (right after a barrier); it finds a few things, mostly harmless, but there are quite a few places where we create dead code during expand. This will take a while, but it will need to happen during stage 1... I'm trying to fit it in :-/ Segher
diff --git a/gcc/builtins.c b/gcc/builtins.c index 4f6c9c4..3a70693 100644 --- a/gcc/builtins.c +++ b/gcc/builtins.c @@ -6079,6 +6079,12 @@ expand_builtin_atomic_fetch_op (machine_mode mode, tree exp, rtx target, gcc_assert (TREE_OPERAND (addr, 0) == fndecl); TREE_OPERAND (addr, 0) = builtin_decl_explicit (ext_call); + /* If we will emit code after the call, the call can not be a tail call. + If it is emitted as a tail call, a barrier is emitted after it, and + then all trailing code is removed. */ + if (!ignore) + CALL_EXPR_TAILCALL (exp) = 0; + /* Expand the call here so we can emit trailing code. */ ret = expand_call (exp, target, ignore);