diff mbox

Fix expand_builtin_atomic_fetch_op for pre-op (PR80902)

Message ID dd888f582418c5e1d257ed149c4bc0d6b057bd78.1495972471.git.segher@kernel.crashing.org
State New
Headers show

Commit Message

Segher Boessenkool May 28, 2017, 12:31 p.m. UTC
__atomic_add_fetch adds a value to some memory, and returns the result.
If there is no direct support for this, expand_builtin_atomic_fetch_op
is asked to implement this as __atomic_fetch_add (which returns the
original value of the mem), followed by the addition.  Now, the
__atomic_add_fetch could have been a tail call, but we shouldn't
perform the __atomic_fetch_add as a tail call: following code would
not be executed, and in fact thrown away because there is a barrier
after tail calls.

This fixes it.

Tested on powerpc64-linux {-m32,-m64}.  Is this okay for trunk?


Segher


2017-05-28  Segher Boessenkool  <segher@kernel.crashing.org>

	PR middle-end/80902
	* builtins.c (expand_builtin_atomic_fetch_op): If emitting code after
	a call, force the call to not be a tail call.

---
 gcc/builtins.c | 6 ++++++
 1 file changed, 6 insertions(+)

Comments

Segher Boessenkool May 31, 2017, 1:19 p.m. UTC | #1
Ping.

(Sorry for the very aggressive ping; this fixes 764 testsuite failures
on powerpc-linux).


Segher


On Sun, May 28, 2017 at 12:31:12PM +0000, Segher Boessenkool wrote:
> __atomic_add_fetch adds a value to some memory, and returns the result.
> If there is no direct support for this, expand_builtin_atomic_fetch_op
> is asked to implement this as __atomic_fetch_add (which returns the
> original value of the mem), followed by the addition.  Now, the
> __atomic_add_fetch could have been a tail call, but we shouldn't
> perform the __atomic_fetch_add as a tail call: following code would
> not be executed, and in fact thrown away because there is a barrier
> after tail calls.
> 
> This fixes it.
> 
> Tested on powerpc64-linux {-m32,-m64}.  Is this okay for trunk?
> 
> 
> Segher
> 
> 
> 2017-05-28  Segher Boessenkool  <segher@kernel.crashing.org>
> 
> 	PR middle-end/80902
> 	* builtins.c (expand_builtin_atomic_fetch_op): If emitting code after
> 	a call, force the call to not be a tail call.
> 
> ---
>  gcc/builtins.c | 6 ++++++
>  1 file changed, 6 insertions(+)
> 
> diff --git a/gcc/builtins.c b/gcc/builtins.c
> index 4f6c9c4..3a70693 100644
> --- a/gcc/builtins.c
> +++ b/gcc/builtins.c
> @@ -6079,6 +6079,12 @@ expand_builtin_atomic_fetch_op (machine_mode mode, tree exp, rtx target,
>    gcc_assert (TREE_OPERAND (addr, 0) == fndecl);
>    TREE_OPERAND (addr, 0) = builtin_decl_explicit (ext_call);
>  
> +  /* If we will emit code after the call, the call can not be a tail call.
> +     If it is emitted as a tail call, a barrier is emitted after it, and
> +     then all trailing code is removed.  */
> +  if (!ignore)
> +    CALL_EXPR_TAILCALL (exp) = 0;
> +
>    /* Expand the call here so we can emit trailing code.  */
>    ret = expand_call (exp, target, ignore);
>  
> -- 
> 1.9.3
Segher Boessenkool June 11, 2017, 12:15 a.m. UTC | #2
Ping.

On Wed, May 31, 2017 at 08:19:56AM -0500, Segher Boessenkool wrote:
> Ping.
> 
> (Sorry for the very aggressive ping; this fixes 764 testsuite failures
> on powerpc-linux).
> 
> 
> Segher
> 
> 
> On Sun, May 28, 2017 at 12:31:12PM +0000, Segher Boessenkool wrote:
> > __atomic_add_fetch adds a value to some memory, and returns the result.
> > If there is no direct support for this, expand_builtin_atomic_fetch_op
> > is asked to implement this as __atomic_fetch_add (which returns the
> > original value of the mem), followed by the addition.  Now, the
> > __atomic_add_fetch could have been a tail call, but we shouldn't
> > perform the __atomic_fetch_add as a tail call: following code would
> > not be executed, and in fact thrown away because there is a barrier
> > after tail calls.
> > 
> > This fixes it.
> > 
> > Tested on powerpc64-linux {-m32,-m64}.  Is this okay for trunk?
> > 
> > 
> > Segher
> > 
> > 
> > 2017-05-28  Segher Boessenkool  <segher@kernel.crashing.org>
> > 
> > 	PR middle-end/80902
> > 	* builtins.c (expand_builtin_atomic_fetch_op): If emitting code after
> > 	a call, force the call to not be a tail call.
> > 
> > ---
> >  gcc/builtins.c | 6 ++++++
> >  1 file changed, 6 insertions(+)
> > 
> > diff --git a/gcc/builtins.c b/gcc/builtins.c
> > index 4f6c9c4..3a70693 100644
> > --- a/gcc/builtins.c
> > +++ b/gcc/builtins.c
> > @@ -6079,6 +6079,12 @@ expand_builtin_atomic_fetch_op (machine_mode mode, tree exp, rtx target,
> >    gcc_assert (TREE_OPERAND (addr, 0) == fndecl);
> >    TREE_OPERAND (addr, 0) = builtin_decl_explicit (ext_call);
> >  
> > +  /* If we will emit code after the call, the call can not be a tail call.
> > +     If it is emitted as a tail call, a barrier is emitted after it, and
> > +     then all trailing code is removed.  */
> > +  if (!ignore)
> > +    CALL_EXPR_TAILCALL (exp) = 0;
> > +
> >    /* Expand the call here so we can emit trailing code.  */
> >    ret = expand_call (exp, target, ignore);
> >  
> > -- 
> > 1.9.3
Jeff Law June 23, 2017, 4:59 a.m. UTC | #3
On 05/28/2017 06:31 AM, Segher Boessenkool wrote:
> __atomic_add_fetch adds a value to some memory, and returns the result.
> If there is no direct support for this, expand_builtin_atomic_fetch_op
> is asked to implement this as __atomic_fetch_add (which returns the
> original value of the mem), followed by the addition.  Now, the
> __atomic_add_fetch could have been a tail call, but we shouldn't
> perform the __atomic_fetch_add as a tail call: following code would
> not be executed, and in fact thrown away because there is a barrier
> after tail calls.
> 
> This fixes it.
> 
> Tested on powerpc64-linux {-m32,-m64}.  Is this okay for trunk?
> 
> 
> Segher
> 
> 
> 2017-05-28  Segher Boessenkool  <segher@kernel.crashing.org>
> 
> 	PR middle-end/80902
> 	* builtins.c (expand_builtin_atomic_fetch_op): If emitting code after
> 	a call, force the call to not be a tail call.
Hmmm.  I wonder if we have similar problems elsewhere.  For example
expand_builtin_int_roundingfn_2, stack_protect_epilogue,
expand_builtin_trap (though this one probably isn't broken in practice),
expand_ifn_atomic_compare_exchange_into_call.

OK, but please check the other instances where we call expand_call, then
continue generating code afterwards.  Fixing those can be a follow-up patch.

Sorry for the delay.

Jeff
Segher Boessenkool June 23, 2017, 5:44 p.m. UTC | #4
On Thu, Jun 22, 2017 at 10:59:05PM -0600, Jeff Law wrote:
> On 05/28/2017 06:31 AM, Segher Boessenkool wrote:
> > __atomic_add_fetch adds a value to some memory, and returns the result.
> > If there is no direct support for this, expand_builtin_atomic_fetch_op
> > is asked to implement this as __atomic_fetch_add (which returns the
> > original value of the mem), followed by the addition.  Now, the
> > __atomic_add_fetch could have been a tail call, but we shouldn't
> > perform the __atomic_fetch_add as a tail call: following code would
> > not be executed, and in fact thrown away because there is a barrier
> > after tail calls.
> > 
> > This fixes it.

> > 	PR middle-end/80902
> > 	* builtins.c (expand_builtin_atomic_fetch_op): If emitting code after
> > 	a call, force the call to not be a tail call.
> Hmmm.  I wonder if we have similar problems elsewhere.  For example
> expand_builtin_int_roundingfn_2, stack_protect_epilogue,
> expand_builtin_trap (though this one probably isn't broken in practice),
> expand_ifn_atomic_compare_exchange_into_call.
> 
> OK, but please check the other instances where we call expand_call, then
> continue generating code afterwards.  Fixing those can be a follow-up patch.

I guess we want an expand_call_notail helper?  Or, hrm, why are function
calls expanded as tail calls at all, should that not be decided later?


Segher
Jeff Law June 23, 2017, 5:48 p.m. UTC | #5
On 06/23/2017 11:44 AM, Segher Boessenkool wrote:
> On Thu, Jun 22, 2017 at 10:59:05PM -0600, Jeff Law wrote:
>> On 05/28/2017 06:31 AM, Segher Boessenkool wrote:
>>> __atomic_add_fetch adds a value to some memory, and returns the result.
>>> If there is no direct support for this, expand_builtin_atomic_fetch_op
>>> is asked to implement this as __atomic_fetch_add (which returns the
>>> original value of the mem), followed by the addition.  Now, the
>>> __atomic_add_fetch could have been a tail call, but we shouldn't
>>> perform the __atomic_fetch_add as a tail call: following code would
>>> not be executed, and in fact thrown away because there is a barrier
>>> after tail calls.
>>>
>>> This fixes it.
> 
>>> 	PR middle-end/80902
>>> 	* builtins.c (expand_builtin_atomic_fetch_op): If emitting code after
>>> 	a call, force the call to not be a tail call.
>> Hmmm.  I wonder if we have similar problems elsewhere.  For example
>> expand_builtin_int_roundingfn_2, stack_protect_epilogue,
>> expand_builtin_trap (though this one probably isn't broken in practice),
>> expand_ifn_atomic_compare_exchange_into_call.
>>
>> OK, but please check the other instances where we call expand_call, then
>> continue generating code afterwards.  Fixing those can be a follow-up patch.
> 
> I guess we want an expand_call_notail helper? 
Probably.

 Or, hrm, why are function
> calls expanded as tail calls at all, should that not be decided later?
That's how I thought it worked.  We create two streams of insns, then
decide later which of the two streams to use.

But I think part of the criteria for creating streams was that call was
in the tail position to start with.  And that's not the case with the
code you pointed out and the others I found.

Jeff
Segher Boessenkool July 7, 2017, 11:56 a.m. UTC | #6
On Thu, Jun 22, 2017 at 10:59:05PM -0600, Jeff Law wrote:
> On 05/28/2017 06:31 AM, Segher Boessenkool wrote:
> > __atomic_add_fetch adds a value to some memory, and returns the result.
> > If there is no direct support for this, expand_builtin_atomic_fetch_op
> > is asked to implement this as __atomic_fetch_add (which returns the
> > original value of the mem), followed by the addition.  Now, the
> > __atomic_add_fetch could have been a tail call, but we shouldn't
> > perform the __atomic_fetch_add as a tail call: following code would
> > not be executed, and in fact thrown away because there is a barrier
> > after tail calls.

> Hmmm.  I wonder if we have similar problems elsewhere.  For example
> expand_builtin_int_roundingfn_2, stack_protect_epilogue,
> expand_builtin_trap (though this one probably isn't broken in practice),
> expand_ifn_atomic_compare_exchange_into_call.
> 
> OK, but please check the other instances where we call expand_call, then
> continue generating code afterwards.  Fixing those can be a follow-up patch.

We certainly have similar problems elsewhere.

I'm doing tests detecting whenever we create dead code (right after a
barrier); it finds a few things, mostly harmless, but there are quite
a few places where we create dead code during expand.  This will take
a while, but it will need to happen during stage 1...  I'm trying to
fit it in :-/


Segher
diff mbox

Patch

diff --git a/gcc/builtins.c b/gcc/builtins.c
index 4f6c9c4..3a70693 100644
--- a/gcc/builtins.c
+++ b/gcc/builtins.c
@@ -6079,6 +6079,12 @@  expand_builtin_atomic_fetch_op (machine_mode mode, tree exp, rtx target,
   gcc_assert (TREE_OPERAND (addr, 0) == fndecl);
   TREE_OPERAND (addr, 0) = builtin_decl_explicit (ext_call);
 
+  /* If we will emit code after the call, the call can not be a tail call.
+     If it is emitted as a tail call, a barrier is emitted after it, and
+     then all trailing code is removed.  */
+  if (!ignore)
+    CALL_EXPR_TAILCALL (exp) = 0;
+
   /* Expand the call here so we can emit trailing code.  */
   ret = expand_call (exp, target, ignore);