diff mbox series

Reject tail calls that read from an escaped RESULT_DECL (PR90313)

Message ID mptef1uzqjb.fsf@arm.com
State New
Headers show
Series Reject tail calls that read from an escaped RESULT_DECL (PR90313) | expand

Commit Message

Richard Sandiford Aug. 9, 2019, 8:33 a.m. UTC
In this PR we have two return paths from a function "map".  The common
code sets <result> to the value returned by one path, while the other
path does:

   <retval> = map (&<retval>, ...);

We treated this call as tail recursion, losing the copy semantics
on the value returned by the recursive call.

We'd correctly reject the same thing for variables:

   local = map (&local, ...);

The problem is that RESULT_DECLs didn't get the same treatment.

Tested on aarch64-linux-gnu and x86_64-linux-gnu.  OK to install?

Richard


2019-08-09  Richard Sandiford  <richard.sandiford@arm.com>

gcc/
	PR middle-end/90313
	* tree-tailcall.c (find_tail_calls): Reject calls that might
	read from an escaped RESULT_DECL.

gcc/testsuite/
	PR middle-end/90313
	* g++.dg/torture/pr90313.cc: New test.

Comments

Richard Biener Aug. 9, 2019, 9:28 a.m. UTC | #1
On Fri, Aug 9, 2019 at 10:33 AM Richard Sandiford
<richard.sandiford@arm.com> wrote:
>
> In this PR we have two return paths from a function "map".  The common
> code sets <result> to the value returned by one path, while the other
> path does:
>
>    <retval> = map (&<retval>, ...);
>
> We treated this call as tail recursion, losing the copy semantics
> on the value returned by the recursive call.
>
> We'd correctly reject the same thing for variables:
>
>    local = map (&local, ...);
>
> The problem is that RESULT_DECLs didn't get the same treatment.
>
> Tested on aarch64-linux-gnu and x86_64-linux-gnu.  OK to install?

OK.

Richard.

> Richard
>
>
> 2019-08-09  Richard Sandiford  <richard.sandiford@arm.com>
>
> gcc/
>         PR middle-end/90313
>         * tree-tailcall.c (find_tail_calls): Reject calls that might
>         read from an escaped RESULT_DECL.
>
> gcc/testsuite/
>         PR middle-end/90313
>         * g++.dg/torture/pr90313.cc: New test.
>
> Index: gcc/tree-tailcall.c
> ===================================================================
> --- gcc/tree-tailcall.c 2019-05-29 10:49:37.868706770 +0100
> +++ gcc/tree-tailcall.c 2019-08-09 09:31:27.441318174 +0100
> @@ -491,6 +491,35 @@ find_tail_calls (basic_block bb, struct
>        && !stmt_can_throw_external (cfun, stmt))
>      return;
>
> +  /* If the function returns a value, then at present, the tail call
> +     must return the same type of value.  There is conceptually a copy
> +     between the object returned by the tail call candidate and the
> +     object returned by CFUN itself.
> +
> +     This means that if we have:
> +
> +        lhs = f (&<retval>);    // f reads from <retval>
> +                                // (lhs is usually also <retval>)
> +
> +     there is a copy between the temporary object returned by f and lhs,
> +     meaning that any use of <retval> in f occurs before the assignment
> +     to lhs begins.  Thus the <retval> that is live on entry to the call
> +     to f is really an independent local variable V that happens to be
> +     stored in the RESULT_DECL rather than a local VAR_DECL.
> +
> +     Turning this into a tail call would remove the copy and make the
> +     lifetimes of the return value and V overlap.  The same applies to
> +     tail recursion, since if f can read from <retval>, we have to assume
> +     that CFUN might already have written to <retval> before the call.
> +
> +     The problem doesn't apply when <retval> is passed by value, but that
> +     isn't a case we handle anyway.  */
> +  tree result_decl = DECL_RESULT (cfun->decl);
> +  if (result_decl
> +      && may_be_aliased (result_decl)
> +      && ref_maybe_used_by_stmt_p (call, result_decl))
> +    return;
> +
>    /* We found the call, check whether it is suitable.  */
>    tail_recursion = false;
>    func = gimple_call_fndecl (call);
> Index: gcc/testsuite/g++.dg/torture/pr90313.cc
> ===================================================================
> --- /dev/null   2019-07-30 08:53:31.317691683 +0100
> +++ gcc/testsuite/g++.dg/torture/pr90313.cc     2019-08-09 09:31:27.437318206 +0100
> @@ -0,0 +1,33 @@
> +// { dg-do run }
> +
> +#include <stddef.h>
> +
> +namespace std {
> +  template<typename T, size_t N> struct array {
> +    T elems[N];
> +    const T &operator[](size_t i) const { return elems[i]; }
> +  };
> +}
> +
> +using Coordinates = std::array<double, 3>;
> +
> +Coordinates map(const Coordinates &c, size_t level)
> +{
> +  Coordinates result{ c[1], c[2], c[0] };
> +
> +  if (level != 0)
> +    result = map (result, level - 1);
> +
> +  return result;
> +}
> +
> +int main()
> +{
> +  Coordinates vecOfCoordinates = { 1.0, 2.0, 3.0 };
> +
> +  auto result = map(vecOfCoordinates, 1);
> +  if (result[0] != 3 || result[1] != 1 || result[2] != 2)
> +    __builtin_abort ();
> +
> +  return 0;
> +}
Jakub Jelinek Aug. 9, 2019, 10:22 a.m. UTC | #2
On Fri, Aug 09, 2019 at 11:28:32AM +0200, Richard Biener wrote:
> > Tested on aarch64-linux-gnu and x86_64-linux-gnu.  OK to install?
> 
> OK.

Can't we have a CLOBBER also for the RESULT_DECL var in some cases and
on some paths and thus shouldn't we track the RESULT_DECL in
compute_live_vars/live_vars_at_stmt
in addition to the local vars (sure, tree-ssa-live.c would need to change
the two spots where it tests VAR_P to VAR_P || == RESULT_DECL)?

	Jakub
Richard Sandiford Aug. 9, 2019, 3:19 p.m. UTC | #3
Jakub Jelinek <jakub@redhat.com> writes:
> On Fri, Aug 09, 2019 at 11:28:32AM +0200, Richard Biener wrote:
>> > Tested on aarch64-linux-gnu and x86_64-linux-gnu.  OK to install?
>> 
>> OK.
>
> Can't we have a CLOBBER also for the RESULT_DECL var in some cases and
> on some paths and thus shouldn't we track the RESULT_DECL in
> compute_live_vars/live_vars_at_stmt
> in addition to the local vars (sure, tree-ssa-live.c would need to change
> the two spots where it tests VAR_P to VAR_P || == RESULT_DECL)?
>
> 	Jakub

Have you got an example of the kind of case in which that would cause
problems?  If the RESULT_DECL isn't read by the tail call candidate,
then it should be OK for the candidate to write to and potentially
clobber the RESULT_DECL as it goes along, just like it would for
any other call.

In cases that we can currently turn into tail calls/recursion, either:

- the call assigns to the RESULT_DECL and the RESULT_DECL is live "at"
  (= after) the call

- the call assigns to a local variable that is later copied (via a chain
  of assignments) to RESULT_DECL.  The RESULT_DECL isn't then live at/after
  the call.  (The chain of assignments must be complete assignments;
  we don't allow RESULT_DECL to be written piecemeal.)

Thanks,
Richard
Jakub Jelinek Aug. 9, 2019, 3:32 p.m. UTC | #4
On Fri, Aug 09, 2019 at 04:19:38PM +0100, Richard Sandiford wrote:
> > Can't we have a CLOBBER also for the RESULT_DECL var in some cases and
> > on some paths and thus shouldn't we track the RESULT_DECL in
> > compute_live_vars/live_vars_at_stmt
> > in addition to the local vars (sure, tree-ssa-live.c would need to change
> > the two spots where it tests VAR_P to VAR_P || == RESULT_DECL)?
> 
> Have you got an example of the kind of case in which that would cause
> problems?  If the RESULT_DECL isn't read by the tail call candidate,
> then it should be OK for the candidate to write to and potentially
> clobber the RESULT_DECL as it goes along, just like it would for
> any other call.

I don't, I just wasn't sure if it can happen or not.  Maybe it can't and
would prevent NVR or NVRO from happening.
I guess in my next bootstrap I'll add some statistic gathering code if
there are ever any gimple_clobber_p stmts with RESULT_DECL on the lhs.

	Jakub
diff mbox series

Patch

Index: gcc/tree-tailcall.c
===================================================================
--- gcc/tree-tailcall.c	2019-05-29 10:49:37.868706770 +0100
+++ gcc/tree-tailcall.c	2019-08-09 09:31:27.441318174 +0100
@@ -491,6 +491,35 @@  find_tail_calls (basic_block bb, struct
       && !stmt_can_throw_external (cfun, stmt))
     return;
 
+  /* If the function returns a value, then at present, the tail call
+     must return the same type of value.  There is conceptually a copy
+     between the object returned by the tail call candidate and the
+     object returned by CFUN itself.
+
+     This means that if we have:
+
+	 lhs = f (&<retval>);    // f reads from <retval>
+				 // (lhs is usually also <retval>)
+
+     there is a copy between the temporary object returned by f and lhs,
+     meaning that any use of <retval> in f occurs before the assignment
+     to lhs begins.  Thus the <retval> that is live on entry to the call
+     to f is really an independent local variable V that happens to be
+     stored in the RESULT_DECL rather than a local VAR_DECL.
+
+     Turning this into a tail call would remove the copy and make the
+     lifetimes of the return value and V overlap.  The same applies to
+     tail recursion, since if f can read from <retval>, we have to assume
+     that CFUN might already have written to <retval> before the call.
+
+     The problem doesn't apply when <retval> is passed by value, but that
+     isn't a case we handle anyway.  */
+  tree result_decl = DECL_RESULT (cfun->decl);
+  if (result_decl
+      && may_be_aliased (result_decl)
+      && ref_maybe_used_by_stmt_p (call, result_decl))
+    return;
+
   /* We found the call, check whether it is suitable.  */
   tail_recursion = false;
   func = gimple_call_fndecl (call);
Index: gcc/testsuite/g++.dg/torture/pr90313.cc
===================================================================
--- /dev/null	2019-07-30 08:53:31.317691683 +0100
+++ gcc/testsuite/g++.dg/torture/pr90313.cc	2019-08-09 09:31:27.437318206 +0100
@@ -0,0 +1,33 @@ 
+// { dg-do run }
+
+#include <stddef.h>
+
+namespace std {
+  template<typename T, size_t N> struct array {
+    T elems[N];
+    const T &operator[](size_t i) const { return elems[i]; }
+  };
+}
+
+using Coordinates = std::array<double, 3>;
+
+Coordinates map(const Coordinates &c, size_t level)
+{
+  Coordinates result{ c[1], c[2], c[0] };
+
+  if (level != 0)
+    result = map (result, level - 1);
+
+  return result;
+}
+
+int main()
+{
+  Coordinates vecOfCoordinates = { 1.0, 2.0, 3.0 };
+
+  auto result = map(vecOfCoordinates, 1);
+  if (result[0] != 3 || result[1] != 1 || result[2] != 2)
+    __builtin_abort ();
+
+  return 0;
+}