diff mbox series

coroutines: Implement n4849 recommended symmetric transfer.

Message ID 45518B31-C75F-4FF1-A8F7-EBEDF788A76F@gmail.com
State New
Headers show
Series coroutines: Implement n4849 recommended symmetric transfer. | expand

Commit Message

Jeff Law via Gcc-patches March 20, 2020, 3:40 p.m. UTC
Hi,

This is the final (non-bug-fixing) patch to bring our implementation
into line with n4849.

===

Although the note in the text [expr.await] / 5.1.1 is not normative,
it is asserted by users that an implementation that is unable to
perform unlimited symmetric transfers is not terribly useful.

This relates to the following circumstance:

try {
 users-function-body:
 {
    ....
    { some suspend context
      continuation_handle = await_suspend (another handle);
      continuation_handle.resume ();
      'return' (actually a suspension operation).
    }
  }
} catch (...) {}

The call to 'continuation_handle.resume ()' needs to be a tail-
call in order that an arbitrary number of coroutines can be handled
in this manner.  There are two issues with this:

1. That the user's function body is wrapped in a try/catch block and
   one cannot tail-call from within those.

2. That GCC doesn't usually produce tail-calls when the optimisation
   level is < O2.

After considerable discussion at WG21 meetings, it has been determined
that the intent is that the operation behaves as if the resume call is
executed in the context of the caller.

So, we can remap the fragment above like this:

{
  void_coroutine_handle continuation;

  try {
   users-function-body:
   {
      ....
      { some suspend context
        continuation = await_suspend (another handle);
        <scope exit without cleanup> symmetric_transfer;
      }
    }
  } catch (...) {}

symmetric_transfer:
  continuation.resume(); [tail call] [must tail call]
}

Thus we take the call outside the try-catch block which solves
issue (1) and mark it as a tail call and as "must tail call" for
correctness which solves (2).

As bonuses, since we no longer need to differentiate handle types
returned from await_suspend() methods, nor do we need to keep them
in the coroutine frame, since they are ephemeral, we save entries in
the frame and reduce some code too.

tested on x86_64-{linux,darwin} and powerpc-linux,
OK for master?
thanks
Iain


gcc/cp/ChangeLog:

2020-03-20  Iain Sandoe  <iain@sandoe.co.uk>

	* coroutines.cc (coro_init_identifiers): Initialize an identifier
	for the cororoutine handle 'address' method name.
	(struct coro_aw_data): Add fields to cover the continuations.
	(co_await_expander): Determine the kind of await_suspend in use.
	If we have the case that returns a continuation handle, then save
	this and make the target for 'scope exit without cleanup' be the
	continuation resume label.
	(expand_co_awaits): Handle the continuation case.
	(struct suspend_point_info): Remove fields that kept the returned
	await_suspend handle type.
	(transform_await_expr): Remove code tracking continuation handles.
	(build_actor_fn): Add the continuation handle as an actor-function
	scope var.  Build the symmetric transfer continuation point.
	(register_await_info): Remove fields tracking continuation handles.
	(get_await_suspend_return_type): Remove.
	(register_awaits): Remove code tracking continuation handles.
	(morph_fn_to_coro): Remove code tracking continuation handles.

gcc/testsuite/ChangeLog:

2020-03-20  Iain Sandoe  <iain@sandoe.co.uk>

	* g++.dg/coroutines/torture/co-ret-09-bool-await-susp.C: Amend
	to n4849 behaviour.
	* g++.dg/coroutines/torture/symmetric-transfer-00-basic.C: New
	test.

Comments

Nathan Sidwell March 24, 2020, 1:12 p.m. UTC | #1
On 3/20/20 11:40 AM, Iain Sandoe via Gcc-patches wrote:

> 2020-03-20  Iain Sandoe  <iain@sandoe.co.uk>
> 
> 	* coroutines.cc (coro_init_identifiers): Initialize an identifier
> 	for the cororoutine handle 'address' method name.
> 	(struct coro_aw_data): Add fields to cover the continuations.
> 	(co_await_expander): Determine the kind of await_suspend in use.
> 	If we have the case that returns a continuation handle, then save
> 	this and make the target for 'scope exit without cleanup' be the
> 	continuation resume label.
> 	(expand_co_awaits): Handle the continuation case.
> 	(struct suspend_point_info): Remove fields that kept the returned
> 	await_suspend handle type.
> 	(transform_await_expr): Remove code tracking continuation handles.
> 	(build_actor_fn): Add the continuation handle as an actor-function
> 	scope var.  Build the symmetric transfer continuation point.
> 	(register_await_info): Remove fields tracking continuation handles.
> 	(get_await_suspend_return_type): Remove.
> 	(register_awaits): Remove code tracking continuation handles.
> 	(morph_fn_to_coro): Remove code tracking continuation handles.
> 
> gcc/testsuite/ChangeLog:
> 
> 2020-03-20  Iain Sandoe  <iain@sandoe.co.uk>
> 
> 	* g++.dg/coroutines/torture/co-ret-09-bool-await-susp.C: Amend
> 	to n4849 behaviour.
> 	* g++.dg/coroutines/torture/symmetric-transfer-00-basic.C: New
> 	test.
> 
> 
> diff --git a/gcc/cp/coroutines.cc b/gcc/cp/coroutines.cc
> index abd4cac1c82..49e03f2ccea 100644
> --- a/gcc/cp/coroutines.cc
> +++ b/gcc/cp/coroutines.cc

>     tree suspend = TREE_VEC_ELT (awaiter_calls, 1); /* await_suspend().  */
> +  tree susp_type;
> +  if (TREE_CODE (suspend) == CALL_EXPR)
> +    {
> +      susp_type = CALL_EXPR_FN (suspend);
> +      if (TREE_CODE (susp_type) == ADDR_EXPR)
> +	susp_type = TREE_OPERAND (susp_type, 0);
> +      susp_type = TREE_TYPE (TREE_TYPE (susp_type));
> +    }
> +  else
> +    susp_type = TREE_TYPE (suspend);

I think:
  if (tree fndec = get_callee_fndecl_nofold (suspend))
     susp_type = TREE_TYPE (TREE_TYPE (fndecl));
  else
     susp_type = TREE_TYPE (suspend);
would do?  But how can TREE_TYPE (suspend) be different from the return 
type of the function decl?  It seems funky that the behaviour could 
depend on the form of the suspend.  What am I missing?


> @@ -1532,6 +1553,8 @@ co_await_expander (tree *stmt, int * /*do_subtree*/, void *d)
>       = build1 (ADDR_EXPR, build_reference_type (void_type_node), resume_label);
>     tree susp
>       = build1 (ADDR_EXPR, build_reference_type (void_type_node), data->cororet);
> +  tree cont
> +    = build1 (ADDR_EXPR, build_reference_type (void_type_node), data->corocont);
>     tree final_susp = build_int_cst (integer_type_node, is_final ? 1 : 0);

Wait, 'void &' is not a thing.  What's been happening here?  do you mean 
to build pointer_types?  'build_address (data->$FIELD)'

> @@ -2012,6 +2027,15 @@ build_actor_fn (location_t loc, tree coro_frame_type, tree actor, tree fnbody,
>       = create_named_label_with_ctx (loc, "actor.begin", actor);
>     tree actor_frame = build1_loc (loc, INDIRECT_REF, coro_frame_type, actor_fp);
>   
> +  /* Init the continuation with a NULL ptr.  */
> +  tree zeroinit = build1 (CONVERT_EXPR, void_coro_handle_type,
> +			  integer_zero_node);
> +  tree ci = build2 (INIT_EXPR, void_coro_handle_type, continuation, zeroinit);
> +  ci = build_stmt (loc, DECL_EXPR, continuation);

This appears to be overwriting ci?

> +  ci = build1_loc (loc, CONVERT_EXPR, void_type_node, ci);
.. and still expecting it to be an EXPR node?
> +  ci = maybe_cleanup_point_expr_void (ci);
> +  add_stmt (ci);
> +
>     /* Re-write param references in the body, no code should be generated
>        here.  */
>     if (DECL_ARGUMENTS (orig))

> +
> +  /* Now we have the actual call, and we can mark it as a tail.  */
> +  CALL_EXPR_TAILCALL (resume) = true;
> +  /* ... and for optimisation levels 0..1, mark it as requiring a tail-call
> +     for correctness.  It seems that doing this for optimisation levels that
> +     normally perform tail-calling, confuses the ME (or it would be logical
> +     to put this on unilaterally).  */

  Might be worth ping a ME expert about why that is?

> +  if (optimize < 2)
> +    CALL_EXPR_MUST_TAIL_CALL (resume) = true;
> +  resume = coro_build_cvt_void_expr_stmt (resume, loc);
> +  add_stmt (resume);
> +
> +  r = build_stmt (loc, RETURN_EXPR, NULL);
> +  r = maybe_cleanup_point_expr_void (r);

Shouldn't there be no cleanups?  Perhaps assert it didn't add any?

> +  add_stmt (r);
> +
>     /* We need the resume index to work with.  */
>     tree res_idx_m
>       = lookup_member (coro_frame_type, resume_idx_name,

nathan
Iain Sandoe March 24, 2020, 6:08 p.m. UTC | #2
Hi Nathan,
Thanks for the review,
comments embedded and a new version attached.

@David, you added the CALL_EXPR_MUST_TAIL_CALL which I’ve made use of
here (but with an observation).  Perhaps you would be able to comment on whether
I’ve (ab-)used it correctly?

OK for master now?
thanks
Iain

Nathan Sidwell <nathan@acm.org> wrote:

> On 3/20/20 11:40 AM, Iain Sandoe via Gcc-patches wrote:
> 

>>    tree suspend = TREE_VEC_ELT (awaiter_calls, 1); /* await_suspend().  */
>> +  tree susp_type;
>> +  if (TREE_CODE (suspend) == CALL_EXPR)
>> +    {
>> +      susp_type = CALL_EXPR_FN (suspend);
>> +      if (TREE_CODE (susp_type) == ADDR_EXPR)
>> +	susp_type = TREE_OPERAND (susp_type, 0);
>> +      susp_type = TREE_TYPE (TREE_TYPE (susp_type));
>> +    }
>> +  else
>> +    susp_type = TREE_TYPE (suspend);
> 
> I think:
> if (tree fndec = get_callee_fndecl_nofold (suspend))
>    susp_type = TREE_TYPE (TREE_TYPE (fndecl));
> else
>    susp_type = TREE_TYPE (suspend);
> would do?  But how can TREE_TYPE (suspend) be different from the return type of the function decl?  It seems funky that the behaviour could depend on the form of the suspend.  What am I missing?

Your proposed change is much neater, thanks.
The reason for the alternate is that the expression can also be a target_expr.

>> @@ -1532,6 +1553,8 @@ co_await_expander (tree *stmt, int * /*do_subtree*/, void *d)
>>      = build1 (ADDR_EXPR, build_reference_type (void_type_node), resume_label);
>>    tree susp
>>      = build1 (ADDR_EXPR, build_reference_type (void_type_node), data->cororet);
>> +  tree cont
>> +    = build1 (ADDR_EXPR, build_reference_type (void_type_node), data->corocont);
>>    tree final_susp = build_int_cst (integer_type_node, is_final ? 1 : 0);
> 
> Wait, 'void &' is not a thing.  What's been happening here?  do you mean to build pointer_types?  'build_address (data->$FIELD)’
yes, fixed.

>> @@ -2012,6 +2027,15 @@ build_actor_fn (location_t loc, tree coro_frame_type, tree actor, tree fnbody,
>>      = create_named_label_with_ctx (loc, "actor.begin", actor);
>>    tree actor_frame = build1_loc (loc, INDIRECT_REF, coro_frame_type, actor_fp);
>>  +  /* Init the continuation with a NULL ptr.  */
>> +  tree zeroinit = build1 (CONVERT_EXPR, void_coro_handle_type,
>> +			  integer_zero_node);
>> +  tree ci = build2 (INIT_EXPR, void_coro_handle_type, continuation, zeroinit);
>> +  ci = build_stmt (loc, DECL_EXPR, continuation);
> 
> This appears to be overwriting ci?

thanks, my carelessness :( .. I failed to delete all the code from an earlier version.
fixed.

>> +  /* Now we have the actual call, and we can mark it as a tail.  */
>> +  CALL_EXPR_TAILCALL (resume) = true;
>> +  /* ... and for optimisation levels 0..1, mark it as requiring a tail-call
>> +     for correctness.  It seems that doing this for optimisation levels that
>> +     normally perform tail-calling, confuses the ME (or it would be logical
>> +     to put this on unilaterally).  */
> 
> Might be worth ping a ME expert about why that is?

David, any comments here?
(confuses == ICE in expand, with an unexpected dead edge, IIRC).

>> +  if (optimize < 2)
>> +    CALL_EXPR_MUST_TAIL_CALL (resume) = true;
>> +  resume = coro_build_cvt_void_expr_stmt (resume, loc);
>> +  add_stmt (resume);
>> +
>> +  r = build_stmt (loc, RETURN_EXPR, NULL);
>> +  r = maybe_cleanup_point_expr_void (r);
> 
> Shouldn't there be no cleanups?  Perhaps assert it didn't add any?
no, you’re right, that call’s  a waste of time - hoist with the C&P petard again.
removed (and I’d like to do a trivial follow-up to remove two more instances).

======

amended patch:

coroutines: Implement n4849 recommended symmetric transfer.
    
Although the note in the text [expr.await] / 5.1.1 is not normative,
it is asserted by users that an implementation that is unable to
perform unlimited symmetric transfers is not terribly useful.

This relates to the following circumstance:

try {
     users-function-body:
     {
        ....
        { some suspend context
          continuation_handle = await_suspend (another handle);
          continuation_handle.resume ();
          'return' (actually a suspension operation).
        }
      }
    } catch (...) {}

The call to 'continuation_handle.resume ()' needs to be a tail-
call in order that an arbitrary number of coroutines can be handled
in this manner.  There are two issues with this:

1. That the user's function body is wrapped in a try/catch block and
    one cannot tail-call from within those.

2. That GCC doesn't usually produce tail-calls when the optimisation
    level is < O2.

After considerable discussion at WG21 meetings, it has been determined
that the intent is that the operation behaves as if the resume call is
executed in the context of the caller.

So, we can remap the fragment above like this:

    {
      void_coroutine_handle continuation;
    
      try {
       users-function-body:
       {
          ....
          { some suspend context
            continuation = await_suspend (another handle);
            <scope exit without cleanup> symmetric_transfer;
          }
        }
      } catch (...) {}
    
    symmetric_transfer:
      continuation.resume(); [tail call] [must tail call]
    }

Thus we take the call outside the try-catch block which solves
issue (1) and mark it as a tail call and as "must tail call" for
correctness which solves (2).

As bonuses, since we no longer need to differentiate handle types
returned from await_suspend() methods, nor do we need to keep them
in the coroutine frame, since they are ephemeral, we save entries in
the frame and reduce some code too.

gcc/cp/ChangeLog:

2020-03-24  Iain Sandoe  <iain@sandoe.co.uk>

	* coroutines.cc (coro_init_identifiers): Initialize an identifier
	for the cororoutine handle 'address' method name.
	(struct coro_aw_data): Add fields to cover the continuations.
	(co_await_expander): Determine the kind of await_suspend in use.
	If we have the case that returns a continuation handle, then save
	this and make the target for 'scope exit without cleanup' be the
	continuation resume label.
	(expand_co_awaits): Handle the continuation case.
	(struct suspend_point_info): Remove fields that kept the returned
	await_suspend handle type.
	(transform_await_expr): Remove code tracking continuation handles.
	(build_actor_fn): Add the continuation handle as an actor-function
	scope var.  Build the symmetric transfer continuation point.
	(register_await_info): Remove fields tracking continuation handles.
	(get_await_suspend_return_type): Remove.
	(register_awaits): Remove code tracking continuation handles.
	(morph_fn_to_coro): Remove code tracking continuation handles.

gcc/testsuite/ChangeLog:

2020-03-24  Iain Sandoe  <iain@sandoe.co.uk>

	* g++.dg/coroutines/torture/co-ret-09-bool-await-susp.C:
	* g++.dg/coroutines/torture/symmetric-transfer-00-basic.C: New test.

diff --git a/gcc/cp/coroutines.cc b/gcc/cp/coroutines.cc
index bcf3514bbcf..efe116b30e9 100644
--- a/gcc/cp/coroutines.cc
+++ b/gcc/cp/coroutines.cc
@@ -199,6 +199,7 @@ static GTY(()) tree coro_return_void_identifier;
 static GTY(()) tree coro_return_value_identifier;
 static GTY(()) tree coro_yield_value_identifier;
 static GTY(()) tree coro_resume_identifier;
+static GTY(()) tree coro_address_identifier;
 static GTY(()) tree coro_from_address_identifier;
 static GTY(()) tree coro_get_return_object_identifier;
 static GTY(()) tree coro_gro_on_allocation_fail_identifier;
@@ -226,6 +227,7 @@ coro_init_identifiers ()
   coro_return_value_identifier = get_identifier ("return_value");
   coro_yield_value_identifier = get_identifier ("yield_value");
   coro_resume_identifier = get_identifier ("resume");
+  coro_address_identifier = get_identifier ("address");
   coro_from_address_identifier = get_identifier ("from_address");
   coro_get_return_object_identifier = get_identifier ("get_return_object");
   coro_gro_on_allocation_fail_identifier =
@@ -1352,6 +1354,8 @@ struct coro_aw_data
   tree self_h;     /* This is a handle to the current coro (frame var).  */
   tree cleanup;    /* This is where to go once we complete local destroy.  */
   tree cororet;    /* This is where to go if we suspend.  */
+  tree corocont;   /* This is where to go if we continue.  */
+  tree conthand;   /* This is the handle for a continuation.  */
   unsigned index;  /* This is our current resume index.  */
 };
 
@@ -1440,7 +1444,6 @@ co_await_expander (tree *stmt, int * /*do_subtree*/, void *d)
 
   tree actor = data->actor_fn;
   location_t loc = EXPR_LOCATION (*stmt);
-  tree sv_handle = TREE_OPERAND (saved_co_await, 0);
   tree var = TREE_OPERAND (saved_co_await, 1);  /* frame slot. */
   tree expr = TREE_OPERAND (saved_co_await, 2); /* initializer.  */
   tree awaiter_calls = TREE_OPERAND (saved_co_await, 3);
@@ -1493,15 +1496,33 @@ co_await_expander (tree *stmt, int * /*do_subtree*/, void *d)
   r = coro_build_cvt_void_expr_stmt (r, loc);
   append_to_statement_list (r, &body_list);
 
+  /* Find out what we have to do with the awaiter's suspend method.
+     [expr.await]
+     (5.1) If the result of await-ready is false, the coroutine is considered
+	   suspended. Then:
+     (5.1.1) If the type of await-suspend is std::coroutine_handle<Z>,
+	     await-suspend.resume() is evaluated.
+     (5.1.2) if the type of await-suspend is bool, await-suspend is evaluated,
+	     and the coroutine is resumed if the result is false.
+     (5.1.3) Otherwise, await-suspend is evaluated.  */
+
   tree suspend = TREE_VEC_ELT (awaiter_calls, 1); /* await_suspend().  */
+  tree susp_type;
+  if (tree fndecl = cp_get_callee_fndecl_nofold (suspend))
+    susp_type = TREE_TYPE (TREE_TYPE (fndecl));
+  else
+    susp_type = TREE_TYPE (suspend);
 
-  if (sv_handle == NULL_TREE)
+  bool is_cont = false;
+  /* NOTE: final suspend can't resume; the "resume" label in that case
+     corresponds to implicit destruction.  */
+  if (VOID_TYPE_P (susp_type))
     {
-      /* void return, we just call it and hit the yield.  */
+      /* We just call await_suspend() and hit the yield.  */
       suspend = coro_build_cvt_void_expr_stmt (suspend, loc);
       append_to_statement_list (suspend, &body_list);
     }
-  else if (sv_handle == boolean_type_node)
+  else if (TREE_CODE (susp_type) == BOOLEAN_TYPE)
     {
       /* Boolean return, continue if the call returns false.  */
       suspend = build1_loc (loc, TRUTH_NOT_EXPR, boolean_type_node, suspend);
@@ -1514,24 +1535,17 @@ co_await_expander (tree *stmt, int * /*do_subtree*/, void *d)
     }
   else
     {
-      r = build2_loc (loc, INIT_EXPR, TREE_TYPE (sv_handle), sv_handle,
-		      suspend);
+      r = build1_loc (loc, CONVERT_EXPR, void_coro_handle_type, suspend);
+      r = build2_loc (loc, INIT_EXPR, void_coro_handle_type, data->conthand, r);
+      r = build1 (CONVERT_EXPR, void_type_node, r);
       append_to_statement_list (r, &body_list);
-      tree resume
-	= lookup_member (TREE_TYPE (sv_handle), coro_resume_identifier, 1, 0,
-			 tf_warning_or_error);
-      resume = build_new_method_call (sv_handle, resume, NULL, NULL_TREE,
-				      LOOKUP_NORMAL, NULL, tf_warning_or_error);
-      resume = coro_build_cvt_void_expr_stmt (resume, loc);
-      append_to_statement_list (resume, &body_list);
-    }
-
-  tree d_l
-    = build1 (ADDR_EXPR, build_reference_type (void_type_node), destroy_label);
-  tree r_l
-    = build1 (ADDR_EXPR, build_reference_type (void_type_node), resume_label);
-  tree susp
-    = build1 (ADDR_EXPR, build_reference_type (void_type_node), data->cororet);
+      is_cont = true;
+    }
+
+  tree d_l = build_address (destroy_label);
+  tree r_l = build_address (resume_label);
+  tree susp = build_address (data->cororet);
+  tree cont = build_address (data->corocont);
   tree final_susp = build_int_cst (integer_type_node, is_final ? 1 : 0);
 
   susp_idx = build_int_cst (integer_type_node, data->index);
@@ -1551,7 +1565,8 @@ co_await_expander (tree *stmt, int * /*do_subtree*/, void *d)
 			create_anon_label_with_ctx (loc, actor));
   add_stmt (r); /* case 0: */
   /* Implement the suspend, a scope exit without clean ups.  */
-  r = build_call_expr_internal_loc (loc, IFN_CO_SUSPN, void_type_node, 1, susp);
+  r = build_call_expr_internal_loc (loc, IFN_CO_SUSPN, void_type_node, 1,
+				    is_cont ? cont : susp);
   r = coro_build_cvt_void_expr_stmt (r, loc);
   add_stmt (r); /*   goto ret;  */
   r = build_case_label (build_int_cst (integer_type_node, 1), NULL_TREE,
@@ -1647,10 +1662,12 @@ co_await_expander (tree *stmt, int * /*do_subtree*/, void *d)
 
 static tree
 expand_co_awaits (tree fn, tree *fnbody, tree coro_fp, tree resume_idx,
-		  tree iarc_idx, tree cleanup, tree cororet, tree self_h)
+		  tree iarc_idx, tree cleanup, tree cororet, tree self_h,
+		  tree contlabel, tree conthandle)
 {
   coro_aw_data data
-    = {fn, coro_fp, resume_idx, iarc_idx, self_h, cleanup, cororet, 2};
+    = {fn, coro_fp, resume_idx, iarc_idx, self_h, cleanup,
+       cororet, contlabel, conthandle, 2};
   cp_walk_tree (fnbody, co_await_expander, &data, NULL);
   return *fnbody;
 }
@@ -1663,10 +1680,6 @@ struct suspend_point_info
   tree awaitable_type;
   /* coro frame field name.  */
   tree await_field_id;
-  /* suspend method return type.  */
-  tree suspend_type;
-  /* suspend handle field name, NULL_TREE if not needed.  */
-  tree susp_handle_id;
 };
 
 static hash_map<tree, suspend_point_info> *suspend_points;
@@ -1703,22 +1716,6 @@ transform_await_expr (tree await_expr, await_xform_data *xform)
 	  We need to replace the e_proxy in the awr_call.  */
 
   tree coro_frame_type = TREE_TYPE (xform->actor_frame);
-  tree ah = NULL_TREE;
-  if (si->susp_handle_id)
-    {
-      tree ah_m
-	= lookup_member (coro_frame_type, si->susp_handle_id,
-			 /*protect=*/1, /*want_type=*/0, tf_warning_or_error);
-      ah = build_class_member_access_expr (xform->actor_frame, ah_m, NULL_TREE,
-					   true, tf_warning_or_error);
-    }
-  else if (TREE_CODE (si->suspend_type) == BOOLEAN_TYPE)
-    ah = boolean_type_node;
-
-  /* Replace Op 0 with the frame slot for the temporary handle, if it's needed.
-     If there's no frame type to be stored we flag boolean_type for that case
-     and an empty pointer for void return.  */
-  TREE_OPERAND (await_expr, 0) = ah;
 
   /* If we have a frame var for the awaitable, get a reference to it.  */
   proxy_replace data;
@@ -1988,6 +1985,15 @@ build_actor_fn (location_t loc, tree coro_frame_type, tree actor, tree fnbody,
   tree top_block = make_node (BLOCK);
   BIND_EXPR_BLOCK (actor_bind) = top_block;
 
+  
+  tree continuation = build_lang_decl (VAR_DECL,
+				       get_identifier ("actor.continue"),
+				       void_coro_handle_type);
+  DECL_ARTIFICIAL (continuation) = 1;
+  DECL_IGNORED_P (continuation) = 1;
+  DECL_CONTEXT (continuation) = actor;
+  BIND_EXPR_VARS (actor_bind) = continuation;
+
   /* Update the block associated with the outer scope of the orig fn.  */
   tree first = expr_first (fnbody);
   if (first && TREE_CODE (first) == BIND_EXPR)
@@ -2012,6 +2018,12 @@ build_actor_fn (location_t loc, tree coro_frame_type, tree actor, tree fnbody,
     = create_named_label_with_ctx (loc, "actor.begin", actor);
   tree actor_frame = build1_loc (loc, INDIRECT_REF, coro_frame_type, actor_fp);
 
+  /* Declare the continuation handle.  */
+  tree ci = build_stmt (loc, DECL_EXPR, continuation);
+  ci = build1_loc (loc, CONVERT_EXPR, void_type_node, ci);
+  ci = maybe_cleanup_point_expr_void (ci);
+  add_stmt (ci);
+
   /* Re-write param references in the body, no code should be generated
      here.  */
   if (DECL_ARGUMENTS (orig))
@@ -2062,6 +2074,9 @@ build_actor_fn (location_t loc, tree coro_frame_type, tree actor, tree fnbody,
   tree ret_label
     = create_named_label_with_ctx (loc, "actor.suspend.ret", actor);
 
+  tree continue_label
+    = create_named_label_with_ctx (loc, "actor.continue.ret", actor);
+
   tree lsb_if = begin_if_stmt ();
   tree chkb0 = build2 (BIT_AND_EXPR, short_unsigned_type_node, rat,
 		       build_int_cst (short_unsigned_type_node, 1));
@@ -2368,6 +2383,35 @@ build_actor_fn (location_t loc, tree coro_frame_type, tree actor, tree fnbody,
   r = maybe_cleanup_point_expr_void (r);
   add_stmt (r);
 
+  /* This is the 'continuation' return point.  For such a case we have a coro
+     handle (from the await_suspend() call) and we want handle.resume() to
+     execute as a tailcall allowing arbitrary chaining of coroutines.  */
+  r = build_stmt (loc, LABEL_EXPR, continue_label);
+  add_stmt (r);
+
+  /* We want to force a tail-call even for O0/1, so this expands the resume
+     call into its underlying implementation.  */
+  tree addr = lookup_member (void_coro_handle_type, coro_address_identifier,
+			       1, 0, tf_warning_or_error);
+  addr = build_new_method_call (continuation, addr, NULL, NULL_TREE,
+				  LOOKUP_NORMAL, NULL, tf_warning_or_error);
+  tree resume = build_call_expr_loc
+    (loc, builtin_decl_explicit (BUILT_IN_CORO_RESUME), 1, addr);
+
+  /* Now we have the actual call, and we can mark it as a tail.  */
+  CALL_EXPR_TAILCALL (resume) = true;
+  /* ... and for optimisation levels 0..1, mark it as requiring a tail-call
+     for correctness.  It seems that doing this for optimisation levels that
+     normally perform tail-calling, confuses the ME (or it would be logical
+     to put this on unilaterally).  */
+  if (optimize < 2)
+    CALL_EXPR_MUST_TAIL_CALL (resume) = true;
+  resume = coro_build_cvt_void_expr_stmt (resume, loc);
+  add_stmt (resume);
+
+  r = build_stmt (loc, RETURN_EXPR, NULL);
+  add_stmt (r);
+
   /* We need the resume index to work with.  */
   tree res_idx_m
     = lookup_member (coro_frame_type, resume_idx_name,
@@ -2387,7 +2431,8 @@ build_actor_fn (location_t loc, tree coro_frame_type, tree actor, tree fnbody,
   /* We've now rewritten the tree and added the initial and final
      co_awaits.  Now pass over the tree and expand the co_awaits.  */
   actor_body = expand_co_awaits (actor, &actor_body, actor_fp, res_idx,
-				 iarc_idx, del_promise_label, ret_label, ash);
+				 iarc_idx, del_promise_label, ret_label, ash,
+				 continue_label, continuation);
 
   actor_body = pop_stmt_list (actor_body);
   BIND_EXPR_BODY (actor_bind) = actor_body;
@@ -2523,8 +2568,7 @@ build_init_or_final_await (location_t loc, bool is_final)
    function.  */
 
 static bool
-register_await_info (tree await_expr, tree aw_type, tree aw_nam, tree susp_type,
-		     tree susp_handle_nam)
+register_await_info (tree await_expr, tree aw_type, tree aw_nam)
 {
   bool seen;
   suspend_point_info &s
@@ -2537,8 +2581,6 @@ register_await_info (tree await_expr, tree aw_type, tree aw_nam, tree susp_type,
     }
   s.awaitable_type = aw_type;
   s.await_field_id = aw_nam;
-  s.suspend_type = susp_type;
-  s.susp_handle_id = susp_handle_nam;
   return true;
 }
 
@@ -2568,26 +2610,6 @@ struct susp_frame_data
   bool captures_temporary;
 };
 
-/* Helper to return the type of an awaiter's await_suspend() method.
-   We start with the result of the build method call, which will be either
-   a call expression (void, bool) or a target expressions (handle).  */
-
-static tree
-get_await_suspend_return_type (tree aw_expr)
-{
-  tree susp_fn = TREE_VEC_ELT (TREE_OPERAND (aw_expr, 3), 1);
-  if (TREE_CODE (susp_fn) == CALL_EXPR)
-    {
-      susp_fn = CALL_EXPR_FN (susp_fn);
-      if (TREE_CODE (susp_fn) == ADDR_EXPR)
-	susp_fn = TREE_OPERAND (susp_fn, 0);
-      return TREE_TYPE (TREE_TYPE (susp_fn));
-    }
-  else if (TREE_CODE (susp_fn) == TARGET_EXPR)
-    return TREE_TYPE (susp_fn);
-  return TREE_TYPE (susp_fn);
-}
-
 /* Walk the sub-tree looking for call expressions that both capture
    references and have compiler-temporaries as parms.  */
 
@@ -2749,31 +2771,7 @@ register_awaits (tree *stmt, int *do_subtree ATTRIBUTE_UNUSED, void *d)
       free (nam);
     }
 
-  /* Find out what we have to do with the awaiter's suspend method (this
-     determines if we need somewhere to stash the suspend method's handle).
-     Cache the result of this in the suspend point info.
-     [expr.await]
-     (5.1) If the result of await-ready is false, the coroutine is considered
-	   suspended. Then:
-     (5.1.1) If the type of await-suspend is std::coroutine_handle<Z>,
-	     await-suspend.resume() is evaluated.
-     (5.1.2) if the type of await-suspend is bool, await-suspend is evaluated,
-	     and the coroutine is resumed if the result is false.
-     (5.1.3) Otherwise, await-suspend is evaluated.  */
-
-  tree susp_typ = get_await_suspend_return_type (aw_expr);
-  tree handle_field_nam;
-  if (VOID_TYPE_P (susp_typ) || TREE_CODE (susp_typ) == BOOLEAN_TYPE)
-    handle_field_nam = NULL_TREE; /* no handle is needed.  */
-  else
-    {
-      char *nam = xasprintf ("__aw_h.%u", data->count);
-      handle_field_nam
-	= coro_make_frame_entry (data->field_list, nam, susp_typ, aw_loc);
-      free (nam);
-    }
-  register_await_info (aw_expr, aw_field_type, aw_field_nam, susp_typ,
-		       handle_field_nam);
+  register_await_info (aw_expr, aw_field_type, aw_field_nam);
 
   data->count++; /* Each await suspend context is unique.  */
 
@@ -3093,9 +3091,7 @@ act_des_fn (tree orig, tree fn_type, tree coro_frame_ptr, const char* name)
   handle_type self_handle;
   (maybe) parameter copies.
   coro1::suspend_never_prt __is;
-  (maybe) handle_type i_hand;
   coro1::suspend_always_prt __fs;
-  (maybe) handle_type f_hand;
   (maybe) local variables saved
   (maybe) trailing space.
  };  */
@@ -3297,19 +3293,7 @@ morph_fn_to_coro (tree orig, tree *resumer, tree *destroyer)
   tree init_susp_name = coro_make_frame_entry (&field_list, "__aw_s.is",
 					       initial_suspend_type, fn_start);
 
-  /* Figure out if we need a saved handle from the awaiter type.  */
-  tree ret_typ = get_await_suspend_return_type (initial_await);
-  tree init_hand_name;
-  if (VOID_TYPE_P (ret_typ) || TREE_CODE (ret_typ) == BOOLEAN_TYPE)
-    init_hand_name = NULL_TREE; /* no handle is needed.  */
-  else
-    {
-      init_hand_name
-	= coro_make_frame_entry (&field_list, "__ih", ret_typ, fn_start);
-    }
-
-  register_await_info (initial_await, initial_suspend_type, init_susp_name,
-		       ret_typ, init_hand_name);
+  register_await_info (initial_await, initial_suspend_type, init_susp_name);
 
   /* Now insert the data for any body await points, at this time we also need
      to promote any temporaries that are captured by reference (to regular
@@ -3324,18 +3308,7 @@ morph_fn_to_coro (tree orig, tree *resumer, tree *destroyer)
   tree fin_susp_name = coro_make_frame_entry (&field_list, "__aw_s.fs",
 					      final_suspend_type, fn_start);
 
-  ret_typ = get_await_suspend_return_type (final_await);
-  tree fin_hand_name;
-  if (VOID_TYPE_P (ret_typ) || TREE_CODE (ret_typ) == BOOLEAN_TYPE)
-    fin_hand_name = NULL_TREE; /* no handle is needed.  */
-  else
-    {
-      fin_hand_name
-	= coro_make_frame_entry (&field_list, "__fh", ret_typ, fn_start);
-    }
-
-  register_await_info (final_await, final_suspend_type, fin_susp_name,
-		       void_type_node, fin_hand_name);
+  register_await_info (final_await, final_suspend_type, fin_susp_name);
 
   /* 4. Now make space for local vars, this is conservative again, and we
      would expect to delete unused entries later.  */
diff --git a/gcc/testsuite/g++.dg/coroutines/torture/co-ret-09-bool-await-susp.C b/gcc/testsuite/g++.dg/coroutines/torture/co-ret-09-bool-await-susp.C
index 91f3f14cc09..d14c358c4c5 100644
--- a/gcc/testsuite/g++.dg/coroutines/torture/co-ret-09-bool-await-susp.C
+++ b/gcc/testsuite/g++.dg/coroutines/torture/co-ret-09-bool-await-susp.C
@@ -4,6 +4,9 @@
 
 #include "../coro.h"
 
+int coro1_dtor_ran = 0;
+int promise_dtor_ran = 0;
+
 struct coro1 {
   struct promise_type;
   using handle_type = coro::coroutine_handle<coro1::promise_type>;
@@ -24,10 +27,13 @@ struct coro1 {
 	PRINT("coro1 op=  ");
 	return *this;
   }
+
   ~coro1() {
         PRINT("Destroyed coro1");
-        if ( handle )
-          handle.destroy();
+        coro1_dtor_ran++;
+        // The coro handle will point to an invalid frame by this stage,
+        // the coroutine will already have self-destroyed the frame and
+        // promise.
   }
 
   struct suspend_never_prt {
@@ -51,7 +57,10 @@ struct coro1 {
 
   struct promise_type {
   promise_type() {  PRINT ("Created Promise"); }
-  ~promise_type() { PRINT ("Destroyed Promise"); }
+  ~promise_type() {
+     PRINT ("Destroyed Promise"); 
+     promise_dtor_ran++;
+   }
 
   coro1 get_return_object () {
     PRINT ("get_return_object: from handle from promise");
@@ -73,7 +82,7 @@ struct coro1 {
 };
 
 struct coro1
-f () noexcept
+my_coro () noexcept
 {
   PRINT ("coro1: about to return");
   co_return;
@@ -81,15 +90,26 @@ f () noexcept
 
 int main ()
 {
-  PRINT ("main: create coro1");
-  struct coro1 x = f ();
-  auto p = x.handle.promise ();
-  auto aw = p.initial_suspend();
-  auto f = aw.await_suspend(coro::coroutine_handle<coro1::promise_type>::from_address ((void *)&x));
-  PRINT ("main: got coro1 - should be done");
-  if (!x.handle.done())
+  { // scope so that we can examine the coro dtor marker.
+    PRINT ("main: creating coro");
+
+    // This should just run through to completion/destruction.
+    // In both the initial and final await expressions, the await_suspend()
+    // method will return 'false' and prevent the suspension.
+    struct coro1 x = my_coro ();
+
+    PRINT ("main: the coro frame should be already destroyed");
+    // We will take the running of the promise DTOR as evidence that the
+    // frame was destroyed as expected.
+    if (promise_dtor_ran != 1)
+      {
+	PRINT ("main: apparently we didn't destroy the frame");
+	abort ();
+      }
+  }
+  if (coro1_dtor_ran != 1 || promise_dtor_ran != 1)
     {
-      PRINT ("main: apparently was not done...");
+      PRINT ("main: bad DTOR counts");
       abort ();
     }
   PRINT ("main: returning");
diff --git a/gcc/testsuite/g++.dg/coroutines/torture/symmetric-transfer-00-basic.C b/gcc/testsuite/g++.dg/coroutines/torture/symmetric-transfer-00-basic.C
new file mode 100644
index 00000000000..c445fc55a2c
--- /dev/null
+++ b/gcc/testsuite/g++.dg/coroutines/torture/symmetric-transfer-00-basic.C
@@ -0,0 +1,110 @@
+//  { dg-do run }
+
+#if __has_include(<coroutine>)
+
+#include <coroutine>
+namespace coro = std;
+
+#elif __has_include(<experimental/coroutine>)
+
+#include <experimental/coroutine>
+namespace coro = std::experimental;
+
+#endif
+
+#if __clang__
+#  include <utility>
+#endif
+
+#include <chrono>
+#include <thread>
+ 
+template <typename T> 
+struct Loopy {
+  struct promise_type;
+  using handle_type = coro::coroutine_handle<Loopy::promise_type>;
+  handle_type handle;
+
+  struct promise_type {
+    // Cause the Loopy object to be created from the handle.
+    auto get_return_object () {
+     return handle_type::from_promise (*this);
+    }
+
+    coro::suspend_never yield_value(T value) {
+      //fprintf (stderr, "%s yields %d\n", me, value);
+      current_value = value;
+      return coro::suspend_never{};
+    }
+
+    coro::suspend_always initial_suspend() { return {}; }
+    coro::suspend_always final_suspend() { return {}; }
+
+    void unhandled_exception() { /*std::terminate();*/  }
+    void return_void() {}
+
+    void set_peer (handle_type  _p) { peer = _p; }
+    void set_me (const char *m) { me = m; }
+
+    T get_value () {return current_value;}
+    T current_value;
+    handle_type peer;
+    const char *me;
+  };
+
+  Loopy () : handle(0) {}
+  Loopy (handle_type _handle) : handle(_handle) {}
+  Loopy (const Loopy &) = delete;
+  Loopy (Loopy &&s) : handle(s.handle) { s.handle = nullptr; }
+  ~Loopy() { if ( handle ) handle.destroy(); }
+
+  struct loopy_awaiter {
+    handle_type p;
+    bool await_ready () { return false; }
+    /* continue the peer. */
+    handle_type await_suspend (handle_type h) {
+      p = h.promise().peer;
+      return p;
+    }
+    T await_resume () { return p.promise().get_value (); }
+  };
+};
+
+Loopy<int>
+pingpong (const char *id)
+{
+  int v = 0;
+  Loopy<int>::loopy_awaiter aw{};
+  for (;;)
+    {
+      co_yield (v+1);
+      //std::this_thread::sleep_for(std::chrono::milliseconds(500));
+      if (v > 10'000'000)
+        break;
+      v = co_await aw;
+      //fprintf (stderr, "%s = %d\n", id, v);
+    }
+ fprintf (stderr, "%s = %d\n", id, v);
+}
+
+int main ()
+{
+  // identify for fun.. 
+  const char *ping_id = "ping";
+  const char *pong_id = "pong";
+  auto ping = pingpong (ping_id); // created suspended.
+  auto pong = pingpong (pong_id);
+
+  // point them at each other...
+  ping.handle.promise().set_peer (pong.handle);
+  pong.handle.promise().set_peer (ping.handle);
+
+  ping.handle.promise().set_me (ping_id);
+  pong.handle.promise().set_me (pong_id);
+
+  // and start them ...
+  ping.handle.resume ();
+  pong.handle.resume ();
+  
+}
+
Nathan Sidwell March 24, 2020, 6:20 p.m. UTC | #3
On 3/24/20 2:08 PM, Iain Sandoe wrote:
> Hi Nathan,
> Thanks for the review,
> comments embedded and a new version attached.
> 
> @David, you added the CALL_EXPR_MUST_TAIL_CALL which I’ve made use of
> here (but with an observation).  Perhaps you would be able to comment on whether
> I’ve (ab-)used it correctly?
> 
> OK for master now?
> thanks
> Iain
> 
> Nathan Sidwell <nathan@acm.org> wrote:
> 
>> On 3/20/20 11:40 AM, Iain Sandoe via Gcc-patches wrote:
>>
> 
>>>     tree suspend = TREE_VEC_ELT (awaiter_calls, 1); /* await_suspend().  */
>>> +  tree susp_type;
>>> +  if (TREE_CODE (suspend) == CALL_EXPR)
>>> +    {
>>> +      susp_type = CALL_EXPR_FN (suspend);
>>> +      if (TREE_CODE (susp_type) == ADDR_EXPR)
>>> +	susp_type = TREE_OPERAND (susp_type, 0);
>>> +      susp_type = TREE_TYPE (TREE_TYPE (susp_type));
>>> +    }
>>> +  else
>>> +    susp_type = TREE_TYPE (suspend);
>>
>> I think:
>> if (tree fndec = get_callee_fndecl_nofold (suspend))
>>     susp_type = TREE_TYPE (TREE_TYPE (fndecl));
>> else
>>     susp_type = TREE_TYPE (suspend);
>> would do?  But how can TREE_TYPE (suspend) be different from the return type of the function decl?  It seems funky that the behaviour could depend on the form of the suspend.  What am I missing?
> 
> Your proposed change is much neater, thanks.
> The reason for the alternate is that the expression can also be a target_expr.

That's not clarifying it for me.  Why is a CALL_EXPR's type different to 
they return type of the function being called?  And why in that case do 
you want the former, not the latter?

nathan
Iain Sandoe March 24, 2020, 7:52 p.m. UTC | #4
Nathan Sidwell <nathan@acm.org> wrote:

> On 3/24/20 2:08 PM, Iain Sandoe wrote:
>> Hi Nathan,
>> Thanks for the review,
>> comments embedded and a new version attached.
>> @David, you added the CALL_EXPR_MUST_TAIL_CALL which I’ve made use of
>> here (but with an observation).  Perhaps you would be able to comment on  
>> whether
>> I’ve (ab-)used it correctly?
>> OK for master now?
>> thanks
>> Iain
>> Nathan Sidwell <nathan@acm.org> wrote:
>>> On 3/20/20 11:40 AM, Iain Sandoe via Gcc-patches wrote:
>>>>    tree suspend = TREE_VEC_ELT (awaiter_calls, 1); /* await_suspend().  */
>>>> +  tree susp_type;
>>>> +  if (TREE_CODE (suspend) == CALL_EXPR)
>>>> +    {
>>>> +      susp_type = CALL_EXPR_FN (suspend);
>>>> +      if (TREE_CODE (susp_type) == ADDR_EXPR)
>>>> +	susp_type = TREE_OPERAND (susp_type, 0);
>>>> +      susp_type = TREE_TYPE (TREE_TYPE (susp_type));
>>>> +    }
>>>> +  else
>>>> +    susp_type = TREE_TYPE (suspend);
>>>
>>> I think:
>>> if (tree fndec = get_callee_fndecl_nofold (suspend))
>>>    susp_type = TREE_TYPE (TREE_TYPE (fndecl));
>>> else
>>>    susp_type = TREE_TYPE (suspend);
>>> would do?  But how can TREE_TYPE (suspend) be different from the return  
>>> type of the function decl?  It seems funky that the behaviour could  
>>> depend on the form of the suspend. What am I missing?
>> Your proposed change is much neater, thanks.
>> The reason for the alternate is that the expression can also be a  
>> target_expr.
>
> That's not clarifying it for me.  Why is a CALL_EXPR's type different to  
> they return type of the function being called?  And why in that case do  
> you want the former, not the latter?

it isn’t (and therefore I don’t), and the type of the call expr is fine viz:

tree susp_type = TREE_TYPE (suspend);

(changed locally)
Iain
Nathan Sidwell March 25, 2020, 12:46 p.m. UTC | #5
On 3/24/20 2:08 PM, Iain Sandoe wrote:

>     tree suspend = TREE_VEC_ELT (awaiter_calls, 1); /* await_suspend().  */
> +  tree susp_type;
> +  if (tree fndecl = cp_get_callee_fndecl_nofold (suspend))
> +    susp_type = TREE_TYPE (TREE_TYPE (fndecl));
> +  else
> +    susp_type = TREE_TYPE (suspend);

Why, when there's a call of a named function, is it that TREE_TYPE 
(suspend) is insufficient? You mentioned TARGET_EXPR, but why is their 
type differenty?

> @@ -2012,6 +2018,12 @@ build_actor_fn (location_t loc, tree coro_frame_type, tree actor, tree fnbody,
>       = create_named_label_with_ctx (loc, "actor.begin", actor);
>     tree actor_frame = build1_loc (loc, INDIRECT_REF, coro_frame_type, actor_fp);
>   
> +  /* Declare the continuation handle.  */
> +  tree ci = build_stmt (loc, DECL_EXPR, continuation);
> +  ci = build1_loc (loc, CONVERT_EXPR, void_type_node, ci);
> +  ci = maybe_cleanup_point_expr_void (ci);
> +  add_stmt (ci);

you don't need to wrap in a CONVERT_EXPR, can you use add_decl_expr ?
> @@ -2368,6 +2383,35 @@ build_actor_fn (location_t loc, tree coro_frame_type, tree actor, tree fnbody,

> +
> +  /* Now we have the actual call, and we can mark it as a tail.  */
> +  CALL_EXPR_TAILCALL (resume) = true;
> +  /* ... and for optimisation levels 0..1, mark it as requiring a tail-call
> +     for correctness.  It seems that doing this for optimisation levels that
> +     normally perform tail-calling, confuses the ME (or it would be logical
> +     to put this on unilaterally).  */
> +  if (optimize < 2)
> +    CALL_EXPR_MUST_TAIL_CALL (resume) = true;
> +  resume = coro_build_cvt_void_expr_stmt (resume, loc);

I'd be happier with
    gcc_checking_assert (maybe_cleanup_point_expr_void (resume) == resume);
  here.  I see we can wrap RETURN_EXPRs with cleanups, so perhaps adding 
is ok anyway?

nathan
diff mbox series

Patch

diff --git a/gcc/cp/coroutines.cc b/gcc/cp/coroutines.cc
index abd4cac1c82..49e03f2ccea 100644
--- a/gcc/cp/coroutines.cc
+++ b/gcc/cp/coroutines.cc
@@ -199,6 +199,7 @@  static GTY(()) tree coro_return_void_identifier;
 static GTY(()) tree coro_return_value_identifier;
 static GTY(()) tree coro_yield_value_identifier;
 static GTY(()) tree coro_resume_identifier;
+static GTY(()) tree coro_address_identifier;
 static GTY(()) tree coro_from_address_identifier;
 static GTY(()) tree coro_get_return_object_identifier;
 static GTY(()) tree coro_gro_on_allocation_fail_identifier;
@@ -226,6 +227,7 @@  coro_init_identifiers ()
   coro_return_value_identifier = get_identifier ("return_value");
   coro_yield_value_identifier = get_identifier ("yield_value");
   coro_resume_identifier = get_identifier ("resume");
+  coro_address_identifier = get_identifier ("address");
   coro_from_address_identifier = get_identifier ("from_address");
   coro_get_return_object_identifier = get_identifier ("get_return_object");
   coro_gro_on_allocation_fail_identifier =
@@ -1352,6 +1354,8 @@  struct coro_aw_data
   tree self_h;     /* This is a handle to the current coro (frame var).  */
   tree cleanup;    /* This is where to go once we complete local destroy.  */
   tree cororet;    /* This is where to go if we suspend.  */
+  tree corocont;   /* This is where to go if we continue.  */
+  tree conthand;   /* This is the handle for a continuation.  */
   unsigned index;  /* This is our current resume index.  */
 };
 
@@ -1440,7 +1444,6 @@  co_await_expander (tree *stmt, int * /*do_subtree*/, void *d)
 
   tree actor = data->actor_fn;
   location_t loc = EXPR_LOCATION (*stmt);
-  tree sv_handle = TREE_OPERAND (saved_co_await, 0);
   tree var = TREE_OPERAND (saved_co_await, 1);  /* frame slot. */
   tree expr = TREE_OPERAND (saved_co_await, 2); /* initializer.  */
   tree awaiter_calls = TREE_OPERAND (saved_co_await, 3);
@@ -1493,15 +1496,38 @@  co_await_expander (tree *stmt, int * /*do_subtree*/, void *d)
   r = coro_build_cvt_void_expr_stmt (r, loc);
   append_to_statement_list (r, &body_list);
 
+  /* Find out what we have to do with the awaiter's suspend method.
+     [expr.await]
+     (5.1) If the result of await-ready is false, the coroutine is considered
+	   suspended. Then:
+     (5.1.1) If the type of await-suspend is std::coroutine_handle<Z>,
+	     await-suspend.resume() is evaluated.
+     (5.1.2) if the type of await-suspend is bool, await-suspend is evaluated,
+	     and the coroutine is resumed if the result is false.
+     (5.1.3) Otherwise, await-suspend is evaluated.  */
+
   tree suspend = TREE_VEC_ELT (awaiter_calls, 1); /* await_suspend().  */
+  tree susp_type;
+  if (TREE_CODE (suspend) == CALL_EXPR)
+    {
+      susp_type = CALL_EXPR_FN (suspend);
+      if (TREE_CODE (susp_type) == ADDR_EXPR)
+	susp_type = TREE_OPERAND (susp_type, 0);
+      susp_type = TREE_TYPE (TREE_TYPE (susp_type));
+    }
+  else
+    susp_type = TREE_TYPE (suspend);
+  bool is_cont = false;
 
-  if (sv_handle == NULL_TREE)
+  /* Final suspend can't resume; the "resume" label corresponds in that
+     case to implicit destruction.  */
+  if (VOID_TYPE_P (susp_type))
     {
-      /* void return, we just call it and hit the yield.  */
+      /* We just call await_suspend() and hit the yield.  */
       suspend = coro_build_cvt_void_expr_stmt (suspend, loc);
       append_to_statement_list (suspend, &body_list);
     }
-  else if (sv_handle == boolean_type_node)
+  else if (TREE_CODE (susp_type) == BOOLEAN_TYPE)
     {
       /* Boolean return, continue if the call returns false.  */
       suspend = build1_loc (loc, TRUTH_NOT_EXPR, boolean_type_node, suspend);
@@ -1514,16 +1540,11 @@  co_await_expander (tree *stmt, int * /*do_subtree*/, void *d)
     }
   else
     {
-      r = build2_loc (loc, INIT_EXPR, TREE_TYPE (sv_handle), sv_handle,
-		      suspend);
+      r = build1_loc (loc, CONVERT_EXPR, void_coro_handle_type, suspend);
+      r = build2_loc (loc, INIT_EXPR, void_coro_handle_type, data->conthand, r);
+      r = build1 (CONVERT_EXPR, void_type_node, r);
       append_to_statement_list (r, &body_list);
-      tree resume
-	= lookup_member (TREE_TYPE (sv_handle), coro_resume_identifier, 1, 0,
-			 tf_warning_or_error);
-      resume = build_new_method_call (sv_handle, resume, NULL, NULL_TREE,
-				      LOOKUP_NORMAL, NULL, tf_warning_or_error);
-      resume = coro_build_cvt_void_expr_stmt (resume, loc);
-      append_to_statement_list (resume, &body_list);
+      is_cont = true;
     }
 
   tree d_l
@@ -1532,6 +1553,8 @@  co_await_expander (tree *stmt, int * /*do_subtree*/, void *d)
     = build1 (ADDR_EXPR, build_reference_type (void_type_node), resume_label);
   tree susp
     = build1 (ADDR_EXPR, build_reference_type (void_type_node), data->cororet);
+  tree cont
+    = build1 (ADDR_EXPR, build_reference_type (void_type_node), data->corocont);
   tree final_susp = build_int_cst (integer_type_node, is_final ? 1 : 0);
 
   susp_idx = build_int_cst (integer_type_node, data->index);
@@ -1551,7 +1574,8 @@  co_await_expander (tree *stmt, int * /*do_subtree*/, void *d)
 			create_anon_label_with_ctx (loc, actor));
   add_stmt (r); /* case 0: */
   /* Implement the suspend, a scope exit without clean ups.  */
-  r = build_call_expr_internal_loc (loc, IFN_CO_SUSPN, void_type_node, 1, susp);
+  r = build_call_expr_internal_loc (loc, IFN_CO_SUSPN, void_type_node, 1,
+				    is_cont ? cont : susp);
   r = coro_build_cvt_void_expr_stmt (r, loc);
   add_stmt (r); /*   goto ret;  */
   r = build_case_label (build_int_cst (integer_type_node, 1), NULL_TREE,
@@ -1647,10 +1671,12 @@  co_await_expander (tree *stmt, int * /*do_subtree*/, void *d)
 
 static tree
 expand_co_awaits (tree fn, tree *fnbody, tree coro_fp, tree resume_idx,
-		  tree iarc_idx, tree cleanup, tree cororet, tree self_h)
+		  tree iarc_idx, tree cleanup, tree cororet, tree self_h,
+		  tree contlabel, tree conthandle)
 {
   coro_aw_data data
-    = {fn, coro_fp, resume_idx, iarc_idx, self_h, cleanup, cororet, 2};
+    = {fn, coro_fp, resume_idx, iarc_idx, self_h, cleanup,
+       cororet, contlabel, conthandle, 2};
   cp_walk_tree (fnbody, co_await_expander, &data, NULL);
   return *fnbody;
 }
@@ -1663,10 +1689,6 @@  struct suspend_point_info
   tree awaitable_type;
   /* coro frame field name.  */
   tree await_field_id;
-  /* suspend method return type.  */
-  tree suspend_type;
-  /* suspend handle field name, NULL_TREE if not needed.  */
-  tree susp_handle_id;
 };
 
 static hash_map<tree, suspend_point_info> *suspend_points;
@@ -1703,22 +1725,6 @@  transform_await_expr (tree await_expr, await_xform_data *xform)
 	  We need to replace the e_proxy in the awr_call.  */
 
   tree coro_frame_type = TREE_TYPE (xform->actor_frame);
-  tree ah = NULL_TREE;
-  if (si->susp_handle_id)
-    {
-      tree ah_m
-	= lookup_member (coro_frame_type, si->susp_handle_id,
-			 /*protect=*/1, /*want_type=*/0, tf_warning_or_error);
-      ah = build_class_member_access_expr (xform->actor_frame, ah_m, NULL_TREE,
-					   true, tf_warning_or_error);
-    }
-  else if (TREE_CODE (si->suspend_type) == BOOLEAN_TYPE)
-    ah = boolean_type_node;
-
-  /* Replace Op 0 with the frame slot for the temporary handle, if it's needed.
-     If there's no frame type to be stored we flag boolean_type for that case
-     and an empty pointer for void return.  */
-  TREE_OPERAND (await_expr, 0) = ah;
 
   /* If we have a frame var for the awaitable, get a reference to it.  */
   proxy_replace data;
@@ -1988,6 +1994,15 @@  build_actor_fn (location_t loc, tree coro_frame_type, tree actor, tree fnbody,
   tree top_block = make_node (BLOCK);
   BIND_EXPR_BLOCK (actor_bind) = top_block;
 
+  
+  tree continuation = build_lang_decl (VAR_DECL,
+				       get_identifier ("actor.continue"),
+				       void_coro_handle_type);
+  DECL_ARTIFICIAL (continuation) = 1;
+  DECL_IGNORED_P (continuation) = 1;
+  DECL_CONTEXT (continuation) = actor;
+  BIND_EXPR_VARS (actor_bind) = continuation;
+
   /* Update the block associated with the outer scope of the orig fn.  */
   tree first = expr_first (fnbody);
   if (first && TREE_CODE (first) == BIND_EXPR)
@@ -2012,6 +2027,15 @@  build_actor_fn (location_t loc, tree coro_frame_type, tree actor, tree fnbody,
     = create_named_label_with_ctx (loc, "actor.begin", actor);
   tree actor_frame = build1_loc (loc, INDIRECT_REF, coro_frame_type, actor_fp);
 
+  /* Init the continuation with a NULL ptr.  */
+  tree zeroinit = build1 (CONVERT_EXPR, void_coro_handle_type,
+			  integer_zero_node);
+  tree ci = build2 (INIT_EXPR, void_coro_handle_type, continuation, zeroinit);
+  ci = build_stmt (loc, DECL_EXPR, continuation);
+  ci = build1_loc (loc, CONVERT_EXPR, void_type_node, ci);
+  ci = maybe_cleanup_point_expr_void (ci);
+  add_stmt (ci);
+
   /* Re-write param references in the body, no code should be generated
      here.  */
   if (DECL_ARGUMENTS (orig))
@@ -2062,6 +2086,9 @@  build_actor_fn (location_t loc, tree coro_frame_type, tree actor, tree fnbody,
   tree ret_label
     = create_named_label_with_ctx (loc, "actor.suspend.ret", actor);
 
+  tree continue_label
+    = create_named_label_with_ctx (loc, "actor.continue.ret", actor);
+
   tree lsb_if = begin_if_stmt ();
   tree chkb0 = build2 (BIT_AND_EXPR, short_unsigned_type_node, rat,
 		       build_int_cst (short_unsigned_type_node, 1));
@@ -2368,6 +2395,36 @@  build_actor_fn (location_t loc, tree coro_frame_type, tree actor, tree fnbody,
   r = maybe_cleanup_point_expr_void (r);
   add_stmt (r);
 
+  /* This is the 'continuation' return point.  For such a case we have a coro
+     handle (from the await_suspend() call) and we want handle.resume() to
+     execute as a tailcall allowing arbitrary chaining of coroutines.  */
+  r = build_stmt (loc, LABEL_EXPR, continue_label);
+  add_stmt (r);
+
+  /* We want to force a tail-call even for O0/1, so this expands the resume
+     call into its underlying implementation.  */
+  tree addr = lookup_member (void_coro_handle_type, coro_address_identifier,
+			       1, 0, tf_warning_or_error);
+  addr = build_new_method_call (continuation, addr, NULL, NULL_TREE,
+				  LOOKUP_NORMAL, NULL, tf_warning_or_error);
+  tree resume = build_call_expr_loc
+    (loc, builtin_decl_explicit (BUILT_IN_CORO_RESUME), 1, addr);
+
+  /* Now we have the actual call, and we can mark it as a tail.  */
+  CALL_EXPR_TAILCALL (resume) = true;
+  /* ... and for optimisation levels 0..1, mark it as requiring a tail-call
+     for correctness.  It seems that doing this for optimisation levels that
+     normally perform tail-calling, confuses the ME (or it would be logical
+     to put this on unilaterally).  */
+  if (optimize < 2)
+    CALL_EXPR_MUST_TAIL_CALL (resume) = true;
+  resume = coro_build_cvt_void_expr_stmt (resume, loc);
+  add_stmt (resume);
+
+  r = build_stmt (loc, RETURN_EXPR, NULL);
+  r = maybe_cleanup_point_expr_void (r);
+  add_stmt (r);
+
   /* We need the resume index to work with.  */
   tree res_idx_m
     = lookup_member (coro_frame_type, resume_idx_name,
@@ -2387,7 +2444,8 @@  build_actor_fn (location_t loc, tree coro_frame_type, tree actor, tree fnbody,
   /* We've now rewritten the tree and added the initial and final
      co_awaits.  Now pass over the tree and expand the co_awaits.  */
   actor_body = expand_co_awaits (actor, &actor_body, actor_fp, res_idx,
-				 iarc_idx, del_promise_label, ret_label, ash);
+				 iarc_idx, del_promise_label, ret_label, ash,
+				 continue_label, continuation);
 
   actor_body = pop_stmt_list (actor_body);
   BIND_EXPR_BODY (actor_bind) = actor_body;
@@ -2523,8 +2581,7 @@  build_init_or_final_await (location_t loc, bool is_final)
    function.  */
 
 static bool
-register_await_info (tree await_expr, tree aw_type, tree aw_nam, tree susp_type,
-		     tree susp_handle_nam)
+register_await_info (tree await_expr, tree aw_type, tree aw_nam)
 {
   bool seen;
   suspend_point_info &s
@@ -2537,8 +2594,6 @@  register_await_info (tree await_expr, tree aw_type, tree aw_nam, tree susp_type,
     }
   s.awaitable_type = aw_type;
   s.await_field_id = aw_nam;
-  s.suspend_type = susp_type;
-  s.susp_handle_id = susp_handle_nam;
   return true;
 }
 
@@ -2568,26 +2623,6 @@  struct susp_frame_data
   bool captures_temporary;
 };
 
-/* Helper to return the type of an awaiter's await_suspend() method.
-   We start with the result of the build method call, which will be either
-   a call expression (void, bool) or a target expressions (handle).  */
-
-static tree
-get_await_suspend_return_type (tree aw_expr)
-{
-  tree susp_fn = TREE_VEC_ELT (TREE_OPERAND (aw_expr, 3), 1);
-  if (TREE_CODE (susp_fn) == CALL_EXPR)
-    {
-      susp_fn = CALL_EXPR_FN (susp_fn);
-      if (TREE_CODE (susp_fn) == ADDR_EXPR)
-	susp_fn = TREE_OPERAND (susp_fn, 0);
-      return TREE_TYPE (TREE_TYPE (susp_fn));
-    }
-  else if (TREE_CODE (susp_fn) == TARGET_EXPR)
-    return TREE_TYPE (susp_fn);
-  return TREE_TYPE (susp_fn);
-}
-
 /* Walk the sub-tree looking for call expressions that both capture
    references and have compiler-temporaries as parms.  */
 
@@ -2749,31 +2784,7 @@  register_awaits (tree *stmt, int *do_subtree ATTRIBUTE_UNUSED, void *d)
       free (nam);
     }
 
-  /* Find out what we have to do with the awaiter's suspend method (this
-     determines if we need somewhere to stash the suspend method's handle).
-     Cache the result of this in the suspend point info.
-     [expr.await]
-     (5.1) If the result of await-ready is false, the coroutine is considered
-	   suspended. Then:
-     (5.1.1) If the type of await-suspend is std::coroutine_handle<Z>,
-	     await-suspend.resume() is evaluated.
-     (5.1.2) if the type of await-suspend is bool, await-suspend is evaluated,
-	     and the coroutine is resumed if the result is false.
-     (5.1.3) Otherwise, await-suspend is evaluated.  */
-
-  tree susp_typ = get_await_suspend_return_type (aw_expr);
-  tree handle_field_nam;
-  if (VOID_TYPE_P (susp_typ) || TREE_CODE (susp_typ) == BOOLEAN_TYPE)
-    handle_field_nam = NULL_TREE; /* no handle is needed.  */
-  else
-    {
-      char *nam = xasprintf ("__aw_h.%u", data->count);
-      handle_field_nam
-	= coro_make_frame_entry (data->field_list, nam, susp_typ, aw_loc);
-      free (nam);
-    }
-  register_await_info (aw_expr, aw_field_type, aw_field_nam, susp_typ,
-		       handle_field_nam);
+  register_await_info (aw_expr, aw_field_type, aw_field_nam);
 
   data->count++; /* Each await suspend context is unique.  */
 
@@ -3093,9 +3104,7 @@  act_des_fn (tree orig, tree fn_type, tree coro_frame_ptr, const char* name)
   handle_type self_handle;
   (maybe) parameter copies.
   coro1::suspend_never_prt __is;
-  (maybe) handle_type i_hand;
   coro1::suspend_always_prt __fs;
-  (maybe) handle_type f_hand;
   (maybe) local variables saved
   (maybe) trailing space.
  };  */
@@ -3297,19 +3306,7 @@  morph_fn_to_coro (tree orig, tree *resumer, tree *destroyer)
   tree init_susp_name = coro_make_frame_entry (&field_list, "__aw_s.is",
 					       initial_suspend_type, fn_start);
 
-  /* Figure out if we need a saved handle from the awaiter type.  */
-  tree ret_typ = get_await_suspend_return_type (initial_await);
-  tree init_hand_name;
-  if (VOID_TYPE_P (ret_typ) || TREE_CODE (ret_typ) == BOOLEAN_TYPE)
-    init_hand_name = NULL_TREE; /* no handle is needed.  */
-  else
-    {
-      init_hand_name
-	= coro_make_frame_entry (&field_list, "__ih", ret_typ, fn_start);
-    }
-
-  register_await_info (initial_await, initial_suspend_type, init_susp_name,
-		       ret_typ, init_hand_name);
+  register_await_info (initial_await, initial_suspend_type, init_susp_name);
 
   /* Now insert the data for any body await points, at this time we also need
      to promote any temporaries that are captured by reference (to regular
@@ -3324,18 +3321,7 @@  morph_fn_to_coro (tree orig, tree *resumer, tree *destroyer)
   tree fin_susp_name = coro_make_frame_entry (&field_list, "__aw_s.fs",
 					      final_suspend_type, fn_start);
 
-  ret_typ = get_await_suspend_return_type (final_await);
-  tree fin_hand_name;
-  if (VOID_TYPE_P (ret_typ) || TREE_CODE (ret_typ) == BOOLEAN_TYPE)
-    fin_hand_name = NULL_TREE; /* no handle is needed.  */
-  else
-    {
-      fin_hand_name
-	= coro_make_frame_entry (&field_list, "__fh", ret_typ, fn_start);
-    }
-
-  register_await_info (final_await, final_suspend_type, fin_susp_name,
-		       void_type_node, fin_hand_name);
+  register_await_info (final_await, final_suspend_type, fin_susp_name);
 
   /* 4. Now make space for local vars, this is conservative again, and we
      would expect to delete unused entries later.  */
diff --git a/gcc/testsuite/g++.dg/coroutines/torture/co-ret-09-bool-await-susp.C b/gcc/testsuite/g++.dg/coroutines/torture/co-ret-09-bool-await-susp.C
index 91f3f14cc09..d14c358c4c5 100644
--- a/gcc/testsuite/g++.dg/coroutines/torture/co-ret-09-bool-await-susp.C
+++ b/gcc/testsuite/g++.dg/coroutines/torture/co-ret-09-bool-await-susp.C
@@ -4,6 +4,9 @@ 
 
 #include "../coro.h"
 
+int coro1_dtor_ran = 0;
+int promise_dtor_ran = 0;
+
 struct coro1 {
   struct promise_type;
   using handle_type = coro::coroutine_handle<coro1::promise_type>;
@@ -24,10 +27,13 @@  struct coro1 {
 	PRINT("coro1 op=  ");
 	return *this;
   }
+
   ~coro1() {
         PRINT("Destroyed coro1");
-        if ( handle )
-          handle.destroy();
+        coro1_dtor_ran++;
+        // The coro handle will point to an invalid frame by this stage,
+        // the coroutine will already have self-destroyed the frame and
+        // promise.
   }
 
   struct suspend_never_prt {
@@ -51,7 +57,10 @@  struct coro1 {
 
   struct promise_type {
   promise_type() {  PRINT ("Created Promise"); }
-  ~promise_type() { PRINT ("Destroyed Promise"); }
+  ~promise_type() {
+     PRINT ("Destroyed Promise"); 
+     promise_dtor_ran++;
+   }
 
   coro1 get_return_object () {
     PRINT ("get_return_object: from handle from promise");
@@ -73,7 +82,7 @@  struct coro1 {
 };
 
 struct coro1
-f () noexcept
+my_coro () noexcept
 {
   PRINT ("coro1: about to return");
   co_return;
@@ -81,15 +90,26 @@  f () noexcept
 
 int main ()
 {
-  PRINT ("main: create coro1");
-  struct coro1 x = f ();
-  auto p = x.handle.promise ();
-  auto aw = p.initial_suspend();
-  auto f = aw.await_suspend(coro::coroutine_handle<coro1::promise_type>::from_address ((void *)&x));
-  PRINT ("main: got coro1 - should be done");
-  if (!x.handle.done())
+  { // scope so that we can examine the coro dtor marker.
+    PRINT ("main: creating coro");
+
+    // This should just run through to completion/destruction.
+    // In both the initial and final await expressions, the await_suspend()
+    // method will return 'false' and prevent the suspension.
+    struct coro1 x = my_coro ();
+
+    PRINT ("main: the coro frame should be already destroyed");
+    // We will take the running of the promise DTOR as evidence that the
+    // frame was destroyed as expected.
+    if (promise_dtor_ran != 1)
+      {
+	PRINT ("main: apparently we didn't destroy the frame");
+	abort ();
+      }
+  }
+  if (coro1_dtor_ran != 1 || promise_dtor_ran != 1)
     {
-      PRINT ("main: apparently was not done...");
+      PRINT ("main: bad DTOR counts");
       abort ();
     }
   PRINT ("main: returning");
diff --git a/gcc/testsuite/g++.dg/coroutines/torture/symmetric-transfer-00-basic.C b/gcc/testsuite/g++.dg/coroutines/torture/symmetric-transfer-00-basic.C
new file mode 100644
index 00000000000..c445fc55a2c
--- /dev/null
+++ b/gcc/testsuite/g++.dg/coroutines/torture/symmetric-transfer-00-basic.C
@@ -0,0 +1,110 @@ 
+//  { dg-do run }
+
+#if __has_include(<coroutine>)
+
+#include <coroutine>
+namespace coro = std;
+
+#elif __has_include(<experimental/coroutine>)
+
+#include <experimental/coroutine>
+namespace coro = std::experimental;
+
+#endif
+
+#if __clang__
+#  include <utility>
+#endif
+
+#include <chrono>
+#include <thread>
+ 
+template <typename T> 
+struct Loopy {
+  struct promise_type;
+  using handle_type = coro::coroutine_handle<Loopy::promise_type>;
+  handle_type handle;
+
+  struct promise_type {
+    // Cause the Loopy object to be created from the handle.
+    auto get_return_object () {
+     return handle_type::from_promise (*this);
+    }
+
+    coro::suspend_never yield_value(T value) {
+      //fprintf (stderr, "%s yields %d\n", me, value);
+      current_value = value;
+      return coro::suspend_never{};
+    }
+
+    coro::suspend_always initial_suspend() { return {}; }
+    coro::suspend_always final_suspend() { return {}; }
+
+    void unhandled_exception() { /*std::terminate();*/  }
+    void return_void() {}
+
+    void set_peer (handle_type  _p) { peer = _p; }
+    void set_me (const char *m) { me = m; }
+
+    T get_value () {return current_value;}
+    T current_value;
+    handle_type peer;
+    const char *me;
+  };
+
+  Loopy () : handle(0) {}
+  Loopy (handle_type _handle) : handle(_handle) {}
+  Loopy (const Loopy &) = delete;
+  Loopy (Loopy &&s) : handle(s.handle) { s.handle = nullptr; }
+  ~Loopy() { if ( handle ) handle.destroy(); }
+
+  struct loopy_awaiter {
+    handle_type p;
+    bool await_ready () { return false; }
+    /* continue the peer. */
+    handle_type await_suspend (handle_type h) {
+      p = h.promise().peer;
+      return p;
+    }
+    T await_resume () { return p.promise().get_value (); }
+  };
+};
+
+Loopy<int>
+pingpong (const char *id)
+{
+  int v = 0;
+  Loopy<int>::loopy_awaiter aw{};
+  for (;;)
+    {
+      co_yield (v+1);
+      //std::this_thread::sleep_for(std::chrono::milliseconds(500));
+      if (v > 10'000'000)
+        break;
+      v = co_await aw;
+      //fprintf (stderr, "%s = %d\n", id, v);
+    }
+ fprintf (stderr, "%s = %d\n", id, v);
+}
+
+int main ()
+{
+  // identify for fun.. 
+  const char *ping_id = "ping";
+  const char *pong_id = "pong";
+  auto ping = pingpong (ping_id); // created suspended.
+  auto pong = pingpong (pong_id);
+
+  // point them at each other...
+  ping.handle.promise().set_peer (pong.handle);
+  pong.handle.promise().set_peer (ping.handle);
+
+  ping.handle.promise().set_me (ping_id);
+  pong.handle.promise().set_me (pong_id);
+
+  // and start them ...
+  ping.handle.resume ();
+  pong.handle.resume ();
+  
+}
+