Patchwork [trans-mem] PR47690: redirect recursive calls in a clone correctly

login
register
mail settings
Submitter Aldy Hernandez
Date Feb. 15, 2011, 3:31 p.m.
Message ID <4D5A9C5D.4070809@redhat.com>
Download mbox | patch
Permalink /patch/83264/
State New
Headers show

Comments

Aldy Hernandez - Feb. 15, 2011, 3:31 p.m.
Consider a sample where a transaction callable function has a recursive 
call to itself:

int george;
void NewQueueNode()
{
         __transaction [[atomic]] { george=999; }
         NewQueueNode();
}

Currently, when we are transforming calls to their transactional clone 
counterparts, we stop transforming once we reach the end of a 
transaction.  Ultimately, this means that in the cloned function, the 
recursive call to NewQueueNode() will have the edges redirected to the 
clone, but the gimple_call_fndecl() in the clone will still be pointing 
to the original function.  This causes cgraph verification to fail with 
a "edge points to wrong declaration" ICE.

I think the original (not-cloned) version of NewQueueNode() should 
recurse to the original version.  And the cloned version of 
NewQueueNode() should recurse to the cloned function.  I have done this 
in the patch below by continuing the call transformation past the end of 
a transaction, but only on recursive calls to the cloned function.

With my patch, the original function gets instrumented like this:

NewQueueNode:
         pushl   %ebp
         movl    %esp, %ebp
         subl    $8, %esp
         movl    $41, %eax
         call    _ITM_beginTransaction
         movl    $999, %edx
         movl    $george, %eax
         call    _ITM_WU4
         call    _ITM_commitTransaction
         call    NewQueueNode
         leave

And the cloned function gets instrumented like this:

_ZGTt12NewQueueNode:
         pushl   %ebp
         movl    %esp, %ebp
         subl    $8, %esp
         movl    $41, %eax
         call    _ITM_beginTransaction
         movl    $999, %edx
         movl    $george, %eax
         call    _ITM_WU4
         call    _ITM_commitTransaction
         call    _ZGTt12NewQueueNode
         leave

What do you think?
Aldy
PR 47690
	* trans-mem.c (ipa_tm_transform_calls_1): Abstract edge
	redirection logic...
	(ipa_tm_transform_calls_redirect): ...here.  Redirect recursive
	clone calls even if we have scanned past the end of the
	transaction.
Patrick Marlier - Feb. 15, 2011, 3:53 p.m.
Hi Aldy,

On 02/15/2011 04:31 PM, Aldy Hernandez wrote:
> Consider a sample where a transaction callable function has a recursive
> call to itself:
>
> int george;
> void NewQueueNode()
> {
>           __transaction [[atomic]] { george=999; }
>           NewQueueNode();
> }
>
> Currently, when we are transforming calls to their transactional clone
> counterparts, we stop transforming once we reach the end of a
> transaction.  Ultimately, this means that in the cloned function, the
> recursive call to NewQueueNode() will have the edges redirected to the
> clone, but the gimple_call_fndecl() in the clone will still be pointing
> to the original function.  This causes cgraph verification to fail with
> a "edge points to wrong declaration" ICE.

But in this case, why is NewQueueNode cloned?
It shouldn't because it is never called in a transaction and it is not a 
transaction safe function.

Patrick.
Richard Henderson - Feb. 15, 2011, 4:40 p.m.
On 02/15/2011 07:53 AM, Patrick Marlier wrote:
> Hi Aldy,
> 
> On 02/15/2011 04:31 PM, Aldy Hernandez wrote:
>> Consider a sample where a transaction callable function has a recursive
>> call to itself:
>>
>> int george;
>> void NewQueueNode()
>> {
>>           __transaction [[atomic]] { george=999; }
>>           NewQueueNode();
>> }
>>
>> Currently, when we are transforming calls to their transactional clone
>> counterparts, we stop transforming once we reach the end of a
>> transaction.  Ultimately, this means that in the cloned function, the
>> recursive call to NewQueueNode() will have the edges redirected to the
>> clone, but the gimple_call_fndecl() in the clone will still be pointing
>> to the original function.  This causes cgraph verification to fail with
>> a "edge points to wrong declaration" ICE.
> 
> But in this case, why is NewQueueNode cloned?
> It shouldn't because it is never called in a transaction and it is not a transaction safe function.

I was about to ask the very same question.  Though if the test case is
transformed to add __attribute__((transaction_callable)), then we do 
have a problem.

I must say the "outside_transaction" parameter is faulty to the point
of confusing.  Inside a clone, we are *never* outside the transaction;
the transaction begins somewhere up the call stack.


r~
Patrick Marlier - Feb. 15, 2011, 6:21 p.m.
On Tue, 15 Feb 2011, Richard Henderson wrote:

> On 02/15/2011 07:53 AM, Patrick Marlier wrote:
>> Hi Aldy,
>>
>> On 02/15/2011 04:31 PM, Aldy Hernandez wrote:
>>> Consider a sample where a transaction callable function has a recursive
>>> call to itself:
>>>
>>> int george;
>>> void NewQueueNode()
>>> {
>>>           __transaction [[atomic]] { george=999; }
>>>           NewQueueNode();
>>> }
>>>
>>> Currently, when we are transforming calls to their transactional clone
>>> counterparts, we stop transforming once we reach the end of a
>>> transaction.  Ultimately, this means that in the cloned function, the
>>> recursive call to NewQueueNode() will have the edges redirected to the
>>> clone, but the gimple_call_fndecl() in the clone will still be pointing
>>> to the original function.  This causes cgraph verification to fail with
>>> a "edge points to wrong declaration" ICE.
>>
>> But in this case, why is NewQueueNode cloned?
>> It shouldn't because it is never called in a transaction and it is not a transaction safe function.
>
> I was about to ask the very same question.

Actually I have already filled a PR about this cloned problem:

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=47689

The testcase inside is almost the same.

Patrick.
Aldy Hernandez - Feb. 15, 2011, 6:30 p.m.
> Actually I have already filled a PR about this cloned problem:
>
> http://gcc.gnu.org/bugzilla/show_bug.cgi?id=47689
>
> The testcase inside is almost the same.

Yeah!  Two PRs, one patch.  I'm on a roll.
Aldy Hernandez - Feb. 15, 2011, 6:34 p.m.
>> But in this case, why is NewQueueNode cloned?
 >> It shouldn't because it is never called in a transaction and it is
 >> not a transaction safe function.

You are both right.  I was assuming the function was 
transaction_callable (which, as rth points out, still exhibits the problem).

The reason we are incorrectly cloning the function is because in 
ipa_tm_scan_calls_transaction() we see that the BB for the call is in 
the set of transactional blocks, but we fail to notice that within a 
transactional block, we can have instructions past the end of a 
transaction.  For example:

<bb 2>:
   __transaction
<bb 3>:
   george = 999;
   __builtin__ITM_commitTransaction ();
   NewQueueNode ();
   return;

Here we have BB3 which is in the global set of blocks inside of a 
transaction, but the call to NewQueueNode() is actually beyond the 
commit.  So we must verify that the call in question is not beyond a 
transaction ending statement.  I have done so in the updated patch below.

 > I was about to ask the very same question.  Though if the test case is
 > transformed to add __attribute__((transaction_callable)), then we do
 > have a problem.

Yes, and my original patch fixes this particular problem though I didn't 
even know it at the time :).

 > I must say the "outside_transaction" parameter is faulty to the point
 > of confusing.  Inside a clone, we are *never* outside the transaction;
 > the transaction begins somewhere up the call stack.

How about "past_tm_ending_stmt", that's fairly unambiguous :)?

The attached patch fixes both problems.

OK?

Patch

Index: testsuite/gcc.dg/tm/pr47690.c
===================================================================
--- testsuite/gcc.dg/tm/pr47690.c	(revision 0)
+++ testsuite/gcc.dg/tm/pr47690.c	(revision 0)
@@ -0,0 +1,12 @@ 
+/* { dg-do compile } */
+/* { dg-options "-fgnu-tm" } */
+
+int george;
+
+void NewQueueNode()
+{
+  __transaction [[atomic]] {
+      george=999;
+  }
+  NewQueueNode();
+}
Index: trans-mem.c
===================================================================
--- trans-mem.c	(revision 170142)
+++ trans-mem.c	(working copy)
@@ -4279,6 +4279,89 @@  ipa_tm_insert_gettmclone_call (struct cg
   return true;
 }
 
+/* Helper function for ipa_tm_transform_calls*.  Given a call
+   statement in GSI which resides inside transaction REGION, redirect
+   the call to either its wrapper function, or its clone.
+   OUTSIDE_TRANSACTION is true if we have scanned past the end of the
+   transaction.  */
+
+static void
+ipa_tm_transform_calls_redirect (struct cgraph_node *node,
+				 struct tm_region *region,
+				 gimple_stmt_iterator *gsi,
+				 bool outside_transaction,
+				 bool *need_ssa_rename_p)
+{
+  gimple stmt = gsi_stmt (*gsi);
+  struct cgraph_node *new_node;
+  struct cgraph_edge *e = cgraph_edge (node, stmt);
+  tree fndecl = gimple_call_fndecl (stmt);
+
+  /* We only redirect calls inside of a transaction, with the
+     exception of recursive calls to the clone itself.  We must make
+     sure recursive calls in a cloned function get redirected to the
+     clone itself, not the original function.  So... bail unless we're
+     inside a transaction or we're handling this special case.  */
+  if (outside_transaction
+      && (e->caller != e->callee
+	  || !DECL_IS_TM_CLONE (e->callee->decl)))
+    return;
+
+  /* For a recursive call to the clone, call the clone directly.  No
+     need to go through the runtime...  */
+  if (e->caller == e->callee
+      && DECL_IS_TM_CLONE (e->callee->decl))
+    fndecl = e->callee->decl;
+  else
+    {
+      /* ...otherwise start looking for a replacement.  */
+      fndecl = find_tm_replacement_function (fndecl);
+    }
+
+  /* If there is a replacement, use it, otherwise use the clone.  */
+  if (fndecl)
+    {
+      new_node = cgraph_node (fndecl);
+
+      /* ??? Mark all transaction_wrap functions tm_may_enter_irr.
+
+	 We can't do this earlier in record_tm_replacement because
+	 cgraph_remove_unreachable_nodes is called before we inject
+	 references to the node.  Further, we can't do this in some
+	 nice central place in ipa_tm_execute because we don't have
+	 the exact list of wrapper functions that would be used.
+	 Marking more wrappers than necessary results in the creation
+	 of unnecessary cgraph_nodes, which can cause some of the
+	 other IPA passes to crash.
+
+	 We do need to mark these nodes so that we get the proper
+	 result in expand_call_tm.  */
+      new_node->local.tm_may_enter_irr = 1;
+    }
+  else
+    {
+      struct tm_ipa_cg_data *d = get_cg_data (e->callee);
+      new_node = d->clone;
+
+      /* As we've already skipped pure calls and appropriate builtins,
+	 and we've already marked irrevocable blocks, if we can't come
+	 up with a static replacement, then ask the runtime.  */
+      if (new_node == NULL)
+	{
+	  *need_ssa_rename_p |=
+	    ipa_tm_insert_gettmclone_call (node, region, gsi, stmt);
+	  cgraph_remove_edge (e);
+	  return;
+	}
+
+      fndecl = new_node->decl;
+    }
+
+  cgraph_redirect_edge_callee (e, new_node);
+  gimple_call_set_fndecl (stmt, fndecl);
+  return;
+}
+
 /* Helper function for ipa_tm_transform_calls.  For a given BB,
    install calls to tm_irrevocable when IRR_BLOCKS are reached,
    redirect other calls to the generated transactional clone.  */
@@ -4289,6 +4372,7 @@  ipa_tm_transform_calls_1 (struct cgraph_
 {
   gimple_stmt_iterator gsi;
   bool need_ssa_rename = false;
+  bool outside_transaction = false;
 
   if (irr_blocks && bitmap_bit_p (irr_blocks, bb->index))
     {
@@ -4299,8 +4383,6 @@  ipa_tm_transform_calls_1 (struct cgraph_
   for (gsi = gsi_start_bb (bb); !gsi_end_p (gsi); gsi_next (&gsi))
     {
       gimple stmt = gsi_stmt (gsi);
-      struct cgraph_edge *e;
-      struct cgraph_node *new_node;
       tree fndecl;
 
       if (!is_gimple_call (stmt))
@@ -4310,61 +4392,26 @@  ipa_tm_transform_calls_1 (struct cgraph_
 
       fndecl = gimple_call_fndecl (stmt);
 
-      /* For indirect calls, pass the address through the runtime.  */
-      if (fndecl == NULL)
-	{
-	  need_ssa_rename |=
-	    ipa_tm_insert_gettmclone_call (node, region, &gsi, stmt);
-	  continue;
-	}
-
-      /* Don't scan past the end of the transaction.  */
-      if (is_tm_ending_fndecl (fndecl))
-	break;
-      e = cgraph_edge (node, stmt);
-
-      /* If there is a replacement, use it, otherwise use the clone.  */
-      fndecl = find_tm_replacement_function (fndecl);
-      if (fndecl)
-	{
-	  new_node = cgraph_node (fndecl);
-
-	  /* ??? Mark all transaction_wrap functions tm_may_enter_irr.
-
-	     We can't do this earlier in record_tm_replacement because
-	     cgraph_remove_unreachable_nodes is called before we inject
-	     references to the node.  Further, we can't do this in some
-	     nice central place in ipa_tm_execute because we don't have
-	     the exact list of wrapper functions that would be used.
-	     Marking more wrappers than necessary results in the creation
-	     of unnecessary cgraph_nodes, which can cause some of the
-	     other IPA passes to crash.
-
-	     We do need to mark these nodes so that we get the proper
-	     result in expand_call_tm.  */
-	  new_node->local.tm_may_enter_irr = 1;
-	}
-      else
+      if (!outside_transaction)
 	{
-	  struct tm_ipa_cg_data *d = get_cg_data (e->callee);
-	  new_node = d->clone;
-
-	  /* As we've already skipped pure calls and appropriate builtins,
-	     and we've already marked irrevocable blocks, if we can't come
-	     up with a static replacement, then ask the runtime.  */
-	  if (new_node == NULL)
+	  /* For indirect calls, pass the address through the runtime.  */
+	  if (fndecl == NULL)
 	    {
 	      need_ssa_rename |=
-	        ipa_tm_insert_gettmclone_call (node, region, &gsi, stmt);
-	      cgraph_remove_edge (e);
+		ipa_tm_insert_gettmclone_call (node, region, &gsi, stmt);
 	      continue;
 	    }
 
-	  fndecl = new_node->decl;
+	  if (is_tm_ending_fndecl (fndecl))
+	    outside_transaction = true;
 	}
 
-      cgraph_redirect_edge_callee (e, new_node);
-      gimple_call_set_fndecl (stmt, fndecl);
+      /* Redirect edges to the appropriate replacement or clone.  This
+	 will even redirect calls outside of a transaction, but only
+	 if they're recursive calls to the clone.  */
+      ipa_tm_transform_calls_redirect (node, region, &gsi,
+				       outside_transaction,
+				       &need_ssa_rename);
     }
 
   return need_ssa_rename;