Patchwork [trans-mem] undefined reference to a static cloned function

login
register
mail settings
Submitter Aldy Hernandez
Date June 10, 2010, 1:37 p.m.
Message ID <20100610133753.GA13188@redhat.com>
Download mbox | patch
Permalink /patch/55221/
State New
Headers show

Comments

Aldy Hernandez - June 10, 2010, 1:37 p.m.
> gcc -fgnu-tm -Wall -DNDEBUG -O3 element.c -o element -litm -lm
> /tmp/ccEhZPec.o:(.tm_clone_table+0x18): undefined reference to
> `transaction clone for calculateCircumCircle'

In the testcase below we have a clone that gets inlined and optimized
away, so it doesn't get outputted, but we still dump its address into
the clone table.

I've fixed this by not dumping entries into the clone table for which
clones are not needed.  Previously we were looking at the neediness of
the original function, but since we now mark clones as needed in
ipa_tm_insert_gettmclone_call(), we are assured to have the needy bit
set correctly.

Also, I got tired of IPA dumps not distinguishing between regular
functions and TM clones.  I've made the CFG dumper distinguish between
the two.

OK for branch?

	* varasm.c (finish_tm_clone_pairs_1): 
	* tree-cfg.c (dump_function_to_file): Display [tm-clone] if
	applicable.
Richard Henderson - June 10, 2010, 10:37 p.m.
On 06/10/2010 06:37 AM, Aldy Hernandez wrote:
> 	* varasm.c (finish_tm_clone_pairs_1): 
> 	* tree-cfg.c (dump_function_to_file): Display [tm-clone] if
> 	applicable.

Ok if you finish writing the changelog entry for finish_tm_clone_pairs_1.


r~
Patrick Marlier - June 14, 2010, 7:27 a.m.
Hello Aldy,

> +element_t*
> +element_alloc (coordinate_t* coordinates, long numCoordinate)
> +{
> +    element_t* elementPtr;
> +    elementPtr = (element_t*)xmalloc(sizeof(element_t));
> +    if (elementPtr) {
> +        calculateCircumCircle(elementPtr);
> +    }
> +    return elementPtr;
> +}
> +
> +__attribute__((transaction_safe))
> +element_t*
> +TMelement_alloc (coordinate_t* coordinates, long numCoordinate)
> +{
> +    element_t* elementPtr;
> +    elementPtr = (element_t*)xmalloc(sizeof(element_t));
> +    if (elementPtr) {
> +        calculateCircumCircle(elementPtr);
> +    }
> +    return elementPtr;
> +}

As we can see here, calculateCircumCircle function can be called into a 
transaction_safe function but also in a regular function. So the 
function calculateCircumCircle must have the transaction_safe attribute 
but...

if calculateCircumCircle is declared and defined like that (without 
transaction_safe):
static void
calculateCircumCircle (element_t* elementPtr)

GCC doesn't complain about transaction_safe attribute. The reason could 
be that the elementPtr is allocated inside the transaction (is gcc-tm 
able to detect that?) but if I modify the function TMelement_alloc to 
use a global variable, gcc still doesn't complain. So I don't understand 
the behaviour here.

Can you just explain me why it doesn't complain that the 
calculateCircumCircle isn't safe? is it a mistake?

Thank you,

Patrick.
Aldy Hernandez - June 15, 2010, 2:41 p.m.
> As we can see here, calculateCircumCircle function can be called into a  
> transaction_safe function but also in a regular function. So the  
> function calculateCircumCircle must have the transaction_safe attribute  
> but...
>
> if calculateCircumCircle is declared and defined like that (without  
> transaction_safe):
> static void
> calculateCircumCircle (element_t* elementPtr)
>
> GCC doesn't complain about transaction_safe attribute. The reason could  

GCC doesn't complain because it can determine that calculateCircumCircle
never enters serial irrevocable mode (besides I believe in this case the
compiler inlines the call to calculateCircumCircle).

This exclusion is done in ipa_tm_diagnose_tm_safe():

	 if (!is_tm_callable (e->callee->decl)
		&& e->callee->local.tm_may_enter_irr)
	      error_at (gimple_location (e->call_stmt),

I suppose we could warn regardless.  I don't know if having the compiler
outsmart us was part of rth's plan.  Richard?

Aldy
Richard Henderson - June 15, 2010, 4:17 p.m.
On 06/15/2010 07:41 AM, Aldy Hernandez wrote:
>> As we can see here, calculateCircumCircle function can be called into a  
>> transaction_safe function but also in a regular function. So the  
>> function calculateCircumCircle must have the transaction_safe attribute  
>> but...
>>
>> if calculateCircumCircle is declared and defined like that (without  
>> transaction_safe):
>> static void
>> calculateCircumCircle (element_t* elementPtr)
>>
>> GCC doesn't complain about transaction_safe attribute. The reason could  
> 
> GCC doesn't complain because it can determine that calculateCircumCircle
> never enters serial irrevocable mode (besides I believe in this case the
> compiler inlines the call to calculateCircumCircle).
> 
> This exclusion is done in ipa_tm_diagnose_tm_safe():
> 
> 	 if (!is_tm_callable (e->callee->decl)
> 		&& e->callee->local.tm_may_enter_irr)
> 	      error_at (gimple_location (e->call_stmt),
> 
> I suppose we could warn regardless.  I don't know if having the compiler
> outsmart us was part of rth's plan.  Richard?

The intent of the spec is that static functions need not be explicitly
annotated, since the compiler *can* derive all the information it needs,
including implicitly generating a clone.

I've sort of lost the thread here... does that even answer the question?


r~
Patrick Marlier - June 16, 2010, 9:31 a.m.
On 06/15/2010 06:17 PM, Richard Henderson wrote:
> On 06/15/2010 07:41 AM, Aldy Hernandez wrote:
>>> As we can see here, calculateCircumCircle function can be called into a
>>> transaction_safe function but also in a regular function. So the
>>> function calculateCircumCircle must have the transaction_safe attribute
>>> but...
>>>
>>> if calculateCircumCircle is declared and defined like that (without
>>> transaction_safe):
>>> static void
>>> calculateCircumCircle (element_t* elementPtr)
>>>
>>> GCC doesn't complain about transaction_safe attribute. The reason could
>>
>> GCC doesn't complain because it can determine that calculateCircumCircle
>> never enters serial irrevocable mode (besides I believe in this case the
>> compiler inlines the call to calculateCircumCircle).
>
> The intent of the spec is that static functions need not be explicitly
> annotated, since the compiler *can* derive all the information it needs,
> including implicitly generating a clone.
>
> I've sort of lost the thread here... does that even answer the question?

Yes! Thank you very much :)
Since most functions need transaction_safe, I was supposing it was 
mandatory. Arg sorry for the noise.

Patrick

Patch

Index: testsuite/gcc.dg/tm/20100610.c
===================================================================
--- testsuite/gcc.dg/tm/20100610.c	(revision 0)
+++ testsuite/gcc.dg/tm/20100610.c	(revision 0)
@@ -0,0 +1,90 @@ 
+/* { dg-do compile } */
+/* { dg-options "-fgnu-tm -O3" } */
+
+/* The function calculateCircumCircle() should get inlined into the TM
+   clone for TMelement_alloc(), so we don't need to generate a TM
+   clone for calculateCircumCircle().  We also don't need to put its
+   entry into the clone table since it's static.  */
+
+/* { dg-final { scan-assembler-not "ZGTt21calculateCircumCircle" } } */
+
+extern double sqrt(double) __attribute__((transaction_pure));
+extern void *xmalloc(int) __attribute__((transaction_safe));
+
+typedef struct coordinate {
+    double x;
+    double y;
+} coordinate_t;
+typedef struct element {
+    coordinate_t coordinates[3];
+    long numCoordinate;
+    coordinate_t circumCenter;
+    double circumRadius;
+} element_t;
+
+__attribute__((transaction_safe))
+double
+coordinate_distance (coordinate_t* coordinatePtr, coordinate_t* aPtr)
+{
+    return sqrt( coordinatePtr->x );
+}
+
+__attribute__((transaction_safe))
+static void
+calculateCircumCircle (element_t* elementPtr)
+{
+    long numCoordinate = elementPtr->numCoordinate;
+    coordinate_t* coordinates = elementPtr->coordinates;
+    coordinate_t* circumCenterPtr = &elementPtr->circumCenter;
+    ((void) (0));
+    if (numCoordinate == 2) {
+        circumCenterPtr->x = (coordinates[0].x + coordinates[1].x) / 2.0;
+        circumCenterPtr->y = (coordinates[0].y + coordinates[1].y) / 2.0;
+    }
+ else {
+        double ax = coordinates[0].x;
+        double ay = coordinates[0].y;
+        double bx = coordinates[1].x;
+        double by = coordinates[1].y;
+        double cx = coordinates[2].x;
+        double cy = coordinates[2].y;
+        double bxDelta = bx - ax;
+        double byDelta = by - ay;
+        double cxDelta = cx - ax;
+        double cyDelta = cy - ay;
+        double bDistance2 = (bxDelta * bxDelta) + (byDelta * byDelta);
+        double cDistance2 = (cxDelta * cxDelta) + (cyDelta * cyDelta);
+        double xNumerator = (byDelta * cDistance2) - (cyDelta * bDistance2);
+        double yNumerator = (bxDelta * cDistance2) - (cxDelta * bDistance2);
+        double denominator = 2 * ((bxDelta * cyDelta) - (cxDelta * byDelta));
+        double rx = ax - (xNumerator / denominator);
+        double ry = ay + (yNumerator / denominator);
+        circumCenterPtr->x = rx;
+        circumCenterPtr->y = ry;
+    }
+    elementPtr->circumRadius = coordinate_distance(circumCenterPtr,
+                                                   &coordinates[0]);
+}
+
+element_t*
+element_alloc (coordinate_t* coordinates, long numCoordinate)
+{
+    element_t* elementPtr;
+    elementPtr = (element_t*)xmalloc(sizeof(element_t));
+    if (elementPtr) {
+        calculateCircumCircle(elementPtr);
+    }
+    return elementPtr;
+}
+
+__attribute__((transaction_safe))
+element_t*
+TMelement_alloc (coordinate_t* coordinates, long numCoordinate)
+{
+    element_t* elementPtr;
+    elementPtr = (element_t*)xmalloc(sizeof(element_t));
+    if (elementPtr) {
+        calculateCircumCircle(elementPtr);
+    }
+    return elementPtr;
+}
Index: varasm.c
===================================================================
--- varasm.c	(revision 160538)
+++ varasm.c	(working copy)
@@ -5850,9 +5850,15 @@  finish_tm_clone_pairs_1 (void **slot, vo
   bool *switched = (bool *) info;
   tree src = map->base.from;
   tree dst = map->to;
-  struct cgraph_node *src_n = cgraph_node (src);
+  struct cgraph_node *dst_n = cgraph_node (dst);
 
-  if (!src_n->needed)
+  /* The function ipa_tm_create_version() marks the clone as needed if
+     the original function was needed.  But we also mark the clone as
+     needed if we ever called the clone indirectly through
+     TM_GETTMCLONE.  If neither of these are true, we didn't generate
+     a clone, and we didn't call it indirectly... no sense keeping it
+     in the clone table.  */
+  if (!dst_n->needed)
     return 1;
 
   if (!*switched)
Index: tree-cfg.c
===================================================================
--- tree-cfg.c	(revision 160538)
+++ tree-cfg.c	(working copy)
@@ -6215,8 +6215,10 @@  dump_function_to_file (tree fn, FILE *fi
   bool ignore_topmost_bind = false, any_var = false;
   basic_block bb;
   tree chain;
+  bool tmclone = TREE_CODE (fn) == FUNCTION_DECL && DECL_IS_TM_CLONE (fn);
 
-  fprintf (file, "%s (", lang_hooks.decl_printable_name (fn, 2));
+  fprintf (file, "%s %s(", lang_hooks.decl_printable_name (fn, 2),
+	   tmclone ? "[tm-clone] " : "");
 
   arg = DECL_ARGUMENTS (fn);
   while (arg)