Patchwork Fix PR52448 (cselim with calls)

login
register
mail settings
Submitter Michael Matz
Date Feb. 6, 2013, 1:45 p.m.
Message ID <alpine.LNX.2.00.1302061433310.23146@wotan.suse.de>
Download mbox | patch
Permalink /patch/218614/
State New
Headers show

Comments

Michael Matz - Feb. 6, 2013, 1:45 p.m.
Hi,

marking of non-trapping accesses currently has an issue when there's a 
call between the dominated and the dominating access (that makes the 
former non-trapping), namely when that call frees the accessed memory (or 
makes it unavailable via other means).

Instead of a full-blown algorithm iterating until a fixed point to exactly 
determine the killed set of accesses I simply clear the whole hash table 
of seen accesses whenever I see a call, which requires being conservative 
on backedges (or generally when there are unvisited predecessors, but the 
current domwalker ensures that domchilds are visited in topological 
order).  Quick clearing is done by using a phase counter.

This still retains the important transformation in hmmer, but fixes the 
bug.  Regstrapped on x86_64-linux, okay for trunk?


Ciao,
Michael.
Jakub Jelinek - Feb. 6, 2013, 1:53 p.m.
On Wed, Feb 06, 2013 at 02:45:10PM +0100, Michael Matz wrote:
> @@ -1329,20 +1337,53 @@ add_or_mark_expr (basic_block bb, tree e
>      }
>  }
>  
> +/* Return true when CALL is a call stmt that definitely doesn't
> +   free any memory or makes it unavailable otherwise.  */
> +static bool
> +nonfreeing_call_p (gimple call)
> +{
> +  tree fndecl = gimple_call_fndecl (call);
> +
> +  if (!fndecl || DECL_BUILT_IN_CLASS (fndecl) != BUILT_IN_NORMAL)
> +    return false;
> +
> +  switch (DECL_FUNCTION_CODE (fndecl))
> +    {
> +      case BUILT_IN_FREE:
> +      case BUILT_IN_STACK_RESTORE:
> +	return false;
> +      default:
> +	return true;
> +    }
> +}

This doesn't look conservative enough.

First of all, I'd use gimple_call_builtin_p (call, BUILT_IN_NORMAL)
test.  And, the function should certainly conservatively return false
for all builtins that aren't (gimple_call_flags (call) & ECF_LEAF) != 0,
otherwise they might call hooks in the current CU and similar and free in
those hooks (likewise setjmp or similar, but again, that is not ECF_LEAF).

	Jakub
Michael Matz - Feb. 6, 2013, 2:20 p.m.
Hi,

On Wed, 6 Feb 2013, Jakub Jelinek wrote:

> First of all, I'd use gimple_call_builtin_p (call, BUILT_IN_NORMAL) 
> test.  And, the function should certainly conservatively return false 
> for all builtins that aren't (gimple_call_flags (call) & ECF_LEAF) != 0, 
> otherwise they might call hooks in the current CU and similar and free 
> in those hooks (likewise setjmp or similar, but again, that is not 
> ECF_LEAF).

Sure.  Consider the patch changed to introduce this function:

static bool
nonfreeing_call_p (gimple call)
{
  if (gimple_call_builtin_p (call, BUILT_IN_NORMAL)
      && gimple_call_flags (call) & ECF_LEAF)
    return true;

  return false;
}

Tested on the testcases and hmmer, okay after regstrapping?


Ciao,
Michael.
Richard Guenther - Feb. 6, 2013, 2:42 p.m.
On Wed, Feb 6, 2013 at 3:20 PM, Michael Matz <matz@suse.de> wrote:
> Hi,
>
> On Wed, 6 Feb 2013, Jakub Jelinek wrote:
>
>> First of all, I'd use gimple_call_builtin_p (call, BUILT_IN_NORMAL)
>> test.  And, the function should certainly conservatively return false
>> for all builtins that aren't (gimple_call_flags (call) & ECF_LEAF) != 0,
>> otherwise they might call hooks in the current CU and similar and free
>> in those hooks (likewise setjmp or similar, but again, that is not
>> ECF_LEAF).
>
> Sure.  Consider the patch changed to introduce this function:
>
> static bool
> nonfreeing_call_p (gimple call)
> {
>   if (gimple_call_builtin_p (call, BUILT_IN_NORMAL)
>       && gimple_call_flags (call) & ECF_LEAF)
>     return true;
>
>   return false;
> }
>
> Tested on the testcases and hmmer, okay after regstrapping?

realloc can also effectively free memory.  But the above no longer
considers free () a freeing call as well?!  builtins.def also suggests
built_in_tm_free is to be handled as well.

That said, please enumerate freeing builtins even if they are
not ECF_LEAF at the moment - they might become such I suppose
(realloc is LEAF, free is not) - curiously there is no __builtin_in_tm_realloc.

Richard.

>
> Ciao,
> Michael.
Michael Matz - Feb. 6, 2013, 3:07 p.m.
Hi,

On Wed, 6 Feb 2013, Richard Biener wrote:

> realloc can also effectively free memory.  But the above no longer
> considers free () a freeing call as well?!

free is not ECF_LEAF, so is handled correctly, like tm_free.  I didn't 
consider realloc, it's indeed not handled.

> That said, please enumerate freeing builtins even if they are not 
> ECF_LEAF at the moment - they might become such I suppose (realloc is 
> LEAF, free is not) - curiously there is no __builtin_in_tm_realloc.

The function now reads:

--------------
static bool
nonfreeing_call_p (gimple call)
{
  if (gimple_call_builtin_p (call, BUILT_IN_NORMAL)
      && gimple_call_flags (call) & ECF_LEAF)
    switch (DECL_FUNCTION_CODE (gimple_call_fndecl (call)))
      {
        /* Just in case these become ECF_LEAF in the future.  */
        case BUILT_IN_FREE:
        case BUILT_IN_TM_FREE:
        case BUILT_IN_REALLOC:
        case BUILT_IN_STACK_RESTORE:
          return false;
        default:
          return true;
      }

  return false;
}
--------------

Testcase and hmmer work, okay after regstrapping?


Ciao,
Michael.
Richard Guenther - Feb. 6, 2013, 3:12 p.m.
On Wed, Feb 6, 2013 at 4:07 PM, Michael Matz <matz@suse.de> wrote:
> Hi,
>
> On Wed, 6 Feb 2013, Richard Biener wrote:
>
>> realloc can also effectively free memory.  But the above no longer
>> considers free () a freeing call as well?!
>
> free is not ECF_LEAF, so is handled correctly, like tm_free.  I didn't
> consider realloc, it's indeed not handled.
>
>> That said, please enumerate freeing builtins even if they are not
>> ECF_LEAF at the moment - they might become such I suppose (realloc is
>> LEAF, free is not) - curiously there is no __builtin_in_tm_realloc.
>
> The function now reads:
>
> --------------
> static bool
> nonfreeing_call_p (gimple call)
> {
>   if (gimple_call_builtin_p (call, BUILT_IN_NORMAL)
>       && gimple_call_flags (call) & ECF_LEAF)
>     switch (DECL_FUNCTION_CODE (gimple_call_fndecl (call)))
>       {
>         /* Just in case these become ECF_LEAF in the future.  */
>         case BUILT_IN_FREE:
>         case BUILT_IN_TM_FREE:
>         case BUILT_IN_REALLOC:
>         case BUILT_IN_STACK_RESTORE:
>           return false;
>         default:
>           return true;
>       }
>
>   return false;
> }
> --------------
>
> Testcase and hmmer work, okay after regstrapping?

Ok.

Thanks,
Richard.

>
> Ciao,
> Michael.

Patch

Index: tree-ssa-phiopt.c
===================================================================
--- tree-ssa-phiopt.c	(revision 195753)
+++ tree-ssa-phiopt.c	(working copy)
@@ -1233,6 +1233,7 @@  abs_replacement (basic_block cond_bb, ba
 struct name_to_bb
 {
   unsigned int ssa_name_ver;
+  unsigned int phase;
   bool store;
   HOST_WIDE_INT offset, size;
   basic_block bb;
@@ -1241,6 +1242,10 @@  struct name_to_bb
 /* The hash table for remembering what we've seen.  */
 static htab_t seen_ssa_names;
 
+/* Used for quick clearing of the hash-table when we see calls.
+   Hash entries with phase < nt_call_phase are invalid.  */
+static unsigned int nt_call_phase;
+
 /* The set of MEM_REFs which can't trap.  */
 static struct pointer_set_t *nontrap_set;
 
@@ -1291,6 +1296,7 @@  add_or_mark_expr (basic_block bb, tree e
       /* Try to find the last seen MEM_REF through the same
          SSA_NAME, which can trap.  */
       map.ssa_name_ver = SSA_NAME_VERSION (name);
+      map.phase = 0;
       map.bb = 0;
       map.store = store;
       map.offset = tree_low_cst (TREE_OPERAND (exp, 1), 0);
@@ -1298,13 +1304,13 @@  add_or_mark_expr (basic_block bb, tree e
 
       slot = htab_find_slot (seen_ssa_names, &map, INSERT);
       n2bb = (struct name_to_bb *) *slot;
-      if (n2bb)
+      if (n2bb && n2bb->phase >= nt_call_phase)
         found_bb = n2bb->bb;
 
       /* If we've found a trapping MEM_REF, _and_ it dominates EXP
          (it's in a basic block on the path from us to the dominator root)
 	 then we can't trap.  */
-      if (found_bb && found_bb->aux == (void *)1)
+      if (found_bb && (((size_t)found_bb->aux) & 1) == 1)
 	{
 	  pointer_set_insert (nontrap, exp);
 	}
@@ -1313,12 +1319,14 @@  add_or_mark_expr (basic_block bb, tree e
 	  /* EXP might trap, so insert it into the hash table.  */
 	  if (n2bb)
 	    {
+	      n2bb->phase = nt_call_phase;
 	      n2bb->bb = bb;
 	    }
 	  else
 	    {
 	      n2bb = XNEW (struct name_to_bb);
 	      n2bb->ssa_name_ver = SSA_NAME_VERSION (name);
+	      n2bb->phase = nt_call_phase;
 	      n2bb->bb = bb;
 	      n2bb->store = store;
 	      n2bb->offset = map.offset;
@@ -1329,20 +1337,53 @@  add_or_mark_expr (basic_block bb, tree e
     }
 }
 
+/* Return true when CALL is a call stmt that definitely doesn't
+   free any memory or makes it unavailable otherwise.  */
+static bool
+nonfreeing_call_p (gimple call)
+{
+  tree fndecl = gimple_call_fndecl (call);
+
+  if (!fndecl || DECL_BUILT_IN_CLASS (fndecl) != BUILT_IN_NORMAL)
+    return false;
+
+  switch (DECL_FUNCTION_CODE (fndecl))
+    {
+      case BUILT_IN_FREE:
+      case BUILT_IN_STACK_RESTORE:
+	return false;
+      default:
+	return true;
+    }
+}
+
 /* Called by walk_dominator_tree, when entering the block BB.  */
 static void
 nt_init_block (struct dom_walk_data *data ATTRIBUTE_UNUSED, basic_block bb)
 {
+  edge e;
+  edge_iterator ei;
   gimple_stmt_iterator gsi;
-  /* Mark this BB as being on the path to dominator root.  */
-  bb->aux = (void*)1;
+
+  /* If we haven't seen all our predecessors, clear the hash-table.  */
+  FOR_EACH_EDGE (e, ei, bb->preds)
+    if ((((size_t)e->src->aux) & 2) == 0)
+      {
+	nt_call_phase++;
+	break;
+      }
+
+  /* Mark this BB as being on the path to dominator root and as visited.  */
+  bb->aux = (void*)(1 | 2);
 
   /* And walk the statements in order.  */
   for (gsi = gsi_start_bb (bb); !gsi_end_p (gsi); gsi_next (&gsi))
     {
       gimple stmt = gsi_stmt (gsi);
 
-      if (gimple_assign_single_p (stmt) && !gimple_has_volatile_ops (stmt))
+      if (is_gimple_call (stmt) && !nonfreeing_call_p (stmt))
+	nt_call_phase++;
+      else if (gimple_assign_single_p (stmt) && !gimple_has_volatile_ops (stmt))
 	{
 	  add_or_mark_expr (bb, gimple_assign_lhs (stmt), nontrap_set, true);
 	  add_or_mark_expr (bb, gimple_assign_rhs1 (stmt), nontrap_set, false);
@@ -1355,7 +1396,7 @@  static void
 nt_fini_block (struct dom_walk_data *data ATTRIBUTE_UNUSED, basic_block bb)
 {
   /* This BB isn't on the path to dominator root anymore.  */
-  bb->aux = NULL;
+  bb->aux = (void*)2;
 }
 
 /* This is the entry point of gathering non trapping memory accesses.
@@ -1368,6 +1409,7 @@  get_non_trapping (void)
   struct pointer_set_t *nontrap;
   struct dom_walk_data walk_data;
 
+  nt_call_phase = 0;
   nontrap = pointer_set_create ();
   seen_ssa_names = htab_create (128, name_to_bb_hash, name_to_bb_eq,
 				free);
@@ -1389,6 +1431,7 @@  get_non_trapping (void)
   fini_walk_dominator_tree (&walk_data);
   htab_delete (seen_ssa_names);
 
+  clear_aux_for_blocks ();
   return nontrap;
 }
 
Index: testsuite/gcc.dg/pr52448.c
===================================================================
--- testsuite/gcc.dg/pr52448.c	(revision 0)
+++ testsuite/gcc.dg/pr52448.c	(working copy)
@@ -0,0 +1,30 @@ 
+/* PR tree-optimization/52448 */
+/* { dg-do compile } */
+/* { dg-options "-O2 -ftree-cselim -fdump-tree-cselim-details" } */
+
+extern void perhaps_free_something (void);
+
+void f1 (int *p, int a, int b, int cond, int cond2)
+{
+  *p = a;
+  if (cond)
+    perhaps_free_something ();
+  if (cond2)
+    *p = b;
+}
+
+void f2 (int *p, int a, int b, int *cond, int *cond2)
+{
+  int i;
+  *p = a;
+  for (i = 0; cond[i]; i++)
+    {
+      if (cond2[i])
+        *p = b;
+      perhaps_free_something ();
+    }
+}
+
+/* None of the above conditional stores might be made unconditional.  */
+/* { dg-final { scan-tree-dump-not "cstore" "cselim" } } */
+/* { dg-final { cleanup-tree-dump "cselim" } } */