[RFC] replace malloc with a decl on the stack

Message ID alpine.DEB.2.10.1311120001180.17040@laptop-mg.saclay.inria.fr
State New

Commit Message

Marc Glisse Nov. 11, 2013, 11:55 p.m. UTC
On Mon, 11 Nov 2013, Richard Biener wrote:

> On Sun, Nov 10, 2013 at 4:27 PM, Marc Glisse <marc.glisse@inria.fr> wrote:
>> Hello,
>>
>> I am posting this patch to get some feedback on the approach. The goal is to
>> replace malloc+free with a stack allocation (a decl actually) when the size
>> is a small constant.
>>
>> For testing, I hijacked the "leaf" attribute, but it isn't right, so I'll
>> remove it from the list (not sure what I'll do for the testcases then). What
>> I'd want instead is a "returns" attribute that means the function will
>> return (either by "return" or an exception), as opposed to having an
>> infinite loop, calling exit or longjmp, etc (sched-deps.c has a related
>> call_may_noreturn_p). The situation I am trying to avoid is:
>> p=malloc(12);
>> f(p)
>> free(p)
>>
>> where f contains somewhere:
>> free(p); exit(42);
>> (or longjmp or something else that takes the regular call to free out of the
>> execution flow).
>
> Instead of a new attribute IPA should be used to compute this call-graph
> property.

Could be both, no? Any idea where I could store this information? Space is 
so short that pure and const don't even have their bits in the same place.
Then there is the question of what properties should be looked at exactly:
- may leave the function via longjmp
- must leave the function via return or an exception
- may free memory it didn't allocate itself
- ?

> An infinite loop wouldn't be an issue, but the real issue is
> that it shouldn't free the object which would be only valid if the free
> after the call isn't reached.

So the infinite loop is an issue exactly in the same way calling exit is 
an issue, right?
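
To make that concrete, here is a toy sketch of such a callee (done() and
work() are made-up helpers, purely for illustration): whether it loops
forever or calls exit, the free(p) that follows the call in the caller is
never reached, yet p may still be freed inside the callee.

#include <stdlib.h>

int done (void);           /* made-up helper, for illustration only */
void work (void *);        /* made-up helper, for illustration only */

void f (void *p)
{
  for (;;)                 /* or simply: free (p); exit (42); */
    {
      if (done ())
        {
          free (p);        /* frees the caller's pointer...    */
          exit (0);        /* ...and never returns to the caller */
        }
      work (p);
    }
}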

> IPA pure-const is used to compute similar properties (may loop,
> may throw).
>
> You also have to make sure to properly align the replacement.

Thanks; for some reason I lost this line when I copied from 
fold_builtin_alloca_with_align.

>> The size above which the malloc->stack transformation is not applied should
>> depend on a parameter, I don't know if it should get its own or depend on an
>> existing one. In any case, I'd like to avoid ending up with a ridiculously
>> low threshold (my programs use GMP, which internally uses alloca up to 65536
>> bytes (even in recursive functions that have a dozen allocations), so I
>> don't want gcc to tell me that 50 bytes are too much).
>>
>> A program with a double-free may, with this patch, end up crashing on the
>> first free instead of the second, but that's an invalid program anyway.
>>
>>
>> I don't know if tree-ssa-forwprop is a good place for it, but it was
>> convenient for a prototype. I like the idea of running it several times:
>> replacing malloc with a decl early can help other optimizations, and other
>> optimizations can make this one possible later.
>
> We have a similar transform in CCP (fold_builtin_alloca_with_align) which
> is there because CCP is a good place where size arguments may
> become constant (the case you handle - you don't seem to handle
> variable malloc -> alloca transform which would be possible if for example
> VRP figures out an acceptable upper bound for the size).

I added something for the case where VRP gives an upper bound on the size, 
but I am still using a DECL and not alloca. Using alloca has some 
drawbacks: it shouldn't be done inside a loop (unless we know the memory 
is deallocated in the same iteration and we can save/restore the stack à 
la VLA, but that gets complicated), it should be tried only in late passes 
so as not to hinder inlining, etc. I guess I could try, but I fear it will be 
quite rare to have suitable bounds on the size of allocations.
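
For illustration only, here is a source-level sketch of what I have in mind
when VRP provides an upper bound (use() is a stand-in callee carrying the
leaf attribute, as in the testcases below; the real transformation works on
GIMPLE and aligns the decl to BIGGEST_ALIGNMENT):

void use (void *) __attribute__ ((__leaf__));

void before (unsigned n)
{
  if (n > 64)
    __builtin_unreachable ();       /* VRP learns n <= 64 */
  void *p = __builtin_malloc (n);
  use (p);
  __builtin_free (p);
}

void after (unsigned n)
{
  if (n > 64)
    __builtin_unreachable ();
  unsigned char buf[64];            /* sized by the upper bound, not by n */
  void *p = buf;
  use (p);
  /* the free becomes a clobber of buf */
}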


I am currently having trouble with removing the free calls. I am using 
gsi_replace to put a clobber in its place (is that useful?), but I also tried 
gsi_remove with various combinations of release_defs, unlink_stmt_vdef and 
update_stmt, without success. When I compile heapstack-1.c, the sink pass crashes 
because it looks at the uses of p=&array and ends up on __builtin_free(p), 
which was removed several passes earlier and doesn't have a bb anymore. 
What is the right way to get rid of those statements?
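
For reference, the two variants I have been switching between look like this
(lifted from the #if 0/#else block in ccp_fold_stmt in the patch below;
free_stmt, var and array_type are as there, and the two variants are
alternatives, not meant to be combined):

  /* Variant 1: remove the free call outright.  */
  unlink_stmt_vdef (free_stmt);
  gimple_stmt_iterator gsi_free = gsi_for_stmt (free_stmt);
  gsi_remove (&gsi_free, true);
  release_defs (free_stmt);

  /* Variant 2: replace the free call with a clobber of the new decl.  */
  tree clobber = build_constructor (array_type, NULL);
  TREE_THIS_VOLATILE (clobber) = 1;
  gimple clobber_stmt = gimple_build_assign (var, clobber);
  gimple_stmt_iterator gsi_free = gsi_for_stmt (free_stmt);
  gsi_replace (&gsi_free, clobber_stmt, false);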

Comments

Ondřej Bílka Nov. 12, 2013, 3:07 p.m. UTC | #1
On Tue, Nov 12, 2013 at 12:55:17AM +0100, Marc Glisse wrote:
> On Mon, 11 Nov 2013, Richard Biener wrote:
> 
> >On Sun, Nov 10, 2013 at 4:27 PM, Marc Glisse <marc.glisse@inria.fr> wrote:
> >>Hello,
> >>
> >>I am posting this patch to get some feedback on the approach. The goal is to
> >>replace malloc+free with a stack allocation (a decl actually) when the size
> >>is a small constant.
> >>
> >>For testing, I hijacked the "leaf" attribute, but it isn't right, so I'll
> >>remove it from the list (not sure what I'll do for the testcases then). What
> >>I'd want instead is a "returns" attribute that means the function will
> >>return (either by "return" or an exception), as opposed to having an
> >>infinite loop, calling exit or longjmp, etc (sched-deps.c has a related
> >>call_may_noreturn_p). The situation I am trying to avoid is:
> >>p=malloc(12);
> >>f(p)
> >>free(p)
> >>
> >>where f contains somewhere:
> >>free(p); exit(42);
> >>(or longjmp or something else that takes the regular call to free out of the
> >>execution flow).
> >
> >Instead of a new attribute IPA should be used to compute this call-graph
> >property.
> 
> Could be both, no? Any idea where I could store this information?
> Space is so short that pure and const don't even have their bits in
> the same place.
> Then there is the question of what properties should be looked at exactly:
> - may leave the function via longjmp
> - must leave the function via return or an exception
> - may free memory it didn't allocate itself
> - ?

Anyway, you need a better analysis to determine whether the user called
realloc on the converted pointer.
Marc Glisse Nov. 12, 2013, 3:22 p.m. UTC | #2
On Tue, 12 Nov 2013, Ondřej Bílka wrote:

> Anyway, you need a better analysis to determine whether the user called
> realloc on the converted pointer.

Note that I am checking if the argument of free is the same as the return 
value of malloc by looking at the SSA_NAME, not the C variable. If the 
user called realloc on this value and then calls free on the same value on 
all paths (including those where realloc returned a different value), 
that's a broken program to start with. But indeed my current patch fails 
on:

p=malloc(12);
q=realloc(p,16);
if(p==q)free(p);
exit(0);

because I (wrongly, apparently) considered exit as safe as free :-(
Florian Weimer Nov. 24, 2013, 1:22 p.m. UTC | #3
On 11/12/2013 04:22 PM, Marc Glisse wrote:
> On Tue, 12 Nov 2013, Ondřej Bílka wrote:
>
>> Anyway, you need a better analysis to determine whether the user called
>> realloc on the converted pointer.
>
> Note that I am checking if the argument of free is the same as the
> return value of malloc by looking at the SSA_NAME, not the C variable.
> If the user called realloc on this value and then calls free on the same
> value on all paths (including those where realloc returned a different
> value), that's a broken program to start with. But indeed my current
> patch fails on:
>
> p=malloc(12);
> q=realloc(p,16);
> if(p==q)free(p);
> exit(0);
>
> because I (wrongly, apparently) considered exit as safe as free :-(

Strictly speaking, this code is equivalent to:

p=malloc(12);
q=realloc(p,16);
free(p);
exit(0);

So I don't see how your optimization breaks this code if applied to it. :)

p==q is undefined if the realloc call returned a different pointer. 
Past-realloc pointer uses are fairly common, though, and often used to 
adjust interior pointers.  Empirically, this seem to work, although I 
wouldn't be surprised if we already have optimizations which make 
past-realloc pointer use undefined.

(I don't think C provides a safe way to update interior pointers in 
place, FWIW.)

Patch

Index: testsuite/g++.dg/tree-ssa/heapstack-1.C
===================================================================
--- testsuite/g++.dg/tree-ssa/heapstack-1.C	(revision 0)
+++ testsuite/g++.dg/tree-ssa/heapstack-1.C	(working copy)
@@ -0,0 +1,24 @@ 
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-optimized" } */
+
+struct A {
+  void*p;
+  A(void*q) : p(q) {}
+  ~A(){ __builtin_free(p); }
+};
+void f(void*)__attribute__((__leaf__));
+void h(void*)__attribute__((__leaf__,__nothrow__));
+void g(){
+  void*p=__builtin_malloc(12);
+  A a(p);
+  f(p);
+}
+
+void i(){
+  void*p=__builtin_malloc(12);
+  h(p);
+  __builtin_free(p);
+}
+
+/* { dg-final { scan-tree-dump-not "malloc" "optimized" } } */
+/* { dg-final { cleanup-tree-dump "optimized" } } */

Property changes on: testsuite/g++.dg/tree-ssa/heapstack-1.C
___________________________________________________________________
Added: svn:eol-style
## -0,0 +1 ##
+native
\ No newline at end of property
Added: svn:keywords
## -0,0 +1 ##
+Author Date Id Revision URL
\ No newline at end of property
Index: testsuite/g++.dg/tree-ssa/heapstack-2.C
===================================================================
--- testsuite/g++.dg/tree-ssa/heapstack-2.C	(revision 0)
+++ testsuite/g++.dg/tree-ssa/heapstack-2.C	(working copy)
@@ -0,0 +1,12 @@ 
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-optimized" } */
+
+void f(void*)__attribute__((__leaf__));
+void g(){
+  void*p=__builtin_malloc(12);
+  f(p);
+  __builtin_free(p);
+}
+
+/* { dg-final { scan-tree-dump-times "malloc" 1 "optimized" } } */
+/* { dg-final { cleanup-tree-dump "optimized" } } */

Property changes on: testsuite/g++.dg/tree-ssa/heapstack-2.C
___________________________________________________________________
Added: svn:keywords
## -0,0 +1 ##
+Author Date Id Revision URL
\ No newline at end of property
Added: svn:eol-style
## -0,0 +1 ##
+native
\ No newline at end of property
Index: testsuite/gcc.dg/tree-ssa/heapstack-1.c
===================================================================
--- testsuite/gcc.dg/tree-ssa/heapstack-1.c	(revision 0)
+++ testsuite/gcc.dg/tree-ssa/heapstack-1.c	(working copy)
@@ -0,0 +1,29 @@ 
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-optimized" } */
+
+void* f(void*,int)__attribute__((__pure__));
+void * volatile q;
+void g(int m,int n,int s){
+  int i;
+  if (s<0||s>3)__builtin_unreachable();
+  void*p=__builtin_calloc(s,4);
+  switch(n){
+    case 1:
+      for (i=0; i<m; ++i)
+	q=f(p,i);
+      break;
+    case 2:
+      if (p)
+	__builtin_free(p);
+      return;
+    case 3:
+      __builtin_exit(42);
+    case 4:
+      __builtin_unreachable();
+    default:;
+  }
+  __builtin_free(p);
+}
+
+/* { dg-final { scan-tree-dump-not "malloc" "optimized" } } */
+/* { dg-final { cleanup-tree-dump "optimized" } } */

Property changes on: testsuite/gcc.dg/tree-ssa/heapstack-1.c
___________________________________________________________________
Added: svn:keywords
## -0,0 +1 ##
+Author Date Id Revision URL
\ No newline at end of property
Added: svn:eol-style
## -0,0 +1 ##
+native
\ No newline at end of property
Index: testsuite/gcc.dg/tree-ssa/heapstack-2.c
===================================================================
--- testsuite/gcc.dg/tree-ssa/heapstack-2.c	(revision 0)
+++ testsuite/gcc.dg/tree-ssa/heapstack-2.c	(working copy)
@@ -0,0 +1,12 @@ 
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-optimized" } */
+
+void f(void*);
+void g(){
+  void*p=__builtin_malloc(12);
+  f(p);
+  __builtin_free(p);
+}
+
+/* { dg-final { scan-tree-dump-times "malloc" 1 "optimized" } } */
+/* { dg-final { cleanup-tree-dump "optimized" } } */

Property changes on: testsuite/gcc.dg/tree-ssa/heapstack-2.c
___________________________________________________________________
Added: svn:keywords
## -0,0 +1 ##
+Author Date Id Revision URL
\ No newline at end of property
Added: svn:eol-style
## -0,0 +1 ##
+native
\ No newline at end of property
Index: testsuite/gcc.dg/tree-ssa/pr19831-3.c
===================================================================
--- testsuite/gcc.dg/tree-ssa/pr19831-3.c	(revision 204685)
+++ testsuite/gcc.dg/tree-ssa/pr19831-3.c	(working copy)
@@ -1,31 +1,31 @@ 
 /* { dg-do compile } */
 /* { dg-options "-O2 -fdump-tree-optimized" } */
 
-void test2(void)
+void test2(int n)
 {
-  int *p = __builtin_malloc (sizeof (int) * 4);
+  int *p = __builtin_malloc (sizeof (int) * n);
   if (p == (void *)0)
     __builtin_abort ();
   __builtin_free (p);
 }
 
-void test5(int b)
+void test5(int n)
 {
-  int *p = __builtin_malloc (sizeof (int) * 4);
+  int *p = __builtin_malloc (sizeof (int) * n);
   if (p)
     __builtin_free (p);
 }
 
-void test6(void)
+void test6(int n)
 {
-  int *p = __builtin_malloc (sizeof (int) * 4);
+  int *p = __builtin_malloc (sizeof (int) * n);
   if (p == (void *)0)
     __builtin_abort ();
   if (p)
     __builtin_free (p);
 }
 
 /* We should be able to remove all malloc/free pairs with CDDCE.
    Assume p was non-NULL for test2.
    For test5, it doesn't matter if p is NULL or non-NULL.  */
 
Index: tree-ssa-ccp.c
===================================================================
--- tree-ssa-ccp.c	(revision 204685)
+++ tree-ssa-ccp.c	(working copy)
@@ -127,20 +127,21 @@  along with GCC; see the file COPYING3.
 #include "tree-ssanames.h"
 #include "tree-pass.h"
 #include "tree-ssa-propagate.h"
 #include "value-prof.h"
 #include "langhooks.h"
 #include "target.h"
 #include "diagnostic-core.h"
 #include "dbgcnt.h"
 #include "params.h"
 #include "hash-table.h"
+#include "bitmap.h"
 
 
 /* Possible lattice values.  */
 typedef enum
 {
   UNINITIALIZED,
   UNDEFINED,
   CONSTANT,
   VARYING
 } ccp_lattice_t;
@@ -161,20 +162,24 @@  typedef struct prop_value_d prop_value_t
 
 /* Array of propagated constant values.  After propagation,
    CONST_VAL[I].VALUE holds the constant value for SSA_NAME(I).  If
    the constant is held in an SSA name representing a memory store
    (i.e., a VDEF), CONST_VAL[I].MEM_REF will contain the actual
    memory reference used to store (i.e., the LHS of the assignment
    doing the store).  */
 static prop_value_t *const_val;
 static unsigned n_const_val;
 
+/* Avoid looking at the same malloc call twice if there are several matching
+   frees.  The bitmap stores the SSA_NAME versions of the variables.  */
+static bitmap malloc_studied;
+
 static void canonicalize_value (prop_value_t *);
 static bool ccp_fold_stmt (gimple_stmt_iterator *);
 
 /* Dump constant propagation value VAL to file OUTF prefixed by PREFIX.  */
 
 static void
 dump_lattice_value (FILE *outf, const char *prefix, prop_value_t val)
 {
   switch (val.lattice_val)
     {
@@ -805,20 +810,22 @@  ccp_initialize (void)
       for (i = gsi_start_phis (bb); !gsi_end_p (i); gsi_next (&i))
         {
           gimple phi = gsi_stmt (i);
 
 	  if (virtual_operand_p (gimple_phi_result (phi)))
             prop_set_simulate_again (phi, false);
 	  else
             prop_set_simulate_again (phi, true);
 	}
     }
+
+  malloc_studied = BITMAP_ALLOC (NULL);
 }
 
 /* Debug count support. Reset the values of ssa names
    VARYING when the total number ssa names analyzed is
    beyond the debug count specified.  */
 
 static void
 do_dbg_cnt (void)
 {
   unsigned i;
@@ -886,20 +893,21 @@  ccp_finalize (void)
 	  nonzero_bits = nonzero_bits | tree_to_double_int (val->value);
 	  nonzero_bits &= get_nonzero_bits (name);
 	  set_nonzero_bits (name, nonzero_bits);
 	}
     }
 
   /* Perform substitutions based on the known constant values.  */
   something_changed = substitute_and_fold (get_constant_value,
 					   ccp_fold_stmt, true);
 
+  BITMAP_FREE (malloc_studied);
   free (const_val);
   const_val = NULL;
   return something_changed;;
 }
 
 
 /* Compute the meet operator between *VAL1 and *VAL2.  Store the result
    in VAL1.
 
    		any  M UNDEFINED   = any
@@ -1910,20 +1918,161 @@  fold_builtin_alloca_with_align (gimple s
 	singleton_p = pt_solution_singleton_p (&pi->pt, &uid);
 	gcc_assert (singleton_p);
 	SET_DECL_PT_UID (var, uid);
       }
   }
 
   /* Fold alloca to the address of the array.  */
   return fold_convert (TREE_TYPE (lhs), build_fold_addr_expr (var));
 }
 
+/* Returns a constant upper bound if possible, or the variable if not.  */
+static tree
+get_upper_bound (tree var)
+{
+  tree cst = get_constant_value (var);
+  if (cst)
+    return cst;
+  double_int min, max;
+  if (TREE_CODE (var) != SSA_NAME
+      || get_range_info (var, &min, &max) != VR_RANGE)
+    return var;
+  return double_int_to_tree (TREE_TYPE (var), max);
+}
+
+/* We want to be sure that free (VAR) can't be called in another function, and
+   the easiest way to ensure that is to prove that it is called in this
+   function.  The hardest part is avoiding some call to a function that may not
+   return because of exit, an infinite loop, setjmp, etc, where it might call
+   free on VAR.  */
+static bool
+malloc_safe_on_stack (gimple stmt, vec<gimple, va_heap> *list_of_frees)
+{
+  tree var = gimple_call_lhs (stmt);
+  if (!bitmap_set_bit (malloc_studied, SSA_NAME_VERSION (var)))
+    return false;
+  basic_block bb = gimple_bb (stmt);
+  stack_vec<basic_block, 20> bb_to_visit;
+  bitmap bb_visited = BITMAP_ALLOC (NULL);
+  bitmap_set_bit (bb_visited, bb->index);
+  gimple_stmt_iterator gsi = gsi_for_stmt (stmt);
+  enum tree_code code;
+
+next_stmt:
+  gsi_next_nondebug (&gsi);
+
+handle_stmt:
+  if (gsi_end_p (gsi))
+    {
+      edge e;
+      edge_iterator ei;
+      FOR_EACH_EDGE (e, ei, bb->succs)
+	bb_to_visit.safe_push (e->dest);
+      goto next_bb;
+    }
+  stmt = gsi_stmt (gsi);
+  if (stmt_can_throw_external (stmt))
+    /* We could give up only if VAR has escaped (same for return), but that
+       means there is a memory leak, so don't bother.  */
+    goto unsafe;
+
+  switch (gimple_code (stmt))
+  // TODO: GIMPLE_ASM, EH related gimples?
+    {
+	/* We leave the function without calling free.  */
+      case GIMPLE_RETURN:
+	goto unsafe;
+
+      case GIMPLE_COND:
+	code = gimple_cond_code (stmt);
+	/* If we replace malloc by an array on the stack, it can't be NULL.  */
+	if ((code == EQ_EXPR || code == NE_EXPR)
+	    && var == gimple_cond_lhs (stmt)
+	    && integer_zerop (gimple_cond_rhs (stmt)))
+	  {
+	    edge e;
+	    edge_iterator ei;
+	    FOR_EACH_EDGE (e, ei, bb->succs)
+	      if (e->flags
+		  & ((code == NE_EXPR) ? EDGE_TRUE_VALUE : EDGE_FALSE_VALUE))
+		bb_to_visit.safe_push (e->dest);
+	    goto next_bb;
+	  }
+	/* Fallthrough.  */
+
+	/* Last stmt of the bb, handled by looking at the outgoing edges.  */
+      case GIMPLE_GOTO:
+      case GIMPLE_SWITCH:
+	/* Statements that are irrelevant.  */
+      case GIMPLE_ASSIGN:
+      case GIMPLE_LABEL:
+      case GIMPLE_NOP:
+	goto next_stmt;
+
+      case GIMPLE_CALL:
+	{
+	  tree callee = gimple_call_fndecl (stmt);
+	  if (callee != NULL_TREE
+	      && DECL_BUILT_IN_CLASS (callee) == BUILT_IN_NORMAL)
+	    switch (DECL_FUNCTION_CODE (callee))
+	      {
+		case BUILT_IN_FREE:
+		  if (gimple_call_arg (stmt, 0) != var)
+		    goto next_stmt;
+		  list_of_frees->safe_push (stmt);
+		  /* Fallthrough.  */
+
+		case BUILT_IN_ABORT:
+		case BUILT_IN_EXIT:
+		case BUILT_IN__EXIT:
+		case BUILT_IN__EXIT2:
+		case BUILT_IN_TRAP:
+		case BUILT_IN_UNREACHABLE:
+		  /* Nothing could throw an exception earlier in the block,
+		     so we don't need to check the EH edges.  */
+		  goto next_bb;
+		default:;
+	      }
+
+#if 0
+	  // For Joost
+	  goto next_stmt;
+#endif
+	  int flags = gimple_call_flags (stmt);
+	  // FIXME: leaf is actually not safe, we need some new ECF_* flags.
+	  // ??? In fortran any call is safe?
+	  if (flags & (ECF_CONST|ECF_PURE|ECF_LEAF|ECF_NOVOPS))
+	    goto next_stmt;
+	  goto unsafe;
+	}
+      default:
+	goto unsafe;
+    }
+
+next_bb:
+  if (bb_to_visit.is_empty ())
+    goto safe;
+  bb = bb_to_visit.last ();
+  bb_to_visit.pop ();
+  if (!bitmap_set_bit (bb_visited, bb->index))
+    goto next_bb;
+  gsi = gsi_start_nondebug_bb (bb);
+  goto handle_stmt;
+
+unsafe:
+  BITMAP_FREE (bb_visited);
+  return false;
+safe:
+  BITMAP_FREE (bb_visited);
+  return true;
+}
+
 /* Fold the stmt at *GSI with CCP specific information that propagating
    and regular folding does not catch.  */
 
 static bool
 ccp_fold_stmt (gimple_stmt_iterator *gsi)
 {
   gimple stmt = gsi_stmt (*gsi);
 
   switch (gimple_code (stmt))
     {
@@ -1998,20 +2147,90 @@  ccp_fold_stmt (gimple_stmt_iterator *gsi
             if (new_rhs)
 	      {
 		bool res = update_call_from_tree (gsi, new_rhs);
 		tree var = TREE_OPERAND (TREE_OPERAND (new_rhs, 0),0);
 		gcc_assert (res);
 		insert_clobbers_for_var (*gsi, var);
 		return true;
 	      }
           }
 
+	/* Replace malloc+free with a stack allocation.  */
+	if (gimple_call_builtin_p (stmt, BUILT_IN_FREE))
+	  {
+	    bool is_calloc = false;
+	    tree ptr = gimple_call_arg (stmt, 0);
+	    if (TREE_CODE (ptr) != SSA_NAME)
+	      return false;
+	    gimple stmt1 = SSA_NAME_DEF_STMT (ptr);
+	    if ((is_calloc = gimple_call_builtin_p (stmt1, BUILT_IN_CALLOC))
+		|| gimple_call_builtin_p (stmt1, BUILT_IN_MALLOC))
+	      {
+		gcc_checking_assert (gimple_call_lhs (stmt1) == ptr);
+		tree size = gimple_call_arg (stmt1, 0);
+		size = get_upper_bound (size);
+		if (is_calloc)
+		  {
+		    tree siz2 = gimple_call_arg (stmt1, 1);
+		    siz2 = get_upper_bound (siz2);
+		    size = fold_binary (MULT_EXPR, size_type_node, size, siz2);
+		  }
+		if (!size || !host_integerp (size, 1)
+		/* param_value(...) / 10? Some new parameter?  */
+		    || TREE_INT_CST_LOW (size)
+		       > (unsigned HOST_WIDE_INT)
+			   PARAM_VALUE (PARAM_LARGE_STACK_FRAME))
+		  return false;
+		stack_vec<gimple, 20> list_of_frees;
+		if (!malloc_safe_on_stack (stmt1, &list_of_frees))
+		  return false;
+		/* Declare array.  */
+		tree elem_type =
+		  build_nonstandard_integer_type (BITS_PER_UNIT, 1);
+		tree array_type =
+		  build_array_type_nelts (elem_type, TREE_INT_CST_LOW (size));
+		tree var = create_tmp_var (array_type, NULL);
+		DECL_ALIGN (var) = BIGGEST_ALIGNMENT;
+		tree p = fold_convert (TREE_TYPE (ptr),
+				       build_fold_addr_expr (var));
+		/* Replace free with a clobber.  */
+		int i;
+		gimple free_stmt;
+		FOR_EACH_VEC_ELT (list_of_frees, i, free_stmt)
+		  {
+#if 0
+		    unlink_stmt_vdef (free_stmt);
+		    gimple_stmt_iterator gsi_free = gsi_for_stmt (free_stmt);
+		    gsi_remove (&gsi_free, true);
+		    release_defs (free_stmt);
+#else
+		    tree clobber = build_constructor (array_type, NULL);
+		    TREE_THIS_VOLATILE (clobber) = 1;
+		    gimple clobber_stmt = gimple_build_assign (var, clobber);
+		    gimple_stmt_iterator gsi_free = gsi_for_stmt (free_stmt);
+		    gsi_replace (&gsi_free, clobber_stmt, false);
+#endif
+		  }
+		/* Replace malloc with the address of the variable.  */
+		gimple_stmt_iterator gsi1 = gsi_for_stmt (stmt1);
+		update_call_from_tree (&gsi1, p);
+		if (is_calloc)
+		  {
+		    tree memset_decl = builtin_decl_explicit (BUILT_IN_MEMSET);
+		    gimple zero = gimple_build_call (memset_decl, 3, ptr,
+						     integer_zero_node, size);
+		    gsi_insert_after (&gsi1, zero, GSI_SAME_STMT);
+		  }
+		return true;
+	      }
+	  }
+
 	/* Propagate into the call arguments.  Compared to replace_uses_in
 	   this can use the argument slot types for type verification
 	   instead of the current argument type.  We also can safely
 	   drop qualifiers here as we are dealing with constants anyway.  */
 	argt = TYPE_ARG_TYPES (gimple_call_fntype (stmt));
 	for (i = 0; i < gimple_call_num_args (stmt) && argt;
 	     ++i, argt = TREE_CHAIN (argt))
 	  {
 	    tree arg = gimple_call_arg (stmt, i);
 	    if (TREE_CODE (arg) == SSA_NAME