
Alignment fault during function call

Message ID 4C5AD815.70103@codesourcery.com
State New

Commit Message

Bernd Schmidt Aug. 5, 2010, 3:26 p.m. UTC
On 08/05/2010 02:44 AM, Jeff Law wrote:
> Unnesting would be good, but I don't think it's sufficient since we have
> to deal with libcalls.  Now if the function's arguments were guaranteed
> to be regs, SSA_NAMEs or the like, then we'd have a fighting chance of
> greatly simplifying that code.

Maybe something like the following draft patch?  There are some code
generation differences on i686, but nothing that looks overly serious.
The following is slightly puzzling:

+       movl    300(%ebx), %eax
-       call    *300(%ebx)
+       call    *%eax

because one would expect the optimizers to do that (unless the costs
suggest it isn't profitable).

Maybe we'd want to restrict the TER changes to ACCUMULATE_OUTGOING_ARGS
targets, and find some additional kinds of replacements we can allow.

There are comments about constructors being called to construct an
object in the argument list - that can't happen anymore, can it?


Bernd

Comments

Michael Matz Aug. 5, 2010, 3:59 p.m. UTC | #1
Hi,

On Thu, 5 Aug 2010, Bernd Schmidt wrote:

> +  /* Avoid nesting calls.  We allow a few things which we're certain won't
> +     generate library calls.  */
> +  if (is_gimple_call (use_stmt)
> +      && gimple_assign_rhs_code (stmt) != VAR_DECL
> +      && gimple_assign_rhs_code (stmt) != PARM_DECL)
> +    return false;

You want to allow SSA_NAMEs at least here.  Of course the above is quite 
conservative, OTOH TERing into calls shouldn't be terribly important 
anyway.


Ciao,
Michael.
Bernd Schmidt Aug. 5, 2010, 4:20 p.m. UTC | #2
On 08/05/2010 05:59 PM, Michael Matz wrote:
> Hi,
> 
> On Thu, 5 Aug 2010, Bernd Schmidt wrote:
> 
>> +  /* Avoid nesting calls.  We allow a few things which we're certain won't
>> +     generate library calls.  */
>> +  if (is_gimple_call (use_stmt)
>> +      && gimple_assign_rhs_code (stmt) != VAR_DECL
>> +      && gimple_assign_rhs_code (stmt) != PARM_DECL)
>> +    return false;
> 
> You want to allow SSA_NAMEs at least here.

Well, if the SSA_NAME was set in a division for example, won't that lead
to a libfunc to __div<mode>3?  I saw a case like that in testing.
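
For illustration, a minimal sketch of the kind of source that can hit this.
On a 32-bit target such as i686 there is no single instruction for a 64-bit
signed division, so GCC typically expands the division below into a call to
the libgcc routine __divdi3.  The names quot and consume are hypothetical;
they are not taken from the patch or from the testsuite.

```c
#include <assert.h>

/* Hypothetical callee; its argument is computed by a division.  */
long long
consume (long long v)
{
  return v + 1;
}

/* The division a / b has 64-bit operands.  On targets without a
   64-bit divide instruction the expander typically emits a call to
   __divdi3 for it.  If TER forwards the division into the argument
   list of consume, that library call is expanded while the outer
   call is being set up, which is the nesting discussed here.  */
long long
quot (long long a, long long b)
{
  assert (b != 0);
  return consume (a / b);
}
```

Whether the division is really forwarded into the call depends on the TER
decisions for the SSA name involved, so this only shows the shape of the
problem.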


Bernd
Richard Biener Aug. 5, 2010, 4:22 p.m. UTC | #3
On Thu, Aug 5, 2010 at 6:20 PM, Bernd Schmidt <bernds@codesourcery.com> wrote:
> On 08/05/2010 05:59 PM, Michael Matz wrote:
>> Hi,
>>
>> On Thu, 5 Aug 2010, Bernd Schmidt wrote:
>>
>>> +  /* Avoid nesting calls.  We allow a few things which we're certain won't
>>> +     generate library calls.  */
>>> +  if (is_gimple_call (use_stmt)
>>> +      && gimple_assign_rhs_code (stmt) != VAR_DECL
>>> +      && gimple_assign_rhs_code (stmt) != PARM_DECL)
>>> +    return false;
>>
>> You want to allow SSA_NAMEs at least here.
>
> Well, if the SSA_NAME was set in a division for example, won't that lead
> to a libfunc to __div<mode>3?  I saw a case like that in testing.

Sure.  But see cfgexpand.c where the TERing is important to get
alignment right.

      /* TER addresses into arguments of builtin functions so we have a
         chance to infer more correct alignment information.  See PR39954.  */
      if (builtin_p
          && TREE_CODE (arg) == SSA_NAME
          && (def = get_gimple_for_ssa_name (arg))
          && gimple_assign_rhs_code (def) == ADDR_EXPR)
        arg = gimple_assign_rhs1 (def);

Richard.

>
> Bernd
>
Bernd Schmidt Aug. 5, 2010, 4:29 p.m. UTC | #4
On 08/05/2010 06:22 PM, Richard Guenther wrote:
> On Thu, Aug 5, 2010 at 6:20 PM, Bernd Schmidt <bernds@codesourcery.com> wrote:
>> On 08/05/2010 05:59 PM, Michael Matz wrote:
>>> Hi,
>>>
>>> On Thu, 5 Aug 2010, Bernd Schmidt wrote:
>>>
>>>> +  /* Avoid nesting calls.  We allow a few things which we're certain won't
>>>> +     generate library calls.  */
>>>> +  if (is_gimple_call (use_stmt)
>>>> +      && gimple_assign_rhs_code (stmt) != VAR_DECL
>>>> +      && gimple_assign_rhs_code (stmt) != PARM_DECL)
>>>> +    return false;
>>>
>>> You want to allow SSA_NAMEs at least here.
>>
>> Well, if the SSA_NAME was set in a division for example, won't that lead
>> to a libfunc to __div<mode>3?  I saw a case like that in testing.
> 
> Sure.  But see cfgexpand.c where the TERing is important to get
> alignment right.
> 
>       /* TER addresses into arguments of builtin functions so we have a
>          chance to infer more correct alignment information.  See PR39954.  */
>       if (builtin_p
>           && TREE_CODE (arg) == SSA_NAME
>           && (def = get_gimple_for_ssa_name (arg))
>           && gimple_assign_rhs_code (def) == ADDR_EXPR)
>         arg = gimple_assign_rhs1 (def);

Ok.  I guess that means we need the propagation in some cases, but if we
want to be able to eliminate code in calls.c, we must avoid all cases
where it can cause a library call while expanding an argument.

Would this cover the case you're thinking about?

+  if (is_gimple_call (use_stmt)
+      && gimple_assign_rhs_code (stmt) != ADDR_EXPR
+      && gimple_assign_rhs_code (stmt) != VAR_DECL
+      && gimple_assign_rhs_code (stmt) != PARM_DECL)
+    return false;

Any other exceptions we can safely make?


Bernd
Richard Biener Aug. 5, 2010, 4:42 p.m. UTC | #5
On Thu, Aug 5, 2010 at 6:29 PM, Bernd Schmidt <bernds@codesourcery.com> wrote:
> On 08/05/2010 06:22 PM, Richard Guenther wrote:
>> On Thu, Aug 5, 2010 at 6:20 PM, Bernd Schmidt <bernds@codesourcery.com> wrote:
>>> On 08/05/2010 05:59 PM, Michael Matz wrote:
>>>> Hi,
>>>>
>>>> On Thu, 5 Aug 2010, Bernd Schmidt wrote:
>>>>
>>>>> +  /* Avoid nesting calls.  We allow a few things which we're certain won't
>>>>> +     generate library calls.  */
>>>>> +  if (is_gimple_call (use_stmt)
>>>>> +      && gimple_assign_rhs_code (stmt) != VAR_DECL
>>>>> +      && gimple_assign_rhs_code (stmt) != PARM_DECL)
>>>>> +    return false;
>>>>
>>>> You want to allow SSA_NAMEs at least here.
>>>
>>> Well, if the SSA_NAME was set in a division for example, won't that lead
>>> to a libfunc to __div<mode>3?  I saw a case like that in testing.
>>
>> Sure.  But see cfgexpand.c where the TERing is important to get
>> alignment right.
>>
>>       /* TER addresses into arguments of builtin functions so we have a
>>          chance to infer more correct alignment information.  See PR39954.  */
>>       if (builtin_p
>>           && TREE_CODE (arg) == SSA_NAME
>>           && (def = get_gimple_for_ssa_name (arg))
>>           && gimple_assign_rhs_code (def) == ADDR_EXPR)
>>         arg = gimple_assign_rhs1 (def);
>
> Ok.  I guess that means we need the propagation in some cases, but if we
> want to be able to eliminate code in calls.c, we must avoid all cases
> where it can cause a library call while expanding an argument.
>
> Would this cover the case you're thinking about?
>
> +  if (is_gimple_call (use_stmt)
> +      && gimple_assign_rhs_code (stmt) != ADDR_EXPR
> +      && gimple_assign_rhs_code (stmt) != VAR_DECL
> +      && gimple_assign_rhs_code (stmt) != PARM_DECL)
> +    return false;
>
> Any other exceptions we can safely make?

I think you want CONVERT_EXPR_CODE_P and
all of tcc_reference.  Also plain SSA_NAME (which would
be an SSA name copy).

Richard.

>
>
> Bernd
>
Richard Biener Aug. 5, 2010, 4:44 p.m. UTC | #6
On Thu, Aug 5, 2010 at 6:42 PM, Richard Guenther
<richard.guenther@gmail.com> wrote:
> On Thu, Aug 5, 2010 at 6:29 PM, Bernd Schmidt <bernds@codesourcery.com> wrote:
>> On 08/05/2010 06:22 PM, Richard Guenther wrote:
>>> On Thu, Aug 5, 2010 at 6:20 PM, Bernd Schmidt <bernds@codesourcery.com> wrote:
>>>> On 08/05/2010 05:59 PM, Michael Matz wrote:
>>>>> Hi,
>>>>>
>>>>> On Thu, 5 Aug 2010, Bernd Schmidt wrote:
>>>>>
>>>>>> +  /* Avoid nesting calls.  We allow a few things which we're certain won't
>>>>>> +     generate library calls.  */
>>>>>> +  if (is_gimple_call (use_stmt)
>>>>>> +      && gimple_assign_rhs_code (stmt) != VAR_DECL
>>>>>> +      && gimple_assign_rhs_code (stmt) != PARM_DECL)
>>>>>> +    return false;
>>>>>
>>>>> You want to allow SSA_NAMEs at least here.
>>>>
>>>> Well, if the SSA_NAME was set in a division for example, won't that lead
>>>> to a libfunc to __div<mode>3?  I saw a case like that in testing.
>>>
>>> Sure.  But see cfgexpand.c where the TERing is important to get
>>> alignment right.
>>>
>>>       /* TER addresses into arguments of builtin functions so we have a
>>>          chance to infer more correct alignment information.  See PR39954.  */
>>>       if (builtin_p
>>>           && TREE_CODE (arg) == SSA_NAME
>>>           && (def = get_gimple_for_ssa_name (arg))
>>>           && gimple_assign_rhs_code (def) == ADDR_EXPR)
>>>         arg = gimple_assign_rhs1 (def);
>>
>> Ok.  I guess that means we need the propagation in some cases, but if we
>> want to be able to eliminate code in calls.c, we must avoid all cases
>> where it can cause a library call while expanding an argument.
>>
>> Would this cover the case you're thinking about?
>>
>> +  if (is_gimple_call (use_stmt)
>> +      && gimple_assign_rhs_code (stmt) != ADDR_EXPR
>> +      && gimple_assign_rhs_code (stmt) != VAR_DECL
>> +      && gimple_assign_rhs_code (stmt) != PARM_DECL)
>> +    return false;
>>
>> Any other exceptions we can safely make?
>
> I think you want CONVERT_EXPR_CODE_P and
> all of tcc_reference.  Also plain SSA_NAME (which would
> be an SSA name copy).

Oh, and note that TER is a recursive process, so ...

  i_1 = j_4 / 5;
  reg_3 = &a[i_1];

can give you a libcall from TERing i_1 via the use in the reg_3 rhs ...
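
A toy model of that situation, as a sketch only: the types and functions
below (toy_code, toy_def, toy_shallow_ok, toy_deep_ok) are made up for
illustration and are not GCC's data structures or API.  It shows why a test
that looks only at the rhs code of the forwarded definition accepts reg_3,
while a test that follows the replaceable operand rejects it.

```c
#include <stddef.h>

/* Toy stand-ins for GIMPLE rhs codes; not GCC's real tree codes.  */
enum toy_code
{
  TOY_VAR_DECL,
  TOY_PARM_DECL,
  TOY_ADDR_EXPR,
  TOY_TRUNC_DIV_EXPR
};

/* A toy SSA definition: its rhs code plus at most one operand that is
   itself a TER-replaceable definition (NULL if there is none).  */
struct toy_def
{
  enum toy_code code;
  const struct toy_def *replaceable_operand;
};

/* Shallow test in the spirit of the draft predicate: look only at the
   rhs code of the definition being forwarded into the call.  */
int
toy_shallow_ok (const struct toy_def *def)
{
  return (def->code == TOY_ADDR_EXPR
	  || def->code == TOY_VAR_DECL
	  || def->code == TOY_PARM_DECL);
}

/* Deep test: also follow the operand that TER would substitute, so a
   division hidden behind an address computation is found.  */
int
toy_deep_ok (const struct toy_def *def)
{
  if (def == NULL)
    return 1;
  return toy_shallow_ok (def) && toy_deep_ok (def->replaceable_operand);
}
```

With i_1 modelled as a TOY_TRUNC_DIV_EXPR definition and reg_3 as a
TOY_ADDR_EXPR whose replaceable operand is i_1, the shallow test passes and
the deep test fails.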

Richard.

> Richard.
>
>>
>>
>> Bernd
>>
>
Bernd Schmidt Aug. 5, 2010, 4:46 p.m. UTC | #7
On 08/05/2010 06:42 PM, Richard Guenther wrote:

>> Any other exceptions we can safely make?
> 
> I think you want CONVERT_EXPR_CODE_P

I don't think that's safe wrt libcalls, so the whole idea probably fails.


Bernd
Richard Henderson Aug. 5, 2010, 6:37 p.m. UTC | #8
On 08/05/2010 09:29 AM, Bernd Schmidt wrote:
> Ok.  I guess that means we need the propagation in some cases, but if we
> want to be able to eliminate code in calls.c, we must avoid all cases
> where it can cause a library call while expanding an argument.

You'll need to be careful to use BLOCK_OP_NO_LIBCALL to place BLKmode
arguments into the stack, otherwise you may get calls to memcpy.
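
As a sketch of the source-level case this is about: the aggregate below is
large enough to be passed in memory on common ABIs, so the caller needs a
block copy into the outgoing argument area.  The names big, sum_bytes and
sum_filled are hypothetical, and whether a given compiler and target emit
the copy inline or as a memcpy call is an assumption that depends on target
and options.

```c
#include <string.h>

/* Hypothetical aggregate, large enough to be passed in memory
   (BLKmode) on common ABIs.  */
struct big
{
  unsigned char bytes[256];
};

/* Callee taking the aggregate by value.  */
unsigned int
sum_bytes (struct big b)
{
  unsigned int total = 0;
  for (size_t i = 0; i < sizeof b.bytes; i++)
    total += b.bytes[i];
  return total;
}

/* Caller: passing src by value requires a block copy into the
   outgoing argument area.  The compiler may implement that copy
   inline or as a call to memcpy; the latter would be a call nested
   inside the setup of the call to sum_bytes.  */
unsigned int
sum_filled (unsigned char fill)
{
  struct big src;
  memset (src.bytes, fill, sizeof src.bytes);
  return sum_bytes (src);
}
```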


r~
Michael Matz Aug. 6, 2010, 11:14 a.m. UTC | #9
Hi,

On Thu, 5 Aug 2010, Bernd Schmidt wrote:

> On 08/05/2010 05:59 PM, Michael Matz wrote:
> > Hi,
> > 
> > On Thu, 5 Aug 2010, Bernd Schmidt wrote:
> > 
> >> +  /* Avoid nesting calls.  We allow a few things which we're certain won't
> >> +     generate library calls.  */
> >> +  if (is_gimple_call (use_stmt)
> >> +      && gimple_assign_rhs_code (stmt) != VAR_DECL
> >> +      && gimple_assign_rhs_code (stmt) != PARM_DECL)
> >> +    return false;
> > 
> > You want to allow SSA_NAMEs at least here.
> 
> Well, if the SSA_NAME was set in a division for example, won't that lead
> to a libfunc to __div<mode>3?

If recursive TERing occurs, that might happen indeed.  If rhs_code is an
SSA_NAME then stmt will just be a plain copy, but that RHS name itself
could also be a TER target.  We could rule that out by checking
version_to_be_replaced_p() before allowing an SSA_NAME, but that probably
doesn't improve much (as a plain copy that is replaceable can also be
coalesced, which out-of-ssa will probably have done already).  Hmpf.

Okay, SSA_NAMEs probably can't be allowed.  I'd hope that combine would 
get rid of most suboptimal code resulting from that.

I think that getting rid of support for nested calls is a worthy goal, 
especially because we aren't far from it for normal calls.


Ciao,
Michael.
Bernd Schmidt Aug. 6, 2010, 11:15 a.m. UTC | #10
On 08/06/2010 01:14 PM, Michael Matz wrote:
> I think that getting rid of support for nested calls is a worthy goal, 
> especially because we aren't far from it for normal calls.

It would be nice to do.

Maybe we can have a lowering pass before expand that turns conversion
operations, float comparisons, divisions etc. which aren't directly
supported by the machine into libcalls?
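
A rough sketch of what such a pass could look like, on a toy statement list
rather than on GIMPLE.  Everything here (toy_op, toy_stmt,
toy_target_supports, toy_lower) is hypothetical and only models the idea:
operations the target cannot do directly are rewritten into explicit
library calls before expansion, so expand never has to create a call while
it is in the middle of another one.

```c
#include <stddef.h>

/* Toy operation kinds; not GCC's tree codes.  */
enum toy_op
{
  TOY_ADD,
  TOY_DIV,
  TOY_FLOAT_CMP,
  TOY_CONVERT,
  TOY_LIBCALL
};

/* A toy statement.  After lowering, lowered_from records which
   operation a TOY_LIBCALL statement replaced.  */
struct toy_stmt
{
  enum toy_op op;
  enum toy_op lowered_from;
};

/* Hypothetical target description: this toy machine only has a
   direct instruction for addition.  */
int
toy_target_supports (enum toy_op op)
{
  return op == TOY_ADD;
}

/* Rewrite every unsupported operation into an explicit library call.
   Returns the number of statements that were lowered.  */
size_t
toy_lower (struct toy_stmt *stmts, size_t n)
{
  size_t lowered = 0;
  for (size_t i = 0; i < n; i++)
    if (stmts[i].op != TOY_LIBCALL && !toy_target_supports (stmts[i].op))
      {
	stmts[i].lowered_from = stmts[i].op;
	stmts[i].op = TOY_LIBCALL;
	lowered++;
      }
  return lowered;
}
```

Running the pass a second time over the same list lowers nothing, since
TOY_LIBCALL statements are skipped.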


Bernd

Patch

Index: calls.c
===================================================================
--- calls.c	(revision 162821)
+++ calls.c	(working copy)
@@ -91,8 +91,6 @@  struct arg_data
      differ from STACK if this arg pads downward.  This location is known
      to be aligned to FUNCTION_ARG_BOUNDARY.  */
   rtx stack_slot;
-  /* Place that this stack area has been saved, if needed.  */
-  rtx save_area;
   /* If an argument's alignment does not permit direct copying into registers,
      copy in smaller-sized pieces into pseudos.  These are stored in a
      block pointed to by this field.  The next field says how many
@@ -101,15 +99,6 @@  struct arg_data
   int n_aligned_regs;
 };
 
-/* A vector of one char per byte of stack space.  A byte if nonzero if
-   the corresponding stack location has been used.
-   This vector is used to prevent a function call within an argument from
-   clobbering any stack already set up.  */
-static char *stack_usage_map;
-
-/* Size of STACK_USAGE_MAP.  */
-static int highest_outgoing_arg_in_use;
-
 /* A bitmap of virtual-incoming stack space.  Bit is set if the corresponding
    stack location's tail call argument has been already stored into the stack.
    This bitmap is used to prevent sibling call optimization if function tries
@@ -117,13 +106,6 @@  static int highest_outgoing_arg_in_use;
    overwritten with tail call arguments.  */
 static sbitmap stored_args_map;
 
-/* stack_arg_under_construction is nonzero when an argument may be
-   initialized with a constructor call (including a C function that
-   returns a BLKmode struct) and expand_call must take special action
-   to make sure the object being constructed does not overlap the
-   argument list for the constructor call.  */
-static int stack_arg_under_construction;
-
 static void emit_call_1 (rtx, tree, tree, tree, HOST_WIDE_INT, HOST_WIDE_INT,
 			 HOST_WIDE_INT, rtx, rtx, int, rtx, int,
 			 CUMULATIVE_ARGS *);
@@ -153,11 +135,6 @@  static int check_sibcall_argument_overla
 static int combine_pending_stack_adjustment_and_call (int, struct args_size *,
 						      unsigned int);
 static tree split_complex_types (tree);
-
-#ifdef REG_PARM_STACK_SPACE
-static rtx save_fixed_argument_area (int, rtx, int *, int *);
-static void restore_fixed_argument_area (rtx, rtx, int, int);
-#endif
 
 /* Force FUNEXP into a form suitable for the address of a CALL,
    and return that as an rtx.  Also load the static chain register
@@ -722,105 +699,6 @@  precompute_register_parameters (int num_
       }
 }
 
-#ifdef REG_PARM_STACK_SPACE
-
-  /* The argument list is the property of the called routine and it
-     may clobber it.  If the fixed area has been used for previous
-     parameters, we must save and restore it.  */
-
-static rtx
-save_fixed_argument_area (int reg_parm_stack_space, rtx argblock, int *low_to_save, int *high_to_save)
-{
-  int low;
-  int high;
-
-  /* Compute the boundary of the area that needs to be saved, if any.  */
-  high = reg_parm_stack_space;
-#ifdef ARGS_GROW_DOWNWARD
-  high += 1;
-#endif
-  if (high > highest_outgoing_arg_in_use)
-    high = highest_outgoing_arg_in_use;
-
-  for (low = 0; low < high; low++)
-    if (stack_usage_map[low] != 0)
-      {
-	int num_to_save;
-	enum machine_mode save_mode;
-	int delta;
-	rtx stack_area;
-	rtx save_area;
-
-	while (stack_usage_map[--high] == 0)
-	  ;
-
-	*low_to_save = low;
-	*high_to_save = high;
-
-	num_to_save = high - low + 1;
-	save_mode = mode_for_size (num_to_save * BITS_PER_UNIT, MODE_INT, 1);
-
-	/* If we don't have the required alignment, must do this
-	   in BLKmode.  */
-	if ((low & (MIN (GET_MODE_SIZE (save_mode),
-			 BIGGEST_ALIGNMENT / UNITS_PER_WORD) - 1)))
-	  save_mode = BLKmode;
-
-#ifdef ARGS_GROW_DOWNWARD
-	delta = -high;
-#else
-	delta = low;
-#endif
-	stack_area = gen_rtx_MEM (save_mode,
-				  memory_address (save_mode,
-						  plus_constant (argblock,
-								 delta)));
-
-	set_mem_align (stack_area, PARM_BOUNDARY);
-	if (save_mode == BLKmode)
-	  {
-	    save_area = assign_stack_temp (BLKmode, num_to_save, 0);
-	    emit_block_move (validize_mem (save_area), stack_area,
-			     GEN_INT (num_to_save), BLOCK_OP_CALL_PARM);
-	  }
-	else
-	  {
-	    save_area = gen_reg_rtx (save_mode);
-	    emit_move_insn (save_area, stack_area);
-	  }
-
-	return save_area;
-      }
-
-  return NULL_RTX;
-}
-
-static void
-restore_fixed_argument_area (rtx save_area, rtx argblock, int high_to_save, int low_to_save)
-{
-  enum machine_mode save_mode = GET_MODE (save_area);
-  int delta;
-  rtx stack_area;
-
-#ifdef ARGS_GROW_DOWNWARD
-  delta = -high_to_save;
-#else
-  delta = low_to_save;
-#endif
-  stack_area = gen_rtx_MEM (save_mode,
-			    memory_address (save_mode,
-					    plus_constant (argblock, delta)));
-  set_mem_align (stack_area, PARM_BOUNDARY);
-
-  if (save_mode != BLKmode)
-    emit_move_insn (stack_area, save_area);
-  else
-    emit_block_move (stack_area, validize_mem (save_area),
-		     GEN_INT (high_to_save - low_to_save + 1),
-		     BLOCK_OP_CALL_PARM);
-}
-#endif /* REG_PARM_STACK_SPACE */
-
 /* If any elements in ARGS refer to parameters that are to be passed in
    registers, but not in memory, and whose alignment does not permit a
    direct copy into registers.  Copy the values into a group of pseudos
@@ -1907,6 +1785,9 @@  avoid_likely_spilled_reg (rtx x)
   return x;
 }
 
+/* Nonzero if we are currently expanding a call.  */
+static int currently_expanding_call = 0;
+
 /* Generate all the code for a CALL_EXPR exp
    and return an rtx for its value.
    Store the value in TARGET (specified as an rtx) if convenient.
@@ -1916,9 +1797,6 @@  avoid_likely_spilled_reg (rtx x)
 rtx
 expand_call (tree exp, rtx target, int ignore)
 {
-  /* Nonzero if we are currently expanding a call.  */
-  static int currently_expanding_call = 0;
-
   /* RTX for the function to be called.  */
   rtx funexp;
   /* Sequence of insns to perform a normal "call".  */
@@ -1999,22 +1877,11 @@  expand_call (tree exp, rtx target, int i
 
   /* Mask of ECF_ flags.  */
   int flags = 0;
-#ifdef REG_PARM_STACK_SPACE
-  /* Define the boundary of the register parm stack space that needs to be
-     saved, if any.  */
-  int low_to_save, high_to_save;
-  rtx save_area = 0;		/* Place that it is saved */
-#endif
-
-  int initial_highest_arg_in_use = highest_outgoing_arg_in_use;
-  char *initial_stack_usage_map = stack_usage_map;
-  char *stack_usage_map_buf = NULL;
 
   int old_stack_allocated;
 
   /* State variables to track stack modifications.  */
   rtx old_stack_level = 0;
-  int old_stack_arg_under_construction = 0;
   int old_pending_adj = 0;
   int old_inhibit_defer_pop = inhibit_defer_pop;
 
@@ -2177,13 +2044,8 @@  expand_call (tree exp, rtx target, int i
     {
       /* If structure_value_addr is a REG other than
 	 virtual_outgoing_args_rtx, we can use always use it.  If it
-	 is not a REG, we must always copy it into a register.
-	 If it is virtual_outgoing_args_rtx, we must copy it to another
-	 register in some cases.  */
+	 is not a REG, we must always copy it into a register.  */
       rtx temp = (!REG_P (structure_value_addr)
-		  || (ACCUMULATE_OUTGOING_ARGS
-		      && stack_arg_under_construction
-		      && structure_value_addr == virtual_outgoing_args_rtx)
 		  ? copy_addr_to_reg (convert_memory_address
 				      (Pmode, structure_value_addr))
 		  : structure_value_addr);
@@ -2285,12 +2147,15 @@  expand_call (tree exp, rtx target, int i
      expanding a call, as that means we're an argument.  Don't try if
      there's cleanups, as we know there's code to follow the call.  */
 
-  if (currently_expanding_call++ != 0
+  if (currently_expanding_call != 0
       || !flag_optimize_sibling_calls
       || args_size.var
       || dbg_cnt (tail_call) == false)
     try_tail_call = 0;
 
+  gcc_assert (currently_expanding_call == 0);
+
+  currently_expanding_call++;
   /*  Rest of purposes for tail call optimizations to fail.  */
   if (
 #ifdef HAVE_sibcall_epilogue
@@ -2481,11 +2346,6 @@  expand_call (tree exp, rtx target, int i
 	      old_stack_pointer_delta = stack_pointer_delta;
 	      old_pending_adj = pending_stack_adjust;
 	      pending_stack_adjust = 0;
-	      /* stack_arg_under_construction says whether a stack arg is
-		 being constructed at the old stack level.  Pushing the stack
-		 gets a clean outgoing argument block.  */
-	      old_stack_arg_under_construction = stack_arg_under_construction;
-	      stack_arg_under_construction = 0;
 	    }
 	  argblock = push_block (ARGS_SIZE_RTX (adjusted_args_size), 0, 0);
 	}
@@ -2509,48 +2369,6 @@  expand_call (tree exp, rtx target, int i
 	    {
 	      if (ACCUMULATE_OUTGOING_ARGS)
 		{
-		  /* Since the stack pointer will never be pushed, it is
-		     possible for the evaluation of a parm to clobber
-		     something we have already written to the stack.
-		     Since most function calls on RISC machines do not use
-		     the stack, this is uncommon, but must work correctly.
-
-		     Therefore, we save any area of the stack that was already
-		     written and that we are using.  Here we set up to do this
-		     by making a new stack usage map from the old one.  The
-		     actual save will be done by store_one_arg.
-
-		     Another approach might be to try to reorder the argument
-		     evaluations to avoid this conflicting stack usage.  */
-
-		  /* Since we will be writing into the entire argument area,
-		     the map must be allocated for its entire size, not just
-		     the part that is the responsibility of the caller.  */
-		  if (! OUTGOING_REG_PARM_STACK_SPACE ((!fndecl ? fntype : TREE_TYPE (fndecl))))
-		    needed += reg_parm_stack_space;
-
-#ifdef ARGS_GROW_DOWNWARD
-		  highest_outgoing_arg_in_use = MAX (initial_highest_arg_in_use,
-						     needed + 1);
-#else
-		  highest_outgoing_arg_in_use = MAX (initial_highest_arg_in_use,
-						     needed);
-#endif
-		  if (stack_usage_map_buf)
-		    free (stack_usage_map_buf);
-		  stack_usage_map_buf = XNEWVEC (char, highest_outgoing_arg_in_use);
-		  stack_usage_map = stack_usage_map_buf;
-
-		  if (initial_highest_arg_in_use)
-		    memcpy (stack_usage_map, initial_stack_usage_map,
-			    initial_highest_arg_in_use);
-
-		  if (initial_highest_arg_in_use != highest_outgoing_arg_in_use)
-		    memset (&stack_usage_map[initial_highest_arg_in_use], 0,
-			   (highest_outgoing_arg_in_use
-			    - initial_highest_arg_in_use));
-		  needed = 0;
-
 		  /* The address of the outgoing argument list must not be
 		     copied to a register here, because argblock would be left
 		     pointing to the wrong place after the call to
@@ -2615,54 +2433,6 @@  expand_call (tree exp, rtx target, int i
 	    }
 	}
 
-      if (ACCUMULATE_OUTGOING_ARGS)
-	{
-	  /* The save/restore code in store_one_arg handles all
-	     cases except one: a constructor call (including a C
-	     function returning a BLKmode struct) to initialize
-	     an argument.  */
-	  if (stack_arg_under_construction)
-	    {
-	      rtx push_size
-		= GEN_INT (adjusted_args_size.constant
-			   + (OUTGOING_REG_PARM_STACK_SPACE ((!fndecl ? fntype
-			   					      : TREE_TYPE (fndecl))) ? 0
-			      : reg_parm_stack_space));
-	      if (old_stack_level == 0)
-		{
-		  emit_stack_save (SAVE_BLOCK, &old_stack_level,
-				   NULL_RTX);
-		  old_stack_pointer_delta = stack_pointer_delta;
-		  old_pending_adj = pending_stack_adjust;
-		  pending_stack_adjust = 0;
-		  /* stack_arg_under_construction says whether a stack
-		     arg is being constructed at the old stack level.
-		     Pushing the stack gets a clean outgoing argument
-		     block.  */
-		  old_stack_arg_under_construction
-		    = stack_arg_under_construction;
-		  stack_arg_under_construction = 0;
-		  /* Make a new map for the new argument list.  */
-		  if (stack_usage_map_buf)
-		    free (stack_usage_map_buf);
-		  stack_usage_map_buf = XCNEWVEC (char, highest_outgoing_arg_in_use);
-		  stack_usage_map = stack_usage_map_buf;
-		  highest_outgoing_arg_in_use = 0;
-		}
-	      allocate_dynamic_stack_space (push_size, NULL_RTX,
-					    BITS_PER_UNIT);
-	    }
-
-	  /* If argument evaluation might modify the stack pointer,
-	     copy the address of the argument list to a register.  */
-	  for (i = 0; i < num_actuals; i++)
-	    if (args[i].pass_on_stack)
-	      {
-		argblock = copy_addr_to_reg (argblock);
-		break;
-	      }
-	}
-
       compute_argument_addresses (args, argblock, num_actuals);
 
       /* If we push args individually in reverse order, perform stack alignment
@@ -2727,14 +2497,6 @@  expand_call (tree exp, rtx target, int i
       else
 	static_chain_value = 0;
 
-#ifdef REG_PARM_STACK_SPACE
-      /* Save the fixed argument area if it's part of the caller's frame and
-	 is clobbered by argument setup for this call.  */
-      if (ACCUMULATE_OUTGOING_ARGS && pass)
-	save_area = save_fixed_argument_area (reg_parm_stack_space, argblock,
-					      &low_to_save, &high_to_save);
-#endif
-
       /* Now store (and compute if necessary) all non-register parms.
 	 These come before register parms, since they can require block-moves,
 	 which could clobber the registers used for register parms.
@@ -2979,47 +2741,26 @@  expand_call (tree exp, rtx target, int i
 	       && GET_MODE (target) == TYPE_MODE (rettype)
 	       && GET_MODE (target) == GET_MODE (valreg))
 	{
-	  bool may_overlap = false;
-
 	  /* We have to copy a return value in a CLASS_LIKELY_SPILLED hard
 	     reg to a plain register.  */
 	  if (!REG_P (target) || HARD_REGISTER_P (target))
 	    valreg = avoid_likely_spilled_reg (valreg);
 
-	  /* If TARGET is a MEM in the argument area, and we have
-	     saved part of the argument area, then we can't store
-	     directly into TARGET as it may get overwritten when we
-	     restore the argument save area below.  Don't work too
-	     hard though and simply force TARGET to a register if it
-	     is a MEM; the optimizer is quite likely to sort it out.  */
-	  if (ACCUMULATE_OUTGOING_ARGS && pass && MEM_P (target))
-	    for (i = 0; i < num_actuals; i++)
-	      if (args[i].save_area)
-		{
-		  may_overlap = true;
-		  break;
-		}
-
-	  if (may_overlap)
-	    target = copy_to_reg (valreg);
-	  else
-	    {
-	      /* TARGET and VALREG cannot be equal at this point
-		 because the latter would not have
-		 REG_FUNCTION_VALUE_P true, while the former would if
-		 it were referring to the same register.
+	  /* TARGET and VALREG cannot be equal at this point
+	     because the latter would not have
+	     REG_FUNCTION_VALUE_P true, while the former would if
+	     it were referring to the same register.
 
-		 If they refer to the same register, this move will be
-		 a no-op, except when function inlining is being
-		 done.  */
-	      emit_move_insn (target, valreg);
+	     If they refer to the same register, this move will be
+	     a no-op, except when function inlining is being
+	     done.  */
+	  emit_move_insn (target, valreg);
 
-	      /* If we are setting a MEM, this code must be executed.
-		 Since it is emitted after the call insn, sibcall
-		 optimization cannot be performed in that case.  */
-	      if (MEM_P (target))
-		sibcall_failure = 1;
-	    }
+	  /* If we are setting a MEM, this code must be executed.
+	     Since it is emitted after the call insn, sibcall
+	     optimization cannot be performed in that case.  */
+	  if (MEM_P (target))
+	    sibcall_failure = 1;
 	}
       else if (TYPE_MODE (rettype) == BLKmode)
 	{
@@ -3076,40 +2817,8 @@  expand_call (tree exp, rtx target, int i
 	  stack_pointer_delta = old_stack_pointer_delta;
 	  pending_stack_adjust = old_pending_adj;
 	  old_stack_allocated = stack_pointer_delta - pending_stack_adjust;
-	  stack_arg_under_construction = old_stack_arg_under_construction;
-	  highest_outgoing_arg_in_use = initial_highest_arg_in_use;
-	  stack_usage_map = initial_stack_usage_map;
 	  sibcall_failure = 1;
 	}
-      else if (ACCUMULATE_OUTGOING_ARGS && pass)
-	{
-#ifdef REG_PARM_STACK_SPACE
-	  if (save_area)
-	    restore_fixed_argument_area (save_area, argblock,
-					 high_to_save, low_to_save);
-#endif
-
-	  /* If we saved any argument areas, restore them.  */
-	  for (i = 0; i < num_actuals; i++)
-	    if (args[i].save_area)
-	      {
-		enum machine_mode save_mode = GET_MODE (args[i].save_area);
-		rtx stack_area
-		  = gen_rtx_MEM (save_mode,
-				 memory_address (save_mode,
-						 XEXP (args[i].stack_slot, 0)));
-
-		if (save_mode != BLKmode)
-		  emit_move_insn (stack_area, args[i].save_area);
-		else
-		  emit_block_move (stack_area, args[i].save_area,
-				   GEN_INT (args[i].locate.size.constant),
-				   BLOCK_OP_CALL_PARM);
-	      }
-
-	  highest_outgoing_arg_in_use = initial_highest_arg_in_use;
-	  stack_usage_map = initial_stack_usage_map;
-	}
 
       /* If this was alloca, record the new stack level for nonlocal gotos.
 	 Check for the handler slots since we might not have a save area
@@ -3176,9 +2885,6 @@  expand_call (tree exp, rtx target, int i
 
   currently_expanding_call--;
 
-  if (stack_usage_map_buf)
-    free (stack_usage_map_buf);
-
   return target;
 }
 
@@ -3293,7 +2999,6 @@  emit_library_call_value_1 (int retval, r
     rtx reg;
     int partial;
     struct locate_and_pad_arg_data locate;
-    rtx save_area;
   };
   struct arg *argvec;
   int old_inhibit_defer_pop = inhibit_defer_pop;
@@ -3304,28 +3009,17 @@  emit_library_call_value_1 (int retval, r
   int struct_value_size = 0;
   int flags;
   int reg_parm_stack_space = 0;
-  int needed;
   rtx before_call;
   tree tfom;			/* type_for_mode (outmode, 0) */
 
-#ifdef REG_PARM_STACK_SPACE
-  /* Define the boundary of the register parm stack space that needs to be
-     save, if any.  */
-  int low_to_save = 0, high_to_save = 0;
-  rtx save_area = 0;            /* Place that it is saved.  */
-#endif
-
-  /* Size of the stack reserved for parameter registers.  */
-  int initial_highest_arg_in_use = highest_outgoing_arg_in_use;
-  char *initial_stack_usage_map = stack_usage_map;
-  char *stack_usage_map_buf = NULL;
-
   rtx struct_value = targetm.calls.struct_value_rtx (0, 0);
 
 #ifdef REG_PARM_STACK_SPACE
   reg_parm_stack_space = REG_PARM_STACK_SPACE ((tree) 0);
 #endif
 
+  gcc_assert (currently_expanding_call == 0);
+
   /* By default, library functions can not throw.  */
   flags = ECF_NOTHROW;
 
@@ -3549,45 +3243,6 @@  emit_library_call_value_1 (int retval, r
 
   if (ACCUMULATE_OUTGOING_ARGS)
     {
-      /* Since the stack pointer will never be pushed, it is possible for
-	 the evaluation of a parm to clobber something we have already
-	 written to the stack.  Since most function calls on RISC machines
-	 do not use the stack, this is uncommon, but must work correctly.
-
-	 Therefore, we save any area of the stack that was already written
-	 and that we are using.  Here we set up to do this by making a new
-	 stack usage map from the old one.
-
-	 Another approach might be to try to reorder the argument
-	 evaluations to avoid this conflicting stack usage.  */
-
-      needed = args_size.constant;
-
-      /* Since we will be writing into the entire argument area, the
-	 map must be allocated for its entire size, not just the part that
-	 is the responsibility of the caller.  */
-      if (! OUTGOING_REG_PARM_STACK_SPACE ((!fndecl ? fntype : TREE_TYPE (fndecl))))
-	needed += reg_parm_stack_space;
-
-#ifdef ARGS_GROW_DOWNWARD
-      highest_outgoing_arg_in_use = MAX (initial_highest_arg_in_use,
-					 needed + 1);
-#else
-      highest_outgoing_arg_in_use = MAX (initial_highest_arg_in_use,
-					 needed);
-#endif
-      stack_usage_map_buf = XNEWVEC (char, highest_outgoing_arg_in_use);
-      stack_usage_map = stack_usage_map_buf;
-
-      if (initial_highest_arg_in_use)
-	memcpy (stack_usage_map, initial_stack_usage_map,
-		initial_highest_arg_in_use);
-
-      if (initial_highest_arg_in_use != highest_outgoing_arg_in_use)
-	memset (&stack_usage_map[initial_highest_arg_in_use], 0,
-	       highest_outgoing_arg_in_use - initial_highest_arg_in_use);
-      needed = 0;
-
       /* We must be careful to use virtual regs before they're instantiated,
 	 and real regs afterwards.  Loop optimization, for example, can create
 	 new libcalls after we've instantiated the virtual regs, and if we
@@ -3598,11 +3253,8 @@  emit_library_call_value_1 (int retval, r
       else
 	argblock = virtual_outgoing_args_rtx;
     }
-  else
-    {
-      if (!PUSH_ARGS)
-	argblock = push_block (GEN_INT (args_size.constant), 0, 0);
-    }
+  else if (!PUSH_ARGS)
+    argblock = push_block (GEN_INT (args_size.constant), 0, 0);
 
   /* If we push args individually in reverse order, perform stack alignment
      before the first push (the last arg).  */
@@ -3621,17 +3273,6 @@  emit_library_call_value_1 (int retval, r
       argnum = 0;
     }
 
-#ifdef REG_PARM_STACK_SPACE
-  if (ACCUMULATE_OUTGOING_ARGS)
-    {
-      /* The argument list is the property of the called routine and it
-	 may clobber it.  If the fixed area has been used for previous
-	 parameters, we must save and restore it.  */
-      save_area = save_fixed_argument_area (reg_parm_stack_space, argblock,
-					    &low_to_save, &high_to_save);
-    }
-#endif
-
   /* Push the args that need to be pushed.  */
 
   /* ARGNUM indexes the ARGVEC array in the order in which the arguments
@@ -3643,78 +3284,15 @@  emit_library_call_value_1 (int retval, r
       rtx reg = argvec[argnum].reg;
       int partial = argvec[argnum].partial;
       unsigned int parm_align = argvec[argnum].locate.boundary;
-      int lower_bound = 0, upper_bound = 0, i;
 
       if (! (reg != 0 && partial == 0))
 	{
-	  if (ACCUMULATE_OUTGOING_ARGS)
-	    {
-	      /* If this is being stored into a pre-allocated, fixed-size,
-		 stack area, save any previous data at that location.  */
-
-#ifdef ARGS_GROW_DOWNWARD
-	      /* stack_slot is negative, but we want to index stack_usage_map
-		 with positive values.  */
-	      upper_bound = -argvec[argnum].locate.slot_offset.constant + 1;
-	      lower_bound = upper_bound - argvec[argnum].locate.size.constant;
-#else
-	      lower_bound = argvec[argnum].locate.slot_offset.constant;
-	      upper_bound = lower_bound + argvec[argnum].locate.size.constant;
-#endif
-
-	      i = lower_bound;
-	      /* Don't worry about things in the fixed argument area;
-		 it has already been saved.  */
-	      if (i < reg_parm_stack_space)
-		i = reg_parm_stack_space;
-	      while (i < upper_bound && stack_usage_map[i] == 0)
-		i++;
-
-	      if (i < upper_bound)
-		{
-		  /* We need to make a save area.  */
-		  unsigned int size
-		    = argvec[argnum].locate.size.constant * BITS_PER_UNIT;
-		  enum machine_mode save_mode
-		    = mode_for_size (size, MODE_INT, 1);
-		  rtx adr
-		    = plus_constant (argblock,
-				     argvec[argnum].locate.offset.constant);
-		  rtx stack_area
-		    = gen_rtx_MEM (save_mode, memory_address (save_mode, adr));
-
-		  if (save_mode == BLKmode)
-		    {
-		      argvec[argnum].save_area
-			= assign_stack_temp (BLKmode,
-					     argvec[argnum].locate.size.constant,
-					     0);
-
-		      emit_block_move (validize_mem (argvec[argnum].save_area),
-				       stack_area,
-				       GEN_INT (argvec[argnum].locate.size.constant),
-				       BLOCK_OP_CALL_PARM);
-		    }
-		  else
-		    {
-		      argvec[argnum].save_area = gen_reg_rtx (save_mode);
-
-		      emit_move_insn (argvec[argnum].save_area, stack_area);
-		    }
-		}
-	    }
-
 	  emit_push_insn (val, mode, NULL_TREE, NULL_RTX, parm_align,
 			  partial, reg, 0, argblock,
 			  GEN_INT (argvec[argnum].locate.offset.constant),
 			  reg_parm_stack_space,
 			  ARGS_SIZE_RTX (argvec[argnum].locate.alignment_pad));
 
-	  /* Now mark the segment we just used.  */
-	  if (ACCUMULATE_OUTGOING_ARGS)
-	    for (i = lower_bound; i < upper_bound; i++)
-	      stack_usage_map[i] = 1;
-
 	  NO_DEFER_POP;
 
 	  if ((flags & ECF_CONST)
@@ -3900,40 +3478,6 @@  emit_library_call_value_1 (int retval, r
 	}
     }
 
-  if (ACCUMULATE_OUTGOING_ARGS)
-    {
-#ifdef REG_PARM_STACK_SPACE
-      if (save_area)
-	restore_fixed_argument_area (save_area, argblock,
-				     high_to_save, low_to_save);
-#endif
-
-      /* If we saved any argument areas, restore them.  */
-      for (count = 0; count < nargs; count++)
-	if (argvec[count].save_area)
-	  {
-	    enum machine_mode save_mode = GET_MODE (argvec[count].save_area);
-	    rtx adr = plus_constant (argblock,
-				     argvec[count].locate.offset.constant);
-	    rtx stack_area = gen_rtx_MEM (save_mode,
-					  memory_address (save_mode, adr));
-
-	    if (save_mode == BLKmode)
-	      emit_block_move (stack_area,
-			       validize_mem (argvec[count].save_area),
-			       GEN_INT (argvec[count].locate.size.constant),
-			       BLOCK_OP_CALL_PARM);
-	    else
-	      emit_move_insn (stack_area, argvec[count].save_area);
-	  }
-
-      highest_outgoing_arg_in_use = initial_highest_arg_in_use;
-      stack_usage_map = initial_stack_usage_map;
-    }
-
-  if (stack_usage_map_buf)
-    free (stack_usage_map_buf);
-
   return value;
 
 }
@@ -4010,7 +3554,6 @@  store_one_arg (struct arg_data *arg, rtx
   rtx reg = 0;
   int partial = 0;
   int used = 0;
-  int i, lower_bound = 0, upper_bound = 0;
   int sibcall_failure = 0;
 
   if (TREE_CODE (pval) == ERROR_MARK)
@@ -4020,67 +3563,6 @@  store_one_arg (struct arg_data *arg, rtx
      this argument.  */
   push_temp_slots ();
 
-  if (ACCUMULATE_OUTGOING_ARGS && !(flags & ECF_SIBCALL))
-    {
-      /* If this is being stored into a pre-allocated, fixed-size, stack area,
-	 save any previous data at that location.  */
-      if (argblock && ! variable_size && arg->stack)
-	{
-#ifdef ARGS_GROW_DOWNWARD
-	  /* stack_slot is negative, but we want to index stack_usage_map
-	     with positive values.  */
-	  if (GET_CODE (XEXP (arg->stack_slot, 0)) == PLUS)
-	    upper_bound = -INTVAL (XEXP (XEXP (arg->stack_slot, 0), 1)) + 1;
-	  else
-	    upper_bound = 0;
-
-	  lower_bound = upper_bound - arg->locate.size.constant;
-#else
-	  if (GET_CODE (XEXP (arg->stack_slot, 0)) == PLUS)
-	    lower_bound = INTVAL (XEXP (XEXP (arg->stack_slot, 0), 1));
-	  else
-	    lower_bound = 0;
-
-	  upper_bound = lower_bound + arg->locate.size.constant;
-#endif
-
-	  i = lower_bound;
-	  /* Don't worry about things in the fixed argument area;
-	     it has already been saved.  */
-	  if (i < reg_parm_stack_space)
-	    i = reg_parm_stack_space;
-	  while (i < upper_bound && stack_usage_map[i] == 0)
-	    i++;
-
-	  if (i < upper_bound)
-	    {
-	      /* We need to make a save area.  */
-	      unsigned int size = arg->locate.size.constant * BITS_PER_UNIT;
-	      enum machine_mode save_mode = mode_for_size (size, MODE_INT, 1);
-	      rtx adr = memory_address (save_mode, XEXP (arg->stack_slot, 0));
-	      rtx stack_area = gen_rtx_MEM (save_mode, adr);
-
-	      if (save_mode == BLKmode)
-		{
-		  tree ot = TREE_TYPE (arg->tree_value);
-		  tree nt = build_qualified_type (ot, (TYPE_QUALS (ot)
-						       | TYPE_QUAL_CONST));
-
-		  arg->save_area = assign_temp (nt, 0, 1, 1);
-		  preserve_temp_slots (arg->save_area);
-		  emit_block_move (validize_mem (arg->save_area), stack_area,
-				   GEN_INT (arg->locate.size.constant),
-				   BLOCK_OP_CALL_PARM);
-		}
-	      else
-		{
-		  arg->save_area = gen_reg_rtx (save_mode);
-		  emit_move_insn (arg->save_area, stack_area);
-		}
-	    }
-	}
-    }
-
   /* If this isn't going to be placed on both the stack and in registers,
      set up the register and number of words.  */
   if (! arg->pass_on_stack)
@@ -4105,27 +3587,6 @@  store_one_arg (struct arg_data *arg, rtx
      it directly into its stack slot.  Otherwise, we can.  */
   if (arg->value == 0)
     {
-      /* stack_arg_under_construction is nonzero if a function argument is
-	 being evaluated directly into the outgoing argument list and
-	 expand_call must take special action to preserve the argument list
-	 if it is called recursively.
-
-	 For scalar function arguments stack_usage_map is sufficient to
-	 determine which stack slots must be saved and restored.  Scalar
-	 arguments in general have pass_on_stack == 0.
-
-	 If this argument is initialized by a function which takes the
-	 address of the argument (a C++ constructor or a C function
-	 returning a BLKmode structure), then stack_usage_map is
-	 insufficient and expand_call must push the stack around the
-	 function call.  Such arguments have pass_on_stack == 1.
-
-	 Note that it is always safe to set stack_arg_under_construction,
-	 but this generates suboptimal code if set when not needed.  */
-
-      if (arg->pass_on_stack)
-	stack_arg_under_construction++;
-
       arg->value = expand_expr (pval,
 				(partial
 				 || TYPE_MODE (TREE_TYPE (pval)) != arg->mode)
@@ -4138,9 +3599,6 @@  store_one_arg (struct arg_data *arg, rtx
       if (arg->mode != TYPE_MODE (TREE_TYPE (pval)))
 	arg->value = convert_modes (arg->mode, TYPE_MODE (TREE_TYPE (pval)),
 				    arg->value, arg->unsignedp);
-
-      if (arg->pass_on_stack)
-	stack_arg_under_construction--;
     }
 
   /* Check for overlap with already clobbered argument area.  */
@@ -4333,12 +3791,6 @@  store_one_arg (struct arg_data *arg, rtx
 				      int_size_in_bytes (type));
     }
 
-  /* Mark all slots this store used.  */
-  if (ACCUMULATE_OUTGOING_ARGS && !(flags & ECF_SIBCALL)
-      && argblock && ! variable_size && arg->stack)
-    for (i = lower_bound; i < upper_bound; i++)
-      stack_usage_map[i] = 1;
-
   /* Once we have pushed something, pops can't safely
      be deferred during the rest of the arguments.  */
   NO_DEFER_POP;
Index: tree-ssa-ter.c
===================================================================
--- tree-ssa-ter.c	(revision 162821)
+++ tree-ssa-ter.c	(working copy)
@@ -366,6 +366,7 @@  is_replaceable_p (gimple stmt)
   use_operand_p use_p;
   tree def;
   gimple use_stmt;
+  enum gimple_code code;
   location_t locus1, locus2;
   tree block1, block2;
 
@@ -434,6 +435,12 @@  is_replaceable_p (gimple stmt)
   /* No function calls can be replaced.  */
   if (is_gimple_call (stmt))
     return false;
+  /* Avoid nesting calls.  We allow a few things which we're certain won't
+     generate library calls.  */
+  if (is_gimple_call (use_stmt)
+      && gimple_assign_rhs_code (stmt) != VAR_DECL
+      && gimple_assign_rhs_code (stmt) != PARM_DECL)
+    return false;
 
   /* Leave any stmt with volatile operands alone as well.  */
   if (gimple_has_volatile_ops (stmt))