diff mbox

[5/5] Postpone expanding va_arg until pass_stdarg

Message ID 54E5D0D2.8080306@mentor.com
State New
Headers show

Commit Message

Tom de Vries Feb. 19, 2015, 12:02 p.m. UTC
On 19-02-15 11:29, Tom de Vries wrote:
> Hi,
>
> I'm posting this patch series for stage1:
> - 0001-Disable-lang_hooks.gimplify_expr-in-free_lang_data.patch
> - 0002-Add-gimple_find_sub_bbs.patch
> - 0003-Factor-optimize_va_list_gpr_fpr_size-out-of-pass_std.patch
> - 0004-Handle-internal_fn-in-operand_equal_p.patch
> - 0005-Postpone-expanding-va_arg-until-pass_stdarg.patch
>
> The patch series - based on Michael's initial patch - postpones expanding va_arg
> until pass_stdarg, which makes pass_stdarg more robust.
>
> Bootstrapped and reg-tested on x86_64 using all languages, with unix/ and
> unix/-m32 testing.
>
> I'll post the patches in reply to this email.
>

This patch postpones expanding va_arg until pass_stdarg.

We add a new internal function IFN_VA_ARG. During gimplification, we map 
VA_ARG_EXPR onto a CALL_EXPR with IFN_VA_ARG, which is then gimplified in to a 
gimple_call. At pass_stdarg, we expand the IFN_VA_ARG gimple_call into actual code.

There are a few implementation details worth mentioning:
- passing the type beyond gimplification is done by adding a NULL pointer-
   to-type to IFN_VA_ARG.
- there is special handling for IFN_VA_ARG that would be most suited to be
   placed in gimplify_va_arg_expr. However, that function lacks the scope for
   the special handling, so it's placed awkwardly in gimplify_modify_expr.
- there's special handling in case the va_arg type is variable-sized.
   gimplify_modify_expr adds a WITH_SIZE_EXPR to the CALL_EXPR IFN_VA_ARG for
   variable-sized types. However, this is gimplified into a gimple_call which
   does not have the possibility to wrap it's result in a WITH_SIZE_EXPR. So
   we're adding the size argument of the WITH_SIZE_EXPR as argument to
   IFN_VA_ARG, and at expansion in pass_stdarg, wrap the result of the
   gimplification of IFN_VA_ARG in a WITH_SIZE_EXPR, such that the subsequent
   gimplify_assign will generate a memcpy if necessary.
- when gimplifying the va_arg argument ap, it may not be addressable. So
   gimplification will generate a copy ap.1 = ap, and use &ap.1 as argument.
   This means that we have to copy back the ap.1 value to ap after IFN_VA_ARG.
   The copy is classified by the va_list_gpr/fpr_size optimization as an
   escape,  so it inhibits optimization. The tree-ssa/stdarg-2.c f15 update is
   because of that.

OK for stage1?

Thanks,
- Tom

Comments

Richard Biener Feb. 19, 2015, 1:03 p.m. UTC | #1
On Thu, 19 Feb 2015, Tom de Vries wrote:

> On 19-02-15 11:29, Tom de Vries wrote:
> > Hi,
> > 
> > I'm posting this patch series for stage1:
> > - 0001-Disable-lang_hooks.gimplify_expr-in-free_lang_data.patch
> > - 0002-Add-gimple_find_sub_bbs.patch
> > - 0003-Factor-optimize_va_list_gpr_fpr_size-out-of-pass_std.patch
> > - 0004-Handle-internal_fn-in-operand_equal_p.patch
> > - 0005-Postpone-expanding-va_arg-until-pass_stdarg.patch
> > 
> > The patch series - based on Michael's initial patch - postpones expanding
> > va_arg
> > until pass_stdarg, which makes pass_stdarg more robust.
> > 
> > Bootstrapped and reg-tested on x86_64 using all languages, with unix/ and
> > unix/-m32 testing.
> > 
> > I'll post the patches in reply to this email.
> > 
> 
> This patch postpones expanding va_arg until pass_stdarg.
> 
> We add a new internal function IFN_VA_ARG. During gimplification, we map
> VA_ARG_EXPR onto a CALL_EXPR with IFN_VA_ARG, which is then gimplified in to a
> gimple_call. At pass_stdarg, we expand the IFN_VA_ARG gimple_call into actual
> code.
> 
> There are a few implementation details worth mentioning:
> - passing the type beyond gimplification is done by adding a NULL pointer-
>   to-type to IFN_VA_ARG.
> - there is special handling for IFN_VA_ARG that would be most suited to be
>   placed in gimplify_va_arg_expr. However, that function lacks the scope for
>   the special handling, so it's placed awkwardly in gimplify_modify_expr.
> - there's special handling in case the va_arg type is variable-sized.
>   gimplify_modify_expr adds a WITH_SIZE_EXPR to the CALL_EXPR IFN_VA_ARG for
>   variable-sized types. However, this is gimplified into a gimple_call which
>   does not have the possibility to wrap it's result in a WITH_SIZE_EXPR. So
>   we're adding the size argument of the WITH_SIZE_EXPR as argument to
>   IFN_VA_ARG, and at expansion in pass_stdarg, wrap the result of the
>   gimplification of IFN_VA_ARG in a WITH_SIZE_EXPR, such that the subsequent
>   gimplify_assign will generate a memcpy if necessary.
> - when gimplifying the va_arg argument ap, it may not be addressable. So
>   gimplification will generate a copy ap.1 = ap, and use &ap.1 as argument.
>   This means that we have to copy back the ap.1 value to ap after IFN_VA_ARG.
>   The copy is classified by the va_list_gpr/fpr_size optimization as an
>   escape,  so it inhibits optimization. The tree-ssa/stdarg-2.c f15 update is
>   because of that.
> 
> OK for stage1?

Looks mostly good, though it looks like with -O0 this doesn't delay
lowering of va-arg and thus won't "fix" offloading.  Can you instead
introduce a PROP_gimple_lva, provide it by the stdarg pass and add
a pass_lower_vaarg somewhere where pass_lower_complex_O0 is run
that runs of !PROP_gimple_lva (and also provides it), and require
PROP_gimple_lva by pass_expand?  (just look for PROP_gimple_lcx for
the complex stuff to get an idea what needs to be touched)

Thanks,
Richard.
diff mbox

Patch

2015-02-17  Tom de Vries  <tom@codesourcery.com>
	    Michael Matz  <matz@suse.de>

	* gimple-iterator.c (update_modified_stmts): Remove static.
	* gimple-iterator.h (update_modified_stmts): Declare.
	* gimplify.c (gimplify_modify_expr): Handle IFN_VA_ARG.
	(gimplify_va_arg_internal): New function.
	(gimplify_va_arg_expr): Use IFN_VA_ARG.
	* gimplify.h (gimplify_va_arg_internal): Declare.
	* internal-fn.c (expand_VA_ARG): New unreachable function.
	* internal-fn.def (VA_ARG): New DEF_INTERNAL_FN.
	* tree-stdarg.c (gimple_call_ifn_va_arg_p, expand_ifn_va_arg): New
	function.
	(pass_stdarg::execute): Call expand_ifn_va_arg.

	* gcc.dg/tree-ssa/stdarg-2.c: Change f15 scan-tree-dump for target
	x86_64-*-*.
---
 gcc/gimple-iterator.c                    |   2 +-
 gcc/gimple-iterator.h                    |   1 +
 gcc/gimplify.c                           | 109 +++++++++++++++++++++--------
 gcc/gimplify.h                           |   2 +
 gcc/internal-fn.c                        |   9 +++
 gcc/internal-fn.def                      |   1 +
 gcc/testsuite/gcc.dg/tree-ssa/stdarg-2.c |   4 +-
 gcc/tree-stdarg.c                        | 115 +++++++++++++++++++++++++++++--
 8 files changed, 208 insertions(+), 35 deletions(-)

diff --git a/gcc/gimple-iterator.c b/gcc/gimple-iterator.c
index a322390..df29123 100644
--- a/gcc/gimple-iterator.c
+++ b/gcc/gimple-iterator.c
@@ -72,7 +72,7 @@  update_modified_stmt (gimple stmt)
 
 /* Mark the statements in SEQ as modified, and update them.  */
 
-static void
+void
 update_modified_stmts (gimple_seq seq)
 {
   gimple_stmt_iterator gsi;
diff --git a/gcc/gimple-iterator.h b/gcc/gimple-iterator.h
index 6be88dd..ab5759e 100644
--- a/gcc/gimple-iterator.h
+++ b/gcc/gimple-iterator.h
@@ -90,6 +90,7 @@  extern basic_block gsi_insert_seq_on_edge_immediate (edge, gimple_seq);
 extern void gsi_commit_edge_inserts (void);
 extern void gsi_commit_one_edge_insert (edge, basic_block *);
 extern gphi_iterator gsi_start_phis (basic_block);
+extern void update_modified_stmts (gimple_seq);
 
 /* Return a new iterator pointing to GIMPLE_SEQ's first statement.  */
 
diff --git a/gcc/gimplify.c b/gcc/gimplify.c
index 1353ada..932200e 100644
--- a/gcc/gimplify.c
+++ b/gcc/gimplify.c
@@ -4564,6 +4564,7 @@  gimplify_modify_expr (tree *expr_p, gimple_seq *pre_p, gimple_seq *post_p,
   gimple assign;
   location_t loc = EXPR_LOCATION (*expr_p);
   gimple_stmt_iterator gsi;
+  tree ap = NULL_TREE, ap_copy = NULL_TREE;
 
   gcc_assert (TREE_CODE (*expr_p) == MODIFY_EXPR
 	      || TREE_CODE (*expr_p) == INIT_EXPR);
@@ -4640,6 +4641,27 @@  gimplify_modify_expr (tree *expr_p, gimple_seq *pre_p, gimple_seq *post_p,
   if (ret == GS_ERROR)
     return ret;
 
+  /* In case of va_arg internal fn wrappped in a WITH_SIZE_EXPR, add the type
+     size as argument to the the call.  */
+  if (TREE_CODE (*from_p) == WITH_SIZE_EXPR)
+    {
+      tree call = TREE_OPERAND (*from_p, 0);
+      tree vlasize = TREE_OPERAND (*from_p, 1);
+
+      if (TREE_CODE (call) == CALL_EXPR
+	  && CALL_EXPR_IFN (call) == IFN_VA_ARG)
+	{
+	  tree type = TREE_TYPE (call);
+	  tree ap = CALL_EXPR_ARG (call, 0);
+	  tree tag = CALL_EXPR_ARG (call, 1);
+	  tree newcall = build_call_expr_internal_loc (EXPR_LOCATION (call),
+						       IFN_VA_ARG, type, 3, ap,
+						       tag, vlasize);
+	  tree *call_p = &(TREE_OPERAND (*from_p, 0));
+	  *call_p = newcall;
+	}
+    }
+
   /* Now see if the above changed *from_p to something we handle specially.  */
   ret = gimplify_modify_expr_rhs (expr_p, from_p, to_p, pre_p, post_p,
 				  want_value);
@@ -4703,12 +4725,16 @@  gimplify_modify_expr (tree *expr_p, gimple_seq *pre_p, gimple_seq *post_p,
 	  enum internal_fn ifn = CALL_EXPR_IFN (*from_p);
 	  auto_vec<tree> vargs (nargs);
 
+	  if (ifn == IFN_VA_ARG)
+	    ap = unshare_expr (CALL_EXPR_ARG (*from_p, 0));
 	  for (i = 0; i < nargs; i++)
 	    {
 	      gimplify_arg (&CALL_EXPR_ARG (*from_p, i), pre_p,
 			    EXPR_LOCATION (*from_p));
 	      vargs.quick_push (CALL_EXPR_ARG (*from_p, i));
 	    }
+	  if (ifn == IFN_VA_ARG)
+	    ap_copy = CALL_EXPR_ARG (*from_p, 0);
 	  call_stmt = gimple_build_call_internal_vec (ifn, vargs);
 	  gimple_set_location (call_stmt, EXPR_LOCATION (*expr_p));
 	}
@@ -4753,6 +4779,17 @@  gimplify_modify_expr (tree *expr_p, gimple_seq *pre_p, gimple_seq *post_p,
   gsi = gsi_last (*pre_p);
   maybe_fold_stmt (&gsi);
 
+  /* When gimplifying the &ap argument of va_arg, we might end up with
+       ap.1 = ap
+       va_arg (&ap.1, 0B)
+     We need to assign ap.1 back to ap, otherwise va_arg has no effect on
+     ap.  */
+  if (ap != NULL_TREE
+      && TREE_CODE (ap) == ADDR_EXPR
+      && TREE_CODE (ap_copy) == ADDR_EXPR
+      && TREE_OPERAND (ap, 0) != TREE_OPERAND (ap_copy, 0))
+    gimplify_assign (TREE_OPERAND (ap, 0), TREE_OPERAND (ap_copy, 0), pre_p);
+
   if (want_value)
     {
       *expr_p = TREE_THIS_VOLATILE (*to_p) ? *from_p : unshare_expr (*to_p);
@@ -9290,6 +9327,42 @@  dummy_object (tree type)
   return build2 (MEM_REF, type, t, t);
 }
 
+/* Call the target expander for evaluating a va_arg call of VALIST
+   and TYPE.  */
+
+tree
+gimplify_va_arg_internal (tree valist, tree type, location_t loc,
+			  gimple_seq *pre_p, gimple_seq *post_p)
+{
+  tree have_va_type = TREE_TYPE (valist);
+  tree cano_type = targetm.canonical_va_list_type (have_va_type);
+
+  if (cano_type != NULL_TREE)
+    have_va_type = cano_type;
+
+  /* Make it easier for the backends by protecting the valist argument
+     from multiple evaluations.  */
+  if (TREE_CODE (have_va_type) == ARRAY_TYPE)
+    {
+      /* For this case, the backends will be expecting a pointer to
+	 TREE_TYPE (abi), but it's possible we've
+	 actually been given an array (an actual TARGET_FN_ABI_VA_LIST).
+	 So fix it.  */
+      if (TREE_CODE (TREE_TYPE (valist)) == ARRAY_TYPE)
+	{
+	  tree p1 = build_pointer_type (TREE_TYPE (have_va_type));
+	  valist = fold_convert_loc (loc, p1,
+				     build_fold_addr_expr_loc (loc, valist));
+	}
+
+      gimplify_expr (&valist, pre_p, post_p, is_gimple_val, fb_rvalue);
+    }
+  else
+    gimplify_expr (&valist, pre_p, post_p, is_gimple_min_lval, fb_lvalue);
+
+  return targetm.gimplify_va_arg_expr (valist, type, pre_p, post_p);
+}
+
 /* Gimplify __builtin_va_arg, aka VA_ARG_EXPR, which is not really a
    builtin function, but a very special sort of operator.  */
 
@@ -9351,36 +9424,18 @@  gimplify_va_arg_expr (tree *expr_p, gimple_seq *pre_p, gimple_seq *post_p)
       *expr_p = dummy_object (type);
       return GS_ALL_DONE;
     }
-  else
+  else if (optimize && !optimize_debug)
     {
-      /* Make it easier for the backends by protecting the valist argument
-	 from multiple evaluations.  */
-      if (TREE_CODE (have_va_type) == ARRAY_TYPE)
-	{
-	  /* For this case, the backends will be expecting a pointer to
-	     TREE_TYPE (abi), but it's possible we've
-	     actually been given an array (an actual TARGET_FN_ABI_VA_LIST).
-	     So fix it.  */
-	  if (TREE_CODE (TREE_TYPE (valist)) == ARRAY_TYPE)
-	    {
-	      tree p1 = build_pointer_type (TREE_TYPE (have_va_type));
-	      valist = fold_convert_loc (loc, p1,
-					 build_fold_addr_expr_loc (loc, valist));
-	    }
+      tree ap, tag;
 
-	  gimplify_expr (&valist, pre_p, post_p, is_gimple_val, fb_rvalue);
-	}
-      else
-	gimplify_expr (&valist, pre_p, post_p, is_gimple_min_lval, fb_lvalue);
-
-      if (!targetm.gimplify_va_arg_expr)
-	/* FIXME: Once most targets are converted we should merely
-	   assert this is non-null.  */
-	return GS_ALL_DONE;
-
-      *expr_p = targetm.gimplify_va_arg_expr (valist, type, pre_p, post_p);
-      return GS_OK;
+      /* Transform a VA_ARG_EXPR into an VA_ARG internal function.  */
+      ap = build_fold_addr_expr_loc (loc, valist);
+      tag = build_int_cst (build_pointer_type (type), 0);
+      *expr_p = build_call_expr_internal_loc (loc, IFN_VA_ARG, type, 2, ap, tag);
     }
+  else
+    *expr_p = gimplify_va_arg_internal (valist, type, loc, pre_p, post_p);
+  return GS_OK;
 }
 
 /* Build a new GIMPLE_ASSIGN tuple and append it to the end of *SEQ_P.
diff --git a/gcc/gimplify.h b/gcc/gimplify.h
index 615925c..bad8e0f 100644
--- a/gcc/gimplify.h
+++ b/gcc/gimplify.h
@@ -82,6 +82,8 @@  extern void gimplify_function_tree (tree);
 extern enum gimplify_status gimplify_va_arg_expr (tree *, gimple_seq *,
 						  gimple_seq *);
 gimple gimplify_assign (tree, tree, gimple_seq *);
+extern tree gimplify_va_arg_internal (tree, tree, location_t, gimple_seq *,
+				      gimple_seq *);
 
 /* Return true if gimplify_one_sizepos doesn't need to gimplify
    expr (when in TYPE_SIZE{,_UNIT} and similar type/decl size/bitsize
diff --git a/gcc/internal-fn.c b/gcc/internal-fn.c
index e402825..0053ed9 100644
--- a/gcc/internal-fn.c
+++ b/gcc/internal-fn.c
@@ -1972,6 +1972,15 @@  expand_BUILTIN_EXPECT (gcall *stmt)
     emit_move_insn (target, val);
 }
 
+/* IFN_VA_ARG is supposed to be expanded at pass_stdarg.  So this dummy function
+   should never be called.  */
+
+static void
+expand_VA_ARG (gcall *stmt ATTRIBUTE_UNUSED)
+{
+  gcc_unreachable ();
+}
+
 /* Routines to expand each internal function, indexed by function number.
    Each routine has the prototype:
 
diff --git a/gcc/internal-fn.def b/gcc/internal-fn.def
index 032ce6c..f557c64 100644
--- a/gcc/internal-fn.def
+++ b/gcc/internal-fn.def
@@ -62,3 +62,4 @@  DEF_INTERNAL_FN (ADD_OVERFLOW, ECF_CONST | ECF_LEAF | ECF_NOTHROW, NULL)
 DEF_INTERNAL_FN (SUB_OVERFLOW, ECF_CONST | ECF_LEAF | ECF_NOTHROW, NULL)
 DEF_INTERNAL_FN (MUL_OVERFLOW, ECF_CONST | ECF_LEAF | ECF_NOTHROW, NULL)
 DEF_INTERNAL_FN (TSAN_FUNC_EXIT, ECF_NOVOPS | ECF_LEAF | ECF_NOTHROW, NULL)
+DEF_INTERNAL_FN (VA_ARG, 0, NULL)
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/stdarg-2.c b/gcc/testsuite/gcc.dg/tree-ssa/stdarg-2.c
index fe39da3..5a74280 100644
--- a/gcc/testsuite/gcc.dg/tree-ssa/stdarg-2.c
+++ b/gcc/testsuite/gcc.dg/tree-ssa/stdarg-2.c
@@ -288,9 +288,9 @@  f15 (int i, ...)
   f15_1 (ap);
   va_end (ap);
 }
-/* { dg-final { scan-tree-dump "f15: va_list escapes 0, needs to save \[148\] GPR units and \[1-9\]\[0-9\]* FPR units" "stdarg" { target { { i?86-*-* x86_64-*-* } && { ! { ia32 || llp64 } } } } } } */
+/* { dg-final { scan-tree-dump "f15: va_list escapes 0, needs to save \[148\] GPR units and \[1-9\]\[0-9\]* FPR units" "stdarg" { target { { i?86-*-* } && { ! { ia32 || llp64 } } } } } } */
 /* { dg-final { scan-tree-dump "f15: va_list escapes 0, needs to save \[148\] GPR units and \[1-9\]\[0-9\]* FPR units" "stdarg" { target { powerpc*-*-linux* && { powerpc_fprs && ilp32 } } } } } */
-/* { dg-final { scan-tree-dump "f15: va_list escapes 1, needs to save all GPR units and all FPR units" "stdarg" { target alpha*-*-linux* } } } */
+/* { dg-final { scan-tree-dump "f15: va_list escapes 1, needs to save all GPR units and all FPR units" "stdarg" { target { { alpha*-*-linux* } || { { x86_64-*-* } && { ! { ia32 || llp64 } } } } } } } */
 /* { dg-final { scan-tree-dump "f15: va_list escapes 0, needs to save 1 GPR units and 2 FPR units" "stdarg" { target s390*-*-linux* } } } */
 /* { dg-final { scan-tree-dump-not "f15: va_list escapes 0, needs to save 0 GPR units" "stdarg" { target { { i?86-*-* x86_64-*-* } && ia32 } } } } */
 /* { dg-final { scan-tree-dump-not "f15: va_list escapes 0, needs to save 0 GPR units" "stdarg" { target ia64-*-* } } } */
diff --git a/gcc/tree-stdarg.c b/gcc/tree-stdarg.c
index 5eac56a..9dc5fe5 100644
--- a/gcc/tree-stdarg.c
+++ b/gcc/tree-stdarg.c
@@ -52,11 +52,14 @@  along with GCC; see the file COPYING3.  If not see
 #include "gimple-iterator.h"
 #include "gimple-walk.h"
 #include "gimple-ssa.h"
+#include "gimplify.h"
 #include "tree-phinodes.h"
 #include "ssa-iterators.h"
 #include "stringpool.h"
 #include "tree-ssanames.h"
+#include "tree-into-ssa.h"
 #include "sbitmap.h"
+#include "tree-cfg.h"
 #include "tree-pass.h"
 #include "tree-stdarg.h"
 
@@ -1016,6 +1019,101 @@  finish:
     }
 }
 
+/* Return true if STMT is IFN_VA_ARG.  */
+
+static bool
+gimple_call_ifn_va_arg_p (gimple stmt)
+{
+  return (is_gimple_call (stmt)
+	  && gimple_call_internal_p (stmt)
+	  && gimple_call_internal_fn (stmt) == IFN_VA_ARG);
+}
+
+/* Expand IFN_VA_ARGs in FUN.  */
+
+static void
+expand_ifn_va_arg (function *fun)
+{
+  bool modified = false;
+  basic_block bb;
+  gimple_stmt_iterator i;
+
+  FOR_EACH_BB_FN (bb, fun)
+    for (i = gsi_start_bb (bb); !gsi_end_p (i); gsi_next (&i))
+      {
+	gimple stmt = gsi_stmt (i);
+	tree ap, expr, lhs, type;
+	gimple_seq pre = NULL, post = NULL;
+
+	if (!gimple_call_ifn_va_arg_p (stmt))
+	  continue;
+
+	modified = true;
+
+	type = TREE_TYPE (TREE_TYPE (gimple_call_arg (stmt, 1)));
+	ap = gimple_call_arg (stmt, 0);
+	ap = build_fold_indirect_ref (ap);
+
+	push_gimplify_context (false);
+
+	expr = gimplify_va_arg_internal (ap, type, gimple_location (stmt),
+					 &pre, &post);
+
+	lhs = gimple_call_lhs (stmt);
+	if (lhs != NULL_TREE)
+	  {
+	    gcc_assert (useless_type_conversion_p (TREE_TYPE (lhs), type));
+
+	    if (gimple_call_num_args (stmt) == 3)
+	      {
+		/* We've transported the size of with WITH_SIZE_EXPR here as
+		   the 3rd argument of the internal fn call.  Now reinstate
+		   it.  */
+		tree size = gimple_call_arg (stmt, 2);
+		expr = build2 (WITH_SIZE_EXPR, TREE_TYPE (expr), expr, size);
+	      }
+
+	    /* We use gimplify_assign here, rather than gimple_build_assign,
+	       because gimple_assign knows how to deal with variable-sized
+	       types.*/
+	    gimplify_assign (lhs, expr, &pre);
+	  }
+
+	pop_gimplify_context (NULL);
+
+	gimple_seq_add_seq (&pre, post);
+	update_modified_stmts (pre);
+
+	/* Add the sequence after IFN_VA_ARG.  This splits the bb right
+	   after IFN_VA_ARG, and adds the sequence in one or more new bbs
+	   inbetween.  */
+	gimple_find_sub_bbs (pre, &i);
+
+	/* Remove the IFN_VA_ARG gimple_call. It's the last stmt in the
+	   bb.  */
+	gsi_remove (&i, true);
+	gcc_assert (gsi_end_p (i));
+
+	/* We're walking here into the bbs which contain the expansion of
+	   IFN_VA_ARG, and will not contain another IFN_VA_ARG that needs
+	   expanding.  We could try to skip walking these bbs, perhaps by
+	   walking backwards over gimples and bbs.  */
+	break;
+      }
+
+#if ENABLE_CHECKING
+  FOR_EACH_BB_FN (bb, fun)
+    for (i = gsi_start_bb (bb); !gsi_end_p (i); gsi_next (&i))
+      gcc_assert (!gimple_call_ifn_va_arg_p (gsi_stmt (i)));
+#endif
+
+  if (modified)
+    {
+      free_dominance_info (CDI_DOMINATORS);
+      update_ssa (TODO_update_ssa);
+    }
+}
+
 namespace {
 
 const pass_data pass_data_stdarg =
@@ -1039,11 +1137,13 @@  public:
   {}
 
   /* opt_pass methods: */
-  virtual bool gate (function *fun)
+  virtual bool gate (function *)
     {
-      return (flag_stdarg_opt
-	      /* This optimization is only for stdarg functions.  */
-	      && fun->stdarg != 0);
+      /* Always run this pass, in order to expand va_arg internal_fns.  We
+	 also need to do that if fun->stdarg == 0, because a va_arg may also
+	 occur in a function without varargs, f.i. if when passing a va_list to
+	 another function.  */
+      return true;
     }
 
   virtual unsigned int execute (function *);
@@ -1053,7 +1153,12 @@  public:
 unsigned int
 pass_stdarg::execute (function *fun)
 {
-  optimize_va_list_gpr_fpr_size (fun);
+  expand_ifn_va_arg (fun);
+
+  if (flag_stdarg_opt
+      /* This optimization is only for stdarg functions.  */
+      && fun->stdarg != 0)
+    optimize_va_list_gpr_fpr_size (fun);
 
   return 0;
 }
-- 
1.9.1