diff mbox

[5/5] Postpone expanding va_arg until pass_stdarg

Message ID 54E9D5E2.3070805@mentor.com
State New
Headers show

Commit Message

Tom de Vries Feb. 22, 2015, 1:13 p.m. UTC
On 19-02-15 14:03, Richard Biener wrote:
> On Thu, 19 Feb 2015, Tom de Vries wrote:
>
>> On 19-02-15 11:29, Tom de Vries wrote:
>>> Hi,
>>>
>>> I'm posting this patch series for stage1:
>>> - 0001-Disable-lang_hooks.gimplify_expr-in-free_lang_data.patch
>>> - 0002-Add-gimple_find_sub_bbs.patch
>>> - 0003-Factor-optimize_va_list_gpr_fpr_size-out-of-pass_std.patch
>>> - 0004-Handle-internal_fn-in-operand_equal_p.patch
>>> - 0005-Postpone-expanding-va_arg-until-pass_stdarg.patch
>>>
>>> The patch series - based on Michael's initial patch - postpones expanding
>>> va_arg
>>> until pass_stdarg, which makes pass_stdarg more robust.
>>>
>>> Bootstrapped and reg-tested on x86_64 using all languages, with unix/ and
>>> unix/-m32 testing.
>>>
>>> I'll post the patches in reply to this email.
>>>
>>
>> This patch postpones expanding va_arg until pass_stdarg.
>>
>> We add a new internal function IFN_VA_ARG. During gimplification, we map
>> VA_ARG_EXPR onto a CALL_EXPR with IFN_VA_ARG, which is then gimplified in to a
>> gimple_call. At pass_stdarg, we expand the IFN_VA_ARG gimple_call into actual
>> code.
>>
>> There are a few implementation details worth mentioning:
>> - passing the type beyond gimplification is done by adding a NULL pointer-
>>    to-type to IFN_VA_ARG.
>> - there is special handling for IFN_VA_ARG that would be most suited to be
>>    placed in gimplify_va_arg_expr. However, that function lacks the scope for
>>    the special handling, so it's placed awkwardly in gimplify_modify_expr.
>> - there's special handling in case the va_arg type is variable-sized.
>>    gimplify_modify_expr adds a WITH_SIZE_EXPR to the CALL_EXPR IFN_VA_ARG for
>>    variable-sized types. However, this is gimplified into a gimple_call which
>>    does not have the possibility to wrap it's result in a WITH_SIZE_EXPR. So
>>    we're adding the size argument of the WITH_SIZE_EXPR as argument to
>>    IFN_VA_ARG, and at expansion in pass_stdarg, wrap the result of the
>>    gimplification of IFN_VA_ARG in a WITH_SIZE_EXPR, such that the subsequent
>>    gimplify_assign will generate a memcpy if necessary.
>> - when gimplifying the va_arg argument ap, it may not be addressable. So
>>    gimplification will generate a copy ap.1 = ap, and use &ap.1 as argument.
>>    This means that we have to copy back the ap.1 value to ap after IFN_VA_ARG.
>>    The copy is classified by the va_list_gpr/fpr_size optimization as an
>>    escape,  so it inhibits optimization. The tree-ssa/stdarg-2.c f15 update is
>>    because of that.
>>
>> OK for stage1?
>
> Looks mostly good, though it looks like with -O0 this doesn't delay
> lowering of va-arg and thus won't "fix" offloading.  Can you instead
> introduce a PROP_gimple_lva, provide it by the stdarg pass and add
> a pass_lower_vaarg somewhere where pass_lower_complex_O0 is run
> that runs of !PROP_gimple_lva (and also provides it), and require
> PROP_gimple_lva by pass_expand?  (just look for PROP_gimple_lcx for
> the complex stuff to get an idea what needs to be touched)
>

Updated according to comments.

Furthermore (having updated the patch series to recent trunk), I'm dropping the 
ACCEL_COMPILER bit in pass_stdarg::gate. AFAIU the comment there relates to this 
patch.

Retested as before.

OK for stage1?

Btw, I'm wondering if as run-time optimization we can tentatively set 
PROP_gimple_lva at the start of the gimple pass, and unset it in 
gimplify_va_arg_expr. That way we would avoid the loop in expand_ifn_va_arg_1 
(over all bbs and gimples) in functions without va_arg.

Thanks,
- Tom

Comments

Michael Matz Feb. 23, 2015, 8:26 a.m. UTC | #1
Hi,

On Sun, 22 Feb 2015, Tom de Vries wrote:

> Btw, I'm wondering if as run-time optimization we can tentatively set 
> PROP_gimple_lva at the start of the gimple pass, and unset it in 
> gimplify_va_arg_expr. That way we would avoid the loop in 
> expand_ifn_va_arg_1 (over all bbs and gimples) in functions without 
> va_arg.

That makes sense.


Ciao,
Michael.
Tom de Vries March 10, 2015, 3:30 p.m. UTC | #2
[stage1 ping]
On 22-02-15 14:13, Tom de Vries wrote:
> On 19-02-15 14:03, Richard Biener wrote:
>> On Thu, 19 Feb 2015, Tom de Vries wrote:
>>
>>> On 19-02-15 11:29, Tom de Vries wrote:
>>>> Hi,
>>>>
>>>> I'm posting this patch series for stage1:
>>>> - 0001-Disable-lang_hooks.gimplify_expr-in-free_lang_data.patch
>>>> - 0002-Add-gimple_find_sub_bbs.patch
>>>> - 0003-Factor-optimize_va_list_gpr_fpr_size-out-of-pass_std.patch
>>>> - 0004-Handle-internal_fn-in-operand_equal_p.patch
>>>> - 0005-Postpone-expanding-va_arg-until-pass_stdarg.patch
>>>>
>>>> The patch series - based on Michael's initial patch - postpones expanding
>>>> va_arg
>>>> until pass_stdarg, which makes pass_stdarg more robust.
>>>>
>>>> Bootstrapped and reg-tested on x86_64 using all languages, with unix/ and
>>>> unix/-m32 testing.
>>>>
>>>> I'll post the patches in reply to this email.
>>>>
>>>
>>> This patch postpones expanding va_arg until pass_stdarg.
>>>
>>> We add a new internal function IFN_VA_ARG. During gimplification, we map
>>> VA_ARG_EXPR onto a CALL_EXPR with IFN_VA_ARG, which is then gimplified in to a
>>> gimple_call. At pass_stdarg, we expand the IFN_VA_ARG gimple_call into actual
>>> code.
>>>
>>> There are a few implementation details worth mentioning:
>>> - passing the type beyond gimplification is done by adding a NULL pointer-
>>>    to-type to IFN_VA_ARG.
>>> - there is special handling for IFN_VA_ARG that would be most suited to be
>>>    placed in gimplify_va_arg_expr. However, that function lacks the scope for
>>>    the special handling, so it's placed awkwardly in gimplify_modify_expr.
>>> - there's special handling in case the va_arg type is variable-sized.
>>>    gimplify_modify_expr adds a WITH_SIZE_EXPR to the CALL_EXPR IFN_VA_ARG for
>>>    variable-sized types. However, this is gimplified into a gimple_call which
>>>    does not have the possibility to wrap it's result in a WITH_SIZE_EXPR. So
>>>    we're adding the size argument of the WITH_SIZE_EXPR as argument to
>>>    IFN_VA_ARG, and at expansion in pass_stdarg, wrap the result of the
>>>    gimplification of IFN_VA_ARG in a WITH_SIZE_EXPR, such that the subsequent
>>>    gimplify_assign will generate a memcpy if necessary.
>>> - when gimplifying the va_arg argument ap, it may not be addressable. So
>>>    gimplification will generate a copy ap.1 = ap, and use &ap.1 as argument.
>>>    This means that we have to copy back the ap.1 value to ap after IFN_VA_ARG.
>>>    The copy is classified by the va_list_gpr/fpr_size optimization as an
>>>    escape,  so it inhibits optimization. The tree-ssa/stdarg-2.c f15 update is
>>>    because of that.
>>>
>>> OK for stage1?
>>
>> Looks mostly good, though it looks like with -O0 this doesn't delay
>> lowering of va-arg and thus won't "fix" offloading.  Can you instead
>> introduce a PROP_gimple_lva, provide it by the stdarg pass and add
>> a pass_lower_vaarg somewhere where pass_lower_complex_O0 is run
>> that runs of !PROP_gimple_lva (and also provides it), and require
>> PROP_gimple_lva by pass_expand?  (just look for PROP_gimple_lcx for
>> the complex stuff to get an idea what needs to be touched)
>>
>
> Updated according to comments.
>
> Furthermore (having updated the patch series to recent trunk), I'm dropping the
> ACCEL_COMPILER bit in pass_stdarg::gate. AFAIU the comment there relates to this
> patch.
>
> Retested as before.
>
> OK for stage1?
>

Ping.

> Btw, I'm wondering if as run-time optimization we can tentatively set
> PROP_gimple_lva at the start of the gimple pass, and unset it in
> gimplify_va_arg_expr. That way we would avoid the loop in expand_ifn_va_arg_1
> (over all bbs and gimples) in functions without va_arg.
>

Taken care of in follow-up patch 5b.

Thanks,
- Tom

> 0005-Postpone-expanding-va_arg-until-pass_stdarg.patch
>
>
> 2015-02-17  Tom de Vries<tom@codesourcery.com>
> 	    Michael Matz<matz@suse.de>
>
> 	* gimple-iterator.c (update_modified_stmts): Remove static.
> 	* gimple-iterator.h (update_modified_stmts): Declare.
> 	* gimplify.c (gimplify_modify_expr): Handle IFN_VA_ARG.
> 	(gimplify_va_arg_internal): New function.
> 	(gimplify_va_arg_expr): Use IFN_VA_ARG.
> 	* gimplify.h (gimplify_va_arg_internal): Declare.
> 	* internal-fn.c (expand_VA_ARG): New unreachable function.
> 	* internal-fn.def (VA_ARG): New DEF_INTERNAL_FN.
> 	* tree-stdarg.c (gimple_call_ifn_va_arg_p, expand_ifn_va_arg_1)
> 	(expand_ifn_va_arg): New function.
> 	(pass_data_stdarg): Add PROP_gimple_lva to properties_provided field.
> 	(pass_stdarg::execute): Call expand_ifn_va_arg.
> 	(pass_data_lower_vaarg): New pass_data.
> 	(pass_lower_vaarg): New gimple_opt_pass.
> 	(pass_lower_vaarg::gate, pass_lower_vaarg::execute)
> 	(make_pass_lower_vaarg): New function.
> 	* cfgexpand.c (pass_data_expand): Add PROP_gimple_lva to
> 	properties_required field.
> 	* passes.def (all_passes): Add pass_lower_vaarg.
> 	* tree-pass.h (PROP_gimple_lva): Add define.
> 	(make_pass_lower_vaarg): Declare.
>
> 	* gcc.dg/tree-ssa/stdarg-2.c: Change f15 scan-tree-dump for target
> 	x86_64-*-*.
> ---
>   gcc/cfgexpand.c                          |   3 +-
>   gcc/gimple-iterator.c                    |   2 +-
>   gcc/gimple-iterator.h                    |   1 +
>   gcc/gimplify.c                           | 111 ++++++++++++++-----
>   gcc/gimplify.h                           |   2 +
>   gcc/internal-fn.c                        |   9 ++
>   gcc/internal-fn.def                      |   1 +
>   gcc/passes.def                           |   1 +
>   gcc/testsuite/gcc.dg/tree-ssa/stdarg-2.c |   4 +-
>   gcc/tree-pass.h                          |   2 +
>   gcc/tree-stdarg.c                        | 184 ++++++++++++++++++++++++++++---
>   11 files changed, 273 insertions(+), 47 deletions(-)
>
> diff --git a/gcc/cfgexpand.c b/gcc/cfgexpand.c
> index 7dfe1f6..af5a652 100644
> --- a/gcc/cfgexpand.c
> +++ b/gcc/cfgexpand.c
> @@ -5860,7 +5860,8 @@ const pass_data pass_data_expand =
>     TV_EXPAND, /* tv_id */
>     ( PROP_ssa | PROP_gimple_leh | PROP_cfg
>       | PROP_gimple_lcx
> -    | PROP_gimple_lvec ), /* properties_required */
> +    | PROP_gimple_lvec
> +    | PROP_gimple_lva), /* properties_required */
>     PROP_rtl, /* properties_provided */
>     ( PROP_ssa | PROP_trees ), /* properties_destroyed */
>     0, /* todo_flags_start */
> diff --git a/gcc/gimple-iterator.c b/gcc/gimple-iterator.c
> index a322390..df29123 100644
> --- a/gcc/gimple-iterator.c
> +++ b/gcc/gimple-iterator.c
> @@ -72,7 +72,7 @@ update_modified_stmt (gimple stmt)
>
>   /* Mark the statements in SEQ as modified, and update them.  */
>
> -static void
> +void
>   update_modified_stmts (gimple_seq seq)
>   {
>     gimple_stmt_iterator gsi;
> diff --git a/gcc/gimple-iterator.h b/gcc/gimple-iterator.h
> index 6be88dd..ab5759e 100644
> --- a/gcc/gimple-iterator.h
> +++ b/gcc/gimple-iterator.h
> @@ -90,6 +90,7 @@ extern basic_block gsi_insert_seq_on_edge_immediate (edge, gimple_seq);
>   extern void gsi_commit_edge_inserts (void);
>   extern void gsi_commit_one_edge_insert (edge, basic_block *);
>   extern gphi_iterator gsi_start_phis (basic_block);
> +extern void update_modified_stmts (gimple_seq);
>
>   /* Return a new iterator pointing to GIMPLE_SEQ's first statement.  */
>
> diff --git a/gcc/gimplify.c b/gcc/gimplify.c
> index 1353ada..8ac6a35 100644
> --- a/gcc/gimplify.c
> +++ b/gcc/gimplify.c
> @@ -4564,6 +4564,7 @@ gimplify_modify_expr (tree *expr_p, gimple_seq *pre_p, gimple_seq *post_p,
>     gimple assign;
>     location_t loc = EXPR_LOCATION (*expr_p);
>     gimple_stmt_iterator gsi;
> +  tree ap = NULL_TREE, ap_copy = NULL_TREE;
>
>     gcc_assert (TREE_CODE (*expr_p) == MODIFY_EXPR
>   	      || TREE_CODE (*expr_p) == INIT_EXPR);
> @@ -4640,6 +4641,27 @@ gimplify_modify_expr (tree *expr_p, gimple_seq *pre_p, gimple_seq *post_p,
>     if (ret == GS_ERROR)
>       return ret;
>
> +  /* In case of va_arg internal fn wrappped in a WITH_SIZE_EXPR, add the type
> +     size as argument to the the call.  */
> +  if (TREE_CODE (*from_p) == WITH_SIZE_EXPR)
> +    {
> +      tree call = TREE_OPERAND (*from_p, 0);
> +      tree vlasize = TREE_OPERAND (*from_p, 1);
> +
> +      if (TREE_CODE (call) == CALL_EXPR
> +	  && CALL_EXPR_IFN (call) == IFN_VA_ARG)
> +	{
> +	  tree type = TREE_TYPE (call);
> +	  tree ap = CALL_EXPR_ARG (call, 0);
> +	  tree tag = CALL_EXPR_ARG (call, 1);
> +	  tree newcall = build_call_expr_internal_loc (EXPR_LOCATION (call),
> +						       IFN_VA_ARG, type, 3, ap,
> +						       tag, vlasize);
> +	  tree *call_p = &(TREE_OPERAND (*from_p, 0));
> +	  *call_p = newcall;
> +	}
> +    }
> +
>     /* Now see if the above changed *from_p to something we handle specially.  */
>     ret = gimplify_modify_expr_rhs (expr_p, from_p, to_p, pre_p, post_p,
>   				  want_value);
> @@ -4703,12 +4725,16 @@ gimplify_modify_expr (tree *expr_p, gimple_seq *pre_p, gimple_seq *post_p,
>   	  enum internal_fn ifn = CALL_EXPR_IFN (*from_p);
>   	  auto_vec<tree> vargs (nargs);
>
> +	  if (ifn == IFN_VA_ARG)
> +	    ap = unshare_expr (CALL_EXPR_ARG (*from_p, 0));
>   	  for (i = 0; i < nargs; i++)
>   	    {
>   	      gimplify_arg (&CALL_EXPR_ARG (*from_p, i), pre_p,
>   			    EXPR_LOCATION (*from_p));
>   	      vargs.quick_push (CALL_EXPR_ARG (*from_p, i));
>   	    }
> +	  if (ifn == IFN_VA_ARG)
> +	    ap_copy = CALL_EXPR_ARG (*from_p, 0);
>   	  call_stmt = gimple_build_call_internal_vec (ifn, vargs);
>   	  gimple_set_location (call_stmt, EXPR_LOCATION (*expr_p));
>   	}
> @@ -4753,6 +4779,17 @@ gimplify_modify_expr (tree *expr_p, gimple_seq *pre_p, gimple_seq *post_p,
>     gsi = gsi_last (*pre_p);
>     maybe_fold_stmt (&gsi);
>
> +  /* When gimplifying the &ap argument of va_arg, we might end up with
> +       ap.1 = ap
> +       va_arg (&ap.1, 0B)
> +     We need to assign ap.1 back to ap, otherwise va_arg has no effect on
> +     ap.  */
> +  if (ap != NULL_TREE
> +      && TREE_CODE (ap) == ADDR_EXPR
> +      && TREE_CODE (ap_copy) == ADDR_EXPR
> +      && TREE_OPERAND (ap, 0) != TREE_OPERAND (ap_copy, 0))
> +    gimplify_assign (TREE_OPERAND (ap, 0), TREE_OPERAND (ap_copy, 0), pre_p);
> +
>     if (want_value)
>       {
>         *expr_p = TREE_THIS_VOLATILE (*to_p) ? *from_p : unshare_expr (*to_p);
> @@ -9290,16 +9327,53 @@ dummy_object (tree type)
>     return build2 (MEM_REF, type, t, t);
>   }
>
> +/* Call the target expander for evaluating a va_arg call of VALIST
> +   and TYPE.  */
> +
> +tree
> +gimplify_va_arg_internal (tree valist, tree type, location_t loc,
> +			  gimple_seq *pre_p, gimple_seq *post_p)
> +{
> +  tree have_va_type = TREE_TYPE (valist);
> +  tree cano_type = targetm.canonical_va_list_type (have_va_type);
> +
> +  if (cano_type != NULL_TREE)
> +    have_va_type = cano_type;
> +
> +  /* Make it easier for the backends by protecting the valist argument
> +     from multiple evaluations.  */
> +  if (TREE_CODE (have_va_type) == ARRAY_TYPE)
> +    {
> +      /* For this case, the backends will be expecting a pointer to
> +	 TREE_TYPE (abi), but it's possible we've
> +	 actually been given an array (an actual TARGET_FN_ABI_VA_LIST).
> +	 So fix it.  */
> +      if (TREE_CODE (TREE_TYPE (valist)) == ARRAY_TYPE)
> +	{
> +	  tree p1 = build_pointer_type (TREE_TYPE (have_va_type));
> +	  valist = fold_convert_loc (loc, p1,
> +				     build_fold_addr_expr_loc (loc, valist));
> +	}
> +
> +      gimplify_expr (&valist, pre_p, post_p, is_gimple_val, fb_rvalue);
> +    }
> +  else
> +    gimplify_expr (&valist, pre_p, post_p, is_gimple_min_lval, fb_lvalue);
> +
> +  return targetm.gimplify_va_arg_expr (valist, type, pre_p, post_p);
> +}
> +
>   /* Gimplify __builtin_va_arg, aka VA_ARG_EXPR, which is not really a
>      builtin function, but a very special sort of operator.  */
>
>   enum gimplify_status
> -gimplify_va_arg_expr (tree *expr_p, gimple_seq *pre_p, gimple_seq *post_p)
> +gimplify_va_arg_expr (tree *expr_p, gimple_seq *pre_p,
> +		      gimple_seq *post_p ATTRIBUTE_UNUSED)
>   {
>     tree promoted_type, have_va_type;
>     tree valist = TREE_OPERAND (*expr_p, 0);
>     tree type = TREE_TYPE (*expr_p);
> -  tree t;
> +  tree t, tag, ap;
>     location_t loc = EXPR_LOCATION (*expr_p);
>
>     /* Verify that valist is of the proper type.  */
> @@ -9351,36 +9425,13 @@ gimplify_va_arg_expr (tree *expr_p, gimple_seq *pre_p, gimple_seq *post_p)
>         *expr_p = dummy_object (type);
>         return GS_ALL_DONE;
>       }
> -  else
> -    {
> -      /* Make it easier for the backends by protecting the valist argument
> -	 from multiple evaluations.  */
> -      if (TREE_CODE (have_va_type) == ARRAY_TYPE)
> -	{
> -	  /* For this case, the backends will be expecting a pointer to
> -	     TREE_TYPE (abi), but it's possible we've
> -	     actually been given an array (an actual TARGET_FN_ABI_VA_LIST).
> -	     So fix it.  */
> -	  if (TREE_CODE (TREE_TYPE (valist)) == ARRAY_TYPE)
> -	    {
> -	      tree p1 = build_pointer_type (TREE_TYPE (have_va_type));
> -	      valist = fold_convert_loc (loc, p1,
> -					 build_fold_addr_expr_loc (loc, valist));
> -	    }
> -
> -	  gimplify_expr (&valist, pre_p, post_p, is_gimple_val, fb_rvalue);
> -	}
> -      else
> -	gimplify_expr (&valist, pre_p, post_p, is_gimple_min_lval, fb_lvalue);
>
> -      if (!targetm.gimplify_va_arg_expr)
> -	/* FIXME: Once most targets are converted we should merely
> -	   assert this is non-null.  */
> -	return GS_ALL_DONE;
> +  /* Transform a VA_ARG_EXPR into an VA_ARG internal function.  */
> +  ap = build_fold_addr_expr_loc (loc, valist);
> +  tag = build_int_cst (build_pointer_type (type), 0);
> +  *expr_p = build_call_expr_internal_loc (loc, IFN_VA_ARG, type, 2, ap, tag);
>
> -      *expr_p = targetm.gimplify_va_arg_expr (valist, type, pre_p, post_p);
> -      return GS_OK;
> -    }
> +  return GS_OK;
>   }
>
>   /* Build a new GIMPLE_ASSIGN tuple and append it to the end of *SEQ_P.
> diff --git a/gcc/gimplify.h b/gcc/gimplify.h
> index 615925c..bad8e0f 100644
> --- a/gcc/gimplify.h
> +++ b/gcc/gimplify.h
> @@ -82,6 +82,8 @@ extern void gimplify_function_tree (tree);
>   extern enum gimplify_status gimplify_va_arg_expr (tree *, gimple_seq *,
>   						  gimple_seq *);
>   gimple gimplify_assign (tree, tree, gimple_seq *);
> +extern tree gimplify_va_arg_internal (tree, tree, location_t, gimple_seq *,
> +				      gimple_seq *);
>
>   /* Return true if gimplify_one_sizepos doesn't need to gimplify
>      expr (when in TYPE_SIZE{,_UNIT} and similar type/decl size/bitsize
> diff --git a/gcc/internal-fn.c b/gcc/internal-fn.c
> index e402825..0053ed9 100644
> --- a/gcc/internal-fn.c
> +++ b/gcc/internal-fn.c
> @@ -1972,6 +1972,15 @@ expand_BUILTIN_EXPECT (gcall *stmt)
>       emit_move_insn (target, val);
>   }
>
> +/* IFN_VA_ARG is supposed to be expanded at pass_stdarg.  So this dummy function
> +   should never be called.  */
> +
> +static void
> +expand_VA_ARG (gcall *stmt ATTRIBUTE_UNUSED)
> +{
> +  gcc_unreachable ();
> +}
> +
>   /* Routines to expand each internal function, indexed by function number.
>      Each routine has the prototype:
>
> diff --git a/gcc/internal-fn.def b/gcc/internal-fn.def
> index 032ce6c..f557c64 100644
> --- a/gcc/internal-fn.def
> +++ b/gcc/internal-fn.def
> @@ -62,3 +62,4 @@ DEF_INTERNAL_FN (ADD_OVERFLOW, ECF_CONST | ECF_LEAF | ECF_NOTHROW, NULL)
>   DEF_INTERNAL_FN (SUB_OVERFLOW, ECF_CONST | ECF_LEAF | ECF_NOTHROW, NULL)
>   DEF_INTERNAL_FN (MUL_OVERFLOW, ECF_CONST | ECF_LEAF | ECF_NOTHROW, NULL)
>   DEF_INTERNAL_FN (TSAN_FUNC_EXIT, ECF_NOVOPS | ECF_LEAF | ECF_NOTHROW, NULL)
> +DEF_INTERNAL_FN (VA_ARG, 0, NULL)
> diff --git a/gcc/passes.def b/gcc/passes.def
> index 2bc5dcd..b9d396f 100644
> --- a/gcc/passes.def
> +++ b/gcc/passes.def
> @@ -342,6 +342,7 @@ along with GCC; see the file COPYING3.  If not see
>         NEXT_PASS (pass_tm_edges);
>     POP_INSERT_PASSES ()
>     NEXT_PASS (pass_vtable_verify);
> +  NEXT_PASS (pass_lower_vaarg);
>     NEXT_PASS (pass_lower_vector);
>     NEXT_PASS (pass_lower_complex_O0);
>     NEXT_PASS (pass_asan_O0);
> diff --git a/gcc/testsuite/gcc.dg/tree-ssa/stdarg-2.c b/gcc/testsuite/gcc.dg/tree-ssa/stdarg-2.c
> index fe39da3..5a74280 100644
> --- a/gcc/testsuite/gcc.dg/tree-ssa/stdarg-2.c
> +++ b/gcc/testsuite/gcc.dg/tree-ssa/stdarg-2.c
> @@ -288,9 +288,9 @@ f15 (int i, ...)
>     f15_1 (ap);
>     va_end (ap);
>   }
> -/* { dg-final { scan-tree-dump "f15: va_list escapes 0, needs to save \[148\] GPR units and \[1-9\]\[0-9\]* FPR units" "stdarg" { target { { i?86-*-* x86_64-*-* } && { ! { ia32 || llp64 } } } } } } */
> +/* { dg-final { scan-tree-dump "f15: va_list escapes 0, needs to save \[148\] GPR units and \[1-9\]\[0-9\]* FPR units" "stdarg" { target { { i?86-*-* } && { ! { ia32 || llp64 } } } } } } */
>   /* { dg-final { scan-tree-dump "f15: va_list escapes 0, needs to save \[148\] GPR units and \[1-9\]\[0-9\]* FPR units" "stdarg" { target { powerpc*-*-linux* && { powerpc_fprs && ilp32 } } } } } */
> -/* { dg-final { scan-tree-dump "f15: va_list escapes 1, needs to save all GPR units and all FPR units" "stdarg" { target alpha*-*-linux* } } } */
> +/* { dg-final { scan-tree-dump "f15: va_list escapes 1, needs to save all GPR units and all FPR units" "stdarg" { target { { alpha*-*-linux* } || { { x86_64-*-* } && { ! { ia32 || llp64 } } } } } } } */
>   /* { dg-final { scan-tree-dump "f15: va_list escapes 0, needs to save 1 GPR units and 2 FPR units" "stdarg" { target s390*-*-linux* } } } */
>   /* { dg-final { scan-tree-dump-not "f15: va_list escapes 0, needs to save 0 GPR units" "stdarg" { target { { i?86-*-* x86_64-*-* } && ia32 } } } } */
>   /* { dg-final { scan-tree-dump-not "f15: va_list escapes 0, needs to save 0 GPR units" "stdarg" { target ia64-*-* } } } */
> diff --git a/gcc/tree-pass.h b/gcc/tree-pass.h
> index b59ae7a..bcefb3e 100644
> --- a/gcc/tree-pass.h
> +++ b/gcc/tree-pass.h
> @@ -221,6 +221,7 @@ protected:
>   #define PROP_loops		(1 << 11)	/* preserve loop structures */
>   #define PROP_gimple_lvec	(1 << 12)       /* lowered vector */
>   #define PROP_gimple_eomp	(1 << 13)       /* no OpenMP directives */
> +#define PROP_gimple_lva		(1 << 14)       /* No va_arg internal function.  */
>
>   #define PROP_trees \
>     (PROP_gimple_any | PROP_gimple_lcf | PROP_gimple_leh | PROP_gimple_lomp)
> @@ -588,6 +589,7 @@ extern gimple_opt_pass *make_pass_early_inline (gcc::context *ctxt);
>   extern gimple_opt_pass *make_pass_inline_parameters (gcc::context *ctxt);
>   extern gimple_opt_pass *make_pass_update_address_taken (gcc::context *ctxt);
>   extern gimple_opt_pass *make_pass_convert_switch (gcc::context *ctxt);
> +extern gimple_opt_pass *make_pass_lower_vaarg (gcc::context *ctxt);
>
>   /* Current optimization pass.  */
>   extern opt_pass *current_pass;
> diff --git a/gcc/tree-stdarg.c b/gcc/tree-stdarg.c
> index 8d221a4..16a9e2c 100644
> --- a/gcc/tree-stdarg.c
> +++ b/gcc/tree-stdarg.c
> @@ -52,11 +52,14 @@ along with GCC; see the file COPYING3.  If not see
>   #include "gimple-iterator.h"
>   #include "gimple-walk.h"
>   #include "gimple-ssa.h"
> +#include "gimplify.h"
>   #include "tree-phinodes.h"
>   #include "ssa-iterators.h"
>   #include "stringpool.h"
>   #include "tree-ssanames.h"
> +#include "tree-into-ssa.h"
>   #include "sbitmap.h"
> +#include "tree-cfg.h"
>   #include "tree-pass.h"
>   #include "tree-stdarg.h"
>
> @@ -1016,6 +1019,112 @@ finish:
>       }
>   }
>
> +/* Return true if STMT is IFN_VA_ARG.  */
> +
> +static bool
> +gimple_call_ifn_va_arg_p (gimple stmt)
> +{
> +  return (is_gimple_call (stmt)
> +	  && gimple_call_internal_p (stmt)
> +	  && gimple_call_internal_fn (stmt) == IFN_VA_ARG);
> +}
> +
> +/* Expand IFN_VA_ARGs in FUN.  */
> +
> +static void
> +expand_ifn_va_arg_1 (function *fun)
> +{
> +  bool modified = false;
> +  basic_block bb;
> +  gimple_stmt_iterator i;
> +
> +  FOR_EACH_BB_FN (bb, fun)
> +    for (i = gsi_start_bb (bb); !gsi_end_p (i); gsi_next (&i))
> +      {
> +	gimple stmt = gsi_stmt (i);
> +	tree ap, expr, lhs, type;
> +	gimple_seq pre = NULL, post = NULL;
> +
> +	if (!gimple_call_ifn_va_arg_p (stmt))
> +	  continue;
> +
> +	modified = true;
> +
> +	type = TREE_TYPE (TREE_TYPE (gimple_call_arg (stmt, 1)));
> +	ap = gimple_call_arg (stmt, 0);
> +	ap = build_fold_indirect_ref (ap);
> +
> +	push_gimplify_context (false);
> +
> +	expr = gimplify_va_arg_internal (ap, type, gimple_location (stmt),
> +					 &pre, &post);
> +
> +	lhs = gimple_call_lhs (stmt);
> +	if (lhs != NULL_TREE)
> +	  {
> +	    gcc_assert (useless_type_conversion_p (TREE_TYPE (lhs), type));
> +
> +	    if (gimple_call_num_args (stmt) == 3)
> +	      {
> +		/* We've transported the size of with WITH_SIZE_EXPR here as
> +		   the 3rd argument of the internal fn call.  Now reinstate
> +		   it.  */
> +		tree size = gimple_call_arg (stmt, 2);
> +		expr = build2 (WITH_SIZE_EXPR, TREE_TYPE (expr), expr, size);
> +	      }
> +
> +	    /* We use gimplify_assign here, rather than gimple_build_assign,
> +	       because gimple_assign knows how to deal with variable-sized
> +	       types.  */
> +	    gimplify_assign (lhs, expr, &pre);
> +	  }
> +
> +	pop_gimplify_context (NULL);
> +
> +	gimple_seq_add_seq (&pre, post);
> +	update_modified_stmts (pre);
> +
> +	/* Add the sequence after IFN_VA_ARG.  This splits the bb right
> +	   after IFN_VA_ARG, and adds the sequence in one or more new bbs
> +	   inbetween.  */
> +	gimple_find_sub_bbs (pre, &i);
> +
> +	/* Remove the IFN_VA_ARG gimple_call.  It's the last stmt in the
> +	   bb.  */
> +	gsi_remove (&i, true);
> +	gcc_assert (gsi_end_p (i));
> +
> +	/* We're walking here into the bbs which contain the expansion of
> +	   IFN_VA_ARG, and will not contain another IFN_VA_ARG that needs
> +	   expanding.  We could try to skip walking these bbs, perhaps by
> +	   walking backwards over gimples and bbs.  */
> +	break;
> +      }
> +
> +  if (!modified)
> +    return;
> +
> +  free_dominance_info (CDI_DOMINATORS);
> +  update_ssa (TODO_update_ssa);
> +}
> +
> +/* Expand IFN_VA_ARGs in FUN, if necessary.  */
> +
> +static void
> +expand_ifn_va_arg (function *fun)
> +{
> +  if ((fun->curr_properties & PROP_gimple_lva) == 0)
> +    expand_ifn_va_arg_1 (fun);
> +
> +#if ENABLE_CHECKING
> +  basic_block bb;
> +  gimple_stmt_iterator i;
> +  FOR_EACH_BB_FN (bb, fun)
> +    for (i = gsi_start_bb (bb); !gsi_end_p (i); gsi_next (&i))
> +      gcc_assert (!gimple_call_ifn_va_arg_p (gsi_stmt (i)));
> +#endif
> +}
> +
>   namespace {
>
>   const pass_data pass_data_stdarg =
> @@ -1025,7 +1134,7 @@ const pass_data pass_data_stdarg =
>     OPTGROUP_NONE, /* optinfo_flags */
>     TV_NONE, /* tv_id */
>     ( PROP_cfg | PROP_ssa ), /* properties_required */
> -  0, /* properties_provided */
> +  PROP_gimple_lva, /* properties_provided */
>     0, /* properties_destroyed */
>     0, /* todo_flags_start */
>     0, /* todo_flags_finish */
> @@ -1039,18 +1148,13 @@ public:
>     {}
>
>     /* opt_pass methods: */
> -  virtual bool gate (function *fun)
> +  virtual bool gate (function *)
>       {
> -      return (flag_stdarg_opt
> -#ifdef ACCEL_COMPILER
> -	      /* Disable for GCC5 in the offloading compilers, as
> -		 va_list and gpr/fpr counter fields are not merged.
> -		 In GCC6 when stdarg is lowered late this shouldn't be
> -		 an issue.  */
> -	      && !in_lto_p
> -#endif
> -	      /* This optimization is only for stdarg functions.  */
> -	      && fun->stdarg != 0);
> +      /* Always run this pass, in order to expand va_arg internal_fns.  We
> +	 also need to do that if fun->stdarg == 0, because a va_arg may also
> +	 occur in a function without varargs, f.i. if when passing a va_list to
> +	 another function.  */
> +      return true;
>       }
>
>     virtual unsigned int execute (function *);
> @@ -1060,7 +1164,14 @@ public:
>   unsigned int
>   pass_stdarg::execute (function *fun)
>   {
> -  optimize_va_list_gpr_fpr_size (fun);
> +  /* TODO: Postpone expand_ifn_va_arg till after
> +     optimize_va_list_gpr_fpr_size.  */
> +  expand_ifn_va_arg (fun);
> +
> +  if (flag_stdarg_opt
> +      /* This optimization is only for stdarg functions.  */
> +      && fun->stdarg != 0)
> +    optimize_va_list_gpr_fpr_size (fun);
>
>     return 0;
>   }
> @@ -1072,3 +1183,50 @@ make_pass_stdarg (gcc::context *ctxt)
>   {
>     return new pass_stdarg (ctxt);
>   }
> +
> +namespace {
> +
> +const pass_data pass_data_lower_vaarg =
> +{
> +  GIMPLE_PASS, /* type */
> +  "lower_vaarg", /* name */
> +  OPTGROUP_NONE, /* optinfo_flags */
> +  TV_NONE, /* tv_id */
> +  ( PROP_cfg | PROP_ssa ), /* properties_required */
> +  PROP_gimple_lva, /* properties_provided */
> +  0, /* properties_destroyed */
> +  0, /* todo_flags_start */
> +  0, /* todo_flags_finish */
> +};
> +
> +class pass_lower_vaarg : public gimple_opt_pass
> +{
> +public:
> +  pass_lower_vaarg (gcc::context *ctxt)
> +    : gimple_opt_pass (pass_data_lower_vaarg, ctxt)
> +  {}
> +
> +  /* opt_pass methods: */
> +  virtual bool gate (function *)
> +    {
> +      return (cfun->curr_properties & PROP_gimple_lva) == 0;
> +    }
> +
> +  virtual unsigned int execute (function *);
> +
> +}; // class pass_lower_vaarg
> +
> +unsigned int
> +pass_lower_vaarg::execute (function *fun)
> +{
> +  expand_ifn_va_arg (fun);
> +  return 0;
> +}
> +
> +} // anon namespace
> +
> +gimple_opt_pass *
> +make_pass_lower_vaarg (gcc::context *ctxt)
> +{
> +  return new pass_lower_vaarg (ctxt);
> +}
> -- 1.9.1
>
Tom de Vries April 16, 2015, 8:49 a.m. UTC | #3
[stage1 ping^2]
On 10-03-15 16:30, Tom de Vries wrote:
> [stage1 ping]
> On 22-02-15 14:13, Tom de Vries wrote:
>> On 19-02-15 14:03, Richard Biener wrote:
>>> On Thu, 19 Feb 2015, Tom de Vries wrote:
>>>
>>>> On 19-02-15 11:29, Tom de Vries wrote:
>>>>> Hi,
>>>>>
>>>>> I'm posting this patch series for stage1:
>>>>> - 0001-Disable-lang_hooks.gimplify_expr-in-free_lang_data.patch
>>>>> - 0002-Add-gimple_find_sub_bbs.patch
>>>>> - 0003-Factor-optimize_va_list_gpr_fpr_size-out-of-pass_std.patch
>>>>> - 0004-Handle-internal_fn-in-operand_equal_p.patch
>>>>> - 0005-Postpone-expanding-va_arg-until-pass_stdarg.patch
>>>>>
>>>>> The patch series - based on Michael's initial patch - postpones expanding
>>>>> va_arg
>>>>> until pass_stdarg, which makes pass_stdarg more robust.
>>>>>
>>>>> Bootstrapped and reg-tested on x86_64 using all languages, with unix/ and
>>>>> unix/-m32 testing.
>>>>>
>>>>> I'll post the patches in reply to this email.
>>>>>
>>>>
>>>> This patch postpones expanding va_arg until pass_stdarg.
>>>>
>>>> We add a new internal function IFN_VA_ARG. During gimplification, we map
>>>> VA_ARG_EXPR onto a CALL_EXPR with IFN_VA_ARG, which is then gimplified in to a
>>>> gimple_call. At pass_stdarg, we expand the IFN_VA_ARG gimple_call into actual
>>>> code.
>>>>
>>>> There are a few implementation details worth mentioning:
>>>> - passing the type beyond gimplification is done by adding a NULL pointer-
>>>>    to-type to IFN_VA_ARG.
>>>> - there is special handling for IFN_VA_ARG that would be most suited to be
>>>>    placed in gimplify_va_arg_expr. However, that function lacks the scope for
>>>>    the special handling, so it's placed awkwardly in gimplify_modify_expr.
>>>> - there's special handling in case the va_arg type is variable-sized.
>>>>    gimplify_modify_expr adds a WITH_SIZE_EXPR to the CALL_EXPR IFN_VA_ARG for
>>>>    variable-sized types. However, this is gimplified into a gimple_call which
>>>>    does not have the possibility to wrap it's result in a WITH_SIZE_EXPR. So
>>>>    we're adding the size argument of the WITH_SIZE_EXPR as argument to
>>>>    IFN_VA_ARG, and at expansion in pass_stdarg, wrap the result of the
>>>>    gimplification of IFN_VA_ARG in a WITH_SIZE_EXPR, such that the subsequent
>>>>    gimplify_assign will generate a memcpy if necessary.
>>>> - when gimplifying the va_arg argument ap, it may not be addressable. So
>>>>    gimplification will generate a copy ap.1 = ap, and use &ap.1 as argument.
>>>>    This means that we have to copy back the ap.1 value to ap after IFN_VA_ARG.
>>>>    The copy is classified by the va_list_gpr/fpr_size optimization as an
>>>>    escape,  so it inhibits optimization. The tree-ssa/stdarg-2.c f15 update is
>>>>    because of that.
>>>>
>>>> OK for stage1?
>>>
>>> Looks mostly good, though it looks like with -O0 this doesn't delay
>>> lowering of va-arg and thus won't "fix" offloading.  Can you instead
>>> introduce a PROP_gimple_lva, provide it by the stdarg pass and add
>>> a pass_lower_vaarg somewhere where pass_lower_complex_O0 is run
>>> that runs of !PROP_gimple_lva (and also provides it), and require
>>> PROP_gimple_lva by pass_expand?  (just look for PROP_gimple_lcx for
>>> the complex stuff to get an idea what needs to be touched)
>>>
>>
>> Updated according to comments.
>>
>> Furthermore (having updated the patch series to recent trunk), I'm dropping the
>> ACCEL_COMPILER bit in pass_stdarg::gate. AFAIU the comment there relates to this
>> patch.
>>
>> Retested as before.
>>
>> OK for stage1?
>>
>
> Ping.

Ping again.

Patch originally posted at: 
https://gcc.gnu.org/ml/gcc-patches/2015-02/msg01332.html .

Thanks,
- Tom

>> Btw, I'm wondering if as run-time optimization we can tentatively set
>> PROP_gimple_lva at the start of the gimple pass, and unset it in
>> gimplify_va_arg_expr. That way we would avoid the loop in expand_ifn_va_arg_1
>> (over all bbs and gimples) in functions without va_arg.
>>
>
> Taken care of in follow-up patch 5b.
Richard Biener April 16, 2015, 8:55 a.m. UTC | #4
On Thu, 16 Apr 2015, Tom de Vries wrote:

> [stage1 ping^2]
> On 10-03-15 16:30, Tom de Vries wrote:
> > [stage1 ping]
> > On 22-02-15 14:13, Tom de Vries wrote:
> > > On 19-02-15 14:03, Richard Biener wrote:
> > > > On Thu, 19 Feb 2015, Tom de Vries wrote:
> > > > 
> > > > > On 19-02-15 11:29, Tom de Vries wrote:
> > > > > > Hi,
> > > > > > 
> > > > > > I'm posting this patch series for stage1:
> > > > > > - 0001-Disable-lang_hooks.gimplify_expr-in-free_lang_data.patch
> > > > > > - 0002-Add-gimple_find_sub_bbs.patch
> > > > > > - 0003-Factor-optimize_va_list_gpr_fpr_size-out-of-pass_std.patch
> > > > > > - 0004-Handle-internal_fn-in-operand_equal_p.patch
> > > > > > - 0005-Postpone-expanding-va_arg-until-pass_stdarg.patch
> > > > > > 
> > > > > > The patch series - based on Michael's initial patch - postpones
> > > > > > expanding
> > > > > > va_arg
> > > > > > until pass_stdarg, which makes pass_stdarg more robust.
> > > > > > 
> > > > > > Bootstrapped and reg-tested on x86_64 using all languages, with
> > > > > > unix/ and
> > > > > > unix/-m32 testing.
> > > > > > 
> > > > > > I'll post the patches in reply to this email.
> > > > > > 
> > > > > 
> > > > > This patch postpones expanding va_arg until pass_stdarg.
> > > > > 
> > > > > We add a new internal function IFN_VA_ARG. During gimplification, we
> > > > > map
> > > > > VA_ARG_EXPR onto a CALL_EXPR with IFN_VA_ARG, which is then gimplified
> > > > > in to a
> > > > > gimple_call. At pass_stdarg, we expand the IFN_VA_ARG gimple_call into
> > > > > actual
> > > > > code.
> > > > > 
> > > > > There are a few implementation details worth mentioning:
> > > > > - passing the type beyond gimplification is done by adding a NULL
> > > > > pointer-
> > > > >    to-type to IFN_VA_ARG.
> > > > > - there is special handling for IFN_VA_ARG that would be most suited
> > > > > to be
> > > > >    placed in gimplify_va_arg_expr. However, that function lacks the
> > > > > scope for
> > > > >    the special handling, so it's placed awkwardly in
> > > > > gimplify_modify_expr.
> > > > > - there's special handling in case the va_arg type is variable-sized.
> > > > >    gimplify_modify_expr adds a WITH_SIZE_EXPR to the CALL_EXPR
> > > > > IFN_VA_ARG for
> > > > >    variable-sized types. However, this is gimplified into a
> > > > > gimple_call which
> > > > >    does not have the possibility to wrap it's result in a
> > > > > WITH_SIZE_EXPR. So
> > > > >    we're adding the size argument of the WITH_SIZE_EXPR as argument to
> > > > >    IFN_VA_ARG, and at expansion in pass_stdarg, wrap the result of the
> > > > >    gimplification of IFN_VA_ARG in a WITH_SIZE_EXPR, such that the
> > > > > subsequent
> > > > >    gimplify_assign will generate a memcpy if necessary.
> > > > > - when gimplifying the va_arg argument ap, it may not be addressable.
> > > > > So
> > > > >    gimplification will generate a copy ap.1 = ap, and use &ap.1 as
> > > > > argument.
> > > > >    This means that we have to copy back the ap.1 value to ap after
> > > > > IFN_VA_ARG.
> > > > >    The copy is classified by the va_list_gpr/fpr_size optimization as
> > > > > an
> > > > >    escape,  so it inhibits optimization. The tree-ssa/stdarg-2.c f15
> > > > > update is
> > > > >    because of that.
> > > > > 
> > > > > OK for stage1?
> > > > 
> > > > Looks mostly good, though it looks like with -O0 this doesn't delay
> > > > lowering of va-arg and thus won't "fix" offloading.  Can you instead
> > > > introduce a PROP_gimple_lva, provide it by the stdarg pass and add
> > > > a pass_lower_vaarg somewhere where pass_lower_complex_O0 is run
> > > > that runs of !PROP_gimple_lva (and also provides it), and require
> > > > PROP_gimple_lva by pass_expand?  (just look for PROP_gimple_lcx for
> > > > the complex stuff to get an idea what needs to be touched)
> > > > 
> > > 
> > > Updated according to comments.
> > > 
> > > Furthermore (having updated the patch series to recent trunk), I'm
> > > dropping the
> > > ACCEL_COMPILER bit in pass_stdarg::gate. AFAIU the comment there relates
> > > to this
> > > patch.
> > > 
> > > Retested as before.
> > > 
> > > OK for stage1?
> > > 
> > 
> > Ping.
> 
> Ping again.
> 
> Patch originally posted at:
> https://gcc.gnu.org/ml/gcc-patches/2015-02/msg01332.html .

Ok.

Thanks,
Richard.

> Thanks,
> - Tom
> 
> > > Btw, I'm wondering if as run-time optimization we can tentatively set
> > > PROP_gimple_lva at the start of the gimple pass, and unset it in
> > > gimplify_va_arg_expr. That way we would avoid the loop in
> > > expand_ifn_va_arg_1
> > > (over all bbs and gimples) in functions without va_arg.
> > > 
> > 
> > Taken care of in follow-up patch 5b.
Bin.Cheng April 21, 2015, 5:32 a.m. UTC | #5
On Thu, Apr 16, 2015 at 4:55 PM, Richard Biener <rguenther@suse.de> wrote:
> On Thu, 16 Apr 2015, Tom de Vries wrote:
>
>> [stage1 ping^2]
>> On 10-03-15 16:30, Tom de Vries wrote:
>> > [stage1 ping]
>> > On 22-02-15 14:13, Tom de Vries wrote:
>> > > On 19-02-15 14:03, Richard Biener wrote:
>> > > > On Thu, 19 Feb 2015, Tom de Vries wrote:
>> > > >
>> > > > > On 19-02-15 11:29, Tom de Vries wrote:
>> > > > > > Hi,
>> > > > > >
>> > > > > > I'm posting this patch series for stage1:
>> > > > > > - 0001-Disable-lang_hooks.gimplify_expr-in-free_lang_data.patch
>> > > > > > - 0002-Add-gimple_find_sub_bbs.patch
>> > > > > > - 0003-Factor-optimize_va_list_gpr_fpr_size-out-of-pass_std.patch
>> > > > > > - 0004-Handle-internal_fn-in-operand_equal_p.patch
>> > > > > > - 0005-Postpone-expanding-va_arg-until-pass_stdarg.patch
>> > > > > >
>> > > > > > The patch series - based on Michael's initial patch - postpones
>> > > > > > expanding
>> > > > > > va_arg
>> > > > > > until pass_stdarg, which makes pass_stdarg more robust.
>> > > > > >
>> > > > > > Bootstrapped and reg-tested on x86_64 using all languages, with
>> > > > > > unix/ and
>> > > > > > unix/-m32 testing.
>> > > > > >
>> > > > > > I'll post the patches in reply to this email.
>> > > > > >
>> > > > >
>> > > > > This patch postpones expanding va_arg until pass_stdarg.
>> > > > >
>> > > > > We add a new internal function IFN_VA_ARG. During gimplification, we
>> > > > > map
>> > > > > VA_ARG_EXPR onto a CALL_EXPR with IFN_VA_ARG, which is then gimplified
>> > > > > in to a
>> > > > > gimple_call. At pass_stdarg, we expand the IFN_VA_ARG gimple_call into
>> > > > > actual
>> > > > > code.
>> > > > >
>> > > > > There are a few implementation details worth mentioning:
>> > > > > - passing the type beyond gimplification is done by adding a NULL
>> > > > > pointer-
>> > > > >    to-type to IFN_VA_ARG.
>> > > > > - there is special handling for IFN_VA_ARG that would be most suited
>> > > > > to be
>> > > > >    placed in gimplify_va_arg_expr. However, that function lacks the
>> > > > > scope for
>> > > > >    the special handling, so it's placed awkwardly in
>> > > > > gimplify_modify_expr.
>> > > > > - there's special handling in case the va_arg type is variable-sized.
>> > > > >    gimplify_modify_expr adds a WITH_SIZE_EXPR to the CALL_EXPR
>> > > > > IFN_VA_ARG for
>> > > > >    variable-sized types. However, this is gimplified into a
>> > > > > gimple_call which
>> > > > >    does not have the possibility to wrap it's result in a
>> > > > > WITH_SIZE_EXPR. So
>> > > > >    we're adding the size argument of the WITH_SIZE_EXPR as argument to
>> > > > >    IFN_VA_ARG, and at expansion in pass_stdarg, wrap the result of the
>> > > > >    gimplification of IFN_VA_ARG in a WITH_SIZE_EXPR, such that the
>> > > > > subsequent
>> > > > >    gimplify_assign will generate a memcpy if necessary.
>> > > > > - when gimplifying the va_arg argument ap, it may not be addressable.
>> > > > > So
>> > > > >    gimplification will generate a copy ap.1 = ap, and use &ap.1 as
>> > > > > argument.
>> > > > >    This means that we have to copy back the ap.1 value to ap after
>> > > > > IFN_VA_ARG.
>> > > > >    The copy is classified by the va_list_gpr/fpr_size optimization as
>> > > > > an
>> > > > >    escape,  so it inhibits optimization. The tree-ssa/stdarg-2.c f15
>> > > > > update is
>> > > > >    because of that.
>> > > > >
>> > > > > OK for stage1?
>> > > >
>> > > > Looks mostly good, though it looks like with -O0 this doesn't delay
>> > > > lowering of va-arg and thus won't "fix" offloading.  Can you instead
>> > > > introduce a PROP_gimple_lva, provide it by the stdarg pass and add
>> > > > a pass_lower_vaarg somewhere where pass_lower_complex_O0 is run
>> > > > that runs of !PROP_gimple_lva (and also provides it), and require
>> > > > PROP_gimple_lva by pass_expand?  (just look for PROP_gimple_lcx for
>> > > > the complex stuff to get an idea what needs to be touched)
>> > > >
>> > >
>> > > Updated according to comments.
>> > >
>> > > Furthermore (having updated the patch series to recent trunk), I'm
>> > > dropping the
>> > > ACCEL_COMPILER bit in pass_stdarg::gate. AFAIU the comment there relates
>> > > to this
>> > > patch.
>> > >
>> > > Retested as before.
>> > >
>> > > OK for stage1?
>> > >
>> >
>> > Ping.
>>
>> Ping again.
>>
>> Patch originally posted at:
>> https://gcc.gnu.org/ml/gcc-patches/2015-02/msg01332.html .
>
> Ok.
>
> Thanks,
> Richard.
>
>> Thanks,
>> - Tom
>>
>> > > Btw, I'm wondering if as run-time optimization we can tentatively set
>> > > PROP_gimple_lva at the start of the gimple pass, and unset it in
>> > > gimplify_va_arg_expr. That way we would avoid the loop in
>> > > expand_ifn_va_arg_1
>> > > (over all bbs and gimples) in functions without va_arg.
>> > >
>> >
>> > Taken care of in follow-up patch 5b.

Hi,
This patch causes stdarg test failure on aarch64 for both linux and
elf variants.  Seems wrong code is generated.

FAIL: gcc.c-torture/execute/stdarg-1.c   -O0  execution test
FAIL: gcc.c-torture/execute/stdarg-1.c   -O1  execution test
FAIL: gcc.c-torture/execute/stdarg-1.c   -O2  execution test
FAIL: gcc.c-torture/execute/stdarg-1.c   -O2 -flto
-fno-use-linker-plugin -flto-partition=none  execution test
FAIL: gcc.c-torture/execute/stdarg-1.c   -O2 -flto -fuse-linker-plugin
-fno-fat-lto-objects  execution test
FAIL: gcc.c-torture/execute/stdarg-1.c   -O3 -fomit-frame-pointer
execution test
FAIL: gcc.c-torture/execute/stdarg-1.c   -O3 -g  execution test
FAIL: gcc.c-torture/execute/stdarg-1.c   -Os  execution test
FAIL: gcc.c-torture/execute/va-arg-11.c   -O0  execution test
FAIL: gcc.c-torture/execute/va-arg-11.c   -O1  execution test
FAIL: gcc.c-torture/execute/va-arg-11.c   -O2  execution test
FAIL: gcc.c-torture/execute/va-arg-11.c   -O2 -flto
-fno-use-linker-plugin -flto-partition=none  execution test
FAIL: gcc.c-torture/execute/va-arg-11.c   -O2 -flto
-fuse-linker-plugin -fno-fat-lto-objects  execution test
FAIL: gcc.c-torture/execute/va-arg-11.c   -O3 -fomit-frame-pointer
execution test
FAIL: gcc.c-torture/execute/va-arg-11.c   -O3 -fomit-frame-pointer
-funroll-all-loops -finline-functions  execution test
FAIL: gcc.c-torture/execute/va-arg-11.c   -O3 -fomit-frame-pointer
-funroll-loops  execution test
FAIL: gcc.c-torture/execute/va-arg-11.c   -O3 -g  execution test
FAIL: gcc.c-torture/execute/va-arg-11.c   -Os  execution test

It may also causes ICE on arm-none-linux-gnueabihf.

compiler exited with status 1
output is:
/projects/.../src/gcc/gcc/testsuite/gcc.c-torture/execute/stdarg-2.c:
In function 'f3':
/projects/.../src/gcc/gcc/testsuite/gcc.c-torture/execute/stdarg-2.c:61:1:
error: incorrect sharing of tree nodes
aps[4]
# .MEM_5 = VDEF <.MEM_11>
aps[4] = aps[4];
/projects/.../src/gcc/gcc/testsuite/gcc.c-torture/execute/stdarg-2.c:61:1:
internal compiler error: verify_gimple failed
0xae4893 verify_gimple_in_cfg(function*, bool)
    /projects/.../src/gcc/gcc/tree-cfg.c:5136
0x9e3f2f execute_function_todo
    /projects/.../src/gcc/gcc/passes.c:1946
0x9e48d3 execute_todo
    /projects/.../src/gcc/gcc/passes.c:2003
Please submit a full bug report,
with preprocessed source if appropriate.
Please include the complete backtrace with any bug report.
See <http://gcc.gnu.org/bugs.html> for instructions.

FAIL: gcc.c-torture/execute/stdarg-2.c   -O0  (internal compiler error)
FAIL: gcc.c-torture/execute/stdarg-2.c   -O0  (test for excess errors)

Thanks,
bin
Alan Lawrence June 9, 2015, 10:57 a.m. UTC | #6
Hmmm. One side effect of this is that the line number information available in 
the target hook gimplify_va_arg_expr, is now just the name of the containing 
function, rather than the specific use of va_arg. Is there some way to get this 
more precise location (e.g. gimple_location(stmt) in expand_ifn_va_arg_1, the 
only caller of said hook)? I don't really want to have to add an extra parameter 
to the target hook...

--Alan

Richard Biener wrote:
> On Thu, 16 Apr 2015, Tom de Vries wrote:
> 
>> [stage1 ping^2]
>> On 10-03-15 16:30, Tom de Vries wrote:
>>> [stage1 ping]
>>> On 22-02-15 14:13, Tom de Vries wrote:
>>>> On 19-02-15 14:03, Richard Biener wrote:
>>>>> On Thu, 19 Feb 2015, Tom de Vries wrote:
>>>>>
>>>>>> On 19-02-15 11:29, Tom de Vries wrote:
>>>>>>> Hi,
>>>>>>>
>>>>>>> I'm posting this patch series for stage1:
>>>>>>> - 0001-Disable-lang_hooks.gimplify_expr-in-free_lang_data.patch
>>>>>>> - 0002-Add-gimple_find_sub_bbs.patch
>>>>>>> - 0003-Factor-optimize_va_list_gpr_fpr_size-out-of-pass_std.patch
>>>>>>> - 0004-Handle-internal_fn-in-operand_equal_p.patch
>>>>>>> - 0005-Postpone-expanding-va_arg-until-pass_stdarg.patch
>>>>>>>
>>>>>>> The patch series - based on Michael's initial patch - postpones
>>>>>>> expanding
>>>>>>> va_arg
>>>>>>> until pass_stdarg, which makes pass_stdarg more robust.
>>>>>>>
>>>>>>> Bootstrapped and reg-tested on x86_64 using all languages, with
>>>>>>> unix/ and
>>>>>>> unix/-m32 testing.
>>>>>>>
>>>>>>> I'll post the patches in reply to this email.
>>>>>>>
>>>>>> This patch postpones expanding va_arg until pass_stdarg.
>>>>>>
>>>>>> We add a new internal function IFN_VA_ARG. During gimplification, we
>>>>>> map
>>>>>> VA_ARG_EXPR onto a CALL_EXPR with IFN_VA_ARG, which is then gimplified
>>>>>> in to a
>>>>>> gimple_call. At pass_stdarg, we expand the IFN_VA_ARG gimple_call into
>>>>>> actual
>>>>>> code.
>>>>>>
>>>>>> There are a few implementation details worth mentioning:
>>>>>> - passing the type beyond gimplification is done by adding a NULL
>>>>>> pointer-
>>>>>>    to-type to IFN_VA_ARG.
>>>>>> - there is special handling for IFN_VA_ARG that would be most suited
>>>>>> to be
>>>>>>    placed in gimplify_va_arg_expr. However, that function lacks the
>>>>>> scope for
>>>>>>    the special handling, so it's placed awkwardly in
>>>>>> gimplify_modify_expr.
>>>>>> - there's special handling in case the va_arg type is variable-sized.
>>>>>>    gimplify_modify_expr adds a WITH_SIZE_EXPR to the CALL_EXPR
>>>>>> IFN_VA_ARG for
>>>>>>    variable-sized types. However, this is gimplified into a
>>>>>> gimple_call which
>>>>>>    does not have the possibility to wrap it's result in a
>>>>>> WITH_SIZE_EXPR. So
>>>>>>    we're adding the size argument of the WITH_SIZE_EXPR as argument to
>>>>>>    IFN_VA_ARG, and at expansion in pass_stdarg, wrap the result of the
>>>>>>    gimplification of IFN_VA_ARG in a WITH_SIZE_EXPR, such that the
>>>>>> subsequent
>>>>>>    gimplify_assign will generate a memcpy if necessary.
>>>>>> - when gimplifying the va_arg argument ap, it may not be addressable.
>>>>>> So
>>>>>>    gimplification will generate a copy ap.1 = ap, and use &ap.1 as
>>>>>> argument.
>>>>>>    This means that we have to copy back the ap.1 value to ap after
>>>>>> IFN_VA_ARG.
>>>>>>    The copy is classified by the va_list_gpr/fpr_size optimization as
>>>>>> an
>>>>>>    escape,  so it inhibits optimization. The tree-ssa/stdarg-2.c f15
>>>>>> update is
>>>>>>    because of that.
>>>>>>
>>>>>> OK for stage1?
>>>>> Looks mostly good, though it looks like with -O0 this doesn't delay
>>>>> lowering of va-arg and thus won't "fix" offloading.  Can you instead
>>>>> introduce a PROP_gimple_lva, provide it by the stdarg pass and add
>>>>> a pass_lower_vaarg somewhere where pass_lower_complex_O0 is run
>>>>> that runs of !PROP_gimple_lva (and also provides it), and require
>>>>> PROP_gimple_lva by pass_expand?  (just look for PROP_gimple_lcx for
>>>>> the complex stuff to get an idea what needs to be touched)
>>>>>
>>>> Updated according to comments.
>>>>
>>>> Furthermore (having updated the patch series to recent trunk), I'm
>>>> dropping the
>>>> ACCEL_COMPILER bit in pass_stdarg::gate. AFAIU the comment there relates
>>>> to this
>>>> patch.
>>>>
>>>> Retested as before.
>>>>
>>>> OK for stage1?
>>>>
>>> Ping.
>> Ping again.
>>
>> Patch originally posted at:
>> https://gcc.gnu.org/ml/gcc-patches/2015-02/msg01332.html .
> 
> Ok.
> 
> Thanks,
> Richard.
> 
>> Thanks,
>> - Tom
>>
>>>> Btw, I'm wondering if as run-time optimization we can tentatively set
>>>> PROP_gimple_lva at the start of the gimple pass, and unset it in
>>>> gimplify_va_arg_expr. That way we would avoid the loop in
>>>> expand_ifn_va_arg_1
>>>> (over all bbs and gimples) in functions without va_arg.
>>>>
>>> Taken care of in follow-up patch 5b.
> 
>
Richard Biener June 9, 2015, 11:03 a.m. UTC | #7
On Tue, 9 Jun 2015, Alan Lawrence wrote:

> Hmmm. One side effect of this is that the line number information available in
> the target hook gimplify_va_arg_expr, is now just the name of the containing
> function, rather than the specific use of va_arg. Is there some way to get
> this more precise location (e.g. gimple_location(stmt) in expand_ifn_va_arg_1,
> the only caller of said hook)? I don't really want to have to add an extra
> parameter to the target hook...

The x86 variant doesn't use any locations but if then the caller of
the target hook (expand_ifn_va_arg_1) should assign the IFNs location
to all statements expanded from it (it could set input_location to 
that during the target hook call...)

Richard.

> --Alan
> 
> Richard Biener wrote:
> > On Thu, 16 Apr 2015, Tom de Vries wrote:
> > 
> > > [stage1 ping^2]
> > > On 10-03-15 16:30, Tom de Vries wrote:
> > > > [stage1 ping]
> > > > On 22-02-15 14:13, Tom de Vries wrote:
> > > > > On 19-02-15 14:03, Richard Biener wrote:
> > > > > > On Thu, 19 Feb 2015, Tom de Vries wrote:
> > > > > > 
> > > > > > > On 19-02-15 11:29, Tom de Vries wrote:
> > > > > > > > Hi,
> > > > > > > > 
> > > > > > > > I'm posting this patch series for stage1:
> > > > > > > > - 0001-Disable-lang_hooks.gimplify_expr-in-free_lang_data.patch
> > > > > > > > - 0002-Add-gimple_find_sub_bbs.patch
> > > > > > > > -
> > > > > > > > 0003-Factor-optimize_va_list_gpr_fpr_size-out-of-pass_std.patch
> > > > > > > > - 0004-Handle-internal_fn-in-operand_equal_p.patch
> > > > > > > > - 0005-Postpone-expanding-va_arg-until-pass_stdarg.patch
> > > > > > > > 
> > > > > > > > The patch series - based on Michael's initial patch - postpones
> > > > > > > > expanding
> > > > > > > > va_arg
> > > > > > > > until pass_stdarg, which makes pass_stdarg more robust.
> > > > > > > > 
> > > > > > > > Bootstrapped and reg-tested on x86_64 using all languages, with
> > > > > > > > unix/ and
> > > > > > > > unix/-m32 testing.
> > > > > > > > 
> > > > > > > > I'll post the patches in reply to this email.
> > > > > > > > 
> > > > > > > This patch postpones expanding va_arg until pass_stdarg.
> > > > > > > 
> > > > > > > We add a new internal function IFN_VA_ARG. During gimplification,
> > > > > > > we
> > > > > > > map
> > > > > > > VA_ARG_EXPR onto a CALL_EXPR with IFN_VA_ARG, which is then
> > > > > > > gimplified
> > > > > > > in to a
> > > > > > > gimple_call. At pass_stdarg, we expand the IFN_VA_ARG gimple_call
> > > > > > > into
> > > > > > > actual
> > > > > > > code.
> > > > > > > 
> > > > > > > There are a few implementation details worth mentioning:
> > > > > > > - passing the type beyond gimplification is done by adding a NULL
> > > > > > > pointer-
> > > > > > >    to-type to IFN_VA_ARG.
> > > > > > > - there is special handling for IFN_VA_ARG that would be most
> > > > > > > suited
> > > > > > > to be
> > > > > > >    placed in gimplify_va_arg_expr. However, that function lacks
> > > > > > > the
> > > > > > > scope for
> > > > > > >    the special handling, so it's placed awkwardly in
> > > > > > > gimplify_modify_expr.
> > > > > > > - there's special handling in case the va_arg type is
> > > > > > > variable-sized.
> > > > > > >    gimplify_modify_expr adds a WITH_SIZE_EXPR to the CALL_EXPR
> > > > > > > IFN_VA_ARG for
> > > > > > >    variable-sized types. However, this is gimplified into a
> > > > > > > gimple_call which
> > > > > > >    does not have the possibility to wrap it's result in a
> > > > > > > WITH_SIZE_EXPR. So
> > > > > > >    we're adding the size argument of the WITH_SIZE_EXPR as
> > > > > > > argument to
> > > > > > >    IFN_VA_ARG, and at expansion in pass_stdarg, wrap the result of
> > > > > > > the
> > > > > > >    gimplification of IFN_VA_ARG in a WITH_SIZE_EXPR, such that the
> > > > > > > subsequent
> > > > > > >    gimplify_assign will generate a memcpy if necessary.
> > > > > > > - when gimplifying the va_arg argument ap, it may not be
> > > > > > > addressable.
> > > > > > > So
> > > > > > >    gimplification will generate a copy ap.1 = ap, and use &ap.1 as
> > > > > > > argument.
> > > > > > >    This means that we have to copy back the ap.1 value to ap after
> > > > > > > IFN_VA_ARG.
> > > > > > >    The copy is classified by the va_list_gpr/fpr_size optimization
> > > > > > > as
> > > > > > > an
> > > > > > >    escape,  so it inhibits optimization. The tree-ssa/stdarg-2.c
> > > > > > > f15
> > > > > > > update is
> > > > > > >    because of that.
> > > > > > > 
> > > > > > > OK for stage1?
> > > > > > Looks mostly good, though it looks like with -O0 this doesn't delay
> > > > > > lowering of va-arg and thus won't "fix" offloading.  Can you instead
> > > > > > introduce a PROP_gimple_lva, provide it by the stdarg pass and add
> > > > > > a pass_lower_vaarg somewhere where pass_lower_complex_O0 is run
> > > > > > that runs of !PROP_gimple_lva (and also provides it), and require
> > > > > > PROP_gimple_lva by pass_expand?  (just look for PROP_gimple_lcx for
> > > > > > the complex stuff to get an idea what needs to be touched)
> > > > > > 
> > > > > Updated according to comments.
> > > > > 
> > > > > Furthermore (having updated the patch series to recent trunk), I'm
> > > > > dropping the
> > > > > ACCEL_COMPILER bit in pass_stdarg::gate. AFAIU the comment there
> > > > > relates
> > > > > to this
> > > > > patch.
> > > > > 
> > > > > Retested as before.
> > > > > 
> > > > > OK for stage1?
> > > > > 
> > > > Ping.
> > > Ping again.
> > > 
> > > Patch originally posted at:
> > > https://gcc.gnu.org/ml/gcc-patches/2015-02/msg01332.html .
> > 
> > Ok.
> > 
> > Thanks,
> > Richard.
> > 
> > > Thanks,
> > > - Tom
> > > 
> > > > > Btw, I'm wondering if as run-time optimization we can tentatively set
> > > > > PROP_gimple_lva at the start of the gimple pass, and unset it in
> > > > > gimplify_va_arg_expr. That way we would avoid the loop in
> > > > > expand_ifn_va_arg_1
> > > > > (over all bbs and gimples) in functions without va_arg.
> > > > > 
> > > > Taken care of in follow-up patch 5b.
> > 
> > 
> 
>
diff mbox

Patch

2015-02-17  Tom de Vries  <tom@codesourcery.com>
	    Michael Matz  <matz@suse.de>

	* gimple-iterator.c (update_modified_stmts): Remove static.
	* gimple-iterator.h (update_modified_stmts): Declare.
	* gimplify.c (gimplify_modify_expr): Handle IFN_VA_ARG.
	(gimplify_va_arg_internal): New function.
	(gimplify_va_arg_expr): Use IFN_VA_ARG.
	* gimplify.h (gimplify_va_arg_internal): Declare.
	* internal-fn.c (expand_VA_ARG): New unreachable function.
	* internal-fn.def (VA_ARG): New DEF_INTERNAL_FN.
	* tree-stdarg.c (gimple_call_ifn_va_arg_p, expand_ifn_va_arg_1)
	(expand_ifn_va_arg): New function.
	(pass_data_stdarg): Add PROP_gimple_lva to properties_provided field.
	(pass_stdarg::execute): Call expand_ifn_va_arg.
	(pass_data_lower_vaarg): New pass_data.
	(pass_lower_vaarg): New gimple_opt_pass.
	(pass_lower_vaarg::gate, pass_lower_vaarg::execute)
	(make_pass_lower_vaarg): New function.
	* cfgexpand.c (pass_data_expand): Add PROP_gimple_lva to
	properties_required field.
	* passes.def (all_passes): Add pass_lower_vaarg.
	* tree-pass.h (PROP_gimple_lva): Add define.
	(make_pass_lower_vaarg): Declare.

	* gcc.dg/tree-ssa/stdarg-2.c: Change f15 scan-tree-dump for target
	x86_64-*-*.
---
 gcc/cfgexpand.c                          |   3 +-
 gcc/gimple-iterator.c                    |   2 +-
 gcc/gimple-iterator.h                    |   1 +
 gcc/gimplify.c                           | 111 ++++++++++++++-----
 gcc/gimplify.h                           |   2 +
 gcc/internal-fn.c                        |   9 ++
 gcc/internal-fn.def                      |   1 +
 gcc/passes.def                           |   1 +
 gcc/testsuite/gcc.dg/tree-ssa/stdarg-2.c |   4 +-
 gcc/tree-pass.h                          |   2 +
 gcc/tree-stdarg.c                        | 184 ++++++++++++++++++++++++++++---
 11 files changed, 273 insertions(+), 47 deletions(-)

diff --git a/gcc/cfgexpand.c b/gcc/cfgexpand.c
index 7dfe1f6..af5a652 100644
--- a/gcc/cfgexpand.c
+++ b/gcc/cfgexpand.c
@@ -5860,7 +5860,8 @@  const pass_data pass_data_expand =
   TV_EXPAND, /* tv_id */
   ( PROP_ssa | PROP_gimple_leh | PROP_cfg
     | PROP_gimple_lcx
-    | PROP_gimple_lvec ), /* properties_required */
+    | PROP_gimple_lvec
+    | PROP_gimple_lva), /* properties_required */
   PROP_rtl, /* properties_provided */
   ( PROP_ssa | PROP_trees ), /* properties_destroyed */
   0, /* todo_flags_start */
diff --git a/gcc/gimple-iterator.c b/gcc/gimple-iterator.c
index a322390..df29123 100644
--- a/gcc/gimple-iterator.c
+++ b/gcc/gimple-iterator.c
@@ -72,7 +72,7 @@  update_modified_stmt (gimple stmt)
 
 /* Mark the statements in SEQ as modified, and update them.  */
 
-static void
+void
 update_modified_stmts (gimple_seq seq)
 {
   gimple_stmt_iterator gsi;
diff --git a/gcc/gimple-iterator.h b/gcc/gimple-iterator.h
index 6be88dd..ab5759e 100644
--- a/gcc/gimple-iterator.h
+++ b/gcc/gimple-iterator.h
@@ -90,6 +90,7 @@  extern basic_block gsi_insert_seq_on_edge_immediate (edge, gimple_seq);
 extern void gsi_commit_edge_inserts (void);
 extern void gsi_commit_one_edge_insert (edge, basic_block *);
 extern gphi_iterator gsi_start_phis (basic_block);
+extern void update_modified_stmts (gimple_seq);
 
 /* Return a new iterator pointing to GIMPLE_SEQ's first statement.  */
 
diff --git a/gcc/gimplify.c b/gcc/gimplify.c
index 1353ada..8ac6a35 100644
--- a/gcc/gimplify.c
+++ b/gcc/gimplify.c
@@ -4564,6 +4564,7 @@  gimplify_modify_expr (tree *expr_p, gimple_seq *pre_p, gimple_seq *post_p,
   gimple assign;
   location_t loc = EXPR_LOCATION (*expr_p);
   gimple_stmt_iterator gsi;
+  tree ap = NULL_TREE, ap_copy = NULL_TREE;
 
   gcc_assert (TREE_CODE (*expr_p) == MODIFY_EXPR
 	      || TREE_CODE (*expr_p) == INIT_EXPR);
@@ -4640,6 +4641,27 @@  gimplify_modify_expr (tree *expr_p, gimple_seq *pre_p, gimple_seq *post_p,
   if (ret == GS_ERROR)
     return ret;
 
+  /* In case of va_arg internal fn wrappped in a WITH_SIZE_EXPR, add the type
+     size as argument to the the call.  */
+  if (TREE_CODE (*from_p) == WITH_SIZE_EXPR)
+    {
+      tree call = TREE_OPERAND (*from_p, 0);
+      tree vlasize = TREE_OPERAND (*from_p, 1);
+
+      if (TREE_CODE (call) == CALL_EXPR
+	  && CALL_EXPR_IFN (call) == IFN_VA_ARG)
+	{
+	  tree type = TREE_TYPE (call);
+	  tree ap = CALL_EXPR_ARG (call, 0);
+	  tree tag = CALL_EXPR_ARG (call, 1);
+	  tree newcall = build_call_expr_internal_loc (EXPR_LOCATION (call),
+						       IFN_VA_ARG, type, 3, ap,
+						       tag, vlasize);
+	  tree *call_p = &(TREE_OPERAND (*from_p, 0));
+	  *call_p = newcall;
+	}
+    }
+
   /* Now see if the above changed *from_p to something we handle specially.  */
   ret = gimplify_modify_expr_rhs (expr_p, from_p, to_p, pre_p, post_p,
 				  want_value);
@@ -4703,12 +4725,16 @@  gimplify_modify_expr (tree *expr_p, gimple_seq *pre_p, gimple_seq *post_p,
 	  enum internal_fn ifn = CALL_EXPR_IFN (*from_p);
 	  auto_vec<tree> vargs (nargs);
 
+	  if (ifn == IFN_VA_ARG)
+	    ap = unshare_expr (CALL_EXPR_ARG (*from_p, 0));
 	  for (i = 0; i < nargs; i++)
 	    {
 	      gimplify_arg (&CALL_EXPR_ARG (*from_p, i), pre_p,
 			    EXPR_LOCATION (*from_p));
 	      vargs.quick_push (CALL_EXPR_ARG (*from_p, i));
 	    }
+	  if (ifn == IFN_VA_ARG)
+	    ap_copy = CALL_EXPR_ARG (*from_p, 0);
 	  call_stmt = gimple_build_call_internal_vec (ifn, vargs);
 	  gimple_set_location (call_stmt, EXPR_LOCATION (*expr_p));
 	}
@@ -4753,6 +4779,17 @@  gimplify_modify_expr (tree *expr_p, gimple_seq *pre_p, gimple_seq *post_p,
   gsi = gsi_last (*pre_p);
   maybe_fold_stmt (&gsi);
 
+  /* When gimplifying the &ap argument of va_arg, we might end up with
+       ap.1 = ap
+       va_arg (&ap.1, 0B)
+     We need to assign ap.1 back to ap, otherwise va_arg has no effect on
+     ap.  */
+  if (ap != NULL_TREE
+      && TREE_CODE (ap) == ADDR_EXPR
+      && TREE_CODE (ap_copy) == ADDR_EXPR
+      && TREE_OPERAND (ap, 0) != TREE_OPERAND (ap_copy, 0))
+    gimplify_assign (TREE_OPERAND (ap, 0), TREE_OPERAND (ap_copy, 0), pre_p);
+
   if (want_value)
     {
       *expr_p = TREE_THIS_VOLATILE (*to_p) ? *from_p : unshare_expr (*to_p);
@@ -9290,16 +9327,53 @@  dummy_object (tree type)
   return build2 (MEM_REF, type, t, t);
 }
 
+/* Call the target expander for evaluating a va_arg call of VALIST
+   and TYPE.  */
+
+tree
+gimplify_va_arg_internal (tree valist, tree type, location_t loc,
+			  gimple_seq *pre_p, gimple_seq *post_p)
+{
+  tree have_va_type = TREE_TYPE (valist);
+  tree cano_type = targetm.canonical_va_list_type (have_va_type);
+
+  if (cano_type != NULL_TREE)
+    have_va_type = cano_type;
+
+  /* Make it easier for the backends by protecting the valist argument
+     from multiple evaluations.  */
+  if (TREE_CODE (have_va_type) == ARRAY_TYPE)
+    {
+      /* For this case, the backends will be expecting a pointer to
+	 TREE_TYPE (abi), but it's possible we've
+	 actually been given an array (an actual TARGET_FN_ABI_VA_LIST).
+	 So fix it.  */
+      if (TREE_CODE (TREE_TYPE (valist)) == ARRAY_TYPE)
+	{
+	  tree p1 = build_pointer_type (TREE_TYPE (have_va_type));
+	  valist = fold_convert_loc (loc, p1,
+				     build_fold_addr_expr_loc (loc, valist));
+	}
+
+      gimplify_expr (&valist, pre_p, post_p, is_gimple_val, fb_rvalue);
+    }
+  else
+    gimplify_expr (&valist, pre_p, post_p, is_gimple_min_lval, fb_lvalue);
+
+  return targetm.gimplify_va_arg_expr (valist, type, pre_p, post_p);
+}
+
 /* Gimplify __builtin_va_arg, aka VA_ARG_EXPR, which is not really a
    builtin function, but a very special sort of operator.  */
 
 enum gimplify_status
-gimplify_va_arg_expr (tree *expr_p, gimple_seq *pre_p, gimple_seq *post_p)
+gimplify_va_arg_expr (tree *expr_p, gimple_seq *pre_p,
+		      gimple_seq *post_p ATTRIBUTE_UNUSED)
 {
   tree promoted_type, have_va_type;
   tree valist = TREE_OPERAND (*expr_p, 0);
   tree type = TREE_TYPE (*expr_p);
-  tree t;
+  tree t, tag, ap;
   location_t loc = EXPR_LOCATION (*expr_p);
 
   /* Verify that valist is of the proper type.  */
@@ -9351,36 +9425,13 @@  gimplify_va_arg_expr (tree *expr_p, gimple_seq *pre_p, gimple_seq *post_p)
       *expr_p = dummy_object (type);
       return GS_ALL_DONE;
     }
-  else
-    {
-      /* Make it easier for the backends by protecting the valist argument
-	 from multiple evaluations.  */
-      if (TREE_CODE (have_va_type) == ARRAY_TYPE)
-	{
-	  /* For this case, the backends will be expecting a pointer to
-	     TREE_TYPE (abi), but it's possible we've
-	     actually been given an array (an actual TARGET_FN_ABI_VA_LIST).
-	     So fix it.  */
-	  if (TREE_CODE (TREE_TYPE (valist)) == ARRAY_TYPE)
-	    {
-	      tree p1 = build_pointer_type (TREE_TYPE (have_va_type));
-	      valist = fold_convert_loc (loc, p1,
-					 build_fold_addr_expr_loc (loc, valist));
-	    }
-
-	  gimplify_expr (&valist, pre_p, post_p, is_gimple_val, fb_rvalue);
-	}
-      else
-	gimplify_expr (&valist, pre_p, post_p, is_gimple_min_lval, fb_lvalue);
 
-      if (!targetm.gimplify_va_arg_expr)
-	/* FIXME: Once most targets are converted we should merely
-	   assert this is non-null.  */
-	return GS_ALL_DONE;
+  /* Transform a VA_ARG_EXPR into an VA_ARG internal function.  */
+  ap = build_fold_addr_expr_loc (loc, valist);
+  tag = build_int_cst (build_pointer_type (type), 0);
+  *expr_p = build_call_expr_internal_loc (loc, IFN_VA_ARG, type, 2, ap, tag);
 
-      *expr_p = targetm.gimplify_va_arg_expr (valist, type, pre_p, post_p);
-      return GS_OK;
-    }
+  return GS_OK;
 }
 
 /* Build a new GIMPLE_ASSIGN tuple and append it to the end of *SEQ_P.
diff --git a/gcc/gimplify.h b/gcc/gimplify.h
index 615925c..bad8e0f 100644
--- a/gcc/gimplify.h
+++ b/gcc/gimplify.h
@@ -82,6 +82,8 @@  extern void gimplify_function_tree (tree);
 extern enum gimplify_status gimplify_va_arg_expr (tree *, gimple_seq *,
 						  gimple_seq *);
 gimple gimplify_assign (tree, tree, gimple_seq *);
+extern tree gimplify_va_arg_internal (tree, tree, location_t, gimple_seq *,
+				      gimple_seq *);
 
 /* Return true if gimplify_one_sizepos doesn't need to gimplify
    expr (when in TYPE_SIZE{,_UNIT} and similar type/decl size/bitsize
diff --git a/gcc/internal-fn.c b/gcc/internal-fn.c
index e402825..0053ed9 100644
--- a/gcc/internal-fn.c
+++ b/gcc/internal-fn.c
@@ -1972,6 +1972,15 @@  expand_BUILTIN_EXPECT (gcall *stmt)
     emit_move_insn (target, val);
 }
 
+/* IFN_VA_ARG is supposed to be expanded at pass_stdarg.  So this dummy function
+   should never be called.  */
+
+static void
+expand_VA_ARG (gcall *stmt ATTRIBUTE_UNUSED)
+{
+  gcc_unreachable ();
+}
+
 /* Routines to expand each internal function, indexed by function number.
    Each routine has the prototype:
 
diff --git a/gcc/internal-fn.def b/gcc/internal-fn.def
index 032ce6c..f557c64 100644
--- a/gcc/internal-fn.def
+++ b/gcc/internal-fn.def
@@ -62,3 +62,4 @@  DEF_INTERNAL_FN (ADD_OVERFLOW, ECF_CONST | ECF_LEAF | ECF_NOTHROW, NULL)
 DEF_INTERNAL_FN (SUB_OVERFLOW, ECF_CONST | ECF_LEAF | ECF_NOTHROW, NULL)
 DEF_INTERNAL_FN (MUL_OVERFLOW, ECF_CONST | ECF_LEAF | ECF_NOTHROW, NULL)
 DEF_INTERNAL_FN (TSAN_FUNC_EXIT, ECF_NOVOPS | ECF_LEAF | ECF_NOTHROW, NULL)
+DEF_INTERNAL_FN (VA_ARG, 0, NULL)
diff --git a/gcc/passes.def b/gcc/passes.def
index 2bc5dcd..b9d396f 100644
--- a/gcc/passes.def
+++ b/gcc/passes.def
@@ -342,6 +342,7 @@  along with GCC; see the file COPYING3.  If not see
       NEXT_PASS (pass_tm_edges);
   POP_INSERT_PASSES ()
   NEXT_PASS (pass_vtable_verify);
+  NEXT_PASS (pass_lower_vaarg);
   NEXT_PASS (pass_lower_vector);
   NEXT_PASS (pass_lower_complex_O0);
   NEXT_PASS (pass_asan_O0);
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/stdarg-2.c b/gcc/testsuite/gcc.dg/tree-ssa/stdarg-2.c
index fe39da3..5a74280 100644
--- a/gcc/testsuite/gcc.dg/tree-ssa/stdarg-2.c
+++ b/gcc/testsuite/gcc.dg/tree-ssa/stdarg-2.c
@@ -288,9 +288,9 @@  f15 (int i, ...)
   f15_1 (ap);
   va_end (ap);
 }
-/* { dg-final { scan-tree-dump "f15: va_list escapes 0, needs to save \[148\] GPR units and \[1-9\]\[0-9\]* FPR units" "stdarg" { target { { i?86-*-* x86_64-*-* } && { ! { ia32 || llp64 } } } } } } */
+/* { dg-final { scan-tree-dump "f15: va_list escapes 0, needs to save \[148\] GPR units and \[1-9\]\[0-9\]* FPR units" "stdarg" { target { { i?86-*-* } && { ! { ia32 || llp64 } } } } } } */
 /* { dg-final { scan-tree-dump "f15: va_list escapes 0, needs to save \[148\] GPR units and \[1-9\]\[0-9\]* FPR units" "stdarg" { target { powerpc*-*-linux* && { powerpc_fprs && ilp32 } } } } } */
-/* { dg-final { scan-tree-dump "f15: va_list escapes 1, needs to save all GPR units and all FPR units" "stdarg" { target alpha*-*-linux* } } } */
+/* { dg-final { scan-tree-dump "f15: va_list escapes 1, needs to save all GPR units and all FPR units" "stdarg" { target { { alpha*-*-linux* } || { { x86_64-*-* } && { ! { ia32 || llp64 } } } } } } } */
 /* { dg-final { scan-tree-dump "f15: va_list escapes 0, needs to save 1 GPR units and 2 FPR units" "stdarg" { target s390*-*-linux* } } } */
 /* { dg-final { scan-tree-dump-not "f15: va_list escapes 0, needs to save 0 GPR units" "stdarg" { target { { i?86-*-* x86_64-*-* } && ia32 } } } } */
 /* { dg-final { scan-tree-dump-not "f15: va_list escapes 0, needs to save 0 GPR units" "stdarg" { target ia64-*-* } } } */
diff --git a/gcc/tree-pass.h b/gcc/tree-pass.h
index b59ae7a..bcefb3e 100644
--- a/gcc/tree-pass.h
+++ b/gcc/tree-pass.h
@@ -221,6 +221,7 @@  protected:
 #define PROP_loops		(1 << 11)	/* preserve loop structures */
 #define PROP_gimple_lvec	(1 << 12)       /* lowered vector */
 #define PROP_gimple_eomp	(1 << 13)       /* no OpenMP directives */
+#define PROP_gimple_lva		(1 << 14)       /* No va_arg internal function.  */
 
 #define PROP_trees \
   (PROP_gimple_any | PROP_gimple_lcf | PROP_gimple_leh | PROP_gimple_lomp)
@@ -588,6 +589,7 @@  extern gimple_opt_pass *make_pass_early_inline (gcc::context *ctxt);
 extern gimple_opt_pass *make_pass_inline_parameters (gcc::context *ctxt);
 extern gimple_opt_pass *make_pass_update_address_taken (gcc::context *ctxt);
 extern gimple_opt_pass *make_pass_convert_switch (gcc::context *ctxt);
+extern gimple_opt_pass *make_pass_lower_vaarg (gcc::context *ctxt);
 
 /* Current optimization pass.  */
 extern opt_pass *current_pass;
diff --git a/gcc/tree-stdarg.c b/gcc/tree-stdarg.c
index 8d221a4..16a9e2c 100644
--- a/gcc/tree-stdarg.c
+++ b/gcc/tree-stdarg.c
@@ -52,11 +52,14 @@  along with GCC; see the file COPYING3.  If not see
 #include "gimple-iterator.h"
 #include "gimple-walk.h"
 #include "gimple-ssa.h"
+#include "gimplify.h"
 #include "tree-phinodes.h"
 #include "ssa-iterators.h"
 #include "stringpool.h"
 #include "tree-ssanames.h"
+#include "tree-into-ssa.h"
 #include "sbitmap.h"
+#include "tree-cfg.h"
 #include "tree-pass.h"
 #include "tree-stdarg.h"
 
@@ -1016,6 +1019,112 @@  finish:
     }
 }
 
+/* Return true if STMT is IFN_VA_ARG.  */
+
+static bool
+gimple_call_ifn_va_arg_p (gimple stmt)
+{
+  return (is_gimple_call (stmt)
+	  && gimple_call_internal_p (stmt)
+	  && gimple_call_internal_fn (stmt) == IFN_VA_ARG);
+}
+
+/* Expand IFN_VA_ARGs in FUN.  */
+
+static void
+expand_ifn_va_arg_1 (function *fun)
+{
+  bool modified = false;
+  basic_block bb;
+  gimple_stmt_iterator i;
+
+  FOR_EACH_BB_FN (bb, fun)
+    for (i = gsi_start_bb (bb); !gsi_end_p (i); gsi_next (&i))
+      {
+	gimple stmt = gsi_stmt (i);
+	tree ap, expr, lhs, type;
+	gimple_seq pre = NULL, post = NULL;
+
+	if (!gimple_call_ifn_va_arg_p (stmt))
+	  continue;
+
+	modified = true;
+
+	type = TREE_TYPE (TREE_TYPE (gimple_call_arg (stmt, 1)));
+	ap = gimple_call_arg (stmt, 0);
+	ap = build_fold_indirect_ref (ap);
+
+	push_gimplify_context (false);
+
+	expr = gimplify_va_arg_internal (ap, type, gimple_location (stmt),
+					 &pre, &post);
+
+	lhs = gimple_call_lhs (stmt);
+	if (lhs != NULL_TREE)
+	  {
+	    gcc_assert (useless_type_conversion_p (TREE_TYPE (lhs), type));
+
+	    if (gimple_call_num_args (stmt) == 3)
+	      {
+		/* We've transported the size of with WITH_SIZE_EXPR here as
+		   the 3rd argument of the internal fn call.  Now reinstate
+		   it.  */
+		tree size = gimple_call_arg (stmt, 2);
+		expr = build2 (WITH_SIZE_EXPR, TREE_TYPE (expr), expr, size);
+	      }
+
+	    /* We use gimplify_assign here, rather than gimple_build_assign,
+	       because gimple_assign knows how to deal with variable-sized
+	       types.  */
+	    gimplify_assign (lhs, expr, &pre);
+	  }
+
+	pop_gimplify_context (NULL);
+
+	gimple_seq_add_seq (&pre, post);
+	update_modified_stmts (pre);
+
+	/* Add the sequence after IFN_VA_ARG.  This splits the bb right
+	   after IFN_VA_ARG, and adds the sequence in one or more new bbs
+	   inbetween.  */
+	gimple_find_sub_bbs (pre, &i);
+
+	/* Remove the IFN_VA_ARG gimple_call.  It's the last stmt in the
+	   bb.  */
+	gsi_remove (&i, true);
+	gcc_assert (gsi_end_p (i));
+
+	/* We're walking here into the bbs which contain the expansion of
+	   IFN_VA_ARG, and will not contain another IFN_VA_ARG that needs
+	   expanding.  We could try to skip walking these bbs, perhaps by
+	   walking backwards over gimples and bbs.  */
+	break;
+      }
+
+  if (!modified)
+    return;
+
+  free_dominance_info (CDI_DOMINATORS);
+  update_ssa (TODO_update_ssa);
+}
+
+/* Expand IFN_VA_ARGs in FUN, if necessary.  */
+
+static void
+expand_ifn_va_arg (function *fun)
+{
+  if ((fun->curr_properties & PROP_gimple_lva) == 0)
+    expand_ifn_va_arg_1 (fun);
+
+#if ENABLE_CHECKING
+  basic_block bb;
+  gimple_stmt_iterator i;
+  FOR_EACH_BB_FN (bb, fun)
+    for (i = gsi_start_bb (bb); !gsi_end_p (i); gsi_next (&i))
+      gcc_assert (!gimple_call_ifn_va_arg_p (gsi_stmt (i)));
+#endif
+}
+
 namespace {
 
 const pass_data pass_data_stdarg =
@@ -1025,7 +1134,7 @@  const pass_data pass_data_stdarg =
   OPTGROUP_NONE, /* optinfo_flags */
   TV_NONE, /* tv_id */
   ( PROP_cfg | PROP_ssa ), /* properties_required */
-  0, /* properties_provided */
+  PROP_gimple_lva, /* properties_provided */
   0, /* properties_destroyed */
   0, /* todo_flags_start */
   0, /* todo_flags_finish */
@@ -1039,18 +1148,13 @@  public:
   {}
 
   /* opt_pass methods: */
-  virtual bool gate (function *fun)
+  virtual bool gate (function *)
     {
-      return (flag_stdarg_opt
-#ifdef ACCEL_COMPILER
-	      /* Disable for GCC5 in the offloading compilers, as
-		 va_list and gpr/fpr counter fields are not merged.
-		 In GCC6 when stdarg is lowered late this shouldn't be
-		 an issue.  */
-	      && !in_lto_p
-#endif
-	      /* This optimization is only for stdarg functions.  */
-	      && fun->stdarg != 0);
+      /* Always run this pass, in order to expand va_arg internal_fns.  We
+	 also need to do that if fun->stdarg == 0, because a va_arg may also
+	 occur in a function without varargs, f.i. if when passing a va_list to
+	 another function.  */
+      return true;
     }
 
   virtual unsigned int execute (function *);
@@ -1060,7 +1164,14 @@  public:
 unsigned int
 pass_stdarg::execute (function *fun)
 {
-  optimize_va_list_gpr_fpr_size (fun);
+  /* TODO: Postpone expand_ifn_va_arg till after
+     optimize_va_list_gpr_fpr_size.  */
+  expand_ifn_va_arg (fun);
+
+  if (flag_stdarg_opt
+      /* This optimization is only for stdarg functions.  */
+      && fun->stdarg != 0)
+    optimize_va_list_gpr_fpr_size (fun);
 
   return 0;
 }
@@ -1072,3 +1183,50 @@  make_pass_stdarg (gcc::context *ctxt)
 {
   return new pass_stdarg (ctxt);
 }
+
+namespace {
+
+const pass_data pass_data_lower_vaarg =
+{
+  GIMPLE_PASS, /* type */
+  "lower_vaarg", /* name */
+  OPTGROUP_NONE, /* optinfo_flags */
+  TV_NONE, /* tv_id */
+  ( PROP_cfg | PROP_ssa ), /* properties_required */
+  PROP_gimple_lva, /* properties_provided */
+  0, /* properties_destroyed */
+  0, /* todo_flags_start */
+  0, /* todo_flags_finish */
+};
+
+class pass_lower_vaarg : public gimple_opt_pass
+{
+public:
+  pass_lower_vaarg (gcc::context *ctxt)
+    : gimple_opt_pass (pass_data_lower_vaarg, ctxt)
+  {}
+
+  /* opt_pass methods: */
+  virtual bool gate (function *)
+    {
+      return (cfun->curr_properties & PROP_gimple_lva) == 0;
+    }
+
+  virtual unsigned int execute (function *);
+
+}; // class pass_lower_vaarg
+
+unsigned int
+pass_lower_vaarg::execute (function *fun)
+{
+  expand_ifn_va_arg (fun);
+  return 0;
+}
+
+} // anon namespace
+
+gimple_opt_pass *
+make_pass_lower_vaarg (gcc::context *ctxt)
+{
+  return new pass_lower_vaarg (ctxt);
+}
-- 
1.9.1