diff mbox

[2/3] Make early return predictor more precise.

Message ID 62ae8f2a-0e53-5177-afa4-4314f118c3cc@suse.cz
State New
Headers show

Commit Message

Martin Liška June 20, 2017, 12:13 p.m. UTC
On 06/19/2017 01:11 PM, Jan Hubicka wrote:
>> Ok, you're right that we can preserve the predictor. However, let's consider following test-case:
>>
>> static
>> int baz(int a)
>> {
>>   if (a == 1)
>>     return 1;
>>
>>   return 0;
>> }
>>
>>   
>> static
>> int bar(int a)
>> {
>>   if (a == 1)
>>     return baz(a);
>>
>>   return 0;
>> }
>>   
>> static
>> int foo(int a)
>> {
>>   if (a == 1)
>>     return bar(a);
>>
>>   return 12;
>> }
>>
>> int main(int argc, char **argv)
>> {
>>   return foo(argc);
>> }
>>
>> There after einline we have:
>>
>> main (int argc, char * * argv)
>> {
>>   int D.1832;
>>   int _3;
>>   int _4;
>>
>>   <bb 2> [100.00%]:
>>   if (argc_2(D) == 1)
>>     goto <bb 3>; [37.13%]
>>   else
>>     goto <bb 4>; [62.87%]
>>
>>   <bb 3> [37.13%]:
>>   // predicted unlikely by early return (on trees) predictor.
>>   // predicted unlikely by early return (on trees) predictor.
>>   // predicted unlikely by early return (on trees) predictor.
>>
>>   <bb 4> [100.00%]:
>>   # _3 = PHI <12(2), 1(3)>
>>   _5 = _3;
>>   _4 = _5;
>>   return _4;
>>
>> }
>>
>> I'm thinking what's the best place to merge all the predictor
>> statements?
> 
> I wonder if we need to - predictors are relatively short lived.
> In fact they do not need to hit IPA passes but they do as at a time
> I was implementing them I was worrying about introducing yet another
> global IPA pass to remove them (we can't do during early inlining
> because we want to reuse them after inlining).

Ok, so I fixed that in the described way. There's one remaining fallout of:
gcc/testsuite/gcc.dg/tree-ssa/ipa-split-5.c

Where a fnsplit is properly done, but then it's again inlined:

Considering split_me.part.0/5 with 23 size
 to be inlined into test/2 in unknown:0
 Estimated badness is -0.000001, frequency 0.33.
 Inlined split_me.part.0 into test which now has time 50.300000 and size 44, net change of +17.

Considering split_me.part.0/5 with 23 size
 to be inlined into test/2 in unknown:0
 Estimated badness is -0.000001, frequency 0.33.
 Inlined split_me.part.0 into test which now has time 70.760000 and size 61, net change of +17.

Considering split_me.part.0/5 with 23 size
 to be inlined into test/2 in unknown:0
 Estimated badness is -0.000001, frequency 0.33.
 Inlined split_me.part.0 into test which now has time 91.220000 and size 78, net change of +17.

Considering split_me.part.0/5 with 23 size
 to be inlined into test/2 in unknown:0
 Estimated badness is -0.000001, frequency 0.33.
 Inlined split_me.part.0 into test which now has time 111.680000 and size 95, net change of +17.
Unit growth for small function inlining: 61->129 (111%)

...

Any hint how to block the IPA inlining?

Sending new version of patch.
Martin

> 
> I would just move pass_strip_predict_hints pre-IPA and not worry about
> them chaining.
> 
> There is problem that after inlining the prediction may expand its scope
> and predict branch that it outside of the original function body,
> but I do not see very easy solution for that besides not re-doing
> prediction (we could also copy probabilities from the inlined function
> when they exists and honnor them in the outer function. I am not sure
> that is going to improve prediction quality though - extra context
> is probably useful)
> 
> Thanks,
> Honza
>>
>> Thanks,
>> Martin
>>
>>>
>>> Where did you found this case?
>>> Honza
>>>>  
>>>>        /* Create a new deep copy of the statement.  */
>>>>        copy = gimple_copy (stmt);
>>>> -- 
>>>> 2.13.0
>>>>

Comments

Jan Hubicka June 21, 2017, 8:26 a.m. UTC | #1
> 
> Ok, so I fixed that in the described way. There's one remaining fallout of:
> gcc/testsuite/gcc.dg/tree-ssa/ipa-split-5.c
> 
> Where a fnsplit is properly done, but then it's again inlined:
> 
> Considering split_me.part.0/5 with 23 size
>  to be inlined into test/2 in unknown:0
>  Estimated badness is -0.000001, frequency 0.33.
>  Inlined split_me.part.0 into test which now has time 50.300000 and size 44, net change of +17.
> 
> Considering split_me.part.0/5 with 23 size
>  to be inlined into test/2 in unknown:0
>  Estimated badness is -0.000001, frequency 0.33.
>  Inlined split_me.part.0 into test which now has time 70.760000 and size 61, net change of +17.
> 
> Considering split_me.part.0/5 with 23 size
>  to be inlined into test/2 in unknown:0
>  Estimated badness is -0.000001, frequency 0.33.
>  Inlined split_me.part.0 into test which now has time 91.220000 and size 78, net change of +17.
> 
> Considering split_me.part.0/5 with 23 size
>  to be inlined into test/2 in unknown:0
>  Estimated badness is -0.000001, frequency 0.33.
>  Inlined split_me.part.0 into test which now has time 111.680000 and size 95, net change of +17.
> Unit growth for small function inlining: 61->129 (111%)
> 
> ...
> 
> Any hint how to block the IPA inlining?

I guess you only want to make cold part of split_me bigger or
just use --param to reduce growth for auto inlining.

How the patch reduces split_me_part considerably?
Honza
> 
> Sending new version of patch.
> Martin
> 
> > 
> > I would just move pass_strip_predict_hints pre-IPA and not worry about
> > them chaining.
> > 
> > There is problem that after inlining the prediction may expand its scope
> > and predict branch that it outside of the original function body,
> > but I do not see very easy solution for that besides not re-doing
> > prediction (we could also copy probabilities from the inlined function
> > when they exists and honnor them in the outer function. I am not sure
> > that is going to improve prediction quality though - extra context
> > is probably useful)
> > 
> > Thanks,
> > Honza
> >>
> >> Thanks,
> >> Martin
> >>
> >>>
> >>> Where did you found this case?
> >>> Honza
> >>>>  
> >>>>        /* Create a new deep copy of the statement.  */
> >>>>        copy = gimple_copy (stmt);
> >>>> -- 
> >>>> 2.13.0
> >>>>
> 

> >From 84625a782add6ae2ed29630815b61b34a052770a Mon Sep 17 00:00:00 2001
> From: marxin <mliska@suse.cz>
> Date: Tue, 6 Jun 2017 10:55:18 +0200
> Subject: [PATCH 1/2] Make early return predictor more precise.
> 
> gcc/ChangeLog:
> 
> 2017-05-26  Martin Liska  <mliska@suse.cz>
> 
> 	PR tree-optimization/79489
> 	* gimplify.c (maybe_add_early_return_predict_stmt): New
> 	function.
> 	(gimplify_return_expr): Call the function.
> 	* predict.c (tree_estimate_probability_bb): Remove handling
> 	of early return.
> 	* predict.def: Update comment about early return predictor.
> 	* gimple-predict.h (is_gimple_predict): New function.
> 	* predict.def: Change default value of early return to 66.
> 	* tree-tailcall.c (find_tail_calls): Skip GIMPLE_PREDICT
> 	statements.
> 	* passes.def: Put pass_strip_predict_hints to the beginning of
> 	IPA passes.
> ---
>  gcc/gimple-low.c     |  2 ++
>  gcc/gimple-predict.h |  8 ++++++++
>  gcc/gimplify.c       | 16 ++++++++++++++++
>  gcc/passes.def       |  1 +
>  gcc/predict.c        | 41 -----------------------------------------
>  gcc/predict.def      | 15 +++------------
>  gcc/tree-tailcall.c  |  2 ++
>  7 files changed, 32 insertions(+), 53 deletions(-)
> 
> diff --git a/gcc/gimple-low.c b/gcc/gimple-low.c
> index 619b9d7bfb1..4ea6c3532f3 100644
> --- a/gcc/gimple-low.c
> +++ b/gcc/gimple-low.c
> @@ -30,6 +30,8 @@ along with GCC; see the file COPYING3.  If not see
>  #include "calls.h"
>  #include "gimple-iterator.h"
>  #include "gimple-low.h"
> +#include "predict.h"
> +#include "gimple-predict.h"
>  
>  /* The differences between High GIMPLE and Low GIMPLE are the
>     following:
> diff --git a/gcc/gimple-predict.h b/gcc/gimple-predict.h
> index ba58e12e9e9..0e6c2e1ea01 100644
> --- a/gcc/gimple-predict.h
> +++ b/gcc/gimple-predict.h
> @@ -80,4 +80,12 @@ gimple_build_predict (enum br_predictor predictor, enum prediction outcome)
>    return p;
>  }
>  
> +/* Return true if GS is a GIMPLE_PREDICT statement.  */
> +
> +static inline bool
> +is_gimple_predict (const gimple *gs)
> +{
> +  return gimple_code (gs) == GIMPLE_PREDICT;
> +}
> +
>  #endif  /* GCC_GIMPLE_PREDICT_H */
> diff --git a/gcc/gimplify.c b/gcc/gimplify.c
> index 9af95a28704..1c6e1591953 100644
> --- a/gcc/gimplify.c
> +++ b/gcc/gimplify.c
> @@ -1428,6 +1428,20 @@ gimplify_bind_expr (tree *expr_p, gimple_seq *pre_p)
>    return GS_ALL_DONE;
>  }
>  
> +/* Maybe add early return predict statement to PRE_P sequence.  */
> +
> +static void
> +maybe_add_early_return_predict_stmt (gimple_seq *pre_p)
> +{
> +  /* If we are not in a conditional context, add PREDICT statement.  */
> +  if (gimple_conditional_context ())
> +    {
> +      gimple *predict = gimple_build_predict (PRED_TREE_EARLY_RETURN,
> +					      NOT_TAKEN);
> +      gimplify_seq_add_stmt (pre_p, predict);
> +    }
> +}
> +
>  /* Gimplify a RETURN_EXPR.  If the expression to be returned is not a
>     GIMPLE value, it is assigned to a new temporary and the statement is
>     re-written to return the temporary.
> @@ -1458,6 +1472,7 @@ gimplify_return_expr (tree stmt, gimple_seq *pre_p)
>        || TREE_CODE (ret_expr) == RESULT_DECL
>        || ret_expr == error_mark_node)
>      {
> +      maybe_add_early_return_predict_stmt (pre_p);
>        greturn *ret = gimple_build_return (ret_expr);
>        gimple_set_no_warning (ret, TREE_NO_WARNING (stmt));
>        gimplify_seq_add_stmt (pre_p, ret);
> @@ -1525,6 +1540,7 @@ gimplify_return_expr (tree stmt, gimple_seq *pre_p)
>  
>    gimplify_and_add (TREE_OPERAND (stmt, 0), pre_p);
>  
> +  maybe_add_early_return_predict_stmt (pre_p);
>    ret = gimple_build_return (result);
>    gimple_set_no_warning (ret, TREE_NO_WARNING (stmt));
>    gimplify_seq_add_stmt (pre_p, ret);
> diff --git a/gcc/passes.def b/gcc/passes.def
> index c14f6b9f63c..316e19d12e3 100644
> --- a/gcc/passes.def
> +++ b/gcc/passes.def
> @@ -107,6 +107,7 @@ along with GCC; see the file COPYING3.  If not see
>  	     early optimizations again.  It is thus good idea to do this
>  	      late.  */
>  	  NEXT_PASS (pass_split_functions);
> +	  NEXT_PASS (pass_strip_predict_hints);
>        POP_INSERT_PASSES ()
>        NEXT_PASS (pass_release_ssa_names);
>        NEXT_PASS (pass_rebuild_cgraph_edges);
> diff --git a/gcc/predict.c b/gcc/predict.c
> index 60d1a094d10..790be9fbf16 100644
> --- a/gcc/predict.c
> +++ b/gcc/predict.c
> @@ -2739,7 +2739,6 @@ tree_estimate_probability_bb (basic_block bb, bool local_only)
>  {
>    edge e;
>    edge_iterator ei;
> -  gimple *last;
>  
>    FOR_EACH_EDGE (e, ei, bb->succs)
>      {
> @@ -2766,46 +2765,6 @@ tree_estimate_probability_bb (basic_block bb, bool local_only)
>  	    }
>  	}
>  
> -      /* Predict early returns to be probable, as we've already taken
> -	 care for error returns and other cases are often used for
> -	 fast paths through function.
> -
> -	 Since we've already removed the return statements, we are
> -	 looking for CFG like:
> -
> -	 if (conditional)
> -	 {
> -	 ..
> -	 goto return_block
> -	 }
> -	 some other blocks
> -	 return_block:
> -	 return_stmt.  */
> -      if (e->dest != bb->next_bb
> -	  && e->dest != EXIT_BLOCK_PTR_FOR_FN (cfun)
> -	  && single_succ_p (e->dest)
> -	  && single_succ_edge (e->dest)->dest == EXIT_BLOCK_PTR_FOR_FN (cfun)
> -	  && (last = last_stmt (e->dest)) != NULL
> -	  && gimple_code (last) == GIMPLE_RETURN)
> -	{
> -	  edge e1;
> -	  edge_iterator ei1;
> -
> -	  if (single_succ_p (bb))
> -	    {
> -	      FOR_EACH_EDGE (e1, ei1, bb->preds)
> -		if (!predicted_by_p (e1->src, PRED_NULL_RETURN)
> -		    && !predicted_by_p (e1->src, PRED_CONST_RETURN)
> -		    && !predicted_by_p (e1->src, PRED_NEGATIVE_RETURN))
> -		  predict_edge_def (e1, PRED_TREE_EARLY_RETURN, NOT_TAKEN);
> -	    }
> -	  else
> -	    if (!predicted_by_p (e->src, PRED_NULL_RETURN)
> -		&& !predicted_by_p (e->src, PRED_CONST_RETURN)
> -		&& !predicted_by_p (e->src, PRED_NEGATIVE_RETURN))
> -	      predict_edge_def (e, PRED_TREE_EARLY_RETURN, NOT_TAKEN);
> -	}
> -
>        /* Look for block we are guarding (ie we dominate it,
>  	 but it doesn't postdominate us).  */
>        if (e->dest != EXIT_BLOCK_PTR_FOR_FN (cfun) && e->dest != bb
> diff --git a/gcc/predict.def b/gcc/predict.def
> index fcda6c48f11..f7b2bf7738c 100644
> --- a/gcc/predict.def
> +++ b/gcc/predict.def
> @@ -128,18 +128,9 @@ DEF_PREDICTOR (PRED_POLYMORPHIC_CALL, "polymorphic call", HITRATE (59), 0)
>     indefinitely.  */
>  DEF_PREDICTOR (PRED_RECURSIVE_CALL, "recursive call", HITRATE (75), 0)
>  
> -/* Branch causing function to terminate is probably not taken. 
> -   FIXME: early return currently predicts code:
> -   int foo (int a)
> -   {
> -      if (a)
> -	bar();
> -      else
> -	bar2();
> -   }
> -   even though there is no return statement involved.  We probably want to track
> -   this from FE or retire the predictor.  */
> -DEF_PREDICTOR (PRED_TREE_EARLY_RETURN, "early return (on trees)", HITRATE (54), 0)
> +/* Branch causing function to terminate is probably not taken.  */
> +DEF_PREDICTOR (PRED_TREE_EARLY_RETURN, "early return (on trees)", HITRATE (66),
> +	       0)
>  
>  /* Branch containing goto is probably not taken.
>     FIXME: Currently not used.  */
> diff --git a/gcc/tree-tailcall.c b/gcc/tree-tailcall.c
> index b7053387e91..6aa9a56462e 100644
> --- a/gcc/tree-tailcall.c
> +++ b/gcc/tree-tailcall.c
> @@ -421,6 +421,7 @@ find_tail_calls (basic_block bb, struct tailcall **ret)
>        if (gimple_code (stmt) == GIMPLE_LABEL
>  	  || gimple_code (stmt) == GIMPLE_RETURN
>  	  || gimple_code (stmt) == GIMPLE_NOP
> +	  || gimple_code (stmt) == GIMPLE_PREDICT
>  	  || gimple_clobber_p (stmt)
>  	  || is_gimple_debug (stmt))
>  	continue;
> @@ -555,6 +556,7 @@ find_tail_calls (basic_block bb, struct tailcall **ret)
>  
>        if (gimple_code (stmt) == GIMPLE_LABEL
>  	  || gimple_code (stmt) == GIMPLE_NOP
> +	  || gimple_code (stmt) == GIMPLE_PREDICT
>  	  || gimple_clobber_p (stmt)
>  	  || is_gimple_debug (stmt))
>  	continue;
> -- 
> 2.13.1
>
Martin Liška June 21, 2017, 12:50 p.m. UTC | #2
On 06/21/2017 10:26 AM, Jan Hubicka wrote:
>>
>> Ok, so I fixed that in the described way. There's one remaining fallout of:
>> gcc/testsuite/gcc.dg/tree-ssa/ipa-split-5.c
>>
>> Where a fnsplit is properly done, but then it's again inlined:
>>
>> Considering split_me.part.0/5 with 23 size
>>  to be inlined into test/2 in unknown:0
>>  Estimated badness is -0.000001, frequency 0.33.
>>  Inlined split_me.part.0 into test which now has time 50.300000 and size 44, net change of +17.
>>
>> Considering split_me.part.0/5 with 23 size
>>  to be inlined into test/2 in unknown:0
>>  Estimated badness is -0.000001, frequency 0.33.
>>  Inlined split_me.part.0 into test which now has time 70.760000 and size 61, net change of +17.
>>
>> Considering split_me.part.0/5 with 23 size
>>  to be inlined into test/2 in unknown:0
>>  Estimated badness is -0.000001, frequency 0.33.
>>  Inlined split_me.part.0 into test which now has time 91.220000 and size 78, net change of +17.
>>
>> Considering split_me.part.0/5 with 23 size
>>  to be inlined into test/2 in unknown:0
>>  Estimated badness is -0.000001, frequency 0.33.
>>  Inlined split_me.part.0 into test which now has time 111.680000 and size 95, net change of +17.
>> Unit growth for small function inlining: 61->129 (111%)
>>
>> ...
>>
>> Any hint how to block the IPA inlining?
> 
> I guess you only want to make cold part of split_me bigger or
> just use --param to reduce growth for auto inlining.
> 
> How the patch reduces split_me_part considerably?
> Honza

Well, I probably overlooked test results, test works fine.

I'm going to install the patch.

Martin

>>
>> Sending new version of patch.
>> Martin
>>
>>>
>>> I would just move pass_strip_predict_hints pre-IPA and not worry about
>>> them chaining.
>>>
>>> There is problem that after inlining the prediction may expand its scope
>>> and predict branch that it outside of the original function body,
>>> but I do not see very easy solution for that besides not re-doing
>>> prediction (we could also copy probabilities from the inlined function
>>> when they exists and honnor them in the outer function. I am not sure
>>> that is going to improve prediction quality though - extra context
>>> is probably useful)
>>>
>>> Thanks,
>>> Honza
>>>>
>>>> Thanks,
>>>> Martin
>>>>
>>>>>
>>>>> Where did you found this case?
>>>>> Honza
>>>>>>  
>>>>>>        /* Create a new deep copy of the statement.  */
>>>>>>        copy = gimple_copy (stmt);
>>>>>> -- 
>>>>>> 2.13.0
>>>>>>
>>
> 
>> >From 84625a782add6ae2ed29630815b61b34a052770a Mon Sep 17 00:00:00 2001
>> From: marxin <mliska@suse.cz>
>> Date: Tue, 6 Jun 2017 10:55:18 +0200
>> Subject: [PATCH 1/2] Make early return predictor more precise.
>>
>> gcc/ChangeLog:
>>
>> 2017-05-26  Martin Liska  <mliska@suse.cz>
>>
>> 	PR tree-optimization/79489
>> 	* gimplify.c (maybe_add_early_return_predict_stmt): New
>> 	function.
>> 	(gimplify_return_expr): Call the function.
>> 	* predict.c (tree_estimate_probability_bb): Remove handling
>> 	of early return.
>> 	* predict.def: Update comment about early return predictor.
>> 	* gimple-predict.h (is_gimple_predict): New function.
>> 	* predict.def: Change default value of early return to 66.
>> 	* tree-tailcall.c (find_tail_calls): Skip GIMPLE_PREDICT
>> 	statements.
>> 	* passes.def: Put pass_strip_predict_hints to the beginning of
>> 	IPA passes.
>> ---
>>  gcc/gimple-low.c     |  2 ++
>>  gcc/gimple-predict.h |  8 ++++++++
>>  gcc/gimplify.c       | 16 ++++++++++++++++
>>  gcc/passes.def       |  1 +
>>  gcc/predict.c        | 41 -----------------------------------------
>>  gcc/predict.def      | 15 +++------------
>>  gcc/tree-tailcall.c  |  2 ++
>>  7 files changed, 32 insertions(+), 53 deletions(-)
>>
>> diff --git a/gcc/gimple-low.c b/gcc/gimple-low.c
>> index 619b9d7bfb1..4ea6c3532f3 100644
>> --- a/gcc/gimple-low.c
>> +++ b/gcc/gimple-low.c
>> @@ -30,6 +30,8 @@ along with GCC; see the file COPYING3.  If not see
>>  #include "calls.h"
>>  #include "gimple-iterator.h"
>>  #include "gimple-low.h"
>> +#include "predict.h"
>> +#include "gimple-predict.h"
>>  
>>  /* The differences between High GIMPLE and Low GIMPLE are the
>>     following:
>> diff --git a/gcc/gimple-predict.h b/gcc/gimple-predict.h
>> index ba58e12e9e9..0e6c2e1ea01 100644
>> --- a/gcc/gimple-predict.h
>> +++ b/gcc/gimple-predict.h
>> @@ -80,4 +80,12 @@ gimple_build_predict (enum br_predictor predictor, enum prediction outcome)
>>    return p;
>>  }
>>  
>> +/* Return true if GS is a GIMPLE_PREDICT statement.  */
>> +
>> +static inline bool
>> +is_gimple_predict (const gimple *gs)
>> +{
>> +  return gimple_code (gs) == GIMPLE_PREDICT;
>> +}
>> +
>>  #endif  /* GCC_GIMPLE_PREDICT_H */
>> diff --git a/gcc/gimplify.c b/gcc/gimplify.c
>> index 9af95a28704..1c6e1591953 100644
>> --- a/gcc/gimplify.c
>> +++ b/gcc/gimplify.c
>> @@ -1428,6 +1428,20 @@ gimplify_bind_expr (tree *expr_p, gimple_seq *pre_p)
>>    return GS_ALL_DONE;
>>  }
>>  
>> +/* Maybe add early return predict statement to PRE_P sequence.  */
>> +
>> +static void
>> +maybe_add_early_return_predict_stmt (gimple_seq *pre_p)
>> +{
>> +  /* If we are not in a conditional context, add PREDICT statement.  */
>> +  if (gimple_conditional_context ())
>> +    {
>> +      gimple *predict = gimple_build_predict (PRED_TREE_EARLY_RETURN,
>> +					      NOT_TAKEN);
>> +      gimplify_seq_add_stmt (pre_p, predict);
>> +    }
>> +}
>> +
>>  /* Gimplify a RETURN_EXPR.  If the expression to be returned is not a
>>     GIMPLE value, it is assigned to a new temporary and the statement is
>>     re-written to return the temporary.
>> @@ -1458,6 +1472,7 @@ gimplify_return_expr (tree stmt, gimple_seq *pre_p)
>>        || TREE_CODE (ret_expr) == RESULT_DECL
>>        || ret_expr == error_mark_node)
>>      {
>> +      maybe_add_early_return_predict_stmt (pre_p);
>>        greturn *ret = gimple_build_return (ret_expr);
>>        gimple_set_no_warning (ret, TREE_NO_WARNING (stmt));
>>        gimplify_seq_add_stmt (pre_p, ret);
>> @@ -1525,6 +1540,7 @@ gimplify_return_expr (tree stmt, gimple_seq *pre_p)
>>  
>>    gimplify_and_add (TREE_OPERAND (stmt, 0), pre_p);
>>  
>> +  maybe_add_early_return_predict_stmt (pre_p);
>>    ret = gimple_build_return (result);
>>    gimple_set_no_warning (ret, TREE_NO_WARNING (stmt));
>>    gimplify_seq_add_stmt (pre_p, ret);
>> diff --git a/gcc/passes.def b/gcc/passes.def
>> index c14f6b9f63c..316e19d12e3 100644
>> --- a/gcc/passes.def
>> +++ b/gcc/passes.def
>> @@ -107,6 +107,7 @@ along with GCC; see the file COPYING3.  If not see
>>  	     early optimizations again.  It is thus good idea to do this
>>  	      late.  */
>>  	  NEXT_PASS (pass_split_functions);
>> +	  NEXT_PASS (pass_strip_predict_hints);
>>        POP_INSERT_PASSES ()
>>        NEXT_PASS (pass_release_ssa_names);
>>        NEXT_PASS (pass_rebuild_cgraph_edges);
>> diff --git a/gcc/predict.c b/gcc/predict.c
>> index 60d1a094d10..790be9fbf16 100644
>> --- a/gcc/predict.c
>> +++ b/gcc/predict.c
>> @@ -2739,7 +2739,6 @@ tree_estimate_probability_bb (basic_block bb, bool local_only)
>>  {
>>    edge e;
>>    edge_iterator ei;
>> -  gimple *last;
>>  
>>    FOR_EACH_EDGE (e, ei, bb->succs)
>>      {
>> @@ -2766,46 +2765,6 @@ tree_estimate_probability_bb (basic_block bb, bool local_only)
>>  	    }
>>  	}
>>  
>> -      /* Predict early returns to be probable, as we've already taken
>> -	 care for error returns and other cases are often used for
>> -	 fast paths through function.
>> -
>> -	 Since we've already removed the return statements, we are
>> -	 looking for CFG like:
>> -
>> -	 if (conditional)
>> -	 {
>> -	 ..
>> -	 goto return_block
>> -	 }
>> -	 some other blocks
>> -	 return_block:
>> -	 return_stmt.  */
>> -      if (e->dest != bb->next_bb
>> -	  && e->dest != EXIT_BLOCK_PTR_FOR_FN (cfun)
>> -	  && single_succ_p (e->dest)
>> -	  && single_succ_edge (e->dest)->dest == EXIT_BLOCK_PTR_FOR_FN (cfun)
>> -	  && (last = last_stmt (e->dest)) != NULL
>> -	  && gimple_code (last) == GIMPLE_RETURN)
>> -	{
>> -	  edge e1;
>> -	  edge_iterator ei1;
>> -
>> -	  if (single_succ_p (bb))
>> -	    {
>> -	      FOR_EACH_EDGE (e1, ei1, bb->preds)
>> -		if (!predicted_by_p (e1->src, PRED_NULL_RETURN)
>> -		    && !predicted_by_p (e1->src, PRED_CONST_RETURN)
>> -		    && !predicted_by_p (e1->src, PRED_NEGATIVE_RETURN))
>> -		  predict_edge_def (e1, PRED_TREE_EARLY_RETURN, NOT_TAKEN);
>> -	    }
>> -	  else
>> -	    if (!predicted_by_p (e->src, PRED_NULL_RETURN)
>> -		&& !predicted_by_p (e->src, PRED_CONST_RETURN)
>> -		&& !predicted_by_p (e->src, PRED_NEGATIVE_RETURN))
>> -	      predict_edge_def (e, PRED_TREE_EARLY_RETURN, NOT_TAKEN);
>> -	}
>> -
>>        /* Look for block we are guarding (ie we dominate it,
>>  	 but it doesn't postdominate us).  */
>>        if (e->dest != EXIT_BLOCK_PTR_FOR_FN (cfun) && e->dest != bb
>> diff --git a/gcc/predict.def b/gcc/predict.def
>> index fcda6c48f11..f7b2bf7738c 100644
>> --- a/gcc/predict.def
>> +++ b/gcc/predict.def
>> @@ -128,18 +128,9 @@ DEF_PREDICTOR (PRED_POLYMORPHIC_CALL, "polymorphic call", HITRATE (59), 0)
>>     indefinitely.  */
>>  DEF_PREDICTOR (PRED_RECURSIVE_CALL, "recursive call", HITRATE (75), 0)
>>  
>> -/* Branch causing function to terminate is probably not taken. 
>> -   FIXME: early return currently predicts code:
>> -   int foo (int a)
>> -   {
>> -      if (a)
>> -	bar();
>> -      else
>> -	bar2();
>> -   }
>> -   even though there is no return statement involved.  We probably want to track
>> -   this from FE or retire the predictor.  */
>> -DEF_PREDICTOR (PRED_TREE_EARLY_RETURN, "early return (on trees)", HITRATE (54), 0)
>> +/* Branch causing function to terminate is probably not taken.  */
>> +DEF_PREDICTOR (PRED_TREE_EARLY_RETURN, "early return (on trees)", HITRATE (66),
>> +	       0)
>>  
>>  /* Branch containing goto is probably not taken.
>>     FIXME: Currently not used.  */
>> diff --git a/gcc/tree-tailcall.c b/gcc/tree-tailcall.c
>> index b7053387e91..6aa9a56462e 100644
>> --- a/gcc/tree-tailcall.c
>> +++ b/gcc/tree-tailcall.c
>> @@ -421,6 +421,7 @@ find_tail_calls (basic_block bb, struct tailcall **ret)
>>        if (gimple_code (stmt) == GIMPLE_LABEL
>>  	  || gimple_code (stmt) == GIMPLE_RETURN
>>  	  || gimple_code (stmt) == GIMPLE_NOP
>> +	  || gimple_code (stmt) == GIMPLE_PREDICT
>>  	  || gimple_clobber_p (stmt)
>>  	  || is_gimple_debug (stmt))
>>  	continue;
>> @@ -555,6 +556,7 @@ find_tail_calls (basic_block bb, struct tailcall **ret)
>>  
>>        if (gimple_code (stmt) == GIMPLE_LABEL
>>  	  || gimple_code (stmt) == GIMPLE_NOP
>> +	  || gimple_code (stmt) == GIMPLE_PREDICT
>>  	  || gimple_clobber_p (stmt)
>>  	  || is_gimple_debug (stmt))
>>  	continue;
>> -- 
>> 2.13.1
>>
>
Christophe Lyon June 22, 2017, 2:14 p.m. UTC | #3
Hi,


On 21 June 2017 at 14:50, Martin Liška <mliska@suse.cz> wrote:
> On 06/21/2017 10:26 AM, Jan Hubicka wrote:
>>>
>>> Ok, so I fixed that in the described way. There's one remaining fallout of:
>>> gcc/testsuite/gcc.dg/tree-ssa/ipa-split-5.c
>>>
>>> Where a fnsplit is properly done, but then it's again inlined:
>>>
>>> Considering split_me.part.0/5 with 23 size
>>>  to be inlined into test/2 in unknown:0
>>>  Estimated badness is -0.000001, frequency 0.33.
>>>  Inlined split_me.part.0 into test which now has time 50.300000 and size 44, net change of +17.
>>>
>>> Considering split_me.part.0/5 with 23 size
>>>  to be inlined into test/2 in unknown:0
>>>  Estimated badness is -0.000001, frequency 0.33.
>>>  Inlined split_me.part.0 into test which now has time 70.760000 and size 61, net change of +17.
>>>
>>> Considering split_me.part.0/5 with 23 size
>>>  to be inlined into test/2 in unknown:0
>>>  Estimated badness is -0.000001, frequency 0.33.
>>>  Inlined split_me.part.0 into test which now has time 91.220000 and size 78, net change of +17.
>>>
>>> Considering split_me.part.0/5 with 23 size
>>>  to be inlined into test/2 in unknown:0
>>>  Estimated badness is -0.000001, frequency 0.33.
>>>  Inlined split_me.part.0 into test which now has time 111.680000 and size 95, net change of +17.
>>> Unit growth for small function inlining: 61->129 (111%)
>>>
>>> ...
>>>
>>> Any hint how to block the IPA inlining?
>>
>> I guess you only want to make cold part of split_me bigger or
>> just use --param to reduce growth for auto inlining.
>>
>> How the patch reduces split_me_part considerably?
>> Honza
>
> Well, I probably overlooked test results, test works fine.
>
> I'm going to install the patch.
>

Since this commit (r249450), I have noticed a regression:
FAIL:    gcc.dg/tree-ssa/ipa-split-5.c scan-tree-dump optimized "part"
on aarch64/arm

Christophe


> Martin
>
>>>
>>> Sending new version of patch.
>>> Martin
>>>
>>>>
>>>> I would just move pass_strip_predict_hints pre-IPA and not worry about
>>>> them chaining.
>>>>
>>>> There is problem that after inlining the prediction may expand its scope
>>>> and predict branch that it outside of the original function body,
>>>> but I do not see very easy solution for that besides not re-doing
>>>> prediction (we could also copy probabilities from the inlined function
>>>> when they exists and honnor them in the outer function. I am not sure
>>>> that is going to improve prediction quality though - extra context
>>>> is probably useful)
>>>>
>>>> Thanks,
>>>> Honza
>>>>>
>>>>> Thanks,
>>>>> Martin
>>>>>
>>>>>>
>>>>>> Where did you found this case?
>>>>>> Honza
>>>>>>>
>>>>>>>        /* Create a new deep copy of the statement.  */
>>>>>>>        copy = gimple_copy (stmt);
>>>>>>> --
>>>>>>> 2.13.0
>>>>>>>
>>>
>>
>>> >From 84625a782add6ae2ed29630815b61b34a052770a Mon Sep 17 00:00:00 2001
>>> From: marxin <mliska@suse.cz>
>>> Date: Tue, 6 Jun 2017 10:55:18 +0200
>>> Subject: [PATCH 1/2] Make early return predictor more precise.
>>>
>>> gcc/ChangeLog:
>>>
>>> 2017-05-26  Martin Liska  <mliska@suse.cz>
>>>
>>>      PR tree-optimization/79489
>>>      * gimplify.c (maybe_add_early_return_predict_stmt): New
>>>      function.
>>>      (gimplify_return_expr): Call the function.
>>>      * predict.c (tree_estimate_probability_bb): Remove handling
>>>      of early return.
>>>      * predict.def: Update comment about early return predictor.
>>>      * gimple-predict.h (is_gimple_predict): New function.
>>>      * predict.def: Change default value of early return to 66.
>>>      * tree-tailcall.c (find_tail_calls): Skip GIMPLE_PREDICT
>>>      statements.
>>>      * passes.def: Put pass_strip_predict_hints to the beginning of
>>>      IPA passes.
>>> ---
>>>  gcc/gimple-low.c     |  2 ++
>>>  gcc/gimple-predict.h |  8 ++++++++
>>>  gcc/gimplify.c       | 16 ++++++++++++++++
>>>  gcc/passes.def       |  1 +
>>>  gcc/predict.c        | 41 -----------------------------------------
>>>  gcc/predict.def      | 15 +++------------
>>>  gcc/tree-tailcall.c  |  2 ++
>>>  7 files changed, 32 insertions(+), 53 deletions(-)
>>>
>>> diff --git a/gcc/gimple-low.c b/gcc/gimple-low.c
>>> index 619b9d7bfb1..4ea6c3532f3 100644
>>> --- a/gcc/gimple-low.c
>>> +++ b/gcc/gimple-low.c
>>> @@ -30,6 +30,8 @@ along with GCC; see the file COPYING3.  If not see
>>>  #include "calls.h"
>>>  #include "gimple-iterator.h"
>>>  #include "gimple-low.h"
>>> +#include "predict.h"
>>> +#include "gimple-predict.h"
>>>
>>>  /* The differences between High GIMPLE and Low GIMPLE are the
>>>     following:
>>> diff --git a/gcc/gimple-predict.h b/gcc/gimple-predict.h
>>> index ba58e12e9e9..0e6c2e1ea01 100644
>>> --- a/gcc/gimple-predict.h
>>> +++ b/gcc/gimple-predict.h
>>> @@ -80,4 +80,12 @@ gimple_build_predict (enum br_predictor predictor, enum prediction outcome)
>>>    return p;
>>>  }
>>>
>>> +/* Return true if GS is a GIMPLE_PREDICT statement.  */
>>> +
>>> +static inline bool
>>> +is_gimple_predict (const gimple *gs)
>>> +{
>>> +  return gimple_code (gs) == GIMPLE_PREDICT;
>>> +}
>>> +
>>>  #endif  /* GCC_GIMPLE_PREDICT_H */
>>> diff --git a/gcc/gimplify.c b/gcc/gimplify.c
>>> index 9af95a28704..1c6e1591953 100644
>>> --- a/gcc/gimplify.c
>>> +++ b/gcc/gimplify.c
>>> @@ -1428,6 +1428,20 @@ gimplify_bind_expr (tree *expr_p, gimple_seq *pre_p)
>>>    return GS_ALL_DONE;
>>>  }
>>>
>>> +/* Maybe add early return predict statement to PRE_P sequence.  */
>>> +
>>> +static void
>>> +maybe_add_early_return_predict_stmt (gimple_seq *pre_p)
>>> +{
>>> +  /* If we are not in a conditional context, add PREDICT statement.  */
>>> +  if (gimple_conditional_context ())
>>> +    {
>>> +      gimple *predict = gimple_build_predict (PRED_TREE_EARLY_RETURN,
>>> +                                          NOT_TAKEN);
>>> +      gimplify_seq_add_stmt (pre_p, predict);
>>> +    }
>>> +}
>>> +
>>>  /* Gimplify a RETURN_EXPR.  If the expression to be returned is not a
>>>     GIMPLE value, it is assigned to a new temporary and the statement is
>>>     re-written to return the temporary.
>>> @@ -1458,6 +1472,7 @@ gimplify_return_expr (tree stmt, gimple_seq *pre_p)
>>>        || TREE_CODE (ret_expr) == RESULT_DECL
>>>        || ret_expr == error_mark_node)
>>>      {
>>> +      maybe_add_early_return_predict_stmt (pre_p);
>>>        greturn *ret = gimple_build_return (ret_expr);
>>>        gimple_set_no_warning (ret, TREE_NO_WARNING (stmt));
>>>        gimplify_seq_add_stmt (pre_p, ret);
>>> @@ -1525,6 +1540,7 @@ gimplify_return_expr (tree stmt, gimple_seq *pre_p)
>>>
>>>    gimplify_and_add (TREE_OPERAND (stmt, 0), pre_p);
>>>
>>> +  maybe_add_early_return_predict_stmt (pre_p);
>>>    ret = gimple_build_return (result);
>>>    gimple_set_no_warning (ret, TREE_NO_WARNING (stmt));
>>>    gimplify_seq_add_stmt (pre_p, ret);
>>> diff --git a/gcc/passes.def b/gcc/passes.def
>>> index c14f6b9f63c..316e19d12e3 100644
>>> --- a/gcc/passes.def
>>> +++ b/gcc/passes.def
>>> @@ -107,6 +107,7 @@ along with GCC; see the file COPYING3.  If not see
>>>           early optimizations again.  It is thus good idea to do this
>>>            late.  */
>>>        NEXT_PASS (pass_split_functions);
>>> +      NEXT_PASS (pass_strip_predict_hints);
>>>        POP_INSERT_PASSES ()
>>>        NEXT_PASS (pass_release_ssa_names);
>>>        NEXT_PASS (pass_rebuild_cgraph_edges);
>>> diff --git a/gcc/predict.c b/gcc/predict.c
>>> index 60d1a094d10..790be9fbf16 100644
>>> --- a/gcc/predict.c
>>> +++ b/gcc/predict.c
>>> @@ -2739,7 +2739,6 @@ tree_estimate_probability_bb (basic_block bb, bool local_only)
>>>  {
>>>    edge e;
>>>    edge_iterator ei;
>>> -  gimple *last;
>>>
>>>    FOR_EACH_EDGE (e, ei, bb->succs)
>>>      {
>>> @@ -2766,46 +2765,6 @@ tree_estimate_probability_bb (basic_block bb, bool local_only)
>>>          }
>>>      }
>>>
>>> -      /* Predict early returns to be probable, as we've already taken
>>> -     care for error returns and other cases are often used for
>>> -     fast paths through function.
>>> -
>>> -     Since we've already removed the return statements, we are
>>> -     looking for CFG like:
>>> -
>>> -     if (conditional)
>>> -     {
>>> -     ..
>>> -     goto return_block
>>> -     }
>>> -     some other blocks
>>> -     return_block:
>>> -     return_stmt.  */
>>> -      if (e->dest != bb->next_bb
>>> -      && e->dest != EXIT_BLOCK_PTR_FOR_FN (cfun)
>>> -      && single_succ_p (e->dest)
>>> -      && single_succ_edge (e->dest)->dest == EXIT_BLOCK_PTR_FOR_FN (cfun)
>>> -      && (last = last_stmt (e->dest)) != NULL
>>> -      && gimple_code (last) == GIMPLE_RETURN)
>>> -    {
>>> -      edge e1;
>>> -      edge_iterator ei1;
>>> -
>>> -      if (single_succ_p (bb))
>>> -        {
>>> -          FOR_EACH_EDGE (e1, ei1, bb->preds)
>>> -            if (!predicted_by_p (e1->src, PRED_NULL_RETURN)
>>> -                && !predicted_by_p (e1->src, PRED_CONST_RETURN)
>>> -                && !predicted_by_p (e1->src, PRED_NEGATIVE_RETURN))
>>> -              predict_edge_def (e1, PRED_TREE_EARLY_RETURN, NOT_TAKEN);
>>> -        }
>>> -      else
>>> -        if (!predicted_by_p (e->src, PRED_NULL_RETURN)
>>> -            && !predicted_by_p (e->src, PRED_CONST_RETURN)
>>> -            && !predicted_by_p (e->src, PRED_NEGATIVE_RETURN))
>>> -          predict_edge_def (e, PRED_TREE_EARLY_RETURN, NOT_TAKEN);
>>> -    }
>>> -
>>>        /* Look for block we are guarding (ie we dominate it,
>>>       but it doesn't postdominate us).  */
>>>        if (e->dest != EXIT_BLOCK_PTR_FOR_FN (cfun) && e->dest != bb
>>> diff --git a/gcc/predict.def b/gcc/predict.def
>>> index fcda6c48f11..f7b2bf7738c 100644
>>> --- a/gcc/predict.def
>>> +++ b/gcc/predict.def
>>> @@ -128,18 +128,9 @@ DEF_PREDICTOR (PRED_POLYMORPHIC_CALL, "polymorphic call", HITRATE (59), 0)
>>>     indefinitely.  */
>>>  DEF_PREDICTOR (PRED_RECURSIVE_CALL, "recursive call", HITRATE (75), 0)
>>>
>>> -/* Branch causing function to terminate is probably not taken.
>>> -   FIXME: early return currently predicts code:
>>> -   int foo (int a)
>>> -   {
>>> -      if (a)
>>> -    bar();
>>> -      else
>>> -    bar2();
>>> -   }
>>> -   even though there is no return statement involved.  We probably want to track
>>> -   this from FE or retire the predictor.  */
>>> -DEF_PREDICTOR (PRED_TREE_EARLY_RETURN, "early return (on trees)", HITRATE (54), 0)
>>> +/* Branch causing function to terminate is probably not taken.  */
>>> +DEF_PREDICTOR (PRED_TREE_EARLY_RETURN, "early return (on trees)", HITRATE (66),
>>> +           0)
>>>
>>>  /* Branch containing goto is probably not taken.
>>>     FIXME: Currently not used.  */
>>> diff --git a/gcc/tree-tailcall.c b/gcc/tree-tailcall.c
>>> index b7053387e91..6aa9a56462e 100644
>>> --- a/gcc/tree-tailcall.c
>>> +++ b/gcc/tree-tailcall.c
>>> @@ -421,6 +421,7 @@ find_tail_calls (basic_block bb, struct tailcall **ret)
>>>        if (gimple_code (stmt) == GIMPLE_LABEL
>>>        || gimple_code (stmt) == GIMPLE_RETURN
>>>        || gimple_code (stmt) == GIMPLE_NOP
>>> +      || gimple_code (stmt) == GIMPLE_PREDICT
>>>        || gimple_clobber_p (stmt)
>>>        || is_gimple_debug (stmt))
>>>      continue;
>>> @@ -555,6 +556,7 @@ find_tail_calls (basic_block bb, struct tailcall **ret)
>>>
>>>        if (gimple_code (stmt) == GIMPLE_LABEL
>>>        || gimple_code (stmt) == GIMPLE_NOP
>>> +      || gimple_code (stmt) == GIMPLE_PREDICT
>>>        || gimple_clobber_p (stmt)
>>>        || is_gimple_debug (stmt))
>>>      continue;
>>> --
>>> 2.13.1
>>>
>>
>
Martin Liška June 23, 2017, 7:03 a.m. UTC | #4
On 06/22/2017 04:14 PM, Christophe Lyon wrote:
> Since this commit (r249450), I have noticed a regression:
> FAIL:    gcc.dg/tree-ssa/ipa-split-5.c scan-tree-dump optimized "part"
> on aarch64/arm
> 
> Christophe

Hello.

I'm aware of the failure and I fixed that (hopefully) in r249503.
Can you please test that?

Thanks,
Martin
Christophe Lyon June 23, 2017, 7:40 a.m. UTC | #5
On 23 June 2017 at 09:03, Martin Liška <mliska@suse.cz> wrote:
> On 06/22/2017 04:14 PM, Christophe Lyon wrote:
>> Since this commit (r249450), I have noticed a regression:
>> FAIL:    gcc.dg/tree-ssa/ipa-split-5.c scan-tree-dump optimized "part"
>> on aarch64/arm
>>
>> Christophe
>
> Hello.
>
> I'm aware of the failure and I fixed that (hopefully) in r249503.
> Can you please test that?
>
Yes, I can confirm it's now OK. Thanks

> Thanks,
> Martin
diff mbox

Patch

From 84625a782add6ae2ed29630815b61b34a052770a Mon Sep 17 00:00:00 2001
From: marxin <mliska@suse.cz>
Date: Tue, 6 Jun 2017 10:55:18 +0200
Subject: [PATCH 1/2] Make early return predictor more precise.

gcc/ChangeLog:

2017-05-26  Martin Liska  <mliska@suse.cz>

	PR tree-optimization/79489
	* gimplify.c (maybe_add_early_return_predict_stmt): New
	function.
	(gimplify_return_expr): Call the function.
	* predict.c (tree_estimate_probability_bb): Remove handling
	of early return.
	* predict.def: Update comment about early return predictor.
	* gimple-predict.h (is_gimple_predict): New function.
	* predict.def: Change default value of early return to 66.
	* tree-tailcall.c (find_tail_calls): Skip GIMPLE_PREDICT
	statements.
	* passes.def: Put pass_strip_predict_hints to the beginning of
	IPA passes.
---
 gcc/gimple-low.c     |  2 ++
 gcc/gimple-predict.h |  8 ++++++++
 gcc/gimplify.c       | 16 ++++++++++++++++
 gcc/passes.def       |  1 +
 gcc/predict.c        | 41 -----------------------------------------
 gcc/predict.def      | 15 +++------------
 gcc/tree-tailcall.c  |  2 ++
 7 files changed, 32 insertions(+), 53 deletions(-)

diff --git a/gcc/gimple-low.c b/gcc/gimple-low.c
index 619b9d7bfb1..4ea6c3532f3 100644
--- a/gcc/gimple-low.c
+++ b/gcc/gimple-low.c
@@ -30,6 +30,8 @@  along with GCC; see the file COPYING3.  If not see
 #include "calls.h"
 #include "gimple-iterator.h"
 #include "gimple-low.h"
+#include "predict.h"
+#include "gimple-predict.h"
 
 /* The differences between High GIMPLE and Low GIMPLE are the
    following:
diff --git a/gcc/gimple-predict.h b/gcc/gimple-predict.h
index ba58e12e9e9..0e6c2e1ea01 100644
--- a/gcc/gimple-predict.h
+++ b/gcc/gimple-predict.h
@@ -80,4 +80,12 @@  gimple_build_predict (enum br_predictor predictor, enum prediction outcome)
   return p;
 }
 
+/* Return true if GS is a GIMPLE_PREDICT statement.  */
+
+static inline bool
+is_gimple_predict (const gimple *gs)
+{
+  return gimple_code (gs) == GIMPLE_PREDICT;
+}
+
 #endif  /* GCC_GIMPLE_PREDICT_H */
diff --git a/gcc/gimplify.c b/gcc/gimplify.c
index 9af95a28704..1c6e1591953 100644
--- a/gcc/gimplify.c
+++ b/gcc/gimplify.c
@@ -1428,6 +1428,20 @@  gimplify_bind_expr (tree *expr_p, gimple_seq *pre_p)
   return GS_ALL_DONE;
 }
 
+/* Maybe add early return predict statement to PRE_P sequence.  */
+
+static void
+maybe_add_early_return_predict_stmt (gimple_seq *pre_p)
+{
+  /* If we are not in a conditional context, add PREDICT statement.  */
+  if (gimple_conditional_context ())
+    {
+      gimple *predict = gimple_build_predict (PRED_TREE_EARLY_RETURN,
+					      NOT_TAKEN);
+      gimplify_seq_add_stmt (pre_p, predict);
+    }
+}
+
 /* Gimplify a RETURN_EXPR.  If the expression to be returned is not a
    GIMPLE value, it is assigned to a new temporary and the statement is
    re-written to return the temporary.
@@ -1458,6 +1472,7 @@  gimplify_return_expr (tree stmt, gimple_seq *pre_p)
       || TREE_CODE (ret_expr) == RESULT_DECL
       || ret_expr == error_mark_node)
     {
+      maybe_add_early_return_predict_stmt (pre_p);
       greturn *ret = gimple_build_return (ret_expr);
       gimple_set_no_warning (ret, TREE_NO_WARNING (stmt));
       gimplify_seq_add_stmt (pre_p, ret);
@@ -1525,6 +1540,7 @@  gimplify_return_expr (tree stmt, gimple_seq *pre_p)
 
   gimplify_and_add (TREE_OPERAND (stmt, 0), pre_p);
 
+  maybe_add_early_return_predict_stmt (pre_p);
   ret = gimple_build_return (result);
   gimple_set_no_warning (ret, TREE_NO_WARNING (stmt));
   gimplify_seq_add_stmt (pre_p, ret);
diff --git a/gcc/passes.def b/gcc/passes.def
index c14f6b9f63c..316e19d12e3 100644
--- a/gcc/passes.def
+++ b/gcc/passes.def
@@ -107,6 +107,7 @@  along with GCC; see the file COPYING3.  If not see
 	     early optimizations again.  It is thus good idea to do this
 	      late.  */
 	  NEXT_PASS (pass_split_functions);
+	  NEXT_PASS (pass_strip_predict_hints);
       POP_INSERT_PASSES ()
       NEXT_PASS (pass_release_ssa_names);
       NEXT_PASS (pass_rebuild_cgraph_edges);
diff --git a/gcc/predict.c b/gcc/predict.c
index 60d1a094d10..790be9fbf16 100644
--- a/gcc/predict.c
+++ b/gcc/predict.c
@@ -2739,7 +2739,6 @@  tree_estimate_probability_bb (basic_block bb, bool local_only)
 {
   edge e;
   edge_iterator ei;
-  gimple *last;
 
   FOR_EACH_EDGE (e, ei, bb->succs)
     {
@@ -2766,46 +2765,6 @@  tree_estimate_probability_bb (basic_block bb, bool local_only)
 	    }
 	}
 
-      /* Predict early returns to be probable, as we've already taken
-	 care for error returns and other cases are often used for
-	 fast paths through function.
-
-	 Since we've already removed the return statements, we are
-	 looking for CFG like:
-
-	 if (conditional)
-	 {
-	 ..
-	 goto return_block
-	 }
-	 some other blocks
-	 return_block:
-	 return_stmt.  */
-      if (e->dest != bb->next_bb
-	  && e->dest != EXIT_BLOCK_PTR_FOR_FN (cfun)
-	  && single_succ_p (e->dest)
-	  && single_succ_edge (e->dest)->dest == EXIT_BLOCK_PTR_FOR_FN (cfun)
-	  && (last = last_stmt (e->dest)) != NULL
-	  && gimple_code (last) == GIMPLE_RETURN)
-	{
-	  edge e1;
-	  edge_iterator ei1;
-
-	  if (single_succ_p (bb))
-	    {
-	      FOR_EACH_EDGE (e1, ei1, bb->preds)
-		if (!predicted_by_p (e1->src, PRED_NULL_RETURN)
-		    && !predicted_by_p (e1->src, PRED_CONST_RETURN)
-		    && !predicted_by_p (e1->src, PRED_NEGATIVE_RETURN))
-		  predict_edge_def (e1, PRED_TREE_EARLY_RETURN, NOT_TAKEN);
-	    }
-	  else
-	    if (!predicted_by_p (e->src, PRED_NULL_RETURN)
-		&& !predicted_by_p (e->src, PRED_CONST_RETURN)
-		&& !predicted_by_p (e->src, PRED_NEGATIVE_RETURN))
-	      predict_edge_def (e, PRED_TREE_EARLY_RETURN, NOT_TAKEN);
-	}
-
       /* Look for block we are guarding (ie we dominate it,
 	 but it doesn't postdominate us).  */
       if (e->dest != EXIT_BLOCK_PTR_FOR_FN (cfun) && e->dest != bb
diff --git a/gcc/predict.def b/gcc/predict.def
index fcda6c48f11..f7b2bf7738c 100644
--- a/gcc/predict.def
+++ b/gcc/predict.def
@@ -128,18 +128,9 @@  DEF_PREDICTOR (PRED_POLYMORPHIC_CALL, "polymorphic call", HITRATE (59), 0)
    indefinitely.  */
 DEF_PREDICTOR (PRED_RECURSIVE_CALL, "recursive call", HITRATE (75), 0)
 
-/* Branch causing function to terminate is probably not taken. 
-   FIXME: early return currently predicts code:
-   int foo (int a)
-   {
-      if (a)
-	bar();
-      else
-	bar2();
-   }
-   even though there is no return statement involved.  We probably want to track
-   this from FE or retire the predictor.  */
-DEF_PREDICTOR (PRED_TREE_EARLY_RETURN, "early return (on trees)", HITRATE (54), 0)
+/* Branch causing function to terminate is probably not taken.  */
+DEF_PREDICTOR (PRED_TREE_EARLY_RETURN, "early return (on trees)", HITRATE (66),
+	       0)
 
 /* Branch containing goto is probably not taken.
    FIXME: Currently not used.  */
diff --git a/gcc/tree-tailcall.c b/gcc/tree-tailcall.c
index b7053387e91..6aa9a56462e 100644
--- a/gcc/tree-tailcall.c
+++ b/gcc/tree-tailcall.c
@@ -421,6 +421,7 @@  find_tail_calls (basic_block bb, struct tailcall **ret)
       if (gimple_code (stmt) == GIMPLE_LABEL
 	  || gimple_code (stmt) == GIMPLE_RETURN
 	  || gimple_code (stmt) == GIMPLE_NOP
+	  || gimple_code (stmt) == GIMPLE_PREDICT
 	  || gimple_clobber_p (stmt)
 	  || is_gimple_debug (stmt))
 	continue;
@@ -555,6 +556,7 @@  find_tail_calls (basic_block bb, struct tailcall **ret)
 
       if (gimple_code (stmt) == GIMPLE_LABEL
 	  || gimple_code (stmt) == GIMPLE_NOP
+	  || gimple_code (stmt) == GIMPLE_PREDICT
 	  || gimple_clobber_p (stmt)
 	  || is_gimple_debug (stmt))
 	continue;
-- 
2.13.1