diff mbox series

[4/5] Remove predictors that are unrealiable.

Message ID 8d9f8a5a-5180-1d05-2677-c959ce33e044@suse.cz
State New
Headers show
Series [1/5] Fix usage of analyze_brprob.py script. | expand

Commit Message

Martin Liška Jan. 9, 2018, 11:46 a.m. UTC
These predictors are in my opinion not reliable and thus I decided to remove them:

1) PRED_NEGATIVE_RETURN: probability is ~51%
2) PRED_RECURSIVE_CALL: there are 2 dominant edges that influence value to 63%;
w/o these edges it goes down to 52%
3) PRED_POLYMORPHIC_CALL: having very low coverage, probability is ~51%
4) PRED_INDIR_CALL: likewise

Question here is whether we want to remove them, or to predict them with a 'ignored'
flag? Doing that, we can measure statistics of the predictor in the future?

Martin

Comments

Jan Hubicka Jan. 11, 2018, 10:39 a.m. UTC | #1
> These predictors are in my opinion not reliable and thus I decided to remove them:
> 
> 1) PRED_NEGATIVE_RETURN: probability is ~51%
> 2) PRED_RECURSIVE_CALL: there are 2 dominant edges that influence value to 63%;
> w/o these edges it goes down to 52%
> 3) PRED_POLYMORPHIC_CALL: having very low coverage, probability is ~51%
> 4) PRED_INDIR_CALL: likewise
> 
> Question here is whether we want to remove them, or to predict them with a 'ignored'
> flag? Doing that, we can measure statistics of the predictor in the future?

I believe that recursive call was introudced to help exchange2 benchmark.  I think it does
make sense globally because function simply can not contain hot self recursive call and thus
I would not care about low benchmark coverage.

For polymorphic/indir call I think they are going to grow in importance in future
(especialy first one) and thus I would like to keep them tracked.  If you simply
set probablility to PROB_EVEN, they won't affect branch prediction outcome and
we still will get data on how common they are.

For PRED_NAGATIVE_RETURN, can you take a look why it changed from 98 to 51%?
The idea is that negative values are often used to report error codes and that seems
reasonable.  Perhaps it can be made more specific so it remains working ofr spec2k16?

Honza
> 
> Martin

> >From afbc86cb72eab37bcf6325954d0bf306b301f76e Mon Sep 17 00:00:00 2001
> From: marxin <mliska@suse.cz>
> Date: Thu, 28 Dec 2017 10:23:48 +0100
> Subject: [PATCH 4/5] Remove predictors that are unrealiable.
> 
> gcc/ChangeLog:
> 
> 2017-12-28  Martin Liska  <mliska@suse.cz>
> 
> 	* predict.c (return_prediction): Do not predict
> 	PRED_NEGATIVE_RETURN.
> 	(tree_bb_level_predictions): Do not predict PRED_RECURSIVE_CALL.
> 	(tree_estimate_probability_bb): Do not predict
> 	PRED_POLYMORPHIC_CALL and PRED_INDIR_CALL.
> 	* predict.def (PRED_INDIR_CALL): Remove unused predictors.
> 	(PRED_POLYMORPHIC_CALL): Likewise.
> 	(PRED_RECURSIVE_CALL): Likewise.
> 	(PRED_NEGATIVE_RETURN): Likewise.
> ---
>  gcc/predict.c   | 17 ++---------------
>  gcc/predict.def | 13 -------------
>  2 files changed, 2 insertions(+), 28 deletions(-)
> 
> diff --git a/gcc/predict.c b/gcc/predict.c
> index 51fd14205c2..f53724792e9 100644
> --- a/gcc/predict.c
> +++ b/gcc/predict.c
> @@ -2632,14 +2632,6 @@ return_prediction (tree val, enum prediction *prediction)
>      }
>    else if (INTEGRAL_TYPE_P (TREE_TYPE (val)))
>      {
> -      /* Negative return values are often used to indicate
> -         errors.  */
> -      if (TREE_CODE (val) == INTEGER_CST
> -	  && tree_int_cst_sgn (val) < 0)
> -	{
> -	  *prediction = NOT_TAKEN;
> -	  return PRED_NEGATIVE_RETURN;
> -	}
>        /* Constant return values seems to be commonly taken.
>           Zero/one often represent booleans so exclude them from the
>  	 heuristics.  */
> @@ -2820,9 +2812,6 @@ tree_bb_level_predictions (void)
>  				       DECL_ATTRIBUTES (decl)))
>  		predict_paths_leading_to (bb, PRED_COLD_FUNCTION,
>  					  NOT_TAKEN);
> -	      if (decl && recursive_call_p (current_function_decl, decl))
> -		predict_paths_leading_to (bb, PRED_RECURSIVE_CALL,
> -					  NOT_TAKEN);
>  	    }
>  	  else if (gimple_code (stmt) == GIMPLE_PREDICT)
>  	    {
> @@ -2880,12 +2869,10 @@ tree_estimate_probability_bb (basic_block bb, bool local_only)
>  		     something exceptional.  */
>  		  && gimple_has_side_effects (stmt))
>  		{
> +		  /* Consider just normal function calls, skip indirect and
> +		  polymorphic calls as these tend to be unreliable.  */
>  		  if (gimple_call_fndecl (stmt))
>  		    predict_edge_def (e, PRED_CALL, NOT_TAKEN);
> -		  else if (virtual_method_call_p (gimple_call_fn (stmt)))
> -		    predict_edge_def (e, PRED_POLYMORPHIC_CALL, NOT_TAKEN);
> -		  else
> -		    predict_edge_def (e, PRED_INDIR_CALL, TAKEN);
>  		  break;
>  		}
>  	    }
> diff --git a/gcc/predict.def b/gcc/predict.def
> index fe72080d5bd..7291650d237 100644
> --- a/gcc/predict.def
> +++ b/gcc/predict.def
> @@ -118,16 +118,6 @@ DEF_PREDICTOR (PRED_TREE_FPOPCODE, "fp_opcode (on trees)", HITRATE (90), 0)
>  /* Branch guarding call is probably taken.  */
>  DEF_PREDICTOR (PRED_CALL, "call", HITRATE (67), 0)
>  
> -/* PRED_CALL is not very reliable predictor and it turns out to be even
> -   less reliable for indirect calls and polymorphic calls.  For spec2k6
> -   the predictio nis slightly in the direction of taking the call.  */
> -DEF_PREDICTOR (PRED_INDIR_CALL, "indirect call", HITRATE (86), 0)
> -DEF_PREDICTOR (PRED_POLYMORPHIC_CALL, "polymorphic call", HITRATE (59), 0)
> -
> -/* Recursive calls are usually not taken or the function will recurse
> -   indefinitely.  */
> -DEF_PREDICTOR (PRED_RECURSIVE_CALL, "recursive call", HITRATE (75), 0)
> -
>  /* Branch causing function to terminate is probably not taken.  */
>  DEF_PREDICTOR (PRED_TREE_EARLY_RETURN, "early return (on trees)", HITRATE (66),
>  	       0)
> @@ -138,9 +128,6 @@ DEF_PREDICTOR (PRED_GOTO, "goto", HITRATE (66), 0)
>  /* Branch ending with return constant is probably not taken.  */
>  DEF_PREDICTOR (PRED_CONST_RETURN, "const return", HITRATE (65), 0)
>  
> -/* Branch ending with return negative constant is probably not taken.  */
> -DEF_PREDICTOR (PRED_NEGATIVE_RETURN, "negative return", HITRATE (98), 0)
> -
>  /* Branch ending with return; is probably not taken */
>  DEF_PREDICTOR (PRED_NULL_RETURN, "null return", HITRATE (71), 0)
>  
> -- 
> 2.14.3
>
Martin Liška Jan. 19, 2018, 11:57 a.m. UTC | #2
On 01/11/2018 11:39 AM, Jan Hubicka wrote:
>> These predictors are in my opinion not reliable and thus I decided to remove them:
>>
>> 1) PRED_NEGATIVE_RETURN: probability is ~51%
>> 2) PRED_RECURSIVE_CALL: there are 2 dominant edges that influence value to 63%;
>> w/o these edges it goes down to 52%
>> 3) PRED_POLYMORPHIC_CALL: having very low coverage, probability is ~51%
>> 4) PRED_INDIR_CALL: likewise
>>
>> Question here is whether we want to remove them, or to predict them with a 'ignored'
>> flag? Doing that, we can measure statistics of the predictor in the future?
> 
> I believe that recursive call was introudced to help exchange2 benchmark.  I think it does
> make sense globally because function simply can not contain hot self recursive call and thus
> I would not care about low benchmark coverage.

Yes it probably was. However according to numbers I have it does not influence the jumpy
benchmarks ;)

> 
> For polymorphic/indir call I think they are going to grow in importance in future
> (especialy first one) and thus I would like to keep them tracked.  If you simply
> set probablility to PROB_EVEN, they won't affect branch prediction outcome and
> we still will get data on how common they are.

Sure, let's make then PROB_EVEN.

> 
> For PRED_NAGATIVE_RETURN, can you take a look why it changed from 98 to 51%?
> The idea is that negative values are often used to report error codes and that seems
> reasonable.  Perhaps it can be made more specific so it remains working ofr spec2k16?

Yes, there's a huge difference in between CPU 2006 and 2017. Former has 63% w/ dominant edges,
and later one only 11%. It's caused by these 2 benchmarks with a high coverage:

500.perlbench_r: regexec.c.065i.profile:
  negative return heuristics of edge 1368->1370: 2.0%  exec 2477714850 hit 2429863555 (98.1%)

and 523.xalancbmk_r:
build/build_peak_gcc7-m64.0000/NameDatatypeValidator.cpp.065i.profile:  negative return heuristics of edge 3->4: 2.0%  exec 1221735072 hit 1221522453 (100.0%)

Ideas what to do with the predictor for GCC 8 release?
Martin

> 
> Honza
>>
>> Martin
> 
>> >From afbc86cb72eab37bcf6325954d0bf306b301f76e Mon Sep 17 00:00:00 2001
>> From: marxin <mliska@suse.cz>
>> Date: Thu, 28 Dec 2017 10:23:48 +0100
>> Subject: [PATCH 4/5] Remove predictors that are unrealiable.
>>
>> gcc/ChangeLog:
>>
>> 2017-12-28  Martin Liska  <mliska@suse.cz>
>>
>> 	* predict.c (return_prediction): Do not predict
>> 	PRED_NEGATIVE_RETURN.
>> 	(tree_bb_level_predictions): Do not predict PRED_RECURSIVE_CALL.
>> 	(tree_estimate_probability_bb): Do not predict
>> 	PRED_POLYMORPHIC_CALL and PRED_INDIR_CALL.
>> 	* predict.def (PRED_INDIR_CALL): Remove unused predictors.
>> 	(PRED_POLYMORPHIC_CALL): Likewise.
>> 	(PRED_RECURSIVE_CALL): Likewise.
>> 	(PRED_NEGATIVE_RETURN): Likewise.
>> ---
>>  gcc/predict.c   | 17 ++---------------
>>  gcc/predict.def | 13 -------------
>>  2 files changed, 2 insertions(+), 28 deletions(-)
>>
>> diff --git a/gcc/predict.c b/gcc/predict.c
>> index 51fd14205c2..f53724792e9 100644
>> --- a/gcc/predict.c
>> +++ b/gcc/predict.c
>> @@ -2632,14 +2632,6 @@ return_prediction (tree val, enum prediction *prediction)
>>      }
>>    else if (INTEGRAL_TYPE_P (TREE_TYPE (val)))
>>      {
>> -      /* Negative return values are often used to indicate
>> -         errors.  */
>> -      if (TREE_CODE (val) == INTEGER_CST
>> -	  && tree_int_cst_sgn (val) < 0)
>> -	{
>> -	  *prediction = NOT_TAKEN;
>> -	  return PRED_NEGATIVE_RETURN;
>> -	}
>>        /* Constant return values seems to be commonly taken.
>>           Zero/one often represent booleans so exclude them from the
>>  	 heuristics.  */
>> @@ -2820,9 +2812,6 @@ tree_bb_level_predictions (void)
>>  				       DECL_ATTRIBUTES (decl)))
>>  		predict_paths_leading_to (bb, PRED_COLD_FUNCTION,
>>  					  NOT_TAKEN);
>> -	      if (decl && recursive_call_p (current_function_decl, decl))
>> -		predict_paths_leading_to (bb, PRED_RECURSIVE_CALL,
>> -					  NOT_TAKEN);
>>  	    }
>>  	  else if (gimple_code (stmt) == GIMPLE_PREDICT)
>>  	    {
>> @@ -2880,12 +2869,10 @@ tree_estimate_probability_bb (basic_block bb, bool local_only)
>>  		     something exceptional.  */
>>  		  && gimple_has_side_effects (stmt))
>>  		{
>> +		  /* Consider just normal function calls, skip indirect and
>> +		  polymorphic calls as these tend to be unreliable.  */
>>  		  if (gimple_call_fndecl (stmt))
>>  		    predict_edge_def (e, PRED_CALL, NOT_TAKEN);
>> -		  else if (virtual_method_call_p (gimple_call_fn (stmt)))
>> -		    predict_edge_def (e, PRED_POLYMORPHIC_CALL, NOT_TAKEN);
>> -		  else
>> -		    predict_edge_def (e, PRED_INDIR_CALL, TAKEN);
>>  		  break;
>>  		}
>>  	    }
>> diff --git a/gcc/predict.def b/gcc/predict.def
>> index fe72080d5bd..7291650d237 100644
>> --- a/gcc/predict.def
>> +++ b/gcc/predict.def
>> @@ -118,16 +118,6 @@ DEF_PREDICTOR (PRED_TREE_FPOPCODE, "fp_opcode (on trees)", HITRATE (90), 0)
>>  /* Branch guarding call is probably taken.  */
>>  DEF_PREDICTOR (PRED_CALL, "call", HITRATE (67), 0)
>>  
>> -/* PRED_CALL is not very reliable predictor and it turns out to be even
>> -   less reliable for indirect calls and polymorphic calls.  For spec2k6
>> -   the predictio nis slightly in the direction of taking the call.  */
>> -DEF_PREDICTOR (PRED_INDIR_CALL, "indirect call", HITRATE (86), 0)
>> -DEF_PREDICTOR (PRED_POLYMORPHIC_CALL, "polymorphic call", HITRATE (59), 0)
>> -
>> -/* Recursive calls are usually not taken or the function will recurse
>> -   indefinitely.  */
>> -DEF_PREDICTOR (PRED_RECURSIVE_CALL, "recursive call", HITRATE (75), 0)
>> -
>>  /* Branch causing function to terminate is probably not taken.  */
>>  DEF_PREDICTOR (PRED_TREE_EARLY_RETURN, "early return (on trees)", HITRATE (66),
>>  	       0)
>> @@ -138,9 +128,6 @@ DEF_PREDICTOR (PRED_GOTO, "goto", HITRATE (66), 0)
>>  /* Branch ending with return constant is probably not taken.  */
>>  DEF_PREDICTOR (PRED_CONST_RETURN, "const return", HITRATE (65), 0)
>>  
>> -/* Branch ending with return negative constant is probably not taken.  */
>> -DEF_PREDICTOR (PRED_NEGATIVE_RETURN, "negative return", HITRATE (98), 0)
>> -
>>  /* Branch ending with return; is probably not taken */
>>  DEF_PREDICTOR (PRED_NULL_RETURN, "null return", HITRATE (71), 0)
>>  
>> -- 
>> 2.14.3
>>
>
Martin Liška Jan. 22, 2018, 2:10 p.m. UTC | #3
On 01/19/2018 12:57 PM, Martin Liška wrote:
> Yes, there's a huge difference in between CPU 2006 and 2017. Former has 63% w/ dominant edges,
> and later one only 11%. It's caused by these 2 benchmarks with a high coverage:
> 

Hi.

I'm sending details about the 2 edges that influence the statistics significantly:

> 500.perlbench_r: regexec.c.065i.profile:
>   negative return heuristics of edge 1368->1370: 2.0%  exec 2477714850 hit 2429863555 (98.1%)

2477714850: 3484:S_regtry(pTHX_ regmatch_info *reginfo, char **startposp)
        -: 3485:{
2477714850: 3486:    CHECKPOINT lastcp;
2477714850: 3487:    REGEXP *const rx = reginfo->prog;
2477714850: 3488:    regexp *const prog = ReANY(rx);
2477714850: 3489:    SSize_t result;
[snip]
2477714850: 8046:    assert(!result ||  locinput - reginfo->strbeg >= 0);
2477714850: 8047:    return result ?  locinput - reginfo->strbeg : -1;
        -:  8048:}

As seen it return -1 if a regex is not found, which is in case of perlbench very likely branch.

> 
> and 523.xalancbmk_r:
> build/build_peak_gcc7-m64.0000/NameDatatypeValidator.cpp.065i.profile:  negative return heuristics of edge 3->4: 2.0%  exec 1221735072 hit 1221522453 (100.0%)

1221735072:   74:int NameDatatypeValidator::compare(const XMLCh* const lValue
        -:   75:                                   , const XMLCh* const rValue
        -:   76:                                   ,       MemoryManager*     const)
        -:   77:{
1221735072:   78:    return ( XMLString::equals(lValue, rValue)? 0 : -1);
        -:   79:}
        -:   80:

IP profile dump file:

xercesc_2_7::NameDatatypeValidator::compare (struct NameDatatypeValidator * const this, const XMLCh * const lValue, const XMLCh * const rValue, struct MemoryManager * const D.17157)
{
  bool _1;
  int iftmp.0_2;

  <bb 2> [count: 1221735072]:
  # DEBUG BEGIN_STMT
  _1 = xercesc_2_7::XMLString::equals (lValue_4(D), rValue_5(D));
  if (_1 != 0)
    goto <bb 4>; [0.02%]
  else
    goto <bb 3>; [99.98%]

  <bb 3> [count: 1221522453]:

  <bb 4> [count: 1221735072]:
  # iftmp.0_2 = PHI <0(2), -1(3)>
  return iftmp.0_2;

}

Likewise, XML values are more commonly non-equal.
Ok, so may I mark also negative return with PROB_EVEN to track it?

Thanks,
Martin

> 
> Ideas what to do with the predictor for GCC 8 release?
> Martin
Jan Hubicka Jan. 23, 2018, 10:02 a.m. UTC | #4
> On 01/19/2018 12:57 PM, Martin Liška wrote:
> > Yes, there's a huge difference in between CPU 2006 and 2017. Former has 63% w/ dominant edges,
> > and later one only 11%. It's caused by these 2 benchmarks with a high coverage:
> > 
> 
> Hi.
> 
> I'm sending details about the 2 edges that influence the statistics significantly:
> 
> > 500.perlbench_r: regexec.c.065i.profile:
> >   negative return heuristics of edge 1368->1370: 2.0%  exec 2477714850 hit 2429863555 (98.1%)
> 
> 2477714850: 3484:S_regtry(pTHX_ regmatch_info *reginfo, char **startposp)
>         -: 3485:{
> 2477714850: 3486:    CHECKPOINT lastcp;
> 2477714850: 3487:    REGEXP *const rx = reginfo->prog;
> 2477714850: 3488:    regexp *const prog = ReANY(rx);
> 2477714850: 3489:    SSize_t result;
> [snip]
> 2477714850: 8046:    assert(!result ||  locinput - reginfo->strbeg >= 0);
> 2477714850: 8047:    return result ?  locinput - reginfo->strbeg : -1;
>         -:  8048:}
> 
> As seen it return -1 if a regex is not found, which is in case of perlbench very likely branch.
> 
> > 
> > and 523.xalancbmk_r:
> > build/build_peak_gcc7-m64.0000/NameDatatypeValidator.cpp.065i.profile:  negative return heuristics of edge 3->4: 2.0%  exec 1221735072 hit 1221522453 (100.0%)
> 
> 1221735072:   74:int NameDatatypeValidator::compare(const XMLCh* const lValue
>         -:   75:                                   , const XMLCh* const rValue
>         -:   76:                                   ,       MemoryManager*     const)
>         -:   77:{
> 1221735072:   78:    return ( XMLString::equals(lValue, rValue)? 0 : -1);
>         -:   79:}
>         -:   80:
> 
> IP profile dump file:
> 
> xercesc_2_7::NameDatatypeValidator::compare (struct NameDatatypeValidator * const this, const XMLCh * const lValue, const XMLCh * const rValue, struct MemoryManager * const D.17157)
> {
>   bool _1;
>   int iftmp.0_2;
> 
>   <bb 2> [count: 1221735072]:
>   # DEBUG BEGIN_STMT
>   _1 = xercesc_2_7::XMLString::equals (lValue_4(D), rValue_5(D));
>   if (_1 != 0)
>     goto <bb 4>; [0.02%]
>   else
>     goto <bb 3>; [99.98%]
> 
>   <bb 3> [count: 1221522453]:
> 
>   <bb 4> [count: 1221735072]:
>   # iftmp.0_2 = PHI <0(2), -1(3)>
>   return iftmp.0_2;
> 
> }
> 
> Likewise, XML values are more commonly non-equal.
> Ok, so may I mark also negative return with PROB_EVEN to track it?

Well, if we start disabling predictors just because we can find benchmark where they do not
perform as expected, we will need to disable them all. It is nature of heuristics to make
mistakes, just they should do something useful for common code.
It is not goal to make heuristics that makes no mistake.
Heuristics is useful either if it have high precision even if it cover few
branches (say noreturn), or if it have large coverage and still some hitrate (say opcode,
call or return based heuristics).

To make things bit more esoteric, in practice this is also bit weighted by fact
how much damage bad prediction does.  For example, call heuristics which predict
non-call path to be taken is likely going to do little damage. Even if call
path is common it is heavy and hard to optimize by presence of call and thus we
don't loose that much in common scenarios.

In the second case:
00073 // -----------------------------------------------------------------------
00074 // Compare methods
00075 // -----------------------------------------------------------------------
00076 int NameDatatypeValidator::compare(const XMLCh* const lValue
00077                                    , const XMLCh* const rValue
00078                                    ,       MemoryManager*     const)
00079 {
00080     return ( XMLString::equals(lValue, rValue)? 0 : -1);
00081 }

I would say it is odd coding style.  I do not see why it is not declared
bool and not returning true/false. 

First case seems bit non-standard too as it looks like usual cache lookup just
the cache frequently fails.

I would be in favor of keeping the prediction with non-50% hitrate thus unless
we run into some performance problems.
If there are only two benchmarks out of say 20 we tried (in spec2000 and 2k17)
it seems fine to me.

There is issue with the way we collect our data.  Using arithmetic mean across
spec is optimizing for overall runtime of all spec benchmarks combined
together.  We more want to look for safer values that do well for average
benchmarks independently how many branches they do.  We could consider making
the statistics script also collect geometric averages across different
benchmarks. If we will see big difference between arithmetic mean and geomean
we will know there is small subset of benchmarks which dominates and we could
decide what to do with the probability.

Honza

> 
> Thanks,
> Martin
> 
> > 
> > Ideas what to do with the predictor for GCC 8 release?
> > Martin
Martin Liška Jan. 23, 2018, 12:10 p.m. UTC | #5
On 01/23/2018 11:02 AM, Jan Hubicka wrote:
>> On 01/19/2018 12:57 PM, Martin Liška wrote:
>>> Yes, there's a huge difference in between CPU 2006 and 2017. Former has 63% w/ dominant edges,
>>> and later one only 11%. It's caused by these 2 benchmarks with a high coverage:
>>>
>>
>> Hi.
>>
>> I'm sending details about the 2 edges that influence the statistics significantly:
>>
>>> 500.perlbench_r: regexec.c.065i.profile:
>>>   negative return heuristics of edge 1368->1370: 2.0%  exec 2477714850 hit 2429863555 (98.1%)
>>
>> 2477714850: 3484:S_regtry(pTHX_ regmatch_info *reginfo, char **startposp)
>>         -: 3485:{
>> 2477714850: 3486:    CHECKPOINT lastcp;
>> 2477714850: 3487:    REGEXP *const rx = reginfo->prog;
>> 2477714850: 3488:    regexp *const prog = ReANY(rx);
>> 2477714850: 3489:    SSize_t result;
>> [snip]
>> 2477714850: 8046:    assert(!result ||  locinput - reginfo->strbeg >= 0);
>> 2477714850: 8047:    return result ?  locinput - reginfo->strbeg : -1;
>>         -:  8048:}
>>
>> As seen it return -1 if a regex is not found, which is in case of perlbench very likely branch.
>>
>>>
>>> and 523.xalancbmk_r:
>>> build/build_peak_gcc7-m64.0000/NameDatatypeValidator.cpp.065i.profile:  negative return heuristics of edge 3->4: 2.0%  exec 1221735072 hit 1221522453 (100.0%)
>>
>> 1221735072:   74:int NameDatatypeValidator::compare(const XMLCh* const lValue
>>         -:   75:                                   , const XMLCh* const rValue
>>         -:   76:                                   ,       MemoryManager*     const)
>>         -:   77:{
>> 1221735072:   78:    return ( XMLString::equals(lValue, rValue)? 0 : -1);
>>         -:   79:}
>>         -:   80:
>>
>> IP profile dump file:
>>
>> xercesc_2_7::NameDatatypeValidator::compare (struct NameDatatypeValidator * const this, const XMLCh * const lValue, const XMLCh * const rValue, struct MemoryManager * const D.17157)
>> {
>>   bool _1;
>>   int iftmp.0_2;
>>
>>   <bb 2> [count: 1221735072]:
>>   # DEBUG BEGIN_STMT
>>   _1 = xercesc_2_7::XMLString::equals (lValue_4(D), rValue_5(D));
>>   if (_1 != 0)
>>     goto <bb 4>; [0.02%]
>>   else
>>     goto <bb 3>; [99.98%]
>>
>>   <bb 3> [count: 1221522453]:
>>
>>   <bb 4> [count: 1221735072]:
>>   # iftmp.0_2 = PHI <0(2), -1(3)>
>>   return iftmp.0_2;
>>
>> }
>>
>> Likewise, XML values are more commonly non-equal.
>> Ok, so may I mark also negative return with PROB_EVEN to track it?
> 
> Well, if we start disabling predictors just because we can find benchmark where they do not
> perform as expected, we will need to disable them all. It is nature of heuristics to make
> mistakes, just they should do something useful for common code.
> It is not goal to make heuristics that makes no mistake.
> Heuristics is useful either if it have high precision even if it cover few
> branches (say noreturn), or if it have large coverage and still some hitrate (say opcode,
> call or return based heuristics).
> 
> To make things bit more esoteric, in practice this is also bit weighted by fact
> how much damage bad prediction does.  For example, call heuristics which predict
> non-call path to be taken is likely going to do little damage. Even if call
> path is common it is heavy and hard to optimize by presence of call and thus we
> don't loose that much in common scenarios.
> 
> In the second case:
> 00073 // -----------------------------------------------------------------------
> 00074 // Compare methods
> 00075 // -----------------------------------------------------------------------
> 00076 int NameDatatypeValidator::compare(const XMLCh* const lValue
> 00077                                    , const XMLCh* const rValue
> 00078                                    ,       MemoryManager*     const)
> 00079 {
> 00080     return ( XMLString::equals(lValue, rValue)? 0 : -1);
> 00081 }
> 
> I would say it is odd coding style.  I do not see why it is not declared
> bool and not returning true/false. 
> 
> First case seems bit non-standard too as it looks like usual cache lookup just
> the cache frequently fails.
> 
> I would be in favor of keeping the prediction with non-50% hitrate thus unless
> we run into some performance problems.
> If there are only two benchmarks out of say 20 we tried (in spec2000 and 2k17)
> it seems fine to me.
> 
> There is issue with the way we collect our data.  Using arithmetic mean across
> spec is optimizing for overall runtime of all spec benchmarks combined
> together.  We more want to look for safer values that do well for average
> benchmarks independently how many branches they do.  We could consider making
> the statistics script also collect geometric averages across different
> benchmarks. If we will see big difference between arithmetic mean and geomean
> we will know there is small subset of benchmarks which dominates and we could
> decide what to do with the probability.

Hello.

I fully agree that proper way to do statistics is to do a weight average across
benchmarks how each predictor hits performance. Which is nice, but quite hard
to achieve. Let's talk personally how we can change it in next stage1.

For now, in case of the 'negative return' predictor, I tend to leave it as it is
and install the rest of the patch that makes PROB_EVEN probability.

Works for you Honza?
Martin

> 
> Honza
> 
>>
>> Thanks,
>> Martin
>>
>>>
>>> Ideas what to do with the predictor for GCC 8 release?
>>> Martin
Jan Hubicka Jan. 23, 2018, 1:49 p.m. UTC | #6
> On 01/23/2018 11:02 AM, Jan Hubicka wrote:
> >> On 01/19/2018 12:57 PM, Martin Liška wrote:
> >>> Yes, there's a huge difference in between CPU 2006 and 2017. Former has 63% w/ dominant edges,
> >>> and later one only 11%. It's caused by these 2 benchmarks with a high coverage:
> >>>
> >>
> >> Hi.
> >>
> >> I'm sending details about the 2 edges that influence the statistics significantly:
> >>
> >>> 500.perlbench_r: regexec.c.065i.profile:
> >>>   negative return heuristics of edge 1368->1370: 2.0%  exec 2477714850 hit 2429863555 (98.1%)
> >>
> >> 2477714850: 3484:S_regtry(pTHX_ regmatch_info *reginfo, char **startposp)
> >>         -: 3485:{
> >> 2477714850: 3486:    CHECKPOINT lastcp;
> >> 2477714850: 3487:    REGEXP *const rx = reginfo->prog;
> >> 2477714850: 3488:    regexp *const prog = ReANY(rx);
> >> 2477714850: 3489:    SSize_t result;
> >> [snip]
> >> 2477714850: 8046:    assert(!result ||  locinput - reginfo->strbeg >= 0);
> >> 2477714850: 8047:    return result ?  locinput - reginfo->strbeg : -1;
> >>         -:  8048:}
> >>
> >> As seen it return -1 if a regex is not found, which is in case of perlbench very likely branch.
> >>
> >>>
> >>> and 523.xalancbmk_r:
> >>> build/build_peak_gcc7-m64.0000/NameDatatypeValidator.cpp.065i.profile:  negative return heuristics of edge 3->4: 2.0%  exec 1221735072 hit 1221522453 (100.0%)
> >>
> >> 1221735072:   74:int NameDatatypeValidator::compare(const XMLCh* const lValue
> >>         -:   75:                                   , const XMLCh* const rValue
> >>         -:   76:                                   ,       MemoryManager*     const)
> >>         -:   77:{
> >> 1221735072:   78:    return ( XMLString::equals(lValue, rValue)? 0 : -1);
> >>         -:   79:}
> >>         -:   80:
> >>
> >> IP profile dump file:
> >>
> >> xercesc_2_7::NameDatatypeValidator::compare (struct NameDatatypeValidator * const this, const XMLCh * const lValue, const XMLCh * const rValue, struct MemoryManager * const D.17157)
> >> {
> >>   bool _1;
> >>   int iftmp.0_2;
> >>
> >>   <bb 2> [count: 1221735072]:
> >>   # DEBUG BEGIN_STMT
> >>   _1 = xercesc_2_7::XMLString::equals (lValue_4(D), rValue_5(D));
> >>   if (_1 != 0)
> >>     goto <bb 4>; [0.02%]
> >>   else
> >>     goto <bb 3>; [99.98%]
> >>
> >>   <bb 3> [count: 1221522453]:
> >>
> >>   <bb 4> [count: 1221735072]:
> >>   # iftmp.0_2 = PHI <0(2), -1(3)>
> >>   return iftmp.0_2;
> >>
> >> }
> >>
> >> Likewise, XML values are more commonly non-equal.
> >> Ok, so may I mark also negative return with PROB_EVEN to track it?
> > 
> > Well, if we start disabling predictors just because we can find benchmark where they do not
> > perform as expected, we will need to disable them all. It is nature of heuristics to make
> > mistakes, just they should do something useful for common code.
> > It is not goal to make heuristics that makes no mistake.
> > Heuristics is useful either if it have high precision even if it cover few
> > branches (say noreturn), or if it have large coverage and still some hitrate (say opcode,
> > call or return based heuristics).
> > 
> > To make things bit more esoteric, in practice this is also bit weighted by fact
> > how much damage bad prediction does.  For example, call heuristics which predict
> > non-call path to be taken is likely going to do little damage. Even if call
> > path is common it is heavy and hard to optimize by presence of call and thus we
> > don't loose that much in common scenarios.
> > 
> > In the second case:
> > 00073 // -----------------------------------------------------------------------
> > 00074 // Compare methods
> > 00075 // -----------------------------------------------------------------------
> > 00076 int NameDatatypeValidator::compare(const XMLCh* const lValue
> > 00077                                    , const XMLCh* const rValue
> > 00078                                    ,       MemoryManager*     const)
> > 00079 {
> > 00080     return ( XMLString::equals(lValue, rValue)? 0 : -1);
> > 00081 }
> > 
> > I would say it is odd coding style.  I do not see why it is not declared
> > bool and not returning true/false. 
> > 
> > First case seems bit non-standard too as it looks like usual cache lookup just
> > the cache frequently fails.
> > 
> > I would be in favor of keeping the prediction with non-50% hitrate thus unless
> > we run into some performance problems.
> > If there are only two benchmarks out of say 20 we tried (in spec2000 and 2k17)
> > it seems fine to me.
> > 
> > There is issue with the way we collect our data.  Using arithmetic mean across
> > spec is optimizing for overall runtime of all spec benchmarks combined
> > together.  We more want to look for safer values that do well for average
> > benchmarks independently how many branches they do.  We could consider making
> > the statistics script also collect geometric averages across different
> > benchmarks. If we will see big difference between arithmetic mean and geomean
> > we will know there is small subset of benchmarks which dominates and we could
> > decide what to do with the probability.
> 
> Hello.
> 
> I fully agree that proper way to do statistics is to do a weight average across
> benchmarks how each predictor hits performance. Which is nice, but quite hard
> to achieve. Let's talk personally how we can change it in next stage1.
> 
> For now, in case of the 'negative return' predictor, I tend to leave it as it is
> and install the rest of the patch that makes PROB_EVEN probability.

OK, makes sense to me.
Honza
> 
> Works for you Honza?
> Martin
> 
> > 
> > Honza
> > 
> >>
> >> Thanks,
> >> Martin
> >>
> >>>
> >>> Ideas what to do with the predictor for GCC 8 release?
> >>> Martin
diff mbox series

Patch

From afbc86cb72eab37bcf6325954d0bf306b301f76e Mon Sep 17 00:00:00 2001
From: marxin <mliska@suse.cz>
Date: Thu, 28 Dec 2017 10:23:48 +0100
Subject: [PATCH 4/5] Remove predictors that are unrealiable.

gcc/ChangeLog:

2017-12-28  Martin Liska  <mliska@suse.cz>

	* predict.c (return_prediction): Do not predict
	PRED_NEGATIVE_RETURN.
	(tree_bb_level_predictions): Do not predict PRED_RECURSIVE_CALL.
	(tree_estimate_probability_bb): Do not predict
	PRED_POLYMORPHIC_CALL and PRED_INDIR_CALL.
	* predict.def (PRED_INDIR_CALL): Remove unused predictors.
	(PRED_POLYMORPHIC_CALL): Likewise.
	(PRED_RECURSIVE_CALL): Likewise.
	(PRED_NEGATIVE_RETURN): Likewise.
---
 gcc/predict.c   | 17 ++---------------
 gcc/predict.def | 13 -------------
 2 files changed, 2 insertions(+), 28 deletions(-)

diff --git a/gcc/predict.c b/gcc/predict.c
index 51fd14205c2..f53724792e9 100644
--- a/gcc/predict.c
+++ b/gcc/predict.c
@@ -2632,14 +2632,6 @@  return_prediction (tree val, enum prediction *prediction)
     }
   else if (INTEGRAL_TYPE_P (TREE_TYPE (val)))
     {
-      /* Negative return values are often used to indicate
-         errors.  */
-      if (TREE_CODE (val) == INTEGER_CST
-	  && tree_int_cst_sgn (val) < 0)
-	{
-	  *prediction = NOT_TAKEN;
-	  return PRED_NEGATIVE_RETURN;
-	}
       /* Constant return values seems to be commonly taken.
          Zero/one often represent booleans so exclude them from the
 	 heuristics.  */
@@ -2820,9 +2812,6 @@  tree_bb_level_predictions (void)
 				       DECL_ATTRIBUTES (decl)))
 		predict_paths_leading_to (bb, PRED_COLD_FUNCTION,
 					  NOT_TAKEN);
-	      if (decl && recursive_call_p (current_function_decl, decl))
-		predict_paths_leading_to (bb, PRED_RECURSIVE_CALL,
-					  NOT_TAKEN);
 	    }
 	  else if (gimple_code (stmt) == GIMPLE_PREDICT)
 	    {
@@ -2880,12 +2869,10 @@  tree_estimate_probability_bb (basic_block bb, bool local_only)
 		     something exceptional.  */
 		  && gimple_has_side_effects (stmt))
 		{
+		  /* Consider just normal function calls, skip indirect and
+		  polymorphic calls as these tend to be unreliable.  */
 		  if (gimple_call_fndecl (stmt))
 		    predict_edge_def (e, PRED_CALL, NOT_TAKEN);
-		  else if (virtual_method_call_p (gimple_call_fn (stmt)))
-		    predict_edge_def (e, PRED_POLYMORPHIC_CALL, NOT_TAKEN);
-		  else
-		    predict_edge_def (e, PRED_INDIR_CALL, TAKEN);
 		  break;
 		}
 	    }
diff --git a/gcc/predict.def b/gcc/predict.def
index fe72080d5bd..7291650d237 100644
--- a/gcc/predict.def
+++ b/gcc/predict.def
@@ -118,16 +118,6 @@  DEF_PREDICTOR (PRED_TREE_FPOPCODE, "fp_opcode (on trees)", HITRATE (90), 0)
 /* Branch guarding call is probably taken.  */
 DEF_PREDICTOR (PRED_CALL, "call", HITRATE (67), 0)
 
-/* PRED_CALL is not very reliable predictor and it turns out to be even
-   less reliable for indirect calls and polymorphic calls.  For spec2k6
-   the predictio nis slightly in the direction of taking the call.  */
-DEF_PREDICTOR (PRED_INDIR_CALL, "indirect call", HITRATE (86), 0)
-DEF_PREDICTOR (PRED_POLYMORPHIC_CALL, "polymorphic call", HITRATE (59), 0)
-
-/* Recursive calls are usually not taken or the function will recurse
-   indefinitely.  */
-DEF_PREDICTOR (PRED_RECURSIVE_CALL, "recursive call", HITRATE (75), 0)
-
 /* Branch causing function to terminate is probably not taken.  */
 DEF_PREDICTOR (PRED_TREE_EARLY_RETURN, "early return (on trees)", HITRATE (66),
 	       0)
@@ -138,9 +128,6 @@  DEF_PREDICTOR (PRED_GOTO, "goto", HITRATE (66), 0)
 /* Branch ending with return constant is probably not taken.  */
 DEF_PREDICTOR (PRED_CONST_RETURN, "const return", HITRATE (65), 0)
 
-/* Branch ending with return negative constant is probably not taken.  */
-DEF_PREDICTOR (PRED_NEGATIVE_RETURN, "negative return", HITRATE (98), 0)
-
 /* Branch ending with return; is probably not taken */
 DEF_PREDICTOR (PRED_NULL_RETURN, "null return", HITRATE (71), 0)
 
-- 
2.14.3