Patchwork Use working set profile info to determine hotness (issue6852069)

Submitter Teresa Johnson
Date Nov. 19, 2012, 10:43 p.m.
Message ID <20121119224351.C9D2161425@tjsboxrox.mtv.corp.google.com>
Permalink /patch/200209/
State New

Comments

Teresa Johnson - Nov. 19, 2012, 10:43 p.m.
This patch uses the new working set information from the profile to select
the hot count threshold for an application instead of using a hard cutoff.
Currently the threshold is set by default to the minimum counter value
needed to reach 99.9% of the profiled execution time, but I have added
a parameter to control this.
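
To illustrate what "the minimum counter value needed to reach 99.9% of the profiled execution time" means, here is a minimal standalone sketch. It is not the GCC implementation (the patch relies on the working set data already stored in the profile summary); `min_hot_counter` and its arguments are hypothetical names used only for this example. The cutoff is passed in tenths of a percent, matching the "percentage times ten" convention of the new parameter.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Sort helper: descending order of counter values.  */
static int
cmp_desc (const void *a, const void *b)
{
  int64_t x = *(const int64_t *) a;
  int64_t y = *(const int64_t *) b;
  return (x < y) - (x > y);
}

/* Hypothetical helper, not a GCC function: taking counters hottest
   first, return the smallest counter value that has to be included
   to cover PERMILLE/1000 of the sum of all counters.  Sorts COUNTERS
   in place.  Returns 0 for an empty profile.  */
static int64_t
min_hot_counter (int64_t *counters, size_t n, unsigned permille)
{
  int64_t total = 0, cum = 0;
  size_t i;

  assert (permille <= 1000);
  for (i = 0; i < n; i++)
    total += counters[i];
  qsort (counters, n, sizeof counters[0], cmp_desc);
  for (i = 0; i < n; i++)
    {
      cum += counters[i];
      /* Check cum / total >= permille / 1000 without dividing.  */
      if (cum * 1000 >= total * (int64_t) permille)
	return counters[i];
    }
  return 0;
}
```

With counters {1000, 500, 499, 1} and a cutoff of 999, the three largest counters are needed to cover 99.9% of the total, so the threshold is 499 and only the block with count 1 is treated as not hot.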

I saw a couple of improvements in SPEC2006 on a Westmere system; xalancbmk,
for example, improved by a few percent.

Bootstrapped and tested on x86_64-unknown-linux-gnu. Ok for trunk?

Thanks,
Teresa


2012-11-19  Teresa Johnson  <tejohnson@google.com>

	* predict.c (maybe_hot_count_p): Use threshold from profiled working
	set instead of hard limit.
	(cgraph_maybe_hot_edge_p): Invoke maybe_hot_count_p() instead of
	directly checking limit.
	* params.def (HOT_BB_COUNT_FRACTION): Remove.
	(HOT_BB_COUNT_WS_PERCENT): New parameter.


--
This patch is available for review at http://codereview.appspot.com/6852069
Jan Hubicka - Nov. 21, 2012, 12:47 p.m.
> Index: predict.c
> ===================================================================
> --- predict.c	(revision 193614)
> +++ predict.c	(working copy)
> @@ -134,13 +134,15 @@ maybe_hot_frequency_p (struct function *fun, int f
>  static inline bool
>  maybe_hot_count_p (struct function *fun, gcov_type count)
>  {
> +  gcov_working_set_t *ws;
>    if (profile_status_for_function (fun) != PROFILE_READ)
>      return true;
>    /* Code executed at most once is not hot.  */
>    if (profile_info->runs >= count)
>      return false;
> -  return (count
> -	  > profile_info->sum_max / PARAM_VALUE (HOT_BB_COUNT_FRACTION));
> +  ws = find_working_set (PARAM_VALUE (HOT_BB_COUNT_WS_PERCENT));
> +  gcc_assert (ws);
> +  return (count >= ws->min_counter);

I think you want to store the minimal count into a global variable to avoid the
repeated working set lookup.
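
A minimal sketch of the caching suggested above, assuming the threshold does not change once the profile has been read. All names here are hypothetical: `lookup_min_counter` stands in for the `find_working_set (...)->min_counter` lookup from the patch, and the value it returns is made up.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Counts how often the expensive lookup runs (for illustration).  */
static int lookup_calls;

/* Stand-in for the working set lookup in the patch; the returned
   threshold is an arbitrary example value.  */
static int64_t
lookup_min_counter (void)
{
  lookup_calls++;
  return 100;
}

/* Cached threshold; -1 means "not computed yet".  */
static int64_t min_count = -1;

/* Compute the threshold on first use and reuse it afterwards.  */
static int64_t
get_hot_bb_threshold (void)
{
  if (min_count == -1)
    {
      min_count = lookup_min_counter ();
      assert (min_count >= 0);
    }
  return min_count;
}

/* Hotness test against the cached threshold.  */
static bool
count_is_hot (int64_t count)
{
  return count >= get_hot_bb_threshold ();
}
```

However many counts are tested, the lookup runs only once. If the parameter value or the profile can change during compilation, the cached value would have to be reset at that point.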
> Index: params.def
> ===================================================================
> --- params.def	(revision 193614)
> +++ params.def	(working copy)
> @@ -365,10 +365,11 @@ DEFPARAM(PARAM_SMS_LOOP_AVERAGE_COUNT_THRESHOLD,
>  	 "A threshold on the average loop count considered by the swing modulo scheduler",
>  	 0, 0, 0)
>  
> -DEFPARAM(HOT_BB_COUNT_FRACTION,
> -	 "hot-bb-count-fraction",
> -	 "Select fraction of the maximal count of repetitions of basic block in program given basic block needs to have to be considered hot",
> -	 10000, 0, 0)
> +DEFPARAM(HOT_BB_COUNT_WS_PERCENT,
> +	 "hot-bb-count-ws-percent",
> +	 "A basic block profile count is considered hot if it contributes to "
> +	 "the given percentage (times ten) of the entire profiled execution",
> +	 999, 0, 1000)

And document the parameter.  

Honza
>  DEFPARAM(HOT_BB_FREQUENCY_FRACTION,
>  	 "hot-bb-frequency-fraction",
>  	 "Select fraction of the maximal frequency of executions of basic block in function given basic block needs to have to be considered hot",
> @@ -392,7 +393,7 @@ DEFPARAM (PARAM_ALIGN_LOOP_ITERATIONS,
>     flatten the profile.
>  
>     We need to cut the maximal predicted iterations to large enough iterations
> -   so the loop appears important, but safely within HOT_BB_COUNT_FRACTION
> +   so the loop appears important, but safely within maximum hotness
>     range.  */
>  
>  DEFPARAM(PARAM_MAX_PREDICTED_ITERATIONS,
> 

Patch

Index: predict.c
===================================================================
--- predict.c	(revision 193614)
+++ predict.c	(working copy)
@@ -134,13 +134,15 @@  maybe_hot_frequency_p (struct function *fun, int f
 static inline bool
 maybe_hot_count_p (struct function *fun, gcov_type count)
 {
+  gcov_working_set_t *ws;
   if (profile_status_for_function (fun) != PROFILE_READ)
     return true;
   /* Code executed at most once is not hot.  */
   if (profile_info->runs >= count)
     return false;
-  return (count
-	  > profile_info->sum_max / PARAM_VALUE (HOT_BB_COUNT_FRACTION));
+  ws = find_working_set (PARAM_VALUE (HOT_BB_COUNT_WS_PERCENT));
+  gcc_assert (ws);
+  return (count >= ws->min_counter);
 }
 
 /* Return true in case BB can be CPU intensive and should be optimized
@@ -161,8 +163,8 @@  bool
 cgraph_maybe_hot_edge_p (struct cgraph_edge *edge)
 {
   if (profile_info && flag_branch_probabilities
-      && (edge->count
-	  <= profile_info->sum_max / PARAM_VALUE (HOT_BB_COUNT_FRACTION)))
+      && !maybe_hot_count_p (DECL_STRUCT_FUNCTION (edge->caller->symbol.decl),
+                             edge->count))
     return false;
   if (edge->caller->frequency == NODE_FREQUENCY_UNLIKELY_EXECUTED
       || (edge->callee
Index: params.def
===================================================================
--- params.def	(revision 193614)
+++ params.def	(working copy)
@@ -365,10 +365,11 @@  DEFPARAM(PARAM_SMS_LOOP_AVERAGE_COUNT_THRESHOLD,
 	 "A threshold on the average loop count considered by the swing modulo scheduler",
 	 0, 0, 0)
 
-DEFPARAM(HOT_BB_COUNT_FRACTION,
-	 "hot-bb-count-fraction",
-	 "Select fraction of the maximal count of repetitions of basic block in program given basic block needs to have to be considered hot",
-	 10000, 0, 0)
+DEFPARAM(HOT_BB_COUNT_WS_PERCENT,
+	 "hot-bb-count-ws-percent",
+	 "A basic block profile count is considered hot if it contributes to "
+	 "the given percentage (times ten) of the entire profiled execution",
+	 999, 0, 1000)
 DEFPARAM(HOT_BB_FREQUENCY_FRACTION,
 	 "hot-bb-frequency-fraction",
 	 "Select fraction of the maximal frequency of executions of basic block in function given basic block needs to have to be considered hot",
@@ -392,7 +393,7 @@  DEFPARAM (PARAM_ALIGN_LOOP_ITERATIONS,
    flatten the profile.
 
    We need to cut the maximal predicted iterations to large enough iterations
-   so the loop appears important, but safely within HOT_BB_COUNT_FRACTION
+   so the loop appears important, but safely within maximum hotness
    range.  */
 
 DEFPARAM(PARAM_MAX_PREDICTED_ITERATIONS,