Patchwork Use working set profile info to determine hotness (issue6852069)

Submitter Jan Hubicka
Date Nov. 23, 2012, 2:13 p.m.
Message ID <20121123141300.GA14227@kam.mff.cuni.cz>
Permalink /patch/201328/
State New

Comments

Jan Hubicka - Nov. 23, 2012, 2:13 p.m.
> Sounds good. I am travelling the rest of the week so I'll get the revised
> patch ready by Mon. Thanks, Teresa
Hi,
I updated the patch so we can make progress on the heuristic retuning.  There was
a segfault during profiledbootstrap when trying to fetch DECL_STRUCT_FUNCTION of
the callee of an indirect call, and I renamed percents to permilles since they
are in the 0...1000 range.

Bootstrapped/regtested x86_64-linux, committed.

Honza

2012-11-19  Teresa Johnson  <tejohnson@google.com>
	    Jan Hubicka

	* predict.c (maybe_hot_count_p): Use threshold from profiled working
	set instead of hard limit.
	(cgraph_maybe_hot_edge_p): Invoke maybe_hot_count_p() instead of
	directly checking limit.
	* params.def (HOT_BB_COUNT_FRACTION): Remove.
	(HOT_BB_COUNT_WS_PERMILLE): New parameter.
	* invoke.texi (hot-bb-count-fraction): Remove.
	(hot-bb-count-ws-permille): Document.
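
To illustrate the difference between the old hard limit and the working-set
threshold described in the ChangeLog above, here is a minimal standalone
sketch. It is not GCC code: ws_min_counter() and old_threshold() are
hypothetical helpers, the working set is computed by sorting a plain array
rather than from GCC's profile histogram, and sum_max is approximated by the
largest counter, which assumes a single training run.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* qsort comparator: descending order of counter values.  */
static int
cmp_desc (const void *a, const void *b)
{
  int64_t x = *(const int64_t *) a, y = *(const int64_t *) b;
  return (x < y) - (x > y);
}

/* Hypothetical stand-in for find_working_set (permille)->min_counter:
   return the smallest counter among the hottest counters that together
   cover PERMILLE/1000 of the sum of all counters.  Sorts COUNTERS in
   place.  Assumes the sums are small enough that multiplying by 1000
   does not overflow int64_t.  */
static int64_t
ws_min_counter (int64_t *counters, size_t n, unsigned permille)
{
  int64_t total = 0, cum = 0;
  size_t i;

  assert (permille <= 1000);
  qsort (counters, n, sizeof *counters, cmp_desc);
  for (i = 0; i < n; i++)
    total += counters[i];
  for (i = 0; i < n; i++)
    {
      cum += counters[i];
      /* cum / total >= permille / 1000, written without division.  */
      if (cum * 1000 >= total * (int64_t) permille)
        return counters[i];
    }
  return 0;
}

/* Old rule: a count was hot if it was greater than sum_max / fraction.
   Here sum_max is approximated by the largest counter (assumption:
   one training run).  */
static int64_t
old_threshold (const int64_t *counters, size_t n, int64_t fraction)
{
  int64_t max = 0;
  size_t i;

  for (i = 0; i < n; i++)
    if (counters[i] > max)
      max = counters[i];
  return max / fraction;
}
```

With counters {1000000, 500000, 400, 300, 50, 2}, the old rule with the
default fraction of 10000 gives a limit of 100, so the counts 400 and 300
are still treated as hot. With a permille of 999, the two largest counters
already cover 99.9% of the total, so the sketch returns 500000 and only those
two counts pass the new `count >= min_count` test.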

Teresa Johnson - Nov. 26, 2012, 2:42 p.m.
Thanks for the fixes and commit!
Teresa


Patch

Index: predict.c
===================================================================
--- predict.c	(revision 193696)
+++ predict.c	(working copy)
@@ -134,13 +134,20 @@  maybe_hot_frequency_p (struct function *
 static inline bool
 maybe_hot_count_p (struct function *fun, gcov_type count)
 {
-  if (profile_status_for_function (fun) != PROFILE_READ)
+  gcov_working_set_t *ws;
+  static gcov_type min_count = -1;
+  if (fun && profile_status_for_function (fun) != PROFILE_READ)
     return true;
   /* Code executed at most once is not hot.  */
   if (profile_info->runs >= count)
     return false;
-  return (count
-	  > profile_info->sum_max / PARAM_VALUE (HOT_BB_COUNT_FRACTION));
+  if (min_count == -1)
+    {
+      ws = find_working_set (PARAM_VALUE (HOT_BB_COUNT_WS_PERMILLE));
+      gcc_assert (ws);
+      min_count = ws->min_counter;
+    }
+  return (count >= min_count);
 }
 
 /* Return true in case BB can be CPU intensive and should be optimized
@@ -161,8 +168,8 @@  bool
 cgraph_maybe_hot_edge_p (struct cgraph_edge *edge)
 {
   if (profile_info && flag_branch_probabilities
-      && (edge->count
-	  <= profile_info->sum_max / PARAM_VALUE (HOT_BB_COUNT_FRACTION)))
+      && !maybe_hot_count_p (NULL,
+                             edge->count))
     return false;
   if (edge->caller->frequency == NODE_FREQUENCY_UNLIKELY_EXECUTED
       || (edge->callee
Index: params.def
===================================================================
--- params.def	(revision 193696)
+++ params.def	(working copy)
@@ -365,10 +365,11 @@  DEFPARAM(PARAM_SMS_LOOP_AVERAGE_COUNT_TH
 	 "A threshold on the average loop count considered by the swing modulo scheduler",
 	 0, 0, 0)
 
-DEFPARAM(HOT_BB_COUNT_FRACTION,
-	 "hot-bb-count-fraction",
-	 "Select fraction of the maximal count of repetitions of basic block in program given basic block needs to have to be considered hot",
-	 10000, 0, 0)
+DEFPARAM(HOT_BB_COUNT_WS_PERMILLE,
+	 "hot-bb-count-ws-permille",
+         "A basic block profile count is considered hot if it contributes to "
+         "the given permillage of the entire profiled execution",
+	 999, 0, 1000)
 DEFPARAM(HOT_BB_FREQUENCY_FRACTION,
 	 "hot-bb-frequency-fraction",
 	 "Select fraction of the maximal frequency of executions of basic block in function given basic block needs to have to be considered hot",
@@ -392,7 +393,7 @@  DEFPARAM (PARAM_ALIGN_LOOP_ITERATIONS,
    flatten the profile.
 
    We need to cut the maximal predicted iterations to large enough iterations
-   so the loop appears important, but safely within HOT_BB_COUNT_FRACTION
+   so the loop appears important, but safely within maximum hotness
    range.  */
 
 DEFPARAM(PARAM_MAX_PREDICTED_ITERATIONS,
Index: doc/invoke.texi
===================================================================
--- doc/invoke.texi	(revision 193696)
+++ doc/invoke.texi	(working copy)
@@ -9216,9 +9216,9 @@  doing loop versioning for alias in the v
 The maximum number of iterations of a loop the brute-force algorithm
 for analysis of the number of iterations of the loop tries to evaluate.
 
-@item hot-bb-count-fraction
-Select fraction of the maximal count of repetitions of basic block in program
-given basic block needs to have to be considered hot.
+@item hot-bb-count-ws-permille
+A basic block profile count is considered hot if it contributes to 
+the given permillage (i.e. 0...1000) of the entire profiled execution.
 
 @item hot-bb-frequency-fraction
 Select fraction of the entry block frequency of executions of basic block in