Message ID | CAAe5K+UOn4wxvwsGhy_EjiFFQMDB-emRbpNJkY5bBf6QgRJo4w@mail.gmail.com |
---|---|
State | New |
Headers | show |
On Fri, Nov 8, 2013 at 10:58 PM, Teresa Johnson wrote: > +static void > +drop_profile (struct cgraph_node *node, bool hot) > +{ > + struct function *fn = DECL_STRUCT_FUNCTION (node->decl); > + > + if (dump_file) > + fprintf (dump_file, > + "Dropping 0 profile for %s/%i. %s based on calls.\n", > + cgraph_node_name (node), node->order, > + hot ? "Function is hot" : "Function is normal"); > + /* We only expect to miss profiles for functions that are reached > + via non-zero call edges in cases where the function may have > + been linked from another module or library (COMDATs and extern > + templates). See the comments below for handle_missing_profiles. */ > + if (!DECL_COMDAT (node->decl) && !DECL_EXTERNAL (node->decl)) > + warning (0, > + "Missing counts for called function %s/%i", > + cgraph_node_name (node), node->order); Maybe OPT_Wdisabled_optimization? > + > + profile_status_for_function (fn) > + = (flag_guess_branch_prob ? PROFILE_GUESSED : PROFILE_ABSENT); > + node->frequency > + = hot ? NODE_FREQUENCY_HOT : NODE_FREQUENCY_NORMAL; In GCC code style the = goes at the end of the line: profile_status_for_function (fn) (flag_guess_branch_prob ? PROFILE_GUESSED : PROFILE_ABSENT); node->frequency = hot ? NODE_FREQUENCY_HOT : NODE_FREQUENCY_NORMAL; > @@ -2774,6 +2875,9 @@ counts_to_freqs (void) > gcov_type count_max, true_count_max = 0; > basic_block bb; > > + if (!ENTRY_BLOCK_PTR->count) > + return 0; > + Deserves a one-line comment ;-) > +void > +freqs_to_counts (struct cgraph_node *node, gcov_type count) > +{ > + basic_block bb; > + edge_iterator ei; > + edge e; > + struct function *fn = DECL_STRUCT_FUNCTION (node->decl); > + > + FOR_ALL_BB_FN(bb, fn) > + { > + bb->count = apply_scale (count, > + GCOV_COMPUTE_SCALE (bb->frequency, BB_FREQ_MAX)); > + FOR_EACH_EDGE (e, ei, bb->succs) > + e->count = apply_probability (e->src->count, e->probability); > + } Indent +2: FOR_ALL_BB_FN (bb, fn) { ... } Ciao! Steven
> > + > > + profile_status_for_function (fn) > > + = (flag_guess_branch_prob ? PROFILE_GUESSED : PROFILE_ABSENT); > > + node->frequency > > + = hot ? NODE_FREQUENCY_HOT : NODE_FREQUENCY_NORMAL; > > In GCC code style the = goes at the end of the line: > > profile_status_for_function (fn) > (flag_guess_branch_prob ? PROFILE_GUESSED : PROFILE_ABSENT); > node->frequency = > hot ? NODE_FREQUENCY_HOT : NODE_FREQUENCY_NORMAL; Absolutely not, Teresa's version is the correct one, see reload.c for example.
On Sun, Nov 10, 2013 at 1:08 PM, Eric Botcazou wrote: >> > + >> > + profile_status_for_function (fn) >> > + = (flag_guess_branch_prob ? PROFILE_GUESSED : PROFILE_ABSENT); >> > + node->frequency >> > + = hot ? NODE_FREQUENCY_HOT : NODE_FREQUENCY_NORMAL; >> >> In GCC code style the = goes at the end of the line: >> >> profile_status_for_function (fn) >> (flag_guess_branch_prob ? PROFILE_GUESSED : PROFILE_ABSENT); >> node->frequency = >> hot ? NODE_FREQUENCY_HOT : NODE_FREQUENCY_NORMAL; > > Absolutely not, Teresa's version is the correct one, see reload.c for example. Hmm, "absolutely"? [stevenb@gcc1-power7 trunk]$ egrep -ch "^\s+= " gcc/*.[ch] | awk '{sum=sum+$1}END{print sum}' 1797 [stevenb@gcc1-power7 trunk]$ egrep -ch "\s=$" gcc/*.[ch] | awk '{sum=sum+$1}END{print sum}' 685 I can't find a rule/guide in the GNU or GCC coding style documents. Anyway, not important. The "=" starting a new line is the majority, so Theresa please forget about my comment :-) Ciao! Steven
> 2013-11-08 Teresa Johnson <tejohnson@google.com> > Jan Hubicka <jh@suse.cz> > > * predict.c (drop_profile): New function. > (handle_missing_profiles): Ditto. > (counts_to_freqs): Don't overwrite estimated frequencies > when function has no profile counts. > * predict.h (handle_missing_profiles): Declare. > * tree-inline.c (freqs_to_counts): New function. > (copy_cfg_body): Invoke freqs_to_counts as needed. > * tree-profile.c (tree_profiling): Invoke handle_missing_profiles. > > Index: predict.c > =================================================================== > --- predict.c (revision 204516) > +++ predict.c (working copy) > @@ -2765,6 +2765,107 @@ estimate_loops (void) > BITMAP_FREE (tovisit); > } > > +/* Drop the profile for NODE to guessed, and update its frequency based on > + whether it is expected to be HOT. */ > + > +static void > +drop_profile (struct cgraph_node *node, bool hot) > +{ > + struct function *fn = DECL_STRUCT_FUNCTION (node->decl); > + > + if (dump_file) > + fprintf (dump_file, > + "Dropping 0 profile for %s/%i. %s based on calls.\n", > + cgraph_node_name (node), node->order, > + hot ? "Function is hot" : "Function is normal"); > + /* We only expect to miss profiles for functions that are reached > + via non-zero call edges in cases where the function may have > + been linked from another module or library (COMDATs and extern > + templates). See the comments below for handle_missing_profiles. */ > + if (!DECL_COMDAT (node->decl) && !DECL_EXTERNAL (node->decl)) > + warning (0, > + "Missing counts for called function %s/%i", > + cgraph_node_name (node), node->order); > + > + profile_status_for_function (fn) > + = (flag_guess_branch_prob ? PROFILE_GUESSED : PROFILE_ABSENT); > + node->frequency > + = hot ? NODE_FREQUENCY_HOT : NODE_FREQUENCY_NORMAL; > +} > + > +/* In the case of COMDAT routines, multiple object files will contain the same > + function and the linker will select one for the binary. In that case > + all the other copies from the profile instrument binary will be missing > + profile counts. Look for cases where this happened, due to non-zero > + call counts going to 0-count functions, and drop the profile to guessed > + so that we can use the estimated probabilities and avoid optimizing only > + for size. > + > + The other case where the profile may be missing is when the routine > + is not going to be emitted to the object file, e.g. for "extern template" > + class methods. Those will be marked DECL_EXTERNAL. Emit a warning in > + all other cases of non-zero calls to 0-count functions. */ > + > +void > +handle_missing_profiles (void) > +{ > + struct cgraph_node *node; > + int unlikely_count_fraction = PARAM_VALUE (UNLIKELY_BB_COUNT_FRACTION); > + vec<struct cgraph_node *> worklist; > + worklist.create (64); > + > + /* See if 0 count function has non-0 count callers. In this case we > + lost some profile. Drop its function profile to PROFILE_GUESSED. */ > + FOR_EACH_DEFINED_FUNCTION (node) > + { > + struct cgraph_edge *e; > + gcov_type call_count = 0; > + struct function *fn = DECL_STRUCT_FUNCTION (node->decl); > + > + if (node->count) > + continue; > + for (e = node->callers; e; e = e->next_caller) > + call_count += e->count; > + if (call_count > + && fn && fn->cfg > + && (call_count * unlikely_count_fraction >= profile_info->runs)) > + { > + bool maybe_hot = maybe_hot_count_p (NULL, call_count); > + > + drop_profile (node, maybe_hot); > + worklist.safe_push (node); > + } Perhaps we should add an note/error on mishandled profile when function is not COMDAT? That way we may notice further bugs in this area. > + } > + > + /* Propagate the profile dropping to other 0-count COMDATs that are > + potentially called by COMDATs we already dropped the profile on. */ > + while (worklist.length () > 0) > + { > + struct cgraph_edge *e; > + > + node = worklist.pop (); > + for (e = node->callees; e; e = e->next_caller) > + { > + struct cgraph_node *callee = e->callee; > + struct function *fn = DECL_STRUCT_FUNCTION (callee->decl); > + > + if (callee->count > 0) > + continue; > + if (DECL_COMDAT (callee->decl) && fn && fn->cfg > + && profile_status_for_function (fn) == PROFILE_READ) Perhaps we can check here maybe_hot_bb_p for the caller and rely on profile estimate to give us false only for really known to be unlikely paths? (i.e. EH handling, noreturns etc.) > + { > + /* Since there are no non-0 call counts to this function, > + we don't know for sure whether it is hot. Indicate to > + the drop_profile routine that function should be marked > + normal, rather than hot. */ > + drop_profile (node, false); > + worklist.safe_push (callee); > + } > + } > + } > + worklist.release (); I would add a pointer set so we avoid duplicates in worklist. This is potentially quadratic for large programs. OK, with these changes. Honza
On Mon, Nov 11, 2013 at 7:23 AM, Jan Hubicka <hubicka@ucw.cz> wrote: >> 2013-11-08 Teresa Johnson <tejohnson@google.com> >> Jan Hubicka <jh@suse.cz> >> >> * predict.c (drop_profile): New function. >> (handle_missing_profiles): Ditto. >> (counts_to_freqs): Don't overwrite estimated frequencies >> when function has no profile counts. >> * predict.h (handle_missing_profiles): Declare. >> * tree-inline.c (freqs_to_counts): New function. >> (copy_cfg_body): Invoke freqs_to_counts as needed. >> * tree-profile.c (tree_profiling): Invoke handle_missing_profiles. >> >> Index: predict.c >> =================================================================== >> --- predict.c (revision 204516) >> +++ predict.c (working copy) >> @@ -2765,6 +2765,107 @@ estimate_loops (void) >> BITMAP_FREE (tovisit); >> } >> >> +/* Drop the profile for NODE to guessed, and update its frequency based on >> + whether it is expected to be HOT. */ >> + >> +static void >> +drop_profile (struct cgraph_node *node, bool hot) >> +{ >> + struct function *fn = DECL_STRUCT_FUNCTION (node->decl); >> + >> + if (dump_file) >> + fprintf (dump_file, >> + "Dropping 0 profile for %s/%i. %s based on calls.\n", >> + cgraph_node_name (node), node->order, >> + hot ? "Function is hot" : "Function is normal"); >> + /* We only expect to miss profiles for functions that are reached >> + via non-zero call edges in cases where the function may have >> + been linked from another module or library (COMDATs and extern >> + templates). See the comments below for handle_missing_profiles. */ >> + if (!DECL_COMDAT (node->decl) && !DECL_EXTERNAL (node->decl)) >> + warning (0, >> + "Missing counts for called function %s/%i", >> + cgraph_node_name (node), node->order); >> + >> + profile_status_for_function (fn) >> + = (flag_guess_branch_prob ? PROFILE_GUESSED : PROFILE_ABSENT); >> + node->frequency >> + = hot ? NODE_FREQUENCY_HOT : NODE_FREQUENCY_NORMAL; >> +} >> + >> +/* In the case of COMDAT routines, multiple object files will contain the same >> + function and the linker will select one for the binary. In that case >> + all the other copies from the profile instrument binary will be missing >> + profile counts. Look for cases where this happened, due to non-zero >> + call counts going to 0-count functions, and drop the profile to guessed >> + so that we can use the estimated probabilities and avoid optimizing only >> + for size. >> + >> + The other case where the profile may be missing is when the routine >> + is not going to be emitted to the object file, e.g. for "extern template" >> + class methods. Those will be marked DECL_EXTERNAL. Emit a warning in >> + all other cases of non-zero calls to 0-count functions. */ >> + >> +void >> +handle_missing_profiles (void) >> +{ >> + struct cgraph_node *node; >> + int unlikely_count_fraction = PARAM_VALUE (UNLIKELY_BB_COUNT_FRACTION); >> + vec<struct cgraph_node *> worklist; >> + worklist.create (64); >> + >> + /* See if 0 count function has non-0 count callers. In this case we >> + lost some profile. Drop its function profile to PROFILE_GUESSED. */ >> + FOR_EACH_DEFINED_FUNCTION (node) >> + { >> + struct cgraph_edge *e; >> + gcov_type call_count = 0; >> + struct function *fn = DECL_STRUCT_FUNCTION (node->decl); >> + >> + if (node->count) >> + continue; >> + for (e = node->callers; e; e = e->next_caller) >> + call_count += e->count; >> + if (call_count >> + && fn && fn->cfg >> + && (call_count * unlikely_count_fraction >= profile_info->runs)) >> + { >> + bool maybe_hot = maybe_hot_count_p (NULL, call_count); >> + >> + drop_profile (node, maybe_hot); >> + worklist.safe_push (node); >> + } > > Perhaps we should add an note/error on mishandled profile when function is not COMDAT? > That way we may notice further bugs in this area. I have a warning like that already in drop_profile(). Is that sufficient? Also, Steven Bosscher suggested putting that warning under OPT_Wdisabled_optimization instead of '0', what do you prefer for that? >> + } >> + >> + /* Propagate the profile dropping to other 0-count COMDATs that are >> + potentially called by COMDATs we already dropped the profile on. */ >> + while (worklist.length () > 0) >> + { >> + struct cgraph_edge *e; >> + >> + node = worklist.pop (); >> + for (e = node->callees; e; e = e->next_caller) >> + { >> + struct cgraph_node *callee = e->callee; >> + struct function *fn = DECL_STRUCT_FUNCTION (callee->decl); >> + >> + if (callee->count > 0) >> + continue; >> + if (DECL_COMDAT (callee->decl) && fn && fn->cfg >> + && profile_status_for_function (fn) == PROFILE_READ) > > Perhaps we can check here maybe_hot_bb_p for the caller and rely on profile estimate > to give us false only for really known to be unlikely paths? (i.e. EH handling, noreturns > etc.) Ok, let me try this. >> + { >> + /* Since there are no non-0 call counts to this function, >> + we don't know for sure whether it is hot. Indicate to >> + the drop_profile routine that function should be marked >> + normal, rather than hot. */ >> + drop_profile (node, false); >> + worklist.safe_push (callee); >> + } >> + } >> + } >> + worklist.release (); > > I would add a pointer set so we avoid duplicates in worklist. This is potentially quadratic > for large programs. I'll add a visited_nodes set to keep track of processed nodes so they don't get added to the worklist multiple times. Thanks, Teresa > > OK, with these changes. > > Honza
> I have a warning like that already in drop_profile(). Is that I think it should be warning (or silent) for COMDAT and error/note for other functions (depending on flag_profile_correction). I guess drop_profile is better place for it indeed. > sufficient? Also, Steven Bosscher suggested putting that warning under > OPT_Wdisabled_optimization instead of '0', what do you prefer for > that? It is a bit different case than other disabled optimizations we have (where the optimization does not happen because of --param tunable limits), but I am fine with both alternatives. The warning may end up quite noisy so having way to silence it definitely makes sense. > > >> + } > >> + > >> + /* Propagate the profile dropping to other 0-count COMDATs that are > >> + potentially called by COMDATs we already dropped the profile on. */ > >> + while (worklist.length () > 0) > >> + { > >> + struct cgraph_edge *e; > >> + > >> + node = worklist.pop (); > >> + for (e = node->callees; e; e = e->next_caller) > >> + { > >> + struct cgraph_node *callee = e->callee; > >> + struct function *fn = DECL_STRUCT_FUNCTION (callee->decl); > >> + > >> + if (callee->count > 0) > >> + continue; > >> + if (DECL_COMDAT (callee->decl) && fn && fn->cfg > >> + && profile_status_for_function (fn) == PROFILE_READ) > > > > Perhaps we can check here maybe_hot_bb_p for the caller and rely on profile estimate > > to give us false only for really known to be unlikely paths? (i.e. EH handling, noreturns > > etc.) > > Ok, let me try this. > > >> + { > >> + /* Since there are no non-0 call counts to this function, > >> + we don't know for sure whether it is hot. Indicate to > >> + the drop_profile routine that function should be marked > >> + normal, rather than hot. */ > >> + drop_profile (node, false); > >> + worklist.safe_push (callee); > >> + } > >> + } > >> + } > >> + worklist.release (); > > > > I would add a pointer set so we avoid duplicates in worklist. This is potentially quadratic > > for large programs. > > I'll add a visited_nodes set to keep track of processed nodes so they > don't get added to the worklist multiple times. Perhaps you can also track this by testing profile_status_for_function. Both solutions are fine for me. Honza > > Thanks, > Teresa > > > > > OK, with these changes. > > > > Honza > > > > -- > Teresa Johnson | Software Engineer | tejohnson@google.com | 408-460-2413
On Mon, Nov 11, 2013 at 8:22 AM, Jan Hubicka <hubicka@ucw.cz> wrote: >> I have a warning like that already in drop_profile(). Is that > > I think it should be warning (or silent) for COMDAT and error/note for > other functions (depending on flag_profile_correction). > I guess drop_profile is better place for it indeed. I don't want to warn for COMDAT since this is currently an expected issue in that case, so I'm afraid it will be too noisy (but there is a note emitted to the dump file to track how often this occurs). So currently the warning is only emitted in cases where we don't expect it to occur (e.g. non-comdat). > >> sufficient? Also, Steven Bosscher suggested putting that warning under >> OPT_Wdisabled_optimization instead of '0', what do you prefer for >> that? > > It is a bit different case than other disabled optimizations we have (where > the optimization does not happen because of --param tunable limits), but I > am fine with both alternatives. > The warning may end up quite noisy so having way to silence it definitely > makes sense. I didn't find any warnings being emitted when running this for spec or internal benchmarks, so how about instead of a warning, I handle it like you suggest above: depending on the setting of flag_profile_correction either error or emit a note to the dump file under MSG_MISSED_OPTIMIZATION? >> >> >> + } >> >> + >> >> + /* Propagate the profile dropping to other 0-count COMDATs that are >> >> + potentially called by COMDATs we already dropped the profile on. */ >> >> + while (worklist.length () > 0) >> >> + { >> >> + struct cgraph_edge *e; >> >> + >> >> + node = worklist.pop (); >> >> + for (e = node->callees; e; e = e->next_caller) >> >> + { >> >> + struct cgraph_node *callee = e->callee; >> >> + struct function *fn = DECL_STRUCT_FUNCTION (callee->decl); >> >> + >> >> + if (callee->count > 0) >> >> + continue; >> >> + if (DECL_COMDAT (callee->decl) && fn && fn->cfg >> >> + && profile_status_for_function (fn) == PROFILE_READ) >> > >> > Perhaps we can check here maybe_hot_bb_p for the caller and rely on profile estimate >> > to give us false only for really known to be unlikely paths? (i.e. EH handling, noreturns >> > etc.) >> >> Ok, let me try this. >> >> >> + { >> >> + /* Since there are no non-0 call counts to this function, >> >> + we don't know for sure whether it is hot. Indicate to >> >> + the drop_profile routine that function should be marked >> >> + normal, rather than hot. */ >> >> + drop_profile (node, false); >> >> + worklist.safe_push (callee); >> >> + } >> >> + } >> >> + } >> >> + worklist.release (); >> > >> > I would add a pointer set so we avoid duplicates in worklist. This is potentially quadratic >> > for large programs. >> >> I'll add a visited_nodes set to keep track of processed nodes so they >> don't get added to the worklist multiple times. > > Perhaps you can also track this by testing profile_status_for_function. Both solutions are fine for me. Ah - I just realized I am already checking profile_status_for_function above and adding to the worklist if it is still PROFILE_READ. Since I call drop_profile before adding to the worklist, we can't end up adding multiple times. So I don't think there is currently an issue with this. Thanks, Teresa > > Honza >> >> Thanks, >> Teresa >> >> > >> > OK, with these changes. >> > >> > Honza >> >> >> >> -- >> Teresa Johnson | Software Engineer | tejohnson@google.com | 408-460-2413
> On Mon, Nov 11, 2013 at 8:22 AM, Jan Hubicka <hubicka@ucw.cz> wrote: > >> I have a warning like that already in drop_profile(). Is that > > > > I think it should be warning (or silent) for COMDAT and error/note for > > other functions (depending on flag_profile_correction). > > I guess drop_profile is better place for it indeed. > > I don't want to warn for COMDAT since this is currently an expected > issue in that case, so I'm afraid it will be too noisy (but there is a > note emitted to the dump file to track how often this occurs). So > currently the warning is only emitted in cases where we don't expect > it to occur (e.g. non-comdat). I see, I misread it. The non-comdat warning IMO ought go the same way as the other profile confussions. I guess something like if (!flag_profile_correction) error (...); like existing cases in profile.c > > I didn't find any warnings being emitted when running this for spec or > internal benchmarks, so how about instead of a warning, I handle it > like you suggest above: depending on the setting of > flag_profile_correction either error or emit a note to the dump file > under MSG_MISSED_OPTIMIZATION? Yep, lets just go with error with !flag_profile_correction and being silent otherwise. If we want to go with warnings, we need to change all the similar places in profile.c and elsewhere. > > Ah - I just realized I am already checking profile_status_for_function > above and adding to the worklist if it is still PROFILE_READ. Since I > call drop_profile before adding to the worklist, we can't end up > adding multiple times. So I don't think there is currently an issue > with this. Yep, indeed. The patch is OK with the error message as discussed above. Thanks. Honza > > Thanks, > Teresa > > > > > Honza > >> > >> Thanks, > >> Teresa > >> > >> > > >> > OK, with these changes. > >> > > >> > Honza > >> > >> > >> > >> -- > >> Teresa Johnson | Software Engineer | tejohnson@google.com | 408-460-2413 > > > > -- > Teresa Johnson | Software Engineer | tejohnson@google.com | 408-460-2413
On Mon, Nov 11, 2013 at 9:13 AM, Jan Hubicka <hubicka@ucw.cz> wrote: >> On Mon, Nov 11, 2013 at 8:22 AM, Jan Hubicka <hubicka@ucw.cz> wrote: >> >> I have a warning like that already in drop_profile(). Is that >> > >> > I think it should be warning (or silent) for COMDAT and error/note for >> > other functions (depending on flag_profile_correction). >> > I guess drop_profile is better place for it indeed. >> >> I don't want to warn for COMDAT since this is currently an expected >> issue in that case, so I'm afraid it will be too noisy (but there is a >> note emitted to the dump file to track how often this occurs). So >> currently the warning is only emitted in cases where we don't expect >> it to occur (e.g. non-comdat). > > I see, I misread it. > The non-comdat warning IMO ought go the same way as the other profile confussions. > I guess something like > if (!flag_profile_correction) > error (...); > like existing cases in profile.c Ok, will do (emitting a note to the dump file as in other cases in profile.c that do this same thing). >> >> I didn't find any warnings being emitted when running this for spec or >> internal benchmarks, so how about instead of a warning, I handle it >> like you suggest above: depending on the setting of >> flag_profile_correction either error or emit a note to the dump file >> under MSG_MISSED_OPTIMIZATION? > > Yep, lets just go with error with !flag_profile_correction and being silent > otherwise. > If we want to go with warnings, we need to change all the similar places > in profile.c and elsewhere. >> >> Ah - I just realized I am already checking profile_status_for_function >> above and adding to the worklist if it is still PROFILE_READ. Since I >> call drop_profile before adding to the worklist, we can't end up >> adding multiple times. So I don't think there is currently an issue >> with this. > > Yep, indeed. > > The patch is OK with the error message as discussed above. Ok, will do that and fix a couple of minor issues mentioned by Steven (missing comment and incorrect indentation in one spot). Will retest just to make sure I didn't miss any warnings which will now be errors. Thanks, Teresa > > Thanks. > Honza >> >> Thanks, >> Teresa >> >> > >> > Honza >> >> >> >> Thanks, >> >> Teresa >> >> >> >> > >> >> > OK, with these changes. >> >> > >> >> > Honza >> >> >> >> >> >> >> >> -- >> >> Teresa Johnson | Software Engineer | tejohnson@google.com | 408-460-2413 >> >> >> >> -- >> Teresa Johnson | Software Engineer | tejohnson@google.com | 408-460-2413
Index: predict.c =================================================================== --- predict.c (revision 204516) +++ predict.c (working copy) @@ -2765,6 +2765,107 @@ estimate_loops (void) BITMAP_FREE (tovisit); } +/* Drop the profile for NODE to guessed, and update its frequency based on + whether it is expected to be HOT. */ + +static void +drop_profile (struct cgraph_node *node, bool hot) +{ + struct function *fn = DECL_STRUCT_FUNCTION (node->decl); + + if (dump_file) + fprintf (dump_file, + "Dropping 0 profile for %s/%i. %s based on calls.\n", + cgraph_node_name (node), node->order, + hot ? "Function is hot" : "Function is normal"); + /* We only expect to miss profiles for functions that are reached + via non-zero call edges in cases where the function may have + been linked from another module or library (COMDATs and extern + templates). See the comments below for handle_missing_profiles. */ + if (!DECL_COMDAT (node->decl) && !DECL_EXTERNAL (node->decl)) + warning (0, + "Missing counts for called function %s/%i", + cgraph_node_name (node), node->order); + + profile_status_for_function (fn) + = (flag_guess_branch_prob ? PROFILE_GUESSED : PROFILE_ABSENT); + node->frequency + = hot ? NODE_FREQUENCY_HOT : NODE_FREQUENCY_NORMAL; +} + +/* In the case of COMDAT routines, multiple object files will contain the same + function and the linker will select one for the binary. In that case + all the other copies from the profile instrument binary will be missing + profile counts. Look for cases where this happened, due to non-zero + call counts going to 0-count functions, and drop the profile to guessed + so that we can use the estimated probabilities and avoid optimizing only + for size. + + The other case where the profile may be missing is when the routine + is not going to be emitted to the object file, e.g. for "extern template" + class methods. Those will be marked DECL_EXTERNAL. Emit a warning in + all other cases of non-zero calls to 0-count functions. */ + +void +handle_missing_profiles (void) +{ + struct cgraph_node *node; + int unlikely_count_fraction = PARAM_VALUE (UNLIKELY_BB_COUNT_FRACTION); + vec<struct cgraph_node *> worklist; + worklist.create (64); + + /* See if 0 count function has non-0 count callers. In this case we + lost some profile. Drop its function profile to PROFILE_GUESSED. */ + FOR_EACH_DEFINED_FUNCTION (node) + { + struct cgraph_edge *e; + gcov_type call_count = 0; + struct function *fn = DECL_STRUCT_FUNCTION (node->decl); + + if (node->count) + continue; + for (e = node->callers; e; e = e->next_caller) + call_count += e->count; + if (call_count + && fn && fn->cfg + && (call_count * unlikely_count_fraction >= profile_info->runs)) + { + bool maybe_hot = maybe_hot_count_p (NULL, call_count); + + drop_profile (node, maybe_hot); + worklist.safe_push (node); + } + } + + /* Propagate the profile dropping to other 0-count COMDATs that are + potentially called by COMDATs we already dropped the profile on. */ + while (worklist.length () > 0) + { + struct cgraph_edge *e; + + node = worklist.pop (); + for (e = node->callees; e; e = e->next_caller) + { + struct cgraph_node *callee = e->callee; + struct function *fn = DECL_STRUCT_FUNCTION (callee->decl); + + if (callee->count > 0) + continue; + if (DECL_COMDAT (callee->decl) && fn && fn->cfg + && profile_status_for_function (fn) == PROFILE_READ) + { + /* Since there are no non-0 call counts to this function, + we don't know for sure whether it is hot. Indicate to + the drop_profile routine that function should be marked + normal, rather than hot. */ + drop_profile (node, false); + worklist.safe_push (callee); + } + } + } + worklist.release (); +} + /* Convert counts measured by profile driven feedback to frequencies. Return nonzero iff there was any nonzero execution count. */ @@ -2774,6 +2875,9 @@ counts_to_freqs (void) gcov_type count_max, true_count_max = 0; basic_block bb; + if (!ENTRY_BLOCK_PTR->count) + return 0; + FOR_BB_BETWEEN (bb, ENTRY_BLOCK_PTR, NULL, next_bb) true_count_max = MAX (bb->count, true_count_max); Index: predict.h =================================================================== --- predict.h (revision 204516) +++ predict.h (working copy) @@ -37,6 +37,7 @@ enum prediction extern void predict_insn_def (rtx, enum br_predictor, enum prediction); extern int counts_to_freqs (void); +extern void handle_missing_profiles (void); extern void estimate_bb_frequencies (bool); extern const char *predictor_name (enum br_predictor); extern tree build_predict_expr (enum br_predictor, enum prediction); Index: tree-inline.c =================================================================== --- tree-inline.c (revision 204516) +++ tree-inline.c (working copy) @@ -2353,6 +2353,29 @@ redirect_all_calls (copy_body_data * id, basic_blo } } +/* Convert estimated frequencies into counts for NODE, scaling COUNT + with each bb's frequency. Used when NODE has a 0-weight entry + but we are about to inline it into a non-zero count call bb. + See the comments for handle_missing_profiles() in predict.c for + when this can happen for COMDATs. */ + +void +freqs_to_counts (struct cgraph_node *node, gcov_type count) +{ + basic_block bb; + edge_iterator ei; + edge e; + struct function *fn = DECL_STRUCT_FUNCTION (node->decl); + + FOR_ALL_BB_FN(bb, fn) + { + bb->count = apply_scale (count, + GCOV_COMPUTE_SCALE (bb->frequency, BB_FREQ_MAX)); + FOR_EACH_EDGE (e, ei, bb->succs) + e->count = apply_probability (e->src->count, e->probability); + } +} + /* Make a copy of the body of FN so that it can be inserted inline in another function. Walks FN via CFG, returns new fndecl. */ @@ -2373,6 +2396,24 @@ copy_cfg_body (copy_body_data * id, gcov_type coun int incoming_frequency = 0; gcov_type incoming_count = 0; + /* This can happen for COMDAT routines that end up with 0 counts + despite being called (see the comments for handle_missing_profiles() + in predict.c as to why). Apply counts to the blocks in the callee + before inlining, using the guessed edge frequencies, so that we don't + end up with a 0-count inline body which can confuse downstream + optimizations such as function splitting. */ + if (!ENTRY_BLOCK_PTR_FOR_FUNCTION (src_cfun)->count && count) + { + /* Apply the larger of the call bb count and the total incoming + call edge count to the callee. */ + gcov_type in_count = 0; + struct cgraph_edge *in_edge; + for (in_edge = id->src_node->callers; in_edge; + in_edge = in_edge->next_caller) + in_count += in_edge->count; + freqs_to_counts (id->src_node, count > in_count ? count : in_count); + } + if (ENTRY_BLOCK_PTR_FOR_FUNCTION (src_cfun)->count) count_scale = GCOV_COMPUTE_SCALE (count, Index: tree-profile.c =================================================================== --- tree-profile.c (revision 204516) +++ tree-profile.c (working copy) @@ -612,6 +612,8 @@ tree_profiling (void) pop_cfun (); } + handle_missing_profiles (); + del_node_map (); return 0; }