diff mbox series

[v3] Missed function specialization + partial devirtualization

Message ID 42e71441-3a7e-13aa-7d57-f9880591fc28@linux.ibm.com
State New
Headers show
Series [v3] Missed function specialization + partial devirtualization | expand

Commit Message

Xionghu Luo July 31, 2019, 5:29 a.m. UTC
This patch aims to fix PR69678 caused by PGO indirect call profiling
performance issues.
The bug that profiling data is never working was fixed by Martin's pull
back of topN patches, performance got GEOMEAN ~1% improvement.
Still, currently the default profile only generates SINGLE indirect target
that called more than 75%.  This patch leverages MULTIPLE indirect
targets use in LTO-WPA and LTO-LTRANS stage, as a result, function
specialization, profiling, partial devirtualization, inlining and
cloning could be done successfully based on it.
Performance can get improved from 0.70 sec to 0.38 sec on simple tests.
Details are:
  1.  PGO with topn is enbaled by default now, but only one indirect
  target edge will be generated in ipa-profile pass, so add variables to enable
  multiple speculative edges through passes, speculative_id will record the direct edge
  index bind to the indirect edge, num_of_ics records how many direct edges owned by
  the indirect edge, postpone gimple_ic to ipa-profile like default as inline
  pass will decide whether it is benefit to transform indirect call.
  2.  Enable LTO WPA/LTRANS stage multiple indirect call targets analysis for
  profile full support in ipa passes and cgraph_edge functions.  speculative_id
  can be set by make_speculative id when multiple targets are binded to
  one indirect edge, and cloned if new edge is cloned.  speculative_id
  is streamed out and stream int by lto like lto_stmt_uid.
  3.  Add 1 in module testcase and 2 cross module testcases.
  4.  Bootstrap and regression test passed on Power8-LE.

v3 Changes:
 1. Rebase to trunk.
 2. Use speculative_id to track and search the reference node matched
 with the direct edge's callee for multiple targets.  This could
 eliminate the workaround strstr before.  Actually, it is the caller's
 response to handle the direct edges mapped to same indirect edge.
 speculative_call_info will still return one of the direct edge
 specified, this will leverage current IPA edge process framework mostly.

gcc/ChangeLog

	2019-07-31  Xiong Hu Luo  <luoxhu@linux.ibm.com>

	PR ipa/69678
	* cgraph.c (symbol_table::create_edge): Init speculative_id.
	(cgraph_edge::make_speculative): Add param for setting speculative_id.
	(cgraph_edge::speculative_call_info): Find reference by
	speculative_id for multiple indirect targets.
	(cgraph_edge::resolve_speculation): Decrease the speculations
	for indirect edge, drop it's speculative if not direct target
	left.
	(cgraph_edge::redirect_call_stmt_to_callee): Likewise.
	(cgraph_node::verify_node): Don't report error if speculative
	edge not include statement.
	* cgraph.h (struct indirect_target_info): New struct.
	(indirect_call_targets): New vector variable.
	(num_of_ics): New variable.
	(make_speculative): Add param for setting speculative_id.
	(speculative_id): New variable.
	* cgraphclones.c (cgraph_node::create_clone): Clone speculative_id.
	* ipa-inline.c (inline_small_functions): Add iterator update.
	* ipa-profile.c (ipa_profile_generate_summary): Add indirect
	multiple targets logic.
	(ipa_profile): Likewise.
	* ipa-ref.h (speculative_id): New variable.
	* ipa.c (process_references): Fix typo.
	* lto-cgraph.c (lto_output_edge): Add indirect multiple targets
	logic.  Stream out speculative_id.
	(input_edge): Likewise.
	* predict.c (dump_prediction): Revome edges count assert to be
	precise.
	* symtab.c (symtab_node::create_reference): Init speculative_id.
	(symtab_node::clone_references): Clone speculative_id.
	(symtab_node::clone_referring): Clone speculative_id.
	(symtab_node::clone_reference): Clone speculative_id.
	(symtab_node::clear_stmts_in_references): Clear speculative_id.
	* tree-inline.c (copy_bb): Duplicate all the speculative edges
	if indirect call contains multiple speculative targets.
	* tree-profile.c (gimple_gen_ic_profiler): Use the new variable
	__gcov_indirect_call.counters and __gcov_indirect_call.callee.
	(gimple_gen_ic_func_profiler): Likewise.
	(pass_ipa_tree_profile::gate): Fix comment typos.
	* value-prof.c  (gimple_ic_transform): Handle topn case.
	Fix comment typos.

gcc/testsuite/ChangeLog

	2019-07-31  Xiong Hu Luo  <luoxhu@linux.ibm.com>

	PR ipa/69678
	* gcc.dg/tree-prof/indir-call-prof-topn.c: New testcase.
	* gcc.dg/tree-prof/crossmodule-indir-call-topn-1.c: New testcase.
	* gcc.dg/tree-prof/crossmodule-indir-call-topn-1a.c: New testcase.
	* gcc.dg/tree-prof/crossmodule-indir-call-topn-2.c: New testcase.
---
 gcc/cgraph.c                                  |  70 +++++-
 gcc/cgraph.h                                  |  28 ++-
 gcc/cgraphclones.c                            |   1 +
 gcc/ipa-inline.c                              |  15 +-
 gcc/ipa-profile.c                             | 238 +++++++++++-------
 gcc/ipa-ref.h                                 |   1 +
 gcc/ipa.c                                     |   2 +-
 gcc/lto-cgraph.c                              |  54 +++-
 gcc/predict.c                                 |   1 -
 gcc/symtab.c                                  |   5 +
 .../tree-prof/crossmodule-indir-call-topn-1.c |  35 +++
 .../crossmodule-indir-call-topn-1a.c          |  22 ++
 .../tree-prof/crossmodule-indir-call-topn-2.c |  42 ++++
 .../gcc.dg/tree-prof/indir-call-prof-topn.c   |  38 +++
 gcc/tree-inline.c                             |  20 ++
 gcc/tree-profile.c                            |  12 +-
 gcc/value-prof.c                              |  80 +++---
 17 files changed, 500 insertions(+), 164 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/tree-prof/crossmodule-indir-call-topn-1.c
 create mode 100644 gcc/testsuite/gcc.dg/tree-prof/crossmodule-indir-call-topn-1a.c
 create mode 100644 gcc/testsuite/gcc.dg/tree-prof/crossmodule-indir-call-topn-2.c
 create mode 100644 gcc/testsuite/gcc.dg/tree-prof/indir-call-prof-topn.c

Comments

Martin Liška Aug. 13, 2019, 12:58 p.m. UTC | #1
On 7/31/19 7:29 AM, luoxhu wrote:
> This patch aims to fix PR69678 caused by PGO indirect call profiling
> performance issues.
> The bug that profiling data is never working was fixed by Martin's pull
> back of topN patches, performance got GEOMEAN ~1% improvement.
> Still, currently the default profile only generates SINGLE indirect target
> that called more than 75%.  This patch leverages MULTIPLE indirect
> targets use in LTO-WPA and LTO-LTRANS stage, as a result, function
> specialization, profiling, partial devirtualization, inlining and
> cloning could be done successfully based on it.
> Performance can get improved from 0.70 sec to 0.38 sec on simple tests.
> Details are:
>   1.  PGO with topn is enbaled by default now, but only one indirect
>   target edge will be generated in ipa-profile pass, so add variables to enable
>   multiple speculative edges through passes, speculative_id will record the direct edge
>   index bind to the indirect edge, num_of_ics records how many direct edges owned by
>   the indirect edge, postpone gimple_ic to ipa-profile like default as inline
>   pass will decide whether it is benefit to transform indirect call.
>   2.  Enable LTO WPA/LTRANS stage multiple indirect call targets analysis for
>   profile full support in ipa passes and cgraph_edge functions.  speculative_id
>   can be set by make_speculative id when multiple targets are binded to
>   one indirect edge, and cloned if new edge is cloned.  speculative_id
>   is streamed out and stream int by lto like lto_stmt_uid.
>   3.  Add 1 in module testcase and 2 cross module testcases.
>   4.  Bootstrap and regression test passed on Power8-LE.
> 
> v3 Changes:
>  1. Rebase to trunk.
>  2. Use speculative_id to track and search the reference node matched
>  with the direct edge's callee for multiple targets.  This could
>  eliminate the workaround strstr before.  Actually, it is the caller's
>  response to handle the direct edges mapped to same indirect edge.
>  speculative_call_info will still return one of the direct edge
>  specified, this will leverage current IPA edge process framework mostly.

Thank you for the next version of the patch. Inline comments follow:

> 
> gcc/ChangeLog
> 
> 	2019-07-31  Xiong Hu Luo  <luoxhu@linux.ibm.com>
> 
> 	PR ipa/69678
> 	* cgraph.c (symbol_table::create_edge): Init speculative_id.
> 	(cgraph_edge::make_speculative): Add param for setting speculative_id.
> 	(cgraph_edge::speculative_call_info): Find reference by
> 	speculative_id for multiple indirect targets.
> 	(cgraph_edge::resolve_speculation): Decrease the speculations
> 	for indirect edge, drop it's speculative if not direct target
> 	left.
> 	(cgraph_edge::redirect_call_stmt_to_callee): Likewise.
> 	(cgraph_node::verify_node): Don't report error if speculative
> 	edge not include statement.
> 	* cgraph.h (struct indirect_target_info): New struct.
> 	(indirect_call_targets): New vector variable.
> 	(num_of_ics): New variable.
> 	(make_speculative): Add param for setting speculative_id.
> 	(speculative_id): New variable.
> 	* cgraphclones.c (cgraph_node::create_clone): Clone speculative_id.
> 	* ipa-inline.c (inline_small_functions): Add iterator update.
> 	* ipa-profile.c (ipa_profile_generate_summary): Add indirect
> 	multiple targets logic.
> 	(ipa_profile): Likewise.
> 	* ipa-ref.h (speculative_id): New variable.
> 	* ipa.c (process_references): Fix typo.
> 	* lto-cgraph.c (lto_output_edge): Add indirect multiple targets
> 	logic.  Stream out speculative_id.
> 	(input_edge): Likewise.
> 	* predict.c (dump_prediction): Revome edges count assert to be
> 	precise.
> 	* symtab.c (symtab_node::create_reference): Init speculative_id.
> 	(symtab_node::clone_references): Clone speculative_id.
> 	(symtab_node::clone_referring): Clone speculative_id.
> 	(symtab_node::clone_reference): Clone speculative_id.
> 	(symtab_node::clear_stmts_in_references): Clear speculative_id.
> 	* tree-inline.c (copy_bb): Duplicate all the speculative edges
> 	if indirect call contains multiple speculative targets.
> 	* tree-profile.c (gimple_gen_ic_profiler): Use the new variable
> 	__gcov_indirect_call.counters and __gcov_indirect_call.callee.
> 	(gimple_gen_ic_func_profiler): Likewise.
> 	(pass_ipa_tree_profile::gate): Fix comment typos.
> 	* value-prof.c  (gimple_ic_transform): Handle topn case.
> 	Fix comment typos.
> 
> gcc/testsuite/ChangeLog
> 
> 	2019-07-31  Xiong Hu Luo  <luoxhu@linux.ibm.com>
> 
> 	PR ipa/69678
> 	* gcc.dg/tree-prof/indir-call-prof-topn.c: New testcase.
> 	* gcc.dg/tree-prof/crossmodule-indir-call-topn-1.c: New testcase.
> 	* gcc.dg/tree-prof/crossmodule-indir-call-topn-1a.c: New testcase.
> 	* gcc.dg/tree-prof/crossmodule-indir-call-topn-2.c: New testcase.
> ---
>  gcc/cgraph.c                                  |  70 +++++-
>  gcc/cgraph.h                                  |  28 ++-
>  gcc/cgraphclones.c                            |   1 +
>  gcc/ipa-inline.c                              |  15 +-
>  gcc/ipa-profile.c                             | 238 +++++++++++-------
>  gcc/ipa-ref.h                                 |   1 +
>  gcc/ipa.c                                     |   2 +-
>  gcc/lto-cgraph.c                              |  54 +++-
>  gcc/predict.c                                 |   1 -
>  gcc/symtab.c                                  |   5 +
>  .../tree-prof/crossmodule-indir-call-topn-1.c |  35 +++
>  .../crossmodule-indir-call-topn-1a.c          |  22 ++
>  .../tree-prof/crossmodule-indir-call-topn-2.c |  42 ++++
>  .../gcc.dg/tree-prof/indir-call-prof-topn.c   |  38 +++
>  gcc/tree-inline.c                             |  20 ++
>  gcc/tree-profile.c                            |  12 +-
>  gcc/value-prof.c                              |  80 +++---
>  17 files changed, 500 insertions(+), 164 deletions(-)
>  create mode 100644 gcc/testsuite/gcc.dg/tree-prof/crossmodule-indir-call-topn-1.c
>  create mode 100644 gcc/testsuite/gcc.dg/tree-prof/crossmodule-indir-call-topn-1a.c
>  create mode 100644 gcc/testsuite/gcc.dg/tree-prof/crossmodule-indir-call-topn-2.c
>  create mode 100644 gcc/testsuite/gcc.dg/tree-prof/indir-call-prof-topn.c
> 
> diff --git a/gcc/cgraph.c b/gcc/cgraph.c
> index a7e3bcf2132..d7be9e9c545 100644
> --- a/gcc/cgraph.c
> +++ b/gcc/cgraph.c
> @@ -869,6 +869,7 @@ symbol_table::create_edge (cgraph_node *caller, cgraph_node *callee,
>    edge->prev_callee = NULL;
>    edge->next_callee = NULL;
>    edge->lto_stmt_uid = 0;
> +  edge->speculative_id = 0;
>  
>    edge->count = count;
>  
> @@ -1063,7 +1064,8 @@ cgraph_edge::remove (void)
>     Return direct edge created.  */
>  
>  cgraph_edge *
> -cgraph_edge::make_speculative (cgraph_node *n2, profile_count direct_count)
> +cgraph_edge::make_speculative (cgraph_node *n2, profile_count direct_count,
> +			       unsigned int speculative_id)
>  {
>    cgraph_node *n = caller;
>    ipa_ref *ref = NULL;
> @@ -1081,11 +1083,13 @@ cgraph_edge::make_speculative (cgraph_node *n2, profile_count direct_count)
>    else
>      e2->can_throw_external = can_throw_external;
>    e2->lto_stmt_uid = lto_stmt_uid;
> +  e2->speculative_id = speculative_id;
>    e2->in_polymorphic_cdtor = in_polymorphic_cdtor;
>    count -= e2->count;
>    symtab->call_edge_duplication_hooks (this, e2);
>    ref = n->create_reference (n2, IPA_REF_ADDR, call_stmt);
>    ref->lto_stmt_uid = lto_stmt_uid;
> +  ref->speculative_id = speculative_id;
>    ref->speculative = speculative;
>    n2->mark_address_taken ();
>    return e2;
> @@ -1099,6 +1103,38 @@ cgraph_edge::make_speculative (cgraph_node *n2, profile_count direct_count)
>     call) and if one of them exists, all of them must exist.
>  
>     Given speculative call edge, return all three components.
> +
> +   For some indirect edge, it may maps to multiple direct edges, i.e. 1:N.
> +   check the speculative_id to return all the three components for specified
> +   direct edge or indirect edge.
> +   If input is indirect, caller of this function will get the direct edge one by
> +   one, get_edge will just return one of the direct edge mapped to the indirect
> +   edge, the returned direct edge will be resolved or redirected by the caller,
> +   then num_of_ics (speculations) is deceased in each access.
> +   If input is direct, this function will get the indirect edge and reference
> +   with matched speculative_id, the returned edge will also be resolved or
> +   redirected, decrease the speculations accordingly.
> +   Speculations of indirect edge will be dropped only if all direct edges
> +   be handled.
> +
> +   e.g. for indirect edge E statement "call call_dest":
> +
> +   Redirect N3 after redirected N2:
> +
> +   if (call_dest == N2)
> +     n2 ();
> +   else if (call_dest == N3)
> +     n3 ();
> +   else
> +     call call_dest
> +
> +   Resolve N3 and only redirect N2:
> +
> +   if (call_dest == N2)
> +     n2 ();
> +   else
> +     call call_dest
> +
>   */
>  
>  void
> @@ -1138,7 +1174,7 @@ cgraph_edge::speculative_call_info (cgraph_edge *&direct,
>  
>    reference = NULL;
>    for (i = 0; e->caller->iterate_reference (i, ref); i++)
> -    if (ref->speculative
> +    if (ref->speculative && ref->speculative_id == e->speculative_id
>  	&& ((ref->stmt && ref->stmt == e->call_stmt)
>  	    || (!ref->stmt && ref->lto_stmt_uid == e->lto_stmt_uid)))
>        {
> @@ -1199,7 +1235,19 @@ cgraph_edge::resolve_speculation (tree callee_decl)
>           in the functions inlined through it.  */
>      }
>    edge->count += e2->count;
> -  edge->speculative = false;
> +  /* edge is indirect, e2 is direct.  If edge contains multiple speculations,
> +     remove one of speculations for this indirect edge, then if edge still
> +     contains direct target, keep the speculation, next direct target
> +     will continue use it.  Give up speculation completely if no direct
> +     target is left for this indirect edge.  */
> +  if (edge->indirect_info && edge->indirect_info->num_of_ics)
> +    {
> +      edge->indirect_info->num_of_ics--;
> +      if (edge->indirect_info->num_of_ics == 0)
> +	edge->speculative = false;
> +    }
> +  else
> +    edge->speculative = false;
>    e2->speculative = false;
>    ref->remove_reference ();
>    if (e2->indirect_unknown_callee || e2->inline_failed)
> @@ -1333,7 +1381,19 @@ cgraph_edge::redirect_call_stmt_to_callee (void)
>  	  e->caller->set_call_stmt_including_clones (e->call_stmt, new_stmt,
>  						     false);
>  	  e->count = gimple_bb (e->call_stmt)->count;
> -	  e2->speculative = false;
> +	  /* edge is direct, e2 is indirect here.  If e2 contains multiple
> +	     speculations, remove one of speculations for this indirect edge,
> +	     then if e2 still contains direct target, keep the speculation,
> +	     next direct target will continue use it.  Give up speculation
> +	     completely if no direct target is left for this indirect e2.  */
> +	  if (e2->indirect_info && e2->indirect_info->num_of_ics)
> +	    {
> +	      e2->indirect_info->num_of_ics--;
> +	      if (e2->indirect_info->num_of_ics == 0)
> +		e2->speculative = false;
> +	    }
> +	  else
> +	    e2->speculative = false;
>  	  e2->count = gimple_bb (e2->call_stmt)->count;
>  	  ref->speculative = false;
>  	  ref->stmt = NULL;
> @@ -3434,7 +3494,7 @@ cgraph_node::verify_node (void)
>  
>        for (e = callees; e; e = e->next_callee)
>  	{
> -	  if (!e->aux)
> +	  if (!e->aux && !e->speculative)
>  	    {
>  	      error ("edge %s->%s has no corresponding call_stmt",
>  		     identifier_to_locale (e->caller->name ()),
> diff --git a/gcc/cgraph.h b/gcc/cgraph.h
> index fa5224fb3a5..0b453c3534c 100644
> --- a/gcc/cgraph.h
> +++ b/gcc/cgraph.h
> @@ -1630,6 +1630,16 @@ private:
>    void make_speculative (tree otr_type = NULL);
>  };
>  
> +/* Structure containing indirect target information from profile.  */
> +
> +struct GTY (()) indirect_target_info
> +{
> +  /* Profile_id of common target obtained from profile.  */
> +  unsigned int common_target_id;
> +  /* Probability that call will land in function with COMMON_TARGET_ID.  */
> +  int common_target_probability;
> +};
> +
>  /* Structure containing additional information about an indirect call.  */
>  
>  class GTY(()) cgraph_indirect_call_info
> @@ -1648,10 +1658,14 @@ public:
>    int param_index;
>    /* ECF flags determined from the caller.  */
>    int ecf_flags;
> -  /* Profile_id of common target obtrained from profile.  */
> -  int common_target_id;
> -  /* Probability that call will land in function with COMMON_TARGET_ID.  */
> -  int common_target_probability;
> +
> +  /* An indirect call may contain one or multiple call targets.  */
> +  vec<indirect_target_info, va_gc> *indirect_call_targets;
> +
> +  /* Number of indirect targets (speculations) for the indirect call.
> +     Increased by ipa-profile, decreased by resolve_speculation or redirect
> +     statement to callee.  */
> +  unsigned num_of_ics;

Would it be possible to use just the vector and replace all usages of num_of_ics
with indirect_call_targets->length () ? It would make the code easier.

>  
>    /* Set when the call is a virtual call with the parameter being the
>       associated object pointer rather than a simple direct call.  */
> @@ -1708,7 +1722,8 @@ public:
>    /* Turn edge into speculative call calling N2. Update
>       the profile so the direct call is taken COUNT times
>       with FREQUENCY.  */
> -  cgraph_edge *make_speculative (cgraph_node *n2, profile_count direct_count);
> +  cgraph_edge *make_speculative (cgraph_node *n2, profile_count direct_count,
> +				 unsigned int speculative_id = 0);
>  
>     /* Given speculative call edge, return all three components.  */
>    void speculative_call_info (cgraph_edge *&direct, cgraph_edge *&indirect,
> @@ -1786,6 +1801,9 @@ public:
>    /* The stmt_uid of call_stmt.  This is used by LTO to recover the call_stmt
>       when the function is serialized in.  */
>    unsigned int lto_stmt_uid;
> +  /* speculative id is used by multiple indirect targets when the function is
> +   speculated.  */
> +  unsigned int speculative_id;
>    /* Whether this edge was made direct by indirect inlining.  */
>    unsigned int indirect_inlining_edge : 1;
>    /* Whether this edge describes an indirect call with an undetermined
> diff --git a/gcc/cgraphclones.c b/gcc/cgraphclones.c
> index fd867ecac91..ab6b9dd86c8 100644
> --- a/gcc/cgraphclones.c
> +++ b/gcc/cgraphclones.c
> @@ -128,6 +128,7 @@ cgraph_edge::clone (cgraph_node *n, gcall *call_stmt, unsigned stmt_uid,
>    new_edge->inline_failed = inline_failed;
>    new_edge->indirect_inlining_edge = indirect_inlining_edge;
>    new_edge->lto_stmt_uid = stmt_uid;
> +  new_edge->speculative_id = speculative_id;
>    /* Clone flags that depend on call_stmt availability manually.  */
>    new_edge->can_throw_external = can_throw_external;
>    new_edge->call_stmt_cannot_inline_p = call_stmt_cannot_inline_p;
> diff --git a/gcc/ipa-inline.c b/gcc/ipa-inline.c
> index 939d86ef94a..f5654198877 100644
> --- a/gcc/ipa-inline.c
> +++ b/gcc/ipa-inline.c
> @@ -1877,12 +1877,15 @@ inline_small_functions (void)
>  	}
>        if (has_speculative)
>  	for (edge = node->callees; edge; edge = next)
> -	  if (edge->speculative && !speculation_useful_p (edge,
> -							  edge->aux != NULL))
> -	    {
> -	      edge->resolve_speculation ();
> -	      update = true;
> -	    }
> +	  {
> +	    next = edge->next_callee;
> +	    if (edge->speculative
> +		&& !speculation_useful_p (edge, edge->aux != NULL))
> +	      {
> +		edge->resolve_speculation ();
> +		update = true;
> +	      }
> +	  }
>        if (update)
>  	{
>  	  struct cgraph_node *where = node->global.inlined_to
> diff --git a/gcc/ipa-profile.c b/gcc/ipa-profile.c
> index 970dba39c80..26214eba27e 100644
> --- a/gcc/ipa-profile.c
> +++ b/gcc/ipa-profile.c
> @@ -192,23 +192,39 @@ ipa_profile_generate_summary (void)
>  		  if (h)
>  		    {
>  		      gcov_type val, count, all;
> -		      if (get_nth_most_common_value (NULL, "indirect call", h,
> -						     &val, &count, &all))
> +		      struct cgraph_edge *e = node->get_edge (stmt);
> +		      if (e && !e->indirect_unknown_callee)
> +			continue;
> +
> +		      e->indirect_info->num_of_ics = 0;
> +		      for (unsigned j = 0; j < GCOV_TOPN_VALUES; j++)
>  			{
> -			  struct cgraph_edge * e = node->get_edge (stmt);
> -			  if (e && !e->indirect_unknown_callee)
> +			  if (!get_nth_most_common_value (NULL, "indirect call",
> +							  h, &val, &count, &all,
> +							  j))
> +			    continue;
> +
> +			  if (val == 0)
>  			    continue;
>  
> -			  e->indirect_info->common_target_id = val;
> -			  e->indirect_info->common_target_probability
> +			  struct indirect_target_info item;
> +			  item.common_target_id = val;
> +			  item.common_target_probability
>  			    = GCOV_COMPUTE_SCALE (count, all);
> -			  if (e->indirect_info->common_target_probability > REG_BR_PROB_BASE)
> +			  if (item.common_target_probability > REG_BR_PROB_BASE)
>  			    {
>  			      if (dump_file)
> -				fprintf (dump_file, "Probability capped to 1\n");
> -			      e->indirect_info->common_target_probability = REG_BR_PROB_BASE;
> +				fprintf (dump_file,
> +					 "Probability capped to 1\n");
> +			      item.common_target_probability = REG_BR_PROB_BASE;
>  			    }
> +			  vec_safe_push (
> +			    e->indirect_info->indirect_call_targets, item);
> +			  e->indirect_info->num_of_ics++;
>  			}
> +		      gcc_assert (e->indirect_info->num_of_ics
> +				  <= GCOV_TOPN_VALUES);
> +
>  		      gimple_remove_histogram_value (DECL_STRUCT_FUNCTION (node->decl),
>  						      stmt, h);
>  		    }
> @@ -492,6 +508,7 @@ ipa_profile (void)
>    int nindirect = 0, ncommon = 0, nunknown = 0, nuseless = 0, nconverted = 0;
>    int nmismatch = 0, nimpossible = 0;
>    bool node_map_initialized = false;
> +  gcov_type threshold;
>  
>    if (dump_file)
>      dump_histogram (dump_file, histogram);
> @@ -500,14 +517,12 @@ ipa_profile (void)
>        overall_time += histogram[i]->count * histogram[i]->time;
>        overall_size += histogram[i]->size;
>      }
> +  threshold = 0;
>    if (overall_time)
>      {
> -      gcov_type threshold;
> -
>        gcc_assert (overall_size);
>  
>        cutoff = (overall_time * PARAM_VALUE (HOT_BB_COUNT_WS_PERMILLE) + 500) / 1000;
> -      threshold = 0;
>        for (i = 0; cumulated < cutoff; i++)
>  	{
>  	  cumulated += histogram[i]->count * histogram[i]->time;
> @@ -543,7 +558,7 @@ ipa_profile (void)
>    histogram.release ();
>    histogram_pool.release ();
>  
> -  /* Produce speculative calls: we saved common traget from porfiling into
> +  /* Produce speculative calls: we saved common target from profiling into
>       e->common_target_id.  Now, at link time, we can look up corresponding
>       function node and produce speculative call.  */
>  
> @@ -558,104 +573,133 @@ ipa_profile (void)
>  	{
>  	  if (n->count.initialized_p ())
>  	    nindirect++;
> -	  if (e->indirect_info->common_target_id)
> +	  if (e->indirect_info && e->indirect_info->num_of_ics)
>  	    {
> -	      if (!node_map_initialized)
> -	        init_node_map (false);
> -	      node_map_initialized = true;
> -	      ncommon++;
> -	      n2 = find_func_by_profile_id (e->indirect_info->common_target_id);
> -	      if (n2)
> +	      if (in_lto_p)
>  		{

Starting from here, it's very difficult to read the patch. Reason is that you
wrapped all this in a FOR_EACH_VEC_SAFE_ELT. Would it be please possible
to make a refactoring patch that will put code with all the if-else-else-else..
into a separate function? Then you'll be able to call the function in a small loop:

FOR_EACH_VEC_SAFE_ELT (e->indirect_info->indirect_call_targets, i, item)
   new_function_call...

It will simplify the patch review a lot.


>  		  if (dump_file)
>  		    {
> -		      fprintf (dump_file, "Indirect call -> direct call from"
> -			       " other module %s => %s, prob %3.2f\n",
> -			       n->dump_name (),
> -			       n2->dump_name (),
> -			       e->indirect_info->common_target_probability
> -			       / (float)REG_BR_PROB_BASE);
> -		    }
> -		  if (e->indirect_info->common_target_probability
> -		      < REG_BR_PROB_BASE / 2)
> -		    {
> -		      nuseless++;
> -		      if (dump_file)
> -			fprintf (dump_file,
> -				 "Not speculating: probability is too low.\n");
> -		    }
> -		  else if (!e->maybe_hot_p ())
> -		    {
> -		      nuseless++;
> -		      if (dump_file)
> -			fprintf (dump_file,
> -				 "Not speculating: call is cold.\n");
> -		    }
> -		  else if (n2->get_availability () <= AVAIL_INTERPOSABLE
> -			   && n2->can_be_discarded_p ())
> -		    {
> -		      nuseless++;
> -		      if (dump_file)
> -			fprintf (dump_file,
> -				 "Not speculating: target is overwritable "
> -				 "and can be discarded.\n");
> +		      fprintf (dump_file,
> +			       "Updating hotness threshold in LTO mode.\n");
> +		      fprintf (dump_file, "Updated min count: %" PRId64 "\n",
> +			       (int64_t) threshold);
>  		    }
> -		  else if (ipa_node_params_sum && ipa_edge_args_sum
> -			   && (!vec_safe_is_empty
> -			       (IPA_NODE_REF (n2)->descriptors))
> -			   && ipa_get_param_count (IPA_NODE_REF (n2))
> -			      != ipa_get_cs_argument_count (IPA_EDGE_REF (e))
> -			    && (ipa_get_param_count (IPA_NODE_REF (n2))
> -				>= ipa_get_cs_argument_count (IPA_EDGE_REF (e))
> -				|| !stdarg_p (TREE_TYPE (n2->decl))))
> +		  set_hot_bb_threshold (threshold
> +					/ e->indirect_info->num_of_ics);
> +		}
> +	      if (!node_map_initialized)
> +		init_node_map (false);
> +	      node_map_initialized = true;
> +	      ncommon++;
> +	      unsigned speculative_id = 0;
> +	      struct indirect_target_info *item;
> +	      FOR_EACH_VEC_SAFE_ELT (e->indirect_info->indirect_call_targets, i,
> +				     item)
> +		{
> +		  if ((unsigned) i > e->indirect_info->num_of_ics)
> +		    break;
> +		  n2 = find_func_by_profile_id (item->common_target_id);
> +		  if (n2)
>  		    {
> -		      nmismatch++;
>  		      if (dump_file)
> -			fprintf (dump_file,
> -				 "Not speculating: "
> -				 "parameter count mistmatch\n");
> +			{
> +			  fprintf (dump_file,
> +				   "Indirect call -> direct call from"
> +				   " other module %s => %s, prob %3.2f\n",
> +				   n->dump_name (), n2->dump_name (),
> +				   item->common_target_probability
> +				     / (float) REG_BR_PROB_BASE);
> +			}
> +		      if (item->common_target_probability
> +			  < REG_BR_PROB_BASE / 2)
> +			{
> +			  nuseless++;
> +			  if (dump_file)
> +			    fprintf (
> +			      dump_file,
> +			      "Not speculating: probability is too low.\n");
> +			}
> +		      else if (!e->maybe_hot_p ())
> +			{
> +			  nuseless++;
> +			  if (dump_file)
> +			    fprintf (dump_file,
> +				     "Not speculating: call is cold.\n");
> +			}
> +		      else if (n2->get_availability () <= AVAIL_INTERPOSABLE
> +			       && n2->can_be_discarded_p ())
> +			{
> +			  nuseless++;
> +			  if (dump_file)
> +			    fprintf (dump_file,
> +				     "Not speculating: target is overwritable "
> +				     "and can be discarded.\n");
> +			}
> +		      else if (ipa_node_params_sum && ipa_edge_args_sum
> +			       && (!vec_safe_is_empty (
> +				 IPA_NODE_REF (n2)->descriptors))
> +			       && ipa_get_param_count (IPA_NODE_REF (n2))
> +				    != ipa_get_cs_argument_count (
> +				      IPA_EDGE_REF (e))
> +			       && (ipa_get_param_count (IPA_NODE_REF (n2))
> +				     >= ipa_get_cs_argument_count (
> +				       IPA_EDGE_REF (e))
> +				   || !stdarg_p (TREE_TYPE (n2->decl))))
> +			{
> +			  nmismatch++;
> +			  if (dump_file)
> +			    fprintf (dump_file, "Not speculating: "
> +						"parameter count mismatch\n");
> +			}
> +		      else if (e->indirect_info->polymorphic
> +			       && !opt_for_fn (n->decl, flag_devirtualize)
> +			       && !possible_polymorphic_call_target_p (e, n2))
> +			{
> +			  nimpossible++;
> +			  if (dump_file)
> +			    fprintf (dump_file,
> +				     "Not speculating: "
> +				     "function is not in the polymorphic "
> +				     "call target list\n");
> +			}
> +		      else
> +			{
> +			  /* Target may be overwritable, but profile says that
> +			     control flow goes to this particular implementation
> +			     of N2.  Speculate on the local alias to allow
> +			     inlining.
> +			     */
> +			  if (!n2->can_be_discarded_p ())
> +			    {
> +			      cgraph_node *alias;
> +			      alias = dyn_cast<cgraph_node *> (
> +				n2->noninterposable_alias ());
> +			      if (alias)
> +				n2 = alias;
> +			    }
> +			  nconverted++;
> +			  e->make_speculative (
> +			    n2,
> +			    e->count.apply_probability (
> +			      item->common_target_probability),
> +			    speculative_id);
> +			  update = true;
> +			  speculative_id++;
> +			}
>  		    }
> -		  else if (e->indirect_info->polymorphic
> -			   && !opt_for_fn (n->decl, flag_devirtualize)
> -			   && !possible_polymorphic_call_target_p (e, n2))
> +		  else
>  		    {
> -		      nimpossible++;
>  		      if (dump_file)
>  			fprintf (dump_file,
> -				 "Not speculating: "
> -				 "function is not in the polymorphic "
> -				 "call target list\n");
> +				 "Function with profile-id %i not found.\n",
> +				 item->common_target_id);
> +		      nunknown++;
>  		    }
> -		  else
> -		    {
> -		      /* Target may be overwritable, but profile says that
> -			 control flow goes to this particular implementation
> -			 of N2.  Speculate on the local alias to allow inlining.
> -		       */
> -		      if (!n2->can_be_discarded_p ())
> -			{
> -			  cgraph_node *alias;
> -			  alias = dyn_cast<cgraph_node *> (n2->noninterposable_alias ());
> -			  if (alias)
> -			    n2 = alias;
> -			}
> -		      nconverted++;
> -		      e->make_speculative
> -			(n2,
> -			 e->count.apply_probability
> -				     (e->indirect_info->common_target_probability));
> -		      update = true;
> -		    }
> -		}
> -	      else
> -		{
> -		  if (dump_file)
> -		    fprintf (dump_file, "Function with profile-id %i not found.\n",
> -			     e->indirect_info->common_target_id);
> -		  nunknown++;
>  		}
> +	      if (speculative_id < e->indirect_info->num_of_ics)
> +		e->indirect_info->num_of_ics = speculative_id;
>  	    }
> -	 }
> +	}
>         if (update)
>  	 ipa_update_overall_fn_summary (n);
>       }
> diff --git a/gcc/ipa-ref.h b/gcc/ipa-ref.h
> index 0d8e509c932..3e6562ec9d1 100644
> --- a/gcc/ipa-ref.h
> +++ b/gcc/ipa-ref.h
> @@ -59,6 +59,7 @@ public:
>    symtab_node *referred;
>    gimple *stmt;
>    unsigned int lto_stmt_uid;
> +  unsigned int speculative_id;
>    unsigned int referred_index;
>    ENUM_BITFIELD (ipa_ref_use) use:3;
>    unsigned int speculative:1;
> diff --git a/gcc/ipa.c b/gcc/ipa.c
> index 2496694124c..c1fe081a72d 100644
> --- a/gcc/ipa.c
> +++ b/gcc/ipa.c
> @@ -166,7 +166,7 @@ process_references (symtab_node *snode,
>     devirtualization happens.  After inlining still keep their declarations
>     around, so we can devirtualize to a direct call.
>  
> -   Also try to make trivial devirutalization when no or only one target is
> +   Also try to make trivial devirtualization when no or only one target is
>     possible.  */
>  
>  static void
> diff --git a/gcc/lto-cgraph.c b/gcc/lto-cgraph.c
> index bc0f0107333..1dc856d3c64 100644
> --- a/gcc/lto-cgraph.c
> +++ b/gcc/lto-cgraph.c
> @@ -238,6 +238,7 @@ lto_output_edge (struct lto_simple_output_block *ob, struct cgraph_edge *edge,
>    unsigned int uid;
>    intptr_t ref;
>    struct bitpack_d bp;
> +  unsigned len;
>  
>    if (edge->indirect_unknown_callee)
>      streamer_write_enum (ob->main_stream, LTO_symtab_tags, LTO_symtab_last_tag,
> @@ -265,6 +266,7 @@ lto_output_edge (struct lto_simple_output_block *ob, struct cgraph_edge *edge,
>    bp_pack_enum (&bp, cgraph_inline_failed_t,
>  	        CIF_N_REASONS, edge->inline_failed);
>    bp_pack_var_len_unsigned (&bp, uid);
> +  bp_pack_var_len_unsigned (&bp, edge->speculative_id);
>    bp_pack_value (&bp, edge->indirect_inlining_edge, 1);
>    bp_pack_value (&bp, edge->speculative, 1);
>    bp_pack_value (&bp, edge->call_stmt_cannot_inline_p, 1);
> @@ -291,11 +293,28 @@ lto_output_edge (struct lto_simple_output_block *ob, struct cgraph_edge *edge,
>    streamer_write_bitpack (&bp);
>    if (edge->indirect_unknown_callee)
>      {
> +      struct indirect_target_info *item;
> +      unsigned int i;
> +      len = edge->indirect_info->num_of_ics;
> +      gcc_assert (len <= GCOV_TOPN_VALUES);
> +
>        streamer_write_hwi_stream (ob->main_stream,
> -			         edge->indirect_info->common_target_id);
> -      if (edge->indirect_info->common_target_id)
> -	streamer_write_hwi_stream
> -	   (ob->main_stream, edge->indirect_info->common_target_probability);
> +				 edge->indirect_info->num_of_ics);
> +
> +      if (len)
> +	{
> +	  FOR_EACH_VEC_SAFE_ELT (edge->indirect_info->indirect_call_targets, i,
> +				 item)
> +	    {
> +	      if (i == edge->indirect_info->num_of_ics)
> +		break;

This is one of the places which will simplify with the vec::length.

> +	      streamer_write_hwi_stream (ob->main_stream,
> +					 item->common_target_id);
> +	      if (item->common_target_id)
> +		streamer_write_hwi_stream (ob->main_stream,
> +					   item->common_target_probability);
> +	    }
> +	}
>      }
>  }
>  
> @@ -688,6 +707,7 @@ lto_output_ref (struct lto_simple_output_block *ob, struct ipa_ref *ref,
>        if (ref->stmt)
>  	uid = gimple_uid (ref->stmt) + 1;
>        streamer_write_hwi_stream (ob->main_stream, uid);
> +      streamer_write_hwi_stream (ob->main_stream, ref->speculative_id);
>      }
>  }
>  
> @@ -1420,6 +1440,8 @@ input_ref (class lto_input_block *ib,
>    ref->speculative = speculative;
>    if (is_a <cgraph_node *> (referring_node))
>      ref->lto_stmt_uid = streamer_read_hwi (ib);
> +  if (is_a <cgraph_node *> (referring_node))
> +    ref->speculative_id = streamer_read_hwi (ib);

Here you want to write the condition just once.

>  }
>  
>  /* Read an edge from IB.  NODES points to a vector of previously read nodes for
> @@ -1433,11 +1455,12 @@ input_edge (class lto_input_block *ib, vec<symtab_node *> nodes,
>  {
>    struct cgraph_node *caller, *callee;
>    struct cgraph_edge *edge;
> -  unsigned int stmt_id;
> +  unsigned int stmt_id, speculative_id;
>    profile_count count;
>    cgraph_inline_failed_t inline_failed;
>    struct bitpack_d bp;
>    int ecf_flags = 0;
> +  unsigned i, len;
>  
>    caller = dyn_cast<cgraph_node *> (nodes[streamer_read_hwi (ib)]);
>    if (caller == NULL || caller->decl == NULL_TREE)
> @@ -1457,6 +1480,7 @@ input_edge (class lto_input_block *ib, vec<symtab_node *> nodes,
>    bp = streamer_read_bitpack (ib);
>    inline_failed = bp_unpack_enum (&bp, cgraph_inline_failed_t, CIF_N_REASONS);
>    stmt_id = bp_unpack_var_len_unsigned (&bp);
> +  speculative_id = bp_unpack_var_len_unsigned (&bp);
>  
>    if (indirect)
>      edge = caller->create_indirect_edge (NULL, 0, count);
> @@ -1466,6 +1490,7 @@ input_edge (class lto_input_block *ib, vec<symtab_node *> nodes,
>    edge->indirect_inlining_edge = bp_unpack_value (&bp, 1);
>    edge->speculative = bp_unpack_value (&bp, 1);
>    edge->lto_stmt_uid = stmt_id;
> +  edge->speculative_id = speculative_id;
>    edge->inline_failed = inline_failed;
>    edge->call_stmt_cannot_inline_p = bp_unpack_value (&bp, 1);
>    edge->can_throw_external = bp_unpack_value (&bp, 1);
> @@ -1485,9 +1510,22 @@ input_edge (class lto_input_block *ib, vec<symtab_node *> nodes,
>        if (bp_unpack_value (&bp, 1))
>  	ecf_flags |= ECF_RETURNS_TWICE;
>        edge->indirect_info->ecf_flags = ecf_flags;
> -      edge->indirect_info->common_target_id = streamer_read_hwi (ib);
> -      if (edge->indirect_info->common_target_id)
> -        edge->indirect_info->common_target_probability = streamer_read_hwi (ib);
> +
> +      len = streamer_read_hwi (ib);
> +      edge->indirect_info->num_of_ics = len;
> +
> +      gcc_assert (len <= GCOV_TOPN_VALUES);
> +
> +      if (len)
> +	{
> +	  for (i = 0; i < len; i++)
> +	    {
> +	      indirect_target_info item;

I would also suggest a constructor for indirect_target_info. You initialize it
multiple times.

> +	      item.common_target_id = streamer_read_hwi (ib);
> +	      item.common_target_probability = streamer_read_hwi (ib);
> +	      vec_safe_push (edge->indirect_info->indirect_call_targets, item);
> +	    }
> +	}
>      }
>  }
>  
> diff --git a/gcc/predict.c b/gcc/predict.c
> index 07f66aab7a3..67a859ea436 100644
> --- a/gcc/predict.c
> +++ b/gcc/predict.c
> @@ -763,7 +763,6 @@ dump_prediction (FILE *file, enum br_predictor predictor, int probability,
>        && bb->count.precise_p ()
>        && reason == REASON_NONE)
>      {
> -      gcc_assert (e->count ().precise_p ());
>        fprintf (file, ";;heuristics;%s;%" PRId64 ";%" PRId64 ";%.1f;\n",
>  	       predictor_info[predictor].name,
>  	       bb->count.to_gcov_type (), e->count ().to_gcov_type (),
> diff --git a/gcc/symtab.c b/gcc/symtab.c
> index b1589ea8a40..90a1f62d6a9 100644
> --- a/gcc/symtab.c
> +++ b/gcc/symtab.c
> @@ -603,6 +603,7 @@ symtab_node::create_reference (symtab_node *referred_node,
>    ref->referred = referred_node;
>    ref->stmt = stmt;
>    ref->lto_stmt_uid = 0;
> +  ref->speculative_id = 0;
>    ref->use = use_type;
>    ref->speculative = 0;
>  
> @@ -660,6 +661,7 @@ symtab_node::clone_references (symtab_node *node)
>        ref2 = create_reference (ref->referred, ref->use, ref->stmt);
>        ref2->speculative = speculative;
>        ref2->lto_stmt_uid = stmt_uid;
> +      ref2->speculative_id = ref->speculative_id;
>      }
>  }
>  
> @@ -678,6 +680,7 @@ symtab_node::clone_referring (symtab_node *node)
>        ref2 = ref->referring->create_reference (this, ref->use, ref->stmt);
>        ref2->speculative = speculative;
>        ref2->lto_stmt_uid = stmt_uid;
> +      ref2->speculative_id = ref->speculative_id;
>      }
>  }
>  
> @@ -693,6 +696,7 @@ symtab_node::clone_reference (ipa_ref *ref, gimple *stmt)
>    ref2 = create_reference (ref->referred, ref->use, stmt);
>    ref2->speculative = speculative;
>    ref2->lto_stmt_uid = stmt_uid;
> +  ref2->speculative_id = ref->speculative_id;
>    return ref2;
>  }
>  
> @@ -747,6 +751,7 @@ symtab_node::clear_stmts_in_references (void)
>        {
>  	r->stmt = NULL;
>  	r->lto_stmt_uid = 0;
> +	r->speculative_id = 0;
>        }
>  }
>  
> diff --git a/gcc/testsuite/gcc.dg/tree-prof/crossmodule-indir-call-topn-1.c b/gcc/testsuite/gcc.dg/tree-prof/crossmodule-indir-call-topn-1.c
> new file mode 100644
> index 00000000000..e0a83c2e067
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/tree-prof/crossmodule-indir-call-topn-1.c
> @@ -0,0 +1,35 @@
> +/* { dg-require-effective-target lto } */
> +/* { dg-additional-sources "crossmodule-indir-call-topn-1a.c" } */
> +/* { dg-require-profiling "-fprofile-generate" } */
> +/* { dg-options "-O2 -flto -DDOJOB=1 -fdump-ipa-profile_estimate --param indir-call-topn-profile=1" } */
> +
> +#include <stdio.h>
> +
> +typedef int (*fptr) (int);
> +int
> +one (int a);
> +
> +int
> +two (int a);
> +
> +fptr table[] = {&one, &two};
> +
> +int
> +main()
> +{
> +  int i, x;
> +  fptr p = &one;
> +
> +  x = one (3);
> +
> +  for (i = 0; i < 350000000; i++)
> +    {
> +      x = (*p) (3);
> +      p = table[x];
> +    }
> +  printf ("done:%d\n", x);
> +}
> +
> +/* { dg-final-use-not-autofdo { scan-wpa-ipa-dump "Indirect call -> direct call.* one transformation on insn" "profile_estimate" } } */
> +/* { dg-final-use-not-autofdo { scan-wpa-ipa-dump "Indirect call -> direct call.* two transformation on insn" "profile_estimate" } } */
> +
> diff --git a/gcc/testsuite/gcc.dg/tree-prof/crossmodule-indir-call-topn-1a.c b/gcc/testsuite/gcc.dg/tree-prof/crossmodule-indir-call-topn-1a.c
> new file mode 100644
> index 00000000000..a8c6e365fb9
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/tree-prof/crossmodule-indir-call-topn-1a.c
> @@ -0,0 +1,22 @@
> +/* It seems there is no way to avoid the other source of mulitple
> +   source testcase from being compiled independently.  Just avoid
> +   error.  */
> +#ifdef DOJOB
> +int
> +one (int a)
> +{
> +  return 1;
> +}
> +
> +int
> +two (int a)
> +{
> +  return 0;
> +}
> +#else
> +int
> +main()
> +{
> +  return 0;
> +}
> +#endif
> diff --git a/gcc/testsuite/gcc.dg/tree-prof/crossmodule-indir-call-topn-2.c b/gcc/testsuite/gcc.dg/tree-prof/crossmodule-indir-call-topn-2.c
> new file mode 100644
> index 00000000000..aa3887fde83
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/tree-prof/crossmodule-indir-call-topn-2.c
> @@ -0,0 +1,42 @@
> +/* { dg-require-effective-target lto } */
> +/* { dg-additional-sources "crossmodule-indir-call-topn-1a.c" } */
> +/* { dg-require-profiling "-fprofile-generate" } */
> +/* { dg-options "-O2 -flto -DDOJOB=1 -fdump-ipa-profile_estimate --param indir-call-topn-profile=1" } */
> +
> +#include <stdio.h>
> +
> +typedef int (*fptr) (int);
> +int
> +one (int a);
> +
> +int
> +two (int a);
> +
> +fptr table[] = {&one, &two};
> +
> +int foo ()
> +{
> +  int i, x;
> +  fptr p = &one;
> +
> +  x = one (3);
> +
> +  for (i = 0; i < 350000000; i++)
> +    {
> +      x = (*p) (3);
> +      p = table[x];
> +    }
> +  return x;
> +}
> +
> +int
> +main()
> +{
> +  int x = foo ();
> +  printf ("done:%d\n", x);
> +}
> +
> +/* { dg-final-use-not-autofdo { scan-wpa-ipa-dump "Indirect call -> direct call.* one transformation on insn" "profile_estimate" } } */
> +/* { dg-final-use-not-autofdo { scan-wpa-ipa-dump "Indirect call -> direct call.* two transformation on insn" "profile_estimate" } } */
> +
> +
> diff --git a/gcc/testsuite/gcc.dg/tree-prof/indir-call-prof-topn.c b/gcc/testsuite/gcc.dg/tree-prof/indir-call-prof-topn.c
> new file mode 100644
> index 00000000000..951bc7ddd19
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/tree-prof/indir-call-prof-topn.c
> @@ -0,0 +1,38 @@
> +/* { dg-require-profiling "-fprofile-generate" } */
> +/* { dg-options "-O2 -fdump-ipa-profile --param indir-call-topn-profile=1" } */
> +
> +#include <stdio.h>
> +
> +typedef int (*fptr) (int);
> +int
> +one (int a)
> +{
> +  return 1;
> +}
> +
> +int
> +two (int a)
> +{
> +  return 0;
> +}
> +
> +fptr table[] = {&one, &two};
> +
> +int
> +main()
> +{
> +  int i, x;
> +  fptr p = &one;
> +
> +  one (3);
> +
> +  for (i = 0; i < 350000000; i++)
> +    {
> +      x = (*p) (3);
> +      p = table[x];
> +    }
> +  printf ("done:%d\n", x);
> +}
> +
> +/* { dg-final-use-not-autofdo { scan-ipa-dump "Indirect call -> direct call.* one transformation on insn" "profile" } } */
> +/* { dg-final-use-not-autofdo { scan-ipa-dump "Indirect call -> direct call.* two transformation on insn" "profile" } } */
> diff --git a/gcc/tree-inline.c b/gcc/tree-inline.c
> index 4311309acce..93d9269aa8f 100644
> --- a/gcc/tree-inline.c
> +++ b/gcc/tree-inline.c
> @@ -2167,6 +2167,26 @@ copy_bb (copy_body_data *id, basic_block bb,
>  
>  			  gcc_assert (!edge->indirect_unknown_callee);
>  			  old_edge->speculative_call_info (direct, indirect, ref);
> +			  while (old_edge->next_callee
> +				 && old_edge->next_callee->speculative
> +				 && indirect->indirect_info
> +				 && indirect->indirect_info->num_of_ics > 1)

Maybe indirect->indirect_info && indirect->indirect_info->num_of_ics can be
put into a function indirect->has_indirect_call_p () ?

> +			    {
> +			      /* Some speculative calls may contain more than
> +				 one direct target, loop iterate it to clone all
> +				 related direct edges before cloning the related
> +				 indirect edge.  */
> +			      id->dst_node->clone_reference (ref, stmt);
> +
> +			      edge = old_edge->next_callee;
> +			      edge = edge->clone (id->dst_node, call_stmt,
> +						  gimple_uid (stmt), num, den,
> +						  true);
> +			      old_edge = old_edge->next_callee;
> +			      gcc_assert (!edge->indirect_unknown_callee);
> +			      old_edge->speculative_call_info (direct, indirect,
> +							       ref);
> +			    }
>  
>  			  profile_count indir_cnt = indirect->count;
>  			  indirect = indirect->clone (id->dst_node, call_stmt,
> diff --git a/gcc/tree-profile.c b/gcc/tree-profile.c
> index 554a8c98419..f35bd3f9d7a 100644
> --- a/gcc/tree-profile.c
> +++ b/gcc/tree-profile.c
> @@ -74,8 +74,8 @@ static GTY(()) tree ic_tuple_callee_field;
>  /* Do initialization work for the edge profiler.  */
>  
>  /* Add code:
> -   __thread gcov*	__gcov_indirect_call_counters; // pointer to actual counter
> -   __thread void*	__gcov_indirect_call_callee; // actual callee address
> +   __thread gcov*	__gcov_indirect_call.counters; // pointer to actual counter
> +   __thread void*	__gcov_indirect_call.callee; // actual callee address
>     __thread int __gcov_function_counter; // time profiler function counter
>  */
>  static void
> @@ -384,7 +384,7 @@ gimple_gen_ic_profiler (histogram_value value, unsigned tag, unsigned base)
>        f_1 = foo;
>        __gcov_indirect_call.counters = &__gcov4.main[0];
>        PROF_9 = f_1;
> -      __gcov_indirect_call_callee = PROF_9;
> +      __gcov_indirect_call.callee = PROF_9;
>        _4 = f_1 ();
>     */
>  
> @@ -447,11 +447,11 @@ gimple_gen_ic_func_profiler (void)
>  
>    /* Insert code:
>  
> -     if (__gcov_indirect_call_callee != NULL)
> +     if (__gcov_indirect_call.callee != NULL)
>         __gcov_indirect_call_profiler_v3 (profile_id, &current_function_decl);
>  
>       The function __gcov_indirect_call_profiler_v3 is responsible for
> -     resetting __gcov_indirect_call_callee to NULL.  */
> +     resetting __gcov_indirect_call.callee to NULL.  */
>  
>    gimple_stmt_iterator gsi = gsi_start_bb (cond_bb);
>    void0 = build_int_cst (ptr_type_node, 0);
> @@ -893,7 +893,7 @@ pass_ipa_tree_profile::gate (function *)
>  {
>    /* When profile instrumentation, use or test coverage shall be performed.
>       But for AutoFDO, this there is no instrumentation, thus this pass is
> -     diabled.  */
> +     disabled.  */
>    return (!in_lto_p && !flag_auto_profile
>  	  && (flag_branch_probabilities || flag_test_coverage
>  	      || profile_arc_flag));
> diff --git a/gcc/value-prof.c b/gcc/value-prof.c
> index 759458868a8..cbcc104bd5c 100644
> --- a/gcc/value-prof.c
> +++ b/gcc/value-prof.c

Please rebase changes in this file. I installed recently changes that collide with your patch.

Thanks,
Martin

> @@ -1404,11 +1404,10 @@ gimple_ic (gcall *icall_stmt, struct cgraph_node *direct_call,
>    return dcall_stmt;
>  }
>  
> -/*
> -  For every checked indirect/virtual call determine if most common pid of
> -  function/class method has probability more than 50%. If yes modify code of
> -  this call to:
> - */
> +/* There maybe multiple indirect targets in histogram.  Check every
> +   indirect/virtual call if callee function exists, if not exist, leave it to
> +   LTO stage for later process.  Modify code of this indirect call to an if-else
> +   structure in ipa-profile finally.  */
>  
>  static bool
>  gimple_ic_transform (gimple_stmt_iterator *gsi)
> @@ -1432,54 +1431,65 @@ gimple_ic_transform (gimple_stmt_iterator *gsi)
>    if (!histogram)
>      return false;
>  
> -  if (!get_nth_most_common_value (NULL, "indirect call", histogram, &val,
> -				  &count, &all))
> -    return false;
> +  count = 0;
> +  all = histogram->hvalue.counters[0];
>  
> -  if (4 * count <= 3 * all)
> -    return false;
> +  for (unsigned j = 0; j < GCOV_TOPN_VALUES; j++)
> +    {
> +      if (!get_nth_most_common_value (NULL, "indirect call", histogram, &val,
> +				      &count, &all, j))
> +	continue;
>  
> -  direct_call = find_func_by_profile_id ((int)val);
> +      if (4 * count <= all)
> +	continue;
>  
> -  if (direct_call == NULL)
> -    {
> -      if (val)
> +      direct_call = find_func_by_profile_id ((int) val);
> +
> +      if (direct_call == NULL)
> +	{
> +	  if (val)
> +	    {
> +	      if (dump_file)
> +		{
> +		  fprintf (dump_file,
> +			   "Indirect call -> direct call from other module");
> +		  print_generic_expr (dump_file, gimple_call_fn (stmt),
> +				      TDF_SLIM);
> +		  fprintf (dump_file, "=> %i (will resolve only with LTO)\n",
> +			   (int) val);
> +		}
> +	    }
> +	  continue;
> +	}
> +
> +      if (!check_ic_target (stmt, direct_call))
>  	{
>  	  if (dump_file)
>  	    {
> -	      fprintf (dump_file, "Indirect call -> direct call from other module");
> +	      fprintf (dump_file, "Indirect call -> direct call ");
>  	      print_generic_expr (dump_file, gimple_call_fn (stmt), TDF_SLIM);
> -	      fprintf (dump_file, "=> %i (will resolve only with LTO)\n", (int)val);
> +	      fprintf (dump_file, "=> ");
> +	      print_generic_expr (dump_file, direct_call->decl, TDF_SLIM);
> +	      fprintf (dump_file,
> +		       " transformation skipped because of type mismatch");
> +	      print_gimple_stmt (dump_file, stmt, 0, TDF_SLIM);
>  	    }
> +	  gimple_remove_histogram_value (cfun, stmt, histogram);
> +	  return false;
>  	}
> -      return false;
> -    }
>  
> -  if (!check_ic_target (stmt, direct_call))
> -    {
>        if (dump_file)
>  	{
>  	  fprintf (dump_file, "Indirect call -> direct call ");
>  	  print_generic_expr (dump_file, gimple_call_fn (stmt), TDF_SLIM);
>  	  fprintf (dump_file, "=> ");
>  	  print_generic_expr (dump_file, direct_call->decl, TDF_SLIM);
> -	  fprintf (dump_file, " transformation skipped because of type mismatch");
> +	  fprintf (dump_file,
> +		   " transformation on insn postponed to ipa-profile");
>  	  print_gimple_stmt (dump_file, stmt, 0, TDF_SLIM);
> +	  fprintf (dump_file, "hist->count %" PRId64 " hist->all %" PRId64 "\n",
> +		   count, all);
>  	}
> -      gimple_remove_histogram_value (cfun, stmt, histogram);
> -      return false;
> -    }
> -
> -  if (dump_file)
> -    {
> -      fprintf (dump_file, "Indirect call -> direct call ");
> -      print_generic_expr (dump_file, gimple_call_fn (stmt), TDF_SLIM);
> -      fprintf (dump_file, "=> ");
> -      print_generic_expr (dump_file, direct_call->decl, TDF_SLIM);
> -      fprintf (dump_file, " transformation on insn postponned to ipa-profile");
> -      print_gimple_stmt (dump_file, stmt, 0, TDF_SLIM);
> -      fprintf (dump_file, "hist->count %" PRId64
> -	       " hist->all %" PRId64"\n", count, all);
>      }
>  
>    return true;
>
Martin Liška Aug. 13, 2019, 1:16 p.m. UTC | #2
And I would also suggest to come up with parameter that will
drive minimum probability, maximum number of promotions and
maybe minimal number of edge executions to consider a speculation.

Clang provides all these:
https://github.com/microsoft/llvm-1/blob/master/lib/Analysis/IndirectCallPromotionAnalysis.cpp#L37

Martin
diff mbox series

Patch

diff --git a/gcc/cgraph.c b/gcc/cgraph.c
index a7e3bcf2132..d7be9e9c545 100644
--- a/gcc/cgraph.c
+++ b/gcc/cgraph.c
@@ -869,6 +869,7 @@  symbol_table::create_edge (cgraph_node *caller, cgraph_node *callee,
   edge->prev_callee = NULL;
   edge->next_callee = NULL;
   edge->lto_stmt_uid = 0;
+  edge->speculative_id = 0;
 
   edge->count = count;
 
@@ -1063,7 +1064,8 @@  cgraph_edge::remove (void)
    Return direct edge created.  */
 
 cgraph_edge *
-cgraph_edge::make_speculative (cgraph_node *n2, profile_count direct_count)
+cgraph_edge::make_speculative (cgraph_node *n2, profile_count direct_count,
+			       unsigned int speculative_id)
 {
   cgraph_node *n = caller;
   ipa_ref *ref = NULL;
@@ -1081,11 +1083,13 @@  cgraph_edge::make_speculative (cgraph_node *n2, profile_count direct_count)
   else
     e2->can_throw_external = can_throw_external;
   e2->lto_stmt_uid = lto_stmt_uid;
+  e2->speculative_id = speculative_id;
   e2->in_polymorphic_cdtor = in_polymorphic_cdtor;
   count -= e2->count;
   symtab->call_edge_duplication_hooks (this, e2);
   ref = n->create_reference (n2, IPA_REF_ADDR, call_stmt);
   ref->lto_stmt_uid = lto_stmt_uid;
+  ref->speculative_id = speculative_id;
   ref->speculative = speculative;
   n2->mark_address_taken ();
   return e2;
@@ -1099,6 +1103,38 @@  cgraph_edge::make_speculative (cgraph_node *n2, profile_count direct_count)
    call) and if one of them exists, all of them must exist.
 
    Given speculative call edge, return all three components.
+
+   For some indirect edge, it may maps to multiple direct edges, i.e. 1:N.
+   check the speculative_id to return all the three components for specified
+   direct edge or indirect edge.
+   If input is indirect, caller of this function will get the direct edge one by
+   one, get_edge will just return one of the direct edge mapped to the indirect
+   edge, the returned direct edge will be resolved or redirected by the caller,
+   then num_of_ics (speculations) is deceased in each access.
+   If input is direct, this function will get the indirect edge and reference
+   with matched speculative_id, the returned edge will also be resolved or
+   redirected, decrease the speculations accordingly.
+   Speculations of indirect edge will be dropped only if all direct edges
+   be handled.
+
+   e.g. for indirect edge E statement "call call_dest":
+
+   Redirect N3 after redirected N2:
+
+   if (call_dest == N2)
+     n2 ();
+   else if (call_dest == N3)
+     n3 ();
+   else
+     call call_dest
+
+   Resolve N3 and only redirect N2:
+
+   if (call_dest == N2)
+     n2 ();
+   else
+     call call_dest
+
  */
 
 void
@@ -1138,7 +1174,7 @@  cgraph_edge::speculative_call_info (cgraph_edge *&direct,
 
   reference = NULL;
   for (i = 0; e->caller->iterate_reference (i, ref); i++)
-    if (ref->speculative
+    if (ref->speculative && ref->speculative_id == e->speculative_id
 	&& ((ref->stmt && ref->stmt == e->call_stmt)
 	    || (!ref->stmt && ref->lto_stmt_uid == e->lto_stmt_uid)))
       {
@@ -1199,7 +1235,19 @@  cgraph_edge::resolve_speculation (tree callee_decl)
          in the functions inlined through it.  */
     }
   edge->count += e2->count;
-  edge->speculative = false;
+  /* edge is indirect, e2 is direct.  If edge contains multiple speculations,
+     remove one of speculations for this indirect edge, then if edge still
+     contains direct target, keep the speculation, next direct target
+     will continue use it.  Give up speculation completely if no direct
+     target is left for this indirect edge.  */
+  if (edge->indirect_info && edge->indirect_info->num_of_ics)
+    {
+      edge->indirect_info->num_of_ics--;
+      if (edge->indirect_info->num_of_ics == 0)
+	edge->speculative = false;
+    }
+  else
+    edge->speculative = false;
   e2->speculative = false;
   ref->remove_reference ();
   if (e2->indirect_unknown_callee || e2->inline_failed)
@@ -1333,7 +1381,19 @@  cgraph_edge::redirect_call_stmt_to_callee (void)
 	  e->caller->set_call_stmt_including_clones (e->call_stmt, new_stmt,
 						     false);
 	  e->count = gimple_bb (e->call_stmt)->count;
-	  e2->speculative = false;
+	  /* edge is direct, e2 is indirect here.  If e2 contains multiple
+	     speculations, remove one of speculations for this indirect edge,
+	     then if e2 still contains direct target, keep the speculation,
+	     next direct target will continue use it.  Give up speculation
+	     completely if no direct target is left for this indirect e2.  */
+	  if (e2->indirect_info && e2->indirect_info->num_of_ics)
+	    {
+	      e2->indirect_info->num_of_ics--;
+	      if (e2->indirect_info->num_of_ics == 0)
+		e2->speculative = false;
+	    }
+	  else
+	    e2->speculative = false;
 	  e2->count = gimple_bb (e2->call_stmt)->count;
 	  ref->speculative = false;
 	  ref->stmt = NULL;
@@ -3434,7 +3494,7 @@  cgraph_node::verify_node (void)
 
       for (e = callees; e; e = e->next_callee)
 	{
-	  if (!e->aux)
+	  if (!e->aux && !e->speculative)
 	    {
 	      error ("edge %s->%s has no corresponding call_stmt",
 		     identifier_to_locale (e->caller->name ()),
diff --git a/gcc/cgraph.h b/gcc/cgraph.h
index fa5224fb3a5..0b453c3534c 100644
--- a/gcc/cgraph.h
+++ b/gcc/cgraph.h
@@ -1630,6 +1630,16 @@  private:
   void make_speculative (tree otr_type = NULL);
 };
 
+/* Structure containing indirect target information from profile.  */
+
+struct GTY (()) indirect_target_info
+{
+  /* Profile_id of common target obtained from profile.  */
+  unsigned int common_target_id;
+  /* Probability that call will land in function with COMMON_TARGET_ID.  */
+  int common_target_probability;
+};
+
 /* Structure containing additional information about an indirect call.  */
 
 class GTY(()) cgraph_indirect_call_info
@@ -1648,10 +1658,14 @@  public:
   int param_index;
   /* ECF flags determined from the caller.  */
   int ecf_flags;
-  /* Profile_id of common target obtrained from profile.  */
-  int common_target_id;
-  /* Probability that call will land in function with COMMON_TARGET_ID.  */
-  int common_target_probability;
+
+  /* An indirect call may contain one or multiple call targets.  */
+  vec<indirect_target_info, va_gc> *indirect_call_targets;
+
+  /* Number of indirect targets (speculations) for the indirect call.
+     Increased by ipa-profile, decreased by resolve_speculation or redirect
+     statement to callee.  */
+  unsigned num_of_ics;
 
   /* Set when the call is a virtual call with the parameter being the
      associated object pointer rather than a simple direct call.  */
@@ -1708,7 +1722,8 @@  public:
   /* Turn edge into speculative call calling N2. Update
      the profile so the direct call is taken COUNT times
      with FREQUENCY.  */
-  cgraph_edge *make_speculative (cgraph_node *n2, profile_count direct_count);
+  cgraph_edge *make_speculative (cgraph_node *n2, profile_count direct_count,
+				 unsigned int speculative_id = 0);
 
    /* Given speculative call edge, return all three components.  */
   void speculative_call_info (cgraph_edge *&direct, cgraph_edge *&indirect,
@@ -1786,6 +1801,9 @@  public:
   /* The stmt_uid of call_stmt.  This is used by LTO to recover the call_stmt
      when the function is serialized in.  */
   unsigned int lto_stmt_uid;
+  /* speculative id is used by multiple indirect targets when the function is
+   speculated.  */
+  unsigned int speculative_id;
   /* Whether this edge was made direct by indirect inlining.  */
   unsigned int indirect_inlining_edge : 1;
   /* Whether this edge describes an indirect call with an undetermined
diff --git a/gcc/cgraphclones.c b/gcc/cgraphclones.c
index fd867ecac91..ab6b9dd86c8 100644
--- a/gcc/cgraphclones.c
+++ b/gcc/cgraphclones.c
@@ -128,6 +128,7 @@  cgraph_edge::clone (cgraph_node *n, gcall *call_stmt, unsigned stmt_uid,
   new_edge->inline_failed = inline_failed;
   new_edge->indirect_inlining_edge = indirect_inlining_edge;
   new_edge->lto_stmt_uid = stmt_uid;
+  new_edge->speculative_id = speculative_id;
   /* Clone flags that depend on call_stmt availability manually.  */
   new_edge->can_throw_external = can_throw_external;
   new_edge->call_stmt_cannot_inline_p = call_stmt_cannot_inline_p;
diff --git a/gcc/ipa-inline.c b/gcc/ipa-inline.c
index 939d86ef94a..f5654198877 100644
--- a/gcc/ipa-inline.c
+++ b/gcc/ipa-inline.c
@@ -1877,12 +1877,15 @@  inline_small_functions (void)
 	}
       if (has_speculative)
 	for (edge = node->callees; edge; edge = next)
-	  if (edge->speculative && !speculation_useful_p (edge,
-							  edge->aux != NULL))
-	    {
-	      edge->resolve_speculation ();
-	      update = true;
-	    }
+	  {
+	    next = edge->next_callee;
+	    if (edge->speculative
+		&& !speculation_useful_p (edge, edge->aux != NULL))
+	      {
+		edge->resolve_speculation ();
+		update = true;
+	      }
+	  }
       if (update)
 	{
 	  struct cgraph_node *where = node->global.inlined_to
diff --git a/gcc/ipa-profile.c b/gcc/ipa-profile.c
index 970dba39c80..26214eba27e 100644
--- a/gcc/ipa-profile.c
+++ b/gcc/ipa-profile.c
@@ -192,23 +192,39 @@  ipa_profile_generate_summary (void)
 		  if (h)
 		    {
 		      gcov_type val, count, all;
-		      if (get_nth_most_common_value (NULL, "indirect call", h,
-						     &val, &count, &all))
+		      struct cgraph_edge *e = node->get_edge (stmt);
+		      if (e && !e->indirect_unknown_callee)
+			continue;
+
+		      e->indirect_info->num_of_ics = 0;
+		      for (unsigned j = 0; j < GCOV_TOPN_VALUES; j++)
 			{
-			  struct cgraph_edge * e = node->get_edge (stmt);
-			  if (e && !e->indirect_unknown_callee)
+			  if (!get_nth_most_common_value (NULL, "indirect call",
+							  h, &val, &count, &all,
+							  j))
+			    continue;
+
+			  if (val == 0)
 			    continue;
 
-			  e->indirect_info->common_target_id = val;
-			  e->indirect_info->common_target_probability
+			  struct indirect_target_info item;
+			  item.common_target_id = val;
+			  item.common_target_probability
 			    = GCOV_COMPUTE_SCALE (count, all);
-			  if (e->indirect_info->common_target_probability > REG_BR_PROB_BASE)
+			  if (item.common_target_probability > REG_BR_PROB_BASE)
 			    {
 			      if (dump_file)
-				fprintf (dump_file, "Probability capped to 1\n");
-			      e->indirect_info->common_target_probability = REG_BR_PROB_BASE;
+				fprintf (dump_file,
+					 "Probability capped to 1\n");
+			      item.common_target_probability = REG_BR_PROB_BASE;
 			    }
+			  vec_safe_push (
+			    e->indirect_info->indirect_call_targets, item);
+			  e->indirect_info->num_of_ics++;
 			}
+		      gcc_assert (e->indirect_info->num_of_ics
+				  <= GCOV_TOPN_VALUES);
+
 		      gimple_remove_histogram_value (DECL_STRUCT_FUNCTION (node->decl),
 						      stmt, h);
 		    }
@@ -492,6 +508,7 @@  ipa_profile (void)
   int nindirect = 0, ncommon = 0, nunknown = 0, nuseless = 0, nconverted = 0;
   int nmismatch = 0, nimpossible = 0;
   bool node_map_initialized = false;
+  gcov_type threshold;
 
   if (dump_file)
     dump_histogram (dump_file, histogram);
@@ -500,14 +517,12 @@  ipa_profile (void)
       overall_time += histogram[i]->count * histogram[i]->time;
       overall_size += histogram[i]->size;
     }
+  threshold = 0;
   if (overall_time)
     {
-      gcov_type threshold;
-
       gcc_assert (overall_size);
 
       cutoff = (overall_time * PARAM_VALUE (HOT_BB_COUNT_WS_PERMILLE) + 500) / 1000;
-      threshold = 0;
       for (i = 0; cumulated < cutoff; i++)
 	{
 	  cumulated += histogram[i]->count * histogram[i]->time;
@@ -543,7 +558,7 @@  ipa_profile (void)
   histogram.release ();
   histogram_pool.release ();
 
-  /* Produce speculative calls: we saved common traget from porfiling into
+  /* Produce speculative calls: we saved common target from profiling into
      e->common_target_id.  Now, at link time, we can look up corresponding
      function node and produce speculative call.  */
 
@@ -558,104 +573,133 @@  ipa_profile (void)
 	{
 	  if (n->count.initialized_p ())
 	    nindirect++;
-	  if (e->indirect_info->common_target_id)
+	  if (e->indirect_info && e->indirect_info->num_of_ics)
 	    {
-	      if (!node_map_initialized)
-	        init_node_map (false);
-	      node_map_initialized = true;
-	      ncommon++;
-	      n2 = find_func_by_profile_id (e->indirect_info->common_target_id);
-	      if (n2)
+	      if (in_lto_p)
 		{
 		  if (dump_file)
 		    {
-		      fprintf (dump_file, "Indirect call -> direct call from"
-			       " other module %s => %s, prob %3.2f\n",
-			       n->dump_name (),
-			       n2->dump_name (),
-			       e->indirect_info->common_target_probability
-			       / (float)REG_BR_PROB_BASE);
-		    }
-		  if (e->indirect_info->common_target_probability
-		      < REG_BR_PROB_BASE / 2)
-		    {
-		      nuseless++;
-		      if (dump_file)
-			fprintf (dump_file,
-				 "Not speculating: probability is too low.\n");
-		    }
-		  else if (!e->maybe_hot_p ())
-		    {
-		      nuseless++;
-		      if (dump_file)
-			fprintf (dump_file,
-				 "Not speculating: call is cold.\n");
-		    }
-		  else if (n2->get_availability () <= AVAIL_INTERPOSABLE
-			   && n2->can_be_discarded_p ())
-		    {
-		      nuseless++;
-		      if (dump_file)
-			fprintf (dump_file,
-				 "Not speculating: target is overwritable "
-				 "and can be discarded.\n");
+		      fprintf (dump_file,
+			       "Updating hotness threshold in LTO mode.\n");
+		      fprintf (dump_file, "Updated min count: %" PRId64 "\n",
+			       (int64_t) threshold);
 		    }
-		  else if (ipa_node_params_sum && ipa_edge_args_sum
-			   && (!vec_safe_is_empty
-			       (IPA_NODE_REF (n2)->descriptors))
-			   && ipa_get_param_count (IPA_NODE_REF (n2))
-			      != ipa_get_cs_argument_count (IPA_EDGE_REF (e))
-			    && (ipa_get_param_count (IPA_NODE_REF (n2))
-				>= ipa_get_cs_argument_count (IPA_EDGE_REF (e))
-				|| !stdarg_p (TREE_TYPE (n2->decl))))
+		  set_hot_bb_threshold (threshold
+					/ e->indirect_info->num_of_ics);
+		}
+	      if (!node_map_initialized)
+		init_node_map (false);
+	      node_map_initialized = true;
+	      ncommon++;
+	      unsigned speculative_id = 0;
+	      struct indirect_target_info *item;
+	      FOR_EACH_VEC_SAFE_ELT (e->indirect_info->indirect_call_targets, i,
+				     item)
+		{
+		  if ((unsigned) i > e->indirect_info->num_of_ics)
+		    break;
+		  n2 = find_func_by_profile_id (item->common_target_id);
+		  if (n2)
 		    {
-		      nmismatch++;
 		      if (dump_file)
-			fprintf (dump_file,
-				 "Not speculating: "
-				 "parameter count mistmatch\n");
+			{
+			  fprintf (dump_file,
+				   "Indirect call -> direct call from"
+				   " other module %s => %s, prob %3.2f\n",
+				   n->dump_name (), n2->dump_name (),
+				   item->common_target_probability
+				     / (float) REG_BR_PROB_BASE);
+			}
+		      if (item->common_target_probability
+			  < REG_BR_PROB_BASE / 2)
+			{
+			  nuseless++;
+			  if (dump_file)
+			    fprintf (
+			      dump_file,
+			      "Not speculating: probability is too low.\n");
+			}
+		      else if (!e->maybe_hot_p ())
+			{
+			  nuseless++;
+			  if (dump_file)
+			    fprintf (dump_file,
+				     "Not speculating: call is cold.\n");
+			}
+		      else if (n2->get_availability () <= AVAIL_INTERPOSABLE
+			       && n2->can_be_discarded_p ())
+			{
+			  nuseless++;
+			  if (dump_file)
+			    fprintf (dump_file,
+				     "Not speculating: target is overwritable "
+				     "and can be discarded.\n");
+			}
+		      else if (ipa_node_params_sum && ipa_edge_args_sum
+			       && (!vec_safe_is_empty (
+				 IPA_NODE_REF (n2)->descriptors))
+			       && ipa_get_param_count (IPA_NODE_REF (n2))
+				    != ipa_get_cs_argument_count (
+				      IPA_EDGE_REF (e))
+			       && (ipa_get_param_count (IPA_NODE_REF (n2))
+				     >= ipa_get_cs_argument_count (
+				       IPA_EDGE_REF (e))
+				   || !stdarg_p (TREE_TYPE (n2->decl))))
+			{
+			  nmismatch++;
+			  if (dump_file)
+			    fprintf (dump_file, "Not speculating: "
+						"parameter count mismatch\n");
+			}
+		      else if (e->indirect_info->polymorphic
+			       && !opt_for_fn (n->decl, flag_devirtualize)
+			       && !possible_polymorphic_call_target_p (e, n2))
+			{
+			  nimpossible++;
+			  if (dump_file)
+			    fprintf (dump_file,
+				     "Not speculating: "
+				     "function is not in the polymorphic "
+				     "call target list\n");
+			}
+		      else
+			{
+			  /* Target may be overwritable, but profile says that
+			     control flow goes to this particular implementation
+			     of N2.  Speculate on the local alias to allow
+			     inlining.
+			     */
+			  if (!n2->can_be_discarded_p ())
+			    {
+			      cgraph_node *alias;
+			      alias = dyn_cast<cgraph_node *> (
+				n2->noninterposable_alias ());
+			      if (alias)
+				n2 = alias;
+			    }
+			  nconverted++;
+			  e->make_speculative (
+			    n2,
+			    e->count.apply_probability (
+			      item->common_target_probability),
+			    speculative_id);
+			  update = true;
+			  speculative_id++;
+			}
 		    }
-		  else if (e->indirect_info->polymorphic
-			   && !opt_for_fn (n->decl, flag_devirtualize)
-			   && !possible_polymorphic_call_target_p (e, n2))
+		  else
 		    {
-		      nimpossible++;
 		      if (dump_file)
 			fprintf (dump_file,
-				 "Not speculating: "
-				 "function is not in the polymorphic "
-				 "call target list\n");
+				 "Function with profile-id %i not found.\n",
+				 item->common_target_id);
+		      nunknown++;
 		    }
-		  else
-		    {
-		      /* Target may be overwritable, but profile says that
-			 control flow goes to this particular implementation
-			 of N2.  Speculate on the local alias to allow inlining.
-		       */
-		      if (!n2->can_be_discarded_p ())
-			{
-			  cgraph_node *alias;
-			  alias = dyn_cast<cgraph_node *> (n2->noninterposable_alias ());
-			  if (alias)
-			    n2 = alias;
-			}
-		      nconverted++;
-		      e->make_speculative
-			(n2,
-			 e->count.apply_probability
-				     (e->indirect_info->common_target_probability));
-		      update = true;
-		    }
-		}
-	      else
-		{
-		  if (dump_file)
-		    fprintf (dump_file, "Function with profile-id %i not found.\n",
-			     e->indirect_info->common_target_id);
-		  nunknown++;
 		}
+	      if (speculative_id < e->indirect_info->num_of_ics)
+		e->indirect_info->num_of_ics = speculative_id;
 	    }
-	 }
+	}
        if (update)
 	 ipa_update_overall_fn_summary (n);
      }
diff --git a/gcc/ipa-ref.h b/gcc/ipa-ref.h
index 0d8e509c932..3e6562ec9d1 100644
--- a/gcc/ipa-ref.h
+++ b/gcc/ipa-ref.h
@@ -59,6 +59,7 @@  public:
   symtab_node *referred;
   gimple *stmt;
   unsigned int lto_stmt_uid;
+  unsigned int speculative_id;
   unsigned int referred_index;
   ENUM_BITFIELD (ipa_ref_use) use:3;
   unsigned int speculative:1;
diff --git a/gcc/ipa.c b/gcc/ipa.c
index 2496694124c..c1fe081a72d 100644
--- a/gcc/ipa.c
+++ b/gcc/ipa.c
@@ -166,7 +166,7 @@  process_references (symtab_node *snode,
    devirtualization happens.  After inlining still keep their declarations
    around, so we can devirtualize to a direct call.
 
-   Also try to make trivial devirutalization when no or only one target is
+   Also try to make trivial devirtualization when no or only one target is
    possible.  */
 
 static void
diff --git a/gcc/lto-cgraph.c b/gcc/lto-cgraph.c
index bc0f0107333..1dc856d3c64 100644
--- a/gcc/lto-cgraph.c
+++ b/gcc/lto-cgraph.c
@@ -238,6 +238,7 @@  lto_output_edge (struct lto_simple_output_block *ob, struct cgraph_edge *edge,
   unsigned int uid;
   intptr_t ref;
   struct bitpack_d bp;
+  unsigned len;
 
   if (edge->indirect_unknown_callee)
     streamer_write_enum (ob->main_stream, LTO_symtab_tags, LTO_symtab_last_tag,
@@ -265,6 +266,7 @@  lto_output_edge (struct lto_simple_output_block *ob, struct cgraph_edge *edge,
   bp_pack_enum (&bp, cgraph_inline_failed_t,
 	        CIF_N_REASONS, edge->inline_failed);
   bp_pack_var_len_unsigned (&bp, uid);
+  bp_pack_var_len_unsigned (&bp, edge->speculative_id);
   bp_pack_value (&bp, edge->indirect_inlining_edge, 1);
   bp_pack_value (&bp, edge->speculative, 1);
   bp_pack_value (&bp, edge->call_stmt_cannot_inline_p, 1);
@@ -291,11 +293,28 @@  lto_output_edge (struct lto_simple_output_block *ob, struct cgraph_edge *edge,
   streamer_write_bitpack (&bp);
   if (edge->indirect_unknown_callee)
     {
+      struct indirect_target_info *item;
+      unsigned int i;
+      len = edge->indirect_info->num_of_ics;
+      gcc_assert (len <= GCOV_TOPN_VALUES);
+
       streamer_write_hwi_stream (ob->main_stream,
-			         edge->indirect_info->common_target_id);
-      if (edge->indirect_info->common_target_id)
-	streamer_write_hwi_stream
-	   (ob->main_stream, edge->indirect_info->common_target_probability);
+				 edge->indirect_info->num_of_ics);
+
+      if (len)
+	{
+	  FOR_EACH_VEC_SAFE_ELT (edge->indirect_info->indirect_call_targets, i,
+				 item)
+	    {
+	      if (i == edge->indirect_info->num_of_ics)
+		break;
+	      streamer_write_hwi_stream (ob->main_stream,
+					 item->common_target_id);
+	      if (item->common_target_id)
+		streamer_write_hwi_stream (ob->main_stream,
+					   item->common_target_probability);
+	    }
+	}
     }
 }
 
@@ -688,6 +707,7 @@  lto_output_ref (struct lto_simple_output_block *ob, struct ipa_ref *ref,
       if (ref->stmt)
 	uid = gimple_uid (ref->stmt) + 1;
       streamer_write_hwi_stream (ob->main_stream, uid);
+      streamer_write_hwi_stream (ob->main_stream, ref->speculative_id);
     }
 }
 
@@ -1420,6 +1440,8 @@  input_ref (class lto_input_block *ib,
   ref->speculative = speculative;
   if (is_a <cgraph_node *> (referring_node))
     ref->lto_stmt_uid = streamer_read_hwi (ib);
+  if (is_a <cgraph_node *> (referring_node))
+    ref->speculative_id = streamer_read_hwi (ib);
 }
 
 /* Read an edge from IB.  NODES points to a vector of previously read nodes for
@@ -1433,11 +1455,12 @@  input_edge (class lto_input_block *ib, vec<symtab_node *> nodes,
 {
   struct cgraph_node *caller, *callee;
   struct cgraph_edge *edge;
-  unsigned int stmt_id;
+  unsigned int stmt_id, speculative_id;
   profile_count count;
   cgraph_inline_failed_t inline_failed;
   struct bitpack_d bp;
   int ecf_flags = 0;
+  unsigned i, len;
 
   caller = dyn_cast<cgraph_node *> (nodes[streamer_read_hwi (ib)]);
   if (caller == NULL || caller->decl == NULL_TREE)
@@ -1457,6 +1480,7 @@  input_edge (class lto_input_block *ib, vec<symtab_node *> nodes,
   bp = streamer_read_bitpack (ib);
   inline_failed = bp_unpack_enum (&bp, cgraph_inline_failed_t, CIF_N_REASONS);
   stmt_id = bp_unpack_var_len_unsigned (&bp);
+  speculative_id = bp_unpack_var_len_unsigned (&bp);
 
   if (indirect)
     edge = caller->create_indirect_edge (NULL, 0, count);
@@ -1466,6 +1490,7 @@  input_edge (class lto_input_block *ib, vec<symtab_node *> nodes,
   edge->indirect_inlining_edge = bp_unpack_value (&bp, 1);
   edge->speculative = bp_unpack_value (&bp, 1);
   edge->lto_stmt_uid = stmt_id;
+  edge->speculative_id = speculative_id;
   edge->inline_failed = inline_failed;
   edge->call_stmt_cannot_inline_p = bp_unpack_value (&bp, 1);
   edge->can_throw_external = bp_unpack_value (&bp, 1);
@@ -1485,9 +1510,22 @@  input_edge (class lto_input_block *ib, vec<symtab_node *> nodes,
       if (bp_unpack_value (&bp, 1))
 	ecf_flags |= ECF_RETURNS_TWICE;
       edge->indirect_info->ecf_flags = ecf_flags;
-      edge->indirect_info->common_target_id = streamer_read_hwi (ib);
-      if (edge->indirect_info->common_target_id)
-        edge->indirect_info->common_target_probability = streamer_read_hwi (ib);
+
+      len = streamer_read_hwi (ib);
+      edge->indirect_info->num_of_ics = len;
+
+      gcc_assert (len <= GCOV_TOPN_VALUES);
+
+      if (len)
+	{
+	  for (i = 0; i < len; i++)
+	    {
+	      indirect_target_info item;
+	      item.common_target_id = streamer_read_hwi (ib);
+	      item.common_target_probability = streamer_read_hwi (ib);
+	      vec_safe_push (edge->indirect_info->indirect_call_targets, item);
+	    }
+	}
     }
 }
 
diff --git a/gcc/predict.c b/gcc/predict.c
index 07f66aab7a3..67a859ea436 100644
--- a/gcc/predict.c
+++ b/gcc/predict.c
@@ -763,7 +763,6 @@  dump_prediction (FILE *file, enum br_predictor predictor, int probability,
       && bb->count.precise_p ()
       && reason == REASON_NONE)
     {
-      gcc_assert (e->count ().precise_p ());
       fprintf (file, ";;heuristics;%s;%" PRId64 ";%" PRId64 ";%.1f;\n",
 	       predictor_info[predictor].name,
 	       bb->count.to_gcov_type (), e->count ().to_gcov_type (),
diff --git a/gcc/symtab.c b/gcc/symtab.c
index b1589ea8a40..90a1f62d6a9 100644
--- a/gcc/symtab.c
+++ b/gcc/symtab.c
@@ -603,6 +603,7 @@  symtab_node::create_reference (symtab_node *referred_node,
   ref->referred = referred_node;
   ref->stmt = stmt;
   ref->lto_stmt_uid = 0;
+  ref->speculative_id = 0;
   ref->use = use_type;
   ref->speculative = 0;
 
@@ -660,6 +661,7 @@  symtab_node::clone_references (symtab_node *node)
       ref2 = create_reference (ref->referred, ref->use, ref->stmt);
       ref2->speculative = speculative;
       ref2->lto_stmt_uid = stmt_uid;
+      ref2->speculative_id = ref->speculative_id;
     }
 }
 
@@ -678,6 +680,7 @@  symtab_node::clone_referring (symtab_node *node)
       ref2 = ref->referring->create_reference (this, ref->use, ref->stmt);
       ref2->speculative = speculative;
       ref2->lto_stmt_uid = stmt_uid;
+      ref2->speculative_id = ref->speculative_id;
     }
 }
 
@@ -693,6 +696,7 @@  symtab_node::clone_reference (ipa_ref *ref, gimple *stmt)
   ref2 = create_reference (ref->referred, ref->use, stmt);
   ref2->speculative = speculative;
   ref2->lto_stmt_uid = stmt_uid;
+  ref2->speculative_id = ref->speculative_id;
   return ref2;
 }
 
@@ -747,6 +751,7 @@  symtab_node::clear_stmts_in_references (void)
       {
 	r->stmt = NULL;
 	r->lto_stmt_uid = 0;
+	r->speculative_id = 0;
       }
 }
 
diff --git a/gcc/testsuite/gcc.dg/tree-prof/crossmodule-indir-call-topn-1.c b/gcc/testsuite/gcc.dg/tree-prof/crossmodule-indir-call-topn-1.c
new file mode 100644
index 00000000000..e0a83c2e067
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-prof/crossmodule-indir-call-topn-1.c
@@ -0,0 +1,35 @@ 
+/* { dg-require-effective-target lto } */
+/* { dg-additional-sources "crossmodule-indir-call-topn-1a.c" } */
+/* { dg-require-profiling "-fprofile-generate" } */
+/* { dg-options "-O2 -flto -DDOJOB=1 -fdump-ipa-profile_estimate --param indir-call-topn-profile=1" } */
+
+#include <stdio.h>
+
+typedef int (*fptr) (int);
+int
+one (int a);
+
+int
+two (int a);
+
+fptr table[] = {&one, &two};
+
+int
+main()
+{
+  int i, x;
+  fptr p = &one;
+
+  x = one (3);
+
+  for (i = 0; i < 350000000; i++)
+    {
+      x = (*p) (3);
+      p = table[x];
+    }
+  printf ("done:%d\n", x);
+}
+
+/* { dg-final-use-not-autofdo { scan-wpa-ipa-dump "Indirect call -> direct call.* one transformation on insn" "profile_estimate" } } */
+/* { dg-final-use-not-autofdo { scan-wpa-ipa-dump "Indirect call -> direct call.* two transformation on insn" "profile_estimate" } } */
+
diff --git a/gcc/testsuite/gcc.dg/tree-prof/crossmodule-indir-call-topn-1a.c b/gcc/testsuite/gcc.dg/tree-prof/crossmodule-indir-call-topn-1a.c
new file mode 100644
index 00000000000..a8c6e365fb9
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-prof/crossmodule-indir-call-topn-1a.c
@@ -0,0 +1,22 @@ 
+/* It seems there is no way to avoid the other source of mulitple
+   source testcase from being compiled independently.  Just avoid
+   error.  */
+#ifdef DOJOB
+int
+one (int a)
+{
+  return 1;
+}
+
+int
+two (int a)
+{
+  return 0;
+}
+#else
+int
+main()
+{
+  return 0;
+}
+#endif
diff --git a/gcc/testsuite/gcc.dg/tree-prof/crossmodule-indir-call-topn-2.c b/gcc/testsuite/gcc.dg/tree-prof/crossmodule-indir-call-topn-2.c
new file mode 100644
index 00000000000..aa3887fde83
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-prof/crossmodule-indir-call-topn-2.c
@@ -0,0 +1,42 @@ 
+/* { dg-require-effective-target lto } */
+/* { dg-additional-sources "crossmodule-indir-call-topn-1a.c" } */
+/* { dg-require-profiling "-fprofile-generate" } */
+/* { dg-options "-O2 -flto -DDOJOB=1 -fdump-ipa-profile_estimate --param indir-call-topn-profile=1" } */
+
+#include <stdio.h>
+
+typedef int (*fptr) (int);
+int
+one (int a);
+
+int
+two (int a);
+
+fptr table[] = {&one, &two};
+
+int foo ()
+{
+  int i, x;
+  fptr p = &one;
+
+  x = one (3);
+
+  for (i = 0; i < 350000000; i++)
+    {
+      x = (*p) (3);
+      p = table[x];
+    }
+  return x;
+}
+
+int
+main()
+{
+  int x = foo ();
+  printf ("done:%d\n", x);
+}
+
+/* { dg-final-use-not-autofdo { scan-wpa-ipa-dump "Indirect call -> direct call.* one transformation on insn" "profile_estimate" } } */
+/* { dg-final-use-not-autofdo { scan-wpa-ipa-dump "Indirect call -> direct call.* two transformation on insn" "profile_estimate" } } */
+
+
diff --git a/gcc/testsuite/gcc.dg/tree-prof/indir-call-prof-topn.c b/gcc/testsuite/gcc.dg/tree-prof/indir-call-prof-topn.c
new file mode 100644
index 00000000000..951bc7ddd19
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-prof/indir-call-prof-topn.c
@@ -0,0 +1,38 @@ 
+/* { dg-require-profiling "-fprofile-generate" } */
+/* { dg-options "-O2 -fdump-ipa-profile --param indir-call-topn-profile=1" } */
+
+#include <stdio.h>
+
+typedef int (*fptr) (int);
+int
+one (int a)
+{
+  return 1;
+}
+
+int
+two (int a)
+{
+  return 0;
+}
+
+fptr table[] = {&one, &two};
+
+int
+main()
+{
+  int i, x;
+  fptr p = &one;
+
+  one (3);
+
+  for (i = 0; i < 350000000; i++)
+    {
+      x = (*p) (3);
+      p = table[x];
+    }
+  printf ("done:%d\n", x);
+}
+
+/* { dg-final-use-not-autofdo { scan-ipa-dump "Indirect call -> direct call.* one transformation on insn" "profile" } } */
+/* { dg-final-use-not-autofdo { scan-ipa-dump "Indirect call -> direct call.* two transformation on insn" "profile" } } */
diff --git a/gcc/tree-inline.c b/gcc/tree-inline.c
index 4311309acce..93d9269aa8f 100644
--- a/gcc/tree-inline.c
+++ b/gcc/tree-inline.c
@@ -2167,6 +2167,26 @@  copy_bb (copy_body_data *id, basic_block bb,
 
 			  gcc_assert (!edge->indirect_unknown_callee);
 			  old_edge->speculative_call_info (direct, indirect, ref);
+			  while (old_edge->next_callee
+				 && old_edge->next_callee->speculative
+				 && indirect->indirect_info
+				 && indirect->indirect_info->num_of_ics > 1)
+			    {
+			      /* Some speculative calls may contain more than
+				 one direct target, loop iterate it to clone all
+				 related direct edges before cloning the related
+				 indirect edge.  */
+			      id->dst_node->clone_reference (ref, stmt);
+
+			      edge = old_edge->next_callee;
+			      edge = edge->clone (id->dst_node, call_stmt,
+						  gimple_uid (stmt), num, den,
+						  true);
+			      old_edge = old_edge->next_callee;
+			      gcc_assert (!edge->indirect_unknown_callee);
+			      old_edge->speculative_call_info (direct, indirect,
+							       ref);
+			    }
 
 			  profile_count indir_cnt = indirect->count;
 			  indirect = indirect->clone (id->dst_node, call_stmt,
diff --git a/gcc/tree-profile.c b/gcc/tree-profile.c
index 554a8c98419..f35bd3f9d7a 100644
--- a/gcc/tree-profile.c
+++ b/gcc/tree-profile.c
@@ -74,8 +74,8 @@  static GTY(()) tree ic_tuple_callee_field;
 /* Do initialization work for the edge profiler.  */
 
 /* Add code:
-   __thread gcov*	__gcov_indirect_call_counters; // pointer to actual counter
-   __thread void*	__gcov_indirect_call_callee; // actual callee address
+   __thread gcov*	__gcov_indirect_call.counters; // pointer to actual counter
+   __thread void*	__gcov_indirect_call.callee; // actual callee address
    __thread int __gcov_function_counter; // time profiler function counter
 */
 static void
@@ -384,7 +384,7 @@  gimple_gen_ic_profiler (histogram_value value, unsigned tag, unsigned base)
       f_1 = foo;
       __gcov_indirect_call.counters = &__gcov4.main[0];
       PROF_9 = f_1;
-      __gcov_indirect_call_callee = PROF_9;
+      __gcov_indirect_call.callee = PROF_9;
       _4 = f_1 ();
    */
 
@@ -447,11 +447,11 @@  gimple_gen_ic_func_profiler (void)
 
   /* Insert code:
 
-     if (__gcov_indirect_call_callee != NULL)
+     if (__gcov_indirect_call.callee != NULL)
        __gcov_indirect_call_profiler_v3 (profile_id, &current_function_decl);
 
      The function __gcov_indirect_call_profiler_v3 is responsible for
-     resetting __gcov_indirect_call_callee to NULL.  */
+     resetting __gcov_indirect_call.callee to NULL.  */
 
   gimple_stmt_iterator gsi = gsi_start_bb (cond_bb);
   void0 = build_int_cst (ptr_type_node, 0);
@@ -893,7 +893,7 @@  pass_ipa_tree_profile::gate (function *)
 {
   /* When profile instrumentation, use or test coverage shall be performed.
      But for AutoFDO, this there is no instrumentation, thus this pass is
-     diabled.  */
+     disabled.  */
   return (!in_lto_p && !flag_auto_profile
 	  && (flag_branch_probabilities || flag_test_coverage
 	      || profile_arc_flag));
diff --git a/gcc/value-prof.c b/gcc/value-prof.c
index 759458868a8..cbcc104bd5c 100644
--- a/gcc/value-prof.c
+++ b/gcc/value-prof.c
@@ -1404,11 +1404,10 @@  gimple_ic (gcall *icall_stmt, struct cgraph_node *direct_call,
   return dcall_stmt;
 }
 
-/*
-  For every checked indirect/virtual call determine if most common pid of
-  function/class method has probability more than 50%. If yes modify code of
-  this call to:
- */
+/* There maybe multiple indirect targets in histogram.  Check every
+   indirect/virtual call if callee function exists, if not exist, leave it to
+   LTO stage for later process.  Modify code of this indirect call to an if-else
+   structure in ipa-profile finally.  */
 
 static bool
 gimple_ic_transform (gimple_stmt_iterator *gsi)
@@ -1432,54 +1431,65 @@  gimple_ic_transform (gimple_stmt_iterator *gsi)
   if (!histogram)
     return false;
 
-  if (!get_nth_most_common_value (NULL, "indirect call", histogram, &val,
-				  &count, &all))
-    return false;
+  count = 0;
+  all = histogram->hvalue.counters[0];
 
-  if (4 * count <= 3 * all)
-    return false;
+  for (unsigned j = 0; j < GCOV_TOPN_VALUES; j++)
+    {
+      if (!get_nth_most_common_value (NULL, "indirect call", histogram, &val,
+				      &count, &all, j))
+	continue;
 
-  direct_call = find_func_by_profile_id ((int)val);
+      if (4 * count <= all)
+	continue;
 
-  if (direct_call == NULL)
-    {
-      if (val)
+      direct_call = find_func_by_profile_id ((int) val);
+
+      if (direct_call == NULL)
+	{
+	  if (val)
+	    {
+	      if (dump_file)
+		{
+		  fprintf (dump_file,
+			   "Indirect call -> direct call from other module");
+		  print_generic_expr (dump_file, gimple_call_fn (stmt),
+				      TDF_SLIM);
+		  fprintf (dump_file, "=> %i (will resolve only with LTO)\n",
+			   (int) val);
+		}
+	    }
+	  continue;
+	}
+
+      if (!check_ic_target (stmt, direct_call))
 	{
 	  if (dump_file)
 	    {
-	      fprintf (dump_file, "Indirect call -> direct call from other module");
+	      fprintf (dump_file, "Indirect call -> direct call ");
 	      print_generic_expr (dump_file, gimple_call_fn (stmt), TDF_SLIM);
-	      fprintf (dump_file, "=> %i (will resolve only with LTO)\n", (int)val);
+	      fprintf (dump_file, "=> ");
+	      print_generic_expr (dump_file, direct_call->decl, TDF_SLIM);
+	      fprintf (dump_file,
+		       " transformation skipped because of type mismatch");
+	      print_gimple_stmt (dump_file, stmt, 0, TDF_SLIM);
 	    }
+	  gimple_remove_histogram_value (cfun, stmt, histogram);
+	  return false;
 	}
-      return false;
-    }
 
-  if (!check_ic_target (stmt, direct_call))
-    {
       if (dump_file)
 	{
 	  fprintf (dump_file, "Indirect call -> direct call ");
 	  print_generic_expr (dump_file, gimple_call_fn (stmt), TDF_SLIM);
 	  fprintf (dump_file, "=> ");
 	  print_generic_expr (dump_file, direct_call->decl, TDF_SLIM);
-	  fprintf (dump_file, " transformation skipped because of type mismatch");
+	  fprintf (dump_file,
+		   " transformation on insn postponed to ipa-profile");
 	  print_gimple_stmt (dump_file, stmt, 0, TDF_SLIM);
+	  fprintf (dump_file, "hist->count %" PRId64 " hist->all %" PRId64 "\n",
+		   count, all);
 	}
-      gimple_remove_histogram_value (cfun, stmt, histogram);
-      return false;
-    }
-
-  if (dump_file)
-    {
-      fprintf (dump_file, "Indirect call -> direct call ");
-      print_generic_expr (dump_file, gimple_call_fn (stmt), TDF_SLIM);
-      fprintf (dump_file, "=> ");
-      print_generic_expr (dump_file, direct_call->decl, TDF_SLIM);
-      fprintf (dump_file, " transformation on insn postponned to ipa-profile");
-      print_gimple_stmt (dump_file, stmt, 0, TDF_SLIM);
-      fprintf (dump_file, "hist->count %" PRId64
-	       " hist->all %" PRId64"\n", count, all);
     }
 
   return true;