diff mbox

Speculative call support in the callgraph

Message ID 20130809121840.GA28721@kam.mff.cuni.cz
State New
Headers show

Commit Message

Jan Hubicka Aug. 9, 2013, 12:18 p.m. UTC
Hi,
this patch adds support for speculative calls into callgraph.  The idea is that
any IPA optimization that believes it knows likely target of an indirect call
(currently I use it for cross-module indirect call profiling, but I expect
Martin J. can easily add support for ipa-cp and I hope to add speculative
devirtualization in foreseeable future since it should make difference for
Firefox). Speculative call replaces indirect call

    call_target ()

by:

    if (call_target == expected_fn)
      expected_fn ();
    else
      call_target ();

We already know how to produce these by gimple_ic, however the idea is to not
modify the body immediately (IPA passes can not do that), but do that only when
we are applying ipa transformations.

Every speculative call is represented by three components attached
to the same call statement (the original indirect call):
 1) a direct call (to expected_fn)
 2) an indirect call (to call_target)
 3) a IPA_REF_ADDR refrence to expected_fn.

Each of those can be found by having "speculative" flag set.

This representation allows IPA passes to be smart about speculative calls.
1) the direct calls can be inlined or redirected to cprop clones just as any
   ordinary calls.  The reference still links to the original call target making
   it possible to later expand the test.
2) When indirect call target is found by ipa propagation, the call can be turned back
   to a direct call. Either by using the exisitng speculative direct call (if
   speculation was right) and keeping optimizations already attached to it
   or by turning the indirect call into new target if speculation was wrong.
   (this is done by resolve_speuclative_call)
3) at the end of IPA optimization we may get rid of speuclative calls that
   don't seem to buy anything.  I think that on most modern cores the
   speculative sequence is actually worse than the indirect call when
   the expected_fn call stays uninlined or unpropagated.
   (not implemented yet)
4) inliner can adjust its metric knowing that inlining speuclative call
   is costier (disabling 3) (also not implemented yet).

This patch implements the infrastructure only. I will send the corss-module
indirect call profiling code in followup patch.

I hoped this all to go quite smoothly into exisitng WHOPR infrastructure that
juding from the size of patch is not true.  I had to deal with some issues:

1) I am not linking the three components of calls explicitely together. Instead
   I rely on statement pointers in both edges&refs to allow later lookup of
   the components (by cgraph_speculative_call_info).

   ipa-reference call statements are however not streamed by LTO, so I had to
   implement the streaming. (hitting some surprises handled by preparation
   patches)
2) IPA-refs are not really maintained through versioning and inlining that is
   needed now for speculative expansion to work.  Currently we update only 
   callgraph edges and we rebuild ipa-refs (and wrongly so not updating
   clones that is harmless since they are used only for unreachable function
   removal).

   I solved this partly by updating speculative calls only (since I can use
   the existing callgraph edge path).
   tree-inline got fat&ugly, I need to push higher the plans to clean it up.
3) cgraph_edge statement hash expect only one call per statement.  I fixed
   this by always recording the direct speculative call only.
4) ipa-prop and ipa-cp was not expecting the function to turn indirect call
   to direct call to return different edge that is passed to it (that happens
   when speculation was right).  It also accesses indirect_call_info in
   edges that are already direct.
   I fixed these and added code to explicitely free the indirect call info
   when they are turned to direct edges. Until now it stayed dormant
   in memory.
5) Indirect call expansion breaks basic blocks. Currently it happens during 
   call redirection performed by inliner.  It needs to be moved later in
   tree-inline after the body is fully duplicated or we ICE on trying to fixup
   not yet finished loop structure in split_bb.
6) ipa-inline was not ready for callgraph edges to be removed as result
   of inlining.  This was longer time on my TODO, so I fixed it by adding
   the missing edge removal hook.

Bootstrapped/regtested x86_64-linux, will commit it later today. Comments welcome.

Honza
	* cgraphbuild.c (cgraph_rebuild_references): Rebuild only non-speculative
	refs.
	* cgraph.c (cgraph_update_edge_in_call_site_hash): New function.
	(cgraph_add_edge_to_call_site_hash): Deal with speculative calls.
	(cgraph_set_call_stmt): Likewise.
	(cgraph_create_edge_1): Fix release checking compilatoin;
	clear lto_stmt_uid.
	(cgraph_free_edge): Free indirect info.
	(cgraph_turn_edge_to_speculative): New function.
	(cgraph_speculative_call_info): New function.
	(cgraph_make_edge_direct): Return direct edge; handle speculation.
	(cgraph_redirect_edge_call_stmt_to_callee): Expand speculative
	edges.
	(dump_cgraph_node): Dump speculation.
	(verify_edge_count_and_frequency): Accept speculative edges.
	(verify_edge_corresponds_to_fndecl): Handle partitioned cgraph.
	(verify_cgraph_node): Handle speculation.
	* cgraph.h (cgraph_edge): Add SPECULATIVE flag.
	(cgraph_set_call_stmt): Update prototype.
	(cgraph_make_edge_direct): Update prototype.
	(cgraph_speculative_call_info): Declare.
	* ipa-cp.c (ipcp_discover_new_direct_edges): Be ready for edge
	to change; update call of ipa_find_references.
	* ipa-ref.c (ipa_record_reference): Fix return value; clear
	lto_stmt_uid and speculative flags.
	(ipa_dump_references): Dump speculation.
	(ipa_clone_references): Clone speculative flag.
	(ipa_clone_referring): Likewise.
	(ipa_clone_ref): New function.
	(ipa_find_reference): Look into lto_stmt_uids
	(ipa_clear_stmts_in_references): Do not clear speculative calls.
	* ipa-ref.h (ipa_ref): Add lto_stmt_uid and speculative flags.
	(ipa_find_reference): Update declaration.
	(ipa_clone_ref): Declare.
	* lto-cgraph.c (lto_output_edge): Make lto_stmt_uids start from 0;
	stream speculative flag.
	(lto_output_ref): Stream statements uids and speculation.
	(input_ref): Likewise.
	(input_edge): Stream speuclation.
	* cgraphclones.c (cgraph_clone_edge): Clone speculation.
	(cgraph_set_call_stmt_including_clones): Handle speculation.
	* ipa-inline.c (heap_edge_removal_hook): New function.
	(inline_small_functions): Register it.
	* lto-streamer-in.c (fixup_call_stmt_edges_1): Bounds checking;
	also initialize refs.
	* ipa-prop.c (ipa_make_edge_direct_to_target): Be ready for
	edge to change.
	(try_make_edge_direct_simple_call): Likewise.
	(try_make_edge_direct_simple_call): Likewise.
	(update_indirect_edges_after_inlining): Likewise.
	(remove_described_reference): Look proper lto_stmt_uid.
	(propagate_controlled_uses): Likewise.
	(propagate_controlled_uses): Liekwise.
	* tree-inline.c (copy_bb): Copy speculative edges.
	(redirect_all_calls): New function.
	(copy_cfg_body): Do redirection after loop info
	is updated.
	(delete_unreachable_blocks_update_callgraph): Updadte
	speculation.
	* value-prof.c (gimple_ic): Export.
	* value-prof.h (gimple_ic): Declare.

Comments

Xinliang David Li Aug. 9, 2013, 4:44 p.m. UTC | #1
I have not looked at the details. One high level question: this form
seems to only support one indirect target case. LIPO uses TOPN
indirect target profiling (tracking multiple targets), which can be
used by LTO as well (when the topn profiling gets into trunk).

thanks,

David

On Fri, Aug 9, 2013 at 5:18 AM, Jan Hubicka <hubicka@ucw.cz> wrote:
> Hi,
> this patch adds support for speculative calls into callgraph.  The idea is that
> any IPA optimization that believes it knows likely target of an indirect call
> (currently I use it for cross-module indirect call profiling, but I expect
> Martin J. can easily add support for ipa-cp and I hope to add speculative
> devirtualization in foreseeable future since it should make difference for
> Firefox). Speculative call replaces indirect call
>
>     call_target ()
>
> by:
>
>     if (call_target == expected_fn)
>       expected_fn ();
>     else
>       call_target ();
>
> We already know how to produce these by gimple_ic, however the idea is to not
> modify the body immediately (IPA passes can not do that), but do that only when
> we are applying ipa transformations.
>
> Every speculative call is represented by three components attached
> to the same call statement (the original indirect call):
>  1) a direct call (to expected_fn)
>  2) an indirect call (to call_target)
>  3) a IPA_REF_ADDR refrence to expected_fn.
>
> Each of those can be found by having "speculative" flag set.
>
> This representation allows IPA passes to be smart about speculative calls.
> 1) the direct calls can be inlined or redirected to cprop clones just as any
>    ordinary calls.  The reference still links to the original call target making
>    it possible to later expand the test.
> 2) When indirect call target is found by ipa propagation, the call can be turned back
>    to a direct call. Either by using the exisitng speculative direct call (if
>    speculation was right) and keeping optimizations already attached to it
>    or by turning the indirect call into new target if speculation was wrong.
>    (this is done by resolve_speuclative_call)
> 3) at the end of IPA optimization we may get rid of speuclative calls that
>    don't seem to buy anything.  I think that on most modern cores the
>    speculative sequence is actually worse than the indirect call when
>    the expected_fn call stays uninlined or unpropagated.
>    (not implemented yet)
> 4) inliner can adjust its metric knowing that inlining speuclative call
>    is costier (disabling 3) (also not implemented yet).
>
> This patch implements the infrastructure only. I will send the corss-module
> indirect call profiling code in followup patch.
>
> I hoped this all to go quite smoothly into exisitng WHOPR infrastructure that
> juding from the size of patch is not true.  I had to deal with some issues:
>
> 1) I am not linking the three components of calls explicitely together. Instead
>    I rely on statement pointers in both edges&refs to allow later lookup of
>    the components (by cgraph_speculative_call_info).
>
>    ipa-reference call statements are however not streamed by LTO, so I had to
>    implement the streaming. (hitting some surprises handled by preparation
>    patches)
> 2) IPA-refs are not really maintained through versioning and inlining that is
>    needed now for speculative expansion to work.  Currently we update only
>    callgraph edges and we rebuild ipa-refs (and wrongly so not updating
>    clones that is harmless since they are used only for unreachable function
>    removal).
>
>    I solved this partly by updating speculative calls only (since I can use
>    the existing callgraph edge path).
>    tree-inline got fat&ugly, I need to push higher the plans to clean it up.
> 3) cgraph_edge statement hash expect only one call per statement.  I fixed
>    this by always recording the direct speculative call only.
> 4) ipa-prop and ipa-cp was not expecting the function to turn indirect call
>    to direct call to return different edge that is passed to it (that happens
>    when speculation was right).  It also accesses indirect_call_info in
>    edges that are already direct.
>    I fixed these and added code to explicitely free the indirect call info
>    when they are turned to direct edges. Until now it stayed dormant
>    in memory.
> 5) Indirect call expansion breaks basic blocks. Currently it happens during
>    call redirection performed by inliner.  It needs to be moved later in
>    tree-inline after the body is fully duplicated or we ICE on trying to fixup
>    not yet finished loop structure in split_bb.
> 6) ipa-inline was not ready for callgraph edges to be removed as result
>    of inlining.  This was longer time on my TODO, so I fixed it by adding
>    the missing edge removal hook.
>
> Bootstrapped/regtested x86_64-linux, will commit it later today. Comments welcome.
>
> Honza
>         * cgraphbuild.c (cgraph_rebuild_references): Rebuild only non-speculative
>         refs.
>         * cgraph.c (cgraph_update_edge_in_call_site_hash): New function.
>         (cgraph_add_edge_to_call_site_hash): Deal with speculative calls.
>         (cgraph_set_call_stmt): Likewise.
>         (cgraph_create_edge_1): Fix release checking compilatoin;
>         clear lto_stmt_uid.
>         (cgraph_free_edge): Free indirect info.
>         (cgraph_turn_edge_to_speculative): New function.
>         (cgraph_speculative_call_info): New function.
>         (cgraph_make_edge_direct): Return direct edge; handle speculation.
>         (cgraph_redirect_edge_call_stmt_to_callee): Expand speculative
>         edges.
>         (dump_cgraph_node): Dump speculation.
>         (verify_edge_count_and_frequency): Accept speculative edges.
>         (verify_edge_corresponds_to_fndecl): Handle partitioned cgraph.
>         (verify_cgraph_node): Handle speculation.
>         * cgraph.h (cgraph_edge): Add SPECULATIVE flag.
>         (cgraph_set_call_stmt): Update prototype.
>         (cgraph_make_edge_direct): Update prototype.
>         (cgraph_speculative_call_info): Declare.
>         * ipa-cp.c (ipcp_discover_new_direct_edges): Be ready for edge
>         to change; update call of ipa_find_references.
>         * ipa-ref.c (ipa_record_reference): Fix return value; clear
>         lto_stmt_uid and speculative flags.
>         (ipa_dump_references): Dump speculation.
>         (ipa_clone_references): Clone speculative flag.
>         (ipa_clone_referring): Likewise.
>         (ipa_clone_ref): New function.
>         (ipa_find_reference): Look into lto_stmt_uids
>         (ipa_clear_stmts_in_references): Do not clear speculative calls.
>         * ipa-ref.h (ipa_ref): Add lto_stmt_uid and speculative flags.
>         (ipa_find_reference): Update declaration.
>         (ipa_clone_ref): Declare.
>         * lto-cgraph.c (lto_output_edge): Make lto_stmt_uids start from 0;
>         stream speculative flag.
>         (lto_output_ref): Stream statements uids and speculation.
>         (input_ref): Likewise.
>         (input_edge): Stream speuclation.
>         * cgraphclones.c (cgraph_clone_edge): Clone speculation.
>         (cgraph_set_call_stmt_including_clones): Handle speculation.
>         * ipa-inline.c (heap_edge_removal_hook): New function.
>         (inline_small_functions): Register it.
>         * lto-streamer-in.c (fixup_call_stmt_edges_1): Bounds checking;
>         also initialize refs.
>         * ipa-prop.c (ipa_make_edge_direct_to_target): Be ready for
>         edge to change.
>         (try_make_edge_direct_simple_call): Likewise.
>         (try_make_edge_direct_simple_call): Likewise.
>         (update_indirect_edges_after_inlining): Likewise.
>         (remove_described_reference): Look proper lto_stmt_uid.
>         (propagate_controlled_uses): Likewise.
>         (propagate_controlled_uses): Liekwise.
>         * tree-inline.c (copy_bb): Copy speculative edges.
>         (redirect_all_calls): New function.
>         (copy_cfg_body): Do redirection after loop info
>         is updated.
>         (delete_unreachable_blocks_update_callgraph): Updadte
>         speculation.
>         * value-prof.c (gimple_ic): Export.
>         * value-prof.h (gimple_ic): Declare.
>
> Index: cgraphbuild.c
> ===================================================================
> --- cgraphbuild.c       (revision 201601)
> +++ cgraphbuild.c       (working copy)
> @@ -483,8 +483,15 @@ cgraph_rebuild_references (void)
>    basic_block bb;
>    struct cgraph_node *node = cgraph_get_node (current_function_decl);
>    gimple_stmt_iterator gsi;
> +  struct ipa_ref *ref;
> +  int i;
>
> -  ipa_remove_all_references (&node->symbol.ref_list);
> +  /* Keep speculative references for further cgraph edge expansion.  */
> +  for (i = 0; ipa_ref_list_reference_iterate (&node->symbol.ref_list, i, ref);)
> +    if (!ref->speculative)
> +      ipa_remove_reference (ref);
> +    else
> +      i++;
>
>    node->count = ENTRY_BLOCK_PTR->count;
>
> Index: cgraph.c
> ===================================================================
> --- cgraph.c    (revision 201601)
> +++ cgraph.c    (working copy)
> @@ -674,14 +674,36 @@ edge_eq (const void *x, const void *y)
>  /* Add call graph edge E to call site hash of its caller.  */
>
>  static inline void
> +cgraph_update_edge_in_call_site_hash (struct cgraph_edge *e)
> +{
> +  void **slot;
> +  slot = htab_find_slot_with_hash (e->caller->call_site_hash,
> +                                  e->call_stmt,
> +                                  htab_hash_pointer (e->call_stmt),
> +                                  INSERT);
> +  *slot = e;
> +}
> +
> +/* Add call graph edge E to call site hash of its caller.  */
> +
> +static inline void
>  cgraph_add_edge_to_call_site_hash (struct cgraph_edge *e)
>  {
>    void **slot;
> +  /* There are two speculative edges for every statement (one direct,
> +     one indirect); always hash the direct one.  */
> +  if (e->speculative && e->indirect_unknown_callee)
> +    return;
>    slot = htab_find_slot_with_hash (e->caller->call_site_hash,
>                                    e->call_stmt,
>                                    htab_hash_pointer (e->call_stmt),
>                                    INSERT);
> -  gcc_assert (!*slot);
> +  if (*slot)
> +    {
> +      gcc_assert (((struct cgraph_edge *)*slot)->speculative);
> +      return;
> +    }
> +  gcc_assert (!*slot || e->speculative);
>    *slot = e;
>  }
>
> @@ -732,14 +754,33 @@ cgraph_edge (struct cgraph_node *node, g
>  }
>
>
> -/* Change field call_stmt of edge E to NEW_STMT.  */
> +/* Change field call_stmt of edge E to NEW_STMT.
> +   If UPDATE_SPECULATIVE and E is any component of speculative
> +   edge, then update all components.  */
>
>  void
> -cgraph_set_call_stmt (struct cgraph_edge *e, gimple new_stmt)
> +cgraph_set_call_stmt (struct cgraph_edge *e, gimple new_stmt,
> +                     bool update_speculative)
>  {
>    tree decl;
>
> -  if (e->caller->call_site_hash)
> +  /* Speculative edges has three component, update all of them
> +     when asked to.  */
> +  if (update_speculative && e->speculative)
> +    {
> +      struct cgraph_edge *direct, *indirect;
> +      struct ipa_ref *ref;
> +
> +      cgraph_speculative_call_info (e, direct, indirect, ref);
> +      cgraph_set_call_stmt (direct, new_stmt, false);
> +      cgraph_set_call_stmt (indirect, new_stmt, false);
> +      ref->stmt = new_stmt;
> +      return;
> +    }
> +
> +  /* Only direct speculative edges go to call_site_hash.  */
> +  if (e->caller->call_site_hash
> +      && (!e->speculative || !e->indirect_unknown_callee))
>      {
>        htab_remove_elt_with_hash (e->caller->call_site_hash,
>                                  e->call_stmt,
> @@ -755,7 +796,7 @@ cgraph_set_call_stmt (struct cgraph_edge
>        struct cgraph_node *new_callee = cgraph_get_node (decl);
>
>        gcc_checking_assert (new_callee);
> -      cgraph_make_edge_direct (e, new_callee);
> +      e = cgraph_make_edge_direct (e, new_callee);
>      }
>
>    push_cfun (DECL_STRUCT_FUNCTION (e->caller->symbol.decl));
> @@ -781,7 +822,10 @@ cgraph_create_edge_1 (struct cgraph_node
>      {
>        /* This is a rather expensive check possibly triggering
>          construction of call stmt hashtable.  */
> -      gcc_checking_assert (!cgraph_edge (caller, call_stmt));
> +#ifdef ENABLE_CHECKING
> +      struct cgraph_edge *e;
> +      gcc_checking_assert (!(e=cgraph_edge (caller, call_stmt)) || e->speculative);
> +#endif
>
>        gcc_assert (is_gimple_call (call_stmt));
>      }
> @@ -804,6 +848,7 @@ cgraph_create_edge_1 (struct cgraph_node
>    edge->next_caller = NULL;
>    edge->prev_callee = NULL;
>    edge->next_callee = NULL;
> +  edge->lto_stmt_uid = 0;
>
>    edge->count = count;
>    gcc_assert (count >= 0);
> @@ -937,6 +982,9 @@ cgraph_free_edge (struct cgraph_edge *e)
>  {
>    int uid = e->uid;
>
> +  if (e->indirect_info)
> +    ggc_free (e->indirect_info);
> +
>    /* Clear out the edge so we do not dangle pointers.  */
>    memset (e, 0, sizeof (*e));
>    e->uid = uid;
> @@ -977,6 +1025,110 @@ cgraph_set_edge_callee (struct cgraph_ed
>    e->callee = n;
>  }
>
> +/* Turn edge E into speculative call calling N2. Update
> +   the profile so the direct call is taken COUNT times
> +   with FREQUENCY.
> +
> +   At clone materialization time, the indirect call E will
> +   be expanded as:
> +
> +   if (call_dest == N2)
> +     n2 ();
> +   else
> +     call call_dest
> +
> +   At this time the function just creates the direct call,
> +   the referencd representing the if conditional and attaches
> +   them all to the orginal indirect call statement.  */
> +
> +void
> +cgraph_turn_edge_to_speculative (struct cgraph_edge *e,
> +                                struct cgraph_node *n2,
> +                                gcov_type direct_count,
> +                                int direct_frequency)
> +{
> +  struct cgraph_node *n = e->caller;
> +  struct ipa_ref *ref;
> +  struct cgraph_edge *e2;
> +
> +  if (dump_file)
> +    {
> +      fprintf (dump_file, "Indirect call -> direct call from"
> +              " other module %s/%i => %s/%i\n",
> +              xstrdup (cgraph_node_name (n)), n->symbol.order,
> +              xstrdup (cgraph_node_name (n2)), n2->symbol.order);
> +    }
> +  e->speculative = true;
> +  e2 = cgraph_create_edge (n, n2, e->call_stmt, direct_count, direct_frequency);
> +  initialize_inline_failed (e2);
> +  e2->speculative = true;
> +  e2->call_stmt = e->call_stmt;
> +  e2->can_throw_external = e->can_throw_external;
> +  e2->lto_stmt_uid = e->lto_stmt_uid;
> +  e->count -= e2->count;
> +  e->frequency -= e2->frequency;
> +  cgraph_call_edge_duplication_hooks (e, e2);
> +  ref = ipa_record_reference ((symtab_node)n, (symtab_node)n2,
> +                             IPA_REF_ADDR, e->call_stmt);
> +  ref->lto_stmt_uid = e->lto_stmt_uid;
> +  ref->speculative = e->speculative;
> +}
> +
> +/* Speculative call consist of three components:
> +   1) an indirect edge representing the original call
> +   2) an direct edge representing the new call
> +   3) ADDR_EXPR reference representing the speculative check.
> +   All three components are attached to single statement (the indirect
> +   call) and if one of them exists, all of them must exist.
> +
> +   Given speculative call edge E, return all three components.
> + */
> +
> +void
> +cgraph_speculative_call_info (struct cgraph_edge *e,
> +                             struct cgraph_edge *&indirect,
> +                             struct cgraph_edge *&direct,
> +                             struct ipa_ref *&reference)
> +{
> +  struct ipa_ref *ref;
> +  int i;
> +  struct cgraph_edge *e2;
> +
> +  if (!e->indirect_unknown_callee)
> +    for (e2 = e->caller->indirect_calls;
> +        e2->call_stmt != e->call_stmt || e2->lto_stmt_uid != e->lto_stmt_uid;
> +        e2 = e2->next_callee)
> +      ;
> +  else
> +    {
> +      e2 = e;
> +      /* We can take advantage of the call stmt hash.  */
> +      if (e2->call_stmt)
> +       {
> +         e = cgraph_edge (e->caller, e2->call_stmt);
> +         gcc_assert (!e->speculative && !e->indirect_unknown_callee);
> +       }
> +      else
> +       for (e = e->caller->callees;
> +            e2->call_stmt != e->call_stmt || e2->lto_stmt_uid != e->lto_stmt_uid;
> +            e = e->next_callee)
> +         ;
> +    }
> +  gcc_assert (e->speculative && e2->speculative);
> +  indirect = e;
> +  direct = e2;
> +
> +  reference = NULL;
> +  for (i = 0; ipa_ref_list_reference_iterate (&e->caller->symbol.ref_list, i, ref); i++)
> +    if (ref->speculative
> +       && ((ref->stmt && ref->stmt == e->call_stmt)
> +           || (ref->lto_stmt_uid == e->lto_stmt_uid)))
> +      {
> +       reference = ref;
> +       break;
> +      }
> +}
> +
>  /* Redirect callee of E to N.  The function does not update underlying
>     call expression.  */
>
> @@ -990,14 +1142,74 @@ cgraph_redirect_edge_callee (struct cgra
>    cgraph_set_edge_callee (e, n);
>  }
>
> +/* Speculative call EDGE turned out to be direct call to CALLE_DECL.
> +   Remove the speculative call sequence and return edge representing the call.
> +   It is up to caller to redirect the call as appropriate. */
> +
> +static struct cgraph_edge *
> +cgraph_resolve_speculation (struct cgraph_edge *edge, tree callee_decl)
> +{
> +  struct cgraph_edge *e2;
> +  struct ipa_ref *ref;
> +
> +  gcc_assert (edge->speculative);
> +  cgraph_speculative_call_info (edge, e2, edge, ref);
> +  if (ref->referred->symbol.decl != callee_decl)
> +    {
> +      if (dump_file)
> +       {
> +         fprintf (dump_file, "Speculative indirect call %s/%i => %s/%i has "
> +                  "turned out to have contradicitng known target ",
> +                  xstrdup (cgraph_node_name (edge->caller)), edge->caller->symbol.order,
> +                  xstrdup (cgraph_node_name (e2->callee)), e2->callee->symbol.order);
> +         print_generic_expr (dump_file, callee_decl, 0);
> +          fprintf (dump_file, "\n");
> +       }
> +    }
> +  else
> +    {
> +      struct cgraph_edge *tmp = edge;
> +      if (dump_file)
> +        fprintf (dump_file, "Speculative call turned into direct call.\n");
> +      edge = e2;
> +      e2 = tmp;
> +    }
> +  edge->count += e2->count;
> +  edge->frequency += e2->frequency;
> +  edge->speculative = false;
> +  e2->speculative = false;
> +  if (e2->indirect_unknown_callee || e2->inline_failed)
> +    cgraph_remove_edge (e2);
> +  else
> +    cgraph_remove_node_and_inline_clones (e2->callee, NULL);
> +  if (edge->caller->call_site_hash)
> +    cgraph_update_edge_in_call_site_hash (edge);
> +  ipa_remove_reference (ref);
> +  return edge;
> +}
> +
>  /* Make an indirect EDGE with an unknown callee an ordinary edge leading to
>     CALLEE.  DELTA is an integer constant that is to be added to the this
>     pointer (first parameter) to compensate for skipping a thunk adjustment.  */
>
> -void
> +struct cgraph_edge *
>  cgraph_make_edge_direct (struct cgraph_edge *edge, struct cgraph_node *callee)
>  {
> +  gcc_assert (edge->indirect_unknown_callee);
> +
> +  /* If we are redirecting speculative call, make it non-speculative.  */
> +  if (edge->indirect_unknown_callee && edge->speculative)
> +    {
> +      edge = cgraph_resolve_speculation (edge, callee->symbol.decl);
> +
> +      /* On successful speculation just return the pre existing direct edge.  */
> +      if (!edge->indirect_unknown_callee)
> +        return edge;
> +    }
> +
>    edge->indirect_unknown_callee = 0;
> +  ggc_free (edge->indirect_info);
> +  edge->indirect_info = NULL;
>
>    /* Get the edge out of the indirect edge list. */
>    if (edge->prev_callee)
> @@ -1024,6 +1236,7 @@ cgraph_make_edge_direct (struct cgraph_e
>
>    /* We need to re-determine the inlining status of the edge.  */
>    initialize_inline_failed (edge);
> +  return edge;
>  }
>
>  /* If necessary, change the function declaration in the call statement
> @@ -1039,6 +1252,50 @@ cgraph_redirect_edge_call_stmt_to_callee
>    struct cgraph_node *node;
>  #endif
>
> +  if (e->speculative)
> +    {
> +      struct cgraph_edge *e2;
> +      gimple new_stmt;
> +      struct ipa_ref *ref;
> +
> +      cgraph_speculative_call_info (e, e, e2, ref);
> +      if (gimple_call_fndecl (e->call_stmt))
> +       e = cgraph_resolve_speculation (e, gimple_call_fndecl (e->call_stmt));
> +      else
> +       {
> +         if (dump_file)
> +           fprintf (dump_file, "Expanding speculative call of %s/%i -> %s/%i\n",
> +                    xstrdup (cgraph_node_name (e->caller)), e->caller->symbol.order,
> +                    xstrdup (cgraph_node_name (e->callee)), e->callee->symbol.order);
> +         gcc_assert (e2->speculative);
> +         push_cfun (DECL_STRUCT_FUNCTION (e->caller->symbol.decl));
> +         new_stmt = gimple_ic (e->call_stmt, cgraph (ref->referred),
> +                               e->count || e2->count
> +                               ?  RDIV (e->count * REG_BR_PROB_BASE,
> +                                        e->count + e2->count)
> +                               : e->frequency || e2->frequency
> +                               ? RDIV (e->frequency * REG_BR_PROB_BASE,
> +                                       e->frequency + e2->frequency)
> +                               : REG_BR_PROB_BASE / 2,
> +                               e->count, e->count + e2->count);
> +         e->speculative = false;
> +         cgraph_set_call_stmt_including_clones (e->caller, e->call_stmt, new_stmt, false);
> +         e->frequency = compute_call_stmt_bb_frequency (e->caller->symbol.decl,
> +                                                        gimple_bb (e->call_stmt));
> +         e2->frequency = compute_call_stmt_bb_frequency (e2->caller->symbol.decl,
> +                                                         gimple_bb (e2->call_stmt));
> +         e2->speculative = false;
> +         ref->speculative = false;
> +         ref->stmt = NULL;
> +         /* Indirect edges are not both in the call site hash.
> +            get it updated.  */
> +         if (e->caller->call_site_hash)
> +           cgraph_update_edge_in_call_site_hash (e2);
> +         pop_cfun ();
> +         /* Continue redirecting E to proper target.  */
> +       }
> +    }
> +
>    if (e->indirect_unknown_callee
>        || decl == e->callee->symbol.decl)
>      return e->call_stmt;
> @@ -1099,7 +1356,7 @@ cgraph_redirect_edge_call_stmt_to_callee
>        update_stmt (new_stmt);
>      }
>
> -  cgraph_set_call_stmt_including_clones (e->caller, e->call_stmt, new_stmt);
> +  cgraph_set_call_stmt_including_clones (e->caller, e->call_stmt, new_stmt, false);
>
>    if (cgraph_dump_file)
>      {
> @@ -1603,6 +1860,8 @@ dump_cgraph_node (FILE *f, struct cgraph
>        if (edge->frequency)
>         fprintf (f, "(%.2f per call) ",
>                  edge->frequency / (double)CGRAPH_FREQ_BASE);
> +      if (edge->speculative)
> +       fprintf(f, "(speculative) ");
>        if (!edge->inline_failed)
>         fprintf(f, "(inlined) ");
>        if (edge->indirect_inlining_edge)
> @@ -1616,6 +1875,8 @@ dump_cgraph_node (FILE *f, struct cgraph
>      {
>        fprintf (f, "%s/%i ", cgraph_node_asm_name (edge->callee),
>                edge->callee->symbol.order);
> +      if (edge->speculative)
> +       fprintf(f, "(speculative) ");
>        if (!edge->inline_failed)
>         fprintf(f, "(inlined) ");
>        if (edge->indirect_inlining_edge)
> @@ -2262,6 +2523,7 @@ verify_edge_count_and_frequency (struct
>      }
>    if (gimple_has_body_p (e->caller->symbol.decl)
>        && !e->caller->global.inlined_to
> +      && !e->speculative
>        /* FIXME: Inline-analysis sets frequency to 0 when edge is optimized out.
>          Remove this once edges are actually removed from the function at that time.  */
>        && (e->frequency
> @@ -2316,7 +2578,7 @@ verify_edge_corresponds_to_fndecl (struc
>
>    /* We do not know if a node from a different partition is an alias or what it
>       aliases and therefore cannot do the former_clone_of check reliably.  */
> -  if (!node || node->symbol.in_other_partition)
> +  if (!node || node->symbol.in_other_partition || e->callee->symbol.in_other_partition)
>      return false;
>    node = cgraph_function_or_thunk_node (node, NULL);
>
> @@ -2625,7 +2887,7 @@ verify_cgraph_node (struct cgraph_node *
>         }
>        for (e = node->indirect_calls; e; e = e->next_callee)
>         {
> -         if (!e->aux)
> +         if (!e->aux && !e->speculative)
>             {
>               error ("an indirect edge from %s has no corresponding call_stmt",
>                      identifier_to_locale (cgraph_node_name (e->caller)));
> Index: cgraph.h
> ===================================================================
> --- cgraph.h    (revision 201568)
> +++ cgraph.h    (working copy)
> @@ -483,6 +483,24 @@ struct GTY((chain_next ("%h.next_caller"
>    unsigned int call_stmt_cannot_inline_p : 1;
>    /* Can this call throw externally?  */
>    unsigned int can_throw_external : 1;
> +  /* Edges with SPECULATIVE flag represents indirect calls that was
> +     speculatively turned into direct (i.e. by profile feedback).
> +     The final code sequence will have form:
> +
> +     if (call_target == expected_fn)
> +       expected_fn ();
> +     else
> +       call_target ();
> +
> +     Every speculative call is represented by three components attached
> +     to a same call statement:
> +     1) a direct call (to expected_fn)
> +     2) an indirect call (to call_target)
> +     3) a IPA_REF_ADDR refrence to expected_fn.
> +
> +     Optimizers may later redirect direct call to clone, so 1) and 3)
> +     do not need to necesarily agree with destination.  */
> +  unsigned int speculative : 1;
>  };
>
>  #define CGRAPH_FREQ_BASE 1000
> @@ -629,7 +647,7 @@ struct cgraph_node * cgraph_add_thunk (s
>                                        HOST_WIDE_INT, tree, tree);
>  struct cgraph_node *cgraph_node_for_asm (tree);
>  struct cgraph_edge *cgraph_edge (struct cgraph_node *, gimple);
> -void cgraph_set_call_stmt (struct cgraph_edge *, gimple);
> +void cgraph_set_call_stmt (struct cgraph_edge *, gimple, bool update_speculative = true);
>  void cgraph_update_edges_for_call_stmt (gimple, tree, gimple);
>  struct cgraph_local_info *cgraph_local_info (tree);
>  struct cgraph_global_info *cgraph_global_info (tree);
> @@ -641,7 +659,7 @@ void cgraph_call_edge_duplication_hooks
>                                          struct cgraph_edge *);
>
>  void cgraph_redirect_edge_callee (struct cgraph_edge *, struct cgraph_node *);
> -void cgraph_make_edge_direct (struct cgraph_edge *, struct cgraph_node *);
> +struct cgraph_edge *cgraph_make_edge_direct (struct cgraph_edge *, struct cgraph_node *);
>  bool cgraph_only_called_directly_p (struct cgraph_node *);
>
>  bool cgraph_function_possibly_inlined_p (tree);
> @@ -702,6 +720,14 @@ bool cgraph_propagate_frequency (struct
>  struct cgraph_node * cgraph_function_node (struct cgraph_node *,
>                                            enum availability *avail = NULL);
>  bool cgraph_get_body (struct cgraph_node *node);
> +void
> +cgraph_turn_edge_to_speculative (struct cgraph_edge *,
> +                                struct cgraph_node *,
> +                                gcov_type, int);
> +void cgraph_speculative_call_info (struct cgraph_edge *,
> +                                  struct cgraph_edge *&,
> +                                  struct cgraph_edge *&,
> +                                  struct ipa_ref *&);
>
>  /* In cgraphunit.c  */
>  struct asm_node *add_asm_node (tree);
> @@ -735,7 +761,8 @@ struct cgraph_node * cgraph_create_virtu
>                                                   const char *clone_name);
>  struct cgraph_node *cgraph_find_replacement_node (struct cgraph_node *);
>  bool cgraph_remove_node_and_inline_clones (struct cgraph_node *, struct cgraph_node *);
> -void cgraph_set_call_stmt_including_clones (struct cgraph_node *, gimple, gimple);
> +void cgraph_set_call_stmt_including_clones (struct cgraph_node *, gimple, gimple,
> +                                           bool update_speculative = true);
>  void cgraph_create_edge_including_clones (struct cgraph_node *,
>                                           struct cgraph_node *,
>                                           gimple, gimple, gcov_type, int,
> Index: value-prof.c
> ===================================================================
> --- value-prof.c        (revision 201568)
> +++ value-prof.c        (working copy)
> @@ -1248,7 +1248,7 @@ check_ic_target (gimple call_stmt, struc
>      old call
>   */
>
> -static gimple
> +gimple
>  gimple_ic (gimple icall_stmt, struct cgraph_node *direct_call,
>            int prob, gcov_type count, gcov_type all)
>  {
> Index: value-prof.h
> ===================================================================
> --- value-prof.h        (revision 201568)
> +++ value-prof.h        (working copy)
> @@ -86,6 +86,8 @@ void gimple_move_stmt_histograms (struct
>  void verify_histograms (void);
>  void free_histograms (void);
>  void stringop_block_profile (gimple, unsigned int *, HOST_WIDE_INT *);
> +gimple gimple_ic (gimple, struct cgraph_node *, int, gcov_type, gcov_type);
> +
>
>  /* In tree-profile.c.  */
>  extern void gimple_init_edge_profiler (void);
> Index: ipa-cp.c
> ===================================================================
> --- ipa-cp.c    (revision 201568)
> +++ ipa-cp.c    (working copy)
> @@ -2278,14 +2278,15 @@ ipcp_discover_new_direct_edges (struct c
>                                                aggvals);
>        if (target)
>         {
> +         bool agg_contents = ie->indirect_info->agg_contents;
> +         bool polymorphic = ie->indirect_info->polymorphic;
> +         bool param_index = ie->indirect_info->param_index;
>           struct cgraph_edge *cs = ipa_make_edge_direct_to_target (ie, target);
>           found = true;
>
> -         if (cs && !ie->indirect_info->agg_contents
> -             && !ie->indirect_info->polymorphic)
> +         if (cs && !agg_contents && !polymorphic)
>             {
>               struct ipa_node_params *info = IPA_NODE_REF (node);
> -             int param_index = ie->indirect_info->param_index;
>               int c = ipa_get_controlled_uses (info, param_index);
>               if (c != IPA_UNDESCRIBED_USE)
>                 {
> @@ -2299,7 +2300,7 @@ ipcp_discover_new_direct_edges (struct c
>                   if (c == 0
>                       && (to_del = ipa_find_reference ((symtab_node) node,
>                                                        (symtab_node) cs->callee,
> -                                                      NULL)))
> +                                                      NULL, 0)))
>                     {
>                       if (dump_file && (dump_flags & TDF_DETAILS))
>                         fprintf (dump_file, "       and even removing its "
> Index: ipa-ref.c
> ===================================================================
> --- ipa-ref.c   (revision 201601)
> +++ ipa-ref.c   (working copy)
> @@ -38,7 +38,7 @@ ipa_record_reference (symtab_node referr
>                       symtab_node referred_node,
>                       enum ipa_ref_use use_type, gimple stmt)
>  {
> -  struct ipa_ref *ref;
> +  struct ipa_ref *ref, *ref2;
>    struct ipa_ref_list *list, *list2;
>    ipa_ref_t *old_references;
>
> @@ -56,14 +56,16 @@ ipa_record_reference (symtab_node referr
>    ref->referring = referring_node;
>    ref->referred = referred_node;
>    ref->stmt = stmt;
> +  ref->lto_stmt_uid = 0;
>    ref->use = use_type;
> +  ref->speculative = 0;
>
>    /* If vector was moved in memory, update pointers.  */
>    if (old_references != list->references->address ())
>      {
>        int i;
> -      for (i = 0; ipa_ref_list_reference_iterate (list, i, ref); i++)
> -       ipa_ref_referred_ref_list (ref)->referring[ref->referred_index] = ref;
> +      for (i = 0; ipa_ref_list_reference_iterate (list, i, ref2); i++)
> +       ipa_ref_referred_ref_list (ref2)->referring[ref2->referred_index] = ref2;
>      }
>    return ref;
>  }
> @@ -155,6 +157,8 @@ ipa_dump_references (FILE * file, struct
>                 symtab_node_asm_name (ref->referred),
>                 ref->referred->symbol.order,
>                ipa_ref_use_name [ref->use]);
> +      if (ref->speculative)
> +       fprintf (file, " (speculative)");
>      }
>    fprintf (file, "\n");
>  }
> @@ -172,22 +176,50 @@ ipa_dump_referring (FILE * file, struct
>                 symtab_node_asm_name (ref->referring),
>                 ref->referring->symbol.order,
>                ipa_ref_use_name [ref->use]);
> +      if (ref->speculative)
> +       fprintf (file, " (speculative)");
>      }
>    fprintf (file, "\n");
>  }
>
> +/* Clone reference REF to DEST_NODE and set its stmt to STMT.  */
> +
> +struct ipa_ref *
> +ipa_clone_ref (struct ipa_ref *ref,
> +              symtab_node dest_node,
> +              gimple stmt)
> +{
> +  bool speculative = ref->speculative;
> +  unsigned int stmt_uid = ref->lto_stmt_uid;
> +  struct ipa_ref *ref2;
> +
> +  ref2 = ipa_record_reference (dest_node,
> +                              ref->referred,
> +                              ref->use, stmt);
> +  ref2->speculative = speculative;
> +  ref2->lto_stmt_uid = stmt_uid;
> +  return ref2;
> +}
> +
>  /* Clone all references from SRC to DEST_NODE or DEST_VARPOOL_NODE.  */
>
>  void
>  ipa_clone_references (symtab_node dest_node,
>                       struct ipa_ref_list *src)
>  {
> -  struct ipa_ref *ref;
> +  struct ipa_ref *ref, *ref2;
>    int i;
>    for (i = 0; ipa_ref_list_reference_iterate (src, i, ref); i++)
> -    ipa_record_reference (dest_node,
> -                         ref->referred,
> -                         ref->use, ref->stmt);
> +    {
> +      bool speculative = ref->speculative;
> +      unsigned int stmt_uid = ref->lto_stmt_uid;
> +
> +      ref2 = ipa_record_reference (dest_node,
> +                                  ref->referred,
> +                                  ref->use, ref->stmt);
> +      ref2->speculative = speculative;
> +      ref2->lto_stmt_uid = stmt_uid;
> +    }
>  }
>
>  /* Clone all referring from SRC to DEST_NODE or DEST_VARPOOL_NODE.  */
> @@ -196,12 +228,19 @@ void
>  ipa_clone_referring (symtab_node dest_node,
>                     struct ipa_ref_list *src)
>  {
> -  struct ipa_ref *ref;
> +  struct ipa_ref *ref, *ref2;
>    int i;
>    for (i = 0; ipa_ref_list_referring_iterate (src, i, ref); i++)
> -    ipa_record_reference (ref->referring,
> -                         dest_node,
> -                         ref->use, ref->stmt);
> +    {
> +      bool speculative = ref->speculative;
> +      unsigned int stmt_uid = ref->lto_stmt_uid;
> +
> +      ref2 = ipa_record_reference (ref->referring,
> +                                  dest_node,
> +                                  ref->use, ref->stmt);
> +      ref2->speculative = speculative;
> +      ref2->lto_stmt_uid = stmt_uid;
> +    }
>  }
>
>  /* Return true when execution of REF can lead to return from
> @@ -230,14 +269,17 @@ ipa_ref_has_aliases_p (struct ipa_ref_li
>
>  struct ipa_ref *
>  ipa_find_reference (symtab_node referring_node, symtab_node referred_node,
> -                   gimple stmt)
> +                   gimple stmt, unsigned int lto_stmt_uid)
>  {
>    struct ipa_ref *r = NULL;
>    int i;
>
>    for (i = 0; ipa_ref_list_reference_iterate (&referring_node->symbol.ref_list, i, r); i++)
>      if (r->referred == referred_node
> -       && (in_lto_p || r->stmt == stmt))
> +       && !r->speculative
> +       && ((stmt && r->stmt == stmt)
> +           || (lto_stmt_uid && r->lto_stmt_uid == lto_stmt_uid)
> +           || (!stmt && !lto_stmt_uid && !r->stmt && !r->lto_stmt_uid)))
>        return r;
>    return NULL;
>  }
> @@ -257,7 +299,9 @@ ipa_remove_stmt_references (symtab_node
>  }
>
>  /* Remove all stmt references in non-speculative references.
> -   Those are not maintained during inlining & clonning. */
> +   Those are not maintained during inlining & clonning.
> +   The exception are speculative references that are updated along
> +   with callgraph edges associated with them.  */
>
>  void
>  ipa_clear_stmts_in_references (symtab_node referring_node)
> @@ -266,5 +310,6 @@ ipa_clear_stmts_in_references (symtab_no
>    int i;
>
>    for (i = 0; ipa_ref_list_reference_iterate (&referring_node->symbol.ref_list, i, r); i++)
> -    r->stmt = NULL;
> +    if (!r->speculative)
> +      r->stmt = NULL;
>  }
> Index: ipa-ref.h
> ===================================================================
> --- ipa-ref.h   (revision 201601)
> +++ ipa-ref.h   (working copy)
> @@ -40,8 +40,10 @@ struct GTY(()) ipa_ref
>    symtab_node referring;
>    symtab_node referred;
>    gimple stmt;
> +  unsigned int lto_stmt_uid;
>    unsigned int referred_index;
>    ENUM_BITFIELD (ipa_ref_use) use:2;
> +  unsigned int speculative:1;
>  };
>
>  typedef struct ipa_ref ipa_ref_t;
> @@ -71,8 +73,9 @@ void ipa_dump_references (FILE *, struct
>  void ipa_dump_referring (FILE *, struct ipa_ref_list *);
>  void ipa_clone_references (symtab_node, struct ipa_ref_list *);
>  void ipa_clone_referring (symtab_node, struct ipa_ref_list *);
> +struct ipa_ref * ipa_clone_ref (struct ipa_ref *, symtab_node, gimple);
>  bool ipa_ref_cannot_lead_to_return (struct ipa_ref *);
>  bool ipa_ref_has_aliases_p (struct ipa_ref_list *);
> -struct ipa_ref * ipa_find_reference (symtab_node, symtab_node, gimple);
> +struct ipa_ref * ipa_find_reference (symtab_node, symtab_node, gimple, unsigned int);
>  void ipa_remove_stmt_references (symtab_node, gimple);
>  void ipa_clear_stmts_in_references (symtab_node);
> Index: lto-cgraph.c
> ===================================================================
> --- lto-cgraph.c        (revision 201568)
> +++ lto-cgraph.c        (working copy)
> @@ -273,12 +273,13 @@ lto_output_edge (struct lto_simple_outpu
>
>    bp = bitpack_create (ob->main_stream);
>    uid = (!gimple_has_body_p (edge->caller->symbol.decl)
> -        ? edge->lto_stmt_uid : gimple_uid (edge->call_stmt));
> +        ? edge->lto_stmt_uid : gimple_uid (edge->call_stmt) + 1);
>    bp_pack_enum (&bp, cgraph_inline_failed_enum,
>                 CIF_N_REASONS, edge->inline_failed);
>    bp_pack_var_len_unsigned (&bp, uid);
>    bp_pack_var_len_unsigned (&bp, edge->frequency);
>    bp_pack_value (&bp, edge->indirect_inlining_edge, 1);
> +  bp_pack_value (&bp, edge->speculative, 1);
>    bp_pack_value (&bp, edge->call_stmt_cannot_inline_p, 1);
>    bp_pack_value (&bp, edge->can_throw_external, 1);
>    if (edge->indirect_unknown_callee)
> @@ -589,13 +590,24 @@ lto_output_ref (struct lto_simple_output
>  {
>    struct bitpack_d bp;
>    int nref;
> +  int uid = ref->lto_stmt_uid;
> +  struct cgraph_node *node;
>
>    bp = bitpack_create (ob->main_stream);
>    bp_pack_value (&bp, ref->use, 2);
> +  bp_pack_value (&bp, ref->speculative, 1);
>    streamer_write_bitpack (&bp);
>    nref = lto_symtab_encoder_lookup (encoder, ref->referred);
>    gcc_assert (nref != LCC_NOT_FOUND);
>    streamer_write_hwi_stream (ob->main_stream, nref);
> +
> +  node = dyn_cast <cgraph_node> (ref->referring);
> +  if (node)
> +    {
> +      if (ref->stmt)
> +       uid = gimple_uid (ref->stmt) + 1;
> +      streamer_write_hwi_stream (ob->main_stream, uid);
> +    }
>  }
>
>  /* Stream out profile_summary to OB.  */
> @@ -1116,11 +1128,17 @@ input_ref (struct lto_input_block *ib,
>    symtab_node node = NULL;
>    struct bitpack_d bp;
>    enum ipa_ref_use use;
> +  bool speculative;
> +  struct ipa_ref *ref;
>
>    bp = streamer_read_bitpack (ib);
>    use = (enum ipa_ref_use) bp_unpack_value (&bp, 2);
> +  speculative = (enum ipa_ref_use) bp_unpack_value (&bp, 1);
>    node = nodes[streamer_read_hwi (ib)];
> -  ipa_record_reference (referring_node, node, use, NULL);
> +  ref = ipa_record_reference (referring_node, node, use, NULL);
> +  ref->speculative = speculative;
> +  if (is_a <cgraph_node> (referring_node))
> +    ref->lto_stmt_uid = streamer_read_hwi (ib);
>  }
>
>  /* Read an edge from IB.  NODES points to a vector of previously read nodes for
> @@ -1167,6 +1185,7 @@ input_edge (struct lto_input_block *ib,
>      edge = cgraph_create_edge (caller, callee, NULL, count, freq);
>
>    edge->indirect_inlining_edge = bp_unpack_value (&bp, 1);
> +  edge->speculative = bp_unpack_value (&bp, 1);
>    edge->lto_stmt_uid = stmt_id;
>    edge->inline_failed = inline_failed;
>    edge->call_stmt_cannot_inline_p = bp_unpack_value (&bp, 1);
> Index: cgraphclones.c
> ===================================================================
> --- cgraphclones.c      (revision 201601)
> +++ cgraphclones.c      (working copy)
> @@ -147,6 +147,7 @@ cgraph_clone_edge (struct cgraph_edge *e
>    /* Clone flags that depend on call_stmt availability manually.  */
>    new_edge->can_throw_external = e->can_throw_external;
>    new_edge->call_stmt_cannot_inline_p = e->call_stmt_cannot_inline_p;
> +  new_edge->speculative = e->speculative;
>    if (update_original)
>      {
>        e->count -= new_edge->count;
> @@ -474,17 +475,21 @@ cgraph_find_replacement_node (struct cgr
>  }
>
>  /* Like cgraph_set_call_stmt but walk the clone tree and update all
> -   clones sharing the same function body.  */
> +   clones sharing the same function body.
> +   When WHOLE_SPECULATIVE_EDGES is true, all three components of
> +   speculative edge gets updated.  Otherwise we update only direct
> +   call.  */
>
>  void
>  cgraph_set_call_stmt_including_clones (struct cgraph_node *orig,
> -                                      gimple old_stmt, gimple new_stmt)
> +                                      gimple old_stmt, gimple new_stmt,
> +                                      bool update_speculative)
>  {
>    struct cgraph_node *node;
>    struct cgraph_edge *edge = cgraph_edge (orig, old_stmt);
>
>    if (edge)
> -    cgraph_set_call_stmt (edge, new_stmt);
> +    cgraph_set_call_stmt (edge, new_stmt, update_speculative);
>
>    node = orig->clones;
>    if (node)
> @@ -492,7 +497,23 @@ cgraph_set_call_stmt_including_clones (s
>        {
>         struct cgraph_edge *edge = cgraph_edge (node, old_stmt);
>         if (edge)
> -         cgraph_set_call_stmt (edge, new_stmt);
> +         {
> +           cgraph_set_call_stmt (edge, new_stmt, update_speculative);
> +           /* If UPDATE_SPECULATIVE is false, it means that we are turning
> +              speculative call into a real code sequence.  Update the
> +              callgraph edges.  */
> +           if (edge->speculative && !update_speculative)
> +             {
> +               struct cgraph_edge *direct, *indirect;
> +               struct ipa_ref *ref;
> +
> +               gcc_assert (!edge->indirect_unknown_callee);
> +               cgraph_speculative_call_info (edge, direct, indirect, ref);
> +               direct->speculative = false;
> +               indirect->speculative = false;
> +               ref->speculative = false;
> +             }
> +         }
>         if (node->clones)
>           node = node->clones;
>         else if (node->next_sibling_clone)
> @@ -811,6 +832,7 @@ cgraph_materialize_all_clones (void)
>  {
>    struct cgraph_node *node;
>    bool stabilized = false;
> +
>
>    if (cgraph_dump_file)
>      fprintf (cgraph_dump_file, "Materializing clones\n");
> Index: ipa-inline.c
> ===================================================================
> --- ipa-inline.c        (revision 201601)
> +++ ipa-inline.c        (working copy)
> @@ -1388,6 +1388,17 @@ add_new_edges_to_heap (fibheap_t heap, v
>      }
>  }
>
> +/* Remove EDGE from the fibheap.  */
> +
> +static void
> +heap_edge_removal_hook (struct cgraph_edge *e, void *data)
> +{
> +  if (e->aux)
> +    {
> +      fibheap_delete_node ((fibheap_t)data, (fibnode_t)e->aux);
> +      e->aux = NULL;
> +    }
> +}
>
>  /* We use greedy algorithm for inlining of small functions:
>     All inline candidates are put into prioritized heap ordered in
> @@ -1406,10 +1417,14 @@ inline_small_functions (void)
>    vec<cgraph_edge_p> new_indirect_edges = vNULL;
>    int initial_size = 0;
>    struct cgraph_node **order = XCNEWVEC (struct cgraph_node *, cgraph_n_nodes);
> +  struct cgraph_edge_hook_list *edge_removal_hook_holder;
>
>    if (flag_indirect_inlining)
>      new_indirect_edges.create (8);
>
> +  edge_removal_hook_holder
> +    = cgraph_add_edge_removal_hook (&heap_edge_removal_hook, edge_heap);
> +
>    /* Compute overall unit size and other global parameters used by badness
>       metrics.  */
>
> @@ -1663,6 +1678,7 @@ inline_small_functions (void)
>              initial_size, overall_size,
>              initial_size ? overall_size * 100 / (initial_size) - 100: 0);
>    BITMAP_FREE (updated_nodes);
> +  cgraph_remove_edge_removal_hook (edge_removal_hook_holder);
>  }
>
>  /* Flatten NODE.  Performed both during early inlining and
> Index: lto-streamer-in.c
> ===================================================================
> --- lto-streamer-in.c   (revision 201627)
> +++ lto-streamer-in.c   (working copy)
> @@ -755,30 +755,60 @@ input_ssa_names (struct lto_input_block
>     so they point to STMTS.  */
>
>  static void
> -fixup_call_stmt_edges_1 (struct cgraph_node *node, gimple *stmts)
> +fixup_call_stmt_edges_1 (struct cgraph_node *node, gimple *stmts,
> +                        struct function *fn)
>  {
>    struct cgraph_edge *cedge;
> +  struct ipa_ref *ref;
> +  unsigned int i;
> +
>    for (cedge = node->callees; cedge; cedge = cedge->next_callee)
> -    cedge->call_stmt = stmts[cedge->lto_stmt_uid];
> +    {
> +      if (gimple_stmt_max_uid (fn) < cedge->lto_stmt_uid)
> +      fatal_error ("Cgraph edge statement index out of range");
> +      cedge->call_stmt = stmts[cedge->lto_stmt_uid - 1];
> +      if (!cedge->call_stmt)
> +      fatal_error ("Cgraph edge statement index not found");
> +    }
>    for (cedge = node->indirect_calls; cedge; cedge = cedge->next_callee)
> -    cedge->call_stmt = stmts[cedge->lto_stmt_uid];
> +    {
> +      if (gimple_stmt_max_uid (fn) < cedge->lto_stmt_uid)
> +      fatal_error ("Cgraph edge statement index out of range");
> +      cedge->call_stmt = stmts[cedge->lto_stmt_uid - 1];
> +      if (!cedge->call_stmt)
> +      fatal_error ("Cgraph edge statement index not found");
> +    }
> +  for (i = 0;
> +       ipa_ref_list_reference_iterate (&node->symbol.ref_list, i, ref);
> +       i++)
> +    if (ref->lto_stmt_uid)
> +      {
> +       if (gimple_stmt_max_uid (fn) < ref->lto_stmt_uid)
> +         fatal_error ("Reference statement index out of range");
> +       ref->stmt = stmts[ref->lto_stmt_uid - 1];
> +       if (!ref->stmt)
> +         fatal_error ("Reference statement index not found");
> +      }
>  }
>
> +
>  /* Fixup call_stmt pointers in NODE and all clones.  */
>
>  static void
>  fixup_call_stmt_edges (struct cgraph_node *orig, gimple *stmts)
>  {
>    struct cgraph_node *node;
> +  struct function *fn;
>
>    while (orig->clone_of)
>      orig = orig->clone_of;
> +  fn = DECL_STRUCT_FUNCTION (orig->symbol.decl);
>
> -  fixup_call_stmt_edges_1 (orig, stmts);
> +  fixup_call_stmt_edges_1 (orig, stmts, fn);
>    if (orig->clones)
>      for (node = orig->clones; node != orig;)
>        {
> -       fixup_call_stmt_edges_1 (node, stmts);
> +       fixup_call_stmt_edges_1 (node, stmts, fn);
>         if (node->clones)
>           node = node->clones;
>         else if (node->next_sibling_clone)
> Index: ipa-prop.c
> ===================================================================
> --- ipa-prop.c  (revision 201568)
> +++ ipa-prop.c  (working copy)
> @@ -2313,12 +2313,6 @@ ipa_make_edge_direct_to_target (struct c
>       the cgraph node too early.  */
>    gcc_assert (!callee->global.inlined_to);
>
> -  cgraph_make_edge_direct (ie, callee);
> -  es = inline_edge_summary (ie);
> -  es->call_stmt_size -= (eni_size_weights.indirect_call_cost
> -                        - eni_size_weights.call_cost);
> -  es->call_stmt_time -= (eni_time_weights.indirect_call_cost
> -                        - eni_time_weights.call_cost);
>    if (dump_file && !unreachable)
>      {
>        fprintf (dump_file, "ipa-prop: Discovered %s call to a known target "
> @@ -2326,14 +2320,19 @@ ipa_make_edge_direct_to_target (struct c
>                ie->indirect_info->polymorphic ? "a virtual" : "an indirect",
>                xstrdup (cgraph_node_name (ie->caller)),
>                ie->caller->symbol.order,
> -              xstrdup (cgraph_node_name (ie->callee)),
> -              ie->callee->symbol.order);
> +              xstrdup (cgraph_node_name (callee)),
> +              callee->symbol.order);
>        if (ie->call_stmt)
>         print_gimple_stmt (dump_file, ie->call_stmt, 2, TDF_SLIM);
>        else
>         fprintf (dump_file, "with uid %i\n", ie->lto_stmt_uid);
> -    }
> -  callee = cgraph_function_or_thunk_node (callee, NULL);
> +     }
> +  ie = cgraph_make_edge_direct (ie, callee);
> +  es = inline_edge_summary (ie);
> +  es->call_stmt_size -= (eni_size_weights.indirect_call_cost
> +                        - eni_size_weights.call_cost);
> +  es->call_stmt_time -= (eni_time_weights.indirect_call_cost
> +                        - eni_time_weights.call_cost);
>
>    return ie;
>  }
> @@ -2374,7 +2373,7 @@ remove_described_reference (symtab_node
>
>    origin = rdesc->cs;
>    to_del = ipa_find_reference ((symtab_node) origin->caller, symbol,
> -                              origin->call_stmt);
> +                              origin->call_stmt, origin->lto_stmt_uid);
>    gcc_assert (to_del);
>    ipa_remove_reference (to_del);
>    if (dump_file)
> @@ -2408,9 +2407,11 @@ try_make_edge_direct_simple_call (struct
>                                   struct ipa_jump_func *jfunc,
>                                   struct ipa_node_params *new_root_info)
>  {
> -  struct ipa_cst_ref_desc *rdesc;
>    struct cgraph_edge *cs;
>    tree target;
> +  bool agg_contents = ie->indirect_info->agg_contents;
> +  bool speculative = ie->speculative;
> +  struct ipa_cst_ref_desc *rdesc;
>
>    if (ie->indirect_info->agg_contents)
>      target = ipa_find_agg_cst_for_param (&jfunc->agg,
> @@ -2422,7 +2423,8 @@ try_make_edge_direct_simple_call (struct
>      return NULL;
>    cs = ipa_make_edge_direct_to_target (ie, target);
>
> -  if (cs && !ie->indirect_info->agg_contents
> +  /* FIXME: speculative edges can be handled.  */
> +  if (cs && !agg_contents && !speculative
>        && jfunc->type == IPA_JF_CONST
>        && (rdesc = jfunc_rdesc_usable (jfunc))
>        && --rdesc->refcount == 0)
> @@ -2521,7 +2523,14 @@ update_indirect_edges_after_inlining (st
>        else
>         new_direct_edge = try_make_edge_direct_simple_call (ie, jfunc,
>                                                             new_root_info);
> -      if (new_direct_edge)
> +      /* If speculation was removed, then we need to do nothing.  */
> +      if (new_direct_edge && new_direct_edge != ie)
> +       {
> +         new_direct_edge->indirect_inlining_edge = 1;
> +         top = IPA_EDGE_REF (cs);
> +         res = true;
> +       }
> +      else if (new_direct_edge)
>         {
>           new_direct_edge->indirect_inlining_edge = 1;
>           if (new_direct_edge->call_stmt)
> @@ -2532,9 +2541,9 @@ update_indirect_edges_after_inlining (st
>           if (new_edges)
>             {
>               new_edges->safe_push (new_direct_edge);
> -             top = IPA_EDGE_REF (cs);
>               res = true;
>             }
> +         top = IPA_EDGE_REF (cs);
>         }
>        else if (jfunc->type == IPA_JF_PASS_THROUGH
>                && ipa_get_jf_pass_through_operation (jfunc) == NOP_EXPR)
> @@ -2645,7 +2654,7 @@ propagate_controlled_uses (struct cgraph
>                   && TREE_CODE (TREE_OPERAND (t, 0)) == FUNCTION_DECL
>                   && (n = cgraph_get_node (TREE_OPERAND (t, 0)))
>                   && (ref = ipa_find_reference ((symtab_node) new_root,
> -                                               (symtab_node) n, NULL)))
> +                                               (symtab_node) n, NULL, 0)))
>                 {
>                   if (dump_file)
>                     fprintf (dump_file, "ipa-prop: Removing cloning-created "
> @@ -2683,7 +2692,7 @@ propagate_controlled_uses (struct cgraph
>                     {
>                       struct ipa_ref *ref;
>                       ref = ipa_find_reference ((symtab_node) clone,
> -                                               (symtab_node) n, NULL);
> +                                               (symtab_node) n, NULL, 0);
>                       if (ref)
>                         {
>                           if (dump_file)
> Index: tree-inline.c
> ===================================================================
> --- tree-inline.c       (revision 201568)
> +++ tree-inline.c       (working copy)
> @@ -1676,6 +1676,8 @@ copy_bb (copy_body_data *id, basic_block
>                   if (edge)
>                     {
>                       int edge_freq = edge->frequency;
> +                     int new_freq;
> +                     struct cgraph_edge *old_edge = edge;
>                       edge = cgraph_clone_edge (edge, id->dst_node, stmt,
>                                                 gimple_uid (stmt),
>                                                 REG_BR_PROB_BASE, CGRAPH_FREQ_BASE,
> @@ -1683,25 +1685,52 @@ copy_bb (copy_body_data *id, basic_block
>                       /* We could also just rescale the frequency, but
>                          doing so would introduce roundoff errors and make
>                          verifier unhappy.  */
> -                     edge->frequency
> -                       = compute_call_stmt_bb_frequency (id->dst_node->symbol.decl,
> -                                                         copy_basic_block);
> -                     if (dump_file
> -                         && profile_status_for_function (cfun) != PROFILE_ABSENT
> -                         && (edge_freq > edge->frequency + 10
> -                             || edge_freq < edge->frequency - 10))
> +                     new_freq  = compute_call_stmt_bb_frequency (id->dst_node->symbol.decl,
> +                                                                 copy_basic_block);
> +
> +                     /* Speculative calls consist of two edges - direct and indirect.
> +                        Duplicate the whole thing and distribute frequencies accordingly.  */
> +                     if (edge->speculative)
>                         {
> -                         fprintf (dump_file, "Edge frequency estimated by "
> -                                  "cgraph %i diverge from inliner's estimate %i\n",
> -                                  edge_freq,
> -                                  edge->frequency);
> -                         fprintf (dump_file,
> -                                  "Orig bb: %i, orig bb freq %i, new bb freq %i\n",
> -                                  bb->index,
> -                                  bb->frequency,
> -                                  copy_basic_block->frequency);
> +                         struct cgraph_edge *direct, *indirect;
> +                         struct ipa_ref *ref;
> +
> +                         gcc_assert (!edge->indirect_unknown_callee);
> +                         cgraph_speculative_call_info (old_edge, direct, indirect, ref);
> +                         indirect = cgraph_clone_edge (indirect, id->dst_node, stmt,
> +                                                       gimple_uid (stmt),
> +                                                       REG_BR_PROB_BASE, CGRAPH_FREQ_BASE,
> +                                                       true);
> +                         if (old_edge->frequency + indirect->frequency)
> +                           {
> +                             edge->frequency = MIN (RDIV ((gcov_type)new_freq * old_edge->frequency,
> +                                                          (old_edge->frequency + indirect->frequency)),
> +                                                    CGRAPH_FREQ_MAX);
> +                             indirect->frequency = MIN (RDIV ((gcov_type)new_freq * indirect->frequency,
> +                                                              (old_edge->frequency + indirect->frequency)),
> +                                                        CGRAPH_FREQ_MAX);
> +                           }
> +                         ipa_clone_ref (ref, (symtab_node)id->dst_node, stmt);
> +                       }
> +                     else
> +                       {
> +                         edge->frequency = new_freq;
> +                         if (dump_file
> +                             && profile_status_for_function (cfun) != PROFILE_ABSENT
> +                             && (edge_freq > edge->frequency + 10
> +                                 || edge_freq < edge->frequency - 10))
> +                           {
> +                             fprintf (dump_file, "Edge frequency estimated by "
> +                                      "cgraph %i diverge from inliner's estimate %i\n",
> +                                      edge_freq,
> +                                      edge->frequency);
> +                             fprintf (dump_file,
> +                                      "Orig bb: %i, orig bb freq %i, new bb freq %i\n",
> +                                      bb->index,
> +                                      bb->frequency,
> +                                      copy_basic_block->frequency);
> +                           }
>                         }
> -                     stmt = cgraph_redirect_edge_call_stmt_to_callee (edge);
>                     }
>                   break;
>
> @@ -2232,6 +2261,23 @@ copy_loops (bitmap blocks_to_copy,
>      }
>  }
>
> +/* Call cgraph_redirect_edge_call_stmt_to_callee on all calls in BB */
> +
> +void
> +redirect_all_calls (copy_body_data * id, basic_block bb)
> +{
> +  gimple_stmt_iterator si;
> +  for (si = gsi_start_bb (bb); !gsi_end_p (si); gsi_next (&si))
> +    {
> +      if (is_gimple_call (gsi_stmt (si)))
> +       {
> +         struct cgraph_edge *edge = cgraph_edge (id->dst_node, gsi_stmt (si));
> +         if (edge)
> +           cgraph_redirect_edge_call_stmt_to_callee (edge);
> +       }
> +    }
> +}
> +
>  /* Make a copy of the body of FN so that it can be inserted inline in
>     another function.  Walks FN via CFG, returns new fndecl.  */
>
> @@ -2356,6 +2402,10 @@ copy_cfg_body (copy_body_data * id, gcov
>             && bb->index != ENTRY_BLOCK
>             && bb->index != EXIT_BLOCK)
>           maybe_move_debug_stmts_to_successors (id, (basic_block) bb->aux);
> +       /* Update call edge destinations.  This can not be done before loop
> +          info is updated, because we may split basic blocks.  */
> +       if (id->transform_call_graph_edges == CB_CGE_DUPLICATE)
> +         redirect_all_calls (id, (basic_block)bb->aux);
>         ((basic_block)bb->aux)->aux = NULL;
>         bb->aux = NULL;
>        }
> @@ -2367,6 +2417,10 @@ copy_cfg_body (copy_body_data * id, gcov
>        if (need_debug_cleanup)
>         maybe_move_debug_stmts_to_successors (id, BASIC_BLOCK (last));
>        BASIC_BLOCK (last)->aux = NULL;
> +      /* Update call edge destinations.  This can not be done before loop
> +        info is updated, because we may split basic blocks.  */
> +      if (id->transform_call_graph_edges == CB_CGE_DUPLICATE)
> +       redirect_all_calls (id, BASIC_BLOCK (last));
>      }
>    entry_block_map->aux = NULL;
>    exit_block_map->aux = NULL;
> @@ -4941,43 +4995,47 @@ delete_unreachable_blocks_update_callgra
>            gimple_stmt_iterator bsi;
>
>            for (bsi = gsi_start_bb (b); !gsi_end_p (bsi); gsi_next (&bsi))
> -           if (gimple_code (gsi_stmt (bsi)) == GIMPLE_CALL)
> -             {
> -               struct cgraph_edge *e;
> -               struct cgraph_node *node;
> +           {
> +             struct cgraph_edge *e;
> +             struct cgraph_node *node;
>
> -               if ((e = cgraph_edge (id->dst_node, gsi_stmt (bsi))) != NULL)
> +             ipa_remove_stmt_references ((symtab_node)id->dst_node, gsi_stmt (bsi));
> +
> +             if (gimple_code (gsi_stmt (bsi)) == GIMPLE_CALL
> +                 &&(e = cgraph_edge (id->dst_node, gsi_stmt (bsi))) != NULL)
> +               {
> +                 if (!e->inline_failed)
> +                   cgraph_remove_node_and_inline_clones (e->callee, id->dst_node);
> +                 else
> +                   cgraph_remove_edge (e);
> +               }
> +             if (id->transform_call_graph_edges == CB_CGE_MOVE_CLONES
> +                 && id->dst_node->clones)
> +               for (node = id->dst_node->clones; node != id->dst_node;)
>                   {
> -                   if (!e->inline_failed)
> -                     cgraph_remove_node_and_inline_clones (e->callee, id->dst_node);
> +                   ipa_remove_stmt_references ((symtab_node)node, gsi_stmt (bsi));
> +                   if (gimple_code (gsi_stmt (bsi)) == GIMPLE_CALL
> +                       && (e = cgraph_edge (node, gsi_stmt (bsi))) != NULL)
> +                     {
> +                       if (!e->inline_failed)
> +                         cgraph_remove_node_and_inline_clones (e->callee, id->dst_node);
> +                       else
> +                         cgraph_remove_edge (e);
> +                     }
> +
> +                   if (node->clones)
> +                     node = node->clones;
> +                   else if (node->next_sibling_clone)
> +                     node = node->next_sibling_clone;
>                     else
> -                     cgraph_remove_edge (e);
> +                     {
> +                       while (node != id->dst_node && !node->next_sibling_clone)
> +                         node = node->clone_of;
> +                       if (node != id->dst_node)
> +                         node = node->next_sibling_clone;
> +                     }
>                   }
> -               if (id->transform_call_graph_edges == CB_CGE_MOVE_CLONES
> -                   && id->dst_node->clones)
> -                 for (node = id->dst_node->clones; node != id->dst_node;)
> -                   {
> -                     if ((e = cgraph_edge (node, gsi_stmt (bsi))) != NULL)
> -                       {
> -                         if (!e->inline_failed)
> -                           cgraph_remove_node_and_inline_clones (e->callee, id->dst_node);
> -                         else
> -                           cgraph_remove_edge (e);
> -                       }
> -
> -                     if (node->clones)
> -                       node = node->clones;
> -                     else if (node->next_sibling_clone)
> -                       node = node->next_sibling_clone;
> -                     else
> -                       {
> -                         while (node != id->dst_node && !node->next_sibling_clone)
> -                           node = node->clone_of;
> -                         if (node != id->dst_node)
> -                           node = node->next_sibling_clone;
> -                       }
> -                   }
> -             }
> +           }
>           delete_basic_block (b);
>           changed = true;
>         }
Jan Hubicka Aug. 9, 2013, 8:24 p.m. UTC | #2
> I have not looked at the details. One high level question: this form
> seems to only support one indirect target case. LIPO uses TOPN
> indirect target profiling (tracking multiple targets), which can be
> used by LTO as well (when the topn profiling gets into trunk).

Well, adding multiple direct edges for given call will need extension into
cgraph_turn_edge_to_speculative and cgraph_speculative_call_info APIs (to allow
multple direct edges) to indirect_info common_target datastructure, to
profiling histograms and to the gimple_ic code.  Otherwise there is nothing
really hard coded about single direct target.

How much benefits do you see from having multiple direct targets? I would
expect them to be quite quickly disappearing as N increases...
How the TOPN profiling counter is implemented?

Honza
Andi Kleen Aug. 9, 2013, 10:12 p.m. UTC | #3
Jan Hubicka <hubicka@ucw.cz> writes:

> Hi,
> this patch adds support for speculative calls into callgraph.  The idea is that
> any IPA optimization that believes it knows likely target of an indirect call
> (currently I use it for cross-module indirect call profiling, but I expect
> Martin J. can easily add support for ipa-cp and I hope to add speculative
> devirtualization in foreseeable future since it should make difference for
> Firefox). Speculative call replaces indirect call

Patch appears to break boot strap on x86_64-linux 
(or maybe your other one)

0x827ab7 crash_signal
        ../../gcc/gcc/toplev.c:335
0x827ab7 crash_signal
        ../../gcc/gcc/toplev.c:335
0x827ab7 crash_signal
        ../../gcc/gcc/toplev.c:335
0x827ab7 crash_signal
        ../../gcc/gcc/toplev.c:335
0x827ab7 crash_signal
        ../../gcc/gcc/toplev.c:335
0x827ab7 crash_signal
        ../../gcc/gcc/toplev.c:335
0x827ab7 crash_signal
        ../../gcc/gcc/toplev.c:335
0x5d0c6c cgraph_speculative_call_info(cgraph_edge*, cgraph_edge*&,
        cgraph_edge*&, ipa_ref*&)
        ../../gcc/gcc/cgraph.c:1098
0x5d1270 cgraph_set_call_stmt(cgraph_edge*, gimple_statement_d*, bool)
        ../../gcc/gcc/cgraph.c:774
0x5d0c6c cgraph_speculative_call_info(cgraph_edge*, cgraph_edge*&,
        cgraph_edge*&, ipa_ref*&)
        ../../gcc/gcc/cgraph.c:1098
0x5d1270 cgraph_set_call_stmt(cgraph_edge*, gimple_statement_d*, bool)
        ../../gcc/gcc/cgraph.c:774
0x5d0c6c cgraph_speculative_call_info(cgraph_edge*, cgraph_edge*&,
        cgraph_edge*&, ipa_ref*&)
        ../../gcc/gcc/cgraph.c:1098
...


-Andi
Xinliang David Li Aug. 9, 2013, 10:38 p.m. UTC | #4
On Fri, Aug 9, 2013 at 1:24 PM, Jan Hubicka <hubicka@ucw.cz> wrote:
>> I have not looked at the details. One high level question: this form
>> seems to only support one indirect target case. LIPO uses TOPN
>> indirect target profiling (tracking multiple targets), which can be
>> used by LTO as well (when the topn profiling gets into trunk).
>
> Well, adding multiple direct edges for given call will need extension into
> cgraph_turn_edge_to_speculative and cgraph_speculative_call_info APIs (to allow
> multple direct edges) to indirect_info common_target datastructure, to
> profiling histograms and to the gimple_ic code.  Otherwise there is nothing
> really hard coded about single direct target.
>
> How much benefits do you see from having multiple direct targets?

It can be large depending on applications. For some apps, there are
very callsites with 2 or 3 very hot targets.

> I would
> expect them to be quite quickly disappearing as N increases...

For the large apps I see, N rarely exceeds 4. For most of the cases,
the hot targets are 2 or 3 with 2 being very common.  In LIPO, we
track 4 targets, but only use top 2.

> How the TOPN profiling counter is implemented?

Currently FDO's value profiler is one value based -- it is quite
faulty because it can not handle cases where hot targets alternates.
For instance for sequence 1, 2, 1, 2, 1, 2, 3,3, 1 -- the winner will
be 3 instead of 1.

TOPN algorithm is a LFU based --- when match is not found, the least
frequent used entry will be evicted. To avoid ping-pong effect, total
eviction count is also tracked -- when it exceeds a threshold, the
bottom counter will be clear to make room for hot entries (instead of
letting them killing each other).

thanks,

David


>
> Honza
Jan Hubicka Aug. 9, 2013, 11:15 p.m. UTC | #5
> Jan Hubicka <hubicka@ucw.cz> writes:
> 
> > Hi,
> > this patch adds support for speculative calls into callgraph.  The idea is that
> > any IPA optimization that believes it knows likely target of an indirect call
> > (currently I use it for cross-module indirect call profiling, but I expect
> > Martin J. can easily add support for ipa-cp and I hope to add speculative
> > devirtualization in foreseeable future since it should make difference for
> > Firefox). Speculative call replaces indirect call
> 
> Patch appears to break boot strap on x86_64-linux 
> (or maybe your other one)

How do you bootstrap? This should not be used w/o lto+profiledbootstrap and
that seems to work for me...

I will re-try with clean tree now.

Honza
> 
> 0x827ab7 crash_signal
>         ../../gcc/gcc/toplev.c:335
> 0x827ab7 crash_signal
>         ../../gcc/gcc/toplev.c:335
> 0x827ab7 crash_signal
>         ../../gcc/gcc/toplev.c:335
> 0x827ab7 crash_signal
>         ../../gcc/gcc/toplev.c:335
> 0x827ab7 crash_signal
>         ../../gcc/gcc/toplev.c:335
> 0x827ab7 crash_signal
>         ../../gcc/gcc/toplev.c:335
> 0x827ab7 crash_signal
>         ../../gcc/gcc/toplev.c:335
> 0x5d0c6c cgraph_speculative_call_info(cgraph_edge*, cgraph_edge*&,
>         cgraph_edge*&, ipa_ref*&)
>         ../../gcc/gcc/cgraph.c:1098
> 0x5d1270 cgraph_set_call_stmt(cgraph_edge*, gimple_statement_d*, bool)
>         ../../gcc/gcc/cgraph.c:774
> 0x5d0c6c cgraph_speculative_call_info(cgraph_edge*, cgraph_edge*&,
>         cgraph_edge*&, ipa_ref*&)
>         ../../gcc/gcc/cgraph.c:1098
> 0x5d1270 cgraph_set_call_stmt(cgraph_edge*, gimple_statement_d*, bool)
>         ../../gcc/gcc/cgraph.c:774
> 0x5d0c6c cgraph_speculative_call_info(cgraph_edge*, cgraph_edge*&,
>         cgraph_edge*&, ipa_ref*&)
>         ../../gcc/gcc/cgraph.c:1098
> ...
> 
> 
> -Andi
> 
> -- 
> ak@linux.intel.com -- Speaking for myself only
Jan Hubicka Aug. 9, 2013, 11:21 p.m. UTC | #6
> On Fri, Aug 9, 2013 at 1:24 PM, Jan Hubicka <hubicka@ucw.cz> wrote:
> >> I have not looked at the details. One high level question: this form
> >> seems to only support one indirect target case. LIPO uses TOPN
> >> indirect target profiling (tracking multiple targets), which can be
> >> used by LTO as well (when the topn profiling gets into trunk).
> >
> > Well, adding multiple direct edges for given call will need extension into
> > cgraph_turn_edge_to_speculative and cgraph_speculative_call_info APIs (to allow
> > multple direct edges) to indirect_info common_target datastructure, to
> > profiling histograms and to the gimple_ic code.  Otherwise there is nothing
> > really hard coded about single direct target.
> >
> > How much benefits do you see from having multiple direct targets?
> 
> It can be large depending on applications. For some apps, there are
> very callsites with 2 or 3 very hot targets.
> 
> > I would
> > expect them to be quite quickly disappearing as N increases...
> 
> For the large apps I see, N rarely exceeds 4. For most of the cases,
> the hot targets are 2 or 3 with 2 being very common.  In LIPO, we
> track 4 targets, but only use top 2.
> 
> > How the TOPN profiling counter is implemented?
> 
> Currently FDO's value profiler is one value based -- it is quite
> faulty because it can not handle cases where hot targets alternates.
> For instance for sequence 1, 2, 1, 2, 1, 2, 3,3, 1 -- the winner will
> be 3 instead of 1.

Or rather no-one, since 3 won't have high enough frequency.  The
counter was really intended to look for most common destination by
a huge margin.
(like for GCC, most devirtualizations actually have probability of 1,
so these are more missed cases of static analysis)
> 
> TOPN algorithm is a LFU based --- when match is not found, the least
> frequent used entry will be evicted. To avoid ping-pong effect, total
> eviction count is also tracked -- when it exceeds a threshold, the
> bottom counter will be clear to make room for hot entries (instead of
> letting them killing each other).

Ok, this seems to make sense.  Lets merge the profiling/transformation
infrastructure from LIPO and I will add support for multiple speculation
into callgraph.  I was trying to keep in mind that devirtualization may
also want to eventually do more than one target, but I tought it is
better to keep things easy at beggining.

Thanks,
Honza
Xinliang David Li Aug. 9, 2013, 11:27 p.m. UTC | #7
On Fri, Aug 9, 2013 at 4:21 PM, Jan Hubicka <hubicka@ucw.cz> wrote:
>> On Fri, Aug 9, 2013 at 1:24 PM, Jan Hubicka <hubicka@ucw.cz> wrote:
>> >> I have not looked at the details. One high level question: this form
>> >> seems to only support one indirect target case. LIPO uses TOPN
>> >> indirect target profiling (tracking multiple targets), which can be
>> >> used by LTO as well (when the topn profiling gets into trunk).
>> >
>> > Well, adding multiple direct edges for given call will need extension into
>> > cgraph_turn_edge_to_speculative and cgraph_speculative_call_info APIs (to allow
>> > multple direct edges) to indirect_info common_target datastructure, to
>> > profiling histograms and to the gimple_ic code.  Otherwise there is nothing
>> > really hard coded about single direct target.
>> >
>> > How much benefits do you see from having multiple direct targets?
>>
>> It can be large depending on applications. For some apps, there are
>> very callsites with 2 or 3 very hot targets.
>>
>> > I would
>> > expect them to be quite quickly disappearing as N increases...
>>
>> For the large apps I see, N rarely exceeds 4. For most of the cases,
>> the hot targets are 2 or 3 with 2 being very common.  In LIPO, we
>> track 4 targets, but only use top 2.
>>
>> > How the TOPN profiling counter is implemented?
>>
>> Currently FDO's value profiler is one value based -- it is quite
>> faulty because it can not handle cases where hot targets alternates.
>> For instance for sequence 1, 2, 1, 2, 1, 2, 3,3, 1 -- the winner will
>> be 3 instead of 1.
>
> Or rather no-one, since 3 won't have high enough frequency.  The
> counter was really intended to look for most common destination by
> a huge margin.
> (like for GCC, most devirtualizations actually have probability of 1,
> so these are more missed cases of static analysis)
>>
>> TOPN algorithm is a LFU based --- when match is not found, the least
>> frequent used entry will be evicted. To avoid ping-pong effect, total
>> eviction count is also tracked -- when it exceeds a threshold, the
>> bottom counter will be clear to make room for hot entries (instead of
>> letting them killing each other).
>
> Ok, this seems to make sense.  Lets merge the profiling/transformation
> infrastructure from LIPO and I will add support for multiple speculation
> into callgraph.  I was trying to keep in mind that devirtualization may
> also want to eventually do more than one target, but I tought it is
> better to keep things easy at beggining.

Ok. Rong, can you help rip the TOPN profiler part from google branch
and submit to trunk? TopN profiler can be used in other scenarios
other than indirect call profiling.

thanks,

David


>
> Thanks,
> Honza
Andi Kleen Aug. 10, 2013, 12:25 a.m. UTC | #8
On Sat, Aug 10, 2013 at 01:15:11AM +0200, Jan Hubicka wrote:
> > Jan Hubicka <hubicka@ucw.cz> writes:
> > 
> > > Hi,
> > > this patch adds support for speculative calls into callgraph.  The idea is that
> > > any IPA optimization that believes it knows likely target of an indirect call
> > > (currently I use it for cross-module indirect call profiling, but I expect
> > > Martin J. can easily add support for ipa-cp and I hope to add speculative
> > > devirtualization in foreseeable future since it should make difference for
> > > Firefox). Speculative call replaces indirect call
> > 
> > Patch appears to break boot strap on x86_64-linux 
> > (or maybe your other one)
> 
> How do you bootstrap? This should not be used w/o lto+profiledbootstrap and
> that seems to work for me...

Neither LTO nor profiled

../gcc/configure --prefix=/pkg/gcc-$V-$D --enable-lto \
--with-plugin-ld=/usr/local/bin/ld-plugin  \
--disable-nls --enable-languages=$FRONTEND > /dev/null \
--enable-checking=release --disable-libstdcxx-pch  \
--disable-fixincl 

-Andi
Andi Kleen Aug. 10, 2013, 4:19 a.m. UTC | #9
On Sat, Aug 10, 2013 at 02:25:21AM +0200, Andi Kleen wrote:
> On Sat, Aug 10, 2013 at 01:15:11AM +0200, Jan Hubicka wrote:
> > > Jan Hubicka <hubicka@ucw.cz> writes:
> > > 
> > > > Hi,
> > > > this patch adds support for speculative calls into callgraph.  The idea is that
> > > > any IPA optimization that believes it knows likely target of an indirect call
> > > > (currently I use it for cross-module indirect call profiling, but I expect
> > > > Martin J. can easily add support for ipa-cp and I hope to add speculative
> > > > devirtualization in foreseeable future since it should make difference for
> > > > Firefox). Speculative call replaces indirect call
> > > 
> > > Patch appears to break boot strap on x86_64-linux 
> > > (or maybe your other one)
> > 
> > How do you bootstrap? This should not be used w/o lto+profiledbootstrap and
> > that seems to work for me...
> 
> Neither LTO nor profiled

My tree bootstraps again when I revert

 * cgraphbuild.c (cgraph_rebuild_references): Rebuild only
 * non-speculative
   refs.
 * cgraph.c (cgraph_update_edge_in_call_site_hash): New
		 * function.

svn+ssh://gcc.gnu.org/svn/gcc/trunk@201632

(and also the later patch)

-Andi
Jan Hubicka Aug. 10, 2013, 9:37 a.m. UTC | #10
> On Sat, Aug 10, 2013 at 02:25:21AM +0200, Andi Kleen wrote:
> > On Sat, Aug 10, 2013 at 01:15:11AM +0200, Jan Hubicka wrote:
> > > > Jan Hubicka <hubicka@ucw.cz> writes:
> > > > 
> > > > > Hi,
> > > > > this patch adds support for speculative calls into callgraph.  The idea is that
> > > > > any IPA optimization that believes it knows likely target of an indirect call
> > > > > (currently I use it for cross-module indirect call profiling, but I expect
> > > > > Martin J. can easily add support for ipa-cp and I hope to add speculative
> > > > > devirtualization in foreseeable future since it should make difference for
> > > > > Firefox). Speculative call replaces indirect call
> > > > 
> > > > Patch appears to break boot strap on x86_64-linux 
> > > > (or maybe your other one)
> > > 
> > > How do you bootstrap? This should not be used w/o lto+profiledbootstrap and
> > > that seems to work for me...
> > 
> > Neither LTO nor profiled
> 
> My tree bootstraps again when I revert
> 
>  * cgraphbuild.c (cgraph_rebuild_references): Rebuild only
>  * non-speculative
>    refs.
>  * cgraph.c (cgraph_update_edge_in_call_site_hash): New
> 		 * function.
> 
> svn+ssh://gcc.gnu.org/svn/gcc/trunk@201632
> 
> (and also the later patch)

This is really strange. The speculative edges should not be created here at all.
So perhaps some uninitialized memory access crept in :(
It would help if you try to track how the ->speculative bit appears for you
in the loop in cgraph_rebuild_references.  I will try to reproduce your setup now.

Honza
> 
> -Andi
Jan Hubicka Aug. 10, 2013, 9:39 a.m. UTC | #11
Hi,
also I just became aware that this patch hits the following bug of gold
http://sourceware.org/bugzilla/show_bug.cgi?id=14342
It makes gcc.dg/tree-prof/crossmodule-indircall-1.c fail with the bogus
diagnostic on TLS and non-TLS use of the same symbol.  I attached the
details into the PR.  GNU-LD works.

I am not sure if I can workaround this problem except for convincing users to
always use -fno-lto with -fprofile-generate.  Shall we fix it in mainline
binutils + document that earlier binutils are broken with -fporfile-generate
-flto?

Honza
diff mbox

Patch

Index: cgraphbuild.c
===================================================================
--- cgraphbuild.c	(revision 201601)
+++ cgraphbuild.c	(working copy)
@@ -483,8 +483,15 @@  cgraph_rebuild_references (void)
   basic_block bb;
   struct cgraph_node *node = cgraph_get_node (current_function_decl);
   gimple_stmt_iterator gsi;
+  struct ipa_ref *ref;
+  int i;
 
-  ipa_remove_all_references (&node->symbol.ref_list);
+  /* Keep speculative references for further cgraph edge expansion.  */
+  for (i = 0; ipa_ref_list_reference_iterate (&node->symbol.ref_list, i, ref);)
+    if (!ref->speculative)
+      ipa_remove_reference (ref);
+    else
+      i++;
 
   node->count = ENTRY_BLOCK_PTR->count;
 
Index: cgraph.c
===================================================================
--- cgraph.c	(revision 201601)
+++ cgraph.c	(working copy)
@@ -674,14 +674,36 @@  edge_eq (const void *x, const void *y)
 /* Add call graph edge E to call site hash of its caller.  */
 
 static inline void
+cgraph_update_edge_in_call_site_hash (struct cgraph_edge *e)
+{
+  void **slot;
+  slot = htab_find_slot_with_hash (e->caller->call_site_hash,
+				   e->call_stmt,
+				   htab_hash_pointer (e->call_stmt),
+				   INSERT);
+  *slot = e;
+}
+
+/* Add call graph edge E to call site hash of its caller.  */
+
+static inline void
 cgraph_add_edge_to_call_site_hash (struct cgraph_edge *e)
 {
   void **slot;
+  /* There are two speculative edges for every statement (one direct,
+     one indirect); always hash the direct one.  */
+  if (e->speculative && e->indirect_unknown_callee)
+    return;
   slot = htab_find_slot_with_hash (e->caller->call_site_hash,
 				   e->call_stmt,
 				   htab_hash_pointer (e->call_stmt),
 				   INSERT);
-  gcc_assert (!*slot);
+  if (*slot)
+    {
+      gcc_assert (((struct cgraph_edge *)*slot)->speculative);
+      return;
+    }
+  gcc_assert (!*slot || e->speculative);
   *slot = e;
 }
 
@@ -732,14 +754,33 @@  cgraph_edge (struct cgraph_node *node, g
 }
 
 
-/* Change field call_stmt of edge E to NEW_STMT.  */
+/* Change field call_stmt of edge E to NEW_STMT.
+   If UPDATE_SPECULATIVE and E is any component of speculative
+   edge, then update all components.  */
 
 void
-cgraph_set_call_stmt (struct cgraph_edge *e, gimple new_stmt)
+cgraph_set_call_stmt (struct cgraph_edge *e, gimple new_stmt,
+		      bool update_speculative)
 {
   tree decl;
 
-  if (e->caller->call_site_hash)
+  /* Speculative edges has three component, update all of them
+     when asked to.  */
+  if (update_speculative && e->speculative)
+    {
+      struct cgraph_edge *direct, *indirect;
+      struct ipa_ref *ref;
+
+      cgraph_speculative_call_info (e, direct, indirect, ref);
+      cgraph_set_call_stmt (direct, new_stmt, false);
+      cgraph_set_call_stmt (indirect, new_stmt, false);
+      ref->stmt = new_stmt;
+      return;
+    }
+
+  /* Only direct speculative edges go to call_site_hash.  */
+  if (e->caller->call_site_hash
+      && (!e->speculative || !e->indirect_unknown_callee))
     {
       htab_remove_elt_with_hash (e->caller->call_site_hash,
 				 e->call_stmt,
@@ -755,7 +796,7 @@  cgraph_set_call_stmt (struct cgraph_edge
       struct cgraph_node *new_callee = cgraph_get_node (decl);
 
       gcc_checking_assert (new_callee);
-      cgraph_make_edge_direct (e, new_callee);
+      e = cgraph_make_edge_direct (e, new_callee);
     }
 
   push_cfun (DECL_STRUCT_FUNCTION (e->caller->symbol.decl));
@@ -781,7 +822,10 @@  cgraph_create_edge_1 (struct cgraph_node
     {
       /* This is a rather expensive check possibly triggering
 	 construction of call stmt hashtable.  */
-      gcc_checking_assert (!cgraph_edge (caller, call_stmt));
+#ifdef ENABLE_CHECKING
+      struct cgraph_edge *e;
+      gcc_checking_assert (!(e=cgraph_edge (caller, call_stmt)) || e->speculative);
+#endif
 
       gcc_assert (is_gimple_call (call_stmt));
     }
@@ -804,6 +848,7 @@  cgraph_create_edge_1 (struct cgraph_node
   edge->next_caller = NULL;
   edge->prev_callee = NULL;
   edge->next_callee = NULL;
+  edge->lto_stmt_uid = 0;
 
   edge->count = count;
   gcc_assert (count >= 0);
@@ -937,6 +982,9 @@  cgraph_free_edge (struct cgraph_edge *e)
 {
   int uid = e->uid;
 
+  if (e->indirect_info)
+    ggc_free (e->indirect_info);
+
   /* Clear out the edge so we do not dangle pointers.  */
   memset (e, 0, sizeof (*e));
   e->uid = uid;
@@ -977,6 +1025,110 @@  cgraph_set_edge_callee (struct cgraph_ed
   e->callee = n;
 }
 
+/* Turn edge E into speculative call calling N2. Update
+   the profile so the direct call is taken COUNT times
+   with FREQUENCY.  
+
+   At clone materialization time, the indirect call E will
+   be expanded as:
+
+   if (call_dest == N2)
+     n2 ();
+   else
+     call call_dest
+
+   At this time the function just creates the direct call,
+   the referencd representing the if conditional and attaches
+   them all to the orginal indirect call statement.  */
+
+void
+cgraph_turn_edge_to_speculative (struct cgraph_edge *e,
+				 struct cgraph_node *n2,
+				 gcov_type direct_count,
+				 int direct_frequency)
+{
+  struct cgraph_node *n = e->caller;
+  struct ipa_ref *ref;
+  struct cgraph_edge *e2;
+
+  if (dump_file)
+    {
+      fprintf (dump_file, "Indirect call -> direct call from"
+	       " other module %s/%i => %s/%i\n",
+	       xstrdup (cgraph_node_name (n)), n->symbol.order,
+	       xstrdup (cgraph_node_name (n2)), n2->symbol.order);
+    }
+  e->speculative = true;
+  e2 = cgraph_create_edge (n, n2, e->call_stmt, direct_count, direct_frequency);
+  initialize_inline_failed (e2);
+  e2->speculative = true;
+  e2->call_stmt = e->call_stmt;
+  e2->can_throw_external = e->can_throw_external;
+  e2->lto_stmt_uid = e->lto_stmt_uid;
+  e->count -= e2->count;
+  e->frequency -= e2->frequency;
+  cgraph_call_edge_duplication_hooks (e, e2);
+  ref = ipa_record_reference ((symtab_node)n, (symtab_node)n2,
+			      IPA_REF_ADDR, e->call_stmt);
+  ref->lto_stmt_uid = e->lto_stmt_uid;
+  ref->speculative = e->speculative;
+}
+
+/* Speculative call consist of three components:
+   1) an indirect edge representing the original call
+   2) an direct edge representing the new call
+   3) ADDR_EXPR reference representing the speculative check.
+   All three components are attached to single statement (the indirect
+   call) and if one of them exists, all of them must exist.
+
+   Given speculative call edge E, return all three components. 
+ */
+
+void
+cgraph_speculative_call_info (struct cgraph_edge *e,
+			      struct cgraph_edge *&indirect,
+			      struct cgraph_edge *&direct,
+			      struct ipa_ref *&reference)
+{
+  struct ipa_ref *ref;
+  int i;
+  struct cgraph_edge *e2;
+
+  if (!e->indirect_unknown_callee)
+    for (e2 = e->caller->indirect_calls;
+	 e2->call_stmt != e->call_stmt || e2->lto_stmt_uid != e->lto_stmt_uid;
+	 e2 = e2->next_callee)
+      ;
+  else
+    {
+      e2 = e;
+      /* We can take advantage of the call stmt hash.  */
+      if (e2->call_stmt)
+	{
+	  e = cgraph_edge (e->caller, e2->call_stmt);
+	  gcc_assert (!e->speculative && !e->indirect_unknown_callee);
+	}
+      else
+	for (e = e->caller->callees; 
+	     e2->call_stmt != e->call_stmt || e2->lto_stmt_uid != e->lto_stmt_uid;
+	     e = e->next_callee)
+	  ;
+    }
+  gcc_assert (e->speculative && e2->speculative);
+  indirect = e;
+  direct = e2;
+
+  reference = NULL;
+  for (i = 0; ipa_ref_list_reference_iterate (&e->caller->symbol.ref_list, i, ref); i++)
+    if (ref->speculative
+	&& ((ref->stmt && ref->stmt == e->call_stmt)
+	    || (ref->lto_stmt_uid == e->lto_stmt_uid)))
+      {
+	reference = ref;
+	break;
+      }
+}
+
 /* Redirect callee of E to N.  The function does not update underlying
    call expression.  */
 
@@ -990,14 +1142,74 @@  cgraph_redirect_edge_callee (struct cgra
   cgraph_set_edge_callee (e, n);
 }
 
+/* Speculative call EDGE turned out to be direct call to CALLE_DECL.
+   Remove the speculative call sequence and return edge representing the call.
+   It is up to caller to redirect the call as appropriate. */
+
+static struct cgraph_edge *
+cgraph_resolve_speculation (struct cgraph_edge *edge, tree callee_decl)
+{
+  struct cgraph_edge *e2;
+  struct ipa_ref *ref;
+
+  gcc_assert (edge->speculative);
+  cgraph_speculative_call_info (edge, e2, edge, ref);
+  if (ref->referred->symbol.decl != callee_decl)
+    {
+      if (dump_file)
+	{
+	  fprintf (dump_file, "Speculative indirect call %s/%i => %s/%i has "
+		   "turned out to have contradicitng known target ",
+		   xstrdup (cgraph_node_name (edge->caller)), edge->caller->symbol.order,
+		   xstrdup (cgraph_node_name (e2->callee)), e2->callee->symbol.order);
+	  print_generic_expr (dump_file, callee_decl, 0);
+          fprintf (dump_file, "\n");
+	}
+    }
+  else
+    {
+      struct cgraph_edge *tmp = edge;
+      if (dump_file)
+        fprintf (dump_file, "Speculative call turned into direct call.\n");
+      edge = e2;
+      e2 = tmp;
+    }
+  edge->count += e2->count;
+  edge->frequency += e2->frequency;
+  edge->speculative = false;
+  e2->speculative = false;
+  if (e2->indirect_unknown_callee || e2->inline_failed)
+    cgraph_remove_edge (e2);
+  else
+    cgraph_remove_node_and_inline_clones (e2->callee, NULL);
+  if (edge->caller->call_site_hash)
+    cgraph_update_edge_in_call_site_hash (edge);
+  ipa_remove_reference (ref);
+  return edge;
+}
+
 /* Make an indirect EDGE with an unknown callee an ordinary edge leading to
    CALLEE.  DELTA is an integer constant that is to be added to the this
    pointer (first parameter) to compensate for skipping a thunk adjustment.  */
 
-void
+struct cgraph_edge *
 cgraph_make_edge_direct (struct cgraph_edge *edge, struct cgraph_node *callee)
 {
+  gcc_assert (edge->indirect_unknown_callee);
+
+  /* If we are redirecting speculative call, make it non-speculative.  */
+  if (edge->indirect_unknown_callee && edge->speculative)
+    {
+      edge = cgraph_resolve_speculation (edge, callee->symbol.decl);
+
+      /* On successful speculation just return the pre existing direct edge.  */
+      if (!edge->indirect_unknown_callee)
+        return edge;
+    }
+
   edge->indirect_unknown_callee = 0;
+  ggc_free (edge->indirect_info);
+  edge->indirect_info = NULL;
 
   /* Get the edge out of the indirect edge list. */
   if (edge->prev_callee)
@@ -1024,6 +1236,7 @@  cgraph_make_edge_direct (struct cgraph_e
 
   /* We need to re-determine the inlining status of the edge.  */
   initialize_inline_failed (edge);
+  return edge;
 }
 
 /* If necessary, change the function declaration in the call statement
@@ -1039,6 +1252,50 @@  cgraph_redirect_edge_call_stmt_to_callee
   struct cgraph_node *node;
 #endif
 
+  if (e->speculative)
+    {
+      struct cgraph_edge *e2;
+      gimple new_stmt;
+      struct ipa_ref *ref;
+
+      cgraph_speculative_call_info (e, e, e2, ref);
+      if (gimple_call_fndecl (e->call_stmt))
+	e = cgraph_resolve_speculation (e, gimple_call_fndecl (e->call_stmt));
+      else
+	{
+	  if (dump_file)
+	    fprintf (dump_file, "Expanding speculative call of %s/%i -> %s/%i\n",
+		     xstrdup (cgraph_node_name (e->caller)), e->caller->symbol.order,
+		     xstrdup (cgraph_node_name (e->callee)), e->callee->symbol.order);
+	  gcc_assert (e2->speculative);
+	  push_cfun (DECL_STRUCT_FUNCTION (e->caller->symbol.decl));
+	  new_stmt = gimple_ic (e->call_stmt, cgraph (ref->referred),
+				e->count || e2->count
+				?  RDIV (e->count * REG_BR_PROB_BASE,
+					 e->count + e2->count)
+				: e->frequency || e2->frequency
+				? RDIV (e->frequency * REG_BR_PROB_BASE,
+					e->frequency + e2->frequency)
+				: REG_BR_PROB_BASE / 2,
+				e->count, e->count + e2->count);
+	  e->speculative = false;
+	  cgraph_set_call_stmt_including_clones (e->caller, e->call_stmt, new_stmt, false);
+	  e->frequency = compute_call_stmt_bb_frequency (e->caller->symbol.decl,
+							 gimple_bb (e->call_stmt));
+	  e2->frequency = compute_call_stmt_bb_frequency (e2->caller->symbol.decl,
+							  gimple_bb (e2->call_stmt));
+	  e2->speculative = false;
+	  ref->speculative = false;
+	  ref->stmt = NULL;
+	  /* Indirect edges are not both in the call site hash.
+	     get it updated.  */
+	  if (e->caller->call_site_hash)
+	    cgraph_update_edge_in_call_site_hash (e2);
+	  pop_cfun ();
+	  /* Continue redirecting E to proper target.  */
+	}
+    }
+
   if (e->indirect_unknown_callee
       || decl == e->callee->symbol.decl)
     return e->call_stmt;
@@ -1099,7 +1356,7 @@  cgraph_redirect_edge_call_stmt_to_callee
       update_stmt (new_stmt);
     }
 
-  cgraph_set_call_stmt_including_clones (e->caller, e->call_stmt, new_stmt);
+  cgraph_set_call_stmt_including_clones (e->caller, e->call_stmt, new_stmt, false);
 
   if (cgraph_dump_file)
     {
@@ -1603,6 +1860,8 @@  dump_cgraph_node (FILE *f, struct cgraph
       if (edge->frequency)
 	fprintf (f, "(%.2f per call) ",
 		 edge->frequency / (double)CGRAPH_FREQ_BASE);
+      if (edge->speculative)
+	fprintf(f, "(speculative) ");
       if (!edge->inline_failed)
 	fprintf(f, "(inlined) ");
       if (edge->indirect_inlining_edge)
@@ -1616,6 +1875,8 @@  dump_cgraph_node (FILE *f, struct cgraph
     {
       fprintf (f, "%s/%i ", cgraph_node_asm_name (edge->callee),
 	       edge->callee->symbol.order);
+      if (edge->speculative)
+	fprintf(f, "(speculative) ");
       if (!edge->inline_failed)
 	fprintf(f, "(inlined) ");
       if (edge->indirect_inlining_edge)
@@ -2262,6 +2523,7 @@  verify_edge_count_and_frequency (struct
     }
   if (gimple_has_body_p (e->caller->symbol.decl)
       && !e->caller->global.inlined_to
+      && !e->speculative
       /* FIXME: Inline-analysis sets frequency to 0 when edge is optimized out.
 	 Remove this once edges are actually removed from the function at that time.  */
       && (e->frequency
@@ -2316,7 +2578,7 @@  verify_edge_corresponds_to_fndecl (struc
 
   /* We do not know if a node from a different partition is an alias or what it
      aliases and therefore cannot do the former_clone_of check reliably.  */
-  if (!node || node->symbol.in_other_partition)
+  if (!node || node->symbol.in_other_partition || e->callee->symbol.in_other_partition)
     return false;
   node = cgraph_function_or_thunk_node (node, NULL);
 
@@ -2625,7 +2887,7 @@  verify_cgraph_node (struct cgraph_node *
 	}
       for (e = node->indirect_calls; e; e = e->next_callee)
 	{
-	  if (!e->aux)
+	  if (!e->aux && !e->speculative)
 	    {
 	      error ("an indirect edge from %s has no corresponding call_stmt",
 		     identifier_to_locale (cgraph_node_name (e->caller)));
Index: cgraph.h
===================================================================
--- cgraph.h	(revision 201568)
+++ cgraph.h	(working copy)
@@ -483,6 +483,24 @@  struct GTY((chain_next ("%h.next_caller"
   unsigned int call_stmt_cannot_inline_p : 1;
   /* Can this call throw externally?  */
   unsigned int can_throw_external : 1;
+  /* Edges with SPECULATIVE flag represents indirect calls that was
+     speculatively turned into direct (i.e. by profile feedback).
+     The final code sequence will have form:
+
+     if (call_target == expected_fn)
+       expected_fn ();
+     else
+       call_target ();
+
+     Every speculative call is represented by three components attached
+     to a same call statement:
+     1) a direct call (to expected_fn)
+     2) an indirect call (to call_target)
+     3) a IPA_REF_ADDR refrence to expected_fn.
+
+     Optimizers may later redirect direct call to clone, so 1) and 3)
+     do not need to necesarily agree with destination.  */
+  unsigned int speculative : 1;
 };
 
 #define CGRAPH_FREQ_BASE 1000
@@ -629,7 +647,7 @@  struct cgraph_node * cgraph_add_thunk (s
 				       HOST_WIDE_INT, tree, tree);
 struct cgraph_node *cgraph_node_for_asm (tree);
 struct cgraph_edge *cgraph_edge (struct cgraph_node *, gimple);
-void cgraph_set_call_stmt (struct cgraph_edge *, gimple);
+void cgraph_set_call_stmt (struct cgraph_edge *, gimple, bool update_speculative = true);
 void cgraph_update_edges_for_call_stmt (gimple, tree, gimple);
 struct cgraph_local_info *cgraph_local_info (tree);
 struct cgraph_global_info *cgraph_global_info (tree);
@@ -641,7 +659,7 @@  void cgraph_call_edge_duplication_hooks
 				         struct cgraph_edge *);
 
 void cgraph_redirect_edge_callee (struct cgraph_edge *, struct cgraph_node *);
-void cgraph_make_edge_direct (struct cgraph_edge *, struct cgraph_node *);
+struct cgraph_edge *cgraph_make_edge_direct (struct cgraph_edge *, struct cgraph_node *);
 bool cgraph_only_called_directly_p (struct cgraph_node *);
 
 bool cgraph_function_possibly_inlined_p (tree);
@@ -702,6 +720,14 @@  bool cgraph_propagate_frequency (struct
 struct cgraph_node * cgraph_function_node (struct cgraph_node *,
 					   enum availability *avail = NULL);
 bool cgraph_get_body (struct cgraph_node *node);
+void
+cgraph_turn_edge_to_speculative (struct cgraph_edge *,
+				 struct cgraph_node *,
+				 gcov_type, int);
+void cgraph_speculative_call_info (struct cgraph_edge *,
+				   struct cgraph_edge *&,
+				   struct cgraph_edge *&,
+				   struct ipa_ref *&);
 
 /* In cgraphunit.c  */
 struct asm_node *add_asm_node (tree);
@@ -735,7 +761,8 @@  struct cgraph_node * cgraph_create_virtu
 						  const char *clone_name);
 struct cgraph_node *cgraph_find_replacement_node (struct cgraph_node *);
 bool cgraph_remove_node_and_inline_clones (struct cgraph_node *, struct cgraph_node *);
-void cgraph_set_call_stmt_including_clones (struct cgraph_node *, gimple, gimple);
+void cgraph_set_call_stmt_including_clones (struct cgraph_node *, gimple, gimple,
+					    bool update_speculative = true);
 void cgraph_create_edge_including_clones (struct cgraph_node *,
 					  struct cgraph_node *,
 					  gimple, gimple, gcov_type, int,
Index: value-prof.c
===================================================================
--- value-prof.c	(revision 201568)
+++ value-prof.c	(working copy)
@@ -1248,7 +1248,7 @@  check_ic_target (gimple call_stmt, struc
     old call
  */
 
-static gimple
+gimple
 gimple_ic (gimple icall_stmt, struct cgraph_node *direct_call,
 	   int prob, gcov_type count, gcov_type all)
 {
Index: value-prof.h
===================================================================
--- value-prof.h	(revision 201568)
+++ value-prof.h	(working copy)
@@ -86,6 +86,8 @@  void gimple_move_stmt_histograms (struct
 void verify_histograms (void);
 void free_histograms (void);
 void stringop_block_profile (gimple, unsigned int *, HOST_WIDE_INT *);
+gimple gimple_ic (gimple, struct cgraph_node *, int, gcov_type, gcov_type);
+
 
 /* In tree-profile.c.  */
 extern void gimple_init_edge_profiler (void);
Index: ipa-cp.c
===================================================================
--- ipa-cp.c	(revision 201568)
+++ ipa-cp.c	(working copy)
@@ -2278,14 +2278,15 @@  ipcp_discover_new_direct_edges (struct c
 					       aggvals);
       if (target)
 	{
+	  bool agg_contents = ie->indirect_info->agg_contents;
+	  bool polymorphic = ie->indirect_info->polymorphic;
+	  bool param_index = ie->indirect_info->param_index;
 	  struct cgraph_edge *cs = ipa_make_edge_direct_to_target (ie, target);
 	  found = true;
 
-	  if (cs && !ie->indirect_info->agg_contents
-	      && !ie->indirect_info->polymorphic)
+	  if (cs && !agg_contents && !polymorphic)
 	    {
 	      struct ipa_node_params *info = IPA_NODE_REF (node);
-	      int param_index = ie->indirect_info->param_index;
 	      int c = ipa_get_controlled_uses (info, param_index);
 	      if (c != IPA_UNDESCRIBED_USE)
 		{
@@ -2299,7 +2300,7 @@  ipcp_discover_new_direct_edges (struct c
 		  if (c == 0
 		      && (to_del = ipa_find_reference ((symtab_node) node,
 						       (symtab_node) cs->callee,
-						       NULL)))
+						       NULL, 0)))
 		    {
 		      if (dump_file && (dump_flags & TDF_DETAILS))
 			fprintf (dump_file, "       and even removing its "
Index: ipa-ref.c
===================================================================
--- ipa-ref.c	(revision 201601)
+++ ipa-ref.c	(working copy)
@@ -38,7 +38,7 @@  ipa_record_reference (symtab_node referr
 		      symtab_node referred_node,
 		      enum ipa_ref_use use_type, gimple stmt)
 {
-  struct ipa_ref *ref;
+  struct ipa_ref *ref, *ref2;
   struct ipa_ref_list *list, *list2;
   ipa_ref_t *old_references;
 
@@ -56,14 +56,16 @@  ipa_record_reference (symtab_node referr
   ref->referring = referring_node;
   ref->referred = referred_node;
   ref->stmt = stmt;
+  ref->lto_stmt_uid = 0;
   ref->use = use_type;
+  ref->speculative = 0;
 
   /* If vector was moved in memory, update pointers.  */
   if (old_references != list->references->address ())
     {
       int i;
-      for (i = 0; ipa_ref_list_reference_iterate (list, i, ref); i++)
-	ipa_ref_referred_ref_list (ref)->referring[ref->referred_index] = ref;
+      for (i = 0; ipa_ref_list_reference_iterate (list, i, ref2); i++)
+	ipa_ref_referred_ref_list (ref2)->referring[ref2->referred_index] = ref2;
     }
   return ref;
 }
@@ -155,6 +157,8 @@  ipa_dump_references (FILE * file, struct
                symtab_node_asm_name (ref->referred),
                ref->referred->symbol.order,
 	       ipa_ref_use_name [ref->use]);
+      if (ref->speculative)
+	fprintf (file, " (speculative)");
     }
   fprintf (file, "\n");
 }
@@ -172,22 +176,50 @@  ipa_dump_referring (FILE * file, struct
                symtab_node_asm_name (ref->referring),
                ref->referring->symbol.order,
 	       ipa_ref_use_name [ref->use]);
+      if (ref->speculative)
+	fprintf (file, " (speculative)");
     }
   fprintf (file, "\n");
 }
 
+/* Clone reference REF to DEST_NODE and set its stmt to STMT.  */
+
+struct ipa_ref *
+ipa_clone_ref (struct ipa_ref *ref,
+	       symtab_node dest_node,
+	       gimple stmt)
+{
+  bool speculative = ref->speculative;
+  unsigned int stmt_uid = ref->lto_stmt_uid;
+  struct ipa_ref *ref2;
+
+  ref2 = ipa_record_reference (dest_node,
+			       ref->referred,
+			       ref->use, stmt);
+  ref2->speculative = speculative;
+  ref2->lto_stmt_uid = stmt_uid;
+  return ref2;
+}
+
 /* Clone all references from SRC to DEST_NODE or DEST_VARPOOL_NODE.  */
 
 void
 ipa_clone_references (symtab_node dest_node,
 		      struct ipa_ref_list *src)
 {
-  struct ipa_ref *ref;
+  struct ipa_ref *ref, *ref2;
   int i;
   for (i = 0; ipa_ref_list_reference_iterate (src, i, ref); i++)
-    ipa_record_reference (dest_node,
-			  ref->referred,
-			  ref->use, ref->stmt);
+    {
+      bool speculative = ref->speculative;
+      unsigned int stmt_uid = ref->lto_stmt_uid;
+
+      ref2 = ipa_record_reference (dest_node,
+				   ref->referred,
+				   ref->use, ref->stmt);
+      ref2->speculative = speculative;
+      ref2->lto_stmt_uid = stmt_uid;
+    }
 }
 
 /* Clone all referring from SRC to DEST_NODE or DEST_VARPOOL_NODE.  */
@@ -196,12 +228,19 @@  void
 ipa_clone_referring (symtab_node dest_node,
 		    struct ipa_ref_list *src)
 {
-  struct ipa_ref *ref;
+  struct ipa_ref *ref, *ref2;
   int i;
   for (i = 0; ipa_ref_list_referring_iterate (src, i, ref); i++)
-    ipa_record_reference (ref->referring,
-			  dest_node,
-			  ref->use, ref->stmt);
+    {
+      bool speculative = ref->speculative;
+      unsigned int stmt_uid = ref->lto_stmt_uid;
+
+      ref2 = ipa_record_reference (ref->referring,
+				   dest_node,
+				   ref->use, ref->stmt);
+      ref2->speculative = speculative;
+      ref2->lto_stmt_uid = stmt_uid;
+    }
 }
 
 /* Return true when execution of REF can lead to return from
@@ -230,14 +269,17 @@  ipa_ref_has_aliases_p (struct ipa_ref_li
 
 struct ipa_ref *
 ipa_find_reference (symtab_node referring_node, symtab_node referred_node,
-		    gimple stmt)
+		    gimple stmt, unsigned int lto_stmt_uid)
 {
   struct ipa_ref *r = NULL;
   int i;
 
   for (i = 0; ipa_ref_list_reference_iterate (&referring_node->symbol.ref_list, i, r); i++)
     if (r->referred == referred_node
-	&& (in_lto_p || r->stmt == stmt))
+	&& !r->speculative
+	&& ((stmt && r->stmt == stmt)
+	    || (lto_stmt_uid && r->lto_stmt_uid == lto_stmt_uid)
+	    || (!stmt && !lto_stmt_uid && !r->stmt && !r->lto_stmt_uid)))
       return r;
   return NULL;
 }
@@ -257,7 +299,9 @@  ipa_remove_stmt_references (symtab_node
 }
 
 /* Remove all stmt references in non-speculative references.
-   Those are not maintained during inlining & clonning. */
+   Those are not maintained during inlining & clonning. 
+   The exception are speculative references that are updated along
+   with callgraph edges associated with them.  */
 
 void
 ipa_clear_stmts_in_references (symtab_node referring_node)
@@ -266,5 +310,6 @@  ipa_clear_stmts_in_references (symtab_no
   int i;
 
   for (i = 0; ipa_ref_list_reference_iterate (&referring_node->symbol.ref_list, i, r); i++)
-    r->stmt = NULL;
+    if (!r->speculative)
+      r->stmt = NULL;
 }
Index: ipa-ref.h
===================================================================
--- ipa-ref.h	(revision 201601)
+++ ipa-ref.h	(working copy)
@@ -40,8 +40,10 @@  struct GTY(()) ipa_ref
   symtab_node referring;
   symtab_node referred;
   gimple stmt;
+  unsigned int lto_stmt_uid;
   unsigned int referred_index;
   ENUM_BITFIELD (ipa_ref_use) use:2;
+  unsigned int speculative:1;
 };
 
 typedef struct ipa_ref ipa_ref_t;
@@ -71,8 +73,9 @@  void ipa_dump_references (FILE *, struct
 void ipa_dump_referring (FILE *, struct ipa_ref_list *);
 void ipa_clone_references (symtab_node, struct ipa_ref_list *);
 void ipa_clone_referring (symtab_node, struct ipa_ref_list *);
+struct ipa_ref * ipa_clone_ref (struct ipa_ref *, symtab_node, gimple);
 bool ipa_ref_cannot_lead_to_return (struct ipa_ref *);
 bool ipa_ref_has_aliases_p (struct ipa_ref_list *);
-struct ipa_ref * ipa_find_reference (symtab_node, symtab_node, gimple);
+struct ipa_ref * ipa_find_reference (symtab_node, symtab_node, gimple, unsigned int);
 void ipa_remove_stmt_references (symtab_node, gimple);
 void ipa_clear_stmts_in_references (symtab_node);
Index: lto-cgraph.c
===================================================================
--- lto-cgraph.c	(revision 201568)
+++ lto-cgraph.c	(working copy)
@@ -273,12 +273,13 @@  lto_output_edge (struct lto_simple_outpu
 
   bp = bitpack_create (ob->main_stream);
   uid = (!gimple_has_body_p (edge->caller->symbol.decl)
-	 ? edge->lto_stmt_uid : gimple_uid (edge->call_stmt));
+	 ? edge->lto_stmt_uid : gimple_uid (edge->call_stmt) + 1);
   bp_pack_enum (&bp, cgraph_inline_failed_enum,
 	        CIF_N_REASONS, edge->inline_failed);
   bp_pack_var_len_unsigned (&bp, uid);
   bp_pack_var_len_unsigned (&bp, edge->frequency);
   bp_pack_value (&bp, edge->indirect_inlining_edge, 1);
+  bp_pack_value (&bp, edge->speculative, 1);
   bp_pack_value (&bp, edge->call_stmt_cannot_inline_p, 1);
   bp_pack_value (&bp, edge->can_throw_external, 1);
   if (edge->indirect_unknown_callee)
@@ -589,13 +590,24 @@  lto_output_ref (struct lto_simple_output
 {
   struct bitpack_d bp;
   int nref;
+  int uid = ref->lto_stmt_uid;
+  struct cgraph_node *node;
 
   bp = bitpack_create (ob->main_stream);
   bp_pack_value (&bp, ref->use, 2);
+  bp_pack_value (&bp, ref->speculative, 1);
   streamer_write_bitpack (&bp);
   nref = lto_symtab_encoder_lookup (encoder, ref->referred);
   gcc_assert (nref != LCC_NOT_FOUND);
   streamer_write_hwi_stream (ob->main_stream, nref);
+  
+  node = dyn_cast <cgraph_node> (ref->referring);
+  if (node)
+    {
+      if (ref->stmt)
+	uid = gimple_uid (ref->stmt) + 1;
+      streamer_write_hwi_stream (ob->main_stream, uid);
+    }
 }
 
 /* Stream out profile_summary to OB.  */
@@ -1116,11 +1128,17 @@  input_ref (struct lto_input_block *ib,
   symtab_node node = NULL;
   struct bitpack_d bp;
   enum ipa_ref_use use;
+  bool speculative;
+  struct ipa_ref *ref;
 
   bp = streamer_read_bitpack (ib);
   use = (enum ipa_ref_use) bp_unpack_value (&bp, 2);
+  speculative = (enum ipa_ref_use) bp_unpack_value (&bp, 1);
   node = nodes[streamer_read_hwi (ib)];
-  ipa_record_reference (referring_node, node, use, NULL);
+  ref = ipa_record_reference (referring_node, node, use, NULL);
+  ref->speculative = speculative;
+  if (is_a <cgraph_node> (referring_node))
+    ref->lto_stmt_uid = streamer_read_hwi (ib);
 }
 
 /* Read an edge from IB.  NODES points to a vector of previously read nodes for
@@ -1167,6 +1185,7 @@  input_edge (struct lto_input_block *ib,
     edge = cgraph_create_edge (caller, callee, NULL, count, freq);
 
   edge->indirect_inlining_edge = bp_unpack_value (&bp, 1);
+  edge->speculative = bp_unpack_value (&bp, 1);
   edge->lto_stmt_uid = stmt_id;
   edge->inline_failed = inline_failed;
   edge->call_stmt_cannot_inline_p = bp_unpack_value (&bp, 1);
Index: cgraphclones.c
===================================================================
--- cgraphclones.c	(revision 201601)
+++ cgraphclones.c	(working copy)
@@ -147,6 +147,7 @@  cgraph_clone_edge (struct cgraph_edge *e
   /* Clone flags that depend on call_stmt availability manually.  */
   new_edge->can_throw_external = e->can_throw_external;
   new_edge->call_stmt_cannot_inline_p = e->call_stmt_cannot_inline_p;
+  new_edge->speculative = e->speculative;
   if (update_original)
     {
       e->count -= new_edge->count;
@@ -474,17 +475,21 @@  cgraph_find_replacement_node (struct cgr
 }
 
 /* Like cgraph_set_call_stmt but walk the clone tree and update all
-   clones sharing the same function body.  */
+   clones sharing the same function body.  
+   When WHOLE_SPECULATIVE_EDGES is true, all three components of
+   speculative edge gets updated.  Otherwise we update only direct
+   call.  */
 
 void
 cgraph_set_call_stmt_including_clones (struct cgraph_node *orig,
-				       gimple old_stmt, gimple new_stmt)
+				       gimple old_stmt, gimple new_stmt,
+				       bool update_speculative)
 {
   struct cgraph_node *node;
   struct cgraph_edge *edge = cgraph_edge (orig, old_stmt);
 
   if (edge)
-    cgraph_set_call_stmt (edge, new_stmt);
+    cgraph_set_call_stmt (edge, new_stmt, update_speculative);
 
   node = orig->clones;
   if (node)
@@ -492,7 +497,23 @@  cgraph_set_call_stmt_including_clones (s
       {
 	struct cgraph_edge *edge = cgraph_edge (node, old_stmt);
 	if (edge)
-	  cgraph_set_call_stmt (edge, new_stmt);
+	  {
+	    cgraph_set_call_stmt (edge, new_stmt, update_speculative);
+	    /* If UPDATE_SPECULATIVE is false, it means that we are turning
+	       speculative call into a real code sequence.  Update the
+	       callgraph edges.  */
+	    if (edge->speculative && !update_speculative)
+	      {
+		struct cgraph_edge *direct, *indirect;
+		struct ipa_ref *ref;
+
+		gcc_assert (!edge->indirect_unknown_callee);
+		cgraph_speculative_call_info (edge, direct, indirect, ref);
+		direct->speculative = false;
+		indirect->speculative = false;
+		ref->speculative = false;
+	      }
+	  }
 	if (node->clones)
 	  node = node->clones;
 	else if (node->next_sibling_clone)
@@ -811,6 +832,7 @@  cgraph_materialize_all_clones (void)
 {
   struct cgraph_node *node;
   bool stabilized = false;
+  
 
   if (cgraph_dump_file)
     fprintf (cgraph_dump_file, "Materializing clones\n");
Index: ipa-inline.c
===================================================================
--- ipa-inline.c	(revision 201601)
+++ ipa-inline.c	(working copy)
@@ -1388,6 +1388,17 @@  add_new_edges_to_heap (fibheap_t heap, v
     }
 }
 
+/* Remove EDGE from the fibheap.  */
+
+static void
+heap_edge_removal_hook (struct cgraph_edge *e, void *data)
+{
+  if (e->aux)
+    {
+      fibheap_delete_node ((fibheap_t)data, (fibnode_t)e->aux);
+      e->aux = NULL;
+    }
+}
 
 /* We use greedy algorithm for inlining of small functions:
    All inline candidates are put into prioritized heap ordered in
@@ -1406,10 +1417,14 @@  inline_small_functions (void)
   vec<cgraph_edge_p> new_indirect_edges = vNULL;
   int initial_size = 0;
   struct cgraph_node **order = XCNEWVEC (struct cgraph_node *, cgraph_n_nodes);
+  struct cgraph_edge_hook_list *edge_removal_hook_holder;
 
   if (flag_indirect_inlining)
     new_indirect_edges.create (8);
 
+  edge_removal_hook_holder
+    = cgraph_add_edge_removal_hook (&heap_edge_removal_hook, edge_heap);
+
   /* Compute overall unit size and other global parameters used by badness
      metrics.  */
 
@@ -1663,6 +1678,7 @@  inline_small_functions (void)
 	     initial_size, overall_size,
 	     initial_size ? overall_size * 100 / (initial_size) - 100: 0);
   BITMAP_FREE (updated_nodes);
+  cgraph_remove_edge_removal_hook (edge_removal_hook_holder);
 }
 
 /* Flatten NODE.  Performed both during early inlining and
Index: lto-streamer-in.c
===================================================================
--- lto-streamer-in.c	(revision 201627)
+++ lto-streamer-in.c	(working copy)
@@ -755,30 +755,60 @@  input_ssa_names (struct lto_input_block
    so they point to STMTS.  */
 
 static void
-fixup_call_stmt_edges_1 (struct cgraph_node *node, gimple *stmts)
+fixup_call_stmt_edges_1 (struct cgraph_node *node, gimple *stmts,
+			 struct function *fn)
 {
   struct cgraph_edge *cedge;
+  struct ipa_ref *ref;
+  unsigned int i;
+
   for (cedge = node->callees; cedge; cedge = cedge->next_callee)
-    cedge->call_stmt = stmts[cedge->lto_stmt_uid];
+    {
+      if (gimple_stmt_max_uid (fn) < cedge->lto_stmt_uid)
+      fatal_error ("Cgraph edge statement index out of range");
+      cedge->call_stmt = stmts[cedge->lto_stmt_uid - 1];
+      if (!cedge->call_stmt)
+      fatal_error ("Cgraph edge statement index not found");
+    }
   for (cedge = node->indirect_calls; cedge; cedge = cedge->next_callee)
-    cedge->call_stmt = stmts[cedge->lto_stmt_uid];
+    {
+      if (gimple_stmt_max_uid (fn) < cedge->lto_stmt_uid)
+      fatal_error ("Cgraph edge statement index out of range");
+      cedge->call_stmt = stmts[cedge->lto_stmt_uid - 1];
+      if (!cedge->call_stmt)
+      fatal_error ("Cgraph edge statement index not found");
+    }
+  for (i = 0;
+       ipa_ref_list_reference_iterate (&node->symbol.ref_list, i, ref);
+       i++)
+    if (ref->lto_stmt_uid)
+      {
+	if (gimple_stmt_max_uid (fn) < ref->lto_stmt_uid)
+	  fatal_error ("Reference statement index out of range");
+	ref->stmt = stmts[ref->lto_stmt_uid - 1];
+	if (!ref->stmt)
+	  fatal_error ("Reference statement index not found");
+      }
 }
 
+
 /* Fixup call_stmt pointers in NODE and all clones.  */
 
 static void
 fixup_call_stmt_edges (struct cgraph_node *orig, gimple *stmts)
 {
   struct cgraph_node *node;
+  struct function *fn;
 
   while (orig->clone_of)
     orig = orig->clone_of;
+  fn = DECL_STRUCT_FUNCTION (orig->symbol.decl);
 
-  fixup_call_stmt_edges_1 (orig, stmts);
+  fixup_call_stmt_edges_1 (orig, stmts, fn);
   if (orig->clones)
     for (node = orig->clones; node != orig;)
       {
-	fixup_call_stmt_edges_1 (node, stmts);
+	fixup_call_stmt_edges_1 (node, stmts, fn);
 	if (node->clones)
 	  node = node->clones;
 	else if (node->next_sibling_clone)
Index: ipa-prop.c
===================================================================
--- ipa-prop.c	(revision 201568)
+++ ipa-prop.c	(working copy)
@@ -2313,12 +2313,6 @@  ipa_make_edge_direct_to_target (struct c
      the cgraph node too early.  */
   gcc_assert (!callee->global.inlined_to);
 
-  cgraph_make_edge_direct (ie, callee);
-  es = inline_edge_summary (ie);
-  es->call_stmt_size -= (eni_size_weights.indirect_call_cost
-			 - eni_size_weights.call_cost);
-  es->call_stmt_time -= (eni_time_weights.indirect_call_cost
-			 - eni_time_weights.call_cost);
   if (dump_file && !unreachable)
     {
       fprintf (dump_file, "ipa-prop: Discovered %s call to a known target "
@@ -2326,14 +2320,19 @@  ipa_make_edge_direct_to_target (struct c
 	       ie->indirect_info->polymorphic ? "a virtual" : "an indirect",
 	       xstrdup (cgraph_node_name (ie->caller)),
 	       ie->caller->symbol.order,
-	       xstrdup (cgraph_node_name (ie->callee)),
-	       ie->callee->symbol.order);
+	       xstrdup (cgraph_node_name (callee)),
+	       callee->symbol.order);
       if (ie->call_stmt)
 	print_gimple_stmt (dump_file, ie->call_stmt, 2, TDF_SLIM);
       else
 	fprintf (dump_file, "with uid %i\n", ie->lto_stmt_uid);
-    }
-  callee = cgraph_function_or_thunk_node (callee, NULL);
+     }
+  ie = cgraph_make_edge_direct (ie, callee);
+  es = inline_edge_summary (ie);
+  es->call_stmt_size -= (eni_size_weights.indirect_call_cost
+			 - eni_size_weights.call_cost);
+  es->call_stmt_time -= (eni_time_weights.indirect_call_cost
+			 - eni_time_weights.call_cost);
 
   return ie;
 }
@@ -2374,7 +2373,7 @@  remove_described_reference (symtab_node
 
   origin = rdesc->cs;
   to_del = ipa_find_reference ((symtab_node) origin->caller, symbol,
-			       origin->call_stmt);
+			       origin->call_stmt, origin->lto_stmt_uid);
   gcc_assert (to_del);
   ipa_remove_reference (to_del);
   if (dump_file)
@@ -2408,9 +2407,11 @@  try_make_edge_direct_simple_call (struct
 				  struct ipa_jump_func *jfunc,
 				  struct ipa_node_params *new_root_info)
 {
-  struct ipa_cst_ref_desc *rdesc;
   struct cgraph_edge *cs;
   tree target;
+  bool agg_contents = ie->indirect_info->agg_contents;
+  bool speculative = ie->speculative;
+  struct ipa_cst_ref_desc *rdesc;
 
   if (ie->indirect_info->agg_contents)
     target = ipa_find_agg_cst_for_param (&jfunc->agg,
@@ -2422,7 +2423,8 @@  try_make_edge_direct_simple_call (struct
     return NULL;
   cs = ipa_make_edge_direct_to_target (ie, target);
 
-  if (cs && !ie->indirect_info->agg_contents
+  /* FIXME: speculative edges can be handled.  */
+  if (cs && !agg_contents && !speculative
       && jfunc->type == IPA_JF_CONST
       && (rdesc = jfunc_rdesc_usable (jfunc))
       && --rdesc->refcount == 0)
@@ -2521,7 +2523,14 @@  update_indirect_edges_after_inlining (st
       else
 	new_direct_edge = try_make_edge_direct_simple_call (ie, jfunc,
 							    new_root_info);
-      if (new_direct_edge)
+      /* If speculation was removed, then we need to do nothing.  */
+      if (new_direct_edge && new_direct_edge != ie)
+	{
+	  new_direct_edge->indirect_inlining_edge = 1;
+	  top = IPA_EDGE_REF (cs);
+	  res = true;
+	}
+      else if (new_direct_edge)
 	{
 	  new_direct_edge->indirect_inlining_edge = 1;
 	  if (new_direct_edge->call_stmt)
@@ -2532,9 +2541,9 @@  update_indirect_edges_after_inlining (st
 	  if (new_edges)
 	    {
 	      new_edges->safe_push (new_direct_edge);
-	      top = IPA_EDGE_REF (cs);
 	      res = true;
 	    }
+	  top = IPA_EDGE_REF (cs);
 	}
       else if (jfunc->type == IPA_JF_PASS_THROUGH
 	       && ipa_get_jf_pass_through_operation (jfunc) == NOP_EXPR)
@@ -2645,7 +2654,7 @@  propagate_controlled_uses (struct cgraph
 		  && TREE_CODE (TREE_OPERAND (t, 0)) == FUNCTION_DECL
 		  && (n = cgraph_get_node (TREE_OPERAND (t, 0)))
 		  && (ref = ipa_find_reference ((symtab_node) new_root,
-						(symtab_node) n, NULL)))
+						(symtab_node) n, NULL, 0)))
 		{
 		  if (dump_file)
 		    fprintf (dump_file, "ipa-prop: Removing cloning-created "
@@ -2683,7 +2692,7 @@  propagate_controlled_uses (struct cgraph
 		    {
 		      struct ipa_ref *ref;
 		      ref = ipa_find_reference ((symtab_node) clone,
-						(symtab_node) n, NULL);
+						(symtab_node) n, NULL, 0);
 		      if (ref)
 			{
 			  if (dump_file)
Index: tree-inline.c
===================================================================
--- tree-inline.c	(revision 201568)
+++ tree-inline.c	(working copy)
@@ -1676,6 +1676,8 @@  copy_bb (copy_body_data *id, basic_block
 		  if (edge)
 		    {
 		      int edge_freq = edge->frequency;
+		      int new_freq;
+		      struct cgraph_edge *old_edge = edge;
 		      edge = cgraph_clone_edge (edge, id->dst_node, stmt,
 					        gimple_uid (stmt),
 					        REG_BR_PROB_BASE, CGRAPH_FREQ_BASE,
@@ -1683,25 +1685,52 @@  copy_bb (copy_body_data *id, basic_block
 		      /* We could also just rescale the frequency, but
 		         doing so would introduce roundoff errors and make
 			 verifier unhappy.  */
-		      edge->frequency
-		        = compute_call_stmt_bb_frequency (id->dst_node->symbol.decl,
-							  copy_basic_block);
-		      if (dump_file
-		      	  && profile_status_for_function (cfun) != PROFILE_ABSENT
-			  && (edge_freq > edge->frequency + 10
-			      || edge_freq < edge->frequency - 10))
+		      new_freq  = compute_call_stmt_bb_frequency (id->dst_node->symbol.decl,
+								  copy_basic_block);
+
+		      /* Speculative calls consist of two edges - direct and indirect.
+			 Duplicate the whole thing and distribute frequencies accordingly.  */
+		      if (edge->speculative)
 			{
-			  fprintf (dump_file, "Edge frequency estimated by "
-			           "cgraph %i diverge from inliner's estimate %i\n",
-			  	   edge_freq,
-				   edge->frequency);
-			  fprintf (dump_file,
-			  	   "Orig bb: %i, orig bb freq %i, new bb freq %i\n",
-				   bb->index,
-				   bb->frequency,
-				   copy_basic_block->frequency);
+			  struct cgraph_edge *direct, *indirect;
+			  struct ipa_ref *ref;
+
+			  gcc_assert (!edge->indirect_unknown_callee);
+			  cgraph_speculative_call_info (old_edge, direct, indirect, ref);
+			  indirect = cgraph_clone_edge (indirect, id->dst_node, stmt,
+							gimple_uid (stmt),
+							REG_BR_PROB_BASE, CGRAPH_FREQ_BASE,
+							true);
+			  if (old_edge->frequency + indirect->frequency)
+			    {
+			      edge->frequency = MIN (RDIV ((gcov_type)new_freq * old_edge->frequency,
+						           (old_edge->frequency + indirect->frequency)),
+						     CGRAPH_FREQ_MAX);
+			      indirect->frequency = MIN (RDIV ((gcov_type)new_freq * indirect->frequency,
+							       (old_edge->frequency + indirect->frequency)),
+							 CGRAPH_FREQ_MAX);
+			    }
+			  ipa_clone_ref (ref, (symtab_node)id->dst_node, stmt);
+			}
+		      else
+			{
+			  edge->frequency = new_freq;
+			  if (dump_file
+			      && profile_status_for_function (cfun) != PROFILE_ABSENT
+			      && (edge_freq > edge->frequency + 10
+				  || edge_freq < edge->frequency - 10))
+			    {
+			      fprintf (dump_file, "Edge frequency estimated by "
+				       "cgraph %i diverge from inliner's estimate %i\n",
+				       edge_freq,
+				       edge->frequency);
+			      fprintf (dump_file,
+				       "Orig bb: %i, orig bb freq %i, new bb freq %i\n",
+				       bb->index,
+				       bb->frequency,
+				       copy_basic_block->frequency);
+			    }
 			}
-		      stmt = cgraph_redirect_edge_call_stmt_to_callee (edge);
 		    }
 		  break;
 
@@ -2232,6 +2261,23 @@  copy_loops (bitmap blocks_to_copy,
     }
 }
 
+/* Call cgraph_redirect_edge_call_stmt_to_callee on all calls in BB */
+
+void
+redirect_all_calls (copy_body_data * id, basic_block bb)
+{
+  gimple_stmt_iterator si;
+  for (si = gsi_start_bb (bb); !gsi_end_p (si); gsi_next (&si))
+    {
+      if (is_gimple_call (gsi_stmt (si)))
+	{
+	  struct cgraph_edge *edge = cgraph_edge (id->dst_node, gsi_stmt (si));
+	  if (edge)
+	    cgraph_redirect_edge_call_stmt_to_callee (edge);
+	}
+    }
+}
+
 /* Make a copy of the body of FN so that it can be inserted inline in
    another function.  Walks FN via CFG, returns new fndecl.  */
 
@@ -2356,6 +2402,10 @@  copy_cfg_body (copy_body_data * id, gcov
 	    && bb->index != ENTRY_BLOCK
 	    && bb->index != EXIT_BLOCK)
 	  maybe_move_debug_stmts_to_successors (id, (basic_block) bb->aux);
+	/* Update call edge destinations.  This can not be done before loop
+	   info is updated, because we may split basic blocks.  */
+	if (id->transform_call_graph_edges == CB_CGE_DUPLICATE)
+	  redirect_all_calls (id, (basic_block)bb->aux);
 	((basic_block)bb->aux)->aux = NULL;
 	bb->aux = NULL;
       }
@@ -2367,6 +2417,10 @@  copy_cfg_body (copy_body_data * id, gcov
       if (need_debug_cleanup)
 	maybe_move_debug_stmts_to_successors (id, BASIC_BLOCK (last));
       BASIC_BLOCK (last)->aux = NULL;
+      /* Update call edge destinations.  This can not be done before loop
+	 info is updated, because we may split basic blocks.  */
+      if (id->transform_call_graph_edges == CB_CGE_DUPLICATE)
+	redirect_all_calls (id, BASIC_BLOCK (last));
     }
   entry_block_map->aux = NULL;
   exit_block_map->aux = NULL;
@@ -4941,43 +4995,47 @@  delete_unreachable_blocks_update_callgra
           gimple_stmt_iterator bsi;
 
           for (bsi = gsi_start_bb (b); !gsi_end_p (bsi); gsi_next (&bsi))
-	    if (gimple_code (gsi_stmt (bsi)) == GIMPLE_CALL)
-	      {
-	        struct cgraph_edge *e;
-		struct cgraph_node *node;
+	    {
+	      struct cgraph_edge *e;
+	      struct cgraph_node *node;
 
-	        if ((e = cgraph_edge (id->dst_node, gsi_stmt (bsi))) != NULL)
+	      ipa_remove_stmt_references ((symtab_node)id->dst_node, gsi_stmt (bsi));
+
+	      if (gimple_code (gsi_stmt (bsi)) == GIMPLE_CALL
+		  &&(e = cgraph_edge (id->dst_node, gsi_stmt (bsi))) != NULL)
+		{
+		  if (!e->inline_failed)
+		    cgraph_remove_node_and_inline_clones (e->callee, id->dst_node);
+		  else
+		    cgraph_remove_edge (e);
+		}
+	      if (id->transform_call_graph_edges == CB_CGE_MOVE_CLONES
+		  && id->dst_node->clones)
+		for (node = id->dst_node->clones; node != id->dst_node;)
 		  {
-		    if (!e->inline_failed)
-		      cgraph_remove_node_and_inline_clones (e->callee, id->dst_node);
+		    ipa_remove_stmt_references ((symtab_node)node, gsi_stmt (bsi));
+		    if (gimple_code (gsi_stmt (bsi)) == GIMPLE_CALL
+			&& (e = cgraph_edge (node, gsi_stmt (bsi))) != NULL)
+		      {
+			if (!e->inline_failed)
+			  cgraph_remove_node_and_inline_clones (e->callee, id->dst_node);
+			else
+			  cgraph_remove_edge (e);
+		      }
+
+		    if (node->clones)
+		      node = node->clones;
+		    else if (node->next_sibling_clone)
+		      node = node->next_sibling_clone;
 		    else
-	              cgraph_remove_edge (e);
+		      {
+			while (node != id->dst_node && !node->next_sibling_clone)
+			  node = node->clone_of;
+			if (node != id->dst_node)
+			  node = node->next_sibling_clone;
+		      }
 		  }
-		if (id->transform_call_graph_edges == CB_CGE_MOVE_CLONES
-		    && id->dst_node->clones)
-     		  for (node = id->dst_node->clones; node != id->dst_node;)
-		    {
-	              if ((e = cgraph_edge (node, gsi_stmt (bsi))) != NULL)
-			{
-		          if (!e->inline_failed)
-		            cgraph_remove_node_and_inline_clones (e->callee, id->dst_node);
-			  else
-	                    cgraph_remove_edge (e);
-			}
-
-		      if (node->clones)
-			node = node->clones;
-		      else if (node->next_sibling_clone)
-			node = node->next_sibling_clone;
-		      else
-			{
-			  while (node != id->dst_node && !node->next_sibling_clone)
-			    node = node->clone_of;
-			  if (node != id->dst_node)
-			    node = node->next_sibling_clone;
-			}
-		    }
-	      }
+	    }
 	  delete_basic_block (b);
 	  changed = true;
 	}