[RFC] ipa-cp: Fix PGO regression caused by r278808
diff mbox series

Message ID 20191210033822.124374-1-luoxhu@linux.ibm.com
State New
Headers show
Series
  • [RFC] ipa-cp: Fix PGO regression caused by r278808
Related show

Commit Message

luoxhu Dec. 10, 2019, 3:38 a.m. UTC
The performance of exchange2 built with PGO will decrease ~28% by r278808
due to profile count set incorrectly.  The cloned nodes are updated to a
very small count caused later pass cunroll fail to unroll the recursive
function in exchange2, This patch enables adding orig_sum to the new nodes
for self recursive node.

digits_2 ->
digits_2.constprop.0, digits_2.constprop.1, etc.

gcc/ChangeLog:

	2019-12-10  Luo Xiong Hu  <luoxhu@linux.ibm.com>

	* ipa-pure-const.c (self_recursive_p): Move it from ...
	(propagate_pure_const): Use cgraph_node::self_recursive_p.
	(pass_nothrow::execute): Likewise.
	* cgraph.h (cgraph_node::self_recursive_p): to ... this.
	* ipa-cp.c (update_profiling_info): Check self_recursive_p node.
---
 gcc/cgraph.h         | 18 ++++++++++++++++++
 gcc/ipa-cp.c         |  5 ++++-
 gcc/ipa-pure-const.c | 19 ++-----------------
 3 files changed, 24 insertions(+), 18 deletions(-)

Comments

Martin Liška Dec. 10, 2019, 8:24 a.m. UTC | #1
CC'ing Martin and Honza.

Martin
Jan Hubicka Dec. 10, 2019, 8:55 a.m. UTC | #2
Hi,
I think the updating should treat self recursive edges as loops: that is
calculate SUM of counts incomming edges which are not self recursive,
calculate probability of self recursing and then set count as
SUM * (1/(1-probability_of_recursion))
we do same thing when computing counts withing loops.

Martin, I woner how updating works for indirectly self recursive edges?

Honza
Martin Jambor Dec. 10, 2019, 9:14 a.m. UTC | #3
Hi,

On Tue, Dec 10 2019, Jan Hubicka wrote:
> Hi,
> I think the updating should treat self recursive edges as loops: that is
> calculate SUM of counts incomming edges which are not self recursive,
> calculate probability of self recursing and then set count as
> SUM * (1/(1-probability_of_recursion))
> we do same thing when computing counts withing loops.
>
> Martin, I woner how updating works for indirectly self recursive edges?

It does not special-case them in any way, if that was the question.

Martin
Jan Hubicka Dec. 10, 2019, 11:06 a.m. UTC | #4
> Hi,
> 
> On Tue, Dec 10 2019, Jan Hubicka wrote:
> > Hi,
> > I think the updating should treat self recursive edges as loops: that is
> > calculate SUM of counts incomming edges which are not self recursive,
> > calculate probability of self recursing and then set count as
> > SUM * (1/(1-probability_of_recursion))
> > we do same thing when computing counts withing loops.
> >
> > Martin, I woner how updating works for indirectly self recursive edges?
> 
> It does not special-case them in any way, if that was the question.

Well, it should, summing only frequencies of edges entering the SCC is
not going to give a good answer.
If SCC is having single entry and forms a loop the generalization of
algorithm above will work, but for general graphs this is hard problem.

I suppose we could special case self recurisve nodes (since they are
most common) and apply some fixed scale for other SCCs?

Also i wonder if the updating is done in RPO so we do not account edges
not updated yet?

Honza

> 
> Martin
luoxhu Dec. 13, 2019, 8:16 a.m. UTC | #5
Thanks Honza,

On 2019/12/10 19:06, Jan Hubicka wrote:
>> Hi,
>>
>> On Tue, Dec 10 2019, Jan Hubicka wrote:
>>> Hi,
>>> I think the updating should treat self recursive edges as loops: that is
>>> calculate SUM of counts incomming edges which are not self recursive,
>>> calculate probability of self recursing and then set count as
>>> SUM * (1/(1-probability_of_recursion))
>>> we do same thing when computing counts withing loops.
>>>
>>> Martin, I woner how updating works for indirectly self recursive edges?
>>
>> It does not special-case them in any way, if that was the question.
> 
> Well, it should, summing only frequencies of edges entering the SCC is
> not going to give a good answer.
> If SCC is having single entry and forms a loop the generalization of
> algorithm above will work, but for general graphs this is hard problem.
> 
> I suppose we could special case self recurisve nodes (since they are
> most common) and apply some fixed scale for other SCCs?
> 
> Also i wonder if the updating is done in RPO so we do not account edges
> not updated yet?

I have several questions with you comments.
1.  Where is the loops counts prob calculation code please?  And I am not
quite sure about the formula SUM * (1/(1-probability_of_recursion)).
At this moment, the self recursive node graph will be transformed as below:

            orig_node<-----
               |    |------| 
               |  
  caller ---> new_node

Assuming:		
e1: caller    ->  new_node:  count1
e2: orig_node -> orig_node:  count2
e3: new_node  -> orig_node:  count2 (same as back edge when cloned)

e3 is not considered to be a self recursive edge now?
So:
SUM = count2
probability_of_recursion should be count2/(count2 + count2)
=>
orig_node.count = SUM * (1/(1-probability_of_recursion)) = count2 * 2

But the count2 of e3 is not correct as it was copied when cloned which makes
the calculation inaccurate.  And the important thing is new_node.count is
still UNCERTAIN.

2.  Also, the orig_node_count is already inaccurate due to
ipa-cp.c:4289   orig_node_count = (orig_sum + new_sum).apply_scale (12, 10);
Since the parameter param_ipa_cp_max_recursive_depth defines how many nodes
are cloned, is it reasonable to update self_scc node as below to simplify the
calculation(replace self_recursive_p with info->node_is_self_scc)?

+  class ipa_node_params *info = IPA_NODE_REF (orig_node);
+  if (info && info->node_is_self_scc)
+    new_node->count
+      = (orig_sum + new_sum).apply_scale (1, param_ipa_cp_max_recursive_depth);
+  else
+    new_node->count = new_sum;

3.  The recursive ipa-cp will create multiple SCC nodes based on orig_node,
next new_node2 no need update previous new_node1, as each new_node takes
1/param_ipa_cp_max_recursive_depth execution count from orig_node.

BR
Xiong Hu

> 
> Honza
> 
>>
>> Martin

Patch
diff mbox series

diff --git a/gcc/cgraph.h b/gcc/cgraph.h
index cdeea4d9953..1aca7d114e9 100644
--- a/gcc/cgraph.h
+++ b/gcc/cgraph.h
@@ -1329,6 +1329,9 @@  struct GTY((tag ("SYMTAB_FUNCTION"))) cgraph_node : public symtab_node
   /* Return true if function should be optimized for size.  */
   bool optimize_for_size_p (void);
 
+  /* Return true if NODE is self recursive function.  */
+  inline bool self_recursive_p (void);
+
   /* Dump the callgraph to file F.  */
   static void dump_cgraph (FILE *f);
 
@@ -3285,6 +3288,21 @@  cgraph_node::optimize_for_size_p (void)
     return false;
 }
 
+/* Return true if NODE is self recursive function.
+   Indirectly recursive functions appears as non-trivial strongly
+   connected components, so we need to care about self recursion
+   only.  */
+
+inline bool
+cgraph_node::self_recursive_p (void)
+{
+  struct cgraph_edge *e;
+  for (e = this->callees; e; e = e->next_callee)
+    if (e->callee->function_symbol () == this)
+      return true;
+  return false;
+}
+
 /* Return symtab_node for NODE or create one if it is not present
    in symtab.  */
 
diff --git a/gcc/ipa-cp.c b/gcc/ipa-cp.c
index 14064ae0034..76c1b309d04 100644
--- a/gcc/ipa-cp.c
+++ b/gcc/ipa-cp.c
@@ -4305,7 +4305,10 @@  update_profiling_info (struct cgraph_node *orig_node,
     remainder = remainder.guessed_local ();
 
   new_sum = orig_node_count.combine_with_ipa_count (new_sum);
-  new_node->count = new_sum;
+  if (orig_node->self_recursive_p ())
+    new_node->count = (orig_sum + new_sum).apply_scale (5, 10);
+  else
+    new_node->count = new_sum;
   orig_node->count = remainder;
 
   profile_count::adjust_for_ipa_scaling (&new_sum, &orig_new_node_count);
diff --git a/gcc/ipa-pure-const.c b/gcc/ipa-pure-const.c
index a142e0cc8f6..520ed39b476 100644
--- a/gcc/ipa-pure-const.c
+++ b/gcc/ipa-pure-const.c
@@ -1371,21 +1371,6 @@  ignore_edge_for_nothrow (struct cgraph_edge *e)
 	  || !opt_for_fn (ultimate_target->decl, flag_ipa_pure_const));
 }
 
-/* Return true if NODE is self recursive function.
-   Indirectly recursive functions appears as non-trivial strongly
-   connected components, so we need to care about self recursion
-   only.  */
-
-static bool
-self_recursive_p (struct cgraph_node *node)
-{
-  struct cgraph_edge *e;
-  for (e = node->callees; e; e = e->next_callee)
-    if (e->callee->function_symbol () == node)
-      return true;
-  return false;
-}
-
 /* Return true if N is cdtor that is not const or pure.  In this case we may
    need to remove unreachable function if it is marked const/pure.  */
 
@@ -1666,7 +1651,7 @@  propagate_pure_const (void)
 	      if (this_state == IPA_NEITHER)
 	        this_looping = w_l->looping_previously_known;
 	    }
-	  if (!this_looping && self_recursive_p (w))
+	  if (!this_looping && w->self_recursive_p ())
 	    this_looping = true;
 	  if (!w_l->looping_previously_known)
 	    this_looping = false;
@@ -2342,7 +2327,7 @@  pass_nothrow::execute (function *)
   node->set_nothrow_flag (true);
 
   bool cfg_changed = false;
-  if (self_recursive_p (node))
+  if (node->self_recursive_p ())
     FOR_EACH_BB_FN (this_block, cfun)
       if (gimple *g = last_stmt (this_block))
 	if (is_gimple_call (g))