[2/2] ipa-cp: Improve updating behavior when profile counts have gone bad

Message ID: ri6mt57cc64.fsf@suse.cz
State: New
Series: [1/2] ipa-cp: Fix various issues in update_specialized_profile (PR 107925)

Commit Message

Martin Jambor Feb. 21, 2023, 2:42 p.m. UTC
Hi,

Looking into the behavior of profile count updating in PR 107925, I
noticed that a case not considered possible was actually happening,
and - with the guesswork in place to distribute unexplained counts -
it simply can happen.  Currently it is handled by dropping the counts
to local estimated zero, whereas it is probably better to leave the
counts as they are but drop the category to GUESSED_GLOBAL0 - which is
what profile_count::combine_with_ipa_count does in a similar case (or
so I hope :-)
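
To make the difference between the two ways of handling this concrete,
here is a toy sketch.  It is not GCC's profile_count: toy_count,
toy_quality and toy_remainder are hypothetical names made up for
illustration, and the consistent case is reduced to a plain
subtraction, which is a simplification of what update_profiling_info
really does.

```cpp
#include <cstdint>

// Toy stand-in for a profile count: a value plus a quality category.
// This is NOT GCC's profile_count; names and layout are hypothetical.
enum class toy_quality { guessed_local, guessed_global0, precise };

struct toy_count
{
  std::uint64_t value;
  toy_quality quality;

  // Old handling: forget the value and fall back to a locally guessed zero.
  static toy_count zero_guessed_local ()
  {
    return { 0, toy_quality::guessed_local };
  }

  // New handling: keep the value, only lower the category.
  toy_count global0 () const
  {
    return { value, toy_quality::guessed_global0 };
  }
};

// Count left for the original node after the clones took new_sum.
// old_behavior selects the pre-patch handling of the inconsistent case.
toy_count
toy_remainder (toy_count orig, std::uint64_t new_sum, bool old_behavior)
{
  if (new_sum > orig.value)
    return old_behavior ? toy_count::zero_guessed_local () : orig.global0 ();
  // Simplified consistent case (an assumption made for this sketch).
  return { orig.value - new_sum, orig.quality };
}
```

With an original count of 100 and clones claiming 150, the old handling
in this model yields a value of 0, whereas the new one keeps 100 and
only marks it as guessed_global0.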

Profiled-LTO-bootstrapped and normally bootstrapped and tested on
x86_64-linux.  OK for master once stage1 opens up?  Or perhaps even now?

Thanks,

Martin


gcc/ChangeLog:

2023-02-20  Martin Jambor  <mjambor@suse.cz>

	PR ipa/107925
	* ipa-cp.cc (update_profiling_info): Drop counts of orig_node to
	global0 instead of zeroing when it does not have as many counts as
	it should.
---
 gcc/ipa-cp.cc | 29 ++++++++++++++++++++++-------
 1 file changed, 22 insertions(+), 7 deletions(-)

Comments

Martin Jambor March 8, 2023, 10:33 a.m. UTC | #1
Hello,

I'd like to ping the patch below.

Martin


Jan Hubicka March 10, 2023, 5:24 p.m. UTC | #2
OK,
thanks!
Honza

Patch

diff --git a/gcc/ipa-cp.cc b/gcc/ipa-cp.cc
index 5a6b41cf2d6..6477bb840e5 100644
--- a/gcc/ipa-cp.cc
+++ b/gcc/ipa-cp.cc
@@ -4969,10 +4969,20 @@  update_profiling_info (struct cgraph_node *orig_node,
 					      false);
   new_sum = stats.count_sum;
 
+  bool orig_edges_processed = false;
   if (new_sum > orig_node_count)
     {
-      /* TODO: Perhaps this should be gcc_unreachable ()?  */
-      remainder = profile_count::zero ().guessed_local ();
+      /* TODO: Profile has already gone astray, keep what we have but lower it
+	 to global0 category.  */
+      remainder = orig_node->count.global0 ();
+
+      for (cgraph_edge *cs = orig_node->callees; cs; cs = cs->next_callee)
+	cs->count = cs->count.global0 ();
+      for (cgraph_edge *cs = orig_node->indirect_calls;
+	   cs;
+	   cs = cs->next_callee)
+	cs->count = cs->count.global0 ();
+      orig_edges_processed = true;
     }
   else if (stats.rec_count_sum.nonzero_p ())
     {
@@ -5070,11 +5080,16 @@  update_profiling_info (struct cgraph_node *orig_node,
   for (cgraph_edge *cs = new_node->indirect_calls; cs; cs = cs->next_callee)
     cs->count = cs->count.apply_scale (new_sum, orig_new_node_count);
 
-  profile_count::adjust_for_ipa_scaling (&remainder, &orig_node_count);
-  for (cgraph_edge *cs = orig_node->callees; cs; cs = cs->next_callee)
-    cs->count = cs->count.apply_scale (remainder, orig_node_count);
-  for (cgraph_edge *cs = orig_node->indirect_calls; cs; cs = cs->next_callee)
-    cs->count = cs->count.apply_scale (remainder, orig_node_count);
+  if (!orig_edges_processed)
+    {
+      profile_count::adjust_for_ipa_scaling (&remainder, &orig_node_count);
+      for (cgraph_edge *cs = orig_node->callees; cs; cs = cs->next_callee)
+	cs->count = cs->count.apply_scale (remainder, orig_node_count);
+      for (cgraph_edge *cs = orig_node->indirect_calls;
+	   cs;
+	   cs = cs->next_callee)
+	cs->count = cs->count.apply_scale (remainder, orig_node_count);
+    }
 
   if (dump_file)
     {
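
The control flow of the two hunks above can be sketched in isolation.
This is only a toy model under stated assumptions: toy_edge,
toy_quality and toy_update_orig_edges are hypothetical names, counts
are plain integers rather than profile_count objects, and the scaling
is a simple integer ratio instead of apply_scale.  It is meant to show
that the outgoing edges of the original node are either lowered in
category or scaled by the remainder, never both.

```cpp
#include <cstdint>
#include <vector>

// Toy quality categories; not GCC's real enumeration.
enum class toy_quality { guessed_global0, precise };

// Toy call graph edge carrying a count and its category.
struct toy_edge
{
  std::uint64_t count;
  toy_quality quality;
};

// Update the outgoing edges of the original node after clones took new_sum
// out of orig_count.  The flag mirrors orig_edges_processed in the patch.
void
toy_update_orig_edges (std::vector<toy_edge> &edges, std::uint64_t orig_count,
                       std::uint64_t new_sum)
{
  bool orig_edges_processed = false;
  std::uint64_t remainder = 0;

  if (new_sum > orig_count)
    {
      // Inconsistent profile: keep edge values, only lower their category.
      for (toy_edge &e : edges)
        e.quality = toy_quality::guessed_global0;
      orig_edges_processed = true;
    }
  else
    remainder = orig_count - new_sum;

  // Scale only when the edges were not already handled above.
  // (Overflow of the product is ignored in this toy model.)
  if (!orig_edges_processed && orig_count != 0)
    for (toy_edge &e : edges)
      e.count = e.count * remainder / orig_count;
}
```

In this model, edges with counts 80 and 20 on a node with count 100
stay at 80 and 20 (now guessed_global0) when the clones claim 150, and
become 48 and 12 when the clones claim 40.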