diff mbox

Ensure count_scale is no larger than REG_BR_PROB_BASE

Message ID CAO2gOZVj7vwDDtn1=XjE4PsiB-1B-a-9Ku2gFHViOVOLZUYMrg@mail.gmail.com
State New
Headers show

Commit Message

Dehao Chen May 16, 2014, 11:21 p.m. UTC
Is this patch ok for trunk? Bootstrapped and regression test on-going.

Thanks,
Dehao

2014-05-16  Dehao Chen  <dehao@google.com>

        * tree-inline.c (initialize_cfun): Ensure count_scale is no larger
        than REG_BR_PROB_BASE.
        (copy_cfg_body): Likewise.

Comments

Jan Hubicka May 16, 2014, 11:41 p.m. UTC | #1
> Is this patch ok for trunk? Bootstrapped and regression test on-going.
> 
> Thanks,
> Dehao
> 
> 2014-05-16  Dehao Chen  <dehao@google.com>
> 
>         * tree-inline.c (initialize_cfun): Ensure count_scale is no larger
>         than REG_BR_PROB_BASE.
>         (copy_cfg_body): Likewise.

This seems like wrong place to paper around the problem - symmetric count
scaling is done during production of the inline clone.  I think if we want to
be smart about broken profiles, we should do it at that place instead here
at inliner...

What kind of problem does this patch solve?

Honza
Dehao Chen May 17, 2014, 12:03 a.m. UTC | #2
On Fri, May 16, 2014 at 4:41 PM, Jan Hubicka <hubicka@ucw.cz> wrote:
>
> > Is this patch ok for trunk? Bootstrapped and regression test on-going.
> >
> > Thanks,
> > Dehao
> >
> > 2014-05-16  Dehao Chen  <dehao@google.com>
> >
> >         * tree-inline.c (initialize_cfun): Ensure count_scale is no larger
> >         than REG_BR_PROB_BASE.
> >         (copy_cfg_body): Likewise.
>
> This seems like wrong place to paper around the problem - symmetric count
> scaling is done during production of the inline clone.  I think if we want to
> be smart about broken profiles, we should do it at that place instead here
> at inliner...
>
> What kind of problem does this patch solve?


In AutoFDO, a basic block's count can be much larger than it's actual
count because debug info might be incorrect. In this case, a call edge
count (calculated from BB count) could be much larger than callee's
header count, making the count_scale incorrectly large.

Dehao
>
>
> Honza
Jan Hubicka May 17, 2014, 12:22 a.m. UTC | #3
> In AutoFDO, a basic block's count can be much larger than it's actual
> count because debug info might be incorrect. In this case, a call edge
> count (calculated from BB count) could be much larger than callee's
> header count, making the count_scale incorrectly large.

In this case I still think we should handle this when producing the clone:
we do not want to have clone's count much larger as well, so i think inliner
and ipa-cp needs to deal with capping here instead....

Honza
> 
> Dehao
> >
> >
> > Honza
Dehao Chen May 17, 2014, 12:42 a.m. UTC | #4
Do you mean adjusting bb->count? Because in
expand_call_inline(tree-inline.c), it will use bb->count to pass into
copy_body to calculate count_scale.

Thanks,
Dehao

On Fri, May 16, 2014 at 5:22 PM, Jan Hubicka <hubicka@ucw.cz> wrote:
>> In AutoFDO, a basic block's count can be much larger than it's actual
>> count because debug info might be incorrect. In this case, a call edge
>> count (calculated from BB count) could be much larger than callee's
>> header count, making the count_scale incorrectly large.
>
> In this case I still think we should handle this when producing the clone:
> we do not want to have clone's count much larger as well, so i think inliner
> and ipa-cp needs to deal with capping here instead....
>
> Honza
>>
>> Dehao
>> >
>> >
>> > Honza
Jan Hubicka May 17, 2014, 1:41 a.m. UTC | #5
> Do you mean adjusting bb->count? Because in
> expand_call_inline(tree-inline.c), it will use bb->count to pass into
> copy_body to calculate count_scale.

What about taking here callee->count instead? For inline nodes without
any capping hack, bb->count == edge->count = callee->count.

When profile ends up being obviously inconsistent, I would say that
inliner can cap callee->count during producing the clone...

Honza
> 
> Thanks,
> Dehao
> 
> On Fri, May 16, 2014 at 5:22 PM, Jan Hubicka <hubicka@ucw.cz> wrote:
> >> In AutoFDO, a basic block's count can be much larger than it's actual
> >> count because debug info might be incorrect. In this case, a call edge
> >> count (calculated from BB count) could be much larger than callee's
> >> header count, making the count_scale incorrectly large.
> >
> > In this case I still think we should handle this when producing the clone:
> > we do not want to have clone's count much larger as well, so i think inliner
> > and ipa-cp needs to deal with capping here instead....
> >
> > Honza
> >>
> >> Dehao
> >> >
> >> >
> >> > Honza
diff mbox

Patch

Index: gcc/tree-inline.c
===================================================================
--- gcc/tree-inline.c (revision 210535)
+++ gcc/tree-inline.c (working copy)
@@ -2190,7 +2190,8 @@  initialize_cfun (tree new_fndecl, tree callee_fnde
   if (!DECL_RESULT (new_fndecl))
     DECL_RESULT (new_fndecl) = DECL_RESULT (callee_fndecl);

-  if (ENTRY_BLOCK_PTR_FOR_FN (src_cfun)->count)
+  if (ENTRY_BLOCK_PTR_FOR_FN (src_cfun)->count
+      && count < ENTRY_BLOCK_PTR_FOR_FN (src_cfun)->count)
     count_scale
         = GCOV_COMPUTE_SCALE (count,
                               ENTRY_BLOCK_PTR_FOR_FN (src_cfun)->count);
@@ -2451,7 +2452,8 @@  copy_cfg_body (copy_body_data * id, gcov_type coun
       freqs_to_counts (id->src_node, count > in_count ? count : in_count);
     }

-  if (ENTRY_BLOCK_PTR_FOR_FN (src_cfun)->count)
+  if (ENTRY_BLOCK_PTR_FOR_FN (src_cfun)->count
+      && count < ENTRY_BLOCK_PTR_FOR_FN (src_cfun)->count)
     count_scale
         = GCOV_COMPUTE_SCALE (count,
                               ENTRY_BLOCK_PTR_FOR_FN (src_cfun)->count);