diff mbox

[GOOGLE] Don't update the callee count if caller is not resolved node

Message ID CAO2gOZWLrtyNNZyz0q3aqF+bxay30+Y4hhCsbYzTTQcooTKbnw@mail.gmail.com
State New
Headers show

Commit Message

Dehao Chen Oct. 28, 2013, 10:51 p.m. UTC
This patch changes to no update callee count if caller count is not a
resolved node (in LIPO mode) during AutoFDO compilation. This is
because AutoFDO will have the same edge counts for all unresolved
nodes. Original update method will lead to multi-update of the callee.

Bootstrapped and testing on going.

OK for google-4_8 if test is OK?

Thanks,
Dehao

Comments

Xinliang David Li Oct. 29, 2013, 4:49 a.m. UTC | #1
Is it sufficient to check if the final caller is defined in primary module?

Note that in some cases, doing profile update 'speculatively' (without
your fix) can be more precise (assuming the inlining gets materialized
in a different compilation), but in general it might be better to be
conservative in profile update (as in your patch) -- the downside is
you may end up with less aggressive size optimization.

David

On Mon, Oct 28, 2013 at 3:51 PM, Dehao Chen <dehao@google.com> wrote:
> This patch changes to no update callee count if caller count is not a
> resolved node (in LIPO mode) during AutoFDO compilation. This is
> because AutoFDO will have the same edge counts for all unresolved
> nodes. Original update method will lead to multi-update of the callee.
>
> Bootstrapped and testing on going.
>
> OK for google-4_8 if test is OK?
>
> Thanks,
> Dehao
>
> Index: gcc/ipa-inline-transform.c
> ===================================================================
> --- gcc/ipa-inline-transform.c (revision 204131)
> +++ gcc/ipa-inline-transform.c (working copy)
> @@ -169,6 +169,15 @@ clone_inlined_nodes (struct cgraph_edge *e, bool d
>        else
>   {
>    struct cgraph_node *n;
> +  if (flag_auto_profile && L_IPO_COMP_MODE
> +      && cgraph_pre_profiling_inlining_done)
> +    {
> +      struct cgraph_node *caller = e->caller;
> +      if (caller->global.inlined_to)
> + caller = caller->global.inlined_to;
> +      if (cgraph_lipo_get_resolved_node (caller->symbol.decl) != caller)
> + update_original = false;
> +    }
>    n = cgraph_clone_node (e->callee, e->callee->symbol.decl,
>   e->count, e->frequency,
>   update_original, vNULL, true);
Dehao Chen Oct. 29, 2013, 5:08 p.m. UTC | #2
On Mon, Oct 28, 2013 at 9:49 PM, Xinliang David Li <davidxl@google.com> wrote:
> Is it sufficient to check if the final caller is defined in primary module?

This might not be sufficient because the final caller may come from
comdat of aux-modules (not defined in the primary module).

>
> Note that in some cases, doing profile update 'speculatively' (without
> your fix) can be more precise (assuming the inlining gets materialized
> in a different compilation), but in general it might be better to be
> conservative in profile update (as in your patch) -- the downside is
> you may end up with less aggressive size optimization.

This is actually no speculatively update profile, but double update.

E.g. comdat_foo()-->bar() with count 100

and comdat_foo() has been defined in 2 aux-modules. Then there will be
two edges from comdat_foo()-->bar(), each of which has the count 100.
But bar()'s entry count is only 100 (assume comdat_foo is the only
caller). Then if we update bar() twice when inline these two edges,
the second update will be wrong.

Dehao

>
> David
>
> On Mon, Oct 28, 2013 at 3:51 PM, Dehao Chen <dehao@google.com> wrote:
>> This patch changes to no update callee count if caller count is not a
>> resolved node (in LIPO mode) during AutoFDO compilation. This is
>> because AutoFDO will have the same edge counts for all unresolved
>> nodes. Original update method will lead to multi-update of the callee.
>>
>> Bootstrapped and testing on going.
>>
>> OK for google-4_8 if test is OK?
>>
>> Thanks,
>> Dehao
>>
>> Index: gcc/ipa-inline-transform.c
>> ===================================================================
>> --- gcc/ipa-inline-transform.c (revision 204131)
>> +++ gcc/ipa-inline-transform.c (working copy)
>> @@ -169,6 +169,15 @@ clone_inlined_nodes (struct cgraph_edge *e, bool d
>>        else
>>   {
>>    struct cgraph_node *n;
>> +  if (flag_auto_profile && L_IPO_COMP_MODE
>> +      && cgraph_pre_profiling_inlining_done)
>> +    {
>> +      struct cgraph_node *caller = e->caller;
>> +      if (caller->global.inlined_to)
>> + caller = caller->global.inlined_to;
>> +      if (cgraph_lipo_get_resolved_node (caller->symbol.decl) != caller)
>> + update_original = false;
>> +    }
>>    n = cgraph_clone_node (e->callee, e->callee->symbol.decl,
>>   e->count, e->frequency,
>>   update_original, vNULL, true);
Xinliang David Li Oct. 29, 2013, 5:42 p.m. UTC | #3
The situation you described is worse -- hopefully it will be addressed
in the next version of lipo.

The change is ok.

David

On Tue, Oct 29, 2013 at 10:08 AM, Dehao Chen <dehao@google.com> wrote:
> On Mon, Oct 28, 2013 at 9:49 PM, Xinliang David Li <davidxl@google.com> wrote:
>> Is it sufficient to check if the final caller is defined in primary module?
>
> This might not be sufficient because the final caller may come from
> comdat of aux-modules (not defined in the primary module).
>
>>
>> Note that in some cases, doing profile update 'speculatively' (without
>> your fix) can be more precise (assuming the inlining gets materialized
>> in a different compilation), but in general it might be better to be
>> conservative in profile update (as in your patch) -- the downside is
>> you may end up with less aggressive size optimization.
>
> This is actually no speculatively update profile, but double update.
>
> E.g. comdat_foo()-->bar() with count 100
>
> and comdat_foo() has been defined in 2 aux-modules. Then there will be
> two edges from comdat_foo()-->bar(), each of which has the count 100.
> But bar()'s entry count is only 100 (assume comdat_foo is the only
> caller). Then if we update bar() twice when inline these two edges,
> the second update will be wrong.
>
> Dehao
>
>>
>> David
>>
>> On Mon, Oct 28, 2013 at 3:51 PM, Dehao Chen <dehao@google.com> wrote:
>>> This patch changes to no update callee count if caller count is not a
>>> resolved node (in LIPO mode) during AutoFDO compilation. This is
>>> because AutoFDO will have the same edge counts for all unresolved
>>> nodes. Original update method will lead to multi-update of the callee.
>>>
>>> Bootstrapped and testing on going.
>>>
>>> OK for google-4_8 if test is OK?
>>>
>>> Thanks,
>>> Dehao
>>>
>>> Index: gcc/ipa-inline-transform.c
>>> ===================================================================
>>> --- gcc/ipa-inline-transform.c (revision 204131)
>>> +++ gcc/ipa-inline-transform.c (working copy)
>>> @@ -169,6 +169,15 @@ clone_inlined_nodes (struct cgraph_edge *e, bool d
>>>        else
>>>   {
>>>    struct cgraph_node *n;
>>> +  if (flag_auto_profile && L_IPO_COMP_MODE
>>> +      && cgraph_pre_profiling_inlining_done)
>>> +    {
>>> +      struct cgraph_node *caller = e->caller;
>>> +      if (caller->global.inlined_to)
>>> + caller = caller->global.inlined_to;
>>> +      if (cgraph_lipo_get_resolved_node (caller->symbol.decl) != caller)
>>> + update_original = false;
>>> +    }
>>>    n = cgraph_clone_node (e->callee, e->callee->symbol.decl,
>>>   e->count, e->frequency,
>>>   update_original, vNULL, true);
diff mbox

Patch

Index: gcc/ipa-inline-transform.c
===================================================================
--- gcc/ipa-inline-transform.c (revision 204131)
+++ gcc/ipa-inline-transform.c (working copy)
@@ -169,6 +169,15 @@  clone_inlined_nodes (struct cgraph_edge *e, bool d
       else
  {
   struct cgraph_node *n;
+  if (flag_auto_profile && L_IPO_COMP_MODE
+      && cgraph_pre_profiling_inlining_done)
+    {
+      struct cgraph_node *caller = e->caller;
+      if (caller->global.inlined_to)
+ caller = caller->global.inlined_to;
+      if (cgraph_lipo_get_resolved_node (caller->symbol.decl) != caller)
+ update_original = false;
+    }
   n = cgraph_clone_node (e->callee, e->callee->symbol.decl,
  e->count, e->frequency,
  update_original, vNULL, true);