decide edge's hotness when there is profile info

Message ID	CAO2gOZVX2QuoUGRy6uRiT4A+=xdeCXmq04tF0Qr1kqezP-y5-A@mail.gmail.com
State	New
Headers	show Return-Path: <gcc-patches-return-351140-incoming=patchwork.ozlabs.org@gcc.gnu.org> DomainKey-Signature: a=rsa-sha1; c=nofws; d=gcc.gnu.org; h=list-id :list-unsubscribe:list-archive:list-post:list-help:sender :mime-version:date:message-id:subject:from:to:cc:content-type; q=dns; s=default; b=H+jEJZOEXpeauu4uBSGy6SxbXHGACN2tW0enU1oIMlQ 2K1JsAs9VHpg8WpBG216JvilYq9y1JFGoiF2YMLXUCG9QpH0DsE3xBwo8TTp64JW AwedpsW/K2JsBKWIqpmeNlGkPsdkOop7pRWaG7mM4WnA9YAJm5Y/bqKrxT/w+D6k = Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm Precedence: bulk Sender: gcc-patches-owner@gcc.gnu.org MIME-Version: 1.0 Date: Mon, 14 Oct 2013 09:08:53 -0700 Message-ID: <CAO2gOZVX2QuoUGRy6uRiT4A+=xdeCXmq04tF0Qr1kqezP-y5-A@mail.gmail.com> Subject: [PATCH] decide edge's hotness when there is profile info From: Dehao Chen <dehao@google.com> To: GCC Patches <gcc-patches@gcc.gnu.org> Cc: Jan Hubicka <hubicka@ucw.cz>, David Li <davidxl@google.com> Content-Type: text/plain; charset=ISO-8859-1

Dehao Chen Oct. 14, 2013, 4:08 p.m. UTC

This patch forces to use profile info to check if an edge is hot when
profile is available.

Bootstrapped and passed regression tests.

OK for trunk?

Thanks,
Dehao

gcc/ChangeLog:
2013-10-14  Dehao Chen  <dehao@google.com>
* predict.c(cgraph_maybe_hot_edge_p): Decide edge's hotness from profile.

Xinliang David Li Oct. 14, 2013, 4:13 p.m. UTC | #1

Looks like there is some inconsistency between edge hotness and callee
frequency?

David

On Mon, Oct 14, 2013 at 9:08 AM, Dehao Chen <dehao@google.com> wrote:
> This patch forces to use profile info to check if an edge is hot when
> profile is available.
>
> Bootstrapped and passed regression tests.
>
> OK for trunk?
>
> Thanks,
> Dehao
>
> gcc/ChangeLog:
> 2013-10-14  Dehao Chen  <dehao@google.com>
> * predict.c(cgraph_maybe_hot_edge_p): Decide edge's hotness from profile.
>
> Index: gcc/predict.c
> ===================================================================
> --- gcc/predict.c (revision 203568)
> +++ gcc/predict.c (working copy)
> @@ -185,10 +185,8 @@ maybe_hot_bb_p (struct function *fun, const_basic_
>  bool
>  cgraph_maybe_hot_edge_p (struct cgraph_edge *edge)
>  {
> -  if (profile_info && flag_branch_probabilities
> -      && !maybe_hot_count_p (NULL,
> -                             edge->count))
> -    return false;
> +  if (profile_info && flag_branch_probabilities)
> +    return maybe_hot_count_p (NULL, edge->count);
>    if (edge->caller->frequency == NODE_FREQUENCY_UNLIKELY_EXECUTED
>        || (edge->callee
>    && edge->callee->frequency == NODE_FREQUENCY_UNLIKELY_EXECUTED))

Dehao Chen Oct. 14, 2013, 4:19 p.m. UTC | #2

Not for instrumented FDO (not as I know of). But for AutoFDO, this
could be a potential risk because some callee is marked unlikely
executed simply because they are inlined and eliminated in the O2
binary. But in ipa-inline it will not get inlined because the edge is
not hot from cgraph_maybe_hot_edge_p (because callee is
UNLIKELY_EXECUTED), while the edge->count is actually hot.

Dehao

On Mon, Oct 14, 2013 at 9:13 AM, Xinliang David Li <davidxl@google.com> wrote:
> Looks like there is some inconsistency between edge hotness and callee
> frequency?
>
> David
>
> On Mon, Oct 14, 2013 at 9:08 AM, Dehao Chen <dehao@google.com> wrote:
>> This patch forces to use profile info to check if an edge is hot when
>> profile is available.
>>
>> Bootstrapped and passed regression tests.
>>
>> OK for trunk?
>>
>> Thanks,
>> Dehao
>>
>> gcc/ChangeLog:
>> 2013-10-14  Dehao Chen  <dehao@google.com>
>> * predict.c(cgraph_maybe_hot_edge_p): Decide edge's hotness from profile.
>>
>> Index: gcc/predict.c
>> ===================================================================
>> --- gcc/predict.c (revision 203568)
>> +++ gcc/predict.c (working copy)
>> @@ -185,10 +185,8 @@ maybe_hot_bb_p (struct function *fun, const_basic_
>>  bool
>>  cgraph_maybe_hot_edge_p (struct cgraph_edge *edge)
>>  {
>> -  if (profile_info && flag_branch_probabilities
>> -      && !maybe_hot_count_p (NULL,
>> -                             edge->count))
>> -    return false;
>> +  if (profile_info && flag_branch_probabilities)
>> +    return maybe_hot_count_p (NULL, edge->count);
>>    if (edge->caller->frequency == NODE_FREQUENCY_UNLIKELY_EXECUTED
>>        || (edge->callee
>>    && edge->callee->frequency == NODE_FREQUENCY_UNLIKELY_EXECUTED))

Jan Hubicka Oct. 14, 2013, 7:49 p.m. UTC | #3

> Not for instrumented FDO (not as I know of). But for AutoFDO, this
> could be a potential risk because some callee is marked unlikely
> executed simply because they are inlined and eliminated in the O2
> binary. But in ipa-inline it will not get inlined because the edge is
> not hot from cgraph_maybe_hot_edge_p (because callee is
> UNLIKELY_EXECUTED), while the edge->count is actually hot.

Can't you prevent setting calle to UNLIKELY_EXECUTED in these cases instead?
It seems that having profile set incorrectly will lead to other problems later, too.
We discussed similar problem with Teresa about the missing profiles for comdat,
basically one should detect these cases as profile being lost and go with guessed
profile.  (I believe patch for that was posted, too, and so far it seems best approach
to this issue)

Honza

Dehao Chen Oct. 14, 2013, 9:18 p.m. UTC | #4

On Mon, Oct 14, 2013 at 12:49 PM, Jan Hubicka <hubicka@ucw.cz> wrote:
>> Not for instrumented FDO (not as I know of). But for AutoFDO, this
>> could be a potential risk because some callee is marked unlikely
>> executed simply because they are inlined and eliminated in the O2
>> binary. But in ipa-inline it will not get inlined because the edge is
>> not hot from cgraph_maybe_hot_edge_p (because callee is
>> UNLIKELY_EXECUTED), while the edge->count is actually hot.
>
> Can't you prevent setting calle to UNLIKELY_EXECUTED in these cases instead?
> It seems that having profile set incorrectly will lead to other problems later, too.
> We discussed similar problem with Teresa about the missing profiles for comdat,
> basically one should detect these cases as profile being lost and go with guessed
> profile.  (I believe patch for that was posted, too, and so far it seems best approach
> to this issue)

The current AutoFDO implementation will take all functions that do not
have have profile as normally executed, thus use guessed profile for
it. This is like using profile for truly hot functions, and using O2
for other functions. This works fine. However, it leads to larger code
size (approximately 10%~20% larger than FDO).

I'd like to introduce another mode for users who care about both
performance and code size, and can be sure that profile is
representative. In this mode, we will mark all functions without
sample as "unlikely executed". However, because AutoFDO use debug info
(of optimized code) to represent profile, it's possible that some hot
functions (say foo) are inlined and fully eliminated into another hot
function (say bar). So in the profile, bar is cold, and because the
profile for foo::bar is eliminated, bar will not be inlined into foo
before the profile annotation. However, after profile annotate, we can
infer from the bb count that foo->bar is hot, thus it should be
inlined in ipa-inline phase. However, because bar itself is marked
UNLIKELY_EXECUTED, it will not be inlined.

One possible workaround would be that during rebuild_cgraph_edges, if
we find an edge's callee is unlikely executed, add the edge count to
the callee's count and recalculate callee's frequency.

Dehao

>
> Honza

Xinliang David Li Oct. 14, 2013, 9:27 p.m. UTC | #5

Is it possible to update the callee node summary after profile
annotate (using information from inline instances which are not
inlined in early inline)?

David

On Mon, Oct 14, 2013 at 2:18 PM, Dehao Chen <dehao@google.com> wrote:
> On Mon, Oct 14, 2013 at 12:49 PM, Jan Hubicka <hubicka@ucw.cz> wrote:
>>> Not for instrumented FDO (not as I know of). But for AutoFDO, this
>>> could be a potential risk because some callee is marked unlikely
>>> executed simply because they are inlined and eliminated in the O2
>>> binary. But in ipa-inline it will not get inlined because the edge is
>>> not hot from cgraph_maybe_hot_edge_p (because callee is
>>> UNLIKELY_EXECUTED), while the edge->count is actually hot.
>>
>> Can't you prevent setting calle to UNLIKELY_EXECUTED in these cases instead?
>> It seems that having profile set incorrectly will lead to other problems later, too.
>> We discussed similar problem with Teresa about the missing profiles for comdat,
>> basically one should detect these cases as profile being lost and go with guessed
>> profile.  (I believe patch for that was posted, too, and so far it seems best approach
>> to this issue)
>
> The current AutoFDO implementation will take all functions that do not
> have have profile as normally executed, thus use guessed profile for
> it. This is like using profile for truly hot functions, and using O2
> for other functions. This works fine. However, it leads to larger code
> size (approximately 10%~20% larger than FDO).
>
> I'd like to introduce another mode for users who care about both
> performance and code size, and can be sure that profile is
> representative. In this mode, we will mark all functions without
> sample as "unlikely executed". However, because AutoFDO use debug info
> (of optimized code) to represent profile, it's possible that some hot
> functions (say foo) are inlined and fully eliminated into another hot
> function (say bar). So in the profile, bar is cold, and because the
> profile for foo::bar is eliminated, bar will not be inlined into foo
> before the profile annotation. However, after profile annotate, we can
> infer from the bb count that foo->bar is hot, thus it should be
> inlined in ipa-inline phase. However, because bar itself is marked
> UNLIKELY_EXECUTED, it will not be inlined.
>
> One possible workaround would be that during rebuild_cgraph_edges, if
> we find an edge's callee is unlikely executed, add the edge count to
> the callee's count and recalculate callee's frequency.
>
> Dehao
>
>>
>> Honza

Dehao Chen Oct. 14, 2013, 9:34 p.m. UTC | #6

For my test case, the entire inline instance is optimized away, so
there is no info about it in the profile. I can do some fixup in the
rebuild_cgraph_edge though.

Dehao

On Mon, Oct 14, 2013 at 2:27 PM, Xinliang David Li <davidxl@google.com> wrote:
> Is it possible to update the callee node summary after profile
> annotate (using information from inline instances which are not
> inlined in early inline)?
>
> David
>
> On Mon, Oct 14, 2013 at 2:18 PM, Dehao Chen <dehao@google.com> wrote:
>> On Mon, Oct 14, 2013 at 12:49 PM, Jan Hubicka <hubicka@ucw.cz> wrote:
>>>> Not for instrumented FDO (not as I know of). But for AutoFDO, this
>>>> could be a potential risk because some callee is marked unlikely
>>>> executed simply because they are inlined and eliminated in the O2
>>>> binary. But in ipa-inline it will not get inlined because the edge is
>>>> not hot from cgraph_maybe_hot_edge_p (because callee is
>>>> UNLIKELY_EXECUTED), while the edge->count is actually hot.
>>>
>>> Can't you prevent setting calle to UNLIKELY_EXECUTED in these cases instead?
>>> It seems that having profile set incorrectly will lead to other problems later, too.
>>> We discussed similar problem with Teresa about the missing profiles for comdat,
>>> basically one should detect these cases as profile being lost and go with guessed
>>> profile.  (I believe patch for that was posted, too, and so far it seems best approach
>>> to this issue)
>>
>> The current AutoFDO implementation will take all functions that do not
>> have have profile as normally executed, thus use guessed profile for
>> it. This is like using profile for truly hot functions, and using O2
>> for other functions. This works fine. However, it leads to larger code
>> size (approximately 10%~20% larger than FDO).
>>
>> I'd like to introduce another mode for users who care about both
>> performance and code size, and can be sure that profile is
>> representative. In this mode, we will mark all functions without
>> sample as "unlikely executed". However, because AutoFDO use debug info
>> (of optimized code) to represent profile, it's possible that some hot
>> functions (say foo) are inlined and fully eliminated into another hot
>> function (say bar). So in the profile, bar is cold, and because the
>> profile for foo::bar is eliminated, bar will not be inlined into foo
>> before the profile annotation. However, after profile annotate, we can
>> infer from the bb count that foo->bar is hot, thus it should be
>> inlined in ipa-inline phase. However, because bar itself is marked
>> UNLIKELY_EXECUTED, it will not be inlined.
>>
>> One possible workaround would be that during rebuild_cgraph_edges, if
>> we find an edge's callee is unlikely executed, add the edge count to
>> the callee's count and recalculate callee's frequency.
>>
>> Dehao
>>
>>>
>>> Honza

Jan Hubicka Oct. 14, 2013, 9:50 p.m. UTC | #7

> For my test case, the entire inline instance is optimized away, so
> there is no info about it in the profile. I can do some fixup in the
> rebuild_cgraph_edge though.

Yep, I understand that.  In this case we should turn PROFILE_READ to PROFILE_GUESSED
and guess the profile when we detect this (i.e. we have edges with non-0 counts into
functions with 0 profile).  That should prvent these from getting UNLIKELY_EXECUTED
and they will be inlined normal way.

Honza
> 
> Dehao
> 
> On Mon, Oct 14, 2013 at 2:27 PM, Xinliang David Li <davidxl@google.com> wrote:
> > Is it possible to update the callee node summary after profile
> > annotate (using information from inline instances which are not
> > inlined in early inline)?
> >
> > David
> >
> > On Mon, Oct 14, 2013 at 2:18 PM, Dehao Chen <dehao@google.com> wrote:
> >> On Mon, Oct 14, 2013 at 12:49 PM, Jan Hubicka <hubicka@ucw.cz> wrote:
> >>>> Not for instrumented FDO (not as I know of). But for AutoFDO, this
> >>>> could be a potential risk because some callee is marked unlikely
> >>>> executed simply because they are inlined and eliminated in the O2
> >>>> binary. But in ipa-inline it will not get inlined because the edge is
> >>>> not hot from cgraph_maybe_hot_edge_p (because callee is
> >>>> UNLIKELY_EXECUTED), while the edge->count is actually hot.
> >>>
> >>> Can't you prevent setting calle to UNLIKELY_EXECUTED in these cases instead?
> >>> It seems that having profile set incorrectly will lead to other problems later, too.
> >>> We discussed similar problem with Teresa about the missing profiles for comdat,
> >>> basically one should detect these cases as profile being lost and go with guessed
> >>> profile.  (I believe patch for that was posted, too, and so far it seems best approach
> >>> to this issue)
> >>
> >> The current AutoFDO implementation will take all functions that do not
> >> have have profile as normally executed, thus use guessed profile for
> >> it. This is like using profile for truly hot functions, and using O2
> >> for other functions. This works fine. However, it leads to larger code
> >> size (approximately 10%~20% larger than FDO).
> >>
> >> I'd like to introduce another mode for users who care about both
> >> performance and code size, and can be sure that profile is
> >> representative. In this mode, we will mark all functions without
> >> sample as "unlikely executed". However, because AutoFDO use debug info
> >> (of optimized code) to represent profile, it's possible that some hot
> >> functions (say foo) are inlined and fully eliminated into another hot
> >> function (say bar). So in the profile, bar is cold, and because the
> >> profile for foo::bar is eliminated, bar will not be inlined into foo
> >> before the profile annotation. However, after profile annotate, we can
> >> infer from the bb count that foo->bar is hot, thus it should be
> >> inlined in ipa-inline phase. However, because bar itself is marked
> >> UNLIKELY_EXECUTED, it will not be inlined.
> >>
> >> One possible workaround would be that during rebuild_cgraph_edges, if
> >> we find an edge's callee is unlikely executed, add the edge count to
> >> the callee's count and recalculate callee's frequency.
> >>
> >> Dehao
> >>
> >>>
> >>> Honza

Xinliang David Li Oct. 14, 2013, 10:04 p.m. UTC | #8

On Mon, Oct 14, 2013 at 2:34 PM, Dehao Chen <dehao@google.com> wrote:
> For my test case, the entire inline instance is optimized away,

do you mean there is no out of line instance for the target function
in the profile binary?

David

> so
> there is no info about it in the profile. I can do some fixup in the
> rebuild_cgraph_edge though.
>
> Dehao
>
> On Mon, Oct 14, 2013 at 2:27 PM, Xinliang David Li <davidxl@google.com> wrote:
>> Is it possible to update the callee node summary after profile
>> annotate (using information from inline instances which are not
>> inlined in early inline)?
>>
>> David
>>
>> On Mon, Oct 14, 2013 at 2:18 PM, Dehao Chen <dehao@google.com> wrote:
>>> On Mon, Oct 14, 2013 at 12:49 PM, Jan Hubicka <hubicka@ucw.cz> wrote:
>>>>> Not for instrumented FDO (not as I know of). But for AutoFDO, this
>>>>> could be a potential risk because some callee is marked unlikely
>>>>> executed simply because they are inlined and eliminated in the O2
>>>>> binary. But in ipa-inline it will not get inlined because the edge is
>>>>> not hot from cgraph_maybe_hot_edge_p (because callee is
>>>>> UNLIKELY_EXECUTED), while the edge->count is actually hot.
>>>>
>>>> Can't you prevent setting calle to UNLIKELY_EXECUTED in these cases instead?
>>>> It seems that having profile set incorrectly will lead to other problems later, too.
>>>> We discussed similar problem with Teresa about the missing profiles for comdat,
>>>> basically one should detect these cases as profile being lost and go with guessed
>>>> profile.  (I believe patch for that was posted, too, and so far it seems best approach
>>>> to this issue)
>>>
>>> The current AutoFDO implementation will take all functions that do not
>>> have have profile as normally executed, thus use guessed profile for
>>> it. This is like using profile for truly hot functions, and using O2
>>> for other functions. This works fine. However, it leads to larger code
>>> size (approximately 10%~20% larger than FDO).
>>>
>>> I'd like to introduce another mode for users who care about both
>>> performance and code size, and can be sure that profile is
>>> representative. In this mode, we will mark all functions without
>>> sample as "unlikely executed". However, because AutoFDO use debug info
>>> (of optimized code) to represent profile, it's possible that some hot
>>> functions (say foo) are inlined and fully eliminated into another hot
>>> function (say bar). So in the profile, bar is cold, and because the
>>> profile for foo::bar is eliminated, bar will not be inlined into foo
>>> before the profile annotation. However, after profile annotate, we can
>>> infer from the bb count that foo->bar is hot, thus it should be
>>> inlined in ipa-inline phase. However, because bar itself is marked
>>> UNLIKELY_EXECUTED, it will not be inlined.
>>>
>>> One possible workaround would be that during rebuild_cgraph_edges, if
>>> we find an edge's callee is unlikely executed, add the edge count to
>>> the callee's count and recalculate callee's frequency.
>>>
>>> Dehao
>>>
>>>>
>>>> Honza

Dehao Chen Oct. 14, 2013, 10:23 p.m. UTC | #9

On Mon, Oct 14, 2013 at 3:04 PM, Xinliang David Li <davidxl@google.com> wrote:
> On Mon, Oct 14, 2013 at 2:34 PM, Dehao Chen <dehao@google.com> wrote:
>> For my test case, the entire inline instance is optimized away,
>
> do you mean there is no out of line instance for the target function
> in the profile binary?

Yes, and there is no inline instance either.

Dehao

>
> David
>
>> so
>> there is no info about it in the profile. I can do some fixup in the
>> rebuild_cgraph_edge though.
>>
>> Dehao
>>
>> On Mon, Oct 14, 2013 at 2:27 PM, Xinliang David Li <davidxl@google.com> wrote:
>>> Is it possible to update the callee node summary after profile
>>> annotate (using information from inline instances which are not
>>> inlined in early inline)?
>>>
>>> David
>>>
>>> On Mon, Oct 14, 2013 at 2:18 PM, Dehao Chen <dehao@google.com> wrote:
>>>> On Mon, Oct 14, 2013 at 12:49 PM, Jan Hubicka <hubicka@ucw.cz> wrote:
>>>>>> Not for instrumented FDO (not as I know of). But for AutoFDO, this
>>>>>> could be a potential risk because some callee is marked unlikely
>>>>>> executed simply because they are inlined and eliminated in the O2
>>>>>> binary. But in ipa-inline it will not get inlined because the edge is
>>>>>> not hot from cgraph_maybe_hot_edge_p (because callee is
>>>>>> UNLIKELY_EXECUTED), while the edge->count is actually hot.
>>>>>
>>>>> Can't you prevent setting calle to UNLIKELY_EXECUTED in these cases instead?
>>>>> It seems that having profile set incorrectly will lead to other problems later, too.
>>>>> We discussed similar problem with Teresa about the missing profiles for comdat,
>>>>> basically one should detect these cases as profile being lost and go with guessed
>>>>> profile.  (I believe patch for that was posted, too, and so far it seems best approach
>>>>> to this issue)
>>>>
>>>> The current AutoFDO implementation will take all functions that do not
>>>> have have profile as normally executed, thus use guessed profile for
>>>> it. This is like using profile for truly hot functions, and using O2
>>>> for other functions. This works fine. However, it leads to larger code
>>>> size (approximately 10%~20% larger than FDO).
>>>>
>>>> I'd like to introduce another mode for users who care about both
>>>> performance and code size, and can be sure that profile is
>>>> representative. In this mode, we will mark all functions without
>>>> sample as "unlikely executed". However, because AutoFDO use debug info
>>>> (of optimized code) to represent profile, it's possible that some hot
>>>> functions (say foo) are inlined and fully eliminated into another hot
>>>> function (say bar). So in the profile, bar is cold, and because the
>>>> profile for foo::bar is eliminated, bar will not be inlined into foo
>>>> before the profile annotation. However, after profile annotate, we can
>>>> infer from the bb count that foo->bar is hot, thus it should be
>>>> inlined in ipa-inline phase. However, because bar itself is marked
>>>> UNLIKELY_EXECUTED, it will not be inlined.
>>>>
>>>> One possible workaround would be that during rebuild_cgraph_edges, if
>>>> we find an edge's callee is unlikely executed, add the edge count to
>>>> the callee's count and recalculate callee's frequency.
>>>>
>>>> Dehao
>>>>
>>>>>
>>>>> Honza

Dehao Chen Oct. 14, 2013, 10:26 p.m. UTC | #10

On Mon, Oct 14, 2013 at 2:50 PM, Jan Hubicka <hubicka@ucw.cz> wrote:
>> For my test case, the entire inline instance is optimized away, so
>> there is no info about it in the profile. I can do some fixup in the
>> rebuild_cgraph_edge though.
>
> Yep, I understand that.  In this case we should turn PROFILE_READ to PROFILE_GUESSED
> and guess the profile when we detect this (i.e. we have edges with non-0 counts into
> functions with 0 profile).  That should prvent these from getting UNLIKELY_EXECUTED
> and they will be inlined normal way.

Oh, actually in AutoFDO, only functions with samples will be marked as
PROFILE_READ. Others will all be marked as PROFILE_GUESSED.

Dehao

>
> Honza
>>
>> Dehao
>>
>> On Mon, Oct 14, 2013 at 2:27 PM, Xinliang David Li <davidxl@google.com> wrote:
>> > Is it possible to update the callee node summary after profile
>> > annotate (using information from inline instances which are not
>> > inlined in early inline)?
>> >
>> > David
>> >
>> > On Mon, Oct 14, 2013 at 2:18 PM, Dehao Chen <dehao@google.com> wrote:
>> >> On Mon, Oct 14, 2013 at 12:49 PM, Jan Hubicka <hubicka@ucw.cz> wrote:
>> >>>> Not for instrumented FDO (not as I know of). But for AutoFDO, this
>> >>>> could be a potential risk because some callee is marked unlikely
>> >>>> executed simply because they are inlined and eliminated in the O2
>> >>>> binary. But in ipa-inline it will not get inlined because the edge is
>> >>>> not hot from cgraph_maybe_hot_edge_p (because callee is
>> >>>> UNLIKELY_EXECUTED), while the edge->count is actually hot.
>> >>>
>> >>> Can't you prevent setting calle to UNLIKELY_EXECUTED in these cases instead?
>> >>> It seems that having profile set incorrectly will lead to other problems later, too.
>> >>> We discussed similar problem with Teresa about the missing profiles for comdat,
>> >>> basically one should detect these cases as profile being lost and go with guessed
>> >>> profile.  (I believe patch for that was posted, too, and so far it seems best approach
>> >>> to this issue)
>> >>
>> >> The current AutoFDO implementation will take all functions that do not
>> >> have have profile as normally executed, thus use guessed profile for
>> >> it. This is like using profile for truly hot functions, and using O2
>> >> for other functions. This works fine. However, it leads to larger code
>> >> size (approximately 10%~20% larger than FDO).
>> >>
>> >> I'd like to introduce another mode for users who care about both
>> >> performance and code size, and can be sure that profile is
>> >> representative. In this mode, we will mark all functions without
>> >> sample as "unlikely executed". However, because AutoFDO use debug info
>> >> (of optimized code) to represent profile, it's possible that some hot
>> >> functions (say foo) are inlined and fully eliminated into another hot
>> >> function (say bar). So in the profile, bar is cold, and because the
>> >> profile for foo::bar is eliminated, bar will not be inlined into foo
>> >> before the profile annotation. However, after profile annotate, we can
>> >> infer from the bb count that foo->bar is hot, thus it should be
>> >> inlined in ipa-inline phase. However, because bar itself is marked
>> >> UNLIKELY_EXECUTED, it will not be inlined.
>> >>
>> >> One possible workaround would be that during rebuild_cgraph_edges, if
>> >> we find an edge's callee is unlikely executed, add the edge count to
>> >> the callee's count and recalculate callee's frequency.
>> >>
>> >> Dehao
>> >>
>> >>>
>> >>> Honza

decide edge's hotness when there is profile info

Commit Message

Comments

Patch