diff mbox

Fix node weight updates during ipa-cp (issue7812053)

Message ID 20130327172238.545EF80580@tjsboxrox.mtv.corp.google.com
State New
Headers show

Commit Message

Teresa Johnson March 27, 2013, 5:22 p.m. UTC
I found that the node weight updates on cloned nodes during ipa-cp were
leading to incorrect/insane weights. Both the original and new node weight
computations used truncating divides, leading to a loss of total node weight.
I have fixed this by making both rounding integer divides.

Bootstrapped and tested on x86-64-unknown-linux-gnu. Ok for trunk?

2013-03-27  Teresa Johnson  <tejohnson@google.com>

	* ipa-cp.c (update_profiling_info): Perform rounding integer
        division when updating weights instead of truncating.
	(update_specialized_profile): Ditto.


--
This patch is available for review at http://codereview.appspot.com/7812053

Comments

Richard Biener March 28, 2013, 9:27 a.m. UTC | #1
On Wed, Mar 27, 2013 at 6:22 PM, Teresa Johnson <tejohnson@google.com> wrote:
> I found that the node weight updates on cloned nodes during ipa-cp were
> leading to incorrect/insane weights. Both the original and new node weight
> computations used truncating divides, leading to a loss of total node weight.
> I have fixed this by making both rounding integer divides.
>
> Bootstrapped and tested on x86-64-unknown-linux-gnu. Ok for trunk?

I'm sure we can outline a rounding integer divide inline function on
gcov_type.  To gcov-io.h, I suppose.

Otherwise this looks ok to me.

Thanks,
Richard.

> 2013-03-27  Teresa Johnson  <tejohnson@google.com>
>
>         * ipa-cp.c (update_profiling_info): Perform rounding integer
>         division when updating weights instead of truncating.
>         (update_specialized_profile): Ditto.
>
> Index: ipa-cp.c
> ===================================================================
> --- ipa-cp.c    (revision 197118)
> +++ ipa-cp.c    (working copy)
> @@ -2588,14 +2588,18 @@ update_profiling_info (struct cgraph_node *orig_no
>
>    for (cs = new_node->callees; cs ; cs = cs->next_callee)
>      if (cs->frequency)
> -      cs->count = cs->count * (new_sum * REG_BR_PROB_BASE
> -                              / orig_node_count) / REG_BR_PROB_BASE;
> +      cs->count = (cs->count
> +                   * ((new_sum * REG_BR_PROB_BASE + orig_node_count/2)
> +                      / orig_node_count)
> +                   + REG_BR_PROB_BASE/2) / REG_BR_PROB_BASE;
>      else
>        cs->count = 0;
>
>    for (cs = orig_node->callees; cs ; cs = cs->next_callee)
> -    cs->count = cs->count * (remainder * REG_BR_PROB_BASE
> -                            / orig_node_count) / REG_BR_PROB_BASE;
> +    cs->count = (cs->count
> +                 * ((remainder * REG_BR_PROB_BASE + orig_node_count/2)
> +                    / orig_node_count)
> +                 + REG_BR_PROB_BASE/2) / REG_BR_PROB_BASE;
>
>    if (dump_file)
>      dump_profile_updates (orig_node, new_node);
> @@ -2627,14 +2631,19 @@ update_specialized_profile (struct cgraph_node *ne
>
>    for (cs = new_node->callees; cs ; cs = cs->next_callee)
>      if (cs->frequency)
> -      cs->count += cs->count * redirected_sum / new_node_count;
> +      cs->count += (cs->count
> +                    * ((redirected_sum * REG_BR_PROB_BASE
> +                        + new_node_count/2) / new_node_count)
> +                    + REG_BR_PROB_BASE/2) / REG_BR_PROB_BASE;
>      else
>        cs->count = 0;
>
>    for (cs = orig_node->callees; cs ; cs = cs->next_callee)
>      {
> -      gcov_type dec = cs->count * (redirected_sum * REG_BR_PROB_BASE
> -                                  / orig_node_count) / REG_BR_PROB_BASE;
> +      gcov_type dec = (cs->count
> +                       * ((redirected_sum * REG_BR_PROB_BASE
> +                           + orig_node_count/2) / orig_node_count)
> +                       + REG_BR_PROB_BASE/2) / REG_BR_PROB_BASE;
>        if (dec < cs->count)
>         cs->count -= dec;
>        else
>
> --
> This patch is available for review at http://codereview.appspot.com/7812053
Teresa Johnson April 5, 2013, 2:18 p.m. UTC | #2
On Thu, Mar 28, 2013 at 2:27 AM, Richard Biener
<richard.guenther@gmail.com> wrote:
> On Wed, Mar 27, 2013 at 6:22 PM, Teresa Johnson <tejohnson@google.com> wrote:
>> I found that the node weight updates on cloned nodes during ipa-cp were
>> leading to incorrect/insane weights. Both the original and new node weight
>> computations used truncating divides, leading to a loss of total node weight.
>> I have fixed this by making both rounding integer divides.
>>
>> Bootstrapped and tested on x86-64-unknown-linux-gnu. Ok for trunk?
>
> I'm sure we can outline a rounding integer divide inline function on
> gcov_type.  To gcov-io.h, I suppose.
>
> Otherwise this looks ok to me.

Thanks. I went ahead and worked on outlining this functionality. In
the process of doing so, I discovered that there was already a method
in basic-block.h to do part of this: apply_probability(), which does
the rounding divide by REG_BR_PROB_BASE. There is a related function
combine_probabilities() that takes 2 int probabilities instead of a
gcov_type and an int probability. I decided to use apply_probability()
in ipa-cp, and add a new macro GCOV_COMPUTE_SCALE to basic-block.h to
compute the scale factor/probability via a rounding divide. So the
ipa-cp changes I made use both GCOV_COMPUTE_SCALE and
apply_probability.

I then went through all the code to look for instances where we were
computing scale factors/probabilities and performing scaling. I found
a mix of existing uses of apply/combine_probabilities, uses of RDIV,
inlined rounding divides, and truncating divides. I think it would be
good to unify all of this. As a first step, I replaced all inline code
sequences that were already doing rounding divides to compute scale
factors/probabilities or do the scaling, to instead use the
appropriate helper function/macro described above. For these
locations, there should be no change to behavior.

There are a number of places where there are truncating divides right
now. Since changing those may impact the resulting behavior, for this
patch I simply added a comment as to which helper they should use. As
soon as this patch goes in I am planning to change those to use the
appropriate helper and test performance, and then will send that patch
for review. So for this patch, the only place where behavior is
changed is in ipa-cp which was my original change.

New patch is attached. Bootstrapped (both bootstrap and
profiledbootstrap) and tested on x86-64-unknown-linux-gnu. Ok for
trunk?

Thanks,
Teresa

>
> Thanks,
> Richard.
>
>> 2013-03-27  Teresa Johnson  <tejohnson@google.com>
>>
>>         * ipa-cp.c (update_profiling_info): Perform rounding integer
>>         division when updating weights instead of truncating.
>>         (update_specialized_profile): Ditto.
>>
>> Index: ipa-cp.c
>> ===================================================================
>> --- ipa-cp.c    (revision 197118)
>> +++ ipa-cp.c    (working copy)
>> @@ -2588,14 +2588,18 @@ update_profiling_info (struct cgraph_node *orig_no
>>
>>    for (cs = new_node->callees; cs ; cs = cs->next_callee)
>>      if (cs->frequency)
>> -      cs->count = cs->count * (new_sum * REG_BR_PROB_BASE
>> -                              / orig_node_count) / REG_BR_PROB_BASE;
>> +      cs->count = (cs->count
>> +                   * ((new_sum * REG_BR_PROB_BASE + orig_node_count/2)
>> +                      / orig_node_count)
>> +                   + REG_BR_PROB_BASE/2) / REG_BR_PROB_BASE;
>>      else
>>        cs->count = 0;
>>
>>    for (cs = orig_node->callees; cs ; cs = cs->next_callee)
>> -    cs->count = cs->count * (remainder * REG_BR_PROB_BASE
>> -                            / orig_node_count) / REG_BR_PROB_BASE;
>> +    cs->count = (cs->count
>> +                 * ((remainder * REG_BR_PROB_BASE + orig_node_count/2)
>> +                    / orig_node_count)
>> +                 + REG_BR_PROB_BASE/2) / REG_BR_PROB_BASE;
>>
>>    if (dump_file)
>>      dump_profile_updates (orig_node, new_node);
>> @@ -2627,14 +2631,19 @@ update_specialized_profile (struct cgraph_node *ne
>>
>>    for (cs = new_node->callees; cs ; cs = cs->next_callee)
>>      if (cs->frequency)
>> -      cs->count += cs->count * redirected_sum / new_node_count;
>> +      cs->count += (cs->count
>> +                    * ((redirected_sum * REG_BR_PROB_BASE
>> +                        + new_node_count/2) / new_node_count)
>> +                    + REG_BR_PROB_BASE/2) / REG_BR_PROB_BASE;
>>      else
>>        cs->count = 0;
>>
>>    for (cs = orig_node->callees; cs ; cs = cs->next_callee)
>>      {
>> -      gcov_type dec = cs->count * (redirected_sum * REG_BR_PROB_BASE
>> -                                  / orig_node_count) / REG_BR_PROB_BASE;
>> +      gcov_type dec = (cs->count
>> +                       * ((redirected_sum * REG_BR_PROB_BASE
>> +                           + orig_node_count/2) / orig_node_count)
>> +                       + REG_BR_PROB_BASE/2) / REG_BR_PROB_BASE;
>>        if (dec < cs->count)
>>         cs->count -= dec;
>>        else
>>
>> --
>> This patch is available for review at http://codereview.appspot.com/7812053



--
Teresa Johnson | Software Engineer | tejohnson@google.com | 408-460-2413
Richard Biener April 8, 2013, 8:52 a.m. UTC | #3
On Fri, Apr 5, 2013 at 4:18 PM, Teresa Johnson <tejohnson@google.com> wrote:
> On Thu, Mar 28, 2013 at 2:27 AM, Richard Biener
> <richard.guenther@gmail.com> wrote:
>> On Wed, Mar 27, 2013 at 6:22 PM, Teresa Johnson <tejohnson@google.com> wrote:
>>> I found that the node weight updates on cloned nodes during ipa-cp were
>>> leading to incorrect/insane weights. Both the original and new node weight
>>> computations used truncating divides, leading to a loss of total node weight.
>>> I have fixed this by making both rounding integer divides.
>>>
>>> Bootstrapped and tested on x86-64-unknown-linux-gnu. Ok for trunk?
>>
>> I'm sure we can outline a rounding integer divide inline function on
>> gcov_type.  To gcov-io.h, I suppose.
>>
>> Otherwise this looks ok to me.
>
> Thanks. I went ahead and worked on outlining this functionality. In
> the process of doing so, I discovered that there was already a method
> in basic-block.h to do part of this: apply_probability(), which does
> the rounding divide by REG_BR_PROB_BASE. There is a related function
> combine_probabilities() that takes 2 int probabilities instead of a
> gcov_type and an int probability. I decided to use apply_probability()
> in ipa-cp, and add a new macro GCOV_COMPUTE_SCALE to basic-block.h to
> compute the scale factor/probability via a rounding divide. So the
> ipa-cp changes I made use both GCOV_COMPUTE_SCALE and
> apply_probability.
>
> I then went through all the code to look for instances where we were
> computing scale factors/probabilities and performing scaling. I found
> a mix of existing uses of apply/combine_probabilities, uses of RDIV,
> inlined rounding divides, and truncating divides. I think it would be
> good to unify all of this. As a first step, I replaced all inline code
> sequences that were already doing rounding divides to compute scale
> factors/probabilities or do the scaling, to instead use the
> appropriate helper function/macro described above. For these
> locations, there should be no change to behavior.
>
> There are a number of places where there are truncating divides right
> now. Since changing those may impact the resulting behavior, for this
> patch I simply added a comment as to which helper they should use. As
> soon as this patch goes in I am planning to change those to use the
> appropriate helper and test performance, and then will send that patch
> for review. So for this patch, the only place where behavior is
> changed is in ipa-cp which was my original change.
>
> New patch is attached. Bootstrapped (both bootstrap and
> profiledbootstrap) and tested on x86-64-unknown-linux-gnu. Ok for
> trunk?

Ok.

Thanks,
Richard.

> Thanks,
> Teresa
>
>>
>> Thanks,
>> Richard.
>>
>>> 2013-03-27  Teresa Johnson  <tejohnson@google.com>
>>>
>>>         * ipa-cp.c (update_profiling_info): Perform rounding integer
>>>         division when updating weights instead of truncating.
>>>         (update_specialized_profile): Ditto.
>>>
>>> Index: ipa-cp.c
>>> ===================================================================
>>> --- ipa-cp.c    (revision 197118)
>>> +++ ipa-cp.c    (working copy)
>>> @@ -2588,14 +2588,18 @@ update_profiling_info (struct cgraph_node *orig_no
>>>
>>>    for (cs = new_node->callees; cs ; cs = cs->next_callee)
>>>      if (cs->frequency)
>>> -      cs->count = cs->count * (new_sum * REG_BR_PROB_BASE
>>> -                              / orig_node_count) / REG_BR_PROB_BASE;
>>> +      cs->count = (cs->count
>>> +                   * ((new_sum * REG_BR_PROB_BASE + orig_node_count/2)
>>> +                      / orig_node_count)
>>> +                   + REG_BR_PROB_BASE/2) / REG_BR_PROB_BASE;
>>>      else
>>>        cs->count = 0;
>>>
>>>    for (cs = orig_node->callees; cs ; cs = cs->next_callee)
>>> -    cs->count = cs->count * (remainder * REG_BR_PROB_BASE
>>> -                            / orig_node_count) / REG_BR_PROB_BASE;
>>> +    cs->count = (cs->count
>>> +                 * ((remainder * REG_BR_PROB_BASE + orig_node_count/2)
>>> +                    / orig_node_count)
>>> +                 + REG_BR_PROB_BASE/2) / REG_BR_PROB_BASE;
>>>
>>>    if (dump_file)
>>>      dump_profile_updates (orig_node, new_node);
>>> @@ -2627,14 +2631,19 @@ update_specialized_profile (struct cgraph_node *ne
>>>
>>>    for (cs = new_node->callees; cs ; cs = cs->next_callee)
>>>      if (cs->frequency)
>>> -      cs->count += cs->count * redirected_sum / new_node_count;
>>> +      cs->count += (cs->count
>>> +                    * ((redirected_sum * REG_BR_PROB_BASE
>>> +                        + new_node_count/2) / new_node_count)
>>> +                    + REG_BR_PROB_BASE/2) / REG_BR_PROB_BASE;
>>>      else
>>>        cs->count = 0;
>>>
>>>    for (cs = orig_node->callees; cs ; cs = cs->next_callee)
>>>      {
>>> -      gcov_type dec = cs->count * (redirected_sum * REG_BR_PROB_BASE
>>> -                                  / orig_node_count) / REG_BR_PROB_BASE;
>>> +      gcov_type dec = (cs->count
>>> +                       * ((redirected_sum * REG_BR_PROB_BASE
>>> +                           + orig_node_count/2) / orig_node_count)
>>> +                       + REG_BR_PROB_BASE/2) / REG_BR_PROB_BASE;
>>>        if (dec < cs->count)
>>>         cs->count -= dec;
>>>        else
>>>
>>> --
>>> This patch is available for review at http://codereview.appspot.com/7812053
>
>
>
> --
> Teresa Johnson | Software Engineer | tejohnson@google.com | 408-460-2413
Jan Hubicka April 22, 2013, 5:27 p.m. UTC | #4
> On Wed, Mar 27, 2013 at 6:22 PM, Teresa Johnson <tejohnson@google.com> wrote:
> > I found that the node weight updates on cloned nodes during ipa-cp were
> > leading to incorrect/insane weights. Both the original and new node weight
> > computations used truncating divides, leading to a loss of total node weight.
> > I have fixed this by making both rounding integer divides.
> >
> > Bootstrapped and tested on x86-64-unknown-linux-gnu. Ok for trunk?
> 
> I'm sure we can outline a rounding integer divide inline function on
> gcov_type.  To gcov-io.h, I suppose.

Most of code currently use RDIV for rounding divides (at lest I am slowly trying
to migrate to that).  

Honza
Teresa Johnson April 22, 2013, 5:38 p.m. UTC | #5
Hi Honza,

I converted all other weight update locations to use the helper
functions in basic-block.h instead of truncation (the patch I checked
in a couple weeks ago covered the cases that already used RDIV - see
the follow-on messages in this thread). I am almost done testing with
SPEC cpu2006. So far I haven't seen any performance effects, so I am
hoping to send this for review today or tomorrow.

Thanks,
Teresa

On Mon, Apr 22, 2013 at 10:27 AM, Jan Hubicka <hubicka@ucw.cz> wrote:
>> On Wed, Mar 27, 2013 at 6:22 PM, Teresa Johnson <tejohnson@google.com> wrote:
>> > I found that the node weight updates on cloned nodes during ipa-cp were
>> > leading to incorrect/insane weights. Both the original and new node weight
>> > computations used truncating divides, leading to a loss of total node weight.
>> > I have fixed this by making both rounding integer divides.
>> >
>> > Bootstrapped and tested on x86-64-unknown-linux-gnu. Ok for trunk?
>>
>> I'm sure we can outline a rounding integer divide inline function on
>> gcov_type.  To gcov-io.h, I suppose.
>
> Most of code currently use RDIV for rounding divides (at lest I am slowly trying
> to migrate to that).
>
> Honza
H.J. Lu April 25, 2013, 10:29 p.m. UTC | #6
On Fri, Apr 5, 2013 at 7:18 AM, Teresa Johnson <tejohnson@google.com> wrote:
> On Thu, Mar 28, 2013 at 2:27 AM, Richard Biener
> <richard.guenther@gmail.com> wrote:
>> On Wed, Mar 27, 2013 at 6:22 PM, Teresa Johnson <tejohnson@google.com> wrote:
>>> I found that the node weight updates on cloned nodes during ipa-cp were
>>> leading to incorrect/insane weights. Both the original and new node weight
>>> computations used truncating divides, leading to a loss of total node weight.
>>> I have fixed this by making both rounding integer divides.
>>>
>>> Bootstrapped and tested on x86-64-unknown-linux-gnu. Ok for trunk?
>>
>> I'm sure we can outline a rounding integer divide inline function on
>> gcov_type.  To gcov-io.h, I suppose.
>>
>> Otherwise this looks ok to me.
>
> Thanks. I went ahead and worked on outlining this functionality. In
> the process of doing so, I discovered that there was already a method
> in basic-block.h to do part of this: apply_probability(), which does
> the rounding divide by REG_BR_PROB_BASE. There is a related function
> combine_probabilities() that takes 2 int probabilities instead of a
> gcov_type and an int probability. I decided to use apply_probability()
> in ipa-cp, and add a new macro GCOV_COMPUTE_SCALE to basic-block.h to
> compute the scale factor/probability via a rounding divide. So the
> ipa-cp changes I made use both GCOV_COMPUTE_SCALE and
> apply_probability.
>
> I then went through all the code to look for instances where we were
> computing scale factors/probabilities and performing scaling. I found
> a mix of existing uses of apply/combine_probabilities, uses of RDIV,
> inlined rounding divides, and truncating divides. I think it would be
> good to unify all of this. As a first step, I replaced all inline code
> sequences that were already doing rounding divides to compute scale
> factors/probabilities or do the scaling, to instead use the
> appropriate helper function/macro described above. For these
> locations, there should be no change to behavior.
>
> There are a number of places where there are truncating divides right
> now. Since changing those may impact the resulting behavior, for this
> patch I simply added a comment as to which helper they should use. As
> soon as this patch goes in I am planning to change those to use the
> appropriate helper and test performance, and then will send that patch
> for review. So for this patch, the only place where behavior is
> changed is in ipa-cp which was my original change.
>
> New patch is attached. Bootstrapped (both bootstrap and
> profiledbootstrap) and tested on x86-64-unknown-linux-gnu. Ok for
> trunk?
>

This caused:

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=57077


H.J.
Teresa Johnson April 25, 2013, 10:34 p.m. UTC | #7
I'll take a look. Teresa

On Thu, Apr 25, 2013 at 3:29 PM, H.J. Lu <hjl.tools@gmail.com> wrote:
> On Fri, Apr 5, 2013 at 7:18 AM, Teresa Johnson <tejohnson@google.com> wrote:
>> On Thu, Mar 28, 2013 at 2:27 AM, Richard Biener
>> <richard.guenther@gmail.com> wrote:
>>> On Wed, Mar 27, 2013 at 6:22 PM, Teresa Johnson <tejohnson@google.com> wrote:
>>>> I found that the node weight updates on cloned nodes during ipa-cp were
>>>> leading to incorrect/insane weights. Both the original and new node weight
>>>> computations used truncating divides, leading to a loss of total node weight.
>>>> I have fixed this by making both rounding integer divides.
>>>>
>>>> Bootstrapped and tested on x86-64-unknown-linux-gnu. Ok for trunk?
>>>
>>> I'm sure we can outline a rounding integer divide inline function on
>>> gcov_type.  To gcov-io.h, I suppose.
>>>
>>> Otherwise this looks ok to me.
>>
>> Thanks. I went ahead and worked on outlining this functionality. In
>> the process of doing so, I discovered that there was already a method
>> in basic-block.h to do part of this: apply_probability(), which does
>> the rounding divide by REG_BR_PROB_BASE. There is a related function
>> combine_probabilities() that takes 2 int probabilities instead of a
>> gcov_type and an int probability. I decided to use apply_probability()
>> in ipa-cp, and add a new macro GCOV_COMPUTE_SCALE to basic-block.h to
>> compute the scale factor/probability via a rounding divide. So the
>> ipa-cp changes I made use both GCOV_COMPUTE_SCALE and
>> apply_probability.
>>
>> I then went through all the code to look for instances where we were
>> computing scale factors/probabilities and performing scaling. I found
>> a mix of existing uses of apply/combine_probabilities, uses of RDIV,
>> inlined rounding divides, and truncating divides. I think it would be
>> good to unify all of this. As a first step, I replaced all inline code
>> sequences that were already doing rounding divides to compute scale
>> factors/probabilities or do the scaling, to instead use the
>> appropriate helper function/macro described above. For these
>> locations, there should be no change to behavior.
>>
>> There are a number of places where there are truncating divides right
>> now. Since changing those may impact the resulting behavior, for this
>> patch I simply added a comment as to which helper they should use. As
>> soon as this patch goes in I am planning to change those to use the
>> appropriate helper and test performance, and then will send that patch
>> for review. So for this patch, the only place where behavior is
>> changed is in ipa-cp which was my original change.
>>
>> New patch is attached. Bootstrapped (both bootstrap and
>> profiledbootstrap) and tested on x86-64-unknown-linux-gnu. Ok for
>> trunk?
>>
>
> This caused:
>
> http://gcc.gnu.org/bugzilla/show_bug.cgi?id=57077
>
>
> H.J.
Teresa Johnson April 26, 2013, 5:19 a.m. UTC | #8
Reproduced. This looks like another instance of a case I found testing
my follow-on patch: the helper routines have some assertion checking
that is too strict for the broader usage where we may be scaling
counts up and not just down. I am verifying and will send a patch in
the morning that suppresses this assert, which is the approach I am
taking in the follow-on patch also coming tomorrow.

Teresa

On Thu, Apr 25, 2013 at 3:29 PM, H.J. Lu <hjl.tools@gmail.com> wrote:
> On Fri, Apr 5, 2013 at 7:18 AM, Teresa Johnson <tejohnson@google.com> wrote:
>> On Thu, Mar 28, 2013 at 2:27 AM, Richard Biener
>> <richard.guenther@gmail.com> wrote:
>>> On Wed, Mar 27, 2013 at 6:22 PM, Teresa Johnson <tejohnson@google.com> wrote:
>>>> I found that the node weight updates on cloned nodes during ipa-cp were
>>>> leading to incorrect/insane weights. Both the original and new node weight
>>>> computations used truncating divides, leading to a loss of total node weight.
>>>> I have fixed this by making both rounding integer divides.
>>>>
>>>> Bootstrapped and tested on x86-64-unknown-linux-gnu. Ok for trunk?
>>>
>>> I'm sure we can outline a rounding integer divide inline function on
>>> gcov_type.  To gcov-io.h, I suppose.
>>>
>>> Otherwise this looks ok to me.
>>
>> Thanks. I went ahead and worked on outlining this functionality. In
>> the process of doing so, I discovered that there was already a method
>> in basic-block.h to do part of this: apply_probability(), which does
>> the rounding divide by REG_BR_PROB_BASE. There is a related function
>> combine_probabilities() that takes 2 int probabilities instead of a
>> gcov_type and an int probability. I decided to use apply_probability()
>> in ipa-cp, and add a new macro GCOV_COMPUTE_SCALE to basic-block.h to
>> compute the scale factor/probability via a rounding divide. So the
>> ipa-cp changes I made use both GCOV_COMPUTE_SCALE and
>> apply_probability.
>>
>> I then went through all the code to look for instances where we were
>> computing scale factors/probabilities and performing scaling. I found
>> a mix of existing uses of apply/combine_probabilities, uses of RDIV,
>> inlined rounding divides, and truncating divides. I think it would be
>> good to unify all of this. As a first step, I replaced all inline code
>> sequences that were already doing rounding divides to compute scale
>> factors/probabilities or do the scaling, to instead use the
>> appropriate helper function/macro described above. For these
>> locations, there should be no change to behavior.
>>
>> There are a number of places where there are truncating divides right
>> now. Since changing those may impact the resulting behavior, for this
>> patch I simply added a comment as to which helper they should use. As
>> soon as this patch goes in I am planning to change those to use the
>> appropriate helper and test performance, and then will send that patch
>> for review. So for this patch, the only place where behavior is
>> changed is in ipa-cp which was my original change.
>>
>> New patch is attached. Bootstrapped (both bootstrap and
>> profiledbootstrap) and tested on x86-64-unknown-linux-gnu. Ok for
>> trunk?
>>
>
> This caused:
>
> http://gcc.gnu.org/bugzilla/show_bug.cgi?id=57077
>
>
> H.J.



--
Teresa Johnson | Software Engineer | tejohnson@google.com | 408-460-2413
Teresa Johnson April 29, 2013, 5:31 p.m. UTC | #9
FYI, Fixed in r198416.

Thanks,
Teresa

On Thu, Apr 25, 2013 at 10:19 PM, Teresa Johnson <tejohnson@google.com> wrote:
> Reproduced. This looks like another instance of a case I found testing
> my follow-on patch: the helper routines have some assertion checking
> that is too strict for the broader usage where we may be scaling
> counts up and not just down. I am verifying and will send a patch in
> the morning that suppresses this assert, which is the approach I am
> taking in the follow-on patch also coming tomorrow.
>
> Teresa
>
> On Thu, Apr 25, 2013 at 3:29 PM, H.J. Lu <hjl.tools@gmail.com> wrote:
>> On Fri, Apr 5, 2013 at 7:18 AM, Teresa Johnson <tejohnson@google.com> wrote:
>>> On Thu, Mar 28, 2013 at 2:27 AM, Richard Biener
>>> <richard.guenther@gmail.com> wrote:
>>>> On Wed, Mar 27, 2013 at 6:22 PM, Teresa Johnson <tejohnson@google.com> wrote:
>>>>> I found that the node weight updates on cloned nodes during ipa-cp were
>>>>> leading to incorrect/insane weights. Both the original and new node weight
>>>>> computations used truncating divides, leading to a loss of total node weight.
>>>>> I have fixed this by making both rounding integer divides.
>>>>>
>>>>> Bootstrapped and tested on x86-64-unknown-linux-gnu. Ok for trunk?
>>>>
>>>> I'm sure we can outline a rounding integer divide inline function on
>>>> gcov_type.  To gcov-io.h, I suppose.
>>>>
>>>> Otherwise this looks ok to me.
>>>
>>> Thanks. I went ahead and worked on outlining this functionality. In
>>> the process of doing so, I discovered that there was already a method
>>> in basic-block.h to do part of this: apply_probability(), which does
>>> the rounding divide by REG_BR_PROB_BASE. There is a related function
>>> combine_probabilities() that takes 2 int probabilities instead of a
>>> gcov_type and an int probability. I decided to use apply_probability()
>>> in ipa-cp, and add a new macro GCOV_COMPUTE_SCALE to basic-block.h to
>>> compute the scale factor/probability via a rounding divide. So the
>>> ipa-cp changes I made use both GCOV_COMPUTE_SCALE and
>>> apply_probability.
>>>
>>> I then went through all the code to look for instances where we were
>>> computing scale factors/probabilities and performing scaling. I found
>>> a mix of existing uses of apply/combine_probabilities, uses of RDIV,
>>> inlined rounding divides, and truncating divides. I think it would be
>>> good to unify all of this. As a first step, I replaced all inline code
>>> sequences that were already doing rounding divides to compute scale
>>> factors/probabilities or do the scaling, to instead use the
>>> appropriate helper function/macro described above. For these
>>> locations, there should be no change to behavior.
>>>
>>> There are a number of places where there are truncating divides right
>>> now. Since changing those may impact the resulting behavior, for this
>>> patch I simply added a comment as to which helper they should use. As
>>> soon as this patch goes in I am planning to change those to use the
>>> appropriate helper and test performance, and then will send that patch
>>> for review. So for this patch, the only place where behavior is
>>> changed is in ipa-cp which was my original change.
>>>
>>> New patch is attached. Bootstrapped (both bootstrap and
>>> profiledbootstrap) and tested on x86-64-unknown-linux-gnu. Ok for
>>> trunk?
>>>
>>
>> This caused:
>>
>> http://gcc.gnu.org/bugzilla/show_bug.cgi?id=57077
>>
>>
>> H.J.
>
>
>
> --
> Teresa Johnson | Software Engineer | tejohnson@google.com | 408-460-2413
H.J. Lu April 29, 2013, 7:35 p.m. UTC | #10
On Mon, Apr 29, 2013 at 10:31 AM, Teresa Johnson <tejohnson@google.com> wrote:
> FYI, Fixed in r198416.
>
> Thanks,
> Teresa
>

I noticed that sometimes GCC generates:

_8 = memcpy (ret_6, s_2(D), len_4);
_8 = memcpy (ret_6, s_2(D), len_4);
memcpy (_17, buffer_12(D), add_16);
memcpy (_17, buffer_12(D), add_16);
memcpy (_25, _28, _27);
memcpy (_25, _28, _27);
memcpy (_39, buffer_2, len_4);
memcpy (_39, buffer_2, len_4);
memcpy (_16, &fillbuf, pad_1);
memcpy (_16, &fillbuf, pad_1);
...



--
H.J.
Teresa Johnson April 30, 2013, 2:02 p.m. UTC | #11
On Mon, Apr 29, 2013 at 12:35 PM, H.J. Lu <hjl.tools@gmail.com> wrote:
> On Mon, Apr 29, 2013 at 10:31 AM, Teresa Johnson <tejohnson@google.com> wrote:
>> FYI, Fixed in r198416.
>>
>> Thanks,
>> Teresa
>>
>
> I noticed that sometimes GCC generates:
>
> _8 = memcpy (ret_6, s_2(D), len_4);
> _8 = memcpy (ret_6, s_2(D), len_4);
> memcpy (_17, buffer_12(D), add_16);
> memcpy (_17, buffer_12(D), add_16);
> memcpy (_25, _28, _27);
> memcpy (_25, _28, _27);
> memcpy (_39, buffer_2, len_4);
> memcpy (_39, buffer_2, len_4);
> memcpy (_16, &fillbuf, pad_1);
> memcpy (_16, &fillbuf, pad_1);

I am getting this too with a profiledbootstrap with LTO. However, this
isn't due to my changes. I had confirmed that after reverting my
changes (r197595 and now the follow-on fix r198416) this problem still
occurs.

Teresa

> ...
>
>
>
> --
> H.J.



--
Teresa Johnson | Software Engineer | tejohnson@google.com | 408-460-2413
diff mbox

Patch

Index: ipa-cp.c
===================================================================
--- ipa-cp.c	(revision 197118)
+++ ipa-cp.c	(working copy)
@@ -2588,14 +2588,18 @@  update_profiling_info (struct cgraph_node *orig_no
 
   for (cs = new_node->callees; cs ; cs = cs->next_callee)
     if (cs->frequency)
-      cs->count = cs->count * (new_sum * REG_BR_PROB_BASE
-			       / orig_node_count) / REG_BR_PROB_BASE;
+      cs->count = (cs->count
+                   * ((new_sum * REG_BR_PROB_BASE + orig_node_count/2)
+                      / orig_node_count)
+                   + REG_BR_PROB_BASE/2) / REG_BR_PROB_BASE;
     else
       cs->count = 0;
 
   for (cs = orig_node->callees; cs ; cs = cs->next_callee)
-    cs->count = cs->count * (remainder * REG_BR_PROB_BASE
-			     / orig_node_count) / REG_BR_PROB_BASE;
+    cs->count = (cs->count
+                 * ((remainder * REG_BR_PROB_BASE + orig_node_count/2)
+                    / orig_node_count)
+                 + REG_BR_PROB_BASE/2) / REG_BR_PROB_BASE;
 
   if (dump_file)
     dump_profile_updates (orig_node, new_node);
@@ -2627,14 +2631,19 @@  update_specialized_profile (struct cgraph_node *ne
 
   for (cs = new_node->callees; cs ; cs = cs->next_callee)
     if (cs->frequency)
-      cs->count += cs->count * redirected_sum / new_node_count;
+      cs->count += (cs->count
+                    * ((redirected_sum * REG_BR_PROB_BASE
+                        + new_node_count/2) / new_node_count)
+                    + REG_BR_PROB_BASE/2) / REG_BR_PROB_BASE;
     else
       cs->count = 0;
 
   for (cs = orig_node->callees; cs ; cs = cs->next_callee)
     {
-      gcov_type dec = cs->count * (redirected_sum * REG_BR_PROB_BASE
-				   / orig_node_count) / REG_BR_PROB_BASE;
+      gcov_type dec = (cs->count
+                       * ((redirected_sum * REG_BR_PROB_BASE
+                           + orig_node_count/2) / orig_node_count)
+                       + REG_BR_PROB_BASE/2) / REG_BR_PROB_BASE;
       if (dec < cs->count)
 	cs->count -= dec;
       else