diff mbox

[5.x/6.x/7.x] Be more conservative in early inliner if FDO is enabled

Message ID 355e6f53.95d3.1573230b35b.Coremail.ypf@pku.edu.cn
State New
Headers show

Commit Message

Yuan, Pengfei Sept. 16, 2016, 8:50 a.m. UTC
> > Here are the results:
> > 
> > Param  Size (GCC5)        Time (GCC5)  Time (GCC7)
> > 0      44686265 (-8.26%)  58.772s      66.332s
> > 1      45692793 (-6.19%)  40.684s      39.220s
> > 2      45556185 (-6.47%)  35.292s      34.328s
> > 3      46251049 (-5.05%)  28.820s      27.136s
> > 4      47028873 (-3.45%)  24.616s      22.200s
> > 5      47495641 (-2.49%)  20.160s      17.800s
> > 6      47520153 (-2.44%)  16.444s      15.656s
> > 14     48708873           5.620s       5.556s
> 
> Thanks for data! I meant to run the benchmark myself, but had little time to do
> it over past week becuase of traveling and was also wondering what to do given
> that spec is rather poor benchmark in this area. Tramp3d is biassed but we are
> in stage1 and can fine tune latter. I am debugging the libxul crashes in FDO
> binary now, so we can re-run talos.

It seems to be memory corruption. The content of a pointer becomes 0xe5e5...e5.
I have tried Firefox 48.0.2, 49b10 and mozilla-central with GCC 6. Same error.

> > Param: value of PARAM_EARLY_INLINING_INSNS
> > Size:  code size (.text) of optimized libxul.so
> > Time:  execution time of instrumented tramp3d (-n 25)
> > 
> > To balance between size reduction of optimized binary and speed penalty
> > of instrumented binary, I set param=6 as baseline and compare:
> > 
> > Param  Size score  Time score  Total
> > 0      3.39        -3.57       -0.18
> > 1      2.54        -2.47        0.07
> > 2      2.65        -2.15        0.50
> > 3      2.07        -1.75        0.32
> > 4      1.41        -1.50       -0.09
> > 5      1.02        -1.23       -0.21
> > 6      1.00        -1.00        0.00
> > 14     0.00        -0.34       -0.34
> > 
> > Therefore, I think param=2 is the best choice.
> > 
> > Is the attached patch OK?
> 
> Setting param to 2 looks fine
> 
> > gcc/ChangeLog
> > 	* opts.c (finish_options): Adjust PARAM_EARLY_INLINING_INSNS
> > 	when FDO is enabled.
> > 
> > 
> > diff --git a/gcc/opts.c b/gcc/opts.c
> > index 39c190d..b59c700 100644
> > --- a/gcc/opts.c
> > +++ b/gcc/opts.c
> > @@ -826,8 +826,14 @@ finish_options (struct gcc_options *opts, struct gcc_options *opts_set,
> >        maybe_set_param_value (PARAM_STACK_FRAME_GROWTH, 40,
> >                              opts->x_param_values, opts_set->x_param_values);
> >      }
> > 
> > +  /* Adjust PARAM_EARLY_INLINING_INSNS when FDO is enabled.  */
> > +  if ((opts->x_profile_arc_flag && !opts->x_flag_test_coverage)
> > +      || (opts->x_flag_branch_probabilities && !opts->x_flag_auto_profile))
> > +    maybe_set_param_value (PARAM_EARLY_INLINING_INSNS, 2,
> > +                          opts->x_param_values, opts_set->x_param_values);
> > +
> 
> I would actually preffer to have PARAM_EARLY_ININING_INSNS_FEEDBACK.  We
> already have TRACER_DYNAMIC_COVERAGE_FEEDBACK and other params.  The reason is
> that profile is not a global property of program. It may or may not be
> available for given function, while params are global.

Whether profile is available for specific functions is not important here because
profile is loaded after pass_early_inline. Therefore I think setting the param
globally is fine.

> Even at compile time profile may be selectively missing for example for
> COMDATs that did not win in the linking process.
> 
> There is also need to update the documentation.
> Thanks for the work!
> Honza

Here is the new patch.

Regards,
Yuan, Pengfei


gcc/ChangeLog
	* doc/invoke.texi (early-inlining-insns): Value is adjusted
	when FDO is enabled.
	* opts.c (finish_options): Adjust PARAM_EARLY_INLINING_INSNS
	when FDO is enabled.

Comments

Richard Biener Sept. 16, 2016, 9:05 a.m. UTC | #1
On Fri, Sep 16, 2016 at 10:50 AM, Yuan, Pengfei <ypf@pku.edu.cn> wrote:
>> > Here are the results:
>> >
>> > Param  Size (GCC5)        Time (GCC5)  Time (GCC7)
>> > 0      44686265 (-8.26%)  58.772s      66.332s
>> > 1      45692793 (-6.19%)  40.684s      39.220s
>> > 2      45556185 (-6.47%)  35.292s      34.328s
>> > 3      46251049 (-5.05%)  28.820s      27.136s
>> > 4      47028873 (-3.45%)  24.616s      22.200s
>> > 5      47495641 (-2.49%)  20.160s      17.800s
>> > 6      47520153 (-2.44%)  16.444s      15.656s
>> > 14     48708873           5.620s       5.556s
>>
>> Thanks for data! I meant to run the benchmark myself, but had little time to do
>> it over past week becuase of traveling and was also wondering what to do given
>> that spec is rather poor benchmark in this area. Tramp3d is biassed but we are
>> in stage1 and can fine tune latter. I am debugging the libxul crashes in FDO
>> binary now, so we can re-run talos.
>
> It seems to be memory corruption. The content of a pointer becomes 0xe5e5...e5.
> I have tried Firefox 48.0.2, 49b10 and mozilla-central with GCC 6. Same error.
>
>> > Param: value of PARAM_EARLY_INLINING_INSNS
>> > Size:  code size (.text) of optimized libxul.so
>> > Time:  execution time of instrumented tramp3d (-n 25)
>> >
>> > To balance between size reduction of optimized binary and speed penalty
>> > of instrumented binary, I set param=6 as baseline and compare:
>> >
>> > Param  Size score  Time score  Total
>> > 0      3.39        -3.57       -0.18
>> > 1      2.54        -2.47        0.07
>> > 2      2.65        -2.15        0.50
>> > 3      2.07        -1.75        0.32
>> > 4      1.41        -1.50       -0.09
>> > 5      1.02        -1.23       -0.21
>> > 6      1.00        -1.00        0.00
>> > 14     0.00        -0.34       -0.34
>> >
>> > Therefore, I think param=2 is the best choice.
>> >
>> > Is the attached patch OK?
>>
>> Setting param to 2 looks fine
>>
>> > gcc/ChangeLog
>> >     * opts.c (finish_options): Adjust PARAM_EARLY_INLINING_INSNS
>> >     when FDO is enabled.
>> >
>> >
>> > diff --git a/gcc/opts.c b/gcc/opts.c
>> > index 39c190d..b59c700 100644
>> > --- a/gcc/opts.c
>> > +++ b/gcc/opts.c
>> > @@ -826,8 +826,14 @@ finish_options (struct gcc_options *opts, struct gcc_options *opts_set,
>> >        maybe_set_param_value (PARAM_STACK_FRAME_GROWTH, 40,
>> >                              opts->x_param_values, opts_set->x_param_values);
>> >      }
>> >
>> > +  /* Adjust PARAM_EARLY_INLINING_INSNS when FDO is enabled.  */
>> > +  if ((opts->x_profile_arc_flag && !opts->x_flag_test_coverage)
>> > +      || (opts->x_flag_branch_probabilities && !opts->x_flag_auto_profile))
>> > +    maybe_set_param_value (PARAM_EARLY_INLINING_INSNS, 2,
>> > +                          opts->x_param_values, opts_set->x_param_values);
>> > +
>>
>> I would actually preffer to have PARAM_EARLY_ININING_INSNS_FEEDBACK.  We
>> already have TRACER_DYNAMIC_COVERAGE_FEEDBACK and other params.  The reason is
>> that profile is not a global property of program. It may or may not be
>> available for given function, while params are global.
>
> Whether profile is available for specific functions is not important here because
> profile is loaded after pass_early_inline. Therefore I think setting the param
> globally is fine.

I also like a new param better as it avoids a new magic constant and
makes playing with
it easier (your patch removes the ability to do statistics like you did via the
early-inlining-insns parameter by forcing it to two).

Thanks,
Richard.

>> Even at compile time profile may be selectively missing for example for
>> COMDATs that did not win in the linking process.
>>
>> There is also need to update the documentation.
>> Thanks for the work!
>> Honza
>
> Here is the new patch.
>
> Regards,
> Yuan, Pengfei
>
>
> gcc/ChangeLog
>         * doc/invoke.texi (early-inlining-insns): Value is adjusted
>         when FDO is enabled.
>         * opts.c (finish_options): Adjust PARAM_EARLY_INLINING_INSNS
>         when FDO is enabled.
>
>
> diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
> index 1ca4dcc..880a28c 100644
> --- a/gcc/doc/invoke.texi
> +++ b/gcc/doc/invoke.texi
> @@ -10272,9 +10272,11 @@ The default value is 10.
>
>  @item early-inlining-insns
>  Specify growth that the early inliner can make.  In effect it increases
>  the amount of inlining for code having a large abstraction penalty.
> -The default value is 14.
> +The default value is 14.  When feedback-directed optimizations is enabled (by
> +@option{-fprofile-generate} or @option{-fprofile-use}), the value is adjusted
> +to 2.
>
>  @item max-early-inliner-iterations
>  Limit of iterations of the early inliner.  This basically bounds
>  the number of nested indirect calls the early inliner can resolve.
> diff --git a/gcc/opts.c b/gcc/opts.c
> index 39c190d..b59c700 100644
> --- a/gcc/opts.c
> +++ b/gcc/opts.c
> @@ -826,8 +826,14 @@ finish_options (struct gcc_options *opts, struct gcc_options *opts_set,
>        maybe_set_param_value (PARAM_STACK_FRAME_GROWTH, 40,
>                              opts->x_param_values, opts_set->x_param_values);
>      }
>
> +  /* Adjust PARAM_EARLY_INLINING_INSNS when FDO is enabled.  */
> +  if ((opts->x_profile_arc_flag && !opts->x_flag_test_coverage)
> +      || (opts->x_flag_branch_probabilities && !opts->x_flag_auto_profile))
> +    maybe_set_param_value (PARAM_EARLY_INLINING_INSNS, 2,
> +                          opts->x_param_values, opts_set->x_param_values);
> +
>    if (opts->x_flag_lto)
>      {
>  #ifdef ENABLE_LTO
>        opts->x_flag_generate_lto = 1;
>
Jan Hubicka Sept. 16, 2016, 9:14 a.m. UTC | #2
> On Fri, Sep 16, 2016 at 10:50 AM, Yuan, Pengfei <ypf@pku.edu.cn> wrote:
> >> > Here are the results:
> >> >
> >> > Param  Size (GCC5)        Time (GCC5)  Time (GCC7)
> >> > 0      44686265 (-8.26%)  58.772s      66.332s
> >> > 1      45692793 (-6.19%)  40.684s      39.220s
> >> > 2      45556185 (-6.47%)  35.292s      34.328s
> >> > 3      46251049 (-5.05%)  28.820s      27.136s
> >> > 4      47028873 (-3.45%)  24.616s      22.200s
> >> > 5      47495641 (-2.49%)  20.160s      17.800s
> >> > 6      47520153 (-2.44%)  16.444s      15.656s
> >> > 14     48708873           5.620s       5.556s
> >>
> >> Thanks for data! I meant to run the benchmark myself, but had little time to do
> >> it over past week becuase of traveling and was also wondering what to do given
> >> that spec is rather poor benchmark in this area. Tramp3d is biassed but we are
> >> in stage1 and can fine tune latter. I am debugging the libxul crashes in FDO
> >> binary now, so we can re-run talos.
> >
> > It seems to be memory corruption. The content of a pointer becomes 0xe5e5...e5.
> > I have tried Firefox 48.0.2, 49b10 and mozilla-central with GCC 6. Same error.
> >
> >> > Param: value of PARAM_EARLY_INLINING_INSNS
> >> > Size:  code size (.text) of optimized libxul.so
> >> > Time:  execution time of instrumented tramp3d (-n 25)
> >> >
> >> > To balance between size reduction of optimized binary and speed penalty
> >> > of instrumented binary, I set param=6 as baseline and compare:
> >> >
> >> > Param  Size score  Time score  Total
> >> > 0      3.39        -3.57       -0.18
> >> > 1      2.54        -2.47        0.07
> >> > 2      2.65        -2.15        0.50
> >> > 3      2.07        -1.75        0.32
> >> > 4      1.41        -1.50       -0.09
> >> > 5      1.02        -1.23       -0.21
> >> > 6      1.00        -1.00        0.00
> >> > 14     0.00        -0.34       -0.34
> >> >
> >> > Therefore, I think param=2 is the best choice.
> >> >
> >> > Is the attached patch OK?
> >>
> >> Setting param to 2 looks fine
> >>
> >> > gcc/ChangeLog
> >> >     * opts.c (finish_options): Adjust PARAM_EARLY_INLINING_INSNS
> >> >     when FDO is enabled.
> >> >
> >> >
> >> > diff --git a/gcc/opts.c b/gcc/opts.c
> >> > index 39c190d..b59c700 100644
> >> > --- a/gcc/opts.c
> >> > +++ b/gcc/opts.c
> >> > @@ -826,8 +826,14 @@ finish_options (struct gcc_options *opts, struct gcc_options *opts_set,
> >> >        maybe_set_param_value (PARAM_STACK_FRAME_GROWTH, 40,
> >> >                              opts->x_param_values, opts_set->x_param_values);
> >> >      }
> >> >
> >> > +  /* Adjust PARAM_EARLY_INLINING_INSNS when FDO is enabled.  */
> >> > +  if ((opts->x_profile_arc_flag && !opts->x_flag_test_coverage)
> >> > +      || (opts->x_flag_branch_probabilities && !opts->x_flag_auto_profile))
> >> > +    maybe_set_param_value (PARAM_EARLY_INLINING_INSNS, 2,
> >> > +                          opts->x_param_values, opts_set->x_param_values);
> >> > +
> >>
> >> I would actually preffer to have PARAM_EARLY_ININING_INSNS_FEEDBACK.  We
> >> already have TRACER_DYNAMIC_COVERAGE_FEEDBACK and other params.  The reason is
> >> that profile is not a global property of program. It may or may not be
> >> available for given function, while params are global.
> >
> > Whether profile is available for specific functions is not important here because
> > profile is loaded after pass_early_inline. Therefore I think setting the param
> > globally is fine.
> 
> I also like a new param better as it avoids a new magic constant and
> makes playing with
> it easier (your patch removes the ability to do statistics like you did via the
> early-inlining-insns parameter by forcing it to two).

Hmm, you are right that you do not know if this particular function will get
profile (forgot about that).  Still, please use two params - it is more
consistent with what we have now and we may make it profile specific in
future..

Honza
> 
> Thanks,
> Richard.
diff mbox

Patch

diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index 1ca4dcc..880a28c 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -10272,9 +10272,11 @@  The default value is 10.
 
 @item early-inlining-insns
 Specify growth that the early inliner can make.  In effect it increases
 the amount of inlining for code having a large abstraction penalty.
-The default value is 14.
+The default value is 14.  When feedback-directed optimizations is enabled (by
+@option{-fprofile-generate} or @option{-fprofile-use}), the value is adjusted
+to 2.
 
 @item max-early-inliner-iterations
 Limit of iterations of the early inliner.  This basically bounds
 the number of nested indirect calls the early inliner can resolve.
diff --git a/gcc/opts.c b/gcc/opts.c
index 39c190d..b59c700 100644
--- a/gcc/opts.c
+++ b/gcc/opts.c
@@ -826,8 +826,14 @@  finish_options (struct gcc_options *opts, struct gcc_options *opts_set,
       maybe_set_param_value (PARAM_STACK_FRAME_GROWTH, 40,
 			     opts->x_param_values, opts_set->x_param_values);
     }
 
+  /* Adjust PARAM_EARLY_INLINING_INSNS when FDO is enabled.  */
+  if ((opts->x_profile_arc_flag && !opts->x_flag_test_coverage)
+      || (opts->x_flag_branch_probabilities && !opts->x_flag_auto_profile))
+    maybe_set_param_value (PARAM_EARLY_INLINING_INSNS, 2,
+			   opts->x_param_values, opts_set->x_param_values);
+
   if (opts->x_flag_lto)
     {
 #ifdef ENABLE_LTO
       opts->x_flag_generate_lto = 1;