Patchwork new patches using -fopt-info (issue5294043)

login
register
mail settings
Submitter Rong Xu
Date Oct. 19, 2011, 11:28 p.m.
Message ID <20111019232801.3532EC1F62@rong.mtv.corp.google.com>
Download mbox | patch
Permalink /patch/120707/
State New
Headers show

Comments

Rong Xu - Oct. 19, 2011, 11:28 p.m.
After some off-line discussion, we decided to use a more general approach
to control the printing of optimization messages/warnings. We will
introduce a new option -fopt-info:
 * fopt-info=0 or fno-opt-info: no message will be emitted.
 * fopt-info or fopt-info=1: emit important warnings and optimization
   messages with large performance impact.
 * fopt-info=2: warnings and optimization messages targeting power users.
 * fopt-info=3: informational messages for compiler developers.

2011-10-19   Rong Xu  <xur@google.com>

        * gcc/common.opt (fopt-info): New flag. (fopt-info=) Ditto.
	* gcc/opts.c (common_handle_option): Handle OPT_fopt_info_.
	* gcc/flag-types.h (opt_info_verbosity_levels): New enum.
        * gcc/value-prof.c (check_ic_counter): guard warnings/notes by
          flag_opt_info.
          (find_func_by_funcdef_no): Ditto. 
          (check_ic_target): Ditto.
          (check_counter): Ditto.
          (check_ic_counter): Ditto.
        * gcc/mcf.c (find_minimum_cost_flow): Ditto.
        * gcc/profile.c (read_profile_edge_counts): Ditto.
          (compute_branch_probabilities): Ditto.
        * gcc/coverage.c (read_counts_file): Ditto.
          (get_coverage_counts): Ditto.
        * gcc/tree-profile.c: (gimple_gen_reusedist): Ditto.
          (maybe_issue_profile_use_note): Ditto.
          (optimize_reusedist): Ditto.
        * gcc/testsuite/gcc.dg/pr32773.c: add -fopt-info.
        * gcc/testsuite/gcc.dg/pr40209.c: Ditto.
        * gcc/testsuite/gcc.dg/pr26570.c: Ditto.
        * gcc/testsuite/g++.dg/tree-ssa/dom-invalid.C: Ditto.



--
This patch is available for review at http://codereview.appspot.com/5294043
Andi Kleen - Oct. 19, 2011, 11:33 p.m.
xur@google.com (Rong Xu) writes:

> After some off-line discussion, we decided to use a more general approach
> to control the printing of optimization messages/warnings. We will
> introduce a new option -fopt-info:
>  * fopt-info=0 or fno-opt-info: no message will be emitted.
>  * fopt-info or fopt-info=1: emit important warnings and optimization
>    messages with large performance impact.
>  * fopt-info=2: warnings and optimization messages targeting power users.
>  * fopt-info=3: informational messages for compiler developers.

It would be interested to have some warnings about missing SRA
opportunities in =1 or =2. I found that sometimes fixing those can give a
large speedup.

Right now a common case that prevents SRA on structure field 
is simply a memset or memcpy.

-Andi
Rong Xu - Oct. 19, 2011, 11:43 p.m.
OK. I'll keep it in mind. We probably won't include all the messages
in one shot. But we do plan to add all of them finally because we
believe they are helpful for tracking the performance.

-Rong

On Wed, Oct 19, 2011 at 4:33 PM, Andi Kleen <andi@firstfloor.org> wrote:
> xur@google.com (Rong Xu) writes:
>
>> After some off-line discussion, we decided to use a more general approach
>> to control the printing of optimization messages/warnings. We will
>> introduce a new option -fopt-info:
>>  * fopt-info=0 or fno-opt-info: no message will be emitted.
>>  * fopt-info or fopt-info=1: emit important warnings and optimization
>>    messages with large performance impact.
>>  * fopt-info=2: warnings and optimization messages targeting power users.
>>  * fopt-info=3: informational messages for compiler developers.
>
> It would be interested to have some warnings about missing SRA
> opportunities in =1 or =2. I found that sometimes fixing those can give a
> large speedup.
>
> Right now a common case that prevents SRA on structure field
> is simply a memset or memcpy.
>
> -Andi
>
>
> --
> ak@linux.intel.com -- Speaking for myself only
>
Richard Guenther - Oct. 20, 2011, 8:21 a.m.
On Thu, Oct 20, 2011 at 1:33 AM, Andi Kleen <andi@firstfloor.org> wrote:
> xur@google.com (Rong Xu) writes:
>
>> After some off-line discussion, we decided to use a more general approach
>> to control the printing of optimization messages/warnings. We will
>> introduce a new option -fopt-info:
>>  * fopt-info=0 or fno-opt-info: no message will be emitted.
>>  * fopt-info or fopt-info=1: emit important warnings and optimization
>>    messages with large performance impact.
>>  * fopt-info=2: warnings and optimization messages targeting power users.
>>  * fopt-info=3: informational messages for compiler developers.

This doesn't look scalable if you consider that each pass would print
as much of a mess like -fvectorizer-verbose=5.

I think =2 and =3 should be omitted - we do have dump-files for a reason.

Also the coverage/profile cases you changed do not at all match
"... with large performance impact".  In fact the impact is completely
unknown (as it would be the case usually).

I'd rather have a way to make dump-files more structured (so, following
some standard reporting scheme) than introducing yet another way
of output.  [after making dump-files more consistent it will be easy
to revisit patches like this, there would be a natural general central
way to implement it]

So, please fix dump-files instead.  And for coverage/profiling, fill
in stuff in a dump-file!

Richard.

> It would be interested to have some warnings about missing SRA
> opportunities in =1 or =2. I found that sometimes fixing those can give a
> large speedup.
>
> Right now a common case that prevents SRA on structure field
> is simply a memset or memcpy.
>
> -Andi
>
>
> --
> ak@linux.intel.com -- Speaking for myself only
>
Xinliang David Li - Oct. 20, 2011, 5:11 p.m.
On Thu, Oct 20, 2011 at 1:21 AM, Richard Guenther
<richard.guenther@gmail.com> wrote:
> On Thu, Oct 20, 2011 at 1:33 AM, Andi Kleen <andi@firstfloor.org> wrote:
>> xur@google.com (Rong Xu) writes:
>>
>>> After some off-line discussion, we decided to use a more general approach
>>> to control the printing of optimization messages/warnings. We will
>>> introduce a new option -fopt-info:
>>>  * fopt-info=0 or fno-opt-info: no message will be emitted.
>>>  * fopt-info or fopt-info=1: emit important warnings and optimization
>>>    messages with large performance impact.
>>>  * fopt-info=2: warnings and optimization messages targeting power users.
>>>  * fopt-info=3: informational messages for compiler developers.
>
> This doesn't look scalable if you consider that each pass would print
> as much of a mess like -fvectorizer-verbose=5.

What is not scalable? For level 1 dump, only the summary of
vectorization will be printed just like other loop transformations.

>
> I think =2 and =3 should be omitted - we do have dump-files for a reason.

Dump files are not easy to use -- it is big, and slow especially for
people with large distributed build systems.  Having both level 2 and
3 is debatable, but it will be useful to have a least one level above
level 1. Dump files are mainly for compiler developers, while
-fopt-info are for compiler developers *and* power users who know
performance tuning.
>
> Also the coverage/profile cases you changed do not at all match
> "... with large performance impact".  In fact the impact is completely
> unknown (as it would be the case usually).

Impact of any transformations is just 'potential', coverage problems
are no different from that.

>
> I'd rather have a way to make dump-files more structured (so, following
> some standard reporting scheme) than introducing yet another way
> of output.  [after making dump-files more consistent it will be easy
> to revisit patches like this, there would be a natural general central
> way to implement it]

Yes, I remember we have discussed about this before -- currently dump
files are a big mess -- debug tracing, IR are all mixed up, but as I
said above, this is a different matter -- it is for compiler
developers.

For more structured optimization report, we should use option
-fopt-report which dump optimization information based on category --
the info data base can also be shared across modules:

Example:

[Loop Interchange]
File a, line x,   yyyyyyy
File b, line xx, yyyyyyy
....
File c, line z,   It is beneficial to interchange the loop, but not
done because of possible carried dependency (caused by false aliasing
...)

[Loop Vectorization]
....

[Loop Unroll]
...

[SRA]

[Alias summary]
  [Global Vars]
   a: addr exposed
   b: add not exposed
   ..
  [Global Pointers]
    ..
  ...


Thanks,

David

>
> So, please fix dump-files instead.  And for coverage/profiling, fill
> in stuff in a dump-file!
>
> Richard.
>
>> It would be interested to have some warnings about missing SRA
>> opportunities in =1 or =2. I found that sometimes fixing those can give a
>> large speedup.
>>
>> Right now a common case that prevents SRA on structure field
>> is simply a memset or memcpy.
>>
>> -Andi
>>
>>
>> --
>> ak@linux.intel.com -- Speaking for myself only
>>
>
Rong Xu - Oct. 20, 2011, 5:28 p.m.
Richard, Thanks for the comments.

Let me give some background of the patch: The initial intention of the
patch is to suppress
the verbose warnings and notes emitted in profile-use compilation.
This warnings/notes are caused
by inconsistent profile due to data race (which is currently common in
multi-thread programs),
and some stale profiles (after adding a new functions). While valid,
they can easily pollute the build log.
In addition, some of these warnings cannot be disabled by any options
except -Wno-error. We have
many FDO users and these verbose messages one of the most complained issues.

The first patch was adding another option to control the fdo related
messages. Later we thought it's
better to do this in more general way. And here comes this patch.

I believe fopt-info is very useful for tracking regressions for
certain important optimization (inline, loop opt etc).
IR dump is just too big and add too much overhead to compilation (look
how many files it creates).

-Rong



On Thu, Oct 20, 2011 at 1:21 AM, Richard Guenther
<richard.guenther@gmail.com> wrote:
> On Thu, Oct 20, 2011 at 1:33 AM, Andi Kleen <andi@firstfloor.org> wrote:
>> xur@google.com (Rong Xu) writes:
>>
>>> After some off-line discussion, we decided to use a more general approach
>>> to control the printing of optimization messages/warnings. We will
>>> introduce a new option -fopt-info:
>>>  * fopt-info=0 or fno-opt-info: no message will be emitted.
>>>  * fopt-info or fopt-info=1: emit important warnings and optimization
>>>    messages with large performance impact.
>>>  * fopt-info=2: warnings and optimization messages targeting power users.
>>>  * fopt-info=3: informational messages for compiler developers.
>
> This doesn't look scalable if you consider that each pass would print
> as much of a mess like -fvectorizer-verbose=5.
>
> I think =2 and =3 should be omitted - we do have dump-files for a reason.
>
> Also the coverage/profile cases you changed do not at all match
> "... with large performance impact".  In fact the impact is completely
> unknown (as it would be the case usually).
>
> I'd rather have a way to make dump-files more structured (so, following
> some standard reporting scheme) than introducing yet another way
> of output.  [after making dump-files more consistent it will be easy
> to revisit patches like this, there would be a natural general central
> way to implement it]
>
> So, please fix dump-files instead.  And for coverage/profiling, fill
> in stuff in a dump-file!
>
> Richard.
>
>> It would be interested to have some warnings about missing SRA
>> opportunities in =1 or =2. I found that sometimes fixing those can give a
>> large speedup.
>>
>> Right now a common case that prevents SRA on structure field
>> is simply a memset or memcpy.
>>
>> -Andi
>>
>>
>> --
>> ak@linux.intel.com -- Speaking for myself only
>>
>
Xinliang David Li - Oct. 20, 2011, 5:46 p.m.
While discussion for trunk version is still going, it is ok for google branches.

thanks,

David

On Wed, Oct 19, 2011 at 4:28 PM, Rong Xu <xur@google.com> wrote:
> After some off-line discussion, we decided to use a more general approach
> to control the printing of optimization messages/warnings. We will
> introduce a new option -fopt-info:
>  * fopt-info=0 or fno-opt-info: no message will be emitted.
>  * fopt-info or fopt-info=1: emit important warnings and optimization
>   messages with large performance impact.
>  * fopt-info=2: warnings and optimization messages targeting power users.
>  * fopt-info=3: informational messages for compiler developers.
>
> 2011-10-19   Rong Xu  <xur@google.com>
>
>        * gcc/common.opt (fopt-info): New flag. (fopt-info=) Ditto.
>        * gcc/opts.c (common_handle_option): Handle OPT_fopt_info_.
>        * gcc/flag-types.h (opt_info_verbosity_levels): New enum.
>        * gcc/value-prof.c (check_ic_counter): guard warnings/notes by
>          flag_opt_info.
>          (find_func_by_funcdef_no): Ditto.
>          (check_ic_target): Ditto.
>          (check_counter): Ditto.
>          (check_ic_counter): Ditto.
>        * gcc/mcf.c (find_minimum_cost_flow): Ditto.
>        * gcc/profile.c (read_profile_edge_counts): Ditto.
>          (compute_branch_probabilities): Ditto.
>        * gcc/coverage.c (read_counts_file): Ditto.
>          (get_coverage_counts): Ditto.
>        * gcc/tree-profile.c: (gimple_gen_reusedist): Ditto.
>          (maybe_issue_profile_use_note): Ditto.
>          (optimize_reusedist): Ditto.
>        * gcc/testsuite/gcc.dg/pr32773.c: add -fopt-info.
>        * gcc/testsuite/gcc.dg/pr40209.c: Ditto.
>        * gcc/testsuite/gcc.dg/pr26570.c: Ditto.
>        * gcc/testsuite/g++.dg/tree-ssa/dom-invalid.C: Ditto.
>
>
> Index: gcc/value-prof.c
> ===================================================================
> --- gcc/value-prof.c    (revision 180106)
> +++ gcc/value-prof.c    (working copy)
> @@ -472,9 +472,10 @@
>               : DECL_SOURCE_LOCATION (current_function_decl);
>       if (flag_profile_correction)
>         {
> -         inform (locus, "correcting inconsistent value profile: "
> -                 "%s profiler overall count (%d) does not match BB count "
> -                  "(%d)", name, (int)*all, (int)bb_count);
> +          if (flag_opt_info >= OPT_INFO_MAX)
> +            inform (locus, "correcting inconsistent value profile: %s "
> +                   "profiler overall count (%d) does not match BB count "
> +                    "(%d)", name, (int)*all, (int)bb_count);
>          *all = bb_count;
>          if (*count > *all)
>             *count = *all;
> @@ -510,33 +511,42 @@
>   location_t locus;
>   if (*count1 > all && flag_profile_correction)
>     {
> -      locus = (stmt != NULL)
> -              ? gimple_location (stmt)
> -              : DECL_SOURCE_LOCATION (current_function_decl);
> -      inform (locus, "Correcting inconsistent value profile: "
> -              "ic (topn) profiler top target count (%ld) exceeds "
> -             "BB count (%ld)", (long)*count1, (long)all);
> +      if (flag_opt_info >= OPT_INFO_MAX)
> +        {
> +          locus = (stmt != NULL)
> +                  ? gimple_location (stmt)
> +                  : DECL_SOURCE_LOCATION (current_function_decl);
> +          inform (locus, "Correcting inconsistent value profile: "
> +                  "ic (topn) profiler top target count (%ld) exceeds "
> +                  "BB count (%ld)", (long)*count1, (long)all);
> +        }
>       *count1 = all;
>     }
>   if (*count2 > all && flag_profile_correction)
>     {
> -      locus = (stmt != NULL)
> -              ? gimple_location (stmt)
> -              : DECL_SOURCE_LOCATION (current_function_decl);
> -      inform (locus, "Correcting inconsistent value profile: "
> -              "ic (topn) profiler second target count (%ld) exceeds "
> -             "BB count (%ld)", (long)*count2, (long)all);
> +      if (flag_opt_info >= OPT_INFO_MAX)
> +        {
> +          locus = (stmt != NULL)
> +                  ? gimple_location (stmt)
> +                  : DECL_SOURCE_LOCATION (current_function_decl);
> +          inform (locus, "Correcting inconsistent value profile: "
> +                  "ic (topn) profiler second target count (%ld) exceeds "
> +                 "BB count (%ld)", (long)*count2, (long)all);
> +        }
>       *count2 = all;
>     }
>
>   if (*count2 > *count1)
>     {
> -      locus = (stmt != NULL)
> -              ? gimple_location (stmt)
> -              : DECL_SOURCE_LOCATION (current_function_decl);
> -      inform (locus, "Corrupted topn ic value profile: "
> -             "first target count (%ld) is less than the second "
> -             "target count (%ld)", (long)*count1, (long)*count2);
> +      if (flag_opt_info >= OPT_INFO_MAX)
> +        {
> +          locus = (stmt != NULL)
> +                  ? gimple_location (stmt)
> +                  : DECL_SOURCE_LOCATION (current_function_decl);
> +          inform (locus, "Corrupted topn ic value profile: "
> +                 "first target count (%ld) is less than the second "
> +                 "target count (%ld)", (long)*count1, (long)*count2);
> +        }
>       return true;
>     }
>
> @@ -548,12 +558,16 @@
>        *count2 = all - *count1;
>       else
>        {
> -         locus = (stmt != NULL)
> -           ? gimple_location (stmt)
> -           : DECL_SOURCE_LOCATION (current_function_decl);
> -         inform (locus, "Corrupted topn ic value profile: top two targets's"
> -                 " total count (%ld) exceeds bb count (%ld)",
> -                 (long)(*count1 + *count2), (long)all);
> +          if (flag_opt_info >= OPT_INFO_MAX)
> +            {
> +             locus = (stmt != NULL)
> +               ? gimple_location (stmt)
> +               : DECL_SOURCE_LOCATION (current_function_decl);
> +             inform (locus,
> +                      "Corrupted topn ic value profile: top two targets's"
> +                      " total count (%ld) exceeds bb count (%ld)",
> +                      (long)(*count1 + *count2), (long)all);
> +            }
>          return true;
>        }
>     }
> @@ -1177,8 +1191,11 @@
>                                       func_id) == NULL)
>     {
>       if (flag_profile_correction)
> -        inform (DECL_SOURCE_LOCATION (current_function_decl),
> +        {
> +          if (flag_opt_info >= OPT_INFO_MED)
> +            inform (DECL_SOURCE_LOCATION (current_function_decl),
>                 "Inconsistent profile: indirect call target (%d) does not exist", func_id);
> +        }
>       else
>         error ("Inconsistent profile: indirect call target (%d) does not exist", func_id);
>
> @@ -1308,8 +1325,9 @@
>      return true;
>
>    locus =  gimple_location (call_stmt);
> -   inform (locus, "Skipping target %s with mismatching types for icall ",
> -           cgraph_node_name (target));
> +   if (flag_opt_info >= OPT_INFO_MAX)
> +     inform (locus, "Skipping target %s with mismatching types for icall ",
> +             cgraph_node_name (target));
>    return false;
>  }
>
> @@ -1571,7 +1589,7 @@
>   if (direct_call1 == NULL
>       || !check_ic_target (stmt, direct_call1))
>     {
> -      if (flag_ripa_verbose)
> +      if (flag_opt_info >= OPT_INFO_MAX)
>         {
>           if (!direct_call1)
>             inform (locus, "Can not find indirect call target decl "
> @@ -1597,7 +1615,7 @@
>     return false;
>
>   modify1 = gimple_ic (stmt, direct_call1, prob1, count1, all);
> -  if (flag_ripa_verbose)
> +  if (flag_opt_info >= OPT_INFO_MIN)
>     inform (locus, "Promote indirect call to target (call count:%u) %s",
>            (unsigned) count1,
>            lang_hooks.decl_printable_name (direct_call1->decl, 3));
> @@ -1635,7 +1653,7 @@
>       modify2 = gimple_ic (stmt, direct_call2,
>                            prob2, count2, all - count1);
>
> -      if (flag_ripa_verbose)
> +      if (flag_opt_info >= OPT_INFO_MIN)
>        inform (locus, "Promote indirect call to target (call count:%u) %s",
>                (unsigned) count2,
>                lang_hooks.decl_printable_name (direct_call2->decl, 3));
> Index: gcc/mcf.c
> ===================================================================
> --- gcc/mcf.c   (revision 180106)
> +++ gcc/mcf.c   (working copy)
> @@ -1442,7 +1442,8 @@
>       if (iteration > MAX_ITER (fixup_graph->num_vertices,
>                                 fixup_graph->num_edges))
>        {
> -         inform (DECL_SOURCE_LOCATION (current_function_decl),
> +          if (flag_opt_info >= OPT_INFO_MAX)
> +            inform (DECL_SOURCE_LOCATION (current_function_decl),
>                  "Exiting profile correction early to avoid excessive "
>                  "compile time");
>          break;
> Index: gcc/testsuite/gcc.dg/pr32773.c
> ===================================================================
> --- gcc/testsuite/gcc.dg/pr32773.c      (revision 180106)
> +++ gcc/testsuite/gcc.dg/pr32773.c      (working copy)
> @@ -1,6 +1,6 @@
>  /* { dg-do compile } */
> -/* { dg-options "-O -fprofile-use" } */
> -/* { dg-options "-O -m4 -fprofile-use" { target sh-*-* } } */
> +/* { dg-options "-O -fprofile-use -fopt-info" } */
> +/* { dg-options "-O -m4 -fprofile-use -fopt-info" { target sh-*-* } } */
>
>  void foo (int *p)
>  {
> Index: gcc/testsuite/gcc.dg/pr40209.c
> ===================================================================
> --- gcc/testsuite/gcc.dg/pr40209.c      (revision 180106)
> +++ gcc/testsuite/gcc.dg/pr40209.c      (working copy)
> @@ -1,5 +1,5 @@
>  /* { dg-do compile } */
> -/* { dg-options "-O2 -fprofile-use" } */
> +/* { dg-options "-O2 -fprofile-use -fopt-info" } */
>
>  void process(const char *s);
>
> Index: gcc/testsuite/gcc.dg/pr26570.c
> ===================================================================
> --- gcc/testsuite/gcc.dg/pr26570.c      (revision 180106)
> +++ gcc/testsuite/gcc.dg/pr26570.c      (working copy)
> @@ -1,5 +1,5 @@
>  /* { dg-do compile } */
> -/* { dg-options "-O2 -fprofile-generate -fprofile-use" } */
> +/* { dg-options "-O2 -fprofile-generate -fprofile-use -fopt-info" } */
>
>  unsigned test (unsigned a, unsigned b)
>  {
> Index: gcc/testsuite/g++.dg/tree-ssa/dom-invalid.C
> ===================================================================
> --- gcc/testsuite/g++.dg/tree-ssa/dom-invalid.C (revision 180106)
> +++ gcc/testsuite/g++.dg/tree-ssa/dom-invalid.C (working copy)
> @@ -1,7 +1,7 @@
>  // PR tree-optimization/39557
>  // invalid post-dom info leads to infinite loop
>  // { dg-do run }
> -// { dg-options "-Wall -fno-exceptions -O2 -fprofile-use -fno-rtti" }
> +// { dg-options "-Wall -fno-exceptions -O2 -fprofile-use -fopt-info -fno-rtti" }
>
>  struct C
>  {
> Index: gcc/opts.c
> ===================================================================
> --- gcc/opts.c  (revision 180106)
> +++ gcc/opts.c  (working copy)
> @@ -1579,6 +1579,15 @@
>       pp_set_line_maximum_length (dc->printer, value);
>       break;
>
> +    case OPT_fopt_info_:
> +      if (value < 0 || value > OPT_INFO_MAX)
> +       error_at (loc,
> +                 "%d: invalid value for opt_info",
> +                 value);
> +      else
> +       opts->x_flag_opt_info = value;
> +      break;
> +
>     case OPT_fpack_struct_:
>       if (value <= 0 || (value & (value - 1)) || value > 16)
>        error_at (loc,
> Index: gcc/profile.c
> ===================================================================
> --- gcc/profile.c       (revision 180106)
> +++ gcc/profile.c       (working copy)
> @@ -414,7 +414,7 @@
>                    if (flag_profile_correction)
>                      {
>                        static bool informed = 0;
> -                       if (!informed)
> +                       if ((flag_opt_info >= OPT_INFO_MAX) && !informed)
>                          inform (input_location,
>                                  "corrupted profile info: edge count exceeds maximal count");
>                        informed = 1;
> @@ -635,7 +635,7 @@
>        {
>          /* Inconsistency detected. Make it flow-consistent. */
>          static int informed = 0;
> -         if (informed == 0)
> +         if ((flag_opt_info >= OPT_INFO_MAX) && informed == 0)
>            {
>              informed = 1;
>              inform (input_location, "correcting inconsistent profile data");
> Index: gcc/flag-types.h
> ===================================================================
> --- gcc/flag-types.h    (revision 180106)
> +++ gcc/flag-types.h    (working copy)
> @@ -204,4 +204,11 @@
>   MAX_VERBOSITY_LEVEL
>  };
>
> +/* flag_opt_info verbosity levels.  */
> +enum opt_info_verbosity_levels {
> +  OPT_INFO_NONE = 0,
> +  OPT_INFO_MIN  = 1,
> +  OPT_INFO_MED  = 2,
> +  OPT_INFO_MAX  = 3
> +};
>  #endif /* ! GCC_FLAG_TYPES_H */
> Index: gcc/coverage.c
> ===================================================================
> --- gcc/coverage.c      (revision 180106)
> +++ gcc/coverage.c      (working copy)
> @@ -355,7 +355,8 @@
>     warning (OPT_Wripa_opt_mismatch, "command line arguments mismatch for %s "
>             "and %s", mod_info1->source_filename, mod_info2->source_filename);
>
> -   if (warn_ripa_opt_mismatch && non_warning_mismatch && flag_ripa_verbose)
> +   if (warn_ripa_opt_mismatch && non_warning_mismatch
> +       && (flag_opt_info >= OPT_INFO_MED))
>      {
>        inform (UNKNOWN_LOCATION, "Options for %s", mod_info1->source_filename);
>        for (i = 0; i < num_non_warning_opts1; i++)
> @@ -575,29 +576,47 @@
>              int fd;
>              char *aux_da_filename = get_da_file_name (mod_info->da_filename);
>               gcc_assert (!mod_info->is_primary);
> -             if (pointer_set_insert (modset, (void *)(size_t)mod_info->ident))
> -               inform (input_location, "Not importing %s: already imported",
> -                       mod_info->source_filename);
> -             else if ((module_infos[0]->lang & GCOV_MODULE_LANG_MASK) !=
> -                      (mod_info->lang & GCOV_MODULE_LANG_MASK))
> -               inform (input_location, "Not importing %s: source language"
> -                       " different from primary module's source language",
> -                       mod_info->source_filename);
> -             else if (module_infos_read == max_group)
> -               inform (input_location, "Not importing %s: maximum group size"
> -                       " reached", mod_info->source_filename);
> -             else if (incompatible_cl_args (module_infos[0], mod_info))
> -               inform (input_location, "Not importing %s: command-line"
> -                       " arguments not compatible with primary module",
> -                       mod_info->source_filename);
> -             else if ((fd = open (aux_da_filename, O_RDONLY)) < 0)
> -               inform (input_location, "Not importing %s: couldn't open %s",
> -                       mod_info->source_filename, aux_da_filename);
> -             else if ((mod_info->lang & GCOV_MODULE_ASM_STMTS)
> -                      && flag_ripa_disallow_asm_modules)
> -               inform (input_location, "Not importing %s: contains assembler"
> -                       " statements", mod_info->source_filename);
> -             else
> +              if (pointer_set_insert (modset, (void *)(size_t)mod_info->ident))
> +                {
> +                  if (flag_opt_info >= OPT_INFO_MAX)
> +                    inform (input_location, "Not importing %s: already imported",
> +                            mod_info->source_filename);
> +                }
> +              else if ((module_infos[0]->lang & GCOV_MODULE_LANG_MASK) !=
> +                       (mod_info->lang & GCOV_MODULE_LANG_MASK))
> +                {
> +                  if (flag_opt_info >= OPT_INFO_MAX)
> +                    inform (input_location, "Not importing %s: source language"
> +                            " different from primary module's source language",
> +                            mod_info->source_filename);
> +                }
> +              else if (module_infos_read == max_group)
> +                {
> +                  if (flag_opt_info >= OPT_INFO_MAX)
> +                    inform (input_location, "Not importing %s: maximum group"
> +                            " size reached", mod_info->source_filename);
> +                }
> +              else if (incompatible_cl_args (module_infos[0], mod_info))
> +                {
> +                  if (flag_opt_info >= OPT_INFO_MAX)
> +                    inform (input_location, "Not importing %s: command-line"
> +                            " arguments not compatible with primary module",
> +                            mod_info->source_filename);
> +                }
> +              else if ((fd = open (aux_da_filename, O_RDONLY)) < 0)
> +                {
> +                  if (flag_opt_info >= OPT_INFO_MAX)
> +                    inform (input_location, "Not importing %s: couldn't open %s",
> +                            mod_info->source_filename, aux_da_filename);
> +                }
> +              else if ((mod_info->lang & GCOV_MODULE_ASM_STMTS)
> +                       && flag_ripa_disallow_asm_modules)
> +                {
> +                  if (flag_opt_info >= OPT_INFO_MAX)
> +                    inform (input_location, "Not importing %s: contains "
> +                            "assembler statements", mod_info->source_filename);
> +                }
> +              else
>                {
>                  close (fd);
>                  module_infos_read++;
> @@ -612,7 +631,7 @@
>                }
>             }
>
> -          if (flag_ripa_verbose)
> +          if (flag_opt_info >= OPT_INFO_MAX)
>             {
>               inform (input_location,
>                       "MODULE Id=%d, Is_Primary=%s,"
> @@ -676,7 +695,7 @@
>     {
>       static int warned = 0;
>
> -      if (!warned++)
> +      if ((flag_opt_info >= OPT_INFO_MIN) && !warned++)
>        inform (input_location, (flag_guess_branch_prob
>                 ? "file %s not found, execution counts estimated"
>                 : "file %s not found, execution counts assumed to be zero"),
> @@ -688,7 +707,7 @@
>
>   if (!entry)
>     {
> -      if (!flag_dyn_ipa)
> +      if ((flag_opt_info >= OPT_INFO_MIN) && !flag_dyn_ipa)
>        warning (0, "no coverage for function %qE found",
>                 DECL_ASSEMBLER_NAME (current_function_decl));
>       return NULL;
> @@ -705,7 +724,7 @@
>        warning_at (input_location, OPT_Wcoverage_mismatch,
>                    "The control flow of function %qE does not match "
>                    "its profile data (counter %qs)", id, ctr_names[counter]);
> -      if (warning_printed)
> +      if ((flag_opt_info >= OPT_INFO_MIN) && warning_printed)
>        {
>         inform (input_location, "Use -Wno-error=coverage-mismatch to tolerate "
>                 "the mismatch but performance may drop if the function is hot");
> @@ -727,7 +746,8 @@
>     }
>     else if (entry->lineno_checksum != lineno_checksum)
>       {
> -        warning (0, "Source location for function %qE have changed,"
> +        warning (OPT_Wcoverage_mismatch,
> +                 "Source location for function %qE have changed,"
>                  " the profile data may be out of date",
>                  DECL_ASSEMBLER_NAME (current_function_decl));
>       }
> Index: gcc/common.opt
> ===================================================================
> --- gcc/common.opt      (revision 180106)
> +++ gcc/common.opt      (working copy)
> @@ -1125,11 +1125,6 @@
>  Don't promote always inline static functions assuming they
>  will be inlined and no copy is needed.
>
> -
> -fripa-verbose
> -Common Report Var(flag_ripa_verbose)
> -Enable verbose informational messages for LIPO compilation
> -
>  fearly-inlining
>  Common Report Var(flag_early_inlining) Init(1) Optimization
>  Perform early inlining
> @@ -1572,6 +1567,19 @@
>  Common Report Var(flag_omit_frame_pointer) Optimization
>  When possible do not generate stack frames
>
> +fopt-info
> +Common Report Var(flag_opt_info) Optimization Init(1)
> +Enable verbose informational messages for optimizations (same as -fopt-info=1)
> +
> +; fopt-info=0: no message will be emitted.
> +; fopt-info or fopt-info=1: emit important warnings and optimization messages with
> +;   large performance impact.
> +; fopt-info=2: warnings and optimization messages targeting power users.
> +; fopt-info=3: informational messages for compiler developers.
> +fopt-info=
> +Common RejectNegative Joined UInteger Optimization
> +-fopt-info=[0|1|2|3] Set the verbose level of informational messages for optimizations
> +
>  foptimize-register-move
>  Common Report Var(flag_regmove) Optimization
>  Do the full register move optimization pass
> Index: gcc/tree-profile.c
> ===================================================================
> --- gcc/tree-profile.c  (revision 180106)
> +++ gcc/tree-profile.c  (working copy)
> @@ -1096,13 +1096,16 @@
>                 reusedist_make_instr_call (stmt, subst, counters),
>                 GSI_NEW_STMT);
>
> -            locus = (stmt != NULL)
> -                ? gimple_location (stmt)
> -                : DECL_SOURCE_LOCATION (current_function_decl);
> -            inform (locus,
> -                    "inserted reuse distance instrumentation for %qs, using "
> -                    "%d gcov counters", subst->original_name,
> -                    subst->num_ptr_args * RD_NUM_COUNTERS);
> +            if (flag_opt_info >= OPT_INFO_MAX)
> +              {
> +                locus = (stmt != NULL)
> +                    ? gimple_location (stmt)
> +                    : DECL_SOURCE_LOCATION (current_function_decl);
> +                inform (locus,
> +                        "inserted reuse distance instrumentation for %qs, using "
> +                        "%d gcov counters", subst->original_name,
> +                        subst->num_ptr_args * RD_NUM_COUNTERS);
> +              }
>           }
>       }
>  }
> @@ -1214,7 +1217,7 @@
>
>   reusedist_from_counters (counters, &rd);
>
> -  if (rd.count)
> +  if ((flag_opt_info >= OPT_INFO_MAX) && rd.count)
>     inform (locus, "reuse distance counters for arg %d: %lld %lld %lld %lld",
>             arg, (long long int)rd.mean_dist, (long long int)rd.mean_size,
>             (long long int)rd.count, (long long int)rd.dist_x_size);
> @@ -1283,9 +1286,10 @@
>   subst_decl = reusedist_get_nt_decl (gimple_call_fndecl (stmt), subst,
>                                       suffix);
>   gimple_call_set_fndecl (stmt, subst_decl);
> -  inform (locus, "replaced %qs with non-temporal %qs",
> -          subst->original_name,
> -          IDENTIFIER_POINTER (DECL_ASSEMBLER_NAME (subst_decl)));
> +  if (flag_opt_info >= OPT_INFO_MED)
> +    inform (locus, "replaced %qs with non-temporal %qs",
> +            subst->original_name,
> +            IDENTIFIER_POINTER (DECL_ASSEMBLER_NAME (subst_decl)));
>  }
>
>  /* Replace string operations with equivalent nontemporal, when profitable.  */
> @@ -1323,11 +1327,13 @@
>
>   if (counter_index != n_counters)
>     {
> -      warning (0, "coverage mismatch for reuse distance counters "
> +      warning (OPT_Wcoverage_mismatch,
> +               "coverage mismatch for reuse distance counters "
>                "in function %qs", IDENTIFIER_POINTER
>                (DECL_ASSEMBLER_NAME (current_function_decl)));
> -      inform (input_location, "number of counters is %u instead of %u",
> -              n_counters, counter_index);
> +      if (flag_opt_info >= OPT_INFO_MAX)
> +        inform (input_location, "number of counters is %u instead of %u",
> +                n_counters, counter_index);
>     }
>  }
>
>
> --
> This patch is available for review at http://codereview.appspot.com/5294043
>
Andi Kleen - Oct. 20, 2011, 7:53 p.m.
> This warnings/notes are caused
> by inconsistent profile due to data race (which is currently common in
> multi-thread programs),

I never quite understood why the gcov counters are not simply marked
__thread. This would make the profiled programs faster too because
they wouldn't bounce cache lines  that much. Especially on larger
systems (>2S) frequent cache line bouncing can lead to extreme slow downs,
and even on smaller systems it's very expensive.
This would also eliminate data races, except for signals and somesuch.

-Andi
Xinliang David Li - Oct. 20, 2011, 8:09 p.m.
On Thu, Oct 20, 2011 at 12:53 PM, Andi Kleen <andi@firstfloor.org> wrote:
>> This warnings/notes are caused
>> by inconsistent profile due to data race (which is currently common in
>> multi-thread programs),
>
> I never quite understood why the gcov counters are not simply marked
> __thread. This would make the profiled programs faster too because
> they wouldn't bounce cache lines  that much. Especially on larger
> systems (>2S) frequent cache line bouncing can lead to extreme slow downs,
> and even on smaller systems it's very expensive.
> This would also eliminate data races, except for signals and somesuch.

It uses stack space and for programs with hundreds and thousands of
threads, it can be a big problem.

David

>
> -Andi
>
>
Richard Guenther - Oct. 21, 2011, 8:32 a.m.
On Thu, Oct 20, 2011 at 7:11 PM, Xinliang David Li <davidxl@google.com> wrote:
> On Thu, Oct 20, 2011 at 1:21 AM, Richard Guenther
> <richard.guenther@gmail.com> wrote:
>> On Thu, Oct 20, 2011 at 1:33 AM, Andi Kleen <andi@firstfloor.org> wrote:
>>> xur@google.com (Rong Xu) writes:
>>>
>>>> After some off-line discussion, we decided to use a more general approach
>>>> to control the printing of optimization messages/warnings. We will
>>>> introduce a new option -fopt-info:
>>>>  * fopt-info=0 or fno-opt-info: no message will be emitted.
>>>>  * fopt-info or fopt-info=1: emit important warnings and optimization
>>>>    messages with large performance impact.
>>>>  * fopt-info=2: warnings and optimization messages targeting power users.
>>>>  * fopt-info=3: informational messages for compiler developers.
>>
>> This doesn't look scalable if you consider that each pass would print
>> as much of a mess like -fvectorizer-verbose=5.
>
> What is not scalable? For level 1 dump, only the summary of
> vectorization will be printed just like other loop transformations.
>
>>
>> I think =2 and =3 should be omitted - we do have dump-files for a reason.
>
> Dump files are not easy to use -- it is big, and slow especially for
> people with large distributed build systems.  Having both level 2 and
> 3 is debatable, but it will be useful to have a least one level above
> level 1. Dump files are mainly for compiler developers, while
> -fopt-info are for compiler developers *and* power users who know
> performance tuning.
>>
>> Also the coverage/profile cases you changed do not at all match
>> "... with large performance impact".  In fact the impact is completely
>> unknown (as it would be the case usually).
>
> Impact of any transformations is just 'potential', coverage problems
> are no different from that.
>
>>
>> I'd rather have a way to make dump-files more structured (so, following
>> some standard reporting scheme) than introducing yet another way
>> of output.  [after making dump-files more consistent it will be easy
>> to revisit patches like this, there would be a natural general central
>> way to implement it]
>
> Yes, I remember we have discussed about this before -- currently dump
> files are a big mess -- debug tracing, IR are all mixed up, but as I
> said above, this is a different matter -- it is for compiler
> developers.
>
> For more structured optimization report, we should use option
> -fopt-report which dump optimization information based on category --
> the info data base can also be shared across modules:
>
> Example:
>
> [Loop Interchange]
> File a, line x,   yyyyyyy
> File b, line xx, yyyyyyy
> ....
> File c, line z,   It is beneficial to interchange the loop, but not
> done because of possible carried dependency (caused by false aliasing
> ...)
>
> [Loop Vectorization]
> ....
>
> [Loop Unroll]
> ...
>
> [SRA]
>
> [Alias summary]
>  [Global Vars]
>   a: addr exposed
>   b: add not exposed
>   ..
>  [Global Pointers]
>    ..
>  ...

I very well understand the intent.  But I disagree with where you start
to implement this.  Dump files are _not_ only for developers - after
all we don't have anything else.  -fopt-report can get as big and unmanagable
to read as dump files - in fact I argue it will be worse than dump files if
you go beyond very very coarse reporting.

Yes, dump files are a "mess".  So - why not clean them up, and at the
same time annotate dump file pieces so _automatic_ filtering and
redirecting to stdout with something like -fopt-report would do something
sensible?  I don't see why dump files have to stay messy while you at
the same time would need to add _new_ code to dump to stdout for
-fopt-report.

So, no, please do it the right way that benefits both compiler developers
and your "power users".

And yes, the right way is not to start adding that -fopt-report switch.
The right way is to make dump-files consumable by mere mortals first.

Thanks,
Richard.

>
> Thanks,
>
> David
>
>>
>> So, please fix dump-files instead.  And for coverage/profiling, fill
>> in stuff in a dump-file!
>>
>> Richard.
>>
>>> It would be interested to have some warnings about missing SRA
>>> opportunities in =1 or =2. I found that sometimes fixing those can give a
>>> large speedup.
>>>
>>> Right now a common case that prevents SRA on structure field
>>> is simply a memset or memcpy.
>>>
>>> -Andi
>>>
>>>
>>> --
>>> ak@linux.intel.com -- Speaking for myself only
>>>
>>
>
Xinliang David Li - Oct. 21, 2011, 4:48 p.m.
There are two proposals here. One is -fopt-info which prints out
informational notes to stderr, and the other is -fopt-report which is
more elaborate form of dump files. Are you object to both or just the
opt-report one?  The former is no different from any other
informational notes we already have -- the only difference is that
they are suppressed by default.

>>    ..
>>  ...
>
> I very well understand the intent.  But I disagree with where you start
> to implement this.  Dump files are _not_ only for developers - after
> all we don't have anything else.  -fopt-report can get as big and unmanagable
> to read as dump files - in fact I argue it will be worse than dump files if
> you go beyond very very coarse reporting.

The problem of using dump files for optimization report is that all
optimization decisions are 'distributed' in phase specific dumps file.
For a whole program report, the number of files that are created is
not manageable (think about a program with 4000 sources each dumping
200 files).  If we create a dummy pass and suck in all optimization
decisions in that pass's dump file -- it will be no different from
opt-report.

>
> Yes, dump files are a "mess".  So - why not clean them up, and at the
> same time annotate dump file pieces so _automatic_ filtering and
> redirecting to stdout with something like -fopt-report would do something
> sensible?  I don't see why dump files have to stay messy while you at
> the same time would need to add _new_ code to dump to stdout for
> -fopt-report.

In my mind, I would like to separate all dumps into three categories.

1) IR dumps, and support dump before and after (this reminds me my
patches are still pending :) )    -fdump-tree-pre-[before|after]-....
 Dump into .after, .before files
2) debug tracing etc:        -fdump-tree-pre-debug-...          Dump
into .debug files.
3) opt report : -fdump-opt or -fopt-report

Changes for 1) and 2) are mechanic but requires lots of work.

>
> So, no, please do it the right way that benefits both compiler developers
> and your "power users".
>
> And yes, the right way is not to start adding that -fopt-report switch.
> The right way is to make dump-files consumable by mere mortals first.

I agree we need to do the right way which needs to be discussed first.
I would argue that mere mortals will really appreciate opt-info
(separate from dump file and opt-report).

thanks,

David

>
> Thanks,
> Richard.
>
>>
>> Thanks,
>>
>> David
>>
>>>
>>> So, please fix dump-files instead.  And for coverage/profiling, fill
>>> in stuff in a dump-file!
>>>
>>> Richard.
>>>
>>>> It would be interested to have some warnings about missing SRA
>>>> opportunities in =1 or =2. I found that sometimes fixing those can give a
>>>> large speedup.
>>>>
>>>> Right now a common case that prevents SRA on structure field
>>>> is simply a memset or memcpy.
>>>>
>>>> -Andi
>>>>
>>>>
>>>> --
>>>> ak@linux.intel.com -- Speaking for myself only
>>>>
>>>
>>
>
Richard Guenther - Oct. 23, 2011, 10:18 a.m.
On Fri, Oct 21, 2011 at 6:48 PM, Xinliang David Li <davidxl@google.com> wrote:
> There are two proposals here. One is -fopt-info which prints out
> informational notes to stderr, and the other is -fopt-report which is
> more elaborate form of dump files. Are you object to both or just the
> opt-report one?

What?  I'm objected to adding _two_ variants.  Didn't even realize
you proposed that.

>  The former is no different from any other
> informational notes we already have -- the only difference is that
> they are suppressed by default.

We do not have many informational notes, so it is different.

>>>    ..
>>>  ...
>>
>> I very well understand the intent.  But I disagree with where you start
>> to implement this.  Dump files are _not_ only for developers - after
>> all we don't have anything else.  -fopt-report can get as big and unmanagable
>> to read as dump files - in fact I argue it will be worse than dump files if
>> you go beyond very very coarse reporting.
>
> The problem of using dump files for optimization report is that all
> optimization decisions are 'distributed' in phase specific dumps file.
> For a whole program report, the number of files that are created is
> not manageable (think about a program with 4000 sources each dumping
> 200 files).  If we create a dummy pass and suck in all optimization
> decisions in that pass's dump file -- it will be no different from
> opt-report.

Well, -fopt-whatever will just funnel selected pieces also to stderr.
I object to duplicate dumping when we just need a way to filter
what goes to dump files.

>
>>
>> Yes, dump files are a "mess".  So - why not clean them up, and at the
>> same time annotate dump file pieces so _automatic_ filtering and
>> redirecting to stdout with something like -fopt-report would do something
>> sensible?  I don't see why dump files have to stay messy while you at
>> the same time would need to add _new_ code to dump to stdout for
>> -fopt-report.
>
> In my mind, I would like to separate all dumps into three categories.
>
> 1) IR dumps, and support dump before and after (this reminds me my
> patches are still pending :) )    -fdump-tree-pre-[before|after]-....
>  Dump into .after, .before files
> 2) debug tracing etc:        -fdump-tree-pre-debug-...          Dump
> into .debug files.
> 3) opt report : -fdump-opt or -fopt-report
>
> Changes for 1) and 2) are mechanic but requires lots of work.

You can do that, but I want the passes to use a single mechanism to
feed all three "separated dumps".

>>
>> So, no, please do it the right way that benefits both compiler developers
>> and your "power users".
>>
>> And yes, the right way is not to start adding that -fopt-report switch.
>> The right way is to make dump-files consumable by mere mortals first.
>
> I agree we need to do the right way which needs to be discussed first.
> I would argue that mere mortals will really appreciate opt-info
> (separate from dump file and opt-report).

Well, still what you print with opt-info should be better also be present
with opt-report and in dump files.  Thus it all boils down to be able
to filter what passes put in their dump files.

Richard.

> thanks,
>
> David
>
>>
>> Thanks,
>> Richard.
>>
>>>
>>> Thanks,
>>>
>>> David
>>>
>>>>
>>>> So, please fix dump-files instead.  And for coverage/profiling, fill
>>>> in stuff in a dump-file!
>>>>
>>>> Richard.
>>>>
>>>>> It would be interested to have some warnings about missing SRA
>>>>> opportunities in =1 or =2. I found that sometimes fixing those can give a
>>>>> large speedup.
>>>>>
>>>>> Right now a common case that prevents SRA on structure field
>>>>> is simply a memset or memcpy.
>>>>>
>>>>> -Andi
>>>>>
>>>>>
>>>>> --
>>>>> ak@linux.intel.com -- Speaking for myself only
>>>>>
>>>>
>>>
>>
>
Xinliang David Li - Oct. 23, 2011, 5:28 p.m.
On Sun, Oct 23, 2011 at 3:18 AM, Richard Guenther
<richard.guenther@gmail.com> wrote:
> On Fri, Oct 21, 2011 at 6:48 PM, Xinliang David Li <davidxl@google.com> wrote:
>> There are two proposals here. One is -fopt-info which prints out
>> informational notes to stderr, and the other is -fopt-report which is
>> more elaborate form of dump files. Are you object to both or just the
>> opt-report one?
>
> What?  I'm objected to adding _two_ variants.  Didn't even realize
> you proposed that.

They are different -- -fopt-info is on the fly -- the notes are
emitted as the transformations are done while -fopt-report is for more
structured report so it requires more compiler changes.  Bringing in
-fopt-report is a little distraction as the main discussion is on
-fopt-info.

>
>>  The former is no different from any other
>> informational notes we already have -- the only difference is that
>> they are suppressed by default.
>
> We do not have many informational notes, so it is different.

Why different? opt information notes are not even emitted by default.

>
>>>>    ..
>>>>  ...
>>>
>>> I very well understand the intent.  But I disagree with where you start
>>> to implement this.  Dump files are _not_ only for developers - after
>>> all we don't have anything else.  -fopt-report can get as big and unmanagable
>>> to read as dump files - in fact I argue it will be worse than dump files if
>>> you go beyond very very coarse reporting.
>>
>> The problem of using dump files for optimization report is that all
>> optimization decisions are 'distributed' in phase specific dumps file.
>> For a whole program report, the number of files that are created is
>> not manageable (think about a program with 4000 sources each dumping
>> 200 files).  If we create a dummy pass and suck in all optimization
>> decisions in that pass's dump file -- it will be no different from
>> opt-report.
>
> Well, -fopt-whatever will just funnel selected pieces also to stderr.
> I object to duplicate dumping when we just need a way to filter
> what goes to dump files.
>

that is the main point -- using dump files are not scalable. If you
are just against using stderr and propose dumping the selected
information into a single shared dump file per build, I don't see the
difference with using stderr -- they are not emitted by default and
won't contaminate the build log.

>>
>>>
>>> Yes, dump files are a "mess".  So - why not clean them up, and at the
>>> same time annotate dump file pieces so _automatic_ filtering and
>>> redirecting to stdout with something like -fopt-report would do something
>>> sensible?  I don't see why dump files have to stay messy while you at
>>> the same time would need to add _new_ code to dump to stdout for
>>> -fopt-report.
>>
>> In my mind, I would like to separate all dumps into three categories.
>>
>> 1) IR dumps, and support dump before and after (this reminds me my
>> patches are still pending :) )    -fdump-tree-pre-[before|after]-....
>>  Dump into .after, .before files
>> 2) debug tracing etc:        -fdump-tree-pre-debug-...          Dump
>> into .debug files.
>> 3) opt report : -fdump-opt or -fopt-report
>>
>> Changes for 1) and 2) are mechanic but requires lots of work.
>
> You can do that, but I want the passes to use a single mechanism to
> feed all three "separated dumps".
>

Can you elaborate on single mechanism here? A set of well defined
dumping APIs (instead of free form of  if (dump_file) fprintf
(dump_file, ...) ) ?

   debug_print (message, dump_flags, message_verbose_level, ...)
   trace_enter (trace_header_note)
   trace_exit (trace_header_not)
   opt_info_print (location, message_template, insertion)

Or  how dump files are organized?

I am all for clean up of dumping, but I don't see how -fopt-info get
in the way of that.

>>>
>>> So, no, please do it the right way that benefits both compiler developers
>>> and your "power users".
>>>
>>> And yes, the right way is not to start adding that -fopt-report switch.
>>> The right way is to make dump-files consumable by mere mortals first.
>>
>> I agree we need to do the right way which needs to be discussed first.
>> I would argue that mere mortals will really appreciate opt-info
>> (separate from dump file and opt-report).
>
> Well, still what you print with opt-info should be better also be present
> with opt-report and in dump files.  Thus it all boils down to be able
> to filter what passes put in their dump files.

opt-report is different (needs to buffer information and dumping at
the end of compilation).   Dump files and fopt-info can share the same
dumping format -- whatever gets emitted by opt-info should also be
emitted in the dump file (or replace the less well formated
transformation messages that are already available in dump files),
however simply filering the dump info does not solve the scalabilty
issue I mentioned.

thanks,

David

>
> Richard.
>
>> thanks,
>>
>> David
>>
>>>
>>> Thanks,
>>> Richard.
>>>
>>>>
>>>> Thanks,
>>>>
>>>> David
>>>>
>>>>>
>>>>> So, please fix dump-files instead.  And for coverage/profiling, fill
>>>>> in stuff in a dump-file!
>>>>>
>>>>> Richard.
>>>>>
>>>>>> It would be interested to have some warnings about missing SRA
>>>>>> opportunities in =1 or =2. I found that sometimes fixing those can give a
>>>>>> large speedup.
>>>>>>
>>>>>> Right now a common case that prevents SRA on structure field
>>>>>> is simply a memset or memcpy.
>>>>>>
>>>>>> -Andi
>>>>>>
>>>>>>
>>>>>> --
>>>>>> ak@linux.intel.com -- Speaking for myself only
>>>>>>
>>>>>
>>>>
>>>
>>
>
Richard Guenther - Oct. 24, 2011, 8:42 a.m.
On Sun, Oct 23, 2011 at 7:28 PM, Xinliang David Li <davidxl@google.com> wrote:
> On Sun, Oct 23, 2011 at 3:18 AM, Richard Guenther
> <richard.guenther@gmail.com> wrote:
>> On Fri, Oct 21, 2011 at 6:48 PM, Xinliang David Li <davidxl@google.com> wrote:
>>> There are two proposals here. One is -fopt-info which prints out
>>> informational notes to stderr, and the other is -fopt-report which is
>>> more elaborate form of dump files. Are you object to both or just the
>>> opt-report one?
>>
>> What?  I'm objected to adding _two_ variants.  Didn't even realize
>> you proposed that.
>
> They are different -- -fopt-info is on the fly -- the notes are
> emitted as the transformations are done while -fopt-report is for more
> structured report so it requires more compiler changes.  Bringing in
> -fopt-report is a little distraction as the main discussion is on
> -fopt-info.
>
>>
>>>  The former is no different from any other
>>> informational notes we already have -- the only difference is that
>>> they are suppressed by default.
>>
>> We do not have many informational notes, so it is different.
>
> Why different? opt information notes are not even emitted by default.

You say "no different from other informational notes" and I say we don't
have those at the moment.  So it is different by means of that you are
going to ab-use informational notes.

>>
>>>>>    ..
>>>>>  ...
>>>>
>>>> I very well understand the intent.  But I disagree with where you start
>>>> to implement this.  Dump files are _not_ only for developers - after
>>>> all we don't have anything else.  -fopt-report can get as big and unmanagable
>>>> to read as dump files - in fact I argue it will be worse than dump files if
>>>> you go beyond very very coarse reporting.
>>>
>>> The problem of using dump files for optimization report is that all
>>> optimization decisions are 'distributed' in phase specific dumps file.
>>> For a whole program report, the number of files that are created is
>>> not manageable (think about a program with 4000 sources each dumping
>>> 200 files).  If we create a dummy pass and suck in all optimization
>>> decisions in that pass's dump file -- it will be no different from
>>> opt-report.
>>
>> Well, -fopt-whatever will just funnel selected pieces also to stderr.
>> I object to duplicate dumping when we just need a way to filter
>> what goes to dump files.
>>
>
> that is the main point -- using dump files are not scalable. If you
> are just against using stderr and propose dumping the selected
> information into a single shared dump file per build, I don't see the
> difference with using stderr -- they are not emitted by default and
> won't contaminate the build log.

Well, you seem to keep not reading what I write.  I am not opposed
to adding -fopt-info/report nor to funnel messages to stdout/err.  What
I am opposed is the way you want to introduce them.  I want you to
fix what we dump into dump files, so that both -fopt-report and -fopt-info
can be implemented by outputting selected pieces of the dump file
to stdout/stderr.  We already have -fdump-*-stats which supposedly
could match -fopt-report, and the default -fdump-* should be what
goes to -fopt-info (minus the function bodies, of course).

>>>
>>>>
>>>> Yes, dump files are a "mess".  So - why not clean them up, and at the
>>>> same time annotate dump file pieces so _automatic_ filtering and
>>>> redirecting to stdout with something like -fopt-report would do something
>>>> sensible?  I don't see why dump files have to stay messy while you at
>>>> the same time would need to add _new_ code to dump to stdout for
>>>> -fopt-report.
>>>
>>> In my mind, I would like to separate all dumps into three categories.
>>>
>>> 1) IR dumps, and support dump before and after (this reminds me my
>>> patches are still pending :) )    -fdump-tree-pre-[before|after]-....
>>>  Dump into .after, .before files
>>> 2) debug tracing etc:        -fdump-tree-pre-debug-...          Dump
>>> into .debug files.
>>> 3) opt report : -fdump-opt or -fopt-report
>>>
>>> Changes for 1) and 2) are mechanic but requires lots of work.
>>
>> You can do that, but I want the passes to use a single mechanism to
>> feed all three "separated dumps".
>>
>
> Can you elaborate on single mechanism here? A set of well defined
> dumping APIs (instead of free form of  if (dump_file) fprintf
> (dump_file, ...) ) ?

Well, design one that will work.  But yes, a set of well-defined
dumping APIs, like

 print_start_{loop,location,region,...} (...);
 print_end_{loop...} (...);

or so.

>   debug_print (message, dump_flags, message_verbose_level, ...)

Rather instead of verbosity levels use TDF_* flags (with maybe
reorganizing them a bit) internally, a verbosity level can be
implemented ontop of that by -fopt-{info,report} if needed.

>   trace_enter (trace_header_note)
>   trace_exit (trace_header_not)
>   opt_info_print (location, message_template, insertion)
>
> Or  how dump files are organized?
>
> I am all for clean up of dumping, but I don't see how -fopt-info get
> in the way of that.

In the way?  It is a prerequesite to both -fopt-info and -fopt-report.
Otherwise you will end up adding _additional_ dumping to passes.
Which is what I very very much object to.  You can transition
to the common dump API incrementally and only handle the passes
you care for initially.

But anything else from a common mechanism isn't going to be
maintainable.

>>>>
>>>> So, no, please do it the right way that benefits both compiler developers
>>>> and your "power users".
>>>>
>>>> And yes, the right way is not to start adding that -fopt-report switch.
>>>> The right way is to make dump-files consumable by mere mortals first.
>>>
>>> I agree we need to do the right way which needs to be discussed first.
>>> I would argue that mere mortals will really appreciate opt-info
>>> (separate from dump file and opt-report).
>>
>> Well, still what you print with opt-info should be better also be present
>> with opt-report and in dump files.  Thus it all boils down to be able
>> to filter what passes put in their dump files.
>
> opt-report is different (needs to buffer information and dumping at
> the end of compilation).

Why at the end of compilation?  Passes already collect info for
-stats dumping.  What would -fopt-report print?  Something like

note: I have reduced size of your binary by 90%
note: You should improve your programming skills

?  Let's put -fopt-report aside for now as I don't have the slightest
idea what it should be.

>   Dump files and fopt-info can share the same
> dumping format -- whatever gets emitted by opt-info should also be
> emitted in the dump file (or replace the less well formated
> transformation messages that are already available in dump files),
> however simply filering the dump info does not solve the scalabilty
> issue I mentioned.

What scalability issue?  I see a maintainance issue and a code
readability issue.

Richard.

> thanks,
>
> David
>
>>
>> Richard.
>>
>>> thanks,
>>>
>>> David
>>>
>>>>
>>>> Thanks,
>>>> Richard.
>>>>
>>>>>
>>>>> Thanks,
>>>>>
>>>>> David
>>>>>
>>>>>>
>>>>>> So, please fix dump-files instead.  And for coverage/profiling, fill
>>>>>> in stuff in a dump-file!
>>>>>>
>>>>>> Richard.
>>>>>>
>>>>>>> It would be interested to have some warnings about missing SRA
>>>>>>> opportunities in =1 or =2. I found that sometimes fixing those can give a
>>>>>>> large speedup.
>>>>>>>
>>>>>>> Right now a common case that prevents SRA on structure field
>>>>>>> is simply a memset or memcpy.
>>>>>>>
>>>>>>> -Andi
>>>>>>>
>>>>>>>
>>>>>>> --
>>>>>>> ak@linux.intel.com -- Speaking for myself only
>>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>>
>
Xinliang David Li - Oct. 24, 2011, 4:27 p.m.
> Well, you seem to keep not reading what I write.  I am not opposed
> to adding -fopt-info/report nor to funnel messages to stdout/err.  What
> I am opposed is the way you want to introduce them.  I want you to
> fix what we dump into dump files, so that both -fopt-report and -fopt-info
> can be implemented by outputting selected pieces of the dump file
> to stdout/stderr.  We already have -fdump-*-stats which supposedly
> could match -fopt-report, and the default -fdump-* should be what
> goes to -fopt-info (minus the function bodies, of course).

That sounds good. What you propose seems like

-fdump-pass-[ir_only|transformation|debug]-stderr

and -fopt-info is a short cut for
-fdump-tree-all-transformations-stderr
-fdump-ipa-all-tranformations-stderr
-fdump-rtl-all-transformations-stderr


thanks,

David



>
>>>>
>>>>>
>>>>> Yes, dump files are a "mess".  So - why not clean them up, and at the
>>>>> same time annotate dump file pieces so _automatic_ filtering and
>>>>> redirecting to stdout with something like -fopt-report would do something
>>>>> sensible?  I don't see why dump files have to stay messy while you at
>>>>> the same time would need to add _new_ code to dump to stdout for
>>>>> -fopt-report.
>>>>
>>>> In my mind, I would like to separate all dumps into three categories.
>>>>
>>>> 1) IR dumps, and support dump before and after (this reminds me my
>>>> patches are still pending :) )    -fdump-tree-pre-[before|after]-....
>>>>  Dump into .after, .before files
>>>> 2) debug tracing etc:        -fdump-tree-pre-debug-...          Dump
>>>> into .debug files.
>>>> 3) opt report : -fdump-opt or -fopt-report
>>>>
>>>> Changes for 1) and 2) are mechanic but requires lots of work.
>>>
>>> You can do that, but I want the passes to use a single mechanism to
>>> feed all three "separated dumps".
>>>
>>
>> Can you elaborate on single mechanism here? A set of well defined
>> dumping APIs (instead of free form of  if (dump_file) fprintf
>> (dump_file, ...) ) ?
>
> Well, design one that will work.  But yes, a set of well-defined
> dumping APIs, like
>
>  print_start_{loop,location,region,...} (...);
>  print_end_{loop...} (...);
>
> or so.
>
>>   debug_print (message, dump_flags, message_verbose_level, ...)
>
> Rather instead of verbosity levels use TDF_* flags (with maybe
> reorganizing them a bit) internally, a verbosity level can be
> implemented ontop of that by -fopt-{info,report} if needed.
>
>>   trace_enter (trace_header_note)
>>   trace_exit (trace_header_not)
>>   opt_info_print (location, message_template, insertion)
>>
>> Or  how dump files are organized?
>>
>> I am all for clean up of dumping, but I don't see how -fopt-info get
>> in the way of that.
>
> In the way?  It is a prerequesite to both -fopt-info and -fopt-report.
> Otherwise you will end up adding _additional_ dumping to passes.
> Which is what I very very much object to.  You can transition
> to the common dump API incrementally and only handle the passes
> you care for initially.
>
> But anything else from a common mechanism isn't going to be
> maintainable.
>
>>>>>
>>>>> So, no, please do it the right way that benefits both compiler developers
>>>>> and your "power users".
>>>>>
>>>>> And yes, the right way is not to start adding that -fopt-report switch.
>>>>> The right way is to make dump-files consumable by mere mortals first.
>>>>
>>>> I agree we need to do the right way which needs to be discussed first.
>>>> I would argue that mere mortals will really appreciate opt-info
>>>> (separate from dump file and opt-report).
>>>
>>> Well, still what you print with opt-info should be better also be present
>>> with opt-report and in dump files.  Thus it all boils down to be able
>>> to filter what passes put in their dump files.
>>
>> opt-report is different (needs to buffer information and dumping at
>> the end of compilation).
>
> Why at the end of compilation?  Passes already collect info for
> -stats dumping.  What would -fopt-report print?  Something like
>
> note: I have reduced size of your binary by 90%
> note: You should improve your programming skills
>
> ?  Let's put -fopt-report aside for now as I don't have the slightest
> idea what it should be.
>
>>   Dump files and fopt-info can share the same
>> dumping format -- whatever gets emitted by opt-info should also be
>> emitted in the dump file (or replace the less well formated
>> transformation messages that are already available in dump files),
>> however simply filering the dump info does not solve the scalabilty
>> issue I mentioned.
>
> What scalability issue?  I see a maintainance issue and a code
> readability issue.
>
> Richard.
>
>> thanks,
>>
>> David
>>
>>>
>>> Richard.
>>>
>>>> thanks,
>>>>
>>>> David
>>>>
>>>>>
>>>>> Thanks,
>>>>> Richard.
>>>>>
>>>>>>
>>>>>> Thanks,
>>>>>>
>>>>>> David
>>>>>>
>>>>>>>
>>>>>>> So, please fix dump-files instead.  And for coverage/profiling, fill
>>>>>>> in stuff in a dump-file!
>>>>>>>
>>>>>>> Richard.
>>>>>>>
>>>>>>>> It would be interested to have some warnings about missing SRA
>>>>>>>> opportunities in =1 or =2. I found that sometimes fixing those can give a
>>>>>>>> large speedup.
>>>>>>>>
>>>>>>>> Right now a common case that prevents SRA on structure field
>>>>>>>> is simply a memset or memcpy.
>>>>>>>>
>>>>>>>> -Andi
>>>>>>>>
>>>>>>>>
>>>>>>>> --
>>>>>>>> ak@linux.intel.com -- Speaking for myself only
>>>>>>>>
>>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>>
>
Richard Guenther - Oct. 25, 2011, 8:02 a.m.
On Mon, Oct 24, 2011 at 6:27 PM, Xinliang David Li <davidxl@google.com> wrote:
>> Well, you seem to keep not reading what I write.  I am not opposed
>> to adding -fopt-info/report nor to funnel messages to stdout/err.  What
>> I am opposed is the way you want to introduce them.  I want you to
>> fix what we dump into dump files, so that both -fopt-report and -fopt-info
>> can be implemented by outputting selected pieces of the dump file
>> to stdout/stderr.  We already have -fdump-*-stats which supposedly
>> could match -fopt-report, and the default -fdump-* should be what
>> goes to -fopt-info (minus the function bodies, of course).
>
> That sounds good. What you propose seems like
>
> -fdump-pass-[ir_only|transformation|debug]-stderr
>
> and -fopt-info is a short cut for
> -fdump-tree-all-transformations-stderr
> -fdump-ipa-all-tranformations-stderr
> -fdump-rtl-all-transformations-stderr

Yes.  Note that I don't like it the way the vectorizer does (with
-fvectorizer-verbose=... the dump files are empty).  The dump
file content should be unchanged when redirecting (parts) to
stderr, so we have to arrange to duplicate messages in two places.

Richard.

>
> thanks,
>
> David
>
>
>
>>
>>>>>
>>>>>>
>>>>>> Yes, dump files are a "mess".  So - why not clean them up, and at the
>>>>>> same time annotate dump file pieces so _automatic_ filtering and
>>>>>> redirecting to stdout with something like -fopt-report would do something
>>>>>> sensible?  I don't see why dump files have to stay messy while you at
>>>>>> the same time would need to add _new_ code to dump to stdout for
>>>>>> -fopt-report.
>>>>>
>>>>> In my mind, I would like to separate all dumps into three categories.
>>>>>
>>>>> 1) IR dumps, and support dump before and after (this reminds me my
>>>>> patches are still pending :) )    -fdump-tree-pre-[before|after]-....
>>>>>  Dump into .after, .before files
>>>>> 2) debug tracing etc:        -fdump-tree-pre-debug-...          Dump
>>>>> into .debug files.
>>>>> 3) opt report : -fdump-opt or -fopt-report
>>>>>
>>>>> Changes for 1) and 2) are mechanic but requires lots of work.
>>>>
>>>> You can do that, but I want the passes to use a single mechanism to
>>>> feed all three "separated dumps".
>>>>
>>>
>>> Can you elaborate on single mechanism here? A set of well defined
>>> dumping APIs (instead of free form of  if (dump_file) fprintf
>>> (dump_file, ...) ) ?
>>
>> Well, design one that will work.  But yes, a set of well-defined
>> dumping APIs, like
>>
>>  print_start_{loop,location,region,...} (...);
>>  print_end_{loop...} (...);
>>
>> or so.
>>
>>>   debug_print (message, dump_flags, message_verbose_level, ...)
>>
>> Rather instead of verbosity levels use TDF_* flags (with maybe
>> reorganizing them a bit) internally, a verbosity level can be
>> implemented ontop of that by -fopt-{info,report} if needed.
>>
>>>   trace_enter (trace_header_note)
>>>   trace_exit (trace_header_not)
>>>   opt_info_print (location, message_template, insertion)
>>>
>>> Or  how dump files are organized?
>>>
>>> I am all for clean up of dumping, but I don't see how -fopt-info get
>>> in the way of that.
>>
>> In the way?  It is a prerequesite to both -fopt-info and -fopt-report.
>> Otherwise you will end up adding _additional_ dumping to passes.
>> Which is what I very very much object to.  You can transition
>> to the common dump API incrementally and only handle the passes
>> you care for initially.
>>
>> But anything else from a common mechanism isn't going to be
>> maintainable.
>>
>>>>>>
>>>>>> So, no, please do it the right way that benefits both compiler developers
>>>>>> and your "power users".
>>>>>>
>>>>>> And yes, the right way is not to start adding that -fopt-report switch.
>>>>>> The right way is to make dump-files consumable by mere mortals first.
>>>>>
>>>>> I agree we need to do the right way which needs to be discussed first.
>>>>> I would argue that mere mortals will really appreciate opt-info
>>>>> (separate from dump file and opt-report).
>>>>
>>>> Well, still what you print with opt-info should be better also be present
>>>> with opt-report and in dump files.  Thus it all boils down to be able
>>>> to filter what passes put in their dump files.
>>>
>>> opt-report is different (needs to buffer information and dumping at
>>> the end of compilation).
>>
>> Why at the end of compilation?  Passes already collect info for
>> -stats dumping.  What would -fopt-report print?  Something like
>>
>> note: I have reduced size of your binary by 90%
>> note: You should improve your programming skills
>>
>> ?  Let's put -fopt-report aside for now as I don't have the slightest
>> idea what it should be.
>>
>>>   Dump files and fopt-info can share the same
>>> dumping format -- whatever gets emitted by opt-info should also be
>>> emitted in the dump file (or replace the less well formated
>>> transformation messages that are already available in dump files),
>>> however simply filering the dump info does not solve the scalabilty
>>> issue I mentioned.
>>
>> What scalability issue?  I see a maintainance issue and a code
>> readability issue.
>>
>> Richard.
>>
>>> thanks,
>>>
>>> David
>>>
>>>>
>>>> Richard.
>>>>
>>>>> thanks,
>>>>>
>>>>> David
>>>>>
>>>>>>
>>>>>> Thanks,
>>>>>> Richard.
>>>>>>
>>>>>>>
>>>>>>> Thanks,
>>>>>>>
>>>>>>> David
>>>>>>>
>>>>>>>>
>>>>>>>> So, please fix dump-files instead.  And for coverage/profiling, fill
>>>>>>>> in stuff in a dump-file!
>>>>>>>>
>>>>>>>> Richard.
>>>>>>>>
>>>>>>>>> It would be interested to have some warnings about missing SRA
>>>>>>>>> opportunities in =1 or =2. I found that sometimes fixing those can give a
>>>>>>>>> large speedup.
>>>>>>>>>
>>>>>>>>> Right now a common case that prevents SRA on structure field
>>>>>>>>> is simply a memset or memcpy.
>>>>>>>>>
>>>>>>>>> -Andi
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> --
>>>>>>>>> ak@linux.intel.com -- Speaking for myself only
>>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>>
>
Richard Guenther - Oct. 26, 2011, 8:37 a.m.
On Tue, Oct 25, 2011 at 9:30 PM, Xinliang David Li <davidxl@google.com> wrote:
>
>
> On Tue, Oct 25, 2011 at 1:02 AM, Richard Guenther
> <richard.guenther@gmail.com> wrote:
>>
>> On Mon, Oct 24, 2011 at 6:27 PM, Xinliang David Li <davidxl@google.com>
>> wrote:
>> >> Well, you seem to keep not reading what I write.  I am not opposed
>> >> to adding -fopt-info/report nor to funnel messages to stdout/err.  What
>> >> I am opposed is the way you want to introduce them.  I want you to
>> >> fix what we dump into dump files, so that both -fopt-report and
>> >> -fopt-info
>> >> can be implemented by outputting selected pieces of the dump file
>> >> to stdout/stderr.  We already have -fdump-*-stats which supposedly
>> >> could match -fopt-report, and the default -fdump-* should be what
>> >> goes to -fopt-info (minus the function bodies, of course).
>> >
>> > That sounds good. What you propose seems like
>> >
>> > -fdump-pass-[ir_only|transformation|debug]-stderr
>> >
>> > and -fopt-info is a short cut for
>> > -fdump-tree-all-transformations-stderr
>> > -fdump-ipa-all-tranformations-stderr
>> > -fdump-rtl-all-transformations-stderr
>>
>> Yes.  Note that I don't like it the way the vectorizer does (with
>> -fvectorizer-verbose=... the dump files are empty).  The dump
>> file content should be unchanged when redirecting (parts) to
>> stderr, so we have to arrange to duplicate messages in two places.
>>
> Vectorizer dump is a good candidate for clean up when the dumping
> infrastructure improvement is done.
> FYI, for now, we will implement the opt-info for some passes in the simple
> way in google branches, and later migrate to trunk when the dumping
> infrastructure is improved.

I was hoping you would volunteer to improve the dumping infrastructure.

Richard.
Xinliang David Li - Oct. 26, 2011, 4:33 p.m.
I am hoping that too:) Yes, I will try to do it when I find some time.

David

On Wed, Oct 26, 2011 at 1:37 AM, Richard Guenther
<richard.guenther@gmail.com> wrote:
> On Tue, Oct 25, 2011 at 9:30 PM, Xinliang David Li <davidxl@google.com> wrote:
>>
>>
>> On Tue, Oct 25, 2011 at 1:02 AM, Richard Guenther
>> <richard.guenther@gmail.com> wrote:
>>>
>>> On Mon, Oct 24, 2011 at 6:27 PM, Xinliang David Li <davidxl@google.com>
>>> wrote:
>>> >> Well, you seem to keep not reading what I write.  I am not opposed
>>> >> to adding -fopt-info/report nor to funnel messages to stdout/err.  What
>>> >> I am opposed is the way you want to introduce them.  I want you to
>>> >> fix what we dump into dump files, so that both -fopt-report and
>>> >> -fopt-info
>>> >> can be implemented by outputting selected pieces of the dump file
>>> >> to stdout/stderr.  We already have -fdump-*-stats which supposedly
>>> >> could match -fopt-report, and the default -fdump-* should be what
>>> >> goes to -fopt-info (minus the function bodies, of course).
>>> >
>>> > That sounds good. What you propose seems like
>>> >
>>> > -fdump-pass-[ir_only|transformation|debug]-stderr
>>> >
>>> > and -fopt-info is a short cut for
>>> > -fdump-tree-all-transformations-stderr
>>> > -fdump-ipa-all-tranformations-stderr
>>> > -fdump-rtl-all-transformations-stderr
>>>
>>> Yes.  Note that I don't like it the way the vectorizer does (with
>>> -fvectorizer-verbose=... the dump files are empty).  The dump
>>> file content should be unchanged when redirecting (parts) to
>>> stderr, so we have to arrange to duplicate messages in two places.
>>>
>> Vectorizer dump is a good candidate for clean up when the dumping
>> infrastructure improvement is done.
>> FYI, for now, we will implement the opt-info for some passes in the simple
>> way in google branches, and later migrate to trunk when the dumping
>> infrastructure is improved.
>
> I was hoping you would volunteer to improve the dumping infrastructure.
>
> Richard.
>

Patch

Index: gcc/value-prof.c
===================================================================
--- gcc/value-prof.c	(revision 180106)
+++ gcc/value-prof.c	(working copy)
@@ -472,9 +472,10 @@ 
               : DECL_SOURCE_LOCATION (current_function_decl);
       if (flag_profile_correction)
         {
-	  inform (locus, "correcting inconsistent value profile: "
-		  "%s profiler overall count (%d) does not match BB count "
-                  "(%d)", name, (int)*all, (int)bb_count);
+          if (flag_opt_info >= OPT_INFO_MAX)
+            inform (locus, "correcting inconsistent value profile: %s "
+		    "profiler overall count (%d) does not match BB count "
+                    "(%d)", name, (int)*all, (int)bb_count);
 	  *all = bb_count;
 	  if (*count > *all)
             *count = *all;
@@ -510,33 +511,42 @@ 
   location_t locus;
   if (*count1 > all && flag_profile_correction)
     {
-      locus = (stmt != NULL)
-              ? gimple_location (stmt)
-              : DECL_SOURCE_LOCATION (current_function_decl);
-      inform (locus, "Correcting inconsistent value profile: "
-              "ic (topn) profiler top target count (%ld) exceeds "
-	      "BB count (%ld)", (long)*count1, (long)all);
+      if (flag_opt_info >= OPT_INFO_MAX)
+        {
+          locus = (stmt != NULL)
+                  ? gimple_location (stmt)
+                  : DECL_SOURCE_LOCATION (current_function_decl);
+          inform (locus, "Correcting inconsistent value profile: "
+                  "ic (topn) profiler top target count (%ld) exceeds "
+                  "BB count (%ld)", (long)*count1, (long)all);
+        }
       *count1 = all;
     }
   if (*count2 > all && flag_profile_correction)
     {
-      locus = (stmt != NULL)
-              ? gimple_location (stmt)
-              : DECL_SOURCE_LOCATION (current_function_decl);
-      inform (locus, "Correcting inconsistent value profile: "
-              "ic (topn) profiler second target count (%ld) exceeds "
-	      "BB count (%ld)", (long)*count2, (long)all);
+      if (flag_opt_info >= OPT_INFO_MAX)
+        {
+          locus = (stmt != NULL)
+                  ? gimple_location (stmt)
+                  : DECL_SOURCE_LOCATION (current_function_decl);
+          inform (locus, "Correcting inconsistent value profile: "
+                  "ic (topn) profiler second target count (%ld) exceeds "
+    	          "BB count (%ld)", (long)*count2, (long)all);
+        }
       *count2 = all;
     }
   
   if (*count2 > *count1)
     {
-      locus = (stmt != NULL)
-              ? gimple_location (stmt)
-              : DECL_SOURCE_LOCATION (current_function_decl);
-      inform (locus, "Corrupted topn ic value profile: "
-	      "first target count (%ld) is less than the second "
-	      "target count (%ld)", (long)*count1, (long)*count2);
+      if (flag_opt_info >= OPT_INFO_MAX)
+        {
+          locus = (stmt != NULL)
+                  ? gimple_location (stmt)
+                  : DECL_SOURCE_LOCATION (current_function_decl);
+          inform (locus, "Corrupted topn ic value profile: "
+    	          "first target count (%ld) is less than the second "
+    	          "target count (%ld)", (long)*count1, (long)*count2);
+        }
       return true;
     }
 
@@ -548,12 +558,16 @@ 
 	*count2 = all - *count1;
       else
 	{
-	  locus = (stmt != NULL)
-	    ? gimple_location (stmt)
-	    : DECL_SOURCE_LOCATION (current_function_decl);
-	  inform (locus, "Corrupted topn ic value profile: top two targets's"
-		  " total count (%ld) exceeds bb count (%ld)",
-		  (long)(*count1 + *count2), (long)all);
+          if (flag_opt_info >= OPT_INFO_MAX)
+            {
+	      locus = (stmt != NULL)
+	        ? gimple_location (stmt)
+	        : DECL_SOURCE_LOCATION (current_function_decl);
+	      inform (locus,
+                      "Corrupted topn ic value profile: top two targets's"
+                      " total count (%ld) exceeds bb count (%ld)",
+                      (long)(*count1 + *count2), (long)all);
+            }
 	  return true;
 	}
     }
@@ -1177,8 +1191,11 @@ 
                                       func_id) == NULL)
     {
       if (flag_profile_correction)
-        inform (DECL_SOURCE_LOCATION (current_function_decl),
+        {
+          if (flag_opt_info >= OPT_INFO_MED)
+            inform (DECL_SOURCE_LOCATION (current_function_decl),
                 "Inconsistent profile: indirect call target (%d) does not exist", func_id);
+        }
       else
         error ("Inconsistent profile: indirect call target (%d) does not exist", func_id);
 
@@ -1308,8 +1325,9 @@ 
      return true;
 
    locus =  gimple_location (call_stmt);
-   inform (locus, "Skipping target %s with mismatching types for icall ",
-           cgraph_node_name (target));
+   if (flag_opt_info >= OPT_INFO_MAX)
+     inform (locus, "Skipping target %s with mismatching types for icall ",
+             cgraph_node_name (target));
    return false;
 }
 
@@ -1571,7 +1589,7 @@ 
   if (direct_call1 == NULL
       || !check_ic_target (stmt, direct_call1))
     {
-      if (flag_ripa_verbose)
+      if (flag_opt_info >= OPT_INFO_MAX)
         {
           if (!direct_call1)
             inform (locus, "Can not find indirect call target decl "
@@ -1597,7 +1615,7 @@ 
     return false;
 
   modify1 = gimple_ic (stmt, direct_call1, prob1, count1, all);
-  if (flag_ripa_verbose)
+  if (flag_opt_info >= OPT_INFO_MIN)
     inform (locus, "Promote indirect call to target (call count:%u) %s",
 	    (unsigned) count1,
 	    lang_hooks.decl_printable_name (direct_call1->decl, 3));
@@ -1635,7 +1653,7 @@ 
       modify2 = gimple_ic (stmt, direct_call2,
                            prob2, count2, all - count1);
 
-      if (flag_ripa_verbose)
+      if (flag_opt_info >= OPT_INFO_MIN)
 	inform (locus, "Promote indirect call to target (call count:%u) %s",
 		(unsigned) count2,
 		lang_hooks.decl_printable_name (direct_call2->decl, 3));
Index: gcc/mcf.c
===================================================================
--- gcc/mcf.c	(revision 180106)
+++ gcc/mcf.c	(working copy)
@@ -1442,7 +1442,8 @@ 
       if (iteration > MAX_ITER (fixup_graph->num_vertices,
                                 fixup_graph->num_edges))
 	{
-	  inform (DECL_SOURCE_LOCATION (current_function_decl),
+          if (flag_opt_info >= OPT_INFO_MAX)
+            inform (DECL_SOURCE_LOCATION (current_function_decl),
 		  "Exiting profile correction early to avoid excessive "
 		  "compile time");
 	  break;
Index: gcc/testsuite/gcc.dg/pr32773.c
===================================================================
--- gcc/testsuite/gcc.dg/pr32773.c	(revision 180106)
+++ gcc/testsuite/gcc.dg/pr32773.c	(working copy)
@@ -1,6 +1,6 @@ 
 /* { dg-do compile } */
-/* { dg-options "-O -fprofile-use" } */
-/* { dg-options "-O -m4 -fprofile-use" { target sh-*-* } } */
+/* { dg-options "-O -fprofile-use -fopt-info" } */
+/* { dg-options "-O -m4 -fprofile-use -fopt-info" { target sh-*-* } } */
 
 void foo (int *p)
 {
Index: gcc/testsuite/gcc.dg/pr40209.c
===================================================================
--- gcc/testsuite/gcc.dg/pr40209.c	(revision 180106)
+++ gcc/testsuite/gcc.dg/pr40209.c	(working copy)
@@ -1,5 +1,5 @@ 
 /* { dg-do compile } */
-/* { dg-options "-O2 -fprofile-use" } */
+/* { dg-options "-O2 -fprofile-use -fopt-info" } */
 
 void process(const char *s);
 
Index: gcc/testsuite/gcc.dg/pr26570.c
===================================================================
--- gcc/testsuite/gcc.dg/pr26570.c	(revision 180106)
+++ gcc/testsuite/gcc.dg/pr26570.c	(working copy)
@@ -1,5 +1,5 @@ 
 /* { dg-do compile } */
-/* { dg-options "-O2 -fprofile-generate -fprofile-use" } */
+/* { dg-options "-O2 -fprofile-generate -fprofile-use -fopt-info" } */
 
 unsigned test (unsigned a, unsigned b)
 {
Index: gcc/testsuite/g++.dg/tree-ssa/dom-invalid.C
===================================================================
--- gcc/testsuite/g++.dg/tree-ssa/dom-invalid.C	(revision 180106)
+++ gcc/testsuite/g++.dg/tree-ssa/dom-invalid.C	(working copy)
@@ -1,7 +1,7 @@ 
 // PR tree-optimization/39557
 // invalid post-dom info leads to infinite loop
 // { dg-do run }
-// { dg-options "-Wall -fno-exceptions -O2 -fprofile-use -fno-rtti" }
+// { dg-options "-Wall -fno-exceptions -O2 -fprofile-use -fopt-info -fno-rtti" }
 
 struct C
 {
Index: gcc/opts.c
===================================================================
--- gcc/opts.c	(revision 180106)
+++ gcc/opts.c	(working copy)
@@ -1579,6 +1579,15 @@ 
       pp_set_line_maximum_length (dc->printer, value);
       break;
 
+    case OPT_fopt_info_:
+      if (value < 0 || value > OPT_INFO_MAX)
+	error_at (loc,
+		  "%d: invalid value for opt_info",
+		  value);
+      else
+	opts->x_flag_opt_info = value;
+      break;
+
     case OPT_fpack_struct_:
       if (value <= 0 || (value & (value - 1)) || value > 16)
 	error_at (loc,
Index: gcc/profile.c
===================================================================
--- gcc/profile.c	(revision 180106)
+++ gcc/profile.c	(working copy)
@@ -414,7 +414,7 @@ 
 		    if (flag_profile_correction)
 		      {
 			static bool informed = 0;
-			if (!informed)
+			if ((flag_opt_info >= OPT_INFO_MAX) && !informed)
 		          inform (input_location,
 			          "corrupted profile info: edge count exceeds maximal count");
 			informed = 1;
@@ -635,7 +635,7 @@ 
        {
          /* Inconsistency detected. Make it flow-consistent. */
          static int informed = 0;
-         if (informed == 0)
+         if ((flag_opt_info >= OPT_INFO_MAX) && informed == 0)
            {
              informed = 1;
              inform (input_location, "correcting inconsistent profile data");
Index: gcc/flag-types.h
===================================================================
--- gcc/flag-types.h	(revision 180106)
+++ gcc/flag-types.h	(working copy)
@@ -204,4 +204,11 @@ 
   MAX_VERBOSITY_LEVEL
 };
 
+/* flag_opt_info verbosity levels.  */
+enum opt_info_verbosity_levels {
+  OPT_INFO_NONE = 0,
+  OPT_INFO_MIN  = 1,
+  OPT_INFO_MED  = 2,
+  OPT_INFO_MAX  = 3
+};
 #endif /* ! GCC_FLAG_TYPES_H */
Index: gcc/coverage.c
===================================================================
--- gcc/coverage.c	(revision 180106)
+++ gcc/coverage.c	(working copy)
@@ -355,7 +355,8 @@ 
     warning (OPT_Wripa_opt_mismatch, "command line arguments mismatch for %s "
 	     "and %s", mod_info1->source_filename, mod_info2->source_filename);
 
-   if (warn_ripa_opt_mismatch && non_warning_mismatch && flag_ripa_verbose)
+   if (warn_ripa_opt_mismatch && non_warning_mismatch 
+       && (flag_opt_info >= OPT_INFO_MED))
      {
        inform (UNKNOWN_LOCATION, "Options for %s", mod_info1->source_filename);
        for (i = 0; i < num_non_warning_opts1; i++)
@@ -575,29 +576,47 @@ 
 	      int fd;
 	      char *aux_da_filename = get_da_file_name (mod_info->da_filename);
               gcc_assert (!mod_info->is_primary);
-	      if (pointer_set_insert (modset, (void *)(size_t)mod_info->ident))
-		inform (input_location, "Not importing %s: already imported",
-			mod_info->source_filename);
-	      else if ((module_infos[0]->lang & GCOV_MODULE_LANG_MASK) !=
-		       (mod_info->lang & GCOV_MODULE_LANG_MASK))
-		inform (input_location, "Not importing %s: source language"
-			" different from primary module's source language",
-			mod_info->source_filename);
-	      else if (module_infos_read == max_group)
-		inform (input_location, "Not importing %s: maximum group size"
-			" reached", mod_info->source_filename);
-	      else if (incompatible_cl_args (module_infos[0], mod_info))
-		inform (input_location, "Not importing %s: command-line"
-			" arguments not compatible with primary module",
-			mod_info->source_filename);
-	      else if ((fd = open (aux_da_filename, O_RDONLY)) < 0)
-		inform (input_location, "Not importing %s: couldn't open %s",
-			mod_info->source_filename, aux_da_filename);
-	      else if ((mod_info->lang & GCOV_MODULE_ASM_STMTS)
-		       && flag_ripa_disallow_asm_modules)
-		inform (input_location, "Not importing %s: contains assembler"
-			" statements", mod_info->source_filename);
-	      else
+              if (pointer_set_insert (modset, (void *)(size_t)mod_info->ident))
+                {
+                  if (flag_opt_info >= OPT_INFO_MAX)
+                    inform (input_location, "Not importing %s: already imported",
+                            mod_info->source_filename);
+                }
+              else if ((module_infos[0]->lang & GCOV_MODULE_LANG_MASK) !=
+                       (mod_info->lang & GCOV_MODULE_LANG_MASK))
+                {
+                  if (flag_opt_info >= OPT_INFO_MAX)
+                    inform (input_location, "Not importing %s: source language"
+                            " different from primary module's source language",
+                            mod_info->source_filename);
+                }
+              else if (module_infos_read == max_group)
+                {
+                  if (flag_opt_info >= OPT_INFO_MAX)
+                    inform (input_location, "Not importing %s: maximum group"
+                            " size reached", mod_info->source_filename);
+                }
+              else if (incompatible_cl_args (module_infos[0], mod_info))
+                {
+                  if (flag_opt_info >= OPT_INFO_MAX)
+                    inform (input_location, "Not importing %s: command-line"
+                            " arguments not compatible with primary module",
+                            mod_info->source_filename);
+                }
+              else if ((fd = open (aux_da_filename, O_RDONLY)) < 0)
+                {
+                  if (flag_opt_info >= OPT_INFO_MAX)
+                    inform (input_location, "Not importing %s: couldn't open %s",
+                            mod_info->source_filename, aux_da_filename);
+                }
+              else if ((mod_info->lang & GCOV_MODULE_ASM_STMTS)
+                       && flag_ripa_disallow_asm_modules)
+                {
+                  if (flag_opt_info >= OPT_INFO_MAX)
+                    inform (input_location, "Not importing %s: contains "
+                            "assembler statements", mod_info->source_filename);
+                }
+              else
 		{
 		  close (fd);
 		  module_infos_read++;
@@ -612,7 +631,7 @@ 
 		}
             }
 
-          if (flag_ripa_verbose)
+          if (flag_opt_info >= OPT_INFO_MAX)
             {
               inform (input_location,
                       "MODULE Id=%d, Is_Primary=%s,"
@@ -676,7 +695,7 @@ 
     {
       static int warned = 0;
 
-      if (!warned++)
+      if ((flag_opt_info >= OPT_INFO_MIN) && !warned++)
 	inform (input_location, (flag_guess_branch_prob
 		 ? "file %s not found, execution counts estimated"
 		 : "file %s not found, execution counts assumed to be zero"),
@@ -688,7 +707,7 @@ 
 
   if (!entry)
     {
-      if (!flag_dyn_ipa)
+      if ((flag_opt_info >= OPT_INFO_MIN) && !flag_dyn_ipa)
 	warning (0, "no coverage for function %qE found",
 		 DECL_ASSEMBLER_NAME (current_function_decl));
       return NULL;
@@ -705,7 +724,7 @@ 
 	warning_at (input_location, OPT_Wcoverage_mismatch,
 		    "The control flow of function %qE does not match "
 		    "its profile data (counter %qs)", id, ctr_names[counter]);
-      if (warning_printed)
+      if ((flag_opt_info >= OPT_INFO_MIN) && warning_printed)
 	{
 	 inform (input_location, "Use -Wno-error=coverage-mismatch to tolerate "
 	 	 "the mismatch but performance may drop if the function is hot");
@@ -727,7 +746,8 @@ 
     }
     else if (entry->lineno_checksum != lineno_checksum)
       {
-        warning (0, "Source location for function %qE have changed,"
+        warning (OPT_Wcoverage_mismatch,
+                 "Source location for function %qE have changed,"
                  " the profile data may be out of date",
                  DECL_ASSEMBLER_NAME (current_function_decl));
       }
Index: gcc/common.opt
===================================================================
--- gcc/common.opt	(revision 180106)
+++ gcc/common.opt	(working copy)
@@ -1125,11 +1125,6 @@ 
 Don't promote always inline static functions assuming they
 will be inlined and no copy is needed.
 
-
-fripa-verbose
-Common Report Var(flag_ripa_verbose)
-Enable verbose informational messages for LIPO compilation
-
 fearly-inlining
 Common Report Var(flag_early_inlining) Init(1) Optimization
 Perform early inlining
@@ -1572,6 +1567,19 @@ 
 Common Report Var(flag_omit_frame_pointer) Optimization
 When possible do not generate stack frames
 
+fopt-info
+Common Report Var(flag_opt_info) Optimization Init(1)
+Enable verbose informational messages for optimizations (same as -fopt-info=1)
+
+; fopt-info=0: no message will be emitted.
+; fopt-info or fopt-info=1: emit important warnings and optimization messages with
+;   large performance impact.
+; fopt-info=2: warnings and optimization messages targeting power users.
+; fopt-info=3: informational messages for compiler developers.
+fopt-info=
+Common RejectNegative Joined UInteger Optimization
+-fopt-info=[0|1|2|3] Set the verbose level of informational messages for optimizations
+
 foptimize-register-move
 Common Report Var(flag_regmove) Optimization
 Do the full register move optimization pass
Index: gcc/tree-profile.c
===================================================================
--- gcc/tree-profile.c	(revision 180106)
+++ gcc/tree-profile.c	(working copy)
@@ -1096,13 +1096,16 @@ 
                 reusedist_make_instr_call (stmt, subst, counters),
                 GSI_NEW_STMT);
 
-            locus = (stmt != NULL)
-                ? gimple_location (stmt)
-                : DECL_SOURCE_LOCATION (current_function_decl);
-            inform (locus,
-                    "inserted reuse distance instrumentation for %qs, using "
-                    "%d gcov counters", subst->original_name,
-                    subst->num_ptr_args * RD_NUM_COUNTERS);
+            if (flag_opt_info >= OPT_INFO_MAX)
+              {
+                locus = (stmt != NULL)
+                    ? gimple_location (stmt)
+                    : DECL_SOURCE_LOCATION (current_function_decl);
+                inform (locus,
+                        "inserted reuse distance instrumentation for %qs, using "
+                        "%d gcov counters", subst->original_name,
+                        subst->num_ptr_args * RD_NUM_COUNTERS);
+              }
           }
       }
 }
@@ -1214,7 +1217,7 @@ 
 
   reusedist_from_counters (counters, &rd);
 
-  if (rd.count)
+  if ((flag_opt_info >= OPT_INFO_MAX) && rd.count)
     inform (locus, "reuse distance counters for arg %d: %lld %lld %lld %lld",
             arg, (long long int)rd.mean_dist, (long long int)rd.mean_size,
             (long long int)rd.count, (long long int)rd.dist_x_size);
@@ -1283,9 +1286,10 @@ 
   subst_decl = reusedist_get_nt_decl (gimple_call_fndecl (stmt), subst,
                                       suffix);
   gimple_call_set_fndecl (stmt, subst_decl);
-  inform (locus, "replaced %qs with non-temporal %qs",
-          subst->original_name,
-          IDENTIFIER_POINTER (DECL_ASSEMBLER_NAME (subst_decl)));
+  if (flag_opt_info >= OPT_INFO_MED)
+    inform (locus, "replaced %qs with non-temporal %qs",
+            subst->original_name,
+            IDENTIFIER_POINTER (DECL_ASSEMBLER_NAME (subst_decl)));
 }
 
 /* Replace string operations with equivalent nontemporal, when profitable.  */
@@ -1323,11 +1327,13 @@ 
 
   if (counter_index != n_counters)
     {
-      warning (0, "coverage mismatch for reuse distance counters "
+      warning (OPT_Wcoverage_mismatch,
+               "coverage mismatch for reuse distance counters "
                "in function %qs", IDENTIFIER_POINTER
                (DECL_ASSEMBLER_NAME (current_function_decl)));
-      inform (input_location, "number of counters is %u instead of %u",
-              n_counters, counter_index);
+      if (flag_opt_info >= OPT_INFO_MAX)
+        inform (input_location, "number of counters is %u instead of %u",
+                n_counters, counter_index);
     }
 }