Patchwork Add option for dumping to stderr (issue6190057)

login
register
mail settings
Submitter Richard Guenther
Date July 4, 2012, 1:33 p.m.
Message ID <CAFiYyc2neYLHce0EUDCBcVKEw4iHAA0rZhEVTWy4+SaiGYPx+A@mail.gmail.com>
Download mbox | patch
Permalink /patch/168984/
State New
Headers show

Comments

Richard Guenther - July 4, 2012, 1:33 p.m.
On Tue, Jul 3, 2012 at 11:07 PM, Sharad Singhai <singhai@google.com> wrote:
> Apologies for the spam. Attempting to resend the patch after shrinking it.
>
> I have updated the attached patch to use a new dump message
> classification system for the vectorizer. It currently uses four
> classes, viz, MSG_OPTIMIZED_LOCATIONS, MSG_UNOPTIMIZED_LOCATION,
> MSG_MISSING_OPTIMIZATION, and MSG_NOTE. I have gone through the
> vectorizer passes and have converted each call to fprintf (dump_file,
> ....) to a message classification matching in spirit. Most often, it
> is MSG_OPTIMIZED_LOCATIONS, but occasionally others as well.
>
> For example, the following
>
> if (vect_print_dump_info (REPORT_DETAILS))
>   {
>     fprintf (vect_dump, "niters for prolog loop: ");
>     print_generic_expr (vect_dump, iters, TDF_SLIM);
>   }
>
> gets converted to
>
> if (dump_kind (MSG_OPTIMIZED_LOCATIONS))
>   {
>      dump_printf_loc (MSG_OPTIMIZED_LOCATIONS, vect_location,
>                       "niters for prolog loop: ");
>      dump_generic_expr (MSG_OPTIMIZED_LOCATIONS, TDF_SLIM, iters);
>   }
>
> The asymmetry between the first printf and the second is due to the
> fact that 'vect_print_dump_info (xxx)' prints the location as a
> "side-effect". To preserve the original intent somewhat, I have
> converted the first call within a dump sequence to a dump_printf_loc
> (xxx) which prints the location while the subsequence calls within the
> same conditional get converted to the corresponding plain variants.

Ok, that looks reasonable.

> I considered removing the support for alternate dump file, but ended
> up preserving it instead since it is needed for setting the alternate
> dump file to stderr for the case when -fopt-info is given but no dump
> file is available.
>
> The following invocation
> g++ ... -ftree-vectorize -fopt-info=4
>
> dumps *all* available information to stderr. Currently, the opt-info
> level is common to all passes, i.e., a pass can't specify if wants a
> different level of diagnostic info. This can be added as an
> enhancement with a suitable syntax for selecting passes.
>
> I haven't fixed up the documentation/tests but wanted to get some
> feedback about the current state of patch before doing that.

Some comments / questions.

+  if (dump_file && (dump_kind & opt_info_flags))
+    {
+      dump_loc (dump_kind, dump_file, loc);
+      print_generic_expr (dump_file, t, dump_flags | extra_dump_flags);
+    }
+
+  if (alt_dump_file && (dump_kind & opt_info_flags))
+    {

you always test dump_kind against the same opt_info_flags variable.
I would have thought that the alternate dump file has a different opt_info_flags
setting so I can have -fdump-tree-vect-details -fopt-info=1.  Am I missing
something?

If I do

> gcc file1.c file2.c -O3 -fdump-tree-vectorize=foo

what will foo contain afterwards?  I think you need to document the behavior
when such redirection is used with the compiler-driver feature of handling
multiple translation units.  Especially the difference (or not difference) to

> gcc file1.c -O3 -fdump-tree-vectorize=foo
> gcc file2.c -O3 -fdump-tree-vectorize=foo

I suppose we do not want to append to foo (but eventually support that
with some alternate syntax?  Like -fdump-tree-vectorize=+foo?)

+
+static void
+set_opt_info (int opt_info_level)
+{

This function needs a comment.  How do the dump flags translate to
the opt-info flags?  Is this documented anywhere?  I only see

+/* Different types of dump classifications.  */
+enum dump_msg_kind {
+  MSG_NONE                     = 1 << 0,
+  MSG_OPTIMIZED_LOCATIONS      = 1 << 1,
+  MSG_UNOPTIMIZED_LOCATIONS    = 1 << 2,
+  MSG_MISSED_OPTIMIZATION      = 1 << 3,
+  MSG_NOTE                     = 1 << 4
+};

I suppose the TDF_* flags would all map to
MSG_OPTIMIZED_LOCATIONS|MSG_UNOPTIMIZED_LOCATIONS|MSG_MISSED_OPTIMIZATION
and only TDF_DETAILS would enable MSG_NOTE?

In the above a flag for MSG_NONE does not make much sense?

How would we map TDF_SCEV?  I suppose we'd add MSG_SCEV for this
(we talked about the possibility of having some special flags for
special passes).

But this also means a linear "opt-info-level" as in

+static void
+set_opt_info (int opt_info_level)

may not be a good representation.  Not a pass-ignorant opt_info_flags
value, if you consider -fopt-info=vect,3 (details from the vectorizer please).

+}
+
+
+
+DEBUG_FUNCTION void
+dump_combine_stats (int flags)

debug functions should be called debug_*, please add a comment and
avoid excessive vertical space

Overall I like the patch!

Thanks,
Richard.

> Thanks,
> Sharad
>
> On Fri, Jun 15, 2012 at 1:18 AM, Richard Guenther
> <richard.guenther@gmail.com> wrote:
>> On Fri, Jun 15, 2012 at 7:47 AM, Sharad Singhai <singhai@google.com> wrote:
>>> On Wed, Jun 13, 2012 at 4:48 AM, Richard Guenther
>>> <richard.guenther@gmail.com> wrote:
>>>> On Fri, Jun 8, 2012 at 7:16 AM, Sharad Singhai <singhai@google.com> wrote:
>>>>> Okay, I have updated the attached patch so that the output from
>>>>> -ftree-vectorizer-verbose is considered diagnostic information and is
>>>>> always
>>>>> sent to stderr. Other functionality remains unchanged. Here is some
>>>>> more context about this patch.
>>>>>
>>>>> This patch improves the dump infrastructure and public interfaces so
>>>>> that the existing private pass-specific dump stream is separated from
>>>>> the diagnostic dump stream (typically stderr).  The optimization
>>>>> passes can output information on the two streams independently.
>>>>>
>>>>> The newly defined interfaces are:
>>>>>
>>>>> Individual passes do not need to access the dump file directly. Thus Instead
>>>>> of doing
>>>>>
>>>>>   if (dump_file && (flags & dump_flags))
>>>>>      fprintf (dump_file, ...);
>>>>>
>>>>> they can do
>>>>>
>>>>>     dump_printf (flags, ...);
>>>>>
>>>>> If the current pass has FLAGS enabled then the information gets
>>>>> printed into the dump file otherwise not.
>>>>>
>>>>> Similar to the dump_printf (), another function is defined, called
>>>>>
>>>>>        diag_printf (dump_flags, ...)
>>>>>
>>>>> This prints information only onto the diagnostic stream, typically
>>>>> standard error. It is useful for separating pass-specific dump
>>>>> information from
>>>>> the diagnostic information.
>>>>>
>>>>> Currently, as a proof of concept, I have converted vectorizer passes
>>>>> to use the new dump format. For this, I have considered
>>>>> information printed in vect_dump file as diagnostic. Thus 'fprintf'
>>>>> calls are changed to 'diag_printf'. Some other information printed to
>>>>> dump_file is sent to the regular dump file via 'dump_printf ()'. It
>>>>> helps to separate the two streams because they might serve different
>>>>> purposes and might have different formatting requirements.
>>>>>
>>>>> For example, using the trunk compiler, the following invocation
>>>>>
>>>>> g++ -S v.cc -ftree-vectorize -fdump-tree-vect -ftree-vectorizer-verbose=2
>>>>>
>>>>> prints tree vectorizer dump into a file named 'v.cc.113t.vect'.
>>>>> However, the verbose diagnostic output is silently
>>>>> ignored. This is not desirable as the two types of dump should not interfere.
>>>>>
>>>>> After this patch, the vectorizer dump is available in 'v.cc.113t.vect'
>>>>> as before, but the verbose vectorizer diagnostic is additionally
>>>>> printed on stderr. Thus both types of dump information are output.
>>>>>
>>>>> An additional feature of this patch is that individual passes can
>>>>> print dump information into command-line named files instead of auto
>>>>> numbered filename. For example,
>>>>
>>>> I'd wish you'd leave out this part for a followup.
>>>>
>>>>>
>>>>> g++ -S -O2 v.cc -ftree-vectorize -fdump-tree-vect=foo.vect
>>>>>     -ftree-vectorizer-verbose=2
>>>>>     -fdump-tree-pre=foo.pre
>>>>>
>>>>> This prints the tree vectorizer dump into 'foo.vect', PRE dump into
>>>>> 'foo.pre', and the vectorizer verbose diagnostic dump onto stderr.
>>>>>
>>>>> Please take another look.
>>>>
>>>> --- tree-vect-loop-manip.c      (revision 188325)
>>>> +++ tree-vect-loop-manip.c      (working copy)
>>>> @@ -789,14 +789,11 @@ slpeel_make_loop_iterate_ntimes (struct loop *loop
>>>>   gsi_remove (&loop_cond_gsi, true);
>>>>
>>>>   loop_loc = find_loop_location (loop);
>>>> -  if (dump_file && (dump_flags & TDF_DETAILS))
>>>> -    {
>>>> -      if (loop_loc != UNKNOWN_LOC)
>>>> -        fprintf (dump_file, "\nloop at %s:%d: ",
>>>> +  if (loop_loc != UNKNOWN_LOC)
>>>> +    dump_printf (TDF_DETAILS, "\nloop at %s:%d: ",
>>>>                  LOC_FILE (loop_loc), LOC_LINE (loop_loc));
>>>> -      print_gimple_stmt (dump_file, cond_stmt, 0, TDF_SLIM);
>>>> -    }
>>>> -
>>>> +  if (dump_flags & TDF_DETAILS)
>>>> +    dump_gimple_stmt (TDF_SLIM, cond_stmt, 0);
>>>>   loop->nb_iterations = niters;
>>>>
>>>> I'm confused by this.  Why is this not simply
>>>>
>>>>  if (loop_loc != UNKNOWN_LOC)
>>>>    dump_printf (dump_flags, "\nloop at %s:%d: ",
>>>>                       LOC_FILE (loop_loc), LOC_LINE (loop_loc));
>>>>  dump_gimple_stmt (dump_flags | TDF_SLIM, cond_stmt, 0);
>>>>
>>>> for example.  I notice that you maybe mis-understood the message classification
>>>> I asked you to add (maybe I confused you by mentioning to eventually re-use
>>>> the TDF_* flags).  I think you basically provided this message classification
>>>> by adding two classes by providing both dump_gimple_stmt and diag_gimple_stmt.
>>>> But still in the above you keep a dump_flags test _and_ you pass in
>>>> (altered) dump_flags to the dump/diag_gimple_stmt routines.  Let me quote them:
>>>>
>>>> +void
>>>> +dump_gimple_stmt (int flags, gimple gs, int spc)
>>>> +{
>>>> +  if (dump_file)
>>>> +    print_gimple_stmt (dump_file, gs, spc, flags);
>>>> +}
>>>>
>>>> +void
>>>> +diag_gimple_stmt (int flags, gimple gs, int spc)
>>>> +{
>>>> +  if (alt_dump_file)
>>>> +    print_gimple_stmt (alt_dump_file, gs, spc, flags);
>>>> +}
>>>>
>>>> I'd say it should have been a single function:
>>>>
>>>> void
>>>> dump_gimple_stmt (enum msg_classification, int additional_flags,
>>>> gimple gs, int spc)
>>>> {
>>>>  if (msg_classification & go-to-dumpfile
>>>>      && dump_file)
>>>>    print_gimple_stmt (dump_file, gs, spc, dump_flags | additional_flags);
>>>>  if (msg_classification & go-to-alt-dump-file
>>>>      && alt_dump_file && (alt_dump_flags & msg_classification))
>>>>    print_gimple_stmt (alt_dump_file, gs, spc, alt_dump_flags |
>>>> additional_flags);
>>>> }
>>>
>>> Yes,  I can make the single function work for both regular (dump_file)
>>> and diagnostic (stderr for now) dump streams. In fact, my original
>>> patch did just that. However, it duplicated output on *both* the
>>> streams. For example,
>>>
>>> gcc -O2 -ftree-vectorize -fdump-tree-vect=foo.vect
>>> -ftree-vectorizer-verbose=2 foo.c
>>>
>>> Since both streams are enabled in this case, a call to dump_printf ()
>>> would duplicate the information to both files, i.e., the dump and
>>> diagnostic will be intermixed. Is that intended? My first thought was
>>> to keep them separated via dump_*() and diag_* () methods.
>>
>> No, I think that's intended.  The regular dump-files should contain
>> everything, the secondary stream (eventually enabled by -fopt-info)
>> should be a filtered regular dump-file.
>>
>> Thanks,
>> Richard.
>>
>>> Thanks,
>>> Sharad
>>>
>>>> where msg_classification would include things to suitably classify messages
>>>> for -fopt-info, like MSG_MISSED_OPTIMIZATION, MSG_OPTIMIZED, MSG_NOTE.
>>>>
>>>> In another place we have
>>>>
>>>> @@ -1648,7 +1642,7 @@ vect_can_advance_ivs_p (loop_vec_info loop_vinfo)
>>>>   /* Analyze phi functions of the loop header.  */
>>>>
>>>>   if (vect_print_dump_info (REPORT_DETAILS))
>>>> -    fprintf (vect_dump, "vect_can_advance_ivs_p:");
>>>> +    diag_printf (TDF_TREE, "vect_can_advance_ivs_p:");
>>>>
>>>> so why's that diag_printf and why TDF_TREE?  I suppose you made all
>>>> dumps to vect_dump diag_* and all dumps to dump_file dump_*?  I
>>>> think it should have been
>>>>
>>>>   dump_printf (REPORT_DETAILS, "vect_can_advance_ivs_p:");
>>>>
>>>> thus dump_printf only gets the msg-classification and the printf args
>>>> (dump-flags
>>>> are only meaningful when passed down to pretty-printers).  Thus
>>>>
>>>> @@ -1658,8 +1652,8 @@ vect_can_advance_ivs_p (loop_vec_info loop_vinfo)
>>>>       phi = gsi_stmt (gsi);
>>>>       if (vect_print_dump_info (REPORT_DETAILS))
>>>>        {
>>>> -          fprintf (vect_dump, "Analyze phi: ");
>>>> -          print_gimple_stmt (vect_dump, phi, 0, TDF_SLIM);
>>>> +          diag_printf (TDF_TREE, "Analyze phi: ");
>>>> +          diag_gimple_stmt (TDF_SLIM, phi, 0);
>>>>        }
>>>>
>>>> should be
>>>>
>>>>  dump_printf (REPORT_DETAILS, "Analyze phi: ");
>>>>  dump_gimple_stmt (REPORT_DETAILS, TDF_SLIM, phi, 0);
>>>>
>>>> and the if (vect_print_dump_info (REPORT_DETAILS)) should be what
>>>> the dump infrastructure provides and thus hidden.  Eventually we should
>>>> have a dump_kind (msg-classification) so we can replace it with
>>>>
>>>>   if (dump_kind (REPORT_DETAILS))
>>>>     {
>>>>       dump_printf (REPORT_DETAILS, "Analyze phi: ");
>>>>       dump_gimple_stmt (REPORT_DETAILS, TDF_SLIM, phi, 0);
>>>>     }
>>>>
>>>> to reduce the runtime overhead when not diagnosing/dumping.
>>>>
>>>> @@ -2464,8 +2458,8 @@ vect_create_cond_for_alias_checks (loop_vec_info l
>>>>     }
>>>>
>>>>   if (vect_print_dump_info (REPORT_VECTORIZED_LOCATIONS))
>>>> -    fprintf (vect_dump, "created %u versioning for alias checks.\n",
>>>> -             VEC_length (ddr_p, may_alias_ddrs));
>>>> +    diag_printf (TDF_TREE, "created %u versioning for alias checks.\n",
>>>> +                 VEC_length (ddr_p, may_alias_ddrs));
>>>>  }
>>>>
>>>> so here we have a different msg-classification,
>>>> REPORT_VECTORIZED_LOCATIONS.  As we eventually want a central
>>>> -fopt-report=... we do not want to leave this implementation detail to
>>>> individual passes but gather them in a central place.
>>>>
>>>> Thanks,
>>>> Richard.
>>>>
>>>>> Thanks,
>>>>> Sharad
>>>>>
>>>>>
>>>>> On Thu, Jun 7, 2012 at 12:19 AM, Xinliang David Li <davidxl@google.com> wrote:
>>>>>> On Wed, Jun 6, 2012 at 10:58 PM, Sharad Singhai <singhai@google.com> wrote:
>>>>>>> Sorry about the delay. I have finally incorporated all the suggestions
>>>>>>> and reorganized the dump infrastructure a bit. The attached patch
>>>>>>> updates vectorizer passes so that instead of accessing global
>>>>>>> dump_file directly, these passes call dump_printf (FLAG, format, ...).
>>>>>>> The dump_printf can choose between two streams, one regular pass dump
>>>>>>> file, and another optional command line provided file. Currently, it
>>>>>>> doesn't discriminate and all the dump information goes to both the
>>>>>>> streams.
>>>>>>>
>>>>>>> Thus, for example,
>>>>>>>
>>>>>>> g++ -O2 v.cc -ftree-vectorize -fdump-tree-vect=foo.v -ftree-vectorizer-verbose=3
>>>>>>
>>>>>> But the default form of dump option (-fdump-tree-vect) no longer
>>>>>> interferes with -ftree-vectorize-verbose=xxx output right? (this is
>>>>>> one of the main requirements). One dumped to the primary stream (named
>>>>>> dump file) and the other to the stderr -- the default diagnostic (alt)
>>>>>> stream.
>>>>>>
>>>>>> David
>>>>>>
>>>>>>>
>>>>>>> will output the verbose vectorizer information in both *.vect file and
>>>>>>> foo.v file. However, as I have converted only vectorizer passes so
>>>>>>> far, there is additional information in *.vect file which is not
>>>>>>> present in foo.v file. Once other passes are converted to use this
>>>>>>> scheme, then these two dump files should have identical output.
>>>>>>>
>>>>>>> Also note that in this patch -fdump-xxx=yyy format does not override
>>>>>>> any auto named dump files as in my earlier patches. Instead the dump
>>>>>>> information is output to both places when a command line dump file
>>>>>>> option is provided.
>>>>>>>
>>>>>>> To summarize:
>>>>>>> - instead of using dump_begin () / dump_end (), the passes should use
>>>>>>> dump_start ()/dump_finish (). These new variants do not return the
>>>>>>> dump_file. However, they still set the global dump_file/dump_flags for
>>>>>>> the benefit of other passes during the transition.
>>>>>>> - instead of directly printing to the dump_file, as in
>>>>>>> if (dump_file)
>>>>>>>  fprintf (dump_file, ...);
>>>>>>>
>>>>>>> The passes should do
>>>>>>>
>>>>>>> dump_printf (dump_flag, ...);
>>>>>>> This will output to dump file(s) only when dump_flag is enabled for a
>>>>>>> given pass.
>>>>>>>
>>>>>>> I have bootstrapped and tested it on x86_64. Does it look okay?
>>>>>>>
>>>>>>> Thanks,
>>>>>>> Sharad
>>>>>>>
>>>>>>>
>>>>>>> On Mon, May 14, 2012 at 12:26 AM, Richard Guenther
>>>>>>> <richard.guenther@gmail.com> wrote:
>>>>>>>> On Sat, May 12, 2012 at 6:39 PM, Xinliang David Li <davidxl@google.com> wrote:
>>>>>>>>> On Sat, May 12, 2012 at 9:26 AM, Gabriel Dos Reis
>>>>>>>>> <gdr@integrable-solutions.net> wrote:
>>>>>>>>>> On Sat, May 12, 2012 at 11:05 AM, Xinliang David Li <davidxl@google.com> wrote:
>>>>>>>>>>
>>>>>>>>>>> The downside is that the dump file format will look different from the
>>>>>>>>>>> stderr output which is less than ideal.
>>>>>>>>>>
>>>>>>>>>> BTW, why do people want to use stderr for dumping internal IRs,
>>>>>>>>>> as opposed to stdout or other files?  That does not sound right.
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>> I was talking about the transformation information difference. In
>>>>>>>>> stderr (where diagnostics go to), we may have
>>>>>>>>>
>>>>>>>>> foo.c: in function 'foo':
>>>>>>>>> foo.c:5:6: note: loop was vectorized
>>>>>>>>>
>>>>>>>>> but in dump file the format for the information may be different,
>>>>>>>>> unless we want to duplicate the machinery in diagnostics.
>>>>>>>>
>>>>>>>> So?  What's the problem with that ("different" diagnostics)?
>>>>>>>>
>>>>>>>> Richard.
>>>>>>>>
>>>>>>>>> David
>>>>>>>>>
>>>>>>>>>> -- Gaby
Sharad Singhai - Aug. 24, 2012, 8:06 a.m.
Sorry about the delay. Please see comments inline.

On Wed, Jul 4, 2012 at 6:33 AM, Richard Guenther
<richard.guenther@gmail.com> wrote:
> On Tue, Jul 3, 2012 at 11:07 PM, Sharad Singhai <singhai@google.com> wrote:
>> Apologies for the spam. Attempting to resend the patch after shrinking it.
>>
>> I have updated the attached patch to use a new dump message
>> classification system for the vectorizer. It currently uses four
>> classes, viz, MSG_OPTIMIZED_LOCATIONS, MSG_UNOPTIMIZED_LOCATION,
>> MSG_MISSING_OPTIMIZATION, and MSG_NOTE. I have gone through the
>> vectorizer passes and have converted each call to fprintf (dump_file,
>> ....) to a message classification matching in spirit. Most often, it
>> is MSG_OPTIMIZED_LOCATIONS, but occasionally others as well.
>>
>> For example, the following
>>
>> if (vect_print_dump_info (REPORT_DETAILS))
>>   {
>>     fprintf (vect_dump, "niters for prolog loop: ");
>>     print_generic_expr (vect_dump, iters, TDF_SLIM);
>>   }
>>
>> gets converted to
>>
>> if (dump_kind (MSG_OPTIMIZED_LOCATIONS))
>>   {
>>      dump_printf_loc (MSG_OPTIMIZED_LOCATIONS, vect_location,
>>                       "niters for prolog loop: ");
>>      dump_generic_expr (MSG_OPTIMIZED_LOCATIONS, TDF_SLIM, iters);
>>   }
>>
>> The asymmetry between the first printf and the second is due to the
>> fact that 'vect_print_dump_info (xxx)' prints the location as a
>> "side-effect". To preserve the original intent somewhat, I have
>> converted the first call within a dump sequence to a dump_printf_loc
>> (xxx) which prints the location while the subsequence calls within the
>> same conditional get converted to the corresponding plain variants.
>
> Ok, that looks reasonable.
>
>> I considered removing the support for alternate dump file, but ended
>> up preserving it instead since it is needed for setting the alternate
>> dump file to stderr for the case when -fopt-info is given but no dump
>> file is available.
>>
>> The following invocation
>> g++ ... -ftree-vectorize -fopt-info=4
>>
>> dumps *all* available information to stderr. Currently, the opt-info
>> level is common to all passes, i.e., a pass can't specify if wants a
>> different level of diagnostic info. This can be added as an
>> enhancement with a suitable syntax for selecting passes.
>>
>> I haven't fixed up the documentation/tests but wanted to get some
>> feedback about the current state of patch before doing that.
>
> Some comments / questions.
>
> +  if (dump_file && (dump_kind & opt_info_flags))
> +    {
> +      dump_loc (dump_kind, dump_file, loc);
> +      print_generic_expr (dump_file, t, dump_flags | extra_dump_flags);
> +    }
> +
> +  if (alt_dump_file && (dump_kind & opt_info_flags))
> +    {
>
> you always test dump_kind against the same opt_info_flags variable.
> I would have thought that the alternate dump file has a different opt_info_flags
> setting so I can have -fdump-tree-vect-details -fopt-info=1.  Am I missing
> something?

It was an oversight on my part. I have since fixed this. There are two
separate flags corresponding to the two types of dump files,

pflags ==> pass private dump file
alt_flags ==> opt-info dump file

> If I do
>
>> gcc file1.c file2.c -O3 -fdump-tree-vectorize=foo
>
> what will foo contain afterwards?  I think you need to document the behavior
> when such redirection is used with the compiler-driver feature of handling
> multiple translation units.  Especially the difference (or not difference) to
>
>> gcc file1.c -O3 -fdump-tree-vectorize=foo
>> gcc file2.c -O3 -fdump-tree-vectorize=foo

Yes, the dump file gets overwritten during each invocation. I have
noted this in the documentation.

> I suppose we do not want to append to foo (but eventually support that
> with some alternate syntax?  Like -fdump-tree-vectorize=+foo?)

Yes, I agree. We could define a new syntax as you suggested for
appending to a dump file. However, this feature can wait for a
separate patch.

> +
> +static void
> +set_opt_info (int opt_info_level)
> +{
>
> This function needs a comment.  How do the dump flags translate to
> the opt-info flags?  Is this documented anywhere?  I only see
>
> +/* Different types of dump classifications.  */
> +enum dump_msg_kind {
> +  MSG_NONE                     = 1 << 0,
> +  MSG_OPTIMIZED_LOCATIONS      = 1 << 1,
> +  MSG_UNOPTIMIZED_LOCATIONS    = 1 << 2,
> +  MSG_MISSED_OPTIMIZATION      = 1 << 3,
> +  MSG_NOTE                     = 1 << 4
> +};

Yes, my mapping was static (pass-agnostic), defined inside the set_opt_info. I
have now documented it and revamped it so that -fopt-info is applied
to specific passes as explained below.

> I suppose the TDF_* flags would all map to
> MSG_OPTIMIZED_LOCATIONS|MSG_UNOPTIMIZED_LOCATIONS|MSG_MISSED_OPTIMIZATION
> and only TDF_DETAILS would enable MSG_NOTE?

Yes, the mapping is very coarse-grained right now. Currently I have
made MSG_* flags an extension of TDF_* flags. This way both kinds of
flags can be present in a single word yet still be interpreted by the
dumps which are interested in them. TDF_DETAILS gets mapped to a union
of all the MSG_* flags. This will allow flags such -fdump-vect-details
to continue to work as expected during the transition to the new dump
infrastructure. Of coure,  during the transition, we would need to
rejigger flags to express the desired behavior.

> In the above a flag for MSG_NONE does not make much sense?

Removed. It is now implicitly 0.

>
> How would we map TDF_SCEV?  I suppose we'd add MSG_SCEV for this
> (we talked about the possibility of having some special flags for
> special passes).

Yes, I think this would be necessary. So far, I haven't converted any
passes other than vectorization, and I suspect a handful of special
case flags would be needed. During the transition, the TDF_* flags are
assumed to be an extension of MSG_* flags and things won't break.

>
> But this also means a linear "opt-info-level" as in
>
> +static void
> +set_opt_info (int opt_info_level)
>
> may not be a good representation.  Not a pass-ignorant opt_info_flags
> value, if you consider -fopt-info=vect,3 (details from the vectorizer please).

Agreed. As I mentioned earlier, I was considering a simple linear view
of -fopt-info. However, I understand that making it pass cognizant is
certainly better. To this end, I have modified the syntax of
-fopt-info somewhat along the lines of -fdump-xxx format. For example,

gcc ... -fopt-info-tree-vect-optimized=foo.vect.optimized
-fdump-tree-pre-details=stderr ...

This would dump info about optimized locations from the vectorizer
pass into foo.vect.optimized, and pass dump from the PRE pass on to
stderr.

>
> +}
> +
> +
> +
> +DEBUG_FUNCTION void
> +dump_combine_stats (int flags)
>
> debug functions should be called debug_*, please add a comment and
> avoid excessive vertical space

Fixed.

>
> Index: tree-pass.h
> ===================================================================
> --- tree-pass.h (revision 188966)
> +++ tree-pass.h (working copy)
> @@ -85,7 +85,6 @@ enum tree_dump_index
>  #define TDF_CSELIB     (1 << 23)       /* Dump cselib details.  */
>  #define TDF_SCEV       (1 << 24)       /* Dump SCEV details.  */
>
> -
>  /* In tree-dump.c */
>
>  extern char *get_dump_file_name (int);
>
> please avoid whitespace-only changes (also elsewhere).

Fixed.

>
>  /* Global variables used to communicate with passes.  */
>  extern FILE *dump_file;
> +extern FILE *alt_dump_file;
>  extern int dump_flags;
> +extern int opt_info_flags;
>
> so I expected the primary dump_file to be controlled with dump_flags,
> or with a (cached) translation of them to a primary_opt_info_flags.

Yes, now I have two sets of flags.

>
> Index: gimple-pretty-print.h
> ===================================================================
> --- gimple-pretty-print.h       (revision 188966)
> +++ gimple-pretty-print.h       (working copy)
> @@ -31,6 +31,6 @@ extern void debug_gimple_seq (gimple_seq);
>  extern void print_gimple_seq (FILE *, gimple_seq, int, int);
>  extern void print_gimple_stmt (FILE *, gimple, int, int);
>  extern void print_gimple_expr (FILE *, gimple, int, int);
> -extern void dump_gimple_stmt (pretty_printer *, gimple, int, int);
> +extern void print_gimple_stmt_1 (pretty_printer *, gimple, int, int);
>
> I'd call this pp_gimple_stmt instead.

Fixed.

>
> @@ -1418,6 +1419,16 @@ init_branch_prob (void)
>  void
>  end_branch_prob (void)
>  {
> +  end_branch_prob_1 (dump_file);
> +  end_branch_prob_1 (alt_dump_file);
> +}
> +
> +/* Helper function for file-level cleanup for DUMP_FILE after
> +   branch-prob processing is completed. */
> +
> +static void
> +end_branch_prob_1 (FILE *dump_file)
> +{
>    if (dump_file)
>      {
>        fprintf (dump_file, "\n");
>
> That change looks odd ;)  Can you instead use the new dump_printf
> interface?  (side-note: we should not need to export alt_dump_file at all!)

Sorry, it was my ill conceived attempt to avoid converting this pass.
:) Anyway, I did a proper fix now, and converted an additional file
(profile.c) as a side benefit.

>
> @@ -2166,10 +2177,6 @@ ftree-vect-loop-version
>  Common Report Var(flag_tree_vect_loop_version) Init(1) Optimization
>  Enable loop versioning when doing loop vectorization on trees
>
> -ftree-vectorizer-verbose=
> -Common RejectNegative Joined UInteger
> --ftree-vectorizer-verbose=<number>     Set the verbosity level of the
> vectorizer
> -
>
> we need to preserve old switches for backward compatibility, I suggest
> to alias it to -fopt-info for now.

Okay. I mapped -ftree-vectorizer-verbose=<number> as
0 ==> No dump output
1 ==> dump optimized locations
2 ==> dump missed optimizations
3 ==> dump notes
4 and up ==> dump all, i.e., 1, 2, 3

Curiously, several tests are casualties of reinterpretation of this
flag. Here is how -- the old (and odd) behavior was that

gcc ... -ftree-vectorize -fdump-tree-vect -ftree-vectorizer-verbose=3

would output tree-vect dump as usual but nothing would get printed by
-ftree-vectorizer-verbose=3. In this patch I have "fixed" this
behavior so that -ftree-vectorizer-verbose=<n> prints on stderr
regardless of other flags present(*). This has the unfortunate side
effect of extra output being sent to stderr which is interpreted as a
test failure. Please advise how to ignore that additional stderr
output in tests.

(*) Well, not entirely true. Since -ftree-vectorizer-verbose=<n> is
implemented in terms of -fopt-info, the output of verbose flag is the
stream where ever vectorizer's opt-info is being sent.

> @@ -13909,7 +13909,7 @@ unmentioned_reg_p (rtx equiv, rtx expr)
>  }
>  ^L
>  DEBUG_FUNCTION void
> -dump_combine_stats (FILE *file)
> +print_combine_stats (FILE *file)
>
> see above, debug_combine_stats.

Fixed.

>
> Index: system.h
> ===================================================================
> --- system.h    (revision 188966)
> +++ system.h    (working copy)
> @@ -669,6 +669,7 @@ extern int vsnprintf(char *, size_t, const char *,
>     except, maybe, something to the dump file.  */
>  #ifdef BUFSIZ
>  extern FILE *dump_file;
> +extern FILE *alt_dump_file;
>
> see above - we should not need to export this (nor opt_info_flags).

Fixed.

Please take another look and see if the attached patch looks okay? I
still need to fix the failing tests due to intentional output on
stderr.

Thanks,
Sharad

>
> Overall I like the patch!
>
> Thanks,
> Richard.
>
>> Thanks,
>> Sharad
>>
>> On Fri, Jun 15, 2012 at 1:18 AM, Richard Guenther
>> <richard.guenther@gmail.com> wrote:
>>> On Fri, Jun 15, 2012 at 7:47 AM, Sharad Singhai <singhai@google.com> wrote:
>>>> On Wed, Jun 13, 2012 at 4:48 AM, Richard Guenther
>>>> <richard.guenther@gmail.com> wrote:
>>>>> On Fri, Jun 8, 2012 at 7:16 AM, Sharad Singhai <singhai@google.com> wrote:
>>>>>> Okay, I have updated the attached patch so that the output from
>>>>>> -ftree-vectorizer-verbose is considered diagnostic information and is
>>>>>> always
>>>>>> sent to stderr. Other functionality remains unchanged. Here is some
>>>>>> more context about this patch.
>>>>>>
>>>>>> This patch improves the dump infrastructure and public interfaces so
>>>>>> that the existing private pass-specific dump stream is separated from
>>>>>> the diagnostic dump stream (typically stderr).  The optimization
>>>>>> passes can output information on the two streams independently.
>>>>>>
>>>>>> The newly defined interfaces are:
>>>>>>
>>>>>> Individual passes do not need to access the dump file directly. Thus Instead
>>>>>> of doing
>>>>>>
>>>>>>   if (dump_file && (flags & dump_flags))
>>>>>>      fprintf (dump_file, ...);
>>>>>>
>>>>>> they can do
>>>>>>
>>>>>>     dump_printf (flags, ...);
>>>>>>
>>>>>> If the current pass has FLAGS enabled then the information gets
>>>>>> printed into the dump file otherwise not.
>>>>>>
>>>>>> Similar to the dump_printf (), another function is defined, called
>>>>>>
>>>>>>        diag_printf (dump_flags, ...)
>>>>>>
>>>>>> This prints information only onto the diagnostic stream, typically
>>>>>> standard error. It is useful for separating pass-specific dump
>>>>>> information from
>>>>>> the diagnostic information.
>>>>>>
>>>>>> Currently, as a proof of concept, I have converted vectorizer passes
>>>>>> to use the new dump format. For this, I have considered
>>>>>> information printed in vect_dump file as diagnostic. Thus 'fprintf'
>>>>>> calls are changed to 'diag_printf'. Some other information printed to
>>>>>> dump_file is sent to the regular dump file via 'dump_printf ()'. It
>>>>>> helps to separate the two streams because they might serve different
>>>>>> purposes and might have different formatting requirements.
>>>>>>
>>>>>> For example, using the trunk compiler, the following invocation
>>>>>>
>>>>>> g++ -S v.cc -ftree-vectorize -fdump-tree-vect -ftree-vectorizer-verbose=2
>>>>>>
>>>>>> prints tree vectorizer dump into a file named 'v.cc.113t.vect'.
>>>>>> However, the verbose diagnostic output is silently
>>>>>> ignored. This is not desirable as the two types of dump should not interfere.
>>>>>>
>>>>>> After this patch, the vectorizer dump is available in 'v.cc.113t.vect'
>>>>>> as before, but the verbose vectorizer diagnostic is additionally
>>>>>> printed on stderr. Thus both types of dump information are output.
>>>>>>
>>>>>> An additional feature of this patch is that individual passes can
>>>>>> print dump information into command-line named files instead of auto
>>>>>> numbered filename. For example,
>>>>>
>>>>> I'd wish you'd leave out this part for a followup.
>>>>>
>>>>>>
>>>>>> g++ -S -O2 v.cc -ftree-vectorize -fdump-tree-vect=foo.vect
>>>>>>     -ftree-vectorizer-verbose=2
>>>>>>     -fdump-tree-pre=foo.pre
>>>>>>
>>>>>> This prints the tree vectorizer dump into 'foo.vect', PRE dump into
>>>>>> 'foo.pre', and the vectorizer verbose diagnostic dump onto stderr.
>>>>>>
>>>>>> Please take another look.
>>>>>
>>>>> --- tree-vect-loop-manip.c      (revision 188325)
>>>>> +++ tree-vect-loop-manip.c      (working copy)
>>>>> @@ -789,14 +789,11 @@ slpeel_make_loop_iterate_ntimes (struct loop *loop
>>>>>   gsi_remove (&loop_cond_gsi, true);
>>>>>
>>>>>   loop_loc = find_loop_location (loop);
>>>>> -  if (dump_file && (dump_flags & TDF_DETAILS))
>>>>> -    {
>>>>> -      if (loop_loc != UNKNOWN_LOC)
>>>>> -        fprintf (dump_file, "\nloop at %s:%d: ",
>>>>> +  if (loop_loc != UNKNOWN_LOC)
>>>>> +    dump_printf (TDF_DETAILS, "\nloop at %s:%d: ",
>>>>>                  LOC_FILE (loop_loc), LOC_LINE (loop_loc));
>>>>> -      print_gimple_stmt (dump_file, cond_stmt, 0, TDF_SLIM);
>>>>> -    }
>>>>> -
>>>>> +  if (dump_flags & TDF_DETAILS)
>>>>> +    dump_gimple_stmt (TDF_SLIM, cond_stmt, 0);
>>>>>   loop->nb_iterations = niters;
>>>>>
>>>>> I'm confused by this.  Why is this not simply
>>>>>
>>>>>  if (loop_loc != UNKNOWN_LOC)
>>>>>    dump_printf (dump_flags, "\nloop at %s:%d: ",
>>>>>                       LOC_FILE (loop_loc), LOC_LINE (loop_loc));
>>>>>  dump_gimple_stmt (dump_flags | TDF_SLIM, cond_stmt, 0);
>>>>>
>>>>> for example.  I notice that you maybe mis-understood the message classification
>>>>> I asked you to add (maybe I confused you by mentioning to eventually re-use
>>>>> the TDF_* flags).  I think you basically provided this message classification
>>>>> by adding two classes by providing both dump_gimple_stmt and diag_gimple_stmt.
>>>>> But still in the above you keep a dump_flags test _and_ you pass in
>>>>> (altered) dump_flags to the dump/diag_gimple_stmt routines.  Let me quote them:
>>>>>
>>>>> +void
>>>>> +dump_gimple_stmt (int flags, gimple gs, int spc)
>>>>> +{
>>>>> +  if (dump_file)
>>>>> +    print_gimple_stmt (dump_file, gs, spc, flags);
>>>>> +}
>>>>>
>>>>> +void
>>>>> +diag_gimple_stmt (int flags, gimple gs, int spc)
>>>>> +{
>>>>> +  if (alt_dump_file)
>>>>> +    print_gimple_stmt (alt_dump_file, gs, spc, flags);
>>>>> +}
>>>>>
>>>>> I'd say it should have been a single function:
>>>>>
>>>>> void
>>>>> dump_gimple_stmt (enum msg_classification, int additional_flags,
>>>>> gimple gs, int spc)
>>>>> {
>>>>>  if (msg_classification & go-to-dumpfile
>>>>>      && dump_file)
>>>>>    print_gimple_stmt (dump_file, gs, spc, dump_flags | additional_flags);
>>>>>  if (msg_classification & go-to-alt-dump-file
>>>>>      && alt_dump_file && (alt_dump_flags & msg_classification))
>>>>>    print_gimple_stmt (alt_dump_file, gs, spc, alt_dump_flags |
>>>>> additional_flags);
>>>>> }
>>>>
>>>> Yes,  I can make the single function work for both regular (dump_file)
>>>> and diagnostic (stderr for now) dump streams. In fact, my original
>>>> patch did just that. However, it duplicated output on *both* the
>>>> streams. For example,
>>>>
>>>> gcc -O2 -ftree-vectorize -fdump-tree-vect=foo.vect
>>>> -ftree-vectorizer-verbose=2 foo.c
>>>>
>>>> Since both streams are enabled in this case, a call to dump_printf ()
>>>> would duplicate the information to both files, i.e., the dump and
>>>> diagnostic will be intermixed. Is that intended? My first thought was
>>>> to keep them separated via dump_*() and diag_* () methods.
>>>
>>> No, I think that's intended.  The regular dump-files should contain
>>> everything, the secondary stream (eventually enabled by -fopt-info)
>>> should be a filtered regular dump-file.
>>>
>>> Thanks,
>>> Richard.
>>>
>>>> Thanks,
>>>> Sharad
>>>>
>>>>> where msg_classification would include things to suitably classify messages
>>>>> for -fopt-info, like MSG_MISSED_OPTIMIZATION, MSG_OPTIMIZED, MSG_NOTE.
>>>>>
>>>>> In another place we have
>>>>>
>>>>> @@ -1648,7 +1642,7 @@ vect_can_advance_ivs_p (loop_vec_info loop_vinfo)
>>>>>   /* Analyze phi functions of the loop header.  */
>>>>>
>>>>>   if (vect_print_dump_info (REPORT_DETAILS))
>>>>> -    fprintf (vect_dump, "vect_can_advance_ivs_p:");
>>>>> +    diag_printf (TDF_TREE, "vect_can_advance_ivs_p:");
>>>>>
>>>>> so why's that diag_printf and why TDF_TREE?  I suppose you made all
>>>>> dumps to vect_dump diag_* and all dumps to dump_file dump_*?  I
>>>>> think it should have been
>>>>>
>>>>>   dump_printf (REPORT_DETAILS, "vect_can_advance_ivs_p:");
>>>>>
>>>>> thus dump_printf only gets the msg-classification and the printf args
>>>>> (dump-flags
>>>>> are only meaningful when passed down to pretty-printers).  Thus
>>>>>
>>>>> @@ -1658,8 +1652,8 @@ vect_can_advance_ivs_p (loop_vec_info loop_vinfo)
>>>>>       phi = gsi_stmt (gsi);
>>>>>       if (vect_print_dump_info (REPORT_DETAILS))
>>>>>        {
>>>>> -          fprintf (vect_dump, "Analyze phi: ");
>>>>> -          print_gimple_stmt (vect_dump, phi, 0, TDF_SLIM);
>>>>> +          diag_printf (TDF_TREE, "Analyze phi: ");
>>>>> +          diag_gimple_stmt (TDF_SLIM, phi, 0);
>>>>>        }
>>>>>
>>>>> should be
>>>>>
>>>>>  dump_printf (REPORT_DETAILS, "Analyze phi: ");
>>>>>  dump_gimple_stmt (REPORT_DETAILS, TDF_SLIM, phi, 0);
>>>>>
>>>>> and the if (vect_print_dump_info (REPORT_DETAILS)) should be what
>>>>> the dump infrastructure provides and thus hidden.  Eventually we should
>>>>> have a dump_kind (msg-classification) so we can replace it with
>>>>>
>>>>>   if (dump_kind (REPORT_DETAILS))
>>>>>     {
>>>>>       dump_printf (REPORT_DETAILS, "Analyze phi: ");
>>>>>       dump_gimple_stmt (REPORT_DETAILS, TDF_SLIM, phi, 0);
>>>>>     }
>>>>>
>>>>> to reduce the runtime overhead when not diagnosing/dumping.
>>>>>
>>>>> @@ -2464,8 +2458,8 @@ vect_create_cond_for_alias_checks (loop_vec_info l
>>>>>     }
>>>>>
>>>>>   if (vect_print_dump_info (REPORT_VECTORIZED_LOCATIONS))
>>>>> -    fprintf (vect_dump, "created %u versioning for alias checks.\n",
>>>>> -             VEC_length (ddr_p, may_alias_ddrs));
>>>>> +    diag_printf (TDF_TREE, "created %u versioning for alias checks.\n",
>>>>> +                 VEC_length (ddr_p, may_alias_ddrs));
>>>>>  }
>>>>>
>>>>> so here we have a different msg-classification,
>>>>> REPORT_VECTORIZED_LOCATIONS.  As we eventually want a central
>>>>> -fopt-report=... we do not want to leave this implementation detail to
>>>>> individual passes but gather them in a central place.
>>>>>
>>>>> Thanks,
>>>>> Richard.
>>>>>
>>>>>> Thanks,
>>>>>> Sharad
>>>>>>
>>>>>>
>>>>>> On Thu, Jun 7, 2012 at 12:19 AM, Xinliang David Li <davidxl@google.com> wrote:
>>>>>>> On Wed, Jun 6, 2012 at 10:58 PM, Sharad Singhai <singhai@google.com> wrote:
>>>>>>>> Sorry about the delay. I have finally incorporated all the suggestions
>>>>>>>> and reorganized the dump infrastructure a bit. The attached patch
>>>>>>>> updates vectorizer passes so that instead of accessing global
>>>>>>>> dump_file directly, these passes call dump_printf (FLAG, format, ...).
>>>>>>>> The dump_printf can choose between two streams, one regular pass dump
>>>>>>>> file, and another optional command line provided file. Currently, it
>>>>>>>> doesn't discriminate and all the dump information goes to both the
>>>>>>>> streams.
>>>>>>>>
>>>>>>>> Thus, for example,
>>>>>>>>
>>>>>>>> g++ -O2 v.cc -ftree-vectorize -fdump-tree-vect=foo.v -ftree-vectorizer-verbose=3
>>>>>>>
>>>>>>> But the default form of dump option (-fdump-tree-vect) no longer
>>>>>>> interferes with -ftree-vectorize-verbose=xxx output right? (this is
>>>>>>> one of the main requirements). One dumped to the primary stream (named
>>>>>>> dump file) and the other to the stderr -- the default diagnostic (alt)
>>>>>>> stream.
>>>>>>>
>>>>>>> David
>>>>>>>
>>>>>>>>
>>>>>>>> will output the verbose vectorizer information in both *.vect file and
>>>>>>>> foo.v file. However, as I have converted only vectorizer passes so
>>>>>>>> far, there is additional information in *.vect file which is not
>>>>>>>> present in foo.v file. Once other passes are converted to use this
>>>>>>>> scheme, then these two dump files should have identical output.
>>>>>>>>
>>>>>>>> Also note that in this patch -fdump-xxx=yyy format does not override
>>>>>>>> any auto named dump files as in my earlier patches. Instead the dump
>>>>>>>> information is output to both places when a command line dump file
>>>>>>>> option is provided.
>>>>>>>>
>>>>>>>> To summarize:
>>>>>>>> - instead of using dump_begin () / dump_end (), the passes should use
>>>>>>>> dump_start ()/dump_finish (). These new variants do not return the
>>>>>>>> dump_file. However, they still set the global dump_file/dump_flags for
>>>>>>>> the benefit of other passes during the transition.
>>>>>>>> - instead of directly printing to the dump_file, as in
>>>>>>>> if (dump_file)
>>>>>>>>  fprintf (dump_file, ...);
>>>>>>>>
>>>>>>>> The passes should do
>>>>>>>>
>>>>>>>> dump_printf (dump_flag, ...);
>>>>>>>> This will output to dump file(s) only when dump_flag is enabled for a
>>>>>>>> given pass.
>>>>>>>>
>>>>>>>> I have bootstrapped and tested it on x86_64. Does it look okay?
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>> Sharad
>>>>>>>>
>>>>>>>>
>>>>>>>> On Mon, May 14, 2012 at 12:26 AM, Richard Guenther
>>>>>>>> <richard.guenther@gmail.com> wrote:
>>>>>>>>> On Sat, May 12, 2012 at 6:39 PM, Xinliang David Li <davidxl@google.com> wrote:
>>>>>>>>>> On Sat, May 12, 2012 at 9:26 AM, Gabriel Dos Reis
>>>>>>>>>> <gdr@integrable-solutions.net> wrote:
>>>>>>>>>>> On Sat, May 12, 2012 at 11:05 AM, Xinliang David Li <davidxl@google.com> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> The downside is that the dump file format will look different from the
>>>>>>>>>>>> stderr output which is less than ideal.
>>>>>>>>>>>
>>>>>>>>>>> BTW, why do people want to use stderr for dumping internal IRs,
>>>>>>>>>>> as opposed to stdout or other files?  That does not sound right.
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> I was talking about the transformation information difference. In
>>>>>>>>>> stderr (where diagnostics go to), we may have
>>>>>>>>>>
>>>>>>>>>> foo.c: in function 'foo':
>>>>>>>>>> foo.c:5:6: note: loop was vectorized
>>>>>>>>>>
>>>>>>>>>> but in dump file the format for the information may be different,
>>>>>>>>>> unless we want to duplicate the machinery in diagnostics.
>>>>>>>>>
>>>>>>>>> So?  What's the problem with that ("different" diagnostics)?
>>>>>>>>>
>>>>>>>>> Richard.
>>>>>>>>>
>>>>>>>>>> David
>>>>>>>>>>
>>>>>>>>>>> -- Gaby
Sharad Singhai - Sept. 10, 2012, 6:20 p.m.
Ping.

Thanks,
Sharad
Sharad


On Wed, Sep 5, 2012 at 10:34 AM, Sharad Singhai <singhai@google.com> wrote:
> Ping.
>
> Thanks,
> Sharad
>
> Sharad
>
>
>
>
> On Fri, Aug 24, 2012 at 1:06 AM, Sharad Singhai <singhai@google.com> wrote:
>>
>> Sorry about the delay. Please see comments inline.
>>
>> On Wed, Jul 4, 2012 at 6:33 AM, Richard Guenther
>> <richard.guenther@gmail.com> wrote:
>> > On Tue, Jul 3, 2012 at 11:07 PM, Sharad Singhai <singhai@google.com>
>> > wrote:
>> >> Apologies for the spam. Attempting to resend the patch after shrinking
>> >> it.
>> >>
>> >> I have updated the attached patch to use a new dump message
>> >> classification system for the vectorizer. It currently uses four
>> >> classes, viz, MSG_OPTIMIZED_LOCATIONS, MSG_UNOPTIMIZED_LOCATION,
>> >> MSG_MISSING_OPTIMIZATION, and MSG_NOTE. I have gone through the
>> >> vectorizer passes and have converted each call to fprintf (dump_file,
>> >> ....) to a message classification matching in spirit. Most often, it
>> >> is MSG_OPTIMIZED_LOCATIONS, but occasionally others as well.
>> >>
>> >> For example, the following
>> >>
>> >> if (vect_print_dump_info (REPORT_DETAILS))
>> >>   {
>> >>     fprintf (vect_dump, "niters for prolog loop: ");
>> >>     print_generic_expr (vect_dump, iters, TDF_SLIM);
>> >>   }
>> >>
>> >> gets converted to
>> >>
>> >> if (dump_kind (MSG_OPTIMIZED_LOCATIONS))
>> >>   {
>> >>      dump_printf_loc (MSG_OPTIMIZED_LOCATIONS, vect_location,
>> >>                       "niters for prolog loop: ");
>> >>      dump_generic_expr (MSG_OPTIMIZED_LOCATIONS, TDF_SLIM, iters);
>> >>   }
>> >>
>> >> The asymmetry between the first printf and the second is due to the
>> >> fact that 'vect_print_dump_info (xxx)' prints the location as a
>> >> "side-effect". To preserve the original intent somewhat, I have
>> >> converted the first call within a dump sequence to a dump_printf_loc
>> >> (xxx) which prints the location while the subsequence calls within the
>> >> same conditional get converted to the corresponding plain variants.
>> >
>> > Ok, that looks reasonable.
>> >
>> >> I considered removing the support for alternate dump file, but ended
>> >> up preserving it instead since it is needed for setting the alternate
>> >> dump file to stderr for the case when -fopt-info is given but no dump
>> >> file is available.
>> >>
>> >> The following invocation
>> >> g++ ... -ftree-vectorize -fopt-info=4
>> >>
>> >> dumps *all* available information to stderr. Currently, the opt-info
>> >> level is common to all passes, i.e., a pass can't specify if wants a
>> >> different level of diagnostic info. This can be added as an
>> >> enhancement with a suitable syntax for selecting passes.
>> >>
>> >> I haven't fixed up the documentation/tests but wanted to get some
>> >> feedback about the current state of patch before doing that.
>> >
>> > Some comments / questions.
>> >
>> > +  if (dump_file && (dump_kind & opt_info_flags))
>> > +    {
>> > +      dump_loc (dump_kind, dump_file, loc);
>> > +      print_generic_expr (dump_file, t, dump_flags | extra_dump_flags);
>> > +    }
>> > +
>> > +  if (alt_dump_file && (dump_kind & opt_info_flags))
>> > +    {
>> >
>> > you always test dump_kind against the same opt_info_flags variable.
>> > I would have thought that the alternate dump file has a different
>> > opt_info_flags
>> > setting so I can have -fdump-tree-vect-details -fopt-info=1.  Am I
>> > missing
>> > something?
>>
>> It was an oversight on my part. I have since fixed this. There are two
>> separate flags corresponding to the two types of dump files,
>>
>> pflags ==> pass private dump file
>> alt_flags ==> opt-info dump file
>>
>> > If I do
>> >
>> >> gcc file1.c file2.c -O3 -fdump-tree-vectorize=foo
>> >
>> > what will foo contain afterwards?  I think you need to document the
>> > behavior
>> > when such redirection is used with the compiler-driver feature of
>> > handling
>> > multiple translation units.  Especially the difference (or not
>> > difference) to
>> >
>> >> gcc file1.c -O3 -fdump-tree-vectorize=foo
>> >> gcc file2.c -O3 -fdump-tree-vectorize=foo
>>
>> Yes, the dump file gets overwritten during each invocation. I have
>> noted this in the documentation.
>>
>> > I suppose we do not want to append to foo (but eventually support that
>> > with some alternate syntax?  Like -fdump-tree-vectorize=+foo?)
>>
>> Yes, I agree. We could define a new syntax as you suggested for
>> appending to a dump file. However, this feature can wait for a
>> separate patch.
>>
>> > +
>> > +static void
>> > +set_opt_info (int opt_info_level)
>> > +{
>> >
>> > This function needs a comment.  How do the dump flags translate to
>> > the opt-info flags?  Is this documented anywhere?  I only see
>> >
>> > +/* Different types of dump classifications.  */
>> > +enum dump_msg_kind {
>> > +  MSG_NONE                     = 1 << 0,
>> > +  MSG_OPTIMIZED_LOCATIONS      = 1 << 1,
>> > +  MSG_UNOPTIMIZED_LOCATIONS    = 1 << 2,
>> > +  MSG_MISSED_OPTIMIZATION      = 1 << 3,
>> > +  MSG_NOTE                     = 1 << 4
>> > +};
>>
>> Yes, my mapping was static (pass-agnostic), defined inside the
>> set_opt_info. I
>> have now documented it and revamped it so that -fopt-info is applied
>> to specific passes as explained below.
>>
>> > I suppose the TDF_* flags would all map to
>> >
>> > MSG_OPTIMIZED_LOCATIONS|MSG_UNOPTIMIZED_LOCATIONS|MSG_MISSED_OPTIMIZATION
>> > and only TDF_DETAILS would enable MSG_NOTE?
>>
>> Yes, the mapping is very coarse-grained right now. Currently I have
>> made MSG_* flags an extension of TDF_* flags. This way both kinds of
>> flags can be present in a single word yet still be interpreted by the
>> dumps which are interested in them. TDF_DETAILS gets mapped to a union
>> of all the MSG_* flags. This will allow flags such -fdump-vect-details
>> to continue to work as expected during the transition to the new dump
>> infrastructure. Of coure,  during the transition, we would need to
>> rejigger flags to express the desired behavior.
>>
>> > In the above a flag for MSG_NONE does not make much sense?
>>
>> Removed. It is now implicitly 0.
>>
>> >
>> > How would we map TDF_SCEV?  I suppose we'd add MSG_SCEV for this
>> > (we talked about the possibility of having some special flags for
>> > special passes).
>>
>> Yes, I think this would be necessary. So far, I haven't converted any
>> passes other than vectorization, and I suspect a handful of special
>> case flags would be needed. During the transition, the TDF_* flags are
>> assumed to be an extension of MSG_* flags and things won't break.
>>
>> >
>> > But this also means a linear "opt-info-level" as in
>> >
>> > +static void
>> > +set_opt_info (int opt_info_level)
>> >
>> > may not be a good representation.  Not a pass-ignorant opt_info_flags
>> > value, if you consider -fopt-info=vect,3 (details from the vectorizer
>> > please).
>>
>> Agreed. As I mentioned earlier, I was considering a simple linear view
>> of -fopt-info. However, I understand that making it pass cognizant is
>> certainly better. To this end, I have modified the syntax of
>> -fopt-info somewhat along the lines of -fdump-xxx format. For example,
>>
>> gcc ... -fopt-info-tree-vect-optimized=foo.vect.optimized
>> -fdump-tree-pre-details=stderr ...
>>
>> This would dump info about optimized locations from the vectorizer
>> pass into foo.vect.optimized, and pass dump from the PRE pass on to
>> stderr.
>>
>> >
>> > +}
>> > +
>> > +
>> > +
>> > +DEBUG_FUNCTION void
>> > +dump_combine_stats (int flags)
>> >
>> > debug functions should be called debug_*, please add a comment and
>> > avoid excessive vertical space
>>
>> Fixed.
>>
>> >
>> > Index: tree-pass.h
>> > ===================================================================
>> > --- tree-pass.h (revision 188966)
>> > +++ tree-pass.h (working copy)
>> > @@ -85,7 +85,6 @@ enum tree_dump_index
>> >  #define TDF_CSELIB     (1 << 23)       /* Dump cselib details.  */
>> >  #define TDF_SCEV       (1 << 24)       /* Dump SCEV details.  */
>> >
>> > -
>> >  /* In tree-dump.c */
>> >
>> >  extern char *get_dump_file_name (int);
>> >
>> > please avoid whitespace-only changes (also elsewhere).
>>
>> Fixed.
>>
>> >
>> >  /* Global variables used to communicate with passes.  */
>> >  extern FILE *dump_file;
>> > +extern FILE *alt_dump_file;
>> >  extern int dump_flags;
>> > +extern int opt_info_flags;
>> >
>> > so I expected the primary dump_file to be controlled with dump_flags,
>> > or with a (cached) translation of them to a primary_opt_info_flags.
>>
>> Yes, now I have two sets of flags.
>>
>> >
>> > Index: gimple-pretty-print.h
>> > ===================================================================
>> > --- gimple-pretty-print.h       (revision 188966)
>> > +++ gimple-pretty-print.h       (working copy)
>> > @@ -31,6 +31,6 @@ extern void debug_gimple_seq (gimple_seq);
>> >  extern void print_gimple_seq (FILE *, gimple_seq, int, int);
>> >  extern void print_gimple_stmt (FILE *, gimple, int, int);
>> >  extern void print_gimple_expr (FILE *, gimple, int, int);
>> > -extern void dump_gimple_stmt (pretty_printer *, gimple, int, int);
>> > +extern void print_gimple_stmt_1 (pretty_printer *, gimple, int, int);
>> >
>> > I'd call this pp_gimple_stmt instead.
>>
>> Fixed.
>>
>> >
>> > @@ -1418,6 +1419,16 @@ init_branch_prob (void)
>> >  void
>> >  end_branch_prob (void)
>> >  {
>> > +  end_branch_prob_1 (dump_file);
>> > +  end_branch_prob_1 (alt_dump_file);
>> > +}
>> > +
>> > +/* Helper function for file-level cleanup for DUMP_FILE after
>> > +   branch-prob processing is completed. */
>> > +
>> > +static void
>> > +end_branch_prob_1 (FILE *dump_file)
>> > +{
>> >    if (dump_file)
>> >      {
>> >        fprintf (dump_file, "\n");
>> >
>> > That change looks odd ;)  Can you instead use the new dump_printf
>> > interface?  (side-note: we should not need to export alt_dump_file at
>> > all!)
>>
>> Sorry, it was my ill conceived attempt to avoid converting this pass.
>> :) Anyway, I did a proper fix now, and converted an additional file
>> (profile.c) as a side benefit.
>>
>> >
>> > @@ -2166,10 +2177,6 @@ ftree-vect-loop-version
>> >  Common Report Var(flag_tree_vect_loop_version) Init(1) Optimization
>> >  Enable loop versioning when doing loop vectorization on trees
>> >
>> > -ftree-vectorizer-verbose=
>> > -Common RejectNegative Joined UInteger
>> > --ftree-vectorizer-verbose=<number>     Set the verbosity level of the
>> > vectorizer
>> > -
>> >
>> > we need to preserve old switches for backward compatibility, I suggest
>> > to alias it to -fopt-info for now.
>>
>> Okay. I mapped -ftree-vectorizer-verbose=<number> as
>> 0 ==> No dump output
>> 1 ==> dump optimized locations
>> 2 ==> dump missed optimizations
>> 3 ==> dump notes
>> 4 and up ==> dump all, i.e., 1, 2, 3
>>
>> Curiously, several tests are casualties of reinterpretation of this
>> flag. Here is how -- the old (and odd) behavior was that
>>
>> gcc ... -ftree-vectorize -fdump-tree-vect -ftree-vectorizer-verbose=3
>>
>> would output tree-vect dump as usual but nothing would get printed by
>> -ftree-vectorizer-verbose=3. In this patch I have "fixed" this
>> behavior so that -ftree-vectorizer-verbose=<n> prints on stderr
>> regardless of other flags present(*). This has the unfortunate side
>> effect of extra output being sent to stderr which is interpreted as a
>> test failure. Please advise how to ignore that additional stderr
>> output in tests.
>>
>> (*) Well, not entirely true. Since -ftree-vectorizer-verbose=<n> is
>> implemented in terms of -fopt-info, the output of verbose flag is the
>> stream where ever vectorizer's opt-info is being sent.
>>
>> > @@ -13909,7 +13909,7 @@ unmentioned_reg_p (rtx equiv, rtx expr)
>> >  }
>> >  ^L
>> >  DEBUG_FUNCTION void
>> > -dump_combine_stats (FILE *file)
>> > +print_combine_stats (FILE *file)
>> >
>> > see above, debug_combine_stats.
>>
>> Fixed.
>>
>> >
>> > Index: system.h
>> > ===================================================================
>> > --- system.h    (revision 188966)
>> > +++ system.h    (working copy)
>> > @@ -669,6 +669,7 @@ extern int vsnprintf(char *, size_t, const char *,
>> >     except, maybe, something to the dump file.  */
>> >  #ifdef BUFSIZ
>> >  extern FILE *dump_file;
>> > +extern FILE *alt_dump_file;
>> >
>> > see above - we should not need to export this (nor opt_info_flags).
>>
>> Fixed.
>>
>> Please take another look and see if the attached patch looks okay? I
>> still need to fix the failing tests due to intentional output on
>> stderr.
>>
>> Thanks,
>> Sharad
>>
>> >
>> > Overall I like the patch!
>> >
>> > Thanks,
>> > Richard.
>> >
>> >> Thanks,
>> >> Sharad
>> >>
>> >> On Fri, Jun 15, 2012 at 1:18 AM, Richard Guenther
>> >> <richard.guenther@gmail.com> wrote:
>> >>> On Fri, Jun 15, 2012 at 7:47 AM, Sharad Singhai <singhai@google.com>
>> >>> wrote:
>> >>>> On Wed, Jun 13, 2012 at 4:48 AM, Richard Guenther
>> >>>> <richard.guenther@gmail.com> wrote:
>> >>>>> On Fri, Jun 8, 2012 at 7:16 AM, Sharad Singhai <singhai@google.com>
>> >>>>> wrote:
>> >>>>>> Okay, I have updated the attached patch so that the output from
>> >>>>>> -ftree-vectorizer-verbose is considered diagnostic information and
>> >>>>>> is
>> >>>>>> always
>> >>>>>> sent to stderr. Other functionality remains unchanged. Here is some
>> >>>>>> more context about this patch.
>> >>>>>>
>> >>>>>> This patch improves the dump infrastructure and public interfaces
>> >>>>>> so
>> >>>>>> that the existing private pass-specific dump stream is separated
>> >>>>>> from
>> >>>>>> the diagnostic dump stream (typically stderr).  The optimization
>> >>>>>> passes can output information on the two streams independently.
>> >>>>>>
>> >>>>>> The newly defined interfaces are:
>> >>>>>>
>> >>>>>> Individual passes do not need to access the dump file directly.
>> >>>>>> Thus Instead
>> >>>>>> of doing
>> >>>>>>
>> >>>>>>   if (dump_file && (flags & dump_flags))
>> >>>>>>      fprintf (dump_file, ...);
>> >>>>>>
>> >>>>>> they can do
>> >>>>>>
>> >>>>>>     dump_printf (flags, ...);
>> >>>>>>
>> >>>>>> If the current pass has FLAGS enabled then the information gets
>> >>>>>> printed into the dump file otherwise not.
>> >>>>>>
>> >>>>>> Similar to the dump_printf (), another function is defined, called
>> >>>>>>
>> >>>>>>        diag_printf (dump_flags, ...)
>> >>>>>>
>> >>>>>> This prints information only onto the diagnostic stream, typically
>> >>>>>> standard error. It is useful for separating pass-specific dump
>> >>>>>> information from
>> >>>>>> the diagnostic information.
>> >>>>>>
>> >>>>>> Currently, as a proof of concept, I have converted vectorizer
>> >>>>>> passes
>> >>>>>> to use the new dump format. For this, I have considered
>> >>>>>> information printed in vect_dump file as diagnostic. Thus 'fprintf'
>> >>>>>> calls are changed to 'diag_printf'. Some other information printed
>> >>>>>> to
>> >>>>>> dump_file is sent to the regular dump file via 'dump_printf ()'. It
>> >>>>>> helps to separate the two streams because they might serve
>> >>>>>> different
>> >>>>>> purposes and might have different formatting requirements.
>> >>>>>>
>> >>>>>> For example, using the trunk compiler, the following invocation
>> >>>>>>
>> >>>>>> g++ -S v.cc -ftree-vectorize -fdump-tree-vect
>> >>>>>> -ftree-vectorizer-verbose=2
>> >>>>>>
>> >>>>>> prints tree vectorizer dump into a file named 'v.cc.113t.vect'.
>> >>>>>> However, the verbose diagnostic output is silently
>> >>>>>> ignored. This is not desirable as the two types of dump should not
>> >>>>>> interfere.
>> >>>>>>
>> >>>>>> After this patch, the vectorizer dump is available in
>> >>>>>> 'v.cc.113t.vect'
>> >>>>>> as before, but the verbose vectorizer diagnostic is additionally
>> >>>>>> printed on stderr. Thus both types of dump information are output.
>> >>>>>>
>> >>>>>> An additional feature of this patch is that individual passes can
>> >>>>>> print dump information into command-line named files instead of
>> >>>>>> auto
>> >>>>>> numbered filename. For example,
>> >>>>>
>> >>>>> I'd wish you'd leave out this part for a followup.
>> >>>>>
>> >>>>>>
>> >>>>>> g++ -S -O2 v.cc -ftree-vectorize -fdump-tree-vect=foo.vect
>> >>>>>>     -ftree-vectorizer-verbose=2
>> >>>>>>     -fdump-tree-pre=foo.pre
>> >>>>>>
>> >>>>>> This prints the tree vectorizer dump into 'foo.vect', PRE dump into
>> >>>>>> 'foo.pre', and the vectorizer verbose diagnostic dump onto stderr.
>> >>>>>>
>> >>>>>> Please take another look.
>> >>>>>
>> >>>>> --- tree-vect-loop-manip.c      (revision 188325)
>> >>>>> +++ tree-vect-loop-manip.c      (working copy)
>> >>>>> @@ -789,14 +789,11 @@ slpeel_make_loop_iterate_ntimes (struct loop
>> >>>>> *loop
>> >>>>>   gsi_remove (&loop_cond_gsi, true);
>> >>>>>
>> >>>>>   loop_loc = find_loop_location (loop);
>> >>>>> -  if (dump_file && (dump_flags & TDF_DETAILS))
>> >>>>> -    {
>> >>>>> -      if (loop_loc != UNKNOWN_LOC)
>> >>>>> -        fprintf (dump_file, "\nloop at %s:%d: ",
>> >>>>> +  if (loop_loc != UNKNOWN_LOC)
>> >>>>> +    dump_printf (TDF_DETAILS, "\nloop at %s:%d: ",
>> >>>>>                  LOC_FILE (loop_loc), LOC_LINE (loop_loc));
>> >>>>> -      print_gimple_stmt (dump_file, cond_stmt, 0, TDF_SLIM);
>> >>>>> -    }
>> >>>>> -
>> >>>>> +  if (dump_flags & TDF_DETAILS)
>> >>>>> +    dump_gimple_stmt (TDF_SLIM, cond_stmt, 0);
>> >>>>>   loop->nb_iterations = niters;
>> >>>>>
>> >>>>> I'm confused by this.  Why is this not simply
>> >>>>>
>> >>>>>  if (loop_loc != UNKNOWN_LOC)
>> >>>>>    dump_printf (dump_flags, "\nloop at %s:%d: ",
>> >>>>>                       LOC_FILE (loop_loc), LOC_LINE (loop_loc));
>> >>>>>  dump_gimple_stmt (dump_flags | TDF_SLIM, cond_stmt, 0);
>> >>>>>
>> >>>>> for example.  I notice that you maybe mis-understood the message
>> >>>>> classification
>> >>>>> I asked you to add (maybe I confused you by mentioning to eventually
>> >>>>> re-use
>> >>>>> the TDF_* flags).  I think you basically provided this message
>> >>>>> classification
>> >>>>> by adding two classes by providing both dump_gimple_stmt and
>> >>>>> diag_gimple_stmt.
>> >>>>> But still in the above you keep a dump_flags test _and_ you pass in
>> >>>>> (altered) dump_flags to the dump/diag_gimple_stmt routines.  Let me
>> >>>>> quote them:
>> >>>>>
>> >>>>> +void
>> >>>>> +dump_gimple_stmt (int flags, gimple gs, int spc)
>> >>>>> +{
>> >>>>> +  if (dump_file)
>> >>>>> +    print_gimple_stmt (dump_file, gs, spc, flags);
>> >>>>> +}
>> >>>>>
>> >>>>> +void
>> >>>>> +diag_gimple_stmt (int flags, gimple gs, int spc)
>> >>>>> +{
>> >>>>> +  if (alt_dump_file)
>> >>>>> +    print_gimple_stmt (alt_dump_file, gs, spc, flags);
>> >>>>> +}
>> >>>>>
>> >>>>> I'd say it should have been a single function:
>> >>>>>
>> >>>>> void
>> >>>>> dump_gimple_stmt (enum msg_classification, int additional_flags,
>> >>>>> gimple gs, int spc)
>> >>>>> {
>> >>>>>  if (msg_classification & go-to-dumpfile
>> >>>>>      && dump_file)
>> >>>>>    print_gimple_stmt (dump_file, gs, spc, dump_flags |
>> >>>>> additional_flags);
>> >>>>>  if (msg_classification & go-to-alt-dump-file
>> >>>>>      && alt_dump_file && (alt_dump_flags & msg_classification))
>> >>>>>    print_gimple_stmt (alt_dump_file, gs, spc, alt_dump_flags |
>> >>>>> additional_flags);
>> >>>>> }
>> >>>>
>> >>>> Yes,  I can make the single function work for both regular
>> >>>> (dump_file)
>> >>>> and diagnostic (stderr for now) dump streams. In fact, my original
>> >>>> patch did just that. However, it duplicated output on *both* the
>> >>>> streams. For example,
>> >>>>
>> >>>> gcc -O2 -ftree-vectorize -fdump-tree-vect=foo.vect
>> >>>> -ftree-vectorizer-verbose=2 foo.c
>> >>>>
>> >>>> Since both streams are enabled in this case, a call to dump_printf ()
>> >>>> would duplicate the information to both files, i.e., the dump and
>> >>>> diagnostic will be intermixed. Is that intended? My first thought was
>> >>>> to keep them separated via dump_*() and diag_* () methods.
>> >>>
>> >>> No, I think that's intended.  The regular dump-files should contain
>> >>> everything, the secondary stream (eventually enabled by -fopt-info)
>> >>> should be a filtered regular dump-file.
>> >>>
>> >>> Thanks,
>> >>> Richard.
>> >>>
>> >>>> Thanks,
>> >>>> Sharad
>> >>>>
>> >>>>> where msg_classification would include things to suitably classify
>> >>>>> messages
>> >>>>> for -fopt-info, like MSG_MISSED_OPTIMIZATION, MSG_OPTIMIZED,
>> >>>>> MSG_NOTE.
>> >>>>>
>> >>>>> In another place we have
>> >>>>>
>> >>>>> @@ -1648,7 +1642,7 @@ vect_can_advance_ivs_p (loop_vec_info
>> >>>>> loop_vinfo)
>> >>>>>   /* Analyze phi functions of the loop header.  */
>> >>>>>
>> >>>>>   if (vect_print_dump_info (REPORT_DETAILS))
>> >>>>> -    fprintf (vect_dump, "vect_can_advance_ivs_p:");
>> >>>>> +    diag_printf (TDF_TREE, "vect_can_advance_ivs_p:");
>> >>>>>
>> >>>>> so why's that diag_printf and why TDF_TREE?  I suppose you made all
>> >>>>> dumps to vect_dump diag_* and all dumps to dump_file dump_*?  I
>> >>>>> think it should have been
>> >>>>>
>> >>>>>   dump_printf (REPORT_DETAILS, "vect_can_advance_ivs_p:");
>> >>>>>
>> >>>>> thus dump_printf only gets the msg-classification and the printf
>> >>>>> args
>> >>>>> (dump-flags
>> >>>>> are only meaningful when passed down to pretty-printers).  Thus
>> >>>>>
>> >>>>> @@ -1658,8 +1652,8 @@ vect_can_advance_ivs_p (loop_vec_info
>> >>>>> loop_vinfo)
>> >>>>>       phi = gsi_stmt (gsi);
>> >>>>>       if (vect_print_dump_info (REPORT_DETAILS))
>> >>>>>        {
>> >>>>> -          fprintf (vect_dump, "Analyze phi: ");
>> >>>>> -          print_gimple_stmt (vect_dump, phi, 0, TDF_SLIM);
>> >>>>> +          diag_printf (TDF_TREE, "Analyze phi: ");
>> >>>>> +          diag_gimple_stmt (TDF_SLIM, phi, 0);
>> >>>>>        }
>> >>>>>
>> >>>>> should be
>> >>>>>
>> >>>>>  dump_printf (REPORT_DETAILS, "Analyze phi: ");
>> >>>>>  dump_gimple_stmt (REPORT_DETAILS, TDF_SLIM, phi, 0);
>> >>>>>
>> >>>>> and the if (vect_print_dump_info (REPORT_DETAILS)) should be what
>> >>>>> the dump infrastructure provides and thus hidden.  Eventually we
>> >>>>> should
>> >>>>> have a dump_kind (msg-classification) so we can replace it with
>> >>>>>
>> >>>>>   if (dump_kind (REPORT_DETAILS))
>> >>>>>     {
>> >>>>>       dump_printf (REPORT_DETAILS, "Analyze phi: ");
>> >>>>>       dump_gimple_stmt (REPORT_DETAILS, TDF_SLIM, phi, 0);
>> >>>>>     }
>> >>>>>
>> >>>>> to reduce the runtime overhead when not diagnosing/dumping.
>> >>>>>
>> >>>>> @@ -2464,8 +2458,8 @@ vect_create_cond_for_alias_checks
>> >>>>> (loop_vec_info l
>> >>>>>     }
>> >>>>>
>> >>>>>   if (vect_print_dump_info (REPORT_VECTORIZED_LOCATIONS))
>> >>>>> -    fprintf (vect_dump, "created %u versioning for alias
>> >>>>> checks.\n",
>> >>>>> -             VEC_length (ddr_p, may_alias_ddrs));
>> >>>>> +    diag_printf (TDF_TREE, "created %u versioning for alias
>> >>>>> checks.\n",
>> >>>>> +                 VEC_length (ddr_p, may_alias_ddrs));
>> >>>>>  }
>> >>>>>
>> >>>>> so here we have a different msg-classification,
>> >>>>> REPORT_VECTORIZED_LOCATIONS.  As we eventually want a central
>> >>>>> -fopt-report=... we do not want to leave this implementation detail
>> >>>>> to
>> >>>>> individual passes but gather them in a central place.
>> >>>>>
>> >>>>> Thanks,
>> >>>>> Richard.
>> >>>>>
>> >>>>>> Thanks,
>> >>>>>> Sharad
>> >>>>>>
>> >>>>>>
>> >>>>>> On Thu, Jun 7, 2012 at 12:19 AM, Xinliang David Li
>> >>>>>> <davidxl@google.com> wrote:
>> >>>>>>> On Wed, Jun 6, 2012 at 10:58 PM, Sharad Singhai
>> >>>>>>> <singhai@google.com> wrote:
>> >>>>>>>> Sorry about the delay. I have finally incorporated all the
>> >>>>>>>> suggestions
>> >>>>>>>> and reorganized the dump infrastructure a bit. The attached patch
>> >>>>>>>> updates vectorizer passes so that instead of accessing global
>> >>>>>>>> dump_file directly, these passes call dump_printf (FLAG, format,
>> >>>>>>>> ...).
>> >>>>>>>> The dump_printf can choose between two streams, one regular pass
>> >>>>>>>> dump
>> >>>>>>>> file, and another optional command line provided file. Currently,
>> >>>>>>>> it
>> >>>>>>>> doesn't discriminate and all the dump information goes to both
>> >>>>>>>> the
>> >>>>>>>> streams.
>> >>>>>>>>
>> >>>>>>>> Thus, for example,
>> >>>>>>>>
>> >>>>>>>> g++ -O2 v.cc -ftree-vectorize -fdump-tree-vect=foo.v
>> >>>>>>>> -ftree-vectorizer-verbose=3
>> >>>>>>>
>> >>>>>>> But the default form of dump option (-fdump-tree-vect) no longer
>> >>>>>>> interferes with -ftree-vectorize-verbose=xxx output right? (this
>> >>>>>>> is
>> >>>>>>> one of the main requirements). One dumped to the primary stream
>> >>>>>>> (named
>> >>>>>>> dump file) and the other to the stderr -- the default diagnostic
>> >>>>>>> (alt)
>> >>>>>>> stream.
>> >>>>>>>
>> >>>>>>> David
>> >>>>>>>
>> >>>>>>>>
>> >>>>>>>> will output the verbose vectorizer information in both *.vect
>> >>>>>>>> file and
>> >>>>>>>> foo.v file. However, as I have converted only vectorizer passes
>> >>>>>>>> so
>> >>>>>>>> far, there is additional information in *.vect file which is not
>> >>>>>>>> present in foo.v file. Once other passes are converted to use
>> >>>>>>>> this
>> >>>>>>>> scheme, then these two dump files should have identical output.
>> >>>>>>>>
>> >>>>>>>> Also note that in this patch -fdump-xxx=yyy format does not
>> >>>>>>>> override
>> >>>>>>>> any auto named dump files as in my earlier patches. Instead the
>> >>>>>>>> dump
>> >>>>>>>> information is output to both places when a command line dump
>> >>>>>>>> file
>> >>>>>>>> option is provided.
>> >>>>>>>>
>> >>>>>>>> To summarize:
>> >>>>>>>> - instead of using dump_begin () / dump_end (), the passes should
>> >>>>>>>> use
>> >>>>>>>> dump_start ()/dump_finish (). These new variants do not return
>> >>>>>>>> the
>> >>>>>>>> dump_file. However, they still set the global
>> >>>>>>>> dump_file/dump_flags for
>> >>>>>>>> the benefit of other passes during the transition.
>> >>>>>>>> - instead of directly printing to the dump_file, as in
>> >>>>>>>> if (dump_file)
>> >>>>>>>>  fprintf (dump_file, ...);
>> >>>>>>>>
>> >>>>>>>> The passes should do
>> >>>>>>>>
>> >>>>>>>> dump_printf (dump_flag, ...);
>> >>>>>>>> This will output to dump file(s) only when dump_flag is enabled
>> >>>>>>>> for a
>> >>>>>>>> given pass.
>> >>>>>>>>
>> >>>>>>>> I have bootstrapped and tested it on x86_64. Does it look okay?
>> >>>>>>>>
>> >>>>>>>> Thanks,
>> >>>>>>>> Sharad
>> >>>>>>>>
>> >>>>>>>>
>> >>>>>>>> On Mon, May 14, 2012 at 12:26 AM, Richard Guenther
>> >>>>>>>> <richard.guenther@gmail.com> wrote:
>> >>>>>>>>> On Sat, May 12, 2012 at 6:39 PM, Xinliang David Li
>> >>>>>>>>> <davidxl@google.com> wrote:
>> >>>>>>>>>> On Sat, May 12, 2012 at 9:26 AM, Gabriel Dos Reis
>> >>>>>>>>>> <gdr@integrable-solutions.net> wrote:
>> >>>>>>>>>>> On Sat, May 12, 2012 at 11:05 AM, Xinliang David Li
>> >>>>>>>>>>> <davidxl@google.com> wrote:
>> >>>>>>>>>>>
>> >>>>>>>>>>>> The downside is that the dump file format will look different
>> >>>>>>>>>>>> from the
>> >>>>>>>>>>>> stderr output which is less than ideal.
>> >>>>>>>>>>>
>> >>>>>>>>>>> BTW, why do people want to use stderr for dumping internal
>> >>>>>>>>>>> IRs,
>> >>>>>>>>>>> as opposed to stdout or other files?  That does not sound
>> >>>>>>>>>>> right.
>> >>>>>>>>>>>
>> >>>>>>>>>>
>> >>>>>>>>>> I was talking about the transformation information difference.
>> >>>>>>>>>> In
>> >>>>>>>>>> stderr (where diagnostics go to), we may have
>> >>>>>>>>>>
>> >>>>>>>>>> foo.c: in function 'foo':
>> >>>>>>>>>> foo.c:5:6: note: loop was vectorized
>> >>>>>>>>>>
>> >>>>>>>>>> but in dump file the format for the information may be
>> >>>>>>>>>> different,
>> >>>>>>>>>> unless we want to duplicate the machinery in diagnostics.
>> >>>>>>>>>
>> >>>>>>>>> So?  What's the problem with that ("different" diagnostics)?
>> >>>>>>>>>
>> >>>>>>>>> Richard.
>> >>>>>>>>>
>> >>>>>>>>>> David
>> >>>>>>>>>>
>> >>>>>>>>>>> -- Gaby
>
>
Xinliang David Li - Sept. 11, 2012, 8:16 p.m.
Can you resend your patch in text form (also need to resolve the
latest conflicts) so that it can be commented inline?

Please also provide as summary a more up-to-date description of
1) Command line option syntax and semantics
2) New dumping APIs and semantics
3) Conversion changes

Looking at the patch briefly, I am confused with the opt-info syntax.
I thought the following is desired:

-fopt-info=pass-flags

where pass is the pass name, and flags is one of [optimized, notes,
missed].  Both pass and flags can be omitted.

Is it implemented this way in your patch?

David




On Mon, Sep 10, 2012 at 11:20 AM, Sharad Singhai <singhai@google.com> wrote:
> Ping.
>
> Thanks,
> Sharad
> Sharad
>
>
> On Wed, Sep 5, 2012 at 10:34 AM, Sharad Singhai <singhai@google.com> wrote:
>> Ping.
>>
>> Thanks,
>> Sharad
>>
>> Sharad
>>
>>
>>
>>
>> On Fri, Aug 24, 2012 at 1:06 AM, Sharad Singhai <singhai@google.com> wrote:
>>>
>>> Sorry about the delay. Please see comments inline.
>>>
>>> On Wed, Jul 4, 2012 at 6:33 AM, Richard Guenther
>>> <richard.guenther@gmail.com> wrote:
>>> > On Tue, Jul 3, 2012 at 11:07 PM, Sharad Singhai <singhai@google.com>
>>> > wrote:
>>> >> Apologies for the spam. Attempting to resend the patch after shrinking
>>> >> it.
>>> >>
>>> >> I have updated the attached patch to use a new dump message
>>> >> classification system for the vectorizer. It currently uses four
>>> >> classes, viz, MSG_OPTIMIZED_LOCATIONS, MSG_UNOPTIMIZED_LOCATION,
>>> >> MSG_MISSING_OPTIMIZATION, and MSG_NOTE. I have gone through the
>>> >> vectorizer passes and have converted each call to fprintf (dump_file,
>>> >> ....) to a message classification matching in spirit. Most often, it
>>> >> is MSG_OPTIMIZED_LOCATIONS, but occasionally others as well.
>>> >>
>>> >> For example, the following
>>> >>
>>> >> if (vect_print_dump_info (REPORT_DETAILS))
>>> >>   {
>>> >>     fprintf (vect_dump, "niters for prolog loop: ");
>>> >>     print_generic_expr (vect_dump, iters, TDF_SLIM);
>>> >>   }
>>> >>
>>> >> gets converted to
>>> >>
>>> >> if (dump_kind (MSG_OPTIMIZED_LOCATIONS))
>>> >>   {
>>> >>      dump_printf_loc (MSG_OPTIMIZED_LOCATIONS, vect_location,
>>> >>                       "niters for prolog loop: ");
>>> >>      dump_generic_expr (MSG_OPTIMIZED_LOCATIONS, TDF_SLIM, iters);
>>> >>   }
>>> >>
>>> >> The asymmetry between the first printf and the second is due to the
>>> >> fact that 'vect_print_dump_info (xxx)' prints the location as a
>>> >> "side-effect". To preserve the original intent somewhat, I have
>>> >> converted the first call within a dump sequence to a dump_printf_loc
>>> >> (xxx) which prints the location while the subsequence calls within the
>>> >> same conditional get converted to the corresponding plain variants.
>>> >
>>> > Ok, that looks reasonable.
>>> >
>>> >> I considered removing the support for alternate dump file, but ended
>>> >> up preserving it instead since it is needed for setting the alternate
>>> >> dump file to stderr for the case when -fopt-info is given but no dump
>>> >> file is available.
>>> >>
>>> >> The following invocation
>>> >> g++ ... -ftree-vectorize -fopt-info=4
>>> >>
>>> >> dumps *all* available information to stderr. Currently, the opt-info
>>> >> level is common to all passes, i.e., a pass can't specify if wants a
>>> >> different level of diagnostic info. This can be added as an
>>> >> enhancement with a suitable syntax for selecting passes.
>>> >>
>>> >> I haven't fixed up the documentation/tests but wanted to get some
>>> >> feedback about the current state of patch before doing that.
>>> >
>>> > Some comments / questions.
>>> >
>>> > +  if (dump_file && (dump_kind & opt_info_flags))
>>> > +    {
>>> > +      dump_loc (dump_kind, dump_file, loc);
>>> > +      print_generic_expr (dump_file, t, dump_flags | extra_dump_flags);
>>> > +    }
>>> > +
>>> > +  if (alt_dump_file && (dump_kind & opt_info_flags))
>>> > +    {
>>> >
>>> > you always test dump_kind against the same opt_info_flags variable.
>>> > I would have thought that the alternate dump file has a different
>>> > opt_info_flags
>>> > setting so I can have -fdump-tree-vect-details -fopt-info=1.  Am I
>>> > missing
>>> > something?
>>>
>>> It was an oversight on my part. I have since fixed this. There are two
>>> separate flags corresponding to the two types of dump files,
>>>
>>> pflags ==> pass private dump file
>>> alt_flags ==> opt-info dump file
>>>
>>> > If I do
>>> >
>>> >> gcc file1.c file2.c -O3 -fdump-tree-vectorize=foo
>>> >
>>> > what will foo contain afterwards?  I think you need to document the
>>> > behavior
>>> > when such redirection is used with the compiler-driver feature of
>>> > handling
>>> > multiple translation units.  Especially the difference (or not
>>> > difference) to
>>> >
>>> >> gcc file1.c -O3 -fdump-tree-vectorize=foo
>>> >> gcc file2.c -O3 -fdump-tree-vectorize=foo
>>>
>>> Yes, the dump file gets overwritten during each invocation. I have
>>> noted this in the documentation.
>>>
>>> > I suppose we do not want to append to foo (but eventually support that
>>> > with some alternate syntax?  Like -fdump-tree-vectorize=+foo?)
>>>
>>> Yes, I agree. We could define a new syntax as you suggested for
>>> appending to a dump file. However, this feature can wait for a
>>> separate patch.
>>>
>>> > +
>>> > +static void
>>> > +set_opt_info (int opt_info_level)
>>> > +{
>>> >
>>> > This function needs a comment.  How do the dump flags translate to
>>> > the opt-info flags?  Is this documented anywhere?  I only see
>>> >
>>> > +/* Different types of dump classifications.  */
>>> > +enum dump_msg_kind {
>>> > +  MSG_NONE                     = 1 << 0,
>>> > +  MSG_OPTIMIZED_LOCATIONS      = 1 << 1,
>>> > +  MSG_UNOPTIMIZED_LOCATIONS    = 1 << 2,
>>> > +  MSG_MISSED_OPTIMIZATION      = 1 << 3,
>>> > +  MSG_NOTE                     = 1 << 4
>>> > +};
>>>
>>> Yes, my mapping was static (pass-agnostic), defined inside the
>>> set_opt_info. I
>>> have now documented it and revamped it so that -fopt-info is applied
>>> to specific passes as explained below.
>>>
>>> > I suppose the TDF_* flags would all map to
>>> >
>>> > MSG_OPTIMIZED_LOCATIONS|MSG_UNOPTIMIZED_LOCATIONS|MSG_MISSED_OPTIMIZATION
>>> > and only TDF_DETAILS would enable MSG_NOTE?
>>>
>>> Yes, the mapping is very coarse-grained right now. Currently I have
>>> made MSG_* flags an extension of TDF_* flags. This way both kinds of
>>> flags can be present in a single word yet still be interpreted by the
>>> dumps which are interested in them. TDF_DETAILS gets mapped to a union
>>> of all the MSG_* flags. This will allow flags such -fdump-vect-details
>>> to continue to work as expected during the transition to the new dump
>>> infrastructure. Of coure,  during the transition, we would need to
>>> rejigger flags to express the desired behavior.
>>>
>>> > In the above a flag for MSG_NONE does not make much sense?
>>>
>>> Removed. It is now implicitly 0.
>>>
>>> >
>>> > How would we map TDF_SCEV?  I suppose we'd add MSG_SCEV for this
>>> > (we talked about the possibility of having some special flags for
>>> > special passes).
>>>
>>> Yes, I think this would be necessary. So far, I haven't converted any
>>> passes other than vectorization, and I suspect a handful of special
>>> case flags would be needed. During the transition, the TDF_* flags are
>>> assumed to be an extension of MSG_* flags and things won't break.
>>>
>>> >
>>> > But this also means a linear "opt-info-level" as in
>>> >
>>> > +static void
>>> > +set_opt_info (int opt_info_level)
>>> >
>>> > may not be a good representation.  Not a pass-ignorant opt_info_flags
>>> > value, if you consider -fopt-info=vect,3 (details from the vectorizer
>>> > please).
>>>
>>> Agreed. As I mentioned earlier, I was considering a simple linear view
>>> of -fopt-info. However, I understand that making it pass cognizant is
>>> certainly better. To this end, I have modified the syntax of
>>> -fopt-info somewhat along the lines of -fdump-xxx format. For example,
>>>
>>> gcc ... -fopt-info-tree-vect-optimized=foo.vect.optimized
>>> -fdump-tree-pre-details=stderr ...
>>>
>>> This would dump info about optimized locations from the vectorizer
>>> pass into foo.vect.optimized, and pass dump from the PRE pass on to
>>> stderr.
>>>
>>> >
>>> > +}
>>> > +
>>> > +
>>> > +
>>> > +DEBUG_FUNCTION void
>>> > +dump_combine_stats (int flags)
>>> >
>>> > debug functions should be called debug_*, please add a comment and
>>> > avoid excessive vertical space
>>>
>>> Fixed.
>>>
>>> >
>>> > Index: tree-pass.h
>>> > ===================================================================
>>> > --- tree-pass.h (revision 188966)
>>> > +++ tree-pass.h (working copy)
>>> > @@ -85,7 +85,6 @@ enum tree_dump_index
>>> >  #define TDF_CSELIB     (1 << 23)       /* Dump cselib details.  */
>>> >  #define TDF_SCEV       (1 << 24)       /* Dump SCEV details.  */
>>> >
>>> > -
>>> >  /* In tree-dump.c */
>>> >
>>> >  extern char *get_dump_file_name (int);
>>> >
>>> > please avoid whitespace-only changes (also elsewhere).
>>>
>>> Fixed.
>>>
>>> >
>>> >  /* Global variables used to communicate with passes.  */
>>> >  extern FILE *dump_file;
>>> > +extern FILE *alt_dump_file;
>>> >  extern int dump_flags;
>>> > +extern int opt_info_flags;
>>> >
>>> > so I expected the primary dump_file to be controlled with dump_flags,
>>> > or with a (cached) translation of them to a primary_opt_info_flags.
>>>
>>> Yes, now I have two sets of flags.
>>>
>>> >
>>> > Index: gimple-pretty-print.h
>>> > ===================================================================
>>> > --- gimple-pretty-print.h       (revision 188966)
>>> > +++ gimple-pretty-print.h       (working copy)
>>> > @@ -31,6 +31,6 @@ extern void debug_gimple_seq (gimple_seq);
>>> >  extern void print_gimple_seq (FILE *, gimple_seq, int, int);
>>> >  extern void print_gimple_stmt (FILE *, gimple, int, int);
>>> >  extern void print_gimple_expr (FILE *, gimple, int, int);
>>> > -extern void dump_gimple_stmt (pretty_printer *, gimple, int, int);
>>> > +extern void print_gimple_stmt_1 (pretty_printer *, gimple, int, int);
>>> >
>>> > I'd call this pp_gimple_stmt instead.
>>>
>>> Fixed.
>>>
>>> >
>>> > @@ -1418,6 +1419,16 @@ init_branch_prob (void)
>>> >  void
>>> >  end_branch_prob (void)
>>> >  {
>>> > +  end_branch_prob_1 (dump_file);
>>> > +  end_branch_prob_1 (alt_dump_file);
>>> > +}
>>> > +
>>> > +/* Helper function for file-level cleanup for DUMP_FILE after
>>> > +   branch-prob processing is completed. */
>>> > +
>>> > +static void
>>> > +end_branch_prob_1 (FILE *dump_file)
>>> > +{
>>> >    if (dump_file)
>>> >      {
>>> >        fprintf (dump_file, "\n");
>>> >
>>> > That change looks odd ;)  Can you instead use the new dump_printf
>>> > interface?  (side-note: we should not need to export alt_dump_file at
>>> > all!)
>>>
>>> Sorry, it was my ill conceived attempt to avoid converting this pass.
>>> :) Anyway, I did a proper fix now, and converted an additional file
>>> (profile.c) as a side benefit.
>>>
>>> >
>>> > @@ -2166,10 +2177,6 @@ ftree-vect-loop-version
>>> >  Common Report Var(flag_tree_vect_loop_version) Init(1) Optimization
>>> >  Enable loop versioning when doing loop vectorization on trees
>>> >
>>> > -ftree-vectorizer-verbose=
>>> > -Common RejectNegative Joined UInteger
>>> > --ftree-vectorizer-verbose=<number>     Set the verbosity level of the
>>> > vectorizer
>>> > -
>>> >
>>> > we need to preserve old switches for backward compatibility, I suggest
>>> > to alias it to -fopt-info for now.
>>>
>>> Okay. I mapped -ftree-vectorizer-verbose=<number> as
>>> 0 ==> No dump output
>>> 1 ==> dump optimized locations
>>> 2 ==> dump missed optimizations
>>> 3 ==> dump notes
>>> 4 and up ==> dump all, i.e., 1, 2, 3
>>>
>>> Curiously, several tests are casualties of reinterpretation of this
>>> flag. Here is how -- the old (and odd) behavior was that
>>>
>>> gcc ... -ftree-vectorize -fdump-tree-vect -ftree-vectorizer-verbose=3
>>>
>>> would output tree-vect dump as usual but nothing would get printed by
>>> -ftree-vectorizer-verbose=3. In this patch I have "fixed" this
>>> behavior so that -ftree-vectorizer-verbose=<n> prints on stderr
>>> regardless of other flags present(*). This has the unfortunate side
>>> effect of extra output being sent to stderr which is interpreted as a
>>> test failure. Please advise how to ignore that additional stderr
>>> output in tests.
>>>
>>> (*) Well, not entirely true. Since -ftree-vectorizer-verbose=<n> is
>>> implemented in terms of -fopt-info, the output of verbose flag is the
>>> stream where ever vectorizer's opt-info is being sent.
>>>
>>> > @@ -13909,7 +13909,7 @@ unmentioned_reg_p (rtx equiv, rtx expr)
>>> >  }
>>> >  ^L
>>> >  DEBUG_FUNCTION void
>>> > -dump_combine_stats (FILE *file)
>>> > +print_combine_stats (FILE *file)
>>> >
>>> > see above, debug_combine_stats.
>>>
>>> Fixed.
>>>
>>> >
>>> > Index: system.h
>>> > ===================================================================
>>> > --- system.h    (revision 188966)
>>> > +++ system.h    (working copy)
>>> > @@ -669,6 +669,7 @@ extern int vsnprintf(char *, size_t, const char *,
>>> >     except, maybe, something to the dump file.  */
>>> >  #ifdef BUFSIZ
>>> >  extern FILE *dump_file;
>>> > +extern FILE *alt_dump_file;
>>> >
>>> > see above - we should not need to export this (nor opt_info_flags).
>>>
>>> Fixed.
>>>
>>> Please take another look and see if the attached patch looks okay? I
>>> still need to fix the failing tests due to intentional output on
>>> stderr.
>>>
>>> Thanks,
>>> Sharad
>>>
>>> >
>>> > Overall I like the patch!
>>> >
>>> > Thanks,
>>> > Richard.
>>> >
>>> >> Thanks,
>>> >> Sharad
>>> >>
>>> >> On Fri, Jun 15, 2012 at 1:18 AM, Richard Guenther
>>> >> <richard.guenther@gmail.com> wrote:
>>> >>> On Fri, Jun 15, 2012 at 7:47 AM, Sharad Singhai <singhai@google.com>
>>> >>> wrote:
>>> >>>> On Wed, Jun 13, 2012 at 4:48 AM, Richard Guenther
>>> >>>> <richard.guenther@gmail.com> wrote:
>>> >>>>> On Fri, Jun 8, 2012 at 7:16 AM, Sharad Singhai <singhai@google.com>
>>> >>>>> wrote:
>>> >>>>>> Okay, I have updated the attached patch so that the output from
>>> >>>>>> -ftree-vectorizer-verbose is considered diagnostic information and
>>> >>>>>> is
>>> >>>>>> always
>>> >>>>>> sent to stderr. Other functionality remains unchanged. Here is some
>>> >>>>>> more context about this patch.
>>> >>>>>>
>>> >>>>>> This patch improves the dump infrastructure and public interfaces
>>> >>>>>> so
>>> >>>>>> that the existing private pass-specific dump stream is separated
>>> >>>>>> from
>>> >>>>>> the diagnostic dump stream (typically stderr).  The optimization
>>> >>>>>> passes can output information on the two streams independently.
>>> >>>>>>
>>> >>>>>> The newly defined interfaces are:
>>> >>>>>>
>>> >>>>>> Individual passes do not need to access the dump file directly.
>>> >>>>>> Thus Instead
>>> >>>>>> of doing
>>> >>>>>>
>>> >>>>>>   if (dump_file && (flags & dump_flags))
>>> >>>>>>      fprintf (dump_file, ...);
>>> >>>>>>
>>> >>>>>> they can do
>>> >>>>>>
>>> >>>>>>     dump_printf (flags, ...);
>>> >>>>>>
>>> >>>>>> If the current pass has FLAGS enabled then the information gets
>>> >>>>>> printed into the dump file otherwise not.
>>> >>>>>>
>>> >>>>>> Similar to the dump_printf (), another function is defined, called
>>> >>>>>>
>>> >>>>>>        diag_printf (dump_flags, ...)
>>> >>>>>>
>>> >>>>>> This prints information only onto the diagnostic stream, typically
>>> >>>>>> standard error. It is useful for separating pass-specific dump
>>> >>>>>> information from
>>> >>>>>> the diagnostic information.
>>> >>>>>>
>>> >>>>>> Currently, as a proof of concept, I have converted vectorizer
>>> >>>>>> passes
>>> >>>>>> to use the new dump format. For this, I have considered
>>> >>>>>> information printed in vect_dump file as diagnostic. Thus 'fprintf'
>>> >>>>>> calls are changed to 'diag_printf'. Some other information printed
>>> >>>>>> to
>>> >>>>>> dump_file is sent to the regular dump file via 'dump_printf ()'. It
>>> >>>>>> helps to separate the two streams because they might serve
>>> >>>>>> different
>>> >>>>>> purposes and might have different formatting requirements.
>>> >>>>>>
>>> >>>>>> For example, using the trunk compiler, the following invocation
>>> >>>>>>
>>> >>>>>> g++ -S v.cc -ftree-vectorize -fdump-tree-vect
>>> >>>>>> -ftree-vectorizer-verbose=2
>>> >>>>>>
>>> >>>>>> prints tree vectorizer dump into a file named 'v.cc.113t.vect'.
>>> >>>>>> However, the verbose diagnostic output is silently
>>> >>>>>> ignored. This is not desirable as the two types of dump should not
>>> >>>>>> interfere.
>>> >>>>>>
>>> >>>>>> After this patch, the vectorizer dump is available in
>>> >>>>>> 'v.cc.113t.vect'
>>> >>>>>> as before, but the verbose vectorizer diagnostic is additionally
>>> >>>>>> printed on stderr. Thus both types of dump information are output.
>>> >>>>>>
>>> >>>>>> An additional feature of this patch is that individual passes can
>>> >>>>>> print dump information into command-line named files instead of
>>> >>>>>> auto
>>> >>>>>> numbered filename. For example,
>>> >>>>>
>>> >>>>> I'd wish you'd leave out this part for a followup.
>>> >>>>>
>>> >>>>>>
>>> >>>>>> g++ -S -O2 v.cc -ftree-vectorize -fdump-tree-vect=foo.vect
>>> >>>>>>     -ftree-vectorizer-verbose=2
>>> >>>>>>     -fdump-tree-pre=foo.pre
>>> >>>>>>
>>> >>>>>> This prints the tree vectorizer dump into 'foo.vect', PRE dump into
>>> >>>>>> 'foo.pre', and the vectorizer verbose diagnostic dump onto stderr.
>>> >>>>>>
>>> >>>>>> Please take another look.
>>> >>>>>
>>> >>>>> --- tree-vect-loop-manip.c      (revision 188325)
>>> >>>>> +++ tree-vect-loop-manip.c      (working copy)
>>> >>>>> @@ -789,14 +789,11 @@ slpeel_make_loop_iterate_ntimes (struct loop
>>> >>>>> *loop
>>> >>>>>   gsi_remove (&loop_cond_gsi, true);
>>> >>>>>
>>> >>>>>   loop_loc = find_loop_location (loop);
>>> >>>>> -  if (dump_file && (dump_flags & TDF_DETAILS))
>>> >>>>> -    {
>>> >>>>> -      if (loop_loc != UNKNOWN_LOC)
>>> >>>>> -        fprintf (dump_file, "\nloop at %s:%d: ",
>>> >>>>> +  if (loop_loc != UNKNOWN_LOC)
>>> >>>>> +    dump_printf (TDF_DETAILS, "\nloop at %s:%d: ",
>>> >>>>>                  LOC_FILE (loop_loc), LOC_LINE (loop_loc));
>>> >>>>> -      print_gimple_stmt (dump_file, cond_stmt, 0, TDF_SLIM);
>>> >>>>> -    }
>>> >>>>> -
>>> >>>>> +  if (dump_flags & TDF_DETAILS)
>>> >>>>> +    dump_gimple_stmt (TDF_SLIM, cond_stmt, 0);
>>> >>>>>   loop->nb_iterations = niters;
>>> >>>>>
>>> >>>>> I'm confused by this.  Why is this not simply
>>> >>>>>
>>> >>>>>  if (loop_loc != UNKNOWN_LOC)
>>> >>>>>    dump_printf (dump_flags, "\nloop at %s:%d: ",
>>> >>>>>                       LOC_FILE (loop_loc), LOC_LINE (loop_loc));
>>> >>>>>  dump_gimple_stmt (dump_flags | TDF_SLIM, cond_stmt, 0);
>>> >>>>>
>>> >>>>> for example.  I notice that you maybe mis-understood the message
>>> >>>>> classification
>>> >>>>> I asked you to add (maybe I confused you by mentioning to eventually
>>> >>>>> re-use
>>> >>>>> the TDF_* flags).  I think you basically provided this message
>>> >>>>> classification
>>> >>>>> by adding two classes by providing both dump_gimple_stmt and
>>> >>>>> diag_gimple_stmt.
>>> >>>>> But still in the above you keep a dump_flags test _and_ you pass in
>>> >>>>> (altered) dump_flags to the dump/diag_gimple_stmt routines.  Let me
>>> >>>>> quote them:
>>> >>>>>
>>> >>>>> +void
>>> >>>>> +dump_gimple_stmt (int flags, gimple gs, int spc)
>>> >>>>> +{
>>> >>>>> +  if (dump_file)
>>> >>>>> +    print_gimple_stmt (dump_file, gs, spc, flags);
>>> >>>>> +}
>>> >>>>>
>>> >>>>> +void
>>> >>>>> +diag_gimple_stmt (int flags, gimple gs, int spc)
>>> >>>>> +{
>>> >>>>> +  if (alt_dump_file)
>>> >>>>> +    print_gimple_stmt (alt_dump_file, gs, spc, flags);
>>> >>>>> +}
>>> >>>>>
>>> >>>>> I'd say it should have been a single function:
>>> >>>>>
>>> >>>>> void
>>> >>>>> dump_gimple_stmt (enum msg_classification, int additional_flags,
>>> >>>>> gimple gs, int spc)
>>> >>>>> {
>>> >>>>>  if (msg_classification & go-to-dumpfile
>>> >>>>>      && dump_file)
>>> >>>>>    print_gimple_stmt (dump_file, gs, spc, dump_flags |
>>> >>>>> additional_flags);
>>> >>>>>  if (msg_classification & go-to-alt-dump-file
>>> >>>>>      && alt_dump_file && (alt_dump_flags & msg_classification))
>>> >>>>>    print_gimple_stmt (alt_dump_file, gs, spc, alt_dump_flags |
>>> >>>>> additional_flags);
>>> >>>>> }
>>> >>>>
>>> >>>> Yes,  I can make the single function work for both regular
>>> >>>> (dump_file)
>>> >>>> and diagnostic (stderr for now) dump streams. In fact, my original
>>> >>>> patch did just that. However, it duplicated output on *both* the
>>> >>>> streams. For example,
>>> >>>>
>>> >>>> gcc -O2 -ftree-vectorize -fdump-tree-vect=foo.vect
>>> >>>> -ftree-vectorizer-verbose=2 foo.c
>>> >>>>
>>> >>>> Since both streams are enabled in this case, a call to dump_printf ()
>>> >>>> would duplicate the information to both files, i.e., the dump and
>>> >>>> diagnostic will be intermixed. Is that intended? My first thought was
>>> >>>> to keep them separated via dump_*() and diag_* () methods.
>>> >>>
>>> >>> No, I think that's intended.  The regular dump-files should contain
>>> >>> everything, the secondary stream (eventually enabled by -fopt-info)
>>> >>> should be a filtered regular dump-file.
>>> >>>
>>> >>> Thanks,
>>> >>> Richard.
>>> >>>
>>> >>>> Thanks,
>>> >>>> Sharad
>>> >>>>
>>> >>>>> where msg_classification would include things to suitably classify
>>> >>>>> messages
>>> >>>>> for -fopt-info, like MSG_MISSED_OPTIMIZATION, MSG_OPTIMIZED,
>>> >>>>> MSG_NOTE.
>>> >>>>>
>>> >>>>> In another place we have
>>> >>>>>
>>> >>>>> @@ -1648,7 +1642,7 @@ vect_can_advance_ivs_p (loop_vec_info
>>> >>>>> loop_vinfo)
>>> >>>>>   /* Analyze phi functions of the loop header.  */
>>> >>>>>
>>> >>>>>   if (vect_print_dump_info (REPORT_DETAILS))
>>> >>>>> -    fprintf (vect_dump, "vect_can_advance_ivs_p:");
>>> >>>>> +    diag_printf (TDF_TREE, "vect_can_advance_ivs_p:");
>>> >>>>>
>>> >>>>> so why's that diag_printf and why TDF_TREE?  I suppose you made all
>>> >>>>> dumps to vect_dump diag_* and all dumps to dump_file dump_*?  I
>>> >>>>> think it should have been
>>> >>>>>
>>> >>>>>   dump_printf (REPORT_DETAILS, "vect_can_advance_ivs_p:");
>>> >>>>>
>>> >>>>> thus dump_printf only gets the msg-classification and the printf
>>> >>>>> args
>>> >>>>> (dump-flags
>>> >>>>> are only meaningful when passed down to pretty-printers).  Thus
>>> >>>>>
>>> >>>>> @@ -1658,8 +1652,8 @@ vect_can_advance_ivs_p (loop_vec_info
>>> >>>>> loop_vinfo)
>>> >>>>>       phi = gsi_stmt (gsi);
>>> >>>>>       if (vect_print_dump_info (REPORT_DETAILS))
>>> >>>>>        {
>>> >>>>> -          fprintf (vect_dump, "Analyze phi: ");
>>> >>>>> -          print_gimple_stmt (vect_dump, phi, 0, TDF_SLIM);
>>> >>>>> +          diag_printf (TDF_TREE, "Analyze phi: ");
>>> >>>>> +          diag_gimple_stmt (TDF_SLIM, phi, 0);
>>> >>>>>        }
>>> >>>>>
>>> >>>>> should be
>>> >>>>>
>>> >>>>>  dump_printf (REPORT_DETAILS, "Analyze phi: ");
>>> >>>>>  dump_gimple_stmt (REPORT_DETAILS, TDF_SLIM, phi, 0);
>>> >>>>>
>>> >>>>> and the if (vect_print_dump_info (REPORT_DETAILS)) should be what
>>> >>>>> the dump infrastructure provides and thus hidden.  Eventually we
>>> >>>>> should
>>> >>>>> have a dump_kind (msg-classification) so we can replace it with
>>> >>>>>
>>> >>>>>   if (dump_kind (REPORT_DETAILS))
>>> >>>>>     {
>>> >>>>>       dump_printf (REPORT_DETAILS, "Analyze phi: ");
>>> >>>>>       dump_gimple_stmt (REPORT_DETAILS, TDF_SLIM, phi, 0);
>>> >>>>>     }
>>> >>>>>
>>> >>>>> to reduce the runtime overhead when not diagnosing/dumping.
>>> >>>>>
>>> >>>>> @@ -2464,8 +2458,8 @@ vect_create_cond_for_alias_checks
>>> >>>>> (loop_vec_info l
>>> >>>>>     }
>>> >>>>>
>>> >>>>>   if (vect_print_dump_info (REPORT_VECTORIZED_LOCATIONS))
>>> >>>>> -    fprintf (vect_dump, "created %u versioning for alias
>>> >>>>> checks.\n",
>>> >>>>> -             VEC_length (ddr_p, may_alias_ddrs));
>>> >>>>> +    diag_printf (TDF_TREE, "created %u versioning for alias
>>> >>>>> checks.\n",
>>> >>>>> +                 VEC_length (ddr_p, may_alias_ddrs));
>>> >>>>>  }
>>> >>>>>
>>> >>>>> so here we have a different msg-classification,
>>> >>>>> REPORT_VECTORIZED_LOCATIONS.  As we eventually want a central
>>> >>>>> -fopt-report=... we do not want to leave this implementation detail
>>> >>>>> to
>>> >>>>> individual passes but gather them in a central place.
>>> >>>>>
>>> >>>>> Thanks,
>>> >>>>> Richard.
>>> >>>>>
>>> >>>>>> Thanks,
>>> >>>>>> Sharad
>>> >>>>>>
>>> >>>>>>
>>> >>>>>> On Thu, Jun 7, 2012 at 12:19 AM, Xinliang David Li
>>> >>>>>> <davidxl@google.com> wrote:
>>> >>>>>>> On Wed, Jun 6, 2012 at 10:58 PM, Sharad Singhai
>>> >>>>>>> <singhai@google.com> wrote:
>>> >>>>>>>> Sorry about the delay. I have finally incorporated all the
>>> >>>>>>>> suggestions
>>> >>>>>>>> and reorganized the dump infrastructure a bit. The attached patch
>>> >>>>>>>> updates vectorizer passes so that instead of accessing global
>>> >>>>>>>> dump_file directly, these passes call dump_printf (FLAG, format,
>>> >>>>>>>> ...).
>>> >>>>>>>> The dump_printf can choose between two streams, one regular pass
>>> >>>>>>>> dump
>>> >>>>>>>> file, and another optional command line provided file. Currently,
>>> >>>>>>>> it
>>> >>>>>>>> doesn't discriminate and all the dump information goes to both
>>> >>>>>>>> the
>>> >>>>>>>> streams.
>>> >>>>>>>>
>>> >>>>>>>> Thus, for example,
>>> >>>>>>>>
>>> >>>>>>>> g++ -O2 v.cc -ftree-vectorize -fdump-tree-vect=foo.v
>>> >>>>>>>> -ftree-vectorizer-verbose=3
>>> >>>>>>>
>>> >>>>>>> But the default form of dump option (-fdump-tree-vect) no longer
>>> >>>>>>> interferes with -ftree-vectorize-verbose=xxx output right? (this
>>> >>>>>>> is
>>> >>>>>>> one of the main requirements). One dumped to the primary stream
>>> >>>>>>> (named
>>> >>>>>>> dump file) and the other to the stderr -- the default diagnostic
>>> >>>>>>> (alt)
>>> >>>>>>> stream.
>>> >>>>>>>
>>> >>>>>>> David
>>> >>>>>>>
>>> >>>>>>>>
>>> >>>>>>>> will output the verbose vectorizer information in both *.vect
>>> >>>>>>>> file and
>>> >>>>>>>> foo.v file. However, as I have converted only vectorizer passes
>>> >>>>>>>> so
>>> >>>>>>>> far, there is additional information in *.vect file which is not
>>> >>>>>>>> present in foo.v file. Once other passes are converted to use
>>> >>>>>>>> this
>>> >>>>>>>> scheme, then these two dump files should have identical output.
>>> >>>>>>>>
>>> >>>>>>>> Also note that in this patch -fdump-xxx=yyy format does not
>>> >>>>>>>> override
>>> >>>>>>>> any auto named dump files as in my earlier patches. Instead the
>>> >>>>>>>> dump
>>> >>>>>>>> information is output to both places when a command line dump
>>> >>>>>>>> file
>>> >>>>>>>> option is provided.
>>> >>>>>>>>
>>> >>>>>>>> To summarize:
>>> >>>>>>>> - instead of using dump_begin () / dump_end (), the passes should
>>> >>>>>>>> use
>>> >>>>>>>> dump_start ()/dump_finish (). These new variants do not return
>>> >>>>>>>> the
>>> >>>>>>>> dump_file. However, they still set the global
>>> >>>>>>>> dump_file/dump_flags for
>>> >>>>>>>> the benefit of other passes during the transition.
>>> >>>>>>>> - instead of directly printing to the dump_file, as in
>>> >>>>>>>> if (dump_file)
>>> >>>>>>>>  fprintf (dump_file, ...);
>>> >>>>>>>>
>>> >>>>>>>> The passes should do
>>> >>>>>>>>
>>> >>>>>>>> dump_printf (dump_flag, ...);
>>> >>>>>>>> This will output to dump file(s) only when dump_flag is enabled
>>> >>>>>>>> for a
>>> >>>>>>>> given pass.
>>> >>>>>>>>
>>> >>>>>>>> I have bootstrapped and tested it on x86_64. Does it look okay?
>>> >>>>>>>>
>>> >>>>>>>> Thanks,
>>> >>>>>>>> Sharad
>>> >>>>>>>>
>>> >>>>>>>>
>>> >>>>>>>> On Mon, May 14, 2012 at 12:26 AM, Richard Guenther
>>> >>>>>>>> <richard.guenther@gmail.com> wrote:
>>> >>>>>>>>> On Sat, May 12, 2012 at 6:39 PM, Xinliang David Li
>>> >>>>>>>>> <davidxl@google.com> wrote:
>>> >>>>>>>>>> On Sat, May 12, 2012 at 9:26 AM, Gabriel Dos Reis
>>> >>>>>>>>>> <gdr@integrable-solutions.net> wrote:
>>> >>>>>>>>>>> On Sat, May 12, 2012 at 11:05 AM, Xinliang David Li
>>> >>>>>>>>>>> <davidxl@google.com> wrote:
>>> >>>>>>>>>>>
>>> >>>>>>>>>>>> The downside is that the dump file format will look different
>>> >>>>>>>>>>>> from the
>>> >>>>>>>>>>>> stderr output which is less than ideal.
>>> >>>>>>>>>>>
>>> >>>>>>>>>>> BTW, why do people want to use stderr for dumping internal
>>> >>>>>>>>>>> IRs,
>>> >>>>>>>>>>> as opposed to stdout or other files?  That does not sound
>>> >>>>>>>>>>> right.
>>> >>>>>>>>>>>
>>> >>>>>>>>>>
>>> >>>>>>>>>> I was talking about the transformation information difference.
>>> >>>>>>>>>> In
>>> >>>>>>>>>> stderr (where diagnostics go to), we may have
>>> >>>>>>>>>>
>>> >>>>>>>>>> foo.c: in function 'foo':
>>> >>>>>>>>>> foo.c:5:6: note: loop was vectorized
>>> >>>>>>>>>>
>>> >>>>>>>>>> but in dump file the format for the information may be
>>> >>>>>>>>>> different,
>>> >>>>>>>>>> unless we want to duplicate the machinery in diagnostics.
>>> >>>>>>>>>
>>> >>>>>>>>> So?  What's the problem with that ("different" diagnostics)?
>>> >>>>>>>>>
>>> >>>>>>>>> Richard.
>>> >>>>>>>>>
>>> >>>>>>>>>> David
>>> >>>>>>>>>>
>>> >>>>>>>>>>> -- Gaby
>>
>>

Patch

Index: tree-pass.h
===================================================================
--- tree-pass.h (revision 188966)
+++ tree-pass.h (working copy)
@@ -85,7 +85,6 @@  enum tree_dump_index
 #define TDF_CSELIB     (1 << 23)       /* Dump cselib details.  */
 #define TDF_SCEV       (1 << 24)       /* Dump SCEV details.  */

-
 /* In tree-dump.c */

 extern char *get_dump_file_name (int);

please avoid whitespace-only changes (also elsewhere).

 /* Global variables used to communicate with passes.  */
 extern FILE *dump_file;
+extern FILE *alt_dump_file;
 extern int dump_flags;
+extern int opt_info_flags;

so I expected the primary dump_file to be controlled with dump_flags,
or with a (cached) translation of them to a primary_opt_info_flags.

Index: gimple-pretty-print.h
===================================================================
--- gimple-pretty-print.h       (revision 188966)
+++ gimple-pretty-print.h       (working copy)
@@ -31,6 +31,6 @@  extern void debug_gimple_seq (gimple_seq);
 extern void print_gimple_seq (FILE *, gimple_seq, int, int);
 extern void print_gimple_stmt (FILE *, gimple, int, int);
 extern void print_gimple_expr (FILE *, gimple, int, int);
-extern void dump_gimple_stmt (pretty_printer *, gimple, int, int);
+extern void print_gimple_stmt_1 (pretty_printer *, gimple, int, int);

I'd call this pp_gimple_stmt instead.

@@ -1418,6 +1419,16 @@  init_branch_prob (void)
 void
 end_branch_prob (void)
 {
+  end_branch_prob_1 (dump_file);
+  end_branch_prob_1 (alt_dump_file);
+}
+
+/* Helper function for file-level cleanup for DUMP_FILE after
+   branch-prob processing is completed. */
+
+static void
+end_branch_prob_1 (FILE *dump_file)
+{
   if (dump_file)
     {
       fprintf (dump_file, "\n");

That change looks odd ;)  Can you instead use the new dump_printf
interface?  (side-note: we should not need to export alt_dump_file at all!)

@@ -2166,10 +2177,6 @@  ftree-vect-loop-version
 Common Report Var(flag_tree_vect_loop_version) Init(1) Optimization
 Enable loop versioning when doing loop vectorization on trees

-ftree-vectorizer-verbose=
-Common RejectNegative Joined UInteger
--ftree-vectorizer-verbose=<number>     Set the verbosity level of the
vectorizer
-

we need to preserve old switches for backward compatibility, I suggest
to alias it to -fopt-info for now.

@@ -13909,7 +13909,7 @@  unmentioned_reg_p (rtx equiv, rtx expr)
 }
 ^L
 DEBUG_FUNCTION void
-dump_combine_stats (FILE *file)
+print_combine_stats (FILE *file)

see above, debug_combine_stats.

Index: system.h
===================================================================
--- system.h    (revision 188966)
+++ system.h    (working copy)
@@ -669,6 +669,7 @@  extern int vsnprintf(char *, size_t, const char *,
    except, maybe, something to the dump file.  */
 #ifdef BUFSIZ
 extern FILE *dump_file;
+extern FILE *alt_dump_file;

see above - we should not need to export this (nor opt_info_flags).