[RFC] Introduce -fdump*-folding

Message ID	668c62b5-9603-47c5-9bfe-266d0d369a1f@suse.cz
State	New
Headers	show Return-Path: <gcc-patches-return-452663-incoming=patchwork.ozlabs.org@gcc.gnu.org> DomainKey-Signature: a=rsa-sha1; c=nofws; d=gcc.gnu.org; h=list-id :list-unsubscribe:list-archive:list-post:list-help:sender:to:cc :from:subject:message-id:date:mime-version:content-type; q=dns; s=default; b=UXAukH6hNzhAYLCA+gVwkjlFK/Ru+8GQhaFOPYPiIKENzj+Ov0 45qBYqziFjDfWB0FSvZWbfAcXEWVRIx/k9yNLONK176JBRFoEQ4vAfkLUjjxGpbi 6AlboAMOYxSlNiC03ae/wPqX8lKu9YiHEJhad5goL2yPsRQ3X5/NMNmh8= Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm Precedence: bulk Sender: gcc-patches-owner@gcc.gnu.org To: GCC Patches <gcc-patches@gcc.gnu.org> Cc: Richard Biener <richard.guenther@gmail.com> From: =?UTF-8?Q?Martin_Li=c5=a1ka?= <mliska@suse.cz> Subject: [RFC][PATCH] Introduce -fdump*-folding Message-ID: <668c62b5-9603-47c5-9bfe-266d0d369a1f@suse.cz> Date: Wed, 3 May 2017 10:10:44 +0200 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:45.0) Gecko/20100101 Thunderbird/45.8.0 MIME-Version: 1.0 Content-Type: multipart/mixed; boundary="------------043E0B1481B58BF8D8E05085"

Martin Liška May 3, 2017, 8:10 a.m. UTC

Hello

Last release cycle I spent quite some time with reading of IVOPTS pass
dump file. Using -fdump*-details causes to generate a lot of 'Applying pattern'
lines, which can make reading of a dump file more complicated.

There are stats for tramp3d with -O2 and -fdump-tree-all-details. Percentage number
shows how many lines are of the aforementioned pattern:

                        tramp3d-v4.cpp.164t.ivopts: 6.34%
                          tramp3d-v4.cpp.091t.ccp2: 5.04%
                      tramp3d-v4.cpp.093t.cunrolli: 4.41%
                      tramp3d-v4.cpp.129t.laddress: 3.70%
                          tramp3d-v4.cpp.032t.ccp1: 2.31%
                          tramp3d-v4.cpp.038t.evrp: 1.90%
                     tramp3d-v4.cpp.033t.forwprop1: 1.74%
                          tramp3d-v4.cpp.103t.vrp1: 1.52%
                     tramp3d-v4.cpp.124t.forwprop3: 1.31%
                          tramp3d-v4.cpp.181t.vrp2: 1.30%
                       tramp3d-v4.cpp.161t.cunroll: 1.22%
                    tramp3d-v4.cpp.027t.fixup_cfg3: 1.11%
                       tramp3d-v4.cpp.153t.ivcanon: 1.07%
                          tramp3d-v4.cpp.126t.ccp3: 0.96%
                          tramp3d-v4.cpp.143t.sccp: 0.91%
                     tramp3d-v4.cpp.185t.forwprop4: 0.82%
                           tramp3d-v4.cpp.011t.cfg: 0.74%
                     tramp3d-v4.cpp.096t.forwprop2: 0.50%
                    tramp3d-v4.cpp.019t.fixup_cfg1: 0.37%
                     tramp3d-v4.cpp.120t.phicprop1: 0.33%
                           tramp3d-v4.cpp.133t.pre: 0.32%
                     tramp3d-v4.cpp.182t.phicprop2: 0.27%
                    tramp3d-v4.cpp.170t.veclower21: 0.25%
                       tramp3d-v4.cpp.029t.einline: 0.24%

I'm suggesting to add new TDF that will be allocated for that.
Patch can bootstrap on ppc64le-redhat-linux and survives regression tests.

Thoughts?
Martin

Andrew Pinski May 3, 2017, 8:14 a.m. UTC | #1

On Wed, May 3, 2017 at 1:10 AM, Martin Liška <mliska@suse.cz> wrote:
> Hello
>
> Last release cycle I spent quite some time with reading of IVOPTS pass
> dump file. Using -fdump*-details causes to generate a lot of 'Applying pattern'
> lines, which can make reading of a dump file more complicated.

Yes the folding message can get annoying.  Especially when it is not
clear what is being folded which is the case a lot of the time I have
seen the message.

Thanks,
Andrew

>
> There are stats for tramp3d with -O2 and -fdump-tree-all-details. Percentage number
> shows how many lines are of the aforementioned pattern:
>
>                         tramp3d-v4.cpp.164t.ivopts: 6.34%
>                           tramp3d-v4.cpp.091t.ccp2: 5.04%
>                       tramp3d-v4.cpp.093t.cunrolli: 4.41%
>                       tramp3d-v4.cpp.129t.laddress: 3.70%
>                           tramp3d-v4.cpp.032t.ccp1: 2.31%
>                           tramp3d-v4.cpp.038t.evrp: 1.90%
>                      tramp3d-v4.cpp.033t.forwprop1: 1.74%
>                           tramp3d-v4.cpp.103t.vrp1: 1.52%
>                      tramp3d-v4.cpp.124t.forwprop3: 1.31%
>                           tramp3d-v4.cpp.181t.vrp2: 1.30%
>                        tramp3d-v4.cpp.161t.cunroll: 1.22%
>                     tramp3d-v4.cpp.027t.fixup_cfg3: 1.11%
>                        tramp3d-v4.cpp.153t.ivcanon: 1.07%
>                           tramp3d-v4.cpp.126t.ccp3: 0.96%
>                           tramp3d-v4.cpp.143t.sccp: 0.91%
>                      tramp3d-v4.cpp.185t.forwprop4: 0.82%
>                            tramp3d-v4.cpp.011t.cfg: 0.74%
>                      tramp3d-v4.cpp.096t.forwprop2: 0.50%
>                     tramp3d-v4.cpp.019t.fixup_cfg1: 0.37%
>                      tramp3d-v4.cpp.120t.phicprop1: 0.33%
>                            tramp3d-v4.cpp.133t.pre: 0.32%
>                      tramp3d-v4.cpp.182t.phicprop2: 0.27%
>                     tramp3d-v4.cpp.170t.veclower21: 0.25%
>                        tramp3d-v4.cpp.029t.einline: 0.24%
>
> I'm suggesting to add new TDF that will be allocated for that.
> Patch can bootstrap on ppc64le-redhat-linux and survives regression tests.
>
> Thoughts?
> Martin

Richard Biener May 3, 2017, 10:12 a.m. UTC | #2

On Wed, May 3, 2017 at 10:10 AM, Martin Liška <mliska@suse.cz> wrote:
> Hello
>
> Last release cycle I spent quite some time with reading of IVOPTS pass
> dump file. Using -fdump*-details causes to generate a lot of 'Applying pattern'
> lines, which can make reading of a dump file more complicated.
>
> There are stats for tramp3d with -O2 and -fdump-tree-all-details. Percentage number
> shows how many lines are of the aforementioned pattern:
>
>                         tramp3d-v4.cpp.164t.ivopts: 6.34%
>                           tramp3d-v4.cpp.091t.ccp2: 5.04%
>                       tramp3d-v4.cpp.093t.cunrolli: 4.41%
>                       tramp3d-v4.cpp.129t.laddress: 3.70%
>                           tramp3d-v4.cpp.032t.ccp1: 2.31%
>                           tramp3d-v4.cpp.038t.evrp: 1.90%
>                      tramp3d-v4.cpp.033t.forwprop1: 1.74%
>                           tramp3d-v4.cpp.103t.vrp1: 1.52%
>                      tramp3d-v4.cpp.124t.forwprop3: 1.31%
>                           tramp3d-v4.cpp.181t.vrp2: 1.30%
>                        tramp3d-v4.cpp.161t.cunroll: 1.22%
>                     tramp3d-v4.cpp.027t.fixup_cfg3: 1.11%
>                        tramp3d-v4.cpp.153t.ivcanon: 1.07%
>                           tramp3d-v4.cpp.126t.ccp3: 0.96%
>                           tramp3d-v4.cpp.143t.sccp: 0.91%
>                      tramp3d-v4.cpp.185t.forwprop4: 0.82%
>                            tramp3d-v4.cpp.011t.cfg: 0.74%
>                      tramp3d-v4.cpp.096t.forwprop2: 0.50%
>                     tramp3d-v4.cpp.019t.fixup_cfg1: 0.37%
>                      tramp3d-v4.cpp.120t.phicprop1: 0.33%
>                            tramp3d-v4.cpp.133t.pre: 0.32%
>                      tramp3d-v4.cpp.182t.phicprop2: 0.27%
>                     tramp3d-v4.cpp.170t.veclower21: 0.25%
>                        tramp3d-v4.cpp.029t.einline: 0.24%
>
> I'm suggesting to add new TDF that will be allocated for that.
> Patch can bootstrap on ppc64le-redhat-linux and survives regression tests.
>
> Thoughts?

Ok.  Soon we'll want to change dump_flags to uint64_t ...  (we have 1 bit left
if you allow negative dump_flags).  It'll tickle down on a lot of interfaces
so introducing dump_flags_t at the same time might be a good idea.

Thanks,
Richard.

> Martin

Martin Liška May 4, 2017, 9:22 a.m. UTC | #3

On 05/03/2017 12:12 PM, Richard Biener wrote:
> On Wed, May 3, 2017 at 10:10 AM, Martin Liška <mliska@suse.cz> wrote:
>> Hello
>>
>> Last release cycle I spent quite some time with reading of IVOPTS pass
>> dump file. Using -fdump*-details causes to generate a lot of 'Applying pattern'
>> lines, which can make reading of a dump file more complicated.
>>
>> There are stats for tramp3d with -O2 and -fdump-tree-all-details. Percentage number
>> shows how many lines are of the aforementioned pattern:
>>
>>                         tramp3d-v4.cpp.164t.ivopts: 6.34%
>>                           tramp3d-v4.cpp.091t.ccp2: 5.04%
>>                       tramp3d-v4.cpp.093t.cunrolli: 4.41%
>>                       tramp3d-v4.cpp.129t.laddress: 3.70%
>>                           tramp3d-v4.cpp.032t.ccp1: 2.31%
>>                           tramp3d-v4.cpp.038t.evrp: 1.90%
>>                      tramp3d-v4.cpp.033t.forwprop1: 1.74%
>>                           tramp3d-v4.cpp.103t.vrp1: 1.52%
>>                      tramp3d-v4.cpp.124t.forwprop3: 1.31%
>>                           tramp3d-v4.cpp.181t.vrp2: 1.30%
>>                        tramp3d-v4.cpp.161t.cunroll: 1.22%
>>                     tramp3d-v4.cpp.027t.fixup_cfg3: 1.11%
>>                        tramp3d-v4.cpp.153t.ivcanon: 1.07%
>>                           tramp3d-v4.cpp.126t.ccp3: 0.96%
>>                           tramp3d-v4.cpp.143t.sccp: 0.91%
>>                      tramp3d-v4.cpp.185t.forwprop4: 0.82%
>>                            tramp3d-v4.cpp.011t.cfg: 0.74%
>>                      tramp3d-v4.cpp.096t.forwprop2: 0.50%
>>                     tramp3d-v4.cpp.019t.fixup_cfg1: 0.37%
>>                      tramp3d-v4.cpp.120t.phicprop1: 0.33%
>>                            tramp3d-v4.cpp.133t.pre: 0.32%
>>                      tramp3d-v4.cpp.182t.phicprop2: 0.27%
>>                     tramp3d-v4.cpp.170t.veclower21: 0.25%
>>                        tramp3d-v4.cpp.029t.einline: 0.24%
>>
>> I'm suggesting to add new TDF that will be allocated for that.
>> Patch can bootstrap on ppc64le-redhat-linux and survives regression tests.
>>
>> Thoughts?
>
> Ok.  Soon we'll want to change dump_flags to uint64_t ...  (we have 1 bit left
> if you allow negative dump_flags).  It'll tickle down on a lot of interfaces
> so introducing dump_flags_t at the same time might be a good idea.

Hello.

I've prepared patch that migrates all interfaces and introduces dump_flags_t. I've been
currently testing that. Apart from that Richi requested to come up with more generic approach
of hierarchical structure of options.

Can you please take a look at self-contained source file that shows way I've decided to go?
Another question is whether we want to implement also "aliases", where for instance
current 'all' is equal to union of couple of suboptions?

Thanks for feedback,
Martin

>
> Thanks,
> Richard.
>
>> Martin

Richard Biener May 4, 2017, 10:40 a.m. UTC | #4

On Thu, May 4, 2017 at 11:22 AM, Martin Liška <mliska@suse.cz> wrote:
> On 05/03/2017 12:12 PM, Richard Biener wrote:
>>
>> On Wed, May 3, 2017 at 10:10 AM, Martin Liška <mliska@suse.cz> wrote:
>>>
>>> Hello
>>>
>>> Last release cycle I spent quite some time with reading of IVOPTS pass
>>> dump file. Using -fdump*-details causes to generate a lot of 'Applying
>>> pattern'
>>> lines, which can make reading of a dump file more complicated.
>>>
>>> There are stats for tramp3d with -O2 and -fdump-tree-all-details.
>>> Percentage number
>>> shows how many lines are of the aforementioned pattern:
>>>
>>>                         tramp3d-v4.cpp.164t.ivopts: 6.34%
>>>                           tramp3d-v4.cpp.091t.ccp2: 5.04%
>>>                       tramp3d-v4.cpp.093t.cunrolli: 4.41%
>>>                       tramp3d-v4.cpp.129t.laddress: 3.70%
>>>                           tramp3d-v4.cpp.032t.ccp1: 2.31%
>>>                           tramp3d-v4.cpp.038t.evrp: 1.90%
>>>                      tramp3d-v4.cpp.033t.forwprop1: 1.74%
>>>                           tramp3d-v4.cpp.103t.vrp1: 1.52%
>>>                      tramp3d-v4.cpp.124t.forwprop3: 1.31%
>>>                           tramp3d-v4.cpp.181t.vrp2: 1.30%
>>>                        tramp3d-v4.cpp.161t.cunroll: 1.22%
>>>                     tramp3d-v4.cpp.027t.fixup_cfg3: 1.11%
>>>                        tramp3d-v4.cpp.153t.ivcanon: 1.07%
>>>                           tramp3d-v4.cpp.126t.ccp3: 0.96%
>>>                           tramp3d-v4.cpp.143t.sccp: 0.91%
>>>                      tramp3d-v4.cpp.185t.forwprop4: 0.82%
>>>                            tramp3d-v4.cpp.011t.cfg: 0.74%
>>>                      tramp3d-v4.cpp.096t.forwprop2: 0.50%
>>>                     tramp3d-v4.cpp.019t.fixup_cfg1: 0.37%
>>>                      tramp3d-v4.cpp.120t.phicprop1: 0.33%
>>>                            tramp3d-v4.cpp.133t.pre: 0.32%
>>>                      tramp3d-v4.cpp.182t.phicprop2: 0.27%
>>>                     tramp3d-v4.cpp.170t.veclower21: 0.25%
>>>                        tramp3d-v4.cpp.029t.einline: 0.24%
>>>
>>> I'm suggesting to add new TDF that will be allocated for that.
>>> Patch can bootstrap on ppc64le-redhat-linux and survives regression
>>> tests.
>>>
>>> Thoughts?
>>
>>
>> Ok.  Soon we'll want to change dump_flags to uint64_t ...  (we have 1 bit
>> left
>> if you allow negative dump_flags).  It'll tickle down on a lot of
>> interfaces
>> so introducing dump_flags_t at the same time might be a good idea.
>
>
> Hello.
>
> I've prepared patch that migrates all interfaces and introduces
> dump_flags_t.

Great.

> I've been
> currently testing that. Apart from that Richi requested to come up with more
> generic approach
> of hierarchical structure of options.

Didn't really "request" it, it's just something we eventually need to do when
we run out of bits again ;)

>
> Can you please take a look at self-contained source file that shows way I've
> decided to go?
> Another question is whether we want to implement also "aliases", where for
> instance
> current 'all' is equal to union of couple of suboptions?

Yeah, I think we do want -all-all-all and -foo-all to work.  Not sure
about -all-foo-all.

The important thing is to make sure dump_flags_t stays POD and thus is
eligible to be passed in register(s).  In the end we might simply come up
with a two-level hierarchy, each 32bits (or we can even get back to 32bits
in total with two times 16bits).

It looks you didn't actually implement this as a hierarchy though but
still allocate from one pool of bits (so you only do a change to how
users access this?)

Thanks,
Richard.

>
> Thanks for feedback,
> Martin
>
>>
>> Thanks,
>> Richard.
>>
>>> Martin
>
>

Martin Liška May 4, 2017, 11:10 a.m. UTC | #5

On 05/04/2017 12:40 PM, Richard Biener wrote:
> On Thu, May 4, 2017 at 11:22 AM, Martin Liška <mliska@suse.cz> wrote:
>> On 05/03/2017 12:12 PM, Richard Biener wrote:
>>>
>>> On Wed, May 3, 2017 at 10:10 AM, Martin Liška <mliska@suse.cz> wrote:
>>>>
>>>> Hello
>>>>
>>>> Last release cycle I spent quite some time with reading of IVOPTS pass
>>>> dump file. Using -fdump*-details causes to generate a lot of 'Applying
>>>> pattern'
>>>> lines, which can make reading of a dump file more complicated.
>>>>
>>>> There are stats for tramp3d with -O2 and -fdump-tree-all-details.
>>>> Percentage number
>>>> shows how many lines are of the aforementioned pattern:
>>>>
>>>>                         tramp3d-v4.cpp.164t.ivopts: 6.34%
>>>>                           tramp3d-v4.cpp.091t.ccp2: 5.04%
>>>>                       tramp3d-v4.cpp.093t.cunrolli: 4.41%
>>>>                       tramp3d-v4.cpp.129t.laddress: 3.70%
>>>>                           tramp3d-v4.cpp.032t.ccp1: 2.31%
>>>>                           tramp3d-v4.cpp.038t.evrp: 1.90%
>>>>                      tramp3d-v4.cpp.033t.forwprop1: 1.74%
>>>>                           tramp3d-v4.cpp.103t.vrp1: 1.52%
>>>>                      tramp3d-v4.cpp.124t.forwprop3: 1.31%
>>>>                           tramp3d-v4.cpp.181t.vrp2: 1.30%
>>>>                        tramp3d-v4.cpp.161t.cunroll: 1.22%
>>>>                     tramp3d-v4.cpp.027t.fixup_cfg3: 1.11%
>>>>                        tramp3d-v4.cpp.153t.ivcanon: 1.07%
>>>>                           tramp3d-v4.cpp.126t.ccp3: 0.96%
>>>>                           tramp3d-v4.cpp.143t.sccp: 0.91%
>>>>                      tramp3d-v4.cpp.185t.forwprop4: 0.82%
>>>>                            tramp3d-v4.cpp.011t.cfg: 0.74%
>>>>                      tramp3d-v4.cpp.096t.forwprop2: 0.50%
>>>>                     tramp3d-v4.cpp.019t.fixup_cfg1: 0.37%
>>>>                      tramp3d-v4.cpp.120t.phicprop1: 0.33%
>>>>                            tramp3d-v4.cpp.133t.pre: 0.32%
>>>>                      tramp3d-v4.cpp.182t.phicprop2: 0.27%
>>>>                     tramp3d-v4.cpp.170t.veclower21: 0.25%
>>>>                        tramp3d-v4.cpp.029t.einline: 0.24%
>>>>
>>>> I'm suggesting to add new TDF that will be allocated for that.
>>>> Patch can bootstrap on ppc64le-redhat-linux and survives regression
>>>> tests.
>>>>
>>>> Thoughts?
>>>
>>>
>>> Ok.  Soon we'll want to change dump_flags to uint64_t ...  (we have 1 bit
>>> left
>>> if you allow negative dump_flags).  It'll tickle down on a lot of
>>> interfaces
>>> so introducing dump_flags_t at the same time might be a good idea.
>>
>>
>> Hello.
>>
>> I've prepared patch that migrates all interfaces and introduces
>> dump_flags_t.
>
> Great.
>
>> I've been
>> currently testing that. Apart from that Richi requested to come up with more
>> generic approach
>> of hierarchical structure of options.
>
> Didn't really "request" it, it's just something we eventually need to do when
> we run out of bits again ;)

I know, but it was me who came up with the idea of more fine suboptions :)

>
>>
>> Can you please take a look at self-contained source file that shows way I've
>> decided to go?
>> Another question is whether we want to implement also "aliases", where for
>> instance
>> current 'all' is equal to union of couple of suboptions?
>
> Yeah, I think we do want -all-all-all and -foo-all to work.  Not sure
> about -all-foo-all.

Actually only having 'all' is quite easy to implement.

Let's imagine following hierarchy:

(root)
- vops
- folding
   - gimple
     - ctor
     - array_ref
     - arithmetic
   - generic
     - c
     - c++
     - ctor
     - xyz

Then '-fdump-passname-folding-all' will be equal to '-fdump-passname-folding'.

>
> The important thing is to make sure dump_flags_t stays POD and thus is
> eligible to be passed in register(s).  In the end we might simply come up
> with a two-level hierarchy, each 32bits (or we can even get back to 32bits
> in total with two times 16bits).

I'm aware of having the type as POD.

>
> It looks you didn't actually implement this as a hierarchy though but
> still allocate from one pool of bits (so you only do a change to how
> users access this?)

Yep, all leaf options are mapped to a mask and all inner nodes are just union
of suboptions. That will allow us to have 64 leaf suboptions. Reaching the limit
we can encode the values in more sophisticated way. That however brings need
to implement more complicated '&' and '|' operators.

I'll finish the implementation and try to migrate that to current handling.
Guess, I'm quite close.

Martin

>
> Thanks,
> Richard.
>
>>
>> Thanks for feedback,
>> Martin
>>
>>>
>>> Thanks,
>>> Richard.
>>>
>>>> Martin
>>
>>

Martin Liška May 5, 2017, 10:41 a.m. UTC | #6

Hello.

There's first patch that just defines dump_flags_t as uint64_t and changes all
corresponding interfaces that do use it. There's a problematic impact that
all targets have to include dumpfile.h just right after coretypes.h. That makes
the patch harder to properly test. I tried couple of cross-compilers and it works.

Patch can bootstrap on ppc64le-redhat-linux and survives regression tests.
I've been also testing i686-linux-gnu bootstrap and x86_64-linux-gnu targets.

Thoughts?
Martin

Richard Biener May 5, 2017, 11:38 a.m. UTC | #7

On Fri, May 5, 2017 at 12:41 PM, Martin Liška <mliska@suse.cz> wrote:
> Hello.
>
> There's first patch that just defines dump_flags_t as uint64_t and changes all
> corresponding interfaces that do use it. There's a problematic impact that
> all targets have to include dumpfile.h just right after coretypes.h. That makes
> the patch harder to properly test. I tried couple of cross-compilers and it works.
>
> Patch can bootstrap on ppc64le-redhat-linux and survives regression tests.
> I've been also testing i686-linux-gnu bootstrap and x86_64-linux-gnu targets.
>
> Thoughts?

So usually we get away with defining pervasive types in coretypes.h instead.

Now I see how that can be not what we want if dump_flags_t becomes a
"class".  Well, at least if it has too many (inline) methods.

How many of our files end up including dumpfile.h?  There's

/* Most host source files will require the following headers.  */
#if !defined (GENERATOR_FILE) && !defined (USED_FOR_TARGET)
#include "machmode.h"
#include "signop.h"
#include "wide-int.h"

at the end of coretypes.h so it might be possible to stuff dumpfile.h there.

Or create a dump-flags.h header just containing the flag bits.  I suppose
most files need it because they include some interfaces that have
dump_flags_t, not because they do dumping (they'd include dumpflags.h
already), right?

Anyway, I think the patch is a good thing (how do we make sure we
don't "regress", aka people using 'int', not knowing about dump_flag_t?).

I'd probably, as a first step, simply put the typedef into coretypes.h.

A patch doing that instead of sprinkling dumpfile.h everywhere is ok.

Thanks,
Richard.

> Martin

Richard Biener May 5, 2017, 11:50 a.m. UTC | #8

On Thu, May 4, 2017 at 1:10 PM, Martin Liška <mliska@suse.cz> wrote:
> On 05/04/2017 12:40 PM, Richard Biener wrote:
>>
>> On Thu, May 4, 2017 at 11:22 AM, Martin Liška <mliska@suse.cz> wrote:
>>>
>>> On 05/03/2017 12:12 PM, Richard Biener wrote:
>>>>
>>>>
>>>> On Wed, May 3, 2017 at 10:10 AM, Martin Liška <mliska@suse.cz> wrote:
>>>>>
>>>>>
>>>>> Hello
>>>>>
>>>>> Last release cycle I spent quite some time with reading of IVOPTS pass
>>>>> dump file. Using -fdump*-details causes to generate a lot of 'Applying
>>>>> pattern'
>>>>> lines, which can make reading of a dump file more complicated.
>>>>>
>>>>> There are stats for tramp3d with -O2 and -fdump-tree-all-details.
>>>>> Percentage number
>>>>> shows how many lines are of the aforementioned pattern:
>>>>>
>>>>>                         tramp3d-v4.cpp.164t.ivopts: 6.34%
>>>>>                           tramp3d-v4.cpp.091t.ccp2: 5.04%
>>>>>                       tramp3d-v4.cpp.093t.cunrolli: 4.41%
>>>>>                       tramp3d-v4.cpp.129t.laddress: 3.70%
>>>>>                           tramp3d-v4.cpp.032t.ccp1: 2.31%
>>>>>                           tramp3d-v4.cpp.038t.evrp: 1.90%
>>>>>                      tramp3d-v4.cpp.033t.forwprop1: 1.74%
>>>>>                           tramp3d-v4.cpp.103t.vrp1: 1.52%
>>>>>                      tramp3d-v4.cpp.124t.forwprop3: 1.31%
>>>>>                           tramp3d-v4.cpp.181t.vrp2: 1.30%
>>>>>                        tramp3d-v4.cpp.161t.cunroll: 1.22%
>>>>>                     tramp3d-v4.cpp.027t.fixup_cfg3: 1.11%
>>>>>                        tramp3d-v4.cpp.153t.ivcanon: 1.07%
>>>>>                           tramp3d-v4.cpp.126t.ccp3: 0.96%
>>>>>                           tramp3d-v4.cpp.143t.sccp: 0.91%
>>>>>                      tramp3d-v4.cpp.185t.forwprop4: 0.82%
>>>>>                            tramp3d-v4.cpp.011t.cfg: 0.74%
>>>>>                      tramp3d-v4.cpp.096t.forwprop2: 0.50%
>>>>>                     tramp3d-v4.cpp.019t.fixup_cfg1: 0.37%
>>>>>                      tramp3d-v4.cpp.120t.phicprop1: 0.33%
>>>>>                            tramp3d-v4.cpp.133t.pre: 0.32%
>>>>>                      tramp3d-v4.cpp.182t.phicprop2: 0.27%
>>>>>                     tramp3d-v4.cpp.170t.veclower21: 0.25%
>>>>>                        tramp3d-v4.cpp.029t.einline: 0.24%
>>>>>
>>>>> I'm suggesting to add new TDF that will be allocated for that.
>>>>> Patch can bootstrap on ppc64le-redhat-linux and survives regression
>>>>> tests.
>>>>>
>>>>> Thoughts?
>>>>
>>>>
>>>>
>>>> Ok.  Soon we'll want to change dump_flags to uint64_t ...  (we have 1
>>>> bit
>>>> left
>>>> if you allow negative dump_flags).  It'll tickle down on a lot of
>>>> interfaces
>>>> so introducing dump_flags_t at the same time might be a good idea.
>>>
>>>
>>>
>>> Hello.
>>>
>>> I've prepared patch that migrates all interfaces and introduces
>>> dump_flags_t.
>>
>>
>> Great.
>>
>>> I've been
>>> currently testing that. Apart from that Richi requested to come up with
>>> more
>>> generic approach
>>> of hierarchical structure of options.
>>
>>
>> Didn't really "request" it, it's just something we eventually need to do
>> when
>> we run out of bits again ;)
>
>
> I know, but it was me who came up with the idea of more fine suboptions :)
>
>>
>>>
>>> Can you please take a look at self-contained source file that shows way
>>> I've
>>> decided to go?
>>> Another question is whether we want to implement also "aliases", where
>>> for
>>> instance
>>> current 'all' is equal to union of couple of suboptions?
>>
>>
>> Yeah, I think we do want -all-all-all and -foo-all to work.  Not sure
>> about -all-foo-all.
>
>
> Actually only having 'all' is quite easy to implement.
>
> Let's imagine following hierarchy:
>
> (root)
> - vops
> - folding
>   - gimple
>     - ctor
>     - array_ref
>     - arithmetic
>   - generic
>     - c
>     - c++
>     - ctor
>     - xyz
>
> Then '-fdump-passname-folding-all' will be equal to
> '-fdump-passname-folding'.

Ok, so you envision that sub-options restrict stuff.  I thought of

 -gimple
   -vops
 -generic
   -folding

so the other way around.  We do not have many options that would be RTL
specific but gimple only are -vops -alias -scev -gimple -rhs-only
-verbose -memsyms
while RTL has -cselib. -eh sounds gimple specific.  Then there's the optgroup
stuff you already saw.

So it looks like a 8 bit "group id" plus 56 bits of flags would do.

Yes, this implies reworking how & and | work.  For example you can't
| dump-flags of different groups.

>>
>> The important thing is to make sure dump_flags_t stays POD and thus is
>> eligible to be passed in register(s).  In the end we might simply come up
>> with a two-level hierarchy, each 32bits (or we can even get back to 32bits
>> in total with two times 16bits).
>
>
> I'm aware of having the type as POD.
>
>>
>> It looks you didn't actually implement this as a hierarchy though but
>> still allocate from one pool of bits (so you only do a change to how
>> users access this?)
>
>
> Yep, all leaf options are mapped to a mask and all inner nodes are just
> union
> of suboptions. That will allow us to have 64 leaf suboptions. Reaching the
> limit
> we can encode the values in more sophisticated way. That however brings need
> to implement more complicated '&' and '|' operators.
>
> I'll finish the implementation and try to migrate that to current handling.
> Guess, I'm quite close.

Hmm, but then there's not much advantage in suboptions (well, apart from maybe
at the user-side).

> Martin
>
>
>>
>> Thanks,
>> Richard.
>>
>>>
>>> Thanks for feedback,
>>> Martin
>>>
>>>>
>>>> Thanks,
>>>> Richard.
>>>>
>>>>> Martin
>>>
>>>
>>>
>

Trevor Saunders May 6, 2017, 12:53 p.m. UTC | #9

On Fri, May 05, 2017 at 01:38:21PM +0200, Richard Biener wrote:
> On Fri, May 5, 2017 at 12:41 PM, Martin Liška <mliska@suse.cz> wrote:
> > Hello.
> >
> > There's first patch that just defines dump_flags_t as uint64_t and changes all
> > corresponding interfaces that do use it. There's a problematic impact that
> > all targets have to include dumpfile.h just right after coretypes.h. That makes
> > the patch harder to properly test. I tried couple of cross-compilers and it works.
> >
> > Patch can bootstrap on ppc64le-redhat-linux and survives regression tests.
> > I've been also testing i686-linux-gnu bootstrap and x86_64-linux-gnu targets.
> >
> > Thoughts?
> 
> So usually we get away with defining pervasive types in coretypes.h instead.
> 
> Now I see how that can be not what we want if dump_flags_t becomes a
> "class".  Well, at least if it has too many (inline) methods.
> 
> How many of our files end up including dumpfile.h?  There's
> 
> /* Most host source files will require the following headers.  */
> #if !defined (GENERATOR_FILE) && !defined (USED_FOR_TARGET)
> #include "machmode.h"
> #include "signop.h"
> #include "wide-int.h"
> 
> at the end of coretypes.h so it might be possible to stuff dumpfile.h there.
> 
> Or create a dump-flags.h header just containing the flag bits.  I suppose
> most files need it because they include some interfaces that have
> dump_flags_t, not because they do dumping (they'd include dumpflags.h
> already), right?

well, if a header uses dumpflags_t and a forward declaration won't do it
should include it right? but continueing the coretypes.h hacks may be
expedient.

> Anyway, I think the patch is a good thing (how do we make sure we
> don't "regress", aka people using 'int', not knowing about dump_flag_t?).

does it make sense to define it as a struct whose only member is a
uint64_t?  That way you'd have to explicitly pull out the field wherever
you want to pass it to something taking an int.  That would require
defining a bunch of operator &/| etc.  However we could also move from
there to a set of bitfields for most of the flags and usually not need
to explicitly extract the bit we want.

Trev

Martin Liška May 9, 2017, 12:01 p.m. UTC | #10

On 05/05/2017 01:50 PM, Richard Biener wrote:
> On Thu, May 4, 2017 at 1:10 PM, Martin Liška <mliska@suse.cz> wrote:
>> On 05/04/2017 12:40 PM, Richard Biener wrote:
>>>
>>> On Thu, May 4, 2017 at 11:22 AM, Martin Liška <mliska@suse.cz> wrote:
>>>>
>>>> On 05/03/2017 12:12 PM, Richard Biener wrote:
>>>>>
>>>>>
>>>>> On Wed, May 3, 2017 at 10:10 AM, Martin Liška <mliska@suse.cz> wrote:
>>>>>>
>>>>>>
>>>>>> Hello
>>>>>>
>>>>>> Last release cycle I spent quite some time with reading of IVOPTS pass
>>>>>> dump file. Using -fdump*-details causes to generate a lot of 'Applying
>>>>>> pattern'
>>>>>> lines, which can make reading of a dump file more complicated.
>>>>>>
>>>>>> There are stats for tramp3d with -O2 and -fdump-tree-all-details.
>>>>>> Percentage number
>>>>>> shows how many lines are of the aforementioned pattern:
>>>>>>
>>>>>>                         tramp3d-v4.cpp.164t.ivopts: 6.34%
>>>>>>                           tramp3d-v4.cpp.091t.ccp2: 5.04%
>>>>>>                       tramp3d-v4.cpp.093t.cunrolli: 4.41%
>>>>>>                       tramp3d-v4.cpp.129t.laddress: 3.70%
>>>>>>                           tramp3d-v4.cpp.032t.ccp1: 2.31%
>>>>>>                           tramp3d-v4.cpp.038t.evrp: 1.90%
>>>>>>                      tramp3d-v4.cpp.033t.forwprop1: 1.74%
>>>>>>                           tramp3d-v4.cpp.103t.vrp1: 1.52%
>>>>>>                      tramp3d-v4.cpp.124t.forwprop3: 1.31%
>>>>>>                           tramp3d-v4.cpp.181t.vrp2: 1.30%
>>>>>>                        tramp3d-v4.cpp.161t.cunroll: 1.22%
>>>>>>                     tramp3d-v4.cpp.027t.fixup_cfg3: 1.11%
>>>>>>                        tramp3d-v4.cpp.153t.ivcanon: 1.07%
>>>>>>                           tramp3d-v4.cpp.126t.ccp3: 0.96%
>>>>>>                           tramp3d-v4.cpp.143t.sccp: 0.91%
>>>>>>                      tramp3d-v4.cpp.185t.forwprop4: 0.82%
>>>>>>                            tramp3d-v4.cpp.011t.cfg: 0.74%
>>>>>>                      tramp3d-v4.cpp.096t.forwprop2: 0.50%
>>>>>>                     tramp3d-v4.cpp.019t.fixup_cfg1: 0.37%
>>>>>>                      tramp3d-v4.cpp.120t.phicprop1: 0.33%
>>>>>>                            tramp3d-v4.cpp.133t.pre: 0.32%
>>>>>>                      tramp3d-v4.cpp.182t.phicprop2: 0.27%
>>>>>>                     tramp3d-v4.cpp.170t.veclower21: 0.25%
>>>>>>                        tramp3d-v4.cpp.029t.einline: 0.24%
>>>>>>
>>>>>> I'm suggesting to add new TDF that will be allocated for that.
>>>>>> Patch can bootstrap on ppc64le-redhat-linux and survives regression
>>>>>> tests.
>>>>>>
>>>>>> Thoughts?
>>>>>
>>>>>
>>>>>
>>>>> Ok.  Soon we'll want to change dump_flags to uint64_t ...  (we have 1
>>>>> bit
>>>>> left
>>>>> if you allow negative dump_flags).  It'll tickle down on a lot of
>>>>> interfaces
>>>>> so introducing dump_flags_t at the same time might be a good idea.
>>>>
>>>>
>>>>
>>>> Hello.
>>>>
>>>> I've prepared patch that migrates all interfaces and introduces
>>>> dump_flags_t.
>>>
>>>
>>> Great.
>>>
>>>> I've been
>>>> currently testing that. Apart from that Richi requested to come up with
>>>> more
>>>> generic approach
>>>> of hierarchical structure of options.
>>>
>>>
>>> Didn't really "request" it, it's just something we eventually need to do
>>> when
>>> we run out of bits again ;)
>>
>>
>> I know, but it was me who came up with the idea of more fine suboptions :)
>>
>>>
>>>>
>>>> Can you please take a look at self-contained source file that shows way
>>>> I've
>>>> decided to go?
>>>> Another question is whether we want to implement also "aliases", where
>>>> for
>>>> instance
>>>> current 'all' is equal to union of couple of suboptions?
>>>
>>>
>>> Yeah, I think we do want -all-all-all and -foo-all to work.  Not sure
>>> about -all-foo-all.
>>
>>
>> Actually only having 'all' is quite easy to implement.
>>
>> Let's imagine following hierarchy:
>>
>> (root)
>> - vops
>> - folding
>>   - gimple
>>     - ctor
>>     - array_ref
>>     - arithmetic
>>   - generic
>>     - c
>>     - c++
>>     - ctor
>>     - xyz
>>
>> Then '-fdump-passname-folding-all' will be equal to
>> '-fdump-passname-folding'.
> 
> Ok, so you envision that sub-options restrict stuff.  I thought of
> 
>  -gimple
>    -vops
>  -generic
>    -folding
> 
> so the other way around.  We do not have many options that would be RTL
> specific but gimple only are -vops -alias -scev -gimple -rhs-only
> -verbose -memsyms
> while RTL has -cselib. -eh sounds gimple specific.  Then there's the optgroup
> stuff you already saw.
> 
> So it looks like a 8 bit "group id" plus 56 bits of flags would do.
> 
> Yes, this implies reworking how & and | work.  For example you can't
> | dump-flags of different groups.

Well, I'm not opposed to idea of converting that to way you described.
So, you're willing to introduce something like:

(root)
- generic
  - eh
  - folding
  - ...
- gimple
  - vops
  - folding
   - rhs-only
   - ...
  - vops
- rtl
  - cselib
  - ...

?

> 
>>>
>>> The important thing is to make sure dump_flags_t stays POD and thus is
>>> eligible to be passed in register(s).  In the end we might simply come up
>>> with a two-level hierarchy, each 32bits (or we can even get back to 32bits
>>> in total with two times 16bits).
>>
>>
>> I'm aware of having the type as POD.
>>
>>>
>>> It looks you didn't actually implement this as a hierarchy though but
>>> still allocate from one pool of bits (so you only do a change to how
>>> users access this?)
>>
>>
>> Yep, all leaf options are mapped to a mask and all inner nodes are just
>> union
>> of suboptions. That will allow us to have 64 leaf suboptions. Reaching the
>> limit
>> we can encode the values in more sophisticated way. That however brings need
>> to implement more complicated '&' and '|' operators.
>>
>> I'll finish the implementation and try to migrate that to current handling.
>> Guess, I'm quite close.
> 
> Hmm, but then there's not much advantage in suboptions (well, apart from maybe
> at the user-side).

Yep, please take a look at updated version of PATCH 2/N, where I ported -fopt-info.
As you can see I had to explicitly define all enum values and hierarchy creation
of every single node.

Martin

> 
>> Martin
>>
>>
>>>
>>> Thanks,
>>> Richard.
>>>
>>>>
>>>> Thanks for feedback,
>>>> Martin
>>>>
>>>>>
>>>>> Thanks,
>>>>> Richard.
>>>>>
>>>>>> Martin
>>>>
>>>>
>>>>
>>

Richard Biener May 9, 2017, 12:16 p.m. UTC | #11

On Tue, May 9, 2017 at 2:01 PM, Martin Liška <mliska@suse.cz> wrote:
> On 05/05/2017 01:50 PM, Richard Biener wrote:
>> On Thu, May 4, 2017 at 1:10 PM, Martin Liška <mliska@suse.cz> wrote:
>>> On 05/04/2017 12:40 PM, Richard Biener wrote:
>>>>
>>>> On Thu, May 4, 2017 at 11:22 AM, Martin Liška <mliska@suse.cz> wrote:
>>>>>
>>>>> On 05/03/2017 12:12 PM, Richard Biener wrote:
>>>>>>
>>>>>>
>>>>>> On Wed, May 3, 2017 at 10:10 AM, Martin Liška <mliska@suse.cz> wrote:
>>>>>>>
>>>>>>>
>>>>>>> Hello
>>>>>>>
>>>>>>> Last release cycle I spent quite some time with reading of IVOPTS pass
>>>>>>> dump file. Using -fdump*-details causes to generate a lot of 'Applying
>>>>>>> pattern'
>>>>>>> lines, which can make reading of a dump file more complicated.
>>>>>>>
>>>>>>> There are stats for tramp3d with -O2 and -fdump-tree-all-details.
>>>>>>> Percentage number
>>>>>>> shows how many lines are of the aforementioned pattern:
>>>>>>>
>>>>>>>                         tramp3d-v4.cpp.164t.ivopts: 6.34%
>>>>>>>                           tramp3d-v4.cpp.091t.ccp2: 5.04%
>>>>>>>                       tramp3d-v4.cpp.093t.cunrolli: 4.41%
>>>>>>>                       tramp3d-v4.cpp.129t.laddress: 3.70%
>>>>>>>                           tramp3d-v4.cpp.032t.ccp1: 2.31%
>>>>>>>                           tramp3d-v4.cpp.038t.evrp: 1.90%
>>>>>>>                      tramp3d-v4.cpp.033t.forwprop1: 1.74%
>>>>>>>                           tramp3d-v4.cpp.103t.vrp1: 1.52%
>>>>>>>                      tramp3d-v4.cpp.124t.forwprop3: 1.31%
>>>>>>>                           tramp3d-v4.cpp.181t.vrp2: 1.30%
>>>>>>>                        tramp3d-v4.cpp.161t.cunroll: 1.22%
>>>>>>>                     tramp3d-v4.cpp.027t.fixup_cfg3: 1.11%
>>>>>>>                        tramp3d-v4.cpp.153t.ivcanon: 1.07%
>>>>>>>                           tramp3d-v4.cpp.126t.ccp3: 0.96%
>>>>>>>                           tramp3d-v4.cpp.143t.sccp: 0.91%
>>>>>>>                      tramp3d-v4.cpp.185t.forwprop4: 0.82%
>>>>>>>                            tramp3d-v4.cpp.011t.cfg: 0.74%
>>>>>>>                      tramp3d-v4.cpp.096t.forwprop2: 0.50%
>>>>>>>                     tramp3d-v4.cpp.019t.fixup_cfg1: 0.37%
>>>>>>>                      tramp3d-v4.cpp.120t.phicprop1: 0.33%
>>>>>>>                            tramp3d-v4.cpp.133t.pre: 0.32%
>>>>>>>                      tramp3d-v4.cpp.182t.phicprop2: 0.27%
>>>>>>>                     tramp3d-v4.cpp.170t.veclower21: 0.25%
>>>>>>>                        tramp3d-v4.cpp.029t.einline: 0.24%
>>>>>>>
>>>>>>> I'm suggesting to add new TDF that will be allocated for that.
>>>>>>> Patch can bootstrap on ppc64le-redhat-linux and survives regression
>>>>>>> tests.
>>>>>>>
>>>>>>> Thoughts?
>>>>>>
>>>>>>
>>>>>>
>>>>>> Ok.  Soon we'll want to change dump_flags to uint64_t ...  (we have 1
>>>>>> bit
>>>>>> left
>>>>>> if you allow negative dump_flags).  It'll tickle down on a lot of
>>>>>> interfaces
>>>>>> so introducing dump_flags_t at the same time might be a good idea.
>>>>>
>>>>>
>>>>>
>>>>> Hello.
>>>>>
>>>>> I've prepared patch that migrates all interfaces and introduces
>>>>> dump_flags_t.
>>>>
>>>>
>>>> Great.
>>>>
>>>>> I've been
>>>>> currently testing that. Apart from that Richi requested to come up with
>>>>> more
>>>>> generic approach
>>>>> of hierarchical structure of options.
>>>>
>>>>
>>>> Didn't really "request" it, it's just something we eventually need to do
>>>> when
>>>> we run out of bits again ;)
>>>
>>>
>>> I know, but it was me who came up with the idea of more fine suboptions :)
>>>
>>>>
>>>>>
>>>>> Can you please take a look at self-contained source file that shows way
>>>>> I've
>>>>> decided to go?
>>>>> Another question is whether we want to implement also "aliases", where
>>>>> for
>>>>> instance
>>>>> current 'all' is equal to union of couple of suboptions?
>>>>
>>>>
>>>> Yeah, I think we do want -all-all-all and -foo-all to work.  Not sure
>>>> about -all-foo-all.
>>>
>>>
>>> Actually only having 'all' is quite easy to implement.
>>>
>>> Let's imagine following hierarchy:
>>>
>>> (root)
>>> - vops
>>> - folding
>>>   - gimple
>>>     - ctor
>>>     - array_ref
>>>     - arithmetic
>>>   - generic
>>>     - c
>>>     - c++
>>>     - ctor
>>>     - xyz
>>>
>>> Then '-fdump-passname-folding-all' will be equal to
>>> '-fdump-passname-folding'.
>>
>> Ok, so you envision that sub-options restrict stuff.  I thought of
>>
>>  -gimple
>>    -vops
>>  -generic
>>    -folding
>>
>> so the other way around.  We do not have many options that would be RTL
>> specific but gimple only are -vops -alias -scev -gimple -rhs-only
>> -verbose -memsyms
>> while RTL has -cselib. -eh sounds gimple specific.  Then there's the optgroup
>> stuff you already saw.
>>
>> So it looks like a 8 bit "group id" plus 56 bits of flags would do.
>>
>> Yes, this implies reworking how & and | work.  For example you can't
>> | dump-flags of different groups.
>
> Well, I'm not opposed to idea of converting that to way you described.
> So, you're willing to introduce something like:
>
> (root)
> - generic
>   - eh
>   - folding
>   - ...
> - gimple
>   - vops
>   - folding
>    - rhs-only
>    - ...
>   - vops
> - rtl
>   - cselib
>   - ...
>
> ?

Yeah.  As said the motivation was to escape the 32 (now 64) bits limitation,
not to make the user interface into a hierarchy.

I suppose we can easily defer now given we have 32 bits available now ;)

Richard.

>>
>>>>
>>>> The important thing is to make sure dump_flags_t stays POD and thus is
>>>> eligible to be passed in register(s).  In the end we might simply come up
>>>> with a two-level hierarchy, each 32bits (or we can even get back to 32bits
>>>> in total with two times 16bits).
>>>
>>>
>>> I'm aware of having the type as POD.
>>>
>>>>
>>>> It looks you didn't actually implement this as a hierarchy though but
>>>> still allocate from one pool of bits (so you only do a change to how
>>>> users access this?)
>>>
>>>
>>> Yep, all leaf options are mapped to a mask and all inner nodes are just
>>> union
>>> of suboptions. That will allow us to have 64 leaf suboptions. Reaching the
>>> limit
>>> we can encode the values in more sophisticated way. That however brings need
>>> to implement more complicated '&' and '|' operators.
>>>
>>> I'll finish the implementation and try to migrate that to current handling.
>>> Guess, I'm quite close.
>>
>> Hmm, but then there's not much advantage in suboptions (well, apart from maybe
>> at the user-side).
>
> Yep, please take a look at updated version of PATCH 2/N, where I ported -fopt-info.
> As you can see I had to explicitly define all enum values and hierarchy creation
> of every single node.
>
> Martin
>
>>
>>> Martin
>>>
>>>
>>>>
>>>> Thanks,
>>>> Richard.
>>>>
>>>>>
>>>>> Thanks for feedback,
>>>>> Martin
>>>>>
>>>>>>
>>>>>> Thanks,
>>>>>> Richard.
>>>>>>
>>>>>>> Martin
>>>>>
>>>>>
>>>>>
>>>
>

Martin Liška May 9, 2017, 12:46 p.m. UTC | #12

On 05/09/2017 02:16 PM, Richard Biener wrote:
> On Tue, May 9, 2017 at 2:01 PM, Martin Liška <mliska@suse.cz> wrote:
>> On 05/05/2017 01:50 PM, Richard Biener wrote:
>>> On Thu, May 4, 2017 at 1:10 PM, Martin Liška <mliska@suse.cz> wrote:
>>>> On 05/04/2017 12:40 PM, Richard Biener wrote:
>>>>>
>>>>> On Thu, May 4, 2017 at 11:22 AM, Martin Liška <mliska@suse.cz> wrote:
>>>>>>
>>>>>> On 05/03/2017 12:12 PM, Richard Biener wrote:
>>>>>>>
>>>>>>>
>>>>>>> On Wed, May 3, 2017 at 10:10 AM, Martin Liška <mliska@suse.cz> wrote:
>>>>>>>>
>>>>>>>>
>>>>>>>> Hello
>>>>>>>>
>>>>>>>> Last release cycle I spent quite some time with reading of IVOPTS pass
>>>>>>>> dump file. Using -fdump*-details causes to generate a lot of 'Applying
>>>>>>>> pattern'
>>>>>>>> lines, which can make reading of a dump file more complicated.
>>>>>>>>
>>>>>>>> There are stats for tramp3d with -O2 and -fdump-tree-all-details.
>>>>>>>> Percentage number
>>>>>>>> shows how many lines are of the aforementioned pattern:
>>>>>>>>
>>>>>>>>                         tramp3d-v4.cpp.164t.ivopts: 6.34%
>>>>>>>>                           tramp3d-v4.cpp.091t.ccp2: 5.04%
>>>>>>>>                       tramp3d-v4.cpp.093t.cunrolli: 4.41%
>>>>>>>>                       tramp3d-v4.cpp.129t.laddress: 3.70%
>>>>>>>>                           tramp3d-v4.cpp.032t.ccp1: 2.31%
>>>>>>>>                           tramp3d-v4.cpp.038t.evrp: 1.90%
>>>>>>>>                      tramp3d-v4.cpp.033t.forwprop1: 1.74%
>>>>>>>>                           tramp3d-v4.cpp.103t.vrp1: 1.52%
>>>>>>>>                      tramp3d-v4.cpp.124t.forwprop3: 1.31%
>>>>>>>>                           tramp3d-v4.cpp.181t.vrp2: 1.30%
>>>>>>>>                        tramp3d-v4.cpp.161t.cunroll: 1.22%
>>>>>>>>                     tramp3d-v4.cpp.027t.fixup_cfg3: 1.11%
>>>>>>>>                        tramp3d-v4.cpp.153t.ivcanon: 1.07%
>>>>>>>>                           tramp3d-v4.cpp.126t.ccp3: 0.96%
>>>>>>>>                           tramp3d-v4.cpp.143t.sccp: 0.91%
>>>>>>>>                      tramp3d-v4.cpp.185t.forwprop4: 0.82%
>>>>>>>>                            tramp3d-v4.cpp.011t.cfg: 0.74%
>>>>>>>>                      tramp3d-v4.cpp.096t.forwprop2: 0.50%
>>>>>>>>                     tramp3d-v4.cpp.019t.fixup_cfg1: 0.37%
>>>>>>>>                      tramp3d-v4.cpp.120t.phicprop1: 0.33%
>>>>>>>>                            tramp3d-v4.cpp.133t.pre: 0.32%
>>>>>>>>                      tramp3d-v4.cpp.182t.phicprop2: 0.27%
>>>>>>>>                     tramp3d-v4.cpp.170t.veclower21: 0.25%
>>>>>>>>                        tramp3d-v4.cpp.029t.einline: 0.24%
>>>>>>>>
>>>>>>>> I'm suggesting to add new TDF that will be allocated for that.
>>>>>>>> Patch can bootstrap on ppc64le-redhat-linux and survives regression
>>>>>>>> tests.
>>>>>>>>
>>>>>>>> Thoughts?
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> Ok.  Soon we'll want to change dump_flags to uint64_t ...  (we have 1
>>>>>>> bit
>>>>>>> left
>>>>>>> if you allow negative dump_flags).  It'll tickle down on a lot of
>>>>>>> interfaces
>>>>>>> so introducing dump_flags_t at the same time might be a good idea.
>>>>>>
>>>>>>
>>>>>>
>>>>>> Hello.
>>>>>>
>>>>>> I've prepared patch that migrates all interfaces and introduces
>>>>>> dump_flags_t.
>>>>>
>>>>>
>>>>> Great.
>>>>>
>>>>>> I've been
>>>>>> currently testing that. Apart from that Richi requested to come up with
>>>>>> more
>>>>>> generic approach
>>>>>> of hierarchical structure of options.
>>>>>
>>>>>
>>>>> Didn't really "request" it, it's just something we eventually need to do
>>>>> when
>>>>> we run out of bits again ;)
>>>>
>>>>
>>>> I know, but it was me who came up with the idea of more fine suboptions :)
>>>>
>>>>>
>>>>>>
>>>>>> Can you please take a look at self-contained source file that shows way
>>>>>> I've
>>>>>> decided to go?
>>>>>> Another question is whether we want to implement also "aliases", where
>>>>>> for
>>>>>> instance
>>>>>> current 'all' is equal to union of couple of suboptions?
>>>>>
>>>>>
>>>>> Yeah, I think we do want -all-all-all and -foo-all to work.  Not sure
>>>>> about -all-foo-all.
>>>>
>>>>
>>>> Actually only having 'all' is quite easy to implement.
>>>>
>>>> Let's imagine following hierarchy:
>>>>
>>>> (root)
>>>> - vops
>>>> - folding
>>>>   - gimple
>>>>     - ctor
>>>>     - array_ref
>>>>     - arithmetic
>>>>   - generic
>>>>     - c
>>>>     - c++
>>>>     - ctor
>>>>     - xyz
>>>>
>>>> Then '-fdump-passname-folding-all' will be equal to
>>>> '-fdump-passname-folding'.
>>>
>>> Ok, so you envision that sub-options restrict stuff.  I thought of
>>>
>>>  -gimple
>>>    -vops
>>>  -generic
>>>    -folding
>>>
>>> so the other way around.  We do not have many options that would be RTL
>>> specific but gimple only are -vops -alias -scev -gimple -rhs-only
>>> -verbose -memsyms
>>> while RTL has -cselib. -eh sounds gimple specific.  Then there's the optgroup
>>> stuff you already saw.
>>>
>>> So it looks like a 8 bit "group id" plus 56 bits of flags would do.
>>>
>>> Yes, this implies reworking how & and | work.  For example you can't
>>> | dump-flags of different groups.
>>
>> Well, I'm not opposed to idea of converting that to way you described.
>> So, you're willing to introduce something like:
>>
>> (root)
>> - generic
>>   - eh
>>   - folding
>>   - ...
>> - gimple
>>   - vops
>>   - folding
>>    - rhs-only
>>    - ...
>>   - vops
>> - rtl
>>   - cselib
>>   - ...
>>
>> ?
> 
> Yeah.  As said the motivation was to escape the 32 (now 64) bits limitation,
> not to make the user interface into a hierarchy.
> 
> I suppose we can easily defer now given we have 32 bits available now ;)

I see. Hopefully we can live quite some time with another 32 bits and I'm going
to transform the TDF_* stuff to enum.

Martin

> 
> Richard.
> 
>>>
>>>>>
>>>>> The important thing is to make sure dump_flags_t stays POD and thus is
>>>>> eligible to be passed in register(s).  In the end we might simply come up
>>>>> with a two-level hierarchy, each 32bits (or we can even get back to 32bits
>>>>> in total with two times 16bits).
>>>>
>>>>
>>>> I'm aware of having the type as POD.
>>>>
>>>>>
>>>>> It looks you didn't actually implement this as a hierarchy though but
>>>>> still allocate from one pool of bits (so you only do a change to how
>>>>> users access this?)
>>>>
>>>>
>>>> Yep, all leaf options are mapped to a mask and all inner nodes are just
>>>> union
>>>> of suboptions. That will allow us to have 64 leaf suboptions. Reaching the
>>>> limit
>>>> we can encode the values in more sophisticated way. That however brings need
>>>> to implement more complicated '&' and '|' operators.
>>>>
>>>> I'll finish the implementation and try to migrate that to current handling.
>>>> Guess, I'm quite close.
>>>
>>> Hmm, but then there's not much advantage in suboptions (well, apart from maybe
>>> at the user-side).
>>
>> Yep, please take a look at updated version of PATCH 2/N, where I ported -fopt-info.
>> As you can see I had to explicitly define all enum values and hierarchy creation
>> of every single node.
>>
>> Martin
>>
>>>
>>>> Martin
>>>>
>>>>
>>>>>
>>>>> Thanks,
>>>>> Richard.
>>>>>
>>>>>>
>>>>>> Thanks for feedback,
>>>>>> Martin
>>>>>>
>>>>>>>
>>>>>>> Thanks,
>>>>>>> Richard.
>>>>>>>
>>>>>>>> Martin
>>>>>>
>>>>>>
>>>>>>
>>>>
>>

Richard Biener May 9, 2017, 12:52 p.m. UTC | #13

On Tue, May 9, 2017 at 2:46 PM, Martin Liška <mliska@suse.cz> wrote:
> On 05/09/2017 02:16 PM, Richard Biener wrote:
>> On Tue, May 9, 2017 at 2:01 PM, Martin Liška <mliska@suse.cz> wrote:
>>> On 05/05/2017 01:50 PM, Richard Biener wrote:
>>>> On Thu, May 4, 2017 at 1:10 PM, Martin Liška <mliska@suse.cz> wrote:
>>>>> On 05/04/2017 12:40 PM, Richard Biener wrote:
>>>>>>
>>>>>> On Thu, May 4, 2017 at 11:22 AM, Martin Liška <mliska@suse.cz> wrote:
>>>>>>>
>>>>>>> On 05/03/2017 12:12 PM, Richard Biener wrote:
>>>>>>>>
>>>>>>>>
>>>>>>>> On Wed, May 3, 2017 at 10:10 AM, Martin Liška <mliska@suse.cz> wrote:
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Hello
>>>>>>>>>
>>>>>>>>> Last release cycle I spent quite some time with reading of IVOPTS pass
>>>>>>>>> dump file. Using -fdump*-details causes to generate a lot of 'Applying
>>>>>>>>> pattern'
>>>>>>>>> lines, which can make reading of a dump file more complicated.
>>>>>>>>>
>>>>>>>>> There are stats for tramp3d with -O2 and -fdump-tree-all-details.
>>>>>>>>> Percentage number
>>>>>>>>> shows how many lines are of the aforementioned pattern:
>>>>>>>>>
>>>>>>>>>                         tramp3d-v4.cpp.164t.ivopts: 6.34%
>>>>>>>>>                           tramp3d-v4.cpp.091t.ccp2: 5.04%
>>>>>>>>>                       tramp3d-v4.cpp.093t.cunrolli: 4.41%
>>>>>>>>>                       tramp3d-v4.cpp.129t.laddress: 3.70%
>>>>>>>>>                           tramp3d-v4.cpp.032t.ccp1: 2.31%
>>>>>>>>>                           tramp3d-v4.cpp.038t.evrp: 1.90%
>>>>>>>>>                      tramp3d-v4.cpp.033t.forwprop1: 1.74%
>>>>>>>>>                           tramp3d-v4.cpp.103t.vrp1: 1.52%
>>>>>>>>>                      tramp3d-v4.cpp.124t.forwprop3: 1.31%
>>>>>>>>>                           tramp3d-v4.cpp.181t.vrp2: 1.30%
>>>>>>>>>                        tramp3d-v4.cpp.161t.cunroll: 1.22%
>>>>>>>>>                     tramp3d-v4.cpp.027t.fixup_cfg3: 1.11%
>>>>>>>>>                        tramp3d-v4.cpp.153t.ivcanon: 1.07%
>>>>>>>>>                           tramp3d-v4.cpp.126t.ccp3: 0.96%
>>>>>>>>>                           tramp3d-v4.cpp.143t.sccp: 0.91%
>>>>>>>>>                      tramp3d-v4.cpp.185t.forwprop4: 0.82%
>>>>>>>>>                            tramp3d-v4.cpp.011t.cfg: 0.74%
>>>>>>>>>                      tramp3d-v4.cpp.096t.forwprop2: 0.50%
>>>>>>>>>                     tramp3d-v4.cpp.019t.fixup_cfg1: 0.37%
>>>>>>>>>                      tramp3d-v4.cpp.120t.phicprop1: 0.33%
>>>>>>>>>                            tramp3d-v4.cpp.133t.pre: 0.32%
>>>>>>>>>                      tramp3d-v4.cpp.182t.phicprop2: 0.27%
>>>>>>>>>                     tramp3d-v4.cpp.170t.veclower21: 0.25%
>>>>>>>>>                        tramp3d-v4.cpp.029t.einline: 0.24%
>>>>>>>>>
>>>>>>>>> I'm suggesting to add new TDF that will be allocated for that.
>>>>>>>>> Patch can bootstrap on ppc64le-redhat-linux and survives regression
>>>>>>>>> tests.
>>>>>>>>>
>>>>>>>>> Thoughts?
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> Ok.  Soon we'll want to change dump_flags to uint64_t ...  (we have 1
>>>>>>>> bit
>>>>>>>> left
>>>>>>>> if you allow negative dump_flags).  It'll tickle down on a lot of
>>>>>>>> interfaces
>>>>>>>> so introducing dump_flags_t at the same time might be a good idea.
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> Hello.
>>>>>>>
>>>>>>> I've prepared patch that migrates all interfaces and introduces
>>>>>>> dump_flags_t.
>>>>>>
>>>>>>
>>>>>> Great.
>>>>>>
>>>>>>> I've been
>>>>>>> currently testing that. Apart from that Richi requested to come up with
>>>>>>> more
>>>>>>> generic approach
>>>>>>> of hierarchical structure of options.
>>>>>>
>>>>>>
>>>>>> Didn't really "request" it, it's just something we eventually need to do
>>>>>> when
>>>>>> we run out of bits again ;)
>>>>>
>>>>>
>>>>> I know, but it was me who came up with the idea of more fine suboptions :)
>>>>>
>>>>>>
>>>>>>>
>>>>>>> Can you please take a look at self-contained source file that shows way
>>>>>>> I've
>>>>>>> decided to go?
>>>>>>> Another question is whether we want to implement also "aliases", where
>>>>>>> for
>>>>>>> instance
>>>>>>> current 'all' is equal to union of couple of suboptions?
>>>>>>
>>>>>>
>>>>>> Yeah, I think we do want -all-all-all and -foo-all to work.  Not sure
>>>>>> about -all-foo-all.
>>>>>
>>>>>
>>>>> Actually only having 'all' is quite easy to implement.
>>>>>
>>>>> Let's imagine following hierarchy:
>>>>>
>>>>> (root)
>>>>> - vops
>>>>> - folding
>>>>>   - gimple
>>>>>     - ctor
>>>>>     - array_ref
>>>>>     - arithmetic
>>>>>   - generic
>>>>>     - c
>>>>>     - c++
>>>>>     - ctor
>>>>>     - xyz
>>>>>
>>>>> Then '-fdump-passname-folding-all' will be equal to
>>>>> '-fdump-passname-folding'.
>>>>
>>>> Ok, so you envision that sub-options restrict stuff.  I thought of
>>>>
>>>>  -gimple
>>>>    -vops
>>>>  -generic
>>>>    -folding
>>>>
>>>> so the other way around.  We do not have many options that would be RTL
>>>> specific but gimple only are -vops -alias -scev -gimple -rhs-only
>>>> -verbose -memsyms
>>>> while RTL has -cselib. -eh sounds gimple specific.  Then there's the optgroup
>>>> stuff you already saw.
>>>>
>>>> So it looks like a 8 bit "group id" plus 56 bits of flags would do.
>>>>
>>>> Yes, this implies reworking how & and | work.  For example you can't
>>>> | dump-flags of different groups.
>>>
>>> Well, I'm not opposed to idea of converting that to way you described.
>>> So, you're willing to introduce something like:
>>>
>>> (root)
>>> - generic
>>>   - eh
>>>   - folding
>>>   - ...
>>> - gimple
>>>   - vops
>>>   - folding
>>>    - rhs-only
>>>    - ...
>>>   - vops
>>> - rtl
>>>   - cselib
>>>   - ...
>>>
>>> ?
>>
>> Yeah.  As said the motivation was to escape the 32 (now 64) bits limitation,
>> not to make the user interface into a hierarchy.
>>
>> I suppose we can easily defer now given we have 32 bits available now ;)
>
> I see. Hopefully we can live quite some time with another 32 bits and I'm going
> to transform the TDF_* stuff to enum.

Maybe another .def file with embedded docs ;)

Ok, no, I didn't suggest this.

/me runs...

> Martin
>
>>
>> Richard.
>>
>>>>
>>>>>>
>>>>>> The important thing is to make sure dump_flags_t stays POD and thus is
>>>>>> eligible to be passed in register(s).  In the end we might simply come up
>>>>>> with a two-level hierarchy, each 32bits (or we can even get back to 32bits
>>>>>> in total with two times 16bits).
>>>>>
>>>>>
>>>>> I'm aware of having the type as POD.
>>>>>
>>>>>>
>>>>>> It looks you didn't actually implement this as a hierarchy though but
>>>>>> still allocate from one pool of bits (so you only do a change to how
>>>>>> users access this?)
>>>>>
>>>>>
>>>>> Yep, all leaf options are mapped to a mask and all inner nodes are just
>>>>> union
>>>>> of suboptions. That will allow us to have 64 leaf suboptions. Reaching the
>>>>> limit
>>>>> we can encode the values in more sophisticated way. That however brings need
>>>>> to implement more complicated '&' and '|' operators.
>>>>>
>>>>> I'll finish the implementation and try to migrate that to current handling.
>>>>> Guess, I'm quite close.
>>>>
>>>> Hmm, but then there's not much advantage in suboptions (well, apart from maybe
>>>> at the user-side).
>>>
>>> Yep, please take a look at updated version of PATCH 2/N, where I ported -fopt-info.
>>> As you can see I had to explicitly define all enum values and hierarchy creation
>>> of every single node.
>>>
>>> Martin
>>>
>>>>
>>>>> Martin
>>>>>
>>>>>
>>>>>>
>>>>>> Thanks,
>>>>>> Richard.
>>>>>>
>>>>>>>
>>>>>>> Thanks for feedback,
>>>>>>> Martin
>>>>>>>
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>> Richard.
>>>>>>>>
>>>>>>>>> Martin
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>
>>>
>

Martin Liška May 12, 2017, 12:57 p.m. UTC | #14

Hello.

After fiddling with that, I decided to come up with reworked first part
of the patch and eventual translation to a more hierarchical structure
is subject for discussion.

This first patch adds default for couple of dump functions and it helps in future
to transform 0 to a TDF_NONE (or whatever we call it).

Patch can bootstrap on ppc64le-redhat-linux and survives regression tests.

Ready to be installed?
Martin

Richard Biener May 16, 2017, 1:44 p.m. UTC | #15

On Fri, May 12, 2017 at 2:57 PM, Martin Liška <mliska@suse.cz> wrote:
> Hello.
>
> After fiddling with that, I decided to come up with reworked first part
> of the patch and eventual translation to a more hierarchical structure
> is subject for discussion.
>
> This first patch adds default for couple of dump functions and it helps in future
> to transform 0 to a TDF_NONE (or whatever we call it).
>
> Patch can bootstrap on ppc64le-redhat-linux and survives regression tests.
>
> Ready to be installed?

Works for me.

Thanks,
Richard.

> Martin

[RFC] Introduce -fdump*-folding

Commit Message

Comments

Patch