Message ID | 668c62b5-9603-47c5-9bfe-266d0d369a1f@suse.cz |
---|---|
State | New |
Headers | show |
On Wed, May 3, 2017 at 1:10 AM, Martin Liška <mliska@suse.cz> wrote: > Hello > > Last release cycle I spent quite some time with reading of IVOPTS pass > dump file. Using -fdump*-details causes to generate a lot of 'Applying pattern' > lines, which can make reading of a dump file more complicated. Yes the folding message can get annoying. Especially when it is not clear what is being folded which is the case a lot of the time I have seen the message. Thanks, Andrew > > There are stats for tramp3d with -O2 and -fdump-tree-all-details. Percentage number > shows how many lines are of the aforementioned pattern: > > tramp3d-v4.cpp.164t.ivopts: 6.34% > tramp3d-v4.cpp.091t.ccp2: 5.04% > tramp3d-v4.cpp.093t.cunrolli: 4.41% > tramp3d-v4.cpp.129t.laddress: 3.70% > tramp3d-v4.cpp.032t.ccp1: 2.31% > tramp3d-v4.cpp.038t.evrp: 1.90% > tramp3d-v4.cpp.033t.forwprop1: 1.74% > tramp3d-v4.cpp.103t.vrp1: 1.52% > tramp3d-v4.cpp.124t.forwprop3: 1.31% > tramp3d-v4.cpp.181t.vrp2: 1.30% > tramp3d-v4.cpp.161t.cunroll: 1.22% > tramp3d-v4.cpp.027t.fixup_cfg3: 1.11% > tramp3d-v4.cpp.153t.ivcanon: 1.07% > tramp3d-v4.cpp.126t.ccp3: 0.96% > tramp3d-v4.cpp.143t.sccp: 0.91% > tramp3d-v4.cpp.185t.forwprop4: 0.82% > tramp3d-v4.cpp.011t.cfg: 0.74% > tramp3d-v4.cpp.096t.forwprop2: 0.50% > tramp3d-v4.cpp.019t.fixup_cfg1: 0.37% > tramp3d-v4.cpp.120t.phicprop1: 0.33% > tramp3d-v4.cpp.133t.pre: 0.32% > tramp3d-v4.cpp.182t.phicprop2: 0.27% > tramp3d-v4.cpp.170t.veclower21: 0.25% > tramp3d-v4.cpp.029t.einline: 0.24% > > I'm suggesting to add new TDF that will be allocated for that. > Patch can bootstrap on ppc64le-redhat-linux and survives regression tests. > > Thoughts? > Martin
On Wed, May 3, 2017 at 10:10 AM, Martin Liška <mliska@suse.cz> wrote: > Hello > > Last release cycle I spent quite some time with reading of IVOPTS pass > dump file. Using -fdump*-details causes to generate a lot of 'Applying pattern' > lines, which can make reading of a dump file more complicated. > > There are stats for tramp3d with -O2 and -fdump-tree-all-details. Percentage number > shows how many lines are of the aforementioned pattern: > > tramp3d-v4.cpp.164t.ivopts: 6.34% > tramp3d-v4.cpp.091t.ccp2: 5.04% > tramp3d-v4.cpp.093t.cunrolli: 4.41% > tramp3d-v4.cpp.129t.laddress: 3.70% > tramp3d-v4.cpp.032t.ccp1: 2.31% > tramp3d-v4.cpp.038t.evrp: 1.90% > tramp3d-v4.cpp.033t.forwprop1: 1.74% > tramp3d-v4.cpp.103t.vrp1: 1.52% > tramp3d-v4.cpp.124t.forwprop3: 1.31% > tramp3d-v4.cpp.181t.vrp2: 1.30% > tramp3d-v4.cpp.161t.cunroll: 1.22% > tramp3d-v4.cpp.027t.fixup_cfg3: 1.11% > tramp3d-v4.cpp.153t.ivcanon: 1.07% > tramp3d-v4.cpp.126t.ccp3: 0.96% > tramp3d-v4.cpp.143t.sccp: 0.91% > tramp3d-v4.cpp.185t.forwprop4: 0.82% > tramp3d-v4.cpp.011t.cfg: 0.74% > tramp3d-v4.cpp.096t.forwprop2: 0.50% > tramp3d-v4.cpp.019t.fixup_cfg1: 0.37% > tramp3d-v4.cpp.120t.phicprop1: 0.33% > tramp3d-v4.cpp.133t.pre: 0.32% > tramp3d-v4.cpp.182t.phicprop2: 0.27% > tramp3d-v4.cpp.170t.veclower21: 0.25% > tramp3d-v4.cpp.029t.einline: 0.24% > > I'm suggesting to add new TDF that will be allocated for that. > Patch can bootstrap on ppc64le-redhat-linux and survives regression tests. > > Thoughts? Ok. Soon we'll want to change dump_flags to uint64_t ... (we have 1 bit left if you allow negative dump_flags). It'll tickle down on a lot of interfaces so introducing dump_flags_t at the same time might be a good idea. Thanks, Richard. > Martin
On 05/03/2017 12:12 PM, Richard Biener wrote: > On Wed, May 3, 2017 at 10:10 AM, Martin Liška <mliska@suse.cz> wrote: >> Hello >> >> Last release cycle I spent quite some time with reading of IVOPTS pass >> dump file. Using -fdump*-details causes to generate a lot of 'Applying pattern' >> lines, which can make reading of a dump file more complicated. >> >> There are stats for tramp3d with -O2 and -fdump-tree-all-details. Percentage number >> shows how many lines are of the aforementioned pattern: >> >> tramp3d-v4.cpp.164t.ivopts: 6.34% >> tramp3d-v4.cpp.091t.ccp2: 5.04% >> tramp3d-v4.cpp.093t.cunrolli: 4.41% >> tramp3d-v4.cpp.129t.laddress: 3.70% >> tramp3d-v4.cpp.032t.ccp1: 2.31% >> tramp3d-v4.cpp.038t.evrp: 1.90% >> tramp3d-v4.cpp.033t.forwprop1: 1.74% >> tramp3d-v4.cpp.103t.vrp1: 1.52% >> tramp3d-v4.cpp.124t.forwprop3: 1.31% >> tramp3d-v4.cpp.181t.vrp2: 1.30% >> tramp3d-v4.cpp.161t.cunroll: 1.22% >> tramp3d-v4.cpp.027t.fixup_cfg3: 1.11% >> tramp3d-v4.cpp.153t.ivcanon: 1.07% >> tramp3d-v4.cpp.126t.ccp3: 0.96% >> tramp3d-v4.cpp.143t.sccp: 0.91% >> tramp3d-v4.cpp.185t.forwprop4: 0.82% >> tramp3d-v4.cpp.011t.cfg: 0.74% >> tramp3d-v4.cpp.096t.forwprop2: 0.50% >> tramp3d-v4.cpp.019t.fixup_cfg1: 0.37% >> tramp3d-v4.cpp.120t.phicprop1: 0.33% >> tramp3d-v4.cpp.133t.pre: 0.32% >> tramp3d-v4.cpp.182t.phicprop2: 0.27% >> tramp3d-v4.cpp.170t.veclower21: 0.25% >> tramp3d-v4.cpp.029t.einline: 0.24% >> >> I'm suggesting to add new TDF that will be allocated for that. >> Patch can bootstrap on ppc64le-redhat-linux and survives regression tests. >> >> Thoughts? > > Ok. Soon we'll want to change dump_flags to uint64_t ... (we have 1 bit left > if you allow negative dump_flags). It'll tickle down on a lot of interfaces > so introducing dump_flags_t at the same time might be a good idea. Hello. I've prepared patch that migrates all interfaces and introduces dump_flags_t. I've been currently testing that. Apart from that Richi requested to come up with more generic approach of hierarchical structure of options. Can you please take a look at self-contained source file that shows way I've decided to go? Another question is whether we want to implement also "aliases", where for instance current 'all' is equal to union of couple of suboptions? Thanks for feedback, Martin > > Thanks, > Richard. > >> Martin
On Thu, May 4, 2017 at 11:22 AM, Martin Liška <mliska@suse.cz> wrote: > On 05/03/2017 12:12 PM, Richard Biener wrote: >> >> On Wed, May 3, 2017 at 10:10 AM, Martin Liška <mliska@suse.cz> wrote: >>> >>> Hello >>> >>> Last release cycle I spent quite some time with reading of IVOPTS pass >>> dump file. Using -fdump*-details causes to generate a lot of 'Applying >>> pattern' >>> lines, which can make reading of a dump file more complicated. >>> >>> There are stats for tramp3d with -O2 and -fdump-tree-all-details. >>> Percentage number >>> shows how many lines are of the aforementioned pattern: >>> >>> tramp3d-v4.cpp.164t.ivopts: 6.34% >>> tramp3d-v4.cpp.091t.ccp2: 5.04% >>> tramp3d-v4.cpp.093t.cunrolli: 4.41% >>> tramp3d-v4.cpp.129t.laddress: 3.70% >>> tramp3d-v4.cpp.032t.ccp1: 2.31% >>> tramp3d-v4.cpp.038t.evrp: 1.90% >>> tramp3d-v4.cpp.033t.forwprop1: 1.74% >>> tramp3d-v4.cpp.103t.vrp1: 1.52% >>> tramp3d-v4.cpp.124t.forwprop3: 1.31% >>> tramp3d-v4.cpp.181t.vrp2: 1.30% >>> tramp3d-v4.cpp.161t.cunroll: 1.22% >>> tramp3d-v4.cpp.027t.fixup_cfg3: 1.11% >>> tramp3d-v4.cpp.153t.ivcanon: 1.07% >>> tramp3d-v4.cpp.126t.ccp3: 0.96% >>> tramp3d-v4.cpp.143t.sccp: 0.91% >>> tramp3d-v4.cpp.185t.forwprop4: 0.82% >>> tramp3d-v4.cpp.011t.cfg: 0.74% >>> tramp3d-v4.cpp.096t.forwprop2: 0.50% >>> tramp3d-v4.cpp.019t.fixup_cfg1: 0.37% >>> tramp3d-v4.cpp.120t.phicprop1: 0.33% >>> tramp3d-v4.cpp.133t.pre: 0.32% >>> tramp3d-v4.cpp.182t.phicprop2: 0.27% >>> tramp3d-v4.cpp.170t.veclower21: 0.25% >>> tramp3d-v4.cpp.029t.einline: 0.24% >>> >>> I'm suggesting to add new TDF that will be allocated for that. >>> Patch can bootstrap on ppc64le-redhat-linux and survives regression >>> tests. >>> >>> Thoughts? >> >> >> Ok. Soon we'll want to change dump_flags to uint64_t ... (we have 1 bit >> left >> if you allow negative dump_flags). It'll tickle down on a lot of >> interfaces >> so introducing dump_flags_t at the same time might be a good idea. > > > Hello. > > I've prepared patch that migrates all interfaces and introduces > dump_flags_t. Great. > I've been > currently testing that. Apart from that Richi requested to come up with more > generic approach > of hierarchical structure of options. Didn't really "request" it, it's just something we eventually need to do when we run out of bits again ;) > > Can you please take a look at self-contained source file that shows way I've > decided to go? > Another question is whether we want to implement also "aliases", where for > instance > current 'all' is equal to union of couple of suboptions? Yeah, I think we do want -all-all-all and -foo-all to work. Not sure about -all-foo-all. The important thing is to make sure dump_flags_t stays POD and thus is eligible to be passed in register(s). In the end we might simply come up with a two-level hierarchy, each 32bits (or we can even get back to 32bits in total with two times 16bits). It looks you didn't actually implement this as a hierarchy though but still allocate from one pool of bits (so you only do a change to how users access this?) Thanks, Richard. > > Thanks for feedback, > Martin > >> >> Thanks, >> Richard. >> >>> Martin > >
On 05/04/2017 12:40 PM, Richard Biener wrote: > On Thu, May 4, 2017 at 11:22 AM, Martin Liška <mliska@suse.cz> wrote: >> On 05/03/2017 12:12 PM, Richard Biener wrote: >>> >>> On Wed, May 3, 2017 at 10:10 AM, Martin Liška <mliska@suse.cz> wrote: >>>> >>>> Hello >>>> >>>> Last release cycle I spent quite some time with reading of IVOPTS pass >>>> dump file. Using -fdump*-details causes to generate a lot of 'Applying >>>> pattern' >>>> lines, which can make reading of a dump file more complicated. >>>> >>>> There are stats for tramp3d with -O2 and -fdump-tree-all-details. >>>> Percentage number >>>> shows how many lines are of the aforementioned pattern: >>>> >>>> tramp3d-v4.cpp.164t.ivopts: 6.34% >>>> tramp3d-v4.cpp.091t.ccp2: 5.04% >>>> tramp3d-v4.cpp.093t.cunrolli: 4.41% >>>> tramp3d-v4.cpp.129t.laddress: 3.70% >>>> tramp3d-v4.cpp.032t.ccp1: 2.31% >>>> tramp3d-v4.cpp.038t.evrp: 1.90% >>>> tramp3d-v4.cpp.033t.forwprop1: 1.74% >>>> tramp3d-v4.cpp.103t.vrp1: 1.52% >>>> tramp3d-v4.cpp.124t.forwprop3: 1.31% >>>> tramp3d-v4.cpp.181t.vrp2: 1.30% >>>> tramp3d-v4.cpp.161t.cunroll: 1.22% >>>> tramp3d-v4.cpp.027t.fixup_cfg3: 1.11% >>>> tramp3d-v4.cpp.153t.ivcanon: 1.07% >>>> tramp3d-v4.cpp.126t.ccp3: 0.96% >>>> tramp3d-v4.cpp.143t.sccp: 0.91% >>>> tramp3d-v4.cpp.185t.forwprop4: 0.82% >>>> tramp3d-v4.cpp.011t.cfg: 0.74% >>>> tramp3d-v4.cpp.096t.forwprop2: 0.50% >>>> tramp3d-v4.cpp.019t.fixup_cfg1: 0.37% >>>> tramp3d-v4.cpp.120t.phicprop1: 0.33% >>>> tramp3d-v4.cpp.133t.pre: 0.32% >>>> tramp3d-v4.cpp.182t.phicprop2: 0.27% >>>> tramp3d-v4.cpp.170t.veclower21: 0.25% >>>> tramp3d-v4.cpp.029t.einline: 0.24% >>>> >>>> I'm suggesting to add new TDF that will be allocated for that. >>>> Patch can bootstrap on ppc64le-redhat-linux and survives regression >>>> tests. >>>> >>>> Thoughts? >>> >>> >>> Ok. Soon we'll want to change dump_flags to uint64_t ... (we have 1 bit >>> left >>> if you allow negative dump_flags). It'll tickle down on a lot of >>> interfaces >>> so introducing dump_flags_t at the same time might be a good idea. >> >> >> Hello. >> >> I've prepared patch that migrates all interfaces and introduces >> dump_flags_t. > > Great. > >> I've been >> currently testing that. Apart from that Richi requested to come up with more >> generic approach >> of hierarchical structure of options. > > Didn't really "request" it, it's just something we eventually need to do when > we run out of bits again ;) I know, but it was me who came up with the idea of more fine suboptions :) > >> >> Can you please take a look at self-contained source file that shows way I've >> decided to go? >> Another question is whether we want to implement also "aliases", where for >> instance >> current 'all' is equal to union of couple of suboptions? > > Yeah, I think we do want -all-all-all and -foo-all to work. Not sure > about -all-foo-all. Actually only having 'all' is quite easy to implement. Let's imagine following hierarchy: (root) - vops - folding - gimple - ctor - array_ref - arithmetic - generic - c - c++ - ctor - xyz Then '-fdump-passname-folding-all' will be equal to '-fdump-passname-folding'. > > The important thing is to make sure dump_flags_t stays POD and thus is > eligible to be passed in register(s). In the end we might simply come up > with a two-level hierarchy, each 32bits (or we can even get back to 32bits > in total with two times 16bits). I'm aware of having the type as POD. > > It looks you didn't actually implement this as a hierarchy though but > still allocate from one pool of bits (so you only do a change to how > users access this?) Yep, all leaf options are mapped to a mask and all inner nodes are just union of suboptions. That will allow us to have 64 leaf suboptions. Reaching the limit we can encode the values in more sophisticated way. That however brings need to implement more complicated '&' and '|' operators. I'll finish the implementation and try to migrate that to current handling. Guess, I'm quite close. Martin > > Thanks, > Richard. > >> >> Thanks for feedback, >> Martin >> >>> >>> Thanks, >>> Richard. >>> >>>> Martin >> >>
Hello. There's first patch that just defines dump_flags_t as uint64_t and changes all corresponding interfaces that do use it. There's a problematic impact that all targets have to include dumpfile.h just right after coretypes.h. That makes the patch harder to properly test. I tried couple of cross-compilers and it works. Patch can bootstrap on ppc64le-redhat-linux and survives regression tests. I've been also testing i686-linux-gnu bootstrap and x86_64-linux-gnu targets. Thoughts? Martin
On Fri, May 5, 2017 at 12:41 PM, Martin Liška <mliska@suse.cz> wrote: > Hello. > > There's first patch that just defines dump_flags_t as uint64_t and changes all > corresponding interfaces that do use it. There's a problematic impact that > all targets have to include dumpfile.h just right after coretypes.h. That makes > the patch harder to properly test. I tried couple of cross-compilers and it works. > > Patch can bootstrap on ppc64le-redhat-linux and survives regression tests. > I've been also testing i686-linux-gnu bootstrap and x86_64-linux-gnu targets. > > Thoughts? So usually we get away with defining pervasive types in coretypes.h instead. Now I see how that can be not what we want if dump_flags_t becomes a "class". Well, at least if it has too many (inline) methods. How many of our files end up including dumpfile.h? There's /* Most host source files will require the following headers. */ #if !defined (GENERATOR_FILE) && !defined (USED_FOR_TARGET) #include "machmode.h" #include "signop.h" #include "wide-int.h" at the end of coretypes.h so it might be possible to stuff dumpfile.h there. Or create a dump-flags.h header just containing the flag bits. I suppose most files need it because they include some interfaces that have dump_flags_t, not because they do dumping (they'd include dumpflags.h already), right? Anyway, I think the patch is a good thing (how do we make sure we don't "regress", aka people using 'int', not knowing about dump_flag_t?). I'd probably, as a first step, simply put the typedef into coretypes.h. A patch doing that instead of sprinkling dumpfile.h everywhere is ok. Thanks, Richard. > Martin
On Thu, May 4, 2017 at 1:10 PM, Martin Liška <mliska@suse.cz> wrote: > On 05/04/2017 12:40 PM, Richard Biener wrote: >> >> On Thu, May 4, 2017 at 11:22 AM, Martin Liška <mliska@suse.cz> wrote: >>> >>> On 05/03/2017 12:12 PM, Richard Biener wrote: >>>> >>>> >>>> On Wed, May 3, 2017 at 10:10 AM, Martin Liška <mliska@suse.cz> wrote: >>>>> >>>>> >>>>> Hello >>>>> >>>>> Last release cycle I spent quite some time with reading of IVOPTS pass >>>>> dump file. Using -fdump*-details causes to generate a lot of 'Applying >>>>> pattern' >>>>> lines, which can make reading of a dump file more complicated. >>>>> >>>>> There are stats for tramp3d with -O2 and -fdump-tree-all-details. >>>>> Percentage number >>>>> shows how many lines are of the aforementioned pattern: >>>>> >>>>> tramp3d-v4.cpp.164t.ivopts: 6.34% >>>>> tramp3d-v4.cpp.091t.ccp2: 5.04% >>>>> tramp3d-v4.cpp.093t.cunrolli: 4.41% >>>>> tramp3d-v4.cpp.129t.laddress: 3.70% >>>>> tramp3d-v4.cpp.032t.ccp1: 2.31% >>>>> tramp3d-v4.cpp.038t.evrp: 1.90% >>>>> tramp3d-v4.cpp.033t.forwprop1: 1.74% >>>>> tramp3d-v4.cpp.103t.vrp1: 1.52% >>>>> tramp3d-v4.cpp.124t.forwprop3: 1.31% >>>>> tramp3d-v4.cpp.181t.vrp2: 1.30% >>>>> tramp3d-v4.cpp.161t.cunroll: 1.22% >>>>> tramp3d-v4.cpp.027t.fixup_cfg3: 1.11% >>>>> tramp3d-v4.cpp.153t.ivcanon: 1.07% >>>>> tramp3d-v4.cpp.126t.ccp3: 0.96% >>>>> tramp3d-v4.cpp.143t.sccp: 0.91% >>>>> tramp3d-v4.cpp.185t.forwprop4: 0.82% >>>>> tramp3d-v4.cpp.011t.cfg: 0.74% >>>>> tramp3d-v4.cpp.096t.forwprop2: 0.50% >>>>> tramp3d-v4.cpp.019t.fixup_cfg1: 0.37% >>>>> tramp3d-v4.cpp.120t.phicprop1: 0.33% >>>>> tramp3d-v4.cpp.133t.pre: 0.32% >>>>> tramp3d-v4.cpp.182t.phicprop2: 0.27% >>>>> tramp3d-v4.cpp.170t.veclower21: 0.25% >>>>> tramp3d-v4.cpp.029t.einline: 0.24% >>>>> >>>>> I'm suggesting to add new TDF that will be allocated for that. >>>>> Patch can bootstrap on ppc64le-redhat-linux and survives regression >>>>> tests. >>>>> >>>>> Thoughts? >>>> >>>> >>>> >>>> Ok. Soon we'll want to change dump_flags to uint64_t ... (we have 1 >>>> bit >>>> left >>>> if you allow negative dump_flags). It'll tickle down on a lot of >>>> interfaces >>>> so introducing dump_flags_t at the same time might be a good idea. >>> >>> >>> >>> Hello. >>> >>> I've prepared patch that migrates all interfaces and introduces >>> dump_flags_t. >> >> >> Great. >> >>> I've been >>> currently testing that. Apart from that Richi requested to come up with >>> more >>> generic approach >>> of hierarchical structure of options. >> >> >> Didn't really "request" it, it's just something we eventually need to do >> when >> we run out of bits again ;) > > > I know, but it was me who came up with the idea of more fine suboptions :) > >> >>> >>> Can you please take a look at self-contained source file that shows way >>> I've >>> decided to go? >>> Another question is whether we want to implement also "aliases", where >>> for >>> instance >>> current 'all' is equal to union of couple of suboptions? >> >> >> Yeah, I think we do want -all-all-all and -foo-all to work. Not sure >> about -all-foo-all. > > > Actually only having 'all' is quite easy to implement. > > Let's imagine following hierarchy: > > (root) > - vops > - folding > - gimple > - ctor > - array_ref > - arithmetic > - generic > - c > - c++ > - ctor > - xyz > > Then '-fdump-passname-folding-all' will be equal to > '-fdump-passname-folding'. Ok, so you envision that sub-options restrict stuff. I thought of -gimple -vops -generic -folding so the other way around. We do not have many options that would be RTL specific but gimple only are -vops -alias -scev -gimple -rhs-only -verbose -memsyms while RTL has -cselib. -eh sounds gimple specific. Then there's the optgroup stuff you already saw. So it looks like a 8 bit "group id" plus 56 bits of flags would do. Yes, this implies reworking how & and | work. For example you can't | dump-flags of different groups. >> >> The important thing is to make sure dump_flags_t stays POD and thus is >> eligible to be passed in register(s). In the end we might simply come up >> with a two-level hierarchy, each 32bits (or we can even get back to 32bits >> in total with two times 16bits). > > > I'm aware of having the type as POD. > >> >> It looks you didn't actually implement this as a hierarchy though but >> still allocate from one pool of bits (so you only do a change to how >> users access this?) > > > Yep, all leaf options are mapped to a mask and all inner nodes are just > union > of suboptions. That will allow us to have 64 leaf suboptions. Reaching the > limit > we can encode the values in more sophisticated way. That however brings need > to implement more complicated '&' and '|' operators. > > I'll finish the implementation and try to migrate that to current handling. > Guess, I'm quite close. Hmm, but then there's not much advantage in suboptions (well, apart from maybe at the user-side). > Martin > > >> >> Thanks, >> Richard. >> >>> >>> Thanks for feedback, >>> Martin >>> >>>> >>>> Thanks, >>>> Richard. >>>> >>>>> Martin >>> >>> >>> >
On Fri, May 05, 2017 at 01:38:21PM +0200, Richard Biener wrote: > On Fri, May 5, 2017 at 12:41 PM, Martin Liška <mliska@suse.cz> wrote: > > Hello. > > > > There's first patch that just defines dump_flags_t as uint64_t and changes all > > corresponding interfaces that do use it. There's a problematic impact that > > all targets have to include dumpfile.h just right after coretypes.h. That makes > > the patch harder to properly test. I tried couple of cross-compilers and it works. > > > > Patch can bootstrap on ppc64le-redhat-linux and survives regression tests. > > I've been also testing i686-linux-gnu bootstrap and x86_64-linux-gnu targets. > > > > Thoughts? > > So usually we get away with defining pervasive types in coretypes.h instead. > > Now I see how that can be not what we want if dump_flags_t becomes a > "class". Well, at least if it has too many (inline) methods. > > How many of our files end up including dumpfile.h? There's > > /* Most host source files will require the following headers. */ > #if !defined (GENERATOR_FILE) && !defined (USED_FOR_TARGET) > #include "machmode.h" > #include "signop.h" > #include "wide-int.h" > > at the end of coretypes.h so it might be possible to stuff dumpfile.h there. > > Or create a dump-flags.h header just containing the flag bits. I suppose > most files need it because they include some interfaces that have > dump_flags_t, not because they do dumping (they'd include dumpflags.h > already), right? well, if a header uses dumpflags_t and a forward declaration won't do it should include it right? but continueing the coretypes.h hacks may be expedient. > Anyway, I think the patch is a good thing (how do we make sure we > don't "regress", aka people using 'int', not knowing about dump_flag_t?). does it make sense to define it as a struct whose only member is a uint64_t? That way you'd have to explicitly pull out the field wherever you want to pass it to something taking an int. That would require defining a bunch of operator &/| etc. However we could also move from there to a set of bitfields for most of the flags and usually not need to explicitly extract the bit we want. Trev
On 05/05/2017 01:50 PM, Richard Biener wrote: > On Thu, May 4, 2017 at 1:10 PM, Martin Liška <mliska@suse.cz> wrote: >> On 05/04/2017 12:40 PM, Richard Biener wrote: >>> >>> On Thu, May 4, 2017 at 11:22 AM, Martin Liška <mliska@suse.cz> wrote: >>>> >>>> On 05/03/2017 12:12 PM, Richard Biener wrote: >>>>> >>>>> >>>>> On Wed, May 3, 2017 at 10:10 AM, Martin Liška <mliska@suse.cz> wrote: >>>>>> >>>>>> >>>>>> Hello >>>>>> >>>>>> Last release cycle I spent quite some time with reading of IVOPTS pass >>>>>> dump file. Using -fdump*-details causes to generate a lot of 'Applying >>>>>> pattern' >>>>>> lines, which can make reading of a dump file more complicated. >>>>>> >>>>>> There are stats for tramp3d with -O2 and -fdump-tree-all-details. >>>>>> Percentage number >>>>>> shows how many lines are of the aforementioned pattern: >>>>>> >>>>>> tramp3d-v4.cpp.164t.ivopts: 6.34% >>>>>> tramp3d-v4.cpp.091t.ccp2: 5.04% >>>>>> tramp3d-v4.cpp.093t.cunrolli: 4.41% >>>>>> tramp3d-v4.cpp.129t.laddress: 3.70% >>>>>> tramp3d-v4.cpp.032t.ccp1: 2.31% >>>>>> tramp3d-v4.cpp.038t.evrp: 1.90% >>>>>> tramp3d-v4.cpp.033t.forwprop1: 1.74% >>>>>> tramp3d-v4.cpp.103t.vrp1: 1.52% >>>>>> tramp3d-v4.cpp.124t.forwprop3: 1.31% >>>>>> tramp3d-v4.cpp.181t.vrp2: 1.30% >>>>>> tramp3d-v4.cpp.161t.cunroll: 1.22% >>>>>> tramp3d-v4.cpp.027t.fixup_cfg3: 1.11% >>>>>> tramp3d-v4.cpp.153t.ivcanon: 1.07% >>>>>> tramp3d-v4.cpp.126t.ccp3: 0.96% >>>>>> tramp3d-v4.cpp.143t.sccp: 0.91% >>>>>> tramp3d-v4.cpp.185t.forwprop4: 0.82% >>>>>> tramp3d-v4.cpp.011t.cfg: 0.74% >>>>>> tramp3d-v4.cpp.096t.forwprop2: 0.50% >>>>>> tramp3d-v4.cpp.019t.fixup_cfg1: 0.37% >>>>>> tramp3d-v4.cpp.120t.phicprop1: 0.33% >>>>>> tramp3d-v4.cpp.133t.pre: 0.32% >>>>>> tramp3d-v4.cpp.182t.phicprop2: 0.27% >>>>>> tramp3d-v4.cpp.170t.veclower21: 0.25% >>>>>> tramp3d-v4.cpp.029t.einline: 0.24% >>>>>> >>>>>> I'm suggesting to add new TDF that will be allocated for that. >>>>>> Patch can bootstrap on ppc64le-redhat-linux and survives regression >>>>>> tests. >>>>>> >>>>>> Thoughts? >>>>> >>>>> >>>>> >>>>> Ok. Soon we'll want to change dump_flags to uint64_t ... (we have 1 >>>>> bit >>>>> left >>>>> if you allow negative dump_flags). It'll tickle down on a lot of >>>>> interfaces >>>>> so introducing dump_flags_t at the same time might be a good idea. >>>> >>>> >>>> >>>> Hello. >>>> >>>> I've prepared patch that migrates all interfaces and introduces >>>> dump_flags_t. >>> >>> >>> Great. >>> >>>> I've been >>>> currently testing that. Apart from that Richi requested to come up with >>>> more >>>> generic approach >>>> of hierarchical structure of options. >>> >>> >>> Didn't really "request" it, it's just something we eventually need to do >>> when >>> we run out of bits again ;) >> >> >> I know, but it was me who came up with the idea of more fine suboptions :) >> >>> >>>> >>>> Can you please take a look at self-contained source file that shows way >>>> I've >>>> decided to go? >>>> Another question is whether we want to implement also "aliases", where >>>> for >>>> instance >>>> current 'all' is equal to union of couple of suboptions? >>> >>> >>> Yeah, I think we do want -all-all-all and -foo-all to work. Not sure >>> about -all-foo-all. >> >> >> Actually only having 'all' is quite easy to implement. >> >> Let's imagine following hierarchy: >> >> (root) >> - vops >> - folding >> - gimple >> - ctor >> - array_ref >> - arithmetic >> - generic >> - c >> - c++ >> - ctor >> - xyz >> >> Then '-fdump-passname-folding-all' will be equal to >> '-fdump-passname-folding'. > > Ok, so you envision that sub-options restrict stuff. I thought of > > -gimple > -vops > -generic > -folding > > so the other way around. We do not have many options that would be RTL > specific but gimple only are -vops -alias -scev -gimple -rhs-only > -verbose -memsyms > while RTL has -cselib. -eh sounds gimple specific. Then there's the optgroup > stuff you already saw. > > So it looks like a 8 bit "group id" plus 56 bits of flags would do. > > Yes, this implies reworking how & and | work. For example you can't > | dump-flags of different groups. Well, I'm not opposed to idea of converting that to way you described. So, you're willing to introduce something like: (root) - generic - eh - folding - ... - gimple - vops - folding - rhs-only - ... - vops - rtl - cselib - ... ? > >>> >>> The important thing is to make sure dump_flags_t stays POD and thus is >>> eligible to be passed in register(s). In the end we might simply come up >>> with a two-level hierarchy, each 32bits (or we can even get back to 32bits >>> in total with two times 16bits). >> >> >> I'm aware of having the type as POD. >> >>> >>> It looks you didn't actually implement this as a hierarchy though but >>> still allocate from one pool of bits (so you only do a change to how >>> users access this?) >> >> >> Yep, all leaf options are mapped to a mask and all inner nodes are just >> union >> of suboptions. That will allow us to have 64 leaf suboptions. Reaching the >> limit >> we can encode the values in more sophisticated way. That however brings need >> to implement more complicated '&' and '|' operators. >> >> I'll finish the implementation and try to migrate that to current handling. >> Guess, I'm quite close. > > Hmm, but then there's not much advantage in suboptions (well, apart from maybe > at the user-side). Yep, please take a look at updated version of PATCH 2/N, where I ported -fopt-info. As you can see I had to explicitly define all enum values and hierarchy creation of every single node. Martin > >> Martin >> >> >>> >>> Thanks, >>> Richard. >>> >>>> >>>> Thanks for feedback, >>>> Martin >>>> >>>>> >>>>> Thanks, >>>>> Richard. >>>>> >>>>>> Martin >>>> >>>> >>>> >>
On Tue, May 9, 2017 at 2:01 PM, Martin Liška <mliska@suse.cz> wrote: > On 05/05/2017 01:50 PM, Richard Biener wrote: >> On Thu, May 4, 2017 at 1:10 PM, Martin Liška <mliska@suse.cz> wrote: >>> On 05/04/2017 12:40 PM, Richard Biener wrote: >>>> >>>> On Thu, May 4, 2017 at 11:22 AM, Martin Liška <mliska@suse.cz> wrote: >>>>> >>>>> On 05/03/2017 12:12 PM, Richard Biener wrote: >>>>>> >>>>>> >>>>>> On Wed, May 3, 2017 at 10:10 AM, Martin Liška <mliska@suse.cz> wrote: >>>>>>> >>>>>>> >>>>>>> Hello >>>>>>> >>>>>>> Last release cycle I spent quite some time with reading of IVOPTS pass >>>>>>> dump file. Using -fdump*-details causes to generate a lot of 'Applying >>>>>>> pattern' >>>>>>> lines, which can make reading of a dump file more complicated. >>>>>>> >>>>>>> There are stats for tramp3d with -O2 and -fdump-tree-all-details. >>>>>>> Percentage number >>>>>>> shows how many lines are of the aforementioned pattern: >>>>>>> >>>>>>> tramp3d-v4.cpp.164t.ivopts: 6.34% >>>>>>> tramp3d-v4.cpp.091t.ccp2: 5.04% >>>>>>> tramp3d-v4.cpp.093t.cunrolli: 4.41% >>>>>>> tramp3d-v4.cpp.129t.laddress: 3.70% >>>>>>> tramp3d-v4.cpp.032t.ccp1: 2.31% >>>>>>> tramp3d-v4.cpp.038t.evrp: 1.90% >>>>>>> tramp3d-v4.cpp.033t.forwprop1: 1.74% >>>>>>> tramp3d-v4.cpp.103t.vrp1: 1.52% >>>>>>> tramp3d-v4.cpp.124t.forwprop3: 1.31% >>>>>>> tramp3d-v4.cpp.181t.vrp2: 1.30% >>>>>>> tramp3d-v4.cpp.161t.cunroll: 1.22% >>>>>>> tramp3d-v4.cpp.027t.fixup_cfg3: 1.11% >>>>>>> tramp3d-v4.cpp.153t.ivcanon: 1.07% >>>>>>> tramp3d-v4.cpp.126t.ccp3: 0.96% >>>>>>> tramp3d-v4.cpp.143t.sccp: 0.91% >>>>>>> tramp3d-v4.cpp.185t.forwprop4: 0.82% >>>>>>> tramp3d-v4.cpp.011t.cfg: 0.74% >>>>>>> tramp3d-v4.cpp.096t.forwprop2: 0.50% >>>>>>> tramp3d-v4.cpp.019t.fixup_cfg1: 0.37% >>>>>>> tramp3d-v4.cpp.120t.phicprop1: 0.33% >>>>>>> tramp3d-v4.cpp.133t.pre: 0.32% >>>>>>> tramp3d-v4.cpp.182t.phicprop2: 0.27% >>>>>>> tramp3d-v4.cpp.170t.veclower21: 0.25% >>>>>>> tramp3d-v4.cpp.029t.einline: 0.24% >>>>>>> >>>>>>> I'm suggesting to add new TDF that will be allocated for that. >>>>>>> Patch can bootstrap on ppc64le-redhat-linux and survives regression >>>>>>> tests. >>>>>>> >>>>>>> Thoughts? >>>>>> >>>>>> >>>>>> >>>>>> Ok. Soon we'll want to change dump_flags to uint64_t ... (we have 1 >>>>>> bit >>>>>> left >>>>>> if you allow negative dump_flags). It'll tickle down on a lot of >>>>>> interfaces >>>>>> so introducing dump_flags_t at the same time might be a good idea. >>>>> >>>>> >>>>> >>>>> Hello. >>>>> >>>>> I've prepared patch that migrates all interfaces and introduces >>>>> dump_flags_t. >>>> >>>> >>>> Great. >>>> >>>>> I've been >>>>> currently testing that. Apart from that Richi requested to come up with >>>>> more >>>>> generic approach >>>>> of hierarchical structure of options. >>>> >>>> >>>> Didn't really "request" it, it's just something we eventually need to do >>>> when >>>> we run out of bits again ;) >>> >>> >>> I know, but it was me who came up with the idea of more fine suboptions :) >>> >>>> >>>>> >>>>> Can you please take a look at self-contained source file that shows way >>>>> I've >>>>> decided to go? >>>>> Another question is whether we want to implement also "aliases", where >>>>> for >>>>> instance >>>>> current 'all' is equal to union of couple of suboptions? >>>> >>>> >>>> Yeah, I think we do want -all-all-all and -foo-all to work. Not sure >>>> about -all-foo-all. >>> >>> >>> Actually only having 'all' is quite easy to implement. >>> >>> Let's imagine following hierarchy: >>> >>> (root) >>> - vops >>> - folding >>> - gimple >>> - ctor >>> - array_ref >>> - arithmetic >>> - generic >>> - c >>> - c++ >>> - ctor >>> - xyz >>> >>> Then '-fdump-passname-folding-all' will be equal to >>> '-fdump-passname-folding'. >> >> Ok, so you envision that sub-options restrict stuff. I thought of >> >> -gimple >> -vops >> -generic >> -folding >> >> so the other way around. We do not have many options that would be RTL >> specific but gimple only are -vops -alias -scev -gimple -rhs-only >> -verbose -memsyms >> while RTL has -cselib. -eh sounds gimple specific. Then there's the optgroup >> stuff you already saw. >> >> So it looks like a 8 bit "group id" plus 56 bits of flags would do. >> >> Yes, this implies reworking how & and | work. For example you can't >> | dump-flags of different groups. > > Well, I'm not opposed to idea of converting that to way you described. > So, you're willing to introduce something like: > > (root) > - generic > - eh > - folding > - ... > - gimple > - vops > - folding > - rhs-only > - ... > - vops > - rtl > - cselib > - ... > > ? Yeah. As said the motivation was to escape the 32 (now 64) bits limitation, not to make the user interface into a hierarchy. I suppose we can easily defer now given we have 32 bits available now ;) Richard. >> >>>> >>>> The important thing is to make sure dump_flags_t stays POD and thus is >>>> eligible to be passed in register(s). In the end we might simply come up >>>> with a two-level hierarchy, each 32bits (or we can even get back to 32bits >>>> in total with two times 16bits). >>> >>> >>> I'm aware of having the type as POD. >>> >>>> >>>> It looks you didn't actually implement this as a hierarchy though but >>>> still allocate from one pool of bits (so you only do a change to how >>>> users access this?) >>> >>> >>> Yep, all leaf options are mapped to a mask and all inner nodes are just >>> union >>> of suboptions. That will allow us to have 64 leaf suboptions. Reaching the >>> limit >>> we can encode the values in more sophisticated way. That however brings need >>> to implement more complicated '&' and '|' operators. >>> >>> I'll finish the implementation and try to migrate that to current handling. >>> Guess, I'm quite close. >> >> Hmm, but then there's not much advantage in suboptions (well, apart from maybe >> at the user-side). > > Yep, please take a look at updated version of PATCH 2/N, where I ported -fopt-info. > As you can see I had to explicitly define all enum values and hierarchy creation > of every single node. > > Martin > >> >>> Martin >>> >>> >>>> >>>> Thanks, >>>> Richard. >>>> >>>>> >>>>> Thanks for feedback, >>>>> Martin >>>>> >>>>>> >>>>>> Thanks, >>>>>> Richard. >>>>>> >>>>>>> Martin >>>>> >>>>> >>>>> >>> >
On 05/09/2017 02:16 PM, Richard Biener wrote: > On Tue, May 9, 2017 at 2:01 PM, Martin Liška <mliska@suse.cz> wrote: >> On 05/05/2017 01:50 PM, Richard Biener wrote: >>> On Thu, May 4, 2017 at 1:10 PM, Martin Liška <mliska@suse.cz> wrote: >>>> On 05/04/2017 12:40 PM, Richard Biener wrote: >>>>> >>>>> On Thu, May 4, 2017 at 11:22 AM, Martin Liška <mliska@suse.cz> wrote: >>>>>> >>>>>> On 05/03/2017 12:12 PM, Richard Biener wrote: >>>>>>> >>>>>>> >>>>>>> On Wed, May 3, 2017 at 10:10 AM, Martin Liška <mliska@suse.cz> wrote: >>>>>>>> >>>>>>>> >>>>>>>> Hello >>>>>>>> >>>>>>>> Last release cycle I spent quite some time with reading of IVOPTS pass >>>>>>>> dump file. Using -fdump*-details causes to generate a lot of 'Applying >>>>>>>> pattern' >>>>>>>> lines, which can make reading of a dump file more complicated. >>>>>>>> >>>>>>>> There are stats for tramp3d with -O2 and -fdump-tree-all-details. >>>>>>>> Percentage number >>>>>>>> shows how many lines are of the aforementioned pattern: >>>>>>>> >>>>>>>> tramp3d-v4.cpp.164t.ivopts: 6.34% >>>>>>>> tramp3d-v4.cpp.091t.ccp2: 5.04% >>>>>>>> tramp3d-v4.cpp.093t.cunrolli: 4.41% >>>>>>>> tramp3d-v4.cpp.129t.laddress: 3.70% >>>>>>>> tramp3d-v4.cpp.032t.ccp1: 2.31% >>>>>>>> tramp3d-v4.cpp.038t.evrp: 1.90% >>>>>>>> tramp3d-v4.cpp.033t.forwprop1: 1.74% >>>>>>>> tramp3d-v4.cpp.103t.vrp1: 1.52% >>>>>>>> tramp3d-v4.cpp.124t.forwprop3: 1.31% >>>>>>>> tramp3d-v4.cpp.181t.vrp2: 1.30% >>>>>>>> tramp3d-v4.cpp.161t.cunroll: 1.22% >>>>>>>> tramp3d-v4.cpp.027t.fixup_cfg3: 1.11% >>>>>>>> tramp3d-v4.cpp.153t.ivcanon: 1.07% >>>>>>>> tramp3d-v4.cpp.126t.ccp3: 0.96% >>>>>>>> tramp3d-v4.cpp.143t.sccp: 0.91% >>>>>>>> tramp3d-v4.cpp.185t.forwprop4: 0.82% >>>>>>>> tramp3d-v4.cpp.011t.cfg: 0.74% >>>>>>>> tramp3d-v4.cpp.096t.forwprop2: 0.50% >>>>>>>> tramp3d-v4.cpp.019t.fixup_cfg1: 0.37% >>>>>>>> tramp3d-v4.cpp.120t.phicprop1: 0.33% >>>>>>>> tramp3d-v4.cpp.133t.pre: 0.32% >>>>>>>> tramp3d-v4.cpp.182t.phicprop2: 0.27% >>>>>>>> tramp3d-v4.cpp.170t.veclower21: 0.25% >>>>>>>> tramp3d-v4.cpp.029t.einline: 0.24% >>>>>>>> >>>>>>>> I'm suggesting to add new TDF that will be allocated for that. >>>>>>>> Patch can bootstrap on ppc64le-redhat-linux and survives regression >>>>>>>> tests. >>>>>>>> >>>>>>>> Thoughts? >>>>>>> >>>>>>> >>>>>>> >>>>>>> Ok. Soon we'll want to change dump_flags to uint64_t ... (we have 1 >>>>>>> bit >>>>>>> left >>>>>>> if you allow negative dump_flags). It'll tickle down on a lot of >>>>>>> interfaces >>>>>>> so introducing dump_flags_t at the same time might be a good idea. >>>>>> >>>>>> >>>>>> >>>>>> Hello. >>>>>> >>>>>> I've prepared patch that migrates all interfaces and introduces >>>>>> dump_flags_t. >>>>> >>>>> >>>>> Great. >>>>> >>>>>> I've been >>>>>> currently testing that. Apart from that Richi requested to come up with >>>>>> more >>>>>> generic approach >>>>>> of hierarchical structure of options. >>>>> >>>>> >>>>> Didn't really "request" it, it's just something we eventually need to do >>>>> when >>>>> we run out of bits again ;) >>>> >>>> >>>> I know, but it was me who came up with the idea of more fine suboptions :) >>>> >>>>> >>>>>> >>>>>> Can you please take a look at self-contained source file that shows way >>>>>> I've >>>>>> decided to go? >>>>>> Another question is whether we want to implement also "aliases", where >>>>>> for >>>>>> instance >>>>>> current 'all' is equal to union of couple of suboptions? >>>>> >>>>> >>>>> Yeah, I think we do want -all-all-all and -foo-all to work. Not sure >>>>> about -all-foo-all. >>>> >>>> >>>> Actually only having 'all' is quite easy to implement. >>>> >>>> Let's imagine following hierarchy: >>>> >>>> (root) >>>> - vops >>>> - folding >>>> - gimple >>>> - ctor >>>> - array_ref >>>> - arithmetic >>>> - generic >>>> - c >>>> - c++ >>>> - ctor >>>> - xyz >>>> >>>> Then '-fdump-passname-folding-all' will be equal to >>>> '-fdump-passname-folding'. >>> >>> Ok, so you envision that sub-options restrict stuff. I thought of >>> >>> -gimple >>> -vops >>> -generic >>> -folding >>> >>> so the other way around. We do not have many options that would be RTL >>> specific but gimple only are -vops -alias -scev -gimple -rhs-only >>> -verbose -memsyms >>> while RTL has -cselib. -eh sounds gimple specific. Then there's the optgroup >>> stuff you already saw. >>> >>> So it looks like a 8 bit "group id" plus 56 bits of flags would do. >>> >>> Yes, this implies reworking how & and | work. For example you can't >>> | dump-flags of different groups. >> >> Well, I'm not opposed to idea of converting that to way you described. >> So, you're willing to introduce something like: >> >> (root) >> - generic >> - eh >> - folding >> - ... >> - gimple >> - vops >> - folding >> - rhs-only >> - ... >> - vops >> - rtl >> - cselib >> - ... >> >> ? > > Yeah. As said the motivation was to escape the 32 (now 64) bits limitation, > not to make the user interface into a hierarchy. > > I suppose we can easily defer now given we have 32 bits available now ;) I see. Hopefully we can live quite some time with another 32 bits and I'm going to transform the TDF_* stuff to enum. Martin > > Richard. > >>> >>>>> >>>>> The important thing is to make sure dump_flags_t stays POD and thus is >>>>> eligible to be passed in register(s). In the end we might simply come up >>>>> with a two-level hierarchy, each 32bits (or we can even get back to 32bits >>>>> in total with two times 16bits). >>>> >>>> >>>> I'm aware of having the type as POD. >>>> >>>>> >>>>> It looks you didn't actually implement this as a hierarchy though but >>>>> still allocate from one pool of bits (so you only do a change to how >>>>> users access this?) >>>> >>>> >>>> Yep, all leaf options are mapped to a mask and all inner nodes are just >>>> union >>>> of suboptions. That will allow us to have 64 leaf suboptions. Reaching the >>>> limit >>>> we can encode the values in more sophisticated way. That however brings need >>>> to implement more complicated '&' and '|' operators. >>>> >>>> I'll finish the implementation and try to migrate that to current handling. >>>> Guess, I'm quite close. >>> >>> Hmm, but then there's not much advantage in suboptions (well, apart from maybe >>> at the user-side). >> >> Yep, please take a look at updated version of PATCH 2/N, where I ported -fopt-info. >> As you can see I had to explicitly define all enum values and hierarchy creation >> of every single node. >> >> Martin >> >>> >>>> Martin >>>> >>>> >>>>> >>>>> Thanks, >>>>> Richard. >>>>> >>>>>> >>>>>> Thanks for feedback, >>>>>> Martin >>>>>> >>>>>>> >>>>>>> Thanks, >>>>>>> Richard. >>>>>>> >>>>>>>> Martin >>>>>> >>>>>> >>>>>> >>>> >>
On Tue, May 9, 2017 at 2:46 PM, Martin Liška <mliska@suse.cz> wrote: > On 05/09/2017 02:16 PM, Richard Biener wrote: >> On Tue, May 9, 2017 at 2:01 PM, Martin Liška <mliska@suse.cz> wrote: >>> On 05/05/2017 01:50 PM, Richard Biener wrote: >>>> On Thu, May 4, 2017 at 1:10 PM, Martin Liška <mliska@suse.cz> wrote: >>>>> On 05/04/2017 12:40 PM, Richard Biener wrote: >>>>>> >>>>>> On Thu, May 4, 2017 at 11:22 AM, Martin Liška <mliska@suse.cz> wrote: >>>>>>> >>>>>>> On 05/03/2017 12:12 PM, Richard Biener wrote: >>>>>>>> >>>>>>>> >>>>>>>> On Wed, May 3, 2017 at 10:10 AM, Martin Liška <mliska@suse.cz> wrote: >>>>>>>>> >>>>>>>>> >>>>>>>>> Hello >>>>>>>>> >>>>>>>>> Last release cycle I spent quite some time with reading of IVOPTS pass >>>>>>>>> dump file. Using -fdump*-details causes to generate a lot of 'Applying >>>>>>>>> pattern' >>>>>>>>> lines, which can make reading of a dump file more complicated. >>>>>>>>> >>>>>>>>> There are stats for tramp3d with -O2 and -fdump-tree-all-details. >>>>>>>>> Percentage number >>>>>>>>> shows how many lines are of the aforementioned pattern: >>>>>>>>> >>>>>>>>> tramp3d-v4.cpp.164t.ivopts: 6.34% >>>>>>>>> tramp3d-v4.cpp.091t.ccp2: 5.04% >>>>>>>>> tramp3d-v4.cpp.093t.cunrolli: 4.41% >>>>>>>>> tramp3d-v4.cpp.129t.laddress: 3.70% >>>>>>>>> tramp3d-v4.cpp.032t.ccp1: 2.31% >>>>>>>>> tramp3d-v4.cpp.038t.evrp: 1.90% >>>>>>>>> tramp3d-v4.cpp.033t.forwprop1: 1.74% >>>>>>>>> tramp3d-v4.cpp.103t.vrp1: 1.52% >>>>>>>>> tramp3d-v4.cpp.124t.forwprop3: 1.31% >>>>>>>>> tramp3d-v4.cpp.181t.vrp2: 1.30% >>>>>>>>> tramp3d-v4.cpp.161t.cunroll: 1.22% >>>>>>>>> tramp3d-v4.cpp.027t.fixup_cfg3: 1.11% >>>>>>>>> tramp3d-v4.cpp.153t.ivcanon: 1.07% >>>>>>>>> tramp3d-v4.cpp.126t.ccp3: 0.96% >>>>>>>>> tramp3d-v4.cpp.143t.sccp: 0.91% >>>>>>>>> tramp3d-v4.cpp.185t.forwprop4: 0.82% >>>>>>>>> tramp3d-v4.cpp.011t.cfg: 0.74% >>>>>>>>> tramp3d-v4.cpp.096t.forwprop2: 0.50% >>>>>>>>> tramp3d-v4.cpp.019t.fixup_cfg1: 0.37% >>>>>>>>> tramp3d-v4.cpp.120t.phicprop1: 0.33% >>>>>>>>> tramp3d-v4.cpp.133t.pre: 0.32% >>>>>>>>> tramp3d-v4.cpp.182t.phicprop2: 0.27% >>>>>>>>> tramp3d-v4.cpp.170t.veclower21: 0.25% >>>>>>>>> tramp3d-v4.cpp.029t.einline: 0.24% >>>>>>>>> >>>>>>>>> I'm suggesting to add new TDF that will be allocated for that. >>>>>>>>> Patch can bootstrap on ppc64le-redhat-linux and survives regression >>>>>>>>> tests. >>>>>>>>> >>>>>>>>> Thoughts? >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> Ok. Soon we'll want to change dump_flags to uint64_t ... (we have 1 >>>>>>>> bit >>>>>>>> left >>>>>>>> if you allow negative dump_flags). It'll tickle down on a lot of >>>>>>>> interfaces >>>>>>>> so introducing dump_flags_t at the same time might be a good idea. >>>>>>> >>>>>>> >>>>>>> >>>>>>> Hello. >>>>>>> >>>>>>> I've prepared patch that migrates all interfaces and introduces >>>>>>> dump_flags_t. >>>>>> >>>>>> >>>>>> Great. >>>>>> >>>>>>> I've been >>>>>>> currently testing that. Apart from that Richi requested to come up with >>>>>>> more >>>>>>> generic approach >>>>>>> of hierarchical structure of options. >>>>>> >>>>>> >>>>>> Didn't really "request" it, it's just something we eventually need to do >>>>>> when >>>>>> we run out of bits again ;) >>>>> >>>>> >>>>> I know, but it was me who came up with the idea of more fine suboptions :) >>>>> >>>>>> >>>>>>> >>>>>>> Can you please take a look at self-contained source file that shows way >>>>>>> I've >>>>>>> decided to go? >>>>>>> Another question is whether we want to implement also "aliases", where >>>>>>> for >>>>>>> instance >>>>>>> current 'all' is equal to union of couple of suboptions? >>>>>> >>>>>> >>>>>> Yeah, I think we do want -all-all-all and -foo-all to work. Not sure >>>>>> about -all-foo-all. >>>>> >>>>> >>>>> Actually only having 'all' is quite easy to implement. >>>>> >>>>> Let's imagine following hierarchy: >>>>> >>>>> (root) >>>>> - vops >>>>> - folding >>>>> - gimple >>>>> - ctor >>>>> - array_ref >>>>> - arithmetic >>>>> - generic >>>>> - c >>>>> - c++ >>>>> - ctor >>>>> - xyz >>>>> >>>>> Then '-fdump-passname-folding-all' will be equal to >>>>> '-fdump-passname-folding'. >>>> >>>> Ok, so you envision that sub-options restrict stuff. I thought of >>>> >>>> -gimple >>>> -vops >>>> -generic >>>> -folding >>>> >>>> so the other way around. We do not have many options that would be RTL >>>> specific but gimple only are -vops -alias -scev -gimple -rhs-only >>>> -verbose -memsyms >>>> while RTL has -cselib. -eh sounds gimple specific. Then there's the optgroup >>>> stuff you already saw. >>>> >>>> So it looks like a 8 bit "group id" plus 56 bits of flags would do. >>>> >>>> Yes, this implies reworking how & and | work. For example you can't >>>> | dump-flags of different groups. >>> >>> Well, I'm not opposed to idea of converting that to way you described. >>> So, you're willing to introduce something like: >>> >>> (root) >>> - generic >>> - eh >>> - folding >>> - ... >>> - gimple >>> - vops >>> - folding >>> - rhs-only >>> - ... >>> - vops >>> - rtl >>> - cselib >>> - ... >>> >>> ? >> >> Yeah. As said the motivation was to escape the 32 (now 64) bits limitation, >> not to make the user interface into a hierarchy. >> >> I suppose we can easily defer now given we have 32 bits available now ;) > > I see. Hopefully we can live quite some time with another 32 bits and I'm going > to transform the TDF_* stuff to enum. Maybe another .def file with embedded docs ;) Ok, no, I didn't suggest this. /me runs... > Martin > >> >> Richard. >> >>>> >>>>>> >>>>>> The important thing is to make sure dump_flags_t stays POD and thus is >>>>>> eligible to be passed in register(s). In the end we might simply come up >>>>>> with a two-level hierarchy, each 32bits (or we can even get back to 32bits >>>>>> in total with two times 16bits). >>>>> >>>>> >>>>> I'm aware of having the type as POD. >>>>> >>>>>> >>>>>> It looks you didn't actually implement this as a hierarchy though but >>>>>> still allocate from one pool of bits (so you only do a change to how >>>>>> users access this?) >>>>> >>>>> >>>>> Yep, all leaf options are mapped to a mask and all inner nodes are just >>>>> union >>>>> of suboptions. That will allow us to have 64 leaf suboptions. Reaching the >>>>> limit >>>>> we can encode the values in more sophisticated way. That however brings need >>>>> to implement more complicated '&' and '|' operators. >>>>> >>>>> I'll finish the implementation and try to migrate that to current handling. >>>>> Guess, I'm quite close. >>>> >>>> Hmm, but then there's not much advantage in suboptions (well, apart from maybe >>>> at the user-side). >>> >>> Yep, please take a look at updated version of PATCH 2/N, where I ported -fopt-info. >>> As you can see I had to explicitly define all enum values and hierarchy creation >>> of every single node. >>> >>> Martin >>> >>>> >>>>> Martin >>>>> >>>>> >>>>>> >>>>>> Thanks, >>>>>> Richard. >>>>>> >>>>>>> >>>>>>> Thanks for feedback, >>>>>>> Martin >>>>>>> >>>>>>>> >>>>>>>> Thanks, >>>>>>>> Richard. >>>>>>>> >>>>>>>>> Martin >>>>>>> >>>>>>> >>>>>>> >>>>> >>> >
Hello. After fiddling with that, I decided to come up with reworked first part of the patch and eventual translation to a more hierarchical structure is subject for discussion. This first patch adds default for couple of dump functions and it helps in future to transform 0 to a TDF_NONE (or whatever we call it). Patch can bootstrap on ppc64le-redhat-linux and survives regression tests. Ready to be installed? Martin
On Fri, May 12, 2017 at 2:57 PM, Martin Liška <mliska@suse.cz> wrote: > Hello. > > After fiddling with that, I decided to come up with reworked first part > of the patch and eventual translation to a more hierarchical structure > is subject for discussion. > > This first patch adds default for couple of dump functions and it helps in future > to transform 0 to a TDF_NONE (or whatever we call it). > > Patch can bootstrap on ppc64le-redhat-linux and survives regression tests. > > Ready to be installed? Works for me. Thanks, Richard. > Martin
From c1b832212576fd9f89fd738ae0cc98e9fb189c1d Mon Sep 17 00:00:00 2001 From: marxin <mliska@suse.cz> Date: Wed, 1 Feb 2017 15:34:52 +0100 Subject: [PATCH] Introduce -fdump*-folding gcc/ChangeLog: 2017-05-03 Martin Liska <mliska@suse.cz> * dumpfile.c: Add TDF_FOLDING. * dumpfile.h (enum tree_dump_index): Add to the enum. * genmatch.c (dt_simplify::gen_1): Use the newly added enum value. --- gcc/dumpfile.c | 1 + gcc/dumpfile.h | 7 ++++--- gcc/genmatch.c | 2 +- 3 files changed, 6 insertions(+), 4 deletions(-) diff --git a/gcc/dumpfile.c b/gcc/dumpfile.c index 6b9a47c5a26..b9c881c103f 100644 --- a/gcc/dumpfile.c +++ b/gcc/dumpfile.c @@ -111,6 +111,7 @@ static const struct dump_option_value_info dump_options[] = {"enumerate_locals", TDF_ENUMERATE_LOCALS}, {"scev", TDF_SCEV}, {"gimple", TDF_GIMPLE}, + {"folding", TDF_FOLDING}, {"optimized", MSG_OPTIMIZED_LOCATIONS}, {"missed", MSG_MISSED_OPTIMIZATION}, {"note", MSG_NOTE}, diff --git a/gcc/dumpfile.h b/gcc/dumpfile.h index fef58f5e9b1..69c4ec0f861 100644 --- a/gcc/dumpfile.h +++ b/gcc/dumpfile.h @@ -84,9 +84,10 @@ enum tree_dump_index #define TDF_SCEV (1 << 24) /* Dump SCEV details. */ #define TDF_COMMENT (1 << 25) /* Dump lines with prefix ";;" */ #define TDF_GIMPLE (1 << 26) /* Dump in GIMPLE FE syntax */ -#define MSG_OPTIMIZED_LOCATIONS (1 << 27) /* -fopt-info optimized sources */ -#define MSG_MISSED_OPTIMIZATION (1 << 28) /* missed opportunities */ -#define MSG_NOTE (1 << 29) /* general optimization info */ +#define TDF_FOLDING (1 << 27) /* Dump folding details. */ +#define MSG_OPTIMIZED_LOCATIONS (1 << 28) /* -fopt-info optimized sources */ +#define MSG_MISSED_OPTIMIZATION (1 << 29) /* missed opportunities */ +#define MSG_NOTE (1 << 30) /* general optimization info */ #define MSG_ALL (MSG_OPTIMIZED_LOCATIONS | MSG_MISSED_OPTIMIZATION \ | MSG_NOTE) diff --git a/gcc/genmatch.c b/gcc/genmatch.c index 5621aa05b59..979d6856084 100644 --- a/gcc/genmatch.c +++ b/gcc/genmatch.c @@ -3187,7 +3187,7 @@ dt_simplify::gen_1 (FILE *f, int indent, bool gimple, operand *result) } } - fprintf_indent (f, indent, "if (dump_file && (dump_flags & TDF_DETAILS)) " + fprintf_indent (f, indent, "if (dump_file && (dump_flags & TDF_FOLDING)) " "fprintf (dump_file, \"Applying pattern "); output_line_directive (f, result ? result->location : s->match->location, true); -- 2.12.2