Message ID | CAMe9rOrHzd163wbCh0c7cFbOHfCBpi-waZ_dNL4T396ZPFcHew@mail.gmail.com |
---|---|
State | New |
Headers | show |
On Mon, Jan 26, 2015 at 10:04 AM, H.J. Lu <hjl.tools@gmail.com> wrote: > On Sun, Jan 25, 2015 at 10:37 AM, Uros Bizjak <ubizjak@gmail.com> wrote: >> On Sun, Jan 25, 2015 at 7:23 PM, Uros Bizjak <ubizjak@gmail.com> wrote: >>> On Sat, Jan 24, 2015 at 11:49 AM, Allan Sandfeld Jensen >>> <allan@carewolf.com> wrote: >>>> On Saturday 24 January 2015, Uros Bizjak wrote: >>>>> On Mon, Jan 12, 2015 at 6:02 PM, Uros Bizjak <ubizjak@gmail.com> wrote: >>>>> > Hello! >>>>> > >>>>> >>> On Wed, Dec 31, 2014 at 01:28:47PM +0100, Allan Sandfeld Jensen wrote: >>>>> >>> > I recently wanted to use multiversioning for BMI2 specific extensions >>>>> >>> > PDEP/PEXT, and noticed it wasn't there. So I wrote this patch to add >>>>> >>> > it, and also added AES, F16C and BMI1 for completeness. >>>>> >>> >>>>> >>> AES nor F16C doesn't make any sense IMHO for multiversioning, you need >>>>> >>> special intrinsics for that anyway and when you use them, the function >>>>> >>> will fail to compile without those features. >>>>> >>> Multiversioning only makes sense for ISA features the compiler uses for >>>>> >>> normal C/C++ code without any intrinsics. >>>>> >> >>>>> >> Patch reduced to just adding BMI and BMI2 multiversioning: >>>>> > +2014-12-29 Allan Sandfeld Jensen <sandfeld@kde.org> >>>>> > + >>>>> > + * config/i386/i386.c (get_builtin_code_for_version): Add >>>>> > + support for BMI and BMI2 multiversion functions. >>>>> > >>>>> > +2014-12-29 Allan Sandfeld Jensen <sandfeld@kde.org> >>>>> > + >>>>> > + * gcc.target/i386/funcspec-5.c: Test new multiversion targets. >>>>> > + * g++.dg/ext/mv17.C: Test BMI/BMI2 multiversion dispatcher. >>>>> > >>>>> > +2014-12-29 Allan Sandfeld Jensen <sandfeld@kde.org> >>>>> > + >>>>> > + * config/i386/cpuinfo.c (enum processor_features): Add FEATURE_BMI and >>>>> > + FEATURE_BMI2. >>>>> > + (get_available_features): Detect FEATURE_BMI and FEATURE_BMI2. >>>>> > >>>>> > OK for mainline >>>>> >>>>> Allan, did you commit the patch to mainline? I don't see it in SVN logs. >>>>> >>>>> (If you don't have SVN commit access, please mention it in the patch >>>>> submission, so someone will commit the patch for you). >>>>> >>>> Sorry. I don't have SVN commit access. >>> >>> Committed with a bunch of fixes (e.g. missing fold_builtin_cpu part in >>> gcc/config/i386/i386.c, and mv17.C test didn't compile at all due to >>> missing parenthesis). >> >> ... and now with committed ChangeLog and patch. >> >> gcc/ChangeLog: >> >> * config/i386/i386.c (get_builtin_code_for_version): Add >> support for BMI and BMI2 multiversion functions. >> (fold_builtin_cpu): Add F_BMI and F_BMI2. >> >> libgcc/ChangeLog: >> >> * config/i386/cpuinfo.c (enum processor_features): Add FEATURE_BMI and >> FEATURE_BMI2. >> (get_available_features): Detect FEATURE_BMI and FEATURE_BMI2. >> >> testsuite/ChangeLog: >> >> * gcc.target/i386/funcspec-5.c: Test new multiversion targets. >> * g++.dg/ext/mv17.C: Test BMI/BMI2 multiversion dispatcher. >> > > diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c > index 9ec40cb..441911d 100644 > --- a/gcc/config/i386/i386.c > +++ b/gcc/config/i386/i386.c > @@ -34289,15 +34289,18 @@ get_builtin_code_for_version (tree decl, tree *predica > te_list) > P_PROC_SSE4_A, > P_SSE4_1, > P_SSE4_2, > - P_PROC_SSE4_2, > P_POPCNT, > + P_PROC_SSE4_2, > P_AVX, > P_PROC_AVX, > + P_BMI, > + P_PROC_BMI, > P_FMA4, > P_XOP, > P_PROC_XOP, > P_FMA, > P_PROC_FMA, > + P_BMI2, > P_AVX2, > P_PROC_AVX2, > P_AVX512F, > > This changed the priority of P_POPCNT and caused > > FAIL: g++.dg/ext/mv1.C -std=gnu++98 execution test > FAIL: g++.dg/ext/mv1.C -std=gnu++11 execution test > FAIL: g++.dg/ext/mv1.C -std=gnu++14 execution test > > on Nehalem and Westmere machines: > > mv1.exe: /export/gnu/import/git/sources/gcc/gcc/testsuite/g++.dg/ext/mv1.C:51: > int main(): Assertion `val == 5' failed. > > since "val" is 6 now. I opened: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64806
On Monday 26 January 2015, H.J. Lu wrote: > On Sun, Jan 25, 2015 at 10:37 AM, Uros Bizjak <ubizjak@gmail.com> wrote: > > On Sun, Jan 25, 2015 at 7:23 PM, Uros Bizjak <ubizjak@gmail.com> wrote: > >> On Sat, Jan 24, 2015 at 11:49 AM, Allan Sandfeld Jensen > >> > >> <allan@carewolf.com> wrote: > >>> On Saturday 24 January 2015, Uros Bizjak wrote: > >>>> On Mon, Jan 12, 2015 at 6:02 PM, Uros Bizjak <ubizjak@gmail.com> wrote: > >>>> > Hello! > >>>> > > >>>> >>> On Wed, Dec 31, 2014 at 01:28:47PM +0100, Allan Sandfeld Jensen wrote: > >>>> >>> > I recently wanted to use multiversioning for BMI2 specific > >>>> >>> > extensions PDEP/PEXT, and noticed it wasn't there. So I wrote > >>>> >>> > this patch to add it, and also added AES, F16C and BMI1 for > >>>> >>> > completeness. > >>>> >>> > >>>> >>> AES nor F16C doesn't make any sense IMHO for multiversioning, you > >>>> >>> need special intrinsics for that anyway and when you use them, > >>>> >>> the function will fail to compile without those features. > >>>> >>> Multiversioning only makes sense for ISA features the compiler > >>>> >>> uses for normal C/C++ code without any intrinsics. > >>>> >> > >>>> >> Patch reduced to just adding BMI and BMI2 multiversioning: > >>>> > +2014-12-29 Allan Sandfeld Jensen <sandfeld@kde.org> > >>>> > + > >>>> > + * config/i386/i386.c (get_builtin_code_for_version): Add > >>>> > + support for BMI and BMI2 multiversion functions. > >>>> > > >>>> > +2014-12-29 Allan Sandfeld Jensen <sandfeld@kde.org> > >>>> > + > >>>> > + * gcc.target/i386/funcspec-5.c: Test new multiversion targets. > >>>> > + * g++.dg/ext/mv17.C: Test BMI/BMI2 multiversion dispatcher. > >>>> > > >>>> > +2014-12-29 Allan Sandfeld Jensen <sandfeld@kde.org> > >>>> > + > >>>> > + * config/i386/cpuinfo.c (enum processor_features): Add FEATURE_BMI > >>>> > and + FEATURE_BMI2. > >>>> > + (get_available_features): Detect FEATURE_BMI and FEATURE_BMI2. > >>>> > > >>>> > OK for mainline > >>>> > >>>> Allan, did you commit the patch to mainline? I don't see it in SVN > >>>> logs. > >>>> > >>>> (If you don't have SVN commit access, please mention it in the patch > >>>> submission, so someone will commit the patch for you). > >>> > >>> Sorry. I don't have SVN commit access. > >> > >> Committed with a bunch of fixes (e.g. missing fold_builtin_cpu part in > >> gcc/config/i386/i386.c, and mv17.C test didn't compile at all due to > >> missing parenthesis). > > > > ... and now with committed ChangeLog and patch. > > > > gcc/ChangeLog: > > * config/i386/i386.c (get_builtin_code_for_version): Add > > support for BMI and BMI2 multiversion functions. > > (fold_builtin_cpu): Add F_BMI and F_BMI2. > > > > libgcc/ChangeLog: > > * config/i386/cpuinfo.c (enum processor_features): Add FEATURE_BMI > > and FEATURE_BMI2. > > (get_available_features): Detect FEATURE_BMI and FEATURE_BMI2. > > > > testsuite/ChangeLog: > > * gcc.target/i386/funcspec-5.c: Test new multiversion targets. > > * g++.dg/ext/mv17.C: Test BMI/BMI2 multiversion dispatcher. > > diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c > index 9ec40cb..441911d 100644 > --- a/gcc/config/i386/i386.c > +++ b/gcc/config/i386/i386.c > @@ -34289,15 +34289,18 @@ get_builtin_code_for_version (tree decl, tree > *predica te_list) > P_PROC_SSE4_A, > P_SSE4_1, > P_SSE4_2, > - P_PROC_SSE4_2, > P_POPCNT, > + P_PROC_SSE4_2, > P_AVX, > P_PROC_AVX, > + P_BMI, > + P_PROC_BMI, > P_FMA4, > P_XOP, > P_PROC_XOP, > P_FMA, > P_PROC_FMA, > + P_BMI2, > P_AVX2, > P_PROC_AVX2, > P_AVX512F, > > This changed the priority of P_POPCNT and caused > > FAIL: g++.dg/ext/mv1.C -std=gnu++98 execution test > FAIL: g++.dg/ext/mv1.C -std=gnu++11 execution test > FAIL: g++.dg/ext/mv1.C -std=gnu++14 execution test > > on Nehalem and Westmere machines: > > mv1.exe: > /export/gnu/import/git/sources/gcc/gcc/testsuite/g++.dg/ext/mv1.C:51: int > main(): Assertion `val == 5' failed. > > since "val" is 6 now. Right. I am not sure why popcnt was prioritized below arch=corei7. The logic is supposed to be that any target that includes an extension is prioritized above that extension. I included the change in the order in i386.c, but not the updated test. It is probably better to not change the order in i386.c, or at least change that in a separate commit. `Allan
On Mon, Jan 26, 2015 at 10:38 AM, Allan Sandfeld Jensen <allan@carewolf.com> wrote: > On Monday 26 January 2015, H.J. Lu wrote: >> On Sun, Jan 25, 2015 at 10:37 AM, Uros Bizjak <ubizjak@gmail.com> wrote: >> > On Sun, Jan 25, 2015 at 7:23 PM, Uros Bizjak <ubizjak@gmail.com> wrote: >> >> On Sat, Jan 24, 2015 at 11:49 AM, Allan Sandfeld Jensen >> >> >> >> <allan@carewolf.com> wrote: >> >>> On Saturday 24 January 2015, Uros Bizjak wrote: >> >>>> On Mon, Jan 12, 2015 at 6:02 PM, Uros Bizjak <ubizjak@gmail.com> wrote: >> >>>> > Hello! >> >>>> > >> >>>> >>> On Wed, Dec 31, 2014 at 01:28:47PM +0100, Allan Sandfeld Jensen > wrote: >> >>>> >>> > I recently wanted to use multiversioning for BMI2 specific >> >>>> >>> > extensions PDEP/PEXT, and noticed it wasn't there. So I wrote >> >>>> >>> > this patch to add it, and also added AES, F16C and BMI1 for >> >>>> >>> > completeness. >> >>>> >>> >> >>>> >>> AES nor F16C doesn't make any sense IMHO for multiversioning, you >> >>>> >>> need special intrinsics for that anyway and when you use them, >> >>>> >>> the function will fail to compile without those features. >> >>>> >>> Multiversioning only makes sense for ISA features the compiler >> >>>> >>> uses for normal C/C++ code without any intrinsics. >> >>>> >> >> >>>> >> Patch reduced to just adding BMI and BMI2 multiversioning: >> >>>> > +2014-12-29 Allan Sandfeld Jensen <sandfeld@kde.org> >> >>>> > + >> >>>> > + * config/i386/i386.c (get_builtin_code_for_version): Add >> >>>> > + support for BMI and BMI2 multiversion functions. >> >>>> > >> >>>> > +2014-12-29 Allan Sandfeld Jensen <sandfeld@kde.org> >> >>>> > + >> >>>> > + * gcc.target/i386/funcspec-5.c: Test new multiversion targets. >> >>>> > + * g++.dg/ext/mv17.C: Test BMI/BMI2 multiversion dispatcher. >> >>>> > >> >>>> > +2014-12-29 Allan Sandfeld Jensen <sandfeld@kde.org> >> >>>> > + >> >>>> > + * config/i386/cpuinfo.c (enum processor_features): Add FEATURE_BMI >> >>>> > and + FEATURE_BMI2. >> >>>> > + (get_available_features): Detect FEATURE_BMI and FEATURE_BMI2. >> >>>> > >> >>>> > OK for mainline >> >>>> >> >>>> Allan, did you commit the patch to mainline? I don't see it in SVN >> >>>> logs. >> >>>> >> >>>> (If you don't have SVN commit access, please mention it in the patch >> >>>> submission, so someone will commit the patch for you). >> >>> >> >>> Sorry. I don't have SVN commit access. >> >> >> >> Committed with a bunch of fixes (e.g. missing fold_builtin_cpu part in >> >> gcc/config/i386/i386.c, and mv17.C test didn't compile at all due to >> >> missing parenthesis). >> > >> > ... and now with committed ChangeLog and patch. >> > >> > gcc/ChangeLog: >> > * config/i386/i386.c (get_builtin_code_for_version): Add >> > support for BMI and BMI2 multiversion functions. >> > (fold_builtin_cpu): Add F_BMI and F_BMI2. >> > >> > libgcc/ChangeLog: >> > * config/i386/cpuinfo.c (enum processor_features): Add FEATURE_BMI >> > and FEATURE_BMI2. >> > (get_available_features): Detect FEATURE_BMI and FEATURE_BMI2. >> > >> > testsuite/ChangeLog: >> > * gcc.target/i386/funcspec-5.c: Test new multiversion targets. >> > * g++.dg/ext/mv17.C: Test BMI/BMI2 multiversion dispatcher. >> >> diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c >> index 9ec40cb..441911d 100644 >> --- a/gcc/config/i386/i386.c >> +++ b/gcc/config/i386/i386.c >> @@ -34289,15 +34289,18 @@ get_builtin_code_for_version (tree decl, tree >> *predica te_list) >> P_PROC_SSE4_A, >> P_SSE4_1, >> P_SSE4_2, >> - P_PROC_SSE4_2, >> P_POPCNT, >> + P_PROC_SSE4_2, >> P_AVX, >> P_PROC_AVX, >> + P_BMI, >> + P_PROC_BMI, >> P_FMA4, >> P_XOP, >> P_PROC_XOP, >> P_FMA, >> P_PROC_FMA, >> + P_BMI2, >> P_AVX2, >> P_PROC_AVX2, >> P_AVX512F, >> >> This changed the priority of P_POPCNT and caused >> >> FAIL: g++.dg/ext/mv1.C -std=gnu++98 execution test >> FAIL: g++.dg/ext/mv1.C -std=gnu++11 execution test >> FAIL: g++.dg/ext/mv1.C -std=gnu++14 execution test >> >> on Nehalem and Westmere machines: >> >> mv1.exe: >> /export/gnu/import/git/sources/gcc/gcc/testsuite/g++.dg/ext/mv1.C:51: int >> main(): Assertion `val == 5' failed. >> >> since "val" is 6 now. > > Right. I am not sure why popcnt was prioritized below arch=corei7. The logic > is supposed to be that any target that includes an extension is prioritized I don't understand your question. popcnt feature is separate from -march. Its priority has nothing to do with -march=corei7. > above that extension. I included the change in the order in i386.c, but not > the updated test. It is probably better to not change the order in i386.c, or > at least change that in a separate commit.
On Monday 26 January 2015, you wrote: > On Mon, Jan 26, 2015 at 10:38 AM, Allan Sandfeld Jensen > > <allan@carewolf.com> wrote: > > On Monday 26 January 2015, H.J. Lu wrote: > >> On Sun, Jan 25, 2015 at 10:37 AM, Uros Bizjak <ubizjak@gmail.com> wrote: > >> > On Sun, Jan 25, 2015 at 7:23 PM, Uros Bizjak <ubizjak@gmail.com> wrote: > >> >> On Sat, Jan 24, 2015 at 11:49 AM, Allan Sandfeld Jensen > >> >> > >> >> <allan@carewolf.com> wrote: > >> >>> On Saturday 24 January 2015, Uros Bizjak wrote: > >> >>>> On Mon, Jan 12, 2015 at 6:02 PM, Uros Bizjak <ubizjak@gmail.com> wrote: > >> >>>> > Hello! > >> >>>> > > >> >>>> >>> On Wed, Dec 31, 2014 at 01:28:47PM +0100, Allan Sandfeld Jensen > > > > wrote: > >> >>>> >>> > I recently wanted to use multiversioning for BMI2 specific > >> >>>> >>> > extensions PDEP/PEXT, and noticed it wasn't there. So I wrote > >> >>>> >>> > this patch to add it, and also added AES, F16C and BMI1 for > >> >>>> >>> > completeness. > >> >>>> >>> > >> >>>> >>> AES nor F16C doesn't make any sense IMHO for multiversioning, > >> >>>> >>> you need special intrinsics for that anyway and when you use > >> >>>> >>> them, the function will fail to compile without those > >> >>>> >>> features. Multiversioning only makes sense for ISA features > >> >>>> >>> the compiler uses for normal C/C++ code without any > >> >>>> >>> intrinsics. > >> >>>> >> > >> >>>> >> Patch reduced to just adding BMI and BMI2 multiversioning: > >> >>>> > +2014-12-29 Allan Sandfeld Jensen <sandfeld@kde.org> > >> >>>> > + > >> >>>> > + * config/i386/i386.c (get_builtin_code_for_version): Add > >> >>>> > + support for BMI and BMI2 multiversion functions. > >> >>>> > > >> >>>> > +2014-12-29 Allan Sandfeld Jensen <sandfeld@kde.org> > >> >>>> > + > >> >>>> > + * gcc.target/i386/funcspec-5.c: Test new multiversion targets. > >> >>>> > + * g++.dg/ext/mv17.C: Test BMI/BMI2 multiversion dispatcher. > >> >>>> > > >> >>>> > +2014-12-29 Allan Sandfeld Jensen <sandfeld@kde.org> > >> >>>> > + > >> >>>> > + * config/i386/cpuinfo.c (enum processor_features): Add > >> >>>> > FEATURE_BMI and + FEATURE_BMI2. > >> >>>> > + (get_available_features): Detect FEATURE_BMI and FEATURE_BMI2. > >> >>>> > > >> >>>> > OK for mainline > >> >>>> > >> >>>> Allan, did you commit the patch to mainline? I don't see it in SVN > >> >>>> logs. > >> >>>> > >> >>>> (If you don't have SVN commit access, please mention it in the > >> >>>> patch submission, so someone will commit the patch for you). > >> >>> > >> >>> Sorry. I don't have SVN commit access. > >> >> > >> >> Committed with a bunch of fixes (e.g. missing fold_builtin_cpu part > >> >> in gcc/config/i386/i386.c, and mv17.C test didn't compile at all due > >> >> to missing parenthesis). > >> > > >> > ... and now with committed ChangeLog and patch. > >> > > >> > gcc/ChangeLog: > >> > * config/i386/i386.c (get_builtin_code_for_version): Add > >> > support for BMI and BMI2 multiversion functions. > >> > (fold_builtin_cpu): Add F_BMI and F_BMI2. > >> > > >> > libgcc/ChangeLog: > >> > * config/i386/cpuinfo.c (enum processor_features): Add FEATURE_BMI > >> > and FEATURE_BMI2. > >> > (get_available_features): Detect FEATURE_BMI and FEATURE_BMI2. > >> > > >> > testsuite/ChangeLog: > >> > * gcc.target/i386/funcspec-5.c: Test new multiversion targets. > >> > * g++.dg/ext/mv17.C: Test BMI/BMI2 multiversion dispatcher. > >> > >> diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c > >> index 9ec40cb..441911d 100644 > >> --- a/gcc/config/i386/i386.c > >> +++ b/gcc/config/i386/i386.c > >> @@ -34289,15 +34289,18 @@ get_builtin_code_for_version (tree decl, tree > >> *predica te_list) > >> > >> P_PROC_SSE4_A, > >> P_SSE4_1, > >> P_SSE4_2, > >> > >> - P_PROC_SSE4_2, > >> > >> P_POPCNT, > >> > >> + P_PROC_SSE4_2, > >> > >> P_AVX, > >> P_PROC_AVX, > >> > >> + P_BMI, > >> + P_PROC_BMI, > >> > >> P_FMA4, > >> P_XOP, > >> P_PROC_XOP, > >> P_FMA, > >> P_PROC_FMA, > >> > >> + P_BMI2, > >> > >> P_AVX2, > >> P_PROC_AVX2, > >> P_AVX512F, > >> > >> This changed the priority of P_POPCNT and caused > >> > >> FAIL: g++.dg/ext/mv1.C -std=gnu++98 execution test > >> FAIL: g++.dg/ext/mv1.C -std=gnu++11 execution test > >> FAIL: g++.dg/ext/mv1.C -std=gnu++14 execution test > >> > >> on Nehalem and Westmere machines: > >> > >> mv1.exe: > >> /export/gnu/import/git/sources/gcc/gcc/testsuite/g++.dg/ext/mv1.C:51: > >> int main(): Assertion `val == 5' failed. > >> > >> since "val" is 6 now. > > > > Right. I am not sure why popcnt was prioritized below arch=corei7. The > > logic is supposed to be that any target that includes an extension is > > prioritized > > I don't understand your question. popcnt feature is separate from -march. > Its priority has nothing to do with -march=corei7. > arch=corei7 implies popcnt. See PTA_NEHALEM in i386.c. The test would probably work with -march=core2. AFAIK The logic of the priorities in multiversioning is that architecture specific functions are chosen over feature specific, unless the feature is one that isn't required by the architecture. `Allan
On Mon, Jan 26, 2015 at 10:53 AM, Allan Sandfeld Jensen <carewolf@gmail.com> wrote: >> >> >> >> >> >> Committed with a bunch of fixes (e.g. missing fold_builtin_cpu part >> >> >> in gcc/config/i386/i386.c, and mv17.C test didn't compile at all due >> >> >> to missing parenthesis). >> >> > >> >> > ... and now with committed ChangeLog and patch. >> >> > >> >> > gcc/ChangeLog: >> >> > * config/i386/i386.c (get_builtin_code_for_version): Add >> >> > support for BMI and BMI2 multiversion functions. >> >> > (fold_builtin_cpu): Add F_BMI and F_BMI2. >> >> > >> >> > libgcc/ChangeLog: >> >> > * config/i386/cpuinfo.c (enum processor_features): Add FEATURE_BMI >> >> > and FEATURE_BMI2. >> >> > (get_available_features): Detect FEATURE_BMI and FEATURE_BMI2. >> >> > >> >> > testsuite/ChangeLog: >> >> > * gcc.target/i386/funcspec-5.c: Test new multiversion targets. >> >> > * g++.dg/ext/mv17.C: Test BMI/BMI2 multiversion dispatcher. >> >> >> >> diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c >> >> index 9ec40cb..441911d 100644 >> >> --- a/gcc/config/i386/i386.c >> >> +++ b/gcc/config/i386/i386.c >> >> @@ -34289,15 +34289,18 @@ get_builtin_code_for_version (tree decl, tree >> >> *predica te_list) >> >> >> >> P_PROC_SSE4_A, >> >> P_SSE4_1, >> >> P_SSE4_2, >> >> >> >> - P_PROC_SSE4_2, >> >> >> >> P_POPCNT, >> >> >> >> + P_PROC_SSE4_2, >> >> >> >> P_AVX, >> >> P_PROC_AVX, >> >> >> >> + P_BMI, >> >> + P_PROC_BMI, >> >> >> >> P_FMA4, >> >> P_XOP, >> >> P_PROC_XOP, >> >> P_FMA, >> >> P_PROC_FMA, >> >> >> >> + P_BMI2, >> >> >> >> P_AVX2, >> >> P_PROC_AVX2, >> >> P_AVX512F, >> >> >> >> This changed the priority of P_POPCNT and caused >> >> >> >> FAIL: g++.dg/ext/mv1.C -std=gnu++98 execution test >> >> FAIL: g++.dg/ext/mv1.C -std=gnu++11 execution test >> >> FAIL: g++.dg/ext/mv1.C -std=gnu++14 execution test >> >> >> >> on Nehalem and Westmere machines: >> >> >> >> mv1.exe: >> >> /export/gnu/import/git/sources/gcc/gcc/testsuite/g++.dg/ext/mv1.C:51: >> >> int main(): Assertion `val == 5' failed. >> >> >> >> since "val" is 6 now. >> > >> > Right. I am not sure why popcnt was prioritized below arch=corei7. The >> > logic is supposed to be that any target that includes an extension is >> > prioritized >> >> I don't understand your question. popcnt feature is separate from -march. >> Its priority has nothing to do with -march=corei7. >> > arch=corei7 implies popcnt. See PTA_NEHALEM in i386.c. The test would probably > work with -march=core2. > > AFAIK The logic of the priorities in multiversioning is that architecture > specific functions are chosen over feature specific, unless the feature is one > that isn't required by the architecture. > On SSE4.2 machines, we should choose int __attribute__ ((target("arch=corei7"))) foo (); over int __attribute__ ((target("popcnt"))) foo (); But we shouldn't choose int __attribute__ ((target("arch=corei7"))) foo (); over int __attribute__ ((target("arch=corei7,popcnt"))) foo ();
On Monday 26 January 2015, H.J. Lu wrote: > On Mon, Jan 26, 2015 at 10:53 AM, Allan Sandfeld Jensen > > <carewolf@gmail.com> wrote: > >> >> >> Committed with a bunch of fixes (e.g. missing fold_builtin_cpu > >> >> >> part in gcc/config/i386/i386.c, and mv17.C test didn't compile at > >> >> >> all due to missing parenthesis). > >> >> > > >> >> > ... and now with committed ChangeLog and patch. > >> >> > > >> >> > gcc/ChangeLog: > >> >> > * config/i386/i386.c (get_builtin_code_for_version): Add > >> >> > support for BMI and BMI2 multiversion functions. > >> >> > (fold_builtin_cpu): Add F_BMI and F_BMI2. > >> >> > > >> >> > libgcc/ChangeLog: > >> >> > * config/i386/cpuinfo.c (enum processor_features): Add > >> >> > FEATURE_BMI and FEATURE_BMI2. > >> >> > (get_available_features): Detect FEATURE_BMI and FEATURE_BMI2. > >> >> > > >> >> > testsuite/ChangeLog: > >> >> > * gcc.target/i386/funcspec-5.c: Test new multiversion targets. > >> >> > * g++.dg/ext/mv17.C: Test BMI/BMI2 multiversion dispatcher. > >> >> > >> >> diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c > >> >> index 9ec40cb..441911d 100644 > >> >> --- a/gcc/config/i386/i386.c > >> >> +++ b/gcc/config/i386/i386.c > >> >> @@ -34289,15 +34289,18 @@ get_builtin_code_for_version (tree decl, > >> >> tree *predica te_list) > >> >> > >> >> P_PROC_SSE4_A, > >> >> P_SSE4_1, > >> >> P_SSE4_2, > >> >> > >> >> - P_PROC_SSE4_2, > >> >> > >> >> P_POPCNT, > >> >> > >> >> + P_PROC_SSE4_2, > >> >> > >> >> P_AVX, > >> >> P_PROC_AVX, > >> >> > >> >> + P_BMI, > >> >> + P_PROC_BMI, > >> >> > >> >> P_FMA4, > >> >> P_XOP, > >> >> P_PROC_XOP, > >> >> P_FMA, > >> >> P_PROC_FMA, > >> >> > >> >> + P_BMI2, > >> >> > >> >> P_AVX2, > >> >> P_PROC_AVX2, > >> >> P_AVX512F, > >> >> > >> >> This changed the priority of P_POPCNT and caused > >> >> > >> >> FAIL: g++.dg/ext/mv1.C -std=gnu++98 execution test > >> >> FAIL: g++.dg/ext/mv1.C -std=gnu++11 execution test > >> >> FAIL: g++.dg/ext/mv1.C -std=gnu++14 execution test > >> >> > >> >> on Nehalem and Westmere machines: > >> >> > >> >> mv1.exe: > >> >> /export/gnu/import/git/sources/gcc/gcc/testsuite/g++.dg/ext/mv1.C:51: > >> >> int main(): Assertion `val == 5' failed. > >> >> > >> >> since "val" is 6 now. > >> > > >> > Right. I am not sure why popcnt was prioritized below arch=corei7. The > >> > logic is supposed to be that any target that includes an extension is > >> > prioritized > >> > >> I don't understand your question. popcnt feature is separate from > >> -march. Its priority has nothing to do with -march=corei7. > > > > arch=corei7 implies popcnt. See PTA_NEHALEM in i386.c. The test would > > probably work with -march=core2. > > > > AFAIK The logic of the priorities in multiversioning is that architecture > > specific functions are chosen over feature specific, unless the feature > > is one that isn't required by the architecture. > > On SSE4.2 machines, we should choose > > int __attribute__ ((target("arch=corei7"))) foo (); > > over > > int __attribute__ ((target("popcnt"))) foo (); > > But we shouldn't choose > > int __attribute__ ((target("arch=corei7"))) foo (); > > over > > int __attribute__ ((target("arch=corei7,popcnt"))) foo (); I guess since they represent the exact same effective ISA, they would have equal priority, so that it would likely chose whatever comes last. `Allan
On Mon, Jan 26, 2015 at 11:08 AM, Allan Sandfeld Jensen <carewolf@gmail.com> wrote: > On Monday 26 January 2015, H.J. Lu wrote: >> On Mon, Jan 26, 2015 at 10:53 AM, Allan Sandfeld Jensen >> >> <carewolf@gmail.com> wrote: >> >> >> >> Committed with a bunch of fixes (e.g. missing fold_builtin_cpu >> >> >> >> part in gcc/config/i386/i386.c, and mv17.C test didn't compile at >> >> >> >> all due to missing parenthesis). >> >> >> > >> >> >> > ... and now with committed ChangeLog and patch. >> >> >> > >> >> >> > gcc/ChangeLog: >> >> >> > * config/i386/i386.c (get_builtin_code_for_version): Add >> >> >> > support for BMI and BMI2 multiversion functions. >> >> >> > (fold_builtin_cpu): Add F_BMI and F_BMI2. >> >> >> > >> >> >> > libgcc/ChangeLog: >> >> >> > * config/i386/cpuinfo.c (enum processor_features): Add >> >> >> > FEATURE_BMI and FEATURE_BMI2. >> >> >> > (get_available_features): Detect FEATURE_BMI and FEATURE_BMI2. >> >> >> > >> >> >> > testsuite/ChangeLog: >> >> >> > * gcc.target/i386/funcspec-5.c: Test new multiversion targets. >> >> >> > * g++.dg/ext/mv17.C: Test BMI/BMI2 multiversion dispatcher. >> >> >> >> >> >> diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c >> >> >> index 9ec40cb..441911d 100644 >> >> >> --- a/gcc/config/i386/i386.c >> >> >> +++ b/gcc/config/i386/i386.c >> >> >> @@ -34289,15 +34289,18 @@ get_builtin_code_for_version (tree decl, >> >> >> tree *predica te_list) >> >> >> >> >> >> P_PROC_SSE4_A, >> >> >> P_SSE4_1, >> >> >> P_SSE4_2, >> >> >> >> >> >> - P_PROC_SSE4_2, >> >> >> >> >> >> P_POPCNT, >> >> >> >> >> >> + P_PROC_SSE4_2, >> >> >> >> >> >> P_AVX, >> >> >> P_PROC_AVX, >> >> >> >> >> >> + P_BMI, >> >> >> + P_PROC_BMI, >> >> >> >> >> >> P_FMA4, >> >> >> P_XOP, >> >> >> P_PROC_XOP, >> >> >> P_FMA, >> >> >> P_PROC_FMA, >> >> >> >> >> >> + P_BMI2, >> >> >> >> >> >> P_AVX2, >> >> >> P_PROC_AVX2, >> >> >> P_AVX512F, >> >> >> >> >> >> This changed the priority of P_POPCNT and caused >> >> >> >> >> >> FAIL: g++.dg/ext/mv1.C -std=gnu++98 execution test >> >> >> FAIL: g++.dg/ext/mv1.C -std=gnu++11 execution test >> >> >> FAIL: g++.dg/ext/mv1.C -std=gnu++14 execution test >> >> >> >> >> >> on Nehalem and Westmere machines: >> >> >> >> >> >> mv1.exe: >> >> >> /export/gnu/import/git/sources/gcc/gcc/testsuite/g++.dg/ext/mv1.C:51: >> >> >> int main(): Assertion `val == 5' failed. >> >> >> >> >> >> since "val" is 6 now. >> >> > >> >> > Right. I am not sure why popcnt was prioritized below arch=corei7. The >> >> > logic is supposed to be that any target that includes an extension is >> >> > prioritized >> >> >> >> I don't understand your question. popcnt feature is separate from >> >> -march. Its priority has nothing to do with -march=corei7. >> > >> > arch=corei7 implies popcnt. See PTA_NEHALEM in i386.c. The test would >> > probably work with -march=core2. >> > >> > AFAIK The logic of the priorities in multiversioning is that architecture >> > specific functions are chosen over feature specific, unless the feature >> > is one that isn't required by the architecture. >> >> On SSE4.2 machines, we should choose >> >> int __attribute__ ((target("arch=corei7"))) foo (); >> >> over >> >> int __attribute__ ((target("popcnt"))) foo (); >> >> But we shouldn't choose >> >> int __attribute__ ((target("arch=corei7"))) foo (); >> >> over >> >> int __attribute__ ((target("arch=corei7,popcnt"))) foo (); > > I guess since they represent the exact same effective ISA, they would have > equal priority, so that it would likely chose whatever comes last. I have no strong opinion on this. But this is a user visible compiler behavior change. We should issue a warning/note here.
On Mon, Jan 26, 2015 at 8:15 PM, H.J. Lu <hjl.tools@gmail.com> wrote: > On Mon, Jan 26, 2015 at 11:08 AM, Allan Sandfeld Jensen > <carewolf@gmail.com> wrote: >> On Monday 26 January 2015, H.J. Lu wrote: >>> On Mon, Jan 26, 2015 at 10:53 AM, Allan Sandfeld Jensen >>> >>> <carewolf@gmail.com> wrote: >>> >> >> >> Committed with a bunch of fixes (e.g. missing fold_builtin_cpu >>> >> >> >> part in gcc/config/i386/i386.c, and mv17.C test didn't compile at >>> >> >> >> all due to missing parenthesis). >>> >> >> > >>> >> >> > ... and now with committed ChangeLog and patch. >>> >> >> > >>> >> >> > gcc/ChangeLog: >>> >> >> > * config/i386/i386.c (get_builtin_code_for_version): Add >>> >> >> > support for BMI and BMI2 multiversion functions. >>> >> >> > (fold_builtin_cpu): Add F_BMI and F_BMI2. >>> >> >> > >>> >> >> > libgcc/ChangeLog: >>> >> >> > * config/i386/cpuinfo.c (enum processor_features): Add >>> >> >> > FEATURE_BMI and FEATURE_BMI2. >>> >> >> > (get_available_features): Detect FEATURE_BMI and FEATURE_BMI2. >>> >> >> > >>> >> >> > testsuite/ChangeLog: >>> >> >> > * gcc.target/i386/funcspec-5.c: Test new multiversion targets. >>> >> >> > * g++.dg/ext/mv17.C: Test BMI/BMI2 multiversion dispatcher. >>> >> >> >>> >> >> diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c >>> >> >> index 9ec40cb..441911d 100644 >>> >> >> --- a/gcc/config/i386/i386.c >>> >> >> +++ b/gcc/config/i386/i386.c >>> >> >> @@ -34289,15 +34289,18 @@ get_builtin_code_for_version (tree decl, >>> >> >> tree *predica te_list) >>> >> >> >>> >> >> P_PROC_SSE4_A, >>> >> >> P_SSE4_1, >>> >> >> P_SSE4_2, >>> >> >> >>> >> >> - P_PROC_SSE4_2, >>> >> >> >>> >> >> P_POPCNT, >>> >> >> >>> >> >> + P_PROC_SSE4_2, >>> >> >> >>> >> >> P_AVX, >>> >> >> P_PROC_AVX, >>> >> >> >>> >> >> + P_BMI, >>> >> >> + P_PROC_BMI, >>> >> >> >>> >> >> P_FMA4, >>> >> >> P_XOP, >>> >> >> P_PROC_XOP, >>> >> >> P_FMA, >>> >> >> P_PROC_FMA, >>> >> >> >>> >> >> + P_BMI2, >>> >> >> >>> >> >> P_AVX2, >>> >> >> P_PROC_AVX2, >>> >> >> P_AVX512F, >>> >> >> >>> >> >> This changed the priority of P_POPCNT and caused >>> >> >> >>> >> >> FAIL: g++.dg/ext/mv1.C -std=gnu++98 execution test >>> >> >> FAIL: g++.dg/ext/mv1.C -std=gnu++11 execution test >>> >> >> FAIL: g++.dg/ext/mv1.C -std=gnu++14 execution test >>> >> >> >>> >> >> on Nehalem and Westmere machines: >>> >> >> >>> >> >> mv1.exe: >>> >> >> /export/gnu/import/git/sources/gcc/gcc/testsuite/g++.dg/ext/mv1.C:51: >>> >> >> int main(): Assertion `val == 5' failed. >>> >> >> >>> >> >> since "val" is 6 now. >>> >> > >>> >> > Right. I am not sure why popcnt was prioritized below arch=corei7. The >>> >> > logic is supposed to be that any target that includes an extension is >>> >> > prioritized >>> >> >>> >> I don't understand your question. popcnt feature is separate from >>> >> -march. Its priority has nothing to do with -march=corei7. >>> > >>> > arch=corei7 implies popcnt. See PTA_NEHALEM in i386.c. The test would >>> > probably work with -march=core2. >>> > >>> > AFAIK The logic of the priorities in multiversioning is that architecture >>> > specific functions are chosen over feature specific, unless the feature >>> > is one that isn't required by the architecture. >>> >>> On SSE4.2 machines, we should choose >>> >>> int __attribute__ ((target("arch=corei7"))) foo (); >>> >>> over >>> >>> int __attribute__ ((target("popcnt"))) foo (); >>> >>> But we shouldn't choose >>> >>> int __attribute__ ((target("arch=corei7"))) foo (); >>> >>> over >>> >>> int __attribute__ ((target("arch=corei7,popcnt"))) foo (); >> >> I guess since they represent the exact same effective ISA, they would have >> equal priority, so that it would likely chose whatever comes last. > > I have no strong opinion on this. But this is a user visible compiler > behavior change. We should issue a warning/note here. IMO, this is such a small (but useful) change, that we can get away without warning. Uros.
On Monday 26 January 2015, H.J. Lu wrote: > On Mon, Jan 26, 2015 at 11:08 AM, Allan Sandfeld Jensen > > <carewolf@gmail.com> wrote: > > On Monday 26 January 2015, H.J. Lu wrote: > >> On Mon, Jan 26, 2015 at 10:53 AM, Allan Sandfeld Jensen > >> > >> <carewolf@gmail.com> wrote: > >> >> >> >> Committed with a bunch of fixes (e.g. missing fold_builtin_cpu > >> >> >> >> part in gcc/config/i386/i386.c, and mv17.C test didn't compile > >> >> >> >> at all due to missing parenthesis). > >> >> >> > > >> >> >> > ... and now with committed ChangeLog and patch. > >> >> >> > > >> >> >> > gcc/ChangeLog: > >> >> >> > * config/i386/i386.c (get_builtin_code_for_version): Add > >> >> >> > support for BMI and BMI2 multiversion functions. > >> >> >> > (fold_builtin_cpu): Add F_BMI and F_BMI2. > >> >> >> > > >> >> >> > libgcc/ChangeLog: > >> >> >> > * config/i386/cpuinfo.c (enum processor_features): Add > >> >> >> > FEATURE_BMI and FEATURE_BMI2. > >> >> >> > (get_available_features): Detect FEATURE_BMI and > >> >> >> > FEATURE_BMI2. > >> >> >> > > >> >> >> > testsuite/ChangeLog: > >> >> >> > * gcc.target/i386/funcspec-5.c: Test new multiversion > >> >> >> > targets. * g++.dg/ext/mv17.C: Test BMI/BMI2 multiversion > >> >> >> > dispatcher. > >> >> >> > >> >> >> diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c > >> >> >> index 9ec40cb..441911d 100644 > >> >> >> --- a/gcc/config/i386/i386.c > >> >> >> +++ b/gcc/config/i386/i386.c > >> >> >> @@ -34289,15 +34289,18 @@ get_builtin_code_for_version (tree decl, > >> >> >> tree *predica te_list) > >> >> >> > >> >> >> P_PROC_SSE4_A, > >> >> >> P_SSE4_1, > >> >> >> P_SSE4_2, > >> >> >> > >> >> >> - P_PROC_SSE4_2, > >> >> >> > >> >> >> P_POPCNT, > >> >> >> > >> >> >> + P_PROC_SSE4_2, > >> >> >> > >> >> >> P_AVX, > >> >> >> P_PROC_AVX, > >> >> >> > >> >> >> + P_BMI, > >> >> >> + P_PROC_BMI, > >> >> >> > >> >> >> P_FMA4, > >> >> >> P_XOP, > >> >> >> P_PROC_XOP, > >> >> >> P_FMA, > >> >> >> P_PROC_FMA, > >> >> >> > >> >> >> + P_BMI2, > >> >> >> > >> >> >> P_AVX2, > >> >> >> P_PROC_AVX2, > >> >> >> P_AVX512F, > >> >> >> > >> >> >> This changed the priority of P_POPCNT and caused > >> >> >> > >> >> >> FAIL: g++.dg/ext/mv1.C -std=gnu++98 execution test > >> >> >> FAIL: g++.dg/ext/mv1.C -std=gnu++11 execution test > >> >> >> FAIL: g++.dg/ext/mv1.C -std=gnu++14 execution test > >> >> >> > >> >> >> on Nehalem and Westmere machines: > >> >> >> > >> >> >> mv1.exe: > >> >> >> /export/gnu/import/git/sources/gcc/gcc/testsuite/g++.dg/ext/mv1.C: > >> >> >> 51: int main(): Assertion `val == 5' failed. > >> >> >> > >> >> >> since "val" is 6 now. > >> >> > > >> >> > Right. I am not sure why popcnt was prioritized below arch=corei7. > >> >> > The logic is supposed to be that any target that includes an > >> >> > extension is prioritized > >> >> > >> >> I don't understand your question. popcnt feature is separate from > >> >> -march. Its priority has nothing to do with -march=corei7. > >> > > >> > arch=corei7 implies popcnt. See PTA_NEHALEM in i386.c. The test would > >> > probably work with -march=core2. > >> > > >> > AFAIK The logic of the priorities in multiversioning is that > >> > architecture specific functions are chosen over feature specific, > >> > unless the feature is one that isn't required by the architecture. > >> > >> On SSE4.2 machines, we should choose > >> > >> int __attribute__ ((target("arch=corei7"))) foo (); > >> > >> over > >> > >> int __attribute__ ((target("popcnt"))) foo (); > >> > >> But we shouldn't choose > >> > >> int __attribute__ ((target("arch=corei7"))) foo (); > >> > >> over > >> > >> int __attribute__ ((target("arch=corei7,popcnt"))) foo (); > > > > I guess since they represent the exact same effective ISA, they would > > have equal priority, so that it would likely chose whatever comes last. > > I have no strong opinion on this. But this is a user visible compiler > behavior change. We should issue a warning/note here. Yes, or revert that part of the patch. It should have been a separate patch anyway. `Allan
On Mon, Jan 26, 2015 at 8:22 PM, Allan Sandfeld Jensen <carewolf@gmail.com> wrote: >> > I guess since they represent the exact same effective ISA, they would >> > have equal priority, so that it would likely chose whatever comes last. >> >> I have no strong opinion on this. But this is a user visible compiler >> behavior change. We should issue a warning/note here. > > Yes, or revert that part of the patch. It should have been a separate patch > anyway. Agreed. HJ, can you please revert this part? The patch from PR is OK. Thanks, Uros.
diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c index 9ec40cb..441911d 100644 --- a/gcc/config/i386/i386.c +++ b/gcc/config/i386/i386.c @@ -34289,15 +34289,18 @@ get_builtin_code_for_version (tree decl, tree *predica te_list) P_PROC_SSE4_A, P_SSE4_1, P_SSE4_2, - P_PROC_SSE4_2, P_POPCNT, + P_PROC_SSE4_2, P_AVX, P_PROC_AVX, + P_BMI, + P_PROC_BMI, P_FMA4, P_XOP, P_PROC_XOP, P_FMA, P_PROC_FMA, + P_BMI2, P_AVX2, P_PROC_AVX2,