Message ID | 20121002193519.GA29567@gmail.com |
---|---|
State | New |
On Tue, Oct 2, 2012 at 9:35 PM, H.J. Lu <hjl.tools@gmail.com> wrote:

> This patch checks SSE and YMM state support for -march=native. Tested
> on Linux/x86-64. OK to install?
>
> 2012-10-02 H.J. Lu <hongjiu.lu@intel.com>
>
> PR target/54741
> * config/i386/driver-i386.c (XCR_XFEATURE_ENABLED_MASK): New.
> (XSTATE_FP): Likewise.
> (XSTATE_SSE): Likewise.
> (XSTATE_YMM): Likewise.
> (host_detect_local_cpu): Disable AVX, AVX2, FMA, FMA4 and XOP if
> SSE and YMM states aren't supported.

OK for mainline and release branches.

Thanks,
Uros.
On Tue, Oct 2, 2012 at 12:35 PM, H.J. Lu <hjl.tools@gmail.com> wrote:

> Hi,
>
> This patch checks SSE and YMM state support for -march=native. Tested
> on Linux/x86-64. OK to install?
>
> Thanks.
>
>
> H.J.
> ---
> 2012-10-02 H.J. Lu <hongjiu.lu@intel.com>
>
> PR target/54741
> * config/i386/driver-i386.c (XCR_XFEATURE_ENABLED_MASK): New.
> (XSTATE_FP): Likewise.
> (XSTATE_SSE): Likewise.
> (XSTATE_YMM): Likewise.
> (host_detect_local_cpu): Disable AVX, AVX2, FMA, FMA4 and XOP if
> SSE and YMM states aren't supported.
>
> diff --git a/gcc/config/i386/driver-i386.c b/gcc/config/i386/driver-i386.c
> index bda4e02..4dffc51 100644
> --- a/gcc/config/i386/driver-i386.c
> +++ b/gcc/config/i386/driver-i386.c
> @@ -390,6 +390,7 @@ const char *host_detect_local_cpu (int argc, const char **argv)
>    unsigned int has_hle = 0, has_rtm = 0;
>    unsigned int has_rdrnd = 0, has_f16c = 0, has_fsgsbase = 0;
>    unsigned int has_rdseed = 0, has_prfchw = 0, has_adx = 0;
> +  unsigned int has_osxsave = 0;
>
>    bool arch;
>
> @@ -431,6 +432,7 @@ const char *host_detect_local_cpu (int argc, const char **argv)
>    has_sse4_1 = ecx & bit_SSE4_1;
>    has_sse4_2 = ecx & bit_SSE4_2;
>    has_avx = ecx & bit_AVX;
> +  has_osxsave = ecx & bit_OSXSAVE;
>    has_cmpxchg16b = ecx & bit_CMPXCHG16B;
>    has_movbe = ecx & bit_MOVBE;
>    has_popcnt = ecx & bit_POPCNT;
> @@ -460,6 +462,26 @@ const char *host_detect_local_cpu (int argc, const char **argv)
>        has_adx = ebx & bit_ADX;
>      }
>
> +  /* Get XCR_XFEATURE_ENABLED_MASK register with xgetbv. */
> +#define XCR_XFEATURE_ENABLED_MASK 0x0
> +#define XSTATE_FP 0x1
> +#define XSTATE_SSE 0x2
> +#define XSTATE_YMM 0x4
> +  if (has_osxsave)
> +    asm (".byte 0x0f; .byte 0x01; .byte 0xd0"
> +         : "=a" (eax), "=d" (edx)
> +         : "c" (XCR_XFEATURE_ENABLED_MASK));
> +
> +  /* Check if SSE and YMM states are supported. */
> +  if ((eax & (XSTATE_SSE | XSTATE_YMM)) == (XSTATE_SSE | XSTATE_YMM))
> +    {
> +      has_avx = 0;
> +      has_avx2 = 0;
> +      has_fma = 0;
> +      has_fma4 = 0;
> +      has_xop = 0;
> +    }
> +
>    /* Check cpuid level of extended features. */
>    __cpuid (0x80000000, ext_level, ebx, ecx, edx);
>

This is very embarrassing. Thanks to Andrew, I checked his fix

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54741#c18

into trunk, 4.6 and 4.7 branches.
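For reference, the check that the ChangeLog entry describes (disable AVX, AVX2, FMA, FMA4 and XOP if SSE and YMM states aren't supported) can be sketched as a small predicate. Note that the hunk as posted compares with `==`, so it clears the flags exactly when both states are enabled. The sketch below is only an illustration and is not claimed to be the committed fix: the helper name `avx_state_enabled` and the macro `XSTATE_AVX_MASK` are hypothetical, and treating a clear OSXSAVE bit as "states not enabled" is an assumption that the posted patch does not make.

```c
#include <stdbool.h>

/* XCR0 state bits, named as in the patch.  */
#define XSTATE_SSE 0x2
#define XSTATE_YMM 0x4

/* Hypothetical convenience mask, not part of the patch.  */
#define XSTATE_AVX_MASK (XSTATE_SSE | XSTATE_YMM)

/* Return true only if the OS has set OSXSAVE and has enabled both
   SSE and YMM state in the low 32 bits of XCR0.
   Assumption: without OSXSAVE the register cannot be read, so the
   AVX family is treated as unusable.  */
static bool
avx_state_enabled (unsigned int has_osxsave, unsigned int xcr0_low)
{
  if (!has_osxsave)
    return false;
  return (xcr0_low & XSTATE_AVX_MASK) == XSTATE_AVX_MASK;
}
```

A caller in the style of host_detect_local_cpu would clear has_avx, has_avx2, has_fma, has_fma4 and has_xop when this predicate returns false.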
diff --git a/gcc/config/i386/driver-i386.c b/gcc/config/i386/driver-i386.c
index bda4e02..4dffc51 100644
--- a/gcc/config/i386/driver-i386.c
+++ b/gcc/config/i386/driver-i386.c
@@ -390,6 +390,7 @@ const char *host_detect_local_cpu (int argc, const char **argv)
   unsigned int has_hle = 0, has_rtm = 0;
   unsigned int has_rdrnd = 0, has_f16c = 0, has_fsgsbase = 0;
   unsigned int has_rdseed = 0, has_prfchw = 0, has_adx = 0;
+  unsigned int has_osxsave = 0;
 
   bool arch;
 
@@ -431,6 +432,7 @@ const char *host_detect_local_cpu (int argc, const char **argv)
   has_sse4_1 = ecx & bit_SSE4_1;
   has_sse4_2 = ecx & bit_SSE4_2;
   has_avx = ecx & bit_AVX;
+  has_osxsave = ecx & bit_OSXSAVE;
   has_cmpxchg16b = ecx & bit_CMPXCHG16B;
   has_movbe = ecx & bit_MOVBE;
   has_popcnt = ecx & bit_POPCNT;
@@ -460,6 +462,26 @@ const char *host_detect_local_cpu (int argc, const char **argv)
       has_adx = ebx & bit_ADX;
     }
 
+  /* Get XCR_XFEATURE_ENABLED_MASK register with xgetbv. */
+#define XCR_XFEATURE_ENABLED_MASK 0x0
+#define XSTATE_FP 0x1
+#define XSTATE_SSE 0x2
+#define XSTATE_YMM 0x4
+  if (has_osxsave)
+    asm (".byte 0x0f; .byte 0x01; .byte 0xd0"
+         : "=a" (eax), "=d" (edx)
+         : "c" (XCR_XFEATURE_ENABLED_MASK));
+
+  /* Check if SSE and YMM states are supported. */
+  if ((eax & (XSTATE_SSE | XSTATE_YMM)) == (XSTATE_SSE | XSTATE_YMM))
+    {
+      has_avx = 0;
+      has_avx2 = 0;
+      has_fma = 0;
+      has_fma4 = 0;
+      has_xop = 0;
+    }
+
   /* Check cpuid level of extended features. */
   __cpuid (0x80000000, ext_level, ebx, ecx, edx);
 
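To try the XGETBV read outside the driver, a minimal standalone sketch might look like the following. It assumes GCC or Clang on x86 with `<cpuid.h>` available; the helper name `read_xcr0_low` and the local `OSXSAVE_BIT` macro are hypothetical and not part of the patch, and returning 0 when XGETBV is unavailable is a choice made here for illustration. On other targets the function is stubbed out so the file still compiles.

```c
#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
#include <cpuid.h>

/* CPUID leaf 1, ECX bit 27: the OS has enabled XSAVE/XGETBV.
   Hypothetical local name; the patch uses bit_OSXSAVE.  */
#define OSXSAVE_BIT (1u << 27)

/* Return the low 32 bits of XCR0, or 0 if XGETBV cannot be used.  */
static unsigned int
read_xcr0_low (void)
{
  unsigned int eax, ebx, ecx, edx;

  /* __get_cpuid returns 0 if the requested leaf is not supported.  */
  if (!__get_cpuid (1, &eax, &ebx, &ecx, &edx))
    return 0;

  /* Executing xgetbv without OSXSAVE would fault, so bail out.  */
  if (!(ecx & OSXSAVE_BIT))
    return 0;

  /* xgetbv with ECX = 0, emitted as raw bytes as in the patch so
     that older assemblers accept it.  */
  __asm__ volatile (".byte 0x0f; .byte 0x01; .byte 0xd0"
                    : "=a" (eax), "=d" (edx)
                    : "c" (0));
  return eax;
}
#else
/* Stub for non-x86 targets: no XCR0 to read.  */
static unsigned int
read_xcr0_low (void)
{
  return 0;
}
#endif
```

When the register is readable, bit 0 (XSTATE_FP in the patch) is always set, so a result of 0 can only mean that XGETBV was not available.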