Patchwork PATCH: PR target/54741: Check SSE and YMM state support for -march=native

Submitter H.J. Lu
Date Oct. 2, 2012, 7:35 p.m.
Message ID <20121002193519.GA29567@gmail.com>
Permalink /patch/188645/
State New

Comments

H.J. Lu - Oct. 2, 2012, 7:35 p.m.
Hi,

This patch checks SSE and YMM state support for -march=native.  Tested
on Linux/x86-64.  OK to install?

Thanks.


H.J.
---
2012-10-02  H.J. Lu  <hongjiu.lu@intel.com>

	PR target/54741
	* config/i386/driver-i386.c (XCR_XFEATURE_ENABLED_MASK): New.
	(XSTATE_FP): Likewise.
	(XSTATE_SSE): Likewise.
	(XSTATE_YMM): Likewise.
	(host_detect_local_cpu): Disable AVX, AVX2, FMA, FMA4 and XOP if
	SSE and YMM states aren't supported.
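
The rule stated in the ChangeLog can be sketched as a small, hardware-independent predicate. This is only an illustration under stated assumptions: the XSTATE_* values are the ones the patch defines, while `struct avx_family`, `ymm_state_enabled` and `mask_avx_family` are hypothetical names that do not exist in driver-i386.c. The sketch assumes the AVX-family features are usable only when OSXSAVE is set and both the SSE and YMM bits are enabled in the low half of XCR0.

```c
#include <stdbool.h>
#include <stdint.h>

/* XCR0 state bits, with the same values the patch defines.  */
#define XSTATE_FP   0x1
#define XSTATE_SSE  0x2
#define XSTATE_YMM  0x4

/* Hypothetical container for the features the ChangeLog lists;
   not part of driver-i386.c.  */
struct avx_family
{
  bool avx, avx2, fma, fma4, xop;
};

/* Return true only if XGETBV is usable (OSXSAVE set) and the OS has
   enabled both SSE and YMM state in the low 32 bits of XCR0.  */
static bool
ymm_state_enabled (bool has_osxsave, uint32_t xcr0_lo)
{
  const uint32_t need = XSTATE_SSE | XSTATE_YMM;
  return has_osxsave && (xcr0_lo & need) == need;
}

/* Clear every AVX-dependent feature when YMM state is not enabled;
   leave the flags untouched otherwise.  */
static void
mask_avx_family (struct avx_family *f, bool has_osxsave, uint32_t xcr0_lo)
{
  if (!ymm_state_enabled (has_osxsave, xcr0_lo))
    f->avx = f->avx2 = f->fma = f->fma4 = f->xop = false;
}
```

With an XCR0 value of 0x7 (FP, SSE and YMM all enabled) the flags are kept; with 0x3 (no YMM state), or with OSXSAVE clear, all five flags are cleared.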

Uros Bizjak - Oct. 2, 2012, 7:39 p.m.
On Tue, Oct 2, 2012 at 9:35 PM, H.J. Lu <hjl.tools@gmail.com> wrote:

> This patch checks SSE and YMM state support for -march=native.  Tested
> on Linux/x86-64.  OK to install?
>
> 2012-10-02  H.J. Lu  <hongjiu.lu@intel.com>
>
>         PR target/54741
>         *  config/i386/driver-i386.c (XCR_XFEATURE_ENABLED_MASK): New.
>         (XSTATE_FP): Likewise.
>         (XSTATE_SSE): Likewise.
>         (XSTATE_YMM): Likewise.
>         (host_detect_local_cpu): Disable AVX, AVX2, FMA, FMA4 and XOP if
>         SSE and YMM states aren't supported.

OK for mainline and release branches.

Thanks,
Uros.

H.J. Lu - Oct. 3, 2012, 5:55 p.m.
On Tue, Oct 2, 2012 at 12:35 PM, H.J. Lu <hjl.tools@gmail.com> wrote:
> Hi,
>
> This patch checks SSE and YMM state support for -march=native.  Tested
> on Linux/x86-64.  OK to install?
>
> Thanks.
>
>
> H.J.
> ---
> 2012-10-02  H.J. Lu  <hongjiu.lu@intel.com>
>
>         PR target/54741
>         *  config/i386/driver-i386.c (XCR_XFEATURE_ENABLED_MASK): New.
>         (XSTATE_FP): Likewise.
>         (XSTATE_SSE): Likewise.
>         (XSTATE_YMM): Likewise.
>         (host_detect_local_cpu): Disable AVX, AVX2, FMA, FMA4 and XOP if
>         SSE and YMM states aren't supported.
>
> diff --git a/gcc/config/i386/driver-i386.c b/gcc/config/i386/driver-i386.c
> index bda4e02..4dffc51 100644
> --- a/gcc/config/i386/driver-i386.c
> +++ b/gcc/config/i386/driver-i386.c
> @@ -390,6 +390,7 @@ const char *host_detect_local_cpu (int argc, const char **argv)
>    unsigned int has_hle = 0, has_rtm = 0;
>    unsigned int has_rdrnd = 0, has_f16c = 0, has_fsgsbase = 0;
>    unsigned int has_rdseed = 0, has_prfchw = 0, has_adx = 0;
> +  unsigned int has_osxsave = 0;
>
>    bool arch;
>
> @@ -431,6 +432,7 @@ const char *host_detect_local_cpu (int argc, const char **argv)
>    has_sse4_1 = ecx & bit_SSE4_1;
>    has_sse4_2 = ecx & bit_SSE4_2;
>    has_avx = ecx & bit_AVX;
> +  has_osxsave = ecx & bit_OSXSAVE;
>    has_cmpxchg16b = ecx & bit_CMPXCHG16B;
>    has_movbe = ecx & bit_MOVBE;
>    has_popcnt = ecx & bit_POPCNT;
> @@ -460,6 +462,26 @@ const char *host_detect_local_cpu (int argc, const char **argv)
>        has_adx = ebx & bit_ADX;
>      }
>
> +  /* Get XCR_XFEATURE_ENABLED_MASK register with xgetbv.  */
> +#define XCR_XFEATURE_ENABLED_MASK      0x0
> +#define XSTATE_FP                      0x1
> +#define XSTATE_SSE                     0x2
> +#define XSTATE_YMM                     0x4
> +  if (has_osxsave)
> +    asm (".byte 0x0f; .byte 0x01; .byte 0xd0"
> +        : "=a" (eax), "=d" (edx)
> +        : "c" (XCR_XFEATURE_ENABLED_MASK));
> +
> +  /* Check if SSE and YMM states are supported.  */
> +  if ((eax & (XSTATE_SSE | XSTATE_YMM)) == (XSTATE_SSE | XSTATE_YMM))
> +    {
> +      has_avx = 0;
> +      has_avx2 = 0;
> +      has_fma = 0;
> +      has_fma4 = 0;
> +      has_xop = 0;
> +    }
> +
>    /* Check cpuid level of extended features.  */
>    __cpuid (0x80000000, ext_level, ebx, ecx, edx);
>

This is very embarrassing.  Thanks to Andrew, I checked his fix

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54741#c18

into trunk and the 4.6 and 4.7 branches.

Patch

diff --git a/gcc/config/i386/driver-i386.c b/gcc/config/i386/driver-i386.c
index bda4e02..4dffc51 100644
--- a/gcc/config/i386/driver-i386.c
+++ b/gcc/config/i386/driver-i386.c
@@ -390,6 +390,7 @@  const char *host_detect_local_cpu (int argc, const char **argv)
   unsigned int has_hle = 0, has_rtm = 0;
   unsigned int has_rdrnd = 0, has_f16c = 0, has_fsgsbase = 0;
   unsigned int has_rdseed = 0, has_prfchw = 0, has_adx = 0;
+  unsigned int has_osxsave = 0;
 
   bool arch;
 
@@ -431,6 +432,7 @@  const char *host_detect_local_cpu (int argc, const char **argv)
   has_sse4_1 = ecx & bit_SSE4_1;
   has_sse4_2 = ecx & bit_SSE4_2;
   has_avx = ecx & bit_AVX;
+  has_osxsave = ecx & bit_OSXSAVE;
   has_cmpxchg16b = ecx & bit_CMPXCHG16B;
   has_movbe = ecx & bit_MOVBE;
   has_popcnt = ecx & bit_POPCNT;
@@ -460,6 +462,27 @@  const char *host_detect_local_cpu (int argc, const char **argv)
       has_adx = ebx & bit_ADX;
     }
 
+  /* Get XCR_XFEATURE_ENABLED_MASK register with xgetbv.  */
+#define XCR_XFEATURE_ENABLED_MASK	0x0
+#define XSTATE_FP			0x1
+#define XSTATE_SSE			0x2
+#define XSTATE_YMM			0x4
+  if (has_osxsave)
+    asm (".byte 0x0f; .byte 0x01; .byte 0xd0"
+	 : "=a" (eax), "=d" (edx)
+	 : "c" (XCR_XFEATURE_ENABLED_MASK));
+
+  /* Check if SSE and YMM states are supported.  */
+  if (!has_osxsave
+      || (eax & (XSTATE_SSE | XSTATE_YMM)) != (XSTATE_SSE | XSTATE_YMM))
+    {
+      has_avx = 0;
+      has_avx2 = 0;
+      has_fma = 0;
+      has_fma4 = 0;
+      has_xop = 0;
+    }
+
   /* Check cpuid level of extended features.  */
   __cpuid (0x80000000, ext_level, ebx, ecx, edx);
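
For readers who want to try the detection outside the GCC driver, here is a minimal standalone sketch of the same sequence: read CPUID leaf 1, test OSXSAVE, then read XCR0 with xgetbv. It is not the driver's code. `host_ymm_state_enabled` is a hypothetical name, the OSXSAVE bit is written as a literal (bit 27 of ECX) rather than relying on a header macro, and the sketch assumes a GCC- or Clang-compatible compiler that provides `<cpuid.h>`. On non-x86 targets it simply reports 0.

```c
#include <stdint.h>

#if defined(__x86_64__) || defined(__i386__)
#include <cpuid.h>
#endif

/* Return 1 if the OS has enabled both SSE and YMM state in XCR0,
   0 otherwise (including on non-x86 hosts).  Hypothetical helper,
   not part of driver-i386.c.  */
static int
host_ymm_state_enabled (void)
{
#if defined(__x86_64__) || defined(__i386__)
  unsigned int eax, ebx, ecx, edx;

  /* CPUID leaf 1: ECX bit 27 is OSXSAVE.  */
  if (!__get_cpuid (1, &eax, &ebx, &ecx, &edx))
    return 0;

  /* Without OSXSAVE, executing xgetbv would raise #UD.  */
  if (!(ecx & (1u << 27)))
    return 0;

  /* xgetbv with ECX = 0 reads XCR0 into EDX:EAX.  The instruction is
     emitted as raw bytes, as in the patch, so that assemblers without
     the mnemonic still accept it.  */
  __asm__ volatile (".byte 0x0f, 0x01, 0xd0"
		    : "=a" (eax), "=d" (edx)
		    : "c" (0));

  /* 0x2 is the SSE state bit, 0x4 the YMM state bit.  */
  return (eax & 0x6) == 0x6;
#else
  return 0;
#endif
}
```

A caller would treat a result of 0 as "do not enable AVX, AVX2, FMA, FMA4 or XOP", whatever the CPUID feature bits say.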