x86: Set Prefer_No_VZEROUPPER if AVX512ER is available

Submitted by H.J. Lu on April 17, 2017, 7:54 p.m.

Details

Message ID 20170417195412.GA8487@intel.com
State New
Commit Message

H.J. Lu April 17, 2017, 7:54 p.m.
Since AVX512ER is unique to Xeon Phi, set Prefer_No_VZEROUPPER if
AVX512ER is available.

Any comments?

H.J.
----
	* sysdeps/x86/cpu-features.c (init_cpu_features): Set
	Prefer_No_VZEROUPPER if AVX512ER is available.
	* sysdeps/x86/cpu-features.h
	(bit_cpu_AVX512PF): New.
	(bit_cpu_AVX512ER): Likewise.
	(bit_cpu_AVX512CD): Likewise.
	(bit_cpu_AVX512BW): Likewise.
	(bit_cpu_AVX512VL): Likewise.
	(index_cpu_AVX512PF): Likewise.
	(index_cpu_AVX512ER): Likewise.
	(index_cpu_AVX512CD): Likewise.
	(index_cpu_AVX512BW): Likewise.
	(index_cpu_AVX512VL): Likewise.
	(reg_AVX512PF): Likewise.
	(reg_AVX512ER): Likewise.
	(reg_AVX512CD): Likewise.
	(reg_AVX512BW): Likewise.
	(reg_AVX512VL): Likewise.
---
 sysdeps/x86/cpu-features.c |  8 ++++++--
 sysdeps/x86/cpu-features.h | 15 +++++++++++++++
 2 files changed, 21 insertions(+), 2 deletions(-)

Comments

Florian Weimer April 17, 2017, 8:06 p.m.
On 04/17/2017 09:54 PM, H.J. Lu wrote:
> Since AVX512ER is unique to Xeon Phi, set Prefer_No_VZEROUPPER if
> AVX512ER is available.

This approach doesn't seem very future-proof to me.

Thanks,
Florian

H.J. Lu April 17, 2017, 8:13 p.m.
On Mon, Apr 17, 2017 at 1:06 PM, Florian Weimer <fweimer@redhat.com> wrote:
> On 04/17/2017 09:54 PM, H.J. Lu wrote:
>>
>> Since AVX512ER is unique to Xeon Phi, set Prefer_No_VZEROUPPER if
>> AVX512ER is available.
>
>
> This approach doesn't seem very future-proof to me.
>

AVX512ER won't be implemented in any Xeon processors and it will be in
all Xeon Phi processors.   In this way, we don't need to check CPU model
numbers when setting Prefer_No_VZEROUPPER.   It will work with current
and future Xeon Phi and non-Xeon Phi processors.

Florian Weimer April 18, 2017, 5:47 a.m.
On 04/17/2017 10:13 PM, H.J. Lu wrote:
> On Mon, Apr 17, 2017 at 1:06 PM, Florian Weimer <fweimer@redhat.com> wrote:
>> On 04/17/2017 09:54 PM, H.J. Lu wrote:
>>>
>>> Since AVX512ER is unique to Xeon Phi, set Prefer_No_VZEROUPPER if
>>> AVX512ER is available.
>>
>>
>> This approach doesn't seem very future-proof to me.
>>
> 
> AVX512ER won't be implemented in any Xeon processors and it will be in
> all Xeon Phi processors.   In this way, we don't need to check CPU model
> numbers when setting Prefer_No_VZEROUPPER.   It will work with current
> and future Xeon Phi and non-Xeon Phi processors.

Well, okay.  I see it's in the Intel-specific block, so the change 
should be okay.

Thanks,
Florian
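
The trade-off discussed in this thread (keying the preference on a CPUID feature bit instead of on a CPU model number) can be illustrated with a small stand-alone sketch. This is not glibc code: every name with a demo_ or DEMO_ prefix is hypothetical, and the sketch assumes GCC or Clang, whose <cpuid.h> provides __get_cpuid_count. It also leaves out the vendor check; in the patch the test sits inside the Intel-specific block of init_cpu_features.

```c
#include <assert.h>
#include <stdbool.h>

#if defined(__x86_64__) || defined(__i386__)
# include <cpuid.h>   /* GCC/Clang helper header (assumed toolchain).  */
#endif

/* CPUID leaf 7, subleaf 0, EBX bit 27: AVX512ER (same bit as the patch).  */
#define DEMO_BIT_AVX512ER (1u << 27)

/* Model-based check, as before the patch: only the model listed in the
   diff (0x57, Knights Landing, family 6) matches, so every later
   Xeon Phi model would need its own entry.  */
bool
demo_prefer_no_vzeroupper_by_model (unsigned int family, unsigned int model)
{
  return family == 0x06 && model == 0x57;
}

/* Feature-based check, as in the patch: any CPU reporting AVX512ER
   gets the preference, with no model table to maintain.  */
bool
demo_prefer_no_vzeroupper_by_feature (unsigned int leaf7_ebx)
{
  return (leaf7_ebx & DEMO_BIT_AVX512ER) != 0;
}

/* Read EBX of CPUID leaf 7, subleaf 0 on the running CPU.  Returns 0
   when built for a non-x86 target or when leaf 7 is not supported.  */
unsigned int
demo_read_leaf7_ebx (void)
{
#if defined(__x86_64__) || defined(__i386__)
  unsigned int eax, ebx, ecx, edx;
  if (__get_cpuid_count (7, 0, &eax, &ebx, &ecx, &edx))
    return ebx;
#endif
  return 0;
}

/* Self-check on synthetic register values.  */
void
demo_self_check (void)
{
  assert (demo_prefer_no_vzeroupper_by_feature (DEMO_BIT_AVX512ER));
  assert (!demo_prefer_no_vzeroupper_by_feature (1u << 16)); /* AVX512F only.  */
}
```

Calling demo_prefer_no_vzeroupper_by_feature (demo_read_leaf7_ebx ()) gives the decision for the machine the program runs on. Whether AVX512ER stays exclusive to Xeon Phi is the assumption stated in the thread, not something the sketch can verify.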

Patch

diff --git a/sysdeps/x86/cpu-features.c b/sysdeps/x86/cpu-features.c
index 33788ed..ae7f844 100644
--- a/sysdeps/x86/cpu-features.c
+++ b/sysdeps/x86/cpu-features.c
@@ -138,8 +138,6 @@  init_cpu_features (struct cpu_features *cpu_features)
 
 	    case 0x57:
 	      /* Knights Landing.  Enable Silvermont optimizations.  */
-	      cpu_features->feature[index_arch_Prefer_No_VZEROUPPER]
-		|= bit_arch_Prefer_No_VZEROUPPER;
 
 	    case 0x5c:
 	    case 0x5f:
@@ -225,6 +223,12 @@  init_cpu_features (struct cpu_features *cpu_features)
 	cpu_features->feature[index_arch_AVX_Fast_Unaligned_Load]
 	  |= bit_arch_AVX_Fast_Unaligned_Load;
 
+      /* Since AVX512ER is unique to Xeon Phi, set Prefer_No_VZEROUPPER
+         if AVX512ER is available.  */
+      if (CPU_FEATURES_CPU_P (cpu_features, AVX512ER))
+	cpu_features->feature[index_arch_Prefer_No_VZEROUPPER]
+	  |= bit_arch_Prefer_No_VZEROUPPER;
+
       /* To avoid SSE transition penalty, use _dl_runtime_resolve_slow.
          If XGETBV suports ECX == 1, use _dl_runtime_resolve_opt.  */
       cpu_features->feature[index_arch_Use_dl_runtime_resolve_slow]
diff --git a/sysdeps/x86/cpu-features.h b/sysdeps/x86/cpu-features.h
index 8ec1562..1583d65 100644
--- a/sysdeps/x86/cpu-features.h
+++ b/sysdeps/x86/cpu-features.h
@@ -63,6 +63,11 @@ 
 #define bit_cpu_AVX2		(1 << 5)
 #define bit_cpu_AVX512F		(1 << 16)
 #define bit_cpu_AVX512DQ	(1 << 17)
+#define bit_cpu_AVX512PF	(1 << 26)
+#define bit_cpu_AVX512ER	(1 << 27)
+#define bit_cpu_AVX512CD	(1 << 28)
+#define bit_cpu_AVX512BW	(1 << 30)
+#define bit_cpu_AVX512VL	(1u << 31)
 
 /* XCR0 Feature flags.  */
 #define bit_XMM_state		(1 << 1)
@@ -239,6 +244,11 @@  extern const struct cpu_features *__get_cpu_features (void)
 # define index_cpu_AVX2		COMMON_CPUID_INDEX_7
 # define index_cpu_AVX512F	COMMON_CPUID_INDEX_7
 # define index_cpu_AVX512DQ	COMMON_CPUID_INDEX_7
+# define index_cpu_AVX512PF	COMMON_CPUID_INDEX_7
+# define index_cpu_AVX512ER	COMMON_CPUID_INDEX_7
+# define index_cpu_AVX512CD	COMMON_CPUID_INDEX_7
+# define index_cpu_AVX512BW	COMMON_CPUID_INDEX_7
+# define index_cpu_AVX512VL	COMMON_CPUID_INDEX_7
 # define index_cpu_ERMS		COMMON_CPUID_INDEX_7
 # define index_cpu_RTM		COMMON_CPUID_INDEX_7
 # define index_cpu_FMA		COMMON_CPUID_INDEX_1
@@ -258,6 +268,11 @@  extern const struct cpu_features *__get_cpu_features (void)
 # define reg_AVX2		ebx
 # define reg_AVX512F		ebx
 # define reg_AVX512DQ		ebx
+# define reg_AVX512PF		ebx
+# define reg_AVX512ER		ebx
+# define reg_AVX512CD		ebx
+# define reg_AVX512BW		ebx
+# define reg_AVX512VL		ebx
 # define reg_ERMS		ebx
 # define reg_RTM		ebx
 # define reg_FMA		ecx
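
Each feature in the header hunks above gets three definitions: a bit mask, an index selecting the CPUID leaf, and the register that holds the bit. The following stand-alone sketch shows how such a triple can be combined by token pasting into a single test. It is a simplified model under stated assumptions, not glibc's actual definitions: the demo_ structures and the DEMO_CPU_P macro are hypothetical stand-ins for struct cpu_features and CPU_FEATURES_CPU_P, and the masks use 1u throughout for simplicity.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins for the glibc internals.  */
enum { DEMO_CPUID_INDEX_1, DEMO_CPUID_INDEX_7, DEMO_CPUID_INDEX_MAX };

struct demo_cpuid_regs { unsigned int eax, ebx, ecx, edx; };

struct demo_cpu_features
{
  struct demo_cpuid_regs cpuid[DEMO_CPUID_INDEX_MAX];
};

/* One triple per feature, with the bit positions used in the patch.  */
#define bit_cpu_AVX512ER   (1u << 27)
#define index_cpu_AVX512ER DEMO_CPUID_INDEX_7
#define reg_AVX512ER       ebx

#define bit_cpu_AVX512VL   (1u << 31)
#define index_cpu_AVX512VL DEMO_CPUID_INDEX_7
#define reg_AVX512VL       ebx

/* Paste the feature name into all three macro names and test the bit.  */
#define DEMO_CPU_P(ptr, name) \
  (((ptr)->cpuid[index_cpu_##name].reg_##name & (bit_cpu_##name)) != 0)

/* Build a feature record whose leaf 7 EBX holds the given value.  */
struct demo_cpu_features
demo_features_from_leaf7_ebx (unsigned int ebx)
{
  struct demo_cpu_features f;
  memset (&f, 0, sizeof f);
  f.cpuid[DEMO_CPUID_INDEX_7].ebx = ebx;
  return f;
}

int
demo_has_avx512er (const struct demo_cpu_features *f)
{
  assert (f != NULL);
  return DEMO_CPU_P (f, AVX512ER);
}

int
demo_has_avx512vl (const struct demo_cpu_features *f)
{
  assert (f != NULL);
  return DEMO_CPU_P (f, AVX512VL);
}
```

With this layout, supporting another feature means adding three #define lines, which matches the shape of the header changes in the patch. The unsigned constant matters for bit 31, since 1 << 31 overflows a 32-bit signed int.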