
[BZ,#21391] x86: Set dl_platform and dl_hwcap from CPU features

Message ID 20170419183532.GA18407@intel.com
State New

Commit Message

H.J. Lu April 19, 2017, 6:35 p.m. UTC
dl_platform and dl_hwcap are set from AT_PLATFORM and AT_HWCAP very
early during startup.  The dynamic linker uses them to determine the
platform and to build an array of hardware capability names, which are
added to the search path when loading a shared object.  dl_platform and
dl_hwcap are unused on x86-64.  On i386, the i386, i486, i586 and i686
platforms were supported and only the SSE2 capability was used.

On x86, using AT_PLATFORM and AT_HWCAP to determine the platform and
processor capabilities is obsolete since all of the information is
available in dl_x86_cpu_features.  This patch sets dl_platform and
dl_hwcap from dl_x86_cpu_features in the dynamic linker.  On i386, the
available platforms are changed to i586 and i686 since i386 has been
deprecated.  On x86-64, the available platforms are haswell, which is
for Haswell class processors with BMI1, BMI2, LZCNT, MOVBE, POPCNT, AVX2
and FMA, and xeon_phi, which is for Xeon Phi class processors with
AVX512F, AVX512CD, AVX512ER and AVX512PF.  A capability, avx512_1, is
also added to x86-64 for the AVX512 ISAs: AVX512F, AVX512CD, AVX512BW,
AVX512DQ and AVX512VL.
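
For readers skimming the description, the x86-64 selection rule can be
sketched in isolation as below.  This is only an illustration, not the
patch itself: struct x86_64_features, select_platform and CAP_AVX512_1
are hypothetical names, the vendor check (arch_kind_intel) is left out,
and the plain booleans stand in for the CPU_FEATURES_CPU_P and
CPU_FEATURES_ARCH_P tests that the real code performs.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical feature summary.  glibc keeps these bits in
   struct cpu_features; the booleans here are an assumption made so
   that the sketch is self-contained.  */
struct x86_64_features
{
  bool avx2, fma, bmi1, bmi2, lzcnt, movbe, popcnt;
  bool avx512f, avx512cd, avx512er, avx512pf;
  bool avx512bw, avx512dq, avx512vl;
};

/* Hypothetical capability bit, mirroring HWCAP_X86_AVX512_1.  */
enum { CAP_AVX512_1 = 1 << 1 };

/* Return the platform name, or NULL if none applies, and store the
   capability bits in *HWCAP.  */
const char *
select_platform (const struct x86_64_features *f, unsigned long *hwcap)
{
  const char *platform = NULL;

  *hwcap = 0;
  if (f->avx512f && f->avx512cd)
    {
      if (f->avx512er)
        {
          /* Xeon Phi class: AVX512F, AVX512CD, AVX512ER, AVX512PF.  */
          if (f->avx512pf)
            platform = "xeon_phi";
        }
      else if (f->avx512bw && f->avx512dq && f->avx512vl)
        /* AVX512F, AVX512CD, AVX512BW, AVX512DQ and AVX512VL.  */
        *hwcap |= CAP_AVX512_1;
    }

  /* Haswell class is only considered when xeon_phi was not chosen.  */
  if (platform == NULL
      && f->avx2 && f->fma && f->bmi1 && f->bmi2
      && f->lzcnt && f->movbe && f->popcnt)
    platform = "haswell";

  return platform;
}
```

Note that the avx512_1 capability and the haswell platform are not
mutually exclusive in this sketch: a processor with both feature sets
gets the platform name and the capability bit.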

Any comments?

H.J.
---
	[BZ #21391]
	* sysdeps/i386/dl-machine.h (dl_platform_init) [IS_IN (rtld)]:
	Only call init_cpu_features.
	[!IS_IN (rtld)]: Only set GLRO(dl_platform) to NULL if needed.
	* sysdeps/x86_64/dl-machine.h (dl_platform_init): Likewise.
	* sysdeps/i386/dl-procinfo.h: Removed.
	* sysdeps/unix/sysv/linux/i386/dl-procinfo.h: Don't include
	<sysdeps/i386/dl-procinfo.h> nor <ldsodefs.h>.  Include
	<sysdeps/x86/dl-procinfo.h>.
	(_dl_procinfo): Replace _DL_HWCAP_COUNT with 32.
	* sysdeps/unix/sysv/linux/x86_64/dl-procinfo.h [!IS_IN (ldconfig)]:
	Include <sysdeps/x86/dl-procinfo.h> instead of
	 <sysdeps/generic/dl-procinfo.h>.
	* sysdeps/x86/cpu-features.c: Include <dl-hwcap.h>.
	(init_cpu_features): Set dl_platform, dl_hwcap and dl_hwcap_mask.
	* sysdeps/x86/cpu-features.h (bit_cpu_LZCNT): New.
	(bit_cpu_MOVBE): Likewise.
	(bit_cpu_BMI1): Likewise.
	(bit_cpu_BMI2): Likewise.
	(index_cpu_BMI1): Likewise.
	(index_cpu_BMI2): Likewise.
	(index_cpu_LZCNT): Likewise.
	(index_cpu_MOVBE): Likewise.
	(index_cpu_POPCNT): Likewise.
	(reg_BMI1): Likewise.
	(reg_BMI2): Likewise.
	(reg_LZCNT): Likewise.
	(reg_MOVBE): Likewise.
	(reg_POPCNT): Likewise.
	* sysdeps/x86/dl-hwcap.h: New file.
	* sysdeps/x86/dl-procinfo.h: Likewise.
	* sysdeps/x86/dl-procinfo.c (_dl_x86_hwcap_flags): New.
	(_dl_x86_platforms): Likewise.
---
 sysdeps/i386/dl-machine.h                    |  10 +--
 sysdeps/i386/dl-procinfo.c                   |  21 +-----
 sysdeps/i386/dl-procinfo.h                   | 102 ---------------------------
 sysdeps/unix/sysv/linux/i386/dl-procinfo.h   |   6 +-
 sysdeps/unix/sysv/linux/x86_64/dl-procinfo.h |   2 +-
 sysdeps/x86/cpu-features.c                   |  48 +++++++++++++
 sysdeps/x86/cpu-features.h                   |  15 ++++
 sysdeps/x86/dl-hwcap.h                       |  75 ++++++++++++++++++++
 sysdeps/x86/dl-procinfo.c                    |  38 +++++++++-
 sysdeps/x86/dl-procinfo.h                    |  48 +++++++++++++
 sysdeps/x86_64/dl-machine.h                  |  10 +--
 11 files changed, 237 insertions(+), 138 deletions(-)
 delete mode 100644 sysdeps/i386/dl-procinfo.h
 create mode 100644 sysdeps/x86/dl-hwcap.h
 create mode 100644 sysdeps/x86/dl-procinfo.h

Comments

Florian Weimer April 19, 2017, 7:02 p.m. UTC | #1
On 04/19/2017 08:35 PM, H.J. Lu wrote:
> dl_platform and dl_hwcap are set from AT_PLATFORM and AT_HWCAP very
> early during startup.  They are used by dynamic linker to determine
> platform and build an array of hardware capability names, which are
> added to search path when loading shared object.  dl_platform and
> dl_hwcap are unused on x86-64.  On i386, i386, i486, i586 and i686
> platforms were supported and only SSE2 capability was used.

I don't know where you want to take this, so it's hard to tell if this 
is going to cause problems eventually.  GLRO is default-initialized in a 
nested libc (after dlmopen or static dlopen).

Thanks,
Florian
H.J. Lu April 19, 2017, 7:31 p.m. UTC | #2
On Wed, Apr 19, 2017 at 12:02 PM, Florian Weimer <fweimer@redhat.com> wrote:
> On 04/19/2017 08:35 PM, H.J. Lu wrote:
>>
>> dl_platform and dl_hwcap are set from AT_PLATFORM and AT_HWCAP very
>> early during startup.  They are used by dynamic linker to determine
>> platform and build an array of hardware capability names, which are
>> added to search path when loading shared object.  dl_platform and
>> dl_hwcap are unused on x86-64.  On i386, i386, i486, i586 and i686
>> platforms were supported and only SSE2 capability was used.
>
>
> I don't know where you want to take this, so it's hard to tell if this is
> going to cause problems eventually.  GLRO is default-initialized in a nested
> libc (after dlmopen or static dlopen).
>

dl_platform and dl_hwcap are used to search for additional directories when
loading a shared object:

     15453: find library=libx.so [0]; searching
     15453: search path=./tls/x86_64:./tls:./x86_64:. (RPATH from file ./m)
     15453:  trying file=./tls/x86_64/libx.so
     15453:  trying file=./tls/libx.so
     15453:  trying file=./x86_64/libx.so
     15453:  trying file=./libx.so
     15453:

My change updates them inside the dynamic linker before the dynamic
linker uses them.  On a Haswell class machine, I got

     19268: find library=libx.so [0]; searching
     19268: search path=./tls/haswell:./tls:./haswell:. (RPATH from file ./m)
     19268:  trying file=./tls/haswell/libx.so
     19268:  trying file=./tls/libx.so
     19268:  trying file=./haswell/libx.so

When loading libx.so, the dynamic linker prefers the copy in the
"haswell" subdirectory.  One can place shared libraries optimized for
Haswell class processors under the "haswell" subdirectory, and they will
be used on Haswell class processors.  This has no impact on static
dlopen or a nested libc.

We have been using this scheme on i386 to place i686/SSE2 optimized
libraries under sse2/i686 directories.  My patch extends it to x86-64.
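
To make the two traces above concrete, here is a rough sketch of how a
platform name turns one RPATH directory into the list of directories
that is searched.  It only reproduces the pattern visible in the
LD_DEBUG output; expand_search_path is a hypothetical helper, and the
real dynamic linker also folds hardware capability names into the
list, which this sketch ignores.  The NULL case is an assumption.

```c
#include <stddef.h>
#include <stdio.h>

/* Write the colon-separated search list for DIR and PLATFORM into BUF.
   Returns what snprintf returns.  With PLATFORM == NULL only the tls
   and plain directories are listed (an assumption for this sketch).  */
int
expand_search_path (const char *dir, const char *platform,
                    char *buf, size_t size)
{
  if (platform == NULL)
    return snprintf (buf, size, "%s/tls:%s", dir, dir);

  /* Most specific directory first, plain directory last.  */
  return snprintf (buf, size, "%s/tls/%s:%s/tls:%s/%s:%s",
                   dir, platform, dir, dir, platform, dir);
}
```

With dir "." the helper gives ./tls/x86_64:./tls:./x86_64:. for the
platform "x86_64" and ./tls/haswell:./tls:./haswell:. for "haswell",
which are the two search paths shown in the traces.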
Rodriguez Bahena, Victor April 19, 2017, 9:11 p.m. UTC | #3
-----Original Message-----
From: <libc-alpha-owner@sourceware.org> on behalf of "H.J. Lu"
<hongjiu.lu@intel.com>
Reply-To: "H.J. Lu" <hjl.tools@gmail.com>
Date: Wednesday, April 19, 2017 at 1:35 PM
To: GNU C Library <libc-alpha@sourceware.org>
Subject: [PATCH] [BZ #21391] x86: Set dl_platform and dl_hwcap from CPU
features

>dl_platform and dl_hwcap are set from AT_PLATFORM and AT_HWCAP very
>early during startup.  They are used by dynamic linker to determine
>platform and build an array of hardware capability names, which are
>added to search path when loading shared object.  dl_platform and
>dl_hwcap are unused on x86-64.  On i386, i386, i486, i586 and i686
>platforms were supported and only SSE2 capability was used.
>
>On x86, usage of AT_PLATFORM and AT_HWCAP to determine platform and
>processor capabilities is obsolete since all information is available
>in dl_x86_cpu_features.  This patch sets dl_platform and dl_hwcap from
>dl_x86_cpu_features in dynamic linker.  On i386, the available platforms
>are changed to i586 and i686 since i386 has been deprecated.  On x86-64,
>the available platforms are haswell, which is for Haswell class processors
>with BMI1, BMI2, LZCNT, MOVBE, POPCNT, AVX2 and FMA, and xeon_phi, which
>is for Xeon Phi class processors with AVX512F, AVX512CD, AVX512ER and
>AVX512PF.  A capability, avx512_1, is also added to x86-64 for AVX512
>ISAs: AVX512F, AVX512CD, AVX512BW, AVX512DQ and AVX512VL.
>
>Any comments?

Tested; it works just fine.

+1

Thanks for the patch

>
>H.J.
>---
>	[BZ #21391]
>	* sysdeps/i386/dl-machine.h (dl_platform_init) [IS_IN (rtld)]:
>	Only call init_cpu_features.
>	[!IS_IN (rtld)]: Only set GLRO(dl_platform) to NULL if needed.
>	* sysdeps/x86_64/dl-machine.h (dl_platform_init): Likewise.
>	* sysdeps/i386/dl-procinfo.h: Removed.
>	* sysdeps/unix/sysv/linux/i386/dl-procinfo.h: Don't include
>	<sysdeps/i386/dl-procinfo.h> nor <ldsodefs.h>.  Include
>	<sysdeps/x86/dl-procinfo.h>.
>	(_dl_procinfo): Replace _DL_HWCAP_COUNT with 32.
>	* sysdeps/unix/sysv/linux/x86_64/dl-procinfo.h [!IS_IN (ldconfig)]:
>	Include <sysdeps/x86/dl-procinfo.h> instead of
>	 <sysdeps/generic/dl-procinfo.h>.
>	* sysdeps/x86/cpu-features.c: Include <dl-hwcap.h>.
>	(init_cpu_features): Set dl_platform, dl_hwcap and dl_hwcap_mask.
>	* sysdeps/x86/cpu-features.h (bit_cpu_LZCNT): New.
>	(bit_cpu_MOVBE): Likewise.
>	(bit_cpu_BMI1): Likewise.
>	(bit_cpu_BMI2): Likewise.
>	(index_cpu_BMI1): Likewise.
>	(index_cpu_BMI2): Likewise.
>	(index_cpu_LZCNT): Likewise.
>	(index_cpu_MOVBE): Likewise.
>	(index_cpu_POPCNT): Likewise.
>	(reg_BMI1): Likewise.
>	(reg_BMI2): Likewise.
>	(reg_LZCNT): Likewise.
>	(reg_MOVBE): Likewise.
>	(reg_POPCNT): Likewise.
>	* sysdeps/x86/dl-hwcap.h: New file.
>	* sysdeps/x86/dl-procinfo.h: Likewise.
>	* sysdeps/x86/dl-procinfo.c (_dl_x86_hwcap_flags): New.
>	(_dl_x86_platforms): Likewise.
>---
> sysdeps/i386/dl-machine.h                    |  10 +--
> sysdeps/i386/dl-procinfo.c                   |  21 +-----
> sysdeps/i386/dl-procinfo.h                   | 102 ---------------------------
> sysdeps/unix/sysv/linux/i386/dl-procinfo.h   |   6 +-
> sysdeps/unix/sysv/linux/x86_64/dl-procinfo.h |   2 +-
> sysdeps/x86/cpu-features.c                   |  48 +++++++++++++
> sysdeps/x86/cpu-features.h                   |  15 ++++
> sysdeps/x86/dl-hwcap.h                       |  75 ++++++++++++++++++++
> sysdeps/x86/dl-procinfo.c                    |  38 +++++++++-
> sysdeps/x86/dl-procinfo.h                    |  48 +++++++++++++
> sysdeps/x86_64/dl-machine.h                  |  10 +--
> 11 files changed, 237 insertions(+), 138 deletions(-)
> delete mode 100644 sysdeps/i386/dl-procinfo.h
> create mode 100644 sysdeps/x86/dl-hwcap.h
> create mode 100644 sysdeps/x86/dl-procinfo.h
>
>diff --git a/sysdeps/i386/dl-machine.h b/sysdeps/i386/dl-machine.h
>index 99a72f6..57d4a0b 100644
>--- a/sysdeps/i386/dl-machine.h
>+++ b/sysdeps/i386/dl-machine.h
>@@ -233,14 +233,14 @@ _dl_start_user:\n\
> static inline void __attribute__ ((unused))
> dl_platform_init (void)
> {
>-  if (GLRO(dl_platform) != NULL && *GLRO(dl_platform) == '\0')
>-    /* Avoid an empty string which would disturb us.  */
>-    GLRO(dl_platform) = NULL;
>-
>-#ifdef SHARED
>+#if IS_IN (rtld)
>   /* init_cpu_features has been called early from __libc_start_main in
>      static executable.  */
>   init_cpu_features (&GLRO(dl_x86_cpu_features));
>+#else
>+  if (GLRO(dl_platform) != NULL && *GLRO(dl_platform) == '\0')
>+    /* Avoid an empty string which would disturb us.  */
>+    GLRO(dl_platform) = NULL;
> #endif
> }
> 
>diff --git a/sysdeps/i386/dl-procinfo.c b/sysdeps/i386/dl-procinfo.c
>index b832830..7237f77 100644
>--- a/sysdeps/i386/dl-procinfo.c
>+++ b/sysdeps/i386/dl-procinfo.c
>@@ -17,10 +17,7 @@
>    License along with the GNU C Library; if not, see
>    <http://www.gnu.org/licenses/>.  */
> 
>-/* This information must be kept in sync with the _DL_HWCAP_COUNT and
>-   _DL_PLATFORM_COUNT definitions in procinfo.h.
>-
>-   If anything should be added here check whether the size of each string
>+/* If anything should be added here check whether the size of each string
>    is still ok with the given array size.
> 
>    All the #ifdefs in the definitions are quite irritating but
>@@ -64,21 +61,5 @@ PROCINFO_CLASS const char _dl_x86_cap_flags[32][8]
> ,
> #endif
> 
>-#if !defined PROCINFO_DECL && defined SHARED
>-  ._dl_x86_platforms
>-#else
>-PROCINFO_CLASS const char _dl_x86_platforms[4][5]
>-#endif
>-#ifndef PROCINFO_DECL
>-= {
>-    "i386", "i486", "i586", "i686"
>-  }
>-#endif
>-#if !defined SHARED || defined PROCINFO_DECL
>-;
>-#else
>-,
>-#endif
>-
> #undef PROCINFO_DECL
> #undef PROCINFO_CLASS
>diff --git a/sysdeps/i386/dl-procinfo.h b/sysdeps/i386/dl-procinfo.h
>deleted file mode 100644
>index 9c38846..0000000
>--- a/sysdeps/i386/dl-procinfo.h
>+++ /dev/null
>@@ -1,102 +0,0 @@
>-/* i386 version of processor capability information handling macros.
>-   Copyright (C) 1998-2017 Free Software Foundation, Inc.
>-   This file is part of the GNU C Library.
>-   Contributed by Ulrich Drepper <drepper@cygnus.com>, 1998.
>-
>-   The GNU C Library is free software; you can redistribute it and/or
>-   modify it under the terms of the GNU Lesser General Public
>-   License as published by the Free Software Foundation; either
>-   version 2.1 of the License, or (at your option) any later version.
>-
>-   The GNU C Library is distributed in the hope that it will be useful,
>-   but WITHOUT ANY WARRANTY; without even the implied warranty of
>-   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
>-   Lesser General Public License for more details.
>-
>-   You should have received a copy of the GNU Lesser General Public
>-   License along with the GNU C Library; if not, see
>-   <http://www.gnu.org/licenses/>.  */
>-
>-#ifndef _DL_PROCINFO_H
>-#define _DL_PROCINFO_H	1
>-#include <ldsodefs.h>
>-
>-#define _DL_HWCAP_COUNT 32
>-
>-#define _DL_PLATFORMS_COUNT	4
>-
>-/* Start at 48 to reserve some space.  */
>-#define _DL_FIRST_PLATFORM	48
>-/* Mask to filter out platforms.  */
>-#define _DL_HWCAP_PLATFORM	(((1ULL << _DL_PLATFORMS_COUNT) - 1) \
>-				 << _DL_FIRST_PLATFORM)
>-
>-enum
>-{
>-  HWCAP_I386_FPU   = 1 << 0,
>-  HWCAP_I386_VME   = 1 << 1,
>-  HWCAP_I386_DE    = 1 << 2,
>-  HWCAP_I386_PSE   = 1 << 3,
>-  HWCAP_I386_TSC   = 1 << 4,
>-  HWCAP_I386_MSR   = 1 << 5,
>-  HWCAP_I386_PAE   = 1 << 6,
>-  HWCAP_I386_MCE   = 1 << 7,
>-  HWCAP_I386_CX8   = 1 << 8,
>-  HWCAP_I386_APIC  = 1 << 9,
>-  HWCAP_I386_SEP   = 1 << 11,
>-  HWCAP_I386_MTRR  = 1 << 12,
>-  HWCAP_I386_PGE   = 1 << 13,
>-  HWCAP_I386_MCA   = 1 << 14,
>-  HWCAP_I386_CMOV  = 1 << 15,
>-  HWCAP_I386_FCMOV = 1 << 16,
>-  HWCAP_I386_MMX   = 1 << 23,
>-  HWCAP_I386_OSFXSR = 1 << 24,
>-  HWCAP_I386_XMM   = 1 << 25,
>-  HWCAP_I386_XMM2  = 1 << 26,
>-  HWCAP_I386_AMD3D = 1 << 31,
>-
>-  /* XXX Which others to add here?  */
>-  HWCAP_IMPORTANT = (HWCAP_I386_XMM2)
>-
>-};
>-
>-/* We cannot provide a general printing function.  */
>-#define _dl_procinfo(type, word) -1
>-
>-static inline const char *
>-__attribute__ ((unused))
>-_dl_hwcap_string (int idx)
>-{
>-  return GLRO(dl_x86_cap_flags)[idx];
>-};
>-
>-static inline int
>-__attribute__ ((unused, always_inline))
>-_dl_string_hwcap (const char *str)
>-{
>-  int i;
>-
>-  for (i = 0; i < _DL_HWCAP_COUNT; i++)
>-    {
>-      if (strcmp (str, GLRO(dl_x86_cap_flags)[i]) == 0)
>-	return i;
>-    }
>-  return -1;
>-};
>-
>-static inline int
>-__attribute__ ((unused, always_inline))
>-_dl_string_platform (const char *str)
>-{
>-  int i;
>-
>-  if (str != NULL)
>-    for (i = 0; i < _DL_PLATFORMS_COUNT; ++i)
>-      {
>-	if (strcmp (str, GLRO(dl_x86_platforms)[i]) == 0)
>-	  return _DL_FIRST_PLATFORM + i;
>-      }
>-  return -1;
>-};
>-
>-#endif /* dl-procinfo.h */
>diff --git a/sysdeps/unix/sysv/linux/i386/dl-procinfo.h b/sysdeps/unix/sysv/linux/i386/dl-procinfo.h
>index d49638c..a3a5f9d 100644
>--- a/sysdeps/unix/sysv/linux/i386/dl-procinfo.h
>+++ b/sysdeps/unix/sysv/linux/i386/dl-procinfo.h
>@@ -17,9 +17,7 @@
>    License along with the GNU C Library; if not, see
>    <http://www.gnu.org/licenses/>.  */
> 
>-#include <sysdeps/i386/dl-procinfo.h>
>-#include <ldsodefs.h>
>-
>+#include <sysdeps/x86/dl-procinfo.h>
> 
> #undef _dl_procinfo
> static inline int
>@@ -36,7 +34,7 @@ _dl_procinfo (unsigned int type, unsigned long int word)
> 
>   _dl_printf ("AT_HWCAP:   ");
> 
>-  for (i = 0; i < _DL_HWCAP_COUNT; ++i)
>+  for (i = 0; i < 32; ++i)
>     if (word & (1 << i))
>       _dl_printf (" %s", GLRO(dl_x86_cap_flags)[i]);
> 
>diff --git a/sysdeps/unix/sysv/linux/x86_64/dl-procinfo.h b/sysdeps/unix/sysv/linux/x86_64/dl-procinfo.h
>index 7829e1c..7b45fe4 100644
>--- a/sysdeps/unix/sysv/linux/x86_64/dl-procinfo.h
>+++ b/sysdeps/unix/sysv/linux/x86_64/dl-procinfo.h
>@@ -1,5 +1,5 @@
> #if IS_IN (ldconfig)
> # include <sysdeps/unix/sysv/linux/i386/dl-procinfo.h>
> #else
>-# include <sysdeps/generic/dl-procinfo.h>
>+# include <sysdeps/x86/dl-procinfo.h>
> #endif
>diff --git a/sysdeps/x86/cpu-features.c b/sysdeps/x86/cpu-features.c
>index f30918d..b481f50 100644
>--- a/sysdeps/x86/cpu-features.c
>+++ b/sysdeps/x86/cpu-features.c
>@@ -18,6 +18,7 @@
> 
> #include <cpuid.h>
> #include <cpu-features.h>
>+#include <dl-hwcap.h>
> 
> static void
> get_common_indeces (struct cpu_features *cpu_features,
>@@ -310,4 +311,51 @@ no_cpuid:
>   cpu_features->family = family;
>   cpu_features->model = model;
>   cpu_features->kind = kind;
>+
>+#if IS_IN (rtld)
>+  /* Reuse dl_platform, dl_hwcap and dl_hwcap_mask for x86.  */
>+  GLRO(dl_platform) = NULL;
>+  GLRO(dl_hwcap) = 0;
>+  GLRO(dl_hwcap_mask) = HWCAP_IMPORTANT;
>+
>+# ifdef __x86_64__
>+  if (cpu_features->kind == arch_kind_intel)
>+    {
>+      if (CPU_FEATURES_ARCH_P (cpu_features, AVX512F_Usable)
>+	  && CPU_FEATURES_CPU_P (cpu_features, AVX512CD))
>+	{
>+	  if (CPU_FEATURES_CPU_P (cpu_features, AVX512ER))
>+	    {
>+	      if (CPU_FEATURES_CPU_P (cpu_features, AVX512PF))
>+		GLRO(dl_platform) = "xeon_phi";
>+	    }
>+	  else
>+	    {
>+	      if (CPU_FEATURES_CPU_P (cpu_features, AVX512BW)
>+		  && CPU_FEATURES_CPU_P (cpu_features, AVX512DQ)
>+		  && CPU_FEATURES_CPU_P (cpu_features, AVX512VL))
>+		GLRO(dl_hwcap) |= HWCAP_X86_AVX512_1;
>+	    }
>+	}
>+
>+      if (GLRO(dl_platform) == NULL
>+	  && CPU_FEATURES_ARCH_P (cpu_features, AVX2_Usable)
>+	  && CPU_FEATURES_ARCH_P (cpu_features, FMA_Usable)
>+	  && CPU_FEATURES_CPU_P (cpu_features, BMI1)
>+	  && CPU_FEATURES_CPU_P (cpu_features, BMI2)
>+	  && CPU_FEATURES_CPU_P (cpu_features, LZCNT)
>+	  && CPU_FEATURES_CPU_P (cpu_features, MOVBE)
>+	  && CPU_FEATURES_CPU_P (cpu_features, POPCNT))
>+	GLRO(dl_platform) = "haswell";
>+    }
>+# else
>+  if (CPU_FEATURES_CPU_P (cpu_features, SSE2))
>+    GLRO(dl_hwcap) |= HWCAP_X86_SSE2;
>+
>+  if (CPU_FEATURES_ARCH_P (cpu_features, I686))
>+    GLRO(dl_platform) = "i686";
>+  else if (CPU_FEATURES_ARCH_P (cpu_features, I586))
>+    GLRO(dl_platform) = "i586";
>+# endif
>+#endif
> }
>diff --git a/sysdeps/x86/cpu-features.h b/sysdeps/x86/cpu-features.h
>index 85a39e7..31c7c80 100644
>--- a/sysdeps/x86/cpu-features.h
>+++ b/sysdeps/x86/cpu-features.h
>@@ -57,8 +57,13 @@
> #define bit_cpu_FMA		(1 << 12)
> #define bit_cpu_FMA4		(1 << 16)
> #define bit_cpu_HTT		(1 << 28)
>+#define bit_cpu_LZCNT		(1 << 5)
>+#define bit_cpu_MOVBE		(1 << 22)
>+#define bit_cpu_POPCNT		(1 << 23)
> 
> /* COMMON_CPUID_INDEX_7.  */
>+#define bit_cpu_BMI1		(1 << 3)
>+#define bit_cpu_BMI2		(1 << 8)
> #define bit_cpu_ERMS		(1 << 9)
> #define bit_cpu_RTM		(1 << 11)
> #define bit_cpu_AVX2		(1 << 5)
>@@ -258,6 +263,11 @@ extern const struct cpu_features *__get_cpu_features (void)
> # define index_cpu_POPCOUNT	COMMON_CPUID_INDEX_1
> # define index_cpu_OSXSAVE	COMMON_CPUID_INDEX_1
> # define index_cpu_HTT		COMMON_CPUID_INDEX_1
>+# define index_cpu_BMI1		COMMON_CPUID_INDEX_7
>+# define index_cpu_BMI2		COMMON_CPUID_INDEX_7
>+# define index_cpu_LZCNT	COMMON_CPUID_INDEX_1
>+# define index_cpu_MOVBE	COMMON_CPUID_INDEX_1
>+# define index_cpu_POPCNT	COMMON_CPUID_INDEX_1
> 
> # define reg_CX8		edx
> # define reg_CMOV		edx
>@@ -282,6 +292,11 @@ extern const struct cpu_features *__get_cpu_features (void)
> # define reg_POPCOUNT		ecx
> # define reg_OSXSAVE		ecx
> # define reg_HTT		edx
>+# define reg_BMI1		ebx
>+# define reg_BMI2		ebx
>+# define reg_LZCNT		ecx
>+# define reg_MOVBE		ecx
>+# define reg_POPCNT		ecx
> 
> # define index_arch_Fast_Rep_String	FEATURE_INDEX_1
> # define index_arch_Fast_Copy_Backward	FEATURE_INDEX_1
>diff --git a/sysdeps/x86/dl-hwcap.h b/sysdeps/x86/dl-hwcap.h
>new file mode 100644
>index 0000000..c956684
>--- /dev/null
>+++ b/sysdeps/x86/dl-hwcap.h
>@@ -0,0 +1,75 @@
>+/* x86 version of hardware capability information handling macros.
>+   Copyright (C) 2017 Free Software Foundation, Inc.
>+
>+   The GNU C Library is free software; you can redistribute it and/or
>+   modify it under the terms of the GNU Lesser General Public
>+   License as published by the Free Software Foundation; either
>+   version 2.1 of the License, or (at your option) any later version.
>+
>+   The GNU C Library is distributed in the hope that it will be useful,
>+   but WITHOUT ANY WARRANTY; without even the implied warranty of
>+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
>+   Lesser General Public License for more details.
>+
>+   You should have received a copy of the GNU Lesser General Public
>+   License along with the GNU C Library; if not, see
>+   <http://www.gnu.org/licenses/>.  */
>+
>+#ifndef _DL_HWCAP_H
>+#define _DL_HWCAP_H
>+
>+#if IS_IN (ldconfig)
>+/* Since ldconfig processes both i386 and x86-64 libraries, it needs
>+   to cover all platforms and hardware capabilities.  */
>+# define HWCAP_PLATFORMS_START	0
>+# define HWCAP_PLATFORMS_COUNT	4
>+# define HWCAP_START		0
>+# define HWCAP_COUNT		2
>+# define HWCAP_IMPORTANT	(HWCAP_X86_SSE2 | HWCAP_X86_AVX512_1)
>+#elif defined __x86_64__
>+/* For 64 bit, only cover x86-64 platforms and capabilities.  */
>+# define HWCAP_PLATFORMS_START	2
>+# define HWCAP_PLATFORMS_COUNT	4
>+# define HWCAP_START		1
>+# define HWCAP_COUNT		2
>+# define HWCAP_IMPORTANT	(HWCAP_X86_AVX512_1)
>+#else
>+/* For 32 bit, only cover i586, i686 and SSE2.  */
>+# define HWCAP_PLATFORMS_START	0
>+# define HWCAP_PLATFORMS_COUNT	2
>+# define HWCAP_START		0
>+# define HWCAP_COUNT		1
>+# define HWCAP_IMPORTANT	(HWCAP_X86_SSE2)
>+#endif
>+
>+enum
>+{
>+  HWCAP_X86_SSE2		= 1 << 0,
>+  HWCAP_X86_AVX512_1		= 1 << 1
>+};
>+
>+static inline const char *
>+__attribute__ ((unused))
>+_dl_hwcap_string (int idx)
>+{
>+  return GLRO(dl_x86_hwcap_flags)[idx];
>+};
>+
>+static inline int
>+__attribute__ ((unused, always_inline))
>+_dl_string_hwcap (const char *str)
>+{
>+  int i;
>+
>+  for (i = HWCAP_START; i < HWCAP_COUNT; i++)
>+    {
>+      if (strcmp (str, GLRO(dl_x86_hwcap_flags)[i]) == 0)
>+	return i;
>+    }
>+  return -1;
>+};
>+
>+/* We cannot provide a general printing function.  */
>+#define _dl_procinfo(type, word) -1
>+
>+#endif /* dl-hwcap.h */
>diff --git a/sysdeps/x86/dl-procinfo.c b/sysdeps/x86/dl-procinfo.c
>index 9d154bf..43ab8fe 100644
>--- a/sysdeps/x86/dl-procinfo.c
>+++ b/sysdeps/x86/dl-procinfo.c
>@@ -16,7 +16,11 @@
>    License along with the GNU C Library; if not, see
>    <http://www.gnu.org/licenses/>.  */
> 
>-/* If anything should be added here check whether the size of each string
>+/* This information must be kept in sync with the _DL_HWCAP_COUNT,
>+   HWCAP_PLATFORMS_START and HWCAP_PLATFORMS_COUNT definitions in
>+   dl-hwcap.h.
>+
>+   If anything should be added here check whether the size of each string
>    is still ok with the given array size.
> 
>    All the #ifdefs in the definitions are quite irritating but
>@@ -50,3 +54,35 @@ PROCINFO_CLASS struct cpu_features _dl_x86_cpu_features
> ,
> # endif
> #endif
>+
>+#if !defined PROCINFO_DECL && defined SHARED
>+  ._dl_x86_hwcap_flags
>+#else
>+PROCINFO_CLASS const char _dl_x86_hwcap_flags[2][9]
>+#endif
>+#ifndef PROCINFO_DECL
>+= {
>+    "sse2", "avx512_1"
>+  }
>+#endif
>+#if !defined SHARED || defined PROCINFO_DECL
>+;
>+#else
>+,
>+#endif
>+
>+#if !defined PROCINFO_DECL && defined SHARED
>+  ._dl_x86_platforms
>+#else
>+PROCINFO_CLASS const char _dl_x86_platforms[4][9]
>+#endif
>+#ifndef PROCINFO_DECL
>+= {
>+    "i586", "i686", "haswell", "xeon_phi"
>+  }
>+#endif
>+#if !defined SHARED || defined PROCINFO_DECL
>+;
>+#else
>+,
>+#endif
>diff --git a/sysdeps/x86/dl-procinfo.h b/sysdeps/x86/dl-procinfo.h
>new file mode 100644
>index 0000000..5feb146
>--- /dev/null
>+++ b/sysdeps/x86/dl-procinfo.h
>@@ -0,0 +1,48 @@
>+/* x86 version of processor capability information handling macros.
>+   Copyright (C) 2017 Free Software Foundation, Inc.
>+   This file is part of the GNU C Library.
>+
>+   The GNU C Library is free software; you can redistribute it and/or
>+   modify it under the terms of the GNU Lesser General Public
>+   License as published by the Free Software Foundation; either
>+   version 2.1 of the License, or (at your option) any later version.
>+
>+   The GNU C Library is distributed in the hope that it will be useful,
>+   but WITHOUT ANY WARRANTY; without even the implied warranty of
>+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
>+   Lesser General Public License for more details.
>+
>+   You should have received a copy of the GNU Lesser General Public
>+   License along with the GNU C Library; if not, see
>+   <http://www.gnu.org/licenses/>.  */
>+
>+#ifndef _DL_PROCINFO_H
>+#define _DL_PROCINFO_H	1
>+#include <ldsodefs.h>
>+#include <dl-hwcap.h>
>+
>+#define _DL_HWCAP_COUNT		HWCAP_COUNT
>+#define _DL_PLATFORMS_COUNT	HWCAP_PLATFORMS_COUNT
>+
>+/* Start at 48 to reserve spaces for hardware capabilities.  */
>+#define _DL_FIRST_PLATFORM	48
>+/* Mask to filter out platforms.  */
>+#define _DL_HWCAP_PLATFORM	(((1ULL << _DL_PLATFORMS_COUNT) - 1) \
>+				 << _DL_FIRST_PLATFORM)
>+
>+static inline int
>+__attribute__ ((unused, always_inline))
>+_dl_string_platform (const char *str)
>+{
>+  int i;
>+
>+  if (str != NULL)
>+    for (i = HWCAP_PLATFORMS_START; i < HWCAP_PLATFORMS_COUNT; ++i)
>+      {
>+	if (strcmp (str, GLRO(dl_x86_platforms)[i]) == 0)
>+	  return _DL_FIRST_PLATFORM + i;
>+      }
>+  return -1;
>+};
>+
>+#endif /* dl-procinfo.h */
>diff --git a/sysdeps/x86_64/dl-machine.h b/sysdeps/x86_64/dl-machine.h
>index daf4d8c..0015db4 100644
>--- a/sysdeps/x86_64/dl-machine.h
>+++ b/sysdeps/x86_64/dl-machine.h
>@@ -240,14 +240,14 @@ _dl_start_user:\n\
> static inline void __attribute__ ((unused))
> dl_platform_init (void)
> {
>-  if (GLRO(dl_platform) != NULL && *GLRO(dl_platform) == '\0')
>-    /* Avoid an empty string which would disturb us.  */
>-    GLRO(dl_platform) = NULL;
>-
>-#ifdef SHARED
>+#if IS_IN (rtld)
>   /* init_cpu_features has been called early from __libc_start_main in
>      static executable.  */
>   init_cpu_features (&GLRO(dl_x86_cpu_features));
>+#else
>+  if (GLRO(dl_platform) != NULL && *GLRO(dl_platform) == '\0')
>+    /* Avoid an empty string which would disturb us.  */
>+    GLRO(dl_platform) = NULL;
> #endif
> }
> 
>-- 
>2.9.3
>
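
One detail of the quoted patch that is easy to miss: the i386 and
x86-64 platform names share a single table, and HWCAP_PLATFORMS_START
and HWCAP_PLATFORMS_COUNT select the slice each build may match, with
the count acting as the exclusive end index of the loop.  The sketch
below restates that lookup outside glibc.  The names string_platform,
platform_mask, platforms and FIRST_PLATFORM are stand-ins chosen for
this example; the table contents and the value 48 are taken from the
patch.

```c
#include <string.h>

/* Bit position of the first platform, as in _DL_FIRST_PLATFORM.  */
#define FIRST_PLATFORM 48

/* Shared table; "xeon_phi" needs 8 characters plus the terminator.  */
static const char platforms[4][9] =
  { "i586", "i686", "haswell", "xeon_phi" };

/* Look STR up in platforms[START] .. platforms[END - 1].  Return the
   platform bit number, or -1 if STR is NULL or not in that slice.  */
int
string_platform (const char *str, int start, int end)
{
  if (str != NULL)
    for (int i = start; i < end; ++i)
      if (strcmp (str, platforms[i]) == 0)
        return FIRST_PLATFORM + i;
  return -1;
}

/* Mask covering COUNT platform bits, as in _DL_HWCAP_PLATFORM.  */
unsigned long long
platform_mask (int count)
{
  return ((1ULL << count) - 1) << FIRST_PLATFORM;
}
```

With the x86-64 values from dl-hwcap.h (start 2, count 4) only
"haswell" and "xeon_phi" can match; with the 32-bit values (start 0,
count 2) only "i586" and "i686" can.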
Florian Weimer April 20, 2017, 8:51 a.m. UTC | #4
On 04/19/2017 09:31 PM, H.J. Lu wrote:
> On Wed, Apr 19, 2017 at 12:02 PM, Florian Weimer <fweimer@redhat.com> wrote:
>> On 04/19/2017 08:35 PM, H.J. Lu wrote:
>>>
>>> dl_platform and dl_hwcap are set from AT_PLATFORM and AT_HWCAP very
>>> early during startup.  They are used by dynamic linker to determine
>>> platform and build an array of hardware capability names, which are
>>> added to search path when loading shared object.  dl_platform and
>>> dl_hwcap are unused on x86-64.  On i386, i386, i486, i586 and i686
>>> platforms were supported and only SSE2 capability was used.
>>
>>
>> I don't know where you want to take this, so it's hard to tell if this is
>> going to cause problems eventually.  GLRO is default-initialized in a nested
>> libc (after dlmopen or static dlopen).
>>
> 
> dl_platform and dl_hwcap are used to search for additional directories when
> loading a shared object:
> 
>       15453: find library=libx.so [0]; searching
>       15453: search path=./tls/x86_64:./tls:./x86_64:. (RPATH from file ./m)
>       15453:  trying file=./tls/x86_64/libx.so
>       15453:  trying file=./tls/libx.so
>       15453:  trying file=./x86_64/libx.so
>       15453:  trying file=./libx.so
>       15453:
> 
> My change updates them inside dynamic linker before they are used
> by dynamic linker.  On Haswell class machines, I got
> 
>       19268: find library=libx.so [0]; searching
>       19268: search path=./tls/haswell:./tls:./haswell:. (RPATH from file ./m)
>       19268:  trying file=./tls/haswell/libx.so
>       19268:  trying file=./tls/libx.so
>       19268:  trying file=./haswell/libx.so
> 
> When loading libx.so, it prefers the one in the "haswell" subdirectory.
> One can place shared libraries optimized for Haswell class processors
> under the "haswell" subdirectory.  They will be used on Haswell class
> processors.  It has no impact on static dlopen nor nested libc.

Okay, but only because the external dynamic linker is used, which has a 
fully set up GLRO object.  The default-initialized GLRO object in the 
inner libc is not used in this case.

Thanks,
Florian
H.J. Lu April 20, 2017, 2:42 p.m. UTC | #5
On Thu, Apr 20, 2017 at 1:51 AM, Florian Weimer <fweimer@redhat.com> wrote:
> On 04/19/2017 09:31 PM, H.J. Lu wrote:
>>
>> On Wed, Apr 19, 2017 at 12:02 PM, Florian Weimer <fweimer@redhat.com>
>> wrote:
>>>
>>> On 04/19/2017 08:35 PM, H.J. Lu wrote:
>>>>
>>>>
>>>> dl_platform and dl_hwcap are set from AT_PLATFORM and AT_HWCAP very
>>>> early during startup.  They are used by dynamic linker to determine
>>>> platform and build an array of hardware capability names, which are
>>>> added to search path when loading shared object.  dl_platform and
>>>> dl_hwcap are unused on x86-64.  On i386, i386, i486, i586 and i686
>>>> platforms were supported and only SSE2 capability was used.
>>>
>>>
>>>
>>> I don't know where you want to take this, so it's hard to tell if this is
>>> going to cause problems eventually.  GLRO is default-initialized in a
>>> nested
>>> libc (after dlmopen or static dlopen).
>>>
>>
>> dl_platform and dl_hwcap are used to search for additional directories
>> when
>> loading a shared object:
>>
>>       15453: find library=libx.so [0]; searching
>>       15453: search path=./tls/x86_64:./tls:./x86_64:. (RPATH from file
>> ./m)
>>       15453:  trying file=./tls/x86_64/libx.so
>>       15453:  trying file=./tls/libx.so
>>       15453:  trying file=./x86_64/libx.so
>>       15453:  trying file=./libx.so
>>       15453:
>>
>> My change updates them inside dynamic linker before they are used
>> by dynamic linker.  On Haswell class machines, I got
>>
>>       19268: find library=libx.so [0]; searching
>>       19268: search path=./tls/haswell:./tls:./haswell:. (RPATH from file
>> ./m)
>>       19268:  trying file=./tls/haswell/libx.so
>>       19268:  trying file=./tls/libx.so
>>       19268:  trying file=./haswell/libx.so
>>
>> When loading libx.so, it prefers the one in the "haswell" subdirectory.
>> One can place shared libraries optimized for Haswell class processors
>> under the "haswell" subdirectory.  They will be used on Haswell class
>> processors.  It has no impact on static dlopen nor nested libc.
>
>
> Okay, but only because the external dynamic linker is used, which has a
> fully set up GLRO object.  The default-initialized GLRO object in the inner
> libc is not used in this case.

That is true.  My patch only applies to the external dynamic linker.
H.J. Lu April 28, 2017, 2:44 p.m. UTC | #6
On Thu, Apr 20, 2017 at 7:42 AM, H.J. Lu <hjl.tools@gmail.com> wrote:
> On Thu, Apr 20, 2017 at 1:51 AM, Florian Weimer <fweimer@redhat.com> wrote:
>> On 04/19/2017 09:31 PM, H.J. Lu wrote:
>>>
>>> On Wed, Apr 19, 2017 at 12:02 PM, Florian Weimer <fweimer@redhat.com>
>>> wrote:
>>>>
>>>> On 04/19/2017 08:35 PM, H.J. Lu wrote:
>>>>>
>>>>>
>>>>> dl_platform and dl_hwcap are set from AT_PLATFORM and AT_HWCAP very
>>>>> early during startup.  They are used by dynamic linker to determine
>>>>> platform and build an array of hardware capability names, which are
>>>>> added to search path when loading shared object.  dl_platform and
>>>>> dl_hwcap are unused on x86-64.  On i386, i386, i486, i586 and i686
>>>>> platforms were supported and only SSE2 capability was used.
>>>>
>>>>
>>>>
>>>> I don't know where you want to take this, so it's hard to tell if this is
>>>> going to cause problems eventually.  GLRO is default-initialized in a
>>>> nested
>>>> libc (after dlmopen or static dlopen).
>>>>
>>>
>>> dl_platform and dl_hwcap are used to search for additional directories
>>> when loading a shared object:
>>>
>>>       15453: find library=libx.so [0]; searching
>>>       15453: search path=./tls/x86_64:./tls:./x86_64:. (RPATH from file ./m)
>>>       15453:  trying file=./tls/x86_64/libx.so
>>>       15453:  trying file=./tls/libx.so
>>>       15453:  trying file=./x86_64/libx.so
>>>       15453:  trying file=./libx.so
>>>       15453:
>>>
>>> My change updates them inside dynamic linker before they are used
>>> by dynamic linker.  On Haswell class machines, I got
>>>
>>>       19268: find library=libx.so [0]; searching
>>>       19268: search path=./tls/haswell:./tls:./haswell:. (RPATH from file ./m)
>>>       19268:  trying file=./tls/haswell/libx.so
>>>       19268:  trying file=./tls/libx.so
>>>       19268:  trying file=./haswell/libx.so
>>>
>>> When loading libx.so, it prefers the one in the "haswell" subdirectory.
>>> One can place shared libraries optimized for Haswell class processors
>>> under the "haswell" subdirectory.  They will be used on Haswell class
>>> processors.  It has no impact on static dlopen nor nested libc.
>>
>>
>> Okay, but only because the external dynamic linker is used, which has a
>> fully set up GLRO object.  The default-initialized GLRO object in the inner
>> libc is not used in this case.
>
> That is true.  My patch only applies to the external dynamic linker.
>

Any other comments?
Florian Weimer April 28, 2017, 2:45 p.m. UTC | #7
On 04/28/2017 04:44 PM, H.J. Lu wrote:
> Any other comments?

I don't have any further comments.

Except maybe this: Why isn't this in the kernel, like for the other
architectures?

Thanks,
Florian
H.J. Lu April 28, 2017, 3:51 p.m. UTC | #8
On Fri, Apr 28, 2017 at 7:45 AM, Florian Weimer <fweimer@redhat.com> wrote:
> On 04/28/2017 04:44 PM, H.J. Lu wrote:
>>
>> Any other comments?
>
>
> I don't have any further comments.
>
> Except maybe this: Why isn't this in the kernel, like for the other
> architectures?
>

On x86, CPUID is available to both the kernel and user space.  The kernel
sets AT_PLATFORM and AT_HWCAP from CPUID, but they aren't flexible enough
for setting the dynamic linker search path.  On x86-64, the dynamic linker
search path has no support for platforms or hardware capabilities, and
AT_HWCAP only provides a small subset of the features from CPUID.  My
patch sets the dynamic linker search path in user space from CPUID.  It
works for both i386 and x86-64.
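
For illustration, the following minimal C sketch shows how user space can read the same CPUID bits that the patch groups under "haswell".  It is a sketch under stated assumptions, not the patch's code: the function and macro names are hypothetical, it uses the compiler's <cpuid.h> helpers instead of glibc's cpu_features, it takes LZCNT from leaf 0x80000001 (ECX bit 5), where the CPUID documentation places it, and it does not check that the OS enabled AVX state (the patch does that through AVX2_Usable and FMA_Usable).

```c
#include <stdbool.h>
#if defined __x86_64__ || defined __i386__
# include <cpuid.h>
#endif

/* Bit positions as documented for CPUID; the macro names are local to
   this sketch.  */
#define LEAF1_ECX_FMA     (1u << 12)
#define LEAF1_ECX_MOVBE   (1u << 22)
#define LEAF1_ECX_POPCNT  (1u << 23)
#define LEAF7_EBX_BMI1    (1u << 3)
#define LEAF7_EBX_AVX2    (1u << 5)
#define LEAF7_EBX_BMI2    (1u << 8)
#define EXT1_ECX_LZCNT    (1u << 5)

/* Hypothetical helper: true if the three register values carry every
   feature bit that the patch groups under "haswell".  */
static bool __attribute__ ((unused))
has_haswell_bits (unsigned int leaf1_ecx, unsigned int leaf7_ebx,
                  unsigned int ext1_ecx)
{
  unsigned int need1 = LEAF1_ECX_FMA | LEAF1_ECX_MOVBE | LEAF1_ECX_POPCNT;
  unsigned int need7 = LEAF7_EBX_BMI1 | LEAF7_EBX_AVX2 | LEAF7_EBX_BMI2;

  return ((leaf1_ecx & need1) == need1
          && (leaf7_ebx & need7) == need7
          && (ext1_ecx & EXT1_ECX_LZCNT) != 0);
}

#if defined __x86_64__ || defined __i386__
/* Read the registers with the <cpuid.h> helpers and test them.  Only
   CPUID bits are checked here, not OS support for AVX state.  */
static bool __attribute__ ((unused))
cpu_reports_haswell_bits (void)
{
  unsigned int eax, ebx, ecx, edx;
  unsigned int leaf1_ecx, leaf7_ebx;

  /* Leaf 1: FMA, MOVBE and POPCNT are reported in ECX.  */
  if (!__get_cpuid (1, &eax, &ebx, &ecx, &edx))
    return false;
  leaf1_ecx = ecx;

  /* Leaf 7, subleaf 0: BMI1, AVX2 and BMI2 are reported in EBX.  */
  if (!__get_cpuid_count (7, 0, &eax, &ebx, &ecx, &edx))
    return false;
  leaf7_ebx = ebx;

  /* Extended leaf 0x80000001: LZCNT is reported in ECX.  */
  if (!__get_cpuid (0x80000001, &eax, &ebx, &ecx, &edx))
    return false;

  return has_haswell_bits (leaf1_ecx, leaf7_ebx, ecx);
}
#endif
```

Keeping the bit test separate from the CPUID reads makes the grouping easy to check with fixed register values, independent of the machine the code runs on.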
H.J. Lu May 3, 2017, 2:36 p.m. UTC | #9
On Fri, Apr 28, 2017 at 8:51 AM, H.J. Lu <hjl.tools@gmail.com> wrote:
> On Fri, Apr 28, 2017 at 7:45 AM, Florian Weimer <fweimer@redhat.com> wrote:
>> On 04/28/2017 04:44 PM, H.J. Lu wrote:
>>>
>>> Any other comments?
>>
>>
>> I don't have any further comments.
>>
>> Except maybe this: Why isn't this in the kernel, like for the other
>> architectures?
>>
>
> On x86, CPUID is available to both the kernel and user space.  The kernel
> sets AT_PLATFORM and AT_HWCAP from CPUID, but they aren't flexible enough
> for setting the dynamic linker search path.  On x86-64, the dynamic linker
> search path has no support for platforms or hardware capabilities, and
> AT_HWCAP only provides a small subset of the features from CPUID.  My
> patch sets the dynamic linker search path in user space from CPUID.  It
> works for both i386 and x86-64.
>

I will check it today.

Patch

diff --git a/sysdeps/i386/dl-machine.h b/sysdeps/i386/dl-machine.h
index 99a72f6..57d4a0b 100644
--- a/sysdeps/i386/dl-machine.h
+++ b/sysdeps/i386/dl-machine.h
@@ -233,14 +233,14 @@  _dl_start_user:\n\
 static inline void __attribute__ ((unused))
 dl_platform_init (void)
 {
-  if (GLRO(dl_platform) != NULL && *GLRO(dl_platform) == '\0')
-    /* Avoid an empty string which would disturb us.  */
-    GLRO(dl_platform) = NULL;
-
-#ifdef SHARED
+#if IS_IN (rtld)
   /* init_cpu_features has been called early from __libc_start_main in
      static executable.  */
   init_cpu_features (&GLRO(dl_x86_cpu_features));
+#else
+  if (GLRO(dl_platform) != NULL && *GLRO(dl_platform) == '\0')
+    /* Avoid an empty string which would disturb us.  */
+    GLRO(dl_platform) = NULL;
 #endif
 }
 
diff --git a/sysdeps/i386/dl-procinfo.c b/sysdeps/i386/dl-procinfo.c
index b832830..7237f77 100644
--- a/sysdeps/i386/dl-procinfo.c
+++ b/sysdeps/i386/dl-procinfo.c
@@ -17,10 +17,7 @@ 
    License along with the GNU C Library; if not, see
    <http://www.gnu.org/licenses/>.  */
 
-/* This information must be kept in sync with the _DL_HWCAP_COUNT and
-   _DL_PLATFORM_COUNT definitions in procinfo.h.
-
-   If anything should be added here check whether the size of each string
+/* If anything should be added here check whether the size of each string
    is still ok with the given array size.
 
    All the #ifdefs in the definitions are quite irritating but
@@ -64,21 +61,5 @@  PROCINFO_CLASS const char _dl_x86_cap_flags[32][8]
 ,
 #endif
 
-#if !defined PROCINFO_DECL && defined SHARED
-  ._dl_x86_platforms
-#else
-PROCINFO_CLASS const char _dl_x86_platforms[4][5]
-#endif
-#ifndef PROCINFO_DECL
-= {
-    "i386", "i486", "i586", "i686"
-  }
-#endif
-#if !defined SHARED || defined PROCINFO_DECL
-;
-#else
-,
-#endif
-
 #undef PROCINFO_DECL
 #undef PROCINFO_CLASS
diff --git a/sysdeps/i386/dl-procinfo.h b/sysdeps/i386/dl-procinfo.h
deleted file mode 100644
index 9c38846..0000000
--- a/sysdeps/i386/dl-procinfo.h
+++ /dev/null
@@ -1,102 +0,0 @@ 
-/* i386 version of processor capability information handling macros.
-   Copyright (C) 1998-2017 Free Software Foundation, Inc.
-   This file is part of the GNU C Library.
-   Contributed by Ulrich Drepper <drepper@cygnus.com>, 1998.
-
-   The GNU C Library is free software; you can redistribute it and/or
-   modify it under the terms of the GNU Lesser General Public
-   License as published by the Free Software Foundation; either
-   version 2.1 of the License, or (at your option) any later version.
-
-   The GNU C Library is distributed in the hope that it will be useful,
-   but WITHOUT ANY WARRANTY; without even the implied warranty of
-   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
-   Lesser General Public License for more details.
-
-   You should have received a copy of the GNU Lesser General Public
-   License along with the GNU C Library; if not, see
-   <http://www.gnu.org/licenses/>.  */
-
-#ifndef _DL_PROCINFO_H
-#define _DL_PROCINFO_H	1
-#include <ldsodefs.h>
-
-#define _DL_HWCAP_COUNT 32
-
-#define _DL_PLATFORMS_COUNT	4
-
-/* Start at 48 to reserve some space.  */
-#define _DL_FIRST_PLATFORM	48
-/* Mask to filter out platforms.  */
-#define _DL_HWCAP_PLATFORM	(((1ULL << _DL_PLATFORMS_COUNT) - 1) \
-				 << _DL_FIRST_PLATFORM)
-
-enum
-{
-  HWCAP_I386_FPU   = 1 << 0,
-  HWCAP_I386_VME   = 1 << 1,
-  HWCAP_I386_DE    = 1 << 2,
-  HWCAP_I386_PSE   = 1 << 3,
-  HWCAP_I386_TSC   = 1 << 4,
-  HWCAP_I386_MSR   = 1 << 5,
-  HWCAP_I386_PAE   = 1 << 6,
-  HWCAP_I386_MCE   = 1 << 7,
-  HWCAP_I386_CX8   = 1 << 8,
-  HWCAP_I386_APIC  = 1 << 9,
-  HWCAP_I386_SEP   = 1 << 11,
-  HWCAP_I386_MTRR  = 1 << 12,
-  HWCAP_I386_PGE   = 1 << 13,
-  HWCAP_I386_MCA   = 1 << 14,
-  HWCAP_I386_CMOV  = 1 << 15,
-  HWCAP_I386_FCMOV = 1 << 16,
-  HWCAP_I386_MMX   = 1 << 23,
-  HWCAP_I386_OSFXSR = 1 << 24,
-  HWCAP_I386_XMM   = 1 << 25,
-  HWCAP_I386_XMM2  = 1 << 26,
-  HWCAP_I386_AMD3D = 1 << 31,
-
-  /* XXX Which others to add here?  */
-  HWCAP_IMPORTANT = (HWCAP_I386_XMM2)
-
-};
-
-/* We cannot provide a general printing function.  */
-#define _dl_procinfo(type, word) -1
-
-static inline const char *
-__attribute__ ((unused))
-_dl_hwcap_string (int idx)
-{
-  return GLRO(dl_x86_cap_flags)[idx];
-};
-
-static inline int
-__attribute__ ((unused, always_inline))
-_dl_string_hwcap (const char *str)
-{
-  int i;
-
-  for (i = 0; i < _DL_HWCAP_COUNT; i++)
-    {
-      if (strcmp (str, GLRO(dl_x86_cap_flags)[i]) == 0)
-	return i;
-    }
-  return -1;
-};
-
-static inline int
-__attribute__ ((unused, always_inline))
-_dl_string_platform (const char *str)
-{
-  int i;
-
-  if (str != NULL)
-    for (i = 0; i < _DL_PLATFORMS_COUNT; ++i)
-      {
-	if (strcmp (str, GLRO(dl_x86_platforms)[i]) == 0)
-	  return _DL_FIRST_PLATFORM + i;
-      }
-  return -1;
-};
-
-#endif /* dl-procinfo.h */
diff --git a/sysdeps/unix/sysv/linux/i386/dl-procinfo.h b/sysdeps/unix/sysv/linux/i386/dl-procinfo.h
index d49638c..a3a5f9d 100644
--- a/sysdeps/unix/sysv/linux/i386/dl-procinfo.h
+++ b/sysdeps/unix/sysv/linux/i386/dl-procinfo.h
@@ -17,9 +17,7 @@ 
    License along with the GNU C Library; if not, see
    <http://www.gnu.org/licenses/>.  */
 
-#include <sysdeps/i386/dl-procinfo.h>
-#include <ldsodefs.h>
-
+#include <sysdeps/x86/dl-procinfo.h>
 
 #undef _dl_procinfo
 static inline int
@@ -36,7 +34,7 @@  _dl_procinfo (unsigned int type, unsigned long int word)
 
   _dl_printf ("AT_HWCAP:   ");
 
-  for (i = 0; i < _DL_HWCAP_COUNT; ++i)
+  for (i = 0; i < 32; ++i)
     if (word & (1 << i))
       _dl_printf (" %s", GLRO(dl_x86_cap_flags)[i]);
 
diff --git a/sysdeps/unix/sysv/linux/x86_64/dl-procinfo.h b/sysdeps/unix/sysv/linux/x86_64/dl-procinfo.h
index 7829e1c..7b45fe4 100644
--- a/sysdeps/unix/sysv/linux/x86_64/dl-procinfo.h
+++ b/sysdeps/unix/sysv/linux/x86_64/dl-procinfo.h
@@ -1,5 +1,5 @@ 
 #if IS_IN (ldconfig)
 # include <sysdeps/unix/sysv/linux/i386/dl-procinfo.h>
 #else
-# include <sysdeps/generic/dl-procinfo.h>
+# include <sysdeps/x86/dl-procinfo.h>
 #endif
diff --git a/sysdeps/x86/cpu-features.c b/sysdeps/x86/cpu-features.c
index f30918d..b481f50 100644
--- a/sysdeps/x86/cpu-features.c
+++ b/sysdeps/x86/cpu-features.c
@@ -18,6 +18,7 @@ 
 
 #include <cpuid.h>
 #include <cpu-features.h>
+#include <dl-hwcap.h>
 
 static void
 get_common_indeces (struct cpu_features *cpu_features,
@@ -310,4 +311,51 @@  no_cpuid:
   cpu_features->family = family;
   cpu_features->model = model;
   cpu_features->kind = kind;
+
+#if IS_IN (rtld)
+  /* Reuse dl_platform, dl_hwcap and dl_hwcap_mask for x86.  */
+  GLRO(dl_platform) = NULL;
+  GLRO(dl_hwcap) = 0;
+  GLRO(dl_hwcap_mask) = HWCAP_IMPORTANT;
+
+# ifdef __x86_64__
+  if (cpu_features->kind == arch_kind_intel)
+    {
+      if (CPU_FEATURES_ARCH_P (cpu_features, AVX512F_Usable)
+	  && CPU_FEATURES_CPU_P (cpu_features, AVX512CD))
+	{
+	  if (CPU_FEATURES_CPU_P (cpu_features, AVX512ER))
+	    {
+	      if (CPU_FEATURES_CPU_P (cpu_features, AVX512PF))
+		GLRO(dl_platform) = "xeon_phi";
+	    }
+	  else
+	    {
+	      if (CPU_FEATURES_CPU_P (cpu_features, AVX512BW)
+		  && CPU_FEATURES_CPU_P (cpu_features, AVX512DQ)
+		  && CPU_FEATURES_CPU_P (cpu_features, AVX512VL))
+		GLRO(dl_hwcap) |= HWCAP_X86_AVX512_1;
+	    }
+	}
+
+      if (GLRO(dl_platform) == NULL
+	  && CPU_FEATURES_ARCH_P (cpu_features, AVX2_Usable)
+	  && CPU_FEATURES_ARCH_P (cpu_features, FMA_Usable)
+	  && CPU_FEATURES_CPU_P (cpu_features, BMI1)
+	  && CPU_FEATURES_CPU_P (cpu_features, BMI2)
+	  && CPU_FEATURES_CPU_P (cpu_features, LZCNT)
+	  && CPU_FEATURES_CPU_P (cpu_features, MOVBE)
+	  && CPU_FEATURES_CPU_P (cpu_features, POPCNT))
+	GLRO(dl_platform) = "haswell";
+    }
+# else
+  if (CPU_FEATURES_CPU_P (cpu_features, SSE2))
+    GLRO(dl_hwcap) |= HWCAP_X86_SSE2;
+
+  if (CPU_FEATURES_ARCH_P (cpu_features, I686))
+    GLRO(dl_platform) = "i686";
+  else if (CPU_FEATURES_ARCH_P (cpu_features, I586))
+    GLRO(dl_platform) = "i586";
+# endif
+#endif
 }
diff --git a/sysdeps/x86/cpu-features.h b/sysdeps/x86/cpu-features.h
index 85a39e7..31c7c80 100644
--- a/sysdeps/x86/cpu-features.h
+++ b/sysdeps/x86/cpu-features.h
@@ -57,8 +57,13 @@ 
 #define bit_cpu_FMA		(1 << 12)
 #define bit_cpu_FMA4		(1 << 16)
 #define bit_cpu_HTT		(1 << 28)
+#define bit_cpu_LZCNT		(1 << 5)
+#define bit_cpu_MOVBE		(1 << 22)
+#define bit_cpu_POPCNT		(1 << 23)
 
 /* COMMON_CPUID_INDEX_7.  */
+#define bit_cpu_BMI1		(1 << 3)
+#define bit_cpu_BMI2		(1 << 8)
 #define bit_cpu_ERMS		(1 << 9)
 #define bit_cpu_RTM		(1 << 11)
 #define bit_cpu_AVX2		(1 << 5)
@@ -258,6 +263,11 @@  extern const struct cpu_features *__get_cpu_features (void)
 # define index_cpu_POPCOUNT	COMMON_CPUID_INDEX_1
 # define index_cpu_OSXSAVE	COMMON_CPUID_INDEX_1
 # define index_cpu_HTT		COMMON_CPUID_INDEX_1
+# define index_cpu_BMI1		COMMON_CPUID_INDEX_7
+# define index_cpu_BMI2		COMMON_CPUID_INDEX_7
+# define index_cpu_LZCNT	COMMON_CPUID_INDEX_1
+# define index_cpu_MOVBE	COMMON_CPUID_INDEX_1
+# define index_cpu_POPCNT	COMMON_CPUID_INDEX_1
 
 # define reg_CX8		edx
 # define reg_CMOV		edx
@@ -282,6 +292,11 @@  extern const struct cpu_features *__get_cpu_features (void)
 # define reg_POPCOUNT		ecx
 # define reg_OSXSAVE		ecx
 # define reg_HTT		edx
+# define reg_BMI1		ebx
+# define reg_BMI2		ebx
+# define reg_LZCNT		ecx
+# define reg_MOVBE		ecx
+# define reg_POPCNT		ecx
 
 # define index_arch_Fast_Rep_String	FEATURE_INDEX_1
 # define index_arch_Fast_Copy_Backward	FEATURE_INDEX_1
diff --git a/sysdeps/x86/dl-hwcap.h b/sysdeps/x86/dl-hwcap.h
new file mode 100644
index 0000000..c956684
--- /dev/null
+++ b/sysdeps/x86/dl-hwcap.h
@@ -0,0 +1,75 @@ 
+/* x86 version of hardware capability information handling macros.
+   Copyright (C) 2017 Free Software Foundation, Inc.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   <http://www.gnu.org/licenses/>.  */
+
+#ifndef _DL_HWCAP_H
+#define _DL_HWCAP_H
+
+#if IS_IN (ldconfig)
+/* Since ldconfig processes both i386 and x86-64 libraries, it needs
+   to cover all platforms and hardware capabilities.  */
+# define HWCAP_PLATFORMS_START	0
+# define HWCAP_PLATFORMS_COUNT	4
+# define HWCAP_START		0
+# define HWCAP_COUNT		2
+# define HWCAP_IMPORTANT	(HWCAP_X86_SSE2 | HWCAP_X86_AVX512_1)
+#elif defined __x86_64__
+/* For 64 bit, only cover x86-64 platforms and capabilities.  */
+# define HWCAP_PLATFORMS_START	2
+# define HWCAP_PLATFORMS_COUNT	4
+# define HWCAP_START		1
+# define HWCAP_COUNT		2
+# define HWCAP_IMPORTANT	(HWCAP_X86_AVX512_1)
+#else
+/* For 32 bit, only cover i586, i686 and SSE2.  */
+# define HWCAP_PLATFORMS_START	0
+# define HWCAP_PLATFORMS_COUNT	2
+# define HWCAP_START		0
+# define HWCAP_COUNT		1
+# define HWCAP_IMPORTANT	(HWCAP_X86_SSE2)
+#endif
+
+enum
+{
+  HWCAP_X86_SSE2		= 1 << 0,
+  HWCAP_X86_AVX512_1		= 1 << 1
+};
+
+static inline const char *
+__attribute__ ((unused))
+_dl_hwcap_string (int idx)
+{
+  return GLRO(dl_x86_hwcap_flags)[idx];
+}
+
+static inline int
+__attribute__ ((unused, always_inline))
+_dl_string_hwcap (const char *str)
+{
+  int i;
+
+  for (i = HWCAP_START; i < HWCAP_COUNT; i++)
+    {
+      if (strcmp (str, GLRO(dl_x86_hwcap_flags)[i]) == 0)
+	return i;
+    }
+  return -1;
+}
+
+/* We cannot provide a general printing function.  */
+#define _dl_procinfo(type, word) -1
+
+#endif /* dl-hwcap.h */
diff --git a/sysdeps/x86/dl-procinfo.c b/sysdeps/x86/dl-procinfo.c
index 9d154bf..43ab8fe 100644
--- a/sysdeps/x86/dl-procinfo.c
+++ b/sysdeps/x86/dl-procinfo.c
@@ -16,7 +16,11 @@ 
    License along with the GNU C Library; if not, see
    <http://www.gnu.org/licenses/>.  */
 
-/* If anything should be added here check whether the size of each string
+/* This information must be kept in sync with the HWCAP_COUNT,
+   HWCAP_PLATFORMS_START and HWCAP_PLATFORMS_COUNT definitions in
+   dl-hwcap.h.
+
+   If anything should be added here check whether the size of each string
    is still ok with the given array size.
 
    All the #ifdefs in the definitions are quite irritating but
@@ -50,3 +54,35 @@  PROCINFO_CLASS struct cpu_features _dl_x86_cpu_features
 ,
 # endif
 #endif
+
+#if !defined PROCINFO_DECL && defined SHARED
+  ._dl_x86_hwcap_flags
+#else
+PROCINFO_CLASS const char _dl_x86_hwcap_flags[2][9]
+#endif
+#ifndef PROCINFO_DECL
+= {
+    "sse2", "avx512_1"
+  }
+#endif
+#if !defined SHARED || defined PROCINFO_DECL
+;
+#else
+,
+#endif
+
+#if !defined PROCINFO_DECL && defined SHARED
+  ._dl_x86_platforms
+#else
+PROCINFO_CLASS const char _dl_x86_platforms[4][9]
+#endif
+#ifndef PROCINFO_DECL
+= {
+    "i586", "i686", "haswell", "xeon_phi"
+  }
+#endif
+#if !defined SHARED || defined PROCINFO_DECL
+;
+#else
+,
+#endif
diff --git a/sysdeps/x86/dl-procinfo.h b/sysdeps/x86/dl-procinfo.h
new file mode 100644
index 0000000..5feb146
--- /dev/null
+++ b/sysdeps/x86/dl-procinfo.h
@@ -0,0 +1,48 @@ 
+/* x86 version of processor capability information handling macros.
+   Copyright (C) 2017 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   <http://www.gnu.org/licenses/>.  */
+
+#ifndef _DL_PROCINFO_H
+#define _DL_PROCINFO_H	1
+#include <ldsodefs.h>
+#include <dl-hwcap.h>
+
+#define _DL_HWCAP_COUNT		HWCAP_COUNT
+#define _DL_PLATFORMS_COUNT	HWCAP_PLATFORMS_COUNT
+
+/* Start at 48 to reserve space for hardware capabilities.  */
+#define _DL_FIRST_PLATFORM	48
+/* Mask to filter out platforms.  */
+#define _DL_HWCAP_PLATFORM	(((1ULL << _DL_PLATFORMS_COUNT) - 1) \
+				 << _DL_FIRST_PLATFORM)
+
+static inline int
+__attribute__ ((unused, always_inline))
+_dl_string_platform (const char *str)
+{
+  int i;
+
+  if (str != NULL)
+    for (i = HWCAP_PLATFORMS_START; i < HWCAP_PLATFORMS_COUNT; ++i)
+      {
+	if (strcmp (str, GLRO(dl_x86_platforms)[i]) == 0)
+	  return _DL_FIRST_PLATFORM + i;
+      }
+  return -1;
+}
+
+#endif /* dl-procinfo.h */
diff --git a/sysdeps/x86_64/dl-machine.h b/sysdeps/x86_64/dl-machine.h
index daf4d8c..0015db4 100644
--- a/sysdeps/x86_64/dl-machine.h
+++ b/sysdeps/x86_64/dl-machine.h
@@ -240,14 +240,14 @@  _dl_start_user:\n\
 static inline void __attribute__ ((unused))
 dl_platform_init (void)
 {
-  if (GLRO(dl_platform) != NULL && *GLRO(dl_platform) == '\0')
-    /* Avoid an empty string which would disturb us.  */
-    GLRO(dl_platform) = NULL;
-
-#ifdef SHARED
+#if IS_IN (rtld)
   /* init_cpu_features has been called early from __libc_start_main in
      static executable.  */
   init_cpu_features (&GLRO(dl_x86_cpu_features));
+#else
+  if (GLRO(dl_platform) != NULL && *GLRO(dl_platform) == '\0')
+    /* Avoid an empty string which would disturb us.  */
+    GLRO(dl_platform) = NULL;
 #endif
 }
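
To make the index ranges in dl-hwcap.h concrete, here is a minimal standalone sketch of the lookup that _dl_string_platform performs in the patch.  The platform table and the constant 48 are copied from the patch; the name string_platform and its explicit start/count parameters are assumptions for illustration (the patch takes them from HWCAP_PLATFORMS_START and HWCAP_PLATFORMS_COUNT at compile time).

```c
#include <stddef.h>
#include <string.h>

/* Mirrors _DL_FIRST_PLATFORM from the patch.  */
#define FIRST_PLATFORM 48

/* Same contents as _dl_x86_platforms in the patch.  */
static const char platforms[4][9] =
  {
    "i586", "i686", "haswell", "xeon_phi"
  };

/* Hypothetical stand-in for _dl_string_platform: search the table
   entries START .. COUNT - 1 and return the platform bit number, or
   -1 if STR is NULL or not in that range.  */
static int
string_platform (const char *str, int start, int count)
{
  if (str != NULL)
    for (int i = start; i < count; ++i)
      if (strcmp (str, platforms[i]) == 0)
        return FIRST_PLATFORM + i;
  return -1;
}
```

With the x86-64 range (start 2, count 4), "haswell" maps to 50 and "i686" is rejected; with the 32-bit range (start 0, count 2), "i686" maps to 49 and "haswell" is rejected.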