From patchwork Thu Jun 25 12:30:59 2020
X-Patchwork-Submitter: "H.J. Lu"
X-Patchwork-Id: 1316873
Date: Thu, 25 Jun 2020 05:30:59 -0700
From: "H.J. Lu"
Reply-To: "H.J. Lu"
To: Florian Weimer
Cc: "H.J. Lu via Libc-alpha"
Subject: V5: [PATCH] x86: Install <sys/platform/x86.h> [BZ #26124]
Message-ID: <20200625123059.GA1169557@gmail.com>
References: <87ftamg7ez.fsf@oldenburg2.str.redhat.com>
 <87a70ug5v8.fsf@oldenburg2.str.redhat.com>
 <871rm4bkj6.fsf@oldenburg2.str.redhat.com>
 <697cab8a-2d75-ef36-9e09-dbfe6daa4ae1@linaro.org>
 <87a70r8urd.fsf@oldenburg2.str.redhat.com>
In-Reply-To: <87a70r8urd.fsf@oldenburg2.str.redhat.com>
List-Id: Libc-alpha mailing list

On Thu, Jun 25, 2020 at 09:33:26AM +0200, Florian Weimer wrote:
> * H. J. Lu via Libc-alpha:
>
> >> is not expected to be extendable.  The macro API is not my favorite way of
> >
> > We can expand the cpuid array.  We just need to add an alias to
> > __x86_get_cpu_features
> > with a new symbol version.
>
> Agreed, from a GNU gABI perspective.
>
> However, for some distributions, this requirement will put hardware
> enablement for new CPUs on hold until we have reworked our package
> dependency generation, so that we can express symbol-specific version
> information.  Debian & downstreams can already do this, with some manual
> work, but Fedora & downstreams can only express versions on sonames.
> (Without these changes, we would have to backport the entire GLIBC_2.33
> symbol set to get __x86_get_cpu_features@GLIBC_2.33, for example.  This
> is not something we will be able to do in all cases, depending on the
> other changes that are going in.)
>
> Even then, we can only backport such changes if a glibc release has
> happened (so that we can be sure that the meaning of the symbol version
> will not change again).
>
> My proposal, where the index is passed to the function and the function
> returns the flag word, does not have these issues: old glibcs will
> simply return 0 flags, and the feature appears unusable/unavailable.
> This is what already happens with your approach, too, if we use more
> bits inside the existing array elements.
>
> I'm not entirely opposed to enhancing the RPM dependency generation, but
> I already tried once and couldn't get it done, and if I fail again, it
> might seriously impact CPU hardware enablement in Red Hat Enterprise
> Linux 9.  I hope this explains my reservations about this interface
> design.
>

Thanks for your detailed explanation.  Here is the revised patch.
COMMON_CPUID_INDEX_MAX and USABLE_FEATURE_INDEX_MAX are passed to
__x86_get_cpu_features so that when USABLE_FEATURE_INDEX_MAX or
COMMON_CPUID_INDEX_MAX is increased to support new processor features,
__x86_get_cpu_features in the older glibc binaries returns NULL and
HAS_CPU_FEATURE/CPU_FEATURE_USABLE return false for the new processor
feature.  No new symbol version is needed.

Any comments?

Thanks.

H.J.
---
Install <sys/platform/x86.h> so that programmers can do

  #if __has_include(<sys/platform/x86.h>)
  #include <sys/platform/x86.h>
  #endif
  ...

    if (HAS_CPU_FEATURE (SSE2))
  ...
    if (CPU_FEATURE_USABLE (AVX2))
  ...

<sys/platform/x86.h> exports only:

  enum
  {
    /* The integer bit array index for the first set of usable feature
       bits.  */
    USABLE_FEATURE_INDEX_1 = 0,
    /* The current maximum size of the feature integer bit array.  */
    USABLE_FEATURE_INDEX_MAX
  };

  enum
  {
    COMMON_CPUID_INDEX_1 = 0,
    COMMON_CPUID_INDEX_7,
    COMMON_CPUID_INDEX_80000001,
    COMMON_CPUID_INDEX_D_ECX_1,
    COMMON_CPUID_INDEX_80000007,
    COMMON_CPUID_INDEX_80000008,
    COMMON_CPUID_INDEX_7_ECX_1,
    /* Keep the following line at the end.  */
    COMMON_CPUID_INDEX_MAX
  };

  struct cpu_features
  {
    struct cpu_features_basic basic;
    unsigned int *usable_p;
    struct cpuid_registers cpuid[COMMON_CPUID_INDEX_MAX];
  };

  /* Get a pointer to the CPU features structure.  */
  extern const struct cpu_features *__x86_get_cpu_features
    (unsigned int max_cpuid, unsigned int max_usable)
    __attribute__ ((const));

Since all feature checks are done through macros, programs compiled with
a newer <sys/platform/x86.h> are compatible with the older glibc binaries
as long as the layout of struct cpu_features is identical.  The cpuid
array can be expanded with backward binary compatibility for both .o and
.so files.

When USABLE_FEATURE_INDEX_MAX or COMMON_CPUID_INDEX_MAX is increased to
support new processor features, __x86_get_cpu_features in the older glibc
binaries returns NULL and HAS_CPU_FEATURE/CPU_FEATURE_USABLE return false
for the new processor feature.  No new symbol version is needed.

Note: Although GCC has __builtin_cpu_supports, it only supports a subset
of <sys/platform/x86.h> and it is equivalent to CPU_FEATURE_USABLE.  It
doesn't support HAS_CPU_FEATURE.
---
 NEWS                                          |   2 +
 manual/platform.texi                          |  23 ++
 sysdeps/unix/sysv/linux/i386/ld.abilist       |   1 +
 sysdeps/unix/sysv/linux/x86_64/64/ld.abilist  |   1 +
 sysdeps/unix/sysv/linux/x86_64/x32/ld.abilist |   1 +
 sysdeps/x86/Makefile                          |   1 +
 sysdeps/x86/Versions                          |   4 +-
 sysdeps/x86/dl-get-cpu-features.c             |   7 +-
 sysdeps/x86/include/cpu-features.h            | 216 ++++++++++++++++++
 .../{cpu-features.h => sys/platform/x86.h}    | 185 ++-------------
 sysdeps/x86/tst-get-cpu-features.c            |   8 +-
 sysdeps/x86_64/fpu/math-tests-arch.h          |   8 +-
 sysdeps/x86_64/multiarch/test-multiarch.c     |  10 +-
 13 files changed, 289 insertions(+), 178 deletions(-)
 create mode 100644 sysdeps/x86/include/cpu-features.h
 rename sysdeps/x86/{cpu-features.h => sys/platform/x86.h} (77%)

diff --git a/NEWS b/NEWS
index a660fc59a8..ae7d1ece35 100644
--- a/NEWS
+++ b/NEWS
@@ -9,6 +9,8 @@ Version 2.32
 
 Major new features:
 
+* Add <sys/platform/x86.h> to provide query macros for x86 CPU features.
+
 * Unicode 12.1.0 Support: Character encoding, character type info, and
   transliteration tables are all updated to Unicode 12.1.0, using
   generator scripts contributed by Mike FABIAN (Red Hat).
diff --git a/manual/platform.texi b/manual/platform.texi
index 504addc956..4cc58fbc6a 100644
--- a/manual/platform.texi
+++ b/manual/platform.texi
@@ -7,6 +7,7 @@
 @menu
 * PowerPC::           Facilities Specific to the PowerPC Architecture
 * RISC-V::            Facilities Specific to the RISC-V Architecture
+* X86::               Facilities Specific to the X86 Architecture
 @end menu
 
 @node PowerPC
@@ -134,3 +135,25 @@ all threads in the current process.  Setting the
 ordering on only the current thread is necessary.  All other flag bits are
 reserved.
 @end deftypefun
+
+@node X86
+@appendixsec X86-specific Facilities
+
+Facilities specific to X86 that are not specific to a particular
+operating system are declared in @file{sys/platform/x86.h}.
+
+@deftypefun {const struct cpu_features *} __x86_get_cpu_features (unsigned int, unsigned int)
+@safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
+Return a pointer to x86 CPU feature structure used by query macros for x86
+CPU features.
+@end deftypefun
+
+@deftypefn Macro int HAS_CPU_FEATURE (@var{name})
+This macro returns a nonzero value (true) if the processor has the feature
+@var{name}.
+@end deftypefn
+
+@deftypefn Macro int CPU_FEATURE_USABLE (@var{name})
+This macro returns a nonzero value (true) if the processor has the feature
+@var{name} and the feature is supported by the operating system.
+@end deftypefn
diff --git a/sysdeps/unix/sysv/linux/i386/ld.abilist b/sysdeps/unix/sysv/linux/i386/ld.abilist
index 0478e22071..1226876689 100644
--- a/sysdeps/unix/sysv/linux/i386/ld.abilist
+++ b/sysdeps/unix/sysv/linux/i386/ld.abilist
@@ -3,3 +3,4 @@ GLIBC_2.1 __libc_stack_end D 0x4
 GLIBC_2.1 _dl_mcount F
 GLIBC_2.3 ___tls_get_addr F
 GLIBC_2.3 __tls_get_addr F
+GLIBC_2.32 __x86_get_cpu_features F
diff --git a/sysdeps/unix/sysv/linux/x86_64/64/ld.abilist b/sysdeps/unix/sysv/linux/x86_64/64/ld.abilist
index d3cdf7611e..886e57abd5 100644
--- a/sysdeps/unix/sysv/linux/x86_64/64/ld.abilist
+++ b/sysdeps/unix/sysv/linux/x86_64/64/ld.abilist
@@ -2,3 +2,4 @@ GLIBC_2.2.5 __libc_stack_end D 0x8
 GLIBC_2.2.5 _dl_mcount F
 GLIBC_2.2.5 _r_debug D 0x28
 GLIBC_2.3 __tls_get_addr F
+GLIBC_2.32 __x86_get_cpu_features F
diff --git a/sysdeps/unix/sysv/linux/x86_64/x32/ld.abilist b/sysdeps/unix/sysv/linux/x86_64/x32/ld.abilist
index c70bccf782..0d2f8a2cc5 100644
--- a/sysdeps/unix/sysv/linux/x86_64/x32/ld.abilist
+++ b/sysdeps/unix/sysv/linux/x86_64/x32/ld.abilist
@@ -2,3 +2,4 @@ GLIBC_2.16 __libc_stack_end D 0x4
 GLIBC_2.16 __tls_get_addr F
 GLIBC_2.16 _dl_mcount F
 GLIBC_2.16 _r_debug D 0x14
+GLIBC_2.32 __x86_get_cpu_features F
diff --git a/sysdeps/x86/Makefile b/sysdeps/x86/Makefile
index beab426f67..0e4d132803 100644
--- a/sysdeps/x86/Makefile
+++ b/sysdeps/x86/Makefile
@@ -4,6 +4,7 @@ endif
 
 ifeq ($(subdir),elf)
 sysdep-dl-routines += dl-get-cpu-features
+sysdep_headers += sys/platform/x86.h
 
 tests += tst-get-cpu-features tst-get-cpu-features-static
 tests-static += tst-get-cpu-features-static
diff --git a/sysdeps/x86/Versions b/sysdeps/x86/Versions
index e02923708e..7e3139dbb1 100644
--- a/sysdeps/x86/Versions
+++ b/sysdeps/x86/Versions
@@ -1,5 +1,5 @@
 ld {
-  GLIBC_PRIVATE {
-    __get_cpu_features;
+  GLIBC_2.32 {
+    __x86_get_cpu_features;
   }
 }
diff --git a/sysdeps/x86/dl-get-cpu-features.c b/sysdeps/x86/dl-get-cpu-features.c
index 9d61cd56be..68cdee26c1 100644
--- a/sysdeps/x86/dl-get-cpu-features.c
+++ b/sysdeps/x86/dl-get-cpu-features.c
@@ -18,10 +18,13 @@
 
 #include
 
-#undef __get_cpu_features
+#undef __x86_get_cpu_features
 
 const struct cpu_features *
-__get_cpu_features (void)
+__x86_get_cpu_features (unsigned int max_cpuid, unsigned int max_usable)
 {
+  if (max_cpuid > COMMON_CPUID_INDEX_MAX
+      || max_usable > USABLE_FEATURE_INDEX_MAX)
+    return NULL;
   return &GLRO(dl_x86_cpu_features);
 }
diff --git a/sysdeps/x86/include/cpu-features.h b/sysdeps/x86/include/cpu-features.h
new file mode 100644
index 0000000000..8fe2e42e25
--- /dev/null
+++ b/sysdeps/x86/include/cpu-features.h
@@ -0,0 +1,216 @@
+/* Data structure for x86 CPU features.
+   Copyright (C) 2020 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   .  */
+
+#ifndef _PRIVATE_CPU_FEATURES_H
+#define _PRIVATE_CPU_FEATURES_H 1
+
+#ifdef _CPU_FEATURES_H
+# error this should be impossible
+#endif
+
+#ifndef _ISOMAC
+/* Get most of the contents from the public header, but we define a
+   different `struct cpu_features' type for private use.  */
+# define cpu_features cpu_features_public
+# define __x86_get_cpu_features __x86_get_cpu_features_public
+#endif
+
+#include
+
+#ifndef _ISOMAC
+
+# undef cpu_features
+# undef __x86_get_cpu_features
+# define __get_cpu_features() \
+  __x86_get_cpu_features (COMMON_CPUID_INDEX_MAX, USABLE_FEATURE_INDEX_MAX)
+
+enum
+{
+  /* The integer bit array index for the first set of preferred feature
+     bits.  */
+  PREFERRED_FEATURE_INDEX_1 = 0,
+  /* The current maximum size of the feature integer bit array.  */
+  PREFERRED_FEATURE_INDEX_MAX
+};
+
+# undef CPU_FEATURES_ARCH_P
+# define CPU_FEATURES_ARCH_P(ptr, name) \
+  ((ptr->feature_##name[index_arch_##name] & (bit_arch_##name)) != 0)
+
+/* HAS_CPU_FEATURE evaluates to true if CPU supports the feature.  */
+# undef HAS_CPU_FEATURE
+# define HAS_CPU_FEATURE(name) \
+  CPU_FEATURES_CPU_P (__x86_get_cpu_features (COMMON_CPUID_INDEX_MAX, \
+                                              USABLE_FEATURE_INDEX_MAX), \
+                      name)
+/* HAS_ARCH_FEATURE evaluates to true if we may use the feature at
+   runtime.  */
+# define HAS_ARCH_FEATURE(name) \
+  CPU_FEATURES_ARCH_P (__x86_get_cpu_features (COMMON_CPUID_INDEX_MAX, \
+                                               USABLE_FEATURE_INDEX_MAX), \
+                       name)
+/* CPU_FEATURE_USABLE evaluates to true if the feature is usable.  */
+# undef CPU_FEATURE_USABLE
+# define CPU_FEATURE_USABLE(name) \
+  CPU_FEATURES_ARCH_P (__x86_get_cpu_features (COMMON_CPUID_INDEX_MAX, \
+                                               USABLE_FEATURE_INDEX_MAX), \
+                       name##_Usable)
+
+/* USABLE_FEATURE_INDEX_1.  */
+# define feature_AVX_Usable usable
+# define feature_AVX2_Usable usable
+# define feature_AVX512F_Usable usable
+# define feature_AVX512CD_Usable usable
+# define feature_AVX512ER_Usable usable
+# define feature_AVX512PF_Usable usable
+# define feature_AVX512VL_Usable usable
+# define feature_AVX512BW_Usable usable
+# define feature_AVX512DQ_Usable usable
+# define feature_AVX512_4FMAPS_Usable usable
+# define feature_AVX512_4VNNIW_Usable usable
+# define feature_AVX512_BITALG_Usable usable
+# define feature_AVX512_IFMA_Usable usable
+# define feature_AVX512_VBMI_Usable usable
+# define feature_AVX512_VBMI2_Usable usable
+# define feature_AVX512_VNNI_Usable usable
+# define feature_AVX512_VPOPCNTDQ_Usable usable
+# define feature_FMA_Usable usable
+# define feature_FMA4_Usable usable
+# define feature_VAES_Usable usable
+# define feature_VPCLMULQDQ_Usable usable
+# define feature_XOP_Usable usable
+# define feature_XSAVEC_Usable usable
+# define feature_F16C_Usable usable
+# define feature_AVX512_VP2INTERSECT_Usable usable
+# define feature_AVX512_BF16_Usable usable
+# define feature_PKU_Usable usable
+
+/* PREFERRED_FEATURE_INDEX_1.  */
+# define bit_arch_I586 (1u << 0)
+# define bit_arch_I686 (1u << 1)
+# define bit_arch_Fast_Rep_String (1u << 2)
+# define bit_arch_Fast_Copy_Backward (1u << 3)
+# define bit_arch_Fast_Unaligned_Load (1u << 4)
+# define bit_arch_Fast_Unaligned_Copy (1u << 5)
+# define bit_arch_Slow_BSF (1u << 6)
+# define bit_arch_Slow_SSE4_2 (1u << 7)
+# define bit_arch_AVX_Fast_Unaligned_Load (1u << 8)
+# define bit_arch_Prefer_MAP_32BIT_EXEC (1u << 9)
+# define bit_arch_Prefer_PMINUB_for_stringop (1u << 10)
+# define bit_arch_Prefer_No_VZEROUPPER (1u << 11)
+# define bit_arch_Prefer_ERMS (1u << 12)
+# define bit_arch_Prefer_FSRM (1u << 13)
+# define bit_arch_Prefer_No_AVX512 (1u << 14)
+# define bit_arch_MathVec_Prefer_No_AVX512 (1u << 15)
+
+# define index_arch_Fast_Rep_String PREFERRED_FEATURE_INDEX_1
+# define index_arch_Fast_Copy_Backward PREFERRED_FEATURE_INDEX_1
+# define index_arch_Slow_BSF PREFERRED_FEATURE_INDEX_1
+# define index_arch_Fast_Unaligned_Load PREFERRED_FEATURE_INDEX_1
+# define index_arch_Prefer_PMINUB_for_stringop PREFERRED_FEATURE_INDEX_1
+# define index_arch_Fast_Unaligned_Copy PREFERRED_FEATURE_INDEX_1
+# define index_arch_I586 PREFERRED_FEATURE_INDEX_1
+# define index_arch_I686 PREFERRED_FEATURE_INDEX_1
+# define index_arch_Slow_SSE4_2 PREFERRED_FEATURE_INDEX_1
+# define index_arch_AVX_Fast_Unaligned_Load PREFERRED_FEATURE_INDEX_1
+# define index_arch_Prefer_MAP_32BIT_EXEC PREFERRED_FEATURE_INDEX_1
+# define index_arch_Prefer_No_VZEROUPPER PREFERRED_FEATURE_INDEX_1
+# define index_arch_Prefer_ERMS PREFERRED_FEATURE_INDEX_1
+# define index_arch_Prefer_No_AVX512 PREFERRED_FEATURE_INDEX_1
+# define index_arch_MathVec_Prefer_No_AVX512 PREFERRED_FEATURE_INDEX_1
+# define index_arch_Prefer_FSRM PREFERRED_FEATURE_INDEX_1
+
+# define feature_Fast_Rep_String preferred
+# define feature_Fast_Copy_Backward preferred
+# define feature_Slow_BSF preferred
+# define feature_Fast_Unaligned_Load preferred
+# define feature_Prefer_PMINUB_for_stringop preferred
+# define feature_Fast_Unaligned_Copy preferred
+# define feature_I586 preferred
+# define feature_I686 preferred
+# define feature_Slow_SSE4_2 preferred
+# define feature_AVX_Fast_Unaligned_Load preferred
+# define feature_Prefer_MAP_32BIT_EXEC preferred
+# define feature_Prefer_No_VZEROUPPER preferred
+# define feature_Prefer_ERMS preferred
+# define feature_Prefer_No_AVX512 preferred
+# define feature_MathVec_Prefer_No_AVX512 preferred
+# define feature_Prefer_FSRM preferred
+
+/* XCR0 Feature flags.  */
+# define bit_XMM_state (1u << 1)
+# define bit_YMM_state (1u << 2)
+# define bit_Opmask_state (1u << 5)
+# define bit_ZMM0_15_state (1u << 6)
+# define bit_ZMM16_31_state (1u << 7)
+
+struct cpu_features
+{
+  struct cpu_features_basic basic;
+  unsigned int *usable_p;
+  struct cpuid_registers cpuid[COMMON_CPUID_INDEX_MAX];
+  unsigned int usable[USABLE_FEATURE_INDEX_MAX];
+  unsigned int preferred[PREFERRED_FEATURE_INDEX_MAX];
+  /* The state size for XSAVEC or XSAVE.  The type must be unsigned long
+     int so that we use
+
+       sub xsave_state_size_offset(%rip) %RSP_LP
+
+     in _dl_runtime_resolve.  */
+  unsigned long int xsave_state_size;
+  /* The full state size for XSAVE when XSAVEC is disabled by
+
+     GLIBC_TUNABLES=glibc.cpu.hwcaps=-XSAVEC_Usable
+   */
+  unsigned int xsave_state_full_size;
+  /* Data cache size for use in memory and string routines, typically
+     L1 size.  */
+  unsigned long int data_cache_size;
+  /* Shared cache size for use in memory and string routines, typically
+     L2 or L3 size.  */
+  unsigned long int shared_cache_size;
+  /* Threshold to use non temporal store.  */
+  unsigned long int non_temporal_threshold;
+};
+
+# if defined (_LIBC) && !IS_IN (nonlib)
+/* Unused for x86.  */
+#  define INIT_ARCH()
+#  define __x86_get_cpu_features(c, u) (&GLRO(dl_x86_cpu_features))
+# endif
+
+# ifdef __x86_64__
+#  define HAS_CPUID 1
+# elif (defined __i586__ || defined __pentium__ \
+        || defined __geode__ || defined __k6__)
+#  define HAS_CPUID 1
+#  define HAS_I586 1
+#  define HAS_I686 HAS_ARCH_FEATURE (I686)
+# elif defined __i486__
+#  define HAS_CPUID 0
+#  define HAS_I586 HAS_ARCH_FEATURE (I586)
+#  define HAS_I686 HAS_ARCH_FEATURE (I686)
+# else
+#  define HAS_CPUID 1
+#  define HAS_I586 1
+#  define HAS_I686 1
+# endif
+
+#endif /* !_ISOMAC */
+
+#endif /* include/cpu-features.h */
diff --git a/sysdeps/x86/cpu-features.h b/sysdeps/x86/sys/platform/x86.h
similarity index 77%
rename from sysdeps/x86/cpu-features.h
rename to sysdeps/x86/sys/platform/x86.h
index 574f055e0c..6d0310892b 100644
--- a/sysdeps/x86/cpu-features.h
+++ b/sysdeps/x86/sys/platform/x86.h
@@ -1,4 +1,5 @@
-/* This file is part of the GNU C Library.
+/* Data structure for x86 CPU features.
+   This file is part of the GNU C Library.
    Copyright (C) 2008-2020 Free Software Foundation, Inc.
 
    The GNU C Library is free software; you can redistribute it and/or
@@ -15,8 +16,8 @@
    License along with the GNU C Library; if not, see
    .  */
 
-#ifndef cpu_features_h
-#define cpu_features_h
+#ifndef _SYS_PLATFORM_X86_H
+#define _SYS_PLATFORM_X86_H
 
 enum
 {
@@ -27,15 +28,6 @@ enum
   USABLE_FEATURE_INDEX_MAX
 };
 
-enum
-{
-  /* The integer bit array index for the first set of preferred feature
-     bits.  */
-  PREFERRED_FEATURE_INDEX_1 = 0,
-  /* The current maximum size of the feature integer bit array.  */
-  PREFERRED_FEATURE_INDEX_MAX
-};
-
 enum
 {
   COMMON_CPUID_INDEX_1 = 0,
@@ -80,51 +72,32 @@ struct cpu_features
   struct cpu_features_basic basic;
   unsigned int *usable_p;
   struct cpuid_registers cpuid[COMMON_CPUID_INDEX_MAX];
-  unsigned int usable[USABLE_FEATURE_INDEX_MAX];
-  unsigned int preferred[PREFERRED_FEATURE_INDEX_MAX];
-  /* The state size for XSAVEC or XSAVE.  The type must be unsigned long
-     int so that we use
-
-       sub xsave_state_size_offset(%rip) %RSP_LP
-
-     in _dl_runtime_resolve.  */
-  unsigned long int xsave_state_size;
-  /* The full state size for XSAVE when XSAVEC is disabled by
-
-     GLIBC_TUNABLES=glibc.cpu.hwcaps=-XSAVEC_Usable
-   */
-  unsigned int xsave_state_full_size;
-  /* Data cache size for use in memory and string routines, typically
-     L1 size.  */
-  unsigned long int data_cache_size;
-  /* Shared cache size for use in memory and string routines, typically
-     L2 or L3 size.  */
-  unsigned long int shared_cache_size;
-  /* Threshold to use non temporal store.  */
-  unsigned long int non_temporal_threshold;
 };
 
-/* Used from outside of glibc to get access to the CPU features
-   structure.  */
-extern const struct cpu_features *__get_cpu_features (void)
+/* Get a pointer to the CPU features structure.  */
+extern const struct cpu_features *__x86_get_cpu_features (unsigned int,
+                                                          unsigned int)
      __attribute__ ((const));
 
-/* Only used directly in cpu-features.c.  */
-# define CPU_FEATURES_CPU_P(ptr, name) \
+#define CPU_FEATURES_CPU_P(ptr, name) \
   ((ptr->cpuid[index_cpu_##name].reg_##name & (bit_cpu_##name)) != 0)
-# define CPU_FEATURES_ARCH_P(ptr, name) \
-  ((ptr->feature_##name[index_arch_##name] & (bit_arch_##name)) != 0)
+#define CPU_FEATURES_ARCH_P(ptr, name) \
+  ((ptr->usable_p[index_arch_##name] & (bit_arch_##name)) != 0)
 
 /* HAS_CPU_FEATURE evaluates to true if CPU supports the feature.  */
-#define HAS_CPU_FEATURE(name) \
-  CPU_FEATURES_CPU_P (__get_cpu_features (), name)
-/* HAS_ARCH_FEATURE evaluates to true if we may use the feature at
-   runtime.  */
-# define HAS_ARCH_FEATURE(name) \
-  CPU_FEATURES_ARCH_P (__get_cpu_features (), name)
+#define HAS_CPU_FEATURE(name) \
+  (__extension__ \
+   ({ const struct cpu_features *__ptr = \
+        __x86_get_cpu_features (COMMON_CPUID_INDEX_MAX, \
+                                USABLE_FEATURE_INDEX_MAX); \
+      __ptr && CPU_FEATURES_CPU_P (__ptr, name); }))
 /* CPU_FEATURE_USABLE evaluates to true if the feature is usable.  */
-#define CPU_FEATURE_USABLE(name) \
-  HAS_ARCH_FEATURE (name##_Usable)
+#define CPU_FEATURE_USABLE(name) \
+  (__extension__ \
+   ({ const struct cpu_features *__ptr = \
+        __x86_get_cpu_features (COMMON_CPUID_INDEX_MAX, \
+                                USABLE_FEATURE_INDEX_MAX); \
+      __ptr && CPU_FEATURES_ARCH_P (__ptr, name##_Usable); }))
 
 /* Architecture features.  */
 
@@ -185,34 +158,6 @@ extern const struct cpu_features *__get_cpu_features (void)
 #define index_arch_AVX512_BF16_Usable USABLE_FEATURE_INDEX_1
 #define index_arch_PKU_Usable USABLE_FEATURE_INDEX_1
 
-#define feature_AVX_Usable usable
-#define feature_AVX2_Usable usable
-#define feature_AVX512F_Usable usable
-#define feature_AVX512CD_Usable usable
-#define feature_AVX512ER_Usable usable
-#define feature_AVX512PF_Usable usable
-#define feature_AVX512VL_Usable usable
-#define feature_AVX512BW_Usable usable
-#define feature_AVX512DQ_Usable usable
-#define feature_AVX512_4FMAPS_Usable usable
-#define feature_AVX512_4VNNIW_Usable usable
-#define feature_AVX512_BITALG_Usable usable
-#define feature_AVX512_IFMA_Usable usable
-#define feature_AVX512_VBMI_Usable usable
-#define feature_AVX512_VBMI2_Usable usable
-#define feature_AVX512_VNNI_Usable usable
-#define feature_AVX512_VPOPCNTDQ_Usable usable
-#define feature_FMA_Usable usable
-#define feature_FMA4_Usable usable
-#define feature_VAES_Usable usable
-#define feature_VPCLMULQDQ_Usable usable
-#define feature_XOP_Usable usable
-#define feature_XSAVEC_Usable usable
-#define feature_F16C_Usable usable
-#define feature_AVX512_VP2INTERSECT_Usable usable
-#define feature_AVX512_BF16_Usable usable
-#define feature_PKU_Usable usable
-
 /* CPU features.  */
 
 /* COMMON_CPUID_INDEX_1.  */
@@ -761,88 +706,4 @@ extern const struct cpu_features *__get_cpu_features (void)
 /* EAX.  */
 #define reg_AVX512_BF16 eax
 
-/* FEATURE_INDEX_2.  */
-#define bit_arch_I586 (1u << 0)
-#define bit_arch_I686 (1u << 1)
-#define bit_arch_Fast_Rep_String (1u << 2)
-#define bit_arch_Fast_Copy_Backward (1u << 3)
-#define bit_arch_Fast_Unaligned_Load (1u << 4)
-#define bit_arch_Fast_Unaligned_Copy (1u << 5)
-#define bit_arch_Slow_BSF (1u << 6)
-#define bit_arch_Slow_SSE4_2 (1u << 7)
-#define bit_arch_AVX_Fast_Unaligned_Load (1u << 8)
-#define bit_arch_Prefer_MAP_32BIT_EXEC (1u << 9)
-#define bit_arch_Prefer_PMINUB_for_stringop (1u << 10)
-#define bit_arch_Prefer_No_VZEROUPPER (1u << 11)
-#define bit_arch_Prefer_ERMS (1u << 12)
-#define bit_arch_Prefer_FSRM (1u << 13)
-#define bit_arch_Prefer_No_AVX512 (1u << 14)
-#define bit_arch_MathVec_Prefer_No_AVX512 (1u << 15)
-
-#define index_arch_Fast_Rep_String PREFERRED_FEATURE_INDEX_1
-#define index_arch_Fast_Copy_Backward PREFERRED_FEATURE_INDEX_1
-#define index_arch_Slow_BSF PREFERRED_FEATURE_INDEX_1
-#define index_arch_Fast_Unaligned_Load PREFERRED_FEATURE_INDEX_1
-#define index_arch_Prefer_PMINUB_for_stringop PREFERRED_FEATURE_INDEX_1
-#define index_arch_Fast_Unaligned_Copy PREFERRED_FEATURE_INDEX_1
-#define index_arch_I586 PREFERRED_FEATURE_INDEX_1
-#define index_arch_I686 PREFERRED_FEATURE_INDEX_1
-#define index_arch_Slow_SSE4_2 PREFERRED_FEATURE_INDEX_1
-#define index_arch_AVX_Fast_Unaligned_Load PREFERRED_FEATURE_INDEX_1
-#define index_arch_Prefer_MAP_32BIT_EXEC PREFERRED_FEATURE_INDEX_1
-#define index_arch_Prefer_No_VZEROUPPER PREFERRED_FEATURE_INDEX_1
-#define index_arch_Prefer_ERMS PREFERRED_FEATURE_INDEX_1
-#define index_arch_Prefer_No_AVX512 PREFERRED_FEATURE_INDEX_1
-#define index_arch_MathVec_Prefer_No_AVX512 PREFERRED_FEATURE_INDEX_1
-#define index_arch_Prefer_FSRM PREFERRED_FEATURE_INDEX_1
-
-#define feature_Fast_Rep_String preferred
-#define feature_Fast_Copy_Backward preferred
-#define feature_Slow_BSF preferred
-#define feature_Fast_Unaligned_Load preferred
-#define feature_Prefer_PMINUB_for_stringop preferred
-#define feature_Fast_Unaligned_Copy preferred
-#define feature_I586 preferred
-#define feature_I686 preferred
-#define feature_Slow_SSE4_2 preferred
-#define feature_AVX_Fast_Unaligned_Load preferred
-#define feature_Prefer_MAP_32BIT_EXEC preferred
-#define feature_Prefer_No_VZEROUPPER preferred
-#define feature_Prefer_ERMS preferred
-#define feature_Prefer_No_AVX512 preferred
-#define feature_MathVec_Prefer_No_AVX512 preferred
-#define feature_Prefer_FSRM preferred
-
-/* XCR0 Feature flags.  */
-#define bit_XMM_state (1u << 1)
-#define bit_YMM_state (1u << 2)
-#define bit_Opmask_state (1u << 5)
-#define bit_ZMM0_15_state (1u << 6)
-#define bit_ZMM16_31_state (1u << 7)
-
-# if defined (_LIBC) && !IS_IN (nonlib)
-/* Unused for x86.  */
-# define INIT_ARCH()
-# define __get_cpu_features() (&GLRO(dl_x86_cpu_features))
-# define x86_get_cpuid_registers(i) \
-       (&(GLRO(dl_x86_cpu_features).cpuid[i]))
-# endif
-
-#ifdef __x86_64__
-# define HAS_CPUID 1
-#elif (defined __i586__ || defined __pentium__ \
-       || defined __geode__ || defined __k6__)
-# define HAS_CPUID 1
-# define HAS_I586 1
-# define HAS_I686 HAS_ARCH_FEATURE (I686)
-#elif defined __i486__
-# define HAS_CPUID 0
-# define HAS_I586 HAS_ARCH_FEATURE (I586)
-# define HAS_I686 HAS_ARCH_FEATURE (I686)
-#else
-# define HAS_CPUID 1
-# define HAS_I586 1
-# define HAS_I686 1
-#endif
-
-#endif /* cpu_features_h */
+#endif /* _SYS_PLATFORM_X86_H */
diff --git a/sysdeps/x86/tst-get-cpu-features.c b/sysdeps/x86/tst-get-cpu-features.c
index c60918cf00..f6e2cfc556 100644
--- a/sysdeps/x86/tst-get-cpu-features.c
+++ b/sysdeps/x86/tst-get-cpu-features.c
@@ -1,4 +1,4 @@
-/* Test case for x86 __get_cpu_features interface
+/* Test case for __x86_get_cpu_features interface
    Copyright (C) 2015-2020 Free Software Foundation, Inc.
    This file is part of the GNU C Library.
 
@@ -18,7 +18,7 @@
 
 #include
 #include
-#include
+#include
 #include
 
 #define CHECK_CPU_FEATURE(name) \
@@ -45,7 +45,9 @@ static const char * const cpu_kinds[] =
 static int
 do_test (void)
 {
-  const struct cpu_features *cpu_features = __get_cpu_features ();
+  const struct cpu_features *cpu_features
+    = __x86_get_cpu_features (COMMON_CPUID_INDEX_MAX,
+                              USABLE_FEATURE_INDEX_MAX);
 
   switch (cpu_features->basic.kind)
     {
diff --git a/sysdeps/x86_64/fpu/math-tests-arch.h b/sysdeps/x86_64/fpu/math-tests-arch.h
index 435ddad991..cc3c2b0c11 100644
--- a/sysdeps/x86_64/fpu/math-tests-arch.h
+++ b/sysdeps/x86_64/fpu/math-tests-arch.h
@@ -16,7 +16,7 @@
    License along with the GNU C Library; if not, see
    .  */
 
-#include
+#include
 
 #if defined REQUIRE_AVX
 
@@ -24,7 +24,7 @@
 # define CHECK_ARCH_EXT \
   do \
     { \
-      if (!HAS_ARCH_FEATURE (AVX_Usable)) return; \
+      if (!CPU_FEATURE_USABLE (AVX)) return; \
     } \
   while (0)
 
@@ -34,7 +34,7 @@
 # define CHECK_ARCH_EXT \
   do \
     { \
-      if (!HAS_ARCH_FEATURE (AVX2_Usable)) return; \
+      if (!CPU_FEATURE_USABLE (AVX2)) return; \
     } \
   while (0)
 
@@ -44,7 +44,7 @@
 # define CHECK_ARCH_EXT \
   do \
     { \
-      if (!HAS_ARCH_FEATURE (AVX512F_Usable)) return; \
+      if (!CPU_FEATURE_USABLE (AVX512F)) return; \
     } \
   while (0)
 
diff --git a/sysdeps/x86_64/multiarch/test-multiarch.c b/sysdeps/x86_64/multiarch/test-multiarch.c
index 317373ceda..9feaf057e5 100644
--- a/sysdeps/x86_64/multiarch/test-multiarch.c
+++ b/sysdeps/x86_64/multiarch/test-multiarch.c
@@ -16,7 +16,7 @@
    License along with the GNU C Library; if not, see
    .  */
 
-#include
+#include
 #include
 #include
 #include
@@ -75,10 +75,10 @@ do_test (int argc, char **argv)
   int fails;
 
   get_cpuinfo ();
-  fails = check_proc ("avx", HAS_ARCH_FEATURE (AVX_Usable),
-                      "HAS_ARCH_FEATURE (AVX_Usable)");
-  fails += check_proc ("fma4", HAS_ARCH_FEATURE (FMA4_Usable),
-                       "HAS_ARCH_FEATURE (FMA4_Usable)");
+  fails = check_proc ("avx", CPU_FEATURE_USABLE (AVX),
+                      "CPU_FEATURE_USABLE (AVX)");
+  fails += check_proc ("fma4", CPU_FEATURE_USABLE (FMA4),
+                       "CPU_FEATURE_USABLE (FMA4)");
   fails += check_proc ("sse4_2", HAS_CPU_FEATURE (SSE4_2),
                        "HAS_CPU_FEATURE (SSE4_2)");
   fails += check_proc ("sse4_1", HAS_CPU_FEATURE (SSE4_1)