From patchwork Mon Apr 14 12:53:51 2014
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Ma Ling
X-Patchwork-Id: 338945
Return-Path:
X-Original-To: incoming@patchwork.ozlabs.org
Delivered-To: patchwork-incoming@bilbo.ozlabs.org
Received: from sourceware.org (server1.sourceware.org [209.132.180.131])
 (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
 (No client certificate requested)
 by ozlabs.org (Postfix) with ESMTPS id DE6AE140087
 for ; Mon, 14 Apr 2014 22:54:17 +1000 (EST)
DomainKey-Signature: a=rsa-sha1; c=nofws; d=sourceware.org; h=list-id
 :list-unsubscribe:list-subscribe:list-archive:list-post
 :list-help:sender:from:to:cc:subject:date:message-id; q=dns; s=
 default; b=ZAq3LaE36M3qcceiQ7owWaUY0C4TEs4cKQ4qOiq/hkF/luRvppGdq
 le9+93+gUwKpJphEFtFJih/PBM4wZEI11YL5Gg3zXUo4elXsKjGvAiUbbW0+YyeB
 jAUK2suESdC58Sns7TohYyLbZ94SXljsozqJf23ChOEgST2QEQDFUs=
DKIM-Signature: v=1; a=rsa-sha1; c=relaxed; d=sourceware.org; h=list-id
 :list-unsubscribe:list-subscribe:list-archive:list-post
 :list-help:sender:from:to:cc:subject:date:message-id; s=default;
 bh=Osj1TFgeOyL3QNesBJSAUpmldPE=; b=V3E4Nh2J4k1AH/8xieCcHjbvQfWK
 6BszIJo8FBqTkCnIdGb2aPEvYCHp3an+FtdMnXD0p17wraNo96Z9Dtd2M/xocmId
 hVyAUJjNVgfTbNhJpxoUc66gX71bfnAyFwGdyrwT4FPKlvNVuuhk9LlynP7wNt2g
 0/5x/V8TStAFIvE=
Received: (qmail 13881 invoked by alias); 14 Apr 2014 12:54:12 -0000
Mailing-List: contact libc-alpha-help@sourceware.org; run by ezmlm
Precedence: bulk
List-Id:
List-Unsubscribe:
List-Subscribe:
List-Archive:
List-Post:
List-Help: ,
Sender: libc-alpha-owner@sourceware.org
Delivered-To: mailing list libc-alpha@sourceware.org
Received: (qmail 13868 invoked by uid 89); 14 Apr 2014 12:54:11 -0000
Authentication-Results: sourceware.org; auth=none
X-Virus-Found: No
X-Spam-SWARE-Status: Yes, score=6.7 required=5.0 tests=AWL, BAYES_00,
 FREEMAIL_FROM, RCVD_IN_DNSWL_LOW, SPAM_URI1, SPF_PASS autolearn=no
 version=3.3.2
X-HELO: mail-pa0-f48.google.com
X-Received: by 10.66.147.130 with SMTP id tk2mr3250180pab.125.1397480047357;
 Mon, 14 Apr 2014 05:54:07 -0700 (PDT)
From: ling.ma.program@gmail.com
To: hjl.tools@gmail.com
Cc: libc-alpha@sourceware.org, neleai@seznam.cz, aj@suse.com,
 liubov.dmitrieva@gmail.com, Sihai Yao
Subject: Re: [PATCH RFC] X86_64 Avx2 Detection
Date: Mon, 14 Apr 2014 08:53:51 -0400
Message-Id: <1397480031-5847-1-git-send-email-sihai.ysh@alibaba-inc.com>

From: Sihai Yao

This patch sets bit_AVX2_Usable in __cpu_features.feature by checking the
AVX2 bit of COMMON_CPUID_INDEX_7, as reported by Haswell.
Architecture-specific assembler files can test this bit to choose a code
path.
---
This version removes the unrelated CPU model branch code and
FEATURE_INDEX_7, which is not needed for AVX2 detection.

 ChangeLog                                  | 8 ++++++++
 sysdeps/x86_64/multiarch/ifunc-defines.sym | 1 +
 sysdeps/x86_64/multiarch/init-arch.c       | 3 +++
 sysdeps/x86_64/multiarch/init-arch.h       | 8 ++++++++
 4 files changed, 20 insertions(+)

diff --git a/ChangeLog b/ChangeLog
index fb0177d..ba8980c 100644
--- a/ChangeLog
+++ b/ChangeLog
@@ -1,3 +1,11 @@
+2014-04-04  Sihai Yao
+	* sysdeps/x86_64/multiarch/ifunc-defines.sym: Add
+	COMMON_CPUID_INDEX_7.
+	* sysdeps/x86_64/multiarch/init-arch.c: Add AVX2 detection from cpu
+	features word of COMMON_CPUID_INDEX_7.
+	* sysdeps/x86_64/multiarch/init-arch.h: Add bit_AVX2_Usable and
+	index_AVX2_Usable for future assembly code to determine calling path.
+
 2014-04-10  Torvald Riegel
 
 	* benchtests/pthread_once-inputs: New file.
diff --git a/sysdeps/x86_64/multiarch/ifunc-defines.sym b/sysdeps/x86_64/multiarch/ifunc-defines.sym
index eb1538a..a410d88 100644
--- a/sysdeps/x86_64/multiarch/ifunc-defines.sym
+++ b/sysdeps/x86_64/multiarch/ifunc-defines.sym
@@ -17,4 +17,5 @@ FEATURE_OFFSET offsetof (struct cpu_features, feature)
 FEATURE_SIZE sizeof (unsigned int)
 
 COMMON_CPUID_INDEX_1
+COMMON_CPUID_INDEX_7
 FEATURE_INDEX_1
diff --git a/sysdeps/x86_64/multiarch/init-arch.c b/sysdeps/x86_64/multiarch/init-arch.c
index db74d97..2a6dcb7 100644
--- a/sysdeps/x86_64/multiarch/init-arch.c
+++ b/sysdeps/x86_64/multiarch/init-arch.c
@@ -167,6 +167,9 @@ __init_cpu_features (void)
       /* Determine if AVX is usable. */
       if (CPUID_AVX)
         __cpu_features.feature[index_AVX_Usable] |= bit_AVX_Usable;
+      /* Determine if AVX2 is usable. */
+      if (CPUID_AVX2)
+        __cpu_features.feature[index_AVX2_Usable] |= bit_AVX2_Usable;
       /* Determine if FMA is usable. */
       if (CPUID_FMA)
         __cpu_features.feature[index_FMA_Usable] |= bit_FMA_Usable;
diff --git a/sysdeps/x86_64/multiarch/init-arch.h b/sysdeps/x86_64/multiarch/init-arch.h
index 793707a..813b6de 100644
--- a/sysdeps/x86_64/multiarch/init-arch.h
+++ b/sysdeps/x86_64/multiarch/init-arch.h
@@ -24,6 +24,7 @@
 #define bit_FMA_Usable (1 << 7)
 #define bit_FMA4_Usable (1 << 8)
 #define bit_Slow_SSE4_2 (1 << 9)
+#define bit_AVX2_Usable (1 << 10)
 
 /* CPUID Feature flags. */
 
@@ -40,6 +41,7 @@
 
 /* COMMON_CPUID_INDEX_7. */
 #define bit_RTM (1 << 11)
+#define bit_AVX2 (1 << 5)
 
 /* XCR0 Feature flags. */
 #define bit_XMM_state (1 << 1)
@@ -54,6 +56,7 @@
 # define index_SSE4_1 COMMON_CPUID_INDEX_1*CPUID_SIZE+CPUID_ECX_OFFSET
 # define index_SSE4_2 COMMON_CPUID_INDEX_1*CPUID_SIZE+CPUID_ECX_OFFSET
 # define index_AVX COMMON_CPUID_INDEX_1*CPUID_SIZE+CPUID_ECX_OFFSET
+# define index_AVX2 COMMON_CPUID_INDEX_7*CPUID_SIZE+CPUID_EBX_OFFSET
 
 # define index_Fast_Rep_String FEATURE_INDEX_1*FEATURE_SIZE
 # define index_Fast_Copy_Backward FEATURE_INDEX_1*FEATURE_SIZE
@@ -64,6 +67,7 @@
 # define index_FMA_Usable FEATURE_INDEX_1*FEATURE_SIZE
 # define index_FMA4_Usable FEATURE_INDEX_1*FEATURE_SIZE
 # define index_Slow_SSE4_2 FEATURE_INDEX_1*FEATURE_SIZE
+# define index_AVX2_Usable FEATURE_INDEX_1*FEATURE_SIZE
 
 #else /* __ASSEMBLER__ */
 
@@ -145,6 +149,8 @@ extern const struct cpu_features *__get_cpu_features (void)
   HAS_CPUID_FLAG (COMMON_CPUID_INDEX_80000001, ecx, bit_FMA4)
 # define CPUID_RTM \
   HAS_CPUID_FLAG (COMMON_CPUID_INDEX_7, ebx, bit_RTM)
+# define CPUID_AVX2 \
+  HAS_CPUID_FLAG (COMMON_CPUID_INDEX_7, ebx, bit_AVX2)
 
 /* HAS_* evaluates to true if we may use the feature at runtime. */
 # define HAS_SSE2 HAS_CPU_FEATURE (COMMON_CPUID_INDEX_1, edx, bit_SSE2)
@@ -153,6 +159,7 @@ extern const struct cpu_features *__get_cpu_features (void)
 # define HAS_SSE4_1 HAS_CPU_FEATURE (COMMON_CPUID_INDEX_1, ecx, bit_SSE4_1)
 # define HAS_SSE4_2 HAS_CPU_FEATURE (COMMON_CPUID_INDEX_1, ecx, bit_SSE4_2)
 # define HAS_RTM HAS_CPU_FEATURE (COMMON_CPUID_INDEX_7, ebx, bit_RTM)
+# define HAS_AVX2 HAS_CPU_FEATURE (COMMON_CPUID_INDEX_7, ebx, bit_AVX2)
 
 # define index_Fast_Rep_String FEATURE_INDEX_1
 # define index_Fast_Copy_Backward FEATURE_INDEX_1
@@ -163,6 +170,7 @@ extern const struct cpu_features *__get_cpu_features (void)
 # define index_FMA_Usable FEATURE_INDEX_1
 # define index_FMA4_Usable FEATURE_INDEX_1
 # define index_Slow_SSE4_2 FEATURE_INDEX_1
+# define index_AVX2_Usable FEATURE_INDEX_1
 
 # define HAS_ARCH_FEATURE(name) \
   ((__get_cpu_features ()->feature[index_##name] & (bit_##name)) != 0)
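
The rule the init-arch.c hunk relies on can be modeled outside glibc. What follows is a minimal sketch, not the code in init-arch.c. It assumes the new AVX2 test sits under the same OS-support check as the AVX test next to it (OSXSAVE set, and XMM and YMM state enabled in XCR0); that surrounding code is not shown in the diff. The helper name avx2_usable_bits and its parameters are hypothetical. bit_AVX2, bit_XMM_state and bit_AVX2_Usable use the values from the patched init-arch.h; bit_OSXSAVE (CPUID leaf 1, ECX bit 27) and bit_YMM_state (XCR0 bit 2) are architectural values that do not appear in this patch.

```c
#include <stdint.h>

/* Values taken from the patched init-arch.h.  */
#define bit_AVX2         (1u << 5)   /* CPUID leaf 7, subleaf 0, EBX.  */
#define bit_XMM_state    (1u << 1)   /* XCR0.  */
#define bit_AVX2_Usable  (1u << 10)  /* Bit in the feature word.  */

/* Assumed values, not shown in the diff.  */
#define bit_OSXSAVE      (1u << 27)  /* CPUID leaf 1, ECX.  */
#define bit_YMM_state    (1u << 2)   /* XCR0.  */

/* Hypothetical helper: given raw register values, return the bits that
   would be OR-ed into the feature word for AVX2.  */
static unsigned int
avx2_usable_bits (uint32_t leaf1_ecx, uint32_t leaf7_ebx, uint32_t xcr0_low)
{
  const uint32_t need = bit_XMM_state | bit_YMM_state;

  /* XCR0 is only meaningful when the OS has enabled XSAVE.  */
  if ((leaf1_ecx & bit_OSXSAVE) == 0)
    return 0;
  /* The OS must save and restore both XMM and YMM registers.  */
  if ((xcr0_low & need) != need)
    return 0;
  /* Only now does the CPUID AVX2 bit make the feature usable.  */
  return (leaf7_ebx & bit_AVX2) != 0 ? bit_AVX2_Usable : 0;
}
```

For example, avx2_usable_bits (1u << 27, 1u << 5, 0x6) yields bit_AVX2_Usable, while clearing the YMM bit in the third argument (0x2) yields 0 even though the CPU reports AVX2.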