From patchwork Thu Mar 29 12:43:09 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "H.J. Lu" X-Patchwork-Id: 892780 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (mailfrom) smtp.mailfrom=gcc.gnu.org (client-ip=209.132.180.131; helo=sourceware.org; envelope-from=gcc-patches-return-475605-incoming=patchwork.ozlabs.org@gcc.gnu.org; receiver=) Authentication-Results: ozlabs.org; dmarc=none (p=none dis=none) header.from=intel.com Authentication-Results: ozlabs.org; dkim=pass (1024-bit key; unprotected) header.d=gcc.gnu.org header.i=@gcc.gnu.org header.b="UcokOlm5"; dkim-atps=neutral Received: from sourceware.org (server1.sourceware.org [209.132.180.131]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 40Bktz1bjBz9rx7 for ; Thu, 29 Mar 2018 23:43:21 +1100 (AEDT) DomainKey-Signature: a=rsa-sha1; c=nofws; d=gcc.gnu.org; h=list-id :list-unsubscribe:list-archive:list-post:list-help:sender:date :from:to:cc:subject:message-id:reply-to:mime-version :content-type; q=dns; s=default; b=wnz6ko4ibjo6zhmgHUbLZzuHaCCi1 e2A/gDZ5OWz/Qx4udERgLr5Jj83EJa0hi5cD8ly+LSOALVdLegSrGunnKmEYMMEY 4vE+2keYvqHqI9o2SuWTXaCb6djkRg6oqF3SE6SStAzK+Dg7ITgizRwxNDHaNGGX zt+aKY4STtz+2s= DKIM-Signature: v=1; a=rsa-sha1; c=relaxed; d=gcc.gnu.org; h=list-id :list-unsubscribe:list-archive:list-post:list-help:sender:date :from:to:cc:subject:message-id:reply-to:mime-version :content-type; s=default; bh=ZF8+n9Cw7Z1kuT/D/fi/yIocWgo=; b=Uco kOlm5MPJtLZUHIrdWRiDfH28vKtTwSGBBjH7exG39tcTuSCRKc8IutVe8ouWGLi8 wAiSX6UL38yTItL54NzWcqEHtMqRs4bx4+NGJaDO0xxXEamx2HejDkKbEr+J6gHT 7JpzntF46WOkshyihsc5hhwh5SDoNpGK1YKGQdDs= Received: (qmail 84037 invoked by alias); 29 Mar 2018 12:43:13 -0000 Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm Precedence: bulk List-Id: List-Unsubscribe: List-Archive: List-Post: List-Help: Sender: gcc-patches-owner@gcc.gnu.org Delivered-To: mailing list gcc-patches@gcc.gnu.org Received: (qmail 82641 invoked by uid 89); 29 Mar 2018 12:43:12 -0000 Authentication-Results: sourceware.org; auth=none X-Virus-Found: No X-Spam-SWARE-Status: No, score=-25.9 required=5.0 tests=BAYES_00, GIT_PATCH_0, GIT_PATCH_1, GIT_PATCH_2, GIT_PATCH_3, KAM_LAZY_DOMAIN_SECURITY, NO_DNS_FOR_FROM, T_RP_MATCHES_RCVD autolearn=ham version=3.3.2 spammy=upper, states X-HELO: mga02.intel.com Received: from mga02.intel.com (HELO mga02.intel.com) (134.134.136.20) by sourceware.org (qpsmtpd/0.93/v0.84-503-g423c35a) with ESMTP; Thu, 29 Mar 2018 12:43:11 +0000 X-Amp-Result: UNSCANNABLE X-Amp-File-Uploaded: False Received: from orsmga001.jf.intel.com ([10.7.209.18]) by orsmga101.jf.intel.com with ESMTP/TLS/DHE-RSA-AES256-GCM-SHA384; 29 Mar 2018 05:43:09 -0700 X-ExtLoop1: 1 Received: from gnu-cfl-1.sc.intel.com ([172.25.70.237]) by orsmga001.jf.intel.com with ESMTP; 29 Mar 2018 05:43:09 -0700 Received: by gnu-cfl-1.sc.intel.com (Postfix, from userid 1000) id 942B7440B32; Thu, 29 Mar 2018 05:43:09 -0700 (PDT) Date: Thu, 29 Mar 2018 05:43:09 -0700 From: "H.J. Lu" To: gcc-patches@gcc.gnu.org Cc: Uros Bizjak Subject: [PATCH] i386: Enable AVX/AVX512 features only if supported by OSXSAVE Message-ID: <20180329124309.GA12667@intel.com> Reply-To: "H.J. Lu" MIME-Version: 1.0 Content-Disposition: inline User-Agent: Mutt/1.9.2 (2017-12-15) Enable AVX and AVX512 features only if their states are supported by OSXSAVE. OK for trunk and release branches? H.J. --- PR target/85100 * config/i386/cpuinfo.c (XCR_XFEATURE_ENABLED_MASK): New. (XSTATE_FP): Likewise. (XSTATE_SSE): Likewise. (XSTATE_YMM): Likewise. (XSTATE_OPMASK): Likewise. (XSTATE_ZMM): Likewise. (XSTATE_HI_ZMM): Likewise. (XCR_AVX_ENABLED_MASK): Likewise. (XCR_AVX512F_ENABLED_MASK): Likewise. (get_available_features): Enable AVX and AVX512 features only if their states are supported by OSXSAVE. --- libgcc/config/i386/cpuinfo.c | 134 +++++++++++++++++++++++++++++-------------- 1 file changed, 90 insertions(+), 44 deletions(-) diff --git a/libgcc/config/i386/cpuinfo.c b/libgcc/config/i386/cpuinfo.c index 4eb3f5cd944..1dac110a79a 100644 --- a/libgcc/config/i386/cpuinfo.c +++ b/libgcc/config/i386/cpuinfo.c @@ -240,6 +240,40 @@ get_available_features (unsigned int ecx, unsigned int edx, unsigned int features = 0; unsigned int features2 = 0; + /* Get XCR_XFEATURE_ENABLED_MASK register with xgetbv. */ +#define XCR_XFEATURE_ENABLED_MASK 0x0 +#define XSTATE_FP 0x1 +#define XSTATE_SSE 0x2 +#define XSTATE_YMM 0x4 +#define XSTATE_OPMASK 0x20 +#define XSTATE_ZMM 0x40 +#define XSTATE_HI_ZMM 0x80 + +#define XCR_AVX_ENABLED_MASK \ + (XSTATE_SSE | XSTATE_YMM) +#define XCR_AVX512F_ENABLED_MASK \ + (XSTATE_SSE | XSTATE_YMM | XSTATE_OPMASK | XSTATE_ZMM | XSTATE_HI_ZMM) + + /* Check if AVX and AVX512 are usable. */ + int avx_usable = 0; + int avx512_usable = 0; + if ((ecx & bit_OSXSAVE)) + { + /* Check if XMM, YMM, OPMASK, upper 256 bits of ZMM0-ZMM15 and + ZMM16-ZMM31 states are supported by OSXSAVE. */ + unsigned int xcrlow; + unsigned int xcrhigh; + asm (".byte 0x0f, 0x01, 0xd0" + : "=a" (xcrlow), "=d" (xcrhigh) + : "c" (XCR_XFEATURE_ENABLED_MASK)); + if ((xcrlow & XCR_AVX_ENABLED_MASK) == XCR_AVX_ENABLED_MASK) + { + avx_usable = 1; + avx512_usable = ((xcrlow & XCR_AVX512F_ENABLED_MASK) + == XCR_AVX512F_ENABLED_MASK); + } + } + #define set_feature(f) \ if (f < 32) features |= (1U << f); else features2 |= (1U << (f - 32)) @@ -265,10 +299,13 @@ get_available_features (unsigned int ecx, unsigned int edx, set_feature (FEATURE_SSE4_1); if (ecx & bit_SSE4_2) set_feature (FEATURE_SSE4_2); - if (ecx & bit_AVX) - set_feature (FEATURE_AVX); - if (ecx & bit_FMA) - set_feature (FEATURE_FMA); + if (avx_usable) + { + if (ecx & bit_AVX) + set_feature (FEATURE_AVX); + if (ecx & bit_FMA) + set_feature (FEATURE_FMA); + } /* Get Advanced Features at level 7 (eax = 7, ecx = 0). */ if (max_cpuid_level >= 7) @@ -276,44 +313,50 @@ get_available_features (unsigned int ecx, unsigned int edx, __cpuid_count (7, 0, eax, ebx, ecx, edx); if (ebx & bit_BMI) set_feature (FEATURE_BMI); - if (ebx & bit_AVX2) - set_feature (FEATURE_AVX2); + if (avx_usable) + { + if (ebx & bit_AVX2) + set_feature (FEATURE_AVX2); + } if (ebx & bit_BMI2) set_feature (FEATURE_BMI2); - if (ebx & bit_AVX512F) - set_feature (FEATURE_AVX512F); - if (ebx & bit_AVX512VL) - set_feature (FEATURE_AVX512VL); - if (ebx & bit_AVX512BW) - set_feature (FEATURE_AVX512BW); - if (ebx & bit_AVX512DQ) - set_feature (FEATURE_AVX512DQ); - if (ebx & bit_AVX512CD) - set_feature (FEATURE_AVX512CD); - if (ebx & bit_AVX512PF) - set_feature (FEATURE_AVX512PF); - if (ebx & bit_AVX512ER) - set_feature (FEATURE_AVX512ER); - if (ebx & bit_AVX512IFMA) - set_feature (FEATURE_AVX512IFMA); - if (ecx & bit_AVX512VBMI) - set_feature (FEATURE_AVX512VBMI); - if (ecx & bit_AVX512VBMI2) - set_feature (FEATURE_AVX512VBMI2); - if (ecx & bit_GFNI) - set_feature (FEATURE_GFNI); - if (ecx & bit_VPCLMULQDQ) - set_feature (FEATURE_VPCLMULQDQ); - if (ecx & bit_AVX512VNNI) - set_feature (FEATURE_AVX512VNNI); - if (ecx & bit_AVX512BITALG) - set_feature (FEATURE_AVX512BITALG); - if (ecx & bit_AVX512VPOPCNTDQ) - set_feature (FEATURE_AVX512VPOPCNTDQ); - if (edx & bit_AVX5124VNNIW) - set_feature (FEATURE_AVX5124VNNIW); - if (edx & bit_AVX5124FMAPS) - set_feature (FEATURE_AVX5124FMAPS); + if (avx512_usable) + { + if (ebx & bit_AVX512F) + set_feature (FEATURE_AVX512F); + if (ebx & bit_AVX512VL) + set_feature (FEATURE_AVX512VL); + if (ebx & bit_AVX512BW) + set_feature (FEATURE_AVX512BW); + if (ebx & bit_AVX512DQ) + set_feature (FEATURE_AVX512DQ); + if (ebx & bit_AVX512CD) + set_feature (FEATURE_AVX512CD); + if (ebx & bit_AVX512PF) + set_feature (FEATURE_AVX512PF); + if (ebx & bit_AVX512ER) + set_feature (FEATURE_AVX512ER); + if (ebx & bit_AVX512IFMA) + set_feature (FEATURE_AVX512IFMA); + if (ecx & bit_AVX512VBMI) + set_feature (FEATURE_AVX512VBMI); + if (ecx & bit_AVX512VBMI2) + set_feature (FEATURE_AVX512VBMI2); + if (ecx & bit_GFNI) + set_feature (FEATURE_GFNI); + if (ecx & bit_VPCLMULQDQ) + set_feature (FEATURE_VPCLMULQDQ); + if (ecx & bit_AVX512VNNI) + set_feature (FEATURE_AVX512VNNI); + if (ecx & bit_AVX512BITALG) + set_feature (FEATURE_AVX512BITALG); + if (ecx & bit_AVX512VPOPCNTDQ) + set_feature (FEATURE_AVX512VPOPCNTDQ); + if (edx & bit_AVX5124VNNIW) + set_feature (FEATURE_AVX5124VNNIW); + if (edx & bit_AVX5124FMAPS) + set_feature (FEATURE_AVX5124FMAPS); + } } /* Check cpuid level of extended features. */ @@ -325,10 +368,13 @@ get_available_features (unsigned int ecx, unsigned int edx, if (ecx & bit_SSE4a) set_feature (FEATURE_SSE4_A); - if (ecx & bit_FMA4) - set_feature (FEATURE_FMA4); - if (ecx & bit_XOP) - set_feature (FEATURE_XOP); + if (avx_usable) + { + if (ecx & bit_FMA4) + set_feature (FEATURE_FMA4); + if (ecx & bit_XOP) + set_feature (FEATURE_XOP); + } } __cpu_model.__cpu_features[0] = features;