From patchwork Tue May 26 11:59:53 2020
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
X-Patchwork-Submitter: "H.J. Lu"
X-Patchwork-Id: 1297938
References: <20200518115853.3468-1-hjl.tools@gmail.com> <6696000c-63b6-949a-1203-e696c1d743d0@suse.cz>
In-Reply-To: <6696000c-63b6-949a-1203-e696c1d743d0@suse.cz>
Date: Tue, 26 May 2020 04:59:53 -0700
Subject: V5 [PATCH] x86: Move cpuinfo.h from libgcc to common/config/i386
To: Martin Liška
List-Id: Gcc-patches mailing list
X-Patchwork-Original-From: "H.J. Lu via Gcc-patches"
From: "H.J. Lu"
Reply-To: "H.J. Lu"
Cc: Jakub Jelinek, Richard Biener, Uros Bizjak, "gcc-patches@gcc.gnu.org", Jan Hubicka

On Tue, May 26, 2020 at 2:30 AM Martin Liška wrote:
>
> On 5/25/20 7:42 PM, H.J. Lu wrote:
> > Here is the updated patch. OK for master?
>
> Thank you for the updated patch.
>
> I have still few nits:
>
> 1) I would make all the:
>
> > +  has_sse3 = has_feature (FEATURE_SSE3);
>
> a macro. The local variable seems to superfluous.

Done.

> 2) can we automatically deduce option name:
>
> > +  ISA_NAMES_TABLE_ENTRY("rdpid", FEATURE_RDPID, P_ZERO, "-mrdpid")
> > +  ISA_NAMES_TABLE_ENTRY("rdrnd", FEATURE_RDRND, P_ZERO, "-mrdrnd")
>
> I mean "-m" + "rdrnd" == "-mrdrnd" ?

The new option field serves 2 purposes:

1. Not all features have a corresponding command-line option:

   ISA_NAMES_TABLE_ENTRY("cmov", FEATURE_CMOV, P_ZERO, NULL)

   for (i = 0; i < ARRAY_SIZE (isa_names_table); i++)
     if (isa_names_table[i].option)

2. Some features have a different name in the command-line option:

   ISA_NAMES_TABLE_ENTRY("fxsave", FEATURE_FXSAVE, P_ZERO, "-mfxsr")

Here is the updated patch. OK for master?

Thanks.

From 48a429a02c91937ba2a4bd37f42304c2ce59bb28 Mon Sep 17 00:00:00 2001
From: "H.J. Lu"
Date: Mon, 18 May 2020 05:58:41 -0700
Subject: [PATCH] x86: Move cpuinfo.h from libgcc to common/config/i386

Move cpuinfo.h from libgcc to common/config/i386 and move
isa_names_table to common/config/i386 so that get_intel_cpu can be
shared by libgcc, GCC driver and gcc.target/i386/builtin_target.c to
detect the specific type of Intel and AMD CPUs:

1. Use the same enum processor_features in libgcc and x86 backend.
2. Add more processor features to enum processor_features.
3. Use the same isa_names_table in i386-builtins.c, driver-i386.c and
   gcc.target/i386/builtin_target.c.
4. Use isa_names_table to generate ISA command-line options.
5. Use isa_names_table to generate __builtin_cpu_supports tests.
6. Add M_VENDOR, M_CPU_TYPE and M_CPU_SUBTYPE in i386-builtins.c to
   avoid duplication in.
7. Use cpu_indicator_init, has_cpu_feature, get_amd_cpu and
   get_intel_cpu in driver-i386.c and builtin_target.c.

gcc/

	PR target/95259
	* common/config/i386/cpuinfo-builtins.h: Moved from
	libgcc/config/i386/cpuinfo.h.
	(processor_vendor): Add VENDOR_CENTAUR, VENDOR_CYRIX, VENDOR_NSC
	and BUILTIN_VENDOR_MAX.
	(processor_types): Add BUILTIN_CPU_TYPE_MAX.
	(processor_features): Add FEATURE_3DNOW, FEATURE_3DNOWP,
	FEATURE_ADX, FEATURE_ABM, FEATURE_CLDEMOTE, FEATURE_CLFLUSHOPT,
	FEATURE_CLWB, FEATURE_CLZERO, FEATURE_CMPXCHG16B,
	FEATURE_CMPXCHG8B, FEATURE_ENQCMD, FEATURE_F16C,
	FEATURE_FSGSBASE, FEATURE_FXSAVE, FEATURE_HLE, FEATURE_IBT,
	FEATURE_LAHF_LM, FEATURE_LM, FEATURE_LWP, FEATURE_LZCNT,
	FEATURE_MOVBE, FEATURE_MOVDIR64B, FEATURE_MOVDIRI,
	FEATURE_MWAITX, FEATURE_OSPKE, FEATURE_OSXSAVE,
	FEATURE_PCONFIG, FEATURE_PKU, FEATURE_PREFETCHWT1,
	FEATURE_PRFCHW, FEATURE_PTWRITE, FEATURE_RDPID, FEATURE_RDRND,
	FEATURE_RDSEED, FEATURE_RTM, FEATURE_SERIALIZE, FEATURE_SGX,
	FEATURE_SHA, FEATURE_SHSTK, FEATURE_TBM, FEATURE_TSXLDTRK,
	FEATURE_VAES, FEATURE_WAITPKG, FEATURE_WBNOINVD, FEATURE_XSAVE,
	FEATURE_XSAVEC, FEATURE_XSAVEOPT, FEATURE_XSAVES and
	CPU_FEATURE_MAX.
	(__processor_model): Moved to cpuinfo.h.
	(__cpu_model): Removed.
	(__cpu_features2): Likewise.
	(SIZE_OF_CPU_FEATURES): New.
	* common/config/i386/cpuinfo.h: Moved from
	libgcc/config/i386/cpuinfo.c.
	(__processor_model): Moved from libgcc/config/i386/cpuinfo.h.
	(__processor_model2): New.
	(CHECK___builtin_cpu_is): New. Defined as empty if not defined.
	(has_cpu_feature): New function.
	(set_cpu_feature): Likewise.
	(get_amd_cpu): Moved from libgcc/config/i386/cpuinfo.c. Use
	CHECK___builtin_cpu_is. Return AMD CPU name.
	(get_intel_cpu): Moved from libgcc/config/i386/cpuinfo.c. Use
	CHECK___builtin_cpu_is. Return Intel CPU name.
	(get_available_features): Moved from libgcc/config/i386/cpuinfo.c.
	Also check FEATURE_3DNOW, FEATURE_3DNOWP, FEATURE_ADX,
	FEATURE_ABM, FEATURE_CLDEMOTE, FEATURE_CLFLUSHOPT, FEATURE_CLWB,
	FEATURE_CLZERO, FEATURE_CMPXCHG16B, FEATURE_CMPXCHG8B,
	FEATURE_ENQCMD, FEATURE_F16C, FEATURE_FSGSBASE, FEATURE_FXSAVE,
	FEATURE_HLE, FEATURE_IBT, FEATURE_LAHF_LM, FEATURE_LM,
	FEATURE_LWP, FEATURE_LZCNT, FEATURE_MOVBE, FEATURE_MOVDIR64B,
	FEATURE_MOVDIRI, FEATURE_MWAITX, FEATURE_OSPKE, FEATURE_OSXSAVE,
	FEATURE_PCONFIG, FEATURE_PKU, FEATURE_PREFETCHWT1,
	FEATURE_PRFCHW, FEATURE_PTWRITE, FEATURE_RDPID, FEATURE_RDRND,
	FEATURE_RDSEED, FEATURE_RTM, FEATURE_SERIALIZE, FEATURE_SGX,
	FEATURE_SHA, FEATURE_SHSTK, FEATURE_TBM, FEATURE_TSXLDTRK,
	FEATURE_VAES, FEATURE_WAITPKG, FEATURE_WBNOINVD, FEATURE_XSAVE,
	FEATURE_XSAVEC, FEATURE_XSAVEOPT and FEATURE_XSAVES.
	(cpu_indicator_init): Moved from libgcc/config/i386/cpuinfo.c.
	Also update cpu_model2.
	* common/config/i386/i386-isas.h: New file. Extracted from
	gcc/config/i386/i386-builtins.c.
	(_isa_names_table): Add option.
	(ISA_NAMES_TABLE_START): New.
	(ISA_NAMES_TABLE_END): Likewise.
	(ISA_NAMES_TABLE_ENTRY): Likewise.
	(isa_names_table): Defined with ISA_NAMES_TABLE_START,
	ISA_NAMES_TABLE_END and ISA_NAMES_TABLE_ENTRY. Replace F_XXX
	with FEATURE_XXX. Add more ISAs from enum processor_features.
	* config/i386/driver-i386.c: Include
	"common/config/i386/cpuinfo.h" and
	"common/config/i386/i386-isas.h".
	(has_feature): New macro.
	(host_detect_local_cpu): Call cpu_indicator_init to get CPU
	features. Use has_feature to detect processor features. Call
	get_amd_cpu to get AMD CPU name. Call get_intel_cpu to get
	Intel CPU name. Use isa_names_table to generate command-line
	options.
	* config/i386/i386-builtins.c: Include
	"common/config/i386/cpuinfo.h" and
	"common/config/i386/i386-isas.h".
	(feature_priority): Moved to common/config/i386/i386-isas.h.
	(processor_features): Removed.
	(processor_model): Removed.
	(_arch_names_table): Use "const int" on model.
	(M_CPU_TYPE_START): New.
	(M_CPU_SUBTYPE_START): Likewise.
	(M_VENDOR): Likewise.
	(M_CPU_TYPE): Likewise.
	(M_CPU_SUBTYPE): Likewise.
	(arch_names_table): Replace M_XXX with M_VENDOR, M_CPU_TYPE and
	M_CPU_SUBTYPE.
	(isa_names_table): Moved to common/config/i386/i386-isas.h.
	(fold_builtin_cpu): Change __cpu_features2 to an array.

gcc/testsuite/

	PR target/95259
	* gcc.target/i386/builtin_target.c: Include and
	../../../common/config/i386/cpuinfo.h.
	(check_amd_cpu_model): Removed.
	(check_intel_cpu_model): Likewise.
	(CHECK___builtin_cpu_is): New.
	(gcc_assert): New. Defined as assert.
	(gcc_unreachable): New. Defined as abort.
	(inline): New. Defined as empty.
	(ISA_NAMES_TABLE_START): Likewise.
	(ISA_NAMES_TABLE_END): Likewise.
	(ISA_NAMES_TABLE_ENTRY): New.
	(check_features): Include
	"../../../common/config/i386/i386-isas.h".
	(check_detailed): Call cpu_indicator_init. Always call
	check_features. Call get_amd_cpu instead of check_amd_cpu_model.
	Call get_intel_cpu instead of check_intel_cpu_model.

libgcc/

	PR target/95259
	* config/i386/cpuinfo.c: Include "common/config/i386/cpuinfo.h".
	(__cpu_features2): Changed to array.
	(get_amd_cpu): Moved to ... gcc/common/config/i386/cpuinfo.h.
	(get_intel_cpu): Likewise.
	(get_available_features): Likewise.
	(__cpu_indicator_init): Call cpu_indicator_init.
	* config/i386/cpuinfo.h: Moved to
	gcc/common/config/i386/cpuinfo-builtins.h.
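
As a rough illustration of the two purposes of the option field described in the reply above, here is a minimal, self-contained C sketch. It is not code from the patch: demo_feature, demo_isa_entry, demo_isa_table and demo_collect_options are hypothetical names, the table holds only three rows modelled on the ISA_NAMES_TABLE_ENTRY examples, and detected features are passed in as a plain bitmask instead of being queried through has_cpu_feature.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, trimmed-down stand-in for enum processor_features.  */
enum demo_feature { DEMO_CMOV, DEMO_FXSAVE, DEMO_RDRND };

/* Hypothetical stand-in for one isa_names_table row.  OPTION is NULL
   when the feature has no command-line option.  */
struct demo_isa_entry
{
  const char *name;
  enum demo_feature feature;
  const char *option;
};

static const struct demo_isa_entry demo_isa_table[] =
{
  { "cmov",   DEMO_CMOV,   NULL },       /* No option exists.  */
  { "fxsave", DEMO_FXSAVE, "-mfxsr" },   /* Option differs from name.  */
  { "rdrnd",  DEMO_RDRND,  "-mrdrnd" },  /* Option is "-m" + name.  */
};

#define DEMO_ARRAY_SIZE(a) (sizeof (a) / sizeof ((a)[0]))

/* Write the space-separated options of all detected features into
   BUF (SIZE bytes) and return how many options were written.
   DETECTED has bit (1U << feature) set for each detected feature.  */
static int
demo_collect_options (unsigned int detected, char *buf, size_t size)
{
  size_t used = 0;
  int count = 0;
  size_t i;

  assert (buf != NULL && size > 0);
  buf[0] = '\0';
  for (i = 0; i < DEMO_ARRAY_SIZE (demo_isa_table); i++)
    {
      const struct demo_isa_entry *e = &demo_isa_table[i];
      size_t len, need;

      if (!(detected & (1U << e->feature)))
        continue;               /* Feature not detected.  */
      if (e->option == NULL)
        continue;               /* Detected, but nothing to emit.  */

      len = strlen (e->option);
      need = len + (count ? 1 : 0);
      if (used + need >= size)
        break;                  /* Out of space; keep what fits.  */
      if (count)
        buf[used++] = ' ';
      memcpy (buf + used, e->option, len + 1);
      used += len;
      count++;
    }
  return count;
}
```

With all three demo features detected, the sketch yields "-mfxsr -mrdrnd": cmov is skipped because its option is NULL, and fxsave maps to -mfxsr, which could not be derived as "-m" + "fxsave".
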
---
 .../common/config/i386/cpuinfo-builtins.h     |  70 +-
 gcc/common/config/i386/cpuinfo.h              | 845 ++++++++++++++++++
 gcc/common/config/i386/i386-isas.h            | 201 +++++
 gcc/config/i386/driver-i386.c                 | 680 +++-----------
 gcc/config/i386/i386-builtins.c               | 302 ++-----
 .../gcc.target/i386/builtin_target.c          | 354 +-------
 libgcc/config/i386/cpuinfo.c                  | 464 +---------
 7 files changed, 1325 insertions(+), 1591 deletions(-)
 rename libgcc/config/i386/cpuinfo.h => gcc/common/config/i386/cpuinfo-builtins.h (71%)
 create mode 100644 gcc/common/config/i386/cpuinfo.h
 create mode 100644 gcc/common/config/i386/i386-isas.h

diff --git a/libgcc/config/i386/cpuinfo.h b/gcc/common/config/i386/cpuinfo-builtins.h
similarity index 71%
rename from libgcc/config/i386/cpuinfo.h
rename to gcc/common/config/i386/cpuinfo-builtins.h
index 0f97510cde1..7a20dcf9ef8 100644
--- a/libgcc/config/i386/cpuinfo.h
+++ b/gcc/common/config/i386/cpuinfo-builtins.h
@@ -30,6 +30,10 @@ enum processor_vendor
   VENDOR_INTEL = 1,
   VENDOR_AMD,
   VENDOR_OTHER,
+  VENDOR_CENTAUR,
+  VENDOR_CYRIX,
+  VENDOR_NSC,
+  BUILTIN_VENDOR_MAX = VENDOR_OTHER,
   VENDOR_MAX
 };
 
@@ -45,13 +49,14 @@ enum processor_types
   INTEL_SILVERMONT,
   INTEL_KNL,
   AMD_BTVER1,
-  AMD_BTVER2,
+  AMD_BTVER2,
   AMDFAM17H,
   INTEL_KNM,
   INTEL_GOLDMONT,
   INTEL_GOLDMONT_PLUS,
   INTEL_TREMONT,
-  CPU_TYPE_MAX
+  CPU_TYPE_MAX,
+  BUILTIN_CPU_TYPE_MAX = CPU_TYPE_MAX
 };
 
 enum processor_subtypes
@@ -123,14 +128,57 @@ enum processor_features
   FEATURE_AVX512VNNI,
   FEATURE_AVX512BITALG,
   FEATURE_AVX512BF16,
-  FEATURE_AVX512VP2INTERSECT
+  FEATURE_AVX512VP2INTERSECT,
+  FEATURE_3DNOW,
+  FEATURE_3DNOWP,
+  FEATURE_ADX,
+  FEATURE_ABM,
+  FEATURE_CLDEMOTE,
+  FEATURE_CLFLUSHOPT,
+  FEATURE_CLWB,
+  FEATURE_CLZERO,
+  FEATURE_CMPXCHG16B,
+  FEATURE_CMPXCHG8B,
+  FEATURE_ENQCMD,
+  FEATURE_F16C,
+  FEATURE_FSGSBASE,
+  FEATURE_FXSAVE,
+  FEATURE_HLE,
+  FEATURE_IBT,
+  FEATURE_LAHF_LM,
+  FEATURE_LM,
+  FEATURE_LWP,
+  FEATURE_LZCNT,
+  FEATURE_MOVBE,
+  FEATURE_MOVDIR64B,
+  FEATURE_MOVDIRI,
+  FEATURE_MWAITX,
+  FEATURE_OSPKE,
+  FEATURE_OSXSAVE,
+  FEATURE_PCONFIG,
+  FEATURE_PKU,
+  FEATURE_PREFETCHWT1,
+  FEATURE_PRFCHW,
+  FEATURE_PTWRITE,
+  FEATURE_RDPID,
+  FEATURE_RDRND,
+  FEATURE_RDSEED,
+  FEATURE_RTM,
+  FEATURE_SERIALIZE,
+  FEATURE_SGX,
+  FEATURE_SHA,
+  FEATURE_SHSTK,
+  FEATURE_TBM,
+  FEATURE_TSXLDTRK,
+  FEATURE_VAES,
+  FEATURE_WAITPKG,
+  FEATURE_WBNOINVD,
+  FEATURE_XSAVE,
+  FEATURE_XSAVEC,
+  FEATURE_XSAVEOPT,
+  FEATURE_XSAVES,
+  CPU_FEATURE_MAX
 };
 
-extern struct __processor_model
-{
-  unsigned int __cpu_vendor;
-  unsigned int __cpu_type;
-  unsigned int __cpu_subtype;
-  unsigned int __cpu_features[1];
-} __cpu_model;
-extern unsigned int __cpu_features2;
+/* Size of __cpu_features2 array in libgcc/config/i386/cpuinfo.c. */
+#define SIZE_OF_CPU_FEATURES ((CPU_FEATURE_MAX - 1) / 32)
diff --git a/gcc/common/config/i386/cpuinfo.h b/gcc/common/config/i386/cpuinfo.h
new file mode 100644
index 00000000000..13f4b0e51e5
--- /dev/null
+++ b/gcc/common/config/i386/cpuinfo.h
@@ -0,0 +1,845 @@
+/* Get CPU type and Features for x86 processors.
+   Copyright (C) 2012-2020 Free Software Foundation, Inc.
+   Contributed by Sriraman Tallam (tmsriram@google.com)
+
+This file is part of GCC.
+
+GCC is free software; you can redistribute it and/or modify it under
+the terms of the GNU General Public License as published by the Free
+Software Foundation; either version 3, or (at your option) any later
+version.
+
+GCC is distributed in the hope that it will be useful, but WITHOUT ANY
+WARRANTY; without even the implied warranty of MERCHANTABILITY or
+FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License
+for more details.
+
+Under Section 7 of GPL version 3, you are granted additional
+permissions described in the GCC Runtime Library Exception, version
+3.1, as published by the Free Software Foundation.
+
+You should have received a copy of the GNU General Public License and
+a copy of the GCC Runtime Library Exception along with this program;
+see the files COPYING3 and COPYING.RUNTIME respectively. If not, see
+. */
+
+#include "cpuinfo-builtins.h"
+
+struct __processor_model
+{
+  unsigned int __cpu_vendor;
+  unsigned int __cpu_type;
+  unsigned int __cpu_subtype;
+  /* The first 32 features are stored as bitmasks in __cpu_features.
+     The rest of features are stored as bitmasks in a separate array
+     of unsigned int. */
+  unsigned int __cpu_features[1];
+};
+
+struct __processor_model2
+{
+  unsigned int __cpu_family;
+  unsigned int __cpu_model;
+  unsigned int __cpu_max_level;
+  unsigned int __cpu_ext_level;
+};
+
+#ifndef CHECK___builtin_cpu_is
+# define CHECK___builtin_cpu_is(cpu)
+#endif
+
+/* Return non-zero if the processor has feature F. */
+
+static inline int
+has_cpu_feature (struct __processor_model *cpu_model,
+                 unsigned int *cpu_features2,
+                 enum processor_features f)
+{
+  unsigned int i;
+  if (f < 32)
+    {
+      /* The first 32 features. */
+      return cpu_model->__cpu_features[0] & (1U << (f & 31));
+    }
+  /* The rest of features. cpu_features2[i] contains features from
+     (32 + i * 32) to (31 + 32 + i * 32), inclusively. */
+  for (i = 0; i < SIZE_OF_CPU_FEATURES; i++)
+    if (f < (32 + 32 + i * 32))
+      return cpu_features2[i] & (1U << ((f - (32 + i * 32)) & 31));
+  gcc_unreachable ();
+}
+
+static inline void
+set_cpu_feature (struct __processor_model *cpu_model,
+                 unsigned int *cpu_features2,
+                 enum processor_features f)
+{
+  unsigned int i;
+  if (f < 32)
+    {
+      /* The first 32 features. */
+      cpu_model->__cpu_features[0] |= (1U << (f & 31));
+      return;
+    }
+  /* The rest of features. cpu_features2[i] contains features from
+     (32 + i * 32) to (31 + 32 + i * 32), inclusively. */
+  for (i = 0; i < SIZE_OF_CPU_FEATURES; i++)
+    if (f < (32 + 32 + i * 32))
+      {
+        cpu_features2[i] |= (1U << ((f - (32 + i * 32)) & 31));
+        return;
+      }
+  gcc_unreachable ();
+}
+
+/* Get the specific type of AMD CPU and return AMD CPU name. Return
+   NULL for unknown AMD CPU. */
+
+static inline const char *
+get_amd_cpu (struct __processor_model *cpu_model,
+             struct __processor_model2 *cpu_model2,
+             unsigned int *cpu_features2)
+{
+  const char *cpu = NULL;
+  unsigned int family = cpu_model2->__cpu_family;
+  unsigned int model = cpu_model2->__cpu_model;
+
+  switch (family)
+    {
+    case 0x10:
+      /* AMD Family 10h. */
+      cpu = "amdfam10";
+      cpu_model->__cpu_type = AMDFAM10H;
+      switch (model)
+        {
+        case 0x2:
+          /* Barcelona. */
+          CHECK___builtin_cpu_is ("amdfam10h");
+          CHECK___builtin_cpu_is ("barcelona");
+          cpu_model->__cpu_subtype = AMDFAM10H_BARCELONA;
+          break;
+        case 0x4:
+          /* Shanghai. */
+          CHECK___builtin_cpu_is ("amdfam10h");
+          CHECK___builtin_cpu_is ("shanghai");
+          cpu_model->__cpu_subtype = AMDFAM10H_SHANGHAI;
+          break;
+        case 0x8:
+          /* Istanbul. */
+          CHECK___builtin_cpu_is ("amdfam10h");
+          CHECK___builtin_cpu_is ("istanbul");
+          cpu_model->__cpu_subtype = AMDFAM10H_ISTANBUL;
+          break;
+        default:
+          break;
+        }
+      break;
+    case 0x14:
+      /* AMD Family 14h "btver1". */
+      cpu = "btver1";
+      CHECK___builtin_cpu_is ("btver1");
+      cpu_model->__cpu_type = AMD_BTVER1;
+      break;
+    case 0x15:
+      /* AMD Family 15h "Bulldozer". */
+      cpu_model->__cpu_type = AMDFAM15H;
+      if (model == 0x2)
+        {
+          /* Bulldozer version 2 "Piledriver" */
+          cpu = "bdver2";
+          CHECK___builtin_cpu_is ("bdver2");
+          cpu_model->__cpu_subtype = AMDFAM15H_BDVER2;
+        }
+      else if (model <= 0xf)
+        {
+          /* Bulldozer version 1. */
+          cpu = "bdver1";
+          CHECK___builtin_cpu_is ("bdver1");
+          cpu_model->__cpu_subtype = AMDFAM15H_BDVER1;
+        }
+      else if (model <= 0x2f)
+        {
+          /* Bulldozer version 2 "Piledriver" */
+          cpu = "bdver2";
+          CHECK___builtin_cpu_is ("bdver2");
+          cpu_model->__cpu_subtype = AMDFAM15H_BDVER2;
+        }
+      else if (model <= 0x4f)
+        {
+          /* Bulldozer version 3 "Steamroller" */
+          cpu = "bdver3";
+          CHECK___builtin_cpu_is ("bdver3");
+          cpu_model->__cpu_subtype = AMDFAM15H_BDVER3;
+        }
+      else if (model <= 0x7f)
+        {
+          /* Bulldozer version 4 "Excavator" */
+          cpu = "bdver4";
+          CHECK___builtin_cpu_is ("bdver4");
+          cpu_model->__cpu_subtype = AMDFAM15H_BDVER4;
+        }
+      else if (has_cpu_feature (cpu_model, cpu_features2,
+                                FEATURE_AVX2))
+        {
+          cpu = "bdver4";
+          CHECK___builtin_cpu_is ("bdver4");
+          cpu_model->__cpu_subtype = AMDFAM15H_BDVER4;
+        }
+      else if (has_cpu_feature (cpu_model, cpu_features2,
+                                FEATURE_XSAVEOPT))
+        {
+          cpu = "bdver3";
+          CHECK___builtin_cpu_is ("bdver3");
+          cpu_model->__cpu_subtype = AMDFAM15H_BDVER3;
+        }
+      else if (has_cpu_feature (cpu_model, cpu_features2,
+                                FEATURE_BMI))
+        {
+          cpu = "bdver2";
+          CHECK___builtin_cpu_is ("bdver2");
+          cpu_model->__cpu_subtype = AMDFAM15H_BDVER2;
+        }
+      else if (has_cpu_feature (cpu_model, cpu_features2,
+                                FEATURE_XOP))
+        {
+          cpu = "bdver1";
+          CHECK___builtin_cpu_is ("bdver1");
+          cpu_model->__cpu_subtype = AMDFAM15H_BDVER1;
+        }
+      break;
+    case 0x16:
+      /* AMD Family 16h "btver2" */
+      cpu = "btver2";
+      CHECK___builtin_cpu_is ("btver2");
+      cpu_model->__cpu_type = AMD_BTVER2;
+      break;
+    case 0x17:
+      cpu_model->__cpu_type = AMDFAM17H;
+      if (model <= 0x1f)
+        {
+          /* AMD family 17h version 1. */
+          cpu = "znver1";
+          CHECK___builtin_cpu_is ("znver1");
+          cpu_model->__cpu_subtype = AMDFAM17H_ZNVER1;
+        }
+      else if (model >= 0x30)
+        {
+          cpu = "znver2";
+          CHECK___builtin_cpu_is ("znver2");
+          cpu_model->__cpu_subtype = AMDFAM17H_ZNVER2;
+        }
+      else if (has_cpu_feature (cpu_model, cpu_features2,
+                                FEATURE_CLWB))
+        {
+          cpu = "znver2";
+          CHECK___builtin_cpu_is ("znver2");
+          cpu_model->__cpu_subtype = AMDFAM17H_ZNVER2;
+        }
+      else if (has_cpu_feature (cpu_model, cpu_features2,
+                                FEATURE_CLZERO))
+        {
+          cpu = "znver1";
+          CHECK___builtin_cpu_is ("znver1");
+          cpu_model->__cpu_subtype = AMDFAM17H_ZNVER1;
+        }
+      break;
+    default:
+      break;
+    }
+
+  return cpu;
+}
+
+/* Get the specific type of Intel CPU and return Intel CPU name. Return
+   NULL for unknown Intel CPU. */
+
+static inline const char *
+get_intel_cpu (struct __processor_model *cpu_model,
+               struct __processor_model2 *cpu_model2,
+               unsigned int *cpu_features2,
+               unsigned int brand_id)
+{
+  const char *cpu = NULL;
+
+  /* Parse family and model only for brand ID 0 and model 6. */
+  if (brand_id != 0 || cpu_model2->__cpu_family != 0x6)
+    return cpu;
+
+  switch (cpu_model2->__cpu_model)
+    {
+    case 0x1c:
+    case 0x26:
+      /* Bonnell. */
+      cpu = "bonnell";
+      CHECK___builtin_cpu_is ("atom");
+      cpu_model->__cpu_type = INTEL_BONNELL;
+      break;
+    case 0x37:
+    case 0x4a:
+    case 0x4d:
+    case 0x5d:
+      /* Silvermont. */
+    case 0x4c:
+    case 0x5a:
+    case 0x75:
+      /* Airmont. */
+      cpu = "silvermont";
+      CHECK___builtin_cpu_is ("silvermont");
+      cpu_model->__cpu_type = INTEL_SILVERMONT;
+      break;
+    case 0x5c:
+    case 0x5f:
+      /* Goldmont. */
+      cpu = "goldmont";
+      CHECK___builtin_cpu_is ("goldmont");
+      cpu_model->__cpu_type = INTEL_GOLDMONT;
+      break;
+    case 0x7a:
+      /* Goldmont Plus. */
+      cpu = "goldmont-plus";
+      CHECK___builtin_cpu_is ("goldmont-plus");
+      cpu_model->__cpu_type = INTEL_GOLDMONT_PLUS;
+      break;
+    case 0x86:
+    case 0x96:
+    case 0x9c:
+      /* Tremont. */
+      cpu = "tremont";
+      CHECK___builtin_cpu_is ("tremont");
+      cpu_model->__cpu_type = INTEL_TREMONT;
+      break;
+    case 0x57:
+      /* Knights Landing. */
+      cpu = "knl";
+      CHECK___builtin_cpu_is ("knl");
+      cpu_model->__cpu_type = INTEL_KNL;
+      break;
+    case 0x85:
+      /* Knights Mill. */
+      cpu = "knm";
+      CHECK___builtin_cpu_is ("knm");
+      cpu_model->__cpu_type = INTEL_KNM;
+      break;
+    case 0x1a:
+    case 0x1e:
+    case 0x1f:
+    case 0x2e:
+      /* Nehalem. */
+      cpu = "nehalem";
+      CHECK___builtin_cpu_is ("corei7");
+      CHECK___builtin_cpu_is ("nehalem");
+      cpu_model->__cpu_type = INTEL_COREI7;
+      cpu_model->__cpu_subtype = INTEL_COREI7_NEHALEM;
+      break;
+    case 0x25:
+    case 0x2c:
+    case 0x2f:
+      /* Westmere. */
+      cpu = "westmere";
+      CHECK___builtin_cpu_is ("corei7");
+      CHECK___builtin_cpu_is ("westmere");
+      cpu_model->__cpu_type = INTEL_COREI7;
+      cpu_model->__cpu_subtype = INTEL_COREI7_WESTMERE;
+      break;
+    case 0x2a:
+    case 0x2d:
+      /* Sandy Bridge. */
+      cpu = "sandybridge";
+      CHECK___builtin_cpu_is ("corei7");
+      CHECK___builtin_cpu_is ("sandybridge");
+      cpu_model->__cpu_type = INTEL_COREI7;
+      cpu_model->__cpu_subtype = INTEL_COREI7_SANDYBRIDGE;
+      break;
+    case 0x3a:
+    case 0x3e:
+      /* Ivy Bridge. */
+      cpu = "ivybridge";
+      CHECK___builtin_cpu_is ("corei7");
+      CHECK___builtin_cpu_is ("ivybridge");
+      cpu_model->__cpu_type = INTEL_COREI7;
+      cpu_model->__cpu_subtype = INTEL_COREI7_IVYBRIDGE;
+      break;
+    case 0x3c:
+    case 0x3f:
+    case 0x45:
+    case 0x46:
+      /* Haswell. */
+      cpu = "haswell";
+      CHECK___builtin_cpu_is ("corei7");
+      CHECK___builtin_cpu_is ("haswell");
+      cpu_model->__cpu_type = INTEL_COREI7;
+      cpu_model->__cpu_subtype = INTEL_COREI7_HASWELL;
+      break;
+    case 0x3d:
+    case 0x47:
+    case 0x4f:
+    case 0x56:
+      /* Broadwell. */
+      cpu = "broadwell";
+      CHECK___builtin_cpu_is ("corei7");
+      CHECK___builtin_cpu_is ("broadwell");
+      cpu_model->__cpu_type = INTEL_COREI7;
+      cpu_model->__cpu_subtype = INTEL_COREI7_BROADWELL;
+      break;
+    case 0x4e:
+    case 0x5e:
+      /* Skylake. */
+    case 0x8e:
+    case 0x9e:
+      /* Kaby Lake. */
+    case 0xa5:
+    case 0xa6:
+      /* Comet Lake. */
+      cpu = "skylake";
+      CHECK___builtin_cpu_is ("corei7");
+      CHECK___builtin_cpu_is ("skylake");
+      cpu_model->__cpu_type = INTEL_COREI7;
+      cpu_model->__cpu_subtype = INTEL_COREI7_SKYLAKE;
+      break;
+    case 0x55:
+      CHECK___builtin_cpu_is ("corei7");
+      cpu_model->__cpu_type = INTEL_COREI7;
+      if (has_cpu_feature (cpu_model, cpu_features2,
+                           FEATURE_AVX512VNNI))
+        {
+          /* Cascade Lake. */
+          cpu = "cascadelake";
+          CHECK___builtin_cpu_is ("cascadelake");
+          cpu_model->__cpu_subtype = INTEL_COREI7_CASCADELAKE;
+        }
+      else
+        {
+          /* Skylake with AVX-512 support. */
+          cpu = "skylake-avx512";
+          CHECK___builtin_cpu_is ("skylake-avx512");
+          cpu_model->__cpu_subtype = INTEL_COREI7_SKYLAKE_AVX512;
+        }
+      break;
+    case 0x66:
+      /* Cannon Lake. */
+      cpu = "cannonlake";
+      CHECK___builtin_cpu_is ("corei7");
+      CHECK___builtin_cpu_is ("cannonlake");
+      cpu_model->__cpu_type = INTEL_COREI7;
+      cpu_model->__cpu_subtype = INTEL_COREI7_CANNONLAKE;
+      break;
+    case 0x6a:
+    case 0x6c:
+      /* Ice Lake server. */
+      cpu = "icelake-server";
+      CHECK___builtin_cpu_is ("corei7");
+      CHECK___builtin_cpu_is ("icelake-server");
+      cpu_model->__cpu_type = INTEL_COREI7;
+      cpu_model->__cpu_subtype = INTEL_COREI7_ICELAKE_SERVER;
+      break;
+    case 0x7e:
+    case 0x7d:
+    case 0x9d:
+      /* Ice Lake client. */
+      cpu = "icelake-client";
+      CHECK___builtin_cpu_is ("corei7");
+      CHECK___builtin_cpu_is ("icelake-client");
+      cpu_model->__cpu_type = INTEL_COREI7;
+      cpu_model->__cpu_subtype = INTEL_COREI7_ICELAKE_CLIENT;
+      break;
+    case 0x8c:
+    case 0x8d:
+      /* Tiger Lake. */
+      cpu = "tigerlake";
+      CHECK___builtin_cpu_is ("corei7");
+      CHECK___builtin_cpu_is ("tigerlake");
+      cpu_model->__cpu_type = INTEL_COREI7;
+      cpu_model->__cpu_subtype = INTEL_COREI7_TIGERLAKE;
+      break;
+    case 0x17:
+    case 0x1d:
+      /* Penryn. */
+    case 0x0f:
+      /* Merom. */
+      cpu = "core2";
+      CHECK___builtin_cpu_is ("core2");
+      cpu_model->__cpu_type = INTEL_CORE2;
+      break;
+    default:
+      break;
+    }
+
+  return cpu;
+}
+
+/* ECX and EDX are output of CPUID at level one. */
+static inline void
+get_available_features (struct __processor_model *cpu_model,
+                        struct __processor_model2 *cpu_model2,
+                        unsigned int *cpu_features2,
+                        unsigned int ecx, unsigned int edx)
+{
+  unsigned int max_cpuid_level = cpu_model2->__cpu_max_level;
+  unsigned int eax, ebx;
+  unsigned int ext_level;
+
+  /* Get XCR_XFEATURE_ENABLED_MASK register with xgetbv. */
+#define XCR_XFEATURE_ENABLED_MASK 0x0
+#define XSTATE_FP 0x1
+#define XSTATE_SSE 0x2
+#define XSTATE_YMM 0x4
+#define XSTATE_OPMASK 0x20
+#define XSTATE_ZMM 0x40
+#define XSTATE_HI_ZMM 0x80
+
+#define XCR_AVX_ENABLED_MASK \
+  (XSTATE_SSE | XSTATE_YMM)
+#define XCR_AVX512F_ENABLED_MASK \
+  (XSTATE_SSE | XSTATE_YMM | XSTATE_OPMASK | XSTATE_ZMM | XSTATE_HI_ZMM)
+
+  /* Check if AVX and AVX512 are usable. */
+  int avx_usable = 0;
+  int avx512_usable = 0;
+  if ((ecx & bit_OSXSAVE))
+    {
+      /* Check if XMM, YMM, OPMASK, upper 256 bits of ZMM0-ZMM15 and
+         ZMM16-ZMM31 states are supported by OSXSAVE. */
+      unsigned int xcrlow;
+      unsigned int xcrhigh;
+      __asm__ (".byte 0x0f, 0x01, 0xd0"
+               : "=a" (xcrlow), "=d" (xcrhigh)
+               : "c" (XCR_XFEATURE_ENABLED_MASK));
+      if ((xcrlow & XCR_AVX_ENABLED_MASK) == XCR_AVX_ENABLED_MASK)
+        {
+          avx_usable = 1;
+          avx512_usable = ((xcrlow & XCR_AVX512F_ENABLED_MASK)
+                           == XCR_AVX512F_ENABLED_MASK);
+        }
+    }
+
+#define set_feature(f) \
+  set_cpu_feature (cpu_model, cpu_features2, f)
+
+  if (edx & bit_CMOV)
+    set_feature (FEATURE_CMOV);
+  if (edx & bit_MMX)
+    set_feature (FEATURE_MMX);
+  if (edx & bit_SSE)
+    set_feature (FEATURE_SSE);
+  if (edx & bit_SSE2)
+    set_feature (FEATURE_SSE2);
+  if (edx & bit_CMPXCHG8B)
+    set_feature (FEATURE_CMPXCHG8B);
+  if (edx & bit_FXSAVE)
+    set_feature (FEATURE_FXSAVE);
+
+  if (ecx & bit_POPCNT)
+    set_feature (FEATURE_POPCNT);
+  if (ecx & bit_AES)
+    set_feature (FEATURE_AES);
+  if (ecx & bit_PCLMUL)
+    set_feature (FEATURE_PCLMUL);
+  if (ecx & bit_SSE3)
+    set_feature (FEATURE_SSE3);
+  if (ecx & bit_SSSE3)
+    set_feature (FEATURE_SSSE3);
+  if (ecx & bit_SSE4_1)
+    set_feature (FEATURE_SSE4_1);
+  if (ecx & bit_SSE4_2)
+    set_feature (FEATURE_SSE4_2);
+  if (ecx & bit_OSXSAVE)
+    set_feature (FEATURE_OSXSAVE);
+  if (ecx & bit_CMPXCHG16B)
+    set_feature (FEATURE_CMPXCHG16B);
+  if (ecx & bit_MOVBE)
+    set_feature (FEATURE_MOVBE);
+  if (ecx & bit_AES)
+    set_feature (FEATURE_AES);
+  if (ecx & bit_F16C)
+    set_feature (FEATURE_F16C);
+  if (ecx & bit_RDRND)
+    set_feature (FEATURE_RDRND);
+  if (ecx & bit_XSAVE)
+    set_feature (FEATURE_XSAVE);
+  if (avx_usable)
+    {
+      if (ecx & bit_AVX)
+        set_feature (FEATURE_AVX);
+      if (ecx & bit_FMA)
+        set_feature (FEATURE_FMA);
+    }
+
+  /* Get Advanced Features at level 7 (eax = 7, ecx = 0/1). */
+  if (max_cpuid_level >= 7)
+    {
+      __cpuid_count (7, 0, eax, ebx, ecx, edx);
+      if (ebx & bit_BMI)
+        set_feature (FEATURE_BMI);
+      if (ebx & bit_SGX)
+        set_feature (FEATURE_SGX);
+      if (ebx & bit_HLE)
+        set_feature (FEATURE_HLE);
+      if (ebx & bit_RTM)
+        set_feature (FEATURE_RTM);
+      if (avx_usable)
+        {
+          if (ebx & bit_AVX2)
+            set_feature (FEATURE_AVX2);
+          if (ecx & bit_VPCLMULQDQ)
+            set_feature (FEATURE_VPCLMULQDQ);
+        }
+      if (ebx & bit_BMI2)
+        set_feature (FEATURE_BMI2);
+      if (ebx & bit_FSGSBASE)
+        set_feature (FEATURE_FSGSBASE);
+      if (ebx & bit_RDSEED)
+        set_feature (FEATURE_RDSEED);
+      if (ebx & bit_ADX)
+        set_feature (FEATURE_ADX);
+      if (ebx & bit_SHA)
+        set_feature (FEATURE_SHA);
+      if (ebx & bit_CLFLUSHOPT)
+        set_feature (FEATURE_CLFLUSHOPT);
+      if (ebx & bit_CLWB)
+        set_feature (FEATURE_CLWB);
+      if (ecx & bit_PREFETCHWT1)
+        set_feature (FEATURE_PREFETCHWT1);
+      if (ecx & bit_OSPKE)
+        set_feature (FEATURE_OSPKE);
+      if (ecx & bit_RDPID)
+        set_feature (FEATURE_RDPID);
+      if (ecx & bit_VAES)
+        set_feature (FEATURE_VAES);
+      if (ecx & bit_GFNI)
+        set_feature (FEATURE_GFNI);
+      if (ecx & bit_MOVDIRI)
+        set_feature (FEATURE_MOVDIRI);
+      if (ecx & bit_MOVDIR64B)
+        set_feature (FEATURE_MOVDIR64B);
+      if (ecx & bit_ENQCMD)
+        set_feature (FEATURE_ENQCMD);
+      if (ecx & bit_CLDEMOTE)
+        set_feature (FEATURE_CLDEMOTE);
+      if (ecx & bit_WAITPKG)
+        set_feature (FEATURE_WAITPKG);
+      if (ecx & bit_SHSTK)
+        set_feature (FEATURE_SHSTK);
+      if (edx & bit_SERIALIZE)
+        set_feature (FEATURE_SERIALIZE);
+      if (edx & bit_TSXLDTRK)
+        set_feature (FEATURE_TSXLDTRK);
+      if (edx & bit_PCONFIG)
+        set_feature (FEATURE_PCONFIG);
+      if (edx & bit_IBT)
+        set_feature (FEATURE_IBT);
+      if (avx512_usable)
+        {
+          if (ebx & bit_AVX512F)
+            set_feature (FEATURE_AVX512F);
+          if (ebx & bit_AVX512VL)
+            set_feature (FEATURE_AVX512VL);
+          if (ebx & bit_AVX512BW)
+            set_feature (FEATURE_AVX512BW);
+          if (ebx & bit_AVX512DQ)
+            set_feature (FEATURE_AVX512DQ);
+          if (ebx & bit_AVX512CD)
+            set_feature (FEATURE_AVX512CD);
+          if (ebx & bit_AVX512PF)
+            set_feature (FEATURE_AVX512PF);
+          if (ebx & bit_AVX512ER)
+            set_feature (FEATURE_AVX512ER);
+          if (ebx & bit_AVX512IFMA)
+            set_feature (FEATURE_AVX512IFMA);
+          if (ecx & bit_AVX512VBMI)
+            set_feature (FEATURE_AVX512VBMI);
+          if (ecx & bit_AVX512VBMI2)
+            set_feature (FEATURE_AVX512VBMI2);
+          if (ecx & bit_AVX512VNNI)
+            set_feature (FEATURE_AVX512VNNI);
+          if (ecx & bit_AVX512BITALG)
+            set_feature (FEATURE_AVX512BITALG);
+          if (ecx & bit_AVX512VPOPCNTDQ)
+            set_feature (FEATURE_AVX512VPOPCNTDQ);
+          if (edx & bit_AVX5124VNNIW)
+            set_feature (FEATURE_AVX5124VNNIW);
+          if (edx & bit_AVX5124FMAPS)
+            set_feature (FEATURE_AVX5124FMAPS);
+          if (edx & bit_AVX512VP2INTERSECT)
+            set_feature (FEATURE_AVX512VP2INTERSECT);
+
+          __cpuid_count (7, 1, eax, ebx, ecx, edx);
+          if (eax & bit_AVX512BF16)
+            set_feature (FEATURE_AVX512BF16);
+        }
+    }
+
+  /* Get Advanced Features at level 0xd (eax = 0xd, ecx = 1). */
+  if (max_cpuid_level >= 0xd)
+    {
+      __cpuid_count (0xd, 1, eax, ebx, ecx, edx);
+      if (eax & bit_XSAVEOPT)
+        set_feature (FEATURE_XSAVEOPT);
+      if (eax & bit_XSAVEC)
+        set_feature (FEATURE_XSAVEC);
+      if (eax & bit_XSAVES)
+        set_feature (FEATURE_XSAVES);
+    }
+
+  /* Get Advanced Features at level 0x14 (eax = 0x14, ecx = 0). */
+  if (max_cpuid_level >= 0x14)
+    {
+      __cpuid_count (0x14, 0, eax, ebx, ecx, edx);
+      if (ebx & bit_PTWRITE)
+        set_feature (FEATURE_PTWRITE);
+    }
+
+  /* Check cpuid level of extended features. */
+  __cpuid (0x80000000, ext_level, ebx, ecx, edx);
+
+  cpu_model2->__cpu_ext_level = ext_level;
+
+  if (ext_level >= 0x80000001)
+    {
+      __cpuid (0x80000001, eax, ebx, ecx, edx);
+
+      if (ecx & bit_SSE4a)
+        set_feature (FEATURE_SSE4_A);
+      if (ecx & bit_LAHF_LM)
+        set_feature (FEATURE_LAHF_LM);
+      if (ecx & bit_ABM)
+        set_feature (FEATURE_ABM);
+      if (ecx & bit_LWP)
+        set_feature (FEATURE_LWP);
+      if (ecx & bit_TBM)
+        set_feature (FEATURE_TBM);
+      if (ecx & bit_LZCNT)
+        set_feature (FEATURE_LZCNT);
+      if (ecx & bit_PRFCHW)
+        set_feature (FEATURE_PRFCHW);
+      if (ecx & bit_MWAITX)
+        set_feature (FEATURE_MWAITX);
+
+      if (edx & bit_LM)
+        set_feature (FEATURE_LM);
+      if (edx & bit_3DNOWP)
+        set_feature (FEATURE_3DNOWP);
+      if (edx & bit_3DNOW)
+        set_feature (FEATURE_3DNOW);
+
+      if (avx_usable)
+        {
+          if (ecx & bit_FMA4)
+            set_feature (FEATURE_FMA4);
+          if (ecx & bit_XOP)
+            set_feature (FEATURE_XOP);
+        }
+    }
+
+  if (ext_level >= 0x80000008)
+    {
+      __cpuid (0x80000008, eax, ebx, ecx, edx);
+      if (ebx & bit_CLZERO)
+        set_feature (FEATURE_CLZERO);
+      if (ebx & bit_WBNOINVD)
+        set_feature (FEATURE_WBNOINVD);
+    }
+
+#undef set_feature
+}
+
+static inline int
+cpu_indicator_init (struct __processor_model *cpu_model,
+                    struct __processor_model2 *cpu_model2,
+                    unsigned int *cpu_features2)
+{
+  unsigned int eax, ebx, ecx, edx;
+
+  int max_level;
+  unsigned int vendor;
+  unsigned int model, family, brand_id;
+  unsigned int extended_model, extended_family;
+
+  /* This function needs to run just once. */
+  if (cpu_model->__cpu_vendor)
+    return 0;
+
+  /* Assume cpuid insn present. Run in level 0 to get vendor id. */
+  if (!__get_cpuid (0, &eax, &ebx, &ecx, &edx))
+    {
+      cpu_model->__cpu_vendor = VENDOR_OTHER;
+      return -1;
+    }
+
+  vendor = ebx;
+  max_level = eax;
+
+  if (max_level < 1)
+    {
+      cpu_model->__cpu_vendor = VENDOR_OTHER;
+      return -1;
+    }
+
+  if (!__get_cpuid (1, &eax, &ebx, &ecx, &edx))
+    {
+      cpu_model->__cpu_vendor = VENDOR_OTHER;
+      return -1;
+    }
+
+  cpu_model2->__cpu_max_level = max_level;
+
+  model = (eax >> 4) & 0x0f;
+  family = (eax >> 8) & 0x0f;
+  brand_id = ebx & 0xff;
+  extended_model = (eax >> 12) & 0xf0;
+  extended_family = (eax >> 20) & 0xff;
+
+  if (vendor == signature_INTEL_ebx)
+    {
+      /* Adjust model and family for Intel CPUS. */
+      if (family == 0x0f)
+        {
+          family += extended_family;
+          model += extended_model;
+        }
+      else if (family == 0x06)
+        model += extended_model;
+
+      cpu_model2->__cpu_family = family;
+      cpu_model2->__cpu_model = model;
+
+      /* Find available features. */
+      get_available_features (cpu_model, cpu_model2, cpu_features2,
+                              ecx, edx);
+      /* Get CPU type. */
+      get_intel_cpu (cpu_model, cpu_model2, cpu_features2, brand_id);
+      cpu_model->__cpu_vendor = VENDOR_INTEL;
+    }
+  else if (vendor == signature_AMD_ebx)
+    {
+      /* Adjust model and family for AMD CPUS. */
+      if (family == 0x0f)
+        {
+          family += extended_family;
+          model += extended_model;
+        }
+
+      cpu_model2->__cpu_family = family;
+      cpu_model2->__cpu_model = model;
+
+      /* Find available features. */
+      get_available_features (cpu_model, cpu_model2, cpu_features2,
+                              ecx, edx);
+      /* Get CPU type.
*/ + get_amd_cpu (cpu_model, cpu_model2, cpu_features2); + cpu_model->__cpu_vendor = VENDOR_AMD; + } + else if (vendor == signature_CENTAUR_ebx) + cpu_model->__cpu_vendor = VENDOR_CENTAUR; + else if (vendor == signature_CYRIX_ebx) + cpu_model->__cpu_vendor = VENDOR_CYRIX; + else if (vendor == signature_NSC_ebx) + cpu_model->__cpu_vendor = VENDOR_NSC; + else + cpu_model->__cpu_vendor = VENDOR_OTHER; + + gcc_assert (cpu_model->__cpu_vendor < VENDOR_MAX); + gcc_assert (cpu_model->__cpu_type < CPU_TYPE_MAX); + gcc_assert (cpu_model->__cpu_subtype < CPU_SUBTYPE_MAX); + + return 0; +} diff --git a/gcc/common/config/i386/i386-isas.h b/gcc/common/config/i386/i386-isas.h new file mode 100644 index 00000000000..32faee6c608 --- /dev/null +++ b/gcc/common/config/i386/i386-isas.h @@ -0,0 +1,201 @@ +/* i386 ISA table. + Copyright (C) 2020 Free Software Foundation, Inc. + +This file is part of GCC. + +GCC is free software; you can redistribute it and/or modify +it under the terms of the GNU General Public License as published by +the Free Software Foundation; either version 3, or (at your option) +any later version. + +GCC is distributed in the hope that it will be useful, +but WITHOUT ANY WARRANTY; without even the implied warranty of +MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the +GNU General Public License for more details. + +You should have received a copy of the GNU General Public License +along with GCC; see the file COPYING3. If not see +<http://www.gnu.org/licenses/>. */ + +/* Priority of i386 features, greater value is higher priority. This is + used to decide the order in which function dispatch must happen. For + instance, a version specialized for SSE4.2 should be checked for dispatch + before a version for SSE3, as SSE4.2 implies SSE3.
*/ +enum feature_priority +{ + P_ZERO = 0, + P_MMX, + P_SSE, + P_SSE2, + P_SSE3, + P_SSSE3, + P_PROC_SSSE3, + P_SSE4_A, + P_PROC_SSE4_A, + P_SSE4_1, + P_SSE4_2, + P_PROC_SSE4_2, + P_POPCNT, + P_AES, + P_PCLMUL, + P_AVX, + P_PROC_AVX, + P_BMI, + P_PROC_BMI, + P_FMA4, + P_XOP, + P_PROC_XOP, + P_FMA, + P_PROC_FMA, + P_BMI2, + P_AVX2, + P_PROC_AVX2, + P_AVX512F, + P_PROC_AVX512F +}; + +/* These are the target attribute strings for which a dispatcher is + available, from fold_builtin_cpu. */ +struct _isa_names_table +{ + const char *const name; + const enum processor_features feature; + const enum feature_priority priority; + const char *const option; +}; + +/* NB: isa_names_table is shared by i386-builtins.c, driver-i386.c and + gcc.target/i386/builtin_target.c. isa_names_table is a static const + array in i386-builtins.c and driver-i386.c. But it is a list of + assert statements in gcc.target/i386/builtin_target.c. */ + +#ifndef ISA_NAMES_TABLE_START +# define ISA_NAMES_TABLE_START \ + static const struct _isa_names_table isa_names_table[] = { +#endif + +#ifndef ISA_NAMES_TABLE_END +# define ISA_NAMES_TABLE_END }; +#endif + +#ifndef ISA_NAMES_TABLE_ENTRY +# define ISA_NAMES_TABLE_ENTRY(name, feature, priority, option) \ + {name, feature, priority, option}, +#endif + +ISA_NAMES_TABLE_START + ISA_NAMES_TABLE_ENTRY("cmov", FEATURE_CMOV, P_ZERO, NULL) + ISA_NAMES_TABLE_ENTRY("mmx", FEATURE_MMX, P_MMX, "-mmmx") + ISA_NAMES_TABLE_ENTRY("popcnt", FEATURE_POPCNT, P_POPCNT, "-mpopcnt") + ISA_NAMES_TABLE_ENTRY("sse", FEATURE_SSE, P_SSE, "-msse") + ISA_NAMES_TABLE_ENTRY("sse2", FEATURE_SSE2, P_SSE2, "-msse2") + ISA_NAMES_TABLE_ENTRY("sse3", FEATURE_SSE3, P_SSE3, "-msse3") + ISA_NAMES_TABLE_ENTRY("ssse3", FEATURE_SSSE3, P_SSSE3, "-mssse3") + ISA_NAMES_TABLE_ENTRY("sse4.1", FEATURE_SSE4_1, P_SSE4_1, "-msse4.1") + ISA_NAMES_TABLE_ENTRY("sse4.2", FEATURE_SSE4_2, P_SSE4_2, "-msse4.2") + ISA_NAMES_TABLE_ENTRY("avx", FEATURE_AVX, P_AVX, "-mavx") + ISA_NAMES_TABLE_ENTRY("avx2", 
FEATURE_AVX2, P_AVX2, "-mavx2") + ISA_NAMES_TABLE_ENTRY("sse4a", FEATURE_SSE4_A, P_SSE4_A, "-msse4a") + ISA_NAMES_TABLE_ENTRY("fma4", FEATURE_FMA4, P_FMA4, "-mfma4") + ISA_NAMES_TABLE_ENTRY("xop", FEATURE_XOP, P_XOP, "-mxop") + ISA_NAMES_TABLE_ENTRY("fma", FEATURE_FMA, P_FMA, "-mfma") + ISA_NAMES_TABLE_ENTRY("avx512f", FEATURE_AVX512F, P_AVX512F, + "-mavx512f") + ISA_NAMES_TABLE_ENTRY("bmi", FEATURE_BMI, P_BMI, "-mbmi") + ISA_NAMES_TABLE_ENTRY("bmi2", FEATURE_BMI2, P_BMI2, "-mbmi2") + ISA_NAMES_TABLE_ENTRY("aes", FEATURE_AES, P_AES, "-maes") + ISA_NAMES_TABLE_ENTRY("pclmul", FEATURE_PCLMUL, P_PCLMUL, "-mpclmul") + ISA_NAMES_TABLE_ENTRY("avx512vl", FEATURE_AVX512VL, P_ZERO, + "-mavx512vl") + ISA_NAMES_TABLE_ENTRY("avx512bw", FEATURE_AVX512BW, P_ZERO, + "-mavx512bw") + ISA_NAMES_TABLE_ENTRY("avx512dq", FEATURE_AVX512DQ, P_ZERO, + "-mavx512dq") + ISA_NAMES_TABLE_ENTRY("avx512cd", FEATURE_AVX512CD, P_ZERO, + "-mavx512cd") + ISA_NAMES_TABLE_ENTRY("avx512er", FEATURE_AVX512ER, P_ZERO, + "-mavx512er") + ISA_NAMES_TABLE_ENTRY("avx512pf", FEATURE_AVX512PF, P_ZERO, + "-mavx512pf") + ISA_NAMES_TABLE_ENTRY("avx512vbmi", FEATURE_AVX512VBMI, P_ZERO, + "-mavx512vbmi") + ISA_NAMES_TABLE_ENTRY("avx512ifma", FEATURE_AVX512IFMA, P_ZERO, + "-mavx512ifma") + ISA_NAMES_TABLE_ENTRY("avx5124vnniw", FEATURE_AVX5124VNNIW, P_ZERO, + "-mavx5124vnniw") + ISA_NAMES_TABLE_ENTRY("avx5124fmaps", FEATURE_AVX5124FMAPS, P_ZERO, + "-mavx5124fmaps") + ISA_NAMES_TABLE_ENTRY("avx512vpopcntdq", FEATURE_AVX512VPOPCNTDQ, + P_ZERO, "-mavx512vpopcntdq") + ISA_NAMES_TABLE_ENTRY("avx512vbmi2", FEATURE_AVX512VBMI2, P_ZERO, + "-mavx512vbmi2") + ISA_NAMES_TABLE_ENTRY("gfni", FEATURE_GFNI, P_ZERO, "-mgfni") + ISA_NAMES_TABLE_ENTRY("vpclmulqdq", FEATURE_VPCLMULQDQ, P_ZERO, + "-mvpclmulqdq") + ISA_NAMES_TABLE_ENTRY("avx512vnni", FEATURE_AVX512VNNI, P_ZERO, + "-mavx512vnni") + ISA_NAMES_TABLE_ENTRY("avx512bitalg", FEATURE_AVX512BITALG, P_ZERO, + "-mavx512bitalg") + ISA_NAMES_TABLE_ENTRY("avx512bf16", 
FEATURE_AVX512BF16, P_ZERO, + "-mavx512bf16") + ISA_NAMES_TABLE_ENTRY("avx512vp2intersect", FEATURE_AVX512VP2INTERSECT, + P_ZERO, "-mavx512vp2intersect") + ISA_NAMES_TABLE_ENTRY("3dnow", FEATURE_3DNOW, P_ZERO, "-m3dnow") + ISA_NAMES_TABLE_ENTRY("3dnowp", FEATURE_3DNOWP, P_ZERO, NULL) + ISA_NAMES_TABLE_ENTRY("adx", FEATURE_ADX, P_ZERO, "-madx") + ISA_NAMES_TABLE_ENTRY("abm", FEATURE_ABM, P_ZERO, "-mabm") + ISA_NAMES_TABLE_ENTRY("cldemote", FEATURE_CLDEMOTE, P_ZERO, + "-mcldemote") + ISA_NAMES_TABLE_ENTRY("clflushopt", FEATURE_CLFLUSHOPT, P_ZERO, + "-mclflushopt") + ISA_NAMES_TABLE_ENTRY("clwb", FEATURE_CLWB, P_ZERO, "-mclwb") + ISA_NAMES_TABLE_ENTRY("clzero", FEATURE_CLZERO, P_ZERO, "-mclzero") + ISA_NAMES_TABLE_ENTRY("cmpxchg16b", FEATURE_CMPXCHG16B, P_ZERO, + "-mcx16") + ISA_NAMES_TABLE_ENTRY("cmpxchg8b", FEATURE_CMPXCHG8B, P_ZERO, NULL) + ISA_NAMES_TABLE_ENTRY("enqcmd", FEATURE_ENQCMD, P_ZERO, "-menqcmd") + ISA_NAMES_TABLE_ENTRY("f16c", FEATURE_F16C, P_ZERO, "-mf16c") + ISA_NAMES_TABLE_ENTRY("fsgsbase", FEATURE_FSGSBASE, P_ZERO, + "-mfsgsbase") + ISA_NAMES_TABLE_ENTRY("fxsave", FEATURE_FXSAVE, P_ZERO, "-mfxsr") + ISA_NAMES_TABLE_ENTRY("hle", FEATURE_HLE, P_ZERO, "-mhle") + ISA_NAMES_TABLE_ENTRY("ibt", FEATURE_IBT, P_ZERO, NULL) + ISA_NAMES_TABLE_ENTRY("lahf_lm", FEATURE_LAHF_LM, P_ZERO, "-msahf") + ISA_NAMES_TABLE_ENTRY("lm", FEATURE_LM, P_ZERO, NULL) + ISA_NAMES_TABLE_ENTRY("lwp", FEATURE_LWP, P_ZERO, "-mlwp") + ISA_NAMES_TABLE_ENTRY("lzcnt", FEATURE_LZCNT, P_ZERO, "-mlzcnt") + ISA_NAMES_TABLE_ENTRY("movbe", FEATURE_MOVBE, P_ZERO, "-mmovbe") + ISA_NAMES_TABLE_ENTRY("movdir64b", FEATURE_MOVDIR64B, P_ZERO, + "-mmovdir64b") + ISA_NAMES_TABLE_ENTRY("movdiri", FEATURE_MOVDIRI, P_ZERO, "-mmovdiri") + ISA_NAMES_TABLE_ENTRY("mwaitx", FEATURE_MWAITX, P_ZERO, "-mmwaitx") + ISA_NAMES_TABLE_ENTRY("ospke", FEATURE_OSPKE, P_ZERO, NULL) + ISA_NAMES_TABLE_ENTRY("osxsave", FEATURE_OSXSAVE, P_ZERO, NULL) + ISA_NAMES_TABLE_ENTRY("pconfig", FEATURE_PCONFIG, P_ZERO, "-mpconfig") + 
ISA_NAMES_TABLE_ENTRY("pku", FEATURE_PKU, P_ZERO, "-mpku") + ISA_NAMES_TABLE_ENTRY("prefetchwt1", FEATURE_PREFETCHWT1, P_ZERO, + "-mprefetchwt1") + ISA_NAMES_TABLE_ENTRY("prfchw", FEATURE_PRFCHW, P_ZERO, "-mprfchw") + ISA_NAMES_TABLE_ENTRY("ptwrite", FEATURE_PTWRITE, P_ZERO, "-mptwrite") + ISA_NAMES_TABLE_ENTRY("rdpid", FEATURE_RDPID, P_ZERO, "-mrdpid") + ISA_NAMES_TABLE_ENTRY("rdrnd", FEATURE_RDRND, P_ZERO, "-mrdrnd") + ISA_NAMES_TABLE_ENTRY("rdseed", FEATURE_RDSEED, P_ZERO, "-mrdseed") + ISA_NAMES_TABLE_ENTRY("rtm", FEATURE_RTM, P_ZERO, "-mrtm") + ISA_NAMES_TABLE_ENTRY("serialize", FEATURE_SERIALIZE, P_ZERO, + "-mserialize") + ISA_NAMES_TABLE_ENTRY("sgx", FEATURE_SGX, P_ZERO, "-msgx") + ISA_NAMES_TABLE_ENTRY("sha", FEATURE_SHA, P_ZERO, "-msha") + ISA_NAMES_TABLE_ENTRY("shstk", FEATURE_SHSTK, P_ZERO, "-mshstk") + ISA_NAMES_TABLE_ENTRY("tbm", FEATURE_TBM, P_ZERO, "-mtbm") + ISA_NAMES_TABLE_ENTRY("tsxldtrk", FEATURE_TSXLDTRK, P_ZERO, + "-mtsxldtrk") + ISA_NAMES_TABLE_ENTRY("vaes", FEATURE_VAES, P_ZERO, "-mvaes") + ISA_NAMES_TABLE_ENTRY("waitpkg", FEATURE_WAITPKG, P_ZERO, "-mwaitpkg") + ISA_NAMES_TABLE_ENTRY("wbnoinvd", FEATURE_WBNOINVD, P_ZERO, + "-mwbnoinvd") + ISA_NAMES_TABLE_ENTRY("xsave", FEATURE_XSAVE, P_ZERO, "-mxsave") + ISA_NAMES_TABLE_ENTRY("xsavec", FEATURE_XSAVEC, P_ZERO, "-mxsavec") + ISA_NAMES_TABLE_ENTRY("xsaveopt", FEATURE_XSAVEOPT, P_ZERO, + "-mxsaveopt") + ISA_NAMES_TABLE_ENTRY("xsaves", FEATURE_XSAVES, P_ZERO, "-mxsaves") +ISA_NAMES_TABLE_END diff --git a/gcc/config/i386/driver-i386.c b/gcc/config/i386/driver-i386.c index 3a816400729..3641d81c80b 100644 --- a/gcc/config/i386/driver-i386.c +++ b/gcc/config/i386/driver-i386.c @@ -28,6 +28,8 @@ const char *host_detect_local_cpu (int argc, const char **argv); #if defined(__GNUC__) && (__GNUC__ >= 5 || !defined(__PIC__)) #include "cpuid.h" +#include "common/config/i386/cpuinfo.h" +#include "common/config/i386/i386-isas.h" struct cache_desc { @@ -388,53 +390,13 @@ const char *host_detect_local_cpu (int 
argc, const char **argv) const char *cache = ""; const char *options = ""; - unsigned int eax, ebx, ecx, edx; + unsigned int ebx, ecx, edx; unsigned int max_level, ext_level; unsigned int vendor; unsigned int model, family; - unsigned int has_sse3, has_ssse3, has_cmpxchg16b; - unsigned int has_cmpxchg8b, has_cmov, has_mmx, has_sse, has_sse2; - - /* Extended features */ - unsigned int has_lahf_lm = 0, has_sse4a = 0; - unsigned int has_longmode = 0, has_3dnowp = 0, has_3dnow = 0; - unsigned int has_movbe = 0, has_sse4_1 = 0, has_sse4_2 = 0; - unsigned int has_popcnt = 0, has_aes = 0, has_avx = 0, has_avx2 = 0; - unsigned int has_pclmul = 0, has_abm = 0, has_lwp = 0; - unsigned int has_fma = 0, has_fma4 = 0, has_xop = 0; - unsigned int has_bmi = 0, has_bmi2 = 0, has_tbm = 0, has_lzcnt = 0; - unsigned int has_hle = 0, has_rtm = 0, has_sgx = 0; - unsigned int has_pconfig = 0, has_wbnoinvd = 0; - unsigned int has_rdrnd = 0, has_f16c = 0, has_fsgsbase = 0; - unsigned int has_rdseed = 0, has_prfchw = 0, has_adx = 0; - unsigned int has_osxsave = 0, has_fxsr = 0, has_xsave = 0, has_xsaveopt = 0; - unsigned int has_avx512er = 0, has_avx512pf = 0, has_avx512cd = 0; - unsigned int has_avx512f = 0, has_sha = 0, has_prefetchwt1 = 0; - unsigned int has_clflushopt = 0, has_xsavec = 0, has_xsaves = 0; - unsigned int has_avx512dq = 0, has_avx512bw = 0, has_avx512vl = 0; - unsigned int has_avx512vbmi = 0, has_avx512ifma = 0, has_clwb = 0; - unsigned int has_mwaitx = 0, has_clzero = 0, has_pku = 0, has_rdpid = 0; - unsigned int has_avx5124fmaps = 0, has_avx5124vnniw = 0; - unsigned int has_gfni = 0, has_avx512vbmi2 = 0; - unsigned int has_avx512bitalg = 0; - unsigned int has_avx512vpopcntdq = 0; - unsigned int has_shstk = 0; - unsigned int has_avx512vnni = 0, has_vaes = 0; - unsigned int has_vpclmulqdq = 0; - unsigned int has_avx512vp2intersect = 0; - unsigned int has_movdiri = 0, has_movdir64b = 0; - unsigned int has_enqcmd = 0; - unsigned int has_waitpkg = 0; - unsigned int 
has_cldemote = 0; - unsigned int has_avx512bf16 = 0; - unsigned int has_serialize = 0; - unsigned int has_tsxldtrk = 0; - - unsigned int has_ptwrite = 0; - bool arch; unsigned int l2sizekb = 0; @@ -447,210 +409,27 @@ const char *host_detect_local_cpu (int argc, const char **argv) if (!arch && strcmp (argv[0], "tune")) return NULL; - max_level = __get_cpuid_max (0, &vendor); - if (max_level < 1) - goto done; - - __cpuid (1, eax, ebx, ecx, edx); - - model = (eax >> 4) & 0x0f; - family = (eax >> 8) & 0x0f; - if (vendor == signature_INTEL_ebx - || vendor == signature_AMD_ebx) - { - unsigned int extended_model, extended_family; - - extended_model = (eax >> 12) & 0xf0; - extended_family = (eax >> 20) & 0xff; - if (family == 0x0f) - { - family += extended_family; - model += extended_model; - } - else if (family == 0x06) - model += extended_model; - } - - has_sse3 = ecx & bit_SSE3; - has_ssse3 = ecx & bit_SSSE3; - has_sse4_1 = ecx & bit_SSE4_1; - has_sse4_2 = ecx & bit_SSE4_2; - has_avx = ecx & bit_AVX; - has_osxsave = ecx & bit_OSXSAVE; - has_cmpxchg16b = ecx & bit_CMPXCHG16B; - has_movbe = ecx & bit_MOVBE; - has_popcnt = ecx & bit_POPCNT; - has_aes = ecx & bit_AES; - has_pclmul = ecx & bit_PCLMUL; - has_fma = ecx & bit_FMA; - has_f16c = ecx & bit_F16C; - has_rdrnd = ecx & bit_RDRND; - has_xsave = ecx & bit_XSAVE; - - has_cmpxchg8b = edx & bit_CMPXCHG8B; - has_cmov = edx & bit_CMOV; - has_mmx = edx & bit_MMX; - has_fxsr = edx & bit_FXSAVE; - has_sse = edx & bit_SSE; - has_sse2 = edx & bit_SSE2; - - if (max_level >= 7) - { - __cpuid_count (7, 0, eax, ebx, ecx, edx); - - has_bmi = ebx & bit_BMI; - has_sgx = ebx & bit_SGX; - has_hle = ebx & bit_HLE; - has_rtm = ebx & bit_RTM; - has_avx2 = ebx & bit_AVX2; - has_bmi2 = ebx & bit_BMI2; - has_fsgsbase = ebx & bit_FSGSBASE; - has_rdseed = ebx & bit_RDSEED; - has_adx = ebx & bit_ADX; - has_avx512f = ebx & bit_AVX512F; - has_avx512er = ebx & bit_AVX512ER; - has_avx512pf = ebx & bit_AVX512PF; - has_avx512cd = ebx & bit_AVX512CD; - 
has_sha = ebx & bit_SHA; - has_clflushopt = ebx & bit_CLFLUSHOPT; - has_clwb = ebx & bit_CLWB; - has_avx512dq = ebx & bit_AVX512DQ; - has_avx512bw = ebx & bit_AVX512BW; - has_avx512vl = ebx & bit_AVX512VL; - has_avx512ifma = ebx & bit_AVX512IFMA; - - has_prefetchwt1 = ecx & bit_PREFETCHWT1; - has_avx512vbmi = ecx & bit_AVX512VBMI; - has_pku = ecx & bit_OSPKE; - has_avx512vbmi2 = ecx & bit_AVX512VBMI2; - has_avx512vnni = ecx & bit_AVX512VNNI; - has_rdpid = ecx & bit_RDPID; - has_gfni = ecx & bit_GFNI; - has_vaes = ecx & bit_VAES; - has_vpclmulqdq = ecx & bit_VPCLMULQDQ; - has_avx512bitalg = ecx & bit_AVX512BITALG; - has_avx512vpopcntdq = ecx & bit_AVX512VPOPCNTDQ; - has_movdiri = ecx & bit_MOVDIRI; - has_movdir64b = ecx & bit_MOVDIR64B; - has_enqcmd = ecx & bit_ENQCMD; - has_cldemote = ecx & bit_CLDEMOTE; - - has_avx5124vnniw = edx & bit_AVX5124VNNIW; - has_avx5124fmaps = edx & bit_AVX5124FMAPS; - has_avx512vp2intersect = edx & bit_AVX512VP2INTERSECT; - has_serialize = edx & bit_SERIALIZE; - has_tsxldtrk = edx & bit_TSXLDTRK; - - has_shstk = ecx & bit_SHSTK; - has_pconfig = edx & bit_PCONFIG; - has_waitpkg = ecx & bit_WAITPKG; - - __cpuid_count (7, 1, eax, ebx, ecx, edx); - has_avx512bf16 = eax & bit_AVX512BF16; - } - - if (max_level >= 13) - { - __cpuid_count (13, 1, eax, ebx, ecx, edx); - - has_xsaveopt = eax & bit_XSAVEOPT; - has_xsavec = eax & bit_XSAVEC; - has_xsaves = eax & bit_XSAVES; - } - - if (max_level >= 0x14) - { - __cpuid_count (0x14, 0, eax, ebx, ecx, edx); - - has_ptwrite = ebx & bit_PTWRITE; - } - - /* Check cpuid level of extended features. 
*/ - __cpuid (0x80000000, ext_level, ebx, ecx, edx); - - if (ext_level >= 0x80000001) - { - __cpuid (0x80000001, eax, ebx, ecx, edx); - - has_lahf_lm = ecx & bit_LAHF_LM; - has_sse4a = ecx & bit_SSE4a; - has_abm = ecx & bit_ABM; - has_lwp = ecx & bit_LWP; - has_fma4 = ecx & bit_FMA4; - has_xop = ecx & bit_XOP; - has_tbm = ecx & bit_TBM; - has_lzcnt = ecx & bit_LZCNT; - has_prfchw = ecx & bit_PRFCHW; - - has_longmode = edx & bit_LM; - has_3dnowp = edx & bit_3DNOWP; - has_3dnow = edx & bit_3DNOW; - has_mwaitx = ecx & bit_MWAITX; - } - - if (ext_level >= 0x80000008) - { - __cpuid (0x80000008, eax, ebx, ecx, edx); - has_clzero = ebx & bit_CLZERO; - has_wbnoinvd = ebx & bit_WBNOINVD; - } - - /* Get XCR_XFEATURE_ENABLED_MASK register with xgetbv. */ -#define XCR_XFEATURE_ENABLED_MASK 0x0 -#define XSTATE_FP 0x1 -#define XSTATE_SSE 0x2 -#define XSTATE_YMM 0x4 -#define XSTATE_OPMASK 0x20 -#define XSTATE_ZMM 0x40 -#define XSTATE_HI_ZMM 0x80 - -#define XCR_AVX_ENABLED_MASK \ - (XSTATE_SSE | XSTATE_YMM) -#define XCR_AVX512F_ENABLED_MASK \ - (XSTATE_SSE | XSTATE_YMM | XSTATE_OPMASK | XSTATE_ZMM | XSTATE_HI_ZMM) - - if (has_osxsave) - asm (".byte 0x0f; .byte 0x01; .byte 0xd0" - : "=a" (eax), "=d" (edx) - : "c" (XCR_XFEATURE_ENABLED_MASK)); - else - eax = 0; + struct __processor_model cpu_model = { }; + struct __processor_model2 cpu_model2 = { }; + unsigned int cpu_features2[SIZE_OF_CPU_FEATURES] = { }; - /* Check if AVX registers are supported. */ - if ((eax & XCR_AVX_ENABLED_MASK) != XCR_AVX_ENABLED_MASK) - { - has_avx = 0; - has_avx2 = 0; - has_fma = 0; - has_fma4 = 0; - has_f16c = 0; - has_xop = 0; - has_xsave = 0; - has_xsaveopt = 0; - has_xsaves = 0; - has_xsavec = 0; - } + if (cpu_indicator_init (&cpu_model, &cpu_model2, cpu_features2) != 0) + goto done; - /* Check if AVX512F registers are supported. 
*/ - if ((eax & XCR_AVX512F_ENABLED_MASK) != XCR_AVX512F_ENABLED_MASK) - { - has_avx512f = 0; - has_avx512er = 0; - has_avx512pf = 0; - has_avx512cd = 0; - has_avx512dq = 0; - has_avx512bw = 0; - has_avx512vl = 0; - } + vendor = cpu_model.__cpu_vendor; + family = cpu_model2.__cpu_family; + model = cpu_model2.__cpu_model; + max_level = cpu_model2.__cpu_max_level; + ext_level = cpu_model2.__cpu_ext_level; if (!arch) { - if (vendor == signature_AMD_ebx - || vendor == signature_CENTAUR_ebx - || vendor == signature_CYRIX_ebx - || vendor == signature_NSC_ebx) + if (vendor == VENDOR_AMD + || vendor == VENDOR_CENTAUR + || vendor == VENDOR_CYRIX + || vendor == VENDOR_NSC) cache = detect_caches_amd (ext_level); - else if (vendor == signature_INTEL_ebx) + else if (vendor == VENDOR_INTEL) { bool xeon_mp = (family == 15 && model == 6); cache = detect_caches_intel (xeon_mp, max_level, @@ -658,7 +437,11 @@ const char *host_detect_local_cpu (int argc, const char **argv) } } - if (vendor == signature_AMD_ebx) + /* Extended features */ +#define has_feature(f) \ + has_cpu_feature (&cpu_model, cpu_features2, f) + + if (vendor == VENDOR_AMD) { unsigned int name; @@ -668,36 +451,23 @@ const char *host_detect_local_cpu (int argc, const char **argv) else name = 0; - if (name == signature_NSC_ebx) - processor = PROCESSOR_GEODE; - else if (has_movbe && family == 22) - processor = PROCESSOR_BTVER2; - else if (has_clwb) - processor = PROCESSOR_ZNVER2; - else if (has_clzero) - processor = PROCESSOR_ZNVER1; - else if (has_avx2) - processor = PROCESSOR_BDVER4; - else if (has_xsaveopt) - processor = PROCESSOR_BDVER3; - else if (has_bmi) - processor = PROCESSOR_BDVER2; - else if (has_xop) - processor = PROCESSOR_BDVER1; - else if (has_sse4a && has_ssse3) - processor = PROCESSOR_BTVER1; - else if (has_sse4a) - processor = PROCESSOR_AMDFAM10; - else if (has_sse2 || has_longmode) - processor = PROCESSOR_K8; - else if (has_3dnowp && family == 6) - processor = PROCESSOR_ATHLON; - else if (has_mmx) - 
processor = PROCESSOR_K6; - else - processor = PROCESSOR_PENTIUM; + cpu = get_amd_cpu (&cpu_model, &cpu_model2, cpu_features2); + if (cpu == NULL) + { + if (name == signature_NSC_ebx) + processor = PROCESSOR_GEODE; + else if (has_feature (FEATURE_SSE2) + || has_feature (FEATURE_LM)) + processor = PROCESSOR_K8; + else if (has_feature (FEATURE_3DNOWP) && family == 6) + processor = PROCESSOR_ATHLON; + else if (has_feature (FEATURE_MMX)) + processor = PROCESSOR_K6; + else + processor = PROCESSOR_PENTIUM; + } } - else if (vendor == signature_CENTAUR_ebx) + else if (vendor == VENDOR_CENTAUR) { processor = PROCESSOR_GENERIC; @@ -708,12 +478,13 @@ const char *host_detect_local_cpu (int argc, const char **argv) break; case 5: - if (has_3dnow || has_mmx) + if (has_feature (FEATURE_3DNOW) + || has_feature (FEATURE_MMX)) processor = PROCESSOR_I486; break; case 6: - if (has_longmode) + if (has_feature (FEATURE_LM)) processor = PROCESSOR_K8; else if (model >= 9) processor = PROCESSOR_PENTIUMPRO; @@ -749,11 +520,11 @@ const char *host_detect_local_cpu (int argc, const char **argv) /* Default. */ break; case PROCESSOR_I486: - if (arch && vendor == signature_CENTAUR_ebx) + if (arch && vendor == VENDOR_CENTAUR) { if (model >= 6) cpu = "c3"; - else if (has_3dnow) + else if (has_feature (FEATURE_3DNOW)) cpu = "winchip2"; else /* Assume WinChip C6. */ @@ -763,226 +534,104 @@ const char *host_detect_local_cpu (int argc, const char **argv) cpu = "i486"; break; case PROCESSOR_PENTIUM: - if (arch && has_mmx) + if (arch && has_feature (FEATURE_MMX)) cpu = "pentium-mmx"; else cpu = "pentium"; break; case PROCESSOR_PENTIUMPRO: - switch (model) + cpu = get_intel_cpu (&cpu_model, &cpu_model2, cpu_features2, 0); + if (cpu == NULL) { - case 0x1c: - case 0x26: - /* Bonnell. */ - cpu = "bonnell"; - break; - case 0x37: - case 0x4a: - case 0x4d: - case 0x5d: - /* Silvermont. */ - case 0x4c: - case 0x5a: - case 0x75: - /* Airmont. 
*/ - cpu = "silvermont"; - break; - case 0x5c: - case 0x5f: - /* Goldmont. */ - cpu = "goldmont"; - break; - case 0x7a: - /* Goldmont Plus. */ - cpu = "goldmont-plus"; - break; - case 0x86: - case 0x96: - case 0x9c: - /* Tremont. */ - cpu = "tremont"; - break; - case 0x0f: - /* Merom. */ - case 0x17: - case 0x1d: - /* Penryn. */ - cpu = "core2"; - break; - case 0x1a: - case 0x1e: - case 0x1f: - case 0x2e: - /* Nehalem. */ - cpu = "nehalem"; - break; - case 0x25: - case 0x2c: - case 0x2f: - /* Westmere. */ - cpu = "westmere"; - break; - case 0x2a: - case 0x2d: - /* Sandy Bridge. */ - cpu = "sandybridge"; - break; - case 0x3a: - case 0x3e: - /* Ivy Bridge. */ - cpu = "ivybridge"; - break; - case 0x3c: - case 0x3f: - case 0x45: - case 0x46: - /* Haswell. */ - cpu = "haswell"; - break; - case 0x3d: - case 0x47: - case 0x4f: - case 0x56: - /* Broadwell. */ - cpu = "broadwell"; - break; - case 0x4e: - case 0x5e: - /* Skylake. */ - case 0x8e: - case 0x9e: - /* Kaby Lake. */ - case 0xa5: - case 0xa6: - /* Comet Lake. */ - cpu = "skylake"; - break; - case 0x55: - if (has_avx512vnni) - /* Cascade Lake. */ - cpu = "cascadelake"; - else - /* Skylake with AVX-512. */ - cpu = "skylake-avx512"; - break; - case 0x6a: - case 0x6c: - /* Ice Lake server. */ - cpu = "icelake-server"; - break; - case 0x7e: - case 0x7d: - case 0x9d: - /* Ice Lake client. */ - cpu = "icelake-client"; - break; - case 0x8c: - case 0x8d: - /* Tiger Lake. */ - cpu = "tigerlake"; - break; - case 0x57: - /* Knights Landing. */ - cpu = "knl"; - break; - case 0x66: - /* Cannon Lake. */ - cpu = "cannonlake"; - break; - case 0x85: - /* Knights Mill. */ - cpu = "knm"; - break; - default: if (arch) { /* This is unknown family 0x6 CPU. */ - if (has_avx) - { - /* Assume Tiger Lake */ - if (has_avx512vp2intersect) - cpu = "tigerlake"; - /* Assume Cooper Lake */ - else if (has_avx512bf16) - cpu = "cooperlake"; - /* Assume Ice Lake Server. 
*/ - else if (has_wbnoinvd) - cpu = "icelake-server"; + if (has_feature (FEATURE_AVX)) + { + /* Assume Tiger Lake */ + if (has_feature (FEATURE_AVX512VP2INTERSECT)) + cpu = "tigerlake"; + /* Assume Cooper Lake */ + else if (has_feature (FEATURE_AVX512BF16)) + cpu = "cooperlake"; + /* Assume Ice Lake Server. */ + else if (has_feature (FEATURE_WBNOINVD)) + cpu = "icelake-server"; /* Assume Ice Lake. */ - else if (has_avx512bitalg) + else if (has_feature (FEATURE_AVX512BITALG)) cpu = "icelake-client"; /* Assume Cannon Lake. */ - else if (has_avx512vbmi) + else if (has_feature (FEATURE_AVX512VBMI)) cpu = "cannonlake"; /* Assume Knights Mill. */ - else if (has_avx5124vnniw) + else if (has_feature (FEATURE_AVX5124VNNIW)) cpu = "knm"; /* Assume Knights Landing. */ - else if (has_avx512er) + else if (has_feature (FEATURE_AVX512ER)) cpu = "knl"; /* Assume Skylake with AVX-512. */ - else if (has_avx512f) + else if (has_feature (FEATURE_AVX512F)) cpu = "skylake-avx512"; /* Assume Skylake. */ - else if (has_clflushopt) + else if (has_feature (FEATURE_CLFLUSHOPT)) cpu = "skylake"; /* Assume Broadwell. */ - else if (has_adx) + else if (has_feature (FEATURE_ADX)) cpu = "broadwell"; - else if (has_avx2) + else if (has_feature (FEATURE_AVX2)) /* Assume Haswell. */ cpu = "haswell"; else /* Assume Sandy Bridge. */ cpu = "sandybridge"; } - else if (has_sse4_2) + else if (has_feature (FEATURE_SSE4_2)) { - if (has_gfni) + if (has_feature (FEATURE_GFNI)) /* Assume Tremont. */ cpu = "tremont"; - else if (has_sgx) + else if (has_feature (FEATURE_SGX)) /* Assume Goldmont Plus. */ cpu = "goldmont-plus"; - else if (has_xsave) + else if (has_feature (FEATURE_XSAVE)) /* Assume Goldmont. */ cpu = "goldmont"; - else if (has_movbe) + else if (has_feature (FEATURE_MOVBE)) /* Assume Silvermont. */ cpu = "silvermont"; else /* Assume Nehalem. */ cpu = "nehalem"; } - else if (has_ssse3) + else if (has_feature (FEATURE_SSSE3)) { - if (has_movbe) + if (has_feature (FEATURE_MOVBE)) /* Assume Bonnell. 
*/ cpu = "bonnell"; else /* Assume Core 2. */ cpu = "core2"; } - else if (has_longmode) + else if (has_feature (FEATURE_LM)) /* Perhaps some emulator? Assume x86-64, otherwise gcc -march=native would be unusable for 64-bit compilations, as all the CPUs below are 32-bit only. */ cpu = "x86-64"; - else if (has_sse3) + else if (has_feature (FEATURE_SSE3)) { - if (vendor == signature_CENTAUR_ebx) + if (vendor == VENDOR_CENTAUR) /* C7 / Eden "Esther" */ cpu = "c7"; else /* It is Core Duo. */ cpu = "pentium-m"; } - else if (has_sse2) + else if (has_feature (FEATURE_SSE2)) /* It is Pentium M. */ cpu = "pentium-m"; - else if (has_sse) + else if (has_feature (FEATURE_SSE)) { - if (vendor == signature_CENTAUR_ebx) + if (vendor == VENDOR_CENTAUR) { if (model >= 9) /* Eden "Nehemiah" */ @@ -994,7 +643,7 @@ const char *host_detect_local_cpu (int argc, const char **argv) /* It is Pentium III. */ cpu = "pentium3"; } - else if (has_mmx) + else if (has_feature (FEATURE_MMX)) /* It is Pentium II. */ cpu = "pentium2"; else @@ -1004,13 +653,12 @@ const char *host_detect_local_cpu (int argc, const char **argv) else /* For -mtune, we default to -mtune=generic. 
*/ cpu = "generic"; - break; } break; case PROCESSOR_PENTIUM4: - if (has_sse3) + if (has_feature (FEATURE_SSE3)) { - if (has_longmode) + if (has_feature (FEATURE_LM)) cpu = "nocona"; else cpu = "prescott"; @@ -1022,13 +670,13 @@ const char *host_detect_local_cpu (int argc, const char **argv) cpu = "geode"; break; case PROCESSOR_K6: - if (arch && has_3dnow) + if (arch && has_feature (FEATURE_3DNOW)) cpu = "k6-3"; else cpu = "k6"; break; case PROCESSOR_ATHLON: - if (arch && has_sse) + if (arch && has_feature (FEATURE_SSE)) cpu = "athlon-4"; else cpu = "athlon"; @@ -1036,22 +684,22 @@ const char *host_detect_local_cpu (int argc, const char **argv) case PROCESSOR_K8: if (arch) { - if (vendor == signature_CENTAUR_ebx) + if (vendor == VENDOR_CENTAUR) { - if (has_sse4_1) + if (has_feature (FEATURE_SSE4_1)) /* Nano 3000 | Nano dual / quad core | Eden X4 */ cpu = "nano-3000"; - else if (has_ssse3) + else if (has_feature (FEATURE_SSSE3)) /* Nano 1000 | Nano 2000 */ cpu = "nano"; - else if (has_sse3) + else if (has_feature (FEATURE_SSE3)) /* Eden X2 */ cpu = "eden-x2"; else /* Default to k8 */ cpu = "k8"; } - else if (has_sse3) + else if (has_feature (FEATURE_SSE3)) cpu = "k8-sse3"; else cpu = "k8"; @@ -1060,59 +708,32 @@ const char *host_detect_local_cpu (int argc, const char **argv) /* For -mtune, we default to -mtune=k8 */ cpu = "k8"; break; - case PROCESSOR_AMDFAM10: - cpu = "amdfam10"; - break; - case PROCESSOR_BDVER1: - cpu = "bdver1"; - break; - case PROCESSOR_BDVER2: - cpu = "bdver2"; - break; - case PROCESSOR_BDVER3: - cpu = "bdver3"; - break; - case PROCESSOR_BDVER4: - cpu = "bdver4"; - break; - case PROCESSOR_ZNVER1: - cpu = "znver1"; - break; - case PROCESSOR_ZNVER2: - cpu = "znver2"; - break; - case PROCESSOR_BTVER1: - cpu = "btver1"; - break; - case PROCESSOR_BTVER2: - cpu = "btver2"; - break; default: /* Use something reasonable. 
*/ if (arch) { - if (has_ssse3) + if (has_feature (FEATURE_SSSE3)) cpu = "core2"; - else if (has_sse3) + else if (has_feature (FEATURE_SSE3)) { - if (has_longmode) + if (has_feature (FEATURE_LM)) cpu = "nocona"; else cpu = "prescott"; } - else if (has_longmode) + else if (has_feature (FEATURE_LM)) /* Perhaps some emulator? Assume x86-64, otherwise gcc -march=native would be unusable for 64-bit compilations, as all the CPUs below are 32-bit only. */ cpu = "x86-64"; - else if (has_sse2) + else if (has_feature (FEATURE_SSE2)) cpu = "pentium4"; - else if (has_cmov) + else if (has_feature (FEATURE_CMOV)) cpu = "pentiumpro"; - else if (has_mmx) + else if (has_feature (FEATURE_MMX)) cpu = "pentium-mmx"; - else if (has_cmpxchg8b) + else if (has_feature (FEATURE_CMPXCHG8B)) cpu = "pentium"; } else @@ -1121,101 +742,18 @@ const char *host_detect_local_cpu (int argc, const char **argv) if (arch) { - const char *mmx = has_mmx ? " -mmmx" : " -mno-mmx"; - const char *mmx3dnow = has_3dnow ? " -m3dnow" : " -mno-3dnow"; - const char *sse = has_sse ? " -msse" : " -mno-sse"; - const char *sse2 = has_sse2 ? " -msse2" : " -mno-sse2"; - const char *sse3 = has_sse3 ? " -msse3" : " -mno-sse3"; - const char *ssse3 = has_ssse3 ? " -mssse3" : " -mno-ssse3"; - const char *sse4a = has_sse4a ? " -msse4a" : " -mno-sse4a"; - const char *cx16 = has_cmpxchg16b ? " -mcx16" : " -mno-cx16"; - const char *sahf = has_lahf_lm ? " -msahf" : " -mno-sahf"; - const char *movbe = has_movbe ? " -mmovbe" : " -mno-movbe"; - const char *aes = has_aes ? " -maes" : " -mno-aes"; - const char *sha = has_sha ? " -msha" : " -mno-sha"; - const char *pclmul = has_pclmul ? " -mpclmul" : " -mno-pclmul"; - const char *popcnt = has_popcnt ? " -mpopcnt" : " -mno-popcnt"; - const char *abm = has_abm ? " -mabm" : " -mno-abm"; - const char *lwp = has_lwp ? " -mlwp" : " -mno-lwp"; - const char *fma = has_fma ? " -mfma" : " -mno-fma"; - const char *fma4 = has_fma4 ? " -mfma4" : " -mno-fma4"; - const char *xop = has_xop ? 
" -mxop" : " -mno-xop"; - const char *bmi = has_bmi ? " -mbmi" : " -mno-bmi"; - const char *pconfig = has_pconfig ? " -mpconfig" : " -mno-pconfig"; - const char *wbnoinvd = has_wbnoinvd ? " -mwbnoinvd" : " -mno-wbnoinvd"; - const char *sgx = has_sgx ? " -msgx" : " -mno-sgx"; - const char *bmi2 = has_bmi2 ? " -mbmi2" : " -mno-bmi2"; - const char *tbm = has_tbm ? " -mtbm" : " -mno-tbm"; - const char *avx = has_avx ? " -mavx" : " -mno-avx"; - const char *avx2 = has_avx2 ? " -mavx2" : " -mno-avx2"; - const char *sse4_2 = has_sse4_2 ? " -msse4.2" : " -mno-sse4.2"; - const char *sse4_1 = has_sse4_1 ? " -msse4.1" : " -mno-sse4.1"; - const char *lzcnt = has_lzcnt ? " -mlzcnt" : " -mno-lzcnt"; - const char *hle = has_hle ? " -mhle" : " -mno-hle"; - const char *rtm = has_rtm ? " -mrtm" : " -mno-rtm"; - const char *rdrnd = has_rdrnd ? " -mrdrnd" : " -mno-rdrnd"; - const char *f16c = has_f16c ? " -mf16c" : " -mno-f16c"; - const char *fsgsbase = has_fsgsbase ? " -mfsgsbase" : " -mno-fsgsbase"; - const char *rdseed = has_rdseed ? " -mrdseed" : " -mno-rdseed"; - const char *prfchw = has_prfchw ? " -mprfchw" : " -mno-prfchw"; - const char *adx = has_adx ? " -madx" : " -mno-adx"; - const char *fxsr = has_fxsr ? " -mfxsr" : " -mno-fxsr"; - const char *xsave = has_xsave ? " -mxsave" : " -mno-xsave"; - const char *xsaveopt = has_xsaveopt ? " -mxsaveopt" : " -mno-xsaveopt"; - const char *avx512f = has_avx512f ? " -mavx512f" : " -mno-avx512f"; - const char *avx512er = has_avx512er ? " -mavx512er" : " -mno-avx512er"; - const char *avx512cd = has_avx512cd ? " -mavx512cd" : " -mno-avx512cd"; - const char *avx512pf = has_avx512pf ? " -mavx512pf" : " -mno-avx512pf"; - const char *prefetchwt1 = has_prefetchwt1 ? " -mprefetchwt1" : " -mno-prefetchwt1"; - const char *clflushopt = has_clflushopt ? " -mclflushopt" : " -mno-clflushopt"; - const char *xsavec = has_xsavec ? " -mxsavec" : " -mno-xsavec"; - const char *xsaves = has_xsaves ? 
" -mxsaves" : " -mno-xsaves"; - const char *avx512dq = has_avx512dq ? " -mavx512dq" : " -mno-avx512dq"; - const char *avx512bw = has_avx512bw ? " -mavx512bw" : " -mno-avx512bw"; - const char *avx512vl = has_avx512vl ? " -mavx512vl" : " -mno-avx512vl"; - const char *avx512ifma = has_avx512ifma ? " -mavx512ifma" : " -mno-avx512ifma"; - const char *avx512vbmi = has_avx512vbmi ? " -mavx512vbmi" : " -mno-avx512vbmi"; - const char *avx5124vnniw = has_avx5124vnniw ? " -mavx5124vnniw" : " -mno-avx5124vnniw"; - const char *avx512vbmi2 = has_avx512vbmi2 ? " -mavx512vbmi2" : " -mno-avx512vbmi2"; - const char *avx512vnni = has_avx512vnni ? " -mavx512vnni" : " -mno-avx512vnni"; - const char *avx5124fmaps = has_avx5124fmaps ? " -mavx5124fmaps" : " -mno-avx5124fmaps"; - const char *clwb = has_clwb ? " -mclwb" : " -mno-clwb"; - const char *mwaitx = has_mwaitx ? " -mmwaitx" : " -mno-mwaitx"; - const char *clzero = has_clzero ? " -mclzero" : " -mno-clzero"; - const char *pku = has_pku ? " -mpku" : " -mno-pku"; - const char *rdpid = has_rdpid ? " -mrdpid" : " -mno-rdpid"; - const char *gfni = has_gfni ? " -mgfni" : " -mno-gfni"; - const char *shstk = has_shstk ? " -mshstk" : " -mno-shstk"; - const char *vaes = has_vaes ? " -mvaes" : " -mno-vaes"; - const char *vpclmulqdq = has_vpclmulqdq ? " -mvpclmulqdq" : " -mno-vpclmulqdq"; - const char *avx512vp2intersect = has_avx512vp2intersect ? " -mavx512vp2intersect" : " -mno-avx512vp2intersect"; - const char *tsxldtrk = has_tsxldtrk ? " -mtsxldtrk " : " -mno-tsxldtrk"; - const char *avx512bitalg = has_avx512bitalg ? " -mavx512bitalg" : " -mno-avx512bitalg"; - const char *avx512vpopcntdq = has_avx512vpopcntdq ? " -mavx512vpopcntdq" : " -mno-avx512vpopcntdq"; - const char *movdiri = has_movdiri ? " -mmovdiri" : " -mno-movdiri"; - const char *movdir64b = has_movdir64b ? " -mmovdir64b" : " -mno-movdir64b"; - const char *enqcmd = has_enqcmd ? " -menqcmd" : " -mno-enqcmd"; - const char *waitpkg = has_waitpkg ? 
" -mwaitpkg" : " -mno-waitpkg"; - const char *cldemote = has_cldemote ? " -mcldemote" : " -mno-cldemote"; - const char *serialize = has_serialize ? " -mserialize" : " -mno-serialize"; - const char *ptwrite = has_ptwrite ? " -mptwrite" : " -mno-ptwrite"; - const char *avx512bf16 = has_avx512bf16 ? " -mavx512bf16" : " -mno-avx512bf16"; - - options = concat (options, mmx, mmx3dnow, sse, sse2, sse3, ssse3, - sse4a, cx16, sahf, movbe, aes, sha, pclmul, - popcnt, abm, lwp, fma, fma4, xop, bmi, sgx, bmi2, - pconfig, wbnoinvd, - tbm, avx, avx2, sse4_2, sse4_1, lzcnt, rtm, - hle, rdrnd, f16c, fsgsbase, rdseed, prfchw, adx, - fxsr, xsave, xsaveopt, avx512f, avx512er, - avx512cd, avx512pf, prefetchwt1, clflushopt, - xsavec, xsaves, avx512dq, avx512bw, avx512vl, - avx512ifma, avx512vbmi, avx5124fmaps, avx5124vnniw, - clwb, mwaitx, clzero, pku, rdpid, gfni, shstk, - avx512vbmi2, avx512vnni, vaes, vpclmulqdq, - avx512bitalg, avx512vpopcntdq, movdiri, movdir64b, - waitpkg, cldemote, ptwrite, avx512bf16, enqcmd, - avx512vp2intersect, serialize, tsxldtrk, NULL); + unsigned int i; + const char *const neg_option = " -mno-"; + for (i = 0; i < ARRAY_SIZE (isa_names_table); i++) + if (isa_names_table[i].option) + { + if (has_feature (isa_names_table[i].feature)) + options = concat (options, " ", + isa_names_table[i].option, NULL); + else + options = concat (options, neg_option, + isa_names_table[i].option + 2, NULL); + } } done: diff --git a/gcc/config/i386/i386-builtins.c b/gcc/config/i386/i386-builtins.c index be3ed0158f2..fbbc957510a 100644 --- a/gcc/config/i386/i386-builtins.c +++ b/gcc/config/i386/i386-builtins.c @@ -90,6 +90,8 @@ along with GCC; see the file COPYING3. 
If not see #include "debug.h" #include "dwarf2out.h" #include "i386-builtins.h" +#include "common/config/i386/cpuinfo-builtins.h" +#include "common/config/i386/i386-isas.h" #undef BDESC #undef BDESC_FIRST @@ -1835,235 +1837,66 @@ ix86_builtin_reciprocal (tree fndecl) } } -/* Priority of i386 features, greater value is higher priority. This is - used to decide the order in which function dispatch must happen. For - instance, a version specialized for SSE4.2 should be checked for dispatch - before a version for SSE3, as SSE4.2 implies SSE3. */ -enum feature_priority -{ - P_ZERO = 0, - P_MMX, - P_SSE, - P_SSE2, - P_SSE3, - P_SSSE3, - P_PROC_SSSE3, - P_SSE4_A, - P_PROC_SSE4_A, - P_SSE4_1, - P_SSE4_2, - P_PROC_SSE4_2, - P_POPCNT, - P_AES, - P_PCLMUL, - P_AVX, - P_PROC_AVX, - P_BMI, - P_PROC_BMI, - P_FMA4, - P_XOP, - P_PROC_XOP, - P_FMA, - P_PROC_FMA, - P_BMI2, - P_AVX2, - P_PROC_AVX2, - P_AVX512F, - P_PROC_AVX512F -}; - -/* This is the order of bit-fields in __processor_features in cpuinfo.c */ -enum processor_features -{ - F_CMOV = 0, - F_MMX, - F_POPCNT, - F_SSE, - F_SSE2, - F_SSE3, - F_SSSE3, - F_SSE4_1, - F_SSE4_2, - F_AVX, - F_AVX2, - F_SSE4_A, - F_FMA4, - F_XOP, - F_FMA, - F_AVX512F, - F_BMI, - F_BMI2, - F_AES, - F_PCLMUL, - F_AVX512VL, - F_AVX512BW, - F_AVX512DQ, - F_AVX512CD, - F_AVX512ER, - F_AVX512PF, - F_AVX512VBMI, - F_AVX512IFMA, - F_AVX5124VNNIW, - F_AVX5124FMAPS, - F_AVX512VPOPCNTDQ, - F_AVX512VBMI2, - F_GFNI, - F_VPCLMULQDQ, - F_AVX512VNNI, - F_AVX512BITALG, - F_AVX512BF16, - F_AVX512VP2INTERSECT, - F_MAX -}; - -/* These are the values for vendor types and cpu types and subtypes - in cpuinfo.c. Cpu types and subtypes should be subtracted by - the corresponding start value. 
*/ -enum processor_model -{ - M_INTEL = 1, - M_AMD, - M_CPU_TYPE_START, - M_INTEL_BONNELL, - M_INTEL_CORE2, - M_INTEL_COREI7, - M_AMDFAM10H, - M_AMDFAM15H, - M_INTEL_SILVERMONT, - M_INTEL_KNL, - M_AMD_BTVER1, - M_AMD_BTVER2, - M_AMDFAM17H, - M_INTEL_KNM, - M_INTEL_GOLDMONT, - M_INTEL_GOLDMONT_PLUS, - M_INTEL_TREMONT, - M_CPU_SUBTYPE_START, - M_INTEL_COREI7_NEHALEM, - M_INTEL_COREI7_WESTMERE, - M_INTEL_COREI7_SANDYBRIDGE, - M_AMDFAM10H_BARCELONA, - M_AMDFAM10H_SHANGHAI, - M_AMDFAM10H_ISTANBUL, - M_AMDFAM15H_BDVER1, - M_AMDFAM15H_BDVER2, - M_AMDFAM15H_BDVER3, - M_AMDFAM15H_BDVER4, - M_AMDFAM17H_ZNVER1, - M_INTEL_COREI7_IVYBRIDGE, - M_INTEL_COREI7_HASWELL, - M_INTEL_COREI7_BROADWELL, - M_INTEL_COREI7_SKYLAKE, - M_INTEL_COREI7_SKYLAKE_AVX512, - M_INTEL_COREI7_CANNONLAKE, - M_INTEL_COREI7_ICELAKE_CLIENT, - M_INTEL_COREI7_ICELAKE_SERVER, - M_AMDFAM17H_ZNVER2, - M_INTEL_COREI7_CASCADELAKE, - M_INTEL_COREI7_TIGERLAKE, - M_INTEL_COREI7_COOPERLAKE -}; - struct _arch_names_table { const char *const name; - const enum processor_model model; + const int model; }; -static const _arch_names_table arch_names_table[] = -{ - {"amd", M_AMD}, - {"intel", M_INTEL}, - {"atom", M_INTEL_BONNELL}, - {"slm", M_INTEL_SILVERMONT}, - {"core2", M_INTEL_CORE2}, - {"corei7", M_INTEL_COREI7}, - {"nehalem", M_INTEL_COREI7_NEHALEM}, - {"westmere", M_INTEL_COREI7_WESTMERE}, - {"sandybridge", M_INTEL_COREI7_SANDYBRIDGE}, - {"ivybridge", M_INTEL_COREI7_IVYBRIDGE}, - {"haswell", M_INTEL_COREI7_HASWELL}, - {"broadwell", M_INTEL_COREI7_BROADWELL}, - {"skylake", M_INTEL_COREI7_SKYLAKE}, - {"skylake-avx512", M_INTEL_COREI7_SKYLAKE_AVX512}, - {"cannonlake", M_INTEL_COREI7_CANNONLAKE}, - {"icelake-client", M_INTEL_COREI7_ICELAKE_CLIENT}, - {"icelake-server", M_INTEL_COREI7_ICELAKE_SERVER}, - {"cascadelake", M_INTEL_COREI7_CASCADELAKE}, - {"tigerlake", M_INTEL_COREI7_TIGERLAKE}, - {"cooperlake", M_INTEL_COREI7_COOPERLAKE}, - {"bonnell", M_INTEL_BONNELL}, - {"silvermont", M_INTEL_SILVERMONT}, - {"goldmont", 
M_INTEL_GOLDMONT}, - {"goldmont-plus", M_INTEL_GOLDMONT_PLUS}, - {"tremont", M_INTEL_TREMONT}, - {"knl", M_INTEL_KNL}, - {"knm", M_INTEL_KNM}, - {"amdfam10h", M_AMDFAM10H}, - {"barcelona", M_AMDFAM10H_BARCELONA}, - {"shanghai", M_AMDFAM10H_SHANGHAI}, - {"istanbul", M_AMDFAM10H_ISTANBUL}, - {"btver1", M_AMD_BTVER1}, - {"amdfam15h", M_AMDFAM15H}, - {"bdver1", M_AMDFAM15H_BDVER1}, - {"bdver2", M_AMDFAM15H_BDVER2}, - {"bdver3", M_AMDFAM15H_BDVER3}, - {"bdver4", M_AMDFAM15H_BDVER4}, - {"btver2", M_AMD_BTVER2}, - {"amdfam17h", M_AMDFAM17H}, - {"znver1", M_AMDFAM17H_ZNVER1}, - {"znver2", M_AMDFAM17H_ZNVER2}, -}; +/* These are the values for vendor types, cpu types and subtypes in + cpuinfo.h. Cpu types and subtypes should be subtracted by the + corresponding start value. */ -/* These are the target attribute strings for which a dispatcher is - available, from fold_builtin_cpu. */ -struct _isa_names_table -{ - const char *const name; - const enum processor_features feature; - const enum feature_priority priority; -}; +#define M_CPU_TYPE_START (BUILTIN_VENDOR_MAX) +#define M_CPU_SUBTYPE_START \ + (M_CPU_TYPE_START + BUILTIN_CPU_TYPE_MAX) +#define M_VENDOR(a) (a) +#define M_CPU_TYPE(a) (M_CPU_TYPE_START + a) +#define M_CPU_SUBTYPE(a) (M_CPU_SUBTYPE_START + a) -static const _isa_names_table isa_names_table[] = +static const _arch_names_table arch_names_table[] = { - {"cmov", F_CMOV, P_ZERO}, - {"mmx", F_MMX, P_MMX}, - {"popcnt", F_POPCNT, P_POPCNT}, - {"sse", F_SSE, P_SSE}, - {"sse2", F_SSE2, P_SSE2}, - {"sse3", F_SSE3, P_SSE3}, - {"ssse3", F_SSSE3, P_SSSE3}, - {"sse4a", F_SSE4_A, P_SSE4_A}, - {"sse4.1", F_SSE4_1, P_SSE4_1}, - {"sse4.2", F_SSE4_2, P_SSE4_2}, - {"avx", F_AVX, P_AVX}, - {"fma4", F_FMA4, P_FMA4}, - {"xop", F_XOP, P_XOP}, - {"fma", F_FMA, P_FMA}, - {"avx2", F_AVX2, P_AVX2}, - {"avx512f", F_AVX512F, P_AVX512F}, - {"bmi", F_BMI, P_BMI}, - {"bmi2", F_BMI2, P_BMI2}, - {"aes", F_AES, P_AES}, - {"pclmul", F_PCLMUL, P_PCLMUL}, - {"avx512vl",F_AVX512VL, P_ZERO}, - 
{"avx512bw",F_AVX512BW, P_ZERO}, - {"avx512dq",F_AVX512DQ, P_ZERO}, - {"avx512cd",F_AVX512CD, P_ZERO}, - {"avx512er",F_AVX512ER, P_ZERO}, - {"avx512pf",F_AVX512PF, P_ZERO}, - {"avx512vbmi",F_AVX512VBMI, P_ZERO}, - {"avx512ifma",F_AVX512IFMA, P_ZERO}, - {"avx5124vnniw",F_AVX5124VNNIW, P_ZERO}, - {"avx5124fmaps",F_AVX5124FMAPS, P_ZERO}, - {"avx512vpopcntdq",F_AVX512VPOPCNTDQ, P_ZERO}, - {"avx512vbmi2", F_AVX512VBMI2, P_ZERO}, - {"gfni", F_GFNI, P_ZERO}, - {"vpclmulqdq", F_VPCLMULQDQ, P_ZERO}, - {"avx512vnni", F_AVX512VNNI, P_ZERO}, - {"avx512bitalg", F_AVX512BITALG, P_ZERO}, - {"avx512bf16", F_AVX512BF16, P_ZERO}, - {"avx512vp2intersect",F_AVX512VP2INTERSECT, P_ZERO} + {"amd", M_VENDOR (VENDOR_AMD)}, + {"intel", M_VENDOR (VENDOR_INTEL)}, + {"atom", M_CPU_TYPE (INTEL_BONNELL)}, + {"slm", M_CPU_TYPE (INTEL_SILVERMONT)}, + {"core2", M_CPU_TYPE (INTEL_CORE2)}, + {"corei7", M_CPU_TYPE (INTEL_COREI7)}, + {"nehalem", M_CPU_SUBTYPE (INTEL_COREI7_NEHALEM)}, + {"westmere", M_CPU_SUBTYPE (INTEL_COREI7_WESTMERE)}, + {"sandybridge", M_CPU_SUBTYPE (INTEL_COREI7_SANDYBRIDGE)}, + {"ivybridge", M_CPU_SUBTYPE (INTEL_COREI7_IVYBRIDGE)}, + {"haswell", M_CPU_SUBTYPE (INTEL_COREI7_HASWELL)}, + {"broadwell", M_CPU_SUBTYPE (INTEL_COREI7_BROADWELL)}, + {"skylake", M_CPU_SUBTYPE (INTEL_COREI7_SKYLAKE)}, + {"skylake-avx512", M_CPU_SUBTYPE (INTEL_COREI7_SKYLAKE_AVX512)}, + {"cannonlake", M_CPU_SUBTYPE (INTEL_COREI7_CANNONLAKE)}, + {"icelake-client", M_CPU_SUBTYPE (INTEL_COREI7_ICELAKE_CLIENT)}, + {"icelake-server", M_CPU_SUBTYPE (INTEL_COREI7_ICELAKE_SERVER)}, + {"cascadelake", M_CPU_SUBTYPE (INTEL_COREI7_CASCADELAKE)}, + {"tigerlake", M_CPU_SUBTYPE (INTEL_COREI7_TIGERLAKE)}, + {"cooperlake", M_CPU_SUBTYPE (INTEL_COREI7_COOPERLAKE)}, + {"bonnell", M_CPU_TYPE (INTEL_BONNELL)}, + {"silvermont", M_CPU_TYPE (INTEL_SILVERMONT)}, + {"goldmont", M_CPU_TYPE (INTEL_GOLDMONT)}, + {"goldmont-plus", M_CPU_TYPE (INTEL_GOLDMONT_PLUS)}, + {"tremont", M_CPU_TYPE (INTEL_TREMONT)}, + {"knl", M_CPU_TYPE 
(INTEL_KNL)}, + {"knm", M_CPU_TYPE (INTEL_KNM)}, + {"amdfam10h", M_CPU_TYPE (AMDFAM10H)}, + {"barcelona", M_CPU_SUBTYPE (AMDFAM10H_BARCELONA)}, + {"shanghai", M_CPU_SUBTYPE (AMDFAM10H_SHANGHAI)}, + {"istanbul", M_CPU_SUBTYPE (AMDFAM10H_ISTANBUL)}, + {"btver1", M_CPU_TYPE (AMD_BTVER1)}, + {"amdfam15h", M_CPU_TYPE (AMDFAM15H)}, + {"bdver1", M_CPU_SUBTYPE (AMDFAM15H_BDVER1)}, + {"bdver2", M_CPU_SUBTYPE (AMDFAM15H_BDVER2)}, + {"bdver3", M_CPU_SUBTYPE (AMDFAM15H_BDVER3)}, + {"bdver4", M_CPU_SUBTYPE (AMDFAM15H_BDVER4)}, + {"btver2", M_CPU_TYPE (AMD_BTVER2)}, + {"amdfam17h", M_CPU_TYPE (AMDFAM17H)}, + {"znver1", M_CPU_SUBTYPE (AMDFAM17H_ZNVER1)}, + {"znver2", M_CPU_SUBTYPE (AMDFAM17H_ZNVER2)}, }; /* This parses the attribute arguments to target in DECL and determines @@ -2509,16 +2342,29 @@ fold_builtin_cpu (tree fndecl, tree *args) if (isa_names_table[i].feature >= 32) { - tree __cpu_features2_var = make_var_decl (unsigned_type_node, + tree index_type + = build_index_type (size_int (SIZE_OF_CPU_FEATURES)); + tree type = build_array_type (unsigned_type_node, index_type); + tree __cpu_features2_var = make_var_decl (type, "__cpu_features2"); varpool_node::add (__cpu_features2_var); - field_val = (1U << (isa_names_table[i].feature - 32)); - /* Return __cpu_features2 & field_val */ - final = build2 (BIT_AND_EXPR, unsigned_type_node, - __cpu_features2_var, - build_int_cstu (unsigned_type_node, field_val)); - return build1 (CONVERT_EXPR, integer_type_node, final); + for (unsigned int j = 0; j < SIZE_OF_CPU_FEATURES; j++) + if (isa_names_table[i].feature < (32 + 32 + j * 32)) + { + field_val = (1U << (isa_names_table[i].feature + - (32 + j * 32))); + tree index = size_int (j); + array_elt = build4 (ARRAY_REF, unsigned_type_node, + __cpu_features2_var, + index, NULL_TREE, NULL_TREE); + /* Return __cpu_features2[index] & field_val */ + final = build2 (BIT_AND_EXPR, unsigned_type_node, + array_elt, + build_int_cstu (unsigned_type_node, + field_val)); + return build1 (CONVERT_EXPR, 
integer_type_node, final); + } } field = TYPE_FIELDS (__processor_model_type); diff --git a/gcc/testsuite/gcc.target/i386/builtin_target.c b/gcc/testsuite/gcc.target/i386/builtin_target.c index 7a8b6e805ed..368c0a54efe 100644 --- a/gcc/testsuite/gcc.target/i386/builtin_target.c +++ b/gcc/testsuite/gcc.target/i386/builtin_target.c @@ -7,348 +7,52 @@ /* { dg-do run } */ #include <assert.h> +#include <stdlib.h> #include "cpuid.h" - -/* Check if the Intel CPU model and sub-model are identified. */ -static void -check_intel_cpu_model (unsigned int family, unsigned int model, - unsigned int brand_id) -{ - /* Parse family and model only if brand ID is 0. */ - if (brand_id == 0) - { - switch (family) - { - case 0x5: - /* Pentium. */ - break; - case 0x6: - switch (model) - { - case 0x1c: - case 0x26: - /* Atom. */ - assert (__builtin_cpu_is ("atom")); - break; - case 0x37: - case 0x4a: - case 0x4d: - case 0x5a: - case 0x5d: - /* Silvermont. */ - assert (__builtin_cpu_is ("silvermont")); - break; - case 0x5c: - case 0x5f: - /* Goldmont. */ - assert (__builtin_cpu_is ("goldmont")); - break; - case 0x7a: - /* Goldmont Plus. */ - assert (__builtin_cpu_is ("goldmont-plus")); - break; - case 0x57: - /* Knights Landing. */ - assert (__builtin_cpu_is ("knl")); - break; - case 0x85: - /* Knights Mill */ - assert (__builtin_cpu_is ("knm")); - break; - case 0x1a: - case 0x1e: - case 0x1f: - case 0x2e: - /* Nehalem. */ - assert (__builtin_cpu_is ("corei7")); - assert (__builtin_cpu_is ("nehalem")); - break; - case 0x25: - case 0x2c: - case 0x2f: - /* Westmere. */ - assert (__builtin_cpu_is ("corei7")); - assert (__builtin_cpu_is ("westmere")); - break; - case 0x2a: - case 0x2d: - /* Sandy Bridge. */ - assert (__builtin_cpu_is ("corei7")); - assert (__builtin_cpu_is ("sandybridge")); - break; - case 0x3a: - case 0x3e: - /* Ivy Bridge. */ - assert (__builtin_cpu_is ("corei7")); - assert (__builtin_cpu_is ("ivybridge")); - break; - case 0x3c: - case 0x3f: - case 0x45: - case 0x46: - /* Haswell. 
*/ - assert (__builtin_cpu_is ("corei7")); - assert (__builtin_cpu_is ("haswell")); - break; - case 0x3d: - case 0x47: - case 0x4f: - case 0x56: - /* Broadwell. */ - assert (__builtin_cpu_is ("corei7")); - assert (__builtin_cpu_is ("broadwell")); - break; - case 0x4e: - case 0x5e: - /* Skylake. */ - case 0x8e: - case 0x9e: - /* Kaby Lake. */ - assert (__builtin_cpu_is ("corei7")); - assert (__builtin_cpu_is ("skylake")); - break; - case 0x55: - { - unsigned int eax, ebx, ecx, edx; - __cpuid_count (7, 0, eax, ebx, ecx, edx); - assert (__builtin_cpu_is ("corei7")); - if (ecx & bit_AVX512VNNI) - /* Cascade Lake. */ - assert (__builtin_cpu_is ("cascadelake")); - else - /* Skylake with AVX-512 support. */ - assert (__builtin_cpu_is ("skylake-avx512")); - break; - } - case 0x66: - /* Cannon Lake. */ - assert (__builtin_cpu_is ("cannonlake")); - break; - case 0x17: - case 0x1d: - /* Penryn. */ - case 0x0f: - /* Merom. */ - assert (__builtin_cpu_is ("core2")); - break; - default: - break; - } - break; - default: - /* We have no idea. */ - break; - } - } -} - -/* Check if the AMD CPU model and sub-model are identified. */ -static void -check_amd_cpu_model (unsigned int family, unsigned int model) -{ - switch (family) - { - /* AMD Family 10h. */ - case 0x10: - switch (model) - { - case 0x2: - /* Barcelona. */ - assert (__builtin_cpu_is ("amdfam10h")); - assert (__builtin_cpu_is ("barcelona")); - break; - case 0x4: - /* Shanghai. */ - assert (__builtin_cpu_is ("amdfam10h")); - assert (__builtin_cpu_is ("shanghai")); - break; - case 0x8: - /* Istanbul. */ - assert (__builtin_cpu_is ("amdfam10h")); - assert (__builtin_cpu_is ("istanbul")); - break; - default: - break; - } - break; - /* AMD Family 15h. */ - case 0x15: - assert (__builtin_cpu_is ("amdfam15h")); - /* Bulldozer version 1. */ - if ( model <= 0xf) - assert (__builtin_cpu_is ("bdver1")); - /* Bulldozer version 2. 
*/ - if (model >= 0x10 && model <= 0x1f) - assert (__builtin_cpu_is ("bdver2")); - break; - default: - break; - } -} +#define CHECK___builtin_cpu_is(cpu) assert (__builtin_cpu_is (cpu)) +#define gcc_assert(a) assert (a) +#define gcc_unreachable() abort () +#define inline +#include "../../../common/config/i386/cpuinfo.h" /* Check if the ISA features are identified. */ static void -check_features (unsigned int ecx, unsigned int edx, - int max_cpuid_level) +check_features (struct __processor_model *cpu_model, + unsigned int *cpu_features2) { - unsigned int eax, ebx; - unsigned int ext_level; - - if (edx & bit_CMOV) - assert (__builtin_cpu_supports ("cmov")); - if (edx & bit_MMX) - assert (__builtin_cpu_supports ("mmx")); - if (edx & bit_SSE) - assert (__builtin_cpu_supports ("sse")); - if (edx & bit_SSE2) - assert (__builtin_cpu_supports ("sse2")); - if (ecx & bit_POPCNT) - assert (__builtin_cpu_supports ("popcnt")); - if (ecx & bit_AES) - assert (__builtin_cpu_supports ("aes")); - if (ecx & bit_PCLMUL) - assert (__builtin_cpu_supports ("pclmul")); - if (ecx & bit_SSE3) - assert (__builtin_cpu_supports ("sse3")); - if (ecx & bit_SSSE3) - assert (__builtin_cpu_supports ("ssse3")); - if (ecx & bit_SSE4_1) - assert (__builtin_cpu_supports ("sse4.1")); - if (ecx & bit_SSE4_2) - assert (__builtin_cpu_supports ("sse4.2")); - if (ecx & bit_AVX) - assert (__builtin_cpu_supports ("avx")); - if (ecx & bit_FMA) - assert (__builtin_cpu_supports ("fma")); - - /* Get advanced features at level 7 (eax = 7, ecx = 0). 
*/ - if (max_cpuid_level >= 7) - { - __cpuid_count (7, 0, eax, ebx, ecx, edx); - if (ebx & bit_BMI) - assert (__builtin_cpu_supports ("bmi")); - if (ebx & bit_AVX2) - assert (__builtin_cpu_supports ("avx2")); - if (ebx & bit_BMI2) - assert (__builtin_cpu_supports ("bmi2")); - if (ebx & bit_AVX512F) - assert (__builtin_cpu_supports ("avx512f")); - if (ebx & bit_AVX512VL) - assert (__builtin_cpu_supports ("avx512vl")); - if (ebx & bit_AVX512BW) - assert (__builtin_cpu_supports ("avx512bw")); - if (ebx & bit_AVX512DQ) - assert (__builtin_cpu_supports ("avx512dq")); - if (ebx & bit_AVX512CD) - assert (__builtin_cpu_supports ("avx512cd")); - if (ebx & bit_AVX512PF) - assert (__builtin_cpu_supports ("avx512pf")); - if (ebx & bit_AVX512ER) - assert (__builtin_cpu_supports ("avx512er")); - if (ebx & bit_AVX512IFMA) - assert (__builtin_cpu_supports ("avx512ifma")); - if (ecx & bit_AVX512VBMI) - assert (__builtin_cpu_supports ("avx512vbmi")); - if (ecx & bit_AVX512VBMI2) - assert (__builtin_cpu_supports ("avx512vbmi2")); - if (ecx & bit_GFNI) - assert (__builtin_cpu_supports ("gfni")); - if (ecx & bit_VPCLMULQDQ) - assert (__builtin_cpu_supports ("vpclmulqdq")); - if (ecx & bit_AVX512VNNI) - assert (__builtin_cpu_supports ("avx512vnni")); - if (ecx & bit_AVX512BITALG) - assert (__builtin_cpu_supports ("avx512bitalg")); - if (ecx & bit_AVX512VPOPCNTDQ) - assert (__builtin_cpu_supports ("avx512vpopcntdq")); - if (edx & bit_AVX5124VNNIW) - assert (__builtin_cpu_supports ("avx5124vnniw")); - if (edx & bit_AVX5124FMAPS) - assert (__builtin_cpu_supports ("avx5124fmaps")); - - __cpuid_count (7, 1, eax, ebx, ecx, edx); - if (eax & bit_AVX512BF16) - assert (__builtin_cpu_supports ("avx512bf16")); - } - - /* Check cpuid level of extended features. 
*/ - __cpuid (0x80000000, ext_level, ebx, ecx, edx); - - if (ext_level >= 0x80000001) - { - __cpuid (0x80000001, eax, ebx, ecx, edx); - - if (ecx & bit_SSE4a) - assert (__builtin_cpu_supports ("sse4a")); - if (ecx & bit_FMA4) - assert (__builtin_cpu_supports ("fma4")); - if (ecx & bit_XOP) - assert (__builtin_cpu_supports ("xop")); - } -} - -static int __attribute__ ((noinline)) -__get_cpuid_output (unsigned int __level, - unsigned int *__eax, unsigned int *__ebx, - unsigned int *__ecx, unsigned int *__edx) -{ - return __get_cpuid (__level, __eax, __ebx, __ecx, __edx); +#define has_feature(f) \ + has_cpu_feature (cpu_model, cpu_features2, f) +#define ISA_NAMES_TABLE_START +#define ISA_NAMES_TABLE_END +#define ISA_NAMES_TABLE_ENTRY(name, feature, priority, option) \ + assert (!!has_feature (feature) == !!__builtin_cpu_supports (name)); +#include "../../../common/config/i386/i386-isas.h" } static int check_detailed () { - unsigned int eax, ebx, ecx, edx; - - int max_level; - unsigned int vendor; - unsigned int model, family, brand_id; - unsigned int extended_model, extended_family; - - /* Assume cpuid insn present. Run in level 0 to get vendor id. */ - if (!__get_cpuid_output (0, &eax, &ebx, &ecx, &edx)) - return 0; + struct __processor_model cpu_model = { 0 }; + struct __processor_model2 cpu_model2 = { 0 }; + unsigned int cpu_features2[SIZE_OF_CPU_FEATURES] = { 0 }; - vendor = ebx; - max_level = eax; - - if (max_level < 1) - return 0; - - if (!__get_cpuid_output (1, &eax, &ebx, &ecx, &edx)) + if (cpu_indicator_init (&cpu_model, &cpu_model2, cpu_features2) != 0) return 0; - model = (eax >> 4) & 0x0f; - family = (eax >> 8) & 0x0f; - brand_id = ebx & 0xff; - extended_model = (eax >> 12) & 0xf0; - extended_family = (eax >> 20) & 0xff; + check_features (&cpu_model, cpu_features2); - if (vendor == signature_INTEL_ebx) + switch (cpu_model.__cpu_vendor) { + case VENDOR_INTEL: assert (__builtin_cpu_is ("intel")); - /* Adjust family and model for Intel CPUs. 
*/ - if (family == 0x0f) - { - family += extended_family; - model += extended_model; - } - else if (family == 0x06) - model += extended_model; - check_intel_cpu_model (family, model, brand_id); - check_features (ecx, edx, max_level); - } - else if (vendor == signature_AMD_ebx) - { + get_intel_cpu (&cpu_model, &cpu_model2, cpu_features2, 0); + break; + case VENDOR_AMD: assert (__builtin_cpu_is ("amd")); - /* Adjust model and family for AMD CPUS. */ - if (family == 0x0f) - { - family += extended_family; - model += (extended_model << 4); - } - check_amd_cpu_model (family, model); - check_features (ecx, edx, max_level); + get_amd_cpu (&cpu_model, &cpu_model2, cpu_features2); + break; + default: + break; } return 0; diff --git a/libgcc/config/i386/cpuinfo.c b/libgcc/config/i386/cpuinfo.c index cf5f0884bb4..49c5107546f 100644 --- a/libgcc/config/i386/cpuinfo.c +++ b/libgcc/config/i386/cpuinfo.c @@ -26,7 +26,7 @@ see the files COPYING3 and COPYING.RUNTIME respectively. If not, see #include "cpuid.h" #include "tsystem.h" #include "auto-target.h" -#include "cpuinfo.h" +#include "common/config/i386/cpuinfo.h" #ifdef HAVE_INIT_PRIORITY #define CONSTRUCTOR_PRIORITY (101) @@ -39,386 +39,14 @@ int __cpu_indicator_init (void) struct __processor_model __cpu_model = { }; -#ifndef SHARED /* We want to move away from __cpu_model in libgcc_s.so.1 and the size of __cpu_model is part of ABI. So, new features that don't fit into __cpu_model.__cpu_features[0] go into extra variables - in libgcc.a only, preferrably hidden. */ -unsigned int __cpu_features2; -#endif - - -/* Get the specific type of AMD CPU. */ - -static void -get_amd_cpu (unsigned int family, unsigned int model) -{ - switch (family) - { - /* AMD Family 10h. */ - case 0x10: - __cpu_model.__cpu_type = AMDFAM10H; - switch (model) - { - case 0x2: - /* Barcelona. */ - __cpu_model.__cpu_subtype = AMDFAM10H_BARCELONA; - break; - case 0x4: - /* Shanghai. 
*/ - __cpu_model.__cpu_subtype = AMDFAM10H_SHANGHAI; - break; - case 0x8: - /* Istanbul. */ - __cpu_model.__cpu_subtype = AMDFAM10H_ISTANBUL; - break; - default: - break; - } - break; - /* AMD Family 14h "btver1". */ - case 0x14: - __cpu_model.__cpu_type = AMD_BTVER1; - break; - /* AMD Family 15h "Bulldozer". */ - case 0x15: - __cpu_model.__cpu_type = AMDFAM15H; - - if (model == 0x2) - __cpu_model.__cpu_subtype = AMDFAM15H_BDVER2; - /* Bulldozer version 1. */ - else if (model <= 0xf) - __cpu_model.__cpu_subtype = AMDFAM15H_BDVER1; - /* Bulldozer version 2 "Piledriver" */ - else if (model <= 0x2f) - __cpu_model.__cpu_subtype = AMDFAM15H_BDVER2; - /* Bulldozer version 3 "Steamroller" */ - else if (model <= 0x4f) - __cpu_model.__cpu_subtype = AMDFAM15H_BDVER3; - /* Bulldozer version 4 "Excavator" */ - else if (model <= 0x7f) - __cpu_model.__cpu_subtype = AMDFAM15H_BDVER4; - break; - /* AMD Family 16h "btver2" */ - case 0x16: - __cpu_model.__cpu_type = AMD_BTVER2; - break; - case 0x17: - __cpu_model.__cpu_type = AMDFAM17H; - /* AMD family 17h version 1. */ - if (model <= 0x1f) - __cpu_model.__cpu_subtype = AMDFAM17H_ZNVER1; - if (model >= 0x30) - __cpu_model.__cpu_subtype = AMDFAM17H_ZNVER2; - break; - default: - break; - } -} - -/* Get the specific type of Intel CPU. */ - -static void -get_intel_cpu (unsigned int family, unsigned int model, unsigned int brand_id) -{ - /* Parse family and model only if brand ID is 0. */ - if (brand_id == 0) - { - switch (family) - { - case 0x5: - /* Pentium. */ - break; - case 0x6: - switch (model) - { - case 0x1c: - case 0x26: - /* Bonnell. */ - __cpu_model.__cpu_type = INTEL_BONNELL; - break; - case 0x37: - case 0x4a: - case 0x4d: - case 0x5a: - case 0x5d: - /* Silvermont. */ - __cpu_model.__cpu_type = INTEL_SILVERMONT; - break; - case 0x5c: - case 0x5f: - /* Goldmont. */ - __cpu_model.__cpu_type = INTEL_GOLDMONT; - break; - case 0x7a: - /* Goldmont Plus. 
*/ - __cpu_model.__cpu_type = INTEL_GOLDMONT_PLUS; - break; - case 0x57: - /* Knights Landing. */ - __cpu_model.__cpu_type = INTEL_KNL; - break; - case 0x85: - /* Knights Mill. */ - __cpu_model.__cpu_type = INTEL_KNM; - break; - case 0x1a: - case 0x1e: - case 0x1f: - case 0x2e: - /* Nehalem. */ - __cpu_model.__cpu_type = INTEL_COREI7; - __cpu_model.__cpu_subtype = INTEL_COREI7_NEHALEM; - break; - case 0x25: - case 0x2c: - case 0x2f: - /* Westmere. */ - __cpu_model.__cpu_type = INTEL_COREI7; - __cpu_model.__cpu_subtype = INTEL_COREI7_WESTMERE; - break; - case 0x2a: - case 0x2d: - /* Sandy Bridge. */ - __cpu_model.__cpu_type = INTEL_COREI7; - __cpu_model.__cpu_subtype = INTEL_COREI7_SANDYBRIDGE; - break; - case 0x3a: - case 0x3e: - /* Ivy Bridge. */ - __cpu_model.__cpu_type = INTEL_COREI7; - __cpu_model.__cpu_subtype = INTEL_COREI7_IVYBRIDGE; - break; - case 0x3c: - case 0x3f: - case 0x45: - case 0x46: - /* Haswell. */ - __cpu_model.__cpu_type = INTEL_COREI7; - __cpu_model.__cpu_subtype = INTEL_COREI7_HASWELL; - break; - case 0x3d: - case 0x47: - case 0x4f: - case 0x56: - /* Broadwell. */ - __cpu_model.__cpu_type = INTEL_COREI7; - __cpu_model.__cpu_subtype = INTEL_COREI7_BROADWELL; - break; - case 0x4e: - case 0x5e: - /* Skylake. */ - case 0x8e: - case 0x9e: - /* Kaby Lake. */ - __cpu_model.__cpu_type = INTEL_COREI7; - __cpu_model.__cpu_subtype = INTEL_COREI7_SKYLAKE; - break; - case 0x55: - { - unsigned int eax, ebx, ecx, edx; - __cpu_model.__cpu_type = INTEL_COREI7; - __cpuid_count (7, 0, eax, ebx, ecx, edx); - if (ecx & bit_AVX512VNNI) - /* Cascade Lake. */ - __cpu_model.__cpu_subtype = INTEL_COREI7_CASCADELAKE; - else - /* Skylake with AVX-512 support. */ - __cpu_model.__cpu_subtype = INTEL_COREI7_SKYLAKE_AVX512; - } - break; - case 0x66: - /* Cannon Lake. */ - __cpu_model.__cpu_type = INTEL_COREI7; - __cpu_model.__cpu_subtype = INTEL_COREI7_CANNONLAKE; - break; - case 0x17: - case 0x1d: - /* Penryn. */ - case 0x0f: - /* Merom. 
*/ - __cpu_model.__cpu_type = INTEL_CORE2; - break; - default: - break; - } - break; - default: - /* We have no idea. */ - break; - } - } -} - -/* ECX and EDX are output of CPUID at level one. MAX_CPUID_LEVEL is - the max possible level of CPUID insn. */ -static void -get_available_features (unsigned int ecx, unsigned int edx, - int max_cpuid_level) -{ - unsigned int eax, ebx; - unsigned int ext_level; - - unsigned int features = 0; - unsigned int features2 = 0; - - /* Get XCR_XFEATURE_ENABLED_MASK register with xgetbv. */ -#define XCR_XFEATURE_ENABLED_MASK 0x0 -#define XSTATE_FP 0x1 -#define XSTATE_SSE 0x2 -#define XSTATE_YMM 0x4 -#define XSTATE_OPMASK 0x20 -#define XSTATE_ZMM 0x40 -#define XSTATE_HI_ZMM 0x80 - -#define XCR_AVX_ENABLED_MASK \ - (XSTATE_SSE | XSTATE_YMM) -#define XCR_AVX512F_ENABLED_MASK \ - (XSTATE_SSE | XSTATE_YMM | XSTATE_OPMASK | XSTATE_ZMM | XSTATE_HI_ZMM) - - /* Check if AVX and AVX512 are usable. */ - int avx_usable = 0; - int avx512_usable = 0; - if ((ecx & bit_OSXSAVE)) - { - /* Check if XMM, YMM, OPMASK, upper 256 bits of ZMM0-ZMM15 and - ZMM16-ZMM31 states are supported by OSXSAVE. 
*/ - unsigned int xcrlow; - unsigned int xcrhigh; - asm (".byte 0x0f, 0x01, 0xd0" - : "=a" (xcrlow), "=d" (xcrhigh) - : "c" (XCR_XFEATURE_ENABLED_MASK)); - if ((xcrlow & XCR_AVX_ENABLED_MASK) == XCR_AVX_ENABLED_MASK) - { - avx_usable = 1; - avx512_usable = ((xcrlow & XCR_AVX512F_ENABLED_MASK) - == XCR_AVX512F_ENABLED_MASK); - } - } - -#define set_feature(f) \ - do \ - { \ - if (f < 32) \ - features |= (1U << (f & 31)); \ - else \ - features2 |= (1U << ((f - 32) & 31)); \ - } \ - while (0) - - if (edx & bit_CMOV) - set_feature (FEATURE_CMOV); - if (edx & bit_MMX) - set_feature (FEATURE_MMX); - if (edx & bit_SSE) - set_feature (FEATURE_SSE); - if (edx & bit_SSE2) - set_feature (FEATURE_SSE2); - if (ecx & bit_POPCNT) - set_feature (FEATURE_POPCNT); - if (ecx & bit_AES) - set_feature (FEATURE_AES); - if (ecx & bit_PCLMUL) - set_feature (FEATURE_PCLMUL); - if (ecx & bit_SSE3) - set_feature (FEATURE_SSE3); - if (ecx & bit_SSSE3) - set_feature (FEATURE_SSSE3); - if (ecx & bit_SSE4_1) - set_feature (FEATURE_SSE4_1); - if (ecx & bit_SSE4_2) - set_feature (FEATURE_SSE4_2); - if (avx_usable) - { - if (ecx & bit_AVX) - set_feature (FEATURE_AVX); - if (ecx & bit_FMA) - set_feature (FEATURE_FMA); - } - - /* Get Advanced Features at level 7 (eax = 7, ecx = 0/1). 
*/
-  if (max_cpuid_level >= 7)
-    {
-      __cpuid_count (7, 0, eax, ebx, ecx, edx);
-      if (ebx & bit_BMI)
-        set_feature (FEATURE_BMI);
-      if (avx_usable)
-        {
-          if (ebx & bit_AVX2)
-            set_feature (FEATURE_AVX2);
-          if (ecx & bit_VPCLMULQDQ)
-            set_feature (FEATURE_VPCLMULQDQ);
-        }
-      if (ebx & bit_BMI2)
-        set_feature (FEATURE_BMI2);
-      if (ecx & bit_GFNI)
-        set_feature (FEATURE_GFNI);
-      if (avx512_usable)
-        {
-          if (ebx & bit_AVX512F)
-            set_feature (FEATURE_AVX512F);
-          if (ebx & bit_AVX512VL)
-            set_feature (FEATURE_AVX512VL);
-          if (ebx & bit_AVX512BW)
-            set_feature (FEATURE_AVX512BW);
-          if (ebx & bit_AVX512DQ)
-            set_feature (FEATURE_AVX512DQ);
-          if (ebx & bit_AVX512CD)
-            set_feature (FEATURE_AVX512CD);
-          if (ebx & bit_AVX512PF)
-            set_feature (FEATURE_AVX512PF);
-          if (ebx & bit_AVX512ER)
-            set_feature (FEATURE_AVX512ER);
-          if (ebx & bit_AVX512IFMA)
-            set_feature (FEATURE_AVX512IFMA);
-          if (ecx & bit_AVX512VBMI)
-            set_feature (FEATURE_AVX512VBMI);
-          if (ecx & bit_AVX512VBMI2)
-            set_feature (FEATURE_AVX512VBMI2);
-          if (ecx & bit_AVX512VNNI)
-            set_feature (FEATURE_AVX512VNNI);
-          if (ecx & bit_AVX512BITALG)
-            set_feature (FEATURE_AVX512BITALG);
-          if (ecx & bit_AVX512VPOPCNTDQ)
-            set_feature (FEATURE_AVX512VPOPCNTDQ);
-          if (edx & bit_AVX5124VNNIW)
-            set_feature (FEATURE_AVX5124VNNIW);
-          if (edx & bit_AVX5124FMAPS)
-            set_feature (FEATURE_AVX5124FMAPS);
-          if (edx & bit_AVX512VP2INTERSECT)
-            set_feature (FEATURE_AVX512VP2INTERSECT);
+   in libgcc.a only, preferably hidden.
-          __cpuid_count (7, 1, eax, ebx, ecx, edx);
-          if (eax & bit_AVX512BF16)
-            set_feature (FEATURE_AVX512BF16);
-        }
-    }
-
-  /* Check cpuid level of extended features.
*/
-  __cpuid (0x80000000, ext_level, ebx, ecx, edx);
-
-  if (ext_level >= 0x80000001)
-    {
-      __cpuid (0x80000001, eax, ebx, ecx, edx);
-
-      if (ecx & bit_SSE4a)
-        set_feature (FEATURE_SSE4_A);
-      if (avx_usable)
-        {
-          if (ecx & bit_FMA4)
-            set_feature (FEATURE_FMA4);
-          if (ecx & bit_XOP)
-            set_feature (FEATURE_XOP);
-        }
-    }
-
-  __cpu_model.__cpu_features[0] = features;
-#ifndef SHARED
-  __cpu_features2 = features2;
-#else
-  (void) features2;
-#endif
-}
+   NB: Since older 386-builtins.c accesses __cpu_features2 as scalar or
+   smaller array, it can only access the first few elements. */
+unsigned int __cpu_features2[SIZE_OF_CPU_FEATURES];
 /* A constructor function that is sets __cpu_model and __cpu_features with
    the right values. This needs to run only once. This constructor is
@@ -429,85 +57,9 @@ get_available_features (unsigned int ecx, unsigned int edx,
 int __attribute__ ((constructor CONSTRUCTOR_PRIORITY))
 __cpu_indicator_init (void)
 {
-  unsigned int eax, ebx, ecx, edx;
-
-  int max_level;
-  unsigned int vendor;
-  unsigned int model, family, brand_id;
-  unsigned int extended_model, extended_family;
-
-  /* This function needs to run just once. */
-  if (__cpu_model.__cpu_vendor)
-    return 0;
-
-  /* Assume cpuid insn present. Run in level 0 to get vendor id. */
-  if (!__get_cpuid (0, &eax, &ebx, &ecx, &edx))
-    {
-      __cpu_model.__cpu_vendor = VENDOR_OTHER;
-      return -1;
-    }
-
-  vendor = ebx;
-  max_level = eax;
-
-  if (max_level < 1)
-    {
-      __cpu_model.__cpu_vendor = VENDOR_OTHER;
-      return -1;
-    }
-
-  if (!__get_cpuid (1, &eax, &ebx, &ecx, &edx))
-    {
-      __cpu_model.__cpu_vendor = VENDOR_OTHER;
-      return -1;
-    }
-
-  model = (eax >> 4) & 0x0f;
-  family = (eax >> 8) & 0x0f;
-  brand_id = ebx & 0xff;
-  extended_model = (eax >> 12) & 0xf0;
-  extended_family = (eax >> 20) & 0xff;
-
-  if (vendor == signature_INTEL_ebx)
-    {
-      /* Adjust model and family for Intel CPUS.
*/
-      if (family == 0x0f)
-        {
-          family += extended_family;
-          model += extended_model;
-        }
-      else if (family == 0x06)
-        model += extended_model;
-
-      /* Get CPU type. */
-      get_intel_cpu (family, model, brand_id);
-      /* Find available features. */
-      get_available_features (ecx, edx, max_level);
-      __cpu_model.__cpu_vendor = VENDOR_INTEL;
-    }
-  else if (vendor == signature_AMD_ebx)
-    {
-      /* Adjust model and family for AMD CPUS. */
-      if (family == 0x0f)
-        {
-          family += extended_family;
-          model += extended_model;
-        }
-
-      /* Get CPU type. */
-      get_amd_cpu (family, model);
-      /* Find available features. */
-      get_available_features (ecx, edx, max_level);
-      __cpu_model.__cpu_vendor = VENDOR_AMD;
-    }
-  else
-    __cpu_model.__cpu_vendor = VENDOR_OTHER;
-
-  gcc_assert (__cpu_model.__cpu_vendor < VENDOR_MAX);
-  gcc_assert (__cpu_model.__cpu_type < CPU_TYPE_MAX);
-  gcc_assert (__cpu_model.__cpu_subtype < CPU_SUBTYPE_MAX);
-
-  return 0;
+  struct __processor_model2 cpu_model2;
+  return cpu_indicator_init (&__cpu_model, &cpu_model2,
+                             __cpu_features2);
 }
 #if defined SHARED && defined USE_ELF_SYMVER
--
2.26.2