From patchwork Thu Apr 11 19:05:41 2013
X-Patchwork-Submitter: Sriraman Tallam
X-Patchwork-Id: 235878
Delivered-To: mailing list gcc-patches@gcc.gnu.org
Date: Thu, 11 Apr 2013 12:05:41 -0700
Subject: GCC does not support *mmintrin.h with function specific opts
From: Sriraman Tallam
To: GCC Patches, David Li

Hi,

The *mmintrin.h headers do not work with function specific opts.

Example 1:

#include <smmintrin.h>

__attribute__((target("sse4.1")))
__m128i foo(__m128i *V)
{
  return _mm_stream_load_si128(V);
}

$ g++ test.cc
smmintrin.h:31:3: error: #error "SSE4.1 instruction set not enabled"
 # error "SSE4.1 instruction set not enabled"

This error happens even though foo is marked "sse4.1".

There are multiple issues at play here.
One, the headers are guarded by macros that are switched on only when
the target specific options, like -msse4.1 in this case, are present on
the command line. Two, the target specific builtins, like
__builtin_ia32_movntdqa called by _mm_stream_load_si128, are exposed
only in the presence of the appropriate target ISA option.

I have attached a patch that fixes this. It adds an option
"-mgenerate-builtins" that does two things: it defines a macro
"__ALL_ISA__", which exposes the *intrin.h functions, and it exposes
all the target specific builtins. -mgenerate-builtins does not affect
code generation.

This feature will also greatly improve the usability of function
multiversioning.

Comments?

Thanks
Sri

Index: emmintrin.h
===================================================================
--- emmintrin.h	(revision 197691)
+++ emmintrin.h	(working copy)
@@ -27,7 +27,7 @@
 #ifndef _EMMINTRIN_H_INCLUDED
 #define _EMMINTRIN_H_INCLUDED
 
-#ifndef __SSE2__
+#if !defined (__SSE2__) && !defined (__ALL_ISA__)
 # error "SSE2 instruction set not enabled"
 #else
 
Index: fma4intrin.h
===================================================================
--- fma4intrin.h	(revision 197691)
+++ fma4intrin.h	(working copy)
@@ -28,7 +28,7 @@
 #ifndef _FMA4INTRIN_H_INCLUDED
 #define _FMA4INTRIN_H_INCLUDED
 
-#ifndef __FMA4__
+#if !defined (__FMA4__) && !defined (__ALL_ISA__)
 # error "FMA4 instruction set not enabled"
 #else
 
Index: lwpintrin.h
===================================================================
--- lwpintrin.h	(revision 197691)
+++ lwpintrin.h	(working copy)
@@ -28,7 +28,7 @@
 #ifndef _LWPINTRIN_H_INCLUDED
 #define _LWPINTRIN_H_INCLUDED
 
-#ifndef __LWP__
+#if !defined (__LWP__) && !defined (__ALL_ISA__)
 # error "LWP instruction set not enabled"
 #else
 
Index: xopintrin.h
===================================================================
--- xopintrin.h	(revision 197691)
+++ xopintrin.h	(working copy)
@@ -28,7 +28,7 @@
 #ifndef _XOPMMINTRIN_H_INCLUDED
 #define _XOPMMINTRIN_H_INCLUDED
 
-#ifndef __XOP__
+#if !defined (__XOP__) && !defined (__ALL_ISA__)
 # error "XOP instruction set not enabled"
 #else
 
Index: fmaintrin.h
===================================================================
--- fmaintrin.h	(revision 197691)
+++ fmaintrin.h	(working copy)
@@ -28,7 +28,7 @@
 #ifndef _FMAINTRIN_H_INCLUDED
 #define _FMAINTRIN_H_INCLUDED
 
-#ifndef __FMA__
+#if !defined (__FMA__) && !defined (__ALL_ISA__)
 # error "FMA instruction set not enabled"
 #else
 
Index: bmiintrin.h
===================================================================
--- bmiintrin.h	(revision 197691)
+++ bmiintrin.h	(working copy)
@@ -25,7 +25,7 @@
 # error "Never use directly; include instead."
 #endif
 
-#ifndef __BMI__
+#if !defined (__BMI__) && !defined (__ALL_ISA__)
 # error "BMI instruction set not enabled"
 #endif /* __BMI__ */
 
Index: mmintrin.h
===================================================================
--- mmintrin.h	(revision 197691)
+++ mmintrin.h	(working copy)
@@ -27,7 +27,7 @@
 #ifndef _MMINTRIN_H_INCLUDED
 #define _MMINTRIN_H_INCLUDED
 
-#ifndef __MMX__
+#if !defined (__MMX__) && !defined (__ALL_ISA__)
 # error "MMX instruction set not enabled"
 #else
 /* The Intel API is flexible enough that we must allow aliasing with other
Index: nmmintrin.h
===================================================================
--- nmmintrin.h	(revision 197691)
+++ nmmintrin.h	(working copy)
@@ -27,7 +27,7 @@
 #ifndef _NMMINTRIN_H_INCLUDED
 #define _NMMINTRIN_H_INCLUDED
 
-#ifndef __SSE4_2__
+#if !defined (__SSE4_2__) && !defined (__ALL_ISA__)
 # error "SSE4.2 instruction set not enabled"
 #else
 /* We just include SSE4.1 header file.  */
Index: tbmintrin.h
===================================================================
--- tbmintrin.h	(revision 197691)
+++ tbmintrin.h	(working copy)
@@ -25,7 +25,7 @@
 # error "Never use directly; include instead."
 #endif
 
-#ifndef __TBM__
+#if !defined (__TBM__) && !defined (__ALL_ISA__)
 # error "TBM instruction set not enabled"
 #endif /* __TBM__ */
 
Index: f16cintrin.h
===================================================================
--- f16cintrin.h	(revision 197691)
+++ f16cintrin.h	(working copy)
@@ -25,7 +25,7 @@
 # error "Never use directly; include or instead."
 #endif
 
-#ifndef __F16C__
+#if !defined (__F16C__) && !defined (__ALL_ISA__)
 # error "F16C instruction set not enabled"
 #else
 
Index: i386.opt
===================================================================
--- i386.opt	(revision 197691)
+++ i386.opt	(working copy)
@@ -626,3 +626,7 @@ Split 32-byte AVX unaligned store
 mrtm
 Target Report Mask(ISA_RTM) Var(ix86_isa_flags) Save
 Support RTM built-in functions and code generation
+
+mgenerate-builtins
+Target Report Var(generate_target_builtins) Save
+Generate all target builtins that are otherwise only generated when the appropriate ISA is turned on.
Index: i386-c.c
===================================================================
--- i386-c.c	(revision 197691)
+++ i386-c.c	(working copy)
@@ -54,6 +54,9 @@ ix86_target_macros_internal (HOST_WIDE_INT isa_fla
   int last_arch_char = ix86_arch_string[arch_len - 1];
   int last_tune_char = ix86_tune_string[tune_len - 1];
 
+  if (generate_target_builtins)
+    def_or_undef (parse_in, "__ALL_ISA__");
+
   /* Built-ins based on -march=.  */
   switch (arch)
     {
Index: bmi2intrin.h
===================================================================
--- bmi2intrin.h	(revision 197691)
+++ bmi2intrin.h	(working copy)
@@ -25,7 +25,7 @@
 # error "Never use directly; include instead."
 #endif
 
-#ifndef __BMI2__
+#if !defined (__BMI2__) && !defined (__ALL_ISA__)
 # error "BMI2 instruction set not enabled"
 #endif /* __BMI2__ */
 
Index: lzcntintrin.h
===================================================================
--- lzcntintrin.h	(revision 197691)
+++ lzcntintrin.h	(working copy)
@@ -25,7 +25,7 @@
 # error "Never use directly; include instead."
 #endif
 
-#ifndef __LZCNT__
+#if !defined (__LZCNT__) && !defined (__ALL_ISA__)
 # error "LZCNT instruction is not enabled"
 #endif /* __LZCNT__ */
 
Index: smmintrin.h
===================================================================
--- smmintrin.h	(revision 197691)
+++ smmintrin.h	(working copy)
@@ -27,7 +27,7 @@
 #ifndef _SMMINTRIN_H_INCLUDED
 #define _SMMINTRIN_H_INCLUDED
 
-#ifndef __SSE4_1__
+#if !defined (__SSE4_1__) && !defined (__ALL_ISA__)
 # error "SSE4.1 instruction set not enabled"
 #else
 
Index: i386.c
===================================================================
--- i386.c	(revision 197691)
+++ i386.c	(working copy)
@@ -26813,7 +26813,8 @@ def_builtin (HOST_WIDE_INT mask, const char *name,
       ix86_builtins_isa[(int) code].isa = mask;
 
       mask &= ~OPTION_MASK_ISA_64BIT;
-      if (mask == 0
+      if (generate_target_builtins
+          || mask == 0
           || (mask & ix86_isa_flags) != 0
           || (lang_hooks.builtin_function
               == lang_hooks.builtin_function_ext_scope))
Index: wmmintrin.h
===================================================================
--- wmmintrin.h	(revision 197691)
+++ wmmintrin.h	(working copy)
@@ -30,7 +30,7 @@
 /* We need definitions from the SSE2 header file.  */
 #include
 
-#if !defined (__AES__) && !defined (__PCLMUL__)
+#if !defined (__AES__) && !defined (__PCLMUL__) && !defined (__ALL_ISA__)
 # error "AES/PCLMUL instructions not enabled"
 #else
 
Index: pmmintrin.h
===================================================================
--- pmmintrin.h	(revision 197691)
+++ pmmintrin.h	(working copy)
@@ -27,7 +27,7 @@
 #ifndef _PMMINTRIN_H_INCLUDED
 #define _PMMINTRIN_H_INCLUDED
 
-#ifndef __SSE3__
+#if !defined (__SSE3__) && !defined (__ALL_ISA__)
 # error "SSE3 instruction set not enabled"
 #else
 
Index: tmmintrin.h
===================================================================
--- tmmintrin.h	(revision 197691)
+++ tmmintrin.h	(working copy)
@@ -27,7 +27,7 @@
 #ifndef _TMMINTRIN_H_INCLUDED
 #define _TMMINTRIN_H_INCLUDED
 
-#ifndef __SSSE3__
+#if !defined (__SSSE3__) && !defined (__ALL_ISA__)
 # error "SSSE3 instruction set not enabled"
 #else
 
Index: xmmintrin.h
===================================================================
--- xmmintrin.h	(revision 197691)
+++ xmmintrin.h	(working copy)
@@ -27,7 +27,7 @@
 #ifndef _XMMINTRIN_H_INCLUDED
 #define _XMMINTRIN_H_INCLUDED
 
-#ifndef __SSE__
+#if !defined (__SSE__) && !defined (__ALL_ISA__)
 # error "SSE instruction set not enabled"
 #else
 
Index: popcntintrin.h
===================================================================
--- popcntintrin.h	(revision 197691)
+++ popcntintrin.h	(working copy)
@@ -21,7 +21,7 @@
    see the files COPYING3 and COPYING.RUNTIME respectively.  If not, see .  */
 
-#ifndef __POPCNT__
+#if !defined (__POPCNT__) && !defined (__ALL_ISA__)
 # error "POPCNT instruction set not enabled"
 #endif /* __POPCNT__ */
 
Index: ammintrin.h
===================================================================
--- ammintrin.h	(revision 197691)
+++ ammintrin.h	(working copy)
@@ -27,7 +27,7 @@
 #ifndef _AMMINTRIN_H_INCLUDED
 #define _AMMINTRIN_H_INCLUDED
 
-#ifndef __SSE4A__
+#if !defined (__SSE4A__) && !defined (__ALL_ISA__)
 # error "SSE4A instruction set not enabled"
 #else
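
For illustration only, the effect of the header guard change can be sketched in isolation. This is a minimal sketch, not part of the patch: the DEMO_* macros and demo_header_usable () are hypothetical names. DEMO_SSE4_1 stands in for __SSE4_1__ (normally defined by -msse4.1) and DEMO_ALL_ISA stands in for __ALL_ISA__ (which the proposed -mgenerate-builtins would define), so the snippet compiles on any target.

```c
/* Sketch only: mimics the patched header guard using hypothetical
   DEMO_* macros instead of the real compiler-defined ones.  */

/* Stand-in for __ALL_ISA__, as if -mgenerate-builtins were given.
   DEMO_SSE4_1 (stand-in for __SSE4_1__) is deliberately left
   undefined, as if -msse4.1 were absent from the command line.  */
#define DEMO_ALL_ISA 1

/* Same shape as the patched guard in smmintrin.h.  */
#if !defined (DEMO_SSE4_1) && !defined (DEMO_ALL_ISA)
# define DEMO_HEADER_USABLE 0  /* the real header would hit #error here */
#else
# define DEMO_HEADER_USABLE 1  /* the intrinsics would be declared */
#endif

/* Returns 1 when the guarded section is compiled in, 0 otherwise.  */
int
demo_header_usable (void)
{
  return DEMO_HEADER_USABLE;
}
```

With DEMO_ALL_ISA defined, demo_header_usable () returns 1 even though DEMO_SSE4_1 is not defined. Removing the #define makes it return 0, which corresponds to the #error path taken by the unpatched header in Example 1.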