From patchwork Sat May 18 04:00:21 2013
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Sriraman Tallam
X-Patchwork-Id: 244730
In-Reply-To: <20130517054931.GM1377@tucnak.redhat.com>
References: <20130514083913.GJ1377@tucnak.redhat.com>
 <20130514100419.GM1377@tucnak.redhat.com>
 <20130517054931.GM1377@tucnak.redhat.com>
Date: Fri, 17 May 2013 21:00:21 -0700
Subject: Re: GCC does not support *mmintrin.h with function specific opts
From: Sriraman Tallam
To: Jakub Jelinek
Cc: GCC Patches, Uros Bizjak, "H.J. Lu", "Joseph S. Myers", Diego Novillo, David Li

On Thu, May 16, 2013 at 10:49 PM, Jakub Jelinek wrote:
> On Thu, May 16, 2013 at 04:00:53PM -0700, Sriraman Tallam wrote:
>> On Thu, May 16, 2013 at 3:55 PM, Marc Glisse wrote:
>> > I don't really understand why you made the change to x86intrin.h instead of
>> > making it inside each *mmintrin.h header. The code would be the same size,
>> > it would let us include smmintrin.h directly if we wanted to, and
>> > x86intrin.h would also automatically work.
>>
>> Right, I should have done that instead!
>
> Yeah, definitely. For the standalone headers, which have currently
> ____ guards inside of it, please replace it by the larger snippets
> involving #pragma, and in the x86intrin.h/immintrin.h headers include those
> unconditionally, instead of just if ____ is defined.
> For the non-standalone headers (newer ones like avxintrin.h), replace
> the #ifdef ____ in immintrin.h/x86intrin.h with larger snippets.

* I did mostly as suggested, except that even for the AVX and AVX2
  headers I did not see the harm in doing it in the header itself. The
  AVX header did not have the "#ifndef _AVXINTRIN_H_INCLUDED" guard,
  which I added before doing this. I have added test cases to show that
  it does the right thing for AVX.
* I also found that when the caller of these intrinsics does not have
  the right target attribute, an error is raised at -O2 but not at -O0.
  I have fixed this with a patch to ipa-inline.c; please see if this is
  all right. Test case intrinsics_5.c checks that an error is raised.
* LZCNT needed to be handled, which is done now.

This patch is messy to review because it touches so many headers, so it
needs a careful review.

Thanks
Sri

>
> Jakub

* testsuite/gcc.target/i386/intrinsics_1.c: New test.
* testsuite/gcc.target/i386/intrinsics_2.c: Ditto.
* testsuite/gcc.target/i386/intrinsics_3.c: Ditto.
* testsuite/gcc.target/i386/intrinsics_4.c: Ditto.
* testsuite/gcc.target/i386/intrinsics_5.c: Ditto.
* config/i386/i386-c.c (ix86_pragma_target_parse): Restore target
when current target options do not apply.
* config/i386/i386-protos.h (ix86_reset_previous_fndecl): New function.
* config/i386/i386.c (ix86_reset_previous_fndecl): Ditto.
* common/config/i386/i386-common.c: Handle LZCNT.
* ipa-inline.c (can_early_inline_edge_p): Generate an error when
extern inline "gnu_inline,always_inline" functions cannot be inlined
because of target mismatch.
* config/i386/bmiintrin.h: Pass appropriate target attributes to header.
* config/i386/mmintrin.h: Ditto.
* config/i386/nmmintrin.h: Ditto.
* config/i386/avx2intrin.h: Ditto.
* config/i386/fxsrintrin.h: Ditto.
* config/i386/tbmintrin.h: Ditto.
* config/i386/xsaveintrin.h: Ditto.
* config/i386/f16cintrin.h: Ditto.
* config/i386/xtestintrin.h: Ditto.
* config/i386/xsaveoptintrin.h: Ditto.
* config/i386/bmi2intrin.h: Ditto.
* config/i386/lzcntintrin.h: Ditto.
* config/i386/smmintrin.h: Ditto.
* config/i386/wmmintrin.h: Ditto.
* config/i386/mm3dnow.h: Ditto.
* config/i386/x86intrin.h: Remove all header include guards.
* config/i386/prfchwintrin.h: Ditto.
* config/i386/pmmintrin.h: Ditto.
* config/i386/tmmintrin.h: Ditto.
* config/i386/xmmintrin.h: Ditto.
* config/i386/popcntintrin.h: Ditto.
* config/i386/adxintrin.h: Ditto.
* config/i386/rdseedintrin.h: Ditto.
* config/i386/ammintrin.h: Ditto.
* config/i386/emmintrin.h: Ditto.
* config/i386/immintrin.h: Remove all header include guards.
* config/i386/fma4intrin.h: Ditto.
* config/i386/lwpintrin.h: Ditto.
* config/i386/xopintrin.h: Ditto.
* config/i386/ia32intrin.h: Ditto.
* config/i386/avxintrin.h: Ditto.
* config/i386/rtmintrin.h: Ditto.
* config/i386/fmaintrin.h: Ditto.
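
For reviewers, here is a rough sketch of the kind of user code these header changes are meant to allow. It is not one of the test cases in the patch. The function names (crc32c_byte_hw, crc32c_byte_sw, crc32c_byte) are made up for illustration, and it assumes a compiler whose headers carry this change, so that nmmintrin.h can be included without -msse4.2 on the command line. Only the function marked with the target attribute uses the intrinsic; the dispatcher falls back to a plain bitwise CRC32-C step on CPUs (or non-x86 targets) without SSE4.2.

```c
#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
/* Assumption: with the patched headers this include no longer hits
   an #error when the file is built without -msse4.2.  */
#include <nmmintrin.h>
#define HAVE_X86_DISPATCH 1

/* Hypothetical helper: only this function is compiled with SSE4.2
   enabled, so the always_inline intrinsic can be inlined into it.  */
__attribute__((target("sse4.2")))
static unsigned int
crc32c_byte_hw (unsigned int crc, unsigned char v)
{
  return _mm_crc32_u8 (crc, v);
}
#endif

/* Hypothetical portable fallback: one bitwise CRC32-C step using the
   reflected form (0x82F63B78) of polynomial 0x11EDC6F41.  */
static unsigned int
crc32c_byte_sw (unsigned int crc, unsigned char v)
{
  int i;
  crc ^= v;
  for (i = 0; i < 8; i++)
    crc = (crc >> 1) ^ (0x82F63B78u & (0u - (crc & 1u)));
  return crc;
}

/* Hypothetical dispatcher: pick the hardware path at run time.  */
unsigned int
crc32c_byte (unsigned int crc, unsigned char v)
{
#ifdef HAVE_X86_DISPATCH
  if (__builtin_cpu_supports ("sse4.2"))
    return crc32c_byte_hw (crc, v);
#endif
  return crc32c_byte_sw (crc, v);
}
```

Calling _mm_crc32_u8 from a function without a matching target attribute (and without -msse4.2) is the case that intrinsics_5.c is described as checking, where an error is expected rather than a silent failure to inline.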
Index: config/i386/bmiintrin.h =================================================================== --- config/i386/bmiintrin.h (revision 198950) +++ config/i386/bmiintrin.h (working copy) @@ -25,13 +25,15 @@ # error "Never use directly; include instead." #endif +#ifndef _BMIINTRIN_H_INCLUDED +#define _BMIINTRIN_H_INCLUDED + #ifndef __BMI__ -# error "BMI instruction set not enabled" +#pragma GCC push_options +#pragma GCC target("bmi") +#define __DISABLE_BMI__ #endif /* __BMI__ */ -#ifndef _BMIINTRIN_H_INCLUDED -#define _BMIINTRIN_H_INCLUDED - extern __inline unsigned short __attribute__((__gnu_inline__, __always_inline__, __artificial__)) __tzcnt_u16 (unsigned short __X) { @@ -116,4 +118,9 @@ __tzcnt_u64 (unsigned long long __X) #endif /* __x86_64__ */ +#ifdef __DISABLE_BMI__ +#undef __DISABLE_BMI__ +#pragma GCC pop_options +#endif /* __DISABLE_BMI__ */ + #endif /* _BMIINTRIN_H_INCLUDED */ Index: config/i386/mmintrin.h =================================================================== --- config/i386/mmintrin.h (revision 198950) +++ config/i386/mmintrin.h (working copy) @@ -28,8 +28,11 @@ #define _MMINTRIN_H_INCLUDED #ifndef __MMX__ -# error "MMX instruction set not enabled" -#else +#pragma GCC push_options +#pragma GCC target("mmx") +#define __DISABLE_MMX__ +#endif /* __MMX__ */ + /* The Intel API is flexible enough that we must allow aliasing with other vector types, and their scalar components. */ typedef int __m64 __attribute__ ((__vector_size__ (8), __may_alias__)); @@ -303,13 +306,21 @@ _m_paddd (__m64 __m1, __m64 __m2) } /* Add the 64-bit values in M1 to the 64-bit values in M2. 
*/ -#ifdef __SSE2__ +#ifndef __SSE2__ +#pragma GCC push_options +#pragma GCC target("sse2") +#define __DISABLE_SSE2__ +#endif /* __SSE2__ */ + extern __inline __m64 __attribute__((__gnu_inline__, __always_inline__, __artificial__)) _mm_add_si64 (__m64 __m1, __m64 __m2) { return (__m64) __builtin_ia32_paddq ((__v1di)__m1, (__v1di)__m2); } -#endif +#ifdef __DISABLE_SSE2__ +#undef __DISABLE_SSE2__ +#pragma GCC pop_options +#endif /* __DISABLE_SSE2__ */ /* Add the 8-bit values in M1 to the 8-bit values in M2 using signed saturated arithmetic. */ @@ -407,13 +418,21 @@ _m_psubd (__m64 __m1, __m64 __m2) } /* Add the 64-bit values in M1 to the 64-bit values in M2. */ -#ifdef __SSE2__ +#ifndef __SSE2__ +#pragma GCC push_options +#pragma GCC target("sse2") +#define __DISABLE_SSE2__ +#endif /* __SSE2__ */ + extern __inline __m64 __attribute__((__gnu_inline__, __always_inline__, __artificial__)) _mm_sub_si64 (__m64 __m1, __m64 __m2) { return (__m64) __builtin_ia32_psubq ((__v1di)__m1, (__v1di)__m2); } -#endif +#ifdef __DISABLE_SSE2__ +#undef __DISABLE_SSE2__ +#pragma GCC pop_options +#endif /* __DISABLE_SSE2__ */ /* Subtract the 8-bit values in M2 from the 8-bit values in M1 using signed saturating arithmetic. */ @@ -915,6 +934,9 @@ _mm_set1_pi8 (char __b) { return _mm_set_pi8 (__b, __b, __b, __b, __b, __b, __b, __b); } +#ifdef __DISABLE_MMX__ +#undef __DISABLE_MMX__ +#pragma GCC pop_options +#endif /* __DISABLE_MMX__ */ -#endif /* __MMX__ */ #endif /* _MMINTRIN_H_INCLUDED */ Index: config/i386/nmmintrin.h =================================================================== --- config/i386/nmmintrin.h (revision 198950) +++ config/i386/nmmintrin.h (working copy) @@ -27,11 +27,7 @@ #ifndef _NMMINTRIN_H_INCLUDED #define _NMMINTRIN_H_INCLUDED -#ifndef __SSE4_2__ -# error "SSE4.2 instruction set not enabled" -#else /* We just include SSE4.1 header file. 
*/ #include -#endif /* __SSE4_2__ */ #endif /* _NMMINTRIN_H_INCLUDED */ Index: config/i386/i386-protos.h =================================================================== --- config/i386/i386-protos.h (revision 198950) +++ config/i386/i386-protos.h (working copy) @@ -40,6 +40,8 @@ extern void ix86_output_addr_diff_elt (FILE *, int extern enum calling_abi ix86_cfun_abi (void); extern enum calling_abi ix86_function_type_abi (const_tree); +extern void ix86_reset_previous_fndecl (void); + #ifdef RTX_CODE extern int standard_80387_constant_p (rtx); extern const char *standard_80387_constant_opcode (rtx); Index: config/i386/avx2intrin.h =================================================================== --- config/i386/avx2intrin.h (revision 198950) +++ config/i386/avx2intrin.h (working copy) @@ -25,6 +25,15 @@ # error "Never use directly; include instead." #endif +#ifndef _AVX2INTRIN_H_INCLUDED +#define _AVX2INTRIN_H_INCLUDED + +#ifndef __AVX2__ +#pragma GCC push_options +#pragma GCC target("avx2") +#define __DISABLE_AVX2__ +#endif /* __AVX2__ */ + /* Sum absolute 8-bit integer difference of adjacent groups of 4 byte integers in the first 2 operands. Starting offsets within operands are determined by the 3rd mask operand. 
*/ @@ -1871,3 +1880,10 @@ _mm256_mask_i64gather_epi32 (__m128i src, int cons (__v4si)(__m128i)MASK, \ (int)SCALE) #endif /* __OPTIMIZE__ */ + +#ifdef __DISABLE_AVX2__ +#undef __DISABLE_AVX2__ +#pragma GCC pop_options +#endif /* __DISABLE_AVX2__ */ + +#endif /* _AVX2INTRIN_H_INCLUDED */ Index: config/i386/fxsrintrin.h =================================================================== --- config/i386/fxsrintrin.h (revision 198950) +++ config/i386/fxsrintrin.h (working copy) @@ -28,6 +28,12 @@ #ifndef _FXSRINTRIN_H_INCLUDED #define _FXSRINTRIN_H_INCLUDED +#ifndef __FXSR__ +#pragma GCC push_options +#pragma GCC target("fxsr") +#define __DISABLE_FXSR__ +#endif /* __FXSR__ */ + extern __inline void __attribute__((__gnu_inline__, __always_inline__, __artificial__)) _fxsave (void *__P) @@ -58,4 +64,10 @@ _fxrstor64 (void *__P) } #endif +#ifdef __DISABLE_FXSR__ +#undef __DISABLE_FXSR__ +#pragma GCC pop_options +#endif /* __DISABLE_FXSR__ */ + + #endif /* _FXSRINTRIN_H_INCLUDED */ Index: config/i386/tbmintrin.h =================================================================== --- config/i386/tbmintrin.h (revision 198950) +++ config/i386/tbmintrin.h (working copy) @@ -25,13 +25,15 @@ # error "Never use directly; include instead." 
#endif +#ifndef _TBMINTRIN_H_INCLUDED +#define _TBMINTRIN_H_INCLUDED + #ifndef __TBM__ -# error "TBM instruction set not enabled" +#pragma GCC push_options +#pragma GCC target("tbm") +#define __DISABLE_TBM__ #endif /* __TBM__ */ -#ifndef _TBMINTRIN_H_INCLUDED -#define _TBMINTRIN_H_INCLUDED - #ifdef __OPTIMIZE__ extern __inline unsigned int __attribute__((__gnu_inline__, __always_inline__, __artificial__)) __bextri_u32 (unsigned int __X, const unsigned int __I) @@ -169,4 +171,10 @@ __tzmsk_u64 (unsigned long long __X) #endif /* __x86_64__ */ + +#ifdef __DISABLE_TBM__ +#undef __DISABLE_TBM__ +#pragma GCC pop_options +#endif /* __DISABLE_TBM__ */ + #endif /* _TBMINTRIN_H_INCLUDED */ Index: config/i386/xsaveintrin.h =================================================================== --- config/i386/xsaveintrin.h (revision 198950) +++ config/i386/xsaveintrin.h (working copy) @@ -28,6 +28,12 @@ #ifndef _XSAVEINTRIN_H_INCLUDED #define _XSAVEINTRIN_H_INCLUDED +#ifndef __XSAVE__ +#pragma GCC push_options +#pragma GCC target("xsave") +#define __DISABLE_XSAVE__ +#endif /* __XSAVE__ */ + extern __inline void __attribute__((__gnu_inline__, __always_inline__, __artificial__)) _xsave (void *__P, long long __M) @@ -58,4 +64,9 @@ _xrstor64 (void *__P, long long __M) } #endif +#ifdef __DISABLE_XSAVE__ +#undef __DISABLE_XSAVE__ +#pragma GCC pop_options +#endif /* __DISABLE_XSAVE__ */ + #endif /* _XSAVEINTRIN_H_INCLUDED */ Index: config/i386/f16cintrin.h =================================================================== --- config/i386/f16cintrin.h (revision 198950) +++ config/i386/f16cintrin.h (working copy) @@ -25,13 +25,15 @@ # error "Never use directly; include or instead." 
#endif -#ifndef __F16C__ -# error "F16C instruction set not enabled" -#else - #ifndef _F16CINTRIN_H_INCLUDED #define _F16CINTRIN_H_INCLUDED +#ifndef __F16C__ +#pragma GCC push_options +#pragma GCC target("f16c") +#define __DISABLE_F16C__ +#endif /* __F16C__ */ + extern __inline float __attribute__((__gnu_inline__, __always_inline__, __artificial__)) _cvtsh_ss (unsigned short __S) { @@ -88,5 +90,9 @@ _mm256_cvtps_ph (__m256 __A, const int __I) ((__m128i) __builtin_ia32_vcvtps2ph256 ((__v8sf)(__m256) A, (int) (I))) #endif /* __OPTIMIZE */ +#ifdef __DISABLE_F16C__ +#undef __DISABLE_F16C__ +#pragma GCC pop_options +#endif /* __DISABLE_F16C__ */ + #endif /* _F16CINTRIN_H_INCLUDED */ -#endif /* __F16C__ */ Index: config/i386/xtestintrin.h =================================================================== --- config/i386/xtestintrin.h (revision 198950) +++ config/i386/xtestintrin.h (working copy) @@ -25,13 +25,15 @@ # error "Never use directly; include instead." #endif +#ifndef _XTESTINTRIN_H_INCLUDED +#define _XTESTINTRIN_H_INCLUDED + #ifndef __RTM__ -# error "RTM instruction set not enabled" +#pragma GCC push_options +#pragma GCC target("rtm") +#define __DISABLE_RTM__ #endif /* __RTM__ */ -#ifndef _XTESTINTRIN_H_INCLUDED -#define _XTESTINTRIN_H_INCLUDED - /* Return non-zero if the instruction executes inside an RTM or HLE code region. Return zero otherwise. 
*/ extern __inline int @@ -41,4 +43,9 @@ _xtest (void) return __builtin_ia32_xtest (); } +#ifdef __DISABLE_RTM__ +#undef __DISABLE_RTM__ +#pragma GCC pop_options +#endif /* __DISABLE_RTM__ */ + #endif /* _XTESTINTRIN_H_INCLUDED */ Index: config/i386/xsaveoptintrin.h =================================================================== --- config/i386/xsaveoptintrin.h (revision 198950) +++ config/i386/xsaveoptintrin.h (working copy) @@ -28,6 +28,12 @@ #ifndef _XSAVEOPTINTRIN_H_INCLUDED #define _XSAVEOPTINTRIN_H_INCLUDED +#ifndef __XSAVEOPT__ +#pragma GCC push_options +#pragma GCC target("xsaveopt") +#define __DISABLE_XSAVEOPT__ +#endif /* __XSAVEOPT__ */ + extern __inline void __attribute__((__gnu_inline__, __always_inline__, __artificial__)) _xsaveopt (void *__P, long long __M) @@ -44,4 +50,9 @@ _xsaveopt64 (void *__P, long long __M) } #endif +#ifdef __DISABLE_XSAVEOPT__ +#undef __DISABLE_XSAVEOPT__ +#pragma GCC pop_options +#endif /* __DISABLE_XSAVEOPT__ */ + #endif /* _XSAVEOPTINTRIN_H_INCLUDED */ Index: config/i386/i386-c.c =================================================================== --- config/i386/i386-c.c (revision 198950) +++ config/i386/i386-c.c (working copy) @@ -369,20 +369,23 @@ ix86_pragma_target_parse (tree args, tree pop_targ if (! args) { - cur_tree = ((pop_target) - ? pop_target - : target_option_default_node); + cur_tree = (pop_target ? pop_target : target_option_default_node); cl_target_option_restore (&global_options, TREE_TARGET_OPTION (cur_tree)); } else { cur_tree = ix86_valid_target_attribute_tree (args); - if (!cur_tree) - return false; + if (!cur_tree || cur_tree == error_mark_node) + { + cl_target_option_restore (&global_options, + TREE_TARGET_OPTION (prev_tree)); + return false; + } } target_option_current_node = cur_tree; + ix86_reset_previous_fndecl (); /* Figure out the previous/current isa, arch, tune and the differences. 
*/ prev_opt = TREE_TARGET_OPTION (prev_tree); Index: config/i386/bmi2intrin.h =================================================================== --- config/i386/bmi2intrin.h (revision 198950) +++ config/i386/bmi2intrin.h (working copy) @@ -25,13 +25,15 @@ # error "Never use directly; include instead." #endif +#ifndef _BMI2INTRIN_H_INCLUDED +#define _BMI2INTRIN_H_INCLUDED + #ifndef __BMI2__ -# error "BMI2 instruction set not enabled" +#pragma GCC push_options +#pragma GCC target("bmi2") +#define __DISABLE_BMI2__ #endif /* __BMI2__ */ -#ifndef _BMI2INTRIN_H_INCLUDED -#define _BMI2INTRIN_H_INCLUDED - extern __inline unsigned int __attribute__((__gnu_inline__, __always_inline__, __artificial__)) _bzhi_u32 (unsigned int __X, unsigned int __Y) @@ -99,4 +101,9 @@ _mulx_u32 (unsigned int __X, unsigned int __Y, uns #endif /* !__x86_64__ */ +#ifdef __DISABLE_BMI2__ +#undef __DISABLE_BMI2__ +#pragma GCC pop_options +#endif /* __DISABLE_BMI2__ */ + #endif /* _BMI2INTRIN_H_INCLUDED */ Index: config/i386/lzcntintrin.h =================================================================== --- config/i386/lzcntintrin.h (revision 198950) +++ config/i386/lzcntintrin.h (working copy) @@ -25,13 +25,16 @@ # error "Never use directly; include instead." 
#endif -#ifndef __LZCNT__ -# error "LZCNT instruction is not enabled" -#endif /* __LZCNT__ */ #ifndef _LZCNTINTRIN_H_INCLUDED #define _LZCNTINTRIN_H_INCLUDED +#ifndef __LZCNT__ +#pragma GCC push_options +#pragma GCC target("lzcnt") +#define __DISABLE_LZCNT__ +#endif /* __LZCNT__ */ + extern __inline unsigned short __attribute__((__gnu_inline__, __always_inline__, __artificial__)) __lzcnt16 (unsigned short __X) { @@ -64,4 +67,9 @@ _lzcnt_u64 (unsigned long long __X) } #endif +#ifdef __DISABLE_LZCNT__ +#undef __DISABLE_LZCNT__ +#pragma GCC pop_options +#endif /* __DISABLE_LZCNT__ */ + #endif /* _LZCNTINTRIN_H_INCLUDED */ Index: config/i386/smmintrin.h =================================================================== --- config/i386/smmintrin.h (revision 198950) +++ config/i386/smmintrin.h (working copy) @@ -27,14 +27,16 @@ #ifndef _SMMINTRIN_H_INCLUDED #define _SMMINTRIN_H_INCLUDED -#ifndef __SSE4_1__ -# error "SSE4.1 instruction set not enabled" -#else - /* We need definitions from the SSSE3, SSE3, SSE2 and SSE header files. */ #include +#ifndef __SSE4_1__ +#pragma GCC push_options +#pragma GCC target("sse4.1") +#define __DISABLE_SSE4_1__ +#endif /* __SSE4_1__ */ + /* Rounding mode macros. */ #define _MM_FROUND_TO_NEAREST_INT 0x00 #define _MM_FROUND_TO_NEG_INF 0x01 @@ -582,7 +584,11 @@ _mm_stream_load_si128 (__m128i *__X) return (__m128i) __builtin_ia32_movntdqa ((__v2di *) __X); } -#ifdef __SSE4_2__ +#ifndef __SSE4_2__ +#pragma GCC push_options +#pragma GCC target("sse4.2") +#define __DISABLE_SSE4_2__ +#endif /* __SSE4_2__ */ /* These macros specify the source data format. 
*/ #define _SIDD_UBYTE_OPS 0x00 @@ -792,10 +798,30 @@ _mm_cmpgt_epi64 (__m128i __X, __m128i __Y) return (__m128i) __builtin_ia32_pcmpgtq ((__v2di)__X, (__v2di)__Y); } -#ifdef __POPCNT__ +#ifdef __DISABLE_SSE4_2__ +#undef __DISABLE_SSE4_2__ +#pragma GCC pop_options +#endif /* __DISABLE_SSE4_2__ */ + +#ifdef __DISABLE_SSE4_1__ +#undef __DISABLE_SSE4_1__ +#pragma GCC pop_options +#endif /* __DISABLE_SSE4_1__ */ + #include -#endif +#ifndef __SSE4_1__ +#pragma GCC push_options +#pragma GCC target("sse4.1") +#define __DISABLE_SSE4_1__ +#endif /* __SSE4_1__ */ + +#ifndef __SSE4_2__ +#pragma GCC push_options +#pragma GCC target("sse4.2") +#define __DISABLE_SSE4_2__ +#endif /* __SSE4_2__ */ + /* Accumulate CRC32 (polynomial 0x11EDC6F41) value. */ extern __inline unsigned int __attribute__((__gnu_inline__, __always_inline__, __artificial__)) _mm_crc32_u8 (unsigned int __C, unsigned char __V) @@ -823,8 +849,14 @@ _mm_crc32_u64 (unsigned long long __C, unsigned lo } #endif -#endif /* __SSE4_2__ */ +#ifdef __DISABLE_SSE4_2__ +#undef __DISABLE_SSE4_2__ +#pragma GCC pop_options +#endif /* __DISABLE_SSE4_2__ */ -#endif /* __SSE4_1__ */ +#ifdef __DISABLE_SSE4_1__ +#undef __DISABLE_SSE4_1__ +#pragma GCC pop_options +#endif /* __DISABLE_SSE4_1__ */ #endif /* _SMMINTRIN_H_INCLUDED */ Index: config/i386/wmmintrin.h =================================================================== --- config/i386/wmmintrin.h (revision 198950) +++ config/i386/wmmintrin.h (working copy) @@ -30,13 +30,14 @@ /* We need definitions from the SSE2 header file. */ #include -#if !defined (__AES__) && !defined (__PCLMUL__) -# error "AES/PCLMUL instructions not enabled" -#else - /* AES */ -#ifdef __AES__ +#ifndef __AES__ +#pragma GCC push_options +#pragma GCC target("aes") +#define __DISABLE_AES__ +#endif /* __AES__ */ + /* Performs 1 round of AES decryption of the first m128i using the second m128i as a round key. 
*/ extern __inline __m128i __attribute__((__gnu_inline__, __always_inline__, __artificial__)) @@ -92,11 +93,20 @@ _mm_aeskeygenassist_si128 (__m128i __X, const int ((__m128i) __builtin_ia32_aeskeygenassist128 ((__v2di)(__m128i)(X), \ (int)(C))) #endif -#endif /* __AES__ */ +#ifdef __DISABLE_AES__ +#undef __DISABLE_AES__ +#pragma GCC pop_options +#endif /* __DISABLE_AES__ */ + /* PCLMUL */ -#ifdef __PCLMUL__ +#ifndef __PCLMUL__ +#pragma GCC push_options +#pragma GCC target("pclmul") +#define __DISABLE_PCLMUL__ +#endif /* __PCLMUL__ */ + /* Performs carry-less integer multiplication of 64-bit halves of 128-bit input operands. The third parameter inducates which 64-bit haves of the input parameters v1 and v2 should be used. It must be @@ -113,8 +123,10 @@ _mm_clmulepi64_si128 (__m128i __X, __m128i __Y, co ((__m128i) __builtin_ia32_pclmulqdq128 ((__v2di)(__m128i)(X), \ (__v2di)(__m128i)(Y), (int)(I))) #endif -#endif /* __PCLMUL__ */ -#endif /* __AES__/__PCLMUL__ */ +#ifdef __DISABLE_PCLMUL__ +#undef __DISABLE_PCLMUL__ +#pragma GCC pop_options +#endif /* __DISABLE_PCLMUL__ */ #endif /* _WMMINTRIN_H_INCLUDED */ Index: config/i386/mm3dnow.h =================================================================== --- config/i386/mm3dnow.h (revision 198950) +++ config/i386/mm3dnow.h (working copy) @@ -27,11 +27,15 @@ #ifndef _MM3DNOW_H_INCLUDED #define _MM3DNOW_H_INCLUDED -#ifdef __3dNOW__ - #include #include +#ifndef __3dNOW__ +#pragma GCC push_options +#pragma GCC target("3dnow") +#define __DISABLE_3dNOW__ +#endif /* __3dNOW__ */ + extern __inline void __attribute__((__gnu_inline__, __always_inline__, __artificial__)) _m_femms (void) { @@ -205,6 +209,10 @@ _m_pswapd (__m64 __A) } #endif /* __3dNOW_A__ */ -#endif /* __3dNOW__ */ +#ifdef __DISABLE_3dNOW__ +#undef __DISABLE_3dNOW__ +#pragma GCC pop_options +#endif /* __DISABLE_3dNOW__ */ + #endif /* _MM3DNOW_H_INCLUDED */ Index: config/i386/i386.c =================================================================== --- 
config/i386/i386.c (revision 198950) +++ config/i386/i386.c (working copy) @@ -4564,6 +4564,13 @@ ix86_can_inline_p (tree caller, tree callee) /* Remember the last target of ix86_set_current_function. */ static GTY(()) tree ix86_previous_fndecl; +/* Invalidate ix86_previous_fndecl cache. */ +void +ix86_reset_previous_fndecl (void) +{ + ix86_previous_fndecl = NULL_TREE; +} + /* Establish appropriate back-end context for processing the function FNDECL. The argument might be NULL to indicate processing at top level, outside of any function scope. */ Index: config/i386/x86intrin.h =================================================================== --- config/i386/x86intrin.h (revision 198950) +++ config/i386/x86intrin.h (working copy) @@ -26,96 +26,52 @@ #include -#ifdef __MMX__ #include -#endif -#ifdef __SSE__ #include -#endif -#ifdef __SSE2__ #include -#endif -#ifdef __SSE3__ #include -#endif -#ifdef __SSSE3__ #include -#endif -#ifdef __SSE4A__ #include -#endif -#if defined (__SSE4_2__) || defined (__SSE4_1__) #include -#endif -#if defined (__AES__) || defined (__PCLMUL__) #include -#endif /* For including AVX instructions */ #include -#ifdef __3dNOW__ #include -#endif -#ifdef __FMA4__ #include -#endif -#ifdef __XOP__ #include -#endif -#ifdef __LWP__ #include -#endif -#ifdef __BMI__ #include -#endif -#ifdef __BMI2__ #include -#endif -#ifdef __TBM__ #include -#endif -#ifdef __LZCNT__ #include -#endif -#ifdef __POPCNT__ #include -#endif -#ifdef __RDSEED__ #include -#endif -#ifdef __PRFCHW__ #include -#endif -#ifdef __FXSR__ #include -#endif -#ifdef __XSAVE__ #include -#endif -#ifdef __XSAVEOPT__ #include -#endif #include Index: config/i386/prfchwintrin.h =================================================================== --- config/i386/prfchwintrin.h (revision 198950) +++ config/i386/prfchwintrin.h (working copy) @@ -26,17 +26,24 @@ #endif -#if !defined (__PRFCHW__) && !defined (__3dNOW__) -# error "PRFCHW instruction not enabled" -#endif /* __PRFCHW__ or __3dNOW__*/ 
- #ifndef _PRFCHWINTRIN_H_INCLUDED #define _PRFCHWINTRIN_H_INCLUDED +#ifndef __PRFCHW__ +#pragma GCC push_options +#pragma GCC target("prfchw") +#define __DISABLE_PRFCHW__ +#endif /* __PRFCHW__ */ + extern __inline void __attribute__((__gnu_inline__, __always_inline__, __artificial__)) _m_prefetchw (void *__P) { __builtin_prefetch (__P, 1, 3 /* _MM_HINT_T0 */); } +#ifdef __DISABLE_PRFCHW__ +#undef __DISABLE_PRFCHW__ +#pragma GCC pop_options +#endif /* __DISABLE_PRFCHW__ */ + #endif /* _PRFCHWINTRIN_H_INCLUDED */ Index: config/i386/pmmintrin.h =================================================================== --- config/i386/pmmintrin.h (revision 198950) +++ config/i386/pmmintrin.h (working copy) @@ -27,13 +27,15 @@ #ifndef _PMMINTRIN_H_INCLUDED #define _PMMINTRIN_H_INCLUDED -#ifndef __SSE3__ -# error "SSE3 instruction set not enabled" -#else - /* We need definitions from the SSE2 and SSE header files*/ #include +#ifndef __SSE3__ +#pragma GCC push_options +#pragma GCC target("sse3") +#define __DISABLE_SSE3__ +#endif /* __SSE3__ */ + /* Additional bits in the MXCSR. 
*/ #define _MM_DENORMALS_ZERO_MASK 0x0040 #define _MM_DENORMALS_ZERO_ON 0x0040 @@ -122,6 +124,9 @@ _mm_mwait (unsigned int __E, unsigned int __H) __builtin_ia32_mwait (__E, __H); } -#endif /* __SSE3__ */ +#ifdef __DISABLE_SSE3__ +#undef __DISABLE_SSE3__ +#pragma GCC pop_options +#endif /* __DISABLE_SSE3__ */ #endif /* _PMMINTRIN_H_INCLUDED */ Index: config/i386/tmmintrin.h =================================================================== --- config/i386/tmmintrin.h (revision 198950) +++ config/i386/tmmintrin.h (working copy) @@ -27,13 +27,15 @@ #ifndef _TMMINTRIN_H_INCLUDED #define _TMMINTRIN_H_INCLUDED -#ifndef __SSSE3__ -# error "SSSE3 instruction set not enabled" -#else - /* We need definitions from the SSE3, SSE2 and SSE header files*/ #include +#ifndef __SSSE3__ +#pragma GCC push_options +#pragma GCC target("ssse3") +#define __DISABLE_SSSE3__ +#endif /* __SSSE3__ */ + extern __inline __m128i __attribute__((__gnu_inline__, __always_inline__, __artificial__)) _mm_hadd_epi16 (__m128i __X, __m128i __Y) { @@ -239,6 +241,9 @@ _mm_abs_pi32 (__m64 __X) return (__m64) __builtin_ia32_pabsd ((__v2si)__X); } -#endif /* __SSSE3__ */ +#ifdef __DISABLE_SSSE3__ +#undef __DISABLE_SSSE3__ +#pragma GCC pop_options +#endif /* __DISABLE_SSSE3__ */ #endif /* _TMMINTRIN_H_INCLUDED */ Index: config/i386/xmmintrin.h =================================================================== --- config/i386/xmmintrin.h (revision 198950) +++ config/i386/xmmintrin.h (working copy) @@ -27,16 +27,18 @@ #ifndef _XMMINTRIN_H_INCLUDED #define _XMMINTRIN_H_INCLUDED -#ifndef __SSE__ -# error "SSE instruction set not enabled" -#else - /* We need type definitions from the MMX header file. */ #include /* Get _mm_malloc () and _mm_free (). */ #include +#ifndef __SSE__ +#pragma GCC push_options +#pragma GCC target("sse") +#define __DISABLE_SSE__ +#endif /* __SSE__ */ + /* The Intel API is flexible enough that we must allow aliasing with other vector types, and their scalar components. 
*/ typedef float __m128 __attribute__ ((__vector_size__ (16), __may_alias__)); @@ -1242,9 +1244,11 @@ do { \ } while (0) /* For backward source compatibility. */ -#ifdef __SSE2__ # include -#endif -#endif /* __SSE__ */ +#ifdef __DISABLE_SSE__ +#undef __DISABLE_SSE__ +#pragma GCC pop_options +#endif /* __DISABLE_SSE__ */ + #endif /* _XMMINTRIN_H_INCLUDED */ Index: config/i386/popcntintrin.h =================================================================== --- config/i386/popcntintrin.h (revision 198950) +++ config/i386/popcntintrin.h (working copy) @@ -21,13 +21,15 @@ see the files COPYING3 and COPYING.RUNTIME respectively. If not, see . */ +#ifndef _POPCNTINTRIN_H_INCLUDED +#define _POPCNTINTRIN_H_INCLUDED + #ifndef __POPCNT__ -# error "POPCNT instruction set not enabled" +#pragma GCC push_options +#pragma GCC target("popcnt") +#define __DISABLE_POPCNT__ #endif /* __POPCNT__ */ -#ifndef _POPCNTINTRIN_H_INCLUDED -#define _POPCNTINTRIN_H_INCLUDED - /* Calculate a number of bits set to 1. 
*/ extern __inline int __attribute__((__gnu_inline__, __always_inline__, __artificial__)) _mm_popcnt_u32 (unsigned int __X) @@ -43,4 +45,9 @@ _mm_popcnt_u64 (unsigned long long __X) } #endif +#ifdef __DISABLE_POPCNT__ +#undef __DISABLE_POPCNT__ +#pragma GCC pop_options +#endif /* __DISABLE_POPCNT__ */ + #endif /* _POPCNTINTRIN_H_INCLUDED */ Index: config/i386/adxintrin.h =================================================================== --- config/i386/adxintrin.h (revision 198950) +++ config/i386/adxintrin.h (working copy) @@ -28,6 +28,12 @@ #ifndef _ADXINTRIN_H_INCLUDED #define _ADXINTRIN_H_INCLUDED +#ifndef __ADX__ +#pragma GCC push_options +#pragma GCC target("adx") +#define __DISABLE_ADX__ +#endif /* __ADX__ */ + extern __inline unsigned char __attribute__((__gnu_inline__, __always_inline__, __artificial__)) _addcarryx_u32 (unsigned char __CF, unsigned int __X, @@ -46,4 +52,9 @@ _addcarryx_u64 (unsigned char __CF, unsigned long } #endif +#ifdef __DISABLE_ADX__ +#undef __DISABLE_ADX__ +#pragma GCC pop_options +#endif /* __DISABLE_ADX__ */ + #endif /* _ADXINTRIN_H_INCLUDED */ Index: config/i386/rdseedintrin.h =================================================================== --- config/i386/rdseedintrin.h (revision 198950) +++ config/i386/rdseedintrin.h (working copy) @@ -25,12 +25,15 @@ # error "Never use directly; include instead." 
 #endif
+#ifndef _RDSEEDINTRIN_H_INCLUDED
+#define _RDSEEDINTRIN_H_INCLUDED
+
 #ifndef __RDSEED__
-# error "RDSEED instruction not enabled"
+#pragma GCC push_options
+#pragma GCC target("rdseed")
+#define __DISABLE_RDSEED__
 #endif /* __RDSEED__ */
-#ifndef _RDSEEDINTRIN_H_INCLUDED
-#define _RDSEEDINTRIN_H_INCLUDED
 extern __inline int
 __attribute__((__gnu_inline__, __always_inline__, __artificial__))
@@ -55,4 +58,9 @@ _rdseed64_step (unsigned long long *p)
 }
 #endif
+#ifdef __DISABLE_RDSEED__
+#undef __DISABLE_RDSEED__
+#pragma GCC pop_options
+#endif /* __DISABLE_RDSEED__ */
+
 #endif /* _RDSEEDINTRIN_H_INCLUDED */
Index: config/i386/ammintrin.h
===================================================================
--- config/i386/ammintrin.h (revision 198950)
+++ config/i386/ammintrin.h (working copy)
@@ -27,13 +27,15 @@
 #ifndef _AMMINTRIN_H_INCLUDED
 #define _AMMINTRIN_H_INCLUDED
-#ifndef __SSE4A__
-# error "SSE4A instruction set not enabled"
-#else
-
 /* We need definitions from the SSE3, SSE2 and SSE header files*/
 #include
+#ifndef __SSE4A__
+#pragma GCC push_options
+#pragma GCC target("sse4a")
+#define __DISABLE_SSE4A__
+#endif /* __SSE4A__ */
+
 extern __inline void __attribute__((__gnu_inline__, __always_inline__, __artificial__))
 _mm_stream_sd (double * __P, __m128d __Y)
 {
@@ -83,6 +85,9 @@ _mm_inserti_si64(__m128i __X, __m128i __Y, unsigne
                                     (unsigned int)(I), (unsigned int)(L)))
 #endif
-#endif /* __SSE4A__ */
+#ifdef __DISABLE_SSE4A__
+#undef __DISABLE_SSE4A__
+#pragma GCC pop_options
+#endif /* __DISABLE_SSE4A__ */
 #endif /* _AMMINTRIN_H_INCLUDED */
Index: config/i386/emmintrin.h
===================================================================
--- config/i386/emmintrin.h (revision 198950)
+++ config/i386/emmintrin.h (working copy)
@@ -27,13 +27,15 @@
 #ifndef _EMMINTRIN_H_INCLUDED
 #define _EMMINTRIN_H_INCLUDED
-#ifndef __SSE2__
-# error "SSE2 instruction set not enabled"
-#else
-
 /* We need definitions from the SSE header files*/
 #include
+#ifndef __SSE2__
+#pragma GCC push_options
+#pragma GCC target("sse2")
+#define __DISABLE_SSE2__
+#endif /* __SSE2__ */
+
 /* SSE2 */
 typedef double __v2df __attribute__ ((__vector_size__ (16)));
 typedef long long __v2di __attribute__ ((__vector_size__ (16)));
@@ -1515,6 +1517,9 @@ _mm_castsi128_pd(__m128i __A)
   return (__m128d) __A;
 }
-#endif /* __SSE2__ */
+#ifdef __DISABLE_SSE2__
+#undef __DISABLE_SSE2__
+#pragma GCC pop_options
+#endif /* __DISABLE_SSE2__ */
 #endif /* _EMMINTRIN_H_INCLUDED */
Index: config/i386/immintrin.h
===================================================================
--- config/i386/immintrin.h (revision 198950)
+++ config/i386/immintrin.h (working copy)
@@ -24,71 +24,43 @@
 #ifndef _IMMINTRIN_H_INCLUDED
 #define _IMMINTRIN_H_INCLUDED
-#ifdef __MMX__
 #include
-#endif
-#ifdef __SSE__
 #include
-#endif
-#ifdef __SSE2__
 #include
-#endif
-#ifdef __SSE3__
 #include
-#endif
-#ifdef __SSSE3__
 #include
-#endif
-#if defined (__SSE4_2__) || defined (__SSE4_1__)
 #include
-#endif
-#if defined (__AES__) || defined (__PCLMUL__)
 #include
-#endif
-#ifdef __AVX__
 #include
-#endif
-#ifdef __AVX2__
 #include
-#endif
-#ifdef __LZCNT__
 #include
-#endif
-#ifdef __BMI__
 #include
-#endif
-#ifdef __BMI2__
 #include
-#endif
-#ifdef __FMA__
 #include
-#endif
-#ifdef __F16C__
 #include
-#endif
-#ifdef __RTM__
 #include
-#endif
-#ifdef __RTM__
 #include
-#endif
-#ifdef __RDRND__
+#ifndef __RDRND__
+#pragma GCC push_options
+#pragma GCC target("rdrnd")
+#define __DISABLE_RDRND__
+#endif /* __RDRND__ */
 extern __inline int
 __attribute__((__gnu_inline__, __always_inline__, __artificial__))
 _rdrand16_step (unsigned short *__P)
@@ -102,10 +74,18 @@ _rdrand32_step (unsigned int *__P)
 {
   return __builtin_ia32_rdrand32_step (__P);
 }
-#endif /* __RDRND__ */
+#ifdef __DISABLE_RDRND__
+#undef __DISABLE_RDRND__
+#pragma GCC pop_options
+#endif /* __DISABLE_RDRND__ */
 #ifdef __x86_64__
-#ifdef __FSGSBASE__
+
+#ifndef __FSGSBASE__
+#pragma GCC push_options
+#pragma GCC target("fsgsbase")
+#define
__DISABLE_FSGSBASE__
+#endif /* __FSGSBASE__ */
 extern __inline unsigned int
 __attribute__((__gnu_inline__, __always_inline__, __artificial__))
 _readfsbase_u32 (void)
@@ -161,16 +141,27 @@ _writegsbase_u64 (unsigned long long __B)
 {
   __builtin_ia32_wrgsbase64 (__B);
 }
-#endif /* __FSGSBASE__ */
+#ifdef __DISABLE_FSGSBASE__
+#undef __DISABLE_FSGSBASE__
+#pragma GCC pop_options
+#endif /* __DISABLE_FSGSBASE__ */
-#ifdef __RDRND__
+#ifndef __RDRND__
+#pragma GCC push_options
+#pragma GCC target("rdrnd")
+#define __DISABLE_RDRND__
+#endif /* __RDRND__ */
 extern __inline int
 __attribute__((__gnu_inline__, __always_inline__, __artificial__))
 _rdrand64_step (unsigned long long *__P)
 {
   return __builtin_ia32_rdrand64_step (__P);
 }
-#endif /* __RDRND__ */
+#ifdef __DISABLE_RDRND__
+#undef __DISABLE_RDRND__
+#pragma GCC pop_options
+#endif /* __DISABLE_RDRND__ */
+
 #endif /* __x86_64__ */
 #endif /* _IMMINTRIN_H_INCLUDED */
Index: config/i386/fma4intrin.h
===================================================================
--- config/i386/fma4intrin.h (revision 198950)
+++ config/i386/fma4intrin.h (working copy)
@@ -28,13 +28,15 @@
 #ifndef _FMA4INTRIN_H_INCLUDED
 #define _FMA4INTRIN_H_INCLUDED
-#ifndef __FMA4__
-# error "FMA4 instruction set not enabled"
-#else
-
 /* We need definitions from the SSE4A, SSE3, SSE2 and SSE header files. */
 #include
+#ifndef __FMA4__
+#pragma GCC push_options
+#pragma GCC target("fma4")
+#define __DISABLE_FMA4__
+#endif /* __FMA4__ */
+
 /* 128b Floating point multiply/add type instructions.
*/
 extern __inline __m128 __attribute__((__gnu_inline__, __always_inline__, __artificial__))
 _mm_macc_ps (__m128 __A, __m128 __B, __m128 __C)
@@ -231,6 +233,9 @@ _mm256_msubadd_pd (__m256d __A, __m256d __B, __m25
   return (__m256d) __builtin_ia32_vfmaddsubpd256 ((__v4df)__A, (__v4df)__B, -(__v4df)__C);
 }
-#endif
+#ifdef __DISABLE_FMA4__
+#undef __DISABLE_FMA4__
+#pragma GCC pop_options
+#endif /* __DISABLE_FMA4__ */
 #endif
Index: config/i386/lwpintrin.h
===================================================================
--- config/i386/lwpintrin.h (revision 198950)
+++ config/i386/lwpintrin.h (working copy)
@@ -29,8 +29,10 @@
 #define _LWPINTRIN_H_INCLUDED
 #ifndef __LWP__
-# error "LWP instruction set not enabled"
-#else
+#pragma GCC push_options
+#pragma GCC target("lwp")
+#define __DISABLE_LWP__
+#endif /* __LWP__ */
 extern __inline void __attribute__((__gnu_inline__, __always_inline__, __artificial__))
 __llwpcb (void *pcbAddress)
@@ -95,6 +97,9 @@ __lwpins64 (unsigned long long data2, unsigned int
 #endif
 #endif
-#endif /* __LWP__ */
+#ifdef __DISABLE_LWP__
+#undef __DISABLE_LWP__
+#pragma GCC pop_options
+#endif /* __DISABLE_LWP__ */
 #endif /* _LWPINTRIN_H_INCLUDED */
Index: config/i386/xopintrin.h
===================================================================
--- config/i386/xopintrin.h (revision 198950)
+++ config/i386/xopintrin.h (working copy)
@@ -28,12 +28,14 @@
 #ifndef _XOPMMINTRIN_H_INCLUDED
 #define _XOPMMINTRIN_H_INCLUDED
+#include
+
 #ifndef __XOP__
-# error "XOP instruction set not enabled"
-#else
+#pragma GCC push_options
+#pragma GCC target("xop")
+#define __DISABLE_XOP__
+#endif /* __XOP__ */
-#include
-
 /* Integer multiply/add intructions.
*/
 extern __inline __m128i __attribute__((__gnu_inline__, __always_inline__, __artificial__))
 _mm_maccs_epi16(__m128i __A, __m128i __B, __m128i __C)
@@ -830,6 +832,9 @@ _mm256_permute2_ps (__m256 __X, __m256 __Y, __m256
                                       (int)(I)))
 #endif /* __OPTIMIZE__ */
-#endif /* __XOP__ */
+#ifdef __DISABLE_XOP__
+#undef __DISABLE_XOP__
+#pragma GCC pop_options
+#endif /* __DISABLE_XOP__ */
 #endif /* _XOPMMINTRIN_H_INCLUDED */
Index: config/i386/ia32intrin.h
===================================================================
--- config/i386/ia32intrin.h (revision 198950)
+++ config/i386/ia32intrin.h (working copy)
@@ -49,7 +49,12 @@ __bswapd (int __X)
   return __builtin_bswap32 (__X);
 }
-#ifdef __SSE4_2__
+#ifndef __SSE4_2__
+#pragma GCC push_options
+#pragma GCC target("sse4.2")
+#define __DISABLE_SSE4_2__
+#endif /* __SSE4_2__ */
+
 /* 32bit accumulate CRC32 (polynomial 0x11EDC6F41) value. */
 extern __inline unsigned int
 __attribute__((__gnu_inline__, __always_inline__, __artificial__))
@@ -71,8 +76,12 @@ __crc32d (unsigned int __C, unsigned int __V)
 {
   return __builtin_ia32_crc32si (__C, __V);
 }
-#endif /* SSE4.2 */
+#ifdef __DISABLE_SSE4_2__
+#undef __DISABLE_SSE4_2__
+#pragma GCC pop_options
+#endif /* __DISABLE_SSE4_2__ */
+
 /* 32bit popcnt */
 extern __inline int
 __attribute__((__gnu_inline__, __always_inline__, __artificial__))
@@ -186,7 +195,12 @@ __bswapq (long long __X)
   return __builtin_bswap64 (__X);
 }
-#ifdef __SSE4_2__
+#ifndef __SSE4_2__
+#pragma GCC push_options
+#pragma GCC target("sse4.2")
+#define __DISABLE_SSE4_2__
+#endif /* __SSE4_2__ */
+
 /* 64bit accumulate CRC32 (polynomial 0x11EDC6F41) value.
*/
 extern __inline unsigned long long
 __attribute__((__gnu_inline__, __always_inline__, __artificial__))
@@ -194,8 +208,12 @@ __crc32q (unsigned long long __C, unsigned long lo
 {
   return __builtin_ia32_crc32di (__C, __V);
 }
-#endif
+#ifdef __DISABLE_SSE4_2__
+#undef __DISABLE_SSE4_2__
+#pragma GCC pop_options
+#endif /* __DISABLE_SSE4_2__ */
+
 /* 64bit popcnt */
 extern __inline long long
 __attribute__((__gnu_inline__, __always_inline__, __artificial__))
Index: config/i386/avxintrin.h
===================================================================
--- config/i386/avxintrin.h (revision 198950)
+++ config/i386/avxintrin.h (working copy)
@@ -28,6 +28,15 @@
 # error "Never use directly; include instead."
 #endif
+#ifndef _AVXINTRIN_H_INCLUDED
+#define _AVXINTRIN_H_INCLUDED
+
+#ifndef __AVX__
+#pragma GCC push_options
+#pragma GCC target("avx")
+#define __DISABLE_AVX__
+#endif /* __AVX__ */
+
 /* Internal data types for implementing the intrinsics. */
 typedef double __v4df __attribute__ ((__vector_size__ (32)));
 typedef float __v8sf __attribute__ ((__vector_size__ (32)));
@@ -1424,3 +1433,10 @@ _mm256_castsi128_si256 (__m128i __A)
 {
   return (__m256i) __builtin_ia32_si256_si ((__v4si)__A);
 }
+
+#ifdef __DISABLE_AVX__
+#undef __DISABLE_AVX__
+#pragma GCC pop_options
+#endif /* __DISABLE_AVX__ */
+
+#endif /* _AVXINTRIN_H_INCLUDED */
Index: config/i386/rtmintrin.h
===================================================================
--- config/i386/rtmintrin.h (revision 198950)
+++ config/i386/rtmintrin.h (working copy)
@@ -25,13 +25,15 @@
 # error "Never use directly; include instead."
 #endif
+#ifndef _RTMINTRIN_H_INCLUDED
+#define _RTMINTRIN_H_INCLUDED
+
 #ifndef __RTM__
-# error "RTM instruction set not enabled"
+#pragma GCC push_options
+#pragma GCC target("rtm")
+#define __DISABLE_RTM__
 #endif /* __RTM__ */
-#ifndef _RTMINTRIN_H_INCLUDED
-#define _RTMINTRIN_H_INCLUDED
-
 #define _XBEGIN_STARTED (~0u)
 #define _XABORT_EXPLICIT (1 << 0)
 #define _XABORT_RETRY (1 << 1)
@@ -74,4 +76,9 @@ _xabort (const unsigned int imm)
 #define _xabort(N) __builtin_ia32_xabort (N)
 #endif /* __OPTIMIZE__ */
+#ifdef __DISABLE_RTM__
+#undef __DISABLE_RTM__
+#pragma GCC pop_options
+#endif /* __DISABLE_RTM__ */
+
 #endif /* _RTMINTRIN_H_INCLUDED */
Index: config/i386/fmaintrin.h
===================================================================
--- config/i386/fmaintrin.h (revision 198950)
+++ config/i386/fmaintrin.h (working copy)
@@ -29,8 +29,10 @@
 #define _FMAINTRIN_H_INCLUDED
 #ifndef __FMA__
-# error "FMA instruction set not enabled"
-#else
+#pragma GCC push_options
+#pragma GCC target("fma")
+#define __DISABLE_FMA__
+#endif /* __FMA__ */
 extern __inline __m128d
 __attribute__((__gnu_inline__, __always_inline__, __artificial__))
@@ -292,6 +294,9 @@ _mm256_fmsubadd_ps (__m256 __A, __m256 __B, __m256
                                                        -(__v8sf)__C);
 }
-#endif
+#ifdef __DISABLE_FMA__
+#undef __DISABLE_FMA__
+#pragma GCC pop_options
+#endif /* __DISABLE_FMA__ */
 #endif
Index: ipa-inline.c
===================================================================
--- ipa-inline.c (revision 198950)
+++ ipa-inline.c (working copy)
@@ -374,7 +374,33 @@ can_early_inline_edge_p (struct cgraph_edge *e)
       return false;
     }
   if (!can_inline_edge_p (e, true))
-    return false;
+    {
+      enum availability avail;
+      struct cgraph_node *callee
+        = cgraph_function_or_thunk_node (e->callee, &avail);
+      /* Flag an error when the inlining cannot happen because of target option
+         mismatch but the callee is marked as "always_inline".
In -O0 mode
+         this will go undetected because the error flagged in
+         "expand_call_inline" in tree-inline.c might not execute and the
+         inlining will not happen. Then, the linker could complain about a
+         missing body for the callee if it turned out that the callee was
+         also marked "gnu_inline" with extern inline keyword as bodies of such
+         functions are not generated. */
+      if ((!optimize
+           || flag_no_inline)
+          && e->inline_failed == CIF_TARGET_OPTION_MISMATCH
+          && lookup_attribute ("always_inline", DECL_ATTRIBUTES (callee->symbol.decl))
+          && lookup_attribute ("gnu_inline", DECL_ATTRIBUTES (callee->symbol.decl))
+          && DECL_EXTERNAL (callee->symbol.decl)
+          && DECL_DECLARED_INLINE_P (callee->symbol.decl))
+        {
+          error ("inlining failed in call to extern gnu_inline %q+F: %s",
+                 callee->symbol.decl,
+                 cgraph_inline_failed_string (CIF_TARGET_OPTION_MISMATCH));
+          error ("called from here");
+        }
+      return false;
+    }
   return true;
 }
Index: common/config/i386/i386-common.c
===================================================================
--- common/config/i386/i386-common.c (revision 198950)
+++ common/config/i386/i386-common.c (working copy)
@@ -87,6 +87,7 @@ along with GCC; see the file COPYING3. If not see
 #define OPTION_MASK_ISA_BMI_SET OPTION_MASK_ISA_BMI
 #define OPTION_MASK_ISA_BMI2_SET OPTION_MASK_ISA_BMI2
+#define OPTION_MASK_ISA_LZCNT_SET OPTION_MASK_ISA_LZCNT
 #define OPTION_MASK_ISA_TBM_SET OPTION_MASK_ISA_TBM
 #define OPTION_MASK_ISA_POPCNT_SET OPTION_MASK_ISA_POPCNT
 #define OPTION_MASK_ISA_CX16_SET OPTION_MASK_ISA_CX16
@@ -154,6 +155,7 @@ along with GCC; see the file COPYING3.
If not see
 #define OPTION_MASK_ISA_ABM_UNSET OPTION_MASK_ISA_ABM
 #define OPTION_MASK_ISA_BMI_UNSET OPTION_MASK_ISA_BMI
 #define OPTION_MASK_ISA_BMI2_UNSET OPTION_MASK_ISA_BMI2
+#define OPTION_MASK_ISA_LZCNT_UNSET OPTION_MASK_ISA_LZCNT
 #define OPTION_MASK_ISA_TBM_UNSET OPTION_MASK_ISA_TBM
 #define OPTION_MASK_ISA_POPCNT_UNSET OPTION_MASK_ISA_POPCNT
 #define OPTION_MASK_ISA_CX16_UNSET OPTION_MASK_ISA_CX16
@@ -438,6 +440,19 @@ ix86_handle_option (struct gcc_options *opts,
         }
       return true;
+    case OPT_mlzcnt:
+      if (value)
+        {
+          opts->x_ix86_isa_flags |= OPTION_MASK_ISA_LZCNT_SET;
+          opts->x_ix86_isa_flags_explicit |= OPTION_MASK_ISA_LZCNT_SET;
+        }
+      else
+        {
+          opts->x_ix86_isa_flags &= ~OPTION_MASK_ISA_LZCNT_UNSET;
+          opts->x_ix86_isa_flags_explicit |= OPTION_MASK_ISA_LZCNT_UNSET;
+        }
+      return true;
+
     case OPT_mtbm:
       if (value)
         {
Index: testsuite/gcc.target/i386/intrinsics_4.c
===================================================================
--- testsuite/gcc.target/i386/intrinsics_4.c (revision 0)
+++ testsuite/gcc.target/i386/intrinsics_4.c (revision 0)
@@ -0,0 +1,14 @@
+/* Test case to check if AVX intrinsics and function specific target
+   optimizations work together. Check by including immintrin.h */
+
+/* { dg-do compile } */
+/* { dg-options "-O2 -msse -mno-avx" } */
+
+#include
+
+__m256 a[10], b[10], c[10];
+void __attribute__((target ("avx")))
+foo (void)
+{
+  a[0] = _mm256_and_ps (b[0], c[0]);
+}
Index: testsuite/gcc.target/i386/intrinsics_1.c
===================================================================
--- testsuite/gcc.target/i386/intrinsics_1.c (revision 0)
+++ testsuite/gcc.target/i386/intrinsics_1.c (revision 0)
@@ -0,0 +1,13 @@
+/* Test case to check if intrinsics and function specific target
+   optimizations work together.
Check by including x86intrin.h */
+
+/* { dg-do compile } */
+/* { dg-options "-O2 -msse -mno-sse4.1 -mno-sse4.2" } */
+
+#include
+
+__attribute__((target("sse4.2")))
+__m128i foo(__m128i *V)
+{
+  return _mm_stream_load_si128(V);
+}
Index: testsuite/gcc.target/i386/intrinsics_5.c
===================================================================
--- testsuite/gcc.target/i386/intrinsics_5.c (revision 0)
+++ testsuite/gcc.target/i386/intrinsics_5.c (revision 0)
@@ -0,0 +1,17 @@
+/* Test case to check if intrinsics and function specific target
+   optimizations work together. Check if an error is issued in
+   -O0 mode when foo calls an intrinsic without the right target
+   attribute. */
+
+/* { dg-do compile } */
+/* { dg-options "-O2 -msse -mno-sse4.1 -mno-sse4.2" } */
+
+#include
+
+//__attribute__((target("sse4.2")))
+__m128i foo(__m128i *V)
+{
+  return _mm_stream_load_si128(V); /* { dg-error "called from here" } */
+}
+
+/* { dg-prune-output ".*inlining failed.*" } */
Index: testsuite/gcc.target/i386/intrinsics_2.c
===================================================================
--- testsuite/gcc.target/i386/intrinsics_2.c (revision 0)
+++ testsuite/gcc.target/i386/intrinsics_2.c (revision 0)
@@ -0,0 +1,13 @@
+/* Test case to check if intrinsics and function specific target
+   optimizations work together. Check by including immintrin.h */
+
+/* { dg-do compile } */
+/* { dg-options "-O2 -msse -mno-sse4.1" } */
+
+#include
+
+__attribute__((target("sse4.2")))
+__m128i foo(__m128i *V)
+{
+  return _mm_stream_load_si128(V);
+}
Index: testsuite/gcc.target/i386/intrinsics_3.c
===================================================================
--- testsuite/gcc.target/i386/intrinsics_3.c (revision 0)
+++ testsuite/gcc.target/i386/intrinsics_3.c (revision 0)
@@ -0,0 +1,15 @@
+/* Test case to check if intrinsics and function specific target
+   optimizations work together.
Check if the POPCNT specific intrinsics
+   included with popcntintrin.h get enabled by directly including
+   popcntintrin.h */
+
+/* { dg-do compile } */
+/* { dg-options "-O2 -msse -mno-sse4.1 -mno-sse4.2 -mno-popcnt" } */
+
+#include
+
+__attribute__((target("popcnt")))
+long long foo(unsigned long long X)
+{
+  return _mm_popcnt_u64 (X);
+}
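
For readers who want to try the header guard pattern outside the GCC tree, here is a minimal standalone sketch, not part of the patch. It assumes GCC on x86 (other compilers may ignore or reject the pragmas, and non-x86 targets take the portable path). The names DEMO_X86, DEMO_DISABLE_POPCNT, demo_popcount_portable, demo_popcount_hw and demo_count_bits are hypothetical and exist only for illustration. The push_options / target / pop_options sequence mirrors what the patched headers do, and the runtime check with __builtin_cpu_supports is one possible way a caller might choose between the two versions.

```c
#include <assert.h>

#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
#define DEMO_X86 1
#else
#define DEMO_X86 0
#endif

/* Portable fallback: clears the lowest set bit once per iteration. */
static int demo_popcount_portable(unsigned int x)
{
    int n = 0;
    while (x) {
        x &= x - 1;
        n++;
    }
    return n;
}

#if DEMO_X86
/* Same guard shape as the patched headers: if POPCNT is not enabled on the
   command line, enable it only for the definitions below.
   DEMO_DISABLE_POPCNT is a hypothetical macro name. */
#ifndef __POPCNT__
#pragma GCC push_options
#pragma GCC target("popcnt")
#define DEMO_DISABLE_POPCNT
#endif

/* Compiled with POPCNT enabled, whatever the file-level options are. */
static int demo_popcount_hw(unsigned int x)
{
    return __builtin_popcount(x);
}

/* Restore the options that were in effect before the guard. */
#ifdef DEMO_DISABLE_POPCNT
#undef DEMO_DISABLE_POPCNT
#pragma GCC pop_options
#endif
#endif /* DEMO_X86 */

/* Dispatcher: takes the POPCNT version only when the CPU reports support. */
int demo_count_bits(unsigned int x)
{
#if DEMO_X86
    if (__builtin_cpu_supports("popcnt")) {
        int hw = demo_popcount_hw(x);
        /* Debug-only cross-check against the portable version. */
        assert(hw == demo_popcount_portable(x));
        return hw;
    }
#endif
    return demo_popcount_portable(x);
}
```

The helper is deliberately not marked always_inline, so a caller compiled without POPCNT simply makes an ordinary call instead of hitting the target option mismatch that the ipa-inline.c change reports. Calling demo_count_bits (0xF0F0u) should return 8 on either path.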