GCC does not support *mmintrin.h with function specific opts

Submitter Sriraman Tallam
Date April 29, 2013, 5:47 p.m.
Message ID <CAAs8HmyBKhjMKyu7aFqn2AU4RGXoCvLP=AwKV+hVS31Wb=FuRg@mail.gmail.com>
Permalink /patch/240426/
State New

Comments

Sriraman Tallam - April 29, 2013, 5:47 p.m.
On Thu, Apr 25, 2013 at 12:41 PM, Joseph S. Myers
<joseph@codesourcery.com> wrote:
> On Tue, 16 Apr 2013, Sriraman Tallam wrote:
>
>> Ok, it is on by default now.  There is a way to turn it off, with
>> -mno-generate-builtins.
>
> Any new option needs documenting in invoke.texi.

Added and new patch attached.

Thanks
Sri

>
> --
> Joseph S. Myers
> joseph@codesourcery.com
* config/i386/i386.c (construct_container): Do not issue SSE
	return error for extern gnu_inline functions.
	(def_builtin): Do not generate builtins when -mno-generate-builtins
	is used.
	* doc/invoke.texi: Document option -mgenerate-builtins.
	* config/i386/i386.opt (mgenerate-builtins): New target option.
	* config/i386/i386-c.c (ix86_target_macros_internal): Define macro
	__ALL_ISA__ when generate_target_builtins is true.
	* testsuite/gcc.target/i386/intrinsics_1.c: New test.
	* testsuite/gcc.target/i386/intrinsics_2.c: Ditto.
	* testsuite/gcc.target/i386/intrinsics_3.c: Ditto.
	* testsuite/gcc.target/i386/intrinsics_4.c: Ditto.
	* testsuite/gcc.target/i386/intrinsics_5.c: Ditto.
	* config/i386/lzcntintrin.h: Expose header when __ALL_ISA__ is defined.
	* config/i386/lwpintrin.h: Ditto.
	* config/i386/xopintrin.h: Ditto.
	* config/i386/fmaintrin.h: Ditto.
	* config/i386/bmiintrin.h: Ditto.
	* config/i386/fma4intrin.h: Ditto.
	* config/i386/nmmintrin.h: Ditto.
	* config/i386/tbmintrin.h: Ditto.
	* config/i386/smmintrin.h: Ditto.
	* config/i386/wmmintrin.h: Ditto.
	* config/i386/popcntintrin.h: Ditto.
	* config/i386/f16cintrin.h: Ditto.
	* config/i386/pmmintrin.h: Ditto.
	* config/i386/bmi2intrin.h: Ditto.
	* config/i386/tmmintrin.h: Ditto.
	* config/i386/xmmintrin.h: Ditto.
	* config/i386/mmintrin.h: Ditto.
	* config/i386/ammintrin.h: Ditto.
	* config/i386/emmintrin.h: Ditto.
Sriraman Tallam - May 2, 2013, 10:51 p.m.
Ping.

Sriraman Tallam - May 7, 2013, 9:41 p.m.
Ping.

Sriraman Tallam - May 9, 2013, 9:20 p.m.
cc:Diego

Sriraman Tallam - May 13, 2013, 9:21 p.m.
Ping.

H.J. Lu - May 13, 2013, 9:46 p.m.
On Mon, May 13, 2013 at 2:21 PM, Sriraman Tallam <tmsriram@google.com> wrote:
> Ping.
>
> On Thu, May 9, 2013 at 2:20 PM, Sriraman Tallam <tmsriram@google.com> wrote:
>> cc:Diego
>>
>> On Tue, May 7, 2013 at 2:41 PM, Sriraman Tallam <tmsriram@google.com> wrote:
>>> Ping.
>>>
>>> On Thu, May 2, 2013 at 3:51 PM, Sriraman Tallam <tmsriram@google.com> wrote:
>>>> Ping.
>>>>
>>>> On Mon, Apr 29, 2013 at 10:47 AM, Sriraman Tallam <tmsriram@google.com> wrote:
>>>>> On Thu, Apr 25, 2013 at 12:41 PM, Joseph S. Myers
>>>>> <joseph@codesourcery.com> wrote:
>>>>>> On Tue, 16 Apr 2013, Sriraman Tallam wrote:
>>>>>>
>>>>>>> Ok, it is on by default now.  There is a way to turn it off, with
>>>>>>> -mno-generate-builtins.
>>>>>>
>>>>>> Any new option needs documenting in invoke.texi.
>>>>>
>>>>> Added and new patch attached.
>>>>>
>>>>> Thanks
>>>>> Sri
>>>>>
>>>>>>
>>>>>> --
>>>>>> Joseph S. Myers
>>>>>> joseph@codesourcery.com

It looks good to me.  But I can't approve it.

Add Uros,

--
H.J.
Uros Bizjak - May 14, 2013, 6:58 a.m.
On Mon, May 13, 2013 at 11:46 PM, H.J. Lu <hjl.tools@gmail.com> wrote:

>>>>> On Mon, Apr 29, 2013 at 10:47 AM, Sriraman Tallam <tmsriram@google.com> wrote:
>>>>>> On Thu, Apr 25, 2013 at 12:41 PM, Joseph S. Myers
>>>>>> <joseph@codesourcery.com> wrote:
>>>>>>> On Tue, 16 Apr 2013, Sriraman Tallam wrote:
>>>>>>>
>>>>>>>> Ok, it is on by default now.  There is a way to turn it off, with
>>>>>>>> -mno-generate-builtins.
>>>>>>>
>>>>>>> Any new option needs documenting in invoke.texi.
>>>>>>
>>>>>> Added and new patch attached.
>>>>>>
>>>>>> Thanks
>>>>>> Sri
>>>>>>
>>>>>>>
>>>>>>> --
>>>>>>> Joseph S. Myers
>>>>>>> joseph@codesourcery.com
>
> It looks good to me.  But I can't approve it.

I think that the option should be named -mtarget-builtins.

Since HJ is OK with this user-visible change, the patch is OK for
mainline (with eventually renamed option). We also have an option to
switch this new functionality off, and we are early enough in the
development cycle to find out if anything is fundamentally broken with
this approach.

BTW, does this patch address the request in PR57202?

Thanks,
Uros.
Jakub Jelinek - May 14, 2013, 8:39 a.m.
On Tue, May 14, 2013 at 08:58:55AM +0200, Uros Bizjak wrote:
> I think that the option should be named -mtarget-builtins.

There shouldn't be an option for it at all.  If constructing the builtins is
slow (it is), we should just create them lazily.  The
*builtin_decl_{explicit,implicit}* APIs were a first step for that; in addition,
we should build a gperf table of all the generic and all the target
builtins and record the prefixes used to find them (__builtin_, __sync_,
__atomic_, what else?).  Then we just teach the FEs that when they look up
a symbol prefixed with one of these, they should also look it up
in the perfect hash table and, if it is found there, call some function to
construct the builtin.  Of course, this isn't a prerequisite of the
changes you are looking for, but introducing an option that hopefully will
be completely useless in a few months is just a bad idea.
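
The lazy scheme described above can be sketched roughly as follows. This is only an illustration and not GCC code: the struct, the table contents, and the function names are hypothetical, and a sorted table searched with bsearch stands in for the gperf-generated perfect hash.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical record for one builtin; GCC's real tables look different.  */
struct lazy_builtin {
    const char *name;   /* full symbol name */
    int constructed;    /* nonzero once the declaration has been built */
};

/* Kept sorted in strcmp order so that bsearch can stand in for the
   gperf perfect hash mentioned above.  */
struct lazy_builtin builtin_table[] = {
    { "__atomic_load_n", 0 },
    { "__builtin_ia32_andps256", 0 },
    { "__builtin_popcount", 0 },
    { "__sync_synchronize", 0 },
};

/* Prefixes that mark a name as a possible builtin.  */
static const char *const builtin_prefixes[] = {
    "__builtin_", "__sync_", "__atomic_"
};

/* Counts how many builtins were actually constructed (for illustration).  */
int constructed_count;

static int has_builtin_prefix(const char *name)
{
    size_t n = sizeof builtin_prefixes / sizeof builtin_prefixes[0];
    for (size_t i = 0; i < n; i++)
        if (strncmp(name, builtin_prefixes[i],
                    strlen(builtin_prefixes[i])) == 0)
            return 1;
    return 0;
}

static int compare_name(const void *key, const void *elem)
{
    const struct lazy_builtin *b = elem;
    return strcmp((const char *) key, b->name);
}

/* Called on symbol lookup: returns the builtin, constructing it on first
   use, or NULL if the name is not a known builtin.  */
struct lazy_builtin *lookup_builtin_lazily(const char *name)
{
    if (!has_builtin_prefix(name))
        return NULL;    /* cheap rejection for ordinary identifiers */

    struct lazy_builtin *b =
        bsearch(name, builtin_table,
                sizeof builtin_table / sizeof builtin_table[0],
                sizeof builtin_table[0], compare_name);
    if (b != NULL && !b->constructed) {
        /* Stand-in for "call some function to construct the builtin".  */
        b->constructed = 1;
        constructed_count++;
    }
    return b;
}
```

In this sketch an ordinary identifier such as printf is rejected by the prefix test alone, and each builtin is constructed at most once, on its first lookup.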

> Since HJ is OK with this user-visible change, the patch is OK for
> mainline (with eventually renamed option). We also have an option to
> switch this new functionality off, and we are early enough in the
> development cycle to find out if anything is fundamentally broken with
> this approach.
> 
> BTW, does this patch address the request in PR57202?

I'm strongly against the patch in its current form, it is a hack rather
than a solution.  I don't see how it could be properly tested, when say
immintrin.h and x86intrin.h are still full of code like:
#ifdef __AVX__
#include <avxintrin.h>
#endif
so when you just
#include <x86intrin.h>
rather than the few headers that were tested, you are out of luck.

The right solution is really IMNSHO something along the lines of:
#ifndef __AVX__
#pragma GCC push_options
#pragma GCC target("avx")
#define __AVXINTRIN_H_POP_OPTIONS__
#endif
...
#ifdef __AVXINTRIN_H_POP_OPTIONS__
#pragma GCC pop_options
#undef __AVXINTRIN_H_POP_OPTIONS__
#endif

around the headers.  As can be seen on say -O2 -mno-avx:
#ifndef __AVX__
#pragma GCC push_options
#pragma GCC target("avx")
#define __DISABLE_AVX__
#endif
typedef float __v8sf __attribute__((vector_size (32)));
extern __v8sf a, b;
extern __inline int __attribute__((__gnu_inline__, __always_inline__, __artificial__)) bar (int x) { a = b + 1.0f; return x + 1; }
#ifdef __DISABLE_AVX__
#pragma GCC pop_options
#undef __DISABLE_AVX__
#endif
int __attribute__((target ("avx2"))) avx2 (int x) { return bar (x) + bar (x + 1); }
int __attribute__((target ("avx"))) avx (int x) { return bar (x) + bar (x + 1); }
int __attribute__((target ("xop"))) xop (int x) { return bar (x) + bar (x + 1); }
int __attribute__((target ("sse2"))) sse2 (int x) { return bar (x) + bar (x + 1); }
int nothing (int x) { return bar (x) + bar (x + 1); }
the inliner happily inlines the avx target function into the avx/avx2/xop
targeted functions (correct), inlines it even into sse2 (something that
should be fixed), and doesn't inline it into nothing (IMHO correct; we
want an error in that case, since one shouldn't use avx intrinsics in, say,
an sse2-only targeted function unless -mavx is used on the command line).
That is, the check should always be whether the caller's target set is
equal to or a superset of the callee's target set.
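
The rule proposed here (inline only when the caller's target set is equal to or a superset of the callee's) can be sketched as a bitmask test. This is a minimal illustration, not GCC's implementation: the flag names and values are hypothetical, and it assumes each mask already includes the ISAs it implies (for example, the avx2 mask contains the avx bit).

```c
#include <stdbool.h>

/* Hypothetical single-bit ISA flags; not GCC's OPTION_MASK_ISA_* values.  */
enum {
    ISA_SSE2 = 1 << 0,
    ISA_AVX  = 1 << 1,
    ISA_AVX2 = 1 << 2,
    ISA_XOP  = 1 << 3
};

/* Assumption: each target set is closed under implication.  */
enum {
    SET_NONE = 0,
    SET_SSE2 = ISA_SSE2,
    SET_AVX  = ISA_AVX | SET_SSE2,
    SET_AVX2 = ISA_AVX2 | SET_AVX,
    SET_XOP  = ISA_XOP | SET_AVX
};

/* Inlining is allowed only if every ISA the callee needs is also enabled
   in the caller, i.e. the callee's set is a subset of the caller's.  */
bool can_inline_into(unsigned caller_isa, unsigned callee_isa)
{
    return (callee_isa & ~caller_isa) == 0;
}
```

With these masks, an avx callee such as bar may be inlined into the avx, avx2 and xop callers, but not into the sse2 caller or into a caller with no ISA enabled, which matches the behaviour asked for above.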

When trying with -O2 -mno-avx:
#ifndef __AVX__
#pragma GCC push_options
#pragma GCC target("avx")
#define __DISABLE_AVX__
#endif
typedef float __v8sf __attribute__ ((__vector_size__ (32)));
typedef float __m256 __attribute__ ((__vector_size__ (32), __may_alias__));
extern __inline __m256 __attribute__((__gnu_inline__, __always_inline__, __artificial__))
_mm256_and_ps (__m256 __A, __m256 __B) { return (__m256) __builtin_ia32_andps256 ((__v8sf)__A, (__v8sf)__B); }
#ifdef __DISABLE_AVX__
#pragma GCC pop_options
#undef __DISABLE_AVX__
#endif
__m256 a, b, c;
void __attribute__((target ("avx")))
foo (void)
{
  a = _mm256_and_ps (b, c);
}
we get bogus errors and ICE:
tty2.c: In function '_mm256_and_ps':
tty2.c:9:1: note: The ABI for passing parameters with 32-byte alignment has changed in GCC 4.6
tty2.c: In function 'foo':
tty2.c:9:82: error: '__builtin_ia32_andps256' needs isa option -m32
tty2.c:9:82: internal compiler error: in emit_move_insn, at expr.c:3486
0x77a3d2 emit_move_insn(rtx_def*, rtx_def*)
	../../gcc/expr.c:3485
(I have added "1 ||" instead of your generate_builtins into i386.c
(def_builtin)).  That just shows that target attribute/pragma support still
has very severe issues that need to be fixed, instead of papered over.

Note, we ICE on:
#pragma GCC target ("mavx")
That should be fixed too.

	Jakub
Uros Bizjak - May 14, 2013, 9:42 a.m.
On Tue, May 14, 2013 at 10:39 AM, Jakub Jelinek <jakub@redhat.com> wrote:
> On Tue, May 14, 2013 at 08:58:55AM +0200, Uros Bizjak wrote:
>> I think that the option should be named -mtarget-builtins.
>
> There shouldn't be an option for it at all.  If constructing the builtins is
> slow (it is), we should just create them lazily, the
> *builtin_decl_{explicit,implicit}* APIs were a first step for that, plus
> we should build some gperf table of all the generic and all the target
> builtins and record prefixes used to find them (__builtin_, __sync_,
> __atomic_, what else?), then just teach FE that if they are looking up
> a symbol prefixed with one of these, they should also look it up
> in the perfect hash table and if found there, call some function to
> construct the builtin.  Of course, this isn't a prerequisite of the
> changes you are looking for, but introducing an option that hopefully will
> be completely useless in a few months is just a bad idea.
>
>> Since HJ is OK with this user-visible change, the patch is OK for
>> mainline (with eventually renamed option). We also have an option to
>> switch this new functionality off, and we are early enough in the
>> development cycle to find out if anything is fundamentally broken with
>> this approach.
>>
>> BTW, does this patch address the request in PR57202?
>
> I'm strongly against the patch in its current form, it is a hack rather
> than a solution.  I don't see how it could be properly tested, when say
> immintrin.h and x86intrin.h are still full of code like:
> #ifdef __AVX__
> #include <avxintrin.h>
> #endif
> so when you just
> #include <x86intrin.h>
> rather than the few headers that were tested, you are out of luck.

Jakub, thanks for your thorough analysis.

It looks like the approach in the patch is fundamentally flawed and
that the infrastructure is not developed/fixed enough for an alternative
solution.

Based on the concerns expressed, I retract my previous approval.

Thanks,
Uros.

Patch

Index: config/i386/smmintrin.h
===================================================================
--- config/i386/smmintrin.h	(revision 198212)
+++ config/i386/smmintrin.h	(working copy)
@@ -27,7 +27,7 @@ 
 #ifndef _SMMINTRIN_H_INCLUDED
 #define _SMMINTRIN_H_INCLUDED
 
-#ifndef __SSE4_1__
+#if !defined (__SSE4_1__) && !defined (__ALL_ISA__)
 # error "SSE4.1 instruction set not enabled"
 #else
 
Index: config/i386/f16cintrin.h
===================================================================
--- config/i386/f16cintrin.h	(revision 198212)
+++ config/i386/f16cintrin.h	(working copy)
@@ -25,7 +25,7 @@ 
 # error "Never use <f16intrin.h> directly; include <x86intrin.h> or <immintrin.h> instead."
 #endif
 
-#ifndef __F16C__
+#if !defined (__F16C__) && !defined (__ALL_ISA__)
 # error "F16C instruction set not enabled"
 #else
 
Index: config/i386/wmmintrin.h
===================================================================
--- config/i386/wmmintrin.h	(revision 198212)
+++ config/i386/wmmintrin.h	(working copy)
@@ -30,7 +30,7 @@ 
 /* We need definitions from the SSE2 header file.  */
 #include <emmintrin.h>
 
-#if !defined (__AES__) && !defined (__PCLMUL__)
+#if !defined (__AES__) && !defined (__PCLMUL__) && !defined (__ALL_ISA__)
 # error "AES/PCLMUL instructions not enabled"
 #else
 
Index: config/i386/bmi2intrin.h
===================================================================
--- config/i386/bmi2intrin.h	(revision 198212)
+++ config/i386/bmi2intrin.h	(working copy)
@@ -25,7 +25,7 @@ 
 # error "Never use <bmi2intrin.h> directly; include <x86intrin.h> instead."
 #endif
 
-#ifndef __BMI2__
+#if !defined (__BMI2__) && !defined (__ALL_ISA__)
 # error "BMI2 instruction set not enabled"
 #endif /* __BMI2__ */
 
Index: config/i386/pmmintrin.h
===================================================================
--- config/i386/pmmintrin.h	(revision 198212)
+++ config/i386/pmmintrin.h	(working copy)
@@ -27,7 +27,7 @@ 
 #ifndef _PMMINTRIN_H_INCLUDED
 #define _PMMINTRIN_H_INCLUDED
 
-#ifndef __SSE3__
+#if !defined (__SSE3__) && !defined (__ALL_ISA__)
 # error "SSE3 instruction set not enabled"
 #else
 
Index: config/i386/lzcntintrin.h
===================================================================
--- config/i386/lzcntintrin.h	(revision 198212)
+++ config/i386/lzcntintrin.h	(working copy)
@@ -25,7 +25,7 @@ 
 # error "Never use <lzcntintrin.h> directly; include <x86intrin.h> instead."
 #endif
 
-#ifndef __LZCNT__
+#if !defined (__LZCNT__) && !defined (__ALL_ISA__)
 # error "LZCNT instruction is not enabled"
 #endif /* __LZCNT__ */
 
Index: config/i386/tmmintrin.h
===================================================================
--- config/i386/tmmintrin.h	(revision 198212)
+++ config/i386/tmmintrin.h	(working copy)
@@ -27,7 +27,7 @@ 
 #ifndef _TMMINTRIN_H_INCLUDED
 #define _TMMINTRIN_H_INCLUDED
 
-#ifndef __SSSE3__
+#if !defined (__SSSE3__) && !defined (__ALL_ISA__)
 # error "SSSE3 instruction set not enabled"
 #else
 
Index: config/i386/xmmintrin.h
===================================================================
--- config/i386/xmmintrin.h	(revision 198212)
+++ config/i386/xmmintrin.h	(working copy)
@@ -27,7 +27,7 @@ 
 #ifndef _XMMINTRIN_H_INCLUDED
 #define _XMMINTRIN_H_INCLUDED
 
-#ifndef __SSE__
+#if !defined (__SSE__) && !defined (__ALL_ISA__)
 # error "SSE instruction set not enabled"
 #else
 
Index: config/i386/i386.c
===================================================================
--- config/i386/i386.c	(revision 198212)
+++ config/i386/i386.c	(working copy)
@@ -6384,8 +6384,13 @@  construct_container (enum machine_mode mode, enum
     return NULL;
 
   /* We allowed the user to turn off SSE for kernel mode.  Don't crash if
-     some less clueful developer tries to use floating-point anyway.  */
-  if (needed_sseregs && !TARGET_SSE)
+     some less clueful developer tries to use floating-point anyway.  It is
+     alright if this is in an extern "gnu_inline" function, as it is the
+     caller that matters in this case.  */
+  if (needed_sseregs && !TARGET_SSE
+      && !(DECL_EXTERNAL (current_function_decl)
+           && lookup_attribute ("gnu_inline",
+		DECL_ATTRIBUTES (current_function_decl)) != NULL))
     {
       if (in_return)
 	{
@@ -26823,7 +26828,8 @@  def_builtin (HOST_WIDE_INT mask, const char *name,
       ix86_builtins_isa[(int) code].isa = mask;
 
       mask &= ~OPTION_MASK_ISA_64BIT;
-      if (mask == 0
+      if (generate_target_builtins
+	  || mask == 0
 	  || (mask & ix86_isa_flags) != 0
 	  || (lang_hooks.builtin_function
 	      == lang_hooks.builtin_function_ext_scope))
Index: config/i386/ammintrin.h
===================================================================
--- config/i386/ammintrin.h	(revision 198212)
+++ config/i386/ammintrin.h	(working copy)
@@ -27,7 +27,7 @@ 
 #ifndef _AMMINTRIN_H_INCLUDED
 #define _AMMINTRIN_H_INCLUDED
 
-#ifndef __SSE4A__
+#if !defined (__SSE4A__) && !defined (__ALL_ISA__)
 # error "SSE4A instruction set not enabled"
 #else
 
Index: config/i386/emmintrin.h
===================================================================
--- config/i386/emmintrin.h	(revision 198212)
+++ config/i386/emmintrin.h	(working copy)
@@ -27,7 +27,7 @@ 
 #ifndef _EMMINTRIN_H_INCLUDED
 #define _EMMINTRIN_H_INCLUDED
 
-#ifndef __SSE2__
+#if !defined (__SSE2__) && !defined (__ALL_ISA__)
 # error "SSE2 instruction set not enabled"
 #else
 
Index: config/i386/lwpintrin.h
===================================================================
--- config/i386/lwpintrin.h	(revision 198212)
+++ config/i386/lwpintrin.h	(working copy)
@@ -28,7 +28,7 @@ 
 #ifndef _LWPINTRIN_H_INCLUDED
 #define _LWPINTRIN_H_INCLUDED
 
-#ifndef __LWP__
+#if !defined (__LWP__) && !defined (__ALL_ISA__)
 # error "LWP instruction set not enabled"
 #else
 
Index: config/i386/xopintrin.h
===================================================================
--- config/i386/xopintrin.h	(revision 198212)
+++ config/i386/xopintrin.h	(working copy)
@@ -28,7 +28,7 @@ 
 #ifndef _XOPMMINTRIN_H_INCLUDED
 #define _XOPMMINTRIN_H_INCLUDED
 
-#ifndef __XOP__
+#if !defined (__XOP__) && !defined (__ALL_ISA__)
 # error "XOP instruction set not enabled"
 #else
 
Index: config/i386/mmintrin.h
===================================================================
--- config/i386/mmintrin.h	(revision 198212)
+++ config/i386/mmintrin.h	(working copy)
@@ -27,7 +27,7 @@ 
 #ifndef _MMINTRIN_H_INCLUDED
 #define _MMINTRIN_H_INCLUDED
 
-#ifndef __MMX__
+#if !defined (__MMX__) && !defined (__ALL_ISA__)
 # error "MMX instruction set not enabled"
 #else
 /* The Intel API is flexible enough that we must allow aliasing with other
Index: config/i386/i386-c.c
===================================================================
--- config/i386/i386-c.c	(revision 198212)
+++ config/i386/i386-c.c	(working copy)
@@ -54,6 +54,9 @@  ix86_target_macros_internal (HOST_WIDE_INT isa_fla
   int last_arch_char = ix86_arch_string[arch_len - 1];
   int last_tune_char = ix86_tune_string[tune_len - 1];
 
+  if (generate_target_builtins)
+    def_or_undef (parse_in, "__ALL_ISA__");
+
   /* Built-ins based on -march=.  */
   switch (arch)
     {
Index: config/i386/i386.opt
===================================================================
--- config/i386/i386.opt	(revision 198212)
+++ config/i386/i386.opt	(working copy)
@@ -640,3 +640,7 @@  Enum(stack_protector_guard) String(tls) Value(SSP_
 
 EnumValue
 Enum(stack_protector_guard) String(global) Value(SSP_GLOBAL)
+
+mgenerate-builtins
+Target Report Var(generate_target_builtins) Save Init(1)
+Generate all target builtins that are otherwise only generated when the appropriate ISA is turned on.
Index: config/i386/fma4intrin.h
===================================================================
--- config/i386/fma4intrin.h	(revision 198212)
+++ config/i386/fma4intrin.h	(working copy)
@@ -28,7 +28,7 @@ 
 #ifndef _FMA4INTRIN_H_INCLUDED
 #define _FMA4INTRIN_H_INCLUDED
 
-#ifndef __FMA4__
+#if !defined (__FMA4__) && !defined (__ALL_ISA__)
 # error "FMA4 instruction set not enabled"
 #else
 
Index: config/i386/popcntintrin.h
===================================================================
--- config/i386/popcntintrin.h	(revision 198212)
+++ config/i386/popcntintrin.h	(working copy)
@@ -21,7 +21,7 @@ 
    see the files COPYING3 and COPYING.RUNTIME respectively.  If not, see
    <http://www.gnu.org/licenses/>.  */
 
-#ifndef __POPCNT__
+#if !defined (__POPCNT__) && !defined (__ALL_ISA__)
 # error "POPCNT instruction set not enabled"
 #endif /* __POPCNT__ */
 
Index: config/i386/fmaintrin.h
===================================================================
--- config/i386/fmaintrin.h	(revision 198212)
+++ config/i386/fmaintrin.h	(working copy)
@@ -28,7 +28,7 @@ 
 #ifndef _FMAINTRIN_H_INCLUDED
 #define _FMAINTRIN_H_INCLUDED
 
-#ifndef __FMA__
+#if !defined (__FMA__) && !defined (__ALL_ISA__)
 # error "FMA instruction set not enabled"
 #else
 
Index: config/i386/bmiintrin.h
===================================================================
--- config/i386/bmiintrin.h	(revision 198212)
+++ config/i386/bmiintrin.h	(working copy)
@@ -25,7 +25,7 @@ 
 # error "Never use <bmiintrin.h> directly; include <x86intrin.h> instead."
 #endif
 
-#ifndef __BMI__
+#if !defined (__BMI__) && !defined (__ALL_ISA__)
 # error "BMI instruction set not enabled"
 #endif /* __BMI__ */
 
Index: config/i386/nmmintrin.h
===================================================================
--- config/i386/nmmintrin.h	(revision 198212)
+++ config/i386/nmmintrin.h	(working copy)
@@ -27,7 +27,7 @@ 
 #ifndef _NMMINTRIN_H_INCLUDED
 #define _NMMINTRIN_H_INCLUDED
 
-#ifndef __SSE4_2__
+#if !defined (__SSE4_2__) && !defined (__ALL_ISA__)
 # error "SSE4.2 instruction set not enabled"
 #else
 /* We just include SSE4.1 header file.  */
Index: config/i386/tbmintrin.h
===================================================================
--- config/i386/tbmintrin.h	(revision 198212)
+++ config/i386/tbmintrin.h	(working copy)
@@ -25,7 +25,7 @@ 
 # error "Never use <tbmintrin.h> directly; include <x86intrin.h> instead."
 #endif
 
-#ifndef __TBM__
+#if !defined (__TBM__) && !defined (__ALL_ISA__)
 # error "TBM instruction set not enabled"
 #endif /* __TBM__ */
 
Index: testsuite/gcc.target/i386/intrinsics_3.c
===================================================================
--- testsuite/gcc.target/i386/intrinsics_3.c	(revision 0)
+++ testsuite/gcc.target/i386/intrinsics_3.c	(revision 0)
@@ -0,0 +1,11 @@ 
+/* Using vector types without SSE enabled should generate a warning.  */
+
+/* { dg-do compile } */
+/* { dg-options "-O2 -mno-sse" } */
+
+typedef long long  _m128i  __attribute__((vector_size(16),__may_alias__));
+
+int foo (_m128i V) /* { dg-warning "SSE vector argument without SSE enabled changes the ABI" } */
+{
+  return 0;
+}
Index: testsuite/gcc.target/i386/intrinsics_4.c
===================================================================
--- testsuite/gcc.target/i386/intrinsics_4.c	(revision 0)
+++ testsuite/gcc.target/i386/intrinsics_4.c	(revision 0)
@@ -0,0 +1,11 @@ 
+/* Test to check if a target specific builtin used in a function without the
+   appropriate ISA support generates an error.  */
+
+/* { dg-do compile } */
+/* { dg-options "-O2 -mno-sse4.1" } */
+
+#include <smmintrin.h>
+__m128i foo(__m128i *V)
+{
+    return __builtin_ia32_movntdqa (V); /* { dg-error "'__builtin_ia32_movntdqa' needs isa option -m32 -msse4.1" } */
+}
Index: testsuite/gcc.target/i386/intrinsics_1.c
===================================================================
--- testsuite/gcc.target/i386/intrinsics_1.c	(revision 0)
+++ testsuite/gcc.target/i386/intrinsics_1.c	(revision 0)
@@ -0,0 +1,13 @@ 
+/* Test case to check if intrinsics and function specific target
+   optimizations work together.  */
+
+/* { dg-do compile } */
+/* { dg-options "-O2 -msse -mno-sse4.1" } */
+
+#include <smmintrin.h>
+
+__attribute__((target("sse4.1")))
+__m128i foo(__m128i *V)
+{
+    return _mm_stream_load_si128(V);
+}
Index: testsuite/gcc.target/i386/intrinsics_5.c
===================================================================
--- testsuite/gcc.target/i386/intrinsics_5.c	(revision 0)
+++ testsuite/gcc.target/i386/intrinsics_5.c	(revision 0)
@@ -0,0 +1,13 @@ 
+/* Test case to check if -mno-generate-builtins will break use of intrinsics
+   when the appropriate ISA is not specified.  */
+
+/* { dg-do compile } */
+/* { dg-options "-O2 -mno-generate-builtins -mno-sse4.1" } */
+
+#include <smmintrin.h>
+__m128i foo(__m128i *V) /* { dg-error "unknown type name" } */
+{
+    return V;
+}
+
+/* { dg-excess-errors "\"SSE4.1 instruction set not enabled\"" } */
Index: testsuite/gcc.target/i386/intrinsics_2.c
===================================================================
--- testsuite/gcc.target/i386/intrinsics_2.c	(revision 0)
+++ testsuite/gcc.target/i386/intrinsics_2.c	(revision 0)
@@ -0,0 +1,19 @@ 
+/* It is OK to have SSE4.1 code in non-SSE4.1 functions marked as
+   extern "gnu_inline".  */
+
+/* { dg-do compile } */
+/* { dg-options "-O2 -msse -mno-sse4.1" } */
+
+#include <smmintrin.h>
+
+extern __inline __attribute__ ((__gnu_inline__))
+__m128i bar (__m128i *V)
+{
+  return _mm_stream_load_si128(V);
+}
+
+__attribute__((target("sse4.1")))
+__m128i foo(__m128i *V)
+{
+  return bar (V);
+}
Index: doc/invoke.texi
===================================================================
--- doc/invoke.texi	(revision 198212)
+++ doc/invoke.texi	(working copy)
@@ -658,7 +658,8 @@  Objective-C and Objective-C++ Dialects}.
 -m32 -m64 -mx32 -mlarge-data-threshold=@var{num} @gol
 -msse2avx -mfentry -m8bit-idiv @gol
 -mavx256-split-unaligned-load -mavx256-split-unaligned-store @gol
--mstack-protector-guard=@var{guard}}
+-mstack-protector-guard=@var{guard} @gol
+-mgenerate-builtins}
 
 @emph{i386 and x86-64 Windows Options}
 @gccoptlist{-mconsole -mcygwin -mno-cygwin -mdll @gol
@@ -14604,6 +14605,13 @@  locations are @samp{global} for global canary or @
 canary in the TLS block (the default).  This option has effect only when
 @option{-fstack-protector} or @option{-fstack-protector-all} is specified.
 
+@item -mgenerate-builtins
+@itemx -mno-generate-builtins
+@opindex mgenerate-builtins
+Generate all target builtins that are otherwise only generated when the
+appropriate ISA is turned on.  This is turned on by default.  Use
+@option{-mno-generate-builtins} to turn this off.
+
 @end table
 
 These @samp{-m} switches are supported in addition to the above