GCC does not support *mmintrin.h with function specific opts

Message ID CAAs8HmwMYGGB80J1-YxSK4tY_hn7i65Tw4DGbaq9RpA9M4vPRQ@mail.gmail.com
State New

Commit Message

Sriraman Tallam April 11, 2013, 7:05 p.m. UTC
Hi,

    The *mmintrin.h headers do not work with function-specific options.

Example 1:


#include <smmintrin.h>

__attribute__((target("sse4.1")))
__m128i foo(__m128i *V)
{
    return _mm_stream_load_si128(V);
}


$ g++ test.cc
smmintrin.h:31:3: error: #error "SSE4.1 instruction set not enabled"
 # error "SSE4.1 instruction set not enabled"

This error happens even though foo is marked "sse4.1".

Two issues are at play here. First, the headers are guarded
by macros that are defined only when the target-specific options,
like -msse4.1 in this case, are present on the command line. Second, the
target-specific builtins, like __builtin_ia32_movntdqa called by
_mm_stream_load_si128, are exposed only in the presence of the
appropriate target ISA option.
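
The macro side of the problem can be seen in isolation. The following is a minimal sketch, assuming GCC or Clang: an ISA macro such as __SSE4_1__ is decided when the translation unit is preprocessed, so a function-level target attribute does not change what #ifdef sees inside that function. The helper names are hypothetical and exist only for this illustration.

```cpp
#include <cassert>

#if defined(__x86_64__) || defined(__i386__)
#define TARGET_SSE41 __attribute__((target("sse4.1")))
#else
#define TARGET_SSE41  // not an x86 target: the attribute is left out
#endif

// Hypothetical helper: reports whether __SSE4_1__ was defined when this
// function body was preprocessed.
bool macro_seen_in_plain_function() {
#ifdef __SSE4_1__
    return true;
#else
    return false;
#endif
}

// Same test inside a function that carries the target attribute.
TARGET_SSE41 bool macro_seen_in_targeted_function() {
#ifdef __SSE4_1__
    return true;
#else
    return false;
#endif
}

// Both bodies were preprocessed under the same command-line flags, so the
// two answers agree regardless of the attribute.
void check_macros_agree() {
    assert(macro_seen_in_plain_function() == macro_seen_in_targeted_function());
}
```

Compiled without -msse4.1, both helpers report that the macro is absent, which is why a header guard written as #ifndef __SSE4_1__ rejects the include before the attribute is ever considered.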


I have attached a patch that fixes this. I have added an option
"-mgenerate-builtins" that will do two things.  It will define a macro
"__ALL_ISA__" which will expose the *intrin.h functions. It will also
expose all the target specific builtins.  -mgenerate-builtins will not
affect code generation.

This feature will also greatly improve the usability of function multiversioning.
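
For context, this is the kind of code the change is meant to enable. It is a sketch, not part of the patch, and it assumes a compiler whose headers can be included without -msse4.1 (GCC 4.9 and later and Clang behave this way; the GCC revision discussed in this thread does not). The function names are hypothetical; __builtin_cpu_supports is the GCC/Clang builtin for run-time CPU feature checks.

```cpp
#include <cassert>
#include <cstdint>

#if defined(__x86_64__) || defined(__i386__)
#define HAVE_X86_DISPATCH 1
#include <smmintrin.h>  // assumption: includable without -msse4.1

// SSE4.1 version: only this function is compiled with SSE4.1 enabled.
__attribute__((target("sse4.1")))
static std::int32_t max_of_four_sse41(const std::int32_t *v) {
    __m128i x = _mm_loadu_si128(reinterpret_cast<const __m128i *>(v));
    // Swap the two 64-bit halves and take the element-wise maximum.
    __m128i m = _mm_max_epi32(x, _mm_shuffle_epi32(x, _MM_SHUFFLE(1, 0, 3, 2)));
    // Swap neighbouring elements; every lane now holds the overall maximum.
    m = _mm_max_epi32(m, _mm_shuffle_epi32(m, _MM_SHUFFLE(2, 3, 0, 1)));
    return _mm_cvtsi128_si32(m);
}
#endif

// Portable fallback for CPUs or targets without SSE4.1.
static std::int32_t max_of_four_generic(const std::int32_t *v) {
    std::int32_t m = v[0];
    for (int i = 1; i < 4; ++i)
        if (v[i] > m) m = v[i];
    return m;
}

// Run-time dispatch between the two versions.
std::int32_t max_of_four(const std::int32_t *v) {
    assert(v != nullptr);
#ifdef HAVE_X86_DISPATCH
    if (__builtin_cpu_supports("sse4.1"))
        return max_of_four_sse41(v);
#endif
    return max_of_four_generic(v);
}
```

The rest of the translation unit stays at the baseline ISA, so the binary still runs on CPUs without SSE4.1 as long as the dispatcher is the only caller of the SSE4.1 version.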

Comments?

Thanks
Sri

Comments

Xinliang David Li April 11, 2013, 7:43 p.m. UTC | #1
What is the compile-time impact of turning it on? Code that does not include
the intrinsic headers should not be affected much.  If the impact
is small, why not turn this option on by default, which seems to
be the behavior of ICC?

With this option, all functions without the appropriate target options
will be allowed to reference unsupported builtins without warnings
or errors. Is it possible to delay the check until builtin
expansion time?

thanks,

David

Sriraman Tallam April 11, 2013, 8:05 p.m. UTC | #2
On Thu, Apr 11, 2013 at 12:43 PM, Xinliang David Li <davidxl@google.com> wrote:
> What is the compile time impact for turning it on? Code not including
> the intrinsic headers should not be affected too much.  If the impact
> is small, why not turning on this option by default -- which seems to
> be the behavior of ICC.

I will get back with data on this.

>
> With this option, all functions without the appropriate target options
> will be allowed to reference not supported builtins, without warnings
> or errors. Is it possible to delay the check till the builtin
> expansion time?

Right now, an error is generated if a function accesses an unsupported
builtin. Since the intrinsic functions are marked inline and call some
target builtin, this will always be caught.

Sri

Sriraman Tallam April 11, 2013, 8:09 p.m. UTC | #3
On Thu, Apr 11, 2013 at 1:05 PM, Sriraman Tallam <tmsriram@google.com> wrote:
> On Thu, Apr 11, 2013 at 12:43 PM, Xinliang David Li <davidxl@google.com> wrote:
>> What is the compile time impact for turning it on? Code not including
>> the intrinsic headers should not be affected too much.  If the impact
>> is small, why not turning on this option by default -- which seems to
>> be the behavior of ICC.
>
> I will get back with data on this.
>
>>
>> With this option, all functions without the appropriate target options
>> will be allowed to reference not supported builtins, without warnings
>> or errors. Is it possible to delay the check till the builtin
>> expansion time?
>
> Right now, an error is generated if a function accesses an unsupported
> builtin. Since the intrinsic functions are marked inline and call some
> target builtin, this will always be caught.

To be clear, the same example without the target attribute and with
-mgenerate-builtins results in an error:

#include <smmintrin.h>

__m128i foo(__m128i *V)
{
    return _mm_stream_load_si128(V);
}

$ g++ test.cc -mgenerate-builtins

smmintrin.h:582:59: error: ‘__builtin_ia32_movntdqa’ needs isa option -m32 -msse4.1
   return (__m128i) __builtin_ia32_movntdqa ((__v2di *) __X);


Xinliang David Li April 11, 2013, 8:18 p.m. UTC | #4
Great that it is already covered.

David

Richard Biener April 12, 2013, 8:22 a.m. UTC | #5
On Thu, Apr 11, 2013 at 10:18 PM, Xinliang David Li <davidxl@google.com> wrote:
> Great that it is already covered.

I expect that with more convoluted code you'll ICE eventually.  There is also
a bug report about exactly this issue, which should be referenced when a fix
is committed.

Oh, and target maintainers disagree about whether there is anything to fix and
what it would take to fix it ...

Note that ICC happily expands the intrinsics to SSEX code even if
SSEX is not enabled.

Richard.

Jakub Jelinek April 12, 2013, 8:58 a.m. UTC | #6
On Thu, Apr 11, 2013 at 12:05:41PM -0700, Sriraman Tallam wrote:
> I have attached a patch that fixes this. I have added an option
> "-mgenerate-builtins" that will do two things.  It will define a macro
> "__ALL_ISA__" which will expose the *intrin.h functions. It will also
> expose all the target specific builtins.  -mgenerate-builtins will not
> affect code generation.

1) this shouldn't be an option, either it can be made to work reliably,
   then it should be done always, or it can't, then it shouldn't be done
2) have you verified that if you always generate all builtins, that the
   builtins not supported by the ISA selected from the command line are
   created with the right vector modes?
3) the *intrin.h headers in the case where the guarding macro isn't defined
   should be surrounded by something like
   #ifndef __FMA4__
   #pragma GCC push_options
   #pragma GCC target("fma4")
   #endif
   ...
   #ifndef __FMA4__
   #pragma GCC pop_options
   #endif
   so that everything that is in the headers is compiled with the ISA
   in question
4) what happens if you use the various vector types typedefed in the
   *intrin.h headers in code that doesn't support those ISAs?  As TYPE_MODE
   for VECTOR_TYPE is a function call, perhaps it will just be handled as
   generic BLKmode vectors, which is desirable I think
5) what happens if you use a target builtin in a function not supporting
   the corresponding ISA, do you get proper error explaining what you are
   doing wrong?
6) what happens if you use some intrinsics in a function not supporting
   the corresponding ISA?  Dunno if the inliner chooses not to inline it
   and error out because it is always_inline, or what exactly will happen
   then

For all this you certainly need testcases.

	Jakub
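
Point 3 can be tried out in user code. The following is a sketch under stated assumptions: GCC 4.9 or later on x86 (the block compiles to the portable fallback elsewhere), with hypothetical names throughout. One detail differs from the outline above: as far as I know, #pragma GCC target also defines the matching ISA macro, so a second #ifndef test on that macro is not a reliable way to decide whether to pop. The sketch records the push in a separate marker macro instead.

```cpp
#include <cassert>
#include <cstdint>

#if defined(__GNUC__) && !defined(__clang__) && \
    (defined(__x86_64__) || defined(__i386__))
#define HAVE_SSE41_REGION 1
#include <smmintrin.h>  // assumption: includable without -msse4.1

#ifndef __SSE4_1__
#pragma GCC push_options
#pragma GCC target("sse4.1")
#define DEMO_POP_SSE41 1  // hypothetical marker: remember that we pushed
#endif

// Compiled with SSE4.1 enabled, whatever the command line says.
static std::int32_t min_of_four_sse41(const std::int32_t *v) {
    __m128i x = _mm_loadu_si128(reinterpret_cast<const __m128i *>(v));
    // Swap the two 64-bit halves and take the element-wise minimum.
    __m128i m = _mm_min_epi32(x, _mm_shuffle_epi32(x, _MM_SHUFFLE(1, 0, 3, 2)));
    // Swap neighbouring elements; every lane now holds the overall minimum.
    m = _mm_min_epi32(m, _mm_shuffle_epi32(m, _MM_SHUFFLE(2, 3, 0, 1)));
    return _mm_cvtsi128_si32(m);
}

#ifdef DEMO_POP_SSE41
#undef DEMO_POP_SSE41
#pragma GCC pop_options  // back to the command-line ISA
#endif
#endif  // GCC on x86

// Baseline-ISA entry point: uses the SSE4.1 helper only when the CPU has it.
std::int32_t min_of_four(const std::int32_t *v) {
    assert(v != nullptr);
#ifdef HAVE_SSE41_REGION
    if (__builtin_cpu_supports("sse4.1"))
        return min_of_four_sse41(v);
#endif
    std::int32_t m = v[0];
    for (int i = 1; i < 4; ++i)
        if (v[i] < m) m = v[i];
    return m;
}
```

Code after the pop is compiled with the original options again, which is the property the header wrapping relies on.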
Xinliang David Li April 12, 2013, 5:34 p.m. UTC | #7
On Fri, Apr 12, 2013 at 1:22 AM, Richard Biener
<richard.guenther@gmail.com> wrote:
> On Thu, Apr 11, 2013 at 10:18 PM, Xinliang David Li <davidxl@google.com> wrote:
>> Great that it is already covered.
>
> I expect that with more convoluted code you'll ICE eventually.

Yes, those are bugs that need to be fixed.

> There is also
> a bugreport about exactly this issue which should be referenced when a fix
> is committed.

There is a related thread from 2 years ago:
http://gcc.gnu.org/ml/gcc-patches/2010-08/msg01417.html

The patch was adding individual ISA macros eagerly which is not a good
idea (as they are also used in user code). Sri's patch is certainly
cleaner.

>
> Oh, and target maintainers disagree about if there is anything to fix and
> what it would take to fix it ...
>

There are concerns about buggy function-level target option handling,
vector type lowering, etc. It would be great to have actual cases
that demonstrate them.

> Note that ICC happily expands the intrinsics to SSEX code even if
> SSEX is not enabled.

This seems handy, but may hide user errors.

David


Xinliang David Li April 12, 2013, 6:11 p.m. UTC | #8
On Fri, Apr 12, 2013 at 1:58 AM, Jakub Jelinek <jakub@redhat.com> wrote:
> On Thu, Apr 11, 2013 at 12:05:41PM -0700, Sriraman Tallam wrote:
>> I have attached a patch that fixes this. I have added an option
>> "-mgenerate-builtins" that will do two things.  It will define a macro
>> "__ALL_ISA__" which will expose the *intrin.h functions. It will also
>> expose all the target specific builtins.  -mgenerate-builtins will not
>> affect code generation.
>
> 1) this shouldn't be an option, either it can be made to work reliably,
>    then it should be done always, or it can't, then it shouldn't be done

Except that if there are compile-time or memory-consumption concerns,
users can use the option to turn it off.

> 2) have you verified that if you always generate all builtins, that the
>    builtins not supported by the ISA selected from the command line are
>    created with the right vector modes?
> 3) the *intrin.h headers in the case where the guarding macro isn't defined
>    should be surrounded by something like
>    #ifndef __FMA4__
>    #pragma GCC push_options
>    #pragma GCC target("fma4")
>    #endif
>    ...
>    #ifndef __FMA4__
>    #pragma GCC pop_options
>    #endif
>    so that everything that is in the headers is compiled with the ISA
>    in question

Is this for the inline functions (the calling functions should already have
the target option), or does the FE need the target option set properly?

> 4) what happens if you use the various vector types typedefed in the
>    *intrin.h headers in code that doesn't support those ISAs?  As TYPE_MODE
>    for VECTOR_TYPE is a function call, perhaps it will just be handled as
>    generic BLKmode vectors, which is desirable I think

Will the veclower pass deal with it (lowered into scalar operations)?
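
A small experiment along these lines, offered as a sketch rather than an answer: GCC's generic vector extension (also accepted by Clang) lets code use a vector type wider than the enabled ISA provides, and the GCC manual says such operations are synthesized from narrower ones. The 32-byte type and the function name are hypothetical choices for this illustration. Values pass through pointers so that no 32-byte vector crosses a function boundary, which would raise ABI warnings without AVX.

```cpp
#include <cassert>
#include <cstring>

// Generic 32-byte vector of eight ints; no AVX flag is required to use it.
typedef int v8si __attribute__((vector_size(32)));

static_assert(sizeof(v8si) == 32, "eight 4-byte ints expected");

// Adds two 8-int arrays element-wise through the generic vector type.
void add8(const int *a, const int *b, int *out) {
    assert(a != nullptr && b != nullptr && out != nullptr);
    v8si x, y;
    std::memcpy(&x, a, sizeof x);
    std::memcpy(&y, b, sizeof y);
    v8si sum = x + y;  // lowered by the compiler if no 32-byte add exists
    std::memcpy(out, &sum, sizeof sum);
}
```

Inspecting the generated assembly with and without -mavx shows how the same source is lowered in each case.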


> 5) what happens if you use a target builtin in a function not supporting
>    the corresponding ISA, do you get proper error explaining what you are
>    doing wrong?

Yes, in ix86_expand_builtin there is a check -- that is what Sri
mentioned in a previous email.
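
The two checks in play can be modelled as bitmask tests. This is a simplified sketch with hypothetical names and bit values, not GCC's code: the declaration-time condition follows the patched def_builtin hunk (leaving out the lang_hooks clause), while the use-time condition is only an assumption about the general shape of the check in ix86_expand_builtin.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical ISA bits for illustration; not GCC's OPTION_MASK_ISA_* values.
constexpr std::uint64_t ISA_SSE2   = 1u << 0;
constexpr std::uint64_t ISA_SSE4_1 = 1u << 1;
constexpr std::uint64_t ISA_64BIT  = 1u << 2;

// Declaration time: is the builtin made visible at all?
// Modelled on the patched condition in def_builtin.
constexpr bool builtin_is_declared(std::uint64_t mask,
                                   std::uint64_t isa_flags,
                                   bool generate_all) {
    return generate_all
        || (mask & ~ISA_64BIT) == 0
        || ((mask & ~ISA_64BIT) & isa_flags) != 0;
}

// Use time: may the current function expand the builtin?
// Assumed shape: every ISA bit the builtin needs must be enabled for the
// function, either on the command line or through a target attribute.
constexpr bool builtin_use_is_allowed(std::uint64_t mask,
                                      std::uint64_t function_isa_flags) {
    return (mask & function_isa_flags) == mask;
}

// Usage: an SSE4.1 builtin in a translation unit built with SSE2 only.
void check_model() {
    assert(!builtin_is_declared(ISA_SSE4_1, ISA_SSE2, false));  // hidden today
    assert(builtin_is_declared(ISA_SSE4_1, ISA_SSE2, true));    // visible with the option
    assert(!builtin_use_is_allowed(ISA_SSE4_1, ISA_SSE2));      // plain function: error
    assert(builtin_use_is_allowed(ISA_SSE4_1, ISA_SSE2 | ISA_SSE4_1));  // targeted function
}
```

In this model the option only widens what is declared; whether a use is accepted still depends on the ISA of the function that contains it.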

> 6) what happens if you use some intrinsics in a function not supporting
>    the corresponding ISA?  Dunno if the inliner chooses not to inline it
>    and error out because it is always_inline, or what exactly will happen
>    then

I think it may end up with errors about unsupported builtins, as above.

thanks,

David

>
> For all this you certainly need testcases.
>
>         Jakub
Patch

Index: emmintrin.h
===================================================================
--- emmintrin.h	(revision 197691)
+++ emmintrin.h	(working copy)
@@ -27,7 +27,7 @@ 
 #ifndef _EMMINTRIN_H_INCLUDED
 #define _EMMINTRIN_H_INCLUDED
 
-#ifndef __SSE2__
+#if !defined (__SSE2__) && !defined (__ALL_ISA__)
 # error "SSE2 instruction set not enabled"
 #else
 
Index: fma4intrin.h
===================================================================
--- fma4intrin.h	(revision 197691)
+++ fma4intrin.h	(working copy)
@@ -28,7 +28,7 @@ 
 #ifndef _FMA4INTRIN_H_INCLUDED
 #define _FMA4INTRIN_H_INCLUDED
 
-#ifndef __FMA4__
+#if !defined (__FMA4__) && !defined (__ALL_ISA__)
 # error "FMA4 instruction set not enabled"
 #else
 
Index: lwpintrin.h
===================================================================
--- lwpintrin.h	(revision 197691)
+++ lwpintrin.h	(working copy)
@@ -28,7 +28,7 @@ 
 #ifndef _LWPINTRIN_H_INCLUDED
 #define _LWPINTRIN_H_INCLUDED
 
-#ifndef __LWP__
+#if !defined (__LWP__) && !defined (__ALL_ISA__)
 # error "LWP instruction set not enabled"
 #else
 
Index: xopintrin.h
===================================================================
--- xopintrin.h	(revision 197691)
+++ xopintrin.h	(working copy)
@@ -28,7 +28,7 @@ 
 #ifndef _XOPMMINTRIN_H_INCLUDED
 #define _XOPMMINTRIN_H_INCLUDED
 
-#ifndef __XOP__
+#if !defined (__XOP__) && !defined (__ALL_ISA__)
 # error "XOP instruction set not enabled"
 #else
 
Index: fmaintrin.h
===================================================================
--- fmaintrin.h	(revision 197691)
+++ fmaintrin.h	(working copy)
@@ -28,7 +28,7 @@ 
 #ifndef _FMAINTRIN_H_INCLUDED
 #define _FMAINTRIN_H_INCLUDED
 
-#ifndef __FMA__
+#if !defined (__FMA__) && !defined (__ALL_ISA__)
 # error "FMA instruction set not enabled"
 #else
 
Index: bmiintrin.h
===================================================================
--- bmiintrin.h	(revision 197691)
+++ bmiintrin.h	(working copy)
@@ -25,7 +25,7 @@ 
 # error "Never use <bmiintrin.h> directly; include <x86intrin.h> instead."
 #endif
 
-#ifndef __BMI__
+#if !defined (__BMI__) && !defined (__ALL_ISA__)
 # error "BMI instruction set not enabled"
 #endif /* __BMI__ */
 
Index: mmintrin.h
===================================================================
--- mmintrin.h	(revision 197691)
+++ mmintrin.h	(working copy)
@@ -27,7 +27,7 @@ 
 #ifndef _MMINTRIN_H_INCLUDED
 #define _MMINTRIN_H_INCLUDED
 
-#ifndef __MMX__
+#if !defined (__MMX__) && !defined (__ALL_ISA__)
 # error "MMX instruction set not enabled"
 #else
 /* The Intel API is flexible enough that we must allow aliasing with other
Index: nmmintrin.h
===================================================================
--- nmmintrin.h	(revision 197691)
+++ nmmintrin.h	(working copy)
@@ -27,7 +27,7 @@ 
 #ifndef _NMMINTRIN_H_INCLUDED
 #define _NMMINTRIN_H_INCLUDED
 
-#ifndef __SSE4_2__
+#if !defined (__SSE4_2__) && !defined (__ALL_ISA__)
 # error "SSE4.2 instruction set not enabled"
 #else
 /* We just include SSE4.1 header file.  */
Index: tbmintrin.h
===================================================================
--- tbmintrin.h	(revision 197691)
+++ tbmintrin.h	(working copy)
@@ -25,7 +25,7 @@ 
 # error "Never use <tbmintrin.h> directly; include <x86intrin.h> instead."
 #endif
 
-#ifndef __TBM__
+#if !defined (__TBM__) && !defined (__ALL_ISA__)
 # error "TBM instruction set not enabled"
 #endif /* __TBM__ */
 
Index: f16cintrin.h
===================================================================
--- f16cintrin.h	(revision 197691)
+++ f16cintrin.h	(working copy)
@@ -25,7 +25,7 @@ 
 # error "Never use <f16intrin.h> directly; include <x86intrin.h> or <immintrin.h> instead."
 #endif
 
-#ifndef __F16C__
+#if !defined (__F16C__) && !defined (__ALL_ISA__)
 # error "F16C instruction set not enabled"
 #else
 
Index: i386.opt
===================================================================
--- i386.opt	(revision 197691)
+++ i386.opt	(working copy)
@@ -626,3 +626,7 @@  Split 32-byte AVX unaligned store
 mrtm
 Target Report Mask(ISA_RTM) Var(ix86_isa_flags) Save
 Support RTM built-in functions and code generation
+
+mgenerate-builtins
+Target Report Var(generate_target_builtins) Save
Generate all target builtins that are otherwise only generated when the appropriate ISA is turned on.
Index: i386-c.c
===================================================================
--- i386-c.c	(revision 197691)
+++ i386-c.c	(working copy)
@@ -54,6 +54,9 @@  ix86_target_macros_internal (HOST_WIDE_INT isa_fla
   int last_arch_char = ix86_arch_string[arch_len - 1];
   int last_tune_char = ix86_tune_string[tune_len - 1];
 
+  if (generate_target_builtins)
+    def_or_undef (parse_in, "__ALL_ISA__");
+
   /* Built-ins based on -march=.  */
   switch (arch)
     {
Index: bmi2intrin.h
===================================================================
--- bmi2intrin.h	(revision 197691)
+++ bmi2intrin.h	(working copy)
@@ -25,7 +25,7 @@ 
 # error "Never use <bmi2intrin.h> directly; include <x86intrin.h> instead."
 #endif
 
-#ifndef __BMI2__
+#if !defined (__BMI2__) && !defined (__ALL_ISA__)
 # error "BMI2 instruction set not enabled"
 #endif /* __BMI2__ */
 
Index: lzcntintrin.h
===================================================================
--- lzcntintrin.h	(revision 197691)
+++ lzcntintrin.h	(working copy)
@@ -25,7 +25,7 @@ 
 # error "Never use <lzcntintrin.h> directly; include <x86intrin.h> instead."
 #endif
 
-#ifndef __LZCNT__
+#if !defined (__LZCNT__) && !defined (__ALL_ISA__)
 # error "LZCNT instruction is not enabled"
 #endif /* __LZCNT__ */
 
Index: smmintrin.h
===================================================================
--- smmintrin.h	(revision 197691)
+++ smmintrin.h	(working copy)
@@ -27,7 +27,7 @@ 
 #ifndef _SMMINTRIN_H_INCLUDED
 #define _SMMINTRIN_H_INCLUDED
 
-#ifndef __SSE4_1__
+#if !defined (__SSE4_1__) && !defined (__ALL_ISA__)
 # error "SSE4.1 instruction set not enabled"
 #else
 
Index: i386.c
===================================================================
--- i386.c	(revision 197691)
+++ i386.c	(working copy)
@@ -26813,7 +26813,8 @@  def_builtin (HOST_WIDE_INT mask, const char *name,
       ix86_builtins_isa[(int) code].isa = mask;
 
       mask &= ~OPTION_MASK_ISA_64BIT;
-      if (mask == 0
+      if (generate_target_builtins
+	  || mask == 0
 	  || (mask & ix86_isa_flags) != 0
 	  || (lang_hooks.builtin_function
 	      == lang_hooks.builtin_function_ext_scope))
Index: wmmintrin.h
===================================================================
--- wmmintrin.h	(revision 197691)
+++ wmmintrin.h	(working copy)
@@ -30,7 +30,7 @@ 
 /* We need definitions from the SSE2 header file.  */
 #include <emmintrin.h>
 
-#if !defined (__AES__) && !defined (__PCLMUL__)
+#if !defined (__AES__) && !defined (__PCLMUL__) && !defined (__ALL_ISA__)
 # error "AES/PCLMUL instructions not enabled"
 #else
 
Index: pmmintrin.h
===================================================================
--- pmmintrin.h	(revision 197691)
+++ pmmintrin.h	(working copy)
@@ -27,7 +27,7 @@ 
 #ifndef _PMMINTRIN_H_INCLUDED
 #define _PMMINTRIN_H_INCLUDED
 
-#ifndef __SSE3__
+#if !defined (__SSE3__) && !defined (__ALL_ISA__)
 # error "SSE3 instruction set not enabled"
 #else
 
Index: tmmintrin.h
===================================================================
--- tmmintrin.h	(revision 197691)
+++ tmmintrin.h	(working copy)
@@ -27,7 +27,7 @@ 
 #ifndef _TMMINTRIN_H_INCLUDED
 #define _TMMINTRIN_H_INCLUDED
 
-#ifndef __SSSE3__
+#if !defined (__SSSE3__) && !defined (__ALL_ISA__)
 # error "SSSE3 instruction set not enabled"
 #else
 
Index: xmmintrin.h
===================================================================
--- xmmintrin.h	(revision 197691)
+++ xmmintrin.h	(working copy)
@@ -27,7 +27,7 @@ 
 #ifndef _XMMINTRIN_H_INCLUDED
 #define _XMMINTRIN_H_INCLUDED
 
-#ifndef __SSE__
+#if !defined (__SSE__) && !defined (__ALL_ISA__)
 # error "SSE instruction set not enabled"
 #else
 
Index: popcntintrin.h
===================================================================
--- popcntintrin.h	(revision 197691)
+++ popcntintrin.h	(working copy)
@@ -21,7 +21,7 @@ 
    see the files COPYING3 and COPYING.RUNTIME respectively.  If not, see
    <http://www.gnu.org/licenses/>.  */
 
-#ifndef __POPCNT__
+#if !defined (__POPCNT__) && !defined (__ALL_ISA__)
 # error "POPCNT instruction set not enabled"
 #endif /* __POPCNT__ */
 
Index: ammintrin.h
===================================================================
--- ammintrin.h	(revision 197691)
+++ ammintrin.h	(working copy)
@@ -27,7 +27,7 @@ 
 #ifndef _AMMINTRIN_H_INCLUDED
 #define _AMMINTRIN_H_INCLUDED
 
-#ifndef __SSE4A__
+#if !defined (__SSE4A__) && !defined (__ALL_ISA__)
 # error "SSE4A instruction set not enabled"
 #else