[i386] PR 59422 - Support more targets for function multi versioning

Message ID 201312151954.38590.linux@carewolf.com
State New

Commit Message

Allan Sandfeld Jensen Dec. 15, 2013, 6:54 p.m. UTC
Hi again
On Wednesday 11 December 2013, Uros Bizjak wrote:
> Hello!
> 
> > PR gcc/59422
> > 
> > This patch extends the supported targets for function multi versioning to
> > also include Haswell, Silvermont, and the most recent AMD models. It
> > also prioritizes AVX2 versions over AMD specific pre-AVX2 versions.
> 
> Please add a ChangeLog entry and attach the complete patch. Please
> also state how you tested the patch, as outlined in the instructions
> [1].
> 
> [1] http://gcc.gnu.org/contribute.html
> 
Updated patch for better CPU model detection and added ChangeLog.

The patch has been tested with the attached test.cpp. Verified that it doesn't 
build before the patch, and that it builds after, and verified it selects 
correct versions at runtime based on either CPU model or supported ISA (tested 
on 3 machines: SandyBridge, IvyBridge and Phenom II).
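
The attached test.cpp is not reproduced in this thread. As a rough illustration only, the kind of runtime check it exercises can be sketched in C with the GCC builtins that libgcc's cpuinfo.c backs. The function name pick_version and the order of the checks are assumptions made for this sketch, not the order used by the dispatcher, and the code falls back to "default" on non-x86 targets:

```c
#include <stdio.h>

/* Return a label for the implementation this sketch would choose on the
   running CPU.  The check order below is an illustrative assumption.  */
const char *pick_version(void)
{
#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
  /* Populate the CPU model data before querying it.  */
  __builtin_cpu_init();
  if (__builtin_cpu_supports("avx2"))
    return "avx2";
  if (__builtin_cpu_supports("avx"))
    return "avx";
  if (__builtin_cpu_is("amdfam10h"))
    return "amdfam10h";
  if (__builtin_cpu_supports("sse4.2"))
    return "sse4.2";
#endif
  return "default";
}
```

Calling puts(pick_version()) on each test machine shows which branch was taken; the result depends on the hardware, so no fixed output is given here.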

Btw, I couldn't find anything that corresponds to gcc's btver2 arch. Is that 
an old term for what has become the Jaguar architecture?

`Allan

Comments

Uros Bizjak Dec. 16, 2013, 9:34 a.m. UTC | #1
On Sun, Dec 15, 2013 at 7:54 PM, Allan Sandfeld Jensen
<carewolf@gmail.com> wrote:

> Hi again
> On Wednesday 11 December 2013, Uros Bizjak wrote:
>> Hello!
>>
>> > PR gcc/59422
>> >
>> > This patch extends the supported targets for function multi versioning to
>> > also include Haswell, Silvermont, and the most recent AMD models. It
>> > also prioritizes AVX2 versions over AMD specific pre-AVX2 versions.
>>
>> Please add a ChangeLog entry and attach the complete patch. Please
>> also state how you tested the patch, as outlined in the instructions
>> [1].
>>
>> [1] http://gcc.gnu.org/contribute.html
>>
> Updated patch for better CPU model detection and added ChangeLog.
>
> The patch has been tested with the attached test.cpp. Verified that it doesn't
> build before the patch, and that it builds after, and verified it selects
> correct versions at runtime based on either CPU model or supported ISA (tested
> on 3 machines: SandyBridge, IvyBridge and Phenom II).
>
> Btw, I couldn't find anything that corresponds to gcc's btver2 arch. Is that
> an old term for what has become the Jaguar architecture?

Thanks for the patch!

However, you should not change the existing order of enums in
cpuinfo.c (enum processor_vendor, enum processor_types, enum
processor_subtypes, enum processor_features), but new entries should
be added at the end (before *_MAX entry, if exists) of the enum. The
enums (enum processor_features and enum processor_model) in
config/i386/i386.c should mirror these changes. Please see [1].

Probably, we should document this in the source...
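
A minimal sketch of why the ordering matters, assuming (as the paragraph above implies) that the numeric enum values are shared between already-built binaries and libgcc. The enum and function names here are hypothetical and much shorter than the real ones:

```c
/* Hypothetical, shortened feature enum as an older release defined it.
   Binaries built against it have these numeric values baked in.  */
enum features_v1 { V1_SSE3, V1_SSSE3, V1_SSE4_1, V1_AVX, V1_MAX };

/* Wrong: inserting a new entry in the middle shifts every later value.  */
enum features_bad { BAD_SSE3, BAD_SSSE3, BAD_SSE4_A, BAD_SSE4_1, BAD_AVX,
                    BAD_MAX };

/* Right: append before the *_MAX entry so existing values stay put.  */
enum features_ok { OK_SSE3, OK_SSSE3, OK_SSE4_1, OK_AVX, OK_SSE4_A, OK_MAX };

/* An old binary tests the bit it was compiled with (V1_AVX).  Returns
   nonzero only if the library that filled `features` still uses that
   same bit position for AVX.  */
int old_binary_sees_avx(unsigned int features)
{
  return (features & (1u << V1_AVX)) != 0;
}
```

With the appended layout, old_binary_sees_avx(1u << OK_AVX) is nonzero; with the reordered layout, old_binary_sees_avx(1u << BAD_AVX) is zero, so the old binary would silently miss AVX.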

-      {"sandybridge", M_INTEL_COREI7_SANDYBRIDGE},
+      {"corei7-avx", M_INTEL_COREI7_SANDYBRIDGE},

Huh... Thanks for catching this. -march=sandybridge is not recognized...

I have also CC'd maintainers from Intel and AMD for their comments.

[1] http://gcc.gnu.org/ml/gcc-patches/2013-06/msg00034.html

Uros.
Uros Bizjak Dec. 16, 2013, 9:41 a.m. UTC | #2
On Mon, Dec 16, 2013 at 10:34 AM, Uros Bizjak <ubizjak@gmail.com> wrote:
> On Sun, Dec 15, 2013 at 7:54 PM, Allan Sandfeld Jensen
> <carewolf@gmail.com> wrote:
>
>> Hi again
>> On Wednesday 11 December 2013, Uros Bizjak wrote:
>>> Hello!
>>>
>>> > PR gcc/59422
>>> >
>>> > This patch extends the supported targets for function multi versioning to
>>> > also include Haswell, Silvermont, and the most recent AMD models. It
>>> > also prioritizes AVX2 versions over AMD specific pre-AVX2 versions.
>>>
>>> Please add a ChangeLog entry and attach the complete patch. Please
>>> also state how you tested the patch, as outlined in the instructions
>>> [1].
>>>
>>> [1] http://gcc.gnu.org/contribute.html
>>>
>> Updated patch for better CPU model detection and added ChangeLog.
>>
>> The patch has been tested with the attached test.cpp. Verified that it doesn't
>> build before the patch, and that it builds after, and verified it selects
>> correct versions at runtime based on either CPU model or supported ISA (tested
>> on 3 machines: SandyBridge, IvyBridge and Phenom II).
>>
>> Btw, I couldn't find anything that corresponds to gcc's btver2 arch. Is that
>> an old term for what has become the Jaguar architecture?
>
> Thanks for the patch!
>
> However, you should not change the existing order of enums in
> cpuinfo.c (enum processor_vendor, enum processor_types, enum
> processor_subtypes, enum processor_features), but new entries should
> be added at the end (before *_MAX entry, if exists) of the enum. The
> enums (enum processor_features and enum processor_model) in
> config/i386/i386.c should mirror these changes. Please see [1].
>
> Probably, we should document this in the source...
>
> -      {"sandybridge", M_INTEL_COREI7_SANDYBRIDGE},
> +      {"corei7-avx", M_INTEL_COREI7_SANDYBRIDGE},
>
> Huh... Thanks for catching this. -march=sandybridge is not recognized...

-      {"sandybridge", M_INTEL_COREI7_SANDYBRIDGE},
+      {"corei7-avx", M_INTEL_COREI7_SANDYBRIDGE},
+      {"core-avx-i", M_INTEL_COREI7_IVYBRIDGE},
+      {"core-avx2", M_INTEL_COREI7_HASWELL},

Ah, no. These names are not intended to be used in -march. We can
follow the tradition and use sandybridge, ivybridge and haswell here.

+      {"btver1", M_AMDFAM14H_BTVER1},
+      {"btver2", M_AMDFAM14H_BTVER2},
       {"amdfam15h", M_AMDFAM15H},
       {"bdver1", M_AMDFAM15H_BDVER1},
       {"bdver2", M_AMDFAM15H_BDVER2},

Maybe AMD wants bobcat, piledriver and steamroller here instead of
btverX / bdverX?

Uros.
Gopalasubramanian, Ganesh Dec. 16, 2013, 9:59 a.m. UTC | #3
> Btw, I couldn't find anything that corresponds to gcc's btver2 arch. Is that an old term for what has become the Jaguar architecture?

Yes, "btver2" = "jaguar". We have the name as per its family name (i.e, bobcat family) in GCC. 
Similarly we have the names "bdver2" = "piledriver", "bdver3" = "steamroller" as per their family (bulldozer) name.

Regards
Ganesh

-----Original Message-----
From: Allan Sandfeld Jensen [mailto:carewolf@gmail.com] 
Sent: Monday, December 16, 2013 12:25 AM
To: Uros Bizjak
Cc: gcc-patches@gcc.gnu.org
Subject: Re: [Patch, i386] PR 59422 - Support more targets for function multi versioning

Hi again
On Wednesday 11 December 2013, Uros Bizjak wrote:
> Hello!
> 
> > PR gcc/59422
> > 
> > This patch extends the supported targets for function multi versioning 
> > to also include Haswell, Silvermont, and the most recent AMD models. 
> > It also prioritizes AVX2 versions over AMD specific pre-AVX2 versions.
> 
> Please add a ChangeLog entry and attach the complete patch. Please 
> also state how you tested the patch, as outlined in the instructions 
> [1].
> 
> [1] http://gcc.gnu.org/contribute.html
> 
Updated patch for better CPU model detection and added ChangeLog.

The patch has been tested with the attached test.cpp. Verified that it doesn't build before the patch, and that it builds after, and verified it selects correct versions at runtime based on either CPU model or supported ISA (tested on 3 machines: SandyBridge, IvyBridge and Phenom II).

Btw, I couldn't find anything that corresponds to gcc's btver2 arch. Is that an old term for what has become the Jaguar architecture?

`Allan
Allan Sandfeld Jensen Dec. 16, 2013, 8:25 p.m. UTC | #4
On Monday 16 December 2013, Gopalasubramanian, Ganesh wrote:
> > Btw, I couldn't find anything that corresponds to gcc's btver2 arch. Is
> > that an old term for what has become the Jaguar architecture?
> 
> Yes, "btver2" = "jaguar". We have the name as per its family name (i.e,
> bobcat family) in GCC. Similarly we have the names "bdver2" =
> "piledriver", "bdver3" = "steamroller" as per their family (bulldozer)
> name.
> 
Yes, I figured that was the original idea behind it, but the final family of 
the jaguar processors seems to have become 16h instead of 14h (bobcat) at some 
point.

Regards
`Allan
Gopalasubramanian, Ganesh Dec. 17, 2013, 4:46 a.m. UTC | #5
> Yes, I figured that was the original idea behind it, but the final family of the jaguar processors seems to have become 16h instead of 14h (bobcat) at some point.

Yes. It is amdfam16h. I was supposed to pass on some comments on the patch.
1. Amdfam16h for Jaguar.
2. For Jaguar, the priority needs to be AVX (AVX got included into the Jaguar ISA).

I have a doubt! What would be done if priority is set to "F_FMA4" instead of "F_XOP" for bdver1?

Regards
Ganesh
Allan Sandfeld Jensen Dec. 17, 2013, 10:20 a.m. UTC | #6
On Monday 16 December 2013, Uros Bizjak wrote:
> On Mon, Dec 16, 2013 at 10:34 AM, Uros Bizjak <ubizjak@gmail.com> wrote:
> > On Sun, Dec 15, 2013 at 7:54 PM, Allan Sandfeld Jensen
> > 
> > <carewolf@gmail.com> wrote:
> >> Hi again
> >> 
> >> On Wednesday 11 December 2013, Uros Bizjak wrote:
> >>> Hello!
> >>> 
> >>> > PR gcc/59422
> >>> > 
> >>> > This patch extends the supported targets for function multi versioning
> >>> > to also include Haswell, Silvermont, and the most recent AMD models.
> >>> > It also prioritizes AVX2 versions over AMD specific pre-AVX2
> >>> > versions.
> >>> 
> >>> Please add a ChangeLog entry and attach the complete patch. Please
> >>> also state how you tested the patch, as outlined in the instructions
> >>> [1].
> >>> 
> >>> [1] http://gcc.gnu.org/contribute.html
> >> 
> >> Updated patch for better CPU model detection and added ChangeLog.
> >> 
> >> The patch has been tested with the attached test.cpp. Verified that it
> >> doesn't build before the patch, and that it builds after, and verified
> >> it selects correct versions at runtime based on either CPU model or
> >> supported ISA (tested on 3 machines: SandyBridge, IvyBridge and Phenom
> >> II).
> >> 
> >> Btw, I couldn't find anything that corresponds to gcc's btver2 arch. Is
> >> that an old term for what has become the Jaguar architecture?
> > 
> > Thanks for the patch!
> > 
> > However, you should not change the existing order of enums in
> > cpuinfo.c (enum processor_vendor, enum processor_types, enum
> > processor_subtypes, enum processor_features), but new entries should
> > be added at the end (before *_MAX entry, if exists) of the enum. The
> > enums (enum processor_features and enum processor_model) in
> > config/i386/i386.c should mirror these changes. Please see [1].
> > 
> > Probably, we should document this in the source...
> > 
> > -      {"sandybridge", M_INTEL_COREI7_SANDYBRIDGE},
> > +      {"corei7-avx", M_INTEL_COREI7_SANDYBRIDGE},
> > 
> > Huh... Thanks for catching this. -march=sandybridge is not recognized...
> 
> -      {"sandybridge", M_INTEL_COREI7_SANDYBRIDGE},
> +      {"corei7-avx", M_INTEL_COREI7_SANDYBRIDGE},
> +      {"core-avx-i", M_INTEL_COREI7_IVYBRIDGE},
> +      {"core-avx2", M_INTEL_COREI7_HASWELL},
> 
> Ah, no. These names are not intended to be used in -march. We can
> follow the tradition and use sandybridge, ivybridge and haswell here.
> 
I had the problem that "arch=corei7-avx" was not recognized as a valid 
property argument until I made that change. I thought it was the intent to 
merge this list of models with the canonical names, but perhaps it is an error 
in the new parameter validation?

Note that similarly "arch=sandybridge" is accepted as a valid property 
argument but then fails as an invalid argument for march.

Regards
`Allan
Allan Sandfeld Jensen Dec. 17, 2013, 7:19 p.m. UTC | #7
On Tuesday 17 December 2013, Allan Sandfeld Jensen wrote:
> On Monday 16 December 2013, Uros Bizjak wrote:
> > On Mon, Dec 16, 2013 at 10:34 AM, Uros Bizjak <ubizjak@gmail.com> wrote:
> > > On Sun, Dec 15, 2013 at 7:54 PM, Allan Sandfeld Jensen
> > > 
> > > <carewolf@gmail.com> wrote:
> > >> Hi again
> > >> 
> > >> On Wednesday 11 December 2013, Uros Bizjak wrote:
> > >>> Hello!
> > >>> 
> > >>> > PR gcc/59422
> > >>> > 
> > >>> > This patch extends the supported targets for function multi
> > > >>> > versioning to also include Haswell, Silvermont, and the most recent
> > >>> > AMD models. It also prioritizes AVX2 versions over AMD specific
> > >>> > pre-AVX2 versions.
> > >>> 
> > >>> Please add a ChangeLog entry and attach the complete patch. Please
> > >>> also state how you tested the patch, as outlined in the instructions
> > >>> [1].
> > >>> 
> > >>> [1] http://gcc.gnu.org/contribute.html
> > >> 
> > >> Updated patch for better CPU model detection and added ChangeLog.
> > >> 
> > >> The patch has been tested with the attached test.cpp. Verified that it
> > >> doesn't build before the patch, and that it builds after, and verified
> > >> it selects correct versions at runtime based on either CPU model or
> > >> supported ISA (tested on 3 machines: SandyBridge, IvyBridge and Phenom
> > >> II).
> > >> 
> > >> Btw, I couldn't find anything that corresponds to gcc's btver2 arch.
> > >> Is that an old term for what has become the Jaguar architecture?
> > > 
> > > Thanks for the patch!
> > > 
> > > However, you should not change the existing order of enums in
> > > cpuinfo.c (enum processor_vendor, enum processor_types, enum
> > > processor_subtypes, enum processor_features), but new entries should
> > > be added at the end (before *_MAX entry, if exists) of the enum. The
> > > enums (enum processor_features and enum processor_model) in
> > > config/i386/i386.c should mirror these changes. Please see [1].
> > > 
> > > Probably, we should document this in the source...
> > > 
> > > -      {"sandybridge", M_INTEL_COREI7_SANDYBRIDGE},
> > > +      {"corei7-avx", M_INTEL_COREI7_SANDYBRIDGE},
> > > 
> > > Huh... Thanks for catching this. -march=sandybridge is not
> > > recognized...
> > 
> > -      {"sandybridge", M_INTEL_COREI7_SANDYBRIDGE},
> > +      {"corei7-avx", M_INTEL_COREI7_SANDYBRIDGE},
> > +      {"core-avx-i", M_INTEL_COREI7_IVYBRIDGE},
> > +      {"core-avx2", M_INTEL_COREI7_HASWELL},
> > 
> > Ah, no. These names are not intended to be used in -march. We can
> > follow the tradition and use sandybridge, ivybridge and haswell here.
> 
> I had the problem that "arch=corei7-avx" was not recognized as a valid
> property argument until I made that change. I thought it was the intent to
> merge this list of models with the canonical names, but perhaps it is an
> error in the new parameter validation?
> 
Ah, sorry. I think I misremembered the problem. After reviewing the code 
again, I think the only problem is with target("arch=core-avx-i") because it 
is not in the list of architectures (because it is treated as the same 
architecture as corei7-avx presumably).

I will revert the sandybridge name change in the next patch, and make the new 
names match.

`Allan
Gopalasubramanian, Ganesh Dec. 18, 2013, 3:38 p.m. UTC | #8
Ping!

"Gopalasubramanian, Ganesh" <Ganesh.Gopalasubramanian@amd.com> wrote:


> Yes, I figured that was the original idea behind it, but the final family of the jaguar processors seems to have become 16h instead of 14h (bobcat) at some point.

Yes. It is amdfam16h. I was supposed to pass on some comments on the patch.
1. Amdfam16h for Jaguar.
2. For Jaguar, the priority needs to be AVX (AVX got included into the Jaguar ISA).

I have a doubt! What would be done if priority is set to "F_FMA4" instead of "F_XOP" for bdver1?

Regards
Ganesh
Uros Bizjak Dec. 18, 2013, 4:20 p.m. UTC | #9
On Wed, Dec 18, 2013 at 4:38 PM, Gopalasubramanian, Ganesh
<Ganesh.Gopalasubramanian@amd.com> wrote:
>
> Ping!
>
> "Gopalasubramanian, Ganesh" <Ganesh.Gopalasubramanian@amd.com> wrote:
>
>
>> Yes, I figured that was the original idea behind it, but the final family of the jaguar processors seems to have become 16h instead of 14h (bobcat) at some point.
>
> Yes. It is amdfam16h. I was supposed to pass on some comments on the patch.
> 1. Amdfam16h for Jaguar.
> 2. For Jaguar, the priority needs to be AVX (AVX got included into the Jaguar ISA).
>
> I have a doubt! What would be done if priority is set to "F_FMA4" instead of "F_XOP" for bdver1?

XOP enables FMA4, so it should be better to set priority to
P_PROC_XOP. From config/i386/i386-common.c:

#define OPTION_MASK_ISA_XOP_SET \
  (OPTION_MASK_ISA_XOP | OPTION_MASK_ISA_FMA4_SET)

Looking at processor_dispatch_table, bdver1 doesn't support F_FMA, so
the proposed patch fixes an error here.
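
As a sketch of how such priorities could be used, assuming the dispatcher simply prefers the usable version with the highest priority value: the enum below copies the relative order proposed in the patch (a subset only), while struct version and select_version are hypothetical names introduced for illustration, not GCC internals.

```c
#include <stddef.h>

/* Subset of the priority order proposed in the patch; higher wins.  */
enum feature_priority {
  P_ZERO = 0, P_AVX, P_PROC_AVX, P_FMA4, P_XOP, P_PROC_XOP,
  P_FMA, P_PROC_FMA, P_AVX2, P_PROC_AVX2
};

/* Hypothetical description of one version of a multiversioned function.  */
struct version {
  const char *name;
  enum feature_priority priority;
  int usable;   /* nonzero if the running CPU satisfies its predicate */
};

/* Return the usable version with the highest priority, or NULL if none
   of the n candidates is usable.  */
const struct version *select_version(const struct version *v, size_t n)
{
  const struct version *best = NULL;
  for (size_t i = 0; i < n; i++)
    if (v[i].usable && (best == NULL || v[i].priority > best->priority))
      best = &v[i];
  return best;
}
```

Under this ordering, a bdver1 version (P_PROC_XOP) is preferred over a plain AVX version on a Bulldozer machine, while an AVX2 version would win on any CPU that can run it.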

Uros.
Allan Sandfeld Jensen Dec. 18, 2013, 5:36 p.m. UTC | #10
On Wednesday 18 December 2013, Gopalasubramanian, Ganesh wrote:
> Ping!
> 
> "Gopalasubramanian, Ganesh" <Ganesh.Gopalasubramanian@amd.com> wrote:
> > Yes, I figured that was the original idea behind it, but the final family
> > of the jaguar processors seems to have become 16h instead of 14h
> > (bobcat) at some point.
> 
> Yes. It is amdfam16h. I was supposed to pass on some comments on the patch.
> 1. Amdfam16h for Jaguar.
> 2. For Jaguar, the priority needs to be AVX (AVX got included into the
> Jaguar ISA).
> 
Yes, I changed that in the last patch, though I consider it momentarily 
problematic because you do not yet enable AVX with march=btver2 (AVX versions 
would currently be better than btver2 versions for a btver2 arch), but expect 
march=btver2 will be fixed soon.

Regards
'Allan
Gopalasubramanian, Ganesh Dec. 19, 2013, 6:35 a.m. UTC | #11
> Yes, I changed that in the last patch, though I consider it momentarily problematic because you do not yet enable AVX with march=btver2 (AVX versions would currently be better than btver2 versions for a btver2 arch), but expect march=btver2 will be fixed soon.

The "processor_alias_table" entry for "btver2" in i386.c enables AVX.

<snip>
      {"btver2", PROCESSOR_BTVER2, CPU_BTVER2,
        PTA_64BIT | PTA_MMX |  PTA_SSE  | PTA_SSE2 | PTA_SSE3
        | PTA_SSSE3 | PTA_SSE4A |PTA_ABM | PTA_CX16 | PTA_SSE4_1
        | PTA_SSE4_2 | PTA_AES | PTA_PCLMUL | PTA_AVX
        | PTA_BMI | PTA_F16C | PTA_MOVBE | PTA_PRFCHW
        | PTA_FXSR | PTA_XSAVE | PTA_XSAVEOPT},
</snip>

The assembly listing for a simple test (compiled with -march=btver2) also has -mavx enabled. So, can you please enable AVX for btver2?

Regards
Ganesh
Allan Sandfeld Jensen Dec. 19, 2013, 9:54 a.m. UTC | #12
On Thursday 19 December 2013, Gopalasubramanian, Ganesh wrote:
> > Yes, I changed that in the last patch, though I consider it momentarily
> > problematic because you do not yet enable AVX with march=btver2 (AVX
> > versions would currently be better than btver2 versions for a btver2
> > arch), but expect march=btver2 will be fixed soon.
> 
> The " processor_alias_table" entry for "btver2" in i386.c enables AVX.
> 
> <snip>
>       {"btver2", PROCESSOR_BTVER2, CPU_BTVER2,
>         PTA_64BIT | PTA_MMX |  PTA_SSE  | PTA_SSE2 | PTA_SSE3
>         | PTA_SSSE3 | PTA_SSE4A |PTA_ABM | PTA_CX16 | PTA_SSE4_1
>         | PTA_SSE4_2 | PTA_AES | PTA_PCLMUL | PTA_AVX
>         | PTA_BMI | PTA_F16C | PTA_MOVBE | PTA_PRFCHW
>         | PTA_FXSR | PTA_XSAVE | PTA_XSAVEOPT},
> 
> </snip>
> 
> The assembly listing for a simple test (compiled with -march=btver2) also
> has -mavx enabled. So, can you please enable AVX for btver2?
> 
Sorry, I must have been looking at an older version, but as I said I already 
did enable it in the latest patch. (see http://gcc.gnu.org/ml/gcc-patches/2013-12/msg01577.html )

`Allan

Patch

Index: gcc/ChangeLog
===================================================================
--- gcc/ChangeLog	(revision 205984)
+++ gcc/ChangeLog	(working copy)
@@ -1,3 +1,9 @@ 
+2013-12-14  Allan Sandfeld Jensen <sandfeld@kde.org>
+
+        PR gcc/59422
+        * config/i386/i386.c: Extend function multiversioning
+        to better support recent Intel and AMD models.
+        
 2013-12-14  Marek Polacek  <polacek@redhat.com>
 
 	PR sanitizer/59503
Index: gcc/config/i386/i386.c
===================================================================
--- gcc/config/i386/i386.c	(revision 205984)
+++ gcc/config/i386/i386.c	(working copy)
@@ -29962,9 +29962,14 @@ 
     P_PROC_SSE4_2,
     P_POPCNT,
     P_AVX,
+    P_PROC_AVX,
+    P_FMA4,
+    P_XOP,
+    P_PROC_XOP,
+    P_FMA,    
+    P_PROC_FMA,
     P_AVX2,
-    P_FMA,
-    P_PROC_FMA
+    P_PROC_AVX2
   };
 
  enum feature_priority priority = P_ZERO;
@@ -29983,11 +29988,15 @@ 
       {"sse", P_SSE},
       {"sse2", P_SSE2},
       {"sse3", P_SSE3},
+      {"sse4a", P_SSE4_a},
       {"ssse3", P_SSSE3},
       {"sse4.1", P_SSE4_1},
       {"sse4.2", P_SSE4_2},
       {"popcnt", P_POPCNT},
       {"avx", P_AVX},
+      {"fma4", P_FMA4},
+      {"xop", P_XOP},
+      {"fma", P_FMA},
       {"avx2", P_AVX2}
     };
 
@@ -30041,25 +30050,49 @@ 
 	      break;
             case PROCESSOR_COREI7_AVX:
               arg_str = "corei7-avx";
-              priority = P_PROC_SSE4_2;
+              priority = P_PROC_AVX;
               break;
+            case PROCESSOR_HASWELL:
+              arg_str = "core-avx2";
+              priority = P_PROC_AVX2;
+              break;
 	    case PROCESSOR_ATOM:
 	      arg_str = "atom";
 	      priority = P_PROC_SSSE3;
 	      break;
+            case PROCESSOR_SLM:
+              arg_str = "slm";
+              priority = P_PROC_SSE4_2;
+              break;
 	    case PROCESSOR_AMDFAM10:
 	      arg_str = "amdfam10h";
 	      priority = P_PROC_SSE4_a;
 	      break;
+            case PROCESSOR_BTVER1:
+              arg_str = "btver1";
+              priority = P_PROC_SSE4_a;
+              break;
+            case PROCESSOR_BTVER2:
+              arg_str = "btver2";
+              priority = P_PROC_SSE4_2;
+              break;
 	    case PROCESSOR_BDVER1:
 	      arg_str = "bdver1";
-	      priority = P_PROC_FMA;
+	      priority = P_PROC_XOP;
 	      break;
 	    case PROCESSOR_BDVER2:
 	      arg_str = "bdver2";
 	      priority = P_PROC_FMA;
 	      break;
-	    }  
+            case PROCESSOR_BDVER3:
+              arg_str = "bdver3";
+              priority = P_PROC_FMA;
+              break;
+            case PROCESSOR_BDVER4:
+              arg_str = "bdver4";
+              priority = P_PROC_AVX2;
+              break;
+            }  
 	}    
     
       cl_target_option_restore (&global_options, &cur_target);
@@ -30919,9 +30952,13 @@ 
     F_SSE2,
     F_SSE3,
     F_SSSE3,
+    F_SSE4_a,
     F_SSE4_1,
     F_SSE4_2,
     F_AVX,
+    F_FMA4,
+    F_XOP,
+    F_FMA,
     F_AVX2,
     F_MAX
   };
@@ -30938,15 +30975,20 @@ 
     M_INTEL_CORE2,
     M_INTEL_COREI7,
     M_AMDFAM10H,
+    M_AMDFAM14H,
     M_AMDFAM15H,
     M_INTEL_SLM,
     M_CPU_SUBTYPE_START,
     M_INTEL_COREI7_NEHALEM,
     M_INTEL_COREI7_WESTMERE,
     M_INTEL_COREI7_SANDYBRIDGE,
+    M_INTEL_COREI7_IVYBRIDGE,
+    M_INTEL_COREI7_HASWELL,
     M_AMDFAM10H_BARCELONA,
     M_AMDFAM10H_SHANGHAI,
     M_AMDFAM10H_ISTANBUL,
+    M_AMDFAM14H_BTVER1,
+    M_AMDFAM14H_BTVER2,
     M_AMDFAM15H_BDVER1,
     M_AMDFAM15H_BDVER2,
     M_AMDFAM15H_BDVER3,
@@ -30968,11 +31010,16 @@ 
       {"corei7", M_INTEL_COREI7},
       {"nehalem", M_INTEL_COREI7_NEHALEM},
       {"westmere", M_INTEL_COREI7_WESTMERE},
-      {"sandybridge", M_INTEL_COREI7_SANDYBRIDGE},
+      {"corei7-avx", M_INTEL_COREI7_SANDYBRIDGE},
+      {"core-avx-i", M_INTEL_COREI7_IVYBRIDGE},
+      {"core-avx2", M_INTEL_COREI7_HASWELL},
       {"amdfam10h", M_AMDFAM10H},
       {"barcelona", M_AMDFAM10H_BARCELONA},
       {"shanghai", M_AMDFAM10H_SHANGHAI},
       {"istanbul", M_AMDFAM10H_ISTANBUL},
+      {"amdfam14h", M_AMDFAM14H},
+      {"btver1", M_AMDFAM14H_BTVER1},
+      {"btver2", M_AMDFAM14H_BTVER2},
       {"amdfam15h", M_AMDFAM15H},
       {"bdver1", M_AMDFAM15H_BDVER1},
       {"bdver2", M_AMDFAM15H_BDVER2},
@@ -30994,9 +31041,13 @@ 
       {"sse2",   F_SSE2},
       {"sse3",   F_SSE3},
       {"ssse3",  F_SSSE3},
+      {"sse4a",  F_SSE4_a},
       {"sse4.1", F_SSE4_1},
       {"sse4.2", F_SSE4_2},
       {"avx",    F_AVX},
+      {"fma4",   F_FMA4},
+      {"xop",    F_XOP},
+      {"fma",    F_FMA},
       {"avx2",   F_AVX2}
     };
 
Index: libgcc/ChangeLog
===================================================================
--- libgcc/ChangeLog	(revision 205984)
+++ libgcc/ChangeLog	(working copy)
@@ -1,3 +1,9 @@ 
+2013-12-14  Allan Sandfeld Jensen <sandfeld@kde.org>
+
+        PR gcc/59422
+        * config/i386/cpuinfo.c: Detect sse4a, fma4, xop and fma
+        ISAs and recent Intel and AMD models.
+        
 2013-12-12  Zhenqiang Chen  <zhenqiang.chen@arm.com>
 
 	* config.host (arm*-*-uclinux*): Move t-arm before t-bpabi.
Index: libgcc/config/i386/cpuinfo.c
===================================================================
--- libgcc/config/i386/cpuinfo.c	(revision 205984)
+++ libgcc/config/i386/cpuinfo.c	(working copy)
@@ -60,6 +60,7 @@ 
   INTEL_CORE2,
   INTEL_COREI7,
   AMDFAM10H,
+  AMDFAM14H,
   AMDFAM15H,
   INTEL_SLM,
   CPU_TYPE_MAX
@@ -70,11 +71,17 @@ 
   INTEL_COREI7_NEHALEM = 1,
   INTEL_COREI7_WESTMERE,
   INTEL_COREI7_SANDYBRIDGE,
+  INTEL_COREI7_IVYBRIDGE,
+  INTEL_COREI7_HASWELL,
   AMDFAM10H_BARCELONA,
   AMDFAM10H_SHANGHAI,
   AMDFAM10H_ISTANBUL,
+  AMDFAM14H_BTVER1,
+  AMDFAM14H_BTVER2,
   AMDFAM15H_BDVER1,
   AMDFAM15H_BDVER2,
+  AMDFAM15H_BDVER3,
+  AMDFAM15H_BDVER4,
   CPU_SUBTYPE_MAX
 };
 
@@ -89,9 +96,13 @@ 
   FEATURE_SSE2,
   FEATURE_SSE3,
   FEATURE_SSSE3,
+  FEATURE_SSE4_a,
   FEATURE_SSE4_1,
   FEATURE_SSE4_2,
   FEATURE_AVX,
+  FEATURE_FMA4,
+  FEATURE_XOP,
+  FEATURE_FMA,
   FEATURE_AVX2
 };
 
@@ -113,36 +124,43 @@ 
     {
     /* AMD Family 10h.  */
     case 0x10:
+      __cpu_model.__cpu_type = AMDFAM10H;
       switch (model)
 	{
 	case 0x2:
 	  /* Barcelona.  */
-	  __cpu_model.__cpu_type = AMDFAM10H;
 	  __cpu_model.__cpu_subtype = AMDFAM10H_BARCELONA;
 	  break;
 	case 0x4:
 	  /* Shanghai.  */
-	  __cpu_model.__cpu_type = AMDFAM10H;
 	  __cpu_model.__cpu_subtype = AMDFAM10H_SHANGHAI;
 	  break;
 	case 0x8:
 	  /* Istanbul.  */
-	  __cpu_model.__cpu_type = AMDFAM10H;
 	  __cpu_model.__cpu_subtype = AMDFAM10H_ISTANBUL;
 	  break;
 	default:
 	  break;
 	}
       break;
-    /* AMD Family 15h.  */
+    /* AMD Family 14h "Bobcat". */
+    case 0x14:
+      __cpu_model.__cpu_type = AMDFAM14H;
+      if ( model <= 0xf)
+        __cpu_model.__cpu_subtype = AMDFAM14H_BTVER1;
+      break;
+    /* AMD Family 15h "Bulldozer".  */
     case 0x15:
       __cpu_model.__cpu_type = AMDFAM15H;
       /* Bulldozer version 1.  */
       if ( model <= 0xf)
 	__cpu_model.__cpu_subtype = AMDFAM15H_BDVER1;
-      /* Bulldozer version 2.  */
-      if (model >= 0x10 && model <= 0x1f)
-	__cpu_model.__cpu_subtype = AMDFAM15H_BDVER2;
+      /* Bulldozer version 2 "Piledriver" */
+      if (model >= 0x10 && model <= 0x2f)
+	__cpu_model.__cpu_subtype = AMDFAM15H_BDVER2;      
+      /* Bulldozer version 3 "Steamroller"  */
+      if (model >= 0x30 && model <= 0x4f)
+        __cpu_model.__cpu_subtype = AMDFAM15H_BDVER3;
       break;
     default:
       break;
@@ -196,6 +214,20 @@ 
 	      __cpu_model.__cpu_type = INTEL_COREI7;
 	      __cpu_model.__cpu_subtype = INTEL_COREI7_SANDYBRIDGE;
 	      break;
+            case 0x3a:
+            case 0x3e:
+              /* Ivy Bridge.  */
+              __cpu_model.__cpu_type = INTEL_COREI7;
+              __cpu_model.__cpu_subtype = INTEL_COREI7_IVYBRIDGE;
+              break;
+            case 0x3c:
+            case 0x3f:
+            case 0x45:
+            case 0x46:
+              /* Haswell.  */
+              __cpu_model.__cpu_type = INTEL_COREI7;
+              __cpu_model.__cpu_subtype = INTEL_COREI7_HASWELL;
+              break;
 	    case 0x17:
 	    case 0x1d:
 	      /* Penryn.  */
@@ -242,6 +272,8 @@ 
     features |= (1 << FEATURE_SSE4_2);
   if (ecx & bit_AVX)
     features |= (1 << FEATURE_AVX);
+  if (ecx & bit_FMA)
+    features |= (1 << FEATURE_FMA);
 
   /* Get Advanced Features at level 7 (eax = 7, ecx = 0). */
   if (max_cpuid_level >= 7)
@@ -252,6 +284,23 @@ 
 	features |= (1 << FEATURE_AVX2);
     }
 
+  unsigned int ext_level;
+  unsigned int eax, ebx;
+  /* Check cpuid level of extended features.  */
+  __cpuid (0x80000000, ext_level, ebx, ecx, edx);
+
+  if (ext_level > 0x80000000)
+    {
+      __cpuid (0x80000001, eax, ebx, ecx, edx);
+
+      if (ecx & bit_SSE4a)
+        features |= (1 << FEATURE_SSE4_a);
+      if (ecx & bit_FMA4)
+        features |= (1 << FEATURE_FMA4);
+      if (ecx & bit_XOP)
+        features |= (1 << FEATURE_XOP);
+    }
+    
   __cpu_model.__cpu_features[0] = features;
 }
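
The family and model values tested in the cpuinfo.c hunks come from the CPUID leaf 1 signature. The following is a minimal sketch of that decoding, assuming the standard layout of EAX (base family in bits 8-11, extended family in bits 20-27, base model in bits 4-7, extended model in bits 16-19). The helper names are hypothetical, and bdver_generation only restates the family 15h model ranges used in the patch:

```c
/* Display family: the extended family is added only when the base
   family field is 0xf.  */
unsigned int cpuid_family(unsigned int eax)
{
  unsigned int family = (eax >> 8) & 0xf;
  if (family == 0xf)
    family += (eax >> 20) & 0xff;
  return family;
}

/* Display model: the extended model supplies the high nibble when the
   base family field is 0x6 or 0xf.  */
unsigned int cpuid_model(unsigned int eax)
{
  unsigned int family = (eax >> 8) & 0xf;
  unsigned int model = (eax >> 4) & 0xf;
  if (family == 0x6 || family == 0xf)
    model += ((eax >> 16) & 0xf) << 4;
  return model;
}

/* Bulldozer generation for a family 15h model number, following the
   ranges in the patch; returns 0 for models outside those ranges.  */
int bdver_generation(unsigned int model)
{
  if (model <= 0x0f)
    return 1;
  if (model <= 0x2f)
    return 2;
  if (model <= 0x4f)
    return 3;
  return 0;
}
```

For example, a signature of 0x000306c3 decodes to family 0x6, model 0x3c (one of the Haswell cases above), and 0x00700f01 decodes to family 0x16, the Jaguar family discussed in the thread.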