[i386] fma4 addition for bdver2

Submitted by Gopalasubramanian, Ganesh on Sept. 5, 2012, 10:10 a.m.

Details

Message ID EB4625145972F94C9680D8CADD6516152A082188@sausexdag02.amd.com
State New

Commit Message

Gopalasubramanian, Ganesh Sept. 5, 2012, 10:10 a.m.
Hello,

The bdver2 target implements both the FMA4 and FMA3 ISAs.
FMA3 is selected by default.
This patch enables the use of FMA4 intrinsics on bdver2 targets.

Is it OK for trunk?

Regards
Ganesh

2012-09-05  Ganesh Gopalasubramanian  <Ganesh.Gopalasubramanian@amd.com>

	* config/i386/i386.md: Update the comment on fma4 instruction
	selection to note that it requires a cost model based on
	register pressure.

	* config/i386/driver-i386.c (host_detect_local_cpu): Set the fma4
	flag as reported by cpuid; do not clear it when fma is also
	present.

	* config/i386/i386.c (processor_alias_table): Enable the fma4
	flag for bdver2.


Comments

Uros Bizjak Sept. 5, 2012, 10:18 a.m.
On Wed, Sep 5, 2012 at 12:10 PM, Gopalasubramanian, Ganesh
<Ganesh.Gopalasubramanian@amd.com> wrote:

> The bdver2 target implements both the FMA4 and FMA3 ISAs.
> FMA3 is selected by default.
> This patch enables the use of FMA4 intrinsics on bdver2 targets.
>
> Is it OK for trunk?

OK. I will backport this patch, together with my previous FMA patch to
4.7 branch.

Thanks,
Uros.
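
For context, the kind of FMA4 intrinsic use discussed in this thread might look like the sketch below. It assumes that the compiler defines `__FMA4__` when the FMA4 ISA is enabled (for example with `-mfma4`, and presumably with `-march=bdver2` once this patch is applied). The helper name `sketch_madd_f32` is hypothetical, and the fallback path keeps the code buildable without FMA4.

```c
#ifdef __FMA4__
#include <x86intrin.h>   /* provides the FMA4 intrinsics when FMA4 is enabled */
#endif

/* Hypothetical helper: returns a * b + c. */
float sketch_madd_f32(float a, float b, float c)
{
#ifdef __FMA4__
    /* _mm_macc_ss performs a fused multiply-add on the low float lane. */
    __m128 r = _mm_macc_ss(_mm_set_ss(a), _mm_set_ss(b), _mm_set_ss(c));
    return _mm_cvtss_f32(r);
#else
    /* Portable fallback when FMA4 is not enabled at compile time. */
    return a * b + c;
#endif
}
```

A binary built with the FMA4 path must run on hardware that implements FMA4; the fallback has no such requirement.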

Patch

Index: gcc/config/i386/i386.md
===================================================================
--- gcc/config/i386/i386.md     (revision 190830)
+++ gcc/config/i386/i386.md     (working copy)
@@ -659,9 +659,11 @@ 
         (eq_attr "isa" "noavx2") (symbol_ref "!TARGET_AVX2")
         (eq_attr "isa" "bmi2") (symbol_ref "TARGET_BMI2")
         (eq_attr "isa" "fma") (symbol_ref "TARGET_FMA")
-        ;; Disable generation of FMA4 instructions for generic code
-        ;; since FMA3 is preferred for targets that implement both
-        ;; instruction sets.
+        ;; Fma instruction selection has to be done based on
+        ;; register pressure. For generating fma4, a cost model
+        ;; based on register pressure is required. Till then,
+        ;; fma4 instruction is disabled for targets that implement
+        ;; both fma and fma4 instruction sets.
         (eq_attr "isa" "fma4")
           (symbol_ref "TARGET_FMA4 && !TARGET_FMA")
        ]
Index: gcc/config/i386/driver-i386.c
===================================================================
--- gcc/config/i386/driver-i386.c       (revision 190830)
+++ gcc/config/i386/driver-i386.c       (working copy)
@@ -483,8 +483,6 @@ 
       has_abm = ecx & bit_ABM;
       has_lwp = ecx & bit_LWP;
       has_fma4 = ecx & bit_FMA4;
-      if (vendor == SIG_AMD && has_fma4 && has_fma)
-       has_fma4 = 0;
       has_xop = ecx & bit_XOP;
       has_tbm = ecx & bit_TBM;
       has_lzcnt = ecx & bit_LZCNT;
Index: gcc/config/i386/i386.c
===================================================================
--- gcc/config/i386/i386.c      (revision 190830)
+++ gcc/config/i386/i386.c      (working copy)
@@ -3164,7 +3164,7 @@ 
       {"bdver2", PROCESSOR_BDVER2, CPU_BDVER2,
        PTA_64BIT | PTA_MMX | PTA_SSE | PTA_SSE2 | PTA_SSE3
        | PTA_SSE4A | PTA_CX16 | PTA_ABM | PTA_SSSE3 | PTA_SSE4_1
-       | PTA_SSE4_2 | PTA_AES | PTA_PCLMUL | PTA_AVX
+       | PTA_SSE4_2 | PTA_AES | PTA_PCLMUL | PTA_AVX | PTA_FMA4
        | PTA_XOP | PTA_LWP | PTA_BMI | PTA_TBM | PTA_F16C
        | PTA_FMA},
       {"btver1", PROCESSOR_BTVER1, CPU_GENERIC64,
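
The effect of the driver-i386.c hunk can be sketched as two small decoding functions, one for the old behavior and one for the new. This is a simplified model, not the driver code itself: it assumes the FMA bit comes from CPUID leaf 1 (ECX bit 12) and the FMA4 bit from leaf 0x80000001 (ECX bit 16), and all names are hypothetical.

```c
#include <stdbool.h>

/* CPUID feature bits (assumed leaves: 1 for FMA, 0x80000001 for FMA4). */
#define SKETCH_BIT_FMA   (1u << 12)   /* leaf 1, ECX */
#define SKETCH_BIT_FMA4  (1u << 16)   /* leaf 0x80000001, ECX */

struct sketch_fma_support {
    bool has_fma;
    bool has_fma4;
};

/* Before the patch: on AMD parts with both ISAs, FMA4 was masked off. */
struct sketch_fma_support
sketch_detect_before(unsigned leaf1_ecx, unsigned ext_ecx, bool vendor_is_amd)
{
    struct sketch_fma_support s;
    s.has_fma  = (leaf1_ecx & SKETCH_BIT_FMA) != 0;
    s.has_fma4 = (ext_ecx & SKETCH_BIT_FMA4) != 0;
    if (vendor_is_amd && s.has_fma4 && s.has_fma)
        s.has_fma4 = false;
    return s;
}

/* After the patch: FMA4 is reported exactly as cpuid states it. */
struct sketch_fma_support
sketch_detect_after(unsigned leaf1_ecx, unsigned ext_ecx)
{
    struct sketch_fma_support s;
    s.has_fma  = (leaf1_ecx & SKETCH_BIT_FMA) != 0;
    s.has_fma4 = (ext_ecx & SKETCH_BIT_FMA4) != 0;
    return s;
}
```

For a CPU that reports both bits, the old logic yields FMA only, while the new logic yields both FMA and FMA4.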