
[i386] fma3 instruction generation for 'march=native' in AMD processors

Message ID EB4625145972F94C9680D8CADD65161507309E5E@sausexdag04.amd.com
State New

Commit Message

Gopalasubramanian, Ganesh May 2, 2012, 11:12 a.m. UTC
For AMD architectures that support both fma3 and fma4 instructions, GCC generates fma4 by default. We would like it to generate fma3 instructions instead. The patch below enables fma3 instruction generation for "-march=native".

Ok for trunk?


Regards
Ganesh

Comments

Jakub Jelinek May 2, 2012, 11:41 a.m. UTC | #1
On Wed, May 02, 2012 at 11:12:33AM +0000, Gopalasubramanian, Ganesh wrote:
> For AMD architectures that support both fma3 and fma4 instructions, GCC generates fma4 by default. We would like it to generate fma3 instructions instead. The patch below enables fma3 instruction generation for "-march=native".
> 
> Ok for trunk?

You haven't provided a ChangeLog entry.

> Index: gcc/config/i386/driver-i386.c
> ===================================================================
> --- gcc/config/i386/driver-i386.c       (revision 186897)
> +++ gcc/config/i386/driver-i386.c       (working copy)
> @@ -472,6 +472,10 @@
>        has_abm = ecx & bit_ABM;
>        has_lwp = ecx & bit_LWP;
>        has_fma4 = ecx & bit_FMA4;
> +      if (((vendor == SIG_AMD)) && (has_fma4) && (has_fma))
> +        {
> +            has_fma4 = 0;
> +        }

And the formatting of this is wrong, 4 unnecessary pairs of (),
one unnecessary {} pair, bad indentation of the has_fma4 = 0;
assignment (should use a tab).

	Jakub
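
For illustration, the check under review can be sketched as a standalone function with the formatting the review asks for: no redundant parentheses and no braces around the single statement. This is only a sketch, not the committed change. The function name, its parameters, and the SKETCH_BIT_* macros are hypothetical; the macros stand in for the bit_FMA and bit_FMA4 masks and assume FMA is reported in CPUID leaf 1, ECX bit 12, and FMA4 in CPUID leaf 0x80000001, ECX bit 16.

```c
#include <stdint.h>

/* Hypothetical stand-ins for the bit_FMA and bit_FMA4 masks used by the
   patch.  Assumed positions: FMA is CPUID leaf 1 ECX bit 12, FMA4 is
   CPUID leaf 0x80000001 ECX bit 16.  */
#define SKETCH_BIT_FMA  (1u << 12)
#define SKETCH_BIT_FMA4 (1u << 16)

/* Return nonzero if FMA4 should remain enabled.  IS_AMD plays the role of
   the patch's vendor == SIG_AMD test; the two ECX arguments are the raw
   register values from the standard and extended feature leaves.  */
static int
sketch_keep_fma4 (int is_amd, uint32_t ecx_leaf1, uint32_t ecx_ext_leaf1)
{
  int has_fma = (ecx_leaf1 & SKETCH_BIT_FMA) != 0;
  int has_fma4 = (ecx_ext_leaf1 & SKETCH_BIT_FMA4) != 0;

  /* Same condition as in the patch: when an AMD processor reports both
     FMA3 and FMA4, drop FMA4 so that FMA3 is used.  */
  if (is_amd && has_fma4 && has_fma)
    has_fma4 = 0;

  return has_fma4;
}
```

In driver-i386.c itself the surrounding statements are indented by six spaces, so the assignment would sit at column eight and be written with a tab, as the review notes; the sketch is nested one level shallower and therefore uses spaces.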

Patch

Index: gcc/config/i386/driver-i386.c
===================================================================
--- gcc/config/i386/driver-i386.c       (revision 186897)
+++ gcc/config/i386/driver-i386.c       (working copy)
@@ -472,6 +472,10 @@ 
       has_abm = ecx & bit_ABM;
       has_lwp = ecx & bit_LWP;
       has_fma4 = ecx & bit_FMA4;
+      if (((vendor == SIG_AMD)) && (has_fma4) && (has_fma))
+        {
+            has_fma4 = 0;
+        }
       has_xop = ecx & bit_XOP;
       has_tbm = ecx & bit_TBM;
       has_lzcnt = ecx & bit_LZCNT;