[i386] fma3 instruction generation for 'march=native' in AMD processors

Message ID EB4625145972F94C9680D8CADD6516150730F02B@sausexdag04.amd.com
State New

Commit Message

Gopalasubramanian, Ganesh May 3, 2012, 6:36 a.m. UTC
I have added the ChangeLog and modified the patch.

Is it OK to commit to trunk?

Regards
Ganesh

2012-05-03  Ganesh Gopalasubramanian  <Ganesh.Gopalasubramanian@amd.com>

	* config/i386/driver-i386.c (host_detect_local_cpu): Reset
	has_fma4 for AMD processors with both fma3 and fma4 support.


-----Original Message-----
From: Jakub Jelinek [mailto:jakub@redhat.com] 
Sent: Wednesday, May 02, 2012 5:11 PM
To: Gopalasubramanian, Ganesh
Cc: gcc-patches@gcc.gnu.org
Subject: Re: [PATCH] [i386] fma3 instruction generation for 'march=native' in AMD processors

On Wed, May 02, 2012 at 11:12:33AM +0000, Gopalasubramanian, Ganesh wrote:
> For AMD architectures that support both fma3 and fma4 instructions, GCC generates fma4 by default. Instead, we would like to generate fma3 instructions. The patch below enables fma3 instruction generation for "-march=native".
> 
> Ok for trunk?

You haven't provided a ChangeLog entry.

> Index: gcc/config/i386/driver-i386.c
> ===================================================================
> --- gcc/config/i386/driver-i386.c       (revision 186897)
> +++ gcc/config/i386/driver-i386.c       (working copy)
> @@ -472,6 +472,10 @@
>        has_abm = ecx & bit_ABM;
>        has_lwp = ecx & bit_LWP;
>        has_fma4 = ecx & bit_FMA4;
> +      if (((vendor == SIG_AMD)) && (has_fma4) && (has_fma))
> +        {
> +            has_fma4 = 0;
> +        }

And the formatting of this is wrong, 4 unnecessary pairs of (),
one unnecessary {} pair, bad indentation of the has_fma4 = 0;
assignment (should use a tab).

	Jakub

Patch

Index: config/i386/driver-i386.c
===================================================================
--- config/i386/driver-i386.c   (revision 186897)
+++ config/i386/driver-i386.c   (working copy)
@@ -472,6 +472,8 @@ 
       has_abm = ecx & bit_ABM;
       has_lwp = ecx & bit_LWP;
       has_fma4 = ecx & bit_FMA4;
+      if (vendor == SIG_AMD && has_fma4 && has_fma)
+	has_fma4 = 0;
       has_xop = ecx & bit_XOP;
       has_tbm = ecx & bit_TBM;
       has_lzcnt = ecx & bit_LZCNT;
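
For illustration only, the effect of the hunk above can be sketched as a
standalone function. This is not part of the patch. The helper name
select_fma, the struct fma_flags, and the SKETCH_* constants are
hypothetical; the constant values restate what I believe are the AMD vendor
signature and the FMA/FMA4 CPUID bit positions, so that the sketch does not
need <cpuid.h> or an x86 host to build.

```c
/* Sketch of the flag selection done for -march=native; not the driver code.
   All names here are hypothetical and exist only for this example.  */

/* EBX of CPUID leaf 0 on AMD parts: the bytes "Auth" (assumed value).  */
#define SKETCH_SIG_AMD   0x68747541u
/* FMA (fma3) feature bit in ECX of CPUID leaf 1 (assumed value).  */
#define SKETCH_BIT_FMA   (1u << 12)
/* FMA4 feature bit in ECX of CPUID leaf 0x80000001 (assumed value).  */
#define SKETCH_BIT_FMA4  (1u << 16)

struct fma_flags
{
  int has_fma;
  int has_fma4;
};

/* Take the raw register values as arguments so the logic can be
   exercised without executing CPUID.  */
struct fma_flags
select_fma (unsigned int vendor, unsigned int ecx_leaf1,
	    unsigned int ecx_ext)
{
  struct fma_flags f;

  f.has_fma = (ecx_leaf1 & SKETCH_BIT_FMA) != 0;
  f.has_fma4 = (ecx_ext & SKETCH_BIT_FMA4) != 0;

  /* Same condition as the patch: on AMD parts that report both,
     drop fma4 so that only fma3 is requested.  */
  if (vendor == SKETCH_SIG_AMD && f.has_fma4 && f.has_fma)
    f.has_fma4 = 0;

  return f;
}
```

With both bits set and the AMD signature, the sketch reports has_fma set and
has_fma4 clear. An AMD part that reports only FMA4 keeps has_fma4, and a
non-AMD vendor is left untouched.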