Patchwork [i386] fma,fma4 and xop flags

Submitter Gopalasubramanian, Ganesh
Date Aug. 16, 2012, 11:43 a.m.
Message ID <EB4625145972F94C9680D8CADD6516152A07230F@sausexdag04.amd.com>
Permalink /patch/177963/
State New

Comments

Gopalasubramanian, Ganesh - Aug. 16, 2012, 11:43 a.m.
> This won't work, since we have to prefer FMA3 also in the case where only "-mfma -mfma4" without -mtune=XX is used.
> We can add TARGET_FMA_BOTH though, but I doubt there will ever be a target that implements both insn sets without preferences.

Preferring FMA3 over FMA4 may not always be a win. For instance, under increased register pressure FMA3 can be used.
But when we have more registers at our disposal, FMA4 may do better by avoiding an extra reload.
IMO, if the choice between FMA instructions is to be decided by register pressure, we may need some functionality to support that.

So, ideally for bdver2, we would like both fma and fma4 to be generated with the options "-mfma -mfma4".
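
To make the register-pressure point concrete, here is a minimal sketch in C. It is not part of the patch, and the function name is made up for illustration. FMA3 encodings overwrite one of their three source operands, while FMA4 has a separate destination operand, so when all three inputs are still needed afterwards an FMA3 sequence has to preserve one of them first. Whether GCC actually contracts `a * b + c` into a fused multiply-add depends on the target flags and the floating-point contraction setting.

```c
/* Hypothetical example, not part of the patch under discussion.
   All three inputs stay live after the multiply-add.  */
double
fma_keep_inputs (double a, double b, double c)
{
  /* Candidate for a fused multiply-add when FMA insns are enabled.  */
  double r = a * b + c;

  /* a, b and c are all used again, so none of them may be clobbered
     by the instruction that computes r.  */
  return r + a + b + c;
}
```

Comparing the assembly produced with "-O2 -mfma" against "-O2 -mfma4" for a function of this shape is one way to check whether the extra copy shows up in practice.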

Regards
Ganesh

-----Original Message-----
From: Uros Bizjak [mailto:ubizjak@gmail.com] 
Sent: Tuesday, August 14, 2012 9:12 PM
To: Richard Henderson
Cc: Gopalasubramanian, Ganesh; gcc-patches@gcc.gnu.org
Subject: Re: [PATCH,i386] fma,fma4 and xop flags

On Mon, Aug 13, 2012 at 9:50 PM, Richard Henderson <rth@redhat.com> wrote:
> On 08/13/2012 12:33 PM, Uros Bizjak wrote:
>> AFAIU fma3 is better than fma4 for bdver2 (the only CPU that 
>> implements both FMA sets). Current description of bdver2 doesn't even 
>> enable fma4 in processor_alias_table due to this fact.
>>
>> The change you are referring to adds preference for fma3 insn set for 
>> generic code (not FMA4 builtins!), even when fma4 is enabled. So, no 
>> matter which combination and sequence of -mfma -mfma4 or -mxop user 
>> passes to the compiler, only fma3 instructions will be generated.
>
> This rationale needs to appear as a comment above
>
>> +      (eq_attr "isa" "fma4")
>> +        (symbol_ref "TARGET_FMA4 && !TARGET_FMA")

I plan to commit the following patch:

--cut here--
--cut here--

> Longer term we may well require some sort of
>
>   (TARGET_FMA4 && !(TARGET_FMA && TARGET_PREFER_FMA3))
>
> with an appropriate entry in ix86_tune_features to match.

This won't work, since we have to prefer FMA3 also in the case where only "-mfma -mfma4" without -mtune=XX is used. We can add TARGET_FMA_BOTH though, but I doubt there will ever be a target that implements both insn sets without preferences.
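
The difference between the two conditions can be sketched as plain C predicates. This is only a model: the struct and function names are hypothetical, TARGET_PREFER_FMA3 is the tuning bit proposed above and not an existing macro, and the sketch assumes that bit would stay unset when no matching -mtune=XX is given.

```c
#include <stdbool.h>

/* Hypothetical model of the relevant target flags.  This struct does
   not exist in GCC; the fields only mirror the macros named in the
   thread.  */
struct isa_flags
{
  bool fma;          /* -mfma  (TARGET_FMA)  */
  bool fma4;         /* -mfma4 (TARGET_FMA4) */
  bool prefer_fma3;  /* proposed tuning bit (TARGET_PREFER_FMA3) */
};

/* Condition used in the patch: TARGET_FMA4 && !TARGET_FMA.  */
bool
fma4_enabled_patch (struct isa_flags f)
{
  return f.fma4 && !f.fma;
}

/* Proposed alternative:
   TARGET_FMA4 && !(TARGET_FMA && TARGET_PREFER_FMA3).  */
bool
fma4_enabled_tuned (struct isa_flags f)
{
  return f.fma4 && !(f.fma && f.prefer_fma3);
}
```

With both fma and fma4 set and prefer_fma3 unset, fma4_enabled_patch returns false while fma4_enabled_tuned returns true, so under that assumption the FMA4 alternatives would be enabled again for plain "-mfma -mfma4".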

Uros.
Uros Bizjak - Aug. 16, 2012, 12:17 p.m.
On Thu, Aug 16, 2012 at 1:43 PM, Gopalasubramanian, Ganesh
<Ganesh.Gopalasubramanian@amd.com> wrote:
>> This won't work, since we have to prefer FMA3 also in the case where only "-mfma -mfma4" without -mtune=XX is used.
>> We can add TARGET_FMA_BOTH though, but I doubt there will ever be a target that implements both insn sets without preferences.
>
> Preferring FMA3 over FMA4 may not always be a win. For instance, under increased register pressure FMA3 can be used.
> But when we have more registers at our disposal, FMA4 may do better by avoiding an extra reload.
> IMO, if the choice between FMA instructions is to be decided by register pressure, we may need some functionality to support that.
>
> So, ideally for bdver2, we would like both fma and fma4 to be generated with the options "-mfma -mfma4".

Yes, it can now work that way as well. The current insn generation can
be changed trivially: just change the "fma4" condition of the "enabled"
attribute in i386.md.

I will wait for your recommendation.

Uros.

Patch

Index: i386.md
===================================================================
--- i386.md     (revision 190362)
+++ i386.md     (working copy)
@@ -659,6 +659,9 @@ 
         (eq_attr "isa" "noavx2") (symbol_ref "!TARGET_AVX2")
         (eq_attr "isa" "bmi2") (symbol_ref "TARGET_BMI2")
         (eq_attr "isa" "fma") (symbol_ref "TARGET_FMA")
+        ;; Disable generation of FMA4 instructions for generic code
+        ;; since FMA3 is preferred for targets that implement both
+        ;; instruction sets.
         (eq_attr "isa" "fma4")
           (symbol_ref "TARGET_FMA4 && !TARGET_FMA")
        ]