Patchwork [9/N,PA] convert to fma

Submitter Richard Henderson
Date Nov. 12, 2010, 3:04 a.m.
Message ID <20101112030453.GA14645@twiddle.net>
Permalink /patch/70916/
State New

Comments

Richard Henderson - Nov. 12, 2010, 3:04 a.m.
It would seem that all the splitters that work around
combine and single outputs are no longer necessary.

Untested.


r~
John David Anglin - Nov. 15, 2010, 9:46 p.m.
On Fri, 12 Nov 2010, John David Anglin wrote:

> > Untested.
> 
> Will test.

With the change, I see a number of fma related testsuite FAILs:

FAIL: gcc.dg/torture/builtin-attr-1.c  -O0  (test for excess errors)
FAIL: gcc.dg/torture/builtin-math-2.c  -O0  scan-tree-dump-times original "fma " 3
FAIL: gcc.dg/torture/builtin-math-2.c  -O0  scan-tree-dump-times original "fmaf" 3
FAIL: gcc.dg/torture/builtin-math-2.c  -O1  scan-tree-dump-times original "fma " 3
FAIL: gcc.dg/torture/builtin-math-2.c  -O1  scan-tree-dump-times original "fmaf" 3
FAIL: gcc.dg/torture/builtin-math-2.c  -O2  scan-tree-dump-times original "fma " 3
FAIL: gcc.dg/torture/builtin-math-2.c  -O2  scan-tree-dump-times original "fmaf" 3
FAIL: gcc.dg/torture/builtin-math-2.c  -O3 -fomit-frame-pointer  scan-tree-dump-times original "fma " 3
FAIL: gcc.dg/torture/builtin-math-2.c  -O3 -fomit-frame-pointer  scan-tree-dump-times original "fmaf" 3
FAIL: gcc.dg/torture/builtin-math-2.c  -O3 -fomit-frame-pointer -funroll-loops  scan-tree-dump-times original "fma " 3
FAIL: gcc.dg/torture/builtin-math-2.c  -O3 -fomit-frame-pointer -funroll-loops  scan-tree-dump-times original "fmaf" 3
FAIL: gcc.dg/torture/builtin-math-2.c  -O3 -fomit-frame-pointer -funroll-all-loops -finline-functions  scan-tree-dump-times original "fma " 3
FAIL: gcc.dg/torture/builtin-math-2.c  -O3 -fomit-frame-pointer -funroll-all-loops -finline-functions  scan-tree-dump-times original "fmaf" 3
FAIL: gcc.dg/torture/builtin-math-2.c  -O3 -g  scan-tree-dump-times original "fma " 3
FAIL: gcc.dg/torture/builtin-math-2.c  -O3 -g  scan-tree-dump-times original "fmaf" 3
FAIL: gcc.dg/torture/builtin-math-2.c  -Os  scan-tree-dump-times original "fma " 3
FAIL: gcc.dg/torture/builtin-math-2.c  -Os  scan-tree-dump-times original "fmaf" 3
FAIL: gcc.dg/torture/builtin-math-2.c  -O2 -flto -flto-partition=none  scan-tree-dump-times original "fma " 3
FAIL: gcc.dg/torture/builtin-math-2.c  -O2 -flto -flto-partition=none  scan-tree-dump-times original "fmaf" 3
FAIL: gcc.dg/torture/builtin-math-2.c  -O2 -flto  scan-tree-dump-times original "fma " 3
FAIL: gcc.dg/torture/builtin-math-2.c  -O2 -flto  scan-tree-dump-times original "fmaf" 3

Dave
Richard Guenther - Nov. 15, 2010, 10:22 p.m.
On Mon, Nov 15, 2010 at 10:46 PM, John David Anglin
<dave@hiauly1.hia.nrc.ca> wrote:
> On Fri, 12 Nov 2010, John David Anglin wrote:
>
>> > Untested.
>>
>> Will test.
>
> With the change, I see a number of fma related testsuite FAILs:
>
> FAIL: gcc.dg/torture/builtin-attr-1.c  -O0  (test for excess errors)
> FAIL: gcc.dg/torture/builtin-math-2.c  -O0  scan-tree-dump-times original "fma " 3
> FAIL: gcc.dg/torture/builtin-math-2.c  -O0  scan-tree-dump-times original "fmaf" 3
> FAIL: gcc.dg/torture/builtin-math-2.c  -O1  scan-tree-dump-times original "fma " 3
> FAIL: gcc.dg/torture/builtin-math-2.c  -O1  scan-tree-dump-times original "fmaf" 3
> FAIL: gcc.dg/torture/builtin-math-2.c  -O2  scan-tree-dump-times original "fma " 3
> FAIL: gcc.dg/torture/builtin-math-2.c  -O2  scan-tree-dump-times original "fmaf" 3
> FAIL: gcc.dg/torture/builtin-math-2.c  -O3 -fomit-frame-pointer  scan-tree-dump-times original "fma " 3
> FAIL: gcc.dg/torture/builtin-math-2.c  -O3 -fomit-frame-pointer  scan-tree-dump-times original "fmaf" 3
> FAIL: gcc.dg/torture/builtin-math-2.c  -O3 -fomit-frame-pointer -funroll-loops  scan-tree-dump-times original "fma " 3
> FAIL: gcc.dg/torture/builtin-math-2.c  -O3 -fomit-frame-pointer -funroll-loops  scan-tree-dump-times original "fmaf" 3
> FAIL: gcc.dg/torture/builtin-math-2.c  -O3 -fomit-frame-pointer -funroll-all-loops -finline-functions  scan-tree-dump-times original "fma " 3
> FAIL: gcc.dg/torture/builtin-math-2.c  -O3 -fomit-frame-pointer -funroll-all-loops -finline-functions  scan-tree-dump-times original "fmaf" 3
> FAIL: gcc.dg/torture/builtin-math-2.c  -O3 -g  scan-tree-dump-times original "fma " 3
> FAIL: gcc.dg/torture/builtin-math-2.c  -O3 -g  scan-tree-dump-times original "fmaf" 3
> FAIL: gcc.dg/torture/builtin-math-2.c  -Os  scan-tree-dump-times original "fma " 3
> FAIL: gcc.dg/torture/builtin-math-2.c  -Os  scan-tree-dump-times original "fmaf" 3
> FAIL: gcc.dg/torture/builtin-math-2.c  -O2 -flto -flto-partition=none  scan-tree-dump-times original "fma " 3
> FAIL: gcc.dg/torture/builtin-math-2.c  -O2 -flto -flto-partition=none  scan-tree-dump-times original "fmaf" 3
> FAIL: gcc.dg/torture/builtin-math-2.c  -O2 -flto  scan-tree-dump-times original "fma " 3
> FAIL: gcc.dg/torture/builtin-math-2.c  -O2 -flto  scan-tree-dump-times original "fmaf" 3

These are known - we need to adjust the dump file scanning for the
folding to FMA_EXPR.
What's the error on builtin-attr-1.c?

Richard.

> Dave
> --
> J. David Anglin                                  dave.anglin@nrc-cnrc.gc.ca
> National Research Council of Canada              (613) 990-0752 (FAX: 952-6602)
>
John David Anglin - Nov. 16, 2010, 12:31 a.m.
On Thu, 11 Nov 2010, Richard Henderson wrote:

>  ;; Negating a multiply can be faked by adding zero in a fused multiply-add
>  ;; instruction.
> +;; ??? Only if we add -0.0 or can ignore the sign of zero.

Adding -0.0 can't be done with the fused multiply-add unless -0.0 is
already available in a floating register.  It would be better to just
condition the patterns on !flag_signed_zeros.

Dave
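
To illustrate the signed-zero point in the comment above, here is a minimal C sketch (not part of the patch). It assumes IEEE 754 arithmetic in the default round-to-nearest mode and a build without -ffast-math or -fno-signed-zeros. The names fnma_model, neg_mult and check_signed_zero_cases are hypothetical; fnma_model only models the arithmetic of fmpynfadd as -(a * b) + c and does not emit the instruction.

```c
#include <assert.h>
#include <math.h>   /* signbit */

/* Hypothetical model of fmpynfadd: -(a * b) + c.  For the zero-valued
   products used below the multiply is exact, so fused and unfused
   evaluation give the same result. */
static double fnma_model(double a, double b, double c)
{
    return -(a * b) + c;
}

/* The plain negated multiply that the pattern is meant to implement. */
static double neg_mult(double a, double b)
{
    return -(a * b);
}

/* Compare the sign of zero for a product that is exactly +0.0. */
static void check_signed_zero_cases(void)
{
    /* volatile keeps the operands out of compile-time folding */
    volatile double a = 0.0, b = 1.0;

    /* -(+0.0) is -0.0 */
    assert(signbit(neg_mult(a, b)) != 0);

    /* (-0.0) + (+0.0) is +0.0: adding +0.0 loses the sign */
    assert(signbit(fnma_model(a, b, 0.0)) == 0);

    /* (-0.0) + (-0.0) is -0.0: adding -0.0 preserves it */
    assert(signbit(fnma_model(a, b, -0.0)) != 0);
}
```

Calling check_signed_zero_cases() from a test program should pass under the stated assumptions. The second assertion is the case behind the "???" comment: with +0.0 as the addend the result differs from neg (mult ...) only in the sign of zero, which is why the pattern is safe only with a -0.0 addend or when the sign of zero can be ignored.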

Patch

diff --git a/gcc/config/pa/pa.md b/gcc/config/pa/pa.md
index a68989f..0d51c62 100644
--- a/gcc/config/pa/pa.md
+++ b/gcc/config/pa/pa.md
@@ -6090,64 +6090,44 @@ 
 ;; PA 2.0 floating point instructions
 
 ; fmpyfadd patterns
-(define_insn ""
+(define_insn "fmadf4"
   [(set (match_operand:DF 0 "register_operand" "=f")
-	(plus:DF (mult:DF (match_operand:DF 1 "register_operand" "f")
-			  (match_operand:DF 2 "register_operand" "f"))
-		 (match_operand:DF 3 "register_operand" "f")))]
+	(fma:DF (match_operand:DF 1 "register_operand" "f")
+		(match_operand:DF 2 "register_operand" "f")
+		(match_operand:DF 3 "register_operand" "f")))]
   "TARGET_PA_20 && ! TARGET_SOFT_FLOAT"
   "fmpyfadd,dbl %1,%2,%3,%0"
   [(set_attr "type" "fpmuldbl")
    (set_attr "length" "4")])
 
-(define_insn ""
-  [(set (match_operand:DF 0 "register_operand" "=f")
-	(plus:DF (match_operand:DF 1 "register_operand" "f")
-		 (mult:DF (match_operand:DF 2 "register_operand" "f")
-			  (match_operand:DF 3 "register_operand" "f"))))]
-  "TARGET_PA_20 && ! TARGET_SOFT_FLOAT"
-  "fmpyfadd,dbl %2,%3,%1,%0"
-  [(set_attr "type" "fpmuldbl")
-   (set_attr "length" "4")])
-
-(define_insn ""
+(define_insn "fmasf4"
   [(set (match_operand:SF 0 "register_operand" "=f")
-	(plus:SF (mult:SF (match_operand:SF 1 "register_operand" "f")
-			  (match_operand:SF 2 "register_operand" "f"))
-		 (match_operand:SF 3 "register_operand" "f")))]
+	(fma:SF (match_operand:SF 1 "register_operand" "f")
+		(match_operand:SF 2 "register_operand" "f")
+		(match_operand:SF 3 "register_operand" "f")))]
   "TARGET_PA_20 && ! TARGET_SOFT_FLOAT"
   "fmpyfadd,sgl %1,%2,%3,%0"
   [(set_attr "type" "fpmulsgl")
    (set_attr "length" "4")])
 
-(define_insn ""
-  [(set (match_operand:SF 0 "register_operand" "=f")
-	(plus:SF (match_operand:SF 1 "register_operand" "f")
-		 (mult:SF (match_operand:SF 2 "register_operand" "f")
-			  (match_operand:SF 3 "register_operand" "f"))))]
-  "TARGET_PA_20 && ! TARGET_SOFT_FLOAT"
-  "fmpyfadd,sgl %2,%3,%1,%0"
-  [(set_attr "type" "fpmulsgl")
-   (set_attr "length" "4")])
-
 ; fmpynfadd patterns
-(define_insn ""
+(define_insn "fnmadf4"
   [(set (match_operand:DF 0 "register_operand" "=f")
-	(minus:DF (match_operand:DF 1 "register_operand" "f")
-		  (mult:DF (match_operand:DF 2 "register_operand" "f")
-			   (match_operand:DF 3 "register_operand" "f"))))]
+	(fma:DF (neg:DF (match_operand:DF 1 "register_operand" "f"))
+		(match_operand:DF 2 "register_operand" "f")
+		(match_operand:DF 3 "register_operand" "f")))]
   "TARGET_PA_20 && ! TARGET_SOFT_FLOAT"
-  "fmpynfadd,dbl %2,%3,%1,%0"
+  "fmpynfadd,dbl %1,%2,%3,%0"
   [(set_attr "type" "fpmuldbl")
    (set_attr "length" "4")])
 
-(define_insn ""
+(define_insn "fnmasf4"
   [(set (match_operand:SF 0 "register_operand" "=f")
-	(minus:SF (match_operand:SF 1 "register_operand" "f")
-		  (mult:SF (match_operand:SF 2 "register_operand" "f")
-			   (match_operand:SF 3 "register_operand" "f"))))]
+	(fma:SF (neg:SF (match_operand:SF 1 "register_operand" "f"))
+		(match_operand:SF 2 "register_operand" "f")
+		(match_operand:SF 3 "register_operand" "f")))]
   "TARGET_PA_20 && ! TARGET_SOFT_FLOAT"
-  "fmpynfadd,sgl %2,%3,%1,%0"
+  "fmpynfadd,sgl %1,%2,%3,%0"
   [(set_attr "type" "fpmulsgl")
    (set_attr "length" "4")])
 
@@ -6168,72 +6148,9 @@ 
   [(set_attr "type" "fpalu")
    (set_attr "length" "4")])
 
-;; Generating a fused multiply sequence is a win for this case as it will
-;; reduce the latency for the fused case without impacting the plain
-;; multiply case.
-;;
-;; Similar possibilities exist for fnegabs, shadd and other insns which
-;; perform two operations with the result of the first feeding the second.
-(define_insn ""
-  [(set (match_operand:DF 0 "register_operand" "=f")
-	(plus:DF (mult:DF (match_operand:DF 1 "register_operand" "f")
-			  (match_operand:DF 2 "register_operand" "f"))
-		 (match_operand:DF 3 "register_operand" "f")))
-   (set (match_operand:DF 4 "register_operand" "=&f")
-	(mult:DF (match_dup 1) (match_dup 2)))]
-  "(! TARGET_SOFT_FLOAT && TARGET_PA_20
-    && ! (reg_overlap_mentioned_p (operands[4], operands[1])
-          || reg_overlap_mentioned_p (operands[4], operands[2])))"
-  "#"
-  [(set_attr "type" "fpmuldbl")
-   (set_attr "length" "8")])
-
-;; We want to split this up during scheduling since we want both insns
-;; to schedule independently.
-(define_split
-  [(set (match_operand:DF 0 "register_operand" "")
-	(plus:DF (mult:DF (match_operand:DF 1 "register_operand" "")
-			  (match_operand:DF 2 "register_operand" ""))
-		 (match_operand:DF 3 "register_operand" "")))
-   (set (match_operand:DF 4 "register_operand" "")
-	(mult:DF (match_dup 1) (match_dup 2)))]
-  "! TARGET_SOFT_FLOAT && TARGET_PA_20"
-  [(set (match_dup 4) (mult:DF (match_dup 1) (match_dup 2)))
-   (set (match_dup 0) (plus:DF (mult:DF (match_dup 1) (match_dup 2))
-			       (match_dup 3)))]
-  "")
-
-(define_insn ""
-  [(set (match_operand:SF 0 "register_operand" "=f")
-	(plus:SF (mult:SF (match_operand:SF 1 "register_operand" "f")
-			  (match_operand:SF 2 "register_operand" "f"))
-		 (match_operand:SF 3 "register_operand" "f")))
-   (set (match_operand:SF 4 "register_operand" "=&f")
-	(mult:SF (match_dup 1) (match_dup 2)))]
-  "(! TARGET_SOFT_FLOAT && TARGET_PA_20
-    && ! (reg_overlap_mentioned_p (operands[4], operands[1])
-          || reg_overlap_mentioned_p (operands[4], operands[2])))"
-  "#"
-  [(set_attr "type" "fpmuldbl")
-   (set_attr "length" "8")])
-
-;; We want to split this up during scheduling since we want both insns
-;; to schedule independently.
-(define_split
-  [(set (match_operand:SF 0 "register_operand" "")
-	(plus:SF (mult:SF (match_operand:SF 1 "register_operand" "")
-			  (match_operand:SF 2 "register_operand" ""))
-		 (match_operand:SF 3 "register_operand" "")))
-   (set (match_operand:SF 4 "register_operand" "")
-	(mult:SF (match_dup 1) (match_dup 2)))]
-  "! TARGET_SOFT_FLOAT && TARGET_PA_20"
-  [(set (match_dup 4) (mult:SF (match_dup 1) (match_dup 2)))
-   (set (match_dup 0) (plus:SF (mult:SF (match_dup 1) (match_dup 2))
-			       (match_dup 3)))]
-  "")
-
 ;; Negating a multiply can be faked by adding zero in a fused multiply-add
 ;; instruction.
+;; ??? Only if we add -0.0 or can ignore the sign of zero.
 (define_insn ""
   [(set (match_operand:DF 0 "register_operand" "=f")
 	(neg:DF (mult:DF (match_operand:DF 1 "register_operand" "f")
@@ -6300,135 +6217,6 @@ 
    (set (match_dup 0) (neg:SF (mult:SF (match_dup 1) (match_dup 2))))]
   "")
 
-;; Now fused multiplies with the result of the multiply negated.
-(define_insn ""
-  [(set (match_operand:DF 0 "register_operand" "=f")
-	(plus:DF (neg:DF (mult:DF (match_operand:DF 1 "register_operand" "f")
-				  (match_operand:DF 2 "register_operand" "f")))
-		 (match_operand:DF 3 "register_operand" "f")))]
-  "! TARGET_SOFT_FLOAT && TARGET_PA_20"
-  "fmpynfadd,dbl %1,%2,%3,%0"
-  [(set_attr "type" "fpmuldbl")
-   (set_attr "length" "4")])
-
-(define_insn ""
-  [(set (match_operand:SF 0 "register_operand" "=f")
-	(plus:SF (neg:SF (mult:SF (match_operand:SF 1 "register_operand" "f")
-			 (match_operand:SF 2 "register_operand" "f")))
-		 (match_operand:SF 3 "register_operand" "f")))]
-  "! TARGET_SOFT_FLOAT && TARGET_PA_20"
-  "fmpynfadd,sgl %1,%2,%3,%0"
-  [(set_attr "type" "fpmuldbl")
-   (set_attr "length" "4")])
-
-(define_insn ""
-  [(set (match_operand:DF 0 "register_operand" "=f")
-	(plus:DF (neg:DF (mult:DF (match_operand:DF 1 "register_operand" "f")
-				  (match_operand:DF 2 "register_operand" "f")))
-		 (match_operand:DF 3 "register_operand" "f")))
-   (set (match_operand:DF 4 "register_operand" "=&f")
-	(mult:DF (match_dup 1) (match_dup 2)))]
-  "(! TARGET_SOFT_FLOAT && TARGET_PA_20
-    && ! (reg_overlap_mentioned_p (operands[4], operands[1])
-          || reg_overlap_mentioned_p (operands[4], operands[2])))"
-  "#"
-  [(set_attr "type" "fpmuldbl")
-   (set_attr "length" "8")])
-
-(define_split
-  [(set (match_operand:DF 0 "register_operand" "")
-	(plus:DF (neg:DF (mult:DF (match_operand:DF 1 "register_operand" "")
-				  (match_operand:DF 2 "register_operand" "")))
-		 (match_operand:DF 3 "register_operand" "")))
-   (set (match_operand:DF 4 "register_operand" "")
-	(mult:DF (match_dup 1) (match_dup 2)))]
-  "! TARGET_SOFT_FLOAT && TARGET_PA_20"
-  [(set (match_dup 4) (mult:DF (match_dup 1) (match_dup 2)))
-   (set (match_dup 0) (plus:DF (neg:DF (mult:DF (match_dup 1) (match_dup 2)))
-			       (match_dup 3)))]
-  "")
-
-(define_insn ""
-  [(set (match_operand:SF 0 "register_operand" "=f")
-	(plus:SF (neg:SF (mult:SF (match_operand:SF 1 "register_operand" "f")
-				  (match_operand:SF 2 "register_operand" "f")))
-		 (match_operand:SF 3 "register_operand" "f")))
-   (set (match_operand:SF 4 "register_operand" "=&f")
-	(mult:SF (match_dup 1) (match_dup 2)))]
-  "(! TARGET_SOFT_FLOAT && TARGET_PA_20
-    && ! (reg_overlap_mentioned_p (operands[4], operands[1])
-          || reg_overlap_mentioned_p (operands[4], operands[2])))"
-  "#"
-  [(set_attr "type" "fpmuldbl")
-   (set_attr "length" "8")])
-
-(define_split
-  [(set (match_operand:SF 0 "register_operand" "")
-	(plus:SF (neg:SF (mult:SF (match_operand:SF 1 "register_operand" "")
-				  (match_operand:SF 2 "register_operand" "")))
-		 (match_operand:SF 3 "register_operand" "")))
-   (set (match_operand:SF 4 "register_operand" "")
-	(mult:SF (match_dup 1) (match_dup 2)))]
-  "! TARGET_SOFT_FLOAT && TARGET_PA_20"
-  [(set (match_dup 4) (mult:SF (match_dup 1) (match_dup 2)))
-   (set (match_dup 0) (plus:SF (neg:SF (mult:SF (match_dup 1) (match_dup 2)))
-			       (match_dup 3)))]
-  "")
-
-(define_insn ""
-  [(set (match_operand:DF 0 "register_operand" "=f")
-	(minus:DF (match_operand:DF 3 "register_operand" "f")
-		  (mult:DF (match_operand:DF 1 "register_operand" "f")
-			   (match_operand:DF 2 "register_operand" "f"))))
-   (set (match_operand:DF 4 "register_operand" "=&f")
-	(mult:DF (match_dup 1) (match_dup 2)))]
-  "(! TARGET_SOFT_FLOAT && TARGET_PA_20
-    && ! (reg_overlap_mentioned_p (operands[4], operands[1])
-          || reg_overlap_mentioned_p (operands[4], operands[2])))"
-  "#"
-  [(set_attr "type" "fpmuldbl")
-   (set_attr "length" "8")])
-
-(define_split
-  [(set (match_operand:DF 0 "register_operand" "")
-	(minus:DF (match_operand:DF 3 "register_operand" "")
-		  (mult:DF (match_operand:DF 1 "register_operand" "")
-			   (match_operand:DF 2 "register_operand" ""))))
-   (set (match_operand:DF 4 "register_operand" "")
-	(mult:DF (match_dup 1) (match_dup 2)))]
-  "! TARGET_SOFT_FLOAT && TARGET_PA_20"
-  [(set (match_dup 4) (mult:DF (match_dup 1) (match_dup 2)))
-   (set (match_dup 0) (minus:DF (match_dup 3)
-				(mult:DF (match_dup 1) (match_dup 2))))]
-  "")
-
-(define_insn ""
-  [(set (match_operand:SF 0 "register_operand" "=f")
-	(minus:SF (match_operand:SF 3 "register_operand" "f")
-		  (mult:SF (match_operand:SF 1 "register_operand" "f")
-			   (match_operand:SF 2 "register_operand" "f"))))
-   (set (match_operand:SF 4 "register_operand" "=&f")
-	(mult:SF (match_dup 1) (match_dup 2)))]
-  "(! TARGET_SOFT_FLOAT && TARGET_PA_20
-    && ! (reg_overlap_mentioned_p (operands[4], operands[1])
-          || reg_overlap_mentioned_p (operands[4], operands[2])))"
-  "#"
-  [(set_attr "type" "fpmuldbl")
-   (set_attr "length" "8")])
-
-(define_split
-  [(set (match_operand:SF 0 "register_operand" "")
-	(minus:SF (match_operand:SF 3 "register_operand" "")
-		  (mult:SF (match_operand:SF 1 "register_operand" "")
-			   (match_operand:SF 2 "register_operand" ""))))
-   (set (match_operand:SF 4 "register_operand" "")
-	(mult:SF (match_dup 1) (match_dup 2)))]
-  "! TARGET_SOFT_FLOAT && TARGET_PA_20"
-  [(set (match_dup 4) (mult:SF (match_dup 1) (match_dup 2)))
-   (set (match_dup 0) (minus:SF (match_dup 3)
-				(mult:SF (match_dup 1) (match_dup 2))))]
-  "")
-
 (define_insn ""
   [(set (match_operand:DF 0 "register_operand" "=f")
 	(neg:DF (abs:DF (match_operand:DF 1 "register_operand" "f"))))