diff mbox

[RFC] Add FMAF support on Sparc.

Message ID 20110922.023146.2268755599806712681.davem@davemloft.net
State New
Headers show

Commit Message

David Miller Sept. 22, 2011, 6:31 a.m. UTC
Eric, could you give this a quick spin on Solaris?  Even if you
don't have a FMAF capable cpu available, at least make sure the
configury assembler feature detection bits work properly.

And any other form of feedback is appreciated as well.  This patch
works fine for me with current binutils under Linux with a Niagara-3
processor.

I've done --with-cpu=niagara3 builds plus full testsuite regression
check on sparc-linux-gnu.  I also made sure that FMAF is actually
being used by the compiler.

Thanks!

gcc/

2011-09-21  David S. Miller  <davem@davemloft.net>

	* configure.ac: Add feature check to make sure the assembler
	supports the FMAF, HPC, and VIS 3.0 instructions found on
	Niagara-3 and later cpus.
	* configure: Rebuild.
	* config.in: Likewise.
	* config/sparc/sparc.opt: New option '-mfmaf'.
	* config/sparc/sparc.md: Add float fused multiply-add patterns.
	* config/sparc/sparc.h (AS_NIAGARA3_FLAG): New macro.
	(ASM_CPU64_DEFAULT_SPEC, ASM_CPU_SPEC): Use it, as needed.
	* config/sparc/sol2.h (ASM_CPU32_DEFAULT_SPEC,
	ASM_CPU64_DEFAULT_SPEC, ASM_CPU_SPEC): Likewise.
	* config/sparc/sparc.c (sparc_option_override): Turn MASK_FMAF on
	by default for Niagara-3 and later.  Turn it off if TARGET_FPU is
	disabled.
	(sparc_rtx_costs): Handle 'FMA'.

Comments

Eric Botcazou Sept. 25, 2011, 6:44 p.m. UTC | #1
> Eric, could you give this a quick spin on Solaris?  Even if you
> don't have a FMAF capable cpu available, at least make sure the
> configury assembler feature detection bits work properly.

This seems to work just fine on Solaris.

> And any other form of feedback is appreciated as well.  This patch
> works fine for me with current binutils under Linux with a Niagara-3
> processor.

Why fall back to -xv8plusb/-xarch=v9b for Niagara-3 if !HAVE_AS_FMAF_HPC_VIS3?  
Reading the new binutils bits, it seems that -xv8plusc/-xarch=v9c would be the 
perfect match instead.

+mfmaf
+Target Report Mask(FMAF)
+Use UltraSPARC Fused Multiply extensions

Maybe "Fused Multiply-Add extensions".  Why is the final 'f' for exactly?

> I've done --with-cpu=niagara3 builds plus full testsuite regression
> check on sparc-linux-gnu.  I also made sure that FMAF is actually
> being used by the compiler.

Probably worth mentioning in gcc-4.7/changes.html too.
David Miller Sept. 25, 2011, 8:48 p.m. UTC | #2
From: Eric Botcazou <ebotcazou@adacore.com>
Date: Sun, 25 Sep 2011 20:44:37 +0200

>> Eric, could you give this a quick spin on Solaris?  Even if you
>> don't have a FMAF capable cpu available, at least make sure the
>> configury assembler feature detection bits work properly.
> 
> This seems to work just fine on Solaris.

Thanks for testing.

>> And any other form of feedback is appreciated as well.  This patch
>> works fine for me with current binutils under Linux with a Niagara-3
>> processor.
> 
> Why fall back to -xv8plusb/-xarch=v9b for Niagara-3 if !HAVE_AS_FMAF_HPC_VIS3?  
> Reading the new binutils bits, it seems that -xv8plusc/-xarch=v9c would be the 
> perfect match instead.

We don't use any of the 'c' features, and also with binutils if 'd'
isn't available then 'c' isn't either as I added them both at the same
time.  So if you really want me to try to downgrade to 'c' I'll need
to add a seperate configure test for the availability of that option.

> +mfmaf
> +Target Report Mask(FMAF)
> +Use UltraSPARC Fused Multiply extensions
> 
> Maybe "Fused Multiply-Add extensions".  Why is the final 'f' for exactly?

FMA stands for "Fused Multiply-Add Floating-point", there are integer
versions too which I'll add support for in the future.  The integer
ones are referred to as "IMA".

>> I've done --with-cpu=niagara3 builds plus full testsuite regression
>> check on sparc-linux-gnu.  I also made sure that FMAF is actually
>> being used by the compiler.
> 
> Probably worth mentioning in gcc-4.7/changes.html too.

Absolutely.

I'll commit this with the -mfmaf description change you suggested,
if you still want me to tweak the v9d fallback, we can stitch that
up later.
diff mbox

Patch

diff --git a/gcc/config.in b/gcc/config.in
index d202b038..d9b9805 100644
--- a/gcc/config.in
+++ b/gcc/config.in
@@ -266,6 +266,12 @@ 
 #endif
 
 
+/* Define if your assembler supports FMAF, HPC, and VIS 3.0 instructions. */
+#ifndef USED_FOR_TARGET
+#undef HAVE_AS_FMAF_HPC_VIS3
+#endif
+
+
 /* Define if your assembler supports fprnd. */
 #ifndef USED_FOR_TARGET
 #undef HAVE_AS_FPRND
@@ -1047,12 +1053,6 @@ 
 #endif
 
 
-/* Define if _Unwind_GetIPInfo is available. */
-#ifndef USED_FOR_TARGET
-#undef HAVE_GETIPINFO
-#endif
-
-
 /* Define to 1 if you have the `getrlimit' function. */
 #ifndef USED_FOR_TARGET
 #undef HAVE_GETRLIMIT
diff --git a/gcc/config/sparc/sol2.h b/gcc/config/sparc/sol2.h
index bd58c9f..fea60d0 100644
--- a/gcc/config/sparc/sol2.h
+++ b/gcc/config/sparc/sol2.h
@@ -125,9 +125,9 @@  along with GCC; see the file COPYING3.  If not see
 #undef CPP_CPU64_DEFAULT_SPEC
 #define CPP_CPU64_DEFAULT_SPEC ""
 #undef ASM_CPU32_DEFAULT_SPEC
-#define ASM_CPU32_DEFAULT_SPEC "-xarch=v8plusb"
+#define ASM_CPU32_DEFAULT_SPEC "-xarch=v8plus" AS_NIAGARA3_FLAG
 #undef ASM_CPU64_DEFAULT_SPEC
-#define ASM_CPU64_DEFAULT_SPEC AS_SPARC64_FLAG "b"
+#define ASM_CPU64_DEFAULT_SPEC AS_SPARC64_FLAG AS_NIAGARA3_FLAG
 #undef ASM_CPU_DEFAULT_SPEC
 #define ASM_CPU_DEFAULT_SPEC ASM_CPU32_DEFAULT_SPEC
 #endif
@@ -240,8 +240,8 @@  extern const char *host_detect_local_cpu (int argc, const char **argv);
 %{mcpu=ultrasparc3:" DEF_ARCH32_SPEC("-xarch=v8plusb") DEF_ARCH64_SPEC(AS_SPARC64_FLAG "b") "} \
 %{mcpu=niagara:" DEF_ARCH32_SPEC("-xarch=v8plusb") DEF_ARCH64_SPEC(AS_SPARC64_FLAG "b") "} \
 %{mcpu=niagara2:" DEF_ARCH32_SPEC("-xarch=v8plusb") DEF_ARCH64_SPEC(AS_SPARC64_FLAG "b") "} \
-%{mcpu=niagara3:" DEF_ARCH32_SPEC("-xarch=v8plusb") DEF_ARCH64_SPEC(AS_SPARC64_FLAG "b") "} \
-%{mcpu=niagara4:" DEF_ARCH32_SPEC("-xarch=v8plusb") DEF_ARCH64_SPEC(AS_SPARC64_FLAG "b") "} \
+%{mcpu=niagara3:" DEF_ARCH32_SPEC("-xarch=v8plus" AS_NIAGARA3_FLAG) DEF_ARCH64_SPEC(AS_SPARC64_FLAG AS_NIAGARA3_FLAG) "} \
+%{mcpu=niagara4:" DEF_ARCH32_SPEC("-xarch=v8plus" AS_NIAGARA3_FLAG) DEF_ARCH64_SPEC(AS_SPARC64_FLAG AS_NIAGARA3_FLAG) "} \
 %{!mcpu=niagara4:%{!mcpu=niagara3:%{!mcpu=niagara2:%{!mcpu=niagara:%{!mcpu=ultrasparc3:%{!mcpu=ultrasparc:%{!mcpu=v9:%{mcpu*:" DEF_ARCH32_SPEC("-xarch=v8") DEF_ARCH64_SPEC(AS_SPARC64_FLAG) "}}}}}}}} \
 %{!mcpu*:%(asm_cpu_default)} \
 "
diff --git a/gcc/config/sparc/sparc.c b/gcc/config/sparc/sparc.c
index a4917da..f60c8ca 100644
--- a/gcc/config/sparc/sparc.c
+++ b/gcc/config/sparc/sparc.c
@@ -776,9 +776,9 @@  sparc_option_override (void)
     /* UltraSPARC T2 */
     { MASK_ISA, MASK_V9},
     /* UltraSPARC T3 */
-    { MASK_ISA, MASK_V9},
+    { MASK_ISA, MASK_V9 | MASK_FMAF},
     /* UltraSPARC T4 */
-    { MASK_ISA, MASK_V9},
+    { MASK_ISA, MASK_V9 | MASK_FMAF},
   };
   const struct cpu_table *cpu;
   unsigned int i;
@@ -857,9 +857,9 @@  sparc_option_override (void)
   if (target_flags_explicit & MASK_FPU)
     target_flags = (target_flags & ~MASK_FPU) | fpu;
 
-  /* Don't allow -mvis if FPU is disabled.  */
+  /* Don't allow -mvis or -mfmaf if FPU is disabled.  */
   if (! TARGET_FPU)
-    target_flags &= ~MASK_VIS;
+    target_flags &= ~(MASK_VIS | MASK_FMAF);
 
   /* -mvis assumes UltraSPARC+, so we are sure v9 instructions
      are available.
@@ -9609,6 +9609,25 @@  sparc_rtx_costs (rtx x, int code, int outer_code, int opno ATTRIBUTE_UNUSED,
 	*total = COSTS_N_INSNS (1);
       return false;
 
+    case FMA:
+      {
+	rtx sub;
+
+	gcc_assert (float_mode_p);
+	*total = sparc_costs->float_mul;
+
+	sub = XEXP (x, 0);
+	if (GET_CODE (sub) == NEG)
+	  sub = XEXP (sub, 0);
+	*total += rtx_cost (sub, FMA, 0, speed);
+
+	sub = XEXP (x, 2);
+	if (GET_CODE (sub) == NEG)
+	  sub = XEXP (sub, 0);
+	*total += rtx_cost (sub, FMA, 2, speed);
+	return true;
+      }
+
     case MULT:
       if (float_mode_p)
 	*total = sparc_costs->float_mul;
diff --git a/gcc/config/sparc/sparc.h b/gcc/config/sparc/sparc.h
index afdca1e..04fdaf1 100644
--- a/gcc/config/sparc/sparc.h
+++ b/gcc/config/sparc/sparc.h
@@ -287,11 +287,11 @@  extern enum cmodel sparc_cmodel;
 #endif
 #if TARGET_CPU_DEFAULT == TARGET_CPU_niagara3
 #define CPP_CPU64_DEFAULT_SPEC "-D__sparc_v9__"
-#define ASM_CPU64_DEFAULT_SPEC "-Av9b"
+#define ASM_CPU64_DEFAULT_SPEC "-Av9" AS_NIAGARA3_FLAG
 #endif
 #if TARGET_CPU_DEFAULT == TARGET_CPU_niagara4
 #define CPP_CPU64_DEFAULT_SPEC "-D__sparc_v9__"
-#define ASM_CPU64_DEFAULT_SPEC "-Av9b"
+#define ASM_CPU64_DEFAULT_SPEC "-Av9" AS_NIAGARA3_FLAG
 #endif
 
 #else
@@ -431,8 +431,8 @@  extern enum cmodel sparc_cmodel;
 %{mcpu=ultrasparc3:%{!mv8plus:-Av9b}} \
 %{mcpu=niagara:%{!mv8plus:-Av9b}} \
 %{mcpu=niagara2:%{!mv8plus:-Av9b}} \
-%{mcpu=niagara3:%{!mv8plus:-Av9b}} \
-%{mcpu=niagara4:%{!mv8plus:-Av9b}} \
+%{mcpu=niagara3:%{!mv8plus:-Av9" AS_NIAGARA3_FLAG "}} \
+%{mcpu=niagara4:%{!mv8plus:-Av9" AS_NIAGARA3_FLAG "}} \
 %{!mcpu*:%(asm_cpu_default)} \
 "
 
@@ -1880,6 +1880,14 @@  extern int sparc_indent_opcode;
 #define TARGET_SUN_TLS TARGET_TLS
 #define TARGET_GNU_TLS 0
 
+#ifndef HAVE_AS_FMAF_HPC_VIS3
+#define AS_NIAGARA3_FLAG "b"
+#undef TARGET_FMAF
+#define TARGET_FMAF 0
+#else
+#define AS_NIAGARA3_FLAG "d"
+#endif
+
 /* The number of Pmode words for the setjmp buffer.  */
 #define JMP_BUF_SIZE 12
 
diff --git a/gcc/config/sparc/sparc.md b/gcc/config/sparc/sparc.md
index 812ae7b..28b50d0 100644
--- a/gcc/config/sparc/sparc.md
+++ b/gcc/config/sparc/sparc.md
@@ -5357,6 +5357,78 @@ 
   "fmuls\t%1, %2, %0"
   [(set_attr "type" "fpmul")])
 
+(define_insn "fmadf4"
+  [(set (match_operand:DF 0 "register_operand" "=e")
+        (fma:DF (match_operand:DF 1 "register_operand" "e")
+		(match_operand:DF 2 "register_operand" "e")
+		(match_operand:DF 3 "register_operand" "e")))]
+  "TARGET_FMAF"
+  "fmaddd\t%1, %2, %3, %0"
+  [(set_attr "type" "fpmul")])
+
+(define_insn "fmsdf4"
+  [(set (match_operand:DF 0 "register_operand" "=e")
+        (fma:DF (match_operand:DF 1 "register_operand" "e")
+		(match_operand:DF 2 "register_operand" "e")
+		(neg:DF (match_operand:DF 3 "register_operand" "e"))))]
+  "TARGET_FMAF"
+  "fmsubd\t%1, %2, %3, %0"
+  [(set_attr "type" "fpmul")])
+
+(define_insn "*nfmadf4"
+  [(set (match_operand:DF 0 "register_operand" "=e")
+        (neg:DF (fma:DF (match_operand:DF 1 "register_operand" "e")
+			(match_operand:DF 2 "register_operand" "e")
+			(match_operand:DF 3 "register_operand" "e"))))]
+  "TARGET_FMAF"
+  "fnmaddd\t%1, %2, %3, %0"
+  [(set_attr "type" "fpmul")])
+
+(define_insn "*nfmsdf4"
+  [(set (match_operand:DF 0 "register_operand" "=e")
+        (neg:DF (fma:DF (match_operand:DF 1 "register_operand" "e")
+			(match_operand:DF 2 "register_operand" "e")
+			(neg:DF (match_operand:DF 3 "register_operand" "e")))))]
+  "TARGET_FMAF"
+  "fnmsubd\t%1, %2, %3, %0"
+  [(set_attr "type" "fpmul")])
+
+(define_insn "fmasf4"
+  [(set (match_operand:SF 0 "register_operand" "=f")
+        (fma:SF (match_operand:SF 1 "register_operand" "f")
+		(match_operand:SF 2 "register_operand" "f")
+		(match_operand:SF 3 "register_operand" "f")))]
+  "TARGET_FMAF"
+  "fmadds\t%1, %2, %3, %0"
+  [(set_attr "type" "fpmul")])
+
+(define_insn "fmssf4"
+  [(set (match_operand:SF 0 "register_operand" "=f")
+        (fma:SF (match_operand:SF 1 "register_operand" "f")
+		(match_operand:SF 2 "register_operand" "f")
+		(neg:SF (match_operand:SF 3 "register_operand" "f"))))]
+  "TARGET_FMAF"
+  "fmsubs\t%1, %2, %3, %0"
+  [(set_attr "type" "fpmul")])
+
+(define_insn "*nfmasf4"
+  [(set (match_operand:SF 0 "register_operand" "=f")
+        (neg:SF (fma:SF (match_operand:SF 1 "register_operand" "f")
+			(match_operand:SF 2 "register_operand" "f")
+			(match_operand:SF 3 "register_operand" "f"))))]
+  "TARGET_FMAF"
+  "fnmadds\t%1, %2, %3, %0"
+  [(set_attr "type" "fpmul")])
+
+(define_insn "*nfmssf4"
+  [(set (match_operand:SF 0 "register_operand" "=f")
+        (neg:SF (fma:SF (match_operand:SF 1 "register_operand" "f")
+			(match_operand:SF 2 "register_operand" "f")
+			(neg:SF (match_operand:SF 3 "register_operand" "f")))))]
+  "TARGET_FMAF"
+  "fnmsubs\t%1, %2, %3, %0"
+  [(set_attr "type" "fpmul")])
+
 (define_insn "*muldf3_extend"
   [(set (match_operand:DF 0 "register_operand" "=e")
 	(mult:DF (float_extend:DF (match_operand:SF 1 "register_operand" "f"))
diff --git a/gcc/config/sparc/sparc.opt b/gcc/config/sparc/sparc.opt
index ce6fa94..8e2c6bd 100644
--- a/gcc/config/sparc/sparc.opt
+++ b/gcc/config/sparc/sparc.opt
@@ -61,6 +61,10 @@  mvis
 Target Report Mask(VIS)
 Use UltraSPARC Visual Instruction Set extensions
 
+mfmaf
+Target Report Mask(FMAF)
+Use UltraSPARC Fused Multiply extensions
+
 mptr64
 Target Report RejectNegative Mask(PTR64)
 Pointers are 64-bit
diff --git a/gcc/configure b/gcc/configure
index 651471c..c183881 100755
--- a/gcc/configure
+++ b/gcc/configure
@@ -24043,6 +24043,42 @@  if test $gcc_cv_as_sparc_offsetable_lo10 = yes; then
 $as_echo "#define HAVE_AS_OFFSETABLE_LO10 1" >>confdefs.h
 
 fi
+
+    { $as_echo "$as_me:${as_lineno-$LINENO}: checking assembler for FMAF, HPC, and VIS 3.0 instructions" >&5
+$as_echo_n "checking assembler for FMAF, HPC, and VIS 3.0 instructions... " >&6; }
+if test "${gcc_cv_as_sparc_fmaf+set}" = set; then :
+  $as_echo_n "(cached) " >&6
+else
+  gcc_cv_as_sparc_fmaf=no
+  if test x$gcc_cv_as != x; then
+    $as_echo '.text
+       .align 4
+       fmaddd %f0, %f2, %f4, %f6
+       addxccc %g1, %g2, %g3
+       fsrl32 %f2, %f4, %f8
+       fnaddd %f10, %f12, %f14' > conftest.s
+    if { ac_try='$gcc_cv_as $gcc_cv_as_flags -xarch=v9d -o conftest.o conftest.s >&5'
+  { { eval echo "\"\$as_me\":${as_lineno-$LINENO}: \"$ac_try\""; } >&5
+  (eval $ac_try) 2>&5
+  ac_status=$?
+  $as_echo "$as_me:${as_lineno-$LINENO}: \$? = $ac_status" >&5
+  test $ac_status = 0; }; }
+    then
+	gcc_cv_as_sparc_fmaf=yes
+    else
+      echo "configure: failed program was" >&5
+      cat conftest.s >&5
+    fi
+    rm -f conftest.o conftest.s
+  fi
+fi
+{ $as_echo "$as_me:${as_lineno-$LINENO}: result: $gcc_cv_as_sparc_fmaf" >&5
+$as_echo "$gcc_cv_as_sparc_fmaf" >&6; }
+if test $gcc_cv_as_sparc_fmaf = yes; then
+
+$as_echo "#define HAVE_AS_FMAF_HPC_VIS3 1" >>confdefs.h
+
+fi
     ;;
 
   i[34567]86-*-* | x86_64-*-*)
diff --git a/gcc/configure.ac b/gcc/configure.ac
index 126cb19..8069e6a 100644
--- a/gcc/configure.ac
+++ b/gcc/configure.ac
@@ -3478,6 +3478,18 @@  foo:
        fi],
        [AC_DEFINE(HAVE_AS_OFFSETABLE_LO10, 1,
 	         [Define if your assembler supports offsetable %lo().])])
+
+    gcc_GAS_CHECK_FEATURE([FMAF, HPC, and VIS 3.0 instructions],
+      gcc_cv_as_sparc_fmaf,,
+      [-xarch=v9d],
+      [.text
+       .align 4
+       fmaddd %f0, %f2, %f4, %f6
+       addxccc %g1, %g2, %g3
+       fsrl32 %f2, %f4, %f8
+       fnaddd %f10, %f12, %f14],,
+      [AC_DEFINE(HAVE_AS_FMAF_HPC_VIS3, 1,
+                [Define if your assembler supports FMAF, HPC, and VIS 3.0 instructions.])])
     ;;
 
 changequote(,)dnl