Patchwork Start adding support for VIS 3.0 instructions.

login
register
mail settings
Submitter David Miller
Date Oct. 1, 2011, 6:40 p.m.
Message ID <20111001.144020.405391787739276879.davem@davemloft.net>
Download mbox | patch
Permalink /patch/117276/
State New
Headers show

Comments

David Miller - Oct. 1, 2011, 6:40 p.m.
There are couple more to add, but there are some complications
wrt. v8plus for those that I need to sort out first.

And also I haven't attempted to add the floating-point-and-halve
and related ops yet, see:

http://gcc.gnu.org/ml/gcc/2011-09/msg00385.html

Committed to trunk.

gcc/

	* config/sparc/sparc.opt (VIS3): New option.
	* doc/invoke.texi: Document it.
	* config/sparc/sparc.h: Force TARGET_VIS3 to zero if assembler is
	not capable of such instructions.
	* config/sparc/sparc-c.c (sparc_target_macros): Define __VIS__
	to 0x300 when TARGET_VIS3.
	* config/sparc/sparc-modes.def: Create 16-byte vector modes.
	* config/sparc/sparc.md (UNSPEC_CMASK8, UNSPEC_CMASK16, UNSPEC_CMASK32,
	UNSPEC_FCHKSM16, UNSPEC_PDISTN, UNSPC_FUCMP): New unspecs.
	(V64N8, VASS): New mode iterators.
	(vis3_shift, vis3_addsub_ss): New code iterators.
	(vbits, vconstr): New mode attributes.
	(vis3_shift_insn, vis3_addsub_ss_insn): New code attributes.
	(cmask8<P:mode>_vis, cmask16<P:mode>_vis, cmask32<P:mode>_vis,
	fchksm16_vis, <vis3_shift_insn><vbits>_vis, pdistn<mode>_vis,
	fmean16_vis, fpadd64_vis, fpsub64_vis, <vis3_addsub_ss_insn><vbits>_vis,
	fucmp<code>8<P:mode>_vis): New VIS 3.0 instruction patterns.
	* config/sparc/sparc.c (sparc_option_override): Set MASK_VIS3 by
	default when targetting capable cpus.  TARGET_VIS3 implies
	TARGET_VIS2 and TARGET_VIS, and clear them when TARGET_FPU is
	disabled.
	(sparc_vis_init_builtins): Emit new VIS 3.0 builtins.
	(sparc_fold_builtin): Do not eliminate cmask{8,16,32} when result
	is ignored.
	* config/sparc/visintrin.h (__vis_cmask8, __vis_cmask16,
	__vis_cmask32, __vis_fchksm16, __vis_fsll16, __vis_fslas16,
	__vis_fsrl16, __vis_fsra16, __vis_fsll32, __vis_fslas32,
	__vis_fsrl32, __vis_fsra32, __vis_pdistn, __vis_fmean16,
	__vis_fpadd64, __vis_fpsub64, __vis_fpadds16, __vis_fpadds16s,
	__vis_fpsubs16, __vis_fpsubs16s, __vis_fpadds32, __vis_fpadds32s,
	__vis_fpsubs32, __vis_fpsubs32s, __vis_fucmple8, __vis_fucmpne8,
	__vis_fucmpgt8, __vis_fucmpeq8): New VIS 3.0 interfaces.
	* doc/extend.texi: Document new VIS 3.0 builtins.

gcc/testsuite/

	* gcc.target/sparc/cmask.c: New test.
	* gcc.target/sparc/fpadds.c: New test.
	* gcc.target/sparc/fshift.c: New test.
	* gcc.target/sparc/fucmp.c: New test.
	* gcc.target/sparc/vis3misc.c: New test.
---
 gcc/ChangeLog                             |   36 ++++++
 gcc/config/sparc/sparc-c.c                |    7 +-
 gcc/config/sparc/sparc-modes.def          |    1 +
 gcc/config/sparc/sparc.c                  |  142 +++++++++++++++++++--
 gcc/config/sparc/sparc.h                  |    2 +
 gcc/config/sparc/sparc.md                 |  112 ++++++++++++++++
 gcc/config/sparc/sparc.opt                |    4 +
 gcc/config/sparc/visintrin.h              |  196 +++++++++++++++++++++++++++++
 gcc/doc/extend.texi                       |   45 +++++++-
 gcc/doc/invoke.texi                       |   13 ++-
 gcc/testsuite/ChangeLog                   |    8 ++
 gcc/testsuite/gcc.target/sparc/cmask.c    |   21 +++
 gcc/testsuite/gcc.target/sparc/fpadds.c   |   55 ++++++++
 gcc/testsuite/gcc.target/sparc/fshift.c   |   53 ++++++++
 gcc/testsuite/gcc.target/sparc/fucmp.c    |   28 ++++
 gcc/testsuite/gcc.target/sparc/vis3misc.c |   37 ++++++
 16 files changed, 745 insertions(+), 15 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/sparc/cmask.c
 create mode 100644 gcc/testsuite/gcc.target/sparc/fpadds.c
 create mode 100644 gcc/testsuite/gcc.target/sparc/fshift.c
 create mode 100644 gcc/testsuite/gcc.target/sparc/fucmp.c
 create mode 100644 gcc/testsuite/gcc.target/sparc/vis3misc.c
Richard Henderson - Oct. 3, 2011, 10:19 p.m.
On 10/01/2011 11:40 AM, David Miller wrote:
> +;; Conditional moves are possible via fcmpX --> cmaskX -> bshuffle

Does this comment mean you can fairly efficiently implement the
vcond<cmpmode><datamode> patterns?


r~
David Miller - Oct. 4, 2011, 1:43 a.m.
From: Richard Henderson <rth@redhat.com>
Date: Mon, 03 Oct 2011 15:19:00 -0700

> On 10/01/2011 11:40 AM, David Miller wrote:
>> +;; Conditional moves are possible via fcmpX --> cmaskX -> bshuffle
> 
> Does this comment mean you can fairly efficiently implement the
> vcond<cmpmode><datamode> patterns?

That seems to be the case.  So such an expander would emit something
like:

	fcmple32	%f0, %f2, %g1
	cmask32		%g1
	bshuffle	%f0, %f2, %f4

for an "le" conditional move.

When modes N and M use different vector element types, I'll have
to frob the bitmask produced by the fcmp and adjust the cmask
variant used.

What exactly is supposed to happen when, for example, the comparison
is between two v4hi values and the conditional move is done on
v2si values?  It seems the only requirement is that modes N and M
be vector modes of the same size.
Richard Henderson - Oct. 4, 2011, 1:54 a.m.
On 10/03/2011 06:43 PM, David Miller wrote:
> What exactly is supposed to happen when, for example, the comparison
> is between two v4hi values and the conditional move is done on
> v2si values?  It seems the only requirement is that modes N and M
> be vector modes of the same size.

It's supposed to be vector modes of the same size and element count.
I.e. only V4SI and V4SF.  I guess that's not 100% clear from the
documentation...


r~
Jakub Jelinek - Oct. 4, 2011, 6 a.m.
On Mon, Oct 03, 2011 at 03:19:00PM -0700, Richard Henderson wrote:
> On 10/01/2011 11:40 AM, David Miller wrote:
> > +;; Conditional moves are possible via fcmpX --> cmaskX -> bshuffle
> 
> Does this comment mean you can fairly efficiently implement the
> vcond<cmpmode><datamode> patterns?

vcond is actually vcond<datamode><cmpmode>

	Jakub

Patch

diff --git a/gcc/ChangeLog b/gcc/ChangeLog
index 2b62b96..f8e02d3 100644
--- a/gcc/ChangeLog
+++ b/gcc/ChangeLog
@@ -1,3 +1,39 @@ 
+2011-10-01  David S. Miller  <davem@davemloft.net>
+
+	* config/sparc/sparc.opt (VIS3): New option.
+	* doc/invoke.texi: Document it.
+	* config/sparc/sparc.h: Force TARGET_VIS3 to zero if assembler is
+	not capable of such instructions.
+	* config/sparc/sparc-c.c (sparc_target_macros): Define __VIS__
+	to 0x300 when TARGET_VIS3.
+	* config/sparc/sparc-modes.def: Create 16-byte vector modes.
+	* config/sparc/sparc.md (UNSPEC_CMASK8, UNSPEC_CMASK16, UNSPEC_CMASK32,
+	UNSPEC_FCHKSM16, UNSPEC_PDISTN, UNSPC_FUCMP): New unspecs.
+	(V64N8, VASS): New mode iterators.
+	(vis3_shift, vis3_addsub_ss): New code iterators.
+	(vbits, vconstr): New mode attributes.
+	(vis3_shift_insn, vis3_addsub_ss_insn): New code attributes.
+	(cmask8<P:mode>_vis, cmask16<P:mode>_vis, cmask32<P:mode>_vis,
+	fchksm16_vis, <vis3_shift_insn><vbits>_vis, pdistn<mode>_vis,
+	fmean16_vis, fpadd64_vis, fpsub64_vis, <vis3_addsub_ss_insn><vbits>_vis,
+	fucmp<code>8<P:mode>_vis): New VIS 3.0 instruction patterns.
+	* config/sparc/sparc.c (sparc_option_override): Set MASK_VIS3 by
+	default when targetting capable cpus.  TARGET_VIS3 implies
+	TARGET_VIS2 and TARGET_VIS, and clear them when TARGET_FPU is
+	disabled.
+	(sparc_vis_init_builtins): Emit new VIS 3.0 builtins.
+	(sparc_fold_builtin): Do not eliminate cmask{8,16,32} when result
+	is ignored.
+	* config/sparc/visintrin.h (__vis_cmask8, __vis_cmask16,
+	__vis_cmask32, __vis_fchksm16, __vis_fsll16, __vis_fslas16,
+	__vis_fsrl16, __vis_fsra16, __vis_fsll32, __vis_fslas32,
+	__vis_fsrl32, __vis_fsra32, __vis_pdistn, __vis_fmean16,
+	__vis_fpadd64, __vis_fpsub64, __vis_fpadds16, __vis_fpadds16s,
+	__vis_fpsubs16, __vis_fpsubs16s, __vis_fpadds32, __vis_fpadds32s,
+	__vis_fpsubs32, __vis_fpsubs32s, __vis_fucmple8, __vis_fucmpne8,
+	__vis_fucmpgt8, __vis_fucmpeq8): New VIS 3.0 interfaces.
+	* doc/extend.texi: Document new VIS 3.0 builtins.
+
 2011-09-30  H.J. Lu  <hongjiu.lu@intel.com>
 
 	* doc/extend.texi: Add missing ','.
diff --git a/gcc/config/sparc/sparc-c.c b/gcc/config/sparc/sparc-c.c
index 0f2bee1..c187970 100644
--- a/gcc/config/sparc/sparc-c.c
+++ b/gcc/config/sparc/sparc-c.c
@@ -45,7 +45,12 @@  sparc_target_macros (void)
       cpp_assert (parse_in, "machine=sparc");
     }
 
-  if (TARGET_VIS2)
+  if (TARGET_VIS3)
+    {
+      cpp_define (parse_in, "__VIS__=0x300");
+      cpp_define (parse_in, "__VIS=0x300");
+    }
+  else if (TARGET_VIS2)
     {
       cpp_define (parse_in, "__VIS__=0x200");
       cpp_define (parse_in, "__VIS=0x200");
diff --git a/gcc/config/sparc/sparc-modes.def b/gcc/config/sparc/sparc-modes.def
index 6284700..ed135cc 100644
--- a/gcc/config/sparc/sparc-modes.def
+++ b/gcc/config/sparc/sparc-modes.def
@@ -43,5 +43,6 @@  CC_MODE (CCFP);
 CC_MODE (CCFPE);
 
 /* Vector modes.  */
+VECTOR_MODES (INT, 16);       /* V16QI V8HI V4SI V2DI */
 VECTOR_MODES (INT, 8);        /*       V8QI V4HI V2SI */
 VECTOR_MODES (INT, 4);        /*       V4QI V2HI */
diff --git a/gcc/config/sparc/sparc.c b/gcc/config/sparc/sparc.c
index 9863174..4df9f6a 100644
--- a/gcc/config/sparc/sparc.c
+++ b/gcc/config/sparc/sparc.c
@@ -776,9 +776,9 @@  sparc_option_override (void)
     /* UltraSPARC T2 */
     { MASK_ISA, MASK_V9|MASK_VIS2},
     /* UltraSPARC T3 */
-    { MASK_ISA, MASK_V9|MASK_VIS2|MASK_FMAF},
+    { MASK_ISA, MASK_V9|MASK_VIS2|MASK_VIS3|MASK_FMAF},
     /* UltraSPARC T4 */
-    { MASK_ISA, MASK_V9|MASK_VIS2|MASK_FMAF},
+    { MASK_ISA, MASK_V9|MASK_VIS2|MASK_VIS3|MASK_FMAF},
   };
   const struct cpu_table *cpu;
   unsigned int i;
@@ -861,9 +861,13 @@  sparc_option_override (void)
   if (TARGET_VIS2)
     target_flags |= MASK_VIS;
 
-  /* Don't allow -mvis, -mvis2, or -mfmaf if FPU is disabled.  */
+  /* -mvis3 implies -mvis2 and -mvis */
+  if (TARGET_VIS3)
+    target_flags |= MASK_VIS2 | MASK_VIS;
+
+  /* Don't allow -mvis, -mvis2, -mvis3, or -mfmaf if FPU is disabled.  */
   if (! TARGET_FPU)
-    target_flags &= ~(MASK_VIS | MASK_VIS2 | MASK_FMAF);
+    target_flags &= ~(MASK_VIS | MASK_VIS2 | MASK_VIS3 | MASK_FMAF);
 
   /* -mvis assumes UltraSPARC+, so we are sure v9 instructions
      are available.
@@ -9196,6 +9200,10 @@  sparc_vis_init_builtins (void)
   tree di_ftype_v8qi_v8qi_di = build_function_type_list (intDI_type_node,
 							 v8qi, v8qi,
 							 intDI_type_node, 0);
+  tree di_ftype_v8qi_v8qi = build_function_type_list (intDI_type_node,
+						      v8qi, v8qi, 0);
+  tree si_ftype_v8qi_v8qi = build_function_type_list (intSI_type_node,
+						      v8qi, v8qi, 0);
   tree di_ftype_di_di = build_function_type_list (intDI_type_node,
 						  intDI_type_node,
 						  intDI_type_node, 0);
@@ -9226,6 +9234,8 @@  sparc_vis_init_builtins (void)
 						 intDI_type_node, 0);
   tree di_ftype_void = build_function_type_list (intDI_type_node,
 						 void_type_node, 0);
+  tree void_ftype_si = build_function_type_list (void_type_node,
+						 intSI_type_node, 0);
 
   /* Packing and expanding vectors.  */
   def_builtin ("__builtin_vis_fpack16", CODE_FOR_fpack16_vis,
@@ -9447,6 +9457,102 @@  sparc_vis_init_builtins (void)
       def_builtin ("__builtin_vis_bshuffledi", CODE_FOR_bshuffledi_vis,
 		   di_ftype_di_di);
     }
+
+  if (TARGET_VIS3)
+    {
+      if (TARGET_ARCH64)
+	{
+	  def_builtin ("__builtin_vis_cmask8", CODE_FOR_cmask8di_vis,
+		       void_ftype_di);
+	  def_builtin ("__builtin_vis_cmask16", CODE_FOR_cmask16di_vis,
+		       void_ftype_di);
+	  def_builtin ("__builtin_vis_cmask32", CODE_FOR_cmask32di_vis,
+		       void_ftype_di);
+	}
+      else
+	{
+	  def_builtin ("__builtin_vis_cmask8", CODE_FOR_cmask8si_vis,
+		       void_ftype_si);
+	  def_builtin ("__builtin_vis_cmask16", CODE_FOR_cmask16si_vis,
+		       void_ftype_si);
+	  def_builtin ("__builtin_vis_cmask32", CODE_FOR_cmask32si_vis,
+		       void_ftype_si);
+	}
+
+      def_builtin_const ("__builtin_vis_fchksm16", CODE_FOR_fchksm16_vis,
+			 v4hi_ftype_v4hi_v4hi);
+
+      def_builtin_const ("__builtin_vis_fsll16", CODE_FOR_fsll16_vis,
+			 v4hi_ftype_v4hi_v4hi);
+      def_builtin_const ("__builtin_vis_fslas16", CODE_FOR_fslas16_vis,
+			 v4hi_ftype_v4hi_v4hi);
+      def_builtin_const ("__builtin_vis_fsrl16", CODE_FOR_fsrl16_vis,
+			 v4hi_ftype_v4hi_v4hi);
+      def_builtin_const ("__builtin_vis_fsra16", CODE_FOR_fsra16_vis,
+			 v4hi_ftype_v4hi_v4hi);
+      def_builtin_const ("__builtin_vis_fsll32", CODE_FOR_fsll32_vis,
+			 v2si_ftype_v2si_v2si);
+      def_builtin_const ("__builtin_vis_fslas32", CODE_FOR_fslas32_vis,
+			 v2si_ftype_v2si_v2si);
+      def_builtin_const ("__builtin_vis_fsrl32", CODE_FOR_fsrl32_vis,
+			 v2si_ftype_v2si_v2si);
+      def_builtin_const ("__builtin_vis_fsra32", CODE_FOR_fsra32_vis,
+			 v2si_ftype_v2si_v2si);
+
+      if (TARGET_ARCH64)
+	def_builtin_const ("__builtin_vis_pdistn", CODE_FOR_pdistndi_vis,
+			   di_ftype_v8qi_v8qi);
+      else
+	def_builtin_const ("__builtin_vis_pdistn", CODE_FOR_pdistnsi_vis,
+			   si_ftype_v8qi_v8qi);
+
+      def_builtin_const ("__builtin_vis_fmean16", CODE_FOR_fmean16_vis,
+			 v4hi_ftype_v4hi_v4hi);
+      def_builtin_const ("__builtin_vis_fpadd64", CODE_FOR_fpadd64_vis,
+			 di_ftype_di_di);
+      def_builtin_const ("__builtin_vis_fpsub64", CODE_FOR_fpsub64_vis,
+			 di_ftype_di_di);
+
+      def_builtin_const ("__builtin_vis_fpadds16", CODE_FOR_fpadds16_vis,
+			 v4hi_ftype_v4hi_v4hi);
+      def_builtin_const ("__builtin_vis_fpadds16s", CODE_FOR_fpadds16s_vis,
+			 v2hi_ftype_v2hi_v2hi);
+      def_builtin_const ("__builtin_vis_fpsubs16", CODE_FOR_fpsubs16_vis,
+			 v4hi_ftype_v4hi_v4hi);
+      def_builtin_const ("__builtin_vis_fpsubs16s", CODE_FOR_fpsubs16s_vis,
+			 v2hi_ftype_v2hi_v2hi);
+      def_builtin_const ("__builtin_vis_fpadds32", CODE_FOR_fpadds32_vis,
+			 v2si_ftype_v2si_v2si);
+      def_builtin_const ("__builtin_vis_fpadds32s", CODE_FOR_fpadds32s_vis,
+			 v1si_ftype_v1si_v1si);
+      def_builtin_const ("__builtin_vis_fpsubs32", CODE_FOR_fpsubs32_vis,
+			 v2si_ftype_v2si_v2si);
+      def_builtin_const ("__builtin_vis_fpsubs32s", CODE_FOR_fpsubs32s_vis,
+			 v1si_ftype_v1si_v1si);
+
+      if (TARGET_ARCH64)
+	{
+	  def_builtin_const ("__builtin_vis_fucmple8", CODE_FOR_fucmple8di_vis,
+			     di_ftype_v8qi_v8qi);
+	  def_builtin_const ("__builtin_vis_fucmpne8", CODE_FOR_fucmpne8di_vis,
+			     di_ftype_v8qi_v8qi);
+	  def_builtin_const ("__builtin_vis_fucmpgt8", CODE_FOR_fucmpgt8di_vis,
+			     di_ftype_v8qi_v8qi);
+	  def_builtin_const ("__builtin_vis_fucmpeq8", CODE_FOR_fucmpeq8di_vis,
+			     di_ftype_v8qi_v8qi);
+	}
+      else
+	{
+	  def_builtin_const ("__builtin_vis_fucmple8", CODE_FOR_fucmple8si_vis,
+			     si_ftype_v8qi_v8qi);
+	  def_builtin_const ("__builtin_vis_fucmpne8", CODE_FOR_fucmpne8si_vis,
+			     si_ftype_v8qi_v8qi);
+	  def_builtin_const ("__builtin_vis_fucmpgt8", CODE_FOR_fucmpgt8si_vis,
+			     si_ftype_v8qi_v8qi);
+	  def_builtin_const ("__builtin_vis_fucmpeq8", CODE_FOR_fucmpeq8si_vis,
+			     si_ftype_v8qi_v8qi);
+	}
+    }
 }
 
 /* Handle TARGET_EXPAND_BUILTIN target hook.
@@ -9608,13 +9714,27 @@  sparc_fold_builtin (tree fndecl, int n_args ATTRIBUTE_UNUSED,
   tree rtype = TREE_TYPE (TREE_TYPE (fndecl));
   enum insn_code icode = (enum insn_code) DECL_FUNCTION_CODE (fndecl);
 
-  if (ignore
-      && icode != CODE_FOR_alignaddrsi_vis
-      && icode != CODE_FOR_alignaddrdi_vis
-      && icode != CODE_FOR_wrgsr_vis
-      && icode != CODE_FOR_bmasksi_vis
-      && icode != CODE_FOR_bmaskdi_vis)
-    return build_zero_cst (rtype);
+  if (ignore)
+    {
+      switch (icode)
+	{
+	case CODE_FOR_alignaddrsi_vis:
+	case CODE_FOR_alignaddrdi_vis:
+	case CODE_FOR_wrgsr_vis:
+	case CODE_FOR_bmasksi_vis:
+	case CODE_FOR_bmaskdi_vis:
+	case CODE_FOR_cmask8si_vis:
+	case CODE_FOR_cmask8di_vis:
+	case CODE_FOR_cmask16si_vis:
+	case CODE_FOR_cmask16di_vis:
+	case CODE_FOR_cmask32si_vis:
+	case CODE_FOR_cmask32di_vis:
+	  break;
+
+	default:
+	  return build_zero_cst (rtype);
+	}
+    }
 
   switch (icode)
     {
diff --git a/gcc/config/sparc/sparc.h b/gcc/config/sparc/sparc.h
index cccd444..fa94387 100644
--- a/gcc/config/sparc/sparc.h
+++ b/gcc/config/sparc/sparc.h
@@ -1868,6 +1868,8 @@  extern int sparc_indent_opcode;
 #define AS_NIAGARA3_FLAG "b"
 #undef TARGET_FMAF
 #define TARGET_FMAF 0
+#undef TARGET_VIS3
+#define TARGET_VIS3 0
 #else
 #define AS_NIAGARA3_FLAG "d"
 #endif
diff --git a/gcc/config/sparc/sparc.md b/gcc/config/sparc/sparc.md
index 0446955..03158c7 100644
--- a/gcc/config/sparc/sparc.md
+++ b/gcc/config/sparc/sparc.md
@@ -80,6 +80,12 @@ 
    (UNSPEC_EDGE32N		74)
    (UNSPEC_EDGE32LN		75)
    (UNSPEC_BSHUFFLE		76)
+   (UNSPEC_CMASK8		77)
+   (UNSPEC_CMASK16		78)
+   (UNSPEC_CMASK32		79)
+   (UNSPEC_FCHKSM16		80)
+   (UNSPEC_PDISTN		81)
+   (UNSPEC_FUCMP		82)
   ])
 
 (define_constants
@@ -195,12 +201,16 @@ 
 (define_mode_iterator V64 [DF V2SI V4HI V8QI])
 (define_mode_iterator V64I [DI V2SI V4HI V8QI])
 
+(define_mode_iterator V64N8 [V2SI V4HI])
+
 ;; The upper 32 fp regs on the v9 can't hold SFmode values.  To deal with this
 ;; a second register class, EXTRA_FP_REGS, exists for the v9 chip.  The name
 ;; is a bit of a misnomer as it covers all 64 fp regs.  The corresponding
 ;; constraint letter is 'e'.  To avoid any confusion, 'e' is used instead of
 ;; 'f' for all DF/TFmode values, including those that are specific to the v8.
 
+(define_mode_attr vbits [(V2SI "32") (V4HI "16") (SI "32s") (V2HI "16s")])
+(define_mode_attr vconstr [(V2SI "e") (V4HI "e") (SI "f") (V2HI "f")])
 
 ;; Attribute for cpu type.
 ;; These must match the values for enum processor_type in sparc.h.
@@ -8271,4 +8281,106 @@ 
   "edge32ln\t%r1, %r2, %0"
   [(set_attr "type" "edge")])
 
+;; Conditional moves are possible via fcmpX --> cmaskX -> bshuffle
+(define_insn "cmask8<P:mode>_vis"
+  [(set (reg:DI GSR_REG)
+        (unspec:DI [(match_operand:P 0 "register_operand" "r")
+	            (reg:DI GSR_REG)]
+                   UNSPEC_CMASK8))]
+  "TARGET_VIS3"
+  "cmask8\t%r0")
+
+(define_insn "cmask16<P:mode>_vis"
+  [(set (reg:DI GSR_REG)
+        (unspec:DI [(match_operand:P 0 "register_operand" "r")
+	            (reg:DI GSR_REG)]
+                   UNSPEC_CMASK16))]
+  "TARGET_VIS3"
+  "cmask16\t%r0")
+
+(define_insn "cmask32<P:mode>_vis"
+  [(set (reg:DI GSR_REG)
+        (unspec:DI [(match_operand:P 0 "register_operand" "r")
+	            (reg:DI GSR_REG)]
+                   UNSPEC_CMASK32))]
+  "TARGET_VIS3"
+  "cmask32\t%r0")
+
+(define_insn "fchksm16_vis"
+  [(set (match_operand:V4HI 0 "register_operand" "=e")
+        (unspec:V4HI [(match_operand:V4HI 1 "register_operand" "e")
+                      (match_operand:V4HI 2 "register_operand" "e")]
+                     UNSPEC_FCHKSM16))]
+  "TARGET_VIS3"
+  "fchksm16\t%1, %2, %0")
+
+(define_code_iterator vis3_shift [ashift ss_ashift lshiftrt ashiftrt])
+(define_code_attr vis3_shift_insn
+  [(ashift "fsll") (ss_ashift "fslas") (lshiftrt "fsrl") (ashiftrt "fsra")])
+   
+(define_insn "<vis3_shift_insn><vbits>_vis"
+  [(set (match_operand:V64N8 0 "register_operand" "=<vconstr>")
+        (vis3_shift:V64N8 (match_operand:V64N8 1 "register_operand" "<vconstr>")
+                          (match_operand:V64N8 2 "register_operand" "<vconstr>")))]
+  "TARGET_VIS3"
+  "<vis3_shift_insn><vbits>\t%1, %2, %0")
+
+(define_insn "pdistn<mode>_vis"
+  [(set (match_operand:P 0 "register_operand" "=r")
+        (unspec:P [(match_operand:V8QI 1 "register_operand" "e")
+                   (match_operand:V8QI 2 "register_operand" "e")]
+         UNSPEC_PDISTN))]
+  "TARGET_VIS3"
+  "pdistn\t%1, %2, %0")
+
+(define_insn "fmean16_vis"
+  [(set (match_operand:V4HI 0 "register_operand" "=e")
+        (truncate:V4HI
+          (lshiftrt:V4SI
+            (plus:V4SI
+              (plus:V4SI
+                (zero_extend:V4SI
+                  (match_operand:V4HI 1 "register_operand" "e"))
+                (zero_extend:V4SI
+                  (match_operand:V4HI 2 "register_operand" "e")))
+              (const_vector:V4SI [(const_int 1) (const_int 1)
+                                  (const_int 1) (const_int 1)]))
+          (const_int 1))))]
+  "TARGET_VIS3"
+  "fmean16\t%1, %2, %0")
+
+(define_insn "fpadd64_vis"
+  [(set (match_operand:DI 0 "register_operand" "=e")
+        (plus:DI (match_operand:DI 1 "register_operand" "e")
+                 (match_operand:DI 2 "register_operand" "e")))]
+  "TARGET_VIS3"
+  "fpadd64\t%1, %2, %0")
+
+(define_insn "fpsub64_vis"
+  [(set (match_operand:DI 0 "register_operand" "=e")
+        (minus:DI (match_operand:DI 1 "register_operand" "e")
+                  (match_operand:DI 2 "register_operand" "e")))]
+  "TARGET_VIS3"
+  "fpsub64\t%1, %2, %0")
+
+(define_mode_iterator VASS [V4HI V2SI V2HI SI])
+(define_code_iterator vis3_addsub_ss [ss_plus ss_minus])
+(define_code_attr vis3_addsub_ss_insn
+  [(ss_plus "fpadds") (ss_minus "fpsubs")])
+
+(define_insn "<vis3_addsub_ss_insn><vbits>_vis"
+  [(set (match_operand:VASS 0 "register_operand" "=<vconstr>")
+        (vis3_addsub_ss:VASS (match_operand:VASS 1 "register_operand" "<vconstr>")
+                             (match_operand:VASS 2 "register_operand" "<vconstr>")))]
+  "TARGET_VIS3"
+  "<vis3_addsub_ss_insn><vbits>\t%1, %2, %0")
+
+(define_insn "fucmp<code>8<P:mode>_vis"
+  [(set (match_operand:P 0 "register_operand" "=r")
+  	(unspec:P [(gcond:V8QI (match_operand:V8QI 1 "register_operand" "e")
+		               (match_operand:V8QI 2 "register_operand" "e"))]
+	 UNSPEC_FUCMP))]
+  "TARGET_VIS3"
+  "fucmp<code>8\t%1, %2, %0")
+
 (include "sync.md")
diff --git a/gcc/config/sparc/sparc.opt b/gcc/config/sparc/sparc.opt
index a7b60c8..613ae73 100644
--- a/gcc/config/sparc/sparc.opt
+++ b/gcc/config/sparc/sparc.opt
@@ -65,6 +65,10 @@  mvis2
 Target Report Mask(VIS2)
 Use UltraSPARC Visual Instruction Set version 2.0 extensions
 
+mvis3
+Target Report Mask(VIS3)
+Use UltraSPARC Visual Instruction Set version 3.0 extensions
+
 mfmaf
 Target Report Mask(FMAF)
 Use UltraSPARC Fused Multiply-Add extensions
diff --git a/gcc/config/sparc/visintrin.h b/gcc/config/sparc/visintrin.h
index 1688301..32e44e5 100644
--- a/gcc/config/sparc/visintrin.h
+++ b/gcc/config/sparc/visintrin.h
@@ -431,4 +431,200 @@  __vis_edge32ln (void *__A, void *__B)
   return __builtin_vis_edge32ln (__A, __B);
 }
 
+extern __inline void
+__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
+__vis_cmask8 (long __A)
+{
+  return __builtin_vis_cmask8 (__A);
+}
+
+extern __inline void
+__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
+__vis_cmask16 (long __A)
+{
+  return __builtin_vis_cmask16 (__A);
+}
+
+extern __inline void
+__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
+__vis_cmask32 (long __A)
+{
+  return __builtin_vis_cmask32 (__A);
+}
+
+extern __inline __v4hi
+__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
+__vis_fchksm16 (__v4hi __A, __v4hi __B)
+{
+  return __builtin_vis_fchksm16 (__A, __B);
+}
+
+extern __inline __v4hi
+__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
+__vis_fsll16 (__v4hi __A, __v4hi __B)
+{
+  return __builtin_vis_fsll16 (__A, __B);
+}
+
+extern __inline __v4hi
+__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
+__vis_fslas16 (__v4hi __A, __v4hi __B)
+{
+  return __builtin_vis_fslas16 (__A, __B);
+}
+
+extern __inline __v4hi
+__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
+__vis_fsrl16 (__v4hi __A, __v4hi __B)
+{
+  return __builtin_vis_fsrl16 (__A, __B);
+}
+
+extern __inline __v4hi
+__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
+__vis_fsra16 (__v4hi __A, __v4hi __B)
+{
+  return __builtin_vis_fsra16 (__A, __B);
+}
+
+extern __inline __v2si
+__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
+__vis_fsll32 (__v2si __A, __v2si __B)
+{
+  return __builtin_vis_fsll32 (__A, __B);
+}
+
+extern __inline __v2si
+__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
+__vis_fslas32 (__v2si __A, __v2si __B)
+{
+  return __builtin_vis_fslas32 (__A, __B);
+}
+
+extern __inline __v2si
+__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
+__vis_fsrl32 (__v2si __A, __v2si __B)
+{
+  return __builtin_vis_fsrl32 (__A, __B);
+}
+
+extern __inline __v2si
+__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
+__vis_fsra32 (__v2si __A, __v2si __B)
+{
+  return __builtin_vis_fsra32 (__A, __B);
+}
+
+extern __inline long
+__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
+__vis_pdistn (__v8qi __A, __v8qi __B)
+{
+  return __builtin_vis_pdistn (__A, __B);
+}
+
+extern __inline __v4hi
+__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
+__vis_fmean16 (__v4hi __A, __v4hi __B)
+{
+  return __builtin_vis_fmean16 (__A, __B);
+}
+
+extern __inline __i64
+__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
+__vis_fpadd64 (__i64 __A, __i64 __B)
+{
+  return __builtin_vis_fpadd64 (__A, __B);
+}
+
+extern __inline __i64
+__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
+__vis_fpsub64 (__i64 __A, __i64 __B)
+{
+  return __builtin_vis_fpsub64 (__A, __B);
+}
+
+extern __inline __v4hi
+__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
+__vis_fpadds16 (__v4hi __A, __v4hi __B)
+{
+  return __builtin_vis_fpadds16 (__A, __B);
+}
+
+extern __inline __v2hi
+__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
+__vis_fpadds16s (__v2hi __A, __v2hi __B)
+{
+  return __builtin_vis_fpadds16s (__A, __B);
+}
+
+extern __inline __v4hi
+__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
+__vis_fpsubs16 (__v4hi __A, __v4hi __B)
+{
+  return __builtin_vis_fpsubs16 (__A, __B);
+}
+
+extern __inline __v2hi
+__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
+__vis_fpsubs16s (__v2hi __A, __v2hi __B)
+{
+  return __builtin_vis_fpsubs16s (__A, __B);
+}
+
+extern __inline __v2si
+__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
+__vis_fpadds32 (__v2si __A, __v2si __B)
+{
+  return __builtin_vis_fpadds32 (__A, __B);
+}
+
+extern __inline __v1si
+__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
+__vis_fpadds32s (__v1si __A, __v1si __B)
+{
+  return __builtin_vis_fpadds32s (__A, __B);
+}
+
+extern __inline __v2si
+__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
+__vis_fpsubs32 (__v2si __A, __v2si __B)
+{
+  return __builtin_vis_fpsubs32 (__A, __B);
+}
+
+extern __inline __v1si
+__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
+__vis_fpsubs32s (__v1si __A, __v1si __B)
+{
+  return __builtin_vis_fpsubs32s (__A, __B);
+}
+
+extern __inline long
+__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
+__vis_fucmple8 (__v8qi __A, __v8qi __B)
+{
+  return __builtin_vis_fucmple8 (__A, __B);
+}
+
+extern __inline long
+__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
+__vis_fucmpne8 (__v8qi __A, __v8qi __B)
+{
+  return __builtin_vis_fucmpne8 (__A, __B);
+}
+
+extern __inline long
+__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
+__vis_fucmpgt8 (__v8qi __A, __v8qi __B)
+{
+  return __builtin_vis_fucmpgt8 (__A, __B);
+}
+
+extern __inline long
+__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
+__vis_fucmpeq8 (__v8qi __A, __v8qi __B)
+{
+  return __builtin_vis_fucmpeq8 (__A, __B);
+}
+
 #endif  /* _VISINTRIN_H_INCLUDED */
diff --git a/gcc/doc/extend.texi b/gcc/doc/extend.texi
index 5fe0371..1c688dc 100644
--- a/gcc/doc/extend.texi
+++ b/gcc/doc/extend.texi
@@ -13016,8 +13016,8 @@  long __builtin_vis_array16 (long, long);
 long __builtin_vis_array32 (long, long);
 @end smallexample
 
-Additionally, when you use the @option{-mvis2} switch, the VIS version
-2.0 built-in functions become available:
+When you use the @option{-mvis2} switch, the VIS version 2.0 built-in
+functions also become available:
 
 @smallexample
 long __builtin_vis_bmask (long, long);
@@ -13034,6 +13034,47 @@  long __builtin_vis_edge32n (void *, void *);
 long __builtin_vis_edge32ln (void *, void *);
 @end smallexample
 
+When you use the @option{-mvis3} switch, the VIS version 3.0 built-in
+functions also become available:
+
+@smallexample
+void __builtin_vis_cmask8 (long);
+void __builtin_vis_cmask16 (long);
+void __builtin_vis_cmask32 (long);
+
+v4hi __builtin_vis_fchksm16 (v4hi, v4hi);
+
+v4hi __builtin_vis_fsll16 (v4hi, v4hi);
+v4hi __builtin_vis_fslas16 (v4hi, v4hi);
+v4hi __builtin_vis_fsrl16 (v4hi, v4hi);
+v4hi __builtin_vis_fsra16 (v4hi, v4hi);
+v2si __builtin_vis_fsll16 (v2si, v2si);
+v2si __builtin_vis_fslas16 (v2si, v2si);
+v2si __builtin_vis_fsrl16 (v2si, v2si);
+v2si __builtin_vis_fsra16 (v2si, v2si);
+
+long __builtin_vis_pdistn (v8qi, v8qi);
+
+v4hi __builtin_vis_fmean16 (v4hi, v4hi);
+
+int64_t __builtin_vis_fpadd64 (int64_t, int64_t);
+int64_t __builtin_vis_fpsub64 (int64_t, int64_t);
+
+v4hi __builtin_vis_fpadds16 (v4hi, v4hi);
+v2hi __builtin_vis_fpadds16s (v2hi, v2hi);
+v4hi __builtin_vis_fpsubs16 (v4hi, v4hi);
+v2hi __builtin_vis_fpsubs16s (v2hi, v2hi);
+v2si __builtin_vis_fpadds32 (v2si, v2si);
+v1si __builtin_vis_fpadds32s (v1si, v1si);
+v2si __builtin_vis_fpsubs32 (v2si, v2si);
+v1si __builtin_vis_fpsubs32s (v1si, v1si);
+
+long __builtin_vis_fucmple8 (v8qi, v8qi);
+long __builtin_vis_fucmpne8 (v8qi, v8qi);
+long __builtin_vis_fucmpgt8 (v8qi, v8qi);
+long __builtin_vis_fucmpeq8 (v8qi, v8qi);
+@end smallexample
+
 @node SPU Built-in Functions
 @subsection SPU Built-in Functions
 
diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index 8e43f82..bdc7453 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -880,7 +880,8 @@  See RS/6000 and PowerPC Options.
 -mstack-bias  -mno-stack-bias @gol
 -munaligned-doubles  -mno-unaligned-doubles @gol
 -mv8plus  -mno-v8plus  -mvis  -mno-vis @gol
--mvis2 -mno-vis2 -mfmaf -mno-fmaf}
+-mvis2  -mno-vis2  -mvis3  -mno-vis3 @gol
+-mfmaf  -mno-fmaf}
 
 @emph{SPU Options}
 @gccoptlist{-mwarn-reloc -merror-reloc @gol
@@ -17445,6 +17446,16 @@  default is @option{-mvis2} when targetting a cpu that supports such
 instructions, such as UltraSPARC-III and later.  Setting @option{-mvis2}
 also sets @option{-mvis}.
 
+@item -mvis3
+@itemx -mno-vis3
+@opindex mvis3
+@opindex mno-vis3
+With @option{-mvis3}, GCC generates code that takes advantage of
+version 3.0 of the UltraSPARC Visual Instruction Set extensions.  The
+default is @option{-mvis3} when targetting a cpu that supports such
+instructions, such as niagara-3 and later.  Setting @option{-mvis3}
+also sets @option{-mvis2} and @option{-mvis}.
+
 @item -mfmaf
 @itemx -mno-fmaf
 @opindex mfmaf
diff --git a/gcc/testsuite/ChangeLog b/gcc/testsuite/ChangeLog
index 075edfc..a43bf9d 100644
--- a/gcc/testsuite/ChangeLog
+++ b/gcc/testsuite/ChangeLog
@@ -1,3 +1,11 @@ 
+2011-10-01  David S. Miller  <davem@davemloft.net>
+
+	* gcc.target/sparc/cmask.c: New test.
+	* gcc.target/sparc/fpadds.c: New test.
+	* gcc.target/sparc/fshift.c: New test.
+	* gcc.target/sparc/fucmp.c: New test.
+	* gcc.target/sparc/vis3misc.c: New test.
+
 2011-10-01  Janus Weil  <janus@gcc.gnu.org>
 
 	PR fortran/50585
diff --git a/gcc/testsuite/gcc.target/sparc/cmask.c b/gcc/testsuite/gcc.target/sparc/cmask.c
new file mode 100644
index 0000000..b3168ec
--- /dev/null
+++ b/gcc/testsuite/gcc.target/sparc/cmask.c
@@ -0,0 +1,21 @@ 
+/* { dg-do compile } */
+/* { dg-options "-mcpu=niagara3 -mvis" } */
+
+void test_cm8 (long x)
+{
+  __builtin_vis_cmask8 (x);
+}
+
+void test_cm16 (long x)
+{
+  __builtin_vis_cmask16 (x);
+}
+
+void test_cm32 (long x)
+{
+  __builtin_vis_cmask32 (x);
+}
+
+/* { dg-final { scan-assembler "cmask8\t%" } } */
+/* { dg-final { scan-assembler "cmask16\t%" } } */
+/* { dg-final { scan-assembler "cmask32\t%" } } */
diff --git a/gcc/testsuite/gcc.target/sparc/fpadds.c b/gcc/testsuite/gcc.target/sparc/fpadds.c
new file mode 100644
index 0000000..d0704e0
--- /dev/null
+++ b/gcc/testsuite/gcc.target/sparc/fpadds.c
@@ -0,0 +1,55 @@ 
+/* { dg-do compile } */
+/* { dg-options "-mcpu=niagara3 -mvis" } */
+typedef int __v2si __attribute__((vector_size(8)));
+typedef int __v1si __attribute__((vector_size(4)));
+typedef short __v4hi __attribute__((vector_size(8)));
+typedef short __v2hi __attribute__((vector_size(4)));
+
+__v4hi test_fpadds16 (__v4hi x, __v4hi y)
+{
+  return __builtin_vis_fpadds16 (x, y);
+}
+
+__v2hi test_fpadds16s (__v2hi x, __v2hi y)
+{
+  return __builtin_vis_fpadds16s (x, y);
+}
+
+__v4hi test_fpsubs16 (__v4hi x, __v4hi y)
+{
+  return __builtin_vis_fpsubs16 (x, y);
+}
+
+__v2hi test_fpsubs16s (__v2hi x, __v2hi y)
+{
+  return __builtin_vis_fpsubs16s (x, y);
+}
+
+__v2si test_fpadds32 (__v2si x, __v2si y)
+{
+  return __builtin_vis_fpadds32 (x, y);
+}
+
+__v1si test_fpadds32s (__v1si x, __v1si y)
+{
+  return __builtin_vis_fpadds32s (x, y);
+}
+
+__v2si test_fpsubs32 (__v2si x, __v2si y)
+{
+  return __builtin_vis_fpsubs32 (x, y);
+}
+
+__v1si test_fpsubs32s (__v1si x, __v1si y)
+{
+  return __builtin_vis_fpsubs32s (x, y);
+}
+
+/* { dg-final { scan-assembler "fpadds16\t%" } } */
+/* { dg-final { scan-assembler "fpadds16s\t%" } } */
+/* { dg-final { scan-assembler "fpsubs16\t%" } } */
+/* { dg-final { scan-assembler "fpsubs16s\t%" } } */
+/* { dg-final { scan-assembler "fpadds32\t%" } } */
+/* { dg-final { scan-assembler "fpadds32s\t%" } } */
+/* { dg-final { scan-assembler "fpsubs32\t%" } } */
+/* { dg-final { scan-assembler "fpsubs32s\t%" } } */
diff --git a/gcc/testsuite/gcc.target/sparc/fshift.c b/gcc/testsuite/gcc.target/sparc/fshift.c
new file mode 100644
index 0000000..a12df04
--- /dev/null
+++ b/gcc/testsuite/gcc.target/sparc/fshift.c
@@ -0,0 +1,53 @@ 
+/* { dg-do compile } */
+/* { dg-options "-mcpu=niagara3 -mvis" } */
+typedef int __v2si __attribute__((vector_size(8)));
+typedef short __v4hi __attribute__((vector_size(8)));
+
+__v4hi test_fsll16 (__v4hi x, __v4hi y)
+{
+  return __builtin_vis_fsll16 (x, y);
+}
+
+__v4hi test_fslas16 (__v4hi x, __v4hi y)
+{
+  return __builtin_vis_fslas16 (x, y);
+}
+
+__v4hi test_fsrl16 (__v4hi x, __v4hi y)
+{
+  return __builtin_vis_fsrl16 (x, y);
+}
+
+__v4hi test_fsra16 (__v4hi x, __v4hi y)
+{
+  return __builtin_vis_fsra16 (x, y);
+}
+
+__v2si test_fsll32 (__v2si x, __v2si y)
+{
+  return __builtin_vis_fsll32 (x, y);
+}
+
+__v2si test_fslas32 (__v2si x, __v2si y)
+{
+  return __builtin_vis_fslas32 (x, y);
+}
+
+__v2si test_fsrl32 (__v2si x, __v2si y)
+{
+  return __builtin_vis_fsrl32 (x, y);
+}
+
+__v2si test_fsra32 (__v2si x, __v2si y)
+{
+  return __builtin_vis_fsra32 (x, y);
+}
+
+/* { dg-final { scan-assembler "fsll16\t%" } } */
+/* { dg-final { scan-assembler "fslas16\t%" } } */
+/* { dg-final { scan-assembler "fsrl16\t%" } } */
+/* { dg-final { scan-assembler "fsra16\t%" } } */
+/* { dg-final { scan-assembler "fsll32\t%" } } */
+/* { dg-final { scan-assembler "fslas32\t%" } } */
+/* { dg-final { scan-assembler "fsrl32\t%" } } */
+/* { dg-final { scan-assembler "fsra32\t%" } } */
diff --git a/gcc/testsuite/gcc.target/sparc/fucmp.c b/gcc/testsuite/gcc.target/sparc/fucmp.c
new file mode 100644
index 0000000..7f291c3
--- /dev/null
+++ b/gcc/testsuite/gcc.target/sparc/fucmp.c
@@ -0,0 +1,28 @@ 
+/* { dg-do compile } */
+/* { dg-options "-mcpu=niagara3 -mvis" } */
+typedef unsigned char vec8 __attribute__((vector_size(8)));
+
+long test_fucmple8 (vec8 a, vec8 b)
+{
+  return __builtin_vis_fucmple8 (a, b);
+}
+
+long test_fucmpne8 (vec8 a, vec8 b)
+{
+  return __builtin_vis_fucmpne8 (a, b);
+}
+
+long test_fucmpgt8 (vec8 a, vec8 b)
+{
+  return __builtin_vis_fucmpgt8 (a, b);
+}
+
+long test_fucmpeq8 (vec8 a, vec8 b)
+{
+  return __builtin_vis_fucmpeq8 (a, b);
+}
+
+/* { dg-final { scan-assembler "fucmple8\t%" } } */
+/* { dg-final { scan-assembler "fucmpne8\t%" } } */
+/* { dg-final { scan-assembler "fucmpgt8\t%" } } */
+/* { dg-final { scan-assembler "fucmpeq8\t%" } } */
diff --git a/gcc/testsuite/gcc.target/sparc/vis3misc.c b/gcc/testsuite/gcc.target/sparc/vis3misc.c
new file mode 100644
index 0000000..8a9535e
--- /dev/null
+++ b/gcc/testsuite/gcc.target/sparc/vis3misc.c
@@ -0,0 +1,37 @@ 
+/* { dg-do compile } */
+/* { dg-options "-mcpu=niagara3 -mvis" } */
+typedef int __v2si __attribute__((vector_size(8)));
+typedef short __v4hi __attribute__((vector_size(8)));
+typedef unsigned char __v8qi __attribute__((vector_size(8)));
+typedef long long int64_t;
+
+__v4hi test_fchksm16 (__v4hi x, __v4hi y)
+{
+  return __builtin_vis_fchksm16 (x, y);
+}
+
+long test_pdistn (__v8qi x, __v8qi y)
+{
+  return __builtin_vis_pdistn (x, y);
+}
+
+__v4hi test_fmean16 (__v4hi x, __v4hi y)
+{
+  return __builtin_vis_fmean16 (x, y);
+}
+
+int64_t test_fpadd64 (int64_t x, int64_t y)
+{
+  return __builtin_vis_fpadd64 (x, y);
+}
+
+int64_t test_fpsub64 (int64_t x, int64_t y)
+{
+  return __builtin_vis_fpsub64 (x, y);
+}
+
+/* { dg-final { scan-assembler "fchksm16\t%" } } */
+/* { dg-final { scan-assembler "pdistn\t%" } } */
+/* { dg-final { scan-assembler "fmean16\t%" } } */
+/* { dg-final { scan-assembler "fpadd64\t%" } } */
+/* { dg-final { scan-assembler "fpsub64\t%" } } */