[RFC] Sparc vector mode segregation

Message ID 20111017.164332.437787158306181870.davem@davemloft.net
State New

Commit Message

David Miller Oct. 17, 2011, 8:43 p.m. UTC
This is an implementation of the changes I spoke about the other
week.  These changes segregate the vector vs. non-vector mode
handling in the sparc backend.

It completely regstraps on 32-bit sparc-linux with VIS3 enabled by
default.

In fact, gcc.target/sparc/combined-1.c always passes even without
adjusting the optimization level to placate the register allocator,
and many tests now generate more VIS instructions than before,
particularly on 32-bit.

This work necessitated finally adding a vec_init pattern; I just
cons'd up the simplest thing and will improve the code it generates
later.  Without this, the vectorizer gets into trouble it can't get
out of for the vec/scalar --> vec/vec shift transformations.
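
For illustration, here is a minimal hypothetical sketch (not taken
from the patch or the testsuite) of the kind of loop that hits this:
the loop-invariant scalar shift count has to be broadcast into a
vector, which is what vec_init provides.

/* Vectorizing this requires turning the scalar 's' into a vector of
   identical elements, i.e. a vec_init.  */
void
shl (short *a, int n, int s)
{
  int i;
  for (i = 0; i < n; i++)
    a[i] <<= s;
}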

Richard, I would just have that code abort if no vec_init pattern is
provided by the target.  If the target provides vector shifts, it is
simply required to provide a vec_init pattern for the same modes.

I'm really pleased with how a lot of the tree-vectorizer test cases
look with these changes installed, although lots of improvement is
still possible.

I have a vec_perm implementation for VIS2 from Richard in my inbox,
and I can make use of that to improve vec_init substantially.

Anyway, I think these changes are a net improvement, and they will
allow the float<-->int register move VIS3 instructions to be used
properly by the compiler when I add those.

Therefore, unless there are major objections, I'd like to install
these changes after I do some sanity checking on 64-bit.

Feedback is very welcome.

gcc/
	* config/sparc/sparc-modes.def: Add single-entry vector modes for
	DImode and SImode.
	* config/sparc/sparc.md (V32, V32I, V64, V64I, V64N8): Delete
	mode iterators.
	(mov<V32:mode>): Revert back to plain SFmode pattern.
	(*movsf_insn): Likewise.
	(mov<V64:mode>): Revert back to plain DFmode pattern.
	(*movdf_insn_sp32): Likewise.
	(*movdf_insn_sp32_v9): Likewise.
	(*movdf_insn_sp64): Likewise.
	(V64 mode splitters): Likewise.
	(addsi3): Remove VIS alternatives.
	(subsi3): Likewise.
	(and<V64I:mode>3): Revert to DImode only pattern.
	(and<V64I:mode>3_sp32): Likewise.
	(*and<V64I:mode>3_sp64): Likewise.
	(and<V32I:mode>3): Likewise.
	(*and_not_<V64I:mode>_sp32): Likewise.
	(*and_not_<V64I:mode>_sp64): Likewise.
	(*and_not_<V32I:mode>): Likewise.
	(ior<V64I:mode>3): Likewise.
	(*ior<V64I:mode>3_sp32): Likewise.
	(*ior<V64I:mode>3_sp64): Likewise.
	(ior<V32I:mode>3): Likewise.
	(*or_not_<V64I:mode>_sp32): Likewise.
	(*or_not_<V64I:mode>_sp64): Likewise.
	(*or_not_<V32I:mode>): Likewise.
	(xor<V64I:mode>3): Likewise.
	(*xor<V64I:mode>3_sp32): Likewise.
	(*xor<V64I:mode>3_sp64): Likewise.
	(xor<V32I:mode>3): Likewise.
	(V64I mode splitters): Likewise.
	(*xor_not_<V64I:mode>_sp32): Likewise.
	(*xor_not_<V64I:mode>_sp64): Likewise.
	(*xor_not_<V32I:mode>): Likewise.
	(one_cmpl<V64I:mode>2): Likewise.
	(*one_cmpl<V64I:mode>2_sp32): Likewise.
	(*one_cmpl<V64I:mode>2_sp64): Likewise.
	(one_cmpl<V32I:mode>2): Likewise.
	(VM32, VM64, VMALL): New mode iterators.
	(vbits, vconstr, vfptype): New mode attributes.
	(mov<VMALL:mode>): New expander.
	(*mov<VM32:mode>_insn): New insn.
	(*mov<VM64:mode>_insn_sp64): New insn.
	(*mov<VM64:mode>_insn_sp32): New insn, and associated splitter
	specifically for the register to memory case.
	(vec_init<mode>): New expander.
	(VADDSUB): New mode iterator.
	(<plusminus_insn>v2si3, <plusminus_insn>v2hi3): Remove and replace
	with...
	(<plusminus_insn><mode>3): New consolidated pattern.
	(VL): New mode iterator for logical operations.
	(vlsuf): New mode attribute.
	(vlop): New code iterator.
	(vlinsn, vlninsn): New code attributes.
	(<code><mode>3): New insn for non-negated vector logical ops.
	(*not_<code><mode>3): Likewise for negated variants.
	(*nand<mode>_vis): New insn.
	(vlnotop): New code iterator.
	(*<code>_not1<mode>_vis, *<code>_not2<mode>_vis): New insns.
	(one_cmpl<mode>2): New insn.
	(faligndata<V64I:mode>_vis): Rewrite to use VM64 iterator.
	(bshuffle<VM64:mode>_vis): Likewise.
	(v<vis3_shift_patname><mode>3): Use GCM mode iterator.
	(fp<plusminus_insn>64_vis): Use V1DI mode.
	(VASS mode iterator): Use V1SI not SI mode.
	* config/sparc/sparc.c (sparc_vis_init_builtins): Account for
	single-entry vector mode changes.
	(sparc_expand_builtin): Likewise.
	(sparc_expand_vector_init): New function.
	* config/sparc/sparc-protos.h (sparc_expand_vector_init): Declare.

gcc/testsuite/

	* gcc.target/sparc/fand.c: Remove __LP64__ ifdefs and expect
	all operations to emit VIS instructions.
	* gcc.target/sparc/fandnot.c: Likewise.
	* gcc.target/sparc/fnot.c: Likewise.
	* gcc.target/sparc/for.c: Likewise.
	* gcc.target/sparc/fornot.c: Likewise.
	* gcc.target/sparc/fxnor.c: Likewise.
	* gcc.target/sparc/fxor.c: Likewise.

Comments

Eric Botcazou Oct. 17, 2011, 10:09 p.m. UTC | #1
> This is an implementation of the changes I spoke about the other
> week.  These changes segregate the vector vs. non-vector mode
> handling in the sparc backend.

I think that the original motivation for the previous design was the 32-bit 
vector ABI, where the arguments are passed in integer registers.  So for:

typedef char  vec8 __attribute__((vector_size(8)));

extern vec8 foo (vec8);

vec8 bar(vec8 a, vec8 b)
{
  return foo(a & b);
}

the generated code at -O2 is optimal:

bar:
	and	%o2, %o0, %o0
	sethi	%hi(foo), %g1
	jmp	%g1 + %lo(foo)
	 and	%o3, %o1, %o1

My understanding is that, with the changes, you will spill and reload twice.
Of course things are totally different with the 64-bit ABI.

A compromise could be to segregate the patterns, but still have alternatives
for the other registers, i.e. andsi3 would still have the 'd' alternative at
the end and andv1si3 would have an 'r' alternative at the end, with both
disparaged properly.
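
Something like this hypothetical andv1si3, say (just a sketch, not a
patch; the '?' prefix is what disparages the integer alternative):

(define_insn "andv1si3"
  [(set (match_operand:V1SI 0 "register_operand" "=d,?r")
	(and:V1SI (match_operand:V1SI 1 "register_operand" "%d,r")
		  (match_operand:V1SI 2 "register_operand" "d,r")))]
  "TARGET_VIS"
  "@
   fands\t%1, %2, %0
   and\t%1, %2, %0")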

> In fact, gcc.target/sparc/combined-1.c always passes even without
> adjusting the optimization level to placate the register allocator,
> and many tests now generate more VIS instructions than before,
> particularly on 32-bit.

Feel free to revert the adjustment I made as part of the patch.
David Miller Oct. 17, 2011, 10:29 p.m. UTC | #2
From: Eric Botcazou <ebotcazou@adacore.com>
Date: Tue, 18 Oct 2011 00:09:55 +0200

> I think that the original motivation for the previous design was the 32-bit 
> vector ABI, where the arguments are passed in integer registers.  So for:
> 
> typedef char  vec8 __attribute__((vector_size(8)));
> 
> extern vec8 foo (vec8);
> 
> vec8 bar(vec8 a, vec8 b)
> {
>   return foo(a & b);
> }
> 
> the generated code at -O2 is optimal:
> 
> bar:
> 	and	%o2, %o0, %o0
> 	sethi	%hi(foo), %g1
> 	jmp	%g1 + %lo(foo)
> 	 and	%o3, %o1, %o1
> 
> My understanding is that, with the changes, you will spill and reload twice.
> Of course things are totally different with the 64-bit ABI.
> 
> A compromise could be to segregate the patterns, but still have alternatives
> for the other registers, i.e. andsi3 would still have the 'd' alternative at
> the end and andv1si3 would have an 'r' alternative at the end, with both
> disparaged properly.

I understand this, but one major problem with the original patterns is that
they told the compiler that integer arithmetic was also possible on DImode
and SImode values in float regs.

Guess what kinds of things reload does when you tell it that, and you also
give it access to a one-instruction way to move values between float and
integer regs?

Compounding this is the register allocation order for leaf functions.
That would cause the compiler to sometimes go to float regs for
integer reloads before going to the integer regs that would make
the function non-leaf.

Even in situations where this might provide some level of gain, it caused
extra code to be generated.  For example, if it reloaded the function's
return value temporarily into a float reg, we couldn't merge the reload
into the return value register as part of a "restore" instruction, whereas
using a %lN register instead would allow us to do that.  So:

	mov	%o4, %l3
	...
	ret
	 restore %g0, %l3, %o0

turned into stuff like:

	movwtos %o4, %f0
	...
	movstouw %f0, %o0
	ret
	 restore

I would suggest we start with my patch, get the int<-->float move
instructions working reasonably, and then re-add vector-only cases for
the scenarios you describe above, making sure that we don't end up with
silly code generation scenarios like those I've just described.

I'm happy to work on all of that myself.

>> In fact, gcc.target/sparc/combined-1.c always passes even without
>> adjusting the optimization level to placate the register allocator,
>> and many tests now generate more VIS instructions than before,
>> particularly on 32-bit.
> 
> Feel free to revert the adjustment I made as part of the patch.

Thanks for reviewing.

Richard Henderson also suggested to me during a separate conversation
that I should use CONSTANT_P in sparc_expand_vector_init() instead of
the convoluted test I had there.
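
Inside the loop that scans the vector elements, that would be roughly
(a sketch; note CONSTANT_P accepts a slightly wider set of RTXs than
the open-coded test did):

      rtx x = XVECEXP (vals, 0, i);
      /* Count the elements that are not constants.  */
      if (!CONSTANT_P (x))
	n_var++;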
Eric Botcazou Oct. 17, 2011, 10:57 p.m. UTC | #3
> I would suggest we start with my patch, get the int<-->float move
> instructions working reasonably, and then re-add vector-only cases for
> the scenarios you describe above, making sure that we don't end up with
> silly code generation scenarios like those I've just described.

This sounds like a good plan to me.
David Miller Oct. 17, 2011, 11:08 p.m. UTC | #4
From: Eric Botcazou <ebotcazou@adacore.com>
Date: Tue, 18 Oct 2011 00:57:18 +0200

>> I would suggest we start with my patch, get the int<-->float move
>> instructions working reasonably, and then re-add vector-only cases for
>> the scenarios you describe above, making sure that we don't end up with
>> silly code generation scenarios like those I've just described.
> 
> This sounds like a good plan to me.

Great, committed to trunk.

Thanks again.
Eric Botcazou Oct. 24, 2011, 5 p.m. UTC | #5
> Great, committed to trunk.

Minor nit: can't you uncouple the GY, ZC and DF couples of constraints now?
We presumably need only one member of each couple per alternative now, i.e.
F,G,C in FP insns and D,Y,Z in vector insns.
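
Concretely, as a hypothetical sketch (not a patch), the constraints of
*movsf_insn could then shrink from the couples to single letters:

  [(set (match_operand:SF 0 "nonimmediate_operand" "=d,d,f,  *r,*r,*r,f,*r,m,  m")
	(match_operand:SF 1 "input_operand"        " G,C,f,*rRY, Q, S,m, m,f,*rG"))]

with the vector-only patterns keeping just Y and Z.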
David Miller Oct. 24, 2011, 9:14 p.m. UTC | #6
From: Eric Botcazou <ebotcazou@adacore.com>
Date: Mon, 24 Oct 2011 19:00:42 +0200

>> Great, committed to trunk.
> 
> Minor nit: can't you uncouple the GY, ZC and DF couples of constraints now?
> We presumably need only one member of each couple per alternative now, i.e.
> F,G,C in FP insns and D,Y,Z in vector insns.

Right, and I was also considering getting rid of the VIS-specific constraints
if we can get my "enabled" attr patch further along.

Thanks for bringing this up.
Patch

diff --git a/gcc/ChangeLog b/gcc/ChangeLog
index 2bc40b0..26ebc23 100644
--- a/gcc/ChangeLog
+++ b/gcc/ChangeLog
@@ -1,3 +1,77 @@ 
+2011-10-17  David S. Miller  <davem@davemloft.net>
+
+	* config/sparc/sparc-modes.def: Add single-entry vector modes for
+	DImode and SImode.
+	* config/sparc/sparc.md (V32, V32I, V64, V64I, V64N8): Delete
+	mode iterators.
+	(mov<V32:mode>): Revert back to plain SFmode pattern.
+	(*movsf_insn): Likewise.
+	(mov<V64:mode>): Revert back to plain DFmode pattern.
+	(*movdf_insn_sp32): Likewise.
+	(*movdf_insn_sp32_v9): Likewise.
+	(*movdf_insn_sp64): Likewise.
+	(V64 mode splitters): Likewise.
+	(addsi3): Remove VIS alternatives.
+	(subsi3): Likewise.
+	(and<V64I:mode>3): Revert to DImode only pattern.
+	(and<V64I:mode>3_sp32): Likewise.
+	(*and<V64I:mode>3_sp64): Likewise.
+	(and<V32I:mode>3): Likewise.
+	(*and_not_<V64I:mode>_sp32): Likewise.
+	(*and_not_<V64I:mode>_sp64): Likewise.
+	(*and_not_<V32I:mode>): Likewise.
+	(ior<V64I:mode>3): Likewise.
+	(*ior<V64I:mode>3_sp32): Likewise.
+	(*ior<V64I:mode>3_sp64): Likewise.
+	(ior<V32I:mode>3): Likewise.
+	(*or_not_<V64I:mode>_sp32): Likewise.
+	(*or_not_<V64I:mode>_sp64): Likewise.
+	(*or_not_<V32I:mode>): Likewise.
+	(xor<V64I:mode>3): Likewise.
+	(*xor<V64I:mode>3_sp32): Likewise.
+	(*xor<V64I:mode>3_sp64): Likewise.
+	(xor<V32I:mode>3): Likewise.
+	(V64I mode splitters): Likewise.
+	(*xor_not_<V64I:mode>_sp32): Likewise.
+	(*xor_not_<V64I:mode>_sp64): Likewise.
+	(*xor_not_<V32I:mode>): Likewise.
+	(one_cmpl<V64I:mode>2): Likewise.
+	(*one_cmpl<V64I:mode>2_sp32): Likewise.
+	(*one_cmpl<V64I:mode>2_sp64): Likewise.
+	(one_cmpl<V32I:mode>2): Likewise.
+	(VM32, VM64, VMALL): New mode iterators.
+	(vbits, vconstr, vfptype): New mode attributes.
+	(mov<VMALL:mode>): New expander.
+	(*mov<VM32:mode>_insn): New insn.
+	(*mov<VM64:mode>_insn_sp64): New insn.
+	(*mov<VM64:mode>_insn_sp32): New insn, and associated splitter
+	specifically for the register to memory case.
+	(vec_init<mode>): New expander.
+	(VADDSUB): New mode iterator.
+	(<plusminus_insn>v2si3, <plusminus_insn>v2hi3): Remove and replace
+	with...
+	(<plusminus_insn><mode>3): New consolidated pattern.
+	(VL): New mode iterator for logical operations.
+	(vlsuf): New mode attribute.
+	(vlop): New code iterator.
+	(vlinsn, vlninsn): New code attributes.
+	(<code><mode>3): New insn for non-negated vector logical ops.
+	(*not_<code><mode>3): Likewise for negated variants.
+	(*nand<mode>_vis): New insn.
+	(vlnotop): New code iterator.
+	(*<code>_not1<mode>_vis, *<code>_not2<mode>_vis): New insns.
+	(one_cmpl<mode>2): New insn.
+	(faligndata<V64I:mode>_vis): Rewrite to use VM64 iterator.
+	(bshuffle<VM64:mode>_vis): Likewise.
+	(v<vis3_shift_patname><mode>3): Use GCM mode iterator.
+	(fp<plusminus_insn>64_vis): Use V1DI mode.
+	(VASS mode iterator): Use V1SI not SI mode.
+	* config/sparc/sparc.c (sparc_vis_init_builtins): Account for
+	single-entry vector mode changes.
+	(sparc_expand_builtin): Likewise.
+	(sparc_expand_vector_init): New function.
+	* config/sparc/sparc-protos.h (sparc_expand_vector_init): Declare.
+
 2011-10-14  David S. Miller  <davem@davemloft.net>
 
 	* config/sparc/sol2.h: Protect -m{cpu,tune}=native handling
diff --git a/gcc/config/sparc/sparc-modes.def b/gcc/config/sparc/sparc-modes.def
index ed135cc..a5849c9 100644
--- a/gcc/config/sparc/sparc-modes.def
+++ b/gcc/config/sparc/sparc-modes.def
@@ -45,4 +45,6 @@  CC_MODE (CCFPE);
 /* Vector modes.  */
 VECTOR_MODES (INT, 16);       /* V16QI V8HI V4SI V2DI */
 VECTOR_MODES (INT, 8);        /*       V8QI V4HI V2SI */
-VECTOR_MODES (INT, 4);        /*       V4QI V2HI */
+VECTOR_MODES (INT, 4);        /*       V4QI V2HI      */
+VECTOR_MODE (INT, DI, 1);     /*                 V1DI */
+VECTOR_MODE (INT, SI, 1);     /*                 V1SI */
diff --git a/gcc/config/sparc/sparc-protos.h b/gcc/config/sparc/sparc-protos.h
index f7b563e..744747a 100644
--- a/gcc/config/sparc/sparc-protos.h
+++ b/gcc/config/sparc/sparc-protos.h
@@ -106,6 +106,7 @@  extern int sparc_check_64 (rtx, rtx);
 extern rtx gen_df_reg (rtx, int);
 extern void sparc_expand_compare_and_swap_12 (rtx, rtx, rtx, rtx);
 extern const char *output_v8plus_mult (rtx, rtx *, const char *);
+extern void sparc_expand_vector_init (rtx, rtx);
 #endif /* RTX_CODE */
 
 #endif /* __SPARC_PROTOS_H__ */
diff --git a/gcc/config/sparc/sparc.c b/gcc/config/sparc/sparc.c
index a7b075c..93f900e 100644
--- a/gcc/config/sparc/sparc.c
+++ b/gcc/config/sparc/sparc.c
@@ -9403,7 +9403,7 @@  sparc_vis_init_builtins (void)
 	       v8qi_ftype_v8qi_v8qi);
   def_builtin ("__builtin_vis_faligndatav2si", CODE_FOR_faligndatav2si_vis,
 	       v2si_ftype_v2si_v2si);
-  def_builtin ("__builtin_vis_faligndatadi", CODE_FOR_faligndatadi_vis,
+  def_builtin ("__builtin_vis_faligndatadi", CODE_FOR_faligndatav1di_vis,
 	       di_ftype_di_di);
 
   def_builtin ("__builtin_vis_write_gsr", CODE_FOR_wrgsr_vis,
@@ -9539,7 +9539,7 @@  sparc_vis_init_builtins (void)
 		     v2hi_ftype_v2hi_v2hi);
   def_builtin_const ("__builtin_vis_fpadd32", CODE_FOR_addv2si3,
 		     v2si_ftype_v2si_v2si);
-  def_builtin_const ("__builtin_vis_fpadd32s", CODE_FOR_addsi3,
+  def_builtin_const ("__builtin_vis_fpadd32s", CODE_FOR_addv1si3,
 		     v1si_ftype_v1si_v1si);
   def_builtin_const ("__builtin_vis_fpsub16", CODE_FOR_subv4hi3,
 		     v4hi_ftype_v4hi_v4hi);
@@ -9547,7 +9547,7 @@  sparc_vis_init_builtins (void)
 		     v2hi_ftype_v2hi_v2hi);
   def_builtin_const ("__builtin_vis_fpsub32", CODE_FOR_subv2si3,
 		     v2si_ftype_v2si_v2si);
-  def_builtin_const ("__builtin_vis_fpsub32s", CODE_FOR_subsi3,
+  def_builtin_const ("__builtin_vis_fpsub32s", CODE_FOR_subv1si3,
 		     v1si_ftype_v1si_v1si);
 
   /* Three-dimensional array addressing.  */
@@ -9585,7 +9585,7 @@  sparc_vis_init_builtins (void)
 		   v8qi_ftype_v8qi_v8qi);
       def_builtin ("__builtin_vis_bshufflev2si", CODE_FOR_bshufflev2si_vis,
 		   v2si_ftype_v2si_v2si);
-      def_builtin ("__builtin_vis_bshuffledi", CODE_FOR_bshuffledi_vis,
+      def_builtin ("__builtin_vis_bshuffledi", CODE_FOR_bshufflev1di_vis,
 		   di_ftype_di_di);
     }
 
@@ -9654,11 +9654,11 @@  sparc_vis_init_builtins (void)
 			 v2hi_ftype_v2hi_v2hi);
       def_builtin_const ("__builtin_vis_fpadds32", CODE_FOR_ssaddv2si3,
 			 v2si_ftype_v2si_v2si);
-      def_builtin_const ("__builtin_vis_fpadds32s", CODE_FOR_ssaddsi3,
+      def_builtin_const ("__builtin_vis_fpadds32s", CODE_FOR_ssaddv1si3,
 			 v1si_ftype_v1si_v1si);
       def_builtin_const ("__builtin_vis_fpsubs32", CODE_FOR_sssubv2si3,
 			 v2si_ftype_v2si_v2si);
-      def_builtin_const ("__builtin_vis_fpsubs32s", CODE_FOR_sssubsi3,
+      def_builtin_const ("__builtin_vis_fpsubs32s", CODE_FOR_sssubv1si3,
 			 v1si_ftype_v1si_v1si);
 
       if (TARGET_ARCH64)
@@ -9748,6 +9748,13 @@  sparc_expand_builtin (tree exp, rtx target,
       insn_op = &insn_data[icode].operand[idx];
       op[arg_count] = expand_normal (arg);
 
+      if (insn_op->mode == V1DImode
+	  && GET_MODE (op[arg_count]) == DImode)
+	op[arg_count] = gen_lowpart (V1DImode, op[arg_count]);
+      else if (insn_op->mode == V1SImode
+	  && GET_MODE (op[arg_count]) == SImode)
+	op[arg_count] = gen_lowpart (V1SImode, op[arg_count]);
+
       if (! (*insn_data[icode].operand[idx].predicate) (op[arg_count],
 							insn_op->mode))
 	op[arg_count] = copy_to_mode_reg (insn_op->mode, op[arg_count]);
@@ -11060,4 +11067,36 @@  output_v8plus_mult (rtx insn, rtx *operands, const char *name)
     }
 }
 
+void
+sparc_expand_vector_init (rtx target, rtx vals)
+{
+  enum machine_mode mode = GET_MODE (target);
+  enum machine_mode inner_mode = GET_MODE_INNER (mode);
+  int n_elts = GET_MODE_NUNITS (mode);
+  int i, n_var = 0;
+  rtx mem;
+
+  for (i = 0; i < n_elts; i++)
+    {
+      rtx x = XVECEXP (vals, 0, i);
+      if (!(CONST_INT_P (x)
+	    || GET_CODE (x) == CONST_DOUBLE
+	    || GET_CODE (x) == CONST_FIXED))
+	n_var++;
+    }
+
+  if (n_var == 0)
+    {
+      emit_move_insn (target, gen_rtx_CONST_VECTOR (mode, XVEC (vals, 0)));
+      return;
+    }
+
+  mem = assign_stack_temp (mode, GET_MODE_SIZE (mode), 0);
+  for (i = 0; i < n_elts; i++)
+    emit_move_insn (adjust_address_nv (mem, inner_mode,
+				    i * GET_MODE_SIZE (inner_mode)),
+		    XVECEXP (vals, 0, i));
+  emit_move_insn (target, mem);
+}
+
 #include "gt-sparc.h"
diff --git a/gcc/config/sparc/sparc.md b/gcc/config/sparc/sparc.md
index 6118e6d..200245f 100644
--- a/gcc/config/sparc/sparc.md
+++ b/gcc/config/sparc/sparc.md
@@ -200,24 +200,12 @@ 
 (define_mode_iterator I [QI HI SI DI])
 (define_mode_iterator F [SF DF TF])
 
-;; We don't define V1SI because SI should work just fine.
-(define_mode_iterator V32 [SF V2HI V4QI])
-(define_mode_iterator V32I [SI V2HI V4QI])
-
-(define_mode_iterator V64 [DF V2SI V4HI V8QI])
-(define_mode_iterator V64I [DI V2SI V4HI V8QI])
-
-(define_mode_iterator V64N8 [V2SI V4HI])
-
 ;; The upper 32 fp regs on the v9 can't hold SFmode values.  To deal with this
 ;; a second register class, EXTRA_FP_REGS, exists for the v9 chip.  The name
 ;; is a bit of a misnomer as it covers all 64 fp regs.  The corresponding
 ;; constraint letter is 'e'.  To avoid any confusion, 'e' is used instead of
 ;; 'f' for all DF/TFmode values, including those that are specific to the v8.
 
-(define_mode_attr vbits [(V2SI "32") (V4HI "16") (SI "32s") (V2HI "16s")])
-(define_mode_attr vconstr [(V2SI "e") (V4HI "e") (SI "f") (V2HI "f")])
-
 ;; Attribute for cpu type.
 ;; These must match the values for enum processor_type in sparc.h.
 (define_attr "cpu"
@@ -1929,24 +1917,23 @@ 
 })
 
 
-;; Floating point and vector move instructions
+;; Floating point move instructions
 
-;; Yes, you guessed it right, the former movsf expander.
-(define_expand "mov<V32:mode>"
-  [(set (match_operand:V32 0 "nonimmediate_operand" "")
-	(match_operand:V32 1 "general_operand" ""))]
-  "<V32:MODE>mode == SFmode || TARGET_VIS"
+(define_expand "movsf"
+  [(set (match_operand:SF 0 "nonimmediate_operand" "")
+	(match_operand:SF 1 "general_operand" ""))]
+  ""
 {
-  if (sparc_expand_move (<V32:MODE>mode, operands))
+  if (sparc_expand_move (SFmode, operands))
     DONE;
 })
 
 (define_insn "*movsf_insn"
-  [(set (match_operand:V32 0 "nonimmediate_operand" "=d,d,f,*r,*r,*r,f,*r,m,m")
-	(match_operand:V32 1 "input_operand"        "GY,ZC,f,*rRY,Q,S,m,m,f,*rGY"))]
+  [(set (match_operand:SF 0 "nonimmediate_operand" "=d, d,f,  *r,*r,*r,f,*r,m,   m")
+	(match_operand:SF 1 "input_operand"        "GY,ZC,f,*rRY, Q, S,m, m,f,*rGY"))]
   "TARGET_FPU
-   && (register_operand (operands[0], <V32:MODE>mode)
-       || register_or_zero_or_all_ones_operand (operands[1], <V32:MODE>mode))"
+   && (register_operand (operands[0], SFmode)
+       || register_or_zero_or_all_ones_operand (operands[1], SFmode))"
 {
   if (GET_CODE (operands[1]) == CONST_DOUBLE
       && (which_alternative == 3
@@ -2067,20 +2054,19 @@ 
   [(set (match_dup 0) (high:SF (match_dup 1)))
    (set (match_dup 0) (lo_sum:SF (match_dup 0) (match_dup 1)))])
 
-;; Yes, you again guessed it right, the former movdf expander.
-(define_expand "mov<V64:mode>"
-  [(set (match_operand:V64 0 "nonimmediate_operand" "")
-	(match_operand:V64 1 "general_operand" ""))]
-  "<V64:MODE>mode == DFmode || TARGET_VIS"
+(define_expand "movdf"
+  [(set (match_operand:DF 0 "nonimmediate_operand" "")
+	(match_operand:DF 1 "general_operand" ""))]
+  ""
 {
-  if (sparc_expand_move (<V64:MODE>mode, operands))
+  if (sparc_expand_move (DFmode, operands))
     DONE;
 })
 
 ;; Be careful, fmovd does not exist when !v9.
 (define_insn "*movdf_insn_sp32"
-  [(set (match_operand:DF 0 "nonimmediate_operand" "=e,W,U,T,o,e,*r,o,e,o")
-	(match_operand:DF 1 "input_operand"    "W#F,e,T,U,G,e,*rFo,*r,o#F,e"))]
+  [(set (match_operand:DF 0 "nonimmediate_operand" "=e,W,U,T,o,e,  *r, o,  e,o")
+	(match_operand:DF 1 "input_operand"       "W#F,e,T,U,G,e,*rFo,*r,o#F,e"))]
   "TARGET_FPU
    && ! TARGET_V9
    && (register_operand (operands[0], DFmode)
@@ -2117,13 +2103,13 @@ 
 
 ;; We have available v9 double floats but not 64-bit integer registers.
 (define_insn "*movdf_insn_sp32_v9"
-  [(set (match_operand:V64 0 "nonimmediate_operand" "=b,b,e,e,T,W,U,T,f,*r,o")
-        (match_operand:V64 1 "input_operand" "GY,ZC,e,W#F,GY,e,T,U,o#F,*roGYDF,*rGYf"))]
+  [(set (match_operand:DF 0 "nonimmediate_operand" "=b, b,e,  e, T,W,U,T,  f,     *r,    o")
+        (match_operand:DF 1 "input_operand"        "GY,ZC,e,W#F,GY,e,T,U,o#F,*roGYDF,*rGYf"))]
   "TARGET_FPU
    && TARGET_V9
    && ! TARGET_ARCH64
-   && (register_operand (operands[0], <V64:MODE>mode)
-       || register_or_zero_or_all_ones_operand (operands[1], <V64:MODE>mode))"
+   && (register_operand (operands[0], DFmode)
+       || register_or_zero_or_all_ones_operand (operands[1], DFmode))"
   "@
   fzero\t%0
   fone\t%0
@@ -2159,12 +2145,12 @@ 
 
 ;; We have available both v9 double floats and 64-bit integer registers.
 (define_insn "*movdf_insn_sp64"
-  [(set (match_operand:V64 0 "nonimmediate_operand" "=b,b,e,e,W,*r,*r,m,*r")
-        (match_operand:V64 1 "input_operand"    "GY,ZC,e,W#F,e,*rGY,m,*rGY,DF"))]
+  [(set (match_operand:DF 0 "nonimmediate_operand" "=b, b,e,  e,W,  *r,*r,   m,*r")
+        (match_operand:DF 1 "input_operand"        "GY,ZC,e,W#F,e,*rGY, m,*rGY,DF"))]
   "TARGET_FPU
    && TARGET_ARCH64
-   && (register_operand (operands[0], <V64:MODE>mode)
-       || register_or_zero_or_all_ones_operand (operands[1], <V64:MODE>mode))"
+   && (register_operand (operands[0], DFmode)
+       || register_or_zero_or_all_ones_operand (operands[1], DFmode))"
   "@
   fzero\t%0
   fone\t%0
@@ -2192,10 +2178,10 @@ 
   stx\t%r1, %0"
   [(set_attr "type" "*,load,store")])
 
-;; This pattern builds V64mode constants in integer registers.
+;; This pattern builds DFmode constants in integer registers.
 (define_split
-  [(set (match_operand:V64 0 "register_operand" "")
-        (match_operand:V64 1 "const_double_or_vector_operand" ""))]
+  [(set (match_operand:DF 0 "register_operand" "")
+        (match_operand:DF 1 "const_double_operand" ""))]
   "TARGET_FPU
    && (GET_CODE (operands[0]) == REG
        && REGNO (operands[0]) < 32)
@@ -2249,8 +2235,8 @@ 
 ;; careful when V9 but not ARCH64 because the integer
 ;; register DFmode cases must be handled.
 (define_split
-  [(set (match_operand:V64 0 "register_operand" "")
-        (match_operand:V64 1 "register_operand" ""))]
+  [(set (match_operand:DF 0 "register_operand" "")
+        (match_operand:DF 1 "register_operand" ""))]
   "(! TARGET_V9
     || (! TARGET_ARCH64
         && ((GET_CODE (operands[0]) == REG
@@ -2265,18 +2251,11 @@ 
   rtx set_src = operands[1];
   rtx dest1, dest2;
   rtx src1, src2;
-  enum machine_mode half_mode;
 
-  /* We can be expanded for DFmode or integral vector modes.  */
-  if (<V64:MODE>mode == DFmode)
-    half_mode = SFmode;
-  else
-    half_mode = SImode;
-  
-  dest1 = gen_highpart (half_mode, set_dest);
-  dest2 = gen_lowpart (half_mode, set_dest);
-  src1 = gen_highpart (half_mode, set_src);
-  src2 = gen_lowpart (half_mode, set_src);
+  dest1 = gen_highpart (SFmode, set_dest);
+  dest2 = gen_lowpart (SFmode, set_dest);
+  src1 = gen_highpart (SFmode, set_src);
+  src2 = gen_lowpart (SFmode, set_src);
 
   /* Now emit using the real source and destination we found, swapping
      the order if we detect overlap.  */
@@ -2294,8 +2273,8 @@ 
 })
 
 (define_split
-  [(set (match_operand:V64 0 "register_operand" "")
-	(match_operand:V64 1 "memory_operand" ""))]
+  [(set (match_operand:DF 0 "register_operand" "")
+	(match_operand:DF 1 "memory_operand" ""))]
   "reload_completed
    && ! TARGET_ARCH64
    && (((REGNO (operands[0]) % 2) != 0)
@@ -2303,34 +2282,27 @@ 
    && offsettable_memref_p (operands[1])"
   [(clobber (const_int 0))]
 {
-  enum machine_mode half_mode;
   rtx word0, word1;
 
-  /* We can be expanded for DFmode or integral vector modes.  */
-  if (<V64:MODE>mode == DFmode)
-    half_mode = SFmode;
-  else
-    half_mode = SImode;
-
-  word0 = adjust_address (operands[1], half_mode, 0);
-  word1 = adjust_address (operands[1], half_mode, 4);
+  word0 = adjust_address (operands[1], SFmode, 0);
+  word1 = adjust_address (operands[1], SFmode, 4);
 
-  if (reg_overlap_mentioned_p (gen_highpart (half_mode, operands[0]), word1))
+  if (reg_overlap_mentioned_p (gen_highpart (SFmode, operands[0]), word1))
     {
-      emit_move_insn_1 (gen_lowpart (half_mode, operands[0]), word1);
-      emit_move_insn_1 (gen_highpart (half_mode, operands[0]), word0);
+      emit_move_insn_1 (gen_lowpart (SFmode, operands[0]), word1);
+      emit_move_insn_1 (gen_highpart (SFmode, operands[0]), word0);
     }
   else
     {
-      emit_move_insn_1 (gen_highpart (half_mode, operands[0]), word0);
-      emit_move_insn_1 (gen_lowpart (half_mode, operands[0]), word1);
+      emit_move_insn_1 (gen_highpart (SFmode, operands[0]), word0);
+      emit_move_insn_1 (gen_lowpart (SFmode, operands[0]), word1);
     }
   DONE;
 })
 
 (define_split
-  [(set (match_operand:V64 0 "memory_operand" "")
-	(match_operand:V64 1 "register_operand" ""))]
+  [(set (match_operand:DF 0 "memory_operand" "")
+	(match_operand:DF 1 "register_operand" ""))]
   "reload_completed
    && ! TARGET_ARCH64
    && (((REGNO (operands[1]) % 2) != 0)
@@ -2338,26 +2310,19 @@ 
    && offsettable_memref_p (operands[0])"
   [(clobber (const_int 0))]
 {
-  enum machine_mode half_mode;
   rtx word0, word1;
 
-  /* We can be expanded for DFmode or integral vector modes.  */
-  if (<V64:MODE>mode == DFmode)
-    half_mode = SFmode;
-  else
-    half_mode = SImode;
-
-  word0 = adjust_address (operands[0], half_mode, 0);
-  word1 = adjust_address (operands[0], half_mode, 4);
+  word0 = adjust_address (operands[0], SFmode, 0);
+  word1 = adjust_address (operands[0], SFmode, 4);
 
-  emit_move_insn_1 (word0, gen_highpart (half_mode, operands[1]));
-  emit_move_insn_1 (word1, gen_lowpart (half_mode, operands[1]));
+  emit_move_insn_1 (word0, gen_highpart (SFmode, operands[1]));
+  emit_move_insn_1 (word1, gen_lowpart (SFmode, operands[1]));
   DONE;
 })
 
 (define_split
-  [(set (match_operand:V64 0 "memory_operand" "")
-        (match_operand:V64 1 "const_zero_operand" ""))]
+  [(set (match_operand:DF 0 "memory_operand" "")
+        (match_operand:DF 1 "const_zero_operand" ""))]
   "reload_completed
    && (! TARGET_V9
        || (! TARGET_ARCH64
@@ -2365,26 +2330,19 @@ 
    && offsettable_memref_p (operands[0])"
   [(clobber (const_int 0))]
 {
-  enum machine_mode half_mode;
   rtx dest1, dest2;
 
-  /* We can be expanded for DFmode or integral vector modes.  */
-  if (<V64:MODE>mode == DFmode)
-    half_mode = SFmode;
-  else
-    half_mode = SImode;
+  dest1 = adjust_address (operands[0], SFmode, 0);
+  dest2 = adjust_address (operands[0], SFmode, 4);
 
-  dest1 = adjust_address (operands[0], half_mode, 0);
-  dest2 = adjust_address (operands[0], half_mode, 4);
-
-  emit_move_insn_1 (dest1, CONST0_RTX (half_mode));
-  emit_move_insn_1 (dest2, CONST0_RTX (half_mode));
+  emit_move_insn_1 (dest1, CONST0_RTX (SFmode));
+  emit_move_insn_1 (dest2, CONST0_RTX (SFmode));
   DONE;
 })
 
 (define_split
-  [(set (match_operand:V64 0 "register_operand" "")
-        (match_operand:V64 1 "const_zero_operand" ""))]
+  [(set (match_operand:DF 0 "register_operand" "")
+        (match_operand:DF 1 "const_zero_operand" ""))]
   "reload_completed
    && ! TARGET_ARCH64
    && ((GET_CODE (operands[0]) == REG
@@ -2394,20 +2352,13 @@ 
 	   && REGNO (SUBREG_REG (operands[0])) < 32))"
   [(clobber (const_int 0))]
 {
-  enum machine_mode half_mode;
   rtx set_dest = operands[0];
   rtx dest1, dest2;
 
-  /* We can be expanded for DFmode or integral vector modes.  */
-  if (<V64:MODE>mode == DFmode)
-    half_mode = SFmode;
-  else
-    half_mode = SImode;
-
-  dest1 = gen_highpart (half_mode, set_dest);
-  dest2 = gen_lowpart (half_mode, set_dest);
-  emit_move_insn_1 (dest1, CONST0_RTX (half_mode));
-  emit_move_insn_1 (dest2, CONST0_RTX (half_mode));
+  dest1 = gen_highpart (SFmode, set_dest);
+  dest2 = gen_lowpart (SFmode, set_dest);
+  emit_move_insn_1 (dest1, CONST0_RTX (SFmode));
+  emit_move_insn_1 (dest2, CONST0_RTX (SFmode));
   DONE;
 })
 
@@ -3751,16 +3702,15 @@ 
    sub\t%1, -%2, %0")
 
 (define_insn "addsi3"
-  [(set (match_operand:SI 0 "register_operand" "=r,r,d")
-	(plus:SI (match_operand:SI 1 "register_operand" "%r,r,d")
-		 (match_operand:SI 2 "arith_add_operand" "rI,O,d")))]
+  [(set (match_operand:SI 0 "register_operand" "=r,r")
+	(plus:SI (match_operand:SI 1 "register_operand" "%r,r")
+		 (match_operand:SI 2 "arith_add_operand" "rI,O")))]
   ""
   "@
    add\t%1, %2, %0
-   sub\t%1, -%2, %0
-   fpadd32s\t%1, %2, %0"
-  [(set_attr "type" "*,*,fga")
-   (set_attr "fptype" "*,*,single")])
+   sub\t%1, -%2, %0"
+  [(set_attr "type" "*,*")
+   (set_attr "fptype" "*,*")])
 
 (define_insn "*cmp_cc_plus"
   [(set (reg:CC_NOOV CC_REG)
@@ -3923,16 +3873,15 @@ 
    add\t%1, -%2, %0")
 
 (define_insn "subsi3"
-  [(set (match_operand:SI 0 "register_operand" "=r,r,d")
-	(minus:SI (match_operand:SI 1 "register_operand" "r,r,d")
-		  (match_operand:SI 2 "arith_add_operand" "rI,O,d")))]
+  [(set (match_operand:SI 0 "register_operand" "=r,r")
+	(minus:SI (match_operand:SI 1 "register_operand" "r,r")
+		  (match_operand:SI 2 "arith_add_operand" "rI,O")))]
   ""
   "@
    sub\t%1, %2, %0
-   add\t%1, -%2, %0
-   fpsub32s\t%1, %2, %0"
-  [(set_attr "type" "*,*,fga")
-   (set_attr "fptype" "*,*,single")])
+   add\t%1, -%2, %0"
+  [(set_attr "type" "*,*")
+   (set_attr "fptype" "*,*")])
 
 (define_insn "*cmp_minus_cc"
   [(set (reg:CC_NOOV CC_REG)
@@ -4657,46 +4606,33 @@ 
 ;; We define DImode `and' so with DImode `not' we can get
 ;; DImode `andn'.  Other combinations are possible.
 
-(define_expand "and<V64I:mode>3"
-  [(set (match_operand:V64I 0 "register_operand" "")
-	(and:V64I (match_operand:V64I 1 "arith_double_operand" "")
-		  (match_operand:V64I 2 "arith_double_operand" "")))]
+(define_expand "anddi3"
+  [(set (match_operand:DI 0 "register_operand" "")
+	(and:DI (match_operand:DI 1 "arith_double_operand" "")
+		(match_operand:DI 2 "arith_double_operand" "")))]
   ""
   "")
 
-(define_insn "*and<V64I:mode>3_sp32"
-  [(set (match_operand:V64I 0 "register_operand" "=r,b")
-	(and:V64I (match_operand:V64I 1 "arith_double_operand" "%r,b")
-		  (match_operand:V64I 2 "arith_double_operand" "rHI,b")))]
+(define_insn "*anddi3_sp32"
+  [(set (match_operand:DI 0 "register_operand" "=r")
+	(and:DI (match_operand:DI 1 "arith_double_operand" "%r")
+		(match_operand:DI 2 "arith_double_operand" "rHI")))]
   "! TARGET_ARCH64"
-  "@
-  #
-  fand\t%1, %2, %0"
-  [(set_attr "type" "*,fga")
-   (set_attr "length" "2,*")
-   (set_attr "fptype" "*,double")])
-
-(define_insn "*and<V64I:mode>3_sp64"
-  [(set (match_operand:V64I 0 "register_operand" "=r,b")
-	(and:V64I (match_operand:V64I 1 "arith_operand" "%r,b")
-		  (match_operand:V64I 2 "arith_operand" "rI,b")))]
+  "#")
+
+(define_insn "*anddi3_sp64"
+  [(set (match_operand:DI 0 "register_operand" "=r")
+	(and:DI (match_operand:DI 1 "arith_operand" "%r")
+		(match_operand:DI 2 "arith_operand" "rI")))]
   "TARGET_ARCH64"
-  "@
-   and\t%1, %2, %0
-   fand\t%1, %2, %0"
-  [(set_attr "type" "*,fga")
-   (set_attr "fptype" "*,double")])
+  "and\t%1, %2, %0")
 
-(define_insn "and<V32I:mode>3"
-  [(set (match_operand:V32I 0 "register_operand" "=r,d")
-	(and:V32I (match_operand:V32I 1 "arith_operand" "%r,d")
-		  (match_operand:V32I 2 "arith_operand" "rI,d")))]
+(define_insn "andsi3"
+  [(set (match_operand:SI 0 "register_operand" "=r")
+	(and:SI (match_operand:SI 1 "arith_operand" "%r")
+		(match_operand:SI 2 "arith_operand" "rI")))]
   ""
-  "@
-   and\t%1, %2, %0
-   fands\t%1, %2, %0"
-  [(set_attr "type" "*,fga")
-   (set_attr "fptype" "*,single")])
+  "and\t%1, %2, %0")
 
 (define_split
   [(set (match_operand:SI 0 "register_operand" "")
@@ -4710,14 +4646,12 @@ 
   operands[4] = GEN_INT (~INTVAL (operands[2]));
 })
 
-(define_insn_and_split "*and_not_<V64I:mode>_sp32"
-  [(set (match_operand:V64I 0 "register_operand" "=r,b")
-	(and:V64I (not:V64I (match_operand:V64I 1 "register_operand" "%r,b"))
-		  (match_operand:V64I 2 "register_operand" "r,b")))]
+(define_insn_and_split "*and_not_di_sp32"
+  [(set (match_operand:DI 0 "register_operand" "=r")
+	(and:DI (not:DI (match_operand:DI 1 "register_operand" "%r"))
+		(match_operand:DI 2 "register_operand" "r")))]
   "! TARGET_ARCH64"
-  "@
-   #
-   fandnot1\t%1, %2, %0"
+  "#"
   "&& reload_completed
    && ((GET_CODE (operands[0]) == REG
         && REGNO (operands[0]) < 32)
@@ -4732,72 +4666,50 @@ 
    operands[6] = gen_lowpart (SImode, operands[0]);
    operands[7] = gen_lowpart (SImode, operands[1]);
    operands[8] = gen_lowpart (SImode, operands[2]);"
-  [(set_attr "type" "*,fga")
-   (set_attr "length" "2,*")
-   (set_attr "fptype" "*,double")])
-
-(define_insn "*and_not_<V64I:mode>_sp64"
-  [(set (match_operand:V64I 0 "register_operand" "=r,b")
-	(and:V64I (not:V64I (match_operand:V64I 1 "register_operand" "%r,b"))
-		  (match_operand:V64I 2 "register_operand" "r,b")))]
+  [(set_attr "length" "2")])
+
+(define_insn "*and_not_di_sp64"
+  [(set (match_operand:DI 0 "register_operand" "=r")
+	(and:DI (not:DI (match_operand:DI 1 "register_operand" "%r"))
+		(match_operand:DI 2 "register_operand" "r")))]
   "TARGET_ARCH64"
-  "@
-   andn\t%2, %1, %0
-   fandnot1\t%1, %2, %0"
-  [(set_attr "type" "*,fga")
-   (set_attr "fptype" "*,double")])
+  "andn\t%2, %1, %0")
 
-(define_insn "*and_not_<V32I:mode>"
-  [(set (match_operand:V32I 0 "register_operand" "=r,d")
-	(and:V32I (not:V32I (match_operand:V32I 1 "register_operand" "%r,d"))
-		  (match_operand:V32I 2 "register_operand" "r,d")))]
+(define_insn "*and_not_si"
+  [(set (match_operand:SI 0 "register_operand" "=r")
+	(and:SI (not:SI (match_operand:SI 1 "register_operand" "%r"))
+		(match_operand:SI 2 "register_operand" "r")))]
   ""
-  "@
-   andn\t%2, %1, %0
-   fandnot1s\t%1, %2, %0"
-  [(set_attr "type" "*,fga")
-   (set_attr "fptype" "*,single")])
+  "andn\t%2, %1, %0")
 
-(define_expand "ior<V64I:mode>3"
-  [(set (match_operand:V64I 0 "register_operand" "")
-	(ior:V64I (match_operand:V64I 1 "arith_double_operand" "")
-		  (match_operand:V64I 2 "arith_double_operand" "")))]
+(define_expand "iordi3"
+  [(set (match_operand:DI 0 "register_operand" "")
+	(ior:DI (match_operand:DI 1 "arith_double_operand" "")
+		(match_operand:DI 2 "arith_double_operand" "")))]
   ""
   "")
 
-(define_insn "*ior<V64I:mode>3_sp32"
-  [(set (match_operand:V64I 0 "register_operand" "=r,b")
-	(ior:V64I (match_operand:V64I 1 "arith_double_operand" "%r,b")
-		  (match_operand:V64I 2 "arith_double_operand" "rHI,b")))]
+(define_insn "*iordi3_sp32"
+  [(set (match_operand:DI 0 "register_operand" "=r")
+	(ior:DI (match_operand:DI 1 "arith_double_operand" "%r")
+		(match_operand:DI 2 "arith_double_operand" "rHI")))]
   "! TARGET_ARCH64"
-  "@
-  #
-  for\t%1, %2, %0"
-  [(set_attr "type" "*,fga")
-   (set_attr "length" "2,*")
-   (set_attr "fptype" "*,double")])
-
-(define_insn "*ior<V64I:mode>3_sp64"
-  [(set (match_operand:V64I 0 "register_operand" "=r,b")
-	(ior:V64I (match_operand:V64I 1 "arith_operand" "%r,b")
-		  (match_operand:V64I 2 "arith_operand" "rI,b")))]
+  "#"
+  [(set_attr "length" "2")])
+
+(define_insn "*iordi3_sp64"
+  [(set (match_operand:DI 0 "register_operand" "=r")
+	(ior:DI (match_operand:DI 1 "arith_operand" "%r")
+		(match_operand:DI 2 "arith_operand" "rI")))]
   "TARGET_ARCH64"
-  "@
-  or\t%1, %2, %0
-  for\t%1, %2, %0"
-  [(set_attr "type" "*,fga")
-   (set_attr "fptype" "*,double")])
+  "or\t%1, %2, %0")
 
-(define_insn "ior<V32I:mode>3"
-  [(set (match_operand:V32I 0 "register_operand" "=r,d")
-	(ior:V32I (match_operand:V32I 1 "arith_operand" "%r,d")
-		  (match_operand:V32I 2 "arith_operand" "rI,d")))]
+(define_insn "iorsi3"
+  [(set (match_operand:SI 0 "register_operand" "=r")
+	(ior:SI (match_operand:SI 1 "arith_operand" "%r")
+		(match_operand:SI 2 "arith_operand" "rI")))]
   ""
-  "@
-   or\t%1, %2, %0
-   fors\t%1, %2, %0"
-  [(set_attr "type" "*,fga")
-   (set_attr "fptype" "*,single")])
+  "or\t%1, %2, %0")
 
 (define_split
   [(set (match_operand:SI 0 "register_operand" "")
@@ -4811,14 +4723,12 @@ 
   operands[4] = GEN_INT (~INTVAL (operands[2]));
 })
 
-(define_insn_and_split "*or_not_<V64I:mode>_sp32"
-  [(set (match_operand:V64I 0 "register_operand" "=r,b")
-	(ior:V64I (not:V64I (match_operand:V64I 1 "register_operand" "r,b"))
-		  (match_operand:V64I 2 "register_operand" "r,b")))]
+(define_insn_and_split "*or_not_di_sp32"
+  [(set (match_operand:DI 0 "register_operand" "=r")
+	(ior:DI (not:DI (match_operand:DI 1 "register_operand" "r"))
+		(match_operand:DI 2 "register_operand" "r")))]
   "! TARGET_ARCH64"
-  "@
-   #
-   fornot1\t%1, %2, %0"
+  "#"
   "&& reload_completed
    && ((GET_CODE (operands[0]) == REG
         && REGNO (operands[0]) < 32)
@@ -4833,72 +4743,50 @@ 
    operands[6] = gen_lowpart (SImode, operands[0]);
    operands[7] = gen_lowpart (SImode, operands[1]);
    operands[8] = gen_lowpart (SImode, operands[2]);"
-  [(set_attr "type" "*,fga")
-   (set_attr "length" "2,*")
-   (set_attr "fptype" "*,double")])
-
-(define_insn "*or_not_<V64I:mode>_sp64"
-  [(set (match_operand:V64I 0 "register_operand" "=r,b")
-	(ior:V64I (not:V64I (match_operand:V64I 1 "register_operand" "r,b"))
-		  (match_operand:V64I 2 "register_operand" "r,b")))]
+  [(set_attr "length" "2")])
+
+(define_insn "*or_not_di_sp64"
+  [(set (match_operand:DI 0 "register_operand" "=r")
+	(ior:DI (not:DI (match_operand:DI 1 "register_operand" "r"))
+		(match_operand:DI 2 "register_operand" "r")))]
   "TARGET_ARCH64"
-  "@
-  orn\t%2, %1, %0
-  fornot1\t%1, %2, %0"
-  [(set_attr "type" "*,fga")
-   (set_attr "fptype" "*,double")])
+  "orn\t%2, %1, %0")
 
-(define_insn "*or_not_<V32I:mode>"
-  [(set (match_operand:V32I 0 "register_operand" "=r,d")
-	(ior:V32I (not:V32I (match_operand:V32I 1 "register_operand" "r,d"))
-		  (match_operand:V32I 2 "register_operand" "r,d")))]
+(define_insn "*or_not_si"
+  [(set (match_operand:SI 0 "register_operand" "=r")
+	(ior:SI (not:SI (match_operand:SI 1 "register_operand" "r"))
+		(match_operand:SI 2 "register_operand" "r")))]
   ""
-  "@
-   orn\t%2, %1, %0
-   fornot1s\t%1, %2, %0"
-  [(set_attr "type" "*,fga")
-   (set_attr "fptype" "*,single")])
+  "orn\t%2, %1, %0")
 
-(define_expand "xor<V64I:mode>3"
-  [(set (match_operand:V64I 0 "register_operand" "")
-	(xor:V64I (match_operand:V64I 1 "arith_double_operand" "")
-		  (match_operand:V64I 2 "arith_double_operand" "")))]
+(define_expand "xordi3"
+  [(set (match_operand:DI 0 "register_operand" "")
+	(xor:DI (match_operand:DI 1 "arith_double_operand" "")
+		(match_operand:DI 2 "arith_double_operand" "")))]
   ""
   "")
 
-(define_insn "*xor<V64I:mode>3_sp32"
-  [(set (match_operand:V64I 0 "register_operand" "=r,b")
-	(xor:V64I (match_operand:V64I 1 "arith_double_operand" "%r,b")
-		  (match_operand:V64I 2 "arith_double_operand" "rHI,b")))]
+(define_insn "*xordi3_sp32"
+  [(set (match_operand:DI 0 "register_operand" "=r")
+	(xor:DI (match_operand:DI 1 "arith_double_operand" "%r")
+		(match_operand:DI 2 "arith_double_operand" "rHI")))]
   "! TARGET_ARCH64"
-  "@
-  #
-  fxor\t%1, %2, %0"
-  [(set_attr "type" "*,fga")
-   (set_attr "length" "2,*")
-   (set_attr "fptype" "*,double")])
-
-(define_insn "*xor<V64I:mode>3_sp64"
-  [(set (match_operand:V64I 0 "register_operand" "=r,b")
-	(xor:V64I (match_operand:V64I 1 "arith_operand" "%rJ,b")
-		  (match_operand:V64I 2 "arith_operand" "rI,b")))]
+  "#"
+  [(set_attr "length" "2")])
+
+(define_insn "*xordi3_sp64"
+  [(set (match_operand:DI 0 "register_operand" "=r")
+	(xor:DI (match_operand:DI 1 "arith_operand" "%rJ")
+		(match_operand:DI 2 "arith_operand" "rI")))]
   "TARGET_ARCH64"
-  "@
-  xor\t%r1, %2, %0
-  fxor\t%1, %2, %0"
-  [(set_attr "type" "*,fga")
-   (set_attr "fptype" "*,double")])
+  "xor\t%r1, %2, %0")
 
-(define_insn "xor<V32I:mode>3"
-  [(set (match_operand:V32I 0 "register_operand" "=r,d")
-	(xor:V32I (match_operand:V32I 1 "arith_operand" "%rJ,d")
-		  (match_operand:V32I 2 "arith_operand" "rI,d")))]
+(define_insn "xorsi3"
+  [(set (match_operand:SI 0 "register_operand" "=r")
+	(xor:SI (match_operand:SI 1 "arith_operand" "%rJ")
+		  (match_operand:SI 2 "arith_operand" "rI")))]
   ""
-  "@
-   xor\t%r1, %2, %0
-   fxors\t%1, %2, %0"
-  [(set_attr "type" "*,fga")
-   (set_attr "fptype" "*,single")])
+  "xor\t%r1, %2, %0")
 
 (define_split
   [(set (match_operand:SI 0 "register_operand" "")
@@ -4926,10 +4814,10 @@ 
 
 ;; Split DImode logical operations requiring two instructions.
 (define_split
-  [(set (match_operand:V64I 0 "register_operand" "")
-	(match_operator:V64I 1 "cc_arith_operator"	; AND, IOR, XOR
-			   [(match_operand:V64I 2 "register_operand" "")
-			    (match_operand:V64I 3 "arith_double_operand" "")]))]
+  [(set (match_operand:DI 0 "register_operand" "")
+	(match_operator:DI 1 "cc_arith_operator"	; AND, IOR, XOR
+			   [(match_operand:DI 2 "register_operand" "")
+			    (match_operand:DI 3 "arith_double_operand" "")]))]
   "! TARGET_ARCH64
    && reload_completed
    && ((GET_CODE (operands[0]) == REG
@@ -4945,7 +4833,7 @@ 
   operands[6] = gen_highpart (SImode, operands[2]);
   operands[7] = gen_lowpart (SImode, operands[2]);
 #if HOST_BITS_PER_WIDE_INT == 32
-  if (GET_CODE (operands[3]) == CONST_INT && <V64I:MODE>mode == DImode)
+  if (GET_CODE (operands[3]) == CONST_INT)
     {
       if (INTVAL (operands[3]) < 0)
 	operands[8] = constm1_rtx;
@@ -4954,20 +4842,18 @@ 
     }
   else
 #endif
-    operands[8] = gen_highpart_mode (SImode, <V64I:MODE>mode, operands[3]);
+    operands[8] = gen_highpart_mode (SImode, DImode, operands[3]);
   operands[9] = gen_lowpart (SImode, operands[3]);
 })
 
 ;; xnor patterns.  Note that (a ^ ~b) == (~a ^ b) == ~(a ^ b).
 ;; Combine now canonicalizes to the rightmost expression.
-(define_insn_and_split "*xor_not_<V64I:mode>_sp32"
-  [(set (match_operand:V64I 0 "register_operand" "=r,b")
-	(not:V64I (xor:V64I (match_operand:V64I 1 "register_operand" "r,b")
-			    (match_operand:V64I 2 "register_operand" "r,b"))))]
+(define_insn_and_split "*xor_not_di_sp32"
+  [(set (match_operand:DI 0 "register_operand" "=r")
+	(not:DI (xor:DI (match_operand:DI 1 "register_operand" "r")
+			(match_operand:DI 2 "register_operand" "r"))))]
   "! TARGET_ARCH64"
-  "@
-   #
-   fxnor\t%1, %2, %0"
+  "#"
   "&& reload_completed
    && ((GET_CODE (operands[0]) == REG
         && REGNO (operands[0]) < 32)
@@ -4982,31 +4868,21 @@ 
    operands[6] = gen_lowpart (SImode, operands[0]);
    operands[7] = gen_lowpart (SImode, operands[1]);
    operands[8] = gen_lowpart (SImode, operands[2]);"
-  [(set_attr "type" "*,fga")
-   (set_attr "length" "2,*")
-   (set_attr "fptype" "*,double")])
-
-(define_insn "*xor_not_<V64I:mode>_sp64"
-  [(set (match_operand:V64I 0 "register_operand" "=r,b")
-	(not:V64I (xor:V64I (match_operand:V64I 1 "register_or_zero_operand" "rJ,b")
-			    (match_operand:V64I 2 "arith_operand" "rI,b"))))]
+  [(set_attr "length" "2")])
+
+(define_insn "*xor_not_di_sp64"
+  [(set (match_operand:DI 0 "register_operand" "=r")
+	(not:DI (xor:DI (match_operand:DI 1 "register_or_zero_operand" "rJ")
+			(match_operand:DI 2 "arith_operand" "rI"))))]
   "TARGET_ARCH64"
-  "@
-  xnor\t%r1, %2, %0
-  fxnor\t%1, %2, %0"
-  [(set_attr "type" "*,fga")
-   (set_attr "fptype" "*,double")])
+  "xnor\t%r1, %2, %0")
 
-(define_insn "*xor_not_<V32I:mode>"
-  [(set (match_operand:V32I 0 "register_operand" "=r,d")
-	(not:V32I (xor:V32I (match_operand:V32I 1 "register_or_zero_operand" "rJ,d")
-			    (match_operand:V32I 2 "arith_operand" "rI,d"))))]
+(define_insn "*xor_not_si"
+  [(set (match_operand:SI 0 "register_operand" "=r")
+	(not:SI (xor:SI (match_operand:SI 1 "register_or_zero_operand" "rJ")
+			(match_operand:SI 2 "arith_operand" "rI"))))]
   ""
-  "@
-   xnor\t%r1, %2, %0
-   fxnors\t%1, %2, %0"
-  [(set_attr "type" "*,fga")
-   (set_attr "fptype" "*,single")])
+  "xnor\t%r1, %2, %0")
 
 ;; These correspond to the above in the case where we also (or only)
 ;; want to set the condition code.  
@@ -5244,19 +5120,17 @@ 
 
 ;; We cannot use the "not" pseudo insn because the Sun assembler
 ;; does not know how to make it work for constants.
-(define_expand "one_cmpl<V64I:mode>2"
-  [(set (match_operand:V64I 0 "register_operand" "")
-	(not:V64I (match_operand:V64I 1 "register_operand" "")))]
+(define_expand "one_cmpldi2"
+  [(set (match_operand:DI 0 "register_operand" "")
+	(not:DI (match_operand:DI 1 "register_operand" "")))]
   ""
   "")
 
-(define_insn_and_split "*one_cmpl<V64I:mode>2_sp32"
-  [(set (match_operand:V64I 0 "register_operand" "=r,b")
-	(not:V64I (match_operand:V64I 1 "register_operand" "r,b")))]
+(define_insn_and_split "*one_cmpldi2_sp32"
+  [(set (match_operand:DI 0 "register_operand" "=r")
+	(not:DI (match_operand:DI 1 "register_operand" "r")))]
   "! TARGET_ARCH64"
-  "@
-   #
-   fnot1\t%1, %0"
+  "#"
   "&& reload_completed
    && ((GET_CODE (operands[0]) == REG
         && REGNO (operands[0]) < 32)
@@ -5269,29 +5143,19 @@ 
    operands[3] = gen_highpart (SImode, operands[1]);
    operands[4] = gen_lowpart (SImode, operands[0]);
    operands[5] = gen_lowpart (SImode, operands[1]);"
-  [(set_attr "type" "*,fga")
-   (set_attr "length" "2,*")
-   (set_attr "fptype" "*,double")])
+  [(set_attr "length" "2")])
 
-(define_insn "*one_cmpl<V64I:mode>2_sp64"
-  [(set (match_operand:V64I 0 "register_operand" "=r,b")
-	(not:V64I (match_operand:V64I 1 "arith_operand" "rI,b")))]
+(define_insn "*one_cmpldi2_sp64"
+  [(set (match_operand:DI 0 "register_operand" "=r")
+	(not:DI (match_operand:DI 1 "arith_operand" "rI")))]
   "TARGET_ARCH64"
-  "@
-   xnor\t%%g0, %1, %0
-   fnot1\t%1, %0"
-  [(set_attr "type" "*,fga")
-   (set_attr "fptype" "*,double")])
+  "xnor\t%%g0, %1, %0")
 
-(define_insn "one_cmpl<V32I:mode>2"
-  [(set (match_operand:V32I 0 "register_operand" "=r,d")
-	(not:V32I (match_operand:V32I 1 "arith_operand" "rI,d")))]
+(define_insn "one_cmplsi2"
+  [(set (match_operand:SI 0 "register_operand" "=r")
+	(not:SI (match_operand:SI 1 "arith_operand" "rI")))]
   ""
-  "@
-  xnor\t%%g0, %1, %0
-  fnot1s\t%1, %0"
-  [(set_attr "type" "*,fga")
-   (set_attr "fptype" "*,single")])
+  "xnor\t%%g0, %1, %0")
 
 (define_insn "*cmp_cc_not"
   [(set (reg:CC CC_REG)
@@ -7883,59 +7747,193 @@ 
 
 ;; Vector instructions.
 
+(define_mode_iterator VM32 [V1SI V2HI V4QI])
+(define_mode_iterator VM64 [V1DI V2SI V4HI V8QI])
+(define_mode_iterator VMALL [V1SI V2HI V4QI V1DI V2SI V4HI V8QI])
+
+(define_mode_attr vbits [(V2SI "32") (V4HI "16") (V1SI "32s") (V2HI "16s")])
+(define_mode_attr vconstr [(V1SI "f") (V2HI "f") (V4QI "f")
+			   (V1DI "e") (V2SI "e") (V4HI "e") (V8QI "e")])
+(define_mode_attr vfptype [(V1SI "single") (V2HI "single") (V4QI "single")
+			   (V1DI "double") (V2SI "double") (V4HI "double")
+			   (V8QI "double")])
+
+(define_expand "mov<VMALL:mode>"
+  [(set (match_operand:VMALL 0 "nonimmediate_operand" "")
+	(match_operand:VMALL 1 "general_operand" ""))]
+  "TARGET_VIS"
+{
+  if (sparc_expand_move (<VMALL:MODE>mode, operands))
+    DONE;
+})
+
+(define_insn "*mov<VM32:mode>_insn"
+  [(set (match_operand:VM32 0 "nonimmediate_operand" "=f, f,f,f,m, m,r,m, r, r")
+	(match_operand:VM32 1 "input_operand"        "GY,ZC,f,m,f,GY,m,r,GY,ZC"))]
+  "TARGET_VIS
+   && (register_operand (operands[0], <VM32:MODE>mode)
+       || register_or_zero_or_all_ones_operand (operands[1], <VM32:MODE>mode))"
+  "@
+  fzeros\t%0
+  fones\t%0
+  fsrc1s\t%1, %0
+  ld\t%1, %0
+  st\t%1, %0
+  st\t%r1, %0
+  ld\t%1, %0
+  st\t%1, %0
+  mov\t0, %0
+  mov\t-1, %0"
+  [(set_attr "type" "fga,fga,fga,fpload,fpstore,store,load,store,*,*")])
+
+(define_insn "*mov<VM64:mode>_insn_sp64"
+  [(set (match_operand:VM64 0 "nonimmediate_operand" "=e, e,e,e,m, m,r,m, r, r")
+	(match_operand:VM64 1 "input_operand"        "GY,ZC,e,m,e,GY,m,r,GY,ZC"))]
+  "TARGET_VIS
+   && TARGET_ARCH64
+   && (register_operand (operands[0], <VM64:MODE>mode)
+       || register_or_zero_or_all_ones_operand (operands[1], <VM64:MODE>mode))"
+  "@
+  fzero\t%0
+  fone\t%0
+  fsrc1\t%1, %0
+  ldd\t%1, %0
+  std\t%1, %0
+  stx\t%r1, %0
+  ldx\t%1, %0
+  stx\t%1, %0
+  mov\t0, %0
+  mov\t-1, %0"
+  [(set_attr "type" "fga,fga,fga,fpload,fpstore,store,load,store,*,*")])
+
+(define_insn "*mov<VM64:mode>_insn_sp32"
+  [(set (match_operand:VM64 0 "nonimmediate_operand" "=e, e,e,e,m, m,U,T,o, r, r")
+	(match_operand:VM64 1 "input_operand"        "GY,ZC,e,m,e,GY,T,U,r,GY,ZC"))]
+  "TARGET_VIS
+   && ! TARGET_ARCH64
+   && (register_operand (operands[0], <VM64:MODE>mode)
+       || register_or_zero_or_all_ones_operand (operands[1], <VM64:MODE>mode))"
+  "@
+  fzero\t%0
+  fone\t%0
+  fsrc1\t%1, %0
+  ldd\t%1, %0
+  std\t%1, %0
+  stx\t%r1, %0
+  ldd\t%1, %0
+  std\t%1, %0
+  #
+  mov 0, %L0; mov 0, %H0
+  mov -1, %L0; mov -1, %H0"
+  [(set_attr "type" "fga,fga,fga,fpload,fpstore,store,load,store,*,*,*")
+   (set_attr "length" "*,*,*,*,*,*,*,*,2,2,2")])
+
+(define_split
+  [(set (match_operand:VM64 0 "memory_operand" "")
+        (match_operand:VM64 1 "register_operand" ""))]
+  "reload_completed
+   && TARGET_VIS
+   && ! TARGET_ARCH64
+   && (((REGNO (operands[1]) % 2) != 0)
+       || ! mem_min_alignment (operands[0], 8))
+   && offsettable_memref_p (operands[0])"
+  [(clobber (const_int 0))]
+{
+  rtx word0, word1;
+
+  word0 = adjust_address (operands[0], SImode, 0);
+  word1 = adjust_address (operands[0], SImode, 4);
+
+  emit_move_insn_1 (word0, gen_highpart (SImode, operands[1]));
+  emit_move_insn_1 (word1, gen_lowpart (SImode, operands[1]));
+  DONE;
+})
+
+(define_expand "vec_init<mode>"
+  [(match_operand:VMALL 0 "register_operand" "")
+   (match_operand:VMALL 1 "" "")]
+  "TARGET_VIS"
+{
+  sparc_expand_vector_init (operands[0], operands[1]);
+  DONE;
+})
+
 (define_code_iterator plusminus [plus minus])
 (define_code_attr plusminus_insn [(plus "add") (minus "sub")])
 
-;; fp{add,sub}32s are emitted by the {add,sub}si3 patterns.
-(define_insn "<plusminus_insn>v2si3"
-  [(set (match_operand:V2SI 0 "register_operand" "=e")
-	(plusminus:V2SI (match_operand:V2SI 1 "register_operand" "e")
-			(match_operand:V2SI 2 "register_operand" "e")))]
+(define_mode_iterator VADDSUB [V1SI V2SI V2HI V4HI])
+
+(define_insn "<plusminus_insn><mode>3"
+  [(set (match_operand:VADDSUB 0 "register_operand" "=<vconstr>")
+	(plusminus:VADDSUB (match_operand:VADDSUB 1 "register_operand" "<vconstr>")
+			   (match_operand:VADDSUB 2 "register_operand" "<vconstr>")))]
   "TARGET_VIS"
-  "fp<plusminus_insn>32\t%1, %2, %0"
+  "fp<plusminus_insn><vbits>\t%1, %2, %0"
   [(set_attr "type" "fga")
-   (set_attr "fptype" "double")])
+   (set_attr "fptype" "<vfptype>")])
+
+(define_mode_iterator VL [V1SI V2HI V4QI V1DI V2SI V4HI V8QI])
+(define_mode_attr vlsuf [(V1SI "s") (V2HI "s") (V4QI "s")
+			 (V1DI  "") (V2SI  "") (V4HI  "") (V8QI "")])
+(define_code_iterator vlop [ior and xor])
+(define_code_attr vlinsn [(ior "or") (and "and") (xor "xor")])
+(define_code_attr vlninsn [(ior "nor") (and "nand") (xor "xnor")])
+
+(define_insn "<code><mode>3"
+  [(set (match_operand:VL 0 "register_operand" "=<vconstr>")
+	(vlop:VL (match_operand:VL 1 "register_operand" "<vconstr>")
+		 (match_operand:VL 2 "register_operand" "<vconstr>")))]
+  "TARGET_VIS"
+  "f<vlinsn><vlsuf>\t%1, %2, %0"
+  [(set_attr "type" "fga")
+   (set_attr "fptype" "<vfptype>")])
 
-(define_insn "<plusminus_insn>v4hi3"
-  [(set (match_operand:V4HI 0 "register_operand" "=e")
-	(plusminus:V4HI (match_operand:V4HI 1 "register_operand" "e")
-			(match_operand:V4HI 2 "register_operand" "e")))]
+(define_insn "*not_<code><mode>3"
+  [(set (match_operand:VL 0 "register_operand" "=<vconstr>")
+        (not:VL (vlop:VL (match_operand:VL 1 "register_operand" "<vconstr>")
+			 (match_operand:VL 2 "register_operand" "<vconstr>"))))]
   "TARGET_VIS"
-  "fp<plusminus_insn>16\t%1, %2, %0"
+  "f<vlninsn><vlsuf>\t%1, %2, %0"
   [(set_attr "type" "fga")
-   (set_attr "fptype" "double")])
+   (set_attr "fptype" "<vfptype>")])
 
-(define_insn "<plusminus_insn>v2hi3"
-  [(set (match_operand:V2HI 0 "register_operand" "=f")
-	(plusminus:V2HI (match_operand:V2HI 1 "register_operand" "f")
-			(match_operand:V2HI 2 "register_operand" "f")))]
+;; (ior (not (op1)) (not (op2))) is the canonical form of NAND.
+(define_insn "*nand<mode>_vis"
+  [(set (match_operand:VL 0 "register_operand" "=<vconstr>")
+	(ior:VL (not:VL (match_operand:VL 1 "register_operand" "<vconstr>"))
+		(not:VL (match_operand:VL 2 "register_operand" "<vconstr>"))))]
   "TARGET_VIS"
-  "fp<plusminus_insn>16s\t%1, %2, %0"
+  "fnand<vlsuf>\t%1, %2, %0"
   [(set_attr "type" "fga")
-   (set_attr "fptype" "single")])
+   (set_attr "fptype" "<vfptype>")])
 
-;; All other logical instructions have integer equivalents so they
-;; are defined together.
+(define_code_iterator vlnotop [ior and])
 
-;; (ior (not (op1)) (not (op2))) is the canonical form of NAND.
+(define_insn "*<code>_not1<mode>_vis"
+  [(set (match_operand:VL 0 "register_operand" "=<vconstr>")
+	(vlnotop:VL (not:VL (match_operand:VL 1 "register_operand" "<vconstr>"))
+		    (match_operand:VL 2 "register_operand" "<vconstr>")))]
+  "TARGET_VIS"
+  "f<vlinsn>not1<vlsuf>\t%1, %2, %0"
+  [(set_attr "type" "fga")
+   (set_attr "fptype" "<vfptype>")])
 
-(define_insn "*nand<V64:mode>_vis"
-  [(set (match_operand:V64 0 "register_operand" "=e")
-	(ior:V64 (not:V64 (match_operand:V64 1 "register_operand" "e"))
-		 (not:V64 (match_operand:V64 2 "register_operand" "e"))))]
+(define_insn "*<code>_not2<mode>_vis"
+  [(set (match_operand:VL 0 "register_operand" "=<vconstr>")
+	(vlnotop:VL (match_operand:VL 1 "register_operand" "<vconstr>")
+		    (not:VL (match_operand:VL 2 "register_operand" "<vconstr>"))))]
   "TARGET_VIS"
-  "fnand\t%1, %2, %0"
+  "f<vlinsn>not2<vlsuf>\t%1, %2, %0"
   [(set_attr "type" "fga")
-   (set_attr "fptype" "double")])
+   (set_attr "fptype" "<vfptype>")])
 
-(define_insn "*nand<V32:mode>_vis"
-  [(set (match_operand:V32 0 "register_operand" "=f")
-	 (ior:V32 (not:V32 (match_operand:V32 1 "register_operand" "f"))
-		  (not:V32 (match_operand:V32 2 "register_operand" "f"))))]
+(define_insn "one_cmpl<mode>2"
+  [(set (match_operand:VL 0 "register_operand" "=<vconstr>")
+	(not:VL (match_operand:VL 1 "register_operand" "<vconstr>")))]
   "TARGET_VIS"
-  "fnands\t%1, %2, %0"
+  "fnot1<vlsuf>\t%1, %0"
   [(set_attr "type" "fga")
-   (set_attr "fptype" "single")])
+   (set_attr "fptype" "<vfptype>")])
 
 ;; Hard to generate VIS instructions.  We have builtins for these.
 
@@ -8152,10 +8150,10 @@ 
 ;; Using faligndata only makes sense after an alignaddr since the choice of
 ;; bytes to take out of each operand is dependent on the results of the last
 ;; alignaddr.
-(define_insn "faligndata<V64I:mode>_vis"
-  [(set (match_operand:V64I 0 "register_operand" "=e")
-        (unspec:V64I [(match_operand:V64I 1 "register_operand" "e")
-                      (match_operand:V64I 2 "register_operand" "e")
+(define_insn "faligndata<VM64:mode>_vis"
+  [(set (match_operand:VM64 0 "register_operand" "=e")
+        (unspec:VM64 [(match_operand:VM64 1 "register_operand" "e")
+                      (match_operand:VM64 2 "register_operand" "e")
                       (reg:DI GSR_REG)]
          UNSPEC_ALIGNDATA))]
   "TARGET_VIS"
@@ -8341,10 +8339,10 @@ 
   "bmask\t%r1, %r2, %0"
   [(set_attr "type" "array")])
 
-(define_insn "bshuffle<V64I:mode>_vis"
-  [(set (match_operand:V64I 0 "register_operand" "=e")
-        (unspec:V64I [(match_operand:V64I 1 "register_operand" "e")
-	              (match_operand:V64I 2 "register_operand" "e")
+(define_insn "bshuffle<VM64:mode>_vis"
+  [(set (match_operand:VM64 0 "register_operand" "=e")
+        (unspec:VM64 [(match_operand:VM64 1 "register_operand" "e")
+	              (match_operand:VM64 2 "register_operand" "e")
 		      (reg:DI GSR_REG)]
                      UNSPEC_BSHUFFLE))]
   "TARGET_VIS2"
@@ -8447,9 +8445,9 @@ 
   [(ashift "ashl") (ss_ashift "ssashl") (lshiftrt "lshr") (ashiftrt "ashr")])
    
 (define_insn "v<vis3_shift_patname><mode>3"
-  [(set (match_operand:V64N8 0 "register_operand" "=<vconstr>")
-        (vis3_shift:V64N8 (match_operand:V64N8 1 "register_operand" "<vconstr>")
-                          (match_operand:V64N8 2 "register_operand" "<vconstr>")))]
+  [(set (match_operand:GCM 0 "register_operand" "=<vconstr>")
+	(vis3_shift:GCM (match_operand:GCM 1 "register_operand" "<vconstr>")
+			(match_operand:GCM 2 "register_operand" "<vconstr>")))]
   "TARGET_VIS3"
   "<vis3_shift_insn><vbits>\t%1, %2, %0")
 
@@ -8478,13 +8476,13 @@ 
   "fmean16\t%1, %2, %0")
 
 (define_insn "fp<plusminus_insn>64_vis"
-  [(set (match_operand:DI 0 "register_operand" "=e")
-	(plusminus:DI (match_operand:DI 1 "register_operand" "e")
-		      (match_operand:DI 2 "register_operand" "e")))]
+  [(set (match_operand:V1DI 0 "register_operand" "=e")
+	(plusminus:V1DI (match_operand:V1DI 1 "register_operand" "e")
+			(match_operand:V1DI 2 "register_operand" "e")))]
   "TARGET_VIS3"
   "fp<plusminus_insn>64\t%1, %2, %0")
 
-(define_mode_iterator VASS [V4HI V2SI V2HI SI])
+(define_mode_iterator VASS [V4HI V2SI V2HI V1SI])
 (define_code_iterator vis3_addsub_ss [ss_plus ss_minus])
 (define_code_attr vis3_addsub_ss_insn
   [(ss_plus "fpadds") (ss_minus "fpsubs")])
diff --git a/gcc/testsuite/ChangeLog b/gcc/testsuite/ChangeLog
index c89e644..ff29dcd 100644
--- a/gcc/testsuite/ChangeLog
+++ b/gcc/testsuite/ChangeLog
@@ -1,3 +1,14 @@ 
+2011-10-17  David S. Miller  <davem@davemloft.net>
+
+	* gcc.target/sparc/fand.c: Remove __LP64__ ifdefs and expect
+	all operations to emit VIS instructions.
+	* gcc.target/sparc/fandnot.c: Likewise.
+	* gcc.target/sparc/fnot.c: Likewise.
+	* gcc.target/sparc/for.c: Likewise.
+	* gcc.target/sparc/fornot.c: Likewise.
+	* gcc.target/sparc/fxnor.c: Likewise.
+	* gcc.target/sparc/fxor.c: Likewise.
+
 2011-10-15  Oleg Endo  <oleg.endo@t-online.de>
 
 	PR target/49263
diff --git a/gcc/testsuite/gcc.target/sparc/fand.c b/gcc/testsuite/gcc.target/sparc/fand.c
index 3194c92..b0589bd 100644
--- a/gcc/testsuite/gcc.target/sparc/fand.c
+++ b/gcc/testsuite/gcc.target/sparc/fand.c
@@ -12,13 +12,10 @@  vec8 fun8(void)
   return foo1_8 () & foo2_8 ();
 }
 
-#ifndef __LP64__
-/* Test the 32-bit splitter. */
 vec8 fun8_2(vec8 a, vec8 b)
 {
   return a & b;
 }
-#endif
 
 extern vec16 foo1_16(void);
 extern vec16 foo2_16(void);
@@ -28,13 +25,10 @@  vec16 fun16(void)
   return foo1_16 () & foo2_16 ();
 }
 
-#ifndef __LP64__
-/* Test the 32-bit splitter. */
 vec16 fun16_2(vec16 a, vec16 b)
 {
   return a & b;
 }
-#endif
 
 extern vec32 foo1_32(void);
 extern vec32 foo2_32(void);
@@ -44,12 +38,9 @@  vec32 fun32(void)
   return foo1_32 () & foo2_32 ();
 }
 
-#ifndef __LP64__
-/* Test the 32-bit splitter. */
 vec32 fun32_2(vec32 a, vec32 b)
 {
   return a & b;
 }
-#endif
 
-/* { dg-final { scan-assembler-times "fand\t%" 3 } } */
+/* { dg-final { scan-assembler-times "fand\t%" 6 } } */
diff --git a/gcc/testsuite/gcc.target/sparc/fandnot.c b/gcc/testsuite/gcc.target/sparc/fandnot.c
index 41db849..0054863 100644
--- a/gcc/testsuite/gcc.target/sparc/fandnot.c
+++ b/gcc/testsuite/gcc.target/sparc/fandnot.c
@@ -12,13 +12,10 @@  vec8 fun8(void)
   return ~foo1_8 () & foo2_8 ();
 }
 
-#ifndef __LP64__
-/* Test the 32-bit splitter. */
 vec8 fun8_2(vec8 a, vec8 b)
 {
   return ~a & b;
 }
-#endif
 
 extern vec16 foo1_16(void);
 extern vec16 foo2_16(void);
@@ -28,13 +25,10 @@  vec16 fun16(void)
   return ~foo1_16 () & foo2_16 ();
 }
 
-#ifndef __LP64__
-/* Test the 32-bit splitter. */
 vec16 fun16_2(vec16 a, vec16 b)
 {
   return ~a & b;
 }
-#endif
 
 extern vec32 foo1_32(void);
 extern vec32 foo2_32(void);
@@ -44,13 +38,10 @@  vec32 fun32(void)
   return ~foo1_32 () & foo2_32 ();
 }
 
-#ifndef __LP64__
-/* Test the 32-bit splitter. */
 vec32 fun32_2(vec32 a, vec32 b)
 {
   return ~a & b;
 }
-#endif
 
 
 /* This should be transformed into ~b & a.  */
@@ -59,38 +50,29 @@  vec8 fun8b(void)
   return foo1_8 () & ~foo2_8 ();
 }
 
-#ifndef __LP64__
-/* Test the 32-bit splitter. */
 vec8 fun8_2b(vec8 a, vec8 b)
 {
   return a & ~b;
 }
-#endif
 
 vec16 fun16b(void)
 {
   return foo1_16 () & ~foo2_16 ();
 }
 
-#ifndef __LP64__
-/* Test the 32-bit splitter. */
 vec16 fun16_2b(vec16 a, vec16 b)
 {
   return a & ~b;
 }
-#endif
 
 vec32 fun32b(void)
 {
   return foo1_32 () & ~foo2_32 ();
 }
 
-#ifndef __LP64__
-/* Test the 32-bit splitter. */
 vec32 fun32_2b(vec32 a, vec32 b)
 {
   return a & ~b;
 }
-#endif
 
-/* { dg-final { scan-assembler-times "fandnot1\t%" 6 } } */
+/* { dg-final { scan-assembler-times "fandnot1\t%" 12 } } */
diff --git a/gcc/testsuite/gcc.target/sparc/fnot.c b/gcc/testsuite/gcc.target/sparc/fnot.c
index dceee52..c0ddc93 100644
--- a/gcc/testsuite/gcc.target/sparc/fnot.c
+++ b/gcc/testsuite/gcc.target/sparc/fnot.c
@@ -12,13 +12,10 @@  vec8 fun8(void)
   return ~foo1_8 ();
 }
 
-#ifndef __LP64__
-/* Test the 32-bit splitter. */
 vec8 fun8_2(vec8 a)
 {
   foo2_8 (~a);
 }
-#endif
 
 extern vec16 foo1_16(void);
 extern void foo2_16(vec16);
@@ -29,13 +26,10 @@  vec16 fun16(void)
   return ~foo1_16 ();
 }
 
-#ifndef __LP64__
-/* Test the 32-bit splitter. */
 vec16 fun16_2(vec16 a)
 {
   foo2_16 (~a);
 }
-#endif
 
 extern vec32 foo1_32(void);
 extern void foo2_32(vec32);
@@ -45,12 +39,9 @@  vec32 fun32(void)
   return ~foo1_32 ();
 }
 
-#ifndef __LP64__
-/* Test the 32-bit splitter. */
 vec32 fun32_2(vec32 a)
 {
   foo2_32 (~a);
 }
-#endif
 
-/* { dg-final { scan-assembler-times "fnot1\t%" 3 } } */
+/* { dg-final { scan-assembler-times "fnot1\t%" 6 } } */
diff --git a/gcc/testsuite/gcc.target/sparc/for.c b/gcc/testsuite/gcc.target/sparc/for.c
index 7348dce..3da4bc2 100644
--- a/gcc/testsuite/gcc.target/sparc/for.c
+++ b/gcc/testsuite/gcc.target/sparc/for.c
@@ -12,13 +12,10 @@  vec8 fun8(void)
   return foo1_8 () | foo2_8 ();
 }
 
-#ifndef __LP64__
-/* Test the 32-bit splitter. */
 vec8 fun8_2(vec8 a, vec8 b)
 {
   return a | b;
 }
-#endif
 
 extern vec16 foo1_16(void);
 extern vec16 foo2_16(void);
@@ -28,13 +25,10 @@  vec16 fun16(void)
   return foo1_16 () | foo2_16 ();
 }
 
-#ifndef __LP64__
-/* Test the 32-bit splitter. */
 vec16 fun16_2(vec16 a, vec16 b)
 {
   return a | b;
 }
-#endif
 
 extern vec32 foo1_32(void);
 extern vec32 foo2_32(void);
@@ -44,12 +38,9 @@  vec32 fun32(void)
   return foo1_32 () | foo2_32 ();
 }
 
-#ifndef __LP64__
-/* Test the 32-bit splitter. */
 vec32 fun32_2(vec32 a, vec32 b)
 {
   return a | b;
 }
-#endif
 
-/* { dg-final { scan-assembler-times "for\t%" 3 } } */
+/* { dg-final { scan-assembler-times "for\t%" 6 } } */
diff --git a/gcc/testsuite/gcc.target/sparc/fornot.c b/gcc/testsuite/gcc.target/sparc/fornot.c
index 09fdb4f..2daa96e 100644
--- a/gcc/testsuite/gcc.target/sparc/fornot.c
+++ b/gcc/testsuite/gcc.target/sparc/fornot.c
@@ -12,13 +12,10 @@  vec8 fun8(void)
   return ~foo1_8 () | foo2_8 ();
 }
 
-#ifndef __LP64__
-/* Test the 32-bit splitter. */
 vec8 fun8_2(vec8 a, vec8 b)
 {
   return ~a | b;
 }
-#endif
 
 extern vec16 foo1_16(void);
 extern vec16 foo2_16(void);
@@ -28,13 +25,10 @@  vec16 fun16(void)
   return ~foo1_16 () | foo2_16 ();
 }
 
-#ifndef __LP64__
-/* Test the 32-bit splitter. */
 vec16 fun16_2(vec16 a, vec16 b)
 {
   return ~a | b;
 }
-#endif
 
 extern vec32 foo1_32(void);
 extern vec32 foo2_32(void);
@@ -44,14 +38,10 @@  vec32 fun32(void)
   return ~foo1_32 () | foo2_32 ();
 }
 
-#ifndef __LP64__
-/* Test the 32-bit splitter. */
 vec32 fun32_2(vec32 a, vec32 b)
 {
   return ~a | b;
 }
-#endif
-
 
 /* This should be transformed into ~b | a.  */
 vec8 fun8b(void)
@@ -59,38 +49,29 @@  vec8 fun8b(void)
   return foo1_8 () | ~foo2_8 ();
 }
 
-#ifndef __LP64__
-/* Test the 32-bit splitter. */
 vec8 fun8_2b(vec8 a, vec8 b)
 {
   return a | ~b;
 }
-#endif
 
 vec16 fun16b(void)
 {
   return foo1_16 () | ~foo2_16 ();
 }
 
-#ifndef __LP64__
-/* Test the 32-bit splitter. */
 vec16 fun16_2b(vec16 a, vec16 b)
 {
   return a | ~b;
 }
-#endif
 
 vec32 fun32b(void)
 {
   return foo1_32 () | ~foo2_32 ();
 }
 
-#ifndef __LP64__
-/* Test the 32-bit splitter. */
 vec32 fun32_2b(vec32 a, vec32 b)
 {
   return a | ~b;
 }
-#endif
 
-/* { dg-final { scan-assembler-times "fornot1\t%" 6 } } */
+/* { dg-final { scan-assembler-times "fornot1\t%" 12 } } */
diff --git a/gcc/testsuite/gcc.target/sparc/fxnor.c b/gcc/testsuite/gcc.target/sparc/fxnor.c
index a685e08..e635d65 100644
--- a/gcc/testsuite/gcc.target/sparc/fxnor.c
+++ b/gcc/testsuite/gcc.target/sparc/fxnor.c
@@ -12,13 +12,10 @@  vec8 fun8(void)
   return ~(foo1_8 () ^ foo2_8 ());
 }
 
-#ifndef __LP64__
-/* Test the 32-bit splitter. */
 vec8 fun8_2(vec8 a, vec8 b)
 {
   return ~(a ^ b);
 }
-#endif
 
 extern vec16 foo1_16(void);
 extern vec16 foo2_16(void);
@@ -28,13 +25,10 @@  vec16 fun16(void)
   return ~(foo1_16 () ^ foo2_16 ());
 }
 
-#ifndef __LP64__
-/* Test the 32-bit splitter. */
 vec16 fun16_2(vec16 a, vec16 b)
 {
   return ~(a ^ b);
 }
-#endif
 
 extern vec32 foo1_32(void);
 extern vec32 foo2_32(void);
@@ -44,13 +38,10 @@  vec32 fun32(void)
   return ~(foo1_32 () ^ foo2_32 ());
 }
 
-#ifndef __LP64__
-/* Test the 32-bit splitter. */
 vec32 fun32_2(vec32 a, vec32 b)
 {
   return ~(a ^ b);
 }
-#endif
 
 
 /* This should be transformed into ~(b ^ a).  */
@@ -59,38 +50,29 @@  vec8 fun8b(void)
   return foo1_8 () ^ ~foo2_8 ();
 }
 
-#ifndef __LP64__
-/* Test the 32-bit splitter. */
 vec8 fun8_2b(vec8 a, vec8 b)
 {
   return a ^ ~b;
 }
-#endif
 
 vec16 fun16b(void)
 {
   return foo1_16 () ^ ~foo2_16 ();
 }
 
-#ifndef __LP64__
-/* Test the 32-bit splitter. */
 vec16 fun16_2b(vec16 a, vec16 b)
 {
   return a ^ ~b;
 }
-#endif
 
 vec32 fun32b(void)
 {
   return foo1_32 () ^ ~foo2_32 ();
 }
 
-#ifndef __LP64__
-/* Test the 32-bit splitter. */
 vec32 fun32_2b(vec32 a, vec32 b)
 {
   return a ^ ~b;
 }
-#endif
 
-/* { dg-final { scan-assembler-times "fxnor\t%" 6 } } */
+/* { dg-final { scan-assembler-times "fxnor\t%" 12 } } */
diff --git a/gcc/testsuite/gcc.target/sparc/fxor.c b/gcc/testsuite/gcc.target/sparc/fxor.c
index 581b37b..6ca2f76 100644
--- a/gcc/testsuite/gcc.target/sparc/fxor.c
+++ b/gcc/testsuite/gcc.target/sparc/fxor.c
@@ -12,13 +12,10 @@  vec8 fun8(void)
   return foo1_8 () ^ foo2_8 ();
 }
 
-#ifndef __LP64__
-/* Test the 32-bit splitter. */
 vec8 fun8_2(vec8 a, vec8 b)
 {
   return a ^ b;
 }
-#endif
 
 extern vec16 foo1_16(void);
 extern vec16 foo2_16(void);
@@ -28,13 +25,10 @@  vec16 fun16(void)
   return foo1_16 () ^ foo2_16 ();
 }
 
-#ifndef __LP64__
-/* Test the 32-bit splitter. */
 vec16 fun16_2(vec16 a, vec16 b)
 {
   return a ^ b;
 }
-#endif
 
 extern vec32 foo1_32(void);
 extern vec32 foo2_32(void);
@@ -44,12 +38,9 @@  vec32 fun32(void)
   return foo1_32 () ^ foo2_32 ();
 }
 
-#ifndef __LP64__
-/* Test the 32-bit splitter. */
 vec32 fun32_2(vec32 a, vec32 b)
 {
   return a ^ b;
 }
-#endif
 
-/* { dg-final { scan-assembler-times "fxor\t%" 3 } } */
+/* { dg-final { scan-assembler-times "fxor\t%" 6 } } */
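The doubled scan-assembler counts above come from the fun*_2 variants,
which take their operands as register arguments: they are now compiled
unconditionally instead of only under !__LP64__, and each is expected
to emit a VIS instruction.  As a further sketch (again with an assumed
typedef, not taken from the testsuite), the 4-byte modes exercise the
"s"-suffixed forms of the same patterns:

	typedef unsigned char svec8 __attribute__ ((vector_size (4))); /* V4QI */

	svec8 fun8s (svec8 a, svec8 b)
	{
	  return a ^ b;		/* expected: fxors */
	}

	svec8 fun8s_not (svec8 a)
	{
	  return ~a;		/* expected: fnot1s, via one_cmpl<mode>2 */
	}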