[4.8,21/26] Backport Power8 and LE support: Vector APIs

Message ID 1395257643.17148.23.camel@gnopaine
State New

Commit Message

Bill Schmidt March 19, 2014, 7:34 p.m. UTC
Hi,

This patch (diff-le-vector-api) backports the enablement of LE support
for the AltiVec APIs, including support for -maltivec=be; a short
sketch of the element-order semantics follows.
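
For illustration only (not part of the patch; the values are
assumptions in the spirit of the new extract-be-order tests):

  #include <altivec.h>

  vector signed int v = { 0, 1, 2, 3 };
  int a = vec_extract (v, 1);
  /* Default element order on an LE target: a == 1, since element
     numbers count from the low-order end.  Compiled with
     -maltivec=be on the same target, element numbers follow
     big-endian order and the same call yields 2.  */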

Thanks,
Bill


[gcc]

2014-03-19  Bill Schmidt  <wschmidt@linux.vnet.ibm.com>

	Backport from mainline r206443
	2014-01-08  Bill Schmidt  <wschmidt@linux.vnet.ibm.com>

	* config/rs6000/rs6000-c.c (altivec_overloaded_builtins): Remove
	two duplicate entries.

	Backport from mainline r206494
	2014-01-09  Bill Schmidt  <wschmidt@linux.vnet.ibm.com>

	* doc/invoke.texi: Add -maltivec={be,le} options, and document
	default element-order behavior for -maltivec.
	* config/rs6000/rs6000.opt: Add -maltivec={be,le} options.
	* config/rs6000/rs6000.c (rs6000_option_override_internal): Ensure
	that -maltivec={le,be} implies -maltivec; disallow -maltivec=le
	when targeting big endian, at least for now.
	* config/rs6000/rs6000.h: Add #define of VECTOR_ELT_ORDER_BIG.

	Backport from mainline r206541
	2014-01-10  Bill Schmidt  <wschmidt@linux.vnet.ibm.com>

	* config/rs6000/rs6000-builtin.def: Fix pasto for VPKSDUS.

	Backport from mainline r206590
	2014-01-13  Bill Schmidt  <wschmidt@linux.vnet.ibm.com>

	* config/rs6000/rs6000-c.c (altivec_resolve_overloaded_builtin):
	Implement -maltivec=be for vec_insert and vec_extract.
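
A hedged sketch of the resolver's new vec_insert behavior (values
assumed, mirroring the insert-be-order tests added below):

  vector signed int v = { 0, 1, 2, 3 };
  /* With -maltivec=be on an LE target, position 0 names the
     big-endian-first element, so the stored value lands where a BE
     target would put it: v becomes { 0, 1, 2, 99 } in memory order.  */
  v = vec_insert (99, v, 0);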

	Backport from mainline r206641
	2014-01-15  Bill Schmidt  <wschmidt@linux.vnet.ibm.com>

	* config/rs6000/altivec.md (mulv8hi3): Explicitly generate vmulesh
	and vmulosh rather than call gen_vec_widen_smult_*.
	(vec_widen_umult_even_v16qi): Test VECTOR_ELT_ORDER_BIG rather
	than BYTES_BIG_ENDIAN to determine use of even or odd instruction.
	(vec_widen_smult_even_v16qi): Likewise.
	(vec_widen_umult_even_v8hi): Likewise.
	(vec_widen_smult_even_v8hi): Likewise.
	(vec_widen_umult_odd_v16qi): Likewise.
	(vec_widen_smult_odd_v16qi): Likewise.
	(vec_widen_umult_odd_v8hi): Likewise.
	(vec_widen_smult_odd_v8hi): Likewise.
	(vec_widen_umult_hi_v16qi): Explicitly generate vmuleub and
	vmuloub rather than call gen_vec_widen_umult_*.
	(vec_widen_umult_lo_v16qi): Likewise.
	(vec_widen_smult_hi_v16qi): Explicitly generate vmulesb and
	vmulosb rather than call gen_vec_widen_smult_*.
	(vec_widen_smult_lo_v16qi): Likewise.
	(vec_widen_umult_hi_v8hi): Explicitly generate vmuleuh and vmulouh
	rather than call gen_vec_widen_umult_*.
	(vec_widen_umult_lo_v8hi): Likewise.
	(vec_widen_smult_hi_v8hi): Explicitly generate vmulesh and vmulosh
	rather than call gen_vec_widen_smult_*.
	(vec_widen_smult_lo_v8hi): Likewise.
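
For example (a sketch, not taken from the patch), the even/odd
multiplies are now tied to the element order in effect rather than
to the target's byte order:

  vector signed short a, b;      /* assumed operands */
  /* vec_mule multiplies the even-numbered elements.  Whether that
     maps to vmulesh or to vmulosh now follows VECTOR_ELT_ORDER_BIG
     (BE vs. LE element numbering), not BYTES_BIG_ENDIAN.  */
  vector signed int prod = vec_mule (a, b);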

	Backport from mainline r207062
	2014-01-24  Bill Schmidt  <wschmidt@linux.vnet.ibm.com>

	* config/rs6000/rs6000.c (rs6000_expand_vec_perm_const_1): Remove
	correction for little endian...
	* config/rs6000/vsx.md (vsx_xxpermdi2_<mode>_1): ...and move it to
	here.

	Backport from mainline r207262
	2014-01-29  Bill Schmidt  <wschmidt@linux.vnet.ibm.com>

	* config/rs6000/rs6000.c (altivec_expand_vec_perm_const): Use
	CODE_FOR_altivec_vmrg*_direct rather than CODE_FOR_altivec_vmrg*.
	* config/rs6000/vsx.md (vsx_mergel_<mode>): Adjust for
	-maltivec=be with LE targets.
	(vsx_mergeh_<mode>): Likewise.
	* config/rs6000/altivec.md (UNSPEC_VMRG[HL]_DIRECT): New
	unspecs.
	(mulv8hi3): Use gen_altivec_vmrg[hl]w_direct.
	(altivec_vmrghb): Replace with define_expand and new
	*altivec_vmrghb_internal insn; adjust for -maltivec=be with LE
	targets.
	(altivec_vmrghb_direct): New define_insn.
	(altivec_vmrghh): Replace with define_expand and new
	*altivec_vmrghh_internal insn; adjust for -maltivec=be with LE
	targets.
	(altivec_vmrghh_direct): New define_insn.
	(altivec_vmrghw): Replace with define_expand and new
	*altivec_vmrghw_internal insn; adjust for -maltivec=be with LE
	targets.
	(altivec_vmrghw_direct): New define_insn.
	(*altivec_vmrghsf): Adjust for endianness.
	(altivec_vmrglb): Replace with define_expand and new
	*altivec_vmrglb_internal insn; adjust for -maltivec=be with LE
	targets.
	(altivec_vmrglb_direct): New define_insn.
	(altivec_vmrglh): Replace with define_expand and new
	*altivec_vmrglh_internal insn; adjust for -maltivec=be with LE
	targets.
	(altivec_vmrglh_direct): New define_insn.
	(altivec_vmrglw): Replace with define_expand and new
	*altivec_vmrglw_internal insn; adjust for -maltivec=be with LE
	targets.
	(altivec_vmrglw_direct): New define_insn.
	(*altivec_vmrglsf): Adjust for endianness.
	(vec_widen_umult_hi_v16qi): Use gen_altivec_vmrghh_direct.
	(vec_widen_umult_lo_v16qi): Use gen_altivec_vmrglh_direct.
	(vec_widen_smult_hi_v16qi): Use gen_altivec_vmrghh_direct.
	(vec_widen_smult_lo_v16qi): Use gen_altivec_vmrglh_direct.
	(vec_widen_umult_hi_v8hi): Use gen_altivec_vmrghw_direct.
	(vec_widen_umult_lo_v8hi): Use gen_altivec_vmrglw_direct.
	(vec_widen_smult_hi_v8hi): Use gen_altivec_vmrghw_direct.
	(vec_widen_smult_lo_v8hi): Use gen_altivec_vmrglw_direct.
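
A usage sketch (operands assumed):

  vector signed int a = { 0, 1, 2, 3 };
  vector signed int b = { 4, 5, 6, 7 };
  /* vec_mergeh interleaves the "high" halves of a and b.  Which
     half is high, and hence whether vmrghw or vmrglw with swapped
     operands is emitted, follows the element order in effect; the
     new *_direct patterns keep the raw instructions available for
     internal expansions that must not be endian-adjusted.  */
  vector signed int hi = vec_mergeh (a, b);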

	Backport from mainline r207318
	2014-01-30  Bill Schmidt  <wschmidt@linux.vnet.ibm.com>

	* config/rs6000/rs6000.c (rs6000_expand_vector_init): Use
	gen_vsx_xxspltw_v4sf_direct instead of gen_vsx_xxspltw_v4sf;
	remove element index adjustment for endian (now handled in vsx.md
	and altivec.md).
	(altivec_expand_vec_perm_const): Use
	gen_altivec_vsplt[bhw]_direct instead of gen_altivec_vsplt[bhw].
	* config/rs6000/vsx.md (UNSPEC_VSX_XXSPLTW): New unspec.
	(vsx_xxspltw_<mode>): Adjust element index for little endian.
	* config/rs6000/altivec.md (altivec_vspltb): Divide into a
	define_expand and a new define_insn *altivec_vspltb_internal;
	adjust for -maltivec=be on a little endian target.
	(altivec_vspltb_direct): New.
	(altivec_vsplth): Divide into a define_expand and a new
	define_insn *altivec_vsplth_internal; adjust for -maltivec=be on a
	little endian target.
	(altivec_vsplth_direct): New.
	(altivec_vspltw): Divide into a define_expand and a new
	define_insn *altivec_vspltw_internal; adjust for -maltivec=be on a
	little endian target.
	(altivec_vspltw_direct): New.
	(altivec_vspltsf): Divide into a define_expand and a new
	define_insn *altivec_vspltsf_internal; adjust for -maltivec=be on
	a little endian target.
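
For instance (a sketch, operand values assumed):

  vector signed short v = { 0, 1, 2, 3, 4, 5, 6, 7 };
  /* Replicates one input element into every lane.  Under
     -maltivec=be on an LE target the new define_expand converts
     index 3 to its big-endian position before emitting vsplth;
     the *_direct variant performs no adjustment.  */
  vector signed short s = vec_splat (v, 3);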

	Backport from mainline r207326
	2014-01-30  Bill Schmidt  <wschmidt@linux.vnet.ibm.com>

	* config/rs6000/rs6000.c (rs6000_expand_vector_init): Remove
	unused variable "field".
	* config/rs6000/vsx.md (vsx_mergel_<mode>): Add missing DONE.
	(vsx_mergeh_<mode>): Likewise.
	* config/rs6000/altivec.md (altivec_vmrghb): Likewise.
	(altivec_vmrghh): Likewise.
	(altivec_vmrghw): Likewise.
	(altivec_vmrglb): Likewise.
	(altivec_vmrglh): Likewise.
	(altivec_vmrglw): Likewise.
	(altivec_vspltb): Add missing uses.
	(altivec_vsplth): Likewise.
	(altivec_vspltw): Likewise.
	(altivec_vspltsf): Likewise.

	Backport from mainline r207414
	2014-02-02  Bill Schmidt  <wschmidt@linux.vnet.ibm.com>

	* config/rs6000/altivec.md (UNSPEC_VSUMSWS_DIRECT): New unspec.
	(altivec_vsumsws): Add handling for -maltivec=be with a little
	endian target.
	(altivec_vsumsws_direct): New.
	(reduc_splus_<mode>): Call gen_altivec_vsumsws_direct instead of
	gen_altivec_vsumsws.
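
Usage sketch (values assumed):

  vector signed int a = { 1, 2, 3, 4 };
  vector signed int z = { 0, 0, 0, 0 };
  /* vec_sums computes the saturated sum of all elements of a plus
     element 3 of z, with the result defined to appear in element 3.
     The adjusted pattern shuffles the hardware result so that this
     holds under either element order.  */
  vector signed int r = vec_sums (a, z);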

	Backport from mainline r207415
	2014-02-02  Bill Schmidt  <wschmidt@linux.vnet.ibm.com>

	* config/rs6000/rs6000.c (altivec_expand_vec_perm_le): Generalize
	for vector types other than V16QImode.
	* config/rs6000/altivec.md (altivec_vperm_<mode>): Change to a
	define_expand, and call altivec_expand_vec_perm_le when producing
	code with little endian element order.
	(*altivec_vperm_<mode>_internal): New insn having previous
	behavior of altivec_vperm_<mode>.
	(altivec_vperm_<mode>_uns): Change to a define_expand, and call
	altivec_expand_vec_perm_le when producing code with little endian
	element order.
	(*altivec_vperm_<mode>_uns_internal): New insn having previous
	behavior of altivec_vperm_<mode>_uns.
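
For example (a sketch, operands assumed):

  vector unsigned char a, b, sel;
  /* Selector bytes 0-31 pick from the concatenation of a and b.
     When producing little-endian element order, the expander now
     calls altivec_expand_vec_perm_le, which rewrites the operands
     and selector so the same selector picks the same logical bytes
     as on big endian; this now works for all vector modes, not
     just V16QImode.  */
  vector unsigned char r = vec_perm (a, b, sel);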

	Backport from mainline r207520
	2014-02-05  Bill Schmidt  <wschmidt@linux.vnet.ibm.com>

	* config/rs6000/altivec.md (UNSPEC_VPACK_UNS_UNS_MOD_DIRECT): New
	unspec.
	(UNSPEC_VUNPACK_HI_SIGN_DIRECT): Likewise.
	(UNSPEC_VUNPACK_LO_SIGN_DIRECT): Likewise.
	(mulv8hi3): Use gen_altivec_vpkuwum_direct instead of
	gen_altivec_vpkuwum.
	(altivec_vpkpx): Test for VECTOR_ELT_ORDER_BIG instead of for
	BYTES_BIG_ENDIAN.
	(altivec_vpks<VI_char>ss): Likewise.
	(altivec_vpks<VI_char>us): Likewise.
	(altivec_vpku<VI_char>us): Likewise.
	(altivec_vpku<VI_char>um): Likewise.
	(altivec_vpku<VI_char>um_direct): New (copy of
	altivec_vpku<VI_char>um that still relies on BYTES_BIG_ENDIAN, for
	internal use).
	(altivec_vupkhs<VU_char>): Emit vupkls* instead of vupkhs* when
	target is little endian and -maltivec=be is not specified.
	(*altivec_vupkhs<VU_char>_direct): New (copy of
	altivec_vupkhs<VU_char> that always emits vupkhs*, for internal
	use).
	(altivec_vupkls<VU_char>): Emit vupkhs* instead of vupkls* when
	target is little endian and -maltivec=be is not specified.
	(*altivec_vupkls<VU_char>_direct): New (copy of
	altivec_vupkls<VU_char> that always emits vupkls*, for internal
	use).
	(altivec_vupkhpx): Emit vupklpx instead of vupkhpx when target is
	little endian and -maltivec=be is not specified.
	(altivec_vupklpx): Emit vupkhpx instead of vupklpx when target is
	little endian and -maltivec=be is not specified.
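
A sketch of the user-visible effect (operand assumed):

  vector signed short v;
  /* vec_unpackh widens the "high" half of v.  On an LE target
     without -maltivec=be, the high half in element order is the
     one the hardware calls low, so vupklsh is emitted instead of
     vupkhsh; the *_direct variants preserve the literal
     instructions for internal use.  */
  vector signed int hi = vec_unpackh (v);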

	Backport from mainline r207521
	2014-02-05  Bill Schmidt  <wschmidt@linux.vnet.ibm.com>

	* config/rs6000/altivec.md (altivec_vsum2sws): Adjust code
	generation for -maltivec=be.
	(altivec_vsumsws): Simplify redundant test.

	Backport from mainline r207525
	2014-02-05  Bill Schmidt  <wschmidt@linux.vnet.ibm.com>

	* config/rs6000/rs6000.c (altivec_expand_vec_perm_const): Change
	CODE_FOR_altivec_vpku[hw]um to
	CODE_FOR_altivec_vpku[hw]um_direct.
	* config/rs6000/altivec.md (vec_unpacks_hi_<VP_small_lc>): Change
	UNSPEC_VUNPACK_HI_SIGN to UNSPEC_VUNPACK_HI_SIGN_DIRECT.
	(vec_unpacks_lo_<VP_small_lc>): Change UNSPEC_VUNPACK_LO_SIGN to
	UNSPEC_VUNPACK_LO_SIGN_DIRECT.

	Backport from mainline r207814
	2014-02-16  Bill Schmidt  <wschmidt@linux.vnet.ibm.com>

	* config/rs6000/vsx.md (vsx_xxpermdi_<mode>): Handle little
	endian targets.

	Backport from mainline r207815
	2014-02-16  Bill Schmidt  <wschmidt@linux.vnet.ibm.com>

	* config/rs6000/altivec.md (p8_vmrgew): Handle little endian
	targets.
	(p8_vmrgow): Likewise.

	Backport from mainline r207919
	2014-02-19  Bill Schmidt  <wschmidt@linux.vnet.ibm.com>

	* config/rs6000/rs6000.c (vspltis_constant): Fix most significant
	bit of zero.

	Backport from mainline r208019
	2014-02-21  Bill Schmidt  <wschmidt@linux.vnet.ibm.com>

	* config/rs6000/altivec.md (altivec_lvxl): Rename as
	*altivec_lvxl_<mode>_internal and use VM2 iterator instead of
	V4SI.
	(altivec_lvxl_<mode>): New define_expand incorporating
	-maltivec=be semantics where needed.
	(altivec_lvx): Rename as *altivec_lvx_<mode>_internal.
	(altivec_lvx_<mode>): New define_expand incorporating -maltivec=be
	semantics where needed.
	(altivec_stvx): Rename as *altivec_stvx_<mode>_internal.
	(altivec_stvx_<mode>): New define_expand incorporating
	-maltivec=be semantics where needed.
	(altivec_stvxl): Rename as *altivec_stvxl_<mode>_internal and use
	VM2 iterator instead of V4SI.
	(altivec_stvxl_<mode>): New define_expand incorporating
	-maltivec=be semantics where needed.
	* config/rs6000/rs6000-builtin.def: Add new built-in definitions
	LVXL_V2DF, LVXL_V2DI, LVXL_V4SF, LVXL_V4SI, LVXL_V8HI, LVXL_V16QI,
	LVX_V2DF, LVX_V2DI, LVX_V4SF, LVX_V4SI, LVX_V8HI, LVX_V16QI,
	STVX_V2DF, STVX_V2DI, STVX_V4SF, STVX_V4SI, STVX_V8HI, STVX_V16QI,
	STVXL_V2DF, STVXL_V2DI, STVXL_V4SF, STVXL_V4SI, STVXL_V8HI,
	STVXL_V16QI.
	* config/rs6000/rs6000-c.c (altivec_overloaded_builtins): Replace
	ALTIVEC_BUILTIN_LVX with ALTIVEC_BUILTIN_LVX_<MODE> throughout;
	similarly for ALTIVEC_BUILTIN_LVXL, ALTIVEC_BUILTIN_STVX, and
	ALTIVEC_BUILTIN_STVXL.
	* config/rs6000/rs6000-protos.h (altivec_expand_lvx_be): New
	prototype.
	(altivec_expand_stvx_be): Likewise.
	* config/rs6000/rs6000.c (swap_selector_for_mode): New function.
	(altivec_expand_lvx_be): Likewise.
	(altivec_expand_stvx_be): Likewise.
	(altivec_expand_builtin): Add cases for
	ALTIVEC_BUILTIN_STVX_<MODE>, ALTIVEC_BUILTIN_STVXL_<MODE>,
	ALTIVEC_BUILTIN_LVXL_<MODE>, and ALTIVEC_BUILTIN_LVX_<MODE>.
	(altivec_init_builtins): Add definitions for
	__builtin_altivec_lvxl_<mode>, __builtin_altivec_lvx_<mode>,
	__builtin_altivec_stvx_<mode>, and
	__builtin_altivec_stvxl_<mode>.
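
Usage sketch (buffer assumed):

  vector signed int buf[1];
  /* vec_ld and vec_st now resolve to per-mode built-ins such as
     __builtin_altivec_lvx_v4si.  Under -maltivec=be on an LE
     target, altivec_expand_lvx_be and altivec_expand_stvx_be wrap
     the lvx/stvx with a permute (via swap_selector_for_mode) so
     the register holds its elements in big-endian order.  */
  vector signed int v = vec_ld (0, buf);
  vec_st (v, 0, buf);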

	Backport from mainline r208021
	2014-02-21  Bill Schmidt  <wschmidt@linux.vnet.ibm.com>

	* config/rs6000/altivec.md (altivec_vsumsws): Replace second
	vspltw with vsldoi.
	(reduc_uplus_v16qi): Use gen_altivec_vsumsws_direct instead of
	gen_altivec_vsumsws.

	Backport from mainline r208049
	2014-02-23  Bill Schmidt  <wschmidt@linux.vnet.ibm.com>

	* config/rs6000/altivec.md (altivec_lve<VI_char>x): Replace
	define_insn with define_expand and new define_insn
	*altivec_lve<VI_char>x_internal.
	(altivec_stve<VI_char>x): Replace define_insn with define_expand
	and new define_insn *altivec_stve<VI_char>x_internal.
	* config/rs6000/rs6000-protos.h (altivec_expand_stvex_be): New
	prototype.
	* config/rs6000/rs6000.c (altivec_expand_lvx_be): Document use by
	lve*x built-ins.
	(altivec_expand_stvex_be): New function.
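
Sketch (operands assumed):

  int arr[4] __attribute__ ((aligned (16)));
  /* vec_lde and vec_ste transfer a single element; the element's
     position within the vector register depends on element order,
     so the new expanders adjust it for -maltivec=be on an LE
     target (altivec_expand_stvex_be on the store side).  */
  vector signed int v = vec_lde (0, arr);
  vec_ste (v, 0, arr);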

	Backport from mainline
	2014-02-23  Bill Schmidt  <wschmidt@linux.vnet.ibm.com>

	* config/rs6000/rs6000.c (rs6000_emit_le_vsx_move): Relax assert
	to permit subregs.

	Backport from mainline
	2014-02-25  Bill Schmidt  <wschmidt@linux.vnet.ibm.com>

	* config/rs6000/vector.md (*vector_unordered<mode>): Change split
	to use canonical form for nor<mode>3.

[gcc/testsuite]

2014-03-19  Bill Schmidt  <wschmidt@linux.vnet.ibm.com>

	Backport from mainline r206590
	2014-01-13  Bill Schmidt  <wschmidt@linux.vnet.ibm.com>

	* gcc.dg/vmx/insert.c: New.
	* gcc.dg/vmx/insert-be-order.c: New.
	* gcc.dg/vmx/extract.c: New.
	* gcc.dg/vmx/extract-be-order.c: New.
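
These follow the gcc.dg/vmx harness convention; a plausible
skeleton (not the actual committed contents):

  #include "harness.h"

  static void test (void)
  {
    vector signed int va = { 0, 1, 2, 3 };
    /* Holds under the default element order on BE and LE alike;
       the -be-order variants compile with -maltivec=be and check
       the reversed positions instead.  */
    check (vec_extract (va, 2) == 2, "vec_extract");
  }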

	Backport from mainline r206641
	2014-01-15  Bill Schmidt  <wschmidt@linux.vnet.ibm.com>

	* gcc.dg/vmx/mult-even-odd.c: New.
	* gcc.dg/vmx/mult-even-odd-be-order.c: New.

	Backport from mainline r206926
	2014-01-22  Bill Schmidt  <wschmidt@linux.vnet.ibm.com>

	* gcc.dg/vmx/insert-vsx-be-order.c: New.
	* gcc.dg/vmx/extract-vsx.c: New.
	* gcc.dg/vmx/extract-vsx-be-order.c: New.
	* gcc.dg/vmx/insert-vsx.c: New.

	Backport from mainline r207262
	2014-01-29  Bill Schmidt  <wschmidt@linux.vnet.ibm.com>

	* gcc.dg/vmx/merge-be-order.c: New.
	* gcc.dg/vmx/merge.c: New.
	* gcc.dg/vmx/merge-vsx-be-order.c: New.
	* gcc.dg/vmx/merge-vsx.c: New.

	Backport from mainline r207318
	2014-01-30  Bill Schmidt  <wschmidt@linux.vnet.ibm.com>

	* gcc.dg/vmx/splat.c: New.
	* gcc.dg/vmx/splat-vsx.c: New.
	* gcc.dg/vmx/splat-be-order.c: New.
	* gcc.dg/vmx/splat-vsx-be-order.c: New.
	* gcc.dg/vmx/eg-5.c: Remove special casing for little endian.
	* gcc.dg/vmx/sn7153.c: Add special casing for little endian.

	Backport from mainline r207414
	2014-02-02  Bill Schmidt  <wschmidt@linux.vnet.ibm.com>

	* gcc.dg/vmx/vsums.c: New.
	* gcc.dg/vmx/vsums-be-order.c: New.

	Backport from mainline r207415
	2014-02-02  Bill Schmidt  <wschmidt@linux.vnet.ibm.com>

	* gcc.dg/vmx/3b-15.c: Remove special handling for little endian.
	* gcc.dg/vmx/perm.c: New.
	* gcc.dg/vmx/perm-be-order.c: New.

	Backport from mainline r207520
	2014-02-05  Bill Schmidt  <wschmidt@linux.vnet.ibm.com>

	* gcc.dg/vmx/pack.c: New.
	* gcc.dg/vmx/pack-be-order.c: New.
	* gcc.dg/vmx/unpack.c: New.
	* gcc.dg/vmx/unpack-be-order.c: New.

	Backport from mainline r207521
	2014-02-05  Bill Schmidt  <wschmidt@linux.vnet.ibm.com>

	* gcc.dg/vmx/sum2s.c: New.
	* gcc.dg/vmx/sum2s-be-order.c: New.

	Backport from mainline r208019
	2014-02-21  Bill Schmidt  <wschmidt@linux.vnet.ibm.com>

	* gcc.dg/vmx/ld.c: New test.
	* gcc.dg/vmx/ld-be-order.c: New test.
	* gcc.dg/vmx/ld-vsx.c: New test.
	* gcc.dg/vmx/ld-vsx-be-order.c: New test.
	* gcc.dg/vmx/ldl.c: New test.
	* gcc.dg/vmx/ldl-be-order.c: New test.
	* gcc.dg/vmx/ldl-vsx.c: New test.
	* gcc.dg/vmx/ldl-vsx-be-order.c: New test.
	* gcc.dg/vmx/st.c: New test.
	* gcc.dg/vmx/st-be-order.c: New test.
	* gcc.dg/vmx/st-vsx.c: New test.
	* gcc.dg/vmx/st-vsx-be-order.c: New test.
	* gcc.dg/vmx/stl.c: New test.
	* gcc.dg/vmx/stl-be-order.c: New test.
	* gcc.dg/vmx/stl-vsx.c: New test.
	* gcc.dg/vmx/stl-vsx-be-order.c: New test.

	Backport from mainline r208021
	2014-02-21  Bill Schmidt  <wschmidt@linux.vnet.ibm.com>

	* gcc.dg/vmx/vsums.c: Check entire result vector.
	* gcc.dg/vmx/vsums-be-order.c: Likewise.

	Backport from mainline r208049
	2014-02-23  Bill Schmidt  <wschmidt@linux.vnet.ibm.com>

	* gcc.dg/vmx/lde.c: New test.
	* gcc.dg/vmx/lde-be-order.c: New test.
	* gcc.dg/vmx/ste.c: New test.
	* gcc.dg/vmx/ste-be-order.c: New test.

	Backport from mainline r208120
	2014-02-25  Bill Schmidt  <wschmidt@linux.vnet.ibm.com>

	* gcc.dg/vmx/ld-vsx.c: Don't use vec_all_eq.
	* gcc.dg/vmx/ld-vsx-be-order.c: Likewise.
	* gcc.dg/vmx/ldl-vsx.c: Likewise.
	* gcc.dg/vmx/ldl-vsx-be-order.c: Likewise.
	* gcc.dg/vmx/merge-vsx.c: Likewise.
	* gcc.dg/vmx/merge-vsx-be-order.c: Likewise.

	Backport from mainline r208321
	2014-03-04  Bill Schmidt  <wschmidt@linux.vnet.ibm.com>

	* gcc.dg/vmx/extract-vsx.c: Replace "vector long" with "vector
	long long" throughout.
	* gcc.dg/vmx/extract-vsx-be-order.c: Likewise.
	* gcc.dg/vmx/insert-vsx.c: Likewise.
	* gcc.dg/vmx/insert-vsx-be-order.c: Likewise.
	* gcc.dg/vmx/ld-vsx.c: Likewise.
	* gcc.dg/vmx/ld-vsx-be-order.c: Likewise.
	* gcc.dg/vmx/ldl-vsx.c: Likewise.
	* gcc.dg/vmx/ldl-vsx-be-order.c: Likewise.
	* gcc.dg/vmx/merge-vsx.c: Likewise.
	* gcc.dg/vmx/merge-vsx-be-order.c: Likewise.
	* gcc.dg/vmx/st-vsx.c: Likewise.
	* gcc.dg/vmx/st-vsx-be-order.c: Likewise.
	* gcc.dg/vmx/stl-vsx.c: Likewise.
	* gcc.dg/vmx/stl-vsx-be-order.c: Likewise.

Comments

David Edelsohn April 3, 2014, 2:43 p.m. UTC | #1
On Wed, Mar 19, 2014 at 3:34 PM, Bill Schmidt
<wschmidt@linux.vnet.ibm.com> wrote:
> Hi,
>
> This patch (diff-le-vector-api) backports enablement of LE support for
> the Altivec APIs, including support for -maltivec=be.
>
> Thanks,
> Bill
>
> [quoted ChangeLog snipped; identical to the commit message above]

Okay.

Thanks, David

Patch

Index: gcc-4_8-test/gcc/config/rs6000/rs6000-c.c
===================================================================
--- gcc-4_8-test.orig/gcc/config/rs6000/rs6000-c.c
+++ gcc-4_8-test/gcc/config/rs6000/rs6000-c.c
@@ -612,10 +612,6 @@  const struct altivec_builtin_types altiv
     RS6000_BTI_V4SI, RS6000_BTI_V8HI, 0, 0 },
   { ALTIVEC_BUILTIN_VEC_VUPKHSH, ALTIVEC_BUILTIN_VUPKHSH,
     RS6000_BTI_bool_V4SI, RS6000_BTI_bool_V8HI, 0, 0 },
-  { ALTIVEC_BUILTIN_VEC_UNPACKH, P8V_BUILTIN_VUPKHSW,
-    RS6000_BTI_V2DI, RS6000_BTI_V4SI, 0, 0 },
-  { ALTIVEC_BUILTIN_VEC_UNPACKH, P8V_BUILTIN_VUPKHSW,
-    RS6000_BTI_bool_V2DI, RS6000_BTI_bool_V4SI, 0, 0 },
   { ALTIVEC_BUILTIN_VEC_VUPKHSH, P8V_BUILTIN_VUPKHSW,
     RS6000_BTI_V2DI, RS6000_BTI_V4SI, 0, 0 },
   { ALTIVEC_BUILTIN_VEC_VUPKHSH, P8V_BUILTIN_VUPKHSW,
@@ -1110,54 +1106,54 @@  const struct altivec_builtin_types altiv
     RS6000_BTI_V4SF, RS6000_BTI_V4SF, RS6000_BTI_V4SF, 0 },
   { VSX_BUILTIN_VEC_DIV, VSX_BUILTIN_XVDIVDP,
     RS6000_BTI_V2DF, RS6000_BTI_V2DF, RS6000_BTI_V2DF, 0 },
-  { ALTIVEC_BUILTIN_VEC_LD, ALTIVEC_BUILTIN_LVX,
+  { ALTIVEC_BUILTIN_VEC_LD, ALTIVEC_BUILTIN_LVX_V2DF,
     RS6000_BTI_V2DF, RS6000_BTI_INTSI, ~RS6000_BTI_V2DF, 0 },
-  { ALTIVEC_BUILTIN_VEC_LD, ALTIVEC_BUILTIN_LVX,
+  { ALTIVEC_BUILTIN_VEC_LD, ALTIVEC_BUILTIN_LVX_V2DI,
     RS6000_BTI_V2DI, RS6000_BTI_INTSI, ~RS6000_BTI_V2DI, 0 },
-  { ALTIVEC_BUILTIN_VEC_LD, ALTIVEC_BUILTIN_LVX,
+  { ALTIVEC_BUILTIN_VEC_LD, ALTIVEC_BUILTIN_LVX_V2DI,
     RS6000_BTI_unsigned_V2DI, RS6000_BTI_INTSI,
     ~RS6000_BTI_unsigned_V2DI, 0 },
-  { ALTIVEC_BUILTIN_VEC_LD, ALTIVEC_BUILTIN_LVX,
+  { ALTIVEC_BUILTIN_VEC_LD, ALTIVEC_BUILTIN_LVX_V2DI,
     RS6000_BTI_bool_V2DI, RS6000_BTI_INTSI, ~RS6000_BTI_bool_V2DI, 0 },
-  { ALTIVEC_BUILTIN_VEC_LD, ALTIVEC_BUILTIN_LVX,
+  { ALTIVEC_BUILTIN_VEC_LD, ALTIVEC_BUILTIN_LVX_V4SF,
     RS6000_BTI_V4SF, RS6000_BTI_INTSI, ~RS6000_BTI_V4SF, 0 },
-  { ALTIVEC_BUILTIN_VEC_LD, ALTIVEC_BUILTIN_LVX,
+  { ALTIVEC_BUILTIN_VEC_LD, ALTIVEC_BUILTIN_LVX_V4SF,
     RS6000_BTI_V4SF, RS6000_BTI_INTSI, ~RS6000_BTI_float, 0 },
-  { ALTIVEC_BUILTIN_VEC_LD, ALTIVEC_BUILTIN_LVX,
+  { ALTIVEC_BUILTIN_VEC_LD, ALTIVEC_BUILTIN_LVX_V4SI,
     RS6000_BTI_bool_V4SI, RS6000_BTI_INTSI, ~RS6000_BTI_bool_V4SI, 0 },
-  { ALTIVEC_BUILTIN_VEC_LD, ALTIVEC_BUILTIN_LVX,
+  { ALTIVEC_BUILTIN_VEC_LD, ALTIVEC_BUILTIN_LVX_V4SI,
     RS6000_BTI_V4SI, RS6000_BTI_INTSI, ~RS6000_BTI_V4SI, 0 },
-  { ALTIVEC_BUILTIN_VEC_LD, ALTIVEC_BUILTIN_LVX,
+  { ALTIVEC_BUILTIN_VEC_LD, ALTIVEC_BUILTIN_LVX_V4SI,
     RS6000_BTI_V4SI, RS6000_BTI_INTSI, ~RS6000_BTI_INTSI, 0 },
-  { ALTIVEC_BUILTIN_VEC_LD, ALTIVEC_BUILTIN_LVX,
+  { ALTIVEC_BUILTIN_VEC_LD, ALTIVEC_BUILTIN_LVX_V4SI,
     RS6000_BTI_V4SI, RS6000_BTI_INTSI, ~RS6000_BTI_long, 0 },
-  { ALTIVEC_BUILTIN_VEC_LD, ALTIVEC_BUILTIN_LVX,
+  { ALTIVEC_BUILTIN_VEC_LD, ALTIVEC_BUILTIN_LVX_V4SI,
     RS6000_BTI_unsigned_V4SI, RS6000_BTI_INTSI, ~RS6000_BTI_unsigned_V4SI, 0 },
-  { ALTIVEC_BUILTIN_VEC_LD, ALTIVEC_BUILTIN_LVX,
+  { ALTIVEC_BUILTIN_VEC_LD, ALTIVEC_BUILTIN_LVX_V4SI,
     RS6000_BTI_unsigned_V4SI, RS6000_BTI_INTSI, ~RS6000_BTI_UINTSI, 0 },
-  { ALTIVEC_BUILTIN_VEC_LD, ALTIVEC_BUILTIN_LVX,
+  { ALTIVEC_BUILTIN_VEC_LD, ALTIVEC_BUILTIN_LVX_V4SI,
     RS6000_BTI_unsigned_V4SI, RS6000_BTI_INTSI, ~RS6000_BTI_unsigned_long, 0 },
-  { ALTIVEC_BUILTIN_VEC_LD, ALTIVEC_BUILTIN_LVX,
+  { ALTIVEC_BUILTIN_VEC_LD, ALTIVEC_BUILTIN_LVX_V8HI,
     RS6000_BTI_bool_V8HI, RS6000_BTI_INTSI, ~RS6000_BTI_bool_V8HI, 0 },
-  { ALTIVEC_BUILTIN_VEC_LD, ALTIVEC_BUILTIN_LVX,
+  { ALTIVEC_BUILTIN_VEC_LD, ALTIVEC_BUILTIN_LVX_V8HI,
     RS6000_BTI_pixel_V8HI, RS6000_BTI_INTSI, ~RS6000_BTI_pixel_V8HI, 0 },
-  { ALTIVEC_BUILTIN_VEC_LD, ALTIVEC_BUILTIN_LVX,
+  { ALTIVEC_BUILTIN_VEC_LD, ALTIVEC_BUILTIN_LVX_V8HI,
     RS6000_BTI_V8HI, RS6000_BTI_INTSI, ~RS6000_BTI_V8HI, 0 },
-  { ALTIVEC_BUILTIN_VEC_LD, ALTIVEC_BUILTIN_LVX,
+  { ALTIVEC_BUILTIN_VEC_LD, ALTIVEC_BUILTIN_LVX_V8HI,
     RS6000_BTI_V8HI, RS6000_BTI_INTSI, ~RS6000_BTI_INTHI, 0 },
-  { ALTIVEC_BUILTIN_VEC_LD, ALTIVEC_BUILTIN_LVX,
+  { ALTIVEC_BUILTIN_VEC_LD, ALTIVEC_BUILTIN_LVX_V8HI,
     RS6000_BTI_unsigned_V8HI, RS6000_BTI_INTSI, ~RS6000_BTI_unsigned_V8HI, 0 },
-  { ALTIVEC_BUILTIN_VEC_LD, ALTIVEC_BUILTIN_LVX,
+  { ALTIVEC_BUILTIN_VEC_LD, ALTIVEC_BUILTIN_LVX_V8HI,
     RS6000_BTI_unsigned_V8HI, RS6000_BTI_INTSI, ~RS6000_BTI_UINTHI, 0 },
-  { ALTIVEC_BUILTIN_VEC_LD, ALTIVEC_BUILTIN_LVX,
+  { ALTIVEC_BUILTIN_VEC_LD, ALTIVEC_BUILTIN_LVX_V16QI,
     RS6000_BTI_bool_V16QI, RS6000_BTI_INTSI, ~RS6000_BTI_bool_V16QI, 0 },
-  { ALTIVEC_BUILTIN_VEC_LD, ALTIVEC_BUILTIN_LVX,
+  { ALTIVEC_BUILTIN_VEC_LD, ALTIVEC_BUILTIN_LVX_V16QI,
     RS6000_BTI_V16QI, RS6000_BTI_INTSI, ~RS6000_BTI_V16QI, 0 },
-  { ALTIVEC_BUILTIN_VEC_LD, ALTIVEC_BUILTIN_LVX,
+  { ALTIVEC_BUILTIN_VEC_LD, ALTIVEC_BUILTIN_LVX_V16QI,
     RS6000_BTI_V16QI, RS6000_BTI_INTSI, ~RS6000_BTI_INTQI, 0 },
-  { ALTIVEC_BUILTIN_VEC_LD, ALTIVEC_BUILTIN_LVX,
+  { ALTIVEC_BUILTIN_VEC_LD, ALTIVEC_BUILTIN_LVX_V16QI,
     RS6000_BTI_unsigned_V16QI, RS6000_BTI_INTSI, ~RS6000_BTI_unsigned_V16QI, 0 },
-  { ALTIVEC_BUILTIN_VEC_LD, ALTIVEC_BUILTIN_LVX,
+  { ALTIVEC_BUILTIN_VEC_LD, ALTIVEC_BUILTIN_LVX_V16QI,
     RS6000_BTI_unsigned_V16QI, RS6000_BTI_INTSI, ~RS6000_BTI_UINTQI, 0 },
   { ALTIVEC_BUILTIN_VEC_LDE, ALTIVEC_BUILTIN_LVEBX,
     RS6000_BTI_V16QI, RS6000_BTI_INTSI, ~RS6000_BTI_INTQI, 0 },
@@ -1195,55 +1191,55 @@  const struct altivec_builtin_types altiv
     RS6000_BTI_V16QI, RS6000_BTI_INTSI, ~RS6000_BTI_INTQI, 0 },
   { ALTIVEC_BUILTIN_VEC_LVEBX, ALTIVEC_BUILTIN_LVEBX,
     RS6000_BTI_unsigned_V16QI, RS6000_BTI_INTSI, ~RS6000_BTI_UINTQI, 0 },
-  { ALTIVEC_BUILTIN_VEC_LDL, ALTIVEC_BUILTIN_LVXL,
+  { ALTIVEC_BUILTIN_VEC_LDL, ALTIVEC_BUILTIN_LVXL_V4SF,
     RS6000_BTI_V4SF, RS6000_BTI_INTSI, ~RS6000_BTI_V4SF, 0 },
-  { ALTIVEC_BUILTIN_VEC_LDL, ALTIVEC_BUILTIN_LVXL,
+  { ALTIVEC_BUILTIN_VEC_LDL, ALTIVEC_BUILTIN_LVXL_V4SF,
     RS6000_BTI_V4SF, RS6000_BTI_INTSI, ~RS6000_BTI_float, 0 },
-  { ALTIVEC_BUILTIN_VEC_LDL, ALTIVEC_BUILTIN_LVXL,
+  { ALTIVEC_BUILTIN_VEC_LDL, ALTIVEC_BUILTIN_LVXL_V4SI,
     RS6000_BTI_bool_V4SI, RS6000_BTI_INTSI, ~RS6000_BTI_bool_V4SI, 0 },
-  { ALTIVEC_BUILTIN_VEC_LDL, ALTIVEC_BUILTIN_LVXL,
+  { ALTIVEC_BUILTIN_VEC_LDL, ALTIVEC_BUILTIN_LVXL_V4SI,
     RS6000_BTI_V4SI, RS6000_BTI_INTSI, ~RS6000_BTI_V4SI, 0 },
-  { ALTIVEC_BUILTIN_VEC_LDL, ALTIVEC_BUILTIN_LVXL,
+  { ALTIVEC_BUILTIN_VEC_LDL, ALTIVEC_BUILTIN_LVXL_V4SI,
     RS6000_BTI_V4SI, RS6000_BTI_INTSI, ~RS6000_BTI_INTSI, 0 },
-  { ALTIVEC_BUILTIN_VEC_LDL, ALTIVEC_BUILTIN_LVXL,
+  { ALTIVEC_BUILTIN_VEC_LDL, ALTIVEC_BUILTIN_LVXL_V4SI,
     RS6000_BTI_V4SI, RS6000_BTI_INTSI, ~RS6000_BTI_long, 0 },
-  { ALTIVEC_BUILTIN_VEC_LDL, ALTIVEC_BUILTIN_LVXL,
+  { ALTIVEC_BUILTIN_VEC_LDL, ALTIVEC_BUILTIN_LVXL_V4SI,
     RS6000_BTI_unsigned_V4SI, RS6000_BTI_INTSI, ~RS6000_BTI_unsigned_V4SI, 0 },
-  { ALTIVEC_BUILTIN_VEC_LDL, ALTIVEC_BUILTIN_LVXL,
+  { ALTIVEC_BUILTIN_VEC_LDL, ALTIVEC_BUILTIN_LVXL_V4SI,
     RS6000_BTI_unsigned_V4SI, RS6000_BTI_INTSI, ~RS6000_BTI_UINTSI, 0 },
-  { ALTIVEC_BUILTIN_VEC_LDL, ALTIVEC_BUILTIN_LVXL,
+  { ALTIVEC_BUILTIN_VEC_LDL, ALTIVEC_BUILTIN_LVXL_V4SI,
     RS6000_BTI_unsigned_V4SI, RS6000_BTI_INTSI, ~RS6000_BTI_unsigned_long, 0 },
-  { ALTIVEC_BUILTIN_VEC_LDL, ALTIVEC_BUILTIN_LVXL,
+  { ALTIVEC_BUILTIN_VEC_LDL, ALTIVEC_BUILTIN_LVXL_V8HI,
     RS6000_BTI_bool_V8HI, RS6000_BTI_INTSI, ~RS6000_BTI_bool_V8HI, 0 },
-  { ALTIVEC_BUILTIN_VEC_LDL, ALTIVEC_BUILTIN_LVXL,
+  { ALTIVEC_BUILTIN_VEC_LDL, ALTIVEC_BUILTIN_LVXL_V8HI,
     RS6000_BTI_pixel_V8HI, RS6000_BTI_INTSI, ~RS6000_BTI_pixel_V8HI, 0 },
-  { ALTIVEC_BUILTIN_VEC_LDL, ALTIVEC_BUILTIN_LVXL,
+  { ALTIVEC_BUILTIN_VEC_LDL, ALTIVEC_BUILTIN_LVXL_V8HI,
     RS6000_BTI_V8HI, RS6000_BTI_INTSI, ~RS6000_BTI_V8HI, 0 },
-  { ALTIVEC_BUILTIN_VEC_LDL, ALTIVEC_BUILTIN_LVXL,
+  { ALTIVEC_BUILTIN_VEC_LDL, ALTIVEC_BUILTIN_LVXL_V8HI,
     RS6000_BTI_V8HI, RS6000_BTI_INTSI, ~RS6000_BTI_INTHI, 0 },
-  { ALTIVEC_BUILTIN_VEC_LDL, ALTIVEC_BUILTIN_LVXL,
+  { ALTIVEC_BUILTIN_VEC_LDL, ALTIVEC_BUILTIN_LVXL_V8HI,
     RS6000_BTI_unsigned_V8HI, RS6000_BTI_INTSI, ~RS6000_BTI_unsigned_V8HI, 0 },
-  { ALTIVEC_BUILTIN_VEC_LDL, ALTIVEC_BUILTIN_LVXL,
+  { ALTIVEC_BUILTIN_VEC_LDL, ALTIVEC_BUILTIN_LVXL_V8HI,
     RS6000_BTI_unsigned_V8HI, RS6000_BTI_INTSI, ~RS6000_BTI_UINTHI, 0 },
-  { ALTIVEC_BUILTIN_VEC_LDL, ALTIVEC_BUILTIN_LVXL,
+  { ALTIVEC_BUILTIN_VEC_LDL, ALTIVEC_BUILTIN_LVXL_V16QI,
     RS6000_BTI_bool_V16QI, RS6000_BTI_INTSI, ~RS6000_BTI_bool_V16QI, 0 },
-  { ALTIVEC_BUILTIN_VEC_LDL, ALTIVEC_BUILTIN_LVXL,
+  { ALTIVEC_BUILTIN_VEC_LDL, ALTIVEC_BUILTIN_LVXL_V16QI,
     RS6000_BTI_V16QI, RS6000_BTI_INTSI, ~RS6000_BTI_V16QI, 0 },
-  { ALTIVEC_BUILTIN_VEC_LDL, ALTIVEC_BUILTIN_LVXL,
+  { ALTIVEC_BUILTIN_VEC_LDL, ALTIVEC_BUILTIN_LVXL_V16QI,
     RS6000_BTI_V16QI, RS6000_BTI_INTSI, ~RS6000_BTI_INTQI, 0 },
-  { ALTIVEC_BUILTIN_VEC_LDL, ALTIVEC_BUILTIN_LVXL,
+  { ALTIVEC_BUILTIN_VEC_LDL, ALTIVEC_BUILTIN_LVXL_V16QI,
     RS6000_BTI_unsigned_V16QI, RS6000_BTI_INTSI,
     ~RS6000_BTI_unsigned_V16QI, 0 },
-  { ALTIVEC_BUILTIN_VEC_LDL, ALTIVEC_BUILTIN_LVXL,
+  { ALTIVEC_BUILTIN_VEC_LDL, ALTIVEC_BUILTIN_LVXL_V16QI,
     RS6000_BTI_unsigned_V16QI, RS6000_BTI_INTSI, ~RS6000_BTI_UINTQI, 0 },
-  { ALTIVEC_BUILTIN_VEC_LDL, ALTIVEC_BUILTIN_LVXL,
+  { ALTIVEC_BUILTIN_VEC_LDL, ALTIVEC_BUILTIN_LVXL_V2DF,
     RS6000_BTI_V2DF, RS6000_BTI_INTSI, ~RS6000_BTI_V2DF, 0 },
-  { ALTIVEC_BUILTIN_VEC_LDL, ALTIVEC_BUILTIN_LVXL,
+  { ALTIVEC_BUILTIN_VEC_LDL, ALTIVEC_BUILTIN_LVXL_V2DI,
     RS6000_BTI_V2DI, RS6000_BTI_INTSI, ~RS6000_BTI_V2DI, 0 },
-  { ALTIVEC_BUILTIN_VEC_LDL, ALTIVEC_BUILTIN_LVXL,
+  { ALTIVEC_BUILTIN_VEC_LDL, ALTIVEC_BUILTIN_LVXL_V2DI,
     RS6000_BTI_unsigned_V2DI, RS6000_BTI_INTSI,
     ~RS6000_BTI_unsigned_V2DI, 0 },
-  { ALTIVEC_BUILTIN_VEC_LDL, ALTIVEC_BUILTIN_LVXL,
+  { ALTIVEC_BUILTIN_VEC_LDL, ALTIVEC_BUILTIN_LVXL_V2DI,
     RS6000_BTI_bool_V2DI, RS6000_BTI_INTSI, ~RS6000_BTI_bool_V2DI, 0 },
   { ALTIVEC_BUILTIN_VEC_LVSL, ALTIVEC_BUILTIN_LVSL,
     RS6000_BTI_unsigned_V16QI, RS6000_BTI_INTSI, ~RS6000_BTI_UINTQI, 0 },
@@ -2859,63 +2855,63 @@  const struct altivec_builtin_types altiv
     RS6000_BTI_unsigned_V16QI, RS6000_BTI_unsigned_V16QI, RS6000_BTI_unsigned_V16QI, RS6000_BTI_NOT_OPAQUE },
   { ALTIVEC_BUILTIN_VEC_SLD, ALTIVEC_BUILTIN_VSLDOI_16QI,
     RS6000_BTI_bool_V16QI, RS6000_BTI_bool_V16QI, RS6000_BTI_bool_V16QI, RS6000_BTI_NOT_OPAQUE },
-  { ALTIVEC_BUILTIN_VEC_ST, ALTIVEC_BUILTIN_STVX,
+  { ALTIVEC_BUILTIN_VEC_ST, ALTIVEC_BUILTIN_STVX_V2DF,
     RS6000_BTI_void, RS6000_BTI_V2DF, RS6000_BTI_INTSI, ~RS6000_BTI_V2DF },
-  { ALTIVEC_BUILTIN_VEC_ST, ALTIVEC_BUILTIN_STVX,
+  { ALTIVEC_BUILTIN_VEC_ST, ALTIVEC_BUILTIN_STVX_V2DI,
     RS6000_BTI_void, RS6000_BTI_V2DI, RS6000_BTI_INTSI, ~RS6000_BTI_V2DI },
-  { ALTIVEC_BUILTIN_VEC_ST, ALTIVEC_BUILTIN_STVX,
+  { ALTIVEC_BUILTIN_VEC_ST, ALTIVEC_BUILTIN_STVX_V2DI,
     RS6000_BTI_void, RS6000_BTI_unsigned_V2DI, RS6000_BTI_INTSI,
     ~RS6000_BTI_unsigned_V2DI },
-  { ALTIVEC_BUILTIN_VEC_ST, ALTIVEC_BUILTIN_STVX,
+  { ALTIVEC_BUILTIN_VEC_ST, ALTIVEC_BUILTIN_STVX_V2DI,
     RS6000_BTI_void, RS6000_BTI_bool_V2DI, RS6000_BTI_INTSI,
     ~RS6000_BTI_bool_V2DI },
-  { ALTIVEC_BUILTIN_VEC_ST, ALTIVEC_BUILTIN_STVX,
+  { ALTIVEC_BUILTIN_VEC_ST, ALTIVEC_BUILTIN_STVX_V4SF,
     RS6000_BTI_void, RS6000_BTI_V4SF, RS6000_BTI_INTSI, ~RS6000_BTI_V4SF },
-  { ALTIVEC_BUILTIN_VEC_ST, ALTIVEC_BUILTIN_STVX,
+  { ALTIVEC_BUILTIN_VEC_ST, ALTIVEC_BUILTIN_STVX_V4SF,
     RS6000_BTI_void, RS6000_BTI_V4SF, RS6000_BTI_INTSI, ~RS6000_BTI_float },
-  { ALTIVEC_BUILTIN_VEC_ST, ALTIVEC_BUILTIN_STVX,
+  { ALTIVEC_BUILTIN_VEC_ST, ALTIVEC_BUILTIN_STVX_V4SI,
     RS6000_BTI_void, RS6000_BTI_V4SI, RS6000_BTI_INTSI, ~RS6000_BTI_V4SI },
-  { ALTIVEC_BUILTIN_VEC_ST, ALTIVEC_BUILTIN_STVX,
+  { ALTIVEC_BUILTIN_VEC_ST, ALTIVEC_BUILTIN_STVX_V4SI,
     RS6000_BTI_void, RS6000_BTI_V4SI, RS6000_BTI_INTSI, ~RS6000_BTI_INTSI },
-  { ALTIVEC_BUILTIN_VEC_ST, ALTIVEC_BUILTIN_STVX,
+  { ALTIVEC_BUILTIN_VEC_ST, ALTIVEC_BUILTIN_STVX_V4SI,
     RS6000_BTI_void, RS6000_BTI_unsigned_V4SI, RS6000_BTI_INTSI, ~RS6000_BTI_unsigned_V4SI },
-  { ALTIVEC_BUILTIN_VEC_ST, ALTIVEC_BUILTIN_STVX,
+  { ALTIVEC_BUILTIN_VEC_ST, ALTIVEC_BUILTIN_STVX_V4SI,
     RS6000_BTI_void, RS6000_BTI_unsigned_V4SI, RS6000_BTI_INTSI, ~RS6000_BTI_UINTSI },
-  { ALTIVEC_BUILTIN_VEC_ST, ALTIVEC_BUILTIN_STVX,
+  { ALTIVEC_BUILTIN_VEC_ST, ALTIVEC_BUILTIN_STVX_V4SI,
     RS6000_BTI_void, RS6000_BTI_bool_V4SI, RS6000_BTI_INTSI, ~RS6000_BTI_bool_V4SI },
-  { ALTIVEC_BUILTIN_VEC_ST, ALTIVEC_BUILTIN_STVX,
+  { ALTIVEC_BUILTIN_VEC_ST, ALTIVEC_BUILTIN_STVX_V4SI,
     RS6000_BTI_void, RS6000_BTI_bool_V4SI, RS6000_BTI_INTSI, ~RS6000_BTI_UINTSI },
-  { ALTIVEC_BUILTIN_VEC_ST, ALTIVEC_BUILTIN_STVX,
+  { ALTIVEC_BUILTIN_VEC_ST, ALTIVEC_BUILTIN_STVX_V4SI,
     RS6000_BTI_void, RS6000_BTI_bool_V4SI, RS6000_BTI_INTSI, ~RS6000_BTI_INTSI },
-  { ALTIVEC_BUILTIN_VEC_ST, ALTIVEC_BUILTIN_STVX,
+  { ALTIVEC_BUILTIN_VEC_ST, ALTIVEC_BUILTIN_STVX_V8HI,
     RS6000_BTI_void, RS6000_BTI_V8HI, RS6000_BTI_INTSI, ~RS6000_BTI_V8HI },
-  { ALTIVEC_BUILTIN_VEC_ST, ALTIVEC_BUILTIN_STVX,
+  { ALTIVEC_BUILTIN_VEC_ST, ALTIVEC_BUILTIN_STVX_V8HI,
     RS6000_BTI_void, RS6000_BTI_V8HI, RS6000_BTI_INTSI, ~RS6000_BTI_INTHI },
-  { ALTIVEC_BUILTIN_VEC_ST, ALTIVEC_BUILTIN_STVX,
+  { ALTIVEC_BUILTIN_VEC_ST, ALTIVEC_BUILTIN_STVX_V8HI,
     RS6000_BTI_void, RS6000_BTI_unsigned_V8HI, RS6000_BTI_INTSI, ~RS6000_BTI_unsigned_V8HI },
-  { ALTIVEC_BUILTIN_VEC_ST, ALTIVEC_BUILTIN_STVX,
+  { ALTIVEC_BUILTIN_VEC_ST, ALTIVEC_BUILTIN_STVX_V8HI,
     RS6000_BTI_void, RS6000_BTI_unsigned_V8HI, RS6000_BTI_INTSI, ~RS6000_BTI_UINTHI },
-  { ALTIVEC_BUILTIN_VEC_ST, ALTIVEC_BUILTIN_STVX,
+  { ALTIVEC_BUILTIN_VEC_ST, ALTIVEC_BUILTIN_STVX_V8HI,
     RS6000_BTI_void, RS6000_BTI_bool_V8HI, RS6000_BTI_INTSI, ~RS6000_BTI_bool_V8HI },
-  { ALTIVEC_BUILTIN_VEC_ST, ALTIVEC_BUILTIN_STVX,
+  { ALTIVEC_BUILTIN_VEC_ST, ALTIVEC_BUILTIN_STVX_V8HI,
     RS6000_BTI_void, RS6000_BTI_bool_V8HI, RS6000_BTI_INTSI, ~RS6000_BTI_UINTHI },
-  { ALTIVEC_BUILTIN_VEC_ST, ALTIVEC_BUILTIN_STVX,
+  { ALTIVEC_BUILTIN_VEC_ST, ALTIVEC_BUILTIN_STVX_V8HI,
     RS6000_BTI_void, RS6000_BTI_bool_V8HI, RS6000_BTI_INTSI, ~RS6000_BTI_INTHI },
-  { ALTIVEC_BUILTIN_VEC_ST, ALTIVEC_BUILTIN_STVX,
+  { ALTIVEC_BUILTIN_VEC_ST, ALTIVEC_BUILTIN_STVX_V16QI,
     RS6000_BTI_void, RS6000_BTI_V16QI, RS6000_BTI_INTSI, ~RS6000_BTI_V16QI },
-  { ALTIVEC_BUILTIN_VEC_ST, ALTIVEC_BUILTIN_STVX,
+  { ALTIVEC_BUILTIN_VEC_ST, ALTIVEC_BUILTIN_STVX_V16QI,
     RS6000_BTI_void, RS6000_BTI_V16QI, RS6000_BTI_INTSI, ~RS6000_BTI_INTQI },
-  { ALTIVEC_BUILTIN_VEC_ST, ALTIVEC_BUILTIN_STVX,
+  { ALTIVEC_BUILTIN_VEC_ST, ALTIVEC_BUILTIN_STVX_V16QI,
     RS6000_BTI_void, RS6000_BTI_unsigned_V16QI, RS6000_BTI_INTSI, ~RS6000_BTI_unsigned_V16QI },
-  { ALTIVEC_BUILTIN_VEC_ST, ALTIVEC_BUILTIN_STVX,
+  { ALTIVEC_BUILTIN_VEC_ST, ALTIVEC_BUILTIN_STVX_V16QI,
     RS6000_BTI_void, RS6000_BTI_unsigned_V16QI, RS6000_BTI_INTSI, ~RS6000_BTI_UINTQI },
-  { ALTIVEC_BUILTIN_VEC_ST, ALTIVEC_BUILTIN_STVX,
+  { ALTIVEC_BUILTIN_VEC_ST, ALTIVEC_BUILTIN_STVX_V16QI,
     RS6000_BTI_void, RS6000_BTI_bool_V16QI, RS6000_BTI_INTSI, ~RS6000_BTI_bool_V16QI },
-  { ALTIVEC_BUILTIN_VEC_ST, ALTIVEC_BUILTIN_STVX,
+  { ALTIVEC_BUILTIN_VEC_ST, ALTIVEC_BUILTIN_STVX_V16QI,
     RS6000_BTI_void, RS6000_BTI_bool_V16QI, RS6000_BTI_INTSI, ~RS6000_BTI_UINTQI },
-  { ALTIVEC_BUILTIN_VEC_ST, ALTIVEC_BUILTIN_STVX,
+  { ALTIVEC_BUILTIN_VEC_ST, ALTIVEC_BUILTIN_STVX_V16QI,
     RS6000_BTI_void, RS6000_BTI_bool_V16QI, RS6000_BTI_INTSI, ~RS6000_BTI_INTQI },
-  { ALTIVEC_BUILTIN_VEC_ST, ALTIVEC_BUILTIN_STVX,
+  { ALTIVEC_BUILTIN_VEC_ST, ALTIVEC_BUILTIN_STVX_V8HI,
     RS6000_BTI_void, RS6000_BTI_pixel_V8HI, RS6000_BTI_INTSI, ~RS6000_BTI_pixel_V8HI },
   { ALTIVEC_BUILTIN_VEC_STE, ALTIVEC_BUILTIN_STVEBX,
     RS6000_BTI_void, RS6000_BTI_V16QI, RS6000_BTI_INTSI, ~RS6000_BTI_INTQI },
@@ -2987,64 +2983,64 @@  const struct altivec_builtin_types altiv
     RS6000_BTI_void, RS6000_BTI_V16QI, RS6000_BTI_INTSI, ~RS6000_BTI_void },
   { ALTIVEC_BUILTIN_VEC_STVEBX, ALTIVEC_BUILTIN_STVEBX,
     RS6000_BTI_void, RS6000_BTI_unsigned_V16QI, RS6000_BTI_INTSI, ~RS6000_BTI_void },
-  { ALTIVEC_BUILTIN_VEC_STL, ALTIVEC_BUILTIN_STVXL,
+  { ALTIVEC_BUILTIN_VEC_STL, ALTIVEC_BUILTIN_STVXL_V4SF,
     RS6000_BTI_void, RS6000_BTI_V4SF, RS6000_BTI_INTSI, ~RS6000_BTI_V4SF },
-  { ALTIVEC_BUILTIN_VEC_STL, ALTIVEC_BUILTIN_STVXL,
+  { ALTIVEC_BUILTIN_VEC_STL, ALTIVEC_BUILTIN_STVXL_V4SF,
     RS6000_BTI_void, RS6000_BTI_V4SF, RS6000_BTI_INTSI, ~RS6000_BTI_float },
-  { ALTIVEC_BUILTIN_VEC_STL, ALTIVEC_BUILTIN_STVXL,
+  { ALTIVEC_BUILTIN_VEC_STL, ALTIVEC_BUILTIN_STVXL_V4SI,
     RS6000_BTI_void, RS6000_BTI_V4SI, RS6000_BTI_INTSI, ~RS6000_BTI_V4SI },
-  { ALTIVEC_BUILTIN_VEC_STL, ALTIVEC_BUILTIN_STVXL,
+  { ALTIVEC_BUILTIN_VEC_STL, ALTIVEC_BUILTIN_STVXL_V4SI,
     RS6000_BTI_void, RS6000_BTI_V4SI, RS6000_BTI_INTSI, ~RS6000_BTI_INTSI },
-  { ALTIVEC_BUILTIN_VEC_STL, ALTIVEC_BUILTIN_STVXL,
+  { ALTIVEC_BUILTIN_VEC_STL, ALTIVEC_BUILTIN_STVXL_V4SI,
     RS6000_BTI_void, RS6000_BTI_unsigned_V4SI, RS6000_BTI_INTSI, ~RS6000_BTI_unsigned_V4SI },
-  { ALTIVEC_BUILTIN_VEC_STL, ALTIVEC_BUILTIN_STVXL,
+  { ALTIVEC_BUILTIN_VEC_STL, ALTIVEC_BUILTIN_STVXL_V4SI,
     RS6000_BTI_void, RS6000_BTI_unsigned_V4SI, RS6000_BTI_INTSI, ~RS6000_BTI_UINTSI },
-  { ALTIVEC_BUILTIN_VEC_STL, ALTIVEC_BUILTIN_STVXL,
+  { ALTIVEC_BUILTIN_VEC_STL, ALTIVEC_BUILTIN_STVXL_V4SI,
     RS6000_BTI_void, RS6000_BTI_bool_V4SI, RS6000_BTI_INTSI, ~RS6000_BTI_bool_V4SI },
-  { ALTIVEC_BUILTIN_VEC_STL, ALTIVEC_BUILTIN_STVXL,
+  { ALTIVEC_BUILTIN_VEC_STL, ALTIVEC_BUILTIN_STVXL_V4SI,
     RS6000_BTI_void, RS6000_BTI_bool_V4SI, RS6000_BTI_INTSI, ~RS6000_BTI_UINTSI },
-  { ALTIVEC_BUILTIN_VEC_STL, ALTIVEC_BUILTIN_STVXL,
+  { ALTIVEC_BUILTIN_VEC_STL, ALTIVEC_BUILTIN_STVXL_V4SI,
     RS6000_BTI_void, RS6000_BTI_bool_V4SI, RS6000_BTI_INTSI, ~RS6000_BTI_INTSI },
-  { ALTIVEC_BUILTIN_VEC_STL, ALTIVEC_BUILTIN_STVXL,
+  { ALTIVEC_BUILTIN_VEC_STL, ALTIVEC_BUILTIN_STVXL_V8HI,
     RS6000_BTI_void, RS6000_BTI_V8HI, RS6000_BTI_INTSI, ~RS6000_BTI_V8HI },
-  { ALTIVEC_BUILTIN_VEC_STL, ALTIVEC_BUILTIN_STVXL,
+  { ALTIVEC_BUILTIN_VEC_STL, ALTIVEC_BUILTIN_STVXL_V8HI,
     RS6000_BTI_void, RS6000_BTI_V8HI, RS6000_BTI_INTSI, ~RS6000_BTI_INTHI },
-  { ALTIVEC_BUILTIN_VEC_STL, ALTIVEC_BUILTIN_STVXL,
+  { ALTIVEC_BUILTIN_VEC_STL, ALTIVEC_BUILTIN_STVXL_V8HI,
     RS6000_BTI_void, RS6000_BTI_unsigned_V8HI, RS6000_BTI_INTSI, ~RS6000_BTI_unsigned_V8HI },
-  { ALTIVEC_BUILTIN_VEC_STL, ALTIVEC_BUILTIN_STVXL,
+  { ALTIVEC_BUILTIN_VEC_STL, ALTIVEC_BUILTIN_STVXL_V8HI,
     RS6000_BTI_void, RS6000_BTI_unsigned_V8HI, RS6000_BTI_INTSI, ~RS6000_BTI_UINTHI },
-  { ALTIVEC_BUILTIN_VEC_STL, ALTIVEC_BUILTIN_STVXL,
+  { ALTIVEC_BUILTIN_VEC_STL, ALTIVEC_BUILTIN_STVXL_V8HI,
     RS6000_BTI_void, RS6000_BTI_bool_V8HI, RS6000_BTI_INTSI, ~RS6000_BTI_bool_V8HI },
-  { ALTIVEC_BUILTIN_VEC_STL, ALTIVEC_BUILTIN_STVXL,
+  { ALTIVEC_BUILTIN_VEC_STL, ALTIVEC_BUILTIN_STVXL_V8HI,
     RS6000_BTI_void, RS6000_BTI_bool_V8HI, RS6000_BTI_INTSI, ~RS6000_BTI_UINTHI },
-  { ALTIVEC_BUILTIN_VEC_STL, ALTIVEC_BUILTIN_STVXL,
+  { ALTIVEC_BUILTIN_VEC_STL, ALTIVEC_BUILTIN_STVXL_V8HI,
     RS6000_BTI_void, RS6000_BTI_bool_V8HI, RS6000_BTI_INTSI, ~RS6000_BTI_INTHI },
-  { ALTIVEC_BUILTIN_VEC_STL, ALTIVEC_BUILTIN_STVXL,
+  { ALTIVEC_BUILTIN_VEC_STL, ALTIVEC_BUILTIN_STVXL_V16QI,
     RS6000_BTI_void, RS6000_BTI_V16QI, RS6000_BTI_INTSI, ~RS6000_BTI_V16QI },
-  { ALTIVEC_BUILTIN_VEC_STL, ALTIVEC_BUILTIN_STVXL,
+  { ALTIVEC_BUILTIN_VEC_STL, ALTIVEC_BUILTIN_STVXL_V16QI,
     RS6000_BTI_void, RS6000_BTI_V16QI, RS6000_BTI_INTSI, ~RS6000_BTI_INTQI },
-  { ALTIVEC_BUILTIN_VEC_STL, ALTIVEC_BUILTIN_STVXL,
+  { ALTIVEC_BUILTIN_VEC_STL, ALTIVEC_BUILTIN_STVXL_V16QI,
     RS6000_BTI_void, RS6000_BTI_unsigned_V16QI, RS6000_BTI_INTSI, ~RS6000_BTI_unsigned_V16QI },
-  { ALTIVEC_BUILTIN_VEC_STL, ALTIVEC_BUILTIN_STVXL,
+  { ALTIVEC_BUILTIN_VEC_STL, ALTIVEC_BUILTIN_STVXL_V16QI,
     RS6000_BTI_void, RS6000_BTI_unsigned_V16QI, RS6000_BTI_INTSI, ~RS6000_BTI_UINTQI },
-  { ALTIVEC_BUILTIN_VEC_STL, ALTIVEC_BUILTIN_STVXL,
+  { ALTIVEC_BUILTIN_VEC_STL, ALTIVEC_BUILTIN_STVXL_V16QI,
     RS6000_BTI_void, RS6000_BTI_bool_V16QI, RS6000_BTI_INTSI, ~RS6000_BTI_bool_V16QI },
-  { ALTIVEC_BUILTIN_VEC_STL, ALTIVEC_BUILTIN_STVXL,
+  { ALTIVEC_BUILTIN_VEC_STL, ALTIVEC_BUILTIN_STVXL_V16QI,
     RS6000_BTI_void, RS6000_BTI_bool_V16QI, RS6000_BTI_INTSI, ~RS6000_BTI_UINTQI },
-  { ALTIVEC_BUILTIN_VEC_STL, ALTIVEC_BUILTIN_STVXL,
+  { ALTIVEC_BUILTIN_VEC_STL, ALTIVEC_BUILTIN_STVXL_V16QI,
     RS6000_BTI_void, RS6000_BTI_bool_V16QI, RS6000_BTI_INTSI, ~RS6000_BTI_INTQI },
-  { ALTIVEC_BUILTIN_VEC_STL, ALTIVEC_BUILTIN_STVXL,
+  { ALTIVEC_BUILTIN_VEC_STL, ALTIVEC_BUILTIN_STVXL_V8HI,
     RS6000_BTI_void, RS6000_BTI_pixel_V8HI, RS6000_BTI_INTSI, ~RS6000_BTI_pixel_V8HI },
-  { ALTIVEC_BUILTIN_VEC_STL, ALTIVEC_BUILTIN_STVXL,
+  { ALTIVEC_BUILTIN_VEC_STL, ALTIVEC_BUILTIN_STVXL_V2DF,
     RS6000_BTI_void, RS6000_BTI_V2DF, RS6000_BTI_INTSI, ~RS6000_BTI_V2DF },
-  { ALTIVEC_BUILTIN_VEC_STL, ALTIVEC_BUILTIN_STVXL,
+  { ALTIVEC_BUILTIN_VEC_STL, ALTIVEC_BUILTIN_STVXL_V2DF,
     RS6000_BTI_void, RS6000_BTI_V2DF, RS6000_BTI_INTSI, ~RS6000_BTI_double },
-  { ALTIVEC_BUILTIN_VEC_STL, ALTIVEC_BUILTIN_STVXL,
+  { ALTIVEC_BUILTIN_VEC_STL, ALTIVEC_BUILTIN_STVXL_V2DI,
     RS6000_BTI_void, RS6000_BTI_V2DI, RS6000_BTI_INTSI, ~RS6000_BTI_V2DI },
-  { ALTIVEC_BUILTIN_VEC_STL, ALTIVEC_BUILTIN_STVXL,
+  { ALTIVEC_BUILTIN_VEC_STL, ALTIVEC_BUILTIN_STVXL_V2DI,
     RS6000_BTI_void, RS6000_BTI_unsigned_V2DI, RS6000_BTI_INTSI,
     ~RS6000_BTI_unsigned_V2DI },
-  { ALTIVEC_BUILTIN_VEC_STL, ALTIVEC_BUILTIN_STVXL,
+  { ALTIVEC_BUILTIN_VEC_STL, ALTIVEC_BUILTIN_STVXL_V2DI,
     RS6000_BTI_void, RS6000_BTI_bool_V2DI, RS6000_BTI_INTSI,
     ~RS6000_BTI_bool_V2DI },
   { ALTIVEC_BUILTIN_VEC_STVLX, ALTIVEC_BUILTIN_STVLX,
@@ -4178,7 +4174,7 @@  altivec_resolve_overloaded_builtin (loca
 	return build_constructor (type, vec);
     }
 
-  /* For now use pointer tricks to do the extaction, unless we are on VSX
+  /* For now use pointer tricks to do the extraction, unless we are on VSX
      extracting a double from a constant offset.  */
   if (fcode == ALTIVEC_BUILTIN_VEC_EXTRACT)
     {
@@ -4206,6 +4202,17 @@  altivec_resolve_overloaded_builtin (loca
       if (!INTEGRAL_TYPE_P (TREE_TYPE (arg2)))
 	goto bad; 
 
+      /* If we are targeting little-endian, but -maltivec=be has been
+	 specified to override the element order, adjust the element
+	 number accordingly.  */
+      if (!BYTES_BIG_ENDIAN && rs6000_altivec_element_order == 2)
+	{
+	  unsigned int last_elem = TYPE_VECTOR_SUBPARTS (arg1_type) - 1;
+	  arg2 = fold_build2_loc (loc, MINUS_EXPR, TREE_TYPE (arg2),
+				  build_int_cstu (TREE_TYPE (arg2), last_elem),
+				  arg2);
+	}
+
       /* If we can use the VSX xxpermdi instruction, use that for extract.  */
       mode = TYPE_MODE (arg1_type);
       if ((mode == V2DFmode || mode == V2DImode) && VECTOR_MEM_VSX_P (mode)
@@ -4253,7 +4260,7 @@  altivec_resolve_overloaded_builtin (loca
       return stmt;
     }
 
-  /* For now use pointer tricks to do the insertation, unless we are on VSX
+  /* For now use pointer tricks to do the insertion, unless we are on VSX
      inserting a double to a constant offset..  */
   if (fcode == ALTIVEC_BUILTIN_VEC_INSERT)
     {
@@ -4283,6 +4290,17 @@  altivec_resolve_overloaded_builtin (loca
       if (!INTEGRAL_TYPE_P (TREE_TYPE (arg2)))
 	goto bad; 
 
+      /* If we are targeting little-endian, but -maltivec=be has been
+	 specified to override the element order, adjust the element
+	 number accordingly.  */
+      if (!BYTES_BIG_ENDIAN && rs6000_altivec_element_order == 2)
+	{
+	  unsigned int last_elem = TYPE_VECTOR_SUBPARTS (arg1_type) - 1;
+	  arg2 = fold_build2_loc (loc, MINUS_EXPR, TREE_TYPE (arg2),
+				  build_int_cstu (TREE_TYPE (arg2), last_elem),
+				  arg2);
+	}
+
       /* If we can use the VSX xxpermdi instruction, use that for insert.  */
       mode = TYPE_MODE (arg1_type);
       if ((mode == V2DFmode || mode == V2DImode) && VECTOR_UNIT_VSX_P (mode)
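
The two hunks above apply the same index rewrite for vec_extract and
vec_insert: under -maltivec=be on a little-endian target, a requested
element number n is replaced by (nelems - 1) - n.  A minimal standalone
sketch in plain C (the helper name adjust_element is hypothetical, not
GCC code):

    #include <stdio.h>

    /* Mirrors the last_elem - arg2 folding in the hunks above.  */
    static unsigned int
    adjust_element (unsigned int nelems, unsigned int n)
    {
      return (nelems - 1) - n;
    }

    int
    main (void)
    {
      /* For a 4-element vector, BE element 0 is LE element 3, and so on.  */
      for (unsigned int n = 0; n < 4; n++)
        printf ("BE element %u -> LE element %u\n", n, adjust_element (4, n));
      return 0;
    }
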
Index: gcc-4_8-test/gcc/config/rs6000/rs6000.c
===================================================================
--- gcc-4_8-test.orig/gcc/config/rs6000/rs6000.c
+++ gcc-4_8-test/gcc/config/rs6000/rs6000.c
@@ -3216,6 +3216,18 @@  rs6000_option_override_internal (bool gl
       && !(processor_target_table[tune_index].target_enable & OPTION_MASK_HTM))
     rs6000_isa_flags |= ~rs6000_isa_flags_explicit & OPTION_MASK_STRICT_ALIGN;
 
+  /* -maltivec={le,be} implies -maltivec.  */
+  if (rs6000_altivec_element_order != 0)
+    rs6000_isa_flags |= OPTION_MASK_ALTIVEC;
+
+  /* Disallow -maltivec=le in big endian mode for now.  This is not
+     known to be useful for anyone.  */
+  if (BYTES_BIG_ENDIAN && rs6000_altivec_element_order == 1)
+    {
+      warning (0, N_("-maltivec=le not allowed for big-endian targets"));
+      rs6000_altivec_element_order = 0;
+    }
+
   /* Add some warnings for VSX.  */
   if (TARGET_VSX)
     {
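
As a standalone model of the option handling just added (names here are
hypothetical; rs6000_altivec_element_order is 0 by default, 1 for
-maltivec=le, and 2 for -maltivec=be):

    #include <stdio.h>

    enum elt_order { ORDER_DEFAULT = 0, ORDER_LE = 1, ORDER_BE = 2 };

    static enum elt_order
    resolve_order (int bytes_big_endian, enum elt_order requested,
                   int *altivec_on)
    {
      if (requested != ORDER_DEFAULT)
        *altivec_on = 1;          /* -maltivec={le,be} implies -maltivec */
      if (bytes_big_endian && requested == ORDER_LE)
        return ORDER_DEFAULT;     /* warned about and then ignored */
      return requested;
    }

    int
    main (void)
    {
      int altivec_on = 0;
      enum elt_order o = resolve_order (1, ORDER_LE, &altivec_on);
      printf ("altivec=%d order=%d\n", altivec_on, o); /* altivec=1 order=0 */
      return 0;
    }
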
@@ -4995,7 +5007,7 @@  vspltis_constant (rtx op, unsigned step,
 
   val = const_vector_elt_as_int (op, BYTES_BIG_ENDIAN ? nunits - 1 : 0);
   splat_val = val;
-  msb_val = val > 0 ? 0 : -1;
+  msb_val = val >= 0 ? 0 : -1;
 
   /* Construct the value to be splatted, if possible.  If not, return 0.  */
   for (i = 2; i <= copies; i *= 2)
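
The vspltis_constant change above is a correctness fix: msb_val stands
for the bits that sign-extending val would produce, and those are 0 for
val == 0, where the old val > 0 test wrongly yielded -1.  A quick
standalone check:

    #include <stdio.h>

    int
    main (void)
    {
      long vals[] = { -5, -1, 0, 1, 5 };
      for (int i = 0; i < 5; i++)
        {
          long val = vals[i];
          long msb_val = val >= 0 ? 0 : -1;   /* corrected predicate */
          printf ("val=%ld -> msb_val=%ld\n", val, msb_val);
        }
      return 0;
    }
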
@@ -5460,7 +5472,7 @@  rs6000_expand_vector_init (rtx target, r
 		      : gen_vsx_xscvdpsp_scalar (freg, sreg));
 
 	  emit_insn (cvt);
-	  emit_insn (gen_vsx_xxspltw_v4sf (target, freg, const0_rtx));
+	  emit_insn (gen_vsx_xxspltw_v4sf_direct (target, freg, const0_rtx));
 	}
       else
 	{
@@ -5486,7 +5498,6 @@  rs6000_expand_vector_init (rtx target, r
      of 64-bit items is not supported on Altivec.  */
   if (all_same && GET_MODE_SIZE (inner_mode) <= 4)
     {
-      rtx field;
       mem = assign_stack_temp (mode, GET_MODE_SIZE (inner_mode));
       emit_move_insn (adjust_address_nv (mem, inner_mode, 0),
 		      XVECEXP (vals, 0, 0));
@@ -5497,11 +5508,9 @@  rs6000_expand_vector_init (rtx target, r
 					      gen_rtx_SET (VOIDmode,
 							   target, mem),
 					      x)));
-      field = (BYTES_BIG_ENDIAN ? const0_rtx
-	       : GEN_INT (GET_MODE_NUNITS (mode) - 1));
       x = gen_rtx_VEC_SELECT (inner_mode, target,
 			      gen_rtx_PARALLEL (VOIDmode,
-						gen_rtvec (1, field)));
+						gen_rtvec (1, const0_rtx)));
       emit_insn (gen_rtx_SET (VOIDmode, target,
 			      gen_rtx_VEC_DUPLICATE (mode, x)));
       return;
@@ -8000,7 +8009,7 @@  rs6000_emit_le_vsx_move (rtx dest, rtx s
 
   if (MEM_P (source))
     {
-      gcc_assert (REG_P (dest));
+      gcc_assert (REG_P (dest) || GET_CODE (dest) == SUBREG);
       rs6000_emit_le_vsx_load (dest, source, mode);
     }
   else
@@ -11730,6 +11739,100 @@  paired_expand_lv_builtin (enum insn_code
   return target;
 }
 
+/* Return a constant vector for use as a little-endian permute control vector
+   to reverse the order of elements of the given vector mode.  */
+static rtx
+swap_selector_for_mode (enum machine_mode mode)
+{
+  /* These are little endian vectors, so their elements are reversed
+     from what you would normally expect for a permute control vector.  */
+  unsigned int swap2[16] = {7,6,5,4,3,2,1,0,15,14,13,12,11,10,9,8};
+  unsigned int swap4[16] = {3,2,1,0,7,6,5,4,11,10,9,8,15,14,13,12};
+  unsigned int swap8[16] = {1,0,3,2,5,4,7,6,9,8,11,10,13,12,15,14};
+  unsigned int swap16[16] = {0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15};
+  unsigned int *swaparray, i;
+  rtx perm[16];
+
+  switch (mode)
+    {
+    case V2DFmode:
+    case V2DImode:
+      swaparray = swap2;
+      break;
+    case V4SFmode:
+    case V4SImode:
+      swaparray = swap4;
+      break;
+    case V8HImode:
+      swaparray = swap8;
+      break;
+    case V16QImode:
+      swaparray = swap16;
+      break;
+    default:
+      gcc_unreachable ();
+    }
+
+  for (i = 0; i < 16; ++i)
+    perm[i] = GEN_INT (swaparray[i]);
+
+  return force_reg (V16QImode, gen_rtx_CONST_VECTOR (V16QImode, gen_rtvec_v (16, perm)));
+}
+
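The swapN tables follow a single rule.  Below is a standalone sketch
(plain C, not GCC code) that computes the big-endian-numbered selector
reversing N-byte elements; because the tables above are themselves
little-endian vectors, each swapN array is this selector stored
byte-reversed (esize 1 gives swap16, 2 gives swap8, 4 gives swap4, and
8 gives swap2):

    #include <stdio.h>

    int
    main (void)
    {
      unsigned int esize = 4;   /* bytes per element: 1, 2, 4, or 8 */
      unsigned int sel[16];
      for (unsigned int i = 0; i < 16; i++)
        {
          unsigned int elt = i / esize;
          unsigned int byte = i % esize;
          sel[i] = (16 / esize - 1 - elt) * esize + byte;
        }
      /* For esize == 4: 12 13 14 15 8 9 10 11 4 5 6 7 0 1 2 3;
         reading this array backwards gives swap4 above.  */
      for (unsigned int i = 0; i < 16; i++)
        printf ("%u ", sel[i]);
      printf ("\n");
      return 0;
    }
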
+/* Generate code for an "lvx", "lvxl", or "lve*x" built-in for a little endian target
+   with -maltivec=be specified.  Issue the load followed by an element-reversing
+   permute.  */
+void
+altivec_expand_lvx_be (rtx op0, rtx op1, enum machine_mode mode, unsigned unspec)
+{
+  rtx tmp = gen_reg_rtx (mode);
+  rtx load = gen_rtx_SET (VOIDmode, tmp, op1);
+  rtx lvx = gen_rtx_UNSPEC (mode, gen_rtvec (1, const0_rtx), unspec);
+  rtx par = gen_rtx_PARALLEL (mode, gen_rtvec (2, load, lvx));
+  rtx sel = swap_selector_for_mode (mode);
+  rtx vperm = gen_rtx_UNSPEC (mode, gen_rtvec (3, tmp, tmp, sel), UNSPEC_VPERM);
+
+  gcc_assert (REG_P (op0));
+  emit_insn (par);
+  emit_insn (gen_rtx_SET (VOIDmode, op0, vperm));
+}
+
+/* Generate code for a "stvx" or "stvxl" built-in for a little endian target
+   with -maltivec=be specified.  Issue the store preceded by an element-reversing
+   permute.  */
+void
+altivec_expand_stvx_be (rtx op0, rtx op1, enum machine_mode mode, unsigned unspec)
+{
+  rtx tmp = gen_reg_rtx (mode);
+  rtx store = gen_rtx_SET (VOIDmode, op0, tmp);
+  rtx stvx = gen_rtx_UNSPEC (mode, gen_rtvec (1, const0_rtx), unspec);
+  rtx par = gen_rtx_PARALLEL (mode, gen_rtvec (2, store, stvx));
+  rtx sel = swap_selector_for_mode (mode);
+  rtx vperm;
+
+  gcc_assert (REG_P (op1));
+  vperm = gen_rtx_UNSPEC (mode, gen_rtvec (3, op1, op1, sel), UNSPEC_VPERM);
+  emit_insn (gen_rtx_SET (VOIDmode, tmp, vperm));
+  emit_insn (par);
+}
+
+/* Generate code for a "stve*x" built-in for a little endian target with -maltivec=be
+   specified.  Issue the store preceded by an element-reversing permute.  */
+void
+altivec_expand_stvex_be (rtx op0, rtx op1, enum machine_mode mode, unsigned unspec)
+{
+  enum machine_mode inner_mode = GET_MODE_INNER (mode);
+  rtx tmp = gen_reg_rtx (mode);
+  rtx stvx = gen_rtx_UNSPEC (inner_mode, gen_rtvec (1, tmp), unspec);
+  rtx sel = swap_selector_for_mode (mode);
+  rtx vperm;
+
+  gcc_assert (REG_P (op1));
+  vperm = gen_rtx_UNSPEC (mode, gen_rtvec (3, op1, op1, sel), UNSPEC_VPERM);
+  emit_insn (gen_rtx_SET (VOIDmode, tmp, vperm));
+  emit_insn (gen_rtx_SET (VOIDmode, op0, stvx));
+}
+
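A hypothetical source-level view of what these helpers arrange (assumes
a POWER target with AltiVec; the array contents are illustrative).
Because the load is followed by an element-reversing permute, a memory
round-trip behaves the same under -maltivec=be on little endian as on a
big-endian build:

    #include <altivec.h>
    #include <stdio.h>

    int
    main (void)
    {
      int a[4] __attribute__ ((aligned (16))) = { 10, 20, 30, 40 };
      vector signed int v = vec_ld (0, a);
      /* Element 0 is interpreted in big-endian order under -maltivec=be,
         but the swapped load compensates: expected to print 10 on BE,
         on LE, and on LE with -maltivec=be.  */
      printf ("%d\n", vec_extract (v, 0));
      return 0;
    }
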
 static rtx
 altivec_expand_lv_builtin (enum insn_code icode, tree exp, rtx target, bool blk)
 {
@@ -12522,16 +12625,38 @@  altivec_expand_builtin (tree exp, rtx ta
 
   switch (fcode)
     {
+    case ALTIVEC_BUILTIN_STVX_V2DF:
+      return altivec_expand_stv_builtin (CODE_FOR_altivec_stvx_v2df, exp);
+    case ALTIVEC_BUILTIN_STVX_V2DI:
+      return altivec_expand_stv_builtin (CODE_FOR_altivec_stvx_v2di, exp);
+    case ALTIVEC_BUILTIN_STVX_V4SF:
+      return altivec_expand_stv_builtin (CODE_FOR_altivec_stvx_v4sf, exp);
     case ALTIVEC_BUILTIN_STVX:
+    case ALTIVEC_BUILTIN_STVX_V4SI:
       return altivec_expand_stv_builtin (CODE_FOR_altivec_stvx_v4si, exp);
+    case ALTIVEC_BUILTIN_STVX_V8HI:
+      return altivec_expand_stv_builtin (CODE_FOR_altivec_stvx_v8hi, exp);
+    case ALTIVEC_BUILTIN_STVX_V16QI:
+      return altivec_expand_stv_builtin (CODE_FOR_altivec_stvx_v16qi, exp);
     case ALTIVEC_BUILTIN_STVEBX:
       return altivec_expand_stv_builtin (CODE_FOR_altivec_stvebx, exp);
     case ALTIVEC_BUILTIN_STVEHX:
       return altivec_expand_stv_builtin (CODE_FOR_altivec_stvehx, exp);
     case ALTIVEC_BUILTIN_STVEWX:
       return altivec_expand_stv_builtin (CODE_FOR_altivec_stvewx, exp);
+    case ALTIVEC_BUILTIN_STVXL_V2DF:
+      return altivec_expand_stv_builtin (CODE_FOR_altivec_stvxl_v2df, exp);
+    case ALTIVEC_BUILTIN_STVXL_V2DI:
+      return altivec_expand_stv_builtin (CODE_FOR_altivec_stvxl_v2di, exp);
+    case ALTIVEC_BUILTIN_STVXL_V4SF:
+      return altivec_expand_stv_builtin (CODE_FOR_altivec_stvxl_v4sf, exp);
     case ALTIVEC_BUILTIN_STVXL:
-      return altivec_expand_stv_builtin (CODE_FOR_altivec_stvxl, exp);
+    case ALTIVEC_BUILTIN_STVXL_V4SI:
+      return altivec_expand_stv_builtin (CODE_FOR_altivec_stvxl_v4si, exp);
+    case ALTIVEC_BUILTIN_STVXL_V8HI:
+      return altivec_expand_stv_builtin (CODE_FOR_altivec_stvxl_v8hi, exp);
+    case ALTIVEC_BUILTIN_STVXL_V16QI:
+      return altivec_expand_stv_builtin (CODE_FOR_altivec_stvxl_v16qi, exp);
 
     case ALTIVEC_BUILTIN_STVLX:
       return altivec_expand_stv_builtin (CODE_FOR_altivec_stvlx, exp);
@@ -12675,12 +12800,44 @@  altivec_expand_builtin (tree exp, rtx ta
     case ALTIVEC_BUILTIN_LVEWX:
       return altivec_expand_lv_builtin (CODE_FOR_altivec_lvewx,
 					exp, target, false);
+    case ALTIVEC_BUILTIN_LVXL_V2DF:
+      return altivec_expand_lv_builtin (CODE_FOR_altivec_lvxl_v2df,
+					exp, target, false);
+    case ALTIVEC_BUILTIN_LVXL_V2DI:
+      return altivec_expand_lv_builtin (CODE_FOR_altivec_lvxl_v2di,
+					exp, target, false);
+    case ALTIVEC_BUILTIN_LVXL_V4SF:
+      return altivec_expand_lv_builtin (CODE_FOR_altivec_lvxl_v4sf,
+					exp, target, false);
     case ALTIVEC_BUILTIN_LVXL:
-      return altivec_expand_lv_builtin (CODE_FOR_altivec_lvxl,
+    case ALTIVEC_BUILTIN_LVXL_V4SI:
+      return altivec_expand_lv_builtin (CODE_FOR_altivec_lvxl_v4si,
+					exp, target, false);
+    case ALTIVEC_BUILTIN_LVXL_V8HI:
+      return altivec_expand_lv_builtin (CODE_FOR_altivec_lvxl_v8hi,
+					exp, target, false);
+    case ALTIVEC_BUILTIN_LVXL_V16QI:
+      return altivec_expand_lv_builtin (CODE_FOR_altivec_lvxl_v16qi,
+					exp, target, false);
+    case ALTIVEC_BUILTIN_LVX_V2DF:
+      return altivec_expand_lv_builtin (CODE_FOR_altivec_lvx_v2df,
+					exp, target, false);
+    case ALTIVEC_BUILTIN_LVX_V2DI:
+      return altivec_expand_lv_builtin (CODE_FOR_altivec_lvx_v2di,
+					exp, target, false);
+    case ALTIVEC_BUILTIN_LVX_V4SF:
+      return altivec_expand_lv_builtin (CODE_FOR_altivec_lvx_v4sf,
 					exp, target, false);
     case ALTIVEC_BUILTIN_LVX:
+    case ALTIVEC_BUILTIN_LVX_V4SI:
       return altivec_expand_lv_builtin (CODE_FOR_altivec_lvx_v4si,
 					exp, target, false);
+    case ALTIVEC_BUILTIN_LVX_V8HI:
+      return altivec_expand_lv_builtin (CODE_FOR_altivec_lvx_v8hi,
+					exp, target, false);
+    case ALTIVEC_BUILTIN_LVX_V16QI:
+      return altivec_expand_lv_builtin (CODE_FOR_altivec_lvx_v16qi,
+					exp, target, false);
     case ALTIVEC_BUILTIN_LVLX:
       return altivec_expand_lv_builtin (CODE_FOR_altivec_lvlx,
 					exp, target, true);
@@ -13996,10 +14153,58 @@  altivec_init_builtins (void)
   def_builtin ("__builtin_altivec_lvehx", v8hi_ftype_long_pcvoid, ALTIVEC_BUILTIN_LVEHX);
   def_builtin ("__builtin_altivec_lvewx", v4si_ftype_long_pcvoid, ALTIVEC_BUILTIN_LVEWX);
   def_builtin ("__builtin_altivec_lvxl", v4si_ftype_long_pcvoid, ALTIVEC_BUILTIN_LVXL);
+  def_builtin ("__builtin_altivec_lvxl_v2df", v2df_ftype_long_pcvoid,
+	       ALTIVEC_BUILTIN_LVXL_V2DF);
+  def_builtin ("__builtin_altivec_lvxl_v2di", v2di_ftype_long_pcvoid,
+	       ALTIVEC_BUILTIN_LVXL_V2DI);
+  def_builtin ("__builtin_altivec_lvxl_v4sf", v4sf_ftype_long_pcvoid,
+	       ALTIVEC_BUILTIN_LVXL_V4SF);
+  def_builtin ("__builtin_altivec_lvxl_v4si", v4si_ftype_long_pcvoid,
+	       ALTIVEC_BUILTIN_LVXL_V4SI);
+  def_builtin ("__builtin_altivec_lvxl_v8hi", v8hi_ftype_long_pcvoid,
+	       ALTIVEC_BUILTIN_LVXL_V8HI);
+  def_builtin ("__builtin_altivec_lvxl_v16qi", v16qi_ftype_long_pcvoid,
+	       ALTIVEC_BUILTIN_LVXL_V16QI);
   def_builtin ("__builtin_altivec_lvx", v4si_ftype_long_pcvoid, ALTIVEC_BUILTIN_LVX);
+  def_builtin ("__builtin_altivec_lvx_v2df", v2df_ftype_long_pcvoid,
+	       ALTIVEC_BUILTIN_LVX_V2DF);
+  def_builtin ("__builtin_altivec_lvx_v2di", v2di_ftype_long_pcvoid,
+	       ALTIVEC_BUILTIN_LVX_V2DI);
+  def_builtin ("__builtin_altivec_lvx_v4sf", v4sf_ftype_long_pcvoid,
+	       ALTIVEC_BUILTIN_LVX_V4SF);
+  def_builtin ("__builtin_altivec_lvx_v4si", v4si_ftype_long_pcvoid,
+	       ALTIVEC_BUILTIN_LVX_V4SI);
+  def_builtin ("__builtin_altivec_lvx_v8hi", v8hi_ftype_long_pcvoid,
+	       ALTIVEC_BUILTIN_LVX_V8HI);
+  def_builtin ("__builtin_altivec_lvx_v16qi", v16qi_ftype_long_pcvoid,
+	       ALTIVEC_BUILTIN_LVX_V16QI);
   def_builtin ("__builtin_altivec_stvx", void_ftype_v4si_long_pvoid, ALTIVEC_BUILTIN_STVX);
+  def_builtin ("__builtin_altivec_stvx_v2df", void_ftype_v2df_long_pvoid,
+	       ALTIVEC_BUILTIN_STVX_V2DF);
+  def_builtin ("__builtin_altivec_stvx_v2di", void_ftype_v2di_long_pvoid,
+	       ALTIVEC_BUILTIN_STVX_V2DI);
+  def_builtin ("__builtin_altivec_stvx_v4sf", void_ftype_v4sf_long_pvoid,
+	       ALTIVEC_BUILTIN_STVX_V4SF);
+  def_builtin ("__builtin_altivec_stvx_v4si", void_ftype_v4si_long_pvoid,
+	       ALTIVEC_BUILTIN_STVX_V4SI);
+  def_builtin ("__builtin_altivec_stvx_v8hi", void_ftype_v8hi_long_pvoid,
+	       ALTIVEC_BUILTIN_STVX_V8HI);
+  def_builtin ("__builtin_altivec_stvx_v16qi", void_ftype_v16qi_long_pvoid,
+	       ALTIVEC_BUILTIN_STVX_V16QI);
   def_builtin ("__builtin_altivec_stvewx", void_ftype_v4si_long_pvoid, ALTIVEC_BUILTIN_STVEWX);
   def_builtin ("__builtin_altivec_stvxl", void_ftype_v4si_long_pvoid, ALTIVEC_BUILTIN_STVXL);
+  def_builtin ("__builtin_altivec_stvxl_v2df", void_ftype_v2df_long_pvoid,
+	       ALTIVEC_BUILTIN_STVXL_V2DF);
+  def_builtin ("__builtin_altivec_stvxl_v2di", void_ftype_v2di_long_pvoid,
+	       ALTIVEC_BUILTIN_STVXL_V2DI);
+  def_builtin ("__builtin_altivec_stvxl_v4sf", void_ftype_v4sf_long_pvoid,
+	       ALTIVEC_BUILTIN_STVXL_V4SF);
+  def_builtin ("__builtin_altivec_stvxl_v4si", void_ftype_v4si_long_pvoid,
+	       ALTIVEC_BUILTIN_STVXL_V4SI);
+  def_builtin ("__builtin_altivec_stvxl_v8hi", void_ftype_v8hi_long_pvoid,
+	       ALTIVEC_BUILTIN_STVXL_V8HI);
+  def_builtin ("__builtin_altivec_stvxl_v16qi", void_ftype_v16qi_long_pvoid,
+	       ALTIVEC_BUILTIN_STVXL_V16QI);
   def_builtin ("__builtin_altivec_stvebx", void_ftype_v16qi_long_pvoid, ALTIVEC_BUILTIN_STVEBX);
   def_builtin ("__builtin_altivec_stvehx", void_ftype_v8hi_long_pvoid, ALTIVEC_BUILTIN_STVEHX);
   def_builtin ("__builtin_vec_ld", opaque_ftype_long_pcvoid, ALTIVEC_BUILTIN_VEC_LD);
@@ -29886,16 +30091,18 @@  altivec_expand_vec_perm_le (rtx operands
   rtx op1 = operands[2];
   rtx sel = operands[3];
   rtx tmp = target;
+  rtx splatreg = gen_reg_rtx (V16QImode);
+  enum machine_mode mode = GET_MODE (target);
 
   /* Get everything in regs so the pattern matches.  */
   if (!REG_P (op0))
-    op0 = force_reg (V16QImode, op0);
+    op0 = force_reg (mode, op0);
   if (!REG_P (op1))
-    op1 = force_reg (V16QImode, op1);
+    op1 = force_reg (mode, op1);
   if (!REG_P (sel))
     sel = force_reg (V16QImode, sel);
   if (!REG_P (target))
-    tmp = gen_reg_rtx (V16QImode);
+    tmp = gen_reg_rtx (mode);
 
   /* SEL = splat(31) - SEL.  */
   /* We want to subtract from 31, but we can't vspltisb 31 since
@@ -29903,13 +30110,12 @@  altivec_expand_vec_perm_le (rtx operands
      five bits of the permute control vector elements are used.  */
   splat = gen_rtx_VEC_DUPLICATE (V16QImode,
 				 gen_rtx_CONST_INT (QImode, -1));
-  emit_move_insn (tmp, splat);
-  sel = gen_rtx_MINUS (V16QImode, tmp, sel);
-  emit_move_insn (tmp, sel);
+  emit_move_insn (splatreg, splat);
+  sel = gen_rtx_MINUS (V16QImode, splatreg, sel);
+  emit_move_insn (splatreg, sel);
 
   /* Permute with operands reversed and adjusted selector.  */
-  unspec = gen_rtx_UNSPEC (V16QImode, gen_rtvec (3, op1, op0, tmp),
-			   UNSPEC_VPERM);
+  unspec = gen_rtx_UNSPEC (mode, gen_rtvec (3, op1, op0, splatreg), UNSPEC_VPERM);
 
   /* Copy into target, possibly by way of a register.  */
   if (!REG_P (target))
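
The arithmetic behind the splat-of-minus-one trick in the hunk above,
checked standalone: vspltisb can only encode -16..15, so 31 is out of
range, but -1 (0xFF per byte) substitutes because the hardware uses only
the low five bits of each control byte.

    #include <stdio.h>

    int
    main (void)
    {
      for (unsigned int s = 0; s < 32; s++)
        if (((255 - s) & 31) != 31 - s)
          return 1;
      puts ("(255 - s) mod 32 == 31 - s for all s in [0, 31]");
      return 0;
    }
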
@@ -29933,27 +30139,33 @@  altivec_expand_vec_perm_const (rtx opera
     unsigned char perm[16];
   };
   static const struct altivec_perm_insn patterns[] = {
-    { OPTION_MASK_ALTIVEC, CODE_FOR_altivec_vpkuhum,
+    { OPTION_MASK_ALTIVEC, CODE_FOR_altivec_vpkuhum_direct,
       {  1,  3,  5,  7,  9, 11, 13, 15, 17, 19, 21, 23, 25, 27, 29, 31 } },
-    { OPTION_MASK_ALTIVEC, CODE_FOR_altivec_vpkuwum,
+    { OPTION_MASK_ALTIVEC, CODE_FOR_altivec_vpkuwum_direct,
       {  2,  3,  6,  7, 10, 11, 14, 15, 18, 19, 22, 23, 26, 27, 30, 31 } },
     { OPTION_MASK_ALTIVEC, 
-      BYTES_BIG_ENDIAN ? CODE_FOR_altivec_vmrghb : CODE_FOR_altivec_vmrglb,
+      (BYTES_BIG_ENDIAN ? CODE_FOR_altivec_vmrghb_direct
+       : CODE_FOR_altivec_vmrglb_direct),
       {  0, 16,  1, 17,  2, 18,  3, 19,  4, 20,  5, 21,  6, 22,  7, 23 } },
     { OPTION_MASK_ALTIVEC,
-      BYTES_BIG_ENDIAN ? CODE_FOR_altivec_vmrghh : CODE_FOR_altivec_vmrglh,
+      (BYTES_BIG_ENDIAN ? CODE_FOR_altivec_vmrghh_direct
+       : CODE_FOR_altivec_vmrglh_direct),
       {  0,  1, 16, 17,  2,  3, 18, 19,  4,  5, 20, 21,  6,  7, 22, 23 } },
     { OPTION_MASK_ALTIVEC,
-      BYTES_BIG_ENDIAN ? CODE_FOR_altivec_vmrghw : CODE_FOR_altivec_vmrglw,
+      (BYTES_BIG_ENDIAN ? CODE_FOR_altivec_vmrghw_direct
+       : CODE_FOR_altivec_vmrglw_direct),
       {  0,  1,  2,  3, 16, 17, 18, 19,  4,  5,  6,  7, 20, 21, 22, 23 } },
     { OPTION_MASK_ALTIVEC,
-      BYTES_BIG_ENDIAN ? CODE_FOR_altivec_vmrglb : CODE_FOR_altivec_vmrghb,
+      (BYTES_BIG_ENDIAN ? CODE_FOR_altivec_vmrglb_direct
+       : CODE_FOR_altivec_vmrghb_direct),
       {  8, 24,  9, 25, 10, 26, 11, 27, 12, 28, 13, 29, 14, 30, 15, 31 } },
     { OPTION_MASK_ALTIVEC,
-      BYTES_BIG_ENDIAN ? CODE_FOR_altivec_vmrglh : CODE_FOR_altivec_vmrghh,
+      (BYTES_BIG_ENDIAN ? CODE_FOR_altivec_vmrglh_direct
+       : CODE_FOR_altivec_vmrghh_direct),
       {  8,  9, 24, 25, 10, 11, 26, 27, 12, 13, 28, 29, 14, 15, 30, 31 } },
     { OPTION_MASK_ALTIVEC,
-      BYTES_BIG_ENDIAN ? CODE_FOR_altivec_vmrglw : CODE_FOR_altivec_vmrghw,
+      (BYTES_BIG_ENDIAN ? CODE_FOR_altivec_vmrglw_direct
+       : CODE_FOR_altivec_vmrghw_direct),
       {  8,  9, 10, 11, 24, 25, 26, 27, 12, 13, 14, 15, 28, 29, 30, 31 } },
     { OPTION_MASK_P8_VECTOR, CODE_FOR_p8_vmrgew,
       {  0,  1,  2,  3, 16, 17, 18, 19,  8,  9, 10, 11, 24, 25, 26, 27 } },
@@ -30017,7 +30229,7 @@  altivec_expand_vec_perm_const (rtx opera
 	{
           if (!BYTES_BIG_ENDIAN)
             elt = 15 - elt;
-	  emit_insn (gen_altivec_vspltb (target, op0, GEN_INT (elt)));
+	  emit_insn (gen_altivec_vspltb_direct (target, op0, GEN_INT (elt)));
 	  return true;
 	}
 
@@ -30030,8 +30242,8 @@  altivec_expand_vec_perm_const (rtx opera
 	    {
 	      int field = BYTES_BIG_ENDIAN ? elt / 2 : 7 - elt / 2;
 	      x = gen_reg_rtx (V8HImode);
-	      emit_insn (gen_altivec_vsplth (x, gen_lowpart (V8HImode, op0),
-					     GEN_INT (field)));
+	      emit_insn (gen_altivec_vsplth_direct (x, gen_lowpart (V8HImode, op0),
+						    GEN_INT (field)));
 	      emit_move_insn (target, gen_lowpart (V16QImode, x));
 	      return true;
 	    }
@@ -30049,8 +30261,8 @@  altivec_expand_vec_perm_const (rtx opera
 	    {
 	      int field = BYTES_BIG_ENDIAN ? elt / 4 : 3 - elt / 4;
 	      x = gen_reg_rtx (V4SImode);
-	      emit_insn (gen_altivec_vspltw (x, gen_lowpart (V4SImode, op0),
-					     GEN_INT (field)));
+	      emit_insn (gen_altivec_vspltw_direct (x, gen_lowpart (V4SImode, op0),
+						    GEN_INT (field)));
 	      emit_move_insn (target, gen_lowpart (V16QImode, x));
 	      return true;
 	    }
@@ -30094,14 +30306,14 @@  altivec_expand_vec_perm_const (rtx opera
 	     halfwords (BE numbering) when the even halfwords (LE
 	     numbering) are what we need.  */
 	  if (!BYTES_BIG_ENDIAN
-	      && icode == CODE_FOR_altivec_vpkuwum
+	      && icode == CODE_FOR_altivec_vpkuwum_direct
 	      && ((GET_CODE (op0) == REG
 		   && GET_MODE (op0) != V4SImode)
 		  || (GET_CODE (op0) == SUBREG
 		      && GET_MODE (XEXP (op0, 0)) != V4SImode)))
 	    continue;
 	  if (!BYTES_BIG_ENDIAN
-	      && icode == CODE_FOR_altivec_vpkuhum
+	      && icode == CODE_FOR_altivec_vpkuhum_direct
 	      && ((GET_CODE (op0) == REG
 		   && GET_MODE (op0) != V8HImode)
 		  || (GET_CODE (op0) == SUBREG
@@ -30183,22 +30395,6 @@  rs6000_expand_vec_perm_const_1 (rtx targ
       vmode = GET_MODE (target);
       gcc_assert (GET_MODE_NUNITS (vmode) == 2);
       dmode = mode_for_vector (GET_MODE_INNER (vmode), 4);
-
-      /* For little endian, swap operands and invert/swap selectors
-	 to get the correct xxpermdi.  The operand swap sets up the
-	 inputs as a little endian array.  The selectors are swapped
-	 because they are defined to use big endian ordering.  The
-	 selectors are inverted to get the correct doublewords for
-	 little endian ordering.  */
-      if (!BYTES_BIG_ENDIAN)
-	{
-	  int n;
-	  perm0 = 3 - perm0;
-	  perm1 = 3 - perm1;
-	  n = perm0, perm0 = perm1, perm1 = n;
-	  x = op0, op0 = op1, op1 = x;
-	}
-
       x = gen_rtx_VEC_CONCAT (dmode, op0, op1);
       v = gen_rtvec (2, GEN_INT (perm0), GEN_INT (perm1));
       x = gen_rtx_VEC_SELECT (vmode, x, gen_rtx_PARALLEL (VOIDmode, v));
Index: gcc-4_8-test/gcc/config/rs6000/rs6000.h
===================================================================
--- gcc-4_8-test.orig/gcc/config/rs6000/rs6000.h
+++ gcc-4_8-test/gcc/config/rs6000/rs6000.h
@@ -468,6 +468,15 @@  extern int rs6000_vector_align[];
    ? rs6000_vector_align[(MODE)]					\
    : (int)GET_MODE_BITSIZE ((MODE)))
 
+/* Determine the element order to use for vector instructions.  By
+   default we use big-endian element order when targeting big-endian,
+   and little-endian element order when targeting little-endian.  For
+   programs being ported from BE Power to LE Power, it can sometimes
+   be useful to use big-endian element order when targeting little-endian.
+   This is set via -maltivec=be, for example.  */
+#define VECTOR_ELT_ORDER_BIG                                  \
+  (BYTES_BIG_ENDIAN || (rs6000_altivec_element_order == 2))
+
 /* Alignment options for fields in structures for sub-targets following
    AIX-like ABI.
    ALIGN_POWER word-aligns FP doubles (default AIX ABI).
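
A standalone model of the new macro (hypothetical names, same truth
table):

    #include <stdio.h>

    /* Order is 0 by default, 1 for -maltivec=le, 2 for -maltivec=be.  */
    static int
    vector_elt_order_big (int bytes_big_endian, int order)
    {
      return bytes_big_endian || order == 2;
    }

    int
    main (void)
    {
      for (int be = 0; be <= 1; be++)
        for (int order = 0; order <= 2; order++)
          printf ("big_endian=%d order=%d -> big element order=%d\n",
                  be, order, vector_elt_order_big (be, order));
      return 0;
    }
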
Index: gcc-4_8-test/gcc/config/rs6000/rs6000.opt
===================================================================
--- gcc-4_8-test.orig/gcc/config/rs6000/rs6000.opt
+++ gcc-4_8-test/gcc/config/rs6000/rs6000.opt
@@ -137,6 +137,14 @@  maltivec
 Target Report Mask(ALTIVEC) Var(rs6000_isa_flags)
 Use AltiVec instructions
 
+maltivec=le
+Target Report RejectNegative Var(rs6000_altivec_element_order, 1) Save
+Generate AltiVec instructions using little-endian element order
+
+maltivec=be
+Target Report RejectNegative Var(rs6000_altivec_element_order, 2)
+Generate AltiVec instructions using big-endian element order
+
 mhard-dfp
 Target Report Mask(DFP) Var(rs6000_isa_flags)
 Use decimal floating point instructions
Index: gcc-4_8-test/gcc/doc/invoke.texi
===================================================================
--- gcc-4_8-test.orig/gcc/doc/invoke.texi
+++ gcc-4_8-test/gcc/doc/invoke.texi
@@ -17297,6 +17297,38 @@  the AltiVec instruction set.  You may al
 @option{-mabi=altivec} to adjust the current ABI with AltiVec ABI
 enhancements.
 
+When @option{-maltivec} is used, rather than @option{-maltivec=le} or
+@option{-maltivec=be}, the element order for AltiVec intrinsics such
+as @code{vec_splat}, @code{vec_extract}, and @code{vec_insert}
+matches array element order corresponding to the endianness of the
+target.  That is, element zero identifies the leftmost element in a
+vector register when targeting a big-endian platform, and the
+rightmost element in a vector register when targeting a
+little-endian platform.
+
+@item -maltivec=be
+@opindex maltivec=be
+Generate AltiVec instructions using big-endian element order,
+regardless of whether the target is big- or little-endian.  This is
+the default when targeting a big-endian platform.
+
+The element order is used to interpret element numbers in AltiVec
+intrinsics such as @code{vec_splat}, @code{vec_extract}, and
+@code{vec_insert}.  By default, these match array element order
+corresponding to the endianness of the target.
+
+@item -maltivec=le
+@opindex maltivec=le
+Generate AltiVec instructions using little-endian element order,
+regardless of whether the target is big- or little-endian.  This is
+the default when targeting a little-endian platform.  This option is
+currently ignored when targeting a big-endian platform.
+
+The element order is used to interpret element numbers in AltiVec
+intrinsics such as @code{vec_splat}, @code{vec_extract}, and
+@code{vec_insert}.  By default, these match array element order
+corresponding to the endianness of the target.
+
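As a hypothetical smoke test of the numbering rules above (POWER with
AltiVec assumed): because loads and stores are element-reversed under
@option{-maltivec=be}, memory-order code ported unchanged from a
big-endian build keeps its meaning:

    #include <altivec.h>
    #include <stdio.h>

    int
    main (void)
    {
      int a[4] __attribute__ ((aligned (16))) = { 10, 20, 30, 40 };
      vector signed int v = vec_ld (0, a);
      v = vec_insert (99, v, 0);   /* element 0 per the selected order */
      vec_st (v, 0, a);
      /* Expected to print 99 under -maltivec and under -maltivec=be,
         matching a big-endian build.  */
      printf ("%d\n", a[0]);
      return 0;
    }
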
 @item -mvrsave
 @itemx -mno-vrsave
 @opindex mvrsave
Index: gcc-4_8-test/gcc/config/rs6000/altivec.md
===================================================================
--- gcc-4_8-test.orig/gcc/config/rs6000/altivec.md
+++ gcc-4_8-test/gcc/config/rs6000/altivec.md
@@ -46,6 +46,7 @@ 
    UNSPEC_VPACK_SIGN_UNS_SAT
    UNSPEC_VPACK_UNS_UNS_SAT
    UNSPEC_VPACK_UNS_UNS_MOD
+   UNSPEC_VPACK_UNS_UNS_MOD_DIRECT
    UNSPEC_VSLV4SI
    UNSPEC_VSLO
    UNSPEC_VSR
@@ -69,6 +70,8 @@ 
    UNSPEC_VLSDOI
    UNSPEC_VUNPACK_HI_SIGN
    UNSPEC_VUNPACK_LO_SIGN
+   UNSPEC_VUNPACK_HI_SIGN_DIRECT
+   UNSPEC_VUNPACK_LO_SIGN_DIRECT
    UNSPEC_VUPKHPX
    UNSPEC_VUPKLPX
    UNSPEC_DST
@@ -129,6 +132,10 @@ 
    UNSPEC_VUPKHU_V4SF
    UNSPEC_VUPKLU_V4SF
    UNSPEC_VGBBD
+   UNSPEC_VMRGH_DIRECT
+   UNSPEC_VMRGL_DIRECT
+   UNSPEC_VSPLT_DIRECT
+   UNSPEC_VSUMSWS_DIRECT
 ])
 
 (define_c_enum "unspecv"
@@ -673,20 +680,21 @@ 
    rtx high = gen_reg_rtx (V4SImode);
    rtx low = gen_reg_rtx (V4SImode);
 
-   emit_insn (gen_vec_widen_smult_even_v8hi (even, operands[1], operands[2]));
-   emit_insn (gen_vec_widen_smult_odd_v8hi (odd, operands[1], operands[2]));
-
    if (BYTES_BIG_ENDIAN)
      {
-       emit_insn (gen_altivec_vmrghw (high, even, odd));
-       emit_insn (gen_altivec_vmrglw (low, even, odd));
-       emit_insn (gen_altivec_vpkuwum (operands[0], high, low));
+       emit_insn (gen_altivec_vmulesh (even, operands[1], operands[2]));
+       emit_insn (gen_altivec_vmulosh (odd, operands[1], operands[2]));
+       emit_insn (gen_altivec_vmrghw_direct (high, even, odd));
+       emit_insn (gen_altivec_vmrglw_direct (low, even, odd));
+       emit_insn (gen_altivec_vpkuwum_direct (operands[0], high, low));
      }
    else
      {
-       emit_insn (gen_altivec_vmrghw (high, odd, even));
-       emit_insn (gen_altivec_vmrglw (low, odd, even));
-       emit_insn (gen_altivec_vpkuwum (operands[0], low, high));
+       emit_insn (gen_altivec_vmulosh (even, operands[1], operands[2]));
+       emit_insn (gen_altivec_vmulesh (odd, operands[1], operands[2]));
+       emit_insn (gen_altivec_vmrghw_direct (high, odd, even));
+       emit_insn (gen_altivec_vmrglw_direct (low, odd, even));
+       emit_insn (gen_altivec_vpkuwum_direct (operands[0], low, high));
      } 
 
    DONE;
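
The rewritten expansion computes all eight 16x16->32-bit products with
the even/odd multiplies, interleaves them with the merge-direct
patterns, and keeps the low halves via the modulo pack; on LE the
even/odd roles and the merge and pack operands flip so the result stays
lane-correct.  What the whole sequence computes, as a standalone model
(not the instruction sequence itself):

    #include <stdio.h>
    #include <stdint.h>

    int
    main (void)
    {
      int16_t a[8] = { 1, -2, 3, -4, 5, -6, 7, -8 };
      int16_t b[8] = { 9, 10, 11, 12, 13, 14, 15, 16 };
      for (int i = 0; i < 8; i++)
        {
          int32_t full = (int32_t) a[i] * b[i];   /* vmulesh/vmulosh lane */
          printf ("%d ", (int16_t) full);         /* vpkuwum keeps low half */
        }
      printf ("\n");
      return 0;
    }
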
@@ -838,9 +846,41 @@ 
   "vmladduhm %0,%1,%2,%3"
   [(set_attr "type" "veccomplex")])
 
-(define_insn "altivec_vmrghb"
+(define_expand "altivec_vmrghb"
+  [(use (match_operand:V16QI 0 "register_operand" ""))
+   (use (match_operand:V16QI 1 "register_operand" ""))
+   (use (match_operand:V16QI 2 "register_operand" ""))]
+  "TARGET_ALTIVEC"
+{
+  rtvec v;
+  rtx x;
+
+  /* Special handling for LE with -maltivec=be.  */
+  if (!BYTES_BIG_ENDIAN && VECTOR_ELT_ORDER_BIG)
+    {
+      v = gen_rtvec (16, GEN_INT (8), GEN_INT (24), GEN_INT (9), GEN_INT (25),
+                     GEN_INT (10), GEN_INT (26), GEN_INT (11), GEN_INT (27),
+		     GEN_INT (12), GEN_INT (28), GEN_INT (13), GEN_INT (29),
+		     GEN_INT (14), GEN_INT (30), GEN_INT (15), GEN_INT (31));
+      x = gen_rtx_VEC_CONCAT (V32QImode, operands[2], operands[1]);
+    }
+  else
+    {
+      v = gen_rtvec (16, GEN_INT (0), GEN_INT (16), GEN_INT (1), GEN_INT (17),
+                     GEN_INT (2), GEN_INT (18), GEN_INT (3), GEN_INT (19),
+		     GEN_INT (4), GEN_INT (20), GEN_INT (5), GEN_INT (21),
+		     GEN_INT (6), GEN_INT (22), GEN_INT (7), GEN_INT (23));
+      x = gen_rtx_VEC_CONCAT (V32QImode, operands[1], operands[2]);
+    }
+
+  x = gen_rtx_VEC_SELECT (V16QImode, x, gen_rtx_PARALLEL (VOIDmode, v));
+  emit_insn (gen_rtx_SET (VOIDmode, operands[0], x));
+  DONE;
+})
+
+(define_insn "*altivec_vmrghb_internal"
   [(set (match_operand:V16QI 0 "register_operand" "=v")
-	(vec_select:V16QI
+        (vec_select:V16QI
 	  (vec_concat:V32QI
 	    (match_operand:V16QI 1 "register_operand" "v")
 	    (match_operand:V16QI 2 "register_operand" "v"))
@@ -853,12 +893,54 @@ 
 		     (const_int 6) (const_int 22)
 		     (const_int 7) (const_int 23)])))]
   "TARGET_ALTIVEC"
+{
+  if (BYTES_BIG_ENDIAN)
+    return "vmrghb %0,%1,%2";
+  else
+    return "vmrglb %0,%2,%1";
+}
+  [(set_attr "type" "vecperm")])
+
+(define_insn "altivec_vmrghb_direct"
+  [(set (match_operand:V16QI 0 "register_operand" "=v")
+        (unspec:V16QI [(match_operand:V16QI 1 "register_operand" "v")
+                       (match_operand:V16QI 2 "register_operand" "v")]
+		      UNSPEC_VMRGH_DIRECT))]
+  "TARGET_ALTIVEC"
   "vmrghb %0,%1,%2"
   [(set_attr "type" "vecperm")])
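
Whatever the target endianness, the expander above preserves the
architectural merge-high semantics in the element order the user
selected.  A standalone model of that semantics for bytes:

    #include <stdio.h>

    int
    main (void)
    {
      unsigned char a[16], b[16], r[16];
      for (int i = 0; i < 16; i++)
        {
          a[i] = i;
          b[i] = 16 + i;
        }
      for (int i = 0; i < 8; i++)
        {
          r[2 * i] = a[i];       /* BE elements 0..7 of the first input */
          r[2 * i + 1] = b[i];   /* interleaved with 0..7 of the second */
        }
      for (int i = 0; i < 16; i++)
        printf ("%u ", r[i]);
      printf ("\n");   /* 0 16 1 17 ... 7 23, the selector above */
      return 0;
    }
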
 
-(define_insn "altivec_vmrghh"
+(define_expand "altivec_vmrghh"
+  [(use (match_operand:V8HI 0 "register_operand" ""))
+   (use (match_operand:V8HI 1 "register_operand" ""))
+   (use (match_operand:V8HI 2 "register_operand" ""))]
+  "TARGET_ALTIVEC"
+{
+  rtvec v;
+  rtx x;
+
+  /* Special handling for LE with -maltivec=be.  */
+  if (!BYTES_BIG_ENDIAN && VECTOR_ELT_ORDER_BIG)
+    {
+      v = gen_rtvec (8, GEN_INT (4), GEN_INT (12), GEN_INT (5), GEN_INT (13),
+                     GEN_INT (6), GEN_INT (14), GEN_INT (7), GEN_INT (15));
+      x = gen_rtx_VEC_CONCAT (V16HImode, operands[2], operands[1]);
+    }
+  else
+    {
+      v = gen_rtvec (8, GEN_INT (0), GEN_INT (8), GEN_INT (1), GEN_INT (9),
+                     GEN_INT (2), GEN_INT (10), GEN_INT (3), GEN_INT (11));
+      x = gen_rtx_VEC_CONCAT (V16HImode, operands[1], operands[2]);
+    }
+
+  x = gen_rtx_VEC_SELECT (V8HImode, x, gen_rtx_PARALLEL (VOIDmode, v));
+  emit_insn (gen_rtx_SET (VOIDmode, operands[0], x));
+  DONE;
+})
+
+(define_insn "*altivec_vmrghh_internal"
   [(set (match_operand:V8HI 0 "register_operand" "=v")
-	(vec_select:V8HI
+        (vec_select:V8HI
 	  (vec_concat:V16HI
 	    (match_operand:V8HI 1 "register_operand" "v")
 	    (match_operand:V8HI 2 "register_operand" "v"))
@@ -867,10 +949,50 @@ 
 		     (const_int 2) (const_int 10)
 		     (const_int 3) (const_int 11)])))]
   "TARGET_ALTIVEC"
+{
+  if (BYTES_BIG_ENDIAN)
+    return "vmrghh %0,%1,%2";
+  else
+    return "vmrglh %0,%2,%1";
+}
+  [(set_attr "type" "vecperm")])
+
+(define_insn "altivec_vmrghh_direct"
+  [(set (match_operand:V8HI 0 "register_operand" "=v")
+        (unspec:V8HI [(match_operand:V8HI 1 "register_operand" "v")
+                      (match_operand:V8HI 2 "register_operand" "v")]
+                     UNSPEC_VMRGH_DIRECT))]
+  "TARGET_ALTIVEC"
   "vmrghh %0,%1,%2"
   [(set_attr "type" "vecperm")])
 
-(define_insn "altivec_vmrghw"
+(define_expand "altivec_vmrghw"
+  [(use (match_operand:V4SI 0 "register_operand" ""))
+   (use (match_operand:V4SI 1 "register_operand" ""))
+   (use (match_operand:V4SI 2 "register_operand" ""))]
+  "VECTOR_MEM_ALTIVEC_P (V4SImode)"
+{
+  rtvec v;
+  rtx x;
+
+  /* Special handling for LE with -maltivec=be.  */
+  if (!BYTES_BIG_ENDIAN && VECTOR_ELT_ORDER_BIG)
+    {
+      v = gen_rtvec (4, GEN_INT (2), GEN_INT (6), GEN_INT (3), GEN_INT (7));
+      x = gen_rtx_VEC_CONCAT (V8SImode, operands[2], operands[1]);
+    }
+  else
+    {
+      v = gen_rtvec (4, GEN_INT (0), GEN_INT (4), GEN_INT (1), GEN_INT (5));
+      x = gen_rtx_VEC_CONCAT (V8SImode, operands[1], operands[2]);
+    }
+
+  x = gen_rtx_VEC_SELECT (V4SImode, x, gen_rtx_PARALLEL (VOIDmode, v));
+  emit_insn (gen_rtx_SET (VOIDmode, operands[0], x));
+  DONE;
+})
+
+(define_insn "*altivec_vmrghw_internal"
   [(set (match_operand:V4SI 0 "register_operand" "=v")
         (vec_select:V4SI
 	  (vec_concat:V8SI
@@ -879,6 +1001,20 @@ 
 	  (parallel [(const_int 0) (const_int 4)
 		     (const_int 1) (const_int 5)])))]
   "VECTOR_MEM_ALTIVEC_P (V4SImode)"
+{
+  if (BYTES_BIG_ENDIAN)
+    return "vmrghw %0,%1,%2";
+  else
+    return "vmrglw %0,%2,%1";
+}
+  [(set_attr "type" "vecperm")])
+
+(define_insn "altivec_vmrghw_direct"
+  [(set (match_operand:V4SI 0 "register_operand" "=v")
+        (unspec:V4SI [(match_operand:V4SI 1 "register_operand" "v")
+                      (match_operand:V4SI 2 "register_operand" "v")]
+                     UNSPEC_VMRGH_DIRECT))]
+  "TARGET_ALTIVEC"
   "vmrghw %0,%1,%2"
   [(set_attr "type" "vecperm")])
 
@@ -891,10 +1027,47 @@ 
 	  (parallel [(const_int 0) (const_int 4)
 		     (const_int 1) (const_int 5)])))]
   "VECTOR_MEM_ALTIVEC_P (V4SFmode)"
-  "vmrghw %0,%1,%2"
+{
+  if (BYTES_BIG_ENDIAN)
+    return "vmrghw %0,%1,%2";
+  else
+    return "vmrglw %0,%2,%1";
+}
   [(set_attr "type" "vecperm")])
 
-(define_insn "altivec_vmrglb"
+(define_expand "altivec_vmrglb"
+  [(use (match_operand:V16QI 0 "register_operand" ""))
+   (use (match_operand:V16QI 1 "register_operand" ""))
+   (use (match_operand:V16QI 2 "register_operand" ""))]
+  "TARGET_ALTIVEC"
+{
+  rtvec v;
+  rtx x;
+
+  /* Special handling for LE with -maltivec=be.  */
+  if (!BYTES_BIG_ENDIAN && VECTOR_ELT_ORDER_BIG)
+    {
+      v = gen_rtvec (16, GEN_INT (0), GEN_INT (16), GEN_INT (1), GEN_INT (17),
+                     GEN_INT (2), GEN_INT (18), GEN_INT (3), GEN_INT (19),
+		     GEN_INT (4), GEN_INT (20), GEN_INT (5), GEN_INT (21),
+		     GEN_INT (6), GEN_INT (22), GEN_INT (7), GEN_INT (23));
+      x = gen_rtx_VEC_CONCAT (V32QImode, operands[2], operands[1]);
+    }
+  else
+    {
+      v = gen_rtvec (16, GEN_INT (8), GEN_INT (24), GEN_INT (9), GEN_INT (25),
+                     GEN_INT (10), GEN_INT (26), GEN_INT (11), GEN_INT (27),
+		     GEN_INT (12), GEN_INT (28), GEN_INT (13), GEN_INT (29),
+		     GEN_INT (14), GEN_INT (30), GEN_INT (15), GEN_INT (31));
+      x = gen_rtx_VEC_CONCAT (V32QImode, operands[1], operands[2]);
+    }
+
+  x = gen_rtx_VEC_SELECT (V16QImode, x, gen_rtx_PARALLEL (VOIDmode, v));
+  emit_insn (gen_rtx_SET (VOIDmode, operands[0], x));
+  DONE;
+})
+
+(define_insn "*altivec_vmrglb_internal"
   [(set (match_operand:V16QI 0 "register_operand" "=v")
         (vec_select:V16QI
 	  (vec_concat:V32QI
@@ -909,10 +1082,52 @@ 
 		     (const_int 14) (const_int 30)
 		     (const_int 15) (const_int 31)])))]
   "TARGET_ALTIVEC"
+{
+  if (BYTES_BIG_ENDIAN)
+    return "vmrglb %0,%1,%2";
+  else
+    return "vmrghb %0,%2,%1";
+}
+  [(set_attr "type" "vecperm")])
+
+(define_insn "altivec_vmrglb_direct"
+  [(set (match_operand:V16QI 0 "register_operand" "=v")
+        (unspec:V16QI [(match_operand:V16QI 1 "register_operand" "v")
+    		       (match_operand:V16QI 2 "register_operand" "v")]
+                      UNSPEC_VMRGL_DIRECT))]
+  "TARGET_ALTIVEC"
   "vmrglb %0,%1,%2"
   [(set_attr "type" "vecperm")])
 
-(define_insn "altivec_vmrglh"
+(define_expand "altivec_vmrglh"
+  [(use (match_operand:V8HI 0 "register_operand" ""))
+   (use (match_operand:V8HI 1 "register_operand" ""))
+   (use (match_operand:V8HI 2 "register_operand" ""))]
+  "TARGET_ALTIVEC"
+{
+  rtvec v;
+  rtx x;
+
+  /* Special handling for LE with -maltivec=be.  */
+  if (!BYTES_BIG_ENDIAN && VECTOR_ELT_ORDER_BIG)
+    {
+      v = gen_rtvec (8, GEN_INT (0), GEN_INT (8), GEN_INT (1), GEN_INT (9),
+                     GEN_INT (2), GEN_INT (10), GEN_INT (3), GEN_INT (11));
+      x = gen_rtx_VEC_CONCAT (V16HImode, operands[2], operands[1]);
+    }
+  else
+    {
+      v = gen_rtvec (8, GEN_INT (4), GEN_INT (12), GEN_INT (5), GEN_INT (13),
+                     GEN_INT (6), GEN_INT (14), GEN_INT (7), GEN_INT (15));
+      x = gen_rtx_VEC_CONCAT (V16HImode, operands[1], operands[2]);
+    }
+
+  x = gen_rtx_VEC_SELECT (V8HImode, x, gen_rtx_PARALLEL (VOIDmode, v));
+  emit_insn (gen_rtx_SET (VOIDmode, operands[0], x));
+  DONE;
+})
+
+(define_insn "*altivec_vmrglh_internal"
   [(set (match_operand:V8HI 0 "register_operand" "=v")
         (vec_select:V8HI
 	  (vec_concat:V16HI
@@ -923,10 +1138,50 @@ 
 		     (const_int 6) (const_int 14)
 		     (const_int 7) (const_int 15)])))]
   "TARGET_ALTIVEC"
+{
+  if (BYTES_BIG_ENDIAN)
+    return "vmrglh %0,%1,%2";
+  else
+    return "vmrghh %0,%2,%1";
+}
+  [(set_attr "type" "vecperm")])
+
+(define_insn "altivec_vmrglh_direct"
+  [(set (match_operand:V8HI 0 "register_operand" "=v")
+        (unspec:V8HI [(match_operand:V8HI 1 "register_operand" "v")
+		      (match_operand:V8HI 2 "register_operand" "v")]
+                     UNSPEC_VMRGL_DIRECT))]
+  "TARGET_ALTIVEC"
   "vmrglh %0,%1,%2"
   [(set_attr "type" "vecperm")])
 
-(define_insn "altivec_vmrglw"
+(define_expand "altivec_vmrglw"
+  [(use (match_operand:V4SI 0 "register_operand" ""))
+   (use (match_operand:V4SI 1 "register_operand" ""))
+   (use (match_operand:V4SI 2 "register_operand" ""))]
+  "VECTOR_MEM_ALTIVEC_P (V4SImode)"
+{
+  rtvec v;
+  rtx x;
+
+  /* Special handling for LE with -maltivec=be.  */
+  if (!BYTES_BIG_ENDIAN && VECTOR_ELT_ORDER_BIG)
+    {
+      v = gen_rtvec (4, GEN_INT (0), GEN_INT (4), GEN_INT (1), GEN_INT (5));
+      x = gen_rtx_VEC_CONCAT (V8SImode, operands[2], operands[1]);
+    }
+  else
+    {
+      v = gen_rtvec (4, GEN_INT (2), GEN_INT (6), GEN_INT (3), GEN_INT (7));
+      x = gen_rtx_VEC_CONCAT (V8SImode, operands[1], operands[2]);
+    }
+
+  x = gen_rtx_VEC_SELECT (V4SImode, x, gen_rtx_PARALLEL (VOIDmode, v));
+  emit_insn (gen_rtx_SET (VOIDmode, operands[0], x));
+  DONE;
+})
+
+(define_insn "*altivec_vmrglw_internal"
   [(set (match_operand:V4SI 0 "register_operand" "=v")
         (vec_select:V4SI
 	  (vec_concat:V8SI
@@ -935,6 +1190,20 @@ 
 	  (parallel [(const_int 2) (const_int 6)
 		     (const_int 3) (const_int 7)])))]
   "VECTOR_MEM_ALTIVEC_P (V4SImode)"
+{
+  if (BYTES_BIG_ENDIAN)
+    return "vmrglw %0,%1,%2";
+  else
+    return "vmrghw %0,%2,%1";
+}
+  [(set_attr "type" "vecperm")])
+
+(define_insn "altivec_vmrglw_direct"
+  [(set (match_operand:V4SI 0 "register_operand" "=v")
+        (unspec:V4SI [(match_operand:V4SI 1 "register_operand" "v")
+	              (match_operand:V4SI 2 "register_operand" "v")]
+                     UNSPEC_VMRGL_DIRECT))]
+  "TARGET_ALTIVEC"
   "vmrglw %0,%1,%2"
   [(set_attr "type" "vecperm")])
 
@@ -947,7 +1216,12 @@ 
 	 (parallel [(const_int 2) (const_int 6)
 		    (const_int 3) (const_int 7)])))]
   "VECTOR_MEM_ALTIVEC_P (V4SFmode)"
-  "vmrglw %0,%1,%2"
+{
+  if (BYTES_BIG_ENDIAN)
+    return "vmrglw %0,%1,%2";
+  else
+    return "vmrghw %0,%2,%1";
+}
   [(set_attr "type" "vecperm")])
 
 ;; Power8 vector merge even/odd
@@ -960,7 +1234,12 @@ 
 	  (parallel [(const_int 0) (const_int 4)
 		     (const_int 2) (const_int 6)])))]
   "TARGET_P8_VECTOR"
-  "vmrgew %0,%1,%2"
+{
+  if (BYTES_BIG_ENDIAN)
+    return "vmrgew %0,%1,%2";
+  else
+    return "vmrgow %0,%2,%1";
+}
   [(set_attr "type" "vecperm")])
 
 (define_insn "p8_vmrgow"
@@ -972,7 +1251,12 @@ 
 	  (parallel [(const_int 1) (const_int 5)
 		     (const_int 3) (const_int 7)])))]
   "TARGET_P8_VECTOR"
-  "vmrgow %0,%1,%2"
+{
+  if (BYTES_BIG_ENDIAN)
+    return "vmrgow %0,%1,%2";
+  else
+    return "vmrgew %0,%2,%1";
+}
   [(set_attr "type" "vecperm")])
 
 (define_expand "vec_widen_umult_even_v16qi"
@@ -981,7 +1265,7 @@ 
    (use (match_operand:V16QI 2 "register_operand" ""))]
   "TARGET_ALTIVEC"
 {
-  if (BYTES_BIG_ENDIAN)
+  if (VECTOR_ELT_ORDER_BIG)
     emit_insn (gen_altivec_vmuleub (operands[0], operands[1], operands[2]));
   else
     emit_insn (gen_altivec_vmuloub (operands[0], operands[1], operands[2]));
@@ -994,7 +1278,7 @@ 
    (use (match_operand:V16QI 2 "register_operand" ""))]
   "TARGET_ALTIVEC"
 {
-  if (BYTES_BIG_ENDIAN)
+  if (VECTOR_ELT_ORDER_BIG)
     emit_insn (gen_altivec_vmulesb (operands[0], operands[1], operands[2]));
   else
     emit_insn (gen_altivec_vmulosb (operands[0], operands[1], operands[2]));
@@ -1007,7 +1291,7 @@ 
    (use (match_operand:V8HI 2 "register_operand" ""))]
   "TARGET_ALTIVEC"
 {
-  if (BYTES_BIG_ENDIAN)
+  if (VECTOR_ELT_ORDER_BIG)
     emit_insn (gen_altivec_vmuleuh (operands[0], operands[1], operands[2]));
   else
     emit_insn (gen_altivec_vmulouh (operands[0], operands[1], operands[2]));
@@ -1020,7 +1304,7 @@ 
    (use (match_operand:V8HI 2 "register_operand" ""))]
   "TARGET_ALTIVEC"
 {
-  if (BYTES_BIG_ENDIAN)
+  if (VECTOR_ELT_ORDER_BIG)
     emit_insn (gen_altivec_vmulesh (operands[0], operands[1], operands[2]));
   else
     emit_insn (gen_altivec_vmulosh (operands[0], operands[1], operands[2]));
@@ -1033,7 +1317,7 @@ 
    (use (match_operand:V16QI 2 "register_operand" ""))]
   "TARGET_ALTIVEC"
 {
-  if (BYTES_BIG_ENDIAN)
+  if (VECTOR_ELT_ORDER_BIG)
     emit_insn (gen_altivec_vmuloub (operands[0], operands[1], operands[2]));
   else
     emit_insn (gen_altivec_vmuleub (operands[0], operands[1], operands[2]));
@@ -1046,7 +1330,7 @@ 
    (use (match_operand:V16QI 2 "register_operand" ""))]
   "TARGET_ALTIVEC"
 {
-  if (BYTES_BIG_ENDIAN)
+  if (VECTOR_ELT_ORDER_BIG)
     emit_insn (gen_altivec_vmulosb (operands[0], operands[1], operands[2]));
   else
     emit_insn (gen_altivec_vmulesb (operands[0], operands[1], operands[2]));
@@ -1059,7 +1343,7 @@ 
    (use (match_operand:V8HI 2 "register_operand" ""))]
   "TARGET_ALTIVEC"
 {
-  if (BYTES_BIG_ENDIAN)
+  if (VECTOR_ELT_ORDER_BIG)
     emit_insn (gen_altivec_vmulouh (operands[0], operands[1], operands[2]));
   else
     emit_insn (gen_altivec_vmuleuh (operands[0], operands[1], operands[2]));
@@ -1072,7 +1356,7 @@ 
    (use (match_operand:V8HI 2 "register_operand" ""))]
   "TARGET_ALTIVEC"
 {
-  if (BYTES_BIG_ENDIAN)
+  if (VECTOR_ELT_ORDER_BIG)
     emit_insn (gen_altivec_vmulosh (operands[0], operands[1], operands[2]));
   else
     emit_insn (gen_altivec_vmulesh (operands[0], operands[1], operands[2]));
@@ -1161,7 +1445,7 @@ 
   "TARGET_ALTIVEC"
   "*
   {
-    if (BYTES_BIG_ENDIAN)
+    if (VECTOR_ELT_ORDER_BIG)
       return \"vpkpx %0,%1,%2\";
     else
       return \"vpkpx %0,%2,%1\";
@@ -1176,7 +1460,7 @@ 
   "<VI_unit>"
   "*
   {
-    if (BYTES_BIG_ENDIAN)
+    if (VECTOR_ELT_ORDER_BIG)
       return \"vpks<VI_char>ss %0,%1,%2\";
     else
       return \"vpks<VI_char>ss %0,%2,%1\";
@@ -1191,7 +1475,7 @@ 
   "<VI_unit>"
   "*
   {
-    if (BYTES_BIG_ENDIAN)
+    if (VECTOR_ELT_ORDER_BIG)
       return \"vpks<VI_char>us %0,%1,%2\";
     else
       return \"vpks<VI_char>us %0,%2,%1\";
@@ -1206,7 +1490,7 @@ 
   "<VI_unit>"
   "*
   {
-    if (BYTES_BIG_ENDIAN)
+    if (VECTOR_ELT_ORDER_BIG)
       return \"vpku<VI_char>us %0,%1,%2\";
     else
       return \"vpku<VI_char>us %0,%2,%1\";
@@ -1221,6 +1505,21 @@ 
   "<VI_unit>"
   "*
   {
+    if (VECTOR_ELT_ORDER_BIG)
+      return \"vpku<VI_char>um %0,%1,%2\";
+    else
+      return \"vpku<VI_char>um %0,%2,%1\";
+  }"
+  [(set_attr "type" "vecperm")])
+
+(define_insn "altivec_vpku<VI_char>um_direct"
+  [(set (match_operand:<VP_small> 0 "register_operand" "=v")
+	(unspec:<VP_small> [(match_operand:VP 1 "register_operand" "v")
+			    (match_operand:VP 2 "register_operand" "v")]
+			   UNSPEC_VPACK_UNS_UNS_MOD_DIRECT))]
+  "<VI_unit>"
+  "*
+  {
     if (BYTES_BIG_ENDIAN)
       return \"vpku<VI_char>um %0,%1,%2\";
     else
@@ -1316,64 +1615,242 @@ 
   "vsum4s<VI_char>s %0,%1,%2"
   [(set_attr "type" "veccomplex")])
 
+;; FIXME: For the following two patterns, the scratch should only be
+;; allocated for !VECTOR_ELT_ORDER_BIG, and the instructions should
+;; be emitted separately.
 (define_insn "altivec_vsum2sws"
   [(set (match_operand:V4SI 0 "register_operand" "=v")
         (unspec:V4SI [(match_operand:V4SI 1 "register_operand" "v")
                       (match_operand:V4SI 2 "register_operand" "v")]
 		     UNSPEC_VSUM2SWS))
-   (set (reg:SI 110) (unspec:SI [(const_int 0)] UNSPEC_SET_VSCR))]
+   (set (reg:SI 110) (unspec:SI [(const_int 0)] UNSPEC_SET_VSCR))
+   (clobber (match_scratch:V4SI 3 "=v"))]
   "TARGET_ALTIVEC"
-  "vsum2sws %0,%1,%2"
-  [(set_attr "type" "veccomplex")])
+{
+  if (VECTOR_ELT_ORDER_BIG)
+    return "vsum2sws %0,%1,%2";
+  else
+    return "vsldoi %3,%2,%2,12\n\tvsum2sws %3,%1,%3\n\tvsldoi %0,%3,%3,4";
+}
+  [(set_attr "type" "veccomplex")
+   (set (attr "length")
+     (if_then_else
+       (match_test "VECTOR_ELT_ORDER_BIG")
+       (const_string "4")
+       (const_string "12")))])
 
 (define_insn "altivec_vsumsws"
   [(set (match_operand:V4SI 0 "register_operand" "=v")
         (unspec:V4SI [(match_operand:V4SI 1 "register_operand" "v")
                       (match_operand:V4SI 2 "register_operand" "v")]
 		     UNSPEC_VSUMSWS))
+   (set (reg:SI 110) (unspec:SI [(const_int 0)] UNSPEC_SET_VSCR))
+   (clobber (match_scratch:V4SI 3 "=v"))]
+  "TARGET_ALTIVEC"
+{
+  if (VECTOR_ELT_ORDER_BIG)
+    return "vsumsws %0,%1,%2";
+  else
+    return "vspltw %3,%2,0\n\tvsumsws %3,%1,%3\n\tvsldoi %0,%3,%3,12";
+}
+  [(set_attr "type" "veccomplex")
+   (set (attr "length")
+     (if_then_else
+       (match_test "(VECTOR_ELT_ORDER_BIG)")
+       (const_string "4")
+       (const_string "12")))])
+
+(define_insn "altivec_vsumsws_direct"
+  [(set (match_operand:V4SI 0 "register_operand" "=v")
+        (unspec:V4SI [(match_operand:V4SI 1 "register_operand" "v")
+                      (match_operand:V4SI 2 "register_operand" "v")]
+		     UNSPEC_VSUMSWS_DIRECT))
    (set (reg:SI 110) (unspec:SI [(const_int 0)] UNSPEC_SET_VSCR))]
   "TARGET_ALTIVEC"
   "vsumsws %0,%1,%2"
   [(set_attr "type" "veccomplex")])
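
For reference, vsumsws adds all four words of the first input to word 3
(BE numbering) of the second and writes the saturated sum to word 3,
zeroing the rest; the LE sequence above first splats the accumulator
word to where the hardware reads it and finally rotates the sum into the
word little-endian code expects.  A standalone model of the BE
semantics (my reading of the ISA, so treat the details as hedged):

    #include <stdio.h>
    #include <stdint.h>

    int
    main (void)
    {
      int32_t a[4] = { 1, 2, 3, 4 };
      int32_t b[4] = { 0, 0, 0, 100 };   /* accumulator in BE word 3 */
      int64_t sum = (int64_t) a[0] + a[1] + a[2] + a[3] + b[3];
      if (sum > INT32_MAX) sum = INT32_MAX;   /* saturate */
      if (sum < INT32_MIN) sum = INT32_MIN;
      int32_t r[4] = { 0, 0, 0, (int32_t) sum };
      printf ("%d %d %d %d\n", (int) r[0], (int) r[1],
              (int) r[2], (int) r[3]);       /* 0 0 0 110 */
      return 0;
    }
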
 
-(define_insn "altivec_vspltb"
+(define_expand "altivec_vspltb"
+  [(use (match_operand:V16QI 0 "register_operand" ""))
+   (use (match_operand:V16QI 1 "register_operand" ""))
+   (use (match_operand:QI 2 "u5bit_cint_operand" ""))]
+  "TARGET_ALTIVEC"
+{
+  rtvec v;
+  rtx x;
+
+  /* Special handling for LE with -maltivec=be.  We have to reflect
+     the actual selected index for the splat in the RTL.  */
+  if (!BYTES_BIG_ENDIAN && VECTOR_ELT_ORDER_BIG)
+    operands[2] = GEN_INT (15 - INTVAL (operands[2]));
+
+  v = gen_rtvec (1, operands[2]);
+  x = gen_rtx_VEC_SELECT (QImode, operands[1], gen_rtx_PARALLEL (VOIDmode, v));
+  x = gen_rtx_VEC_DUPLICATE (V16QImode, x);
+  emit_insn (gen_rtx_SET (VOIDmode, operands[0], x));
+  DONE;
+})
+
+(define_insn "*altivec_vspltb_internal"
   [(set (match_operand:V16QI 0 "register_operand" "=v")
         (vec_duplicate:V16QI
 	 (vec_select:QI (match_operand:V16QI 1 "register_operand" "v")
 			(parallel
 			 [(match_operand:QI 2 "u5bit_cint_operand" "")]))))]
   "TARGET_ALTIVEC"
+{
+  /* For true LE, this adjusts the selected index.  For LE with 
+     -maltivec=be, this reverses what was done in the define_expand
+     because the instruction already has big-endian bias.  */
+  if (!BYTES_BIG_ENDIAN)
+    operands[2] = GEN_INT (15 - INTVAL (operands[2]));
+
+  return "vspltb %0,%1,%2";
+}
+  [(set_attr "type" "vecperm")])
+
+(define_insn "altivec_vspltb_direct"
+  [(set (match_operand:V16QI 0 "register_operand" "=v")
+        (unspec:V16QI [(match_operand:V16QI 1 "register_operand" "v")
+	               (match_operand:QI 2 "u5bit_cint_operand" "i")]
+                      UNSPEC_VSPLT_DIRECT))]
+  "TARGET_ALTIVEC"
   "vspltb %0,%1,%2"
   [(set_attr "type" "vecperm")])
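
The index is reflected twice on purpose: once in the expander (only for
-maltivec=be) so the RTL records the true lane, and once in the output
template (for any little-endian target) because the instruction itself
is big-endian biased.  For -maltivec=be the two cancel, as a quick
standalone check shows:

    #include <stdio.h>

    int
    main (void)
    {
      for (int n = 0; n < 16; n++)
        {
          int expand_be = 15 - n;          /* -maltivec=be on LE */
          int insn_le = 15 - expand_be;    /* LE output adjustment */
          if (insn_le != n)
            return 1;
        }
      puts ("-maltivec=be: index reaches vspltb unchanged");
      return 0;
    }
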
 
-(define_insn "altivec_vsplth"
+(define_expand "altivec_vsplth"
+  [(use (match_operand:V8HI 0 "register_operand" ""))
+   (use (match_operand:V8HI 1 "register_operand" ""))
+   (use (match_operand:QI 2 "u5bit_cint_operand" ""))]
+  "TARGET_ALTIVEC"
+{
+  rtvec v;
+  rtx x;
+
+  /* Special handling for LE with -maltivec=be.  We have to reflect
+     the actual selected index for the splat in the RTL.  */
+  if (!BYTES_BIG_ENDIAN && VECTOR_ELT_ORDER_BIG)
+    operands[2] = GEN_INT (7 - INTVAL (operands[2]));
+
+  v = gen_rtvec (1, operands[2]);
+  x = gen_rtx_VEC_SELECT (HImode, operands[1], gen_rtx_PARALLEL (VOIDmode, v));
+  x = gen_rtx_VEC_DUPLICATE (V8HImode, x);
+  emit_insn (gen_rtx_SET (VOIDmode, operands[0], x));
+  DONE;
+})
+
+(define_insn "*altivec_vsplth_internal"
   [(set (match_operand:V8HI 0 "register_operand" "=v")
 	(vec_duplicate:V8HI
 	 (vec_select:HI (match_operand:V8HI 1 "register_operand" "v")
 			(parallel
 			 [(match_operand:QI 2 "u5bit_cint_operand" "")]))))]
   "TARGET_ALTIVEC"
+{
+  /* For true LE, this adjusts the selected index.  For LE with 
+     -maltivec=be, this reverses what was done in the define_expand
+     because the instruction already has big-endian bias.  */
+  if (!BYTES_BIG_ENDIAN)
+    operands[2] = GEN_INT (7 - INTVAL (operands[2]));
+
+  return "vsplth %0,%1,%2";
+}
+  [(set_attr "type" "vecperm")])
+
+(define_insn "altivec_vsplth_direct"
+  [(set (match_operand:V8HI 0 "register_operand" "=v")
+        (unspec:V8HI [(match_operand:V8HI 1 "register_operand" "v")
+                      (match_operand:QI 2 "u5bit_cint_operand" "i")]
+                     UNSPEC_VSPLT_DIRECT))]
+  "TARGET_ALTIVEC"
   "vsplth %0,%1,%2"
   [(set_attr "type" "vecperm")])
 
-(define_insn "altivec_vspltw"
+(define_expand "altivec_vspltw"
+  [(use (match_operand:V4SI 0 "register_operand" ""))
+   (use (match_operand:V4SI 1 "register_operand" ""))
+   (use (match_operand:QI 2 "u5bit_cint_operand" ""))]
+  "TARGET_ALTIVEC"
+{
+  rtvec v;
+  rtx x;
+
+  /* Special handling for LE with -maltivec=be.  We have to reflect
+     the actual selected index for the splat in the RTL.  */
+  if (!BYTES_BIG_ENDIAN && VECTOR_ELT_ORDER_BIG)
+    operands[2] = GEN_INT (3 - INTVAL (operands[2]));
+
+  v = gen_rtvec (1, operands[2]);
+  x = gen_rtx_VEC_SELECT (SImode, operands[1], gen_rtx_PARALLEL (VOIDmode, v));
+  x = gen_rtx_VEC_DUPLICATE (V4SImode, x);
+  emit_insn (gen_rtx_SET (VOIDmode, operands[0], x));
+  DONE;
+})
+
+(define_insn "*altivec_vspltw_internal"
   [(set (match_operand:V4SI 0 "register_operand" "=v")
 	(vec_duplicate:V4SI
 	 (vec_select:SI (match_operand:V4SI 1 "register_operand" "v")
 			(parallel
 			 [(match_operand:QI 2 "u5bit_cint_operand" "i")]))))]
   "TARGET_ALTIVEC"
+{
+  /* For true LE, this adjusts the selected index.  For LE with 
+     -maltivec=be, this reverses what was done in the define_expand
+     because the instruction already has big-endian bias.  */
+  if (!BYTES_BIG_ENDIAN)
+    operands[2] = GEN_INT (3 - INTVAL (operands[2]));
+
+  return "vspltw %0,%1,%2";
+}
+  [(set_attr "type" "vecperm")])
+
+(define_insn "altivec_vspltw_direct"
+  [(set (match_operand:V4SI 0 "register_operand" "=v")
+        (unspec:V4SI [(match_operand:V4SI 1 "register_operand" "v")
+                      (match_operand:QI 2 "u5bit_cint_operand" "i")]
+                     UNSPEC_VSPLT_DIRECT))]
+  "TARGET_ALTIVEC"
   "vspltw %0,%1,%2"
   [(set_attr "type" "vecperm")])
 
-(define_insn "altivec_vspltsf"
+(define_expand "altivec_vspltsf"
+  [(use (match_operand:V4SF 0 "register_operand" ""))
+   (use (match_operand:V4SF 1 "register_operand" ""))
+   (use (match_operand:QI 2 "u5bit_cint_operand" ""))]
+  "TARGET_ALTIVEC"
+{
+  rtvec v;
+  rtx x;
+
+  /* Special handling for LE with -maltivec=be.  We have to reflect
+     the actual selected index for the splat in the RTL.  */
+  if (!BYTES_BIG_ENDIAN && VECTOR_ELT_ORDER_BIG)
+    operands[2] = GEN_INT (3 - INTVAL (operands[2]));
+
+  v = gen_rtvec (1, operands[2]);
+  x = gen_rtx_VEC_SELECT (SFmode, operands[1], gen_rtx_PARALLEL (VOIDmode, v));
+  x = gen_rtx_VEC_DUPLICATE (V4SFmode, x);
+  emit_insn (gen_rtx_SET (VOIDmode, operands[0], x));
+  DONE;
+})
+
+(define_insn "*altivec_vspltsf_internal"
   [(set (match_operand:V4SF 0 "register_operand" "=v")
 	(vec_duplicate:V4SF
 	 (vec_select:SF (match_operand:V4SF 1 "register_operand" "v")
 			(parallel
 			 [(match_operand:QI 2 "u5bit_cint_operand" "i")]))))]
   "VECTOR_UNIT_ALTIVEC_P (V4SFmode)"
-  "vspltw %0,%1,%2"
+{
+  /* For true LE, this adjusts the selected index.  For LE with
+     -maltivec=be, this reverses what was done in the define_expand
+     because the instruction already has big-endian bias.  */
+  if (!BYTES_BIG_ENDIAN)
+    operands[2] = GEN_INT (3 - INTVAL (operands[2]));
+
+  return "vspltw %0,%1,%2";
+}
   [(set_attr "type" "vecperm")])
 
 (define_insn "altivec_vspltis<VI_char>"
@@ -1391,7 +1868,22 @@ 
   "vrfiz %0,%1"
   [(set_attr "type" "vecfloat")])
 
-(define_insn "altivec_vperm_<mode>"
+(define_expand "altivec_vperm_<mode>"
+  [(set (match_operand:VM 0 "register_operand" "=v")
+	(unspec:VM [(match_operand:VM 1 "register_operand" "v")
+		    (match_operand:VM 2 "register_operand" "v")
+		    (match_operand:V16QI 3 "register_operand" "v")]
+		   UNSPEC_VPERM))]
+  "TARGET_ALTIVEC"
+{
+  if (!VECTOR_ELT_ORDER_BIG)
+    {
+      altivec_expand_vec_perm_le (operands);
+      DONE;
+    }
+})
+
+(define_insn "*altivec_vperm_<mode>_internal"
   [(set (match_operand:VM 0 "register_operand" "=v")
 	(unspec:VM [(match_operand:VM 1 "register_operand" "v")
 		    (match_operand:VM 2 "register_operand" "v")
@@ -1401,7 +1893,22 @@ 
   "vperm %0,%1,%2,%3"
   [(set_attr "type" "vecperm")])
 
-(define_insn "altivec_vperm_<mode>_uns"
+(define_expand "altivec_vperm_<mode>_uns"
+  [(set (match_operand:VM 0 "register_operand" "=v")
+	(unspec:VM [(match_operand:VM 1 "register_operand" "v")
+		    (match_operand:VM 2 "register_operand" "v")
+		    (match_operand:V16QI 3 "register_operand" "v")]
+		   UNSPEC_VPERM_UNS))]
+  "TARGET_ALTIVEC"
+{
+  if (!VECTOR_ELT_ORDER_BIG)
+    {
+      altivec_expand_vec_perm_le (operands);
+      DONE;
+    }
+})
+
+(define_insn "*altivec_vperm_<mode>_uns_internal"
   [(set (match_operand:VM 0 "register_operand" "=v")
 	(unspec:VM [(match_operand:VM 1 "register_operand" "v")
 		    (match_operand:VM 2 "register_operand" "v")
@@ -1569,6 +2076,19 @@ 
 	(unspec:VP [(match_operand:<VP_small> 1 "register_operand" "v")]
 		     UNSPEC_VUNPACK_HI_SIGN))]
   "<VI_unit>"
+{
+  if (VECTOR_ELT_ORDER_BIG)
+    return "vupkhs<VU_char> %0,%1";
+  else
+    return "vupkls<VU_char> %0,%1";
+}
+  [(set_attr "type" "vecperm")])
+
+(define_insn "*altivec_vupkhs<VU_char>_direct"
+  [(set (match_operand:VP 0 "register_operand" "=v")
+	(unspec:VP [(match_operand:<VP_small> 1 "register_operand" "v")]
+		     UNSPEC_VUNPACK_HI_SIGN_DIRECT))]
+  "<VI_unit>"
   "vupkhs<VU_char> %0,%1"
   [(set_attr "type" "vecperm")])
 
@@ -1577,6 +2097,19 @@ 
 	(unspec:VP [(match_operand:<VP_small> 1 "register_operand" "v")]
 		     UNSPEC_VUNPACK_LO_SIGN))]
   "<VI_unit>"
+{
+  if (VECTOR_ELT_ORDER_BIG)
+    return "vupkls<VU_char> %0,%1";
+  else
+    return "vupkhs<VU_char> %0,%1";
+}
+  [(set_attr "type" "vecperm")])
+
+(define_insn "*altivec_vupkls<VU_char>_direct"
+  [(set (match_operand:VP 0 "register_operand" "=v")
+	(unspec:VP [(match_operand:<VP_small> 1 "register_operand" "v")]
+		     UNSPEC_VUNPACK_LO_SIGN_DIRECT))]
+  "<VI_unit>"
   "vupkls<VU_char> %0,%1"
   [(set_attr "type" "vecperm")])
 
@@ -1585,7 +2118,12 @@ 
 	(unspec:V4SI [(match_operand:V8HI 1 "register_operand" "v")]
 		     UNSPEC_VUPKHPX))]
   "TARGET_ALTIVEC"
-  "vupkhpx %0,%1"
+{
+  if (VECTOR_ELT_ORDER_BIG)
+    return "vupkhpx %0,%1";
+  else
+    return "vupklpx %0,%1";
+}
   [(set_attr "type" "vecperm")])
 
 (define_insn "altivec_vupklpx"
@@ -1593,7 +2131,12 @@ 
 	(unspec:V4SI [(match_operand:V8HI 1 "register_operand" "v")]
 		     UNSPEC_VUPKLPX))]
   "TARGET_ALTIVEC"
-  "vupklpx %0,%1"
+{
+  if (VECTOR_ELT_ORDER_BIG)
+    return "vupklpx %0,%1";
+  else
+    return "vupkhpx %0,%1";
+}
   [(set_attr "type" "vecperm")])
 
 ;; Compare vectors producing a vector result and a predicate, setting CR6 to
@@ -1782,7 +2325,21 @@ 
 ;; Parallel some of the LVE* and STV*'s with unspecs because some have
 ;; identical rtl but different instructions-- and gcc gets confused.
 
-(define_insn "altivec_lve<VI_char>x"
+(define_expand "altivec_lve<VI_char>x"
+  [(parallel
+    [(set (match_operand:VI 0 "register_operand" "=v")
+	  (match_operand:VI 1 "memory_operand" "Z"))
+     (unspec [(const_int 0)] UNSPEC_LVE)])]
+  "TARGET_ALTIVEC"
+{
+  if (!BYTES_BIG_ENDIAN && VECTOR_ELT_ORDER_BIG)
+    {
+      altivec_expand_lvx_be (operands[0], operands[1], <MODE>mode, UNSPEC_LVE);
+      DONE;
+    }
+})
+
+(define_insn "*altivec_lve<VI_char>x_internal"
   [(parallel
     [(set (match_operand:VI 0 "register_operand" "=v")
 	  (match_operand:VI 1 "memory_operand" "Z"))
@@ -1800,16 +2357,44 @@ 
   "lvewx %0,%y1"
   [(set_attr "type" "vecload")])
 
-(define_insn "altivec_lvxl"
+(define_expand "altivec_lvxl_<mode>"
   [(parallel
-    [(set (match_operand:V4SI 0 "register_operand" "=v")
-	  (match_operand:V4SI 1 "memory_operand" "Z"))
+    [(set (match_operand:VM2 0 "register_operand" "=v")
+	  (match_operand:VM2 1 "memory_operand" "Z"))
      (unspec [(const_int 0)] UNSPEC_SET_VSCR)])]
   "TARGET_ALTIVEC"
-  "lvxl %0,%y1"
+{
+  if (!BYTES_BIG_ENDIAN && VECTOR_ELT_ORDER_BIG)
+    {
+      altivec_expand_lvx_be (operands[0], operands[1], <MODE>mode, UNSPEC_SET_VSCR);
+      DONE;
+    }
+})
+
+(define_insn "*altivec_lvxl_<mode>_internal"
+  [(parallel
+    [(set (match_operand:VM2 0 "register_operand" "=v")
+	  (match_operand:VM2 1 "memory_operand" "Z"))
+     (unspec [(const_int 0)] UNSPEC_SET_VSCR)])]
+  "TARGET_ALTIVEC"
+  "lvxl %0,%y1"
   [(set_attr "type" "vecload")])
 
-(define_insn "altivec_lvx_<mode>"
+(define_expand "altivec_lvx_<mode>"
+  [(parallel
+    [(set (match_operand:VM2 0 "register_operand" "=v")
+	  (match_operand:VM2 1 "memory_operand" "Z"))
+     (unspec [(const_int 0)] UNSPEC_LVX)])]
+  "TARGET_ALTIVEC"
+{
+  if (!BYTES_BIG_ENDIAN && VECTOR_ELT_ORDER_BIG)
+    {
+      altivec_expand_lvx_be (operands[0], operands[1], <MODE>mode, UNSPEC_LVX);
+      DONE;
+    }
+})
+
+(define_insn "*altivec_lvx_<mode>_internal"
   [(parallel
     [(set (match_operand:VM2 0 "register_operand" "=v")
 	  (match_operand:VM2 1 "memory_operand" "Z"))
@@ -1818,7 +2403,21 @@ 
   "lvx %0,%y1"
   [(set_attr "type" "vecload")])
 
-(define_insn "altivec_stvx_<mode>"
+(define_expand "altivec_stvx_<mode>"
+  [(parallel
+    [(set (match_operand:VM2 0 "memory_operand" "=Z")
+	  (match_operand:VM2 1 "register_operand" "v"))
+     (unspec [(const_int 0)] UNSPEC_STVX)])]
+  "TARGET_ALTIVEC"
+{
+  if (!BYTES_BIG_ENDIAN && VECTOR_ELT_ORDER_BIG)
+    {
+      altivec_expand_stvx_be (operands[0], operands[1], <MODE>mode, UNSPEC_STVX);
+      DONE;
+    }
+})
+
+(define_insn "*altivec_stvx_<mode>_internal"
   [(parallel
     [(set (match_operand:VM2 0 "memory_operand" "=Z")
 	  (match_operand:VM2 1 "register_operand" "v"))
@@ -1827,16 +2426,42 @@ 
   "stvx %1,%y0"
   [(set_attr "type" "vecstore")])
 
-(define_insn "altivec_stvxl"
+(define_expand "altivec_stvxl_<mode>"
   [(parallel
-    [(set (match_operand:V4SI 0 "memory_operand" "=Z")
-	  (match_operand:V4SI 1 "register_operand" "v"))
+    [(set (match_operand:VM2 0 "memory_operand" "=Z")
+	  (match_operand:VM2 1 "register_operand" "v"))
+     (unspec [(const_int 0)] UNSPEC_STVXL)])]
+  "TARGET_ALTIVEC"
+{
+  if (!BYTES_BIG_ENDIAN && VECTOR_ELT_ORDER_BIG)
+    {
+      altivec_expand_stvx_be (operands[0], operands[1], <MODE>mode, UNSPEC_STVXL);
+      DONE;
+    }
+})
+
+(define_insn "*altivec_stvxl_<mode>_internal"
+  [(parallel
+    [(set (match_operand:VM2 0 "memory_operand" "=Z")
+	  (match_operand:VM2 1 "register_operand" "v"))
      (unspec [(const_int 0)] UNSPEC_STVXL)])]
   "TARGET_ALTIVEC"
   "stvxl %1,%y0"
   [(set_attr "type" "vecstore")])
 
-(define_insn "altivec_stve<VI_char>x"
+(define_expand "altivec_stve<VI_char>x"
+  [(set (match_operand:<VI_scalar> 0 "memory_operand" "=Z")
+	(unspec:<VI_scalar> [(match_operand:VI 1 "register_operand" "v")] UNSPEC_STVE))]
+  "TARGET_ALTIVEC"
+{
+  if (!BYTES_BIG_ENDIAN && VECTOR_ELT_ORDER_BIG)
+    {
+      altivec_expand_stvex_be (operands[0], operands[1], <MODE>mode, UNSPEC_STVE);
+      DONE;
+    }
+})
+
+(define_insn "*altivec_stve<VI_char>x_internal"
   [(set (match_operand:<VI_scalar> 0 "memory_operand" "=Z")
 	(unspec:<VI_scalar> [(match_operand:VI 1 "register_operand" "v")] UNSPEC_STVE))]
   "TARGET_ALTIVEC"
@@ -1924,7 +2549,7 @@ 
 
   emit_insn (gen_altivec_vspltisw (vzero, const0_rtx));
   emit_insn (gen_altivec_vsum4s<VI_char>s (vtmp1, operands[1], vzero));
-  emit_insn (gen_altivec_vsumsws (dest, vtmp1, vzero));
+  emit_insn (gen_altivec_vsumsws_direct (dest, vtmp1, vzero));
   DONE;
 })
 
@@ -1940,7 +2565,7 @@ 
 
   emit_insn (gen_altivec_vspltisw (vzero, const0_rtx));
   emit_insn (gen_altivec_vsum4ubs (vtmp1, operands[1], vzero));
-  emit_insn (gen_altivec_vsumsws (dest, vtmp1, vzero));
+  emit_insn (gen_altivec_vsumsws_direct (dest, vtmp1, vzero));
   DONE;
 })
 
@@ -2033,14 +2658,14 @@ 
 (define_expand "vec_unpacks_hi_<VP_small_lc>"
   [(set (match_operand:VP 0 "register_operand" "=v")
         (unspec:VP [(match_operand:<VP_small> 1 "register_operand" "v")]
-		   UNSPEC_VUNPACK_HI_SIGN))]
+		   UNSPEC_VUNPACK_HI_SIGN_DIRECT))]
   "<VI_unit>"
   "")
 
 (define_expand "vec_unpacks_lo_<VP_small_lc>"
   [(set (match_operand:VP 0 "register_operand" "=v")
         (unspec:VP [(match_operand:<VP_small> 1 "register_operand" "v")]
-		   UNSPEC_VUNPACK_LO_SIGN))]
+		   UNSPEC_VUNPACK_LO_SIGN_DIRECT))]
   "<VI_unit>"
   "")
 
@@ -2220,12 +2845,18 @@ 
   rtx ve = gen_reg_rtx (V8HImode);
   rtx vo = gen_reg_rtx (V8HImode);
   
-  emit_insn (gen_vec_widen_umult_even_v16qi (ve, operands[1], operands[2]));
-  emit_insn (gen_vec_widen_umult_odd_v16qi (vo, operands[1], operands[2]));
   if (BYTES_BIG_ENDIAN)
-    emit_insn (gen_altivec_vmrghh (operands[0], ve, vo));
+    {
+      emit_insn (gen_altivec_vmuleub (ve, operands[1], operands[2]));
+      emit_insn (gen_altivec_vmuloub (vo, operands[1], operands[2]));
+      emit_insn (gen_altivec_vmrghh_direct (operands[0], ve, vo));
+    }
   else
-    emit_insn (gen_altivec_vmrghh (operands[0], vo, ve));
+    {
+      emit_insn (gen_altivec_vmuloub (ve, operands[1], operands[2]));
+      emit_insn (gen_altivec_vmuleub (vo, operands[1], operands[2]));
+      emit_insn (gen_altivec_vmrghh_direct (operands[0], vo, ve));
+    }
   DONE;
 }")
 
@@ -2240,12 +2871,18 @@ 
   rtx ve = gen_reg_rtx (V8HImode);
   rtx vo = gen_reg_rtx (V8HImode);
   
-  emit_insn (gen_vec_widen_umult_even_v16qi (ve, operands[1], operands[2]));
-  emit_insn (gen_vec_widen_umult_odd_v16qi (vo, operands[1], operands[2]));
   if (BYTES_BIG_ENDIAN)
-    emit_insn (gen_altivec_vmrglh (operands[0], ve, vo));
+    {
+      emit_insn (gen_altivec_vmuleub (ve, operands[1], operands[2]));
+      emit_insn (gen_altivec_vmuloub (vo, operands[1], operands[2]));
+      emit_insn (gen_altivec_vmrglh_direct (operands[0], ve, vo));
+    }
   else
-    emit_insn (gen_altivec_vmrglh (operands[0], vo, ve));
+    {
+      emit_insn (gen_altivec_vmuloub (ve, operands[1], operands[2]));
+      emit_insn (gen_altivec_vmuleub (vo, operands[1], operands[2]));
+      emit_insn (gen_altivec_vmrglh_direct (operands[0], vo, ve));
+    }
   DONE;
 }")
 
@@ -2260,12 +2897,18 @@ 
   rtx ve = gen_reg_rtx (V8HImode);
   rtx vo = gen_reg_rtx (V8HImode);
   
-  emit_insn (gen_vec_widen_smult_even_v16qi (ve, operands[1], operands[2]));
-  emit_insn (gen_vec_widen_smult_odd_v16qi (vo, operands[1], operands[2]));
   if (BYTES_BIG_ENDIAN)
-    emit_insn (gen_altivec_vmrghh (operands[0], ve, vo));
+    {
+      emit_insn (gen_altivec_vmulesb (ve, operands[1], operands[2]));
+      emit_insn (gen_altivec_vmulosb (vo, operands[1], operands[2]));
+      emit_insn (gen_altivec_vmrghh_direct (operands[0], ve, vo));
+    }
   else
-    emit_insn (gen_altivec_vmrghh (operands[0], vo, ve));
+    {
+      emit_insn (gen_altivec_vmulosb (ve, operands[1], operands[2]));
+      emit_insn (gen_altivec_vmulesb (vo, operands[1], operands[2]));
+      emit_insn (gen_altivec_vmrghh_direct (operands[0], vo, ve));
+    }
   DONE;
 }")
 
@@ -2280,12 +2923,18 @@ 
   rtx ve = gen_reg_rtx (V8HImode);
   rtx vo = gen_reg_rtx (V8HImode);
   
-  emit_insn (gen_vec_widen_smult_even_v16qi (ve, operands[1], operands[2]));
-  emit_insn (gen_vec_widen_smult_odd_v16qi (vo, operands[1], operands[2]));
   if (BYTES_BIG_ENDIAN)
-    emit_insn (gen_altivec_vmrglh (operands[0], ve, vo));
+    {
+      emit_insn (gen_altivec_vmulesb (ve, operands[1], operands[2]));
+      emit_insn (gen_altivec_vmulosb (vo, operands[1], operands[2]));
+      emit_insn (gen_altivec_vmrglh_direct (operands[0], ve, vo));
+    }
   else
-    emit_insn (gen_altivec_vmrglh (operands[0], vo, ve));
+    {
+      emit_insn (gen_altivec_vmulosb (ve, operands[1], operands[2]));
+      emit_insn (gen_altivec_vmulesb (vo, operands[1], operands[2]));
+      emit_insn (gen_altivec_vmrglh_direct (operands[0], vo, ve));
+    }
   DONE;
 }")
 
@@ -2300,12 +2949,18 @@ 
   rtx ve = gen_reg_rtx (V4SImode);
   rtx vo = gen_reg_rtx (V4SImode);
   
-  emit_insn (gen_vec_widen_umult_even_v8hi (ve, operands[1], operands[2]));
-  emit_insn (gen_vec_widen_umult_odd_v8hi (vo, operands[1], operands[2]));
   if (BYTES_BIG_ENDIAN)
-    emit_insn (gen_altivec_vmrghw (operands[0], ve, vo));
+    {
+      emit_insn (gen_altivec_vmuleuh (ve, operands[1], operands[2]));
+      emit_insn (gen_altivec_vmulouh (vo, operands[1], operands[2]));
+      emit_insn (gen_altivec_vmrghw_direct (operands[0], ve, vo));
+    }
   else
-    emit_insn (gen_altivec_vmrghw (operands[0], vo, ve));
+    {
+      emit_insn (gen_altivec_vmulouh (ve, operands[1], operands[2]));
+      emit_insn (gen_altivec_vmuleuh (vo, operands[1], operands[2]));
+      emit_insn (gen_altivec_vmrghw_direct (operands[0], vo, ve));
+    }
   DONE;
 }")
 
@@ -2320,12 +2975,18 @@ 
   rtx ve = gen_reg_rtx (V4SImode);
   rtx vo = gen_reg_rtx (V4SImode);
   
-  emit_insn (gen_vec_widen_umult_even_v8hi (ve, operands[1], operands[2]));
-  emit_insn (gen_vec_widen_umult_odd_v8hi (vo, operands[1], operands[2]));
   if (BYTES_BIG_ENDIAN)
-    emit_insn (gen_altivec_vmrglw (operands[0], ve, vo));
+    {
+      emit_insn (gen_altivec_vmuleuh (ve, operands[1], operands[2]));
+      emit_insn (gen_altivec_vmulouh (vo, operands[1], operands[2]));
+      emit_insn (gen_altivec_vmrglw_direct (operands[0], ve, vo));
+    }
   else
-    emit_insn (gen_altivec_vmrglw (operands[0], vo, ve));
+    {
+      emit_insn (gen_altivec_vmulouh (ve, operands[1], operands[2]));
+      emit_insn (gen_altivec_vmuleuh (vo, operands[1], operands[2]));
+      emit_insn (gen_altivec_vmrglw_direct (operands[0], vo, ve));
+    }
   DONE;
 }")
 
@@ -2340,12 +3001,18 @@ 
   rtx ve = gen_reg_rtx (V4SImode);
   rtx vo = gen_reg_rtx (V4SImode);
   
-  emit_insn (gen_vec_widen_smult_even_v8hi (ve, operands[1], operands[2]));
-  emit_insn (gen_vec_widen_smult_odd_v8hi (vo, operands[1], operands[2]));
   if (BYTES_BIG_ENDIAN)
-    emit_insn (gen_altivec_vmrghw (operands[0], ve, vo));
+    {
+      emit_insn (gen_altivec_vmulesh (ve, operands[1], operands[2]));
+      emit_insn (gen_altivec_vmulosh (vo, operands[1], operands[2]));
+      emit_insn (gen_altivec_vmrghw_direct (operands[0], ve, vo));
+    }
   else
-    emit_insn (gen_altivec_vmrghw (operands[0], vo, ve));
+    {
+      emit_insn (gen_altivec_vmulosh (ve, operands[1], operands[2]));
+      emit_insn (gen_altivec_vmulesh (vo, operands[1], operands[2]));
+      emit_insn (gen_altivec_vmrghw_direct (operands[0], vo, ve));
+    }
   DONE;
 }")
 
@@ -2360,12 +3027,18 @@ 
   rtx ve = gen_reg_rtx (V4SImode);
   rtx vo = gen_reg_rtx (V4SImode);
   
-  emit_insn (gen_vec_widen_smult_even_v8hi (ve, operands[1], operands[2]));
-  emit_insn (gen_vec_widen_smult_odd_v8hi (vo, operands[1], operands[2]));
   if (BYTES_BIG_ENDIAN)
-    emit_insn (gen_altivec_vmrglw (operands[0], ve, vo));
+    {
+      emit_insn (gen_altivec_vmulesh (ve, operands[1], operands[2]));
+      emit_insn (gen_altivec_vmulosh (vo, operands[1], operands[2]));
+      emit_insn (gen_altivec_vmrglw_direct (operands[0], ve, vo));
+    }
   else
-    emit_insn (gen_altivec_vmrglw (operands[0], vo, ve));
+    {
+      emit_insn (gen_altivec_vmulosh (ve, operands[1], operands[2]));
+      emit_insn (gen_altivec_vmulesh (vo, operands[1], operands[2]));
+      emit_insn (gen_altivec_vmrglw_direct (operands[0], vo, ve));
+    }
   DONE;
 }")
 
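For reference, the even/odd-multiply-plus-merge recipe used by the
widening-multiply expanders above can be modeled in plain C.  The sketch
below covers only the BE case, with invented helper names (mule_ub,
mulo_ub, mrghh) and big-endian array numbering rather than the real
patterns; it shows why merge-high (even, odd) yields the in-order
widened products of elements 0..7:

#include <stdio.h>
#include <stdint.h>

/* vmuleub/vmuloub model: products of the even/odd byte pairs,
   numbered big-endian within the register.  */
static void
mule_ub (const uint8_t a[16], const uint8_t b[16], uint16_t out[8])
{
  for (int i = 0; i < 8; i++)
    out[i] = (uint16_t) a[2 * i] * b[2 * i];
}

static void
mulo_ub (const uint8_t a[16], const uint8_t b[16], uint16_t out[8])
{
  for (int i = 0; i < 8; i++)
    out[i] = (uint16_t) a[2 * i + 1] * b[2 * i + 1];
}

/* vmrghh model: interleave the first (BE-high) halves.  */
static void
mrghh (const uint16_t a[8], const uint16_t b[8], uint16_t out[8])
{
  for (int i = 0; i < 4; i++)
    {
      out[2 * i] = a[i];
      out[2 * i + 1] = b[i];
    }
}

int
main (void)
{
  uint8_t a[16], b[16];
  uint16_t ve[8], vo[8], hi[8];

  for (int i = 0; i < 16; i++)
    {
      a[i] = i + 1;
      b[i] = 2;
    }

  mule_ub (a, b, ve);
  mulo_ub (a, b, vo);
  mrghh (ve, vo, hi);        /* BE recipe: merge-high (even, odd).  */

  for (int i = 0; i < 8; i++)
    printf ("%d ", hi[i]);   /* Prints a[i]*b[i] for i = 0..7.  */
  printf ("\n");
  return 0;
}

On LE the same even/odd instructions pair bytes by their big-endian
register positions while GCC numbers elements from the other end, so the
expanders swap which temporary receives which instruction and go through
the _direct merge patterns, which bypass the endianness adjustment the
named vmrg* patterns now perform.
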
Index: gcc-4_8-test/gcc/config/rs6000/rs6000-builtin.def
===================================================================
--- gcc-4_8-test.orig/gcc/config/rs6000/rs6000-builtin.def
+++ gcc-4_8-test/gcc/config/rs6000/rs6000-builtin.def
@@ -793,8 +793,26 @@  BU_ALTIVEC_X (LVEBX,		"lvebx",	    MEM)
 BU_ALTIVEC_X (LVEHX,		"lvehx",	    MEM)
 BU_ALTIVEC_X (LVEWX,		"lvewx",	    MEM)
 BU_ALTIVEC_X (LVXL,		"lvxl",		    MEM)
+BU_ALTIVEC_X (LVXL_V2DF,	"lvxl_v2df",	    MEM)
+BU_ALTIVEC_X (LVXL_V2DI,	"lvxl_v2di",	    MEM)
+BU_ALTIVEC_X (LVXL_V4SF,	"lvxl_v4sf",	    MEM)
+BU_ALTIVEC_X (LVXL_V4SI,	"lvxl_v4si",	    MEM)
+BU_ALTIVEC_X (LVXL_V8HI,	"lvxl_v8hi",	    MEM)
+BU_ALTIVEC_X (LVXL_V16QI,	"lvxl_v16qi",	    MEM)
 BU_ALTIVEC_X (LVX,		"lvx",		    MEM)
+BU_ALTIVEC_X (LVX_V2DF,		"lvx_v2df",	    MEM)
+BU_ALTIVEC_X (LVX_V2DI,		"lvx_v2di",	    MEM)
+BU_ALTIVEC_X (LVX_V4SF,		"lvx_v4sf",	    MEM)
+BU_ALTIVEC_X (LVX_V4SI,		"lvx_v4si",	    MEM)
+BU_ALTIVEC_X (LVX_V8HI,		"lvx_v8hi",	    MEM)
+BU_ALTIVEC_X (LVX_V16QI,	"lvx_v16qi",	    MEM)
 BU_ALTIVEC_X (STVX,		"stvx",		    MEM)
+BU_ALTIVEC_X (STVX_V2DF,	"stvx_v2df",	    MEM)
+BU_ALTIVEC_X (STVX_V2DI,	"stvx_v2di",	    MEM)
+BU_ALTIVEC_X (STVX_V4SF,	"stvx_v4sf",	    MEM)
+BU_ALTIVEC_X (STVX_V4SI,	"stvx_v4si",	    MEM)
+BU_ALTIVEC_X (STVX_V8HI,	"stvx_v8hi",	    MEM)
+BU_ALTIVEC_X (STVX_V16QI,	"stvx_v16qi",	    MEM)
 BU_ALTIVEC_C (LVLX,		"lvlx",		    MEM)
 BU_ALTIVEC_C (LVLXL,		"lvlxl",	    MEM)
 BU_ALTIVEC_C (LVRX,		"lvrx",		    MEM)
@@ -803,6 +821,12 @@  BU_ALTIVEC_X (STVEBX,		"stvebx",	    MEM
 BU_ALTIVEC_X (STVEHX,		"stvehx",	    MEM)
 BU_ALTIVEC_X (STVEWX,		"stvewx",	    MEM)
 BU_ALTIVEC_X (STVXL,		"stvxl",	    MEM)
+BU_ALTIVEC_X (STVXL_V2DF,	"stvxl_v2df",	    MEM)
+BU_ALTIVEC_X (STVXL_V2DI,	"stvxl_v2di",	    MEM)
+BU_ALTIVEC_X (STVXL_V4SF,	"stvxl_v4sf",	    MEM)
+BU_ALTIVEC_X (STVXL_V4SI,	"stvxl_v4si",	    MEM)
+BU_ALTIVEC_X (STVXL_V8HI,	"stvxl_v8hi",	    MEM)
+BU_ALTIVEC_X (STVXL_V16QI,	"stvxl_v16qi",	    MEM)
 BU_ALTIVEC_C (STVLX,		"stvlx",	    MEM)
 BU_ALTIVEC_C (STVLXL,		"stvlxl",	    MEM)
 BU_ALTIVEC_C (STVRX,		"stvrx",	    MEM)
@@ -1318,7 +1342,7 @@  BU_P8V_AV_2 (VMRGOW,		"vmrgow",	CONST,	p
 BU_P8V_AV_2 (VPKUDUM,		"vpkudum",	CONST,	altivec_vpkudum)
 BU_P8V_AV_2 (VPKSDSS,		"vpksdss",	CONST,	altivec_vpksdss)
 BU_P8V_AV_2 (VPKUDUS,		"vpkudus",	CONST,	altivec_vpkudus)
-BU_P8V_AV_2 (VPKSDUS,		"vpksdus",	CONST,	altivec_vpkswus)
+BU_P8V_AV_2 (VPKSDUS,		"vpksdus",	CONST,	altivec_vpksdus)
 BU_P8V_AV_2 (VRLD,		"vrld",		CONST,	vrotlv2di3)
 BU_P8V_AV_2 (VSLD,		"vsld",		CONST,	vashlv2di3)
 BU_P8V_AV_2 (VSRD,		"vsrd",		CONST,	vlshrv2di3)
Index: gcc-4_8-test/gcc/config/rs6000/rs6000-protos.h
===================================================================
--- gcc-4_8-test.orig/gcc/config/rs6000/rs6000-protos.h
+++ gcc-4_8-test/gcc/config/rs6000/rs6000-protos.h
@@ -58,6 +58,9 @@  extern void rs6000_expand_vector_extract
 extern bool altivec_expand_vec_perm_const (rtx op[4]);
 extern void altivec_expand_vec_perm_le (rtx op[4]);
 extern bool rs6000_expand_vec_perm_const (rtx op[4]);
+extern void altivec_expand_lvx_be (rtx, rtx, enum machine_mode, unsigned);
+extern void altivec_expand_stvx_be (rtx, rtx, enum machine_mode, unsigned);
+extern void altivec_expand_stvex_be (rtx, rtx, enum machine_mode, unsigned);
 extern void rs6000_expand_extract_even (rtx, rtx, rtx);
 extern void rs6000_expand_interleave (rtx, rtx, rtx, bool);
 extern void build_mask64_2_operands (rtx, rtx *);
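
The three prototypes added above expose helpers (implemented in
rs6000.c) that wrap the raw lvx/stvx/stve*x patterns with the element
reversal that -maltivec=be requires on an LE target.  A minimal sketch
of the net effect for loads -- modeled for V4SI with an invented name,
not the actual emitted sequence (the helpers emit the load plus a
permute):

#include <stdio.h>

/* Under -maltivec=be on LE, vec_ld behaves like the raw load
   followed by an element reversal, so indices read big-endian.  */
static void
lvx_be_model (const int mem[4], int out[4])
{
  for (int i = 0; i < 4; i++)
    out[i] = mem[3 - i];
}

int
main (void)
{
  int mem[4] = {0, 1, 2, 3};
  int v[4];

  lvx_be_model (mem, v);
  /* In native LE element order the register now reads {3,2,1,0},
     matching the LE expectations in the ld*-be-order.c tests below.  */
  printf ("{%d,%d,%d,%d}\n", v[0], v[1], v[2], v[3]);
  return 0;
}
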
Index: gcc-4_8-test/gcc/config/rs6000/vsx.md
===================================================================
--- gcc-4_8-test.orig/gcc/config/rs6000/vsx.md
+++ gcc-4_8-test/gcc/config/rs6000/vsx.md
@@ -213,6 +213,7 @@ 
    UNSPEC_VSX_ROUND_I
    UNSPEC_VSX_ROUND_IC
    UNSPEC_VSX_SLDWI
+   UNSPEC_VSX_XXSPLTW
   ])
 
 ;; VSX moves
@@ -1620,7 +1621,18 @@ 
 	  op1 = gen_lowpart (V2DImode, op1);
 	}
     }
-  emit_insn (gen (target, op0, op1, perm0, perm1));
+  /* In little endian mode, vsx_xxpermdi2_<mode>_1 will perform a
+     transformation we don't want; it is necessary for
+     rs6000_expand_vec_perm_const_1 but not for this use.  So we
+     prepare for that by reversing the transformation here.  */
+  if (BYTES_BIG_ENDIAN)
+    emit_insn (gen (target, op0, op1, perm0, perm1));
+  else
+    {
+      rtx p0 = GEN_INT (3 - INTVAL (perm1));
+      rtx p1 = GEN_INT (3 - INTVAL (perm0));
+      emit_insn (gen (target, op1, op0, p0, p1));
+    }
   DONE;
 })
 
@@ -1634,9 +1646,32 @@ 
 		     (match_operand 4 "const_2_to_3_operand" "")])))]
   "VECTOR_MEM_VSX_P (<MODE>mode)"
 {
-  int mask = (INTVAL (operands[3]) << 1) | (INTVAL (operands[4]) - 2);
+  int op3, op4, mask;
+
+  /* For little endian, swap operands and invert/swap selectors
+     to get the correct xxpermdi.  The operand swap sets up the
+     inputs as a little endian array.  The selectors are swapped
+     because they are defined to use big endian ordering.  The
+     selectors are inverted to get the correct doublewords for
+     little endian ordering.  */
+  if (BYTES_BIG_ENDIAN)
+    {
+      op3 = INTVAL (operands[3]);
+      op4 = INTVAL (operands[4]);
+    }
+  else
+    {
+      op3 = 3 - INTVAL (operands[4]);
+      op4 = 3 - INTVAL (operands[3]);
+    }
+
+  mask = (op3 << 1) | (op4 - 2);
   operands[3] = GEN_INT (mask);
-  return "xxpermdi %x0,%x1,%x2,%3";
+
+  if (BYTES_BIG_ENDIAN)
+    return "xxpermdi %x0,%x1,%x2,%3";
+  else
+    return "xxpermdi %x0,%x2,%x1,%3";
 }
   [(set_attr "type" "vecperm")])
 
@@ -1655,24 +1690,56 @@ 
 
 ;; Expanders for builtins
 (define_expand "vsx_mergel_<mode>"
-  [(set (match_operand:VSX_D 0 "vsx_register_operand" "")
-	(vec_select:VSX_D
-	  (vec_concat:<VS_double>
-	    (match_operand:VSX_D 1 "vsx_register_operand" "")
-	    (match_operand:VSX_D 2 "vsx_register_operand" ""))
-	  (parallel [(const_int 1) (const_int 3)])))]
+  [(use (match_operand:VSX_D 0 "vsx_register_operand" ""))
+   (use (match_operand:VSX_D 1 "vsx_register_operand" ""))
+   (use (match_operand:VSX_D 2 "vsx_register_operand" ""))]
   "VECTOR_MEM_VSX_P (<MODE>mode)"
-  "")
+{
+  rtvec v;
+  rtx x;
+
+  /* Special handling for LE with -maltivec=be.  */
+  if (!BYTES_BIG_ENDIAN && VECTOR_ELT_ORDER_BIG)
+    {
+      v = gen_rtvec (2, GEN_INT (0), GEN_INT (2));
+      x = gen_rtx_VEC_CONCAT (<VS_double>mode, operands[2], operands[1]);
+    }
+  else
+    {
+      v = gen_rtvec (2, GEN_INT (1), GEN_INT (3));
+      x = gen_rtx_VEC_CONCAT (<VS_double>mode, operands[1], operands[2]);
+    }
+
+  x = gen_rtx_VEC_SELECT (<MODE>mode, x, gen_rtx_PARALLEL (VOIDmode, v));
+  emit_insn (gen_rtx_SET (VOIDmode, operands[0], x));
+  DONE;
+})
 
 (define_expand "vsx_mergeh_<mode>"
-  [(set (match_operand:VSX_D 0 "vsx_register_operand" "")
-	(vec_select:VSX_D
-	  (vec_concat:<VS_double>
-	    (match_operand:VSX_D 1 "vsx_register_operand" "")
-	    (match_operand:VSX_D 2 "vsx_register_operand" ""))
-	  (parallel [(const_int 0) (const_int 2)])))]
+  [(use (match_operand:VSX_D 0 "vsx_register_operand" ""))
+   (use (match_operand:VSX_D 1 "vsx_register_operand" ""))
+   (use (match_operand:VSX_D 2 "vsx_register_operand" ""))]
   "VECTOR_MEM_VSX_P (<MODE>mode)"
-  "")
+{
+  rtvec v;
+  rtx x;
+
+  /* Special handling for LE with -maltivec=be.  */
+  if (!BYTES_BIG_ENDIAN && VECTOR_ELT_ORDER_BIG)
+    {
+      v = gen_rtvec (2, GEN_INT (1), GEN_INT (3));
+      x = gen_rtx_VEC_CONCAT (<VS_double>mode, operands[2], operands[1]);
+    }
+  else
+    {
+      v = gen_rtvec (2, GEN_INT (0), GEN_INT (2));
+      x = gen_rtx_VEC_CONCAT (<VS_double>mode, operands[1], operands[2]);
+    }
+
+  x = gen_rtx_VEC_SELECT (<MODE>mode, x, gen_rtx_PARALLEL (VOIDmode, v));
+  emit_insn (gen_rtx_SET (VOIDmode, operands[0], x));
+  DONE;
+})
 
 ;; V2DF/V2DI splat
 (define_insn "vsx_splat_<mode>"
@@ -1698,6 +1765,20 @@ 
 	  (parallel
 	   [(match_operand:QI 2 "u5bit_cint_operand" "i,i")]))))]
   "VECTOR_MEM_VSX_P (<MODE>mode)"
+{
+  if (!BYTES_BIG_ENDIAN)
+    operands[2] = GEN_INT (3 - INTVAL (operands[2]));
+
+  return "xxspltw %x0,%x1,%2";
+}
+  [(set_attr "type" "vecperm")])
+
+(define_insn "vsx_xxspltw_<mode>_direct"
+  [(set (match_operand:VSX_W 0 "vsx_register_operand" "=wf,?wa")
+        (unspec:VSX_W [(match_operand:VSX_W 1 "vsx_register_operand" "wf,wa")
+                       (match_operand:QI 2 "u5bit_cint_operand" "i,i")]
+                      UNSPEC_VSX_XXSPLTW))]
+  "VECTOR_MEM_VSX_P (<MODE>mode)"
   "xxspltw %x0,%x1,%2"
   [(set_attr "type" "vecperm")])
 
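The selector arithmetic in the xxpermdi hunks above is easy to
sanity-check in plain C.  The sketch below models the architectural
instruction (big-endian doubleword numbering) and verifies that the LE
path -- invert and swap the selectors, swap the inputs -- reproduces the
GCC-level request; the array encoding and variable names are
illustrative only:

#include <stdio.h>
#include <stdint.h>

/* Architectural xxpermdi model, BE doubleword numbering:
   result dword 0 = a.dword[DM>>1], result dword 1 = b.dword[DM&1].  */
static void
xxpermdi (const int64_t a[2], const int64_t b[2], int dm, int64_t r[2])
{
  r[0] = a[(dm >> 1) & 1];
  r[1] = b[dm & 1];
}

int
main (void)
{
  /* GCC-level request: element 0 = cat(x,y)[op3], element 1 = cat(x,y)[op4]. */
  int op3 = 1, op4 = 2;
  int64_t x[2] = {10, 11}, y[2] = {20, 21};    /* GCC element order */
  int64_t cat[4] = {x[0], x[1], y[0], y[1]};

  /* On LE, GCC element i of a V2DI lives in BE dword (1 - i).  */
  int64_t x_reg[2] = {x[1], x[0]}, y_reg[2] = {y[1], y[0]}, r_reg[2];

  /* The patch's LE path: invert and swap the selectors, swap the inputs.  */
  int dm = ((3 - op4) << 1) | ((3 - op3) - 2);
  xxpermdi (y_reg, x_reg, dm, r_reg);

  /* Reading the result back in GCC element order recovers the request.  */
  printf ("got {%lld,%lld}, want {%lld,%lld}\n",
          (long long) r_reg[1], (long long) r_reg[0],
          (long long) cat[op3], (long long) cat[op4]);
  return 0;
}

All four (op3, op4) combinations in {0,1} x {2,3} come out matching.
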
Index: gcc-4_8-test/gcc/testsuite/gcc.dg/vmx/3b-15.c
===================================================================
--- gcc-4_8-test.orig/gcc/testsuite/gcc.dg/vmx/3b-15.c
+++ gcc-4_8-test/gcc/testsuite/gcc.dg/vmx/3b-15.c
@@ -3,11 +3,7 @@ 
 vector unsigned char
 f (vector unsigned char a, vector unsigned char b, vector unsigned char c)
 {
-#ifdef __BIG_ENDIAN__
   return vec_perm(a,b,c); 
-#else
-  return vec_perm(b,a,c);
-#endif
 }
 
 static void test()
@@ -16,13 +12,8 @@  static void test()
 					    8,9,10,11,12,13,14,15}),
 		     ((vector unsigned char){70,71,72,73,74,75,76,77,
 					    78,79,80,81,82,83,84,85}),
-#ifdef __BIG_ENDIAN__
 		     ((vector unsigned char){0x1,0x14,0x18,0x10,0x16,0x15,0x19,0x1a,
 					    0x1c,0x1c,0x1c,0x12,0x8,0x1d,0x1b,0xe})),
-#else
-                     ((vector unsigned char){0x1e,0xb,0x7,0xf,0x9,0xa,0x6,0x5,
-                                            0x3,0x3,0x3,0xd,0x17,0x2,0x4,0x11})),
-#endif
 		   ((vector unsigned char){1,74,78,70,76,75,79,80,82,82,82,72,8,83,81,14})),
 	"f");
 }
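
The LE arm of this test becomes unnecessary because
altivec_expand_vec_perm_le (reached through the altivec_vperm_<mode>
expander above) performs the swap-and-complement internally.  A
stand-alone C model of that lowering, reusing the selector from this
test -- the byte numbering and helper names here are illustrative, not
the GCC implementation:

#include <stdio.h>
#include <stdint.h>

/* Architectural vperm model, BE byte numbering within the registers:
   r[j] = cat(x,y)[sel[j] & 31].  */
static void
vperm (const uint8_t x[16], const uint8_t y[16],
       const uint8_t sel[16], uint8_t r[16])
{
  for (int j = 0; j < 16; j++)
    {
      int s = sel[j] & 31;
      r[j] = s < 16 ? x[s] : y[s - 16];
    }
}

/* On LE, user byte i of a vector sits at BE register byte 15 - i.  */
static void
to_reg (const uint8_t user[16], uint8_t reg[16])
{
  for (int i = 0; i < 16; i++)
    reg[15 - i] = user[i];
}

int
main (void)
{
  uint8_t a[16], b[16], want[16];
  uint8_t c[16] = {1,20,24,16,22,21,25,26,28,28,28,18,8,29,27,14};

  for (int i = 0; i < 16; i++)
    {
      a[i] = i;
      b[i] = 70 + i;
    }

  /* Source-level vec_perm semantics: want[i] = cat(a,b)[c[i]].  */
  for (int i = 0; i < 16; i++)
    want[i] = c[i] < 16 ? a[c[i]] : b[c[i] - 16];

  /* LE lowering: complement the selector and swap the two inputs,
     then issue the hardware vperm.  */
  uint8_t a_r[16], b_r[16], c_r[16], r_r[16];
  to_reg (a, a_r);
  to_reg (b, b_r);
  for (int i = 0; i < 16; i++)
    c[i] = ~c[i];
  to_reg (c, c_r);
  vperm (b_r, a_r, c_r, r_r);

  int ok = 1;
  for (int i = 0; i < 16; i++)
    ok &= (r_r[15 - i] == want[i]);   /* read result in user order */
  printf ("%s\n", ok ? "match" : "MISMATCH");
  return 0;
}

Note that 31 - s equals ~s in the five bits vperm examines, which is why
a single complement of the selector register suffices.
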
Index: gcc-4_8-test/gcc/testsuite/gcc.dg/vmx/eg-5.c
===================================================================
--- gcc-4_8-test.orig/gcc/testsuite/gcc.dg/vmx/eg-5.c
+++ gcc-4_8-test/gcc/testsuite/gcc.dg/vmx/eg-5.c
@@ -6,19 +6,10 @@  matvecmul4 (vector float c0, vector floa
 {
   /* Set result to a vector of f32 0's */
   vector float result = ((vector float){0.,0.,0.,0.});
-
-#ifdef __LITTLE_ENDIAN__
-  result  = vec_madd (c0, vec_splat (v, 3), result);
-  result  = vec_madd (c1, vec_splat (v, 2), result);
-  result  = vec_madd (c2, vec_splat (v, 1), result);
-  result  = vec_madd (c3, vec_splat (v, 0), result);
-#else
   result  = vec_madd (c0, vec_splat (v, 0), result);
   result  = vec_madd (c1, vec_splat (v, 1), result);
   result  = vec_madd (c2, vec_splat (v, 2), result);
   result  = vec_madd (c3, vec_splat (v, 3), result);
-#endif
-
   return result;
 }
 
Index: gcc-4_8-test/gcc/testsuite/gcc.dg/vmx/extract-be-order.c
===================================================================
--- /dev/null
+++ gcc-4_8-test/gcc/testsuite/gcc.dg/vmx/extract-be-order.c
@@ -0,0 +1,33 @@ 
+/* { dg-options "-maltivec=be -mabi=altivec -std=gnu99 -mno-vsx" } */
+
+#include "harness.h"
+
+static void test()
+{
+  vector unsigned char va = {0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15};
+  vector signed char vb = {-8,-7,-6,-5,-4,-3,-2,-1,0,1,2,3,4,5,6,7};
+  vector unsigned short vc = {0,1,2,3,4,5,6,7};
+  vector signed short vd = {-4,-3,-2,-1,0,1,2,3};
+  vector unsigned int ve = {0,1,2,3};
+  vector signed int vf = {-2,-1,0,1};
+  vector float vg = {-2.0f,-1.0f,0.0f,1.0f};
+
+#if __BYTE_ORDER__ == __ORDER_LITTLE_ENDIAN__
+  check (vec_extract (va, 5) == 10, "vec_extract (va, 5)");
+  check (vec_extract (vb, 0) == 7, "vec_extract (vb, 0)");
+  check (vec_extract (vc, 7) == 0, "vec_extract (vc, 7)");
+  check (vec_extract (vd, 3) == 0, "vec_extract (vd, 3)");
+  check (vec_extract (ve, 2) == 1, "vec_extract (ve, 2)");
+  check (vec_extract (vf, 1) == 0, "vec_extract (vf, 1)");
+  check (vec_extract (vg, 0) == 1.0f, "vec_extract (vg, 0)");
+#else
+  check (vec_extract (va, 5) == 5, "vec_extract (va, 5)");
+  check (vec_extract (vb, 0) == -8, "vec_extract (vb, 0)");
+  check (vec_extract (vc, 7) == 7, "vec_extract (vc, 7)");
+  check (vec_extract (vd, 3) == -1, "vec_extract (vd, 3)");
+  check (vec_extract (ve, 2) == 2, "vec_extract (ve, 2)");
+  check (vec_extract (vf, 1) == -1, "vec_extract (vf, 1)");
+  check (vec_extract (vg, 0) == -2.0f, "vec_extract (vg, 0)");
+#endif
+}
+
Index: gcc-4_8-test/gcc/testsuite/gcc.dg/vmx/extract-vsx-be-order.c
===================================================================
--- /dev/null
+++ gcc-4_8-test/gcc/testsuite/gcc.dg/vmx/extract-vsx-be-order.c
@@ -0,0 +1,19 @@ 
+/* { dg-skip-if "" { powerpc*-*-darwin* } { "*" } { "" } } */
+/* { dg-require-effective-target powerpc_vsx_ok } */
+/* { dg-options "-maltivec=be -mabi=altivec -std=gnu99 -mvsx" } */
+
+#include "harness.h"
+
+static void test()
+{
+  vector long long vl = {0, 1};
+  vector double vd = {0.0, 1.0};
+
+#if __BYTE_ORDER__ == __ORDER_LITTLE_ENDIAN__
+  check (vec_extract (vl, 0) == 1, "vl, 0");
+  check (vec_extract (vd, 1) == 0.0, "vd, 1");
+#else
+  check (vec_extract (vl, 0) == 0, "vl, 0");
+  check (vec_extract (vd, 1) == 1.0, "vd, 1");
+#endif
+}
Index: gcc-4_8-test/gcc/testsuite/gcc.dg/vmx/extract-vsx.c
===================================================================
--- /dev/null
+++ gcc-4_8-test/gcc/testsuite/gcc.dg/vmx/extract-vsx.c
@@ -0,0 +1,16 @@ 
+/* { dg-skip-if "" { powerpc*-*-darwin* } { "*" } { "" } } */
+/* { dg-require-effective-target powerpc_vsx_ok } */
+/* { dg-options "-maltivec -mabi=altivec -std=gnu99 -mvsx" } */
+
+#include "harness.h"
+
+static void test()
+{
+  vector long long vl = {0, 1};
+  vector double vd = {0.0, 1.0};
+
+  check (vec_extract (vl, 0) == 0, "vec_extract, vl, 0");
+  check (vec_extract (vd, 1) == 1.0, "vec_extract, vd, 1");
+  check (vl[0] == 0, "[], vl, 0");
+  check (vd[1] == 1.0, "[], vd, 1");
+}
Index: gcc-4_8-test/gcc/testsuite/gcc.dg/vmx/extract.c
===================================================================
--- /dev/null
+++ gcc-4_8-test/gcc/testsuite/gcc.dg/vmx/extract.c
@@ -0,0 +1,21 @@ 
+#include "harness.h"
+
+static void test()
+{
+  vector unsigned char va = {0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15};
+  vector signed char vb = {-8,-7,-6,-5,-4,-3,-2,-1,0,1,2,3,4,5,6,7};
+  vector unsigned short vc = {0,1,2,3,4,5,6,7};
+  vector signed short vd = {-4,-3,-2,-1,0,1,2,3};
+  vector unsigned int ve = {0,1,2,3};
+  vector signed int vf = {-2,-1,0,1};
+  vector float vg = {-2.0f,-1.0f,0.0f,1.0f};
+
+  check (vec_extract (va, 5) == 5, "vec_extract (va, 5)");
+  check (vec_extract (vb, 0) == -8, "vec_extract (vb, 0)");
+  check (vec_extract (vc, 7) == 7, "vec_extract (vc, 7)");
+  check (vec_extract (vd, 3) == -1, "vec_extract (vd, 3)");
+  check (vec_extract (ve, 2) == 2, "vec_extract (ve, 2)");
+  check (vec_extract (vf, 1) == -1, "vec_extract (vf, 1)");
+  check (vec_extract (vg, 0) == -2.0f, "vec_extract (vg, 0)");
+}
+
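The -maltivec=be expectations in extract-be-order.c (and in the
insert-*-be-order.c tests below) all follow one rule: user index i of an
n-element vector addresses the element the native LE order calls
n - 1 - i.  A trivial C model, with an invented helper name:

#include <stdio.h>

/* Under -maltivec=be on an LE target, user index i of an n-element
   vector maps to native element (n - 1 - i).  */
static int
be_index (int n, int i)
{
  return n - 1 - i;
}

int
main (void)
{
  int ve[4] = {0, 1, 2, 3};          /* native (LE) element order */

  /* extract-be-order.c: vec_extract (ve, 2) is expected to be 1.  */
  printf ("vec_extract (ve, 2) -> %d\n", ve[be_index (4, 2)]);

  /* insert-be-order.c: vec_insert (16, ve, 2) yields {0,16,2,3}.  */
  ve[be_index (4, 2)] = 16;
  printf ("ve -> {%d,%d,%d,%d}\n", ve[0], ve[1], ve[2], ve[3]);
  return 0;
}
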
Index: gcc-4_8-test/gcc/testsuite/gcc.dg/vmx/insert-be-order.c
===================================================================
--- /dev/null
+++ gcc-4_8-test/gcc/testsuite/gcc.dg/vmx/insert-be-order.c
@@ -0,0 +1,65 @@ 
+/* { dg-options "-maltivec=be -mabi=altivec -std=gnu99 -mno-vsx" } */
+
+#include "harness.h"
+
+static void test()
+{
+  vector unsigned char va = {0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15};
+  vector signed char vb = {-8,-7,-6,-5,-4,-3,-2,-1,0,1,2,3,4,5,6,7};
+  vector unsigned short vc = {0,1,2,3,4,5,6,7};
+  vector signed short vd = {-4,-3,-2,-1,0,1,2,3};
+  vector unsigned int ve = {0,1,2,3};
+  vector signed int vf = {-2,-1,0,1};
+  vector float vg = {-2.0f,-1.0f,0.0f,1.0f};
+
+#if __BYTE_ORDER__ == __ORDER_LITTLE_ENDIAN__
+  check (vec_all_eq (vec_insert (16, va, 5),
+		     ((vector unsigned char)
+		       {0,1,2,3,4,5,6,7,8,9,16,11,12,13,14,15})),
+	 "vec_insert (va LE)");
+  check (vec_all_eq (vec_insert (-16, vb, 0),
+		     ((vector signed char)
+		       {-8,-7,-6,-5,-4,-3,-2,-1,0,1,2,3,4,5,6,-16})),
+	 "vec_insert (vb LE)");
+  check (vec_all_eq (vec_insert (16, vc, 7),
+		     ((vector unsigned short){16,1,2,3,4,5,6,7})),
+	 "vec_insert (vc LE)");
+  check (vec_all_eq (vec_insert (-16, vd, 3),
+		     ((vector signed short){-4,-3,-2,-1,-16,1,2,3})),
+	 "vec_insert (vd LE)");
+  check (vec_all_eq (vec_insert (16, ve, 2),
+		     ((vector unsigned int){0,16,2,3})),
+	 "vec_insert (ve LE)");
+  check (vec_all_eq (vec_insert (-16, vf, 1),
+		     ((vector signed int){-2,-1,-16,1})),
+	 "vec_insert (vf LE)");
+  check (vec_all_eq (vec_insert (-16.0f, vg, 0),
+		     ((vector float){-2.0f,-1.0f,0.0f,-16.0f})),
+	 "vec_insert (vg LE)");
+#else
+  check (vec_all_eq (vec_insert (16, va, 5),
+		     ((vector unsigned char)
+		       {0,1,2,3,4,16,6,7,8,9,10,11,12,13,14,15})),
+	 "vec_insert (va BE)");
+  check (vec_all_eq (vec_insert (-16, vb, 0),
+		     ((vector signed char)
+		       {-16,-7,-6,-5,-4,-3,-2,-1,0,1,2,3,4,5,6,7})),
+	 "vec_insert (vb BE)");
+  check (vec_all_eq (vec_insert (16, vc, 7),
+		     ((vector unsigned short){0,1,2,3,4,5,6,16})),
+	 "vec_insert (vc BE)");
+  check (vec_all_eq (vec_insert (-16, vd, 3),
+		     ((vector signed short){-4,-3,-2,-16,0,1,2,3})),
+	 "vec_insert (vd BE)");
+  check (vec_all_eq (vec_insert (16, ve, 2),
+		     ((vector unsigned int){0,1,16,3})),
+	 "vec_insert (ve BE)");
+  check (vec_all_eq (vec_insert (-16, vf, 1),
+		     ((vector signed int){-2,-16,0,1})),
+	 "vec_insert (vf BE)");
+  check (vec_all_eq (vec_insert (-16.0f, vg, 0),
+		     ((vector float){-16.0f,-1.0f,0.0f,1.0f})),
+	 "vec_insert (vg BE)");
+#endif
+}
+
Index: gcc-4_8-test/gcc/testsuite/gcc.dg/vmx/insert-vsx-be-order.c
===================================================================
--- /dev/null
+++ gcc-4_8-test/gcc/testsuite/gcc.dg/vmx/insert-vsx-be-order.c
@@ -0,0 +1,34 @@ 
+/* { dg-skip-if "" { powerpc*-*-darwin* } { "*" } { "" } } */
+/* { dg-require-effective-target powerpc_vsx_ok } */
+/* { dg-options "-maltivec=be -mabi=altivec -std=gnu99 -mvsx" } */
+
+#include "harness.h"
+
+static int vec_long_long_eq (vector long long x, vector long long y)
+{
+  return (x[0] == y[0] && x[1] == y[1]);
+}
+
+static int vec_dbl_eq (vector double x, vector double y)
+{
+  return (x[0] == y[0] && x[1] == y[1]);
+}
+
+static void test()
+{
+  vector long long vl = {0, 1};
+  vector double vd = {0.0, 1.0};
+  vector long long vlr = vec_insert (2, vl, 0);
+  vector double vdr = vec_insert (2.0, vd, 1);
+
+#if __BYTE_ORDER__ == __ORDER_LITTLE_ENDIAN__
+  vector long long vler = {0, 2};
+  vector double vder = {2.0, 1.0};
+#else
+  vector long long vler = {2, 1};
+  vector double vder = {0.0, 2.0};
+#endif
+
+  check (vec_long_long_eq (vlr, vler), "vl");
+  check (vec_dbl_eq (vdr, vder), "vd");
+}
Index: gcc-4_8-test/gcc/testsuite/gcc.dg/vmx/insert-vsx.c
===================================================================
--- /dev/null
+++ gcc-4_8-test/gcc/testsuite/gcc.dg/vmx/insert-vsx.c
@@ -0,0 +1,28 @@ 
+/* { dg-skip-if "" { powerpc*-*-darwin* } { "*" } { "" } } */
+/* { dg-require-effective-target powerpc_vsx_ok } */
+/* { dg-options "-maltivec -mabi=altivec -std=gnu99 -mvsx" } */
+
+#include "harness.h"
+
+static int vec_long_long_eq (vector long long x, vector long long y)
+{
+  return (x[0] == y[0] && x[1] == y[1]);
+}
+
+static int vec_dbl_eq (vector double x, vector double y)
+{
+  return (x[0] == y[0] && x[1] == y[1]);
+}
+
+static void test()
+{
+  vector long long vl = {0, 1};
+  vector double vd = {0.0, 1.0};
+  vector long long vlr = vec_insert (2, vl, 0);
+  vector double vdr = vec_insert (2.0, vd, 1);
+  vector long long vler = {2, 1};
+  vector double vder = {0.0, 2.0};
+
+  check (vec_long_long_eq (vlr, vler), "vl");
+  check (vec_dbl_eq (vdr, vder), "vd");
+}
Index: gcc-4_8-test/gcc/testsuite/gcc.dg/vmx/insert.c
===================================================================
--- /dev/null
+++ gcc-4_8-test/gcc/testsuite/gcc.dg/vmx/insert.c
@@ -0,0 +1,37 @@ 
+#include "harness.h"
+
+static void test()
+{
+  vector unsigned char va = {0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15};
+  vector signed char vb = {-8,-7,-6,-5,-4,-3,-2,-1,0,1,2,3,4,5,6,7};
+  vector unsigned short vc = {0,1,2,3,4,5,6,7};
+  vector signed short vd = {-4,-3,-2,-1,0,1,2,3};
+  vector unsigned int ve = {0,1,2,3};
+  vector signed int vf = {-2,-1,0,1};
+  vector float vg = {-2.0f,-1.0f,0.0f,1.0f};
+
+  check (vec_all_eq (vec_insert (16, va, 5),
+		     ((vector unsigned char)
+		      {0,1,2,3,4,16,6,7,8,9,10,11,12,13,14,15})),
+	 "vec_insert (va)");
+  check (vec_all_eq (vec_insert (-16, vb, 0),
+		     ((vector signed char)
+		      {-16,-7,-6,-5,-4,-3,-2,-1,0,1,2,3,4,5,6,7})),
+	 "vec_insert (vb)");
+  check (vec_all_eq (vec_insert (16, vc, 7),
+		     ((vector unsigned short){0,1,2,3,4,5,6,16})),
+	 "vec_insert (vc)");
+  check (vec_all_eq (vec_insert (-16, vd, 3),
+		     ((vector signed short){-4,-3,-2,-16,0,1,2,3})),
+	 "vec_insert (vd)");
+  check (vec_all_eq (vec_insert (16, ve, 2),
+		     ((vector unsigned int){0,1,16,3})),
+	 "vec_insert (ve)");
+  check (vec_all_eq (vec_insert (-16, vf, 1),
+		     ((vector signed int){-2,-16,0,1})),
+	 "vec_insert (vf)");
+  check (vec_all_eq (vec_insert (-16.0f, vg, 0),
+		     ((vector float){-16.0f,-1.0f,0.0f,1.0f})),
+	 "vec_insert (vg)");
+}
+
Index: gcc-4_8-test/gcc/testsuite/gcc.dg/vmx/ld-be-order.c
===================================================================
--- /dev/null
+++ gcc-4_8-test/gcc/testsuite/gcc.dg/vmx/ld-be-order.c
@@ -0,0 +1,107 @@ 
+/* { dg-options "-maltivec=be -mabi=altivec -std=gnu99 -mno-vsx" } */
+
+#include "harness.h"
+
+static unsigned char svuc[16] __attribute__ ((aligned (16)));
+static signed char svsc[16] __attribute__ ((aligned (16)));
+static unsigned char svbc[16] __attribute__ ((aligned (16)));
+static unsigned short svus[8] __attribute__ ((aligned (16)));
+static signed short svss[8] __attribute__ ((aligned (16)));
+static unsigned short svbs[8] __attribute__ ((aligned (16)));
+static unsigned short svp[8] __attribute__ ((aligned (16)));
+static unsigned int svui[4] __attribute__ ((aligned (16)));
+static signed int svsi[4] __attribute__ ((aligned (16)));
+static unsigned int svbi[4] __attribute__ ((aligned (16)));
+static float svf[4] __attribute__ ((aligned (16)));
+
+static void init ()
+{
+  unsigned int i;
+  for (i = 0; i < 16; ++i)
+    {
+      svuc[i] = i;
+      svsc[i] = i - 8;
+      svbc[i] = (i % 2) ? 0xff : 0;
+    }
+  for (i = 0; i < 8; ++i)
+    {
+      svus[i] = i;
+      svss[i] = i - 4;
+      svbs[i] = (i % 2) ? 0xffff : 0;
+      svp[i] = i;
+    }
+  for (i = 0; i < 4; ++i)
+    {
+      svui[i] = i;
+      svsi[i] = i - 2;
+      svbi[i] = (i % 2) ? 0xffffffff : 0;
+      svf[i] = i * 1.0f;
+    }
+}
+
+static void test ()
+{
+#if __BYTE_ORDER__ == __ORDER_LITTLE_ENDIAN__
+  vector unsigned char evuc = {15,14,13,12,11,10,9,8,7,6,5,4,3,2,1,0};
+  vector signed char evsc = {7,6,5,4,3,2,1,0,-1,-2,-3,-4,-5,-6,-7,-8};
+  vector bool char evbc = {255,0,255,0,255,0,255,0,255,0,255,0,255,0,255,0};
+  vector unsigned short evus = {7,6,5,4,3,2,1,0};
+  vector signed short evss = {3,2,1,0,-1,-2,-3,-4};
+  vector bool short evbs = {65535,0,65535,0,65535,0,65535,0};
+  vector pixel evp = {7,6,5,4,3,2,1,0};
+  vector unsigned int evui = {3,2,1,0};
+  vector signed int evsi = {1,0,-1,-2};
+  vector bool int evbi = {0xffffffff,0,0xffffffff,0};
+  vector float evf = {3.0,2.0,1.0,0.0};
+#else
+  vector unsigned char evuc = {0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15};
+  vector signed char evsc = {-8,-7,-6,-5,-4,-3,-2,-1,0,1,2,3,4,5,6,7};
+  vector bool char evbc = {0,255,0,255,0,255,0,255,0,255,0,255,0,255,0,255};
+  vector unsigned short evus = {0,1,2,3,4,5,6,7};
+  vector signed short evss = {-4,-3,-2,-1,0,1,2,3};
+  vector bool short evbs = {0,65535,0,65535,0,65535,0,65535};
+  vector pixel evp = {0,1,2,3,4,5,6,7};
+  vector unsigned int evui = {0,1,2,3};
+  vector signed int evsi = {-2,-1,0,1};
+  vector bool int evbi = {0,0xffffffff,0,0xffffffff};
+  vector float evf = {0.0,1.0,2.0,3.0};
+#endif
+
+  vector unsigned char vuc;
+  vector signed char vsc;
+  vector bool char vbc;
+  vector unsigned short vus;
+  vector signed short vss;
+  vector bool short vbs;
+  vector pixel vp;
+  vector unsigned int vui;
+  vector signed int vsi;
+  vector bool int vbi;
+  vector float vf;
+
+  init ();
+
+  vuc = vec_ld (0, (vector unsigned char *)svuc);
+  vsc = vec_ld (0, (vector signed char *)svsc);
+  vbc = vec_ld (0, (vector bool char *)svbc);
+  vus = vec_ld (0, (vector unsigned short *)svus);
+  vss = vec_ld (0, (vector signed short *)svss);
+  vbs = vec_ld (0, (vector bool short *)svbs);
+  vp  = vec_ld (0, (vector pixel *)svp);
+  vui = vec_ld (0, (vector unsigned int *)svui);
+  vsi = vec_ld (0, (vector signed int *)svsi);
+  vbi = vec_ld (0, (vector bool int *)svbi);
+  vf  = vec_ld (0, (vector float *)svf);
+
+  check (vec_all_eq (vuc, evuc), "vuc");
+  check (vec_all_eq (vsc, evsc), "vsc");
+  check (vec_all_eq (vbc, evbc), "vbc");
+  check (vec_all_eq (vus, evus), "vus");
+  check (vec_all_eq (vss, evss), "vss");
+  check (vec_all_eq (vbs, evbs), "vbs");
+  check (vec_all_eq (vp,  evp ), "vp" );
+  check (vec_all_eq (vui, evui), "vui");
+  check (vec_all_eq (vsi, evsi), "vsi");
+  check (vec_all_eq (vbi, evbi), "vbi");
+  check (vec_all_eq (vf,  evf ), "vf" );
+}
Index: gcc-4_8-test/gcc/testsuite/gcc.dg/vmx/ld-vsx-be-order.c
===================================================================
--- /dev/null
+++ gcc-4_8-test/gcc/testsuite/gcc.dg/vmx/ld-vsx-be-order.c
@@ -0,0 +1,44 @@ 
+/* { dg-skip-if "" { powerpc*-*-darwin* } { "*" } { "" } } */
+/* { dg-require-effective-target powerpc_vsx_ok } */
+/* { dg-options "-maltivec=be -mabi=altivec -std=gnu99 -mvsx" } */
+
+#include "harness.h"
+
+static unsigned long long svul[2] __attribute__ ((aligned (16)));
+static double svd[2] __attribute__ ((aligned (16)));
+
+static void init ()
+{
+  unsigned int i;
+  for (i = 0; i < 2; ++i)
+    {
+      svul[i] = i;
+      svd[i] = i * 1.0;
+    }
+}
+
+static void test ()
+{
+#if __BYTE_ORDER__ == __ORDER_LITTLE_ENDIAN__
+  vector unsigned long long evul = {1,0};
+  vector double evd = {1.0,0.0};
+#else
+  vector unsigned long long evul = {0,1};
+  vector double evd = {0.0,1.0};
+#endif
+
+  vector unsigned long long vul;
+  vector double vd;
+  unsigned i;
+
+  init ();
+
+  vul = vec_ld (0, (vector unsigned long long *)svul);
+  vd  = vec_ld (0, (vector double *)svd);
+
+  for (i = 0; i < 2; ++i)
+    {
+      check (vul[i] == evul[i], "vul");
+      check (vd[i]  == evd[i],  "vd" );
+    }
+}
Index: gcc-4_8-test/gcc/testsuite/gcc.dg/vmx/ld-vsx.c
===================================================================
--- /dev/null
+++ gcc-4_8-test/gcc/testsuite/gcc.dg/vmx/ld-vsx.c
@@ -0,0 +1,39 @@ 
+/* { dg-skip-if "" { powerpc*-*-darwin* } { "*" } { "" } } */
+/* { dg-require-effective-target powerpc_vsx_ok } */
+/* { dg-options "-maltivec -mabi=altivec -std=gnu99 -mvsx" } */
+
+#include "harness.h"
+
+static unsigned long long svul[2] __attribute__ ((aligned (16)));
+static double svd[2] __attribute__ ((aligned (16)));
+
+static void init ()
+{
+  unsigned int i;
+  for (i = 0; i < 2; ++i)
+    {
+      svul[i] = i;
+      svd[i] = i * 1.0;
+    }
+}
+
+static void test ()
+{
+  vector unsigned long long evul = {0,1};
+  vector double evd = {0.0,1.0};
+
+  vector unsigned long long vul;
+  vector double vd;
+  unsigned i;
+
+  init ();
+
+  vul = vec_ld (0, (vector unsigned long long *)svul);
+  vd  = vec_ld (0, (vector double *)svd);
+
+  for (i = 0; i < 2; ++i)
+    {
+      check (vul[i] == evul[i], "vul");
+      check (vd[i]  == evd[i],  "vd" );
+    }
+}
Index: gcc-4_8-test/gcc/testsuite/gcc.dg/vmx/ld.c
===================================================================
--- /dev/null
+++ gcc-4_8-test/gcc/testsuite/gcc.dg/vmx/ld.c
@@ -0,0 +1,91 @@ 
+#include "harness.h"
+
+static unsigned char svuc[16] __attribute__ ((aligned (16)));
+static signed char svsc[16] __attribute__ ((aligned (16)));
+static unsigned char svbc[16] __attribute__ ((aligned (16)));
+static unsigned short svus[8] __attribute__ ((aligned (16)));
+static signed short svss[8] __attribute__ ((aligned (16)));
+static unsigned short svbs[8] __attribute__ ((aligned (16)));
+static unsigned short svp[8] __attribute__ ((aligned (16)));
+static unsigned int svui[4] __attribute__ ((aligned (16)));
+static signed int svsi[4] __attribute__ ((aligned (16)));
+static unsigned int svbi[4] __attribute__ ((aligned (16)));
+static float svf[4] __attribute__ ((aligned (16)));
+
+static void init ()
+{
+  unsigned int i;
+  for (i = 0; i < 16; ++i)
+    {
+      svuc[i] = i;
+      svsc[i] = i - 8;
+      svbc[i] = (i % 2) ? 0xff : 0;
+    }
+  for (i = 0; i < 8; ++i)
+    {
+      svus[i] = i;
+      svss[i] = i - 4;
+      svbs[i] = (i % 2) ? 0xffff : 0;
+      svp[i] = i;
+    }
+  for (i = 0; i < 4; ++i)
+    {
+      svui[i] = i;
+      svsi[i] = i - 2;
+      svbi[i] = (i % 2) ? 0xffffffff : 0;
+      svf[i] = i * 1.0f;
+    }
+}
+
+static void test ()
+{
+  vector unsigned char evuc = {0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15};
+  vector signed char evsc = {-8,-7,-6,-5,-4,-3,-2,-1,0,1,2,3,4,5,6,7};
+  vector bool char evbc = {0,255,0,255,0,255,0,255,0,255,0,255,0,255,0,255};
+  vector unsigned short evus = {0,1,2,3,4,5,6,7};
+  vector signed short evss = {-4,-3,-2,-1,0,1,2,3};
+  vector bool short evbs = {0,65535,0,65535,0,65535,0,65535};
+  vector pixel evp = {0,1,2,3,4,5,6,7};
+  vector unsigned int evui = {0,1,2,3};
+  vector signed int evsi = {-2,-1,0,1};
+  vector bool int evbi = {0,0xffffffff,0,0xffffffff};
+  vector float evf = {0.0,1.0,2.0,3.0};
+
+  vector unsigned char vuc;
+  vector signed char vsc;
+  vector bool char vbc;
+  vector unsigned short vus;
+  vector signed short vss;
+  vector bool short vbs;
+  vector pixel vp;
+  vector unsigned int vui;
+  vector signed int vsi;
+  vector bool int vbi;
+  vector float vf;
+
+  init ();
+
+  vuc = vec_ld (0, (vector unsigned char *)svuc);
+  vsc = vec_ld (0, (vector signed char *)svsc);
+  vbc = vec_ld (0, (vector bool char *)svbc);
+  vus = vec_ld (0, (vector unsigned short *)svus);
+  vss = vec_ld (0, (vector signed short *)svss);
+  vbs = vec_ld (0, (vector bool short *)svbs);
+  vp  = vec_ld (0, (vector pixel *)svp);
+  vui = vec_ld (0, (vector unsigned int *)svui);
+  vsi = vec_ld (0, (vector signed int *)svsi);
+  vbi = vec_ld (0, (vector bool int *)svbi);
+  vf  = vec_ld (0, (vector float *)svf);
+
+  check (vec_all_eq (vuc, evuc), "vuc");
+  check (vec_all_eq (vsc, evsc), "vsc");
+  check (vec_all_eq (vbc, evbc), "vbc");
+  check (vec_all_eq (vus, evus), "vus");
+  check (vec_all_eq (vss, evss), "vss");
+  check (vec_all_eq (vbs, evbs), "vbs");
+  check (vec_all_eq (vp,  evp ), "vp" );
+  check (vec_all_eq (vui, evui), "vui");
+  check (vec_all_eq (vsi, evsi), "vsi");
+  check (vec_all_eq (vbi, evbi), "vbi");
+  check (vec_all_eq (vf,  evf ), "vf" );
+}
Index: gcc-4_8-test/gcc/testsuite/gcc.dg/vmx/lde-be-order.c
===================================================================
--- /dev/null
+++ gcc-4_8-test/gcc/testsuite/gcc.dg/vmx/lde-be-order.c
@@ -0,0 +1,73 @@ 
+/* { dg-options "-maltivec=be -mabi=altivec -std=gnu99 -mno-vsx" } */
+
+#include "harness.h"
+
+static unsigned char svuc[16] __attribute__ ((aligned (16)));
+static signed char svsc[16] __attribute__ ((aligned (16)));
+static unsigned short svus[8] __attribute__ ((aligned (16)));
+static signed short svss[8] __attribute__ ((aligned (16)));
+static unsigned int svui[4] __attribute__ ((aligned (16)));
+static signed int svsi[4] __attribute__ ((aligned (16)));
+static float svf[4] __attribute__ ((aligned (16)));
+
+static void init ()
+{
+  int i;
+#if __BYTE_ORDER__ == __ORDER_LITTLE_ENDIAN__
+  for (i = 15; i >= 0; --i)
+#else
+  for (i = 0; i < 16; ++i)
+#endif
+    {
+      svuc[i] = i;
+      svsc[i] = i - 8;
+    }
+#if __BYTE_ORDER__ == __ORDER_LITTLE_ENDIAN__
+  for (i = 7; i >= 0; --i)
+#else
+  for (i = 0; i < 8; ++i)
+#endif
+    {
+      svus[i] = i;
+      svss[i] = i - 4;
+    }
+#if __BYTE_ORDER__ == __ORDER_LITTLE_ENDIAN__
+  for (i = 3; i >= 0; --i)
+#else
+  for (i = 0; i < 4; ++i)
+#endif
+    {
+      svui[i] = i;
+      svsi[i] = i - 2;
+      svf[i] = i * 1.0f;
+    }
+}
+
+static void test ()
+{
+  vector unsigned char vuc;
+  vector signed char vsc;
+  vector unsigned short vus;
+  vector signed short vss;
+  vector unsigned int vui;
+  vector signed int vsi;
+  vector float vf;
+
+  init ();
+
+  vuc = vec_lde (9*1, (unsigned char *)svuc);
+  vsc = vec_lde (14*1, (signed char *)svsc);
+  vus = vec_lde (7*2, (unsigned short *)svus);
+  vss = vec_lde (1*2, (signed short *)svss);
+  vui = vec_lde (3*4, (unsigned int *)svui);
+  vsi = vec_lde (2*4, (signed int *)svsi);
+  vf  = vec_lde (0*4, (float *)svf);
+
+  check (vec_extract (vuc, 9) == 9, "vuc");
+  check (vec_extract (vsc, 14) == 6, "vsc");
+  check (vec_extract (vus, 7) == 7, "vus");
+  check (vec_extract (vss, 1) == -3, "vss");
+  check (vec_extract (vui, 3) == 3, "vui");
+  check (vec_extract (vsi, 2) == 0, "vsi");
+  check (vec_extract (vf,  0) == 0.0, "vf");
+}
Index: gcc-4_8-test/gcc/testsuite/gcc.dg/vmx/lde.c
===================================================================
--- /dev/null
+++ gcc-4_8-test/gcc/testsuite/gcc.dg/vmx/lde.c
@@ -0,0 +1,59 @@ 
+#include "harness.h"
+
+static unsigned char svuc[16] __attribute__ ((aligned (16)));
+static signed char svsc[16] __attribute__ ((aligned (16)));
+static unsigned short svus[8] __attribute__ ((aligned (16)));
+static signed short svss[8] __attribute__ ((aligned (16)));
+static unsigned int svui[4] __attribute__ ((aligned (16)));
+static signed int svsi[4] __attribute__ ((aligned (16)));
+static float svf[4] __attribute__ ((aligned (16)));
+
+static void init ()
+{
+  unsigned int i;
+  for (i = 0; i < 16; ++i)
+    {
+      svuc[i] = i;
+      svsc[i] = i - 8;
+    }
+  for (i = 0; i < 8; ++i)
+    {
+      svus[i] = i;
+      svss[i] = i - 4;
+    }
+  for (i = 0; i < 4; ++i)
+    {
+      svui[i] = i;
+      svsi[i] = i - 2;
+      svf[i] = i * 1.0f;
+    }
+}
+
+static void test ()
+{
+  vector unsigned char vuc;
+  vector signed char vsc;
+  vector unsigned short vus;
+  vector signed short vss;
+  vector unsigned int vui;
+  vector signed int vsi;
+  vector float vf;
+
+  init ();
+
+  vuc = vec_lde (9*1, (unsigned char *)svuc);
+  vsc = vec_lde (14*1, (signed char *)svsc);
+  vus = vec_lde (7*2, (unsigned short *)svus);
+  vss = vec_lde (1*2, (signed short *)svss);
+  vui = vec_lde (3*4, (unsigned int *)svui);
+  vsi = vec_lde (2*4, (signed int *)svsi);
+  vf  = vec_lde (0*4, (float *)svf);
+
+  check (vec_extract (vuc, 9) == 9, "vuc");
+  check (vec_extract (vsc, 14) == 6, "vsc");
+  check (vec_extract (vus, 7) == 7, "vus");
+  check (vec_extract (vss, 1) == -3, "vss");
+  check (vec_extract (vui, 3) == 3, "vui");
+  check (vec_extract (vsi, 2) == 0, "vsi");
+  check (vec_extract (vf,  0) == 0.0, "vf");
+}
Index: gcc-4_8-test/gcc/testsuite/gcc.dg/vmx/ldl-be-order.c
===================================================================
--- /dev/null
+++ gcc-4_8-test/gcc/testsuite/gcc.dg/vmx/ldl-be-order.c
@@ -0,0 +1,107 @@ 
+/* { dg-options "-maltivec=be -mabi=altivec -std=gnu99 -mno-vsx" } */
+
+#include "harness.h"
+
+static unsigned char svuc[16] __attribute__ ((aligned (16)));
+static signed char svsc[16] __attribute__ ((aligned (16)));
+static unsigned char svbc[16] __attribute__ ((aligned (16)));
+static unsigned short svus[8] __attribute__ ((aligned (16)));
+static signed short svss[8] __attribute__ ((aligned (16)));
+static unsigned short svbs[8] __attribute__ ((aligned (16)));
+static unsigned short svp[8] __attribute__ ((aligned (16)));
+static unsigned int svui[4] __attribute__ ((aligned (16)));
+static signed int svsi[4] __attribute__ ((aligned (16)));
+static unsigned int svbi[4] __attribute__ ((aligned (16)));
+static float svf[4] __attribute__ ((aligned (16)));
+
+static void init ()
+{
+  unsigned int i;
+  for (i = 0; i < 16; ++i)
+    {
+      svuc[i] = i;
+      svsc[i] = i - 8;
+      svbc[i] = (i % 2) ? 0xff : 0;
+    }
+  for (i = 0; i < 8; ++i)
+    {
+      svus[i] = i;
+      svss[i] = i - 4;
+      svbs[i] = (i % 2) ? 0xffff : 0;
+      svp[i] = i;
+    }
+  for (i = 0; i < 4; ++i)
+    {
+      svui[i] = i;
+      svsi[i] = i - 2;
+      svbi[i] = (i % 2) ? 0xffffffff : 0;
+      svf[i] = i * 1.0f;
+    }
+}
+
+static void test ()
+{
+#if __BYTE_ORDER__ == __ORDER_LITTLE_ENDIAN__
+  vector unsigned char evuc = {15,14,13,12,11,10,9,8,7,6,5,4,3,2,1,0};
+  vector signed char evsc = {7,6,5,4,3,2,1,0,-1,-2,-3,-4,-5,-6,-7,-8};
+  vector bool char evbc = {255,0,255,0,255,0,255,0,255,0,255,0,255,0,255,0};
+  vector unsigned short evus = {7,6,5,4,3,2,1,0};
+  vector signed short evss = {3,2,1,0,-1,-2,-3,-4};
+  vector bool short evbs = {65535,0,65535,0,65535,0,65535,0};
+  vector pixel evp = {7,6,5,4,3,2,1,0};
+  vector unsigned int evui = {3,2,1,0};
+  vector signed int evsi = {1,0,-1,-2};
+  vector bool int evbi = {0xffffffff,0,0xffffffff,0};
+  vector float evf = {3.0,2.0,1.0,0.0};
+#else
+  vector unsigned char evuc = {0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15};
+  vector signed char evsc = {-8,-7,-6,-5,-4,-3,-2,-1,0,1,2,3,4,5,6,7};
+  vector bool char evbc = {0,255,0,255,0,255,0,255,0,255,0,255,0,255,0,255};
+  vector unsigned short evus = {0,1,2,3,4,5,6,7};
+  vector signed short evss = {-4,-3,-2,-1,0,1,2,3};
+  vector bool short evbs = {0,65535,0,65535,0,65535,0,65535};
+  vector pixel evp = {0,1,2,3,4,5,6,7};
+  vector unsigned int evui = {0,1,2,3};
+  vector signed int evsi = {-2,-1,0,1};
+  vector bool int evbi = {0,0xffffffff,0,0xffffffff};
+  vector float evf = {0.0,1.0,2.0,3.0};
+#endif
+
+  vector unsigned char vuc;
+  vector signed char vsc;
+  vector bool char vbc;
+  vector unsigned short vus;
+  vector signed short vss;
+  vector bool short vbs;
+  vector pixel vp;
+  vector unsigned int vui;
+  vector signed int vsi;
+  vector bool int vbi;
+  vector float vf;
+
+  init ();
+
+  vuc = vec_ldl (0, (vector unsigned char *)svuc);
+  vsc = vec_ldl (0, (vector signed char *)svsc);
+  vbc = vec_ldl (0, (vector bool char *)svbc);
+  vus = vec_ldl (0, (vector unsigned short *)svus);
+  vss = vec_ldl (0, (vector signed short *)svss);
+  vbs = vec_ldl (0, (vector bool short *)svbs);
+  vp  = vec_ldl (0, (vector pixel *)svp);
+  vui = vec_ldl (0, (vector unsigned int *)svui);
+  vsi = vec_ldl (0, (vector signed int *)svsi);
+  vbi = vec_ldl (0, (vector bool int *)svbi);
+  vf  = vec_ldl (0, (vector float *)svf);
+
+  check (vec_all_eq (vuc, evuc), "vuc");
+  check (vec_all_eq (vsc, evsc), "vsc");
+  check (vec_all_eq (vbc, evbc), "vbc");
+  check (vec_all_eq (vus, evus), "vus");
+  check (vec_all_eq (vss, evss), "vss");
+  check (vec_all_eq (vbs, evbs), "vbs");
+  check (vec_all_eq (vp,  evp ), "vp" );
+  check (vec_all_eq (vui, evui), "vui");
+  check (vec_all_eq (vsi, evsi), "vsi");
+  check (vec_all_eq (vbi, evbi), "vbi");
+  check (vec_all_eq (vf,  evf ), "vf" );
+}
Index: gcc-4_8-test/gcc/testsuite/gcc.dg/vmx/ldl-vsx-be-order.c
===================================================================
--- /dev/null
+++ gcc-4_8-test/gcc/testsuite/gcc.dg/vmx/ldl-vsx-be-order.c
@@ -0,0 +1,44 @@ 
+/* { dg-skip-if "" { powerpc*-*-darwin* } { "*" } { "" } } */
+/* { dg-require-effective-target powerpc_vsx_ok } */
+/* { dg-options "-maltivec=be -mabi=altivec -std=gnu99 -mvsx" } */
+
+#include "harness.h"
+
+static unsigned long long svul[2] __attribute__ ((aligned (16)));
+static double svd[2] __attribute__ ((aligned (16)));
+
+static void init ()
+{
+  unsigned int i;
+  for (i = 0; i < 2; ++i)
+    {
+      svul[i] = i;
+      svd[i] = i * 1.0;
+    }
+}
+
+static void test ()
+{
+#if __BYTE_ORDER__ == __ORDER_LITTLE_ENDIAN__
+  vector unsigned long long evul = {1,0};
+  vector double evd = {1.0,0.0};
+#else
+  vector unsigned long long evul = {0,1};
+  vector double evd = {0.0,1.0};
+#endif
+
+  vector unsigned long long vul;
+  vector double vd;
+  unsigned i;
+
+  init ();
+
+  vul = vec_ldl (0, (vector unsigned long long *)svul);
+  vd  = vec_ldl (0, (vector double *)svd);
+
+  for (i = 0; i < 2; ++i)
+    {
+      check (vul[i] == evul[i], "vul");
+      check (vd[i]  == evd[i],  "vd" );
+    }
+}
Index: gcc-4_8-test/gcc/testsuite/gcc.dg/vmx/ldl-vsx.c
===================================================================
--- /dev/null
+++ gcc-4_8-test/gcc/testsuite/gcc.dg/vmx/ldl-vsx.c
@@ -0,0 +1,39 @@ 
+/* { dg-skip-if "" { powerpc*-*-darwin* } { "*" } { "" } } */
+/* { dg-require-effective-target powerpc_vsx_ok } */
+/* { dg-options "-maltivec -mabi=altivec -std=gnu99 -mvsx" } */
+
+#include "harness.h"
+
+static unsigned long long svul[2] __attribute__ ((aligned (16)));
+static double svd[2] __attribute__ ((aligned (16)));
+
+static void init ()
+{
+  unsigned int i;
+  for (i = 0; i < 2; ++i)
+    {
+      svul[i] = i;
+      svd[i] = i * 1.0;
+    }
+}
+
+static void test ()
+{
+  vector unsigned long long evul = {0,1};
+  vector double evd = {0.0,1.0};
+
+  vector unsigned long long vul;
+  vector double vd;
+  unsigned i;
+
+  init ();
+
+  vul = vec_ldl (0, (vector unsigned long long *)svul);
+  vd  = vec_ldl (0, (vector double *)svd);
+
+  for (i = 0; i < 2; ++i)
+    {
+      check (vul[i] == evul[i], "vul");
+      check (vd[i]  == evd[i],  "vd" );
+    }
+}
Index: gcc-4_8-test/gcc/testsuite/gcc.dg/vmx/ldl.c
===================================================================
--- /dev/null
+++ gcc-4_8-test/gcc/testsuite/gcc.dg/vmx/ldl.c
@@ -0,0 +1,91 @@ 
+#include "harness.h"
+
+static unsigned char svuc[16] __attribute__ ((aligned (16)));
+static signed char svsc[16] __attribute__ ((aligned (16)));
+static unsigned char svbc[16] __attribute__ ((aligned (16)));
+static unsigned short svus[8] __attribute__ ((aligned (16)));
+static signed short svss[8] __attribute__ ((aligned (16)));
+static unsigned short svbs[8] __attribute__ ((aligned (16)));
+static unsigned short svp[8] __attribute__ ((aligned (16)));
+static unsigned int svui[4] __attribute__ ((aligned (16)));
+static signed int svsi[4] __attribute__ ((aligned (16)));
+static unsigned int svbi[4] __attribute__ ((aligned (16)));
+static float svf[4] __attribute__ ((aligned (16)));
+
+static void init ()
+{
+  unsigned int i;
+  for (i = 0; i < 16; ++i)
+    {
+      svuc[i] = i;
+      svsc[i] = i - 8;
+      svbc[i] = (i % 2) ? 0xff : 0;
+    }
+  for (i = 0; i < 8; ++i)
+    {
+      svus[i] = i;
+      svss[i] = i - 4;
+      svbs[i] = (i % 2) ? 0xffff : 0;
+      svp[i] = i;
+    }
+  for (i = 0; i < 4; ++i)
+    {
+      svui[i] = i;
+      svsi[i] = i - 2;
+      svbi[i] = (i % 2) ? 0xffffffff : 0;
+      svf[i] = i * 1.0f;
+    }
+}
+
+static void test ()
+{
+  vector unsigned char evuc = {0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15};
+  vector signed char evsc = {-8,-7,-6,-5,-4,-3,-2,-1,0,1,2,3,4,5,6,7};
+  vector bool char evbc = {0,255,0,255,0,255,0,255,0,255,0,255,0,255,0,255};
+  vector unsigned short evus = {0,1,2,3,4,5,6,7};
+  vector signed short evss = {-4,-3,-2,-1,0,1,2,3};
+  vector bool short evbs = {0,65535,0,65535,0,65535,0,65535};
+  vector pixel evp = {0,1,2,3,4,5,6,7};
+  vector unsigned int evui = {0,1,2,3};
+  vector signed int evsi = {-2,-1,0,1};
+  vector bool int evbi = {0,0xffffffff,0,0xffffffff};
+  vector float evf = {0.0,1.0,2.0,3.0};
+
+  vector unsigned char vuc;
+  vector signed char vsc;
+  vector bool char vbc;
+  vector unsigned short vus;
+  vector signed short vss;
+  vector bool short vbs;
+  vector pixel vp;
+  vector unsigned int vui;
+  vector signed int vsi;
+  vector bool int vbi;
+  vector float vf;
+
+  init ();
+
+  vuc = vec_ldl (0, (vector unsigned char *)svuc);
+  vsc = vec_ldl (0, (vector signed char *)svsc);
+  vbc = vec_ldl (0, (vector bool char *)svbc);
+  vus = vec_ldl (0, (vector unsigned short *)svus);
+  vss = vec_ldl (0, (vector signed short *)svss);
+  vbs = vec_ldl (0, (vector bool short *)svbs);
+  vp  = vec_ldl (0, (vector pixel *)svp);
+  vui = vec_ldl (0, (vector unsigned int *)svui);
+  vsi = vec_ldl (0, (vector signed int *)svsi);
+  vbi = vec_ldl (0, (vector bool int *)svbi);
+  vf  = vec_ldl (0, (vector float *)svf);
+
+  check (vec_all_eq (vuc, evuc), "vuc");
+  check (vec_all_eq (vsc, evsc), "vsc");
+  check (vec_all_eq (vbc, evbc), "vbc");
+  check (vec_all_eq (vus, evus), "vus");
+  check (vec_all_eq (vss, evss), "vss");
+  check (vec_all_eq (vbs, evbs), "vbs");
+  check (vec_all_eq (vp,  evp ), "vp" );
+  check (vec_all_eq (vui, evui), "vui");
+  check (vec_all_eq (vsi, evsi), "vsi");
+  check (vec_all_eq (vbi, evbi), "vbi");
+  check (vec_all_eq (vf,  evf ), "vf" );
+}
Index: gcc-4_8-test/gcc/testsuite/gcc.dg/vmx/merge-be-order.c
===================================================================
--- /dev/null
+++ gcc-4_8-test/gcc/testsuite/gcc.dg/vmx/merge-be-order.c
@@ -0,0 +1,96 @@ 
+/* { dg-options "-maltivec=be -mabi=altivec -std=gnu99 -mno-vsx" } */
+
+#include "harness.h"
+
+static void test()
+{
+  /* Input vectors.  */
+  vector unsigned char vuca = {0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15};
+  vector unsigned char vucb
+    = {16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31};
+  vector signed char vsca
+    = {-16,-15,-14,-13,-12,-11,-10,-9,-8,-7,-6,-5,-4,-3,-2,-1};
+  vector signed char vscb = {0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15};
+  vector unsigned short vusa = {0,1,2,3,4,5,6,7};
+  vector unsigned short vusb = {8,9,10,11,12,13,14,15};
+  vector signed short vssa = {-8,-7,-6,-5,-4,-3,-2,-1};
+  vector signed short vssb = {0,1,2,3,4,5,6,7};
+  vector unsigned int vuia = {0,1,2,3};
+  vector unsigned int vuib = {4,5,6,7};
+  vector signed int vsia = {-4,-3,-2,-1};
+  vector signed int vsib = {0,1,2,3};
+  vector float vfa = {-4.0,-3.0,-2.0,-1.0};
+  vector float vfb = {0.0,1.0,2.0,3.0};
+
+  /* Result vectors.  */
+  vector unsigned char vuch, vucl;
+  vector signed char vsch, vscl;
+  vector unsigned short vush, vusl;
+  vector signed short vssh, vssl;
+  vector unsigned int vuih, vuil;
+  vector signed int vsih, vsil;
+  vector float vfh, vfl;
+
+  /* Expected result vectors.  */
+#if __BYTE_ORDER__ == __ORDER_LITTLE_ENDIAN__
+  vector unsigned char vucrh = {24,8,25,9,26,10,27,11,28,12,29,13,30,14,31,15};
+  vector unsigned char vucrl = {16,0,17,1,18,2,19,3,20,4,21,5,22,6,23,7};
+  vector signed char vscrh = {8,-8,9,-7,10,-6,11,-5,12,-4,13,-3,14,-2,15,-1};
+  vector signed char vscrl = {0,-16,1,-15,2,-14,3,-13,4,-12,5,-11,6,-10,7,-9};
+  vector unsigned short vusrh = {12,4,13,5,14,6,15,7};
+  vector unsigned short vusrl = {8,0,9,1,10,2,11,3};
+  vector signed short vssrh = {4,-4,5,-3,6,-2,7,-1};
+  vector signed short vssrl = {0,-8,1,-7,2,-6,3,-5};
+  vector unsigned int vuirh = {6,2,7,3};
+  vector unsigned int vuirl = {4,0,5,1};
+  vector signed int vsirh = {2,-2,3,-1};
+  vector signed int vsirl = {0,-4,1,-3};
+  vector float vfrh = {2.0,-2.0,3.0,-1.0};
+  vector float vfrl = {0.0,-4.0,1.0,-3.0};
+#else
+  vector unsigned char vucrh = {0,16,1,17,2,18,3,19,4,20,5,21,6,22,7,23};
+  vector unsigned char vucrl = {8,24,9,25,10,26,11,27,12,28,13,29,14,30,15,31};
+  vector signed char vscrh = {-16,0,-15,1,-14,2,-13,3,-12,4,-11,5,-10,6,-9,7};
+  vector signed char vscrl = {-8,8,-7,9,-6,10,-5,11,-4,12,-3,13,-2,14,-1,15};
+  vector unsigned short vusrh = {0,8,1,9,2,10,3,11};
+  vector unsigned short vusrl = {4,12,5,13,6,14,7,15};
+  vector signed short vssrh = {-8,0,-7,1,-6,2,-5,3};
+  vector signed short vssrl = {-4,4,-3,5,-2,6,-1,7};
+  vector unsigned int vuirh = {0,4,1,5};
+  vector unsigned int vuirl = {2,6,3,7};
+  vector signed int vsirh = {-4,0,-3,1};
+  vector signed int vsirl = {-2,2,-1,3};
+  vector float vfrh = {-4.0,0.0,-3.0,1.0};
+  vector float vfrl = {-2.0,2.0,-1.0,3.0};
+#endif
+
+  vuch = vec_mergeh (vuca, vucb);
+  vucl = vec_mergel (vuca, vucb);
+  vsch = vec_mergeh (vsca, vscb);
+  vscl = vec_mergel (vsca, vscb);
+  vush = vec_mergeh (vusa, vusb);
+  vusl = vec_mergel (vusa, vusb);
+  vssh = vec_mergeh (vssa, vssb);
+  vssl = vec_mergel (vssa, vssb);
+  vuih = vec_mergeh (vuia, vuib);
+  vuil = vec_mergel (vuia, vuib);
+  vsih = vec_mergeh (vsia, vsib);
+  vsil = vec_mergel (vsia, vsib);
+  vfh  = vec_mergeh (vfa,  vfb );
+  vfl  = vec_mergel (vfa,  vfb );
+
+  check (vec_all_eq (vuch, vucrh), "vuch");
+  check (vec_all_eq (vucl, vucrl), "vucl");
+  check (vec_all_eq (vsch, vscrh), "vsch");
+  check (vec_all_eq (vscl, vscrl), "vscl");
+  check (vec_all_eq (vush, vusrh), "vush");
+  check (vec_all_eq (vusl, vusrl), "vusl");
+  check (vec_all_eq (vssh, vssrh), "vssh");
+  check (vec_all_eq (vssl, vssrl), "vssl");
+  check (vec_all_eq (vuih, vuirh), "vuih");
+  check (vec_all_eq (vuil, vuirl), "vuil");
+  check (vec_all_eq (vsih, vsirh), "vsih");
+  check (vec_all_eq (vsil, vsirl), "vsil");
+  check (vec_all_eq (vfh,  vfrh),  "vfh");
+  check (vec_all_eq (vfl,  vfrl),  "vfl");
+}
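
The merge expectations above follow from remapping element indices
rather than from any new instruction semantics: vec_mergeh under
-maltivec=be interleaves the big-endian-numbered high halves, and BE
element i of a little-endian literal is literal element (n - 1 - i).
A small host-side model in plain C (be_mergeh is an illustrative
name, not a GCC function):

#include <stdio.h>

/* Merge the BE-order high halves of a and b, with inputs and the
   result expressed in LE literal order.  n is the element count.  */
static void be_mergeh (const int *a, const int *b, int *r, int n)
{
  int i;
  for (i = 0; i < n / 2; i++)
    {
      r[n - 1 - 2 * i] = a[n - 1 - i];
      r[n - 2 - 2 * i] = b[n - 1 - i];
    }
}

int main (void)
{
  int a[4] = {0, 1, 2, 3};         /* vuia  */
  int b[4] = {4, 5, 6, 7};         /* vuib  */
  int r[4], i;
  be_mergeh (a, b, r, 4);
  for (i = 0; i < 4; i++)
    printf ("%d ", r[i]);          /* 6 2 7 3, matching vuirh  */
  printf ("\n");
  return 0;
}

The same function with n == 2 reproduces the {1,-1} doubleword
expectation in merge-vsx-be-order.c below.
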
Index: gcc-4_8-test/gcc/testsuite/gcc.dg/vmx/merge-vsx-be-order.c
===================================================================
--- /dev/null
+++ gcc-4_8-test/gcc/testsuite/gcc.dg/vmx/merge-vsx-be-order.c
@@ -0,0 +1,51 @@ 
+/* { dg-skip-if "" { powerpc*-*-darwin* } { "*" } { "" } } */
+/* { dg-require-effective-target powerpc_vsx_ok } */
+/* { dg-options "-maltivec=be -mabi=altivec -std=gnu99 -mvsx" } */
+
+#include "harness.h"
+
+static int vec_long_long_eq (vector long long x, vector long long y)
+{
+  return (x[0] == y[0] && x[1] == y[1]);
+}
+
+static int vec_double_eq (vector double x, vector double y)
+{
+  return (x[0] == y[0] && x[1] == y[1]);
+}
+
+static void test()
+{
+  /* Input vectors.  */
+  vector long long vla = {-2,-1};
+  vector long long vlb = {0,1};
+  vector double vda = {-2.0,-1.0};
+  vector double vdb = {0.0,1.0};
+
+  /* Result vectors.  */
+  vector long long vlh, vll;
+  vector double vdh, vdl;
+
+  /* Expected result vectors.  */
+#if __BYTE_ORDER__ == __ORDER_LITTLE_ENDIAN__
+  vector long long vlrh = {1,-1};
+  vector long long vlrl = {0,-2};
+  vector double vdrh = {1.0,-1.0};
+  vector double vdrl = {0.0,-2.0};
+#else
+  vector long long vlrh = {-2,0};
+  vector long long vlrl = {-1,1};
+  vector double vdrh = {-2.0,0.0};
+  vector double vdrl = {-1.0,1.0};
+#endif
+
+  vlh = vec_mergeh (vla, vlb);
+  vll = vec_mergel (vla, vlb);
+  vdh = vec_mergeh (vda, vdb);
+  vdl = vec_mergel (vda, vdb);
+
+  check (vec_long_long_eq (vlh, vlrh), "vlh");
+  check (vec_long_long_eq (vll, vlrl), "vll");
+  check (vec_double_eq (vdh, vdrh), "vdh" );
+  check (vec_double_eq (vdl, vdrl), "vdl" );
+}
Index: gcc-4_8-test/gcc/testsuite/gcc.dg/vmx/merge-vsx.c
===================================================================
--- /dev/null
+++ gcc-4_8-test/gcc/testsuite/gcc.dg/vmx/merge-vsx.c
@@ -0,0 +1,44 @@ 
+/* { dg-skip-if "" { powerpc*-*-darwin* } { "*" } { "" } } */
+/* { dg-require-effective-target powerpc_vsx_ok } */
+/* { dg-options "-maltivec -mabi=altivec -std=gnu99 -mvsx" } */
+
+#include "harness.h"
+
+static int vec_long_long_eq (vector long long x, vector long long y)
+{
+  return (x[0] == y[0] && x[1] == y[1]);
+}
+
+static int vec_double_eq (vector double x, vector double y)
+{
+  return (x[0] == y[0] && x[1] == y[1]);
+}
+
+static void test()
+{
+  /* Input vectors.  */
+  vector long long vla = {-2,-1};
+  vector long long vlb = {0,1};
+  vector double vda = {-2.0,-1.0};
+  vector double vdb = {0.0,1.0};
+
+  /* Result vectors.  */
+  vector long long vlh, vll;
+  vector double vdh, vdl;
+
+  /* Expected result vectors.  */
+  vector long long vlrh = {-2,0};
+  vector long long vlrl = {-1,1};
+  vector double vdrh = {-2.0,0.0};
+  vector double vdrl = {-1.0,1.0};
+
+  vlh = vec_mergeh (vla, vlb);
+  vll = vec_mergel (vla, vlb);
+  vdh = vec_mergeh (vda, vdb);
+  vdl = vec_mergel (vda, vdb);
+
+  check (vec_long_long_eq (vlh, vlrh), "vlh");
+  check (vec_long_long_eq (vll, vlrl), "vll");
+  check (vec_double_eq (vdh, vdrh), "vdh" );
+  check (vec_double_eq (vdl, vdrl), "vdl" );
+}
Index: gcc-4_8-test/gcc/testsuite/gcc.dg/vmx/merge.c
===================================================================
--- /dev/null
+++ gcc-4_8-test/gcc/testsuite/gcc.dg/vmx/merge.c
@@ -0,0 +1,77 @@ 
+#include "harness.h"
+
+static void test()
+{
+  /* Input vectors.  */
+  vector unsigned char vuca = {0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15};
+  vector unsigned char vucb
+    = {16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31};
+  vector signed char vsca
+    = {-16,-15,-14,-13,-12,-11,-10,-9,-8,-7,-6,-5,-4,-3,-2,-1};
+  vector signed char vscb = {0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15};
+  vector unsigned short vusa = {0,1,2,3,4,5,6,7};
+  vector unsigned short vusb = {8,9,10,11,12,13,14,15};
+  vector signed short vssa = {-8,-7,-6,-5,-4,-3,-2,-1};
+  vector signed short vssb = {0,1,2,3,4,5,6,7};
+  vector unsigned int vuia = {0,1,2,3};
+  vector unsigned int vuib = {4,5,6,7};
+  vector signed int vsia = {-4,-3,-2,-1};
+  vector signed int vsib = {0,1,2,3};
+  vector float vfa = {-4.0,-3.0,-2.0,-1.0};
+  vector float vfb = {0.0,1.0,2.0,3.0};
+
+  /* Result vectors.  */
+  vector unsigned char vuch, vucl;
+  vector signed char vsch, vscl;
+  vector unsigned short vush, vusl;
+  vector signed short vssh, vssl;
+  vector unsigned int vuih, vuil;
+  vector signed int vsih, vsil;
+  vector float vfh, vfl;
+
+  /* Expected result vectors.  */
+  vector unsigned char vucrh = {0,16,1,17,2,18,3,19,4,20,5,21,6,22,7,23};
+  vector unsigned char vucrl = {8,24,9,25,10,26,11,27,12,28,13,29,14,30,15,31};
+  vector signed char vscrh = {-16,0,-15,1,-14,2,-13,3,-12,4,-11,5,-10,6,-9,7};
+  vector signed char vscrl = {-8,8,-7,9,-6,10,-5,11,-4,12,-3,13,-2,14,-1,15};
+  vector unsigned short vusrh = {0,8,1,9,2,10,3,11};
+  vector unsigned short vusrl = {4,12,5,13,6,14,7,15};
+  vector signed short vssrh = {-8,0,-7,1,-6,2,-5,3};
+  vector signed short vssrl = {-4,4,-3,5,-2,6,-1,7};
+  vector unsigned int vuirh = {0,4,1,5};
+  vector unsigned int vuirl = {2,6,3,7};
+  vector signed int vsirh = {-4,0,-3,1};
+  vector signed int vsirl = {-2,2,-1,3};
+  vector float vfrh = {-4.0,0.0,-3.0,1.0};
+  vector float vfrl = {-2.0,2.0,-1.0,3.0};
+
+  vuch = vec_mergeh (vuca, vucb);
+  vucl = vec_mergel (vuca, vucb);
+  vsch = vec_mergeh (vsca, vscb);
+  vscl = vec_mergel (vsca, vscb);
+  vush = vec_mergeh (vusa, vusb);
+  vusl = vec_mergel (vusa, vusb);
+  vssh = vec_mergeh (vssa, vssb);
+  vssl = vec_mergel (vssa, vssb);
+  vuih = vec_mergeh (vuia, vuib);
+  vuil = vec_mergel (vuia, vuib);
+  vsih = vec_mergeh (vsia, vsib);
+  vsil = vec_mergel (vsia, vsib);
+  vfh  = vec_mergeh (vfa,  vfb );
+  vfl  = vec_mergel (vfa,  vfb );
+
+  check (vec_all_eq (vuch, vucrh), "vuch");
+  check (vec_all_eq (vucl, vucrl), "vucl");
+  check (vec_all_eq (vsch, vscrh), "vsch");
+  check (vec_all_eq (vscl, vscrl), "vscl");
+  check (vec_all_eq (vush, vusrh), "vush");
+  check (vec_all_eq (vusl, vusrl), "vusl");
+  check (vec_all_eq (vssh, vssrh), "vssh");
+  check (vec_all_eq (vssl, vssrl), "vssl");
+  check (vec_all_eq (vuih, vuirh), "vuih");
+  check (vec_all_eq (vuil, vuirl), "vuil");
+  check (vec_all_eq (vsih, vsirh), "vsih");
+  check (vec_all_eq (vsil, vsirl), "vsil");
+  check (vec_all_eq (vfh,  vfrh),  "vfh");
+  check (vec_all_eq (vfl,  vfrl),  "vfl");
+}
Index: gcc-4_8-test/gcc/testsuite/gcc.dg/vmx/mult-even-odd-be-order.c
===================================================================
--- /dev/null
+++ gcc-4_8-test/gcc/testsuite/gcc.dg/vmx/mult-even-odd-be-order.c
@@ -0,0 +1,64 @@ 
+/* { dg-options "-maltivec=be -mabi=altivec -std=gnu99 -mno-vsx" } */
+
+#include "harness.h"
+
+static void test()
+{
+  vector unsigned char vuca = {0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15};
+  vector unsigned char vucb = {2,3,2,3,2,3,2,3,2,3,2,3,2,3,2,3};
+  vector signed char vsca = {-8,-7,-6,-5,-4,-3,-2,-1,0,1,2,3,4,5,6,7};
+  vector signed char vscb = {2,-3,2,-3,2,-3,2,-3,2,-3,2,-3,2,-3,2,-3};
+  vector unsigned short vusa = {0,1,2,3,4,5,6,7};
+  vector unsigned short vusb = {2,3,2,3,2,3,2,3};
+  vector signed short vssa = {-4,-3,-2,-1,0,1,2,3};
+  vector signed short vssb = {2,-3,2,-3,2,-3,2,-3};
+  vector unsigned short vuse, vuso;
+  vector signed short vsse, vsso;
+  vector unsigned int vuie, vuio;
+  vector signed int vsie, vsio;
+
+  vuse = vec_mule (vuca, vucb);
+  vuso = vec_mulo (vuca, vucb);
+  vsse = vec_mule (vsca, vscb);
+  vsso = vec_mulo (vsca, vscb);
+  vuie = vec_mule (vusa, vusb);
+  vuio = vec_mulo (vusa, vusb);
+  vsie = vec_mule (vssa, vssb);
+  vsio = vec_mulo (vssa, vssb);
+
+#if __BYTE_ORDER__ == __ORDER_LITTLE_ENDIAN__
+  check (vec_all_eq (vuse,
+		     ((vector unsigned short){3,9,15,21,27,33,39,45})),
+	 "vuse");
+  check (vec_all_eq (vuso,
+		     ((vector unsigned short){0,4,8,12,16,20,24,28})),
+	 "vuso");
+  check (vec_all_eq (vsse,
+		     ((vector signed short){21,15,9,3,-3,-9,-15,-21})),
+	 "vsse");
+  check (vec_all_eq (vsso,
+		     ((vector signed short){-16,-12,-8,-4,0,4,8,12})),
+	 "vsso");
+  check (vec_all_eq (vuie, ((vector unsigned int){3,9,15,21})), "vuie");
+  check (vec_all_eq (vuio, ((vector unsigned int){0,4,8,12})), "vuio");
+  check (vec_all_eq (vsie, ((vector signed int){9,3,-3,-9})), "vsie");
+  check (vec_all_eq (vsio, ((vector signed int){-8,-4,0,4})), "vsio");
+#else
+  check (vec_all_eq (vuse,
+		     ((vector unsigned short){0,4,8,12,16,20,24,28})),
+	 "vuse");
+  check (vec_all_eq (vuso,
+		     ((vector unsigned short){3,9,15,21,27,33,39,45})),
+	 "vuso");
+  check (vec_all_eq (vsse,
+		     ((vector signed short){-16,-12,-8,-4,0,4,8,12})),
+	 "vsse");
+  check (vec_all_eq (vsso,
+		     ((vector signed short){21,15,9,3,-3,-9,-15,-21})),
+	 "vsso");
+  check (vec_all_eq (vuie, ((vector unsigned int){0,4,8,12})), "vuie");
+  check (vec_all_eq (vuio, ((vector unsigned int){3,9,15,21})), "vuio");
+  check (vec_all_eq (vsie, ((vector signed int){-8,-4,0,4})), "vsie");
+  check (vec_all_eq (vsio, ((vector signed int){9,3,-3,-9})), "vsio");
+#endif
+}
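
For the even/odd multiplies, the index remapping flips parity: BE
element 2k of an n-element vector is LE-literal element (n - 1 - 2k),
which is odd when n is even, so under -maltivec=be on LE vec_mule
reads the odd-numbered literal elements and vec_mulo the even ones,
and the half-length result vector is likewise reversed.  A host-side
sketch (plain C, illustrative only) that reproduces the "vuse"
expectation above:

#include <stdio.h>

int main (void)
{
  unsigned char a[16], b[16];
  int r[8], k;
  for (k = 0; k < 16; k++)
    {
      a[k] = k;                    /* vuca  */
      b[k] = (k % 2) ? 3 : 2;      /* vucb  */
    }
  /* BE result element k lands at LE result slot (8 - 1 - k) and
     multiplies BE input elements 2k, i.e. LE elements (15 - 2k).  */
  for (k = 0; k < 8; k++)
    r[8 - 1 - k] = a[15 - 2 * k] * b[15 - 2 * k];
  for (k = 0; k < 8; k++)
    printf ("%d ", r[k]);          /* 3 9 15 21 27 33 39 45  */
  printf ("\n");
  return 0;
}
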
Index: gcc-4_8-test/gcc/testsuite/gcc.dg/vmx/mult-even-odd.c
===================================================================
--- /dev/null
+++ gcc-4_8-test/gcc/testsuite/gcc.dg/vmx/mult-even-odd.c
@@ -0,0 +1,43 @@ 
+#include "harness.h"
+
+static void test()
+{
+  vector unsigned char vuca = {0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15};
+  vector unsigned char vucb = {2,3,2,3,2,3,2,3,2,3,2,3,2,3,2,3};
+  vector signed char vsca = {-8,-7,-6,-5,-4,-3,-2,-1,0,1,2,3,4,5,6,7};
+  vector signed char vscb = {2,-3,2,-3,2,-3,2,-3,2,-3,2,-3,2,-3,2,-3};
+  vector unsigned short vusa = {0,1,2,3,4,5,6,7};
+  vector unsigned short vusb = {2,3,2,3,2,3,2,3};
+  vector signed short vssa = {-4,-3,-2,-1,0,1,2,3};
+  vector signed short vssb = {2,-3,2,-3,2,-3,2,-3};
+  vector unsigned short vuse, vuso;
+  vector signed short vsse, vsso;
+  vector unsigned int vuie, vuio;
+  vector signed int vsie, vsio;
+
+  vuse = vec_mule (vuca, vucb);
+  vuso = vec_mulo (vuca, vucb);
+  vsse = vec_mule (vsca, vscb);
+  vsso = vec_mulo (vsca, vscb);
+  vuie = vec_mule (vusa, vusb);
+  vuio = vec_mulo (vusa, vusb);
+  vsie = vec_mule (vssa, vssb);
+  vsio = vec_mulo (vssa, vssb);
+
+  check (vec_all_eq (vuse,
+		     ((vector unsigned short){0,4,8,12,16,20,24,28})),
+	 "vuse");
+  check (vec_all_eq (vuso,
+		     ((vector unsigned short){3,9,15,21,27,33,39,45})),
+	 "vuso");
+  check (vec_all_eq (vsse,
+		     ((vector signed short){-16,-12,-8,-4,0,4,8,12})),
+	 "vsse");
+  check (vec_all_eq (vsso,
+		     ((vector signed short){21,15,9,3,-3,-9,-15,-21})),
+	 "vsso");
+  check (vec_all_eq (vuie, ((vector unsigned int){0,4,8,12})), "vuie");
+  check (vec_all_eq (vuio, ((vector unsigned int){3,9,15,21})), "vuio");
+  check (vec_all_eq (vsie, ((vector signed int){-8,-4,0,4})), "vsie");
+  check (vec_all_eq (vsio, ((vector signed int){9,3,-3,-9})), "vsio");
+}
Index: gcc-4_8-test/gcc/testsuite/gcc.dg/vmx/pack-be-order.c
===================================================================
--- /dev/null
+++ gcc-4_8-test/gcc/testsuite/gcc.dg/vmx/pack-be-order.c
@@ -0,0 +1,136 @@ 
+/* { dg-options "-maltivec=be -mabi=altivec -std=gnu99 -mno-vsx" } */
+
+#include "harness.h"
+
+#define BIG 4294967295
+
+static void test()
+{
+  /* Input vectors.  */
+  vector unsigned short vusa = {0,1,2,3,4,5,6,7};
+  vector unsigned short vusb = {8,9,10,11,12,13,14,15};
+  vector signed short vssa = {-8,-7,-6,-5,-4,-3,-2,-1};
+  vector signed short vssb = {0,1,2,3,4,5,6,7};
+  vector bool short vbsa = {0,65535,65535,0,0,0,65535,0};
+  vector bool short vbsb = {65535,0,0,65535,65535,65535,0,65535};
+  vector unsigned int vuia = {0,1,2,3};
+  vector unsigned int vuib = {4,5,6,7};
+  vector signed int vsia = {-4,-3,-2,-1};
+  vector signed int vsib = {0,1,2,3};
+  vector bool int vbia = {0,BIG,BIG,BIG};
+  vector bool int vbib = {BIG,0,0,0};
+  vector unsigned int vipa = {(0<<24) + (2<<19) + (3<<11) + (4<<3),
+			      (1<<24) + (5<<19) + (6<<11) + (7<<3),
+			      (0<<24) + (8<<19) + (9<<11) + (10<<3),
+			      (1<<24) + (11<<19) + (12<<11) + (13<<3)};
+  vector unsigned int vipb = {(1<<24) + (14<<19) + (15<<11) + (16<<3),
+			      (0<<24) + (17<<19) + (18<<11) + (19<<3),
+			      (1<<24) + (20<<19) + (21<<11) + (22<<3),
+			      (0<<24) + (23<<19) + (24<<11) + (25<<3)};
+  vector unsigned short vusc = {0,256,1,257,2,258,3,259};
+  vector unsigned short vusd = {4,260,5,261,6,262,7,263};
+  vector signed short vssc = {-1,-128,0,127,-2,-129,1,128};
+  vector signed short vssd = {-3,-130,2,129,-4,-131,3,130};
+  vector unsigned int vuic = {0,65536,1,65537};
+  vector unsigned int vuid = {2,65538,3,65539};
+  vector signed int vsic = {-1,-32768,0,32767};
+  vector signed int vsid = {-2,-32769,1,32768};
+
+  /* Result vectors.  */
+  vector unsigned char vucr;
+  vector signed char vscr;
+  vector bool char vbcr;
+  vector unsigned short vusr;
+  vector signed short vssr;
+  vector bool short vbsr;
+  vector pixel vpr;
+  vector unsigned char vucsr;
+  vector signed char vscsr;
+  vector unsigned short vussr;
+  vector signed short vsssr;
+  vector unsigned char vucsur1, vucsur2;
+  vector unsigned short vussur1, vussur2;
+
+  /* Expected result vectors.  */
+#if __BYTE_ORDER__ == __ORDER_LITTLE_ENDIAN__
+  vector unsigned char vucer = {8,9,10,11,12,13,14,15,0,1,2,3,4,5,6,7};
+  vector signed char vscer = {0,1,2,3,4,5,6,7,-8,-7,-6,-5,-4,-3,-2,-1};
+  vector bool char vbcer = {255,0,0,255,255,255,0,255,0,255,255,0,0,0,255,0};
+  vector unsigned short vuser = {4,5,6,7,0,1,2,3};
+  vector signed short vsser = {0,1,2,3,-4,-3,-2,-1};
+  vector bool short vbser = {65535,0,0,0,0,65535,65535,65535};
+  vector pixel vper = {(1<<15) + (14<<10) + (15<<5) + 16,
+		       (0<<15) + (17<<10) + (18<<5) + 19,
+		       (1<<15) + (20<<10) + (21<<5) + 22,
+		       (0<<15) + (23<<10) + (24<<5) + 25,
+		       (0<<15) + (2<<10) + (3<<5) + 4,
+		       (1<<15) + (5<<10) + (6<<5) + 7,
+		       (0<<15) + (8<<10) + (9<<5) + 10,
+		       (1<<15) + (11<<10) + (12<<5) + 13};
+  vector unsigned char vucser = {4,255,5,255,6,255,7,255,0,255,1,255,2,255,3,255};
+  vector signed char vscser = {-3,-128,2,127,-4,-128,3,127,
+			       -1,-128,0,127,-2,-128,1,127};
+  vector unsigned short vusser = {2,65535,3,65535,0,65535,1,65535};
+  vector signed short vssser = {-2,-32768,1,32767,-1,-32768,0,32767};
+  vector unsigned char vucsuer1 = {4,255,5,255,6,255,7,255,0,255,1,255,2,255,3,255};
+  vector unsigned char vucsuer2 = {0,0,2,129,0,0,3,130,0,0,0,127,0,0,1,128};
+  vector unsigned short vussuer1 = {2,65535,3,65535,0,65535,1,65535};
+  vector unsigned short vussuer2 = {0,0,1,32768,0,0,0,32767};
+#else
+  vector unsigned char vucer = {0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15};
+  vector signed char vscer = {-8,-7,-6,-5,-4,-3,-2,-1,0,1,2,3,4,5,6,7};
+  vector bool char vbcer = {0,255,255,0,0,0,255,0,255,0,0,255,255,255,0,255};
+  vector unsigned short vuser = {0,1,2,3,4,5,6,7};
+  vector signed short vsser = {-4,-3,-2,-1,0,1,2,3};
+  vector bool short vbser = {0,65535,65535,65535,65535,0,0,0};
+  vector pixel vper = {(0<<15) + (2<<10) + (3<<5) + 4,
+		       (1<<15) + (5<<10) + (6<<5) + 7,
+		       (0<<15) + (8<<10) + (9<<5) + 10,
+		       (1<<15) + (11<<10) + (12<<5) + 13,
+		       (1<<15) + (14<<10) + (15<<5) + 16,
+		       (0<<15) + (17<<10) + (18<<5) + 19,
+		       (1<<15) + (20<<10) + (21<<5) + 22,
+		       (0<<15) + (23<<10) + (24<<5) + 25};
+  vector unsigned char vucser = {0,255,1,255,2,255,3,255,4,255,5,255,6,255,7,255};
+  vector signed char vscser = {-1,-128,0,127,-2,-128,1,127,
+			       -3,-128,2,127,-4,-128,3,127};
+  vector unsigned short vusser = {0,65535,1,65535,2,65535,3,65535};
+  vector signed short vssser = {-1,-32768,0,32767,-2,-32768,1,32767};
+  vector unsigned char vucsuer1 = {0,255,1,255,2,255,3,255,4,255,5,255,6,255,7,255};
+  vector unsigned char vucsuer2 = {0,0,0,127,0,0,1,128,0,0,2,129,0,0,3,130};
+  vector unsigned short vussuer1 = {0,65535,1,65535,2,65535,3,65535};
+  vector unsigned short vussuer2 = {0,0,0,32767,0,0,1,32768};
+#endif
+
+  vucr = vec_pack (vusa, vusb);
+  vscr = vec_pack (vssa, vssb);
+  vbcr = vec_pack (vbsa, vbsb);
+  vusr = vec_pack (vuia, vuib);
+  vssr = vec_pack (vsia, vsib);
+  vbsr = vec_pack (vbia, vbib);
+  vpr  = vec_packpx (vipa, vipb);
+  vucsr = vec_packs (vusc, vusd);
+  vscsr = vec_packs (vssc, vssd);
+  vussr = vec_packs (vuic, vuid);
+  vsssr = vec_packs (vsic, vsid);
+  vucsur1 = vec_packsu (vusc, vusd);
+  vucsur2 = vec_packsu (vssc, vssd);
+  vussur1 = vec_packsu (vuic, vuid);
+  vussur2 = vec_packsu (vsic, vsid);
+
+  check (vec_all_eq (vucr, vucer), "vucr");
+  check (vec_all_eq (vscr, vscer), "vscr");
+  check (vec_all_eq (vbcr, vbcer), "vbcr");
+  check (vec_all_eq (vusr, vuser), "vusr");
+  check (vec_all_eq (vssr, vsser), "vssr");
+  check (vec_all_eq (vbsr, vbser), "vbsr");
+  check (vec_all_eq (vpr,  vper ), "vpr" );
+  check (vec_all_eq (vucsr, vucser), "vucsr");
+  check (vec_all_eq (vscsr, vscser), "vscsr");
+  check (vec_all_eq (vussr, vusser), "vussr");
+  check (vec_all_eq (vsssr, vssser), "vsssr");
+  check (vec_all_eq (vucsur1, vucsuer1), "vucsur1");
+  check (vec_all_eq (vucsur2, vucsuer2), "vucsur2");
+  check (vec_all_eq (vussur1, vussuer1), "vussur1");
+  check (vec_all_eq (vussur2, vussuer2), "vussur2");
+}
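
The pack expectations have a simple little-endian reading: in BE
element order vec_pack (a, b) narrows a's elements followed by b's,
and reversing that back into an LE literal leaves b's literal first
and then a's, each in its original order.  A plain-C sketch
(illustrative only):

#include <stdio.h>

int main (void)
{
  unsigned short a[8] = {0,1,2,3,4,5,6,7};        /* vusa  */
  unsigned short b[8] = {8,9,10,11,12,13,14,15};  /* vusb  */
  unsigned char r[16];
  int i;
  for (i = 0; i < 8; i++)
    {
      r[i]     = (unsigned char) b[i];
      r[i + 8] = (unsigned char) a[i];
    }
  for (i = 0; i < 16; i++)
    printf ("%d ", r[i]);     /* 8..15 then 0..7, matching vucer  */
  printf ("\n");
  return 0;
}

The saturating variants follow the same layout rule; only the
per-element clamping differs.
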
Index: gcc-4_8-test/gcc/testsuite/gcc.dg/vmx/pack.c
===================================================================
--- /dev/null
+++ gcc-4_8-test/gcc/testsuite/gcc.dg/vmx/pack.c
@@ -0,0 +1,108 @@ 
+#include "harness.h"
+
+#define BIG 4294967295
+
+static void test()
+{
+  /* Input vectors.  */
+  vector unsigned short vusa = {0,1,2,3,4,5,6,7};
+  vector unsigned short vusb = {8,9,10,11,12,13,14,15};
+  vector signed short vssa = {-8,-7,-6,-5,-4,-3,-2,-1};
+  vector signed short vssb = {0,1,2,3,4,5,6,7};
+  vector bool short vbsa = {0,65535,65535,0,0,0,65535,0};
+  vector bool short vbsb = {65535,0,0,65535,65535,65535,0,65535};
+  vector unsigned int vuia = {0,1,2,3};
+  vector unsigned int vuib = {4,5,6,7};
+  vector signed int vsia = {-4,-3,-2,-1};
+  vector signed int vsib = {0,1,2,3};
+  vector bool int vbia = {0,BIG,BIG,BIG};
+  vector bool int vbib = {BIG,0,0,0};
+  vector unsigned int vipa = {(0<<24) + (2<<19) + (3<<11) + (4<<3),
+			      (1<<24) + (5<<19) + (6<<11) + (7<<3),
+			      (0<<24) + (8<<19) + (9<<11) + (10<<3),
+			      (1<<24) + (11<<19) + (12<<11) + (13<<3)};
+  vector unsigned int vipb = {(1<<24) + (14<<19) + (15<<11) + (16<<3),
+			      (0<<24) + (17<<19) + (18<<11) + (19<<3),
+			      (1<<24) + (20<<19) + (21<<11) + (22<<3),
+			      (0<<24) + (23<<19) + (24<<11) + (25<<3)};
+  vector unsigned short vusc = {0,256,1,257,2,258,3,259};
+  vector unsigned short vusd = {4,260,5,261,6,262,7,263};
+  vector signed short vssc = {-1,-128,0,127,-2,-129,1,128};
+  vector signed short vssd = {-3,-130,2,129,-4,-131,3,130};
+  vector unsigned int vuic = {0,65536,1,65537};
+  vector unsigned int vuid = {2,65538,3,65539};
+  vector signed int vsic = {-1,-32768,0,32767};
+  vector signed int vsid = {-2,-32769,1,32768};
+
+  /* Result vectors.  */
+  vector unsigned char vucr;
+  vector signed char vscr;
+  vector bool char vbcr;
+  vector unsigned short vusr;
+  vector signed short vssr;
+  vector bool short vbsr;
+  vector pixel vpr;
+  vector unsigned char vucsr;
+  vector signed char vscsr;
+  vector unsigned short vussr;
+  vector signed short vsssr;
+  vector unsigned char vucsur1, vucsur2;
+  vector unsigned short vussur1, vussur2;
+
+  /* Expected result vectors.  */
+  vector unsigned char vucer = {0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15};
+  vector signed char vscer = {-8,-7,-6,-5,-4,-3,-2,-1,0,1,2,3,4,5,6,7};
+  vector bool char vbcer = {0,255,255,0,0,0,255,0,255,0,0,255,255,255,0,255};
+  vector unsigned short vuser = {0,1,2,3,4,5,6,7};
+  vector signed short vsser = {-4,-3,-2,-1,0,1,2,3};
+  vector bool short vbser = {0,65535,65535,65535,65535,0,0,0};
+  vector pixel vper = {(0<<15) + (2<<10) + (3<<5) + 4,
+		       (1<<15) + (5<<10) + (6<<5) + 7,
+		       (0<<15) + (8<<10) + (9<<5) + 10,
+		       (1<<15) + (11<<10) + (12<<5) + 13,
+		       (1<<15) + (14<<10) + (15<<5) + 16,
+		       (0<<15) + (17<<10) + (18<<5) + 19,
+		       (1<<15) + (20<<10) + (21<<5) + 22,
+		       (0<<15) + (23<<10) + (24<<5) + 25};
+  vector unsigned char vucser = {0,255,1,255,2,255,3,255,4,255,5,255,6,255,7,255};
+  vector signed char vscser = {-1,-128,0,127,-2,-128,1,127,
+			       -3,-128,2,127,-4,-128,3,127};
+  vector unsigned short vusser = {0,65535,1,65535,2,65535,3,65535};
+  vector signed short vssser = {-1,-32768,0,32767,-2,-32768,1,32767};
+  vector unsigned char vucsuer1 = {0,255,1,255,2,255,3,255,4,255,5,255,6,255,7,255};
+  vector unsigned char vucsuer2 = {0,0,0,127,0,0,1,128,0,0,2,129,0,0,3,130};
+  vector unsigned short vussuer1 = {0,65535,1,65535,2,65535,3,65535};
+  vector unsigned short vussuer2 = {0,0,0,32767,0,0,1,32768};
+
+  vucr = vec_pack (vusa, vusb);
+  vscr = vec_pack (vssa, vssb);
+  vbcr = vec_pack (vbsa, vbsb);
+  vusr = vec_pack (vuia, vuib);
+  vssr = vec_pack (vsia, vsib);
+  vbsr = vec_pack (vbia, vbib);
+  vpr  = vec_packpx (vipa, vipb);
+  vucsr = vec_packs (vusc, vusd);
+  vscsr = vec_packs (vssc, vssd);
+  vussr = vec_packs (vuic, vuid);
+  vsssr = vec_packs (vsic, vsid);
+  vucsur1 = vec_packsu (vusc, vusd);
+  vucsur2 = vec_packsu (vssc, vssd);
+  vussur1 = vec_packsu (vuic, vuid);
+  vussur2 = vec_packsu (vsic, vsid);
+
+  check (vec_all_eq (vucr, vucer), "vucr");
+  check (vec_all_eq (vscr, vscer), "vscr");
+  check (vec_all_eq (vbcr, vbcer), "vbcr");
+  check (vec_all_eq (vusr, vuser), "vusr");
+  check (vec_all_eq (vssr, vsser), "vssr");
+  check (vec_all_eq (vbsr, vbser), "vbsr");
+  check (vec_all_eq (vpr,  vper ), "vpr" );
+  check (vec_all_eq (vucsr, vucser), "vucsr");
+  check (vec_all_eq (vscsr, vscser), "vscsr");
+  check (vec_all_eq (vussr, vusser), "vussr");
+  check (vec_all_eq (vsssr, vssser), "vsssr");
+  check (vec_all_eq (vucsur1, vucsuer1), "vucsur1");
+  check (vec_all_eq (vucsur2, vucsuer2), "vucsur2");
+  check (vec_all_eq (vussur1, vussuer1), "vussur1");
+  check (vec_all_eq (vussur2, vussuer2), "vussur2");
+}
Index: gcc-4_8-test/gcc/testsuite/gcc.dg/vmx/perm-be-order.c
===================================================================
--- /dev/null
+++ gcc-4_8-test/gcc/testsuite/gcc.dg/vmx/perm-be-order.c
@@ -0,0 +1,74 @@ 
+/* { dg-options "-maltivec=be -mabi=altivec -std=gnu99 -mno-vsx" } */
+
+#include "harness.h"
+
+static void test()
+{
+  /* Input vectors.  */
+  vector unsigned char vuca = {0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15};
+  vector unsigned char vucb = {16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31};
+  vector signed char vsca = {-16,-15,-14,-13,-12,-11,-10,-9,-8,-7,-6,-5,-4,-3,-2,-1};
+  vector signed char vscb = {0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15};
+  vector unsigned short vusa = {0,1,2,3,4,5,6,7};
+  vector unsigned short vusb = {8,9,10,11,12,13,14,15};
+  vector signed short vssa = {-8,-7,-6,-5,-4,-3,-2,-1};
+  vector signed short vssb = {0,1,2,3,4,5,6,7};
+  vector unsigned int vuia = {0,1,2,3};
+  vector unsigned int vuib = {4,5,6,7};
+  vector signed int vsia = {-4,-3,-2,-1};
+  vector signed int vsib = {0,1,2,3};
+  vector float vfa = {-4.0,-3.0,-2.0,-1.0};
+  vector float vfb = {0.0,1.0,2.0,3.0};
+
+#if __BYTE_ORDER__ == __ORDER_LITTLE_ENDIAN__
+  vector unsigned char vucp = {15,16,14,17,13,18,12,19,11,20,10,21,9,22,8,23};
+  vector unsigned char vscp = {15,16,14,17,13,18,12,19,11,20,10,21,9,22,8,23};
+  vector unsigned char vusp = {15,14,17,16,13,12,19,18,11,10,21,20,9,8,23,22};
+  vector unsigned char vssp = {15,14,17,16,13,12,19,18,11,10,21,20,9,8,23,22};
+  vector unsigned char vuip = {15,14,13,12,19,18,17,16,11,10,9,8,23,22,21,20};
+  vector unsigned char vsip = {15,14,13,12,19,18,17,16,11,10,9,8,23,22,21,20};
+  vector unsigned char vfp  = {15,14,13,12,19,18,17,16,11,10,9,8,23,22,21,20};
+#else
+  vector unsigned char vucp = {0,31,1,30,2,29,3,28,4,27,5,26,6,25,7,24};
+  vector unsigned char vscp = {0,31,1,30,2,29,3,28,4,27,5,26,6,25,7,24};
+  vector unsigned char vusp = {0,1,30,31,2,3,28,29,4,5,26,27,6,7,24,25};
+  vector unsigned char vssp = {0,1,30,31,2,3,28,29,4,5,26,27,6,7,24,25};
+  vector unsigned char vuip = {0,1,2,3,28,29,30,31,4,5,6,7,24,25,26,27};
+  vector unsigned char vsip = {0,1,2,3,28,29,30,31,4,5,6,7,24,25,26,27};
+  vector unsigned char vfp  = {0,1,2,3,28,29,30,31,4,5,6,7,24,25,26,27};
+#endif
+
+  /* Result vectors.  */
+  vector unsigned char vuc;
+  vector signed char vsc;
+  vector unsigned short vus;
+  vector signed short vss;
+  vector unsigned int vui;
+  vector signed int vsi;
+  vector float vf;
+
+  /* Expected result vectors.  */
+  vector unsigned char vucr = {0,31,1,30,2,29,3,28,4,27,5,26,6,25,7,24};
+  vector signed char vscr = {-16,15,-15,14,-14,13,-13,12,-12,11,-11,10,-10,9,-9,8};
+  vector unsigned short vusr = {0,15,1,14,2,13,3,12};
+  vector signed short vssr = {-8,7,-7,6,-6,5,-5,4};
+  vector unsigned int vuir = {0,7,1,6};
+  vector signed int vsir = {-4,3,-3,2};
+  vector float vfr = {-4.0,3.0,-3.0,2.0};
+
+  vuc = vec_perm (vuca, vucb, vucp);
+  vsc = vec_perm (vsca, vscb, vscp);
+  vus = vec_perm (vusa, vusb, vusp);
+  vss = vec_perm (vssa, vssb, vssp);
+  vui = vec_perm (vuia, vuib, vuip);
+  vsi = vec_perm (vsia, vsib, vsip);
+  vf  = vec_perm (vfa,  vfb,  vfp );
+
+  check (vec_all_eq (vuc, vucr), "vuc");
+  check (vec_all_eq (vsc, vscr), "vsc");
+  check (vec_all_eq (vus, vusr), "vus");
+  check (vec_all_eq (vss, vssr), "vss");
+  check (vec_all_eq (vui, vuir), "vui");
+  check (vec_all_eq (vsi, vsir), "vsi");
+  check (vec_all_eq (vf,  vfr),  "vf" );
+}
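
For vec_perm under -maltivec=be, the BE renumbering applies to the
control slots and to the result slots alike, so those two reversals
cancel; what remains is that each control value selects a big-endian
byte position of the 32-byte concatenation a||b.  A host-side model
in plain C (illustrative; sel is not a GCC function):

#include <stdio.h>

static unsigned char sel (const unsigned char *a, const unsigned char *b,
                          unsigned int v)
{
  /* BE byte v of a||b, with a and b held in LE literal order.  */
  return v < 16 ? a[15 - v] : b[31 - v];
}

int main (void)
{
  unsigned char a[16], b[16], r[16];
  unsigned char p[16] = {15,16,14,17,13,18,12,19,11,20,10,21,9,22,8,23};
  int i;
  for (i = 0; i < 16; i++)
    {
      a[i] = i;                /* vuca  */
      b[i] = 16 + i;           /* vucb  */
    }
  for (i = 0; i < 16; i++)
    r[i] = sel (a, b, p[i]);
  for (i = 0; i < 16; i++)
    printf ("%d ", r[i]);      /* 0 31 1 30 ... 7 24, matching vucr  */
  printf ("\n");
  return 0;
}
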
Index: gcc-4_8-test/gcc/testsuite/gcc.dg/vmx/perm.c
===================================================================
--- /dev/null
+++ gcc-4_8-test/gcc/testsuite/gcc.dg/vmx/perm.c
@@ -0,0 +1,69 @@ 
+#include "harness.h"
+
+static void test()
+{
+  /* Input vectors.  */
+  vector unsigned char vuca = {0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15};
+  vector unsigned char vucb
+    = {16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31};
+  vector unsigned char vucp = {0,31,1,30,2,29,3,28,4,27,5,26,6,25,7,24};
+
+  vector signed char vsca
+    = {-16,-15,-14,-13,-12,-11,-10,-9,-8,-7,-6,-5,-4,-3,-2,-1};
+  vector signed char vscb = {0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15};
+  vector unsigned char vscp = {0,31,1,30,2,29,3,28,4,27,5,26,6,25,7,24};
+
+  vector unsigned short vusa = {0,1,2,3,4,5,6,7};
+  vector unsigned short vusb = {8,9,10,11,12,13,14,15};
+  vector unsigned char vusp = {0,1,30,31,2,3,28,29,4,5,26,27,6,7,24,25};
+
+  vector signed short vssa = {-8,-7,-6,-5,-4,-3,-2,-1};
+  vector signed short vssb = {0,1,2,3,4,5,6,7};
+  vector unsigned char vssp = {0,1,30,31,2,3,28,29,4,5,26,27,6,7,24,25};
+
+  vector unsigned int vuia = {0,1,2,3};
+  vector unsigned int vuib = {4,5,6,7};
+  vector unsigned char vuip = {0,1,2,3,28,29,30,31,4,5,6,7,24,25,26,27};
+
+  vector signed int vsia = {-4,-3,-2,-1};
+  vector signed int vsib = {0,1,2,3};
+  vector unsigned char vsip = {0,1,2,3,28,29,30,31,4,5,6,7,24,25,26,27};
+
+  vector float vfa = {-4.0,-3.0,-2.0,-1.0};
+  vector float vfb = {0.0,1.0,2.0,3.0};
+  vector unsigned char vfp = {0,1,2,3,28,29,30,31,4,5,6,7,24,25,26,27};
+
+  /* Result vectors.  */
+  vector unsigned char vuc;
+  vector signed char vsc;
+  vector unsigned short vus;
+  vector signed short vss;
+  vector unsigned int vui;
+  vector signed int vsi;
+  vector float vf;
+
+  /* Expected result vectors.  */
+  vector unsigned char vucr = {0,31,1,30,2,29,3,28,4,27,5,26,6,25,7,24};
+  vector signed char vscr = {-16,15,-15,14,-14,13,-13,12,-12,11,-11,10,-10,9,-9,8};
+  vector unsigned short vusr = {0,15,1,14,2,13,3,12};
+  vector signed short vssr = {-8,7,-7,6,-6,5,-5,4};
+  vector unsigned int vuir = {0,7,1,6};
+  vector signed int vsir = {-4,3,-3,2};
+  vector float vfr = {-4.0,3.0,-3.0,2.0};
+
+  vuc = vec_perm (vuca, vucb, vucp);
+  vsc = vec_perm (vsca, vscb, vscp);
+  vus = vec_perm (vusa, vusb, vusp);
+  vss = vec_perm (vssa, vssb, vssp);
+  vui = vec_perm (vuia, vuib, vuip);
+  vsi = vec_perm (vsia, vsib, vsip);
+  vf  = vec_perm (vfa,  vfb,  vfp );
+
+  check (vec_all_eq (vuc, vucr), "vuc");
+  check (vec_all_eq (vsc, vscr), "vsc");
+  check (vec_all_eq (vus, vusr), "vus");
+  check (vec_all_eq (vss, vssr), "vss");
+  check (vec_all_eq (vui, vuir), "vui");
+  check (vec_all_eq (vsi, vsir), "vsi");
+  check (vec_all_eq (vf,  vfr),  "vf" );
+}
Index: gcc-4_8-test/gcc/testsuite/gcc.dg/vmx/sn7153.c
===================================================================
--- gcc-4_8-test.orig/gcc/testsuite/gcc.dg/vmx/sn7153.c
+++ gcc-4_8-test/gcc/testsuite/gcc.dg/vmx/sn7153.c
@@ -34,7 +34,11 @@  main()
 
 void validate_sat()
 {
+#ifdef __LITTLE_ENDIAN__
+  if (vec_any_ne(vec_splat(vec_mfvscr(), 0), ((vector unsigned short){1,1,1,1,1,1,1,1})))
+#else
   if (vec_any_ne(vec_splat(vec_mfvscr(), 7), ((vector unsigned short){1,1,1,1,1,1,1,1})))
+#endif
     {
       union {vector unsigned short v; unsigned short s[8];} u;
       u.v = vec_mfvscr();
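
Unlike the new files, this hunk adjusts an existing test: vec_mfvscr
returns the VSCR in the low-order bits of a vector of eight
halfwords, and the SAT flag sits in the low-order halfword, which is
element 7 in big-endian numbering but element 0 in the little-endian
numbering selected by this file's __LITTLE_ENDIAN__ branch.  A
one-function sketch of that index choice (assuming eight halfword
elements; sat_halfword is an illustrative name, not part of the
patch):

/* Halfword element holding the VSCR SAT flag, by element order.  */
static int sat_halfword (int little_endian)
{
  return little_endian ? 0 : 7;
}
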
Index: gcc-4_8-test/gcc/testsuite/gcc.dg/vmx/splat-be-order.c
===================================================================
--- /dev/null
+++ gcc-4_8-test/gcc/testsuite/gcc.dg/vmx/splat-be-order.c
@@ -0,0 +1,59 @@ 
+/* { dg-options "-maltivec=be -mabi=altivec -std=gnu99 -mno-vsx" } */
+
+#include "harness.h"
+
+static void test()
+{
+  /* Input vectors.  */
+  vector unsigned char vuc = {0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15};
+  vector signed char vsc = {-8,-7,-6,-5,-4,-3,-2,-1,0,1,2,3,4,5,6,7};
+  vector unsigned short vus = {0,1,2,3,4,5,6,7};
+  vector signed short vss = {-4,-3,-2,-1,0,1,2,3};
+  vector unsigned int vui = {0,1,2,3};
+  vector signed int vsi = {-2,-1,0,1};
+  vector float vf = {-2.0,-1.0,0.0,1.0};
+
+  /* Result vectors.  */
+  vector unsigned char vucr;
+  vector signed char vscr;
+  vector unsigned short vusr;
+  vector signed short vssr;
+  vector unsigned int vuir;
+  vector signed int vsir;
+  vector float vfr;
+
+  /* Expected result vectors.  */
+#if __BYTE_ORDER__ == __ORDER_LITTLE_ENDIAN__
+  vector unsigned char vucer = {14,14,14,14,14,14,14,14,14,14,14,14,14,14,14,14};
+  vector signed char vscer = {-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1,-1};
+  vector unsigned short vuser = {0,0,0,0,0,0,0,0};
+  vector signed short vsser = {3,3,3,3,3,3,3,3};
+  vector unsigned int vuier = {1,1,1,1};
+  vector signed int vsier = {-2,-2,-2,-2};
+  vector float vfer = {0.0,0.0,0.0,0.0};
+#else
+  vector unsigned char vucer = {1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1};
+  vector signed char vscer = {0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0};
+  vector unsigned short vuser = {7,7,7,7,7,7,7,7};
+  vector signed short vsser = {-4,-4,-4,-4,-4,-4,-4,-4};
+  vector unsigned int vuier = {2,2,2,2};
+  vector signed int vsier = {1,1,1,1};
+  vector float vfer = {-1.0,-1.0,-1.0,-1.0};
+#endif
+
+  vucr = vec_splat (vuc, 1);
+  vscr = vec_splat (vsc, 8);
+  vusr = vec_splat (vus, 7);
+  vssr = vec_splat (vss, 0);
+  vuir = vec_splat (vui, 2);
+  vsir = vec_splat (vsi, 3);
+  vfr  = vec_splat (vf,  1);
+
+  check (vec_all_eq (vucr, vucer), "vuc");
+  check (vec_all_eq (vscr, vscer), "vsc");
+  check (vec_all_eq (vusr, vuser), "vus");
+  check (vec_all_eq (vssr, vsser), "vss");
+  check (vec_all_eq (vuir, vuier), "vui");
+  check (vec_all_eq (vsir, vsier), "vsi");
+  check (vec_all_eq (vfr,  vfer ), "vf");
+}
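
The splat tests reuse the plain -maltivec inputs unchanged, so the
remapping shows up purely in the selected element: vec_splat (v, b)
under -maltivec=be replicates BE element b, i.e. LE-literal element
(n - 1 - b).  A tiny host-side sketch (be_splat_src is an
illustrative name):

#include <stdio.h>

static int be_splat_src (int b, int n)
{
  return n - 1 - b;
}

int main (void)
{
  printf ("%d %d %d\n",
          be_splat_src (1, 16),  /* 14: vec_splat (vuc, 1), all 14s  */
          be_splat_src (8, 16),  /*  7: vec_splat (vsc, 8), all -1s  */
          be_splat_src (0, 8));  /*  7: vec_splat (vss, 0), all 3s   */
  return 0;
}
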
Index: gcc-4_8-test/gcc/testsuite/gcc.dg/vmx/splat-vsx-be-order.c
===================================================================
--- /dev/null
+++ gcc-4_8-test/gcc/testsuite/gcc.dg/vmx/splat-vsx-be-order.c
@@ -0,0 +1,37 @@ 
+/* { dg-skip-if "" { powerpc*-*-darwin* } { "*" } { "" } } */
+/* { dg-require-effective-target powerpc_vsx_ok } */
+/* { dg-options "-maltivec=be -mabi=altivec -std=gnu99 -mvsx" } */
+
+#include "harness.h"
+
+static void test()
+{
+  /* Input vectors.  */
+  vector unsigned int vui = {0,1,2,3};
+  vector signed int vsi = {-2,-1,0,1};
+  vector float vf = {-2.0,-1.0,0.0,1.0};
+
+  /* Result vectors.  */
+  vector unsigned int vuir;
+  vector signed int vsir;
+  vector float vfr;
+
+  /* Expected result vectors.  */
+#if __BYTE_ORDER__ == __ORDER_LITTLE_ENDIAN__
+  vector unsigned int vuier = {1,1,1,1};
+  vector signed int vsier = {-2,-2,-2,-2};
+  vector float vfer = {0.0,0.0,0.0,0.0};
+#else
+  vector unsigned int vuier = {2,2,2,2};
+  vector signed int vsier = {1,1,1,1};
+  vector float vfer = {-1.0,-1.0,-1.0,-1.0};
+#endif
+
+  vuir = vec_splat (vui, 2);
+  vsir = vec_splat (vsi, 3);
+  vfr  = vec_splat (vf,  1);
+
+  check (vec_all_eq (vuir, vuier), "vui");
+  check (vec_all_eq (vsir, vsier), "vsi");
+  check (vec_all_eq (vfr,  vfer ), "vf");
+}
Index: gcc-4_8-test/gcc/testsuite/gcc.dg/vmx/splat-vsx.c
===================================================================
--- /dev/null
+++ gcc-4_8-test/gcc/testsuite/gcc.dg/vmx/splat-vsx.c
@@ -0,0 +1,31 @@ 
+/* { dg-skip-if "" { powerpc*-*-darwin* } { "*" } { "" } } */
+/* { dg-require-effective-target powerpc_vsx_ok } */
+/* { dg-options "-maltivec -mabi=altivec -std=gnu99 -mvsx" } */
+
+#include "harness.h"
+
+static void test()
+{
+  /* Input vectors.  */
+  vector unsigned int vui = {0,1,2,3};
+  vector signed int vsi = {-2,-1,0,1};
+  vector float vf = {-2.0,-1.0,0.0,1.0};
+
+  /* Result vectors.  */
+  vector unsigned int vuir;
+  vector signed int vsir;
+  vector float vfr;
+
+  /* Expected result vectors.  */
+  vector unsigned int vuier = {2,2,2,2};
+  vector signed int vsier = {1,1,1,1};
+  vector float vfer = {-1.0,-1.0,-1.0,-1.0};
+
+  vuir = vec_splat (vui, 2);
+  vsir = vec_splat (vsi, 3);
+  vfr  = vec_splat (vf,  1);
+
+  check (vec_all_eq (vuir, vuier), "vui");
+  check (vec_all_eq (vsir, vsier), "vsi");
+  check (vec_all_eq (vfr,  vfer ), "vf");
+}
Index: gcc-4_8-test/gcc/testsuite/gcc.dg/vmx/splat.c
===================================================================
--- /dev/null
+++ gcc-4_8-test/gcc/testsuite/gcc.dg/vmx/splat.c
@@ -0,0 +1,47 @@ 
+#include "harness.h"
+
+static void test()
+{
+  /* Input vectors.  */
+  vector unsigned char vuc = {0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15};
+  vector signed char vsc = {-8,-7,-6,-5,-4,-3,-2,-1,0,1,2,3,4,5,6,7};
+  vector unsigned short vus = {0,1,2,3,4,5,6,7};
+  vector signed short vss = {-4,-3,-2,-1,0,1,2,3};
+  vector unsigned int vui = {0,1,2,3};
+  vector signed int vsi = {-2,-1,0,1};
+  vector float vf = {-2.0,-1.0,0.0,1.0};
+
+  /* Result vectors.  */
+  vector unsigned char vucr;
+  vector signed char vscr;
+  vector unsigned short vusr;
+  vector signed short vssr;
+  vector unsigned int vuir;
+  vector signed int vsir;
+  vector float vfr;
+
+  /* Expected result vectors.  */
+  vector unsigned char vucer = {1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1};
+  vector signed char vscer = {0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0};
+  vector unsigned short vuser = {7,7,7,7,7,7,7,7};
+  vector signed short vsser = {-4,-4,-4,-4,-4,-4,-4,-4};
+  vector unsigned int vuier = {2,2,2,2};
+  vector signed int vsier = {1,1,1,1};
+  vector float vfer = {-1.0,-1.0,-1.0,-1.0};
+
+  vucr = vec_splat (vuc, 1);
+  vscr = vec_splat (vsc, 8);
+  vusr = vec_splat (vus, 7);
+  vssr = vec_splat (vss, 0);
+  vuir = vec_splat (vui, 2);
+  vsir = vec_splat (vsi, 3);
+  vfr  = vec_splat (vf,  1);
+
+  check (vec_all_eq (vucr, vucer), "vuc");
+  check (vec_all_eq (vscr, vscer), "vsc");
+  check (vec_all_eq (vusr, vuser), "vus");
+  check (vec_all_eq (vssr, vsser), "vss");
+  check (vec_all_eq (vuir, vuier), "vui");
+  check (vec_all_eq (vsir, vsier), "vsi");
+  check (vec_all_eq (vfr,  vfer ), "vf");
+}
Index: gcc-4_8-test/gcc/testsuite/gcc.dg/vmx/st-be-order.c
===================================================================
--- /dev/null
+++ gcc-4_8-test/gcc/testsuite/gcc.dg/vmx/st-be-order.c
@@ -0,0 +1,83 @@ 
+/* { dg-options "-maltivec=be -mabi=altivec -std=gnu99 -mno-vsx" } */
+
+#include "harness.h"
+
+static unsigned char svuc[16] __attribute__ ((aligned (16)));
+static signed char svsc[16] __attribute__ ((aligned (16)));
+static unsigned char svbc[16] __attribute__ ((aligned (16)));
+static unsigned short svus[8] __attribute__ ((aligned (16)));
+static signed short svss[8] __attribute__ ((aligned (16)));
+static unsigned short svbs[8] __attribute__ ((aligned (16)));
+static unsigned short svp[8] __attribute__ ((aligned (16)));
+static unsigned int svui[4] __attribute__ ((aligned (16)));
+static signed int svsi[4] __attribute__ ((aligned (16)));
+static unsigned int svbi[4] __attribute__ ((aligned (16)));
+static float svf[4] __attribute__ ((aligned (16)));
+
+static void check_arrays ()
+{
+  unsigned int i;
+  for (i = 0; i < 16; ++i)
+    {
+      check (svuc[i] == i, "svuc");
+      check (svsc[i] == i - 8, "svsc");
+      check (svbc[i] == ((i % 2) ? 0xff : 0), "svbc");
+    }
+  for (i = 0; i < 8; ++i)
+    {
+      check (svus[i] == i, "svus");
+      check (svss[i] == i - 4, "svss");
+      check (svbs[i] == ((i % 2) ? 0xffff : 0), "svbs");
+      check (svp[i] == i, "svp");
+    }
+  for (i = 0; i < 4; ++i)
+    {
+      check (svui[i] == i, "svui");
+      check (svsi[i] == i - 2, "svsi");
+      check (svbi[i] == ((i % 2) ? 0xffffffff : 0), "svbi");
+      check (svf[i] == i * 1.0f, "svf");
+    }
+}
+
+static void test ()
+{
+#if __BYTE_ORDER__ == __ORDER_LITTLE_ENDIAN__
+  vector unsigned char vuc = {15,14,13,12,11,10,9,8,7,6,5,4,3,2,1,0};
+  vector signed char vsc = {7,6,5,4,3,2,1,0,-1,-2,-3,-4,-5,-6,-7,-8};
+  vector bool char vbc = {255,0,255,0,255,0,255,0,255,0,255,0,255,0,255,0};
+  vector unsigned short vus = {7,6,5,4,3,2,1,0};
+  vector signed short vss = {3,2,1,0,-1,-2,-3,-4};
+  vector bool short vbs = {65535,0,65535,0,65535,0,65535,0};
+  vector pixel vp = {7,6,5,4,3,2,1,0};
+  vector unsigned int vui = {3,2,1,0};
+  vector signed int vsi = {1,0,-1,-2};
+  vector bool int vbi = {0xffffffff,0,0xffffffff,0};
+  vector float vf = {3.0,2.0,1.0,0.0};
+#else
+  vector unsigned char vuc = {0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15};
+  vector signed char vsc = {-8,-7,-6,-5,-4,-3,-2,-1,0,1,2,3,4,5,6,7};
+  vector bool char vbc = {0,255,0,255,0,255,0,255,0,255,0,255,0,255,0,255};
+  vector unsigned short vus = {0,1,2,3,4,5,6,7};
+  vector signed short vss = {-4,-3,-2,-1,0,1,2,3};
+  vector bool short vbs = {0,65535,0,65535,0,65535,0,65535};
+  vector pixel vp = {0,1,2,3,4,5,6,7};
+  vector unsigned int vui = {0,1,2,3};
+  vector signed int vsi = {-2,-1,0,1};
+  vector bool int vbi = {0,0xffffffff,0,0xffffffff};
+  vector float vf = {0.0,1.0,2.0,3.0};
+#endif
+
+  vec_st (vuc, 0, (vector unsigned char *)svuc);
+  vec_st (vsc, 0, (vector signed char *)svsc);
+  vec_st (vbc, 0, (vector bool char *)svbc);
+  vec_st (vus, 0, (vector unsigned short *)svus);
+  vec_st (vss, 0, (vector signed short *)svss);
+  vec_st (vbs, 0, (vector bool short *)svbs);
+  vec_st (vp,  0, (vector pixel *)svp);
+  vec_st (vui, 0, (vector unsigned int *)svui);
+  vec_st (vsi, 0, (vector signed int *)svsi);
+  vec_st (vbi, 0, (vector bool int *)svbi);
+  vec_st (vf,  0, (vector float *)svf);
+
+  check_arrays ();
+}
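
The store tests are the converse of the loads: vec_st under
-maltivec=be writes the big-endian element order to memory, so the
reversed LE literals above land in memory in ascending order, which
check_arrays () then verifies.  Sketch in plain C (illustrative
only):

#include <stdio.h>

int main (void)
{
  unsigned int vui_le[4] = {3, 2, 1, 0};  /* LE branch literal  */
  unsigned int svui[4];
  int i;
  for (i = 0; i < 4; i++)
    svui[i] = vui_le[4 - 1 - i];
  for (i = 0; i < 4; i++)
    printf ("%u ", svui[i]);              /* 0 1 2 3  */
  printf ("\n");
  return 0;
}
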
Index: gcc-4_8-test/gcc/testsuite/gcc.dg/vmx/st-vsx-be-order.c
===================================================================
--- /dev/null
+++ gcc-4_8-test/gcc/testsuite/gcc.dg/vmx/st-vsx-be-order.c
@@ -0,0 +1,34 @@ 
+/* { dg-skip-if "" { powerpc*-*-darwin* } { "*" } { "" } } */
+/* { dg-require-effective-target powerpc_vsx_ok } */
+/* { dg-options "-maltivec=be -mabi=altivec -std=gnu99 -mvsx" } */
+
+#include "harness.h"
+
+static unsigned long long svul[2] __attribute__ ((aligned (16)));
+static double svd[2] __attribute__ ((aligned (16)));
+
+static void check_arrays ()
+{
+  unsigned int i;
+  for (i = 0; i < 2; ++i)
+    {
+      check (svul[i] == i, "svul");
+      check (svd[i] == i * 1.0, "svd");
+    }
+}
+
+static void test ()
+{
+#if __BYTE_ORDER__ == __ORDER_LITTLE_ENDIAN__
+  vector unsigned long long vul = {1,0};
+  vector double vd = {1.0,0.0};
+#else
+  vector unsigned long long vul = {0,1};
+  vector double vd = {0.0,1.0};
+#endif
+
+  vec_st (vul, 0, (vector unsigned long long *)svul);
+  vec_st (vd,  0, (vector double *)svd);
+
+  check_arrays ();
+}
Index: gcc-4_8-test/gcc/testsuite/gcc.dg/vmx/st-vsx.c
===================================================================
--- /dev/null
+++ gcc-4_8-test/gcc/testsuite/gcc.dg/vmx/st-vsx.c
@@ -0,0 +1,29 @@ 
+/* { dg-skip-if "" { powerpc*-*-darwin* } { "*" } { "" } } */
+/* { dg-require-effective-target powerpc_vsx_ok } */
+/* { dg-options "-maltivec -mabi=altivec -std=gnu99 -mvsx" } */
+
+#include "harness.h"
+
+static unsigned long long svul[2] __attribute__ ((aligned (16)));
+static double svd[2] __attribute__ ((aligned (16)));
+
+static void check_arrays ()
+{
+  unsigned int i;
+  for (i = 0; i < 2; ++i)
+    {
+      check (svul[i] == i, "svul");
+      check (svd[i] == i * 1.0, "svd");
+    }
+}
+
+static void test ()
+{
+  vector unsigned long long vul = {0,1};
+  vector double vd = {0.0,1.0};
+
+  vec_st (vul, 0, (vector unsigned long long *)svul);
+  vec_st (vd,  0, (vector double *)svd);
+
+  check_arrays ();
+}
Index: gcc-4_8-test/gcc/testsuite/gcc.dg/vmx/st.c
===================================================================
--- /dev/null
+++ gcc-4_8-test/gcc/testsuite/gcc.dg/vmx/st.c
@@ -0,0 +1,67 @@ 
+#include "harness.h"
+
+static unsigned char svuc[16] __attribute__ ((aligned (16)));
+static signed char svsc[16] __attribute__ ((aligned (16)));
+static unsigned char svbc[16] __attribute__ ((aligned (16)));
+static unsigned short svus[8] __attribute__ ((aligned (16)));
+static signed short svss[8] __attribute__ ((aligned (16)));
+static unsigned short svbs[8] __attribute__ ((aligned (16)));
+static unsigned short svp[8] __attribute__ ((aligned (16)));
+static unsigned int svui[4] __attribute__ ((aligned (16)));
+static signed int svsi[4] __attribute__ ((aligned (16)));
+static unsigned int svbi[4] __attribute__ ((aligned (16)));
+static float svf[4] __attribute__ ((aligned (16)));
+
+static void check_arrays ()
+{
+  unsigned int i;
+  for (i = 0; i < 16; ++i)
+    {
+      check (svuc[i] == i, "svuc");
+      check (svsc[i] == i - 8, "svsc");
+      check (svbc[i] == ((i % 2) ? 0xff : 0), "svbc");
+    }
+  for (i = 0; i < 8; ++i)
+    {
+      check (svus[i] == i, "svus");
+      check (svss[i] == i - 4, "svss");
+      check (svbs[i] == ((i % 2) ? 0xffff : 0), "svbs");
+      check (svp[i] == i, "svp");
+    }
+  for (i = 0; i < 4; ++i)
+    {
+      check (svui[i] == i, "svui");
+      check (svsi[i] == i - 2, "svsi");
+      check (svbi[i] == ((i % 2) ? 0xffffffff : 0), "svbi");
+      check (svf[i] == i * 1.0f, "svf");
+    }
+}
+
+static void test ()
+{
+  vector unsigned char vuc = {0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15};
+  vector signed char vsc = {-8,-7,-6,-5,-4,-3,-2,-1,0,1,2,3,4,5,6,7};
+  vector bool char vbc = {0,255,0,255,0,255,0,255,0,255,0,255,0,255,0,255};
+  vector unsigned short vus = {0,1,2,3,4,5,6,7};
+  vector signed short vss = {-4,-3,-2,-1,0,1,2,3};
+  vector bool short vbs = {0,65535,0,65535,0,65535,0,65535};
+  vector pixel vp = {0,1,2,3,4,5,6,7};
+  vector unsigned int vui = {0,1,2,3};
+  vector signed int vsi = {-2,-1,0,1};
+  vector bool int vbi = {0,0xffffffff,0,0xffffffff};
+  vector float vf = {0.0,1.0,2.0,3.0};
+
+  vec_st (vuc, 0, (vector unsigned char *)svuc);
+  vec_st (vsc, 0, (vector signed char *)svsc);
+  vec_st (vbc, 0, (vector bool char *)svbc);
+  vec_st (vus, 0, (vector unsigned short *)svus);
+  vec_st (vss, 0, (vector signed short *)svss);
+  vec_st (vbs, 0, (vector bool short *)svbs);
+  vec_st (vp,  0, (vector pixel *)svp);
+  vec_st (vui, 0, (vector unsigned int *)svui);
+  vec_st (vsi, 0, (vector signed int *)svsi);
+  vec_st (vbi, 0, (vector bool int *)svbi);
+  vec_st (vf,  0, (vector float *)svf);
+
+  check_arrays ();
+}
Index: gcc-4_8-test/gcc/testsuite/gcc.dg/vmx/ste-be-order.c
===================================================================
--- /dev/null
+++ gcc-4_8-test/gcc/testsuite/gcc.dg/vmx/ste-be-order.c
@@ -0,0 +1,53 @@ 
+/* { dg-options "-maltivec=be -mabi=altivec -std=gnu99 -mno-vsx" } */
+
+#include "harness.h"
+
+static unsigned char svuc[16] __attribute__ ((aligned (16)));
+static signed char svsc[16] __attribute__ ((aligned (16)));
+static unsigned short svus[8] __attribute__ ((aligned (16)));
+static signed short svss[8] __attribute__ ((aligned (16)));
+static unsigned int svui[4] __attribute__ ((aligned (16)));
+static signed int svsi[4] __attribute__ ((aligned (16)));
+static float svf[4] __attribute__ ((aligned (16)));
+
+static void check_arrays ()
+{
+  check (svuc[9] == 9, "svuc");
+  check (svsc[14] == 6, "svsc");
+  check (svus[7] == 7, "svus");
+  check (svss[1] == -3, "svss");
+  check (svui[3] == 3, "svui");
+  check (svsi[2] == 0, "svsi");
+  check (svf[0] == 0.0, "svf");
+}
+
+static void test ()
+{
+#if __BYTE_ORDER__ == __ORDER_LITTLE_ENDIAN__
+  vector unsigned char vuc = {15,14,13,12,11,10,9,8,7,6,5,4,3,2,1,0};
+  vector signed char vsc = {7,6,5,4,3,2,1,0,-1,-2,-3,-4,-5,-6,-7,-8};
+  vector unsigned short vus = {7,6,5,4,3,2,1,0};
+  vector signed short vss = {3,2,1,0,-1,-2,-3,-4};
+  vector unsigned int vui = {3,2,1,0};
+  vector signed int vsi = {1,0,-1,-2};
+  vector float vf = {3.0,2.0,1.0,0.0};
+#else
+  vector unsigned char vuc = {0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15};
+  vector signed char vsc = {-8,-7,-6,-5,-4,-3,-2,-1,0,1,2,3,4,5,6,7};
+  vector unsigned short vus = {0,1,2,3,4,5,6,7};
+  vector signed short vss = {-4,-3,-2,-1,0,1,2,3};
+  vector unsigned int vui = {0,1,2,3};
+  vector signed int vsi = {-2,-1,0,1};
+  vector float vf = {0.0,1.0,2.0,3.0};
+#endif
+
+  vec_ste (vuc, 9*1, (unsigned char *)svuc);
+  vec_ste (vsc, 14*1, (signed char *)svsc);
+  vec_ste (vus, 7*2, (unsigned short *)svus);
+  vec_ste (vss, 1*2, (signed short *)svss);
+  vec_ste (vui, 3*4, (unsigned int *)svui);
+  vec_ste (vsi, 2*4, (signed int *)svsi);
+  vec_ste (vf,  0*4, (float *)svf);
+
+  check_arrays ();
+}
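
vec_ste stores a single element: byte offset o with element size s
writes array slot o/s, and under -maltivec=be the value taken is BE
element o/s, i.e. LE-literal element (n - 1 - o/s); hence the
reversed input literals above.  A plain-C sketch of the signed short
case (illustrative only):

#include <stdio.h>

int main (void)
{
  signed short vss_le[8] = {3,2,1,0,-1,-2,-3,-4};  /* LE literal  */
  int o = 1 * 2, s = 2, n = 8;
  int elem = o / s;                       /* array slot written  */
  int val = vss_le[n - 1 - elem];
  printf ("svss[%d] = %d\n", elem, val);  /* svss[1] = -3, as checked  */
  return 0;
}
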
Index: gcc-4_8-test/gcc/testsuite/gcc.dg/vmx/ste.c
===================================================================
--- /dev/null
+++ gcc-4_8-test/gcc/testsuite/gcc.dg/vmx/ste.c
@@ -0,0 +1,41 @@ 
+#include "harness.h"
+
+static unsigned char svuc[16] __attribute__ ((aligned (16)));
+static signed char svsc[16] __attribute__ ((aligned (16)));
+static unsigned short svus[8] __attribute__ ((aligned (16)));
+static signed short svss[8] __attribute__ ((aligned (16)));
+static unsigned int svui[4] __attribute__ ((aligned (16)));
+static signed int svsi[4] __attribute__ ((aligned (16)));
+static float svf[4] __attribute__ ((aligned (16)));
+
+static void check_arrays ()
+{
+  check (svuc[9] == 9, "svuc");
+  check (svsc[14] == 6, "svsc");
+  check (svus[7] == 7, "svus");
+  check (svss[1] == -3, "svss");
+  check (svui[3] == 3, "svui");
+  check (svsi[2] == 0, "svsi");
+  check (svf[0] == 0.0, "svf");
+}
+
+static void test ()
+{
+  vector unsigned char vuc = {0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15};
+  vector signed char vsc = {-8,-7,-6,-5,-4,-3,-2,-1,0,1,2,3,4,5,6,7};
+  vector unsigned short vus = {0,1,2,3,4,5,6,7};
+  vector signed short vss = {-4,-3,-2,-1,0,1,2,3};
+  vector unsigned int vui = {0,1,2,3};
+  vector signed int vsi = {-2,-1,0,1};
+  vector float vf = {0.0,1.0,2.0,3.0};
+
+  vec_ste (vuc, 9*1, (unsigned char *)svuc);
+  vec_ste (vsc, 14*1, (signed char *)svsc);
+  vec_ste (vus, 7*2, (unsigned short *)svus);
+  vec_ste (vss, 1*2, (signed short *)svss);
+  vec_ste (vui, 3*4, (unsigned int *)svui);
+  vec_ste (vsi, 2*4, (signed int *)svsi);
+  vec_ste (vf,  0*4, (float *)svf);
+
+  check_arrays ();
+}
Index: gcc-4_8-test/gcc/testsuite/gcc.dg/vmx/stl-be-order.c
===================================================================
--- /dev/null
+++ gcc-4_8-test/gcc/testsuite/gcc.dg/vmx/stl-be-order.c
@@ -0,0 +1,83 @@ 
+/* { dg-options "-maltivec=be -mabi=altivec -std=gnu99 -mno-vsx" } */
+
+#include "harness.h"
+
+static unsigned char svuc[16] __attribute__ ((aligned (16)));
+static signed char svsc[16] __attribute__ ((aligned (16)));
+static unsigned char svbc[16] __attribute__ ((aligned (16)));
+static unsigned short svus[8] __attribute__ ((aligned (16)));
+static signed short svss[8] __attribute__ ((aligned (16)));
+static unsigned short svbs[8] __attribute__ ((aligned (16)));
+static unsigned short svp[8] __attribute__ ((aligned (16)));
+static unsigned int svui[4] __attribute__ ((aligned (16)));
+static signed int svsi[4] __attribute__ ((aligned (16)));
+static unsigned int svbi[4] __attribute__ ((aligned (16)));
+static float svf[4] __attribute__ ((aligned (16)));
+
+static void check_arrays ()
+{
+  unsigned int i;
+  for (i = 0; i < 16; ++i)
+    {
+      check (svuc[i] == i, "svuc");
+      check (svsc[i] == i - 8, "svsc");
+      check (svbc[i] == ((i % 2) ? 0xff : 0), "svbc");
+    }
+  for (i = 0; i < 8; ++i)
+    {
+      check (svus[i] == i, "svus");
+      check (svss[i] == i - 4, "svss");
+      check (svbs[i] == ((i % 2) ? 0xffff : 0), "svbs");
+      check (svp[i] == i, "svp");
+    }
+  for (i = 0; i < 4; ++i)
+    {
+      check (svui[i] == i, "svui");
+      check (svsi[i] == i - 2, "svsi");
+      check (svbi[i] == ((i % 2) ? 0xffffffff : 0), "svbi");
+      check (svf[i] == i * 1.0f, "svf");
+    }
+}
+
+static void test ()
+{
+#if __BYTE_ORDER__ == __ORDER_LITTLE_ENDIAN__
+  vector unsigned char vuc = {15,14,13,12,11,10,9,8,7,6,5,4,3,2,1,0};
+  vector signed char vsc = {7,6,5,4,3,2,1,0,-1,-2,-3,-4,-5,-6,-7,-8};
+  vector bool char vbc = {255,0,255,0,255,0,255,0,255,0,255,0,255,0,255,0};
+  vector unsigned short vus = {7,6,5,4,3,2,1,0};
+  vector signed short vss = {3,2,1,0,-1,-2,-3,-4};
+  vector bool short vbs = {65535,0,65535,0,65535,0,65535,0};
+  vector pixel vp = {7,6,5,4,3,2,1,0};
+  vector unsigned int vui = {3,2,1,0};
+  vector signed int vsi = {1,0,-1,-2};
+  vector bool int vbi = {0xffffffff,0,0xffffffff,0};
+  vector float vf = {3.0,2.0,1.0,0.0};
+#else
+  vector unsigned char vuc = {0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15};
+  vector signed char vsc = {-8,-7,-6,-5,-4,-3,-2,-1,0,1,2,3,4,5,6,7};
+  vector bool char vbc = {0,255,0,255,0,255,0,255,0,255,0,255,0,255,0,255};
+  vector unsigned short vus = {0,1,2,3,4,5,6,7};
+  vector signed short vss = {-4,-3,-2,-1,0,1,2,3};
+  vector bool short vbs = {0,65535,0,65535,0,65535,0,65535};
+  vector pixel vp = {0,1,2,3,4,5,6,7};
+  vector unsigned int vui = {0,1,2,3};
+  vector signed int vsi = {-2,-1,0,1};
+  vector bool int vbi = {0,0xffffffff,0,0xffffffff};
+  vector float vf = {0.0,1.0,2.0,3.0};
+#endif
+
+  vec_stl (vuc, 0, (vector unsigned char *)svuc);
+  vec_stl (vsc, 0, (vector signed char *)svsc);
+  vec_stl (vbc, 0, (vector bool char *)svbc);
+  vec_stl (vus, 0, (vector unsigned short *)svus);
+  vec_stl (vss, 0, (vector signed short *)svss);
+  vec_stl (vbs, 0, (vector bool short *)svbs);
+  vec_stl (vp,  0, (vector pixel *)svp);
+  vec_stl (vui, 0, (vector unsigned int *)svui);
+  vec_stl (vsi, 0, (vector signed int *)svsi);
+  vec_stl (vbi, 0, (vector bool int *)svbi);
+  vec_stl (vf,  0, (vector float *)svf);
+
+  check_arrays ();
+}
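
Apart from its least-recently-used cache hint, vec_stl (stvxl) behaves as an
aligned 16-byte store; under -maltivec=be the elements land in big-endian
order, which is why the little-endian initializers above are mirrored.  The
stl-vsx variants and plain stl.c below check the same final memory image.  A
rough scalar model (hypothetical helper name), ignoring the cache hint and the
-maltivec=be element reversal:

    #include <stdint.h>
    #include <string.h>

    /* Rough model of vec_stl (v, off, base): copy all 16 bytes to the
       16-byte-aligned effective address.  The real instruction also
       marks the touched cache line least recently used.  */
    static void
    scalar_stl (const void *v, uintptr_t off, void *base)
    {
      uintptr_t ea = ((uintptr_t) base + off) & ~(uintptr_t) 15;
      memcpy ((void *) ea, v, 16);
    }
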
Index: gcc-4_8-test/gcc/testsuite/gcc.dg/vmx/stl-vsx-be-order.c
===================================================================
--- /dev/null
+++ gcc-4_8-test/gcc/testsuite/gcc.dg/vmx/stl-vsx-be-order.c
@@ -0,0 +1,34 @@ 
+/* { dg-skip-if "" { powerpc*-*-darwin* } { "*" } { "" } } */
+/* { dg-require-effective-target powerpc_vsx_ok } */
+/* { dg-options "-maltivec=be -mabi=altivec -std=gnu99 -mvsx" } */
+
+#include "harness.h"
+
+static unsigned long long svul[2] __attribute__ ((aligned (16)));
+static double svd[2] __attribute__ ((aligned (16)));
+
+static void check_arrays ()
+{
+  unsigned int i;
+  for (i = 0; i < 2; ++i)
+    {
+      check (svul[i] == i, "svul");
+      check (svd[i] == i * 1.0, "svd");
+    }
+}
+
+static void test ()
+{
+#if __BYTE_ORDER__ == __ORDER_LITTLE_ENDIAN__
+  vector unsigned long long vul = {1,0};
+  vector double vd = {1.0,0.0};
+#else
+  vector unsigned long long vul = {0,1};
+  vector double vd = {0.0,1.0};
+#endif
+
+  vec_stl (vul, 0, (vector unsigned long long *)svul);
+  vec_stl (vd,  0, (vector double *)svd);
+
+  check_arrays ();
+}
Index: gcc-4_8-test/gcc/testsuite/gcc.dg/vmx/stl-vsx.c
===================================================================
--- /dev/null
+++ gcc-4_8-test/gcc/testsuite/gcc.dg/vmx/stl-vsx.c
@@ -0,0 +1,29 @@ 
+/* { dg-skip-if "" { powerpc*-*-darwin* } { "*" } { "" } } */
+/* { dg-require-effective-target powerpc_vsx_ok } */
+/* { dg-options "-maltivec -mabi=altivec -std=gnu99 -mvsx" } */
+
+#include "harness.h"
+
+static unsigned long long svul[2] __attribute__ ((aligned (16)));
+static double svd[2] __attribute__ ((aligned (16)));
+
+static void check_arrays ()
+{
+  unsigned int i;
+  for (i = 0; i < 2; ++i)
+    {
+      check (svul[i] == i, "svul");
+      check (svd[i] == i * 1.0, "svd");
+    }
+}
+
+static void test ()
+{
+  vector unsigned long long vul = {0,1};
+  vector double vd = {0.0,1.0};
+
+  vec_stl (vul, 0, (vector unsigned long long *)svul);
+  vec_stl (vd,  0, (vector double *)svd);
+
+  check_arrays ();
+}
Index: gcc-4_8-test/gcc/testsuite/gcc.dg/vmx/stl.c
===================================================================
--- /dev/null
+++ gcc-4_8-test/gcc/testsuite/gcc.dg/vmx/stl.c
@@ -0,0 +1,67 @@ 
+#include "harness.h"
+
+static unsigned char svuc[16] __attribute__ ((aligned (16)));
+static signed char svsc[16] __attribute__ ((aligned (16)));
+static unsigned char svbc[16] __attribute__ ((aligned (16)));
+static unsigned short svus[8] __attribute__ ((aligned (16)));
+static signed short svss[8] __attribute__ ((aligned (16)));
+static unsigned short svbs[8] __attribute__ ((aligned (16)));
+static unsigned short svp[8] __attribute__ ((aligned (16)));
+static unsigned int svui[4] __attribute__ ((aligned (16)));
+static signed int svsi[4] __attribute__ ((aligned (16)));
+static unsigned int svbi[4] __attribute__ ((aligned (16)));
+static float svf[4] __attribute__ ((aligned (16)));
+
+static void check_arrays ()
+{
+  unsigned int i;
+  for (i = 0; i < 16; ++i)
+    {
+      check (svuc[i] == i, "svuc");
+      check (svsc[i] == i - 8, "svsc");
+      check (svbc[i] == ((i % 2) ? 0xff : 0), "svbc");
+    }
+  for (i = 0; i < 8; ++i)
+    {
+      check (svus[i] == i, "svus");
+      check (svss[i] == i - 4, "svss");
+      check (svbs[i] == ((i % 2) ? 0xffff : 0), "svbs");
+      check (svp[i] == i, "svp");
+    }
+  for (i = 0; i < 4; ++i)
+    {
+      check (svui[i] == i, "svui");
+      check (svsi[i] == i - 2, "svsi");
+      check (svbi[i] == ((i % 2) ? 0xffffffff : 0), "svbi");
+      check (svf[i] == i * 1.0f, "svf");
+    }
+}
+
+static void test ()
+{
+  vector unsigned char vuc = {0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15};
+  vector signed char vsc = {-8,-7,-6,-5,-4,-3,-2,-1,0,1,2,3,4,5,6,7};
+  vector bool char vbc = {0,255,0,255,0,255,0,255,0,255,0,255,0,255,0,255};
+  vector unsigned short vus = {0,1,2,3,4,5,6,7};
+  vector signed short vss = {-4,-3,-2,-1,0,1,2,3};
+  vector bool short vbs = {0,65535,0,65535,0,65535,0,65535};
+  vector pixel vp = {0,1,2,3,4,5,6,7};
+  vector unsigned int vui = {0,1,2,3};
+  vector signed int vsi = {-2,-1,0,1};
+  vector bool int vbi = {0,0xffffffff,0,0xffffffff};
+  vector float vf = {0.0,1.0,2.0,3.0};
+
+  vec_stl (vuc, 0, (vector unsigned char *)svuc);
+  vec_stl (vsc, 0, (vector signed char *)svsc);
+  vec_stl (vbc, 0, (vector bool char *)svbc);
+  vec_stl (vus, 0, (vector unsigned short *)svus);
+  vec_stl (vss, 0, (vector signed short *)svss);
+  vec_stl (vbs, 0, (vector bool short *)svbs);
+  vec_stl (vp,  0, (vector pixel *)svp);
+  vec_stl (vui, 0, (vector unsigned int *)svui);
+  vec_stl (vsi, 0, (vector signed int *)svsi);
+  vec_stl (vbi, 0, (vector bool int *)svbi);
+  vec_stl (vf,  0, (vector float *)svf);
+
+  check_arrays ();
+}
Index: gcc-4_8-test/gcc/testsuite/gcc.dg/vmx/sum2s-be-order.c
===================================================================
--- /dev/null
+++ gcc-4_8-test/gcc/testsuite/gcc.dg/vmx/sum2s-be-order.c
@@ -0,0 +1,19 @@ 
+/* { dg-options "-maltivec=be -mabi=altivec -std=gnu99 -mno-vsx" } */
+
+#include "harness.h"
+
+static void test()
+{
+  vector signed int vsia = {-10,1,2,3};
+  vector signed int vsib = {100,101,102,-103};
+  vector signed int vsir;
+#if __BYTE_ORDER__ == __ORDER_LITTLE_ENDIAN__
+  vector signed int vsier = {91,0,107,0};
+#else
+  vector signed int vsier = {0,92,0,-98};
+#endif
+
+  vsir = vec_sum2s (vsia, vsib);
+
+  check (vec_all_eq (vsir, vsier), "vsir");
+}
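
The expected values follow from vsum2sws, which sums a pair of words from the
first operand into each odd word (big-endian numbering) of the result, adding
the matching word of the second operand and zeroing the even words.  On big
endian (and in sum2s.c below): r[1] = -10 + 1 + 101 = 92 and
r[3] = 2 + 3 - 103 = -98.  With -maltivec=be on little endian the unswapped
inputs are renumbered, giving 3 + 2 + 102 = 107 and 1 - 10 + 100 = 91, i.e.
{91,0,107,0} in storage order.  A reference computation (hypothetical helper;
vsum2sws saturates, but these inputs never reach saturation):

    /* Reference for vec_sum2s in big-endian element numbering,
       saturation omitted.  */
    static void
    ref_sum2s (const int a[4], const int b[4], int r[4])
    {
      r[0] = 0;
      r[1] = a[0] + a[1] + b[1];
      r[2] = 0;
      r[3] = a[2] + a[3] + b[3];
    }
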
Index: gcc-4_8-test/gcc/testsuite/gcc.dg/vmx/sum2s.c
===================================================================
--- /dev/null
+++ gcc-4_8-test/gcc/testsuite/gcc.dg/vmx/sum2s.c
@@ -0,0 +1,13 @@ 
+#include "harness.h"
+
+static void test()
+{
+  vector signed int vsia = {-10,1,2,3};
+  vector signed int vsib = {100,101,102,-103};
+  vector signed int vsir;
+  vector signed int vsier = {0,92,0,-98};
+
+  vsir = vec_sum2s (vsia, vsib);
+
+  check (vec_all_eq (vsir, vsier), "vsir");
+}
Index: gcc-4_8-test/gcc/testsuite/gcc.dg/vmx/unpack-be-order.c
===================================================================
--- /dev/null
+++ gcc-4_8-test/gcc/testsuite/gcc.dg/vmx/unpack-be-order.c
@@ -0,0 +1,88 @@ 
+/* { dg-options "-maltivec=be -mabi=altivec -std=gnu99 -mno-vsx" } */
+
+#include "harness.h"
+
+#define BIG 4294967295
+
+static void test()
+{
+  /* Input vectors.  */
+  vector signed char vsc = {-8,-7,-6,-5,-4,-3,-2,-1,0,1,2,3,4,5,6,7};
+  vector bool char vbc = {0,255,255,0,0,0,255,0,255,0,0,255,255,255,0,255};
+  vector pixel vp = {(0<<15) + (1<<10)  + (2<<5)  + 3,
+		     (1<<15) + (4<<10)  + (5<<5)  + 6,
+		     (0<<15) + (7<<10)  + (8<<5)  + 9,
+		     (1<<15) + (10<<10) + (11<<5) + 12,
+		     (1<<15) + (13<<10) + (14<<5) + 15,
+		     (0<<15) + (16<<10) + (17<<5) + 18,
+		     (1<<15) + (19<<10) + (20<<5) + 21,
+		     (0<<15) + (22<<10) + (23<<5) + 24};
+  vector signed short vss = {-4,-3,-2,-1,0,1,2,3};
+  vector bool short vbs = {0,65535,65535,0,0,0,65535,0};
+
+  /* Result vectors.  */
+  vector signed short vsch, vscl;
+  vector bool short vbsh, vbsl;
+  vector unsigned int vuih, vuil;
+  vector signed int vsih, vsil;
+  vector bool int vbih, vbil;
+
+  /* Expected result vectors.  */
+#if __BYTE_ORDER__ == __ORDER_LITTLE_ENDIAN__
+  vector signed short vschr = {0,1,2,3,4,5,6,7};
+  vector signed short vsclr = {-8,-7,-6,-5,-4,-3,-2,-1};
+  vector bool short vbshr = {65535,0,0,65535,65535,65535,0,65535};
+  vector bool short vbslr = {0,65535,65535,0,0,0,65535,0};
+  vector unsigned int vuihr = {(65535<<24) + (13<<16) + (14<<8) + 15,
+			       (0<<24)     + (16<<16) + (17<<8) + 18,
+			       (65535<<24) + (19<<16) + (20<<8) + 21,
+			       (0<<24)     + (22<<16) + (23<<8) + 24};
+  vector unsigned int vuilr = {(0<<24)     + (1<<16)  + (2<<8)  + 3,
+			       (65535<<24) + (4<<16)  + (5<<8)  + 6,
+			       (0<<24)     + (7<<16)  + (8<<8)  + 9,
+			       (65535<<24) + (10<<16) + (11<<8) + 12};
+  vector signed int vsihr = {0,1,2,3};
+  vector signed int vsilr = {-4,-3,-2,-1};
+  vector bool int vbihr = {0,0,BIG,0};
+  vector bool int vbilr = {0,BIG,BIG,0};
+#else
+  vector signed short vschr = {-8,-7,-6,-5,-4,-3,-2,-1};
+  vector signed short vsclr = {0,1,2,3,4,5,6,7};
+  vector bool short vbshr = {0,65535,65535,0,0,0,65535,0};
+  vector bool short vbslr = {65535,0,0,65535,65535,65535,0,65535};
+  vector unsigned int vuihr = {(0<<24)     + (1<<16)  + (2<<8)  + 3,
+			       (65535<<24) + (4<<16)  + (5<<8)  + 6,
+			       (0<<24)     + (7<<16)  + (8<<8)  + 9,
+			       (65535<<24) + (10<<16) + (11<<8) + 12};
+  vector unsigned int vuilr = {(65535<<24) + (13<<16) + (14<<8) + 15,
+			       (0<<24)     + (16<<16) + (17<<8) + 18,
+			       (65535<<24) + (19<<16) + (20<<8) + 21,
+			       (0<<24)     + (22<<16) + (23<<8) + 24};
+  vector signed int vsihr = {-4,-3,-2,-1};
+  vector signed int vsilr = {0,1,2,3};
+  vector bool int vbihr = {0,BIG,BIG,0};
+  vector bool int vbilr = {0,0,BIG,0};
+#endif
+
+  vsch = vec_unpackh (vsc);
+  vscl = vec_unpackl (vsc);
+  vbsh = vec_unpackh (vbc);
+  vbsl = vec_unpackl (vbc);
+  vuih = vec_unpackh (vp);
+  vuil = vec_unpackl (vp);
+  vsih = vec_unpackh (vss);
+  vsil = vec_unpackl (vss);
+  vbih = vec_unpackh (vbs);
+  vbil = vec_unpackl (vbs);
+
+  check (vec_all_eq (vsch, vschr), "vsch");
+  check (vec_all_eq (vscl, vsclr), "vscl");
+  check (vec_all_eq (vbsh, vbshr), "vbsh");
+  check (vec_all_eq (vbsl, vbslr), "vbsl");
+  check (vec_all_eq (vuih, vuihr), "vuih");
+  check (vec_all_eq (vuil, vuilr), "vuil");
+  check (vec_all_eq (vsih, vsihr), "vsih");
+  check (vec_all_eq (vsil, vsilr), "vsil");
+  check (vec_all_eq (vbih, vbihr), "vbih");
+  check (vec_all_eq (vbil, vbilr), "vbil");
+}
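
Only the expected vectors are conditional here: the inputs are unchanged, and
under -maltivec=be vec_unpackh/vec_unpackl simply select the big-endian high
or low half of them.  For the pixel cases, vupkhpx/vupklpx expand each 1:5:5:5
pixel into four 8-bit channels, sign-extending the 1-bit field to 0x00 or 0xff
and zero-extending the 5-bit fields; the expected values spell that top byte
as 65535<<24, which wraps to the same 0xff000000 bit pattern.  A reference
expansion of one pixel (hypothetical helper name):

    #include <stdint.h>

    /* Expand one 1:5:5:5 pixel to an 8:8:8:8 word: the 1-bit field
       sign-extends to 0x00 or 0xff, the 5-bit fields zero-extend.  */
    static uint32_t
    unpack_pixel (uint16_t p)
    {
      uint32_t s = (p & 0x8000) ? 0xffu : 0x00u;
      return (s << 24) | (((p >> 10) & 0x1fu) << 16)
             | (((p >> 5) & 0x1fu) << 8) | (p & 0x1fu);
    }
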
Index: gcc-4_8-test/gcc/testsuite/gcc.dg/vmx/unpack.c
===================================================================
--- /dev/null
+++ gcc-4_8-test/gcc/testsuite/gcc.dg/vmx/unpack.c
@@ -0,0 +1,67 @@ 
+#include "harness.h"
+
+#define BIG 4294967295
+
+static void test()
+{
+  /* Input vectors.  */
+  vector signed char vsc = {-8,-7,-6,-5,-4,-3,-2,-1,0,1,2,3,4,5,6,7};
+  vector bool char vbc = {0,255,255,0,0,0,255,0,255,0,0,255,255,255,0,255};
+  vector pixel vp = {(0<<15) + (1<<10)  + (2<<5)  + 3,
+		     (1<<15) + (4<<10)  + (5<<5)  + 6,
+		     (0<<15) + (7<<10)  + (8<<5)  + 9,
+		     (1<<15) + (10<<10) + (11<<5) + 12,
+		     (1<<15) + (13<<10) + (14<<5) + 15,
+		     (0<<15) + (16<<10) + (17<<5) + 18,
+		     (1<<15) + (19<<10) + (20<<5) + 21,
+		     (0<<15) + (22<<10) + (23<<5) + 24};
+  vector signed short vss = {-4,-3,-2,-1,0,1,2,3};
+  vector bool short vbs = {0,65535,65535,0,0,0,65535,0};
+
+  /* Result vectors.  */
+  vector signed short vsch, vscl;
+  vector bool short vbsh, vbsl;
+  vector unsigned int vuih, vuil;
+  vector signed int vsih, vsil;
+  vector bool int vbih, vbil;
+
+  /* Expected result vectors.  */
+  vector signed short vschr = {-8,-7,-6,-5,-4,-3,-2,-1};
+  vector signed short vsclr = {0,1,2,3,4,5,6,7};
+  vector bool short vbshr = {0,65535,65535,0,0,0,65535,0};
+  vector bool short vbslr = {65535,0,0,65535,65535,65535,0,65535};
+  vector unsigned int vuihr = {(0<<24)     + (1<<16)  + (2<<8)  + 3,
+			       (65535<<24) + (4<<16)  + (5<<8)  + 6,
+			       (0<<24)     + (7<<16)  + (8<<8)  + 9,
+			       (65535<<24) + (10<<16) + (11<<8) + 12};
+  vector unsigned int vuilr = {(65535<<24) + (13<<16) + (14<<8) + 15,
+			       (0<<24)     + (16<<16) + (17<<8) + 18,
+			       (65535<<24) + (19<<16) + (20<<8) + 21,
+			       (0<<24)     + (22<<16) + (23<<8) + 24};
+  vector signed int vsihr = {-4,-3,-2,-1};
+  vector signed int vsilr = {0,1,2,3};
+  vector bool int vbihr = {0,BIG,BIG,0};
+  vector bool int vbilr = {0,0,BIG,0};
+
+  vsch = vec_unpackh (vsc);
+  vscl = vec_unpackl (vsc);
+  vbsh = vec_unpackh (vbc);
+  vbsl = vec_unpackl (vbc);
+  vuih = vec_unpackh (vp);
+  vuil = vec_unpackl (vp);
+  vsih = vec_unpackh (vss);
+  vsil = vec_unpackl (vss);
+  vbih = vec_unpackh (vbs);
+  vbil = vec_unpackl (vbs);
+
+  check (vec_all_eq (vsch, vschr), "vsch");
+  check (vec_all_eq (vscl, vsclr), "vscl");
+  check (vec_all_eq (vbsh, vbshr), "vbsh");
+  check (vec_all_eq (vbsl, vbslr), "vbsl");
+  check (vec_all_eq (vuih, vuihr), "vuih");
+  check (vec_all_eq (vuil, vuilr), "vuil");
+  check (vec_all_eq (vsih, vsihr), "vsih");
+  check (vec_all_eq (vsil, vsilr), "vsil");
+  check (vec_all_eq (vbih, vbihr), "vbih");
+  check (vec_all_eq (vbil, vbilr), "vbil");
+}
Index: gcc-4_8-test/gcc/testsuite/gcc.dg/vmx/vsums-be-order.c
===================================================================
--- /dev/null
+++ gcc-4_8-test/gcc/testsuite/gcc.dg/vmx/vsums-be-order.c
@@ -0,0 +1,20 @@ 
+/* { dg-options "-maltivec=be -mabi=altivec -std=gnu99 -mno-vsx" } */
+
+#include "harness.h"
+
+static void test()
+{
+  vector signed int va = {-7,11,-13,17};
+
+#if __BYTE_ORDER__ == __ORDER_LITTLE_ENDIAN__
+  vector signed int vb = {128,0,0,0};
+  vector signed int evd = {136,0,0,0};
+#else
+  vector signed int vb = {0,0,0,128};
+  vector signed int evd = {0,0,0,136};
+#endif
+
+  vector signed int vd = vec_sums (va, vb);
+
+  check (vec_all_eq (vd, evd), "sums");
+}
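
vsumsws adds all four words of the first operand to word 3 (big-endian
numbering) of the second and places the saturated sum in word 3 of the result,
zeroing the rest: -7 + 11 - 13 + 17 = 8, and 8 + 128 = 136.  Big-endian word 3
is storage word 0 on little endian, hence the swapped vb and evd above;
vsums.c below runs the unswapped big-endian form.  A reference computation
(hypothetical helper; saturation omitted, as these inputs never reach it):

    /* Reference for vec_sums in big-endian element numbering.  */
    static int
    ref_sums (const int a[4], int b3)
    {
      return a[0] + a[1] + a[2] + a[3] + b3;
    }
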
Index: gcc-4_8-test/gcc/testsuite/gcc.dg/vmx/vsums.c
===================================================================
--- /dev/null
+++ gcc-4_8-test/gcc/testsuite/gcc.dg/vmx/vsums.c
@@ -0,0 +1,12 @@ 
+#include "harness.h"
+
+static void test()
+{
+  vector signed int va = {-7,11,-13,17};
+  vector signed int vb = {0,0,0,128};
+  vector signed int evd = {0,0,0,136};
+
+  vector signed int vd = vec_sums (va, vb);
+
+  check (vec_all_eq (vd, evd), "sums");
+}
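
The vector.md hunk below rewrites the last step of an expander that combines
two ge comparisons, replacing not-of-ior with the De Morgan-equivalent
and-of-complements; both RTL forms compute the same bit pattern, so only the
shape of the pattern changes, not the result.  A one-line check of the
identity:

    #include <assert.h>

    /* De Morgan: ~(x | y) and ~x & ~y agree for all bit patterns.  */
    static void
    check_demorgan (unsigned int x, unsigned int y)
    {
      assert (~(x | y) == (~x & ~y));
    }
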
Index: gcc-4_8-test/gcc/config/rs6000/vector.md
===================================================================
--- gcc-4_8-test.orig/gcc/config/rs6000/vector.md
+++ gcc-4_8-test/gcc/config/rs6000/vector.md
@@ -608,8 +608,8 @@ 
 	(ge:VEC_F (match_dup 2)
 		  (match_dup 1)))
    (set (match_dup 0)
-	(not:VEC_F (ior:VEC_F (match_dup 3)
-			      (match_dup 4))))]
+        (and:VEC_F (not:VEC_F (match_dup 3))
+                   (not:VEC_F (match_dup 4))))]
   "
 {
   operands[3] = gen_reg_rtx (<MODE>mode);