
[4 of 5] Add support for PowerPC IEEE 128-bit floating point

Message ID 20140715184042.GD3263@ibm-tiger.the-meissners.org
State New

Commit Message

Michael Meissner July 15, 2014, 6:40 p.m. UTC
These are the PowerPC-specific patches to GCC that enable IEEE 128-bit
floating point.
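
As a reading aid for the option handling, here is a minimal standalone
sketch of how -mfloat128-vsx and -mfloat128-ref are reconciled.  It models
the logic this patch adds to rs6000_option_override_internal.  The MASK_*
names, their values, and the resolve_float128 helper are hypothetical
stand-ins for illustration, not GCC's OPTION_MASK_* definitions, and errors
are counted here rather than reported.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the ISA flag bits; the values are illustrative
   and are not taken from GCC.  */
#define MASK_VSX          1u
#define MASK_FLOAT128_VSX 2u
#define MASK_FLOAT128_REF 4u

/* Reconcile the two float128 switches.  FLAGS holds the options currently in
   effect, EXPLICIT_FLAGS the options the user spelled out on the command
   line.  Returns the number of errors the compiler would have reported.  */
static int
resolve_float128 (unsigned *flags, unsigned explicit_flags)
{
  int errors = 0;
  bool have_vsx = (*flags & MASK_VSX) != 0;

  if ((*flags & MASK_FLOAT128_REF) != 0 && (*flags & MASK_FLOAT128_VSX) != 0)
    {
      bool exp_vsx = (explicit_flags & MASK_FLOAT128_VSX) != 0;
      bool exp_ref = (explicit_flags & MASK_FLOAT128_REF) != 0;

      if (exp_vsx && !exp_ref)
        *flags &= ~MASK_FLOAT128_REF;   /* the explicit switch wins */
      else if (!exp_vsx && exp_ref)
        *flags &= ~MASK_FLOAT128_VSX;   /* the explicit switch wins */
      else
        {
          /* Both explicit is an error; both implicit is settled quietly.  */
          if (exp_vsx || exp_ref)
            errors++;
          *flags &= have_vsx ? ~MASK_FLOAT128_REF : ~MASK_FLOAT128_VSX;
        }
    }

  /* The VSX flavor cannot be used without VSX itself.  */
  if ((*flags & MASK_FLOAT128_VSX) != 0 && !have_vsx)
    {
      if ((explicit_flags & MASK_FLOAT128_VSX) != 0)
        errors++;
      *flags &= ~MASK_FLOAT128_VSX;
    }

  return errors;
}
```

In this model an explicit switch overrides one that was only enabled by
default, and when neither was explicit the VSX flavor is kept if VSX is
available.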

2014-07-15  Michael Meissner  <meissner@linux.vnet.ibm.com>

	* doc/invoke.texi (-mfloat128-vsx): Document new PowerPC switches
	to enable IEEE 128-bit floating point.
	(-mfloat128-ref): Likewise.
	* doc/extend.texi (Floating Types): Document use of __float128 on
	PowerPC systems.

	* config/rs6000/rs6000-protos.h (rs6000_expand_float128_convert):
	Add declaration.

	* config/rs6000/rs6000.c (TARGET_C_MODE_FOR_SUFFIX): Add support
	for using 'q' or 'Q' as the suffix for IEEE 128-bit floating
	point.
	(rs6000_c_mode_for_suffix): Likewise.
	(scalar_float_not_ieee128_p): Helper function to return true if
	normal scalar floating point, but not IEEE 128-bit floating
	point.
	(rs6000_hard_regno_nregs_internal): Add support for IEEE 128-bit
	floating point that can occupy a single vector register, instead
	of 2 scalar registers.
	(rs6000_debug_reg_global): Add debugging for IEEE 128-bit floating
	point support.
	(rs6000_init_hard_regno_mode_ok): Set up tables when IEEE 128-bit
	floating point can occupy a single vector register to use vector
	addressing.  Add reload helper functions.
	(rs6000_option_override_internal): Add support for -mfloat128-vsx
	and -mfloat128-ref options.
	(invalid_e500_subreg): Skip IEEE 128-bit floating point from being
	a subreg, like other floating point types.
	(reg_offset_addressing_ok_p): Add support for IEEE 128-bit
	floating point types going in a vector register.
	(rs6000_legitimate_offset_address_p): Likewise.
	(rs6000_legitimize_address): Likewise.
	(rs6000_legitimize_reload_address): Likewise.
	(rs6000_legitimate_address_p): Likewise.
	(rs6000_emit_le_vsx_load): On little endian VSX systems, make sure
	IEEE 128-bit floating point types are properly swapped.
	(rs6000_emit_le_vsx_store): Likewise.
	(rs6000_emit_move): Update moving IBM 128-bit floating point
	constants to use new macro framework.
	(force_const_mem): Handle IEEE 128-bit floating point.
	(rs6000_member_type_forces_blk): Likewise.
	(rs6000_discover_homogeneous_aggregate): Likewise.
	(rs6000_return_in_memory): If -mfloat128-vsx, return IEEE
	128-bit floating point in vector registers, otherwise caller must
	pass an address to store the result.
	(init_cumulative_args): Record whether the function is a library
	function or not.  IEEE 128-bit floating point is not passed
	like normal scalar floating point.
	(function_arg_boundary): Add support for IEEE 128-bit floating
	point, passing/returning the values either in vector registers, or
	by passing a 128-bit space by reference.  IEEE 128-bit library
	functions are assumed to have a prototype, so the arguments are
	not needed in the parameter save area.
	(rs6000_function_arg_advance_1): Likewise.
	(rs6000_function_arg): Likewise.
	(rs6000_arg_partial_bytes): Likewise.
	(rs6000_pass_by_reference): Likewise.
	(rs6000_gimplify_va_arg): Likewise.
	(rs6000_init_builtins): Initialize IEEE 128-bit floating point
	type.  Add support for __float128 keyword.
	(rs6000_init_funcs): Split this into 2 sub-functions, one to set up
	for IEEE 128-bit floating point, and the other to set up for IBM
	floating point.
	(init_float128_ibm): Likewise.
	(init_float128_ieee): Likewise.
	(rs6000_cannot_change_mode_class): Add support for IEEE 128-bit
	floating point.
	(rs6000_output_move_128bit): If an error is reached, use
	fatal_insn to print the SET insn all of the time, instead of when
	-mdebug=addr is used.
	(print_operand): Update %y to use IEEE 128-bit floating point
	macros.
	(rs6000_generate_compare): Add support for IEEE 128-bit floating
	point comparisons.
	(rs6000_expand_float128_convert): New helper function called from
	rtl expanders to generate the appropriate conversion to/from IEEE
	128-bit floating point.
	(rs6000_split_multireg_move): Use IEEE 128-bit floating point
	infrastructure macros.
	(spe_func_has_64bit_regs_p): Likewise.
	(rs6000_output_function_epilogue): Add IEEE 128-bit floating point
	support.
	(output_toc): Likewise.
	(rs6000_mangle_type): Likewise.
	(rs6000_register_move_cost): Likewise.
	(rs6000_function_value): Likewise.
	(rs6000_libcall_value): Likewise.
	(rs6000_scalar_mode_supported_p): Enable KFmode if -mfloat128-vsx
	or -mfloat128-ref.
	(rs6000_vector_mode_supported_p): There is no vector form of IEEE
	128-bit floating point.
	(rs6000_opt_masks): Add support for -mfloat128-vsx and
	-mfloat128-ref.

	* config/rs6000/vector.md (VEC_L): Add KFmode (IEEE 128-bit
	floating point) to vector mode iterators.
	(VEC_M): Likewise.
	(VEC_N): Likewise.
	(VEC_R): Likewise.
	(mov<mode>, VEC_M iterator): Add support for IEEE 128-bit floating
	point constants, calling easy_fp_convert instead of
	easy_vector_convert.

	* config/rs6000/predicates.md (int_reg_operand_not_pseudo): New
	predicate for splitters to only split 128-bit types when the value
	is in a hard GPR register.
	(easy_fp_constant): Add IEEE 128-bit floating point support.
	(easy_vector_constant): Call easy_fp_constant for scalar IEEE
	128-bit floating point.

	* config/rs6000/rs6000-modes.def (KFmode): New type for IEEE
	128-bit floating point support.

	* config/rs6000/rs6000-cpus.def (ISA_2_6_MASKS_SERVER): Set
	-mfloat128-vsx on by default.
	(POWERPC_MASKS): Add -mfloat128-vsx and -mfloat128-ref masks.
	(power7 cpu): Set -mfloat128-vsx on by default.

	* config/rs6000/rs6000-c.c (rs6000_target_modify_macros): If
	-mfloat128-vsx or -mfloat128-ref, define the appropriate macros.
	(rs6000_cpu_cpp_builtins): Define macros to identify the long
	double format.

	* config/rs6000/rs6000.opt (-mfloat128-vsx): New switches to
	enable/disable IEEE 128-bit floating point.
	(-mfloat128-ref): Likewise.

	* config/rs6000/vsx.md (VSX_L): Add KFmode/TFmode, based on the
	float128 switches.
	(VSX_M): Likewise.
	(VSX_M2): Likewise.
	(VSX_F128): New iterator for 128-bit scalar types that use vector
	registers.
	(VSm): Add IEEE 128-bit support to mode attributes.
	(VSs): Likewise.
	(VSr): Likewise.
	(VSv): Likewise.
	(vsx_le_perm_load_<mode>, VSX_F128 iterator): New insn for 128-bit
	scalars, casting the mode to V2DImode so vec_select can be used to
	create the xxpermdi instruction.
	(vsx_le_perm_store_<mode>, VSX_F128 iterator): Likewise.
	(IEEE 128-bit splitters): Add splitters for IEEE 128-bit floating
	point in vector registers.

	* config/rs6000/rs6000.h (TARGET_FLOAT128): New macro, true if
	either -mfloat128-vsx or -mfloat128-ref.
	(IEEE_128BIT_P): New macros to identify IEEE and IBM 128-bit
	floating point modes.
	(IBM_128BIT_P): Likewise.
	(FLOAT128_VECTOR_P): New macros to identify 128-bit floating point
	types that either use a single vector register, or a pair of
	scalar floating point registers.
	(FLOAT128_2REG_P): Likewise.
	(MASK_FLOAT128_VSX): Shorter IEEE 128-bit floating point option
	masks.
	(MASK_FLOAT128_REF): Likewise.
	(SLOW_UNALIGNED_ACCESS): Add support for IEEE 128-bit floating
	point modes.
	(HARD_REGNO_CALLER_SAVE_MODE): Likewise.
	(HARD_REGNO_CALL_PART_CLOBBERED): Likewise.
	(VSX_VECTOR_MODE): Likewise.
	(ALTIVEC_VECTOR_MODE): Likewise.
	(MODES_TIEABLE_P): Move tests for vector modes above scalar
	floating point modes, so that IEEE 128-bit floating point that
	goes in a VSX register only ties with other vector types.
	(struct rs6000_args): Add libcall field.
	(enum rs6000_builtin_type_index): Add IEEE 128-bit floating
	point.
	(ieee128_float_type_node): Likewise.

	* config/rs6000/altivec.md (VM): Add KFmode/TFmode to vector mode
	iterators.
	(VM2): Likewise.
	(altivec_high_bit): New function to instantiate vector register
	with high bit set (i.e. -0.0 for IEEE 128-bit floating point).

	* config/rs6000/rs6000.md (FP): Add IEEE 128-bit floating point
	support.
	(FMOVE128): Likewise.
	(FMOVE128_FPR): New mode iterator for 128-bit types that take 2
	floating point registers.
	(FMOVE128_GPR): Add KFmode.
	(FMOVE128_VSX): New iterator for scalar types in VSX registers.
	(FLOAT128_SFDFTF): New mode iterator for IEEE 128-bit floating
	point conversions.
	(TFKF): New mode iterator for 128-bit scalar floating point
	types.
	(mov<mode>_64bit_dm): Add support for IEEE 128-bit floating
	point.
	(mov<mode>_32bit): Likewise.
	(mov<mode>_softfloat): Likewise.
	(extenddftf2_internal): Add support if long double is IEEE 128-bit
	floating point.
	(trunctfdf2): Likewise.
	(trunctfdf2_internal1): Likewise.
	(fix_trunctfsi2): Likewise.
	(fix_trunctfdi2): Likewise.
	(funcs_trunctf<mode>2): Likewise.
	(floatditf2): Likewise.
	(floatuns<mode>tf2): Likewise.
	(negtf2): Likewise.
	(negtf2_internal): Likewise.
	(abstf2): Likewise.
	(abs<mode>2, TFKF iterator): Likewise.
	(ieee_128bit_vsx_neg<mode>2): New insns for IEEE 128-bit floating
	point support.
	(ieee_128bit_vsx_neg<mode>2_internal): Likewise.
	(ieee_128bit_vsx_abs<mode>2): Likewise.
	(ieee_128bit_vsx_abs<mode>2_internal): Likewise.
	(ieee_128bit_vsx_nabs<mode>2): Likewise.
	(ieee_128bit_vsx_nabs<mode>2_internal): Likewise.
	(extend<mode>kf2): Likewise.
	(trunckf<mode>2): Likewise.
	(fix_trunckf<mode>2): Likewise.
	(fixuns_trunckf<mode>2): Likewise.
	(float<mode>kf2): Likewise.
	(floatuns<mode>kf2): Likewise.
	(unpack<mode>, FP128_64 iterator): Limit unpacking to when the
	mode takes two scalar registers.
	(unpack<mode>_dm, FP128_64 iterator): Likewise.
	(pack<mode>, FP128_64 iterator): Likewise.
	(unpackv1ti): Delete V1TImode pack/unpack, replace with modes that
	handle V1TImode, KFmode, and TFmode.
	(unpack<mode>): Likewise.
	(packv1ti): Likewise.
	(pack<mode>): Likewise.
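
The calling-convention entries above (rs6000_return_in_memory,
rs6000_pass_by_reference) can be summarized with a small standalone model.
This is only a sketch: the enum, the struct, and both helper names are
hypothetical, and the definition of ieee_128bit_p is an assumption based on
the comment in rs6000_init_builtins (KFmode is always IEEE 128-bit, TFmode
is IEEE 128-bit only when TARGET_IEEEQUAD is set).  The real IEEE_128BIT_P
macro lives in rs6000.h and may test more than this.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the machine modes discussed in the patch.  */
enum fp_mode { MODE_DF, MODE_TF, MODE_KF, MODE_TD };

/* Hypothetical snapshot of the target switches that matter here.  */
struct target
{
  bool ieeequad;      /* models TARGET_IEEEQUAD */
  bool float128_ref;  /* models -mfloat128-ref */
  bool abi_v4;        /* models DEFAULT_ABI == ABI_V4 */
};

/* Assumed definition: KFmode is always IEEE 128-bit; TFmode only when long
   double uses the IEEE quad format.  */
static bool
ieee_128bit_p (enum fp_mode mode, const struct target *t)
{
  return mode == MODE_KF || (mode == MODE_TF && t->ieeequad);
}

/* Mirrors the condition the patch uses in rs6000_pass_by_reference and
   rs6000_return_in_memory: IEEE 128-bit values go through memory when
   -mfloat128-ref is in effect or the ABI is V4.  */
static bool
ieee128_by_reference_p (enum fp_mode mode, const struct target *t)
{
  return ieee_128bit_p (mode, t) && (t->float128_ref || t->abi_v4);
}
```

When this predicate is false for an IEEE 128-bit mode, the ChangeLog text
says the value is passed and returned in a single vector register
(-mfloat128-vsx).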

Patch

Index: gcc/doc/invoke.texi
===================================================================
--- gcc/doc/invoke.texi	(.../svn+ssh://meissner@gcc.gnu.org/svn/gcc/trunk)	(revision 212529)
+++ gcc/doc/invoke.texi	(working copy)
@@ -930,7 +930,8 @@  See RS/6000 and PowerPC Options.
 -mcrypto -mno-crypto -mdirect-move -mno-direct-move @gol
 -mquad-memory -mno-quad-memory @gol
 -mquad-memory-atomic -mno-quad-memory-atomic @gol
--mcompat-align-parm -mno-compat-align-parm}
+-mcompat-align-parm -mno-compat-align-parm @gol
+-mfloat128-vsx -mno-float128-vsx -mfloat128-ref -mno-float128-ref}
 
 @emph{RX Options}
 @gccoptlist{-m64bit-doubles  -m32bit-doubles  -fpu  -nofpu@gol
@@ -20114,6 +20115,26 @@  that is compatible with functions compil
 GCC.
 
 The @option{-mno-compat-align-parm} option is the default.
+
+@item -mfloat128-vsx
+@itemx -mno-float128-vsx
+@opindex mfloat128-vsx
+Enable (do not enable) the type @code{__float128} which is the IEEE
+128-bit floating point type.  The @code{__float128} type is passed and
+returned in a single register, and it requires the VSX instruction set
+extension.  The @option{-mfloat128-vsx} option is enabled by default
+if @option{-mvsx}, @option{-mcpu=power7}, or @option{-mcpu=power8}
+options are used.  You cannot use @option{-mfloat128-vsx} and
+@option{-mfloat128-ref} at the same time.
+
+@item -mfloat128-ref
+@itemx -mno-float128-ref
+@opindex mfloat128-ref
+Enable (do not enable) the type @code{__float128} which is the IEEE
+128-bit floating point type.  The @code{__float128} type is passed and
+returned by reference on the stack.  This option is not enabled by
+default.  You cannot use @option{-mfloat128-vsx} and
+@option{-mfloat128-ref} at the same time.
 @end table
 
 @node RX Options
Index: gcc/doc/extend.texi
===================================================================
--- gcc/doc/extend.texi	(.../svn+ssh://meissner@gcc.gnu.org/svn/gcc/trunk)	(revision 212529)
+++ gcc/doc/extend.texi	(working copy)
@@ -957,18 +957,34 @@  add, subtract, multiply, divide; unary a
 relational operators; equality operators; and conversions to and from
 integer and other floating types.  Use a suffix @samp{w} or @samp{W}
 in a literal constant of type @code{__float80} and @samp{q} or @samp{Q}
-for @code{_float128}.  You can declare complex types using the
-corresponding internal complex type, @code{XCmode} for @code{__float80}
-type and @code{TCmode} for @code{__float128} type:
+for @code{__float128}.
+
+On the i386, x86_64, IA-64, and HP-UX targets, you can declare complex
+types using the corresponding internal complex type, @code{XCmode} for
+@code{__float80} type and @code{TCmode} for @code{__float128} type:
 
 @smallexample
 typedef _Complex float __attribute__((mode(TC))) _Complex128;
 typedef _Complex float __attribute__((mode(XC))) _Complex80;
 @end smallexample
 
+On PowerPC Linux, FreeBSD and Darwin systems, the default for
+@code{long double} is to use the IBM extended floating point format
+that uses a pair of @code{double} values to extend the precision.
+This means that the mode @code{TCmode} was already used by the
+traditional IBM long double format, and you would need to use the mode
+@code{KCmode}:
+
+@smallexample
+typedef _Complex float __attribute__((mode(KC))) _Complex128;
+@end smallexample
+
 Not all targets support additional floating-point types.  @code{__float80}
 and @code{__float128} types are supported on i386, x86_64 and IA-64 targets.
-The @code{__float128} type is supported on hppa HP-UX targets.
+The @code{__float128} type is supported on hppa HP-UX.
+The @code{__float128} type is supported on PowerPC systems by default
+if the vector scalar instruction set (VSX) is enabled, or if the
+@option{-mfloat128-ref} option is used.
 
 @node Half-Precision
 @section Half-Precision Floating Point
@@ -13420,6 +13436,8 @@  uint64_t __builtin_ppc_get_timebase ();
 unsigned long __builtin_ppc_mftb ();
 double __builtin_unpack_longdouble (long double, int);
 long double __builtin_pack_longdouble (double, double);
+double __builtin_unpack_ibm128 (__ibm128, int);
+__ibm128 __builtin_pack_ibm128 (double, double);
 @end smallexample
 
 The @code{vec_rsqrt}, @code{__builtin_rsqrt}, and
Index: gcc/config/rs6000/rs6000-protos.h
===================================================================
--- gcc/config/rs6000/rs6000-protos.h	(.../svn+ssh://meissner@gcc.gnu.org/svn/gcc/trunk)	(revision 212529)
+++ gcc/config/rs6000/rs6000-protos.h	(working copy)
@@ -53,6 +53,7 @@  extern const char *output_vec_const_move
 extern const char *rs6000_output_move_128bit (rtx *);
 extern bool rs6000_move_128bit_ok_p (rtx []);
 extern bool rs6000_split_128bit_ok_p (rtx []);
+extern void rs6000_expand_float128_convert (rtx, rtx, bool);
 extern void rs6000_expand_vector_init (rtx, rtx);
 extern void paired_expand_vector_init (rtx, rtx);
 extern void rs6000_expand_vector_set (rtx, rtx, int);
Index: gcc/config/rs6000/rs6000.c
===================================================================
--- gcc/config/rs6000/rs6000.c	(.../svn+ssh://meissner@gcc.gnu.org/svn/gcc/trunk)	(revision 212529)
+++ gcc/config/rs6000/rs6000.c	(working copy)
@@ -1622,6 +1622,9 @@  static const struct attribute_spec rs600
 
 #undef TARGET_CAN_USE_DOLOOP_P
 #define TARGET_CAN_USE_DOLOOP_P can_use_doloop_if_innermost
+
+#undef TARGET_C_MODE_FOR_SUFFIX
+#define TARGET_C_MODE_FOR_SUFFIX rs6000_c_mode_for_suffix
 
 
 /* Processor table.  */
@@ -1658,6 +1661,22 @@  rs6000_cpu_name_lookup (const char *name
 }
 
 
+/* Helper function to separate IEEE 128-bit floating point from other scalar
+   float modes, since IEEE 128-bit is either passed by reference (V4) or in a
+   vector register (VSX).  */
+
+static inline bool
+scalar_float_not_ieee128_p (enum machine_mode mode)
+{
+  if (!SCALAR_FLOAT_MODE_P (mode))
+    return false;
+
+  if (IEEE_128BIT_P (mode))
+    return false;
+
+  return true;
+}
+
 /* Return number of consecutive hard regs needed starting at reg REGNO
    to hold something of mode MODE.
    This is ordinarily the length in words of a value of mode MODE
@@ -1675,9 +1694,10 @@  rs6000_hard_regno_nregs_internal (int re
 {
   unsigned HOST_WIDE_INT reg_size;
 
-  /* TF/TD modes are special in that they always take 2 registers.  */
+  /* 128-bit floating point usually takes 2 registers, unless it is IEEE
+     128-bit floating point that can go in vector registers.  */
   if (FP_REGNO_P (regno))
-    reg_size = ((VECTOR_MEM_VSX_P (mode) && mode != TDmode && mode != TFmode)
+    reg_size = ((VECTOR_MEM_VSX_P (mode) && !FLOAT128_2REG_P (mode))
 		? UNITS_PER_VSX_WORD
 		: UNITS_PER_FP_WORD);
 
@@ -1964,6 +1984,7 @@  rs6000_debug_reg_global (void)
     SFmode,
     DFmode,
     TFmode,
+    KFmode,
     SDmode,
     DDmode,
     TDmode,
@@ -2330,6 +2351,10 @@  rs6000_debug_reg_global (void)
   fprintf (stderr, DEBUG_FMT_D, "Number of rs6000 builtins",
 	   (int)RS6000_BUILTIN_COUNT);
 
+  if (TARGET_FLOAT128)
+    fprintf (stderr, DEBUG_FMT_S, "__float128",
+	     TARGET_FLOAT128_VSX ? "vsx" : "ref");
+
   if (TARGET_VSX)
     fprintf (stderr, DEBUG_FMT_D, "VSX easy 64-bit scalar element",
 	     (int)VECTOR_ELEMENT_SCALAR_64BIT);
@@ -2527,6 +2552,20 @@  rs6000_init_hard_regno_mode_ok (bool glo
       align32 = 128;
     }
 
+  /* KFmode (IEEE 128-bit) where we can pass it as a vector.  We do not have
+     arithmetic, so only set the memory modes.  */
+  if (TARGET_VSX && TARGET_FLOAT128_VSX)
+    {
+      rs6000_vector_mem[KFmode] = VECTOR_VSX;
+      rs6000_vector_align[KFmode] = 128;
+
+      if (TARGET_IEEEQUAD && TARGET_LONG_DOUBLE_128)
+	{
+	  rs6000_vector_mem[TFmode] = VECTOR_VSX;
+	  rs6000_vector_align[TFmode] = 128;
+	}
+    }
+
   /* V2DF mode, VSX only.  */
   if (TARGET_VSX)
     {
@@ -2716,6 +2755,8 @@  rs6000_init_hard_regno_mode_ok (bool glo
 	  reg_addr[V4SFmode].reload_load   = CODE_FOR_reload_v4sf_di_load;
 	  reg_addr[V2DFmode].reload_store  = CODE_FOR_reload_v2df_di_store;
 	  reg_addr[V2DFmode].reload_load   = CODE_FOR_reload_v2df_di_load;
+	  reg_addr[KFmode].reload_store    = CODE_FOR_reload_kf_di_store;
+	  reg_addr[KFmode].reload_load     = CODE_FOR_reload_kf_di_load;
 	  if (TARGET_VSX && TARGET_UPPER_REGS_DF)
 	    {
 	      reg_addr[DFmode].reload_store  = CODE_FOR_reload_df_di_store;
@@ -2783,6 +2824,8 @@  rs6000_init_hard_regno_mode_ok (bool glo
 	  reg_addr[V4SFmode].reload_load   = CODE_FOR_reload_v4sf_si_load;
 	  reg_addr[V2DFmode].reload_store  = CODE_FOR_reload_v2df_si_store;
 	  reg_addr[V2DFmode].reload_load   = CODE_FOR_reload_v2df_si_load;
+	  reg_addr[KFmode].reload_store    = CODE_FOR_reload_kf_si_store;
+	  reg_addr[KFmode].reload_load     = CODE_FOR_reload_kf_si_load;
 	  if (TARGET_VSX && TARGET_UPPER_REGS_DF)
 	    {
 	      reg_addr[DFmode].reload_store  = CODE_FOR_reload_df_si_store;
@@ -2839,9 +2882,9 @@  rs6000_init_hard_regno_mode_ok (bool glo
 	  enum machine_mode m2 = (enum machine_mode)m;
 	  int reg_size2 = reg_size;
 
-	  /* TFmode/TDmode always takes 2 registers, even in VSX.  */
-	  if (TARGET_VSX && VSX_REG_CLASS_P (c)
-	      && (m == TDmode || m == TFmode))
+	  /* TDmode & IBM 128-bit floating point always takes 2 registers, even
+	     in VSX.  */
+	  if (TARGET_VSX && VSX_REG_CLASS_P (c) && FLOAT128_2REG_P (m))
 	    reg_size2 = UNITS_PER_FP_WORD;
 
 	  rs6000_class_max_nregs[m][c]
@@ -3392,6 +3435,35 @@  rs6000_option_override_internal (bool gl
       rs6000_isa_flags &= ~OPTION_MASK_P8_VECTOR;
     }
 
+  if (TARGET_FLOAT128_REF && TARGET_FLOAT128_VSX)
+    {
+      if (((rs6000_isa_flags_explicit & OPTION_MASK_FLOAT128_VSX) != 0)
+	  && ((rs6000_isa_flags_explicit & OPTION_MASK_FLOAT128_REF) == 0))
+	rs6000_isa_flags &= ~OPTION_MASK_FLOAT128_REF;
+
+      else if (((rs6000_isa_flags_explicit & OPTION_MASK_FLOAT128_VSX) == 0)
+	       && ((rs6000_isa_flags_explicit & OPTION_MASK_FLOAT128_REF) != 0))
+	rs6000_isa_flags &= ~OPTION_MASK_FLOAT128_VSX;
+
+      else
+	{
+	  if ((rs6000_isa_flags_explicit
+	       & (OPTION_MASK_FLOAT128_VSX | OPTION_MASK_FLOAT128_REF)) != 0)
+	    error ("-mfloat128-vsx and -mfloat128-ref are incompatible");
+
+	  rs6000_isa_flags &= ((TARGET_VSX)
+			       ? ~OPTION_MASK_FLOAT128_REF
+			       : ~OPTION_MASK_FLOAT128_VSX);
+	}
+    }
+
+  if (TARGET_FLOAT128_VSX && !TARGET_VSX)
+    {
+      if (rs6000_isa_flags_explicit & OPTION_MASK_FLOAT128_VSX)
+	error ("-mfloat128-vsx requires -mvsx");
+      rs6000_isa_flags &= ~OPTION_MASK_FLOAT128_VSX;
+    }
+
   if (TARGET_VSX_TIMODE && !TARGET_VSX)
     {
       if (rs6000_isa_flags_explicit & OPTION_MASK_VSX_TIMODE)
@@ -5812,13 +5884,14 @@  invalid_e500_subreg (rtx op, enum machin
 	      || mode == DDmode || mode == TDmode || mode == PTImode)
 	  && REG_P (SUBREG_REG (op))
 	  && (GET_MODE (SUBREG_REG (op)) == DFmode
-	      || GET_MODE (SUBREG_REG (op)) == TFmode))
+	      || GET_MODE (SUBREG_REG (op)) == TFmode
+	      || GET_MODE (SUBREG_REG (op)) == KFmode))
 	return true;
 
       /* Reject (subreg:DF (reg:DI)); likewise with subreg:TF and
 	 reg:TI.  */
       if (GET_CODE (op) == SUBREG
-	  && (mode == DFmode || mode == TFmode)
+	  && (mode == DFmode || mode == TFmode || mode == KFmode)
 	  && REG_P (SUBREG_REG (op))
 	  && (GET_MODE (SUBREG_REG (op)) == DImode
 	      || GET_MODE (SUBREG_REG (op)) == TImode
@@ -6154,10 +6227,13 @@  reg_offset_addressing_ok_p (enum machine
     case V2DImode:
     case V1TImode:
     case TImode:
+    case TFmode:
+    case KFmode:
       /* AltiVec/VSX vector modes.  Only reg+reg addressing is valid.  While
 	 TImode is not a vector mode, if we want to use the VSX registers to
-	 move it around, we need to restrict ourselves to reg+reg
-	 addressing.  */
+	 move it around, we need to restrict ourselves to reg+reg addressing.
+	 Similarly for IEEE 128-bit floating point that is passed in a single
+	 vector register.  */
       if (VECTOR_MEM_ALTIVEC_OR_VSX_P (mode))
 	return false;
       break;
@@ -6433,6 +6509,7 @@  rs6000_legitimate_offset_address_p (enum
       break;
 
     case TFmode:
+    case KFmode:
       if (TARGET_E500_DOUBLE)
 	return (SPE_CONST_OFFSET_OK (offset)
 		&& SPE_CONST_OFFSET_OK (offset + 8));
@@ -6626,6 +6703,7 @@  rs6000_legitimize_address (rtx x, rtx ol
     case TDmode:
     case TImode:
     case PTImode:
+    case KFmode:
       /* As in legitimate_offset_address_p we do not assume
 	 worst-case.  The mode here is just a hint as to the registers
 	 used.  A TImode is usually in gprs, but may actually be in
@@ -7432,6 +7510,7 @@  rs6000_legitimize_reload_address (rtx x,
 	 mem is sufficiently aligned.  */
       && mode != TFmode
       && mode != TDmode
+      && mode != KFmode
       && (mode != TImode || !TARGET_VSX_TIMODE)
       && mode != PTImode
       && (mode != DImode || TARGET_POWERPC64)
@@ -7585,8 +7664,7 @@  rs6000_legitimate_address_p (enum machin
     return 1;
   if (rs6000_legitimate_offset_address_p (mode, x, reg_ok_strict, false))
     return 1;
-  if (mode != TFmode
-      && mode != TDmode
+  if (!FLOAT128_2REG_P (mode)
       && ((TARGET_HARD_FLOAT && TARGET_FPRS && TARGET_DOUBLE_FLOAT)
 	  || TARGET_POWERPC64
 	  || (mode != DFmode && mode != DDmode)
@@ -8089,9 +8167,9 @@  rs6000_emit_le_vsx_load (rtx dest, rtx s
 {
   rtx tmp, permute_mem, permute_reg;
 
-  /* Use V2DImode to do swaps of types with 128-bit scalare parts (TImode,
-     V1TImode).  */
-  if (mode == TImode || mode == V1TImode)
+  /* Use V2DImode to do swaps of types with 128-bit scalar parts (TImode,
+     V1TImode, IEEE 128-bit floating point that goes in vector registers).  */
+  if (mode == TImode || mode == V1TImode || FLOAT128_VECTOR_P (mode))
     {
       mode = V2DImode;
       dest = gen_lowpart (V2DImode, dest);
@@ -8113,9 +8191,9 @@  rs6000_emit_le_vsx_store (rtx dest, rtx 
 {
   rtx tmp, permute_src, permute_tmp;
 
-  /* Use V2DImode to do swaps of types with 128-bit scalare parts (TImode,
-     V1TImode).  */
-  if (mode == TImode || mode == V1TImode)
+  /* Use V2DImode to do swaps of types with 128-bit scalar parts (TImode,
+     V1TImode, IEEE 128-bit floating point that goes in vector registers).  */
+  if (mode == TImode || mode == V1TImode || FLOAT128_VECTOR_P (mode))
     {
       mode = V2DImode;
       dest = adjust_address (dest, V2DImode, 0);
@@ -8247,8 +8325,7 @@  rs6000_emit_move (rtx dest, rtx source, 
 
   /* 128-bit constant floating-point values on Darwin should really be
      loaded as two parts.  */
-  if (!TARGET_IEEEQUAD && TARGET_LONG_DOUBLE_128
-      && mode == TFmode && GET_CODE (operands[1]) == CONST_DOUBLE)
+  if (IBM_128BIT_P (mode) && GET_CODE (operands[1]) == CONST_DOUBLE)
     {
       rs6000_emit_move (simplify_gen_subreg (DFmode, operands[0], mode, 0),
 			simplify_gen_subreg (DFmode, operands[1], mode, 0),
@@ -8390,6 +8467,11 @@  rs6000_emit_move (rtx dest, rtx source, 
 	operands[1] = force_const_mem (mode, operands[1]);
       break;
 
+    case KFmode:
+      if (CONSTANT_P (operands[1]) && !easy_fp_constant (operands[1], mode))
+	operands[1] = force_const_mem (mode, operands[1]);
+      break;
+
     case TFmode:
     case TDmode:
       rs6000_eliminate_indexed_memrefs (operands);
@@ -8617,7 +8699,7 @@  rs6000_member_type_forces_blk (const_tre
 
 /* Nonzero if we can use a floating-point register to pass this arg.  */
 #define USE_FP_FOR_ARG_P(CUM,MODE)		\
-  (SCALAR_FLOAT_MODE_P (MODE)			\
+  (scalar_float_not_ieee128_p (MODE)		\
    && (CUM)->fregno <= FP_ARG_MAX_REG		\
    && TARGET_HARD_FLOAT && TARGET_FPRS)
 
@@ -8818,7 +8900,7 @@  rs6000_discover_homogeneous_aggregate (e
 
       if (field_count > 0)
 	{
-	  int n_regs = (SCALAR_FLOAT_MODE_P (field_mode)?
+	  int n_regs = (SCALAR_FLOAT_MODE_P (field_mode) ?
 			(GET_MODE_SIZE (field_mode) + 7) >> 3 : 1);
 
 	  /* The ELFv2 ABI allows homogeneous aggregates to occupy
@@ -8928,7 +9010,8 @@  rs6000_return_in_memory (const_tree type
       return true;
     }
 
-  if (DEFAULT_ABI == ABI_V4 && TARGET_IEEEQUAD && TYPE_MODE (type) == TFmode)
+  if (IEEE_128BIT_P (TYPE_MODE (type))
+      && (TARGET_FLOAT128_REF || (DEFAULT_ABI == ABI_V4)))
     return true;
 
   return false;
@@ -9000,6 +9083,7 @@  init_cumulative_args (CUMULATIVE_ARGS *c
 		      ? CALL_LIBCALL : CALL_NORMAL);
   cum->sysv_gregno = GP_ARG_MIN_REG;
   cum->stdarg = stdarg_p (fntype);
+  cum->libcall = libcall;
 
   cum->nargs_prototype = 0;
   if (incoming || cum->prototype)
@@ -9058,7 +9142,7 @@  init_cumulative_args (CUMULATIVE_ARGS *c
 		      <= 8))
 		rs6000_returns_struct = true;
 	    }
-	  if (SCALAR_FLOAT_MODE_P (return_mode))
+	  if (scalar_float_not_ieee128_p (return_mode))
 	    rs6000_passes_float = true;
 	  else if (ALTIVEC_OR_VSX_VECTOR_MODE (return_mode)
 		   || SPE_VECTOR_MODE (return_mode))
@@ -9173,8 +9257,10 @@  rs6000_function_arg_boundary (enum machi
       && (GET_MODE_SIZE (mode) == 8
 	  || (TARGET_HARD_FLOAT
 	      && TARGET_FPRS
-	      && (mode == TFmode || mode == TDmode))))
+	      && FLOAT128_2REG_P (mode))))
     return 64;
+  else if (FLOAT128_VECTOR_P (mode))
+    return 128;
   else if (SPE_VECTOR_MODE (mode)
 	   || (type && TREE_CODE (type) == VECTOR_TYPE
 	       && int_size_in_bytes (type) >= 8
@@ -9412,7 +9498,7 @@  rs6000_function_arg_advance_1 (CUMULATIV
   if (DEFAULT_ABI == ABI_V4
       && cum->escapes)
     {
-      if (SCALAR_FLOAT_MODE_P (mode))
+      if (scalar_float_not_ieee128_p (mode))
 	rs6000_passes_float = true;
       else if (named && ALTIVEC_OR_VSX_VECTOR_MODE (mode))
 	rs6000_passes_vector = true;
@@ -9519,7 +9605,7 @@  rs6000_function_arg_advance_1 (CUMULATIV
       if (TARGET_HARD_FLOAT && TARGET_FPRS
 	  && ((TARGET_SINGLE_FLOAT && mode == SFmode)
 	      || (TARGET_DOUBLE_FLOAT && mode == DFmode)
-	      || (mode == TFmode && !TARGET_IEEEQUAD)
+	      || FLOAT128_2REG_P (mode)
 	      || mode == SDmode || mode == DDmode || mode == TDmode))
 	{
 	  /* _Decimal128 must use an even/odd register pair.  This assumes
@@ -9527,13 +9613,13 @@  rs6000_function_arg_advance_1 (CUMULATIV
 	  if (mode == TDmode && (cum->fregno % 2) == 1)
 	    cum->fregno++;
 
-	  if (cum->fregno + (mode == TFmode || mode == TDmode ? 1 : 0)
+	  if (cum->fregno + (FLOAT128_2REG_P (mode) ? 1 : 0)
 	      <= FP_ARG_V4_MAX_REG)
 	    cum->fregno += (GET_MODE_SIZE (mode) + 7) >> 3;
 	  else
 	    {
 	      cum->fregno = FP_ARG_V4_MAX_REG + 1;
-	      if (mode == DFmode || mode == TFmode
+	      if (mode == DFmode || FLOAT128_2REG_P (mode)
 		  || mode == DDmode || mode == TDmode)
 		cum->words += cum->words & 1;
 	      cum->words += rs6000_arg_size (mode, type);
@@ -9585,8 +9671,8 @@  rs6000_function_arg_advance_1 (CUMULATIV
 
       cum->words = align_words + n_words;
 
-      if (SCALAR_FLOAT_MODE_P (elt_mode)
-	  && TARGET_HARD_FLOAT && TARGET_FPRS)
+      if (scalar_float_not_ieee128_p (elt_mode) && TARGET_HARD_FLOAT
+	  && TARGET_FPRS)
 	{
 	  /* _Decimal128 must be passed in an even/odd float register pair.
 	     This assumes that the register number is odd when fregno is
@@ -10104,9 +10190,11 @@  rs6000_function_arg (cumulative_args_t c
       rtx r, off;
       int i, k = 0;
 
-      /* Do we also need to pass this argument in the parameter
-	 save area?  */
-      if (TARGET_64BIT && ! cum->prototype)
+      /* Do we also need to pass this argument in the parameter save area?
+	 Library support functions for IEEE 128-bit are assumed to not need the
+	 value passed both in GPRs and in vector registers.  */
+      if (TARGET_64BIT && !cum->prototype
+	  && (!cum->libcall || !FLOAT128_VECTOR_P (elt_mode)))
 	{
 	  int align_words = (cum->words + 1) & ~1;
 	  k = rs6000_psave_function_arg (mode, type, align_words, rvec);
@@ -10179,7 +10267,7 @@  rs6000_function_arg (cumulative_args_t c
       if (TARGET_HARD_FLOAT && TARGET_FPRS
 	  && ((TARGET_SINGLE_FLOAT && mode == SFmode)
 	      || (TARGET_DOUBLE_FLOAT && mode == DFmode)
-	      || (mode == TFmode && !TARGET_IEEEQUAD)
+	      || FLOAT128_2REG_P (mode)
 	      || mode == SDmode || mode == DDmode || mode == TDmode))
 	{
 	  /* _Decimal128 must use an even/odd register pair.  This assumes
@@ -10187,7 +10275,7 @@  rs6000_function_arg (cumulative_args_t c
 	  if (mode == TDmode && (cum->fregno % 2) == 1)
 	    cum->fregno++;
 
-	  if (cum->fregno + (mode == TFmode || mode == TDmode ? 1 : 0)
+	  if (cum->fregno + (FLOAT128_2REG_P (mode) ? 1 : 0)
 	      <= FP_ARG_V4_MAX_REG)
 	    return gen_rtx_REG (mode, cum->fregno);
 	  else
@@ -10248,7 +10336,7 @@  rs6000_function_arg (cumulative_args_t c
 	      enum machine_mode fmode = elt_mode;
 	      if (cum->fregno + (i + 1) * n_fpreg > FP_ARG_MAX_REG + 1)
 		{
-		  gcc_assert (fmode == TFmode || fmode == TDmode);
+		  gcc_assert (FLOAT128_2REG_P (fmode));
 		  fmode = DECIMAL_FLOAT_MODE_P (fmode) ? DDmode : DFmode;
 		}
 
@@ -10295,11 +10383,14 @@  rs6000_arg_partial_bytes (cumulative_arg
 
   if (USE_ALTIVEC_FOR_ARG_P (cum, elt_mode, named))
     {
-      /* If we are passing this arg in the fixed parameter save area
-         (gprs or memory) as well as VRs, we do not use the partial
-	 bytes mechanism; instead, rs6000_function_arg will return a
-	 PARALLEL including a memory element as necessary.  */
-      if (TARGET_64BIT && ! cum->prototype)
+      /* If we are passing this arg in the fixed parameter save area (gprs or
+         memory) as well as VRs, we do not use the partial bytes mechanism;
+         instead, rs6000_function_arg will return a PARALLEL including a memory
+         element as necessary.  Library support functions for IEEE 128-bit are
+         assumed to not need the value passed both in GPRs and in vector
+         registers.  */
+      if (TARGET_64BIT && !cum->prototype
+	  && (!cum->libcall || !FLOAT128_VECTOR_P (elt_mode)))
 	return 0;
 
       /* Otherwise, we pass in VRs only.  Check for partial copies.  */
@@ -10366,10 +10457,11 @@  rs6000_pass_by_reference (cumulative_arg
 			  enum machine_mode mode, const_tree type,
 			  bool named ATTRIBUTE_UNUSED)
 {
-  if (DEFAULT_ABI == ABI_V4 && TARGET_IEEEQUAD && mode == TFmode)
+  if (IEEE_128BIT_P (mode)
+      && (TARGET_FLOAT128_REF || (DEFAULT_ABI == ABI_V4)))
     {
       if (TARGET_DEBUG_ARG)
-	fprintf (stderr, "function_arg_pass_by_reference: V4 long double\n");
+	fprintf (stderr, "function_arg_pass_by_reference: V4 IEEE 128-bit\n");
       return 1;
     }
 
@@ -11047,6 +11139,7 @@  rs6000_gimplify_va_arg (tree valist, tre
           || (TARGET_DOUBLE_FLOAT 
               && (TYPE_MODE (type) == DFmode 
  	          || TYPE_MODE (type) == TFmode
+ 	          || TYPE_MODE (type) == KFmode
 	          || TYPE_MODE (type) == SDmode
 	          || TYPE_MODE (type) == DDmode
 	          || TYPE_MODE (type) == TDmode))))
@@ -13824,6 +13917,7 @@  rs6000_init_builtins (void)
   tree tdecl;
   tree ftype;
   enum machine_mode mode;
+  enum machine_mode ieee128_mode;
 
   if (TARGET_DEBUG_BUILTIN)
     fprintf (stderr, "rs6000_init_builtins%s%s%s%s\n",
@@ -13891,6 +13985,20 @@  rs6000_init_builtins (void)
   dfloat128_type_internal_node = dfloat128_type_node;
   void_type_internal_node = void_type_node;
 
+  /* 128-bit floating point support.  KFmode is IEEE 128-bit floating point.
+     TFmode will be either IEEE 128-bit floating point or the IBM double-double
+     format that uses a pair of doubles, depending on the switches and
+     defaults.  */
+  ieee128_mode = (TARGET_IEEEQUAD) ? TFmode : KFmode;
+  ieee128_float_type_node = make_node (REAL_TYPE);
+  TYPE_PRECISION (ieee128_float_type_node) = 128;
+  layout_type (ieee128_float_type_node);
+  SET_TYPE_MODE (ieee128_float_type_node, ieee128_mode);
+
+  if (TARGET_FLOAT128)
+    lang_hooks.types.register_builtin_type (ieee128_float_type_node,
+					    "__float128");
+
   /* Initialize the modes for builtin_function_type, mapping a machine mode to
      tree type node.  */
   builtin_mode_to_type[QImode][0] = integer_type_node;
@@ -13903,6 +14011,7 @@  rs6000_init_builtins (void)
   builtin_mode_to_type[TImode][1] = unsigned_intTI_type_node;
   builtin_mode_to_type[SFmode][0] = float_type_node;
   builtin_mode_to_type[DFmode][0] = double_type_node;
+  builtin_mode_to_type[KFmode][0] = ieee128_float_type_node;
   builtin_mode_to_type[TFmode][0] = long_double_type_node;
   builtin_mode_to_type[DDmode][0] = dfloat64_type_node;
   builtin_mode_to_type[TDmode][0] = dfloat128_type_node;
@@ -15374,78 +15483,163 @@  rs6000_common_init_builtins (void)
     }
 }
 
+/* Set up AIX/Darwin/64-bit Linux quad floating point routines.  */
 static void
-rs6000_init_libfuncs (void)
+init_float128_ibm (enum machine_mode mode)
 {
-  if (!TARGET_IEEEQUAD)
-      /* AIX/Darwin/64-bit Linux quad floating point routines.  */
-    if (!TARGET_XL_COMPAT)
-      {
-	set_optab_libfunc (add_optab, TFmode, "__gcc_qadd");
-	set_optab_libfunc (sub_optab, TFmode, "__gcc_qsub");
-	set_optab_libfunc (smul_optab, TFmode, "__gcc_qmul");
-	set_optab_libfunc (sdiv_optab, TFmode, "__gcc_qdiv");
+  if (!TARGET_XL_COMPAT)
+    {
+      set_optab_libfunc (add_optab, mode, "__gcc_qadd");
+      set_optab_libfunc (sub_optab, mode, "__gcc_qsub");
+      set_optab_libfunc (smul_optab, mode, "__gcc_qmul");
+      set_optab_libfunc (sdiv_optab, mode, "__gcc_qdiv");
 
-	if (!(TARGET_HARD_FLOAT && (TARGET_FPRS || TARGET_E500_DOUBLE)))
-	  {
-	    set_optab_libfunc (neg_optab, TFmode, "__gcc_qneg");
-	    set_optab_libfunc (eq_optab, TFmode, "__gcc_qeq");
-	    set_optab_libfunc (ne_optab, TFmode, "__gcc_qne");
-	    set_optab_libfunc (gt_optab, TFmode, "__gcc_qgt");
-	    set_optab_libfunc (ge_optab, TFmode, "__gcc_qge");
-	    set_optab_libfunc (lt_optab, TFmode, "__gcc_qlt");
-	    set_optab_libfunc (le_optab, TFmode, "__gcc_qle");
-
-	    set_conv_libfunc (sext_optab, TFmode, SFmode, "__gcc_stoq");
-	    set_conv_libfunc (sext_optab, TFmode, DFmode, "__gcc_dtoq");
-	    set_conv_libfunc (trunc_optab, SFmode, TFmode, "__gcc_qtos");
-	    set_conv_libfunc (trunc_optab, DFmode, TFmode, "__gcc_qtod");
-	    set_conv_libfunc (sfix_optab, SImode, TFmode, "__gcc_qtoi");
-	    set_conv_libfunc (ufix_optab, SImode, TFmode, "__gcc_qtou");
-	    set_conv_libfunc (sfloat_optab, TFmode, SImode, "__gcc_itoq");
-	    set_conv_libfunc (ufloat_optab, TFmode, SImode, "__gcc_utoq");
-	  }
+      if (!(TARGET_HARD_FLOAT && (TARGET_FPRS || TARGET_E500_DOUBLE)))
+	{
+	  set_optab_libfunc (neg_optab, mode, "__gcc_qneg");
+	  set_optab_libfunc (eq_optab, mode, "__gcc_qeq");
+	  set_optab_libfunc (ne_optab, mode, "__gcc_qne");
+	  set_optab_libfunc (gt_optab, mode, "__gcc_qgt");
+	  set_optab_libfunc (ge_optab, mode, "__gcc_qge");
+	  set_optab_libfunc (lt_optab, mode, "__gcc_qlt");
+	  set_optab_libfunc (le_optab, mode, "__gcc_qle");
 
-	if (!(TARGET_HARD_FLOAT && TARGET_FPRS))
-	  set_optab_libfunc (unord_optab, TFmode, "__gcc_qunord");
-      }
-    else
-      {
-	set_optab_libfunc (add_optab, TFmode, "_xlqadd");
-	set_optab_libfunc (sub_optab, TFmode, "_xlqsub");
-	set_optab_libfunc (smul_optab, TFmode, "_xlqmul");
-	set_optab_libfunc (sdiv_optab, TFmode, "_xlqdiv");
-      }
+	  set_conv_libfunc (sext_optab, mode, SFmode, "__gcc_stoq");
+	  set_conv_libfunc (sext_optab, mode, DFmode, "__gcc_dtoq");
+	  set_conv_libfunc (trunc_optab, SFmode, mode, "__gcc_qtos");
+	  set_conv_libfunc (trunc_optab, DFmode, mode, "__gcc_qtod");
+	  set_conv_libfunc (sfix_optab, SImode, mode, "__gcc_qtoi");
+	  set_conv_libfunc (ufix_optab, SImode, mode, "__gcc_qtou");
+	  set_conv_libfunc (sfloat_optab, mode, SImode, "__gcc_itoq");
+	  set_conv_libfunc (ufloat_optab, mode, SImode, "__gcc_utoq");
+	}
+
+      if (!(TARGET_HARD_FLOAT && TARGET_FPRS))
+	set_optab_libfunc (unord_optab, mode, "__gcc_qunord");
+    }
   else
     {
-      /* 32-bit SVR4 quad floating point routines.  */
+      set_optab_libfunc (add_optab, mode, "_xlqadd");
+      set_optab_libfunc (sub_optab, mode, "_xlqsub");
+      set_optab_libfunc (smul_optab, mode, "_xlqmul");
+      set_optab_libfunc (sdiv_optab, mode, "_xlqdiv");
+    }
+}
+
+/* Set up IEEE 128-bit floating point routines.  Use different names if the
+   arguments can be passed in a vector register.  The historical PowerPC
+   implementation of IEEE 128-bit floating point used _q_<op> for the names, so
+   continue to use that if we can't pass IEEE 128-bit in a VSX vector register.
+
+   When IEEE 128-bit values are passed in a vector register, use libgcc-style
+   names with a "kf" mode suffix (for example, __addkf3), so they do not clash
+   with the historical _q_<op> functions.  */
+
+static void
+init_float128_ieee (enum machine_mode mode)
+{
+  if (FLOAT128_VECTOR_P (mode))
+    {
+      set_optab_libfunc (add_optab, mode, "__addkf3");
+      set_optab_libfunc (sub_optab, mode, "__subkf3");
+      set_optab_libfunc (neg_optab, mode, "__negkf2");
+      set_optab_libfunc (smul_optab, mode, "__mulkf3");
+      set_optab_libfunc (sdiv_optab, mode, "__divkf3");
+      set_optab_libfunc (sqrt_optab, mode, "__sqrtkf2");
+      set_optab_libfunc (abs_optab, mode, "__abstkf2");
 
-      set_optab_libfunc (add_optab, TFmode, "_q_add");
-      set_optab_libfunc (sub_optab, TFmode, "_q_sub");
-      set_optab_libfunc (neg_optab, TFmode, "_q_neg");
-      set_optab_libfunc (smul_optab, TFmode, "_q_mul");
-      set_optab_libfunc (sdiv_optab, TFmode, "_q_div");
+      set_optab_libfunc (eq_optab, mode, "__eqkf2");
+      set_optab_libfunc (ne_optab, mode, "__nekf2");
+      set_optab_libfunc (gt_optab, mode, "__gtkf2");
+      set_optab_libfunc (ge_optab, mode, "__gekf2");
+      set_optab_libfunc (lt_optab, mode, "__ltkf2");
+      set_optab_libfunc (le_optab, mode, "__lekf2");
+      set_optab_libfunc (unord_optab, mode, "__unordkf2");
+      set_optab_libfunc (cmp_optab, mode, "__cmpkf2");
+
+      set_conv_libfunc (sext_optab, mode, SFmode, "__extendsfkf2");
+      set_conv_libfunc (sext_optab, mode, DFmode, "__extenddfkf2");
+      set_conv_libfunc (trunc_optab, SFmode, mode, "__trunckfsf2");
+      set_conv_libfunc (trunc_optab, DFmode, mode, "__trunckfdf2");
+
+      set_conv_libfunc (sfix_optab, SImode, mode, "__fixkfsi");
+      set_conv_libfunc (ufix_optab, SImode, mode, "__fixunskfsi");
+      set_conv_libfunc (sfix_optab, DImode, mode, "__fixkfdi");
+      set_conv_libfunc (ufix_optab, DImode, mode, "__fixunskfdi");
+
+      set_conv_libfunc (sfloat_optab, mode, SImode, "__floatsikf");
+      set_conv_libfunc (ufloat_optab, mode, SImode, "__floatunsikf");
+      set_conv_libfunc (sfloat_optab, mode, DImode, "__floatdikf");
+      set_conv_libfunc (ufloat_optab, mode, DImode, "__floatundikf");
+
+      if (mode == KFmode && !TARGET_IEEEQUAD && TARGET_LONG_DOUBLE_128)
+	{
+	  set_conv_libfunc (sext_optab, TFmode, mode, "__extendkftf2");
+	  set_conv_libfunc (trunc_optab, mode, TFmode, "__trunctfkf2");
+	}
+    }
+
+  else
+    {
+      set_optab_libfunc (add_optab, mode, "_q_add");
+      set_optab_libfunc (sub_optab, mode, "_q_sub");
+      set_optab_libfunc (neg_optab, mode, "_q_neg");
+      set_optab_libfunc (smul_optab, mode, "_q_mul");
+      set_optab_libfunc (sdiv_optab, mode, "_q_div");
       if (TARGET_PPC_GPOPT)
-	set_optab_libfunc (sqrt_optab, TFmode, "_q_sqrt");
+	set_optab_libfunc (sqrt_optab, mode, "_q_sqrt");
 
-      set_optab_libfunc (eq_optab, TFmode, "_q_feq");
-      set_optab_libfunc (ne_optab, TFmode, "_q_fne");
-      set_optab_libfunc (gt_optab, TFmode, "_q_fgt");
-      set_optab_libfunc (ge_optab, TFmode, "_q_fge");
-      set_optab_libfunc (lt_optab, TFmode, "_q_flt");
-      set_optab_libfunc (le_optab, TFmode, "_q_fle");
-
-      set_conv_libfunc (sext_optab, TFmode, SFmode, "_q_stoq");
-      set_conv_libfunc (sext_optab, TFmode, DFmode, "_q_dtoq");
-      set_conv_libfunc (trunc_optab, SFmode, TFmode, "_q_qtos");
-      set_conv_libfunc (trunc_optab, DFmode, TFmode, "_q_qtod");
-      set_conv_libfunc (sfix_optab, SImode, TFmode, "_q_qtoi");
-      set_conv_libfunc (ufix_optab, SImode, TFmode, "_q_qtou");
-      set_conv_libfunc (sfloat_optab, TFmode, SImode, "_q_itoq");
-      set_conv_libfunc (ufloat_optab, TFmode, SImode, "_q_utoq");
+      set_optab_libfunc (eq_optab, mode, "_q_feq");
+      set_optab_libfunc (ne_optab, mode, "_q_fne");
+      set_optab_libfunc (gt_optab, mode, "_q_fgt");
+      set_optab_libfunc (ge_optab, mode, "_q_fge");
+      set_optab_libfunc (lt_optab, mode, "_q_flt");
+      set_optab_libfunc (le_optab, mode, "_q_fle");
+
+      set_conv_libfunc (sext_optab, mode, SFmode, "_q_stoq");
+      set_conv_libfunc (sext_optab, mode, DFmode, "_q_dtoq");
+      set_conv_libfunc (trunc_optab, SFmode, mode, "_q_qtos");
+      set_conv_libfunc (trunc_optab, DFmode, mode, "_q_qtod");
+      set_conv_libfunc (sfix_optab, SImode, mode, "_q_qtoi");
+      set_conv_libfunc (ufix_optab, SImode, mode, "_q_qtou");
+      set_conv_libfunc (sfloat_optab, mode, SImode, "_q_itoq");
+      set_conv_libfunc (ufloat_optab, mode, SImode, "_q_utoq");
+
+      /* The classic V4 IEEE 128-bit support did not include IEEE unordered
+	 support, or 64/128-bit integer conversions.  If we have
+	 -mfloat128-ref, add these functions.  */
+      if (TARGET_FLOAT128_REF)
+	{
+	  set_optab_libfunc (unord_optab, mode, "_q_funordered");
+	  set_optab_libfunc (cmp_optab, mode, "_q_fcmp");
+
+	  if (mode == KFmode && !TARGET_IEEEQUAD && TARGET_LONG_DOUBLE_128)
+	    {
+	      set_conv_libfunc (sext_optab, TFmode, mode, "_q_qtot");
+	      set_conv_libfunc (trunc_optab, mode, TFmode, "_q_ttoq");
+	    }
+
+	  set_conv_libfunc (sfix_optab, DImode, mode, "_q_qtoi_d");
+	  set_conv_libfunc (ufix_optab, DImode, mode, "_q_qtou_d");
+	  set_conv_libfunc (sfloat_optab, mode, DImode, "_q_itoq_d");
+	  set_conv_libfunc (ufloat_optab, mode, DImode, "_q_utoq_d");
+	}
     }
 }
 
+static void
+rs6000_init_libfuncs (void)
+{
+  /* AIX/Darwin/64-bit Linux quad floating point routines.  */
+  if (!TARGET_IEEEQUAD)
+    init_float128_ibm (TFmode);
+
+  /* IEEE 128-bit including 32-bit SVR4 quad floating point routines.  */
+  init_float128_ieee (KFmode);
+  if (TARGET_IEEEQUAD)
+    init_float128_ieee (TFmode);
+}
+
 
 /* Expand a block clear operation, and return 1 if successful.  Return 0
    if we should let the compiler generate normal code.
@@ -17311,6 +17505,8 @@  rs6000_cannot_change_mode_class (enum ma
 	{
 	  unsigned to_nregs = hard_regno_nregs[FIRST_FPR_REGNO][to];
 	  unsigned from_nregs = hard_regno_nregs[FIRST_FPR_REGNO][from];
+	  bool to_float128_vector_p = FLOAT128_VECTOR_P (to);
+	  bool from_float128_vector_p = FLOAT128_VECTOR_P (from);
 
 	  /* Don't allow 64-bit types to overlap with 128-bit types that take a
 	     single register under VSX because the scalar part of the register
@@ -17319,7 +17515,10 @@  rs6000_cannot_change_mode_class (enum ma
 	     IEEE floating point can't overlap, and neither can small
 	     values.  */
 
-	  if (TARGET_IEEEQUAD && (to == TFmode || from == TFmode))
+	  if (to_float128_vector_p && from_float128_vector_p)
+	    return false;
+
+	  else if (to_float128_vector_p || from_float128_vector_p)
 	    return true;
 
 	  /* TDmode in floating-mode registers must always go into a register
@@ -17347,6 +17546,7 @@  rs6000_cannot_change_mode_class (enum ma
   if (TARGET_E500_DOUBLE
       && ((((to) == DFmode) + ((from) == DFmode)) == 1
 	  || (((to) == TFmode) + ((from) == TFmode)) == 1
+	  || (((to) == KFmode) + ((from) == KFmode)) == 1
 	  || (((to) == DDmode) + ((from) == DDmode)) == 1
 	  || (((to) == TDmode) + ((from) == TDmode)) == 1
 	  || (((to) == DImode) + ((from) == DImode)) == 1))
@@ -17543,13 +17743,7 @@  rs6000_output_move_128bit (rtx operands[
 	return output_vec_const_move (operands);
     }
 
-  if (TARGET_DEBUG_ADDR)
-    {
-      fprintf (stderr, "\n===== Bad 128 bit move:\n");
-      debug_rtx (gen_rtx_SET (VOIDmode, dest, src));
-    }
-
-  gcc_unreachable ();
+  fatal_insn ("Bad 128-bit move", gen_rtx_SET (VOIDmode, dest, src));
 }
 
 /* Validate a 128-bit move.  */
@@ -18344,7 +18538,7 @@  print_operand (FILE *file, rtx x, int co
 	/* Ugly hack because %y is overloaded.  */
 	if ((TARGET_SPE || TARGET_E500_DOUBLE)
 	    && (GET_MODE_SIZE (GET_MODE (x)) == 8
-		|| GET_MODE (x) == TFmode
+		|| FLOAT128_2REG_P (GET_MODE (x))
 		|| GET_MODE (x) == TImode
 		|| GET_MODE (x) == PTImode))
 	  {
@@ -18752,6 +18946,7 @@  rs6000_generate_compare (rtx cmp, enum m
 	      break;
 
 	    case TFmode:
+	    case KFmode:
 	      cmp = (flag_finite_math_only && !flag_trapping_math)
 		? gen_tsttfeq_gpr (compare_result, op0, op1)
 		: gen_cmptfeq_gpr (compare_result, op0, op1);
@@ -18779,6 +18974,7 @@  rs6000_generate_compare (rtx cmp, enum m
 	      break;
 
 	    case TFmode:
+	    case KFmode:
 	      cmp = (flag_finite_math_only && !flag_trapping_math)
 		? gen_tsttfgt_gpr (compare_result, op0, op1)
 		: gen_cmptfgt_gpr (compare_result, op0, op1);
@@ -18806,6 +19002,7 @@  rs6000_generate_compare (rtx cmp, enum m
 	      break;
 
 	    case TFmode:
+	    case KFmode:
 	      cmp = (flag_finite_math_only && !flag_trapping_math)
 		? gen_tsttflt_gpr (compare_result, op0, op1)
 		: gen_cmptflt_gpr (compare_result, op0, op1);
@@ -18843,6 +19040,7 @@  rs6000_generate_compare (rtx cmp, enum m
 	      break;
 
 	    case TFmode:
+	    case KFmode:
 	      cmp = (flag_finite_math_only && !flag_trapping_math)
 		? gen_tsttfeq_gpr (compare_result2, op0, op1)
 		: gen_cmptfeq_gpr (compare_result2, op0, op1);
@@ -18865,14 +19063,117 @@  rs6000_generate_compare (rtx cmp, enum m
 
       emit_insn (cmp);
     }
+
+  /* IEEE 128-bit support without hardware.  Call the comparison library
+     function, which returns a mask of the PPC_CMP_* bits defined below, then
+     test the bits selected for this comparison code against zero.  */
+  else if (IEEE_128BIT_P (mode))
+    {
+      rtx and_reg = gen_reg_rtx (SImode);
+      rtx dest = gen_reg_rtx (SImode);
+      rtx libfunc = optab_libfunc (cmp_optab, mode);
+      HOST_WIDE_INT mask_value = 0;
+
+      /* Values that __cmpkf2 returns (same bits as the CR registers).  */
+#define PPC_CMP_UNORDERED	0x1		/* isnan (a) || isnan (b).  */
+#define PPC_CMP_EQUAL		0x2		/* a == b.  */
+#define PPC_CMP_GREATER_THEN	0x4		/* a > b.  */
+#define PPC_CMP_LESS_THEN	0x8		/* a < b.  */
+
+      switch (code)
+	{
+	case EQ:
+	  mask_value = PPC_CMP_EQUAL;
+	  code = NE;
+	  break;
+
+	case NE:
+	  mask_value = PPC_CMP_EQUAL;
+	  code = EQ;
+	  break;
+
+	case GT:
+	  mask_value = PPC_CMP_GREATER_THEN;
+	  code = NE;
+	  break;
+
+	case GE:
+	  mask_value = PPC_CMP_GREATER_THEN | PPC_CMP_EQUAL;
+	  code = NE;
+	  break;
+
+	case LT:
+	  mask_value = PPC_CMP_LESS_THEN;
+	  code = NE;
+	  break;
+
+	case LE:
+	  mask_value = PPC_CMP_LESS_THEN | PPC_CMP_EQUAL;
+	  code = NE;
+	  break;
+
+	case UNLE:
+	  mask_value = PPC_CMP_GREATER_THEN;
+	  code = EQ;
+	  break;
+
+	case UNLT:
+	  mask_value = PPC_CMP_GREATER_THEN | PPC_CMP_EQUAL;
+	  code = EQ;
+	  break;
+
+	case UNGE:
+	  mask_value = PPC_CMP_LESS_THEN;
+	  code = EQ;
+	  break;
+
+	case UNGT:
+	  mask_value = PPC_CMP_LESS_THEN | PPC_CMP_EQUAL;
+	  code = EQ;
+	  break;
+
+	case UNEQ:
+	  mask_value = PPC_CMP_EQUAL | PPC_CMP_UNORDERED;
+	  code = NE;
+	  break;
+	case LTGT:
+	  mask_value = PPC_CMP_EQUAL | PPC_CMP_UNORDERED;
+	  code = EQ;
+	  break;
+
+	case UNORDERED:
+	  mask_value = PPC_CMP_UNORDERED;
+	  code = NE;
+	  break;
+
+	case ORDERED:
+	  mask_value = PPC_CMP_UNORDERED;
+	  code = EQ;
+	  break;
+
+	default:
+	  gcc_unreachable ();
+	}
+
+      gcc_assert (mask_value != 0);
+      and_reg = emit_library_call_value (libfunc, and_reg, LCT_CONST, SImode, 2,
+					 op0, mode, op1, mode);
+
+      emit_insn (gen_andsi3 (dest, and_reg, GEN_INT (mask_value)));
+      compare_result = gen_reg_rtx (CCmode);
+      comp_mode = CCmode;
+
+      emit_insn (gen_rtx_SET (VOIDmode, compare_result,
+			      gen_rtx_COMPARE (comp_mode, dest, const0_rtx)));
+    }
+
   else
     {
       /* Generate XLC-compatible TFmode compare as PARALLEL with extra
 	 CLOBBERs to match cmptf_internal2 pattern.  */
       if (comp_mode == CCFPmode && TARGET_XL_COMPAT
-	  && GET_MODE (op0) == TFmode
-	  && !TARGET_IEEEQUAD
-	  && TARGET_HARD_FLOAT && TARGET_FPRS && TARGET_LONG_DOUBLE_128)
+	  && IBM_128BIT_P (GET_MODE (op0))
+	  && TARGET_HARD_FLOAT && TARGET_FPRS)
 	emit_insn (gen_rtx_PARALLEL (VOIDmode,
 	  gen_rtvec (10,
 		     gen_rtx_SET (VOIDmode,
@@ -18906,6 +19207,7 @@  rs6000_generate_compare (rtx cmp, enum m
   /* Some kinds of FP comparisons need an OR operation;
      under flag_finite_math_only we don't bother.  */
   if (FLOAT_MODE_P (mode)
+      && !IEEE_128BIT_P (mode)
       && !flag_finite_math_only
       && !(TARGET_HARD_FLOAT && !TARGET_FPRS)
       && (code == LE || code == GE
@@ -18945,6 +19247,72 @@  rs6000_generate_compare (rtx cmp, enum m
 }
 
 
+/* Expand floating point conversion to/from __float128.  */
+
+void
+rs6000_expand_float128_convert (rtx dest, rtx src, bool unsigned_p)
+{
+  enum machine_mode dest_mode = GET_MODE (dest);
+  enum machine_mode src_mode = GET_MODE (src);
+  convert_optab cvt = unknown_optab;
+  rtx libfunc = NULL_RTX;
+  rtx dest2;
+
+  if (dest_mode == src_mode)
+    gcc_unreachable ();
+
+  if (IEEE_128BIT_P (dest_mode))
+    {
+      if (src_mode == SFmode
+	  || src_mode == DFmode
+	  || IBM_128BIT_P (src_mode))
+	cvt = sext_optab;
+
+      else if (GET_MODE_CLASS (src_mode) == MODE_INT)
+	cvt = (unsigned_p) ? ufloat_optab : sfloat_optab;
+
+      else if (IEEE_128BIT_P (src_mode))
+	{
+	  /* Same format, different mode: just move; no libcall needed.  */
+	  emit_move_insn (dest, gen_lowpart (dest_mode, src));
+	  return;
+	}
+
+      else
+	gcc_unreachable ();
+    }
+
+  else if (IEEE_128BIT_P (src_mode))
+    {
+      if (dest_mode == SFmode
+	  || dest_mode == DFmode
+	  || IBM_128BIT_P (dest_mode))
+	cvt = trunc_optab;
+
+      else if (GET_MODE_CLASS (dest_mode) == MODE_INT)
+	cvt = (unsigned_p) ? ufix_optab : sfix_optab;
+
+      else
+	gcc_unreachable ();
+    }
+
+  else
+    gcc_unreachable ();
+
+  gcc_assert (cvt != unknown_optab);
+  libfunc = convert_optab_libfunc (cvt, dest_mode, src_mode);
+  gcc_assert (libfunc != NULL_RTX);
+
+  dest2 = emit_library_call_value (libfunc, dest, LCT_CONST, dest_mode, 1, src,
+				   src_mode);
+
+  gcc_assert (dest2 != NULL_RTX);
+  if (!rtx_equal_p (dest, dest2))
+    emit_move_insn (dest, dest2);
+
+  return;
+}
+
 /* Emit the RTL for an sISEL pattern.  */
 
 void
@@ -20288,7 +20652,7 @@  rs6000_split_multireg_move (rtx dst, rtx
 	((TARGET_HARD_FLOAT && TARGET_DOUBLE_FLOAT) ? DFmode : SFmode);
   else if (ALTIVEC_REGNO_P (reg))
     reg_mode = V16QImode;
-  else if (TARGET_E500_DOUBLE && mode == TFmode)
+  else if (TARGET_E500_DOUBLE && FLOAT128_2REG_P (mode))
     reg_mode = DFmode;
   else
     reg_mode = word_mode;
@@ -21387,7 +21751,8 @@  spe_func_has_64bit_regs_p (void)
 
 	      if (SPE_VECTOR_MODE (mode))
 		return true;
-	      if (TARGET_E500_DOUBLE && (mode == DFmode || mode == TFmode))
+	      if (TARGET_E500_DOUBLE
+		  && (mode == DFmode || FLOAT128_2REG_P (mode)))
 		return true;
 	    }
 	}
@@ -25056,6 +25421,7 @@  rs6000_output_function_epilogue (FILE *f
 			case DDmode:
 			case TFmode:
 			case TDmode:
+			case KFmode:
 			  bits = 0x3;
 			  break;
 
@@ -25542,7 +25908,8 @@  output_toc (FILE *file, rtx x, int label
      TOC, things we put here aren't actually in the TOC, so we can allow
      FP constants.  */
   if (GET_CODE (x) == CONST_DOUBLE &&
-      (GET_MODE (x) == TFmode || GET_MODE (x) == TDmode))
+      (GET_MODE (x) == TFmode || GET_MODE (x) == TDmode
+       || GET_MODE (x) == KFmode))
     {
       REAL_VALUE_TYPE rv;
       long k[4];
@@ -28270,9 +28637,20 @@  rs6000_mangle_type (const_tree type)
   if (type == bool_int_type_node) return "U6__booli";
   if (type == bool_long_type_node) return "U6__booll";
 
-  /* Mangle IBM extended float long double as `g' (__float128) on
-     powerpc*-linux where long-double-64 previously was the default.  */
-  if (TYPE_MAIN_VARIANT (type) == long_double_type_node
+  /* For VSX systems, we are transitioning to supporting IEEE 128-bit floating
+     point.  Initially, users will have to use __float128 to get access to the
+     IEEE 128-bit floating point, and long double will remain the IBM
+     double-double format.  At some point in the future, long double may become
+     the same as __float128.
+
+     AIX and really old powerpc*-linux systems default to 64-bit for the
+     long double type, and we will use the normal C++ mangling in this
+     case.  */
+
+  if (type == ieee128_float_type_node)
+    return "e";
+
+  if (type == long_double_type_node
       && TARGET_ELF
       && TARGET_LONG_DOUBLE_128
       && !TARGET_IEEEQUAD)
@@ -30065,7 +30443,7 @@  rs6000_register_move_cost (enum machine_
 
   /* Moving between two similar registers is just one instruction.  */
   else if (reg_classes_intersect_p (to, from))
-    ret = (mode == TFmode || mode == TDmode) ? 4 : 2;
+    ret = (FLOAT128_2REG_P (mode)) ? 4 : 2;
 
   /* Everything else has to go through GENERAL_REGS.  */
   else
@@ -31094,7 +31472,7 @@  rs6000_function_value (const_tree valtyp
       int first_reg, n_regs, i;
       rtx par;
 
-      if (SCALAR_FLOAT_MODE_P (elt_mode))
+      if (scalar_float_not_ieee128_p (elt_mode))
 	{
 	  /* _Decimal128 must use even/odd register pairs.  */
 	  first_reg = (elt_mode == TDmode) ? FP_ARG_RETURN + 1 : FP_ARG_RETURN;
@@ -31159,7 +31537,7 @@  rs6000_function_value (const_tree valtyp
   if (DECIMAL_FLOAT_MODE_P (mode) && TARGET_HARD_FLOAT && TARGET_FPRS)
     /* _Decimal128 must use an even/odd register pair.  */
     regno = (mode == TDmode) ? FP_ARG_RETURN + 1 : FP_ARG_RETURN;
-  else if (SCALAR_FLOAT_TYPE_P (valtype) && TARGET_HARD_FLOAT && TARGET_FPRS
+  else if (scalar_float_not_ieee128_p (mode) && TARGET_HARD_FLOAT && TARGET_FPRS
 	   && ((TARGET_SINGLE_FLOAT && (mode == SFmode)) || TARGET_DOUBLE_FLOAT))
     regno = FP_ARG_RETURN;
   else if (TREE_CODE (valtype) == COMPLEX_TYPE
@@ -31168,13 +31546,13 @@  rs6000_function_value (const_tree valtyp
   /* VSX is a superset of Altivec and adds V2DImode/V2DFmode.  Since the same
      return register is used in both cases, and we won't see V2DImode/V2DFmode
      for pure altivec, combine the two cases.  */
-  else if (TREE_CODE (valtype) == VECTOR_TYPE
+  else if ((TREE_CODE (valtype) == VECTOR_TYPE || FLOAT128_VECTOR_P (mode))
 	   && TARGET_ALTIVEC && TARGET_ALTIVEC_ABI
 	   && ALTIVEC_OR_VSX_VECTOR_MODE (mode))
     regno = ALTIVEC_ARG_RETURN;
   else if (TARGET_E500_DOUBLE && TARGET_HARD_FLOAT
 	   && (mode == DFmode || mode == DCmode
-	       || mode == TFmode || mode == TCmode))
+	       || IBM_128BIT_P (mode) || mode == TCmode))
     return spe_build_register_parallel (mode, GP_ARG_RETURN);
   else
     regno = GP_ARG_RETURN;
@@ -31206,7 +31584,7 @@  rs6000_libcall_value (enum machine_mode 
   if (DECIMAL_FLOAT_MODE_P (mode) && TARGET_HARD_FLOAT && TARGET_FPRS)
     /* _Decimal128 must use an even/odd register pair.  */
     regno = (mode == TDmode) ? FP_ARG_RETURN + 1 : FP_ARG_RETURN;
-  else if (SCALAR_FLOAT_MODE_P (mode)
+  else if (scalar_float_not_ieee128_p (mode)
 	   && TARGET_HARD_FLOAT && TARGET_FPRS
            && ((TARGET_SINGLE_FLOAT && mode == SFmode) || TARGET_DOUBLE_FLOAT))
     regno = FP_ARG_RETURN;
@@ -31220,7 +31598,7 @@  rs6000_libcall_value (enum machine_mode 
     return rs6000_complex_function_value (mode);
   else if (TARGET_E500_DOUBLE && TARGET_HARD_FLOAT
 	   && (mode == DFmode || mode == DCmode
-	       || mode == TFmode || mode == TCmode))
+	       || IBM_128BIT_P (mode) || mode == TCmode))
     return spe_build_register_parallel (mode, GP_ARG_RETURN);
   else
     regno = GP_ARG_RETURN;
@@ -31417,6 +31795,8 @@  rs6000_scalar_mode_supported_p (enum mac
 {
   if (DECIMAL_FLOAT_MODE_P (mode))
     return default_decimal_float_supported_p ();
+  else if (mode == KFmode)
+    return TARGET_FLOAT128;
   else
     return default_scalar_mode_supported_p (mode);
 }
@@ -31432,13 +31812,26 @@  rs6000_vector_mode_supported_p (enum mac
   if (TARGET_SPE && SPE_VECTOR_MODE (mode))
     return true;
 
-  else if (VECTOR_MEM_ALTIVEC_OR_VSX_P (mode))
+  /* There is no vector form for IEEE 128-bit.  If we return true for IEEE
+     128-bit, the compiler might try to widen IEEE 128-bit to IBM
+     double-double.  */
+  else if (VECTOR_MEM_ALTIVEC_OR_VSX_P (mode) && !IEEE_128BIT_P (mode))
     return true;
 
   else
     return false;
 }
 
+/* Target hook for c_mode_for_suffix.  */
+static enum machine_mode
+rs6000_c_mode_for_suffix (char suffix)
+{
+  if (TARGET_FLOAT128 && suffix == 'q')
+    return KFmode;
+
+  return VOIDmode;
+}
+
 /* Target hook for invalid_arg_for_unprototyped_fn. */
 static const char *
 invalid_arg_for_unprototyped_fn (const_tree typelist, const_tree funcdecl, const_tree val)
@@ -31522,6 +31915,8 @@  static struct rs6000_opt_mask const rs60
   { "crypto",			OPTION_MASK_CRYPTO,		false, true  },
   { "direct-move",		OPTION_MASK_DIRECT_MOVE,	false, true  },
   { "dlmzb",			OPTION_MASK_DLMZB,		false, true  },
+  { "float128-vsx",		OPTION_MASK_FLOAT128_VSX,	false, true  },
+  { "float128-ref",		OPTION_MASK_FLOAT128_REF,	false, true  },
   { "fprnd",			OPTION_MASK_FPRND,		false, true  },
   { "hard-dfp",			OPTION_MASK_DFP,		false, true  },
   { "htm",			OPTION_MASK_HTM,		false, true  },
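
Note on the IEEE 128-bit branch of rs6000_generate_compare in the rs6000.c
changes above: the mask-and-test scheme can be modelled outside the compiler.
The following is only a sketch under stated assumptions, not code from the
patch.  It uses double as a stand-in for __float128, and the names cmp_bits,
eval_compare, and the C_* codes are hypothetical.  The bit values mirror the
PPC_CMP_* definitions, and UNEQ and LTGT are treated as two separate cases.

```c
#include <stdbool.h>

/* Result bits, mirroring the PPC_CMP_* values in the patch.  */
enum
{
  CMP_UNORDERED = 0x1,	/* either operand is a NaN */
  CMP_EQUAL = 0x2,
  CMP_GREATER = 0x4,
  CMP_LESS = 0x8
};

/* Hypothetical stand-ins for the RTL comparison codes.  */
enum cmp_code
{
  C_EQ, C_NE, C_GT, C_GE, C_LT, C_LE,
  C_UNLE, C_UNLT, C_UNGE, C_UNGT, C_UNEQ, C_LTGT,
  C_UNORDERED, C_ORDERED
};

/* Stand-in for the library comparison, using double instead of
   __float128 so the sketch builds on any host (an assumption).  */
static int
cmp_bits (double a, double b)
{
  if (a != a || b != b)
    return CMP_UNORDERED;
  if (a == b)
    return CMP_EQUAL;
  return (a > b) ? CMP_GREATER : CMP_LESS;
}

/* Evaluate CODE by masking the result bits and testing against zero.  */
static bool
eval_compare (enum cmp_code code, double a, double b)
{
  int mask = 0;
  bool want_nonzero = true;	/* true: test (bits & mask) != 0 */

  switch (code)
    {
    case C_EQ: mask = CMP_EQUAL; break;
    case C_NE: mask = CMP_EQUAL; want_nonzero = false; break;
    case C_GT: mask = CMP_GREATER; break;
    case C_GE: mask = CMP_GREATER | CMP_EQUAL; break;
    case C_LT: mask = CMP_LESS; break;
    case C_LE: mask = CMP_LESS | CMP_EQUAL; break;
    case C_UNLE: mask = CMP_GREATER; want_nonzero = false; break;
    case C_UNLT: mask = CMP_GREATER | CMP_EQUAL; want_nonzero = false; break;
    case C_UNGE: mask = CMP_LESS; want_nonzero = false; break;
    case C_UNGT: mask = CMP_LESS | CMP_EQUAL; want_nonzero = false; break;
    case C_UNEQ: mask = CMP_EQUAL | CMP_UNORDERED; break;
    case C_LTGT: mask = CMP_EQUAL | CMP_UNORDERED; want_nonzero = false; break;
    case C_UNORDERED: mask = CMP_UNORDERED; break;
    case C_ORDERED: mask = CMP_UNORDERED; want_nonzero = false; break;
    }

  int masked = cmp_bits (a, b) & mask;
  return want_nonzero ? (masked != 0) : (masked == 0);
}
```

For example, eval_compare (C_UNLT, x, y) holds when x < y or when either
operand is a NaN, because neither the "greater" nor the "equal" bit is set in
those cases.
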
Index: gcc/config/rs6000/vector.md
===================================================================
--- gcc/config/rs6000/vector.md	(.../svn+ssh://meissner@gcc.gnu.org/svn/gcc/trunk)	(revision 212529)
+++ gcc/config/rs6000/vector.md	(working copy)
@@ -36,13 +36,13 @@  (define_mode_iterator VEC_A [V16QI V8HI 
 (define_mode_iterator VEC_K [V16QI V8HI V4SI V4SF])
 
 ;; Vector logical modes
-(define_mode_iterator VEC_L [V16QI V8HI V4SI V2DI V4SF V2DF V1TI TI])
+(define_mode_iterator VEC_L [V16QI V8HI V4SI V2DI V4SF V2DF V1TI TI KF])
 
 ;; Vector modes for moves.  Don't do TImode here.
-(define_mode_iterator VEC_M [V16QI V8HI V4SI V2DI V4SF V2DF V1TI])
+(define_mode_iterator VEC_M [V16QI V8HI V4SI V2DI V4SF V2DF V1TI KF])
 
 ;; Vector modes for types that don't need a realignment under VSX
-(define_mode_iterator VEC_N [V4SI V4SF V2DI V2DF V1TI])
+(define_mode_iterator VEC_N [V4SI V4SF V2DI V2DF V1TI KF])
 
 ;; Vector comparison modes
 (define_mode_iterator VEC_C [V16QI V8HI V4SI V2DI V4SF V2DF])
@@ -55,7 +55,7 @@  (define_mode_iterator VEC_64 [V2DI V2DF]
 
 ;; Vector reload iterator
 (define_mode_iterator VEC_R [V16QI V8HI V4SI V2DI V4SF V2DF V1TI
-			     SF SD SI DF DD DI TI])
+			     SF SD SI DF DD DI TI KF])
 
 ;; Base type from vector mode
 (define_mode_attr VEC_base [(V16QI "QI")
@@ -95,16 +95,24 @@  (define_code_attr VEC_reduc_rtx [(plus "
 (define_expand "mov<mode>"
   [(set (match_operand:VEC_M 0 "nonimmediate_operand" "")
 	(match_operand:VEC_M 1 "any_operand" ""))]
-  "VECTOR_MEM_ALTIVEC_OR_VSX_P (<MODE>mode)"
+  "VECTOR_MEM_ALTIVEC_OR_VSX_P (<MODE>mode)
+   || (<MODE>mode == KFmode && TARGET_FLOAT128)"
 {
   if (can_create_pseudo_p ())
     {
-      if (CONSTANT_P (operands[1])
-	  && !easy_vector_constant (operands[1], <MODE>mode))
-	operands[1] = force_const_mem (<MODE>mode, operands[1]);
+      if (CONSTANT_P (operands[1]))
+	{
+	  if (FLOAT128_VECTOR_P (<MODE>mode))
+	    {
+	      if (!easy_fp_constant (operands[1], <MODE>mode))
+		operands[1] = force_const_mem (<MODE>mode, operands[1]);
+	    }
+	  else if (!easy_vector_constant (operands[1], <MODE>mode))
+	    operands[1] = force_const_mem (<MODE>mode, operands[1]);
+	}
 
-      else if (!vlogical_operand (operands[0], <MODE>mode)
-	       && !vlogical_operand (operands[1], <MODE>mode))
+      if (!vlogical_operand (operands[0], <MODE>mode)
+	  && !vlogical_operand (operands[1], <MODE>mode))
 	operands[1] = force_reg (<MODE>mode, operands[1]);
     }
   if (!BYTES_BIG_ENDIAN
Index: gcc/config/rs6000/predicates.md
===================================================================
--- gcc/config/rs6000/predicates.md	(.../svn+ssh://meissner@gcc.gnu.org/svn/gcc/trunk)	(revision 212529)
+++ gcc/config/rs6000/predicates.md	(working copy)
@@ -228,6 +228,25 @@  (define_predicate "int_reg_operand"
   return INT_REGNO_P (REGNO (op));
 })
 
+;; Like int_reg_operand, but don't return true for pseudo registers
+(define_predicate "int_reg_operand_not_pseudo"
+  (match_operand 0 "register_operand")
+{
+  if ((TARGET_E500_DOUBLE || TARGET_SPE) && invalid_e500_subreg (op, mode))
+    return 0;
+
+  if (GET_CODE (op) == SUBREG)
+    op = SUBREG_REG (op);
+
+  if (!REG_P (op))
+    return 0;
+
+  if (REGNO (op) >= FIRST_PSEUDO_REGISTER)
+    return 0;
+
+  return INT_REGNO_P (REGNO (op));
+})
+
 ;; Like int_reg_operand, but only return true for base registers
 (define_predicate "base_reg_operand"
   (match_operand 0 "int_reg_operand")
@@ -438,11 +457,12 @@  (define_predicate "easy_fp_constant"
     return 1;
 
   /* The constant 0.0 is easy under VSX.  */
-  if ((mode == SFmode || mode == DFmode || mode == SDmode || mode == DDmode)
+  if ((mode == SFmode || mode == DFmode || mode == SDmode || mode == DDmode
+       || mode == KFmode)
       && VECTOR_UNIT_VSX_P (DFmode) && op == CONST0_RTX (mode))
     return 1;
 
-  if (DECIMAL_FLOAT_MODE_P (mode))
+  if (DECIMAL_FLOAT_MODE_P (mode) || mode == KFmode)
     return 0;
 
   /* If we are using V.4 style PIC, consider all constants to be hard.  */
@@ -458,7 +478,14 @@  (define_predicate "easy_fp_constant"
 
   switch (mode)
     {
+    /* For IEEE 128-bit, only consider 0.0 to be easy.  */
+    case KFmode:
+      return (op == CONST0_RTX (mode));
+
     case TFmode:
+      if (FLOAT128_VECTOR_P (mode))
+	return (op == CONST0_RTX (mode));
+
       if (TARGET_E500_DOUBLE)
 	return 0;
 
@@ -531,6 +558,12 @@  (define_predicate "easy_vector_constant"
   if (TARGET_PAIRED_FLOAT)
     return false;
 
+  /* Because IEEE 128-bit floating point is considered a vector type
+     in order to pass it in VSX registers, it might use this function
+     instead of easy_fp_constant.  */
+  if (FLOAT128_VECTOR_P (mode))
+    return easy_fp_constant (op, mode);
+
   if (VECTOR_MEM_ALTIVEC_OR_VSX_P (mode))
     {
       if (zero_constant (op, mode))
Index: gcc/config/rs6000/rs6000-modes.def
===================================================================
--- gcc/config/rs6000/rs6000-modes.def	(.../svn+ssh://meissner@gcc.gnu.org/svn/gcc/trunk)	(revision 212529)
+++ gcc/config/rs6000/rs6000-modes.def	(working copy)
@@ -47,3 +47,15 @@  VECTOR_MODES (FLOAT, 32);     /*       V
    for quad memory atomic operations to force getting an even/odd register
    combination.  */
 PARTIAL_INT_MODE (TI, 128, PTI);
+
+/* IEEE 128-bit floating point.  Define this as a SPECIAL floating point type
+   to prevent the compiler from widening DFmode to KFmode (IEEE 128-bit) and
+   then to TFmode (IBM long double).  To use KFmode, you must explicitly use
+   __float128.  */
+SPECIAL_FLOAT_MODE (KF, 112, 16, ieee_quad_format);
+ADJUST_ALIGNMENT (KF, 16);
+
+/* FRACTIONAL_FLOAT_MODE (KF, 80, 16, ieee_quad_format); */
+/* ADJUST_BYTESIZE  (KF, 16); */
+/* ADJUST_ALIGNMENT (KF, 16); */
+
Index: gcc/config/rs6000/rs6000-cpus.def
===================================================================
--- gcc/config/rs6000/rs6000-cpus.def	(.../svn+ssh://meissner@gcc.gnu.org/svn/gcc/trunk)	(revision 212529)
+++ gcc/config/rs6000/rs6000-cpus.def	(working copy)
@@ -44,6 +44,7 @@ 
 #define ISA_2_6_MASKS_SERVER	(ISA_2_5_MASKS_SERVER			\
 				 | OPTION_MASK_POPCNTD			\
 				 | OPTION_MASK_ALTIVEC			\
+				 | OPTION_MASK_FLOAT128_VSX		\
 				 | OPTION_MASK_VSX)
 
 /* For now, don't provide an embedded version of ISA 2.07.  */
@@ -76,6 +77,8 @@ 
 				 | OPTION_MASK_DFP			\
 				 | OPTION_MASK_DIRECT_MOVE		\
 				 | OPTION_MASK_DLMZB			\
+				 | OPTION_MASK_FLOAT128_VSX		\
+				 | OPTION_MASK_FLOAT128_REF		\
 				 | OPTION_MASK_FPRND			\
 				 | OPTION_MASK_HTM			\
 				 | OPTION_MASK_ISEL			\
@@ -184,7 +187,7 @@  RS6000_CPU ("power6x", PROCESSOR_POWER6,
 RS6000_CPU ("power7", PROCESSOR_POWER7,   /* Don't add MASK_ISEL by default */
 	    POWERPC_7400_MASK | MASK_POWERPC64 | MASK_PPC_GPOPT | MASK_MFCRF
 	    | MASK_POPCNTB | MASK_FPRND | MASK_CMPB | MASK_DFP | MASK_POPCNTD
-	    | MASK_VSX | MASK_RECIP_PRECISION)
+	    | MASK_VSX | MASK_RECIP_PRECISION | MASK_FLOAT128_VSX)
 RS6000_CPU ("power8", PROCESSOR_POWER8, MASK_POWERPC64 | ISA_2_7_MASKS_SERVER)
 RS6000_CPU ("powerpc", PROCESSOR_POWERPC, 0)
 RS6000_CPU ("powerpc64", PROCESSOR_POWERPC64, MASK_PPC_GFXOPT | MASK_POWERPC64)
Index: gcc/config/rs6000/rs6000-c.c
===================================================================
--- gcc/config/rs6000/rs6000-c.c	(.../svn+ssh://meissner@gcc.gnu.org/svn/gcc/trunk)	(revision 212529)
+++ gcc/config/rs6000/rs6000-c.c	(working copy)
@@ -363,6 +363,12 @@  rs6000_target_modify_macros (bool define
     rs6000_define_or_undefine_macro (define_p, "__QUAD_MEMORY_ATOMIC__");
   if ((flags & OPTION_MASK_CRYPTO) != 0)
     rs6000_define_or_undefine_macro (define_p, "__CRYPTO__");
+  if ((flags & (OPTION_MASK_FLOAT128_VSX | OPTION_MASK_FLOAT128_REF)) != 0)
+    rs6000_define_or_undefine_macro (define_p, "__FLOAT128__");
+  if ((flags & OPTION_MASK_FLOAT128_VSX) != 0)
+    rs6000_define_or_undefine_macro (define_p, "__FLOAT128_VSX__");
+  if ((flags & OPTION_MASK_FLOAT128_REF) != 0)
+    rs6000_define_or_undefine_macro (define_p, "__FLOAT128_REF__");
 
   /* options from the builtin masks.  */
   if ((bu_mask & RS6000_BTM_SPE) != 0)
@@ -461,6 +467,11 @@  rs6000_cpu_cpp_builtins (cpp_reader *pfi
     {
       builtin_define ("__LONG_DOUBLE_128__");
       builtin_define ("__LONGDOUBLE128");
+
+      if (TARGET_IEEEQUAD)
+	builtin_define ("__LONG_DOUBLE_IEEE128__");
+      else
+	builtin_define ("__LONG_DOUBLE_IBM128__");
     }
 
   switch (TARGET_CMODEL)
Index: gcc/config/rs6000/rs6000.opt
===================================================================
--- gcc/config/rs6000/rs6000.opt	(.../svn+ssh://meissner@gcc.gnu.org/svn/gcc/trunk)	(revision 212529)
+++ gcc/config/rs6000/rs6000.opt	(working copy)
@@ -441,6 +441,14 @@  mwarn-altivec-long
 Target Var(rs6000_warn_altivec_long) Init(1) Save
 Warn about deprecated 'vector long ...' AltiVec type usage
 
+mfloat128-vsx
+Target Mask(FLOAT128_VSX) Var(rs6000_isa_flags)
+Enable the __float128 type using VSX registers
+
+mfloat128-ref
+Target Mask(FLOAT128_REF) Var(rs6000_isa_flags)
+Enable the __float128 type, passing/returning the value by reference
+
 mfloat-gprs=
 Target RejectNegative Joined Enum(rs6000_float_gprs) Var(rs6000_float_gprs) Save
 -mfloat-gprs=	Select GPR floating point method
Index: gcc/config/rs6000/vsx.md
===================================================================
--- gcc/config/rs6000/vsx.md	(.../svn+ssh://meissner@gcc.gnu.org/svn/gcc/trunk)	(revision 212529)
+++ gcc/config/rs6000/vsx.md	(working copy)
@@ -41,11 +41,28 @@  (define_mode_iterator VSX_DF [V2DF DF])
 (define_mode_iterator VSX_F [V4SF V2DF])
 
 ;; Iterator for logical types supported by VSX
-(define_mode_iterator VSX_L [V16QI V8HI V4SI V2DI V4SF V2DF V1TI TI])
+(define_mode_iterator VSX_L [V16QI
+			     V8HI
+			     V4SI
+			     V2DI
+			     V4SF
+			     V2DF
+			     V1TI
+			     TI
+			     (KF	"FLOAT128_VECTOR_P (KFmode)")
+			     (TF	"FLOAT128_VECTOR_P (TFmode)")])
 
 ;; Iterator for memory move.  Handle TImode specially to allow
 ;; it to use gprs as well as vsx registers.
-(define_mode_iterator VSX_M [V16QI V8HI V4SI V2DI V4SF V2DF V1TI])
+(define_mode_iterator VSX_M [V16QI
+			     V8HI
+			     V4SI
+			     V2DI
+			     V4SF
+			     V2DF
+			     V1TI
+			     (KF	"FLOAT128_VECTOR_P (KFmode)")
+			     (TF	"FLOAT128_VECTOR_P (TFmode)")])
 
 (define_mode_iterator VSX_M2 [V16QI
 			      V8HI
@@ -54,8 +71,15 @@  (define_mode_iterator VSX_M2 [V16QI
 			      V4SF
 			      V2DF
 			      V1TI
+			      (KF	"FLOAT128_VECTOR_P (KFmode)")
+			      (TF	"FLOAT128_VECTOR_P (TFmode)")
 			      (TI	"TARGET_VSX_TIMODE")])
 
+;; Mode iterator for 128-bit floating point that goes in a single vector
+;; register.
+(define_mode_iterator VSX_F128 [(KF	"FLOAT128_VECTOR_P (KFmode)")
+				(TF	"FLOAT128_VECTOR_P (TFmode)")])
+
 ;; Map into the appropriate load/store name based on the type
 (define_mode_attr VSm  [(V16QI "vw4")
 			(V8HI  "vw4")
@@ -64,6 +88,7 @@  (define_mode_attr VSm  [(V16QI "vw4")
 			(V2DF  "vd2")
 			(V2DI  "vd2")
 			(DF    "d")
+			(KF    "vd2")
 			(V1TI  "vd2")
 			(TI    "vd2")])
 
@@ -76,6 +101,7 @@  (define_mode_attr VSs	[(V16QI "sp")
 			 (V2DI  "dp")
 			 (DF    "dp")
 			 (SF	"sp")
+			 (KF    "dp")
 			 (V1TI  "dp")
 			 (TI    "dp")])
 
@@ -88,6 +114,7 @@  (define_mode_attr VSr	[(V16QI "v")
 			 (V2DF  "wd")
 			 (DF    "ws")
 			 (SF	"d")
+			 (KF	"wd")
 			 (V1TI  "v")
 			 (TI    "wt")])
 
@@ -135,7 +162,8 @@  (define_mode_attr VSv	[(V16QI "v")
 			 (V2DI  "v")
 			 (V2DF  "v")
 			 (V1TI  "v")
-			 (DF    "s")])
+			 (DF    "s")
+			 (KF	"v")])
 
 ;; Appropriate type for add ops (and other simple FP ops)
 (define_mode_attr VStype_simple	[(V2DF "vecdouble")
@@ -257,6 +285,30 @@  (define_insn_and_split "*vsx_le_perm_loa
   [(set_attr "type" "vecload")
    (set_attr "length" "8")])
 
+;; KFmode/TFmode aren't vector types.  Use V2DImode instead.
+(define_insn_and_split "*vsx_le_perm_load_<mode>"
+  [(set (match_operand:VSX_F128 0 "vsx_register_operand" "=wa")
+        (match_operand:VSX_F128 1 "memory_operand" "Z"))]
+  "!BYTES_BIG_ENDIAN && TARGET_VSX"
+  "#"
+  "!BYTES_BIG_ENDIAN && TARGET_VSX"
+  [(set (match_dup 4)
+        (vec_select:V2DI (match_dup 3)
+			 (parallel [(const_int 1) (const_int 0)])))
+   (set (match_dup 2)
+        (vec_select:V2DI (match_dup 4)
+			 (parallel [(const_int 1) (const_int 0)])))]
+  "
+{
+  operands[2] = gen_lowpart (V2DImode, operands[0]);
+  operands[3] = gen_lowpart (V2DImode, operands[1]);
+  operands[4] = can_create_pseudo_p () ? gen_reg_rtx_and_attrs (operands[2])
+                                       : operands[2];
+}
+  "
+  [(set_attr "type" "vecload")
+   (set_attr "length" "8")])
+
 (define_insn_and_split "*vsx_le_perm_load_<mode>"
   [(set (match_operand:VSX_W 0 "vsx_register_operand" "=wa")
         (match_operand:VSX_W 1 "memory_operand" "Z"))]
@@ -394,6 +446,51 @@  (define_split
   "")
 
 (define_insn "*vsx_le_perm_store_<mode>"
+  [(set (match_operand:VSX_F128 0 "memory_operand" "=Z")
+        (match_operand:VSX_F128 1 "vsx_register_operand" "+wa"))]
+  "!BYTES_BIG_ENDIAN && TARGET_VSX"
+  "#"
+  [(set_attr "type" "vecstore")
+   (set_attr "length" "12")])
+
+(define_split
+  [(set (match_operand:VSX_F128 0 "memory_operand" "")
+        (match_operand:VSX_F128 1 "vsx_register_operand" ""))]
+  "!BYTES_BIG_ENDIAN && TARGET_VSX && !reload_completed"
+  [(set (match_dup 4)
+        (vec_select:V2DI (match_dup 3)
+			 (parallel [(const_int 1) (const_int 0)])))
+   (set (match_dup 2)
+        (vec_select:V2DI (match_dup 4)
+			 (parallel [(const_int 1) (const_int 0)])))]
+{
+  operands[2] = gen_lowpart (V2DImode, operands[0]);
+  operands[3] = gen_lowpart (V2DImode, operands[1]);
+  operands[4] = can_create_pseudo_p () ? gen_reg_rtx_and_attrs (operands[3])
+                                       : operands[3];
+})
+
+;; The post-reload split requires that we re-permute the source
+;; register in case it is still live.
+(define_split
+  [(set (match_operand:VSX_F128 0 "memory_operand" "")
+        (match_operand:VSX_F128 1 "vsx_register_operand" ""))]
+  "!BYTES_BIG_ENDIAN && TARGET_VSX && reload_completed"
+  [(set (match_dup 3)
+        (vec_select:V2DI (match_dup 3)
+			 (parallel [(const_int 1) (const_int 0)])))
+   (set (match_dup 2)
+        (vec_select:V2DI (match_dup 3)
+			 (parallel [(const_int 1) (const_int 0)])))
+   (set (match_dup 3)
+        (vec_select:V2DI (match_dup 3)
+			 (parallel [(const_int 1) (const_int 0)])))]
+{
+  operands[2] = gen_lowpart (V2DImode, operands[0]);
+  operands[3] = gen_lowpart (V2DImode, operands[1]);
+})
+
+(define_insn "*vsx_le_perm_store_<mode>"
   [(set (match_operand:VSX_W 0 "memory_operand" "=Z")
         (match_operand:VSX_W 1 "vsx_register_operand" "+wa"))]
   "!BYTES_BIG_ENDIAN && TARGET_VSX"
Index: gcc/config/rs6000/rs6000.h
===================================================================
--- gcc/config/rs6000/rs6000.h	(.../svn+ssh://meissner@gcc.gnu.org/svn/gcc/trunk)	(revision 212529)
+++ gcc/config/rs6000/rs6000.h	(working copy)
@@ -402,6 +402,29 @@  extern const char *host_detect_local_cpu
 #define TARGET_DEBUG_TARGET	(rs6000_debug & MASK_DEBUG_TARGET)
 #define TARGET_DEBUG_BUILTIN	(rs6000_debug & MASK_DEBUG_BUILTIN)
 
+/* __float128 enablement.  */
+#define TARGET_FLOAT128		(TARGET_FLOAT128_VSX || TARGET_FLOAT128_REF)
+
+/* Helper macros for TFmode.  Quad floating point (TFmode) can be either IBM
+   long double format that uses a pair of doubles, or IEEE 128-bit floating
+   point.  KFmode was added as a way to represent IEEE 128-bit floating point,
+   even if the default for long double is the IBM long double format.  */
+#define IEEE_128BIT_P(MODE)						\
+  (((MODE) == KFmode)							\
+   || (((MODE) == TFmode) && TARGET_IEEEQUAD && TARGET_LONG_DOUBLE_128))
+
+#define IBM_128BIT_P(MODE)						\
+  (((MODE) == TFmode) && !TARGET_IEEEQUAD && TARGET_LONG_DOUBLE_128)
+
+/* Helper macros to say whether a 128-bit floating point type can go in a
+   single vector register, or whether it needs paired scalar values.  */
+#define FLOAT128_VECTOR_P(MODE) (TARGET_FLOAT128_VSX && IEEE_128BIT_P (MODE))
+
+#define FLOAT128_2REG_P(MODE)						\
+  (IBM_128BIT_P (MODE)							\
+   || ((MODE) == TDmode)						\
+   || (!TARGET_FLOAT128_VSX && IEEE_128BIT_P (MODE)))
+
 /* Describe the vector unit used for arithmetic operations.  */
 extern enum rs6000_vector rs6000_vector_unit[];
 
@@ -559,6 +582,8 @@  extern int rs6000_vector_align[];
 #define MASK_DIRECT_MOVE		OPTION_MASK_DIRECT_MOVE
 #define MASK_DLMZB			OPTION_MASK_DLMZB
 #define MASK_EABI			OPTION_MASK_EABI
+#define MASK_FLOAT128_VSX		OPTION_MASK_FLOAT128_VSX
+#define MASK_FLOAT128_REF		OPTION_MASK_FLOAT128_REF
 #define MASK_FPRND			OPTION_MASK_FPRND
 #define MASK_P8_FUSION			OPTION_MASK_P8_FUSION
 #define MASK_HARD_FLOAT			OPTION_MASK_HARD_FLOAT
@@ -896,10 +921,9 @@  enum data_align { align_abi, align_opt, 
    aligned to 4 or 8 bytes.  */
 #define SLOW_UNALIGNED_ACCESS(MODE, ALIGN)				\
   (STRICT_ALIGNMENT							\
-   || (((MODE) == SFmode || (MODE) == DFmode || (MODE) == TFmode	\
-	|| (MODE) == SDmode || (MODE) == DDmode || (MODE) == TDmode)	\
-       && (ALIGN) < 32)							\
-   || (VECTOR_MODE_P ((MODE)) && (((int)(ALIGN)) < VECTOR_ALIGN (MODE))))
+   || (SCALAR_FLOAT_MODE_P (MODE) && (ALIGN) < 32)			\
+   || (VECTOR_MEM_VSX_OR_P8_VECTOR_P (MODE) && GET_MODE_SIZE (MODE) > 8	\
+       && (((int)(ALIGN)) < VECTOR_ALIGN (MODE))))
 
 
 /* Standard register usage.  */
@@ -1173,7 +1197,7 @@  enum data_align { align_abi, align_opt, 
    && ((MODE) == VOIDmode || ALTIVEC_OR_VSX_VECTOR_MODE (MODE))		\
    && FP_REGNO_P (REGNO)						\
    ? V2DFmode								\
-   : ((MODE) == TFmode && FP_REGNO_P (REGNO))				\
+   : (FLOAT128_2REG_P (MODE) && FP_REGNO_P (REGNO))			\
    ? DFmode								\
    : ((MODE) == TDmode && FP_REGNO_P (REGNO))				\
    ? DImode								\
@@ -1184,18 +1208,19 @@  enum data_align { align_abi, align_opt, 
      && (GET_MODE_SIZE (MODE) > 4)					\
      && INT_REGNO_P (REGNO)) ? 1 : 0)					\
    || (TARGET_VSX && FP_REGNO_P (REGNO)					\
-       && GET_MODE_SIZE (MODE) > 8 && ((MODE) != TDmode) 		\
-       && ((MODE) != TFmode)))
+       && GET_MODE_SIZE (MODE) > 8 && !FLOAT128_2REG_P (MODE)))
+
+#define VSX_VECTOR_MODE(MODE) ((MODE) == V4SFmode || (MODE) == V2DFmode)
+
+/* KFmode, and TFmode when it is IEEE 128-bit, are not really vector modes,
+   but we treat them as vectors for moves and similar operations.  */
 
-#define VSX_VECTOR_MODE(MODE)		\
-	 ((MODE) == V4SFmode		\
-	  || (MODE) == V2DFmode)	\
-
-#define ALTIVEC_VECTOR_MODE(MODE)	\
-	 ((MODE) == V16QImode		\
-	  || (MODE) == V8HImode		\
-	  || (MODE) == V4SFmode		\
-	  || (MODE) == V4SImode)
+#define ALTIVEC_VECTOR_MODE(MODE)					\
+  ((MODE) == V16QImode							\
+   || (MODE) == V8HImode						\
+   || (MODE) == V4SFmode						\
+   || (MODE) == V4SImode						\
+   || FLOAT128_VECTOR_P (MODE))
 
 #define ALTIVEC_OR_VSX_VECTOR_MODE(MODE)				\
   (ALTIVEC_VECTOR_MODE (MODE) || VSX_VECTOR_MODE (MODE)			\
@@ -1222,12 +1247,19 @@  enum data_align { align_abi, align_opt, 
 
    PTImode cannot tie with other modes because PTImode is restricted to even
    GPR registers, and TImode can go in any GPR as well as VSX registers (PR
-   57744).  */
+   57744).
+
+   Altivec/VSX vector tests come before the scalar float mode tests, so that
+   IEEE 128-bit floating point on VSX systems ties with other vectors.  */
 #define MODES_TIEABLE_P(MODE1, MODE2)		\
   ((MODE1) == PTImode				\
    ? (MODE2) == PTImode				\
    : (MODE2) == PTImode				\
    ? 0						\
+   : ALTIVEC_OR_VSX_VECTOR_MODE (MODE1)		\
+   ? ALTIVEC_OR_VSX_VECTOR_MODE (MODE2)		\
+   : ALTIVEC_OR_VSX_VECTOR_MODE (MODE2)		\
+   ? 0						\
    : SCALAR_FLOAT_MODE_P (MODE1)		\
    ? SCALAR_FLOAT_MODE_P (MODE2)		\
    : SCALAR_FLOAT_MODE_P (MODE2)		\
@@ -1240,10 +1272,6 @@  enum data_align { align_abi, align_opt, 
    ? SPE_VECTOR_MODE (MODE2)			\
    : SPE_VECTOR_MODE (MODE2)			\
    ? 0						\
-   : ALTIVEC_OR_VSX_VECTOR_MODE (MODE1)		\
-   ? ALTIVEC_OR_VSX_VECTOR_MODE (MODE2)		\
-   : ALTIVEC_OR_VSX_VECTOR_MODE (MODE2)		\
-   ? 0						\
    : 1)
 
 /* Post-reload, we can't use any new AltiVec registers, as we already
@@ -1740,6 +1768,7 @@  typedef struct rs6000_args
 				   GPR space (darwin64) */
   int named;			/* false for varargs params */
   int escapes;			/* if function visible outside tu */
+  int libcall;			/* If this is a compiler generated call.  */
 } CUMULATIVE_ARGS;
 
 /* Initialize a variable CUM of type CUMULATIVE_ARGS
@@ -2639,6 +2668,7 @@  enum rs6000_builtin_type_index
   RS6000_BTI_dfloat64,		 /* dfloat64_type_node */
   RS6000_BTI_dfloat128,		 /* dfloat128_type_node */
   RS6000_BTI_void,	         /* void_type_node */
+  RS6000_BTI_ieee128_float,	 /* IEEE 128-bit floating point */
   RS6000_BTI_MAX
 };
 
@@ -2693,6 +2723,7 @@  enum rs6000_builtin_type_index
 #define dfloat64_type_internal_node	 (rs6000_builtin_types[RS6000_BTI_dfloat64])
 #define dfloat128_type_internal_node	 (rs6000_builtin_types[RS6000_BTI_dfloat128])
 #define void_type_internal_node		 (rs6000_builtin_types[RS6000_BTI_void])
+#define ieee128_float_type_node		 (rs6000_builtin_types[RS6000_BTI_ieee128_float])
 
 extern GTY(()) tree rs6000_builtin_types[RS6000_BTI_MAX];
 extern GTY(()) tree rs6000_builtin_decls[RS6000_BUILTIN_COUNT];
Index: gcc/config/rs6000/altivec.md
===================================================================
--- gcc/config/rs6000/altivec.md	(.../svn+ssh://meissner@gcc.gnu.org/svn/gcc/trunk)	(revision 212529)
+++ gcc/config/rs6000/altivec.md	(working copy)
@@ -168,10 +168,27 @@  (define_mode_iterator VF [V4SF])
 (define_mode_iterator V [V4SI V8HI V16QI V4SF])
 ;; Vec modes for move/logical/permute ops, include vector types for move not
 ;; otherwise handled by altivec (v2df, v2di, ti)
-(define_mode_iterator VM [V4SI V8HI V16QI V4SF V2DF V2DI V1TI TI])
+(define_mode_iterator VM [V4SI
+			  V8HI
+			  V16QI
+			  V4SF
+			  V2DF
+			  V2DI
+			  V1TI
+			  TI
+			  (KF "FLOAT128_VECTOR_P (KFmode)")
+			  (TF "FLOAT128_VECTOR_P (TFmode)")])
 
 ;; Like VM, except don't do TImode
-(define_mode_iterator VM2 [V4SI V8HI V16QI V4SF V2DF V2DI V1TI])
+(define_mode_iterator VM2 [V4SI
+			   V8HI
+			   V16QI
+			   V4SF
+			   V2DF
+			   V2DI
+			   V1TI
+			   (KF "FLOAT128_VECTOR_P (KFmode)")
+			   (TF "FLOAT128_VECTOR_P (TFmode)")])
 
 (define_mode_attr VI_char [(V2DI "d") (V4SI "w") (V8HI "h") (V16QI "b")])
 (define_mode_attr VI_scalar [(V2DI "DI") (V4SI "SI") (V8HI "HI") (V16QI "QI")])
@@ -3446,3 +3463,32 @@  (define_peephole2
 				  (match_dup 3)]
 				 UNSPEC_BCD_ADD_SUB)
 		    (match_dup 4)))])])
+
+
+;; Return constant 0x80000000000000000000000000000000 in an Altivec register.
+
+(define_expand "altivec_high_bit"
+  [(set (match_dup 1)
+	(vec_duplicate:V16QI (const_int 7)))
+   (set (match_dup 2)
+	(ashift:V16QI (match_dup 1)
+		      (match_dup 1)))
+   (set (match_dup 3)
+	(match_dup 4))
+   (set (match_operand:V16QI 0 "register_operand" "")
+	(unspec:V16QI [(match_dup 2)
+		       (match_dup 3)
+		       (const_int 15)] UNSPEC_VLSDOI))]
+  "TARGET_ALTIVEC"
+{
+  if (can_create_pseudo_p ())
+    {
+      operands[1] = gen_reg_rtx (V16QImode);
+      operands[2] = gen_reg_rtx (V16QImode);
+      operands[3] = gen_reg_rtx (V16QImode);
+    }
+  else
+    operands[1] = operands[2] = operands[3] = operands[0];
+
+  operands[4] = CONST0_RTX (V16QImode);
+})
Index: gcc/config/rs6000/rs6000.md
===================================================================
--- gcc/config/rs6000/rs6000.md	(.../svn+ssh://meissner@gcc.gnu.org/svn/gcc/trunk)	(revision 212529)
+++ gcc/config/rs6000/rs6000.md	(working copy)
@@ -338,7 +338,8 @@  (define_mode_iterator FP [
    && (TARGET_FPRS || TARGET_E500_DOUBLE)
    && TARGET_LONG_DOUBLE_128")
   (DD "TARGET_DFP")
-  (TD "TARGET_DFP")])
+  (TD "TARGET_DFP")
+  (KF "TARGET_FLOAT128")])
 
 ; Any fma capable floating-point mode.
 (define_mode_iterator FMA_F [
@@ -354,9 +355,13 @@  (define_mode_iterator FMA_F [
 (define_mode_iterator FMOVE32 [SF SD])
 (define_mode_iterator FMOVE64 [DF DD])
 (define_mode_iterator FMOVE64X [DI DF DD])
-(define_mode_iterator FMOVE128 [(TF "!TARGET_IEEEQUAD && TARGET_LONG_DOUBLE_128")
+(define_mode_iterator FMOVE128 [(TF "TARGET_LONG_DOUBLE_128")
 				(TD "TARGET_HARD_FLOAT && TARGET_FPRS")])
 
+(define_mode_iterator FMOVE128_FPR [(TF "TARGET_LONG_DOUBLE_128")
+				    (KF "TARGET_FLOAT128_REF")
+				    (TD "TARGET_HARD_FLOAT && TARGET_FPRS")])
+
 ; Iterators for 128 bit types for direct move
 (define_mode_iterator FMOVE128_GPR [(TI    "TARGET_VSX_TIMODE")
 				    (V16QI "")
@@ -365,7 +370,12 @@  (define_mode_iterator FMOVE128_GPR [(TI 
 				    (V4SF  "")
 				    (V2DI  "")
 				    (V2DF  "")
-				    (V1TI  "")])
+				    (V1TI  "")
+				    (KF    "")
+				    (TF    "")])
+
+; Iterator for 128-bit VSX types for pack/unpack
+(define_mode_iterator FMOVE128_VSX [V1TI KF])
 
 ; Whether a floating point move is ok, don't allow SD without hardware FP
 (define_mode_attr fmove_ok [(SF "")
@@ -404,6 +414,16 @@  (define_mode_iterator RECIPF [SF DF V4SF
 ; Iterator for just SF/DF
 (define_mode_iterator SFDF [SF DF])
 
+; Iterator for float128 floating conversions
+(define_mode_iterator FLOAT128_SFDFTF [
+    (SF "TARGET_FLOAT128")
+    (DF "TARGET_FLOAT128")
+    (TF "TARGET_FLOAT128 && !TARGET_IEEEQUAD && TARGET_LONG_DOUBLE_128")])
+
+; Iterator for 128-bit floating point
+(define_mode_iterator TFKF [(KF "TARGET_FLOAT128")
+			    (TF "TARGET_LONG_DOUBLE_128")])
+
 ; SF/DF suffix for traditional floating instructions
 (define_mode_attr Ftrad		[(SF "s") (DF "")])
 
@@ -9023,9 +9043,10 @@  (define_expand "mov<mode>"
 ;; problematical.  Don't allow direct move for this case.
 
 (define_insn_and_split "*mov<mode>_64bit_dm"
-  [(set (match_operand:FMOVE128 0 "nonimmediate_operand" "=m,d,d,Y,r,r,r,wm")
-	(match_operand:FMOVE128 1 "input_operand" "d,m,d,r,YGHF,r,wm,r"))]
+  [(set (match_operand:FMOVE128_FPR 0 "nonimmediate_operand" "=m,d,d,Y,r,r,r,wm,wa,r")
+	(match_operand:FMOVE128_FPR 1 "input_operand" "d,m,d,r,YGHF,r,wm,r,j,j"))]
   "TARGET_HARD_FLOAT && TARGET_FPRS && TARGET_POWERPC64
+   && FLOAT128_2REG_P (<MODE>mode)
    && (<MODE>mode != TDmode || WORDS_BIG_ENDIAN)
    && (gpc_reg_operand (operands[0], <MODE>mode)
        || gpc_reg_operand (operands[1], <MODE>mode))"
@@ -9033,7 +9054,7 @@  (define_insn_and_split "*mov<mode>_64bit
   "&& reload_completed"
   [(pc)]
 { rs6000_split_multireg_move (operands[0], operands[1]); DONE; }
-  [(set_attr "length" "8,8,8,12,12,8,8,8")])
+  [(set_attr "length" "8,8,8,12,12,8,8,8,8,8")])
 
 (define_insn_and_split "*movtd_64bit_nodm"
   [(set (match_operand:TD 0 "nonimmediate_operand" "=m,d,d,Y,r,r")
@@ -9048,16 +9069,19 @@  (define_insn_and_split "*movtd_64bit_nod
   [(set_attr "length" "8,8,8,12,12,8")])
 
 (define_insn_and_split "*mov<mode>_32bit"
-  [(set (match_operand:FMOVE128 0 "nonimmediate_operand" "=m,d,d,Y,r,r")
-	(match_operand:FMOVE128 1 "input_operand" "d,m,d,r,YGHF,r"))]
+  [(set (match_operand:FMOVE128_FPR 0 "nonimmediate_operand" "=m,d,d,Y,r,r,wa,r")
+	(match_operand:FMOVE128_FPR 1 "input_operand" "d,m,d,r,YGHF,r,j,j"))]
   "TARGET_HARD_FLOAT && TARGET_FPRS && !TARGET_POWERPC64
+   && (FLOAT128_2REG_P (<MODE>mode)
+       || int_reg_operand_not_pseudo (operands[0], <MODE>mode)
+       || int_reg_operand_not_pseudo (operands[1], <MODE>mode))
    && (gpc_reg_operand (operands[0], <MODE>mode)
        || gpc_reg_operand (operands[1], <MODE>mode))"
   "#"
   "&& reload_completed"
   [(pc)]
 { rs6000_split_multireg_move (operands[0], operands[1]); DONE; }
-  [(set_attr "length" "8,8,8,20,20,16")])
+  [(set_attr "length" "8,8,8,20,20,16,8,16")])
 
 (define_insn_and_split "*mov<mode>_softfloat"
   [(set (match_operand:FMOVE128 0 "rs6000_nonimmediate_operand" "=Y,r,r")
@@ -9074,12 +9098,12 @@  (define_insn_and_split "*mov<mode>_softf
 (define_expand "extenddftf2"
   [(set (match_operand:TF 0 "nonimmediate_operand" "")
 	(float_extend:TF (match_operand:DF 1 "input_operand" "")))]
-  "!TARGET_IEEEQUAD
-   && TARGET_HARD_FLOAT
-   && (TARGET_FPRS || TARGET_E500_DOUBLE)
+  "TARGET_HARD_FLOAT && (TARGET_FPRS || TARGET_E500_DOUBLE)
    && TARGET_LONG_DOUBLE_128"
 {
-  if (TARGET_E500_DOUBLE)
+  if (TARGET_IEEEQUAD)
+    rs6000_expand_float128_convert (operands[0], operands[1], false);
+  else if (TARGET_E500_DOUBLE)
     emit_insn (gen_spe_extenddftf2 (operands[0], operands[1]));
   else
     emit_insn (gen_extenddftf2_fprs (operands[0], operands[1]));
@@ -9123,25 +9147,34 @@  (define_insn_and_split "*extenddftf2_int
 (define_expand "extendsftf2"
   [(set (match_operand:TF 0 "nonimmediate_operand" "")
 	(float_extend:TF (match_operand:SF 1 "gpc_reg_operand" "")))]
-  "!TARGET_IEEEQUAD
-   && TARGET_HARD_FLOAT
+  "TARGET_HARD_FLOAT
    && (TARGET_FPRS || TARGET_E500_DOUBLE)
    && TARGET_LONG_DOUBLE_128"
 {
-  rtx tmp = gen_reg_rtx (DFmode);
-  emit_insn (gen_extendsfdf2 (tmp, operands[1]));
-  emit_insn (gen_extenddftf2 (operands[0], tmp));
+  if (TARGET_IEEEQUAD)
+    rs6000_expand_float128_convert (operands[0], operands[1], false);
+  else
+    {
+      rtx tmp = gen_reg_rtx (DFmode);
+      emit_insn (gen_extendsfdf2 (tmp, operands[1]));
+      emit_insn (gen_extenddftf2 (operands[0], tmp));
+    }
   DONE;
 })
 
 (define_expand "trunctfdf2"
   [(set (match_operand:DF 0 "gpc_reg_operand" "")
 	(float_truncate:DF (match_operand:TF 1 "gpc_reg_operand" "")))]
-  "!TARGET_IEEEQUAD
-   && TARGET_HARD_FLOAT
+  "TARGET_HARD_FLOAT
    && (TARGET_FPRS || TARGET_E500_DOUBLE)
    && TARGET_LONG_DOUBLE_128"
-  "")
+{
+  if (TARGET_IEEEQUAD)
+    {
+      rs6000_expand_float128_convert (operands[0], operands[1], false);
+      DONE;
+    }
+})
 
 (define_insn_and_split "trunctfdf2_internal1"
   [(set (match_operand:DF 0 "gpc_reg_operand" "=d,?d")
@@ -9177,7 +9210,9 @@  (define_expand "trunctfsf2"
    && (TARGET_FPRS || TARGET_E500_DOUBLE)
    && TARGET_LONG_DOUBLE_128"
 {
-  if (TARGET_E500_DOUBLE)
+  if (TARGET_IEEEQUAD)
+    rs6000_expand_float128_convert (operands[0], operands[1], false);
+  else if (TARGET_E500_DOUBLE)
     emit_insn (gen_spe_trunctfsf2 (operands[0], operands[1]));
   else
     emit_insn (gen_trunctfsf2_fprs (operands[0], operands[1]));
@@ -9228,10 +9263,12 @@  (define_insn "fix_trunc_helper"
 (define_expand "fix_trunctfsi2"
   [(set (match_operand:SI 0 "gpc_reg_operand" "")
 	(fix:SI (match_operand:TF 1 "gpc_reg_operand" "")))]
-  "!TARGET_IEEEQUAD && TARGET_HARD_FLOAT
+  "TARGET_HARD_FLOAT
    && (TARGET_FPRS || TARGET_E500_DOUBLE) && TARGET_LONG_DOUBLE_128"
 {
-  if (TARGET_E500_DOUBLE)
+  if (TARGET_IEEEQUAD)
+    rs6000_expand_float128_convert (operands[0], operands[1], false);
+  else if (TARGET_E500_DOUBLE)
     emit_insn (gen_spe_fix_trunctfsi2 (operands[0], operands[1]));
   else
     emit_insn (gen_fix_trunctfsi2_fprs (operands[0], operands[1]));
@@ -9279,20 +9316,73 @@  (define_insn_and_split "*fix_trunctfsi2_
   DONE;
 })
 
-(define_expand "negtf2"
-  [(set (match_operand:TF 0 "gpc_reg_operand" "")
-	(neg:TF (match_operand:TF 1 "gpc_reg_operand" "")))]
-  "!TARGET_IEEEQUAD
-   && TARGET_HARD_FLOAT
-   && (TARGET_FPRS || TARGET_E500_DOUBLE)
-   && TARGET_LONG_DOUBLE_128"
-  "")
+(define_expand "fix_trunctfdi2"
+  [(set (match_operand:DI 0 "nonimmediate_operand" "")
+	(fix:DI (match_operand:TF 1 "gpc_reg_operand" "")))]
+  "TARGET_IEEEQUAD && TARGET_LONG_DOUBLE_128"
+{
+  rs6000_expand_float128_convert (operands[0], operands[1], false);
+  DONE;
+})
+
+(define_expand "fixuns_trunctf<mode>2"
+  [(set (match_operand:SDI 0 "nonimmediate_operand" "")
+	(unsigned_fix:SDI (match_operand:TF 1 "gpc_reg_operand" "")))]
+  "TARGET_IEEEQUAD && TARGET_LONG_DOUBLE_128"
+{
+  rs6000_expand_float128_convert (operands[0], operands[1], true);
+  DONE;
+})
+
+(define_expand "floatditf2"
+  [(set (match_operand:TF 0 "nonimmediate_operand" "")
+	(float:TF (match_operand:DI 1 "gpc_reg_operand" "")))]
+  "TARGET_IEEEQUAD && TARGET_LONG_DOUBLE_128"
+{
+  rs6000_expand_float128_convert (operands[0], operands[1], false);
+  DONE;
+})
+
+(define_expand "floatuns<mode>tf2"
+  [(set (match_operand:TF 0 "nonimmediate_operand" "")
+	(unsigned_float:TF (match_operand:SDI 1 "gpc_reg_operand" "")))]
+  "TARGET_IEEEQUAD && TARGET_LONG_DOUBLE_128"
+{
+  rs6000_expand_float128_convert (operands[0], operands[1], true);
+  DONE;
+})
+
+(define_expand "neg<mode>2"
+  [(set (match_operand:TFKF 0 "gpc_reg_operand" "")
+	(neg:TFKF (match_operand:TFKF 1 "gpc_reg_operand" "")))]
+  "IEEE_128BIT_P (<MODE>mode)
+   || (IBM_128BIT_P (<MODE>mode)
+       && TARGET_HARD_FLOAT
+       && (TARGET_FPRS || TARGET_E500_DOUBLE))"
+  "
+{
+  if (IEEE_128BIT_P (<MODE>mode))
+    {
+      if (TARGET_FLOAT128_VSX)
+	emit_insn (gen_ieee_128bit_vsx_neg<mode>2 (operands[0], operands[1]));
+      else
+	{
+	  rtx libfunc = optab_libfunc (neg_optab, <MODE>mode);
+	  rtx target = emit_library_call_value (libfunc, operands[0], LCT_CONST,
+						<MODE>mode, 1,
+						operands[1], <MODE>mode);
+
+	  if (target && !rtx_equal_p (target, operands[0]))
+	    emit_move_insn (operands[0], target);
+	}
+      DONE;
+    }
+}")
 
 (define_insn "negtf2_internal"
   [(set (match_operand:TF 0 "gpc_reg_operand" "=d")
 	(neg:TF (match_operand:TF 1 "gpc_reg_operand" "d")))]
-  "!TARGET_IEEEQUAD
-   && TARGET_HARD_FLOAT && TARGET_FPRS && TARGET_LONG_DOUBLE_128"
+  "TARGET_HARD_FLOAT && TARGET_FPRS && IBM_128BIT_P (TFmode)"
   "*
 {
   if (REGNO (operands[0]) == REGNO (operands[1]) + 1)
@@ -9303,16 +9393,29 @@  (define_insn "negtf2_internal"
   [(set_attr "type" "fp")
    (set_attr "length" "8")])
 
-(define_expand "abstf2"
-  [(set (match_operand:TF 0 "gpc_reg_operand" "")
-	(abs:TF (match_operand:TF 1 "gpc_reg_operand" "")))]
-  "!TARGET_IEEEQUAD
-   && TARGET_HARD_FLOAT
-   && (TARGET_FPRS || TARGET_E500_DOUBLE)
-   && TARGET_LONG_DOUBLE_128"
+(define_expand "abs<mode>2"
+  [(set (match_operand:TFKF 0 "gpc_reg_operand" "")
+	(abs:TFKF (match_operand:TFKF 1 "gpc_reg_operand" "")))]
+  "IEEE_128BIT_P (<MODE>mode)
+   || (IBM_128BIT_P (<MODE>mode)
+       && TARGET_HARD_FLOAT
+       && (TARGET_FPRS || TARGET_E500_DOUBLE))"
   "
 {
-  rtx label = gen_label_rtx ();
+  rtx label;
+
+  if (IEEE_128BIT_P (<MODE>mode))
+    {
+      if (TARGET_FLOAT128_VSX)
+	{
+	  emit_insn (gen_ieee_128bit_vsx_abs<mode>2 (operands[0], operands[1]));
+	  DONE;
+	}
+      else
+	FAIL;
+    }
+
+  label = gen_label_rtx ();
   if (TARGET_E500_DOUBLE)
     {
       if (flag_finite_math_only && !flag_trapping_math)
@@ -9348,6 +9451,166 @@  (define_expand "abstf2_internal"
   operands[5] = simplify_gen_subreg (DFmode, operands[0], TFmode, hi_word);
   operands[6] = simplify_gen_subreg (DFmode, operands[0], TFmode, lo_word);
 }")
+
+
+
+;; IEEE 128-bit negate
+
+;; We have 2 insns here for negate and absolute value.  The first uses
+;; match_scratch so that phases like combine can recognize neg/abs as generic
+;; insns.  The second insn, created by the first split pass, uses the register
+;; that holds the sign-bit mask.  Later GCSE passes can then combine multiple
+;; uses of neg/abs to create the constant just once.
+
+(define_insn_and_split "ieee_128bit_vsx_neg<mode>2"
+  [(set (match_operand:TFKF 0 "register_operand" "=wa")
+	(neg:TFKF (match_operand:TFKF 1 "register_operand" "wa")))
+   (clobber (match_scratch:V16QI 2 "=v"))]
+  "TARGET_FLOAT128_VSX && IEEE_128BIT_P (<MODE>mode)"
+  "#"
+  ""
+  [(parallel [(set (match_dup 0)
+		   (neg:TFKF (match_dup 1)))
+	      (use (match_dup 2))])]
+{
+  if (GET_CODE (operands[2]) == SCRATCH)
+    operands[2] = gen_reg_rtx (V16QImode);
+
+  operands[3] = gen_reg_rtx (V16QImode);
+  emit_insn (gen_altivec_high_bit (operands[2]));
+}
+  [(set_attr "length" "8")
+   (set_attr "type" "vecsimple")])
+
+(define_insn "*ieee_128bit_vsx_neg<mode>2_internal"
+  [(set (match_operand:TFKF 0 "register_operand" "=wa")
+	(neg:TFKF (match_operand:TFKF 1 "register_operand" "wa")))
+   (use (match_operand:V16QI 2 "register_operand" "v"))]
+  "TARGET_FLOAT128_VSX"
+  "xxlxor %x0,%x1,%x2"
+  [(set_attr "length" "4")
+   (set_attr "type" "vecsimple")])
+
+;; IEEE 128-bit absolute value
+(define_insn_and_split "ieee_128bit_vsx_abs<mode>2"
+  [(set (match_operand:TFKF 0 "register_operand" "=wa")
+	(abs:TFKF (match_operand:TFKF 1 "register_operand" "wa")))
+   (clobber (match_scratch:V16QI 2 "=v"))]
+  "TARGET_FLOAT128_VSX && IEEE_128BIT_P (<MODE>mode)"
+  "#"
+  ""
+  [(parallel [(set (match_dup 0)
+		   (abs:TFKF (match_dup 1)))
+	      (use (match_dup 2))])]
+{
+  if (GET_CODE (operands[2]) == SCRATCH)
+    operands[2] = gen_reg_rtx (V16QImode);
+
+  operands[3] = gen_reg_rtx (V16QImode);
+  emit_insn (gen_altivec_high_bit (operands[2]));
+}
+  [(set_attr "length" "8")
+   (set_attr "type" "vecsimple")])
+
+(define_insn "*ieee_128bit_vsx_abs<mode>2_internal"
+  [(set (match_operand:TFKF 0 "register_operand" "=wa")
+	(abs:TFKF (match_operand:TFKF 1 "register_operand" "wa")))
+   (use (match_operand:V16QI 2 "register_operand" "v"))]
+  "TARGET_FLOAT128_VSX"
+  "xxlandc %x0,%x1,%x2"
+  [(set_attr "length" "4")
+   (set_attr "type" "vecsimple")])
+
+;; IEEE 128-bit negative absolute value
+(define_insn_and_split "*ieee_128bit_vsx_nabs<mode>2"
+  [(set (match_operand:TFKF 0 "register_operand" "=wa")
+	(neg:TFKF (abs:TFKF (match_operand:TFKF 1 "register_operand" "wa"))))
+   (clobber (match_scratch:V16QI 2 "=v"))]
+  "TARGET_FLOAT128_VSX && IEEE_128BIT_P (<MODE>mode)"
+  "#"
+  ""
+  [(parallel [(set (match_dup 0)
+		   (neg:TFKF (abs:TFKF (match_dup 1))))
+	      (use (match_dup 2))])]
+{
+  if (GET_CODE (operands[2]) == SCRATCH)
+    operands[2] = gen_reg_rtx (V16QImode);
+
+  operands[3] = gen_reg_rtx (V16QImode);
+  emit_insn (gen_altivec_high_bit (operands[2]));
+}
+  [(set_attr "length" "8")
+   (set_attr "type" "vecsimple")])
+
+(define_insn "*ieee_128bit_vsx_nabs<mode>2_internal"
+  [(set (match_operand:TFKF 0 "register_operand" "=wa")
+	(neg:TFKF (abs:TFKF (match_operand:TFKF 1 "register_operand" "wa"))))
+   (use (match_operand:V16QI 2 "register_operand" "v"))]
+  "TARGET_FLOAT128_VSX"
+  "xxlor %x0,%x1,%x2"
+  [(set_attr "length" "4")
+   (set_attr "type" "vecsimple")])
+
+;; Float128 conversion functions.  These expand to library function calls.  We
+;; need these expanders here, so that we can select the appropriate function to
+;; call, based on whether -mfloat128-vsx or -mfloat128-ref is used.
+
+(define_expand "extend<mode>kf2"
+  [(set (match_operand:KF 0 "nonimmediate_operand" "")
+	(float_extend:KF
+	 (match_operand:FLOAT128_SFDFTF 1 "gpc_reg_operand" "")))]
+  "TARGET_FLOAT128"
+{
+  rs6000_expand_float128_convert (operands[0], operands[1], false);
+  DONE;
+})
+
+(define_expand "trunckf<mode>2"
+  [(set (match_operand:FLOAT128_SFDFTF 0 "nonimmediate_operand" "")
+	(float_truncate:FLOAT128_SFDFTF
+	 (match_operand:KF 1 "gpc_reg_operand" "")))]
+  "TARGET_FLOAT128"
+{
+  rs6000_expand_float128_convert (operands[0], operands[1], false);
+  DONE;
+})
+
+(define_expand "fix_trunckf<mode>2"
+  [(set (match_operand:SDI 0 "nonimmediate_operand" "")
+	(fix:SDI (match_operand:KF 1 "gpc_reg_operand" "")))]
+  "TARGET_FLOAT128"
+{
+  rs6000_expand_float128_convert (operands[0], operands[1], false);
+  DONE;
+})
+
+(define_expand "fixuns_trunckf<mode>2"
+  [(set (match_operand:SDI 0 "nonimmediate_operand" "")
+	(unsigned_fix:SDI (match_operand:KF 1 "gpc_reg_operand" "")))]
+  "TARGET_FLOAT128"
+{
+  rs6000_expand_float128_convert (operands[0], operands[1], true);
+  DONE;
+})
+
+(define_expand "float<mode>kf2"
+  [(set (match_operand:KF 0 "nonimmediate_operand" "")
+	(float:KF (match_operand:SDI 1 "gpc_reg_operand" "")))]
+  "TARGET_FLOAT128"
+{
+  rs6000_expand_float128_convert (operands[0], operands[1], false);
+  DONE;
+})
+
+(define_expand "floatuns<mode>kf2"
+  [(set (match_operand:KF 0 "nonimmediate_operand" "")
+	(unsigned_float:KF (match_operand:SDI 1 "gpc_reg_operand" "")))]
+  "TARGET_FLOAT128"
+{
+  rs6000_expand_float128_convert (operands[0], operands[1], true);
+  DONE;
+})
+
 
 ;; Reload helper functions used by rs6000_secondary_reload.  The patterns all
 ;; must have 3 arguments, and scratch register constraint must be a single
@@ -15256,7 +15519,9 @@  (define_insn "div<div_extend>_<mode>"
 ;; Pack/unpack 128-bit floating point types that take 2 scalar registers
 
 ; Type of the 64-bit part when packing/unpacking 128-bit floating point types
-(define_mode_attr FP128_64 [(TF "DF") (TD "DI")])
+(define_mode_attr FP128_64 [(TF "DF")
+			    (TD "DI")
+			    (KF "DI")])
 
 (define_expand "unpack<mode>"
   [(set (match_operand:<FP128_64> 0 "nonimmediate_operand" "")
@@ -15264,7 +15529,7 @@  (define_expand "unpack<mode>"
 	 [(match_operand:FMOVE128 1 "register_operand" "")
 	  (match_operand:QI 2 "const_0_to_1_operand" "")]
 	 UNSPEC_UNPACK_128BIT))]
-  ""
+  "FLOAT128_2REG_P (<MODE>mode)"
   "")
 
 (define_insn_and_split "unpack<mode>_dm"
@@ -15273,7 +15538,7 @@  (define_insn_and_split "unpack<mode>_dm"
 	 [(match_operand:FMOVE128 1 "register_operand" "d,d,r,d,r")
 	  (match_operand:QI 2 "const_0_to_1_operand" "i,i,i,i,i")]
 	 UNSPEC_UNPACK_128BIT))]
-  "TARGET_POWERPC64 && TARGET_DIRECT_MOVE"
+  "TARGET_POWERPC64 && TARGET_DIRECT_MOVE && FLOAT128_2REG_P (<MODE>mode)"
   "#"
   "&& reload_completed"
   [(set (match_dup 0) (match_dup 3))]
@@ -15297,7 +15562,7 @@  (define_insn_and_split "unpack<mode>_nod
 	 [(match_operand:FMOVE128 1 "register_operand" "d,d")
 	  (match_operand:QI 2 "const_0_to_1_operand" "i,i")]
 	 UNSPEC_UNPACK_128BIT))]
-  "!TARGET_POWERPC64 || !TARGET_DIRECT_MOVE"
+  "(!TARGET_POWERPC64 || !TARGET_DIRECT_MOVE) && FLOAT128_2REG_P (<MODE>mode)"
   "#"
   "&& reload_completed"
   [(set (match_dup 0) (match_dup 3))]
@@ -15321,7 +15586,7 @@  (define_insn_and_split "pack<mode>"
 	 [(match_operand:<FP128_64> 1 "register_operand" "0,d")
 	  (match_operand:<FP128_64> 2 "register_operand" "d,d")]
 	 UNSPEC_PACK_128BIT))]
-  ""
+  "FLOAT128_2REG_P (<MODE>mode)"
   "@
    fmr %L0,%2
    #"
@@ -15341,12 +15606,12 @@  (define_insn_and_split "pack<mode>"
   [(set_attr "type" "fp,fp")
    (set_attr "length" "4,8")])
 
-(define_insn "unpackv1ti"
+(define_insn "unpack<mode>"
   [(set (match_operand:DI 0 "register_operand" "=d,d")
-	(unspec:DI [(match_operand:V1TI 1 "register_operand" "0,wa")
+	(unspec:DI [(match_operand:FMOVE128_VSX 1 "register_operand" "0,wa")
 		    (match_operand:QI 2 "const_0_to_1_operand" "O,i")]
 	 UNSPEC_UNPACK_128BIT))]
-  "TARGET_VSX"
+  "VECTOR_MEM_ALTIVEC_OR_VSX_P (<MODE>mode)"
 {
   if (REGNO (operands[0]) == REGNO (operands[1]) && INTVAL (operands[2]) == 0)
     return ASM_COMMENT_START " xxpermdi to same register";
@@ -15357,9 +15622,9 @@  (define_insn "unpackv1ti"
   [(set_attr "type" "vecperm")
    (set_attr "length" "4")])
 
-(define_insn "packv1ti"
-  [(set (match_operand:V1TI 0 "register_operand" "=wa")
-	(unspec:V1TI
+(define_insn "pack<mode>"
+  [(set (match_operand:FMOVE128_VSX 0 "register_operand" "=wa")
+	(unspec:FMOVE128_VSX
 	 [(match_operand:DI 1 "register_operand" "d")
 	  (match_operand:DI 2 "register_operand" "d")]
 	 UNSPEC_PACK_128BIT))]