RFC, add support for __float128/__ibm128 types on PowerPC

Message ID 20140429223032.GA21674@ibm-tiger.the-meissners.org
State New

Commit Message

Michael Meissner April 29, 2014, 10:30 p.m. UTC
This patch adds support for a new type (__float128) on the PowerPC to allow
people to use the 128-bit IEEE floating point format instead of the traditional
IBM double-double that has been used in the Linux compilers.  For now, long
double will continue to use the IBM double-double format.
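
As a rough illustration of the double-double idea (a minimal sketch, not code
from the patch): a double-double value is an unevaluated sum of two doubles,
hi + lo.  The struct and function names below are hypothetical, and the sketch
assumes round-to-nearest IEEE double arithmetic with no excess precision.

```c
#include <assert.h>

/* Hypothetical model of an IBM double-double value: the number
   represented is hi + lo, with lo much smaller in magnitude than hi.  */
struct dd
{
  double hi;
  double lo;
};

/* Knuth's two-sum: split a + b into a rounded sum (hi) and the exact
   rounding error (lo).  Assumes round-to-nearest double arithmetic.  */
static struct dd
dd_two_sum (double a, double b)
{
  struct dd r;
  double s = a + b;
  double bb = s - a;

  r.hi = s;
  r.lo = (a - (s - bb)) + (b - bb);
  return r;
}

/* Collapse back to a plain double; the low part is rounded away when it
   is below half an ulp of hi.  */
static double
dd_to_double (struct dd x)
{
  return x.hi + x.lo;
}

/* Small self-check: 1 + 2^-80 does not fit in one double, but the pair
   keeps both terms.  */
static void
dd_self_check (void)
{
  struct dd x = dd_two_sum (1.0, 0x1p-80);

  assert (x.hi == 1.0);
  assert (x.lo == 0x1p-80);
  assert (dd_to_double (x) == 1.0);
}
```

The pair keeps the 2^-80 term that a single double would round away, which is
where the format gets its extra precision, while the exponent range stays that
of double.  IEEE 128-bit instead uses a single, wider exponent and significand.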

There has been an undocumented option to switch long double to IEEE 128-bit,
but right now, there are bugs I haven't ironed out on VSX systems.

In addition, I added another type (__ibm128) so that when the transition is
eventually made, people can use this type to get the old long double type.

I was wondering if people had any comments on the code so far, and things I
should do differently.  Note, I will be out on vacation May 6th - 14th, so I don't
expect to submit the patches until I get back.

At present, I don't have tests in the testsuite.

If you want to try it out, I have it as a svn branch:
svn+ssh://gcc.gnu.org/svn/gcc/branches/ibm/gcc-4_10-ieee

Now, some questions.  For Power server systems, we will want to always pass
IEEE 128-bit as a vector type, but I have code in here to pass it in the
traditional fashion in two FPRs if -mvsx is not used.  It is a little unclean
to have two different sets of support functions, and it could confuse users
if they compile some code with -mvsx and some without.  I can require -mvsx if
it is desirable to have only one ABI controlling it.  Or I can key this
selection off the ELF v2 ABI switch.
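
The two passing conventions in question can be sketched as follows.  This is a
simplified model, not the patch's FLOAT128_VECTOR_P / FLOAT128_2REG_P macros:
the enum and function names are hypothetical, and the rule encoded here (IEEE
128-bit occupies one vector register only when VSX is enabled, everything else
uses a pair of scalar registers) is only a reading of the description above.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the 128-bit binary floating point modes:
   TF is long double, JF is __ibm128, XF is __float128.  */
enum fp128_mode
{
  MODE_TF,
  MODE_JF,
  MODE_XF
};

/* True if the mode holds an IEEE 128-bit value.  IEEEQUAD models the
   case where long double itself has been switched to IEEE 128-bit.  */
static bool
mode_is_ieee128 (enum fp128_mode m, bool ieeequad)
{
  return m == MODE_XF || (m == MODE_TF && ieeequad);
}

/* Number of registers one value occupies: 1 vector register for IEEE
   128-bit with VSX, otherwise 2 adjacent scalar registers.  */
static int
fp128_regs_needed (enum fp128_mode m, bool vsx, bool ieeequad)
{
  assert (m == MODE_TF || m == MODE_JF || m == MODE_XF);

  if (vsx && mode_is_ieee128 (m, ieeequad))
    return 1;

  return 2;
}
```

Under this model, an object file built with -mvsx and one built without it
disagree on where a __float128 argument lives, which is the mixing hazard the
questions above are about.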

Do we need -mfloat128?  Or should it just be an undocumented switch that is
turned on with VSX and/or the ELF v2 ABI?

Would other users besides Linux users want to use the __float128 type?

I assume I haven't broken anything on the other non-Linux environments,
particularly the ones that already have IEEE 128-bit support, but it would be
nice to know that for certain.

[gcc]
2014-04-29  Michael Meissner  <meissner@linux.vnet.ibm.com>

	* config/rs6000/predicates.md (int_reg_operand_not_pseudo): New
	predicate that matches hard GPR registers, but not pseudos.
	(easy_fp_constant): Add support for new floating point modes
	XFmode and JFmode.

	* config/rs6000/rs6000-cpus.def (ISA_2_6_MASKS_SERVER): Set
	-mfloat128 on by default.
	(POWERPC_MASKS): Add -mfloat128.
	(power7 cpu): Enable -mfloat128.

	* config/rs6000/rs6000.opt (-mfloat128): New option to control
	whether the __float128 (IEEE 128-bit floating point) and __ibm128
	(traditional IBM double double format for long double) types are
	enabled.

	* config/rs6000/rs6000-c.c (rs6000_target_modify_macros): If
	-mfloat128, define __FLOAT128__.

	* config/rs6000/rs6000-builtin.def (PACK_JF): Define pack/unpack
	builtin functions for __ibm128.
	(UNPACK_JF): Likewise.

	* config/rs6000/rs6000.c (scalar_float_not_vector_p): New helper
	function for scalar floats that are used in traditional floating
	point registers, and not in a vector register.
	(rs6000_hard_regno_nregs_internal): Add support for __float128 on
	VSX systems to know that the value uses the full vector register
	rather than being a pair of scalar registers.
	(rs6000_hard_regno_mode_ok): Likewise.
	(rs6000_debug_reg_global): Add debugging for __float128 and
	__ibm128.
	(rs6000_init_hard_regno_mode_ok): Add support for __float128 and
	__ibm128.
	(rs6000_option_override_internal): Don't allow -mfloat128 and
	-mlong-double-64.
	(invalid_e500_subreg): Add support for __float128 and __ibm128
	floating point types.  On VSX systems, pass/return __float128 in
	vector registers.
	(reg_offset_addressing_ok_p): Likewise.
	(rs6000_legitimate_offset_address_p): Likewise.
	(rs6000_legitimize_reload_address): Likewise.
	(rs6000_legitimate_address_p): Likewise.
	(rs6000_emit_move): Likewise.
	(USE_FP_FOR_ARG_P): Likewise.
	(rs6000_aggregate_candidate): Likewise.
	(rs6000_discover_homogeneous_aggregate): Likewise.
	(rs6000_return_in_memory): Likewise.
	(init_cumulative_args): Likewise.
	(rs6000_function_arg_boundary): Likewise.
	(rs6000_function_arg_advance_1): Likewise.
	(rs6000_function_arg): Likewise.
	(rs6000_arg_partial_bytes): Likewise.
	(rs6000_pass_by_reference): Likewise.
	(rs6000_init_builtins): Initialize support for __float128 and
	__ibm128.
	(init_float128_ibm): New function to set up the library names for
	the IBM double-double 128-bit format.
	(init_float128_ieee): New function to set up the library names for
	IEEE 128-bit types.  On VSX systems, use <name>_vector, on non-VSX
	systems with -mfloat128, use <name>_fpr, and on non-Linux/BSD
	systems that default to long double == IEEE 128-bit, use the
	historic names.
	(rs6000_init_libfuncs): Move setup of the library names to
	init_float128_ibm and init_float128_ieee.
	(rs6000_cannot_change_mode_class): Add support for __float128 and
	__ibm128.
	(rs6000_generate_compare): Likewise.
	(rs6000_split_multireg_move): Likewise.
	(spe_func_has_64bit_regs_p): Likewise.
	(rs6000_output_function_epilogue): Likewise.
	(output_toc): Likewise.
	(rs6000_mangle_type): Expand mangling to use "g" for __ibm128, and
	"e" for __float128, to be compatible with defaults for long
	double.
	(rs6000_register_move_cost): Add support for __float128 and
	__ibm128.
	(rs6000_function_value): Likewise.
	(rs6000_libcall_value): Likewise.
	(rs6000_opt_masks): Add -mfloat128.

	* config/rs6000/rs6000.h (FLOAT128_IEEE_P): New macro to identify
	types that map to IEEE 128-bit floating point.
	(FLOAT128_IBM_P): New macro to identify types that map to the
	traditional IBM double-double 128-bit floating point.
	(FLOAT128_VECTOR_P): New macro to identify 128-bit floating point
	types that take a single vector register.
	(FLOAT128_2REG_P): New macro to identify 128-bit floating point
	types that take 2 adjacent scalar registers.
	(MASK_FLOAT128): Map to OPTION_MASK_FLOAT128.
	(SLOW_UNALIGNED_ACCESS): Add support for __float128 and __ibm128.
	(HARD_REGNO_CALLER_SAVE_MODE): Likewise.
	(HARD_REGNO_CALL_PART_CLOBBERED): Likewise.
	(VSX_VECTOR_MODE): Spacing.
	(ALTIVEC_VECTOR_MODE): Add __float128 in vector registers.
	(MODES_TIEABLE_P): Move vectors higher than scalar floating point,
	so that __float128 in vector registers ties with vectors and not
	with other floating point values.
	(struct rs6000_args): Add libcall argument so that we can tell
	when we are calling a library support function for __float128.
	(enum rs6000_builtin_type_index): Add __float128 and __ibm128
	support.
	(ieee128_float_type_node): Likewise.
	(ibm128_float_type_node): Likewise.

	* config/rs6000/altivec.md (VM): Add support for IEEE 128-bit
	floating point types that go in a vector.
	(VM2): Likewise.

	* config/rs6000/rs6000.md (FMOVE128): Add support for __float128
	and __ibm128.
	(FMOVE128_GPR): Likewise.
	(mov<mode>_64bit, FMOVE128 types): Likewise.
	(mov<mode>_32bit, FMOVE128 types): Likewise.
	(FP128_64): Likewise.
	(unpack<mode>): Add support to pack/unpack __ibm128 types.  Delete
	old insns just for TFmode.  Don't pack/unpack 128-bit types in a
	vector register.
	(TF_JF): Likewise.
	(unpack<mode>_0): Likewise.
	(unpacktf_0): Likewise.
	(unpack<mode>_1): Likewise.
	(unpacktf_1): Likewise.
	(unpack<mode>_dm): Likewise.
	(unpack<mode>_nodm): Likewise.
	(pack<mode>): Likewise.

	* config/rs6000/vector.md (VEC_L): Add XFmode to vector modes.
	(VEC_R): Likewise.

	* config/rs6000/rs6000-modes.def (XFmode): Add new modes to
	support __float128 and __ibm128 types.
	(JFmode): Likewise.

	* config/rs6000/vsx.md (VSX_L): Add XFmode to vector modes.
	(VSX_M): Likewise.
	(VSX_M2): Likewise.

	* doc/extend.texi (Floating Types): Document __float128 and
	__ibm128 on PowerPC.
	(PowerPC Built-in Functions): Document pack, unpack builtins for
	__ibm128.

	* doc/invoke.texi (RS/6000 and PowerPC Options): Add -mfloat128.

[libgcc]
2014-04-29  Michael Meissner  <meissner@linux.vnet.ibm.com>

	* config/rs6000/float128-vsx.h: New files to add support to build
	the IEEE 128-bit support functions on the PowerPC.  Build two
	versions of the IEEE 128-bit support, one version that uses the
	traditional floating point registers, and passes the type as two
	scalar registers, and the other that passes the type in a single
	vector register.
	* config/rs6000/float128-novsx.h: Likewise.
	* config/rs6000/t-float128: Likewise.

	* soft-fp/quad.h (TFmode): If TFmode is defined, don't use the
	normal TF mode definition.

	* config.host (powerpc*-linux*): Add support for building the IEEE
	128-bit floating point support functions.
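
The library-name selection described for init_float128_ieee can be sketched
roughly as below.  This is an illustration under stated assumptions, not the
patch's code: the enum, the function, and the base and historic names passed in
are all hypothetical; only the _vector and _fpr suffixes come from the
ChangeLog entry.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical tags for the three cases the ChangeLog describes.  */
enum f128_abi
{
  F128_VECTOR,		/* VSX: value passed in one vector register.  */
  F128_FPR,		/* -mfloat128 without VSX: two scalar registers.  */
  F128_HISTORIC		/* long double is already IEEE 128-bit.  */
};

/* Write the support function name for BASE into BUF.  HISTORIC is the
   pre-existing name, supplied by the caller.  Returns what snprintf
   returns: the length of the full name, excluding the terminator.  */
static int
f128_libfunc_name (char *buf, size_t size, const char *base,
                   const char *historic, enum f128_abi abi)
{
  assert (buf != NULL && base != NULL && historic != NULL);

  switch (abi)
    {
    case F128_VECTOR:
      return snprintf (buf, size, "%s_vector", base);

    case F128_FPR:
      return snprintf (buf, size, "%s_fpr", base);

    case F128_HISTORIC:
      break;
    }

  return snprintf (buf, size, "%s", historic);
}
```

Keeping the two suffixed sets distinct means a caller and the support library
can never silently disagree about the register convention; a mismatch shows up
as an unresolved symbol instead.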

Comments

Marc Glisse April 30, 2014, 5:56 a.m. UTC | #1
Minor detail:

+  if ((flags & OPTION_MASK_FLOAT128) != 0)
+    rs6000_define_or_undefine_macro (define_p, "__FLOAT128__");

I recently added __SIZEOF_FLOAT128__ to the x86 target to advertise the
availability of __float128; it would be good if we had a common macro (it
can be in addition to __FLOAT128__). If you really don't like it, we can
ask x86 maintainers if they like __FLOAT128__.

Michael Meissner April 30, 2014, 3:03 p.m. UTC | #2
On Wed, Apr 30, 2014 at 07:56:07AM +0200, Marc Glisse wrote:
> Minor detail:
> 
> +  if ((flags & OPTION_MASK_FLOAT128) != 0)
> +    rs6000_define_or_undefine_macro (define_p, "__FLOAT128__");
> 
> I recently added __SIZEOF_FLOAT128__ to the x86 target to advertise
> the availability of __float128, it would be good if we had a common
> macro (it can be in addition to __FLOAT128__). If you really don't
> like it, we can ask x86 maintainers if they like __FLOAT128__.

I would prefer to have common names between the x86 and PowerPC, since our two
ports are the main two that had long double types that weren't either a
straight IEEE 64-bit or IEEE 128-bit representation.

I can certainly switch the PowerPC to use the same name as the x86.  When I
began the port, I didn't notice there was a define that said the __float128
type was available, so I just picked a name out of the air.

I do think when we agree on a name, it would be helpful if quad.h in
libgcc/soft-fp (which comes from the glibc sources) used that macro to select
the __float128 type.  Unfortunately, given that the IBM double-double size is
128 bits, the attribute((mode(TF))) won't work for us.
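
A sketch of how user code might test for the type once a name is settled.  It
assumes either of the two spellings mentioned in this thread ends up defined
when __float128 is available; quad_t, HAVE_FLOAT128_TYPE and quad_size are
hypothetical names used only for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Accept either macro discussed in the thread.  */
#if defined (__SIZEOF_FLOAT128__) || defined (__FLOAT128__)
typedef __float128 quad_t;
#define HAVE_FLOAT128_TYPE 1
#else
/* Fallback: whatever long double happens to be on this target.  */
typedef long double quad_t;
#define HAVE_FLOAT128_TYPE 0
#endif

/* Size in bytes of the widest type selected above.  */
static size_t
quad_size (void)
{
  assert (sizeof (quad_t) >= sizeof (double));
  return sizeof (quad_t);
}
```

With a single common macro, the first line would reduce to one test and the
same header could be shared between x86 and PowerPC.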

Jakub Jelinek May 2, 2014, 10:13 a.m. UTC | #3
Hi!

On Tue, Apr 29, 2014 at 06:30:32PM -0400, Michael Meissner wrote:
> This patch adds support for a new type (__float128) on the PowerPC to allow
> people to use the 128-bit IEEE floating point format instead of the traditional
> IBM double-double that has been used in the Linux compilers.  At this time,
> long double still will remain using the IBM double-double format.
> 
> There has been an undocumented option to switch long double to to IEEE 128-bit,
> but right now, there are bugs I haven't ironed out on VSX systems.
> 
> In addition, I added another type (__ibm128) so that when the transition is
> eventually made, people can use this type to get the old long double type.
> 
> I was wondering if people had any comments on the code so far, and things I
> should different.  Note, I will be out on vacation May 6th - 14th, so I don't
> expect to submit the patches until I get back.

For mangling, if you are going to mangle it the same as the -mlong-double-64
long double, is __float128 going to be supported solely for ELFv2 ABI and
are you sure nobody has ever used -mlong-double-64 or
--without-long-double-128 configured compiler for it?

What is the plan for glibc (and for libstdc++)?
Looking at current ppc64le glibc, it seems it mistakenly still supports
the -mlong-double-64 stuff (e.g. printf calls are usually redirected to
__nldbl_printf, as are tons of other calls).  So, is the plan to use
yet another set of symbols?  For __nldbl_* it is about 113 entry points
in libc.so and 1 in libm.so, but if you are going to support all of
-mlong-double-64, -mlong-double-128 as well as __float128, that would be far
more, because the compat -mlong-double-64 support mostly works by
redirecting, either in headers or through a special *.a library, to
corresponding double entry points whenever possible.
So, if you call logl in -mlong-double-64 code, it will be redirected to
log, because it has the same ABI.  But if you call *printf or nexttowardf
etc. where there is no ABI compatible double entrypoint, it needs to be a
new symbol.
But with __float128 vs. __ibm128 and long double being either of those,
you need a different logl.
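
The header-level redirection described above can be sketched like this.  It is
a toy model under stated assumptions: example_half and example_halfl are
hypothetical stand-ins for log and logl, and the redirect is done with a macro
purely to keep the sketch portable; it is not meant to show glibc's actual
mechanism.

```c
#include <assert.h>
#include <float.h>

/* Hypothetical double entry point, standing in for log.  */
static double
example_half (double x)
{
  return x / 2.0;
}

#if LDBL_MANT_DIG == DBL_MANT_DIG
/* long double has the same format as double, so the long double name
   can simply be redirected to the double entry point.  */
#define example_halfl(x) ((long double) example_half ((double) (x)))
#define EXAMPLE_REDIRECTED 1
#else
/* long double is a distinct format, so a separate entry point is
   required.  */
static long double
example_halfl (long double x)
{
  return x / 2.0L;
}
#define EXAMPLE_REDIRECTED 0
#endif

/* Either way, callers use the same spelling.  */
static void
example_check (void)
{
  assert (example_halfl (3.0L) == 1.5L);
}
```

The redirect only works because the two types share one ABI.  With two
different 128-bit formats that both may be called long double, there is no
existing entry point to redirect to, hence the need for a second logl.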

Which is why it is such a huge problem that this wasn't resolved initially
as part of the ELFv2 changes.

	Jakub

Steven Munroe May 2, 2014, 12:47 p.m. UTC | #4
On Fri, 2014-05-02 at 12:13 +0200, Jakub Jelinek wrote:
> Hi!
> 
> On Tue, Apr 29, 2014 at 06:30:32PM -0400, Michael Meissner wrote:
> > This patch adds support for a new type (__float128) on the PowerPC to allow
> > people to use the 128-bit IEEE floating point format instead of the traditional
> > IBM double-double that has been used in the Linux compilers.  At this time,
> > long double still will remain using the IBM double-double format.
> > 
> > There has been an undocumented option to switch long double to to IEEE 128-bit,
> > but right now, there are bugs I haven't ironed out on VSX systems.
> > 
> > In addition, I added another type (__ibm128) so that when the transition is
> > eventually made, people can use this type to get the old long double type.
> > 
> > I was wondering if people had any comments on the code so far, and things I
> > should different.  Note, I will be out on vacation May 6th - 14th, so I don't
> > expect to submit the patches until I get back.
> 
> For mangling, if you are going to mangle it the same as the -mlong-double-64
> long double, is __float128 going to be supported solely for ELFv2 ABI and
> are you sure nobody has ever used -mlong-double-64 or
> --without-long-double-128 configured compiler for it?

> What is the plan for glibc (and for libstdc++)?
> Looking at current ppc64le glibc, it seems it mistakenly still supports
> the -mlong-double-64 stuff (e.g. printf calls are usually redirected to
> __nldbl_printf (and tons of other calls).  So, is the plan to use
> yet another set of symbols?  For __nldbl_* it is about 113 entry points
> in libc.so and 1 in libm.so, but if you are going to support all of
> -mlong-double-64, -mlong-double-128 as well as __float128, that would be far
> more, because the compat -mlong-double-64 support mostly works by
> redirecting, either in headers or through a special *.a library, to
> corresponding double entry points whenever possible.
> So, if you call logl in -mlong-double-64 code, it will be redirected to
> log, because it has the same ABI.  But if you call *printf or nexttowardf
> etc. where there is no ABI compatible double entrypoint, it needs to be a
> new symbol.
> But with __float128 vs. __ibm128 and long double being either of those,
> you need different logl.
> 
Yes, and we will work on a plan to do this. But at this time, and for the
near future, there is no performance advantage to __float128 over IBM long
double.

> Which is why it is so huge problem that this hasn't been resolved initially
> as part of ELFv2 changes.

Because it was a huge problem and there was no way for the required GCC
support to be available in time for GLIBC-2.19.

So we will develop an orderly, step-by-step transition plan. This will
take some time.

Joseph Myers May 2, 2014, 5:18 p.m. UTC | #5
On Tue, 29 Apr 2014, Michael Meissner wrote:

> 	* soft-fp/quad.h (TFmode): If TFmode is defined, don't use the
> 	normal TF mode definition.

That does of course need to go to glibc first (and I think it would be 
best to do something consistent for all the floating-point formats in 
soft-fp - allow overriding the definitions of all of them in terms of 
machine modes to keep things consistent, even without any use case for 
overriding definitions other than for TFmode).

I don't see anything in this patch to give appropriate symbol versions to 
the new libgcc functions.

libgcc/config/rs6000/sfp-machine.h does not currently implement 
integration with hardware exceptions and rounding modes.  Such integration 
should be added as on other architectures such as x86 and AArch64 
(conditional on FPRs being present, as opposed to soft-fp being used in 
libgcc for soft-float or e500).

Patch

Index: gcc/config/rs6000/predicates.md
===================================================================
--- gcc/config/rs6000/predicates.md	(.../svn+ssh://meissner@gcc.gnu.org/svn/gcc/trunk/gcc/config/rs6000)	(revision 209899)
+++ gcc/config/rs6000/predicates.md	(.../gcc/config/rs6000)	(working copy)
@@ -228,6 +228,25 @@  (define_predicate "int_reg_operand"
   return INT_REGNO_P (REGNO (op));
 })
 
+;; Like int_reg_operand, but don't return true for pseudo registers
+(define_predicate "int_reg_operand_not_pseudo"
+  (match_operand 0 "register_operand")
+{
+  if ((TARGET_E500_DOUBLE || TARGET_SPE) && invalid_e500_subreg (op, mode))
+    return 0;
+
+  if (GET_CODE (op) == SUBREG)
+    op = SUBREG_REG (op);
+
+  if (!REG_P (op))
+    return 0;
+
+  if (REGNO (op) >= FIRST_PSEUDO_REGISTER)
+    return 0;
+
+  return INT_REGNO_P (REGNO (op));
+})
+
 ;; Like int_reg_operand, but only return true for base registers
 (define_predicate "base_reg_operand"
   (match_operand 0 "int_reg_operand")
@@ -438,11 +457,12 @@  (define_predicate "easy_fp_constant"
     return 1;
 
   /* The constant 0.0 is easy under VSX.  */
-  if ((mode == SFmode || mode == DFmode || mode == SDmode || mode == DDmode)
+  if ((mode == SFmode || mode == DFmode || mode == SDmode || mode == DDmode
+       || mode == XFmode)
       && VECTOR_UNIT_VSX_P (DFmode) && op == CONST0_RTX (mode))
     return 1;
 
-  if (DECIMAL_FLOAT_MODE_P (mode))
+  if (DECIMAL_FLOAT_MODE_P (mode) || mode == XFmode)
     return 0;
 
   /* If we are using V.4 style PIC, consider all constants to be hard.  */
@@ -458,8 +478,14 @@  (define_predicate "easy_fp_constant"
 
   switch (mode)
     {
+    /* For IEEE 128-bit, only consider 0.0 to be easy.  */
+    case JFmode:
+    case XFmode:
     case TFmode:
-      if (TARGET_E500_DOUBLE)
+      if (FLOAT128_VECTOR_P (mode))
+	return (op == CONST0_RTX (mode));
+
+      if (mode == XFmode || TARGET_E500_DOUBLE)
 	return 0;
 
       REAL_VALUE_FROM_CONST_DOUBLE (rv, op);
Index: gcc/config/rs6000/rs6000-cpus.def
===================================================================
--- gcc/config/rs6000/rs6000-cpus.def	(.../svn+ssh://meissner@gcc.gnu.org/svn/gcc/trunk/gcc/config/rs6000)	(revision 209899)
+++ gcc/config/rs6000/rs6000-cpus.def	(.../gcc/config/rs6000)	(working copy)
@@ -44,6 +44,7 @@ 
 #define ISA_2_6_MASKS_SERVER	(ISA_2_5_MASKS_SERVER			\
 				 | OPTION_MASK_POPCNTD			\
 				 | OPTION_MASK_ALTIVEC			\
+				 | OPTION_MASK_FLOAT128			\
 				 | OPTION_MASK_VSX)
 
 /* For now, don't provide an embedded version of ISA 2.07.  */
@@ -76,6 +77,7 @@ 
 				 | OPTION_MASK_DFP			\
 				 | OPTION_MASK_DIRECT_MOVE		\
 				 | OPTION_MASK_DLMZB			\
+				 | OPTION_MASK_FLOAT128			\
 				 | OPTION_MASK_FPRND			\
 				 | OPTION_MASK_HTM			\
 				 | OPTION_MASK_ISEL			\
@@ -184,7 +186,7 @@  RS6000_CPU ("power6x", PROCESSOR_POWER6,
 RS6000_CPU ("power7", PROCESSOR_POWER7,   /* Don't add MASK_ISEL by default */
 	    POWERPC_7400_MASK | MASK_POWERPC64 | MASK_PPC_GPOPT | MASK_MFCRF
 	    | MASK_POPCNTB | MASK_FPRND | MASK_CMPB | MASK_DFP | MASK_POPCNTD
-	    | MASK_VSX | MASK_RECIP_PRECISION)
+	    | MASK_VSX | MASK_RECIP_PRECISION | MASK_FLOAT128)
 RS6000_CPU ("power8", PROCESSOR_POWER8, MASK_POWERPC64 | ISA_2_7_MASKS_SERVER)
 RS6000_CPU ("powerpc", PROCESSOR_POWERPC, 0)
 RS6000_CPU ("powerpc64", PROCESSOR_POWERPC64, MASK_PPC_GFXOPT | MASK_POWERPC64)
Index: gcc/config/rs6000/rs6000-builtin.def
===================================================================
--- gcc/config/rs6000/rs6000-builtin.def	(.../svn+ssh://meissner@gcc.gnu.org/svn/gcc/trunk/gcc/config/rs6000)	(revision 209899)
+++ gcc/config/rs6000/rs6000-builtin.def	(.../gcc/config/rs6000)	(working copy)
@@ -1598,6 +1598,9 @@  BU_MISC_2 (UNPACK_TF,		"unpack_longdoubl
 BU_MISC_1 (UNPACK_TF_0,		"longdouble_dw0",	CONST,	unpacktf_0)
 BU_MISC_1 (UNPACK_TF_1,		"longdouble_dw1",	CONST,	unpacktf_1)
 
+BU_MISC_2 (PACK_JF,		"pack_ibm128",		CONST,	packjf)
+BU_MISC_2 (UNPACK_JF,		"unpack_ibm128",	CONST,	unpackjf)
+
 BU_P7_MISC_2 (PACK_V1TI,	"pack_vector_int128",	CONST,	packv1ti)
 BU_P7_MISC_2 (UNPACK_V1TI,	"unpack_vector_int128",	CONST,	unpackv1ti)
 
Index: gcc/config/rs6000/rs6000-c.c
===================================================================
--- gcc/config/rs6000/rs6000-c.c	(.../svn+ssh://meissner@gcc.gnu.org/svn/gcc/trunk/gcc/config/rs6000)	(revision 209899)
+++ gcc/config/rs6000/rs6000-c.c	(.../gcc/config/rs6000)	(working copy)
@@ -362,6 +362,8 @@  rs6000_target_modify_macros (bool define
     rs6000_define_or_undefine_macro (define_p, "__QUAD_MEMORY_ATOMIC__");
   if ((flags & OPTION_MASK_CRYPTO) != 0)
     rs6000_define_or_undefine_macro (define_p, "__CRYPTO__");
+  if ((flags & OPTION_MASK_FLOAT128) != 0)
+    rs6000_define_or_undefine_macro (define_p, "__FLOAT128__");
 
   /* options from the builtin masks.  */
   if ((bu_mask & RS6000_BTM_SPE) != 0)
Index: gcc/config/rs6000/rs6000.opt
===================================================================
--- gcc/config/rs6000/rs6000.opt	(.../svn+ssh://meissner@gcc.gnu.org/svn/gcc/trunk/gcc/config/rs6000)	(revision 209899)
+++ gcc/config/rs6000/rs6000.opt	(.../gcc/config/rs6000)	(working copy)
@@ -441,6 +441,10 @@  mwarn-altivec-long
 Target Var(rs6000_warn_altivec_long) Init(1) Save
 Warn about deprecated 'vector long ...' AltiVec type usage
 
+mfloat128
+Target Mask(FLOAT128) Var(rs6000_isa_flags)
+Enable the __float128 type to specify IEEE 128-bit floating point
+
 mfloat-gprs=
 Target RejectNegative Joined Enum(rs6000_float_gprs) Var(rs6000_float_gprs) Save
 -mfloat-gprs=	Select GPR floating point method
Index: gcc/config/rs6000/rs6000.c
===================================================================
--- gcc/config/rs6000/rs6000.c	(.../svn+ssh://meissner@gcc.gnu.org/svn/gcc/trunk/gcc/config/rs6000)	(revision 209899)
+++ gcc/config/rs6000/rs6000.c	(.../gcc/config/rs6000)	(working copy)
@@ -1657,6 +1657,21 @@  rs6000_cpu_name_lookup (const char *name
 }
 
 
+/* Helper function to separate IEEE 128 when it can go in vector registers from
+   normal scalar floating point.  */
+
+static inline bool
+scalar_float_not_vector_p (enum machine_mode mode)
+{
+  if (!SCALAR_FLOAT_MODE_P (mode))
+    return false;
+
+  if (FLOAT128_VECTOR_P (mode))
+    return false;
+
+  return true;
+}
+
 /* Return number of consecutive hard regs needed starting at reg REGNO
    to hold something of mode MODE.
    This is ordinarily the length in words of a value of mode MODE
@@ -1674,9 +1689,10 @@  rs6000_hard_regno_nregs_internal (int re
 {
   unsigned HOST_WIDE_INT reg_size;
 
-  /* TF/TD modes are special in that they always take 2 registers.  */
+  /* 128-bit floating point usually takes 2 registers, unless it is IEEE
+     128-bit floating point that can go in vector registers.  */
   if (FP_REGNO_P (regno))
-    reg_size = ((VECTOR_MEM_VSX_P (mode) && mode != TDmode && mode != TFmode)
+    reg_size = ((VECTOR_MEM_VSX_P (mode) && !FLOAT128_2REG_P (mode))
 		? UNITS_PER_VSX_WORD
 		: UNITS_PER_FP_WORD);
 
@@ -1752,7 +1768,7 @@  rs6000_hard_regno_mode_ok (int regno, en
      modes and DImode.  */
   if (FP_REGNO_P (regno))
     {
-      if (SCALAR_FLOAT_MODE_P (mode)
+      if (scalar_float_not_vector_p (mode)
 	  && (mode != TDmode || (regno % 2) == 0)
 	  && FP_REGNO_P (last_regno))
 	return 1;
@@ -1963,6 +1979,8 @@  rs6000_debug_reg_global (void)
     SFmode,
     DFmode,
     TFmode,
+    JFmode,
+    XFmode,
     SDmode,
     DDmode,
     TDmode,
@@ -2322,6 +2340,8 @@  rs6000_debug_reg_global (void)
   fprintf (stderr, DEBUG_FMT_D, "tls_size", rs6000_tls_size);
   fprintf (stderr, DEBUG_FMT_D, "long_double_size",
 	   rs6000_long_double_type_size);
+  fprintf (stderr, DEBUG_FMT_S, "__float128",
+	   TARGET_FLOAT128 ? "true" : "false");
   fprintf (stderr, DEBUG_FMT_D, "sched_restricted_insns_priority",
 	   (int)rs6000_sched_restricted_insns_priority);
   fprintf (stderr, DEBUG_FMT_D, "Number of standard builtins",
@@ -2526,6 +2546,20 @@  rs6000_init_hard_regno_mode_ok (bool glo
       align32 = 128;
     }
 
+  /* XF mode (ieee 128-bit) where we can pass it as a vector.  We do not have
+     arithmetic, so only set the memory modes.  */
+  if (TARGET_VSX)
+    {
+      enum rs6000_vector mem_type = VECTOR_VSX;
+      rs6000_vector_mem[XFmode] = mem_type;
+      rs6000_vector_align[XFmode] = 128;
+      if (TARGET_IEEEQUAD)
+	{
+	  rs6000_vector_mem[TFmode] = mem_type;
+	  rs6000_vector_align[TFmode] = 128;
+	}
+    }
+
   /* V2DF mode, VSX only.  */
   if (TARGET_VSX)
     {
@@ -2715,6 +2749,8 @@  rs6000_init_hard_regno_mode_ok (bool glo
 	  reg_addr[V4SFmode].reload_load   = CODE_FOR_reload_v4sf_di_load;
 	  reg_addr[V2DFmode].reload_store  = CODE_FOR_reload_v2df_di_store;
 	  reg_addr[V2DFmode].reload_load   = CODE_FOR_reload_v2df_di_load;
+	  reg_addr[XFmode].reload_store    = CODE_FOR_reload_xf_di_store;
+	  reg_addr[XFmode].reload_load     = CODE_FOR_reload_xf_di_load;
 	  if (TARGET_VSX && TARGET_UPPER_REGS_DF)
 	    {
 	      reg_addr[DFmode].reload_store  = CODE_FOR_reload_df_di_store;
@@ -2782,6 +2818,8 @@  rs6000_init_hard_regno_mode_ok (bool glo
 	  reg_addr[V4SFmode].reload_load   = CODE_FOR_reload_v4sf_si_load;
 	  reg_addr[V2DFmode].reload_store  = CODE_FOR_reload_v2df_si_store;
 	  reg_addr[V2DFmode].reload_load   = CODE_FOR_reload_v2df_si_load;
+	  reg_addr[XFmode].reload_store    = CODE_FOR_reload_xf_si_store;
+	  reg_addr[XFmode].reload_load     = CODE_FOR_reload_xf_si_load;
 	  if (TARGET_VSX && TARGET_UPPER_REGS_DF)
 	    {
 	      reg_addr[DFmode].reload_store  = CODE_FOR_reload_df_si_store;
@@ -2838,9 +2876,9 @@  rs6000_init_hard_regno_mode_ok (bool glo
 	  enum machine_mode m2 = (enum machine_mode)m;
 	  int reg_size2 = reg_size;
 
-	  /* TFmode/TDmode always takes 2 registers, even in VSX.  */
-	  if (TARGET_VSX && VSX_REG_CLASS_P (c)
-	      && (m == TDmode || m == TFmode))
+	  /* TDmode & IBM 128-bit floating point always takes 2 registers, even
+	     in VSX.  */
+	  if (TARGET_VSX && VSX_REG_CLASS_P (c) && FLOAT128_2REG_P (m))
 	    reg_size2 = UNITS_PER_FP_WORD;
 
 	  rs6000_class_max_nregs[m][c]
@@ -3519,6 +3557,12 @@  rs6000_option_override_internal (bool gl
     rs6000_ieeequad = 1;
 #endif
 
+  if (rs6000_long_double_type_size != 128 && TARGET_FLOAT128)
+    {
+      error ("-mfloat128 needs long doubles to be 128-bits");
+      rs6000_isa_flags &= ~OPTION_MASK_FLOAT128;
+    }
+
   /* Disable VSX and Altivec silently if the user switched cpus to power7 in a
      target attribute or pragma which automatically enables both options,
      unless the altivec ABI was set.  This is set by default for 64-bit, but
@@ -5797,13 +5841,16 @@  invalid_e500_subreg (rtx op, enum machin
 	      || mode == DDmode || mode == TDmode || mode == PTImode)
 	  && REG_P (SUBREG_REG (op))
 	  && (GET_MODE (SUBREG_REG (op)) == DFmode
-	      || GET_MODE (SUBREG_REG (op)) == TFmode))
+	      || GET_MODE (SUBREG_REG (op)) == TFmode
+	      || GET_MODE (SUBREG_REG (op)) == XFmode
+	      || GET_MODE (SUBREG_REG (op)) == JFmode))
 	return true;
 
       /* Reject (subreg:DF (reg:DI)); likewise with subreg:TF and
 	 reg:TI.  */
       if (GET_CODE (op) == SUBREG
-	  && (mode == DFmode || mode == TFmode)
+	  && (mode == DFmode || mode == TFmode || mode == XFmode
+	      || mode == JFmode)
 	  && REG_P (SUBREG_REG (op))
 	  && (GET_MODE (SUBREG_REG (op)) == DImode
 	      || GET_MODE (SUBREG_REG (op)) == TImode
@@ -6138,10 +6185,13 @@  reg_offset_addressing_ok_p (enum machine
     case V2DImode:
     case V1TImode:
     case TImode:
+    case TFmode:
+    case XFmode:
       /* AltiVec/VSX vector modes.  Only reg+reg addressing is valid.  While
 	 TImode is not a vector mode, if we want to use the VSX registers to
-	 move it around, we need to restrict ourselves to reg+reg
-	 addressing.  */
+	 move it around, we need to restrict ourselves to reg+reg addressing.
+	 Similarly for IEEE 128-bit floating point that is passed in a single
+	 vector register.  */
       if (VECTOR_MEM_ALTIVEC_OR_VSX_P (mode))
 	return false;
       break;
@@ -6417,6 +6467,8 @@  rs6000_legitimate_offset_address_p (enum
       break;
 
     case TFmode:
+    case XFmode:
+    case JFmode:
       if (TARGET_E500_DOUBLE)
 	return (SPE_CONST_OFFSET_OK (offset)
 		&& SPE_CONST_OFFSET_OK (offset + 8));
@@ -6610,6 +6662,8 @@  rs6000_legitimize_address (rtx x, rtx ol
     case TDmode:
     case TImode:
     case PTImode:
+    case XFmode:
+    case JFmode:
       /* As in legitimate_offset_address_p we do not assume
 	 worst-case.  The mode here is just a hint as to the registers
 	 used.  A TImode is usually in gprs, but may actually be in
@@ -7405,6 +7459,8 @@  rs6000_legitimize_reload_address (rtx x,
 	 mem is sufficiently aligned.  */
       && mode != TFmode
       && mode != TDmode
+      && mode != XFmode
+      && mode != JFmode
       && (mode != TImode || !TARGET_VSX_TIMODE)
       && mode != PTImode
       && (mode != DImode || TARGET_POWERPC64)
@@ -7558,8 +7614,7 @@  rs6000_legitimate_address_p (enum machin
     return 1;
   if (rs6000_legitimate_offset_address_p (mode, x, reg_ok_strict, false))
     return 1;
-  if (mode != TFmode
-      && mode != TDmode
+  if (!FLOAT128_2REG_P (mode)
       && ((TARGET_HARD_FLOAT && TARGET_FPRS && TARGET_DOUBLE_FLOAT)
 	  || TARGET_POWERPC64
 	  || (mode != DFmode && mode != DDmode)
@@ -8236,8 +8291,7 @@  rs6000_emit_move (rtx dest, rtx source, 
 
   /* 128-bit constant floating-point values on Darwin should really be
      loaded as two parts.  */
-  if (!TARGET_IEEEQUAD && TARGET_LONG_DOUBLE_128
-      && mode == TFmode && GET_CODE (operands[1]) == CONST_DOUBLE)
+  if (FLOAT128_IBM_P (mode) && GET_CODE (operands[1]) == CONST_DOUBLE)
     {
       rs6000_emit_move (simplify_gen_subreg (DFmode, operands[0], mode, 0),
 			simplify_gen_subreg (DFmode, operands[1], mode, 0),
@@ -8381,7 +8435,10 @@  rs6000_emit_move (rtx dest, rtx source, 
 
     case TFmode:
     case TDmode:
-      rs6000_eliminate_indexed_memrefs (operands);
+    case XFmode:
+    case JFmode:
+      if (FLOAT128_2REG_P (mode))
+	rs6000_eliminate_indexed_memrefs (operands);
       /* fall through */
 
     case DFmode:
@@ -8606,9 +8663,10 @@  rs6000_member_type_forces_blk (const_tre
 
 /* Nonzero if we can use a floating-point register to pass this arg.  */
 #define USE_FP_FOR_ARG_P(CUM,MODE)		\
-  (SCALAR_FLOAT_MODE_P (MODE)			\
+  (scalar_float_not_vector_p (MODE)		\
    && (CUM)->fregno <= FP_ARG_MAX_REG		\
-   && TARGET_HARD_FLOAT && TARGET_FPRS)
+   && TARGET_HARD_FLOAT && TARGET_FPRS		\
+   && !FLOAT128_VECTOR_P (MODE))
 
 /* Nonzero if we can use an AltiVec register to pass this arg.  */
 #define USE_ALTIVEC_FOR_ARG_P(CUM,MODE,NAMED)			\
@@ -8634,7 +8692,7 @@  rs6000_aggregate_candidate (const_tree t
     {
     case REAL_TYPE:
       mode = TYPE_MODE (type);
-      if (!SCALAR_FLOAT_MODE_P (mode))
+      if (!scalar_float_not_vector_p (mode))
 	return -1;
 
       if (*modep == VOIDmode)
@@ -8647,7 +8705,7 @@  rs6000_aggregate_candidate (const_tree t
 
     case COMPLEX_TYPE:
       mode = TYPE_MODE (TREE_TYPE (type));
-      if (!SCALAR_FLOAT_MODE_P (mode))
+      if (!scalar_float_not_vector_p (mode))
 	return -1;
 
       if (*modep == VOIDmode)
@@ -8807,7 +8865,7 @@  rs6000_discover_homogeneous_aggregate (e
 
       if (field_count > 0)
 	{
-	  int n_regs = (SCALAR_FLOAT_MODE_P (field_mode)?
+	  int n_regs = (scalar_float_not_vector_p (field_mode) ?
 			(GET_MODE_SIZE (field_mode) + 7) >> 3 : 1);
 
 	  /* The ELFv2 ABI allows homogeneous aggregates to occupy
@@ -8917,7 +8975,8 @@  rs6000_return_in_memory (const_tree type
       return true;
     }
 
-  if (DEFAULT_ABI == ABI_V4 && TARGET_IEEEQUAD && TYPE_MODE (type) == TFmode)
+  if (DEFAULT_ABI == ABI_V4 && FLOAT128_IEEE_P (TYPE_MODE (type))
+      && !TARGET_VSX)
     return true;
 
   return false;
@@ -8989,6 +9048,7 @@  init_cumulative_args (CUMULATIVE_ARGS *c
 		      ? CALL_LIBCALL : CALL_NORMAL);
   cum->sysv_gregno = GP_ARG_MIN_REG;
   cum->stdarg = stdarg_p (fntype);
+  cum->libcall = libcall;
 
   cum->nargs_prototype = 0;
   if (incoming || cum->prototype)
@@ -9047,7 +9107,7 @@  init_cumulative_args (CUMULATIVE_ARGS *c
 		      <= 8))
 		rs6000_returns_struct = true;
 	    }
-	  if (SCALAR_FLOAT_MODE_P (return_mode))
+	  if (scalar_float_not_vector_p (return_mode))
 	    rs6000_passes_float = true;
 	  else if (ALTIVEC_OR_VSX_VECTOR_MODE (return_mode)
 		   || SPE_VECTOR_MODE (return_mode))
@@ -9162,7 +9222,11 @@  rs6000_function_arg_boundary (enum machi
       && (GET_MODE_SIZE (mode) == 8
 	  || (TARGET_HARD_FLOAT
 	      && TARGET_FPRS
-	      && (mode == TFmode || mode == TDmode))))
+	      && FLOAT128_2REG_P (mode))))
+    return 64;
+  else if (FLOAT128_VECTOR_P (mode))
+    return 128;
+  else if (FLOAT128_2REG_P (mode))
     return 64;
   else if (SPE_VECTOR_MODE (mode)
 	   || (type && TREE_CODE (type) == VECTOR_TYPE
@@ -9401,7 +9465,7 @@  rs6000_function_arg_advance_1 (CUMULATIV
   if (DEFAULT_ABI == ABI_V4
       && cum->escapes)
     {
-      if (SCALAR_FLOAT_MODE_P (mode))
+      if (scalar_float_not_vector_p (mode))
 	rs6000_passes_float = true;
       else if (named && ALTIVEC_OR_VSX_VECTOR_MODE (mode))
 	rs6000_passes_vector = true;
@@ -9508,7 +9572,7 @@  rs6000_function_arg_advance_1 (CUMULATIV
       if (TARGET_HARD_FLOAT && TARGET_FPRS
 	  && ((TARGET_SINGLE_FLOAT && mode == SFmode)
 	      || (TARGET_DOUBLE_FLOAT && mode == DFmode)
-	      || (mode == TFmode && !TARGET_IEEEQUAD)
+	      || FLOAT128_2REG_P (mode)
 	      || mode == SDmode || mode == DDmode || mode == TDmode))
 	{
 	  /* _Decimal128 must use an even/odd register pair.  This assumes
@@ -9516,13 +9580,13 @@  rs6000_function_arg_advance_1 (CUMULATIV
 	  if (mode == TDmode && (cum->fregno % 2) == 1)
 	    cum->fregno++;
 
-	  if (cum->fregno + (mode == TFmode || mode == TDmode ? 1 : 0)
+	  if (cum->fregno + (FLOAT128_2REG_P (mode) ? 1 : 0)
 	      <= FP_ARG_V4_MAX_REG)
 	    cum->fregno += (GET_MODE_SIZE (mode) + 7) >> 3;
 	  else
 	    {
 	      cum->fregno = FP_ARG_V4_MAX_REG + 1;
-	      if (mode == DFmode || mode == TFmode
+	      if (mode == DFmode || FLOAT128_2REG_P (mode)
 		  || mode == DDmode || mode == TDmode)
 		cum->words += cum->words & 1;
 	      cum->words += rs6000_arg_size (mode, type);
@@ -9574,7 +9638,7 @@  rs6000_function_arg_advance_1 (CUMULATIV
 
       cum->words = align_words + n_words;
 
-      if (SCALAR_FLOAT_MODE_P (elt_mode)
+      if (scalar_float_not_vector_p (elt_mode)
 	  && TARGET_HARD_FLOAT && TARGET_FPRS)
 	{
 	  /* _Decimal128 must be passed in an even/odd float register pair.
@@ -10093,9 +10157,11 @@  rs6000_function_arg (cumulative_args_t c
       rtx r, off;
       int i, k = 0;
 
-      /* Do we also need to pass this argument in the parameter
-	 save area?  */
-      if (TARGET_64BIT && ! cum->prototype)
+      /* Do we also need to pass this argument in the parameter save area?
+	 Library support functions for IEEE 128-bit are assumed to not need the
+	 value passed both in GPRs and in vector registers.  */
+      if (TARGET_64BIT && !cum->prototype
+	  && (!cum->libcall || !FLOAT128_VECTOR_P (elt_mode)))
 	{
 	  int align_words = (cum->words + 1) & ~1;
 	  k = rs6000_psave_function_arg (mode, type, align_words, rvec);
@@ -10168,7 +10234,7 @@  rs6000_function_arg (cumulative_args_t c
       if (TARGET_HARD_FLOAT && TARGET_FPRS
 	  && ((TARGET_SINGLE_FLOAT && mode == SFmode)
 	      || (TARGET_DOUBLE_FLOAT && mode == DFmode)
-	      || (mode == TFmode && !TARGET_IEEEQUAD)
+	      || FLOAT128_2REG_P (mode)
 	      || mode == SDmode || mode == DDmode || mode == TDmode))
 	{
 	  /* _Decimal128 must use an even/odd register pair.  This assumes
@@ -10176,7 +10242,7 @@  rs6000_function_arg (cumulative_args_t c
 	  if (mode == TDmode && (cum->fregno % 2) == 1)
 	    cum->fregno++;
 
-	  if (cum->fregno + (mode == TFmode || mode == TDmode ? 1 : 0)
+	  if (cum->fregno + (FLOAT128_2REG_P (mode) ? 1 : 0)
 	      <= FP_ARG_V4_MAX_REG)
 	    return gen_rtx_REG (mode, cum->fregno);
 	  else
@@ -10237,7 +10303,7 @@  rs6000_function_arg (cumulative_args_t c
 	      enum machine_mode fmode = elt_mode;
 	      if (cum->fregno + (i + 1) * n_fpreg > FP_ARG_MAX_REG + 1)
 		{
-		  gcc_assert (fmode == TFmode || fmode == TDmode);
+		  gcc_assert (FLOAT128_2REG_P (fmode));
 		  fmode = DECIMAL_FLOAT_MODE_P (fmode) ? DDmode : DFmode;
 		}
 
@@ -10284,11 +10350,14 @@  rs6000_arg_partial_bytes (cumulative_arg
 
   if (USE_ALTIVEC_FOR_ARG_P (cum, elt_mode, named))
     {
-      /* If we are passing this arg in the fixed parameter save area
-         (gprs or memory) as well as VRs, we do not use the partial
-	 bytes mechanism; instead, rs6000_function_arg will return a
-	 PARALLEL including a memory element as necessary.  */
-      if (TARGET_64BIT && ! cum->prototype)
+      /* If we are passing this arg in the fixed parameter save area (gprs or
+         memory) as well as VRs, we do not use the partial bytes mechanism;
+         instead, rs6000_function_arg will return a PARALLEL including a memory
+         element as necessary.  Library support functions for IEEE 128-bit are
+         assumed to not need the value passed both in GPRs and in vector
+         registers.  */
+      if (TARGET_64BIT && !cum->prototype
+	  && (!cum->libcall || !FLOAT128_VECTOR_P (elt_mode)))
 	return 0;
 
       /* Otherwise, we pass in VRs only.  Check for partial copies.  */
@@ -10355,7 +10424,7 @@  rs6000_pass_by_reference (cumulative_arg
 			  enum machine_mode mode, const_tree type,
 			  bool named ATTRIBUTE_UNUSED)
 {
-  if (DEFAULT_ABI == ABI_V4 && TARGET_IEEEQUAD && mode == TFmode)
+  if (DEFAULT_ABI == ABI_V4 && FLOAT128_IEEE_P (mode))
     {
       if (TARGET_DEBUG_ARG)
 	fprintf (stderr, "function_arg_pass_by_reference: V4 long double\n");
@@ -11006,6 +11075,8 @@  rs6000_gimplify_va_arg (tree valist, tre
           || (TARGET_DOUBLE_FLOAT 
               && (TYPE_MODE (type) == DFmode 
  	          || TYPE_MODE (type) == TFmode
+ 	          || TYPE_MODE (type) == XFmode
+ 	          || TYPE_MODE (type) == JFmode
 	          || TYPE_MODE (type) == SDmode
 	          || TYPE_MODE (type) == DDmode
 	          || TYPE_MODE (type) == TDmode))))
@@ -13776,6 +13847,8 @@  rs6000_init_builtins (void)
   tree tdecl;
   tree ftype;
   enum machine_mode mode;
+  enum machine_mode ieee128_mode;
+  enum machine_mode ibm128_mode;
 
   if (TARGET_DEBUG_BUILTIN)
     fprintf (stderr, "rs6000_init_builtins%s%s%s%s\n",
@@ -13843,6 +13916,31 @@  rs6000_init_builtins (void)
   dfloat128_type_internal_node = dfloat128_type_node;
   void_type_internal_node = void_type_node;
 
+  /* 128-bit floating point support.  XFmode is IEEE 128-bit floating point.
+     JFmode is the IBM 128-bit floating point format that uses a pair of
+     doubles to represent the extended value.  TFmode will be either XFmode or
+     JFmode, depending on the switches and defaults.  */
+  ieee128_mode = (TARGET_IEEEQUAD) ? TFmode : XFmode;
+  ieee128_float_type_node = make_node (REAL_TYPE);
+  TYPE_PRECISION (ieee128_float_type_node) = 128;
+  layout_type (ieee128_float_type_node);
+  SET_TYPE_MODE (ieee128_float_type_node, ieee128_mode);
+
+  ibm128_mode = (!TARGET_IEEEQUAD) ? TFmode : JFmode;
+  ibm128_float_type_node = make_node (REAL_TYPE);
+  TYPE_PRECISION (ibm128_float_type_node) = 128;
+  layout_type (ibm128_float_type_node);
+  SET_TYPE_MODE (ibm128_float_type_node, ibm128_mode);
+
+  if (TARGET_FLOAT128)
+    {
+      lang_hooks.types.register_builtin_type (ieee128_float_type_node,
+					      "__float128");
+
+      lang_hooks.types.register_builtin_type (ibm128_float_type_node,
+					      "__ibm128");
+    }
+
   /* Initialize the modes for builtin_function_type, mapping a machine mode to
      tree type node.  */
   builtin_mode_to_type[QImode][0] = integer_type_node;
@@ -13855,6 +13953,8 @@  rs6000_init_builtins (void)
   builtin_mode_to_type[TImode][1] = unsigned_intTI_type_node;
   builtin_mode_to_type[SFmode][0] = float_type_node;
   builtin_mode_to_type[DFmode][0] = double_type_node;
+  builtin_mode_to_type[XFmode][0] = ieee128_float_type_node;
+  builtin_mode_to_type[JFmode][0] = ibm128_float_type_node;
   builtin_mode_to_type[TFmode][0] = long_double_type_node;
   builtin_mode_to_type[DDmode][0] = dfloat64_type_node;
   builtin_mode_to_type[TDmode][0] = dfloat128_type_node;
@@ -15326,78 +15426,173 @@  rs6000_common_init_builtins (void)
     }
 }
 
+/* Set up AIX/Darwin/64-bit Linux quad floating point routines.  */
 static void
-rs6000_init_libfuncs (void)
+init_float128_ibm (enum machine_mode mode)
 {
-  if (!TARGET_IEEEQUAD)
-      /* AIX/Darwin/64-bit Linux quad floating point routines.  */
-    if (!TARGET_XL_COMPAT)
-      {
-	set_optab_libfunc (add_optab, TFmode, "__gcc_qadd");
-	set_optab_libfunc (sub_optab, TFmode, "__gcc_qsub");
-	set_optab_libfunc (smul_optab, TFmode, "__gcc_qmul");
-	set_optab_libfunc (sdiv_optab, TFmode, "__gcc_qdiv");
+  if (!TARGET_XL_COMPAT)
+    {
+      set_optab_libfunc (add_optab, mode, "__gcc_qadd");
+      set_optab_libfunc (sub_optab, mode, "__gcc_qsub");
+      set_optab_libfunc (smul_optab, mode, "__gcc_qmul");
+      set_optab_libfunc (sdiv_optab, mode, "__gcc_qdiv");
 
-	if (!(TARGET_HARD_FLOAT && (TARGET_FPRS || TARGET_E500_DOUBLE)))
-	  {
-	    set_optab_libfunc (neg_optab, TFmode, "__gcc_qneg");
-	    set_optab_libfunc (eq_optab, TFmode, "__gcc_qeq");
-	    set_optab_libfunc (ne_optab, TFmode, "__gcc_qne");
-	    set_optab_libfunc (gt_optab, TFmode, "__gcc_qgt");
-	    set_optab_libfunc (ge_optab, TFmode, "__gcc_qge");
-	    set_optab_libfunc (lt_optab, TFmode, "__gcc_qlt");
-	    set_optab_libfunc (le_optab, TFmode, "__gcc_qle");
-
-	    set_conv_libfunc (sext_optab, TFmode, SFmode, "__gcc_stoq");
-	    set_conv_libfunc (sext_optab, TFmode, DFmode, "__gcc_dtoq");
-	    set_conv_libfunc (trunc_optab, SFmode, TFmode, "__gcc_qtos");
-	    set_conv_libfunc (trunc_optab, DFmode, TFmode, "__gcc_qtod");
-	    set_conv_libfunc (sfix_optab, SImode, TFmode, "__gcc_qtoi");
-	    set_conv_libfunc (ufix_optab, SImode, TFmode, "__gcc_qtou");
-	    set_conv_libfunc (sfloat_optab, TFmode, SImode, "__gcc_itoq");
-	    set_conv_libfunc (ufloat_optab, TFmode, SImode, "__gcc_utoq");
-	  }
+      if (!(TARGET_HARD_FLOAT && (TARGET_FPRS || TARGET_E500_DOUBLE)))
+	{
+	  set_optab_libfunc (neg_optab, mode, "__gcc_qneg");
+	  set_optab_libfunc (eq_optab, mode, "__gcc_qeq");
+	  set_optab_libfunc (ne_optab, mode, "__gcc_qne");
+	  set_optab_libfunc (gt_optab, mode, "__gcc_qgt");
+	  set_optab_libfunc (ge_optab, mode, "__gcc_qge");
+	  set_optab_libfunc (lt_optab, mode, "__gcc_qlt");
+	  set_optab_libfunc (le_optab, mode, "__gcc_qle");
 
-	if (!(TARGET_HARD_FLOAT && TARGET_FPRS))
-	  set_optab_libfunc (unord_optab, TFmode, "__gcc_qunord");
-      }
-    else
-      {
-	set_optab_libfunc (add_optab, TFmode, "_xlqadd");
-	set_optab_libfunc (sub_optab, TFmode, "_xlqsub");
-	set_optab_libfunc (smul_optab, TFmode, "_xlqmul");
-	set_optab_libfunc (sdiv_optab, TFmode, "_xlqdiv");
-      }
+	  set_conv_libfunc (sext_optab, mode, SFmode, "__gcc_stoq");
+	  set_conv_libfunc (sext_optab, mode, DFmode, "__gcc_dtoq");
+	  set_conv_libfunc (trunc_optab, SFmode, mode, "__gcc_qtos");
+	  set_conv_libfunc (trunc_optab, DFmode, mode, "__gcc_qtod");
+	  set_conv_libfunc (sfix_optab, SImode, mode, "__gcc_qtoi");
+	  set_conv_libfunc (ufix_optab, SImode, mode, "__gcc_qtou");
+	  set_conv_libfunc (sfloat_optab, mode, SImode, "__gcc_itoq");
+	  set_conv_libfunc (ufloat_optab, mode, SImode, "__gcc_utoq");
+	}
+
+      if (!(TARGET_HARD_FLOAT && TARGET_FPRS))
+	set_optab_libfunc (unord_optab, mode, "__gcc_qunord");
+    }
   else
     {
-      /* 32-bit SVR4 quad floating point routines.  */
+      set_optab_libfunc (add_optab, mode, "_xlqadd");
+      set_optab_libfunc (sub_optab, mode, "_xlqsub");
+      set_optab_libfunc (smul_optab, mode, "_xlqmul");
+      set_optab_libfunc (sdiv_optab, mode, "_xlqdiv");
+    }
+}
 
-      set_optab_libfunc (add_optab, TFmode, "_q_add");
-      set_optab_libfunc (sub_optab, TFmode, "_q_sub");
-      set_optab_libfunc (neg_optab, TFmode, "_q_neg");
-      set_optab_libfunc (smul_optab, TFmode, "_q_mul");
-      set_optab_libfunc (sdiv_optab, TFmode, "_q_div");
+/* Set up IEEE 128-bit floating point routines.  Use different names if the
+   arguments can be passed in a vector register.  The historical PowerPC
+   implementation of IEEE 128-bit floating point used _q_<op> for the names, so
+   continue to use that if we can't pass IEEE 128-bit in a VSX vector register.
+
+   Add _vector to clarify that this function is called with the argument in a
+   vector register, and _fpr when we are not passing IEEE 128-bit in a vector
+   register.  */
+
+static void
+init_float128_ieee (enum machine_mode mode)
+{
+  if (FLOAT128_VECTOR_P (mode))
+    {
+      set_optab_libfunc (add_optab, mode, "__addtf3_vector");
+      set_optab_libfunc (sub_optab, mode, "__subtf3_vector");
+      set_optab_libfunc (neg_optab, mode, "__negtf2_vector");
+      set_optab_libfunc (smul_optab, mode, "__multf3_vector");
+      set_optab_libfunc (sdiv_optab, mode, "__divtf3_vector");
+      set_optab_libfunc (sqrt_optab, mode, "__sqrttf2_vector");
+
+      set_optab_libfunc (eq_optab, mode, "__eqtf2_vector");
+      set_optab_libfunc (ne_optab, mode, "__netf2_vector");
+      set_optab_libfunc (gt_optab, mode, "__gttf2_vector");
+      set_optab_libfunc (ge_optab, mode, "__getf2_vector");
+      set_optab_libfunc (lt_optab, mode, "__lttf2_vector");
+      set_optab_libfunc (le_optab, mode, "__letf2_vector");
+
+      set_conv_libfunc (sext_optab, mode, SFmode, "__extendsftf2_vector");
+      set_conv_libfunc (sext_optab, mode, DFmode, "__extenddftf2_vector");
+      set_conv_libfunc (trunc_optab, SFmode, mode, "__trunctfsf2_vector");
+      set_conv_libfunc (trunc_optab, DFmode, mode, "__trunctfdf2_vector");
+
+      set_conv_libfunc (sfix_optab, SImode, mode, "__fixtfsi_vector");
+      set_conv_libfunc (ufix_optab, SImode, mode, "__fixunstfsi_vector");
+      set_conv_libfunc (sfix_optab, DImode, mode, "__fixtfdi_vector");
+      set_conv_libfunc (ufix_optab, DImode, mode, "__fixunstfdi_vector");
+      set_conv_libfunc (sfix_optab, TImode, mode, "__fixtfti_vector");
+      set_conv_libfunc (ufix_optab, TImode, mode, "__fixunstfti_vector");
+
+      set_conv_libfunc (sfloat_optab, mode, SImode, "__floatsitf_vector");
+      set_conv_libfunc (ufloat_optab, mode, SImode, "__floatunssitf_vector");
+      set_conv_libfunc (sfloat_optab, mode, DImode, "__floatditf_vector");
+      set_conv_libfunc (ufloat_optab, mode, DImode, "__floatunsditf_vector");
+      set_conv_libfunc (sfloat_optab, mode, TImode, "__floattitf_vector");
+      set_conv_libfunc (ufloat_optab, mode, TImode, "__floatunstitf_vector");
+    }
+  else if (TARGET_FLOAT128)
+    {
+      set_optab_libfunc (add_optab, mode, "__addtf3_fpr");
+      set_optab_libfunc (sub_optab, mode, "__subtf3_fpr");
+      set_optab_libfunc (neg_optab, mode, "__negtf2_fpr");
+      set_optab_libfunc (smul_optab, mode, "__multf3_fpr");
+      set_optab_libfunc (sdiv_optab, mode, "__divtf3_fpr");
+      set_optab_libfunc (sqrt_optab, mode, "__sqrttf2_fpr");
+
+      set_optab_libfunc (eq_optab, mode, "__eqtf2_fpr");
+      set_optab_libfunc (ne_optab, mode, "__netf2_fpr");
+      set_optab_libfunc (gt_optab, mode, "__gttf2_fpr");
+      set_optab_libfunc (ge_optab, mode, "__getf2_fpr");
+      set_optab_libfunc (lt_optab, mode, "__lttf2_fpr");
+      set_optab_libfunc (le_optab, mode, "__letf2_fpr");
+
+      set_conv_libfunc (sext_optab, mode, SFmode, "__extendsftf2_fpr");
+      set_conv_libfunc (sext_optab, mode, DFmode, "__extenddftf2_fpr");
+      set_conv_libfunc (trunc_optab, SFmode, mode, "__trunctfsf2_fpr");
+      set_conv_libfunc (trunc_optab, DFmode, mode, "__trunctfdf2_fpr");
+
+      set_conv_libfunc (sfix_optab, SImode, mode, "__fixtfsi_fpr");
+      set_conv_libfunc (ufix_optab, SImode, mode, "__fixunstfsi_fpr");
+      set_conv_libfunc (sfix_optab, DImode, mode, "__fixtfdi_fpr");
+      set_conv_libfunc (ufix_optab, DImode, mode, "__fixunstfdi_fpr");
+      set_conv_libfunc (sfix_optab, TImode, mode, "__fixtfti_fpr");
+      set_conv_libfunc (ufix_optab, TImode, mode, "__fixunstfti_fpr");
+
+      set_conv_libfunc (sfloat_optab, mode, SImode, "__floatsitf_fpr");
+      set_conv_libfunc (ufloat_optab, mode, SImode, "__floatunssitf_fpr");
+      set_conv_libfunc (sfloat_optab, mode, DImode, "__floatditf_fpr");
+      set_conv_libfunc (ufloat_optab, mode, DImode, "__floatunsditf_fpr");
+      set_conv_libfunc (sfloat_optab, mode, TImode, "__floattitf_fpr");
+      set_conv_libfunc (ufloat_optab, mode, TImode, "__floatunstitf_fpr");
+    }
+  else
+    {
+      set_optab_libfunc (add_optab, mode, "_q_add");
+      set_optab_libfunc (sub_optab, mode, "_q_sub");
+      set_optab_libfunc (neg_optab, mode, "_q_neg");
+      set_optab_libfunc (smul_optab, mode, "_q_mul");
+      set_optab_libfunc (sdiv_optab, mode, "_q_div");
       if (TARGET_PPC_GPOPT)
-	set_optab_libfunc (sqrt_optab, TFmode, "_q_sqrt");
+	set_optab_libfunc (sqrt_optab, mode, "_q_sqrt");
 
-      set_optab_libfunc (eq_optab, TFmode, "_q_feq");
-      set_optab_libfunc (ne_optab, TFmode, "_q_fne");
-      set_optab_libfunc (gt_optab, TFmode, "_q_fgt");
-      set_optab_libfunc (ge_optab, TFmode, "_q_fge");
-      set_optab_libfunc (lt_optab, TFmode, "_q_flt");
-      set_optab_libfunc (le_optab, TFmode, "_q_fle");
-
-      set_conv_libfunc (sext_optab, TFmode, SFmode, "_q_stoq");
-      set_conv_libfunc (sext_optab, TFmode, DFmode, "_q_dtoq");
-      set_conv_libfunc (trunc_optab, SFmode, TFmode, "_q_qtos");
-      set_conv_libfunc (trunc_optab, DFmode, TFmode, "_q_qtod");
-      set_conv_libfunc (sfix_optab, SImode, TFmode, "_q_qtoi");
-      set_conv_libfunc (ufix_optab, SImode, TFmode, "_q_qtou");
-      set_conv_libfunc (sfloat_optab, TFmode, SImode, "_q_itoq");
-      set_conv_libfunc (ufloat_optab, TFmode, SImode, "_q_utoq");
+      set_optab_libfunc (eq_optab, mode, "_q_feq");
+      set_optab_libfunc (ne_optab, mode, "_q_fne");
+      set_optab_libfunc (gt_optab, mode, "_q_fgt");
+      set_optab_libfunc (ge_optab, mode, "_q_fge");
+      set_optab_libfunc (lt_optab, mode, "_q_flt");
+      set_optab_libfunc (le_optab, mode, "_q_fle");
+
+      set_conv_libfunc (sext_optab, mode, SFmode, "_q_stoq");
+      set_conv_libfunc (sext_optab, mode, DFmode, "_q_dtoq");
+      set_conv_libfunc (trunc_optab, SFmode, mode, "_q_qtos");
+      set_conv_libfunc (trunc_optab, DFmode, mode, "_q_qtod");
+      set_conv_libfunc (sfix_optab, SImode, mode, "_q_qtoi");
+      set_conv_libfunc (ufix_optab, SImode, mode, "_q_qtou");
+      set_conv_libfunc (sfloat_optab, mode, SImode, "_q_itoq");
+      set_conv_libfunc (ufloat_optab, mode, SImode, "_q_utoq");
     }
 }
 
+static void
+rs6000_init_libfuncs (void)
+{
+  /* IBM 128-bit (double-double) floating point routines.  */
+  init_float128_ibm (JFmode);
+  if (!TARGET_IEEEQUAD)
+    init_float128_ibm (TFmode);
+
+  /* IEEE 128-bit floating point routines.  */
+  init_float128_ieee (XFmode);
+  if (TARGET_IEEEQUAD)
+    init_float128_ieee (TFmode);
+}
+
 
 /* Expand a block clear operation, and return 1 if successful.  Return 0
    if we should let the compiler generate normal code.
@@ -17225,6 +17420,8 @@  rs6000_cannot_change_mode_class (enum ma
 	{
 	  unsigned to_nregs = hard_regno_nregs[FIRST_FPR_REGNO][to];
 	  unsigned from_nregs = hard_regno_nregs[FIRST_FPR_REGNO][from];
+	  bool to_float128_vector_p = FLOAT128_VECTOR_P (to);
+	  bool from_float128_vector_p = FLOAT128_VECTOR_P (from);
 
 	  /* Don't allow 64-bit types to overlap with 128-bit types that take a
 	     single register under VSX because the scalar part of the register
@@ -17233,14 +17430,18 @@  rs6000_cannot_change_mode_class (enum ma
 	     IEEE floating point can't overlap, and neither can small
 	     values.  */
 
-	  if (TARGET_IEEEQUAD && (to == TFmode || from == TFmode))
+	  if (to_float128_vector_p && from_float128_vector_p)
+	    return false;
+
+	  else if (to_float128_vector_p || from_float128_vector_p)
 	    return true;
 
 	  /* TDmode in floating-mode registers must always go into a register
 	     pair with the most significant word in the even-numbered register
 	     to match ISA requirements.  In little-endian mode, this does not
 	     match subreg numbering, so we cannot allow subregs.  */
-	  if (!BYTES_BIG_ENDIAN && (to == TDmode || from == TDmode))
+	  if (!BYTES_BIG_ENDIAN
+	      && (FLOAT128_2REG_P (to) || FLOAT128_2REG_P (from)))
 	    return true;
 
 	  if (from_size < 8 || to_size < 8)
@@ -17261,6 +17462,8 @@  rs6000_cannot_change_mode_class (enum ma
   if (TARGET_E500_DOUBLE
       && ((((to) == DFmode) + ((from) == DFmode)) == 1
 	  || (((to) == TFmode) + ((from) == TFmode)) == 1
+	  || (((to) == XFmode) + ((from) == XFmode)) == 1
+	  || (((to) == JFmode) + ((from) == JFmode)) == 1
 	  || (((to) == DDmode) + ((from) == DDmode)) == 1
 	  || (((to) == TDmode) + ((from) == TDmode)) == 1
 	  || (((to) == DImode) + ((from) == DImode)) == 1))
@@ -18257,7 +18460,7 @@  print_operand (FILE *file, rtx x, int co
 	/* Ugly hack because %y is overloaded.  */
 	if ((TARGET_SPE || TARGET_E500_DOUBLE)
 	    && (GET_MODE_SIZE (GET_MODE (x)) == 8
-		|| GET_MODE (x) == TFmode
+		|| FLOAT128_2REG_P (GET_MODE (x))
 		|| GET_MODE (x) == TImode
 		|| GET_MODE (x) == PTImode))
 	  {
@@ -18666,6 +18869,8 @@  rs6000_generate_compare (rtx cmp, enum m
 	      break;
 
 	    case TFmode:
+	    case XFmode:
+	    case JFmode:
 	      cmp = (flag_finite_math_only && !flag_trapping_math)
 		? gen_tsttfeq_gpr (compare_result, op0, op1)
 		: gen_cmptfeq_gpr (compare_result, op0, op1);
@@ -18693,6 +18898,8 @@  rs6000_generate_compare (rtx cmp, enum m
 	      break;
 
 	    case TFmode:
+	    case XFmode:
+	    case JFmode:
 	      cmp = (flag_finite_math_only && !flag_trapping_math)
 		? gen_tsttfgt_gpr (compare_result, op0, op1)
 		: gen_cmptfgt_gpr (compare_result, op0, op1);
@@ -18720,6 +18927,8 @@  rs6000_generate_compare (rtx cmp, enum m
 	      break;
 
 	    case TFmode:
+	    case XFmode:
+	    case JFmode:
 	      cmp = (flag_finite_math_only && !flag_trapping_math)
 		? gen_tsttflt_gpr (compare_result, op0, op1)
 		: gen_cmptflt_gpr (compare_result, op0, op1);
@@ -18757,6 +18966,8 @@  rs6000_generate_compare (rtx cmp, enum m
 	      break;
 
 	    case TFmode:
+	    case XFmode:
+	    case JFmode:
 	      cmp = (flag_finite_math_only && !flag_trapping_math)
 		? gen_tsttfeq_gpr (compare_result2, op0, op1)
 		: gen_cmptfeq_gpr (compare_result2, op0, op1);
@@ -18784,9 +18995,8 @@  rs6000_generate_compare (rtx cmp, enum m
       /* Generate XLC-compatible TFmode compare as PARALLEL with extra
 	 CLOBBERs to match cmptf_internal2 pattern.  */
       if (comp_mode == CCFPmode && TARGET_XL_COMPAT
-	  && GET_MODE (op0) == TFmode
-	  && !TARGET_IEEEQUAD
-	  && TARGET_HARD_FLOAT && TARGET_FPRS && TARGET_LONG_DOUBLE_128)
+	  && FLOAT128_IBM_P (GET_MODE (op0))
+	  && TARGET_HARD_FLOAT && TARGET_FPRS)
 	emit_insn (gen_rtx_PARALLEL (VOIDmode,
 	  gen_rtvec (10,
 		     gen_rtx_SET (VOIDmode,
@@ -20202,7 +20412,7 @@  rs6000_split_multireg_move (rtx dst, rtx
 	((TARGET_HARD_FLOAT && TARGET_DOUBLE_FLOAT) ? DFmode : SFmode);
   else if (ALTIVEC_REGNO_P (reg))
     reg_mode = V16QImode;
-  else if (TARGET_E500_DOUBLE && mode == TFmode)
+  else if (TARGET_E500_DOUBLE && FLOAT128_2REG_P (mode))
     reg_mode = DFmode;
   else
     reg_mode = word_mode;
@@ -21301,7 +21511,8 @@  spe_func_has_64bit_regs_p (void)
 
 	      if (SPE_VECTOR_MODE (mode))
 		return true;
-	      if (TARGET_E500_DOUBLE && (mode == DFmode || mode == TFmode))
+	      if (TARGET_E500_DOUBLE
+		  && (mode == DFmode || FLOAT128_2REG_P (mode)))
 		return true;
 	    }
 	}
@@ -24970,6 +25181,8 @@  rs6000_output_function_epilogue (FILE *f
 			case DDmode:
 			case TFmode:
 			case TDmode:
+			case XFmode:
+			case JFmode:
 			  bits = 0x3;
 			  break;
 
@@ -25447,7 +25660,8 @@  output_toc (FILE *file, rtx x, int label
      TOC, things we put here aren't actually in the TOC, so we can allow
      FP constants.  */
   if (GET_CODE (x) == CONST_DOUBLE &&
-      (GET_MODE (x) == TFmode || GET_MODE (x) == TDmode))
+      (GET_MODE (x) == TFmode || GET_MODE (x) == TDmode
+       || GET_MODE (x) == XFmode || GET_MODE (x) == JFmode))
     {
       REAL_VALUE_TYPE rv;
       long k[4];
@@ -28153,9 +28367,23 @@  rs6000_mangle_type (const_tree type)
   if (type == bool_int_type_node) return "U6__booli";
   if (type == bool_long_type_node) return "U6__booll";
 
-  /* Mangle IBM extended float long double as `g' (__float128) on
-     powerpc*-linux where long-double-64 previously was the default.  */
-  if (TYPE_MAIN_VARIANT (type) == long_double_type_node
+  /* For VSX systems, we are transitioning to supporting IEEE 128-bit floating
+     point.  Initially, users will have to use __float128 to get access to the
+     IEEE 128-bit floating point, and long double will remain the IBM
+     double-double format (same as the __ibm128 type).  At some point in the
+     future, long double may become the same as __float128.
+
+     AIX and really old powerpc*-linux systems default to 64-bit for the
+     long double type, and we will use the normal C++ mangling in this
+     case.  */
+
+  if (type == ibm128_float_type_node)
+    return "g";
+
+  if (type == ieee128_float_type_node)
+    return "e";
+
+  if (type == long_double_type_node
       && TARGET_ELF
       && TARGET_LONG_DOUBLE_128
       && !TARGET_IEEEQUAD)
@@ -29778,7 +30006,7 @@  rs6000_register_move_cost (enum machine_
 
   /* Moving between two similar registers is just one instruction.  */
   else if (reg_classes_intersect_p (to, from))
-    ret = (mode == TFmode || mode == TDmode) ? 4 : 2;
+    ret = (FLOAT128_2REG_P (mode)) ? 4 : 2;
 
   /* Everything else has to go through GENERAL_REGS.  */
   else
@@ -30807,7 +31035,7 @@  rs6000_function_value (const_tree valtyp
       int first_reg, n_regs, i;
       rtx par;
 
-      if (SCALAR_FLOAT_MODE_P (elt_mode))
+      if (scalar_float_not_vector_p (elt_mode))
 	{
 	  /* _Decimal128 must use even/odd register pairs.  */
 	  first_reg = (elt_mode == TDmode) ? FP_ARG_RETURN + 1 : FP_ARG_RETURN;
@@ -30872,7 +31100,7 @@  rs6000_function_value (const_tree valtyp
   if (DECIMAL_FLOAT_MODE_P (mode) && TARGET_HARD_FLOAT && TARGET_FPRS)
     /* _Decimal128 must use an even/odd register pair.  */
     regno = (mode == TDmode) ? FP_ARG_RETURN + 1 : FP_ARG_RETURN;
-  else if (SCALAR_FLOAT_TYPE_P (valtype) && TARGET_HARD_FLOAT && TARGET_FPRS
+  else if (scalar_float_not_vector_p (mode) && TARGET_HARD_FLOAT && TARGET_FPRS
 	   && ((TARGET_SINGLE_FLOAT && (mode == SFmode)) || TARGET_DOUBLE_FLOAT))
     regno = FP_ARG_RETURN;
   else if (TREE_CODE (valtype) == COMPLEX_TYPE
@@ -30881,13 +31109,13 @@  rs6000_function_value (const_tree valtyp
   /* VSX is a superset of Altivec and adds V2DImode/V2DFmode.  Since the same
      return register is used in both cases, and we won't see V2DImode/V2DFmode
      for pure altivec, combine the two cases.  */
-  else if (TREE_CODE (valtype) == VECTOR_TYPE
+  else if ((TREE_CODE (valtype) == VECTOR_TYPE || FLOAT128_VECTOR_P (mode))
 	   && TARGET_ALTIVEC && TARGET_ALTIVEC_ABI
 	   && ALTIVEC_OR_VSX_VECTOR_MODE (mode))
     regno = ALTIVEC_ARG_RETURN;
   else if (TARGET_E500_DOUBLE && TARGET_HARD_FLOAT
 	   && (mode == DFmode || mode == DCmode
-	       || mode == TFmode || mode == TCmode))
+	       || FLOAT128_IBM_P (mode) || mode == TCmode))
     return spe_build_register_parallel (mode, GP_ARG_RETURN);
   else
     regno = GP_ARG_RETURN;
@@ -30919,7 +31147,7 @@  rs6000_libcall_value (enum machine_mode 
   if (DECIMAL_FLOAT_MODE_P (mode) && TARGET_HARD_FLOAT && TARGET_FPRS)
     /* _Decimal128 must use an even/odd register pair.  */
     regno = (mode == TDmode) ? FP_ARG_RETURN + 1 : FP_ARG_RETURN;
-  else if (SCALAR_FLOAT_MODE_P (mode)
+  else if (scalar_float_not_vector_p (mode)
 	   && TARGET_HARD_FLOAT && TARGET_FPRS
            && ((TARGET_SINGLE_FLOAT && mode == SFmode) || TARGET_DOUBLE_FLOAT))
     regno = FP_ARG_RETURN;
@@ -30933,7 +31161,7 @@  rs6000_libcall_value (enum machine_mode 
     return rs6000_complex_function_value (mode);
   else if (TARGET_E500_DOUBLE && TARGET_HARD_FLOAT
 	   && (mode == DFmode || mode == DCmode
-	       || mode == TFmode || mode == TCmode))
+	       || FLOAT128_IBM_P (mode) || mode == TCmode))
     return spe_build_register_parallel (mode, GP_ARG_RETURN);
   else
     regno = GP_ARG_RETURN;
@@ -31235,6 +31463,7 @@  static struct rs6000_opt_mask const rs60
   { "crypto",			OPTION_MASK_CRYPTO,		false, true  },
   { "direct-move",		OPTION_MASK_DIRECT_MOVE,	false, true  },
   { "dlmzb",			OPTION_MASK_DLMZB,		false, true  },
+  { "float128",			OPTION_MASK_FLOAT128,		false, false },
   { "fprnd",			OPTION_MASK_FPRND,		false, true  },
   { "hard-dfp",			OPTION_MASK_DFP,		false, true  },
   { "htm",			OPTION_MASK_HTM,		false, true  },
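
The rs6000.c changes above lean on four helper predicates (FLOAT128_IEEE_P, FLOAT128_IBM_P, FLOAT128_VECTOR_P and FLOAT128_2REG_P) that the rs6000.h changes define. As a rough aid for reviewers, here is a standalone sketch of how those predicates partition the 128-bit modes. It is not part of the patch: the enum, the struct and the lower-case function names are hypothetical stand-ins for the GCC machine modes and the TARGET_VSX, TARGET_IEEEQUAD and TARGET_LONG_DOUBLE_128 switches, and the register counts are only my reading of the macros.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the machine modes the patch touches.  */
enum f128_mode { MODE_DF, MODE_TF, MODE_XF, MODE_JF, MODE_TD };

/* Hypothetical stand-in for the target switches the macros consult.  */
struct target_flags
{
  bool vsx;             /* models TARGET_VSX */
  bool ieeequad;        /* models TARGET_IEEEQUAD */
  bool long_double_128; /* models TARGET_LONG_DOUBLE_128 */
};

/* Mirrors FLOAT128_IEEE_P: XFmode always, TFmode only with IEEE quad.  */
bool
float128_ieee_p (enum f128_mode m, struct target_flags t)
{
  return m == MODE_XF || (m == MODE_TF && t.ieeequad && t.long_double_128);
}

/* Mirrors FLOAT128_IBM_P: JFmode always, TFmode only without IEEE quad.  */
bool
float128_ibm_p (enum f128_mode m, struct target_flags t)
{
  return m == MODE_JF || (m == MODE_TF && !t.ieeequad && t.long_double_128);
}

/* Mirrors FLOAT128_VECTOR_P: IEEE 128-bit lives in one VSX register.  */
bool
float128_vector_p (enum f128_mode m, struct target_flags t)
{
  return t.vsx && float128_ieee_p (m, t);
}

/* Mirrors FLOAT128_2REG_P: IBM double-double, _Decimal128, or IEEE
   128-bit without VSX, all of which need a register pair.  */
bool
float128_2reg_p (enum f128_mode m, struct target_flags t)
{
  return (float128_ibm_p (m, t)
	  || m == MODE_TD
	  || (!t.vsx && float128_ieee_p (m, t)));
}

/* Registers a value of mode M occupies under this model: 1 for a vector
   register, 2 for a register pair, 0 if neither predicate applies.  */
int
float128_arg_regs (enum f128_mode m, struct target_flags t)
{
  bool vec = float128_vector_p (m, t);
  bool pair = float128_2reg_p (m, t);

  /* The two classifications are meant to be mutually exclusive.  */
  assert (!(vec && pair));
  return vec ? 1 : pair ? 2 : 0;
}
```

Under this model, with VSX enabled and long double left as IBM double-double, XFmode comes out as a single vector register while TFmode and JFmode both come out as register pairs. Without VSX, XFmode falls back to a pair, which is the two-ABI situation the cover note asks about.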
Index: gcc/config/rs6000/rs6000.h
===================================================================
--- gcc/config/rs6000/rs6000.h	(.../svn+ssh://meissner@gcc.gnu.org/svn/gcc/trunk/gcc/config/rs6000)	(revision 209899)
+++ gcc/config/rs6000/rs6000.h	(.../gcc/config/rs6000)	(working copy)
@@ -402,6 +402,28 @@  extern const char *host_detect_local_cpu
 #define TARGET_DEBUG_TARGET	(rs6000_debug & MASK_DEBUG_TARGET)
 #define TARGET_DEBUG_BUILTIN	(rs6000_debug & MASK_DEBUG_BUILTIN)
 
+/* Helper macros for TFmode.  Quad floating point (TFmode) can be either IBM
+   long double format that uses a pair of doubles, or IEEE 128-bit floating
+   point.  XFmode was added as a way to represent IEEE 128-bit floating point,
+   even if the default for long double is the IBM long double format.
+   Similarly, JFmode is a way to represent the IBM long double format.  */
+#define FLOAT128_IEEE_P(MODE)						\
+  (((MODE) == XFmode)							\
+   || (((MODE) == TFmode) && TARGET_IEEEQUAD && TARGET_LONG_DOUBLE_128))
+
+#define FLOAT128_IBM_P(MODE)						\
+  (((MODE) == JFmode)							\
+   || (((MODE) == TFmode) && !TARGET_IEEEQUAD && TARGET_LONG_DOUBLE_128))
+
+/* Helper macros to say whether a 128-bit floating point type can go in a
+   single vector register, or whether it needs paired scalar values.  */
+#define FLOAT128_VECTOR_P(MODE) (TARGET_VSX && FLOAT128_IEEE_P (MODE))
+
+#define FLOAT128_2REG_P(MODE)						\
+  (FLOAT128_IBM_P (MODE)						\
+   || ((MODE) == TDmode)						\
+   || (!TARGET_VSX && FLOAT128_IEEE_P (MODE)))
+
 /* Describe the vector unit used for arithmetic operations.  */
 extern enum rs6000_vector rs6000_vector_unit[];
 
@@ -559,6 +581,7 @@  extern int rs6000_vector_align[];
 #define MASK_DIRECT_MOVE		OPTION_MASK_DIRECT_MOVE
 #define MASK_DLMZB			OPTION_MASK_DLMZB
 #define MASK_EABI			OPTION_MASK_EABI
+#define MASK_FLOAT128			OPTION_MASK_FLOAT128
 #define MASK_FPRND			OPTION_MASK_FPRND
 #define MASK_P8_FUSION			OPTION_MASK_P8_FUSION
 #define MASK_HARD_FLOAT			OPTION_MASK_HARD_FLOAT
@@ -895,7 +918,7 @@  enum data_align { align_abi, align_opt, 
    aligned to 4 or 8 bytes.  */
 #define SLOW_UNALIGNED_ACCESS(MODE, ALIGN)				\
   (STRICT_ALIGNMENT							\
-   || (((MODE) == SFmode || (MODE) == DFmode || (MODE) == TFmode	\
+   || (((MODE) == SFmode || (MODE) == DFmode || FLOAT128_IBM_P (MODE)	\
 	|| (MODE) == SDmode || (MODE) == DDmode || (MODE) == TDmode)	\
        && (ALIGN) < 32)							\
    || (VECTOR_MODE_P ((MODE)) && (((int)(ALIGN)) < VECTOR_ALIGN (MODE))))
@@ -1173,7 +1196,7 @@  enum data_align { align_abi, align_opt, 
    && ((MODE) == VOIDmode || ALTIVEC_OR_VSX_VECTOR_MODE (MODE))		\
    && FP_REGNO_P (REGNO)						\
    ? V2DFmode								\
-   : ((MODE) == TFmode && FP_REGNO_P (REGNO))				\
+   : (FLOAT128_2REG_P (MODE) && FP_REGNO_P (REGNO))			\
    ? DFmode								\
    : ((MODE) == TDmode && FP_REGNO_P (REGNO))				\
    ? DImode								\
@@ -1185,17 +1208,19 @@  enum data_align { align_abi, align_opt, 
      && INT_REGNO_P (REGNO)) ? 1 : 0)					\
    || (TARGET_VSX && FP_REGNO_P (REGNO)					\
        && GET_MODE_SIZE (MODE) > 8 && ((MODE) != TDmode) 		\
-       && ((MODE) != TFmode)))
+       && !FLOAT128_IBM_P (MODE)))
+
+#define VSX_VECTOR_MODE(MODE) ((MODE) == V4SFmode || (MODE) == V2DFmode)
+
+/* XFmode and possibly TFmode (i.e. IEEE 128-bit floating point) are not really
+   vectors, but we want to treat them as vectors for moves and such.  */
 
-#define VSX_VECTOR_MODE(MODE)		\
-	 ((MODE) == V4SFmode		\
-	  || (MODE) == V2DFmode)	\
-
-#define ALTIVEC_VECTOR_MODE(MODE)	\
-	 ((MODE) == V16QImode		\
-	  || (MODE) == V8HImode		\
-	  || (MODE) == V4SFmode		\
-	  || (MODE) == V4SImode)
+#define ALTIVEC_VECTOR_MODE(MODE)					\
+  ((MODE) == V16QImode							\
+   || (MODE) == V8HImode						\
+   || (MODE) == V4SFmode						\
+   || (MODE) == V4SImode						\
+   || FLOAT128_VECTOR_P (MODE))
 
 #define ALTIVEC_OR_VSX_VECTOR_MODE(MODE)				\
   (ALTIVEC_VECTOR_MODE (MODE) || VSX_VECTOR_MODE (MODE)			\
@@ -1222,12 +1247,19 @@  enum data_align { align_abi, align_opt, 
 
    PTImode cannot tie with other modes because PTImode is restricted to even
    GPR registers, and TImode can go in any GPR as well as VSX registers (PR
-   57744).  */
+   57744).
+
+   The Altivec/VSX vector tests come before the scalar float mode tests so that
+   IEEE 128-bit floating point on VSX systems ties with other vectors.  */
 #define MODES_TIEABLE_P(MODE1, MODE2)		\
   ((MODE1) == PTImode				\
    ? (MODE2) == PTImode				\
    : (MODE2) == PTImode				\
    ? 0						\
+   : ALTIVEC_OR_VSX_VECTOR_MODE (MODE1)		\
+   ? ALTIVEC_OR_VSX_VECTOR_MODE (MODE2)		\
+   : ALTIVEC_OR_VSX_VECTOR_MODE (MODE2)		\
+   ? 0						\
    : SCALAR_FLOAT_MODE_P (MODE1)		\
    ? SCALAR_FLOAT_MODE_P (MODE2)		\
    : SCALAR_FLOAT_MODE_P (MODE2)		\
@@ -1240,10 +1272,6 @@  enum data_align { align_abi, align_opt, 
    ? SPE_VECTOR_MODE (MODE2)			\
    : SPE_VECTOR_MODE (MODE2)			\
    ? 0						\
-   : ALTIVEC_OR_VSX_VECTOR_MODE (MODE1)		\
-   ? ALTIVEC_OR_VSX_VECTOR_MODE (MODE2)		\
-   : ALTIVEC_OR_VSX_VECTOR_MODE (MODE2)		\
-   ? 0						\
    : 1)
 
 /* Post-reload, we can't use any new AltiVec registers, as we already
@@ -1733,6 +1761,7 @@  typedef struct rs6000_args
 				   GPR space (darwin64) */
   int named;			/* false for varargs params */
   int escapes;			/* if function visible outside tu */
+  int libcall;			/* true for compiler-generated calls */
 } CUMULATIVE_ARGS;
 
 /* Initialize a variable CUM of type CUMULATIVE_ARGS
@@ -2628,6 +2657,8 @@  enum rs6000_builtin_type_index
   RS6000_BTI_dfloat64,		 /* dfloat64_type_node */
   RS6000_BTI_dfloat128,		 /* dfloat128_type_node */
   RS6000_BTI_void,	         /* void_type_node */
+  RS6000_BTI_ieee128_float,	 /* IEEE 128-bit floating point */
+  RS6000_BTI_ibm128_float,	 /* IBM 128-bit double-double floating point */
   RS6000_BTI_MAX
 };
 
@@ -2682,6 +2713,8 @@  enum rs6000_builtin_type_index
 #define dfloat64_type_internal_node	 (rs6000_builtin_types[RS6000_BTI_dfloat64])
 #define dfloat128_type_internal_node	 (rs6000_builtin_types[RS6000_BTI_dfloat128])
 #define void_type_internal_node		 (rs6000_builtin_types[RS6000_BTI_void])
+#define ieee128_float_type_node		 (rs6000_builtin_types[RS6000_BTI_ieee128_float])
+#define ibm128_float_type_node		 (rs6000_builtin_types[RS6000_BTI_ibm128_float])
 
 extern GTY(()) tree rs6000_builtin_types[RS6000_BTI_MAX];
 extern GTY(()) tree rs6000_builtin_decls[RS6000_BUILTIN_COUNT];
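
To make the mode classification in the rs6000.h changes above easier to review, here is a minimal standalone sketch of how FLOAT128_IEEE_P, FLOAT128_IBM_P, FLOAT128_VECTOR_P and FLOAT128_2REG_P partition the 128-bit modes. It is not part of the patch: the demo_* names, the enum and the struct are hypothetical stand-ins for GCC's machine modes and target flags, and only the boolean logic is copied from the macros.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the machine modes named in the patch.  */
enum demo_mode
{
  DEMO_DFmode, DEMO_TFmode, DEMO_TDmode, DEMO_XFmode, DEMO_JFmode
};

/* Hypothetical stand-in for the target flags the macros test.  */
struct demo_target
{
  bool vsx;		/* models TARGET_VSX */
  bool ieeequad;	/* models TARGET_IEEEQUAD */
  bool long_double_128;	/* models TARGET_LONG_DOUBLE_128 */
};

/* Two example configurations: IBM long double with and without VSX.  */
const struct demo_target demo_vsx = { true, false, true };
const struct demo_target demo_novsx = { false, false, true };

/* Mirrors FLOAT128_IEEE_P.  */
bool
demo_float128_ieee_p (enum demo_mode m, const struct demo_target *t)
{
  return m == DEMO_XFmode
	 || (m == DEMO_TFmode && t->ieeequad && t->long_double_128);
}

/* Mirrors FLOAT128_IBM_P.  */
bool
demo_float128_ibm_p (enum demo_mode m, const struct demo_target *t)
{
  return m == DEMO_JFmode
	 || (m == DEMO_TFmode && !t->ieeequad && t->long_double_128);
}

/* Mirrors FLOAT128_VECTOR_P: IEEE 128-bit lives in one vector register.  */
bool
demo_float128_vector_p (enum demo_mode m, const struct demo_target *t)
{
  return t->vsx && demo_float128_ieee_p (m, t);
}

/* Mirrors FLOAT128_2REG_P: the value needs a pair of scalar registers.  */
bool
demo_float128_2reg_p (enum demo_mode m, const struct demo_target *t)
{
  return demo_float128_ibm_p (m, t)
	 || m == DEMO_TDmode
	 || (!t->vsx && demo_float128_ieee_p (m, t));
}

/* Small self-check of the cases discussed in the patch description.  */
void
demo_self_check (void)
{
  assert (demo_float128_vector_p (DEMO_XFmode, &demo_vsx));
  assert (!demo_float128_2reg_p (DEMO_XFmode, &demo_vsx));
  assert (demo_float128_2reg_p (DEMO_XFmode, &demo_novsx));
  assert (demo_float128_2reg_p (DEMO_TFmode, &demo_vsx));
}
```

With VSX enabled and the default IBM long double, the XFmode stand-in is classified as a vector mode while the TFmode, JFmode and TDmode stand-ins all report as two-register modes. Without VSX, the XFmode stand-in falls back to the two-register case, which is the split ABI raised in the questions at the top of this message.
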
Index: gcc/config/rs6000/altivec.md
===================================================================
--- gcc/config/rs6000/altivec.md	(.../svn+ssh://meissner@gcc.gnu.org/svn/gcc/trunk/gcc/config/rs6000)	(revision 209899)
+++ gcc/config/rs6000/altivec.md	(.../gcc/config/rs6000)	(working copy)
@@ -168,10 +168,27 @@  (define_mode_iterator VF [V4SF])
 (define_mode_iterator V [V4SI V8HI V16QI V4SF])
 ;; Vec modes for move/logical/permute ops, include vector types for move not
 ;; otherwise handled by altivec (v2df, v2di, ti)
-(define_mode_iterator VM [V4SI V8HI V16QI V4SF V2DF V2DI V1TI TI])
+(define_mode_iterator VM [V4SI
+			  V8HI
+			  V16QI
+			  V4SF
+			  V2DF
+			  V2DI
+			  V1TI
+			  TI
+			  (XF "FLOAT128_VECTOR_P (XFmode)")
+			  (TF "FLOAT128_VECTOR_P (TFmode)")])
 
 ;; Like VM, except don't do TImode
-(define_mode_iterator VM2 [V4SI V8HI V16QI V4SF V2DF V2DI V1TI])
+(define_mode_iterator VM2 [V4SI
+			   V8HI
+			   V16QI
+			   V4SF
+			   V2DF
+			   V2DI
+			   V1TI
+			   (XF "FLOAT128_VECTOR_P (XFmode)")
+			   (TF "FLOAT128_VECTOR_P (TFmode)")])
 
 (define_mode_attr VI_char [(V2DI "d") (V4SI "w") (V8HI "h") (V16QI "b")])
 (define_mode_attr VI_scalar [(V2DI "DI") (V4SI "SI") (V8HI "HI") (V16QI "QI")])
Index: gcc/config/rs6000/rs6000.md
===================================================================
--- gcc/config/rs6000/rs6000.md	(.../svn+ssh://meissner@gcc.gnu.org/svn/gcc/trunk/gcc/config/rs6000)	(revision 209899)
+++ gcc/config/rs6000/rs6000.md	(.../gcc/config/rs6000)	(working copy)
@@ -289,7 +289,9 @@  (define_mode_iterator FMOVE32 [SF SD])
 (define_mode_iterator FMOVE64 [DF DD])
 (define_mode_iterator FMOVE64X [DI DF DD])
 (define_mode_iterator FMOVE128 [(TF "!TARGET_IEEEQUAD && TARGET_LONG_DOUBLE_128")
-				(TD "TARGET_HARD_FLOAT && TARGET_FPRS")])
+				(TD "TARGET_HARD_FLOAT && TARGET_FPRS")
+				(XF "TARGET_HARD_FLOAT && TARGET_FPRS")
+				(JF "TARGET_HARD_FLOAT && TARGET_FPRS")])
 
 ; Iterators for 128 bit types for direct move
 (define_mode_iterator FMOVE128_GPR [(TI    "TARGET_VSX_TIMODE")
@@ -299,7 +301,9 @@  (define_mode_iterator FMOVE128_GPR [(TI 
 				    (V4SF  "")
 				    (V2DI  "")
 				    (V2DF  "")
-				    (V1TI  "")])
+				    (V1TI  "")
+				    (XF    "")
+				    (TF    "")])
 
 ; Whether a floating point move is ok, don't allow SD without hardware FP
 (define_mode_attr fmove_ok [(SF "")
@@ -9594,6 +9598,7 @@  (define_insn_and_split "*mov<mode>_64bit
   [(set (match_operand:FMOVE128 0 "nonimmediate_operand" "=m,d,d,Y,r,r,r,wm")
 	(match_operand:FMOVE128 1 "input_operand" "d,m,d,r,YGHF,r,wm,r"))]
   "TARGET_HARD_FLOAT && TARGET_FPRS && TARGET_POWERPC64
+   && FLOAT128_2REG_P (<MODE>mode)
    && (<MODE>mode != TDmode || WORDS_BIG_ENDIAN)
    && (gpc_reg_operand (operands[0], <MODE>mode)
        || gpc_reg_operand (operands[1], <MODE>mode))"
@@ -9619,6 +9624,9 @@  (define_insn_and_split "*mov<mode>_32bit
   [(set (match_operand:FMOVE128 0 "nonimmediate_operand" "=m,d,d,Y,r,r")
 	(match_operand:FMOVE128 1 "input_operand" "d,m,d,r,YGHF,r"))]
   "TARGET_HARD_FLOAT && TARGET_FPRS && !TARGET_POWERPC64
+   && (FLOAT128_2REG_P (<MODE>mode)
+       || int_reg_operand_not_pseudo (operands[0], <MODE>mode)
+       || int_reg_operand_not_pseudo (operands[1], <MODE>mode))
    && (gpc_reg_operand (operands[0], <MODE>mode)
        || gpc_reg_operand (operands[1], <MODE>mode))"
   "#"
@@ -15795,7 +15803,10 @@  (define_insn "div<div_extend>_<mode>"
 ;; Pack/unpack 128-bit floating point types that take 2 scalar registers
 
 ; Type of the 64-bit part when packing/unpacking 128-bit floating point types
-(define_mode_attr FP128_64 [(TF "DF") (TD "DI")])
+(define_mode_attr FP128_64 [(TF "DF")
+			    (TD "DI")
+			    (XF "DI")
+			    (JF "DF")])
 
 (define_expand "unpack<mode>"
   [(set (match_operand:<FP128_64> 0 "nonimmediate_operand" "")
@@ -15803,27 +15814,29 @@  (define_expand "unpack<mode>"
 	 [(match_operand:FMOVE128 1 "register_operand" "")
 	  (match_operand:QI 2 "const_0_to_1_operand" "")]
 	 UNSPEC_UNPACK_128BIT))]
-  ""
+  "FLOAT128_2REG_P (<MODE>mode)"
   "")
 
 ;; The Advance Toolchain 7.0-3 added private builtins: __builtin_longdouble_dw0
 ;; and __builtin_longdouble_dw1 to optimize glibc.  Add support for these
 ;; builtins here.
 
-(define_expand "unpacktf_0"
+(define_mode_iterator TF_JF [TF JF])
+
+(define_expand "unpack<mode>_0"
   [(set (match_operand:DF 0 "nonimmediate_operand" "")
-	(unspec:DF [(match_operand:TF 1 "register_operand" "")
+	(unspec:DF [(match_operand:TF_JF 1 "register_operand" "")
 		    (const_int 0)]
 	 UNSPEC_UNPACK_128BIT))]
-  ""
+  "FLOAT128_2REG_P (<MODE>mode)"
   "")
 
-(define_expand "unpacktf_1"
+(define_expand "unpack<mode>_1"
   [(set (match_operand:DF 0 "nonimmediate_operand" "")
-	(unspec:DF [(match_operand:TF 1 "register_operand" "")
+	(unspec:DF [(match_operand:TF_JF 1 "register_operand" "")
 		    (const_int 1)]
 	 UNSPEC_UNPACK_128BIT))]
-  ""
+  "FLOAT128_2REG_P (<MODE>mode)"
   "")
 
 (define_insn_and_split "unpack<mode>_dm"
@@ -15832,7 +15845,7 @@  (define_insn_and_split "unpack<mode>_dm"
 	 [(match_operand:FMOVE128 1 "register_operand" "d,d,r,d,r")
 	  (match_operand:QI 2 "const_0_to_1_operand" "i,i,i,i,i")]
 	 UNSPEC_UNPACK_128BIT))]
-  "TARGET_POWERPC64 && TARGET_DIRECT_MOVE"
+  "TARGET_POWERPC64 && TARGET_DIRECT_MOVE && FLOAT128_2REG_P (<MODE>mode)"
   "#"
   "&& reload_completed"
   [(set (match_dup 0) (match_dup 3))]
@@ -15856,7 +15869,7 @@  (define_insn_and_split "unpack<mode>_nod
 	 [(match_operand:FMOVE128 1 "register_operand" "d,d")
 	  (match_operand:QI 2 "const_0_to_1_operand" "i,i")]
 	 UNSPEC_UNPACK_128BIT))]
-  "!TARGET_POWERPC64 || !TARGET_DIRECT_MOVE"
+  "(!TARGET_POWERPC64 || !TARGET_DIRECT_MOVE) && FLOAT128_2REG_P (<MODE>mode)"
   "#"
   "&& reload_completed"
   [(set (match_dup 0) (match_dup 3))]
@@ -15880,7 +15893,7 @@  (define_insn_and_split "pack<mode>"
 	 [(match_operand:<FP128_64> 1 "register_operand" "0,d")
 	  (match_operand:<FP128_64> 2 "register_operand" "d,d")]
 	 UNSPEC_PACK_128BIT))]
-  ""
+  "FLOAT128_2REG_P (<MODE>mode)"
   "@
    fmr %L0,%2
    #"
Index: gcc/config/rs6000/vector.md
===================================================================
--- gcc/config/rs6000/vector.md	(.../svn+ssh://meissner@gcc.gnu.org/svn/gcc/trunk/gcc/config/rs6000)	(revision 209899)
+++ gcc/config/rs6000/vector.md	(.../gcc/config/rs6000)	(working copy)
@@ -36,7 +36,7 @@  (define_mode_iterator VEC_A [V16QI V8HI 
 (define_mode_iterator VEC_K [V16QI V8HI V4SI V4SF])
 
 ;; Vector logical modes
-(define_mode_iterator VEC_L [V16QI V8HI V4SI V2DI V4SF V2DF V1TI TI])
+(define_mode_iterator VEC_L [V16QI V8HI V4SI V2DI V4SF V2DF V1TI TI XF])
 
 ;; Vector modes for moves.  Don't do TImode here.
 (define_mode_iterator VEC_M [V16QI V8HI V4SI V2DI V4SF V2DF V1TI])
@@ -55,7 +55,7 @@  (define_mode_iterator VEC_64 [V2DI V2DF]
 
 ;; Vector reload iterator
 (define_mode_iterator VEC_R [V16QI V8HI V4SI V2DI V4SF V2DF V1TI
-			     SF SD SI DF DD DI TI])
+			     SF SD SI DF DD DI TI XF])
 
 ;; Base type from vector mode
 (define_mode_attr VEC_base [(V16QI "QI")
Index: gcc/config/rs6000/rs6000-modes.def
===================================================================
--- gcc/config/rs6000/rs6000-modes.def	(.../svn+ssh://meissner@gcc.gnu.org/svn/gcc/trunk/gcc/config/rs6000)	(revision 209899)
+++ gcc/config/rs6000/rs6000-modes.def	(.../gcc/config/rs6000)	(working copy)
@@ -47,3 +47,17 @@  VECTOR_MODES (FLOAT, 32);     /*       V
    for quad memory atomic operations to force getting an even/odd register
    combination.  */
 PARTIAL_INT_MODE (TI, 128, PTI);
+
+/* These modes really aren't fractional float modes, but declaring them this
+   way prevents the normal mode lookup routines from using them.  To use them,
+   you must use explicit keywords.  */
+
+/* IEEE 128-bit floating point.  */
+FRACTIONAL_FLOAT_MODE (XF, 80, 16, ieee_quad_format);
+ADJUST_BYTESIZE  (XF, 16);
+ADJUST_ALIGNMENT (XF, 16);
+
+/* IBM double-double 128-bit floating point.  */
+FRACTIONAL_FLOAT_MODE (JF, 80, 16, ibm_extended_format);
+ADJUST_BYTESIZE  (JF, 16);
+ADJUST_ALIGNMENT (JF, 16);
Index: gcc/config/rs6000/vsx.md
===================================================================
--- gcc/config/rs6000/vsx.md	(.../svn+ssh://meissner@gcc.gnu.org/svn/gcc/trunk/gcc/config/rs6000)	(revision 209899)
+++ gcc/config/rs6000/vsx.md	(.../gcc/config/rs6000)	(working copy)
@@ -34,11 +34,28 @@  (define_mode_iterator VSX_DF [V2DF DF])
 (define_mode_iterator VSX_F [V4SF V2DF])
 
 ;; Iterator for logical types supported by VSX
-(define_mode_iterator VSX_L [V16QI V8HI V4SI V2DI V4SF V2DF V1TI TI])
+(define_mode_iterator VSX_L [V16QI
+			     V8HI
+			     V4SI
+			     V2DI
+			     V4SF
+			     V2DF
+			     V1TI
+			     TI
+			     (XF	"FLOAT128_VECTOR_P (XFmode)")
+			     (TF	"FLOAT128_VECTOR_P (TFmode)")])
 
 ;; Iterator for memory move.  Handle TImode specially to allow
 ;; it to use gprs as well as vsx registers.
-(define_mode_iterator VSX_M [V16QI V8HI V4SI V2DI V4SF V2DF V1TI])
+(define_mode_iterator VSX_M [V16QI
+			     V8HI
+			     V4SI
+			     V2DI
+			     V4SF
+			     V2DF
+			     V1TI
+			     (XF	"FLOAT128_VECTOR_P (XFmode)")
+			     (TF	"FLOAT128_VECTOR_P (TFmode)")])
 
 (define_mode_iterator VSX_M2 [V16QI
 			      V8HI
@@ -47,6 +64,8 @@  (define_mode_iterator VSX_M2 [V16QI
 			      V4SF
 			      V2DF
 			      V1TI
+			      (XF	"FLOAT128_VECTOR_P (XFmode)")
+			      (TF	"FLOAT128_VECTOR_P (TFmode)")
 			      (TI	"TARGET_VSX_TIMODE")])
 
 ;; Map into the appropriate load/store name based on the type
Index: gcc/doc/extend.texi
===================================================================
--- gcc/doc/extend.texi	(.../svn+ssh://meissner@gcc.gnu.org/svn/gcc/trunk/gcc/doc)	(revision 209899)
+++ gcc/doc/extend.texi	(.../gcc/doc)	(working copy)
@@ -945,6 +945,7 @@  examine and set these two fictitious var
 @cindex additional floating types
 @cindex @code{__float80} data type
 @cindex @code{__float128} data type
+@cindex @code{__ibm128} data type
 @cindex @code{w} floating point suffix
 @cindex @code{q} floating point suffix
 @cindex @code{W} floating point suffix
@@ -969,7 +970,13 @@  typedef _Complex float __attribute__((mo
 
 Not all targets support additional floating-point types.  @code{__float80}
 and @code{__float128} types are supported on i386, x86_64 and IA-64 targets.
-The @code{__float128} type is supported on hppa HP-UX targets.
+The @code{__float128} type is supported on hppa HP-UX.
+
+On PowerPC systems that support the vector scalar extensions (VSX),
+@code{__float128} supports the IEEE 128-bit extended format, and
+@code{__ibm128} supports the 128-bit format that uses a pair of
+@code{double} values to provide the extended precision.  The default
+for @code{long double} is @code{__ibm128}.
 
 @node Half-Precision
 @section Half-Precision Floating Point
@@ -12793,6 +12800,8 @@  double __builtin_unpack_longdouble (long
 double __builtin_longdouble_dw0 (long double);
 double __builtin_longdouble_dw1 (long double);
 long double __builtin_pack_longdouble (double, double);
+double __builtin_unpack_ibm128 (__ibm128, int);
+__ibm128 __builtin_pack_ibm128 (double, double);
 @end smallexample
 
 The @code{vec_rsqrt}, @code{__builtin_rsqrt}, and
Index: gcc/doc/invoke.texi
===================================================================
--- gcc/doc/invoke.texi	(.../svn+ssh://meissner@gcc.gnu.org/svn/gcc/trunk/gcc/doc)	(revision 209899)
+++ gcc/doc/invoke.texi	(.../gcc/doc)	(working copy)
@@ -923,7 +923,8 @@  See RS/6000 and PowerPC Options.
 -mcrypto -mno-crypto -mdirect-move -mno-direct-move @gol
 -mquad-memory -mno-quad-memory @gol
 -mquad-memory-atomic -mno-quad-memory-atomic @gol
--mcompat-align-parm -mno-compat-align-parm}
+-mcompat-align-parm -mno-compat-align-parm @gol
+-mfloat128 -mno-float128}
 
 @emph{RX Options}
 @gccoptlist{-m64bit-doubles  -m32bit-doubles  -fpu  -nofpu@gol
@@ -19910,6 +19911,13 @@  that is compatible with functions compil
 GCC.
 
 The @option{-mno-compat-align-parm} option is the default.
+
+@item -mfloat128
+@itemx -mno-float128
+@opindex mfloat128
+Enable (do not enable) the types @code{__float128} and @code{__ibm128}
+which give explicit access to IEEE 128-bit and IBM double-double
+128-bit floating point types.
 @end table
 
 @node RX Options
Index: libgcc/config/rs6000/float128-vsx.h
===================================================================
--- libgcc/config/rs6000/float128-vsx.h	(.../svn+ssh://meissner@gcc.gnu.org/svn/gcc/trunk/libgcc)	(revision 0)
+++ libgcc/config/rs6000/float128-vsx.h	(.../libgcc)	(revision 209914)
@@ -0,0 +1,32 @@ 
+/* Map __<func>tf<n> to the name used for __float128 when used on a VSX system.
+   We compile two versions, one that passes __float128 in a vector register,
+   and the other that expects it to be a pair of scalar FPR registers.  */
+
+#define __addtf3	__addtf3_vector
+#define __subtf3	__subtf3_vector
+#define __multf3	__multf3_vector
+#define __divtf3	__divtf3_vector
+#define __eqtf2		__eqtf2_vector
+#define __netf2		__netf2_vector
+#define __getf2		__getf2_vector
+#define __gttf2		__gttf2_vector
+#define __letf2		__letf2_vector
+#define __lttf2		__lttf2_vector
+#define __unordtf2	__unordtf2_vector
+#define __negtf2	__negtf2_vector
+#define __extenddftf2	__extenddftf2_vector
+#define __extendsftf2	__extendsftf2_vector
+#define __trunctfdf2	__trunctfdf2_vector
+#define __trunctfsf2	__trunctfsf2_vector
+#define __fixtfsi	__fixtfsi_vector
+#define __fixtfdi	__fixtfdi_vector
+#define __fixtfti	__fixtfti_vector
+#define __fixunstfsi	__fixunstfsi_vector
+#define __fixunstfdi	__fixunstfdi_vector
+#define __fixunstfti	__fixunstfti_vector
+#define __floatsitf	__floatsitf_vector
+#define __floatditf	__floatditf_vector
+#define __floattitf	__floattitf_vector
+#define __floatunsitf	__floatunsitf_vector
+#define __floatunditf	__floatunditf_vector
+#define __floatuntitf	__floatuntitf_vector
Index: libgcc/config/rs6000/t-float128
===================================================================
--- libgcc/config/rs6000/t-float128	(.../svn+ssh://meissner@gcc.gnu.org/svn/gcc/trunk/libgcc)	(revision 0)
+++ libgcc/config/rs6000/t-float128	(.../libgcc)	(revision 209914)
@@ -0,0 +1,51 @@ 
+# Support for adding __float128 to the powerpc
+
+# Due to the desire to pass __float128 in a vector register under VSX, we
+# compile two sets of support functions for __float128: one compiled for VSX
+# and newer systems, and the other compiled without VSX.
+
+fp128_func		 = addtf3 subtf3 multf3 divtf3 negtf2 \
+			   eqtf2 getf2 letf2 unordtf2 \
+			   extenddftf2 extendsftf2 trunctfdf2 trunctfsf2 \
+			   fixtfsi fixtfdi fixunstfsi fixunstfdi \
+			   floatsitf floatditf floatunsitf floatunditf
+
+fp128_vsx_src_list	:= $(foreach f,$(fp128_func),$(f)-vsx.c)
+fp128_vsx_objects_base	 = $(basename $(notdir $(fp128_vsx_src_list)))
+fp128_vsx_objects	 = $(addsuffix $(objext),$(fp128_vsx_objects_base))
+
+$(fp128_vsx_src_list):
+	@echo "Create $@"; \
+	tf=$$(echo "$@" | sed -e 's/-vsx//'); \
+	echo '#include "config/rs6000/float128-vsx.h"' > $@; \
+	echo '#include "soft-fp/'$${tf}'"' >> $@
+
+fp128_novsx_src_list	 := $(foreach f,$(fp128_func),$(f)-novsx.c)
+fp128_novsx_objects_base  = $(basename $(notdir $(fp128_novsx_src_list)))
+fp128_novsx_objects	  = $(addsuffix $(objext),$(fp128_novsx_objects_base))
+
+$(fp128_novsx_src_list):
+	@echo "Create $@"; \
+	tf=$$(echo "$@" | sed -e 's/-novsx//'); \
+	echo '#include "config/rs6000/float128-novsx.h"' > $@; \
+	echo '#include "soft-fp/'$${tf}'"' >> $@
+
+FP128_COMMON_CFLAGS	= -Wno-missing-prototypes -Wno-type-limits \
+			  -DTFtype=__float128 -mfloat128
+
+FP128_VSX_CFLAGS	= $(FP128_COMMON_CFLAGS) -mvsx
+FP128_NOVSX_CFLAGS	= $(FP128_COMMON_CFLAGS) -mno-vsx -mno-altivec \
+			  -mno-power8-vector
+
+$(fp128_vsx_objects) : INTERNAL_CFLAGS += $(FP128_VSX_CFLAGS)
+$(fp128_vsx_objects) : $(addprefix $(srcdir)/soft-fp/,$(patsubst %-vsx.c,%.c,$<))
+$(fp128_vsx_objects) : $(srcdir)/config/rs6000/t-float128
+$(fp128_vsx_objects) : $(srcdir)/config/rs6000/float128-vsx.h
+
+$(fp128_novsx_objects) : INTERNAL_CFLAGS += $(FP128_NOVSX_CFLAGS)
+$(fp128_novsx_objects) : $(addprefix $(srcdir)/soft-fp/,$(patsubst %-novsx.c,%.c,$<))
+$(fp128_novsx_objects) : $(srcdir)/config/rs6000/t-float128
+$(fp128_novsx_objects) : $(srcdir)/config/rs6000/float128-novsx.h
+
+LIB2ADD += $(fp128_vsx_src_list) $(fp128_novsx_src_list)
+
Index: libgcc/config/rs6000/float128-novsx.h
===================================================================
--- libgcc/config/rs6000/float128-novsx.h	(.../svn+ssh://meissner@gcc.gnu.org/svn/gcc/trunk/libgcc)	(revision 0)
+++ libgcc/config/rs6000/float128-novsx.h	(.../libgcc)	(revision 209914)
@@ -0,0 +1,33 @@ 
+/* Map __<func>tf<n> to the name used for __float128 when used on a system
+   without VSX.  We compile two versions, one that passes __float128 in a
+   vector register, and the other that expects it to be a pair of scalar FPR
+   registers.  */
+
+#define __addtf3	__addtf3_fpr
+#define __subtf3	__subtf3_fpr
+#define __multf3	__multf3_fpr
+#define __divtf3	__divtf3_fpr
+#define __eqtf2		__eqtf2_fpr
+#define __netf2		__netf2_fpr
+#define __getf2		__getf2_fpr
+#define __gttf2		__gttf2_fpr
+#define __letf2		__letf2_fpr
+#define __lttf2		__lttf2_fpr
+#define __unordtf2	__unordtf2_fpr
+#define __negtf2	__negtf2_fpr
+#define __extenddftf2	__extenddftf2_fpr
+#define __extendsftf2	__extendsftf2_fpr
+#define __trunctfdf2	__trunctfdf2_fpr
+#define __trunctfsf2	__trunctfsf2_fpr
+#define __fixtfsi	__fixtfsi_fpr
+#define __fixtfdi	__fixtfdi_fpr
+#define __fixtfti	__fixtfti_fpr
+#define __fixunstfsi	__fixunstfsi_fpr
+#define __fixunstfdi	__fixunstfdi_fpr
+#define __fixunstfti	__fixunstfti_fpr
+#define __floatsitf	__floatsitf_fpr
+#define __floatditf	__floatditf_fpr
+#define __floattitf	__floattitf_fpr
+#define __floatunsitf	__floatunsitf_fpr
+#define __floatunditf	__floatunditf_fpr
+#define __floatuntitf	__floatuntitf_fpr
Index: libgcc/soft-fp/quad.h
===================================================================
--- libgcc/soft-fp/quad.h	(.../svn+ssh://meissner@gcc.gnu.org/svn/gcc/trunk/libgcc)	(revision 209899)
+++ libgcc/soft-fp/quad.h	(.../libgcc)	(working copy)
@@ -66,7 +66,13 @@ 
 #define _FP_HIGHBIT_DW_Q	\
   ((_FP_W_TYPE) 1 << (_FP_WFRACBITS_DW_Q - 1) % _FP_W_TYPE_SIZE)
 
+/* Allow machine to override the name of the 128-bit floating point type.
+   PowerPC long double historically used a pair of doubles on Linux/BSD
+   systems, so use the __float128 type if it is available, instead of
+   TFmode.  */
+#ifndef TFtype
 typedef float TFtype __attribute__ ((mode (TF)));
+#endif
 
 #if _FP_W_TYPE_SIZE < 64
 
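
The renaming headers above and the TFtype guard in quad.h work together through plain preprocessor substitution. The sketch below mimics that with hypothetical names (demo_add, DEMO_TYPE) rather than the real __addtf3 and TFtype, and uses double as a placeholder for __float128 so that it builds on any host. It only illustrates the mechanism and is not part of the patch.

```c
#include <assert.h>

/* Plays the role of float128-vsx.h: rename the entry point before the
   generic source is seen.  demo_add is a hypothetical stand-in for
   __addtf3.  */
#define demo_add demo_add_vector

/* Plays the role of -DTFtype=__float128 on the command line.  double is
   only a placeholder type for this sketch.  */
#define DEMO_TYPE double

/* Plays the role of the guarded typedef in soft-fp/quad.h: the default
   type is declared only when nothing has overridden it.  */
#ifndef DEMO_TYPE
typedef float DEMO_TYPE;
#endif

/* Plays the role of the generic soft-fp source.  After preprocessing this
   defines: double demo_add_vector (double, double).  */
DEMO_TYPE
demo_add (DEMO_TYPE a, DEMO_TYPE b)
{
  return a + b;
}

/* Drop the rename so the emitted symbol name can be used directly.  */
#undef demo_add

/* The function was emitted under the renamed symbol, with the overriding
   type.  */
void
demo_rename_check (void)
{
  assert (demo_add_vector (1.0, 2.0) == 3.0);
  assert (sizeof (DEMO_TYPE) == sizeof (double));
}
```

Compiling the same generic source once with a header that appends _vector and once with one that appends _fpr is what lets both ABI variants coexist in a single libgcc.
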
Index: libgcc/config.host
===================================================================
--- libgcc/config.host	(.../svn+ssh://meissner@gcc.gnu.org/svn/gcc/trunk/libgcc)	(revision 209899)
+++ libgcc/config.host	(.../libgcc)	(working copy)
@@ -980,7 +980,7 @@  powerpc-*-rtems*)
 	extra_parts="$extra_parts crtbeginS.o crtendS.o crtbeginT.o ecrti.o ecrtn.o ncrti.o ncrtn.o"
 	;;
 powerpc*-*-linux*)
-	tmake_file="${tmake_file} rs6000/t-ppccomm rs6000/t-savresfgpr rs6000/t-crtstuff rs6000/t-linux t-softfp-sfdf t-softfp-excl t-dfprules rs6000/t-ppc64-fp t-softfp t-slibgcc-libgcc"
+	tmake_file="${tmake_file} rs6000/t-ppccomm rs6000/t-savresfgpr rs6000/t-crtstuff rs6000/t-linux t-softfp-sfdf t-softfp-excl t-dfprules rs6000/t-ppc64-fp t-softfp rs6000/t-float128 t-slibgcc-libgcc"
 	extra_parts="$extra_parts ecrti.o ecrtn.o ncrti.o ncrtn.o"
 	md_unwind_header=rs6000/linux-unwind.h
 	;;