, Add IEEE 128-bit floating point to PowerPC, patch #2b
diff mbox

Message ID 20150709000149.GA25295@ibm-tiger.the-meissners.org
State New
Headers show

Commit Message

Michael Meissner July 9, 2015, 12:01 a.m. UTC
David asked me to change the definition of FLOAT128_IBM_P and FLOAT128_IEEE_P
so that they were more consistant.  In doing so, I realized that I didn't need
to test TARGET_LONG_DOUBLE_128 in those macros, since the mode would never be
TFmode, IFmode, or KFmode unless TARGET_LONG_DOUBLE_128 was true.

He also asked for better verification that the existing calling sequence with
long double (which needs the subsequent patches to test), and I will write such
tests and test them, putting them in the final patches.

It bootstraps fine with no regressions.  Is this patch ok to install?

2015-07-08  Michael Meissner  <meissner@linux.vnet.ibm.com>

	* config/rs6000/rs6000-protos.h (rs6000_secondary_reload_memory):
	Use machine mode, not enum machine_mode in the prototype.

	* config/rs6000/rs6000.h (FLOAT128_IEEE_P): New helper macros to
	classify 128-bit floating point support.
	(FLOAT128_IBM_P): Likewise.
	(FLOAT128_VECTOR_P): Likewise.
	(FLOAT128_2REG_P): Likewise.
	(SCALAR_FLOAT_MODE_NOT_VECTOR_P): Likewise.
	(SLOW_UNALIGNED_ACCESS): Add IEEE 128-bit floating point support.
	(HARD_REGNO_CALLER_SAVE_MODE): Likewise.
	(HARD_REGNO_CALL_PART_CLOBBERED): Likewise.

	* config/rs6000/rs6000.c (rs6000_hard_regno_nregs_internal): Drop
	tests against TFmode/TDmode, since those modes do not use VSX
	addresses.
	(rs6000_hard_regno_mode_ok): Add IEEE 128-bit floating point
	support.
	(rs6000_init_hard_regno_mode_ok): Use new helper macros instead of
	tests against TFmode, etc.
	(invalid_e500_subreg): Add tests against IFmode/KFmode.
	(reg_offset_addressing_ok_p): Likewise.
	(rs6000_legitimate_offset_address_p): Likewise.
	(rs6000_legitimize_address): Likewise.
	(rs6000_legitimize_reload_address): Likewise.
	(rs6000_legitimate_address_p): Clean up tests against TFmode and
	TDmode to use the new helper macros, which will include IFmode and
	KFmode.
	(rs6000_emit_move): Likewise.
	(rs6000_darwin64_record_arg_recurse): Likewise.
	(print_operand): Likewise.
	(rs6000_member_type_forces_blk): Treat IEEE 128-bit floating point
	that uses a single vector register as a vector and not as a
	floating point register in terms of the calling sequence.
	(rs6000_discover_homogeneous_aggregate): Likewise.
	(rs6000_return_in_memory): Likewise.
	(init_cumulative_args): Likewise.
	(rs6000_function_arg_boundary): Likewise.
	(rs6000_function_arg_advance_1): Likewise.
	(rs6000_function_arg): Likewise.
	(rs6000_pass_by_reference): Likewise.
	(rs6000_gimplify_va_arg): Likewise.
	(rs6000_secondary_reload_memory): Use machine_mode not enum
	machine mode.
	(rs6000_split_multireg_move): Use new helper macros.
	(spe_func_has_64bit_regs_p): Likewise.
	(rs6000_output_function_epilogue): Add IFmode/KFmode support.
	(output_toc): Use new helper macros.
	(rs6000_register_move_cost): Likewise.
	(rs6000_function_value): Add IEEE 128-bit floating point calling
	sequence support.
	(rs6000_libcall_value): Likewise.
	(rs6000_scalar_mode_supported_p): Add support for IEEE 128-bit
	floating point support.
	(rs6000_vector_mode_supported_p): Likewise.

Comments

David Edelsohn July 9, 2015, 12:21 a.m. UTC | #1
On Wed, Jul 8, 2015 at 8:01 PM, Michael Meissner
<meissner@linux.vnet.ibm.com> wrote:
> David asked me to change the definition of FLOAT128_IBM_P and FLOAT128_IEEE_P
> so that they were more consistant.  In doing so, I realized that I didn't need
> to test TARGET_LONG_DOUBLE_128 in those macros, since the mode would never be
> TFmode, IFmode, or KFmode unless TARGET_LONG_DOUBLE_128 was true.
>
> He also asked for better verification that the existing calling sequence with
> long double (which needs the subsequent patches to test), and I will write such
> tests and test them, putting them in the final patches.
>
> It bootstraps fine with no regressions.  Is this patch ok to install?
>
> 2015-07-08  Michael Meissner  <meissner@linux.vnet.ibm.com>
>
>         * config/rs6000/rs6000-protos.h (rs6000_secondary_reload_memory):
>         Use machine mode, not enum machine_mode in the prototype.
>
>         * config/rs6000/rs6000.h (FLOAT128_IEEE_P): New helper macros to
>         classify 128-bit floating point support.
>         (FLOAT128_IBM_P): Likewise.
>         (FLOAT128_VECTOR_P): Likewise.
>         (FLOAT128_2REG_P): Likewise.
>         (SCALAR_FLOAT_MODE_NOT_VECTOR_P): Likewise.
>         (SLOW_UNALIGNED_ACCESS): Add IEEE 128-bit floating point support.
>         (HARD_REGNO_CALLER_SAVE_MODE): Likewise.
>         (HARD_REGNO_CALL_PART_CLOBBERED): Likewise.
>
>         * config/rs6000/rs6000.c (rs6000_hard_regno_nregs_internal): Drop
>         tests against TFmode/TDmode, since those modes do not use VSX
>         addresses.
>         (rs6000_hard_regno_mode_ok): Add IEEE 128-bit floating point
>         support.
>         (rs6000_init_hard_regno_mode_ok): Use new helper macros instead of
>         tests against TFmode, etc.
>         (invalid_e500_subreg): Add tests against IFmode/KFmode.
>         (reg_offset_addressing_ok_p): Likewise.
>         (rs6000_legitimate_offset_address_p): Likewise.
>         (rs6000_legitimize_address): Likewise.
>         (rs6000_legitimize_reload_address): Likewise.
>         (rs6000_legitimate_address_p): Clean up tests against TFmode and
>         TDmode to use the new helper macros, which will include IFmode and
>         KFmode.
>         (rs6000_emit_move): Likewise.
>         (rs6000_darwin64_record_arg_recurse): Likewise.
>         (print_operand): Likewise.
>         (rs6000_member_type_forces_blk): Treat IEEE 128-bit floating point
>         that uses a single vector register as a vector and not as a
>         floating point register in terms of the calling sequence.
>         (rs6000_discover_homogeneous_aggregate): Likewise.
>         (rs6000_return_in_memory): Likewise.
>         (init_cumulative_args): Likewise.
>         (rs6000_function_arg_boundary): Likewise.
>         (rs6000_function_arg_advance_1): Likewise.
>         (rs6000_function_arg): Likewise.
>         (rs6000_pass_by_reference): Likewise.
>         (rs6000_gimplify_va_arg): Likewise.
>         (rs6000_secondary_reload_memory): Use machine_mode not enum
>         machine mode.
>         (rs6000_split_multireg_move): Use new helper macros.
>         (spe_func_has_64bit_regs_p): Likewise.
>         (rs6000_output_function_epilogue): Add IFmode/KFmode support.
>         (output_toc): Use new helper macros.
>         (rs6000_register_move_cost): Likewise.
>         (rs6000_function_value): Add IEEE 128-bit floating point calling
>         sequence support.
>         (rs6000_libcall_value): Likewise.
>         (rs6000_scalar_mode_supported_p): Add support for IEEE 128-bit
>         floating point support.
>         (rs6000_vector_mode_supported_p): Likewise.

Okay.

Thanks, David

Patch
diff mbox

Index: gcc/config/rs6000/rs6000-protos.h
===================================================================
--- gcc/config/rs6000/rs6000-protos.h	(revision 225574)
+++ gcc/config/rs6000/rs6000-protos.h	(working copy)
@@ -133,8 +133,7 @@  extern void rs6000_split_multireg_move (
 extern void rs6000_emit_le_vsx_move (rtx, rtx, machine_mode);
 extern void rs6000_emit_move (rtx, rtx, machine_mode);
 extern rtx rs6000_secondary_memory_needed_rtx (machine_mode);
-extern machine_mode rs6000_secondary_memory_needed_mode (enum
-							      machine_mode);
+extern machine_mode rs6000_secondary_memory_needed_mode (machine_mode);
 extern rtx (*rs6000_legitimize_reload_address_ptr) (rtx, machine_mode,
 						    int, int, int, int *);
 extern bool rs6000_legitimate_offset_address_p (machine_mode, rtx,
Index: gcc/config/rs6000/rs6000.c
===================================================================
--- gcc/config/rs6000/rs6000.c	(revision 225574)
+++ gcc/config/rs6000/rs6000.c	(working copy)
@@ -1740,9 +1740,11 @@  rs6000_hard_regno_nregs_internal (int re
 {
   unsigned HOST_WIDE_INT reg_size;
 
-  /* TF/TD modes are special in that they always take 2 registers.  */
+  /* 128-bit floating point usually takes 2 registers, unless it is IEEE
+     128-bit floating point that can go in vector registers, which has VSX
+     memory addressing.  */
   if (FP_REGNO_P (regno))
-    reg_size = ((VECTOR_MEM_VSX_P (mode) && mode != TDmode && mode != TFmode)
+    reg_size = (VECTOR_MEM_VSX_P (mode)
 		? UNITS_PER_VSX_WORD
 		: UNITS_PER_FP_WORD);
 
@@ -1799,6 +1801,7 @@  rs6000_hard_regno_mode_ok (int regno, ma
      asked for it.  */
   if (TARGET_VSX && VSX_REGNO_P (regno)
       && (VECTOR_MEM_VSX_P (mode)
+	  || FLOAT128_VECTOR_P (mode)
 	  || reg_addr[mode].scalar_in_vmx_p
 	  || (TARGET_VSX_TIMODE && mode == TImode)
 	  || (TARGET_VADDUQM && mode == V1TImode)))
@@ -1824,6 +1827,9 @@  rs6000_hard_regno_mode_ok (int regno, ma
      modes and DImode.  */
   if (FP_REGNO_P (regno))
     {
+      if (FLOAT128_VECTOR_P (mode))
+	return false;
+
       if (SCALAR_FLOAT_MODE_P (mode)
 	  && (mode != TDmode || (regno % 2) == 0)
 	  && FP_REGNO_P (last_regno))
@@ -2999,9 +3005,9 @@  rs6000_init_hard_regno_mode_ok (bool glo
 	  machine_mode m2 = (machine_mode)m;
 	  int reg_size2 = reg_size;
 
-	  /* TFmode/TDmode always takes 2 registers, even in VSX.  */
-	  if (TARGET_VSX && VSX_REG_CLASS_P (c)
-	      && (m == TDmode || m == TFmode))
+	  /* TDmode & IBM 128-bit floating point always takes 2 registers, even
+	     in VSX.  */
+	  if (TARGET_VSX && VSX_REG_CLASS_P (c) && FLOAT128_2REG_P (m))
 	    reg_size2 = UNITS_PER_FP_WORD;
 
 	  rs6000_class_max_nregs[m][c]
@@ -6106,13 +6112,16 @@  invalid_e500_subreg (rtx op, machine_mod
 	      || mode == DDmode || mode == TDmode || mode == PTImode)
 	  && REG_P (SUBREG_REG (op))
 	  && (GET_MODE (SUBREG_REG (op)) == DFmode
-	      || GET_MODE (SUBREG_REG (op)) == TFmode))
+	      || GET_MODE (SUBREG_REG (op)) == TFmode
+	      || GET_MODE (SUBREG_REG (op)) == IFmode
+	      || GET_MODE (SUBREG_REG (op)) == KFmode))
 	return true;
 
       /* Reject (subreg:DF (reg:DI)); likewise with subreg:TF and
 	 reg:TI.  */
       if (GET_CODE (op) == SUBREG
-	  && (mode == DFmode || mode == TFmode)
+	  && (mode == DFmode || mode == TFmode || mode == IFmode
+	      || mode == KFmode)
 	  && REG_P (SUBREG_REG (op))
 	  && (GET_MODE (SUBREG_REG (op)) == DImode
 	      || GET_MODE (SUBREG_REG (op)) == TImode
@@ -6474,10 +6483,13 @@  reg_offset_addressing_ok_p (machine_mode
     case V2DImode:
     case V1TImode:
     case TImode:
+    case TFmode:
+    case KFmode:
       /* AltiVec/VSX vector modes.  Only reg+reg addressing is valid.  While
 	 TImode is not a vector mode, if we want to use the VSX registers to
-	 move it around, we need to restrict ourselves to reg+reg
-	 addressing.  */
+	 move it around, we need to restrict ourselves to reg+reg addressing.
+	 Similarly for IEEE 128-bit floating point that is passed in a single
+	 vector register.  */
       if (VECTOR_MEM_ALTIVEC_OR_VSX_P (mode))
 	return false;
       break;
@@ -6743,6 +6755,8 @@  rs6000_legitimate_offset_address_p (mach
       break;
 
     case TFmode:
+    case IFmode:
+    case KFmode:
       if (TARGET_E500_DOUBLE)
 	return (SPE_CONST_OFFSET_OK (offset)
 		&& SPE_CONST_OFFSET_OK (offset + 8));
@@ -6936,6 +6950,8 @@  rs6000_legitimize_address (rtx x, rtx ol
     case TDmode:
     case TImode:
     case PTImode:
+    case IFmode:
+    case KFmode:
       /* As in legitimate_offset_address_p we do not assume
 	 worst-case.  The mode here is just a hint as to the registers
 	 used.  A TImode is usually in gprs, but may actually be in
@@ -7708,6 +7724,8 @@  rs6000_legitimize_reload_address (rtx x,
       && !reg_addr[mode].scalar_in_vmx_p
       && mode != TFmode
       && mode != TDmode
+      && mode != IFmode
+      && mode != KFmode
       && (mode != TImode || !TARGET_VSX_TIMODE)
       && mode != PTImode
       && (mode != DImode || TARGET_POWERPC64)
@@ -7861,8 +7879,7 @@  rs6000_legitimate_address_p (machine_mod
     return 1;
   if (rs6000_legitimate_offset_address_p (mode, x, reg_ok_strict, false))
     return 1;
-  if (mode != TFmode
-      && mode != TDmode
+  if (!FLOAT128_2REG_P (mode)
       && ((TARGET_HARD_FLOAT && TARGET_FPRS && TARGET_DOUBLE_FLOAT)
 	  || TARGET_POWERPC64
 	  || (mode != DFmode && mode != DDmode)
@@ -8530,9 +8547,8 @@  rs6000_emit_move (rtx dest, rtx source, 
   /* 128-bit constant floating-point values on Darwin should really be loaded
      as two parts.  However, this premature splitting is a problem when DFmode
      values can go into Altivec registers.  */
-  if (!TARGET_IEEEQUAD && TARGET_LONG_DOUBLE_128
-      && !reg_addr[DFmode].scalar_in_vmx_p
-      && mode == TFmode && GET_CODE (operands[1]) == CONST_DOUBLE)
+  if (FLOAT128_IBM_P (mode) && !reg_addr[DFmode].scalar_in_vmx_p
+      && GET_CODE (operands[1]) == CONST_DOUBLE)
     {
       rs6000_emit_move (simplify_gen_subreg (DFmode, operands[0], mode, 0),
 			simplify_gen_subreg (DFmode, operands[1], mode, 0),
@@ -8724,7 +8740,10 @@  rs6000_emit_move (rtx dest, rtx source, 
 
     case TFmode:
     case TDmode:
-      rs6000_eliminate_indexed_memrefs (operands);
+    case IFmode:
+    case KFmode:
+      if (FLOAT128_2REG_P (mode))
+	rs6000_eliminate_indexed_memrefs (operands);
       /* fall through */
 
     case DFmode:
@@ -8948,7 +8967,7 @@  rs6000_member_type_forces_blk (const_tre
 
 /* Nonzero if we can use a floating-point register to pass this arg.  */
 #define USE_FP_FOR_ARG_P(CUM,MODE)		\
-  (SCALAR_FLOAT_MODE_P (MODE)			\
+  (SCALAR_FLOAT_MODE_NOT_VECTOR_P (MODE)		\
    && (CUM)->fregno <= FP_ARG_MAX_REG		\
    && TARGET_HARD_FLOAT && TARGET_FPRS)
 
@@ -9149,7 +9168,7 @@  rs6000_discover_homogeneous_aggregate (m
 
       if (field_count > 0)
 	{
-	  int n_regs = (SCALAR_FLOAT_MODE_P (field_mode)?
+	  int n_regs = (SCALAR_FLOAT_MODE_P (field_mode) ?
 			(GET_MODE_SIZE (field_mode) + 7) >> 3 : 1);
 
 	  /* The ELFv2 ABI allows homogeneous aggregates to occupy
@@ -9259,7 +9278,8 @@  rs6000_return_in_memory (const_tree type
       return true;
     }
 
-  if (DEFAULT_ABI == ABI_V4 && TARGET_IEEEQUAD && TYPE_MODE (type) == TFmode)
+  if (DEFAULT_ABI == ABI_V4 && TARGET_IEEEQUAD
+      && FLOAT128_IEEE_P (TYPE_MODE (type)))
     return true;
 
   return false;
@@ -9389,7 +9409,7 @@  init_cumulative_args (CUMULATIVE_ARGS *c
 		      <= 8))
 		rs6000_returns_struct = true;
 	    }
-	  if (SCALAR_FLOAT_MODE_P (return_mode))
+	  if (SCALAR_FLOAT_MODE_NOT_VECTOR_P (return_mode))
 	    rs6000_passes_float = true;
 	  else if (ALTIVEC_OR_VSX_VECTOR_MODE (return_mode)
 		   || SPE_VECTOR_MODE (return_mode))
@@ -9527,8 +9547,10 @@  rs6000_function_arg_boundary (machine_mo
       && (GET_MODE_SIZE (mode) == 8
 	  || (TARGET_HARD_FLOAT
 	      && TARGET_FPRS
-	      && (mode == TFmode || mode == TDmode))))
+	      && FLOAT128_2REG_P (mode))))
     return 64;
+  else if (FLOAT128_VECTOR_P (mode))
+    return 128;
   else if (SPE_VECTOR_MODE (mode)
 	   || (type && TREE_CODE (type) == VECTOR_TYPE
 	       && int_size_in_bytes (type) >= 8
@@ -9800,7 +9822,7 @@  rs6000_function_arg_advance_1 (CUMULATIV
   if (DEFAULT_ABI == ABI_V4
       && cum->escapes)
     {
-      if (SCALAR_FLOAT_MODE_P (mode))
+      if (SCALAR_FLOAT_MODE_NOT_VECTOR_P (mode))
 	rs6000_passes_float = true;
       else if (named && ALTIVEC_OR_VSX_VECTOR_MODE (mode))
 	rs6000_passes_vector = true;
@@ -9907,21 +9929,21 @@  rs6000_function_arg_advance_1 (CUMULATIV
       if (TARGET_HARD_FLOAT && TARGET_FPRS
 	  && ((TARGET_SINGLE_FLOAT && mode == SFmode)
 	      || (TARGET_DOUBLE_FLOAT && mode == DFmode)
-	      || (mode == TFmode && !TARGET_IEEEQUAD)
-	      || mode == SDmode || mode == DDmode || mode == TDmode))
+	      || FLOAT128_2REG_P (mode)
+	      || DECIMAL_FLOAT_MODE_P (mode)))
 	{
 	  /* _Decimal128 must use an even/odd register pair.  This assumes
 	     that the register number is odd when fregno is odd.  */
 	  if (mode == TDmode && (cum->fregno % 2) == 1)
 	    cum->fregno++;
 
-	  if (cum->fregno + (mode == TFmode || mode == TDmode ? 1 : 0)
+	  if (cum->fregno + (FLOAT128_2REG_P (mode) ? 1 : 0)
 	      <= FP_ARG_V4_MAX_REG)
 	    cum->fregno += (GET_MODE_SIZE (mode) + 7) >> 3;
 	  else
 	    {
 	      cum->fregno = FP_ARG_V4_MAX_REG + 1;
-	      if (mode == DFmode || mode == TFmode
+	      if (mode == DFmode || FLOAT128_IBM_P (mode)
 		  || mode == DDmode || mode == TDmode)
 		cum->words += cum->words & 1;
 	      cum->words += rs6000_arg_size (mode, type);
@@ -9973,8 +9995,7 @@  rs6000_function_arg_advance_1 (CUMULATIV
 
       cum->words = align_words + n_words;
 
-      if (SCALAR_FLOAT_MODE_P (elt_mode)
-	  && TARGET_HARD_FLOAT && TARGET_FPRS)
+      if (SCALAR_FLOAT_MODE_P (elt_mode) && TARGET_HARD_FLOAT && TARGET_FPRS)
 	{
 	  /* _Decimal128 must be passed in an even/odd float register pair.
 	     This assumes that the register number is odd when fregno is
@@ -10216,7 +10237,7 @@  rs6000_darwin64_record_arg_recurse (CUMU
 	      = gen_rtx_EXPR_LIST (VOIDmode,
 				   gen_rtx_REG (mode, cum->fregno++),
 				   GEN_INT (bitpos / BITS_PER_UNIT));
-	    if (mode == TFmode || mode == TDmode)
+	    if (FLOAT128_2REG_P (mode))
 	      cum->fregno++;
 	  }
 	else if (cum->named && USE_ALTIVEC_FOR_ARG_P (cum, mode, 1))
@@ -10567,15 +10588,15 @@  rs6000_function_arg (cumulative_args_t c
       if (TARGET_HARD_FLOAT && TARGET_FPRS
 	  && ((TARGET_SINGLE_FLOAT && mode == SFmode)
 	      || (TARGET_DOUBLE_FLOAT && mode == DFmode)
-	      || (mode == TFmode && !TARGET_IEEEQUAD)
-	      || mode == SDmode || mode == DDmode || mode == TDmode))
+	      || FLOAT128_2REG_P (mode)
+	      || DECIMAL_FLOAT_MODE_P (mode)))
 	{
 	  /* _Decimal128 must use an even/odd register pair.  This assumes
 	     that the register number is odd when fregno is odd.  */
 	  if (mode == TDmode && (cum->fregno % 2) == 1)
 	    cum->fregno++;
 
-	  if (cum->fregno + (mode == TFmode || mode == TDmode ? 1 : 0)
+	  if (cum->fregno + (FLOAT128_2REG_P (mode) ? 1 : 0)
 	      <= FP_ARG_V4_MAX_REG)
 	    return gen_rtx_REG (mode, cum->fregno);
 	  else
@@ -10637,7 +10658,7 @@  rs6000_function_arg (cumulative_args_t c
 	      machine_mode fmode = elt_mode;
 	      if (cum->fregno + (i + 1) * n_fpreg > FP_ARG_MAX_REG + 1)
 		{
-		  gcc_assert (fmode == TFmode || fmode == TDmode);
+		  gcc_assert (FLOAT128_2REG_P (fmode));
 		  fmode = DECIMAL_FLOAT_MODE_P (fmode) ? DDmode : DFmode;
 		}
 
@@ -10811,10 +10832,11 @@  rs6000_pass_by_reference (cumulative_arg
 			  machine_mode mode, const_tree type,
 			  bool named ATTRIBUTE_UNUSED)
 {
-  if (DEFAULT_ABI == ABI_V4 && TARGET_IEEEQUAD && mode == TFmode)
+  if (DEFAULT_ABI == ABI_V4 && TARGET_IEEEQUAD
+      && FLOAT128_IEEE_P (TYPE_MODE (type)))
     {
       if (TARGET_DEBUG_ARG)
-	fprintf (stderr, "function_arg_pass_by_reference: V4 long double\n");
+	fprintf (stderr, "function_arg_pass_by_reference: V4 IEEE 128-bit\n");
       return 1;
     }
 
@@ -11489,10 +11511,8 @@  rs6000_gimplify_va_arg (tree valist, tre
       && ((TARGET_SINGLE_FLOAT && TYPE_MODE (type) == SFmode)
           || (TARGET_DOUBLE_FLOAT 
               && (TYPE_MODE (type) == DFmode 
- 	          || TYPE_MODE (type) == TFmode
-	          || TYPE_MODE (type) == SDmode
-	          || TYPE_MODE (type) == DDmode
-	          || TYPE_MODE (type) == TDmode))))
+		  || FLOAT128_2REG_P (TYPE_MODE (type))
+		  || DECIMAL_FLOAT_MODE_P (TYPE_MODE (type))))))
     {
       /* FP args go in FP registers, if present.  */
       reg = fpr;
@@ -16755,7 +16775,7 @@  rs6000_secondary_reload_toc_costs (addr_
 static int
 rs6000_secondary_reload_memory (rtx addr,
 				enum reg_class rclass,
-				enum machine_mode mode)
+				machine_mode mode)
 {
   int extra_cost = 0;
   rtx reg, and_arg, plus_arg0, plus_arg1;
@@ -19053,7 +19073,7 @@  print_operand (FILE *file, rtx x, int co
 	/* Ugly hack because %y is overloaded.  */
 	if ((TARGET_SPE || TARGET_E500_DOUBLE)
 	    && (GET_MODE_SIZE (GET_MODE (x)) == 8
-		|| GET_MODE (x) == TFmode
+		|| FLOAT128_2REG_P (GET_MODE (x))
 		|| GET_MODE (x) == TImode
 		|| GET_MODE (x) == PTImode))
 	  {
@@ -21003,7 +21023,7 @@  rs6000_split_multireg_move (rtx dst, rtx
 	((TARGET_HARD_FLOAT && TARGET_DOUBLE_FLOAT) ? DFmode : SFmode);
   else if (ALTIVEC_REGNO_P (reg))
     reg_mode = V16QImode;
-  else if (TARGET_E500_DOUBLE && mode == TFmode)
+  else if (TARGET_E500_DOUBLE && FLOAT128_2REG_P (mode))
     reg_mode = DFmode;
   else
     reg_mode = word_mode;
@@ -22080,7 +22100,8 @@  spe_func_has_64bit_regs_p (void)
 
 	      if (SPE_VECTOR_MODE (mode))
 		return true;
-	      if (TARGET_E500_DOUBLE && (mode == DFmode || mode == TFmode))
+	      if (TARGET_E500_DOUBLE
+		  && (mode == DFmode || FLOAT128_2REG_P (mode)))
 		return true;
 	    }
 	}
@@ -25843,6 +25864,8 @@  rs6000_output_function_epilogue (FILE *f
 			case DDmode:
 			case TFmode:
 			case TDmode:
+			case IFmode:
+			case KFmode:
 			  bits = 0x3;
 			  break;
 
@@ -26516,7 +26539,8 @@  output_toc (FILE *file, rtx x, int label
      TOC, things we put here aren't actually in the TOC, so we can allow
      FP constants.  */
   if (GET_CODE (x) == CONST_DOUBLE &&
-      (GET_MODE (x) == TFmode || GET_MODE (x) == TDmode))
+      (GET_MODE (x) == TFmode || GET_MODE (x) == TDmode
+       || GET_MODE (x) == IFmode || GET_MODE (x) == KFmode))
     {
       REAL_VALUE_TYPE rv;
       long k[4];
@@ -31040,7 +31064,7 @@  rs6000_register_move_cost (machine_mode 
 
   /* Moving between two similar registers is just one instruction.  */
   else if (reg_classes_intersect_p (to, from))
-    ret = (mode == TFmode || mode == TDmode) ? 4 : 2;
+    ret = (FLOAT128_2REG_P (mode)) ? 4 : 2;
 
   /* Everything else has to go through GENERAL_REGS.  */
   else
@@ -32107,7 +32131,7 @@  rs6000_function_value (const_tree valtyp
     {
       int first_reg, n_regs;
 
-      if (SCALAR_FLOAT_MODE_P (elt_mode))
+      if (SCALAR_FLOAT_MODE_NOT_VECTOR_P (elt_mode))
 	{
 	  /* _Decimal128 must use even/odd register pairs.  */
 	  first_reg = (elt_mode == TDmode) ? FP_ARG_RETURN + 1 : FP_ARG_RETURN;
@@ -32144,7 +32168,7 @@  rs6000_function_value (const_tree valtyp
   if (DECIMAL_FLOAT_MODE_P (mode) && TARGET_HARD_FLOAT && TARGET_FPRS)
     /* _Decimal128 must use an even/odd register pair.  */
     regno = (mode == TDmode) ? FP_ARG_RETURN + 1 : FP_ARG_RETURN;
-  else if (SCALAR_FLOAT_TYPE_P (valtype) && TARGET_HARD_FLOAT && TARGET_FPRS
+  else if (SCALAR_FLOAT_MODE_NOT_VECTOR_P (mode) && TARGET_HARD_FLOAT && TARGET_FPRS
 	   && ((TARGET_SINGLE_FLOAT && (mode == SFmode)) || TARGET_DOUBLE_FLOAT))
     regno = FP_ARG_RETURN;
   else if (TREE_CODE (valtype) == COMPLEX_TYPE
@@ -32153,13 +32177,13 @@  rs6000_function_value (const_tree valtyp
   /* VSX is a superset of Altivec and adds V2DImode/V2DFmode.  Since the same
      return register is used in both cases, and we won't see V2DImode/V2DFmode
      for pure altivec, combine the two cases.  */
-  else if (TREE_CODE (valtype) == VECTOR_TYPE
+  else if ((TREE_CODE (valtype) == VECTOR_TYPE || FLOAT128_VECTOR_P (mode))
 	   && TARGET_ALTIVEC && TARGET_ALTIVEC_ABI
 	   && ALTIVEC_OR_VSX_VECTOR_MODE (mode))
     regno = ALTIVEC_ARG_RETURN;
   else if (TARGET_E500_DOUBLE && TARGET_HARD_FLOAT
 	   && (mode == DFmode || mode == DCmode
-	       || mode == TFmode || mode == TCmode))
+	       || FLOAT128_IBM_P (mode) || mode == TCmode))
     return spe_build_register_parallel (mode, GP_ARG_RETURN);
   else
     regno = GP_ARG_RETURN;
@@ -32181,7 +32205,7 @@  rs6000_libcall_value (machine_mode mode)
   if (DECIMAL_FLOAT_MODE_P (mode) && TARGET_HARD_FLOAT && TARGET_FPRS)
     /* _Decimal128 must use an even/odd register pair.  */
     regno = (mode == TDmode) ? FP_ARG_RETURN + 1 : FP_ARG_RETURN;
-  else if (SCALAR_FLOAT_MODE_P (mode)
+  else if (SCALAR_FLOAT_MODE_NOT_VECTOR_P (mode)
 	   && TARGET_HARD_FLOAT && TARGET_FPRS
            && ((TARGET_SINGLE_FLOAT && mode == SFmode) || TARGET_DOUBLE_FLOAT))
     regno = FP_ARG_RETURN;
@@ -32195,7 +32219,7 @@  rs6000_libcall_value (machine_mode mode)
     return rs6000_complex_function_value (mode);
   else if (TARGET_E500_DOUBLE && TARGET_HARD_FLOAT
 	   && (mode == DFmode || mode == DCmode
-	       || mode == TFmode || mode == TCmode))
+	       || FLOAT128_IBM_P (mode) || mode == TCmode))
     return spe_build_register_parallel (mode, GP_ARG_RETURN);
   else
     regno = GP_ARG_RETURN;
@@ -32421,6 +32445,8 @@  rs6000_scalar_mode_supported_p (machine_
 
   if (DECIMAL_FLOAT_MODE_P (mode))
     return default_decimal_float_supported_p ();
+  else if (mode == KFmode)
+    return TARGET_FLOAT128;
   else
     return default_scalar_mode_supported_p (mode);
 }
@@ -32436,7 +32462,10 @@  rs6000_vector_mode_supported_p (machine_
   if (TARGET_SPE && SPE_VECTOR_MODE (mode))
     return true;
 
-  else if (VECTOR_MEM_ALTIVEC_OR_VSX_P (mode))
+  /* There is no vector form for IEEE 128-bit.  If we return true for IEEE
+     128-bit, the compiler might try to widen IEEE 128-bit to IBM
+     double-double.  */
+  else if (VECTOR_MEM_ALTIVEC_OR_VSX_P (mode) && !FLOAT128_IEEE_P (mode))
     return true;
 
   else
Index: gcc/config/rs6000/rs6000.h
===================================================================
--- gcc/config/rs6000/rs6000.h	(revision 225574)
+++ gcc/config/rs6000/rs6000.h	(working copy)
@@ -402,6 +402,33 @@  extern const char *host_detect_local_cpu
 #define TARGET_DEBUG_TARGET	(rs6000_debug & MASK_DEBUG_TARGET)
 #define TARGET_DEBUG_BUILTIN	(rs6000_debug & MASK_DEBUG_BUILTIN)
 
+/* Helper macros for TFmode.  Quad floating point (TFmode) can be either IBM
+   long double format that uses a pair of doubles, or IEEE 128-bit floating
+   point.  KFmode was added as a way to represent IEEE 128-bit floating point,
+   even if the default for long double is the IBM long double format.
+   Similarly IFmode is the IBM long double format even if the default is IEEE
+   128-bit.  */
+#define FLOAT128_IEEE_P(MODE)						\
+  (((MODE) == TFmode && TARGET_IEEEQUAD)				\
+   || ((MODE) == KFmode))
+
+#define FLOAT128_IBM_P(MODE)						\
+  (((MODE) == TFmode && !TARGET_IEEEQUAD)				\
+   || ((MODE) == IFmode))
+
+/* Helper macros to say whether a 128-bit floating point type can go in a
+   single vector register, or whether it needs paired scalar values.  */
+#define FLOAT128_VECTOR_P(MODE) (TARGET_FLOAT128 && FLOAT128_IEEE_P (MODE))
+
+#define FLOAT128_2REG_P(MODE)						\
+  (FLOAT128_IBM_P (MODE)						\
+   || ((MODE) == TDmode)						\
+   || (!TARGET_FLOAT128 && FLOAT128_IEEE_P (MODE)))
+
+/* Return true for floating point that does not use a vector register.  */
+#define SCALAR_FLOAT_MODE_NOT_VECTOR_P(MODE)				\
+  (SCALAR_FLOAT_MODE_P (MODE) && !FLOAT128_VECTOR_P (MODE))
+
 /* Describe the vector unit used for arithmetic operations.  */
 extern enum rs6000_vector rs6000_vector_unit[];
 
@@ -888,11 +915,10 @@  enum data_align { align_abi, align_opt, 
    aligned to 4 or 8 bytes.  */
 #define SLOW_UNALIGNED_ACCESS(MODE, ALIGN)				\
   (STRICT_ALIGNMENT							\
-   || (((MODE) == SFmode || (MODE) == DFmode || (MODE) == TFmode	\
-	|| (MODE) == SDmode || (MODE) == DDmode || (MODE) == TDmode)	\
-       && (ALIGN) < 32)							\
+   || (SCALAR_FLOAT_MODE_NOT_VECTOR_P (MODE) && (ALIGN) < 32)		\
    || (!TARGET_EFFICIENT_UNALIGNED_VSX                                  \
-       && (VECTOR_MODE_P ((MODE)) && (((int)(ALIGN)) < VECTOR_ALIGN (MODE)))))
+       && ((VECTOR_MODE_P (MODE) || FLOAT128_VECTOR_P (MODE))		\
+	   && (((int)(ALIGN)) < VECTOR_ALIGN (MODE)))))
 
 
 /* Standard register usage.  */
@@ -1174,7 +1200,7 @@  enum data_align { align_abi, align_opt, 
    ? V2DFmode								\
    : TARGET_E500_DOUBLE && ((MODE) == VOIDmode || (MODE) == DFmode)	\
    ? DFmode								\
-   : !TARGET_E500_DOUBLE && (MODE) == TFmode && FP_REGNO_P (REGNO)	\
+   : !TARGET_E500_DOUBLE && FLOAT128_IBM_P (MODE) && FP_REGNO_P (REGNO)	\
    ? DFmode								\
    : !TARGET_E500_DOUBLE && (MODE) == TDmode && FP_REGNO_P (REGNO)	\
    ? DImode								\
@@ -1185,8 +1211,7 @@  enum data_align { align_abi, align_opt, 
      && (GET_MODE_SIZE (MODE) > 4)					\
      && INT_REGNO_P (REGNO)) ? 1 : 0)					\
    || (TARGET_VSX && FP_REGNO_P (REGNO)					\
-       && GET_MODE_SIZE (MODE) > 8 && ((MODE) != TDmode) 		\
-       && ((MODE) != TFmode)))
+       && GET_MODE_SIZE (MODE) > 8 && !FLOAT128_2REG_P (MODE)))
 
 #define VSX_VECTOR_MODE(MODE)		\
 	 ((MODE) == V4SFmode		\