diff mbox series

Make Float128 built-in functions work with -mabi=ieeelongdouble

Message ID 20171110231319.GA18838@ibm-tiger.the-meissners.org
State New

Commit Message

Michael Meissner Nov. 10, 2017, 11:13 p.m. UTC
This patch updates the float128 built-in functions that get/set exponents, get
mantissa, and do tests to work with _Float128 on the current system, and with
long double when -mabi=ieeelongdouble is used.

The issue is when long double == IEEE, we use TFmode instead of KFmode.  I
decided to fix this inside of rs6000_expand_builtin, adding a switch statement
for the KFmode float128 built-ins, and switching them to the TFmode variant if
-mabi=ieeelongdouble.
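
As a rough illustration of the idea in the paragraph above, here is a
standalone sketch of remapping a KFmode code to its TFmode twin with a switch.
The enum, its members, and sw_remap_icode are hypothetical stand-ins invented
for this example; they are not GCC's insn_code enum or the code in the patch.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for insn codes; not the real GCC enum.  */
enum sw_icode
{
  SW_NOTHING = 0,
  SW_XSXEXPQP_KF,
  SW_XSXEXPQP_TF,
  SW_ADDKF3_ODD,
  SW_ADDTF3_ODD,
  SW_XSXEXPDP           /* A double-precision code with no TF twin.  */
};

/* Return the code to expand.  When long double is IEEE 128-bit, swap each
   KFmode code for its TFmode twin; leave every other code untouched.  */
static enum sw_icode
sw_remap_icode (enum sw_icode icode, bool long_double_is_ieee128)
{
  assert (icode <= SW_XSXEXPDP);

  if (long_double_is_ieee128)
    switch (icode)
      {
      default:
        break;

      case SW_XSXEXPQP_KF:  icode = SW_XSXEXPQP_TF;  break;
      case SW_ADDKF3_ODD:   icode = SW_ADDTF3_ODD;   break;
      }

  return icode;
}
```

With this shape only one built-in needs to be defined per operation, and the
mode choice is made once at expansion time.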

I also went back and reworked the November 6th changes, which did not use most
of the rs6000-builtin.def machinery to create the built-in functions, and
removed the special built-in function creation.

I have checked this against subversion id 254470 and it bootstraps fine, and
passes all of the regression tests.  As I write this, subversion id 254642 did
not bootstrap on a PowerPC power8 system.  Assuming the patch is acceptable, I
will make sure the compiler bootstraps before committing the changes.

Is it ok to install in the trunk?

[gcc]
2017-11-10  Michael Meissner  <meissner@linux.vnet.ibm.com>

	* config/rs6000/rs6000-c.c (is_float128_p): New helper function.
	(rs6000_builtin_type_compatible): Treat _Float128 and long double
	as being compatible if -mabi=ieeelongdouble.
	* config/rs6000/rs6000-builtin.def (BU_FLOAT128_HW_1): New macros
	to setup float128 built-ins with hardware support.
	(BU_FLOAT128_HW_2): Likewise.
	(BU_FLOAT128_HW_3): Likewise.
	(BU_FLOAT128_HW_VSX_1): Likewise.
	(BU_FLOAT128_HW_VSX_2): Likewise.
	(scalar_extract_expq): Change float128 built-in functions to
	accomidate having both KFmode and TFmode functions.  Use the
	KFmode variant as the default.
	(scalar_extract_sigq): Likewise.
	(scalar_test_neg_qp): Likewise.
	(scalar_insert_exp_q): Likewise.
	(scalar_insert_exp_qp): Likewise.
	(scalar_test_data_class_qp): Likewise.
	(sqrtf128_round_to_odd): Delete processing the round to odd
	built-in functions as special built-in functions, and define them
	as float128 built-ins.  Use the KFmode variant as the default.
	(truncf128_round_to_odd): Likewise.
	(addf128_round_to_odd): Likewise.
	(subf128_round_to_odd): Likewise.
	(mulf128_round_to_odd): Likewise.
	(divf128_round_to_odd): Likewise.
	(fmaf128_round_to_odd): Likewise.
	* config/rs6000/rs6000.c (rs6000_expand_binop_builtin): Add
	support for KFmode and TFmode xststdcqp calls.
	(rs6000_expand_builtin): If long double is IEEE 128-bit floating
	point, switch the built-in handlers for the get/set float128
	exponent, get float128 mantissa, float128 test built-ins, and the
	float128 round to odd built-in functions.  Eliminate creating the
	float128 round to odd built-in functions as special built-ins.
	(rs6000_init_builtins): Eliminate special creation of the float128
	round to odd built-in functions.
	* config/rs6000/vsx.md (xsxexpqp_<mode>): Change float128 built-in
	function insns to support both TFmode and KFmode varaints.
	(xsxsigqp_<mode>): Likewise.
	(xsiexpqpf_<mode>): Likewise.
	(xsiexpqp_<mode>): Likewise.
	(xststdcqp_<mode>): Likewise.
	(xststdcnegqp_<mode>): Likewise.
	(*xststdcqp_<mode>): Likewise.

[gcc/testsuite]
2017-11-10  Michael Meissner  <meissner@linux.vnet.ibm.com>

	* gcc.target/powerpc/float128-hw4.c: New test.
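
The type-compatibility rule described in the rs6000-c.c entry above can be
sketched in isolation as follows.  This is only a model: the ty_kind enum, the
ieee_long_double flag, and the plain equality used in place of the language
hook are assumptions made for the example, not the tree-based code in the
patch.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical type tags standing in for GCC tree type nodes.  */
enum ty_kind { TY_INT, TY_LONG, TY_DOUBLE, TY_LONG_DOUBLE, TY_FLOAT128 };

static bool
ty_is_integral (enum ty_kind t)
{
  return t == TY_INT || t == TY_LONG;
}

/* _Float128 always counts; long double counts only when it is IEEE 128-bit
   (modelled here by the ieee_long_double flag).  */
static bool
ty_is_float128 (enum ty_kind t, bool ieee_long_double)
{
  return t == TY_FLOAT128 || (ieee_long_double && t == TY_LONG_DOUBLE);
}

static bool
ty_compatible (enum ty_kind arg, enum ty_kind builtin, bool ieee_long_double)
{
  assert (arg <= TY_FLOAT128 && builtin <= TY_FLOAT128);

  if (ty_is_integral (arg) && ty_is_integral (builtin))
    return true;
  if (ty_is_float128 (arg, ieee_long_double)
      && ty_is_float128 (builtin, ieee_long_double))
    return true;
  return arg == builtin;        /* Stand-in for the language hook.  */
}
```

In this model a long double argument matches a _Float128 parameter only when
the flag is set, which mirrors the -mabi=ieeelongdouble condition.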

Comments

Segher Boessenkool Nov. 14, 2017, 5:01 p.m. UTC | #1
Hi!

On Fri, Nov 10, 2017 at 06:13:19PM -0500, Michael Meissner wrote:
> This patch updates the float128 built-in functions that get/set exponents, get
> mantissa, and do tests to work with _Float128 on the current system, and with
> long double when -mabi=ieeelongdouble is used.
> 
> The issue is when long double == IEEE, we use TFmode instead of KFmode.  I
> decided to fix this inside of rs6000_expand_builtin, adding a switch statement
> for the KFmode float128 built-ins, and switching them to the TFmode variant if
> -mabi=ieeelongdouble.

> 	(scalar_extract_expq): Change float128 built-in functions to
> 	accomidate having both KFmode and TFmode functions.  Use the
> 	KFmode variant as the default.

"accommodate".

> 	* config/rs6000/vsx.md (xsxexpqp_<mode>): Change float128 built-in
> 	function insns to support both TFmode and KFmode varaints.

"variants".

Other than those trivialities, looks fine.  Please commit.  Thanks!


Segher
Michael Meissner Nov. 15, 2017, 9:56 p.m. UTC | #2
David tells me that the patch to enable float128 built-in functions to work
with the -mabi=ieeelongdouble option broke AIX because on AIX, the float128
insns are disabled, and they all become CODE_FOR_nothing.  The switch statement
that was added in rs6000.c to map KFmode built-in functions to TFmode breaks
under AIX.

I changed the code to use a separate table, which I build on the first call.
If the insn was not generated, its entry will just be CODE_FOR_nothing, and
the KF->TF mode conversion will not be done.
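
A minimal sketch of such a lazily built table, under stated assumptions: the
tbl_ names are hypothetical, TBL_NOTHING plays the role of CODE_FOR_nothing,
and the way a zero entry falls back to the original code is one plausible
reading of the description above, not necessarily what the patch does.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum tbl_icode
{
  TBL_NOTHING = 0,      /* Plays the role of CODE_FOR_nothing.  */
  TBL_SQRTKF2_ODD,
  TBL_SQRTTF2_ODD,
  TBL_ADDKF3_ODD,
  TBL_MAX
};

struct tbl_map { enum tbl_icode from, to; };

/* The second pair models insns that were not generated on this target:
   both names collapse to TBL_NOTHING.  */
static const struct tbl_map tbl_pairs[] = {
  { TBL_SQRTKF2_ODD, TBL_SQRTTF2_ODD },
  { TBL_NOTHING, TBL_NOTHING },
};

static enum tbl_icode tbl_table[TBL_MAX];   /* Zero-initialized.  */
static bool tbl_first_time = true;

static enum tbl_icode
tbl_lookup (enum tbl_icode icode)
{
  assert (icode < TBL_MAX);

  if (tbl_first_time)
    {
      tbl_first_time = false;
      for (size_t i = 0; i < sizeof tbl_pairs / sizeof tbl_pairs[0]; i++)
        if (tbl_pairs[i].from != TBL_NOTHING)
          tbl_table[tbl_pairs[i].from] = tbl_pairs[i].to;
    }

  /* A zero entry means no TF twin was recorded: keep the original code.  */
  enum tbl_icode mapped = tbl_table[icode];
  return mapped != TBL_NOTHING ? mapped : icode;
}
```

Unlike a switch, the table tolerates several source names sharing the value
zero, since they are data rather than case labels.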

I have tested this on a little endian power8 system and there were no
regressions.  Once David verifies that it builds on AIX, can I check this into
the trunk?

2017-11-15  Michael Meissner  <meissner@linux.vnet.ibm.com>

	* config/rs6000/rs6000.c (rs6000_expand_builtin): Do not use a
	switch to map KFmode built-in functions to TFmode.
Segher Boessenkool Nov. 16, 2017, 10:48 a.m. UTC | #3
On Wed, Nov 15, 2017 at 04:56:10PM -0500, Michael Meissner wrote:
> David tells me that the patch to enable float128 built-in functions to work
> with the -mabi=ieeelongdouble option broke AIX because on AIX, the float128
> insns are disabled, and they all become CODE_FOR_nothing.  The switch statement
> that was added in rs6000.c to map KFmode built-in functions to TFmode breaks
> under AIX.

It also breaks on Linux with older binutils (no HAVE_AS_POWER9 defined).

> I changed the code to have a separate table, and the first call, I build the
> table.  If the insn was not generated, it will just be CODE_FOR_nothing, and
> the KF->TF mode conversion will not be done.
> 
> I have tested this on a little endian power8 system and there were no
> regressions.  Once David verifies that it builds on AIX, can I check this into
> the trunk?

I don't like this scheme much (huge table, initialisation at runtime, etc.),
but okay for trunk, to unbreak things there.

Some comments on the patch:

> +      if (first_time)
> +	{
> +	  first_time = false;
> +	  gcc_assert ((int)CODE_FOR_nothing == 0);

No useless cast please.  The whole assert is pretty useless fwiw; just
take it out?

> +	  for (i = 0; i < ARRAY_SIZE (map); i++)
> +	    map_insn_code[(int)map[i].from] = map[i].to;
> +	}

Space after cast.

Only do this for codes that are *not* CODE_FOR_nothing?


Segher
Michael Meissner Nov. 16, 2017, 5:48 p.m. UTC | #4
On Thu, Nov 16, 2017 at 04:48:18AM -0600, Segher Boessenkool wrote:
> On Wed, Nov 15, 2017 at 04:56:10PM -0500, Michael Meissner wrote:
> > David tells me that the patch to enable float128 built-in functions to work
> > with the -mabi=ieeelongdouble option broke AIX because on AIX, the float128
> > insns are disabled, and they all become CODE_FOR_nothing.  The switch statement
> > that was added in rs6000.c to map KFmode built-in functions to TFmode breaks
> > under AIX.
> 
> It also breaks on Linux with older binutils (no HAVE_AS_POWER9 defined).
> 
> > I changed the code to have a separate table, and the first call, I build the
> > table.  If the insn was not generated, it will just be CODE_FOR_nothing, and
> > the KF->TF mode conversion will not be done.
> > 
> > I have tested this on a little endian power8 system and there were no
> > regressions.  Once David verifies that it builds on AIX, can I check this into
> > the trunk?
> 
> I don't like this scheme much (huge table, initialisation at runtime, etc.),
> but okay for trunk, to unbreak things there.
> 
> Some comments on the patch:
> 
> > +      if (first_time)
> > +	{
> > +	  first_time = false;
> > +	  gcc_assert ((int)CODE_FOR_nothing == 0);
> 
> No useless cast please.  The whole assert is pretty useless fwiw; just
> take it out?
> 
> > +	  for (i = 0; i < ARRAY_SIZE (map); i++)
> > +	    map_insn_code[(int)map[i].from] = map[i].to;
> > +	}
> 
> Space after cast.
> 
> Only do this for codes that are *not* CODE_FOR_nothing?

I must admit to not liking the code, and it is overly complicated.

It occurred to me this morning that a much simpler patch is to just #ifdef out
the switch statement if we don't have the proper assembler.  I tried this on an
old power7 system using the system assembler (which does not support the ISA
3.0 instructions) and it built fine.  I think this will work on AIX.  David,
can you check this?

I will fire off a build, and if it is successful, can I check this patch
instead of the other patch?

2017-11-15  Michael Meissner  <meissner@linux.vnet.ibm.com>

	* config/rs6000/rs6000.c (rs6000_expand_builtin): Do not do the
	switch statement mapping KF built-ins to TF built-ins if we don't
	have the proper ISA 3.0 assembler support.
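
The #ifdef approach described above might look roughly like this in a
standalone form.  IFD_HAVE_ISA3_ASSEMBLER is a hypothetical macro chosen for
the sketch (the thread mentions HAVE_AS_POWER9 as the real configure symbol),
and the enum and function are likewise invented for illustration.

```c
#include <assert.h>
#include <stdbool.h>

enum ifd_icode { IFD_NOTHING = 0, IFD_SQRT_KF, IFD_SQRT_TF, IFD_OTHER };

static enum ifd_icode
ifd_remap (enum ifd_icode icode, bool long_double_is_ieee128)
{
  assert (icode <= IFD_OTHER);

#ifdef IFD_HAVE_ISA3_ASSEMBLER
  /* Only compiled when the assembler knows the ISA 3.0 instructions, so the
     case labels never name insns that were not generated.  */
  if (long_double_is_ieee128)
    switch (icode)
      {
      default:
        break;

      case IFD_SQRT_KF:  icode = IFD_SQRT_TF;  break;
      }
#else
  (void) long_double_is_ieee128;    /* Unused without the mapping.  */
#endif

  return icode;
}
```

When the macro is left undefined, as in this sketch, the function returns its
argument unchanged; defining it enables the KF to TF mapping.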
David Edelsohn Nov. 16, 2017, 5:54 p.m. UTC | #5
On Thu, Nov 16, 2017 at 12:48 PM, Michael Meissner
<meissner@linux.vnet.ibm.com> wrote:
> On Thu, Nov 16, 2017 at 04:48:18AM -0600, Segher Boessenkool wrote:
>> On Wed, Nov 15, 2017 at 04:56:10PM -0500, Michael Meissner wrote:
>> > David tells me that the patch to enable float128 built-in functions to work
>> > with the -mabi=ieeelongdouble option broke AIX because on AIX, the float128
>> > insns are disabled, and they all become CODE_FOR_nothing.  The switch statement
>> > that was added in rs6000.c to map KFmode built-in functions to TFmode breaks
>> > under AIX.
>>
>> It also breaks on Linux with older binutils (no HAVE_AS_POWER9 defined).
>>
>> > I changed the code to have a separate table, and the first call, I build the
>> > table.  If the insn was not generated, it will just be CODE_FOR_nothing, and
>> > the KF->TF mode conversion will not be done.
>> >
>> > I have tested this on a little endian power8 system and there were no
>> > regressions.  Once David verifies that it builds on AIX, can I check this into
>> > the trunk?
>>
>> I don't like this scheme much (huge table, initialisation at runtime, etc.),
>> but okay for trunk, to unbreak things there.
>>
>> Some comments on the patch:
>>
>> > +      if (first_time)
>> > +   {
>> > +     first_time = false;
>> > +     gcc_assert ((int)CODE_FOR_nothing == 0);
>>
>> No useless cast please.  The whole assert is pretty useless fwiw; just
>> take it out?
>>
>> > +     for (i = 0; i < ARRAY_SIZE (map); i++)
>> > +       map_insn_code[(int)map[i].from] = map[i].to;
>> > +   }
>>
>> Space after cast.
>>
>> Only do this for codes that are *not* CODE_FOR_nothing?
>
> I must admit to not liking the code, and it is overly complicated.
>
> It occurred to me this morning that a much simpler patch is to just #ifdef out
> the switch statement if we don't have the proper assembler.  I tried this on an
> old power7 system using the system assembler (which does not support the ISA
> 3.0 instructions) and it built fine.  I think this will work on AIX.  David can
> you check this?
>
> I will fire off a build, and if it is successful, can I check this patch
> instead of the other patch?

This patch will solve the problem.

GCC policy prefers runtime tests over #ifdef, but I agree that the
runtime approach is overly messy.  This seems like a reasonable
approach to me.

Thanks, David
Segher Boessenkool Nov. 16, 2017, 6:38 p.m. UTC | #6
On Thu, Nov 16, 2017 at 12:54:54PM -0500, David Edelsohn wrote:
> On Thu, Nov 16, 2017 at 12:48 PM, Michael Meissner
> <meissner@linux.vnet.ibm.com> wrote:
> > On Thu, Nov 16, 2017 at 04:48:18AM -0600, Segher Boessenkool wrote:
> >> On Wed, Nov 15, 2017 at 04:56:10PM -0500, Michael Meissner wrote:
> >> > David tells me that the patch to enable float128 built-in functions to work
> >> > with the -mabi=ieeelongdouble option broke AIX because on AIX, the float128
> >> > insns are disabled, and they all become CODE_FOR_nothing.  The switch statement
> >> > that was added in rs6000.c to map KFmode built-in functions to TFmode breaks
> >> > under AIX.
> >>
> >> It also breaks on Linux with older binutils (no HAVE_AS_POWER9 defined).
> >>
> >> > I changed the code to have a separate table, and the first call, I build the
> >> > table.  If the insn was not generated, it will just be CODE_FOR_nothing, and
> >> > the KF->TF mode conversion will not be done.
> >> >
> >> > I have tested this on a little endian power8 system and there were no
> >> > regressions.  Once David verifies that it builds on AIX, can I check this into
> >> > the trunk?
> >>
> >> I don't like this scheme much (huge table, initialisation at runtime, etc.),
> >> but okay for trunk, to unbreak things there.
> >>
> >> Some comments on the patch:
> >>
> >> > +      if (first_time)
> >> > +   {
> >> > +     first_time = false;
> >> > +     gcc_assert ((int)CODE_FOR_nothing == 0);
> >>
> >> No useless cast please.  The whole assert is pretty useless fwiw; just
> >> take it out?
> >>
> >> > +     for (i = 0; i < ARRAY_SIZE (map); i++)
> >> > +       map_insn_code[(int)map[i].from] = map[i].to;
> >> > +   }
> >>
> >> Space after cast.
> >>
> >> Only do this for codes that are *not* CODE_FOR_nothing?
> >
> > I must admit to not liking the code, and it is overly complicated.
> >
> > It occurred to me this morning that a much simpler patch is to just #ifdef out
> > the switch statement if we don't have the proper assembler.  I tried this on an
> > old power7 system using the system assembler (which does not support the ISA
> > 3.0 instructions) and it built fine.  I think this will work on AIX.  David can
> > you check this?
> >
> > I will fire off a build, and if it is successful, can I check this patch
> > instead of the other patch?
> 
> This patch will solve the problem.
> 
> GCC policy prefers runtime tests over #ifdef, but I agree that the
> runtime approach is overly messy.  This seems like a reasonable
> approach to me.

Same here.  It's a nice simple patch, and with a comment even :-)

We also have 117 #if.* in rs6000.c already, one more won't hurt.


Segher

Patch

Index: gcc/config/rs6000/rs6000-c.c
===================================================================
--- gcc/config/rs6000/rs6000-c.c	(revision 254556)
+++ gcc/config/rs6000/rs6000-c.c	(working copy)
@@ -5699,12 +5699,22 @@  rs6000_builtin_type (int id)
   return id < 0 ? build_pointer_type (t) : t;
 }
 
-/* Check whether the type of an argument, T, is compatible with a
-   type ID stored into a struct altivec_builtin_types.  Integer
-   types are considered compatible; otherwise, the language hook
-   lang_hooks.types_compatible_p makes the decision.  */
+/* Check whether the type of an argument, T, is compatible with a type ID
+   stored into a struct altivec_builtin_types.  Integer types are considered
+   compatible; otherwise, the language hook lang_hooks.types_compatible_p makes
+   the decision.  Also allow long double and _Float128 to be compatible if
+   -mabi=ieeelongdouble.  */
 
 static inline bool
+is_float128_p (tree t)
+{
+  return (t == float128_type_node
+	  || (TARGET_IEEEQUAD
+	      && TARGET_LONG_DOUBLE_128
+	      && t == long_double_type_node));
+}
+  
+static inline bool
 rs6000_builtin_type_compatible (tree t, int id)
 {
   tree builtin_type;
@@ -5713,6 +5723,9 @@  rs6000_builtin_type_compatible (tree t, 
     return false;
   if (INTEGRAL_TYPE_P (t) && INTEGRAL_TYPE_P (builtin_type))
     return true;
+  else if (TARGET_IEEEQUAD && TARGET_LONG_DOUBLE_128
+	   && is_float128_p (t) && is_float128_p (builtin_type))
+    return true;
   else
     return lang_hooks.types_compatible_p (t, builtin_type);
 }
Index: gcc/config/rs6000/rs6000-builtin.def
===================================================================
--- gcc/config/rs6000/rs6000-builtin.def	(revision 254556)
+++ gcc/config/rs6000/rs6000-builtin.def	(working copy)
@@ -909,6 +909,51 @@ 
 		     | RS6000_BTC_BINARY),				\
 		    CODE_FOR_nothing)			/* ICODE */
 
+/* Built-in functions for IEEE 128-bit hardware floating point.  IEEE 128-bit
+   hardware requires p9-vector and 64-bit operation.  These functions use just
+   __builtin_ as the prefix.  */
+#define BU_FLOAT128_HW_1(ENUM, NAME, ATTR, ICODE)			\
+  RS6000_BUILTIN_1 (FLOAT128_BUILTIN_ ## ENUM,		/* ENUM */	\
+		    "__builtin_" NAME,			/* NAME */	\
+		    RS6000_BTM_FLOAT128_HW,		/* MASK */	\
+		    (RS6000_BTC_ ## ATTR		/* ATTR */	\
+		     | RS6000_BTC_UNARY),				\
+		    CODE_FOR_ ## ICODE)			/* ICODE */
+
+#define BU_FLOAT128_HW_2(ENUM, NAME, ATTR, ICODE)			\
+  RS6000_BUILTIN_2 (FLOAT128_BUILTIN_ ## ENUM,		/* ENUM */	\
+		    "__builtin_" NAME,			/* NAME */	\
+		    RS6000_BTM_FLOAT128_HW,		/* MASK */	\
+		    (RS6000_BTC_ ## ATTR		/* ATTR */	\
+		     | RS6000_BTC_BINARY),				\
+		    CODE_FOR_ ## ICODE)			/* ICODE */
+
+#define BU_FLOAT128_HW_3(ENUM, NAME, ATTR, ICODE)			\
+  RS6000_BUILTIN_3 (FLOAT128_BUILTIN_ ## ENUM,		/* ENUM */	\
+		    "__builtin_" NAME,			/* NAME */	\
+		    RS6000_BTM_FLOAT128_HW,		/* MASK */	\
+		    (RS6000_BTC_ ## ATTR		/* ATTR */	\
+		     | RS6000_BTC_TERNARY),				\
+		    CODE_FOR_ ## ICODE)			/* ICODE */
+
+/* Built-in functions for IEEE 128-bit hardware floating point.  These
+   functions use __builtin_vsx_ as the prefix.  */
+#define BU_FLOAT128_HW_VSX_1(ENUM, NAME, ATTR, ICODE)			\
+  RS6000_BUILTIN_1 (P9V_BUILTIN_ ## ENUM,		/* ENUM */	\
+		    "__builtin_vsx_" NAME,		/* NAME */	\
+		    RS6000_BTM_FLOAT128_HW,		/* MASK */	\
+		    (RS6000_BTC_ ## ATTR		/* ATTR */	\
+		     | RS6000_BTC_UNARY),				\
+		    CODE_FOR_ ## ICODE)			/* ICODE */
+
+#define BU_FLOAT128_HW_VSX_2(ENUM, NAME, ATTR, ICODE)			\
+  RS6000_BUILTIN_2 (P9V_BUILTIN_ ## ENUM,		/* ENUM */	\
+		    "__builtin_vsx_" NAME,		/* NAME */	\
+		    RS6000_BTM_FLOAT128_HW,		/* MASK */	\
+		    (RS6000_BTC_ ## ATTR		/* ATTR */	\
+		     | RS6000_BTC_BINARY),				\
+		    CODE_FOR_ ## ICODE)			/* ICODE */
+
 #endif
 
 
@@ -2038,10 +2083,10 @@  BU_P9V_OVERLOAD_3 (RLMI,	"rlmi")
 BU_P9V_64BIT_VSX_1 (VSEEDP,	"scalar_extract_exp",	CONST,	xsxexpdp)
 BU_P9V_64BIT_VSX_1 (VSESDP,	"scalar_extract_sig",	CONST,	xsxsigdp)
 
-BU_P9V_64BIT_VSX_1 (VSEEQP,	"scalar_extract_expq",	CONST,	xsxexpqp)
-BU_P9V_64BIT_VSX_1 (VSESQP,	"scalar_extract_sigq",	CONST,	xsxsigqp)
+BU_FLOAT128_HW_VSX_1 (VSEEQP,	"scalar_extract_expq",	CONST,	xsxexpqp_kf)
+BU_FLOAT128_HW_VSX_1 (VSESQP,	"scalar_extract_sigq",	CONST,	xsxsigqp_kf)
 
-BU_P9V_VSX_1 (VSTDCNQP,	"scalar_test_neg_qp",	CONST,	xststdcnegqp)
+BU_FLOAT128_HW_VSX_1 (VSTDCNQP, "scalar_test_neg_qp",	CONST,	xststdcnegqp_kf)
 BU_P9V_VSX_1 (VSTDCNDP,	"scalar_test_neg_dp",	CONST,	xststdcnegdp)
 BU_P9V_VSX_1 (VSTDCNSP,	"scalar_test_neg_sp",	CONST,	xststdcnegsp)
 
@@ -2057,15 +2102,15 @@  BU_P9V_VSX_1 (XXBRH_V8HI,	"xxbrh_v8hi",	
 BU_P9V_64BIT_VSX_2 (VSIEDP,	"scalar_insert_exp",	CONST,	xsiexpdp)
 BU_P9V_64BIT_VSX_2 (VSIEDPF,	"scalar_insert_exp_dp",	CONST,	xsiexpdpf)
 
-BU_P9V_64BIT_VSX_2 (VSIEQP,	"scalar_insert_exp_q",	CONST,	xsiexpqp)
-BU_P9V_64BIT_VSX_2 (VSIEQPF,	"scalar_insert_exp_qp",	CONST,	xsiexpqpf)
+BU_FLOAT128_HW_VSX_2 (VSIEQP,	"scalar_insert_exp_q",	CONST,	xsiexpqp_kf)
+BU_FLOAT128_HW_VSX_2 (VSIEQPF,	"scalar_insert_exp_qp",	CONST,	xsiexpqpf_kf)
 
 BU_P9V_VSX_2 (VSCEDPGT,	"scalar_cmp_exp_dp_gt",	CONST,	xscmpexpdp_gt)
 BU_P9V_VSX_2 (VSCEDPLT,	"scalar_cmp_exp_dp_lt",	CONST,	xscmpexpdp_lt)
 BU_P9V_VSX_2 (VSCEDPEQ,	"scalar_cmp_exp_dp_eq",	CONST,	xscmpexpdp_eq)
 BU_P9V_VSX_2 (VSCEDPUO,	"scalar_cmp_exp_dp_unordered",	CONST,	xscmpexpdp_unordered)
 
-BU_P9V_VSX_2 (VSTDCQP,	"scalar_test_data_class_qp",	CONST,	xststdcqp)
+BU_FLOAT128_HW_VSX_2 (VSTDCQP, "scalar_test_data_class_qp",	CONST,	xststdcqp_kf)
 BU_P9V_VSX_2 (VSTDCDP,	"scalar_test_data_class_dp",	CONST,	xststdcdp)
 BU_P9V_VSX_2 (VSTDCSP,	"scalar_test_data_class_sp",	CONST,	xststdcsp)
 
@@ -2142,6 +2187,16 @@  BU_P9V_VSX_2 (VEXTRACT4B,   "vextract4b"
 BU_P9V_VSX_3 (VINSERT4B,    "vinsert4b",	CONST,	vinsert4b)
 BU_P9V_VSX_3 (VINSERT4B_DI, "vinsert4b_di",	CONST,	vinsert4b_di)
 
+/* Hardware IEEE 128-bit floating point round to odd instructions added in ISA
+   3.0 (power9).  */
+BU_FLOAT128_HW_1 (SQRTF128_ODD,  "sqrtf128_round_to_odd",  FP, sqrtkf2_odd)
+BU_FLOAT128_HW_1 (TRUNCF128_ODD, "truncf128_round_to_odd", FP, trunckfdf2_odd)
+BU_FLOAT128_HW_2 (ADDF128_ODD,   "addf128_round_to_odd",   FP, addkf3_odd)
+BU_FLOAT128_HW_2 (SUBF128_ODD,   "subf128_round_to_odd",   FP, subkf3_odd)
+BU_FLOAT128_HW_2 (MULF128_ODD,   "mulf128_round_to_odd",   FP, mulkf3_odd)
+BU_FLOAT128_HW_2 (DIVF128_ODD,   "divf128_round_to_odd",   FP, divkf3_odd)
+BU_FLOAT128_HW_3 (FMAF128_ODD,   "fmaf128_round_to_odd",   FP, fmakf4_odd)
+
 /* 3 argument vector functions returning void, treated as SPECIAL,
    added in ISA 3.0 (power9).  */
 BU_P9V_64BIT_AV_X (STXVL,	"stxvl",	MISC)
@@ -2464,34 +2519,6 @@  BU_SPECIAL_X (RS6000_BUILTIN_CPU_IS, "__
 BU_SPECIAL_X (RS6000_BUILTIN_CPU_SUPPORTS, "__builtin_cpu_supports",
 	      RS6000_BTM_ALWAYS, RS6000_BTC_MISC)
 
-BU_SPECIAL_X (FLOAT128_BUILTIN_SQRTF128_ODD,
-	      "__builtin_sqrtf128_round_to_odd",
-	      RS6000_BTM_FLOAT128_HW, RS6000_BTC_MISC)
-
-BU_SPECIAL_X (FLOAT128_BUILTIN_TRUNCF128_ODD,
-	      "__builtin_truncf128_round_to_odd",
-	      RS6000_BTM_FLOAT128_HW, RS6000_BTC_MISC)
-
-BU_SPECIAL_X (FLOAT128_BUILTIN_ADDF128_ODD,
-	      "__builtin_addf128_round_to_odd",
-	      RS6000_BTM_FLOAT128_HW, RS6000_BTC_MISC)
-
-BU_SPECIAL_X (FLOAT128_BUILTIN_SUBF128_ODD,
-	      "__builtin_subf128_round_to_odd",
-	      RS6000_BTM_FLOAT128_HW, RS6000_BTC_MISC)
-
-BU_SPECIAL_X (FLOAT128_BUILTIN_MULF128_ODD,
-	      "__builtin_mulf128_round_to_odd",
-	      RS6000_BTM_FLOAT128_HW, RS6000_BTC_MISC)
-
-BU_SPECIAL_X (FLOAT128_BUILTIN_DIVF128_ODD,
-	      "__builtin_divf128_round_to_odd",
-	      RS6000_BTM_FLOAT128_HW, RS6000_BTC_MISC)
-
-BU_SPECIAL_X (FLOAT128_BUILTIN_FMAF128_ODD,
-	      "__builtin_fmaf128_round_to_odd",
-	      RS6000_BTM_FLOAT128_HW, RS6000_BTC_MISC)
-
 /* Darwin CfString builtin.  */
 BU_SPECIAL_X (RS6000_BUILTIN_CFSTRING, "__builtin_cfstring", RS6000_BTM_ALWAYS,
 	      RS6000_BTC_MISC)
Index: gcc/config/rs6000/rs6000.c
===================================================================
--- gcc/config/rs6000/rs6000.c	(revision 254556)
+++ gcc/config/rs6000/rs6000.c	(working copy)
@@ -14086,7 +14086,8 @@  rs6000_expand_binop_builtin (enum insn_c
 	  return CONST0_RTX (tmode);
 	}
     }
-  else if (icode == CODE_FOR_xststdcqp
+  else if (icode == CODE_FOR_xststdcqp_kf
+	   || icode == CODE_FOR_xststdcqp_tf
 	   || icode == CODE_FOR_xststdcdp
 	   || icode == CODE_FOR_xststdcsp
 	   || icode == CODE_FOR_xvtstdcdp
@@ -16736,10 +16737,37 @@  rs6000_expand_builtin (tree exp, rtx tar
   bool success;
   HOST_WIDE_INT mask = rs6000_builtin_info[uns_fcode].mask;
   bool func_valid_p = ((rs6000_builtin_mask & mask) == mask);
+  enum insn_code icode = rs6000_builtin_info[uns_fcode].icode;
+
+  /* We have two different modes (KFmode, TFmode) that are the IEEE 128-bit
+     floating point type, depending on whether long double is the IBM extended
+     double (KFmode) or long double is IEEE 128-bit (TFmode).  It is simpler if
+     we only define one variant of the built-in function, and switch the code
+     when defining it, rather than defining two built-ins and using the
+     overload table in rs6000-c.c to switch between the two.  */
+  if (FLOAT128_IEEE_P (TFmode))
+    switch (icode)
+      {
+      default:
+	break;
+
+      case CODE_FOR_sqrtkf2_odd:	icode = CODE_FOR_sqrttf2_odd;	break;
+      case CODE_FOR_trunckfdf2_odd:	icode = CODE_FOR_trunctfdf2_odd; break;
+      case CODE_FOR_addkf3_odd:		icode = CODE_FOR_addtf3_odd;	break;
+      case CODE_FOR_subkf3_odd:		icode = CODE_FOR_subtf3_odd;	break;
+      case CODE_FOR_mulkf3_odd:		icode = CODE_FOR_multf3_odd;	break;
+      case CODE_FOR_divkf3_odd:		icode = CODE_FOR_divtf3_odd;	break;
+      case CODE_FOR_fmakf4_odd:		icode = CODE_FOR_fmatf4_odd;	break;
+      case CODE_FOR_xsxexpqp_kf:	icode = CODE_FOR_xsxexpqp_tf;	break;
+      case CODE_FOR_xsxsigqp_kf:	icode = CODE_FOR_xsxsigqp_tf;	break;
+      case CODE_FOR_xststdcnegqp_kf:	icode = CODE_FOR_xststdcnegqp_tf; break;
+      case CODE_FOR_xsiexpqp_kf:	icode = CODE_FOR_xsiexpqp_tf;	break;
+      case CODE_FOR_xsiexpqpf_kf:	icode = CODE_FOR_xsiexpqpf_tf;	break;
+      case CODE_FOR_xststdcqp_kf:	icode = CODE_FOR_xststdcqp_tf;	break;
+      }
 
   if (TARGET_DEBUG_BUILTIN)
     {
-      enum insn_code icode = rs6000_builtin_info[uns_fcode].icode;
       const char *name1 = rs6000_builtin_info[uns_fcode].name;
       const char *name2 = (icode != CODE_FOR_nothing)
 			   ? get_insn_name ((int) icode)
@@ -16815,48 +16843,13 @@  rs6000_expand_builtin (tree exp, rtx tar
     case RS6000_BUILTIN_CPU_SUPPORTS:
       return cpu_expand_builtin (fcode, exp, target);
 
-    case FLOAT128_BUILTIN_SQRTF128_ODD:
-      return rs6000_expand_unop_builtin (TARGET_IEEEQUAD
-					 ? CODE_FOR_sqrttf2_odd
-					 : CODE_FOR_sqrtkf2_odd, exp, target);
-
-    case FLOAT128_BUILTIN_TRUNCF128_ODD:
-      return rs6000_expand_unop_builtin (TARGET_IEEEQUAD
-					 ? CODE_FOR_trunctfdf2_odd
-					 : CODE_FOR_trunckfdf2_odd, exp, target);
-
-    case FLOAT128_BUILTIN_ADDF128_ODD:
-      return rs6000_expand_binop_builtin (TARGET_IEEEQUAD
-					  ? CODE_FOR_addtf3_odd
-					  : CODE_FOR_addkf3_odd, exp, target);
-
-    case FLOAT128_BUILTIN_SUBF128_ODD:
-      return rs6000_expand_binop_builtin (TARGET_IEEEQUAD
-					  ? CODE_FOR_subtf3_odd
-					  : CODE_FOR_subkf3_odd, exp, target);
-
-    case FLOAT128_BUILTIN_MULF128_ODD:
-      return rs6000_expand_binop_builtin (TARGET_IEEEQUAD
-					  ? CODE_FOR_multf3_odd
-					  : CODE_FOR_mulkf3_odd, exp, target);
-
-    case FLOAT128_BUILTIN_DIVF128_ODD:
-      return rs6000_expand_binop_builtin (TARGET_IEEEQUAD
-					  ? CODE_FOR_divtf3_odd
-					  : CODE_FOR_divkf3_odd, exp, target);
-
-    case FLOAT128_BUILTIN_FMAF128_ODD:
-      return rs6000_expand_ternop_builtin (TARGET_IEEEQUAD
-					   ? CODE_FOR_fmatf4_odd
-					   : CODE_FOR_fmakf4_odd, exp, target);
-
     case ALTIVEC_BUILTIN_MASK_FOR_LOAD:
     case ALTIVEC_BUILTIN_MASK_FOR_STORE:
       {
-	int icode = (BYTES_BIG_ENDIAN ? (int) CODE_FOR_altivec_lvsr_direct
+	int icode2 = (BYTES_BIG_ENDIAN ? (int) CODE_FOR_altivec_lvsr_direct
 		     : (int) CODE_FOR_altivec_lvsl_direct);
-	machine_mode tmode = insn_data[icode].operand[0].mode;
-	machine_mode mode = insn_data[icode].operand[1].mode;
+	machine_mode tmode = insn_data[icode2].operand[0].mode;
+	machine_mode mode = insn_data[icode2].operand[1].mode;
 	tree arg;
 	rtx op, addr, pat;
 
@@ -16878,10 +16871,10 @@  rs6000_expand_builtin (tree exp, rtx tar
 
 	if (target == 0
 	    || GET_MODE (target) != tmode
-	    || ! (*insn_data[icode].operand[0].predicate) (target, tmode))
+	    || ! (*insn_data[icode2].operand[0].predicate) (target, tmode))
 	  target = gen_reg_rtx (tmode);
 
-	pat = GEN_FCN (icode) (target, op);
+	pat = GEN_FCN (icode2) (target, op);
 	if (!pat)
 	  return 0;
 	emit_insn (pat);
@@ -16939,25 +16932,25 @@  rs6000_expand_builtin (tree exp, rtx tar
   d = bdesc_1arg;
   for (i = 0; i < ARRAY_SIZE (bdesc_1arg); i++, d++)
     if (d->code == fcode)
-      return rs6000_expand_unop_builtin (d->icode, exp, target);
+      return rs6000_expand_unop_builtin (icode, exp, target);
 
   /* Handle simple binary operations.  */
   d = bdesc_2arg;
   for (i = 0; i < ARRAY_SIZE (bdesc_2arg); i++, d++)
     if (d->code == fcode)
-      return rs6000_expand_binop_builtin (d->icode, exp, target);
+      return rs6000_expand_binop_builtin (icode, exp, target);
 
   /* Handle simple ternary operations.  */
   d = bdesc_3arg;
   for (i = 0; i < ARRAY_SIZE  (bdesc_3arg); i++, d++)
     if (d->code == fcode)
-      return rs6000_expand_ternop_builtin (d->icode, exp, target);
+      return rs6000_expand_ternop_builtin (icode, exp, target);
 
   /* Handle simple no-argument operations. */
   d = bdesc_0arg;
   for (i = 0; i < ARRAY_SIZE (bdesc_0arg); i++, d++)
     if (d->code == fcode)
-      return rs6000_expand_zeroop_builtin (d->icode, target);
+      return rs6000_expand_zeroop_builtin (icode, target);
 
   gcc_unreachable ();
 }
@@ -17228,32 +17221,6 @@  rs6000_init_builtins (void)
   def_builtin ("__builtin_cpu_is", ftype, RS6000_BUILTIN_CPU_IS);
   def_builtin ("__builtin_cpu_supports", ftype, RS6000_BUILTIN_CPU_SUPPORTS);
 
-  ftype = build_function_type_list (ieee128_float_type_node,
-				    ieee128_float_type_node, NULL_TREE);
-  def_builtin ("__builtin_sqrtf128_round_to_odd", ftype,
-	       FLOAT128_BUILTIN_SQRTF128_ODD);
-  def_builtin ("__builtin_truncf128_round_to_odd", ftype,
-	       FLOAT128_BUILTIN_TRUNCF128_ODD);
-
-  ftype = build_function_type_list (ieee128_float_type_node,
-				    ieee128_float_type_node,
-				    ieee128_float_type_node, NULL_TREE);
-  def_builtin ("__builtin_addf128_round_to_odd", ftype,
-	       FLOAT128_BUILTIN_ADDF128_ODD);
-  def_builtin ("__builtin_subf128_round_to_odd", ftype,
-	       FLOAT128_BUILTIN_SUBF128_ODD);
-  def_builtin ("__builtin_mulf128_round_to_odd", ftype,
-	       FLOAT128_BUILTIN_MULF128_ODD);
-  def_builtin ("__builtin_divf128_round_to_odd", ftype,
-	       FLOAT128_BUILTIN_DIVF128_ODD);
-
-  ftype = build_function_type_list (ieee128_float_type_node,
-				    ieee128_float_type_node,
-				    ieee128_float_type_node,
-				    ieee128_float_type_node, NULL_TREE);
-  def_builtin ("__builtin_fmaf128_round_to_odd", ftype,
-	       FLOAT128_BUILTIN_FMAF128_ODD);
-
   /* AIX libm provides clog as __clog.  */
   if (TARGET_XCOFF &&
       (tdecl = builtin_decl_explicit (BUILT_IN_CLOG)) != NULL_TREE)
Index: gcc/config/rs6000/vsx.md
===================================================================
--- gcc/config/rs6000/vsx.md	(revision 254556)
+++ gcc/config/rs6000/vsx.md	(working copy)
@@ -4064,9 +4064,9 @@  (define_insn "*vsx_sign_extend_si_v2di"
 ;; ISA 3.0 Binary Floating-Point Support
 
 ;; VSX Scalar Extract Exponent Quad-Precision
-(define_insn "xsxexpqp"
+(define_insn "xsxexpqp_<mode>"
   [(set (match_operand:DI 0 "altivec_register_operand" "=v")
-	(unspec:DI [(match_operand:KF 1 "altivec_register_operand" "v")]
+	(unspec:DI [(match_operand:IEEE128 1 "altivec_register_operand" "v")]
 	 UNSPEC_VSX_SXEXPDP))]
   "TARGET_P9_VECTOR"
   "xsxexpqp %0,%1"
@@ -4082,9 +4082,9 @@  (define_insn "xsxexpdp"
   [(set_attr "type" "integer")])
 
 ;; VSX Scalar Extract Significand Quad-Precision
-(define_insn "xsxsigqp"
+(define_insn "xsxsigqp_<mode>"
   [(set (match_operand:TI 0 "altivec_register_operand" "=v")
-	(unspec:TI [(match_operand:KF 1 "altivec_register_operand" "v")]
+	(unspec:TI [(match_operand:IEEE128 1 "altivec_register_operand" "v")]
 	 UNSPEC_VSX_SXSIG))]
   "TARGET_P9_VECTOR"
   "xsxsigqp %0,%1"
@@ -4100,20 +4100,21 @@  (define_insn "xsxsigdp"
   [(set_attr "type" "integer")])
 
 ;; VSX Scalar Insert Exponent Quad-Precision Floating Point Argument
-(define_insn "xsiexpqpf"
-  [(set (match_operand:KF 0 "altivec_register_operand" "=v")
-	(unspec:KF [(match_operand:KF 1 "altivec_register_operand" "v")
-		    (match_operand:DI 2 "altivec_register_operand" "v")]
+(define_insn "xsiexpqpf_<mode>"
+  [(set (match_operand:IEEE128 0 "altivec_register_operand" "=v")
+	(unspec:IEEE128
+	 [(match_operand:IEEE128 1 "altivec_register_operand" "v")
+	  (match_operand:DI 2 "altivec_register_operand" "v")]
 	 UNSPEC_VSX_SIEXPQP))]
   "TARGET_P9_VECTOR"
   "xsiexpqp %0,%1,%2"
   [(set_attr "type" "vecmove")])
 
 ;; VSX Scalar Insert Exponent Quad-Precision
-(define_insn "xsiexpqp"
-  [(set (match_operand:KF 0 "altivec_register_operand" "=v")
-	(unspec:KF [(match_operand:TI 1 "altivec_register_operand" "v")
-		    (match_operand:DI 2 "altivec_register_operand" "v")]
+(define_insn "xsiexpqp_<mode>"
+  [(set (match_operand:IEEE128 0 "altivec_register_operand" "=v")
+	(unspec:IEEE128 [(match_operand:TI 1 "altivec_register_operand" "v")
+			 (match_operand:DI 2 "altivec_register_operand" "v")]
 	 UNSPEC_VSX_SIEXPQP))]
   "TARGET_P9_VECTOR"
   "xsiexpqp %0,%1,%2"
@@ -4172,11 +4173,11 @@  (define_insn "*xscmpexpdp"
 ;;   (Has side effect of setting the lt bit if operand 1 is negative,
 ;;    setting the eq bit if any of the conditions tested by operand 2
 ;;    are satisfied, and clearing the gt and undordered bits to zero.)
-(define_expand "xststdcqp"
+(define_expand "xststdcqp_<mode>"
   [(set (match_dup 3)
 	(compare:CCFP
-	 (unspec:KF
-	  [(match_operand:KF 1 "altivec_register_operand" "v")
+	 (unspec:IEEE128
+	  [(match_operand:IEEE128 1 "altivec_register_operand" "v")
 	   (match_operand:SI 2 "u7bit_cint_operand" "n")]
 	  UNSPEC_VSX_STSTDC)
 	 (const_int 0)))
@@ -4210,11 +4211,11 @@  (define_expand "xststdc<Fvsx>"
 })
 
 ;; The VSX Scalar Test Negative Quad-Precision
-(define_expand "xststdcnegqp"
+(define_expand "xststdcnegqp_<mode>"
   [(set (match_dup 2)
 	(compare:CCFP
-	 (unspec:KF
-	  [(match_operand:KF 1 "altivec_register_operand" "v")
+	 (unspec:IEEE128
+	  [(match_operand:IEEE128 1 "altivec_register_operand" "v")
 	   (const_int 0)]
 	  UNSPEC_VSX_STSTDC)
 	 (const_int 0)))
@@ -4244,11 +4245,12 @@  (define_expand "xststdcneg<Fvsx>"
   operands[3] = CONST0_RTX (SImode);
 })
 
-(define_insn "*xststdcqp"
+(define_insn "*xststdcqp_<mode>"
   [(set (match_operand:CCFP 0 "" "=y")
 	(compare:CCFP
-	 (unspec:KF [(match_operand:KF 1 "altivec_register_operand" "v")
-		     (match_operand:SI 2 "u7bit_cint_operand" "n")]
+	 (unspec:IEEE128
+	  [(match_operand:IEEE128 1 "altivec_register_operand" "v")
+	   (match_operand:SI 2 "u7bit_cint_operand" "n")]
 	  UNSPEC_VSX_STSTDC)
 	 (const_int 0)))]
   "TARGET_P9_VECTOR"
Index: gcc/testsuite/gcc.target/powerpc/float128-hw4.c
===================================================================
--- gcc/testsuite/gcc.target/powerpc/float128-hw4.c	(revision 0)
+++ gcc/testsuite/gcc.target/powerpc/float128-hw4.c	(revision 0)
@@ -0,0 +1,135 @@ 
+/* { dg-do compile { target { powerpc*-*-* && lp64 } } } */
+/* { dg-require-effective-target powerpc_p9vector_ok } */
+/* { dg-options "-mpower9-vector -O2 -mabi=ieeelongdouble -Wno-psabi" } */
+
+/* Ensure that the ISA 3.0 IEEE 128-bit floating point built-in functions can
+   be used with long double when the default is IEEE 128-bit.  */
+
+#ifndef TYPE
+#define TYPE long double
+#endif
+
+unsigned int
+get_double_exponent (double a)
+{
+  return __builtin_vec_scalar_extract_exp (a);
+}
+
+unsigned int
+get_float128_exponent (TYPE a)
+{
+  return __builtin_vec_scalar_extract_exp (a);
+}
+
+unsigned long
+get_double_mantissa (double a)
+{
+  return __builtin_vec_scalar_extract_sig (a);
+}
+
+__uint128_t
+get_float128_mantissa (TYPE a)
+{
+  return __builtin_vec_scalar_extract_sig (a);
+}
+
+double
+set_double_exponent_ulong (unsigned long a, unsigned long e)
+{
+  return __builtin_vec_scalar_insert_exp (a, e);
+}
+
+TYPE
+set_float128_exponent_uint128 (__uint128_t a, unsigned long e)
+{
+  return __builtin_vec_scalar_insert_exp (a, e);
+}
+
+double
+set_double_exponent_double (double a, unsigned long e)
+{
+  return __builtin_vec_scalar_insert_exp (a, e);
+}
+
+TYPE
+set_float128_exponent_float128 (TYPE a, __uint128_t e)
+{
+  return __builtin_vec_scalar_insert_exp (a, e);
+}
+
+TYPE
+sqrt_odd (TYPE a)
+{
+  return __builtin_sqrtf128_round_to_odd (a);
+}
+
+double
+trunc_odd (TYPE a)
+{
+  return __builtin_truncf128_round_to_odd (a);
+}
+
+TYPE
+add_odd (TYPE a, TYPE b)
+{
+  return __builtin_addf128_round_to_odd (a, b);
+}
+
+TYPE
+sub_odd (TYPE a, TYPE b)
+{
+  return __builtin_subf128_round_to_odd (a, b);
+}
+
+TYPE
+mul_odd (TYPE a, TYPE b)
+{
+  return __builtin_mulf128_round_to_odd (a, b);
+}
+
+TYPE
+div_odd (TYPE a, TYPE b)
+{
+  return __builtin_divf128_round_to_odd (a, b);
+}
+
+TYPE
+fma_odd (TYPE a, TYPE b, TYPE c)
+{
+  return __builtin_fmaf128_round_to_odd (a, b, c);
+}
+
+TYPE
+fms_odd (TYPE a, TYPE b, TYPE c)
+{
+  return __builtin_fmaf128_round_to_odd (a, b, -c);
+}
+
+TYPE
+nfma_odd (TYPE a, TYPE b, TYPE c)
+{
+  return -__builtin_fmaf128_round_to_odd (a, b, c);
+}
+
+TYPE
+nfms_odd (TYPE a, TYPE b, TYPE c)
+{
+  return -__builtin_fmaf128_round_to_odd (a, b, -c);
+}
+
+/* { dg-final { scan-assembler 	   {\mxsiexpdp\M}   } } */
+/* { dg-final { scan-assembler 	   {\mxsiexpqp\M}   } } */
+/* { dg-final { scan-assembler 	   {\mxsxexpdp\M}   } } */
+/* { dg-final { scan-assembler 	   {\mxsxexpqp\M}   } } */
+/* { dg-final { scan-assembler 	   {\mxsxsigdp\M}   } } */
+/* { dg-final { scan-assembler 	   {\mxsxsigqp\M}   } } */
+/* { dg-final { scan-assembler 	   {\mxsaddqpo\M}   } } */
+/* { dg-final { scan-assembler 	   {\mxsdivqpo\M}   } } */
+/* { dg-final { scan-assembler 	   {\mxsmaddqpo\M}  } } */
+/* { dg-final { scan-assembler 	   {\mxsmsubqpo\M}  } } */
+/* { dg-final { scan-assembler 	   {\mxsmulqpo\M}   } } */
+/* { dg-final { scan-assembler 	   {\mxsnmaddqpo\M} } } */
+/* { dg-final { scan-assembler 	   {\mxsnmsubqpo\M} } } */
+/* { dg-final { scan-assembler 	   {\mxssqrtqpo\M}  } } */
+/* { dg-final { scan-assembler 	   {\mxssubqpo\M}   } } */
+/* { dg-final { scan-assembler-not {\mbl\M}         } } */
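
The test above is compile-only and only scans for the expected instructions.
As a reading aid, here is a minimal portable sketch of the values the
double-precision extract built-ins are expected to produce, assuming the usual
IEEE 754 binary64 layout.  It is not part of the patch; the helper names
portable_extract_exp and portable_extract_sig are hypothetical, and the
implicit-bit handling reflects my reading of the ISA 3.0 descriptions of
xsxexpdp and xsxsigdp rather than anything stated in this patch.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for the value xsxexpdp is expected to return:
   the 11-bit biased exponent field of an IEEE 754 binary64 value.  */
static unsigned int
portable_extract_exp (double a)
{
  uint64_t bits;

  /* memcpy avoids aliasing problems when reinterpreting the bits.  */
  memcpy (&bits, &a, sizeof bits);
  return (unsigned int) ((bits >> 52) & 0x7ff);
}

/* Hypothetical stand-in for the value xsxsigdp is expected to return:
   the 52-bit fraction, with the implicit leading bit set only for
   normal numbers (assumption based on the ISA 3.0 description).  */
static uint64_t
portable_extract_sig (double a)
{
  uint64_t bits;
  uint64_t frac;
  unsigned int exp;

  memcpy (&bits, &a, sizeof bits);
  frac = bits & 0xfffffffffffffULL;
  exp = (unsigned int) ((bits >> 52) & 0x7ff);

  /* Zero and denormals (exp == 0) and Inf/NaN (exp == 0x7ff) have no
     implicit bit.  */
  if (exp != 0 && exp != 0x7ff)
    frac |= (uint64_t) 1 << 52;

  return frac;
}
```

The quad-precision variants tested in float128-hw4.c follow the same idea with
a 15-bit exponent and a 112-bit fraction, which is why get_float128_mantissa
returns __uint128_t.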