Add PowerPC ISA 3.0 IEEE 128-bit floating point round to odd built-in functions

Message ID 20170928223423.GA3629@ibm-tiger.the-meissners.org
State New

Commit Message

Michael Meissner Sept. 28, 2017, 10:34 p.m. UTC
This patch adds built-in functions on PowerPC ISA 3.0 (power9) that allow the
user to access the round to odd IEEE 128-bit floating point instructions.
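
As a rough illustration of how such a built-in might be used, the sketch
below adds two 128-bit values and returns the sum as a double.  The names
narrowing_add, narrowing_add_selfcheck and wide_t, and the long double
fallback, are assumptions made up for this example; only
__builtin_addf128_round_to_odd comes from the patch.  The guard macro
__FLOAT128_HARDWARE__ is assumed to be defined when the ISA 3.0 IEEE 128-bit
hardware support is enabled.

```c
#include <assert.h>

#if defined (__FLOAT128_HARDWARE__)
/* ISA 3.0 path: the sum is rounded to odd in 128-bit precision, so the
   later conversion to double is the only rounding that decides the final
   value.  */
typedef __float128 wide_t;

double
narrowing_add (wide_t a, wide_t b)
{
  return (double) __builtin_addf128_round_to_odd (a, b);
}
#else
/* Hypothetical fallback so the sketch builds elsewhere: the add rounds in
   the current mode and the conversion rounds again, so inexact results can
   suffer from double rounding.  */
typedef long double wide_t;

double
narrowing_add (wide_t a, wide_t b)
{
  return (double) (a + b);
}
#endif

/* Exactly representable inputs give the same answer on both paths.  */
void
narrowing_add_selfcheck (void)
{
  assert (narrowing_add (1.0, 2.0) == 3.0);
}
```

Both paths agree on exactly representable sums; they can only differ when
the 128-bit sum is inexact and lies close to a halfway point between two
double values.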

I have checked it on a little endian power8 system doing a bootstrap and make
check.  There were no regressions in the testsuite.  I verified that the new
test (float128-odd.c) did run successfully.  Can I check this patch into the
trunk?

[gcc]
2017-09-28  Michael Meissner  <meissner@linux.vnet.ibm.com>

	* config/rs6000/rs6000-builtin.def (BU_FLOAT128_2_HW): Define new
	helper macro for IEEE float128 hardware built-in functions.
	(SQRTF128_ODD): Add built-in functions with the round-to-odd
	semantics.
	(TRUNCF128_ODD): Likewise.
	(ADDF128_ODD): Likewise.
	(SUBF128_ODD): Likewise.
	(MULF128_ODD): Likewise.
	(DIVF128_ODD): Likewise.
	(FMAF128_ODD): Likewise.
	* config/rs6000/rs6000.md (trunc<mode>sf2_hw): Change the truncate
	with round to odd expansion to use float_truncate:DF inside of the
	UNSPEC to better document what the insn does.
	(add<mode>3_odd): Add insns for IEEE 128-bit floating point round
	to odd hardware instructions.
	(sub<mode>3_odd): Likewise.
	(mul<mode>3_odd): Likewise.
	(div<mode>3_odd): Likewise.
	(sqrt<mode>2_odd): Likewise.
	(fma<mode>4_odd): Likewise.
	(fms<mode>4_odd): Likewise.
	(nfma<mode>4_odd): Likewise.
	(nfms<mode>4_odd): Likewise.
	(trunc<mode>df2_odd): Change insn format to make it more readable,
	and add a generator function.
	* doc/extend.texi (PowerPC built-in functions): Update documentation
	for existing IEEE float128-bit built-in functions.  Add built-in
	functions that generate the IEEE 128-bit floating point round to
	odd instructions.

[gcc/testsuite]
2017-09-28  Michael Meissner  <meissner@linux.vnet.ibm.com>

	* gcc.target/powerpc/float128-odd.c: New test.

Comments

Segher Boessenkool Sept. 29, 2017, 5:10 p.m. UTC | #1
Hi Mike,

On Thu, Sep 28, 2017 at 06:34:23PM -0400, Michael Meissner wrote:
> This patch adds built-in functions on PowerPC ISA 3.0 (power9) that allow the
> user to access the round to odd IEEE 128-bit floating point instructions.

> --- gcc/config/rs6000/rs6000.md	(revision 253267)
> +++ gcc/config/rs6000/rs6000.md	(working copy)
> @@ -14505,7 +14505,9 @@ (define_insn_and_split "trunc<mode>sf2_h
>    "#"
>    "&& 1"
>    [(set (match_dup 2)
> -	(unspec:DF [(match_dup 1)] UNSPEC_ROUND_TO_ODD))
> +	(unspec:DF [(float_truncate:DF
> +		     (match_dup 1))]
> +		   UNSPEC_ROUND_TO_ODD))
>     (set (match_dup 0)
>  	(float_truncate:SF (match_dup 2)))]
>  {

I don't think this is correct.  It says to first truncate the f128 to DF,
and then round it to odd; I think you want to do the truncation with
round-to-odd rounding mode already.
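
The difference between the two readings can be sketched in portable C by
modelling the narrowing on unsigned integers rather than on IEEE values.
The helper names below are made up for illustration and are not part of the
patch.  Dropping 8 low bits in two steps of 4 stands in for a KF -> DF -> SF
style narrowing: if the first step rounds to nearest, nothing is left for a
later round to odd to act on, whereas doing the first step itself in round
to odd keeps the final round to nearest correct.

```c
#include <assert.h>
#include <stdint.h>

/* Drop the low K bits of X, rounding to nearest with ties to even.  */
uint64_t
drop_bits_nearest_even (uint64_t x, unsigned k)
{
  assert (k >= 1 && k < 64);

  uint64_t q = x >> k;
  uint64_t rem = x & ((UINT64_C (1) << k) - 1);
  uint64_t half = UINT64_C (1) << (k - 1);

  if (rem > half || (rem == half && (q & 1)))
    q++;
  return q;
}

/* Drop the low K bits of X, rounding to odd: truncate, then force the
   lowest kept bit to 1 if any discarded bit was set.  */
uint64_t
drop_bits_round_to_odd (uint64_t x, unsigned k)
{
  assert (k >= 1 && k < 64);

  uint64_t q = x >> k;

  if (x & ((UINT64_C (1) << k) - 1))
    q |= 1;
  return q;
}

/* Return 1 if, for every 16-bit input, narrowing by 4 bits with round to
   odd and then by 4 more bits with round to nearest gives the same result
   as one direct round to nearest by 8 bits.  */
int
round_to_odd_matches_direct (void)
{
  for (uint64_t x = 0; x <= 0xffff; x++)
    if (drop_bits_nearest_even (drop_bits_round_to_odd (x, 4), 4)
	!= drop_bits_nearest_even (x, 8))
      return 0;
  return 1;
}
```

For the input 0x081, dropping 8 bits directly gives 1, two round-to-nearest
steps of 4 bits give 0, and round to odd followed by round to nearest gives
1 again.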

> +(define_insn "mul<mode>3_odd"
> +  [(set (match_operand:IEEE128 0 "altivec_register_operand" "=v")
> +	(unspec:IEEE128
> +	 [(mult:IEEE128
> +	   (match_operand:IEEE128 1 "altivec_register_operand" "v")
> +	   (match_operand:IEEE128 2 "altivec_register_operand" "v"))]
> +	 UNSPEC_ROUND_TO_ODD))]
> +  "TARGET_FLOAT128_HW && FLOAT128_IEEE_P (<MODE>mode)"
> +  "xsmulqpo %0,%1,%2"
> +  [(set_attr "type" "vecfloat")
> +   (set_attr "size" "128")])

Similar here (and everywhere else): it does an f128 mul, so rounding
with whatever rounding mode is current, and *then* it rounds to odd.

> +(define_insn "sqrt<mode>2_odd"
> +  [(set (match_operand:IEEE128 0 "altivec_register_operand" "=v")
> +	(unspec:IEEE128
> +	 [(sqrt:IEEE128
> +	   (match_operand:IEEE128 1 "altivec_register_operand" "v"))]
> +	 UNSPEC_ROUND_TO_ODD))]
> +  "TARGET_FLOAT128_HW && FLOAT128_IEEE_P (<MODE>mode)"
> +   "xssqrtqpo %0,%1"

(One space too many here).

Everything else looks fine, but that unspec thing needs fixing.  Can be
later, things will likely work for now, so okay for trunk.  Thanks.

How do other ports deal with this?  Insns with a specific rounding mode?
Have a separate unspec for every operation?  Not very nice either :-(


Segher
Joseph Myers Sept. 29, 2017, 8:31 p.m. UTC | #2
On Fri, 29 Sep 2017, Segher Boessenkool wrote:

> How do other ports deal with this?  Insns with a specific rounding mode?
> Have a separate unspec for every operation?  Not very nice either :-(

Well, ideally you'd have a machine-independent representation of constant 
rounding modes that could be used with the TS 18661-1 FENV_ROUND pragma, 
respectively FENV_DEC_ROUND for decimal floating point (as the standard 
machine-independent way of accessing such a facility at the C language 
level - you'd then need to extend it to handle round-to-odd, but given the 
basic facility, accepting additional rounding mode names with it should be 
easy).  But I don't know what that would look like in either GIMPLE or 
RTL, and I'd certainly expect it to be a large project (quite likely 
depending on other large projects to handle dynamic rounding modes 
properly through optimizers).  So you probably can't do much better than 
lots of unspecs and machine-specific built-in functions at present.
Joseph Myers Sept. 29, 2017, 8:42 p.m. UTC | #3
On Fri, 29 Sep 2017, Joseph Myers wrote:

> On Fri, 29 Sep 2017, Segher Boessenkool wrote:
> 
> > How do other ports deal with this?  Insns with a specific rounding mode?
> > Have a separate unspec for every operation?  Not very nice either :-(
> 
> Well, ideally you'd have a machine-independent representation of constant 
> rounding modes that could be used with the TS 18661-1 FENV_ROUND pragma, 
> respectively FENV_DEC_ROUND for decimal floating point (as the standard 
> machine-independent way of accessing such a facility at the C language 
> level - you'd then need to extend it to handle round-to-odd, but given the 
> basic facility, accepting additional rounding mode names with it should be 
> easy).  But I don't know what that would look like in either GIMPLE or 
> RTL, and I'd certainly expect it to be a large project (quite likely 
> depending on other large projects to handle dynamic rounding modes 
> properly through optimizers).  So you probably can't do much better than 
> lots of unspecs and machine-specific built-in functions at present.

(But the answer to your question seems to be that AVX512 uses something 
involving UNSPEC_EMBEDDED_ROUNDING.)
Segher Boessenkool Oct. 2, 2017, 10:46 a.m. UTC | #4
On Fri, Sep 29, 2017 at 08:42:45PM +0000, Joseph Myers wrote:
> On Fri, 29 Sep 2017, Joseph Myers wrote:
> 
> > On Fri, 29 Sep 2017, Segher Boessenkool wrote:
> > 
> > > How do other ports deal with this?  Insns with a specific rounding mode?
> > > Have a separate unspec for every operation?  Not very nice either :-(
> > 
> > Well, ideally you'd have a machine-independent representation of constant 
> > rounding modes that could be used with the TS 18661-1 FENV_ROUND pragma, 
> > respectively FENV_DEC_ROUND for decimal floating point (as the standard 
> > machine-independent way of accessing such a facility at the C language 
> > level - you'd then need to extend it to handle round-to-odd, but given the 
> > basic facility, accepting additional rounding mode names with it should be 
> > easy).  But I don't know what that would look like in either GIMPLE or 
> > RTL, and I'd certainly expect it to be a large project (quite likely 
> > depending on other large projects to handle dynamic rounding modes 
> > properly through optimizers).  So you probably can't do much better than 
> > lots of unspecs and machine-specific built-in functions at present.
> 
> (But the answer to your question seems to be that AVX512 uses something 
> involving UNSPEC_EMBEDDED_ROUNDING.)

Thanks.

So this seems to be the same as Mike's patch does.  I was hoping for a
nifty trick, not another huge project :-)  Oh well.


Segher
Michael Meissner Oct. 2, 2017, 6:01 p.m. UTC | #5
On Fri, Sep 29, 2017 at 12:10:07PM -0500, Segher Boessenkool wrote:
> Hi Mike,
> 
> On Thu, Sep 28, 2017 at 06:34:23PM -0400, Michael Meissner wrote:
> > This patch adds built-in functions on PowerPC ISA 3.0 (power9) that allow the
> > user to access the round to odd IEEE 128-bit floating point instructions.
> 
> > --- gcc/config/rs6000/rs6000.md	(revision 253267)
> > +++ gcc/config/rs6000/rs6000.md	(working copy)
> > @@ -14505,7 +14505,9 @@ (define_insn_and_split "trunc<mode>sf2_h
> >    "#"
> >    "&& 1"
> >    [(set (match_dup 2)
> > -	(unspec:DF [(match_dup 1)] UNSPEC_ROUND_TO_ODD))
> > +	(unspec:DF [(float_truncate:DF
> > +		     (match_dup 1))]
> > +		   UNSPEC_ROUND_TO_ODD))
> >     (set (match_dup 0)
> >  	(float_truncate:SF (match_dup 2)))]
> >  {
> 
> I don't think this is correct.  It says to first truncate the f128 to DF,
> and then round it to odd; I think you want to do the truncation with
> round-to-odd rounding mode already.
> 
> > +(define_insn "mul<mode>3_odd"
> > +  [(set (match_operand:IEEE128 0 "altivec_register_operand" "=v")
> > +	(unspec:IEEE128
> > +	 [(mult:IEEE128
> > +	   (match_operand:IEEE128 1 "altivec_register_operand" "v")
> > +	   (match_operand:IEEE128 2 "altivec_register_operand" "v"))]
> > +	 UNSPEC_ROUND_TO_ODD))]
> > +  "TARGET_FLOAT128_HW && FLOAT128_IEEE_P (<MODE>mode)"
> > +  "xsmulqpo %0,%1,%2"
> > +  [(set_attr "type" "vecfloat")
> > +   (set_attr "size" "128")])
> 
> Similar here (and everywhere else): it does an f128 mul, so rounding
> with whatever rounding mode is current, and *then* it rounds to odd.
> 
> > +(define_insn "sqrt<mode>2_odd"
> > +  [(set (match_operand:IEEE128 0 "altivec_register_operand" "=v")
> > +	(unspec:IEEE128
> > +	 [(sqrt:IEEE128
> > +	   (match_operand:IEEE128 1 "altivec_register_operand" "v"))]
> > +	 UNSPEC_ROUND_TO_ODD))]
> > +  "TARGET_FLOAT128_HW && FLOAT128_IEEE_P (<MODE>mode)"
> > +   "xssqrtqpo %0,%1"
> 
> (One space too many here).
> 
> Everything else looks fine, but that unspec thing needs fixing.  Can be
> later, things will likely work for now, so okay for trunk.  Thanks.
> 
> How do other ports deal with this?  Insns with a specific rounding mode?
> Have a separate unspec for every operation?  Not very nice either :-(

I just want to do the minimal work so the glibc people can use this
instruction, but I am not that motivated to go back and modify it further.  So,
I would prefer to do it right before committing it, unless you want to refine
it.

As I see it, there are 2 ways to encode the RTL:

    1)	Use one UNSPEC name, and add the operation inside of the spec;

    2)	Add separate UNSPECs for each operation.

Either one is fine with me.  If you would prefer separate names, I can do it.

In theory we could do something like define_subst, but learning how to use it
would be a large waste of time for 8 specialized instructions.

Similarly, I don't see the need for doing more general support, until we have
more data types and more rounding modes.
Segher Boessenkool Oct. 2, 2017, 7:12 p.m. UTC | #6
Hi!

On Mon, Oct 02, 2017 at 02:01:57PM -0400, Michael Meissner wrote:
> On Fri, Sep 29, 2017 at 12:10:07PM -0500, Segher Boessenkool wrote:
> > On Thu, Sep 28, 2017 at 06:34:23PM -0400, Michael Meissner wrote:
> > > --- gcc/config/rs6000/rs6000.md	(revision 253267)
> > > +++ gcc/config/rs6000/rs6000.md	(working copy)
> > > @@ -14505,7 +14505,9 @@ (define_insn_and_split "trunc<mode>sf2_h
> > >    "#"
> > >    "&& 1"
> > >    [(set (match_dup 2)
> > > -	(unspec:DF [(match_dup 1)] UNSPEC_ROUND_TO_ODD))
> > > +	(unspec:DF [(float_truncate:DF
> > > +		     (match_dup 1))]
> > > +		   UNSPEC_ROUND_TO_ODD))
> > >     (set (match_dup 0)
> > >  	(float_truncate:SF (match_dup 2)))]
> > >  {
> > 
> > I don't think this is correct.  It says to first truncate the f128 to DF,
> > and then round it to odd; I think you want to do the truncation with
> > round-to-odd rounding mode already.
> > 
> > > +(define_insn "mul<mode>3_odd"
> > > +  [(set (match_operand:IEEE128 0 "altivec_register_operand" "=v")
> > > +	(unspec:IEEE128
> > > +	 [(mult:IEEE128
> > > +	   (match_operand:IEEE128 1 "altivec_register_operand" "v")
> > > +	   (match_operand:IEEE128 2 "altivec_register_operand" "v"))]
> > > +	 UNSPEC_ROUND_TO_ODD))]
> > > +  "TARGET_FLOAT128_HW && FLOAT128_IEEE_P (<MODE>mode)"
> > > +  "xsmulqpo %0,%1,%2"
> > > +  [(set_attr "type" "vecfloat")
> > > +   (set_attr "size" "128")])
> > 
> > Similar here (and everywhere else): it does an f128 mul, so rounding
> > with whatever rounding mode is current, and *then* it rounds to odd.

> > Everything else looks fine, but that unspec thing needs fixing.  Can be
> > later, things will likely work for now, so okay for trunk.  Thanks.
> > 
> > How do other ports deal with this?  Insns with a specific rounding mode?
> > Have a separate unspec for every operation?  Not very nice either :-(
> 
> I just want to do the minimal work so the glibc people can use this
> instruction, but I am not that motivated to go back and modify it further.  So,
> I would prefer to do it right before committing it, unless you want to refine
> it.
> 
> As I see it, there are 2 ways to encode the RTL:
> 
>     1)	Use one UNSPEC name, and add the operation inside of the spec;

But this is incorrect as far as I see: it does not correctly describe
what the instruction does.

>     2)	Add separate UNSPECs for each operation.
> 
> Either one is fine with me.  If you would prefer separate names, I can do it.

Yes please.

Thanks,


Segher
Michael Meissner Oct. 3, 2017, 8:15 p.m. UTC | #7
Here is the patch to add the round to odd instructions using separate UNSPEC
names instead of putting the operation into the unspec.

I have done a bootstrap and check on a little endian power8 system and there
were no regressions.  Can I check this into the trunk?

[gcc]
2017-10-03  Michael Meissner  <meissner@linux.vnet.ibm.com>

	* config/rs6000/rs6000-builtin.def (BU_FLOAT128_2_HW): Define new
	helper macro for IEEE float128 hardware built-in functions.
	(SQRTF128_ODD): Add built-in functions with the round-to-odd
	semantics.
	(TRUNCF128_ODD): Likewise.
	(ADDF128_ODD): Likewise.
	(SUBF128_ODD): Likewise.
	(MULF128_ODD): Likewise.
	(DIVF128_ODD): Likewise.
	(FMAF128_ODD): Likewise.
	* config/rs6000/rs6000.md (UNSPEC_ROUND_TO_ODD): Rename to
	UNSPEC_TRUNC_ROUND_TO_ODD.
	(UNSPEC_TRUNC_ROUND_TO_ODD): Likewise.
	(UNSPEC_ADD_ROUND_TO_ODD): New unspec codes for the IEEE 128-bit
	floating point round to odd instructions.
	(UNSPEC_SUB_ROUND_TO_ODD): Likewise.
	(UNSPEC_MUL_ROUND_TO_ODD): Likewise.
	(UNSPEC_DIV_ROUND_TO_ODD): Likewise.
	(UNSPEC_FMA_ROUND_TO_ODD): Likewise.
	(UNSPEC_SQRT_ROUND_TO_ODD): Likewise.
	(trunc<mode>sf2_hw): Change the truncate with round to odd
	expansion to use UNSPEC_TRUNC_ROUND_TO_ODD.
	(add<mode>3_odd): Add insns for IEEE 128-bit floating point round
	to odd hardware instructions.
	(sub<mode>3_odd): Likewise.
	(mul<mode>3_odd): Likewise.
	(div<mode>3_odd): Likewise.
	(sqrt<mode>2_odd): Likewise.
	(fma<mode>4_odd): Likewise.
	(fms<mode>4_odd): Likewise.
	(nfma<mode>4_odd): Likewise.
	(nfms<mode>4_odd): Likewise.
	(trunc<mode>df2_odd): Change the truncate with round to odd
	expansion to use UNSPEC_TRUNC_ROUND_TO_ODD.  Add a generator
	function.
	* doc/extend.texi (PowerPC built-in functions): Update documentation
	for existing IEEE float128-bit built-in functions.  Add built-in
	functions that generate the IEEE 128-bit floating point round to
	odd instructions.

[gcc/testsuite]
2017-10-03  Michael Meissner  <meissner@linux.vnet.ibm.com>

	* gcc.target/powerpc/float128-odd.c: New test.
Segher Boessenkool Oct. 3, 2017, 10:21 p.m. UTC | #8
On Tue, Oct 03, 2017 at 04:15:23PM -0400, Michael Meissner wrote:
> Here is the patch to add the round to odd instructions using separate UNSPEC
> names instead of putting the operation into the unspec.
> 
> I have done a bootstrap and check on a little endian power8 system and there
> were no regressions.  Can I check this into the trunk?

This looks fine.  Okay for trunk.  Thanks!


Segher


> 2017-10-03  Michael Meissner  <meissner@linux.vnet.ibm.com>
> 
> 	* config/rs6000/rs6000-builtin.def (BU_FLOAT128_2_HW): Define new
> 	helper macro for IEEE float128 hardware built-in functions.
> 	(SQRTF128_ODD): Add built-in functions with the round-to-odd
> 	semantics.
> 	(TRUNCF128_ODD): Likewise.
> 	(ADDF128_ODD): Likewise.
> 	(SUBF128_ODD): Likewise.
> 	(MULF128_ODD): Likewise.
> 	(DIVF128_ODD): Likewise.
> 	(FMAF128_ODD): Likewise.
> 	* config/rs6000/rs6000.md (UNSPEC_ROUND_TO_ODD): Rename to
> 	UNSPEC_TRUNC_ROUND_TO_ODD.
> 	(UNSPEC_TRUNC_ROUND_TO_ODD): Likewise.
> 	(UNSPEC_ADD_ROUND_TO_ODD): New unspec codes for the IEEE 128-bit
> 	floating point round to odd instructions.
> 	(UNSPEC_SUB_ROUND_TO_ODD): Likewise.
> 	(UNSPEC_MUL_ROUND_TO_ODD): Likewise.
> 	(UNSPEC_DIV_ROUND_TO_ODD): Likewise.
> 	(UNSPEC_FMA_ROUND_TO_ODD): Likewise.
> 	(UNSPEC_SQRT_ROUND_TO_ODD): Likewise.
> 	(trunc<mode>sf2_hw): Change the truncate with round to odd
> 	expansion to use UNSPEC_TRUNC_ROUND_TO_ODD.
> 	(add<mode>3_odd): Add insns for IEEE 128-bit floating point round
> 	to odd hardware instructions.
> 	(sub<mode>3_odd): Likewise.
> 	(mul<mode>3_odd): Likewise.
> 	(div<mode>3_odd): Likewise.
> 	(sqrt<mode>2_odd): Likewise.
> 	(fma<mode>4_odd): Likewise.
> 	(fms<mode>4_odd): Likewise.
> 	(nfma<mode>4_odd): Likewise.
> 	(nfms<mode>4_odd): Likewise.
> 	(trunc<mode>df2_odd): Change the truncate with round to odd
> 	expansion to use UNSPEC_TRUNC_ROUND_TO_ODD.  Add a generator
> 	function.
> 	* doc/extend.texi (PowerPC built-in functions): Update documentation
> 	for existing IEEE float128-bit built-in functions.  Add built-in
> 	functions that generate the IEEE 128-bit floating point round to
> 	odd instructions.

Patch

Index: gcc/config/rs6000/rs6000-builtin.def
===================================================================
--- gcc/config/rs6000/rs6000-builtin.def	(revision 253267)
+++ gcc/config/rs6000/rs6000-builtin.def	(working copy)
@@ -686,6 +686,14 @@ 
 		     | RS6000_BTC_UNARY),                               \
 		    CODE_FOR_ ## ICODE)                 /* ICODE */
 
+#define BU_FLOAT128_2_HW(ENUM, NAME, ATTR, ICODE)                       \
+  RS6000_BUILTIN_2 (MISC_BUILTIN_ ## ENUM,              /* ENUM */      \
+		    "__builtin_" NAME,                  /* NAME */      \
+		    RS6000_BTM_FLOAT128_HW,             /* MASK */      \
+		    (RS6000_BTC_ ## ATTR                /* ATTR */      \
+		     | RS6000_BTC_BINARY),                              \
+		    CODE_FOR_ ## ICODE)                 /* ICODE */
+
 #define BU_FLOAT128_3_HW(ENUM, NAME, ATTR, ICODE)                       \
   RS6000_BUILTIN_3 (MISC_BUILTIN_ ## ENUM,              /* ENUM */      \
 		    "__builtin_" NAME,                  /* NAME */      \
@@ -2365,11 +2373,19 @@  BU_P9_OVERLOAD_2 (CMPEQB,	"byte_in_set")
 BU_FLOAT128_1 (FABSQ,		"fabsq",       CONST, abskf2)
 BU_FLOAT128_2 (COPYSIGNQ,	"copysignq",   CONST, copysignkf3)
 
-/* 1 and 3 argument IEEE 128-bit floating point functions that require ISA 3.0
-   hardware.  These functions use the new 'f128' suffix.  Eventually these
-   should be folded into the common built-in function handling. */
-BU_FLOAT128_1_HW (SQRTF128,	"sqrtf128",	CONST, sqrtkf2)
-BU_FLOAT128_3_HW (FMAF128,	"fmaf128",	CONST, fmakf4_hw)
+/* 1, 2, and 3 argument IEEE 128-bit floating point functions that require ISA
+   3.0 hardware.  These functions use the new 'f128' suffix.  Eventually the
+   standard functions should be folded into the common built-in function
+   handling. */
+BU_FLOAT128_1_HW (SQRTF128,	 "sqrtf128",		   CONST, sqrtkf2)
+BU_FLOAT128_1_HW (SQRTF128_ODD,	 "sqrtf128_round_to_odd",  CONST, sqrtkf2_odd)
+BU_FLOAT128_1_HW (TRUNCF128_ODD, "truncf128_round_to_odd", CONST, trunckfdf2_odd)
+BU_FLOAT128_2_HW (ADDF128_ODD,	 "addf128_round_to_odd",   CONST, addkf3_odd)
+BU_FLOAT128_2_HW (SUBF128_ODD,	 "subf128_round_to_odd",   CONST, subkf3_odd)
+BU_FLOAT128_2_HW (MULF128_ODD,	 "mulf128_round_to_odd",   CONST, mulkf3_odd)
+BU_FLOAT128_2_HW (DIVF128_ODD,	 "divf128_round_to_odd",   CONST, divkf3_odd)
+BU_FLOAT128_3_HW (FMAF128,	 "fmaf128",		   CONST, fmakf4_hw)
+BU_FLOAT128_3_HW (FMAF128_ODD,	 "fmaf128_round_to_odd",   CONST, fmakf4_odd)
 
 /* 1 argument crypto functions.  */
 BU_CRYPTO_1 (VSBOX,		"vsbox",	  CONST, crypto_vsbox)
Index: gcc/config/rs6000/rs6000.md
===================================================================
--- gcc/config/rs6000/rs6000.md	(revision 253267)
+++ gcc/config/rs6000/rs6000.md	(working copy)
@@ -14505,7 +14505,9 @@  (define_insn_and_split "trunc<mode>sf2_h
   "#"
   "&& 1"
   [(set (match_dup 2)
-	(unspec:DF [(match_dup 1)] UNSPEC_ROUND_TO_ODD))
+	(unspec:DF [(float_truncate:DF
+		     (match_dup 1))]
+		   UNSPEC_ROUND_TO_ODD))
    (set (match_dup 0)
 	(float_truncate:SF (match_dup 2)))]
 {
@@ -14682,9 +14684,125 @@  (define_insn_and_split "floatuns<QHI:mod
    (set_attr "size" "128")])
 
 ;; IEEE 128-bit instructions with round to odd semantics
-(define_insn "*trunc<mode>df2_odd"
+(define_insn "add<mode>3_odd"
+  [(set (match_operand:IEEE128 0 "altivec_register_operand" "=v")
+	(unspec:IEEE128
+	 [(plus:IEEE128
+	   (match_operand:IEEE128 1 "altivec_register_operand" "v")
+	   (match_operand:IEEE128 2 "altivec_register_operand" "v"))]
+	 UNSPEC_ROUND_TO_ODD))]
+  "TARGET_FLOAT128_HW && FLOAT128_IEEE_P (<MODE>mode)"
+  "xsaddqpo %0,%1,%2"
+  [(set_attr "type" "vecfloat")
+   (set_attr "size" "128")])
+
+(define_insn "sub<mode>3_odd"
+  [(set (match_operand:IEEE128 0 "altivec_register_operand" "=v")
+	(unspec:IEEE128
+	 [(minus:IEEE128
+	   (match_operand:IEEE128 1 "altivec_register_operand" "v")
+	   (match_operand:IEEE128 2 "altivec_register_operand" "v"))]
+	 UNSPEC_ROUND_TO_ODD))]
+  "TARGET_FLOAT128_HW && FLOAT128_IEEE_P (<MODE>mode)"
+  "xssubqpo %0,%1,%2"
+  [(set_attr "type" "vecfloat")
+   (set_attr "size" "128")])
+
+(define_insn "mul<mode>3_odd"
+  [(set (match_operand:IEEE128 0 "altivec_register_operand" "=v")
+	(unspec:IEEE128
+	 [(mult:IEEE128
+	   (match_operand:IEEE128 1 "altivec_register_operand" "v")
+	   (match_operand:IEEE128 2 "altivec_register_operand" "v"))]
+	 UNSPEC_ROUND_TO_ODD))]
+  "TARGET_FLOAT128_HW && FLOAT128_IEEE_P (<MODE>mode)"
+  "xsmulqpo %0,%1,%2"
+  [(set_attr "type" "vecfloat")
+   (set_attr "size" "128")])
+
+(define_insn "div<mode>3_odd"
+  [(set (match_operand:IEEE128 0 "altivec_register_operand" "=v")
+	(unspec:IEEE128
+	 [(div:IEEE128
+	   (match_operand:IEEE128 1 "altivec_register_operand" "v")
+	   (match_operand:IEEE128 2 "altivec_register_operand" "v"))]
+	 UNSPEC_ROUND_TO_ODD))]
+  "TARGET_FLOAT128_HW && FLOAT128_IEEE_P (<MODE>mode)"
+  "xsdivqpo %0,%1,%2"
+  [(set_attr "type" "vecdiv")
+   (set_attr "size" "128")])
+
+(define_insn "sqrt<mode>2_odd"
+  [(set (match_operand:IEEE128 0 "altivec_register_operand" "=v")
+	(unspec:IEEE128
+	 [(sqrt:IEEE128
+	   (match_operand:IEEE128 1 "altivec_register_operand" "v"))]
+	 UNSPEC_ROUND_TO_ODD))]
+  "TARGET_FLOAT128_HW && FLOAT128_IEEE_P (<MODE>mode)"
+   "xssqrtqpo %0,%1"
+  [(set_attr "type" "vecdiv")
+   (set_attr "size" "128")])
+
+(define_insn "fma<mode>4_odd"
+  [(set (match_operand:IEEE128 0 "altivec_register_operand" "=v")
+	(unspec:IEEE128
+	 [(fma:IEEE128
+	   (match_operand:IEEE128 1 "altivec_register_operand" "%v")
+	   (match_operand:IEEE128 2 "altivec_register_operand" "v")
+	   (match_operand:IEEE128 3 "altivec_register_operand" "0"))]
+	 UNSPEC_ROUND_TO_ODD))]
+  "TARGET_FLOAT128_HW && FLOAT128_IEEE_P (<MODE>mode)"
+  "xsmaddqpo %0,%1,%2"
+  [(set_attr "type" "vecfloat")
+   (set_attr "size" "128")])
+
+(define_insn "*fms<mode>4_odd"
+  [(set (match_operand:IEEE128 0 "altivec_register_operand" "=v")
+	(unspec:IEEE128
+	 [(fma:IEEE128
+	   (match_operand:IEEE128 1 "altivec_register_operand" "%v")
+	   (match_operand:IEEE128 2 "altivec_register_operand" "v")
+	   (neg:IEEE128
+	    (match_operand:IEEE128 3 "altivec_register_operand" "0")))]
+	 UNSPEC_ROUND_TO_ODD))]
+  "TARGET_FLOAT128_HW && FLOAT128_IEEE_P (<MODE>mode)"
+  "xsmsubqpo %0,%1,%2"
+  [(set_attr "type" "vecfloat")
+   (set_attr "size" "128")])
+
+(define_insn "*nfma<mode>4_odd"
+  [(set (match_operand:IEEE128 0 "altivec_register_operand" "=v")
+	(neg:IEEE128
+	 (unspec:IEEE128
+	  [(fma:IEEE128
+	    (match_operand:IEEE128 1 "altivec_register_operand" "%v")
+	    (match_operand:IEEE128 2 "altivec_register_operand" "v")
+	    (match_operand:IEEE128 3 "altivec_register_operand" "0"))]
+	  UNSPEC_ROUND_TO_ODD)))]
+  "TARGET_FLOAT128_HW && FLOAT128_IEEE_P (<MODE>mode)"
+  "xsnmaddqpo %0,%1,%2"
+  [(set_attr "type" "vecfloat")
+   (set_attr "size" "128")])
+
+(define_insn "*nfms<mode>4_odd"
+  [(set (match_operand:IEEE128 0 "altivec_register_operand" "=v")
+	(neg:IEEE128
+	 (unspec:IEEE128
+	  [(fma:IEEE128
+	    (match_operand:IEEE128 1 "altivec_register_operand" "%v")
+	    (match_operand:IEEE128 2 "altivec_register_operand" "v")
+	    (neg:IEEE128
+	     (match_operand:IEEE128 3 "altivec_register_operand" "0")))]
+	  UNSPEC_ROUND_TO_ODD)))]
+  "TARGET_FLOAT128_HW && FLOAT128_IEEE_P (<MODE>mode)"
+  "xsnmsubqpo %0,%1,%2"
+  [(set_attr "type" "vecfloat")
+   (set_attr "size" "128")])
+
+(define_insn "trunc<mode>df2_odd"
   [(set (match_operand:DF 0 "vsx_register_operand" "=v")
-	(unspec:DF [(match_operand:IEEE128 1 "altivec_register_operand" "v")]
+	(unspec:DF [(float_truncate:DF
+		     (match_operand:IEEE128 1 "altivec_register_operand" "v"))]
 		   UNSPEC_ROUND_TO_ODD))]
   "TARGET_FLOAT128_HW && FLOAT128_IEEE_P (<MODE>mode)"
   "xscvqpdpo %0,%1"
Index: gcc/doc/extend.texi
===================================================================
--- gcc/doc/extend.texi	(revision 253267)
+++ gcc/doc/extend.texi	(working copy)
@@ -15348,14 +15348,47 @@  that use the ISA 3.0 instruction set.
 
 @table @code
 @item __float128 __builtin_sqrtf128 (__float128)
-Similar to @code{__builtin_sqrtf}, except the return and input types
-are @code{__float128}.
+Perform a 128-bit IEEE floating point square root operation.
 @findex __builtin_sqrtf128
 
 @item __float128 __builtin_fmaf128 (__float128, __float128, __float128)
-Similar to @code{__builtin_fma}, except the return and input types are
-@code{__float128}.
+Perform a 128-bit IEEE floating point fused multiply and add operation.
 @findex __builtin_fmaf128
+
+@item __float128 __builtin_addf128_round_to_odd (__float128, __float128)
+Perform a 128-bit IEEE floating point add using round to odd as the
+rounding mode.
+@findex __builtin_addf128_round_to_odd
+
+@item __float128 __builtin_subf128_round_to_odd (__float128, __float128)
+Perform a 128-bit IEEE floating point subtract using round to odd as
+the rounding mode.
+@findex __builtin_subf128_round_to_odd
+
+@item __float128 __builtin_mulf128_round_to_odd (__float128, __float128)
+Perform a 128-bit IEEE floating point multiply using round to odd as
+the rounding mode.
+@findex __builtin_mulf128_round_to_odd
+
+@item __float128 __builtin_divf128_round_to_odd (__float128, __float128)
+Perform a 128-bit IEEE floating point divide using round to odd as
+the rounding mode.
+@findex __builtin_divf128_round_to_odd
+
+@item __float128 __builtin_sqrtf128_round_to_odd (__float128)
+Perform a 128-bit IEEE floating point square root using round to odd
+as the rounding mode.
+@findex __builtin_sqrtf128_round_to_odd
+
+@item __float128 __builtin_fmaf128_round_to_odd (__float128, __float128, __float128)
+Perform a 128-bit IEEE floating point fused multiply and add operation
+using round to odd as the rounding mode.
+@findex __builtin_fmaf128_round_to_odd
+
+@item double __builtin_truncf128_round_to_odd (__float128)
+Convert a 128-bit IEEE floating point value to @code{double} using
+round to odd as the rounding mode.
+@findex __builtin_truncf128_round_to_odd
 @end table
 
 The following built-in functions are available for the PowerPC family
Index: gcc/testsuite/gcc.target/powerpc/float128-odd.c
===================================================================
--- gcc/testsuite/gcc.target/powerpc/float128-odd.c	(nonexistent)
+++ gcc/testsuite/gcc.target/powerpc/float128-odd.c	(working copy)
@@ -0,0 +1,75 @@ 
+/* { dg-do compile { target { powerpc*-*-* && lp64 } } } */
+/* { dg-require-effective-target powerpc_p9vector_ok } */
+/* { dg-options "-mpower9-vector -O2" } */
+
+/* Test the generation of the round to odd instructions.  */
+__float128
+f128_add (__float128 a, __float128 b)
+{
+  return __builtin_addf128_round_to_odd (a, b);
+}
+
+__float128
+f128_sub (__float128 a, __float128 b)
+{
+  return __builtin_subf128_round_to_odd (a, b);
+}
+
+__float128
+f128_mul (__float128 a, __float128 b)
+{
+  return __builtin_mulf128_round_to_odd (a, b);
+}
+
+__float128
+f128_div (__float128 a, __float128 b)
+{
+  return __builtin_divf128_round_to_odd (a, b);
+}
+
+__float128
+f128_sqrt (__float128 a)
+{
+  return __builtin_sqrtf128_round_to_odd (a);
+}
+
+double
+f128_trunc (__float128 a)
+{
+  return __builtin_truncf128_round_to_odd (a);
+}
+
+__float128
+f128_fma (__float128 a, __float128 b, __float128 c)
+{
+  return __builtin_fmaf128_round_to_odd (a, b, c);
+}
+
+__float128
+f128_fms (__float128 a, __float128 b, __float128 c)
+{
+  return __builtin_fmaf128_round_to_odd (a, b, -c);
+}
+
+__float128
+f128_nfma (__float128 a, __float128 b, __float128 c)
+{
+  return - __builtin_fmaf128_round_to_odd (a, b, c);
+}
+
+__float128
+f128_nfms (__float128 a, __float128 b, __float128 c)
+{
+  return - __builtin_fmaf128_round_to_odd (a, b, -c);
+}
+
+/* { dg-final { scan-assembler {\mxsaddqpo\M}   } } */
+/* { dg-final { scan-assembler {\mxssubqpo\M}   } } */
+/* { dg-final { scan-assembler {\mxsmulqpo\M}   } } */
+/* { dg-final { scan-assembler {\mxsdivqpo\M}   } } */
+/* { dg-final { scan-assembler {\mxssqrtqpo\M}  } } */
+/* { dg-final { scan-assembler {\mxscvqpdpo\M}  } } */
+/* { dg-final { scan-assembler {\mxsmaddqpo\M}  } } */
+/* { dg-final { scan-assembler {\mxsmsubqpo\M}  } } */
+/* { dg-final { scan-assembler {\mxsnmaddqpo\M} } } */
+/* { dg-final { scan-assembler {\mxsnmsubqpo\M} } } */