, PR target/85358, Fix __ibm128 being converter to __float128 on PowerPC ISA 3.0 (power9)

Message ID 20180416174129.GA7012@ibm-tiger.the-meissners.org
State New
Headers show
Series
  • , PR target/85358, Fix __ibm128 being converter to __float128 on PowerPC ISA 3.0 (power9)
Related show

Commit Message

Michael Meissner April 16, 2018, 5:41 p.m.
As I was working on PR target/85075 (to flesh some bugs with IEEE 128-bit
support on the PowerPC, particularly with switching the default of long
double), I noticed that for explicit IBM extended double, the compiler was
converting the __ibm128 type to an IEEE 128-bit type, because those types had
hardware support in ISA 3.0 (power9).

Unfortunately, IBM extended double (a pair of doubles meant to give you more
mantissa precision but not more expoenent range), is not completely
representable in IEEE 128-bit floating point.  There are likely some exposed
corner cases involved, and the bottom bits might not be the same.

While I would prefer IBM extended double to disappear entirely, I do think we
need to deal with it for a few versions still to come.

I tried to come up with machine dependent ways to prevent the automatic
widening from occuring, but I couldn't get anything to work reliably.  This
patch adds a new target hook that says whether the automatic widening between
two modes should be allowed.  The default hook says to allow all of the default
widenings to occur, while the PowerPC override says IBM extended double does
not widen to IEEE 128-bit and vice versa.

Given we are in stage4 right now, it is not the time to add new hooks, but here
is the patch.  If it doesn't go into GCC 8, is it acceptable for being put
early into GCC 9 with a backport before 8.2 comes out?

I have tested this patch using bootstrap builds on a power8 little system, a
power7 big endian system (both 32-bit and 64-bit), and an x86_64 bit system
with no regressions.  

[gcc]
2018-04-13  Michael Meissner  <meissner@linux.vnet.ibm.com>

	PR target/85358
	* target.def (default_widening_p): New target hook to say whether
	default widening between modes should be done.
	* targhooks.h (hook_bool_mode_mode_bool_true): New declaration.
	* targhooks.c (hook_bool_mode_mode_bool_true): New default target
	hook.
	* optabs.c (expand_binop): Before doing default widening, check
	whether the backend allows the widening.
	(expand_twoval_unop): Likewise.
	(expand_twoval_binop): Likewise.
	(widen_leading): Likewise.
	(widen_bswap): Likewise.
	(expand_unop): Likewise.
	* cse.c (cse_insn): Likewise.
	* combine.c (simplify_comparison): Likewise.
	* var-tracking.c (prepare_call_arguments): Likewise.
	* config/rs6000/rs6000.c (TARGET_DEFAULT_WIDENING_P): Define
	target hook to prevent IBM extended double and IEEE 128-bit
	floating point from being converted to each by default.
	(rs6000_default_widening_p): Likewise.
	* doc/tm.texi (TARGET_DEFAULT_WIDENING_P): Document the new
	default widening hook.
	* doc/tm.texi.in (TARGET_DEFAULT_WIDENING_P): Likewise.

[gcc/testsuite]
2018-04-13  Michael Meissner  <meissner@linux.vnet.ibm.com>

	PR target/85358
	* gcc.target/powerpc/pr85358.c: New test to make sure __ibm128
	does not widen to __float128 on ISA 3.0 systems.

Comments

Segher Boessenkool April 16, 2018, 7:53 p.m. | #1
Hi!

On Mon, Apr 16, 2018 at 01:41:29PM -0400, Michael Meissner wrote:
> As I was working on PR target/85075 (to flesh some bugs with IEEE 128-bit
> support on the PowerPC, particularly with switching the default of long
> double), I noticed that for explicit IBM extended double, the compiler was
> converting the __ibm128 type to an IEEE 128-bit type, because those types had
> hardware support in ISA 3.0 (power9).
> 
> Unfortunately, IBM extended double (a pair of doubles meant to give you more
> mantissa precision but not more expoenent range), is not completely
> representable in IEEE 128-bit floating point.  There are likely some exposed
> corner cases involved, and the bottom bits might not be the same.
> 
> While I would prefer IBM extended double to disappear entirely, I do think we
> need to deal with it for a few versions still to come.

There are many subtargets that still need it, and nothing to change
that has been planned.

> I tried to come up with machine dependent ways to prevent the automatic
> widening from occuring, but I couldn't get anything to work reliably.  This
> patch adds a new target hook that says whether the automatic widening between
> two modes should be allowed.  The default hook says to allow all of the default
> widenings to occur, while the PowerPC override says IBM extended double does
> not widen to IEEE 128-bit and vice versa.
> 
> Given we are in stage4 right now, it is not the time to add new hooks, but here
> is the patch.  If it doesn't go into GCC 8, is it acceptable for being put
> early into GCC 9 with a backport before 8.2 comes out?
> 
> I have tested this patch using bootstrap builds on a power8 little system, a
> power7 big endian system (both 32-bit and 64-bit), and an x86_64 bit system
> with no regressions.  

Can't you just set up things such that GET_MODE_WIDER_TYPE does not
return ieee128 for ibm128?  With maybe a new hook yes, if that is
necessary.

> --- gcc/target.def	(revision 259376)
> +++ gcc/target.def	(working copy)
> @@ -3486,6 +3486,13 @@ If this hook allows @code{val} to have a
>   hook_bool_mode_uhwi_false)
>  
>  DEFHOOK
> +(default_widening_p,
> + "Return true if GCC can automatically widen from @var{from_mode} to\n\
> +@var{to_mode}.  Conversions are unsigned if @var{unsigned_p} is true.",
> + bool, (machine_mode, machine_mode, bool),
> + hook_bool_mode_mode_bool_true)

This should have those names (from_mode, to_mode, unsigned_p) in the
prototype then, for the generated documentation to make sense.


Segher
Michael Meissner April 17, 2018, 4:51 p.m. | #2
On Mon, Apr 16, 2018 at 02:53:35PM -0500, Segher Boessenkool wrote:
> Hi!
> 
> On Mon, Apr 16, 2018 at 01:41:29PM -0400, Michael Meissner wrote:
> > As I was working on PR target/85075 (to flesh some bugs with IEEE 128-bit
> > support on the PowerPC, particularly with switching the default of long
> > double), I noticed that for explicit IBM extended double, the compiler was
> > converting the __ibm128 type to an IEEE 128-bit type, because those types had
> > hardware support in ISA 3.0 (power9).
> > 
> > Unfortunately, IBM extended double (a pair of doubles meant to give you more
> > mantissa precision but not more expoenent range), is not completely
> > representable in IEEE 128-bit floating point.  There are likely some exposed
> > corner cases involved, and the bottom bits might not be the same.
> > 
> > While I would prefer IBM extended double to disappear entirely, I do think we
> > need to deal with it for a few versions still to come.
> 
> There are many subtargets that still need it, and nothing to change
> that has been planned.

Just to be clear, IBM extended double is fine, as long as you don't try to have
both IBM extended double and IEEE 128-bit support in the SAME program (and of
course, you can live with the various problems IBM extended double has).  The
main problem on the last 3 years has been that at its core, GCC believes there
should be only one floating point type for a given size.

> > I tried to come up with machine dependent ways to prevent the automatic
> > widening from occuring, but I couldn't get anything to work reliably.  This
> > patch adds a new target hook that says whether the automatic widening between
> > two modes should be allowed.  The default hook says to allow all of the default
> > widenings to occur, while the PowerPC override says IBM extended double does
> > not widen to IEEE 128-bit and vice versa.
> > 
> > Given we are in stage4 right now, it is not the time to add new hooks, but here
> > is the patch.  If it doesn't go into GCC 8, is it acceptable for being put
> > early into GCC 9 with a backport before 8.2 comes out?
> > 
> > I have tested this patch using bootstrap builds on a power8 little system, a
> > power7 big endian system (both 32-bit and 64-bit), and an x86_64 bit system
> > with no regressions.  
> 
> Can't you just set up things such that GET_MODE_WIDER_TYPE does not
> return ieee128 for ibm128?  With maybe a new hook yes, if that is
> necessary.

To me both are semantically the same.  Both involve adding a new hook, and
whether you change the macro or the calls, is a mattery of syntax.  As I
recall, when I first started doing __float128, there were one or two places
that wants to do wider types, but there you want to do it, even if the hook
says not to do automatic conversions.  You also don't want to outlaw explicit
conversions.

I can rewrite it if the global/release maintainers would prefer.
Segher Boessenkool April 17, 2018, 5:22 p.m. | #3
On Tue, Apr 17, 2018 at 12:51:17PM -0400, Michael Meissner wrote:
> On Mon, Apr 16, 2018 at 02:53:35PM -0500, Segher Boessenkool wrote:
> > Can't you just set up things such that GET_MODE_WIDER_TYPE does not
> > return ieee128 for ibm128?  With maybe a new hook yes, if that is
> > necessary.
> 
> To me both are semantically the same.  Both involve adding a new hook, and
> whether you change the macro or the calls, is a mattery of syntax.  As I
> recall, when I first started doing __float128, there were one or two places
> that wants to do wider types, but there you want to do it, even if the hook
> says not to do automatic conversions.  You also don't want to outlaw explicit
> conversions.
> 
> I can rewrite it if the global/release maintainers would prefer.

It should be a much smaller patch that way.  And yes, you need some
middle end maintainer's ack.


Segher

Patch

Index: gcc/target.def
===================================================================
--- gcc/target.def	(revision 259376)
+++ gcc/target.def	(working copy)
@@ -3486,6 +3486,13 @@  If this hook allows @code{val} to have a
  hook_bool_mode_uhwi_false)
 
 DEFHOOK
+(default_widening_p,
+ "Return true if GCC can automatically widen from @var{from_mode} to\n\
+@var{to_mode}.  Conversions are unsigned if @var{unsigned_p} is true.",
+ bool, (machine_mode, machine_mode, bool),
+ hook_bool_mode_mode_bool_true)
+
+DEFHOOK
 (libgcc_floating_mode_supported_p,
  "Define this to return nonzero if libgcc provides support for the \n\
 floating-point mode @var{mode}, which is known to pass \n\
Index: gcc/targhooks.c
===================================================================
--- gcc/targhooks.c	(revision 259376)
+++ gcc/targhooks.c	(working copy)
@@ -2336,4 +2336,12 @@  default_select_early_remat_modes (sbitma
 {
 }
 
+bool
+hook_bool_mode_mode_bool_true (machine_mode from_mode ATTRIBUTE_UNUSED,
+			       machine_mode to_mode ATTRIBUTE_UNUSED,
+			       bool unsigned_p ATTRIBUTE_UNUSED)
+{
+  return true;
+}
+
 #include "gt-targhooks.h"
Index: gcc/targhooks.h
===================================================================
--- gcc/targhooks.h	(revision 259376)
+++ gcc/targhooks.h	(working copy)
@@ -288,5 +288,6 @@  extern enum flt_eval_method
 default_excess_precision (enum excess_precision_type ATTRIBUTE_UNUSED);
 extern bool default_stack_clash_protection_final_dynamic_probe (rtx);
 extern void default_select_early_remat_modes (sbitmap);
+extern bool hook_bool_mode_mode_bool_true (machine_mode, machine_mode, bool);
 
 #endif /* GCC_TARGHOOKS_H */
Index: gcc/optabs.c
===================================================================
--- gcc/optabs.c	(revision 259376)
+++ gcc/optabs.c	(working copy)
@@ -1284,14 +1284,15 @@  expand_binop (machine_mode mode, optab b
     FOR_EACH_WIDER_MODE (wider_mode, mode)
       {
 	machine_mode next_mode;
-	if (optab_handler (binoptab, wider_mode) != CODE_FOR_nothing
-	    || (binoptab == smul_optab
-		&& GET_MODE_WIDER_MODE (wider_mode).exists (&next_mode)
-		&& (find_widening_optab_handler ((unsignedp
-						  ? umul_widen_optab
-						  : smul_widen_optab),
-						 next_mode, mode)
-		    != CODE_FOR_nothing)))
+	if ((optab_handler (binoptab, wider_mode) != CODE_FOR_nothing
+	     || (binoptab == smul_optab
+		 && GET_MODE_WIDER_MODE (wider_mode).exists (&next_mode)
+		 && (find_widening_optab_handler ((unsignedp
+						   ? umul_widen_optab
+						   : smul_widen_optab),
+						  next_mode, mode)
+		     != CODE_FOR_nothing)))
+	    && targetm.default_widening_p (mode, wider_mode, unsignedp))
 	  {
 	    rtx xop0 = op0, xop1 = op1;
 	    int no_extend = 0;
@@ -1834,9 +1835,10 @@  expand_binop (machine_mode mode, optab b
       gcc_assert (!convert_optab_p (binoptab));
       FOR_EACH_WIDER_MODE (wider_mode, mode)
 	{
-	  if (optab_handler (binoptab, wider_mode)
-	      || (methods == OPTAB_LIB
-		  && optab_libfunc (binoptab, wider_mode)))
+	  if ((optab_handler (binoptab, wider_mode)
+	       || (methods == OPTAB_LIB
+		   && optab_libfunc (binoptab, wider_mode)))
+	      && targetm.default_widening_p (mode, wider_mode, unsignedp))
 	    {
 	      rtx xop0 = op0, xop1 = op1;
 	      int no_extend = 0;
@@ -1989,7 +1991,8 @@  expand_twoval_unop (optab unoptab, rtx o
     {
       FOR_EACH_WIDER_MODE (wider_mode, mode)
 	{
-	  if (optab_handler (unoptab, wider_mode) != CODE_FOR_nothing)
+	  if (optab_handler (unoptab, wider_mode) != CODE_FOR_nothing
+	      && targetm.default_widening_p (mode, wider_mode, unsignedp))
 	    {
 	      rtx t0 = gen_reg_rtx (wider_mode);
 	      rtx t1 = gen_reg_rtx (wider_mode);
@@ -2070,7 +2073,8 @@  expand_twoval_binop (optab binoptab, rtx
     {
       FOR_EACH_WIDER_MODE (wider_mode, mode)
 	{
-	  if (optab_handler (binoptab, wider_mode) != CODE_FOR_nothing)
+	  if (optab_handler (binoptab, wider_mode) != CODE_FOR_nothing
+	      && targetm.default_widening_p (mode, wider_mode, unsignedp))
 	    {
 	      rtx t0 = gen_reg_rtx (wider_mode);
 	      rtx t1 = gen_reg_rtx (wider_mode);
@@ -2169,7 +2173,9 @@  widen_leading (scalar_int_mode mode, rtx
   FOR_EACH_WIDER_MODE (wider_mode_iter, mode)
     {
       scalar_int_mode wider_mode = wider_mode_iter.require ();
-      if (optab_handler (unoptab, wider_mode) != CODE_FOR_nothing)
+      bool unsignedp = unoptab != clrsb_optab;
+      if (optab_handler (unoptab, wider_mode) != CODE_FOR_nothing
+	  && targetm.default_widening_p (mode, wider_mode, unsignedp))
 	{
 	  rtx xop0, temp;
 	  rtx_insn *last;
@@ -2178,10 +2184,8 @@  widen_leading (scalar_int_mode mode, rtx
 
 	  if (target == 0)
 	    target = gen_reg_rtx (mode);
-	  xop0 = widen_operand (op0, wider_mode, mode,
-				unoptab != clrsb_optab, false);
-	  temp = expand_unop (wider_mode, unoptab, xop0, NULL_RTX,
-			      unoptab != clrsb_optab);
+	  xop0 = widen_operand (op0, wider_mode, mode, unsignedp, false);
+	  temp = expand_unop (wider_mode, unoptab, xop0, NULL_RTX, unsignedp);
 	  if (temp != 0)
 	    temp = expand_binop
 	      (wider_mode, sub_optab, temp,
@@ -2333,9 +2337,12 @@  widen_bswap (scalar_int_mode mode, rtx o
   opt_scalar_int_mode wider_mode_iter;
 
   FOR_EACH_WIDER_MODE (wider_mode_iter, mode)
-    if (optab_handler (bswap_optab, wider_mode_iter.require ())
-	!= CODE_FOR_nothing)
-      break;
+    {
+      machine_mode wider_mode = wider_mode_iter.require ();
+      if (optab_handler (bswap_optab, wider_mode) != CODE_FOR_nothing
+	  && targetm.default_widening_p (mode, wider_mode, true))
+	break;
+    }
 
   if (!wider_mode_iter.exists ())
     return NULL_RTX;
@@ -2865,7 +2872,8 @@  expand_unop (machine_mode mode, optab un
   if (CLASS_HAS_WIDER_MODES_P (mclass))
     FOR_EACH_WIDER_MODE (wider_mode, mode)
       {
-	if (optab_handler (unoptab, wider_mode) != CODE_FOR_nothing)
+	if (optab_handler (unoptab, wider_mode) != CODE_FOR_nothing
+	    && targetm.default_widening_p (mode, wider_mode, unsignedp))
 	  {
 	    rtx xop0 = op0;
 	    rtx_insn *last = get_last_insn ();
@@ -3032,8 +3040,9 @@  expand_unop (machine_mode mode, optab un
     {
       FOR_EACH_WIDER_MODE (wider_mode, mode)
 	{
-	  if (optab_handler (unoptab, wider_mode) != CODE_FOR_nothing
-	      || optab_libfunc (unoptab, wider_mode))
+	  if ((optab_handler (unoptab, wider_mode) != CODE_FOR_nothing
+	       || optab_libfunc (unoptab, wider_mode))
+	      && targetm.default_widening_p (mode, wider_mode, unsignedp))
 	    {
 	      rtx xop0 = op0;
 	      rtx_insn *last = get_last_insn ();
Index: gcc/cse.c
===================================================================
--- gcc/cse.c	(revision 259376)
+++ gcc/cse.c	(working copy)
@@ -4885,6 +4885,9 @@  cse_insn (rtx_insn *insn)
 	      if (GET_MODE_PRECISION (wider_mode) > BITS_PER_WORD)
 		break;
 
+	      if (!targetm.default_widening_p (mode, wider_mode, false))
+		continue;
+
 	      struct table_elt *const_elt
 		= lookup (src_const, HASH (src_const, wider_mode), wider_mode);
 
@@ -4924,6 +4927,9 @@  cse_insn (rtx_insn *insn)
 	      if (GET_MODE_SIZE (tmode) > UNITS_PER_WORD)
 		break;
 
+	      if (!targetm.default_widening_p (mode, tmode, false))
+		continue;
+
 	      rtx inner = gen_lowpart (tmode, XEXP (src, 0));
 	      struct table_elt *larger_elt;
 
@@ -4979,6 +4985,9 @@  cse_insn (rtx_insn *insn)
 	      if (GET_MODE_SIZE (tmode) > UNITS_PER_WORD)
 		break;
 
+	      if (!targetm.default_widening_p (mode, tmode, false))
+		continue;
+
 	      PUT_MODE (memory_extend_rtx, tmode);
 	      larger_elt = lookup (memory_extend_rtx,
 				   HASH (memory_extend_rtx, tmode), tmode);
Index: gcc/combine.c
===================================================================
--- gcc/combine.c	(revision 259376)
+++ gcc/combine.c	(working copy)
@@ -12965,6 +12965,9 @@  simplify_comparison (enum rtx_code code,
 				  || (nonzero_bits (op1, tmode)
 				      & ~GET_MODE_MASK (mode)) == 0)));
 
+	    if (!targetm.default_widening_p (mode, tmode, zero_extended))
+	      continue;
+
 	    if (zero_extended
 		|| ((num_sign_bit_copies (op0, tmode)
 		     > (unsigned int) (GET_MODE_PRECISION (tmode)
Index: gcc/var-tracking.c
===================================================================
--- gcc/var-tracking.c	(revision 259376)
+++ gcc/var-tracking.c	(working copy)
@@ -6347,10 +6347,14 @@  prepare_call_arguments (basic_block bb, 
 		opt_scalar_int_mode mode_iter;
 		FOR_EACH_WIDER_MODE (mode_iter, mode)
 		  {
+		    machine_mode old_mode = mode;
 		    mode = mode_iter.require ();
 		    if (GET_MODE_BITSIZE (mode) > BITS_PER_WORD)
 		      break;
 
+		    if (!targetm.default_widening_p (old_mode, mode, false))
+		      continue;
+
 		    rtx reg = simplify_subreg (mode, x, GET_MODE (x), 0);
 		    if (reg == NULL_RTX || !REG_P (reg))
 		      continue;
Index: gcc/config/rs6000/rs6000.c
===================================================================
--- gcc/config/rs6000/rs6000.c	(revision 259376)
+++ gcc/config/rs6000/rs6000.c	(working copy)
@@ -1980,6 +1980,9 @@  static const struct attribute_spec rs600
 
 #undef TARGET_STARTING_FRAME_OFFSET
 #define TARGET_STARTING_FRAME_OFFSET rs6000_starting_frame_offset
+
+#undef TARGET_DEFAULT_WIDENING_P
+#define TARGET_DEFAULT_WIDENING_P rs6000_default_widening_p
 
 
 /* Processor table.  */
@@ -17150,6 +17153,29 @@  rs6000_init_builtins (void)
 #endif
 }
 
+/* Return true if FROM_MODE can be widened to TO_MODE automatically.  UNSIGNEDP
+   says the conversion is unsigned.
+
+   On PowerPC, don't allow IBM extended double to widen to an IEEE 128-bit
+   floating point value or vice versa.  */
+
+static bool
+rs6000_default_widening_p (machine_mode from_mode,
+			   machine_mode to_mode,
+			   bool unsignedp ATTRIBUTE_UNUSED)
+{
+  if (!TARGET_FLOAT128_TYPE)
+    return true;
+
+  if (FLOAT128_IEEE_P (from_mode) && FLOAT128_IBM_P (to_mode))
+    return false;
+
+  if (FLOAT128_IBM_P (from_mode) && FLOAT128_IEEE_P (to_mode))
+    return false;
+
+  return true;
+}
+
 /* Returns the rs6000 builtin decl for CODE.  */
 
 static tree
Index: gcc/doc/tm.texi
===================================================================
--- gcc/doc/tm.texi	(revision 259376)
+++ gcc/doc/tm.texi	(working copy)
@@ -4315,6 +4315,11 @@  If this hook allows @code{val} to have a
 @code{int8x8x3_t}s in registers rather than forcing them onto the stack.
 @end deftypefn
 
+@deftypefn {Target Hook} bool TARGET_DEFAULT_WIDENING_P (machine_mode, @var{machine_mode}, @var{bool})
+Return true if GCC can automatically widen from @var{from_mode} to
+@var{to_mode}.  Conversions are unsigned if @var{unsigned_p} is true.
+@end deftypefn
+
 @deftypefn {Target Hook} bool TARGET_LIBGCC_FLOATING_MODE_SUPPORTED_P (scalar_float_mode @var{mode})
 Define this to return nonzero if libgcc provides support for the 
 floating-point mode @var{mode}, which is known to pass 
Index: gcc/doc/tm.texi.in
===================================================================
--- gcc/doc/tm.texi.in	(revision 259376)
+++ gcc/doc/tm.texi.in	(working copy)
@@ -3344,6 +3344,8 @@  stack.
 
 @hook TARGET_ARRAY_MODE_SUPPORTED_P
 
+@hook TARGET_DEFAULT_WIDENING_P
+
 @hook TARGET_LIBGCC_FLOATING_MODE_SUPPORTED_P
 
 @hook TARGET_FLOATN_MODE
Index: gcc/testsuite/gcc.target/powerpc/pr85358.c
===================================================================
--- gcc/testsuite/gcc.target/powerpc/pr85358.c	(revision 0)
+++ gcc/testsuite/gcc.target/powerpc/pr85358.c	(revision 0)
@@ -0,0 +1,14 @@ 
+/* { dg-do compile { target { powerpc*-*-linux* && lp64 } } } */
+/* { dg-require-effective-target powerpc_p9vector_ok } */
+/* { dg-options "-mpower9-vector -mfloat128 -O2 -mabi=ieeelongdouble -Wno-psabi" } */
+
+/* Verify that __ibm128 does not get converted automatically to IEEE 128-bit on
+   machines with IEEE 128-bit hardware support.  */
+
+__ibm128
+add (__ibm128 a, __ibm128 b)
+{
+  return a + b;
+}
+
+/* { dg-final { scan-assembler-not {\mxsaddqp\M} } } */