From patchwork Wed Sep  1 15:24:56 2010
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Michael Meissner <meissner@linux.vnet.ibm.com>
X-Patchwork-Id: 63384
Return-Path: 
 <gcc-patches-return-271739-incoming=patchwork.ozlabs.org@gcc.gnu.org>
X-Original-To: incoming@patchwork.ozlabs.org
Delivered-To: patchwork-incoming@bilbo.ozlabs.org
Received: from sourceware.org (server1.sourceware.org [209.132.180.131])
	by ozlabs.org (Postfix) with SMTP id D4968B7150
	for <incoming@patchwork.ozlabs.org>;
	Thu,  2 Sep 2010 01:25:35 +1000 (EST)
Received: (qmail 30925 invoked by alias); 1 Sep 2010 15:25:28 -0000
Received: (qmail 30897 invoked by uid 22791); 1 Sep 2010 15:25:26 -0000
X-SWARE-Spam-Status: No, hits=-1.2 required=5.0	tests=AWL, BAYES_00,
	NO_DNS_FOR_FROM, TW_FP, TW_FR, TW_MV, TW_XS, T_RP_MATCHES_RCVD
X-Spam-Check-By: sourceware.org
Received: from e3.ny.us.ibm.com (HELO e3.ny.us.ibm.com) (32.97.182.143) by
	sourceware.org (qpsmtpd/0.43rc1) with ESMTP;
	Wed, 01 Sep 2010 15:25:10 +0000
Received: from d01relay03.pok.ibm.com (d01relay03.pok.ibm.com
	[9.56.227.235])	by e3.ny.us.ibm.com (8.14.4/8.13.1) with
	ESMTP id o81F9Yf0023219	for <gcc-patches@gcc.gnu.org>;
	Wed, 1 Sep 2010 11:09:34 -0400
Received: from d01av01.pok.ibm.com (d01av01.pok.ibm.com [9.56.224.215])	by
	d01relay03.pok.ibm.com (8.13.8/8.13.8/NCO v10.0) with ESMTP
	id o81FP3Ti354400	for <gcc-patches@gcc.gnu.org>;
	Wed, 1 Sep 2010 11:25:07 -0400
Received: from d01av01.pok.ibm.com (loopback [127.0.0.1])	by
	d01av01.pok.ibm.com (8.14.4/8.13.1/NCO v10.0 AVout) with
	ESMTP id o81FP2e1014057	for <gcc-patches@gcc.gnu.org>;
	Wed, 1 Sep 2010 11:25:02 -0400
Received: from hungry-tiger.westford.ibm.com (dyn9033037152.westford.ibm.com
	[9.33.37.152])	by d01av01.pok.ibm.com (8.14.4/8.13.1/NCO
	v10.0 AVin) with ESMTP id o81FOwxV013310;
	Wed, 1 Sep 2010 11:24:58 -0400
Received: by hungry-tiger.westford.ibm.com (Postfix, from userid 500)	id
	BD7A2325A; Wed,  1 Sep 2010 11:24:56 -0400 (EDT)
Date: Wed, 1 Sep 2010 11:24:56 -0400
From: Michael Meissner <meissner@linux.vnet.ibm.com>
To: gcc-patches@gcc.gnu.org, dje.gcc@gmail.com
Subject: [PATCH,
	powerpc] Generate FRIZ for (double)(long) under -ffast-math on
	powerpc
Message-ID: <20100901152456.GA5449@hungry-tiger.westford.ibm.com>
Mail-Followup-To: Michael Meissner <meissner@linux.vnet.ibm.com>,
	gcc-patches@gcc.gnu.org, dje.gcc@gmail.com
MIME-Version: 1.0
Content-Disposition: inline
User-Agent: Mutt/1.5.20 (2009-08-17)
X-IsSubscribed: yes
Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm
Precedence: bulk
List-Id: <gcc-patches.gcc.gnu.org>
List-Unsubscribe: 
 <mailto:gcc-patches-unsubscribe-incoming=patchwork.ozlabs.org@gcc.gnu.org>
List-Archive: <http://gcc.gnu.org/ml/gcc-patches/>
List-Post: <mailto:gcc-patches@gcc.gnu.org>
List-Help: <mailto:gcc-patches-help@gcc.gnu.org>
Sender: gcc-patches-owner@gcc.gnu.org
Delivered-To: mailing list gcc-patches@gcc.gnu.org

This patch optimizes (double)(long) conversions to generate a single FRIZ
(round toward zero) instruction, instead of converting to a 64-bit integer and
back.  It is only done under -ffast-math, because the FRIZ instruction gives a
different answer when the value is outside the range of a 64-bit integer.  I
added a -mfriz/-mno-friz option to control whether the instruction is
generated.
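
To illustrate the range issue, here is a minimal C sketch; it is not part of
the patch.  The names friz_model and count_matching_samples are hypothetical,
and friz_model is only an assumed software model of round-toward-zero that
ignores the sign of a zero result.  The out-of-range cast itself is never
evaluated, since it is undefined behavior in C.

```c
#include <assert.h>
#include <stddef.h>

/* The source pattern the patch matches: double -> 64-bit integer -> double.  */
static double round_double_llong (double a)
{
  return (double)(long long)a;
}

/* Hypothetical model of rounding to an integral value toward zero.  It
   does not try to reproduce the sign of a zero result.  */
static double friz_model (double a)
{
  const double two52 = 4503599627370496.0;	/* 2^52 */

  if (a != a)				/* NaN passes through.  */
    return a;
  if (a >= two52 || a <= -two52)	/* No fractional bits left.  */
    return a;
  return (double)(long long)a;		/* Safe: |a| < 2^52.  */
}

/* Count the in-range samples on which the cast pair matches the expected
   truncated value; the model must match on every sample.  */
static int count_matching_samples (void)
{
  static const double in[] =
    { 0.0, 1.5, -1.5, 2.999, -2.999, 1e15, -1e15, 1e18 };
  static const double want[] =
    { 0.0, 1.0, -1.0, 2.0, -2.0, 1e15, -1e15, 1e18 };
  int matches = 0;
  size_t i;

  for (i = 0; i < sizeof in / sizeof in[0]; i++)
    {
      assert (friz_model (in[i]) == want[i]);
      if (round_double_llong (in[i]) == want[i])
	matches++;
    }
  return matches;
}
```

For a value such as 1e30 the model returns its input unchanged, while the
convert-and-back sequence has no defined result in C; that difference is why
the transformation is restricted to -ffast-math.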

I did a bootstrap, and make check yielded no regressions.  Is this ok to
install in the tree?

[gcc]
2010-08-31  Michael Meissner  <meissner@linux.vnet.ibm.com>

	* config/rs6000/rs6000.opt (-mfriz): New switch to control whether
	to convert (double)(long) into a single FRIZ instruction or not
	when -ffast-math is used.

	* config/rs6000/vsx.md (VSX_DF): New iterator for DF/V2DF modes.
	(vsx_float_fix_<mode>2): Optimize (double)(long) into X{S,V}RDPIZ
	or FRIZ instruction if -ffast-math.
	* config/rs6000/rs6000.md (friz): Ditto.

	* doc/invoke.texi (RS/6000 and PowerPC Options): Document -mfriz.

[gcc/testsuite]
2010-08-31  Michael Meissner  <meissner@linux.vnet.ibm.com>

	* gcc.target/powerpc/ppc-fpconv-10.c: New file to test generating
	FRIZ/XSRDPIZ instruction for (double)(long long)x.
	* gcc.target/powerpc/ppc-fpconv-11.c: Ditto.

Index: gcc/config/rs6000/rs6000.opt
===================================================================
--- gcc/config/rs6000/rs6000.opt	(revision 163697)
+++ gcc/config/rs6000/rs6000.opt	(working copy)
@@ -1,6 +1,7 @@
 ; Options for the rs6000 port of the compiler
 ;
-; Copyright (C) 2005, 2006, 2007, 2008, 2009 Free Software Foundation, Inc.
+; Copyright (C) 2005, 2006, 2007, 2008, 2009, 2010 Free Software
+; Foundation, Inc.
 ; Contributed by Aldy Hernandez <aldy@quesejoda.com>.
 ;
 ; This file is part of GCC.
@@ -115,6 +116,10 @@ mpopcntd
 Target Report Mask(POPCNTD)
 Use PowerPC V2.06 popcntd instruction
 
+mfriz
+Target Report Var(TARGET_FRIZ) Init(-1)
+Under -ffast-math, generate a FRIZ instruction for (double)(long long) conversions
+
 mveclibabi=
 Target RejectNegative Joined Var(rs6000_veclibabi_name)
 Vector library ABI to use
Index: gcc/config/rs6000/vsx.md
===================================================================
--- gcc/config/rs6000/vsx.md	(revision 163697)
+++ gcc/config/rs6000/vsx.md	(working copy)
@@ -28,6 +28,9 @@ (define_mode_iterator VSX_D [V2DF V2DI])
 ;; Iterator for the 2 32-bit vector types
 (define_mode_iterator VSX_W [V4SF V4SI])
 
+;; Iterator for the DF types
+(define_mode_iterator VSX_DF [V2DF DF])
+
 ;; Iterator for vector floating point types supported by VSX
 (define_mode_iterator VSX_F [V4SF V2DF])
 
@@ -1053,6 +1056,22 @@ (define_insn "vsx_xvcvspuxds"
   "VECTOR_UNIT_VSX_P (V2DFmode)"
   "xvcvspuxds %x0,%x1"
   [(set_attr "type" "vecfloat")])
+
+;; Only optimize (float (fix x)) -> friz if we are in fast-math mode,
+;; since the xsrdpiz instruction does not truncate the value if the floating
+;; point value is < LONG_MIN or > LONG_MAX.
+(define_insn "*vsx_float_fix_<mode>2"
+  [(set (match_operand:VSX_DF 0 "vsx_register_operand" "=<VSr>,?wa")
+	(float:VSX_DF
+	 (fix:<VSI>
+	  (match_operand:VSX_DF 1 "vsx_register_operand" "<VSr>,?wa"))))]
+  "TARGET_HARD_FLOAT && TARGET_FPRS && TARGET_DOUBLE_FLOAT
+   && VECTOR_UNIT_VSX_P (<MODE>mode) && flag_unsafe_math_optimizations
+   && !flag_trapping_math && TARGET_FRIZ"
+  "x<VSv>r<VSs>iz %x0,%x1"
+  [(set_attr "type" "<VStype_simple>")
+   (set_attr "fp_type" "<VSfptype_simple>")])
+
 
 ;; Logical and permute operations
 (define_insn "*vsx_and<mode>3"
Index: gcc/config/rs6000/rs6000.md
===================================================================
--- gcc/config/rs6000/rs6000.md	(revision 163697)
+++ gcc/config/rs6000/rs6000.md	(working copy)
@@ -6983,6 +6983,18 @@ (define_insn "fctiwuz_<mode>"
   "fctiwuz %0,%1"
   [(set_attr "type" "fp")])
 
+;; Only optimize (float (fix x)) -> friz if we are in fast-math mode,
+;; since the friz instruction does not truncate the value if the floating
+;; point value is < LONG_MIN or > LONG_MAX.
+(define_insn "*friz"
+  [(set (match_operand:DF 0 "gpc_reg_operand" "=d")
+	(float:DF (fix:DI (match_operand:DF 1 "gpc_reg_operand" "d"))))]
+  "TARGET_HARD_FLOAT && TARGET_FPRS && TARGET_DOUBLE_FLOAT && TARGET_POPCNTB
+   && !VECTOR_UNIT_VSX_P (DFmode) && flag_unsafe_math_optimizations
+   && !flag_trapping_math && TARGET_FRIZ"
+  "friz %0,%1"
+  [(set_attr "type" "fp")])
+
 ;; No VSX equivalent to fctid
 (define_insn "lrint<mode>di2"
   [(set (match_operand:DI 0 "gpc_reg_operand" "=d")
Index: gcc/testsuite/gcc.target/powerpc/ppc-fpconv-11.c
===================================================================
--- gcc/testsuite/gcc.target/powerpc/ppc-fpconv-11.c	(revision 0)
+++ gcc/testsuite/gcc.target/powerpc/ppc-fpconv-11.c	(revision 0)
@@ -0,0 +1,10 @@
+/* { dg-do compile { target { powerpc*-*-* } } } */
+/* { dg-skip-if "" { powerpc*-*-darwin* } { "*" } { "" } } */
+/* { dg-options "-O2 -mcpu=power5 -ffast-math" } */
+/* { dg-final { scan-assembler-not "xsrdpiz" } } */
+/* { dg-final { scan-assembler "friz" } } */
+
+double round_double_llong (double a)
+{
+  return (double)(long long)a;
+}
Index: gcc/testsuite/gcc.target/powerpc/ppc-fpconv-10.c
===================================================================
--- gcc/testsuite/gcc.target/powerpc/ppc-fpconv-10.c	(revision 0)
+++ gcc/testsuite/gcc.target/powerpc/ppc-fpconv-10.c	(revision 0)
@@ -0,0 +1,11 @@
+/* { dg-do compile { target { powerpc*-*-* } } } */
+/* { dg-skip-if "" { powerpc*-*-darwin* } { "*" } { "" } } */
+/* { dg-require-effective-target powerpc_vsx_ok } */
+/* { dg-options "-O2 -mcpu=power7 -ffast-math" } */
+/* { dg-final { scan-assembler "xsrdpiz" } } */
+/* { dg-final { scan-assembler-not "friz" } } */
+
+double round_double_llong (double a)
+{
+  return (double)(long long)a;
+}
Index: gcc/doc/invoke.texi
===================================================================
--- gcc/doc/invoke.texi	(revision 163697)
+++ gcc/doc/invoke.texi	(working copy)
@@ -789,7 +789,7 @@ See RS/6000 and PowerPC Options.
 -msdata=@var{opt}  -mvxworks  -G @var{num}  -pthread @gol
 -mrecip -mrecip=@var{opt} -mno-recip -mrecip-precision
 -mno-recip-precision @gol
--mveclibabi=@var{type}}
+-mveclibabi=@var{type} -mfriz -mno-friz}
 
 @emph{RX Options}
 @gccoptlist{-m64bit-doubles  -m32bit-doubles  -fpu  -nofpu@gol
@@ -15931,6 +15931,15 @@ GCC will currently emit calls to @code{a
 for power7.  Both @option{-ftree-vectorize} and
 @option{-funsafe-math-optimizations} have to be enabled.  The MASS
 libraries will have to be specified at link time.
+
+@item -mfriz
+@itemx -mno-friz
+@opindex mfriz
+Generate (do not generate) the @code{friz} instruction when the
+@option{-funsafe-math-optimizations} option is used to optimize
+rounding a floating point value to 64-bit integer and back to floating
+point.  The @code{friz} instruction does not return the same value if
+the floating point number is too large to fit in a 64-bit integer.
 @end table
 
 @node RX Options