From patchwork Wed Sep 1 15:24:56 2010
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Michael Meissner
X-Patchwork-Id: 63384
Return-Path:
X-Original-To: incoming@patchwork.ozlabs.org
Delivered-To: patchwork-incoming@bilbo.ozlabs.org
Received: from sourceware.org (server1.sourceware.org [209.132.180.131]) by ozlabs.org (Postfix) with SMTP id D4968B7150 for ; Thu, 2 Sep 2010 01:25:35 +1000 (EST)
Received: (qmail 30925 invoked by alias); 1 Sep 2010 15:25:28 -0000
Received: (qmail 30897 invoked by uid 22791); 1 Sep 2010 15:25:26 -0000
X-SWARE-Spam-Status: No, hits=-1.2 required=5.0 tests=AWL, BAYES_00, NO_DNS_FOR_FROM, TW_FP, TW_FR, TW_MV, TW_XS, T_RP_MATCHES_RCVD
X-Spam-Check-By: sourceware.org
Received: from e3.ny.us.ibm.com (HELO e3.ny.us.ibm.com) (32.97.182.143) by sourceware.org (qpsmtpd/0.43rc1) with ESMTP; Wed, 01 Sep 2010 15:25:10 +0000
Received: from d01relay03.pok.ibm.com (d01relay03.pok.ibm.com [9.56.227.235]) by e3.ny.us.ibm.com (8.14.4/8.13.1) with ESMTP id o81F9Yf0023219 for ; Wed, 1 Sep 2010 11:09:34 -0400
Received: from d01av01.pok.ibm.com (d01av01.pok.ibm.com [9.56.224.215]) by d01relay03.pok.ibm.com (8.13.8/8.13.8/NCO v10.0) with ESMTP id o81FP3Ti354400 for ; Wed, 1 Sep 2010 11:25:07 -0400
Received: from d01av01.pok.ibm.com (loopback [127.0.0.1]) by d01av01.pok.ibm.com (8.14.4/8.13.1/NCO v10.0 AVout) with ESMTP id o81FP2e1014057 for ; Wed, 1 Sep 2010 11:25:02 -0400
Received: from hungry-tiger.westford.ibm.com (dyn9033037152.westford.ibm.com [9.33.37.152]) by d01av01.pok.ibm.com (8.14.4/8.13.1/NCO v10.0 AVin) with ESMTP id o81FOwxV013310; Wed, 1 Sep 2010 11:24:58 -0400
Received: by hungry-tiger.westford.ibm.com (Postfix, from userid 500) id BD7A2325A; Wed, 1 Sep 2010 11:24:56 -0400 (EDT)
Date: Wed, 1 Sep 2010 11:24:56 -0400
From: Michael Meissner
To: gcc-patches@gcc.gnu.org, dje.gcc@gmail.com
Subject: [PATCH, powerpc] Generate FRIZ for (double)(long) under -ffast-math on powerpc
Message-ID: <20100901152456.GA5449@hungry-tiger.westford.ibm.com>
Mail-Followup-To: Michael Meissner , gcc-patches@gcc.gnu.org, dje.gcc@gmail.com
MIME-Version: 1.0
Content-Disposition: inline
User-Agent: Mutt/1.5.20 (2009-08-17)
X-IsSubscribed: yes
Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm
Precedence: bulk
List-Id:
List-Unsubscribe:
List-Archive:
List-Post:
List-Help:
Sender: gcc-patches-owner@gcc.gnu.org
Delivered-To: mailing list gcc-patches@gcc.gnu.org

This patch optimizes (double)(long) conversions to generate a single FRIZ
round instruction instead of converting to a 64-bit integer and back. It is
only done under -ffast-math, because the FRIZ instruction gives a different
answer when the value is too large or too small to fit in a 64-bit integer.

I added a -mfriz/-mno-friz option to control whether the instruction is
generated.

I did a bootstrap, and make check yielded no regressions. Is this ok to
install in the tree?

[gcc]
2010-08-31 Michael Meissner

	* config/rs6000/rs6000.opt (-mfriz): New switch to control
	whether to convert (double)(long) into a single FRIZ instruction
	or not when -ffast-math is used.

	* config/rs6000/vsx.md (VSX_DF): New iterator for DF/V2DF modes.
	(vsx_float_fix_2): Optimize (double)(long) into X{S,V}RDPIZ or
	FRIZ instruction if -ffast-math.

	* config/rs6000/rs6000.md (friz): Ditto.

	* doc/invoke.texi (RS/6000 and PowerPC Options): Document -mfriz.

[gcc/testsuite]
2010-08-31 Michael Meissner

	* gcc.target/powerpc/ppc-fpconv-10.c: New file to test generating
	FRIZ/XSRDPIZ instruction for (double)(long long)x.
	* gcc.target/powerpc/ppc-fpconv-11.c: Ditto.

Index: gcc/config/rs6000/rs6000.opt
===================================================================
--- gcc/config/rs6000/rs6000.opt (revision 163697)
+++ gcc/config/rs6000/rs6000.opt (working copy)
@@ -1,6 +1,7 @@
 ; Options for the rs6000 port of the compiler
 ;
-; Copyright (C) 2005, 2006, 2007, 2008, 2009 Free Software Foundation, Inc.
+; Copyright (C) 2005, 2006, 2007, 2008, 2009, 2010 Free Software
+; Foundation, Inc.
 ; Contributed by Aldy Hernandez .
 ;
 ; This file is part of GCC.
@@ -115,6 +116,10 @@ mpopcntd
 Target Report Mask(POPCNTD)
 Use PowerPC V2.06 popcntd instruction
 
+mfriz
+Target Report Var(TARGET_FRIZ) Init(-1)
+Under -ffast-math, generate a FRIZ instruction for (double)(long long) conversions
+
 mveclibabi=
 Target RejectNegative Joined Var(rs6000_veclibabi_name)
 Vector library ABI to use
Index: gcc/config/rs6000/vsx.md
===================================================================
--- gcc/config/rs6000/vsx.md (revision 163697)
+++ gcc/config/rs6000/vsx.md (working copy)
@@ -28,6 +28,9 @@ (define_mode_iterator VSX_D [V2DF V2DI])
 ;; Iterator for the 2 32-bit vector types
 (define_mode_iterator VSX_W [V4SF V4SI])
 
+;; Iterator for the DF types
+(define_mode_iterator VSX_DF [V2DF DF])
+
 ;; Iterator for vector floating point types supported by VSX
 (define_mode_iterator VSX_F [V4SF V2DF])
 
@@ -1053,6 +1056,22 @@ (define_insn "vsx_xvcvspuxds"
   "VECTOR_UNIT_VSX_P (V2DFmode)"
   "xvcvspuxds %x0,%x1"
   [(set_attr "type" "vecfloat")])
+
+;; Only optimize (float (fix x)) -> frz if we are in fast-math mode, since
+;; the xsrdpiz instruction does not truncate the value if the floating
+;; point value is < LONG_MIN or > LONG_MAX.
+(define_insn "*vsx_float_fix_2"
+  [(set (match_operand:VSX_DF 0 "vsx_register_operand" "=,?wa")
+        (float:VSX_DF
+         (fix:
+          (match_operand:VSX_DF 1 "vsx_register_operand" ",?wa"))))]
+  "TARGET_HARD_FLOAT && TARGET_FPRS && TARGET_DOUBLE_FLOAT
+   && VECTOR_UNIT_VSX_P (mode) && flag_unsafe_math_optimizations
+   && !flag_trapping_math && TARGET_FRIZ"
+  "xriz %x0,%x1"
+  [(set_attr "type" "")
+   (set_attr "fp_type" "")])
+
 
 ;; Logical and permute operations
 (define_insn "*vsx_and3"
Index: gcc/config/rs6000/rs6000.md
===================================================================
--- gcc/config/rs6000/rs6000.md (revision 163697)
+++ gcc/config/rs6000/rs6000.md (working copy)
@@ -6983,6 +6983,18 @@ (define_insn "fctiwuz_"
   "fctiwuz %0,%1"
   [(set_attr "type" "fp")])
 
+;; Only optimize (float (fix x)) -> frz if we are in fast-math mode, since
+;; the friz instruction does not truncate the value if the floating
+;; point value is < LONG_MIN or > LONG_MAX.
+(define_insn "*friz"
+  [(set (match_operand:DF 0 "gpc_reg_operand" "=d")
+        (float:DF (fix:DI (match_operand:DF 1 "gpc_reg_operand" "d"))))]
+  "TARGET_HARD_FLOAT && TARGET_FPRS && TARGET_DOUBLE_FLOAT && TARGET_POPCNTB
+   && !VECTOR_UNIT_VSX_P (DFmode) && flag_unsafe_math_optimizations
+   && !flag_trapping_math && TARGET_FRIZ"
+  "friz %0,%1"
+  [(set_attr "type" "fp")])
+
 ;; No VSX equivalent to fctid
 (define_insn "lrintdi2"
   [(set (match_operand:DI 0 "gpc_reg_operand" "=d")
Index: gcc/testsuite/gcc.target/powerpc/ppc-fpconv-11.c
===================================================================
--- gcc/testsuite/gcc.target/powerpc/ppc-fpconv-11.c (revision 0)
+++ gcc/testsuite/gcc.target/powerpc/ppc-fpconv-11.c (revision 0)
@@ -0,0 +1,10 @@
+/* { dg-do compile { target { powerpc*-*-* } } } */
+/* { dg-skip-if "" { powerpc*-*-darwin* } { "*" } { "" } } */
+/* { dg-options "-O2 -mcpu=power5 -ffast-math" } */
+/* { dg-final { scan-assembler-not "xsrdpiz" } } */
+/* { dg-final { scan-assembler "friz" } } */
+
+double round_double_llong (double a)
+{
+  return (double)(long long)a;
+}
Index: gcc/testsuite/gcc.target/powerpc/ppc-fpconv-10.c
===================================================================
--- gcc/testsuite/gcc.target/powerpc/ppc-fpconv-10.c (revision 0)
+++ gcc/testsuite/gcc.target/powerpc/ppc-fpconv-10.c (revision 0)
@@ -0,0 +1,11 @@
+/* { dg-do compile { target { powerpc*-*-* } } } */
+/* { dg-skip-if "" { powerpc*-*-darwin* } { "*" } { "" } } */
+/* { dg-require-effective-target powerpc_vsx_ok } */
+/* { dg-options "-O2 -mcpu=power7 -ffast-math" } */
+/* { dg-final { scan-assembler "xsrdpiz" } } */
+/* { dg-final { scan-assembler-not "friz" } } */
+
+double round_double_llong (double a)
+{
+  return (double)(long long)a;
+}
Index: gcc/doc/invoke.texi
===================================================================
--- gcc/doc/invoke.texi (revision 163697)
+++ gcc/doc/invoke.texi (working copy)
@@ -789,7 +789,7 @@ See RS/6000 and PowerPC Options.
 -msdata=@var{opt} -mvxworks -G @var{num} -pthread @gol
 -mrecip -mrecip=@var{opt} -mno-recip -mrecip-precision
 -mno-recip-precision @gol
--mveclibabi=@var{type}}
+-mveclibabi=@var{type} -mfriz -mno-friz}
 
 @emph{RX Options}
 @gccoptlist{-m64bit-doubles -m32bit-doubles -fpu -nofpu@gol
@@ -15931,6 +15931,15 @@ GCC will currently emit calls to @code{a
 for power7. Both @option{-ftree-vectorize} and
 @option{-funsafe-math-optimizations} have to be enabled. The MASS
 libraries will have to be specified at link time.
+
+@item -mfriz
+@itemx -mno-friz
+@opindex mfriz
+Generate (do not generate) the @code{friz} instruction when the
+@option{-funsafe-math-optimizations} option is used to optimize
+rounding a floating point value to 64-bit integer and back to floating
+point. The @code{friz} instruction does not return the same value if
+the floating point number is too large to fit in an integer.
 @end table
 
 @node RX Options
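
The range caveat from the patch description can be illustrated in portable C.
This is only a sketch and not part of the patch: round_toward_zero,
integer_round_trip and compare are hypothetical helpers, and the first one is
a software model of a round-toward-zero instruction, not the hardware
instruction itself. For values that fit in a 64-bit integer the two paths
agree; for larger values the rounding path still returns a number, while the
integer path has no defined result in C.

```c
#include <stdio.h>

/* 2^63 and 2^52 are both exactly representable as doubles. */
static const double TWO63 = 9223372036854775808.0;
static const double TWO52 = 4503599627370496.0;

/* Hypothetical model of a round-toward-zero instruction such as friz:
   the result stays in floating point, so every input is valid.
   (The sign of a zero result is not modelled.)  */
static double round_toward_zero(double x)
{
    if (x != x || x >= TWO52 || x <= -TWO52)
        return x;                  /* NaN, infinity, or already integral */
    return (double)(long long)x;   /* safe here: |x| < 2^52 */
}

/* The unoptimized (double)(long long)x path.  Returns 1 and stores the
   result when x fits in a 64-bit integer; returns 0 otherwise, because
   the conversion is undefined in C for such values.  */
static int integer_round_trip(double x, double *result)
{
    if (!(x >= -TWO63 && x < TWO63))
        return 0;
    *result = (double)(long long)x;
    return 1;
}

/* Print both answers for one input (illustration only). */
static void compare(double x)
{
    double r;
    if (integer_round_trip(x, &r))
        printf("%g: round %g, integer path %g\n", x, round_toward_zero(x), r);
    else
        printf("%g: round %g, integer path out of range\n",
               x, round_toward_zero(x));
}
```

Calling compare(2.75) and compare(1e30) from a small main shows the two paths
agreeing for the first value and diverging for the second, which is why the
patch restricts the transformation to -ffast-math.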