From patchwork Thu May 26 17:04:59 2016
X-Patchwork-Submitter: Michael Meissner
X-Patchwork-Id: 626799
Date: Thu, 26 May 2016 13:04:59 -0400
From: Michael Meissner
To: Segher Boessenkool
Cc: Michael Meissner , gcc-patches@gcc.gnu.org, David
 Edelsohn , Bill Schmidt
Subject: Re: [PATCH], Add PowerPC ISA 3.0 min/max support
Message-ID: <20160526170459.GA14011@ibm-tiger.the-meissners.org>
References: <20160505191839.GA7023@ibm-tiger.the-meissners.org> <20160509143143.GC31139@gate.crashing.org>
In-Reply-To: <20160509143143.GC31139@gate.crashing.org>

On Mon, May 09, 2016 at 09:31:43AM -0500, Segher Boessenkool wrote:
> On Thu, May 05, 2016 at 03:18:39PM -0400, Michael Meissner wrote:
> > At the present time, the code does not support comparisons involving >= and <=
> > unless the -ffast-math option is used.  I hope eventually to support generating
> > these instructions without having -ffast-math used.
> >
> > The underlying reason is when fast math is not used, we change the condition
> > from:
> >
> > 	(ge:SI (reg:CCFP ) (const_int 0))
> >
> > to:
> >
> > 	(ior:SI (gt:SI (reg:CCFP ) (const_int 0))
> > 		(eq:SI (reg:CCFP ) (const_int 0)))
> >
> > The machine independent portion of the compiler does not recognize this when
> > trying to generate conditional moves.
> >
> > I would imagine the 'fix' is to generate GE/LE all of the time, and then have a
> > splitter that converts it to IOR of GT/EQ if it is not a conditional move with
> > ISA 3.0 instructions.
>
> That sounds like a plan :-)

Well, it is low on my list of priorities.  Hopefully I or somebody else will
be able to get to it by the time GCC 7 freezes.

> > -;; Return true if operand is MIN or MAX operator.
> > +;; Return true if operand is MIN or MAX operator.  Since this is only used to
> > +;; convert floating point MIN/MAX operations into FSEL on pre-vsx systems,
> > +;; don't include UMIN or UMAX.
> >  (define_predicate "min_max_operator"
> > -  (match_code "smin,smax,umin,umax"))
> > +  (match_code "smin,smax"))
>
> Please name it signed_min_max_operator instead?

In this set of patches, I rewrote the define_split that called it to use
SMIN/SMAX code iterators, so I deleted the min_max_operator predicate.

> > --- gcc/config/rs6000/rs6000.c (.../svn+ssh://meissner@gcc.gnu.org/svn/gcc/trunk/gcc/config/rs6000) (revision 235831)
> > +++ gcc/config/rs6000/rs6000.c (.../gcc/config/rs6000) (working copy)
> > @@ -20534,6 +20534,12 @@ print_operand (FILE *file, rtx x, int co
> >                           "local dynamic TLS references");
> >        return;
> >
> > +    case '@':
> > +      /* If -mpower9-minmax, use xsmaxcpdp instead of xsmaxdp.  */
> > +      if (TARGET_P9_MINMAX)
> > +        putc ('c', file);
> > +      return;
>
> I don't think @ is very mnemonic, nor is this special enough for such
> a nice letter.

I removed the %@ and instead used a C++ test to pick the appropriate string
to return.

> From looking at how it is used, it seems you can make it part of code_attr
> minmax (and give that a better name, minmax_fp or such)?

No, you can't use code attributes, because the choice is based on the target
switches, not on the insn (i.e. VSX with -ffast-math uses the same insn as p9
min/max without -ffast-math).

> > +  rs6000_emit_minmax (dest, (max_p) ? SMAX : SMIN, op0, op1);
>
> Superfluous parentheses.

Ok.

> > +rs6000_emit_power9_cmove (rtx dest, rtx op, rtx true_cond, rtx false_cond)
>
> Maybe put some "fp" in the name?  For "minmax" as well.
>
> > +  if (swap_p)
> > +    compare_rtx = gen_rtx_fmt_ee (code, CCFPmode, op1, op0);
> > +  else
> > +    compare_rtx = gen_rtx_fmt_ee (code, CCFPmode, op0, op1);
>
> if (swap_p)
>   std::swap (op0, op1);
>
> and then just generate the one form?

I renamed the functions, and used std::swap earlier.
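
For anyone following the xsmaxcdp vs. xsmaxdp distinction above, here is a
minimal C sketch (not part of the patch).  The helper names c_style_max and
ieee_style_max are made up for illustration.  That they model the ISA 3.0 and
the older VSX instructions respectively is an assumption based on a reading of
the instruction descriptions, not something this patch states.  The point is
only that the C conditional expression and a NaN-avoiding maximum disagree once
a NaN is involved, which is why the choice of mnemonic depends on the target
switches rather than on the insn.

```c
#include <math.h>   /* NAN */

/* Maximum as written in C source: whenever the comparison is false,
   including when either operand is a NaN, the second operand is returned.
   Assumed here to be the behavior the "c" variants provide.  */
static double
c_style_max (double a, double b)
{
  return (a > b) ? a : b;
}

/* Maximum that prefers a number over a NaN (IEEE maxNum style).  Assumed
   here to approximate the older instruction; it only matches c_style_max
   when no NaNs are present, i.e. under -ffast-math assumptions.  */
static double
ieee_style_max (double a, double b)
{
  if (a != a)	/* a is a NaN.  */
    return b;
  if (b != b)	/* b is a NaN.  */
    return a;
  return (a > b) ? a : b;
}
```

With a = 1.0 and b = NaN, c_style_max returns the NaN (the comparison is false,
so b is selected) while ieee_style_max returns 1.0.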
These patches have been bootstrapped on a big endian power7 system (both 32-bit
and 64-bit available) and a little endian power8 system with no regressions.
Are these patches ok to install in the trunk?  After a burn-in period, are they
ok to install on the GCC 6.2 branch?

[gcc]
2016-05-26  Michael Meissner

	* config/rs6000/rs6000.c (rs6000_emit_p9_fp_minmax): New function for
	ISA 3.0 min/max support.
	(rs6000_emit_p9_fp_cmove): New function for ISA 3.0 floating point
	conditional move support.
	(rs6000_emit_cmove): Call rs6000_emit_p9_fp_minmax and
	rs6000_emit_p9_fp_cmove if the ISA 3.0 instructions are available.
	* config/rs6000/rs6000.md (SFDF2): New iterator to allow doing
	conditional moves where the comparison type is different from the move
	type.
	(fp_minmax): New code iterator for smin/smax.
	(minmax): New code attributes for min/max.
	(SMINMAX): Likewise.
	(smax3): Combine min, max insns into one insn using the fp_minmax code
	iterator.  Add support for ISA 3.0 min/max instructions that don't
	need -ffast-math.
	(s3): Likewise.
	(smax3_vsx): Likewise.
	(smin3): Likewise.
	(s3_vsx): Likewise.
	(smin3_vsx): Likewise.
	(pre-VSX min/max splitters): Likewise.
	(s3_fpr): Likewise.
	(movsfcc): Rewrite floating point conditional moves to combine
	SFmode/DFmode into a single insn.
	(movcc): Likewise.
	(movdfcc): Likewise.
	(fselsfsf4): Combine FSEL cases into a single insn, using SFDF and
	SFDF2 iterators to handle all combinations.
	(fseldfsf4): Likewise.
	(fsel4): Likewise.
	(fseldfdf4): Likewise.
	(fselsfdf4): Likewise.
	(movcc_p9): Add support for the ISA 3.0 comparison instructions that
	set a 0/-1 mask, and use it for floating point conditional move via
	XXSEL.
	(fpmask): Likewise.
	(xxsel): Likewise.
	* config/rs6000/predicates.md (min_max_operator): Delete, no longer
	used.
	(fpmask_comparison_operator): New predicate for ISA 3.0 comparison
	instructions that generate a 0/-1 mask for use with XXSEL.
	* config/rs6000/rs6000.h (TARGET_MINMAX_SF): New helper macros to say
	whether floating point min/max is available, either through FSEL, ISA
	2.06 min/max, or ISA 3.0 min/max instructions.
	(TARGET_MINMAX_DF): Likewise.

[gcc/testsuite]
2016-05-26  Michael Meissner

	* gcc.target/powerpc/p9-minmax-1.c: New tests for ISA 3.0 floating
	point min/max/comparison instructions.
	* gcc.target/powerpc/p9-minmax-2.c: Likewise.

Index: gcc/config/rs6000/rs6000.c
===================================================================
--- gcc/config/rs6000/rs6000.c (.../svn+ssh://meissner@gcc.gnu.org/svn/gcc/trunk/gcc/config/rs6000) (revision 236740)
+++ gcc/config/rs6000/rs6000.c (.../gcc/config/rs6000) (working copy)
@@ -22637,6 +22637,101 @@ rs6000_emit_vector_cond_expr (rtx dest,
   return 1;
 }
 
+/* ISA 3.0 (power9) minmax subcase to emit a XSMAXCDP or XSMINCDP instruction
+   for SF/DF scalars.  Move TRUE_COND to DEST if OP of the operands of the last
+   comparison is nonzero/true, FALSE_COND if it is zero/false.  Return 0 if the
+   hardware has no such operation.  */
+
+static int
+rs6000_emit_p9_fp_minmax (rtx dest, rtx op, rtx true_cond, rtx false_cond)
+{
+  enum rtx_code code = GET_CODE (op);
+  rtx op0 = XEXP (op, 0);
+  rtx op1 = XEXP (op, 1);
+  machine_mode compare_mode = GET_MODE (op0);
+  machine_mode result_mode = GET_MODE (dest);
+  bool max_p = false;
+
+  if (result_mode != compare_mode)
+    return 0;
+
+  if (code == GE || code == GT)
+    max_p = true;
+  else if (code == LE || code == LT)
+    max_p = false;
+  else
+    return 0;
+
+  if (rtx_equal_p (op0, true_cond) && rtx_equal_p (op1, false_cond))
+    ;
+
+  else if (rtx_equal_p (op1, true_cond) && rtx_equal_p (op0, false_cond))
+    max_p = !max_p;
+
+  else
+    return 0;
+
+  rs6000_emit_minmax (dest, max_p ? SMAX : SMIN, op0, op1);
+  return 1;
+}
+
+/* ISA 3.0 (power9) conditional move subcase to emit XSCMP{EQ,GE,GT,NE}DP and
+   XXSEL instructions for SF/DF scalars.  Move TRUE_COND to DEST if OP of the
+   operands of the last comparison is nonzero/true, FALSE_COND if it is
+   zero/false.  Return 0 if the hardware has no such operation.  */
+
+static int
+rs6000_emit_p9_fp_cmove (rtx dest, rtx op, rtx true_cond, rtx false_cond)
+{
+  enum rtx_code code = GET_CODE (op);
+  rtx op0 = XEXP (op, 0);
+  rtx op1 = XEXP (op, 1);
+  machine_mode result_mode = GET_MODE (dest);
+  rtx compare_rtx;
+  rtx cmove_rtx;
+  rtx clobber_rtx;
+
+  if (!can_create_pseudo_p ())
+    return 0;
+
+  switch (code)
+    {
+    case EQ:
+    case GE:
+    case GT:
+      break;
+
+    case NE:
+    case LT:
+    case LE:
+      code = swap_condition (code);
+      std::swap (op0, op1);
+      break;
+
+    default:
+      return 0;
+    }
+
+  /* Generate: [(parallel [(set (dest)
+                                (if_then_else (op (cmp1) (cmp2))
+                                              (true)
+                                              (false)))
+                           (clobber (scratch))])].  */
+
+  compare_rtx = gen_rtx_fmt_ee (code, CCFPmode, op0, op1);
+  cmove_rtx = gen_rtx_SET (dest,
+                           gen_rtx_IF_THEN_ELSE (result_mode,
+                                                 compare_rtx,
+                                                 true_cond,
+                                                 false_cond));
+
+  clobber_rtx = gen_rtx_CLOBBER (VOIDmode, gen_rtx_SCRATCH (V2DImode));
+  emit_insn (gen_rtx_PARALLEL (VOIDmode,
+                               gen_rtvec (2, cmove_rtx, clobber_rtx)));
+
+  return 1;
+}
+
 /* Emit a conditional move: move TRUE_COND to DEST if OP of the
    operands of the last comparison is nonzero/true, FALSE_COND if it
    is zero/false.  Return 0 if the hardware has no such operation.  */
@@ -22663,6 +22758,18 @@ rs6000_emit_cmove (rtx dest, rtx op, rtx
   if (GET_MODE (false_cond) != result_mode)
     return 0;
 
+  /* See if we can use the ISA 3.0 (power9) min/max/compare functions.  */
+  if (TARGET_P9_MINMAX
+      && (compare_mode == SFmode || compare_mode == DFmode)
+      && (result_mode == SFmode || result_mode == DFmode))
+    {
+      if (rs6000_emit_p9_fp_minmax (dest, op, true_cond, false_cond))
+        return 1;
+
+      if (rs6000_emit_p9_fp_cmove (dest, op, true_cond, false_cond))
+        return 1;
+    }
+
   /* Don't allow using floating point comparisons for integer results for
      now.  */
   if (FLOAT_MODE_P (compare_mode) && !FLOAT_MODE_P (result_mode))
Index: gcc/config/rs6000/rs6000.md
===================================================================
--- gcc/config/rs6000/rs6000.md (.../svn+ssh://meissner@gcc.gnu.org/svn/gcc/trunk/gcc/config/rs6000) (revision 236740)
+++ gcc/config/rs6000/rs6000.md (.../gcc/config/rs6000) (working copy)
@@ -489,6 +489,10 @@ (define_mode_iterator RECIPF [SF DF V4SF
 ; Iterator for just SF/DF
 (define_mode_iterator SFDF [SF DF])
 
+; Like SFDF, but a different name to match conditional move where the
+; comparison operands may be a different mode than the input operands.
+(define_mode_iterator SFDF2 [SF DF])
+
 ; Iterator for 128-bit floating point that uses the IBM double-double format
 (define_mode_iterator IBM128 [(IF "FLOAT128_IBM_P (IFmode)")
                               (TF "FLOAT128_IBM_P (TFmode)")])
@@ -700,6 +704,15 @@ (define_mode_attr BOOL_REGS_UNARY [(TI "
 (define_mode_iterator RELOAD [V16QI V8HI V4SI V2DI V4SF V2DF V1TI
                               SF SD SI DF DD DI TI PTI KF IF TF])
 
+;; Iterate over smin, smax
+(define_code_iterator fp_minmax [smin smax])
+
+(define_code_attr minmax [(smin "min")
+                          (smax "max")])
+
+(define_code_attr SMINMAX [(smin "SMIN")
+                           (smax "SMAX")])
+
 
 ;; Start with fixed-point load and store insns.  Here we put only the more
 ;; complex forms.  Basic data transfer is done later.
@@ -4629,74 +4642,45 @@ (define_insn "copysign3_fcpsgn"
 ;; On VSX, we only check for TARGET_VSX instead of checking for a vsx/p8 vector
 ;; to allow either DF/SF to use only traditional registers.
-(define_expand "smax3" +(define_expand "s3" [(set (match_operand:SFDF 0 "gpc_reg_operand" "") - (if_then_else:SFDF (ge (match_operand:SFDF 1 "gpc_reg_operand" "") - (match_operand:SFDF 2 "gpc_reg_operand" "")) - (match_dup 1) - (match_dup 2)))] - "TARGET__FPR && TARGET_PPC_GFXOPT && !flag_trapping_math" + (fp_minmax:SFDF (match_operand:SFDF 1 "gpc_reg_operand" "") + (match_operand:SFDF 2 "gpc_reg_operand" "")))] + "TARGET_MINMAX_" { - rs6000_emit_minmax (operands[0], SMAX, operands[1], operands[2]); + rs6000_emit_minmax (operands[0], , operands[1], operands[2]); DONE; }) -(define_insn "*smax3_vsx" - [(set (match_operand:SFDF 0 "gpc_reg_operand" "=,") - (smax:SFDF (match_operand:SFDF 1 "gpc_reg_operand" "%,") - (match_operand:SFDF 2 "gpc_reg_operand" ",")))] - "TARGET__FPR && TARGET_VSX" - "xsmaxdp %x0,%x1,%x2" - [(set_attr "type" "fp")]) - -(define_expand "smin3" - [(set (match_operand:SFDF 0 "gpc_reg_operand" "") - (if_then_else:SFDF (ge (match_operand:SFDF 1 "gpc_reg_operand" "") - (match_operand:SFDF 2 "gpc_reg_operand" "")) - (match_dup 2) - (match_dup 1)))] - "TARGET__FPR && TARGET_PPC_GFXOPT && !flag_trapping_math" +(define_insn "*s3_vsx" + [(set (match_operand:SFDF 0 "vsx_register_operand" "=") + (fp_minmax:SFDF (match_operand:SFDF 1 "vsx_register_operand" "") + (match_operand:SFDF 2 "vsx_register_operand" "")))] + "TARGET_VSX && TARGET__FPR" { - rs6000_emit_minmax (operands[0], SMIN, operands[1], operands[2]); - DONE; -}) - -(define_insn "*smin3_vsx" - [(set (match_operand:SFDF 0 "gpc_reg_operand" "=,") - (smin:SFDF (match_operand:SFDF 1 "gpc_reg_operand" "%,") - (match_operand:SFDF 2 "gpc_reg_operand" ",")))] - "TARGET__FPR && TARGET_VSX" - "xsmindp %x0,%x1,%x2" + return (TARGET_P9_MINMAX + ? "xscdp %x0,%x1,%x2" + : "xsdp %x0,%x1,%x2"); +} [(set_attr "type" "fp")]) -(define_split +;; The conditional move instructions allow us to perform max and min operations +;; even when we don't have the appropriate max/min instruction using the FSEL +;; instruction. 
+ +(define_insn_and_split "*s3_fpr" [(set (match_operand:SFDF 0 "gpc_reg_operand" "") - (match_operator:SFDF 3 "min_max_operator" - [(match_operand:SFDF 1 "gpc_reg_operand" "") - (match_operand:SFDF 2 "gpc_reg_operand" "")]))] - "TARGET__FPR && TARGET_PPC_GFXOPT && !flag_trapping_math - && !TARGET_VSX" + (fp_minmax:SFDF (match_operand:SFDF 1 "gpc_reg_operand" "") + (match_operand:SFDF 2 "gpc_reg_operand" "")))] + "!TARGET_VSX && TARGET_MINMAX_" + "#" + "&& 1" [(const_int 0)] { - rs6000_emit_minmax (operands[0], GET_CODE (operands[3]), operands[1], - operands[2]); + rs6000_emit_minmax (operands[0], , operands[1], operands[2]); DONE; }) -(define_split - [(set (match_operand:SF 0 "gpc_reg_operand" "") - (match_operator:SF 3 "min_max_operator" - [(match_operand:SF 1 "gpc_reg_operand" "") - (match_operand:SF 2 "gpc_reg_operand" "")]))] - "TARGET_PPC_GFXOPT && TARGET_HARD_FLOAT && TARGET_FPRS - && TARGET_SINGLE_FLOAT && !flag_trapping_math" - [(const_int 0)] - " -{ rs6000_emit_minmax (operands[0], GET_CODE (operands[3]), - operands[1], operands[2]); - DONE; -}") - (define_expand "movcc" [(set (match_operand:GPR 0 "gpc_reg_operand" "") (if_then_else:GPR (match_operand 1 "comparison_operator" "") @@ -4779,12 +4763,13 @@ (define_insn "*isel_reversed_unsigned_cc" + [(set (match_operand:SFDF 0 "gpc_reg_operand" "") + (if_then_else:SFDF (match_operand 1 "comparison_operator" "") + (match_operand:SFDF 2 "gpc_reg_operand" "") + (match_operand:SFDF 3 "gpc_reg_operand" "")))] + "TARGET__FPR && TARGET_PPC_GFXOPT" " { if (rs6000_emit_cmove (operands[0], operands[1], operands[2], operands[3])) @@ -4793,76 +4778,70 @@ (define_expand "movsfcc" FAIL; }") -(define_insn "*fselsfsf4" - [(set (match_operand:SF 0 "gpc_reg_operand" "=f") - (if_then_else:SF (ge (match_operand:SF 1 "gpc_reg_operand" "f") - (match_operand:SF 4 "zero_fp_constant" "F")) - (match_operand:SF 2 "gpc_reg_operand" "f") - (match_operand:SF 3 "gpc_reg_operand" "f")))] - "TARGET_PPC_GFXOPT && TARGET_HARD_FLOAT && 
TARGET_FPRS && TARGET_SINGLE_FLOAT" - "fsel %0,%1,%2,%3" - [(set_attr "type" "fp")]) - -(define_insn "*fseldfsf4" - [(set (match_operand:SF 0 "gpc_reg_operand" "=f") - (if_then_else:SF (ge (match_operand:DF 1 "gpc_reg_operand" "d") - (match_operand:DF 4 "zero_fp_constant" "F")) - (match_operand:SF 2 "gpc_reg_operand" "f") - (match_operand:SF 3 "gpc_reg_operand" "f")))] - "TARGET_PPC_GFXOPT && TARGET_HARD_FLOAT && TARGET_FPRS && TARGET_DOUBLE_FLOAT && TARGET_SINGLE_FLOAT" +(define_insn "*fsel4" + [(set (match_operand:SFDF 0 "fpr_reg_operand" "=&") + (if_then_else:SFDF + (ge (match_operand:SFDF2 1 "fpr_reg_operand" "") + (match_operand:SFDF2 4 "zero_fp_constant" "F")) + (match_operand:SFDF 2 "fpr_reg_operand" "") + (match_operand:SFDF 3 "fpr_reg_operand" "")))] + "TARGET__FPR && TARGET_PPC_GFXOPT" "fsel %0,%1,%2,%3" [(set_attr "type" "fp")]) -;; The conditional move instructions allow us to perform max and min -;; operations even when - -(define_split - [(set (match_operand:DF 0 "gpc_reg_operand" "") - (match_operator:DF 3 "min_max_operator" - [(match_operand:DF 1 "gpc_reg_operand" "") - (match_operand:DF 2 "gpc_reg_operand" "")]))] - "TARGET_PPC_GFXOPT && TARGET_HARD_FLOAT && TARGET_FPRS && TARGET_DOUBLE_FLOAT - && !flag_trapping_math" - [(const_int 0)] - " -{ rs6000_emit_minmax (operands[0], GET_CODE (operands[3]), - operands[1], operands[2]); - DONE; -}") - -(define_expand "movdfcc" - [(set (match_operand:DF 0 "gpc_reg_operand" "") - (if_then_else:DF (match_operand 1 "comparison_operator" "") - (match_operand:DF 2 "gpc_reg_operand" "") - (match_operand:DF 3 "gpc_reg_operand" "")))] - "TARGET_PPC_GFXOPT && TARGET_HARD_FLOAT && TARGET_FPRS && TARGET_DOUBLE_FLOAT" - " -{ - if (rs6000_emit_cmove (operands[0], operands[1], operands[2], operands[3])) - DONE; - else - FAIL; -}") +(define_insn_and_split "*movcc_p9" + [(set (match_operand:SFDF 0 "vsx_register_operand" "=&,") + (if_then_else:SFDF + (match_operator:CCFP 1 "fpmask_comparison_operator" + [(match_operand:SFDF2 
2 "vsx_register_operand" ",") + (match_operand:SFDF2 3 "vsx_register_operand" ",")]) + (match_operand:SFDF 4 "vsx_register_operand" ",") + (match_operand:SFDF 5 "vsx_register_operand" ","))) + (clobber (match_scratch:V2DI 6 "=0,&wa"))] + "TARGET_P9_MINMAX" + "#" + "" + [(set (match_dup 6) + (if_then_else:V2DI (match_dup 1) + (match_dup 7) + (match_dup 8))) + (set (match_dup 0) + (if_then_else:SFDF (ne (match_dup 6) + (match_dup 8)) + (match_dup 4) + (match_dup 5)))] +{ + if (GET_CODE (operands[6]) == SCRATCH) + operands[6] = gen_reg_rtx (V2DImode); + + operands[7] = CONSTM1_RTX (V2DImode); + operands[8] = CONST0_RTX (V2DImode); +} + [(set_attr "length" "8") + (set_attr "type" "vecperm")]) + +(define_insn "*fpmask" + [(set (match_operand:V2DI 0 "vsx_register_operand" "=wa") + (if_then_else:V2DI + (match_operator:CCFP 1 "fpmask_comparison_operator" + [(match_operand:SFDF 2 "vsx_register_operand" "") + (match_operand:SFDF 3 "vsx_register_operand" "")]) + (match_operand:V2DI 4 "all_ones_constant" "") + (match_operand:V2DI 5 "zero_constant" "")))] + "TARGET_P9_MINMAX" + "xscmp%V1dp %x0,%x2,%x3" + [(set_attr "type" "fpcompare")]) -(define_insn "*fseldfdf4" - [(set (match_operand:DF 0 "gpc_reg_operand" "=d") - (if_then_else:DF (ge (match_operand:DF 1 "gpc_reg_operand" "d") - (match_operand:DF 4 "zero_fp_constant" "F")) - (match_operand:DF 2 "gpc_reg_operand" "d") - (match_operand:DF 3 "gpc_reg_operand" "d")))] - "TARGET_PPC_GFXOPT && TARGET_HARD_FLOAT && TARGET_FPRS && TARGET_DOUBLE_FLOAT" - "fsel %0,%1,%2,%3" - [(set_attr "type" "fp")]) +(define_insn "*xxsel" + [(set (match_operand:SFDF 0 "vsx_register_operand" "=") + (if_then_else:SFDF (ne (match_operand:V2DI 1 "vsx_register_operand" "wa") + (match_operand:V2DI 2 "zero_constant" "")) + (match_operand:SFDF 3 "vsx_register_operand" "") + (match_operand:SFDF 4 "vsx_register_operand" "")))] + "TARGET_P9_MINMAX" + "xxsel %x0,%x1,%x3,%x4" + [(set_attr "type" "vecperm")]) -(define_insn "*fselsfdf4" - [(set (match_operand:DF 0 
"gpc_reg_operand" "=d") - (if_then_else:DF (ge (match_operand:SF 1 "gpc_reg_operand" "f") - (match_operand:SF 4 "zero_fp_constant" "F")) - (match_operand:DF 2 "gpc_reg_operand" "d") - (match_operand:DF 3 "gpc_reg_operand" "d")))] - "TARGET_PPC_GFXOPT && TARGET_HARD_FLOAT && TARGET_FPRS && TARGET_DOUBLE_FLOAT && TARGET_SINGLE_FLOAT" - "fsel %0,%1,%2,%3" - [(set_attr "type" "fp")]) ;; Conversions to and from floating-point. Index: gcc/config/rs6000/predicates.md =================================================================== --- gcc/config/rs6000/predicates.md (.../svn+ssh://meissner@gcc.gnu.org/svn/gcc/trunk/gcc/config/rs6000) (revision 236740) +++ gcc/config/rs6000/predicates.md (.../gcc/config/rs6000) (working copy) @@ -1109,10 +1109,6 @@ (define_predicate "boolean_or_operator" (define_special_predicate "equality_operator" (match_code "eq,ne")) -;; Return true if operand is MIN or MAX operator. -(define_predicate "min_max_operator" - (match_code "smin,smax,umin,umax")) - ;; Return 1 if OP is a comparison operation that is valid for a branch ;; instruction. We check the opcode against the mode of the CC value. ;; validate_condition_mode is an assertion. @@ -1155,6 +1151,11 @@ (define_predicate "scc_rev_comparison_op (and (match_operand 0 "branch_comparison_operator") (match_code "ne,le,ge,leu,geu,ordered"))) +;; Return 1 if OP is a comparison operator suitable for vector/scalar +;; comparisons that generate a -1/0 mask. +(define_predicate "fpmask_comparison_operator" + (match_code "eq,gt,ge")) + ;; Return 1 if OP is a comparison operation that is valid for a branch ;; insn, which is true if the corresponding bit in the CC register is set. 
 (define_predicate "branch_positive_comparison_operator"
Index: gcc/config/rs6000/rs6000.h
===================================================================
--- gcc/config/rs6000/rs6000.h (.../svn+ssh://meissner@gcc.gnu.org/svn/gcc/trunk/gcc/config/rs6000) (revision 236740)
+++ gcc/config/rs6000/rs6000.h (.../gcc/config/rs6000) (working copy)
@@ -594,6 +594,15 @@ extern int rs6000_vector_align[];
    in the register.  */
 #define TARGET_NO_SDMODE_STACK (TARGET_LFIWZX && TARGET_STFIWX && TARGET_DFP)
 
+/* ISA 3.0 has new min/max functions that don't need fast math that are being
+   phased in.  Min/max using FSEL or XSMAXDP/XSMINDP do not return the correct
+   answers if the arguments are not in the normal range.  */
+#define TARGET_MINMAX_SF (TARGET_SF_FPR && TARGET_PPC_GFXOPT \
+                          && (TARGET_P9_MINMAX || !flag_trapping_math))
+
+#define TARGET_MINMAX_DF (TARGET_DF_FPR && TARGET_PPC_GFXOPT \
+                          && (TARGET_P9_MINMAX || !flag_trapping_math))
+
 /* In switching from using target_flags to using rs6000_isa_flags, the options
    machinery creates OPTION_MASK_ instead of MASK_.  For now map
    OPTION_MASK_ back into MASK_.  */
Index: gcc/testsuite/gcc.target/powerpc/p9-minmax-1.c
===================================================================
--- gcc/testsuite/gcc.target/powerpc/p9-minmax-1.c (.../svn+ssh://meissner@gcc.gnu.org/svn/gcc/trunk/gcc/testsuite/gcc.target/powerpc) (revision 0)
+++ gcc/testsuite/gcc.target/powerpc/p9-minmax-1.c (.../gcc/testsuite/gcc.target/powerpc) (revision 236741)
@@ -0,0 +1,171 @@
+/* { dg-do compile { target { powerpc*-*-* } } } */
+/* { dg-skip-if "do not override -mcpu" { powerpc*-*-* } { "-mcpu=*" } { "-mcpu=power9" } } */
+/* { dg-require-effective-target powerpc_p9vector_ok } */
+/* { dg-options "-mcpu=power9 -O2 -mpower9-minmax -ffast-math" } */
+/* { dg-final { scan-assembler-not "fsel" } } */
+/* { dg-final { scan-assembler "xscmpeqdp" } } */
+/* { dg-final { scan-assembler "xscmpgtdp" } } */
+/* { dg-final { scan-assembler "xscmpgedp" } } */
+/* { dg-final { scan-assembler-not "xscmpodp" } } */
+/* { dg-final { scan-assembler-not "xscmpudp" } } */
+/* { dg-final { scan-assembler "xsmaxcdp" } } */
+/* { dg-final { scan-assembler-not "xsmaxdp" } } */
+/* { dg-final { scan-assembler "xsmincdp" } } */
+/* { dg-final { scan-assembler-not "xsmindp" } } */
+/* { dg-final { scan-assembler "xxsel" } } */
+
+double
+dbl_max1 (double a, double b)
+{
+  return (a >= b) ? a : b;
+}
+
+double
+dbl_max2 (double a, double b)
+{
+  return (a > b) ? a : b;
+}
+
+double
+dbl_min1 (double a, double b)
+{
+  return (a < b) ? a : b;
+}
+
+double
+dbl_min2 (double a, double b)
+{
+  return (a <= b) ? a : b;
+}
+
+double
+dbl_cmp_eq (double a, double b, double c, double d)
+{
+  return (a == b) ? c : d;
+}
+
+double
+dbl_cmp_ne (double a, double b, double c, double d)
+{
+  return (a != b) ? c : d;
+}
+
+double
+dbl_cmp_gt (double a, double b, double c, double d)
+{
+  return (a > b) ? c : d;
+}
+
+double
+dbl_cmp_ge (double a, double b, double c, double d)
+{
+  return (a >= b) ? c : d;
+}
+
+double
+dbl_cmp_lt (double a, double b, double c, double d)
+{
+  return (a < b) ? c : d;
+}
+
+double
+dbl_cmp_le (double a, double b, double c, double d)
+{
+  return (a <= b) ? c : d;
+}
+
+float
+flt_max1 (float a, float b)
+{
+  return (a >= b) ? a : b;
+}
+
+float
+flt_max2 (float a, float b)
+{
+  return (a > b) ? a : b;
+}
+
+float
+flt_min1 (float a, float b)
+{
+  return (a < b) ? a : b;
+}
+
+float
+flt_min2 (float a, float b)
+{
+  return (a <= b) ? a : b;
+}
+
+float
+flt_cmp_eq (float a, float b, float c, float d)
+{
+  return (a == b) ? c : d;
+}
+
+float
+flt_cmp_ne (float a, float b, float c, float d)
+{
+  return (a != b) ? c : d;
+}
+
+float
+flt_cmp_gt (float a, float b, float c, float d)
+{
+  return (a > b) ? c : d;
+}
+
+float
+flt_cmp_ge (float a, float b, float c, float d)
+{
+  return (a >= b) ? c : d;
+}
+
+float
+flt_cmp_lt (float a, float b, float c, float d)
+{
+  return (a < b) ? c : d;
+}
+
+float
+flt_cmp_le (float a, float b, float c, float d)
+{
+  return (a <= b) ? c : d;
+}
+
+double
+dbl_flt_max1 (float a, float b)
+{
+  return (a > b) ? a : b;
+}
+
+double
+dbl_flt_max2 (double a, float b)
+{
+  return (a > b) ? a : b;
+}
+
+double
+dbl_flt_max3 (float a, double b)
+{
+  return (a > b) ? a : b;
+}
+
+double
+dbl_flt_min1 (float a, float b)
+{
+  return (a < b) ? a : b;
+}
+
+double
+dbl_flt_min2 (double a, float b)
+{
+  return (a < b) ? a : b;
+}
+
+double
+dbl_flt_min3 (float a, double b)
+{
+  return (a < b) ? a : b;
+}
Index: gcc/testsuite/gcc.target/powerpc/p9-minmax-2.c
===================================================================
--- gcc/testsuite/gcc.target/powerpc/p9-minmax-2.c (.../svn+ssh://meissner@gcc.gnu.org/svn/gcc/trunk/gcc/testsuite/gcc.target/powerpc) (revision 0)
+++ gcc/testsuite/gcc.target/powerpc/p9-minmax-2.c (.../gcc/testsuite/gcc.target/powerpc) (revision 236741)
@@ -0,0 +1,191 @@
+/* { dg-do compile { target { powerpc*-*-* } } } */
+/* { dg-skip-if "do not override -mcpu" { powerpc*-*-* } { "-mcpu=*" } { "-mcpu=power9" } } */
+/* { dg-require-effective-target powerpc_p9vector_ok } */
+/* { dg-options "-mcpu=power9 -O2 -mpower9-minmax" } */
+/* { dg-final { scan-assembler-not "fsel" } } */
+/* { dg-final { scan-assembler "xscmpeqdp" } } */
+/* { dg-final { scan-assembler "xscmpgtdp" } } */
+/* { dg-final { scan-assembler-not "xscmpodp" } } */
+/* { dg-final { scan-assembler-not "xscmpudp" } } */
+/* { dg-final { scan-assembler "xsmaxcdp" } } */
+/* { dg-final { scan-assembler-not "xsmaxdp" } } */
+/* { dg-final { scan-assembler "xsmincdp" } } */
+/* { dg-final { scan-assembler-not "xsmindp" } } */
+/* { dg-final { scan-assembler "xxsel" } } */
+
+/* Due to NaN support, <= and >= are not handled presently unless -ffast-math
+   is used.  At some point this will be fixed and the xscmpgedp instruction can
+   be generated normally.  The <= and >= tests are bracketed with
+   #ifdef DO_GE_LE.  */
+
+#ifdef DO_GE_LE
+double
+dbl_max1 (double a, double b)
+{
+  return (a >= b) ? a : b;
+}
+#endif
+
+double
+dbl_max2 (double a, double b)
+{
+  return (a > b) ? a : b;
+}
+
+double
+dbl_min1 (double a, double b)
+{
+  return (a < b) ? a : b;
+}
+
+#ifdef DO_GE_LE
+double
+dbl_min2 (double a, double b)
+{
+  return (a <= b) ? a : b;
+}
+#endif
+
+double
+dbl_cmp_eq (double a, double b, double c, double d)
+{
+  return (a == b) ? c : d;
+}
+
+double
+dbl_cmp_ne (double a, double b, double c, double d)
+{
+  return (a != b) ? c : d;
+}
+
+double
+dbl_cmp_gt (double a, double b, double c, double d)
+{
+  return (a > b) ? c : d;
+}
+
+#ifdef DO_GE_LE
+double
+dbl_cmp_ge (double a, double b, double c, double d)
+{
+  return (a >= b) ? c : d;
+}
+#endif
+
+double
+dbl_cmp_lt (double a, double b, double c, double d)
+{
+  return (a < b) ? c : d;
+}
+
+#ifdef DO_GE_LE
+double
+dbl_cmp_le (double a, double b, double c, double d)
+{
+  return (a <= b) ? c : d;
+}
+#endif
+
+#ifdef DO_GE_LE
+float
+flt_max1 (float a, float b)
+{
+  return (a >= b) ? a : b;
+}
+#endif
+
+float
+flt_max2 (float a, float b)
+{
+  return (a > b) ? a : b;
+}
+
+float
+flt_min1 (float a, float b)
+{
+  return (a < b) ? a : b;
+}
+
+#ifdef DO_GE_LE
+float
+flt_min2 (float a, float b)
+{
+  return (a <= b) ? a : b;
+}
+#endif
+
+float
+flt_cmp_eq (float a, float b, float c, float d)
+{
+  return (a == b) ? c : d;
+}
+
+float
+flt_cmp_ne (float a, float b, float c, float d)
+{
+  return (a != b) ? c : d;
+}
+
+float
+flt_cmp_gt (float a, float b, float c, float d)
+{
+  return (a > b) ? c : d;
+}
+
+#ifdef DO_GE_LE
+float
+flt_cmp_ge (float a, float b, float c, float d)
+{
+  return (a >= b) ? c : d;
+}
+#endif
+
+float
+flt_cmp_lt (float a, float b, float c, float d)
+{
+  return (a < b) ? c : d;
+}
+
+#ifdef DO_GE_LE
+float
+flt_cmp_le (float a, float b, float c, float d)
+{
+  return (a <= b) ? c : d;
+}
+#endif
+
+double
+dbl_flt_max1 (float a, float b)
+{
+  return (a > b) ? a : b;
+}
+
+double
+dbl_flt_max2 (double a, float b)
+{
+  return (a > b) ? a : b;
+}
+
+double
+dbl_flt_max3 (float a, double b)
+{
+  return (a > b) ? a : b;
+}
+
+double
+dbl_flt_min1 (float a, float b)
+{
+  return (a < b) ? a : b;
+}
+
+double
+dbl_flt_min2 (double a, float b)
+{
+  return (a < b) ? a : b;
+}
+
+double
+dbl_flt_min3 (float a, double b)
+{
+  return (a < b) ? a : b;
+}
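
As a reading aid for the conditional move tests above (not part of the patch),
here is a small C model of the compare-to-mask plus select sequence that the
ChangeLog describes.  mask_gt and select_bits are hypothetical stand-ins for
what the greater-than mask compare and XXSEL compute on one double; the real
instructions work on 128-bit VSX registers, which this sketch does not model.
It also shows the operand swap used for the less-than case, in the spirit of
the std::swap in rs6000_emit_p9_fp_cmove.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for the mask compare: all ones if A > B, else
   all zeros.  A NaN operand makes the comparison false, so the mask is 0.  */
static uint64_t
mask_gt (double a, double b)
{
  return (a > b) ? UINT64_MAX : 0;
}

/* Hypothetical stand-in for the select: take the bits of T where MASK is
   set and the bits of F where it is clear.  */
static double
select_bits (uint64_t mask, double t, double f)
{
  uint64_t tb, fb, rb;
  double r;

  memcpy (&tb, &t, sizeof tb);
  memcpy (&fb, &f, sizeof fb);
  rb = (tb & mask) | (fb & ~mask);
  memcpy (&r, &rb, sizeof r);
  return r;
}

/* (a > b) ? c : d  */
static double
cmove_gt (double a, double b, double c, double d)
{
  return select_bits (mask_gt (a, b), c, d);
}

/* (a < b) ? c : d, rewritten as (b > a) ? c : d by swapping the compare
   operands, so only the greater-than mask is needed.  */
static double
cmove_lt (double a, double b, double c, double d)
{
  return select_bits (mask_gt (b, a), c, d);
}
```

Because a < b and b > a are both false when either operand is a NaN, the swap
does not change the result for unordered inputs.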