From patchwork Thu May 5 19:18:39 2016
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Michael Meissner
X-Patchwork-Id: 619032
Date: Thu, 5 May 2016 15:18:39 -0400
From: Michael Meissner
To: gcc-patches@gcc.gnu.org, Segher Boessenkool, David Edelsohn, Bill Schmidt
Subject: [PATCH], Add PowerPC ISA 3.0 min/max support
Message-ID:
<20160505191839.GA7023@ibm-tiger.the-meissners.org>

This patch was originally meant for GCC 6.x, but the stage3 submission window
closed before I could submit it.

This patch adds support for the new ISA 3.0 instructions for doing min, max,
and comparison.  Unlike the existing XSMINDP and XSMAXDP instructions, the new
XSMINCDP and XSMAXCDP instructions do the right thing when one of the
arguments is a NaN (not a number).  This means these instructions can be
generated even if the -ffast-math switch is not used.

In addition, the instructions XSCMPEQDP, XSCMPGTDP, and XSCMPGEDP generate
either all 1's or all 0's (similar to the vector forms of the instructions),
which allows floating point conditional move sequences to be generated with
the comparison and XXSEL instructions.

At the present time, the code does not support comparisons involving >= and <=
unless the -ffast-math option is used.  I hope eventually to support
generating these instructions without -ffast-math.  The underlying reason is
that when fast math is not used, we change the condition from:

        (ge:SI (reg:CCFP ) (const_int 0))

to:

        (ior:SI (gt:SI (reg:CCFP ) (const_int 0))
                (eq:SI (reg:CCFP ) (const_int 0)))

The machine independent portion of the compiler does not recognize this when
trying to generate conditional moves.  I would imagine the 'fix' is to
generate GE/LE all of the time, and then have a splitter that converts it to
an IOR of GT/EQ if it is not a conditional move with ISA 3.0 instructions.

I have bootstrapped the compiler and there were no regressions.  Is it ok to
apply to the trunk?
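As a rough illustration (not part of the patch), the source-level semantics described above can be sketched in portable C. The first two functions are the min/max idiom used by the new tests; because every ordered comparison with a NaN is false, they return the second operand whenever either input is a NaN, which is presumably the behavior the new instructions are meant to match. The helpers mask_from, select_bits, and cmove_eq are names made up for this sketch; they only model the "compare to an all 1's/all 0's mask, then select" idea and are not the compiler's actual code.

```c
#include <assert.h>
#include <math.h>
#include <stdint.h>
#include <string.h>

/* Min/max written as conditional expressions, as in the new test cases.
   Any ordered comparison involving a NaN is false, so B is returned
   whenever either input is a NaN.  */
static double cond_max (double a, double b) { return (a > b) ? a : b; }
static double cond_min (double a, double b) { return (a < b) ? a : b; }

/* Hypothetical helper: turn a comparison result into an all 1's or
   all 0's 64-bit mask.  */
static uint64_t mask_from (int cond) { return cond ? ~UINT64_C (0) : 0; }

/* Hypothetical helper: bitwise select between two doubles under MASK.  */
static double
select_bits (uint64_t mask, double if_true, double if_false)
{
  uint64_t t, f, r;
  double result;

  memcpy (&t, &if_true, sizeof t);
  memcpy (&f, &if_false, sizeof f);
  r = (t & mask) | (f & ~mask);
  memcpy (&result, &r, sizeof result);
  return result;
}

/* (a == b) ? c : d expressed as compare-to-mask followed by select.  */
static double
cmove_eq (double a, double b, double c, double d)
{
  return select_bits (mask_from (a == b), c, d);
}

/* Small self-check of the NaN behavior described above.  */
int
run_checks (void)
{
  assert (isnan (cond_max (1.0, NAN)));
  assert (cond_max (NAN, 1.0) == 1.0);
  assert (cmove_eq (NAN, NAN, 3.0, 4.0) == 4.0);
  return 0;
}
```

The mask-and-select form has no branch, which is the point of pairing a mask-generating compare with a select instruction.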
Note, this patch is independent of the vector d-form support patch I just
submitted.

[gcc]
2016-05-05  Michael Meissner

        * config/rs6000/predicates.md (all_ones_constant): New predicate to
        recognize a vector/integer constant that is all 1's.
        (min_max_operator): Don't match umin or umax, since it is only used
        for floating point min/maxes.
        (fpmask_comparison_operator): New predicate for returning true if a
        comparison operator can generate a -1/0 mask.
        * config/rs6000/rs6000.c (print_operand): Add support for ISA 3.0
        floating point min/max instructions.
        (rs6000_emit_power9_minmax): New function to generate ISA 3.0
        XSMAXCDP and XSMINCDP instructions.
        (rs6000_emit_power9_cmove): New function to generate ISA 3.0
        XSCMP{EQ,GE,GT,NE}DP instructions and XXSEL to speed up floating
        point conditional moves.
        (rs6000_emit_cmove): Add support for ISA 3.0 floating point
        conditional moves and min/max instructions.
        * config/rs6000/rs6000.h (TARGET_MINMAX_SF): New target macros to
        say whether the machine has floating point min/maxes that can be
        generated directly.
        (TARGET_MINMAX_DF): Likewise.
        (PRINT_OPERAND_PUNCT_VALID_P): Add '@' for ISA 3.0 min/max.
        * config/rs6000/rs6000.md (SFDF2): New iterator to allow mixed mode
        floating point conditional moves.
        (fp_minmax): New code iterator for ISA 3.0 min/max support.
        (minmax): New code attributes for ISA 3.0 min/max support.
        (SMINMAX): Likewise.
        (smax3): Rework floating point min/max to be a combined insn using
        code iterators for min and max.  Add support for ISA 3.0 min/max
        instructions.
        (s3): Likewise.
        (smax3_vsx): Likewise.
        (s3_vsx): Likewise.
        (smin3): Likewise.
        (smin3_vsx): Likewise.
        (*sdfsf3_vsx_1): Add support for min/max where one or both operands
        are floats that are promoted to double.
        (sdfsf3_vsx_2): Likewise.
        (sdfsf3_vsx_3): Likewise.
        (sdfsf3_vsx_4): Likewise.
        (min/max splitter, non-VSX): Use TARGET_MINMAX_.
        (SF conditional move splitter): Merge SF and DF insns for generating
        conditional move into one insn.
        (movsfcc): Likewise.
        (movcc): Likewise.
        (fselsfsf4): Merge SF/DF conditional move variants into one insn
        that handles different types for comparison and move.
        (fseldfsf4): Likewise.
        (DF conditional move splitter): Likewise.
        (movdfcc): Likewise.
        (fseldfdf4): Likewise.
        (fselsfdf4): Likewise.
        (movcc_p9): New floating point conditional move support for ISA 3.0.
        (fpmask): Likewise.
        (xxsel): Likewise.
        (lfiwax): Correct scratch constraint to be wi, since we don't
        require direct move support.
        (lfiwzx): Likewise.
        (floatsi2_lfiwax_mem): Combine alternatives into a single
        alternative.
        (floatunssi2_lfiwzx_mem): Likewise.
        (fix_truncsi2): Fix comment.
        (fix_truncdi2_fctidz): Allow any VSX register instead of Altivec
        register as the second alternative.
        (fixuns_truncdi2_fctiduz): Likewise.

[gcc/testsuite]
2016-05-05  Michael Meissner

        * gcc.target/powerpc/p9-minmax-1.c: New tests for ISA 3.0 min/max
        and conditional move support.
        * gcc.target/powerpc/p9-minmax-2.c: Likewise.

Index: gcc/config/rs6000/predicates.md
===================================================================
--- gcc/config/rs6000/predicates.md (.../svn+ssh://meissner@gcc.gnu.org/svn/gcc/trunk/gcc/config/rs6000) (revision 235831)
+++ gcc/config/rs6000/predicates.md (.../gcc/config/rs6000) (working copy)
@@ -669,6 +669,11 @@ (define_predicate "zero_constant"
   (and (match_code "const_int,const_double,const_wide_int,const_vector")
        (match_test "op == CONST0_RTX (mode)")))
 
+;; Return 1 if operand is constant -1 (scalars and vectors).
+(define_predicate "all_ones_constant"
+  (and (match_code "const_int,const_double,const_wide_int,const_vector")
+       (match_test "op == CONSTM1_RTX (mode) && !FLOAT_MODE_P (mode)")))
+
 ;; Return 1 if operand is 0.0.
 (define_predicate "zero_fp_constant"
   (and (match_code "const_double")
@@ -1091,9 +1096,11 @@ (define_predicate "boolean_or_operator"
 (define_special_predicate "equality_operator"
   (match_code "eq,ne"))
 
-;; Return true if operand is MIN or MAX operator.
+;; Return true if operand is MIN or MAX operator.  Since this is only used to
+;; convert floating point MIN/MAX operations into FSEL on pre-vsx systems,
+;; don't include UMIN or UMAX.
 (define_predicate "min_max_operator"
-  (match_code "smin,smax,umin,umax"))
+  (match_code "smin,smax"))
 
 ;; Return 1 if OP is a comparison operation that is valid for a branch
 ;; instruction.  We check the opcode against the mode of the CC value.
@@ -1137,6 +1144,11 @@ (define_predicate "scc_rev_comparison_op
   (and (match_operand 0 "branch_comparison_operator")
        (match_code "ne,le,ge,leu,geu,ordered")))
 
+;; Return 1 if OP is a comparison operator suitable for vector/scalar
+;; comparisons that generate a -1/0 mask.
+(define_predicate "fpmask_comparison_operator"
+  (match_code "eq,gt,ge"))
+
 ;; Return 1 if OP is a comparison operation that is valid for a branch
 ;; insn, which is true if the corresponding bit in the CC register is set.
 (define_predicate "branch_positive_comparison_operator"
Index: gcc/config/rs6000/rs6000.c
===================================================================
--- gcc/config/rs6000/rs6000.c (.../svn+ssh://meissner@gcc.gnu.org/svn/gcc/trunk/gcc/config/rs6000) (revision 235831)
+++ gcc/config/rs6000/rs6000.c (.../gcc/config/rs6000) (working copy)
@@ -20534,6 +20534,12 @@ print_operand (FILE *file, rtx x, int co
                  "local dynamic TLS references");
       return;
 
+    case '@':
+      /* If -mpower9-minmax, use xsmaxcdp instead of xsmaxdp.  */
+      if (TARGET_P9_MINMAX)
+        putc ('c', file);
+      return;
+
     default:
       output_operand_lossage ("invalid %%xn code");
     }
@@ -21995,6 +22001,114 @@ rs6000_emit_vector_cond_expr (rtx dest,
   return 1;
 }
 
+/* ISA 3.0 (power9) minmax subcase to emit a XSMAXCDP or XSMINCDP instruction
+   for SF/DF scalars.  Move TRUE_COND to DEST if OP of the operands of the last
+   comparison is nonzero/true, FALSE_COND if it is zero/false.  Return 0 if the
+   hardware has no such operation.  */
+
+static int
+rs6000_emit_power9_minmax (rtx dest, rtx op, rtx true_cond, rtx false_cond)
+{
+  enum rtx_code code = GET_CODE (op);
+  rtx op0 = XEXP (op, 0);
+  rtx op1 = XEXP (op, 1);
+  machine_mode compare_mode = GET_MODE (op0);
+  machine_mode result_mode = GET_MODE (dest);
+  bool max_p = false;
+
+  if (result_mode != compare_mode)
+    return 0;
+
+  if (code == GE || code == GT)
+    max_p = true;
+  else if (code == LE || code == LT)
+    max_p = false;
+  else
+    return 0;
+
+  if (rtx_equal_p (op0, true_cond) && rtx_equal_p (op1, false_cond))
+    ;
+
+  else if (rtx_equal_p (op1, true_cond) && rtx_equal_p (op0, false_cond))
+    max_p = !max_p;
+
+  else
+    return 0;
+
+  rs6000_emit_minmax (dest, (max_p) ? SMAX : SMIN, op0, op1);
+  return 1;
+}
+
+/* ISA 3.0 (power9) conditional move subcase to emit XSCMP{EQ,GE,GT,NE}DP and
+   XXSEL instructions for SF/DF scalars.  Move TRUE_COND to DEST if OP of the
+   operands of the last comparison is nonzero/true, FALSE_COND if it is
+   zero/false.  Return 0 if the hardware has no such operation.  */
+
+static int
+rs6000_emit_power9_cmove (rtx dest, rtx op, rtx true_cond, rtx false_cond)
+{
+  enum rtx_code code = GET_CODE (op);
+  rtx op0 = XEXP (op, 0);
+  rtx op1 = XEXP (op, 1);
+  machine_mode result_mode = GET_MODE (dest);
+  bool swap_p = false;
+  rtx compare_rtx;
+  rtx cmove_rtx;
+  rtx clobber_rtx;
+
+  if (!can_create_pseudo_p ())
+    return 0;
+
+  switch (code)
+    {
+    case EQ:
+    case GE:
+    case GT:
+      break;
+
+    case NE:
+      code = EQ;
+      swap_p = true;
+      break;
+
+    case LT:
+      code = GT;
+      swap_p = true;
+      break;
+
+    case LE:
+      code = GE;
+      swap_p = true;
+      break;
+
+    default:
+      return 0;
+    }
+
+  /* Generate:  [(parallel [(set (dest)
+                                 (if_then_else (op (cmp1) (cmp2))
+                                               (true)
+                                               (false)))
+                            (clobber (scratch))])].  */
+
+  if (swap_p)
+    compare_rtx = gen_rtx_fmt_ee (code, CCFPmode, op1, op0);
+  else
+    compare_rtx = gen_rtx_fmt_ee (code, CCFPmode, op0, op1);
+
+  cmove_rtx = gen_rtx_SET (dest,
+                           gen_rtx_IF_THEN_ELSE (result_mode,
+                                                 compare_rtx,
+                                                 true_cond,
+                                                 false_cond));
+
+  clobber_rtx = gen_rtx_CLOBBER (VOIDmode, gen_rtx_SCRATCH (V2DImode));
+  emit_insn (gen_rtx_PARALLEL (VOIDmode,
+                               gen_rtvec (2, cmove_rtx, clobber_rtx)));
+
+  return 1;
+}
+
 /* Emit a conditional move: move TRUE_COND to DEST if OP of the operands of the
    last comparison is nonzero/true, FALSE_COND if it is zero/false.  Return 0
    if the hardware has no such operation.  */
@@ -22021,6 +22135,18 @@ rs6000_emit_cmove (rtx dest, rtx op, rtx
   if (GET_MODE (false_cond) != result_mode)
     return 0;
 
+  /* See if we can use the ISA 3.0 (power9) min/max/compare functions.  */
+  if (TARGET_P9_MINMAX
+      && (compare_mode == SFmode || compare_mode == DFmode)
+      && (result_mode == SFmode || result_mode == DFmode))
+    {
+      if (rs6000_emit_power9_minmax (dest, op, true_cond, false_cond))
+        return 1;
+
+      if (rs6000_emit_power9_cmove (dest, op, true_cond, false_cond))
+        return 1;
+    }
+
   /* Don't allow using floating point comparisons for integer results for
      now.  */
   if (FLOAT_MODE_P (compare_mode) && !FLOAT_MODE_P (result_mode))
Index: gcc/config/rs6000/rs6000.h
===================================================================
--- gcc/config/rs6000/rs6000.h (.../svn+ssh://meissner@gcc.gnu.org/svn/gcc/trunk/gcc/config/rs6000) (revision 235831)
+++ gcc/config/rs6000/rs6000.h (.../gcc/config/rs6000) (working copy)
@@ -594,6 +594,15 @@ extern int rs6000_vector_align[];
    in the register.  */
 #define TARGET_NO_SDMODE_STACK (TARGET_LFIWZX && TARGET_STFIWX && TARGET_DFP)
 
+/* ISA 3.0 has new min/max functions that don't need fast math that are being
+   phased in.  Min/max using FSEL or XSMAXDP/XSMINDP do not return the correct
+   answers if the arguments are not in the normal range.  */
+#define TARGET_MINMAX_SF (TARGET_SF_FPR && TARGET_PPC_GFXOPT \
+                          && (TARGET_P9_MINMAX || !flag_trapping_math))
+
+#define TARGET_MINMAX_DF (TARGET_DF_FPR && TARGET_PPC_GFXOPT \
+                          && (TARGET_P9_MINMAX || !flag_trapping_math))
+
 /* In switching from using target_flags to using rs6000_isa_flags, the options
    machinery creates OPTION_MASK_ instead of MASK_.  For now map
    OPTION_MASK_ back into MASK_.  */
@@ -2606,7 +2615,7 @@ extern char rs6000_reg_names[][8]; /* re
 
 /* Define which CODE values are valid.  */
 
-#define PRINT_OPERAND_PUNCT_VALID_P(CODE) ((CODE) == '&')
+#define PRINT_OPERAND_PUNCT_VALID_P(CODE) ((CODE) == '&' || (CODE) == '@')
 
 /* Print a memory address as an operand to reference that memory location.  */
 
Index: gcc/config/rs6000/rs6000.md
===================================================================
--- gcc/config/rs6000/rs6000.md (.../svn+ssh://meissner@gcc.gnu.org/svn/gcc/trunk/gcc/config/rs6000) (revision 235831)
+++ gcc/config/rs6000/rs6000.md (.../gcc/config/rs6000) (working copy)
@@ -489,6 +489,10 @@ (define_mode_iterator RECIPF [SF DF V4SF
 ; Iterator for just SF/DF
 (define_mode_iterator SFDF [SF DF])
 
+; Like SFDF, but a different name to match conditional move where the
+; comparison operands may be a different mode than the input operands.
+(define_mode_iterator SFDF2 [SF DF])
+
 ; Iterator for 128-bit floating point that uses the IBM double-double format
 (define_mode_iterator IBM128 [(IF "FLOAT128_IBM_P (IFmode)")
                               (TF "FLOAT128_IBM_P (TFmode)")])
@@ -698,6 +702,15 @@ (define_mode_attr BOOL_REGS_UNARY [(TI "
 (define_mode_iterator RELOAD [V16QI V8HI V4SI V2DI V4SF V2DF V1TI
                               SF SD SI DF DD DI TI PTI KF IF TF])
 
+;; Iterate over smin, smax
+(define_code_iterator fp_minmax [smin smax])
+
+(define_code_attr minmax [(smin "min")
+                          (smax "max")])
+
+(define_code_attr SMINMAX [(smin "SMIN")
+                           (smax "SMAX")])
+
 ;; Start with fixed-point load and store insns.  Here we put only the more
 ;; complex forms.  Basic data transfer is done later.
 
@@ -4627,53 +4640,72 @@ (define_insn "copysign3_fcpsgn" ;; On VSX, we only check for TARGET_VSX instead of checking for a vsx/p8 vector ;; to allow either DF/SF to use only traditional registers. -(define_expand "smax3" +(define_expand "s3" [(set (match_operand:SFDF 0 "gpc_reg_operand" "") - (if_then_else:SFDF (ge (match_operand:SFDF 1 "gpc_reg_operand" "") - (match_operand:SFDF 2 "gpc_reg_operand" "")) - (match_dup 1) - (match_dup 2)))] - "TARGET__FPR && TARGET_PPC_GFXOPT && !flag_trapping_math" + (fp_minmax:SFDF (match_operand:SFDF 1 "gpc_reg_operand" "") + (match_operand:SFDF 2 "gpc_reg_operand" "")))] + "TARGET_MINMAX_" { - rs6000_emit_minmax (operands[0], SMAX, operands[1], operands[2]); + rs6000_emit_minmax (operands[0], , operands[1], operands[2]); DONE; }) -(define_insn "*smax3_vsx" - [(set (match_operand:SFDF 0 "gpc_reg_operand" "=,") - (smax:SFDF (match_operand:SFDF 1 "gpc_reg_operand" "%,") - (match_operand:SFDF 2 "gpc_reg_operand" ",")))] - "TARGET__FPR && TARGET_VSX" - "xsmaxdp %x0,%x1,%x2" +(define_insn "*s3_vsx" + [(set (match_operand:SFDF 0 "gpc_reg_operand" "=") + (fp_minmax:SFDF (match_operand:SFDF 1 "gpc_reg_operand" "") + (match_operand:SFDF 2 "gpc_reg_operand" "")))] + "TARGET_DF_FPR && TARGET_VSX" + "xs%@dp %x0,%x1,%x2" [(set_attr "type" "fp")]) -(define_expand "smin3" - [(set (match_operand:SFDF 0 "gpc_reg_operand" "") - (if_then_else:SFDF (ge (match_operand:SFDF 1 "gpc_reg_operand" "") - (match_operand:SFDF 2 "gpc_reg_operand" "")) - (match_dup 2) - (match_dup 1)))] - "TARGET__FPR && TARGET_PPC_GFXOPT && !flag_trapping_math" -{ - rs6000_emit_minmax (operands[0], SMIN, operands[1], operands[2]); - DONE; -}) +;; Recognize min/max promotions from float to double +(define_insn "*sdfsf3_vsx_1" + [(set (match_operand:DF 0 "gpc_reg_operand" "=ws") + (float_extend:DF (fp_minmax:SF + (match_operand:SF 1 "gpc_reg_operand" "ww") + (match_operand:SF 2 "gpc_reg_operand" "ww"))))] + "TARGET_DF_FPR && TARGET_VSX" + "xs%@dp %x0,%x1,%x2" + [(set_attr "type" 
"fp")]) -(define_insn "*smin3_vsx" - [(set (match_operand:SFDF 0 "gpc_reg_operand" "=,") - (smin:SFDF (match_operand:SFDF 1 "gpc_reg_operand" "%,") - (match_operand:SFDF 2 "gpc_reg_operand" ",")))] - "TARGET__FPR && TARGET_VSX" - "xsmindp %x0,%x1,%x2" +(define_insn "*sdfsf3_vsx_2" + [(set (match_operand:DF 0 "gpc_reg_operand" "=ws") + (fp_minmax:DF (float_extend:DF + (match_operand:SF 1 "gpc_reg_operand" "ww")) + (match_operand:DF 2 "gpc_reg_operand" "ws")))] + "TARGET_DF_FPR && TARGET_VSX" + "xs%@dp %x0,%x1,%x2" [(set_attr "type" "fp")]) +(define_insn "*sdfsf3_vsx_3" + [(set (match_operand:DF 0 "gpc_reg_operand" "=ws") + (fp_minmax:DF (match_operand:DF 1 "gpc_reg_operand" "ws") + (float_extend:DF + (match_operand:SF 2 "gpc_reg_operand" "ww"))))] + "TARGET_DF_FPR && TARGET_VSX" + "xs%@dp %x0,%x1,%x2" + [(set_attr "type" "fp")]) + +(define_insn "*sdfsf3_vsx_4" + [(set (match_operand:DF 0 "gpc_reg_operand" "=ws") + (fp_minmax:DF (float_extend:DF + (match_operand:SF 1 "gpc_reg_operand" "ww")) + (float_extend:DF + (match_operand:SF 2 "gpc_reg_operand" "ww"))))] + "TARGET_DF_FPR && TARGET_VSX" + "xs%@dp %x0,%x1,%x2" + [(set_attr "type" "fp")]) + +;; The conditional move instructions allow us to perform max and min operations +;; even when we don't have the appropriate max/min instruction using the FSEL +;; instruction. 
+ (define_split [(set (match_operand:SFDF 0 "gpc_reg_operand" "") (match_operator:SFDF 3 "min_max_operator" [(match_operand:SFDF 1 "gpc_reg_operand" "") (match_operand:SFDF 2 "gpc_reg_operand" "")]))] - "TARGET__FPR && TARGET_PPC_GFXOPT && !flag_trapping_math - && !TARGET_VSX" + "TARGET_MINMAX_ && !TARGET_VSX" [(const_int 0)] { rs6000_emit_minmax (operands[0], GET_CODE (operands[3]), operands[1], @@ -4681,20 +4713,6 @@ (define_split DONE; }) -(define_split - [(set (match_operand:SF 0 "gpc_reg_operand" "") - (match_operator:SF 3 "min_max_operator" - [(match_operand:SF 1 "gpc_reg_operand" "") - (match_operand:SF 2 "gpc_reg_operand" "")]))] - "TARGET_PPC_GFXOPT && TARGET_HARD_FLOAT && TARGET_FPRS - && TARGET_SINGLE_FLOAT && !flag_trapping_math" - [(const_int 0)] - " -{ rs6000_emit_minmax (operands[0], GET_CODE (operands[3]), - operands[1], operands[2]); - DONE; -}") - (define_expand "movcc" [(set (match_operand:GPR 0 "gpc_reg_operand" "") (if_then_else:GPR (match_operand 1 "comparison_operator" "") @@ -4777,12 +4795,13 @@ (define_insn "*isel_reversed_unsigned_cc" + [(set (match_operand:SFDF 0 "gpc_reg_operand" "") + (if_then_else:SFDF (match_operand 1 "comparison_operator" "") + (match_operand:SFDF 2 "gpc_reg_operand" "") + (match_operand:SFDF 3 "gpc_reg_operand" "")))] + "TARGET__FPR && TARGET_PPC_GFXOPT" " { if (rs6000_emit_cmove (operands[0], operands[1], operands[2], operands[3])) @@ -4791,76 +4810,70 @@ (define_expand "movsfcc" FAIL; }") -(define_insn "*fselsfsf4" - [(set (match_operand:SF 0 "gpc_reg_operand" "=f") - (if_then_else:SF (ge (match_operand:SF 1 "gpc_reg_operand" "f") - (match_operand:SF 4 "zero_fp_constant" "F")) - (match_operand:SF 2 "gpc_reg_operand" "f") - (match_operand:SF 3 "gpc_reg_operand" "f")))] - "TARGET_PPC_GFXOPT && TARGET_HARD_FLOAT && TARGET_FPRS && TARGET_SINGLE_FLOAT" - "fsel %0,%1,%2,%3" - [(set_attr "type" "fp")]) - -(define_insn "*fseldfsf4" - [(set (match_operand:SF 0 "gpc_reg_operand" "=f") - (if_then_else:SF (ge 
(match_operand:DF 1 "gpc_reg_operand" "d") - (match_operand:DF 4 "zero_fp_constant" "F")) - (match_operand:SF 2 "gpc_reg_operand" "f") - (match_operand:SF 3 "gpc_reg_operand" "f")))] - "TARGET_PPC_GFXOPT && TARGET_HARD_FLOAT && TARGET_FPRS && TARGET_DOUBLE_FLOAT && TARGET_SINGLE_FLOAT" +(define_insn "*fsel4" + [(set (match_operand:SFDF 0 "fpr_reg_operand" "=&") + (if_then_else:SFDF + (ge (match_operand:SFDF2 1 "fpr_reg_operand" "") + (match_operand:SFDF2 4 "zero_fp_constant" "F")) + (match_operand:SFDF 2 "fpr_reg_operand" "") + (match_operand:SFDF 3 "fpr_reg_operand" "")))] + "TARGET__FPR && TARGET_PPC_GFXOPT" "fsel %0,%1,%2,%3" [(set_attr "type" "fp")]) -;; The conditional move instructions allow us to perform max and min -;; operations even when - -(define_split - [(set (match_operand:DF 0 "gpc_reg_operand" "") - (match_operator:DF 3 "min_max_operator" - [(match_operand:DF 1 "gpc_reg_operand" "") - (match_operand:DF 2 "gpc_reg_operand" "")]))] - "TARGET_PPC_GFXOPT && TARGET_HARD_FLOAT && TARGET_FPRS && TARGET_DOUBLE_FLOAT - && !flag_trapping_math" - [(const_int 0)] - " -{ rs6000_emit_minmax (operands[0], GET_CODE (operands[3]), - operands[1], operands[2]); - DONE; -}") - -(define_expand "movdfcc" - [(set (match_operand:DF 0 "gpc_reg_operand" "") - (if_then_else:DF (match_operand 1 "comparison_operator" "") - (match_operand:DF 2 "gpc_reg_operand" "") - (match_operand:DF 3 "gpc_reg_operand" "")))] - "TARGET_PPC_GFXOPT && TARGET_HARD_FLOAT && TARGET_FPRS && TARGET_DOUBLE_FLOAT" - " -{ - if (rs6000_emit_cmove (operands[0], operands[1], operands[2], operands[3])) - DONE; - else - FAIL; -}") +(define_insn_and_split "*movcc_p9" + [(set (match_operand:SFDF 0 "vsx_register_operand" "=&,") + (if_then_else:SFDF + (match_operator:CCFP 1 "fpmask_comparison_operator" + [(match_operand:SFDF2 2 "vsx_register_operand" ",") + (match_operand:SFDF2 3 "vsx_register_operand" ",")]) + (match_operand:SFDF 4 "vsx_register_operand" ",") + (match_operand:SFDF 5 "vsx_register_operand" 
","))) + (clobber (match_scratch:V2DI 6 "=0,&wa"))] + "TARGET_P9_MINMAX" + "#" + "" + [(set (match_dup 6) + (if_then_else:V2DI (match_dup 1) + (match_dup 7) + (match_dup 8))) + (set (match_dup 0) + (if_then_else:SFDF (ne (match_dup 6) + (match_dup 8)) + (match_dup 4) + (match_dup 5)))] +{ + if (GET_CODE (operands[6]) == SCRATCH) + operands[6] = gen_reg_rtx (V2DImode); + + operands[7] = CONSTM1_RTX (V2DImode); + operands[8] = CONST0_RTX (V2DImode); +} + [(set_attr "length" "8") + (set_attr "type" "vecperm")]) + +(define_insn "*fpmask" + [(set (match_operand:V2DI 0 "vsx_register_operand" "=wa") + (if_then_else:V2DI + (match_operator:CCFP 1 "fpmask_comparison_operator" + [(match_operand:SFDF 2 "vsx_register_operand" "") + (match_operand:SFDF 3 "vsx_register_operand" "")]) + (match_operand:V2DI 4 "all_ones_constant" "") + (match_operand:V2DI 5 "zero_constant" "")))] + "TARGET_P9_MINMAX" + "xscmp%V1dp %x0,%x2,%x3" + [(set_attr "type" "fpcompare")]) -(define_insn "*fseldfdf4" - [(set (match_operand:DF 0 "gpc_reg_operand" "=d") - (if_then_else:DF (ge (match_operand:DF 1 "gpc_reg_operand" "d") - (match_operand:DF 4 "zero_fp_constant" "F")) - (match_operand:DF 2 "gpc_reg_operand" "d") - (match_operand:DF 3 "gpc_reg_operand" "d")))] - "TARGET_PPC_GFXOPT && TARGET_HARD_FLOAT && TARGET_FPRS && TARGET_DOUBLE_FLOAT" - "fsel %0,%1,%2,%3" - [(set_attr "type" "fp")]) +(define_insn "*xxsel" + [(set (match_operand:SFDF 0 "vsx_register_operand" "=") + (if_then_else:SFDF (ne (match_operand:V2DI 1 "vsx_register_operand" "wa") + (match_operand:V2DI 2 "zero_constant" "")) + (match_operand:SFDF 3 "vsx_register_operand" "") + (match_operand:SFDF 4 "vsx_register_operand" "")))] + "TARGET_P9_MINMAX" + "xxsel %x0,%x1,%x3,%x4" + [(set_attr "type" "vecperm")]) -(define_insn "*fselsfdf4" - [(set (match_operand:DF 0 "gpc_reg_operand" "=d") - (if_then_else:DF (ge (match_operand:SF 1 "gpc_reg_operand" "f") - (match_operand:SF 4 "zero_fp_constant" "F")) - (match_operand:DF 2 "gpc_reg_operand" "d") - 
(match_operand:DF 3 "gpc_reg_operand" "d")))] - "TARGET_PPC_GFXOPT && TARGET_HARD_FLOAT && TARGET_FPRS && TARGET_DOUBLE_FLOAT && TARGET_SINGLE_FLOAT" - "fsel %0,%1,%2,%3" - [(set_attr "type" "fp")]) ;; Conversions to and from floating-point. @@ -4885,7 +4898,7 @@ (define_insn "lfiwax" (define_insn_and_split "floatsi2_lfiwax" [(set (match_operand:SFDF 0 "gpc_reg_operand" "=") (float:SFDF (match_operand:SI 1 "nonimmediate_operand" "r"))) - (clobber (match_scratch:DI 2 "=wj"))] + (clobber (match_scratch:DI 2 "=wi"))] "TARGET_HARD_FLOAT && TARGET_FPRS && TARGET_DOUBLE_FLOAT && TARGET_LFIWAX && && can_create_pseudo_p ()" "#" @@ -4924,11 +4937,11 @@ (define_insn_and_split "floatsi2_l (set_attr "type" "fpload")]) (define_insn_and_split "floatsi2_lfiwax_mem" - [(set (match_operand:SFDF 0 "gpc_reg_operand" "=,") + [(set (match_operand:SFDF 0 "gpc_reg_operand" "=") (float:SFDF (sign_extend:DI - (match_operand:SI 1 "indexed_or_indirect_operand" "Z,Z")))) - (clobber (match_scratch:DI 2 "=0,d"))] + (match_operand:SI 1 "indexed_or_indirect_operand" "Z")))) + (clobber (match_scratch:DI 2 "=wi"))] "TARGET_HARD_FLOAT && TARGET_FPRS && TARGET_DOUBLE_FLOAT && TARGET_LFIWAX && " "#" @@ -4960,7 +4973,7 @@ (define_insn "lfiwzx" (define_insn_and_split "floatunssi2_lfiwzx" [(set (match_operand:SFDF 0 "gpc_reg_operand" "=") (unsigned_float:SFDF (match_operand:SI 1 "nonimmediate_operand" "r"))) - (clobber (match_scratch:DI 2 "=wj"))] + (clobber (match_scratch:DI 2 "=wi"))] "TARGET_HARD_FLOAT && TARGET_FPRS && TARGET_DOUBLE_FLOAT && TARGET_LFIWZX && " "#" @@ -4999,11 +5012,11 @@ (define_insn_and_split "floatunssi (set_attr "type" "fpload")]) (define_insn_and_split "floatunssi2_lfiwzx_mem" - [(set (match_operand:SFDF 0 "gpc_reg_operand" "=,") + [(set (match_operand:SFDF 0 "gpc_reg_operand" "=") (unsigned_float:SFDF (zero_extend:DI - (match_operand:SI 1 "indexed_or_indirect_operand" "Z,Z")))) - (clobber (match_scratch:DI 2 "=0,d"))] + (match_operand:SI 1 "indexed_or_indirect_operand" "Z")))) + 
    (clobber (match_scratch:DI 2 "=wi"))]
   "TARGET_HARD_FLOAT && TARGET_FPRS && TARGET_DOUBLE_FLOAT && TARGET_LFIWZX
    && "
   "#"
@@ -5232,7 +5245,7 @@ (define_expand "fix_truncsi2"
 ; Like the convert to float patterns, this insn must be split before
 ; register allocation so that it can allocate the memory slot if it
-; needed
+; is needed
 (define_insn_and_split "fix_truncsi2_stfiwx"
   [(set (match_operand:SI 0 "nonimmediate_operand" "=rm")
         (fix:SI (match_operand:SFDF 1 "gpc_reg_operand" "d")))
@@ -5307,7 +5320,7 @@ (define_expand "fix_truncdi2"
 
 (define_insn "*fix_truncdi2_fctidz"
   [(set (match_operand:DI 0 "gpc_reg_operand" "=d,wi")
-        (fix:DI (match_operand:SFDF 1 "gpc_reg_operand" ",")))]
+        (fix:DI (match_operand:SFDF 1 "gpc_reg_operand" ",")))]
   "TARGET_HARD_FLOAT && TARGET_DOUBLE_FLOAT && TARGET_FPRS
     && TARGET_FCFID"
   "@
@@ -5379,7 +5392,7 @@ (define_expand "fixuns_truncdi2"
 
 (define_insn "*fixuns_truncdi2_fctiduz"
   [(set (match_operand:DI 0 "gpc_reg_operand" "=d,wi")
-        (unsigned_fix:DI (match_operand:SFDF 1 "gpc_reg_operand" ",")))]
+        (unsigned_fix:DI (match_operand:SFDF 1 "gpc_reg_operand" ",")))]
   "TARGET_HARD_FLOAT && TARGET_DOUBLE_FLOAT && TARGET_FPRS
     && TARGET_FCTIDUZ"
   "@
Index: gcc/testsuite/gcc.target/powerpc/p9-minmax-1.c
===================================================================
--- gcc/testsuite/gcc.target/powerpc/p9-minmax-1.c (.../svn+ssh://meissner@gcc.gnu.org/svn/gcc/trunk/gcc/testsuite/gcc.target/powerpc) (revision 0)
+++ gcc/testsuite/gcc.target/powerpc/p9-minmax-1.c (.../gcc/testsuite/gcc.target/powerpc) (revision 235894)
@@ -0,0 +1,171 @@
+/* { dg-do compile { target { powerpc*-*-* } } } */
+/* { dg-skip-if "do not override -mcpu" { powerpc*-*-* } { "-mcpu=*" } { "-mcpu=power9" } } */
+/* { dg-require-effective-target powerpc_p9vector_ok } */
+/* { dg-options "-mcpu=power9 -O2 -mpower9-minmax -ffast-math" } */
+/* { dg-final { scan-assembler-not "fsel" } } */
+/* { dg-final { scan-assembler "xscmpeqdp" } } */
+/* { dg-final { scan-assembler "xscmpgtdp" } } */
+/* { dg-final { scan-assembler "xscmpgedp" } } */
+/* { dg-final { scan-assembler-not "xscmpodp" } } */
+/* { dg-final { scan-assembler-not "xscmpudp" } } */
+/* { dg-final { scan-assembler "xsmaxcdp" } } */
+/* { dg-final { scan-assembler-not "xsmaxdp" } } */
+/* { dg-final { scan-assembler "xsmincdp" } } */
+/* { dg-final { scan-assembler-not "xsmindp" } } */
+/* { dg-final { scan-assembler "xxsel" } } */
+
+double
+dbl_max1 (double a, double b)
+{
+  return (a >= b) ? a : b;
+}
+
+double
+dbl_max2 (double a, double b)
+{
+  return (a > b) ? a : b;
+}
+
+double
+dbl_min1 (double a, double b)
+{
+  return (a < b) ? a : b;
+}
+
+double
+dbl_min2 (double a, double b)
+{
+  return (a <= b) ? a : b;
+}
+
+double
+dbl_cmp_eq (double a, double b, double c, double d)
+{
+  return (a == b) ? c : d;
+}
+
+double
+dbl_cmp_ne (double a, double b, double c, double d)
+{
+  return (a != b) ? c : d;
+}
+
+double
+dbl_cmp_gt (double a, double b, double c, double d)
+{
+  return (a > b) ? c : d;
+}
+
+double
+dbl_cmp_ge (double a, double b, double c, double d)
+{
+  return (a >= b) ? c : d;
+}
+
+double
+dbl_cmp_lt (double a, double b, double c, double d)
+{
+  return (a < b) ? c : d;
+}
+
+double
+dbl_cmp_le (double a, double b, double c, double d)
+{
+  return (a <= b) ? c : d;
+}
+
+float
+flt_max1 (float a, float b)
+{
+  return (a >= b) ? a : b;
+}
+
+float
+flt_max2 (float a, float b)
+{
+  return (a > b) ? a : b;
+}
+
+float
+flt_min1 (float a, float b)
+{
+  return (a < b) ? a : b;
+}
+
+float
+flt_min2 (float a, float b)
+{
+  return (a <= b) ? a : b;
+}
+
+float
+flt_cmp_eq (float a, float b, float c, float d)
+{
+  return (a == b) ? c : d;
+}
+
+float
+flt_cmp_ne (float a, float b, float c, float d)
+{
+  return (a != b) ? c : d;
+}
+
+float
+flt_cmp_gt (float a, float b, float c, float d)
+{
+  return (a > b) ? c : d;
+}
+
+float
+flt_cmp_ge (float a, float b, float c, float d)
+{
+  return (a >= b) ? c : d;
+}
+
+float
+flt_cmp_lt (float a, float b, float c, float d)
+{
+  return (a < b) ? c : d;
+}
+
+float
+flt_cmp_le (float a, float b, float c, float d)
+{
+  return (a <= b) ? c : d;
+}
+
+double
+dbl_flt_max1 (float a, float b)
+{
+  return (a > b) ? a : b;
+}
+
+double
+dbl_flt_max2 (double a, float b)
+{
+  return (a > b) ? a : b;
+}
+
+double
+dbl_flt_max3 (float a, double b)
+{
+  return (a > b) ? a : b;
+}
+
+double
+dbl_flt_min1 (float a, float b)
+{
+  return (a < b) ? a : b;
+}
+
+double
+dbl_flt_min2 (double a, float b)
+{
+  return (a < b) ? a : b;
+}
+
+double
+dbl_flt_min3 (float a, double b)
+{
+  return (a < b) ? a : b;
+}
Index: gcc/testsuite/gcc.target/powerpc/p9-minmax-2.c
===================================================================
--- gcc/testsuite/gcc.target/powerpc/p9-minmax-2.c (.../svn+ssh://meissner@gcc.gnu.org/svn/gcc/trunk/gcc/testsuite/gcc.target/powerpc) (revision 0)
+++ gcc/testsuite/gcc.target/powerpc/p9-minmax-2.c (.../gcc/testsuite/gcc.target/powerpc) (revision 235894)
@@ -0,0 +1,191 @@
+/* { dg-do compile { target { powerpc*-*-* } } } */
+/* { dg-skip-if "do not override -mcpu" { powerpc*-*-* } { "-mcpu=*" } { "-mcpu=power9" } } */
+/* { dg-require-effective-target powerpc_p9vector_ok } */
+/* { dg-options "-mcpu=power9 -O2 -mpower9-minmax" } */
+/* { dg-final { scan-assembler-not "fsel" } } */
+/* { dg-final { scan-assembler "xscmpeqdp" } } */
+/* { dg-final { scan-assembler "xscmpgtdp" } } */
+/* { dg-final { scan-assembler-not "xscmpodp" } } */
+/* { dg-final { scan-assembler-not "xscmpudp" } } */
+/* { dg-final { scan-assembler "xsmaxcdp" } } */
+/* { dg-final { scan-assembler-not "xsmaxdp" } } */
+/* { dg-final { scan-assembler "xsmincdp" } } */
+/* { dg-final { scan-assembler-not "xsmindp" } } */
+/* { dg-final { scan-assembler "xxsel" } } */
+
+/* Due to NaN support, <= and >= are not handled presently unless -ffast-math
+   is used.  At some point this will be fixed and the xscmpgedp instruction can
+   be generated normally. The <= and >= tests are bracketed with
+   #ifdef DO_GE_LE.  */
+
+#ifdef DO_GE_LE
+double
+dbl_max1 (double a, double b)
+{
+  return (a >= b) ? a : b;
+}
+#endif
+
+double
+dbl_max2 (double a, double b)
+{
+  return (a > b) ? a : b;
+}
+
+double
+dbl_min1 (double a, double b)
+{
+  return (a < b) ? a : b;
+}
+
+#ifdef DO_GE_LE
+double
+dbl_min2 (double a, double b)
+{
+  return (a <= b) ? a : b;
+}
+#endif
+
+double
+dbl_cmp_eq (double a, double b, double c, double d)
+{
+  return (a == b) ? c : d;
+}
+
+double
+dbl_cmp_ne (double a, double b, double c, double d)
+{
+  return (a != b) ? c : d;
+}
+
+double
+dbl_cmp_gt (double a, double b, double c, double d)
+{
+  return (a > b) ? c : d;
+}
+
+#ifdef DO_GE_LE
+double
+dbl_cmp_ge (double a, double b, double c, double d)
+{
+  return (a >= b) ? c : d;
+}
+#endif
+
+double
+dbl_cmp_lt (double a, double b, double c, double d)
+{
+  return (a < b) ? c : d;
+}
+
+#ifdef DO_GE_LE
+double
+dbl_cmp_le (double a, double b, double c, double d)
+{
+  return (a <= b) ? c : d;
+}
+#endif
+
+#ifdef DO_GE_LE
+float
+flt_max1 (float a, float b)
+{
+  return (a >= b) ? a : b;
+}
+#endif
+
+float
+flt_max2 (float a, float b)
+{
+  return (a > b) ? a : b;
+}
+
+float
+flt_min1 (float a, float b)
+{
+  return (a < b) ? a : b;
+}
+
+#ifdef DO_GE_LE
+float
+flt_min2 (float a, float b)
+{
+  return (a <= b) ? a : b;
+}
+#endif
+
+float
+flt_cmp_eq (float a, float b, float c, float d)
+{
+  return (a == b) ? c : d;
+}
+
+float
+flt_cmp_ne (float a, float b, float c, float d)
+{
+  return (a != b) ? c : d;
+}
+
+float
+flt_cmp_gt (float a, float b, float c, float d)
+{
+  return (a > b) ? c : d;
+}
+
+#ifdef DO_GE_LE
+float
+flt_cmp_ge (float a, float b, float c, float d)
+{
+  return (a >= b) ? c : d;
+}
+#endif
+
+float
+flt_cmp_lt (float a, float b, float c, float d)
+{
+  return (a < b) ? c : d;
+}
+
+#ifdef DO_GE_LE
+float
+flt_cmp_le (float a, float b, float c, float d)
+{
+  return (a <= b) ? c : d;
+}
+#endif
+
+double
+dbl_flt_max1 (float a, float b)
+{
+  return (a > b) ? a : b;
+}
+
+double
+dbl_flt_max2 (double a, float b)
+{
+  return (a > b) ? a : b;
+}
+
+double
+dbl_flt_max3 (float a, double b)
+{
+  return (a > b) ? a : b;
+}
+
+double
+dbl_flt_min1 (float a, float b)
+{
+  return (a < b) ? a : b;
+}
+
+double
+dbl_flt_min2 (double a, float b)
+{
+  return (a < b) ? a : b;
+}
+
+double
+dbl_flt_min3 (float a, double b)
+{
+  return (a < b) ? a : b;
+}
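
Note on the DO_GE_LE comment in p9-minmax-2.c: the sketch below is not part of the patch. Assuming IEEE 754 comparison semantics (that is, compiling without -ffast-math), it shows one way to see why NaN matters for >= and <=: the form used by dbl_max1 and the superficially equivalent "not less than" form select different operands when one input is NaN, so the two cannot be swapped freely. The names max_ge, max_not_lt and check_nan_difference are hypothetical and exist only for illustration.

```c
#include <assert.h>
#include <math.h>

/* Hypothetical helpers for illustration; not part of the patch.  */

/* Same shape as dbl_max1 in the test: pick a only when a >= b holds.  */
double
max_ge (double a, double b)
{
  return (a >= b) ? a : b;
}

/* Looks equivalent, but every ordered comparison involving NaN is
   false, so !(a < b) is true where (a >= b) is false.  */
double
max_not_lt (double a, double b)
{
  return !(a < b) ? a : b;
}

/* Self-check: the two forms agree on ordinary values and disagree
   as soon as one operand is NaN.  */
void
check_nan_difference (void)
{
  /* Ordinary operands: both forms return the larger value.  */
  assert (max_ge (2.0, 1.0) == 2.0);
  assert (max_not_lt (2.0, 1.0) == 2.0);

  /* b is NaN: a >= b is false, so max_ge returns b (NaN), while
     a < b is also false, so max_not_lt returns a.  */
  assert (isnan (max_ge (1.0, NAN)));
  assert (max_not_lt (1.0, NAN) == 1.0);

  /* a is NaN: the results swap.  */
  assert (max_ge (NAN, 1.0) == 1.0);
  assert (isnan (max_not_lt (NAN, 1.0)));
}
```

With -ffast-math the compiler may assume NaNs never occur, so the two forms become interchangeable; this is consistent with p9-minmax-1.c enabling that option and expecting xscmpgedp, while p9-minmax-2.c guards the >= and <= cases.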