From patchwork Wed Jun 9 00:24:47 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Michael Meissner X-Patchwork-Id: 1489670 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=gcc.gnu.org (client-ip=8.43.85.97; helo=sourceware.org; envelope-from=gcc-patches-bounces+incoming=patchwork.ozlabs.org@gcc.gnu.org; receiver=) Authentication-Results: ozlabs.org; dkim=pass (1024-bit key; unprotected) header.d=gcc.gnu.org header.i=@gcc.gnu.org header.a=rsa-sha256 header.s=default header.b=Qc8LoWUv; dkim-atps=neutral Received: from sourceware.org (server2.sourceware.org [8.43.85.97]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest SHA256) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 4G07Dh4JBXz9s24 for ; Wed, 9 Jun 2021 10:25:39 +1000 (AEST) Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id AD4DA3985835 for ; Wed, 9 Jun 2021 00:25:37 +0000 (GMT) DKIM-Filter: OpenDKIM Filter v2.11.0 sourceware.org AD4DA3985835 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gcc.gnu.org; s=default; t=1623198337; bh=R8N0/2m1bXmY3SS9lD9YbkuNSHtp/VkqgMRhtjWRZ/g=; h=Date:To:Subject:References:In-Reply-To:List-Id:List-Unsubscribe: List-Archive:List-Post:List-Help:List-Subscribe:From:Reply-To: From; b=Qc8LoWUvQ60pPgp86+Sm/QHH1x9ONk+GJGOBi9iHNS/RU3Y+qER2NhT1ha8bZqh9t BJy/PbaShut1IBpodFSF+UptzqW8rIYU0uhVlH09SLY/aAssAIPt3oWrA4l3Gkit1+ 5bKsnsGeJNVD6vvmzPpA91ZOP4JgUKon46VPII6U= X-Original-To: gcc-patches@gcc.gnu.org Delivered-To: gcc-patches@gcc.gnu.org Received: from mx0a-001b2d01.pphosted.com (mx0a-001b2d01.pphosted.com [148.163.156.1]) by sourceware.org (Postfix) with ESMTPS id 882BC385E447 for ; Wed, 9 Jun 2021 00:24:54 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.1 sourceware.org 882BC385E447 Received: from pps.filterd (m0098410.ppops.net [127.0.0.1]) by mx0a-001b2d01.pphosted.com (8.16.0.43/8.16.0.43) with SMTP id 15903Bce042073; Tue, 8 Jun 2021 20:24:53 -0400 Received: from pps.reinject (localhost [127.0.0.1]) by mx0a-001b2d01.pphosted.com with ESMTP id 392hp09thh-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT); Tue, 08 Jun 2021 20:24:53 -0400 Received: from m0098410.ppops.net (m0098410.ppops.net [127.0.0.1]) by pps.reinject (8.16.0.43/8.16.0.43) with SMTP id 15903KEQ043107; Tue, 8 Jun 2021 20:24:53 -0400 Received: from ppma01dal.us.ibm.com (83.d6.3fa9.ip4.static.sl-reverse.com [169.63.214.131]) by mx0a-001b2d01.pphosted.com with ESMTP id 392hp09th6-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT); Tue, 08 Jun 2021 20:24:53 -0400 Received: from pps.filterd (ppma01dal.us.ibm.com [127.0.0.1]) by ppma01dal.us.ibm.com (8.16.1.2/8.16.1.2) with SMTP id 1590HFDA005187; Wed, 9 Jun 2021 00:24:52 GMT Received: from b03cxnp07029.gho.boulder.ibm.com (b03cxnp07029.gho.boulder.ibm.com [9.17.130.16]) by ppma01dal.us.ibm.com with ESMTP id 3900w9j7b5-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT); Wed, 09 Jun 2021 00:24:52 +0000 Received: from b03ledav004.gho.boulder.ibm.com (b03ledav004.gho.boulder.ibm.com [9.17.130.235]) by b03cxnp07029.gho.boulder.ibm.com (8.14.9/8.14.9/NCO v10.0) with ESMTP id 1590Oo0J31785380 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK); Wed, 9 Jun 2021 00:24:50 GMT Received: from b03ledav004.gho.boulder.ibm.com (unknown [127.0.0.1]) by IMSVA (Postfix) with ESMTP id 6CA0178063; Wed, 9 Jun 2021 00:24:50 +0000 (GMT) Received: from b03ledav004.gho.boulder.ibm.com (unknown [127.0.0.1]) by IMSVA (Postfix) with ESMTP id AF8557805E; Wed, 9 Jun 2021 00:24:49 +0000 (GMT) Received: from ibm-toto.the-meissners.org (unknown [9.160.19.146]) by b03ledav004.gho.boulder.ibm.com (Postfix) with ESMTPS; Wed, 9 Jun 2021 00:24:49 +0000 (GMT) Date: Tue, 8 Jun 2021 20:24:47 -0400 To: Michael Meissner , gcc-patches@gcc.gnu.org, Segher Boessenkool , David Edelsohn , Bill Schmidt , Peter Bergner , Will Schmidt Subject: [PATCH 3/3] Add IEEE 128-bit fp conditional move on PowerPC. Message-ID: <20210609002447.GC18854@ibm-toto.the-meissners.org> Mail-Followup-To: Michael Meissner , gcc-patches@gcc.gnu.org, Segher Boessenkool , David Edelsohn , Bill Schmidt , Peter Bergner , Will Schmidt References: <20210609001744.GA16932@ibm-toto.the-meissners.org> MIME-Version: 1.0 Content-Disposition: inline In-Reply-To: <20210609001744.GA16932@ibm-toto.the-meissners.org> User-Agent: Mutt/1.5.21 (2010-09-15) X-TM-AS-GCONF: 00 X-Proofpoint-ORIG-GUID: e4tTVe56ww_FmLpRMDjHLuNuE7O83hI- X-Proofpoint-GUID: 7k2qaJJVXMdWLO1Kq67EVkvqVWAgld7n X-Proofpoint-Virus-Version: vendor=fsecure engine=2.50.10434:6.0.391, 18.0.761 definitions=2021-06-08_17:2021-06-04, 2021-06-08 signatures=0 X-Proofpoint-Spam-Details: rule=outbound_notspam policy=outbound score=0 spamscore=0 mlxscore=0 priorityscore=1501 clxscore=1015 suspectscore=0 phishscore=0 adultscore=0 lowpriorityscore=0 bulkscore=0 mlxlogscore=999 impostorscore=0 malwarescore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.12.0-2104190000 definitions=main-2106080152 X-Spam-Status: No, score=-10.2 required=5.0 tests=BAYES_00, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_EF, GIT_PATCH_0, KAM_MANYTO, RCVD_IN_MSPIKE_H3, RCVD_IN_MSPIKE_WL, SPF_HELO_NONE, SPF_PASS, TXREP autolearn=ham autolearn_force=no version=3.4.2 X-Spam-Checker-Version: SpamAssassin 3.4.2 (2018-09-13) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-Patchwork-Original-From: Michael Meissner via Gcc-patches From: Michael Meissner Reply-To: Michael Meissner Errors-To: gcc-patches-bounces+incoming=patchwork.ozlabs.org@gcc.gnu.org Sender: "Gcc-patches" [PATCH 3/3] Add IEEE 128-bit fp conditional move on PowerPC. This patch adds the support for power10 IEEE 128-bit floating point conditional move and for automatically generating min/max. In this patch, I simplified things compared to previous patches. Instead of allowing any four of the modes to be used for the conditional move comparison and the move itself could use different modes, I restricted the conditional move to just the same mode. I.e. you can do: _Float128 a, b, c, d, e, r; r = (a == b) ? c : d; But you can't do: _Float128 c, d, r; double a, b; r = (a == b) ? c : d; or: _Float128 a, b; double c, d, r; r = (a == b) ? c : d; This eliminates a lot of the complexity of the code, because you don't have to worry about the sizes being different, and the IEEE 128-bit types being restricted to Altivec registers, while the SF/DF modes can use any VSX register. I did not modify the existing support that allowed conditional moves where SFmode operands are compared and DFmode operands are moved (and vice versa). Compared to the May 18th patches, this patch replaces the complicated test that was complained about. I tested it on 3 platforms: * Power9 little endian, --with-code=power9; * Power8 big endian, --with-code=power8, both 32/64-bit tests done; * Power10 little endian, --with-code=power10. All systems bootstrapped and there were no new regressions. I believe I have addressed the issues with the last patch. Can I check this into the master branch, and after a soak-in period, back port it to the GCC 11 branch? gcc/ 2021-06-08 Michael Meissner * config/rs6000/rs6000.c (rs6000_maybe_emit_fp_cmove): Add IEEE 128-bit floating point conditional move support. (have_compare_and_set_mask): Add IEEE 128-bit floating point types. * config/rs6000/rs6000.md (movcc, IEEE128 iterator): New insn. (movcc_p10, IEEE128 iterator): New insn. (movcc_invert_p10, IEEE128 iterator): New insn. (fpmask, IEEE128 iterator): New insn. (xxsel, IEEE128 iterator): New insn. gcc/testsuite/ 2021-06-08 Michael Meissner * gcc.target/powerpc/float128-cmove.c: New test. * gcc.target/powerpc/float128-minmax-3.c: New test. --- gcc/config/rs6000/rs6000.c | 38 ++++++- gcc/config/rs6000/rs6000.md | 106 ++++++++++++++++++ .../gcc.target/powerpc/float128-cmove.c | 58 ++++++++++ .../gcc.target/powerpc/float128-minmax-3.c | 15 +++ 4 files changed, 215 insertions(+), 2 deletions(-) create mode 100644 gcc/testsuite/gcc.target/powerpc/float128-cmove.c create mode 100644 gcc/testsuite/gcc.target/powerpc/float128-minmax-3.c diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c index 1651788df6a..411e7539019 100644 --- a/gcc/config/rs6000/rs6000.c +++ b/gcc/config/rs6000/rs6000.c @@ -15698,8 +15698,8 @@ rs6000_emit_vector_cond_expr (rtx dest, rtx op_true, rtx op_false, return 1; } -/* Possibly emit the xsmaxcdp and xsmincdp instructions to emit a maximum or - minimum with "C" semantics. +/* Possibly emit the xsmaxc{dp,qp} and xsminc{dp,qp} instructions to emit a + maximum or minimum with "C" semantics. Unless you use -ffast-math, you can't use these instructions to replace conditions that implicitly reverse the condition because the comparison @@ -15775,6 +15775,7 @@ rs6000_maybe_emit_fp_cmove (rtx dest, rtx op, rtx true_cond, rtx false_cond) enum rtx_code code = GET_CODE (op); rtx op0 = XEXP (op, 0); rtx op1 = XEXP (op, 1); + machine_mode compare_mode = GET_MODE (op0); machine_mode result_mode = GET_MODE (dest); rtx compare_rtx; rtx cmove_rtx; @@ -15783,6 +15784,35 @@ rs6000_maybe_emit_fp_cmove (rtx dest, rtx op, rtx true_cond, rtx false_cond) if (!can_create_pseudo_p ()) return 0; + /* We allow the comparison to be either SFmode/DFmode and the true/false + condition to be either SFmode/DFmode. I.e. we allow: + + float a, b; + double c, d, r; + + r = (a == b) ? c : d; + + and: + + double a, b; + float c, d, r; + + r = (a == b) ? c : d; + + but we don't allow intermixing the IEEE 128-bit floating point types with + the 32/64-bit scalar types. + + It gets too messy where SFmode/DFmode can use any register and TFmode/KFmode + can only use Altivec registers. In addtion, we would need to do a XXPERMDI + if we compare SFmode/DFmode and move TFmode/KFmode. */ + + if (compare_mode == result_mode + || (compare_mode == SFmode && result_mode == DFmode) + || (compare_mode == DFmode && result_mode == SFmode)) + ; + else + return false; + switch (code) { case EQ: @@ -15835,6 +15865,10 @@ have_compare_and_set_mask (machine_mode mode) case E_DFmode: return TARGET_P9_MINMAX; + case E_KFmode: + case E_TFmode: + return TARGET_POWER10 && TARGET_FLOAT128_HW && FLOAT128_IEEE_P (mode); + default: break; } diff --git a/gcc/config/rs6000/rs6000.md b/gcc/config/rs6000/rs6000.md index 064c3a2d9d6..ff87d8c6eaa 100644 --- a/gcc/config/rs6000/rs6000.md +++ b/gcc/config/rs6000/rs6000.md @@ -5449,6 +5449,112 @@ (define_insn "*xxsel" "xxsel %x0,%x4,%x3,%x1" [(set_attr "type" "vecmove")]) +;; Support for ISA 3.1 IEEE 128-bit conditional move. The mode used in the +;; comparison must be the same as used in the conditional move. +(define_expand "movcc" + [(set (match_operand:IEEE128 0 "gpc_reg_operand") + (if_then_else:IEEE128 (match_operand 1 "comparison_operator") + (match_operand:IEEE128 2 "gpc_reg_operand") + (match_operand:IEEE128 3 "gpc_reg_operand")))] + "TARGET_POWER10 && TARGET_FLOAT128_HW && FLOAT128_IEEE_P (mode)" +{ + if (rs6000_emit_cmove (operands[0], operands[1], operands[2], operands[3])) + DONE; + else + FAIL; +}) + +(define_insn_and_split "*movcc_p10" + [(set (match_operand:IEEE128 0 "altivec_register_operand" "=&v,v") + (if_then_else:IEEE128 + (match_operator:CCFP 1 "fpmask_comparison_operator" + [(match_operand:IEEE128 2 "altivec_register_operand" "v,v") + (match_operand:IEEE128 3 "altivec_register_operand" "v,v")]) + (match_operand:IEEE128 4 "altivec_register_operand" "v,v") + (match_operand:IEEE128 5 "altivec_register_operand" "v,v"))) + (clobber (match_scratch:V2DI 6 "=0,&v"))] + "TARGET_POWER10 && TARGET_FLOAT128_HW && FLOAT128_IEEE_P (mode)" + "#" + "&& 1" + [(set (match_dup 6) + (if_then_else:V2DI (match_dup 1) + (match_dup 7) + (match_dup 8))) + (set (match_dup 0) + (if_then_else:IEEE128 (ne (match_dup 6) + (match_dup 8)) + (match_dup 4) + (match_dup 5)))] +{ + if (GET_CODE (operands[6]) == SCRATCH) + operands[6] = gen_reg_rtx (V2DImode); + + operands[7] = CONSTM1_RTX (V2DImode); + operands[8] = CONST0_RTX (V2DImode); +} + [(set_attr "length" "8") + (set_attr "type" "vecperm")]) + +;; Handle inverting the fpmask comparisons. +(define_insn_and_split "*movcc_invert_p10" + [(set (match_operand:IEEE128 0 "altivec_register_operand" "=&v,v") + (if_then_else:IEEE128 + (match_operator:CCFP 1 "invert_fpmask_comparison_operator" + [(match_operand:IEEE128 2 "altivec_register_operand" "v,v") + (match_operand:IEEE128 3 "altivec_register_operand" "v,v")]) + (match_operand:IEEE128 4 "altivec_register_operand" "v,v") + (match_operand:IEEE128 5 "altivec_register_operand" "v,v"))) + (clobber (match_scratch:V2DI 6 "=0,&v"))] + "TARGET_POWER10 && TARGET_FLOAT128_HW && FLOAT128_IEEE_P (mode)" + "#" + "&& 1" + [(set (match_dup 6) + (if_then_else:V2DI (match_dup 9) + (match_dup 7) + (match_dup 8))) + (set (match_dup 0) + (if_then_else:IEEE128 (ne (match_dup 6) + (match_dup 8)) + (match_dup 5) + (match_dup 4)))] +{ + rtx op1 = operands[1]; + enum rtx_code cond = reverse_condition_maybe_unordered (GET_CODE (op1)); + + if (GET_CODE (operands[6]) == SCRATCH) + operands[6] = gen_reg_rtx (V2DImode); + + operands[7] = CONSTM1_RTX (V2DImode); + operands[8] = CONST0_RTX (V2DImode); + + operands[9] = gen_rtx_fmt_ee (cond, CCFPmode, operands[2], operands[3]); +} + [(set_attr "length" "8") + (set_attr "type" "vecperm")]) + +(define_insn "*fpmask" + [(set (match_operand:V2DI 0 "altivec_register_operand" "=v") + (if_then_else:V2DI + (match_operator:CCFP 1 "fpmask_comparison_operator" + [(match_operand:IEEE128 2 "altivec_register_operand" "v") + (match_operand:IEEE128 3 "altivec_register_operand" "v")]) + (match_operand:V2DI 4 "all_ones_constant" "") + (match_operand:V2DI 5 "zero_constant" "")))] + "TARGET_POWER10 && TARGET_FLOAT128_HW && FLOAT128_IEEE_P (mode)" + "xscmp%V1qp %0,%2,%3" + [(set_attr "type" "fpcompare")]) + +(define_insn "*xxsel" + [(set (match_operand:IEEE128 0 "altivec_register_operand" "=v") + (if_then_else:IEEE128 + (ne (match_operand:V2DI 1 "altivec_register_operand" "v") + (match_operand:V2DI 2 "zero_constant" "")) + (match_operand:IEEE128 3 "altivec_register_operand" "v") + (match_operand:IEEE128 4 "altivec_register_operand" "v")))] + "TARGET_POWER10 && TARGET_FLOAT128_HW && FLOAT128_IEEE_P (mode)" + "xxsel %x0,%x4,%x3,%x1" + [(set_attr "type" "vecmove")]) + ;; Conversions to and from floating-point. diff --git a/gcc/testsuite/gcc.target/powerpc/float128-cmove.c b/gcc/testsuite/gcc.target/powerpc/float128-cmove.c new file mode 100644 index 00000000000..2fae8dc23bc --- /dev/null +++ b/gcc/testsuite/gcc.target/powerpc/float128-cmove.c @@ -0,0 +1,58 @@ +/* { dg-do compile } */ +/* { dg-require-effective-target ppc_float128_hw } */ +/* { dg-require-effective-target power10_ok } */ +/* { dg-options "-mdejagnu-cpu=power10 -O2" } */ + +#ifndef TYPE +#ifdef __LONG_DOUBLE_IEEE128__ +#define TYPE long double + +#else +#define TYPE _Float128 +#endif +#endif + +/* Verify that the ISA 3.1 (power10) IEEE 128-bit conditional move instructions + are generated. */ + +TYPE +eq (TYPE a, TYPE b, TYPE c, TYPE d) +{ + return (a == b) ? c : d; +} + +TYPE +ne (TYPE a, TYPE b, TYPE c, TYPE d) +{ + return (a != b) ? c : d; +} + +TYPE +lt (TYPE a, TYPE b, TYPE c, TYPE d) +{ + return (a < b) ? c : d; +} + +TYPE +le (TYPE a, TYPE b, TYPE c, TYPE d) +{ + return (a <= b) ? c : d; +} + +TYPE +gt (TYPE a, TYPE b, TYPE c, TYPE d) +{ + return (a > b) ? c : d; +} + +TYPE +ge (TYPE a, TYPE b, TYPE c, TYPE d) +{ + return (a >= b) ? c : d; +} + +/* { dg-final { scan-assembler-times {\mxscmpeqqp\M} 2 } } */ +/* { dg-final { scan-assembler-times {\mxscmpgeqp\M} 2 } } */ +/* { dg-final { scan-assembler-times {\mxscmpgtqp\M} 2 } } */ +/* { dg-final { scan-assembler-times {\mxxsel\M} 6 } } */ +/* { dg-final { scan-assembler-not {\mxscmpuqp\M} } } */ diff --git a/gcc/testsuite/gcc.target/powerpc/float128-minmax-3.c b/gcc/testsuite/gcc.target/powerpc/float128-minmax-3.c new file mode 100644 index 00000000000..6f7627c0f2a --- /dev/null +++ b/gcc/testsuite/gcc.target/powerpc/float128-minmax-3.c @@ -0,0 +1,15 @@ +/* { dg-require-effective-target ppc_float128_hw } */ +/* { dg-require-effective-target power10_ok } */ +/* { dg-options "-mdejagnu-cpu=power10 -O2" } */ + +#ifndef TYPE +#define TYPE _Float128 +#endif + +/* Test that the fminf128/fmaxf128 functions generate if/then/else and not a + call. */ +TYPE f128_min (TYPE a, TYPE b) { return (a < b) ? a : b; } +TYPE f128_max (TYPE a, TYPE b) { return (b > a) ? b : a; } + +/* { dg-final { scan-assembler {\mxsmaxcqp\M} } } */ +/* { dg-final { scan-assembler {\mxsmincqp\M} } } */