From patchwork Wed Aug 25 19:46:43 2021
X-Patchwork-Submitter: Michael Meissner
X-Patchwork-Id: 1520905
Date: Wed, 25 Aug 2021 15:46:43 -0400
From: Michael Meissner
To: gcc-patches@gcc.gnu.org, Michael Meissner, Segher Boessenkool, David Edelsohn, Bill Schmidt, Peter Bergner, Will Schmidt
Subject: [PATCH] Generate XXSPLTIDP on power10.
List-Id: Gcc-patches mailing list

Generate XXSPLTIDP on power10.

This patch implements XXSPLTIDP support for SF and DF scalar constants and V2DF vector constants.
The XXSPLTIDP instruction is given a 32-bit immediate that is converted to a vector of two DFmode constants. The immediate is in SFmode format, so only constants that fit as SFmode values can be loaded with XXSPLTIDP.

I added a new constraint (eF) to match constants that can be loaded with the XXSPLTIDP instruction.

I have added a temporary switch (-mxxspltidp) to control whether or not the XXSPLTIDP instruction is generated.

I added three new tests that load SF/DF scalar and V2DF vector constants.

I have tested this with bootstrap compilers on power10 systems, and there were no regressions. I have also built GCC with these patches on little endian power9 and big endian power8 systems, and there were no regressions.

In addition, I have built and run the full SPEC 2017 rate suite, comparing the results with the patches enabled and disabled. Roughly 66,000 XXSPLTIDP instructions were generated in the rate build for SPEC 2017. On a stand-alone system running single threaded, blender_r shows a 1.9% increase in performance, and the rest of the benchmarks are performance neutral. However, I would expect that in a real world scenario, switching to XXSPLTIDP will increase performance because it removes all of the loads.

Can I check this into the master branch?

2021-08-25  Michael Meissner

gcc/
	* config/rs6000/constraints.md (eF): New constraint.
	* config/rs6000/predicates.md (easy_fp_constant): If we can load
	the scalar constant with XXSPLTIDP, the floating point constant is
	easy.
	(xxspltidp_operand): New predicate.
	(easy_vector_constant): If we can generate XXSPLTIDP, mark the
	vector constant as easy.
	* config/rs6000/rs6000-protos.h (xxspltidp_constant_p): New
	declaration.
	(prefixed_permute_p): Likewise.
	* config/rs6000/rs6000.c (xxspltidp_constant_p): New function.
	(output_vec_const_move): Add support for XXSPLTIDP.
	(prefixed_permute_p): New function.
	* config/rs6000/rs6000.md (prefixed attribute): Add support for
	permute prefixed instructions.
	(movsf_hardfloat): Add XXSPLTIDP support.
	(mov_hardfloat32, FMOVE64 iterator): Likewise.
	(mov_hardfloat64, FMOVE64 iterator): Likewise.
	* config/rs6000/rs6000.opt (-mxxspltidp): New switch.
	* config/rs6000/vsx.md (vsx_mov_64bit): Add XXSPLTIDP support.
	(vsx_mov_32bit): Likewise.
	(vsx_splat_v2df_xxspltidp): New insn.
	(XXSPLTIDP): New mode iterator.
	(xxspltidp__internal): New insn and splits.
	(xxspltidp__inst): Replace xxspltidp_v2df_inst with an iterated
	form that also handles SFmode and DFmode.

gcc/testsuite/
	* gcc.target/powerpc/vec-splat-constant-sf.c: New test.
	* gcc.target/powerpc/vec-splat-constant-df.c: New test.
	* gcc.target/powerpc/vec-splat-constant-v2df.c: New test.
---
 gcc/config/rs6000/constraints.md              |   5 +
 gcc/config/rs6000/predicates.md               |  17 +++
 gcc/config/rs6000/rs6000-protos.h             |   2 +
 gcc/config/rs6000/rs6000.c                    | 106 ++++++++++++++++++
 gcc/config/rs6000/rs6000.md                   |  45 +++++---
 gcc/config/rs6000/rs6000.opt                  |   4 +
 gcc/config/rs6000/vsx.md                      |  64 ++++++++++-
 .../powerpc/vec-splat-constant-df.c           |  60 ++++++++++
 .../powerpc/vec-splat-constant-sf.c           |  60 ++++++++++
 .../powerpc/vec-splat-constant-v2df.c         |  64 +++++++++++
 10 files changed, 405 insertions(+), 22 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/powerpc/vec-splat-constant-df.c
 create mode 100644 gcc/testsuite/gcc.target/powerpc/vec-splat-constant-sf.c
 create mode 100644 gcc/testsuite/gcc.target/powerpc/vec-splat-constant-v2df.c

diff --git a/gcc/config/rs6000/constraints.md b/gcc/config/rs6000/constraints.md index c8cff1a3038..ea2e4a267c3 100644 --- a/gcc/config/rs6000/constraints.md +++ b/gcc/config/rs6000/constraints.md @@ -208,6 +208,11 @@ (define_constraint "P" (and (match_code "const_int") (match_test "((- (unsigned HOST_WIDE_INT) ival) + 0x8000) < 0x10000"))) +;; SF/DF/V2DF scalar or vector constant that can be loaded with XXSPLTIDP +(define_constraint "eF" + "A vector constant that can be loaded with the XXSPLTIDP instruction." 
+ (match_operand 0 "xxspltidp_operand")) + ;; 34-bit signed integer constant (define_constraint "eI" "A signed 34-bit integer constant if prefixed instructions are supported." diff --git a/gcc/config/rs6000/predicates.md b/gcc/config/rs6000/predicates.md index 956e42bc514..134243e404b 100644 --- a/gcc/config/rs6000/predicates.md +++ b/gcc/config/rs6000/predicates.md @@ -601,6 +601,11 @@ (define_predicate "easy_fp_constant" if (TARGET_VSX && op == CONST0_RTX (mode)) return 1; + /* If we have the ISA 3.1 XXSPLTIDP instruction, see if the constant can + be loaded with that instruction. */ + if (xxspltidp_operand (op, mode)) + return 1; + /* Otherwise consider floating point constants hard, so that the constant gets pushed to memory during the early RTL phases. This has the advantage that double precision constants that can be @@ -640,6 +645,15 @@ (define_predicate "xxspltib_constant_nosplit" return num_insns == 1; }) +;; Return 1 if operand is a SF/DF CONST_DOUBLE or V2DF CONST_VECTOR that can be +;; loaded via the ISA 3.1 XXSPLTIDP instruction. +(define_predicate "xxspltidp_operand" + (match_code "const_double,const_vector,vec_duplicate") +{ + HOST_WIDE_INT value = 0; + return xxspltidp_constant_p (op, mode, &value); +}) + ;; Return 1 if the operand is a CONST_VECTOR and can be loaded into a ;; vector register without using memory. 
(define_predicate "easy_vector_constant" @@ -653,6 +667,9 @@ (define_predicate "easy_vector_constant" if (zero_constant (op, mode) || all_ones_constant (op, mode)) return true; + if (xxspltidp_operand (op, mode)) + return true; + if (TARGET_P9_VECTOR && xxspltib_constant_p (op, mode, &num_insns, &value)) return true; diff --git a/gcc/config/rs6000/rs6000-protos.h b/gcc/config/rs6000/rs6000-protos.h index 14f6b313105..9bba57c22f2 100644 --- a/gcc/config/rs6000/rs6000-protos.h +++ b/gcc/config/rs6000/rs6000-protos.h @@ -32,6 +32,7 @@ extern void init_cumulative_args (CUMULATIVE_ARGS *, tree, rtx, int, int, int, extern int easy_altivec_constant (rtx, machine_mode); extern bool xxspltib_constant_p (rtx, machine_mode, int *, int *); +extern bool xxspltidp_constant_p (rtx, machine_mode, HOST_WIDE_INT *); extern int vspltis_shifted (rtx); extern HOST_WIDE_INT const_vector_elt_as_int (rtx, unsigned int); extern bool macho_lo_sum_memory_operand (rtx, machine_mode); @@ -198,6 +199,7 @@ enum non_prefixed_form reg_to_non_prefixed (rtx reg, machine_mode mode); extern bool prefixed_load_p (rtx_insn *); extern bool prefixed_store_p (rtx_insn *); extern bool prefixed_paddi_p (rtx_insn *); +extern bool prefixed_permute_p (rtx_insn *); extern void rs6000_asm_output_opcode (FILE *); extern void output_pcrel_opt_reloc (rtx); extern void rs6000_final_prescan_insn (rtx_insn *, rtx [], int); diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c index e073b26b430..322b3c83925 100644 --- a/gcc/config/rs6000/rs6000.c +++ b/gcc/config/rs6000/rs6000.c @@ -6533,6 +6533,74 @@ xxspltib_constant_p (rtx op, return true; } +/* Return true if OP is of the given MODE and can be synthesized with ISA 3.1 + XXSPLTIDP instruction. + + Return the constant that is being split via CONSTANT_PTR to use in the + XXSPLTIDP instruction. 
*/ + +bool +xxspltidp_constant_p (rtx op, + machine_mode mode, + HOST_WIDE_INT *constant_ptr) +{ + *constant_ptr = 0; + + if (!TARGET_XXSPLTIDP || !TARGET_PREFIXED || !TARGET_VSX) + return false; + + if (mode == VOIDmode) + mode = GET_MODE (op); + + rtx element = op; + if (mode == V2DFmode) + { + if (CONST_VECTOR_P (op)) + { + element = CONST_VECTOR_ELT (op, 0); + if (!rtx_equal_p (element, CONST_VECTOR_ELT (op, 1))) + return false; + } + + else if (GET_CODE (op) == VEC_DUPLICATE) + element = XEXP (op, 0); + + else + return false; + + mode = DFmode; + } + + if (mode != SFmode && mode != DFmode) + return false; + + if (GET_MODE (element) != mode) + return false; + + if (!CONST_DOUBLE_P (element)) + return false; + + /* Don't return true for 0.0 since that is easy to create without + XXSPLTIDP. */ + if (element == CONST0_RTX (mode)) + return false; + + /* If the value doesn't fit in a SFmode, exactly, we can't use XXSPLTIDP. */ + const struct real_value *rv = CONST_DOUBLE_REAL_VALUE (element); + if (!exact_real_truncate (SFmode, rv)) + return false; + + long value; + REAL_VALUE_TO_TARGET_SINGLE (*rv, value); + + /* Test for SFmode denormal (exponent is 0, mantissa field is non-zero). 
*/ + if (((value & 0x7F800000) == 0) && ((value & 0x7FFFFF) != 0)) + return false; + + *constant_ptr = value; + return true; +} + const char * output_vec_const_move (rtx *operands) { @@ -6548,6 +6616,7 @@ output_vec_const_move (rtx *operands) { bool dest_vmx_p = ALTIVEC_REGNO_P (REGNO (dest)); int xxspltib_value = 256; + HOST_WIDE_INT xxspltidp_value = 0; int num_insns = -1; if (zero_constant (vec, mode)) @@ -6577,6 +6646,12 @@ output_vec_const_move (rtx *operands) gcc_unreachable (); } + if (xxspltidp_constant_p (vec, mode, &xxspltidp_value)) + { + operands[2] = GEN_INT (xxspltidp_value); + return "xxspltidp %x0,%2"; + } + if (TARGET_P9_VECTOR && xxspltib_constant_p (vec, mode, &num_insns, &xxspltib_value)) { @@ -26219,6 +26294,37 @@ prefixed_paddi_p (rtx_insn *insn) return (iform == INSN_FORM_PCREL_EXTERNAL || iform == INSN_FORM_PCREL_LOCAL); } +/* Whether a permute type instruction is a prefixed instruction. This is + called from the prefixed attribute processing. */ + +bool +prefixed_permute_p (rtx_insn *insn) +{ + rtx set = single_set (insn); + if (!set) + return false; + + rtx dest = SET_DEST (set); + rtx src = SET_SRC (set); + machine_mode mode = GET_MODE (dest); + + if (!REG_P (dest) && !SUBREG_P (dest)) + return false; + + switch (mode) + { + case DFmode: + case SFmode: + case V2DFmode: + return xxspltidp_operand (src, mode); + + default: + break; + } + + return false; +} + /* Whether the next instruction needs a 'p' prefix issued before the instruction is printed out. 
*/ static bool prepend_p_to_next_insn; diff --git a/gcc/config/rs6000/rs6000.md b/gcc/config/rs6000/rs6000.md index a84438f8545..bf3bfed3b88 100644 --- a/gcc/config/rs6000/rs6000.md +++ b/gcc/config/rs6000/rs6000.md @@ -314,6 +314,11 @@ (define_attr "prefixed" "no,yes" (eq_attr "type" "integer,add") (if_then_else (match_test "prefixed_paddi_p (insn)") + (const_string "yes") + (const_string "no")) + + (eq_attr "type" "vecperm") + (if_then_else (match_test "prefixed_permute_p (insn)") (const_string "yes") (const_string "no"))] @@ -7723,17 +7728,17 @@ (define_split ;; ;; LWZ LFS LXSSP LXSSPX STFS STXSSP ;; STXSSPX STW XXLXOR LI FMR XSCPSGNDP -;; MR MT MF NOP +;; MR MT MF NOP XXSPLTIDP (define_insn "movsf_hardfloat" [(set (match_operand:SF 0 "nonimmediate_operand" "=!r, f, v, wa, m, wY, Z, m, wa, !r, f, wa, - !r, *c*l, !r, *h") + !r, *c*l, !r, *h, wa") (match_operand:SF 1 "input_operand" "m, m, wY, Z, f, v, wa, r, j, j, f, wa, - r, r, *h, 0"))] + r, r, *h, 0, eF"))] "(register_operand (operands[0], SFmode) || register_operand (operands[1], SFmode)) && TARGET_HARD_FLOAT @@ -7755,15 +7760,16 @@ (define_insn "movsf_hardfloat" mr %0,%1 mt%0 %1 mf%1 %0 - nop" + nop + #" [(set_attr "type" "load, fpload, fpload, fpload, fpstore, fpstore, fpstore, store, veclogical, integer, fpsimple, fpsimple, - *, mtjmpr, mfjmpr, *") + *, mtjmpr, mfjmpr, *, vecperm") (set_attr "isa" "*, *, p9v, p8v, *, p9v, p8v, *, *, *, *, *, - *, *, *, *")]) + *, *, *, *, p10")]) ;; LWZ LFIWZX STW STFIWX MTVSRWZ MFVSRWZ ;; FMR MR MT%0 MF%1 NOP @@ -8023,18 +8029,18 @@ (define_split ;; STFD LFD FMR LXSD STXSD ;; LXSD STXSD XXLOR XXLXOR GPR<-0 -;; LWZ STW MR +;; LWZ STW MR XXSPLTIDP (define_insn "*mov_hardfloat32" [(set (match_operand:FMOVE64 0 "nonimmediate_operand" "=m, d, d, , wY, , Z, , , !r, - Y, r, !r") + Y, r, !r, wa") (match_operand:FMOVE64 1 "input_operand" "d, m, d, wY, , Z, , , , , - r, Y, r"))] + r, Y, r, eF"))] "! 
TARGET_POWERPC64 && TARGET_HARD_FLOAT && (gpc_reg_operand (operands[0], mode) || gpc_reg_operand (operands[1], mode))" @@ -8051,20 +8057,21 @@ (define_insn "*mov_hardfloat32" # # # + # #" [(set_attr "type" "fpstore, fpload, fpsimple, fpload, fpstore, fpload, fpstore, veclogical, veclogical, two, - store, load, two") + store, load, two, vecperm") (set_attr "size" "64") (set_attr "length" "*, *, *, *, *, *, *, *, *, 8, - 8, 8, 8") + 8, 8, 8, *") (set_attr "isa" "*, *, *, p9v, p9v, p7v, p7v, *, *, *, - *, *, *")]) + *, *, *, p10")]) ;; STW LWZ MR G-const H-const F-const @@ -8091,19 +8098,19 @@ (define_insn "*mov_softfloat32" ;; STFD LFD FMR LXSD STXSD ;; LXSDX STXSDX XXLOR XXLXOR LI 0 ;; STD LD MR MT{CTR,LR} MF{CTR,LR} -;; NOP MFVSRD MTVSRD +;; NOP MFVSRD MTVSRD XXSPLTIDP (define_insn "*mov_hardfloat64" [(set (match_operand:FMOVE64 0 "nonimmediate_operand" "=m, d, d, , wY, , Z, , , !r, YZ, r, !r, *c*l, !r, - *h, r, ") + *h, r, , wa") (match_operand:FMOVE64 1 "input_operand" "d, m, d, wY, , Z, , , , , r, YZ, r, r, *h, - 0, , r"))] + 0, , r, eF"))] "TARGET_POWERPC64 && TARGET_HARD_FLOAT && (gpc_reg_operand (operands[0], mode) || gpc_reg_operand (operands[1], mode))" @@ -8125,18 +8132,19 @@ (define_insn "*mov_hardfloat64" mf%1 %0 nop mfvsrd %0,%x1 - mtvsrd %x0,%1" + mtvsrd %x0,%1 + #" [(set_attr "type" "fpstore, fpload, fpsimple, fpload, fpstore, fpload, fpstore, veclogical, veclogical, integer, store, load, *, mtjmpr, mfjmpr, - *, mfvsr, mtvsr") + *, mfvsr, mtvsr, vecperm") (set_attr "size" "64") (set_attr "isa" "*, *, *, p9v, p9v, p7v, p7v, *, *, *, *, *, *, *, *, - *, p8v, p8v")]) + *, p8v, p8v, p10")]) ;; STD LD MR MT MF G-const ;; H-const F-const Special @@ -8170,6 +8178,7 @@ (define_insn "*mov_softfloat64" (set_attr "length" "*, *, *, *, *, 8, 12, 16, *")]) + (define_expand "mov" [(set (match_operand:FMOVE128 0 "general_operand") diff --git a/gcc/config/rs6000/rs6000.opt b/gcc/config/rs6000/rs6000.opt index 0538db387dc..928c4fafe07 100644 --- 
a/gcc/config/rs6000/rs6000.opt +++ b/gcc/config/rs6000/rs6000.opt @@ -639,3 +639,7 @@ Enable instructions that guard against return-oriented programming attacks. mprivileged Target Var(rs6000_privileged) Init(0) Generate code that will run in privileged state. + +mxxspltidp +Target Undocumented Var(TARGET_XXSPLTIDP) Init(1) Save +Generate (do not generate) XXSPLTIDP instructions. diff --git a/gcc/config/rs6000/vsx.md b/gcc/config/rs6000/vsx.md index bf033e31c1c..af9a04870d4 100644 --- a/gcc/config/rs6000/vsx.md +++ b/gcc/config/rs6000/vsx.md @@ -1191,16 +1191,19 @@ (define_insn_and_split "*xxspltib__split" ;; instruction). But generate XXLXOR/XXLORC if it will avoid a register move. ;; VSX store VSX load VSX move VSX->GPR GPR->VSX LQ (GPR) +;; XXSPLTIDP ;; STQ (GPR) GPR load GPR store GPR move XXSPLTIB VSPLTISW ;; VSX 0/-1 VMX const GPR const LVX (VMX) STVX (VMX) (define_insn "vsx_mov_64bit" [(set (match_operand:VSX_M 0 "nonimmediate_operand" "=ZwO, wa, wa, r, we, ?wQ, + wa, ?&r, ??r, ??Y, , wa, v, ?wa, v, , wZ, v") (match_operand:VSX_M 1 "input_operand" "wa, ZwO, wa, we, r, r, + eF, wQ, Y, r, r, wE, jwM, ?jwM, W, , v, wZ"))] @@ -1212,36 +1215,44 @@ (define_insn "vsx_mov_64bit" } [(set_attr "type" "vecstore, vecload, vecsimple, mtvsr, mfvsr, load, + vecperm, store, load, store, *, vecsimple, vecsimple, vecsimple, *, *, vecstore, vecload") (set_attr "num_insns" "*, *, *, 2, *, 2, + *, 2, 2, 2, 2, *, *, *, 5, 2, *, *") (set_attr "max_prefixed_insns" "*, *, *, *, *, 2, + *, 2, 2, 2, 2, *, *, *, *, *, *, *") (set_attr "length" "*, *, *, 8, *, 8, + *, 8, 8, 8, 8, *, *, *, 20, 8, *, *") (set_attr "isa" ", , , *, *, *, + p10, *, *, *, *, p9v, *, , *, *, *, *")]) ;; VSX store VSX load VSX move GPR load GPR store GPR move +;; XXSPLTIDP ;; XXSPLTIB VSPLTISW VSX 0/-1 VMX const GPR const ;; LVX (VMX) STVX (VMX) (define_insn "*vsx_mov_32bit" [(set (match_operand:VSX_M 0 "nonimmediate_operand" "=ZwO, wa, wa, ??r, ??Y, , + wa, wa, v, ?wa, v, , wZ, v") (match_operand:VSX_M 1 
"input_operand" "wa, ZwO, wa, Y, r, r, + eF, wE, jwM, ?jwM, W, , v, wZ"))] @@ -1253,14 +1264,17 @@ (define_insn "*vsx_mov_32bit" } [(set_attr "type" "vecstore, vecload, vecsimple, load, store, *, + vecperm, vecsimple, vecsimple, vecsimple, *, *, vecstore, vecload") (set_attr "length" "*, *, *, 16, 16, 16, + *, *, *, *, 20, 16, *, *") (set_attr "isa" ", , , *, *, *, + p10, p9v, *, , *, *, *, *")]) @@ -4580,6 +4594,23 @@ (define_insn "vsx_splat__reg" mtvsrdd %x0,%1,%1" [(set_attr "type" "vecperm,vecmove")]) +(define_insn "*vsx_splat_v2df_xxspltidp" + [(set (match_operand:V2DF 0 "vsx_register_operand" "=wa") + (vec_duplicate:V2DF + (match_operand:DF 1 "xxspltidp_operand" "eF")))] + "TARGET_POWER10" +{ + HOST_WIDE_INT value; + + if (!xxspltidp_constant_p (operands[1], DFmode, &value)) + gcc_unreachable (); + + operands[2] = GEN_INT (value); + return "xxspltidp %x0,%2"; +} + [(set_attr "type" "vecperm") + (set_attr "prefixed" "yes")]) + (define_insn "vsx_splat__mem" [(set (match_operand:VSX_D 0 "vsx_register_operand" "=wa") (vec_duplicate:VSX_D @@ -6449,15 +6480,40 @@ (define_expand "xxspltidp_v2df" DONE; }) -(define_insn "xxspltidp_v2df_inst" - [(set (match_operand:V2DF 0 "register_operand" "=wa") - (unspec:V2DF [(match_operand:SI 1 "c32bit_cint_operand" "n")] - UNSPEC_XXSPLTIDP))] +(define_mode_iterator XXSPLTIDP [SF DF V2DF]) + +(define_insn "xxspltidp__inst" + [(set (match_operand:XXSPLTIDP 0 "register_operand" "=wa") + (unspec:XXSPLTIDP [(match_operand:SI 1 "c32bit_cint_operand" "n")] + UNSPEC_XXSPLTIDP))] "TARGET_POWER10" "xxspltidp %x0,%1" [(set_attr "type" "vecperm") (set_attr "prefixed" "yes")]) +;; Generate the XXSPLTIDP instruction to support SFmode and DFmode scalar +;; constants and V2DF vector constants where both elements are the same. The +;; constant has to be expressible as a SFmode constant that is not a SFmode +;; denormal value. 
+(define_insn_and_split "*xxspltidp__internal" + [(set (match_operand:XXSPLTIDP 0 "vsx_register_operand" "=wa") + (match_operand:XXSPLTIDP 1 "xxspltidp_operand" "eF"))] + "TARGET_POWER10" + "#" + "&& 1" + [(set (match_operand:XXSPLTIDP 0 "vsx_register_operand") + (unspec:XXSPLTIDP [(match_dup 2)] UNSPEC_XXSPLTIDP))] +{ + HOST_WIDE_INT value = 0; + + if (!xxspltidp_constant_p (operands[1], mode, &value)) + gcc_unreachable (); + + operands[2] = GEN_INT (value); +} + [(set_attr "type" "vecperm") + (set_attr "prefixed" "yes")]) + ;; XXSPLTI32DX built-in function support (define_expand "xxsplti32dx_v4si" [(set (match_operand:V4SI 0 "register_operand" "=wa") diff --git a/gcc/testsuite/gcc.target/powerpc/vec-splat-constant-df.c b/gcc/testsuite/gcc.target/powerpc/vec-splat-constant-df.c new file mode 100644 index 00000000000..8f6e176f9af --- /dev/null +++ b/gcc/testsuite/gcc.target/powerpc/vec-splat-constant-df.c @@ -0,0 +1,60 @@ +/* { dg-do compile } */ +/* { dg-require-effective-target power10_ok } */ +/* { dg-options "-mdejagnu-cpu=power10 -O2" } */ + +#include + +/* Test generating DFmode constants with the ISA 3.1 (power10) XXSPLTIDP + instruction. */ + +double +scalar_double_0 (void) +{ + return 0.0; /* XXSPLTIB or XXLXOR. */ +} + +double +scalar_double_1 (void) +{ + return 1.0; /* XXSPLTIDP. */ +} + +#ifndef __FAST_MATH__ +double +scalar_double_m0 (void) +{ + return -0.0; /* XXSPLTIDP. */ +} + +double +scalar_double_nan (void) +{ + return __builtin_nan (""); /* XXSPLTIDP. */ +} + +double +scalar_double_inf (void) +{ + return __builtin_inf (); /* XXSPLTIDP. */ +} + +double +scalar_double_m_inf (void) /* XXSPLTIDP. */ +{ + return - __builtin_inf (); +} +#endif + +double +scalar_double_pi (void) +{ + return M_PI; /* PLFD. */ +} + +double +scalar_double_denorm (void) +{ + return 0x1p-149f; /* PLFD. 
*/ +} + +/* { dg-final { scan-assembler-times {\mxxspltidp\M} 5 } } */ diff --git a/gcc/testsuite/gcc.target/powerpc/vec-splat-constant-sf.c b/gcc/testsuite/gcc.target/powerpc/vec-splat-constant-sf.c new file mode 100644 index 00000000000..72504bdfbbd --- /dev/null +++ b/gcc/testsuite/gcc.target/powerpc/vec-splat-constant-sf.c @@ -0,0 +1,60 @@ +/* { dg-do compile } */ +/* { dg-require-effective-target power10_ok } */ +/* { dg-options "-mdejagnu-cpu=power10 -O2" } */ + +#include + +/* Test generating SFmode constants with the ISA 3.1 (power10) XXSPLTIDP + instruction. */ + +float +scalar_float_0 (void) +{ + return 0.0f; /* XXSPLTIB or XXLXOR. */ +} + +float +scalar_float_1 (void) +{ + return 1.0f; /* XXSPLTIDP. */ +} + +#ifndef __FAST_MATH__ +float +scalar_float_m0 (void) +{ + return -0.0f; /* XXSPLTIDP. */ +} + +float +scalar_float_nan (void) +{ + return __builtin_nanf (""); /* XXSPLTIDP. */ +} + +float +scalar_float_inf (void) +{ + return __builtin_inff (); /* XXSPLTIDP. */ +} + +float +scalar_float_m_inf (void) /* XXSPLTIDP. */ +{ + return - __builtin_inff (); +} +#endif + +float +scalar_float_pi (void) +{ + return (float)M_PI; /* XXSPLTIDP. */ +} + +float +scalar_float_denorm (void) +{ + return 0x1p-149f; /* PLFS. */ +} + +/* { dg-final { scan-assembler-times {\mxxspltidp\M} 6 } } */ diff --git a/gcc/testsuite/gcc.target/powerpc/vec-splat-constant-v2df.c b/gcc/testsuite/gcc.target/powerpc/vec-splat-constant-v2df.c new file mode 100644 index 00000000000..82ffc86f8aa --- /dev/null +++ b/gcc/testsuite/gcc.target/powerpc/vec-splat-constant-v2df.c @@ -0,0 +1,64 @@ +/* { dg-do compile } */ +/* { dg-require-effective-target power10_ok } */ +/* { dg-options "-mdejagnu-cpu=power10 -O2" } */ + +#include + +/* Test generating V2DFmode constants with the ISA 3.1 (power10) XXSPLTIDP + instruction. */ + +vector double +v2df_double_0 (void) +{ + return (vector double) { 0.0, 0.0 }; /* XXSPLTIB or XXLXOR. 
*/ +} + +vector double +v2df_double_1 (void) +{ + return (vector double) { 1.0, 1.0 }; /* XXSPLTIDP. */ +} + +#ifndef __FAST_MATH__ +vector double +v2df_double_m0 (void) +{ + return (vector double) { -0.0, -0.0 }; /* XXSPLTIDP. */ +} + +vector double +v2df_double_nan (void) +{ + return (vector double) { __builtin_nan (""), + __builtin_nan ("") }; /* XXSPLTIDP. */ +} + +vector double +v2df_double_inf (void) +{ + return (vector double) { __builtin_inf (), + __builtin_inf () }; /* XXSPLTIDP. */ +} + +vector double +v2df_double_m_inf (void) +{ + return (vector double) { - __builtin_inf (), + - __builtin_inf () }; /* XXSPLTIDP. */ +} +#endif + +vector double +v2df_double_pi (void) +{ + return (vector double) { M_PI, M_PI }; /* PLVX. */ +} + +vector double +v2df_double_denorm (void) +{ + return (vector double) { (double)0x1p-149f, + (double)0x1p-149f }; /* PLVX. */ +} + +/* { dg-final { scan-assembler-times {\mxxspltidp\M} 5 } } */
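
As an aside for reviewers, the eligibility rule described above (the constant must fit exactly in SFmode, must not be +0.0, and must not be an SFmode denormal) can be modelled outside the compiler. The following is a minimal stand-alone sketch under stated assumptions, not GCC code: `double_bits`, `fits_xxspltidp` and `check_examples` are hypothetical names, the host is assumed to use IEEE 754 `float` and `double`, and the host's own conversions stand in for `exact_real_truncate` and `REAL_VALUE_TO_TARGET_SINGLE`. NaN handling in the sketch depends on the host conversion and is not meant to mirror GCC.

```c
#include <assert.h>
#include <float.h>
#include <math.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper (not GCC code): return the raw IEEE 754 bits of D.  */
static uint64_t
double_bits (double d)
{
  uint64_t bits;
  memcpy (&bits, &d, sizeof bits);
  return bits;
}

/* Hypothetical stand-alone model of the checks in xxspltidp_constant_p.
   Return true if VALUE could be loaded with XXSPLTIDP and store the 32-bit
   SFmode image in *IMM.  Assumes IEEE 754 float and double on the host.  */
bool
fits_xxspltidp (double value, uint32_t *imm)
{
  *imm = 0;

  /* +0.0 is rejected because it is easy to create without XXSPLTIDP.
     -0.0 has the sign bit set, so it does not match here.  */
  if (double_bits (value) == 0)
    return false;

  /* Finite values beyond the float range cannot be exact, and converting
     them to float is not portable C, so reject them up front.  */
  if (isfinite (value) && (value > FLT_MAX || value < -FLT_MAX))
    return false;

  /* The value must survive a round trip through float unchanged.  Comparing
     bit patterns instead of using == keeps the sign of -0.0 significant.  */
  float as_float = (float) value;
  if (double_bits ((double) as_float) != double_bits (value))
    return false;

  uint32_t bits;
  memcpy (&bits, &as_float, sizeof bits);

  /* Reject SFmode denormals: exponent field 0, mantissa field non-zero.  */
  if ((bits & 0x7F800000u) == 0 && (bits & 0x007FFFFFu) != 0)
    return false;

  *imm = bits;
  return true;
}

/* Examples mirroring the cases exercised by the new tests.  */
void
check_examples (void)
{
  uint32_t imm;

  assert (fits_xxspltidp (1.0, &imm) && imm == 0x3F800000u);
  assert (fits_xxspltidp (-0.0, &imm) && imm == 0x80000000u);
  assert (fits_xxspltidp (INFINITY, &imm) && imm == 0x7F800000u);
  assert (!fits_xxspltidp (0.0, &imm));               /* XXSPLTIB/XXLXOR.  */
  assert (!fits_xxspltidp (3.141592653589793, &imm)); /* Not exact in SF.  */
  assert (!fits_xxspltidp (0x1p-149, &imm));          /* SF denormal.  */
}
```

Calling `check_examples ()` from a small `main` is enough to try it. The sketch compares bit patterns rather than values so that a single check covers both exactness and the sign of zero; that is a simplification chosen for the example and not a description of how real.c works.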