From patchwork Thu Nov 8 21:28:52 2018
X-Patchwork-Submitter: Michael Meissner
X-Patchwork-Id: 995214
Date: Thu, 8 Nov 2018 16:28:52 -0500
From: Michael Meissner
To: Michael Meissner, GCC Patches, Segher Boessenkool, David Edelsohn, Bill Schmidt
Subject: [PATCH], Remove power9 fusion support, version 2
References: <20181102183734.GA27589@ibm-toto.the-meissners.org>
In-Reply-To: <20181102183734.GA27589@ibm-toto.the-meissners.org>
Message-Id: <20181108212852.GA12190@ibm-toto.the-meissners.org>

This is version 2 of the patch to remove power9 fusion.  Is it ok to check
into the trunk?

[gcc]
2018-11-08  Michael Meissner

	* config/rs6000/constraints.md (wF constraint): Update constraint
	documentation for power8 fusion only.
	* config/rs6000/predicates.md (p9_fusion_reg_operand): Delete.
	(fusion_gpr_addis): Delete power9 fusion support.  Change power8
	fusion support to require the upper 12 bits to be all 0's or all
	1's.
	(fusion_gpr_mem_load): Add comment.
	(fusion_addis_mem_combo_load): Remove power9 fusion support.
	(fusion_addis_mem_combo_store): Delete.
	(fusion_offsettable_mem_operand): Delete.
	* config/rs6000/rs6000-cpus.def (ISA_2_7_MASKS_SERVER): Do not set
	power8 fusion here.
	(ISA_3_0_MASKS_SERVER): Delete power9 fusion.
	(POWERPC_MASKS): Delete power9 fusion.
	* config/rs6000/rs6000-protos.h (emit_fusion_load_store): Delete.
	(fusion_p9_p): Delete.
	(expand_fusion_p9_load): Delete.
	(expand_fusion_p9_store): Delete.
	(emit_fusion_p9_load): Delete.
	(emit_fusion_p9_store): Delete.
	* config/rs6000/rs6000.c (rs6000_debug_reg_global): Delete power9
	fusion support.
	(rs6000_option_override_internal): Set power8 fusion based on
	whether we are tuning for power8.  Delete power9 fusion support.
	(rs6000_opt_masks): Delete -mpower9-fusion switch.
	(emit_fusion_load): Rename emit_fusion_load_store to
	emit_fusion_load, and drop fusion store support.  Update callers.
	(emit_fusion_load_store): Likewise.
	(emit_fusion_gpr_load): Likewise.
	(fusion_p9_p): Delete.
	(expand_fusion_p9_load): Delete.
	(expand_fusion_p9_store): Delete.
	(emit_fusion_p9_load): Delete.
	(emit_fusion_p9_store): Delete.
	* config/rs6000/rs6000.md (UNSPEC_FUSION_P9): Delete.
	(GPR_FUSION): Delete.
	(FPR_FUSION): Delete.
	(power9 fusion peephole2s): Delete.
	(fusion_gpr___load): Delete.
	(fusion_gpr___store): Delete.
	(fusion_vsx___load): Delete.
	(fusion_vsx___store): Delete.
	(fusion_p9__constant): Delete.
	* config/rs6000/rs6000.opt (-mpower9-fusion): Delete undocumented
	power9 fusion switch.
	* doc/md.texi (PowerPC constraints): Update wF constraint
	documentation for power8 fusion only.

[gcc/testsuite]
2018-11-08  Michael Meissner

	* gcc.target/powerpc/fusion3.c: Delete.
	* gcc.target/powerpc/fusion4.c: Delete.

Index: gcc/config/rs6000/constraints.md
===================================================================
--- gcc/config/rs6000/constraints.md	(revision 265931)
+++ gcc/config/rs6000/constraints.md	(working copy)
@@ -154,7 +154,7 @@ (define_constraint "wE"
 
 ;; Extended fusion store
 (define_memory_constraint "wF"
-  "Memory operand suitable for power9 fusion load/stores"
+  "Memory operand suitable for power8 GPR load fusion"
   (match_operand 0 "fusion_addis_mem_combo_load"))
 
 (define_register_constraint "wH" "rs6000_constraints[RS6000_CONSTRAINT_wH]"
Index: gcc/config/rs6000/predicates.md
===================================================================
--- gcc/config/rs6000/predicates.md	(revision 265931)
+++ gcc/config/rs6000/predicates.md	(working copy)
@@ -406,48 +406,6 @@ (define_predicate "fpr_reg_operand"
   return FP_REGNO_P (r);
 })
 
-;; Return true if this is a register that can has D-form addressing (GPR,
-;; traditional FPR registers, and Altivec registers for scalars).  Unlike
-;; power8 fusion, this fusion does not depend on putting the ADDIS instruction
-;; into the GPR register being loaded.
-(define_predicate "p9_fusion_reg_operand" - (match_code "reg,subreg") -{ - HOST_WIDE_INT r; - bool gpr_p = (mode == QImode || mode == HImode || mode == SImode - || mode == SFmode - || (TARGET_POWERPC64 && (mode == DImode || mode == DFmode))); - bool fpr_p = (TARGET_P9_FUSION - && (mode == DFmode || mode == SFmode - || (TARGET_POWERPC64 && mode == DImode))); - bool vmx_p = (TARGET_P9_FUSION && TARGET_P9_VECTOR - && (mode == DFmode || mode == SFmode)); - - if (!TARGET_P8_FUSION) - return 0; - - if (GET_CODE (op) == SUBREG) - op = SUBREG_REG (op); - - if (!REG_P (op)) - return 0; - - r = REGNO (op); - if (r >= FIRST_PSEUDO_REGISTER) - return (gpr_p || fpr_p || vmx_p); - - if (INT_REGNO_P (r)) - return gpr_p; - - if (FP_REGNO_P (r)) - return fpr_p; - - if (ALTIVEC_REGNO_P (r)) - return vmx_p; - - return 0; -}) - ;; Return 1 if op is a HTM specific SPR register. (define_predicate "htm_spr_reg_operand" (match_operand 0 "register_operand") @@ -1691,13 +1649,9 @@ (define_predicate "fusion_gpr_addis" if ((value & (HOST_WIDE_INT)0xffff0000) == 0) return 0; - /* Power8 currently will only do the fusion if the top 11 bits of the addis - value are all 1's or 0's. Ignore this restriction if we are testing - advanced fusion. */ - if (TARGET_P9_FUSION) - return 1; - - return (IN_RANGE (value >> 16, -32, 31)); + /* Power8 only does the fusion if the top 12 bits of the addis value are all + 1's or 0's. */ + return (IN_RANGE (value >> 16, -16, 15)); }) ;; Match the second insn (lbz, lhz, lwz, ld) in fusing the combination of addis @@ -1730,6 +1684,8 @@ (define_predicate "fusion_gpr_mem_load" return 0; break; + /* Do not allow SF/DFmode in GPR fusion. While the loads do occur, they + are not common. */ default: return 0; } @@ -1762,14 +1718,13 @@ (define_predicate "fusion_gpr_mem_load" ;; Match a GPR load (lbz, lhz, lwz, ld) that uses a combined address in the ;; memory field with both the addis and the memory offset. 
Sign extension ;; is not handled here, since lha and lwa are not fused. -;; With P9 fusion, also match a fpr/vector load and float_extend (define_predicate "fusion_addis_mem_combo_load" - (match_code "mem,zero_extend,float_extend") + (match_code "mem,zero_extend") { rtx addr, base, offset; - /* Handle zero/float extend. */ - if (GET_CODE (op) == ZERO_EXTEND || GET_CODE (op) == FLOAT_EXTEND) + /* Handle zero extend. */ + if (GET_CODE (op) == ZERO_EXTEND) { op = XEXP (op, 0); mode = GET_MODE (op); @@ -1792,20 +1747,8 @@ (define_predicate "fusion_addis_mem_comb return 0; break; - /* ISA 2.08/power8 only had fusion of GPR loads. */ - case E_SFmode: - if (!TARGET_P9_FUSION) - return 0; - break; - - /* ISA 2.08/power8 only had fusion of GPR loads. Do not allow 64-bit - DFmode in 32-bit if -msoft-float since it splits into two separate - instructions. */ - case E_DFmode: - if ((!TARGET_POWERPC64 && !TARGET_HARD_FLOAT) || !TARGET_P9_FUSION) - return 0; - break; - + /* Do not allow SF/DFmode in GPR fusion. While the loads do occur, they + are not common. */ default: return 0; } @@ -1833,80 +1776,3 @@ (define_predicate "fusion_addis_mem_comb return 0; }) - -;; Like fusion_addis_mem_combo_load, but for stores -(define_predicate "fusion_addis_mem_combo_store" - (match_code "mem") -{ - rtx addr, base, offset; - - if (!MEM_P (op) || !TARGET_P9_FUSION) - return 0; - - switch (mode) - { - case E_QImode: - case E_HImode: - case E_SImode: - case E_SFmode: - break; - - /* Do not fuse 64-bit DImode in 32-bit since it splits into two - separate instructions. */ - case E_DImode: - if (!TARGET_POWERPC64) - return 0; - break; - - /* Do not allow 64-bit DFmode in 32-bit if -msoft-float since it splits - into two separate instructions. Do allow fusion if we have hardware - floating point. 
*/ - case E_DFmode: - if (!TARGET_POWERPC64 && !TARGET_HARD_FLOAT) - return 0; - break; - - default: - return 0; - } - - addr = XEXP (op, 0); - if (GET_CODE (addr) != PLUS && GET_CODE (addr) != LO_SUM) - return 0; - - base = XEXP (addr, 0); - if (!fusion_gpr_addis (base, GET_MODE (base))) - return 0; - - offset = XEXP (addr, 1); - if (GET_CODE (addr) == PLUS) - return satisfies_constraint_I (offset); - - else if (GET_CODE (addr) == LO_SUM) - { - if (TARGET_XCOFF || (TARGET_ELF && TARGET_POWERPC64)) - return small_toc_ref (offset, GET_MODE (offset)); - - else if (TARGET_ELF && !TARGET_POWERPC64) - return CONSTANT_P (offset); - } - - return 0; -}) - -;; Return true if the operand is a float_extend or zero extend of an -;; offsettable memory operand suitable for use in fusion -(define_predicate "fusion_offsettable_mem_operand" - (match_code "mem,zero_extend,float_extend") -{ - if (GET_CODE (op) == ZERO_EXTEND || GET_CODE (op) == FLOAT_EXTEND) - { - op = XEXP (op, 0); - mode = GET_MODE (op); - } - - if (!memory_operand (op, mode)) - return 0; - - return offsettable_nonstrict_memref_p (op); -}) Index: gcc/config/rs6000/rs6000-cpus.def =================================================================== --- gcc/config/rs6000/rs6000-cpus.def (revision 265931) +++ gcc/config/rs6000/rs6000-cpus.def (working copy) @@ -44,9 +44,10 @@ | OPTION_MASK_ALTIVEC \ | OPTION_MASK_VSX) -/* For now, don't provide an embedded version of ISA 2.07. */ +/* For now, don't provide an embedded version of ISA 2.07. Do not set power8 + fusion here, instead set it in rs6000.c if we are tuning for a power8 + system. 
*/ #define ISA_2_7_MASKS_SERVER (ISA_2_6_MASKS_SERVER \ - | OPTION_MASK_P8_FUSION \ | OPTION_MASK_P8_VECTOR \ | OPTION_MASK_CRYPTO \ | OPTION_MASK_DIRECT_MOVE \ @@ -60,7 +61,6 @@ #define ISA_3_0_MASKS_SERVER (ISA_2_7_MASKS_SERVER \ | OPTION_MASK_ISEL \ | OPTION_MASK_MODULO \ - | OPTION_MASK_P9_FUSION \ | OPTION_MASK_P9_MINMAX \ | OPTION_MASK_P9_MISC \ | OPTION_MASK_P9_VECTOR) @@ -121,7 +121,6 @@ | OPTION_MASK_NO_UPDATE \ | OPTION_MASK_P8_FUSION \ | OPTION_MASK_P8_VECTOR \ - | OPTION_MASK_P9_FUSION \ | OPTION_MASK_P9_MINMAX \ | OPTION_MASK_P9_MISC \ | OPTION_MASK_P9_VECTOR \ Index: gcc/config/rs6000/rs6000-protos.h =================================================================== --- gcc/config/rs6000/rs6000-protos.h (revision 265931) +++ gcc/config/rs6000/rs6000-protos.h (working copy) @@ -91,13 +91,7 @@ extern bool quad_load_store_p (rtx, rtx) extern bool fusion_gpr_load_p (rtx, rtx, rtx, rtx); extern void expand_fusion_gpr_load (rtx *); extern void emit_fusion_addis (rtx, rtx); -extern void emit_fusion_load_store (rtx, rtx, rtx, const char *); extern const char *emit_fusion_gpr_load (rtx, rtx); -extern bool fusion_p9_p (rtx, rtx, rtx, rtx); -extern void expand_fusion_p9_load (rtx *); -extern void expand_fusion_p9_store (rtx *); -extern const char *emit_fusion_p9_load (rtx, rtx, rtx); -extern const char *emit_fusion_p9_store (rtx, rtx, rtx); extern enum reg_class (*rs6000_preferred_reload_class_ptr) (rtx, enum reg_class); extern enum reg_class (*rs6000_secondary_reload_class_ptr) (enum reg_class, Index: gcc/config/rs6000/rs6000.c =================================================================== --- gcc/config/rs6000/rs6000.c (revision 265931) +++ gcc/config/rs6000/rs6000.c (working copy) @@ -2787,7 +2787,7 @@ rs6000_debug_reg_global (void) { char options[80]; - strcpy (options, (TARGET_P9_FUSION) ? 
"power9" : "power8"); + strcpy (options, "power8"); if (TARGET_P8_FUSION_SIGN) strcat (options, ", sign"); @@ -4163,10 +4163,15 @@ rs6000_option_override_internal (bool gl rs6000_isa_flags |= OPTION_MASK_SAVE_TOC_INDIRECT; /* Enable power8 fusion if we are tuning for power8, even if we aren't - generating power8 instructions. */ + generating power8 instructions. Power9 does not optimize power8 fusion + cases. */ if (!(rs6000_isa_flags_explicit & OPTION_MASK_P8_FUSION)) - rs6000_isa_flags |= (processor_target_table[tune_index].target_enable - & OPTION_MASK_P8_FUSION); + { + if (processor_target_table[tune_index].processor == PROCESSOR_POWER8) + rs6000_isa_flags |= OPTION_MASK_P8_FUSION; + else + rs6000_isa_flags &= ~OPTION_MASK_P8_FUSION; + } /* Setting additional fusion flags turns on base fusion. */ if (!TARGET_P8_FUSION && TARGET_P8_FUSION_SIGN) @@ -4183,28 +4188,6 @@ rs6000_option_override_internal (bool gl rs6000_isa_flags |= OPTION_MASK_P8_FUSION; } - /* Power9 fusion is a superset over power8 fusion. */ - if (TARGET_P9_FUSION && !TARGET_P8_FUSION) - { - if (rs6000_isa_flags_explicit & OPTION_MASK_P8_FUSION) - { - /* We prefer to not mention undocumented options in - error messages. However, if users have managed to select - power9-fusion without selecting power8-fusion, they - already know about undocumented flags. */ - error ("%qs requires %qs", "-mpower9-fusion", "-mpower8-fusion"); - rs6000_isa_flags &= ~OPTION_MASK_P9_FUSION; - } - else - rs6000_isa_flags |= OPTION_MASK_P8_FUSION; - } - - /* Enable power9 fusion if we are tuning for power9, even if we aren't - generating power9 instructions. */ - if (!(rs6000_isa_flags_explicit & OPTION_MASK_P9_FUSION)) - rs6000_isa_flags |= (processor_target_table[tune_index].target_enable - & OPTION_MASK_P9_FUSION); - /* Power8 does not fuse sign extended loads with the addis. If we are optimizing at high levels for speed, convert a sign extended load into a zero extending load, and an explicit sign extension. 
*/ @@ -36010,7 +35993,6 @@ static struct rs6000_opt_mask const rs60 { "power8-fusion", OPTION_MASK_P8_FUSION, false, true }, { "power8-fusion-sign", OPTION_MASK_P8_FUSION_SIGN, false, true }, { "power8-vector", OPTION_MASK_P8_VECTOR, false, true }, - { "power9-fusion", OPTION_MASK_P9_FUSION, false, true }, { "power9-minmax", OPTION_MASK_P9_MINMAX, false, true }, { "power9-misc", OPTION_MASK_P9_MISC, false, true }, { "power9-vector", OPTION_MASK_P9_VECTOR, false, true }, @@ -38114,14 +38096,13 @@ emit_fusion_addis (rtx target, rtx addis /* Emit a D-form load or store instruction that is the second instruction of a fusion sequence. */ -void -emit_fusion_load_store (rtx load_store_reg, rtx addis_reg, rtx offset, - const char *insn_str) +static void +emit_fusion_load (rtx load_reg, rtx addis_reg, rtx offset, const char *insn_str) { rtx fuse_ops[10]; char insn_template[80]; - fuse_ops[0] = load_store_reg; + fuse_ops[0] = load_reg; fuse_ops[1] = addis_reg; if (CONST_INT_P (offset) && satisfies_constraint_I (offset)) @@ -38259,367 +38240,12 @@ emit_fusion_gpr_load (rtx target, rtx me emit_fusion_addis (target, addis_value); /* Emit the D-form load instruction. */ - emit_fusion_load_store (target, target, load_offset, load_str); + emit_fusion_load (target, target, load_offset, load_str); return ""; } -/* Return true if the peephole2 can combine a load/store involving a - combination of an addis instruction and the memory operation. This was - added to the ISA 3.0 (power9) hardware. */ - -bool -fusion_p9_p (rtx addis_reg, /* register set via addis. */ - rtx addis_value, /* addis value. */ - rtx dest, /* destination (memory or register). */ - rtx src) /* source (register or memory). */ -{ - rtx addr, mem, offset; - machine_mode mode = GET_MODE (src); - - /* Validate arguments. 
*/ - if (!base_reg_operand (addis_reg, GET_MODE (addis_reg))) - return false; - - if (!fusion_gpr_addis (addis_value, GET_MODE (addis_value))) - return false; - - /* Ignore extend operations that are part of the load. */ - if (GET_CODE (src) == FLOAT_EXTEND || GET_CODE (src) == ZERO_EXTEND) - src = XEXP (src, 0); - - /* Test for memory<-register or register<-memory. */ - if (fpr_reg_operand (src, mode) || int_reg_operand (src, mode)) - { - if (!MEM_P (dest)) - return false; - - mem = dest; - } - - else if (MEM_P (src)) - { - if (!fpr_reg_operand (dest, mode) && !int_reg_operand (dest, mode)) - return false; - - mem = src; - } - - else - return false; - - addr = XEXP (mem, 0); /* either PLUS or LO_SUM. */ - if (GET_CODE (addr) == PLUS) - { - if (!rtx_equal_p (addis_reg, XEXP (addr, 0))) - return false; - - return satisfies_constraint_I (XEXP (addr, 1)); - } - - else if (GET_CODE (addr) == LO_SUM) - { - if (!rtx_equal_p (addis_reg, XEXP (addr, 0))) - return false; - - offset = XEXP (addr, 1); - if (TARGET_XCOFF || (TARGET_ELF && TARGET_POWERPC64)) - return small_toc_ref (offset, GET_MODE (offset)); - - else if (TARGET_ELF && !TARGET_POWERPC64) - return CONSTANT_P (offset); - } - - return false; -} - -/* During the peephole2 pass, adjust and expand the insns for an extended fusion - load sequence. - - The operands are: - operands[0] register set with addis - operands[1] value set via addis - operands[2] target register being loaded - operands[3] D-form memory reference using operands[0]. - - This is similar to the fusion introduced with power8, except it scales to - both loads/stores and does not require the result register to be the same as - the base register. At the moment, we only do this if register set with addis - is dead. 
*/ - -void -expand_fusion_p9_load (rtx *operands) -{ - rtx tmp_reg = operands[0]; - rtx addis_value = operands[1]; - rtx target = operands[2]; - rtx orig_mem = operands[3]; - rtx new_addr, new_mem, orig_addr, offset, set, clobber, insn; - enum rtx_code plus_or_lo_sum; - machine_mode target_mode = GET_MODE (target); - machine_mode extend_mode = target_mode; - machine_mode ptr_mode = Pmode; - enum rtx_code extend = UNKNOWN; - - if (GET_CODE (orig_mem) == FLOAT_EXTEND || GET_CODE (orig_mem) == ZERO_EXTEND) - { - extend = GET_CODE (orig_mem); - orig_mem = XEXP (orig_mem, 0); - target_mode = GET_MODE (orig_mem); - } - - gcc_assert (MEM_P (orig_mem)); - - orig_addr = XEXP (orig_mem, 0); - plus_or_lo_sum = GET_CODE (orig_addr); - gcc_assert (plus_or_lo_sum == PLUS || plus_or_lo_sum == LO_SUM); - - offset = XEXP (orig_addr, 1); - new_addr = gen_rtx_fmt_ee (plus_or_lo_sum, ptr_mode, addis_value, offset); - new_mem = replace_equiv_address_nv (orig_mem, new_addr, false); - - if (extend != UNKNOWN) - new_mem = gen_rtx_fmt_e (extend, extend_mode, new_mem); - - new_mem = gen_rtx_UNSPEC (extend_mode, gen_rtvec (1, new_mem), - UNSPEC_FUSION_P9); - - set = gen_rtx_SET (target, new_mem); - clobber = gen_rtx_CLOBBER (VOIDmode, tmp_reg); - insn = gen_rtx_PARALLEL (VOIDmode, gen_rtvec (2, set, clobber)); - emit_insn (insn); - - return; -} - -/* During the peephole2 pass, adjust and expand the insns for an extended fusion - store sequence. - - The operands are: - operands[0] register set with addis - operands[1] value set via addis - operands[2] target D-form memory being stored to - operands[3] register being stored - - This is similar to the fusion introduced with power8, except it scales to - both loads/stores and does not require the result register to be the same as - the base register. At the moment, we only do this if register set with addis - is dead. 
*/ - -void -expand_fusion_p9_store (rtx *operands) -{ - rtx tmp_reg = operands[0]; - rtx addis_value = operands[1]; - rtx orig_mem = operands[2]; - rtx src = operands[3]; - rtx new_addr, new_mem, orig_addr, offset, set, clobber, insn, new_src; - enum rtx_code plus_or_lo_sum; - machine_mode target_mode = GET_MODE (orig_mem); - machine_mode ptr_mode = Pmode; - - gcc_assert (MEM_P (orig_mem)); - - orig_addr = XEXP (orig_mem, 0); - plus_or_lo_sum = GET_CODE (orig_addr); - gcc_assert (plus_or_lo_sum == PLUS || plus_or_lo_sum == LO_SUM); - - offset = XEXP (orig_addr, 1); - new_addr = gen_rtx_fmt_ee (plus_or_lo_sum, ptr_mode, addis_value, offset); - new_mem = replace_equiv_address_nv (orig_mem, new_addr, false); - - new_src = gen_rtx_UNSPEC (target_mode, gen_rtvec (1, src), - UNSPEC_FUSION_P9); - - set = gen_rtx_SET (new_mem, new_src); - clobber = gen_rtx_CLOBBER (VOIDmode, tmp_reg); - insn = gen_rtx_PARALLEL (VOIDmode, gen_rtvec (2, set, clobber)); - emit_insn (insn); - - return; -} - -/* Return a string to fuse an addis instruction with a load using extended - fusion. The address that is used is the logical address that was formed - during peephole2: (lo_sum (high) (low-part)) - - The code is complicated, so we call output_asm_insn directly, and just - return "". 
*/ - -const char * -emit_fusion_p9_load (rtx reg, rtx mem, rtx tmp_reg) -{ - machine_mode mode = GET_MODE (reg); - rtx hi; - rtx lo; - rtx addr; - const char *load_string; - int r; - - if (GET_CODE (mem) == FLOAT_EXTEND || GET_CODE (mem) == ZERO_EXTEND) - { - mem = XEXP (mem, 0); - mode = GET_MODE (mem); - } - - if (GET_CODE (reg) == SUBREG) - { - gcc_assert (SUBREG_BYTE (reg) == 0); - reg = SUBREG_REG (reg); - } - - if (!REG_P (reg)) - fatal_insn ("emit_fusion_p9_load, bad reg #1", reg); - - r = REGNO (reg); - if (FP_REGNO_P (r)) - { - if (mode == SFmode) - load_string = "lfs"; - else if (mode == DFmode || mode == DImode) - load_string = "lfd"; - else - gcc_unreachable (); - } - else if (ALTIVEC_REGNO_P (r) && TARGET_P9_VECTOR) - { - if (mode == SFmode) - load_string = "lxssp"; - else if (mode == DFmode || mode == DImode) - load_string = "lxsd"; - else - gcc_unreachable (); - } - else if (INT_REGNO_P (r)) - { - switch (mode) - { - case E_QImode: - load_string = "lbz"; - break; - case E_HImode: - load_string = "lhz"; - break; - case E_SImode: - case E_SFmode: - load_string = "lwz"; - break; - case E_DImode: - case E_DFmode: - if (!TARGET_POWERPC64) - gcc_unreachable (); - load_string = "ld"; - break; - default: - gcc_unreachable (); - } - } - else - fatal_insn ("emit_fusion_p9_load, bad reg #2", reg); - - if (!MEM_P (mem)) - fatal_insn ("emit_fusion_p9_load not MEM", mem); - - addr = XEXP (mem, 0); - fusion_split_address (addr, &hi, &lo); - - /* Emit the addis instruction. */ - emit_fusion_addis (tmp_reg, hi); - - /* Emit the D-form load instruction. */ - emit_fusion_load_store (reg, tmp_reg, lo, load_string); - - return ""; -} - -/* Return a string to fuse an addis instruction with a store using extended - fusion. The address that is used is the logical address that was formed - during peephole2: (lo_sum (high) (low-part)) - - The code is complicated, so we call output_asm_insn directly, and just - return "". 
*/ - -const char * -emit_fusion_p9_store (rtx mem, rtx reg, rtx tmp_reg) -{ - machine_mode mode = GET_MODE (reg); - rtx hi; - rtx lo; - rtx addr; - const char *store_string; - int r; - - if (GET_CODE (reg) == SUBREG) - { - gcc_assert (SUBREG_BYTE (reg) == 0); - reg = SUBREG_REG (reg); - } - - if (!REG_P (reg)) - fatal_insn ("emit_fusion_p9_store, bad reg #1", reg); - - r = REGNO (reg); - if (FP_REGNO_P (r)) - { - if (mode == SFmode) - store_string = "stfs"; - else if (mode == DFmode) - store_string = "stfd"; - else - gcc_unreachable (); - } - else if (ALTIVEC_REGNO_P (r) && TARGET_P9_VECTOR) - { - if (mode == SFmode) - store_string = "stxssp"; - else if (mode == DFmode || mode == DImode) - store_string = "stxsd"; - else - gcc_unreachable (); - } - else if (INT_REGNO_P (r)) - { - switch (mode) - { - case E_QImode: - store_string = "stb"; - break; - case E_HImode: - store_string = "sth"; - break; - case E_SImode: - case E_SFmode: - store_string = "stw"; - break; - case E_DImode: - case E_DFmode: - if (!TARGET_POWERPC64) - gcc_unreachable (); - store_string = "std"; - break; - default: - gcc_unreachable (); - } - } - else - fatal_insn ("emit_fusion_p9_store, bad reg #2", reg); - - if (!MEM_P (mem)) - fatal_insn ("emit_fusion_p9_store not MEM", mem); - - addr = XEXP (mem, 0); - fusion_split_address (addr, &hi, &lo); - - /* Emit the addis instruction. */ - emit_fusion_addis (tmp_reg, hi); - - /* Emit the D-form load instruction. */ - emit_fusion_load_store (reg, tmp_reg, lo, store_string); - - return ""; -} - #ifdef RS6000_GLIBC_ATOMIC_FENV /* Function declarations for rs6000_atomic_assign_expand_fenv. 
*/ static tree atomic_hold_decl, atomic_clear_decl, atomic_update_decl; Index: gcc/config/rs6000/rs6000.md =================================================================== --- gcc/config/rs6000/rs6000.md (revision 265931) +++ gcc/config/rs6000/rs6000.md (working copy) @@ -136,7 +136,6 @@ (define_c_enum "unspec" UNSPEC_LSQ UNSPEC_FUSION_GPR UNSPEC_STACK_CHECK - UNSPEC_FUSION_P9 UNSPEC_ADD_ROUND_TO_ODD UNSPEC_SUB_ROUND_TO_ODD UNSPEC_MUL_ROUND_TO_ODD @@ -349,19 +348,6 @@ (define_mode_iterator HSI [HI SI]) ; SImode or DImode, even if DImode doesn't fit in GPRs. (define_mode_iterator SDI [SI DI]) -; Types that can be fused with an ADDIS instruction to load or store a GPR -; register that has reg+offset addressing. -(define_mode_iterator GPR_FUSION [QI - HI - SI - (DI "TARGET_POWERPC64") - SF - (DF "TARGET_POWERPC64")]) - -; Types that can be fused with an ADDIS instruction to load or store a FPR -; register that has reg+offset addressing. -(define_mode_iterator FPR_FUSION [DI SF DF]) - ; The size of a pointer. Also, the size of the value that a record-condition ; (one with a '.') will compare; and the size used for arithmetic carries. (define_mode_iterator P [(SI "TARGET_32BIT") (DI "TARGET_64BIT")]) @@ -13724,134 +13710,6 @@ (define_insn "*fusion_gpr_load_" (set_attr "length" "8")]) -;; ISA 3.0 (power9) fusion support -;; Merge addis with floating load/store to FPRs (or GPRs). 
-(define_peephole2 - [(set (match_operand:P 0 "base_reg_operand") - (match_operand:P 1 "fusion_gpr_addis")) - (set (match_operand:SFDF 2 "p9_fusion_reg_operand") - (match_operand:SFDF 3 "fusion_offsettable_mem_operand"))] - "TARGET_P9_FUSION && peep2_reg_dead_p (2, operands[0]) - && fusion_p9_p (operands[0], operands[1], operands[2], operands[3])" - [(const_int 0)] -{ - expand_fusion_p9_load (operands); - DONE; -}) - -(define_peephole2 - [(set (match_operand:P 0 "base_reg_operand") - (match_operand:P 1 "fusion_gpr_addis")) - (set (match_operand:SFDF 2 "offsettable_mem_operand") - (match_operand:SFDF 3 "p9_fusion_reg_operand"))] - "TARGET_P9_FUSION && peep2_reg_dead_p (2, operands[0]) - && fusion_p9_p (operands[0], operands[1], operands[2], operands[3]) - && !rtx_equal_p (operands[0], operands[3])" - [(const_int 0)] -{ - expand_fusion_p9_store (operands); - DONE; -}) - -(define_peephole2 - [(set (match_operand:SDI 0 "int_reg_operand") - (match_operand:SDI 1 "upper16_cint_operand")) - (set (match_dup 0) - (ior:SDI (match_dup 0) - (match_operand:SDI 2 "u_short_cint_operand")))] - "TARGET_P9_FUSION" - [(set (match_dup 0) - (unspec:SDI [(match_dup 1) - (match_dup 2)] UNSPEC_FUSION_P9))]) - -(define_peephole2 - [(set (match_operand:SDI 0 "int_reg_operand") - (match_operand:SDI 1 "upper16_cint_operand")) - (set (match_operand:SDI 2 "int_reg_operand") - (ior:SDI (match_dup 0) - (match_operand:SDI 3 "u_short_cint_operand")))] - "TARGET_P9_FUSION - && !rtx_equal_p (operands[0], operands[2]) - && peep2_reg_dead_p (2, operands[0])" - [(set (match_dup 2) - (unspec:SDI [(match_dup 1) - (match_dup 3)] UNSPEC_FUSION_P9))]) - -;; Fusion insns, created by the define_peephole2 above (and eventually by -;; reload). Because we want to eventually have secondary_reload generate -;; these, they have to have a single alternative that gives the register -;; classes. This means we need to have separate gpr/fpr/altivec versions. 
-(define_insn "*fusion_gpr___load"
-  [(set (match_operand:GPR_FUSION 0 "int_reg_operand" "=r")
-        (unspec:GPR_FUSION
-         [(match_operand:GPR_FUSION 1 "fusion_addis_mem_combo_load" "wF")]
-         UNSPEC_FUSION_P9))
-   (clobber (match_operand:P 2 "base_reg_operand" "=b"))]
-  "TARGET_P9_FUSION"
-{
-  /* This insn is a secondary reload insn, which cannot have alternatives.
-     If we are not loading up register 0, use the power8 fusion instead. */
-  if (base_reg_operand (operands[0], mode))
-    return emit_fusion_gpr_load (operands[0], operands[1]);
-
-  return emit_fusion_p9_load (operands[0], operands[1], operands[2]);
-}
-  [(set_attr "type" "load")
-   (set_attr "length" "8")])
-
-(define_insn "*fusion_gpr___store"
-  [(set (match_operand:GPR_FUSION 0 "fusion_addis_mem_combo_store" "=wF")
-        (unspec:GPR_FUSION
-         [(match_operand:GPR_FUSION 1 "int_reg_operand" "r")]
-         UNSPEC_FUSION_P9))
-   (clobber (match_operand:P 2 "base_reg_operand" "=b"))]
-  "TARGET_P9_FUSION"
-{
-  return emit_fusion_p9_store (operands[0], operands[1], operands[2]);
-}
-  [(set_attr "type" "store")
-   (set_attr "length" "8")])
-
-(define_insn "*fusion_vsx___load"
-  [(set (match_operand:FPR_FUSION 0 "vsx_register_operand" "=dwb")
-        (unspec:FPR_FUSION
-         [(match_operand:FPR_FUSION 1 "fusion_addis_mem_combo_load" "wF")]
-         UNSPEC_FUSION_P9))
-   (clobber (match_operand:P 2 "base_reg_operand" "=b"))]
-  "TARGET_P9_FUSION"
-{
-  return emit_fusion_p9_load (operands[0], operands[1], operands[2]);
-}
-  [(set_attr "type" "fpload")
-   (set_attr "length" "8")])
-
-(define_insn "*fusion_vsx___store"
-  [(set (match_operand:FPR_FUSION 0 "fusion_addis_mem_combo_store" "=wF")
-        (unspec:FPR_FUSION
-         [(match_operand:FPR_FUSION 1 "vsx_register_operand" "dwb")]
-         UNSPEC_FUSION_P9))
-   (clobber (match_operand:P 2 "base_reg_operand" "=b"))]
-  "TARGET_P9_FUSION"
-{
-  return emit_fusion_p9_store (operands[0], operands[1], operands[2]);
-}
-  [(set_attr "type" "fpstore")
-   (set_attr "length" "8")])
-
-(define_insn "*fusion_p9__constant"
-  [(set (match_operand:SDI 0 "int_reg_operand" "=r")
-        (unspec:SDI [(match_operand:SDI 1 "upper16_cint_operand" "L")
-                     (match_operand:SDI 2 "u_short_cint_operand" "K")]
-                    UNSPEC_FUSION_P9))]
-  "TARGET_P9_FUSION"
-{
-  emit_fusion_addis (operands[0], operands[1]);
-  return "ori %0,%0,%2";
-}
-  [(set_attr "type" "two")
-   (set_attr "length" "8")])
-
-
 ;; Optimize cases where we want to do a D-form load (register+offset) on
 ;; ISA 2.06/2.07 to an Altivec register, and the register allocator
 ;; has generated:
Index: gcc/config/rs6000/rs6000.opt
===================================================================
--- gcc/config/rs6000/rs6000.opt (revision 265931)
+++ gcc/config/rs6000/rs6000.opt (working copy)
@@ -498,10 +498,6 @@ moptimize-swaps
 Target Undocumented Var(rs6000_optimize_swaps) Init(1) Save
 Analyze and remove doubleword swaps from VSX computations.
 
-mpower9-fusion
-Target Undocumented Report Mask(P9_FUSION) Var(rs6000_isa_flags)
-Fuse certain operations together for better performance on power9.
-
 mpower9-misc
 Target Undocumented Report Mask(P9_MISC) Var(rs6000_isa_flags)
 Use certain scalar instructions added in ISA 3.0.
Index: gcc/doc/md.texi
===================================================================
--- gcc/doc/md.texi (revision 265931)
+++ gcc/doc/md.texi (working copy)
@@ -3213,7 +3213,7 @@ Int constant that is the element number
 Vector constant that can be loaded with the XXSPLTIB instruction.
 
 @item wF
-Memory operand suitable for power9 fusion load/stores.
+Memory operand suitable for power8 GPR load fusion
 
 @item wG
 Memory operand suitable for TOC fusion memory references.
Index: gcc/testsuite/gcc.target/powerpc/fusion3.c
===================================================================
--- gcc/testsuite/gcc.target/powerpc/fusion3.c (revision 265931)
+++ gcc/testsuite/gcc.target/powerpc/fusion3.c (working copy)
@@ -1,18 +0,0 @@
-/* { dg-do compile { target { powerpc*-*-* } } } */
-/* { dg-skip-if "" { powerpc*-*-darwin* } } */
-/* { dg-require-effective-target powerpc_p9vector_ok } */
-/* { dg-skip-if "do not override -mcpu" { powerpc*-*-* } { "-mcpu=*" } { "-mcpu=power7" } } */
-/* { dg-options "-mcpu=power7 -mtune=power9 -O3 -dp" } */
-
-#define LARGE 0x12345
-
-int fusion_float_read (float *p){ return p[LARGE]; }
-int fusion_double_read (double *p){ return p[LARGE]; }
-
-void fusion_float_write (float *p, float f){ p[LARGE] = f; }
-void fusion_double_write (double *p, double d){ p[LARGE] = d; }
-
-/* { dg-final { scan-assembler {fusion_vsx_[sd]i_sf_load} } } */
-/* { dg-final { scan-assembler {fusion_vsx_[sd]i_df_load} } } */
-/* { dg-final { scan-assembler {fusion_vsx_[sd]i_sf_store} } } */
-/* { dg-final { scan-assembler {fusion_vsx_[sd]i_df_store} } } */
Index: gcc/testsuite/gcc.target/powerpc/fusion4.c
===================================================================
--- gcc/testsuite/gcc.target/powerpc/fusion4.c (revision 265931)
+++ gcc/testsuite/gcc.target/powerpc/fusion4.c (working copy)
@@ -1,12 +0,0 @@
-/* { dg-skip-if "" { powerpc*-*-darwin* } } */
-/* { dg-require-effective-target powerpc_p9vector_ok } */
-/* { dg-skip-if "do not override -mcpu" { powerpc*-*-* } { "-mcpu=*" } { "-mcpu=power7" } } */
-/* { dg-options "-mcpu=power7 -mtune=power9 -O3 -msoft-float -dp" } */
-
-#define LARGE 0x12345
-
-float fusion_float_read (float *p){ return p[LARGE]; }
-
-void fusion_float_write (float *p, float f){ p[LARGE] = f; }
-
-/* { dg-final { scan-assembler {fusion_gpr_[sd]i_sf_store} } } */