From patchwork Fri Jul 13 20:56:13 2018
X-Patchwork-Submitter: Michael Meissner
X-Patchwork-Id: 943814
Date: Fri, 13 Jul 2018 16:56:13 -0400
From: Michael Meissner
To: GCC Patches, Segher Boessenkool, David Edelsohn, Bill Schmidt, Michael Meissner
Subject: [PATCH], Remove undocumented -mtoc-fusion from PowerPC
Message-Id: <20180713205613.GA16944@ibm-toto.the-meissners.org>

Back in the days when I was developing the extended fusion support for PowerPC (-mpower9-fusion), I added a partially implemented option called toc fusion.
The idea was to recognize TOC entries (that normally get split into HIGH/LO_SUM pairs) early on, and keep the pairs together.

Unfortunately, I messed up the setting, and you could not actually use -mtoc-fusion without also setting -mcmodel=medium, since the TOC fusion tests in rs6000.c occurred before the default code model was set in SUBSUBTARGET_OVERRIDE_OPTIONS. However, I stopped doing fusion work to do other things (basic power9 enablement and IEEE 128-bit floating point).

While it would be simple to move the tests for TOC fusion to after the location where the code model is set, I think the current code is rather limited. Right now, TOC fusion replaces each TOC reference with a new insn that has the scratch register as a clobber. However, if you have multiple references to the same variable (such as doing the ++/-- operators) in a basic block, or references to variables located near the variable you previously referenced, we will generate multiple ADDIS operations.

I have ideas on how to do a better job of fusion for current and future machines using a machine-dependent pass to do fusion optimizations within a basic block. This means that rather than keeping the TOC fusion around (that nobody used), I would prefer to delete the current code, and replace it with better code as I implement it.

I have tested this on a power8 little endian system with a bootstrap build and with make check. There were no regressions. In addition, I built the full SPEC 2006 CPU benchmark suite for power9 to make sure I didn't accidentally delete insns that are used for -mpower9-fusion. Can I check this into the trunk? I don't anticipate that we will need a backport to the FSF GCC 8 branch.

2018-07-13  Michael Meissner

	* config/rs6000/constraints.md (wG constraint): Delete, no longer used.
	* config/rs6000/predicates.md (p9_fusion_reg_operand): Rename predicate to reflect toc fusion has been deleted.
	(toc_fusion_mem_raw): Delete, no longer used.
	(toc_fusion_mem_wrapped): Likewise.
	* config/rs6000/rs6000-cpus.def (POWERPC_MASKS): Delete toc fusion mask bit.
	* config/rs6000/rs6000-protos.h (fusion_wrap_memory_address): Delete, no longer used.
	* config/rs6000/rs6000.c (struct rs6000_reg_addr): Delete fields meant to be used for toc fusion.
	(rs6000_debug_print_mode): Delete toc fusion debugging.
	(rs6000_debug_reg_global): Likewise.
	(rs6000_init_hard_regno_mode_ok): Delete setting up fields for toc fusion and secondary reload support that were never used.
	(rs6000_option_override_internal): Delete TOC fusion, that was only partially defined, and it did not work unless you also used the -mcmodel= switch.
	(rs6000_legitimate_address_p): Delete TOC fusion support.
	(rs6000_opt_masks): Likewise.
	(fusion_wrap_memory_address): Delete function, no longer used.
	(fusion_split_address): Delete TOC fusion support.
	* config/rs6000/rs6000.h (TARGET_TOC_FUSION_INT): Delete, no longer used with toc fusion being deleted.
	(TARGET_TOC_FUSION_FP): Likewise.
	* config/rs6000/rs6000.md (UNSPEC_FUSION_ADDIS): Delete TOC fusion UNSPEC.
	(toc fusion splitter): Delete TOC fusion support.
	(toc_fusionload_): Likewise.
	(toc_fusionload_di): Likewise.
	(fusion_gpr_load_): Delete generator function, this insn no longer needs to be named. Rename predicate to delete TOC fusion.
	(fusion_gpr___load): Likewise.
	(fusion_gpr___store): Likewise.
	(fusion_vsx___load): Likewise.
	(fusion_vsx___store): Likewise.
	(p9 fusion peephole2s): Rename predicate to delete TOC fusion.

Index: gcc/config/rs6000/constraints.md
===================================================================
--- gcc/config/rs6000/constraints.md	(revision 262647)
+++ gcc/config/rs6000/constraints.md	(working copy)
@@ -157,10 +157,8 @@ (define_memory_constraint "wF"
   "Memory operand suitable for power9 fusion load/stores"
   (match_operand 0 "fusion_addis_mem_combo_load"))
 
-;; Fusion gpr load.
-(define_memory_constraint "wG" - "Memory operand suitable for TOC fusion memory references" - (match_operand 0 "toc_fusion_mem_wrapped")) +;; wG is now available. Previously it was a memory operand suitable for TOC +;; fusion. (define_register_constraint "wH" "rs6000_constraints[RS6000_CONSTRAINT_wH]" "Altivec register to hold 32-bit integers or NO_REGS.") Index: gcc/config/rs6000/predicates.md =================================================================== --- gcc/config/rs6000/predicates.md (revision 262647) +++ gcc/config/rs6000/predicates.md (working copy) @@ -412,7 +412,7 @@ (define_predicate "fpr_reg_operand" ;; ;; If this is a pseudo only allow for GPR fusion in power8. If we have the ;; power9 fusion allow the floating point types. -(define_predicate "toc_fusion_or_p9_reg_operand" +(define_predicate "p9_fusion_reg_operand" (match_code "reg,subreg") { HOST_WIDE_INT r; @@ -1664,35 +1664,6 @@ (define_predicate "small_toc_ref" return GET_CODE (op) == UNSPEC && XINT (op, 1) == UNSPEC_TOCREL; }) -;; Match the TOC memory operand that can be fused with an addis instruction. -;; This is used in matching a potential fused address before register -;; allocation. -(define_predicate "toc_fusion_mem_raw" - (match_code "mem") -{ - if (!TARGET_TOC_FUSION_INT || !can_create_pseudo_p ()) - return false; - - return small_toc_ref (XEXP (op, 0), Pmode); -}) - -;; Match the memory operand that has been fused with an addis instruction and -;; wrapped inside of an (unspec [...] UNSPEC_FUSION_ADDIS) wrapper. -(define_predicate "toc_fusion_mem_wrapped" - (match_code "mem") -{ - rtx addr; - - if (!TARGET_TOC_FUSION_INT) - return false; - - if (!MEM_P (op)) - return false; - - addr = XEXP (op, 0); - return (GET_CODE (addr) == UNSPEC && XINT (addr, 1) == UNSPEC_FUSION_ADDIS); -}) - ;; Match the first insn (addis) in fusing the combination of addis and loads to ;; GPR registers on power8. 
(define_predicate "fusion_gpr_addis" Index: gcc/config/rs6000/rs6000-cpus.def =================================================================== --- gcc/config/rs6000/rs6000-cpus.def (revision 262647) +++ gcc/config/rs6000/rs6000-cpus.def (working copy) @@ -135,7 +135,6 @@ | OPTION_MASK_RECIP_PRECISION \ | OPTION_MASK_SOFT_FLOAT \ | OPTION_MASK_STRICT_ALIGN_OPTIONAL \ - | OPTION_MASK_TOC_FUSION \ | OPTION_MASK_VSX) #endif Index: gcc/config/rs6000/rs6000-protos.h =================================================================== --- gcc/config/rs6000/rs6000-protos.h (revision 262647) +++ gcc/config/rs6000/rs6000-protos.h (working copy) @@ -98,7 +98,6 @@ extern void expand_fusion_p9_load (rtx * extern void expand_fusion_p9_store (rtx *); extern const char *emit_fusion_p9_load (rtx, rtx, rtx); extern const char *emit_fusion_p9_store (rtx, rtx, rtx); -extern rtx fusion_wrap_memory_address (rtx); extern enum reg_class (*rs6000_preferred_reload_class_ptr) (rtx, enum reg_class); extern enum reg_class (*rs6000_secondary_reload_class_ptr) (enum reg_class, Index: gcc/config/rs6000/rs6000.c =================================================================== --- gcc/config/rs6000/rs6000.c (revision 262647) +++ gcc/config/rs6000/rs6000.c (working copy) @@ -531,18 +531,8 @@ struct rs6000_reg_addr { enum insn_code reload_fpr_gpr; /* INSN to move from FPR to GPR. */ enum insn_code reload_gpr_vsx; /* INSN to move from GPR to VSX. */ enum insn_code reload_vsx_gpr; /* INSN to move from VSX to GPR. */ - enum insn_code fusion_gpr_ld; /* INSN for fusing gpr ADDIS/loads. */ - /* INSNs for fusing addi with loads - or stores for each reg. class. */ - enum insn_code fusion_addi_ld[(int)N_RELOAD_REG]; - enum insn_code fusion_addi_st[(int)N_RELOAD_REG]; - /* INSNs for fusing addis with loads - or stores for each reg. class. 
*/ - enum insn_code fusion_addis_ld[(int)N_RELOAD_REG]; - enum insn_code fusion_addis_st[(int)N_RELOAD_REG]; addr_mask_type addr_mask[(int)N_RELOAD_REG]; /* Valid address masks. */ bool scalar_in_vmx_p; /* Scalar value can go in VMX. */ - bool fused_toc; /* Mode supports TOC fusion. */ }; static struct rs6000_reg_addr reg_addr[NUM_MACHINE_MODES]; @@ -2376,7 +2366,6 @@ rs6000_debug_print_mode (ssize_t m) { ssize_t rc; int spaces = 0; - bool fuse_extra_p; fprintf (stderr, "Mode: %-5s", GET_MODE_NAME (m)); for (rc = 0; rc < N_RELOAD_REG; rc++) @@ -2385,9 +2374,12 @@ rs6000_debug_print_mode (ssize_t m) if ((reg_addr[m].reload_store != CODE_FOR_nothing) || (reg_addr[m].reload_load != CODE_FOR_nothing)) - fprintf (stderr, " Reload=%c%c", - (reg_addr[m].reload_store != CODE_FOR_nothing) ? 's' : '*', - (reg_addr[m].reload_load != CODE_FOR_nothing) ? 'l' : '*'); + { + fprintf (stderr, "%*s Reload=%c%c", spaces, "", + (reg_addr[m].reload_store != CODE_FOR_nothing) ? 's' : '*', + (reg_addr[m].reload_load != CODE_FOR_nothing) ? 
'l' : '*'); + spaces = 0; + } else spaces += sizeof (" Reload=sl") - 1; @@ -2399,82 +2391,6 @@ rs6000_debug_print_mode (ssize_t m) else spaces += sizeof (" Upper=y") - 1; - fuse_extra_p = ((reg_addr[m].fusion_gpr_ld != CODE_FOR_nothing) - || reg_addr[m].fused_toc); - if (!fuse_extra_p) - { - for (rc = 0; rc < N_RELOAD_REG; rc++) - { - if (rc != RELOAD_REG_ANY) - { - if (reg_addr[m].fusion_addi_ld[rc] != CODE_FOR_nothing - || reg_addr[m].fusion_addi_ld[rc] != CODE_FOR_nothing - || reg_addr[m].fusion_addi_st[rc] != CODE_FOR_nothing - || reg_addr[m].fusion_addis_ld[rc] != CODE_FOR_nothing - || reg_addr[m].fusion_addis_st[rc] != CODE_FOR_nothing) - { - fuse_extra_p = true; - break; - } - } - } - } - - if (fuse_extra_p) - { - fprintf (stderr, "%*s Fuse:", spaces, ""); - spaces = 0; - - for (rc = 0; rc < N_RELOAD_REG; rc++) - { - if (rc != RELOAD_REG_ANY) - { - char load, store; - - if (reg_addr[m].fusion_addis_ld[rc] != CODE_FOR_nothing) - load = 'l'; - else if (reg_addr[m].fusion_addi_ld[rc] != CODE_FOR_nothing) - load = 'L'; - else - load = '-'; - - if (reg_addr[m].fusion_addis_st[rc] != CODE_FOR_nothing) - store = 's'; - else if (reg_addr[m].fusion_addi_st[rc] != CODE_FOR_nothing) - store = 'S'; - else - store = '-'; - - if (load == '-' && store == '-') - spaces += 5; - else - { - fprintf (stderr, "%*s%c=%c%c", (spaces + 1), "", - reload_reg_map[rc].name[0], load, store); - spaces = 0; - } - } - } - - if (reg_addr[m].fusion_gpr_ld != CODE_FOR_nothing) - { - fprintf (stderr, "%*sP8gpr", (spaces + 1), ""); - spaces = 0; - } - else - spaces += sizeof (" P8gpr") - 1; - - if (reg_addr[m].fused_toc) - { - fprintf (stderr, "%*sToc", (spaces + 1), ""); - spaces = 0; - } - else - spaces += sizeof (" Toc") - 1; - } - else - spaces += sizeof (" Fuse: G=ls F=ls v=ls P8gpr Toc") - 1; - if (rs6000_vector_unit[m] != VECTOR_NONE || rs6000_vector_mem[m] != VECTOR_NONE) { @@ -2870,9 +2786,6 @@ rs6000_debug_reg_global (void) char options[80]; strcpy (options, (TARGET_P9_FUSION) ? 
"power9" : "power8"); - if (TARGET_TOC_FUSION) - strcat (options, ", toc"); - if (TARGET_P8_FUSION_SIGN) strcat (options, ", sign"); @@ -3540,135 +3453,6 @@ rs6000_init_hard_regno_mode_ok (bool glo } } - /* Setup the fusion operations. */ - if (TARGET_P8_FUSION) - { - reg_addr[QImode].fusion_gpr_ld = CODE_FOR_fusion_gpr_load_qi; - reg_addr[HImode].fusion_gpr_ld = CODE_FOR_fusion_gpr_load_hi; - reg_addr[SImode].fusion_gpr_ld = CODE_FOR_fusion_gpr_load_si; - if (TARGET_64BIT) - reg_addr[DImode].fusion_gpr_ld = CODE_FOR_fusion_gpr_load_di; - } - - if (TARGET_P9_FUSION) - { - struct fuse_insns { - enum machine_mode mode; /* mode of the fused type. */ - enum machine_mode pmode; /* pointer mode. */ - enum rs6000_reload_reg_type rtype; /* register type. */ - enum insn_code load; /* load insn. */ - enum insn_code store; /* store insn. */ - }; - - static const struct fuse_insns addis_insns[] = { - { E_SFmode, E_DImode, RELOAD_REG_FPR, - CODE_FOR_fusion_vsx_di_sf_load, - CODE_FOR_fusion_vsx_di_sf_store }, - - { E_SFmode, E_SImode, RELOAD_REG_FPR, - CODE_FOR_fusion_vsx_si_sf_load, - CODE_FOR_fusion_vsx_si_sf_store }, - - { E_DFmode, E_DImode, RELOAD_REG_FPR, - CODE_FOR_fusion_vsx_di_df_load, - CODE_FOR_fusion_vsx_di_df_store }, - - { E_DFmode, E_SImode, RELOAD_REG_FPR, - CODE_FOR_fusion_vsx_si_df_load, - CODE_FOR_fusion_vsx_si_df_store }, - - { E_DImode, E_DImode, RELOAD_REG_FPR, - CODE_FOR_fusion_vsx_di_di_load, - CODE_FOR_fusion_vsx_di_di_store }, - - { E_DImode, E_SImode, RELOAD_REG_FPR, - CODE_FOR_fusion_vsx_si_di_load, - CODE_FOR_fusion_vsx_si_di_store }, - - { E_QImode, E_DImode, RELOAD_REG_GPR, - CODE_FOR_fusion_gpr_di_qi_load, - CODE_FOR_fusion_gpr_di_qi_store }, - - { E_QImode, E_SImode, RELOAD_REG_GPR, - CODE_FOR_fusion_gpr_si_qi_load, - CODE_FOR_fusion_gpr_si_qi_store }, - - { E_HImode, E_DImode, RELOAD_REG_GPR, - CODE_FOR_fusion_gpr_di_hi_load, - CODE_FOR_fusion_gpr_di_hi_store }, - - { E_HImode, E_SImode, RELOAD_REG_GPR, - CODE_FOR_fusion_gpr_si_hi_load, - 
CODE_FOR_fusion_gpr_si_hi_store }, - - { E_SImode, E_DImode, RELOAD_REG_GPR, - CODE_FOR_fusion_gpr_di_si_load, - CODE_FOR_fusion_gpr_di_si_store }, - - { E_SImode, E_SImode, RELOAD_REG_GPR, - CODE_FOR_fusion_gpr_si_si_load, - CODE_FOR_fusion_gpr_si_si_store }, - - { E_SFmode, E_DImode, RELOAD_REG_GPR, - CODE_FOR_fusion_gpr_di_sf_load, - CODE_FOR_fusion_gpr_di_sf_store }, - - { E_SFmode, E_SImode, RELOAD_REG_GPR, - CODE_FOR_fusion_gpr_si_sf_load, - CODE_FOR_fusion_gpr_si_sf_store }, - - { E_DImode, E_DImode, RELOAD_REG_GPR, - CODE_FOR_fusion_gpr_di_di_load, - CODE_FOR_fusion_gpr_di_di_store }, - - { E_DFmode, E_DImode, RELOAD_REG_GPR, - CODE_FOR_fusion_gpr_di_df_load, - CODE_FOR_fusion_gpr_di_df_store }, - }; - - machine_mode cur_pmode = Pmode; - size_t i; - - for (i = 0; i < ARRAY_SIZE (addis_insns); i++) - { - machine_mode xmode = addis_insns[i].mode; - enum rs6000_reload_reg_type rtype = addis_insns[i].rtype; - - if (addis_insns[i].pmode != cur_pmode) - continue; - - if (rtype == RELOAD_REG_FPR && !TARGET_HARD_FLOAT) - continue; - - reg_addr[xmode].fusion_addis_ld[rtype] = addis_insns[i].load; - reg_addr[xmode].fusion_addis_st[rtype] = addis_insns[i].store; - - if (rtype == RELOAD_REG_FPR && TARGET_P9_VECTOR) - { - reg_addr[xmode].fusion_addis_ld[RELOAD_REG_VMX] - = addis_insns[i].load; - reg_addr[xmode].fusion_addis_st[RELOAD_REG_VMX] - = addis_insns[i].store; - } - } - } - - /* Note which types we support fusing TOC setup plus memory insn. We only do - fused TOCs for medium/large code models. */ - if (TARGET_P8_FUSION && TARGET_TOC_FUSION && TARGET_POWERPC64 - && (TARGET_CMODEL != CMODEL_SMALL)) - { - reg_addr[QImode].fused_toc = true; - reg_addr[HImode].fused_toc = true; - reg_addr[SImode].fused_toc = true; - reg_addr[DImode].fused_toc = true; - if (TARGET_HARD_FLOAT) - { - reg_addr[SFmode].fused_toc = true; - reg_addr[DFmode].fused_toc = true; - } - } - /* Precalculate HARD_REGNO_NREGS. 
*/ for (r = 0; r < FIRST_PSEUDO_REGISTER; ++r) for (m = 0; m < NUM_MACHINE_MODES; ++m) @@ -4425,7 +4209,7 @@ rs6000_option_override_internal (bool gl & OPTION_MASK_P8_FUSION); /* Setting additional fusion flags turns on base fusion. */ - if (!TARGET_P8_FUSION && (TARGET_P8_FUSION_SIGN || TARGET_TOC_FUSION)) + if (!TARGET_P8_FUSION && TARGET_P8_FUSION_SIGN) { if (rs6000_isa_flags_explicit & OPTION_MASK_P8_FUSION) { @@ -4433,9 +4217,6 @@ rs6000_option_override_internal (bool gl error ("%qs requires %qs", "-mpower8-fusion-sign", "-mpower8-fusion"); - if (TARGET_TOC_FUSION) - error ("%qs requires %qs", "-mtoc-fusion", "-mpower8-fusion"); - rs6000_isa_flags &= ~OPTION_MASK_P8_FUSION; } else @@ -4473,28 +4254,6 @@ rs6000_option_override_internal (bool gl && optimize >= 3) rs6000_isa_flags |= OPTION_MASK_P8_FUSION_SIGN; - /* TOC fusion requires 64-bit and medium/large code model. */ - if (TARGET_TOC_FUSION && !TARGET_POWERPC64) - { - rs6000_isa_flags &= ~OPTION_MASK_TOC_FUSION; - if ((rs6000_isa_flags_explicit & OPTION_MASK_TOC_FUSION) != 0) - warning (0, N_("-mtoc-fusion requires 64-bit")); - } - - if (TARGET_TOC_FUSION && (TARGET_CMODEL == CMODEL_SMALL)) - { - rs6000_isa_flags &= ~OPTION_MASK_TOC_FUSION; - if ((rs6000_isa_flags_explicit & OPTION_MASK_TOC_FUSION) != 0) - warning (0, N_("-mtoc-fusion requires medium/large code model")); - } - - /* Turn on -mtoc-fusion by default if p8-fusion and 64-bit medium/large code - model. */ - if (TARGET_P8_FUSION && !TARGET_TOC_FUSION && TARGET_POWERPC64 - && (TARGET_CMODEL != CMODEL_SMALL) - && !(rs6000_isa_flags_explicit & OPTION_MASK_TOC_FUSION)) - rs6000_isa_flags |= OPTION_MASK_TOC_FUSION; - /* ISA 3.0 vector instructions include ISA 2.07. 
*/ if (TARGET_P9_VECTOR && !TARGET_P8_VECTOR) { @@ -9535,9 +9294,6 @@ rs6000_legitimate_address_p (machine_mod if (legitimate_constant_pool_address_p (x, mode, reg_ok_strict || lra_in_progress)) return 1; - if (reg_addr[mode].fused_toc && GET_CODE (x) == UNSPEC - && XINT (x, 1) == UNSPEC_FUSION_ADDIS) - return 1; } /* For TImode, if we have TImode in VSX registers, only allow register @@ -35880,7 +35636,6 @@ static struct rs6000_opt_mask const rs60 { "recip-precision", OPTION_MASK_RECIP_PRECISION, false, true }, { "save-toc-indirect", OPTION_MASK_SAVE_TOC_INDIRECT, false, true }, { "string", 0, false, true }, - { "toc-fusion", OPTION_MASK_TOC_FUSION, false, true }, { "update", OPTION_MASK_NO_UPDATE, true , true }, { "vsx", OPTION_MASK_VSX, false, true }, #ifdef OPTION_MASK_64BIT @@ -38043,37 +37798,17 @@ emit_fusion_load_store (rtx load_store_r return; } -/* Wrap a TOC address that can be fused to indicate that special fusion - processing is needed. */ - -rtx -fusion_wrap_memory_address (rtx old_mem) -{ - rtx old_addr = XEXP (old_mem, 0); - rtvec v = gen_rtvec (1, old_addr); - rtx new_addr = gen_rtx_UNSPEC (Pmode, v, UNSPEC_FUSION_ADDIS); - return replace_equiv_address_nv (old_mem, new_addr, false); -} - /* Given an address, convert it into the addis and load offset parts. 
Addresses created during the peephole2 process look like: (lo_sum (high (unspec [(sym)] UNSPEC_TOCREL)) - (unspec [(...)] UNSPEC_TOCREL)) - - Addresses created via toc fusion look like: - (unspec [(unspec [(...)] UNSPEC_TOCREL)] UNSPEC_FUSION_ADDIS)) */ + (unspec [(...)] UNSPEC_TOCREL)) */ static void fusion_split_address (rtx addr, rtx *p_hi, rtx *p_lo) { rtx hi, lo; - if (GET_CODE (addr) == UNSPEC && XINT (addr, 1) == UNSPEC_FUSION_ADDIS) - { - lo = XVECEXP (addr, 0, 0); - hi = gen_rtx_HIGH (Pmode, lo); - } - else if (GET_CODE (addr) == PLUS || GET_CODE (addr) == LO_SUM) + if (GET_CODE (addr) == PLUS || GET_CODE (addr) == LO_SUM) { hi = XEXP (addr, 0); lo = XEXP (addr, 1); @@ -38090,9 +37825,6 @@ fusion_split_address (rtx addr, rtx *p_h is the logical address that was formed during peephole2: (lo_sum (high) (low-part)) - Or the address is the TOC address that is wrapped before register allocation: - (unspec [(addr) (toc-reg)] UNSPEC_FUSION_ADDIS) - The code is complicated, so we call output_asm_insn directly, and just return "". */ Index: gcc/config/rs6000/rs6000.h =================================================================== --- gcc/config/rs6000/rs6000.h (revision 262647) +++ gcc/config/rs6000/rs6000.h (working copy) @@ -699,19 +699,6 @@ extern int rs6000_vector_align[]; #define TARGET_FRSQRTE (TARGET_HARD_FLOAT \ && (TARGET_PPC_GFXOPT || VECTOR_UNIT_VSX_P (DFmode))) -/* Conditions to allow TOC fusion for loading/storing integers. */ -#define TARGET_TOC_FUSION_INT (TARGET_P8_FUSION \ - && TARGET_TOC_FUSION \ - && (TARGET_CMODEL != CMODEL_SMALL) \ - && TARGET_POWERPC64) - -/* Conditions to allow TOC fusion for loading/storing floating point. 
*/ -#define TARGET_TOC_FUSION_FP (TARGET_P9_FUSION \ - && TARGET_TOC_FUSION \ - && (TARGET_CMODEL != CMODEL_SMALL) \ - && TARGET_POWERPC64 \ - && TARGET_HARD_FLOAT) - /* Macro to say whether we can do optimizations where we need to do parts of the calculation in 64-bit GPRs and then is transfered to the vector registers. */ Index: gcc/config/rs6000/rs6000.md =================================================================== --- gcc/config/rs6000/rs6000.md (revision 262647) +++ gcc/config/rs6000/rs6000.md (working copy) @@ -137,7 +137,6 @@ (define_c_enum "unspec" UNSPEC_FUSION_GPR UNSPEC_STACK_CHECK UNSPEC_FUSION_P9 - UNSPEC_FUSION_ADDIS UNSPEC_ADD_ROUND_TO_ODD UNSPEC_SUB_ROUND_TO_ODD UNSPEC_MUL_ROUND_TO_ODD @@ -13593,66 +13592,11 @@ (define_insn "rs6000_mtfsf" ;; a GPR. The addis instruction must be adjacent to the load, and use the same ;; register that is being loaded. The fused ops must be physically adjacent. -;; There are two parts to addis fusion. The support for fused TOCs occur -;; before register allocation, and is meant to reduce the lifetime for the -;; tempoary register that holds the ADDIS result. On Power8 GPR loads, we try -;; to use the register that is being load. The peephole2 then gathers any -;; other fused possibilities that it can find after register allocation. If -;; power9 fusion is selected, we also fuse floating point loads/stores. - -;; Fused TOC support: Replace simple GPR loads with a fused form. This is done -;; before register allocation, so that we can avoid allocating a temporary base -;; register that won't be used, and that we try to load into base registers, -;; and not register 0. If we can't get a fused GPR load, generate a P9 fusion -;; (addis followed by load) even on power8. +;; On Power8 GPR loads, we try to use the register that is being load. The +;; peephole2 then gathers any other fused possibilities that it can find after +;; register allocation. 
If power9 fusion is selected, we also fuse floating +;; point loads/stores. -(define_split - [(set (match_operand:INT1 0 "toc_fusion_or_p9_reg_operand") - (match_operand:INT1 1 "toc_fusion_mem_raw"))] - "TARGET_TOC_FUSION_INT && can_create_pseudo_p ()" - [(parallel [(set (match_dup 0) (match_dup 2)) - (unspec [(const_int 0)] UNSPEC_FUSION_ADDIS) - (use (match_dup 3)) - (clobber (scratch:DI))])] -{ - operands[2] = fusion_wrap_memory_address (operands[1]); - operands[3] = gen_rtx_REG (Pmode, TOC_REGISTER); -}) - -(define_insn "*toc_fusionload_" - [(set (match_operand:QHSI 0 "int_reg_operand" "=&b,??r") - (match_operand:QHSI 1 "toc_fusion_mem_wrapped" "wG,wG")) - (unspec [(const_int 0)] UNSPEC_FUSION_ADDIS) - (use (match_operand:DI 2 "base_reg_operand" "r,r")) - (clobber (match_scratch:DI 3 "=X,&b"))] - "TARGET_TOC_FUSION_INT" -{ - if (base_reg_operand (operands[0], mode)) - return emit_fusion_gpr_load (operands[0], operands[1]); - - return emit_fusion_p9_load (operands[0], operands[1], operands[3]); -} - [(set_attr "type" "load") - (set_attr "length" "8")]) - -(define_insn "*toc_fusionload_di" - [(set (match_operand:DI 0 "int_reg_operand" "=&b,??r,?d") - (match_operand:DI 1 "toc_fusion_mem_wrapped" "wG,wG,wG")) - (unspec [(const_int 0)] UNSPEC_FUSION_ADDIS) - (use (match_operand:DI 2 "base_reg_operand" "r,r,r")) - (clobber (match_scratch:DI 3 "=X,&b,&b"))] - "TARGET_TOC_FUSION_INT && TARGET_POWERPC64 - && (MEM_P (operands[1]) || int_reg_operand (operands[0], DImode))" -{ - if (base_reg_operand (operands[0], DImode)) - return emit_fusion_gpr_load (operands[0], operands[1]); - - return emit_fusion_p9_load (operands[0], operands[1], operands[3]); -} - [(set_attr "type" "load") - (set_attr "length" "8")]) - - ;; Find cases where the addis that feeds into a load instruction is either used ;; once or is the same as the target register, and replace it with the fusion ;; insn @@ -13674,7 +13618,7 @@ (define_peephole2 ;; Fusion insn, created by the define_peephole2 above (and 
eventually by ;; reload) -(define_insn "fusion_gpr_load_" +(define_insn "*fusion_gpr_load_" [(set (match_operand:INT1 0 "base_reg_operand" "=b") (unspec:INT1 [(match_operand:INT1 1 "fusion_addis_mem_combo_load" "wF")] UNSPEC_FUSION_GPR))] @@ -13691,7 +13635,7 @@ (define_insn "fusion_gpr_load_" (define_peephole2 [(set (match_operand:P 0 "base_reg_operand") (match_operand:P 1 "fusion_gpr_addis")) - (set (match_operand:SFDF 2 "toc_fusion_or_p9_reg_operand") + (set (match_operand:SFDF 2 "p9_fusion_reg_operand") (match_operand:SFDF 3 "fusion_offsettable_mem_operand"))] "TARGET_P9_FUSION && peep2_reg_dead_p (2, operands[0]) && fusion_p9_p (operands[0], operands[1], operands[2], operands[3])" @@ -13705,7 +13649,7 @@ (define_peephole2 [(set (match_operand:P 0 "base_reg_operand") (match_operand:P 1 "fusion_gpr_addis")) (set (match_operand:SFDF 2 "offsettable_mem_operand") - (match_operand:SFDF 3 "toc_fusion_or_p9_reg_operand"))] + (match_operand:SFDF 3 "p9_fusion_reg_operand"))] "TARGET_P9_FUSION && peep2_reg_dead_p (2, operands[0]) && fusion_p9_p (operands[0], operands[1], operands[2], operands[3]) && !rtx_equal_p (operands[0], operands[3])" @@ -13743,7 +13687,7 @@ (define_peephole2 ;; reload). Because we want to eventually have secondary_reload generate ;; these, they have to have a single alternative that gives the register ;; classes. This means we need to have separate gpr/fpr/altivec versions. 
-(define_insn "fusion_gpr___load" +(define_insn "*fusion_gpr___load" [(set (match_operand:GPR_FUSION 0 "int_reg_operand" "=r") (unspec:GPR_FUSION [(match_operand:GPR_FUSION 1 "fusion_addis_mem_combo_load" "wF")] @@ -13761,7 +13705,7 @@ (define_insn "fusion_gpr____store" +(define_insn "*fusion_gpr___store" [(set (match_operand:GPR_FUSION 0 "fusion_addis_mem_combo_store" "=wF") (unspec:GPR_FUSION [(match_operand:GPR_FUSION 1 "int_reg_operand" "r")] @@ -13774,7 +13718,7 @@ (define_insn "fusion_gpr____load" +(define_insn "*fusion_vsx___load" [(set (match_operand:FPR_FUSION 0 "vsx_register_operand" "=dwb") (unspec:FPR_FUSION [(match_operand:FPR_FUSION 1 "fusion_addis_mem_combo_load" "wF")] @@ -13787,7 +13731,7 @@ (define_insn "fusion_vsx____store" +(define_insn "*fusion_vsx___store" [(set (match_operand:FPR_FUSION 0 "fusion_addis_mem_combo_store" "=wF") (unspec:FPR_FUSION [(match_operand:FPR_FUSION 1 "vsx_register_operand" "dwb")]