From patchwork Mon Aug 26 20:43:37 2019
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Michael Meissner
X-Patchwork-Id: 1153441
Date: Mon, 26 Aug 2019 16:43:37 -0400
From: Michael Meissner
To: Michael Meissner, gcc-patches@gcc.gnu.org, segher@kernel.crashing.org,
 dje.gcc@gmail.com
Subject: [PATCH, V3, #4 of 10], Add general prefixed/pcrel support
Message-ID: <20190826204337.GD11790@ibm-toto.the-meissners.org>
References: <20190826173320.GA7958@ibm-toto.the-meissners.org>
In-Reply-To: <20190826173320.GA7958@ibm-toto.the-meissners.org>

This patch (V3 patch #4) is a rework of the V1 patches #3 and #4.  It adds
support for generating prefixed (and local pc-relative) instructions for all
modes except SDmode.  SDmode can't be used with a prefixed offset instruction,
because the default method to load an SDmode value is to use the LFIWZX
instruction, which only has an indexed format.

For the stack_protect_setdi and stack_protect_testdi insns, I reworked them so
that the expander will copy the prefixed memory address to a register and use
the indexed instruction format.  I added new predicates to make sure nothing
recombines the insn into a prefixed insn.
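For readers following along, the offset rules this patch builds on can be
sketched in standalone C.  This is only an illustration, under assumptions
taken from the comments in the patch: a traditional D-form offset is a signed
16-bit value, DS-form and DQ-form also need the bottom two or four bits to be
zero, and a prefixed instruction accepts a signed 34-bit offset with no such
alignment rule.  The names offset_form, fits_signed and pick_encoding are
hypothetical and do not exist in GCC; the real checks are
prefixed_local_addr_p and the SIGNED_*_OFFSET_P macros.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the patch's trad_insn_type.  */
enum offset_form { FORM_D, FORM_DS, FORM_DQ };

/* Hypothetical result: how an offset address could be encoded.  */
enum encoding { ENC_TRADITIONAL, ENC_PREFIXED, ENC_NEEDS_REGISTER };

/* Return true if VALUE fits in a signed integer of BITS bits.  */
static bool
fits_signed (int64_t value, int bits)
{
  assert (bits > 0 && bits < 64);
  int64_t limit = (int64_t) 1 << (bits - 1);
  return value >= -limit && value < limit;
}

/* Pick an encoding for OFFSET.  HAVE_PREFIXED models whether prefixed
   addressing is available (an assumption of this sketch).  */
static enum encoding
pick_encoding (int64_t offset, enum offset_form form, bool have_prefixed)
{
  /* DS-form needs the bottom 2 bits clear, DQ-form the bottom 4 bits.  */
  int64_t align_mask = (form == FORM_DS) ? 3 : (form == FORM_DQ) ? 15 : 0;

  if (fits_signed (offset, 16) && (offset & align_mask) == 0)
    return ENC_TRADITIONAL;

  /* Prefixed instructions have no DS/DQ alignment restriction.  */
  if (have_prefixed && fits_signed (offset, 34))
    return ENC_PREFIXED;

  /* Otherwise the offset has to be forced into a register.  */
  return ENC_NEEDS_REGISTER;
}
```

For example, pick_encoding (6, FORM_DS, true) selects the prefixed form,
because 6 fits in 16 bits but its bottom two bits are not zero; with
have_prefixed false, the same offset has to be forced into a register.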
I changed the logic previously using insn_form to now use trad_insn.

I think I misspoke in the previous patch: the logic for pc-relative vector
extract is here, and not in the previous patch.

I have built a bootstrap compiler on a little endian power8 system, and there
were no regressions when I ran make check.  Once the previous patches are
checked in, can I check in this patch?

2019-08-26  Michael Meissner

	* config/rs6000/predicates.md (add_operand): Add support for the
	PADDI instruction.
	(non_add_cint_operand): Add support for the PADDI instruction.
	(lwa_operand): Add support for the PLWA instruction.
	(non_prefixed_mem_operand): New predicate.
	* config/rs6000/rs6000-protos.h (make_memory_non_prefixed): New
	declaration.
	* config/rs6000/rs6000.c (num_insns_constant_gpr): Add support for
	the PADDI instruction.
	(rs6000_adjust_vec_address): Add support for optimizing prefixed
	and pc-relative extracts with constant extraction elements.  Add a
	failure when we use pc-relative addressing and non-constant
	extraction elements.  Use SIGNED_16BIT_OFFSET_P.
	(quad_address_p): Add support for prefixed memory instructions.
	(mem_operand_gpr): Add support for prefixed memory instructions.
	Use SIGNED_16BIT_OFFSET_EXTRA_P.
	(mem_operand_ds_form): Add support for prefixed memory
	instructions.  Use SIGNED_16BIT_OFFSET_EXTRA_P.
	(rs6000_legitimate_offset_address_p): Add support for prefixed
	memory instructions.
	(rs6000_legitimate_address_p): Add support for prefixed memory
	instructions.
	(rs6000_mode_dependent_address): Add support for prefixed memory
	instructions.
	(make_memory_non_prefixed): New function.
	(prefixed_paddi_p): Fix thinkos in last patch.
	(rs6000_rtx_costs): Add support for the PADDI instruction.
	(rs6000_num_insns): Don't treat prefixed instructions as being
	slower because they have a larger length.
	(rs6000_insn_cost): Call rs6000_num_insns.
	* config/rs6000/rs6000.md (add3): Add support for the PADDI
	instruction.
	(movsi_low): Add support for the PADDI instruction.
	(movsi const int splitter): Add support for the PADDI instruction.
	(mov_64bit_dm): Add support for prefixed memory instructions.
	Split alternatives that had merged loading a constant with
	register moves.
	(movtd_64bit_nodm): Add support for prefixed memory instructions.
	(movdi_internal64): Add support for prefixed memory instructions.
	(movdi const int splitter): Add comment.
	(mov_ppc64): Add support for prefixed memory instructions.
	(stack_protect_setdi): Do not allow prefixed instructions.
	(stack_protect_testdi): Do not allow prefixed instructions.
	* config/rs6000/vsx.md (vsx_mov_64bit): Add support for prefixed
	memory instructions.

Index: gcc/config/rs6000/predicates.md
===================================================================
--- gcc/config/rs6000/predicates.md	(revision 274870)
+++ gcc/config/rs6000/predicates.md	(working copy)
@@ -839,7 +839,8 @@ (define_special_predicate "indexed_addre
 (define_predicate "add_operand"
   (if_then_else (match_code "const_int")
     (match_test "satisfies_constraint_I (op)
-                 || satisfies_constraint_L (op)")
+                 || satisfies_constraint_L (op)
+                 || satisfies_constraint_eI (op)")
     (match_operand 0 "gpc_reg_operand")))
 
 ;; Return 1 if the operand is either a non-special register, or 0, or -1.
@@ -852,7 +853,8 @@ (define_predicate "adde_operand"
 (define_predicate "non_add_cint_operand"
   (and (match_code "const_int")
        (match_test "!satisfies_constraint_I (op)
-                    && !satisfies_constraint_L (op)")))
+                    && !satisfies_constraint_L (op)
+                    && !satisfies_constraint_eI (op)")))
 
 ;; Return 1 if the operand is a constant that can be used as the operand
 ;; of an AND, OR or XOR.
@@ -933,6 +935,13 @@ (define_predicate "lwa_operand"
     return false;
 
   addr = XEXP (inner, 0);
+
+  /* The LWA instruction uses the DS-form format where the bottom two bits of
+     the offset must be 0.  The prefixed PLWA does not have this
+     restriction.  */
+  if (prefixed_local_addr_p (addr, mode, TRAD_INSN_DS))
+    return true;
+
   if (GET_CODE (addr) == PRE_INC
       || GET_CODE (addr) == PRE_DEC
       || (GET_CODE (addr) == PRE_MODIFY
@@ -1686,6 +1695,17 @@ (define_predicate "pcrel_ext_address"
   return (SYMBOL_REF_P (op) && !SYMBOL_REF_LOCAL_P (op));
 })
 
+;; Return 1 if op is a memory operand that is not prefixed.
+(define_predicate "non_prefixed_mem_operand"
+  (match_code "mem")
+{
+  if (!memory_operand (op, mode))
+    return false;
+
+  return !prefixed_local_addr_p (XEXP (op, 0), GET_MODE (op),
+                                 TRAD_INSN_DEFAULT);
+})
+
 ;; Match the first insn (addis) in fusing the combination of addis and loads to
 ;; GPR registers on power8.
 (define_predicate "fusion_gpr_addis"
Index: gcc/config/rs6000/rs6000-protos.h
===================================================================
--- gcc/config/rs6000/rs6000-protos.h	(revision 274872)
+++ gcc/config/rs6000/rs6000-protos.h	(working copy)
@@ -170,6 +170,7 @@ typedef enum {
 } trad_insn_type;
 
 extern bool prefixed_local_addr_p (rtx, machine_mode, trad_insn_type);
+extern rtx make_memory_non_prefixed (rtx);
 extern bool prefixed_load_p (rtx_insn *);
 extern bool prefixed_store_p (rtx_insn *);
 extern bool prefixed_paddi_p (rtx_insn *);
Index: gcc/config/rs6000/rs6000.c
===================================================================
--- gcc/config/rs6000/rs6000.c	(revision 274872)
+++ gcc/config/rs6000/rs6000.c	(working copy)
@@ -5727,7 +5727,7 @@ static int
 num_insns_constant_gpr (HOST_WIDE_INT value)
 {
   /* signed constant loadable with addi */
-  if (((unsigned HOST_WIDE_INT) value + 0x8000) < 0x10000)
+  if (SIGNED_16BIT_OFFSET_P (value))
     return 1;
 
   /* constant loadable with addis */
@@ -5735,6 +5735,10 @@ num_insns_constant_gpr (HOST_WIDE_INT va
            && (value >> 31 == -1 || value >> 31 == 0))
     return 1;
 
+  /* PADDI can support up to 34 bit signed integers.  */
+  else if (TARGET_PREFIXED_ADDR && SIGNED_34BIT_OFFSET_P (value))
+    return 1;
+
   else if (TARGET_POWERPC64)
     {
       HOST_WIDE_INT low = ((value & 0xffffffff) ^ 0x80000000) - 0x80000000;
@@ -6905,6 +6909,7 @@ rs6000_adjust_vec_address (rtx scalar_re
   rtx element_offset;
   rtx new_addr;
   bool valid_addr_p;
+  bool pcrel_p = TARGET_PCREL && pcrel_local_address (addr, Pmode);
 
   /* Vector addresses should not have PRE_INC, PRE_DEC, or PRE_MODIFY.  */
   gcc_assert (GET_RTX_CLASS (GET_CODE (addr)) != RTX_AUTOINC);
@@ -6942,6 +6947,41 @@ rs6000_adjust_vec_address (rtx scalar_re
   else if (REG_P (addr) || SUBREG_P (addr))
     new_addr = gen_rtx_PLUS (Pmode, addr, element_offset);
 
+
+  /* Optimize pc-relative addresses.  */
+  else if (pcrel_p)
+    {
+      if (CONST_INT_P (element_offset))
+        {
+          rtx addr2 = addr;
+          HOST_WIDE_INT offset = INTVAL (element_offset);
+
+          if (GET_CODE (addr2) == CONST)
+            addr2 = XEXP (addr2, 0);
+
+          if (GET_CODE (addr2) == PLUS)
+            {
+              offset += INTVAL (XEXP (addr2, 1));
+              addr2 = XEXP (addr2, 0);
+            }
+
+          gcc_assert (SIGNED_34BIT_OFFSET_P (offset));
+          if (offset)
+            {
+              addr2 = gen_rtx_PLUS (Pmode, addr2, GEN_INT (offset));
+              new_addr = gen_rtx_CONST (Pmode, addr2);
+            }
+          else
+            new_addr = addr2;
+        }
+
+      /* Right now, the pc-relative support needs to be re-thought if you have
+         a pc-relative address and a variable extract, due to having only
+         one base register tmp to use.  Fail until this is rewritten.  */
+      else
+        gcc_unreachable ();
+    }
+
   /* Optimize D-FORM addresses with constant offset with a constant element, to
      include the element offset in the address directly.  */
   else if (GET_CODE (addr) == PLUS)
@@ -6956,8 +6996,11 @@ rs6000_adjust_vec_address (rtx scalar_re
           HOST_WIDE_INT offset = INTVAL (op1) + INTVAL (element_offset);
           rtx offset_rtx = GEN_INT (offset);
 
-          if (IN_RANGE (offset, -32768, 32767)
-              && (scalar_size < 8 || (offset & 0x3) == 0))
+          if (TARGET_PREFIXED_ADDR && SIGNED_34BIT_OFFSET_P (offset))
+            new_addr = gen_rtx_PLUS (Pmode, op0, offset_rtx);
+
+          else if (SIGNED_16BIT_OFFSET_P (offset)
+                   && (scalar_size < 8 || (offset & 0x3) == 0))
             new_addr = gen_rtx_PLUS (Pmode, op0, offset_rtx);
           else
             {
@@ -7007,9 +7050,8 @@ rs6000_adjust_vec_address (rtx scalar_re
 
   /* If we have a PLUS, we need to see whether the particular register class
      allows for D-FORM or X-FORM addressing.  */
-  if (GET_CODE (new_addr) == PLUS)
+  if (GET_CODE (new_addr) == PLUS || pcrel_p)
     {
-      rtx op1 = XEXP (new_addr, 1);
       addr_mask_type addr_mask;
       unsigned int scalar_regno = reg_or_subregno (scalar_reg);
 
@@ -7026,7 +7068,10 @@ rs6000_adjust_vec_address (rtx scalar_re
       else
         gcc_unreachable ();
 
-      if (REG_P (op1) || SUBREG_P (op1))
+      if (pcrel_p)
+        valid_addr_p = (addr_mask & RELOAD_REG_OFFSET) != 0;
+      else if (REG_P (XEXP (new_addr, 1))
+               || SUBREG_P (XEXP (new_addr, 1)))
         valid_addr_p = (addr_mask & RELOAD_REG_INDEXED) != 0;
       else
         valid_addr_p = (addr_mask & RELOAD_REG_OFFSET) != 0;
@@ -7454,6 +7499,13 @@ quad_address_p (rtx addr, machine_mode m
   if (VECTOR_MODE_P (mode) && !mode_supports_dq_form (mode))
     return false;
 
+  /* Is this a valid prefixed address?  If the bottom four bits of the offset
+     are non-zero, we could use a prefixed instruction (which does not have the
+     DQ-form constraint that the traditional instruction had) instead of
+     forcing the unaligned offset to a GPR.  */
+  if (prefixed_local_addr_p (addr, mode, TRAD_INSN_DQ))
+    return true;
+
   if (GET_CODE (addr) != PLUS)
     return false;
 
@@ -7555,6 +7607,13 @@ mem_operand_gpr (rtx op, machine_mode mo
       && legitimate_indirect_address_p (XEXP (addr, 0), false))
     return true;
 
+  /* Allow prefixed instructions if supported.  If the bottom two bits of the
+     offset are non-zero, we could use a prefixed instruction (which does not
+     have the DS-form constraint that the traditional instruction had) instead
+     of forcing the unaligned offset to a GPR.  */
+  if (prefixed_local_addr_p (addr, mode, TRAD_INSN_DS))
+    return true;
+
   /* Don't allow non-offsettable addresses.  See PRs 83969 and 84279.  */
   if (!rs6000_offsettable_memref_p (op, mode, false))
     return false;
@@ -7576,7 +7635,7 @@ mem_operand_gpr (rtx op, machine_mode mo
      causes a wrap, so test only the low 16 bits.  */
   offset = ((offset & 0xffff) ^ 0x8000) - 0x8000;
 
-  return offset + 0x8000 < 0x10000u - extra;
+  return SIGNED_16BIT_OFFSET_EXTRA_P (offset, extra);
 }
 
 /* As above, but for DS-FORM VSX insns.  Unlike mem_operand_gpr,
@@ -7589,6 +7648,13 @@ mem_operand_ds_form (rtx op, machine_mod
   int extra;
   rtx addr = XEXP (op, 0);
 
+  /* Allow prefixed instructions if supported.  If the bottom two bits of the
+     offset are non-zero, we could use a prefixed instruction (which does not
+     have the DS-form constraint that the traditional instruction had) instead
+     of forcing the unaligned offset to a GPR.  */
+  if (prefixed_local_addr_p (addr, mode, TRAD_INSN_DS))
+    return true;
+
   if (!offsettable_address_p (false, mode, addr))
     return false;
 
@@ -7609,7 +7675,7 @@ mem_operand_ds_form (rtx op, machine_mod
      causes a wrap, so test only the low 16 bits.  */
   offset = ((offset & 0xffff) ^ 0x8000) - 0x8000;
 
-  return offset + 0x8000 < 0x10000u - extra;
+  return SIGNED_16BIT_OFFSET_EXTRA_P (offset, extra);
 }
 
 /* Subroutines of rs6000_legitimize_address and rs6000_legitimate_address_p.  */
@@ -7958,8 +8024,10 @@ rs6000_legitimate_offset_address_p (mach
       break;
     }
 
-  offset += 0x8000;
-  return offset < 0x10000 - extra;
+  if (TARGET_PREFIXED_ADDR)
+    return SIGNED_34BIT_OFFSET_EXTRA_P (offset, extra);
+  else
+    return SIGNED_16BIT_OFFSET_EXTRA_P (offset, extra);
 }
 
 bool
@@ -8856,6 +8924,11 @@ rs6000_legitimate_address_p (machine_mod
       && mode_supports_pre_incdec_p (mode)
       && legitimate_indirect_address_p (XEXP (x, 0), reg_ok_strict))
     return 1;
+
+  /* Handle prefixed addresses (pc-relative or 34-bit offset).  */
+  if (prefixed_local_addr_p (x, mode, TRAD_INSN_DEFAULT))
+    return 1;
+
   /* Handle restricted vector d-form offsets in ISA 3.0.  */
   if (quad_offset_p)
     {
@@ -8914,7 +8987,10 @@ rs6000_legitimate_address_p (machine_mod
           || (!avoiding_indexed_address_p (mode)
               && legitimate_indexed_address_p (XEXP (x, 1), reg_ok_strict)))
       && rtx_equal_p (XEXP (XEXP (x, 1), 0), XEXP (x, 0)))
-    return 1;
+    {
+      /* There is no prefixed version of the load/store with update.  */
+      return !prefixed_local_addr_p (XEXP (x, 1), mode, TRAD_INSN_DEFAULT);
+    }
   if (reg_offset_p && !quad_offset_p
       && legitimate_lo_sum_address_p (mode, x, reg_ok_strict))
     return 1;
@@ -8976,8 +9052,12 @@ rs6000_mode_dependent_address (const_rtx
           && XEXP (addr, 0) != arg_pointer_rtx
           && CONST_INT_P (XEXP (addr, 1)))
         {
-          unsigned HOST_WIDE_INT val = INTVAL (XEXP (addr, 1));
-          return val + 0x8000 >= 0x10000 - (TARGET_POWERPC64 ? 8 : 12);
+          HOST_WIDE_INT val = INTVAL (XEXP (addr, 1));
+          HOST_WIDE_INT extra = TARGET_POWERPC64 ? 8 : 12;
+          if (TARGET_PREFIXED_ADDR)
+            return !SIGNED_34BIT_OFFSET_EXTRA_P (val, extra);
+          else
+            return !SIGNED_16BIT_OFFSET_EXTRA_P (val, extra);
         }
       break;
 
@@ -13950,6 +14030,34 @@ prefixed_local_addr_p (rtx addr,
   return false;
 }
 
+
+/* Make a memory address non-prefixed if it is prefixed.  */
+
+rtx
+make_memory_non_prefixed (rtx mem)
+{
+  gcc_assert (MEM_P (mem));
+  if (prefixed_local_addr_p (XEXP (mem, 0), GET_MODE (mem), TRAD_INSN_DEFAULT))
+    {
+      rtx old_addr = XEXP (mem, 0);
+      rtx new_addr;
+
+      if (GET_CODE (old_addr) == PLUS
+          && (REG_P (XEXP (old_addr, 0)) || SUBREG_P (XEXP (old_addr, 0)))
+          && CONST_INT_P (XEXP (old_addr, 1)))
+        {
+          rtx tmp_reg = force_reg (Pmode, XEXP (old_addr, 1));
+          new_addr = gen_rtx_PLUS (Pmode, XEXP (old_addr, 0), tmp_reg);
+        }
+      else
+        new_addr = force_reg (Pmode, old_addr);
+
+      mem = change_address (mem, VOIDmode, new_addr);
+    }
+
+  return mem;
+}
+
 /* Whether a load instruction is a prefixed instruction.  This is called from
    the prefixed attribute processing.  */
 
@@ -21060,7 +21168,8 @@ rs6000_rtx_costs (rtx x, machine_mode mo
             || outer_code == PLUS
             || outer_code == MINUS)
            && (satisfies_constraint_I (x)
-               || satisfies_constraint_L (x)))
+               || satisfies_constraint_L (x)
+               || satisfies_constraint_eI (x)))
           || (outer_code == AND
               && (satisfies_constraint_K (x)
                   || (mode == SImode
@@ -21440,6 +21549,42 @@ rs6000_debug_rtx_costs (rtx x, machine_m
   return ret;
 }
 
+/* How many real instructions are generated for this insn?  This is slightly
+   different from the length attribute, in that the length attribute counts the
+   number of bytes.  With prefixed instructions, we don't want to count a
+   prefixed instruction (length 12 bytes including possible NOP) as taking 3
+   instructions, but just one.  */
+
+static int
+rs6000_num_insns (rtx_insn *insn)
+{
+  /* Try to figure it out based on the length and whether there are prefixed
+     instructions.  While prefixed instructions are only 8 bytes, we have to
+     use 12 as the size of the first prefixed instruction in case the
+     instruction needs to be aligned.  Back to back prefixed instructions would
+     only take 20 bytes, since it is guaranteed that one of the prefixed
+     instructions does not need the alignment.  */
+  int length = get_attr_length (insn);
+
+  if (length >= 12 && TARGET_PREFIXED_ADDR
+      && get_attr_prefixed (insn) == PREFIXED_YES)
+    {
+      /* Single prefixed instruction.  */
+      if (length == 12)
+        return 1;
+
+      /* A normal instruction and a prefixed instruction (16) or two back
+         to back prefixed instructions (20).  */
+      if (length == 16 || length == 20)
+        return 2;
+
+      /* Guess for larger instruction sizes.  */
+      return 2 + (length - 20) / 4;
+    }
+
+  return length / 4;
+}
+
 static int
 rs6000_insn_cost (rtx_insn *insn, bool speed)
 {
@@ -21453,7 +21598,7 @@ rs6000_insn_cost (rtx_insn *insn, bool s
   if (cost > 0)
     return cost;
 
-  int n = get_attr_length (insn) / 4;
+  int n = rs6000_num_insns (insn);
   enum attr_type type = get_attr_type (insn);
 
   switch (type)
Index: gcc/config/rs6000/rs6000.md
===================================================================
--- gcc/config/rs6000/rs6000.md	(revision 274872)
+++ gcc/config/rs6000/rs6000.md	(working copy)
@@ -1761,15 +1761,17 @@ (define_expand "add3"
 })
 
 (define_insn "*add3"
-  [(set (match_operand:GPR 0 "gpc_reg_operand" "=r,r,r")
-        (plus:GPR (match_operand:GPR 1 "gpc_reg_operand" "%r,b,b")
-                  (match_operand:GPR 2 "add_operand" "r,I,L")))]
+  [(set (match_operand:GPR 0 "gpc_reg_operand" "=r,r,r,r")
+        (plus:GPR (match_operand:GPR 1 "gpc_reg_operand" "%r,b,b,b")
+                  (match_operand:GPR 2 "add_operand" "r,I,L,eI")))]
   ""
   "@
    add %0,%1,%2
    addi %0,%1,%2
-   addis %0,%1,%v2"
-  [(set_attr "type" "add")])
+   addis %0,%1,%v2
+   addi %0,%1,%2"
+  [(set_attr "type" "add")
+   (set_attr "isa" "*,*,*,fut")])
 
 (define_insn "*addsi3_high"
   [(set (match_operand:SI 0 "gpc_reg_operand" "=b")
@@ -6909,22 +6911,22 @@ (define_insn "movsi_low"
 
 ;;              MR          LA          LWZ         LFIWZX      LXSIWZX
 ;;              STW         STFIWX      STXSIWX     LI          LIS
-;;              #           XXLOR       XXSPLTIB 0  XXSPLTIB -1 VSPLTISW
-;;              XXLXOR 0    XXLORC -1   P9 const    MTVSRWZ     MFVSRWZ
-;;              MF%1        MT%0        NOP
+;;              PLI         #           XXLOR       XXSPLTIB 0  XXSPLTIB -1
+;;              VSPLTISW    XXLXOR 0    XXLORC -1   P9 const    MTVSRWZ
+;;              MFVSRWZ     MF%1        MT%0        NOP
 (define_insn "*movsi_internal1"
   [(set (match_operand:SI 0 "nonimmediate_operand"
                 "=r,         r,          r,          d,          v,
                  m,          Z,          Z,          r,          r,
-                 r,          wa,         wa,         wa,         v,
-                 wa,         v,          v,          wa,         r,
-                 r,          *h,         *h")
+                 r,          r,          wa,         wa,         wa,
+                 v,          wa,         v,          v,          wa,
+                 r,          r,          *h,         *h")
         (match_operand:SI 1 "input_operand"
                 "r,          U,          m,          Z,          Z,
                  r,          d,          v,          I,          L,
-                 n,          wa,         O,          wM,         wB,
-                 O,          wM,         wS,         r,          wa,
-                 *h,         r,          0"))]
+                 eI,         n,          wa,         O,          wM,
+                 wB,         O,          wM,         wS,         r,
+                 wa,         *h,         r,          0"))]
   "gpc_reg_operand (operands[0], SImode)
    || gpc_reg_operand (operands[1], SImode)"
   "@
@@ -6938,6 +6940,7 @@ (define_insn "*movsi_internal1"
    stxsiwx %x1,%y0
    li %0,%1
    lis %0,%v1
+   li %0,%1
    #
    xxlor %x0,%x1,%x1
    xxspltib %x0,0
@@ -6954,21 +6957,21 @@ (define_insn "*movsi_internal1"
   [(set_attr "type"
                 "*,          *,          load,       fpload,     fpload,
                  store,      fpstore,    fpstore,    *,          *,
-                 *,          veclogical, vecsimple,  vecsimple,  vecsimple,
-                 veclogical, veclogical, vecsimple,  mffgpr,     mftgpr,
-                 *,          *,          *")
+                 *,          *,          veclogical, vecsimple,  vecsimple,
+                 vecsimple,  veclogical, veclogical, vecsimple,  mffgpr,
+                 mftgpr,     *,          *,          *")
    (set_attr "length"
                 "*,          *,          *,          *,          *,
                  *,          *,          *,          *,          *,
-                 8,          *,          *,          *,          *,
-                 *,          *,          8,          *,          *,
-                 *,          *,          *")
+                 *,          8,          *,          *,          *,
+                 *,          *,          *,          8,          *,
+                 *,          *,          *,          *")
    (set_attr "isa"
                 "*,          *,          *,          p8v,        p8v,
                  *,          p8v,        p8v,        *,          *,
-                 *,          p8v,        p9v,        p9v,        p8v,
-                 p9v,        p8v,        p9v,        p8v,        p8v,
-                 *,          *,          *")])
+                 fut,        *,          p8v,        p9v,        p9v,
+                 p8v,        p9v,        p8v,        p9v,        p8v,
+                 p8v,        *,          *,          *")])
 
 ;; Like movsi, but adjust a SF value to be used in a SI context, i.e.
 ;; (set (reg:SI ...) (subreg:SI (reg:SF ...) 0))
@@ -7113,14 +7116,15 @@ (define_insn "*movsi_from_df"
   "xscvdpsp %x0,%x1"
   [(set_attr "type" "fp")])
 
-;; Split a load of a large constant into the appropriate two-insn
-;; sequence.
+;; Split a load of a large constant into the appropriate two-insn sequence.  On
+;; systems that support PADDI (PLI), we can use PLI to load any 32-bit constant
+;; in one instruction.
 
 (define_split
   [(set (match_operand:SI 0 "gpc_reg_operand")
         (match_operand:SI 1 "const_int_operand"))]
   "(unsigned HOST_WIDE_INT) (INTVAL (operands[1]) + 0x8000) >= 0x10000
-   && (INTVAL (operands[1]) & 0xffff) != 0"
+   && (INTVAL (operands[1]) & 0xffff) != 0 && !TARGET_PREFIXED_ADDR"
   [(set (match_dup 0)
         (match_dup 2))
    (set (match_dup 0)
@@ -7759,9 +7763,18 @@ (define_expand "mov"
 ;; not swapped like they are for TImode or TFmode.  Subregs therefore are
 ;; problematical.  Don't allow direct move for this case.
 
+;;              FPR load    FPR store   FPR move    FPR zero    GPR load
+;;              GPR store   GPR move    GPR zero    MFVSRD      MTVSRD
+
 (define_insn_and_split "*mov_64bit_dm"
-  [(set (match_operand:FMOVE128_FPR 0 "nonimmediate_operand" "=m,d,d,d,Y,r,r,r,d")
-       (match_operand:FMOVE128_FPR 1 "input_operand" "d,m,d,,r,Y,r,d,r"))]
+  [(set (match_operand:FMOVE128_FPR 0 "nonimmediate_operand"
+                "=m,         d,          d,          d,          Y,
+                 r,          r,          r,          r,          d")
+
+        (match_operand:FMOVE128_FPR 1 "input_operand"
+                "d,          m,          d,          ,           r,
+                 ,           Y,          r,          d,          r"))]
+
   "TARGET_HARD_FLOAT && TARGET_POWERPC64 && FLOAT128_2REG_P (mode)
    && (mode != TDmode || WORDS_BIG_ENDIAN)
    && (gpc_reg_operand (operands[0], mode)
@@ -7769,9 +7782,13 @@ (define_insn_and_split "*mov_64bit
   "#"
   "&& reload_completed"
   [(pc)]
-{ rs6000_split_multireg_move (operands[0], operands[1]); DONE; }
-  [(set_attr "length" "8,8,8,8,12,12,8,8,8")
-   (set_attr "isa" "*,*,*,*,*,*,*,p8v,p8v")])
+{
+  rs6000_split_multireg_move (operands[0], operands[1]);
+  DONE;
+}
+  [(set_attr "isa" "*,*,*,*,*,*,*,*,p8v,p8v")
+   (set_attr "non_prefixed_length" "8")
+   (set_attr "prefixed_length" "20")])
 
 (define_insn_and_split "*movtd_64bit_nodm"
   [(set (match_operand:TD 0 "nonimmediate_operand" "=m,d,d,Y,r,r")
@@ -7782,8 +7799,12 @@ (define_insn_and_split "*movtd_64bit_nod
   "#"
   "&& reload_completed"
   [(pc)]
-{ rs6000_split_multireg_move (operands[0], operands[1]); DONE; }
-  [(set_attr "length" "8,8,8,12,12,8")])
+{
+  rs6000_split_multireg_move (operands[0], operands[1]);
+  DONE;
+}
+  [(set_attr "non_prefixed_length" "8")
+   (set_attr "prefixed_length" "20")])
 
 (define_insn_and_split "*mov_32bit"
   [(set (match_operand:FMOVE128_FPR 0 "nonimmediate_operand" "=m,d,d,d,Y,r,r")
@@ -8793,24 +8814,24 @@ (define_split
   [(pc)]
 { rs6000_split_multireg_move (operands[0], operands[1]); DONE; })
 
-;;              GPR store  GPR load   GPR move   GPR li     GPR lis    GPR #
-;;              FPR store  FPR load   FPR move   AVX store  AVX store  AVX load
-;;              AVX load   VSX move   P9 0       P9 -1      AVX 0/-1   VSX 0
-;;              VSX -1     P9 const   AVX const  From SPR   To SPR     SPR<->SPR
-;;              VSX->GPR   GPR->VSX
+;;              GPR store  GPR load   GPR move   GPR li     GPR lis    GPR pli
+;;              GPR #      FPR store  FPR load   FPR move   AVX store  AVX store
+;;              AVX load   AVX load   VSX move   P9 0       P9 -1      AVX 0/-1
+;;              VSX 0      VSX -1     P9 const   AVX const  From SPR   To SPR
+;;              SPR<->SPR  VSX->GPR   GPR->VSX
 (define_insn "*movdi_internal64"
   [(set (match_operand:DI 0 "nonimmediate_operand"
                "=YZ,       r,         r,         r,         r,         r,
-                m,         ^d,        ^d,        wY,        Z,         $v,
-                $v,        ^wa,       wa,        wa,        v,         wa,
-                wa,        v,         v,         r,         *h,        *h,
-                ?r,        ?wa")
+                r,         m,         ^d,        ^d,        wY,        Z,
+                $v,        $v,        ^wa,       wa,        wa,        v,
+                wa,        wa,        v,         v,         r,         *h,
+                *h,        ?r,        ?wa")
         (match_operand:DI 1 "input_operand"
-               "r,         YZ,        r,         I,         L,         nF,
-                ^d,        m,         ^d,        ^v,        $v,        wY,
-                Z,         ^wa,       Oj,        wM,        OjwM,      Oj,
-                wM,        wS,        wB,        *h,        r,         0,
-                wa,        r"))]
+               "r,         YZ,        r,         I,         L,         eI,
+                nF,        ^d,        m,         ^d,        ^v,        $v,
+                wY,        Z,         ^wa,       Oj,        wM,        OjwM,
+                Oj,        wM,        wS,        wB,        *h,        r,
+                0,         wa,        r"))]
   "TARGET_POWERPC64
    && (gpc_reg_operand (operands[0], DImode)
        || gpc_reg_operand (operands[1], DImode))"
@@ -8820,6 +8841,7 @@ (define_insn "*movdi_internal64"
    mr %0,%1
    li %0,%1
    lis %0,%v1
+   li %0,%1
    #
    stfd%U0%X0 %1,%0
    lfd%U1%X1 %0,%1
@@ -8843,26 +8865,28 @@ (define_insn "*movdi_internal64"
    mtvsrd %x0,%1"
   [(set_attr "type"
                "store,      load,      *,         *,         *,         *,
-                fpstore,    fpload,    fpsimple,  fpstore,   fpstore,   fpload,
-                fpload,     veclogical, vecsimple, vecsimple, vecsimple, veclogical,
-                veclogical, vecsimple, vecsimple, mfjmpr,    mtjmpr,    *,
-                mftgpr,     mffgpr")
+                *,          fpstore,   fpload,    fpsimple,  fpstore,   fpstore,
+                fpload,     fpload,    veclogical,vecsimple, vecsimple, vecsimple,
+                veclogical, veclogical, vecsimple, vecsimple, mfjmpr,    mtjmpr,
+                *,          mftgpr,    mffgpr")
    (set_attr "size" "64")
    (set_attr "length"
-               "*,         *,         *,         *,         *,         20,
-                *,         *,         *,         *,         *,         *,
-                *,         *,         *,         *,         *,         *,
-                *,         8,         *,         *,         *,         *,
-                *,         *")
+               "*,         *,         *,         *,         *,         *,
+                20,        *,         *,         *,         *,         *,
+                *,         *,         *,         *,         *,         *,
+                *,         *,         8,         *,         *,         *,
+                *,         *,         *")
    (set_attr "isa"
-               "*,         *,         *,         *,         *,         *,
-                *,         *,         *,         p9v,       p7v,       p9v,
-                p7v,       *,         p9v,       p9v,       p7v,       *,
-                *,         p7v,       p7v,       *,         *,         *,
-                p8v,       p8v")])
+               "*,         *,         *,         *,         *,         fut,
+                *,         *,         *,         *,         p9v,       p7v,
+                p9v,       p7v,       *,         p9v,       p9v,       p7v,
+                *,         *,         p7v,       p7v,       *,         *,
+                *,         p8v,       p8v")])
 
 ; Some DImode loads are best done as a load of -1 followed by a mask
-; instruction.
+; instruction.  On systems that support the PADDI (PLI) instruction,
+; num_insns_constant returns 1, so these splitters would not be used for
+; constants that can be loaded with PLI.
 (define_split
   [(set (match_operand:DI 0 "int_reg_operand_not_pseudo")
         (match_operand:DI 1 "const_int_operand"))]
@@ -8980,7 +9004,8 @@ (define_insn "*mov_ppc64"
   return rs6000_output_move_128bit (operands);
 }
   [(set_attr "type" "store,store,load,load,*,*")
-   (set_attr "length" "8")])
+   (set_attr "non_prefixed_length" "8,8,8,8,8,40")
+   (set_attr "prefixed_length" "20,20,20,20,8,40")])
 
 (define_split
   [(set (match_operand:TI2 0 "int_reg_operand")
@@ -11497,9 +11522,25 @@ (define_insn "stack_protect_setsi"
   [(set_attr "type" "three")
    (set_attr "length" "12")])
 
-(define_insn "stack_protect_setdi"
-  [(set (match_operand:DI 0 "memory_operand" "=Y")
-        (unspec:DI [(match_operand:DI 1 "memory_operand" "Y")] UNSPEC_SP_SET))
+(define_expand "stack_protect_setdi"
+  [(parallel [(set (match_operand:DI 0 "memory_operand")
+                   (unspec:DI [(match_operand:DI 1 "memory_operand")]
+                              UNSPEC_SP_SET))
+              (set (match_scratch:DI 2)
+                   (const_int 0))])]
+  "TARGET_64BIT"
+{
+  if (TARGET_PREFIXED_ADDR)
+    {
+      operands[0] = make_memory_non_prefixed (operands[0]);
+      operands[1] = make_memory_non_prefixed (operands[1]);
+    }
+})
+
+(define_insn "*stack_protect_setdi"
+  [(set (match_operand:DI 0 "non_prefixed_mem_operand" "=YZ")
+        (unspec:DI [(match_operand:DI 1 "non_prefixed_mem_operand" "YZ")]
+                   UNSPEC_SP_SET))
    (set (match_scratch:DI 2 "=&r") (const_int 0))]
   "TARGET_64BIT"
   "ld%U1%X1 %2,%1\;std%U0%X0 %2,%0\;li %2,0"
@@ -11543,10 +11584,27 @@ (define_insn "stack_protect_testsi"
    lwz%U1%X1 %3,%1\;lwz%U2%X2 %4,%2\;cmplw %0,%3,%4\;li %3,0\;li %4,0"
   [(set_attr "length" "16,20")])
 
-(define_insn "stack_protect_testdi"
+(define_expand "stack_protect_testdi"
+  [(parallel [(set (match_operand:CCEQ 0 "cc_reg_operand")
+                   (unspec:CCEQ [(match_operand:DI 1 "memory_operand")
+                                 (match_operand:DI 2 "memory_operand")]
+                                UNSPEC_SP_TEST))
+              (set (match_scratch:DI 4)
+                   (const_int 0))
+              (clobber (match_scratch:DI 3))])]
+  "TARGET_64BIT"
+{
+  if (TARGET_PREFIXED_ADDR)
+    {
+      operands[1] = make_memory_non_prefixed (operands[1]);
+      operands[2] = make_memory_non_prefixed (operands[2]);
+    }
+})
+
+(define_insn "*stack_protect_testdi"
   [(set (match_operand:CCEQ 0 "cc_reg_operand" "=x,?y")
-        (unspec:CCEQ [(match_operand:DI 1 "memory_operand" "Y,Y")
-                      (match_operand:DI 2 "memory_operand" "Y,Y")]
+        (unspec:CCEQ [(match_operand:DI 1 "non_prefixed_mem_operand" "YZ,YZ")
+                      (match_operand:DI 2 "non_prefixed_mem_operand" "YZ,YZ")]
                      UNSPEC_SP_TEST))
    (set (match_scratch:DI 4 "=r,r") (const_int 0))
    (clobber (match_scratch:DI 3 "=&r,&r"))]
Index: gcc/config/rs6000/vsx.md
===================================================================
--- gcc/config/rs6000/vsx.md	(revision 274864)
+++ gcc/config/rs6000/vsx.md	(working copy)
@@ -1149,10 +1149,30 @@ (define_insn "vsx_mov_64bit"
                "vecstore,  vecload,   vecsimple, mffgpr,    mftgpr,    load,
                 store,     load,      store,     *,         vecsimple, vecsimple,
                 vecsimple, *,         *,         vecstore,  vecload")
-   (set_attr "length"
-               "*,         *,         *,         8,         *,         8,
-                8,         8,         8,         8,         *,         *,
-                *,         20,        8,         *,         *")
+   (set (attr "non_prefixed_length")
+        (cond [(and (eq_attr "alternative" "4")         ;; MTVSRDD
+                    (match_test "TARGET_P9_VECTOR"))
+               (const_string "4")
+
+               (eq_attr "alternative" "3,4")            ;; GPR <-> VSX
+               (const_string "8")
+
+               (eq_attr "alternative" "5,6,7,8")        ;; GPR load/store
+               (const_string "8")]
+              (const_string "*")))
+
+   (set (attr "prefixed_length")
+        (cond [(and (eq_attr "alternative" "4")         ;; MTVSRDD
+                    (match_test "TARGET_P9_VECTOR"))
+               (const_string "4")
+
+               (eq_attr "alternative" "3,4")            ;; GPR <-> VSX
+               (const_string "8")
+
+               (eq_attr "alternative" "5,6,7,8")        ;; GPR load/store
+               (const_string "20")]
+              (const_string "*")))
+
    (set_attr "isa"
                ", , , *, *, *,
                 *, *, *, *, p9v, *,