From patchwork Mon May 20 20:49:23 2013
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Michael Meissner
X-Patchwork-Id: 245117
Delivered-To: mailing list gcc-patches@gcc.gnu.org
Date: Mon, 20 May 2013 16:49:23 -0400
From: Michael Meissner
To: Michael Meissner, gcc-patches@gcc.gnu.org, dje.gcc@gmail.com, pthaugen@us.ibm.com, bergner@vnet.ibm.com
Subject: Re: [PATCH, rs6000] power8 patch #1, infrastructure changes
Message-ID: <20130520204923.GA25144@ibm-tiger.the-meissners.org>
References: <20130520204053.GA21090@ibm-tiger.the-meissners.org>
In-Reply-To:
 <20130520204053.GA21090@ibm-tiger.the-meissners.org>
User-Agent: Mutt/1.5.20 (2009-12-10)

These patches are primarily infrastructure patches that add the switches
the following patches will use.  I also added the new constraints and
predicates that future patches will use.

At this point in development, I have multiple switches for different
sub-features.  If desired, I could reduce the number of documented
switches to just one or two, as we did in the power7 time frame.

2013-05-17  Michael Meissner
	    Pat Haugen
	    Peter Bergner

	* doc/invoke.texi (Option Summary): Add power8 options.
	(RS/6000 and PowerPC Options): Likewise.

	* doc/md.texi (PowerPC and IBM RS6000 constraints): Update to use
	constraints.md instead of rs6000.h.  Reorder w* constraints.  Add
	wm, wn, wr documentation.

	* gcc/config/rs6000/constraints.md (wm): New constraint for VSX
	registers if direct move instructions are enabled.
	(wn): New constraint for no registers.
	(wq): New constraint for quad word even GPR registers.
	(wr): New constraint if 64-bit instructions are enabled.
	(wv): New constraint if power8 vector instructions are enabled.
	(wQ): New constraint for quad word memory locations.

	* gcc/config/rs6000/predicates.md (const_0_to_15_operand): New
	predicate for 0..15 for crypto instructions.
	(gpc_reg_operand): If VSX allow registers in VSX registers as well
	as GPR and floating point registers.
	(int_reg_operand): New predicate to match only GPR registers.
	(base_reg_operand): New predicate to match base registers.
	(quad_int_reg_operand): New predicate to match even GPR registers
	for quad memory operations.
	(vsx_reg_or_cint_operand): New predicate to allow vector logical
	operations in both GPR and VSX registers.
	(quad_memory_operand): New predicate for quad memory operations.
	(reg_or_indexed_operand): New predicate for direct move support.

	* gcc/config/rs6000/rs6000-cpus.def (ISA_2_5_MASKS_EMBEDDED):
	Inherit from ISA_2_4_MASKS, not ISA_2_2_MASKS.
	(ISA_2_7_MASKS_SERVER): New mask for ISA 2.07 (i.e. power8).
	(POWERPC_MASKS): Add power8 options.
	(power8 cpu): Use ISA_2_7_MASKS_SERVER instead of specifying the
	various options.

	* gcc/config/rs6000/rs6000-c.c (rs6000_target_modify_macros):
	Define _ARCH_PWR8 and __POWER8_VECTOR__ for power8.

	* gcc/config/rs6000/rs6000.opt (-mvsx-timode): Add documentation.
	(-mpower8-fusion): New power8 options.
	(-mpower8-fusion-sign): Likewise.
	(-mpower8-vector): Likewise.
	(-mcrypto): Likewise.
	(-mdirect-move): Likewise.
	(-mquad-memory): Likewise.

	* gcc/config/rs6000/rs6000.c (power8_cost): Initial definition for
	power8.
	(rs6000_hard_regno_mode_ok): Make PTImode only match even GPR
	registers.
	(rs6000_debug_reg_print): Print the base register class if
	-mdebug=reg.
	(rs6000_debug_vector_unit): Add p8_vector.
	(rs6000_debug_reg_global): If -mdebug=reg, print power8 constraint
	definitions.  Also print fusion state.
	(rs6000_init_hard_regno_mode_ok): Set up power8 constraints.
	(rs6000_builtin_mask_calculate): Add power8 builtin support.
	(rs6000_option_override_internal): Add support for power8.
	(rs6000_common_init_builtins): Add debugging for skipped builtins
	if -mdebug=builtin.
	(rs6000_adjust_cost): Add power8 support.
	(rs6000_issue_rate): Likewise.
	(insn_must_be_first_in_group): Likewise.
	(insn_must_be_last_in_group): Likewise.
	(force_new_group): Likewise.
	(rs6000_register_move_cost): Likewise.
	(rs6000_opt_masks): Likewise.

	* config/rs6000/rs6000.h (ASM_CPU_POWER8_SPEC): If we don't have a
	power8 capable assembler, default to power7 options.
	(TARGET_DIRECT_MOVE): Likewise.
	(TARGET_CRYPTO): Likewise.
	(TARGET_P8_VECTOR): Likewise.
	(VECTOR_UNIT_P8_VECTOR_P): Define power8 vector support.
	(VECTOR_UNIT_VSX_OR_P8_VECTOR_P): Likewise.
	(VECTOR_MEM_P8_VECTOR_P): Likewise.
	(VECTOR_MEM_VSX_OR_P8_VECTOR_P): Likewise.
	(VECTOR_MEM_ALTIVEC_OR_VSX_P): Likewise.
	(TARGET_XSCVDPSPN): Likewise.
	(TARGET_XSCVSPDPN): Likewise.
	(TARGET_SYNC_HI_QI): Likewise.
	(TARGET_SYNC_TI): Likewise.
	(MASK_CRYPTO): Likewise.
	(MASK_DIRECT_MOVE): Likewise.
	(MASK_P8_FUSION): Likewise.
	(MASK_P8_VECTOR): Likewise.
	(REG_ALLOC_ORDER): Move fr13 to be lower in priority so that the
	TFmode temporary used by some of the direct move instructions to
	get two FP temporary registers does not force creation of a stack
	frame.
	(VLOGICAL_REGNO_P): Allow vector logical operations in GPRs.
	(MODES_TIEABLE_P): Move the VSX tests above the Altivec tests so
	that any VSX registers are tieable, even if they are also an
	Altivec vector mode.
	(r6000_reg_class_enum): Add wm, wq, wr, wv constraints.
	(RS6000_BTM_P8_VECTOR): Power8 builtin support.
	(RS6000_BTM_CRYPTO): Likewise.
	(RS6000_BTM_COMMON): Likewise.

	* config/rs6000/rs6000.md (cpu attribute): Add power8.
	* config/rs6000/rs6000-opts.h (PROCESSOR_POWER8): Likewise.
	(enum rs6000_vector): Add power8 vector support.

Index: gcc/doc/invoke.texi
===================================================================
--- gcc/doc/invoke.texi	(revision 199037)
+++ gcc/doc/invoke.texi	(working copy)
@@ -860,7 +860,10 @@ See RS/6000 and PowerPC Options.
 -mno-recip-precision @gol
 -mveclibabi=@var{type} -mfriz -mno-friz @gol
 -mpointers-to-nested-functions -mno-pointers-to-nested-functions @gol
--msave-toc-indirect -mno-save-toc-indirect}
+-msave-toc-indirect -mno-save-toc-indirect @gol
+-mpower8-fusion -mno-power8-fusion -mpower8-vector -mno-power8-vector @gol
+-mcrypto -mno-crypto -mdirect-move -mno-direct-move @gol
+-mquad-memory -mno-quad-memory}
 
 @emph{RX Options}
 @gccoptlist{-m64bit-doubles -m32bit-doubles -fpu -nofpu@gol
@@ -17341,7 +17344,8 @@ following options:
 @gccoptlist{-maltivec -mfprnd -mhard-float -mmfcrf -mmultiple @gol
 -mpopcntb -mpopcntd -mpowerpc64 @gol
 -mpowerpc-gpopt -mpowerpc-gfxopt -msingle-float -mdouble-float @gol
--msimple-fpu -mstring -mmulhw -mdlmzb -mmfpgpr -mvsx}
+-msimple-fpu -mstring -mmulhw -mdlmzb -mmfpgpr -mvsx @gol
+-mcrypto -mdirect-move -mpower8-fusion -mpower8-vector -mquad-memory}
 
 The particular options set for any particular CPU varies between
 compiler versions, depending on what setting seems to produce optimal
@@ -17459,6 +17463,47 @@ Generate code that uses (does not use) v
 instructions, and also enable the use of built-in functions that allow
 more direct access to the VSX instruction set.
 
+@item -mcrypto
+@itemx -mno-crypto
+@opindex mcrypto
+@opindex mno-crypto
+Enable (disable) the use of the built-in functions that allow direct
+access to the cryptographic instructions that were added in version
+2.07 of the PowerPC ISA.
+
+@item -mdirect-move
+@itemx -mno-direct-move
+@opindex mdirect-move
+@opindex mno-direct-move
+Generate code that uses (does not use) the instructions to move data
+between the general purpose registers and the vector/scalar (VSX)
+registers that were added in version 2.07 of the PowerPC ISA.
+
+@item -mpower8-fusion
+@itemx -mno-power8-fusion
+@opindex mpower8-fusion
+@opindex mno-power8-fusion
+Generate code that keeps (does not keep) some integer operations
+adjacent so that the instructions can be fused together on power8 and
+later processors.
+
+@item -mpower8-vector
+@itemx -mno-power8-vector
+@opindex mpower8-vector
+@opindex mno-power8-vector
+Generate code that uses (does not use) the vector and scalar
+instructions that were added in version 2.07 of the PowerPC ISA.  Also
+enable the use of built-in functions that allow more direct access to
+the vector instructions.
+
+@item -mquad-memory
+@itemx -mno-quad-memory
+@opindex mquad-memory
+@opindex mno-quad-memory
+Generate code that uses (does not use) the quad word memory
+instructions.  The @option{-mquad-memory} option requires use of
+64-bit mode.
+
 @item -mfloat-gprs=@var{yes/single/double/no}
 @itemx -mfloat-gprs
 @opindex mfloat-gprs
Index: gcc/doc/md.texi
===================================================================
--- gcc/doc/md.texi	(revision 199037)
+++ gcc/doc/md.texi	(working copy)
@@ -2055,7 +2055,7 @@ Any constant whose absolute value is no
 
 @end table
 
-@item PowerPC and IBM RS6000---@file{config/rs6000/rs6000.h}
+@item PowerPC and IBM RS6000---@file{config/rs6000/constraints.md}
 @table @code
 @item b
 Address base register
@@ -2069,6 +2069,9 @@ Floating point register (containing 32-b
 @item v
 Altivec vector register
 
+@item wa
+Any VSX register
+
 @item wd
 VSX vector register to hold vector double data
 
@@ -2081,6 +2084,18 @@ If @option{-mmfpgpr} was used, a floatin
 @item wl
 If the LFIWAX instruction is enabled, a floating point register
 
+@item wm
+If direct moves are enabled, a VSX register.
+
+@item wn
+No register.
+
+@item wq
+Even general purpose register to use with load/store quad instructions
+
+@item wr
+General purpose register if 64-bit mode is used
+
 @item ws
 VSX vector register to hold scalar float data
 
@@ -2093,8 +2108,9 @@ If the STFIWX instruction is enabled, a
 @item wz
 If the LFIWZX instruction is enabled, a floating point register
 
-@item wa
-Any VSX register
+@item wQ
+A memory address that will work with the @code{lq} and @code{stq}
+instructions.
 
 @item h
 @samp{MQ}, @samp{CTR}, or @samp{LINK} register
Index: gcc/ChangeLog.ibm
===================================================================
--- gcc/ChangeLog.ibm	(revision 199038)
+++ gcc/ChangeLog.ibm	(working copy)
@@ -1,5 +1,108 @@
 2013-05-17  Michael Meissner
 
+	* doc/invoke.texi (Option Summary): Add power8 options.
+	(RS/6000 and PowerPC Options): Likewise.
+
+	* doc/md.texi (PowerPC and IBM RS6000 constraints): Update to use
+	constraints.md instead of rs6000.h.  Reorder w* constraints.  Add
+	wm, wn, wr documentation.
+
+	* gcc/config/rs6000/constraints.md (wm): New constraint for VSX
+	registers if direct move instructions are enabled.
+	(wn): New constraint for no registers.
+	(wq): New constraint for quad word even GPR registers.
+	(wr): New constraint if 64-bit instructions are enabled.
+	(wv): New constraint if power8 vector instructions are enabled.
+	(wQ): New constraint for quad word memory locations.
+
+	* gcc/config/rs6000/predicates.md (const_0_to_15_operand): New
+	predicate for 0..15 for crypto instructions.
+	(gpc_reg_operand): If VSX allow registers in VSX registers as well
+	as GPR and floating point registers.
+	(int_reg_operand): New predicate to match only GPR registers.
+	(base_reg_operand): New predicate to match base registers.
+	(quad_int_reg_operand): New predicate to match even GPR registers
+	for quad memory operations.
+	(vsx_reg_or_cint_operand): New predicate to allow vector logical
+	operations in both GPR and VSX registers.
+	(quad_memory_operand): New predicate for quad memory operations.
+	(reg_or_indexed_operand): New predicate for direct move support.
+
+	* gcc/config/rs6000/rs6000-cpus.def (ISA_2_5_MASKS_EMBEDDED):
+	Inherit from ISA_2_4_MASKS, not ISA_2_2_MASKS.
+	(ISA_2_7_MASKS_SERVER): New mask for ISA 2.07 (i.e. power8).
+	(POWERPC_MASKS): Add power8 options.
+	(power8 cpu): Use ISA_2_7_MASKS_SERVER instead of specifying the
+	various options.
+
+	* gcc/config/rs6000/rs6000-c.c (rs6000_target_modify_macros):
+	Define _ARCH_PWR8 and __POWER8_VECTOR__ for power8.
+
+	* gcc/config/rs6000/rs6000.opt (-mvsx-timode): Add documentation.
+	(-mpower8-fusion): New power8 options.
+	(-mpower8-fusion-sign): Likewise.
+	(-mpower8-vector): Likewise.
+	(-mcrypto): Likewise.
+	(-mdirect-move): Likewise.
+	(-mquad-memory): Likewise.
+
+	* gcc/config/rs6000/rs6000.c (power8_cost): Initial definition for
+	power8.
+	(rs6000_hard_regno_mode_ok): Make PTImode only match even GPR
+	registers.
+	(rs6000_debug_reg_print): Print the base register class if
+	-mdebug=reg.
+	(rs6000_debug_vector_unit): Add p8_vector.
+	(rs6000_debug_reg_global): If -mdebug=reg, print power8 constraint
+	definitions.  Also print fusion state.
+	(rs6000_init_hard_regno_mode_ok): Set up power8 constraints.
+	(rs6000_builtin_mask_calculate): Add power8 builtin support.
+	(rs6000_option_override_internal): Add support for power8.
+	(rs6000_common_init_builtins): Add debugging for skipped builtins
+	if -mdebug=builtin.
+	(rs6000_adjust_cost): Add power8 support.
+	(rs6000_issue_rate): Likewise.
+	(insn_must_be_first_in_group): Likewise.
+	(insn_must_be_last_in_group): Likewise.
+	(force_new_group): Likewise.
+	(rs6000_register_move_cost): Likewise.
+	(rs6000_opt_masks): Likewise.
+
+	* config/rs6000/rs6000.h (ASM_CPU_POWER8_SPEC): If we don't have a
+	power8 capable assembler, default to power7 options.
+	(TARGET_DIRECT_MOVE): Likewise.
+	(TARGET_CRYPTO): Likewise.
+	(TARGET_P8_VECTOR): Likewise.
+	(VECTOR_UNIT_P8_VECTOR_P): Define power8 vector support.
+	(VECTOR_UNIT_VSX_OR_P8_VECTOR_P): Likewise.
+	(VECTOR_MEM_P8_VECTOR_P): Likewise.
+	(VECTOR_MEM_VSX_OR_P8_VECTOR_P): Likewise.
+	(VECTOR_MEM_ALTIVEC_OR_VSX_P): Likewise.
+	(TARGET_XSCVDPSPN): Likewise.
+	(TARGET_XSCVSPDPN): Likewise.
+	(TARGET_SYNC_HI_QI): Likewise.
+	(TARGET_SYNC_TI): Likewise.
+	(MASK_CRYPTO): Likewise.
+	(MASK_DIRECT_MOVE): Likewise.
+	(MASK_P8_FUSION): Likewise.
+	(MASK_P8_VECTOR): Likewise.
+	(REG_ALLOC_ORDER): Move fr13 to be lower in priority so that the
+	TFmode temporary used by some of the direct move instructions to
+	get two FP temporary registers does not force creation of a stack
+	frame.
+	(VLOGICAL_REGNO_P): Allow vector logical operations in GPRs.
+	(MODES_TIEABLE_P): Move the VSX tests above the Altivec tests so
+	that any VSX registers are tieable, even if they are also an
+	Altivec vector mode.
+	(r6000_reg_class_enum): Add wm, wq, wr, wv constraints.
+	(RS6000_BTM_P8_VECTOR): Power8 builtin support.
+	(RS6000_BTM_CRYPTO): Likewise.
+	(RS6000_BTM_COMMON): Likewise.
+
+	* config/rs6000/rs6000.md (cpu attribute): Add power8.
+	* config/rs6000/rs6000-opts.h (PROCESSOR_POWER8): Likewise.
+	(enum rs6000_vector): Add power8 vector support.
+
 	Clone branch from subversion id 199028.
 
 	* REVISION: New file to track subversion id.
Index: gcc/config/rs6000/constraints.md
===================================================================
--- gcc/config/rs6000/constraints.md	(revision 199037)
+++ gcc/config/rs6000/constraints.md	(working copy)
@@ -79,12 +79,35 @@ (define_register_constraint "wg" "rs6000
 (define_register_constraint "wl" "rs6000_constraints[RS6000_CONSTRAINT_wl]"
   "Floating point register if the LFIWAX instruction is enabled or NO_REGS.")
 
+(define_register_constraint "wm" "rs6000_constraints[RS6000_CONSTRAINT_wm]"
+  "VSX register if direct move instructions are enabled, or NO_REGS.")
+
+(define_constraint "wq"
+  "Even general purpose register to use with load/store quad instructions."
+  (match_operand 0 "quad_int_reg_operand"))
+
+(define_register_constraint "wr" "rs6000_constraints[RS6000_CONSTRAINT_wr]"
+  "General purpose register if 64-bit instructions are enabled or NO_REGS.")
+
+(define_register_constraint "wv" "rs6000_constraints[RS6000_CONSTRAINT_wv]"
+  "Altivec register if -mpower8-vector is used or NO_REGS.")
+
 (define_register_constraint "wx" "rs6000_constraints[RS6000_CONSTRAINT_wx]"
   "Floating point register if the STFIWX instruction is enabled or NO_REGS.")
 
 (define_register_constraint "wz" "rs6000_constraints[RS6000_CONSTRAINT_wz]"
   "Floating point register if the LFIWZX instruction is enabled or NO_REGS.")
 
+;; NO_REGs register constraint, used to merge mov{sd,sf}, since movsd can use
+;; direct move directly, and movsf can't to move between the register sets.
+;; There is a mode_attr that resolves to wm for SDmode and wn for SFmode
+(define_register_constraint "wn" "NO_REGS")
+
+;; Lq/stq validates the address for load/store quad
+(define_memory_constraint "wQ"
+  "Memory operand suitable for the load/store quad instructions"
+  (match_operand 0 "quad_memory_operand"))
+
 ;; Altivec style load/store that ignores the bottom bits of the address
 (define_memory_constraint "wZ"
   "Indexed or indirect memory operand, ignoring the bottom 4 bits"
Index: gcc/config/rs6000/predicates.md
===================================================================
--- gcc/config/rs6000/predicates.md	(revision 199037)
+++ gcc/config/rs6000/predicates.md	(working copy)
@@ -166,6 +166,11 @@ (define_predicate "const_2_to_3_operand"
   (and (match_code "const_int")
        (match_test "IN_RANGE (INTVAL (op), 2, 3)")))
 
+;; Match op = 0..15
+(define_predicate "const_0_to_15_operand"
+  (and (match_code "const_int")
+       (match_test "IN_RANGE (INTVAL (op), 0, 15)")))
+
 ;; Return 1 if op is a register that is not special.
 (define_predicate "gpc_reg_operand"
   (match_operand 0 "register_operand")
@@ -182,9 +187,68 @@ (define_predicate "gpc_reg_operand"
   if (REGNO (op) >= ARG_POINTER_REGNUM && !CA_REGNO_P (REGNO (op)))
     return 1;
 
+  if (TARGET_VSX && VSX_REGNO_P (REGNO (op)))
+    return 1;
+
   return INT_REGNO_P (REGNO (op)) || FP_REGNO_P (REGNO (op));
 })
 
+;; Return 1 if op is a general purpose register.  Unlike gpc_reg_operand, don't
+;; allow floating point or vector registers.
+(define_predicate "int_reg_operand"
+  (match_operand 0 "register_operand")
+{
+  if ((TARGET_E500_DOUBLE || TARGET_SPE) && invalid_e500_subreg (op, mode))
+    return 0;
+
+  if (GET_CODE (op) == SUBREG)
+    op = SUBREG_REG (op);
+
+  if (!REG_P (op))
+    return 0;
+
+  if (REGNO (op) >= ARG_POINTER_REGNUM && !CA_REGNO_P (REGNO (op)))
+    return 1;
+
+  return INT_REGNO_P (REGNO (op));
+})
+
+;; Like int_reg_operand, but only return true for base registers
+(define_predicate "base_reg_operand"
+  (match_operand 0 "int_reg_operand")
+{
+  if (GET_CODE (op) == SUBREG)
+    op = SUBREG_REG (op);
+
+  if (!REG_P (op))
+    return 0;
+
+  return (REGNO (op) != FIRST_GPR_REGNO);
+})
+
+;; Return 1 if op is a general purpose register that is an even register
+;; which is suitable for a load/store quad operation
+(define_predicate "quad_int_reg_operand"
+  (match_operand 0 "register_operand")
+{
+  HOST_WIDE_INT r;
+
+  if (!TARGET_QUAD_MEMORY)
+    return 0;
+
+  if (GET_CODE (op) == SUBREG)
+    op = SUBREG_REG (op);
+
+  if (!REG_P (op))
+    return 0;
+
+  r = REGNO (op);
+  if (r >= FIRST_PSEUDO_REGISTER)
+    return 1;
+
+  return (INT_REGNO_P (r) && ((r & 1) == 0));
+})
+
 ;; Return 1 if op is a register that is a condition register field.
 (define_predicate "cc_reg_operand"
   (match_operand 0 "register_operand")
@@ -302,6 +366,11 @@ (define_predicate "reg_or_logical_cint_o
                       & (~ (unsigned HOST_WIDE_INT) 0xffffffff)) == 0)")
     (match_operand 0 "gpc_reg_operand")))
 
+;; Like reg_or_logical_cint_operand, but allow vsx registers
+(define_predicate "vsx_reg_or_cint_operand"
+  (ior (match_operand 0 "vsx_register_operand")
+       (match_operand 0 "reg_or_logical_cint_operand")))
+
 ;; Return 1 if operand is a CONST_DOUBLE that can be set in a register
 ;; with no more than one instruction per word.
 (define_predicate "easy_fp_constant"
@@ -507,6 +576,54 @@ (define_predicate "offsettable_mem_opera
   (and (match_operand 0 "memory_operand")
        (match_test "offsettable_nonstrict_memref_p (op)")))
 
+;; Return 1 if the operand is suitable for load/store quad memory.
+(define_predicate "quad_memory_operand"
+  (match_code "mem")
+{
+  rtx addr, op0, op1;
+  int ret;
+
+  if (!TARGET_QUAD_MEMORY)
+    ret = 0;
+
+  else if (!memory_operand (op, mode))
+    ret = 0;
+
+  else if (GET_MODE_SIZE (GET_MODE (op)) != 16)
+    ret = 0;
+
+  else if (MEM_ALIGN (op) < 128)
+    ret = 0;
+
+  else
+    {
+      addr = XEXP (op, 0);
+      if (int_reg_operand (addr, Pmode))
+        ret = 1;
+
+      else if (GET_CODE (addr) != PLUS)
+        ret = 0;
+
+      else
+        {
+          op0 = XEXP (addr, 0);
+          op1 = XEXP (addr, 1);
+          ret = (int_reg_operand (op0, Pmode)
+                 && GET_CODE (op1) == CONST_INT
+                 && IN_RANGE (INTVAL (op1), -32768, 32767)
+                 && (INTVAL (op1) & 15) == 0);
+        }
+    }
+
+  if (TARGET_DEBUG_ADDR)
+    {
+      fprintf (stderr, "\nquad_memory_operand, ret = %s\n", ret ? "true" : "false");
+      debug_rtx (op);
+    }
+
+  return ret;
+})
+
 ;; Return 1 if the operand is an indexed or indirect memory operand.
 (define_predicate "indexed_or_indirect_operand"
   (match_code "mem")
@@ -521,6 +638,19 @@ (define_predicate "indexed_or_indirect_o
   return indexed_or_indirect_address (op, mode);
 })
 
+;; Like indexed_or_indirect_operand, but also allow a GPR register if direct
+;; moves are supported.
+(define_predicate "reg_or_indexed_operand"
+  (match_code "mem,reg")
+{
+  if (MEM_P (op))
+    return indexed_or_indirect_operand (op, mode);
+  else if (TARGET_DIRECT_MOVE)
+    return register_operand (op, mode);
+  return
+    0;
+})
+
 ;; Return 1 if the operand is an indexed or indirect memory operand with an
 ;; AND -16 in it, used to recognize when we need to switch to Altivec loads
 ;; to realign loops instead of VSX (altivec silently ignores the bottom bits,
Index: gcc/config/rs6000/rs6000-cpus.def
===================================================================
--- gcc/config/rs6000/rs6000-cpus.def	(revision 199037)
+++ gcc/config/rs6000/rs6000-cpus.def	(working copy)
@@ -28,7 +28,7 @@
    ALTIVEC, since in general it isn't a win on power6.  In ISA 2.04, fsel,
    fre, fsqrt, etc. were no longer documented as optional.  Group masks by
    server and embedded. */
-#define ISA_2_5_MASKS_EMBEDDED  (ISA_2_2_MASKS                          \
+#define ISA_2_5_MASKS_EMBEDDED  (ISA_2_4_MASKS                          \
                                  | OPTION_MASK_CMPB                     \
                                  | OPTION_MASK_RECIP_PRECISION          \
                                  | OPTION_MASK_PPC_GFXOPT               \
@@ -45,6 +45,14 @@
                                  | OPTION_MASK_VSX                      \
                                  | OPTION_MASK_VSX_TIMODE)
 
+/* For now, don't provide an embedded version of ISA 2.07.  */
+#define ISA_2_7_MASKS_SERVER    (ISA_2_6_MASKS_SERVER                   \
+                                 | OPTION_MASK_P8_FUSION                \
+                                 | OPTION_MASK_P8_VECTOR                \
+                                 | OPTION_MASK_CRYPTO                   \
+                                 | OPTION_MASK_DIRECT_MOVE              \
+                                 | OPTION_MASK_QUAD_MEMORY)
+
 #define POWERPC_7400_MASK       (OPTION_MASK_PPC_GFXOPT | OPTION_MASK_ALTIVEC)
 
 /* Deal with ports that do not have -mstrict-align.  */
@@ -61,7 +69,9 @@
 /* Mask of all options to set the default isa flags based on -mcpu=.  */
 #define POWERPC_MASKS           (OPTION_MASK_ALTIVEC                    \
                                  | OPTION_MASK_CMPB                     \
+                                 | OPTION_MASK_CRYPTO                   \
                                  | OPTION_MASK_DFP                      \
+                                 | OPTION_MASK_DIRECT_MOVE              \
                                  | OPTION_MASK_DLMZB                    \
                                  | OPTION_MASK_FPRND                    \
                                  | OPTION_MASK_ISEL                     \
@@ -69,11 +79,14 @@
                                  | OPTION_MASK_MFPGPR                   \
                                  | OPTION_MASK_MULHW                    \
                                  | OPTION_MASK_NO_UPDATE                \
+                                 | OPTION_MASK_P8_FUSION                \
+                                 | OPTION_MASK_P8_VECTOR                \
                                  | OPTION_MASK_POPCNTB                  \
                                  | OPTION_MASK_POPCNTD                  \
                                  | OPTION_MASK_POWERPC64                \
                                  | OPTION_MASK_PPC_GFXOPT               \
                                  | OPTION_MASK_PPC_GPOPT                \
+                                 | OPTION_MASK_QUAD_MEMORY              \
                                  | OPTION_MASK_RECIP_PRECISION          \
                                  | OPTION_MASK_SOFT_FLOAT               \
                                  | OPTION_MASK_STRICT_ALIGN_OPTIONAL    \
@@ -168,10 +181,7 @@ RS6000_CPU ("power7", PROCESSOR_POWER7,
             POWERPC_7400_MASK | MASK_POWERPC64 | MASK_PPC_GPOPT | MASK_MFCRF
             | MASK_POPCNTB | MASK_FPRND | MASK_CMPB | MASK_DFP | MASK_POPCNTD
             | MASK_VSX | MASK_RECIP_PRECISION | MASK_VSX_TIMODE)
-RS6000_CPU ("power8", PROCESSOR_POWER7,   /* Don't add MASK_ISEL by default */
-            POWERPC_7400_MASK | MASK_POWERPC64 | MASK_PPC_GPOPT | MASK_MFCRF
-            | MASK_POPCNTB | MASK_FPRND | MASK_CMPB | MASK_DFP | MASK_POPCNTD
-            | MASK_VSX | MASK_RECIP_PRECISION | MASK_VSX_TIMODE)
+RS6000_CPU ("power8", PROCESSOR_POWER7, MASK_POWERPC64 | ISA_2_7_MASKS_SERVER)
 RS6000_CPU ("powerpc", PROCESSOR_POWERPC, 0)
 RS6000_CPU ("powerpc64", PROCESSOR_POWERPC64, MASK_PPC_GFXOPT | MASK_POWERPC64)
 RS6000_CPU ("rs64", PROCESSOR_RS64A, MASK_PPC_GFXOPT | MASK_POWERPC64)
Index: gcc/config/rs6000/rs6000-c.c
===================================================================
--- gcc/config/rs6000/rs6000-c.c	(revision 199037)
+++ gcc/config/rs6000/rs6000-c.c	(working copy)
@@ -315,6 +315,8 @@ rs6000_target_modify_macros (bool define
     rs6000_define_or_undefine_macro (define_p, "_ARCH_PWR6X");
   if ((flags & OPTION_MASK_POPCNTD) != 0)
     rs6000_define_or_undefine_macro (define_p, "_ARCH_PWR7");
+  if ((flags & OPTION_MASK_DIRECT_MOVE) != 0)
+    rs6000_define_or_undefine_macro (define_p, "_ARCH_PWR8");
   if ((flags & OPTION_MASK_SOFT_FLOAT) != 0)
     rs6000_define_or_undefine_macro (define_p, "_SOFT_FLOAT");
   if ((flags & OPTION_MASK_RECIP_PRECISION) != 0)
@@ -331,6 +333,8 @@ rs6000_target_modify_macros (bool define
     }
   if ((flags & OPTION_MASK_VSX) != 0)
     rs6000_define_or_undefine_macro (define_p, "__VSX__");
+  if ((flags & OPTION_MASK_P8_VECTOR) != 0)
+    rs6000_define_or_undefine_macro (define_p, "__POWER8_VECTOR__");
 
   /* options from the builtin masks.  */
   if ((bu_mask & RS6000_BTM_SPE) != 0)
Index: gcc/config/rs6000/rs6000.opt
===================================================================
--- gcc/config/rs6000/rs6000.opt	(revision 199037)
+++ gcc/config/rs6000/rs6000.opt	(working copy)
@@ -517,4 +517,28 @@ Control whether we save the TOC in the p
 
 mvsx-timode
 Target Undocumented Mask(VSX_TIMODE) Var(rs6000_isa_flags)
-; Allow/disallow TImode in VSX registers
+Allow 128-bit integers in VSX registers
+
+mpower8-fusion
+Target Report Mask(P8_FUSION) Var(rs6000_isa_flags)
+Fuse certain integer operations together for better performance on power8
+
+mpower8-fusion-sign
+Target Undocumented Mask(P8_FUSION_SIGN) Var(rs6000_isa_flags)
+Allow sign extension in fusion operations
+
+mpower8-vector
+Target Report Mask(P8_VECTOR) Var(rs6000_isa_flags)
+Use/do not use vector and scalar instructions added in ISA 2.07.
+
+mcrypto
+Target Report Mask(CRYPTO) Var(rs6000_isa_flags)
+Use ISA 2.07 crypto instructions
+
+mdirect-move
+Target Report Mask(DIRECT_MOVE) Var(rs6000_isa_flags)
+Use ISA 2.07 direct move between GPR & VSX register instructions
+
+mquad-memory
+Target Report Mask(QUAD_MEMORY) Var(rs6000_isa_flags)
+Generate the quad word memory instructions (lq/stq/lqarx/stqcx).
Index: gcc/config/rs6000/rs6000.c
===================================================================
--- gcc/config/rs6000/rs6000.c	(revision 199037)
+++ gcc/config/rs6000/rs6000.c	(working copy)
@@ -831,6 +831,25 @@ struct processor_costs power7_cost = {
   12,                   /* prefetch streams */
 };
 
+/* Instruction costs on POWER8 processors.  */
+static const
+struct processor_costs power8_cost = {
+  COSTS_N_INSNS (3),    /* mulsi */
+  COSTS_N_INSNS (3),    /* mulsi_const */
+  COSTS_N_INSNS (3),    /* mulsi_const9 */
+  COSTS_N_INSNS (3),    /* muldi */
+  COSTS_N_INSNS (19),   /* divsi */
+  COSTS_N_INSNS (35),   /* divdi */
+  COSTS_N_INSNS (3),    /* fp */
+  COSTS_N_INSNS (3),    /* dmul */
+  COSTS_N_INSNS (14),   /* sdiv */
+  COSTS_N_INSNS (17),   /* ddiv */
+  128,                  /* cache line size */
+  32,                   /* l1 cache */
+  256,                  /* l2 cache */
+  12,                   /* prefetch streams */
+};
+
 /* Instruction costs on POWER A2 processors.  */
 static const
 struct processor_costs ppca2_cost = {
@@ -1547,6 +1566,15 @@ rs6000_hard_regno_mode_ok (int regno, en
 {
   int last_regno = regno + rs6000_hard_regno_nregs[mode][regno] - 1;
 
+  /* PTImode can only go in GPRs.  Quad word memory operations require even/odd
+     register combinations, and use PTImode where we need to deal with quad
+     word memory operations.  Don't allow quad words in the argument or frame
+     pointer registers, just registers 0..31.  */
+  if (mode == PTImode)
+    return (IN_RANGE (regno, FIRST_GPR_REGNO, LAST_GPR_REGNO)
+            && IN_RANGE (last_regno, FIRST_GPR_REGNO, LAST_GPR_REGNO)
+            && ((regno & 1) == 0));
+
   /* VSX registers that overlap the FPR registers are larger than for non-VSX
      implementations.  Don't allow an item to be split between a FP register
      and an Altivec register.  */
@@ -1678,6 +1706,16 @@ rs6000_debug_reg_print (int first_regno,
           comma = "";
         }
 
+      len += fprintf (stderr, "%sreg-class = %s", comma,
+                      reg_class_names[(int)rs6000_regno_regclass[r]]);
+      comma = ", ";
+
+      if (len > 70)
+        {
+          fprintf (stderr, ",\n\t");
+          comma = "";
+        }
+
       fprintf (stderr, "%sregno = %d\n", comma, r);
     }
 }
@@ -1710,6 +1748,7 @@ rs6000_debug_reg_global (void)
     "none",
     "altivec",
     "vsx",
+    "p8_vector",
     "paired",
     "spe",
     "other"
@@ -1802,8 +1841,11 @@ rs6000_debug_reg_global (void)
            "wf reg_class = %s\n"
            "wg reg_class = %s\n"
            "wl reg_class = %s\n"
+           "wm reg_class = %s\n"
+           "wr reg_class = %s\n"
            "ws reg_class = %s\n"
            "wt reg_class = %s\n"
+           "wv reg_class = %s\n"
            "wx reg_class = %s\n"
            "wz reg_class = %s\n"
            "\n",
@@ -1815,8 +1857,11 @@ rs6000_debug_reg_global (void)
            reg_class_names[rs6000_constraints[RS6000_CONSTRAINT_wf]],
            reg_class_names[rs6000_constraints[RS6000_CONSTRAINT_wg]],
            reg_class_names[rs6000_constraints[RS6000_CONSTRAINT_wl]],
+           reg_class_names[rs6000_constraints[RS6000_CONSTRAINT_wm]],
+           reg_class_names[rs6000_constraints[RS6000_CONSTRAINT_wr]],
            reg_class_names[rs6000_constraints[RS6000_CONSTRAINT_ws]],
            reg_class_names[rs6000_constraints[RS6000_CONSTRAINT_wt]],
+           reg_class_names[rs6000_constraints[RS6000_CONSTRAINT_wv]],
            reg_class_names[rs6000_constraints[RS6000_CONSTRAINT_wx]],
            reg_class_names[rs6000_constraints[RS6000_CONSTRAINT_wz]]);
 
@@ -2050,6 +2095,10 @@ rs6000_debug_reg_global (void)
   if (targetm.lra_p ())
     fprintf (stderr, DEBUG_FMT_S, "lra", "true");
 
+  if (TARGET_P8_FUSION)
+    fprintf (stderr, DEBUG_FMT_S, "p8 fusion",
+             (TARGET_P8_FUSION_SIGN) ? "zero+sign" : "zero");
+
   fprintf (stderr, DEBUG_FMT_S, "plt-format",
            TARGET_SECURE_PLT ? "secure" : "bss");
   fprintf (stderr, DEBUG_FMT_S, "struct-return",
@@ -2240,6 +2289,15 @@ rs6000_init_hard_regno_mode_ok (bool glo
   if (TARGET_LFIWAX)
     rs6000_constraints[RS6000_CONSTRAINT_wl] = FLOAT_REGS;
 
+  if (TARGET_DIRECT_MOVE)
+    rs6000_constraints[RS6000_CONSTRAINT_wm] = VSX_REGS;
+
+  if (TARGET_POWERPC64)
+    rs6000_constraints[RS6000_CONSTRAINT_wr] = GENERAL_REGS;
+
+  if (TARGET_P8_VECTOR)
+    rs6000_constraints[RS6000_CONSTRAINT_wv] = ALTIVEC_REGS;
+
   if (TARGET_STFIWX)
     rs6000_constraints[RS6000_CONSTRAINT_wx] = FLOAT_REGS;
 
@@ -2520,16 +2578,18 @@ darwin_rs6000_override_options (void)
 HOST_WIDE_INT
 rs6000_builtin_mask_calculate (void)
 {
-  return (((TARGET_ALTIVEC)                 ? RS6000_BTM_ALTIVEC  : 0)
-          | ((TARGET_VSX)                   ? RS6000_BTM_VSX      : 0)
-          | ((TARGET_SPE)                   ? RS6000_BTM_SPE      : 0)
-          | ((TARGET_PAIRED_FLOAT)          ? RS6000_BTM_PAIRED   : 0)
-          | ((TARGET_FRE)                   ? RS6000_BTM_FRE      : 0)
-          | ((TARGET_FRES)                  ? RS6000_BTM_FRES     : 0)
-          | ((TARGET_FRSQRTE)               ? RS6000_BTM_FRSQRTE  : 0)
-          | ((TARGET_FRSQRTES)              ? RS6000_BTM_FRSQRTES : 0)
-          | ((TARGET_POPCNTD)               ? RS6000_BTM_POPCNTD  : 0)
-          | ((rs6000_cpu == PROCESSOR_CELL) ? RS6000_BTM_CELL     : 0));
+  return (((TARGET_ALTIVEC)                 ? RS6000_BTM_ALTIVEC   : 0)
+          | ((TARGET_VSX)                   ? RS6000_BTM_VSX       : 0)
+          | ((TARGET_SPE)                   ? RS6000_BTM_SPE       : 0)
+          | ((TARGET_PAIRED_FLOAT)          ? RS6000_BTM_PAIRED    : 0)
+          | ((TARGET_FRE)                   ? RS6000_BTM_FRE       : 0)
+          | ((TARGET_FRES)                  ? RS6000_BTM_FRES      : 0)
+          | ((TARGET_FRSQRTE)               ? RS6000_BTM_FRSQRTE   : 0)
+          | ((TARGET_FRSQRTES)              ? RS6000_BTM_FRSQRTES  : 0)
+          | ((TARGET_POPCNTD)               ? RS6000_BTM_POPCNTD   : 0)
+          | ((rs6000_cpu == PROCESSOR_CELL) ? RS6000_BTM_CELL      : 0)
+          | ((TARGET_P8_VECTOR)             ? RS6000_BTM_P8_VECTOR : 0)
+          | ((TARGET_CRYPTO)                ? RS6000_BTM_CRYPTO    : 0));
 }
 
 /* Override command line options.  Mostly we process the processor type and
@@ -2803,7 +2863,9 @@ rs6000_option_override_internal (bool gl
   /* For the newer switches (vsx, dfp, etc.) set some of the older options,
      unless the user explicitly used the -mno-