From patchwork Wed Jun 1 22:32:07 2016
X-Patchwork-Submitter: Michael Meissner
X-Patchwork-Id: 628888
Date: Wed, 1 Jun 2016 18:32:07 -0400
From: Michael Meissner
To: gcc-patches@gcc.gnu.org, Segher Boessenkool, David Edelsohn, Bill Schmidt
Subject: [PATCH applied], Backport PowerPC ISA 3.0 vector d-form to GCC 6.2 branch
Message-ID: <20160601223207.GA30962@ibm-tiger.the-meissners.org>

After a bootstrap and test, I have applied this backport of the vector d-form
patch that I applied to the trunk on May 11th.

[gcc]
2016-06-01  Michael Meissner

	Back port from trunk
	2016-05-11  Michael Meissner

	* config/rs6000/predicates.md (quad_memory_operand): Move most of
	the code into quad_address_p and call it to share code with
	vsx_quad_dform_memory_operand.
	(vsx_quad_dform_memory_operand): New predicate for ISA 3.0 vector
	d-form support.
	* config/rs6000/rs6000.opt (-mlra): Switch to being an option mask
	bit instead of being a separate word. Split -mpower9-dform into
	two switches, -mpower9-dform-scalar and -mpower9-dform-vector.
	* config/rs6000/rs6000.c (RELOAD_REG_QUAD_OFFSET): New addr_mask
	for the register class supporting 128-bit quad word memory offsets.
	(mode_supports_vsx_dform_quad): Helper function to return if the
	register class uses quad word memory offsets.
	(rs6000_debug_addr_mask): Add support for quad word memory offsets.
	(rs6000_debug_reg_global): Always print if we are using LRA or not.
	(rs6000_setup_reg_addr_masks): If ISA 3.0 vector d-form
	instructions are enabled, set up the appropriate addr_masks for
	128-bit types.
	(rs6000_init_hard_regno_mode_ok): wb constraint is now based on
	-mpower9-dform-scalar, instead of -mpower9-dform.
	(rs6000_option_override_internal): Split -mpower9-dform into two
	switches, -mpower9-dform-scalar and -mpower9-dform-vector. The
	-mpower9-dform switch sets or clears both. If we are not using the
	LRA register allocator, do not enable -mpower9-dform-vector by
	default.
	If we are using LRA, enable -mpower9-dform-vector and
	-mvsx-timode if it is appropriate. Issue a warning if either
	-mpower9-dform-vector or -mvsx-timode is explicitly used without
	enabling LRA.
	(quad_address_offset_p): New helper function to return if the
	offset is legal for quad word memory instructions.
	(quad_address_p): New function to determine if GPR or vector
	register quad word memory addresses are legal.
	(mem_operand_gpr): Validate quad word address offsets.
	(reg_offset_addressing_ok_p): Add support for ISA 3.0 vector d-form
	(register + offset) instructions.
	(offsettable_ok_by_alignment): Likewise.
	(rs6000_legitimate_offset_address_p): Likewise.
	(legitimate_lo_sum_address_p): Likewise.
	(rs6000_legitimize_address): Likewise.
	(rs6000_legitimize_reload_address): Add more debug statements for
	-mdebug=addr.
	(rs6000_legitimate_address_p): Add support for ISA 3.0 vector
	d-form instructions.
	(rs6000_secondary_reload_memory): Add support for ISA 3.0 vector
	d-form instructions. Distinguish different cases in debug output.
	(rs6000_secondary_reload_inner): Add support for ISA 3.0 vector
	d-form instructions.
	(rs6000_preferred_reload_class): Likewise.
	(rs6000_output_move_128bit): Add support for ISA 3.0 d-form
	instructions. If ISA 3.0 is available, generate lxvx/stxvx instead
	of the ISA 2.06 indexed memory instructions.
	(rs6000_emit_prologue): If we have ISA 3.0 d-form instructions, use
	them to save/restore the saved vector registers instead of using
	Altivec instructions.
	(rs6000_emit_epilogue): Likewise.
	(rs6000_lra_p): Use TARGET_LRA instead of the old option word.
	(rs6000_opt_masks): Split -mpower9-dform into -mpower9-dform-scalar
	and -mpower9-dform-vector.
	(rs6000_print_options_internal): Print -mno- if was not selected.
	* config/rs6000/vsx.md (p9_vecload_): Delete hack to emit ISA 3.0
	vector indexed memory instructions, and fold the code into the
	normal mov patterns.
	(p9_vecstore_): Likewise.
	(vsx_mov): Add support for ISA 3.0 vector d-form instructions.
	(vsx_movti_64bit): Likewise.
	(vsx_movti_32bit): Likewise.
	* config/rs6000/constraints.md (wO constraint): New constraint for
	ISA 3.0 vector d-form support.
	* config/rs6000/rs6000-cpus.def (ISA_3_0_MASKS_SERVER): Use
	-mpower9-dform-scalar instead of -mpower9-dform. Add note not to
	include -mpower9-dform-vector until we switch over to LRA.
	(POWERPC_MASKS): Add -mlra. Split -mpower9-dform into two
	switches, -mpower9-dform-scalar and -mpower9-dform-vector.
	* config/rs6000/rs6000-protos.h (quad_address_p): Add declaration.
	* doc/invoke.texi (RS/6000 and PowerPC Options): Add documentation
	for -mpower9-dform and -mlra.
	* doc/md.texi (wO constraint): Document wO constraint.

[gcc/testsuite]
2016-06-01  Michael Meissner

	Back port from trunk
	2016-05-11  Michael Meissner

	* gcc.target/powerpc/dform-3.c: New test for ISA 3.0 vector d-form
	support.
	* gcc.target/powerpc/dform-1.c: Add -mlra option to silence warning
	when using -mvsx-timode.
	* gcc.target/powerpc/p8vector-int128-1.c: Likewise.
	* gcc.target/powerpc/dform-2.c: Likewise.
	* gcc.target/powerpc/pr68805.c: Likewise.

Index: gcc/config/rs6000/constraints.md
===================================================================
--- gcc/config/rs6000/constraints.md (revision 236941)
+++ gcc/config/rs6000/constraints.md (working copy)
@@ -156,6 +156,11 @@ (define_constraint "wL"
   (and (match_test "TARGET_DIRECT_MOVE_128")
        (match_test "(ival == VECTOR_ELEMENT_MFVSRLD_64BIT)"))))
 
+;; ISA 3.0 vector d-form addresses
+(define_memory_constraint "wO"
+  "Memory operand suitable for the ISA 3.0 vector d-form instructions."
+  (match_operand 0 "vsx_quad_dform_memory_operand"))
+
 ;; Lq/stq validates the address for load/store quad
 (define_memory_constraint "wQ"
   "Memory operand suitable for the load/store quad instructions"
Index: gcc/config/rs6000/predicates.md
===================================================================
--- gcc/config/rs6000/predicates.md (revision 236941)
+++ gcc/config/rs6000/predicates.md (working copy)
@@ -698,48 +698,25 @@ (define_predicate "offsettable_mem_opera
 (define_predicate "quad_memory_operand"
   (match_code "mem")
 {
-  rtx addr, op0, op1;
-  int ret;
-
   if (!TARGET_QUAD_MEMORY && !TARGET_SYNC_TI)
-    ret = 0;
-
-  else if (!memory_operand (op, mode))
-    ret = 0;
-
-  else if (GET_MODE_SIZE (GET_MODE (op)) != 16)
-    ret = 0;
-
-  else if (MEM_ALIGN (op) < 128)
-    ret = 0;
-
-  else
-    {
-      addr = XEXP (op, 0);
-      if (int_reg_operand (addr, Pmode))
-        ret = 1;
+    return false;
 
-      else if (GET_CODE (addr) != PLUS)
-        ret = 0;
+  if (GET_MODE_SIZE (mode) != 16 || !MEM_P (op) || MEM_ALIGN (op) < 128)
+    return false;
 
-      else
-        {
-          op0 = XEXP (addr, 0);
-          op1 = XEXP (addr, 1);
-          ret = (int_reg_operand (op0, Pmode)
-                 && GET_CODE (op1) == CONST_INT
-                 && IN_RANGE (INTVAL (op1), -32768, 32767)
-                 && (INTVAL (op1) & 15) == 0);
-        }
-    }
+  return quad_address_p (XEXP (op, 0), mode, true);
+})
 
-  if (TARGET_DEBUG_ADDR)
-    {
-      fprintf (stderr, "\nquad_memory_operand, ret = %s\n", ret ? "true" : "false");
-      debug_rtx (op);
-    }
+;; Return 1 if the operand is suitable for load/store to vector registers with
+;; d-form addressing (register+offset), which was added in ISA 3.0.
+;; Unlike quad_memory_operand, we do not have to check for alignment.
+(define_predicate "vsx_quad_dform_memory_operand"
+  (match_code "mem")
+{
+  if (!TARGET_P9_DFORM_VECTOR || !MEM_P (op) || GET_MODE_SIZE (mode) != 16)
+    return false;
 
-  return ret;
+  return quad_address_p (XEXP (op, 0), mode, false);
 })
 
 ;; Return 1 if the operand is an indexed or indirect memory operand.
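As a standalone illustration (not part of the applied patch): the lines removed from quad_memory_operand above accept a register+offset address only when the constant offset lies in the range -32768..32767 and is a multiple of 16. The sketch below restates that test in plain C, on the assumption that the shared helper keeps the same range and alignment rule. The names quad_offset_ok and quad_offset_examples are hypothetical and do not appear in GCC.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical helper, not a GCC function.  It mirrors the offset test that
   the old quad_memory_operand body applied to the constant in (reg + offset):
   IN_RANGE (offset, -32768, 32767) && (offset & 15) == 0.  */
bool
quad_offset_ok (long offset)
{
  /* The offset must fit in a signed 16-bit displacement.  */
  if (offset < -32768 || offset > 32767)
    return false;

  /* The offset must be a multiple of 16.  The remainder form is used here so
     the check does not depend on how negative values are represented.  */
  return offset % 16 == 0;
}

/* Spot checks for a few representative offsets.  */
void
quad_offset_examples (void)
{
  assert (quad_offset_ok (0));
  assert (quad_offset_ok (-32768));
  assert (quad_offset_ok (32752));
  assert (!quad_offset_ok (8));
  assert (!quad_offset_ok (32768));
}
```

This constrains the offset constant only. The separate MEM_ALIGN (op) < 128 test concerns the alignment of the memory reference itself; quad_memory_operand applies it, while vsx_quad_dform_memory_operand, per its comment, does not.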
Index: gcc/config/rs6000/rs6000-cpus.def
===================================================================
--- gcc/config/rs6000/rs6000-cpus.def (revision 236941)
+++ gcc/config/rs6000/rs6000-cpus.def (working copy)
@@ -60,13 +60,14 @@
                                  | OPTION_MASK_UPPER_REGS_SF)
 
 /* Add ISEL back into ISA 3.0, since it is supposed to be a win. Do not add
-   P9_DFORM or P9_MINMAX until they are fully debugged. */
+   P9_MINMAX until the hardware that supports it is available. Do not add
+   P9_DFORM_VECTOR until LRA is the default register allocator. */
 #define ISA_3_0_MASKS_SERVER    (ISA_2_7_MASKS_SERVER                   \
                                  | OPTION_MASK_FLOAT128_HW              \
                                  | OPTION_MASK_ISEL                     \
                                  | OPTION_MASK_MODULO                   \
                                  | OPTION_MASK_P9_FUSION                \
-                                 | OPTION_MASK_P9_DFORM                 \
+                                 | OPTION_MASK_P9_DFORM_SCALAR          \
                                  | OPTION_MASK_P9_VECTOR)
 
 #define POWERPC_7400_MASK (OPTION_MASK_PPC_GFXOPT | OPTION_MASK_ALTIVEC)
@@ -94,6 +95,7 @@
                                  | OPTION_MASK_FPRND                    \
                                  | OPTION_MASK_HTM                      \
                                  | OPTION_MASK_ISEL                     \
+                                 | OPTION_MASK_LRA                      \
                                  | OPTION_MASK_MFCRF                    \
                                  | OPTION_MASK_MFPGPR                   \
                                  | OPTION_MASK_MODULO                   \
@@ -101,7 +103,8 @@
                                  | OPTION_MASK_NO_UPDATE                \
                                  | OPTION_MASK_P8_FUSION                \
                                  | OPTION_MASK_P8_VECTOR                \
-                                 | OPTION_MASK_P9_DFORM                 \
+                                 | OPTION_MASK_P9_DFORM_SCALAR          \
+                                 | OPTION_MASK_P9_DFORM_VECTOR          \
                                  | OPTION_MASK_P9_FUSION                \
                                  | OPTION_MASK_P9_MINMAX                \
                                  | OPTION_MASK_P9_VECTOR                \
Index: gcc/config/rs6000/rs6000-protos.h
===================================================================
--- gcc/config/rs6000/rs6000-protos.h (revision 236941)
+++ gcc/config/rs6000/rs6000-protos.h (working copy)
@@ -86,6 +86,7 @@ extern int registers_ok_for_quad_peep (r
 extern int mems_ok_for_quad_peep (rtx, rtx);
 extern bool gpr_or_gpr_p (rtx, rtx);
 extern bool direct_move_p (rtx, rtx);
+extern bool quad_address_p (rtx, machine_mode, bool);
 extern bool quad_load_store_p (rtx, rtx);
 extern bool fusion_gpr_load_p (rtx, rtx, rtx, rtx);
 extern void expand_fusion_gpr_load (rtx *);
Index: gcc/config/rs6000/rs6000.opt
===================================================================
--- gcc/config/rs6000/rs6000.opt (revision 236941)
+++ gcc/config/rs6000/rs6000.opt (working copy)
@@ -470,8 +470,8 @@ Target RejectNegative Joined UInteger Va
 -mlong-double- Specify size of long double (64 or 128 bits).
 
 mlra
-Target Report Var(rs6000_lra_flag) Init(0) Save
-Use LRA instead of reload.
+Target Report Mask(LRA) Var(rs6000_isa_flags)
+Enable Local Register Allocation.
 
 msched-costly-dep=
 Target RejectNegative Joined Var(rs6000_sched_costly_dep_str)
@@ -609,9 +609,17 @@ mpower9-vector
 Target Report Mask(P9_VECTOR) Var(rs6000_isa_flags)
 Use/do not use vector and scalar instructions added in ISA 3.0.
 
+mpower9-dform-scalar
+Target Undocumented Mask(P9_DFORM_SCALAR) Var(rs6000_isa_flags)
+Use/do not use scalar register+offset memory instructions added in ISA 3.0.
+
+mpower9-dform-vector
+Target Undocumented Mask(P9_DFORM_VECTOR) Var(rs6000_isa_flags)
+Use/do not use vector register+offset memory instructions added in ISA 3.0.
+
 mpower9-dform
-Target Undocumented Mask(P9_DFORM) Var(rs6000_isa_flags)
-Use/do not use vector and scalar instructions added in ISA 3.0.
+Target Report Var(TARGET_P9_DFORM_BOTH) Init(-1) Save
+Use/do not use register+offset memory instructions added in ISA 3.0.
 
 mpower9-minmax
 Target Undocumented Mask(P9_MINMAX) Var(rs6000_isa_flags)
Index: gcc/config/rs6000/rs6000.c
===================================================================
--- gcc/config/rs6000/rs6000.c (revision 236943)
+++ gcc/config/rs6000/rs6000.c (working copy)
@@ -452,6 +452,7 @@ typedef unsigned char addr_mask_type;
 #define RELOAD_REG_PRE_INCDEC   0x10    /* PRE_INC/PRE_DEC valid. */
 #define RELOAD_REG_PRE_MODIFY   0x20    /* PRE_MODIFY valid. */
 #define RELOAD_REG_AND_M16      0x40    /* AND -16 addressing. */
+#define RELOAD_REG_QUAD_OFFSET  0x80    /* quad offset is limited. */
 
 /* Register type masks based on the type, of valid addressing modes. */
 struct rs6000_reg_addr {
@@ -499,6 +500,16 @@ mode_supports_vmx_dform (machine_mode mo
   return ((reg_addr[mode].addr_mask[RELOAD_REG_VMX] & RELOAD_REG_OFFSET) != 0);
 }
 
+/* Return true if we have D-form addressing in VSX registers. This addressing
+   is more limited than normal d-form addressing in that the offset must be
+   aligned on a 16-byte boundary. */
+static inline bool
+mode_supports_vsx_dform_quad (machine_mode mode)
+{
+  return ((reg_addr[mode].addr_mask[RELOAD_REG_ANY] & RELOAD_REG_QUAD_OFFSET)
+          != 0);
+}
+
 
 /* Target cpu costs. */
 
@@ -2108,7 +2119,9 @@ rs6000_debug_addr_mask (addr_mask_type m
   else if (keep_spaces)
     *p++ = ' ';
 
-  if ((mask & RELOAD_REG_OFFSET) != 0)
+  if ((mask & RELOAD_REG_QUAD_OFFSET) != 0)
+    *p++ = 'O';
+  else if ((mask & RELOAD_REG_OFFSET) != 0)
     *p++ = 'o';
   else if (keep_spaces)
     *p++ = ' ';
@@ -2645,8 +2658,7 @@ rs6000_debug_reg_global (void)
   if (TARGET_LINK_STACK)
     fprintf (stderr, DEBUG_FMT_S, "link_stack", "true");
 
-  if (targetm.lra_p ())
-    fprintf (stderr, DEBUG_FMT_S, "lra", "true");
+  fprintf (stderr, DEBUG_FMT_S, "lra", TARGET_LRA ? "true" : "false");
 
   if (TARGET_P8_FUSION)
     {
@@ -2781,17 +2793,31 @@ rs6000_setup_reg_addr_masks (void)
             }
 
           /* GPR and FPR registers can do REG+OFFSET addressing, except
-             possibly for SDmode. ISA 3.0 (i.e. power9) adds D-form
-             addressing for scalars to altivec registers. */
+             possibly for SDmode. ISA 3.0 (i.e. power9) adds D-form addressing
+             for 64-bit scalars and 32-bit SFmode to altivec registers. */
           if ((addr_mask != 0) && !indexed_only_p
               && msize <= 8
               && (rc == RELOAD_REG_GPR
-                  || rc == RELOAD_REG_FPR
-                  || (rc == RELOAD_REG_VMX
-                      && TARGET_P9_DFORM
-                      && (m2 == DFmode || m2 == SFmode))))
+                  || ((msize == 8 || m2 == SFmode)
+                      && (rc == RELOAD_REG_FPR
+                          || (rc == RELOAD_REG_VMX
+                              && TARGET_P9_DFORM_SCALAR)))))
             addr_mask |= RELOAD_REG_OFFSET;
 
+          /* VSX registers can do REG+OFFSET addressing if ISA 3.0
+             instructions are enabled. The offset for 128-bit VSX registers is
+             only 12-bits. While GPRs can handle the full offset range, VSX
+             registers can only handle the restricted range. */
+          else if ((addr_mask != 0) && !indexed_only_p
+                   && msize == 16 && TARGET_P9_DFORM_VECTOR
+                   && (ALTIVEC_OR_VSX_VECTOR_MODE (m2)
+                       || (m2 == TImode && TARGET_VSX_TIMODE)))
+            {
+              addr_mask |= RELOAD_REG_OFFSET;
+              if (rc == RELOAD_REG_FPR || rc == RELOAD_REG_VMX)
+                addr_mask |= RELOAD_REG_QUAD_OFFSET;
+            }
+
           /* VMX registers can do (REG & -16) and ((REG+REG) & -16)
              addressing on 128-bit types. */
           if (rc == RELOAD_REG_VMX && msize == 16
@@ -3114,7 +3140,7 @@ rs6000_init_hard_regno_mode_ok (bool glo
     }
 
   /* Support for new D-form instructions. */
-  if (TARGET_P9_DFORM)
+  if (TARGET_P9_DFORM_SCALAR)
     rs6000_constraints[RS6000_CONSTRAINT_wb] = ALTIVEC_REGS;
 
   /* Support for ISA 3.0 (power9) vectors. */
@@ -3987,7 +4013,8 @@ rs6000_option_override_internal (bool gl
   /* For the newer switches (vsx, dfp, etc.) set some of the older options,
      unless the user explicitly used the -mno-