[PATCH], Fix constraints on VSX Fma, Fix, and Reduce options

Message ID

20140910201631.GA20669@ibm-tiger.the-meissners.org

State

New

Headers

DomainKey-Signature: a=rsa-sha1; c=nofws; d=gcc.gnu.org; h=list-id
	:list-unsubscribe:list-archive:list-post:list-help:sender:date
	:from:to:subject:message-id:mime-version:content-type; q=dns; s=
	default; b=WG7xC8khk2vPaO9tm7pdmk6ByfnsuFjT+yMRaZkNLdzUkiD82sBhJ
	YFHSG3Tebe+h6AgvmCgqVNwOQLTzSo4MKvnSLiyIeJ785SeaMT2ReIxt6drZQFWH
	QxnKeLsMvOppAQdc2gCeaCxFwIqA/F3qlV7Sbnjp4O8sZB0D0iTZ3k=
Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm
Precedence: bulk
Sender: gcc-patches-owner@gcc.gnu.org
Date: Wed, 10 Sep 2014 16:16:31 -0400
From: Michael Meissner <meissner@linux.vnet.ibm.com>
To: gcc-patches@gcc.gnu.org, dje.gcc@gmail.com
Subject: [PATCH], Fix constraints on VSX Fma, Fix, and Reduce options
Message-ID: <20140910201631.GA20669@ibm-tiger.the-meissners.org>
Mail-Followup-To: Michael Meissner <meissner@linux.vnet.ibm.com>,
	gcc-patches@gcc.gnu.org, dje.gcc@gmail.com
MIME-Version: 1.0
Content-Type: multipart/mixed; boundary="bg08WKrSYDhXBjb5"
Content-Disposition: inline
User-Agent: Mutt/1.5.20 (2009-12-10)

Commit Message

Michael Meissner Sept. 10, 2014, 8:16 p.m. UTC

In doing work on improving power8 fusion support, I noticed that in several of
the patterns (vector fused multiply-add, optimization of float (fix (x)), and
vector reduction), I used the "ws" constraint which is the constraint for
scalar double precision floating point (currently FLOAT_REGS) in cases where
the operand is a vector, where we should use "wd" (preferred constraint for
V2DF), "wf" (preferred constraint for V4SF) or even "wa" (any VSX register).
This means the register allocator might generate extra code due to preferring
the traditional floating point registers.
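
As an illustration only (this is not part of the patch), the kinds of source constructs that the affected patterns are meant to cover can be sketched with GCC's generic vector extension. The function names are hypothetical, and whether a given GCC version actually selects `*vsx_fmav2df4`, `*vsx_fmav4sf4`, the float/fix pattern, or the V2DF reduction pattern for this code depends on the compiler version and on options such as `-O2 -mcpu=power8 -ffast-math`.

```c
/* Hypothetical compile-only test case, assuming a compiler that supports
   GCC's vector_size attribute and vector subscripting.  */

typedef double v2df __attribute__ ((vector_size (16)));
typedef float v4sf __attribute__ ((vector_size (16)));

/* a * b + c on V2DF: a candidate for a vector double fused multiply-add.  */
v2df
fma_v2df (v2df a, v2df b, v2df c)
{
  return a * b + c;
}

/* a * b + c on V4SF: a candidate for a vector float fused multiply-add.  */
v4sf
fma_v4sf (v4sf a, v4sf b, v4sf c)
{
  return a * b + c;
}

/* Sum of both elements: a V2DF reduction whose result is a scalar DF.  */
double
reduc_plus_v2df (v2df a)
{
  return a[0] + a[1];
}

/* float (fix (x)): convert to a 64-bit integer and back to double.  */
double
float_fix_df (double x)
{
  return (double) (long long) x;
}
```

Compiling such a file with `-S` before and after the change and comparing the assembler output is one way to see which registers the allocator chose for the vector operands.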

I was curious about the code generation changes, so I built power8 versions of
the Spec 2006 benchmark suite, and compared the number of instructions
generated, using the same options.  Most of the floating point benchmarks had
some changes in code generation, including fewer scalar floating loads/stores
(where the RA picked a traditional scalar register, which meant elsewhere a
scalar was spilled to the stack), and different encodings of the FMA
instructions.
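
The message does not say how the instruction counts were gathered. As a minimal sketch, assuming the assembler output of each build is available as text, a helper along the following lines could count how often a given mnemonic (for example `lfd` or `stfd`) appears. The function name `count_mnemonic` and the approach are assumptions made for illustration, not the author's tooling.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Count the lines of ASM_TEXT whose first token is exactly MNEMONIC.
   Labels, directives and comments are not counted, because their first
   token differs from the mnemonic.  */
size_t
count_mnemonic (const char *asm_text, const char *mnemonic)
{
  size_t count = 0;
  size_t len = strlen (mnemonic);
  const char *p = asm_text;

  if (len == 0)
    return 0;

  while (*p != '\0')
    {
      /* Skip leading blanks on this line.  */
      while (*p == ' ' || *p == '\t')
	p++;

      /* Match the mnemonic only if it is a whole token.  */
      if (strncmp (p, mnemonic, len) == 0
	  && (p[len] == '\0' || isspace ((unsigned char) p[len])))
	count++;

      /* Advance to the start of the next line.  */
      while (*p != '\0' && *p != '\n')
	p++;
      if (*p == '\n')
	p++;
    }

  return count;
}
```

Running this over the assembler output from the two compilers and comparing the totals per mnemonic gives a rough view of changes such as fewer scalar loads and stores.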

I did a run of the FP spec benchmarks on a big endian power8 system.  There
were no regressions that were significant, and the cactusADM benchmark sped up
by 2%.

I did a bootstrap/make check comparison, and there were no regressions.  Is it
ok to install in trunk and the active PowerPC branches?

Comments

David Edelsohn Sept. 10, 2014, 8:42 p.m. UTC | #1

On Wed, Sep 10, 2014 at 4:16 PM, Michael Meissner
<meissner@linux.vnet.ibm.com> wrote:
> In doing work on improving power8 fusion support, I noticed that in several of
> the patterns (vector fused multiply-add, optimization of float (fix (x)), and
> vector reduction), I used the "ws" constraint which is the constraint for
> scalar double precision floating point (currently FLOAT_REGS) in cases where
> the operand is a vector, where we should use "wd" (preferred constraint for
> V2DF), "wf" (preferred constraint for V4SF) or even "wa" (any VSX register).
> This means the register allocator might generate extra code due to preferring
> the traditional floating point registers.
>
> I was curious about the code generation changes, so I built power8 versions of
> the Spec 2006 benchmark suite, and compared the number of instructions
> generated, using the same options.  Most of the floating point benchmarks had
> some changes in code generation, including fewer scalar floating loads/stores
> (where the RA picked a traditional scalar register, which meant elsewhere a
> scalar was spilled to the stack), and different encodings of the FMA
> instructions.
>
> I did a run of the FP spec benchmarks on a big endian power8 system.  There
> were no regressions that were significant, and the cactusADM benchmark sped up
> by 2%.
>
> I did a bootstrap/make check comparison, and there were no regressions.  Is it
> ok to install in trunk and the active PowerPC branches?

Needs a ChangeLog.

Okay.

thanks, David

Michael Meissner Sept. 10, 2014, 8:48 p.m. UTC | #2

On Wed, Sep 10, 2014 at 04:42:06PM -0400, David Edelsohn wrote:
> Needs a ChangeLog.

Whoops, I forgot to include it:

2014-09-10  Michael Meissner  <meissner@linux.vnet.ibm.com>

	* config/rs6000/vsx.md (vsx_fmav4sf4): Use correct constraints for
	V2DF, V4SF, DF, and DI modes.
	(vsx_fmav2df4): Likewise.
	(vsx_float_fix_<mode>2): Likewise.
	(vsx_reduc_<VEC_reduc_name>_v2df_scalar): Likewise.

Index: gcc/config/rs6000/vsx.md
===================================================================
--- gcc/config/rs6000/vsx.md	(revision 214455)
+++ gcc/config/rs6000/vsx.md	(working copy)
@@ -905,11 +905,11 @@  (define_insn "*vsx_tsqrt<mode>2_internal
 ;; multiply.
 
 (define_insn "*vsx_fmav4sf4"
-  [(set (match_operand:V4SF 0 "vsx_register_operand" "=ws,ws,?wa,?wa,v")
+  [(set (match_operand:V4SF 0 "vsx_register_operand" "=wf,wf,?wa,?wa,v")
 	(fma:V4SF
-	  (match_operand:V4SF 1 "vsx_register_operand" "%ws,ws,wa,wa,v")
-	  (match_operand:V4SF 2 "vsx_register_operand" "ws,0,wa,0,v")
-	  (match_operand:V4SF 3 "vsx_register_operand" "0,ws,0,wa,v")))]
+	  (match_operand:V4SF 1 "vsx_register_operand" "%wf,wf,wa,wa,v")
+	  (match_operand:V4SF 2 "vsx_register_operand" "wf,0,wa,0,v")
+	  (match_operand:V4SF 3 "vsx_register_operand" "0,wf,0,wa,v")))]
   "VECTOR_UNIT_VSX_P (V4SFmode)"
   "@
    xvmaddasp %x0,%x1,%x2
@@ -920,11 +920,11 @@  (define_insn "*vsx_fmav4sf4"
   [(set_attr "type" "vecfloat")])
 
 (define_insn "*vsx_fmav2df4"
-  [(set (match_operand:V2DF 0 "vsx_register_operand" "=ws,ws,?wa,?wa")
+  [(set (match_operand:V2DF 0 "vsx_register_operand" "=wd,wd,?wa,?wa")
 	(fma:V2DF
-	  (match_operand:V2DF 1 "vsx_register_operand" "%ws,ws,wa,wa")
-	  (match_operand:V2DF 2 "vsx_register_operand" "ws,0,wa,0")
-	  (match_operand:V2DF 3 "vsx_register_operand" "0,ws,0,wa")))]
+	  (match_operand:V2DF 1 "vsx_register_operand" "%wd,wd,wa,wa")
+	  (match_operand:V2DF 2 "vsx_register_operand" "wd,0,wa,0")
+	  (match_operand:V2DF 3 "vsx_register_operand" "0,wd,0,wa")))]
   "VECTOR_UNIT_VSX_P (V2DFmode)"
   "@
    xvmaddadp %x0,%x1,%x2
@@ -1360,8 +1360,8 @@  (define_insn "*vsx_float_fix_<mode>2"
 (define_insn "vsx_concat_<mode>"
   [(set (match_operand:VSX_D 0 "vsx_register_operand" "=<VSr>,?<VSa>")
 	(vec_concat:VSX_D
-	 (match_operand:<VS_scalar> 1 "vsx_register_operand" "ws,<VSa>")
-	 (match_operand:<VS_scalar> 2 "vsx_register_operand" "ws,<VSa>")))]
+	 (match_operand:<VS_scalar> 1 "vsx_register_operand" "<VS_64reg>,<VSa>")
+	 (match_operand:<VS_scalar> 2 "vsx_register_operand" "<VS_64reg>,<VSa>")))]
   "VECTOR_MEM_VSX_P (<MODE>mode)"
 {
   if (BYTES_BIG_ENDIAN)
@@ -2018,7 +2018,7 @@  (define_insn_and_split "*vsx_reduc_<VEC_
 ;; to the top element of the V2DF array without doing an extract.
 
 (define_insn_and_split "*vsx_reduc_<VEC_reduc_name>_v2df_scalar"
-  [(set (match_operand:DF 0 "vfloat_operand" "=&ws,&?wa,ws,?wa")
+  [(set (match_operand:DF 0 "vfloat_operand" "=&ws,&?ws,ws,?ws")
 	(vec_select:DF
 	 (VEC_reduc:V2DF
 	  (vec_concat:V2DF
