PowerPC merge TD/TF moves

Message ID

20130130225005.GA4615@ibm-tiger.the-meissners.org

State

New

Headers

Comment: DKIM? See http://www.dkim.org
Comment: DomainKeys? See http://antispam.yahoo.com/domainkeys
DomainKey-Signature: a=rsa-sha1; q=dns; c=nofws; s=default; d=gcc.gnu.org;
	h=Received:Received:X-SWARE-Spam-Status:X-Spam-Check-By:Received:Received:Received:Received:Received:Received:Received:Received:Date:From:To:Subject:Message-ID:Mail-Followup-To:MIME-Version:Content-Type:Content-Disposition:User-Agent:X-Content-Scanned:x-cbid:X-IsSubscribed:Mailing-List:Precedence:List-Id:List-Unsubscribe:List-Archive:List-Post:List-Help:Sender:Delivered-To;
	b=oSrrV6rFHyqn0fWKUex+vffrsQwZe+bdG+PTKmA759p85se9J0EIesYvXPVu4k
	YQrstX/GfhcJ3EXzkFIWhMeDnVCuOi27vYZaOPCTm2NITmG8/f5msoOiHXiv0SS5
	nwohg4uDPOIrMhrgN9MwJXPlW7U/dW9WgbBncMBLN2lps=;
Date: Wed, 30 Jan 2013 17:50:05 -0500
From: Michael Meissner <meissner@linux.vnet.ibm.com>
To: gcc-patches@gcc.gnu.org, dje.gcc@gmail.com
Subject: [PATCH] PowerPC merge TD/TF moves
Message-ID: <20130130225005.GA4615@ibm-tiger.the-meissners.org>
Mail-Followup-To: Michael Meissner <meissner@linux.vnet.ibm.com>,
	gcc-patches@gcc.gnu.org, dje.gcc@gmail.com
MIME-Version: 1.0
Content-Type: multipart/mixed; boundary="n8g4imXOkfNTN/H1"
Content-Disposition: inline
User-Agent: Mutt/1.5.20 (2009-12-10)
Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm
Precedence: bulk
Sender: gcc-patches-owner@gcc.gnu.org

Commit Message

Michael Meissner Jan. 30, 2013, 10:50 p.m. UTC

This patch, like the previous 2 patches, combines the decimal and binary floating
point moves, this time for 128-bit floating point.
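
For readers unfamiliar with mode iterators, the following is a loose analogy in plain C, not GCC code. It uses an X-macro to stamp out one function per mode from a single template, the way the FMOVE128 iterator in this patch stamps out the TF and TD move patterns, with each mode's own condition ANDed onto the shared pattern condition. The struct, the macro names, and the `_ok` functions are all hypothetical, and the operand checks of the real patterns are left out.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the target flags tested in the patch.  */
struct target_flags
{
  bool ieeequad;        /* models TARGET_IEEEQUAD */
  bool long_double_128; /* models TARGET_LONG_DOUBLE_128 */
  bool hard_float;      /* models TARGET_HARD_FLOAT */
  bool fprs;            /* models TARGET_FPRS */
};

/* One entry per mode in the iterator: the mode name and its own condition.
   The conditions refer to the parameter `t' of the generated function.  */
#define FMOVE128_MODES(X)                               \
  X (tf, !t->ieeequad && t->long_double_128)            \
  X (td, t->hard_float && t->fprs)

/* The single template.  Each expansion defines mov<mode>_internal_ok, which
   is the shared pattern condition ANDed with the per-mode condition.  */
#define DEFINE_MOVE_OK(mode, mode_cond)                                 \
  static bool                                                           \
  mov##mode##_internal_ok (const struct target_flags *t)                \
  {                                                                     \
    bool pattern_cond = t->hard_float && t->fprs;                       \
    return pattern_cond && (mode_cond);                                 \
  }

FMOVE128_MODES (DEFINE_MOVE_OK)

/* Self-check: with hard float and 128-bit non-IEEE long double both moves
   are enabled; switching to IEEE quad disables only the TF variant.  */
void
fmove128_self_check (void)
{
  struct target_flags t = { false, true, true, true };
  assert (movtf_internal_ok (&t) && movtd_internal_ok (&t));
  t.ieeequad = true;
  assert (!movtf_internal_ok (&t) && movtd_internal_ok (&t));
}
```

The point of the analogy is only that the two modes share one body while keeping separate enabling conditions, which is why the merged expander can use an empty condition string.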

In doing this patch, I discovered that the previous patch left out the code
that enables the wg constraint, which lets -mcpu=power6x use the direct
move instructions.  So, I added the code in this patch, and also created a test
to make sure that direct moves are generated in the future.
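
A minimal sketch of the idea in standalone C, assuming that a constraint entry defaults to NO_REGS and that an alternative using such a constraint can never match. The enums, the flag struct, and the helper names below are reduced, hypothetical versions for illustration and are not the rs6000.c definitions.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, reduced register classes and constraint indices.  */
enum reg_class { NO_REGS, FLOAT_REGS, ALTIVEC_REGS };
enum constraint { CONSTRAINT_v, CONSTRAINT_wg, CONSTRAINT_wl, CONSTRAINT_MAX };

/* Hypothetical stand-ins for TARGET_ALTIVEC, TARGET_MFPGPR, TARGET_LFIWAX.  */
struct cpu_flags
{
  bool altivec;
  bool mfpgpr;
  bool lfiwax;
};

/* Fill the constraint table: everything starts as NO_REGS, and each
   constraint is mapped to a real class only when its feature is enabled.  */
static void
init_constraints (const struct cpu_flags *f,
                  enum reg_class table[CONSTRAINT_MAX])
{
  for (int i = 0; i < CONSTRAINT_MAX; i++)
    table[i] = NO_REGS;

  if (f->altivec)
    table[CONSTRAINT_v] = ALTIVEC_REGS;
  if (f->mfpgpr)
    table[CONSTRAINT_wg] = FLOAT_REGS;   /* the mapping that was missing */
  if (f->lfiwax)
    table[CONSTRAINT_wl] = FLOAT_REGS;
}

/* An alternative using constraint C is only usable if C names a class.  */
static bool
constraint_usable (const enum reg_class table[CONSTRAINT_MAX],
                   enum constraint c)
{
  return table[c] != NO_REGS;
}

/* Self-check: wg is usable only when the mfpgpr flag is set.  */
void
wg_self_check (void)
{
  enum reg_class table[CONSTRAINT_MAX];
  struct cpu_flags power6x = { false, true, false };
  struct cpu_flags other = { true, false, true };

  init_constraints (&power6x, table);
  assert (constraint_usable (table, CONSTRAINT_wg));

  init_constraints (&other, table);
  assert (!constraint_usable (table, CONSTRAINT_wg));
}
```

Under these assumptions, leaving out the wg assignment means the wg alternatives silently never match, which is consistent with the direct moves not being generated before this fix.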

I also added the reload helper for DDmode to rs6000_vector_reload that was
missed in the last patch.  The omission was harmless, since the helper is only
used with an undocumented debug switch.  Hopefully sometime in the future, I
will allow scalar floating point to be loaded in the upper 32 VSX registers
that are overlaid over the Altivec registers.
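
The shape of that helper table can be sketched in standalone C as below. This is an illustration only: it assumes a table indexed by mode and by direction (0 for store, 1 for load, matching the indices in the patch), with a separate set of helpers for 64-bit and 32-bit addresses. The enum values and the function name are hypothetical, and the enabling flag is a placeholder for whatever condition guards the real code.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, reduced mode list and helper identifiers.  */
enum fp_mode { MODE_DF, MODE_DD, MODE_MAX };
enum direction { RELOAD_STORE = 0, RELOAD_LOAD = 1 };
enum helper
{
  HELPER_NONE = 0,               /* no helper registered */
  HELPER_DF_DI_STORE, HELPER_DF_DI_LOAD,
  HELPER_DD_DI_STORE, HELPER_DD_DI_LOAD,
  HELPER_DF_SI_STORE, HELPER_DF_SI_LOAD,
  HELPER_DD_SI_STORE, HELPER_DD_SI_LOAD
};

/* Register the store/load helpers for each scalar mode.  PTR64 selects the
   DI (64-bit address) or SI (32-bit address) variants; SCALAR_HELPERS is a
   placeholder for the condition that enables the scalar entries at all.  */
static void
init_reload_helpers (bool ptr64, bool scalar_helpers,
                     enum helper table[MODE_MAX][2])
{
  for (int m = 0; m < MODE_MAX; m++)
    table[m][RELOAD_STORE] = table[m][RELOAD_LOAD] = HELPER_NONE;

  if (!scalar_helpers)
    return;

  if (ptr64)
    {
      table[MODE_DF][RELOAD_STORE] = HELPER_DF_DI_STORE;
      table[MODE_DF][RELOAD_LOAD] = HELPER_DF_DI_LOAD;
      table[MODE_DD][RELOAD_STORE] = HELPER_DD_DI_STORE;
      table[MODE_DD][RELOAD_LOAD] = HELPER_DD_DI_LOAD;
    }
  else
    {
      table[MODE_DF][RELOAD_STORE] = HELPER_DF_SI_STORE;
      table[MODE_DF][RELOAD_LOAD] = HELPER_DF_SI_LOAD;
      table[MODE_DD][RELOAD_STORE] = HELPER_DD_SI_STORE;
      table[MODE_DD][RELOAD_LOAD] = HELPER_DD_SI_LOAD;
    }
}

/* Self-check: with the scalar entries enabled, DD gets the same coverage
   as DF; with them disabled, nothing is registered.  */
void
reload_self_check (void)
{
  enum helper table[MODE_MAX][2];

  init_reload_helpers (true, true, table);
  assert (table[MODE_DD][RELOAD_STORE] == HELPER_DD_DI_STORE);
  assert (table[MODE_DD][RELOAD_LOAD] == HELPER_DD_DI_LOAD);

  init_reload_helpers (false, false, table);
  assert (table[MODE_DD][RELOAD_LOAD] == HELPER_NONE);
}
```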

Like the previous 2 patches, I've bootstrapped this, and ran make check with no
regressions.  Is it ok to apply when GCC 4.9 opens up?

I have one more patch in the insn combination to post, combining movdi on
systems with normal floating point and with the power6 direct move
instructions.

[gcc]
2013-01-30  Michael Meissner  <meissner@linux.vnet.ibm.com>

	* config/rs6000/rs6000.c (rs6000_debug_reg_global): Print out wg
	constraint if -mdebug=reg.
	(rs6000_init_hard_regno_mode_ok): Enable wg constraint if
	-mmfpgpr.  Enable using dd reload support if needed.

	* config/rs6000/dfp.md (movtd): Delete, combine with 128-bit
	binary and decimal floating point moves in rs6000.md.
	(movtd_internal): Likewise.

	* config/rs6000/rs6000.md (FMOVE128): Combine 128-bit binary and
	decimal floating point moves.
	(movtf): Likewise.
	(movtf_internal): Likewise.
	(mov<mode>_internal, TDmode/TFmode): Likewise.
	(movtf_softfloat): Likewise.
	(mov<mode>_softfloat, TDmode/TFmode): Likewise.

[gcc/testsuite]
2013-01-30  Michael Meissner  <meissner@linux.vnet.ibm.com>

	* gcc.target/powerpc/mmfpgpr.c: New test.

Comments

David Edelsohn Feb. 5, 2013, 6:46 p.m. UTC | #1

On Wed, Jan 30, 2013 at 5:50 PM, Michael Meissner
<meissner@linux.vnet.ibm.com> wrote:
> This patch, like the previous 2 patches, combines the decimal and binary floating
> point moves, this time for 128-bit floating point.
>
> In doing this patch, I discovered that I left out the code in the previous
> patch to enable the wg constraint to enable -mcpu=power6x to utilize the direct
> move instructions.  So, I added the code in this patch, and also created a test
> to make sure that direct moves are generated in the future.
>
> I also added the reload helper for DDmode to rs6000_vector_reload that was
> missed in the last patch.  This was harmless, since that is only used with an
> undocumented debug switch.  Hopefully sometime in the future, I will allow scalar
> floating point to be loaded in the upper 32 VSX registers that are
> overlaid over the Altivec registers.
>
> Like the previous 2 patches, I've bootstrapped this, and ran make check with no
> regressions.  Is it ok to apply when GCC 4.9 opens up?
>
> I have one more patch in the insn combination to post, combining movdi on
> systems with normal floating point and with the power6 direct move
> instructions.
>
> [gcc]
> 2013-01-30  Michael Meissner  <meissner@linux.vnet.ibm.com>
>
>         * config/rs6000/rs6000.c (rs6000_debug_reg_global): Print out wg
>         constraint if -mdebug=reg.
>         (rs6000_init_hard_regno_mode_ok): Enable wg constraint if
>         -mmfpgpr.  Enable using dd reload support if needed.
>
>         * config/rs6000/dfp.md (movtd): Delete, combine with 128-bit
>         binary and decimal floating point moves in rs6000.md.
>         (movtd_internal): Likewise.
>
>         * config/rs6000/rs6000.md (FMOVE128): Combine 128-bit binary and
>         decimal floating point moves.
>         (movtf): Likewise.
>         (movtf_internal): Likewise.
>         (mov<mode>_internal, TDmode/TFmode): Likewise.
>         (movtf_softfloat): Likewise.
>         (mov<mode>_softfloat, TDmode/TFmode): Likewise.
>
> [gcc/testsuite]
> 2013-01-30  Michael Meissner  <meissner@linux.vnet.ibm.com>
>
>         * gcc.target/powerpc/mmfpgpr.c: New test.

This patch is okay after 4.9 tree opens.  Again, please confirm that
it works on pre-POWER7 systems.

Thanks, David

David Edelsohn March 8, 2013, 1:45 a.m. UTC | #2

On Wed, Jan 30, 2013 at 5:50 PM, Michael Meissner
<meissner@linux.vnet.ibm.com> wrote:
> This patch, like the previous 2 patches, combines the decimal and binary floating
> point moves, this time for 128-bit floating point.
>
> In doing this patch, I discovered that I left out the code in the previous
> patch to enable the wg constraint to enable -mcpu=power6x to utilize the direct
> move instructions.  So, I added the code in this patch, and also created a test
> to make sure that direct moves are generated in the future.
>
> I also added the reload helper for DDmode to rs6000_vector_reload that was
> missed in the last patch.  This was harmless, since that is only used with an
> undocumented debug switch.  Hopefully sometime in the future, I will allow scalar
> floating point to be loaded in the upper 32 VSX registers that are
> overlaid over the Altivec registers.
>
> Like the previous 2 patches, I've bootstrapped this, and ran make check with no
> regressions.  Is it ok to apply when GCC 4.9 opens up?
>
> I have one more patch in the insn combination to post, combining movdi on
> systems with normal floating point and with the power6 direct move
> instructions.

Mike,

Which of these sets of patches adjusts and updates
rs6000_register_move_cost for -mmfpgpr and for VSRs and FPRs sharing
the same register file?

Thanks, David

Michael Meissner March 11, 2013, 7:04 p.m. UTC | #3

On Thu, Mar 07, 2013 at 08:45:10PM -0500, David Edelsohn wrote:
> On Wed, Jan 30, 2013 at 5:50 PM, Michael Meissner
> <meissner@linux.vnet.ibm.com> wrote:
> > This patch, like the previous 2 patches, combines the decimal and binary floating
> > point moves, this time for 128-bit floating point.
> >
> > In doing this patch, I discovered that I left out the code in the previous
> > patch to enable the wg constraint to enable -mcpu=power6x to utilize the direct
> > move instructions.  So, I added the code in this patch, and also created a test
> > to make sure that direct moves are generated in the future.
> >
> > I also added the reload helper for DDmode to rs6000_vector_reload that was
> > missed in the last patch.  This was harmless, since that is only used with an
> > undocumented debug switch.  Hopefully sometime in the future, I will allow scalar
> > floating point to be loaded in the upper 32 VSX registers that are
> > overlaid over the Altivec registers.
> >
> > Like the previous 2 patches, I've bootstrapped this, and ran make check with no
> > regressions.  Is it ok to apply when GCC 4.9 opens up?
> >
> > I have one more patch in the insn combination to post, combining movdi on
> > systems with normal floating point and with the power6 direct move
> > instructions.
> 
> Mike,
> 
> Which of these sets of patches adjusts and updates
> rs6000_register_move_cost for -mfpgpr and for VSRs and FPRs sharing
> the same register file?

None of these patches adjust register_move_cost.

Index: gcc/config/rs6000/rs6000.c
===================================================================
--- gcc/config/rs6000/rs6000.c	(revision 195586)
+++ gcc/config/rs6000/rs6000.c	(working copy)
@@ -1737,6 +1737,7 @@  rs6000_debug_reg_global (void)
 	   "wa reg_class = %s\n"
 	   "wd reg_class = %s\n"
 	   "wf reg_class = %s\n"
+	   "wg reg_class = %s\n"
 	   "wl reg_class = %s\n"
 	   "ws reg_class = %s\n"
 	   "wx reg_class = %s\n"
@@ -1748,6 +1749,7 @@  rs6000_debug_reg_global (void)
 	   reg_class_names[rs6000_constraints[RS6000_CONSTRAINT_wa]],
 	   reg_class_names[rs6000_constraints[RS6000_CONSTRAINT_wd]],
 	   reg_class_names[rs6000_constraints[RS6000_CONSTRAINT_wf]],
+	   reg_class_names[rs6000_constraints[RS6000_CONSTRAINT_wg]],
 	   reg_class_names[rs6000_constraints[RS6000_CONSTRAINT_wl]],
 	   reg_class_names[rs6000_constraints[RS6000_CONSTRAINT_ws]],
 	   reg_class_names[rs6000_constraints[RS6000_CONSTRAINT_wx]],
@@ -2120,6 +2122,9 @@  rs6000_init_hard_regno_mode_ok (bool glo
   if (TARGET_ALTIVEC)
     rs6000_constraints[RS6000_CONSTRAINT_v] = ALTIVEC_REGS;
 
+  if (TARGET_MFPGPR)
+    rs6000_constraints[RS6000_CONSTRAINT_wg] = FLOAT_REGS;
+
   if (TARGET_LFIWAX)
     rs6000_constraints[RS6000_CONSTRAINT_wl] = FLOAT_REGS;
 
@@ -2150,6 +2155,8 @@  rs6000_init_hard_regno_mode_ok (bool glo
 	    {
 	      rs6000_vector_reload[DFmode][0]  = CODE_FOR_reload_df_di_store;
 	      rs6000_vector_reload[DFmode][1]  = CODE_FOR_reload_df_di_load;
+	      rs6000_vector_reload[DDmode][0]  = CODE_FOR_reload_dd_di_store;
+	      rs6000_vector_reload[DDmode][1]  = CODE_FOR_reload_dd_di_load;
 	    }
 	}
       else
@@ -2170,6 +2177,8 @@  rs6000_init_hard_regno_mode_ok (bool glo
 	    {
 	      rs6000_vector_reload[DFmode][0]  = CODE_FOR_reload_df_si_store;
 	      rs6000_vector_reload[DFmode][1]  = CODE_FOR_reload_df_si_load;
+	      rs6000_vector_reload[DDmode][0]  = CODE_FOR_reload_dd_si_store;
+	      rs6000_vector_reload[DDmode][1]  = CODE_FOR_reload_dd_si_load;
 	    }
 	}
     }
Index: gcc/config/rs6000/dfp.md
===================================================================
--- gcc/config/rs6000/dfp.md	(revision 195590)
+++ gcc/config/rs6000/dfp.md	(working copy)
@@ -144,27 +144,6 @@  (define_insn "*nabstd2_fpr"
   "fnabs %0,%1"
   [(set_attr "type" "fp")])
 
-(define_expand "movtd"
-  [(set (match_operand:TD 0 "general_operand" "")
-	(match_operand:TD 1 "any_operand" ""))]
-  "TARGET_HARD_FLOAT && TARGET_FPRS"
-  "{ rs6000_emit_move (operands[0], operands[1], TDmode); DONE; }")
-
-; It's important to list the Y->r and r->Y moves before r->r because
-; otherwise reload, given m->r, will try to pick r->r and reload it,
-; which doesn't make progress.
-(define_insn_and_split "*movtd_internal"
-  [(set (match_operand:TD 0 "nonimmediate_operand" "=m,d,d,Y,r,r")
-	(match_operand:TD 1 "input_operand"         "d,m,d,r,YGHF,r"))]
-  "TARGET_HARD_FLOAT && TARGET_FPRS
-   && (gpc_reg_operand (operands[0], TDmode)
-       || gpc_reg_operand (operands[1], TDmode))"
-  "#"
-  "&& reload_completed"
-  [(pc)]
-{ rs6000_split_multireg_move (operands[0], operands[1]); DONE; }
-  [(set_attr "length" "8,8,8,20,20,16")])
-
 ;; Hardware support for decimal floating point operations.
 
 (define_insn "extendddtd2"
Index: gcc/config/rs6000/rs6000.md
===================================================================
--- gcc/config/rs6000/rs6000.md	(revision 195590)
+++ gcc/config/rs6000/rs6000.md	(working copy)
@@ -257,6 +257,8 @@  (define_mode_iterator FMA_F [
 (define_mode_iterator FMOVE32 [SF SD])
 (define_mode_iterator FMOVE64 [DF DD])
 (define_mode_iterator FMOVE64X [DI DF DD])
+(define_mode_iterator FMOVE128 [(TF "!TARGET_IEEEQUAD && TARGET_LONG_DOUBLE_128")
+				(TD "TARGET_HARD_FLOAT && TARGET_FPRS")])
 
 ; Whether a floating point move is ok, don't allow SD without hardware FP
 (define_mode_attr fmove_ok [(SF "")
@@ -8148,35 +8150,33 @@  (define_insn "*mov<mode>_softfloat64"
   [(set_attr "type" "store,load,*,mtjmpr,mfjmpr,*,*,*,*")
    (set_attr "length" "4,4,4,4,4,8,12,16,4")])
 
-(define_expand "movtf"
-  [(set (match_operand:TF 0 "general_operand" "")
-	(match_operand:TF 1 "any_operand" ""))]
-  "!TARGET_IEEEQUAD && TARGET_LONG_DOUBLE_128"
-  "{ rs6000_emit_move (operands[0], operands[1], TFmode); DONE; }")
+(define_expand "mov<mode>"
+  [(set (match_operand:FMOVE128 0 "general_operand" "")
+	(match_operand:FMOVE128 1 "any_operand" ""))]
+  ""
+  "{ rs6000_emit_move (operands[0], operands[1], <MODE>mode); DONE; }")
 
 ;; It's important to list Y->r and r->Y before r->r because otherwise
 ;; reload, given m->r, will try to pick r->r and reload it, which
 ;; doesn't make progress.
-(define_insn_and_split "*movtf_internal"
-  [(set (match_operand:TF 0 "nonimmediate_operand" "=m,d,d,Y,r,r")
-	(match_operand:TF 1 "input_operand" "d,m,d,r,YGHF,r"))]
-  "!TARGET_IEEEQUAD
-   && TARGET_HARD_FLOAT && TARGET_FPRS && TARGET_LONG_DOUBLE_128
-   && (gpc_reg_operand (operands[0], TFmode)
-       || gpc_reg_operand (operands[1], TFmode))"
+(define_insn_and_split "*mov<mode>_internal"
+  [(set (match_operand:FMOVE128 0 "nonimmediate_operand" "=m,d,d,Y,r,r")
+	(match_operand:FMOVE128 1 "input_operand" "d,m,d,r,YGHF,r"))]
+  "TARGET_HARD_FLOAT && TARGET_FPRS
+   && (gpc_reg_operand (operands[0], <MODE>mode)
+       || gpc_reg_operand (operands[1], <MODE>mode))"
   "#"
   "&& reload_completed"
   [(pc)]
 { rs6000_split_multireg_move (operands[0], operands[1]); DONE; }
   [(set_attr "length" "8,8,8,20,20,16")])
 
-(define_insn_and_split "*movtf_softfloat"
-  [(set (match_operand:TF 0 "rs6000_nonimmediate_operand" "=Y,r,r")
-	(match_operand:TF 1 "input_operand"         "r,YGHF,r"))]
-  "!TARGET_IEEEQUAD
-   && (TARGET_SOFT_FLOAT || !TARGET_FPRS) && TARGET_LONG_DOUBLE_128
-   && (gpc_reg_operand (operands[0], TFmode)
-       || gpc_reg_operand (operands[1], TFmode))"
+(define_insn_and_split "*mov<mode>_softfloat"
+  [(set (match_operand:FMOVE128 0 "rs6000_nonimmediate_operand" "=Y,r,r")
+	(match_operand:FMOVE128 1 "input_operand" "r,YGHF,r"))]
+  "(TARGET_SOFT_FLOAT || !TARGET_FPRS)
+   && (gpc_reg_operand (operands[0], <MODE>mode)
+       || gpc_reg_operand (operands[1], <MODE>mode))"
   "#"
   "&& reload_completed"
   [(pc)]
Index: gcc/testsuite/gcc.target/powerpc/mmfpgpr.c
===================================================================
--- gcc/testsuite/gcc.target/powerpc/mmfpgpr.c	(revision 0)
+++ gcc/testsuite/gcc.target/powerpc/mmfpgpr.c	(revision 0)
@@ -0,0 +1,22 @@ 
+/* { dg-do compile { target { powerpc*-*-* && lp64 } } } */
+/* { dg-skip-if "" { powerpc*-*-darwin* } { "*" } { "" } } */
+/* { dg-require-effective-target powerpc_vsx_ok } */
+/* { dg-options "-O2 -mcpu=power6x -mmfpgpr" } */
+/* { dg-final { scan-assembler "mffgpr" } } */
+/* { dg-final { scan-assembler "mftgpr" } } */
+
+/* Test that we generate the instructions to move between the GPR and FPR
+   registers under power6x.  */
+
+extern long return_long (void);
+extern double return_double (void);
+
+double return_double2 (void)
+{
+  return (double) return_long ();
+}
+
+long return_long2 (void)
+{
+  return (long) return_double ();
+}
