GCC 4.9 powerpc, merge DF/DD moves

Message ID 20130130204802.GA17728@ibm-tiger.the-meissners.org
State New

Commit Message

Michael Meissner Jan. 30, 2013, 8:48 p.m. UTC
This patch builds upon the patch in:
http://gcc.gnu.org/ml/gcc-patches/2013-01/msg01457.html

It merges the DFmode and DDmode moves into one pattern.  In addition, it merges
the -mmfpgpr code with the normal floating point moves, using a new conditional
constraint (wg) that only returns FLOAT_REGS for the power6x.
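
As a rough illustration of the conditional constraint idea (a simplified
sketch, not code from the patch): the constraint is resolved once from the
target options to either FLOAT_REGS or NO_REGS, and an alternative that uses
it is only usable in the former case.  The names constraint_class,
init_constraints and wg_alternative_enabled are hypothetical stand-ins for
the rs6000_constraints[] table and its initialization.

```c
#include <stdbool.h>

/* Simplified stand-ins for GCC's register classes (hypothetical subset).  */
enum reg_class { NO_REGS, FLOAT_REGS };

/* Hypothetical index for the new constraint; it plays the role of
   RS6000_CONSTRAINT_wg in the real table.  */
enum constraint_index { CONSTRAINT_wg, CONSTRAINT_MAX };

/* One register class per conditional constraint; defaults to NO_REGS.  */
static enum reg_class constraint_class[CONSTRAINT_MAX];

/* Resolve the conditional constraint once, from the target options.  */
void
init_constraints (bool target_mfpgpr)
{
  constraint_class[CONSTRAINT_wg] = target_mfpgpr ? FLOAT_REGS : NO_REGS;
}

/* An alternative written with "wg" can only match when the constraint
   resolved to a real register class.  */
bool
wg_alternative_enabled (void)
{
  return constraint_class[CONSTRAINT_wg] != NO_REGS;
}
```

With this shape, a single move pattern can list the mftgpr/mffgpr
alternatives unconditionally; they simply never match unless -mmfpgpr is in
effect.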

I have tested this via bootstrap, and there were no regressions.  Is this patch
acceptable to check in when the 4.9 tree opens up?

2013-01-30  Michael Meissner  <meissner@linux.vnet.ibm.com>

	* config/rs6000/constraints.md (wg constraint): New constraint to
	return FLOAT_REGS if -mmfpgpr (power6x) was used.

	* config/rs6000/rs6000.h (enum r6000_reg_class_enum): Add wg
	constraint.

	* config/rs6000/rs6000.c (rs6000_debug_reg_global): If
	-mdebug=reg, print wg, wl, wx, and wz constraints.
	(rs6000_init_hard_regno_mode_ok): Initialize new constraints.
	Initialize the reload functions for 64-bit binary/decimal floating
	point types.
	(reg_offset_addressing_ok_p): If we are on a power7 or later, use
	LFIWZX and STFIWX to load/store 32-bit decimal types, and don't
	create the buffer on the stack to overcome not having a 32-bit
	load and store.
	(rs6000_emit_move): Likewise.
	(rs6000_secondary_memory_needed_rtx): Likewise.
	(rs6000_alloc_sdmode_stack_slot): Likewise.
	(rs6000_preferred_reload_class): On VSX, we can create SFmode 0.0f
	via xxlxor, just like DFmode 0.0.


	* config/rs6000/dfp.md (movdd): Delete, combine with binary
	floating point moves in rs6000.md.  Combine power6x (mfpgpr) moves
	with other moves by using conditional constraints (wg).  Use LFIWZX
	and STFIWX for loading SDmode on power7.
	(movdd splitters): Likewise.
	(movdd_hardfloat32): Likewise.
	(movdd_softfloat32): Likewise.
	(movdd_hardfloat64_mfpgpr): Likewise.
	(movdd_hardfloat64): Likewise.
	(movdd_softfloat64): Likewise.

	* config/rs6000/rs6000.md (FMOVE64): New iterators to combine
	64-bit binary and decimal floating point moves.
	(FMOVE64X): Likewise.
	(movdf): Combine 64-bit binary and decimal floating point moves.
	Combine power6x (mfpgpr) moves with other moves by using
	conditional constraints (wg).
	(mov<mode> for DFmode/DDmode): Likewise.
	(DFmode/DDmode splitters): Likewise.
	(movdf_hardfloat32): Likewise.
	(mov<mode>_hardfloat32 for DFmode/DDmode): Likewise.
	(movdf_softfloat32): Likewise.
	(movdf_hardfloat64_mfpgpr): Likewise.
	(movdf_hardfloat64): Likewise.
	(mov<mode>_hardfloat64 for DFmode/DDmode): Likewise.
	(movdf_softfloat64): Likewise.
	(mov<mode>_softfloat64 for DFmode/DDmode): Likewise.
	(reload_<mode>_load): Move to later in the file so they aren't in
	the middle of the floating point move insns.
	(reload_<mode>_store): Likewise.

	* doc/md.texi (PowerPC and IBM RS6000 constraints): Document wg
	constraint.
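
For readers unfamiliar with mode iterators, the effect of FMOVE64 on pattern
names can be pictured with a toy substitution (an illustrative sketch only;
GCC's generator programs do this expansion quite differently, and
expand_mode is a hypothetical helper): one template such as
mov<mode>_hardfloat64 yields one pattern per mode in the iterator.

```c
#include <stdio.h>
#include <string.h>

/* Replace the first "<mode>" in TMPL with MODE and write the result to OUT.
   Returns 0 on success, -1 if the placeholder is missing or OUT is too
   small.  */
int
expand_mode (const char *tmpl, const char *mode, char *out, size_t out_size)
{
  const char *tag = "<mode>";
  const char *hit = strstr (tmpl, tag);
  int n;

  if (hit == NULL)
    return -1;

  /* Text before the tag, then the mode name, then text after the tag.  */
  n = snprintf (out, out_size, "%.*s%s%s",
		(int) (hit - tmpl), tmpl, mode, hit + strlen (tag));
  return (n >= 0 && (size_t) n < out_size) ? 0 : -1;
}
```

Calling expand_mode once with "df" and once with "dd" on the template
"mov<mode>_hardfloat64" gives the two names that previously needed separate
patterns in rs6000.md and dfp.md.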

Comments

David Edelsohn Feb. 5, 2013, 6:45 p.m. UTC | #1
On Wed, Jan 30, 2013 at 3:48 PM, Michael Meissner
<meissner@linux.vnet.ibm.com> wrote:
> This patch builds upon the patch in:
> http://gcc.gnu.org/ml/gcc-patches/2013-01/msg01457.html
>
> It merges the DFmode and DDmode moves into one pattern.  In addition, it merges
> the -mmfpgpr code with the normal floating point moves, using a new conditional
> constraint (wg) that only returns FLOAT_REGS for the power6x.
>
> I have tested this via bootstrap, and there were no regressions.  Is this patch
> acceptable to check in when the 4.9 tree opens up?
>
> 2013-01-30  Michael Meissner  <meissner@linux.vnet.ibm.com>
>
>         * config/rs6000/constraints.md (wg constraint): New constraint to
>         return FLOAT_REGS if -mmfpgpr (power6x) was used.
>
>         * config/rs6000/rs6000.h (enum r6000_reg_class_enum): Add wg
>         constraint.
>
>         * config/rs6000/rs6000.c (rs6000_debug_reg_global): If
>         -mdebug=reg, print wg, wl, wx, and wz constraints.
>         (rs6000_init_hard_regno_mode_ok): Initialize new constraints.
>         Initialize the reload functions for 64-bit binary/decimal floating
>         point types.
>         (reg_offset_addressing_ok_p): If we are on a power7 or later, use
>         LFIWZX and STFIWX to load/store 32-bit decimal types, and don't
>         create the buffer on the stack to overcome not having a 32-bit
>         load and store.
>         (rs6000_emit_move): Likewise.
>         (rs6000_secondary_memory_needed_rtx): Likewise.
>         (rs6000_alloc_sdmode_stack_slot): Likewise.
>         (rs6000_preferred_reload_class): On VSX, we can create SFmode 0.0f
>         via xxlxor, just like DFmode 0.0.
>
>
>         * config/rs6000/dfp.md (movdd): Delete, combine with binary
>         floating point moves in rs6000.md.  Combine power6x (mfpgpr) moves
>         with other moves by using conditional constraints (wg).  Use LFIWZX
>         and STFIWX for loading SDmode on power7.
>         (movdd splitters): Likewise.
>         (movdd_hardfloat32): Likewise.
>         (movdd_softfloat32): Likewise.
>         (movdd_hardfloat64_mfpgpr): Likewise.
>         (movdd_hardfloat64): Likewise.
>         (movdd_softfloat64): Likewise.
>
>         * config/rs6000/rs6000.md (FMOVE64): New iterators to combine
>         64-bit binary and decimal floating point moves.
>         (FMOVE64X): Likewise.
>         (movdf): Combine 64-bit binary and decimal floating point moves.
>         Combine power6x (mfpgpr) moves with other moves by using
>         conditional constraints (wg).
>         (mov<mode> for DFmode/DDmode): Likewise.
>         (DFmode/DDmode splitters): Likewise.
>         (movdf_hardfloat32): Likewise.
>         (mov<mode>_hardfloat32 for DFmode/DDmode): Likewise.
>         (movdf_softfloat32): Likewise.
>         (movdf_hardfloat64_mfpgpr): Likewise.
>         (movdf_hardfloat64): Likewise.
>         (mov<mode>_hardfloat64 for DFmode/DDmode): Likewise.
>         (movdf_softfloat64): Likewise.
>         (mov<mode>_softfloat64 for DFmode/DDmode): Likewise.
>         (reload_<mode>_load): Move to later in the file so they aren't in
>         the middle of the floating point move insns.
>         (reload_<mode>_store): Likewise.
>
>         * doc/md.texi (PowerPC and IBM RS6000 constraints): Document wg
>         constraint.

This patch is okay when the 4.9 tree opens, in combination with the parts
you included in the TF/TD patch.

Again, please confirm that it still works on pre-POWER7 systems.

Thanks, David
Patch

Index: gcc/config/rs6000/constraints.md
===================================================================
--- gcc/config/rs6000/constraints.md	(revision 195586)
+++ gcc/config/rs6000/constraints.md	(working copy)
@@ -69,6 +69,9 @@  (define_register_constraint "wa" "rs6000
   "@internal")
 
 ;; Register constraints to simplify move patterns
+(define_register_constraint "wg" "rs6000_constraints[RS6000_CONSTRAINT_wg]"
+  "Floating point register if -mmfpgpr is used, or NO_REGS.")
+
 (define_register_constraint "wl" "rs6000_constraints[RS6000_CONSTRAINT_wl]"
   "Floating point register if the LFIWAX instruction is enabled or NO_REGS.")
 
Index: gcc/config/rs6000/rs6000.h
===================================================================
--- gcc/config/rs6000/rs6000.h	(revision 195586)
+++ gcc/config/rs6000/rs6000.h	(working copy)
@@ -1324,6 +1324,7 @@  enum r6000_reg_class_enum {
   RS6000_CONSTRAINT_v,		/* Altivec registers */
   RS6000_CONSTRAINT_wa,		/* Any VSX register */
   RS6000_CONSTRAINT_wd,		/* VSX register for V2DF */
+  RS6000_CONSTRAINT_wg,		/* FPR register for -mmfpgpr */
   RS6000_CONSTRAINT_wf,		/* VSX register for V4SF */
   RS6000_CONSTRAINT_wl,		/* FPR register for LFIWAX */
   RS6000_CONSTRAINT_ws,		/* VSX register for DF */
Index: gcc/config/rs6000/dfp.md
===================================================================
--- gcc/config/rs6000/dfp.md	(revision 195586)
+++ gcc/config/rs6000/dfp.md	(working copy)
@@ -111,211 +111,6 @@  (define_insn "*nabsdd2_fpr"
   "fnabs %0,%1"
   [(set_attr "type" "fp")])
 
-(define_expand "movdd"
-  [(set (match_operand:DD 0 "nonimmediate_operand" "")
-	(match_operand:DD 1 "any_operand" ""))]
-  ""
-  "{ rs6000_emit_move (operands[0], operands[1], DDmode); DONE; }")
-
-(define_split
-  [(set (match_operand:DD 0 "gpc_reg_operand" "")
-	(match_operand:DD 1 "const_int_operand" ""))]
-  "! TARGET_POWERPC64 && reload_completed
-   && ((GET_CODE (operands[0]) == REG && REGNO (operands[0]) <= 31)
-       || (GET_CODE (operands[0]) == SUBREG
-	   && GET_CODE (SUBREG_REG (operands[0])) == REG
-	   && REGNO (SUBREG_REG (operands[0])) <= 31))"
-  [(set (match_dup 2) (match_dup 4))
-   (set (match_dup 3) (match_dup 1))]
-  "
-{
-  int endian = (WORDS_BIG_ENDIAN == 0);
-  HOST_WIDE_INT value = INTVAL (operands[1]);
-
-  operands[2] = operand_subword (operands[0], endian, 0, DDmode);
-  operands[3] = operand_subword (operands[0], 1 - endian, 0, DDmode);
-#if HOST_BITS_PER_WIDE_INT == 32
-  operands[4] = (value & 0x80000000) ? constm1_rtx : const0_rtx;
-#else
-  operands[4] = GEN_INT (value >> 32);
-  operands[1] = GEN_INT (((value & 0xffffffff) ^ 0x80000000) - 0x80000000);
-#endif
-}")
-
-(define_split
-  [(set (match_operand:DD 0 "gpc_reg_operand" "")
-	(match_operand:DD 1 "const_double_operand" ""))]
-  "! TARGET_POWERPC64 && reload_completed
-   && ((GET_CODE (operands[0]) == REG && REGNO (operands[0]) <= 31)
-       || (GET_CODE (operands[0]) == SUBREG
-	   && GET_CODE (SUBREG_REG (operands[0])) == REG
-	   && REGNO (SUBREG_REG (operands[0])) <= 31))"
-  [(set (match_dup 2) (match_dup 4))
-   (set (match_dup 3) (match_dup 5))]
-  "
-{
-  int endian = (WORDS_BIG_ENDIAN == 0);
-  long l[2];
-  REAL_VALUE_TYPE rv;
-
-  REAL_VALUE_FROM_CONST_DOUBLE (rv, operands[1]);
-  REAL_VALUE_TO_TARGET_DECIMAL64 (rv, l);
-
-  operands[2] = operand_subword (operands[0], endian, 0, DDmode);
-  operands[3] = operand_subword (operands[0], 1 - endian, 0, DDmode);
-  operands[4] = gen_int_mode (l[endian], SImode);
-  operands[5] = gen_int_mode (l[1 - endian], SImode);
-}")
-
-(define_split
-  [(set (match_operand:DD 0 "gpc_reg_operand" "")
-	(match_operand:DD 1 "const_double_operand" ""))]
-  "TARGET_POWERPC64 && reload_completed
-   && ((GET_CODE (operands[0]) == REG && REGNO (operands[0]) <= 31)
-       || (GET_CODE (operands[0]) == SUBREG
-	   && GET_CODE (SUBREG_REG (operands[0])) == REG
-	   && REGNO (SUBREG_REG (operands[0])) <= 31))"
-  [(set (match_dup 2) (match_dup 3))]
-  "
-{
-  int endian = (WORDS_BIG_ENDIAN == 0);
-  long l[2];
-  REAL_VALUE_TYPE rv;
-#if HOST_BITS_PER_WIDE_INT >= 64
-  HOST_WIDE_INT val;
-#endif
-
-  REAL_VALUE_FROM_CONST_DOUBLE (rv, operands[1]);
-  REAL_VALUE_TO_TARGET_DECIMAL64 (rv, l);
-
-  operands[2] = gen_lowpart (DImode, operands[0]);
-  /* HIGHPART is lower memory address when WORDS_BIG_ENDIAN.  */
-#if HOST_BITS_PER_WIDE_INT >= 64
-  val = ((HOST_WIDE_INT)(unsigned long)l[endian] << 32
-	 | ((HOST_WIDE_INT)(unsigned long)l[1 - endian]));
-
-  operands[3] = gen_int_mode (val, DImode);
-#else
-  operands[3] = immed_double_const (l[1 - endian], l[endian], DImode);
-#endif
-}")
-
-;; Don't have reload use general registers to load a constant.  First,
-;; it might not work if the output operand is the equivalent of
-;; a non-offsettable memref, but also it is less efficient than loading
-;; the constant into an FP register, since it will probably be used there.
-;; The "??" is a kludge until we can figure out a more reasonable way
-;; of handling these non-offsettable values.
-(define_insn "*movdd_hardfloat32"
-  [(set (match_operand:DD 0 "nonimmediate_operand" "=!r,??r,m,d,d,m,!r,!r,!r")
-	(match_operand:DD 1 "input_operand" "r,m,r,d,m,d,G,H,F"))]
-  "! TARGET_POWERPC64 && TARGET_HARD_FLOAT && TARGET_FPRS
-   && (gpc_reg_operand (operands[0], DDmode)
-       || gpc_reg_operand (operands[1], DDmode))"
-  "*
-{
-  switch (which_alternative)
-    {
-    default:
-      gcc_unreachable ();
-    case 0:
-    case 1:
-    case 2:
-      return \"#\";
-    case 3:
-      return \"fmr %0,%1\";
-    case 4:
-      return \"lfd%U1%X1 %0,%1\";
-    case 5:
-      return \"stfd%U0%X0 %1,%0\";
-    case 6:
-    case 7:
-    case 8:
-      return \"#\";
-    }
-}"
-  [(set_attr "type" "two,load,store,fp,fpload,fpstore,*,*,*")
-   (set_attr "length" "8,16,16,4,4,4,8,12,16")])
-
-(define_insn "*movdd_softfloat32"
-  [(set (match_operand:DD 0 "nonimmediate_operand" "=r,r,m,r,r,r")
-	(match_operand:DD 1 "input_operand" "r,m,r,G,H,F"))]
-  "! TARGET_POWERPC64 && (TARGET_SOFT_FLOAT || !TARGET_FPRS)
-   && (gpc_reg_operand (operands[0], DDmode)
-       || gpc_reg_operand (operands[1], DDmode))"
-  "#"
-  [(set_attr "type" "two,load,store,*,*,*")
-   (set_attr "length" "8,8,8,8,12,16")])
-
-; ld/std require word-aligned displacements -> 'Y' constraint.
-; List Y->r and r->Y before r->r for reload.
-(define_insn "*movdd_hardfloat64_mfpgpr"
-  [(set (match_operand:DD 0 "nonimmediate_operand" "=Y,r,!r,d,d,m,*c*l,!r,*h,!r,!r,!r,r,d")
-	(match_operand:DD 1 "input_operand" "r,Y,r,d,m,d,r,h,0,G,H,F,d,r"))]
-  "TARGET_POWERPC64 && TARGET_MFPGPR && TARGET_HARD_FLOAT && TARGET_FPRS
-   && (gpc_reg_operand (operands[0], DDmode)
-       || gpc_reg_operand (operands[1], DDmode))"
-  "@
-   std%U0%X0 %1,%0
-   ld%U1%X1 %0,%1
-   mr %0,%1
-   fmr %0,%1
-   lfd%U1%X1 %0,%1
-   stfd%U0%X0 %1,%0
-   mt%0 %1
-   mf%1 %0
-   nop
-   #
-   #
-   #
-   mftgpr %0,%1
-   mffgpr %0,%1"
-  [(set_attr "type" "store,load,*,fp,fpload,fpstore,mtjmpr,mfjmpr,*,*,*,*,mftgpr,mffgpr")
-   (set_attr "length" "4,4,4,4,4,4,4,4,4,8,12,16,4,4")])
-
-; ld/std require word-aligned displacements -> 'Y' constraint.
-; List Y->r and r->Y before r->r for reload.
-(define_insn "*movdd_hardfloat64"
-  [(set (match_operand:DD 0 "nonimmediate_operand" "=Y,r,!r,d,d,m,*c*l,!r,*h,!r,!r,!r")
-	(match_operand:DD 1 "input_operand" "r,Y,r,d,m,d,r,h,0,G,H,F"))]
-  "TARGET_POWERPC64 && !TARGET_MFPGPR && TARGET_HARD_FLOAT && TARGET_FPRS
-   && (gpc_reg_operand (operands[0], DDmode)
-       || gpc_reg_operand (operands[1], DDmode))"
-  "@
-   std%U0%X0 %1,%0
-   ld%U1%X1 %0,%1
-   mr %0,%1
-   fmr %0,%1
-   lfd%U1%X1 %0,%1
-   stfd%U0%X0 %1,%0
-   mt%0 %1
-   mf%1 %0
-   nop
-   #
-   #
-   #"
-  [(set_attr "type" "store,load,*,fp,fpload,fpstore,mtjmpr,mfjmpr,*,*,*,*")
-   (set_attr "length" "4,4,4,4,4,4,4,4,4,8,12,16")])
-
-(define_insn "*movdd_softfloat64"
-  [(set (match_operand:DD 0 "nonimmediate_operand" "=r,Y,r,cl,r,r,r,r,*h")
-	(match_operand:DD 1 "input_operand" "Y,r,r,r,h,G,H,F,0"))]
-  "TARGET_POWERPC64 && (TARGET_SOFT_FLOAT || !TARGET_FPRS)
-   && (gpc_reg_operand (operands[0], DDmode)
-       || gpc_reg_operand (operands[1], DDmode))"
-  "@
-   ld%U1%X1 %0,%1
-   std%U0%X0 %1,%0
-   mr %0,%1
-   mt%0 %1
-   mf%1 %0
-   #
-   #
-   #
-   nop"
-  [(set_attr "type" "load,store,*,mtjmpr,mfjmpr,*,*,*,*")
-   (set_attr "length" "4,4,4,4,4,8,12,16,4")])
-
 (define_expand "negtd2"
   [(set (match_operand:TD 0 "gpc_reg_operand" "")
 	(neg:TD (match_operand:TD 1 "gpc_reg_operand" "")))]
Index: gcc/config/rs6000/rs6000.md
===================================================================
--- gcc/config/rs6000/rs6000.md	(revision 195586)
+++ gcc/config/rs6000/rs6000.md	(working copy)
@@ -255,6 +255,8 @@  (define_mode_iterator FMA_F [
 
 ; Floating point move iterators to combine binary and decimal moves
 (define_mode_iterator FMOVE32 [SF SD])
+(define_mode_iterator FMOVE64 [DF DD])
+(define_mode_iterator FMOVE64X [DI DF DD])
 
 ; Whether a floating point move is ok, don't allow SD without hardware FP
 (define_mode_attr fmove_ok [(SF "")
@@ -7955,15 +7957,16 @@  (define_insn "*mov<mode>_softfloat"
    (set_attr "length" "4,4,4,4,4,4,4,4,8,4")])
 
 
-(define_expand "movdf"
-  [(set (match_operand:DF 0 "nonimmediate_operand" "")
-	(match_operand:DF 1 "any_operand" ""))]
+;; Move 64-bit binary/decimal floating point
+(define_expand "mov<mode>"
+  [(set (match_operand:FMOVE64 0 "nonimmediate_operand" "")
+	(match_operand:FMOVE64 1 "any_operand" ""))]
   ""
-  "{ rs6000_emit_move (operands[0], operands[1], DFmode); DONE; }")
+  "{ rs6000_emit_move (operands[0], operands[1], <MODE>mode); DONE; }")
 
 (define_split
-  [(set (match_operand:DF 0 "gpc_reg_operand" "")
-	(match_operand:DF 1 "const_int_operand" ""))]
+  [(set (match_operand:FMOVE64 0 "gpc_reg_operand" "")
+	(match_operand:FMOVE64 1 "const_int_operand" ""))]
   "! TARGET_POWERPC64 && reload_completed
    && ((GET_CODE (operands[0]) == REG && REGNO (operands[0]) <= 31)
        || (GET_CODE (operands[0]) == SUBREG
@@ -7976,8 +7979,8 @@  (define_split
   int endian = (WORDS_BIG_ENDIAN == 0);
   HOST_WIDE_INT value = INTVAL (operands[1]);
 
-  operands[2] = operand_subword (operands[0], endian, 0, DFmode);
-  operands[3] = operand_subword (operands[0], 1 - endian, 0, DFmode);
+  operands[2] = operand_subword (operands[0], endian, 0, <MODE>mode);
+  operands[3] = operand_subword (operands[0], 1 - endian, 0, <MODE>mode);
 #if HOST_BITS_PER_WIDE_INT == 32
   operands[4] = (value & 0x80000000) ? constm1_rtx : const0_rtx;
 #else
@@ -7987,8 +7990,8 @@  (define_split
 }")
 
 (define_split
-  [(set (match_operand:DF 0 "gpc_reg_operand" "")
-	(match_operand:DF 1 "const_double_operand" ""))]
+  [(set (match_operand:FMOVE64 0 "gpc_reg_operand" "")
+	(match_operand:FMOVE64 1 "const_double_operand" ""))]
   "! TARGET_POWERPC64 && reload_completed
    && ((GET_CODE (operands[0]) == REG && REGNO (operands[0]) <= 31)
        || (GET_CODE (operands[0]) == SUBREG
@@ -8003,17 +8006,17 @@  (define_split
   REAL_VALUE_TYPE rv;
 
   REAL_VALUE_FROM_CONST_DOUBLE (rv, operands[1]);
-  REAL_VALUE_TO_TARGET_DOUBLE (rv, l);
+  <real_value_to_target> (rv, l);
 
-  operands[2] = operand_subword (operands[0], endian, 0, DFmode);
-  operands[3] = operand_subword (operands[0], 1 - endian, 0, DFmode);
+  operands[2] = operand_subword (operands[0], endian, 0, <MODE>mode);
+  operands[3] = operand_subword (operands[0], 1 - endian, 0, <MODE>mode);
   operands[4] = gen_int_mode (l[endian], SImode);
   operands[5] = gen_int_mode (l[1 - endian], SImode);
 }")
 
 (define_split
-  [(set (match_operand:DF 0 "gpc_reg_operand" "")
-	(match_operand:DF 1 "const_double_operand" ""))]
+  [(set (match_operand:FMOVE64 0 "gpc_reg_operand" "")
+	(match_operand:FMOVE64 1 "const_double_operand" ""))]
   "TARGET_POWERPC64 && reload_completed
    && ((GET_CODE (operands[0]) == REG && REGNO (operands[0]) <= 31)
        || (GET_CODE (operands[0]) == SUBREG
@@ -8030,7 +8033,7 @@  (define_split
 #endif
 
   REAL_VALUE_FROM_CONST_DOUBLE (rv, operands[1]);
-  REAL_VALUE_TO_TARGET_DOUBLE (rv, l);
+  <real_value_to_target> (rv, l);
 
   operands[2] = gen_lowpart (DImode, operands[0]);
   /* HIGHPART is lower memory address when WORDS_BIG_ENDIAN.  */
@@ -8055,12 +8058,12 @@  (define_split
 ;; since the D-form version of the memory instructions does not need a GPR for
 ;; reloading.
 
-(define_insn "*movdf_hardfloat32"
-  [(set (match_operand:DF 0 "nonimmediate_operand" "=m,d,d,ws,?wa,Z,?Z,ws,?wa,wa,Y,r,!r,!r,!r,!r")
-	(match_operand:DF 1 "input_operand" "d,m,d,Z,Z,ws,wa,ws,wa,j,r,Y,r,G,H,F"))]
+(define_insn "*mov<mode>_hardfloat32"
+  [(set (match_operand:FMOVE64 0 "nonimmediate_operand" "=m,d,d,ws,?wa,Z,?Z,ws,?wa,wa,Y,r,!r,!r,!r,!r")
+	(match_operand:FMOVE64 1 "input_operand" "d,m,d,Z,Z,ws,wa,ws,wa,j,r,Y,r,G,H,F"))]
   "! TARGET_POWERPC64 && TARGET_HARD_FLOAT && TARGET_FPRS && TARGET_DOUBLE_FLOAT 
-   && (gpc_reg_operand (operands[0], DFmode)
-       || gpc_reg_operand (operands[1], DFmode))"
+   && (gpc_reg_operand (operands[0], <MODE>mode)
+       || gpc_reg_operand (operands[1], <MODE>mode))"
   "@
    stfd%U0%X0 %1,%0
    lfd%U1%X1 %0,%1
@@ -8081,90 +8084,30 @@  (define_insn "*movdf_hardfloat32"
   [(set_attr "type" "fpstore,fpload,fp,fpload,fpload,fpstore,fpstore,vecsimple,vecsimple,vecsimple,store,load,two,fp,fp,*")
    (set_attr "length" "4,4,4,4,4,4,4,4,4,4,8,8,8,8,12,16")])
 
-(define_insn "*movdf_softfloat32"
-  [(set (match_operand:DF 0 "nonimmediate_operand" "=Y,r,r,r,r,r")
-	(match_operand:DF 1 "input_operand" "r,Y,r,G,H,F"))]
+(define_insn "*mov<mode>_softfloat32"
+  [(set (match_operand:FMOVE64 0 "nonimmediate_operand" "=Y,r,r,r,r,r")
+	(match_operand:FMOVE64 1 "input_operand" "r,Y,r,G,H,F"))]
   "! TARGET_POWERPC64 
    && ((TARGET_FPRS && TARGET_SINGLE_FLOAT) 
        || TARGET_SOFT_FLOAT || TARGET_E500_SINGLE)
-   && (gpc_reg_operand (operands[0], DFmode)
-       || gpc_reg_operand (operands[1], DFmode))"
+   && (gpc_reg_operand (operands[0], <MODE>mode)
+       || gpc_reg_operand (operands[1], <MODE>mode))"
   "#"
   [(set_attr "type" "store,load,two,*,*,*")
    (set_attr "length" "8,8,8,8,12,16")])
 
-;; Reload patterns to support gpr load/store with misaligned mem.
-;; and multiple gpr load/store at offset >= 0xfffc
-(define_expand "reload_<mode>_store"
-  [(parallel [(match_operand 0 "memory_operand" "=m")
-              (match_operand 1 "gpc_reg_operand" "r")
-              (match_operand:GPR 2 "register_operand" "=&b")])]
-  ""
-{
-  rs6000_secondary_reload_gpr (operands[1], operands[0], operands[2], true);
-  DONE;
-})
-
-(define_expand "reload_<mode>_load"
-  [(parallel [(match_operand 0 "gpc_reg_operand" "=r")
-              (match_operand 1 "memory_operand" "m")
-              (match_operand:GPR 2 "register_operand" "=b")])]
-  ""
-{
-  rs6000_secondary_reload_gpr (operands[0], operands[1], operands[2], false);
-  DONE;
-})
-
-; ld/std require word-aligned displacements -> 'Y' constraint.
-; List Y->r and r->Y before r->r for reload.
-(define_insn "*movdf_hardfloat64_mfpgpr"
-  [(set (match_operand:DF 0 "nonimmediate_operand" "=Y,r,!r,ws,?wa,ws,?wa,Z,?Z,m,d,d,wa,*c*l,!r,*h,!r,!r,!r,r,d")
-	(match_operand:DF 1 "input_operand" "r,Y,r,ws,?wa,Z,Z,ws,wa,d,m,d,j,r,h,0,G,H,F,d,r"))]
-  "TARGET_POWERPC64 && TARGET_MFPGPR && TARGET_HARD_FLOAT && TARGET_FPRS 
-   && TARGET_DOUBLE_FLOAT
-   && (gpc_reg_operand (operands[0], DFmode)
-       || gpc_reg_operand (operands[1], DFmode))"
-  "@
-   std%U0%X0 %1,%0
-   ld%U1%X1 %0,%1
-   mr %0,%1
-   xxlor %x0,%x1,%x1
-   xxlor %x0,%x1,%x1
-   lxsd%U1x %x0,%y1
-   lxsd%U1x %x0,%y1
-   stxsd%U0x %x1,%y0
-   stxsd%U0x %x1,%y0
-   stfd%U0%X0 %1,%0
-   lfd%U1%X1 %0,%1
-   fmr %0,%1
-   xxlxor %x0,%x0,%x0
-   mt%0 %1
-   mf%1 %0
-   nop
-   #
-   #
-   #
-   mftgpr %0,%1
-   mffgpr %0,%1"
-  [(set_attr "type" "store,load,*,fp,fp,fpload,fpload,fpstore,fpstore,fpstore,fpload,fp,vecsimple,mtjmpr,mfjmpr,*,*,*,*,mftgpr,mffgpr")
-   (set_attr "length" "4,4,4,4,4,4,4,4,4,4,4,4,4,4,4,4,8,12,16,4,4")])
-
 ; ld/std require word-aligned displacements -> 'Y' constraint.
 ; List Y->r and r->Y before r->r for reload.
-(define_insn "*movdf_hardfloat64"
-  [(set (match_operand:DF 0 "nonimmediate_operand" "=m,d,d,Y,r,!r,ws,?wa,Z,?Z,ws,?wa,wa,*c*l,!r,*h,!r,!r,!r")
-	(match_operand:DF 1 "input_operand" "d,m,d,r,Y,r,Z,Z,ws,wa,ws,wa,j,r,h,0,G,H,F"))]
-  "TARGET_POWERPC64 && !TARGET_MFPGPR && TARGET_HARD_FLOAT && TARGET_FPRS 
-   && TARGET_DOUBLE_FLOAT
-   && (gpc_reg_operand (operands[0], DFmode)
-       || gpc_reg_operand (operands[1], DFmode))"
+(define_insn "*mov<mode>_hardfloat64"
+  [(set (match_operand:FMOVE64 0 "nonimmediate_operand" "=m,d,d,ws,?wa,Z,?Z,ws,?wa,wa,Y,r,!r,*c*l,!r,*h,!r,!r,!r,r,wg")
+	(match_operand:FMOVE64 1 "input_operand" "d,m,d,Z,Z,ws,wa,ws,wa,j,r,Y,r,r,h,0,G,H,F,wg,r"))]
+  "TARGET_POWERPC64 && TARGET_HARD_FLOAT && TARGET_FPRS && TARGET_DOUBLE_FLOAT
+   && (gpc_reg_operand (operands[0], <MODE>mode)
+       || gpc_reg_operand (operands[1], <MODE>mode))"
   "@
    stfd%U0%X0 %1,%0
    lfd%U1%X1 %0,%1
    fmr %0,%1
-   std%U0%X0 %1,%0
-   ld%U1%X1 %0,%1
-   mr %0,%1
    lxsd%U1x %x0,%y1
    lxsd%U1x %x0,%y1
    stxsd%U0x %x1,%y0
@@ -8172,21 +8115,26 @@  (define_insn "*movdf_hardfloat64"
    xxlor %x0,%x1,%x1
    xxlor %x0,%x1,%x1
    xxlxor %x0,%x0,%x0
+   std%U0%X0 %1,%0
+   ld%U1%X1 %0,%1
+   mr %0,%1
    mt%0 %1
    mf%1 %0
    nop
    #
    #
-   #"
-  [(set_attr "type" "fpstore,fpload,fp,store,load,*,fpload,fpload,fpstore,fpstore,vecsimple,vecsimple,vecsimple,mtjmpr,mfjmpr,*,*,*,*")
-   (set_attr "length" "4,4,4,4,4,4,4,4,4,4,4,4,4,4,4,4,8,12,16")])
+   #
+   mftgpr %0,%1
+   mffgpr %0,%1"
+  [(set_attr "type" "fpstore,fpload,fp,fpload,fpload,fpstore,fpstore,vecsimple,vecsimple,vecsimple,store,load,*,mtjmpr,mfjmpr,*,*,*,*,mftgpr,mffgpr")
+   (set_attr "length" "4,4,4,4,4,4,4,4,4,4,4,4,4,4,4,4,8,12,16,4,4")])
 
-(define_insn "*movdf_softfloat64"
-  [(set (match_operand:DF 0 "nonimmediate_operand" "=Y,r,r,cl,r,r,r,r,*h")
-	(match_operand:DF 1 "input_operand" "r,Y,r,r,h,G,H,F,0"))]
+(define_insn "*mov<mode>_softfloat64"
+  [(set (match_operand:FMOVE64 0 "nonimmediate_operand" "=Y,r,r,cl,r,r,r,r,*h")
+	(match_operand:FMOVE64 1 "input_operand" "r,Y,r,r,h,G,H,F,0"))]
   "TARGET_POWERPC64 && (TARGET_SOFT_FLOAT || !TARGET_FPRS)
-   && (gpc_reg_operand (operands[0], DFmode)
-       || gpc_reg_operand (operands[1], DFmode))"
+   && (gpc_reg_operand (operands[0], <MODE>mode)
+       || gpc_reg_operand (operands[1], <MODE>mode))"
   "@
    std%U0%X0 %1,%0
    ld%U1%X1 %0,%1
@@ -8513,6 +8461,33 @@  (define_expand "abstf2_internal"
   operands[6] = simplify_gen_subreg (DFmode, operands[0], TFmode, lo_word);
 }")
 
+;; Reload helper functions used by rs6000_secondary_reload.  The patterns all
+;; must have 3 arguments, and scratch register constraint must be a single
+;; constraint.
+
+;; Reload patterns to support gpr load/store with misaligned mem.
+;; and multiple gpr load/store at offset >= 0xfffc
+(define_expand "reload_<mode>_store"
+  [(parallel [(match_operand 0 "memory_operand" "=m")
+              (match_operand 1 "gpc_reg_operand" "r")
+              (match_operand:GPR 2 "register_operand" "=&b")])]
+  ""
+{
+  rs6000_secondary_reload_gpr (operands[1], operands[0], operands[2], true);
+  DONE;
+})
+
+(define_expand "reload_<mode>_load"
+  [(parallel [(match_operand 0 "gpc_reg_operand" "=r")
+              (match_operand 1 "memory_operand" "m")
+              (match_operand:GPR 2 "register_operand" "=b")])]
+  ""
+{
+  rs6000_secondary_reload_gpr (operands[0], operands[1], operands[2], false);
+  DONE;
+})
+
+
 ;; Next come the multi-word integer load and store and the load and store
 ;; multiple insns.
 
Index: gcc/doc/md.texi
===================================================================
--- gcc/doc/md.texi	(revision 195586)
+++ gcc/doc/md.texi	(working copy)
@@ -2075,6 +2075,9 @@  VSX vector register to hold vector doubl
 @item wf
 VSX vector register to hold vector float data
 
+@item wg
+If @option{-mmfpgpr} was used, a floating point register
+
 @item wl
 If the LFIWAX instruction is enabled, a floating point register