
PowerPC: Allow DImode in Altivec registers

Message ID 20160613182941.GA3747@ibm-tiger.the-meissners.org
State New

Commit Message

Michael Meissner June 13, 2016, 6:29 p.m. UTC
It would help if I included the patch.

On Mon, Jun 13, 2016 at 01:28:16PM -0400, Michael Meissner wrote:
> This patch goes through the PowerPC compiler and adds support to allow DImode
> (64-bit integers) into Altivec registers for VSX systems.  It also adds some
> support to allow loading some DImode constants via either ISA 2.07 or ISA 3.0
> instructions.
> 
> I have bootstrapped this with no regressions on both a big endian power7 system
> and a little endian power8 system.
> 
> I have run the Spec 2006 INT tests with these changes, and the run times were
> comparable between the original compiler and the compiler with the changes.
> 
> Are these changes ok to install in the trunk?  Assuming they go in the trunk,
> can I install them in the 6.2 branch if they cause no regression?
> 
> Note, I will be away from the office, starting Thursday afternoon (June 16th,
> 2016) and I will return on Monday (June 20th, 2016).  I will not have easy
> access to email during this time.

[gcc]
2016-06-13  Michael Meissner  <meissner@linux.vnet.ibm.com>

	* config/rs6000/vsx.md (VSINT_84): Add DImode to enable loading
	DImode constants with XXSPLTIB in vector registers.
	(vsx_extract_<mode>, V2DImode/V2DFmode): Combine both
	vsx_extract_<mode>_internal{1,2} into a single insn that handles
	direct move (both ISA 2.07 and ISA 3.0 versions), and optimizes
	extraction of the element at the top of the register as a scalar
	value.
	(vsx_extract_<mode>_internal1): Likewise.
	(vsx_extract_<mode>_internal2): Likewise.
	* config/rs6000/constraints.md (wi constraint): Remove a comment
	about DImode not being allowed in Altivec registers.
	(wB constraint): New constraint for constants that can be
	generated in Altivec registers with VSPLTISW/VUPKHSW.
	* config/rs6000/predicates.md (xxspltib_constant_split): Update
	comments.
	(xxspltib_constant_nosplit): Likewise.
	* config/rs6000/rs6000-cpus.def (ISA_2_6_MASKS_SERVER): Add
	support for -mupper-regs-di to enable DImode to go into Altivec
	registers.
	(POWERPC_MASKS): Likewise.
	(power7 cpu): Likewise.
	* config/rs6000/rs6000.opt (-mupper-regs-di): Likewise.
	* config/rs6000/rs6000.c (rs6000_hard_regno_mode_ok): Add support
	for DImode being allowed in Altivec registers.  Update wi/wj
	constraints.  Set scalar_in_vmx_p flag.
	(rs6000_option_override_internal): Add checks for -mupper-regs-di.
	(xxspltib_constant_p): Allow CONST_INT's with VOIDmode.  Don't
	return true if we could use VSPLTISW/VUPKHSW instead of XXSPLTIB.
	(rs6000_opt_masks): Add -mupper-regs-di.
	* config/rs6000/rs6000.md (lfiwax): Update clobbers that don't use
	direct move to use wi and now wj.
	(lfiwzx): Likewise.
	(floatsi<mode>2_lfiwax_mem): Combine alternatives into a single
	alternative.
	(floatunssi<mode>2_lfiwzx_mem): Likewise.
	(fix_trunc<mode>di2_fctidz): Change second alternative to allow
	any VSX register, instead of just Altivec registers, to allow
	either operand to be an Altivec register or both.
	(fixuns_trunc<mode>di2_fctiduz): Likewise.
	(movdi_internal32): Add support for -mupper-regs-di.  Add support
	to load constants via XXSPLTIB or VSPLTISW.  Add spacing to allow
	the alternatives and attributes to be lined up to be easier to
	read.
	(movdi_internal64): Likewise.
	(64-bit DImode splitters): Change predicates to only split loading
	up GPR registers.  Add splits for using XXSPLTIB or VSPLTISW to
	load constants in ISA 3.0 or ISA 2.07 respectively.
	* doc/invoke.texi (RS/6000 and PowerPC Options): Document
	-mupper-regs-di.  Update -mupper-regs-df and -mupper-regs-sf to
	mention -mcpu=power9 sets these options.
	* doc/md.texi (PowerPC and IBM RS6000 constraints): Document the
	wB constraint.
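
For readers following the vsx_extract_<mode> entry above, here is a minimal sketch of how the combined insn picks the DM field for xxpermdi when the element is not already at the top of the register. It mirrors the fldDM arithmetic in the patch; the function name and the boolean parameter are hypothetical stand-ins (the real code reads BYTES_BIG_ENDIAN), so treat it as an illustration rather than GCC code.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical helper, not part of GCC: compute the 2-bit DM field used in
   "xxpermdi %x0,%x1,%x1,DM" when extracting ELEMENT (0 or 1) from a
   V2DI/V2DF register.  BYTES_BIG_ENDIAN is modeled as a parameter.  */
static int
xxpermdi_dm_for_extract (int element, bool bytes_big_endian)
{
  assert (element == 0 || element == 1);

  /* Same arithmetic as the patch: shift the element number into the high
     bit of DM, then mirror it for little endian.  */
  int dm = element << 1;
  if (!bytes_big_endian)
    dm = 3 - dm;

  return dm;
}
```

With this model, big endian maps elements 0 and 1 to DM values 0 and 2, while little endian maps them to 3 and 1.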

[gcc/testsuite]
2016-06-13  Michael Meissner  <meissner@linux.vnet.ibm.com>

	* gcc.target/powerpc/p9-dimode1.c: New test.
	* gcc.target/powerpc/p9-dimode2.c: Likewise.

Comments

Segher Boessenkool June 14, 2016, 10:53 p.m. UTC | #1
On Mon, Jun 13, 2016 at 02:29:41PM -0400, Michael Meissner wrote:
> It would help if I included the patch.

:-)

> > Are these changes ok to install in the trunk?  Assuming they go in the trunk,
> > can I install them in the 6.2 branch if they cause no regression?

Okay for trunk.  Okay for 6 after a week.

> > Note, I will be away from the office, starting Thursday afternoon (June 16th,
> > 2016) and I will return on Monday (June 20th, 2016).  I will not have easy
> > access to email during this time.

If big problems show up, we can always revert the patch ;-)

A few things...

> 	* config/rs6000/rs6000.md (lfiwax): Update clobbers that don't use
> 	direct move to use wi and now wj.

s/now/not/

> +;; wB needs ISA 2.07 VUPKHSW
> +(define_constraint "wB"
> +  "Signed 5-bit constant integer that can be loaded into an altivec register."
> +  (and (match_code "const_int")
> +       (and (match_test "TARGET_P8_VECTOR")
> +	    (match_operand 0 "s5bit_cint_operand"))))

"and" takes as many operands as you want, i.e.

+  (and (match_code "const_int")
+       (match_test "TARGET_P8_VECTOR")
+       (match_operand 0 "s5bit_cint_operand")))

>  (define_insn "*movdi_internal32"
> -  [(set (match_operand:DI 0 "rs6000_nonimmediate_operand" "=Y,r,r,?m,?*d,?*d,r")
> -	(match_operand:DI 1 "input_operand" "r,Y,r,d,m,d,IJKnGHF"))]
> +  [(set (match_operand:DI 0 "rs6000_nonimmediate_operand"
> +         "=Y,        r,         r,         ?m,        ?*d,        ?*d,
> +          r,         ?Y,        ?Z,        ?*wb,      ?*wv,       ?wi,
> +          ?wo,       ?wo,       ?wv,       ?wi,       ?wi,        ?wv,
> +          ?wv")
> +
> +	(match_operand:DI 1 "input_operand"
> +          "r,        Y,         r,         d,         m,          d,
> +           IJKnGHF,  wb,        wv,        Y,         Z,          wi,

"n" includes "IJK" already?

>  ; Some DImode loads are best done as a load of -1 followed by a mask
>  ; instruction.
>  (define_split
> -  [(set (match_operand:DI 0 "gpc_reg_operand")
> +  [(set (match_operand:DI 0 "int_reg_operand_not_pseudo")

Not sure what this is for...  If you want to say this split is only to
be done after RA, just say that explicitly in the split condition (i.e.
"reload_completed").  Or does this mean something else?


Segher
Michael Meissner June 15, 2016, 6:24 p.m. UTC | #2
On Tue, Jun 14, 2016 at 05:53:46PM -0500, Segher Boessenkool wrote:
> On Mon, Jun 13, 2016 at 02:29:41PM -0400, Michael Meissner wrote:
> > It would help if I included the patch.
> 
> :-)
> 
> > > Are these changes ok to install in the trunk?  Assuming they go in the trunk,
> > > can I install them in the 6.2 branch if they cause no regression?
> 
> Okay for trunk.  Okay for 6 after a week.
> 
> > > Note, I will be away from the office, starting Thursday afternoon (June 16th,
> > > 2016) and I will return on Monday (June 20th, 2016).  I will not have easy
> > > access to email during this time.
> 
> If big problems show up, we can always revert the patch ;-)
> 
> A few things...
> 
> > 	* config/rs6000/rs6000.md (lfiwax): Update clobbers that don't use
> > 	direct move to use wi and now wj.
> 
> s/now/not/

Ok.

> > +;; wB needs ISA 2.07 VUPKHSW
> > +(define_constraint "wB"
> > +  "Signed 5-bit constant integer that can be loaded into an altivec register."
> > +  (and (match_code "const_int")
> > +       (and (match_test "TARGET_P8_VECTOR")
> > +	    (match_operand 0 "s5bit_cint_operand"))))
> 
> "and" takes as many operands as you want, i.e.

Ok, useful to know for the future.

> +  (and (match_code "const_int")
> +       (match_test "TARGET_P8_VECTOR")
> +       (match_operand 0 "s5bit_cint_operand")))
> 
> >  (define_insn "*movdi_internal32"
> > -  [(set (match_operand:DI 0 "rs6000_nonimmediate_operand" "=Y,r,r,?m,?*d,?*d,r")
> > -	(match_operand:DI 1 "input_operand" "r,Y,r,d,m,d,IJKnGHF"))]
> > +  [(set (match_operand:DI 0 "rs6000_nonimmediate_operand"
> > +         "=Y,        r,         r,         ?m,        ?*d,        ?*d,
> > +          r,         ?Y,        ?Z,        ?*wb,      ?*wv,       ?wi,
> > +          ?wo,       ?wo,       ?wv,       ?wi,       ?wi,        ?wv,
> > +          ?wv")
> > +
> > +	(match_operand:DI 1 "input_operand"
> > +          "r,        Y,         r,         d,         m,          d,
> > +           IJKnGHF,  wb,        wv,        Y,         Z,          wi,
> 
> "n" includes "IJK" already?

In this case, I merely copied the existing code before formatting it.

> >  ; Some DImode loads are best done as a load of -1 followed by a mask
> >  ; instruction.
> >  (define_split
> > -  [(set (match_operand:DI 0 "gpc_reg_operand")
> > +  [(set (match_operand:DI 0 "int_reg_operand_not_pseudo")
> 
> Not sure what this is for...  If you want to say this split is only to
> be done after RA, just say that explicitly in the split condition (i.e.
> "reload_completed").  Or does this mean something else?

This is so that constants being loaded into the vector registers aren't split
(they are handled via different define_splits).  Previously, the only constant
that was loaded in vector registers was 0.

The int_reg_operand_not_pseudo allows the split to take place if it has already
gotten hard registers before register allocation.  It could have been the
normal int_reg_operand and then use a reload_completed check.
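
To make the difference between the two options concrete, here is a toy model. It assumes int_reg_operand accepts pseudos and hard GPRs, and that int_reg_operand_not_pseudo accepts only hard GPRs. The register numbering, the struct, and the function names are all invented for illustration and do not match GCC's real data structures.

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative register numbering only; not GCC's real layout.  */
enum { FIRST_GPR = 0, LAST_GPR = 31, FIRST_PSEUDO = 1000 };

struct toy_reg { int regno; };

static bool
toy_is_hard_gpr (struct toy_reg r)
{
  return r.regno >= FIRST_GPR && r.regno <= LAST_GPR;
}

static bool
toy_is_pseudo (struct toy_reg r)
{
  return r.regno >= FIRST_PSEUDO;
}

/* Option A: a "not pseudo" style predicate and no extra condition.  The
   split may fire as soon as the destination is a hard GPR.  */
static bool
split_ok_not_pseudo (struct toy_reg dest, bool reload_completed)
{
  (void) reload_completed;	/* Deliberately ignored in this option.  */
  return toy_is_hard_gpr (dest);
}

/* Option B: a predicate that also accepts pseudos, combined with an
   explicit reload_completed test in the split condition.  */
static bool
split_ok_after_reload (struct toy_reg dest, bool reload_completed)
{
  return reload_completed && (toy_is_hard_gpr (dest) || toy_is_pseudo (dest));
}
```

The two options differ only for a hard GPR seen before register allocation (arguments, return values, explicit registers): option A splits early, option B waits. Neither splits when the destination is a vector register, which is the point of the change.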
Segher Boessenkool June 15, 2016, 7:51 p.m. UTC | #3
On Wed, Jun 15, 2016 at 02:24:41PM -0400, Michael Meissner wrote:
> > >  ; Some DImode loads are best done as a load of -1 followed by a mask
> > >  ; instruction.
> > >  (define_split
> > > -  [(set (match_operand:DI 0 "gpc_reg_operand")
> > > +  [(set (match_operand:DI 0 "int_reg_operand_not_pseudo")
> > 
> > Not sure what this is for...  If you want to say this split is only to
> > be done after RA, just say that explicitly in the split condition (i.e.
> > "reload_completed").  Or does this mean something else?
> 
> This is so that constants being loaded into the vector registers aren't split
> (they are handled via different define_splits).  Previously, the only constant
> that was loaded in vector registers was 0.
> 
> The int_reg_operand_not_pseudo allows the split to take place if it has already
> gotten hard registers before register allocation.

When does that happen?

> It could have been the
> normal int_reg_operand and then use a reload_completed check.

That is preferred if it makes no difference (otherwise, before you know
it we'll have twice as many predicates).


Segher
Michael Meissner June 15, 2016, 9:12 p.m. UTC | #4
On Wed, Jun 15, 2016 at 02:51:20PM -0500, Segher Boessenkool wrote:
> On Wed, Jun 15, 2016 at 02:24:41PM -0400, Michael Meissner wrote:
> > > >  ; Some DImode loads are best done as a load of -1 followed by a mask
> > > >  ; instruction.
> > > >  (define_split
> > > > -  [(set (match_operand:DI 0 "gpc_reg_operand")
> > > > +  [(set (match_operand:DI 0 "int_reg_operand_not_pseudo")
> > > 
> > > Not sure what this is for...  If you want to say this split is only to
> > > be done after RA, just say that explicitly in the split condition (i.e.
> > > "reload_completed").  Or does this mean something else?
> > 
> > This is so that constants being loaded into the vector registers aren't split
> > (they are handled via different define_splits).  Previously, the only constant
> > that was loaded in vector registers was 0.
> > 
> > The int_reg_operand_not_pseudo allows the split to take place if it has already
> > gotten hard registers before register allocation.
> 
> When does that happen?

Using arguments, function returns, and of course explicit registers, but I
agree it is fairly low.

> > It could have been the
> > normal int_reg_operand and then use a reload_completed check.
> 
> That is preferred if it makes no difference (otherwise, before you know
> it we'll have twice as many predicates).

We already had the predicate for another use, so I wasn't adding a new one.

Patch

Index: gcc/config/rs6000/vsx.md
===================================================================
--- gcc/config/rs6000/vsx.md	(.../svn+ssh://meissner@gcc.gnu.org/svn/gcc/trunk/gcc/config/rs6000)	(revision 237222)
+++ gcc/config/rs6000/vsx.md	(.../gcc/config/rs6000)	(working copy)
@@ -260,7 +260,7 @@  (define_mode_attr VS_64reg [(V2DF	"ws")
 			    (V2DI	"wi")])
 
 ;; Iterators for loading constants with xxspltib
-(define_mode_iterator VSINT_84  [V4SI V2DI])
+(define_mode_iterator VSINT_84  [V4SI V2DI DI])
 (define_mode_iterator VSINT_842 [V8HI V4SI V2DI])
 
 ;; Constants for creating unspecs
@@ -2095,77 +2095,69 @@  (define_insn "vsx_set_<mode>"
   [(set_attr "type" "vecperm")])
 
 ;; Extract a DF/DI element from V2DF/V2DI
-(define_expand "vsx_extract_<mode>"
-  [(set (match_operand:<VS_scalar> 0 "register_operand" "")
-	(vec_select:<VS_scalar> (match_operand:VSX_D 1 "register_operand" "")
-		       (parallel
-			[(match_operand:QI 2 "u5bit_cint_operand" "")])))]
-  "VECTOR_MEM_VSX_P (<MODE>mode)"
-  "")
-
 ;; Optimize cases were we can do a simple or direct move.
 ;; Or see if we can avoid doing the move at all
-(define_insn "*vsx_extract_<mode>_internal1"
-  [(set (match_operand:<VS_scalar> 0 "register_operand" "=d,<VS_64reg>,r,r")
+
+;; There are some unresolved problems with reload that show up if an Altivec
+;; register was picked.  Limit the scalar value to FPRs for now.
+
+(define_insn "vsx_extract_<mode>"
+  [(set (match_operand:<VS_scalar> 0 "gpc_reg_operand"
+            "=d,     wm,      wo,    d")
+
 	(vec_select:<VS_scalar>
-	 (match_operand:VSX_D 1 "register_operand" "d,<VS_64reg>,<VS_64dm>,<VS_64dm>")
+	 (match_operand:VSX_D 1 "gpc_reg_operand"
+            "<VSa>, <VSa>,  <VSa>,  <VSa>")
+
 	 (parallel
-	  [(match_operand:QI 2 "vsx_scalar_64bit" "wD,wD,wD,wL")])))]
-  "VECTOR_MEM_VSX_P (<MODE>mode) && TARGET_POWERPC64 && TARGET_DIRECT_MOVE"
+	  [(match_operand:QI 2 "const_0_to_1_operand"
+            "wD,    wD,     wL,     n")])))]
+  "VECTOR_MEM_VSX_P (<MODE>mode)"
 {
+  int element = INTVAL (operands[2]);
   int op0_regno = REGNO (operands[0]);
   int op1_regno = REGNO (operands[1]);
+  int fldDM;
 
-  if (op0_regno == op1_regno)
-    return "nop";
-
-  if (INT_REGNO_P (op0_regno))
-    return ((INTVAL (operands[2]) == VECTOR_ELEMENT_MFVSRLD_64BIT)
-	    ? "mfvsrdl %0,%x1"
-	    : "mfvsrd %0,%x1");
+  gcc_assert (IN_RANGE (element, 0, 1));
+  gcc_assert (VSX_REGNO_P (op1_regno));
 
-  if (FP_REGNO_P (op0_regno) && FP_REGNO_P (op1_regno))
-    return "fmr %0,%1";
+  if (element == VECTOR_ELEMENT_SCALAR_64BIT)
+    {
+      if (op0_regno == op1_regno)
+	return ASM_COMMENT_START " vec_extract to same register";
 
-  return "xxlor %x0,%x1,%x1";
-}
-  [(set_attr "type" "fp,vecsimple,mftgpr,mftgpr")
-   (set_attr "length" "4")])
+      else if (INT_REGNO_P (op0_regno) && TARGET_DIRECT_MOVE
+	       && TARGET_POWERPC64)
+	return "mfvsrd %0,%x1";
 
-(define_insn "*vsx_extract_<mode>_internal2"
-  [(set (match_operand:<VS_scalar> 0 "vsx_register_operand" "=d,<VS_64reg>,<VS_64reg>")
-	(vec_select:<VS_scalar>
-	 (match_operand:VSX_D 1 "vsx_register_operand" "d,wd,wd")
-	 (parallel [(match_operand:QI 2 "u5bit_cint_operand" "wD,wD,i")])))]
-  "VECTOR_MEM_VSX_P (<MODE>mode)
-   && (!TARGET_POWERPC64 || !TARGET_DIRECT_MOVE
-       || INTVAL (operands[2]) != VECTOR_ELEMENT_SCALAR_64BIT)"
-{
-  int fldDM;
-  gcc_assert (UINTVAL (operands[2]) <= 1);
+      else if (FP_REGNO_P (op0_regno) && FP_REGNO_P (op1_regno))
+	return "fmr %0,%1";
 
-  if (INTVAL (operands[2]) == VECTOR_ELEMENT_SCALAR_64BIT)
-    {
-      int op0_regno = REGNO (operands[0]);
-      int op1_regno = REGNO (operands[1]);
+      else if (VSX_REGNO_P (op0_regno))
+	return "xxlor %x0,%x1,%x1";
 
-      if (op0_regno == op1_regno)
-	return "nop";
+      else
+	gcc_unreachable ();
+    }
 
-      if (FP_REGNO_P (op0_regno) && FP_REGNO_P (op1_regno))
-	return "fmr %0,%1";
+  else if (element == VECTOR_ELEMENT_MFVSRLD_64BIT && INT_REGNO_P (op0_regno)
+	   && TARGET_P9_VECTOR && TARGET_POWERPC64 && TARGET_DIRECT_MOVE)
+    return "mfvsrdl %0,%x1";
 
-      return "xxlor %x0,%x1,%x1";
+  else if (VSX_REGNO_P (op0_regno))
+    {
+      fldDM = element << 1;
+      if (!BYTES_BIG_ENDIAN)
+	fldDM = 3 - fldDM;
+      operands[3] = GEN_INT (fldDM);
+      return "xxpermdi %x0,%x1,%x1,%3";
     }
 
-  fldDM = INTVAL (operands[2]) << 1;
-  if (!BYTES_BIG_ENDIAN)
-    fldDM = 3 - fldDM;
-  operands[3] = GEN_INT (fldDM);
-  return "xxpermdi %x0,%x1,%x1,%3";
+  else
+    gcc_unreachable ();
 }
-  [(set_attr "type" "fp,vecsimple,vecperm")
-   (set_attr "length" "4")])
+  [(set_attr "type" "vecsimple,mftgpr,mftgpr,vecperm")])
 
 ;; Optimize extracting a single scalar element from memory if the scalar is in
 ;; the correct location to use a single load.
Index: gcc/config/rs6000/constraints.md
===================================================================
--- gcc/config/rs6000/constraints.md	(.../svn+ssh://meissner@gcc.gnu.org/svn/gcc/trunk/gcc/config/rs6000)	(revision 237222)
+++ gcc/config/rs6000/constraints.md	(.../gcc/config/rs6000)	(working copy)
@@ -77,8 +77,6 @@  (define_register_constraint "wg" "rs6000
 (define_register_constraint "wh" "rs6000_constraints[RS6000_CONSTRAINT_wh]"
   "Floating point register if direct moves are available, or NO_REGS.")
 
-;; At present, DImode is not allowed in the Altivec registers.  If in the
-;; future it is allowed, wi/wj can be set to VSX_REGS instead of FLOAT_REGS.
 (define_register_constraint "wi" "rs6000_constraints[RS6000_CONSTRAINT_wi]"
   "FP or VSX register to hold 64-bit integers for VSX insns or NO_REGS.")
 
@@ -135,6 +133,13 @@  (define_register_constraint "wy" "rs6000
 (define_register_constraint "wz" "rs6000_constraints[RS6000_CONSTRAINT_wz]"
   "Floating point register if the LFIWZX instruction is enabled or NO_REGS.")
 
+;; wB needs ISA 2.07 VUPKHSW
+(define_constraint "wB"
+  "Signed 5-bit constant integer that can be loaded into an altivec register."
+  (and (match_code "const_int")
+       (and (match_test "TARGET_P8_VECTOR")
+	    (match_operand 0 "s5bit_cint_operand"))))
+
 (define_constraint "wD"
   "Int constant that is the element number of the 64-bit scalar in a vector."
   (and (match_code "const_int")
Index: gcc/config/rs6000/predicates.md
===================================================================
--- gcc/config/rs6000/predicates.md	(.../svn+ssh://meissner@gcc.gnu.org/svn/gcc/trunk/gcc/config/rs6000)	(revision 237222)
+++ gcc/config/rs6000/predicates.md	(.../gcc/config/rs6000)	(working copy)
@@ -565,9 +565,8 @@  (define_predicate "easy_fp_constant"
     }
 })
 
-;; Return 1 if the operand is a CONST_VECTOR or VEC_DUPLICATE of a constant
-;; that can loaded with a XXSPLTIB instruction and then a VUPKHSB, VECSB2W or
-;; VECSB2D instruction.
+;; Return 1 if the operand is a constant that can be loaded with a XXSPLTIB
+;; instruction and then a VUPKHSB, VECSB2W or VECSB2D instruction.
 
 (define_predicate "xxspltib_constant_split"
   (match_code "const_vector,vec_duplicate,const_int")
@@ -582,8 +581,8 @@  (define_predicate "xxspltib_constant_spl
 })
 
 
-;; Return 1 if the operand is a CONST_VECTOR that can loaded directly with a
-;; XXSPLTIB instruction.
+;; Return 1 if the operand is a constant that can be loaded directly with a
+;; XXSPLTIB instruction.
 
 (define_predicate "xxspltib_constant_nosplit"
   (match_code "const_vector,vec_duplicate,const_int")
Index: gcc/config/rs6000/rs6000-cpus.def
===================================================================
--- gcc/config/rs6000/rs6000-cpus.def	(.../svn+ssh://meissner@gcc.gnu.org/svn/gcc/trunk/gcc/config/rs6000)	(revision 237222)
+++ gcc/config/rs6000/rs6000-cpus.def	(.../gcc/config/rs6000)	(working copy)
@@ -45,6 +45,7 @@ 
 				 | OPTION_MASK_POPCNTD			\
 				 | OPTION_MASK_ALTIVEC			\
 				 | OPTION_MASK_VSX			\
+				 | OPTION_MASK_UPPER_REGS_DI		\
 				 | OPTION_MASK_UPPER_REGS_DF)
 
 /* For now, don't provide an embedded version of ISA 2.07.  */
@@ -119,6 +120,7 @@ 
 				 | OPTION_MASK_SOFT_FLOAT		\
 				 | OPTION_MASK_STRICT_ALIGN_OPTIONAL	\
 				 | OPTION_MASK_TOC_FUSION		\
+				 | OPTION_MASK_UPPER_REGS_DI		\
 				 | OPTION_MASK_UPPER_REGS_DF		\
 				 | OPTION_MASK_UPPER_REGS_SF		\
 				 | OPTION_MASK_VSX			\
@@ -211,7 +213,8 @@  RS6000_CPU ("power6x", PROCESSOR_POWER6,
 RS6000_CPU ("power7", PROCESSOR_POWER7,   /* Don't add MASK_ISEL by default */
 	    POWERPC_7400_MASK | MASK_POWERPC64 | MASK_PPC_GPOPT | MASK_MFCRF
 	    | MASK_POPCNTB | MASK_FPRND | MASK_CMPB | MASK_DFP | MASK_POPCNTD
-	    | MASK_VSX | MASK_RECIP_PRECISION | OPTION_MASK_UPPER_REGS_DF)
+	    | MASK_VSX | MASK_RECIP_PRECISION | OPTION_MASK_UPPER_REGS_DF
+	    | OPTION_MASK_UPPER_REGS_DI)
 RS6000_CPU ("power8", PROCESSOR_POWER8, MASK_POWERPC64 | ISA_2_7_MASKS_SERVER)
 RS6000_CPU ("power9", PROCESSOR_POWER9, MASK_POWERPC64 | ISA_3_0_MASKS_SERVER)
 RS6000_CPU ("powerpc", PROCESSOR_POWERPC, 0)
Index: gcc/config/rs6000/rs6000.opt
===================================================================
--- gcc/config/rs6000/rs6000.opt	(.../svn+ssh://meissner@gcc.gnu.org/svn/gcc/trunk/gcc/config/rs6000)	(revision 237222)
+++ gcc/config/rs6000/rs6000.opt	(.../gcc/config/rs6000)	(working copy)
@@ -597,6 +597,10 @@  mupper-regs
 Target Report Var(TARGET_UPPER_REGS) Init(-1) Save
 Allow float/double variables in upper registers if cpu allows it.
 
+mupper-regs-di
+Target Report Mask(UPPER_REGS_DI) Var(rs6000_isa_flags)
+Allow 64-bit integer variables in upper registers with -mcpu=power7 or -mvsx.
+
 moptimize-swaps
 Target Undocumented Var(rs6000_optimize_swaps) Init(1) Save
 Analyze and remove doubleword swaps from VSX computations.
Index: gcc/config/rs6000/rs6000.c
===================================================================
--- gcc/config/rs6000/rs6000.c	(.../svn+ssh://meissner@gcc.gnu.org/svn/gcc/trunk/gcc/config/rs6000)	(revision 237222)
+++ gcc/config/rs6000/rs6000.c	(.../gcc/config/rs6000)	(working copy)
@@ -1938,7 +1938,8 @@  rs6000_hard_regno_mode_ok (int regno, ma
 	  || FLOAT128_VECTOR_P (mode)
 	  || reg_addr[mode].scalar_in_vmx_p
 	  || (TARGET_VSX_TIMODE && mode == TImode)
-	  || (TARGET_VADDUQM && mode == V1TImode)))
+	  || (TARGET_VADDUQM && mode == V1TImode)
+	  || (TARGET_UPPER_REGS_DI && mode == DImode)))
     {
       if (FP_REGNO_P (regno))
 	return FP_REGNO_P (last_regno);
@@ -3082,7 +3083,6 @@  rs6000_init_hard_regno_mode_ok (bool glo
       rs6000_constraints[RS6000_CONSTRAINT_wa] = VSX_REGS;
       rs6000_constraints[RS6000_CONSTRAINT_wd] = VSX_REGS;	/* V2DFmode  */
       rs6000_constraints[RS6000_CONSTRAINT_wf] = VSX_REGS;	/* V4SFmode  */
-      rs6000_constraints[RS6000_CONSTRAINT_wi] = FLOAT_REGS;	/* DImode  */
 
       if (TARGET_VSX_TIMODE)
 	rs6000_constraints[RS6000_CONSTRAINT_wt] = VSX_REGS;	/* TImode  */
@@ -3094,6 +3094,11 @@  rs6000_init_hard_regno_mode_ok (bool glo
 	}
       else
 	rs6000_constraints[RS6000_CONSTRAINT_ws] = FLOAT_REGS;
+
+      if (TARGET_UPPER_REGS_DI)					/* DImode  */
+	rs6000_constraints[RS6000_CONSTRAINT_wi] = VSX_REGS;
+      else
+	rs6000_constraints[RS6000_CONSTRAINT_wi] = FLOAT_REGS;
     }
 
   /* Add conditional constraints based on various options, to allow us to
@@ -3306,6 +3311,9 @@  rs6000_init_hard_regno_mode_ok (bool glo
       if (TARGET_UPPER_REGS_DF)
 	reg_addr[DFmode].scalar_in_vmx_p = true;
 
+      if (TARGET_UPPER_REGS_DI)
+	reg_addr[DImode].scalar_in_vmx_p = true;
+
       if (TARGET_UPPER_REGS_SF)
 	reg_addr[SFmode].scalar_in_vmx_p = true;
     }
@@ -4085,9 +4093,9 @@  rs6000_option_override_internal (bool gl
       rs6000_isa_flags &= ~OPTION_MASK_DFP;
     }
 
-  /* Allow an explicit -mupper-regs to set both -mupper-regs-df and
-     -mupper-regs-sf, depending on the cpu, unless the user explicitly also set
-     the individual option.  */
+  /* Allow an explicit -mupper-regs to set -mupper-regs-df, -mupper-regs-di,
+     and -mupper-regs-sf, depending on the cpu, unless the user explicitly also
+     set the individual option.  */
   if (TARGET_UPPER_REGS > 0)
     {
       if (TARGET_VSX
@@ -4096,6 +4104,12 @@  rs6000_option_override_internal (bool gl
 	  rs6000_isa_flags |= OPTION_MASK_UPPER_REGS_DF;
 	  rs6000_isa_flags_explicit |= OPTION_MASK_UPPER_REGS_DF;
 	}
+      if (TARGET_VSX
+	  && !(rs6000_isa_flags_explicit & OPTION_MASK_UPPER_REGS_DI))
+	{
+	  rs6000_isa_flags |= OPTION_MASK_UPPER_REGS_DI;
+	  rs6000_isa_flags_explicit |= OPTION_MASK_UPPER_REGS_DI;
+	}
       if (TARGET_P8_VECTOR
 	  && !(rs6000_isa_flags_explicit & OPTION_MASK_UPPER_REGS_SF))
 	{
@@ -4111,6 +4125,12 @@  rs6000_option_override_internal (bool gl
 	  rs6000_isa_flags &= ~OPTION_MASK_UPPER_REGS_DF;
 	  rs6000_isa_flags_explicit |= OPTION_MASK_UPPER_REGS_DF;
 	}
+      if (TARGET_VSX
+	  && !(rs6000_isa_flags_explicit & OPTION_MASK_UPPER_REGS_DI))
+	{
+	  rs6000_isa_flags &= ~OPTION_MASK_UPPER_REGS_DI;
+	  rs6000_isa_flags_explicit |= OPTION_MASK_UPPER_REGS_DI;
+	}
       if (TARGET_P8_VECTOR
 	  && !(rs6000_isa_flags_explicit & OPTION_MASK_UPPER_REGS_SF))
 	{
@@ -4126,6 +4146,13 @@  rs6000_option_override_internal (bool gl
       rs6000_isa_flags &= ~OPTION_MASK_UPPER_REGS_DF;
     }
 
+  if (TARGET_UPPER_REGS_DI && !TARGET_VSX)
+    {
+      if (rs6000_isa_flags_explicit & OPTION_MASK_UPPER_REGS_DI)
+	error ("-mupper-regs-di requires -mvsx");
+      rs6000_isa_flags &= ~OPTION_MASK_UPPER_REGS_DI;
+    }
+
   if (TARGET_UPPER_REGS_SF && !TARGET_P8_VECTOR)
     {
       if (rs6000_isa_flags_explicit & OPTION_MASK_UPPER_REGS_SF)
@@ -4386,6 +4413,7 @@  rs6000_option_override_internal (bool gl
   if (TARGET_FLOAT128_HW
       && (rs6000_isa_flags & (OPTION_MASK_P9_VECTOR
 			      | OPTION_MASK_DIRECT_MOVE
+			      | OPTION_MASK_UPPER_REGS_DI
 			      | OPTION_MASK_UPPER_REGS_DF
 			      | OPTION_MASK_UPPER_REGS_SF)) == 0)
     {
@@ -6284,7 +6312,7 @@  xxspltib_constant_p (rtx op,
   if (mode == VOIDmode)
     mode = GET_MODE (op);
 
-  else if (mode != GET_MODE (op))
+  else if (mode != GET_MODE (op) && GET_MODE (op) != VOIDmode)
     return false;
 
   /* Handle (vec_duplicate <constant>).  */
@@ -6337,8 +6365,8 @@  xxspltib_constant_p (rtx op,
     }
 
   /* Handle integer constants being loaded into the upper part of the VSX
-     register as a scalar.  If the value isn't 0/-1, only allow it if
-     the mode can go in Altivec registers.  */
+     register as a scalar.  If the value isn't 0/-1, only allow it if the mode
+     can go in Altivec registers.  Prefer VSPLTISW/VUPKHSW over XXSPLTIB.  */
   else if (CONST_INT_P (op))
     {
       if (!SCALAR_INT_MODE_P (mode))
@@ -6348,9 +6376,14 @@  xxspltib_constant_p (rtx op,
       if (!IN_RANGE (value, -128, 127))
 	return false;
 
-      if (!IN_RANGE (value, -1, 0)
-	  && (reg_addr[mode].addr_mask[RELOAD_REG_VMX] & RELOAD_REG_VALID) == 0)
-	return false;
+      if (!IN_RANGE (value, -1, 0))
+	{
+	  if (!(reg_addr[mode].addr_mask[RELOAD_REG_VMX] & RELOAD_REG_VALID))
+	    return false;
+
+	  if (EASY_VECTOR_15 (value))
+	    return false;
+	}
     }
 
   else
@@ -35485,6 +35518,7 @@  static struct rs6000_opt_mask const rs60
   { "string",			OPTION_MASK_STRING,		false, true  },
   { "toc-fusion",		OPTION_MASK_TOC_FUSION,		false, true  },
   { "update",			OPTION_MASK_NO_UPDATE,		true , true  },
+  { "upper-regs-di",		OPTION_MASK_UPPER_REGS_DI,	false, true  },
   { "upper-regs-df",		OPTION_MASK_UPPER_REGS_DF,	false, true  },
   { "upper-regs-sf",		OPTION_MASK_UPPER_REGS_SF,	false, true  },
   { "vsx",			OPTION_MASK_VSX,		false, true  },
Index: gcc/config/rs6000/rs6000.md
===================================================================
--- gcc/config/rs6000/rs6000.md	(.../svn+ssh://meissner@gcc.gnu.org/svn/gcc/trunk/gcc/config/rs6000)	(revision 237222)
+++ gcc/config/rs6000/rs6000.md	(.../gcc/config/rs6000)	(working copy)
@@ -4866,7 +4866,7 @@  (define_insn "lfiwax"
 (define_insn_and_split "floatsi<mode>2_lfiwax"
   [(set (match_operand:SFDF 0 "gpc_reg_operand" "=<Fv>")
 	(float:SFDF (match_operand:SI 1 "nonimmediate_operand" "r")))
-   (clobber (match_scratch:DI 2 "=wj"))]
+   (clobber (match_scratch:DI 2 "=wi"))]
   "TARGET_HARD_FLOAT && TARGET_FPRS && TARGET_DOUBLE_FLOAT && TARGET_LFIWAX
    && <SI_CONVERT_FP> && can_create_pseudo_p ()"
   "#"
@@ -4905,11 +4905,11 @@  (define_insn_and_split "floatsi<mode>2_l
    (set_attr "type" "fpload")])
 
 (define_insn_and_split "floatsi<mode>2_lfiwax_mem"
-  [(set (match_operand:SFDF 0 "gpc_reg_operand" "=<Ff>,<Fa>")
+  [(set (match_operand:SFDF 0 "gpc_reg_operand" "=<Fv>")
 	(float:SFDF
 	 (sign_extend:DI
-	  (match_operand:SI 1 "indexed_or_indirect_operand" "Z,Z"))))
-   (clobber (match_scratch:DI 2 "=0,d"))]
+	  (match_operand:SI 1 "indexed_or_indirect_operand" "Z"))))
+   (clobber (match_scratch:DI 2 "=wi"))]
   "TARGET_HARD_FLOAT && TARGET_FPRS && TARGET_DOUBLE_FLOAT && TARGET_LFIWAX
    && <SI_CONVERT_FP>"
   "#"
@@ -4941,7 +4941,7 @@  (define_insn "lfiwzx"
 (define_insn_and_split "floatunssi<mode>2_lfiwzx"
   [(set (match_operand:SFDF 0 "gpc_reg_operand" "=<Fv>")
 	(unsigned_float:SFDF (match_operand:SI 1 "nonimmediate_operand" "r")))
-   (clobber (match_scratch:DI 2 "=wj"))]
+   (clobber (match_scratch:DI 2 "=wi"))]
   "TARGET_HARD_FLOAT && TARGET_FPRS && TARGET_DOUBLE_FLOAT && TARGET_LFIWZX
    && <SI_CONVERT_FP>"
   "#"
@@ -4980,11 +4980,11 @@  (define_insn_and_split "floatunssi<mode>
    (set_attr "type" "fpload")])
 
 (define_insn_and_split "floatunssi<mode>2_lfiwzx_mem"
-  [(set (match_operand:SFDF 0 "gpc_reg_operand" "=<Ff>,<Fa>")
+  [(set (match_operand:SFDF 0 "gpc_reg_operand" "=<Fv>")
 	(unsigned_float:SFDF
 	 (zero_extend:DI
-	  (match_operand:SI 1 "indexed_or_indirect_operand" "Z,Z"))))
-   (clobber (match_scratch:DI 2 "=0,d"))]
+	  (match_operand:SI 1 "indexed_or_indirect_operand" "Z"))))
+   (clobber (match_scratch:DI 2 "=wi"))]
   "TARGET_HARD_FLOAT && TARGET_FPRS && TARGET_DOUBLE_FLOAT && TARGET_LFIWZX
    && <SI_CONVERT_FP>"
   "#"
@@ -5288,7 +5288,7 @@  (define_expand "fix_trunc<mode>di2"
 
 (define_insn "*fix_trunc<mode>di2_fctidz"
   [(set (match_operand:DI 0 "gpc_reg_operand" "=d,wi")
-	(fix:DI (match_operand:SFDF 1 "gpc_reg_operand" "<Ff>,<Fa>")))]
+	(fix:DI (match_operand:SFDF 1 "gpc_reg_operand" "<Ff>,<Fv>")))]
   "TARGET_HARD_FLOAT && TARGET_DOUBLE_FLOAT && TARGET_FPRS
     && TARGET_FCFID"
   "@
@@ -5360,7 +5360,7 @@  (define_expand "fixuns_trunc<mode>di2"
 
 (define_insn "*fixuns_trunc<mode>di2_fctiduz"
   [(set (match_operand:DI 0 "gpc_reg_operand" "=d,wi")
-	(unsigned_fix:DI (match_operand:SFDF 1 "gpc_reg_operand" "<Ff>,<Fa>")))]
+	(unsigned_fix:DI (match_operand:SFDF 1 "gpc_reg_operand" "<Ff>,<Fv>")))]
   "TARGET_HARD_FLOAT && TARGET_DOUBLE_FLOAT && TARGET_FPRS
     && TARGET_FCTIDUZ"
   "@
@@ -7700,9 +7700,25 @@  (define_insn "p8_mfvsrd_4_disf"
 ;; non-offsettable address by using r->r which won't make progress.
 ;; Use of fprs is disparaged slightly otherwise reload prefers to reload
 ;; a gpr into a fpr instead of reloading an invalid 'Y' address
+
+;;        GPR store  GPR load   GPR move   FPR store  FPR load    FPR move
+;;        GPR const  AVX store  AVX store  AVX load   AVX load    VSX move
+;;        P9 0       P9 -1      AVX 0/-1   VSX 0      VSX -1      P9 const
+;;        AVX const  
+
 (define_insn "*movdi_internal32"
-  [(set (match_operand:DI 0 "rs6000_nonimmediate_operand" "=Y,r,r,?m,?*d,?*d,r")
-	(match_operand:DI 1 "input_operand" "r,Y,r,d,m,d,IJKnGHF"))]
+  [(set (match_operand:DI 0 "rs6000_nonimmediate_operand"
+         "=Y,        r,         r,         ?m,        ?*d,        ?*d,
+          r,         ?Y,        ?Z,        ?*wb,      ?*wv,       ?wi,
+          ?wo,       ?wo,       ?wv,       ?wi,       ?wi,        ?wv,
+          ?wv")
+
+	(match_operand:DI 1 "input_operand"
+          "r,        Y,         r,         d,         m,          d,
+           IJKnGHF,  wb,        wv,        Y,         Z,          wi,
+           Oj,       wM,        OjwM,      Oj,        wM,         wS,
+           wB"))]
+
   "! TARGET_POWERPC64
    && (gpc_reg_operand (operands[0], DImode)
        || gpc_reg_operand (operands[1], DImode))"
@@ -7713,8 +7729,24 @@  (define_insn "*movdi_internal32"
    stfd%U0%X0 %1,%0
    lfd%U1%X1 %0,%1
    fmr %0,%1
+   #
+   stxsd %1,%0
+   stxsdx %x1,%y0
+   lxsd %0,%1
+   lxsdx %x0,%y1
+   xxlor %x0,%x1,%x1
+   xxspltib %x0,0
+   xxspltib %x0,255
+   vspltisw %0,%1
+   xxlxor %x0,%x0,%x0
+   xxlorc %x0,%x0,%x0
+   #
    #"
-  [(set_attr "type" "store,load,*,fpstore,fpload,fp,*")])
+  [(set_attr "type"
+               "store,     load,      *,         fpstore,   fpload,     fp,
+                *,         fpstore,   fpstore,   fpload,    fpload,     vecsimple,
+                vecsimple, vecsimple, vecsimple, vecsimple, vecsimple,  vecsimple,
+                vecsimple")])
 
 (define_split
   [(set (match_operand:DI 0 "gpc_reg_operand" "")
@@ -7744,9 +7776,26 @@  (define_split
   [(pc)]
 { rs6000_split_multireg_move (operands[0], operands[1]); DONE; })
 
+;;              GPR store  GPR load   GPR move   GPR li     GPR lis     GPR #
+;;              FPR store  FPR load   FPR move   AVX store  AVX store   AVX load
+;;              AVX load   VSX move   P9 0       P9 -1      AVX 0/-1    VSX 0
+;;              VSX -1     P9 const   AVX const  From SPR   To SPR      SPR<->SPR
+;;              FPR->GPR   GPR->FPR   VSX->GPR   GPR->VSX
 (define_insn "*movdi_internal64"
-  [(set (match_operand:DI 0 "nonimmediate_operand" "=Y,r,r,r,r,r,?m,?*d,?*d,r,*h,*h,r,?*wg,r,?*wj,?*wi")
-	(match_operand:DI 1 "input_operand" "r,Y,r,I,L,nF,d,m,d,*h,r,0,*wg,r,*wj,r,O"))]
+  [(set (match_operand:DI 0 "nonimmediate_operand"
+               "=Y,        r,         r,         r,         r,          r,
+                ?m,        ?*d,       ?*d,       ?Y,        ?Z,         ?*wb,
+                ?*wv,      ?wi,       ?wo,       ?wo,       ?wv,        ?wi,
+                ?wi,       ?wv,       ?wv,       r,         *h,         *h,
+                ?*r,       ?*wg,      ?*r,       ?*wj")
+
+	(match_operand:DI 1 "input_operand"
+                "r,        Y,         r,         I,         L,          nF,
+                 d,        m,         d,         wb,        wv,         Y,
+                 Z,        wi,        Oj,        wM,        OjwM,       Oj,
+                 wM,       wS,        wB,        *h,        r,          0,
+                 wg,       r,         wj,        r"))]
+
   "TARGET_POWERPC64
    && (gpc_reg_operand (operands[0], DImode)
        || gpc_reg_operand (operands[1], DImode))"
@@ -7760,21 +7809,43 @@  (define_insn "*movdi_internal64"
    stfd%U0%X0 %1,%0
    lfd%U1%X1 %0,%1
    fmr %0,%1
+   stxsd %1,%0
+   stxsdx %x1,%y0
+   lxsd %0,%1
+   lxsdx %x0,%y1
+   xxlor %x0,%x1,%x1
+   xxspltib %x0,0
+   xxspltib %x0,255
+   vspltisw %0,%1
+   xxlxor %x0,%x0,%x0
+   xxlorc %x0,%x0,%x0
+   #
+   #
    mf%1 %0
    mt%0 %1
    nop
    mftgpr %0,%1
    mffgpr %0,%1
    mfvsrd %0,%x1
-   mtvsrd %x0,%1
-   xxlxor %x0,%x0,%x0"
-  [(set_attr "type" "store,load,*,*,*,*,fpstore,fpload,fp,mfjmpr,mtjmpr,*,mftgpr,mffgpr,mftgpr,mffgpr,vecsimple")
-   (set_attr "length" "4,4,4,4,4,20,4,4,4,4,4,4,4,4,4,4,4")])
+   mtvsrd %x0,%1"
+  [(set_attr "type"
+               "store,     load,      *,         *,         *,          *,
+                fpstore,   fpload,    fp,        fpstore,   fpstore,    fpload,
+                fpload,    vecsimple, vecsimple, vecsimple, vecsimple,  vecsimple,
+                vecsimple, vecsimple, vecsimple, mfjmpr,    mtjmpr,     *,
+                mftgpr,    mffgpr,    mftgpr,    mffgpr")
+
+   (set_attr "length"
+               "4,         4,         4,         4,         4,          20,
+                4,         4,         4,         4,         4,          4,
+                4,         4,         4,         4,         4,          4,
+                4,         8,         8,         4,         4,          4,
+                4,         4,         4,         4")])
 
 ; Some DImode loads are best done as a load of -1 followed by a mask
 ; instruction.
 (define_split
-  [(set (match_operand:DI 0 "gpc_reg_operand")
+  [(set (match_operand:DI 0 "int_reg_operand_not_pseudo")
 	(match_operand:DI 1 "const_int_operand"))]
   "TARGET_POWERPC64
    && num_insns_constant (operands[1], DImode) > 1
@@ -7791,7 +7862,7 @@  (define_split
 ;; When non-easy constants can go in the TOC, this should use
 ;; easy_fp_constant predicate.
 (define_split
-  [(set (match_operand:DI 0 "gpc_reg_operand" "")
+  [(set (match_operand:DI 0 "int_reg_operand_not_pseudo" "")
 	(match_operand:DI 1 "const_int_operand" ""))]
   "TARGET_POWERPC64 && num_insns_constant (operands[1], DImode) > 1"
   [(set (match_dup 0) (match_dup 2))
@@ -7805,7 +7876,7 @@  (define_split
 }")
 
 (define_split
-  [(set (match_operand:DI 0 "gpc_reg_operand" "")
+  [(set (match_operand:DI 0 "int_reg_operand_not_pseudo" "")
 	(match_operand:DI 1 "const_scalar_int_operand" ""))]
   "TARGET_POWERPC64 && num_insns_constant (operands[1], DImode) > 1"
   [(set (match_dup 0) (match_dup 2))
@@ -7817,6 +7888,43 @@  (define_split
   else
     FAIL;
 }")
+
+(define_split
+  [(set (match_operand:DI 0 "altivec_register_operand" "")
+	(match_operand:DI 1 "s5bit_cint_operand" ""))]
+  "TARGET_UPPER_REGS_DI && TARGET_VSX && reload_completed"
+  [(const_int 0)]
+{
+  rtx op0 = operands[0];
+  rtx op1 = operands[1];
+  int r = REGNO (op0);
+  rtx op0_v4si = gen_rtx_REG (V4SImode, r);
+
+  emit_insn (gen_altivec_vspltisw (op0_v4si, op1));
+  if (op1 != const0_rtx && op1 != constm1_rtx)
+    {
+      rtx op0_v2di = gen_rtx_REG (V2DImode, r);
+      emit_insn (gen_altivec_vupkhsw (op0_v2di, op0_v4si));
+    }
+  DONE;
+})
+
+(define_split
+  [(set (match_operand:DI 0 "altivec_register_operand" "")
+	(match_operand:DI 1 "xxspltib_constant_split" ""))]
+  "TARGET_UPPER_REGS_DI && TARGET_P9_VECTOR && reload_completed"
+  [(const_int 0)]
+{
+  rtx op0 = operands[0];
+  rtx op1 = operands[1];
+  int r = REGNO (op0);
+  rtx op0_v16qi = gen_rtx_REG (V16QImode, r);
+
+  emit_insn (gen_xxspltib_v16qi (op0_v16qi, op1));
+  emit_insn (gen_vsx_sign_extend_qi_di (operands[0], op0_v16qi));
+  DONE;
+})
+
 
 ;; TImode/PTImode is similar, except that we usually want to compute the
 ;; address into a register and use lsi/stsi (the exception is during reload).
Index: gcc/doc/invoke.texi
===================================================================
--- gcc/doc/invoke.texi	(.../svn+ssh://meissner@gcc.gnu.org/svn/gcc/trunk/gcc/doc)	(revision 237222)
+++ gcc/doc/invoke.texi	(.../gcc/doc)	(working copy)
@@ -1005,6 +1005,7 @@  See RS/6000 and PowerPC Options.
 -mquad-memory-atomic -mno-quad-memory-atomic @gol
 -mcompat-align-parm -mno-compat-align-parm @gol
 -mupper-regs-df -mno-upper-regs-df -mupper-regs-sf -mno-upper-regs-sf @gol
+-mupper-regs-di -mno-upper-regs-di @gol
 -mupper-regs -mno-upper-regs -mmodulo -mno-modulo @gol
 -mfloat128 -mno-float128 -mfloat128-hardware -mno-float128-hardware @gol
 -mpower9-fusion -mno-mpower9-fusion -mpower9-vector -mno-power9-vector @gol
@@ -20210,6 +20211,17 @@  Generate code that uses (does not use) t
 instructions.  The @option{-mquad-memory-atomic} option requires use of
 64-bit mode.
 
+@item -mupper-regs-di
+@itemx -mno-upper-regs-di
+@opindex mupper-regs-di
+@opindex mno-upper-regs-di
+Generate code that uses (does not use) the scalar instructions that
+target all 64 registers in the vector/scalar floating point register
+set that were added in version 2.06 of the PowerPC ISA when processing
+integers.  @option{-mupper-regs-di} is turned on by default if you use
+any of the @option{-mcpu=power7}, @option{-mcpu=power8},
+@option{-mcpu=power9}, or @option{-mvsx} options.
+
 @item -mupper-regs-df
 @itemx -mno-upper-regs-df
 @opindex mupper-regs-df
@@ -20218,8 +20230,8 @@  Generate code that uses (does not use) t
 instructions that target all 64 registers in the vector/scalar
 floating point register set that were added in version 2.06 of the
 PowerPC ISA.  @option{-mupper-regs-df} is turned on by default if you
-use any of the @option{-mcpu=power7}, @option{-mcpu=power8}, or
-@option{-mvsx} options.
+use any of the @option{-mcpu=power7}, @option{-mcpu=power8},
+@option{-mcpu=power9}, or @option{-mvsx} options.
 
 @item -mupper-regs-sf
 @itemx -mno-upper-regs-sf
@@ -20229,8 +20241,8 @@  Generate code that uses (does not use) t
 instructions that target all 64 registers in the vector/scalar
 floating point register set that were added in version 2.07 of the
 PowerPC ISA.  @option{-mupper-regs-sf} is turned on by default if you
-use either of the @option{-mcpu=power8} or @option{-mpower8-vector}
-options.
+use any of the @option{-mcpu=power8}, @option{-mpower8-vector}, or
+@option{-mcpu=power9} options.
 
 @item -mupper-regs
 @itemx -mno-upper-regs
Index: gcc/doc/md.texi
===================================================================
--- gcc/doc/md.texi	(.../svn+ssh://meissner@gcc.gnu.org/svn/gcc/trunk/gcc/doc)	(revision 237222)
+++ gcc/doc/md.texi	(.../gcc/doc)	(working copy)
@@ -3211,6 +3211,9 @@  FP or VSX register to perform ISA 2.07 f
 @item wz
 Floating point register if the LFIWZX instruction is enabled or NO_REGS.
 
+@item wB
+Signed 5-bit constant integer that can be loaded into an Altivec register.
+
 @item wD
 Int constant that is the element number of the 64-bit scalar in a vector.
 
Index: gcc/testsuite/gcc.target/powerpc/p9-dimode1.c
===================================================================
--- gcc/testsuite/gcc.target/powerpc/p9-dimode1.c	(.../svn+ssh://meissner@gcc.gnu.org/svn/gcc/trunk/gcc/testsuite/gcc.target/powerpc)	(revision 0)
+++ gcc/testsuite/gcc.target/powerpc/p9-dimode1.c	(.../gcc/testsuite/gcc.target/powerpc)	(revision 237344)
@@ -0,0 +1,50 @@ 
+/* { dg-do compile { target { powerpc64*-*-* && lp64 } } } */
+/* { dg-skip-if "do not override -mcpu" { powerpc*-*-* } { "-mcpu=*" } { "-mcpu=power9" } } */
+/* { dg-require-effective-target powerpc_p9vector_ok } */
+/* { dg-options "-mcpu=power9 -O2 -mupper-regs-di" } */
+
+/* Verify P9 changes to allow DImode into Altivec registers, and generate
+   constants using XXSPLTIB.  */
+
+#ifndef _ARCH_PPC64
+#error "This code is 64-bit."
+#endif
+
+double
+p9_zero (void)
+{
+  long l = 0;
+  double ret;
+
+  __asm__ ("xxlor %x0,%x1,%x1" : "=&d" (ret) : "wi" (l));
+
+  return ret;
+}
+
+double
+p9_plus_1 (void)
+{
+  long l = 1;
+  double ret;
+
+  __asm__ ("xxlor %x0,%x1,%x1" : "=&d" (ret) : "wi" (l));
+
+  return ret;
+}
+
+double
+p9_minus_1 (void)
+{
+  long l = -1;
+  double ret;
+
+  __asm__ ("xxlor %x0,%x1,%x1" : "=&d" (ret) : "wi" (l));
+
+  return ret;
+}
+
+/* { dg-final { scan-assembler     "xxspltib" } } */
+/* { dg-final { scan-assembler-not "mtvsrd"   } } */
+/* { dg-final { scan-assembler-not "lfd"      } } */
+/* { dg-final { scan-assembler-not "ld"       } } */
+/* { dg-final { scan-assembler-not "lxsd"     } } */
Index: gcc/testsuite/gcc.target/powerpc/p9-dimode2.c
===================================================================
--- gcc/testsuite/gcc.target/powerpc/p9-dimode2.c	(.../svn+ssh://meissner@gcc.gnu.org/svn/gcc/trunk/gcc/testsuite/gcc.target/powerpc)	(revision 0)
+++ gcc/testsuite/gcc.target/powerpc/p9-dimode2.c	(.../gcc/testsuite/gcc.target/powerpc)	(revision 237344)
@@ -0,0 +1,27 @@ 
+/* { dg-do compile { target { powerpc64*-*-* && lp64 } } } */
+/* { dg-skip-if "do not override -mcpu" { powerpc*-*-* } { "-mcpu=*" } { "-mcpu=power9" } } */
+/* { dg-require-effective-target powerpc_p9vector_ok } */
+/* { dg-options "-mcpu=power9 -O2 -mupper-regs-di" } */
+
+/* Verify that large integer constants are loaded via direct move instead of
+   being loaded from memory.  */
+
+#ifndef _ARCH_PPC64
+#error "This code is 64-bit."
+#endif
+
+double
+p9_large (void)
+{
+  long l = 0x12345678;
+  double ret;
+
+  __asm__ ("xxlor %x0,%x1,%x1" : "=&d" (ret) : "wi" (l));
+
+  return ret;
+}
+
+/* { dg-final { scan-assembler     "mtvsrd"   } } */
+/* { dg-final { scan-assembler-not "ld"       } } */
+/* { dg-final { scan-assembler-not "lfd"      } } */
+/* { dg-final { scan-assembler-not "lxsd"     } } */
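
For reviewers following the new s5bit_cint_operand define_split in rs6000.md,
here is a rough C model of why the split emits VSPLTISW alone for 0 and -1 but
adds VUPKHSW for every other signed 5-bit constant.  This is only a sketch and
not GCC code: the function names are hypothetical, and the arrays assume
big-endian element order, with the DImode scalar living in the high
doubleword of the register.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical model, not GCC code.  Arrays use big-endian element order:
   index 0 is the most significant element of the 128-bit register.  */

/* vspltisw: sign-extend a 5-bit immediate and copy it to all four words.  */
static void
model_vspltisw (int32_t w[4], int simm5)
{
  assert (simm5 >= -16 && simm5 <= 15);
  for (int i = 0; i < 4; i++)
    w[i] = simm5;
}

/* vupkhsw: sign-extend the two high-order words to two doublewords.  */
static void
model_vupkhsw (int64_t d[2], const int32_t w[4])
{
  d[0] = w[0];
  d[1] = w[1];
}

/* Reinterpret words 0 and 1 as the high doubleword, with no unpack step.  */
static int64_t
raw_high_doubleword (const int32_t w[4])
{
  uint64_t bits = ((uint64_t) (uint32_t) w[0] << 32) | (uint32_t) w[1];
  int64_t result;
  memcpy (&result, &bits, sizeof result);
  return result;
}

/* Value seen in the DImode (high doubleword) slot after the split.  */
int64_t
model_load_di_const (int simm5)
{
  int32_t w[4];
  int64_t d[2];

  model_vspltisw (w, simm5);
  if (simm5 == 0 || simm5 == -1)
    return raw_high_doubleword (w);	/* Splat alone is already correct.  */

  model_vupkhsw (d, w);
  return d[0];
}
```

Under this model, model_load_di_const (5) yields 5, whereas the splat alone
would leave 0x0000000500000005 in the doubleword.  All-zero and all-one bit
patterns are the only splats that already read correctly as a 64-bit value,
which matches the const0_rtx/constm1_rtx check in the split.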