
[1/6,ver,2] rs6000, Update support for vec_extract

Message ID a584a955a976753a1f66a13b27b9cff79538fa76.camel@us.ibm.com
State New
Series Permute Class Operations

Commit Message

Carl Love June 15, 2020, 11:37 p.m. UTC
v2 changes

config/rs6000/altivec.md: log entry for the code moved out of this file
changed as suggested.

config/rs6000/vsx.md: log entry for the code moved into this file
changed as suggested.

define_mode_iterator VI2 was also moved and is included in both
ChangeLog entries.

--------------------------------------------
GCC maintainers:

Move the existing vector extract support in altivec.md to vsx.md
so all of the vector insert and extract support is in the same file.

The patch also updates the name of the builtins and descriptions for the
builtins in the documentation file so they match the approved builtin
names and descriptions.

The patch does not make any functional changes.

Please let me know if the changes are acceptable.  Thanks.

                  Carl Love

------------------------------------------------------

gcc/ChangeLog

2020-06-15  Carl Love  <cel@us.ibm.com>

        * config/rs6000/altivec.md: (UNSPEC_EXTRACTL, UNSPEC_EXTRACTR)
	(vextractl<mode>, vextractr<mode>)
        (vextractl<mode>_internal, vextractr<mode>_internal)
	(VI2): Move to gcc/config/rs6000/vsx.md.
	* config/rs6000/vsx.md:	(UNSPEC_EXTRACTL, UNSPEC_EXTRACTR)
        (vextractl<mode>, vextractr<mode>)
        (vextractl<mode>_internal, vextractr<mode>_internal)
	(VI2): Code was moved from config/rs6000/altivec.md.
	* gcc/doc/extend.texi: Update documentation for vec_extractl.
	Replace builtin name vec_extractr with vec_extracth.  Update description
	of vec_extracth.
---
 gcc/config/rs6000/altivec.md | 64 -------------------------------
 gcc/config/rs6000/vsx.md     | 66 ++++++++++++++++++++++++++++++++
 gcc/doc/extend.texi          | 73 +++++++++++++++++-------------------
 3 files changed, 101 insertions(+), 102 deletions(-)

Comments

will schmidt June 16, 2020, 5:48 p.m. UTC | #1
On Mon, 2020-06-15 at 16:37 -0700, Carl Love via Gcc-patches wrote:
> v2 changes
> 
> config/rs6000/altivec.md log entry for move from changed as
> suggested.
> 
> config/rs6000/vsx.md log entro for moved to here changed as
> suggested.
> 
> define_mode_iterator VI2 also moved, included in both change log
> entries
> 
> --------------------------------------------
> GCC maintainers:
> 
> Move the existing vector extract support in altivec.md to vsx.md
> so all of the vector insert and extract support is in the same file.
> 
> The patch also updates the name of the builtins and descriptions for
> the
> builtins in the documentation file so they match the approved builtin
> names and descriptions.
> 
> The patch does not make any functional changes.
> 
> Please let me know if the changes are acceptable.  Thanks.
> 
>                   Carl Love
> 
> ------------------------------------------------------
> 
> gcc/ChangeLog
> 
> 2020-06-15  Carl Love  <cel@us.ibm.com>
> 
>         * config/rs6000/altivec.md: (UNSPEC_EXTRACTL,
> UNSPEC_EXTRACTR)
> 	(vextractl<mode>, vextractr<mode>)
>         (vextractl<mode>_internal, vextractr<mode>_internal)
> 	(VI2): Move to gcc/config/rs6000/vsx.md.
> 	* config/rs6000/vsx.md:	(UNSPEC_EXTRACTL, UNSPEC_EXTRACTR)
>         (vextractl<mode>, vextractr<mode>)
>         (vextractl<mode>_internal, vextractr<mode>_internal)
> 	(VI2): Code was moved from config/rs6000/altivec.md.

Compare the syntax with other patches that move code.  This can be
simplified to

	* config/rs6000/altivec.md: (UNSPEC_EXTRACTL, UNSPEC_EXTRACTR)
	(vextractl<mode>, vextractr<mode>)
	(vextractl<mode>_internal, vextractr<mode>_internal)
	(VI2): Move to ..
	* config/rs6000/vsx.md: (UNSPEC_EXTRACTL, UNSPEC_EXTRACTR)
	(vextractl<mode>, vextractr<mode>)
	(vextractl<mode>_internal, vextractr<mode>_internal)
	(VI2): .. here.




> 	* gcc/doc/extend.texi: Update documentation for vec_extractl.
> 	Replace builtin name vec_extractr with vec_extracth.  Update description
> 	of vec_extracth.
> ---
>  gcc/config/rs6000/altivec.md | 64 -------------------------------
>  gcc/config/rs6000/vsx.md     | 66 ++++++++++++++++++++++++++++++++
>  gcc/doc/extend.texi          | 73 +++++++++++++++++-------------------
>  3 files changed, 101 insertions(+), 102 deletions(-)
> 
> diff --git a/gcc/config/rs6000/altivec.md b/gcc/config/rs6000/altivec.md
> index 159f24ebc10..0b0b49ee056 100644
> --- a/gcc/config/rs6000/altivec.md
> +++ b/gcc/config/rs6000/altivec.md
> @@ -171,8 +171,6 @@
>     UNSPEC_XXEVAL
>     UNSPEC_VSTRIR
>     UNSPEC_VSTRIL
> -   UNSPEC_EXTRACTL
> -   UNSPEC_EXTRACTR
>  ])
> 
>  (define_c_enum "unspecv"
> @@ -183,8 +181,6 @@
>     UNSPECV_DSS
>    ])
> 
> -;; Like VI, defined in vector.md, but add ISA 2.07 integer vector ops
> -(define_mode_iterator VI2 [V4SI V8HI V16QI V2DI])
>  ;; Short vec int modes
>  (define_mode_iterator VIshort [V8HI V16QI])
>  ;; Longer vec int modes for rotate/mask ops
> @@ -785,66 +781,6 @@
>    DONE;
>  })
> 
> -(define_expand "vextractl<mode>"
> -  [(set (match_operand:V2DI 0 "altivec_register_operand")
> -	(unspec:V2DI [(match_operand:VI2 1 "altivec_register_operand")
> -		      (match_operand:VI2 2 "altivec_register_operand")
> -		      (match_operand:SI 3 "register_operand")]
> -		     UNSPEC_EXTRACTL))]
> -  "TARGET_FUTURE"
> -{
> -  if (BYTES_BIG_ENDIAN)
> -    {
> -      emit_insn (gen_vextractl<mode>_internal (operands[0], operands[1],
> -					       operands[2], operands[3]));
> -      emit_insn (gen_xxswapd_v2di (operands[0], operands[0]));
> -    }
> -  else
> -    emit_insn (gen_vextractr<mode>_internal (operands[0], operands[2],
> -					     operands[1], operands[3]));
> -  DONE;
> -})
> -
> -(define_insn "vextractl<mode>_internal"
> -  [(set (match_operand:V2DI 0 "altivec_register_operand" "=v")
> -	(unspec:V2DI [(match_operand:VEC_I 1 "altivec_register_operand" "v")
> -		      (match_operand:VEC_I 2 "altivec_register_operand" "v")
> -		      (match_operand:SI 3 "register_operand" "r")]
> -		     UNSPEC_EXTRACTL))]
> -  "TARGET_FUTURE"
> -  "vext<du_or_d><wd>vlx %0,%1,%2,%3"
> -  [(set_attr "type" "vecsimple")])
> -
> -(define_expand "vextractr<mode>"
> -  [(set (match_operand:V2DI 0 "altivec_register_operand")
> -	(unspec:V2DI [(match_operand:VI2 1 "altivec_register_operand")
> -		      (match_operand:VI2 2 "altivec_register_operand")
> -		      (match_operand:SI 3 "register_operand")]
> -		     UNSPEC_EXTRACTR))]
> -  "TARGET_FUTURE"
> -{
> -  if (BYTES_BIG_ENDIAN)
> -    {
> -      emit_insn (gen_vextractr<mode>_internal (operands[0], operands[1],
> -					       operands[2], operands[3]));
> -      emit_insn (gen_xxswapd_v2di (operands[0], operands[0]));
> -    }
> -  else
> -    emit_insn (gen_vextractl<mode>_internal (operands[0], operands[2],
> -    					     operands[1], operands[3]));
> -  DONE;
> -})
> -
> -(define_insn "vextractr<mode>_internal"
> -  [(set (match_operand:V2DI 0 "altivec_register_operand" "=v")
> -	(unspec:V2DI [(match_operand:VEC_I 1 "altivec_register_operand" "v")
> -		      (match_operand:VEC_I 2 "altivec_register_operand" "v")
> -		      (match_operand:SI 3 "register_operand" "r")]
> -		     UNSPEC_EXTRACTR))]
> -  "TARGET_FUTURE"
> -  "vext<du_or_d><wd>vrx %0,%1,%2,%3"
> -  [(set_attr "type" "vecsimple")])
> -
>  (define_expand "vstrir_<mode>"
>    [(set (match_operand:VIshort 0 "altivec_register_operand")
>  	(unspec:VIshort [(match_operand:VIshort 1 "altivec_register_operand")]
> diff --git a/gcc/config/rs6000/vsx.md b/gcc/config/rs6000/vsx.md
> index 2a28215ac5b..51ffe2d2000 100644
> --- a/gcc/config/rs6000/vsx.md
> +++ b/gcc/config/rs6000/vsx.md
> @@ -344,8 +344,13 @@
>     UNSPEC_VSX_FIRST_MISMATCH_INDEX
>     UNSPEC_VSX_FIRST_MISMATCH_EOS_INDEX
>     UNSPEC_XXGENPCV
> +   UNSPEC_EXTRACTL
> +   UNSPEC_EXTRACTR
>    ])
> 
> +;; Like VI, defined in vector.md, but add ISA 2.07 integer vector ops
> +(define_mode_iterator VI2 [V4SI V8HI V16QI V2DI])
> +
>  ;; VSX moves
> 
>  ;; The patterns for LE permuted loads and stores come before the general
> @@ -3781,6 +3786,67 @@
>  }
>    [(set_attr "type" "load")])
> 
> +;; ISA 3.1 extract
> +(define_expand "vextractl<mode>"
> +  [(set (match_operand:V2DI 0 "altivec_register_operand")
> +	(unspec:V2DI [(match_operand:VI2 1 "altivec_register_operand")
> +		      (match_operand:VI2 2 "altivec_register_operand")
> +		      (match_operand:SI 3 "register_operand")]
> +		     UNSPEC_EXTRACTL))]
> +  "TARGET_FUTURE"
> +{
> +  if (BYTES_BIG_ENDIAN)
> +    {
> +      emit_insn (gen_vextractl<mode>_internal (operands[0], operands[1],
> +					       operands[2], operands[3]));
> +      emit_insn (gen_xxswapd_v2di (operands[0], operands[0]));
> +    }
> +  else
> +    emit_insn (gen_vextractr<mode>_internal (operands[0], operands[2],
> +					     operands[1], operands[3]));
> +  DONE;
> +})
> +
> +(define_insn "vextractl<mode>_internal"
> +  [(set (match_operand:V2DI 0 "altivec_register_operand" "=v")
> +	(unspec:V2DI [(match_operand:VEC_I 1 "altivec_register_operand" "v")
> +		      (match_operand:VEC_I 2 "altivec_register_operand" "v")
> +		      (match_operand:SI 3 "register_operand" "r")]
> +		     UNSPEC_EXTRACTL))]
> +  "TARGET_FUTURE"
> +  "vext<du_or_d><wd>vlx %0,%1,%2,%3"
> +  [(set_attr "type" "vecsimple")])
> +
> +(define_expand "vextractr<mode>"
> +  [(set (match_operand:V2DI 0 "altivec_register_operand")
> +	(unspec:V2DI [(match_operand:VI2 1 "altivec_register_operand")
> +		      (match_operand:VI2 2 "altivec_register_operand")
> +		      (match_operand:SI 3 "register_operand")]
> +		     UNSPEC_EXTRACTR))]
> +  "TARGET_FUTURE"
> +{
> +  if (BYTES_BIG_ENDIAN)
> +    {
> +      emit_insn (gen_vextractr<mode>_internal (operands[0], operands[1],
> +					       operands[2], operands[3]));
> +      emit_insn (gen_xxswapd_v2di (operands[0], operands[0]));
> +    }
> +  else
> +    emit_insn (gen_vextractl<mode>_internal (operands[0], operands[2],
> +					     operands[1], operands[3]));
> +  DONE;
> +})
> +
> +(define_insn "vextractr<mode>_internal"
> +  [(set (match_operand:V2DI 0 "altivec_register_operand" "=v")
> +	(unspec:V2DI [(match_operand:VEC_I 1 "altivec_register_operand" "v")
> +		      (match_operand:VEC_I 2 "altivec_register_operand" "v")
> +		      (match_operand:SI 3 "register_operand" "r")]
> +		     UNSPEC_EXTRACTR))]
> +  "TARGET_FUTURE"
> +  "vext<du_or_d><wd>vrx %0,%1,%2,%3"
> +  [(set_attr "type" "vecsimple")])
> +
>  ;; VSX_EXTRACT optimizations
>  ;; Optimize double d = (double) vec_extract (vi, <n>)
>  ;; Get the element into the top position and use XVCVSWDP/XVCVUWDP
> diff --git a/gcc/doc/extend.texi b/gcc/doc/extend.texi
> index e656e66a80c..5549a695b42 100644
> --- a/gcc/doc/extend.texi
> +++ b/gcc/doc/extend.texi
> @@ -20919,6 +20919,9 @@ Perform a 128-bit vector gather  operation, as if implemented by the Future
>  integer value between 2 and 7 inclusive.
>  @findex vec_gnb
> 
> +
> +Vector Extract
> +
>  @smallexample
>  @exdent vector unsigned long long int
>  @exdent vec_extractl (vector unsigned char, vector unsigned char, unsigned int)
> @@ -20929,51 +20932,45 @@ integer value between 2 and 7 inclusive.
>  @exdent vector unsigned long long int
>  @exdent vec_extractl (vector unsigned long long, vector unsigned long long, unsigned int)
>  @end smallexample
> -Extract a single element from the vector formed by catenating this function's
> -first two arguments at the byte offset specified by this function's
> -third argument.  On big-endian targets, this function behaves as if
> -implemented by the Future @code{vextdubvlx}, @code{vextduhvlx},
> -@code{vextduwvlx}, or @code{vextddvlx} instructions, depending on the
> -types of the function's first two arguments.  On little-endian
> -targets, this function behaves as if implemented by the Future
> -@code{vextdubvrx}, @code{vextduhvrx},
> -@code{vextduwvrx}, or @code{vextddvrx} instructions.
> -The byte offset of the element to be extracted is calculated
> -by computing the remainder of dividing the third argument by 32.
> -If this reminader value is not a multiple of the vector element size,
> -or if its value added to the vector element size exceeds 32, the
> -result is undefined.
> +Extract an element from two concatenated vectors starting at the given byte index
> +in natural-endian order, and place it zero-extended in doubleword 1 of the result
> +according to natural element order.  If the byte index is out of range for the
> +data type, the intrinsic will be rejected.
> +For little-endian, this output will match the placement by the hardware
> +instruction, i.e., dword[0] in RTL notation.  For big-endian, an additional
> +instruction is needed to move it from the "left" doubleword to the  "right" one.
> +For little-endian, semantics matching the vextdu*vrx instruction will be
> +generated, while for big-endian, semantics matching the vextdu*vlx instruction

The instruction names should be wrapped in @code{}.

After a quick glance I don't see any instruction wildcards in the
existing texi.  I think you should either convert the instruction
references to the long-form descriptions ("Vector extract double
unsigned .. using left-index") or keep them as searchable instruction
names such as @code{vextdubvrx}.
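
As a rough sketch of the second option, the sentence for vec_extractl
could spell out each name the way the text being removed already does.
The wording below is only an illustration that reuses the instruction
names from the old description; it is not proposed final text.  Note
also that a wildcard such as vextdu*vrx does not literally match the
doubleword form vextddvrx, which is one more reason to list the names.

```texinfo
For little-endian, semantics matching the @code{vextdubvrx},
@code{vextduhvrx}, @code{vextduwvrx}, or @code{vextddvrx} instructions
will be generated, while for big-endian, semantics matching the
@code{vextdubvlx}, @code{vextduhvlx}, @code{vextduwvlx}, or
@code{vextddvlx} instructions will be generated.
```

The vec_extracth paragraph would use the same lists with the
little-endian and big-endian cases swapped.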

> +will be generated.  Note that some fairly anomalous results can be generated if
> +the byte index is not aligned on an element boundary for the element being
> +extracted.  This limitation of the bi-endian vector programming model is
> +consistent with the limitation on vec_perm, for example.
>  @findex vec_extractl
> 
>  @smallexample
>  @exdent vector unsigned long long int
> -@exdent vec_extractr (vector unsigned char, vector unsigned char, unsigned int)
> +@exdent vec_extracth (vector unsigned char, vector unsigned char, unsigned int)
>  @exdent vector unsigned long long int
> -@exdent vec_extractr (vector unsigned short, vector unsigned short, unsigned int)
> +@exdent vec_extracth (vector unsigned short, vector unsigned short, unsigned int)
>  @exdent vector unsigned long long int
> -@exdent vec_extractr (vector unsigned int, vector unsigned int, unsigned int)
> +@exdent vec_extracth (vector unsigned int, vector unsigned int, unsigned int)
>  @exdent vector unsigned long long int
> -@exdent vec_extractr (vector unsigned long long, vector unsigned long long, unsigned int)
> -@end smallexample
> -Extract a single element from the vector formed by catenating this function's
> -first two arguments at the byte offset calculated by subtracting this
> -function's third argument from 31.  On big-endian targets, this
> -function behaves as if
> -implemented by the Future
> -@code{vextdubvrx}, @code{vextduhvrx},
> -@code{vextduwvrx}, or @code{vextddvrx} instructions, depending on the
> -types of the function's first two arguments.
> -On little-endian
> -targets, this function behaves as if implemented by the Future
> -@code{vextdubvlx}, @code{vextduhvlx},
> -@code{vextduwvlx}, or @code{vextddvlx} instructions.
> -The byte offset of the element to be extracted, measured from the
> -right end of the catenation of the two vector arguments, is calculated
> -by computing the remainder of dividing the third argument by 32.
> -If this reminader value is not a multiple of the vector element size,
> -or if its value added to the vector element size exceeds 32, the
> -result is undefined.
> -@findex vec_extractr
> +@exdent vec_extracth (vector unsigned long long, vector unsigned long long, unsigned int)
> +@end smallexample
> +Extract an element from two concatenated vectors starting at the given byte index
> +in opposite-endian order, and place it zero-extended in doubleword 1 according to
> +natural element order.  If the byte index is out of range for the data type,
> +the intrinsic will be rejected.  For little-endian,
> +this output will match the placement by the hardware instruction, i.e., dword[0]
> +in RTL notation.  For big-endian, an additional instruction is needed to move it
> +from the "left" doubleword to the "right" one.  For little-endian, semantics
> +matching the vextdu*vlx instruction will be generated, while for big-endian,
> +semantics matching the vextdu*vrx instruction will be generated.  Note that some
> +fairly anomalous results can be generated if the byte index is not aligned on the
> +element boundary for the element being extracted.  This is a
> +limitation of the bi-endian vector programming model consistent with the
> +limitation on vec_perm, for example.
> +@findex vec_extracth
> 
>  @smallexample
>  @exdent vector unsigned long long int
Segher Boessenkool June 23, 2020, 9:50 p.m. UTC | #2
Hi!

On Mon, Jun 15, 2020 at 04:37:47PM -0700, Carl Love wrote:
>         * config/rs6000/altivec.md: (UNSPEC_EXTRACTL, UNSPEC_EXTRACTR)

No colon here.

> 	(vextractl<mode>, vextractr<mode>)
>         (vextractl<mode>_internal, vextractr<mode>_internal)
> 	(VI2): Move to gcc/config/rs6000/vsx.md.

Will explained how you can easily write a changelog entry for moving
code to another file.

If you write <mode> on the left side of the colon in a changelog entry,
it usually helps to say which iterator that is: "<mode> for VI2", for
example.  This is of course more important in more complex cases, say
where you have the exact same names just for different iterators, or
when you have <code> (the name "<code><mode>3" is really not very
enlightening :-) )
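
Putting this together with Will's suggestion, the two entries might look
something like the sketch below.  The placement of the "for VI2" and
"for VEC_I" annotations is an assumption read off the iterators used by
the patterns in the patch, not a fixed rule, so adjust as needed:

```text
	* config/rs6000/altivec.md (UNSPEC_EXTRACTL, UNSPEC_EXTRACTR, VI2)
	(vextractl<mode> for VI2, vextractr<mode> for VI2)
	(vextractl<mode>_internal for VEC_I)
	(vextractr<mode>_internal for VEC_I): Move to ...
	* config/rs6000/vsx.md (UNSPEC_EXTRACTL, UNSPEC_EXTRACTR, VI2)
	(vextractl<mode> for VI2, vextractr<mode> for VI2)
	(vextractl<mode>_internal for VEC_I)
	(vextractr<mode>_internal for VEC_I): ... here.
```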

> 	* config/rs6000/vsx.md:	(UNSPEC_EXTRACTL, UNSPEC_EXTRACTR)

No colon.

>         (vextractl<mode>, vextractr<mode>)
>         (vextractl<mode>_internal, vextractr<mode>_internal)
> 	(VI2): Code was moved from config/rs6000/altivec.md.
> 	* gcc/doc/extend.texi: Update documentation for vec_extractl.
> 	Replace builtin name vec_extractr with vec_extracth.  Update description
> 	of vec_extracth.

The indent is weird in places, I guess that is just a mail issue.

Okay for trunk with those trivialities, and the things Will found, fixed
up.  Thanks!  Just writing out the full instruction names is the easiest
for everyone btw, unless that turns into a huge list, which isn't very
helpful to anyone.


Segher

Patch

diff --git a/gcc/config/rs6000/altivec.md b/gcc/config/rs6000/altivec.md
index 159f24ebc10..0b0b49ee056 100644
--- a/gcc/config/rs6000/altivec.md
+++ b/gcc/config/rs6000/altivec.md
@@ -171,8 +171,6 @@ 
    UNSPEC_XXEVAL
    UNSPEC_VSTRIR
    UNSPEC_VSTRIL
-   UNSPEC_EXTRACTL
-   UNSPEC_EXTRACTR
 ])
 
 (define_c_enum "unspecv"
@@ -183,8 +181,6 @@ 
    UNSPECV_DSS
   ])
 
-;; Like VI, defined in vector.md, but add ISA 2.07 integer vector ops
-(define_mode_iterator VI2 [V4SI V8HI V16QI V2DI])
 ;; Short vec int modes
 (define_mode_iterator VIshort [V8HI V16QI])
 ;; Longer vec int modes for rotate/mask ops
@@ -785,66 +781,6 @@ 
   DONE;
 })
 
-(define_expand "vextractl<mode>"
-  [(set (match_operand:V2DI 0 "altivec_register_operand")
-	(unspec:V2DI [(match_operand:VI2 1 "altivec_register_operand")
-		      (match_operand:VI2 2 "altivec_register_operand")
-		      (match_operand:SI 3 "register_operand")]
-		     UNSPEC_EXTRACTL))]
-  "TARGET_FUTURE"
-{
-  if (BYTES_BIG_ENDIAN)
-    {
-      emit_insn (gen_vextractl<mode>_internal (operands[0], operands[1],
-					       operands[2], operands[3]));
-      emit_insn (gen_xxswapd_v2di (operands[0], operands[0]));
-    }
-  else
-    emit_insn (gen_vextractr<mode>_internal (operands[0], operands[2],
-					     operands[1], operands[3]));
-  DONE;
-})
-
-(define_insn "vextractl<mode>_internal"
-  [(set (match_operand:V2DI 0 "altivec_register_operand" "=v")
-	(unspec:V2DI [(match_operand:VEC_I 1 "altivec_register_operand" "v")
-		      (match_operand:VEC_I 2 "altivec_register_operand" "v")
-		      (match_operand:SI 3 "register_operand" "r")]
-		     UNSPEC_EXTRACTL))]
-  "TARGET_FUTURE"
-  "vext<du_or_d><wd>vlx %0,%1,%2,%3"
-  [(set_attr "type" "vecsimple")])
-
-(define_expand "vextractr<mode>"
-  [(set (match_operand:V2DI 0 "altivec_register_operand")
-	(unspec:V2DI [(match_operand:VI2 1 "altivec_register_operand")
-		      (match_operand:VI2 2 "altivec_register_operand")
-		      (match_operand:SI 3 "register_operand")]
-		     UNSPEC_EXTRACTR))]
-  "TARGET_FUTURE"
-{
-  if (BYTES_BIG_ENDIAN)
-    {
-      emit_insn (gen_vextractr<mode>_internal (operands[0], operands[1],
-					       operands[2], operands[3]));
-      emit_insn (gen_xxswapd_v2di (operands[0], operands[0]));
-    }
-  else
-    emit_insn (gen_vextractl<mode>_internal (operands[0], operands[2],
-    					     operands[1], operands[3]));
-  DONE;
-})
-
-(define_insn "vextractr<mode>_internal"
-  [(set (match_operand:V2DI 0 "altivec_register_operand" "=v")
-	(unspec:V2DI [(match_operand:VEC_I 1 "altivec_register_operand" "v")
-		      (match_operand:VEC_I 2 "altivec_register_operand" "v")
-		      (match_operand:SI 3 "register_operand" "r")]
-		     UNSPEC_EXTRACTR))]
-  "TARGET_FUTURE"
-  "vext<du_or_d><wd>vrx %0,%1,%2,%3"
-  [(set_attr "type" "vecsimple")])
-
 (define_expand "vstrir_<mode>"
   [(set (match_operand:VIshort 0 "altivec_register_operand")
 	(unspec:VIshort [(match_operand:VIshort 1 "altivec_register_operand")]
diff --git a/gcc/config/rs6000/vsx.md b/gcc/config/rs6000/vsx.md
index 2a28215ac5b..51ffe2d2000 100644
--- a/gcc/config/rs6000/vsx.md
+++ b/gcc/config/rs6000/vsx.md
@@ -344,8 +344,13 @@ 
    UNSPEC_VSX_FIRST_MISMATCH_INDEX
    UNSPEC_VSX_FIRST_MISMATCH_EOS_INDEX
    UNSPEC_XXGENPCV
+   UNSPEC_EXTRACTL
+   UNSPEC_EXTRACTR
   ])
 
+;; Like VI, defined in vector.md, but add ISA 2.07 integer vector ops
+(define_mode_iterator VI2 [V4SI V8HI V16QI V2DI])
+
 ;; VSX moves
 
 ;; The patterns for LE permuted loads and stores come before the general
@@ -3781,6 +3786,67 @@ 
 }
   [(set_attr "type" "load")])
 
+;; ISA 3.1 extract
+(define_expand "vextractl<mode>"
+  [(set (match_operand:V2DI 0 "altivec_register_operand")
+	(unspec:V2DI [(match_operand:VI2 1 "altivec_register_operand")
+		      (match_operand:VI2 2 "altivec_register_operand")
+		      (match_operand:SI 3 "register_operand")]
+		     UNSPEC_EXTRACTL))]
+  "TARGET_FUTURE"
+{
+  if (BYTES_BIG_ENDIAN)
+    {
+      emit_insn (gen_vextractl<mode>_internal (operands[0], operands[1],
+					       operands[2], operands[3]));
+      emit_insn (gen_xxswapd_v2di (operands[0], operands[0]));
+    }
+  else
+    emit_insn (gen_vextractr<mode>_internal (operands[0], operands[2],
+					     operands[1], operands[3]));
+  DONE;
+})
+
+(define_insn "vextractl<mode>_internal"
+  [(set (match_operand:V2DI 0 "altivec_register_operand" "=v")
+	(unspec:V2DI [(match_operand:VEC_I 1 "altivec_register_operand" "v")
+		      (match_operand:VEC_I 2 "altivec_register_operand" "v")
+		      (match_operand:SI 3 "register_operand" "r")]
+		     UNSPEC_EXTRACTL))]
+  "TARGET_FUTURE"
+  "vext<du_or_d><wd>vlx %0,%1,%2,%3"
+  [(set_attr "type" "vecsimple")])
+
+(define_expand "vextractr<mode>"
+  [(set (match_operand:V2DI 0 "altivec_register_operand")
+	(unspec:V2DI [(match_operand:VI2 1 "altivec_register_operand")
+		      (match_operand:VI2 2 "altivec_register_operand")
+		      (match_operand:SI 3 "register_operand")]
+		     UNSPEC_EXTRACTR))]
+  "TARGET_FUTURE"
+{
+  if (BYTES_BIG_ENDIAN)
+    {
+      emit_insn (gen_vextractr<mode>_internal (operands[0], operands[1],
+					       operands[2], operands[3]));
+      emit_insn (gen_xxswapd_v2di (operands[0], operands[0]));
+    }
+  else
+    emit_insn (gen_vextractl<mode>_internal (operands[0], operands[2],
+					     operands[1], operands[3]));
+  DONE;
+})
+
+(define_insn "vextractr<mode>_internal"
+  [(set (match_operand:V2DI 0 "altivec_register_operand" "=v")
+	(unspec:V2DI [(match_operand:VEC_I 1 "altivec_register_operand" "v")
+		      (match_operand:VEC_I 2 "altivec_register_operand" "v")
+		      (match_operand:SI 3 "register_operand" "r")]
+		     UNSPEC_EXTRACTR))]
+  "TARGET_FUTURE"
+  "vext<du_or_d><wd>vrx %0,%1,%2,%3"
+  [(set_attr "type" "vecsimple")])
+
 ;; VSX_EXTRACT optimizations
 ;; Optimize double d = (double) vec_extract (vi, <n>)
 ;; Get the element into the top position and use XVCVSWDP/XVCVUWDP
diff --git a/gcc/doc/extend.texi b/gcc/doc/extend.texi
index e656e66a80c..5549a695b42 100644
--- a/gcc/doc/extend.texi
+++ b/gcc/doc/extend.texi
@@ -20919,6 +20919,9 @@  Perform a 128-bit vector gather  operation, as if implemented by the Future
 integer value between 2 and 7 inclusive.
 @findex vec_gnb
 
+
+Vector Extract
+
 @smallexample
 @exdent vector unsigned long long int
 @exdent vec_extractl (vector unsigned char, vector unsigned char, unsigned int)
@@ -20929,51 +20932,45 @@  integer value between 2 and 7 inclusive.
 @exdent vector unsigned long long int
 @exdent vec_extractl (vector unsigned long long, vector unsigned long long, unsigned int)
 @end smallexample
-Extract a single element from the vector formed by catenating this function's
-first two arguments at the byte offset specified by this function's
-third argument.  On big-endian targets, this function behaves as if
-implemented by the Future @code{vextdubvlx}, @code{vextduhvlx},
-@code{vextduwvlx}, or @code{vextddvlx} instructions, depending on the
-types of the function's first two arguments.  On little-endian
-targets, this function behaves as if implemented by the Future
-@code{vextdubvrx}, @code{vextduhvrx},
-@code{vextduwvrx}, or @code{vextddvrx} instructions.
-The byte offset of the element to be extracted is calculated
-by computing the remainder of dividing the third argument by 32.
-If this reminader value is not a multiple of the vector element size,
-or if its value added to the vector element size exceeds 32, the
-result is undefined.
+Extract an element from two concatenated vectors starting at the given byte index
+in natural-endian order, and place it zero-extended in doubleword 1 of the result
+according to natural element order.  If the byte index is out of range for the
+data type, the intrinsic will be rejected.
+For little-endian, this output will match the placement by the hardware
+instruction, i.e., dword[0] in RTL notation.  For big-endian, an additional
+instruction is needed to move it from the "left" doubleword to the  "right" one.
+For little-endian, semantics matching the vextdu*vrx instruction will be
+generated, while for big-endian, semantics matching the vextdu*vlx instruction
+will be generated.  Note that some fairly anomalous results can be generated if
+the byte index is not aligned on an element boundary for the element being
+extracted.  This limitation of the bi-endian vector programming model is
+consistent with the limitation on vec_perm, for example.
 @findex vec_extractl
 
 @smallexample
 @exdent vector unsigned long long int
-@exdent vec_extractr (vector unsigned char, vector unsigned char, unsigned int)
+@exdent vec_extracth (vector unsigned char, vector unsigned char, unsigned int)
 @exdent vector unsigned long long int
-@exdent vec_extractr (vector unsigned short, vector unsigned short, unsigned int)
+@exdent vec_extracth (vector unsigned short, vector unsigned short, unsigned int)
 @exdent vector unsigned long long int
-@exdent vec_extractr (vector unsigned int, vector unsigned int, unsigned int)
+@exdent vec_extracth (vector unsigned int, vector unsigned int, unsigned int)
 @exdent vector unsigned long long int
-@exdent vec_extractr (vector unsigned long long, vector unsigned long long, unsigned int)
-@end smallexample
-Extract a single element from the vector formed by catenating this function's
-first two arguments at the byte offset calculated by subtracting this
-function's third argument from 31.  On big-endian targets, this
-function behaves as if
-implemented by the Future
-@code{vextdubvrx}, @code{vextduhvrx},
-@code{vextduwvrx}, or @code{vextddvrx} instructions, depending on the
-types of the function's first two arguments.
-On little-endian
-targets, this function behaves as if implemented by the Future
-@code{vextdubvlx}, @code{vextduhvlx},
-@code{vextduwvlx}, or @code{vextddvlx} instructions.
-The byte offset of the element to be extracted, measured from the
-right end of the catenation of the two vector arguments, is calculated
-by computing the remainder of dividing the third argument by 32.
-If this reminader value is not a multiple of the vector element size,
-or if its value added to the vector element size exceeds 32, the
-result is undefined.
-@findex vec_extractr
+@exdent vec_extracth (vector unsigned long long, vector unsigned long long, unsigned int)
+@end smallexample
+Extract an element from two concatenated vectors starting at the given byte index
+in opposite-endian order, and place it zero-extended in doubleword 1 according to
+natural element order.  If the byte index is out of range for the data type,
+the intrinsic will be rejected.  For little-endian,
+this output will match the placement by the hardware instruction, i.e., dword[0]
+in RTL notation.  For big-endian, an additional instruction is needed to move it
+from the "left" doubleword to the "right" one.  For little-endian, semantics
+matching the vextdu*vlx instruction will be generated, while for big-endian,
+semantics matching the vextdu*vrx instruction will be generated.  Note that some
+fairly anomalous results can be generated if the byte index is not aligned on the
+element boundary for the element being extracted.  This is a
+limitation of the bi-endian vector programming model consistent with the
+limitation on vec_perm, for example.
+@findex vec_extracth
 
 @smallexample
 @exdent vector unsigned long long int