From patchwork Sun Aug 15 14:30:25 2010
X-Patchwork-Submitter: Artem Shinkarov
X-Patchwork-Id: 61747
From: Artem Shinkarov
Date: Sun, 15 Aug 2010 15:30:25 +0100
Subject: Vector shuffling
To: gcc-patches@gcc.gnu.org
Cc: Richard Guenther

The patch implements vector shuffling according to the OpenCL standard.
It introduces the builtin function __builtin_shuffle, which accepts two or
three parameters:

  __builtin_shuffle (vec, mask)
  __builtin_shuffle (vec0, vec1, mask)

and returns a shuffled vector.  The function first tries to dispatch the
shuffle to hardware-specific shuffling instructions via the new target
hooks.  If this attempt fails, it expands the shuffle piecewise.  (A short
usage example follows the ChangeLog below.)

ChangeLog:

2010-08-15  Artem Shinkarov

gcc/
	* builtins.def (BUILTIN_SHUFFLE): New built-in.
	* c-typeck.c (build_function_call_vec): Typecheck shuffle function,
	change return type.
	* Makefile.in: New include.
	* passes.c: Move lower_vector.
	* target.def: Target hooks for vector shuffle.
	* target.h: New include.
	* targhooks.c: Default expansions.
	* targhooks.h: New declarations.
	* tree.c (build_vector_from_val): Build a vector from a scalar.
	New function.
	* tree.h (build_vector_from_val): New definition.
	* tree-vect-generic.c (vector_element): Return the vector element
	at the specified position.  New function.
	(lower_builtin_shuffle): Builtin expansion.  New function.
	(expand_vector_operations_1): Handle expansion.
	(gate_expand_vector_operations_noop): New gate function.  Change
	gate pass rules.
	* config/i386/i386.c (ix86_vectorize_builtin_shuffle): Expand
	builtin shuffle to hardware-specific instructions or return false
	otherwise.
	(ix86_vectorize_builtin_shuffle2): Expand built-in shuffle with two
	input vectors to hardware-specific instructions or return false
	otherwise.
	(extract_vec_perm_cst): Handle the situation when a VECTOR_CST has
	fewer elements in the list than it should.
	(ix86_vectorize_builtin_vec_perm_ok): Return false instead of an
	error when the mask is invalid.

gcc/testsuite/gcc.c-torture/execute/
	* vect-shuffle-1.c: New test.
	* vect-shuffle-2.c: New test.
	* vect-shuffle-3.c: New test.
	* vect-shuffle-4.c: New test.

gcc/doc/
	* extend.texi: Adjust.
	* tm.texi: Adjust.
	* tm.texi.in: Adjust.

Bootstrapped and tested on x86_64-unknown-linux.
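For quick reference, here is a minimal, self-contained usage sketch.  It
simply restates the example added to extend.texi below and requires a
compiler with this patch applied; the v4si typedef and variable names are
only illustrative.

typedef int v4si __attribute__ ((vector_size (16)));

int
main (void)
{
  v4si a = {1, 2, 3, 4};
  v4si b = {5, 6, 7, 8};
  v4si mask1 = {0, 1, 1, 3};
  v4si mask2 = {0, 4, 2, 5};
  v4si res;

  /* One-vector form: res[i] = a[mask1[i]].  */
  res = __builtin_shuffle (a, mask1);     /* res is {1, 2, 2, 4} */

  /* Two-vector form: mask indices 0..3 select from a, 4..7 from b.  */
  res = __builtin_shuffle (a, b, mask2);  /* res is {1, 5, 3, 6} */

  return 0;
}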
Index: gcc/doc/extend.texi
===================================================================
--- gcc/doc/extend.texi	(revision 163244)
+++ gcc/doc/extend.texi	(working copy)
@@ -6141,6 +6141,32 @@ minus or complement operators on a vecto
 elements are the negative or complemented values of the corresponding
 elements in the operand.
 
+Vector shuffling is available using the functions
+@code{__builtin_shuffle (vec, mask)} and
+@code{__builtin_shuffle (vec0, vec1, mask)}.  Both functions construct
+a permutation of elements from one or two vectors and return a vector
+of the same type as the input vector(s).  The mask is a vector of
+integer-typed elements.  The size of each element of the mask must be
+the same as the size of each input vector element.  The number of
+elements in the input vector(s) and the mask must be the same.
+
+The elements of the input vectors are numbered from left to right across
+one or both of the vectors.  Each element in the mask specifies the index of
+the element to take from the input vector(s).  Consider the following example.
+
+@smallexample
+typedef int v4si __attribute__ ((vector_size (16)));
+
+v4si a = @{1,2,3,4@};
+v4si b = @{5,6,7,8@};
+v4si mask1 = @{0,1,1,3@};
+v4si mask2 = @{0,4,2,5@};
+v4si res;
+
+res = __builtin_shuffle (a, mask1);     /* res is @{1,2,2,4@} */
+res = __builtin_shuffle (a, b, mask2);  /* res is @{1,5,3,6@} */
+@end smallexample
+
 You can declare variables and use them in function calls and returns, as
 well as in assignments and some casts.  You can specify a vector type as
 a return type for a function.  Vector types can also be used as function
Index: gcc/doc/tm.texi
===================================================================
--- gcc/doc/tm.texi	(revision 163244)
+++ gcc/doc/tm.texi	(working copy)
@@ -5710,6 +5710,18 @@ misalignment value (@var{misalign}).
 Return true if vector alignment is reachable (by peeling N iterations) for
 the given type.
 @end deftypefn
+@deftypefn {Target Hook} bool TARGET_VECTORIZE_BUILTIN_SHUFFLE (gimple_stmt_iterator *@var{gsi}, tree @var{vec0}, tree @var{mask})
+This hook should determine whether the vector shuffle can be implemented with
+hardware instructions and, if so, replace the statement pointed to by @var{gsi}
+with the hardware-specific sequence; otherwise it should return @code{false}.
+@end deftypefn
+
+@deftypefn {Target Hook} bool TARGET_VECTORIZE_BUILTIN_SHUFFLE2 (gimple_stmt_iterator *@var{gsi}, tree @var{vec0}, tree @var{vec1}, tree @var{mask})
+This hook should determine whether the vector shuffle can be implemented with
+hardware instructions and, if so, replace the statement pointed to by @var{gsi}
+with the hardware-specific sequence; otherwise it should return @code{false}.
+@end deftypefn
+
 @deftypefn {Target Hook} tree TARGET_VECTORIZE_BUILTIN_VEC_PERM (tree @var{type}, tree *@var{mask_element_type})
 Target builtin that implements vector permute.
 @end deftypefn
Index: gcc/doc/tm.texi.in
===================================================================
--- gcc/doc/tm.texi.in	(revision 163244)
+++ gcc/doc/tm.texi.in	(working copy)
@@ -5710,6 +5710,18 @@ misalignment value (@var{misalign}).
 Return true if vector alignment is reachable (by peeling N iterations) for
 the given type.
 @end deftypefn
+@hook TARGET_VECTORIZE_BUILTIN_SHUFFLE
+This hook should determine whether the vector shuffle can be implemented with
+hardware instructions and, if so, replace the statement pointed to by @var{gsi}
+with the hardware-specific sequence; otherwise it should return @code{false}.
+@end deftypefn
+
+@hook TARGET_VECTORIZE_BUILTIN_SHUFFLE2
+This hook should determine whether the vector shuffle can be implemented with
+hardware instructions and, if so, replace the statement pointed to by @var{gsi}
+with the hardware-specific sequence; otherwise it should return @code{false}.
+@end deftypefn
+
 @hook TARGET_VECTORIZE_BUILTIN_VEC_PERM
 Target builtin that implements vector permute.
 @end deftypefn
Index: gcc/targhooks.c
===================================================================
--- gcc/targhooks.c	(revision 163244)
+++ gcc/targhooks.c	(working copy)
@@ -472,6 +472,26 @@ default_builtin_vectorized_function (tre
   return NULL_TREE;
 }
 
+
+/* Vector shuffling default hooks.  */
+
+bool
+default_builtin_shuffle (gimple_stmt_iterator *gsi ATTRIBUTE_UNUSED,
+			 tree vec0 ATTRIBUTE_UNUSED,
+			 tree mask ATTRIBUTE_UNUSED)
+{
+  return false;
+}
+
+bool
+default_builtin_shuffle2 (gimple_stmt_iterator *gsi ATTRIBUTE_UNUSED,
+			  tree vec0 ATTRIBUTE_UNUSED,
+			  tree vec1 ATTRIBUTE_UNUSED,
+			  tree mask ATTRIBUTE_UNUSED)
+{
+  return false;
+}
+
 /* Vectorized conversion.  */
 
 tree
Index: gcc/targhooks.h
===================================================================
--- gcc/targhooks.h	(revision 163244)
+++ gcc/targhooks.h	(working copy)
@@ -75,6 +75,12 @@ extern const char * default_invalid_with
 extern tree default_builtin_vectorized_function (tree, tree, tree);
 
+extern bool default_builtin_shuffle (gimple_stmt_iterator *gsi, tree vec0,
+				     tree mask);
+
+extern bool default_builtin_shuffle2 (gimple_stmt_iterator *gsi, tree vec0,
+				      tree vec1, tree mask);
+
 extern tree default_builtin_vectorized_conversion (unsigned int, tree, tree);
 
 extern int default_builtin_vectorization_cost (enum vect_cost_for_stmt, tree, int);
Index: gcc/target.def
===================================================================
--- gcc/target.def	(revision 163244)
+++ gcc/target.def	(working copy)
@@ -829,6 +829,23 @@ DEFHOOK
 "",
 tree, (tree type, tree *mask_element_type),
 NULL)
+/* Target built-in that implements vector shuffling, or returns
+   false if not available.  */
+DEFHOOK
+(builtin_shuffle,
+ "",
+ bool, (gimple_stmt_iterator *gsi, tree vec0, tree mask),
+ default_builtin_shuffle)
+
+/* Target built-in that implements two-vector shuffling, or returns
+   false if not available.  */
+DEFHOOK
+(builtin_shuffle2,
+ "",
+ bool, (gimple_stmt_iterator *gsi, tree vec0, tree vec1, tree mask),
+ default_builtin_shuffle2)
+
+
 /* Return true if a vector created for builtin_vec_perm is valid.  */
 DEFHOOK
 (builtin_vec_perm_ok,
Index: gcc/tree.c
===================================================================
--- gcc/tree.c	(revision 163244)
+++ gcc/tree.c	(working copy)
@@ -1358,6 +1358,27 @@ build_vector_from_ctor (tree type, VEC(c
   return build_vector (type, nreverse (list));
 }
 
+/* Build a vector of type VECTYPE where all the elements are SCs.
*/ +tree +build_vector_from_val (const tree sc, const tree vectype) +{ + tree t = NULL_TREE; + int i, nunits = TYPE_VECTOR_SUBPARTS (vectype); + + if (sc == error_mark_node) + return sc; + + gcc_assert (TREE_TYPE (sc) == TREE_TYPE (vectype)); + + for (i = 0; i < nunits; ++i) + t = tree_cons (NULL_TREE, sc, t); + + if (CONSTANT_CLASS_P (sc)) + return build_vector (vectype, t); + else + return build_constructor_from_list (vectype, t); +} + /* Return a new CONSTRUCTOR node whose type is TYPE and whose values are in the VEC pointed to by VALS. */ tree Index: gcc/tree.h =================================================================== --- gcc/tree.h (revision 163244) +++ gcc/tree.h (working copy) @@ -4027,6 +4027,7 @@ extern tree build_int_cst_type (tree, HO extern tree build_int_cst_wide (tree, unsigned HOST_WIDE_INT, HOST_WIDE_INT); extern tree build_vector (tree, tree); extern tree build_vector_from_ctor (tree, VEC(constructor_elt,gc) *); +extern tree build_vector_from_val (const tree, const tree); extern tree build_constructor (tree, VEC(constructor_elt,gc) *); extern tree build_constructor_single (tree, tree, tree); extern tree build_constructor_from_list (tree, tree); Index: gcc/target.h =================================================================== --- gcc/target.h (revision 163244) +++ gcc/target.h (working copy) @@ -51,6 +51,7 @@ #include "tm.h" #include "insn-modes.h" +#include "gimple.h" /* Types used by the record_gcc_switches() target function. */ typedef enum Index: gcc/testsuite/gcc.c-torture/execute/vect-shuffle-2.c =================================================================== --- gcc/testsuite/gcc.c-torture/execute/vect-shuffle-2.c (revision 0) +++ gcc/testsuite/gcc.c-torture/execute/vect-shuffle-2.c (revision 0) @@ -0,0 +1,43 @@ +#define vector(elcount, type) \ +__attribute__((vector_size((elcount)*sizeof(type)))) type + +#define vidx(type, vec, idx) (*(((type *) &(vec)) + idx)) + +#define shuf2compare(type, count, vres, v0, v1, mask) \ +do { \ + int __i; \ + for (__i = 0; __i < count; __i++) { \ + if (vidx(type, vres, __i) != ((vidx(type, mask, __i) < count) ? 
\ + vidx(type, v0, vidx(type, mask, __i)) : \ + vidx(type, v1, (vidx(type, mask, __i) - count)))) \ + __builtin_abort (); \ + } \ +} while (0) + + +int main (int argc, char *argv[]) { + vector (8, short) v0 = {argc, 1,2,3,4,5,6,7}; + vector (8, short) v1 = {argc, 1,argc,3,4,5,argc,7}; + vector (8, short) v2; + + //vector (8, short) mask = {1,2,5,4,3,6,7}; + + vector (8, short) mask0 = {0,2,3,1,4,5,6,7}; + vector (8, short) mask1 = {0,12,3,4,3,0,10,9}; + vector (8, short) mask2 = {0,8,1,9,2,10,3,11}; + + v2 = __builtin_shuffle (v0, v1, mask0); + shuf2compare (short, 8, v2, v0, v1, mask0); + + v2 = __builtin_shuffle (v0, v1, mask1); + shuf2compare (short, 8, v2, v0, v1, mask1); + + v2 = __builtin_shuffle (v0, v1, mask2); + shuf2compare (short, 8, v2, v0, v1, mask2); + + v2 = __builtin_shuffle (mask0, mask0, v0); + shuf2compare (short, 8, v2, mask0, mask0, v0); + + return 0; +} + Index: gcc/testsuite/gcc.c-torture/execute/vect-shuffle-4.c =================================================================== --- gcc/testsuite/gcc.c-torture/execute/vect-shuffle-4.c (revision 0) +++ gcc/testsuite/gcc.c-torture/execute/vect-shuffle-4.c (revision 0) @@ -0,0 +1,50 @@ +#define vector(elcount, type) \ +__attribute__((vector_size((elcount)*sizeof(type)))) type + +#define vidx(type, vec, idx) (*(((type *) &(vec)) + idx)) + +#define shuf2compare(type, count, vres, v0, v1, mask) \ +do { \ + int __i; \ + for (__i = 0; __i < count; __i++) { \ + if (vidx(type, vres, __i) != ((vidx(type, mask, __i) < count) ? \ + vidx(type, v0, vidx(type, mask, __i)) : \ + vidx(type, v1, (vidx(type, mask, __i) - count)))) \ + __builtin_abort (); \ + } \ +} while (0) + + +vector (8, short) __attribute__ ((noinline)) +f (vector (8, short) x, vector (8, short) y, vector (8, short) mask) { + return __builtin_shuffle (x, y, mask); +} + + + +int main (int argc, char *argv[]) { + vector (8, short) v0 = {argc, 1,2,3,4,5,6,7}; + vector (8, short) v1 = {argc, 1,argc,3,4,5,argc,7}; + vector (8, short) v2; + + //vector (8, short) mask = {1,2,5,4,3,6,7}; + + vector (8, short) mask0 = {0,2,3,1,4,5,6,7}; + vector (8, short) mask1 = {0,12,3,4,3,0,10,9}; + vector (8, short) mask2 = {0,8,1,9,2,10,3,11}; + + v2 = f (v0, v1, mask0); + shuf2compare (short, 8, v2, v0, v1, mask0); + + v2 = f (v0, v1, mask1); + shuf2compare (short, 8, v2, v0, v1, mask1); + + v2 = f (v0, v1, mask2); + shuf2compare (short, 8, v2, v0, v1, mask2); + + v2 = f (mask0, mask0, v0); + shuf2compare (short, 8, v2, mask0, mask0, v0); + + return 0; +} + Index: gcc/testsuite/gcc.c-torture/execute/vect-shuffle-1.c =================================================================== --- gcc/testsuite/gcc.c-torture/execute/vect-shuffle-1.c (revision 0) +++ gcc/testsuite/gcc.c-torture/execute/vect-shuffle-1.c (revision 0) @@ -0,0 +1,46 @@ +#define vector(elcount, type) \ +__attribute__((vector_size((elcount)*sizeof(type)))) type + +#define vidx(type, vec, idx) (*(((type *) &(vec)) + idx)) + +#define shufcompare(type, count, vres, v0, mask) \ +do { \ + int __i; \ + for (__i = 0; __i < count; __i++) { \ + if (vidx(type, vres, __i) != vidx(type, v0, vidx(type, mask, __i))) \ + __builtin_abort (); \ + } \ +} while (0) + + +int main (int argc, char *argv[]) { + /*vector (8, short) v0 = {argc, 1,2,3,4,5,6,7}; + vector (8, short) v1 = {argc, 1,argc,3,4,5,argc,7}; + vector (8, short) v2; + + vector (8, short) smask = {0,0,1,2,3,4,5,6}; + + v2 = __builtin_shuffle (v0, smask); + shufcompare (short, 8, v2, v0, smask); + v2 = __builtin_shuffle (v0, v1); + shufcompare (short, 8, v2, v0, v1); + v2 = 
__builtin_shuffle (smask, v0); + shufcompare (short, 8, v2, smask, v0);*/ + + vector (4, int) i0 = {argc, 1,2,3}; + vector (4, int) i1 = {argc, 1, argc, 3}; + vector (4, int) i2; + + vector (4, int) imask = {0,3,2,1}; + + /*i2 = __builtin_shuffle (i0, imask); + shufcompare (int, 4, i2, i0, imask);*/ + i2 = __builtin_shuffle (i0, i1); + shufcompare (int, 4, i2, i0, i1); + + i2 = __builtin_shuffle (imask, i0); + shufcompare (int, 4, i2, imask, i0); + + return 0; +} + Index: gcc/testsuite/gcc.c-torture/execute/vect-shuffle-3.c =================================================================== --- gcc/testsuite/gcc.c-torture/execute/vect-shuffle-3.c (revision 0) +++ gcc/testsuite/gcc.c-torture/execute/vect-shuffle-3.c (revision 0) @@ -0,0 +1,36 @@ +#define vector(elcount, type) \ +__attribute__((vector_size((elcount)*sizeof(type)))) type + +#define vidx(type, vec, idx) (*(((type *) &(vec)) + idx)) + +#define shufcompare(type, count, vres, v0, mask) \ +do { \ + int __i; \ + for (__i = 0; __i < count; __i++) { \ + if (vidx(type, vres, __i) != vidx(type, v0, vidx(type, mask, __i))) \ + __builtin_abort (); \ + } \ +} while (0) + +vector (8, short) __attribute__ ((noinline)) +f (vector (8, short) x, vector (8, short) mask) { + return __builtin_shuffle (x, mask); +} + + +int main (int argc, char *argv[]) { + vector (8, short) v0 = {argc, 1,2,3,4,5,6,7}; + vector (8, short) v1 = {argc, 1,argc,3,4,5,argc,7}; + vector (8, short) v2; + + vector (8, short) mask = {0,0,1,2,3,4,5,6}; + + v2 = f (v0, mask); + shufcompare (short, 8, v2, v0, mask); + + v2 = f (v0, v1); + shufcompare (short, 8, v2, v0, v1); + + return 0; +} + Index: gcc/builtins.def =================================================================== --- gcc/builtins.def (revision 163244) +++ gcc/builtins.def (working copy) @@ -708,6 +708,8 @@ DEF_GCC_BUILTIN (BUILT_IN_VA_ARG_ DEF_EXT_LIB_BUILTIN (BUILT_IN__EXIT, "_exit", BT_FN_VOID_INT, ATTR_NORETURN_NOTHROW_LIST) DEF_C99_BUILTIN (BUILT_IN__EXIT2, "_Exit", BT_FN_VOID_INT, ATTR_NORETURN_NOTHROW_LIST) +DEF_GCC_BUILTIN (BUILT_IN_SHUFFLE, "shuffle", BT_FN_INT_VAR, ATTR_CONST_NOTHROW_TYPEGENERIC) + /* Implementing nested functions. */ DEF_BUILTIN_STUB (BUILT_IN_INIT_TRAMPOLINE, "__builtin_init_trampoline") DEF_BUILTIN_STUB (BUILT_IN_ADJUST_TRAMPOLINE, "__builtin_adjust_trampoline") Index: gcc/c-typeck.c =================================================================== --- gcc/c-typeck.c (revision 163244) +++ gcc/c-typeck.c (working copy) @@ -2794,6 +2794,68 @@ build_function_call_vec (location_t loc, && !check_builtin_function_arguments (fundecl, nargs, argarray)) return error_mark_node; + /* Typecheck a builtin function which is declared with variable + argument list. 
*/
+  if (fundecl && DECL_BUILT_IN (fundecl)
+      && DECL_BUILT_IN_CLASS (fundecl) == BUILT_IN_NORMAL)
+    {
+      enum built_in_function fcode = DECL_FUNCTION_CODE (fundecl);
+      if (fcode == BUILT_IN_SHUFFLE)
+        {
+          tree firstarg = VEC_index (tree, params, 0);
+          tree mask = VEC_index (tree, params, nargs - 1);
+
+          if (nargs != 2 && nargs != 3)
+            {
+              error_at (loc, "__builtin_shuffle accepts 2 or 3 arguments");
+              return error_mark_node;
+            }
+
+          if (TREE_CODE (TREE_TYPE (mask)) != VECTOR_TYPE
+              || TREE_CODE (TREE_TYPE (TREE_TYPE (mask))) != INTEGER_TYPE)
+            {
+              error_at (loc, "__builtin_shuffle last argument must "
+                             "be an integer vector");
+              return error_mark_node;
+            }
+
+          if (TREE_CODE (TREE_TYPE (firstarg)) != VECTOR_TYPE
+              || (nargs == 3
+                  && TREE_CODE (TREE_TYPE (VEC_index (tree, params, 1)))
+                     != VECTOR_TYPE))
+            {
+              error_at (loc, "__builtin_shuffle arguments must be vectors");
+              return error_mark_node;
+            }
+
+          if ((TYPE_VECTOR_SUBPARTS (TREE_TYPE (firstarg))
+               != TYPE_VECTOR_SUBPARTS (TREE_TYPE (mask)))
+              || (nargs == 3
+                  && TYPE_VECTOR_SUBPARTS (
+                       TREE_TYPE (VEC_index (tree, params, 1)))
+                     != TYPE_VECTOR_SUBPARTS (TREE_TYPE (mask))))
+            {
+              error_at (loc, "__builtin_shuffle number of elements of the "
+                             "argument vector(s) and the mask vector should "
+                             "be the same");
+              return error_mark_node;
+            }
+
+          /* Here we change the return type of the builtin function
+             from int f(...) --> t f(...), where t is the type of the
+             first argument.  */
+          fundecl = copy_node (fundecl);
+          TREE_TYPE (fundecl) = build_function_type (TREE_TYPE (firstarg),
+                                  TYPE_ARG_TYPES (TREE_TYPE (fundecl)));
+          function = build_fold_addr_expr (fundecl);
+          result = build_call_array_loc (loc, TREE_TYPE (firstarg),
+                                         function, nargs, argarray);
+          return require_complete_type (result);
+        }
+    }
+
+
   /* Check that the arguments to the function are valid.  */
   check_function_arguments (TYPE_ATTRIBUTES (fntype), nargs, argarray,
 			    TYPE_ARG_TYPES (fntype));
@@ -6005,10 +6067,17 @@ digest_init (location_t init_loc, tree t
       tree value;
       bool constant_p = true;
 
-      /* Iterate through elements and check if all constructor
+      /* Warn if the constructor has fewer elements than the vector type.  */
+      if (CONSTRUCTOR_NELTS (inside_init)
+          < TYPE_VECTOR_SUBPARTS (TREE_TYPE (inside_init)))
+        warning_at (init_loc, 0, "vector length does not match "
+                                 "initializer length, zero elements "
+                                 "will be inserted");
+
+      /* Iterate through elements and check if all constructor
 	 elements are *_CSTs.  */
       FOR_EACH_CONSTRUCTOR_VALUE (CONSTRUCTOR_ELTS (inside_init), ix, value)
-	if (!CONSTANT_CLASS_P (value))
+	if (!CONSTANT_CLASS_P (value))
 	  {
 	    constant_p = false;
 	    break;
Index: gcc/tree-vect-generic.c
===================================================================
--- gcc/tree-vect-generic.c	(revision 163244)
+++ gcc/tree-vect-generic.c	(working copy)
@@ -30,6 +30,8 @@ along with GCC; see the file COPYING3.
 #include "tree-pass.h"
 #include "flags.h"
 #include "ggc.h"
+#include "target.h"
+#include "diagnostic.h"
 
 /* Need to include rtl.h, expr.h, etc. for optabs.  */
 #include "expr.h"
@@ -383,6 +385,277 @@ type_for_widest_vector_mode (enum machin
     }
 }
 
+
+/* Build a reference to the element of the vector VECT.  The function
+   returns either the element itself, a BIT_FIELD_REF, or an
+   ARRAY_REF expression.
+
+   GSI is required to insert temporary variables while building a
+   reference to the element of the vector VECT.
+
+   PTMPVEC is a pointer to a temporary variable used for caching
+   purposes.  If PTMPVEC is NULL, a new temporary variable will
+   be created.
*/ +static tree +vector_element (gimple_stmt_iterator *gsi, tree vect, tree idx, tree *ptmpvec) +{ + tree type; + gimple asgn; + unsigned HOST_WIDE_INT maxval; + tree tmpvec; + tree indextype, arraytype; + bool need_asgn = true; + + gcc_assert (TREE_CODE (TREE_TYPE (vect)) == VECTOR_TYPE); + + type = TREE_TYPE (vect); + if (TREE_CODE (idx) == INTEGER_CST) + { + unsigned HOST_WIDE_INT index; + + if (!host_integerp (idx, 1) + || (index = tree_low_cst (idx, 1)) > TYPE_VECTOR_SUBPARTS (type)-1) + return error_mark_node; + + if (TREE_CODE (vect) == VECTOR_CST) + { + unsigned i; + tree vals = TREE_VECTOR_CST_ELTS (vect); + for (i = 0; vals; vals = TREE_CHAIN (vals), ++i) + if (i == index) + return TREE_VALUE (vals); + return error_mark_node; + } + else if (TREE_CODE (vect) == CONSTRUCTOR) + { + unsigned i; + VEC (constructor_elt, gc) *vals = CONSTRUCTOR_ELTS (vect); + constructor_elt *elt; + + for (i = 0; VEC_iterate (constructor_elt, vals, i, elt); i++) + if (operand_equal_p (elt->index, idx, 0)) + return elt->value; + return fold_convert (TREE_TYPE (type), integer_zero_node); + } + else if (TREE_CODE (vect) == SSA_NAME) + { + tree el; + gimple vectdef = SSA_NAME_DEF_STMT (vect); + if (gimple_assign_single_p (vectdef) + && (el = vector_element (gsi, gimple_assign_rhs1 (vectdef), + idx, ptmpvec)) + != error_mark_node) + return el; + else + { + tree size = TYPE_SIZE (TREE_TYPE (type)); + tree pos = fold_build2 (MULT_EXPR, TREE_TYPE (idx), + idx, size); + return fold_build3 (BIT_FIELD_REF, TREE_TYPE (type), + vect, size, pos); + } + } + else + return error_mark_node; + } + + if (!ptmpvec) + tmpvec = create_tmp_var (TREE_TYPE (vect), "vectmp"); + else if (!*ptmpvec) + tmpvec = *ptmpvec = create_tmp_var (TREE_TYPE (vect), "vectmp"); + else + { + tmpvec = *ptmpvec; + need_asgn = false; + } + + if (need_asgn) + { + TREE_ADDRESSABLE (tmpvec) = 1; + asgn = gimple_build_assign (tmpvec, vect); + gsi_insert_before (gsi, asgn, GSI_SAME_STMT); + } + + maxval = TYPE_VECTOR_SUBPARTS (TREE_TYPE (vect)) -1; + indextype = build_index_type (size_int (maxval)); + arraytype = build_array_type (TREE_TYPE (type), indextype); + + return build4 (ARRAY_REF, TREE_TYPE (type), + build1 (VIEW_CONVERT_EXPR, arraytype, tmpvec), + idx, NULL_TREE, NULL_TREE); + + +} + +/* Lower built-in vector shuffle function. Function can have two or + three arguments. + When function has two arguments: __builtin_shuffle (v0, mask, + the lowered version would be {v0[mask[0]], v0[mask[1]], ...} + MASK and V0 must have the same number of elements. + + In case of three arguments: __builtin_shuffle (v0, v1, mask) + the lowered version would be: + {mask[0] < len(v0) ? v0[mask[0]] : v1[mask[0]], ...} + V0 and V1 must have the same type. MASK, V0, V1 must have the + same number of arguments. 
*/ +static void +lower_builtin_shuffle (gimple_stmt_iterator *gsi, location_t loc) +{ +#define TRAP_RETURN(new_stmt, stmt, gsi, vec0) \ +do { \ + new_stmt = gimple_build_call (built_in_decls[BUILT_IN_TRAP], 0); \ + gsi_insert_before (gsi, new_stmt, GSI_SAME_STMT); \ + split_block (gimple_bb (new_stmt), new_stmt); \ + new_stmt = gimple_build_assign (gimple_call_lhs (stmt), vec0); \ + gsi_replace (gsi, new_stmt, false); \ + return; \ +} while (0) + + gimple stmt = gsi_stmt (*gsi); + unsigned numargs = gimple_call_num_args (stmt); + tree mask = gimple_call_arg (stmt, numargs - 1); + tree vec0 = gimple_call_arg (stmt, 0); + unsigned els = TYPE_VECTOR_SUBPARTS (TREE_TYPE (mask)); + tree type0 = TREE_TYPE (TREE_TYPE (vec0)); + VEC(constructor_elt,gc) *v = NULL; + tree vectype, constr; + gimple new_stmt; + tree vec0tmp = NULL_TREE, masktmp = NULL_TREE; + + if (numargs == 2) + { + unsigned i; + tree vec0tmp = NULL_TREE; + + if (targetm.vectorize.builtin_shuffle (gsi, vec0, mask)) + { + /* Built-in is expanded by target. */ + return ; + } + + v = VEC_alloc (constructor_elt, gc, els); + for (i = 0; i < els; i++) + { + tree idxval, vecel, t; + if ((idxval = vector_element (gsi, mask, size_int (i), &masktmp)) + == error_mark_node) + { + warning_at (loc, 0, "Invalid shuffling mask index %i", i); + TRAP_RETURN (new_stmt, stmt, gsi, vec0); + } + + if ((vecel = vector_element (gsi, vec0, idxval, &vec0tmp)) + == error_mark_node) + { + warning_at (loc, 0, "Invalid shuffling arguments"); + TRAP_RETURN (new_stmt, stmt, gsi, vec0); + } + + t = force_gimple_operand_gsi (gsi, vecel, true, + NULL_TREE, true, GSI_SAME_STMT); + CONSTRUCTOR_APPEND_ELT (v, size_int (i), t); + } + } + else if (numargs == 3) + { + unsigned i; + tree vec1 = gimple_call_arg (stmt, 1); + tree var = create_tmp_var (type0, "vecel"); + tree vec1tmp = NULL_TREE; + + if (targetm.vectorize.builtin_shuffle2 (gsi, vec0, vec1, mask)) + { + /* Built-in is expanded by target. 
*/ + return ; + } + + v = VEC_alloc (constructor_elt, gc, els); + for (i = 0; i < els; i++) + { + tree idxval, idx1val, cond, elval0, elval1, condexpr, t, ssatmp; + tree vec0el, vec1el; + gimple asgn; + + if ((idxval = vector_element (gsi, mask, size_int (i), &masktmp)) + == error_mark_node) + { + warning_at (loc, 0, "Invalid shuffling mask index %i", i); + TRAP_RETURN (new_stmt, stmt, gsi, vec0); + } + + if (TREE_CODE (idxval) == INTEGER_CST) + { + if (tree_int_cst_lt (idxval, size_int (els))) + { + vec0el = vector_element (gsi, vec0, idxval, &vec0tmp); + t = force_gimple_operand_gsi (gsi, vec0el, + true, NULL_TREE, true, GSI_SAME_STMT); + } + else if (tree_int_cst_lt (idxval, size_int (2*els))) + { + idx1val = fold_build2 (MINUS_EXPR, TREE_TYPE (idxval), + idxval, build_int_cst (TREE_TYPE (idxval), els)); + + vec1el = vector_element (gsi, vec1, idx1val, &vec1tmp); + t = force_gimple_operand_gsi (gsi, vec1el, + true, NULL_TREE, true, GSI_SAME_STMT); + } + else + { + warning_at (loc, 0, "Invalid shuffling mask index %i", i); + TRAP_RETURN (new_stmt, stmt, gsi, vec0); + } + } + else + { + + idx1val = fold_build2 (MINUS_EXPR, TREE_TYPE (idxval), + idxval, build_int_cst (TREE_TYPE (idxval), els)); + idx1val = force_gimple_operand_gsi (gsi, idx1val, + true, NULL_TREE, true, GSI_SAME_STMT); + cond = build2 (GT_EXPR, boolean_type_node, \ + idxval, convert (type0, size_int (els - 1))); + + if ((vec0el = vector_element (gsi, vec0, idxval, &vec0tmp)) + == error_mark_node) + { + warning_at (loc, 0, "Invalid shuffling arguments"); + TRAP_RETURN (new_stmt, stmt, gsi, vec0); + } + elval0 = force_gimple_operand_gsi (gsi, vec0el, + true, NULL_TREE, true, GSI_SAME_STMT); + + if ((vec1el = vector_element (gsi, vec1, idx1val, &vec1tmp)) + == error_mark_node) + { + warning_at (loc, 0, "Invalid shuffling arguments"); + TRAP_RETURN (new_stmt, stmt, gsi, vec0); + } + + elval1 = force_gimple_operand_gsi (gsi, vec1el, + true, NULL_TREE, true, GSI_SAME_STMT); + + condexpr = fold_build3 (COND_EXPR, type0, cond, \ + elval1, elval0); + + t = force_gimple_operand_gsi (gsi, condexpr, true, \ + NULL_TREE, true, GSI_SAME_STMT); + } + + asgn = gimple_build_assign (var, t); + ssatmp = make_ssa_name (var, asgn); + gimple_assign_set_lhs (asgn, ssatmp); + gsi_insert_before (gsi, asgn, GSI_SAME_STMT); + CONSTRUCTOR_APPEND_ELT (v, size_int (i), ssatmp); + } + } + + vectype = build_vector_type (type0, els); + constr = build_constructor (vectype, v); + new_stmt = gimple_build_assign (gimple_call_lhs (stmt), constr); + gsi_replace (gsi, new_stmt, false); +} + /* Process one statement. If we identify a vector operation, expand it. */ static void @@ -396,6 +669,12 @@ expand_vector_operations_1 (gimple_stmt_ enum gimple_rhs_class rhs_class; tree new_rhs; + if (gimple_call_builtin_p (stmt, BUILT_IN_SHUFFLE)) + { + lower_builtin_shuffle (gsi, gimple_location (stmt)); + gimple_set_modified (gsi_stmt (*gsi), true); + } + if (gimple_code (stmt) != GIMPLE_ASSIGN) return; @@ -521,10 +800,11 @@ expand_vector_operations_1 (gimple_stmt_ /* Use this to lower vector operations introduced by the vectorizer, if it may need the bit-twiddling tricks implemented in this file. 
*/ + static bool -gate_expand_vector_operations (void) +gate_expand_vector_operations_noop (void) { - return flag_tree_vectorize != 0; + return optimize == 0; } static unsigned int @@ -549,7 +829,7 @@ struct gimple_opt_pass pass_lower_vector { GIMPLE_PASS, "veclower", /* name */ - 0, /* gate */ + gate_expand_vector_operations_noop, /* gate */ expand_vector_operations, /* execute */ NULL, /* sub */ NULL, /* next */ @@ -559,8 +839,8 @@ struct gimple_opt_pass pass_lower_vector 0, /* properties_provided */ 0, /* properties_destroyed */ 0, /* todo_flags_start */ - TODO_dump_func | TODO_ggc_collect - | TODO_verify_stmts /* todo_flags_finish */ + TODO_dump_func | TODO_update_ssa | TODO_ggc_collect + | TODO_verify_stmts | TODO_cleanup_cfg/* todo_flags_finish */ } }; @@ -569,7 +849,7 @@ struct gimple_opt_pass pass_lower_vector { GIMPLE_PASS, "veclower2", /* name */ - gate_expand_vector_operations, /* gate */ + 0, /* gate */ expand_vector_operations, /* execute */ NULL, /* sub */ NULL, /* next */ @@ -582,6 +862,7 @@ struct gimple_opt_pass pass_lower_vector TODO_dump_func | TODO_update_ssa /* todo_flags_finish */ | TODO_verify_ssa | TODO_verify_stmts | TODO_verify_flow + | TODO_cleanup_cfg } }; Index: gcc/Makefile.in =================================================================== --- gcc/Makefile.in (revision 163244) +++ gcc/Makefile.in (working copy) @@ -864,7 +864,7 @@ endif VEC_H = vec.h statistics.h EXCEPT_H = except.h $(HASHTAB_H) vecprim.h vecir.h TOPLEV_H = toplev.h $(INPUT_H) bversion.h $(DIAGNOSTIC_CORE_H) -TARGET_H = $(TM_H) target.h target.def insn-modes.h +TGT = $(TM_H) target.h target.def insn-modes.h MACHMODE_H = machmode.h mode-classes.def insn-modes.h HOOKS_H = hooks.h $(MACHMODE_H) HOSTHOOKS_DEF_H = hosthooks-def.h $(HOOKS_H) @@ -886,8 +886,9 @@ TREE_H = tree.h all-tree.def tree.def c- REGSET_H = regset.h $(BITMAP_H) hard-reg-set.h BASIC_BLOCK_H = basic-block.h $(PREDICT_H) $(VEC_H) $(FUNCTION_H) cfghooks.h GIMPLE_H = gimple.h gimple.def gsstruct.def pointer-set.h $(VEC_H) \ - $(GGC_H) $(BASIC_BLOCK_H) $(TM_H) $(TARGET_H) tree-ssa-operands.h \ + $(GGC_H) $(BASIC_BLOCK_H) $(TM_H) $(TGT) tree-ssa-operands.h \ tree-ssa-alias.h vecir.h +TARGET_H = $(TGT) gimple.h GCOV_IO_H = gcov-io.h gcov-iov.h auto-host.h COVERAGE_H = coverage.h $(GCOV_IO_H) DEMANGLE_H = $(srcdir)/../include/demangle.h @@ -3156,7 +3157,7 @@ tree-vect-generic.o : tree-vect-generic. 
$(TM_H) $(TREE_FLOW_H) $(GIMPLE_H) tree-iterator.h $(TREE_PASS_H) \ $(FLAGS_H) $(OPTABS_H) $(MACHMODE_H) $(EXPR_H) \ langhooks.h $(FLAGS_H) $(DIAGNOSTIC_H) gt-tree-vect-generic.h $(GGC_H) \ - coretypes.h insn-codes.h + coretypes.h insn-codes.h target.h diagnostic.h df-core.o : df-core.c $(CONFIG_H) $(SYSTEM_H) coretypes.h $(TM_H) $(RTL_H) \ insn-config.h $(RECOG_H) $(FUNCTION_H) $(REGS_H) alloc-pool.h \ hard-reg-set.h $(BASIC_BLOCK_H) $(DF_H) $(BITMAP_H) sbitmap.h $(TIMEVAR_H) \ Index: gcc/passes.c =================================================================== --- gcc/passes.c (revision 163244) +++ gcc/passes.c (working copy) @@ -735,7 +735,6 @@ init_optimization_passes (void) NEXT_PASS (pass_refactor_eh); NEXT_PASS (pass_lower_eh); NEXT_PASS (pass_build_cfg); - NEXT_PASS (pass_lower_vector); NEXT_PASS (pass_warn_function_return); NEXT_PASS (pass_build_cgraph_edges); NEXT_PASS (pass_inline_parameters); @@ -763,6 +762,7 @@ init_optimization_passes (void) NEXT_PASS (pass_referenced_vars); NEXT_PASS (pass_build_ssa); + NEXT_PASS (pass_lower_vector); NEXT_PASS (pass_early_warn_uninitialized); /* Note that it is not strictly necessary to schedule an early inline pass here. However, some test cases (e.g., @@ -915,7 +915,6 @@ init_optimization_passes (void) NEXT_PASS (pass_vectorize); { struct opt_pass **p = &pass_vectorize.pass.sub; - NEXT_PASS (pass_lower_vector_ssa); NEXT_PASS (pass_dce_loop); } NEXT_PASS (pass_predcom); @@ -926,6 +925,7 @@ init_optimization_passes (void) NEXT_PASS (pass_iv_optimize); NEXT_PASS (pass_tree_loop_done); } + NEXT_PASS (pass_lower_vector_ssa); NEXT_PASS (pass_cse_reciprocals); NEXT_PASS (pass_reassoc); NEXT_PASS (pass_vrp); Index: gcc/config/i386/i386.c =================================================================== --- gcc/config/i386/i386.c (revision 163244) +++ gcc/config/i386/i386.c (working copy) @@ -25,6 +25,7 @@ along with GCC; see the file COPYING3. #include "tm.h" #include "rtl.h" #include "tree.h" +#include "tree-flow.h" #include "tm_p.h" #include "regs.h" #include "hard-reg-set.h" @@ -27855,6 +27856,9 @@ struct expand_vec_perm_d static bool expand_vec_perm_1 (struct expand_vec_perm_d *d); static bool expand_vec_perm_broadcast_1 (struct expand_vec_perm_d *d); +static int extract_vec_perm_cst (struct expand_vec_perm_d *, tree); +static bool ix86_vectorize_builtin_vec_perm_ok (tree vec_type, tree mask); + /* Get a vector mode of the same size as the original but with elements twice as wide. This is only guaranteed to apply to integral vectors. */ @@ -30222,6 +30226,182 @@ ix86_vectorize_builtin_vec_perm (tree ve return ix86_builtins[(int) fcode]; } +/* Lower shuffling the vector VEC0 as specified by MASK. Replaces the + statement at *GSI with a target specific sequence implementing the + shuffle operation and returns true. Returns false if no target + specific sequence for this shuffle operation exists. */ +static bool +ix86_vectorize_builtin_shuffle (gimple_stmt_iterator *gsi, + tree vec0, tree mask) +{ + if (!VECTOR_MODE_P (TYPE_MODE (TREE_TYPE (vec0)))) + return false; + + /* Recursively grab the definition of the variable. 
*/ + while (TREE_CODE (mask) == SSA_NAME) + { + gimple maskdef = SSA_NAME_DEF_STMT (mask); + if (gimple_assign_single_p (maskdef)) + mask = gimple_assign_rhs1 (maskdef); + else + break; + } + + if (TREE_CODE (mask) == VECTOR_CST) + { + tree t, m_type; + if (ix86_vectorize_builtin_vec_perm_ok (TREE_TYPE (vec0), mask)) + { + t = ix86_vectorize_builtin_vec_perm (TREE_TYPE (vec0), &m_type); + + if (t != NULL_TREE) + { + gimple c = gimple_build_call (t, 3, vec0, vec0, mask); + gimple stmt = gsi_stmt (*gsi); + gimple_call_set_lhs (c, gimple_call_lhs (stmt)); + gsi_replace (gsi, c, false); + return true; + } + } + } + /* If we cannot expand it via vec_perm, we will try to expand it + via PSHUFB instruction. */ + { + tree mtype = TREE_TYPE (mask); + unsigned HOST_WIDE_INT i = 1, w = TYPE_VECTOR_SUBPARTS (mtype); + tree mcst, c, m1; + tree mvar, m1var, t; + tree fntype; + gimple asgn; + + if (tree_low_cst (TYPE_SIZE (mtype), 1) != 128) + return false; + + if (!TARGET_SSE3 && !TARGET_AVX) + return false; + + if (NULL_TREE == (fntype = ix86_builtins[(int) IX86_BUILTIN_PSHUFB128])) + return false; + + mvar = create_tmp_var (mtype, "mask"); + m1var = create_tmp_var (mtype, "nmask"); + + c = build_int_cst (TREE_TYPE (mtype), w-1); + mcst = build_vector_from_val (c, mtype); + + /* mvar = mask & {w-1, w-1, w-1,...} */ + m1 = build2 (BIT_AND_EXPR, mtype, mask, mcst); + t = force_gimple_operand_gsi (gsi, m1, + true, NULL_TREE, true, GSI_SAME_STMT); + asgn = gimple_build_assign (mvar, t); + gsi_insert_before (gsi, asgn, GSI_SAME_STMT); + + /* m1var = mvar */ + asgn = gimple_build_assign (m1var, t); + gsi_insert_before (gsi, asgn, GSI_SAME_STMT); + + while (w != 16) + { + /* m1var = mm1var << 8*i */ + m1 = build2 (LSHIFT_EXPR, mtype, m1var, + build_int_cst (TREE_TYPE (mtype), 8*i)); + t = force_gimple_operand_gsi (gsi, m1, + true, NULL_TREE, true, GSI_SAME_STMT); + asgn = gimple_build_assign (m1var, t); + gsi_insert_before (gsi, asgn , GSI_SAME_STMT); + + /* mvar = mvar | m1var */ + m1 = build2 (BIT_IOR_EXPR, mtype, mvar, m1var); + t = force_gimple_operand_gsi (gsi, m1, + true, NULL_TREE, true, GSI_SAME_STMT); + asgn = gimple_build_assign (mvar, t); + gsi_insert_before (gsi, asgn, GSI_SAME_STMT); + + /* m1var = mvar */ + t = force_gimple_operand_gsi (gsi, mvar, + true, NULL_TREE, true, GSI_SAME_STMT); + asgn = gimple_build_assign (m1var, t); + gsi_insert_before (gsi, asgn, GSI_SAME_STMT); + + w *= 2; + i *= 2; + } + + if (fntype != NULL_TREE) + { + tree v, m, r, ctype; + gimple c, stmt, asgn; + + ctype = build_vector_type (char_type_node, 16); + r = create_tmp_var (ctype, "res"); + + v = force_gimple_operand_gsi (gsi, + fold_build1 (VIEW_CONVERT_EXPR, ctype, vec0), + true, NULL_TREE, true, GSI_SAME_STMT); + + m = force_gimple_operand_gsi (gsi, + fold_build1 (VIEW_CONVERT_EXPR, ctype, mvar), + true, NULL_TREE, true, GSI_SAME_STMT); + + c = gimple_build_call (fntype, 2, v, m); + gimple_call_set_lhs (c, r); + gsi_insert_before (gsi, c, GSI_SAME_STMT); + + stmt = gsi_stmt (*gsi); + t = force_gimple_operand_gsi (gsi, + build1 (VIEW_CONVERT_EXPR, TREE_TYPE (vec0), r), + true, NULL_TREE, true, GSI_SAME_STMT); + asgn = gimple_build_assign (gimple_call_lhs (stmt), t); + gsi_replace (gsi, asgn, false); + return true; + } + } + return false; +} + +/* Lower shuffling vectors VEC0 and VEC1 as specified by MASK. + Replaces the statement at *GSI with a target specific sequence + implementing the shuffle operation and returns true. Returns + false if no target specific sequence for this shuffle operation + exists. 
*/ +static bool +ix86_vectorize_builtin_shuffle2 (gimple_stmt_iterator *gsi, + tree vec0, tree vec1, tree mask) +{ + if (!VECTOR_MODE_P (TYPE_MODE (TREE_TYPE (vec0)))) + return false; + + /* Check wheteher vector size is 128 or 256 */ + while (TREE_CODE (mask) == SSA_NAME) + { + gimple maskdef = SSA_NAME_DEF_STMT (mask); + if (gimple_assign_single_p (maskdef)) + mask = gimple_assign_rhs1 (maskdef); + else + break; + } + + if (TREE_CODE (mask) == VECTOR_CST) + { + tree t, m_type; + if (!ix86_vectorize_builtin_vec_perm_ok (TREE_TYPE (vec0), mask)) + return false; + + t = ix86_vectorize_builtin_vec_perm (TREE_TYPE (vec0), &m_type); + + if (t != NULL_TREE) + { + gimple c = gimple_build_call (t, 3, vec0, vec1, mask); + gimple stmt = gsi_stmt (*gsi); + gimple_call_set_lhs (c, gimple_call_lhs (stmt)); + gsi_replace (gsi, c, false); + return true; + } + } + + return false; +} + /* Return a vector mode with twice as many elements as VMODE. */ /* ??? Consider moving this to a table generated by genmodes.c. */ @@ -31139,13 +31319,21 @@ extract_vec_perm_cst (struct expand_vec_ unsigned i, nelt = d->nelt; int ret = 0; - for (i = 0; i < nelt; ++i, list = TREE_CHAIN (list)) + for (i = 0; i < nelt; ++i, list = + (list == NULL_TREE ? NULL_TREE : TREE_CHAIN (list))) { unsigned HOST_WIDE_INT e; + tree value; + + if (list != NULL_TREE) + value = TREE_VALUE (list); + else + value = fold_convert (TREE_TYPE (TREE_TYPE (cst)), + integer_zero_node); - if (!host_integerp (TREE_VALUE (list), 1)) + if (!host_integerp (value, 1)) return 0; - e = tree_low_cst (TREE_VALUE (list), 1); + e = tree_low_cst (value, 1); if (e >= 2 * nelt) return 0; @@ -31294,10 +31482,10 @@ ix86_vectorize_builtin_vec_perm_ok (tree vec_mask = extract_vec_perm_cst (&d, mask); - /* This hook is cannot be called in response to something that the - user does (unlike the builtin expander) so we shouldn't ever see - an error generated from the extract. */ - gcc_assert (vec_mask > 0 && vec_mask <= 3); + /* Check whether the mask can be applied to the vector type. */ + if (vec_mask < 0 || vec_mask > 3) + return false; + one_vec = (vec_mask != 3); /* Implementable with shufps or pshufd. */ @@ -31715,6 +31903,14 @@ ix86_enum_va_list (int idx, const char * #define TARGET_VECTORIZE_BUILTIN_VEC_PERM_OK \ ix86_vectorize_builtin_vec_perm_ok +#undef TARGET_VECTORIZE_BUILTIN_SHUFFLE +#define TARGET_VECTORIZE_BUILTIN_SHUFFLE \ + ix86_vectorize_builtin_shuffle + +#undef TARGET_VECTORIZE_BUILTIN_SHUFFLE2 +#define TARGET_VECTORIZE_BUILTIN_SHUFFLE2 \ + ix86_vectorize_builtin_shuffle2 + #undef TARGET_SET_CURRENT_FUNCTION #define TARGET_SET_CURRENT_FUNCTION ix86_set_current_function
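For reference, the semantics implemented by lower_builtin_shuffle (and
checked by the shufcompare/shuf2compare macros in the new tests) correspond
to the scalar loops below.  This is an illustrative sketch only, not part of
the patch: the helper names are invented, and out-of-range constant mask
indices (which the lowering rejects with a warning and a trap) are not
modeled.

/* Illustrative scalar model of __builtin_shuffle; not part of the patch.  */
#include <assert.h>

/* One-vector form: res[i] = v0[mask[i]].  */
static void
shuffle1_ref (const int *v0, const int *mask, int *res, int els)
{
  int i;
  for (i = 0; i < els; i++)
    res[i] = v0[mask[i]];
}

/* Two-vector form: indices 0..els-1 pick from V0, els..2*els-1 from V1.  */
static void
shuffle2_ref (const int *v0, const int *v1, const int *mask,
	      int *res, int els)
{
  int i;
  for (i = 0; i < els; i++)
    res[i] = mask[i] < els ? v0[mask[i]] : v1[mask[i] - els];
}

int
main (void)
{
  int a[4] = {1, 2, 3, 4}, b[4] = {5, 6, 7, 8};
  int m1[4] = {0, 1, 1, 3}, m2[4] = {0, 4, 2, 5}, r[4];

  shuffle1_ref (a, m1, r, 4);
  assert (r[0] == 1 && r[1] == 2 && r[2] == 2 && r[3] == 4);

  shuffle2_ref (a, b, m2, r, 4);
  assert (r[0] == 1 && r[1] == 5 && r[2] == 3 && r[3] == 6);

  return 0;
}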