
Add __builtin_convertvector support (PR c++/85052)

Message ID 20190103100640.GM30353@tucnak

Commit Message

Jakub Jelinek Jan. 3, 2019, 10:06 a.m. UTC
Hi!

The following patch adds support for the __builtin_convertvector builtin.
C casts on generic vectors are just a reinterpretation of the bits (i.e. a
VCE); this builtin allows casting int/unsigned elements to float or vice
versa, or promoting/demoting them.  The doc/ change is missing; I will
write it soon.

The builtin appeared in clang 3.4, I believe, and is apparently in
real-world use, as e.g. Honza reported.  The first argument is an
expression with vector type, the second argument is a vector type
(similarly to e.g. va_arg) to which the first argument should be
converted.  Both vector types need to have the same number of elements.
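
For illustration, a minimal usage sketch (my own example, not from the
patch; the typedefs follow the style of the testcases below):

typedef int v4si __attribute__((vector_size (4 * sizeof (int))));
typedef float v4sf __attribute__((vector_size (4 * sizeof (float))));

void
example (v4si i, v4sf *conv, v4sf *bits)
{
  /* Element-wise value conversion; for i = { 1, 2, -3, -4 } this
     stores { 1.0f, 2.0f, -3.0f, -4.0f }.  */
  *conv = __builtin_convertvector (i, v4sf);
  /* By contrast, a C cast between same-size generic vectors just
     reinterprets the bits (a VIEW_CONVERT_EXPR); the values are not
     converted.  */
  *bits = (v4sf) i;
}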

I've implemented same-element-size (and thus also same-whole-vector-size)
conversions efficiently: signed to unsigned and vice versa, or identical
vector types, use just a VCE; e.g. int <-> float or long long <-> double
use the appropriate optab, possibly repeated multiple times for very large
vectors.  For everything there is a fallback that lowers
__builtin_convertvector (x, t)
as { (__typeof (t[0])) x[0], (__typeof (t[1])) x[1], ... }.
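
Spelled out by hand, that fallback is equivalent to something like the
following element-wise form (my sketch of the semantics, not the patch's
internal representation):

typedef double v4df __attribute__((vector_size (4 * sizeof (double))));
typedef int v4si __attribute__((vector_size (4 * sizeof (int))));

/* What the piecewise fallback for __builtin_convertvector (x, v4si)
   amounts to for a v4df argument.  */
v4si
fallback (v4df x)
{
  return (v4si) { (int) x[0], (int) x[1], (int) x[2], (int) x[3] };
}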

What isn't implemented efficiently (yet) are the narrowing and widening
conversions; the optabs we have are meant for same-size vectors, so for
packing we have 2 arguments that we pack into 1, and for unpacking we have
the lo/hi variants, but in this case, at least for the most common vectors,
we have just one argument and want a result with the same number of
elements.  The AVX* instructions operating on different vector sizes are
what handles this most efficiently; of course, for large generic vectors we
can easily use these optabs.  Shall we go for e.g. trying to pack the
argument and an all-zeros dummy operand and pick the low half of the result
vector, or pick the low and high halves of the argument and use half-sized
vector operations, or both?
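
To make the two options concrete, here is a hand-written C illustration for
a hypothetical v4di -> v4si narrowing (my own sketch; at the GIMPLE level
the pack step would be a VEC_PACK_TRUNC_EXPR rather than the element-wise
code below):

typedef long long v4di __attribute__((vector_size (4 * sizeof (long long))));
typedef long long v2di __attribute__((vector_size (2 * sizeof (long long))));
typedef int v4si __attribute__((vector_size (4 * sizeof (int))));
typedef int v8si __attribute__((vector_size (8 * sizeof (int))));

/* Option 1: pack the argument together with an all-zeros dummy operand
   into a full-width v8si, then keep only the low half.  */
v4si
narrow_via_dummy (v4di x)
{
  v4di zero = { 0, 0, 0, 0 };
  v8si full = { (int) x[0], (int) x[1], (int) x[2], (int) x[3],
		(int) zero[0], (int) zero[1], (int) zero[2], (int) zero[3] };
  v4si lo;
  __builtin_memcpy (&lo, &full, sizeof (lo));
  return lo;
}

/* Option 2: split the argument into its low and high v2di halves and
   combine them with one half-sized pack operation.  */
v4si
narrow_via_halves (v4di x)
{
  v2di lo, hi;
  __builtin_memcpy (&lo, &x, sizeof (lo));
  __builtin_memcpy (&hi, (char *) &x + sizeof (lo), sizeof (hi));
  return (v4si) { (int) lo[0], (int) lo[1], (int) hi[0], (int) hi[1] };
}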

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2019-01-03  Jakub Jelinek  <jakub@redhat.com>

	PR c++/85052
	* tree-vect-generic.c (expand_vector_piecewise): Add defaulted
	ret_type argument, if non-NULL, use that in preference to type
	for the result type.
	(expand_vector_parallel): Formatting fix.
	(do_vec_conversion, expand_vector_conversion): New functions.
	(expand_vector_operations_1): Call expand_vector_conversion
	for VEC_CONVERT ifn calls.
	* internal-fn.def (VEC_CONVERT): New internal function.
	* internal-fn.c (expand_VEC_CONVERT): New function.
	* fold-const-call.c (fold_const_vec_convert): New function.
	(fold_const_call): Use it for CFN_VEC_CONVERT.
c-family/
	* c-common.h (enum rid): Add RID_BUILTIN_CONVERTVECTOR.
	(c_build_vec_convert): Declare.
	* c-common.c (c_build_vec_convert): New function.
c/
	* c-parser.c (c_parser_postfix_expression): Parse
	__builtin_convertvector.
cp/
	* cp-tree.h (cp_build_vec_convert): Declare.
	* parser.c (cp_parser_postfix_expression): Parse
	__builtin_convertvector.
	* constexpr.c: Include fold-const-call.h.
	(cxx_eval_internal_function): Handle IFN_VEC_CONVERT.
	(potential_constant_expression_1): Likewise.
	* semantics.c (cp_build_vec_convert): New function.
	* pt.c (tsubst_copy_and_build): Handle CALL_EXPR to
	IFN_VEC_CONVERT.
testsuite/
	* c-c++-common/builtin-convertvector-1.c: New test.
	* c-c++-common/torture/builtin-convertvector-1.c: New test.
	* g++.dg/ext/builtin-convertvector-1.C: New test.
	* g++.dg/cpp0x/constexpr-builtin4.C: New test.


	Jakub

Comments

Marc Glisse Jan. 3, 2019, 10:48 a.m. UTC | #1
On Thu, 3 Jan 2019, Jakub Jelinek wrote:

> The following patch adds support for the __builtin_convertvector builtin.
> C casts on generic vectors are just reinterpretation of the bits (i.e. a
> VCE), this builtin allows to cast int/unsigned elements to float or vice
> versa or promote/demote them.  doc/ change is missing, will write it soon.
>
> The builtin appeared in I think clang 3.4 and is apparently in real-world
> use as e.g. Honza reported.  The first argument is an expression with vector
> type, the second argument is a vector type (similarly e.g. to va_arg), to
> which the first argument should be converted.  Both vector types need to
> have the same number of elements.
>
> I've implemented same element size (thus also whole vector size) conversions
> efficiently - signed to unsigned and vice versa or same vector type just
> using a VCE, for e.g. int <-> float or long long <-> double using
> appropriate optab, possibly repeated multiple times for very large vectors.

Hello,

IIUC, you only lower __builtin_convertvector to VCE or FLOAT_EXPR or 
whatever in tree-vect-generic.  That seems quite late.  At least for the 
"easy" same-size case, I think we should do it early (gimplification?), 
before we start optimizing, without checking whether it is supported by the 
target (generic lowering can fix that up later).  Of course that can be 
changed later; getting the basic functionality in comes first.

(while you are writing the doc patch: tree.def and generic.texi do not say 
anything about using FLOAT_EXPR on a vector)
Jakub Jelinek Jan. 3, 2019, 11:04 a.m. UTC | #2
On Thu, Jan 03, 2019 at 11:48:12AM +0100, Marc Glisse wrote:
> > The following patch adds support for the __builtin_convertvector builtin.
> > C casts on generic vectors are just reinterpretation of the bits (i.e. a
> > VCE), this builtin allows to cast int/unsigned elements to float or vice
> > versa or promote/demote them.  doc/ change is missing, will write it soon.
> > 
> > The builtin appeared in I think clang 3.4 and is apparently in real-world
> > use as e.g. Honza reported.  The first argument is an expression with vector
> > type, the second argument is a vector type (similarly e.g. to va_arg), to
> > which the first argument should be converted.  Both vector types need to
> > have the same number of elements.
> > 
> > I've implemented same element size (thus also whole vector size) conversions
> > efficiently - signed to unsigned and vice versa or same vector type just
> > using a VCE, for e.g. int <-> float or long long <-> double using
> > appropriate optab, possibly repeated multiple times for very large vectors.
> 
> IIUC, you only lower __builtin_convertvector to VCE or FLOAT_EXPR or
> whatever in tree-vect-generic. That seems quite late. At least for the
> "easy" same-size case, I think we should do it early (gimplification?),

No, it must not be done at gimplification time: think about OpenMP/OpenACC
offloading, where the target before IPA optimizations might not be the
target after them.  While the targets have to agree on ABI issues, the
optabs definitely can be and are different, and these optabs, originally
added for the vectorizer, are something that doesn't have a fallback;
whatever introduces them into the IL is responsible for verifying that they
are supported.

It could be done in some post-IPA pass, perhaps by just calling the
tree-vect-generic.c function added in the patch from somewhere else, maybe
with a special argument that would handle only the single-op cases and not
the others.

That said, I'm not sure if e.g. using an opaque builtin for the conversion,
as supportable_convert_operation sometimes does, is better than this ifn.
What exact optimization opportunities are you looking for if it is lowered
earlier?  I have the VECTOR_CST folding in place...

> before we start optimizing, without checking if it is supported by the
> target (generic lowering can fix that up later). Of course that can be
> changed later, getting the basic functionality comes first.

	Jakub
Richard Biener Jan. 3, 2019, 11:16 a.m. UTC | #3
On Thu, 3 Jan 2019, Jakub Jelinek wrote:

> Hi!
> 
> The following patch adds support for the __builtin_convertvector builtin.
> C casts on generic vectors are just reinterpretation of the bits (i.e. a
> VCE), this builtin allows to cast int/unsigned elements to float or vice
> versa or promote/demote them.  doc/ change is missing, will write it soon.
> 
> The builtin appeared in I think clang 3.4 and is apparently in real-world
> use as e.g. Honza reported.  The first argument is an expression with vector
> type, the second argument is a vector type (similarly e.g. to va_arg), to
> which the first argument should be converted.  Both vector types need to
> have the same number of elements.
> 
> I've implemented same element size (thus also whole vector size) conversions
> efficiently - signed to unsigned and vice versa or same vector type just
> using a VCE, for e.g. int <-> float or long long <-> double using
> appropriate optab, possibly repeated multiple times for very large vectors.
> For everything there is a fallback to lower __builtin_convertvector (x, t)
> as { (__typeof (t[0])) x[0], (__typeof (t[1])) x[1], ... }.
> 
> What isn't implemented efficiently (yet) are the narrowing or widening
> conversions; the optabs we have are meant for same size vectors, so
> for the packing we have 2 arguments that we pack into 1, for unpacking we
> have those lo/hi variants, but in this case at least for the most common
> vectors we have just one argument and want result with the same number of
> elements.  The AVX* different vector size instructions is the thing that
> does this most efficiently, of course for large generic vectors we can
> easily use these optabs.  Shall we go for e.g. trying to pack the argument
> and all zeros dummy operand and pick the low half of the result vector,
> or pick the low and high halves of the argument and use a half sized vector
> operations, or both?

I guess it depends on target capabilities - I think
__builtin_convertvector is a bit "misdesigned" for pack/unpack.  You
also have to consider a v2di to v2qi conversion, which requires
several narrowing steps.  Does the clang documentation give any
hints on how to "efficiently" use __builtin_convertvector for
packing/unpacking without exposing too much of the target architecture?

But yes, for unpacking you'd use a series of vec_unpack_*_lo_expr
with padded input (padded with "don't care" if we had that; on
RTL we'd use a paradoxical subreg, and on GIMPLE we _might_ consider
allowing a VCE of different size).  Or we could simply allow half-size
input operands to vec_unpack_*_lo where that expands to paradoxical
subregs (a bit difficult for the optab query, I guess).

For packing you'd use a series of vec_pack_* on the argument split
into two halves via BIT_FIELD_REF.

What does clang do for testcases that request promotion/demotion?

> Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

do_vec_conversion needs a comment.  Overall the patch (with its
existing features) looks OK to me.

As for Marc's comments, I agree that vector lowering happens quite late.
It might for example be useful to lower before vectorization (or
any loop optimization) so that unhandled generic vector code can
eventually be vectorized differently.  But that's something to investigate
for GCC 10.

Giving FE maintainers a chance to comment, so no overall ACK yet.

Thanks,
Richard.

> 2019-01-03  Jakub Jelinek  <jakub@redhat.com>
> 
> 	PR c++/85052
> 	* tree-vect-generic.c (expand_vector_piecewise): Add defaulted
> 	ret_type argument, if non-NULL, use that in preference to type
> 	for the result type.
> 	(expand_vector_parallel): Formatting fix.
> 	(do_vec_conversion, expand_vector_conversion): New functions.
> 	(expand_vector_operations_1): Call expand_vector_conversion
> 	for VEC_CONVERT ifn calls.
> 	* internal-fn.def (VEC_CONVERT): New internal function.
> 	* internal-fn.c (expand_VEC_CONVERT): New function.
> 	* fold-const-call.c (fold_const_vec_convert): New function.
> 	(fold_const_call): Use it for CFN_VEC_CONVERT.
> c-family/
> 	* c-common.h (enum rid): Add RID_BUILTIN_CONVERTVECTOR.
> 	(c_build_vec_convert): Declare.
> 	* c-common.c (c_build_vec_convert): New function.
> c/
> 	* c-parser.c (c_parser_postfix_expression): Parse
> 	__builtin_convertvector.
> cp/
> 	* cp-tree.h (cp_build_vec_convert): Declare.
> 	* parser.c (cp_parser_postfix_expression): Parse
> 	__builtin_convertvector.
> 	* constexpr.c: Include fold-const-call.h.
> 	(cxx_eval_internal_function): Handle IFN_VEC_CONVERT.
> 	(potential_constant_expression_1): Likewise.
> 	* semantics.c (cp_build_vec_convert): New function.
> 	* pt.c (tsubst_copy_and_build): Handle CALL_EXPR to
> 	IFN_VEC_CONVERT.
> testsuite/
> 	* c-c++-common/builtin-convertvector-1.c: New test.
> 	* c-c++-common/torture/builtin-convertvector-1.c: New test.
> 	* g++.dg/ext/builtin-convertvector-1.C: New test.
> 	* g++.dg/cpp0x/constexpr-builtin4.C: New test.
> 
> --- gcc/tree-vect-generic.c.jj	2019-01-01 12:37:17.084976148 +0100
> +++ gcc/tree-vect-generic.c	2019-01-02 17:51:28.012876543 +0100
> @@ -267,7 +267,8 @@ do_negate (gimple_stmt_iterator *gsi, tr
>  static tree
>  expand_vector_piecewise (gimple_stmt_iterator *gsi, elem_op_func f,
>  			 tree type, tree inner_type,
> -			 tree a, tree b, enum tree_code code)
> +			 tree a, tree b, enum tree_code code,
> +			 tree ret_type = NULL_TREE)
>  {
>    vec<constructor_elt, va_gc> *v;
>    tree part_width = TYPE_SIZE (inner_type);
> @@ -278,23 +279,27 @@ expand_vector_piecewise (gimple_stmt_ite
>    int i;
>    location_t loc = gimple_location (gsi_stmt (*gsi));
>  
> -  if (types_compatible_p (gimple_expr_type (gsi_stmt (*gsi)), type))
> +  if (ret_type
> +      || types_compatible_p (gimple_expr_type (gsi_stmt (*gsi)), type))
>      warning_at (loc, OPT_Wvector_operation_performance,
>  		"vector operation will be expanded piecewise");
>    else
>      warning_at (loc, OPT_Wvector_operation_performance,
>  		"vector operation will be expanded in parallel");
>  
> +  if (!ret_type)
> +    ret_type = type;
>    vec_alloc (v, (nunits + delta - 1) / delta);
>    for (i = 0; i < nunits;
>         i += delta, index = int_const_binop (PLUS_EXPR, index, part_width))
>      {
> -      tree result = f (gsi, inner_type, a, b, index, part_width, code, type);
> +      tree result = f (gsi, inner_type, a, b, index, part_width, code,
> +		       ret_type);
>        constructor_elt ce = {NULL_TREE, result};
>        v->quick_push (ce);
>      }
>  
> -  return build_constructor (type, v);
> +  return build_constructor (ret_type, v);
>  }
>  
>  /* Expand a vector operation to scalars with the freedom to use
> @@ -302,8 +307,7 @@ expand_vector_piecewise (gimple_stmt_ite
>     in the vector type.  */
>  static tree
>  expand_vector_parallel (gimple_stmt_iterator *gsi, elem_op_func f, tree type,
> -			tree a, tree b,
> -			enum tree_code code)
> +			tree a, tree b, enum tree_code code)
>  {
>    tree result, compute_type;
>    int n_words = tree_to_uhwi (TYPE_SIZE_UNIT (type)) / UNITS_PER_WORD;
> @@ -1547,6 +1551,147 @@ expand_vector_scalar_condition (gimple_s
>    update_stmt (gsi_stmt (*gsi));
>  }
>  
> +static tree
> +do_vec_conversion (gimple_stmt_iterator *gsi, tree inner_type, tree a,
> +		   tree decl, tree bitpos, tree bitsize,
> +		   enum tree_code code, tree type)
> +{
> +  a = tree_vec_extract (gsi, inner_type, a, bitsize, bitpos);
> +  if (!VECTOR_TYPE_P (inner_type))
> +    return gimplify_build1 (gsi, code, TREE_TYPE (type), a);
> +  if (code == CALL_EXPR)
> +    {
> +      gimple *g = gimple_build_call (decl, 1, a);
> +      tree lhs = make_ssa_name (TREE_TYPE (TREE_TYPE (decl)));
> +      gimple_call_set_lhs (g, lhs);
> +      gsi_insert_before (gsi, g, GSI_SAME_STMT);
> +      return lhs;
> +    }
> +  else
> +    {
> +      tree outer_type = build_vector_type (TREE_TYPE (type),
> +					   TYPE_VECTOR_SUBPARTS (inner_type));
> +      return gimplify_build1 (gsi, code, outer_type, a);
> +    }
> +}
> +
> +/* Expand VEC_CONVERT ifn call.  */
> +
> +static void
> +expand_vector_conversion (gimple_stmt_iterator *gsi)
> +{
> +  gimple *stmt = gsi_stmt (*gsi);
> +  gimple *g;
> +  tree lhs = gimple_call_lhs (stmt);
> +  tree arg = gimple_call_arg (stmt, 0);
> +  tree decl = NULL_TREE;
> +  tree ret_type = TREE_TYPE (lhs);
> +  tree arg_type = TREE_TYPE (arg);
> +  tree new_rhs, compute_type = TREE_TYPE (arg_type);
> +  enum tree_code code = NOP_EXPR;
> +  enum tree_code code1 = ERROR_MARK;
> +  enum { NARROW, NONE, WIDEN } modifier = NONE;
> +  optab optab1 = unknown_optab;
> +
> +  gcc_checking_assert (VECTOR_TYPE_P (ret_type) && VECTOR_TYPE_P (arg_type));
> +  gcc_checking_assert (tree_fits_uhwi_p (TYPE_SIZE (TREE_TYPE (ret_type))));
> +  gcc_checking_assert (tree_fits_uhwi_p (TYPE_SIZE (TREE_TYPE (arg_type))));
> +  if (INTEGRAL_TYPE_P (TREE_TYPE (ret_type))
> +      && SCALAR_FLOAT_TYPE_P (TREE_TYPE (arg_type)))
> +    code = FIX_TRUNC_EXPR;
> +  else if (INTEGRAL_TYPE_P (TREE_TYPE (arg_type))
> +	   && SCALAR_FLOAT_TYPE_P (TREE_TYPE (ret_type)))
> +    code = FLOAT_EXPR;
> +  if (tree_to_uhwi (TYPE_SIZE (TREE_TYPE (ret_type)))
> +      < tree_to_uhwi (TYPE_SIZE (TREE_TYPE (arg_type))))
> +    modifier = NARROW;
> +  else if (tree_to_uhwi (TYPE_SIZE (TREE_TYPE (ret_type)))
> +	   > tree_to_uhwi (TYPE_SIZE (TREE_TYPE (arg_type))))
> +    modifier = WIDEN;
> +
> +  if (modifier == NONE && (code == FIX_TRUNC_EXPR || code == FLOAT_EXPR))
> +    {
> +      if (supportable_convert_operation (code, ret_type, arg_type, &decl,
> +					 &code1))
> +	{
> +	  if (code1 == CALL_EXPR)
> +	    {
> +	      g = gimple_build_call (decl, 1, arg);
> +	      gimple_call_set_lhs (g, lhs);
> +	    }
> +	  else
> +	    g = gimple_build_assign (lhs, code1, arg);
> +	  gsi_replace (gsi, g, false);
> +	  return;
> +	}
> +      /* Can't use get_compute_type here, as supportable_convert_operation
> +	 doesn't necessarily use an optab and needs two arguments.  */
> +      tree vector_compute_type
> +	= type_for_widest_vector_mode (TREE_TYPE (arg_type), mov_optab);
> +      unsigned HOST_WIDE_INT nelts;
> +      if (vector_compute_type
> +	  && VECTOR_MODE_P (TYPE_MODE (vector_compute_type))
> +	  && subparts_gt (arg_type, vector_compute_type)
> +	  && TYPE_VECTOR_SUBPARTS (vector_compute_type).is_constant (&nelts))
> +	{
> +	  while (nelts > 1)
> +	    {
> +	      tree ret1_type = build_vector_type (TREE_TYPE (ret_type), nelts);
> +	      tree arg1_type = build_vector_type (TREE_TYPE (arg_type), nelts);
> +	      if (supportable_convert_operation (code, ret1_type, arg1_type,
> +						 &decl, &code1))
> +		{
> +		  new_rhs = expand_vector_piecewise (gsi, do_vec_conversion,
> +						     ret_type, arg1_type, arg,
> +						     decl, code1);
> +		  g = gimple_build_assign (lhs, new_rhs);
> +		  gsi_replace (gsi, g, false);
> +		  return;
> +		}
> +	      nelts = nelts / 2;
> +	    }
> +	}
> +    }
> +  /* FIXME: __builtin_convertvector argument and return vectors have the same
> +     number of elements, so for both narrowing and widening we need to figure
> +     out what is the best set of optabs to use.  E.g. for NARROW
> +     VEC_PACK_TRUNC_EXPR has 2 arguments, shall we prefer emitting that with
> +     one argument of arg and another argument all zeros and extract first
> +     half of the resulting vector, or extract lo and hi halves of the arg
> +     vector and use VEC_PACK_TRUNC_EXPR on those?  */
> +  else if (0 && modifier == NARROW)
> +    {
> +      switch (code)
> +	{
> +	case NOP_EXPR:
> +	  code1 = VEC_PACK_TRUNC_EXPR;
> +	  optab1 = optab_for_tree_code (code1, arg_type, optab_default);
> +	  break;
> +	case FIX_TRUNC_EXPR:
> +	  code1 = VEC_PACK_FIX_TRUNC_EXPR;
> +	  /* The signedness is determined from output operand.  */
> +	  optab1 = optab_for_tree_code (code1, ret_type, optab_default);
> +	  break;
> +	case FLOAT_EXPR:
> +	  code1 = VEC_PACK_FLOAT_EXPR;
> +	  optab1 = optab_for_tree_code (code1, arg_type, optab_default);
> +	  break;
> +	default:
> +	  gcc_unreachable ();
> +	}
> +
> +      if (optab1)
> +	compute_type = get_compute_type (code1, optab1, arg_type);
> +      (void) compute_type;
> +    }
> +
> +  new_rhs = expand_vector_piecewise (gsi, do_vec_conversion, arg_type,
> +				     TREE_TYPE (arg_type), arg,
> +				     NULL_TREE, code, ret_type);
> +  g = gimple_build_assign (lhs, new_rhs);
> +  gsi_replace (gsi, g, false);
> +}
> +
>  /* Process one statement.  If we identify a vector operation, expand it.  */
>  
>  static void
> @@ -1561,7 +1706,11 @@ expand_vector_operations_1 (gimple_stmt_
>    /* Only consider code == GIMPLE_ASSIGN. */
>    gassign *stmt = dyn_cast <gassign *> (gsi_stmt (*gsi));
>    if (!stmt)
> -    return;
> +    {
> +      if (gimple_call_internal_p (gsi_stmt (*gsi), IFN_VEC_CONVERT))
> +	expand_vector_conversion (gsi);
> +      return;
> +    }
>  
>    code = gimple_assign_rhs_code (stmt);
>    rhs_class = get_gimple_rhs_class (code);
> --- gcc/internal-fn.def.jj	2019-01-01 12:37:17.893962875 +0100
> +++ gcc/internal-fn.def	2019-01-02 11:24:24.307681792 +0100
> @@ -296,6 +296,7 @@ DEF_INTERNAL_FN (SUB_OVERFLOW, ECF_CONST
>  DEF_INTERNAL_FN (MUL_OVERFLOW, ECF_CONST | ECF_LEAF | ECF_NOTHROW, NULL)
>  DEF_INTERNAL_FN (TSAN_FUNC_EXIT, ECF_NOVOPS | ECF_LEAF | ECF_NOTHROW, NULL)
>  DEF_INTERNAL_FN (VA_ARG, ECF_NOTHROW | ECF_LEAF, NULL)
> +DEF_INTERNAL_FN (VEC_CONVERT, ECF_CONST | ECF_LEAF | ECF_NOTHROW, NULL)
>  
>  /* An unduplicable, uncombinable function.  Generally used to preserve
>     a CFG property in the face of jump threading, tail merging or
> --- gcc/internal-fn.c.jj	2019-01-01 12:37:19.567935410 +0100
> +++ gcc/internal-fn.c	2019-01-02 11:24:24.315681661 +0100
> @@ -2581,6 +2581,15 @@ expand_VA_ARG (internal_fn, gcall *)
>    gcc_unreachable ();
>  }
>  
> +/* IFN_VEC_CONVERT is supposed to be expanded at pass_lower_vector.  So this
> +   dummy function should never be called.  */
> +
> +static void
> +expand_VEC_CONVERT (internal_fn, gcall *)
> +{
> +  gcc_unreachable ();
> +}
> +
>  /* Expand the IFN_UNIQUE function according to its first argument.  */
>  
>  static void
> --- gcc/fold-const-call.c.jj	2019-01-01 12:37:16.528985271 +0100
> +++ gcc/fold-const-call.c	2019-01-02 15:57:36.656449175 +0100
> @@ -30,6 +30,7 @@ along with GCC; see the file COPYING3.
>  #include "tm.h" /* For C[LT]Z_DEFINED_AT_ZERO.  */
>  #include "builtins.h"
>  #include "gimple-expr.h"
> +#include "tree-vector-builder.h"
>  
>  /* Functions that test for certain constant types, abstracting away the
>     decision about whether to check for overflow.  */
> @@ -645,6 +646,40 @@ fold_const_reduction (tree type, tree ar
>    return res;
>  }
>  
> +/* Fold a call to IFN_VEC_CONVERT (ARG) returning TYPE.  */
> +
> +static tree
> +fold_const_vec_convert (tree ret_type, tree arg)
> +{
> +  enum tree_code code = NOP_EXPR;
> +  tree arg_type = TREE_TYPE (arg);
> +  if (TREE_CODE (arg) != VECTOR_CST)
> +    return NULL_TREE;
> +
> +  gcc_checking_assert (VECTOR_TYPE_P (ret_type) && VECTOR_TYPE_P (arg_type));
> +
> +  if (INTEGRAL_TYPE_P (TREE_TYPE (ret_type))
> +      && SCALAR_FLOAT_TYPE_P (TREE_TYPE (arg_type)))
> +    code = FIX_TRUNC_EXPR;
> +  else if (INTEGRAL_TYPE_P (TREE_TYPE (arg_type))
> +	   && SCALAR_FLOAT_TYPE_P (TREE_TYPE (ret_type)))
> +    code = FLOAT_EXPR;
> +
> +  tree_vector_builder elts;
> +  elts.new_unary_operation (ret_type, arg, true);
> +  unsigned int count = elts.encoded_nelts ();
> +  for (unsigned int i = 0; i < count; ++i)
> +    {
> +      tree elt = fold_unary (code, TREE_TYPE (ret_type),
> +			     VECTOR_CST_ELT (arg, i));
> +      if (elt == NULL_TREE || !CONSTANT_CLASS_P (elt))
> +	return NULL_TREE;
> +      elts.quick_push (elt);
> +    }
> +
> +  return elts.build ();
> +}
> +
>  /* Try to evaluate:
>  
>        *RESULT = FN (*ARG)
> @@ -1232,6 +1267,9 @@ fold_const_call (combined_fn fn, tree ty
>      case CFN_REDUC_XOR:
>        return fold_const_reduction (type, arg, BIT_XOR_EXPR);
>  
> +    case CFN_VEC_CONVERT:
> +      return fold_const_vec_convert (type, arg);
> +
>      default:
>        return fold_const_call_1 (fn, type, arg);
>      }
> --- gcc/c-family/c-common.h.jj	2019-01-01 12:37:51.309414610 +0100
> +++ gcc/c-family/c-common.h	2019-01-02 11:24:24.314681677 +0100
> @@ -102,7 +102,7 @@ enum rid
>    RID_ASM,       RID_TYPEOF,   RID_ALIGNOF,  RID_ATTRIBUTE,  RID_VA_ARG,
>    RID_EXTENSION, RID_IMAGPART, RID_REALPART, RID_LABEL,      RID_CHOOSE_EXPR,
>    RID_TYPES_COMPATIBLE_P,      RID_BUILTIN_COMPLEX,	     RID_BUILTIN_SHUFFLE,
> -  RID_BUILTIN_TGMATH,
> +  RID_BUILTIN_CONVERTVECTOR,   RID_BUILTIN_TGMATH,
>    RID_BUILTIN_HAS_ATTRIBUTE,
>    RID_DFLOAT32, RID_DFLOAT64, RID_DFLOAT128,
>  
> @@ -1001,6 +1001,7 @@ extern bool lvalue_p (const_tree);
>  extern bool vector_targets_convertible_p (const_tree t1, const_tree t2);
>  extern bool vector_types_convertible_p (const_tree t1, const_tree t2, bool emit_lax_note);
>  extern tree c_build_vec_perm_expr (location_t, tree, tree, tree, bool = true);
> +extern tree c_build_vec_convert (location_t, tree, location_t, tree, bool = true);
>  
>  extern void init_c_lex (void);
>  
> --- gcc/c-family/c-common.c.jj	2019-01-01 12:37:51.366413675 +0100
> +++ gcc/c-family/c-common.c	2019-01-02 11:24:24.314681677 +0100
> @@ -376,6 +376,7 @@ const struct c_common_resword c_common_r
>      RID_BUILTIN_CALL_WITH_STATIC_CHAIN, D_CONLY },
>    { "__builtin_choose_expr", RID_CHOOSE_EXPR, D_CONLY },
>    { "__builtin_complex", RID_BUILTIN_COMPLEX, D_CONLY },
> +  { "__builtin_convertvector", RID_BUILTIN_CONVERTVECTOR, 0 },
>    { "__builtin_has_attribute", RID_BUILTIN_HAS_ATTRIBUTE, 0 },
>    { "__builtin_launder", RID_BUILTIN_LAUNDER, D_CXXONLY },
>    { "__builtin_shuffle", RID_BUILTIN_SHUFFLE, 0 },
> @@ -1070,6 +1071,70 @@ c_build_vec_perm_expr (location_t loc, t
>      ret = c_wrap_maybe_const (ret, true);
>  
>    return ret;
> +}
> +
> +/* Build a VEC_CONVERT ifn for __builtin_convertvector builtin.  */
> +
> +tree
> +c_build_vec_convert (location_t loc1, tree expr, location_t loc2, tree type,
> +		     bool complain)
> +{
> +  if (error_operand_p (type))
> +    return error_mark_node;
> +  if (error_operand_p (expr))
> +    return error_mark_node;
> +
> +  if (!VECTOR_INTEGER_TYPE_P (TREE_TYPE (expr))
> +      && !VECTOR_FLOAT_TYPE_P (TREE_TYPE (expr)))
> +    {
> +      if (complain)
> +	error_at (loc1, "%<__builtin_convertvector%> first argument must "
> +			"be an integer or floating vector");
> +      return error_mark_node;
> +    }
> +
> +  if (!VECTOR_INTEGER_TYPE_P (type) && !VECTOR_FLOAT_TYPE_P (type))
> +    {
> +      if (complain)
> +	error_at (loc2, "%<__builtin_convertvector%> second argument must "
> +			"be an integer or floating vector type");
> +      return error_mark_node;
> +    }
> +
> +  if (maybe_ne (TYPE_VECTOR_SUBPARTS (TREE_TYPE (expr)),
> +		TYPE_VECTOR_SUBPARTS (type)))
> +    {
> +      if (complain)
> +	error_at (loc1, "%<__builtin_convertvector%> number of elements "
> +			"of the first argument vector and the second argument "
> +			"vector type should be the same");
> +      return error_mark_node;
> +    }
> +
> +  if ((TYPE_MAIN_VARIANT (TREE_TYPE (TREE_TYPE (expr)))
> +       == TYPE_MAIN_VARIANT (TREE_TYPE (type)))
> +      || (VECTOR_INTEGER_TYPE_P (TREE_TYPE (expr))
> +	  && VECTOR_INTEGER_TYPE_P (type)
> +	  && (TYPE_PRECISION (TREE_TYPE (TREE_TYPE (expr)))
> +	      == TYPE_PRECISION (TREE_TYPE (type)))))
> +    return build1_loc (loc1, VIEW_CONVERT_EXPR, type, expr);
> +
> +  bool wrap = true;
> +  bool maybe_const = false;
> +  tree ret;
> +  if (!c_dialect_cxx ())
> +    {
> +      /* Avoid C_MAYBE_CONST_EXPRs inside of VEC_CONVERT argument.  */
> +      expr = c_fully_fold (expr, false, &maybe_const);
> +      wrap &= maybe_const;
> +    }
> +
> +  ret = build_call_expr_internal_loc (loc1, IFN_VEC_CONVERT, type, 1, expr);
> +
> +  if (!wrap)
> +    ret = c_wrap_maybe_const (ret, true);
> +
> +  return ret;
>  }
>  
>  /* Like tree.c:get_narrower, but retain conversion from C++0x scoped enum
> --- gcc/c/c-parser.c.jj	2019-01-01 12:37:48.677457794 +0100
> +++ gcc/c/c-parser.c	2019-01-02 11:24:24.312681710 +0100
> @@ -8038,6 +8038,7 @@ enum tgmath_parm_kind
>       __builtin_shuffle ( assignment-expression ,
>  			 assignment-expression ,
>  			 assignment-expression, )
> +     __builtin_convertvector ( assignment-expression , type-name )
>  
>     offsetof-member-designator:
>       identifier
> @@ -9113,17 +9114,14 @@ c_parser_postfix_expression (c_parser *p
>  	      *p = convert_lvalue_to_rvalue (loc, *p, true, true);
>  
>  	    if (vec_safe_length (cexpr_list) == 2)
> -	      expr.value =
> -		c_build_vec_perm_expr
> -		  (loc, (*cexpr_list)[0].value,
> -		   NULL_TREE, (*cexpr_list)[1].value);
> +	      expr.value = c_build_vec_perm_expr (loc, (*cexpr_list)[0].value,
> +						  NULL_TREE,
> +						  (*cexpr_list)[1].value);
>  
>  	    else if (vec_safe_length (cexpr_list) == 3)
> -	      expr.value =
> -		c_build_vec_perm_expr
> -		  (loc, (*cexpr_list)[0].value,
> -		   (*cexpr_list)[1].value,
> -		   (*cexpr_list)[2].value);
> +	      expr.value = c_build_vec_perm_expr (loc, (*cexpr_list)[0].value,
> +						  (*cexpr_list)[1].value,
> +						  (*cexpr_list)[2].value);
>  	    else
>  	      {
>  		error_at (loc, "wrong number of arguments to "
> @@ -9133,6 +9131,41 @@ c_parser_postfix_expression (c_parser *p
>  	    set_c_expr_source_range (&expr, loc, close_paren_loc);
>  	    break;
>  	  }
> +	case RID_BUILTIN_CONVERTVECTOR:
> +	  {
> +	    location_t start_loc = loc;
> +	    c_parser_consume_token (parser);
> +	    matching_parens parens;
> +	    if (!parens.require_open (parser))
> +	      {
> +		expr.set_error ();
> +		break;
> +	      }
> +	    e1 = c_parser_expr_no_commas (parser, NULL);
> +	    mark_exp_read (e1.value);
> +	    if (!c_parser_require (parser, CPP_COMMA, "expected %<,%>"))
> +	      {
> +		c_parser_skip_until_found (parser, CPP_CLOSE_PAREN, NULL);
> +		expr.set_error ();
> +		break;
> +	      }
> +	    loc = c_parser_peek_token (parser)->location;
> +	    t1 = c_parser_type_name (parser);
> +	    location_t end_loc = c_parser_peek_token (parser)->get_finish ();
> +	    c_parser_skip_until_found (parser, CPP_CLOSE_PAREN,
> +				       "expected %<)%>");
> +	    if (t1 == NULL)
> +	      expr.set_error ();
> +	    else
> +	      {
> +		tree type_expr = NULL_TREE;
> +		expr.value = c_build_vec_convert (start_loc, e1.value, loc,
> +						  groktypename (t1, &type_expr,
> +								NULL));
> +		set_c_expr_source_range (&expr, start_loc, end_loc);
> +	      }
> +	  }
> +	  break;
>  	case RID_AT_SELECTOR:
>  	  {
>  	    gcc_assert (c_dialect_objc ());
> --- gcc/cp/cp-tree.h.jj	2019-01-01 12:37:46.884487212 +0100
> +++ gcc/cp/cp-tree.h	2019-01-02 16:43:35.480393140 +0100
> @@ -7142,6 +7142,8 @@ extern bool is_lambda_ignored_entity
>  extern bool lambda_static_thunk_p		(tree);
>  extern tree finish_builtin_launder		(location_t, tree,
>  						 tsubst_flags_t);
> +extern tree cp_build_vec_convert		(tree, location_t, tree,
> +						 tsubst_flags_t);
>  extern void start_lambda_scope			(tree);
>  extern void record_lambda_scope			(tree);
>  extern void record_null_lambda_scope		(tree);
> --- gcc/cp/parser.c.jj	2019-01-01 12:37:47.352479534 +0100
> +++ gcc/cp/parser.c	2019-01-02 16:19:44.765760167 +0100
> @@ -7031,6 +7031,32 @@ cp_parser_postfix_expression (cp_parser
>  	break;
>        }
>  
> +    case RID_BUILTIN_CONVERTVECTOR:
> +      {
> +	tree expression;
> +	tree type;
> +	/* Consume the `__builtin_convertvector' token.  */
> +	cp_lexer_consume_token (parser->lexer);
> +	/* Look for the opening `('.  */
> +	matching_parens parens;
> +	parens.require_open (parser);
> +	/* Now, parse the assignment-expression.  */
> +	expression = cp_parser_assignment_expression (parser);
> +	/* Look for the `,'.  */
> +	cp_parser_require (parser, CPP_COMMA, RT_COMMA);
> +	location_t type_location
> +	  = cp_lexer_peek_token (parser->lexer)->location;
> +	/* Parse the type-id.  */
> +	{
> +	  type_id_in_expr_sentinel s (parser);
> +	  type = cp_parser_type_id (parser);
> +	}
> +	/* Look for the closing `)'.  */
> +	parens.require_close (parser);
> +	return cp_build_vec_convert (expression, type_location, type,
> +				     tf_warning_or_error);
> +      }
> +
>      default:
>        {
>  	tree type;
> --- gcc/cp/constexpr.c.jj	2019-01-01 12:37:47.282480682 +0100
> +++ gcc/cp/constexpr.c	2019-01-02 16:56:54.126359632 +0100
> @@ -33,6 +33,7 @@ along with GCC; see the file COPYING3.
>  #include "ubsan.h"
>  #include "gimple-fold.h"
>  #include "timevar.h"
> +#include "fold-const-call.h"
>  
>  static bool verify_constant (tree, bool, bool *, bool *);
>  #define VERIFY_CONSTANT(X)						\
> @@ -1449,6 +1450,20 @@ cxx_eval_internal_function (const conste
>        return cxx_eval_constant_expression (ctx, CALL_EXPR_ARG (t, 0),
>  					   false, non_constant_p, overflow_p);
>  
> +    case IFN_VEC_CONVERT:
> +      {
> +	tree arg = cxx_eval_constant_expression (ctx, CALL_EXPR_ARG (t, 0),
> +						 false, non_constant_p,
> +						 overflow_p);
> +	if (TREE_CODE (arg) == VECTOR_CST)
> +	  return fold_const_call (CFN_VEC_CONVERT, TREE_TYPE (t), arg);
> +	else
> +	  {
> +	    *non_constant_p = true;
> +	    return t;
> +	  }
> +      }
> +
>      default:
>        if (!ctx->quiet)
>  	error_at (cp_expr_loc_or_loc (t, input_location),
> @@ -5623,7 +5638,9 @@ potential_constant_expression_1 (tree t,
>  		case IFN_SUB_OVERFLOW:
>  		case IFN_MUL_OVERFLOW:
>  		case IFN_LAUNDER:
> +		case IFN_VEC_CONVERT:
>  		  bail = false;
> +		  break;
>  
>  		default:
>  		  break;
> --- gcc/cp/semantics.c.jj	2019-01-01 12:37:46.976485703 +0100
> +++ gcc/cp/semantics.c	2019-01-02 18:15:42.844133048 +0100
> @@ -9933,4 +9933,26 @@ finish_builtin_launder (location_t loc,
>  				       TREE_TYPE (arg), 1, arg);
>  }
>  
> +/* Finish __builtin_convertvector (arg, type).  */
> +
> +tree
> +cp_build_vec_convert (tree arg, location_t loc, tree type,
> +		      tsubst_flags_t complain)
> +{
> +  if (error_operand_p (type))
> +    return error_mark_node;
> +  if (error_operand_p (arg))
> +    return error_mark_node;
> +
> +  tree ret = NULL_TREE;
> +  if (!type_dependent_expression_p (arg) && !dependent_type_p (type))
> +    ret = c_build_vec_convert (cp_expr_loc_or_loc (arg, input_location), arg,
> +			       loc, type, (complain & tf_error) != 0);
> +
> +  if (!processing_template_decl)
> +    return ret;
> +
> +  return build_call_expr_internal_loc (loc, IFN_VEC_CONVERT, type, 1, arg);
> +}
> +
>  #include "gt-cp-semantics.h"
> --- gcc/cp/pt.c.jj	2019-01-01 12:37:47.081483980 +0100
> +++ gcc/cp/pt.c	2019-01-02 18:25:17.997778249 +0100
> @@ -18813,6 +18813,27 @@ tsubst_copy_and_build (tree t,
>  					      (*call_args)[0], complain);
>  	      break;
>  
> +	    case IFN_VEC_CONVERT:
> +	      gcc_assert (nargs == 1);
> +	      if (vec_safe_length (call_args) != 1)
> +		{
> +		  error_at (cp_expr_loc_or_loc (t, input_location),
> +			    "wrong number of arguments to "
> +			    "%<__builtin_convertvector%>");
> +		  ret = error_mark_node;
> +		  break;
> +		}
> +	      ret = cp_build_vec_convert ((*call_args)[0], input_location,
> +					  tsubst (TREE_TYPE (t), args,
> +						  complain, in_decl),
> +					  complain);
> +	      if (TREE_CODE (ret) == VIEW_CONVERT_EXPR)
> +		{
> +		  release_tree_vector (call_args);
> +		  RETURN (ret);
> +		}
> +	      break;
> +
>  	    default:
>  	      /* Unsupported internal function with arguments.  */
>  	      gcc_unreachable ();
> --- gcc/testsuite/c-c++-common/builtin-convertvector-1.c.jj	2019-01-02 18:38:18.265090910 +0100
> +++ gcc/testsuite/c-c++-common/builtin-convertvector-1.c	2019-01-02 18:37:50.337544972 +0100
> @@ -0,0 +1,15 @@
> +typedef int v8si __attribute__((vector_size (8 * sizeof (int))));
> +typedef long long v4di __attribute__((vector_size (4 * sizeof (long long))));
> +
> +void
> +foo (v8si *x, v4di *y, int z)
> +{
> +  __builtin_convertvector (*y, v8si);	/* { dg-error "number of elements of the first argument vector and the second argument vector type should be the same" } */
> +  __builtin_convertvector (*x, v4di);	/* { dg-error "number of elements of the first argument vector and the second argument vector type should be the same" } */
> +  __builtin_convertvector (*x, int);	/* { dg-error "second argument must be an integer or floating vector type" } */
> +  __builtin_convertvector (z, v4di);	/* { dg-error "first argument must be an integer or floating vector" } */
> +  __builtin_convertvector ();		/* { dg-error "expected" } */
> +  __builtin_convertvector (*x);		/* { dg-error "expected" } */
> +  __builtin_convertvector (*x, *y);	/* { dg-error "expected" } */
> +  __builtin_convertvector (*x, v8si, 1);/* { dg-error "expected" } */
> +}
> --- gcc/testsuite/c-c++-common/torture/builtin-convertvector-1.c.jj	2019-01-02 18:00:59.982534637 +0100
> +++ gcc/testsuite/c-c++-common/torture/builtin-convertvector-1.c	2019-01-02 18:00:32.871977360 +0100
> @@ -0,0 +1,131 @@
> +extern
> +#ifdef __cplusplus
> +"C"
> +#endif
> +void abort (void);
> +typedef int v4si __attribute__((vector_size (4 * sizeof (int))));
> +typedef unsigned int v4usi __attribute__((vector_size (4 * sizeof (unsigned int))));
> +typedef float v4sf __attribute__((vector_size (4 * sizeof (float))));
> +typedef double v4df __attribute__((vector_size (4 * sizeof (double))));
> +typedef long long v256di __attribute__((vector_size (256 * sizeof (long long))));
> +typedef double v256df __attribute__((vector_size (256 * sizeof (double))));
> +
> +void
> +f1 (v4usi *x, v4si *y)
> +{
> +  *y = __builtin_convertvector (*x, v4si);
> +}
> +
> +void
> +f2 (v4sf *x, v4si *y)
> +{
> +  *y = __builtin_convertvector (*x, v4si);
> +}
> +
> +void
> +f3 (v4si *x, v4sf *y)
> +{
> +  *y = __builtin_convertvector (*x, v4sf);
> +}
> +
> +void
> +f4 (v4df *x, v4si *y)
> +{
> +  *y = __builtin_convertvector (*x, v4si);
> +}
> +
> +void
> +f5 (v4si *x, v4df *y)
> +{
> +  *y = __builtin_convertvector (*x, v4df);
> +}
> +
> +void
> +f6 (v256df *x, v256di *y)
> +{
> +  *y = __builtin_convertvector (*x, v256di);
> +}
> +
> +void
> +f7 (v256di *x, v256df *y)
> +{
> +  *y = __builtin_convertvector (*x, v256df);
> +}
> +
> +void
> +f8 (v4df *x)
> +{
> +  v4si a = { 1, 2, -3, -4 };
> +  *x = __builtin_convertvector (a, v4df);
> +}
> +
> +int
> +main ()
> +{
> +  union U1 { v4si v; int a[4]; } u1;
> +  union U2 { v4usi v; unsigned int a[4]; } u2;
> +  union U3 { v4sf v; float a[4]; } u3;
> +  union U4 { v4df v; double a[4]; } u4;
> +  union U5 { v256di v; long long a[256]; } u5;
> +  union U6 { v256df v; double a[256]; } u6;
> +  int i;
> +  for (i = 0; i < 4; i++)
> +    u2.a[i] = i * 2;
> +  f1 (&u2.v, &u1.v);
> +  for (i = 0; i < 4; i++)
> +    if (u1.a[i] != i * 2)
> +      abort ();
> +    else
> +      u3.a[i] = i - 2.25f;
> +  f2 (&u3.v, &u1.v);
> +  for (i = 0; i < 4; i++)
> +    if (u1.a[i] != (i == 3 ? 0 : i - 2))
> +      abort ();
> +    else
> +      u3.a[i] = i + 0.75f;
> +  f2 (&u3.v, &u1.v);
> +  for (i = 0; i < 4; i++)
> +    if (u1.a[i] != i)
> +      abort ();
> +    else
> +      u1.a[i] = 7 * i - 5;
> +  f3 (&u1.v, &u3.v);
> +  for (i = 0; i < 4; i++)
> +    if (u3.a[i] != 7 * i - 5)
> +      abort ();
> +    else
> +      u4.a[i] = i - 2.25;
> +  f4 (&u4.v, &u1.v);
> +  for (i = 0; i < 4; i++)
> +    if (u1.a[i] != (i == 3 ? 0 : i - 2))
> +      abort ();
> +    else
> +      u4.a[i] = i + 0.75;
> +  f4 (&u4.v, &u1.v);
> +  for (i = 0; i < 4; i++)
> +    if (u1.a[i] != i)
> +      abort ();
> +    else
> +      u1.a[i] = 7 * i - 5;
> +  f5 (&u1.v, &u4.v);
> +  for (i = 0; i < 4; i++)
> +    if (u4.a[i] != 7 * i - 5)
> +      abort ();
> +  for (i = 0; i < 256; i++)
> +    u6.a[i] = i - 128.25;
> +  f6 (&u6.v, &u5.v);
> +  for (i = 0; i < 256; i++)
> +    if (u5.a[i] != i - 128 - (i > 128))
> +      abort ();
> +    else
> +      u5.a[i] = i - 128;
> +  f7 (&u5.v, &u6.v);
> +  for (i = 0; i < 256; i++)
> +    if (u6.a[i] != i - 128)
> +      abort ();
> +  f8 (&u4.v);
> +  for (i = 0; i < 4; i++)
> +    if (u4.a[i] != (i >= 2 ? -1 - i : i + 1))
> +      abort ();
> +  return 0;
> +}
> --- gcc/testsuite/g++.dg/ext/builtin-convertvector-1.C.jj	2019-01-02 18:04:14.984350274 +0100
> +++ gcc/testsuite/g++.dg/ext/builtin-convertvector-1.C	2019-01-02 18:07:17.122375950 +0100
> @@ -0,0 +1,137 @@
> +// { dg-do run }
> +
> +extern "C" void abort ();
> +typedef int v4si __attribute__((vector_size (4 * sizeof (int))));
> +typedef unsigned int v4usi __attribute__((vector_size (4 * sizeof (unsigned int))));
> +typedef float v4sf __attribute__((vector_size (4 * sizeof (float))));
> +typedef double v4df __attribute__((vector_size (4 * sizeof (double))));
> +typedef long long v256di __attribute__((vector_size (256 * sizeof (long long))));
> +typedef double v256df __attribute__((vector_size (256 * sizeof (double))));
> +
> +template <int N>
> +void
> +f1 (v4usi *x, v4si *y)
> +{
> +  *y = __builtin_convertvector (*x, v4si);
> +}
> +
> +template <typename T>
> +void
> +f2 (T *x, v4si *y)
> +{
> +  *y = __builtin_convertvector (*x, v4si);
> +}
> +
> +template <typename T>
> +void
> +f3 (v4si *x, T *y)
> +{
> +  *y = __builtin_convertvector (*x, T);
> +}
> +
> +template <int N>
> +void
> +f4 (v4df *x, v4si *y)
> +{
> +  *y = __builtin_convertvector (*x, v4si);
> +}
> +
> +template <typename T, typename U>
> +void
> +f5 (T *x, U *y)
> +{
> +  *y = __builtin_convertvector (*x, U);
> +}
> +
> +template <typename T>
> +void
> +f6 (v256df *x, T *y)
> +{
> +  *y = __builtin_convertvector (*x, T);
> +}
> +
> +template <int N>
> +void
> +f7 (v256di *x, v256df *y)
> +{
> +  *y = __builtin_convertvector (*x, v256df);
> +}
> +
> +template <int N>
> +void
> +f8 (v4df *x)
> +{
> +  v4si a = { 1, 2, -3, -4 };
> +  *x = __builtin_convertvector (a, v4df);
> +}
> +
> +int
> +main ()
> +{
> +  union U1 { v4si v; int a[4]; } u1;
> +  union U2 { v4usi v; unsigned int a[4]; } u2;
> +  union U3 { v4sf v; float a[4]; } u3;
> +  union U4 { v4df v; double a[4]; } u4;
> +  union U5 { v256di v; long long a[256]; } u5;
> +  union U6 { v256df v; double a[256]; } u6;
> +  int i;
> +  for (i = 0; i < 4; i++)
> +    u2.a[i] = i * 2;
> +  f1<0> (&u2.v, &u1.v);
> +  for (i = 0; i < 4; i++)
> +    if (u1.a[i] != i * 2)
> +      abort ();
> +    else
> +      u3.a[i] = i - 2.25f;
> +  f2 (&u3.v, &u1.v);
> +  for (i = 0; i < 4; i++)
> +    if (u1.a[i] != (i == 3 ? 0 : i - 2))
> +      abort ();
> +    else
> +      u3.a[i] = i + 0.75f;
> +  f2 (&u3.v, &u1.v);
> +  for (i = 0; i < 4; i++)
> +    if (u1.a[i] != i)
> +      abort ();
> +    else
> +      u1.a[i] = 7 * i - 5;
> +  f3 (&u1.v, &u3.v);
> +  for (i = 0; i < 4; i++)
> +    if (u3.a[i] != 7 * i - 5)
> +      abort ();
> +    else
> +      u4.a[i] = i - 2.25;
> +  f4<12> (&u4.v, &u1.v);
> +  for (i = 0; i < 4; i++)
> +    if (u1.a[i] != (i == 3 ? 0 : i - 2))
> +      abort ();
> +    else
> +      u4.a[i] = i + 0.75;
> +  f4<13> (&u4.v, &u1.v);
> +  for (i = 0; i < 4; i++)
> +    if (u1.a[i] != i)
> +      abort ();
> +    else
> +      u1.a[i] = 7 * i - 5;
> +  f5 (&u1.v, &u4.v);
> +  for (i = 0; i < 4; i++)
> +    if (u4.a[i] != 7 * i - 5)
> +      abort ();
> +  for (i = 0; i < 256; i++)
> +    u6.a[i] = i - 128.25;
> +  f6 (&u6.v, &u5.v);
> +  for (i = 0; i < 256; i++)
> +    if (u5.a[i] != i - 128 - (i > 128))
> +      abort ();
> +    else
> +      u5.a[i] = i - 128;
> +  f7<-1> (&u5.v, &u6.v);
> +  for (i = 0; i < 256; i++)
> +    if (u6.a[i] != i - 128)
> +      abort ();
> +  f8<5> (&u4.v);
> +  for (i = 0; i < 4; i++)
> +    if (u4.a[i] != (i >= 2 ? -1 - i : i + 1))
> +      abort ();
> +  return 0;
> +}
> --- gcc/testsuite/g++.dg/cpp0x/constexpr-builtin4.C.jj	2019-01-02 18:39:12.767204801 +0100
> +++ gcc/testsuite/g++.dg/cpp0x/constexpr-builtin4.C	2019-01-02 18:42:30.749985890 +0100
> @@ -0,0 +1,17 @@
> +// { dg-do compile { target c++11 } }
> +// { dg-additional-options "-Wno-psabi" }
> +
> +typedef int v4si __attribute__((vector_size (4 * sizeof (int))));
> +typedef float v4sf __attribute__((vector_size (4 * sizeof (float))));
> +constexpr v4sf a = __builtin_convertvector (v4si { 1, 2, -3, -4 }, v4sf);
> +
> +constexpr v4sf
> +foo (v4si x)
> +{
> +  return __builtin_convertvector (x, v4sf);
> +}
> +
> +constexpr v4sf b = foo (v4si { 3, 4, -1, -2 });
> +
> +static_assert (a[0] == 1.0f && a[1] == 2.0f && a[2] == -3.0f && a[3] == -4.0f, "");
> +static_assert (b[0] == 3.0f && b[1] == 4.0f && b[2] == -1.0f && b[3] == -2.0f, "");
> 
> 	Jakub
> 
>
Jakub Jelinek Jan. 3, 2019, 12:11 p.m. UTC | #4
On Thu, Jan 03, 2019 at 12:16:31PM +0100, Richard Biener wrote:
> I guess it depends on target capabilities - I think
> __builtin_convertvector is a bit "misdesigned" for pack/unpack.  You
> also have to consider a v2di to v2qi conversion which requires

I'm aware of that; I know supportable_{widening,narrowing}_conversion in the
vectorizer handles those, but they are vectorizer-specific and written for
the model the vectorizer uses.  In any case, I wanted to have something
correct first (i.e. the scalar-ops fallback in there) and then improve what
I can, starting with the 2x narrowing and widening and going further only
when that works.

> several unpack steps.  Does the clang documentation given any
> hints how to "efficiently" use __builtin_convertvector for
> packing/unpacking without exposing too much of the target architecture?

The clang documentation is completely useless here.  Trying e.g.
typedef signed char v16qi __attribute__((vector_size (16 * sizeof (signed char))));
typedef int v16si __attribute__((vector_size (16 * sizeof (int))));

void
foo (v16si *x, v16qi *y)
{
  *y = __builtin_convertvector (*x, v16qi);
}

void
bar (v16qi *x, v16si *y)
{
  *y = __builtin_convertvector (*x, v16si);
}

with clang -O2 -mavx512{bw,vl,dq} shows efficient code:
        vmovdqa64       (%rdi), %zmm0
        vpmovdb %zmm0, (%rsi)
and
        vpmovsxbd       (%rdi), %zmm0
        vmovdqa64       %zmm0, (%rsi)
With -O2 -mavx2 bar is:
        vpmovsxbd       (%rdi), %ymm0
        vpmovsxbd       8(%rdi), %ymm1
        vmovdqa %ymm1, 32(%rsi)
        vmovdqa %ymm0, (%rsi)
which is what would be emitted for v8[qs]i twice, and foo is:
        vmovdqa (%rdi), %ymm0
        vmovdqa 32(%rdi), %ymm1
        vmovdqa .LCPI0_0(%rip), %ymm2   # ymm2 = [0,1,4,5,8,9,12,13,8,9,12,13,12,13,14,15,16,17,20,21,24,25,28,29,24,25,28,29,28,29,30,31]
        vpshufb %ymm2, %ymm1, %ymm1
        vpermq  $232, %ymm1, %ymm1      # ymm1 = ymm1[0,2,2,3]
        vmovdqa .LCPI0_1(%rip), %xmm3   # xmm3 = <0,2,4,6,8,10,12,14,u,u,u,u,u,u,u,u>
        vpshufb %xmm3, %xmm1, %xmm1
        vpshufb %ymm2, %ymm0, %ymm0
        vpermq  $232, %ymm0, %ymm0      # ymm0 = ymm0[0,2,2,3]
        vpshufb %xmm3, %xmm0, %xmm0
        vpunpcklqdq     %xmm1, %xmm0, %xmm0 # xmm0 = xmm0[0],xmm1[0]
        vmovdqa %xmm0, (%rsi)
which looks quite complicated to me.  I would think we could emit e.g. what
we emit for:
typedef signed char v16qi __attribute__((vector_size (16 * sizeof (signed char))));
typedef signed char v32qi __attribute__((vector_size (32 * sizeof (signed char))));

void
baz (v32qi *x, v16qi *y)
{
  v32qi z = __builtin_shuffle (x[0], x[1], (v32qi) { 0, 4, 8, 12, 16, 20, 24, 28, 32, 36, 40, 44, 48, 52, 56, 60,
						     0, 4, 8, 12, 16, 20, 24, 28, 32, 36, 40, 44, 48, 52, 56, 60 });
  v16qi u;
  __builtin_memcpy (&u, &z, sizeof (u));
  *y = u;
}
which is with gcc trunk:
        vmovdqa (%rdi), %ymm0
        vmovdqa 32(%rdi), %ymm1
        vpshufb .LC0(%rip), %ymm0, %ymm0
        vpshufb .LC1(%rip), %ymm1, %ymm1
        vpermq  $78, %ymm0, %ymm3
        vpermq  $78, %ymm1, %ymm2
        vpor    %ymm3, %ymm0, %ymm0
        vpor    %ymm2, %ymm1, %ymm1
        vpor    %ymm1, %ymm0, %ymm0
        vmovaps %xmm0, (%rsi)
although really the upper half is a don't care (so a properly implemented
__builtin_shufflevector might be handy too, with
__builtin_shufflevector (x[0], x[1], 0, 4, 8, 12, 16, 20, 24, 28, 32, 36, 40, 44, 48, 52, 56, 60,
				     -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1);
).

> > Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?
> 
> do_vec_conversion needs a comment.  Overall the patch (with its

Will add one (though do_unop/do_binop don't have one either) and add
documentation.
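
Perhaps something along these lines (my sketch based on what the function
does; the committed wording may differ):

/* Helper function of expand_vector_conversion.  Extract the piece of A of
   type INNER_TYPE at offset BITPOS / width BITSIZE and convert it: for
   scalar pieces via a CODE conversion to TYPE's element type, for vector
   pieces either via a CODE conversion to a vector of TYPE's element type
   with as many subparts as INNER_TYPE, or, when CODE is CALL_EXPR, via a
   call to the target builtin DECL.  */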

> existing features) looks OK to me.
> 
> As of Marcs comments I agree that vector lowering happens quite late.
> It might be for example useful to lower before vectorization (or
> any loop optimization) so that un-handled generic vector code can be
> eventually vectorized differently.  But that's sth to investigate for
> GCC 10.
> 
> Giving FE maintainers a chance to comment, so no overall ACK yet.

Ok.

	Jakub
Richard Sandiford Jan. 3, 2019, 1:06 p.m. UTC | #5
Jakub Jelinek <jakub@redhat.com> writes:
> +      /* Can't use get_compute_type here, as supportable_convert_operation
> +	 doesn't necessarily use an optab and needs two arguments.  */
> +      tree vector_compute_type
> +	= type_for_widest_vector_mode (TREE_TYPE (arg_type), mov_optab);
> +      unsigned HOST_WIDE_INT nelts;
> +      if (vector_compute_type
> +	  && VECTOR_MODE_P (TYPE_MODE (vector_compute_type))
> +	  && subparts_gt (arg_type, vector_compute_type)
> +	  && TYPE_VECTOR_SUBPARTS (vector_compute_type).is_constant (&nelts))
> +	{
> +	  while (nelts > 1)
> +	    {
> +	      tree ret1_type = build_vector_type (TREE_TYPE (ret_type), nelts);
> +	      tree arg1_type = build_vector_type (TREE_TYPE (arg_type), nelts);
> +	      if (supportable_convert_operation (code, ret1_type, arg1_type,
> +						 &decl, &code1))
> +		{
> +		  new_rhs = expand_vector_piecewise (gsi, do_vec_conversion,
> +						     ret_type, arg1_type, arg,
> +						     decl, code1);
> +		  g = gimple_build_assign (lhs, new_rhs);
> +		  gsi_replace (gsi, g, false);
> +		  return;
> +		}
> +	      nelts = nelts / 2;
> +	    }
> +	}

I think for this it would be better to use:

      if (vector_compute_type
	  && VECTOR_MODE_P (TYPE_MODE (vector_compute_type))
	  && subparts_gt (arg_type, vector_compute_type))
	{
	  unsigned HOST_WIDE_INT nelts = constant_lower_bound
	    (TYPE_VECTOR_SUBPARTS (vector_compute_type));

since the loop is self-checking.

E.g. this will make the Advanced SIMD handling on AArch64 the same
regardless of whether SVE is also enabled.

Thanks,
Richard
Martin Sebor Jan. 3, 2019, 5:04 p.m. UTC | #6
> +/* Build a VEC_CONVERT ifn for __builtin_convertvector builtin.  */

Can you please document the function arguments and explain how they
are used?

> +
> +tree
> +c_build_vec_convert (location_t loc1, tree expr, location_t loc2, tree type,
> +		     bool complain)
> +{
> +  if (error_operand_p (type))
> +    return error_mark_node;
> +  if (error_operand_p (expr))
> +    return error_mark_node;
> +
> +  if (!VECTOR_INTEGER_TYPE_P (TREE_TYPE (expr))
> +      && !VECTOR_FLOAT_TYPE_P (TREE_TYPE (expr)))
> +    {
> +      if (complain)
> +	error_at (loc1, "%<__builtin_convertvector%> first argument must "
> +			"be an integer or floating vector");
> +      return error_mark_node;
> +    }
> +
> +  if (!VECTOR_INTEGER_TYPE_P (type) && !VECTOR_FLOAT_TYPE_P (type))
> +    {
> +      if (complain)
> +	error_at (loc2, "%<__builtin_convertvector%> second argument must "
> +			"be an integer or floating vector type");
> +      return error_mark_node;
> +    }
> +
> +  if (maybe_ne (TYPE_VECTOR_SUBPARTS (TREE_TYPE (expr)),
> +		TYPE_VECTOR_SUBPARTS (type)))
> +    {
> +      if (complain)
> +	error_at (loc1, "%<__builtin_convertvector%> number of elements "
> +			"of the first argument vector and the second argument "
> +			"vector type should be the same");
> +      return error_mark_node;
> +    }

Just a few wording suggestions for the errors:

1) for the first two errors consider using a single message
    parameterized on the argument number, as in the sketch below, to
    reduce translation effort (both styles are in use but the more
    concise form seems preferable to me)
2) in the last error use "must" instead of "should" as in the first
    two ("must" is imperative rather than just suggestive)
3) consider simplifying the third message to "%<...%> argument
    vectors must have the same size" (or "the same number of
    elements") along the same lines as in c_build_vec_perm_expr.

> +
> +  if ((TYPE_MAIN_VARIANT (TREE_TYPE (TREE_TYPE (expr)))
> +       == TYPE_MAIN_VARIANT (TREE_TYPE (type)))
> +      || (VECTOR_INTEGER_TYPE_P (TREE_TYPE (expr))
> +	  && VECTOR_INTEGER_TYPE_P (type)
> +	  && (TYPE_PRECISION (TREE_TYPE (TREE_TYPE (expr)))
> +	      == TYPE_PRECISION (TREE_TYPE (type)))))
> +    return build1_loc (loc1, VIEW_CONVERT_EXPR, type, expr);

The conditional above is very difficult to read and, without
a comment explaining its purpose, difficult to understand.
Introducing named temporaries for the repetitive subexpressions
would help with the readability (not just here but in the rest of
the function as well).  A comment explaining what the conditional
handles would help with the latter; for example, see the sketch below.
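
Something like this, say (a sketch under my own naming, not the committed
code):

  tree expr_elt = TREE_TYPE (TREE_TYPE (expr));  /* Element type of EXPR.  */
  tree type_elt = TREE_TYPE (type);              /* Element type of TYPE.  */

  /* If the element types are the same, or both are integer types of the
     same precision (e.g. only the signedness differs), the conversion is
     just a reinterpretation of the bits.  */
  if (TYPE_MAIN_VARIANT (expr_elt) == TYPE_MAIN_VARIANT (type_elt)
      || (VECTOR_INTEGER_TYPE_P (TREE_TYPE (expr))
	  && VECTOR_INTEGER_TYPE_P (type)
	  && TYPE_PRECISION (expr_elt) == TYPE_PRECISION (type_elt)))
    return build1_loc (loc1, VIEW_CONVERT_EXPR, type, expr);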

> +
> +  bool wrap = true;
> +  bool maybe_const = false;
> +  tree ret;

Moving maybe_const to the conditional block where it's used and ret
to the point of its initialization just after that block would improve
readability.

> +  if (!c_dialect_cxx ())
> +    {
> +      /* Avoid C_MAYBE_CONST_EXPRs inside of VEC_CONVERT argument.  */
> +      expr = c_fully_fold (expr, false, &maybe_const);
> +      wrap &= maybe_const;
> +    }
> +
> +  ret = build_call_expr_internal_loc (loc1, IFN_VEC_CONVERT, type, 1, expr);
> +
> +  if (!wrap)
> +    ret = c_wrap_maybe_const (ret, true);
> +
> +  return ret;
>   }

Martin
Marc Glisse Jan. 3, 2019, 5:32 p.m. UTC | #7
On Thu, 3 Jan 2019, Jakub Jelinek wrote:

> On Thu, Jan 03, 2019 at 11:48:12AM +0100, Marc Glisse wrote:
>>> The following patch adds support for the __builtin_convertvector builtin.
>>> C casts on generic vectors are just reinterpretation of the bits (i.e. a
>>> VCE), this builtin allows to cast int/unsigned elements to float or vice
>>> versa or promote/demote them.  doc/ change is missing, will write it soon.
>>>
>>> The builtin appeared in I think clang 3.4 and is apparently in real-world
>>> use as e.g. Honza reported.  The first argument is an expression with vector
>>> type, the second argument is a vector type (similarly e.g. to va_arg), to
>>> which the first argument should be converted.  Both vector types need to
>>> have the same number of elements.
>>>
>>> I've implemented same element size (thus also whole vector size) conversions
>>> efficiently - signed to unsigned and vice versa or same vector type just
>>> using a VCE, for e.g. int <-> float or long long <-> double using
>>> appropriate optab, possibly repeated multiple times for very large vectors.
>>
>> IIUC, you only lower __builtin_convertvector to VCE or FLOAT_EXPR or
>> whatever in tree-vect-generic. That seems quite late. At least for the
>> "easy" same-size case, I think we should do it early (gimplification?),
>
> No, it must not be done at gimplification time, think about OpenMP/OpenACC
> offloading, the target before IPA optimizations might not be the target
> after them, while they have to agree on ABI issues, the optabs definitely
> can be and are different and these optabs originally added for the
> vectorizer are something that doesn't have a fallback, whatever introduces
> it into the IL is responsible for verification it is supported.

Ah, I was missing this.  And I don't see why we should keep it that way.  As 
long as the vectorizer was the only producer, it made sense not to have a 
fallback; it was not needed.  But now that we are talking about having the 
user produce it almost directly, it would make sense for it to behave like 
other vector operations (say PLUS_EXPR).

> That said, not sure if e.g. using an opaque builtin for the conversion that
> supportable_convert_operation sometimes uses is better over this ifn.
> What exact optimization opportunities you are looking for if it is lowered
> earlier?  I have the VECTOR_CST folding in place...

I don't know, any kind of optimization we currently do on scalars...  For 
conversions between integers and floats, that seems to be very limited, 
maybe combining consecutive casts in rare cases.  For sign changes, we have a 
number of transformations in match.pd that are fine with an intermediate 
cast that only changes the sign (I even introduced nop_convert to handle 
vectors at the same time).  I guess we could handle this IFN as well.  It is 
just that having 2 ways to express the same thing tends to cause code 
duplication.

On the other hand, for narrowing/widening conversions, keeping it as one 
stmt with your ifn may be more convenient to optimize than a large mess of 
VEC_UNPACK_FLOAT_HI_EXPR and friends.  Again, I am thinking more of 
match.pd-style transformations, nothing that looks at the target.

Patch

--- gcc/tree-vect-generic.c.jj	2019-01-01 12:37:17.084976148 +0100
+++ gcc/tree-vect-generic.c	2019-01-02 17:51:28.012876543 +0100
@@ -267,7 +267,8 @@  do_negate (gimple_stmt_iterator *gsi, tr
 static tree
 expand_vector_piecewise (gimple_stmt_iterator *gsi, elem_op_func f,
 			 tree type, tree inner_type,
-			 tree a, tree b, enum tree_code code)
+			 tree a, tree b, enum tree_code code,
+			 tree ret_type = NULL_TREE)
 {
   vec<constructor_elt, va_gc> *v;
   tree part_width = TYPE_SIZE (inner_type);
@@ -278,23 +279,27 @@  expand_vector_piecewise (gimple_stmt_ite
   int i;
   location_t loc = gimple_location (gsi_stmt (*gsi));
 
-  if (types_compatible_p (gimple_expr_type (gsi_stmt (*gsi)), type))
+  if (ret_type
+      || types_compatible_p (gimple_expr_type (gsi_stmt (*gsi)), type))
     warning_at (loc, OPT_Wvector_operation_performance,
 		"vector operation will be expanded piecewise");
   else
     warning_at (loc, OPT_Wvector_operation_performance,
 		"vector operation will be expanded in parallel");
 
+  if (!ret_type)
+    ret_type = type;
   vec_alloc (v, (nunits + delta - 1) / delta);
   for (i = 0; i < nunits;
        i += delta, index = int_const_binop (PLUS_EXPR, index, part_width))
     {
-      tree result = f (gsi, inner_type, a, b, index, part_width, code, type);
+      tree result = f (gsi, inner_type, a, b, index, part_width, code,
+		       ret_type);
       constructor_elt ce = {NULL_TREE, result};
       v->quick_push (ce);
     }
 
-  return build_constructor (type, v);
+  return build_constructor (ret_type, v);
 }
 
 /* Expand a vector operation to scalars with the freedom to use
@@ -302,8 +307,7 @@  expand_vector_piecewise (gimple_stmt_ite
    in the vector type.  */
 static tree
 expand_vector_parallel (gimple_stmt_iterator *gsi, elem_op_func f, tree type,
-			tree a, tree b,
-			enum tree_code code)
+			tree a, tree b, enum tree_code code)
 {
   tree result, compute_type;
   int n_words = tree_to_uhwi (TYPE_SIZE_UNIT (type)) / UNITS_PER_WORD;
@@ -1547,6 +1551,147 @@  expand_vector_scalar_condition (gimple_s
   update_stmt (gsi_stmt (*gsi));
 }
 
+static tree
+do_vec_conversion (gimple_stmt_iterator *gsi, tree inner_type, tree a,
+		   tree decl, tree bitpos, tree bitsize,
+		   enum tree_code code, tree type)
+{
+  a = tree_vec_extract (gsi, inner_type, a, bitsize, bitpos);
+  if (!VECTOR_TYPE_P (inner_type))
+    return gimplify_build1 (gsi, code, TREE_TYPE (type), a);
+  if (code == CALL_EXPR)
+    {
+      gimple *g = gimple_build_call (decl, 1, a);
+      tree lhs = make_ssa_name (TREE_TYPE (TREE_TYPE (decl)));
+      gimple_call_set_lhs (g, lhs);
+      gsi_insert_before (gsi, g, GSI_SAME_STMT);
+      return lhs;
+    }
+  else
+    {
+      tree outer_type = build_vector_type (TREE_TYPE (type),
+					   TYPE_VECTOR_SUBPARTS (inner_type));
+      return gimplify_build1 (gsi, code, outer_type, a);
+    }
+}
+
+/* Expand VEC_CONVERT ifn call.  */
+
+static void
+expand_vector_conversion (gimple_stmt_iterator *gsi)
+{
+  gimple *stmt = gsi_stmt (*gsi);
+  gimple *g;
+  tree lhs = gimple_call_lhs (stmt);
+  tree arg = gimple_call_arg (stmt, 0);
+  tree decl = NULL_TREE;
+  tree ret_type = TREE_TYPE (lhs);
+  tree arg_type = TREE_TYPE (arg);
+  tree new_rhs, compute_type = TREE_TYPE (arg_type);
+  enum tree_code code = NOP_EXPR;
+  enum tree_code code1 = ERROR_MARK;
+  enum { NARROW, NONE, WIDEN } modifier = NONE;
+  optab optab1 = unknown_optab;
+
+  gcc_checking_assert (VECTOR_TYPE_P (ret_type) && VECTOR_TYPE_P (arg_type));
+  gcc_checking_assert (tree_fits_uhwi_p (TYPE_SIZE (TREE_TYPE (ret_type))));
+  gcc_checking_assert (tree_fits_uhwi_p (TYPE_SIZE (TREE_TYPE (arg_type))));
+  if (INTEGRAL_TYPE_P (TREE_TYPE (ret_type))
+      && SCALAR_FLOAT_TYPE_P (TREE_TYPE (arg_type)))
+    code = FIX_TRUNC_EXPR;
+  else if (INTEGRAL_TYPE_P (TREE_TYPE (arg_type))
+	   && SCALAR_FLOAT_TYPE_P (TREE_TYPE (ret_type)))
+    code = FLOAT_EXPR;
+  if (tree_to_uhwi (TYPE_SIZE (TREE_TYPE (ret_type)))
+      < tree_to_uhwi (TYPE_SIZE (TREE_TYPE (arg_type))))
+    modifier = NARROW;
+  else if (tree_to_uhwi (TYPE_SIZE (TREE_TYPE (ret_type)))
+	   > tree_to_uhwi (TYPE_SIZE (TREE_TYPE (arg_type))))
+    modifier = WIDEN;
+
+  if (modifier == NONE && (code == FIX_TRUNC_EXPR || code == FLOAT_EXPR))
+    {
+      if (supportable_convert_operation (code, ret_type, arg_type, &decl,
+					 &code1))
+	{
+	  if (code1 == CALL_EXPR)
+	    {
+	      g = gimple_build_call (decl, 1, arg);
+	      gimple_call_set_lhs (g, lhs);
+	    }
+	  else
+	    g = gimple_build_assign (lhs, code1, arg);
+	  gsi_replace (gsi, g, false);
+	  return;
+	}
+      /* Can't use get_compute_type here, as supportable_convert_operation
+	 doesn't necessarily use an optab and needs two arguments.  */
+      tree vector_compute_type
+	= type_for_widest_vector_mode (TREE_TYPE (arg_type), mov_optab);
+      unsigned HOST_WIDE_INT nelts;
+      if (vector_compute_type
+	  && VECTOR_MODE_P (TYPE_MODE (vector_compute_type))
+	  && subparts_gt (arg_type, vector_compute_type)
+	  && TYPE_VECTOR_SUBPARTS (vector_compute_type).is_constant (&nelts))
+	{
+	  while (nelts > 1)
+	    {
+	      tree ret1_type = build_vector_type (TREE_TYPE (ret_type), nelts);
+	      tree arg1_type = build_vector_type (TREE_TYPE (arg_type), nelts);
+	      if (supportable_convert_operation (code, ret1_type, arg1_type,
+						 &decl, &code1))
+		{
+		  new_rhs = expand_vector_piecewise (gsi, do_vec_conversion,
+						     ret_type, arg1_type, arg,
+						     decl, code1);
+		  g = gimple_build_assign (lhs, new_rhs);
+		  gsi_replace (gsi, g, false);
+		  return;
+		}
+	      nelts = nelts / 2;
+	    }
+	}
+    }
+  /* FIXME: __builtin_convertvector argument and return vectors have the same
+     number of elements, so for both narrowing and widening we need to figure
+     out what is the best set of optabs to use.  E.g. for NARROW
+     VEC_PACK_TRUNC_EXPR has 2 arguments, shall we prefer emitting that with
+     one argument of arg and another argument all zeros and extract first
+     half of the resulting vector, or extract lo and hi halves of the arg
+     vector and use VEC_PACK_TRUNC_EXPR on those?  */
+  else if (0 && modifier == NARROW)
+    {
+      switch (code)
+	{
+	case NOP_EXPR:
+	  code1 = VEC_PACK_TRUNC_EXPR;
+	  optab1 = optab_for_tree_code (code1, arg_type, optab_default);
+	  break;
+	case FIX_TRUNC_EXPR:
+	  code1 = VEC_PACK_FIX_TRUNC_EXPR;
+	  /* The signedness is determined from output operand.  */
+	  optab1 = optab_for_tree_code (code1, ret_type, optab_default);
+	  break;
+	case FLOAT_EXPR:
+	  code1 = VEC_PACK_FLOAT_EXPR;
+	  optab1 = optab_for_tree_code (code1, arg_type, optab_default);
+	  break;
+	default:
+	  gcc_unreachable ();
+	}
+
+      if (optab1)
+	compute_type = get_compute_type (code1, optab1, arg_type);
+      (void) compute_type;
+    }
+
+  new_rhs = expand_vector_piecewise (gsi, do_vec_conversion, arg_type,
+				     TREE_TYPE (arg_type), arg,
+				     NULL_TREE, code, ret_type);
+  g = gimple_build_assign (lhs, new_rhs);
+  gsi_replace (gsi, g, false);
+}
+
 /* Process one statement.  If we identify a vector operation, expand it.  */
 
 static void
@@ -1561,7 +1706,11 @@  expand_vector_operations_1 (gimple_stmt_
   /* Only consider code == GIMPLE_ASSIGN. */
   gassign *stmt = dyn_cast <gassign *> (gsi_stmt (*gsi));
   if (!stmt)
-    return;
+    {
+      if (gimple_call_internal_p (gsi_stmt (*gsi), IFN_VEC_CONVERT))
+	expand_vector_conversion (gsi);
+      return;
+    }
 
   code = gimple_assign_rhs_code (stmt);
   rhs_class = get_gimple_rhs_class (code);
--- gcc/internal-fn.def.jj	2019-01-01 12:37:17.893962875 +0100
+++ gcc/internal-fn.def	2019-01-02 11:24:24.307681792 +0100
@@ -296,6 +296,7 @@  DEF_INTERNAL_FN (SUB_OVERFLOW, ECF_CONST
 DEF_INTERNAL_FN (MUL_OVERFLOW, ECF_CONST | ECF_LEAF | ECF_NOTHROW, NULL)
 DEF_INTERNAL_FN (TSAN_FUNC_EXIT, ECF_NOVOPS | ECF_LEAF | ECF_NOTHROW, NULL)
 DEF_INTERNAL_FN (VA_ARG, ECF_NOTHROW | ECF_LEAF, NULL)
+DEF_INTERNAL_FN (VEC_CONVERT, ECF_CONST | ECF_LEAF | ECF_NOTHROW, NULL)
 
 /* An unduplicable, uncombinable function.  Generally used to preserve
    a CFG property in the face of jump threading, tail merging or
--- gcc/internal-fn.c.jj	2019-01-01 12:37:19.567935410 +0100
+++ gcc/internal-fn.c	2019-01-02 11:24:24.315681661 +0100
@@ -2581,6 +2581,15 @@  expand_VA_ARG (internal_fn, gcall *)
   gcc_unreachable ();
 }
 
+/* IFN_VEC_CONVERT is supposed to be expanded at pass_lower_vector.  So this
+   dummy function should never be called.  */
+
+static void
+expand_VEC_CONVERT (internal_fn, gcall *)
+{
+  gcc_unreachable ();
+}
+
 /* Expand the IFN_UNIQUE function according to its first argument.  */
 
 static void
--- gcc/fold-const-call.c.jj	2019-01-01 12:37:16.528985271 +0100
+++ gcc/fold-const-call.c	2019-01-02 15:57:36.656449175 +0100
@@ -30,6 +30,7 @@  along with GCC; see the file COPYING3.
 #include "tm.h" /* For C[LT]Z_DEFINED_AT_ZERO.  */
 #include "builtins.h"
 #include "gimple-expr.h"
+#include "tree-vector-builder.h"
 
 /* Functions that test for certain constant types, abstracting away the
    decision about whether to check for overflow.  */
@@ -645,6 +646,40 @@  fold_const_reduction (tree type, tree ar
   return res;
 }
 
+/* Fold a call to IFN_VEC_CONVERT (ARG) returning TYPE.  */
+
+static tree
+fold_const_vec_convert (tree ret_type, tree arg)
+{
+  enum tree_code code = NOP_EXPR;
+  tree arg_type = TREE_TYPE (arg);
+  if (TREE_CODE (arg) != VECTOR_CST)
+    return NULL_TREE;
+
+  gcc_checking_assert (VECTOR_TYPE_P (ret_type) && VECTOR_TYPE_P (arg_type));
+
+  if (INTEGRAL_TYPE_P (TREE_TYPE (ret_type))
+      && SCALAR_FLOAT_TYPE_P (TREE_TYPE (arg_type)))
+    code = FIX_TRUNC_EXPR;
+  else if (INTEGRAL_TYPE_P (TREE_TYPE (arg_type))
+	   && SCALAR_FLOAT_TYPE_P (TREE_TYPE (ret_type)))
+    code = FLOAT_EXPR;
+
+  tree_vector_builder elts;
+  elts.new_unary_operation (ret_type, arg, true);
+  unsigned int count = elts.encoded_nelts ();
+  for (unsigned int i = 0; i < count; ++i)
+    {
+      tree elt = fold_unary (code, TREE_TYPE (ret_type),
+			     VECTOR_CST_ELT (arg, i));
+      if (elt == NULL_TREE || !CONSTANT_CLASS_P (elt))
+	return NULL_TREE;
+      elts.quick_push (elt);
+    }
+
+  return elts.build ();
+}
+
 /* Try to evaluate:
 
       *RESULT = FN (*ARG)
@@ -1232,6 +1267,9 @@  fold_const_call (combined_fn fn, tree ty
     case CFN_REDUC_XOR:
       return fold_const_reduction (type, arg, BIT_XOR_EXPR);
 
+    case CFN_VEC_CONVERT:
+      return fold_const_vec_convert (type, arg);
+
     default:
       return fold_const_call_1 (fn, type, arg);
     }
--- gcc/c-family/c-common.h.jj	2019-01-01 12:37:51.309414610 +0100
+++ gcc/c-family/c-common.h	2019-01-02 11:24:24.314681677 +0100
@@ -102,7 +102,7 @@  enum rid
   RID_ASM,       RID_TYPEOF,   RID_ALIGNOF,  RID_ATTRIBUTE,  RID_VA_ARG,
   RID_EXTENSION, RID_IMAGPART, RID_REALPART, RID_LABEL,      RID_CHOOSE_EXPR,
   RID_TYPES_COMPATIBLE_P,      RID_BUILTIN_COMPLEX,	     RID_BUILTIN_SHUFFLE,
-  RID_BUILTIN_TGMATH,
+  RID_BUILTIN_CONVERTVECTOR,   RID_BUILTIN_TGMATH,
   RID_BUILTIN_HAS_ATTRIBUTE,
   RID_DFLOAT32, RID_DFLOAT64, RID_DFLOAT128,
 
@@ -1001,6 +1001,7 @@  extern bool lvalue_p (const_tree);
 extern bool vector_targets_convertible_p (const_tree t1, const_tree t2);
 extern bool vector_types_convertible_p (const_tree t1, const_tree t2, bool emit_lax_note);
 extern tree c_build_vec_perm_expr (location_t, tree, tree, tree, bool = true);
+extern tree c_build_vec_convert (location_t, tree, location_t, tree, bool = true);
 
 extern void init_c_lex (void);
 
--- gcc/c-family/c-common.c.jj	2019-01-01 12:37:51.366413675 +0100
+++ gcc/c-family/c-common.c	2019-01-02 11:24:24.314681677 +0100
@@ -376,6 +376,7 @@  const struct c_common_resword c_common_r
     RID_BUILTIN_CALL_WITH_STATIC_CHAIN, D_CONLY },
   { "__builtin_choose_expr", RID_CHOOSE_EXPR, D_CONLY },
   { "__builtin_complex", RID_BUILTIN_COMPLEX, D_CONLY },
+  { "__builtin_convertvector", RID_BUILTIN_CONVERTVECTOR, 0 },
   { "__builtin_has_attribute", RID_BUILTIN_HAS_ATTRIBUTE, 0 },
   { "__builtin_launder", RID_BUILTIN_LAUNDER, D_CXXONLY },
   { "__builtin_shuffle", RID_BUILTIN_SHUFFLE, 0 },
@@ -1070,6 +1071,70 @@  c_build_vec_perm_expr (location_t loc, t
     ret = c_wrap_maybe_const (ret, true);
 
   return ret;
+}
+
+/* Build a VEC_CONVERT ifn for __builtin_convertvector builtin.  */
+
+tree
+c_build_vec_convert (location_t loc1, tree expr, location_t loc2, tree type,
+		     bool complain)
+{
+  if (error_operand_p (type))
+    return error_mark_node;
+  if (error_operand_p (expr))
+    return error_mark_node;
+
+  if (!VECTOR_INTEGER_TYPE_P (TREE_TYPE (expr))
+      && !VECTOR_FLOAT_TYPE_P (TREE_TYPE (expr)))
+    {
+      if (complain)
+	error_at (loc1, "%<__builtin_convertvector%> first argument must "
+			"be an integer or floating vector");
+      return error_mark_node;
+    }
+
+  if (!VECTOR_INTEGER_TYPE_P (type) && !VECTOR_FLOAT_TYPE_P (type))
+    {
+      if (complain)
+	error_at (loc2, "%<__builtin_convertvector%> second argument must "
+			"be an integer or floating vector type");
+      return error_mark_node;
+    }
+
+  if (maybe_ne (TYPE_VECTOR_SUBPARTS (TREE_TYPE (expr)),
+		TYPE_VECTOR_SUBPARTS (type)))
+    {
+      if (complain)
+	error_at (loc1, "%<__builtin_convertvector%> number of elements "
+			"of the first argument vector and the second argument "
+			"vector type should be the same");
+      return error_mark_node;
+    }
+
+  if ((TYPE_MAIN_VARIANT (TREE_TYPE (TREE_TYPE (expr)))
+       == TYPE_MAIN_VARIANT (TREE_TYPE (type)))
+      || (VECTOR_INTEGER_TYPE_P (TREE_TYPE (expr))
+	  && VECTOR_INTEGER_TYPE_P (type)
+	  && (TYPE_PRECISION (TREE_TYPE (TREE_TYPE (expr)))
+	      == TYPE_PRECISION (TREE_TYPE (type)))))
+    return build1_loc (loc1, VIEW_CONVERT_EXPR, type, expr);
+
+  bool wrap = true;
+  bool maybe_const = false;
+  tree ret;
+  if (!c_dialect_cxx ())
+    {
+      /* Avoid C_MAYBE_CONST_EXPRs inside of VEC_CONVERT argument.  */
+      expr = c_fully_fold (expr, false, &maybe_const);
+      wrap &= maybe_const;
+    }
+
+  ret = build_call_expr_internal_loc (loc1, IFN_VEC_CONVERT, type, 1, expr);
+
+  if (!wrap)
+    ret = c_wrap_maybe_const (ret, true);
+
+  return ret;
 }
 
 /* Like tree.c:get_narrower, but retain conversion from C++0x scoped enum
--- gcc/c/c-parser.c.jj	2019-01-01 12:37:48.677457794 +0100
+++ gcc/c/c-parser.c	2019-01-02 11:24:24.312681710 +0100
@@ -8038,6 +8038,7 @@  enum tgmath_parm_kind
      __builtin_shuffle ( assignment-expression ,
 			 assignment-expression ,
 			 assignment-expression, )
+     __builtin_convertvector ( assignment-expression , type-name )
 
    offsetof-member-designator:
      identifier
@@ -9113,17 +9114,14 @@  c_parser_postfix_expression (c_parser *p
 	      *p = convert_lvalue_to_rvalue (loc, *p, true, true);
 
 	    if (vec_safe_length (cexpr_list) == 2)
-	      expr.value =
-		c_build_vec_perm_expr
-		  (loc, (*cexpr_list)[0].value,
-		   NULL_TREE, (*cexpr_list)[1].value);
+	      expr.value = c_build_vec_perm_expr (loc, (*cexpr_list)[0].value,
+						  NULL_TREE,
+						  (*cexpr_list)[1].value);
 
 	    else if (vec_safe_length (cexpr_list) == 3)
-	      expr.value =
-		c_build_vec_perm_expr
-		  (loc, (*cexpr_list)[0].value,
-		   (*cexpr_list)[1].value,
-		   (*cexpr_list)[2].value);
+	      expr.value = c_build_vec_perm_expr (loc, (*cexpr_list)[0].value,
+						  (*cexpr_list)[1].value,
+						  (*cexpr_list)[2].value);
 	    else
 	      {
 		error_at (loc, "wrong number of arguments to "
@@ -9133,6 +9131,41 @@  c_parser_postfix_expression (c_parser *p
 	    set_c_expr_source_range (&expr, loc, close_paren_loc);
 	    break;
 	  }
+	case RID_BUILTIN_CONVERTVECTOR:
+	  {
+	    location_t start_loc = loc;
+	    c_parser_consume_token (parser);
+	    matching_parens parens;
+	    if (!parens.require_open (parser))
+	      {
+		expr.set_error ();
+		break;
+	      }
+	    e1 = c_parser_expr_no_commas (parser, NULL);
+	    mark_exp_read (e1.value);
+	    if (!c_parser_require (parser, CPP_COMMA, "expected %<,%>"))
+	      {
+		c_parser_skip_until_found (parser, CPP_CLOSE_PAREN, NULL);
+		expr.set_error ();
+		break;
+	      }
+	    loc = c_parser_peek_token (parser)->location;
+	    t1 = c_parser_type_name (parser);
+	    location_t end_loc = c_parser_peek_token (parser)->get_finish ();
+	    c_parser_skip_until_found (parser, CPP_CLOSE_PAREN,
+				       "expected %<)%>");
+	    if (t1 == NULL)
+	      expr.set_error ();
+	    else
+	      {
+		tree type_expr = NULL_TREE;
+		expr.value = c_build_vec_convert (start_loc, e1.value, loc,
+						  groktypename (t1, &type_expr,
+								NULL));
+		set_c_expr_source_range (&expr, start_loc, end_loc);
+	      }
+	  }
+	  break;
 	case RID_AT_SELECTOR:
 	  {
 	    gcc_assert (c_dialect_objc ());
--- gcc/cp/cp-tree.h.jj	2019-01-01 12:37:46.884487212 +0100
+++ gcc/cp/cp-tree.h	2019-01-02 16:43:35.480393140 +0100
@@ -7142,6 +7142,8 @@  extern bool is_lambda_ignored_entity
 extern bool lambda_static_thunk_p		(tree);
 extern tree finish_builtin_launder		(location_t, tree,
 						 tsubst_flags_t);
+extern tree cp_build_vec_convert		(tree, location_t, tree,
+						 tsubst_flags_t);
 extern void start_lambda_scope			(tree);
 extern void record_lambda_scope			(tree);
 extern void record_null_lambda_scope		(tree);
--- gcc/cp/parser.c.jj	2019-01-01 12:37:47.352479534 +0100
+++ gcc/cp/parser.c	2019-01-02 16:19:44.765760167 +0100
@@ -7031,6 +7031,32 @@  cp_parser_postfix_expression (cp_parser
 	break;
       }
 
+    case RID_BUILTIN_CONVERTVECTOR:
+      {
+	tree expression;
+	tree type;
+	/* Consume the `__builtin_convertvector' token.  */
+	cp_lexer_consume_token (parser->lexer);
+	/* Look for the opening `('.  */
+	matching_parens parens;
+	parens.require_open (parser);
+	/* Now, parse the assignment-expression.  */
+	expression = cp_parser_assignment_expression (parser);
+	/* Look for the `,'.  */
+	cp_parser_require (parser, CPP_COMMA, RT_COMMA);
+	location_t type_location
+	  = cp_lexer_peek_token (parser->lexer)->location;
+	/* Parse the type-id.  */
+	{
+	  type_id_in_expr_sentinel s (parser);
+	  type = cp_parser_type_id (parser);
+	}
+	/* Look for the closing `)'.  */
+	parens.require_close (parser);
+	return cp_build_vec_convert (expression, type_location, type,
+				     tf_warning_or_error);
+      }
+
     default:
       {
 	tree type;
--- gcc/cp/constexpr.c.jj	2019-01-01 12:37:47.282480682 +0100
+++ gcc/cp/constexpr.c	2019-01-02 16:56:54.126359632 +0100
@@ -33,6 +33,7 @@  along with GCC; see the file COPYING3.
 #include "ubsan.h"
 #include "gimple-fold.h"
 #include "timevar.h"
+#include "fold-const-call.h"
 
 static bool verify_constant (tree, bool, bool *, bool *);
 #define VERIFY_CONSTANT(X)						\
@@ -1449,6 +1450,20 @@  cxx_eval_internal_function (const conste
       return cxx_eval_constant_expression (ctx, CALL_EXPR_ARG (t, 0),
 					   false, non_constant_p, overflow_p);
 
+    case IFN_VEC_CONVERT:
+      {
+	tree arg = cxx_eval_constant_expression (ctx, CALL_EXPR_ARG (t, 0),
+						 false, non_constant_p,
+						 overflow_p);
+	if (TREE_CODE (arg) == VECTOR_CST)
+	  return fold_const_call (CFN_VEC_CONVERT, TREE_TYPE (t), arg);
+	else
+	  {
+	    *non_constant_p = true;
+	    return t;
+	  }
+      }
+
     default:
       if (!ctx->quiet)
 	error_at (cp_expr_loc_or_loc (t, input_location),
@@ -5623,7 +5638,9 @@  potential_constant_expression_1 (tree t,
 		case IFN_SUB_OVERFLOW:
 		case IFN_MUL_OVERFLOW:
 		case IFN_LAUNDER:
+		case IFN_VEC_CONVERT:
 		  bail = false;
+		  break;
 
 		default:
 		  break;
--- gcc/cp/semantics.c.jj	2019-01-01 12:37:46.976485703 +0100
+++ gcc/cp/semantics.c	2019-01-02 18:15:42.844133048 +0100
@@ -9933,4 +9933,26 @@  finish_builtin_launder (location_t loc,
 				       TREE_TYPE (arg), 1, arg);
 }
 
+/* Finish __builtin_convertvector (arg, type).  */
+
+tree
+cp_build_vec_convert (tree arg, location_t loc, tree type,
+		      tsubst_flags_t complain)
+{
+  if (error_operand_p (type))
+    return error_mark_node;
+  if (error_operand_p (arg))
+    return error_mark_node;
+
+  tree ret = NULL_TREE;
+  if (!type_dependent_expression_p (arg) && !dependent_type_p (type))
+    ret = c_build_vec_convert (cp_expr_loc_or_loc (arg, input_location), arg,
+			       loc, type, (complain & tf_error) != 0);
+
+  if (!processing_template_decl)
+    return ret;
+
+  return build_call_expr_internal_loc (loc, IFN_VEC_CONVERT, type, 1, arg);
+}
+
 #include "gt-cp-semantics.h"
--- gcc/cp/pt.c.jj	2019-01-01 12:37:47.081483980 +0100
+++ gcc/cp/pt.c	2019-01-02 18:25:17.997778249 +0100
@@ -18813,6 +18813,27 @@  tsubst_copy_and_build (tree t,
 					      (*call_args)[0], complain);
 	      break;
 
+	    case IFN_VEC_CONVERT:
+	      gcc_assert (nargs == 1);
+	      if (vec_safe_length (call_args) != 1)
+		{
+		  error_at (cp_expr_loc_or_loc (t, input_location),
+			    "wrong number of arguments to "
+			    "%<__builtin_convertvector%>");
+		  ret = error_mark_node;
+		  break;
+		}
+	      ret = cp_build_vec_convert ((*call_args)[0], input_location,
+					  tsubst (TREE_TYPE (t), args,
+						  complain, in_decl),
+					  complain);
+	      if (TREE_CODE (ret) == VIEW_CONVERT_EXPR)
+		{
+		  release_tree_vector (call_args);
+		  RETURN (ret);
+		}
+	      break;
+
 	    default:
 	      /* Unsupported internal function with arguments.  */
 	      gcc_unreachable ();
--- gcc/testsuite/c-c++-common/builtin-convertvector-1.c.jj	2019-01-02 18:38:18.265090910 +0100
+++ gcc/testsuite/c-c++-common/builtin-convertvector-1.c	2019-01-02 18:37:50.337544972 +0100
@@ -0,0 +1,15 @@ 
+typedef int v8si __attribute__((vector_size (8 * sizeof (int))));
+typedef long long v4di __attribute__((vector_size (4 * sizeof (long long))));
+
+void
+foo (v8si *x, v4di *y, int z)
+{
+  __builtin_convertvector (*y, v8si);	/* { dg-error "number of elements of the first argument vector and the second argument vector type should be the same" } */
+  __builtin_convertvector (*x, v4di);	/* { dg-error "number of elements of the first argument vector and the second argument vector type should be the same" } */
+  __builtin_convertvector (*x, int);	/* { dg-error "second argument must be an integer or floating vector type" } */
+  __builtin_convertvector (z, v4di);	/* { dg-error "first argument must be an integer or floating vector" } */
+  __builtin_convertvector ();		/* { dg-error "expected" } */
+  __builtin_convertvector (*x);		/* { dg-error "expected" } */
+  __builtin_convertvector (*x, *y);	/* { dg-error "expected" } */
+  __builtin_convertvector (*x, v8si, 1);/* { dg-error "expected" } */
+}
--- gcc/testsuite/c-c++-common/torture/builtin-convertvector-1.c.jj	2019-01-02 18:00:59.982534637 +0100
+++ gcc/testsuite/c-c++-common/torture/builtin-convertvector-1.c	2019-01-02 18:00:32.871977360 +0100
@@ -0,0 +1,131 @@ 
+extern
+#ifdef __cplusplus
+"C"
+#endif
+void abort (void);
+typedef int v4si __attribute__((vector_size (4 * sizeof (int))));
+typedef unsigned int v4usi __attribute__((vector_size (4 * sizeof (unsigned int))));
+typedef float v4sf __attribute__((vector_size (4 * sizeof (float))));
+typedef double v4df __attribute__((vector_size (4 * sizeof (double))));
+typedef long long v256di __attribute__((vector_size (256 * sizeof (long long))));
+typedef double v256df __attribute__((vector_size (256 * sizeof (double))));
+
+void
+f1 (v4usi *x, v4si *y)
+{
+  *y = __builtin_convertvector (*x, v4si);
+}
+
+void
+f2 (v4sf *x, v4si *y)
+{
+  *y = __builtin_convertvector (*x, v4si);
+}
+
+void
+f3 (v4si *x, v4sf *y)
+{
+  *y = __builtin_convertvector (*x, v4sf);
+}
+
+void
+f4 (v4df *x, v4si *y)
+{
+  *y = __builtin_convertvector (*x, v4si);
+}
+
+void
+f5 (v4si *x, v4df *y)
+{
+  *y = __builtin_convertvector (*x, v4df);
+}
+
+void
+f6 (v256df *x, v256di *y)
+{
+  *y = __builtin_convertvector (*x, v256di);
+}
+
+void
+f7 (v256di *x, v256df *y)
+{
+  *y = __builtin_convertvector (*x, v256df);
+}
+
+void
+f8 (v4df *x)
+{
+  v4si a = { 1, 2, -3, -4 };
+  *x = __builtin_convertvector (a, v4df);
+}
+
+int
+main ()
+{
+  union U1 { v4si v; int a[4]; } u1;
+  union U2 { v4usi v; unsigned int a[4]; } u2;
+  union U3 { v4sf v; float a[4]; } u3;
+  union U4 { v4df v; double a[4]; } u4;
+  union U5 { v256di v; long long a[256]; } u5;
+  union U6 { v256df v; double a[256]; } u6;
+  int i;
+  for (i = 0; i < 4; i++)
+    u2.a[i] = i * 2;
+  f1 (&u2.v, &u1.v);
+  for (i = 0; i < 4; i++)
+    if (u1.a[i] != i * 2)
+      abort ();
+    else
+      u3.a[i] = i - 2.25f;
+  f2 (&u3.v, &u1.v);
+  for (i = 0; i < 4; i++)
+    if (u1.a[i] != (i == 3 ? 0 : i - 2))
+      abort ();
+    else
+      u3.a[i] = i + 0.75f;
+  f2 (&u3.v, &u1.v);
+  for (i = 0; i < 4; i++)
+    if (u1.a[i] != i)
+      abort ();
+    else
+      u1.a[i] = 7 * i - 5;
+  f3 (&u1.v, &u3.v);
+  for (i = 0; i < 4; i++)
+    if (u3.a[i] != 7 * i - 5)
+      abort ();
+    else
+      u4.a[i] = i - 2.25;
+  f4 (&u4.v, &u1.v);
+  for (i = 0; i < 4; i++)
+    if (u1.a[i] != (i == 3 ? 0 : i - 2))
+      abort ();
+    else
+      u4.a[i] = i + 0.75;
+  f4 (&u4.v, &u1.v);
+  for (i = 0; i < 4; i++)
+    if (u1.a[i] != i)
+      abort ();
+    else
+      u1.a[i] = 7 * i - 5;
+  f5 (&u1.v, &u4.v);
+  for (i = 0; i < 4; i++)
+    if (u4.a[i] != 7 * i - 5)
+      abort ();
+  for (i = 0; i < 256; i++)
+    u6.a[i] = i - 128.25;
+  f6 (&u6.v, &u5.v);
+  for (i = 0; i < 256; i++)
+    if (u5.a[i] != i - 128 - (i > 128))
+      abort ();
+    else
+      u5.a[i] = i - 128;
+  f7 (&u5.v, &u6.v);
+  for (i = 0; i < 256; i++)
+    if (u6.a[i] != i - 128)
+      abort ();
+  f8 (&u4.v);
+  for (i = 0; i < 4; i++)
+    if (u4.a[i] != (i >= 2 ? -1 - i : i + 1))
+      abort ();
+  return 0;
+}
--- gcc/testsuite/g++.dg/ext/builtin-convertvector-1.C.jj	2019-01-02 18:04:14.984350274 +0100
+++ gcc/testsuite/g++.dg/ext/builtin-convertvector-1.C	2019-01-02 18:07:17.122375950 +0100
@@ -0,0 +1,137 @@ 
+// { dg-do run }
+
+extern "C" void abort ();
+typedef int v4si __attribute__((vector_size (4 * sizeof (int))));
+typedef unsigned int v4usi __attribute__((vector_size (4 * sizeof (unsigned int))));
+typedef float v4sf __attribute__((vector_size (4 * sizeof (float))));
+typedef double v4df __attribute__((vector_size (4 * sizeof (double))));
+typedef long long v256di __attribute__((vector_size (256 * sizeof (long long))));
+typedef double v256df __attribute__((vector_size (256 * sizeof (double))));
+
+template <int N>
+void
+f1 (v4usi *x, v4si *y)
+{
+  *y = __builtin_convertvector (*x, v4si);
+}
+
+template <typename T>
+void
+f2 (T *x, v4si *y)
+{
+  *y = __builtin_convertvector (*x, v4si);
+}
+
+template <typename T>
+void
+f3 (v4si *x, T *y)
+{
+  *y = __builtin_convertvector (*x, T);
+}
+
+template <int N>
+void
+f4 (v4df *x, v4si *y)
+{
+  *y = __builtin_convertvector (*x, v4si);
+}
+
+template <typename T, typename U>
+void
+f5 (T *x, U *y)
+{
+  *y = __builtin_convertvector (*x, U);
+}
+
+template <typename T>
+void
+f6 (v256df *x, T *y)
+{
+  *y = __builtin_convertvector (*x, T);
+}
+
+template <int N>
+void
+f7 (v256di *x, v256df *y)
+{
+  *y = __builtin_convertvector (*x, v256df);
+}
+
+template <int N>
+void
+f8 (v4df *x)
+{
+  v4si a = { 1, 2, -3, -4 };
+  *x = __builtin_convertvector (a, v4df);
+}
+
+int
+main ()
+{
+  union U1 { v4si v; int a[4]; } u1;
+  union U2 { v4usi v; unsigned int a[4]; } u2;
+  union U3 { v4sf v; float a[4]; } u3;
+  union U4 { v4df v; double a[4]; } u4;
+  union U5 { v256di v; long long a[256]; } u5;
+  union U6 { v256df v; double a[256]; } u6;
+  int i;
+  for (i = 0; i < 4; i++)
+    u2.a[i] = i * 2;
+  f1<0> (&u2.v, &u1.v);
+  for (i = 0; i < 4; i++)
+    if (u1.a[i] != i * 2)
+      abort ();
+    else
+      u3.a[i] = i - 2.25f;
+  f2 (&u3.v, &u1.v);
+  for (i = 0; i < 4; i++)
+    if (u1.a[i] != (i == 3 ? 0 : i - 2))
+      abort ();
+    else
+      u3.a[i] = i + 0.75f;
+  f2 (&u3.v, &u1.v);
+  for (i = 0; i < 4; i++)
+    if (u1.a[i] != i)
+      abort ();
+    else
+      u1.a[i] = 7 * i - 5;
+  f3 (&u1.v, &u3.v);
+  for (i = 0; i < 4; i++)
+    if (u3.a[i] != 7 * i - 5)
+      abort ();
+    else
+      u4.a[i] = i - 2.25;
+  f4<12> (&u4.v, &u1.v);
+  for (i = 0; i < 4; i++)
+    if (u1.a[i] != (i == 3 ? 0 : i - 2))
+      abort ();
+    else
+      u4.a[i] = i + 0.75;
+  f4<13> (&u4.v, &u1.v);
+  for (i = 0; i < 4; i++)
+    if (u1.a[i] != i)
+      abort ();
+    else
+      u1.a[i] = 7 * i - 5;
+  f5 (&u1.v, &u4.v);
+  for (i = 0; i < 4; i++)
+    if (u4.a[i] != 7 * i - 5)
+      abort ();
+  for (i = 0; i < 256; i++)
+    u6.a[i] = i - 128.25;
+  f6 (&u6.v, &u5.v);
+  for (i = 0; i < 256; i++)
+    if (u5.a[i] != i - 128 - (i > 128))
+      abort ();
+    else
+      u5.a[i] = i - 128;
+  f7<-1> (&u5.v, &u6.v);
+  for (i = 0; i < 256; i++)
+    if (u6.a[i] != i - 128)
+      abort ();
+  f8<5> (&u4.v);
+  for (i = 0; i < 4; i++)
+    if (u4.a[i] != (i >= 2 ? -1 - i : i + 1))
+      abort ();
+  return 0;
+}
--- gcc/testsuite/g++.dg/cpp0x/constexpr-builtin4.C.jj	2019-01-02 18:39:12.767204801 +0100
+++ gcc/testsuite/g++.dg/cpp0x/constexpr-builtin4.C	2019-01-02 18:42:30.749985890 +0100
@@ -0,0 +1,17 @@ 
+// { dg-do compile { target c++11 } }
+// { dg-additional-options "-Wno-psabi" }
+
+typedef int v4si __attribute__((vector_size (4 * sizeof (int))));
+typedef float v4sf __attribute__((vector_size (4 * sizeof (float))));
+constexpr v4sf a = __builtin_convertvector (v4si { 1, 2, -3, -4 }, v4sf);
+
+constexpr v4sf
+foo (v4si x)
+{
+  return __builtin_convertvector (x, v4sf);
+}
+
+constexpr v4sf b = foo (v4si { 3, 4, -1, -2 });
+
+static_assert (a[0] == 1.0f && a[1] == 2.0f && a[2] == -3.0f && a[3] == -4.0f, "");
+static_assert (b[0] == 3.0f && b[1] == 4.0f && b[2] == -1.0f && b[3] == -2.0f, "");