Message ID | 20230418134608.244751-3-christophe.lyon@arm.com
---|---
State | New
Series | arm: New framework for MVE intrinsics
> -----Original Message----- > From: Christophe Lyon <christophe.lyon@arm.com> > Sent: Tuesday, April 18, 2023 2:46 PM > To: gcc-patches@gcc.gnu.org; Kyrylo Tkachov <Kyrylo.Tkachov@arm.com>; > Richard Earnshaw <Richard.Earnshaw@arm.com>; Richard Sandiford > <Richard.Sandiford@arm.com> > Cc: Christophe Lyon <Christophe.Lyon@arm.com> > Subject: [PATCH 02/22] arm: [MVE intrinsics] Add new framework > > This patch introduces the new MVE intrinsics framework, heavily > inspired by the SVE one in the aarch64 port. > > Like the MVE intrinsic types implementation, the intrinsics framework > defines functions via a new pragma in arm_mve.h. A boolean parameter > is used to pass true when __ARM_MVE_PRESERVE_USER_NAMESPACE is > defined, and false when it is not, allowing for non-prefixed intrinsic > functions to be conditionally defined. > > Future patches will build on this framework by adding new intrinsic > functions and adding the features needed to support them. > > Differences compared to the aarch64/SVE port include: > - when present, the predicate argument is the last one with MVE (the > first one with SVE) > - when using merging predicates ("_m" suffix), the "inactive" argument > (if any) is inserted in the first position > - when using merging predicates ("_m" suffix), some functions do not > have the "inactive" argument, so we maintain an exception list > - MVE intrinsics dealing with floating-point require the FP extension, > while SVE may support different extensions > - regarding global state, MVE does not have any prefetch intrinsic, so > we do not need a flag for this > - intrinsic names can be prefixed with "__arm", depending on whether > preserve_user_namespace is true or false > - parse_signature: the maximum number of arguments is now a parameter, > which helps detect an overflow with a new assert > - suffixes and overloading can be controlled using > explicit_mode_suffix_p and skip_overload_p in addition to > explicit_type_suffix_p Ok.
Thanks, Kyrill > > At this implementation stage, there are some limitations compared > to aarch64/SVE, which are removed later in the series: > - "offset" mode is not supported yet > - gimple folding is not implemented > > 2022-09-08 Murray Steele <murray.steele@arm.com> > Christophe Lyon <christophe.lyon@arm.com> > > gcc/ChangeLog: > > * config.gcc: Add arm-mve-builtins-base.o and > arm-mve-builtins-shapes.o to extra_objs. > * config/arm/arm-builtins.cc (arm_builtin_decl): Handle MVE builtin > numberspace. > (arm_expand_builtin): Likewise. > (arm_check_builtin_call): Likewise. > (arm_describe_resolver): Likewise. > * config/arm/arm-builtins.h (enum resolver_ident): Add > arm_mve_resolver. > * config/arm/arm-c.cc (arm_pragma_arm): Handle new pragma. > (arm_resolve_overloaded_builtin): Handle MVE builtins. > (arm_register_target_pragmas): Register arm_check_builtin_call. > * config/arm/arm-mve-builtins.cc (class registered_function): New > class. > (struct registered_function_hasher): New struct. > (pred_suffixes): New table. > (mode_suffixes): New table. > (type_suffix_info): New table. > (TYPES_float16): New. > (TYPES_all_float): New. > (TYPES_integer_8): New. > (TYPES_integer_8_16): New. > (TYPES_integer_16_32): New. > (TYPES_integer_32): New. > (TYPES_signed_16_32): New. > (TYPES_signed_32): New. > (TYPES_all_signed): New. > (TYPES_all_unsigned): New. > (TYPES_all_integer): New. > (TYPES_all_integer_with_64): New. > (DEF_VECTOR_TYPE): New. > (DEF_DOUBLE_TYPE): New. > (DEF_MVE_TYPES_ARRAY): New. > (all_integer): New. > (all_integer_with_64): New. > (float16): New. > (all_float): New. > (all_signed): New. > (all_unsigned): New. > (integer_8): New. > (integer_8_16): New. > (integer_16_32): New. > (integer_32): New. > (signed_16_32): New. > (signed_32): New. > (register_vector_type): Use void_type_node for mve.fp-only types > when > mve.fp is not enabled. > (register_builtin_tuple_types): Likewise. > (handle_arm_mve_h): New function. > (matches_type_p): Likewise. 
> (report_out_of_range): Likewise. > (report_not_enum): Likewise. > (report_missing_float): Likewise. > (report_non_ice): Likewise. > (check_requires_float): Likewise. > (function_instance::hash): Likewise > (function_instance::call_properties): Likewise. > (function_instance::reads_global_state_p): Likewise. > (function_instance::modifies_global_state_p): Likewise. > (function_instance::could_trap_p): Likewise. > (function_instance::has_inactive_argument): Likewise. > (registered_function_hasher::hash): Likewise. > (registered_function_hasher::equal): Likewise. > (function_builder::function_builder): Likewise. > (function_builder::~function_builder): Likewise. > (function_builder::append_name): Likewise. > (function_builder::finish_name): Likewise. > (function_builder::get_name): Likewise. > (add_attribute): Likewise. > (function_builder::get_attributes): Likewise. > (function_builder::add_function): Likewise. > (function_builder::add_unique_function): Likewise. > (function_builder::add_overloaded_function): Likewise. > (function_builder::add_overloaded_functions): Likewise. > (function_builder::register_function_group): Likewise. > (function_call_info::function_call_info): Likewise. > (function_resolver::function_resolver): Likewise. > (function_resolver::get_vector_type): Likewise. > (function_resolver::get_scalar_type_name): Likewise. > (function_resolver::get_argument_type): Likewise. > (function_resolver::scalar_argument_p): Likewise. > (function_resolver::report_no_such_form): Likewise. > (function_resolver::lookup_form): Likewise. > (function_resolver::resolve_to): Likewise. > (function_resolver::infer_vector_or_tuple_type): Likewise. > (function_resolver::infer_vector_type): Likewise. > (function_resolver::require_vector_or_scalar_type): Likewise. > (function_resolver::require_vector_type): Likewise. > (function_resolver::require_matching_vector_type): Likewise. > (function_resolver::require_derived_vector_type): Likewise. 
> (function_resolver::require_derived_scalar_type): Likewise. > (function_resolver::require_integer_immediate): Likewise. > (function_resolver::require_scalar_type): Likewise. > (function_resolver::check_num_arguments): Likewise. > (function_resolver::check_gp_argument): Likewise. > (function_resolver::finish_opt_n_resolution): Likewise. > (function_resolver::resolve_unary): Likewise. > (function_resolver::resolve_unary_n): Likewise. > (function_resolver::resolve_uniform): Likewise. > (function_resolver::resolve_uniform_opt_n): Likewise. > (function_resolver::resolve): Likewise. > (function_checker::function_checker): Likewise. > (function_checker::argument_exists_p): Likewise. > (function_checker::require_immediate): Likewise. > (function_checker::require_immediate_enum): Likewise. > (function_checker::require_immediate_range): Likewise. > (function_checker::check): Likewise. > (gimple_folder::gimple_folder): Likewise. > (gimple_folder::fold): Likewise. > (function_expander::function_expander): Likewise. > (function_expander::direct_optab_handler): Likewise. > (function_expander::get_fallback_value): Likewise. > (function_expander::get_reg_target): Likewise. > (function_expander::add_output_operand): Likewise. > (function_expander::add_input_operand): Likewise. > (function_expander::add_integer_operand): Likewise. > (function_expander::generate_insn): Likewise. > (function_expander::use_exact_insn): Likewise. > (function_expander::use_unpred_insn): Likewise. > (function_expander::use_pred_x_insn): Likewise. > (function_expander::use_cond_insn): Likewise. > (function_expander::map_to_rtx_codes): Likewise. > (function_expander::expand): Likewise. > (resolve_overloaded_builtin): Likewise. > (check_builtin_call): Likewise. > (gimple_fold_builtin): Likewise. > (expand_builtin): Likewise. > (gt_ggc_mx): Likewise. > (gt_pch_nx): Likewise. > (gt_pch_nx): Likewise. > * config/arm/arm-mve-builtins.def(s8): Define new type suffix. > (s16): Likewise. > (s32): Likewise. 
> (s64): Likewise. > (u8): Likewise. > (u16): Likewise. > (u32): Likewise. > (u64): Likewise. > (f16): Likewise. > (f32): Likewise. > (n): New mode. > (offset): New mode. > * config/arm/arm-mve-builtins.h (MAX_TUPLE_SIZE): New constant. > (CP_READ_FPCR): Likewise. > (CP_RAISE_FP_EXCEPTIONS): Likewise. > (CP_READ_MEMORY): Likewise. > (CP_WRITE_MEMORY): Likewise. > (enum units_index): New enum. > (enum predication_index): New. > (enum type_class_index): New. > (enum mode_suffix_index): New enum. > (enum type_suffix_index): New. > (struct mode_suffix_info): New struct. > (struct type_suffix_info): New. > (struct function_group_info): Likewise. > (class function_instance): Likewise. > (class registered_function): Likewise. > (class function_builder): Likewise. > (class function_call_info): Likewise. > (class function_resolver): Likewise. > (class function_checker): Likewise. > (class gimple_folder): Likewise. > (class function_expander): Likewise. > (get_mve_pred16_t): Likewise. > (find_mode_suffix): New function. > (class function_base): Likewise. > (class function_shape): Likewise. > (function_instance::operator==): New function. > (function_instance::operator!=): Likewise. > (function_instance::vectors_per_tuple): Likewise. > (function_instance::mode_suffix): Likewise. > (function_instance::type_suffix): Likewise. > (function_instance::scalar_type): Likewise. > (function_instance::vector_type): Likewise. > (function_instance::tuple_type): Likewise. > (function_instance::vector_mode): Likewise. > (function_call_info::function_returns_void_p): Likewise. > (function_base::call_properties): Likewise. > * config/arm/arm-protos.h (enum arm_builtin_class): Add > ARM_BUILTIN_MVE. > (handle_arm_mve_h): New. > (resolve_overloaded_builtin): New. > (check_builtin_call): New. > (gimple_fold_builtin): New. > (expand_builtin): New. > * config/arm/arm.cc (TARGET_GIMPLE_FOLD_BUILTIN): Define as > arm_gimple_fold_builtin. > (arm_gimple_fold_builtin): New function. 
> * config/arm/arm_mve.h: Use new arm_mve.h pragma. > * config/arm/predicates.md (arm_any_register_operand): New > predicate. > * config/arm/t-arm (arm-mve-builtins.o): Add includes. > (arm-mve-builtins-shapes.o): New target. > (arm-mve-builtins-base.o): New target. > * config/arm/arm-mve-builtins-base.cc: New file. > * config/arm/arm-mve-builtins-base.def: New file. > * config/arm/arm-mve-builtins-base.h: New file. > * config/arm/arm-mve-builtins-functions.h: New file. > * config/arm/arm-mve-builtins-shapes.cc: New file. > * config/arm/arm-mve-builtins-shapes.h: New file. > > Co-authored-by: Christophe Lyon <christophe.lyon@arm.com> > --- > gcc/config.gcc | 2 +- > gcc/config/arm/arm-builtins.cc | 15 +- > gcc/config/arm/arm-builtins.h | 1 + > gcc/config/arm/arm-c.cc | 42 +- > gcc/config/arm/arm-mve-builtins-base.cc | 45 + > gcc/config/arm/arm-mve-builtins-base.def | 24 + > gcc/config/arm/arm-mve-builtins-base.h | 29 + > gcc/config/arm/arm-mve-builtins-functions.h | 50 + > gcc/config/arm/arm-mve-builtins-shapes.cc | 343 ++++ > gcc/config/arm/arm-mve-builtins-shapes.h | 30 + > gcc/config/arm/arm-mve-builtins.cc | 1950 ++++++++++++++++++- > gcc/config/arm/arm-mve-builtins.def | 40 +- > gcc/config/arm/arm-mve-builtins.h | 669 ++++++- > gcc/config/arm/arm-protos.h | 10 +- > gcc/config/arm/arm.cc | 27 + > gcc/config/arm/arm_mve.h | 6 + > gcc/config/arm/predicates.md | 4 + > gcc/config/arm/t-arm | 32 +- > 18 files changed, 3292 insertions(+), 27 deletions(-) > create mode 100644 gcc/config/arm/arm-mve-builtins-base.cc > create mode 100644 gcc/config/arm/arm-mve-builtins-base.def > create mode 100644 gcc/config/arm/arm-mve-builtins-base.h > create mode 100644 gcc/config/arm/arm-mve-builtins-functions.h > create mode 100644 gcc/config/arm/arm-mve-builtins-shapes.cc > create mode 100644 gcc/config/arm/arm-mve-builtins-shapes.h > > diff --git a/gcc/config.gcc b/gcc/config.gcc > index 6fd1594480a..5d49f5890ab 100644 > --- a/gcc/config.gcc > +++ b/gcc/config.gcc > @@ -362,7 
+362,7 @@ arc*-*-*) > ;; > arm*-*-*) > cpu_type=arm > - extra_objs="arm-builtins.o arm-mve-builtins.o aarch-common.o > aarch-bti-insert.o" > + extra_objs="arm-builtins.o arm-mve-builtins.o arm-mve-builtins- > shapes.o arm-mve-builtins-base.o aarch-common.o aarch-bti-insert.o" > extra_headers="mmintrin.h arm_neon.h arm_acle.h arm_fp16.h > arm_cmse.h arm_bf16.h arm_mve_types.h arm_mve.h arm_cde.h" > target_type_format_char='%' > c_target_objs="arm-c.o" > diff --git a/gcc/config/arm/arm-builtins.cc b/gcc/config/arm/arm-builtins.cc > index adcb50d2185..d0c57409b4c 100644 > --- a/gcc/config/arm/arm-builtins.cc > +++ b/gcc/config/arm/arm-builtins.cc > @@ -2712,6 +2712,7 @@ arm_general_builtin_decl (unsigned code) > return arm_builtin_decls[code]; > } > > +/* Implement TARGET_BUILTIN_DECL. */ > /* Return the ARM builtin for CODE. */ > tree > arm_builtin_decl (unsigned code, bool initialize_p ATTRIBUTE_UNUSED) > @@ -2721,6 +2722,8 @@ arm_builtin_decl (unsigned code, bool initialize_p > ATTRIBUTE_UNUSED) > { > case ARM_BUILTIN_GENERAL: > return arm_general_builtin_decl (subcode); > + case ARM_BUILTIN_MVE: > + return error_mark_node; > default: > gcc_unreachable (); > } > @@ -4087,6 +4090,8 @@ arm_expand_builtin (tree exp, > { > case ARM_BUILTIN_GENERAL: > return arm_general_expand_builtin (subcode, exp, target, ignore); > + case ARM_BUILTIN_MVE: > + return arm_mve::expand_builtin (subcode, exp, target); > default: > gcc_unreachable (); > } > @@ -4188,8 +4193,9 @@ arm_general_check_builtin_call (unsigned int code) > > /* Implement TARGET_CHECK_BUILTIN_CALL. 
*/ > bool > -arm_check_builtin_call (location_t, vec<location_t>, tree fndecl, tree, > - unsigned int, tree *) > +arm_check_builtin_call (location_t loc, vec<location_t> arg_loc, > + tree fndecl, tree orig_fndecl, > + unsigned int nargs, tree *args) > { > unsigned int code = DECL_MD_FUNCTION_CODE (fndecl); > unsigned int subcode = code >> ARM_BUILTIN_SHIFT; > @@ -4197,6 +4203,9 @@ arm_check_builtin_call (location_t, > vec<location_t>, tree fndecl, tree, > { > case ARM_BUILTIN_GENERAL: > return arm_general_check_builtin_call (subcode); > + case ARM_BUILTIN_MVE: > + return arm_mve::check_builtin_call (loc, arg_loc, subcode, > + orig_fndecl, nargs, args); > default: > gcc_unreachable (); > } > @@ -4215,6 +4224,8 @@ arm_describe_resolver (tree fndecl) > && subcode < ARM_BUILTIN_MVE_BASE) > return arm_cde_resolver; > return arm_no_resolver; > + case ARM_BUILTIN_MVE: > + return arm_mve_resolver; > default: > gcc_unreachable (); > } > diff --git a/gcc/config/arm/arm-builtins.h b/gcc/config/arm/arm-builtins.h > index 8c94b6bc40b..494dcd09411 100644 > --- a/gcc/config/arm/arm-builtins.h > +++ b/gcc/config/arm/arm-builtins.h > @@ -27,6 +27,7 @@ > > enum resolver_ident { > arm_cde_resolver, > + arm_mve_resolver, > arm_no_resolver > }; > enum resolver_ident arm_describe_resolver (tree); > diff --git a/gcc/config/arm/arm-c.cc b/gcc/config/arm/arm-c.cc > index 59c0d8ce747..d3d93ceba00 100644 > --- a/gcc/config/arm/arm-c.cc > +++ b/gcc/config/arm/arm-c.cc > @@ -144,20 +144,44 @@ arm_pragma_arm (cpp_reader *) > const char *name = TREE_STRING_POINTER (x); > if (strcmp (name, "arm_mve_types.h") == 0) > arm_mve::handle_arm_mve_types_h (); > + else if (strcmp (name, "arm_mve.h") == 0) > + { > + if (pragma_lex (&x) == CPP_NAME) > + { > + if (strcmp (IDENTIFIER_POINTER (x), "true") == 0) > + arm_mve::handle_arm_mve_h (true); > + else if (strcmp (IDENTIFIER_POINTER (x), "false") == 0) > + arm_mve::handle_arm_mve_h (false); > + else > + error ("%<#pragma GCC arm \"arm_mve.h\"%> requires a 
boolean > parameter"); > + } > + } > else > error ("unknown %<#pragma GCC arm%> option %qs", name); > } > > -/* Implement TARGET_RESOLVE_OVERLOADED_BUILTIN. This is currently > only > - used for the MVE related builtins for the CDE extension. > - Here we ensure the type of arguments is such that the size is correct, and > - then return a tree that describes the same function call but with the > - relevant types cast as necessary. */ > +/* Implement TARGET_RESOLVE_OVERLOADED_BUILTIN. */ > tree > -arm_resolve_overloaded_builtin (location_t loc, tree fndecl, void *arglist) > +arm_resolve_overloaded_builtin (location_t loc, tree fndecl, > + void *uncast_arglist) > { > - if (arm_describe_resolver (fndecl) == arm_cde_resolver) > - return arm_resolve_cde_builtin (loc, fndecl, arglist); > + enum resolver_ident resolver = arm_describe_resolver (fndecl); > + if (resolver == arm_cde_resolver) > + return arm_resolve_cde_builtin (loc, fndecl, uncast_arglist); > + if (resolver == arm_mve_resolver) > + { > + vec<tree, va_gc> empty = {}; > + vec<tree, va_gc> *arglist = (uncast_arglist > + ? (vec<tree, va_gc> *) uncast_arglist > + : &empty); > + unsigned int code = DECL_MD_FUNCTION_CODE (fndecl); > + unsigned int subcode = code >> ARM_BUILTIN_SHIFT; > + tree new_fndecl = arm_mve::resolve_overloaded_builtin (loc, subcode, > arglist); > + if (new_fndecl == NULL_TREE || new_fndecl == error_mark_node) > + return new_fndecl; > + return build_function_call_vec (loc, vNULL, new_fndecl, arglist, > + NULL, fndecl); > + } > return NULL_TREE; > } > > @@ -519,7 +543,9 @@ arm_register_target_pragmas (void) > { > /* Update pragma hook to allow parsing #pragma GCC target. 
*/ > targetm.target_option.pragma_parse = arm_pragma_target_parse; > + > targetm.resolve_overloaded_builtin = arm_resolve_overloaded_builtin; > + targetm.check_builtin_call = arm_check_builtin_call; > > c_register_pragma ("GCC", "arm", arm_pragma_arm); > > diff --git a/gcc/config/arm/arm-mve-builtins-base.cc b/gcc/config/arm/arm- > mve-builtins-base.cc > new file mode 100644 > index 00000000000..e9f285faf2b > --- /dev/null > +++ b/gcc/config/arm/arm-mve-builtins-base.cc > @@ -0,0 +1,45 @@ > +/* ACLE support for Arm MVE (__ARM_FEATURE_MVE intrinsics) > + Copyright (C) 2023 Free Software Foundation, Inc. > + > + This file is part of GCC. > + > + GCC is free software; you can redistribute it and/or modify it > + under the terms of the GNU General Public License as published by > + the Free Software Foundation; either version 3, or (at your option) > + any later version. > + > + GCC is distributed in the hope that it will be useful, but > + WITHOUT ANY WARRANTY; without even the implied warranty of > + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU > + General Public License for more details. > + > + You should have received a copy of the GNU General Public License > + along with GCC; see the file COPYING3. If not see > + <http://www.gnu.org/licenses/>. 
*/ > + > +#include "config.h" > +#include "system.h" > +#include "coretypes.h" > +#include "tm.h" > +#include "tree.h" > +#include "rtl.h" > +#include "memmodel.h" > +#include "insn-codes.h" > +#include "optabs.h" > +#include "basic-block.h" > +#include "function.h" > +#include "gimple.h" > +#include "arm-mve-builtins.h" > +#include "arm-mve-builtins-shapes.h" > +#include "arm-mve-builtins-base.h" > +#include "arm-mve-builtins-functions.h" > + > +using namespace arm_mve; > + > +namespace { > + > +} /* end anonymous namespace */ > + > +namespace arm_mve { > + > +} /* end namespace arm_mve */ > diff --git a/gcc/config/arm/arm-mve-builtins-base.def b/gcc/config/arm/arm- > mve-builtins-base.def > new file mode 100644 > index 00000000000..d15ba2e23e8 > --- /dev/null > +++ b/gcc/config/arm/arm-mve-builtins-base.def > @@ -0,0 +1,24 @@ > +/* ACLE support for Arm MVE (__ARM_FEATURE_MVE intrinsics) > + Copyright (C) 2023 Free Software Foundation, Inc. > + > + This file is part of GCC. > + > + GCC is free software; you can redistribute it and/or modify it > + under the terms of the GNU General Public License as published by > + the Free Software Foundation; either version 3, or (at your option) > + any later version. > + > + GCC is distributed in the hope that it will be useful, but > + WITHOUT ANY WARRANTY; without even the implied warranty of > + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU > + General Public License for more details. > + > + You should have received a copy of the GNU General Public License > + along with GCC; see the file COPYING3. If not see > + <http://www.gnu.org/licenses/>. 
*/ > + > +#define REQUIRES_FLOAT false > +#undef REQUIRES_FLOAT > + > +#define REQUIRES_FLOAT true > +#undef REQUIRES_FLOAT > diff --git a/gcc/config/arm/arm-mve-builtins-base.h b/gcc/config/arm/arm- > mve-builtins-base.h > new file mode 100644 > index 00000000000..c4d7b750cd5 > --- /dev/null > +++ b/gcc/config/arm/arm-mve-builtins-base.h > @@ -0,0 +1,29 @@ > +/* ACLE support for Arm MVE (__ARM_FEATURE_MVE intrinsics) > + Copyright (C) 2023 Free Software Foundation, Inc. > + > + This file is part of GCC. > + > + GCC is free software; you can redistribute it and/or modify it > + under the terms of the GNU General Public License as published by > + the Free Software Foundation; either version 3, or (at your option) > + any later version. > + > + GCC is distributed in the hope that it will be useful, but > + WITHOUT ANY WARRANTY; without even the implied warranty of > + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU > + General Public License for more details. > + > + You should have received a copy of the GNU General Public License > + along with GCC; see the file COPYING3. If not see > + <http://www.gnu.org/licenses/>. */ > + > +#ifndef GCC_ARM_MVE_BUILTINS_BASE_H > +#define GCC_ARM_MVE_BUILTINS_BASE_H > + > +namespace arm_mve { > +namespace functions { > + > +} /* end namespace arm_mve::functions */ > +} /* end namespace arm_mve */ > + > +#endif > diff --git a/gcc/config/arm/arm-mve-builtins-functions.h > b/gcc/config/arm/arm-mve-builtins-functions.h > new file mode 100644 > index 00000000000..dff01999bcd > --- /dev/null > +++ b/gcc/config/arm/arm-mve-builtins-functions.h > @@ -0,0 +1,50 @@ > +/* ACLE support for Arm MVE (function_base classes) > + Copyright (C) 2023 Free Software Foundation, Inc. > + > + This file is part of GCC. 
> + > + GCC is free software; you can redistribute it and/or modify it > + under the terms of the GNU General Public License as published by > + the Free Software Foundation; either version 3, or (at your option) > + any later version. > + > + GCC is distributed in the hope that it will be useful, but > + WITHOUT ANY WARRANTY; without even the implied warranty of > + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU > + General Public License for more details. > + > + You should have received a copy of the GNU General Public License > + along with GCC; see the file COPYING3. If not see > + <http://www.gnu.org/licenses/>. */ > + > +#ifndef GCC_ARM_MVE_BUILTINS_FUNCTIONS_H > +#define GCC_ARM_MVE_BUILTINS_FUNCTIONS_H > + > +namespace arm_mve { > + > +/* Wrap T, which is derived from function_base, and indicate that the > + function never has side effects. It is only necessary to use this > + wrapper on functions that might have floating-point suffixes, since > + otherwise we assume by default that the function has no side effects. */ > +template<typename T> > +class quiet : public T > +{ > +public: > + CONSTEXPR quiet () : T () {} > + > + unsigned int > + call_properties (const function_instance &) const override > + { > + return 0; > + } > +}; > + > +} /* end namespace arm_mve */ > + > +/* Declare the global function base NAME, creating it from an instance > + of class CLASS with constructor arguments ARGS. */ > +#define FUNCTION(NAME, CLASS, ARGS) \ > + namespace { static CONSTEXPR const CLASS NAME##_obj ARGS; } \ > + namespace functions { const function_base *const NAME = &NAME##_obj; > } > + > +#endif > diff --git a/gcc/config/arm/arm-mve-builtins-shapes.cc b/gcc/config/arm/arm- > mve-builtins-shapes.cc > new file mode 100644 > index 00000000000..f20660d8319 > --- /dev/null > +++ b/gcc/config/arm/arm-mve-builtins-shapes.cc > @@ -0,0 +1,343 @@ > +/* ACLE support for Arm MVE (function shapes) > + Copyright (C) 2023 Free Software Foundation, Inc. 
> + > + This file is part of GCC. > + > + GCC is free software; you can redistribute it and/or modify it > + under the terms of the GNU General Public License as published by > + the Free Software Foundation; either version 3, or (at your option) > + any later version. > + > + GCC is distributed in the hope that it will be useful, but > + WITHOUT ANY WARRANTY; without even the implied warranty of > + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU > + General Public License for more details. > + > + You should have received a copy of the GNU General Public License > + along with GCC; see the file COPYING3. If not see > + <http://www.gnu.org/licenses/>. */ > + > +#include "config.h" > +#include "system.h" > +#include "coretypes.h" > +#include "tm.h" > +#include "tree.h" > +#include "rtl.h" > +#include "memmodel.h" > +#include "insn-codes.h" > +#include "optabs.h" > +#include "arm-mve-builtins.h" > +#include "arm-mve-builtins-shapes.h" > + > +/* In the comments below, _t0 represents the first type suffix > + (e.g. "_s8") and _t1 represents the second. T0/T1 represent the > + type full names (e.g. int8x16_t). Square brackets enclose > + characters that are present in only the full name, not the > + overloaded name. Governing predicate arguments and predicate > + suffixes are not shown, since they depend on the predication type, > + which is a separate piece of information from the shape. */ > + > +namespace arm_mve { > + > +/* If INSTANCE has a predicate, add it to the list of argument types > + in ARGUMENT_TYPES. RETURN_TYPE is the type returned by the > + function. */ > +static void > +apply_predication (const function_instance &instance, tree return_type, > + vec<tree> &argument_types) > +{ > + if (instance.pred != PRED_none) > + { > + /* When predicate is PRED_m, insert a first argument > + ("inactive") with the same type as return_type. 
*/ > + if (instance.has_inactive_argument ()) > + argument_types.quick_insert (0, return_type); > + argument_types.quick_push (get_mve_pred16_t ()); > + } > +} > + > +/* Parse and move past an element type in FORMAT and return it as a type > + suffix. The format is: > + > + [01] - the element type in type suffix 0 or 1 of INSTANCE. > + h<elt> - a half-sized version of <elt> > + s<bits> - a signed type with the given number of bits > + s[01] - a signed type with the same width as type suffix 0 or 1 > + u<bits> - an unsigned type with the given number of bits > + u[01] - an unsigned type with the same width as type suffix 0 or 1 > + w<elt> - a double-sized version of <elt> > + x<bits> - a type with the given number of bits and same signedness > + as the next argument. > + > + Future intrinsics will extend this format. */ > +static type_suffix_index > +parse_element_type (const function_instance &instance, const char > *&format) > +{ > + int ch = *format++; > + > + > + if (ch == 's' || ch == 'u') > + { > + type_class_index tclass = (ch == 'f' ? TYPE_float > + : ch == 's' ? 
TYPE_signed > + : TYPE_unsigned); > + char *end; > + unsigned int bits = strtol (format, &end, 10); > + format = end; > + if (bits == 0 || bits == 1) > + bits = instance.type_suffix (bits).element_bits; > + return find_type_suffix (tclass, bits); > + } > + > + if (ch == 'h') > + { > + type_suffix_index suffix = parse_element_type (instance, format); > + return find_type_suffix (type_suffixes[suffix].tclass, > + type_suffixes[suffix].element_bits / 2); > + } > + > + if (ch == 'w') > + { > + type_suffix_index suffix = parse_element_type (instance, format); > + return find_type_suffix (type_suffixes[suffix].tclass, > + type_suffixes[suffix].element_bits * 2); > + } > + > + if (ch == 'x') > + { > + const char *next = format; > + next = strstr (format, ","); > + next+=2; > + type_suffix_index suffix = parse_element_type (instance, next); > + type_class_index tclass = type_suffixes[suffix].tclass; > + char *end; > + unsigned int bits = strtol (format, &end, 10); > + format = end; > + return find_type_suffix (tclass, bits); > + } > + > + if (ch == '0' || ch == '1') > + return instance.type_suffix_ids[ch - '0']; > + > + gcc_unreachable (); > +} > + > +/* Read and return a type from FORMAT for function INSTANCE. Advance > + FORMAT beyond the type string. The format is: > + > + p - predicates with type mve_pred16_t > + s<elt> - a scalar type with the given element suffix > + t<elt> - a vector or tuple type with given element suffix [*1] > + v<elt> - a vector with the given element suffix > + > + where <elt> has the format described above parse_element_type. > + > + Future intrinsics will extend this format. > + > + [*1] the vectors_per_tuple function indicates whether the type should > + be a tuple, and if so, how many vectors it should contain. 
*/ > +static tree > +parse_type (const function_instance &instance, const char *&format) > +{ > + int ch = *format++; > + > + if (ch == 'p') > + return get_mve_pred16_t (); > + > + if (ch == 's') > + { > + type_suffix_index suffix = parse_element_type (instance, format); > + return scalar_types[type_suffixes[suffix].vector_type]; > + } > + > + if (ch == 't') > + { > + type_suffix_index suffix = parse_element_type (instance, format); > + vector_type_index vector_type = type_suffixes[suffix].vector_type; > + unsigned int num_vectors = instance.vectors_per_tuple (); > + return acle_vector_types[num_vectors - 1][vector_type]; > + } > + > + if (ch == 'v') > + { > + type_suffix_index suffix = parse_element_type (instance, format); > + return acle_vector_types[0][type_suffixes[suffix].vector_type]; > + } > + > + gcc_unreachable (); > +} > + > +/* Read a type signature for INSTANCE from FORMAT. Add the argument > + types to ARGUMENT_TYPES and return the return type. Assert there > + are no more than MAX_ARGS arguments. > + > + The format is a comma-separated list of types (as for parse_type), > + with the first type being the return type and the rest being the > + argument types. */ > +static tree > +parse_signature (const function_instance &instance, const char *format, > + vec<tree> &argument_types, unsigned int max_args) > +{ > + tree return_type = parse_type (instance, format); > + unsigned int args = 0; > + while (format[0] == ',') > + { > + gcc_assert (args < max_args); > + format += 1; > + tree argument_type = parse_type (instance, format); > + argument_types.quick_push (argument_type); > + args += 1; > + } > + gcc_assert (format[0] == 0); > + return return_type; > +} > + > +/* Add one function instance for GROUP, using mode suffix > MODE_SUFFIX_ID, > + the type suffixes at index TI and the predication suffix at index PI. > + The other arguments are as for build_all. 
*/ > +static void > +build_one (function_builder &b, const char *signature, > + const function_group_info &group, mode_suffix_index > mode_suffix_id, > + unsigned int ti, unsigned int pi, bool preserve_user_namespace, > + bool force_direct_overloads) > +{ > + /* Current functions take at most five arguments. Match > + parse_signature parameter below. */ > + auto_vec<tree, 5> argument_types; > + function_instance instance (group.base_name, *group.base, *group.shape, > + mode_suffix_id, group.types[ti], > + group.preds[pi]); > + tree return_type = parse_signature (instance, signature, argument_types, > 5); > + apply_predication (instance, return_type, argument_types); > + b.add_unique_function (instance, return_type, argument_types, > + preserve_user_namespace, group.requires_float, > + force_direct_overloads); > +} > + > +/* Add a function instance for every type and predicate combination in > + GROUP, except if requested to use only the predicates listed in > + RESTRICT_TO_PREDS. Take the function base name from GROUP and the > + mode suffix from MODE_SUFFIX_ID. Use SIGNATURE to construct the > + function signature, then use apply_predication to add in the > + predicate. 
*/ > +static void > +build_all (function_builder &b, const char *signature, > + const function_group_info &group, mode_suffix_index > mode_suffix_id, > + bool preserve_user_namespace, > + bool force_direct_overloads = false, > + const predication_index *restrict_to_preds = NULL) > +{ > + for (unsigned int pi = 0; group.preds[pi] != NUM_PREDS; ++pi) > + { > + unsigned int pi2 = 0; > + > + if (restrict_to_preds) > + for (; restrict_to_preds[pi2] != NUM_PREDS; ++pi2) > + if (restrict_to_preds[pi2] == group.preds[pi]) > + break; > + > + if (restrict_to_preds == NULL || restrict_to_preds[pi2] != NUM_PREDS) > + for (unsigned int ti = 0; > + ti == 0 || group.types[ti][0] != NUM_TYPE_SUFFIXES; ++ti) > + build_one (b, signature, group, mode_suffix_id, ti, pi, > + preserve_user_namespace, force_direct_overloads); > + } > +} > + > +/* Add a function instance for every type and predicate combination in > + GROUP, except if requested to use only the predicates listed in > + RESTRICT_TO_PREDS, and only for 16-bit and 32-bit integers. Take > + the function base name from GROUP and the mode suffix from > + MODE_SUFFIX_ID. Use SIGNATURE to construct the function signature, > + then use apply_predication to add in the predicate. 
*/ > +static void > +build_16_32 (function_builder &b, const char *signature, > + const function_group_info &group, mode_suffix_index > mode_suffix_id, > + bool preserve_user_namespace, > + bool force_direct_overloads = false, > + const predication_index *restrict_to_preds = NULL) > +{ > + for (unsigned int pi = 0; group.preds[pi] != NUM_PREDS; ++pi) > + { > + unsigned int pi2 = 0; > + > + if (restrict_to_preds) > + for (; restrict_to_preds[pi2] != NUM_PREDS; ++pi2) > + if (restrict_to_preds[pi2] == group.preds[pi]) > + break; > + > + if (restrict_to_preds == NULL || restrict_to_preds[pi2] != NUM_PREDS) > + for (unsigned int ti = 0; > + ti == 0 || group.types[ti][0] != NUM_TYPE_SUFFIXES; ++ti) > + { > + unsigned int element_bits = > type_suffixes[group.types[ti][0]].element_bits; > + type_class_index tclass = type_suffixes[group.types[ti][0]].tclass; > + if ((tclass == TYPE_signed || tclass == TYPE_unsigned) > + && (element_bits == 16 || element_bits == 32)) > + build_one (b, signature, group, mode_suffix_id, ti, pi, > + preserve_user_namespace, force_direct_overloads); > + } > + } > +} > + > +/* Declare the function shape NAME, pointing it to an instance > + of class <NAME>_def. */ > +#define SHAPE(NAME) \ > + static CONSTEXPR const NAME##_def NAME##_obj; \ > + namespace shapes { const function_shape *const NAME = &NAME##_obj; } > + > +/* Base class for functions that are not overloaded. 
*/ > +struct nonoverloaded_base : public function_shape > +{ > + bool > + explicit_type_suffix_p (unsigned int, enum predication_index, enum > mode_suffix_index) const override > + { > + return true; > + } > + > + bool > + explicit_mode_suffix_p (enum predication_index, enum > mode_suffix_index) const override > + { > + return true; > + } > + > + bool > + skip_overload_p (enum predication_index, enum mode_suffix_index) > const override > + { > + return false; > + } > + > + tree > + resolve (function_resolver &) const override > + { > + gcc_unreachable (); > + } > +}; > + > +/* Base class for overloaded functions. Bit N of EXPLICIT_MASK is true > + if type suffix N appears in the overloaded name. */ > +template<unsigned int EXPLICIT_MASK> > +struct overloaded_base : public function_shape > +{ > + bool > + explicit_type_suffix_p (unsigned int i, enum predication_index, enum > mode_suffix_index) const override > + { > + return (EXPLICIT_MASK >> i) & 1; > + } > + > + bool > + explicit_mode_suffix_p (enum predication_index, enum > mode_suffix_index) const override > + { > + return false; > + } > + > + bool > + skip_overload_p (enum predication_index, enum mode_suffix_index) > const override > + { > + return false; > + } > +}; > + > +} /* end namespace arm_mve */ > + > +#undef SHAPE > diff --git a/gcc/config/arm/arm-mve-builtins-shapes.h b/gcc/config/arm/arm- > mve-builtins-shapes.h > new file mode 100644 > index 00000000000..9e353b85a76 > --- /dev/null > +++ b/gcc/config/arm/arm-mve-builtins-shapes.h > @@ -0,0 +1,30 @@ > +/* ACLE support for Arm MVE (function shapes) > + Copyright (C) 2023 Free Software Foundation, Inc. > + > + This file is part of GCC. > + > + GCC is free software; you can redistribute it and/or modify it > + under the terms of the GNU General Public License as published by > + the Free Software Foundation; either version 3, or (at your option) > + any later version. 
> + > + GCC is distributed in the hope that it will be useful, but > + WITHOUT ANY WARRANTY; without even the implied warranty of > + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU > + General Public License for more details. > + > + You should have received a copy of the GNU General Public License > + along with GCC; see the file COPYING3. If not see > + <http://www.gnu.org/licenses/>. */ > + > +#ifndef GCC_ARM_MVE_BUILTINS_SHAPES_H > +#define GCC_ARM_MVE_BUILTINS_SHAPES_H > + > +namespace arm_mve > +{ > + namespace shapes > + { > + } /* end namespace arm_mve::shapes */ > +} /* end namespace arm_mve */ > + > +#endif > diff --git a/gcc/config/arm/arm-mve-builtins.cc b/gcc/config/arm/arm-mve- > builtins.cc > index 7586a82e3c1..b0cceb75ceb 100644 > --- a/gcc/config/arm/arm-mve-builtins.cc > +++ b/gcc/config/arm/arm-mve-builtins.cc > @@ -24,7 +24,19 @@ > #include "coretypes.h" > #include "tm.h" > #include "tree.h" > +#include "rtl.h" > +#include "tm_p.h" > +#include "memmodel.h" > +#include "insn-codes.h" > +#include "optabs.h" > +#include "recog.h" > +#include "expr.h" > +#include "basic-block.h" > +#include "function.h" > #include "fold-const.h" > +#include "gimple.h" > +#include "gimple-iterator.h" > +#include "emit-rtl.h" > #include "langhooks.h" > #include "stringpool.h" > #include "attribs.h" > @@ -32,6 +44,8 @@ > #include "arm-protos.h" > #include "arm-builtins.h" > #include "arm-mve-builtins.h" > +#include "arm-mve-builtins-base.h" > +#include "arm-mve-builtins-shapes.h" > > namespace arm_mve { > > @@ -46,6 +60,33 @@ struct vector_type_info > const bool requires_float; > }; > > +/* Describes a function decl. */ > +class GTY(()) registered_function > +{ > +public: > + /* The ACLE function that the decl represents. */ > + function_instance instance GTY ((skip)); > + > + /* The decl itself. */ > + tree decl; > + > + /* Whether the function requires a floating point abi. 
*/ > + bool requires_float; > + > + /* True if the decl represents an overloaded function that needs to be > + resolved by function_resolver. */ > + bool overloaded_p; > +}; > + > +/* Hash traits for registered_function. */ > +struct registered_function_hasher : nofree_ptr_hash <registered_function> > +{ > + typedef function_instance compare_type; > + > + static hashval_t hash (value_type); > + static bool equal (value_type, const compare_type &); > +}; > + > /* Flag indicating whether the arm MVE types have been handled. */ > static bool handle_arm_mve_types_p; > > @@ -54,11 +95,167 @@ static CONSTEXPR const vector_type_info > vector_types[] = { > #define DEF_MVE_TYPE(ACLE_NAME, SCALAR_TYPE) \ > { #ACLE_NAME, REQUIRES_FLOAT }, > #include "arm-mve-builtins.def" > -#undef DEF_MVE_TYPE > +}; > + > +/* The function name suffix associated with each predication type. */ > +static const char *const pred_suffixes[NUM_PREDS + 1] = { > + "", > + "_m", > + "_p", > + "_x", > + "_z", > + "" > +}; > + > +/* Static information about each mode_suffix_index. */ > +CONSTEXPR const mode_suffix_info mode_suffixes[] = { > +#define VECTOR_TYPE_none NUM_VECTOR_TYPES > +#define DEF_MVE_MODE(NAME, BASE, DISPLACEMENT, UNITS) \ > + { "_" #NAME, VECTOR_TYPE_##BASE, VECTOR_TYPE_##DISPLACEMENT, > UNITS_##UNITS }, > +#include "arm-mve-builtins.def" > +#undef VECTOR_TYPE_none > + { "", NUM_VECTOR_TYPES, NUM_VECTOR_TYPES, UNITS_none } > +}; > + > +/* Static information about each type_suffix_index. 
*/ > +CONSTEXPR const type_suffix_info type_suffixes[NUM_TYPE_SUFFIXES + 1] = > { > +#define DEF_MVE_TYPE_SUFFIX(NAME, ACLE_TYPE, CLASS, BITS, MODE) > \ > + { "_" #NAME, \ > + VECTOR_TYPE_##ACLE_TYPE, \ > + TYPE_##CLASS, \ > + BITS, \ > + BITS / BITS_PER_UNIT, \ > + TYPE_##CLASS == TYPE_signed || TYPE_##CLASS == TYPE_unsigned, \ > + TYPE_##CLASS == TYPE_unsigned, \ > + TYPE_##CLASS == TYPE_float, \ > + 0, \ > + MODE }, > +#include "arm-mve-builtins.def" > + { "", NUM_VECTOR_TYPES, TYPE_bool, 0, 0, false, false, false, > + 0, VOIDmode } > +}; > + > +/* Define a TYPES_<combination> macro for each combination of type > + suffixes that an ACLE function can have, where <combination> is the > + name used in DEF_MVE_FUNCTION entries. > + > + Use S (T) for single type suffix T and D (T1, T2) for a pair of type > + suffixes T1 and T2. Use commas to separate the suffixes. > + > + Although the order shouldn't matter, the convention is to sort the > + suffixes lexicographically after dividing suffixes into a type > + class ("b", "f", etc.) and a numerical bit count. */ > + > +/* _f16. */ > +#define TYPES_float16(S, D) \ > + S (f16) > + > +/* _f16 _f32. */ > +#define TYPES_all_float(S, D) \ > + S (f16), S (f32) > + > +/* _s8 _u8. */ > +#define TYPES_integer_8(S, D) \ > + S (s8), S (u8) > + > +/* _s8 _s16 > + _u8 _u16. */ > +#define TYPES_integer_8_16(S, D) \ > + S (s8), S (s16), S (u8), S (u16) > + > +/* _s16 _s32 > + _u16 _u32. */ > +#define TYPES_integer_16_32(S, D) \ > + S (s16), S (s32), \ > + S (u16), S (u32) > + > +/* _s16 _s32. */ > +#define TYPES_signed_16_32(S, D) \ > + S (s16), S (s32) > + > +/* _s8 _s16 _s32. */ > +#define TYPES_all_signed(S, D) \ > + S (s8), S (s16), S (s32) > + > +/* _u8 _u16 _u32. */ > +#define TYPES_all_unsigned(S, D) \ > + S (u8), S (u16), S (u32) > + > +/* _s8 _s16 _s32 > + _u8 _u16 _u32. */ > +#define TYPES_all_integer(S, D) \ > + TYPES_all_signed (S, D), TYPES_all_unsigned (S, D) > + > +/* _s8 _s16 _s32 _s64 > + _u8 _u16 _u32 _u64. 
*/ > +#define TYPES_all_integer_with_64(S, D) \ > + TYPES_all_signed (S, D), S (s64), TYPES_all_unsigned (S, D), S (u64) > + > +/* _s32 _u32. */ > +#define TYPES_integer_32(S, D) \ > + S (s32), S (u32) > + > +/* _s32. */ > +#define TYPES_signed_32(S, D) \ > + S (s32) > + > +/* Describe a pair of type suffixes in which only the first is used. */ > +#define DEF_VECTOR_TYPE(X) { TYPE_SUFFIX_ ## X, NUM_TYPE_SUFFIXES } > + > +/* Describe a pair of type suffixes in which both are used. */ > +#define DEF_DOUBLE_TYPE(X, Y) { TYPE_SUFFIX_ ## X, TYPE_SUFFIX_ ## Y } > + > +/* Create an array that can be used in arm-mve-builtins.def to > + select the type suffixes in TYPES_<NAME>. */ > +#define DEF_MVE_TYPES_ARRAY(NAME) \ > + static const type_suffix_pair types_##NAME[] = { \ > + TYPES_##NAME (DEF_VECTOR_TYPE, DEF_DOUBLE_TYPE), \ > + { NUM_TYPE_SUFFIXES, NUM_TYPE_SUFFIXES } \ > + } > + > +/* For functions that don't take any type suffixes. */ > +static const type_suffix_pair types_none[] = { > + { NUM_TYPE_SUFFIXES, NUM_TYPE_SUFFIXES }, > + { NUM_TYPE_SUFFIXES, NUM_TYPE_SUFFIXES } > +}; > + > +DEF_MVE_TYPES_ARRAY (all_integer); > +DEF_MVE_TYPES_ARRAY (all_integer_with_64); > +DEF_MVE_TYPES_ARRAY (float16); > +DEF_MVE_TYPES_ARRAY (all_float); > +DEF_MVE_TYPES_ARRAY (all_signed); > +DEF_MVE_TYPES_ARRAY (all_unsigned); > +DEF_MVE_TYPES_ARRAY (integer_8); > +DEF_MVE_TYPES_ARRAY (integer_8_16); > +DEF_MVE_TYPES_ARRAY (integer_16_32); > +DEF_MVE_TYPES_ARRAY (integer_32); > +DEF_MVE_TYPES_ARRAY (signed_16_32); > +DEF_MVE_TYPES_ARRAY (signed_32); > + > +/* Used by functions that have no governing predicate. */ > +static const predication_index preds_none[] = { PRED_none, NUM_PREDS }; > + > +/* Used by functions that have the m (merging) predicated form, and in > + addition have an unpredicated form. 
*/ > +static const predication_index preds_m_or_none[] = { > + PRED_m, PRED_none, NUM_PREDS > +}; > + > +/* Used by functions that have the m (merging) and x ("don't care") > + predicated forms, and in addition have an unpredicated form. */ > +static const predication_index preds_mx_or_none[] = { > + PRED_m, PRED_x, PRED_none, NUM_PREDS > +}; > + > +/* Used by functions that have the p predicated form, in addition to > + an unpredicated form. */ > +static const predication_index preds_p_or_none[] = { > + PRED_p, PRED_none, NUM_PREDS > }; > > /* The scalar type associated with each vector type. */ > -GTY(()) tree scalar_types[NUM_VECTOR_TYPES]; > +extern GTY(()) tree scalar_types[NUM_VECTOR_TYPES]; > +tree scalar_types[NUM_VECTOR_TYPES]; > > /* The single-predicate and single-vector types, with their built-in > "__simd128_..._t" name. Allow an index of NUM_VECTOR_TYPES, which > always > @@ -66,7 +263,20 @@ GTY(()) tree scalar_types[NUM_VECTOR_TYPES]; > static GTY(()) tree abi_vector_types[NUM_VECTOR_TYPES + 1]; > > /* Same, but with the arm_mve.h names. */ > -GTY(()) tree acle_vector_types[3][NUM_VECTOR_TYPES + 1]; > +extern GTY(()) tree > acle_vector_types[MAX_TUPLE_SIZE][NUM_VECTOR_TYPES + 1]; > +tree acle_vector_types[MAX_TUPLE_SIZE][NUM_VECTOR_TYPES + 1]; > + > +/* The list of all registered function decls, indexed by code. */ > +static GTY(()) vec<registered_function *, va_gc> *registered_functions; > + > +/* All registered function decls, hashed on the function_instance > + that they implement. This is used for looking up implementations of > + overloaded functions. */ > +static hash_table<registered_function_hasher> *function_table; > + > +/* True if we've already complained about attempts to use functions > + when the required extension is disabled. */ > +static bool reported_missing_float_p; > > /* Return the MVE abi type with element of type TYPE. 
*/ > static tree > @@ -87,7 +297,6 @@ register_builtin_types () > #define DEF_MVE_TYPE(ACLE_NAME, SCALAR_TYPE) \ > scalar_types[VECTOR_TYPE_ ## ACLE_NAME] = SCALAR_TYPE; > #include "arm-mve-builtins.def" > -#undef DEF_MVE_TYPE > for (unsigned int i = 0; i < NUM_VECTOR_TYPES; ++i) > { > if (vector_types[i].requires_float && !TARGET_HAVE_MVE_FLOAT) > @@ -113,8 +322,18 @@ register_builtin_types () > static void > register_vector_type (vector_type_index type) > { > + > + /* If the target does not have the mve.fp extension, but the type requires > + it, then it needs to be assigned a non-dummy type so that functions > + with those types in their signature can be registered. This allows for > + diagnostics about the missing extension, rather than about a missing > + function definition. */ > if (vector_types[type].requires_float && !TARGET_HAVE_MVE_FLOAT) > - return; > + { > + acle_vector_types[0][type] = void_type_node; > + return; > + } > + > tree vectype = abi_vector_types[type]; > tree id = get_identifier (vector_types[type].acle_name); > tree decl = build_decl (input_location, TYPE_DECL, id, vectype); > @@ -133,15 +352,26 @@ register_vector_type (vector_type_index type) > acle_vector_types[0][type] = vectype; > } > > -/* Register tuple type TYPE with NUM_VECTORS arity under its > - arm_mve_types.h name. */ > +/* Register tuple types of element type TYPE under their arm_mve_types.h > + names. */ > static void > register_builtin_tuple_types (vector_type_index type) > { > const vector_type_info* info = &vector_types[type]; > + > + /* If the target does not have the mve.fp extension, but the type requires > + it, then it needs to be assigned a non-dummy type so that functions > + with those types in their signature can be registered. This allows for > + diagnostics about the missing extension, rather than about a missing > + function definition. 
*/ > if (scalar_types[type] == boolean_type_node > || (info->requires_float && !TARGET_HAVE_MVE_FLOAT)) > + { > + for (unsigned int num_vectors = 2; num_vectors <= 4; num_vectors += 2) > + acle_vector_types[num_vectors >> 1][type] = void_type_node; > return; > + } > + > const char *vector_type_name = info->acle_name; > char buffer[sizeof ("float32x4x2_t")]; > for (unsigned int num_vectors = 2; num_vectors <= 4; num_vectors += 2) > @@ -189,8 +419,1710 @@ handle_arm_mve_types_h () > } > } > > -} /* end namespace arm_mve */ > +/* Implement #pragma GCC arm "arm_mve.h" <bool>. */ > +void > +handle_arm_mve_h (bool preserve_user_namespace) > +{ > + if (function_table) > + { > + error ("duplicate definition of %qs", "arm_mve.h"); > + return; > + } > > -using namespace arm_mve; > + /* Define MVE functions. */ > + function_table = new hash_table<registered_function_hasher> (1023); > +} > + > +/* Return true if CANDIDATE is equivalent to MODEL_TYPE for overloading > + purposes. */ > +static bool > +matches_type_p (const_tree model_type, const_tree candidate) > +{ > + if (VECTOR_TYPE_P (model_type)) > + { > + if (!VECTOR_TYPE_P (candidate) > + || maybe_ne (TYPE_VECTOR_SUBPARTS (model_type), > + TYPE_VECTOR_SUBPARTS (candidate)) > + || TYPE_MODE (model_type) != TYPE_MODE (candidate)) > + return false; > + > + model_type = TREE_TYPE (model_type); > + candidate = TREE_TYPE (candidate); > + } > + return (candidate != error_mark_node > + && TYPE_MAIN_VARIANT (model_type) == TYPE_MAIN_VARIANT > (candidate)); > +} > + > +/* Report an error against LOCATION that the user has tried to use > + a floating point function when the mve.fp extension is disabled. */ > +static void > +report_missing_float (location_t location, tree fndecl) > +{ > + /* Avoid reporting a slew of messages for a single oversight. 
*/ > + if (reported_missing_float_p) > + return; > + > + error_at (location, "ACLE function %qD requires ISA extension %qs", > + fndecl, "mve.fp"); > + inform (location, "you can enable mve.fp by using the command-line" > + " option %<-march%>, or by using the %<target%>" > + " attribute or pragma"); > + reported_missing_float_p = true; > +} > + > +/* Report that LOCATION has a call to FNDECL in which argument ARGNO > + was not an integer constant expression. ARGNO counts from zero. */ > +static void > +report_non_ice (location_t location, tree fndecl, unsigned int argno) > +{ > + error_at (location, "argument %d of %qE must be an integer constant" > + " expression", argno + 1, fndecl); > +} > + > +/* Report that LOCATION has a call to FNDECL in which argument ARGNO has > + the value ACTUAL, whereas the function requires a value in the range > + [MIN, MAX]. ARGNO counts from zero. */ > +static void > +report_out_of_range (location_t location, tree fndecl, unsigned int argno, > + HOST_WIDE_INT actual, HOST_WIDE_INT min, > + HOST_WIDE_INT max) > +{ > + error_at (location, "passing %wd to argument %d of %qE, which expects" > + " a value in the range [%wd, %wd]", actual, argno + 1, fndecl, > + min, max); > +} > + > +/* Report that LOCATION has a call to FNDECL in which argument ARGNO has > + the value ACTUAL, whereas the function requires a valid value of > + enum type ENUMTYPE. ARGNO counts from zero. */ > +static void > +report_not_enum (location_t location, tree fndecl, unsigned int argno, > + HOST_WIDE_INT actual, tree enumtype) > +{ > + error_at (location, "passing %wd to argument %d of %qE, which expects" > + " a valid %qT value", actual, argno + 1, fndecl, enumtype); > +} > + > +/* Checks that the mve.fp extension is enabled, given that REQUIRES_FLOAT > + indicates whether it is required or not for function FNDECL. > + Report an error against LOCATION if not. 
*/ > +static bool > +check_requires_float (location_t location, tree fndecl, > + bool requires_float) > +{ > + if (requires_float && !TARGET_HAVE_MVE_FLOAT) > + { > + report_missing_float (location, fndecl); > + return false; > + } > + > + return true; > +} > + > +/* Return a hash code for a function_instance. */ > +hashval_t > +function_instance::hash () const > +{ > + inchash::hash h; > + /* BASE uniquely determines BASE_NAME, so we don't need to hash both. > */ > + h.add_ptr (base); > + h.add_ptr (shape); > + h.add_int (mode_suffix_id); > + h.add_int (type_suffix_ids[0]); > + h.add_int (type_suffix_ids[1]); > + h.add_int (pred); > + return h.end (); > +} > + > +/* Return a set of CP_* flags that describe what the function could do, > + taking the command-line flags into account. */ > +unsigned int > +function_instance::call_properties () const > +{ > + unsigned int flags = base->call_properties (*this); > + > + /* -fno-trapping-math means that we can assume any FP exceptions > + are not user-visible. */ > + if (!flag_trapping_math) > + flags &= ~CP_RAISE_FP_EXCEPTIONS; > + > + return flags; > +} > + > +/* Return true if calls to the function could read some form of > + global state. */ > +bool > +function_instance::reads_global_state_p () const > +{ > + unsigned int flags = call_properties (); > + > + /* Preserve any dependence on rounding mode, flush to zero mode, etc. > + There is currently no way of turning this off; in particular, > + -fno-rounding-math (which is the default) means that we should make > + the usual assumptions about rounding mode, which for intrinsics means > + acting as the instructions do. */ > + if (flags & CP_READ_FPCR) > + return true; > + > + return false; > +} > + > +/* Return true if calls to the function could modify some form of > + global state. 
*/ > +bool > +function_instance::modifies_global_state_p () const > +{ > + unsigned int flags = call_properties (); > + > + /* Preserve any exception state written back to the FPCR, > + unless -fno-trapping-math says this is unnecessary. */ > + if (flags & CP_RAISE_FP_EXCEPTIONS) > + return true; > + > + /* Handle direct modifications of global state. */ > + return flags & CP_WRITE_MEMORY; > +} > + > +/* Return true if calls to the function could raise a signal. */ > +bool > +function_instance::could_trap_p () const > +{ > + unsigned int flags = call_properties (); > + > + /* Handle functions that could raise SIGFPE. */ > + if (flags & CP_RAISE_FP_EXCEPTIONS) > + return true; > + > + /* Handle functions that could raise SIGBUS or SIGSEGV. */ > + if (flags & (CP_READ_MEMORY | CP_WRITE_MEMORY)) > + return true; > + > + return false; > +} > + > +/* Return true if the function has an implicit "inactive" argument. > + This is the case of most _m predicated functions, but not all. > + The list will be updated as needed. */ > +bool > +function_instance::has_inactive_argument () const > +{ > + if (pred != PRED_m) > + return false; > + > + return true; > +} > + > +inline hashval_t > +registered_function_hasher::hash (value_type value) > +{ > + return value->instance.hash (); > +} > + > +inline bool > +registered_function_hasher::equal (value_type value, const compare_type > &key) > +{ > + return value->instance == key; > +} > + > +function_builder::function_builder () > +{ > + m_overload_type = build_function_type (void_type_node, void_list_node); > + m_direct_overloads = lang_GNU_CXX (); > + gcc_obstack_init (&m_string_obstack); > +} > + > +function_builder::~function_builder () > +{ > + obstack_free (&m_string_obstack, NULL); > +} > + > +/* Add NAME to the end of the function name being built. 
*/ > +void > +function_builder::append_name (const char *name) > +{ > + obstack_grow (&m_string_obstack, name, strlen (name)); > +} > + > +/* Zero-terminate and complete the function name being built. */ > +char * > +function_builder::finish_name () > +{ > + obstack_1grow (&m_string_obstack, 0); > + return (char *) obstack_finish (&m_string_obstack); > +} > + > +/* Return the overloaded or full function name for INSTANCE, with optional > + prefix; PRESERVE_USER_NAMESPACE selects the prefix, and > OVERLOADED_P > + selects between the overloaded and full function name. Allocate the string on > + m_string_obstack; the caller must use obstack_free to free it after use. */ > +char * > +function_builder::get_name (const function_instance &instance, > + bool preserve_user_namespace, > + bool overloaded_p) > +{ > + if (preserve_user_namespace) > + append_name ("__arm_"); > + append_name (instance.base_name); > + append_name (pred_suffixes[instance.pred]); > + if (!overloaded_p > + || instance.shape->explicit_mode_suffix_p (instance.pred, > + instance.mode_suffix_id)) > + append_name (instance.mode_suffix ().string); > + for (unsigned int i = 0; i < 2; ++i) > + if (!overloaded_p > + || instance.shape->explicit_type_suffix_p (i, instance.pred, > + instance.mode_suffix_id)) > + append_name (instance.type_suffix (i).string); > + return finish_name (); > +} > + > +/* Add attribute NAME to ATTRS. */ > +static tree > +add_attribute (const char *name, tree attrs) > +{ > + return tree_cons (get_identifier (name), NULL_TREE, attrs); > +} > + > +/* Return the appropriate function attributes for INSTANCE. 
*/ > +tree > +function_builder::get_attributes (const function_instance &instance) > +{ > + tree attrs = NULL_TREE; > + > + if (!instance.modifies_global_state_p ()) > + { > + if (instance.reads_global_state_p ()) > + attrs = add_attribute ("pure", attrs); > + else > + attrs = add_attribute ("const", attrs); > + } > + > + if (!flag_non_call_exceptions || !instance.could_trap_p ()) > + attrs = add_attribute ("nothrow", attrs); > + > + return add_attribute ("leaf", attrs); > +} > + > +/* Add a function called NAME with type FNTYPE and attributes ATTRS. > + INSTANCE describes what the function does and OVERLOADED_P indicates > + whether it is overloaded. REQUIRES_FLOAT indicates whether the function > + requires the mve.fp extension. */ > +registered_function & > +function_builder::add_function (const function_instance &instance, > + const char *name, tree fntype, tree attrs, > + bool requires_float, > + bool overloaded_p, > + bool placeholder_p) > +{ > + unsigned int code = vec_safe_length (registered_functions); > + code = (code << ARM_BUILTIN_SHIFT) | ARM_BUILTIN_MVE; > + > + /* We need to be able to generate placeholders to ensure that we have a > + consistent numbering scheme for function codes between the C and C++ > + frontends, so that everything ties up in LTO. > + > + Currently, tree-streamer-in.cc:unpack_ts_function_decl_value_fields > + validates that tree nodes returned by TARGET_BUILTIN_DECL are non- > NULL and > + some node other than error_mark_node. This is a holdover from when > builtin > + decls were streamed by code rather than by value. > + > + Ultimately, we should be able to remove this validation of BUILT_IN_MD > + nodes and remove the target hook. For now, however, we need to > appease the > + validation and return a non-NULL, non-error_mark_node node, so we > + arbitrarily choose integer_zero_node. */ > + tree decl = placeholder_p > + ? 
integer_zero_node > + : simulate_builtin_function_decl (input_location, name, fntype, > + code, NULL, attrs); > + > + registered_function &rfn = *ggc_alloc <registered_function> (); > + rfn.instance = instance; > + rfn.decl = decl; > + rfn.requires_float = requires_float; > + rfn.overloaded_p = overloaded_p; > + vec_safe_push (registered_functions, &rfn); > + > + return rfn; > +} > + > +/* Add a built-in function for INSTANCE, with the argument types given > + by ARGUMENT_TYPES and the return type given by RETURN_TYPE. > + REQUIRES_FLOAT indicates whether the function requires the mve.fp > extension, > + and PRESERVE_USER_NAMESPACE indicates whether the function should > also be > + registered under its non-prefixed name. */ > +void > +function_builder::add_unique_function (const function_instance &instance, > + tree return_type, > + vec<tree> &argument_types, > + bool preserve_user_namespace, > + bool requires_float, > + bool force_direct_overloads) > +{ > + /* Add the function under its full (unique) name with prefix. */ > + char *name = get_name (instance, true, false); > + tree fntype = build_function_type_array (return_type, > + argument_types.length (), > + argument_types.address ()); > + tree attrs = get_attributes (instance); > + registered_function &rfn = add_function (instance, name, fntype, attrs, > + requires_float, false, false); > + > + /* Enter the function into the hash table. */ > + hashval_t hash = instance.hash (); > + registered_function **rfn_slot > + = function_table->find_slot_with_hash (instance, hash, INSERT); > + gcc_assert (!*rfn_slot); > + *rfn_slot = &rfn; > + > + /* Also add the non-prefixed non-overloaded function, if the user > namespace > + does not need to be preserved. 
*/ > + if (!preserve_user_namespace) > + { > + char *noprefix_name = get_name (instance, false, false); > + tree attrs = get_attributes (instance); > + add_function (instance, noprefix_name, fntype, attrs, requires_float, > + false, false); > + } > + > + /* Also add the function under its overloaded alias, if we want > + a separate decl for each instance of an overloaded function. */ > + char *overload_name = get_name (instance, true, true); > + if (strcmp (name, overload_name) != 0) > + { > + /* Attribute lists shouldn't be shared. */ > + tree attrs = get_attributes (instance); > + bool placeholder_p = !(m_direct_overloads || force_direct_overloads); > + add_function (instance, overload_name, fntype, attrs, > + requires_float, false, placeholder_p); > + > + /* Also add the non-prefixed overloaded function, if the user namespace > + does not need to be preserved. */ > + if (!preserve_user_namespace) > + { > + char *noprefix_overload_name = get_name (instance, false, true); > + tree attrs = get_attributes (instance); > + add_function (instance, noprefix_overload_name, fntype, attrs, > + requires_float, false, placeholder_p); > + } > + } > + > + obstack_free (&m_string_obstack, name); > +} > + > +/* Add one function decl for INSTANCE, to be used with manual overload > + resolution. REQUIRES_FLOAT indicates whether the function requires the > + mve.fp extension. > + > + For simplicity, partition functions by instance and required extensions, > + and check whether the required extensions are available as part of > resolving > + the function to the relevant unique function. 
*/
> +void
> +function_builder::add_overloaded_function (const function_instance &instance,
> +                                           bool preserve_user_namespace,
> +                                           bool requires_float)
> +{
> +  char *name = get_name (instance, true, true);
> +  if (registered_function **map_value = m_overload_names.get (name))
> +    {
> +      gcc_assert ((*map_value)->instance == instance);
> +      obstack_free (&m_string_obstack, name);
> +    }
> +  else
> +    {
> +      registered_function &rfn
> +        = add_function (instance, name, m_overload_type, NULL_TREE,
> +                        requires_float, true, m_direct_overloads);
> +      m_overload_names.put (name, &rfn);
> +      if (!preserve_user_namespace)
> +        {
> +          char *noprefix_name = get_name (instance, false, true);
> +          registered_function &noprefix_rfn
> +            = add_function (instance, noprefix_name, m_overload_type,
> +                            NULL_TREE, requires_float, true,
> +                            m_direct_overloads);
> +          m_overload_names.put (noprefix_name, &noprefix_rfn);
> +        }
> +    }
> +}
> +
> +/* If we are using manual overload resolution, add one function decl
> +   for each overloaded function in GROUP.  Take the function base name
> +   from GROUP and the mode from MODE.  */
> +void
> +function_builder::add_overloaded_functions (const function_group_info &group,
> +                                            mode_suffix_index mode,
> +                                            bool preserve_user_namespace)
> +{
> +  for (unsigned int pi = 0; group.preds[pi] != NUM_PREDS; ++pi)
> +    {
> +      unsigned int explicit_type0
> +        = (*group.shape)->explicit_type_suffix_p (0, group.preds[pi], mode);
> +      unsigned int explicit_type1
> +        = (*group.shape)->explicit_type_suffix_p (1, group.preds[pi], mode);
> +
> +      if ((*group.shape)->skip_overload_p (group.preds[pi], mode))
> +        continue;
> +
> +      if (!explicit_type0 && !explicit_type1)
> +        {
> +          /* Deal with the common case in which there is one overloaded
> +             function for all type combinations.  */
> +          function_instance instance (group.base_name, *group.base,
> +                                      *group.shape, mode, types_none[0],
> +                                      group.preds[pi]);
> +          add_overloaded_function (instance, preserve_user_namespace,
> +                                   group.requires_float);
> +        }
> +      else
> +        for (unsigned int ti = 0; group.types[ti][0] != NUM_TYPE_SUFFIXES;
> +             ++ti)
> +          {
> +            /* Stub out the types that are determined by overload
> +               resolution.  */
> +            type_suffix_pair types = {
> +              explicit_type0 ? group.types[ti][0] : NUM_TYPE_SUFFIXES,
> +              explicit_type1 ? group.types[ti][1] : NUM_TYPE_SUFFIXES
> +            };
> +            function_instance instance (group.base_name, *group.base,
> +                                        *group.shape, mode, types,
> +                                        group.preds[pi]);
> +            add_overloaded_function (instance, preserve_user_namespace,
> +                                     group.requires_float);
> +          }
> +    }
> +}
> +
> +/* Register all the functions in GROUP.  */
> +void
> +function_builder::register_function_group (const function_group_info &group,
> +                                           bool preserve_user_namespace)
> +{
> +  (*group.shape)->build (*this, group, preserve_user_namespace);
> +}
> +
> +function_call_info::function_call_info (location_t location_in,
> +                                        const function_instance &instance_in,
> +                                        tree fndecl_in)
> +  : function_instance (instance_in), location (location_in), fndecl (fndecl_in)
> +{
> +}
> +
> +function_resolver::function_resolver (location_t location,
> +                                      const function_instance &instance,
> +                                      tree fndecl, vec<tree, va_gc> &arglist)
> +  : function_call_info (location, instance, fndecl), m_arglist (arglist)
> +{
> +}
> +
> +/* Return the vector type associated with type suffix TYPE.  */
> +tree
> +function_resolver::get_vector_type (type_suffix_index type)
> +{
> +  return acle_vector_types[0][type_suffixes[type].vector_type];
> +}
> +
> +/* Return the <stdint.h> name associated with TYPE.  Using the <stdint.h>
> +   name should be more user-friendly than the underlying canonical type,
> +   since it makes the signedness and bitwidth explicit.  */
> +const char *
> +function_resolver::get_scalar_type_name (type_suffix_index type)
> +{
> +  return vector_types[type_suffixes[type].vector_type].acle_name + 2;
> +}
> +
> +/* Return the type of argument I, or error_mark_node if it isn't
> +   well-formed.  */
> +tree
> +function_resolver::get_argument_type (unsigned int i)
> +{
> +  tree arg = m_arglist[i];
> +  return arg == error_mark_node ? arg : TREE_TYPE (arg);
> +}
> +
> +/* Return true if argument I is some form of scalar value.  */
> +bool
> +function_resolver::scalar_argument_p (unsigned int i)
> +{
> +  tree type = get_argument_type (i);
> +  return (INTEGRAL_TYPE_P (type)
> +          /* Allow pointer types, leaving the frontend to warn where
> +             necessary.  */
> +          || POINTER_TYPE_P (type)
> +          || SCALAR_FLOAT_TYPE_P (type));
> +}
> +
> +/* Report that the function has no form that takes type suffix TYPE.
> +   Return error_mark_node.  */
> +tree
> +function_resolver::report_no_such_form (type_suffix_index type)
> +{
> +  error_at (location, "%qE has no form that takes %qT arguments",
> +            fndecl, get_vector_type (type));
> +  return error_mark_node;
> +}
> +
> +/* Silently check whether there is an instance of the function with the
> +   mode suffix given by MODE and the type suffixes given by TYPE0 and TYPE1.
> +   Return its function decl if so, otherwise return null.  */
> +tree
> +function_resolver::lookup_form (mode_suffix_index mode,
> +                                type_suffix_index type0,
> +                                type_suffix_index type1)
> +{
> +  type_suffix_pair types = { type0, type1 };
> +  function_instance instance (base_name, base, shape, mode, types, pred);
> +  registered_function *rfn
> +    = function_table->find_with_hash (instance, instance.hash ());
> +  return rfn ? rfn->decl : NULL_TREE;
> +}
> +
> +/* Resolve the function to one with the mode suffix given by MODE and the
> +   type suffixes given by TYPE0 and TYPE1.  Return its function decl on
> +   success, otherwise report an error and return error_mark_node.  */
> +tree
> +function_resolver::resolve_to (mode_suffix_index mode,
> +                               type_suffix_index type0,
> +                               type_suffix_index type1)
> +{
> +  tree res = lookup_form (mode, type0, type1);
> +  if (!res)
> +    {
> +      if (type1 == NUM_TYPE_SUFFIXES)
> +        return report_no_such_form (type0);
> +      if (type0 == type_suffix_ids[0])
> +        return report_no_such_form (type1);
> +      /* To be filled in when we have other cases.  */
> +      gcc_unreachable ();
> +    }
> +  return res;
> +}
> +
> +/* Require argument ARGNO to be a single vector or a tuple of NUM_VECTORS
> +   vectors; NUM_VECTORS is 1 for the former.  Return the associated type
> +   suffix on success, using TYPE_SUFFIX_b for predicates.  Report an error
> +   and return NUM_TYPE_SUFFIXES on failure.  */
> +type_suffix_index
> +function_resolver::infer_vector_or_tuple_type (unsigned int argno,
> +                                               unsigned int num_vectors)
> +{
> +  tree actual = get_argument_type (argno);
> +  if (actual == error_mark_node)
> +    return NUM_TYPE_SUFFIXES;
> +
> +  /* A linear search should be OK here, since the code isn't hot and
> +     the number of types is only small.  */
> +  for (unsigned int size_i = 0; size_i < MAX_TUPLE_SIZE; ++size_i)
> +    for (unsigned int suffix_i = 0; suffix_i < NUM_TYPE_SUFFIXES; ++suffix_i)
> +      {
> +        vector_type_index type_i = type_suffixes[suffix_i].vector_type;
> +        tree type = acle_vector_types[size_i][type_i];
> +        if (type && matches_type_p (type, actual))
> +          {
> +            if (size_i + 1 == num_vectors)
> +              return type_suffix_index (suffix_i);
> +
> +            if (num_vectors == 1)
> +              error_at (location, "passing %qT to argument %d of %qE, which"
> +                        " expects a single MVE vector rather than a tuple",
> +                        actual, argno + 1, fndecl);
> +            else if (size_i == 0 && type_i != VECTOR_TYPE_mve_pred16_t)
> +              /* num_vectors is always != 1, so the singular isn't needed.  */
> +              error_n (location, num_vectors, "%qT%d%qE%d",
> +                       "passing single vector %qT to argument %d"
> +                       " of %qE, which expects a tuple of %d vectors",
> +                       actual, argno + 1, fndecl, num_vectors);
> +            else
> +              /* num_vectors is always != 1, so the singular isn't needed.  */
> +              error_n (location, num_vectors, "%qT%d%qE%d",
> +                       "passing %qT to argument %d of %qE, which"
> +                       " expects a tuple of %d vectors", actual, argno + 1,
> +                       fndecl, num_vectors);
> +            return NUM_TYPE_SUFFIXES;
> +          }
> +      }
> +
> +  if (num_vectors == 1)
> +    error_at (location, "passing %qT to argument %d of %qE, which"
> +              " expects an MVE vector type", actual, argno + 1, fndecl);
> +  else
> +    error_at (location, "passing %qT to argument %d of %qE, which"
> +              " expects an MVE tuple type", actual, argno + 1, fndecl);
> +  return NUM_TYPE_SUFFIXES;
> +}
> +
> +/* Require argument ARGNO to have some form of vector type.  Return the
> +   associated type suffix on success, using TYPE_SUFFIX_b for predicates.
> +   Report an error and return NUM_TYPE_SUFFIXES on failure.  */
> +type_suffix_index
> +function_resolver::infer_vector_type (unsigned int argno)
> +{
> +  return infer_vector_or_tuple_type (argno, 1);
> +}
> +
> +/* Require argument ARGNO to be a vector or scalar argument.  Return true
> +   if it is, otherwise report an appropriate error.  */
> +bool
> +function_resolver::require_vector_or_scalar_type (unsigned int argno)
> +{
> +  tree actual = get_argument_type (argno);
> +  if (actual == error_mark_node)
> +    return false;
> +
> +  if (!scalar_argument_p (argno) && !VECTOR_TYPE_P (actual))
> +    {
> +      error_at (location, "passing %qT to argument %d of %qE, which"
> +                " expects a vector or scalar type", actual, argno + 1, fndecl);
> +      return false;
> +    }
> +
> +  return true;
> +}
> +
> +/* Require argument ARGNO to have vector type TYPE, in cases where this
> +   requirement holds for all uses of the function.  Return true if the
> +   argument has the right form, otherwise report an appropriate error.  */
> +bool
> +function_resolver::require_vector_type (unsigned int argno,
> +                                        vector_type_index type)
> +{
> +  tree expected = acle_vector_types[0][type];
> +  tree actual = get_argument_type (argno);
> +  if (actual == error_mark_node)
> +    return false;
> +
> +  if (!matches_type_p (expected, actual))
> +    {
> +      error_at (location, "passing %qT to argument %d of %qE, which"
> +                " expects %qT", actual, argno + 1, fndecl, expected);
> +      return false;
> +    }
> +  return true;
> +}
> +
> +/* Like require_vector_type, but TYPE is inferred from previous arguments
> +   rather than being a fixed part of the function signature.  This changes
> +   the nature of the error messages.  */
> +bool
> +function_resolver::require_matching_vector_type (unsigned int argno,
> +                                                 type_suffix_index type)
> +{
> +  type_suffix_index new_type = infer_vector_type (argno);
> +  if (new_type == NUM_TYPE_SUFFIXES)
> +    return false;
> +
> +  if (type != new_type)
> +    {
> +      error_at (location, "passing %qT to argument %d of %qE, but"
> +                " previous arguments had type %qT",
> +                get_vector_type (new_type), argno + 1, fndecl,
> +                get_vector_type (type));
> +      return false;
> +    }
> +  return true;
> +}
> +
> +/* Require argument ARGNO to be a vector type with the following properties:
> +
> +   - the type class must be the same as FIRST_TYPE's if EXPECTED_TCLASS
> +     is SAME_TYPE_CLASS, otherwise it must be EXPECTED_TCLASS itself.
> +
> +   - the element size must be:
> +
> +     - the same as FIRST_TYPE's if EXPECTED_BITS == SAME_SIZE
> +     - half of FIRST_TYPE's if EXPECTED_BITS == HALF_SIZE
> +     - a quarter of FIRST_TYPE's if EXPECTED_BITS == QUARTER_SIZE
> +     - EXPECTED_BITS itself otherwise
> +
> +   Return true if the argument has the required type, otherwise report
> +   an appropriate error.
> +
> +   FIRST_ARGNO is the first argument that is known to have type FIRST_TYPE.
> +   Usually it comes before ARGNO, but sometimes it is more natural to resolve
> +   arguments out of order.
> +
> +   If the required properties depend on FIRST_TYPE then both FIRST_ARGNO and
> +   ARGNO contribute to the resolution process.  If the required properties
> +   are fixed, only FIRST_ARGNO contributes to the resolution process.
> +
> +   This function is a bit of a Swiss army knife.  The complication comes
> +   from trying to give good error messages when FIRST_ARGNO and ARGNO are
> +   inconsistent, since either of them might be wrong.  */
> +bool function_resolver::
> +require_derived_vector_type (unsigned int argno,
> +                             unsigned int first_argno,
> +                             type_suffix_index first_type,
> +                             type_class_index expected_tclass,
> +                             unsigned int expected_bits)
> +{
> +  /* If the type needs to match FIRST_ARGNO exactly, use the preferred
> +     error message for that case.  The VECTOR_TYPE_P test excludes tuple
> +     types, which we handle below instead.  */
> +  bool both_vectors_p = VECTOR_TYPE_P (get_argument_type (first_argno));
> +  if (both_vectors_p
> +      && expected_tclass == SAME_TYPE_CLASS
> +      && expected_bits == SAME_SIZE)
> +    {
> +      /* There's no need to resolve this case out of order.  */
> +      gcc_assert (argno > first_argno);
> +      return require_matching_vector_type (argno, first_type);
> +    }
> +
> +  /* Use FIRST_TYPE to get the expected type class and element size.  */
> +  type_class_index orig_expected_tclass = expected_tclass;
> +  if (expected_tclass == NUM_TYPE_CLASSES)
> +    expected_tclass = type_suffixes[first_type].tclass;
> +
> +  unsigned int orig_expected_bits = expected_bits;
> +  if (expected_bits == SAME_SIZE)
> +    expected_bits = type_suffixes[first_type].element_bits;
> +  else if (expected_bits == HALF_SIZE)
> +    expected_bits = type_suffixes[first_type].element_bits / 2;
> +  else if (expected_bits == QUARTER_SIZE)
> +    expected_bits = type_suffixes[first_type].element_bits / 4;
> +
> +  /* If the expected type doesn't depend on FIRST_TYPE at all,
> +     just check for the fixed choice of vector type.  */
> +  if (expected_tclass == orig_expected_tclass
> +      && expected_bits == orig_expected_bits)
> +    {
> +      const type_suffix_info &expected_suffix
> +        = type_suffixes[find_type_suffix (expected_tclass, expected_bits)];
> +      return require_vector_type (argno, expected_suffix.vector_type);
> +    }
> +
> +  /* Require the argument to be some form of MVE vector type,
> +     without being specific about the type of vector we want.  */
> +  type_suffix_index actual_type = infer_vector_type (argno);
> +  if (actual_type == NUM_TYPE_SUFFIXES)
> +    return false;
> +
> +  /* Exit now if we got the right type.  */
> +  bool tclass_ok_p = (type_suffixes[actual_type].tclass == expected_tclass);
> +  bool size_ok_p = (type_suffixes[actual_type].element_bits == expected_bits);
> +  if (tclass_ok_p && size_ok_p)
> +    return true;
> +
> +  /* First look for cases in which the actual type contravenes a fixed
> +     size requirement, without having to refer to FIRST_TYPE.  */
> +  if (!size_ok_p && expected_bits == orig_expected_bits)
> +    {
> +      error_at (location, "passing %qT to argument %d of %qE, which"
> +                " expects a vector of %d-bit elements",
> +                get_vector_type (actual_type), argno + 1, fndecl,
> +                expected_bits);
> +      return false;
> +    }
> +
> +  /* Likewise for a fixed type class requirement.  This is only ever
> +     needed for signed and unsigned types, so don't create unnecessary
> +     translation work for other type classes.  */
> +  if (!tclass_ok_p && orig_expected_tclass == TYPE_signed)
> +    {
> +      error_at (location, "passing %qT to argument %d of %qE, which"
> +                " expects a vector of signed integers",
> +                get_vector_type (actual_type), argno + 1, fndecl);
> +      return false;
> +    }
> +  if (!tclass_ok_p && orig_expected_tclass == TYPE_unsigned)
> +    {
> +      error_at (location, "passing %qT to argument %d of %qE, which"
> +                " expects a vector of unsigned integers",
> +                get_vector_type (actual_type), argno + 1, fndecl);
> +      return false;
> +    }
> +
> +  /* Make sure that FIRST_TYPE itself is sensible before using it
> +     as a basis for an error message.  */
> +  if (resolve_to (mode_suffix_id, first_type) == error_mark_node)
> +    return false;
> +
> +  /* If the arguments have consistent type classes, but a link between
> +     the sizes has been broken, try to describe the error in those terms.  */
> +  if (both_vectors_p && tclass_ok_p && orig_expected_bits == SAME_SIZE)
> +    {
> +      if (argno < first_argno)
> +        {
> +          std::swap (argno, first_argno);
> +          std::swap (actual_type, first_type);
> +        }
> +      error_at (location, "arguments %d and %d of %qE must have the"
> +                " same element size, but the values passed here have type"
> +                " %qT and %qT respectively", first_argno + 1, argno + 1,
> +                fndecl, get_vector_type (first_type),
> +                get_vector_type (actual_type));
> +      return false;
> +    }
> +
> +  /* Likewise in reverse: look for cases in which the sizes are consistent
> +     but a link between the type classes has been broken.  */
> +  if (both_vectors_p
> +      && size_ok_p
> +      && orig_expected_tclass == SAME_TYPE_CLASS
> +      && type_suffixes[first_type].integer_p
> +      && type_suffixes[actual_type].integer_p)
> +    {
> +      if (argno < first_argno)
> +        {
> +          std::swap (argno, first_argno);
> +          std::swap (actual_type, first_type);
> +        }
> +      error_at (location, "arguments %d and %d of %qE must have the"
> +                " same signedness, but the values passed here have type"
> +                " %qT and %qT respectively", first_argno + 1, argno + 1,
> +                fndecl, get_vector_type (first_type),
> +                get_vector_type (actual_type));
> +      return false;
> +    }
> +
> +  /* The two arguments are wildly inconsistent.  */
> +  type_suffix_index expected_type
> +    = find_type_suffix (expected_tclass, expected_bits);
> +  error_at (location, "passing %qT instead of the expected %qT to argument"
> +            " %d of %qE, after passing %qT to argument %d",
> +            get_vector_type (actual_type), get_vector_type (expected_type),
> +            argno + 1, fndecl, get_argument_type (first_argno),
> +            first_argno + 1);
> +  return false;
> +}
> +
> +/* Require argument ARGNO to be a (possibly variable) scalar, expecting it
> +   to have the following properties:
> +
> +   - the type class must be the same as for type suffix 0 if EXPECTED_TCLASS
> +     is SAME_TYPE_CLASS, otherwise it must be EXPECTED_TCLASS itself.
> +
> +   - the element size must be the same as for type suffix 0 if EXPECTED_BITS
> +     is SAME_TYPE_SIZE, otherwise it must be EXPECTED_BITS itself.
> +
> +   Return true if the argument is valid, otherwise report an appropriate error.
> +
> +   Note that we don't check whether the scalar type actually has the required
> +   properties, since that's subject to implicit promotions and conversions.
> +   Instead we just use the expected properties to tune the error message.  */
> +bool function_resolver::
> +require_derived_scalar_type (unsigned int argno,
> +                             type_class_index expected_tclass,
> +                             unsigned int expected_bits)
> +{
> +  gcc_assert (expected_tclass == SAME_TYPE_CLASS
> +              || expected_tclass == TYPE_signed
> +              || expected_tclass == TYPE_unsigned);
> +
> +  /* If the expected type doesn't depend on the type suffix at all,
> +     just check for the fixed choice of scalar type.  */
> +  if (expected_tclass != SAME_TYPE_CLASS && expected_bits != SAME_SIZE)
> +    {
> +      type_suffix_index expected_type
> +        = find_type_suffix (expected_tclass, expected_bits);
> +      return require_scalar_type (argno, get_scalar_type_name (expected_type));
> +    }
> +
> +  if (scalar_argument_p (argno))
> +    return true;
> +
> +  if (expected_tclass == SAME_TYPE_CLASS)
> +    /* It doesn't really matter whether the element is expected to be
> +       the same size as type suffix 0.  */
> +    error_at (location, "passing %qT to argument %d of %qE, which"
> +              " expects a scalar element", get_argument_type (argno),
> +              argno + 1, fndecl);
> +  else
> +    /* It doesn't seem useful to distinguish between signed and unsigned
> +       scalars here.  */
> +    error_at (location, "passing %qT to argument %d of %qE, which"
> +              " expects a scalar integer", get_argument_type (argno),
> +              argno + 1, fndecl);
> +  return false;
> +}
> +
> +/* Require argument ARGNO to be suitable for an integer constant expression.
> +   Return true if it is, otherwise report an appropriate error.
> +
> +   function_checker checks whether the argument is actually constant and
> +   has a suitable range.  The reason for distinguishing immediate arguments
> +   here is because it provides more consistent error messages than
> +   require_scalar_type would.  */
> +bool
> +function_resolver::require_integer_immediate (unsigned int argno)
> +{
> +  if (!scalar_argument_p (argno))
> +    {
> +      report_non_ice (location, fndecl, argno);
> +      return false;
> +    }
> +  return true;
> +}
> +
> +/* Require argument ARGNO to be a (possibly variable) scalar, using EXPECTED
> +   as the name of its expected type.  Return true if the argument has the
> +   right form, otherwise report an appropriate error.  */
> +bool
> +function_resolver::require_scalar_type (unsigned int argno,
> +                                        const char *expected)
> +{
> +  if (!scalar_argument_p (argno))
> +    {
> +      error_at (location, "passing %qT to argument %d of %qE, which"
> +                " expects %qs", get_argument_type (argno), argno + 1,
> +                fndecl, expected);
> +      return false;
> +    }
> +  return true;
> +}
> +
> +/* Require the function to have exactly EXPECTED arguments.  Return true
> +   if it does, otherwise report an appropriate error.  */
> +bool
> +function_resolver::check_num_arguments (unsigned int expected)
> +{
> +  if (m_arglist.length () < expected)
> +    error_at (location, "too few arguments to function %qE", fndecl);
> +  else if (m_arglist.length () > expected)
> +    error_at (location, "too many arguments to function %qE", fndecl);
> +  return m_arglist.length () == expected;
> +}
> +
> +/* If the function is predicated, check that the last argument is a
> +   suitable predicate.  Also check that there are NOPS further
> +   arguments before any predicate, but don't check what they are.
> +
> +   Return true on success, otherwise report a suitable error.
> +   When returning true:
> +
> +   - set I to the number of the last unchecked argument.
> +   - set NARGS to the total number of arguments.  */
> +bool
> +function_resolver::check_gp_argument (unsigned int nops,
> +                                      unsigned int &i, unsigned int &nargs)
> +{
> +  i = nops - 1;
> +  if (pred != PRED_none)
> +    {
> +      switch (pred)
> +        {
> +        case PRED_m:
> +          /* Add first inactive argument if needed, and final predicate.  */
> +          if (has_inactive_argument ())
> +            nargs = nops + 2;
> +          else
> +            nargs = nops + 1;
> +          break;
> +
> +        case PRED_p:
> +        case PRED_x:
> +          /* Add final predicate.  */
> +          nargs = nops + 1;
> +          break;
> +
> +        default:
> +          gcc_unreachable ();
> +        }
> +
> +      if (!check_num_arguments (nargs)
> +          || !require_vector_type (nargs - 1, VECTOR_TYPE_mve_pred16_t))
> +        return false;
> +
> +      i = nargs - 2;
> +    }
> +  else
> +    {
> +      nargs = nops;
> +      if (!check_num_arguments (nargs))
> +        return false;
> +    }
> +
> +  return true;
> +}
> +
> +/* Finish resolving a function whose final argument can be a vector
> +   or a scalar, with the function having an implicit "_n" suffix
> +   in the latter case.  This "_n" form might only exist for certain
> +   type suffixes.
> +
> +   ARGNO is the index of the final argument.  The inferred type suffix
> +   was obtained from argument FIRST_ARGNO, which has type FIRST_TYPE.
> +   EXPECTED_TCLASS and EXPECTED_BITS describe the expected properties
> +   of the final vector or scalar argument, in the same way as for
> +   require_derived_vector_type.  INFERRED_TYPE is the inferred type
> +   suffix itself, or NUM_TYPE_SUFFIXES if it's the same as FIRST_TYPE.
> +
> +   Return the function decl of the resolved function on success,
> +   otherwise report a suitable error and return error_mark_node.  */
> +tree function_resolver::
> +finish_opt_n_resolution (unsigned int argno, unsigned int first_argno,
> +                         type_suffix_index first_type,
> +                         type_class_index expected_tclass,
> +                         unsigned int expected_bits,
> +                         type_suffix_index inferred_type)
> +{
> +  if (inferred_type == NUM_TYPE_SUFFIXES)
> +    inferred_type = first_type;
> +  tree scalar_form = lookup_form (MODE_n, inferred_type);
> +
> +  /* Allow the final argument to be scalar, if an _n form exists.  */
> +  if (scalar_argument_p (argno))
> +    {
> +      if (scalar_form)
> +        return scalar_form;
> +
> +      /* Check the vector form normally.  If that succeeds, raise an
> +         error about having no corresponding _n form.  */
> +      tree res = resolve_to (mode_suffix_id, inferred_type);
> +      if (res != error_mark_node)
> +        error_at (location, "passing %qT to argument %d of %qE, but its"
> +                  " %qT form does not accept scalars",
> +                  get_argument_type (argno), argno + 1, fndecl,
> +                  get_vector_type (first_type));
> +      return error_mark_node;
> +    }
> +
> +  /* If an _n form does exist, provide a more accurate message than
> +     require_derived_vector_type would for arguments that are neither
> +     vectors nor scalars.  */
> +  if (scalar_form && !require_vector_or_scalar_type (argno))
> +    return error_mark_node;
> +
> +  /* Check for the correct vector type.  */
> +  if (!require_derived_vector_type (argno, first_argno, first_type,
> +                                    expected_tclass, expected_bits))
> +    return error_mark_node;
> +
> +  return resolve_to (mode_suffix_id, inferred_type);
> +}
> +
> +/* Resolve a (possibly predicated) unary function.  If the function uses
> +   merge predication or if TREAT_AS_MERGE_P is true, there is an extra
> +   vector argument before the governing predicate that specifies the
> +   values of inactive elements.  This argument has the following
> +   properties:
> +
> +   - the type class must be the same as for active elements if MERGE_TCLASS
> +     is SAME_TYPE_CLASS, otherwise it must be MERGE_TCLASS itself.
> +
> +   - the element size must be the same as for active elements if MERGE_BITS
> +     is SAME_TYPE_SIZE, otherwise it must be MERGE_BITS itself.
> +
> +   Return the function decl of the resolved function on success,
> +   otherwise report a suitable error and return error_mark_node.  */
> +tree
> +function_resolver::resolve_unary (type_class_index merge_tclass,
> +                                  unsigned int merge_bits,
> +                                  bool treat_as_merge_p)
> +{
> +  type_suffix_index type;
> +  if (pred == PRED_m || treat_as_merge_p)
> +    {
> +      if (!check_num_arguments (3))
> +        return error_mark_node;
> +      if (merge_tclass == SAME_TYPE_CLASS && merge_bits == SAME_SIZE)
> +        {
> +          /* The inactive elements are the same as the active elements,
> +             so we can use normal left-to-right resolution.  */
> +          if ((type = infer_vector_type (0)) == NUM_TYPE_SUFFIXES
> +              /* Predicates are the last argument.  */
> +              || !require_vector_type (2, VECTOR_TYPE_mve_pred16_t)
> +              || !require_matching_vector_type (1, type))
> +            return error_mark_node;
> +        }
> +      else
> +        {
> +          /* The inactive element type is a function of the active one,
> +             so resolve the active one first.  */
> +          if (!require_vector_type (1, VECTOR_TYPE_mve_pred16_t)
> +              || (type = infer_vector_type (2)) == NUM_TYPE_SUFFIXES
> +              || !require_derived_vector_type (0, 2, type, merge_tclass,
> +                                               merge_bits))
> +            return error_mark_node;
> +        }
> +    }
> +  else
> +    {
> +      /* We just need to check the predicate (if any) and the single
> +         vector argument.  */
> +      unsigned int i, nargs;
> +      if (!check_gp_argument (1, i, nargs)
> +          || (type = infer_vector_type (i)) == NUM_TYPE_SUFFIXES)
> +        return error_mark_node;
> +    }
> +
> +  /* Handle convert-like functions in which the first type suffix is
> +     explicit.  */
> +  if (type_suffix_ids[0] != NUM_TYPE_SUFFIXES)
> +    return resolve_to (mode_suffix_id, type_suffix_ids[0], type);
> +
> +  return resolve_to (mode_suffix_id, type);
> +}
> +
> +/* Resolve a (possibly predicated) unary function taking a scalar
> +   argument (_n suffix).  If the function uses merge predication,
> +   there is an extra vector argument in the first position, and the
> +   final governing predicate that specifies the values of inactive
> +   elements.
> +
> +   Return the function decl of the resolved function on success,
> +   otherwise report a suitable error and return error_mark_node.  */
> +tree
> +function_resolver::resolve_unary_n ()
> +{
> +  type_suffix_index type;
> +
> +  /* Currently only support overrides for _m (vdupq).  */
> +  if (pred != PRED_m)
> +    return error_mark_node;
> +
> +  if (pred == PRED_m)
> +    {
> +      if (!check_num_arguments (3))
> +        return error_mark_node;
> +
> +      /* The inactive elements are the same as the active elements,
> +         so we can use normal left-to-right resolution.  */
> +      if ((type = infer_vector_type (0)) == NUM_TYPE_SUFFIXES
> +          /* Predicates are the last argument.  */
> +          || !require_vector_type (2, VECTOR_TYPE_mve_pred16_t))
> +        return error_mark_node;
> +    }
> +
> +  /* Make sure the argument is scalar.  */
> +  tree scalar_form = lookup_form (MODE_n, type);
> +
> +  if (scalar_argument_p (1) && scalar_form)
> +    return scalar_form;
> +
> +  return error_mark_node;
> +}
> +
> +/* Resolve a (possibly predicated) function that takes NOPS like-typed
> +   vector arguments followed by NIMM integer immediates.  Return the
> +   function decl of the resolved function on success, otherwise report
> +   a suitable error and return error_mark_node.  */
> +tree
> +function_resolver::resolve_uniform (unsigned int nops, unsigned int nimm)
> +{
> +  unsigned int i, nargs;
> +  type_suffix_index type;
> +  if (!check_gp_argument (nops + nimm, i, nargs)
> +      || (type = infer_vector_type (0)) == NUM_TYPE_SUFFIXES)
> +    return error_mark_node;
> +
> +  unsigned int last_arg = i + 1 - nimm;
> +  for (i = 0; i < last_arg; i++)
> +    if (!require_matching_vector_type (i, type))
> +      return error_mark_node;
> +
> +  for (i = last_arg; i < nargs; ++i)
> +    if (!require_integer_immediate (i))
> +      return error_mark_node;
> +
> +  return resolve_to (mode_suffix_id, type);
> +}
> +
> +/* Resolve a (possibly predicated) function that offers a choice between
> +   taking:
> +
> +   - NOPS like-typed vector arguments or
> +   - NOPS - 1 like-typed vector arguments followed by a scalar argument
> +
> +   Return the function decl of the resolved function on success,
> +   otherwise report a suitable error and return error_mark_node.  */
> +tree
> +function_resolver::resolve_uniform_opt_n (unsigned int nops)
> +{
> +  unsigned int i, nargs;
> +  type_suffix_index type;
> +  if (!check_gp_argument (nops, i, nargs)
> +      /* Unary operators should use resolve_unary, so using i - 1 is
> +         safe.  */
> +      || (type = infer_vector_type (i - 1)) == NUM_TYPE_SUFFIXES)
> +    return error_mark_node;
> +
> +  /* Skip last argument, may be scalar.  */
> +  unsigned int last_arg = i;
> +  for (i = 0; i < last_arg; i++)
> +    if (!require_matching_vector_type (i, type))
> +      return error_mark_node;
> +
> +  return finish_opt_n_resolution (last_arg, 0, type);
> +}
> +
> +/* If the call is erroneous, report an appropriate error and return
> +   error_mark_node.  Otherwise, if the function is overloaded, return
> +   the decl of the non-overloaded function.  Return NULL_TREE otherwise,
> +   indicating that the call should be processed in the normal way.  */
> +tree
> +function_resolver::resolve ()
> +{
> +  return shape->resolve (*this);
> +}
> +
> +function_checker::function_checker (location_t location,
> +                                    const function_instance &instance,
> +                                    tree fndecl, tree fntype,
> +                                    unsigned int nargs, tree *args)
> +  : function_call_info (location, instance, fndecl),
> +    m_fntype (fntype), m_nargs (nargs), m_args (args)
> +{
> +  if (instance.has_inactive_argument ())
> +    m_base_arg = 1;
> +  else
> +    m_base_arg = 0;
> +}
> +
> +/* Return true if argument ARGNO exists, which it might not for
> +   erroneous calls.  It is safe to wave through checks if this
> +   function returns false.  */
> +bool
> +function_checker::argument_exists_p (unsigned int argno)
> +{
> +  gcc_assert (argno < (unsigned int) type_num_arguments (m_fntype));
> +  return argno < m_nargs;
> +}
> +
> +/* Check that argument ARGNO is an integer constant expression and
> +   store its value in VALUE_OUT if so.  The caller should first
> +   check that argument ARGNO exists.  */
> +bool
> +function_checker::require_immediate (unsigned int argno,
> +                                     HOST_WIDE_INT &value_out)
> +{
> +  gcc_assert (argno < m_nargs);
> +  tree arg = m_args[argno];
> +
> +  /* The type and range are unsigned, so read the argument as an
> +     unsigned rather than signed HWI.  */
> +  if (!tree_fits_uhwi_p (arg))
> +    {
> +      report_non_ice (location, fndecl, argno);
> +      return false;
> +    }
> +
> +  /* ...but treat VALUE_OUT as signed for error reporting, since printing
> +     -1 is more user-friendly than the maximum uint64_t value.  */
> +  value_out = tree_to_uhwi (arg);
> +  return true;
> +}
> +
> +/* Check that argument REL_ARGNO is an integer constant expression that has
> +   a valid value for enumeration type TYPE.  REL_ARGNO counts from the end
> +   of the predication arguments.  */
> +bool
> +function_checker::require_immediate_enum (unsigned int rel_argno, tree type)
> +{
> +  unsigned int argno = m_base_arg + rel_argno;
> +  if (!argument_exists_p (argno))
> +    return true;
> +
> +  HOST_WIDE_INT actual;
> +  if (!require_immediate (argno, actual))
> +    return false;
> +
> +  for (tree entry = TYPE_VALUES (type); entry; entry = TREE_CHAIN (entry))
> +    {
> +      /* The value is an INTEGER_CST for C and a CONST_DECL wrapper
> +         around an INTEGER_CST for C++.  */
> +      tree value = TREE_VALUE (entry);
> +      if (TREE_CODE (value) == CONST_DECL)
> +        value = DECL_INITIAL (value);
> +      if (wi::to_widest (value) == actual)
> +        return true;
> +    }
> +
> +  report_not_enum (location, fndecl, argno, actual, type);
> +  return false;
> +}
> +
> +/* Check that argument REL_ARGNO is an integer constant expression in the
> +   range [MIN, MAX].  REL_ARGNO counts from the end of the predication
> +   arguments.  */
> +bool
> +function_checker::require_immediate_range (unsigned int rel_argno,
> +                                           HOST_WIDE_INT min,
> +                                           HOST_WIDE_INT max)
> +{
> +  unsigned int argno = m_base_arg + rel_argno;
> +  if (!argument_exists_p (argno))
> +    return true;
> +
> +  /* Required because of the tree_to_uhwi -> HOST_WIDE_INT conversion
> +     in require_immediate.  */
> +  gcc_assert (min >= 0 && min <= max);
> +  HOST_WIDE_INT actual;
> +  if (!require_immediate (argno, actual))
> +    return false;
> +
> +  if (!IN_RANGE (actual, min, max))
> +    {
> +      report_out_of_range (location, fndecl, argno, actual, min, max);
> +      return false;
> +    }
> +
> +  return true;
> +}
> +
> +/* Perform semantic checks on the call.  Return true if the call is valid,
> +   otherwise report a suitable error.  */
> +bool
> +function_checker::check ()
> +{
> +  function_args_iterator iter;
> +  tree type;
> +  unsigned int i = 0;
> +  FOREACH_FUNCTION_ARGS (m_fntype, type, iter)
> +    {
> +      if (type == void_type_node || i >= m_nargs)
> +        break;
> +
> +      if (i >= m_base_arg
> +          && TREE_CODE (type) == ENUMERAL_TYPE
> +          && !require_immediate_enum (i - m_base_arg, type))
> +        return false;
> +
> +      i += 1;
> +    }
> +
> +  return shape->check (*this);
> +}
> +
> +gimple_folder::gimple_folder (const function_instance &instance, tree fndecl,
> +                              gcall *call_in)
> +  : function_call_info (gimple_location (call_in), instance, fndecl),
> +    call (call_in), lhs (gimple_call_lhs (call_in))
> +{
> +}
> +
> +/* Try to fold the call.  Return the new statement on success and null
> +   on failure.  */
> +gimple *
> +gimple_folder::fold ()
> +{
> +  /* Don't fold anything when MVE is disabled; emit an error during
> +     expansion instead.  */
> +  if (!TARGET_HAVE_MVE)
> +    return NULL;
> +
> +  /* Punt if the function has a return type and no result location is
> +     provided.  The attributes should allow target-independent code to
> +     remove the calls if appropriate.  */
> +  if (!lhs && TREE_TYPE (gimple_call_fntype (call)) != void_type_node)
> +    return NULL;
> +
> +  return base->fold (*this);
> +}
> +
> +function_expander::function_expander (const function_instance &instance,
> +                                      tree fndecl, tree call_expr_in,
> +                                      rtx possible_target_in)
> +  : function_call_info (EXPR_LOCATION (call_expr_in), instance, fndecl),
> +    call_expr (call_expr_in), possible_target (possible_target_in)
> +{
> +}
> +
> +/* Return the handler of direct optab OP for type suffix SUFFIX_I.  */
> +insn_code
> +function_expander::direct_optab_handler (optab op, unsigned int suffix_i)
> +{
> +  return ::direct_optab_handler (op, vector_mode (suffix_i));
> +}
> +
> +/* For a function that does the equivalent of:
> +
> +     OUTPUT = COND ? FN (INPUTS) : FALLBACK;
> +
> +   return the value of FALLBACK.
> +
> +   MODE is the mode of OUTPUT.
> +   MERGE_ARGNO is the argument that provides FALLBACK for _m functions,
> +   or DEFAULT_MERGE_ARGNO if we should apply the usual rules.
> +
> +   ARGNO is the caller's index into args.  If the returned value is
> +   argument 0 (as for unary _m operations), increment ARGNO past the
> +   returned argument.  */
> +rtx
> +function_expander::get_fallback_value (machine_mode mode,
> +                                       unsigned int merge_argno,
> +                                       unsigned int &argno)
> +{
> +  if (pred == PRED_z)
> +    return CONST0_RTX (mode);
> +
> +  gcc_assert (pred == PRED_m || pred == PRED_x);
> +
> +  if (merge_argno == 0)
> +    return args[argno++];
> +
> +  return args[merge_argno];
> +}
> +
> +/* Return a REG rtx that can be used for the result of the function,
> +   using the preferred target if suitable.  */
> +rtx
> +function_expander::get_reg_target ()
> +{
> +  machine_mode target_mode = TYPE_MODE (TREE_TYPE (TREE_TYPE (fndecl)));
> +  if (!possible_target || GET_MODE (possible_target) != target_mode)
> +    possible_target = gen_reg_rtx (target_mode);
> +  return possible_target;
> +}
> +
> +/* Add an output operand to the instruction we're building, which has
> +   code ICODE.  Bind the output to the preferred target rtx if possible.  */
> +void
> +function_expander::add_output_operand (insn_code icode)
> +{
> +  unsigned int opno = m_ops.length ();
> +  machine_mode mode = insn_data[icode].operand[opno].mode;
> +  m_ops.safe_grow (opno + 1, true);
> +  create_output_operand (&m_ops.last (), possible_target, mode);
> +}
> +
> +/* Add an input operand to the instruction we're building, which has
> +   code ICODE.  Calculate the value of the operand as follows:
> +
> +   - If the operand is a predicate, coerce X to have the
> +     mode that the instruction expects.
> +
> +   - Otherwise use X directly.  The expand machinery checks that X has
> +     the right mode for the instruction.  */
> +void
> +function_expander::add_input_operand (insn_code icode, rtx x)
> +{
> +  unsigned int opno = m_ops.length ();
> +  const insn_operand_data &operand = insn_data[icode].operand[opno];
> +  machine_mode mode = operand.mode;
> +  if (mode == VOIDmode)
> +    {
> +      /* The only allowable use of VOIDmode is the wildcard
> +         arm_any_register_operand, which is used to avoid
> +         combinatorial explosion in the reinterpret patterns.  */
> +      gcc_assert (operand.predicate == arm_any_register_operand);
> +      mode = GET_MODE (x);
> +    }
> +  else if (VALID_MVE_PRED_MODE (mode))
> +    x = gen_lowpart (mode, x);
> +
> +  m_ops.safe_grow (m_ops.length () + 1, true);
> +  create_input_operand (&m_ops.last (), x, mode);
> +}
> +
> +/* Add an integer operand with value X to the instruction.  */
> +void
> +function_expander::add_integer_operand (HOST_WIDE_INT x)
> +{
> +  m_ops.safe_grow (m_ops.length () + 1, true);
> +  create_integer_operand (&m_ops.last (), x);
> +}
> +
> +/* Generate instruction ICODE, given that its operands have already
> +   been added to M_OPS.  Return the value of the first operand.  */
> +rtx
> +function_expander::generate_insn (insn_code icode)
> +{
> +  expand_insn (icode, m_ops.length (), m_ops.address ());
> +  return function_returns_void_p () ? const0_rtx : m_ops[0].value;
> +}
> +
> +/* Implement the call using instruction ICODE, with a 1:1 mapping between
> +   arguments and input operands.  */
> +rtx
> +function_expander::use_exact_insn (insn_code icode)
> +{
> +  unsigned int nops = insn_data[icode].n_operands;
> +  if (!function_returns_void_p ())
> +    {
> +      add_output_operand (icode);
> +      nops -= 1;
> +    }
> +  for (unsigned int i = 0; i < nops; ++i)
> +    add_input_operand (icode, args[i]);
> +  return generate_insn (icode);
> +}
> +
> +/* Implement the call using instruction ICODE, which does not use a
> +   predicate.  */
> +rtx
> +function_expander::use_unpred_insn (insn_code icode)
> +{
> +  gcc_assert (pred == PRED_none);
> +  /* Discount the output operand.  */
> +  unsigned int nops = insn_data[icode].n_operands - 1;
> +  unsigned int i = 0;
> +
> +  add_output_operand (icode);
> +  for (; i < nops; ++i)
> +    add_input_operand (icode, args[i]);
> +
> +  return generate_insn (icode);
> +}
> +
> +/* Implement the call using instruction ICODE, which is a predicated
> +   operation that returns arbitrary values for inactive lanes.  */
> +rtx
> +function_expander::use_pred_x_insn (insn_code icode)
> +{
> +  gcc_assert (pred == PRED_x);
> +  unsigned int nops = args.length ();
> +
> +  add_output_operand (icode);
> +  /* Use first operand as arbitrary inactive input.  */
> +  add_input_operand (icode, possible_target);
> +  emit_clobber (possible_target);
> +  /* Copy remaining arguments, including the final predicate.  */
> +  for (unsigned int i = 0; i < nops; ++i)
> +    add_input_operand (icode, args[i]);
> +
> +  return generate_insn (icode);
> +}
> +
> +/* Implement the call using instruction ICODE, which does the equivalent of:
> +
> +     OUTPUT = COND ? FN (INPUTS) : FALLBACK;
> +
> +   The instruction operands are in the order above: OUTPUT, COND, INPUTS
> +   and FALLBACK.  MERGE_ARGNO is the argument that provides FALLBACK for _m
> +   functions, or DEFAULT_MERGE_ARGNO if we should apply the usual rules.  */
> +rtx
> +function_expander::use_cond_insn (insn_code icode, unsigned int merge_argno)
> +{
> +  /* At present we never need to handle PRED_none, which would involve
> +     creating a new predicate rather than using one supplied by the user.  */
> +  gcc_assert (pred != PRED_none);
> +  /* For MVE, we only handle PRED_m at present.  */
> +  gcc_assert (pred == PRED_m);
> +
> +  /* Discount the output, predicate and fallback value.
*/ > + unsigned int nops = insn_data[icode].n_operands - 3; > + machine_mode mode = insn_data[icode].operand[0].mode; > + > + unsigned int opno = 0; > + rtx fallback_arg = NULL_RTX; > + fallback_arg = get_fallback_value (mode, merge_argno, opno); > + rtx pred_arg = args[nops + 1]; > + > + add_output_operand (icode); > + add_input_operand (icode, fallback_arg); > + for (unsigned int i = 0; i < nops; ++i) > + add_input_operand (icode, args[opno + i]); > + add_input_operand (icode, pred_arg); > + return generate_insn (icode); > +} > + > +/* Implement the call using a normal unpredicated optab for PRED_none. > + > + <optab> corresponds to: > + > + - CODE_FOR_SINT for signed integers > + - CODE_FOR_UINT for unsigned integers > + - CODE_FOR_FP for floating-point values */ > +rtx > +function_expander::map_to_rtx_codes (rtx_code code_for_sint, > + rtx_code code_for_uint, > + rtx_code code_for_fp) > +{ > + gcc_assert (pred == PRED_none); > + rtx_code code = type_suffix (0).integer_p ? > + (type_suffix (0).unsigned_p ? code_for_uint : code_for_sint) > + : code_for_fp; > + insn_code icode = direct_optab_handler (code_to_optab (code), 0); > + if (icode == CODE_FOR_nothing) > + gcc_unreachable (); > + > + return use_unpred_insn (icode); > +} > + > +/* Expand the call and return its lhs. */ > +rtx > +function_expander::expand () > +{ > + unsigned int nargs = call_expr_nargs (call_expr); > + args.reserve (nargs); > + for (unsigned int i = 0; i < nargs; ++i) > + args.quick_push (expand_normal (CALL_EXPR_ARG (call_expr, i))); > + > + return base->expand (*this); > +} > + > +/* If we're implementing manual overloading, check whether the MVE > + function with subcode CODE is overloaded, and if so attempt to > + determine the corresponding non-overloaded function. The call > + occurs at location LOCATION and has the arguments given by ARGLIST. > + > + If the call is erroneous, report an appropriate error and return > + error_mark_node. 
Otherwise, if the function is overloaded, return > + the decl of the non-overloaded function. Return NULL_TREE otherwise, > + indicating that the call should be processed in the normal way. */ > +tree > +resolve_overloaded_builtin (location_t location, unsigned int code, > + vec<tree, va_gc> *arglist) > +{ > + if (code >= vec_safe_length (registered_functions)) > + return NULL_TREE; > + > + registered_function &rfn = *(*registered_functions)[code]; > + if (rfn.overloaded_p) > + return function_resolver (location, rfn.instance, rfn.decl, > + *arglist).resolve (); > + return NULL_TREE; > +} > + > +/* Perform any semantic checks needed for a call to the MVE function > + with subcode CODE, such as testing for integer constant expressions. > + The call occurs at location LOCATION and has NARGS arguments, > + given by ARGS. FNDECL is the original function decl, before > + overload resolution. > + > + Return true if the call is valid, otherwise report a suitable error. */ > +bool > +check_builtin_call (location_t location, vec<location_t>, unsigned int code, > + tree fndecl, unsigned int nargs, tree *args) > +{ > + const registered_function &rfn = *(*registered_functions)[code]; > + if (!check_requires_float (location, rfn.decl, rfn.requires_float)) > + return false; > + > + return function_checker (location, rfn.instance, fndecl, > + TREE_TYPE (rfn.decl), nargs, args).check (); > +} > + > +/* Attempt to fold STMT, given that it's a call to the MVE function > + with subcode CODE. Return the new statement on success and null > + on failure. Insert any other new statements at GSI. */ > +gimple * > +gimple_fold_builtin (unsigned int code, gcall *stmt) > +{ > + registered_function &rfn = *(*registered_functions)[code]; > + return gimple_folder (rfn.instance, rfn.decl, stmt).fold (); > +} > + > +/* Expand a call to the MVE function with subcode CODE. EXP is the call > + expression and TARGET is the preferred location for the result. > + Return the value of the lhs. 
*/ > +rtx > +expand_builtin (unsigned int code, tree exp, rtx target) > +{ > + registered_function &rfn = *(*registered_functions)[code]; > + if (!check_requires_float (EXPR_LOCATION (exp), rfn.decl, > + rfn.requires_float)) > + return target; > + return function_expander (rfn.instance, rfn.decl, exp, target).expand (); > +} > + > +} /* end namespace arm_mve */ > + > +using namespace arm_mve; > + > +inline void > +gt_ggc_mx (function_instance *) > +{ > +} > + > +inline void > +gt_pch_nx (function_instance *) > +{ > +} > + > +inline void > +gt_pch_nx (function_instance *, gt_pointer_operator, void *) > +{ > +} > > #include "gt-arm-mve-builtins.h" > diff --git a/gcc/config/arm/arm-mve-builtins.def b/gcc/config/arm/arm-mve- > builtins.def > index 69f3f81b473..49d07364fa2 100644 > --- a/gcc/config/arm/arm-mve-builtins.def > +++ b/gcc/config/arm/arm-mve-builtins.def > @@ -17,10 +17,25 @@ > along with GCC; see the file COPYING3. If not see > <http://www.gnu.org/licenses/>. */ > > +#ifndef DEF_MVE_MODE > +#define DEF_MVE_MODE(A, B, C, D) > +#endif > + > #ifndef DEF_MVE_TYPE > -#error "arm-mve-builtins.def included without defining DEF_MVE_TYPE" > +#define DEF_MVE_TYPE(A, B) > +#endif > + > +#ifndef DEF_MVE_TYPE_SUFFIX > +#define DEF_MVE_TYPE_SUFFIX(A, B, C, D, E) > #endif > > +#ifndef DEF_MVE_FUNCTION > +#define DEF_MVE_FUNCTION(A, B, C, D) > +#endif > + > +DEF_MVE_MODE (n, none, none, none) > +DEF_MVE_MODE (offset, none, none, bytes) > + > #define REQUIRES_FLOAT false > DEF_MVE_TYPE (mve_pred16_t, boolean_type_node) > DEF_MVE_TYPE (uint8x16_t, unsigned_intQI_type_node) > @@ -37,3 +52,26 @@ DEF_MVE_TYPE (int64x2_t, intDI_type_node) > DEF_MVE_TYPE (float16x8_t, arm_fp16_type_node) > DEF_MVE_TYPE (float32x4_t, float_type_node) > #undef REQUIRES_FLOAT > + > +#define REQUIRES_FLOAT false > +DEF_MVE_TYPE_SUFFIX (s8, int8x16_t, signed, 8, V16QImode) > +DEF_MVE_TYPE_SUFFIX (s16, int16x8_t, signed, 16, V8HImode) > +DEF_MVE_TYPE_SUFFIX (s32, int32x4_t, signed, 32, V4SImode) > 
+DEF_MVE_TYPE_SUFFIX (s64, int64x2_t, signed, 64, V2DImode) > +DEF_MVE_TYPE_SUFFIX (u8, uint8x16_t, unsigned, 8, V16QImode) > +DEF_MVE_TYPE_SUFFIX (u16, uint16x8_t, unsigned, 16, V8HImode) > +DEF_MVE_TYPE_SUFFIX (u32, uint32x4_t, unsigned, 32, V4SImode) > +DEF_MVE_TYPE_SUFFIX (u64, uint64x2_t, unsigned, 64, V2DImode) > +#undef REQUIRES_FLOAT > + > +#define REQUIRES_FLOAT true > +DEF_MVE_TYPE_SUFFIX (f16, float16x8_t, float, 16, V8HFmode) > +DEF_MVE_TYPE_SUFFIX (f32, float32x4_t, float, 32, V4SFmode) > +#undef REQUIRES_FLOAT > + > +#include "arm-mve-builtins-base.def" > + > +#undef DEF_MVE_TYPE > +#undef DEF_MVE_TYPE_SUFFIX > +#undef DEF_MVE_FUNCTION > +#undef DEF_MVE_MODE > diff --git a/gcc/config/arm/arm-mve-builtins.h b/gcc/config/arm/arm-mve- > builtins.h > index 290a118ec92..a20d2fb5d86 100644 > --- a/gcc/config/arm/arm-mve-builtins.h > +++ b/gcc/config/arm/arm-mve-builtins.h > @@ -20,7 +20,79 @@ > #ifndef GCC_ARM_MVE_BUILTINS_H > #define GCC_ARM_MVE_BUILTINS_H > > +/* The full name of an MVE ACLE function is the concatenation of: > + > + - the base name ("vadd", etc.) > + - the "mode" suffix ("_n", "_index", etc.) > + - the type suffixes ("_s32", "_b8", etc.) > + - the predication suffix ("_x", "_z", etc.) > + > + Each piece of information is individually useful, so we retain this > + classification throughout: > + > + - function_base represents the base name > + > + - mode_suffix_index represents the mode suffix > + > + - type_suffix_index represents individual type suffixes, while > + type_suffix_pair represents a pair of them > + > + - prediction_index extends the predication suffix with an additional > + alternative: PRED_implicit for implicitly-predicated operations > + > + In addition to its unique full name, a function may have a shorter > + overloaded alias. This alias removes pieces of the suffixes that > + can be inferred from the arguments, such as by shortening the mode > + suffix or dropping some of the type suffixes. 
The base name and the > + predication suffix stay the same. > + > + The function_shape class describes what arguments a given function > + takes and what its overloaded alias is called. In broad terms, > + function_base describes how the underlying instruction behaves while > + function_shape describes how that instruction has been presented at > + the language level. > + > + The static list of functions uses function_group to describe a group > + of related functions. The function_builder class is responsible for > + expanding this static description into a list of individual functions > + and registering the associated built-in functions. function_instance > + describes one of these individual functions in terms of the properties > + described above. > + > + The classes involved in compiling a function call are: > + > + - function_resolver, which resolves an overloaded function call to a > + specific function_instance and its associated function decl > + > + - function_checker, which checks whether the values of the arguments > + conform to the ACLE specification > + > + - gimple_folder, which tries to fold a function call at the gimple level > + > + - function_expander, which expands a function call into rtl instructions > + > + function_resolver and function_checker operate at the language level > + and so are associated with the function_shape. gimple_folder and > + function_expander are concerned with the behavior of the function > + and so are associated with the function_base. > + > + Note that we've specifically chosen not to fold calls in the frontend, > + since MVE intrinsics will hardly ever fold a useful language-level > + constant. */ > namespace arm_mve { > +/* The maximum number of vectors in an ACLE tuple type. */ > +const unsigned int MAX_TUPLE_SIZE = 3; > + > +/* Used to represent the default merge argument index for _m functions. > + The actual index depends on how many arguments the function takes. 
*/ > +const unsigned int DEFAULT_MERGE_ARGNO = 0; > + > +/* Flags that describe what a function might do, in addition to reading > + its arguments and returning a result. */ > +const unsigned int CP_READ_FPCR = 1U << 0; > +const unsigned int CP_RAISE_FP_EXCEPTIONS = 1U << 1; > +const unsigned int CP_READ_MEMORY = 1U << 2; > +const unsigned int CP_WRITE_MEMORY = 1U << 3; > > /* Enumerates the MVE predicate and (data) vector types, together called > "vector types" for brevity. */ > @@ -30,11 +102,604 @@ enum vector_type_index > VECTOR_TYPE_ ## ACLE_NAME, > #include "arm-mve-builtins.def" > NUM_VECTOR_TYPES > -#undef DEF_MVE_TYPE > }; > > +/* Classifies the available measurement units for an address displacement. > */ > +enum units_index > +{ > + UNITS_none, > + UNITS_bytes > +}; > + > +/* Describes the various uses of a governing predicate. */ > +enum predication_index > +{ > + /* No governing predicate is present. */ > + PRED_none, > + > + /* Merging predication: copy inactive lanes from the first data argument > + to the vector result. */ > + PRED_m, > + > + /* Plain predication: inactive lanes are not used to compute the > + scalar result. */ > + PRED_p, > + > + /* "Don't care" predication: set inactive lanes of the vector result > + to arbitrary values. */ > + PRED_x, > + > + /* Zero predication: set inactive lanes of the vector result to zero. */ > + PRED_z, > + > + NUM_PREDS > +}; > + > +/* Classifies element types, based on type suffixes with the bit count > + removed. */ > +enum type_class_index > +{ > + TYPE_bool, > + TYPE_float, > + TYPE_signed, > + TYPE_unsigned, > + NUM_TYPE_CLASSES > +}; > + > +/* Classifies an operation into "modes"; for example, to distinguish > + vector-scalar operations from vector-vector operations, or to > + distinguish between different addressing modes. This classification > + accounts for the function suffixes that occur between the base name > + and the first type suffix. 
*/ > +enum mode_suffix_index > +{ > +#define DEF_MVE_MODE(NAME, BASE, DISPLACEMENT, UNITS) > MODE_##NAME, > +#include "arm-mve-builtins.def" > + MODE_none > +}; > + > +/* Enumerates the possible type suffixes. Each suffix is associated with > + a vector type, but for predicates provides extra information about the > + element size. */ > +enum type_suffix_index > +{ > +#define DEF_MVE_TYPE_SUFFIX(NAME, ACLE_TYPE, CLASS, BITS, MODE) > \ > + TYPE_SUFFIX_ ## NAME, > +#include "arm-mve-builtins.def" > + NUM_TYPE_SUFFIXES > +}; > + > +/* Combines two type suffixes. */ > +typedef enum type_suffix_index type_suffix_pair[2]; > + > +class function_base; > +class function_shape; > + > +/* Static information about a mode suffix. */ > +struct mode_suffix_info > +{ > + /* The suffix string itself. */ > + const char *string; > + > + /* The type of the vector base address, or NUM_VECTOR_TYPES if the > + mode does not include a vector base address. */ > + vector_type_index base_vector_type; > + > + /* The type of the vector displacement, or NUM_VECTOR_TYPES if the > + mode does not include a vector displacement. (Note that scalar > + displacements are always int64_t.) */ > + vector_type_index displacement_vector_type; > + > + /* The units in which the vector or scalar displacement is measured, > + or UNITS_none if the mode doesn't take a displacement. */ > + units_index displacement_units; > +}; > + > +/* Static information about a type suffix. */ > +struct type_suffix_info > +{ > + /* The suffix string itself. */ > + const char *string; > + > + /* The associated ACLE vector or predicate type. */ > + vector_type_index vector_type : 8; > + > + /* What kind of type the suffix represents. */ > + type_class_index tclass : 8; > + > + /* The number of bits and bytes in an element. For predicates this > + measures the associated data elements. */ > + unsigned int element_bits : 8; > + unsigned int element_bytes : 8; > + > + /* True if the suffix is for an integer type. 
*/ > + unsigned int integer_p : 1; > + /* True if the suffix is for an unsigned type. */ > + unsigned int unsigned_p : 1; > + /* True if the suffix is for a floating-point type. */ > + unsigned int float_p : 1; > + unsigned int spare : 13; > + > + /* The associated vector or predicate mode. */ > + machine_mode vector_mode : 16; > +}; > + > +/* Static information about a set of functions. */ > +struct function_group_info > +{ > + /* The base name, as a string. */ > + const char *base_name; > + > + /* Describes the behavior associated with the function base name. */ > + const function_base *const *base; > + > + /* The shape of the functions, as described above the class definition. > + It's possible to have entries with the same base name but different > + shapes. */ > + const function_shape *const *shape; > + > + /* A list of the available type suffixes, and of the available predication > + types. The function supports every combination of the two. > + > + The list of type suffixes is terminated by two NUM_TYPE_SUFFIXES > + while the list of predication types is terminated by NUM_PREDS. > + The list of type suffixes is lexicographically ordered based > + on the index value. */ > + const type_suffix_pair *types; > + const predication_index *preds; > + > + /* Whether the function group requires a floating point abi. */ > + bool requires_float; > +}; > + > +/* Describes a single fully-resolved function (i.e. one that has a > + unique full name). 
*/ > +class GTY((user)) function_instance > +{ > +public: > + function_instance (const char *, const function_base *, > + const function_shape *, mode_suffix_index, > + const type_suffix_pair &, predication_index); > + > + bool operator== (const function_instance &) const; > + bool operator!= (const function_instance &) const; > + hashval_t hash () const; > + > + unsigned int call_properties () const; > + bool reads_global_state_p () const; > + bool modifies_global_state_p () const; > + bool could_trap_p () const; > + > + unsigned int vectors_per_tuple () const; > + > + const mode_suffix_info &mode_suffix () const; > + > + const type_suffix_info &type_suffix (unsigned int) const; > + tree scalar_type (unsigned int) const; > + tree vector_type (unsigned int) const; > + tree tuple_type (unsigned int) const; > + machine_mode vector_mode (unsigned int) const; > + machine_mode gp_mode (unsigned int) const; > + > + bool has_inactive_argument () const; > + > + /* The properties of the function. (The explicit "enum"s are required > + for gengtype.) */ > + const char *base_name; > + const function_base *base; > + const function_shape *shape; > + enum mode_suffix_index mode_suffix_id; > + type_suffix_pair type_suffix_ids; > + enum predication_index pred; > +}; > + > +class registered_function; > + > +/* A class for building and registering function decls. 
*/ > +class function_builder > +{ > +public: > + function_builder (); > + ~function_builder (); > + > + void add_unique_function (const function_instance &, tree, > + vec<tree> &, bool, bool, bool); > + void add_overloaded_function (const function_instance &, bool, bool); > + void add_overloaded_functions (const function_group_info &, > + mode_suffix_index, bool); > + > + void register_function_group (const function_group_info &, bool); > + > +private: > + void append_name (const char *); > + char *finish_name (); > + > + char *get_name (const function_instance &, bool, bool); > + > + tree get_attributes (const function_instance &); > + > + registered_function &add_function (const function_instance &, > + const char *, tree, tree, > + bool, bool, bool); > + > + /* The function type to use for functions that are resolved by > + function_resolver. */ > + tree m_overload_type; > + > + /* True if we should create a separate decl for each instance of an > + overloaded function, instead of using function_resolver. */ > + bool m_direct_overloads; > + > + /* Used for building up function names. */ > + obstack m_string_obstack; > + > + /* Maps all overloaded function names that we've registered so far > + to their associated function_instances. */ > + hash_map<nofree_string_hash, registered_function *> > m_overload_names; > +}; > + > +/* A base class for handling calls to built-in functions. */ > +class function_call_info : public function_instance > +{ > +public: > + function_call_info (location_t, const function_instance &, tree); > + > + bool function_returns_void_p (); > + > + /* The location of the call. */ > + location_t location; > + > + /* The FUNCTION_DECL that is being called. */ > + tree fndecl; > +}; > + > +/* A class for resolving an overloaded function call. 
*/ > +class function_resolver : public function_call_info > +{ > +public: > + enum { SAME_SIZE = 256, HALF_SIZE, QUARTER_SIZE }; > + static const type_class_index SAME_TYPE_CLASS = NUM_TYPE_CLASSES; > + > + function_resolver (location_t, const function_instance &, tree, > + vec<tree, va_gc> &); > + > + tree get_vector_type (type_suffix_index); > + const char *get_scalar_type_name (type_suffix_index); > + tree get_argument_type (unsigned int); > + bool scalar_argument_p (unsigned int); > + > + tree report_no_such_form (type_suffix_index); > + tree lookup_form (mode_suffix_index, > + type_suffix_index = NUM_TYPE_SUFFIXES, > + type_suffix_index = NUM_TYPE_SUFFIXES); > + tree resolve_to (mode_suffix_index, > + type_suffix_index = NUM_TYPE_SUFFIXES, > + type_suffix_index = NUM_TYPE_SUFFIXES); > + > + type_suffix_index infer_vector_or_tuple_type (unsigned int, unsigned int); > + type_suffix_index infer_vector_type (unsigned int); > + > + bool require_vector_or_scalar_type (unsigned int); > + > + bool require_vector_type (unsigned int, vector_type_index); > + bool require_matching_vector_type (unsigned int, type_suffix_index); > + bool require_derived_vector_type (unsigned int, unsigned int, > + type_suffix_index, > + type_class_index = SAME_TYPE_CLASS, > + unsigned int = SAME_SIZE); > + bool require_integer_immediate (unsigned int); > + bool require_scalar_type (unsigned int, const char *); > + bool require_derived_scalar_type (unsigned int, type_class_index, > + unsigned int = SAME_SIZE); > + > + bool check_num_arguments (unsigned int); > + bool check_gp_argument (unsigned int, unsigned int &, unsigned int &); > + tree resolve_unary (type_class_index = SAME_TYPE_CLASS, > + unsigned int = SAME_SIZE, bool = false); > + tree resolve_unary_n (); > + tree resolve_uniform (unsigned int, unsigned int = 0); > + tree resolve_uniform_opt_n (unsigned int); > + tree finish_opt_n_resolution (unsigned int, unsigned int, type_suffix_index, > + type_class_index = SAME_TYPE_CLASS, > + 
unsigned int = SAME_SIZE, > + type_suffix_index = NUM_TYPE_SUFFIXES); > + > + tree resolve (); > + > +private: > + /* The arguments to the overloaded function. */ > + vec<tree, va_gc> &m_arglist; > +}; > + > +/* A class for checking that the semantic constraints on a function call are > + satisfied, such as arguments being integer constant expressions with > + a particular range. The parent class's FNDECL is the decl that was > + called in the original source, before overload resolution. */ > +class function_checker : public function_call_info > +{ > +public: > + function_checker (location_t, const function_instance &, tree, > + tree, unsigned int, tree *); > + > + bool require_immediate_enum (unsigned int, tree); > + bool require_immediate_lane_index (unsigned int, unsigned int = 1); > + bool require_immediate_range (unsigned int, HOST_WIDE_INT, > HOST_WIDE_INT); > + > + bool check (); > + > +private: > + bool argument_exists_p (unsigned int); > + > + bool require_immediate (unsigned int, HOST_WIDE_INT &); > + > + /* The type of the resolved function. */ > + tree m_fntype; > + > + /* The arguments to the function. */ > + unsigned int m_nargs; > + tree *m_args; > + > + /* The first argument not associated with the function's predication > + type. */ > + unsigned int m_base_arg; > +}; > + > +/* A class for folding a gimple function call. */ > +class gimple_folder : public function_call_info > +{ > +public: > + gimple_folder (const function_instance &, tree, > + gcall *); > + > + gimple *fold (); > + > + /* The call we're folding. */ > + gcall *call; > + > + /* The result of the call, or null if none. */ > + tree lhs; > +}; > + > +/* A class for expanding a function call into RTL. 
*/ > +class function_expander : public function_call_info > +{ > +public: > + function_expander (const function_instance &, tree, tree, rtx); > + rtx expand (); > + > + insn_code direct_optab_handler (optab, unsigned int = 0); > + > + rtx get_fallback_value (machine_mode, unsigned int, unsigned int &); > + rtx get_reg_target (); > + > + void add_output_operand (insn_code); > + void add_input_operand (insn_code, rtx); > + void add_integer_operand (HOST_WIDE_INT); > + rtx generate_insn (insn_code); > + > + rtx use_exact_insn (insn_code); > + rtx use_unpred_insn (insn_code); > + rtx use_pred_x_insn (insn_code); > + rtx use_cond_insn (insn_code, unsigned int = DEFAULT_MERGE_ARGNO); > + > + rtx map_to_rtx_codes (rtx_code, rtx_code, rtx_code); > + > + /* The function call expression. */ > + tree call_expr; > + > + /* For functions that return a value, this is the preferred location > + of that value. It could be null or could have a different mode > + from the function return type. */ > + rtx possible_target; > + > + /* The expanded arguments. */ > + auto_vec<rtx, 16> args; > + > +private: > + /* Used to build up the operands to an instruction. */ > + auto_vec<expand_operand, 8> m_ops; > +}; > + > +/* Provides information about a particular function base name, and handles > + tasks related to the base name. */ > +class function_base > +{ > +public: > + /* Return a set of CP_* flags that describe what the function might do, > + in addition to reading its arguments and returning a result. */ > + virtual unsigned int call_properties (const function_instance &) const; > + > + /* If the function operates on tuples of vectors, return the number > + of vectors in the tuples, otherwise return 1. */ > + virtual unsigned int vectors_per_tuple () const { return 1; } > + > + /* Try to fold the given gimple call. Return the new gimple statement > + on success, otherwise return null. 
*/ > + virtual gimple *fold (gimple_folder &) const { return NULL; } > + > + /* Expand the given call into rtl. Return the result of the function, > + or an arbitrary value if the function doesn't return a result. */ > + virtual rtx expand (function_expander &) const = 0; > +}; > + > +/* Classifies functions into "shapes". The idea is to take all the > + type signatures for a set of functions, and classify what's left > + based on: > + > + - the number of arguments > + > + - the process of determining the types in the signature from the mode > + and type suffixes in the function name (including types that are not > + affected by the suffixes) > + > + - which arguments must be integer constant expressions, and what range > + those arguments have > + > + - the process for mapping overloaded names to "full" names. */ > +class function_shape > +{ > +public: > + virtual bool explicit_type_suffix_p (unsigned int, enum predication_index, > enum mode_suffix_index) const = 0; > + virtual bool explicit_mode_suffix_p (enum predication_index, enum > mode_suffix_index) const = 0; > + virtual bool skip_overload_p (enum predication_index, enum > mode_suffix_index) const = 0; > + > + /* Define all functions associated with the given group. */ > + virtual void build (function_builder &, > + const function_group_info &, > + bool) const = 0; > + > + /* Try to resolve the overloaded call. Return the non-overloaded > + function decl on success and error_mark_node on failure. */ > + virtual tree resolve (function_resolver &) const = 0; > + > + /* Check whether the given call is semantically valid. Return true > + if it is, otherwise report an error and return false. 
*/
> +  virtual bool check (function_checker &) const { return true; }
> +};
> +
> +extern const type_suffix_info type_suffixes[NUM_TYPE_SUFFIXES + 1];
> +extern const mode_suffix_info mode_suffixes[MODE_none + 1];
> +
>  extern tree scalar_types[NUM_VECTOR_TYPES];
> -extern tree acle_vector_types[3][NUM_VECTOR_TYPES + 1];
> +extern tree acle_vector_types[MAX_TUPLE_SIZE][NUM_VECTOR_TYPES + 1];
> +
> +/* Return the ACLE type mve_pred16_t.  */
> +inline tree
> +get_mve_pred16_t (void)
> +{
> +  return acle_vector_types[0][VECTOR_TYPE_mve_pred16_t];
> +}
> +
> +/* Try to find a mode with the given mode_suffix_info fields.  Return the
> +   mode on success or MODE_none on failure.  */
> +inline mode_suffix_index
> +find_mode_suffix (vector_type_index base_vector_type,
> +                  vector_type_index displacement_vector_type,
> +                  units_index displacement_units)
> +{
> +  for (unsigned int mode_i = 0; mode_i < ARRAY_SIZE (mode_suffixes); ++mode_i)
> +    {
> +      const mode_suffix_info &mode = mode_suffixes[mode_i];
> +      if (mode.base_vector_type == base_vector_type
> +          && mode.displacement_vector_type == displacement_vector_type
> +          && mode.displacement_units == displacement_units)
> +        return mode_suffix_index (mode_i);
> +    }
> +  return MODE_none;
> +}
> +
> +/* Return the type suffix associated with ELEMENT_BITS-bit elements of type
> +   class TCLASS.  */
> +inline type_suffix_index
> +find_type_suffix (type_class_index tclass, unsigned int element_bits)
> +{
> +  for (unsigned int i = 0; i < NUM_TYPE_SUFFIXES; ++i)
> +    if (type_suffixes[i].tclass == tclass
> +        && type_suffixes[i].element_bits == element_bits)
> +      return type_suffix_index (i);
> +  gcc_unreachable ();
> +}
> +
> +inline function_instance::
> +function_instance (const char *base_name_in,
> +                   const function_base *base_in,
> +                   const function_shape *shape_in,
> +                   mode_suffix_index mode_suffix_id_in,
> +                   const type_suffix_pair &type_suffix_ids_in,
> +                   predication_index pred_in)
> +  : base_name (base_name_in), base (base_in), shape (shape_in),
> +    mode_suffix_id (mode_suffix_id_in), pred (pred_in)
> +{
> +  memcpy (type_suffix_ids, type_suffix_ids_in, sizeof (type_suffix_ids));
> +}
> +
> +inline bool
> +function_instance::operator== (const function_instance &other) const
> +{
> +  return (base == other.base
> +          && shape == other.shape
> +          && mode_suffix_id == other.mode_suffix_id
> +          && pred == other.pred
> +          && type_suffix_ids[0] == other.type_suffix_ids[0]
> +          && type_suffix_ids[1] == other.type_suffix_ids[1]);
> +}
> +
> +inline bool
> +function_instance::operator!= (const function_instance &other) const
> +{
> +  return !operator== (other);
> +}
> +
> +/* If the function operates on tuples of vectors, return the number
> +   of vectors in the tuples, otherwise return 1.  */
> +inline unsigned int
> +function_instance::vectors_per_tuple () const
> +{
> +  return base->vectors_per_tuple ();
> +}
> +
> +/* Return information about the function's mode suffix.  */
> +inline const mode_suffix_info &
> +function_instance::mode_suffix () const
> +{
> +  return mode_suffixes[mode_suffix_id];
> +}
> +
> +/* Return information about type suffix I.  */
> +inline const type_suffix_info &
> +function_instance::type_suffix (unsigned int i) const
> +{
> +  return type_suffixes[type_suffix_ids[i]];
> +}
> +
> +/* Return the scalar type associated with type suffix I.  */
> +inline tree
> +function_instance::scalar_type (unsigned int i) const
> +{
> +  return scalar_types[type_suffix (i).vector_type];
> +}
> +
> +/* Return the vector type associated with type suffix I.  */
> +inline tree
> +function_instance::vector_type (unsigned int i) const
> +{
> +  return acle_vector_types[0][type_suffix (i).vector_type];
> +}
> +
> +/* If the function operates on tuples of vectors, return the tuple type
> +   associated with type suffix I, otherwise return the vector type associated
> +   with type suffix I.  */
> +inline tree
> +function_instance::tuple_type (unsigned int i) const
> +{
> +  unsigned int num_vectors = vectors_per_tuple ();
> +  return acle_vector_types[num_vectors - 1][type_suffix (i).vector_type];
> +}
> +
> +/* Return the vector or predicate mode associated with type suffix I.  */
> +inline machine_mode
> +function_instance::vector_mode (unsigned int i) const
> +{
> +  return type_suffix (i).vector_mode;
> +}
> +
> +/* Return true if the function has no return value.  */
> +inline bool
> +function_call_info::function_returns_void_p ()
> +{
> +  return TREE_TYPE (TREE_TYPE (fndecl)) == void_type_node;
> +}
> +
> +/* Default implementation of function::call_properties, with conservatively
> +   correct behavior for floating-point instructions.  */
> +inline unsigned int
> +function_base::call_properties (const function_instance &instance) const
> +{
> +  unsigned int flags = 0;
> +  if (instance.type_suffix (0).float_p || instance.type_suffix (1).float_p)
> +    flags |= CP_READ_FPCR | CP_RAISE_FP_EXCEPTIONS;
> +  return flags;
> +}
>
>  } /* end namespace arm_mve */
>
> diff --git a/gcc/config/arm/arm-protos.h b/gcc/config/arm/arm-protos.h
> index 1bdbd3b8ab3..61fcd671437 100644
> --- a/gcc/config/arm/arm-protos.h
> +++ b/gcc/config/arm/arm-protos.h
> @@ -215,7 +215,8 @@ extern opt_machine_mode arm_get_mask_mode (machine_mode mode);
>     those groups.  */
>  enum arm_builtin_class
>  {
> -  ARM_BUILTIN_GENERAL
> +  ARM_BUILTIN_GENERAL,
> +  ARM_BUILTIN_MVE
>  };
>
>  /* Built-in function codes are structured so that the low
> @@ -229,6 +230,13 @@ const unsigned int ARM_BUILTIN_CLASS = (1 << ARM_BUILTIN_SHIFT) - 1;
>  /* MVE functions.  */
>  namespace arm_mve {
>    void handle_arm_mve_types_h ();
> +  void handle_arm_mve_h (bool);
> +  tree resolve_overloaded_builtin (location_t, unsigned int,
> +                                   vec<tree, va_gc> *);
> +  bool check_builtin_call (location_t, vec<location_t>, unsigned int,
> +                           tree, unsigned int, tree *);
> +  gimple *gimple_fold_builtin (unsigned int code, gcall *stmt);
> +  rtx expand_builtin (unsigned int, tree, rtx);
>  }
>
>  /* Thumb functions.  */
> diff --git a/gcc/config/arm/arm.cc b/gcc/config/arm/arm.cc
> index bf7ff9a9704..004e6c6194e 100644
> --- a/gcc/config/arm/arm.cc
> +++ b/gcc/config/arm/arm.cc
> @@ -69,6 +69,7 @@
>  #include "optabs-libfuncs.h"
>  #include "gimplify.h"
>  #include "gimple.h"
> +#include "gimple-iterator.h"
>  #include "selftest.h"
>  #include "tree-vectorizer.h"
>  #include "opts.h"
> @@ -506,6 +507,9 @@ static const struct attribute_spec arm_attribute_table[] =
>  #undef TARGET_FUNCTION_VALUE_REGNO_P
>  #define TARGET_FUNCTION_VALUE_REGNO_P arm_function_value_regno_p
>
> +#undef TARGET_GIMPLE_FOLD_BUILTIN
> +#define TARGET_GIMPLE_FOLD_BUILTIN arm_gimple_fold_builtin
> +
>  #undef TARGET_ASM_OUTPUT_MI_THUNK
>  #define TARGET_ASM_OUTPUT_MI_THUNK arm_output_mi_thunk
>  #undef TARGET_ASM_CAN_OUTPUT_MI_THUNK
> @@ -2844,6 +2848,29 @@ arm_init_libfuncs (void)
>    speculation_barrier_libfunc = init_one_libfunc ("__speculation_barrier");
>  }
>
> +/* Implement TARGET_GIMPLE_FOLD_BUILTIN.  */
> +static bool
> +arm_gimple_fold_builtin (gimple_stmt_iterator *gsi)
> +{
> +  gcall *stmt = as_a <gcall *> (gsi_stmt (*gsi));
> +  tree fndecl = gimple_call_fndecl (stmt);
> +  unsigned int code = DECL_MD_FUNCTION_CODE (fndecl);
> +  unsigned int subcode = code >> ARM_BUILTIN_SHIFT;
> +  gimple *new_stmt = NULL;
> +  switch (code & ARM_BUILTIN_CLASS)
> +    {
> +    case ARM_BUILTIN_GENERAL:
> +      break;
> +    case ARM_BUILTIN_MVE:
> +      new_stmt = arm_mve::gimple_fold_builtin (subcode, stmt);
> +    }
> +  if (!new_stmt)
> +    return false;
> +
> +  gsi_replace (gsi, new_stmt, true);
> +  return true;
> +}
> +
>  /* On AAPCS systems, this is the "struct __va_list".  */
>  static GTY(()) tree va_list_type;
>
> diff --git a/gcc/config/arm/arm_mve.h b/gcc/config/arm/arm_mve.h
> index 1262d668121..0d2ba968fc0 100644
> --- a/gcc/config/arm/arm_mve.h
> +++ b/gcc/config/arm/arm_mve.h
> @@ -34,6 +34,12 @@
>  #endif
>  #include "arm_mve_types.h"
>
> +#ifdef __ARM_MVE_PRESERVE_USER_NAMESPACE
> +#pragma GCC arm "arm_mve.h" true
> +#else
> +#pragma GCC arm "arm_mve.h" false
> +#endif
> +
>  #ifndef __ARM_MVE_PRESERVE_USER_NAMESPACE
>  #define vst4q(__addr, __value) __arm_vst4q(__addr, __value)
>  #define vdupq_n(__a) __arm_vdupq_n(__a)
> diff --git a/gcc/config/arm/predicates.md b/gcc/config/arm/predicates.md
> index 3139750c606..8e235f63ee6 100644
> --- a/gcc/config/arm/predicates.md
> +++ b/gcc/config/arm/predicates.md
> @@ -903,3 +903,7 @@ (define_predicate "call_insn_operand"
>  (define_special_predicate "aligned_operand"
>    (ior (not (match_code "mem"))
>         (match_test "MEM_ALIGN (op) >= GET_MODE_ALIGNMENT (mode)")))
> +
> +;; A special predicate that doesn't match a particular mode.
> +(define_special_predicate "arm_any_register_operand"
> +  (match_code "reg"))
> diff --git a/gcc/config/arm/t-arm b/gcc/config/arm/t-arm
> index 637e72af5bb..9a1b06368a1 100644
> --- a/gcc/config/arm/t-arm
> +++ b/gcc/config/arm/t-arm
> @@ -154,15 +154,41 @@ arm-builtins.o: $(srcdir)/config/arm/arm-builtins.cc $(CONFIG_H) \
>  	$(srcdir)/config/arm/arm-builtins.cc
>
>  arm-mve-builtins.o: $(srcdir)/config/arm/arm-mve-builtins.cc $(CONFIG_H) \
> -  $(SYSTEM_H) coretypes.h $(TM_H) $(TREE_H) \
> -  fold-const.h langhooks.h stringpool.h attribs.h diagnostic.h \
> +  $(SYSTEM_H) coretypes.h $(TM_H) $(TREE_H) $(RTL_H) $(TM_P_H) \
> +  memmodel.h insn-codes.h optabs.h recog.h expr.h basic-block.h \
> +  function.h fold-const.h gimple.h gimple-fold.h emit-rtl.h langhooks.h \
> +  stringpool.h attribs.h diagnostic.h \
>    $(srcdir)/config/arm/arm-protos.h \
>    $(srcdir)/config/arm/arm-builtins.h \
>    $(srcdir)/config/arm/arm-mve-builtins.h \
> -  $(srcdir)/config/arm/arm-mve-builtins.def
> +  $(srcdir)/config/arm/arm-mve-builtins-base.h \
> +  $(srcdir)/config/arm/arm-mve-builtins-shapes.h \
> +  $(srcdir)/config/arm/arm-mve-builtins.def \
> +  $(srcdir)/config/arm/arm-mve-builtins-base.def
>  	$(COMPILER) -c $(ALL_COMPILERFLAGS) $(ALL_CPPFLAGS) $(INCLUDES) \
>  		$(srcdir)/config/arm/arm-mve-builtins.cc
>
> +arm-mve-builtins-shapes.o: \
> +  $(srcdir)/config/arm/arm-mve-builtins-shapes.cc \
> +  $(CONFIG_H) $(SYSTEM_H) coretypes.h $(TM_H) $(TREE_H) \
> +  $(RTL_H) memmodel.h insn-codes.h optabs.h \
> +  $(srcdir)/config/arm/arm-mve-builtins.h \
> +  $(srcdir)/config/arm/arm-mve-builtins-shapes.h
> +	$(COMPILER) -c $(ALL_COMPILERFLAGS) $(ALL_CPPFLAGS) $(INCLUDES) \
> +		$(srcdir)/config/arm/arm-mve-builtins-shapes.cc
> +
> +arm-mve-builtins-base.o: \
> +  $(srcdir)/config/arm/arm-mve-builtins-base.cc \
> +  $(CONFIG_H) $(SYSTEM_H) coretypes.h $(TM_H) $(TREE_H) $(RTL_H) \
> +  memmodel.h insn-codes.h $(OPTABS_H) \
> +  $(BASIC_BLOCK_H) $(FUNCTION_H) $(GIMPLE_H) \
> +  $(srcdir)/config/arm/arm-mve-builtins.h \
> +  $(srcdir)/config/arm/arm-mve-builtins-shapes.h \
> +  $(srcdir)/config/arm/arm-mve-builtins-base.h \
> +  $(srcdir)/config/arm/arm-mve-builtins-functions.h
> +	$(COMPILER) -c $(ALL_COMPILERFLAGS) $(ALL_CPPFLAGS) $(INCLUDES) \
> +		$(srcdir)/config/arm/arm-mve-builtins-base.cc
> +
>  arm-c.o: $(srcdir)/config/arm/arm-c.cc $(CONFIG_H) $(SYSTEM_H) \
>    coretypes.h $(TM_H) $(TREE_H) output.h $(C_COMMON_H)
>  	$(COMPILER) -c $(ALL_COMPILERFLAGS) $(ALL_CPPFLAGS) $(INCLUDES) \
> --
> 2.34.1
diff --git a/gcc/config.gcc b/gcc/config.gcc index 6fd1594480a..5d49f5890ab 100644 --- a/gcc/config.gcc +++ b/gcc/config.gcc @@ -362,7 +362,7 @@ arc*-*-*) ;; arm*-*-*) cpu_type=arm - extra_objs="arm-builtins.o arm-mve-builtins.o aarch-common.o aarch-bti-insert.o" + extra_objs="arm-builtins.o arm-mve-builtins.o arm-mve-builtins-shapes.o arm-mve-builtins-base.o aarch-common.o aarch-bti-insert.o" extra_headers="mmintrin.h arm_neon.h arm_acle.h arm_fp16.h arm_cmse.h arm_bf16.h arm_mve_types.h arm_mve.h arm_cde.h" target_type_format_char='%' c_target_objs="arm-c.o" diff --git a/gcc/config/arm/arm-builtins.cc b/gcc/config/arm/arm-builtins.cc index adcb50d2185..d0c57409b4c 100644 --- a/gcc/config/arm/arm-builtins.cc +++ b/gcc/config/arm/arm-builtins.cc @@ -2712,6 +2712,7 @@ arm_general_builtin_decl (unsigned code) return arm_builtin_decls[code]; } +/* Implement TARGET_BUILTIN_DECL. */ /* Return the ARM builtin for CODE. */ tree arm_builtin_decl (unsigned code, bool initialize_p ATTRIBUTE_UNUSED) @@ -2721,6 +2722,8 @@ arm_builtin_decl (unsigned code, bool initialize_p ATTRIBUTE_UNUSED) { case ARM_BUILTIN_GENERAL: return arm_general_builtin_decl (subcode); + case ARM_BUILTIN_MVE: + return error_mark_node; default: gcc_unreachable (); } @@ -4087,6 +4090,8 @@ arm_expand_builtin (tree exp, { case ARM_BUILTIN_GENERAL: return arm_general_expand_builtin (subcode, exp, target, ignore); + case ARM_BUILTIN_MVE: + return arm_mve::expand_builtin (subcode, exp, target); default: gcc_unreachable (); } @@ -4188,8 +4193,9 @@ arm_general_check_builtin_call (unsigned int code) /* Implement TARGET_CHECK_BUILTIN_CALL. 
*/ bool -arm_check_builtin_call (location_t, vec<location_t>, tree fndecl, tree, - unsigned int, tree *) +arm_check_builtin_call (location_t loc, vec<location_t> arg_loc, + tree fndecl, tree orig_fndecl, + unsigned int nargs, tree *args) { unsigned int code = DECL_MD_FUNCTION_CODE (fndecl); unsigned int subcode = code >> ARM_BUILTIN_SHIFT; @@ -4197,6 +4203,9 @@ arm_check_builtin_call (location_t, vec<location_t>, tree fndecl, tree, { case ARM_BUILTIN_GENERAL: return arm_general_check_builtin_call (subcode); + case ARM_BUILTIN_MVE: + return arm_mve::check_builtin_call (loc, arg_loc, subcode, + orig_fndecl, nargs, args); default: gcc_unreachable (); } @@ -4215,6 +4224,8 @@ arm_describe_resolver (tree fndecl) && subcode < ARM_BUILTIN_MVE_BASE) return arm_cde_resolver; return arm_no_resolver; + case ARM_BUILTIN_MVE: + return arm_mve_resolver; default: gcc_unreachable (); } diff --git a/gcc/config/arm/arm-builtins.h b/gcc/config/arm/arm-builtins.h index 8c94b6bc40b..494dcd09411 100644 --- a/gcc/config/arm/arm-builtins.h +++ b/gcc/config/arm/arm-builtins.h @@ -27,6 +27,7 @@ enum resolver_ident { arm_cde_resolver, + arm_mve_resolver, arm_no_resolver }; enum resolver_ident arm_describe_resolver (tree); diff --git a/gcc/config/arm/arm-c.cc b/gcc/config/arm/arm-c.cc index 59c0d8ce747..d3d93ceba00 100644 --- a/gcc/config/arm/arm-c.cc +++ b/gcc/config/arm/arm-c.cc @@ -144,20 +144,44 @@ arm_pragma_arm (cpp_reader *) const char *name = TREE_STRING_POINTER (x); if (strcmp (name, "arm_mve_types.h") == 0) arm_mve::handle_arm_mve_types_h (); + else if (strcmp (name, "arm_mve.h") == 0) + { + if (pragma_lex (&x) == CPP_NAME) + { + if (strcmp (IDENTIFIER_POINTER (x), "true") == 0) + arm_mve::handle_arm_mve_h (true); + else if (strcmp (IDENTIFIER_POINTER (x), "false") == 0) + arm_mve::handle_arm_mve_h (false); + else + error ("%<#pragma GCC arm \"arm_mve.h\"%> requires a boolean parameter"); + } + } else error ("unknown %<#pragma GCC arm%> option %qs", name); } -/* Implement 
TARGET_RESOLVE_OVERLOADED_BUILTIN. This is currently only - used for the MVE related builtins for the CDE extension. - Here we ensure the type of arguments is such that the size is correct, and - then return a tree that describes the same function call but with the - relevant types cast as necessary. */ +/* Implement TARGET_RESOLVE_OVERLOADED_BUILTIN. */ tree -arm_resolve_overloaded_builtin (location_t loc, tree fndecl, void *arglist) +arm_resolve_overloaded_builtin (location_t loc, tree fndecl, + void *uncast_arglist) { - if (arm_describe_resolver (fndecl) == arm_cde_resolver) - return arm_resolve_cde_builtin (loc, fndecl, arglist); + enum resolver_ident resolver = arm_describe_resolver (fndecl); + if (resolver == arm_cde_resolver) + return arm_resolve_cde_builtin (loc, fndecl, uncast_arglist); + if (resolver == arm_mve_resolver) + { + vec<tree, va_gc> empty = {}; + vec<tree, va_gc> *arglist = (uncast_arglist + ? (vec<tree, va_gc> *) uncast_arglist + : &empty); + unsigned int code = DECL_MD_FUNCTION_CODE (fndecl); + unsigned int subcode = code >> ARM_BUILTIN_SHIFT; + tree new_fndecl = arm_mve::resolve_overloaded_builtin (loc, subcode, arglist); + if (new_fndecl == NULL_TREE || new_fndecl == error_mark_node) + return new_fndecl; + return build_function_call_vec (loc, vNULL, new_fndecl, arglist, + NULL, fndecl); + } return NULL_TREE; } @@ -519,7 +543,9 @@ arm_register_target_pragmas (void) { /* Update pragma hook to allow parsing #pragma GCC target. 
*/ targetm.target_option.pragma_parse = arm_pragma_target_parse; + targetm.resolve_overloaded_builtin = arm_resolve_overloaded_builtin; + targetm.check_builtin_call = arm_check_builtin_call; c_register_pragma ("GCC", "arm", arm_pragma_arm); diff --git a/gcc/config/arm/arm-mve-builtins-base.cc b/gcc/config/arm/arm-mve-builtins-base.cc new file mode 100644 index 00000000000..e9f285faf2b --- /dev/null +++ b/gcc/config/arm/arm-mve-builtins-base.cc @@ -0,0 +1,45 @@ +/* ACLE support for Arm MVE (__ARM_FEATURE_MVE intrinsics) + Copyright (C) 2023 Free Software Foundation, Inc. + + This file is part of GCC. + + GCC is free software; you can redistribute it and/or modify it + under the terms of the GNU General Public License as published by + the Free Software Foundation; either version 3, or (at your option) + any later version. + + GCC is distributed in the hope that it will be useful, but + WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + General Public License for more details. + + You should have received a copy of the GNU General Public License + along with GCC; see the file COPYING3. If not see + <http://www.gnu.org/licenses/>. 
*/ + +#include "config.h" +#include "system.h" +#include "coretypes.h" +#include "tm.h" +#include "tree.h" +#include "rtl.h" +#include "memmodel.h" +#include "insn-codes.h" +#include "optabs.h" +#include "basic-block.h" +#include "function.h" +#include "gimple.h" +#include "arm-mve-builtins.h" +#include "arm-mve-builtins-shapes.h" +#include "arm-mve-builtins-base.h" +#include "arm-mve-builtins-functions.h" + +using namespace arm_mve; + +namespace { + +} /* end anonymous namespace */ + +namespace arm_mve { + +} /* end namespace arm_mve */ diff --git a/gcc/config/arm/arm-mve-builtins-base.def b/gcc/config/arm/arm-mve-builtins-base.def new file mode 100644 index 00000000000..d15ba2e23e8 --- /dev/null +++ b/gcc/config/arm/arm-mve-builtins-base.def @@ -0,0 +1,24 @@ +/* ACLE support for Arm MVE (__ARM_FEATURE_MVE intrinsics) + Copyright (C) 2023 Free Software Foundation, Inc. + + This file is part of GCC. + + GCC is free software; you can redistribute it and/or modify it + under the terms of the GNU General Public License as published by + the Free Software Foundation; either version 3, or (at your option) + any later version. + + GCC is distributed in the hope that it will be useful, but + WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + General Public License for more details. + + You should have received a copy of the GNU General Public License + along with GCC; see the file COPYING3. If not see + <http://www.gnu.org/licenses/>. */ + +#define REQUIRES_FLOAT false +#undef REQUIRES_FLOAT + +#define REQUIRES_FLOAT true +#undef REQUIRES_FLOAT diff --git a/gcc/config/arm/arm-mve-builtins-base.h b/gcc/config/arm/arm-mve-builtins-base.h new file mode 100644 index 00000000000..c4d7b750cd5 --- /dev/null +++ b/gcc/config/arm/arm-mve-builtins-base.h @@ -0,0 +1,29 @@ +/* ACLE support for Arm MVE (__ARM_FEATURE_MVE intrinsics) + Copyright (C) 2023 Free Software Foundation, Inc. 
+ + This file is part of GCC. + + GCC is free software; you can redistribute it and/or modify it + under the terms of the GNU General Public License as published by + the Free Software Foundation; either version 3, or (at your option) + any later version. + + GCC is distributed in the hope that it will be useful, but + WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + General Public License for more details. + + You should have received a copy of the GNU General Public License + along with GCC; see the file COPYING3. If not see + <http://www.gnu.org/licenses/>. */ + +#ifndef GCC_ARM_MVE_BUILTINS_BASE_H +#define GCC_ARM_MVE_BUILTINS_BASE_H + +namespace arm_mve { +namespace functions { + +} /* end namespace arm_mve::functions */ +} /* end namespace arm_mve */ + +#endif diff --git a/gcc/config/arm/arm-mve-builtins-functions.h b/gcc/config/arm/arm-mve-builtins-functions.h new file mode 100644 index 00000000000..dff01999bcd --- /dev/null +++ b/gcc/config/arm/arm-mve-builtins-functions.h @@ -0,0 +1,50 @@ +/* ACLE support for Arm MVE (function_base classes) + Copyright (C) 2023 Free Software Foundation, Inc. + + This file is part of GCC. + + GCC is free software; you can redistribute it and/or modify it + under the terms of the GNU General Public License as published by + the Free Software Foundation; either version 3, or (at your option) + any later version. + + GCC is distributed in the hope that it will be useful, but + WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + General Public License for more details. + + You should have received a copy of the GNU General Public License + along with GCC; see the file COPYING3. If not see + <http://www.gnu.org/licenses/>. 
*/ + +#ifndef GCC_ARM_MVE_BUILTINS_FUNCTIONS_H +#define GCC_ARM_MVE_BUILTINS_FUNCTIONS_H + +namespace arm_mve { + +/* Wrap T, which is derived from function_base, and indicate that the + function never has side effects. It is only necessary to use this + wrapper on functions that might have floating-point suffixes, since + otherwise we assume by default that the function has no side effects. */ +template<typename T> +class quiet : public T +{ +public: + CONSTEXPR quiet () : T () {} + + unsigned int + call_properties (const function_instance &) const override + { + return 0; + } +}; + +} /* end namespace arm_mve */ + +/* Declare the global function base NAME, creating it from an instance + of class CLASS with constructor arguments ARGS. */ +#define FUNCTION(NAME, CLASS, ARGS) \ + namespace { static CONSTEXPR const CLASS NAME##_obj ARGS; } \ + namespace functions { const function_base *const NAME = &NAME##_obj; } + +#endif diff --git a/gcc/config/arm/arm-mve-builtins-shapes.cc b/gcc/config/arm/arm-mve-builtins-shapes.cc new file mode 100644 index 00000000000..f20660d8319 --- /dev/null +++ b/gcc/config/arm/arm-mve-builtins-shapes.cc @@ -0,0 +1,343 @@ +/* ACLE support for Arm MVE (function shapes) + Copyright (C) 2023 Free Software Foundation, Inc. + + This file is part of GCC. + + GCC is free software; you can redistribute it and/or modify it + under the terms of the GNU General Public License as published by + the Free Software Foundation; either version 3, or (at your option) + any later version. + + GCC is distributed in the hope that it will be useful, but + WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + General Public License for more details. + + You should have received a copy of the GNU General Public License + along with GCC; see the file COPYING3. If not see + <http://www.gnu.org/licenses/>. 
*/ + +#include "config.h" +#include "system.h" +#include "coretypes.h" +#include "tm.h" +#include "tree.h" +#include "rtl.h" +#include "memmodel.h" +#include "insn-codes.h" +#include "optabs.h" +#include "arm-mve-builtins.h" +#include "arm-mve-builtins-shapes.h" + +/* In the comments below, _t0 represents the first type suffix + (e.g. "_s8") and _t1 represents the second. T0/T1 represent the + type full names (e.g. int8x16_t). Square brackets enclose + characters that are present in only the full name, not the + overloaded name. Governing predicate arguments and predicate + suffixes are not shown, since they depend on the predication type, + which is a separate piece of information from the shape. */ + +namespace arm_mve { + +/* If INSTANCE has a predicate, add it to the list of argument types + in ARGUMENT_TYPES. RETURN_TYPE is the type returned by the + function. */ +static void +apply_predication (const function_instance &instance, tree return_type, + vec<tree> &argument_types) +{ + if (instance.pred != PRED_none) + { + /* When predicate is PRED_m, insert a first argument + ("inactive") with the same type as return_type. */ + if (instance.has_inactive_argument ()) + argument_types.quick_insert (0, return_type); + argument_types.quick_push (get_mve_pred16_t ()); + } +} + +/* Parse and move past an element type in FORMAT and return it as a type + suffix. The format is: + + [01] - the element type in type suffix 0 or 1 of INSTANCE. + h<elt> - a half-sized version of <elt> + s<bits> - a signed type with the given number of bits + s[01] - a signed type with the same width as type suffix 0 or 1 + u<bits> - an unsigned type with the given number of bits + u[01] - an unsigned type with the same width as type suffix 0 or 1 + w<elt> - a double-sized version of <elt> + x<bits> - a type with the given number of bits and same signedness + as the next argument. + + Future intrinsics will extend this format. 
*/ +static type_suffix_index +parse_element_type (const function_instance &instance, const char *&format) +{ + int ch = *format++; + + + if (ch == 's' || ch == 'u') + { + type_class_index tclass = (ch == 'f' ? TYPE_float + : ch == 's' ? TYPE_signed + : TYPE_unsigned); + char *end; + unsigned int bits = strtol (format, &end, 10); + format = end; + if (bits == 0 || bits == 1) + bits = instance.type_suffix (bits).element_bits; + return find_type_suffix (tclass, bits); + } + + if (ch == 'h') + { + type_suffix_index suffix = parse_element_type (instance, format); + return find_type_suffix (type_suffixes[suffix].tclass, + type_suffixes[suffix].element_bits / 2); + } + + if (ch == 'w') + { + type_suffix_index suffix = parse_element_type (instance, format); + return find_type_suffix (type_suffixes[suffix].tclass, + type_suffixes[suffix].element_bits * 2); + } + + if (ch == 'x') + { + const char *next = format; + next = strstr (format, ","); + next+=2; + type_suffix_index suffix = parse_element_type (instance, next); + type_class_index tclass = type_suffixes[suffix].tclass; + char *end; + unsigned int bits = strtol (format, &end, 10); + format = end; + return find_type_suffix (tclass, bits); + } + + if (ch == '0' || ch == '1') + return instance.type_suffix_ids[ch - '0']; + + gcc_unreachable (); +} + +/* Read and return a type from FORMAT for function INSTANCE. Advance + FORMAT beyond the type string. The format is: + + p - predicates with type mve_pred16_t + s<elt> - a scalar type with the given element suffix + t<elt> - a vector or tuple type with given element suffix [*1] + v<elt> - a vector with the given element suffix + + where <elt> has the format described above parse_element_type. + + Future intrinsics will extend this format. + + [*1] the vectors_per_tuple function indicates whether the type should + be a tuple, and if so, how many vectors it should contain. 
*/ +static tree +parse_type (const function_instance &instance, const char *&format) +{ + int ch = *format++; + + if (ch == 'p') + return get_mve_pred16_t (); + + if (ch == 's') + { + type_suffix_index suffix = parse_element_type (instance, format); + return scalar_types[type_suffixes[suffix].vector_type]; + } + + if (ch == 't') + { + type_suffix_index suffix = parse_element_type (instance, format); + vector_type_index vector_type = type_suffixes[suffix].vector_type; + unsigned int num_vectors = instance.vectors_per_tuple (); + return acle_vector_types[num_vectors - 1][vector_type]; + } + + if (ch == 'v') + { + type_suffix_index suffix = parse_element_type (instance, format); + return acle_vector_types[0][type_suffixes[suffix].vector_type]; + } + + gcc_unreachable (); +} + +/* Read a type signature for INSTANCE from FORMAT. Add the argument + types to ARGUMENT_TYPES and return the return type. Assert there + are no more than MAX_ARGS arguments. + + The format is a comma-separated list of types (as for parse_type), + with the first type being the return type and the rest being the + argument types. */ +static tree +parse_signature (const function_instance &instance, const char *format, + vec<tree> &argument_types, unsigned int max_args) +{ + tree return_type = parse_type (instance, format); + unsigned int args = 0; + while (format[0] == ',') + { + gcc_assert (args < max_args); + format += 1; + tree argument_type = parse_type (instance, format); + argument_types.quick_push (argument_type); + args += 1; + } + gcc_assert (format[0] == 0); + return return_type; +} + +/* Add one function instance for GROUP, using mode suffix MODE_SUFFIX_ID, + the type suffixes at index TI and the predication suffix at index PI. + The other arguments are as for build_all. 
*/ +static void +build_one (function_builder &b, const char *signature, + const function_group_info &group, mode_suffix_index mode_suffix_id, + unsigned int ti, unsigned int pi, bool preserve_user_namespace, + bool force_direct_overloads) +{ + /* Current functions take at most five arguments. Match + parse_signature parameter below. */ + auto_vec<tree, 5> argument_types; + function_instance instance (group.base_name, *group.base, *group.shape, + mode_suffix_id, group.types[ti], + group.preds[pi]); + tree return_type = parse_signature (instance, signature, argument_types, 5); + apply_predication (instance, return_type, argument_types); + b.add_unique_function (instance, return_type, argument_types, + preserve_user_namespace, group.requires_float, + force_direct_overloads); +} + +/* Add a function instance for every type and predicate combination in + GROUP, except if requested to use only the predicates listed in + RESTRICT_TO_PREDS. Take the function base name from GROUP and the + mode suffix from MODE_SUFFIX_ID. Use SIGNATURE to construct the + function signature, then use apply_predication to add in the + predicate. 
*/ +static void +build_all (function_builder &b, const char *signature, + const function_group_info &group, mode_suffix_index mode_suffix_id, + bool preserve_user_namespace, + bool force_direct_overloads = false, + const predication_index *restrict_to_preds = NULL) +{ + for (unsigned int pi = 0; group.preds[pi] != NUM_PREDS; ++pi) + { + unsigned int pi2 = 0; + + if (restrict_to_preds) + for (; restrict_to_preds[pi2] != NUM_PREDS; ++pi2) + if (restrict_to_preds[pi2] == group.preds[pi]) + break; + + if (restrict_to_preds == NULL || restrict_to_preds[pi2] != NUM_PREDS) + for (unsigned int ti = 0; + ti == 0 || group.types[ti][0] != NUM_TYPE_SUFFIXES; ++ti) + build_one (b, signature, group, mode_suffix_id, ti, pi, + preserve_user_namespace, force_direct_overloads); + } +} + +/* Add a function instance for every type and predicate combination in + GROUP, except if requested to use only the predicates listed in + RESTRICT_TO_PREDS, and only for 16-bit and 32-bit integers. Take + the function base name from GROUP and the mode suffix from + MODE_SUFFIX_ID. Use SIGNATURE to construct the function signature, + then use apply_predication to add in the predicate. 
*/ +static void +build_16_32 (function_builder &b, const char *signature, + const function_group_info &group, mode_suffix_index mode_suffix_id, + bool preserve_user_namespace, + bool force_direct_overloads = false, + const predication_index *restrict_to_preds = NULL) +{ + for (unsigned int pi = 0; group.preds[pi] != NUM_PREDS; ++pi) + { + unsigned int pi2 = 0; + + if (restrict_to_preds) + for (; restrict_to_preds[pi2] != NUM_PREDS; ++pi2) + if (restrict_to_preds[pi2] == group.preds[pi]) + break; + + if (restrict_to_preds == NULL || restrict_to_preds[pi2] != NUM_PREDS) + for (unsigned int ti = 0; + ti == 0 || group.types[ti][0] != NUM_TYPE_SUFFIXES; ++ti) + { + unsigned int element_bits = type_suffixes[group.types[ti][0]].element_bits; + type_class_index tclass = type_suffixes[group.types[ti][0]].tclass; + if ((tclass == TYPE_signed || tclass == TYPE_unsigned) + && (element_bits == 16 || element_bits == 32)) + build_one (b, signature, group, mode_suffix_id, ti, pi, + preserve_user_namespace, force_direct_overloads); + } + } +} + +/* Declare the function shape NAME, pointing it to an instance + of class <NAME>_def. */ +#define SHAPE(NAME) \ + static CONSTEXPR const NAME##_def NAME##_obj; \ + namespace shapes { const function_shape *const NAME = &NAME##_obj; } + +/* Base class for functions that are not overloaded. */ +struct nonoverloaded_base : public function_shape +{ + bool + explicit_type_suffix_p (unsigned int, enum predication_index, enum mode_suffix_index) const override + { + return true; + } + + bool + explicit_mode_suffix_p (enum predication_index, enum mode_suffix_index) const override + { + return true; + } + + bool + skip_overload_p (enum predication_index, enum mode_suffix_index) const override + { + return false; + } + + tree + resolve (function_resolver &) const override + { + gcc_unreachable (); + } +}; + +/* Base class for overloaded functions. Bit N of EXPLICIT_MASK is true + if type suffix N appears in the overloaded name. 
*/ +template<unsigned int EXPLICIT_MASK> +struct overloaded_base : public function_shape +{ + bool + explicit_type_suffix_p (unsigned int i, enum predication_index, enum mode_suffix_index) const override + { + return (EXPLICIT_MASK >> i) & 1; + } + + bool + explicit_mode_suffix_p (enum predication_index, enum mode_suffix_index) const override + { + return false; + } + + bool + skip_overload_p (enum predication_index, enum mode_suffix_index) const override + { + return false; + } +}; + +} /* end namespace arm_mve */ + +#undef SHAPE diff --git a/gcc/config/arm/arm-mve-builtins-shapes.h b/gcc/config/arm/arm-mve-builtins-shapes.h new file mode 100644 index 00000000000..9e353b85a76 --- /dev/null +++ b/gcc/config/arm/arm-mve-builtins-shapes.h @@ -0,0 +1,30 @@ +/* ACLE support for Arm MVE (function shapes) + Copyright (C) 2023 Free Software Foundation, Inc. + + This file is part of GCC. + + GCC is free software; you can redistribute it and/or modify it + under the terms of the GNU General Public License as published by + the Free Software Foundation; either version 3, or (at your option) + any later version. + + GCC is distributed in the hope that it will be useful, but + WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + General Public License for more details. + + You should have received a copy of the GNU General Public License + along with GCC; see the file COPYING3. If not see + <http://www.gnu.org/licenses/>. 
*/ + +#ifndef GCC_ARM_MVE_BUILTINS_SHAPES_H +#define GCC_ARM_MVE_BUILTINS_SHAPES_H + +namespace arm_mve +{ + namespace shapes + { + } /* end namespace arm_mve::shapes */ +} /* end namespace arm_mve */ + +#endif diff --git a/gcc/config/arm/arm-mve-builtins.cc b/gcc/config/arm/arm-mve-builtins.cc index 7586a82e3c1..b0cceb75ceb 100644 --- a/gcc/config/arm/arm-mve-builtins.cc +++ b/gcc/config/arm/arm-mve-builtins.cc @@ -24,7 +24,19 @@ #include "coretypes.h" #include "tm.h" #include "tree.h" +#include "rtl.h" +#include "tm_p.h" +#include "memmodel.h" +#include "insn-codes.h" +#include "optabs.h" +#include "recog.h" +#include "expr.h" +#include "basic-block.h" +#include "function.h" #include "fold-const.h" +#include "gimple.h" +#include "gimple-iterator.h" +#include "emit-rtl.h" #include "langhooks.h" #include "stringpool.h" #include "attribs.h" @@ -32,6 +44,8 @@ #include "arm-protos.h" #include "arm-builtins.h" #include "arm-mve-builtins.h" +#include "arm-mve-builtins-base.h" +#include "arm-mve-builtins-shapes.h" namespace arm_mve { @@ -46,6 +60,33 @@ struct vector_type_info const bool requires_float; }; +/* Describes a function decl. */ +class GTY(()) registered_function +{ +public: + /* The ACLE function that the decl represents. */ + function_instance instance GTY ((skip)); + + /* The decl itself. */ + tree decl; + + /* Whether the function requires a floating point abi. */ + bool requires_float; + + /* True if the decl represents an overloaded function that needs to be + resolved by function_resolver. */ + bool overloaded_p; +}; + +/* Hash traits for registered_function. */ +struct registered_function_hasher : nofree_ptr_hash <registered_function> +{ + typedef function_instance compare_type; + + static hashval_t hash (value_type); + static bool equal (value_type, const compare_type &); +}; + /* Flag indicating whether the arm MVE types have been handled. 
*/ static bool handle_arm_mve_types_p; @@ -54,11 +95,167 @@ static CONSTEXPR const vector_type_info vector_types[] = { #define DEF_MVE_TYPE(ACLE_NAME, SCALAR_TYPE) \ { #ACLE_NAME, REQUIRES_FLOAT }, #include "arm-mve-builtins.def" -#undef DEF_MVE_TYPE +}; + +/* The function name suffix associated with each predication type. */ +static const char *const pred_suffixes[NUM_PREDS + 1] = { + "", + "_m", + "_p", + "_x", + "_z", + "" +}; + +/* Static information about each mode_suffix_index. */ +CONSTEXPR const mode_suffix_info mode_suffixes[] = { +#define VECTOR_TYPE_none NUM_VECTOR_TYPES +#define DEF_MVE_MODE(NAME, BASE, DISPLACEMENT, UNITS) \ + { "_" #NAME, VECTOR_TYPE_##BASE, VECTOR_TYPE_##DISPLACEMENT, UNITS_##UNITS }, +#include "arm-mve-builtins.def" +#undef VECTOR_TYPE_none + { "", NUM_VECTOR_TYPES, NUM_VECTOR_TYPES, UNITS_none } +}; + +/* Static information about each type_suffix_index. */ +CONSTEXPR const type_suffix_info type_suffixes[NUM_TYPE_SUFFIXES + 1] = { +#define DEF_MVE_TYPE_SUFFIX(NAME, ACLE_TYPE, CLASS, BITS, MODE) \ + { "_" #NAME, \ + VECTOR_TYPE_##ACLE_TYPE, \ + TYPE_##CLASS, \ + BITS, \ + BITS / BITS_PER_UNIT, \ + TYPE_##CLASS == TYPE_signed || TYPE_##CLASS == TYPE_unsigned, \ + TYPE_##CLASS == TYPE_unsigned, \ + TYPE_##CLASS == TYPE_float, \ + 0, \ + MODE }, +#include "arm-mve-builtins.def" + { "", NUM_VECTOR_TYPES, TYPE_bool, 0, 0, false, false, false, + 0, VOIDmode } +}; + +/* Define a TYPES_<combination> macro for each combination of type + suffixes that an ACLE function can have, where <combination> is the + name used in DEF_MVE_FUNCTION entries. + + Use S (T) for single type suffix T and D (T1, T2) for a pair of type + suffixes T1 and T2. Use commas to separate the suffixes. + + Although the order shouldn't matter, the convention is to sort the + suffixes lexicographically after dividing suffixes into a type + class ("b", "f", etc.) and a numerical bit count. */ + +/* _f16. */ +#define TYPES_float16(S, D) \ + S (f16) + +/* _f16 _f32. 
*/ +#define TYPES_all_float(S, D) \ + S (f16), S (f32) + +/* _s8 _u8. */ +#define TYPES_integer_8(S, D) \ + S (s8), S (u8) + +/* _s8 _s16 + _u8 _u16. */ +#define TYPES_integer_8_16(S, D) \ + S (s8), S (s16), S (u8), S (u16) + +/* _s16 _s32 + _u16 _u32. */ +#define TYPES_integer_16_32(S, D) \ + S (s16), S (s32), \ + S (u16), S (u32) + +/* _s16 _s32. */ +#define TYPES_signed_16_32(S, D) \ + S (s16), S (s32) + +/* _s8 _s16 _s32. */ +#define TYPES_all_signed(S, D) \ + S (s8), S (s16), S (s32) + +/* _u8 _u16 _u32. */ +#define TYPES_all_unsigned(S, D) \ + S (u8), S (u16), S (u32) + +/* _s8 _s16 _s32 + _u8 _u16 _u32. */ +#define TYPES_all_integer(S, D) \ + TYPES_all_signed (S, D), TYPES_all_unsigned (S, D) + +/* _s8 _s16 _s32 _s64 + _u8 _u16 _u32 _u64. */ +#define TYPES_all_integer_with_64(S, D) \ + TYPES_all_signed (S, D), S (s64), TYPES_all_unsigned (S, D), S (u64) + +/* _s32 _u32. */ +#define TYPES_integer_32(S, D) \ + S (s32), S (u32) + +/* _s32. */ +#define TYPES_signed_32(S, D) \ + S (s32) + +/* Describe a pair of type suffixes in which only the first is used. */ +#define DEF_VECTOR_TYPE(X) { TYPE_SUFFIX_ ## X, NUM_TYPE_SUFFIXES } + +/* Describe a pair of type suffixes in which both are used. */ +#define DEF_DOUBLE_TYPE(X, Y) { TYPE_SUFFIX_ ## X, TYPE_SUFFIX_ ## Y } + +/* Create an array that can be used in arm-mve-builtins.def to + select the type suffixes in TYPES_<NAME>. */ +#define DEF_MVE_TYPES_ARRAY(NAME) \ + static const type_suffix_pair types_##NAME[] = { \ + TYPES_##NAME (DEF_VECTOR_TYPE, DEF_DOUBLE_TYPE), \ + { NUM_TYPE_SUFFIXES, NUM_TYPE_SUFFIXES } \ + } + +/* For functions that don't take any type suffixes.
*/ +static const type_suffix_pair types_none[] = { + { NUM_TYPE_SUFFIXES, NUM_TYPE_SUFFIXES }, + { NUM_TYPE_SUFFIXES, NUM_TYPE_SUFFIXES } +}; + +DEF_MVE_TYPES_ARRAY (all_integer); +DEF_MVE_TYPES_ARRAY (all_integer_with_64); +DEF_MVE_TYPES_ARRAY (float16); +DEF_MVE_TYPES_ARRAY (all_float); +DEF_MVE_TYPES_ARRAY (all_signed); +DEF_MVE_TYPES_ARRAY (all_unsigned); +DEF_MVE_TYPES_ARRAY (integer_8); +DEF_MVE_TYPES_ARRAY (integer_8_16); +DEF_MVE_TYPES_ARRAY (integer_16_32); +DEF_MVE_TYPES_ARRAY (integer_32); +DEF_MVE_TYPES_ARRAY (signed_16_32); +DEF_MVE_TYPES_ARRAY (signed_32); + +/* Used by functions that have no governing predicate. */ +static const predication_index preds_none[] = { PRED_none, NUM_PREDS }; + +/* Used by functions that have the m (merging) predicated form, and in + addition have an unpredicated form. */ +static const predication_index preds_m_or_none[] = { + PRED_m, PRED_none, NUM_PREDS +}; + +/* Used by functions that have the m (merging) and x ("don't care") + predicated forms, and in addition have an unpredicated form. */ +static const predication_index preds_mx_or_none[] = { + PRED_m, PRED_x, PRED_none, NUM_PREDS +}; + +/* Used by functions that have the p predicated form, in addition to + an unpredicated form. */ +static const predication_index preds_p_or_none[] = { + PRED_p, PRED_none, NUM_PREDS }; /* The scalar type associated with each vector type. */ -GTY(()) tree scalar_types[NUM_VECTOR_TYPES]; +extern GTY(()) tree scalar_types[NUM_VECTOR_TYPES]; +tree scalar_types[NUM_VECTOR_TYPES]; /* The single-predicate and single-vector types, with their built-in "__simd128_..._t" name. Allow an index of NUM_VECTOR_TYPES, which always @@ -66,7 +263,20 @@ GTY(()) tree scalar_types[NUM_VECTOR_TYPES]; static GTY(()) tree abi_vector_types[NUM_VECTOR_TYPES + 1]; /* Same, but with the arm_mve.h names.
*/ -GTY(()) tree acle_vector_types[3][NUM_VECTOR_TYPES + 1]; +extern GTY(()) tree acle_vector_types[MAX_TUPLE_SIZE][NUM_VECTOR_TYPES + 1]; +tree acle_vector_types[MAX_TUPLE_SIZE][NUM_VECTOR_TYPES + 1]; + +/* The list of all registered function decls, indexed by code. */ +static GTY(()) vec<registered_function *, va_gc> *registered_functions; + +/* All registered function decls, hashed on the function_instance + that they implement. This is used for looking up implementations of + overloaded functions. */ +static hash_table<registered_function_hasher> *function_table; + +/* True if we've already complained about attempts to use functions + when the required extension is disabled. */ +static bool reported_missing_float_p; /* Return the MVE abi type with element of type TYPE. */ static tree @@ -87,7 +297,6 @@ register_builtin_types () #define DEF_MVE_TYPE(ACLE_NAME, SCALAR_TYPE) \ scalar_types[VECTOR_TYPE_ ## ACLE_NAME] = SCALAR_TYPE; #include "arm-mve-builtins.def" -#undef DEF_MVE_TYPE for (unsigned int i = 0; i < NUM_VECTOR_TYPES; ++i) { if (vector_types[i].requires_float && !TARGET_HAVE_MVE_FLOAT) @@ -113,8 +322,18 @@ register_builtin_types () static void register_vector_type (vector_type_index type) { + + /* If the target does not have the mve.fp extension, but the type requires + it, then it needs to be assigned a non-dummy type so that functions + with those types in their signature can be registered. This allows for + diagnostics about the missing extension, rather than about a missing + function definition. 
*/ if (vector_types[type].requires_float && !TARGET_HAVE_MVE_FLOAT) - return; + { + acle_vector_types[0][type] = void_type_node; + return; + } + tree vectype = abi_vector_types[type]; tree id = get_identifier (vector_types[type].acle_name); tree decl = build_decl (input_location, TYPE_DECL, id, vectype); @@ -133,15 +352,26 @@ register_vector_type (vector_type_index type) acle_vector_types[0][type] = vectype; } -/* Register tuple type TYPE with NUM_VECTORS arity under its - arm_mve_types.h name. */ +/* Register tuple types of element type TYPE under their arm_mve_types.h + names. */ static void register_builtin_tuple_types (vector_type_index type) { const vector_type_info* info = &vector_types[type]; + + /* If the target does not have the mve.fp extension, but the type requires + it, then it needs to be assigned a non-dummy type so that functions + with those types in their signature can be registered. This allows for + diagnostics about the missing extension, rather than about a missing + function definition. */ if (scalar_types[type] == boolean_type_node || (info->requires_float && !TARGET_HAVE_MVE_FLOAT)) + { + for (unsigned int num_vectors = 2; num_vectors <= 4; num_vectors += 2) + acle_vector_types[num_vectors >> 1][type] = void_type_node; return; + } + const char *vector_type_name = info->acle_name; char buffer[sizeof ("float32x4x2_t")]; for (unsigned int num_vectors = 2; num_vectors <= 4; num_vectors += 2) @@ -189,8 +419,1710 @@ handle_arm_mve_types_h () } } -} /* end namespace arm_mve */ +/* Implement #pragma GCC arm "arm_mve.h" <bool>. */ +void +handle_arm_mve_h (bool preserve_user_namespace) +{ + if (function_table) + { + error ("duplicate definition of %qs", "arm_mve.h"); + return; + } -using namespace arm_mve; + /* Define MVE functions. */ + function_table = new hash_table<registered_function_hasher> (1023); +} + +/* Return true if CANDIDATE is equivalent to MODEL_TYPE for overloading + purposes. 
*/ +static bool +matches_type_p (const_tree model_type, const_tree candidate) +{ + if (VECTOR_TYPE_P (model_type)) + { + if (!VECTOR_TYPE_P (candidate) + || maybe_ne (TYPE_VECTOR_SUBPARTS (model_type), + TYPE_VECTOR_SUBPARTS (candidate)) + || TYPE_MODE (model_type) != TYPE_MODE (candidate)) + return false; + + model_type = TREE_TYPE (model_type); + candidate = TREE_TYPE (candidate); + } + return (candidate != error_mark_node + && TYPE_MAIN_VARIANT (model_type) == TYPE_MAIN_VARIANT (candidate)); +} + +/* Report an error against LOCATION that the user has tried to use + a floating point function when the mve.fp extension is disabled. */ +static void +report_missing_float (location_t location, tree fndecl) +{ + /* Avoid reporting a slew of messages for a single oversight. */ + if (reported_missing_float_p) + return; + + error_at (location, "ACLE function %qD requires ISA extension %qs", + fndecl, "mve.fp"); + inform (location, "you can enable mve.fp by using the command-line" + " option %<-march%>, or by using the %<target%>" + " attribute or pragma"); + reported_missing_float_p = true; +} + +/* Report that LOCATION has a call to FNDECL in which argument ARGNO + was not an integer constant expression. ARGNO counts from zero. */ +static void +report_non_ice (location_t location, tree fndecl, unsigned int argno) +{ + error_at (location, "argument %d of %qE must be an integer constant" + " expression", argno + 1, fndecl); +} + +/* Report that LOCATION has a call to FNDECL in which argument ARGNO has + the value ACTUAL, whereas the function requires a value in the range + [MIN, MAX]. ARGNO counts from zero. 
*/ +static void +report_out_of_range (location_t location, tree fndecl, unsigned int argno, + HOST_WIDE_INT actual, HOST_WIDE_INT min, + HOST_WIDE_INT max) +{ + error_at (location, "passing %wd to argument %d of %qE, which expects" + " a value in the range [%wd, %wd]", actual, argno + 1, fndecl, + min, max); +} + +/* Report that LOCATION has a call to FNDECL in which argument ARGNO has + the value ACTUAL, whereas the function requires a valid value of + enum type ENUMTYPE. ARGNO counts from zero. */ +static void +report_not_enum (location_t location, tree fndecl, unsigned int argno, + HOST_WIDE_INT actual, tree enumtype) +{ + error_at (location, "passing %wd to argument %d of %qE, which expects" + " a valid %qT value", actual, argno + 1, fndecl, enumtype); +} + +/* Checks that the mve.fp extension is enabled, given that REQUIRES_FLOAT + indicates whether it is required or not for function FNDECL. + Report an error against LOCATION if not. */ +static bool +check_requires_float (location_t location, tree fndecl, + bool requires_float) +{ + if (requires_float && !TARGET_HAVE_MVE_FLOAT) + { + report_missing_float (location, fndecl); + return false; + } + + return true; +} + +/* Return a hash code for a function_instance. */ +hashval_t +function_instance::hash () const +{ + inchash::hash h; + /* BASE uniquely determines BASE_NAME, so we don't need to hash both. */ + h.add_ptr (base); + h.add_ptr (shape); + h.add_int (mode_suffix_id); + h.add_int (type_suffix_ids[0]); + h.add_int (type_suffix_ids[1]); + h.add_int (pred); + return h.end (); +} + +/* Return a set of CP_* flags that describe what the function could do, + taking the command-line flags into account. */ +unsigned int +function_instance::call_properties () const +{ + unsigned int flags = base->call_properties (*this); + + /* -fno-trapping-math means that we can assume any FP exceptions + are not user-visible. 
*/ + if (!flag_trapping_math) + flags &= ~CP_RAISE_FP_EXCEPTIONS; + + return flags; +} + +/* Return true if calls to the function could read some form of + global state. */ +bool +function_instance::reads_global_state_p () const +{ + unsigned int flags = call_properties (); + + /* Preserve any dependence on rounding mode, flush to zero mode, etc. + There is currently no way of turning this off; in particular, + -fno-rounding-math (which is the default) means that we should make + the usual assumptions about rounding mode, which for intrinsics means + acting as the instructions do. */ + if (flags & CP_READ_FPCR) + return true; + + return false; +} + +/* Return true if calls to the function could modify some form of + global state. */ +bool +function_instance::modifies_global_state_p () const +{ + unsigned int flags = call_properties (); + + /* Preserve any exception state written back to the FPCR, + unless -fno-trapping-math says this is unnecessary. */ + if (flags & CP_RAISE_FP_EXCEPTIONS) + return true; + + /* Handle direct modifications of global state. */ + return flags & CP_WRITE_MEMORY; +} + +/* Return true if calls to the function could raise a signal. */ +bool +function_instance::could_trap_p () const +{ + unsigned int flags = call_properties (); + + /* Handle functions that could raise SIGFPE. */ + if (flags & CP_RAISE_FP_EXCEPTIONS) + return true; + + /* Handle functions that could raise SIGBUS or SIGSEGV. */ + if (flags & (CP_READ_MEMORY | CP_WRITE_MEMORY)) + return true; + + return false; +} + +/* Return true if the function has an implicit "inactive" argument. + This is the case for most _m predicated functions, but not all. + The list will be updated as needed.
*/ +bool +function_instance::has_inactive_argument () const +{ + if (pred != PRED_m) + return false; + + return true; +} + +inline hashval_t +registered_function_hasher::hash (value_type value) +{ + return value->instance.hash (); +} + +inline bool +registered_function_hasher::equal (value_type value, const compare_type &key) +{ + return value->instance == key; +} + +function_builder::function_builder () +{ + m_overload_type = build_function_type (void_type_node, void_list_node); + m_direct_overloads = lang_GNU_CXX (); + gcc_obstack_init (&m_string_obstack); +} + +function_builder::~function_builder () +{ + obstack_free (&m_string_obstack, NULL); +} + +/* Add NAME to the end of the function name being built. */ +void +function_builder::append_name (const char *name) +{ + obstack_grow (&m_string_obstack, name, strlen (name)); +} + +/* Zero-terminate and complete the function name being built. */ +char * +function_builder::finish_name () +{ + obstack_1grow (&m_string_obstack, 0); + return (char *) obstack_finish (&m_string_obstack); +} + +/* Return the overloaded or full function name for INSTANCE, with optional + prefix; PRESERVE_USER_NAMESPACE selects the prefix, and OVERLOADED_P + selects between the overloaded and the full function name. Allocate the + string on m_string_obstack; the caller must use obstack_free to free it + after use.
*/ +char * +function_builder::get_name (const function_instance &instance, + bool preserve_user_namespace, + bool overloaded_p) +{ + if (preserve_user_namespace) + append_name ("__arm_"); + append_name (instance.base_name); + append_name (pred_suffixes[instance.pred]); + if (!overloaded_p + || instance.shape->explicit_mode_suffix_p (instance.pred, + instance.mode_suffix_id)) + append_name (instance.mode_suffix ().string); + for (unsigned int i = 0; i < 2; ++i) + if (!overloaded_p + || instance.shape->explicit_type_suffix_p (i, instance.pred, + instance.mode_suffix_id)) + append_name (instance.type_suffix (i).string); + return finish_name (); +} + +/* Add attribute NAME to ATTRS. */ +static tree +add_attribute (const char *name, tree attrs) +{ + return tree_cons (get_identifier (name), NULL_TREE, attrs); +} + +/* Return the appropriate function attributes for INSTANCE. */ +tree +function_builder::get_attributes (const function_instance &instance) +{ + tree attrs = NULL_TREE; + + if (!instance.modifies_global_state_p ()) + { + if (instance.reads_global_state_p ()) + attrs = add_attribute ("pure", attrs); + else + attrs = add_attribute ("const", attrs); + } + + if (!flag_non_call_exceptions || !instance.could_trap_p ()) + attrs = add_attribute ("nothrow", attrs); + + return add_attribute ("leaf", attrs); +} + +/* Add a function called NAME with type FNTYPE and attributes ATTRS. + INSTANCE describes what the function does and OVERLOADED_P indicates + whether it is overloaded. REQUIRES_FLOAT indicates whether the function + requires the mve.fp extension. 
*/ +registered_function & +function_builder::add_function (const function_instance &instance, + const char *name, tree fntype, tree attrs, + bool requires_float, + bool overloaded_p, + bool placeholder_p) +{ + unsigned int code = vec_safe_length (registered_functions); + code = (code << ARM_BUILTIN_SHIFT) | ARM_BUILTIN_MVE; + + /* We need to be able to generate placeholders to ensure that we have a + consistent numbering scheme for function codes between the C and C++ + frontends, so that everything ties up in LTO. + + Currently, tree-streamer-in.cc:unpack_ts_function_decl_value_fields + validates that tree nodes returned by TARGET_BUILTIN_DECL are non-NULL and + some node other than error_mark_node. This is a holdover from when builtin + decls were streamed by code rather than by value. + + Ultimately, we should be able to remove this validation of BUILT_IN_MD + nodes and remove the target hook. For now, however, we need to appease the + validation and return a non-NULL, non-error_mark_node node, so we + arbitrarily choose integer_zero_node. */ + tree decl = placeholder_p + ? integer_zero_node + : simulate_builtin_function_decl (input_location, name, fntype, + code, NULL, attrs); + + registered_function &rfn = *ggc_alloc <registered_function> (); + rfn.instance = instance; + rfn.decl = decl; + rfn.requires_float = requires_float; + rfn.overloaded_p = overloaded_p; + vec_safe_push (registered_functions, &rfn); + + return rfn; +} + +/* Add a built-in function for INSTANCE, with the argument types given + by ARGUMENT_TYPES and the return type given by RETURN_TYPE. + REQUIRES_FLOAT indicates whether the function requires the mve.fp extension, + and PRESERVE_USER_NAMESPACE indicates whether the function should also be + registered under its non-prefixed name. 
*/ +void +function_builder::add_unique_function (const function_instance &instance, + tree return_type, + vec<tree> &argument_types, + bool preserve_user_namespace, + bool requires_float, + bool force_direct_overloads) +{ + /* Add the function under its full (unique) name with prefix. */ + char *name = get_name (instance, true, false); + tree fntype = build_function_type_array (return_type, + argument_types.length (), + argument_types.address ()); + tree attrs = get_attributes (instance); + registered_function &rfn = add_function (instance, name, fntype, attrs, + requires_float, false, false); + + /* Enter the function into the hash table. */ + hashval_t hash = instance.hash (); + registered_function **rfn_slot + = function_table->find_slot_with_hash (instance, hash, INSERT); + gcc_assert (!*rfn_slot); + *rfn_slot = &rfn; + + /* Also add the non-prefixed non-overloaded function, if the user namespace + does not need to be preserved. */ + if (!preserve_user_namespace) + { + char *noprefix_name = get_name (instance, false, false); + tree attrs = get_attributes (instance); + add_function (instance, noprefix_name, fntype, attrs, requires_float, + false, false); + } + + /* Also add the function under its overloaded alias, if we want + a separate decl for each instance of an overloaded function. */ + char *overload_name = get_name (instance, true, true); + if (strcmp (name, overload_name) != 0) + { + /* Attribute lists shouldn't be shared. */ + tree attrs = get_attributes (instance); + bool placeholder_p = !(m_direct_overloads || force_direct_overloads); + add_function (instance, overload_name, fntype, attrs, + requires_float, false, placeholder_p); + + /* Also add the non-prefixed overloaded function, if the user namespace + does not need to be preserved. 
*/ + if (!preserve_user_namespace) + { + char *noprefix_overload_name = get_name (instance, false, true); + tree attrs = get_attributes (instance); + add_function (instance, noprefix_overload_name, fntype, attrs, + requires_float, false, placeholder_p); + } + } + + obstack_free (&m_string_obstack, name); +} + +/* Add one function decl for INSTANCE, to be used with manual overload + resolution. REQUIRES_FLOAT indicates whether the function requires the + mve.fp extension. + + For simplicity, partition functions by instance and required extensions, + and check whether the required extensions are available as part of resolving + the function to the relevant unique function. */ +void +function_builder::add_overloaded_function (const function_instance &instance, + bool preserve_user_namespace, + bool requires_float) +{ + char *name = get_name (instance, true, true); + if (registered_function **map_value = m_overload_names.get (name)) + { + gcc_assert ((*map_value)->instance == instance); + obstack_free (&m_string_obstack, name); + } + else + { + registered_function &rfn + = add_function (instance, name, m_overload_type, NULL_TREE, + requires_float, true, m_direct_overloads); + m_overload_names.put (name, &rfn); + if (!preserve_user_namespace) + { + char *noprefix_name = get_name (instance, false, true); + registered_function &noprefix_rfn + = add_function (instance, noprefix_name, m_overload_type, + NULL_TREE, requires_float, true, + m_direct_overloads); + m_overload_names.put (noprefix_name, &noprefix_rfn); + } + } +} + +/* If we are using manual overload resolution, add one function decl + for each overloaded function in GROUP. Take the function base name + from GROUP and the mode from MODE. 
*/ +void +function_builder::add_overloaded_functions (const function_group_info &group, + mode_suffix_index mode, + bool preserve_user_namespace) +{ + for (unsigned int pi = 0; group.preds[pi] != NUM_PREDS; ++pi) + { + unsigned int explicit_type0 + = (*group.shape)->explicit_type_suffix_p (0, group.preds[pi], mode); + unsigned int explicit_type1 + = (*group.shape)->explicit_type_suffix_p (1, group.preds[pi], mode); + + if ((*group.shape)->skip_overload_p (group.preds[pi], mode)) + continue; + + if (!explicit_type0 && !explicit_type1) + { + /* Deal with the common case in which there is one overloaded + function for all type combinations. */ + function_instance instance (group.base_name, *group.base, + *group.shape, mode, types_none[0], + group.preds[pi]); + add_overloaded_function (instance, preserve_user_namespace, + group.requires_float); + } + else + for (unsigned int ti = 0; group.types[ti][0] != NUM_TYPE_SUFFIXES; + ++ti) + { + /* Stub out the types that are determined by overload + resolution. */ + type_suffix_pair types = { + explicit_type0 ? group.types[ti][0] : NUM_TYPE_SUFFIXES, + explicit_type1 ? group.types[ti][1] : NUM_TYPE_SUFFIXES + }; + function_instance instance (group.base_name, *group.base, + *group.shape, mode, types, + group.preds[pi]); + add_overloaded_function (instance, preserve_user_namespace, + group.requires_float); + } + } +} + +/* Register all the functions in GROUP. 
*/ +void +function_builder::register_function_group (const function_group_info &group, + bool preserve_user_namespace) +{ + (*group.shape)->build (*this, group, preserve_user_namespace); +} + +function_call_info::function_call_info (location_t location_in, + const function_instance &instance_in, + tree fndecl_in) + : function_instance (instance_in), location (location_in), fndecl (fndecl_in) +{ +} + +function_resolver::function_resolver (location_t location, + const function_instance &instance, + tree fndecl, vec<tree, va_gc> &arglist) + : function_call_info (location, instance, fndecl), m_arglist (arglist) +{ +} + +/* Return the vector type associated with type suffix TYPE. */ +tree +function_resolver::get_vector_type (type_suffix_index type) +{ + return acle_vector_types[0][type_suffixes[type].vector_type]; +} + +/* Return the <stdint.h> name associated with TYPE. Using the <stdint.h> + name should be more user-friendly than the underlying canonical type, + since it makes the signedness and bitwidth explicit. */ +const char * +function_resolver::get_scalar_type_name (type_suffix_index type) +{ + return vector_types[type_suffixes[type].vector_type].acle_name + 2; +} + +/* Return the type of argument I, or error_mark_node if it isn't + well-formed. */ +tree +function_resolver::get_argument_type (unsigned int i) +{ + tree arg = m_arglist[i]; + return arg == error_mark_node ? arg : TREE_TYPE (arg); +} + +/* Return true if argument I is some form of scalar value. */ +bool +function_resolver::scalar_argument_p (unsigned int i) +{ + tree type = get_argument_type (i); + return (INTEGRAL_TYPE_P (type) + /* Allow pointer types, leaving the frontend to warn where + necessary. */ + || POINTER_TYPE_P (type) + || SCALAR_FLOAT_TYPE_P (type)); +} + +/* Report that the function has no form that takes type suffix TYPE. + Return error_mark_node. 
*/ +tree +function_resolver::report_no_such_form (type_suffix_index type) +{ + error_at (location, "%qE has no form that takes %qT arguments", + fndecl, get_vector_type (type)); + return error_mark_node; +} + +/* Silently check whether there is an instance of the function with the + mode suffix given by MODE and the type suffixes given by TYPE0 and TYPE1. + Return its function decl if so, otherwise return null. */ +tree +function_resolver::lookup_form (mode_suffix_index mode, + type_suffix_index type0, + type_suffix_index type1) +{ + type_suffix_pair types = { type0, type1 }; + function_instance instance (base_name, base, shape, mode, types, pred); + registered_function *rfn + = function_table->find_with_hash (instance, instance.hash ()); + return rfn ? rfn->decl : NULL_TREE; +} + +/* Resolve the function to one with the mode suffix given by MODE and the + type suffixes given by TYPE0 and TYPE1. Return its function decl on + success, otherwise report an error and return error_mark_node. */ +tree +function_resolver::resolve_to (mode_suffix_index mode, + type_suffix_index type0, + type_suffix_index type1) +{ + tree res = lookup_form (mode, type0, type1); + if (!res) + { + if (type1 == NUM_TYPE_SUFFIXES) + return report_no_such_form (type0); + if (type0 == type_suffix_ids[0]) + return report_no_such_form (type1); + /* To be filled in when we have other cases. */ + gcc_unreachable (); + } + return res; +} + +/* Require argument ARGNO to be a single vector or a tuple of NUM_VECTORS + vectors; NUM_VECTORS is 1 for the former. Return the associated type + suffix on success, using TYPE_SUFFIX_b for predicates. Report an error + and return NUM_TYPE_SUFFIXES on failure. 
*/ +type_suffix_index +function_resolver::infer_vector_or_tuple_type (unsigned int argno, + unsigned int num_vectors) +{ + tree actual = get_argument_type (argno); + if (actual == error_mark_node) + return NUM_TYPE_SUFFIXES; + + /* A linear search should be OK here, since the code isn't hot and + the number of types is only small. */ + for (unsigned int size_i = 0; size_i < MAX_TUPLE_SIZE; ++size_i) + for (unsigned int suffix_i = 0; suffix_i < NUM_TYPE_SUFFIXES; ++suffix_i) + { + vector_type_index type_i = type_suffixes[suffix_i].vector_type; + tree type = acle_vector_types[size_i][type_i]; + if (type && matches_type_p (type, actual)) + { + if (size_i + 1 == num_vectors) + return type_suffix_index (suffix_i); + + if (num_vectors == 1) + error_at (location, "passing %qT to argument %d of %qE, which" + " expects a single MVE vector rather than a tuple", + actual, argno + 1, fndecl); + else if (size_i == 0 && type_i != VECTOR_TYPE_mve_pred16_t) + /* num_vectors is always != 1, so the singular isn't needed. */ + error_n (location, num_vectors, "%qT%d%qE%d", + "passing single vector %qT to argument %d" + " of %qE, which expects a tuple of %d vectors", + actual, argno + 1, fndecl, num_vectors); + else + /* num_vectors is always != 1, so the singular isn't needed. */ + error_n (location, num_vectors, "%qT%d%qE%d", + "passing %qT to argument %d of %qE, which" + " expects a tuple of %d vectors", actual, argno + 1, + fndecl, num_vectors); + return NUM_TYPE_SUFFIXES; + } + } + + if (num_vectors == 1) + error_at (location, "passing %qT to argument %d of %qE, which" + " expects an MVE vector type", actual, argno + 1, fndecl); + else + error_at (location, "passing %qT to argument %d of %qE, which" + " expects an MVE tuple type", actual, argno + 1, fndecl); + return NUM_TYPE_SUFFIXES; +} + +/* Require argument ARGNO to have some form of vector type. Return the + associated type suffix on success, using TYPE_SUFFIX_b for predicates. 
+ Report an error and return NUM_TYPE_SUFFIXES on failure. */ +type_suffix_index +function_resolver::infer_vector_type (unsigned int argno) +{ + return infer_vector_or_tuple_type (argno, 1); +} + +/* Require argument ARGNO to be a vector or scalar argument. Return true + if it is, otherwise report an appropriate error. */ +bool +function_resolver::require_vector_or_scalar_type (unsigned int argno) +{ + tree actual = get_argument_type (argno); + if (actual == error_mark_node) + return false; + + if (!scalar_argument_p (argno) && !VECTOR_TYPE_P (actual)) + { + error_at (location, "passing %qT to argument %d of %qE, which" + " expects a vector or scalar type", actual, argno + 1, fndecl); + return false; + } + + return true; +} + +/* Require argument ARGNO to have vector type TYPE, in cases where this + requirement holds for all uses of the function. Return true if the + argument has the right form, otherwise report an appropriate error. */ +bool +function_resolver::require_vector_type (unsigned int argno, + vector_type_index type) +{ + tree expected = acle_vector_types[0][type]; + tree actual = get_argument_type (argno); + if (actual == error_mark_node) + return false; + + if (!matches_type_p (expected, actual)) + { + error_at (location, "passing %qT to argument %d of %qE, which" + " expects %qT", actual, argno + 1, fndecl, expected); + return false; + } + return true; +} + +/* Like require_vector_type, but TYPE is inferred from previous arguments + rather than being a fixed part of the function signature. This changes + the nature of the error messages. 
*/ +bool +function_resolver::require_matching_vector_type (unsigned int argno, + type_suffix_index type) +{ + type_suffix_index new_type = infer_vector_type (argno); + if (new_type == NUM_TYPE_SUFFIXES) + return false; + + if (type != new_type) + { + error_at (location, "passing %qT to argument %d of %qE, but" + " previous arguments had type %qT", + get_vector_type (new_type), argno + 1, fndecl, + get_vector_type (type)); + return false; + } + return true; +} + +/* Require argument ARGNO to be a vector type with the following properties: + + - the type class must be the same as FIRST_TYPE's if EXPECTED_TCLASS + is SAME_TYPE_CLASS, otherwise it must be EXPECTED_TCLASS itself. + + - the element size must be: + + - the same as FIRST_TYPE's if EXPECTED_BITS == SAME_SIZE + - half of FIRST_TYPE's if EXPECTED_BITS == HALF_SIZE + - a quarter of FIRST_TYPE's if EXPECTED_BITS == QUARTER_SIZE + - EXPECTED_BITS itself otherwise + + Return true if the argument has the required type, otherwise report + an appropriate error. + + FIRST_ARGNO is the first argument that is known to have type FIRST_TYPE. + Usually it comes before ARGNO, but sometimes it is more natural to resolve + arguments out of order. + + If the required properties depend on FIRST_TYPE then both FIRST_ARGNO and + ARGNO contribute to the resolution process. If the required properties + are fixed, only FIRST_ARGNO contributes to the resolution process. + + This function is a bit of a Swiss army knife. The complication comes + from trying to give good error messages when FIRST_ARGNO and ARGNO are + inconsistent, since either of them might be wrong. */ +bool function_resolver:: +require_derived_vector_type (unsigned int argno, + unsigned int first_argno, + type_suffix_index first_type, + type_class_index expected_tclass, + unsigned int expected_bits) +{ + /* If the type needs to match FIRST_ARGNO exactly, use the preferred + error message for that case. 
The VECTOR_TYPE_P test excludes tuple + types, which we handle below instead. */ + bool both_vectors_p = VECTOR_TYPE_P (get_argument_type (first_argno)); + if (both_vectors_p + && expected_tclass == SAME_TYPE_CLASS + && expected_bits == SAME_SIZE) + { + /* There's no need to resolve this case out of order. */ + gcc_assert (argno > first_argno); + return require_matching_vector_type (argno, first_type); + } + + /* Use FIRST_TYPE to get the expected type class and element size. */ + type_class_index orig_expected_tclass = expected_tclass; + if (expected_tclass == NUM_TYPE_CLASSES) + expected_tclass = type_suffixes[first_type].tclass; + + unsigned int orig_expected_bits = expected_bits; + if (expected_bits == SAME_SIZE) + expected_bits = type_suffixes[first_type].element_bits; + else if (expected_bits == HALF_SIZE) + expected_bits = type_suffixes[first_type].element_bits / 2; + else if (expected_bits == QUARTER_SIZE) + expected_bits = type_suffixes[first_type].element_bits / 4; + + /* If the expected type doesn't depend on FIRST_TYPE at all, + just check for the fixed choice of vector type. */ + if (expected_tclass == orig_expected_tclass + && expected_bits == orig_expected_bits) + { + const type_suffix_info &expected_suffix + = type_suffixes[find_type_suffix (expected_tclass, expected_bits)]; + return require_vector_type (argno, expected_suffix.vector_type); + } + + /* Require the argument to be some form of MVE vector type, + without being specific about the type of vector we want. */ + type_suffix_index actual_type = infer_vector_type (argno); + if (actual_type == NUM_TYPE_SUFFIXES) + return false; + + /* Exit now if we got the right type. 
*/ + bool tclass_ok_p = (type_suffixes[actual_type].tclass == expected_tclass); + bool size_ok_p = (type_suffixes[actual_type].element_bits == expected_bits); + if (tclass_ok_p && size_ok_p) + return true; + + /* First look for cases in which the actual type contravenes a fixed + size requirement, without having to refer to FIRST_TYPE. */ + if (!size_ok_p && expected_bits == orig_expected_bits) + { + error_at (location, "passing %qT to argument %d of %qE, which" + " expects a vector of %d-bit elements", + get_vector_type (actual_type), argno + 1, fndecl, + expected_bits); + return false; + } + + /* Likewise for a fixed type class requirement. This is only ever + needed for signed and unsigned types, so don't create unnecessary + translation work for other type classes. */ + if (!tclass_ok_p && orig_expected_tclass == TYPE_signed) + { + error_at (location, "passing %qT to argument %d of %qE, which" + " expects a vector of signed integers", + get_vector_type (actual_type), argno + 1, fndecl); + return false; + } + if (!tclass_ok_p && orig_expected_tclass == TYPE_unsigned) + { + error_at (location, "passing %qT to argument %d of %qE, which" + " expects a vector of unsigned integers", + get_vector_type (actual_type), argno + 1, fndecl); + return false; + } + + /* Make sure that FIRST_TYPE itself is sensible before using it + as a basis for an error message. */ + if (resolve_to (mode_suffix_id, first_type) == error_mark_node) + return false; + + /* If the arguments have consistent type classes, but a link between + the sizes has been broken, try to describe the error in those terms. 
*/ + if (both_vectors_p && tclass_ok_p && orig_expected_bits == SAME_SIZE) + { + if (argno < first_argno) + { + std::swap (argno, first_argno); + std::swap (actual_type, first_type); + } + error_at (location, "arguments %d and %d of %qE must have the" + " same element size, but the values passed here have type" + " %qT and %qT respectively", first_argno + 1, argno + 1, + fndecl, get_vector_type (first_type), + get_vector_type (actual_type)); + return false; + } + + /* Likewise in reverse: look for cases in which the sizes are consistent + but a link between the type classes has been broken. */ + if (both_vectors_p + && size_ok_p + && orig_expected_tclass == SAME_TYPE_CLASS + && type_suffixes[first_type].integer_p + && type_suffixes[actual_type].integer_p) + { + if (argno < first_argno) + { + std::swap (argno, first_argno); + std::swap (actual_type, first_type); + } + error_at (location, "arguments %d and %d of %qE must have the" + " same signedness, but the values passed here have type" + " %qT and %qT respectively", first_argno + 1, argno + 1, + fndecl, get_vector_type (first_type), + get_vector_type (actual_type)); + return false; + } + + /* The two arguments are wildly inconsistent. */ + type_suffix_index expected_type + = find_type_suffix (expected_tclass, expected_bits); + error_at (location, "passing %qT instead of the expected %qT to argument" + " %d of %qE, after passing %qT to argument %d", + get_vector_type (actual_type), get_vector_type (expected_type), + argno + 1, fndecl, get_argument_type (first_argno), + first_argno + 1); + return false; +} + +/* Require argument ARGNO to be a (possibly variable) scalar, expecting it + to have the following properties: + + - the type class must be the same as for type suffix 0 if EXPECTED_TCLASS + is SAME_TYPE_CLASS, otherwise it must be EXPECTED_TCLASS itself. + + - the element size must be the same as for type suffix 0 if EXPECTED_BITS + is SAME_TYPE_SIZE, otherwise it must be EXPECTED_BITS itself. 
+ + Return true if the argument is valid, otherwise report an appropriate error. + + Note that we don't check whether the scalar type actually has the required + properties, since that's subject to implicit promotions and conversions. + Instead we just use the expected properties to tune the error message. */ +bool function_resolver:: +require_derived_scalar_type (unsigned int argno, + type_class_index expected_tclass, + unsigned int expected_bits) +{ + gcc_assert (expected_tclass == SAME_TYPE_CLASS + || expected_tclass == TYPE_signed + || expected_tclass == TYPE_unsigned); + + /* If the expected type doesn't depend on the type suffix at all, + just check for the fixed choice of scalar type. */ + if (expected_tclass != SAME_TYPE_CLASS && expected_bits != SAME_SIZE) + { + type_suffix_index expected_type + = find_type_suffix (expected_tclass, expected_bits); + return require_scalar_type (argno, get_scalar_type_name (expected_type)); + } + + if (scalar_argument_p (argno)) + return true; + + if (expected_tclass == SAME_TYPE_CLASS) + /* It doesn't really matter whether the element is expected to be + the same size as type suffix 0. */ + error_at (location, "passing %qT to argument %d of %qE, which" + " expects a scalar element", get_argument_type (argno), + argno + 1, fndecl); + else + /* It doesn't seem useful to distinguish between signed and unsigned + scalars here. */ + error_at (location, "passing %qT to argument %d of %qE, which" + " expects a scalar integer", get_argument_type (argno), + argno + 1, fndecl); + return false; +} + +/* Require argument ARGNO to be suitable for an integer constant expression. + Return true if it is, otherwise report an appropriate error. + + function_checker checks whether the argument is actually constant and + has a suitable range. The reason for distinguishing immediate arguments + here is that it provides more consistent error messages than + require_scalar_type would.
*/ +bool +function_resolver::require_integer_immediate (unsigned int argno) +{ + if (!scalar_argument_p (argno)) + { + report_non_ice (location, fndecl, argno); + return false; + } + return true; +} + +/* Require argument ARGNO to be a (possibly variable) scalar, using EXPECTED + as the name of its expected type. Return true if the argument has the + right form, otherwise report an appropriate error. */ +bool +function_resolver::require_scalar_type (unsigned int argno, + const char *expected) +{ + if (!scalar_argument_p (argno)) + { + error_at (location, "passing %qT to argument %d of %qE, which" + " expects %qs", get_argument_type (argno), argno + 1, + fndecl, expected); + return false; + } + return true; +} + +/* Require the function to have exactly EXPECTED arguments. Return true + if it does, otherwise report an appropriate error. */ +bool +function_resolver::check_num_arguments (unsigned int expected) +{ + if (m_arglist.length () < expected) + error_at (location, "too few arguments to function %qE", fndecl); + else if (m_arglist.length () > expected) + error_at (location, "too many arguments to function %qE", fndecl); + return m_arglist.length () == expected; +} + +/* If the function is predicated, check that the last argument is a + suitable predicate. Also check that there are NOPS further + arguments before any predicate, but don't check what they are. + + Return true on success, otherwise report a suitable error. + When returning true: + + - set I to the number of the last unchecked argument. + - set NARGS to the total number of arguments. */ +bool +function_resolver::check_gp_argument (unsigned int nops, + unsigned int &i, unsigned int &nargs) +{ + i = nops - 1; + if (pred != PRED_none) + { + switch (pred) + { + case PRED_m: + /* Add first inactive argument if needed, and final predicate. */ + if (has_inactive_argument ()) + nargs = nops + 2; + else + nargs = nops + 1; + break; + + case PRED_p: + case PRED_x: + /* Add final predicate. 
*/ + nargs = nops + 1; + break; + + default: + gcc_unreachable (); + } + + if (!check_num_arguments (nargs) + || !require_vector_type (nargs - 1, VECTOR_TYPE_mve_pred16_t)) + return false; + + i = nargs - 2; + } + else + { + nargs = nops; + if (!check_num_arguments (nargs)) + return false; + } + + return true; +} + +/* Finish resolving a function whose final argument can be a vector + or a scalar, with the function having an implicit "_n" suffix + in the latter case. This "_n" form might only exist for certain + type suffixes. + + ARGNO is the index of the final argument. The inferred type suffix + was obtained from argument FIRST_ARGNO, which has type FIRST_TYPE. + EXPECTED_TCLASS and EXPECTED_BITS describe the expected properties + of the final vector or scalar argument, in the same way as for + require_derived_vector_type. INFERRED_TYPE is the inferred type + suffix itself, or NUM_TYPE_SUFFIXES if it's the same as FIRST_TYPE. + + Return the function decl of the resolved function on success, + otherwise report a suitable error and return error_mark_node. */ +tree function_resolver:: +finish_opt_n_resolution (unsigned int argno, unsigned int first_argno, + type_suffix_index first_type, + type_class_index expected_tclass, + unsigned int expected_bits, + type_suffix_index inferred_type) +{ + if (inferred_type == NUM_TYPE_SUFFIXES) + inferred_type = first_type; + tree scalar_form = lookup_form (MODE_n, inferred_type); + + /* Allow the final argument to be scalar, if an _n form exists. */ + if (scalar_argument_p (argno)) + { + if (scalar_form) + return scalar_form; + + /* Check the vector form normally. If that succeeds, raise an + error about having no corresponding _n form. 
*/ + tree res = resolve_to (mode_suffix_id, inferred_type); + if (res != error_mark_node) + error_at (location, "passing %qT to argument %d of %qE, but its" + " %qT form does not accept scalars", + get_argument_type (argno), argno + 1, fndecl, + get_vector_type (first_type)); + return error_mark_node; + } + + /* If an _n form does exist, provide a more accurate message than + require_derived_vector_type would for arguments that are neither + vectors nor scalars. */ + if (scalar_form && !require_vector_or_scalar_type (argno)) + return error_mark_node; + + /* Check for the correct vector type. */ + if (!require_derived_vector_type (argno, first_argno, first_type, + expected_tclass, expected_bits)) + return error_mark_node; + + return resolve_to (mode_suffix_id, inferred_type); +} + +/* Resolve a (possibly predicated) unary function. If the function uses + merge predication or if TREAT_AS_MERGE_P is true, there is an extra + vector argument before the governing predicate that specifies the + values of inactive elements. This argument has the following + properties: + + - the type class must be the same as for active elements if MERGE_TCLASS + is SAME_TYPE_CLASS, otherwise it must be MERGE_TCLASS itself. + + - the element size must be the same as for active elements if MERGE_BITS + is SAME_TYPE_SIZE, otherwise it must be MERGE_BITS itself. + + Return the function decl of the resolved function on success, + otherwise report a suitable error and return error_mark_node. */ +tree +function_resolver::resolve_unary (type_class_index merge_tclass, + unsigned int merge_bits, + bool treat_as_merge_p) +{ + type_suffix_index type; + if (pred == PRED_m || treat_as_merge_p) + { + if (!check_num_arguments (3)) + return error_mark_node; + if (merge_tclass == SAME_TYPE_CLASS && merge_bits == SAME_SIZE) + { + /* The inactive elements are the same as the active elements, + so we can use normal left-to-right resolution. 
*/ + if ((type = infer_vector_type (0)) == NUM_TYPE_SUFFIXES + /* Predicates are the last argument. */ + || !require_vector_type (2, VECTOR_TYPE_mve_pred16_t) + || !require_matching_vector_type (1, type)) + return error_mark_node; + } + else + { + /* The inactive element type is a function of the active one, + so resolve the active one first. */ + if (!require_vector_type (1, VECTOR_TYPE_mve_pred16_t) + || (type = infer_vector_type (2)) == NUM_TYPE_SUFFIXES + || !require_derived_vector_type (0, 2, type, merge_tclass, + merge_bits)) + return error_mark_node; + } + } + else + { + /* We just need to check the predicate (if any) and the single + vector argument. */ + unsigned int i, nargs; + if (!check_gp_argument (1, i, nargs) + || (type = infer_vector_type (i)) == NUM_TYPE_SUFFIXES) + return error_mark_node; + } + + /* Handle convert-like functions in which the first type suffix is + explicit. */ + if (type_suffix_ids[0] != NUM_TYPE_SUFFIXES) + return resolve_to (mode_suffix_id, type_suffix_ids[0], type); + + return resolve_to (mode_suffix_id, type); +} + +/* Resolve a (possibly predicated) unary function taking a scalar + argument (_n suffix). If the function uses merge predication, + there is an extra vector argument in the first position that specifies + the values of inactive elements, and the governing predicate comes last. + + Return the function decl of the resolved function on success, + otherwise report a suitable error and return error_mark_node. */ +tree +function_resolver::resolve_unary_n () +{ + type_suffix_index type; + + /* Currently we only support overrides for _m (vdupq). */ + if (pred != PRED_m) + return error_mark_node; + + if (pred == PRED_m) + { + if (!check_num_arguments (3)) + return error_mark_node; + + /* The inactive elements are the same as the active elements, + so we can use normal left-to-right resolution. */ + if ((type = infer_vector_type (0)) == NUM_TYPE_SUFFIXES + /* Predicates are the last argument.
*/ + || !require_vector_type (2, VECTOR_TYPE_mve_pred16_t)) + return error_mark_node; + } + + /* Make sure the argument is scalar. */ + tree scalar_form = lookup_form (MODE_n, type); + + if (scalar_argument_p (1) && scalar_form) + return scalar_form; + + return error_mark_node; +} + +/* Resolve a (possibly predicated) function that takes NOPS like-typed + vector arguments followed by NIMM integer immediates. Return the + function decl of the resolved function on success, otherwise report + a suitable error and return error_mark_node. */ +tree +function_resolver::resolve_uniform (unsigned int nops, unsigned int nimm) +{ + unsigned int i, nargs; + type_suffix_index type; + if (!check_gp_argument (nops + nimm, i, nargs) + || (type = infer_vector_type (0)) == NUM_TYPE_SUFFIXES) + return error_mark_node; + + unsigned int last_arg = i + 1 - nimm; + for (i = 0; i < last_arg; i++) + if (!require_matching_vector_type (i, type)) + return error_mark_node; + + for (i = last_arg; i < nargs; ++i) + if (!require_integer_immediate (i)) + return error_mark_node; + + return resolve_to (mode_suffix_id, type); +} + +/* Resolve a (possibly predicated) function that offers a choice between + taking: + + - NOPS like-typed vector arguments or + - NOPS - 1 like-typed vector arguments followed by a scalar argument + + Return the function decl of the resolved function on success, + otherwise report a suitable error and return error_mark_node. */ +tree +function_resolver::resolve_uniform_opt_n (unsigned int nops) +{ + unsigned int i, nargs; + type_suffix_index type; + if (!check_gp_argument (nops, i, nargs) + /* Unary operators should use resolve_unary, so using i - 1 is + safe. */ + || (type = infer_vector_type (i - 1)) == NUM_TYPE_SUFFIXES) + return error_mark_node; + + /* Skip the last argument, which may be scalar.
*/ + unsigned int last_arg = i; + for (i = 0; i < last_arg; i++) + if (!require_matching_vector_type (i, type)) + return error_mark_node; + + return finish_opt_n_resolution (last_arg, 0, type); +} + +/* If the call is erroneous, report an appropriate error and return + error_mark_node. Otherwise, if the function is overloaded, return + the decl of the non-overloaded function. Return NULL_TREE otherwise, + indicating that the call should be processed in the normal way. */ +tree +function_resolver::resolve () +{ + return shape->resolve (*this); +} + +function_checker::function_checker (location_t location, + const function_instance &instance, + tree fndecl, tree fntype, + unsigned int nargs, tree *args) + : function_call_info (location, instance, fndecl), + m_fntype (fntype), m_nargs (nargs), m_args (args) +{ + if (instance.has_inactive_argument ()) + m_base_arg = 1; + else + m_base_arg = 0; +} + +/* Return true if argument ARGNO exists, which it might not for + erroneous calls. It is safe to wave through checks if this + function returns false. */ +bool +function_checker::argument_exists_p (unsigned int argno) +{ + gcc_assert (argno < (unsigned int) type_num_arguments (m_fntype)); + return argno < m_nargs; +} + +/* Check that argument ARGNO is an integer constant expression and + store its value in VALUE_OUT if so. The caller should first + check that argument ARGNO exists. */ +bool +function_checker::require_immediate (unsigned int argno, + HOST_WIDE_INT &value_out) +{ + gcc_assert (argno < m_nargs); + tree arg = m_args[argno]; + + /* The type and range are unsigned, so read the argument as an + unsigned rather than signed HWI. */ + if (!tree_fits_uhwi_p (arg)) + { + report_non_ice (location, fndecl, argno); + return false; + } + + /* ...but treat VALUE_OUT as signed for error reporting, since printing + -1 is more user-friendly than the maximum uint64_t value.
*/ + value_out = tree_to_uhwi (arg); + return true; +} + +/* Check that argument REL_ARGNO is an integer constant expression that has + a valid value for enumeration type TYPE. REL_ARGNO counts from the end + of the predication arguments. */ +bool +function_checker::require_immediate_enum (unsigned int rel_argno, tree type) +{ + unsigned int argno = m_base_arg + rel_argno; + if (!argument_exists_p (argno)) + return true; + + HOST_WIDE_INT actual; + if (!require_immediate (argno, actual)) + return false; + + for (tree entry = TYPE_VALUES (type); entry; entry = TREE_CHAIN (entry)) + { + /* The value is an INTEGER_CST for C and a CONST_DECL wrapper + around an INTEGER_CST for C++. */ + tree value = TREE_VALUE (entry); + if (TREE_CODE (value) == CONST_DECL) + value = DECL_INITIAL (value); + if (wi::to_widest (value) == actual) + return true; + } + + report_not_enum (location, fndecl, argno, actual, type); + return false; +} + +/* Check that argument REL_ARGNO is an integer constant expression in the + range [MIN, MAX]. REL_ARGNO counts from the end of the predication + arguments. */ +bool +function_checker::require_immediate_range (unsigned int rel_argno, + HOST_WIDE_INT min, + HOST_WIDE_INT max) +{ + unsigned int argno = m_base_arg + rel_argno; + if (!argument_exists_p (argno)) + return true; + + /* Required because of the tree_to_uhwi -> HOST_WIDE_INT conversion + in require_immediate. */ + gcc_assert (min >= 0 && min <= max); + HOST_WIDE_INT actual; + if (!require_immediate (argno, actual)) + return false; + + if (!IN_RANGE (actual, min, max)) + { + report_out_of_range (location, fndecl, argno, actual, min, max); + return false; + } + + return true; +} + +/* Perform semantic checks on the call. Return true if the call is valid, + otherwise report a suitable error. 
*/ +bool +function_checker::check () +{ + function_args_iterator iter; + tree type; + unsigned int i = 0; + FOREACH_FUNCTION_ARGS (m_fntype, type, iter) + { + if (type == void_type_node || i >= m_nargs) + break; + + if (i >= m_base_arg + && TREE_CODE (type) == ENUMERAL_TYPE + && !require_immediate_enum (i - m_base_arg, type)) + return false; + + i += 1; + } + + return shape->check (*this); +} + +gimple_folder::gimple_folder (const function_instance &instance, tree fndecl, + gcall *call_in) + : function_call_info (gimple_location (call_in), instance, fndecl), + call (call_in), lhs (gimple_call_lhs (call_in)) +{ +} + +/* Try to fold the call. Return the new statement on success and null + on failure. */ +gimple * +gimple_folder::fold () +{ + /* Don't fold anything when MVE is disabled; emit an error during + expansion instead. */ + if (!TARGET_HAVE_MVE) + return NULL; + + /* Punt if the function has a return type and no result location is + provided. The attributes should allow target-independent code to + remove the calls if appropriate. */ + if (!lhs && TREE_TYPE (gimple_call_fntype (call)) != void_type_node) + return NULL; + + return base->fold (*this); +} + +function_expander::function_expander (const function_instance &instance, + tree fndecl, tree call_expr_in, + rtx possible_target_in) + : function_call_info (EXPR_LOCATION (call_expr_in), instance, fndecl), + call_expr (call_expr_in), possible_target (possible_target_in) +{ +} + +/* Return the handler of direct optab OP for type suffix SUFFIX_I. */ +insn_code +function_expander::direct_optab_handler (optab op, unsigned int suffix_i) +{ + return ::direct_optab_handler (op, vector_mode (suffix_i)); +} + +/* For a function that does the equivalent of: + + OUTPUT = COND ? FN (INPUTS) : FALLBACK; + + return the value of FALLBACK. + + MODE is the mode of OUTPUT. + MERGE_ARGNO is the argument that provides FALLBACK for _m functions, + or DEFAULT_MERGE_ARGNO if we should apply the usual rules. 
+ + ARGNO is the caller's index into args. If the returned value is + argument 0 (as for unary _m operations), increment ARGNO past the + returned argument. */ +rtx +function_expander::get_fallback_value (machine_mode mode, + unsigned int merge_argno, + unsigned int &argno) +{ + if (pred == PRED_z) + return CONST0_RTX (mode); + + gcc_assert (pred == PRED_m || pred == PRED_x); + + if (merge_argno == 0) + return args[argno++]; + + return args[merge_argno]; +} + +/* Return a REG rtx that can be used for the result of the function, + using the preferred target if suitable. */ +rtx +function_expander::get_reg_target () +{ + machine_mode target_mode = TYPE_MODE (TREE_TYPE (TREE_TYPE (fndecl))); + if (!possible_target || GET_MODE (possible_target) != target_mode) + possible_target = gen_reg_rtx (target_mode); + return possible_target; +} + +/* Add an output operand to the instruction we're building, which has + code ICODE. Bind the output to the preferred target rtx if possible. */ +void +function_expander::add_output_operand (insn_code icode) +{ + unsigned int opno = m_ops.length (); + machine_mode mode = insn_data[icode].operand[opno].mode; + m_ops.safe_grow (opno + 1, true); + create_output_operand (&m_ops.last (), possible_target, mode); +} + +/* Add an input operand to the instruction we're building, which has + code ICODE. Calculate the value of the operand as follows: + + - If the operand is a predicate, coerce X to have the + mode that the instruction expects. + + - Otherwise use X directly. The expand machinery checks that X has + the right mode for the instruction. 
*/ +void +function_expander::add_input_operand (insn_code icode, rtx x) +{ + unsigned int opno = m_ops.length (); + const insn_operand_data &operand = insn_data[icode].operand[opno]; + machine_mode mode = operand.mode; + if (mode == VOIDmode) + { + /* The only allowable use of VOIDmode is the wildcard + arm_any_register_operand, which is used to avoid + combinatorial explosion in the reinterpret patterns. */ + gcc_assert (operand.predicate == arm_any_register_operand); + mode = GET_MODE (x); + } + else if (VALID_MVE_PRED_MODE (mode)) + x = gen_lowpart (mode, x); + + m_ops.safe_grow (m_ops.length () + 1, true); + create_input_operand (&m_ops.last (), x, mode); +} + +/* Add an integer operand with value X to the instruction. */ +void +function_expander::add_integer_operand (HOST_WIDE_INT x) +{ + m_ops.safe_grow (m_ops.length () + 1, true); + create_integer_operand (&m_ops.last (), x); +} + +/* Generate instruction ICODE, given that its operands have already + been added to M_OPS. Return the value of the first operand. */ +rtx +function_expander::generate_insn (insn_code icode) +{ + expand_insn (icode, m_ops.length (), m_ops.address ()); + return function_returns_void_p () ? const0_rtx : m_ops[0].value; +} + +/* Implement the call using instruction ICODE, with a 1:1 mapping between + arguments and input operands. */ +rtx +function_expander::use_exact_insn (insn_code icode) +{ + unsigned int nops = insn_data[icode].n_operands; + if (!function_returns_void_p ()) + { + add_output_operand (icode); + nops -= 1; + } + for (unsigned int i = 0; i < nops; ++i) + add_input_operand (icode, args[i]); + return generate_insn (icode); +} + +/* Implement the call using instruction ICODE, which does not use a + predicate. */ +rtx +function_expander::use_unpred_insn (insn_code icode) +{ + gcc_assert (pred == PRED_none); + /* Discount the output operand. 
*/ + unsigned int nops = insn_data[icode].n_operands - 1; + unsigned int i = 0; + + add_output_operand (icode); + for (; i < nops; ++i) + add_input_operand (icode, args[i]); + + return generate_insn (icode); +} + +/* Implement the call using instruction ICODE, which is a predicated + operation that returns arbitrary values for inactive lanes. */ +rtx +function_expander::use_pred_x_insn (insn_code icode) +{ + gcc_assert (pred == PRED_x); + unsigned int nops = args.length (); + + add_output_operand (icode); + /* Use first operand as arbitrary inactive input. */ + add_input_operand (icode, possible_target); + emit_clobber (possible_target); + /* Copy remaining arguments, including the final predicate. */ + for (unsigned int i = 0; i < nops; ++i) + add_input_operand (icode, args[i]); + + return generate_insn (icode); +} + +/* Implement the call using instruction ICODE, which does the equivalent of: + + OUTPUT = COND ? FN (INPUTS) : FALLBACK; + + The instruction operands are in the order above: OUTPUT, COND, INPUTS + and FALLBACK. MERGE_ARGNO is the argument that provides FALLBACK for _m + functions, or DEFAULT_MERGE_ARGNO if we should apply the usual rules. */ +rtx +function_expander::use_cond_insn (insn_code icode, unsigned int merge_argno) +{ + /* At present we never need to handle PRED_none, which would involve + creating a new predicate rather than using one supplied by the user. */ + gcc_assert (pred != PRED_none); + /* For MVE, we only handle PRED_m at present. */ + gcc_assert (pred == PRED_m); + + /* Discount the output, predicate and fallback value. 
*/ + unsigned int nops = insn_data[icode].n_operands - 3; + machine_mode mode = insn_data[icode].operand[0].mode; + + unsigned int opno = 0; + rtx fallback_arg = NULL_RTX; + fallback_arg = get_fallback_value (mode, merge_argno, opno); + rtx pred_arg = args[nops + 1]; + + add_output_operand (icode); + add_input_operand (icode, fallback_arg); + for (unsigned int i = 0; i < nops; ++i) + add_input_operand (icode, args[opno + i]); + add_input_operand (icode, pred_arg); + return generate_insn (icode); +} + +/* Implement the call using a normal unpredicated optab for PRED_none. + + <optab> corresponds to: + + - CODE_FOR_SINT for signed integers + - CODE_FOR_UINT for unsigned integers + - CODE_FOR_FP for floating-point values */ +rtx +function_expander::map_to_rtx_codes (rtx_code code_for_sint, + rtx_code code_for_uint, + rtx_code code_for_fp) +{ + gcc_assert (pred == PRED_none); + rtx_code code = type_suffix (0).integer_p ? + (type_suffix (0).unsigned_p ? code_for_uint : code_for_sint) + : code_for_fp; + insn_code icode = direct_optab_handler (code_to_optab (code), 0); + if (icode == CODE_FOR_nothing) + gcc_unreachable (); + + return use_unpred_insn (icode); +} + +/* Expand the call and return its lhs. */ +rtx +function_expander::expand () +{ + unsigned int nargs = call_expr_nargs (call_expr); + args.reserve (nargs); + for (unsigned int i = 0; i < nargs; ++i) + args.quick_push (expand_normal (CALL_EXPR_ARG (call_expr, i))); + + return base->expand (*this); +} + +/* If we're implementing manual overloading, check whether the MVE + function with subcode CODE is overloaded, and if so attempt to + determine the corresponding non-overloaded function. The call + occurs at location LOCATION and has the arguments given by ARGLIST. + + If the call is erroneous, report an appropriate error and return + error_mark_node. Otherwise, if the function is overloaded, return + the decl of the non-overloaded function. 
Return NULL_TREE otherwise, + indicating that the call should be processed in the normal way. */ +tree +resolve_overloaded_builtin (location_t location, unsigned int code, + vec<tree, va_gc> *arglist) +{ + if (code >= vec_safe_length (registered_functions)) + return NULL_TREE; + + registered_function &rfn = *(*registered_functions)[code]; + if (rfn.overloaded_p) + return function_resolver (location, rfn.instance, rfn.decl, + *arglist).resolve (); + return NULL_TREE; +} + +/* Perform any semantic checks needed for a call to the MVE function + with subcode CODE, such as testing for integer constant expressions. + The call occurs at location LOCATION and has NARGS arguments, + given by ARGS. FNDECL is the original function decl, before + overload resolution. + + Return true if the call is valid, otherwise report a suitable error. */ +bool +check_builtin_call (location_t location, vec<location_t>, unsigned int code, + tree fndecl, unsigned int nargs, tree *args) +{ + const registered_function &rfn = *(*registered_functions)[code]; + if (!check_requires_float (location, rfn.decl, rfn.requires_float)) + return false; + + return function_checker (location, rfn.instance, fndecl, + TREE_TYPE (rfn.decl), nargs, args).check (); +} + +/* Attempt to fold STMT, given that it's a call to the MVE function + with subcode CODE. Return the new statement on success and null + on failure. Insert any other new statements at GSI. */ +gimple * +gimple_fold_builtin (unsigned int code, gcall *stmt) +{ + registered_function &rfn = *(*registered_functions)[code]; + return gimple_folder (rfn.instance, rfn.decl, stmt).fold (); +} + +/* Expand a call to the MVE function with subcode CODE. EXP is the call + expression and TARGET is the preferred location for the result. + Return the value of the lhs. 
   */
+rtx
+expand_builtin (unsigned int code, tree exp, rtx target)
+{
+  registered_function &rfn = *(*registered_functions)[code];
+  if (!check_requires_float (EXPR_LOCATION (exp), rfn.decl,
+			     rfn.requires_float))
+    return target;
+  return function_expander (rfn.instance, rfn.decl, exp, target).expand ();
+}
+
+} /* end namespace arm_mve */
+
+using namespace arm_mve;
+
+inline void
+gt_ggc_mx (function_instance *)
+{
+}
+
+inline void
+gt_pch_nx (function_instance *)
+{
+}
+
+inline void
+gt_pch_nx (function_instance *, gt_pointer_operator, void *)
+{
+}
 
 #include "gt-arm-mve-builtins.h"
diff --git a/gcc/config/arm/arm-mve-builtins.def b/gcc/config/arm/arm-mve-builtins.def
index 69f3f81b473..49d07364fa2 100644
--- a/gcc/config/arm/arm-mve-builtins.def
+++ b/gcc/config/arm/arm-mve-builtins.def
@@ -17,10 +17,25 @@
    along with GCC; see the file COPYING3.  If not see
    <http://www.gnu.org/licenses/>.  */
 
+#ifndef DEF_MVE_MODE
+#define DEF_MVE_MODE(A, B, C, D)
+#endif
+
 #ifndef DEF_MVE_TYPE
-#error "arm-mve-builtins.def included without defining DEF_MVE_TYPE"
+#define DEF_MVE_TYPE(A, B)
+#endif
+
+#ifndef DEF_MVE_TYPE_SUFFIX
+#define DEF_MVE_TYPE_SUFFIX(A, B, C, D, E)
 #endif
 
+#ifndef DEF_MVE_FUNCTION
+#define DEF_MVE_FUNCTION(A, B, C, D)
+#endif
+
+DEF_MVE_MODE (n, none, none, none)
+DEF_MVE_MODE (offset, none, none, bytes)
+
 #define REQUIRES_FLOAT false
 DEF_MVE_TYPE (mve_pred16_t, boolean_type_node)
 DEF_MVE_TYPE (uint8x16_t, unsigned_intQI_type_node)
@@ -37,3 +52,26 @@ DEF_MVE_TYPE (int64x2_t, intDI_type_node)
 DEF_MVE_TYPE (float16x8_t, arm_fp16_type_node)
 DEF_MVE_TYPE (float32x4_t, float_type_node)
 #undef REQUIRES_FLOAT
+
+#define REQUIRES_FLOAT false
+DEF_MVE_TYPE_SUFFIX (s8, int8x16_t, signed, 8, V16QImode)
+DEF_MVE_TYPE_SUFFIX (s16, int16x8_t, signed, 16, V8HImode)
+DEF_MVE_TYPE_SUFFIX (s32, int32x4_t, signed, 32, V4SImode)
+DEF_MVE_TYPE_SUFFIX (s64, int64x2_t, signed, 64, V2DImode)
+DEF_MVE_TYPE_SUFFIX (u8, uint8x16_t, unsigned, 8, V16QImode)
+DEF_MVE_TYPE_SUFFIX (u16, uint16x8_t, unsigned, 16, V8HImode)
+DEF_MVE_TYPE_SUFFIX (u32, uint32x4_t, unsigned, 32, V4SImode)
+DEF_MVE_TYPE_SUFFIX (u64, uint64x2_t, unsigned, 64, V2DImode)
+#undef REQUIRES_FLOAT
+
+#define REQUIRES_FLOAT true
+DEF_MVE_TYPE_SUFFIX (f16, float16x8_t, float, 16, V8HFmode)
+DEF_MVE_TYPE_SUFFIX (f32, float32x4_t, float, 32, V4SFmode)
+#undef REQUIRES_FLOAT
+
+#include "arm-mve-builtins-base.def"
+
+#undef DEF_MVE_TYPE
+#undef DEF_MVE_TYPE_SUFFIX
+#undef DEF_MVE_FUNCTION
+#undef DEF_MVE_MODE
diff --git a/gcc/config/arm/arm-mve-builtins.h b/gcc/config/arm/arm-mve-builtins.h
index 290a118ec92..a20d2fb5d86 100644
--- a/gcc/config/arm/arm-mve-builtins.h
+++ b/gcc/config/arm/arm-mve-builtins.h
@@ -20,7 +20,79 @@
 #ifndef GCC_ARM_MVE_BUILTINS_H
 #define GCC_ARM_MVE_BUILTINS_H
 
+/* The full name of an MVE ACLE function is the concatenation of:
+
+   - the base name ("vadd", etc.)
+   - the "mode" suffix ("_n", "_index", etc.)
+   - the type suffixes ("_s32", "_b8", etc.)
+   - the predication suffix ("_x", "_z", etc.)
+
+   Each piece of information is individually useful, so we retain this
+   classification throughout:
+
+   - function_base represents the base name
+
+   - mode_suffix_index represents the mode suffix
+
+   - type_suffix_index represents individual type suffixes, while
+     type_suffix_pair represents a pair of them
+
+   - prediction_index extends the predication suffix with an additional
+     alternative: PRED_implicit for implicitly-predicated operations
+
+   In addition to its unique full name, a function may have a shorter
+   overloaded alias.  This alias removes pieces of the suffixes that
+   can be inferred from the arguments, such as by shortening the mode
+   suffix or dropping some of the type suffixes.  The base name and the
+   predication suffix stay the same.
+
+   The function_shape class describes what arguments a given function
+   takes and what its overloaded alias is called.  In broad terms,
+   function_base describes how the underlying instruction behaves while
+   function_shape describes how that instruction has been presented at
+   the language level.
+
+   The static list of functions uses function_group to describe a group
+   of related functions.  The function_builder class is responsible for
+   expanding this static description into a list of individual functions
+   and registering the associated built-in functions.  function_instance
+   describes one of these individual functions in terms of the properties
+   described above.
+
+   The classes involved in compiling a function call are:
+
+   - function_resolver, which resolves an overloaded function call to a
+     specific function_instance and its associated function decl
+
+   - function_checker, which checks whether the values of the arguments
+     conform to the ACLE specification
+
+   - gimple_folder, which tries to fold a function call at the gimple level
+
+   - function_expander, which expands a function call into rtl instructions
+
+   function_resolver and function_checker operate at the language level
+   and so are associated with the function_shape.  gimple_folder and
+   function_expander are concerned with the behavior of the function
+   and so are associated with the function_base.
+
+   Note that we've specifically chosen not to fold calls in the frontend,
+   since MVE intrinsics will hardly ever fold a useful language-level
+   constant.  */
 namespace arm_mve {
 
+/* The maximum number of vectors in an ACLE tuple type.  */
+const unsigned int MAX_TUPLE_SIZE = 3;
+
+/* Used to represent the default merge argument index for _m functions.
+   The actual index depends on how many arguments the function takes.  */
+const unsigned int DEFAULT_MERGE_ARGNO = 0;
+
+/* Flags that describe what a function might do, in addition to reading
+   its arguments and returning a result.  */
+const unsigned int CP_READ_FPCR = 1U << 0;
+const unsigned int CP_RAISE_FP_EXCEPTIONS = 1U << 1;
+const unsigned int CP_READ_MEMORY = 1U << 2;
+const unsigned int CP_WRITE_MEMORY = 1U << 3;
 
 /* Enumerates the MVE predicate and (data) vector types, together called
    "vector types" for brevity.  */
@@ -30,11 +102,604 @@ enum vector_type_index
   VECTOR_TYPE_ ## ACLE_NAME,
 #include "arm-mve-builtins.def"
   NUM_VECTOR_TYPES
-#undef DEF_MVE_TYPE
 };
 
+/* Classifies the available measurement units for an address displacement.  */
+enum units_index
+{
+  UNITS_none,
+  UNITS_bytes
+};
+
+/* Describes the various uses of a governing predicate.  */
+enum predication_index
+{
+  /* No governing predicate is present.  */
+  PRED_none,
+
+  /* Merging predication: copy inactive lanes from the first data argument
+     to the vector result.  */
+  PRED_m,
+
+  /* Plain predication: inactive lanes are not used to compute the
+     scalar result.  */
+  PRED_p,
+
+  /* "Don't care" predication: set inactive lanes of the vector result
+     to arbitrary values.  */
+  PRED_x,
+
+  /* Zero predication: set inactive lanes of the vector result to zero.  */
+  PRED_z,
+
+  NUM_PREDS
+};
+
+/* Classifies element types, based on type suffixes with the bit count
+   removed.  */
+enum type_class_index
+{
+  TYPE_bool,
+  TYPE_float,
+  TYPE_signed,
+  TYPE_unsigned,
+  NUM_TYPE_CLASSES
+};
+
+/* Classifies an operation into "modes"; for example, to distinguish
+   vector-scalar operations from vector-vector operations, or to
+   distinguish between different addressing modes.  This classification
+   accounts for the function suffixes that occur between the base name
+   and the first type suffix.  */
+enum mode_suffix_index
+{
+#define DEF_MVE_MODE(NAME, BASE, DISPLACEMENT, UNITS) MODE_##NAME,
+#include "arm-mve-builtins.def"
+  MODE_none
+};
+
+/* Enumerates the possible type suffixes.  Each suffix is associated with
+   a vector type, but for predicates provides extra information about the
+   element size.  */
+enum type_suffix_index
+{
+#define DEF_MVE_TYPE_SUFFIX(NAME, ACLE_TYPE, CLASS, BITS, MODE) \
+  TYPE_SUFFIX_ ## NAME,
+#include "arm-mve-builtins.def"
+  NUM_TYPE_SUFFIXES
+};
+
+/* Combines two type suffixes.  */
+typedef enum type_suffix_index type_suffix_pair[2];
+
+class function_base;
+class function_shape;
+
+/* Static information about a mode suffix.  */
+struct mode_suffix_info
+{
+  /* The suffix string itself.  */
+  const char *string;
+
+  /* The type of the vector base address, or NUM_VECTOR_TYPES if the
+     mode does not include a vector base address.  */
+  vector_type_index base_vector_type;
+
+  /* The type of the vector displacement, or NUM_VECTOR_TYPES if the
+     mode does not include a vector displacement.  (Note that scalar
+     displacements are always int64_t.)  */
+  vector_type_index displacement_vector_type;
+
+  /* The units in which the vector or scalar displacement is measured,
+     or UNITS_none if the mode doesn't take a displacement.  */
+  units_index displacement_units;
+};
+
+/* Static information about a type suffix.  */
+struct type_suffix_info
+{
+  /* The suffix string itself.  */
+  const char *string;
+
+  /* The associated ACLE vector or predicate type.  */
+  vector_type_index vector_type : 8;
+
+  /* What kind of type the suffix represents.  */
+  type_class_index tclass : 8;
+
+  /* The number of bits and bytes in an element.  For predicates this
+     measures the associated data elements.  */
+  unsigned int element_bits : 8;
+  unsigned int element_bytes : 8;
+
+  /* True if the suffix is for an integer type.  */
+  unsigned int integer_p : 1;
+  /* True if the suffix is for an unsigned type.  */
+  unsigned int unsigned_p : 1;
+  /* True if the suffix is for a floating-point type.  */
+  unsigned int float_p : 1;
+  unsigned int spare : 13;
+
+  /* The associated vector or predicate mode.  */
+  machine_mode vector_mode : 16;
+};
+
+/* Static information about a set of functions.  */
+struct function_group_info
+{
+  /* The base name, as a string.  */
+  const char *base_name;
+
+  /* Describes the behavior associated with the function base name.  */
+  const function_base *const *base;
+
+  /* The shape of the functions, as described above the class definition.
+     It's possible to have entries with the same base name but different
+     shapes.  */
+  const function_shape *const *shape;
+
+  /* A list of the available type suffixes, and of the available predication
+     types.  The function supports every combination of the two.
+
+     The list of type suffixes is terminated by two NUM_TYPE_SUFFIXES
+     while the list of predication types is terminated by NUM_PREDS.
+     The list of type suffixes is lexicographically ordered based
+     on the index value.  */
+  const type_suffix_pair *types;
+  const predication_index *preds;
+
+  /* Whether the function group requires a floating point abi.  */
+  bool requires_float;
+};
+
+/* Describes a single fully-resolved function (i.e. one that has a
+   unique full name).  */
+class GTY((user)) function_instance
+{
+public:
+  function_instance (const char *, const function_base *,
+		     const function_shape *, mode_suffix_index,
+		     const type_suffix_pair &, predication_index);
+
+  bool operator== (const function_instance &) const;
+  bool operator!= (const function_instance &) const;
+  hashval_t hash () const;
+
+  unsigned int call_properties () const;
+  bool reads_global_state_p () const;
+  bool modifies_global_state_p () const;
+  bool could_trap_p () const;
+
+  unsigned int vectors_per_tuple () const;
+
+  const mode_suffix_info &mode_suffix () const;
+
+  const type_suffix_info &type_suffix (unsigned int) const;
+  tree scalar_type (unsigned int) const;
+  tree vector_type (unsigned int) const;
+  tree tuple_type (unsigned int) const;
+  machine_mode vector_mode (unsigned int) const;
+  machine_mode gp_mode (unsigned int) const;
+
+  bool has_inactive_argument () const;
+
+  /* The properties of the function.  (The explicit "enum"s are required
+     for gengtype.)  */
+  const char *base_name;
+  const function_base *base;
+  const function_shape *shape;
+  enum mode_suffix_index mode_suffix_id;
+  type_suffix_pair type_suffix_ids;
+  enum predication_index pred;
+};
+
+class registered_function;
+
+/* A class for building and registering function decls.  */
+class function_builder
+{
+public:
+  function_builder ();
+  ~function_builder ();
+
+  void add_unique_function (const function_instance &, tree,
+			    vec<tree> &, bool, bool, bool);
+  void add_overloaded_function (const function_instance &, bool, bool);
+  void add_overloaded_functions (const function_group_info &,
+				 mode_suffix_index, bool);
+
+  void register_function_group (const function_group_info &, bool);
+
+private:
+  void append_name (const char *);
+  char *finish_name ();
+
+  char *get_name (const function_instance &, bool, bool);
+
+  tree get_attributes (const function_instance &);
+
+  registered_function &add_function (const function_instance &,
+				     const char *, tree, tree,
+				     bool, bool, bool);
+
+  /* The function type to use for functions that are resolved by
+     function_resolver.  */
+  tree m_overload_type;
+
+  /* True if we should create a separate decl for each instance of an
+     overloaded function, instead of using function_resolver.  */
+  bool m_direct_overloads;
+
+  /* Used for building up function names.  */
+  obstack m_string_obstack;
+
+  /* Maps all overloaded function names that we've registered so far
+     to their associated function_instances.  */
+  hash_map<nofree_string_hash, registered_function *> m_overload_names;
+};
+
+/* A base class for handling calls to built-in functions.  */
+class function_call_info : public function_instance
+{
+public:
+  function_call_info (location_t, const function_instance &, tree);
+
+  bool function_returns_void_p ();
+
+  /* The location of the call.  */
+  location_t location;
+
+  /* The FUNCTION_DECL that is being called.  */
+  tree fndecl;
+};
+
+/* A class for resolving an overloaded function call.  */
+class function_resolver : public function_call_info
+{
+public:
+  enum { SAME_SIZE = 256, HALF_SIZE, QUARTER_SIZE };
+  static const type_class_index SAME_TYPE_CLASS = NUM_TYPE_CLASSES;
+
+  function_resolver (location_t, const function_instance &, tree,
+		     vec<tree, va_gc> &);
+
+  tree get_vector_type (type_suffix_index);
+  const char *get_scalar_type_name (type_suffix_index);
+  tree get_argument_type (unsigned int);
+  bool scalar_argument_p (unsigned int);
+
+  tree report_no_such_form (type_suffix_index);
+  tree lookup_form (mode_suffix_index,
+		    type_suffix_index = NUM_TYPE_SUFFIXES,
+		    type_suffix_index = NUM_TYPE_SUFFIXES);
+  tree resolve_to (mode_suffix_index,
+		   type_suffix_index = NUM_TYPE_SUFFIXES,
+		   type_suffix_index = NUM_TYPE_SUFFIXES);
+
+  type_suffix_index infer_vector_or_tuple_type (unsigned int, unsigned int);
+  type_suffix_index infer_vector_type (unsigned int);
+
+  bool require_vector_or_scalar_type (unsigned int);
+
+  bool require_vector_type (unsigned int, vector_type_index);
+  bool require_matching_vector_type (unsigned int, type_suffix_index);
+  bool require_derived_vector_type (unsigned int, unsigned int,
+				    type_suffix_index,
+				    type_class_index = SAME_TYPE_CLASS,
+				    unsigned int = SAME_SIZE);
+  bool require_integer_immediate (unsigned int);
+  bool require_scalar_type (unsigned int, const char *);
+  bool require_derived_scalar_type (unsigned int, type_class_index,
+				    unsigned int = SAME_SIZE);
+
+  bool check_num_arguments (unsigned int);
+  bool check_gp_argument (unsigned int, unsigned int &, unsigned int &);
+  tree resolve_unary (type_class_index = SAME_TYPE_CLASS,
+		      unsigned int = SAME_SIZE, bool = false);
+  tree resolve_unary_n ();
+  tree resolve_uniform (unsigned int, unsigned int = 0);
+  tree resolve_uniform_opt_n (unsigned int);
+  tree finish_opt_n_resolution (unsigned int, unsigned int, type_suffix_index,
+				type_class_index = SAME_TYPE_CLASS,
+				unsigned int = SAME_SIZE,
+				type_suffix_index = NUM_TYPE_SUFFIXES);
+
+  tree resolve ();
+
+private:
+  /* The arguments to the overloaded function.  */
+  vec<tree, va_gc> &m_arglist;
+};
+
+/* A class for checking that the semantic constraints on a function call are
+   satisfied, such as arguments being integer constant expressions with
+   a particular range.  The parent class's FNDECL is the decl that was
+   called in the original source, before overload resolution.  */
+class function_checker : public function_call_info
+{
+public:
+  function_checker (location_t, const function_instance &, tree,
+		    tree, unsigned int, tree *);
+
+  bool require_immediate_enum (unsigned int, tree);
+  bool require_immediate_lane_index (unsigned int, unsigned int = 1);
+  bool require_immediate_range (unsigned int, HOST_WIDE_INT, HOST_WIDE_INT);
+
+  bool check ();
+
+private:
+  bool argument_exists_p (unsigned int);
+
+  bool require_immediate (unsigned int, HOST_WIDE_INT &);
+
+  /* The type of the resolved function.  */
+  tree m_fntype;
+
+  /* The arguments to the function.  */
+  unsigned int m_nargs;
+  tree *m_args;
+
+  /* The first argument not associated with the function's predication
+     type.  */
+  unsigned int m_base_arg;
+};
+
+/* A class for folding a gimple function call.  */
+class gimple_folder : public function_call_info
+{
+public:
+  gimple_folder (const function_instance &, tree,
+		 gcall *);
+
+  gimple *fold ();
+
+  /* The call we're folding.  */
+  gcall *call;
+
+  /* The result of the call, or null if none.  */
+  tree lhs;
+};
+
+/* A class for expanding a function call into RTL.  */
+class function_expander : public function_call_info
+{
+public:
+  function_expander (const function_instance &, tree, tree, rtx);
+  rtx expand ();
+
+  insn_code direct_optab_handler (optab, unsigned int = 0);
+
+  rtx get_fallback_value (machine_mode, unsigned int, unsigned int &);
+  rtx get_reg_target ();
+
+  void add_output_operand (insn_code);
+  void add_input_operand (insn_code, rtx);
+  void add_integer_operand (HOST_WIDE_INT);
+  rtx generate_insn (insn_code);
+
+  rtx use_exact_insn (insn_code);
+  rtx use_unpred_insn (insn_code);
+  rtx use_pred_x_insn (insn_code);
+  rtx use_cond_insn (insn_code, unsigned int = DEFAULT_MERGE_ARGNO);
+
+  rtx map_to_rtx_codes (rtx_code, rtx_code, rtx_code);
+
+  /* The function call expression.  */
+  tree call_expr;
+
+  /* For functions that return a value, this is the preferred location
+     of that value.  It could be null or could have a different mode
+     from the function return type.  */
+  rtx possible_target;
+
+  /* The expanded arguments.  */
+  auto_vec<rtx, 16> args;
+
+private:
+  /* Used to build up the operands to an instruction.  */
+  auto_vec<expand_operand, 8> m_ops;
+};
+
+/* Provides information about a particular function base name, and handles
+   tasks related to the base name.  */
+class function_base
+{
+public:
+  /* Return a set of CP_* flags that describe what the function might do,
+     in addition to reading its arguments and returning a result.  */
+  virtual unsigned int call_properties (const function_instance &) const;
+
+  /* If the function operates on tuples of vectors, return the number
+     of vectors in the tuples, otherwise return 1.  */
+  virtual unsigned int vectors_per_tuple () const { return 1; }
+
+  /* Try to fold the given gimple call.  Return the new gimple statement
+     on success, otherwise return null.  */
+  virtual gimple *fold (gimple_folder &) const { return NULL; }
+
+  /* Expand the given call into rtl.  Return the result of the function,
+     or an arbitrary value if the function doesn't return a result.  */
+  virtual rtx expand (function_expander &) const = 0;
+};
+
+/* Classifies functions into "shapes".  The idea is to take all the
+   type signatures for a set of functions, and classify what's left
+   based on:
+
+   - the number of arguments
+
+   - the process of determining the types in the signature from the mode
+     and type suffixes in the function name (including types that are not
+     affected by the suffixes)
+
+   - which arguments must be integer constant expressions, and what range
+     those arguments have
+
+   - the process for mapping overloaded names to "full" names.  */
+class function_shape
+{
+public:
+  virtual bool explicit_type_suffix_p (unsigned int, enum predication_index,
+				       enum mode_suffix_index) const = 0;
+  virtual bool explicit_mode_suffix_p (enum predication_index,
+				       enum mode_suffix_index) const = 0;
+  virtual bool skip_overload_p (enum predication_index,
+				enum mode_suffix_index) const = 0;
+
+  /* Define all functions associated with the given group.  */
+  virtual void build (function_builder &,
+		      const function_group_info &,
+		      bool) const = 0;
+
+  /* Try to resolve the overloaded call.  Return the non-overloaded
+     function decl on success and error_mark_node on failure.  */
+  virtual tree resolve (function_resolver &) const = 0;
+
+  /* Check whether the given call is semantically valid.  Return true
+     if it is, otherwise report an error and return false.  */
+  virtual bool check (function_checker &) const { return true; }
+};
+
+extern const type_suffix_info type_suffixes[NUM_TYPE_SUFFIXES + 1];
+extern const mode_suffix_info mode_suffixes[MODE_none + 1];
+
 extern tree scalar_types[NUM_VECTOR_TYPES];
-extern tree acle_vector_types[3][NUM_VECTOR_TYPES + 1];
+extern tree acle_vector_types[MAX_TUPLE_SIZE][NUM_VECTOR_TYPES + 1];
+
+/* Return the ACLE type mve_pred16_t.  */
+inline tree
+get_mve_pred16_t (void)
+{
+  return acle_vector_types[0][VECTOR_TYPE_mve_pred16_t];
+}
+
+/* Try to find a mode with the given mode_suffix_info fields.  Return the
+   mode on success or MODE_none on failure.  */
+inline mode_suffix_index
+find_mode_suffix (vector_type_index base_vector_type,
+		  vector_type_index displacement_vector_type,
+		  units_index displacement_units)
+{
+  for (unsigned int mode_i = 0; mode_i < ARRAY_SIZE (mode_suffixes); ++mode_i)
+    {
+      const mode_suffix_info &mode = mode_suffixes[mode_i];
+      if (mode.base_vector_type == base_vector_type
+	  && mode.displacement_vector_type == displacement_vector_type
+	  && mode.displacement_units == displacement_units)
+	return mode_suffix_index (mode_i);
+    }
+  return MODE_none;
+}
+
+/* Return the type suffix associated with ELEMENT_BITS-bit elements of type
+   class TCLASS.  */
+inline type_suffix_index
+find_type_suffix (type_class_index tclass, unsigned int element_bits)
+{
+  for (unsigned int i = 0; i < NUM_TYPE_SUFFIXES; ++i)
+    if (type_suffixes[i].tclass == tclass
+	&& type_suffixes[i].element_bits == element_bits)
+      return type_suffix_index (i);
+  gcc_unreachable ();
+}
+
+inline function_instance::
+function_instance (const char *base_name_in,
+		   const function_base *base_in,
+		   const function_shape *shape_in,
+		   mode_suffix_index mode_suffix_id_in,
+		   const type_suffix_pair &type_suffix_ids_in,
+		   predication_index pred_in)
+  : base_name (base_name_in), base (base_in), shape (shape_in),
+    mode_suffix_id (mode_suffix_id_in), pred (pred_in)
+{
+  memcpy (type_suffix_ids, type_suffix_ids_in, sizeof (type_suffix_ids));
+}
+
+inline bool
+function_instance::operator== (const function_instance &other) const
+{
+  return (base == other.base
+	  && shape == other.shape
+	  && mode_suffix_id == other.mode_suffix_id
+	  && pred == other.pred
+	  && type_suffix_ids[0] == other.type_suffix_ids[0]
+	  && type_suffix_ids[1] == other.type_suffix_ids[1]);
+}
+
+inline bool
+function_instance::operator!= (const function_instance &other) const
+{
+  return !operator== (other);
+}
+
+/* If the function operates on tuples of vectors, return the number
+   of vectors in the tuples, otherwise return 1.  */
+inline unsigned int
+function_instance::vectors_per_tuple () const
+{
+  return base->vectors_per_tuple ();
+}
+
+/* Return information about the function's mode suffix.  */
+inline const mode_suffix_info &
+function_instance::mode_suffix () const
+{
+  return mode_suffixes[mode_suffix_id];
+}
+
+/* Return information about type suffix I.  */
+inline const type_suffix_info &
+function_instance::type_suffix (unsigned int i) const
+{
+  return type_suffixes[type_suffix_ids[i]];
+}
+
+/* Return the scalar type associated with type suffix I.  */
+inline tree
+function_instance::scalar_type (unsigned int i) const
+{
+  return scalar_types[type_suffix (i).vector_type];
+}
+
+/* Return the vector type associated with type suffix I.  */
+inline tree
+function_instance::vector_type (unsigned int i) const
+{
+  return acle_vector_types[0][type_suffix (i).vector_type];
+}
+
+/* If the function operates on tuples of vectors, return the tuple type
+   associated with type suffix I, otherwise return the vector type associated
+   with type suffix I.  */
+inline tree
+function_instance::tuple_type (unsigned int i) const
+{
+  unsigned int num_vectors = vectors_per_tuple ();
+  return acle_vector_types[num_vectors - 1][type_suffix (i).vector_type];
+}
+
+/* Return the vector or predicate mode associated with type suffix I.  */
+inline machine_mode
+function_instance::vector_mode (unsigned int i) const
+{
+  return type_suffix (i).vector_mode;
+}
+
+/* Return true if the function has no return value.  */
+inline bool
+function_call_info::function_returns_void_p ()
+{
+  return TREE_TYPE (TREE_TYPE (fndecl)) == void_type_node;
+}
+
+/* Default implementation of function::call_properties, with conservatively
+   correct behavior for floating-point instructions.  */
+inline unsigned int
+function_base::call_properties (const function_instance &instance) const
+{
+  unsigned int flags = 0;
+  if (instance.type_suffix (0).float_p || instance.type_suffix (1).float_p)
+    flags |= CP_READ_FPCR | CP_RAISE_FP_EXCEPTIONS;
+  return flags;
+}
+
 } /* end namespace arm_mve */
diff --git a/gcc/config/arm/arm-protos.h b/gcc/config/arm/arm-protos.h
index 1bdbd3b8ab3..61fcd671437 100644
--- a/gcc/config/arm/arm-protos.h
+++ b/gcc/config/arm/arm-protos.h
@@ -215,7 +215,8 @@ extern opt_machine_mode arm_get_mask_mode (machine_mode mode);
    those groups.  */
 enum arm_builtin_class
 {
-  ARM_BUILTIN_GENERAL
+  ARM_BUILTIN_GENERAL,
+  ARM_BUILTIN_MVE
 };
 
 /* Built-in function codes are structured so that the low
@@ -229,6 +230,13 @@ const unsigned int ARM_BUILTIN_CLASS = (1 << ARM_BUILTIN_SHIFT) - 1;
 /* MVE functions.  */
 namespace arm_mve {
   void handle_arm_mve_types_h ();
+  void handle_arm_mve_h (bool);
+  tree resolve_overloaded_builtin (location_t, unsigned int,
+				   vec<tree, va_gc> *);
+  bool check_builtin_call (location_t, vec<location_t>, unsigned int,
+			   tree, unsigned int, tree *);
+  gimple *gimple_fold_builtin (unsigned int code, gcall *stmt);
+  rtx expand_builtin (unsigned int, tree, rtx);
 }
 
 /* Thumb functions.  */
diff --git a/gcc/config/arm/arm.cc b/gcc/config/arm/arm.cc
index bf7ff9a9704..004e6c6194e 100644
--- a/gcc/config/arm/arm.cc
+++ b/gcc/config/arm/arm.cc
@@ -69,6 +69,7 @@
 #include "optabs-libfuncs.h"
 #include "gimplify.h"
 #include "gimple.h"
+#include "gimple-iterator.h"
 #include "selftest.h"
 #include "tree-vectorizer.h"
 #include "opts.h"
@@ -506,6 +507,9 @@ static const struct attribute_spec arm_attribute_table[] =
 #undef TARGET_FUNCTION_VALUE_REGNO_P
 #define TARGET_FUNCTION_VALUE_REGNO_P arm_function_value_regno_p
 
+#undef TARGET_GIMPLE_FOLD_BUILTIN
+#define TARGET_GIMPLE_FOLD_BUILTIN arm_gimple_fold_builtin
+
 #undef TARGET_ASM_OUTPUT_MI_THUNK
 #define TARGET_ASM_OUTPUT_MI_THUNK arm_output_mi_thunk
 #undef TARGET_ASM_CAN_OUTPUT_MI_THUNK
@@ -2844,6 +2848,29 @@ arm_init_libfuncs (void)
   speculation_barrier_libfunc = init_one_libfunc ("__speculation_barrier");
 }
 
+/* Implement TARGET_GIMPLE_FOLD_BUILTIN.  */
+static bool
+arm_gimple_fold_builtin (gimple_stmt_iterator *gsi)
+{
+  gcall *stmt = as_a <gcall *> (gsi_stmt (*gsi));
+  tree fndecl = gimple_call_fndecl (stmt);
+  unsigned int code = DECL_MD_FUNCTION_CODE (fndecl);
+  unsigned int subcode = code >> ARM_BUILTIN_SHIFT;
+  gimple *new_stmt = NULL;
+  switch (code & ARM_BUILTIN_CLASS)
+    {
+    case ARM_BUILTIN_GENERAL:
+      break;
+    case ARM_BUILTIN_MVE:
+      new_stmt = arm_mve::gimple_fold_builtin (subcode, stmt);
+    }
+  if (!new_stmt)
+    return false;
+
+  gsi_replace (gsi, new_stmt, true);
+  return true;
+}
+
 /* On AAPCS systems, this is the "struct __va_list".  */
 static GTY(()) tree va_list_type;
 
diff --git a/gcc/config/arm/arm_mve.h b/gcc/config/arm/arm_mve.h
index 1262d668121..0d2ba968fc0 100644
--- a/gcc/config/arm/arm_mve.h
+++ b/gcc/config/arm/arm_mve.h
@@ -34,6 +34,12 @@
 #endif
 
 #include "arm_mve_types.h"
+#ifdef __ARM_MVE_PRESERVE_USER_NAMESPACE
+#pragma GCC arm "arm_mve.h" true
+#else
+#pragma GCC arm "arm_mve.h" false
+#endif
+
 #ifndef __ARM_MVE_PRESERVE_USER_NAMESPACE
 #define vst4q(__addr, __value) __arm_vst4q(__addr, __value)
 #define vdupq_n(__a) __arm_vdupq_n(__a)
diff --git a/gcc/config/arm/predicates.md b/gcc/config/arm/predicates.md
index 3139750c606..8e235f63ee6 100644
--- a/gcc/config/arm/predicates.md
+++ b/gcc/config/arm/predicates.md
@@ -903,3 +903,7 @@ (define_predicate "call_insn_operand"
 (define_special_predicate "aligned_operand"
   (ior (not (match_code "mem"))
        (match_test "MEM_ALIGN (op) >= GET_MODE_ALIGNMENT (mode)")))
+
+;; A special predicate that doesn't match a particular mode.
+(define_special_predicate "arm_any_register_operand"
+  (match_code "reg"))
diff --git a/gcc/config/arm/t-arm b/gcc/config/arm/t-arm
index 637e72af5bb..9a1b06368a1 100644
--- a/gcc/config/arm/t-arm
+++ b/gcc/config/arm/t-arm
@@ -154,15 +154,41 @@ arm-builtins.o: $(srcdir)/config/arm/arm-builtins.cc $(CONFIG_H) \
 	$(srcdir)/config/arm/arm-builtins.cc
 
 arm-mve-builtins.o: $(srcdir)/config/arm/arm-mve-builtins.cc $(CONFIG_H) \
-  $(SYSTEM_H) coretypes.h $(TM_H) $(TREE_H) \
-  fold-const.h langhooks.h stringpool.h attribs.h diagnostic.h \
+  $(SYSTEM_H) coretypes.h $(TM_H) $(TREE_H) $(RTL_H) $(TM_P_H) \
+  memmodel.h insn-codes.h optabs.h recog.h expr.h basic-block.h \
+  function.h fold-const.h gimple.h gimple-fold.h emit-rtl.h langhooks.h \
+  stringpool.h attribs.h diagnostic.h \
   $(srcdir)/config/arm/arm-protos.h \
   $(srcdir)/config/arm/arm-builtins.h \
   $(srcdir)/config/arm/arm-mve-builtins.h \
-  $(srcdir)/config/arm/arm-mve-builtins.def
+  $(srcdir)/config/arm/arm-mve-builtins-base.h \
+  $(srcdir)/config/arm/arm-mve-builtins-shapes.h \
+  $(srcdir)/config/arm/arm-mve-builtins.def \
+  $(srcdir)/config/arm/arm-mve-builtins-base.def
 	$(COMPILER) -c $(ALL_COMPILERFLAGS) $(ALL_CPPFLAGS) $(INCLUDES) \
 		$(srcdir)/config/arm/arm-mve-builtins.cc
 
+arm-mve-builtins-shapes.o: \
+  $(srcdir)/config/arm/arm-mve-builtins-shapes.cc \
+  $(CONFIG_H) $(SYSTEM_H) coretypes.h $(TM_H) $(TREE_H) \
+  $(RTL_H) memmodel.h insn-codes.h optabs.h \
+  $(srcdir)/config/arm/arm-mve-builtins.h \
+  $(srcdir)/config/arm/arm-mve-builtins-shapes.h
+	$(COMPILER) -c $(ALL_COMPILERFLAGS) $(ALL_CPPFLAGS) $(INCLUDES) \
+		$(srcdir)/config/arm/arm-mve-builtins-shapes.cc
+
+arm-mve-builtins-base.o: \
+  $(srcdir)/config/arm/arm-mve-builtins-base.cc \
+  $(CONFIG_H) $(SYSTEM_H) coretypes.h $(TM_H) $(TREE_H) $(RTL_H) \
+  memmodel.h insn-codes.h $(OPTABS_H) \
+  $(BASIC_BLOCK_H) $(FUNCTION_H) $(GIMPLE_H) \
+  $(srcdir)/config/arm/arm-mve-builtins.h \
+  $(srcdir)/config/arm/arm-mve-builtins-shapes.h \
+  $(srcdir)/config/arm/arm-mve-builtins-base.h \
+  $(srcdir)/config/arm/arm-mve-builtins-functions.h
+	$(COMPILER) -c $(ALL_COMPILERFLAGS) $(ALL_CPPFLAGS) $(INCLUDES) \
+		$(srcdir)/config/arm/arm-mve-builtins-base.cc
+
 arm-c.o: $(srcdir)/config/arm/arm-c.cc $(CONFIG_H) $(SYSTEM_H) \
   coretypes.h $(TM_H) $(TREE_H) output.h $(C_COMMON_H)
 	$(COMPILER) -c $(ALL_COMPILERFLAGS) $(ALL_CPPFLAGS) $(INCLUDES) \