Message ID | 20190820091845.80750-3-iii@linux.ibm.com |
---|---|
State | New |
Headers | show |
Series | Use signaling FP comparison instructions | expand |
Ilya Leoshkevich <iii@linux.ibm.com> writes: > z13 supports only non-signaling vector comparisons. This means we > cannot vectorize LT, LE, GT, GE and LTGT when compiling for z13. > However, we cannot express this restriction today: the code only checks > whether vcond$a$b optab, which does not contain information about the > operation. > > Introduce a hook that tells whether target supports certain vector > comparison operations with certain modes. > > gcc/ChangeLog: > > 2019-08-09 Ilya Leoshkevich <iii@linux.ibm.com> > > * doc/tm.texi (TARGET_VCOND_SUPPORTED_P): Document. > * doc/tm.texi.in (TARGET_VCOND_SUPPORTED_P): Document. > * optabs-tree.c (expand_vec_cond_expr_p): Use vcond_supported_p > in addition to get_vcond_icode. > * target.def (vcond_supported_p): New hook. > * targhooks.c (default_vcond_supported_p): Likewise. > * targhooks.h (default_vcond_supported_p): Likewise. IMO it'd be cleaner to have a new optabs-query.[hc] helper that uses the predicate for operand 3 to test whether a particular comparison is supported. I guess this would require a cached rtx to avoid generating too much garbage rtl though (via GTY((cache))). If the general feeling is that we should have a hook instead, I assume the same restrictions would apply to the vec_cmp@var{m}@var{n} optab too, on targets that use that. It might be worth generalising the hook so that it applies to both patterns. > diff --git a/gcc/target.def b/gcc/target.def > index b2332d8215c..4cd2a964c0e 100644 > --- a/gcc/target.def > +++ b/gcc/target.def > @@ -3379,6 +3379,17 @@ the vector element type.", > HOST_WIDE_INT, (const_tree type), > default_vector_alignment) > > +DEFHOOK > +(vcond_supported_p, > + "Define this to restrict which vector comparison operations are supported by\n\ > +vcond$a$b expander. An operation is represented by its operand machine mode\n\ Maybe s/vcond$a$b expander/a @code{vcond@var{m}@var{n}} instruction pattern/ for consistency with the .md documentation. > +@code{OP_MODE}, its result machine mode @code{RESULT_MODE} and @code{enum\n\ > +tree_code CODE}. The main use of this hook is to support machines which\n\ > +provide only certain vector comparison instructions, e.g. only non-signaling\n\ > +ones. The default is that all operations are supported.", ...if the associated pattern exists. > + bool, (machine_mode result_mode, machine_mode op_mode, int code), > + default_vcond_supported_p) > + > DEFHOOK > (array_mode, > "Return the mode that GCC should use for an array that has\n\ > diff --git a/gcc/targhooks.c b/gcc/targhooks.c > index 1d12ec54704..2b9a5d12bab 100644 > --- a/gcc/targhooks.c > +++ b/gcc/targhooks.c > @@ -448,6 +448,18 @@ default_scalar_mode_supported_p (scalar_mode mode) > } > } > > +/* Return true if vcond$a$b expander supports vector comparisons using the CODE > + of type enum tree_code, in which the operands have machine mode OP_MODE and > + the result has machine mode RESULT_MODE. */ > + > +bool > +default_vcond_supported_p (machine_mode result_mode ATTRIBUTE_UNUSED, > + machine_mode op_mode ATTRIBUTE_UNUSED, > + int code ATTRIBUTE_UNUSED) No need to repeat the documentation here -- just say which hook it's the default for. Thanks, Richard
> Am 20.08.2019 um 12:13 schrieb Richard Sandiford <richard.sandiford@arm.com>: > > Ilya Leoshkevich <iii@linux.ibm.com> writes: >> z13 supports only non-signaling vector comparisons. This means we >> cannot vectorize LT, LE, GT, GE and LTGT when compiling for z13. >> However, we cannot express this restriction today: the code only checks >> whether vcond$a$b optab, which does not contain information about the >> operation. >> >> Introduce a hook that tells whether target supports certain vector >> comparison operations with certain modes. >> >> gcc/ChangeLog: >> >> 2019-08-09 Ilya Leoshkevich <iii@linux.ibm.com> >> >> * doc/tm.texi (TARGET_VCOND_SUPPORTED_P): Document. >> * doc/tm.texi.in (TARGET_VCOND_SUPPORTED_P): Document. >> * optabs-tree.c (expand_vec_cond_expr_p): Use vcond_supported_p >> in addition to get_vcond_icode. >> * target.def (vcond_supported_p): New hook. >> * targhooks.c (default_vcond_supported_p): Likewise. >> * targhooks.h (default_vcond_supported_p): Likewise. > > IMO it'd be cleaner to have a new optabs-query.[hc] helper that uses > the predicate for operand 3 to test whether a particular comparison > is supported. I guess this would require a cached rtx to avoid > generating too much garbage rtl though (via GTY((cache))). How can I implement such a predicate? Would calling maybe_gen_insn with a fake rtx be reasonable? In this case, what would be the best way to generate fake input operands? The existing code that calls maybe_gen_insn gets the corresponding rtxes from upper layers.
Ilya Leoshkevich <iii@linux.ibm.com> writes: >> Am 20.08.2019 um 12:13 schrieb Richard Sandiford <richard.sandiford@arm.com>: >> >> Ilya Leoshkevich <iii@linux.ibm.com> writes: >>> z13 supports only non-signaling vector comparisons. This means we >>> cannot vectorize LT, LE, GT, GE and LTGT when compiling for z13. >>> However, we cannot express this restriction today: the code only checks >>> whether vcond$a$b optab, which does not contain information about the >>> operation. >>> >>> Introduce a hook that tells whether target supports certain vector >>> comparison operations with certain modes. >>> >>> gcc/ChangeLog: >>> >>> 2019-08-09 Ilya Leoshkevich <iii@linux.ibm.com> >>> >>> * doc/tm.texi (TARGET_VCOND_SUPPORTED_P): Document. >>> * doc/tm.texi.in (TARGET_VCOND_SUPPORTED_P): Document. >>> * optabs-tree.c (expand_vec_cond_expr_p): Use vcond_supported_p >>> in addition to get_vcond_icode. >>> * target.def (vcond_supported_p): New hook. >>> * targhooks.c (default_vcond_supported_p): Likewise. >>> * targhooks.h (default_vcond_supported_p): Likewise. >> >> IMO it'd be cleaner to have a new optabs-query.[hc] helper that uses >> the predicate for operand 3 to test whether a particular comparison >> is supported. I guess this would require a cached rtx to avoid >> generating too much garbage rtl though (via GTY((cache))). > > How can I implement such a predicate? Would calling maybe_gen_insn with > a fake rtx be reasonable? In this case, what would be the best way to > generate fake input operands? The existing code that calls > maybe_gen_insn gets the corresponding rtxes from upper layers. I was thinking of something like optabs.c:can_compare_p, but with some caching to reduce the overhead, and comparing registers rather than constants. E.g.: static rtx cached_binop GTY ((cached)); rtx get_cached_binop (machine_mode mode, rtx_code code, machine_mode op_mode) { ...create or modify cached_binop, with register operands...; return cached_binop; } /* ...maybe a different name if the overload seems like a bad idea... */ bool vcond_supported_p (machine_mode value_mode, rtx_code cmp, machine_mode cmp_op_mode) { bool unsigned = (cmp == LTU || cmp == LEU || cmp == GTU || cmp == GEU); insn_code icode = get_vcond_icode (value_mode, cmp_op_mode, unsigned); if (icode == CODE_FOR_nothing) return false; if (insn_operand_predicate_fn pred = insn_data[icode].operand[3].predicate) { machine_mode cmp_mode = insn_data[icode].operand[3].mode; rtx cmp = get_cached_binop (cmp_mode, cmp, cmp_op_mode); if (!pred (cmp, cmp_mode)) return false; } return true; } (not attached to the names). Then the target pattern would use a normal match_operator for operand 3, with the predicate on the match_operand only allowing through the comparisons it actually supports. The get_cached_binop stuff makes it a bit cumbersome, but that part would be reusable by other queries in future (and by can_compare_p probably). Thanks, Richard
> Am 20.08.2019 um 13:54 schrieb Richard Sandiford <richard.sandiford@arm.com>: > > Ilya Leoshkevich <iii@linux.ibm.com> writes: >>> Am 20.08.2019 um 12:13 schrieb Richard Sandiford <richard.sandiford@arm.com>: >>> >>> Ilya Leoshkevich <iii@linux.ibm.com> writes: >>>> z13 supports only non-signaling vector comparisons. This means we >>>> cannot vectorize LT, LE, GT, GE and LTGT when compiling for z13. >>>> However, we cannot express this restriction today: the code only checks >>>> whether vcond$a$b optab, which does not contain information about the >>>> operation. >>>> >>>> Introduce a hook that tells whether target supports certain vector >>>> comparison operations with certain modes. >>>> >>>> gcc/ChangeLog: >>>> >>>> 2019-08-09 Ilya Leoshkevich <iii@linux.ibm.com> >>>> >>>> * doc/tm.texi (TARGET_VCOND_SUPPORTED_P): Document. >>>> * doc/tm.texi.in (TARGET_VCOND_SUPPORTED_P): Document. >>>> * optabs-tree.c (expand_vec_cond_expr_p): Use vcond_supported_p >>>> in addition to get_vcond_icode. >>>> * target.def (vcond_supported_p): New hook. >>>> * targhooks.c (default_vcond_supported_p): Likewise. >>>> * targhooks.h (default_vcond_supported_p): Likewise. >>> >>> IMO it'd be cleaner to have a new optabs-query.[hc] helper that uses >>> the predicate for operand 3 to test whether a particular comparison >>> is supported. I guess this would require a cached rtx to avoid >>> generating too much garbage rtl though (via GTY((cache))). >> >> How can I implement such a predicate? Would calling maybe_gen_insn with >> a fake rtx be reasonable? In this case, what would be the best way to >> generate fake input operands? The existing code that calls >> maybe_gen_insn gets the corresponding rtxes from upper layers. > > I was thinking of something like optabs.c:can_compare_p, but with > some caching to reduce the overhead, and comparing registers rather > than constants. E.g.: > > static rtx cached_binop GTY ((cached)); > > rtx > get_cached_binop (machine_mode mode, rtx_code code, machine_mode op_mode) > { > ...create or modify cached_binop, with register operands...; > return cached_binop; > } Thanks, I got it working! Still need to run the regtest, but the result looks promising. I have a question about caching. If I define just `static rtx cached_binop`, like you suggest, I would expect to see constant cache misses. Shouldn't it be something like `hash_map<tuple<enum rtx_code, machine_mode, machine_mode>, rtx>`?
diff --git a/gcc/doc/tm.texi b/gcc/doc/tm.texi index 4eadd0c9300..23ccce1d3f4 100644 --- a/gcc/doc/tm.texi +++ b/gcc/doc/tm.texi @@ -4326,6 +4326,15 @@ insns involving vector mode @var{mode}. At the very least, it must have move patterns for this mode. @end deftypefn +@deftypefn {Target Hook} bool TARGET_VCOND_SUPPORTED_P (machine_mode @var{result_mode}, machine_mode @var{op_mode}, int @var{code}) +Define this to restrict which vector comparison operations are supported by +vcond$a$b expander. An operation is represented by its operand machine mode +@code{OP_MODE}, its result machine mode @code{RESULT_MODE} and @code{enum +tree_code CODE}. The main use of this hook is to support machines which +provide only certain vector comparison instructions, e.g. only non-signaling +ones. The default is that all operations are supported. +@end deftypefn + @deftypefn {Target Hook} opt_machine_mode TARGET_ARRAY_MODE (machine_mode @var{mode}, unsigned HOST_WIDE_INT @var{nelems}) Return the mode that GCC should use for an array that has @var{nelems} elements, with each element having mode @var{mode}. diff --git a/gcc/doc/tm.texi.in b/gcc/doc/tm.texi.in index 04f236f611e..a4399cd9d12 100644 --- a/gcc/doc/tm.texi.in +++ b/gcc/doc/tm.texi.in @@ -3366,6 +3366,8 @@ stack. @hook TARGET_VECTOR_MODE_SUPPORTED_P +@hook TARGET_VCOND_SUPPORTED_P + @hook TARGET_ARRAY_MODE @hook TARGET_ARRAY_MODE_SUPPORTED_P diff --git a/gcc/optabs-tree.c b/gcc/optabs-tree.c index 8157798cc71..de62e3d2765 100644 --- a/gcc/optabs-tree.c +++ b/gcc/optabs-tree.c @@ -347,8 +347,10 @@ expand_vec_cond_expr_p (tree value_type, tree cmp_op_type, enum tree_code code) || maybe_ne (GET_MODE_NUNITS (value_mode), GET_MODE_NUNITS (cmp_op_mode))) return false; - if (get_vcond_icode (TYPE_MODE (value_type), TYPE_MODE (cmp_op_type), - TYPE_UNSIGNED (cmp_op_type)) == CODE_FOR_nothing + if (((get_vcond_icode (TYPE_MODE (value_type), TYPE_MODE (cmp_op_type), + TYPE_UNSIGNED (cmp_op_type)) == CODE_FOR_nothing) + || !targetm.vcond_supported_p (TYPE_MODE (value_type), + TYPE_MODE (cmp_op_type), code)) && ((code != EQ_EXPR && code != NE_EXPR) || get_vcond_eq_icode (TYPE_MODE (value_type), TYPE_MODE (cmp_op_type)) == CODE_FOR_nothing)) diff --git a/gcc/target.def b/gcc/target.def index b2332d8215c..4cd2a964c0e 100644 --- a/gcc/target.def +++ b/gcc/target.def @@ -3379,6 +3379,17 @@ the vector element type.", HOST_WIDE_INT, (const_tree type), default_vector_alignment) +DEFHOOK +(vcond_supported_p, + "Define this to restrict which vector comparison operations are supported by\n\ +vcond$a$b expander. An operation is represented by its operand machine mode\n\ +@code{OP_MODE}, its result machine mode @code{RESULT_MODE} and @code{enum\n\ +tree_code CODE}. The main use of this hook is to support machines which\n\ +provide only certain vector comparison instructions, e.g. only non-signaling\n\ +ones. The default is that all operations are supported.", + bool, (machine_mode result_mode, machine_mode op_mode, int code), + default_vcond_supported_p) + DEFHOOK (array_mode, "Return the mode that GCC should use for an array that has\n\ diff --git a/gcc/targhooks.c b/gcc/targhooks.c index 1d12ec54704..2b9a5d12bab 100644 --- a/gcc/targhooks.c +++ b/gcc/targhooks.c @@ -448,6 +448,18 @@ default_scalar_mode_supported_p (scalar_mode mode) } } +/* Return true if vcond$a$b expander supports vector comparisons using the CODE + of type enum tree_code, in which the operands have machine mode OP_MODE and + the result has machine mode RESULT_MODE. */ + +bool +default_vcond_supported_p (machine_mode result_mode ATTRIBUTE_UNUSED, + machine_mode op_mode ATTRIBUTE_UNUSED, + int code ATTRIBUTE_UNUSED) +{ + return true; +} + /* Return true if libgcc supports floating-point mode MODE (known to be supported as a scalar mode). */ diff --git a/gcc/targhooks.h b/gcc/targhooks.h index 20a6e79d2d2..771a0996567 100644 --- a/gcc/targhooks.h +++ b/gcc/targhooks.h @@ -76,6 +76,7 @@ extern tree default_mangle_assembler_name (const char *); extern machine_mode default_translate_mode_attribute (machine_mode); extern bool default_scalar_mode_supported_p (scalar_mode); +extern bool default_vcond_supported_p (machine_mode, machine_mode, int); extern bool default_libgcc_floating_mode_supported_p (scalar_float_mode); extern opt_scalar_float_mode default_floatn_mode (int, bool); extern bool default_floatn_builtin_p (int);