diff mbox series

[2/8] Introduce vcond_supported_p hook

Message ID 20190820091845.80750-3-iii@linux.ibm.com
State New
Headers show
Series Use signaling FP comparison instructions | expand

Commit Message

Ilya Leoshkevich Aug. 20, 2019, 9:18 a.m. UTC
z13 supports only non-signaling vector comparisons.  This means we
cannot vectorize LT, LE, GT, GE and LTGT when compiling for z13.
However, we cannot express this restriction today: the code only checks
whether vcond$a$b optab, which does not contain information about the
operation.

Introduce a hook that tells whether target supports certain vector
comparison operations with certain modes.

gcc/ChangeLog:

2019-08-09  Ilya Leoshkevich  <iii@linux.ibm.com>

	* doc/tm.texi (TARGET_VCOND_SUPPORTED_P): Document.
	* doc/tm.texi.in (TARGET_VCOND_SUPPORTED_P): Document.
	* optabs-tree.c (expand_vec_cond_expr_p): Use vcond_supported_p
	in addition to get_vcond_icode.
	* target.def (vcond_supported_p): New hook.
	* targhooks.c (default_vcond_supported_p): Likewise.
	* targhooks.h (default_vcond_supported_p): Likewise.
---
 gcc/doc/tm.texi    |  9 +++++++++
 gcc/doc/tm.texi.in |  2 ++
 gcc/optabs-tree.c  |  6 ++++--
 gcc/target.def     | 11 +++++++++++
 gcc/targhooks.c    | 12 ++++++++++++
 gcc/targhooks.h    |  1 +
 6 files changed, 39 insertions(+), 2 deletions(-)

Comments

Richard Sandiford Aug. 20, 2019, 10:13 a.m. UTC | #1
Ilya Leoshkevich <iii@linux.ibm.com> writes:
> z13 supports only non-signaling vector comparisons.  This means we
> cannot vectorize LT, LE, GT, GE and LTGT when compiling for z13.
> However, we cannot express this restriction today: the code only checks
> whether vcond$a$b optab, which does not contain information about the
> operation.
>
> Introduce a hook that tells whether target supports certain vector
> comparison operations with certain modes.
>
> gcc/ChangeLog:
>
> 2019-08-09  Ilya Leoshkevich  <iii@linux.ibm.com>
>
> 	* doc/tm.texi (TARGET_VCOND_SUPPORTED_P): Document.
> 	* doc/tm.texi.in (TARGET_VCOND_SUPPORTED_P): Document.
> 	* optabs-tree.c (expand_vec_cond_expr_p): Use vcond_supported_p
> 	in addition to get_vcond_icode.
> 	* target.def (vcond_supported_p): New hook.
> 	* targhooks.c (default_vcond_supported_p): Likewise.
> 	* targhooks.h (default_vcond_supported_p): Likewise.

IMO it'd be cleaner to have a new optabs-query.[hc] helper that uses
the predicate for operand 3 to test whether a particular comparison
is supported.  I guess this would require a cached rtx to avoid
generating too much garbage rtl though (via GTY((cache))).

If the general feeling is that we should have a hook instead, I assume
the same restrictions would apply to the vec_cmp@var{m}@var{n} optab too,
on targets that use that.  It might be worth generalising the hook so
that it applies to both patterns.

> diff --git a/gcc/target.def b/gcc/target.def
> index b2332d8215c..4cd2a964c0e 100644
> --- a/gcc/target.def
> +++ b/gcc/target.def
> @@ -3379,6 +3379,17 @@ the vector element type.",
>   HOST_WIDE_INT, (const_tree type),
>   default_vector_alignment)
>  
> +DEFHOOK
> +(vcond_supported_p,
> + "Define this to restrict which vector comparison operations are supported by\n\
> +vcond$a$b expander.  An operation is represented by its operand machine mode\n\

Maybe s/vcond$a$b expander/a @code{vcond@var{m}@var{n}} instruction pattern/
for consistency with the .md documentation.

> +@code{OP_MODE}, its result machine mode @code{RESULT_MODE} and @code{enum\n\
> +tree_code CODE}.  The main use of this hook is to support machines which\n\
> +provide only certain vector comparison instructions, e.g. only non-signaling\n\
> +ones.  The default is that all operations are supported.",

...if the associated pattern exists.

> + bool, (machine_mode result_mode, machine_mode op_mode, int code),
> + default_vcond_supported_p)
> +
>  DEFHOOK
>  (array_mode,
>   "Return the mode that GCC should use for an array that has\n\
> diff --git a/gcc/targhooks.c b/gcc/targhooks.c
> index 1d12ec54704..2b9a5d12bab 100644
> --- a/gcc/targhooks.c
> +++ b/gcc/targhooks.c
> @@ -448,6 +448,18 @@ default_scalar_mode_supported_p (scalar_mode mode)
>      }
>  }
>  
> +/* Return true if vcond$a$b expander supports vector comparisons using the CODE
> +   of type enum tree_code, in which the operands have machine mode OP_MODE and
> +   the result has machine mode RESULT_MODE.  */
> +
> +bool
> +default_vcond_supported_p (machine_mode result_mode ATTRIBUTE_UNUSED,
> +                           machine_mode op_mode ATTRIBUTE_UNUSED,
> +                           int code ATTRIBUTE_UNUSED)

No need to repeat the documentation here -- just say which hook it's
the default for.

Thanks,
Richard
Ilya Leoshkevich Aug. 20, 2019, 11:07 a.m. UTC | #2
> Am 20.08.2019 um 12:13 schrieb Richard Sandiford <richard.sandiford@arm.com>:
> 
> Ilya Leoshkevich <iii@linux.ibm.com> writes:
>> z13 supports only non-signaling vector comparisons.  This means we
>> cannot vectorize LT, LE, GT, GE and LTGT when compiling for z13.
>> However, we cannot express this restriction today: the code only checks
>> whether vcond$a$b optab, which does not contain information about the
>> operation.
>> 
>> Introduce a hook that tells whether target supports certain vector
>> comparison operations with certain modes.
>> 
>> gcc/ChangeLog:
>> 
>> 2019-08-09  Ilya Leoshkevich  <iii@linux.ibm.com>
>> 
>> 	* doc/tm.texi (TARGET_VCOND_SUPPORTED_P): Document.
>> 	* doc/tm.texi.in (TARGET_VCOND_SUPPORTED_P): Document.
>> 	* optabs-tree.c (expand_vec_cond_expr_p): Use vcond_supported_p
>> 	in addition to get_vcond_icode.
>> 	* target.def (vcond_supported_p): New hook.
>> 	* targhooks.c (default_vcond_supported_p): Likewise.
>> 	* targhooks.h (default_vcond_supported_p): Likewise.
> 
> IMO it'd be cleaner to have a new optabs-query.[hc] helper that uses
> the predicate for operand 3 to test whether a particular comparison
> is supported.  I guess this would require a cached rtx to avoid
> generating too much garbage rtl though (via GTY((cache))).

How can I implement such a predicate?  Would calling maybe_gen_insn with
a fake rtx be reasonable?  In this case, what would be the best way to
generate fake input operands?  The existing code that calls
maybe_gen_insn gets the corresponding rtxes from upper layers.
Richard Sandiford Aug. 20, 2019, 11:54 a.m. UTC | #3
Ilya Leoshkevich <iii@linux.ibm.com> writes:
>> Am 20.08.2019 um 12:13 schrieb Richard Sandiford <richard.sandiford@arm.com>:
>> 
>> Ilya Leoshkevich <iii@linux.ibm.com> writes:
>>> z13 supports only non-signaling vector comparisons.  This means we
>>> cannot vectorize LT, LE, GT, GE and LTGT when compiling for z13.
>>> However, we cannot express this restriction today: the code only checks
>>> whether vcond$a$b optab, which does not contain information about the
>>> operation.
>>> 
>>> Introduce a hook that tells whether target supports certain vector
>>> comparison operations with certain modes.
>>> 
>>> gcc/ChangeLog:
>>> 
>>> 2019-08-09  Ilya Leoshkevich  <iii@linux.ibm.com>
>>> 
>>> 	* doc/tm.texi (TARGET_VCOND_SUPPORTED_P): Document.
>>> 	* doc/tm.texi.in (TARGET_VCOND_SUPPORTED_P): Document.
>>> 	* optabs-tree.c (expand_vec_cond_expr_p): Use vcond_supported_p
>>> 	in addition to get_vcond_icode.
>>> 	* target.def (vcond_supported_p): New hook.
>>> 	* targhooks.c (default_vcond_supported_p): Likewise.
>>> 	* targhooks.h (default_vcond_supported_p): Likewise.
>> 
>> IMO it'd be cleaner to have a new optabs-query.[hc] helper that uses
>> the predicate for operand 3 to test whether a particular comparison
>> is supported.  I guess this would require a cached rtx to avoid
>> generating too much garbage rtl though (via GTY((cache))).
>
> How can I implement such a predicate?  Would calling maybe_gen_insn with
> a fake rtx be reasonable?  In this case, what would be the best way to
> generate fake input operands?  The existing code that calls
> maybe_gen_insn gets the corresponding rtxes from upper layers.

I was thinking of something like optabs.c:can_compare_p, but with
some caching to reduce the overhead, and comparing registers rather
than constants.  E.g.:

static rtx cached_binop GTY ((cached));

rtx
get_cached_binop (machine_mode mode, rtx_code code, machine_mode op_mode)
{
  ...create or modify cached_binop, with register operands...;
  return cached_binop;
}

/* ...maybe a different name if the overload seems like a bad idea... */
bool
vcond_supported_p (machine_mode value_mode, rtx_code cmp,
		   machine_mode cmp_op_mode)
{
  bool unsigned = (cmp == LTU || cmp == LEU || cmp == GTU || cmp == GEU);
  insn_code icode = get_vcond_icode (value_mode, cmp_op_mode, unsigned);
  if (icode == CODE_FOR_nothing)
    return false;

  if (insn_operand_predicate_fn pred = insn_data[icode].operand[3].predicate)
    {
      machine_mode cmp_mode = insn_data[icode].operand[3].mode;
      rtx cmp = get_cached_binop (cmp_mode, cmp, cmp_op_mode);
      if (!pred (cmp, cmp_mode))
	return false;
    }

  return true;
}

(not attached to the names).

Then the target pattern would use a normal match_operator for operand 3,
with the predicate on the match_operand only allowing through the
comparisons it actually supports.

The get_cached_binop stuff makes it a bit cumbersome, but that part
would be reusable by other queries in future (and by can_compare_p
probably).

Thanks,
Richard
Ilya Leoshkevich Aug. 21, 2019, 11:43 a.m. UTC | #4
> Am 20.08.2019 um 13:54 schrieb Richard Sandiford <richard.sandiford@arm.com>:
> 
> Ilya Leoshkevich <iii@linux.ibm.com> writes:
>>> Am 20.08.2019 um 12:13 schrieb Richard Sandiford <richard.sandiford@arm.com>:
>>> 
>>> Ilya Leoshkevich <iii@linux.ibm.com> writes:
>>>> z13 supports only non-signaling vector comparisons.  This means we
>>>> cannot vectorize LT, LE, GT, GE and LTGT when compiling for z13.
>>>> However, we cannot express this restriction today: the code only checks
>>>> whether vcond$a$b optab, which does not contain information about the
>>>> operation.
>>>> 
>>>> Introduce a hook that tells whether target supports certain vector
>>>> comparison operations with certain modes.
>>>> 
>>>> gcc/ChangeLog:
>>>> 
>>>> 2019-08-09  Ilya Leoshkevich  <iii@linux.ibm.com>
>>>> 
>>>> 	* doc/tm.texi (TARGET_VCOND_SUPPORTED_P): Document.
>>>> 	* doc/tm.texi.in (TARGET_VCOND_SUPPORTED_P): Document.
>>>> 	* optabs-tree.c (expand_vec_cond_expr_p): Use vcond_supported_p
>>>> 	in addition to get_vcond_icode.
>>>> 	* target.def (vcond_supported_p): New hook.
>>>> 	* targhooks.c (default_vcond_supported_p): Likewise.
>>>> 	* targhooks.h (default_vcond_supported_p): Likewise.
>>> 
>>> IMO it'd be cleaner to have a new optabs-query.[hc] helper that uses
>>> the predicate for operand 3 to test whether a particular comparison
>>> is supported.  I guess this would require a cached rtx to avoid
>>> generating too much garbage rtl though (via GTY((cache))).
>> 
>> How can I implement such a predicate?  Would calling maybe_gen_insn with
>> a fake rtx be reasonable?  In this case, what would be the best way to
>> generate fake input operands?  The existing code that calls
>> maybe_gen_insn gets the corresponding rtxes from upper layers.
> 
> I was thinking of something like optabs.c:can_compare_p, but with
> some caching to reduce the overhead, and comparing registers rather
> than constants.  E.g.:
> 
> static rtx cached_binop GTY ((cached));
> 
> rtx
> get_cached_binop (machine_mode mode, rtx_code code, machine_mode op_mode)
> {
>  ...create or modify cached_binop, with register operands...;
>  return cached_binop;
> }

Thanks, I got it working!  Still need to run the regtest, but the result
looks promising.

I have a question about caching.  If I define just `static rtx
cached_binop`, like you suggest, I would expect to see constant cache
misses.  Shouldn't it be something like `hash_map<tuple<enum rtx_code,
machine_mode, machine_mode>, rtx>`?
diff mbox series

Patch

diff --git a/gcc/doc/tm.texi b/gcc/doc/tm.texi
index 4eadd0c9300..23ccce1d3f4 100644
--- a/gcc/doc/tm.texi
+++ b/gcc/doc/tm.texi
@@ -4326,6 +4326,15 @@  insns involving vector mode @var{mode}.  At the very least, it
 must have move patterns for this mode.
 @end deftypefn
 
+@deftypefn {Target Hook} bool TARGET_VCOND_SUPPORTED_P (machine_mode @var{result_mode}, machine_mode @var{op_mode}, int @var{code})
+Define this to restrict which vector comparison operations are supported by
+vcond$a$b expander.  An operation is represented by its operand machine mode
+@code{OP_MODE}, its result machine mode @code{RESULT_MODE} and @code{enum
+tree_code CODE}.  The main use of this hook is to support machines which
+provide only certain vector comparison instructions, e.g. only non-signaling
+ones.  The default is that all operations are supported.
+@end deftypefn
+
 @deftypefn {Target Hook} opt_machine_mode TARGET_ARRAY_MODE (machine_mode @var{mode}, unsigned HOST_WIDE_INT @var{nelems})
 Return the mode that GCC should use for an array that has
 @var{nelems} elements, with each element having mode @var{mode}.
diff --git a/gcc/doc/tm.texi.in b/gcc/doc/tm.texi.in
index 04f236f611e..a4399cd9d12 100644
--- a/gcc/doc/tm.texi.in
+++ b/gcc/doc/tm.texi.in
@@ -3366,6 +3366,8 @@  stack.
 
 @hook TARGET_VECTOR_MODE_SUPPORTED_P
 
+@hook TARGET_VCOND_SUPPORTED_P
+
 @hook TARGET_ARRAY_MODE
 
 @hook TARGET_ARRAY_MODE_SUPPORTED_P
diff --git a/gcc/optabs-tree.c b/gcc/optabs-tree.c
index 8157798cc71..de62e3d2765 100644
--- a/gcc/optabs-tree.c
+++ b/gcc/optabs-tree.c
@@ -347,8 +347,10 @@  expand_vec_cond_expr_p (tree value_type, tree cmp_op_type, enum tree_code code)
       || maybe_ne (GET_MODE_NUNITS (value_mode), GET_MODE_NUNITS (cmp_op_mode)))
     return false;
 
-  if (get_vcond_icode (TYPE_MODE (value_type), TYPE_MODE (cmp_op_type),
-		       TYPE_UNSIGNED (cmp_op_type)) == CODE_FOR_nothing
+  if (((get_vcond_icode (TYPE_MODE (value_type), TYPE_MODE (cmp_op_type),
+		         TYPE_UNSIGNED (cmp_op_type)) == CODE_FOR_nothing)
+       || !targetm.vcond_supported_p (TYPE_MODE (value_type),
+                                      TYPE_MODE (cmp_op_type), code))
       && ((code != EQ_EXPR && code != NE_EXPR)
 	  || get_vcond_eq_icode (TYPE_MODE (value_type),
 				 TYPE_MODE (cmp_op_type)) == CODE_FOR_nothing))
diff --git a/gcc/target.def b/gcc/target.def
index b2332d8215c..4cd2a964c0e 100644
--- a/gcc/target.def
+++ b/gcc/target.def
@@ -3379,6 +3379,17 @@  the vector element type.",
  HOST_WIDE_INT, (const_tree type),
  default_vector_alignment)
 
+DEFHOOK
+(vcond_supported_p,
+ "Define this to restrict which vector comparison operations are supported by\n\
+vcond$a$b expander.  An operation is represented by its operand machine mode\n\
+@code{OP_MODE}, its result machine mode @code{RESULT_MODE} and @code{enum\n\
+tree_code CODE}.  The main use of this hook is to support machines which\n\
+provide only certain vector comparison instructions, e.g. only non-signaling\n\
+ones.  The default is that all operations are supported.",
+ bool, (machine_mode result_mode, machine_mode op_mode, int code),
+ default_vcond_supported_p)
+
 DEFHOOK
 (array_mode,
  "Return the mode that GCC should use for an array that has\n\
diff --git a/gcc/targhooks.c b/gcc/targhooks.c
index 1d12ec54704..2b9a5d12bab 100644
--- a/gcc/targhooks.c
+++ b/gcc/targhooks.c
@@ -448,6 +448,18 @@  default_scalar_mode_supported_p (scalar_mode mode)
     }
 }
 
+/* Return true if vcond$a$b expander supports vector comparisons using the CODE
+   of type enum tree_code, in which the operands have machine mode OP_MODE and
+   the result has machine mode RESULT_MODE.  */
+
+bool
+default_vcond_supported_p (machine_mode result_mode ATTRIBUTE_UNUSED,
+                           machine_mode op_mode ATTRIBUTE_UNUSED,
+                           int code ATTRIBUTE_UNUSED)
+{
+  return true;
+}
+
 /* Return true if libgcc supports floating-point mode MODE (known to
    be supported as a scalar mode).  */
 
diff --git a/gcc/targhooks.h b/gcc/targhooks.h
index 20a6e79d2d2..771a0996567 100644
--- a/gcc/targhooks.h
+++ b/gcc/targhooks.h
@@ -76,6 +76,7 @@  extern tree default_mangle_assembler_name (const char *);
 
 extern machine_mode default_translate_mode_attribute (machine_mode);
 extern bool default_scalar_mode_supported_p (scalar_mode);
+extern bool default_vcond_supported_p (machine_mode, machine_mode, int);
 extern bool default_libgcc_floating_mode_supported_p (scalar_float_mode);
 extern opt_scalar_float_mode default_floatn_mode (int, bool);
 extern bool default_floatn_builtin_p (int);