diff mbox series

[9/n] PR85694: Add a vect_look_through_pattern helper

Message ID 87d0wof858.fsf@arm.com
State New
Headers show
Series [9/n] PR85694: Add a vect_look_through_pattern helper | expand

Commit Message

Richard Sandiford June 18, 2018, 3:04 p.m. UTC
When following the definitions of SSA names, some recognisers
already cope with statements that have been replaced by patterns.
This patch makes that happen automatically for users of
type_conversion_p and vect_get_internal_def.  It also adds
a vect_look_through_pattern helper that can be used directly.

The reason for doing this is that the main patch for PR85694
makes over_widening handle more general cases.  These over-widened
patterns can still be useful when matching later statements;
e.g. an overwidened MULT_EXPR could be the input to a DOT_PROD_EXPR.

The patch doesn't do anything with the STMT_VINFO_IN_PATTERN_P checks
in vect_recog_over_widening_pattern or vect_recog_widen_shift_pattern
since later patches rewrite them anyway.

Doing this fixed an XFAIL in vect-reduc-dot-u16b.c.

Tested on aarch64-linux-gnu and x86_64-linux-gnu.  OK to install?

Richard


2018-06-18  Richard Sandiford  <richard.sandiford@arm.com>

gcc/
	* tree-vectorizer.h (vect_look_through_pattern): New function.
	* tree-vect-patterns.c (vect_get_internal_def): Use it.
	(type_conversion_p): Likewise.
	(vect_recog_rotate_pattern): Likewise.
	(vect_recog_dot_prod_pattern): Check directly for a WIDEN_MULT_EXPR.
	(vect_recog_vector_vector_shift_pattern): Don't check
	STMT_VINFO_IN_PATTERN_P.

gcc/testsuite/
	* gcc.dg/vect/vect-reduc-dot-u16b.c: Remove xfail and update the
	test for vectorization along the lines described in the comment.

Comments

Richard Biener June 19, 2018, 10:11 a.m. UTC | #1
On Mon, Jun 18, 2018 at 5:04 PM Richard Sandiford
<richard.sandiford@arm.com> wrote:
>
> When following the definitions of SSA names, some recognisers
> already cope with statements that have been replaced by patterns.
> This patch makes that happen automatically for users of
> type_conversion_p and vect_get_internal_def.  It also adds
> a vect_look_through_pattern helper that can be used directly.
>
> The reason for doing this is that the main patch for PR85694
> makes over_widening handle more general cases.  These over-widened
> patterns can still be useful when matching later statements;
> e.g. an overwidened MULT_EXPR could be the input to a DOT_PROD_EXPR.
>
> The patch doesn't do anything with the STMT_VINFO_IN_PATTERN_P checks
> in vect_recog_over_widening_pattern or vect_recog_widen_shift_pattern
> since later patches rewrite them anyway.
>
> Doing this fixed an XFAIL in vect-reduc-dot-u16b.c.
>
> Tested on aarch64-linux-gnu and x86_64-linux-gnu.  OK to install?

Hmm.  It seems to me that *def_stmt for vect_is_simple_use should
eventually be the pattern def given the vectype overload takes the
vectype from the pattern def already but oddly enough the
DEF_TYPE is taken from the non-pattern stmt.

I wonder which callers look at def_stmt at all (and how...)

I guess swapping the def_stmt and dt arguments and adding yet another
overload to remove all unused &def_stmt args might this easier to review...

So - I'm suggesting to change vect_is_simple_use.

Otherwise it somehow looks inconsistent.

Anyhow, I can of course reconsider.

Thanks,
Richard.

> Richard
>
>
> 2018-06-18  Richard Sandiford  <richard.sandiford@arm.com>
>
> gcc/
>         * tree-vectorizer.h (vect_look_through_pattern): New function.
>         * tree-vect-patterns.c (vect_get_internal_def): Use it.
>         (type_conversion_p): Likewise.
>         (vect_recog_rotate_pattern): Likewise.
>         (vect_recog_dot_prod_pattern): Check directly for a WIDEN_MULT_EXPR.
>         (vect_recog_vector_vector_shift_pattern): Don't check
>         STMT_VINFO_IN_PATTERN_P.
>
> gcc/testsuite/
>         * gcc.dg/vect/vect-reduc-dot-u16b.c: Remove xfail and update the
>         test for vectorization along the lines described in the comment.
>
> Index: gcc/tree-vectorizer.h
> ===================================================================
> --- gcc/tree-vectorizer.h       2018-06-18 15:43:52.951038712 +0100
> +++ gcc/tree-vectorizer.h       2018-06-18 15:44:05.522927566 +0100
> @@ -1422,6 +1422,19 @@ vect_get_scalar_dr_size (struct data_ref
>    return tree_to_uhwi (TYPE_SIZE_UNIT (TREE_TYPE (DR_REF (dr))));
>  }
>
> +/* STMT is a statement that we're thinking about vectorizing.
> +   If it has been replaced by a pattern statement, return that
> +   pattern statement, otherwise return STMT itself.  */
> +
> +inline gimple *
> +vect_look_through_pattern (gimple *stmt)
> +{
> +  stmt_vec_info vinfo = vinfo_for_stmt (stmt);
> +  if (STMT_VINFO_IN_PATTERN_P (vinfo))
> +    stmt = STMT_VINFO_RELATED_STMT (vinfo);
> +  return stmt;
> +}
> +
>  /* Source location */
>  extern source_location vect_location;
>
> Index: gcc/tree-vect-patterns.c
> ===================================================================
> --- gcc/tree-vect-patterns.c    2018-06-18 15:43:52.951038712 +0100
> +++ gcc/tree-vect-patterns.c    2018-06-18 15:44:05.522927566 +0100
> @@ -164,7 +164,7 @@ vect_get_internal_def (vec_info *vinfo,
>        || dt != vect_internal_def)
>      return NULL;
>
> -  return vinfo_for_stmt (def_stmt);
> +  return vinfo_for_stmt (vect_look_through_pattern (def_stmt));
>  }
>
>  /* Check whether NAME, an ssa-name used in USE_STMT,
> @@ -195,12 +195,7 @@ type_conversion_p (tree name, gimple *us
>      return false;
>
>    if (dt == vect_internal_def)
> -    {
> -      stmt_vec_info def_vinfo = vinfo_for_stmt (*def_stmt);
> -      if (STMT_VINFO_IN_PATTERN_P (def_vinfo))
> -       return false;
> -    }
> -
> +    *def_stmt = vect_look_through_pattern (*def_stmt);
>    if (!is_gimple_assign (*def_stmt))
>      return false;
>
> @@ -384,20 +379,11 @@ vect_recog_dot_prod_pattern (vec<gimple
>    /* FORNOW.  Can continue analyzing the def-use chain when this stmt in a phi
>       inside the loop (in case we are analyzing an outer-loop).  */
>    gassign *mult = dyn_cast <gassign *> (mult_vinfo->stmt);
> -  if (!mult || gimple_assign_rhs_code (mult) != MULT_EXPR)
> +  if (!mult)
>      return NULL;
> -  if (STMT_VINFO_IN_PATTERN_P (mult_vinfo))
> +  if (gimple_assign_rhs_code (mult) == WIDEN_MULT_EXPR)
>      {
>        /* Has been detected as a widening multiplication?  */
> -
> -      mult = dyn_cast <gassign *> (STMT_VINFO_RELATED_STMT (mult_vinfo));
> -      if (!mult || gimple_assign_rhs_code (mult) != WIDEN_MULT_EXPR)
> -        return NULL;
> -      STMT_VINFO_PATTERN_DEF_SEQ (stmt_vinfo)
> -       = STMT_VINFO_PATTERN_DEF_SEQ (mult_vinfo);
> -      mult_vinfo = vinfo_for_stmt (mult);
> -      gcc_assert (mult_vinfo);
> -      gcc_assert (STMT_VINFO_DEF_TYPE (mult_vinfo) == vect_internal_def);
>        oprnd00 = gimple_assign_rhs1 (mult);
>        oprnd01 = gimple_assign_rhs2 (mult);
>      }
> @@ -407,6 +393,9 @@ vect_recog_dot_prod_pattern (vec<gimple
>        gimple *def_stmt;
>        tree oprnd0, oprnd1;
>
> +      if (gimple_assign_rhs_code (mult) != MULT_EXPR)
> +       return NULL;
> +
>        oprnd0 = gimple_assign_rhs1 (mult);
>        oprnd1 = gimple_assign_rhs2 (mult);
>        if (!type_conversion_p (oprnd0, mult, true, &half_type0, &def_stmt,
> @@ -1803,6 +1792,9 @@ vect_recog_rotate_pattern (vec<gimple *>
>        && dt != vect_external_def)
>      return NULL;
>
> +  if (dt == vect_internal_def)
> +    def_stmt = vect_look_through_pattern (def_stmt);
> +
>    vectype = get_vectype_for_scalar_type (type);
>    if (vectype == NULL_TREE)
>      return NULL;
> @@ -2051,9 +2043,7 @@ vect_recog_vector_vector_shift_pattern (
>
>    tree def = NULL_TREE;
>    gassign *def_stmt = dyn_cast <gassign *> (def_vinfo->stmt);
> -  if (!STMT_VINFO_IN_PATTERN_P (def_vinfo)
> -      && def_stmt
> -      && gimple_assign_cast_p (def_stmt))
> +  if (def_stmt && gimple_assign_cast_p (def_stmt))
>      {
>        tree rhs1 = gimple_assign_rhs1 (def_stmt);
>        if (TYPE_MODE (TREE_TYPE (rhs1)) == TYPE_MODE (TREE_TYPE (oprnd0))
> Index: gcc/testsuite/gcc.dg/vect/vect-reduc-dot-u16b.c
> ===================================================================
> --- gcc/testsuite/gcc.dg/vect/vect-reduc-dot-u16b.c     2016-11-11 17:07:36.588798042 +0000
> +++ gcc/testsuite/gcc.dg/vect/vect-reduc-dot-u16b.c     2018-06-18 15:44:05.522927566 +0100
> @@ -10,11 +10,8 @@ #define DOT2 43680
>  unsigned short X[N] __attribute__ ((__aligned__(__BIGGEST_ALIGNMENT__)));
>  unsigned short Y[N] __attribute__ ((__aligned__(__BIGGEST_ALIGNMENT__)));
>
> -/* short->int->int dot product.
> -   Currently not detected as a dot-product pattern: the multiplication
> -   promotes the ushorts to int, and then the product is promoted to unsigned
> -   int for the addition.  Which results in an int->unsigned int cast, which
> -   since no bits are modified in the cast should be trivially vectorizable.  */
> +/* ushort->int->uint dot product: the multiplication promotes the ushorts
> +   to int, and then the product is converted to uint for the addition.  */
>  __attribute__ ((noinline)) unsigned int
>  foo2(int len) {
>    int i;
> @@ -47,12 +44,6 @@ int main (void)
>    return 0;
>  }
>
> -/* { dg-final { scan-tree-dump-times "vect_recog_dot_prod_pattern: detected" 1 "vect" { xfail *-*-* } } } */
> -
> -/* Once the dot-product pattern is detected, we expect
> -   that loop to be vectorized on vect_udot_hi targets (targets that support
> -   dot-product of unsigned shorts) and targets that support widening multiplication.  */
> -/* The induction loop in main is vectorized.  */
> -/* { dg-final { scan-tree-dump-times "vectorized 1 loops" 2 "vect" { xfail *-*-* } } } */
> -/* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" { target vect_pack_trunc } } } */
> +/* { dg-final { scan-tree-dump-times "vect_recog_dot_prod_pattern: detected" 1 "vect" } } */
>
> +/* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" { target { vect_pack_trunc || vect_udot_hi } } } } */
Richard Sandiford June 20, 2018, 10:24 a.m. UTC | #2
Richard Biener <richard.guenther@gmail.com> writes:
> On Mon, Jun 18, 2018 at 5:04 PM Richard Sandiford
> <richard.sandiford@arm.com> wrote:
>>
>> When following the definitions of SSA names, some recognisers
>> already cope with statements that have been replaced by patterns.
>> This patch makes that happen automatically for users of
>> type_conversion_p and vect_get_internal_def.  It also adds
>> a vect_look_through_pattern helper that can be used directly.
>>
>> The reason for doing this is that the main patch for PR85694
>> makes over_widening handle more general cases.  These over-widened
>> patterns can still be useful when matching later statements;
>> e.g. an overwidened MULT_EXPR could be the input to a DOT_PROD_EXPR.
>>
>> The patch doesn't do anything with the STMT_VINFO_IN_PATTERN_P checks
>> in vect_recog_over_widening_pattern or vect_recog_widen_shift_pattern
>> since later patches rewrite them anyway.
>>
>> Doing this fixed an XFAIL in vect-reduc-dot-u16b.c.
>>
>> Tested on aarch64-linux-gnu and x86_64-linux-gnu.  OK to install?
>
> Hmm.  It seems to me that *def_stmt for vect_is_simple_use should
> eventually be the pattern def given the vectype overload takes the
> vectype from the pattern def already but oddly enough the
> DEF_TYPE is taken from the non-pattern stmt.
>
> I wonder which callers look at def_stmt at all (and how...)
>
> I guess swapping the def_stmt and dt arguments and adding yet another
> overload to remove all unused &def_stmt args might this easier to review...
>
> So - I'm suggesting to change vect_is_simple_use.

OK, I'll try that.  Might end up being its own mini-series. :-)

I'll post the next set of patches anyway since they don't depend
on *how* this is fixed, just that it is.

Thanks,
Richard
diff mbox series

Patch

Index: gcc/tree-vectorizer.h
===================================================================
--- gcc/tree-vectorizer.h	2018-06-18 15:43:52.951038712 +0100
+++ gcc/tree-vectorizer.h	2018-06-18 15:44:05.522927566 +0100
@@ -1422,6 +1422,19 @@  vect_get_scalar_dr_size (struct data_ref
   return tree_to_uhwi (TYPE_SIZE_UNIT (TREE_TYPE (DR_REF (dr))));
 }
 
+/* STMT is a statement that we're thinking about vectorizing.
+   If it has been replaced by a pattern statement, return that
+   pattern statement, otherwise return STMT itself.  */
+
+inline gimple *
+vect_look_through_pattern (gimple *stmt)
+{
+  stmt_vec_info vinfo = vinfo_for_stmt (stmt);
+  if (STMT_VINFO_IN_PATTERN_P (vinfo))
+    stmt = STMT_VINFO_RELATED_STMT (vinfo);
+  return stmt;
+}
+
 /* Source location */
 extern source_location vect_location;
 
Index: gcc/tree-vect-patterns.c
===================================================================
--- gcc/tree-vect-patterns.c	2018-06-18 15:43:52.951038712 +0100
+++ gcc/tree-vect-patterns.c	2018-06-18 15:44:05.522927566 +0100
@@ -164,7 +164,7 @@  vect_get_internal_def (vec_info *vinfo,
       || dt != vect_internal_def)
     return NULL;
 
-  return vinfo_for_stmt (def_stmt);
+  return vinfo_for_stmt (vect_look_through_pattern (def_stmt));
 }
 
 /* Check whether NAME, an ssa-name used in USE_STMT,
@@ -195,12 +195,7 @@  type_conversion_p (tree name, gimple *us
     return false;
 
   if (dt == vect_internal_def)
-    {
-      stmt_vec_info def_vinfo = vinfo_for_stmt (*def_stmt);
-      if (STMT_VINFO_IN_PATTERN_P (def_vinfo))
-	return false;
-    }
-
+    *def_stmt = vect_look_through_pattern (*def_stmt);
   if (!is_gimple_assign (*def_stmt))
     return false;
 
@@ -384,20 +379,11 @@  vect_recog_dot_prod_pattern (vec<gimple
   /* FORNOW.  Can continue analyzing the def-use chain when this stmt in a phi
      inside the loop (in case we are analyzing an outer-loop).  */
   gassign *mult = dyn_cast <gassign *> (mult_vinfo->stmt);
-  if (!mult || gimple_assign_rhs_code (mult) != MULT_EXPR)
+  if (!mult)
     return NULL;
-  if (STMT_VINFO_IN_PATTERN_P (mult_vinfo))
+  if (gimple_assign_rhs_code (mult) == WIDEN_MULT_EXPR)
     {
       /* Has been detected as a widening multiplication?  */
-
-      mult = dyn_cast <gassign *> (STMT_VINFO_RELATED_STMT (mult_vinfo));
-      if (!mult || gimple_assign_rhs_code (mult) != WIDEN_MULT_EXPR)
-        return NULL;
-      STMT_VINFO_PATTERN_DEF_SEQ (stmt_vinfo)
-	= STMT_VINFO_PATTERN_DEF_SEQ (mult_vinfo);
-      mult_vinfo = vinfo_for_stmt (mult);
-      gcc_assert (mult_vinfo);
-      gcc_assert (STMT_VINFO_DEF_TYPE (mult_vinfo) == vect_internal_def);
       oprnd00 = gimple_assign_rhs1 (mult);
       oprnd01 = gimple_assign_rhs2 (mult);
     }
@@ -407,6 +393,9 @@  vect_recog_dot_prod_pattern (vec<gimple
       gimple *def_stmt;
       tree oprnd0, oprnd1;
 
+      if (gimple_assign_rhs_code (mult) != MULT_EXPR)
+	return NULL;
+
       oprnd0 = gimple_assign_rhs1 (mult);
       oprnd1 = gimple_assign_rhs2 (mult);
       if (!type_conversion_p (oprnd0, mult, true, &half_type0, &def_stmt,
@@ -1803,6 +1792,9 @@  vect_recog_rotate_pattern (vec<gimple *>
       && dt != vect_external_def)
     return NULL;
 
+  if (dt == vect_internal_def)
+    def_stmt = vect_look_through_pattern (def_stmt);
+
   vectype = get_vectype_for_scalar_type (type);
   if (vectype == NULL_TREE)
     return NULL;
@@ -2051,9 +2043,7 @@  vect_recog_vector_vector_shift_pattern (
 
   tree def = NULL_TREE;
   gassign *def_stmt = dyn_cast <gassign *> (def_vinfo->stmt);
-  if (!STMT_VINFO_IN_PATTERN_P (def_vinfo)
-      && def_stmt
-      && gimple_assign_cast_p (def_stmt))
+  if (def_stmt && gimple_assign_cast_p (def_stmt))
     {
       tree rhs1 = gimple_assign_rhs1 (def_stmt);
       if (TYPE_MODE (TREE_TYPE (rhs1)) == TYPE_MODE (TREE_TYPE (oprnd0))
Index: gcc/testsuite/gcc.dg/vect/vect-reduc-dot-u16b.c
===================================================================
--- gcc/testsuite/gcc.dg/vect/vect-reduc-dot-u16b.c	2016-11-11 17:07:36.588798042 +0000
+++ gcc/testsuite/gcc.dg/vect/vect-reduc-dot-u16b.c	2018-06-18 15:44:05.522927566 +0100
@@ -10,11 +10,8 @@  #define DOT2 43680
 unsigned short X[N] __attribute__ ((__aligned__(__BIGGEST_ALIGNMENT__)));
 unsigned short Y[N] __attribute__ ((__aligned__(__BIGGEST_ALIGNMENT__)));
 
-/* short->int->int dot product. 
-   Currently not detected as a dot-product pattern: the multiplication 
-   promotes the ushorts to int, and then the product is promoted to unsigned 
-   int for the addition.  Which results in an int->unsigned int cast, which 
-   since no bits are modified in the cast should be trivially vectorizable.  */
+/* ushort->int->uint dot product: the multiplication promotes the ushorts
+   to int, and then the product is converted to uint for the addition.  */
 __attribute__ ((noinline)) unsigned int
 foo2(int len) {
   int i;
@@ -47,12 +44,6 @@  int main (void)
   return 0;
 }
 
-/* { dg-final { scan-tree-dump-times "vect_recog_dot_prod_pattern: detected" 1 "vect" { xfail *-*-* } } } */
-
-/* Once the dot-product pattern is detected, we expect
-   that loop to be vectorized on vect_udot_hi targets (targets that support 
-   dot-product of unsigned shorts) and targets that support widening multiplication.  */
-/* The induction loop in main is vectorized.  */
-/* { dg-final { scan-tree-dump-times "vectorized 1 loops" 2 "vect" { xfail *-*-* } } } */ 
-/* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" { target vect_pack_trunc } } } */ 
+/* { dg-final { scan-tree-dump-times "vect_recog_dot_prod_pattern: detected" 1 "vect" } } */
 
+/* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" { target { vect_pack_trunc || vect_udot_hi } } } } */