Message ID | 87d0wof858.fsf@arm.com |
---|---|
State | New |
Headers | show |
Series | [9/n] PR85694: Add a vect_look_through_pattern helper | expand |
On Mon, Jun 18, 2018 at 5:04 PM Richard Sandiford <richard.sandiford@arm.com> wrote: > > When following the definitions of SSA names, some recognisers > already cope with statements that have been replaced by patterns. > This patch makes that happen automatically for users of > type_conversion_p and vect_get_internal_def. It also adds > a vect_look_through_pattern helper that can be used directly. > > The reason for doing this is that the main patch for PR85694 > makes over_widening handle more general cases. These over-widened > patterns can still be useful when matching later statements; > e.g. an overwidened MULT_EXPR could be the input to a DOT_PROD_EXPR. > > The patch doesn't do anything with the STMT_VINFO_IN_PATTERN_P checks > in vect_recog_over_widening_pattern or vect_recog_widen_shift_pattern > since later patches rewrite them anyway. > > Doing this fixed an XFAIL in vect-reduc-dot-u16b.c. > > Tested on aarch64-linux-gnu and x86_64-linux-gnu. OK to install? Hmm. It seems to me that *def_stmt for vect_is_simple_use should eventually be the pattern def given the vectype overload takes the vectype from the pattern def already but oddly enough the DEF_TYPE is taken from the non-pattern stmt. I wonder which callers look at def_stmt at all (and how...) I guess swapping the def_stmt and dt arguments and adding yet another overload to remove all unused &def_stmt args might this easier to review... So - I'm suggesting to change vect_is_simple_use. Otherwise it somehow looks inconsistent. Anyhow, I can of course reconsider. Thanks, Richard. > Richard > > > 2018-06-18 Richard Sandiford <richard.sandiford@arm.com> > > gcc/ > * tree-vectorizer.h (vect_look_through_pattern): New function. > * tree-vect-patterns.c (vect_get_internal_def): Use it. > (type_conversion_p): Likewise. > (vect_recog_rotate_pattern): Likewise. > (vect_recog_dot_prod_pattern): Check directly for a WIDEN_MULT_EXPR. > (vect_recog_vector_vector_shift_pattern): Don't check > STMT_VINFO_IN_PATTERN_P. > > gcc/testsuite/ > * gcc.dg/vect/vect-reduc-dot-u16b.c: Remove xfail and update the > test for vectorization along the lines described in the comment. > > Index: gcc/tree-vectorizer.h > =================================================================== > --- gcc/tree-vectorizer.h 2018-06-18 15:43:52.951038712 +0100 > +++ gcc/tree-vectorizer.h 2018-06-18 15:44:05.522927566 +0100 > @@ -1422,6 +1422,19 @@ vect_get_scalar_dr_size (struct data_ref > return tree_to_uhwi (TYPE_SIZE_UNIT (TREE_TYPE (DR_REF (dr)))); > } > > +/* STMT is a statement that we're thinking about vectorizing. > + If it has been replaced by a pattern statement, return that > + pattern statement, otherwise return STMT itself. */ > + > +inline gimple * > +vect_look_through_pattern (gimple *stmt) > +{ > + stmt_vec_info vinfo = vinfo_for_stmt (stmt); > + if (STMT_VINFO_IN_PATTERN_P (vinfo)) > + stmt = STMT_VINFO_RELATED_STMT (vinfo); > + return stmt; > +} > + > /* Source location */ > extern source_location vect_location; > > Index: gcc/tree-vect-patterns.c > =================================================================== > --- gcc/tree-vect-patterns.c 2018-06-18 15:43:52.951038712 +0100 > +++ gcc/tree-vect-patterns.c 2018-06-18 15:44:05.522927566 +0100 > @@ -164,7 +164,7 @@ vect_get_internal_def (vec_info *vinfo, > || dt != vect_internal_def) > return NULL; > > - return vinfo_for_stmt (def_stmt); > + return vinfo_for_stmt (vect_look_through_pattern (def_stmt)); > } > > /* Check whether NAME, an ssa-name used in USE_STMT, > @@ -195,12 +195,7 @@ type_conversion_p (tree name, gimple *us > return false; > > if (dt == vect_internal_def) > - { > - stmt_vec_info def_vinfo = vinfo_for_stmt (*def_stmt); > - if (STMT_VINFO_IN_PATTERN_P (def_vinfo)) > - return false; > - } > - > + *def_stmt = vect_look_through_pattern (*def_stmt); > if (!is_gimple_assign (*def_stmt)) > return false; > > @@ -384,20 +379,11 @@ vect_recog_dot_prod_pattern (vec<gimple > /* FORNOW. Can continue analyzing the def-use chain when this stmt in a phi > inside the loop (in case we are analyzing an outer-loop). */ > gassign *mult = dyn_cast <gassign *> (mult_vinfo->stmt); > - if (!mult || gimple_assign_rhs_code (mult) != MULT_EXPR) > + if (!mult) > return NULL; > - if (STMT_VINFO_IN_PATTERN_P (mult_vinfo)) > + if (gimple_assign_rhs_code (mult) == WIDEN_MULT_EXPR) > { > /* Has been detected as a widening multiplication? */ > - > - mult = dyn_cast <gassign *> (STMT_VINFO_RELATED_STMT (mult_vinfo)); > - if (!mult || gimple_assign_rhs_code (mult) != WIDEN_MULT_EXPR) > - return NULL; > - STMT_VINFO_PATTERN_DEF_SEQ (stmt_vinfo) > - = STMT_VINFO_PATTERN_DEF_SEQ (mult_vinfo); > - mult_vinfo = vinfo_for_stmt (mult); > - gcc_assert (mult_vinfo); > - gcc_assert (STMT_VINFO_DEF_TYPE (mult_vinfo) == vect_internal_def); > oprnd00 = gimple_assign_rhs1 (mult); > oprnd01 = gimple_assign_rhs2 (mult); > } > @@ -407,6 +393,9 @@ vect_recog_dot_prod_pattern (vec<gimple > gimple *def_stmt; > tree oprnd0, oprnd1; > > + if (gimple_assign_rhs_code (mult) != MULT_EXPR) > + return NULL; > + > oprnd0 = gimple_assign_rhs1 (mult); > oprnd1 = gimple_assign_rhs2 (mult); > if (!type_conversion_p (oprnd0, mult, true, &half_type0, &def_stmt, > @@ -1803,6 +1792,9 @@ vect_recog_rotate_pattern (vec<gimple *> > && dt != vect_external_def) > return NULL; > > + if (dt == vect_internal_def) > + def_stmt = vect_look_through_pattern (def_stmt); > + > vectype = get_vectype_for_scalar_type (type); > if (vectype == NULL_TREE) > return NULL; > @@ -2051,9 +2043,7 @@ vect_recog_vector_vector_shift_pattern ( > > tree def = NULL_TREE; > gassign *def_stmt = dyn_cast <gassign *> (def_vinfo->stmt); > - if (!STMT_VINFO_IN_PATTERN_P (def_vinfo) > - && def_stmt > - && gimple_assign_cast_p (def_stmt)) > + if (def_stmt && gimple_assign_cast_p (def_stmt)) > { > tree rhs1 = gimple_assign_rhs1 (def_stmt); > if (TYPE_MODE (TREE_TYPE (rhs1)) == TYPE_MODE (TREE_TYPE (oprnd0)) > Index: gcc/testsuite/gcc.dg/vect/vect-reduc-dot-u16b.c > =================================================================== > --- gcc/testsuite/gcc.dg/vect/vect-reduc-dot-u16b.c 2016-11-11 17:07:36.588798042 +0000 > +++ gcc/testsuite/gcc.dg/vect/vect-reduc-dot-u16b.c 2018-06-18 15:44:05.522927566 +0100 > @@ -10,11 +10,8 @@ #define DOT2 43680 > unsigned short X[N] __attribute__ ((__aligned__(__BIGGEST_ALIGNMENT__))); > unsigned short Y[N] __attribute__ ((__aligned__(__BIGGEST_ALIGNMENT__))); > > -/* short->int->int dot product. > - Currently not detected as a dot-product pattern: the multiplication > - promotes the ushorts to int, and then the product is promoted to unsigned > - int for the addition. Which results in an int->unsigned int cast, which > - since no bits are modified in the cast should be trivially vectorizable. */ > +/* ushort->int->uint dot product: the multiplication promotes the ushorts > + to int, and then the product is converted to uint for the addition. */ > __attribute__ ((noinline)) unsigned int > foo2(int len) { > int i; > @@ -47,12 +44,6 @@ int main (void) > return 0; > } > > -/* { dg-final { scan-tree-dump-times "vect_recog_dot_prod_pattern: detected" 1 "vect" { xfail *-*-* } } } */ > - > -/* Once the dot-product pattern is detected, we expect > - that loop to be vectorized on vect_udot_hi targets (targets that support > - dot-product of unsigned shorts) and targets that support widening multiplication. */ > -/* The induction loop in main is vectorized. */ > -/* { dg-final { scan-tree-dump-times "vectorized 1 loops" 2 "vect" { xfail *-*-* } } } */ > -/* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" { target vect_pack_trunc } } } */ > +/* { dg-final { scan-tree-dump-times "vect_recog_dot_prod_pattern: detected" 1 "vect" } } */ > > +/* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" { target { vect_pack_trunc || vect_udot_hi } } } } */
Richard Biener <richard.guenther@gmail.com> writes: > On Mon, Jun 18, 2018 at 5:04 PM Richard Sandiford > <richard.sandiford@arm.com> wrote: >> >> When following the definitions of SSA names, some recognisers >> already cope with statements that have been replaced by patterns. >> This patch makes that happen automatically for users of >> type_conversion_p and vect_get_internal_def. It also adds >> a vect_look_through_pattern helper that can be used directly. >> >> The reason for doing this is that the main patch for PR85694 >> makes over_widening handle more general cases. These over-widened >> patterns can still be useful when matching later statements; >> e.g. an overwidened MULT_EXPR could be the input to a DOT_PROD_EXPR. >> >> The patch doesn't do anything with the STMT_VINFO_IN_PATTERN_P checks >> in vect_recog_over_widening_pattern or vect_recog_widen_shift_pattern >> since later patches rewrite them anyway. >> >> Doing this fixed an XFAIL in vect-reduc-dot-u16b.c. >> >> Tested on aarch64-linux-gnu and x86_64-linux-gnu. OK to install? > > Hmm. It seems to me that *def_stmt for vect_is_simple_use should > eventually be the pattern def given the vectype overload takes the > vectype from the pattern def already but oddly enough the > DEF_TYPE is taken from the non-pattern stmt. > > I wonder which callers look at def_stmt at all (and how...) > > I guess swapping the def_stmt and dt arguments and adding yet another > overload to remove all unused &def_stmt args might this easier to review... > > So - I'm suggesting to change vect_is_simple_use. OK, I'll try that. Might end up being its own mini-series. :-) I'll post the next set of patches anyway since they don't depend on *how* this is fixed, just that it is. Thanks, Richard
Index: gcc/tree-vectorizer.h =================================================================== --- gcc/tree-vectorizer.h 2018-06-18 15:43:52.951038712 +0100 +++ gcc/tree-vectorizer.h 2018-06-18 15:44:05.522927566 +0100 @@ -1422,6 +1422,19 @@ vect_get_scalar_dr_size (struct data_ref return tree_to_uhwi (TYPE_SIZE_UNIT (TREE_TYPE (DR_REF (dr)))); } +/* STMT is a statement that we're thinking about vectorizing. + If it has been replaced by a pattern statement, return that + pattern statement, otherwise return STMT itself. */ + +inline gimple * +vect_look_through_pattern (gimple *stmt) +{ + stmt_vec_info vinfo = vinfo_for_stmt (stmt); + if (STMT_VINFO_IN_PATTERN_P (vinfo)) + stmt = STMT_VINFO_RELATED_STMT (vinfo); + return stmt; +} + /* Source location */ extern source_location vect_location; Index: gcc/tree-vect-patterns.c =================================================================== --- gcc/tree-vect-patterns.c 2018-06-18 15:43:52.951038712 +0100 +++ gcc/tree-vect-patterns.c 2018-06-18 15:44:05.522927566 +0100 @@ -164,7 +164,7 @@ vect_get_internal_def (vec_info *vinfo, || dt != vect_internal_def) return NULL; - return vinfo_for_stmt (def_stmt); + return vinfo_for_stmt (vect_look_through_pattern (def_stmt)); } /* Check whether NAME, an ssa-name used in USE_STMT, @@ -195,12 +195,7 @@ type_conversion_p (tree name, gimple *us return false; if (dt == vect_internal_def) - { - stmt_vec_info def_vinfo = vinfo_for_stmt (*def_stmt); - if (STMT_VINFO_IN_PATTERN_P (def_vinfo)) - return false; - } - + *def_stmt = vect_look_through_pattern (*def_stmt); if (!is_gimple_assign (*def_stmt)) return false; @@ -384,20 +379,11 @@ vect_recog_dot_prod_pattern (vec<gimple /* FORNOW. Can continue analyzing the def-use chain when this stmt in a phi inside the loop (in case we are analyzing an outer-loop). */ gassign *mult = dyn_cast <gassign *> (mult_vinfo->stmt); - if (!mult || gimple_assign_rhs_code (mult) != MULT_EXPR) + if (!mult) return NULL; - if (STMT_VINFO_IN_PATTERN_P (mult_vinfo)) + if (gimple_assign_rhs_code (mult) == WIDEN_MULT_EXPR) { /* Has been detected as a widening multiplication? */ - - mult = dyn_cast <gassign *> (STMT_VINFO_RELATED_STMT (mult_vinfo)); - if (!mult || gimple_assign_rhs_code (mult) != WIDEN_MULT_EXPR) - return NULL; - STMT_VINFO_PATTERN_DEF_SEQ (stmt_vinfo) - = STMT_VINFO_PATTERN_DEF_SEQ (mult_vinfo); - mult_vinfo = vinfo_for_stmt (mult); - gcc_assert (mult_vinfo); - gcc_assert (STMT_VINFO_DEF_TYPE (mult_vinfo) == vect_internal_def); oprnd00 = gimple_assign_rhs1 (mult); oprnd01 = gimple_assign_rhs2 (mult); } @@ -407,6 +393,9 @@ vect_recog_dot_prod_pattern (vec<gimple gimple *def_stmt; tree oprnd0, oprnd1; + if (gimple_assign_rhs_code (mult) != MULT_EXPR) + return NULL; + oprnd0 = gimple_assign_rhs1 (mult); oprnd1 = gimple_assign_rhs2 (mult); if (!type_conversion_p (oprnd0, mult, true, &half_type0, &def_stmt, @@ -1803,6 +1792,9 @@ vect_recog_rotate_pattern (vec<gimple *> && dt != vect_external_def) return NULL; + if (dt == vect_internal_def) + def_stmt = vect_look_through_pattern (def_stmt); + vectype = get_vectype_for_scalar_type (type); if (vectype == NULL_TREE) return NULL; @@ -2051,9 +2043,7 @@ vect_recog_vector_vector_shift_pattern ( tree def = NULL_TREE; gassign *def_stmt = dyn_cast <gassign *> (def_vinfo->stmt); - if (!STMT_VINFO_IN_PATTERN_P (def_vinfo) - && def_stmt - && gimple_assign_cast_p (def_stmt)) + if (def_stmt && gimple_assign_cast_p (def_stmt)) { tree rhs1 = gimple_assign_rhs1 (def_stmt); if (TYPE_MODE (TREE_TYPE (rhs1)) == TYPE_MODE (TREE_TYPE (oprnd0)) Index: gcc/testsuite/gcc.dg/vect/vect-reduc-dot-u16b.c =================================================================== --- gcc/testsuite/gcc.dg/vect/vect-reduc-dot-u16b.c 2016-11-11 17:07:36.588798042 +0000 +++ gcc/testsuite/gcc.dg/vect/vect-reduc-dot-u16b.c 2018-06-18 15:44:05.522927566 +0100 @@ -10,11 +10,8 @@ #define DOT2 43680 unsigned short X[N] __attribute__ ((__aligned__(__BIGGEST_ALIGNMENT__))); unsigned short Y[N] __attribute__ ((__aligned__(__BIGGEST_ALIGNMENT__))); -/* short->int->int dot product. - Currently not detected as a dot-product pattern: the multiplication - promotes the ushorts to int, and then the product is promoted to unsigned - int for the addition. Which results in an int->unsigned int cast, which - since no bits are modified in the cast should be trivially vectorizable. */ +/* ushort->int->uint dot product: the multiplication promotes the ushorts + to int, and then the product is converted to uint for the addition. */ __attribute__ ((noinline)) unsigned int foo2(int len) { int i; @@ -47,12 +44,6 @@ int main (void) return 0; } -/* { dg-final { scan-tree-dump-times "vect_recog_dot_prod_pattern: detected" 1 "vect" { xfail *-*-* } } } */ - -/* Once the dot-product pattern is detected, we expect - that loop to be vectorized on vect_udot_hi targets (targets that support - dot-product of unsigned shorts) and targets that support widening multiplication. */ -/* The induction loop in main is vectorized. */ -/* { dg-final { scan-tree-dump-times "vectorized 1 loops" 2 "vect" { xfail *-*-* } } } */ -/* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" { target vect_pack_trunc } } } */ +/* { dg-final { scan-tree-dump-times "vect_recog_dot_prod_pattern: detected" 1 "vect" } } */ +/* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" { target { vect_pack_trunc || vect_udot_hi } } } } */