Message ID | alpine.LNX.2.00.1010181209340.23074@zhemvz.fhfr.qr |
---|---|
State | New |
Headers | show |
Richard Guenther <rguenther@suse.de> wrote on 18/10/2010 12:17:16 PM: > On Sun, 17 Oct 2010, Ira Rosen wrote: > > > > > Hi, > > > > With this patch we choose operands' type as the type of the vector created > > for them in case when the operands are invariant variables. > > In these PRs we have two different operations, so it doesn't seem > > appropriate to choose the type according to operation. > > > > OK for trunk after bootstrap and testing complete on x86_64-suse-linux? > > Well, I now looked at the problem myself. The function seems to be > used to create vectors for operands which then get used to create > vectorized statements to produce their LHS (thus, much similar > to vect_get_vec_def_for_operand). It thus does not make much sense > to look at STMT_VINFO_VECTYPE at all (that's the vector type of > the LHS of the stmt with the operands). It should simply use > get_vectype_for_scalar_type always (or get_same_sized_vectype > with STMT_VINFO_VECTYPE as the vector type determining the size). > > So either > > Index: gcc/tree-vect-slp.c > =================================================================== > --- gcc/tree-vect-slp.c (revision 165610) > +++ gcc/tree-vect-slp.c (working copy) > @@ -1899,14 +1899,7 @@ vect_get_constant_vectors (slp_tree slp_ > else > constant_p = false; > > - /* For POINTER_PLUS_EXPR we use the type of the constant/invariant > itself. > - If OP is the first operand of POINTER_PLUS_EXPR, its type is the > type of > - the statement, so it's OK to use OP's type for both first and second > - operands. */ > - if (code == POINTER_PLUS_EXPR) > - vector_type = get_vectype_for_scalar_type (TREE_TYPE (op)); > - else > - vector_type = STMT_VINFO_VECTYPE (stmt_vinfo); > + vector_type = get_vectype_for_scalar_type (TREE_TYPE (op)); > > gcc_assert (vector_type); > nunits = TYPE_VECTOR_SUBPARTS (vector_type); > > or, to probably make AVX happier with float -> int conversion > > Index: gcc/tree-vect-slp.c > =================================================================== > --- gcc/tree-vect-slp.c (revision 165610) > +++ gcc/tree-vect-slp.c (working copy) > @@ -1899,14 +1899,8 @@ vect_get_constant_vectors (slp_tree slp_ > else > constant_p = false; > > - /* For POINTER_PLUS_EXPR we use the type of the constant/invariant > itself. > - If OP is the first operand of POINTER_PLUS_EXPR, its type is the > type of > - the statement, so it's OK to use OP's type for both first and second > - operands. */ > - if (code == POINTER_PLUS_EXPR) > - vector_type = get_vectype_for_scalar_type (TREE_TYPE (op)); > - else > - vector_type = STMT_VINFO_VECTYPE (stmt_vinfo); > + vector_type = get_same_sized_vectype (TREE_TYPE (op), > + STMT_VINFO_VECTYPE (stmt_vinfo)); > > gcc_assert (vector_type); > nunits = TYPE_VECTOR_SUBPARTS (vector_type); > > both variants pass the new testcases and the C vect testsuite for me. It will not work for pr45902.c on PowerPC (you first patch just reverts the patch that fixed PR 45902). For: short res[N]; short a[N]; int main1 () { int i; for (i = 0; i < N/4; i+=4) { res[i] = a[i] >> 8; res[i+1] = a[i+1] >> 8; res[i+2] = a[i+2] >> 8; res[i+3] = a[i+3] >> 8; } } it will create vector shift for vector short with {8, 8, 8, 8} instead of {8, 8, 8, 8, 8, 8, 8, 8}. Thanks, Ira > Unless we start being less conservative with constraining vector > sizes with AVX there is probably no functional difference, but for > consistency I'd prefer the second variant. > > Thanks, > Richard. > > > > Thanks, > > Ira > > > > ChangeLog: > > > > PR tree-optimization/46049 > > PR tree-optimization/46052 > > * tree-vect-slp.c (vect_get_constant_vectors): Use operands' > > type in case of invariant variables. > > > > testsuite/ChangeLog: > > > > PR tree-optimization/46049 > > PR tree-optimization/46052 > > * gcc.dg/vect/pr46052.c: New test. > > * gcc.dg/vect/pr46049.c: New test. > > > > > > Index: testsuite/gcc.dg/vect/pr46052.c > > =================================================================== > > --- testsuite/gcc.dg/vect/pr46052.c (revision 0) > > +++ testsuite/gcc.dg/vect/pr46052.c (revision 0) > > @@ -0,0 +1,33 @@ > > +/* { dg-do compile } */ > > + > > +int i; > > +int a[2]; > > + > > +static inline char bar (void) > > +{ > > + return i ? i : 1; > > +} > > + > > +void foo (int n) > > +{ > > + while (n--) > > + { > > + a[0] ^= bar (); > > + a[1] ^= bar (); > > + } > > +} > > + > > +static inline char bar1 (void) > > +{ > > +} > > + > > +void foo1 (int n) > > +{ > > + while (n--) > > + { > > + a[0] ^= bar1 (); > > + a[1] ^= bar1 (); > > + } > > +} > > + > > +/* { dg-final { cleanup-tree-dump "vect" } } */ > > Index: testsuite/gcc.dg/vect/pr46049.c > > =================================================================== > > --- testsuite/gcc.dg/vect/pr46049.c (revision 0) > > +++ testsuite/gcc.dg/vect/pr46049.c (revision 0) > > @@ -0,0 +1,21 @@ > > +/* { dg-do compile } */ > > + > > +typedef __INT16_TYPE__ int16_t; > > +typedef __INT32_TYPE__ int32_t; > > + > > +static inline int32_t bar (int16_t x, int16_t y) > > +{ > > + return x * y; > > +} > > + > > +void foo (int16_t i, int16_t *p, int16_t x) > > +{ > > + while (i--) > > + { > > + *p = bar (*p, x) >> 15; > > + p++; > > + *p = bar (*p, x) >> 15; > > + p++; > > + } > > +} > > +/* { dg-final { cleanup-tree-dump "vect" } } */ > > Index: tree-vect-slp.c > > =================================================================== > > --- tree-vect-slp.c (revision 165574) > > +++ tree-vect-slp.c (working copy) > > @@ -1902,8 +1902,8 @@ vect_get_constant_vectors (slp_tree slp_ > > /* For POINTER_PLUS_EXPR we use the type of the constant/invariant > > itself. > > If OP is the first operand of POINTER_PLUS_EXPR, its type is the type > > of > > the statement, so it's OK to use OP's type for both first and second > > - operands. */ > > - if (code == POINTER_PLUS_EXPR) > > + operands. We also use the type of OP if it's an invariant variable. > > */ > > + if (code == POINTER_PLUS_EXPR || !constant_p) > > vector_type = get_vectype_for_scalar_type (TREE_TYPE (op)); > > else > > vector_type = STMT_VINFO_VECTYPE (stmt_vinfo); > > > > > > -- > Richard Guenther <rguenther@suse.de> > Novell / SUSE Labs > SUSE LINUX Products GmbH - Nuernberg - AG Nuernberg - HRB 16746 - > GF: Markus Rex
On Mon, 18 Oct 2010, Ira Rosen wrote: > > > Richard Guenther <rguenther@suse.de> wrote on 18/10/2010 12:17:16 PM: > > > On Sun, 17 Oct 2010, Ira Rosen wrote: > > > > > > > > Hi, > > > > > > With this patch we choose operands' type as the type of the vector > created > > > for them in case when the operands are invariant variables. > > > In these PRs we have two different operations, so it doesn't seem > > > appropriate to choose the type according to operation. > > > > > > OK for trunk after bootstrap and testing complete on x86_64-suse-linux? > > > > Well, I now looked at the problem myself. The function seems to be > > used to create vectors for operands which then get used to create > > vectorized statements to produce their LHS (thus, much similar > > to vect_get_vec_def_for_operand). It thus does not make much sense > > to look at STMT_VINFO_VECTYPE at all (that's the vector type of > > the LHS of the stmt with the operands). It should simply use > > get_vectype_for_scalar_type always (or get_same_sized_vectype > > with STMT_VINFO_VECTYPE as the vector type determining the size). > > > > So either > > > > Index: gcc/tree-vect-slp.c > > =================================================================== > > --- gcc/tree-vect-slp.c (revision 165610) > > +++ gcc/tree-vect-slp.c (working copy) > > @@ -1899,14 +1899,7 @@ vect_get_constant_vectors (slp_tree slp_ > > else > > constant_p = false; > > > > - /* For POINTER_PLUS_EXPR we use the type of the constant/invariant > > itself. > > - If OP is the first operand of POINTER_PLUS_EXPR, its type is the > > type of > > - the statement, so it's OK to use OP's type for both first and > second > > - operands. */ > > - if (code == POINTER_PLUS_EXPR) > > - vector_type = get_vectype_for_scalar_type (TREE_TYPE (op)); > > - else > > - vector_type = STMT_VINFO_VECTYPE (stmt_vinfo); > > + vector_type = get_vectype_for_scalar_type (TREE_TYPE (op)); > > > > gcc_assert (vector_type); > > nunits = TYPE_VECTOR_SUBPARTS (vector_type); > > > > or, to probably make AVX happier with float -> int conversion > > > > Index: gcc/tree-vect-slp.c > > =================================================================== > > --- gcc/tree-vect-slp.c (revision 165610) > > +++ gcc/tree-vect-slp.c (working copy) > > @@ -1899,14 +1899,8 @@ vect_get_constant_vectors (slp_tree slp_ > > else > > constant_p = false; > > > > - /* For POINTER_PLUS_EXPR we use the type of the constant/invariant > > itself. > > - If OP is the first operand of POINTER_PLUS_EXPR, its type is the > > type of > > - the statement, so it's OK to use OP's type for both first and > second > > - operands. */ > > - if (code == POINTER_PLUS_EXPR) > > - vector_type = get_vectype_for_scalar_type (TREE_TYPE (op)); > > - else > > - vector_type = STMT_VINFO_VECTYPE (stmt_vinfo); > > + vector_type = get_same_sized_vectype (TREE_TYPE (op), > > + STMT_VINFO_VECTYPE (stmt_vinfo)); > > > > gcc_assert (vector_type); > > nunits = TYPE_VECTOR_SUBPARTS (vector_type); > > > > both variants pass the new testcases and the C vect testsuite for me. > > It will not work for pr45902.c on PowerPC (you first patch just reverts the > patch that fixed PR 45902). > > For: > > short res[N]; > short a[N]; > > int > main1 () > { > int i; > > for (i = 0; i < N/4; i+=4) > { > res[i] = a[i] >> 8; > res[i+1] = a[i+1] >> 8; > res[i+2] = a[i+2] >> 8; > res[i+3] = a[i+3] >> 8; > } > } > > it will create vector shift for vector short with {8, 8, 8, 8} instead of > {8, 8, 8, 8, 8, 8, 8, 8}. Hm, but then this is a problem of the caller, and likely non-SLP mode has the same issue (looking at vectorizable_operation I don't see any special casing handling it for non-SLP, it'll just call vect_get_vec_defs and then vect_get_vec_def_for_operand). I think that vectorizable_operation has to special case shifts (it already does for the vectorization test but not for code generation), which also means that it really asks for being separated out into a vectorizable_shift function ... Basically the if (op_type == binary_op && scalar_shift_arg) { case should also handle the variant where we need to extend the scalar shift arg to a vector. Richard. > Thanks, > Ira > > > > Unless we start being less conservative with constraining vector > > sizes with AVX there is probably no functional difference, but for > > consistency I'd prefer the second variant. > > > > Thanks, > > Richard. > > > > > > > Thanks, > > > Ira > > > > > > ChangeLog: > > > > > > PR tree-optimization/46049 > > > PR tree-optimization/46052 > > > * tree-vect-slp.c (vect_get_constant_vectors): Use operands' > > > type in case of invariant variables. > > > > > > testsuite/ChangeLog: > > > > > > PR tree-optimization/46049 > > > PR tree-optimization/46052 > > > * gcc.dg/vect/pr46052.c: New test. > > > * gcc.dg/vect/pr46049.c: New test. > > > > > > > > > Index: testsuite/gcc.dg/vect/pr46052.c > > > =================================================================== > > > --- testsuite/gcc.dg/vect/pr46052.c (revision 0) > > > +++ testsuite/gcc.dg/vect/pr46052.c (revision 0) > > > @@ -0,0 +1,33 @@ > > > +/* { dg-do compile } */ > > > + > > > +int i; > > > +int a[2]; > > > + > > > +static inline char bar (void) > > > +{ > > > + return i ? i : 1; > > > +} > > > + > > > +void foo (int n) > > > +{ > > > + while (n--) > > > + { > > > + a[0] ^= bar (); > > > + a[1] ^= bar (); > > > + } > > > +} > > > + > > > +static inline char bar1 (void) > > > +{ > > > +} > > > + > > > +void foo1 (int n) > > > +{ > > > + while (n--) > > > + { > > > + a[0] ^= bar1 (); > > > + a[1] ^= bar1 (); > > > + } > > > +} > > > + > > > +/* { dg-final { cleanup-tree-dump "vect" } } */ > > > Index: testsuite/gcc.dg/vect/pr46049.c > > > =================================================================== > > > --- testsuite/gcc.dg/vect/pr46049.c (revision 0) > > > +++ testsuite/gcc.dg/vect/pr46049.c (revision 0) > > > @@ -0,0 +1,21 @@ > > > +/* { dg-do compile } */ > > > + > > > +typedef __INT16_TYPE__ int16_t; > > > +typedef __INT32_TYPE__ int32_t; > > > + > > > +static inline int32_t bar (int16_t x, int16_t y) > > > +{ > > > + return x * y; > > > +} > > > + > > > +void foo (int16_t i, int16_t *p, int16_t x) > > > +{ > > > + while (i--) > > > + { > > > + *p = bar (*p, x) >> 15; > > > + p++; > > > + *p = bar (*p, x) >> 15; > > > + p++; > > > + } > > > +} > > > +/* { dg-final { cleanup-tree-dump "vect" } } */ > > > Index: tree-vect-slp.c > > > =================================================================== > > > --- tree-vect-slp.c (revision 165574) > > > +++ tree-vect-slp.c (working copy) > > > @@ -1902,8 +1902,8 @@ vect_get_constant_vectors (slp_tree slp_ > > > /* For POINTER_PLUS_EXPR we use the type of the constant/invariant > > > itself. > > > If OP is the first operand of POINTER_PLUS_EXPR, its type is the > type > > > of > > > the statement, so it's OK to use OP's type for both first and > second > > > - operands. */ > > > - if (code == POINTER_PLUS_EXPR) > > > + operands. We also use the type of OP if it's an invariant > variable. > > > */ > > > + if (code == POINTER_PLUS_EXPR || !constant_p) > > > vector_type = get_vectype_for_scalar_type (TREE_TYPE (op)); > > > else > > > vector_type = STMT_VINFO_VECTYPE (stmt_vinfo); > > > > > > > > > > -- > > Richard Guenther <rguenther@suse.de> > > Novell / SUSE Labs > > SUSE LINUX Products GmbH - Nuernberg - AG Nuernberg - HRB 16746 - > > GF: Markus Rex > >
Richard Guenther <rguenther@suse.de> wrote on 18/10/2010 12:48:15 PM: > > Hm, but then this is a problem of the caller, and likely non-SLP mode > has the same issue (looking at vectorizable_operation I don't see > any special casing handling it for non-SLP, it'll just call > vect_get_vec_defs and then vect_get_vec_def_for_operand). vect_get_vec_def_for_operand actually treats constants and invariants differently, using number of units of the stmt's vectype for constants and the type of the operand for invariants. > > I think that vectorizable_operation has to special case shifts > (it already does for the vectorization test but not for code generation), > which also means that it really asks for being separated out > into a vectorizable_shift function ... > > Basically the > > if (op_type == binary_op && scalar_shift_arg) > { > > case should also handle the variant where we need to extend > the scalar shift arg to a vector. But scalar_shift_arg indicates whether the shift argument is a scalar or a vector. How can we treat it as "we need to extend the scalar shift arg to a vector" as well? vectorizable_shift sounds good to me, I hope there are no other operations like that, allowing operands of different types. Thanks, Ira > > Richard. >
On Tue, 19 Oct 2010, Ira Rosen wrote: > > > Richard Guenther <rguenther@suse.de> wrote on 18/10/2010 12:48:15 PM: > > > > > Hm, but then this is a problem of the caller, and likely non-SLP mode > > has the same issue (looking at vectorizable_operation I don't see > > any special casing handling it for non-SLP, it'll just call > > vect_get_vec_defs and then vect_get_vec_def_for_operand). > > vect_get_vec_def_for_operand actually treats constants and invariants > differently, using number of units of the stmt's vectype for constants and > the type of the operand for invariants. Well, as far as I can see vect_is_simple_use will assign the operands defining statement definition to *def (which will be always op itself). So, case vect_constant_def: { vector_type = get_vectype_for_scalar_type (TREE_TYPE (op)); and case vect_external_def: { vector_type = get_vectype_for_scalar_type (TREE_TYPE (def)); are the same. > > I think that vectorizable_operation has to special case shifts > > (it already does for the vectorization test but not for code generation), > > which also means that it really asks for being separated out > > into a vectorizable_shift function ... > > > > Basically the > > > > if (op_type == binary_op && scalar_shift_arg) > > { > > > > case should also handle the variant where we need to extend > > the scalar shift arg to a vector. > > But scalar_shift_arg indicates whether the shift argument is a scalar or a > vector. How can we treat it as "we need to extend the scalar shift arg to a > vector" as well? If the target doesn't support a non-vector shift amount then we always need to promote it to a vector (and we know how many elements that will have from the other operands element count). > vectorizable_shift sounds good to me, I hope there are no other operations > like that, allowing operands of different types. I don't think so. Thanks, Richard. > Thanks, > Ira > > > > > Richard. > > > >
Richard Guenther <rguenther@suse.de> wrote on 19/10/2010 02:27:27 PM: > > > Hm, but then this is a problem of the caller, and likely non-SLP mode > > > has the same issue (looking at vectorizable_operation I don't see > > > any special casing handling it for non-SLP, it'll just call > > > vect_get_vec_defs and then vect_get_vec_def_for_operand). > > > > vect_get_vec_def_for_operand actually treats constants and invariants > > differently, using number of units of the stmt's vectype for constants and > > the type of the operand for invariants. > > Well, as far as I can see vect_is_simple_use will assign the operands > defining statement definition to *def (which will be always op itself). > So, > > case vect_constant_def: > { > vector_type = get_vectype_for_scalar_type (TREE_TYPE (op)); > > and > > case vect_external_def: > { > vector_type = get_vectype_for_scalar_type (TREE_TYPE (def)); > > are the same. Right, but the number of elements of created vector is determined by nunits, and for case vect_constant_def we use: tree vectype = STMT_VINFO_VECTYPE (stmt_vinfo); unsigned int nunits = TYPE_VECTOR_SUBPARTS (vectype); while case vect_external_def: { vector_type = get_vectype_for_scalar_type (TREE_TYPE (def)); gcc_assert (vector_type); nunits = TYPE_VECTOR_SUBPARTS (vector_type); > > > > I think that vectorizable_operation has to special case shifts > > > (it already does for the vectorization test but not for code generation), > > > which also means that it really asks for being separated out > > > into a vectorizable_shift function ... > > > > > > Basically the > > > > > > if (op_type == binary_op && scalar_shift_arg) > > > { > > > > > > case should also handle the variant where we need to extend > > > the scalar shift arg to a vector. > > > > But scalar_shift_arg indicates whether the shift argument is a scalar or a > > vector. How can we treat it as "we need to extend the scalar shift arg to a > > vector" as well? > > If the target doesn't support a non-vector shift amount then we always > need to promote it to a vector (and we know how many elements that > will have from the other operands element count). But in that case scalar_shift_arg is false. > > > vectorizable_shift sounds good to me, I hope there are no other operations > > like that, allowing operands of different types. > > I don't think so. OK, good. I'll prepare a patch then. Thanks, Ira > > Thanks, > Richard. > > > Thanks, > > Ira > > > > > > > > Richard. > > > > > > > > > -- > Richard Guenther <rguenther@suse.de> > Novell / SUSE Labs > SUSE LINUX Products GmbH - Nuernberg - AG Nuernberg - HRB 16746 - > GF: Markus Rex
On Tue, 19 Oct 2010, Ira Rosen wrote: > Richard Guenther <rguenther@suse.de> wrote on 19/10/2010 02:27:27 PM: > > > > > Hm, but then this is a problem of the caller, and likely non-SLP mode > > > > has the same issue (looking at vectorizable_operation I don't see > > > > any special casing handling it for non-SLP, it'll just call > > > > vect_get_vec_defs and then vect_get_vec_def_for_operand). > > > > > > vect_get_vec_def_for_operand actually treats constants and invariants > > > differently, using number of units of the stmt's vectype for constants > and > > > the type of the operand for invariants. > > > > Well, as far as I can see vect_is_simple_use will assign the operands > > defining statement definition to *def (which will be always op itself). > > So, > > > > case vect_constant_def: > > { > > vector_type = get_vectype_for_scalar_type (TREE_TYPE (op)); > > > > and > > > > case vect_external_def: > > { > > vector_type = get_vectype_for_scalar_type (TREE_TYPE (def)); > > > > are the same. > > Right, but the number of elements of created vector is determined by > nunits, and for case vect_constant_def we use: > > tree vectype = STMT_VINFO_VECTYPE (stmt_vinfo); > unsigned int nunits = TYPE_VECTOR_SUBPARTS (vectype); > > while > > case vect_external_def: > { > vector_type = get_vectype_for_scalar_type (TREE_TYPE (def)); > gcc_assert (vector_type); > nunits = TYPE_VECTOR_SUBPARTS (vector_type); Oh indeed. A strange inconsistency. But of course if nunits != TYPE_VECTOR_SUBPARTS (vector_type) then we're generating a bogus vector_cst anyway, so I'd argue the vect_external_def handling is correct and the vect_constant_def one wrong. > > > > I think that vectorizable_operation has to special case shifts > > > > (it already does for the vectorization test but not for code > generation), > > > > which also means that it really asks for being separated out > > > > into a vectorizable_shift function ... > > > > > > > > Basically the > > > > > > > > if (op_type == binary_op && scalar_shift_arg) > > > > { > > > > > > > > case should also handle the variant where we need to extend > > > > the scalar shift arg to a vector. > > > > > > But scalar_shift_arg indicates whether the shift argument is a scalar > or a > > > vector. How can we treat it as "we need to extend the scalar shift arg > to a > > > vector" as well? > > > > If the target doesn't support a non-vector shift amount then we always > > need to promote it to a vector (and we know how many elements that > > will have from the other operands element count). > > But in that case scalar_shift_arg is false. Oh, indeed. So we need to always promote the arg to a vector then (if !scalar_shift_arg). > > > > > vectorizable_shift sounds good to me, I hope there are no other > operations > > > like that, allowing operands of different types. > > > > I don't think so. > > OK, good. I'll prepare a patch then. Thanks, Richard.
Richard Guenther <rguenther@suse.de> wrote on 19/10/2010 02:59:10 PM: > > Richard Guenther <rguenther@suse.de> wrote on 19/10/2010 02:27:27 PM: > > > > > > > Hm, but then this is a problem of the caller, and likely non-SLP mode > > > > > has the same issue (looking at vectorizable_operation I don't see > > > > > any special casing handling it for non-SLP, it'll just call > > > > > vect_get_vec_defs and then vect_get_vec_def_for_operand). > > > > > > > > vect_get_vec_def_for_operand actually treats constants and invariants > > > > differently, using number of units of the stmt's vectype for constants > > and > > > > the type of the operand for invariants. > > > > > > Well, as far as I can see vect_is_simple_use will assign the operands > > > defining statement definition to *def (which will be always op itself). > > > So, > > > > > > case vect_constant_def: > > > { > > > vector_type = get_vectype_for_scalar_type (TREE_TYPE (op)); > > > > > > and > > > > > > case vect_external_def: > > > { > > > vector_type = get_vectype_for_scalar_type (TREE_TYPE (def)); > > > > > > are the same. > > > > Right, but the number of elements of created vector is determined by > > nunits, and for case vect_constant_def we use: > > > > tree vectype = STMT_VINFO_VECTYPE (stmt_vinfo); > > unsigned int nunits = TYPE_VECTOR_SUBPARTS (vectype); > > > > while > > > > case vect_external_def: > > { > > vector_type = get_vectype_for_scalar_type (TREE_TYPE (def)); > > gcc_assert (vector_type); > > nunits = TYPE_VECTOR_SUBPARTS (vector_type); > > Oh indeed. A strange inconsistency. But of course if > nunits != TYPE_VECTOR_SUBPARTS (vector_type) then we're generating > a bogus vector_cst anyway, so I'd argue the vect_external_def handling > is correct and the vect_constant_def one wrong. > > > > > > I think that vectorizable_operation has to special case shifts > > > > > (it already does for the vectorization test but not for code > > generation), > > > > > which also means that it really asks for being separated out > > > > > into a vectorizable_shift function ... > > > > > > > > > > Basically the > > > > > > > > > > if (op_type == binary_op && scalar_shift_arg) > > > > > { > > > > > > > > > > case should also handle the variant where we need to extend > > > > > the scalar shift arg to a vector. > > > > > > > > But scalar_shift_arg indicates whether the shift argument is a scalar > > or a > > > > vector. How can we treat it as "we need to extend the scalar shift arg > > to a > > > > vector" as well? > > > > > > If the target doesn't support a non-vector shift amount then we always > > > need to promote it to a vector (and we know how many elements that > > > will have from the other operands element count). > > > > But in that case scalar_shift_arg is false. > > Oh, indeed. So we need to always promote the arg to a vector then > (if !scalar_shift_arg). > > > > > > > > vectorizable_shift sounds good to me, I hope there are no other > > operations > > > > like that, allowing operands of different types. > > > > > > I don't think so. > > > > OK, good. I'll prepare a patch then. Now I see this code in vectorizable_operation: /* Unlike the other binary operators, shifts/rotates have the rhs being int, instead of the same type as the lhs, so make sure the scalar is the right type if we are dealing with vectors of short/char. */ if (dt[1] == vect_constant_def) op1 = fold_convert (TREE_TYPE (vectype), op1); op1 is passed to vect_get_vec_def_for_operand, but not to vect_get_constant_vectors, which explains the difference in behavior between regular vectorization and SLP. So, now I think that it will be enough to pass op1 to SLP functions in order to fix this. Thanks, Ira > > Thanks, > Richard.
On Tue, 19 Oct 2010, Ira Rosen wrote: > > > Richard Guenther <rguenther@suse.de> wrote on 19/10/2010 02:59:10 PM: > > > > Richard Guenther <rguenther@suse.de> wrote on 19/10/2010 02:27:27 PM: > > > > > > > > > Hm, but then this is a problem of the caller, and likely non-SLP > mode > > > > > > has the same issue (looking at vectorizable_operation I don't see > > > > > > any special casing handling it for non-SLP, it'll just call > > > > > > vect_get_vec_defs and then vect_get_vec_def_for_operand). > > > > > > > > > > vect_get_vec_def_for_operand actually treats constants and > invariants > > > > > differently, using number of units of the stmt's vectype for > constants > > > and > > > > > the type of the operand for invariants. > > > > > > > > Well, as far as I can see vect_is_simple_use will assign the operands > > > > defining statement definition to *def (which will be always op > itself). > > > > So, > > > > > > > > case vect_constant_def: > > > > { > > > > vector_type = get_vectype_for_scalar_type (TREE_TYPE (op)); > > > > > > > > and > > > > > > > > case vect_external_def: > > > > { > > > > vector_type = get_vectype_for_scalar_type (TREE_TYPE (def)); > > > > > > > > are the same. > > > > > > Right, but the number of elements of created vector is determined by > > > nunits, and for case vect_constant_def we use: > > > > > > tree vectype = STMT_VINFO_VECTYPE (stmt_vinfo); > > > unsigned int nunits = TYPE_VECTOR_SUBPARTS (vectype); > > > > > > while > > > > > > case vect_external_def: > > > { > > > vector_type = get_vectype_for_scalar_type (TREE_TYPE (def)); > > > gcc_assert (vector_type); > > > nunits = TYPE_VECTOR_SUBPARTS (vector_type); > > > > Oh indeed. A strange inconsistency. But of course if > > nunits != TYPE_VECTOR_SUBPARTS (vector_type) then we're generating > > a bogus vector_cst anyway, so I'd argue the vect_external_def handling > > is correct and the vect_constant_def one wrong. > > > > > > > > I think that vectorizable_operation has to special case shifts > > > > > > (it already does for the vectorization test but not for code > > > generation), > > > > > > which also means that it really asks for being separated out > > > > > > into a vectorizable_shift function ... > > > > > > > > > > > > Basically the > > > > > > > > > > > > if (op_type == binary_op && scalar_shift_arg) > > > > > > { > > > > > > > > > > > > case should also handle the variant where we need to extend > > > > > > the scalar shift arg to a vector. > > > > > > > > > > But scalar_shift_arg indicates whether the shift argument is a > scalar > > > or a > > > > > vector. How can we treat it as "we need to extend the scalar shift > arg > > > to a > > > > > vector" as well? > > > > > > > > If the target doesn't support a non-vector shift amount then we > always > > > > need to promote it to a vector (and we know how many elements that > > > > will have from the other operands element count). > > > > > > But in that case scalar_shift_arg is false. > > > > Oh, indeed. So we need to always promote the arg to a vector then > > (if !scalar_shift_arg). > > > > > > > > > > > vectorizable_shift sounds good to me, I hope there are no other > > > operations > > > > > like that, allowing operands of different types. > > > > > > > > I don't think so. > > > > > > OK, good. I'll prepare a patch then. > > Now I see this code in vectorizable_operation: > > /* Unlike the other binary operators, shifts/rotates have > the rhs being int, instead of the same type as the lhs, > so make sure the scalar is the right type if we are > dealing with vectors of short/char. */ > if (dt[1] == vect_constant_def) > op1 = fold_convert (TREE_TYPE (vectype), op1); > > op1 is passed to vect_get_vec_def_for_operand, but not to > vect_get_constant_vectors, which explains the difference in behavior > between regular vectorization and SLP. So, now I think that it will be > enough to pass op1 to SLP functions in order to fix this. Ick - what a twisted maze! ;) Richard.
Index: gcc/tree-vect-slp.c =================================================================== --- gcc/tree-vect-slp.c (revision 165610) +++ gcc/tree-vect-slp.c (working copy) @@ -1899,14 +1899,7 @@ vect_get_constant_vectors (slp_tree slp_ else constant_p = false; - /* For POINTER_PLUS_EXPR we use the type of the constant/invariant itself. - If OP is the first operand of POINTER_PLUS_EXPR, its type is the type of - the statement, so it's OK to use OP's type for both first and second - operands. */ - if (code == POINTER_PLUS_EXPR) - vector_type = get_vectype_for_scalar_type (TREE_TYPE (op)); - else - vector_type = STMT_VINFO_VECTYPE (stmt_vinfo); + vector_type = get_vectype_for_scalar_type (TREE_TYPE (op)); gcc_assert (vector_type); nunits = TYPE_VECTOR_SUBPARTS (vector_type); or, to probably make AVX happier with float -> int conversion Index: gcc/tree-vect-slp.c =================================================================== --- gcc/tree-vect-slp.c (revision 165610) +++ gcc/tree-vect-slp.c (working copy) @@ -1899,14 +1899,8 @@ vect_get_constant_vectors (slp_tree slp_ else constant_p = false; - /* For POINTER_PLUS_EXPR we use the type of the constant/invariant itself. - If OP is the first operand of POINTER_PLUS_EXPR, its type is the type of - the statement, so it's OK to use OP's type for both first and second - operands. */ - if (code == POINTER_PLUS_EXPR) - vector_type = get_vectype_for_scalar_type (TREE_TYPE (op)); - else - vector_type = STMT_VINFO_VECTYPE (stmt_vinfo); + vector_type = get_same_sized_vectype (TREE_TYPE (op), + STMT_VINFO_VECTYPE (stmt_vinfo));