Patchwork Fix PR tree-optimization/46049 and tree-optimization/46052

login
register
mail settings
Submitter Richard Guenther
Date Oct. 18, 2010, 10:17 a.m.
Message ID <alpine.LNX.2.00.1010181209340.23074@zhemvz.fhfr.qr>
Download mbox | patch
Permalink /patch/68163/
State New
Headers show

Comments

Richard Guenther - Oct. 18, 2010, 10:17 a.m.
On Sun, 17 Oct 2010, Ira Rosen wrote:

> 
> Hi,
> 
> With this patch we choose operands' type as the type of the vector created
> for them in case when the operands are invariant variables.
> In these PRs we have two different operations, so it doesn't seem
> appropriate to choose the type according to operation.
> 
> OK for trunk after bootstrap and testing complete on x86_64-suse-linux?

Well, I now looked at the problem myself.  The function seems to be
used to create vectors for operands which then get used to create
vectorized statements to produce their LHS (thus, much similar
to vect_get_vec_def_for_operand).  It thus does not make much sense
to look at STMT_VINFO_VECTYPE at all (that's the vector type of
the LHS of the stmt with the operands).  It should simply use
get_vectype_for_scalar_type always (or get_same_sized_vectype
with STMT_VINFO_VECTYPE as the vector type determining the size).

So either

   gcc_assert (vector_type);
   nunits = TYPE_VECTOR_SUBPARTS (vector_type);

both variants pass the new testcases and the C vect testsuite for me.
Unless we start being less conservative with constraining vector
sizes with AVX there is probably no functional difference, but for
consistency I'd prefer the second variant.

Thanks,
Richard.


> Thanks,
> Ira
> 
> ChangeLog:
> 
> 	PR tree-optimization/46049
> 	PR tree-optimization/46052
> 	* tree-vect-slp.c (vect_get_constant_vectors): Use operands'
> 	type in case of invariant variables.
> 
> testsuite/ChangeLog:
> 
> 	PR tree-optimization/46049
> 	PR tree-optimization/46052
> 	* gcc.dg/vect/pr46052.c: New test.
> 	* gcc.dg/vect/pr46049.c: New test.
> 
> 
> Index: testsuite/gcc.dg/vect/pr46052.c
> ===================================================================
> --- testsuite/gcc.dg/vect/pr46052.c     (revision 0)
> +++ testsuite/gcc.dg/vect/pr46052.c     (revision 0)
> @@ -0,0 +1,33 @@
> +/* { dg-do compile } */
> +
> +int i;
> +int a[2];
> +
> +static inline char bar (void)
> +{
> +  return i ? i : 1;
> +}
> +
> +void foo (int n)
> +{
> +  while (n--)
> +    {
> +      a[0] ^= bar ();
> +      a[1] ^= bar ();
> +    }
> +}
> +
> +static inline char bar1 (void)
> +{
> +}
> +
> +void foo1 (int n)
> +{
> +  while (n--)
> +    {
> +      a[0] ^= bar1 ();
> +      a[1] ^= bar1 ();
> +    }
> +}
> +
> +/* { dg-final { cleanup-tree-dump "vect" } } */
> Index: testsuite/gcc.dg/vect/pr46049.c
> ===================================================================
> --- testsuite/gcc.dg/vect/pr46049.c     (revision 0)
> +++ testsuite/gcc.dg/vect/pr46049.c     (revision 0)
> @@ -0,0 +1,21 @@
> +/* { dg-do compile } */
> +
> +typedef __INT16_TYPE__ int16_t;
> +typedef __INT32_TYPE__ int32_t;
> +
> +static inline int32_t bar (int16_t x, int16_t y)
> +{
> +  return x * y;
> +}
> +
> +void foo (int16_t i, int16_t *p, int16_t x)
> +{
> +  while (i--)
> +    {
> +      *p = bar (*p, x) >> 15;
> +      p++;
> +      *p = bar (*p, x) >> 15;
> +      p++;
> +    }
> +}
> +/* { dg-final { cleanup-tree-dump "vect" } } */
> Index: tree-vect-slp.c
> ===================================================================
> --- tree-vect-slp.c     (revision 165574)
> +++ tree-vect-slp.c     (working copy)
> @@ -1902,8 +1902,8 @@ vect_get_constant_vectors (slp_tree slp_
>    /* For POINTER_PLUS_EXPR we use the type of the constant/invariant
> itself.
>       If OP is the first operand of POINTER_PLUS_EXPR, its type is the type
> of
>       the statement, so it's OK to use OP's type for both first and second
> -     operands.  */
> -  if (code == POINTER_PLUS_EXPR)
> +     operands.  We also use the type of OP if it's an invariant variable.
> */
> +  if (code == POINTER_PLUS_EXPR || !constant_p)
>      vector_type = get_vectype_for_scalar_type (TREE_TYPE (op));
>    else
>      vector_type = STMT_VINFO_VECTYPE (stmt_vinfo);
> 
>
Ira Rosen - Oct. 18, 2010, 10:31 a.m.
Richard Guenther <rguenther@suse.de> wrote on 18/10/2010 12:17:16 PM:

> On Sun, 17 Oct 2010, Ira Rosen wrote:
>
> >
> > Hi,
> >
> > With this patch we choose operands' type as the type of the vector
created
> > for them in case when the operands are invariant variables.
> > In these PRs we have two different operations, so it doesn't seem
> > appropriate to choose the type according to operation.
> >
> > OK for trunk after bootstrap and testing complete on x86_64-suse-linux?
>
> Well, I now looked at the problem myself.  The function seems to be
> used to create vectors for operands which then get used to create
> vectorized statements to produce their LHS (thus, much similar
> to vect_get_vec_def_for_operand).  It thus does not make much sense
> to look at STMT_VINFO_VECTYPE at all (that's the vector type of
> the LHS of the stmt with the operands).  It should simply use
> get_vectype_for_scalar_type always (or get_same_sized_vectype
> with STMT_VINFO_VECTYPE as the vector type determining the size).
>
> So either
>
> Index: gcc/tree-vect-slp.c
> ===================================================================
> --- gcc/tree-vect-slp.c (revision 165610)
> +++ gcc/tree-vect-slp.c (working copy)
> @@ -1899,14 +1899,7 @@ vect_get_constant_vectors (slp_tree slp_
>    else
>      constant_p = false;
>
> -  /* For POINTER_PLUS_EXPR we use the type of the constant/invariant
> itself.
> -     If OP is the first operand of POINTER_PLUS_EXPR, its type is the
> type of
> -     the statement, so it's OK to use OP's type for both first and
second
> -     operands.  */
> -  if (code == POINTER_PLUS_EXPR)
> -    vector_type = get_vectype_for_scalar_type (TREE_TYPE (op));
> -  else
> -    vector_type = STMT_VINFO_VECTYPE (stmt_vinfo);
> +  vector_type = get_vectype_for_scalar_type (TREE_TYPE (op));
>
>    gcc_assert (vector_type);
>    nunits = TYPE_VECTOR_SUBPARTS (vector_type);
>
> or, to probably make AVX happier with float -> int conversion
>
> Index: gcc/tree-vect-slp.c
> ===================================================================
> --- gcc/tree-vect-slp.c (revision 165610)
> +++ gcc/tree-vect-slp.c (working copy)
> @@ -1899,14 +1899,8 @@ vect_get_constant_vectors (slp_tree slp_
>    else
>      constant_p = false;
>
> -  /* For POINTER_PLUS_EXPR we use the type of the constant/invariant
> itself.
> -     If OP is the first operand of POINTER_PLUS_EXPR, its type is the
> type of
> -     the statement, so it's OK to use OP's type for both first and
second
> -     operands.  */
> -  if (code == POINTER_PLUS_EXPR)
> -    vector_type = get_vectype_for_scalar_type (TREE_TYPE (op));
> -  else
> -    vector_type = STMT_VINFO_VECTYPE (stmt_vinfo);
> +  vector_type = get_same_sized_vectype (TREE_TYPE (op),
> +                                       STMT_VINFO_VECTYPE (stmt_vinfo));
>
>    gcc_assert (vector_type);
>    nunits = TYPE_VECTOR_SUBPARTS (vector_type);
>
> both variants pass the new testcases and the C vect testsuite for me.

It will not work for pr45902.c on PowerPC (you first patch just reverts the
patch that fixed PR 45902).

For:

short res[N];
short a[N];

int
main1 ()
{
  int i;

  for (i = 0; i < N/4; i+=4)
    {
      res[i] = a[i] >> 8;
      res[i+1] = a[i+1] >> 8;
      res[i+2] = a[i+2] >> 8;
      res[i+3] = a[i+3] >> 8;
    }
}

it will create vector shift for vector short with {8, 8, 8, 8} instead of
{8, 8, 8, 8, 8, 8, 8, 8}.

Thanks,
Ira


> Unless we start being less conservative with constraining vector
> sizes with AVX there is probably no functional difference, but for
> consistency I'd prefer the second variant.
>
> Thanks,
> Richard.
>
>
> > Thanks,
> > Ira
> >
> > ChangeLog:
> >
> >    PR tree-optimization/46049
> >    PR tree-optimization/46052
> >    * tree-vect-slp.c (vect_get_constant_vectors): Use operands'
> >    type in case of invariant variables.
> >
> > testsuite/ChangeLog:
> >
> >    PR tree-optimization/46049
> >    PR tree-optimization/46052
> >    * gcc.dg/vect/pr46052.c: New test.
> >    * gcc.dg/vect/pr46049.c: New test.
> >
> >
> > Index: testsuite/gcc.dg/vect/pr46052.c
> > ===================================================================
> > --- testsuite/gcc.dg/vect/pr46052.c     (revision 0)
> > +++ testsuite/gcc.dg/vect/pr46052.c     (revision 0)
> > @@ -0,0 +1,33 @@
> > +/* { dg-do compile } */
> > +
> > +int i;
> > +int a[2];
> > +
> > +static inline char bar (void)
> > +{
> > +  return i ? i : 1;
> > +}
> > +
> > +void foo (int n)
> > +{
> > +  while (n--)
> > +    {
> > +      a[0] ^= bar ();
> > +      a[1] ^= bar ();
> > +    }
> > +}
> > +
> > +static inline char bar1 (void)
> > +{
> > +}
> > +
> > +void foo1 (int n)
> > +{
> > +  while (n--)
> > +    {
> > +      a[0] ^= bar1 ();
> > +      a[1] ^= bar1 ();
> > +    }
> > +}
> > +
> > +/* { dg-final { cleanup-tree-dump "vect" } } */
> > Index: testsuite/gcc.dg/vect/pr46049.c
> > ===================================================================
> > --- testsuite/gcc.dg/vect/pr46049.c     (revision 0)
> > +++ testsuite/gcc.dg/vect/pr46049.c     (revision 0)
> > @@ -0,0 +1,21 @@
> > +/* { dg-do compile } */
> > +
> > +typedef __INT16_TYPE__ int16_t;
> > +typedef __INT32_TYPE__ int32_t;
> > +
> > +static inline int32_t bar (int16_t x, int16_t y)
> > +{
> > +  return x * y;
> > +}
> > +
> > +void foo (int16_t i, int16_t *p, int16_t x)
> > +{
> > +  while (i--)
> > +    {
> > +      *p = bar (*p, x) >> 15;
> > +      p++;
> > +      *p = bar (*p, x) >> 15;
> > +      p++;
> > +    }
> > +}
> > +/* { dg-final { cleanup-tree-dump "vect" } } */
> > Index: tree-vect-slp.c
> > ===================================================================
> > --- tree-vect-slp.c     (revision 165574)
> > +++ tree-vect-slp.c     (working copy)
> > @@ -1902,8 +1902,8 @@ vect_get_constant_vectors (slp_tree slp_
> >    /* For POINTER_PLUS_EXPR we use the type of the constant/invariant
> > itself.
> >       If OP is the first operand of POINTER_PLUS_EXPR, its type is the
type
> > of
> >       the statement, so it's OK to use OP's type for both first and
second
> > -     operands.  */
> > -  if (code == POINTER_PLUS_EXPR)
> > +     operands.  We also use the type of OP if it's an invariant
variable.
> > */
> > +  if (code == POINTER_PLUS_EXPR || !constant_p)
> >      vector_type = get_vectype_for_scalar_type (TREE_TYPE (op));
> >    else
> >      vector_type = STMT_VINFO_VECTYPE (stmt_vinfo);
> >
> >
>
> --
> Richard Guenther <rguenther@suse.de>
> Novell / SUSE Labs
> SUSE LINUX Products GmbH - Nuernberg - AG Nuernberg - HRB 16746 -
> GF: Markus Rex
Richard Guenther - Oct. 18, 2010, 10:48 a.m.
On Mon, 18 Oct 2010, Ira Rosen wrote:

> 
> 
> Richard Guenther <rguenther@suse.de> wrote on 18/10/2010 12:17:16 PM:
> 
> > On Sun, 17 Oct 2010, Ira Rosen wrote:
> >
> > >
> > > Hi,
> > >
> > > With this patch we choose operands' type as the type of the vector
> created
> > > for them in case when the operands are invariant variables.
> > > In these PRs we have two different operations, so it doesn't seem
> > > appropriate to choose the type according to operation.
> > >
> > > OK for trunk after bootstrap and testing complete on x86_64-suse-linux?
> >
> > Well, I now looked at the problem myself.  The function seems to be
> > used to create vectors for operands which then get used to create
> > vectorized statements to produce their LHS (thus, much similar
> > to vect_get_vec_def_for_operand).  It thus does not make much sense
> > to look at STMT_VINFO_VECTYPE at all (that's the vector type of
> > the LHS of the stmt with the operands).  It should simply use
> > get_vectype_for_scalar_type always (or get_same_sized_vectype
> > with STMT_VINFO_VECTYPE as the vector type determining the size).
> >
> > So either
> >
> > Index: gcc/tree-vect-slp.c
> > ===================================================================
> > --- gcc/tree-vect-slp.c (revision 165610)
> > +++ gcc/tree-vect-slp.c (working copy)
> > @@ -1899,14 +1899,7 @@ vect_get_constant_vectors (slp_tree slp_
> >    else
> >      constant_p = false;
> >
> > -  /* For POINTER_PLUS_EXPR we use the type of the constant/invariant
> > itself.
> > -     If OP is the first operand of POINTER_PLUS_EXPR, its type is the
> > type of
> > -     the statement, so it's OK to use OP's type for both first and
> second
> > -     operands.  */
> > -  if (code == POINTER_PLUS_EXPR)
> > -    vector_type = get_vectype_for_scalar_type (TREE_TYPE (op));
> > -  else
> > -    vector_type = STMT_VINFO_VECTYPE (stmt_vinfo);
> > +  vector_type = get_vectype_for_scalar_type (TREE_TYPE (op));
> >
> >    gcc_assert (vector_type);
> >    nunits = TYPE_VECTOR_SUBPARTS (vector_type);
> >
> > or, to probably make AVX happier with float -> int conversion
> >
> > Index: gcc/tree-vect-slp.c
> > ===================================================================
> > --- gcc/tree-vect-slp.c (revision 165610)
> > +++ gcc/tree-vect-slp.c (working copy)
> > @@ -1899,14 +1899,8 @@ vect_get_constant_vectors (slp_tree slp_
> >    else
> >      constant_p = false;
> >
> > -  /* For POINTER_PLUS_EXPR we use the type of the constant/invariant
> > itself.
> > -     If OP is the first operand of POINTER_PLUS_EXPR, its type is the
> > type of
> > -     the statement, so it's OK to use OP's type for both first and
> second
> > -     operands.  */
> > -  if (code == POINTER_PLUS_EXPR)
> > -    vector_type = get_vectype_for_scalar_type (TREE_TYPE (op));
> > -  else
> > -    vector_type = STMT_VINFO_VECTYPE (stmt_vinfo);
> > +  vector_type = get_same_sized_vectype (TREE_TYPE (op),
> > +                                       STMT_VINFO_VECTYPE (stmt_vinfo));
> >
> >    gcc_assert (vector_type);
> >    nunits = TYPE_VECTOR_SUBPARTS (vector_type);
> >
> > both variants pass the new testcases and the C vect testsuite for me.
> 
> It will not work for pr45902.c on PowerPC (you first patch just reverts the
> patch that fixed PR 45902).
> 
> For:
> 
> short res[N];
> short a[N];
> 
> int
> main1 ()
> {
>   int i;
> 
>   for (i = 0; i < N/4; i+=4)
>     {
>       res[i] = a[i] >> 8;
>       res[i+1] = a[i+1] >> 8;
>       res[i+2] = a[i+2] >> 8;
>       res[i+3] = a[i+3] >> 8;
>     }
> }
> 
> it will create vector shift for vector short with {8, 8, 8, 8} instead of
> {8, 8, 8, 8, 8, 8, 8, 8}.

Hm, but then this is a problem of the caller, and likely non-SLP mode
has the same issue (looking at vectorizable_operation I don't see
any special casing handling it for non-SLP, it'll just call
vect_get_vec_defs and then vect_get_vec_def_for_operand).

I think that vectorizable_operation has to special case shifts
(it already does for the vectorization test but not for code generation),
which also means that it really asks for being separated out
into a vectorizable_shift function ...

Basically the 

          if (op_type == binary_op && scalar_shift_arg)
            {

case should also handle the variant where we need to extend
the scalar shift arg to a vector.

Richard.

> Thanks,
> Ira
> 
> 
> > Unless we start being less conservative with constraining vector
> > sizes with AVX there is probably no functional difference, but for
> > consistency I'd prefer the second variant.
> >
> > Thanks,
> > Richard.
> >
> >
> > > Thanks,
> > > Ira
> > >
> > > ChangeLog:
> > >
> > >    PR tree-optimization/46049
> > >    PR tree-optimization/46052
> > >    * tree-vect-slp.c (vect_get_constant_vectors): Use operands'
> > >    type in case of invariant variables.
> > >
> > > testsuite/ChangeLog:
> > >
> > >    PR tree-optimization/46049
> > >    PR tree-optimization/46052
> > >    * gcc.dg/vect/pr46052.c: New test.
> > >    * gcc.dg/vect/pr46049.c: New test.
> > >
> > >
> > > Index: testsuite/gcc.dg/vect/pr46052.c
> > > ===================================================================
> > > --- testsuite/gcc.dg/vect/pr46052.c     (revision 0)
> > > +++ testsuite/gcc.dg/vect/pr46052.c     (revision 0)
> > > @@ -0,0 +1,33 @@
> > > +/* { dg-do compile } */
> > > +
> > > +int i;
> > > +int a[2];
> > > +
> > > +static inline char bar (void)
> > > +{
> > > +  return i ? i : 1;
> > > +}
> > > +
> > > +void foo (int n)
> > > +{
> > > +  while (n--)
> > > +    {
> > > +      a[0] ^= bar ();
> > > +      a[1] ^= bar ();
> > > +    }
> > > +}
> > > +
> > > +static inline char bar1 (void)
> > > +{
> > > +}
> > > +
> > > +void foo1 (int n)
> > > +{
> > > +  while (n--)
> > > +    {
> > > +      a[0] ^= bar1 ();
> > > +      a[1] ^= bar1 ();
> > > +    }
> > > +}
> > > +
> > > +/* { dg-final { cleanup-tree-dump "vect" } } */
> > > Index: testsuite/gcc.dg/vect/pr46049.c
> > > ===================================================================
> > > --- testsuite/gcc.dg/vect/pr46049.c     (revision 0)
> > > +++ testsuite/gcc.dg/vect/pr46049.c     (revision 0)
> > > @@ -0,0 +1,21 @@
> > > +/* { dg-do compile } */
> > > +
> > > +typedef __INT16_TYPE__ int16_t;
> > > +typedef __INT32_TYPE__ int32_t;
> > > +
> > > +static inline int32_t bar (int16_t x, int16_t y)
> > > +{
> > > +  return x * y;
> > > +}
> > > +
> > > +void foo (int16_t i, int16_t *p, int16_t x)
> > > +{
> > > +  while (i--)
> > > +    {
> > > +      *p = bar (*p, x) >> 15;
> > > +      p++;
> > > +      *p = bar (*p, x) >> 15;
> > > +      p++;
> > > +    }
> > > +}
> > > +/* { dg-final { cleanup-tree-dump "vect" } } */
> > > Index: tree-vect-slp.c
> > > ===================================================================
> > > --- tree-vect-slp.c     (revision 165574)
> > > +++ tree-vect-slp.c     (working copy)
> > > @@ -1902,8 +1902,8 @@ vect_get_constant_vectors (slp_tree slp_
> > >    /* For POINTER_PLUS_EXPR we use the type of the constant/invariant
> > > itself.
> > >       If OP is the first operand of POINTER_PLUS_EXPR, its type is the
> type
> > > of
> > >       the statement, so it's OK to use OP's type for both first and
> second
> > > -     operands.  */
> > > -  if (code == POINTER_PLUS_EXPR)
> > > +     operands.  We also use the type of OP if it's an invariant
> variable.
> > > */
> > > +  if (code == POINTER_PLUS_EXPR || !constant_p)
> > >      vector_type = get_vectype_for_scalar_type (TREE_TYPE (op));
> > >    else
> > >      vector_type = STMT_VINFO_VECTYPE (stmt_vinfo);
> > >
> > >
> >
> > --
> > Richard Guenther <rguenther@suse.de>
> > Novell / SUSE Labs
> > SUSE LINUX Products GmbH - Nuernberg - AG Nuernberg - HRB 16746 -
> > GF: Markus Rex
> 
>
Ira Rosen - Oct. 19, 2010, 11:51 a.m.
Richard Guenther <rguenther@suse.de> wrote on 18/10/2010 12:48:15 PM:

>
> Hm, but then this is a problem of the caller, and likely non-SLP mode
> has the same issue (looking at vectorizable_operation I don't see
> any special casing handling it for non-SLP, it'll just call
> vect_get_vec_defs and then vect_get_vec_def_for_operand).

vect_get_vec_def_for_operand actually treats constants and invariants
differently, using number of units of the stmt's vectype for constants and
the type of the operand for invariants.

>
> I think that vectorizable_operation has to special case shifts
> (it already does for the vectorization test but not for code generation),
> which also means that it really asks for being separated out
> into a vectorizable_shift function ...
>
> Basically the
>
>           if (op_type == binary_op && scalar_shift_arg)
>             {
>
> case should also handle the variant where we need to extend
> the scalar shift arg to a vector.

But scalar_shift_arg indicates whether the shift argument is a scalar or a
vector. How can we treat it as "we need to extend the scalar shift arg to a
vector" as well?

vectorizable_shift sounds good to me, I hope there are no other operations
like that, allowing operands of different types.

Thanks,
Ira

>
> Richard.
>
Richard Guenther - Oct. 19, 2010, 12:27 p.m.
On Tue, 19 Oct 2010, Ira Rosen wrote:

> 
> 
> Richard Guenther <rguenther@suse.de> wrote on 18/10/2010 12:48:15 PM:
> 
> >
> > Hm, but then this is a problem of the caller, and likely non-SLP mode
> > has the same issue (looking at vectorizable_operation I don't see
> > any special casing handling it for non-SLP, it'll just call
> > vect_get_vec_defs and then vect_get_vec_def_for_operand).
> 
> vect_get_vec_def_for_operand actually treats constants and invariants
> differently, using number of units of the stmt's vectype for constants and
> the type of the operand for invariants.

Well, as far as I can see vect_is_simple_use will assign the operands
defining statement definition to *def (which will be always op itself).
So,

    case vect_constant_def:
      {
        vector_type = get_vectype_for_scalar_type (TREE_TYPE (op));

and

    case vect_external_def:
      {
        vector_type = get_vectype_for_scalar_type (TREE_TYPE (def));

are the same.

> > I think that vectorizable_operation has to special case shifts
> > (it already does for the vectorization test but not for code generation),
> > which also means that it really asks for being separated out
> > into a vectorizable_shift function ...
> >
> > Basically the
> >
> >           if (op_type == binary_op && scalar_shift_arg)
> >             {
> >
> > case should also handle the variant where we need to extend
> > the scalar shift arg to a vector.
> 
> But scalar_shift_arg indicates whether the shift argument is a scalar or a
> vector. How can we treat it as "we need to extend the scalar shift arg to a
> vector" as well?

If the target doesn't support a non-vector shift amount then we always
need to promote it to a vector (and we know how many elements that
will have from the other operands element count).

> vectorizable_shift sounds good to me, I hope there are no other operations
> like that, allowing operands of different types.

I don't think so.

Thanks,
Richard.

> Thanks,
> Ira
> 
> >
> > Richard.
> >
> 
>
Ira Rosen - Oct. 19, 2010, 12:45 p.m.
Richard Guenther <rguenther@suse.de> wrote on 19/10/2010 02:27:27 PM:

> > > Hm, but then this is a problem of the caller, and likely non-SLP mode
> > > has the same issue (looking at vectorizable_operation I don't see
> > > any special casing handling it for non-SLP, it'll just call
> > > vect_get_vec_defs and then vect_get_vec_def_for_operand).
> >
> > vect_get_vec_def_for_operand actually treats constants and invariants
> > differently, using number of units of the stmt's vectype for constants
and
> > the type of the operand for invariants.
>
> Well, as far as I can see vect_is_simple_use will assign the operands
> defining statement definition to *def (which will be always op itself).
> So,
>
>     case vect_constant_def:
>       {
>         vector_type = get_vectype_for_scalar_type (TREE_TYPE (op));
>
> and
>
>     case vect_external_def:
>       {
>         vector_type = get_vectype_for_scalar_type (TREE_TYPE (def));
>
> are the same.

Right, but the number of elements of created vector is determined by
nunits, and for case vect_constant_def we use:

  tree vectype = STMT_VINFO_VECTYPE (stmt_vinfo);
  unsigned int nunits = TYPE_VECTOR_SUBPARTS (vectype);

while

    case vect_external_def:
      {
        vector_type = get_vectype_for_scalar_type (TREE_TYPE (def));
        gcc_assert (vector_type);
        nunits = TYPE_VECTOR_SUBPARTS (vector_type);


>
> > > I think that vectorizable_operation has to special case shifts
> > > (it already does for the vectorization test but not for code
generation),
> > > which also means that it really asks for being separated out
> > > into a vectorizable_shift function ...
> > >
> > > Basically the
> > >
> > >           if (op_type == binary_op && scalar_shift_arg)
> > >             {
> > >
> > > case should also handle the variant where we need to extend
> > > the scalar shift arg to a vector.
> >
> > But scalar_shift_arg indicates whether the shift argument is a scalar
or a
> > vector. How can we treat it as "we need to extend the scalar shift arg
to a
> > vector" as well?
>
> If the target doesn't support a non-vector shift amount then we always
> need to promote it to a vector (and we know how many elements that
> will have from the other operands element count).

But in that case scalar_shift_arg is false.

>
> > vectorizable_shift sounds good to me, I hope there are no other
operations
> > like that, allowing operands of different types.
>
> I don't think so.

OK, good. I'll prepare a patch then.

Thanks,
Ira

>
> Thanks,
> Richard.
>
> > Thanks,
> > Ira
> >
> > >
> > > Richard.
> > >
> >
> >
>
> --
> Richard Guenther <rguenther@suse.de>
> Novell / SUSE Labs
> SUSE LINUX Products GmbH - Nuernberg - AG Nuernberg - HRB 16746 -
> GF: Markus Rex
Richard Guenther - Oct. 19, 2010, 12:59 p.m.
On Tue, 19 Oct 2010, Ira Rosen wrote:

> Richard Guenther <rguenther@suse.de> wrote on 19/10/2010 02:27:27 PM:
> 
> > > > Hm, but then this is a problem of the caller, and likely non-SLP mode
> > > > has the same issue (looking at vectorizable_operation I don't see
> > > > any special casing handling it for non-SLP, it'll just call
> > > > vect_get_vec_defs and then vect_get_vec_def_for_operand).
> > >
> > > vect_get_vec_def_for_operand actually treats constants and invariants
> > > differently, using number of units of the stmt's vectype for constants
> and
> > > the type of the operand for invariants.
> >
> > Well, as far as I can see vect_is_simple_use will assign the operands
> > defining statement definition to *def (which will be always op itself).
> > So,
> >
> >     case vect_constant_def:
> >       {
> >         vector_type = get_vectype_for_scalar_type (TREE_TYPE (op));
> >
> > and
> >
> >     case vect_external_def:
> >       {
> >         vector_type = get_vectype_for_scalar_type (TREE_TYPE (def));
> >
> > are the same.
> 
> Right, but the number of elements of created vector is determined by
> nunits, and for case vect_constant_def we use:
> 
>   tree vectype = STMT_VINFO_VECTYPE (stmt_vinfo);
>   unsigned int nunits = TYPE_VECTOR_SUBPARTS (vectype);
> 
> while
> 
>     case vect_external_def:
>       {
>         vector_type = get_vectype_for_scalar_type (TREE_TYPE (def));
>         gcc_assert (vector_type);
>         nunits = TYPE_VECTOR_SUBPARTS (vector_type);

Oh indeed.  A strange inconsistency.  But of course if
nunits != TYPE_VECTOR_SUBPARTS (vector_type) then we're generating
a bogus vector_cst anyway, so I'd argue the vect_external_def handling
is correct and the vect_constant_def one wrong.

> > > > I think that vectorizable_operation has to special case shifts
> > > > (it already does for the vectorization test but not for code
> generation),
> > > > which also means that it really asks for being separated out
> > > > into a vectorizable_shift function ...
> > > >
> > > > Basically the
> > > >
> > > >           if (op_type == binary_op && scalar_shift_arg)
> > > >             {
> > > >
> > > > case should also handle the variant where we need to extend
> > > > the scalar shift arg to a vector.
> > >
> > > But scalar_shift_arg indicates whether the shift argument is a scalar
> or a
> > > vector. How can we treat it as "we need to extend the scalar shift arg
> to a
> > > vector" as well?
> >
> > If the target doesn't support a non-vector shift amount then we always
> > need to promote it to a vector (and we know how many elements that
> > will have from the other operands element count).
> 
> But in that case scalar_shift_arg is false.

Oh, indeed.  So we need to always promote the arg to a vector then
(if !scalar_shift_arg).

> >
> > > vectorizable_shift sounds good to me, I hope there are no other
> operations
> > > like that, allowing operands of different types.
> >
> > I don't think so.
> 
> OK, good. I'll prepare a patch then.

Thanks,
Richard.
Ira Rosen - Oct. 19, 2010, 1:40 p.m.
Richard Guenther <rguenther@suse.de> wrote on 19/10/2010 02:59:10 PM:

> > Richard Guenther <rguenther@suse.de> wrote on 19/10/2010 02:27:27 PM:
> >
> > > > > Hm, but then this is a problem of the caller, and likely non-SLP
mode
> > > > > has the same issue (looking at vectorizable_operation I don't see
> > > > > any special casing handling it for non-SLP, it'll just call
> > > > > vect_get_vec_defs and then vect_get_vec_def_for_operand).
> > > >
> > > > vect_get_vec_def_for_operand actually treats constants and
invariants
> > > > differently, using number of units of the stmt's vectype for
constants
> > and
> > > > the type of the operand for invariants.
> > >
> > > Well, as far as I can see vect_is_simple_use will assign the operands
> > > defining statement definition to *def (which will be always op
itself).
> > > So,
> > >
> > >     case vect_constant_def:
> > >       {
> > >         vector_type = get_vectype_for_scalar_type (TREE_TYPE (op));
> > >
> > > and
> > >
> > >     case vect_external_def:
> > >       {
> > >         vector_type = get_vectype_for_scalar_type (TREE_TYPE (def));
> > >
> > > are the same.
> >
> > Right, but the number of elements of created vector is determined by
> > nunits, and for case vect_constant_def we use:
> >
> >   tree vectype = STMT_VINFO_VECTYPE (stmt_vinfo);
> >   unsigned int nunits = TYPE_VECTOR_SUBPARTS (vectype);
> >
> > while
> >
> >     case vect_external_def:
> >       {
> >         vector_type = get_vectype_for_scalar_type (TREE_TYPE (def));
> >         gcc_assert (vector_type);
> >         nunits = TYPE_VECTOR_SUBPARTS (vector_type);
>
> Oh indeed.  A strange inconsistency.  But of course if
> nunits != TYPE_VECTOR_SUBPARTS (vector_type) then we're generating
> a bogus vector_cst anyway, so I'd argue the vect_external_def handling
> is correct and the vect_constant_def one wrong.
>
> > > > > I think that vectorizable_operation has to special case shifts
> > > > > (it already does for the vectorization test but not for code
> > generation),
> > > > > which also means that it really asks for being separated out
> > > > > into a vectorizable_shift function ...
> > > > >
> > > > > Basically the
> > > > >
> > > > >           if (op_type == binary_op && scalar_shift_arg)
> > > > >             {
> > > > >
> > > > > case should also handle the variant where we need to extend
> > > > > the scalar shift arg to a vector.
> > > >
> > > > But scalar_shift_arg indicates whether the shift argument is a
scalar
> > or a
> > > > vector. How can we treat it as "we need to extend the scalar shift
arg
> > to a
> > > > vector" as well?
> > >
> > > If the target doesn't support a non-vector shift amount then we
always
> > > need to promote it to a vector (and we know how many elements that
> > > will have from the other operands element count).
> >
> > But in that case scalar_shift_arg is false.
>
> Oh, indeed.  So we need to always promote the arg to a vector then
> (if !scalar_shift_arg).
>
> > >
> > > > vectorizable_shift sounds good to me, I hope there are no other
> > operations
> > > > like that, allowing operands of different types.
> > >
> > > I don't think so.
> >
> > OK, good. I'll prepare a patch then.

Now I see this code in vectorizable_operation:

              /* Unlike the other binary operators, shifts/rotates have
                 the rhs being int, instead of the same type as the lhs,
                 so make sure the scalar is the right type if we are
                 dealing with vectors of short/char.  */
              if (dt[1] == vect_constant_def)
                op1 = fold_convert (TREE_TYPE (vectype), op1);

op1 is passed to vect_get_vec_def_for_operand, but not to
vect_get_constant_vectors, which explains the difference in behavior
between regular vectorization and SLP. So, now I think that it will be
enough to pass op1 to SLP functions in order to fix this.

Thanks,
Ira


>
> Thanks,
> Richard.
Richard Guenther - Oct. 19, 2010, 2:49 p.m.
On Tue, 19 Oct 2010, Ira Rosen wrote:

> 
> 
> Richard Guenther <rguenther@suse.de> wrote on 19/10/2010 02:59:10 PM:
> 
> > > Richard Guenther <rguenther@suse.de> wrote on 19/10/2010 02:27:27 PM:
> > >
> > > > > > Hm, but then this is a problem of the caller, and likely non-SLP
> mode
> > > > > > has the same issue (looking at vectorizable_operation I don't see
> > > > > > any special casing handling it for non-SLP, it'll just call
> > > > > > vect_get_vec_defs and then vect_get_vec_def_for_operand).
> > > > >
> > > > > vect_get_vec_def_for_operand actually treats constants and
> invariants
> > > > > differently, using number of units of the stmt's vectype for
> constants
> > > and
> > > > > the type of the operand for invariants.
> > > >
> > > > Well, as far as I can see vect_is_simple_use will assign the operands
> > > > defining statement definition to *def (which will be always op
> itself).
> > > > So,
> > > >
> > > >     case vect_constant_def:
> > > >       {
> > > >         vector_type = get_vectype_for_scalar_type (TREE_TYPE (op));
> > > >
> > > > and
> > > >
> > > >     case vect_external_def:
> > > >       {
> > > >         vector_type = get_vectype_for_scalar_type (TREE_TYPE (def));
> > > >
> > > > are the same.
> > >
> > > Right, but the number of elements of created vector is determined by
> > > nunits, and for case vect_constant_def we use:
> > >
> > >   tree vectype = STMT_VINFO_VECTYPE (stmt_vinfo);
> > >   unsigned int nunits = TYPE_VECTOR_SUBPARTS (vectype);
> > >
> > > while
> > >
> > >     case vect_external_def:
> > >       {
> > >         vector_type = get_vectype_for_scalar_type (TREE_TYPE (def));
> > >         gcc_assert (vector_type);
> > >         nunits = TYPE_VECTOR_SUBPARTS (vector_type);
> >
> > Oh indeed.  A strange inconsistency.  But of course if
> > nunits != TYPE_VECTOR_SUBPARTS (vector_type) then we're generating
> > a bogus vector_cst anyway, so I'd argue the vect_external_def handling
> > is correct and the vect_constant_def one wrong.
> >
> > > > > > I think that vectorizable_operation has to special case shifts
> > > > > > (it already does for the vectorization test but not for code
> > > generation),
> > > > > > which also means that it really asks for being separated out
> > > > > > into a vectorizable_shift function ...
> > > > > >
> > > > > > Basically the
> > > > > >
> > > > > >           if (op_type == binary_op && scalar_shift_arg)
> > > > > >             {
> > > > > >
> > > > > > case should also handle the variant where we need to extend
> > > > > > the scalar shift arg to a vector.
> > > > >
> > > > > But scalar_shift_arg indicates whether the shift argument is a
> scalar
> > > or a
> > > > > vector. How can we treat it as "we need to extend the scalar shift
> arg
> > > to a
> > > > > vector" as well?
> > > >
> > > > If the target doesn't support a non-vector shift amount then we
> always
> > > > need to promote it to a vector (and we know how many elements that
> > > > will have from the other operands element count).
> > >
> > > But in that case scalar_shift_arg is false.
> >
> > Oh, indeed.  So we need to always promote the arg to a vector then
> > (if !scalar_shift_arg).
> >
> > > >
> > > > > vectorizable_shift sounds good to me, I hope there are no other
> > > operations
> > > > > like that, allowing operands of different types.
> > > >
> > > > I don't think so.
> > >
> > > OK, good. I'll prepare a patch then.
> 
> Now I see this code in vectorizable_operation:
> 
>               /* Unlike the other binary operators, shifts/rotates have
>                  the rhs being int, instead of the same type as the lhs,
>                  so make sure the scalar is the right type if we are
>                  dealing with vectors of short/char.  */
>               if (dt[1] == vect_constant_def)
>                 op1 = fold_convert (TREE_TYPE (vectype), op1);
> 
> op1 is passed to vect_get_vec_def_for_operand, but not to
> vect_get_constant_vectors, which explains the difference in behavior
> between regular vectorization and SLP. So, now I think that it will be
> enough to pass op1 to SLP functions in order to fix this.

Ick - what a twisted maze! ;)

Richard.

Patch

Index: gcc/tree-vect-slp.c
===================================================================
--- gcc/tree-vect-slp.c (revision 165610)
+++ gcc/tree-vect-slp.c (working copy)
@@ -1899,14 +1899,7 @@  vect_get_constant_vectors (slp_tree slp_
   else
     constant_p = false;
 
-  /* For POINTER_PLUS_EXPR we use the type of the constant/invariant 
itself.
-     If OP is the first operand of POINTER_PLUS_EXPR, its type is the 
type of
-     the statement, so it's OK to use OP's type for both first and second
-     operands.  */
-  if (code == POINTER_PLUS_EXPR)
-    vector_type = get_vectype_for_scalar_type (TREE_TYPE (op));
-  else
-    vector_type = STMT_VINFO_VECTYPE (stmt_vinfo);
+  vector_type = get_vectype_for_scalar_type (TREE_TYPE (op));
 
   gcc_assert (vector_type);
   nunits = TYPE_VECTOR_SUBPARTS (vector_type);

or, to probably make AVX happier with float -> int conversion

Index: gcc/tree-vect-slp.c
===================================================================
--- gcc/tree-vect-slp.c (revision 165610)
+++ gcc/tree-vect-slp.c (working copy)
@@ -1899,14 +1899,8 @@  vect_get_constant_vectors (slp_tree slp_
   else
     constant_p = false;
 
-  /* For POINTER_PLUS_EXPR we use the type of the constant/invariant 
itself.
-     If OP is the first operand of POINTER_PLUS_EXPR, its type is the 
type of
-     the statement, so it's OK to use OP's type for both first and second
-     operands.  */
-  if (code == POINTER_PLUS_EXPR)
-    vector_type = get_vectype_for_scalar_type (TREE_TYPE (op));
-  else
-    vector_type = STMT_VINFO_VECTYPE (stmt_vinfo);
+  vector_type = get_same_sized_vectype (TREE_TYPE (op),
+                                       STMT_VINFO_VECTYPE (stmt_vinfo));