diff mbox series

[i386] Fix scalar costing in ix86_add_stmt_cost

Message ID alpine.LSU.2.20.1810041131590.16707@zhemvz.fhfr.qr
State New
Headers show
Series [i386] Fix scalar costing in ix86_add_stmt_cost | expand

Commit Message

Richard Biener Oct. 4, 2018, 9:37 a.m. UTC
The following fixes bogus differences in scalar iteration cost
computed by the vectorizer when comparing 128 and 256 bit vectorizations.
This was caused by the hook looking at the vector types mode even
for kind == scalar_stmt and thus returning vector operation costs
for things like add or negate.

This is most noticable on targets where ix86_vec_cost applies
multiplication based on vector size (TARGET_AVX128_OPTIMAL, thus
Zen and Bulldozer).  But it will also adjust the actual costs
everywhere where scalar and vector costs diverge.

The adjustments done for Silvermont also look broken since they
are applied to both scalar and vector cost which makes it mostly
a no-op.  The patch adjusts it to only apply for vector costing
but of course I didn't benchmark this and the magic number of 1.7
may not make sense after this fix so I'm happy to leave that
out - Kirill?  Should I just go ahead with that for trunk (we can
revert or adjust if autotesters pick up regressions on your side)?

Bootstrap & regtest running on x86_64-unknown-linux-gnu, OK?

Richard.

2018-10-04   Richard Biener  <rguenther@suse.de>

	* config/i386/i386.c (ix86_add_stmt_cost): When scalar cost
	is asked for initialize mode to the component mode of the
	vector type.  Make sure Bonnel and esp. other Atom cost
	adjustments are not done for scalar cost estimates.

Comments

Richard Biener Oct. 4, 2018, 1:24 p.m. UTC | #1
On Thu, 4 Oct 2018, Richard Biener wrote:

> 
> The following fixes bogus differences in scalar iteration cost
> computed by the vectorizer when comparing 128 and 256 bit vectorizations.
> This was caused by the hook looking at the vector types mode even
> for kind == scalar_stmt and thus returning vector operation costs
> for things like add or negate.
> 
> This is most noticable on targets where ix86_vec_cost applies
> multiplication based on vector size (TARGET_AVX128_OPTIMAL, thus
> Zen and Bulldozer).  But it will also adjust the actual costs
> everywhere where scalar and vector costs diverge.
> 
> The adjustments done for Silvermont also look broken since they
> are applied to both scalar and vector cost which makes it mostly
> a no-op.  The patch adjusts it to only apply for vector costing
> but of course I didn't benchmark this and the magic number of 1.7
> may not make sense after this fix so I'm happy to leave that
> out - Kirill?  Should I just go ahead with that for trunk (we can
> revert or adjust if autotesters pick up regressions on your side)?
> 
> Bootstrap & regtest running on x86_64-unknown-linux-gnu, OK?

So this revealed

FAIL: gcc.target/i386/vect-double-2.c scan-tree-dump-times vect 
"Vectorized loops: 1" 1

and looking closer makes it clear that for TARGET_BONNELL we
do not want to penaltize vector loads or stores but just
arithmetic(?), so the kind == vector_stmt was more correct.

Similar logic might apply to the TARGET_SILVERMONT and friends
so I'll leave that alone.

Thus please consider the patch with just the first hunk.

Thanks,
Richard.

> Richard.
> 
> 2018-10-04   Richard Biener  <rguenther@suse.de>
> 
> 	* config/i386/i386.c (ix86_add_stmt_cost): When scalar cost
> 	is asked for initialize mode to the component mode of the
> 	vector type.  Make sure Bonnel and esp. other Atom cost
> 	adjustments are not done for scalar cost estimates.
> 
> Index: gcc/config/i386/i386.c
> ===================================================================
> --- gcc/config/i386/i386.c	(revision 264837)
> +++ gcc/config/i386/i386.c	(working copy)
> @@ -49486,17 +49486,21 @@ ix86_add_stmt_cost (void *data, int coun
>  {
>    unsigned *cost = (unsigned *) data;
>    unsigned retval = 0;
> +  bool scalar_p
> +    = (kind == scalar_stmt || kind == scalar_load || kind == scalar_store);
>  
>    tree vectype = stmt_info ? stmt_vectype (stmt_info) : NULL_TREE;
>    int stmt_cost = - 1;
>  
>    bool fp = false;
> -  machine_mode mode = TImode;
> +  machine_mode mode = scalar_p ? SImode : TImode;
>  
>    if (vectype != NULL)
>      {
>        fp = FLOAT_TYPE_P (vectype);
>        mode = TYPE_MODE (vectype);
> +      if (scalar_p)
> +	mode = TYPE_MODE (TREE_TYPE (vectype));
>      }
>  
>    if ((kind == vector_stmt || kind == scalar_stmt)
> @@ -49632,7 +49636,7 @@ ix86_add_stmt_cost (void *data, int coun
>      stmt_cost = ix86_builtin_vectorization_cost (kind, vectype, misalign);
>  
>    /* Penalize DFmode vector operations for Bonnell.  */
> -  if (TARGET_BONNELL && kind == vector_stmt
> +  if (TARGET_BONNELL && !scalar_p
>        && vectype && GET_MODE_INNER (TYPE_MODE (vectype)) == DFmode)
>      stmt_cost *= 5;  /* FIXME: The value here is arbitrary.  */
>  
> @@ -49648,7 +49652,8 @@ ix86_add_stmt_cost (void *data, int coun
>       for Silvermont as it has out of order integer pipeline and can execute
>       2 scalar instruction per tick, but has in order SIMD pipeline.  */
>    if ((TARGET_SILVERMONT || TARGET_GOLDMONT || TARGET_GOLDMONT_PLUS
> -       || TARGET_TREMONT || TARGET_INTEL) && stmt_info && stmt_info->stmt)
> +       || TARGET_TREMONT || TARGET_INTEL)
> +      && !scalar_p && stmt_info && stmt_info->stmt)
>      {
>        tree lhs_op = gimple_get_lhs (stmt_info->stmt);
>        if (lhs_op && TREE_CODE (TREE_TYPE (lhs_op)) == INTEGER_TYPE)
>
Jan Hubicka Oct. 5, 2018, 11:29 a.m. UTC | #2
> 
> The following fixes bogus differences in scalar iteration cost
> computed by the vectorizer when comparing 128 and 256 bit vectorizations.
> This was caused by the hook looking at the vector types mode even
> for kind == scalar_stmt and thus returning vector operation costs
> for things like add or negate.
> 
> This is most noticable on targets where ix86_vec_cost applies
> multiplication based on vector size (TARGET_AVX128_OPTIMAL, thus
> Zen and Bulldozer).  But it will also adjust the actual costs
> everywhere where scalar and vector costs diverge.
> 
> The adjustments done for Silvermont also look broken since they
> are applied to both scalar and vector cost which makes it mostly
> a no-op.  The patch adjusts it to only apply for vector costing
> but of course I didn't benchmark this and the magic number of 1.7
> may not make sense after this fix so I'm happy to leave that
> out - Kirill?  Should I just go ahead with that for trunk (we can
> revert or adjust if autotesters pick up regressions on your side)?
> 
> Bootstrap & regtest running on x86_64-unknown-linux-gnu, OK?
> 
> Richard.
> 
> 2018-10-04   Richard Biener  <rguenther@suse.de>
> 
> 	* config/i386/i386.c (ix86_add_stmt_cost): When scalar cost
> 	is asked for initialize mode to the component mode of the
> 	vector type.  Make sure Bonnel and esp. other Atom cost
> 	adjustments are not done for scalar cost estimates.
OK,
thanks!
> 
> Index: gcc/config/i386/i386.c
> ===================================================================
> --- gcc/config/i386/i386.c	(revision 264837)
> +++ gcc/config/i386/i386.c	(working copy)
> @@ -49486,17 +49486,21 @@ ix86_add_stmt_cost (void *data, int coun
>  {
>    unsigned *cost = (unsigned *) data;
>    unsigned retval = 0;
> +  bool scalar_p
> +    = (kind == scalar_stmt || kind == scalar_load || kind == scalar_store);
>  
>    tree vectype = stmt_info ? stmt_vectype (stmt_info) : NULL_TREE;
>    int stmt_cost = - 1;
>  
>    bool fp = false;
> -  machine_mode mode = TImode;
> +  machine_mode mode = scalar_p ? SImode : TImode;
>  
>    if (vectype != NULL)
>      {
>        fp = FLOAT_TYPE_P (vectype);
>        mode = TYPE_MODE (vectype);
> +      if (scalar_p)
> +	mode = TYPE_MODE (TREE_TYPE (vectype));
>      }
>  
>    if ((kind == vector_stmt || kind == scalar_stmt)
> @@ -49632,7 +49636,7 @@ ix86_add_stmt_cost (void *data, int coun
>      stmt_cost = ix86_builtin_vectorization_cost (kind, vectype, misalign);
>  
>    /* Penalize DFmode vector operations for Bonnell.  */
> -  if (TARGET_BONNELL && kind == vector_stmt
> +  if (TARGET_BONNELL && !scalar_p
>        && vectype && GET_MODE_INNER (TYPE_MODE (vectype)) == DFmode)
>      stmt_cost *= 5;  /* FIXME: The value here is arbitrary.  */
>  
> @@ -49648,7 +49652,8 @@ ix86_add_stmt_cost (void *data, int coun
>       for Silvermont as it has out of order integer pipeline and can execute
>       2 scalar instruction per tick, but has in order SIMD pipeline.  */
>    if ((TARGET_SILVERMONT || TARGET_GOLDMONT || TARGET_GOLDMONT_PLUS
> -       || TARGET_TREMONT || TARGET_INTEL) && stmt_info && stmt_info->stmt)
> +       || TARGET_TREMONT || TARGET_INTEL)
> +      && !scalar_p && stmt_info && stmt_info->stmt)
>      {
>        tree lhs_op = gimple_get_lhs (stmt_info->stmt);
>        if (lhs_op && TREE_CODE (TREE_TYPE (lhs_op)) == INTEGER_TYPE)
H.J. Lu Oct. 7, 2018, 12:53 p.m. UTC | #3
On Fri, Oct 5, 2018 at 4:29 AM Jan Hubicka <hubicka@ucw.cz> wrote:
>
> >
> > The following fixes bogus differences in scalar iteration cost
> > computed by the vectorizer when comparing 128 and 256 bit vectorizations.
> > This was caused by the hook looking at the vector types mode even
> > for kind == scalar_stmt and thus returning vector operation costs
> > for things like add or negate.
> >
> > This is most noticable on targets where ix86_vec_cost applies
> > multiplication based on vector size (TARGET_AVX128_OPTIMAL, thus
> > Zen and Bulldozer).  But it will also adjust the actual costs
> > everywhere where scalar and vector costs diverge.
> >
> > The adjustments done for Silvermont also look broken since they
> > are applied to both scalar and vector cost which makes it mostly
> > a no-op.  The patch adjusts it to only apply for vector costing
> > but of course I didn't benchmark this and the magic number of 1.7
> > may not make sense after this fix so I'm happy to leave that
> > out - Kirill?  Should I just go ahead with that for trunk (we can
> > revert or adjust if autotesters pick up regressions on your side)?
> >
> > Bootstrap & regtest running on x86_64-unknown-linux-gnu, OK?
> >
> > Richard.
> >
> > 2018-10-04   Richard Biener  <rguenther@suse.de>
> >
> >       * config/i386/i386.c (ix86_add_stmt_cost): When scalar cost
> >       is asked for initialize mode to the component mode of the
> >       vector type.  Make sure Bonnel and esp. other Atom cost
> >       adjustments are not done for scalar cost estimates.
> OK,

This caused:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87545
diff mbox series

Patch

Index: gcc/config/i386/i386.c
===================================================================
--- gcc/config/i386/i386.c	(revision 264837)
+++ gcc/config/i386/i386.c	(working copy)
@@ -49486,17 +49486,21 @@  ix86_add_stmt_cost (void *data, int coun
 {
   unsigned *cost = (unsigned *) data;
   unsigned retval = 0;
+  bool scalar_p
+    = (kind == scalar_stmt || kind == scalar_load || kind == scalar_store);
 
   tree vectype = stmt_info ? stmt_vectype (stmt_info) : NULL_TREE;
   int stmt_cost = - 1;
 
   bool fp = false;
-  machine_mode mode = TImode;
+  machine_mode mode = scalar_p ? SImode : TImode;
 
   if (vectype != NULL)
     {
       fp = FLOAT_TYPE_P (vectype);
       mode = TYPE_MODE (vectype);
+      if (scalar_p)
+	mode = TYPE_MODE (TREE_TYPE (vectype));
     }
 
   if ((kind == vector_stmt || kind == scalar_stmt)
@@ -49632,7 +49636,7 @@  ix86_add_stmt_cost (void *data, int coun
     stmt_cost = ix86_builtin_vectorization_cost (kind, vectype, misalign);
 
   /* Penalize DFmode vector operations for Bonnell.  */
-  if (TARGET_BONNELL && kind == vector_stmt
+  if (TARGET_BONNELL && !scalar_p
       && vectype && GET_MODE_INNER (TYPE_MODE (vectype)) == DFmode)
     stmt_cost *= 5;  /* FIXME: The value here is arbitrary.  */
 
@@ -49648,7 +49652,8 @@  ix86_add_stmt_cost (void *data, int coun
      for Silvermont as it has out of order integer pipeline and can execute
      2 scalar instruction per tick, but has in order SIMD pipeline.  */
   if ((TARGET_SILVERMONT || TARGET_GOLDMONT || TARGET_GOLDMONT_PLUS
-       || TARGET_TREMONT || TARGET_INTEL) && stmt_info && stmt_info->stmt)
+       || TARGET_TREMONT || TARGET_INTEL)
+      && !scalar_p && stmt_info && stmt_info->stmt)
     {
       tree lhs_op = gimple_get_lhs (stmt_info->stmt);
       if (lhs_op && TREE_CODE (TREE_TYPE (lhs_op)) == INTEGER_TYPE)