Message ID | alpine.LSU.2.20.1810041131590.16707@zhemvz.fhfr.qr |
---|---|
State | New |
Headers | show |
Series | [i386] Fix scalar costing in ix86_add_stmt_cost | expand |
On Thu, 4 Oct 2018, Richard Biener wrote: > > The following fixes bogus differences in scalar iteration cost > computed by the vectorizer when comparing 128 and 256 bit vectorizations. > This was caused by the hook looking at the vector types mode even > for kind == scalar_stmt and thus returning vector operation costs > for things like add or negate. > > This is most noticable on targets where ix86_vec_cost applies > multiplication based on vector size (TARGET_AVX128_OPTIMAL, thus > Zen and Bulldozer). But it will also adjust the actual costs > everywhere where scalar and vector costs diverge. > > The adjustments done for Silvermont also look broken since they > are applied to both scalar and vector cost which makes it mostly > a no-op. The patch adjusts it to only apply for vector costing > but of course I didn't benchmark this and the magic number of 1.7 > may not make sense after this fix so I'm happy to leave that > out - Kirill? Should I just go ahead with that for trunk (we can > revert or adjust if autotesters pick up regressions on your side)? > > Bootstrap & regtest running on x86_64-unknown-linux-gnu, OK? So this revealed FAIL: gcc.target/i386/vect-double-2.c scan-tree-dump-times vect "Vectorized loops: 1" 1 and looking closer makes it clear that for TARGET_BONNELL we do not want to penaltize vector loads or stores but just arithmetic(?), so the kind == vector_stmt was more correct. Similar logic might apply to the TARGET_SILVERMONT and friends so I'll leave that alone. Thus please consider the patch with just the first hunk. Thanks, Richard. > Richard. > > 2018-10-04 Richard Biener <rguenther@suse.de> > > * config/i386/i386.c (ix86_add_stmt_cost): When scalar cost > is asked for initialize mode to the component mode of the > vector type. Make sure Bonnel and esp. other Atom cost > adjustments are not done for scalar cost estimates. > > Index: gcc/config/i386/i386.c > =================================================================== > --- gcc/config/i386/i386.c (revision 264837) > +++ gcc/config/i386/i386.c (working copy) > @@ -49486,17 +49486,21 @@ ix86_add_stmt_cost (void *data, int coun > { > unsigned *cost = (unsigned *) data; > unsigned retval = 0; > + bool scalar_p > + = (kind == scalar_stmt || kind == scalar_load || kind == scalar_store); > > tree vectype = stmt_info ? stmt_vectype (stmt_info) : NULL_TREE; > int stmt_cost = - 1; > > bool fp = false; > - machine_mode mode = TImode; > + machine_mode mode = scalar_p ? SImode : TImode; > > if (vectype != NULL) > { > fp = FLOAT_TYPE_P (vectype); > mode = TYPE_MODE (vectype); > + if (scalar_p) > + mode = TYPE_MODE (TREE_TYPE (vectype)); > } > > if ((kind == vector_stmt || kind == scalar_stmt) > @@ -49632,7 +49636,7 @@ ix86_add_stmt_cost (void *data, int coun > stmt_cost = ix86_builtin_vectorization_cost (kind, vectype, misalign); > > /* Penalize DFmode vector operations for Bonnell. */ > - if (TARGET_BONNELL && kind == vector_stmt > + if (TARGET_BONNELL && !scalar_p > && vectype && GET_MODE_INNER (TYPE_MODE (vectype)) == DFmode) > stmt_cost *= 5; /* FIXME: The value here is arbitrary. */ > > @@ -49648,7 +49652,8 @@ ix86_add_stmt_cost (void *data, int coun > for Silvermont as it has out of order integer pipeline and can execute > 2 scalar instruction per tick, but has in order SIMD pipeline. */ > if ((TARGET_SILVERMONT || TARGET_GOLDMONT || TARGET_GOLDMONT_PLUS > - || TARGET_TREMONT || TARGET_INTEL) && stmt_info && stmt_info->stmt) > + || TARGET_TREMONT || TARGET_INTEL) > + && !scalar_p && stmt_info && stmt_info->stmt) > { > tree lhs_op = gimple_get_lhs (stmt_info->stmt); > if (lhs_op && TREE_CODE (TREE_TYPE (lhs_op)) == INTEGER_TYPE) >
> > The following fixes bogus differences in scalar iteration cost > computed by the vectorizer when comparing 128 and 256 bit vectorizations. > This was caused by the hook looking at the vector types mode even > for kind == scalar_stmt and thus returning vector operation costs > for things like add or negate. > > This is most noticable on targets where ix86_vec_cost applies > multiplication based on vector size (TARGET_AVX128_OPTIMAL, thus > Zen and Bulldozer). But it will also adjust the actual costs > everywhere where scalar and vector costs diverge. > > The adjustments done for Silvermont also look broken since they > are applied to both scalar and vector cost which makes it mostly > a no-op. The patch adjusts it to only apply for vector costing > but of course I didn't benchmark this and the magic number of 1.7 > may not make sense after this fix so I'm happy to leave that > out - Kirill? Should I just go ahead with that for trunk (we can > revert or adjust if autotesters pick up regressions on your side)? > > Bootstrap & regtest running on x86_64-unknown-linux-gnu, OK? > > Richard. > > 2018-10-04 Richard Biener <rguenther@suse.de> > > * config/i386/i386.c (ix86_add_stmt_cost): When scalar cost > is asked for initialize mode to the component mode of the > vector type. Make sure Bonnel and esp. other Atom cost > adjustments are not done for scalar cost estimates. OK, thanks! > > Index: gcc/config/i386/i386.c > =================================================================== > --- gcc/config/i386/i386.c (revision 264837) > +++ gcc/config/i386/i386.c (working copy) > @@ -49486,17 +49486,21 @@ ix86_add_stmt_cost (void *data, int coun > { > unsigned *cost = (unsigned *) data; > unsigned retval = 0; > + bool scalar_p > + = (kind == scalar_stmt || kind == scalar_load || kind == scalar_store); > > tree vectype = stmt_info ? stmt_vectype (stmt_info) : NULL_TREE; > int stmt_cost = - 1; > > bool fp = false; > - machine_mode mode = TImode; > + machine_mode mode = scalar_p ? SImode : TImode; > > if (vectype != NULL) > { > fp = FLOAT_TYPE_P (vectype); > mode = TYPE_MODE (vectype); > + if (scalar_p) > + mode = TYPE_MODE (TREE_TYPE (vectype)); > } > > if ((kind == vector_stmt || kind == scalar_stmt) > @@ -49632,7 +49636,7 @@ ix86_add_stmt_cost (void *data, int coun > stmt_cost = ix86_builtin_vectorization_cost (kind, vectype, misalign); > > /* Penalize DFmode vector operations for Bonnell. */ > - if (TARGET_BONNELL && kind == vector_stmt > + if (TARGET_BONNELL && !scalar_p > && vectype && GET_MODE_INNER (TYPE_MODE (vectype)) == DFmode) > stmt_cost *= 5; /* FIXME: The value here is arbitrary. */ > > @@ -49648,7 +49652,8 @@ ix86_add_stmt_cost (void *data, int coun > for Silvermont as it has out of order integer pipeline and can execute > 2 scalar instruction per tick, but has in order SIMD pipeline. */ > if ((TARGET_SILVERMONT || TARGET_GOLDMONT || TARGET_GOLDMONT_PLUS > - || TARGET_TREMONT || TARGET_INTEL) && stmt_info && stmt_info->stmt) > + || TARGET_TREMONT || TARGET_INTEL) > + && !scalar_p && stmt_info && stmt_info->stmt) > { > tree lhs_op = gimple_get_lhs (stmt_info->stmt); > if (lhs_op && TREE_CODE (TREE_TYPE (lhs_op)) == INTEGER_TYPE)
On Fri, Oct 5, 2018 at 4:29 AM Jan Hubicka <hubicka@ucw.cz> wrote: > > > > > The following fixes bogus differences in scalar iteration cost > > computed by the vectorizer when comparing 128 and 256 bit vectorizations. > > This was caused by the hook looking at the vector types mode even > > for kind == scalar_stmt and thus returning vector operation costs > > for things like add or negate. > > > > This is most noticable on targets where ix86_vec_cost applies > > multiplication based on vector size (TARGET_AVX128_OPTIMAL, thus > > Zen and Bulldozer). But it will also adjust the actual costs > > everywhere where scalar and vector costs diverge. > > > > The adjustments done for Silvermont also look broken since they > > are applied to both scalar and vector cost which makes it mostly > > a no-op. The patch adjusts it to only apply for vector costing > > but of course I didn't benchmark this and the magic number of 1.7 > > may not make sense after this fix so I'm happy to leave that > > out - Kirill? Should I just go ahead with that for trunk (we can > > revert or adjust if autotesters pick up regressions on your side)? > > > > Bootstrap & regtest running on x86_64-unknown-linux-gnu, OK? > > > > Richard. > > > > 2018-10-04 Richard Biener <rguenther@suse.de> > > > > * config/i386/i386.c (ix86_add_stmt_cost): When scalar cost > > is asked for initialize mode to the component mode of the > > vector type. Make sure Bonnel and esp. other Atom cost > > adjustments are not done for scalar cost estimates. > OK, This caused: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87545
Index: gcc/config/i386/i386.c =================================================================== --- gcc/config/i386/i386.c (revision 264837) +++ gcc/config/i386/i386.c (working copy) @@ -49486,17 +49486,21 @@ ix86_add_stmt_cost (void *data, int coun { unsigned *cost = (unsigned *) data; unsigned retval = 0; + bool scalar_p + = (kind == scalar_stmt || kind == scalar_load || kind == scalar_store); tree vectype = stmt_info ? stmt_vectype (stmt_info) : NULL_TREE; int stmt_cost = - 1; bool fp = false; - machine_mode mode = TImode; + machine_mode mode = scalar_p ? SImode : TImode; if (vectype != NULL) { fp = FLOAT_TYPE_P (vectype); mode = TYPE_MODE (vectype); + if (scalar_p) + mode = TYPE_MODE (TREE_TYPE (vectype)); } if ((kind == vector_stmt || kind == scalar_stmt) @@ -49632,7 +49636,7 @@ ix86_add_stmt_cost (void *data, int coun stmt_cost = ix86_builtin_vectorization_cost (kind, vectype, misalign); /* Penalize DFmode vector operations for Bonnell. */ - if (TARGET_BONNELL && kind == vector_stmt + if (TARGET_BONNELL && !scalar_p && vectype && GET_MODE_INNER (TYPE_MODE (vectype)) == DFmode) stmt_cost *= 5; /* FIXME: The value here is arbitrary. */ @@ -49648,7 +49652,8 @@ ix86_add_stmt_cost (void *data, int coun for Silvermont as it has out of order integer pipeline and can execute 2 scalar instruction per tick, but has in order SIMD pipeline. */ if ((TARGET_SILVERMONT || TARGET_GOLDMONT || TARGET_GOLDMONT_PLUS - || TARGET_TREMONT || TARGET_INTEL) && stmt_info && stmt_info->stmt) + || TARGET_TREMONT || TARGET_INTEL) + && !scalar_p && stmt_info && stmt_info->stmt) { tree lhs_op = gimple_get_lhs (stmt_info->stmt); if (lhs_op && TREE_CODE (TREE_TYPE (lhs_op)) == INTEGER_TYPE)