Message ID | 804b71d6-40c3-7c0d-8bfa-b347a7b7fda4@linux.ibm.com |
---|---|
State | New |
Series | [rs6000] Lower vec_promote_demote vectorization cost for P8/P9 |
Hi Kewen,

On Wed, Oct 09, 2019 at 02:43:02PM +0800, Kewen.Lin wrote:
> This patch is to lower vec_promote_demote vectorization cost in
> rs6000_builtin_vectorization_cost. It's similar to what we committed
> for vec_perm, the current cost for vec_promote_demote is also
> overpriced for Power8 and Power9 since Power8 and Power9 has
> supported more units for permute/unpack/pack rather than single one
> on Power7.
>
> The performance evaluation on SPEC2017 Power9 shows +2.88% gain on
> 525.x264_r, degraded -1.70% on 526.blender_r but which has been
> identified as just exposing some other issues and actually unrelated,
> while SPEC2017 Power8 evaluation shows +4.63% gain on 525.x264_r
> without any significant degradations, SPEC2006 Power8 evaluation
> shows 1.99% gain on 453.povray. The geomean gain for SPEC2017
> on both Power8 and Power9 is +0.06%, and it's unchanged for SPEC2006
> Power8.

Small steps :-)

The patch is okay for trunk. Thank you!

Segher

> * config/rs6000/rs6000.c (rs6000_builtin_vectorization_cost): Lower
> vec_promote_demote cost to 1 for non-Power7 VSX architectures.
diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c
index 2fd9808..8040577 100644
--- a/gcc/config/rs6000/rs6000.c
+++ b/gcc/config/rs6000/rs6000.c
@@ -4781,10 +4781,11 @@ rs6000_builtin_vectorization_cost (enum vect_cost_for_stmt type_of_cost,
         return 1;
 
       case vec_promote_demote:
-        if (TARGET_VSX)
-          return 4;
-        else
-          return 1;
+        /* Power7 has only one permute/pack unit, make it a bit expensive. */
+        if (TARGET_VSX && rs6000_tune == PROCESSOR_POWER7)
+          return 4;
+        else
+          return 1;
 
       case cond_branch_taken:
         return 3;
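
For readers who want to see the effect of the hunk in isolation, here is a minimal standalone sketch of the old and new cost rule. It is not GCC code: the enum `model_cpu` and the two function names are hypothetical stand-ins, and the boolean `has_vsx` merely models what `TARGET_VSX` tests in the real back end.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the tuning targets; the real processor
   enum in the rs6000 back end has many more entries.  */
enum model_cpu { MODEL_POWER7, MODEL_POWER8, MODEL_POWER9 };

/* Cost rule before the patch: every VSX target paid 4 for a
   vec_promote_demote statement, regardless of tuning.  */
int
promote_demote_cost_old (bool has_vsx, enum model_cpu tune)
{
  (void) tune;  /* Tuning was not consulted before the patch.  */
  return has_vsx ? 4 : 1;
}

/* Cost rule after the patch: only VSX code tuned for Power7, which
   has a single permute/pack unit, keeps the higher cost.  */
int
promote_demote_cost_new (bool has_vsx, enum model_cpu tune)
{
  if (has_vsx && tune == MODEL_POWER7)
    return 4;
  return 1;
}
```

Under this model the cost drops from 4 to 1 for Power8 and Power9 tuning, while Power7 tuning and non-VSX targets see no change.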