Patchwork [ARM] Implement vectorizer cost hooks

login
register
mail settings
Submitter Christophe Lyon
Date Feb. 11, 2013, 3:43 p.m.
Message ID <CAKdteOaeWXTZG+s5DJ6piTifiZbiLbPZ8OTmvw-p8U7pEP-G5g@mail.gmail.com>
Download mbox | patch
Permalink /patch/219623/
State New
Headers show

Comments

Christophe Lyon - Feb. 11, 2013, 3:43 p.m.
Richard,

Thanks for your comments.

Here a new version with the changes you suggested.

Christophe


On 11 February 2013 11:57, Richard Earnshaw <rearnsha@arm.com> wrote:
> On 05/02/13 18:18, Christophe Lyon wrote:
>>
>> Hi,
>>
>> Following the discussion about "disable peeling" [1] a few weeks ago,
>> it turned out that the vectorizer cost model needed some
>> implementation for ARM.
>>
>> The attached patch implements arm_builtin_vectorization_cost and
>> arm_add_stmt_cost, providing default costs when aligned and unaligned
>> loads/stores have the same cost (=1). init_cost and finish_cost still
>> use the default implementation (I noticed that x86 has chosen to
>> duplicate the default implementation without changing it, why?)
>>
>> Benchmarking shows very little variation, expect a noticeable +1.6% on
>> coremark.
>>
>> If this is OK, we can then discuss how to disable peeling completely
>> when aligned and unaligned accesses have the same cost (and thus where
>> peeling is a loss of performance). I think adding a new hook is
>> necessary, since target descriptions may use different models for
>> these costs (eg x86 makes no difference between unaligned loads and
>> unaligned stores).
>>
>> Thanks,
>>
>> Christophe.
>>
>> [1] http://gcc.gnu.org/ml/gcc/2012-12/msg00036.html
>>
>> 2013-02-05  Christophe Lyon <christophe.lyon@linaro.org>
>>
>>          * config/arm/arm.c (arm_builtin_vectorization_cost)
>>          (arm_add_stmt_cost): New functions.
>>          (TARGET_VECTORIZE_BUILTIN_VECTORIZATION_COST)
>>          (TARGET_VECTORIZE_ADD_STMT_COST): Define.
>>          (struct processor_costs): New struct type.
>>          (default_arm_cost): New struct of type processor_costs.=
>>
>
> Christophe,
>
> Thanks for the patch.  This is mostly OK, but please can you make the
> following changes.
>
> +struct processor_costs {
>
> Please name this something like cpu_vec_costs.  It's not the only cost table
> in the back-end.
>
> +struct processor_costs default_arm_cost = {    /* arm generic costs.  */
>
> Similarly, use something like default_arm_vec_cost.
>
> +const struct processor_costs *arm_cost = &default_arm_cost;
>
> And here.  But better still, link this through the current_tune table rather
> than introducing a new global.
>
> Finally,
>
> @@ -27256,4 +27272,130 @@ arm_validize_comparison (rtx *comparison, rtx *
> op1, rtx * op2)
>
>  }
>
> +/* Vectorizer cost model implementation.  */
>
>
> Please put the patch in a more suitable location rather than just dumping it
> at the end of the file.  There are already numerous functions related to
> costs that are mostly grouped together.  I suggest this goes near the
> rtx_costs code.
>
> R.
>
>
Richard Earnshaw - Feb. 12, 2013, 1:42 p.m.
On 11/02/13 15:43, Christophe Lyon wrote:
> Richard,
>
> Thanks for your comments.
>
> Here a new version with the changes you suggested.
>

Thanks for turning this around quickly.  This is fine.

Ramana and I have discussed this and we're agreed that we'd like this to 
go into 4.8.  However, if any problems do crop up we'll back it out 
again and wait for stage1 to re-open.

Please can you commit it ASAP so that we can make sure it gets a 
reasonable amount of testing.

R.

> Christophe
>
>
> On 11 February 2013 11:57, Richard Earnshaw <rearnsha@arm.com> wrote:
>> On 05/02/13 18:18, Christophe Lyon wrote:
>>>
>>> Hi,
>>>
>>> Following the discussion about "disable peeling" [1] a few weeks ago,
>>> it turned out that the vectorizer cost model needed some
>>> implementation for ARM.
>>>
>>> The attached patch implements arm_builtin_vectorization_cost and
>>> arm_add_stmt_cost, providing default costs when aligned and unaligned
>>> loads/stores have the same cost (=1). init_cost and finish_cost still
>>> use the default implementation (I noticed that x86 has chosen to
>>> duplicate the default implementation without changing it, why?)
>>>
>>> Benchmarking shows very little variation, expect a noticeable +1.6% on
>>> coremark.
>>>
>>> If this is OK, we can then discuss how to disable peeling completely
>>> when aligned and unaligned accesses have the same cost (and thus where
>>> peeling is a loss of performance). I think adding a new hook is
>>> necessary, since target descriptions may use different models for
>>> these costs (eg x86 makes no difference between unaligned loads and
>>> unaligned stores).
>>>
>>> Thanks,
>>>
>>> Christophe.
>>>
>>> [1] http://gcc.gnu.org/ml/gcc/2012-12/msg00036.html
>>>
>>> 2013-02-05  Christophe Lyon <christophe.lyon@linaro.org>
>>>
>>>           * config/arm/arm.c (arm_builtin_vectorization_cost)
>>>           (arm_add_stmt_cost): New functions.
>>>           (TARGET_VECTORIZE_BUILTIN_VECTORIZATION_COST)
>>>           (TARGET_VECTORIZE_ADD_STMT_COST): Define.
>>>           (struct processor_costs): New struct type.
>>>           (default_arm_cost): New struct of type processor_costs.=
>>>
>>
>> Christophe,
>>
>> Thanks for the patch.  This is mostly OK, but please can you make the
>> following changes.
>>
>> +struct processor_costs {
>>
>> Please name this something like cpu_vec_costs.  It's not the only cost table
>> in the back-end.
>>
>> +struct processor_costs default_arm_cost = {    /* arm generic costs.  */
>>
>> Similarly, use something like default_arm_vec_cost.
>>
>> +const struct processor_costs *arm_cost = &default_arm_cost;
>>
>> And here.  But better still, link this through the current_tune table rather
>> than introducing a new global.
>>
>> Finally,
>>
>> @@ -27256,4 +27272,130 @@ arm_validize_comparison (rtx *comparison, rtx *
>> op1, rtx * op2)
>>
>>   }
>>
>> +/* Vectorizer cost model implementation.  */
>>
>>
>> Please put the patch in a more suitable location rather than just dumping it
>> at the end of the file.  There are already numerous functions related to
>> costs that are mostly grouped together.  I suggest this goes near the
>> rtx_costs code.
>>
>> R.
>>
>> =
>>
>> vect-cost-model2.txt
>>
>>
>> diff --git a/gcc/ChangeLog b/gcc/ChangeLog
>> index bfb857d..56fde74 100644
>> --- a/gcc/ChangeLog
>> +++ b/gcc/ChangeLog
>> @@ -1,3 +1,17 @@
>> +2013-02-05  Christophe Lyon <christophe.lyon@linaro.org>
>> +
>> +	* config/arm/arm-protos.h (struct cpu_vec_costs): New struct type.
>> +	(struct tune_params): Add vec_costs field.
>> +	* config/arm/arm.c (arm_builtin_vectorization_cost)
>> +	(arm_add_stmt_cost): New functions.
>> +	(TARGET_VECTORIZE_BUILTIN_VECTORIZATION_COST)
>> +	(TARGET_VECTORIZE_ADD_STMT_COST): Define.
>> +	(arm_default_vec_cost): New struct of type cpu_vec_costs.
>> +	(arm_slowmul_tune, arm_fastmul_tune, arm_strongarm_tune)
>> +	(arm_xscale_tune, arm_9e_tune, arm_v6t2_tune, arm_cortex_tune)
>> +	(arm_cortex_a15_tune, arm_cortex_a5_tune, arm_cortex_a9_tune)
>> +	(arm_v6m_tune, arm_fa726te_tune): Define new vec_costs field.
>> +
>>   2013-02-04  Alexander Potapenko <glider@google.com>
>>               Jack Howarth  <howarth@bromo.med.uc.edu>
>>   	    Jakub Jelinek  <jakub@redhat.com>
>>
>> vect-cost-model2.patch
>>
>>
>> N¬n‡r¥ªíÂ)emçhÂyhiם¢w^™©Ý
Christophe Lyon - Feb. 12, 2013, 2:57 p.m.
On 12 February 2013 14:42, Richard Earnshaw <rearnsha@arm.com> wrote:
> On 11/02/13 15:43, Christophe Lyon wrote:
>>
>> Richard,
>>
>> Thanks for your comments.
>>
>> Here a new version with the changes you suggested.
>>
>
> Thanks for turning this around quickly.  This is fine.
>
> Ramana and I have discussed this and we're agreed that we'd like this to go
> into 4.8.  However, if any problems do crop up we'll back it out again and
> wait for stage1 to re-open.
>
> Please can you commit it ASAP so that we can make sure it gets a reasonable
> amount of testing.
>
> R.
>
Thanks, I have just committed it, as rev #195977.

Christophe.

Patch

diff --git a/gcc/ChangeLog b/gcc/ChangeLog
index bfb857d..56fde74 100644
--- a/gcc/ChangeLog
+++ b/gcc/ChangeLog
@@ -1,3 +1,17 @@ 
+2013-02-05  Christophe Lyon <christophe.lyon@linaro.org>
+
+	* config/arm/arm-protos.h (struct cpu_vec_costs): New struct type.
+	(struct tune_params): Add vec_costs field.
+	* config/arm/arm.c (arm_builtin_vectorization_cost)
+	(arm_add_stmt_cost): New functions.
+	(TARGET_VECTORIZE_BUILTIN_VECTORIZATION_COST)
+	(TARGET_VECTORIZE_ADD_STMT_COST): Define.
+	(arm_default_vec_cost): New struct of type cpu_vec_costs.
+	(arm_slowmul_tune, arm_fastmul_tune, arm_strongarm_tune)
+	(arm_xscale_tune, arm_9e_tune, arm_v6t2_tune, arm_cortex_tune)
+	(arm_cortex_a15_tune, arm_cortex_a5_tune, arm_cortex_a9_tune)
+	(arm_v6m_tune, arm_fa726te_tune): Define new vec_costs field.
+
 2013-02-04  Alexander Potapenko <glider@google.com>
             Jack Howarth  <howarth@bromo.med.uc.edu>
 	    Jakub Jelinek  <jakub@redhat.com>