diff mbox series

[Aarch64] Fix vec_perm cost for thunderx2t99

Message ID 20191101140242.GA7200@bell-sw.com
State New
Headers show
Series [Aarch64] Fix vec_perm cost for thunderx2t99 | expand

Commit Message

Anton Youdkevitch Nov. 1, 2019, 2:02 p.m. UTC
Hello,

Here is the one-liner that fixes the incorrect
vec_perm cost for thunderx2t99 chip.
With the patch applied 526.blender of CPU2017
gets ~5% improvement with no measurable changes
for other benchmarks.

Bootstrapped OK on aarch64-linux-gnu.

OK for trunk?

2019-11-01 Anton Youdkevitch <anton.youdkevitch@bell-sw.com>

	* gcc/config/aarch64/aarch64.c (thunderx2t99_vector_cost):
	change vec_perm field

Comments

Andrew Pinski Nov. 1, 2019, 11:22 p.m. UTC | #1
On Fri, Nov 1, 2019 at 7:03 AM Anton Youdkevitch
<anton.youdkevitch@bell-sw.com> wrote:
>
> Hello,
>
> Here is the one-liner that fixes the incorrect
> vec_perm cost for thunderx2t99 chip.
> With the patch applied 526.blender of CPU2017
> gets ~5% improvement with no measurable changes
> for other benchmarks.
>
> Bootstrapped OK on aarch64-linux-gnu.
>
> OK for trunk?

Maybe the big question is vec_perm used for both 1 input and 2 input
cases?  If so maybe splitting the two cases would be important too.
Otherwise this is ok from my point of view but I can't approve it.


Thanks,
Andrew Pinski

>
> 2019-11-01 Anton Youdkevitch <anton.youdkevitch@bell-sw.com>
>
>         * gcc/config/aarch64/aarch64.c (thunderx2t99_vector_cost):
>         change vec_perm field
>
> --
>   Thanks,
>   Anton
Kyrylo Tkachov Nov. 5, 2019, 11:43 a.m. UTC | #2
Hi Andrew, Anton,

On 11/1/19 11:22 PM, Andrew Pinski wrote:
> On Fri, Nov 1, 2019 at 7:03 AM Anton Youdkevitch
> <anton.youdkevitch@bell-sw.com> wrote:
> >
> > Hello,
> >
> > Here is the one-liner that fixes the incorrect
> > vec_perm cost for thunderx2t99 chip.
> > With the patch applied 526.blender of CPU2017
> > gets ~5% improvement with no measurable changes
> > for other benchmarks.
> >
> > Bootstrapped OK on aarch64-linux-gnu.
> >
> > OK for trunk?
>
> Maybe the big question is vec_perm used for both 1 input and 2 input
> cases?  If so maybe splitting the two cases would be important too.
> Otherwise this is ok from my point of view but I can't approve it.
>
I'd be interested to see a testcase/demonstration where this would would 
be beneficial.

In the meantime this patch is ok if it helps thunderx2t99 performance.

2019-11-01 Anton Youdkevitch <anton.youdkevitch@bell-sw.com>

         * gcc/config/aarch64/aarch64.c (thunderx2t99_vector_cost):
         change vec_perm field

ChangeLog nits:

* Two spaces between name and date+email

* No gcc/ prefix as the relevant ChangeLog file lives in gcc/

* End entry with full stop.

Anton, do you need someone to commit this for you?

Thanks,

Kyrill



>
> Thanks,
> Andrew Pinski
>
> >
> > 2019-11-01 Anton Youdkevitch <anton.youdkevitch@bell-sw.com>
> >
> >         * gcc/config/aarch64/aarch64.c (thunderx2t99_vector_cost):
> >         change vec_perm field
> >
> > --
> >   Thanks,
> >   Anton
Anton Youdkevitch Nov. 5, 2019, 11:48 a.m. UTC | #3
Andrew,

On 02.11.2019 2:22, Andrew Pinski wrote:
> On Fri, Nov 1, 2019 at 7:03 AM Anton Youdkevitch
> <anton.youdkevitch@bell-sw.com> wrote:
>>
>> Hello,
>>
>> Here is the one-liner that fixes the incorrect
>> vec_perm cost for thunderx2t99 chip.
>> With the patch applied 526.blender of CPU2017
>> gets ~5% improvement with no measurable changes
>> for other benchmarks.
>>
>> Bootstrapped OK on aarch64-linux-gnu.
>>
>> OK for trunk?
> 
> Maybe the big question is vec_perm used for both 1 input and 2 input
> cases?  If so maybe splitting the two cases would be important too.
It is as there is no per-number-of-operands distinction while
computing the vector permutation cost.
However, since 1-operand permutes are rare this would be a good
approximation (statistically).

> Otherwise this is ok from my point of view but I can't approve it.

> 
> 
> Thanks,
> Andrew Pinski
> 
>>
>> 2019-11-01 Anton Youdkevitch <anton.youdkevitch@bell-sw.com>
>>
>>          * gcc/config/aarch64/aarch64.c (thunderx2t99_vector_cost):
>>          change vec_perm field
>>
>> --
>>    Thanks,
>>    Anton
Anton Youdkevitch Nov. 5, 2019, 11:54 a.m. UTC | #4
Kyrill,

On 05.11.2019 14:43, Kyrylo Tkachov wrote:
> Hi Andrew, Anton,
> 
> On 11/1/19 11:22 PM, Andrew Pinski wrote:
>> On Fri, Nov 1, 2019 at 7:03 AM Anton Youdkevitch
>> <anton.youdkevitch@bell-sw.com> wrote:
>>>
>>> Hello,
>>>
>>> Here is the one-liner that fixes the incorrect
>>> vec_perm cost for thunderx2t99 chip.
>>> With the patch applied 526.blender of CPU2017
>>> gets ~5% improvement with no measurable changes
>>> for other benchmarks.
>>>
>>> Bootstrapped OK on aarch64-linux-gnu.
>>>
>>> OK for trunk?
>>
>> Maybe the big question is vec_perm used for both 1 input and 2 input
>> cases?  If so maybe splitting the two cases would be important too.
>> Otherwise this is ok from my point of view but I can't approve it.
>>
> I'd be interested to see a testcase/demonstration where this would would
> be beneficial.
Well, since I measured this on SPEC 2017, so, the result is
the overall benchmark score. I can try to extract the relevant
pieces of code that get compiled differently to see if they can
be make into a standalone testcase. I didn't try this yet, though.

> 
> In the meantime this patch is ok if it helps thunderx2t99 performance.
> 
> 2019-11-01 Anton Youdkevitch <anton.youdkevitch@bell-sw.com>
> 
>           * gcc/config/aarch64/aarch64.c (thunderx2t99_vector_cost):
>           change vec_perm field
> 
> ChangeLog nits:
> 
> * Two spaces between name and date+email
> 
> * No gcc/ prefix as the relevant ChangeLog file lives in gcc/
> 
> * End entry with full stop.
Thanks, will do like this next time.

> 
> Anton, do you need someone to commit this for you?
Yes, it would be nice if you can do this for me.

> 
> Thanks,
> 
> Kyrill
> 
> 
> 
>>
>> Thanks,
>> Andrew Pinski
>>
>>>
>>> 2019-11-01 Anton Youdkevitch <anton.youdkevitch@bell-sw.com>
>>>
>>>           * gcc/config/aarch64/aarch64.c (thunderx2t99_vector_cost):
>>>           change vec_perm field
>>>
>>> --
>>>     Thanks,
>>>     Anton
Kyrylo Tkachov Nov. 5, 2019, 12:09 p.m. UTC | #5
On 11/5/19 11:54 AM, Anton Youdkevitch wrote:
Kyrill,

On 05.11.2019 14:43, Kyrylo Tkachov wrote:
> Hi Andrew, Anton,
>
> On 11/1/19 11:22 PM, Andrew Pinski wrote:
>> On Fri, Nov 1, 2019 at 7:03 AM Anton Youdkevitch
>> <anton.youdkevitch@bell-sw.com><mailto:anton.youdkevitch@bell-sw.com> wrote:
>>>
>>> Hello,
>>>
>>> Here is the one-liner that fixes the incorrect
>>> vec_perm cost for thunderx2t99 chip.
>>> With the patch applied 526.blender of CPU2017
>>> gets ~5% improvement with no measurable changes
>>> for other benchmarks.
>>>
>>> Bootstrapped OK on aarch64-linux-gnu.
>>>
>>> OK for trunk?
>>
>> Maybe the big question is vec_perm used for both 1 input and 2 input
>> cases?  If so maybe splitting the two cases would be important too.
>> Otherwise this is ok from my point of view but I can't approve it.
>>
> I'd be interested to see a testcase/demonstration where this would would
> be beneficial.
Well, since I measured this on SPEC 2017, so, the result is
the overall benchmark score. I can try to extract the relevant
pieces of code that get compiled differently to see if they can
be make into a standalone testcase. I didn't try this yet, though.


Sorry, I was referring to Andrew's suggestion about splitting the costs rather than your change.


>
> In the meantime this patch is ok if it helps thunderx2t99 performance.
>
> 2019-11-01 Anton Youdkevitch <anton.youdkevitch@bell-sw.com><mailto:anton.youdkevitch@bell-sw.com>
>
>           * gcc/config/aarch64/aarch64.c (thunderx2t99_vector_cost):
>           change vec_perm field
>
> ChangeLog nits:
>
> * Two spaces between name and date+email
>
> * No gcc/ prefix as the relevant ChangeLog file lives in gcc/
>
> * End entry with full stop.
Thanks, will do like this next time.

>
> Anton, do you need someone to commit this for you?
Yes, it would be nice if you can do this for me.


Committed as r277826 with the following adjusted ChangeLog:

2019-11-05  Anton Youdkevitch  <anton.youdkevitch@bell-sw.com><mailto:anton.youdkevitch@bell-sw.com>

    * config/aarch64/aarch64.c (thunderx2t99_vector_cost):
    Change vec_perm field to 10.

Thanks for the patch. If you intend to make more contributions in the future it would be worth sorting a copyright assignment if you haven't done so already.

Kyrill



>
> Thanks,
>
> Kyrill
>
>
>
>>
>> Thanks,
>> Andrew Pinski
>>
>>>
>>> 2019-11-01 Anton Youdkevitch <anton.youdkevitch@bell-sw.com><mailto:anton.youdkevitch@bell-sw.com>
>>>
>>>           * gcc/config/aarch64/aarch64.c (thunderx2t99_vector_cost):
>>>           change vec_perm field
>>>
>>> --
>>>     Thanks,
>>>     Anton
Anton Youdkevitch Nov. 5, 2019, 12:22 p.m. UTC | #6
On 05.11.2019 15:09, Kyrylo Tkachov wrote:
> 
> On 11/5/19 11:54 AM, Anton Youdkevitch wrote:
>> Kyrill,
>>
>> On 05.11.2019 14:43, Kyrylo Tkachov wrote:
>> > Hi Andrew, Anton,
>> > 
>> > On 11/1/19 11:22 PM, Andrew Pinski wrote:
>> >> On Fri, Nov 1, 2019 at 7:03 AM Anton Youdkevitch
>> >> <anton.youdkevitch@bell-sw.com> wrote:
>> >>>
>> >>> Hello,
>> >>>
>> >>> Here is the one-liner that fixes the incorrect
>> >>> vec_perm cost for thunderx2t99 chip.
>> >>> With the patch applied 526.blender of CPU2017
>> >>> gets ~5% improvement with no measurable changes
>> >>> for other benchmarks.
>> >>>
>> >>> Bootstrapped OK on aarch64-linux-gnu.
>> >>>
>> >>> OK for trunk?
>> >>
>> >> Maybe the big question is vec_perm used for both 1 input and 2 input
>> >> cases?  If so maybe splitting the two cases would be important too.
>> >> Otherwise this is ok from my point of view but I can't approve it.
>> >>
>> > I'd be interested to see a testcase/demonstration where this would would
>> > be beneficial.
>> Well, since I measured this on SPEC 2017, so, the result is
>> the overall benchmark score. I can try to extract the relevant
>> pieces of code that get compiled differently to see if they can
>> be make into a standalone testcase. I didn't try this yet, though.
>>
> Sorry, I was referring to Andrew's suggestion about splitting the costs 
> rather than your change.
> 
> 
>> > 
>> > In the meantime this patch is ok if it helps thunderx2t99 performance.
>> > 
>> > 2019-11-01 Anton Youdkevitch <anton.youdkevitch@bell-sw.com>
>> > 
>> >           * gcc/config/aarch64/aarch64.c (thunderx2t99_vector_cost):
>> >           change vec_perm field
>> > 
>> > ChangeLog nits:
>> > 
>> > * Two spaces between name and date+email
>> > 
>> > * No gcc/ prefix as the relevant ChangeLog file lives in gcc/
>> > 
>> > * End entry with full stop.
>> Thanks, will do like this next time.
>>
>> > 
>> > Anton, do you need someone to commit this for you?
>> Yes, it would be nice if you can do this for me.
>>
> Committed as r277826 with the following adjusted ChangeLog:
> 
> 2019-11-05  Anton Youdkevitch <anton.youdkevitch@bell-sw.com>
> 
>      * config/aarch64/aarch64.c (thunderx2t99_vector_cost):
>      Change vec_perm field to 10.
> 
> Thanks for the patch. If you intend to make more contributions in the 
> future it would be worth sorting a copyright assignment if you haven't 
> done so already.
Thanks a lot for pushing it.
I will figure that out.

> 
> Kyrill
> 
> 
> 
>> > 
>> > Thanks,
>> > 
>> > Kyrill
>> > 
>> > 
>> > 
>> >>
>> >> Thanks,
>> >> Andrew Pinski
>> >>
>> >>>
>> >>> 2019-11-01 Anton Youdkevitch <anton.youdkevitch@bell-sw.com>
>> >>>
>> >>>           * gcc/config/aarch64/aarch64.c (thunderx2t99_vector_cost):
>> >>>           change vec_perm field
>> >>>
>> >>> --
>> >>>     Thanks,
>> >>>     Anton
diff mbox series

Patch

From 42eab5c85a39a5c5ed7be00245f13f73c3f6b946 Mon Sep 17 00:00:00 2001
From: Anton Youdkevitch <anton.youdkevitch@bell-sw.com>
Date: Wed, 30 Oct 2019 14:42:41 +0000
Subject: [PATCH] [Aarch64] Fix vec_perm cost for thunderx2t99

The vec_perm cost for thunderx2t99 was set to 3
which was too low and resulted in non-beneficial
permutations neing generated.
Changed it to 10 (2-register form tbl cost) to more
accurately represent the timings wrt the actual
TX2 hardware.
---
 gcc/config/aarch64/aarch64.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
index 599d07a..f28fe80 100644
--- a/gcc/config/aarch64/aarch64.c
+++ b/gcc/config/aarch64/aarch64.c
@@ -585,7 +585,7 @@  static const struct cpu_vector_cost thunderx2t99_vector_cost =
   1, /* scalar_store_cost  */
   5, /* vec_int_stmt_cost  */
   6, /* vec_fp_stmt_cost  */
-  3, /* vec_permute_cost  */
+  10, /* vec_permute_cost  */
   6, /* vec_to_scalar_cost  */
   5, /* scalar_to_vec_cost  */
   8, /* vec_align_load_cost  */
-- 
2.7.4