diff mbox

[AArch64] Enable AES fusion with -mcpu=generic

Message ID AM5PR0802MB2610174A4992D4C2809BB5C283260@AM5PR0802MB2610.eurprd08.prod.outlook.com
State New
Headers show

Commit Message

Wilco Dijkstra March 16, 2017, 5:22 p.m. UTC
Many supported cores implement fusion of AES instructions.  When fusion
happens it can give a significant performance gain.  If not, scheduling
fusion candidates next to each other has almost no effect on performance.
Due to the high benefit/low cost it makes sense to enable AES fusion with
-mcpu=generic so that cores that support it always benefit.  Any objections?

Bootstrapped on AArch64, no regressions.

ChangeLog:
2017-03-16  Wilco Dijkstra  <wdijkstr@arm.com>

        * gcc/config/aarch64/aarch64.c (generic_tunings): Add AES fusion.

--

Comments

Andrew Pinski March 16, 2017, 6:01 p.m. UTC | #1
On Thu, Mar 16, 2017 at 10:22 AM, Wilco Dijkstra <Wilco.Dijkstra@arm.com> wrote:
> Many supported cores implement fusion of AES instructions.  When fusion
> happens it can give a significant performance gain.  If not, scheduling
> fusion candidates next to each other has almost no effect on performance.
> Due to the high benefit/low cost it makes sense to enable AES fusion with
> -mcpu=generic so that cores that support it always benefit.  Any objections?

I am ok with this due to our new cores support this and there was no
performance lost for ThunderX1.

Thanks,
Andrew

>
> Bootstrapped on AArch64, no regressions.
>
> ChangeLog:
> 2017-03-16  Wilco Dijkstra  <wdijkstr@arm.com>
>
>         * gcc/config/aarch64/aarch64.c (generic_tunings): Add AES fusion.
>
> --
> diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
> index 728ce7029f1e2b5161d9f317d10e564dd5a5f472..c8cf7169a5d387de336920b50c83761dc0c96f3a 100644
> --- a/gcc/config/aarch64/aarch64.c
> +++ b/gcc/config/aarch64/aarch64.c
> @@ -536,7 +536,7 @@ static const struct tune_params generic_tunings =
>    &generic_approx_modes,
>    4, /* memmov_cost  */
>    2, /* issue_rate  */
> -  AARCH64_FUSE_NOTHING, /* fusible_ops  */
> +  (AARCH64_FUSE_AES_AESMC), /* fusible_ops  */
>    8,   /* function_align.  */
>    8,   /* jump_align.  */
>    4,   /* loop_align.  */
Jim Wilson March 17, 2017, 3:26 a.m. UTC | #2
On Thu, Mar 16, 2017 at 11:01 AM, Andrew Pinski <apinski@cavium.com> wrote:
> On Thu, Mar 16, 2017 at 10:22 AM, Wilco Dijkstra <Wilco.Dijkstra@arm.com> wrote:
>> Many supported cores implement fusion of AES instructions.  When fusion
>> happens it can give a significant performance gain.  If not, scheduling
>> fusion candidates next to each other has almost no effect on performance.
>> Due to the high benefit/low cost it makes sense to enable AES fusion with
>> -mcpu=generic so that cores that support it always benefit.  Any objections?

No objection.  I'm not currently tracking performance of -mcpu=generic
on falkor, so I'm not very concerned about changes to the generic
tuning structure.

Jim
James Greenhalgh March 17, 2017, 10:55 a.m. UTC | #3
On Thu, Mar 16, 2017 at 08:26:42PM -0700, Jim Wilson wrote:
> On Thu, Mar 16, 2017 at 11:01 AM, Andrew Pinski <apinski@cavium.com> wrote:
> > On Thu, Mar 16, 2017 at 10:22 AM, Wilco Dijkstra <Wilco.Dijkstra@arm.com> wrote:
> >> Many supported cores implement fusion of AES instructions.  When fusion
> >> happens it can give a significant performance gain.  If not, scheduling
> >> fusion candidates next to each other has almost no effect on performance.
> >> Due to the high benefit/low cost it makes sense to enable AES fusion with
> >> -mcpu=generic so that cores that support it always benefit.  Any objections?
> 
> No objection.  I'm not currently tracking performance of -mcpu=generic
> on falkor, so I'm not very concerned about changes to the generic
> tuning structure.

Thanks for the feedback Jim, Andrew.

This patch is OK for trunk. As Richard pointed out on the branch costs
thread, if we had a bug here we'd likely have seen it by now on those
cores which do enable the fusion.

Thanks,
James
diff mbox

Patch

diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
index 728ce7029f1e2b5161d9f317d10e564dd5a5f472..c8cf7169a5d387de336920b50c83761dc0c96f3a 100644
--- a/gcc/config/aarch64/aarch64.c
+++ b/gcc/config/aarch64/aarch64.c
@@ -536,7 +536,7 @@  static const struct tune_params generic_tunings =
   &generic_approx_modes,
   4, /* memmov_cost  */
   2, /* issue_rate  */
-  AARCH64_FUSE_NOTHING, /* fusible_ops  */
+  (AARCH64_FUSE_AES_AESMC), /* fusible_ops  */
   8,	/* function_align.  */
   8,	/* jump_align.  */
   4,	/* loop_align.  */