diff mbox series

[AARCH64] Enable compare branch fusion

Message ID VI1PR0801MB2127EC4F7CC47F70B4AFF96683290@VI1PR0801MB2127.eurprd08.prod.outlook.com
State New
Headers show
Series [AARCH64] Enable compare branch fusion | expand

Commit Message

Wilco Dijkstra Dec. 24, 2019, 3:57 p.m. UTC
Enable the most basic form of compare-branch fusion since various CPUs
support it. This has no measurable effect on cores which don't support
branch fusion, but increases fusion opportunities on cores which do.

Bootstrapped on AArch64, OK for commit?

ChangeLog:
2019-12-24  Wilco Dijkstra  <wdijkstr@arm.com>

	* config/aarch64/aarch64.c (generic_tunings): Add branch fusion.
	(neoversen1_tunings): Likewise.

--

Comments

Wilco Dijkstra Jan. 16, 2020, 5:43 p.m. UTC | #1
ping


Enable the most basic form of compare-branch fusion since various CPUs
support it. This has no measurable effect on cores which don't support
branch fusion, but increases fusion opportunities on cores which do.

Bootstrapped on AArch64, OK for commit?

ChangeLog:
2019-12-24  Wilco Dijkstra  <wdijkstr@arm.com>

        * config/aarch64/aarch64.c (generic_tunings): Add branch fusion.
        (neoversen1_tunings): Likewise.

--
diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
index a3b18b381e1748f8fe5e522bdec4f7c850821fe8..1c32a3543bec4031cc9b641973101829c77296b5 100644
--- a/gcc/config/aarch64/aarch64.c
+++ b/gcc/config/aarch64/aarch64.c
@@ -726,7 +726,7 @@ static const struct tune_params generic_tunings =
   SVE_NOT_IMPLEMENTED, /* sve_width  */
   4, /* memmov_cost  */
   2, /* issue_rate  */
-  (AARCH64_FUSE_AES_AESMC), /* fusible_ops  */
+  (AARCH64_FUSE_AES_AESMC | AARCH64_FUSE_CMP_BRANCH), /* fusible_ops  */
   "16:12",     /* function_align.  */
   "4", /* jump_align.  */
   "8", /* loop_align.  */
@@ -1130,7 +1130,7 @@ static const struct tune_params neoversen1_tunings =
   SVE_NOT_IMPLEMENTED, /* sve_width  */
   4, /* memmov_cost  */
   3, /* issue_rate  */
-  AARCH64_FUSE_AES_AESMC, /* fusible_ops  */
+  (AARCH64_FUSE_AES_AESMC | AARCH64_FUSE_CMP_BRANCH), /* fusible_ops  */
   "32:16",     /* function_align.  */
   "32:16",     /* jump_align.  */
   "32:16",     /* loop_align.  */
Richard Sandiford Jan. 17, 2020, 9:16 a.m. UTC | #2
Wilco Dijkstra <Wilco.Dijkstra@arm.com> writes:
> Enable the most basic form of compare-branch fusion since various CPUs
> support it. This has no measurable effect on cores which don't support
> branch fusion, but increases fusion opportunities on cores which do.

If you're able to say for the record which cores you tested, then that'd
be good.

> Bootstrapped on AArch64, OK for commit?
>
> ChangeLog:
> 2019-12-24  Wilco Dijkstra  <wdijkstr@arm.com>
>
> * config/aarch64/aarch64.c (generic_tunings): Add branch fusion.
> (neoversen1_tunings): Likewise.

OK, thanks.  I agree there doesn't seem to be an obvious reason why this
would pessimise any cores significantly.  And it looked from a quick
check like all AArch64 cores give these compares the lowest in-use
latency (as expected).

We can revisit this if anyone finds any counterexamples.

Richard


>
> --
> diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
> index a3b18b381e1748f8fe5e522bdec4f7c850821fe8..1c32a3543bec4031cc9b641973101829c77296b5 100644
> --- a/gcc/config/aarch64/aarch64.c
> +++ b/gcc/config/aarch64/aarch64.c
> @@ -726,7 +726,7 @@ static const struct tune_params generic_tunings =
>    SVE_NOT_IMPLEMENTED, /* sve_width  */
>    4, /* memmov_cost  */
>    2, /* issue_rate  */
> -  (AARCH64_FUSE_AES_AESMC), /* fusible_ops  */
> +  (AARCH64_FUSE_AES_AESMC | AARCH64_FUSE_CMP_BRANCH), /* fusible_ops  */
>    "16:12",/* function_align.  */
>    "4",/* jump_align.  */
>    "8",/* loop_align.  */
> @@ -1130,7 +1130,7 @@ static const struct tune_params neoversen1_tunings =
>    SVE_NOT_IMPLEMENTED, /* sve_width  */
>    4, /* memmov_cost  */
>    3, /* issue_rate  */
> -  AARCH64_FUSE_AES_AESMC, /* fusible_ops  */
> +  (AARCH64_FUSE_AES_AESMC | AARCH64_FUSE_CMP_BRANCH), /* fusible_ops  */
>    "32:16",/* function_align.  */
>    "32:16",/* jump_align.  */
>    "32:16",/* loop_align.  */
Wilco Dijkstra Jan. 17, 2020, 2:37 p.m. UTC | #3
Hi Richard,

> If you're able to say for the record which cores you tested, then that'd
> be good.

I've mostly checked it on Cortex-A57 - if there is any effect, it would be on
older cores.

> OK, thanks.  I agree there doesn't seem to be an obvious reason why this
> would pessimise any cores significantly.  And it looked from a quick
> check like all AArch64 cores give these compares the lowest in-use
> latency (as expected).

Indeed.

> We can revisit this if anyone finds any counterexamples.

Yes - it's unlikely there are any though!

Cheers,
Wilco





>

> --

> diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c

> index a3b18b381e1748f8fe5e522bdec4f7c850821fe8..1c32a3543bec4031cc9b641973101829c77296b5 100644

> --- a/gcc/config/aarch64/aarch64.c

> +++ b/gcc/config/aarch64/aarch64.c

> @@ -726,7 +726,7 @@ static const struct tune_params generic_tunings =

>    SVE_NOT_IMPLEMENTED, /* sve_width  */

>    4, /* memmov_cost  */

>    2, /* issue_rate  */

> -  (AARCH64_FUSE_AES_AESMC), /* fusible_ops  */

> +  (AARCH64_FUSE_AES_AESMC | AARCH64_FUSE_CMP_BRANCH), /* fusible_ops  */

>    "16:12",/* function_align.  */

>    "4",/* jump_align.  */

>    "8",/* loop_align.  */

> @@ -1130,7 +1130,7 @@ static const struct tune_params neoversen1_tunings =

>    SVE_NOT_IMPLEMENTED, /* sve_width  */

>    4, /* memmov_cost  */

>    3, /* issue_rate  */

> -  AARCH64_FUSE_AES_AESMC, /* fusible_ops  */

> +  (AARCH64_FUSE_AES_AESMC | AARCH64_FUSE_CMP_BRANCH), /* fusible_ops  */

>    "32:16",/* function_align.  */

>    "32:16",/* jump_align.  */

>    "32:16",/* loop_align.  */
diff mbox series

Patch

diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
index a3b18b381e1748f8fe5e522bdec4f7c850821fe8..1c32a3543bec4031cc9b641973101829c77296b5 100644
--- a/gcc/config/aarch64/aarch64.c
+++ b/gcc/config/aarch64/aarch64.c
@@ -726,7 +726,7 @@  static const struct tune_params generic_tunings =
   SVE_NOT_IMPLEMENTED, /* sve_width  */
   4, /* memmov_cost  */
   2, /* issue_rate  */
-  (AARCH64_FUSE_AES_AESMC), /* fusible_ops  */
+  (AARCH64_FUSE_AES_AESMC | AARCH64_FUSE_CMP_BRANCH), /* fusible_ops  */
   "16:12",	/* function_align.  */
   "4",	/* jump_align.  */
   "8",	/* loop_align.  */
@@ -1130,7 +1130,7 @@  static const struct tune_params neoversen1_tunings =
   SVE_NOT_IMPLEMENTED, /* sve_width  */
   4, /* memmov_cost  */
   3, /* issue_rate  */
-  AARCH64_FUSE_AES_AESMC, /* fusible_ops  */
+  (AARCH64_FUSE_AES_AESMC | AARCH64_FUSE_CMP_BRANCH), /* fusible_ops  */
   "32:16",	/* function_align.  */
   "32:16",	/* jump_align.  */
   "32:16",	/* loop_align.  */